Commit Graph

134 Commits

Author SHA1 Message Date
Chris Lattner ad89ec013f Add a bit to IdentifierInfo that acts as a simple predicate which
tells us whether Preprocessor::HandleIdentifier needs to be called.
Because this method is only rarely needed, this saves a call and a
bunch of random checks.  This drops the time in HandleIdentifier 
from 3.52ms to 0.98ms on cocoa.h on my machine.

llvm-svn: 62675
2009-01-21 07:43:11 +00:00
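
A minimal sketch of the cached-predicate idea described in this commit, not the actual Clang code. The class and flag names (IdentInfo, HasMacro, IsPoisoned, IsExtension, Lexer) are hypothetical; the commit does not say which conditions feed the bit. The point is only that the setters keep one summary bit up to date, so the hot path tests a single bool before deciding whether to take the expensive call.

```cpp
#include <cassert>

// Hypothetical stand-in for an identifier record. The three input flags
// are illustrative assumptions; they are not taken from the commit.
class IdentInfo {
  bool HasMacro = false;
  bool IsPoisoned = false;
  bool IsExtension = false;
  bool NeedsHandling = false;  // cached summary of the flags above

  // Recompute the cached predicate whenever one of its inputs changes.
  void recompute() { NeedsHandling = HasMacro || IsPoisoned || IsExtension; }

public:
  void setHasMacro(bool V) { HasMacro = V; recompute(); }
  void setPoisoned(bool V) { IsPoisoned = V; recompute(); }
  void setExtension(bool V) { IsExtension = V; recompute(); }
  bool needsHandling() const { return NeedsHandling; }
};

// Hypothetical lexer that counts slow-path calls so the effect is visible.
struct Lexer {
  int SlowPathCalls = 0;

  // Stand-in for the rarely needed, expensive handler.
  void handleIdentifier(IdentInfo &) { ++SlowPathCalls; }

  void lexIdentifier(IdentInfo &II) {
    if (II.needsHandling())  // one bit test on the hot path
      handleIdentifier(II);
  }
};
```

With this layout, an ordinary identifier never reaches handleIdentifier, while one that has any flag set still does.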
Ted Kremenek 52f73cad4a Fix: <rdar://problem/6510344> [pth] PTH slows down regular lexer considerably (when it has substantial work)
Changes to IdentifierTable:
- High-level summary: StringMap never owns IdentifierInfos.  It just
references them.
- The string map now has StringMapEntry<IdentifierInfo*> instead of
  StringMapEntry<IdentifierInfo>.  The IdentifierInfo object is
  allocated using the same bump pointer allocator as used by the
  StringMap.

Changes to IdentifierInfo:
- Added an extra pointer to point to the
  StringMapEntry<IdentifierInfo*> in the string map.  This pointer
  will be null if the IdentifierInfo* is *only* used by the PTHLexer
  (that is, it isn't in the StringMap).

Algorithmic changes:
- Non-PTH case:
   IdentifierInfo::get() will always consult the StringMap first to
   see if we have an IdentifierInfo object.  If that StringMapEntry
   references a null pointer, we allocate a new one from the BumpPtrAllocator
   and update the reference in the StringMapEntry.
- PTH case:
   We do the same lookup as with the non-PTH case, but if we don't get
   a hit in the StringMap we do a secondary lookup in the PTHManager for
   the IdentifierInfo.  If we don't find an IdentifierInfo we create a
   new one as in the non-PTH case.  If we do find an IdentifierInfo
   in the PTHManager, we update the StringMapEntry to refer to it so
   that the IdentifierInfo will be found on the next StringMap lookup.
   This way we only do a binary search in the PTH file at most once
   for a given IdentifierInfo.  This greatly speeds things up for source
   files containing a non-trivial amount of code.

Performance impact:
   While these changes do add some extra indirection in
   IdentifierTable to access an IdentifierInfo*, I saw speedups even
   in the non-PTH case.

   Non-PTH: For -fsyntax-only on Cocoa.h, we see a 6% speedup.
   PTH (with Cocoa.h in token cache): 11% speedup.

   I also did an experiment where we did -fsyntax-only on a source file
   including a large header and Cocoa.h, but the token cache did not
   contain the larger header.  For this file, we were seeing a performance
   *regression* when using PTH of 3% over non-PTH.  Now we are seeing
   a performance improvement of 9%!

Tests:
   The serialization tests are now failing.  I looked at this extensively,
   and my belief is that this change is unmasking a bug rather than
   introducing a new one.  I have disabled the serialization tests for now.

llvm-svn: 62636
2009-01-20 23:28:34 +00:00
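
The lookup order described under "Algorithmic changes" can be sketched roughly as follows. This is an illustration under assumptions, not the Clang implementation: Table, Info, and the std::function fallback are hypothetical names, std::unordered_map stands in for StringMap, and a vector of unique_ptr stands in for the bump pointer allocator. The map holds non-owning pointers, and a successful secondary lookup is written back into the map so the secondary source is consulted at most once per name.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Info { std::string Name; };

// Hypothetical identifier table. The map never owns Info objects; a null
// mapped value means "entry exists, no object attached yet".
class Table {
  std::unordered_map<std::string, Info *> Map;
  std::vector<std::unique_ptr<Info>> Arena;  // stand-in for a bump allocator
  std::function<Info *(const std::string &)> External;  // optional fallback

public:
  explicit Table(std::function<Info *(const std::string &)> Ext = nullptr)
      : External(std::move(Ext)) {}

  Info &get(const std::string &Name) {
    Info *&Slot = Map[Name];  // inserts a null pointer on first sight
    if (Slot)
      return *Slot;           // fast path: already resolved
    if (External)
      Slot = External(Name);  // secondary lookup, cached in the map on a hit
    if (!Slot) {              // no hit anywhere: allocate a fresh object
      Arena.push_back(std::unique_ptr<Info>(new Info{Name}));
      Slot = Arena.back().get();
    }
    return *Slot;
  }
};
```

Because every path leaves a non-null pointer in the slot, a second get() for the same name returns from the fast path without touching the fallback.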
Chris Lattner cbc35ecb04 Rename SourceManager::getCanonicalFileID -> getFileID. There is
no longer any such thing as a non-canonical FileID.

llvm-svn: 62499
2009-01-19 07:46:45 +00:00
Chris Lattner 1e9e86f470 remove the public SourceManager::getContentCacheForLoc method.
llvm-svn: 62497
2009-01-19 07:40:40 +00:00
Chris Lattner f809bbdbb8 remove the SourceManager:: and FullSourceLoc::getFileEntryForLoc methods.
llvm-svn: 62496
2009-01-19 07:36:42 +00:00
Chris Lattner 7e343b2161 SourceManager::getBufferData(SourceLocation) is dead, delete it.
llvm-svn: 62495
2009-01-19 07:32:13 +00:00
Chris Lattner 91fda39454 some minor cleanups to SourceManager, and eliminate the
SourceManager::getBuffer(SourceLocation) method.

llvm-svn: 62494
2009-01-19 07:30:29 +00:00
Anders Carlsson e70cde134e Handle the 'X' constraint. Fixes <rdar://problem/6504897>.
llvm-svn: 62446
2009-01-18 02:12:04 +00:00
Anders Carlsson a79203be85 Add sema support for symbolic names in inline asm statements.
llvm-svn: 62441
2009-01-18 01:56:57 +00:00
Nate Begeman 95439108e8 Fit in 80 cols
llvm-svn: 62439
2009-01-18 01:08:03 +00:00
Nate Begeman a45707c06a Allow targets to override IntMaxTWidth
llvm-svn: 62434
2009-01-17 23:56:13 +00:00
Anders Carlsson 19aa04d270 Change TargetInfo::validateInputConstraint to take begin/end name iterators instead of the number of outputs. No functionality change.
llvm-svn: 62433
2009-01-17 23:36:15 +00:00
Chris Lattner 71dc14b9f0 Rename SourceLocation::getFileID to getChunkID, because it returns
the chunk ID not the file ID.  This exposes problems in 
TextDiagnosticPrinter where it should have been using the canonical
file ID but wasn't.  Fix these along the way.

llvm-svn: 62427
2009-01-17 08:45:21 +00:00
Chris Lattner d32480d3db this massive patch introduces a simple new abstraction: it makes
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.

This is an important distinction from the "FileID" currently tracked by
SourceLocation.  *That* FileID may refer to the start of a file or to a
chunk within it.  The new FileID *only* refers to the file (and its 
#include stack and eventually #line data), it cannot refer to a chunk.

FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.

llvm-svn: 62407
2009-01-17 06:22:33 +00:00
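
One way to get a type-checked, opaque handle of the kind this commit describes is a small class whose only value constructor is private and befriended by the manager. The sketch below assumes that design; SourceMgr, createFileID, and getName are hypothetical names and do not reflect the real SourceManager interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

class SourceMgr;  // hypothetical stand-in for the manager class

// Opaque handle: clients can copy and compare it, but cannot build one
// from a raw integer or read the integer back.
class FileID {
  unsigned ID = 0;                        // 0 means "invalid"
  explicit FileID(unsigned V) : ID(V) {}  // only SourceMgr may call this
  friend class SourceMgr;

public:
  FileID() = default;
  bool isInvalid() const { return ID == 0; }
  bool operator==(const FileID &RHS) const { return ID == RHS.ID; }
  bool operator!=(const FileID &RHS) const { return ID != RHS.ID; }
};

class SourceMgr {
  std::vector<std::string> Names;

public:
  FileID createFileID(const std::string &Name) {
    Names.push_back(Name);
    return FileID(static_cast<unsigned>(Names.size()));  // IDs start at 1
  }

  // Throws std::out_of_range when given an invalid FileID.
  const std::string &getName(FileID FID) const { return Names.at(FID.ID - 1); }
};
```

Passing a plain unsigned where a FileID is expected no longer compiles, which is the property the commit message is after.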
Chris Lattner 800979259e make "ContentCache::Buffer" mutable to avoid a const_cast.
llvm-svn: 62403
2009-01-17 03:54:16 +00:00
Chris Lattner fb1fd911cb Make FullSourceLoc derive from SourceLocation instead of
containing one.  Containment is generally better than derivation,
but in this case FullSourceLoc really 'isa' SourceLocation.

llvm-svn: 62375
2009-01-16 23:03:56 +00:00
Chris Lattner fcc0a5a3f7 eliminate FullSourceLoc::getCanonicalFileID
llvm-svn: 62374
2009-01-16 22:59:51 +00:00
Chris Lattner 7067b4f49d remove FullSourceLoc::isFileID
llvm-svn: 62371
2009-01-16 22:53:56 +00:00
Chris Lattner 8a42586c54 more SourceLocation lexicon change: instead of referring to the
"logical" location, refer to the "instantiation" location.

llvm-svn: 62316
2009-01-16 07:36:28 +00:00
Chris Lattner 3c91971b33 rename "virtual location" of a macro to "instantiation location".
llvm-svn: 62315
2009-01-16 07:15:35 +00:00
Chris Lattner 53e384f633 Change some terminology in SourceLocation: instead of referring to
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful, tokens aren't really physical objects!

llvm-svn: 62309
2009-01-16 07:00:02 +00:00
Ted Kremenek a705b04d7f IdentifierInfo:
- IdentifierInfo can now (optionally) have its string data not be
  co-located with itself.  This is for use with PTH.  This aspect is a
  little gross, as getName() and getLength() now make assumptions
  about a possible alternate representation of IdentifierInfo.
  Perhaps we should make IdentifierInfo have virtual methods?

IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
  IdentifierTable to perform "string -> IdentifierInfo" lookups using
  an auxiliary data structure.  This is used by PTH.
- Performance tests show that IdentifierTable::get() does not slow down
  because of the extra check for the IdentifierInfoLookup object (the
  regular StringMap lookup does enough work to mitigate the impact of
  an extra null pointer check).
- The upshot is that some IdentifierInfo objects might now be
  owned by the IdentifierInfoLookup object.  This should be reviewed.

PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
  IdentifierTable's string map, and instead create IdentifierInfo
  objects on the fly when mapping from persistent IDs to
  IdentifierInfos.  This saves a ton of work with string copies,
  hashing, and StringMap lookup and resizing.  This change was
  motivated because when processing source files in the PTH cache we
  don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
  IdentifierTable to transparently use IdentifierInfo objects managed
  by the PTH file.  PTHManager resolves "string -> IdentifierInfo"
  queries by doing a binary search over a sorted table of identifier
  strings in the PTH file (the exact algorithm we use can be changed
  as needed).

These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement

llvm-svn: 62273
2009-01-15 18:47:46 +00:00
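
A rough sketch of a pluggable "string -> record" lookup backed by a sorted table and binary search, as the PTH notes describe. The names ExternalLookup and SortedTableLookup are hypothetical, and an in-memory vector stands in for the table stored in the PTH file; the real on-disk layout is not shown in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Info { std::string Name; };

// Hypothetical interface a table could consult for names it does not hold.
struct ExternalLookup {
  virtual ~ExternalLookup() {}
  virtual Info *find(const std::string &Name) = 0;
};

// Resolves names by binary search over entries sorted by name.
class SortedTableLookup : public ExternalLookup {
  std::vector<Info> Entries;  // sorted once in the constructor, then fixed

public:
  explicit SortedTableLookup(std::vector<std::string> Names) {
    std::sort(Names.begin(), Names.end());
    for (const std::string &N : Names)
      Entries.push_back(Info{N});
  }

  Info *find(const std::string &Name) override {
    // lower_bound returns the first entry whose name is not less than Name.
    auto It = std::lower_bound(
        Entries.begin(), Entries.end(), Name,
        [](const Info &E, const std::string &Key) { return E.Name < Key; });
    if (It != Entries.end() && It->Name == Name)
      return &*It;
    return nullptr;  // not present in the table
  }
};
```

The entries are never resized after construction, so pointers returned by find() stay valid for the lifetime of the lookup object.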
Anders Carlsson 30c235d182 Make sure to initialize the ConstraintInfo to 0
llvm-svn: 62068
2009-01-12 02:15:29 +00:00
Ted Kremenek 763ea559ac SourceManager: Implement "lazy" creation of MemBuffers for source files.
- Big Idea:
   Source files are now mmaped when ContentCache::getBuffer() is first called.
   While this doesn't change the functionality when lexing regular source files,
   it can result in source files not being paged in when using PTH.

- Performance change:
  - No observable difference (-fsyntax-only/-Eonly) on Cocoa.h when doing
    regular source lexing.
  - No observable time difference (-fsyntax-only/-Eonly) on Cocoa.h when using
    PTH. We do observe, however, a reduction of 279K in memory mapped source
    code (3% reduction). The majority of pages from Cocoa.h (and friends) are
    still being pulled in, however, because any literal will cause
    Preprocessor::getSpelling() to be called (causing the source for the file to
    get pulled in). The next possible optimization is to cache literal strings
    in the PTH file to avoid the need for the original header sources entirely.

- Right now there is a preprocessor directive to toggle between "lazy" and
  "eager" creation of MemBuffers. This is not permanent, and is there in the
  short term to just test additional optimizations.

llvm-svn: 61827
2009-01-06 22:43:04 +00:00
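
The "lazy" buffer creation described here can be illustrated with a small class that defers loading until the first getBuffer() call. This is a sketch under assumptions: LazyContent is a hypothetical name, and a caller-supplied loader callback stands in for mapping the file, so that the example stays independent of the filesystem.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical content entry whose data is produced on first access.
class LazyContent {
  std::function<std::string()> Load;  // stand-in for reading or mapping a file
  mutable std::unique_ptr<std::string> Buffer;  // empty until first use

public:
  explicit LazyContent(std::function<std::string()> L) : Load(std::move(L)) {}

  bool isLoaded() const { return Buffer != nullptr; }

  // Logically const: the observable content never changes, only the cache.
  const std::string &getBuffer() const {
    if (!Buffer)
      Buffer.reset(new std::string(Load()));
    return *Buffer;
  }
};
```

A client that never asks for the buffer never triggers the load, which corresponds to the case in the commit where source files are not paged in.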
Ted Kremenek 12c2af44e9 Misc changes to SourceManager::ContentCache:
- 'Buffer' is now private and must be accessed via 'getBuffer()'.
   This paves the way for lazily mapping in source files on demand.
- Added 'getSize()' (which gets the size of the content without
   necessarily accessing the MemBuffer) and 'getSizeBytesMapped()'.
- Modified SourceManager to use these new methods.  This reduces the
  number of places that actually access the MemBuffer object for a file
  to those that actually look at the character data.

These changes result in no performance change for -fsyntax-only on Cocoa.h.

llvm-svn: 61782
2009-01-06 01:55:26 +00:00
Chris Lattner 2ca529ce61 instead of forcing blocks on by default, make them default to off, but let
specific targets default them to on.  Default blocks to on on 10.6 and later.
Add a -fblocks option that allows the user to override the target's default.
Use -fblocks in the various testcases that use blocks.

llvm-svn: 60563
2008-12-04 23:20:07 +00:00
Chris Lattner c7c6dd4d97 replace useNeXTRuntimeAsDefault with a generic hook that allows targets
to specify their default language options.

llvm-svn: 60561
2008-12-04 22:54:33 +00:00
Sebastian Redl 3ceaf62240 Fix order of evaluation.
llvm-svn: 60160
2008-11-27 07:28:14 +00:00
Chris Lattner e4b95698df Rename Selector::getName() to Selector::getAsString(), and add
a new NamedDecl::getAsString() method.

Change uses of Selector::getName() to just pass in a Selector 
where possible (e.g. to diagnostics) instead of going through
an std::string.

This also adds new formatters for objcinstance and objcclass
as described in the dox.

llvm-svn: 59933
2008-11-24 03:33:13 +00:00
Chris Lattner e3d20d9545 Convert IdentifierInfo's to be printed the same as DeclarationNames
with implicit quotes around them.  This has a bunch of follow-on 
effects and requires tweaking to a whole lot of code.  This causes
a regression in two tests (xfailed) by causing it to emit things like:

  Line 10: duplicate interface declaration for category 'MyClass1' ('Category1')

instead of:

  Line 10: duplicate interface declaration for category 'MyClass1(Category1)'

I will fix this in a follow-up commit.

As part of this, I had to start switching stuff to use ->getDeclName() instead
of Decl::getName() for consistency.  This is good, but I was planning to do this
as an independent patch.  There will be several follow-on patches
to clean up some of the mess, but this patch is already too big.

llvm-svn: 59917
2008-11-23 21:45:46 +00:00
Chris Lattner f7e69d5a77 add support for inserting a DeclarationName into a diagnostic directly
without calling getAsString().  This implicitly puts quotes around the
name, so diagnostics need to be tweaked to accommodate this.

llvm-svn: 59916
2008-11-23 20:28:15 +00:00
Chris Lattner 63ecc509e3 Genericize the qualtype formatting callback to support any diag argument.
No functionality change.

llvm-svn: 59908
2008-11-23 09:21:17 +00:00
Chris Lattner 6a2ed6f6dc Add support for sending QualType's directly into diags and convert two
diags over to use this.  QualTypes implicitly print single quotes around 
them for uniformity and future extension.

Doing this requires a little function pointer dance to prevent libbasic
from depending on libast.

llvm-svn: 59907
2008-11-23 09:13:29 +00:00
Sebastian Redl 15b02d2e62 Implement a %plural modifier for complex plural forms in diagnostics. Use it in the overload diagnostics.
llvm-svn: 59871
2008-11-22 13:44:36 +00:00
Chris Lattner 427c9c1763 Split the DiagnosticInfo class into two disjoint classes:
one for building up the diagnostic that is in flight (DiagnosticBuilder)
and one for pulling structured information out of the diagnostic when
formatting and presenting it.

There is no functionality change with this patch.

llvm-svn: 59849
2008-11-22 00:59:29 +00:00
Chris Lattner 2b78690a9c Add the concept of "modifiers" to the clang diagnostic format
strings.  This allows us to have considerable flexibility in how
these things are displayed and provides extra information that
allows us to merge away diagnostics that are very similar.

Diagnostic modifiers are a string of characters with the regex
[-a-z]+ that occur between the % and digit.  They may 
optionally have an argument that can parameterize them.

For now, I've added two example modifiers.  One is a very useful
tool that allows you to factor commonality across diagnostics
that need single words or phrases combined.  Basically you can
use %select{a|b|c}4 with an integer argument that selects
either a/b/c based on an integer value in the range [0..3).

The second modifier is also an integer modifier, aimed to help
English diagnostics handle plurality.  "%s3" prints to 's' if 
integer argument #3 is not 1, otherwise it prints to nothing.
I'm fully aware that 's' is an English concept and doesn't
apply to all situations (mouse vs mice).  However, this is very
useful and we can add other crazy modifiers once we add support
for Polish! ;-)

I converted a couple C++ diagnostics over to use this as an
example, I'd appreciate it if others could merge the other
likely candidates.  If you have other modifiers that you want,
let's talk on cfe-dev.

llvm-svn: 59803
2008-11-21 07:50:02 +00:00
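
To make the modifier syntax concrete, here is a small formatter written from the description in this commit message. It is an illustrative sketch, not Clang's FormatDiagnostic: formatDiag and pickAlternative are hypothetical names, arguments are restricted to integers, and argument indices are a single digit. It handles %N, %sN, %select{a|b|c}N, and %% for a literal percent sign.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Returns the I-th '|'-separated alternative of Options, or "" if there
// are fewer than I+1 alternatives.
static std::string pickAlternative(const std::string &Options, int I) {
  std::size_t Begin = 0;
  for (int N = 0;; ++N) {
    std::size_t End = Options.find('|', Begin);
    if (N == I)
      return Options.substr(
          Begin, End == std::string::npos ? std::string::npos : End - Begin);
    if (End == std::string::npos)
      return "";
    Begin = End + 1;
  }
}

// Expands a format string against integer arguments. Malformed input stops
// expansion and returns what was produced so far.
std::string formatDiag(const std::string &Fmt, const std::vector<int> &Args) {
  std::string Out;
  for (std::size_t I = 0; I < Fmt.size(); ++I) {
    if (Fmt[I] != '%') { Out += Fmt[I]; continue; }
    ++I;  // step past '%'
    if (I < Fmt.size() && Fmt[I] == '%') { Out += '%'; continue; }

    // Modifier: a run of characters matching [-a-z].
    std::string Mod;
    while (I < Fmt.size() &&
           (Fmt[I] == '-' || (Fmt[I] >= 'a' && Fmt[I] <= 'z')))
      Mod += Fmt[I++];

    // Optional modifier argument in braces.
    std::string ModArg;
    if (I < Fmt.size() && Fmt[I] == '{') {
      std::size_t Close = Fmt.find('}', I);
      if (Close == std::string::npos) break;
      ModArg = Fmt.substr(I + 1, Close - I - 1);
      I = Close + 1;
    }

    // Single-digit argument index; Args.at throws if it is out of range.
    if (I >= Fmt.size() || !std::isdigit(static_cast<unsigned char>(Fmt[I])))
      break;
    int Value = Args.at(Fmt[I] - '0');

    if (Mod == "select")
      Out += pickAlternative(ModArg, Value);
    else if (Mod == "s") {
      if (Value != 1) Out += 's';  // English plural suffix
    } else
      Out += std::to_string(Value);
  }
  return Out;
}
```

For example, the format "%0 error%s0, %select{low|high}1 priority" with arguments {2, 1} expands to "2 errors, high priority".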
Chris Lattner b91fd17b7d Allow sending IdentifierInfo*'s into Diagnostics without turning them into strings
first.  This should allow removal of a bunch of II->getName() calls.

llvm-svn: 59601
2008-11-19 07:32:16 +00:00
Chris Lattner 91aea716c6 add direct support for signed and unsigned integer arguments to diagnostics.
llvm-svn: 59598
2008-11-19 07:22:31 +00:00
Chris Lattner 23be067407 rewrite FormatDiagnostic to be less gross and a lot more efficient.
This also makes it illegal to have bare '%'s in diagnostics.  If you
want a % in a diagnostic, use %%.

llvm-svn: 59596
2008-11-19 06:51:40 +00:00
Chris Lattner 8d5bec4c7d implement a transparent optimization with the diagnostics stuff:
const char*'s are now not converted to std::strings when the diagnostic
is formed, we just hold onto their pointer and format as needed.

This commit makes DiagnosticClient::FormatDiagnostic even more of a 
mess, I'll fix it in the next commit.

llvm-svn: 59593
2008-11-19 06:04:55 +00:00
Douglas Gregor 163c58502a Extend DeclarationName to support C++ overloaded operators, e.g.,
operator+, directly, using the same mechanism as all other special
names.

Removed the "special" identifiers for the overloaded operators from
the identifier table and IdentifierInfo data structure. IdentifierInfo
is back to representing only real identifiers.

Added a new Action, ActOnOperatorFunctionIdExpr, that builds an
expression from a parsed operator-function-id (e.g., "operator
+"). ActOnIdentifierExpr used to do this job, but
operator-function-ids are no longer represented by IdentifierInfo's.

Extended Declarator to store overloaded operator names. 
Sema::GetNameForDeclarator now knows how to turn the operator
name into a DeclarationName for the overloaded operator. 

Except for (perhaps) consolidating the functionality of
ActOnIdentifier, ActOnOperatorFunctionIdExpr, and
ActOnConversionFunctionExpr into a common routine that builds an
appropriate DeclRefExpr by looking up a DeclarationName, all of the
work on normalizing declaration names should be complete with this
commit.

llvm-svn: 59526
2008-11-18 14:39:36 +00:00
Chris Lattner 8488c8297c This reworks some of the Diagnostic interfaces a bit to change how diagnostics
are formed.  In particular, a diagnostic with all its strings and ranges is now
packaged up and sent to DiagnosticClients as a DiagnosticInfo instead of as a 
ton of random stuff.  This has the benefit of simplifying the interface, making
it more extensible, and allowing us to do more checking for things like access
past the end of the various arrays passed in.

In addition to introducing DiagnosticInfo, this also substantially changes how 
Diagnostic::Report works.  Instead of being passed in all of the info required
to issue a diagnostic, Report now takes only the required info (a location and 
ID) and returns a fresh DiagnosticInfo *by value*.  The caller is then free to
stuff strings and ranges into the DiagnosticInfo with the << operator.  When
the dtor runs on the DiagnosticInfo object (which should happen at the end of
the statement), the diagnostic is actually emitted with all of the accumulated
information.  This is a somewhat tricky dance, but it means that the 
accumulated DiagnosticInfo is allowed to keep pointers to other expression 
temporaries without those pointers getting invalidated.

This is just the minimal change to get this stuff working, but this will allow
us to eliminate the zillions of variant "Diag" methods scattered throughout
(e.g.) sema.  For example, instead of calling:

  Diag(BuiltinLoc, diag::err_overload_no_match, typeNames,
       SourceRange(BuiltinLoc, RParenLoc));

We will soon be able to just do:

  Diag(BuiltinLoc, diag::err_overload_no_match)
      << typeNames << SourceRange(BuiltinLoc, RParenLoc);

This scales better to support arbitrary types being passed in (not just 
strings) in a type-safe way.  Go operator overloading?!

llvm-svn: 59502
2008-11-18 07:04:44 +00:00
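
The "emit from the destructor" pattern described in this commit can be sketched as below. It is a simplified illustration, not the DiagnosticInfo class itself: DiagSink, DiagBuilder, and report are hypothetical names, arguments are appended to a string rather than stored separately, and emission just records the text in a vector.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink that records every emitted diagnostic.
struct DiagSink {
  std::vector<std::string> Emitted;
};

// Accumulates arguments via operator<< and emits when destroyed.
class DiagBuilder {
  DiagSink *Sink;  // null after being moved from, which suppresses emission
  std::string Text;

public:
  DiagBuilder(DiagSink &S, const std::string &Message)
      : Sink(&S), Text(Message) {}

  // Moving hands responsibility for emission to the new object.
  DiagBuilder(DiagBuilder &&Other)
      : Sink(Other.Sink), Text(std::move(Other.Text)) {
    Other.Sink = nullptr;
  }
  DiagBuilder(const DiagBuilder &) = delete;
  DiagBuilder &operator=(const DiagBuilder &) = delete;

  ~DiagBuilder() {
    if (Sink)
      Sink->Emitted.push_back(Text);
  }

  DiagBuilder &operator<<(const std::string &Arg) {
    Text += " [" + Arg + "]";
    return *this;
  }
  DiagBuilder &operator<<(int Arg) {
    Text += " [" + std::to_string(Arg) + "]";
    return *this;
  }
};

// Returns the builder by value; the temporary lives until the end of the
// full expression in which report() is called.
DiagBuilder report(DiagSink &S, const std::string &Message) {
  return DiagBuilder(S, Message);
}
```

A statement such as `report(Sink, "no matching overload") << "int, float" << 2;` emits exactly once, when the temporary is destroyed at the end of the statement.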
Chris Lattner 746d474b28 SourceManager::getLineNumber is logically const except for caching.
Use mutable to make it so.

llvm-svn: 59498
2008-11-18 06:51:15 +00:00
Chris Lattner 16ba91396a Change the diagnostics interface to take an array of pointers to
strings instead of array of strings.  This reduces string copying
in some not-very-important cases, but paves the way for future 
improvements.

llvm-svn: 59494
2008-11-18 04:56:44 +00:00
Douglas Gregor 92751d41a0 Eliminate all of the placeholder identifiers used for constructors,
destructors, and conversion functions. The placeholders were used to
work around the fact that the parser and some of Sema really wanted
declarators to have simple identifiers; now, the code that deals with
declarators will use DeclarationNames.

llvm-svn: 59469
2008-11-17 22:58:34 +00:00
Douglas Gregor 77324f3854 Introduce the DeclarationName class, as a single, general method of
representing the names of declarations in the C family of
languages. DeclarationName is used in NamedDecl to store the name of
the declaration (naturally), and ObjCMethodDecl is now a NamedDecl.

llvm-svn: 59441
2008-11-17 14:58:09 +00:00
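
As a loose illustration of a single name type covering the special cases mentioned in this and the neighbouring commits (constructors, destructors, conversion functions, overloaded operators), consider the sketch below. DeclName and its members are hypothetical; the real DeclarationName uses a different, more compact representation that these commit messages do not describe.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical tagged name: one kind tag plus one payload string.
class DeclName {
public:
  enum Kind {
    Identifier,
    Constructor,
    Destructor,
    ConversionFunction,
    OverloadedOperator
  };

private:
  Kind TheKind;
  std::string Text;  // identifier, class name, target type, or operator token

  DeclName(Kind K, std::string T) : TheKind(K), Text(std::move(T)) {}

public:
  static DeclName identifier(const std::string &Id) {
    return DeclName(Identifier, Id);
  }
  static DeclName constructor(const std::string &Class) {
    return DeclName(Constructor, Class);
  }
  static DeclName destructor(const std::string &Class) {
    return DeclName(Destructor, Class);
  }
  static DeclName conversion(const std::string &Type) {
    return DeclName(ConversionFunction, Type);
  }
  static DeclName overloadedOperator(const std::string &Token) {
    return DeclName(OverloadedOperator, Token);
  }

  Kind getKind() const { return TheKind; }

  // Human-readable spelling, built on demand from the kind and payload.
  std::string getAsString() const {
    switch (TheKind) {
    case Identifier:
    case Constructor:
      return Text;
    case Destructor:
      return "~" + Text;
    case ConversionFunction:
      return "operator " + Text;
    case OverloadedOperator:
      return "operator" + Text;
    }
    return Text;
  }

  bool operator==(const DeclName &RHS) const {
    return TheKind == RHS.TheKind && Text == RHS.Text;
  }
};
```

Because the kind is part of the value, an identifier "Foo" and the constructor of class Foo compare unequal even though they print the same, so no placeholder identifiers are needed.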
Chris Lattner e6de6252c2 Fix PR3077: tokens that come from macro expansions whose macro was
defined in a system header should be treated as system header tokens
even if they are instantiated in a different place.

llvm-svn: 59418
2008-11-16 18:36:34 +00:00
Douglas Gregor 993603d80d Add a new expression node, CXXOperatorCallExpr, which expresses a
function call created in response to the use of operator syntax that
resolves to an overloaded operator in C++, e.g., "str1 +
str2" that resolves to "std::operator+(str1, str2)". We now build a
CXXOperatorCallExpr in C++ when we pick an overloaded operator. (But
only for binary operators, where we actually implement overloading)

I decided *not* to refactor the current CallExpr to make it abstract
(with FunctionCallExpr and CXXOperatorCallExpr as derived
classes). Doing so would allow us to make CXXOperatorCallExpr a little
bit smaller, at the cost of making the argument and callee accessors
virtual. We won't know if this is going to be a win until we can parse
lots of C++ code to determine how much memory we'll save by making
this change vs. the performance penalty due to the extra virtual
calls.

llvm-svn: 59306
2008-11-14 16:09:21 +00:00
Douglas Gregor b6acda0f36 Don't build identifiers for C++ constructors, destructors, or
conversion functions. Instead, we just use a placeholder identifier
for these (e.g., "<constructor>") and override NamedDecl::getName() to
provide a human-readable name.

This is one potential solution to the problem; another solution would
be to replace the use of IdentifierInfo* in NamedDecl with a different
class that deals with identifiers better. I'm also prototyping that to
see how it compares, but this commit is better than what we had
previously.

llvm-svn: 59193
2008-11-12 23:21:09 +00:00
Douglas Gregor 6cf0806e75 Some cleanups to the declaration/checking of overloaded operators in C++. Thanks to Sebastian for the review
llvm-svn: 58986
2008-11-10 13:38:07 +00:00