Commit Graph

435 Commits

Author SHA1 Message Date
Daniel Dunbar dc78bd9f79 Fix -E mismatch; an identifier followed by a numeric constant does not
require a space (to avoid concatenation) if the numeric constant had a
leading period.
 - PR3819.
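
The rule in this message can be sketched as a small predicate. This is only an illustration under stated assumptions: the names `Kind`, `Tok`, and `avoidConcat` are hypothetical, and the sketch models only the case where the previous token is an identifier.

```cpp
#include <cassert>
#include <string>

// Hypothetical token model, for illustration only.
enum class Kind { Identifier, NumericConstant, Other };

struct Tok {
  Kind kind;
  std::string text;
};

// Returns true if printing `prev` immediately followed by `next` could make
// the two tokens lex as one, so a space is needed between them.
inline bool avoidConcat(const Tok &prev, const Tok &next) {
  assert(!next.text.empty());
  if (prev.kind != Kind::Identifier)
    return false;  // other pairs are not modelled in this sketch
  if (next.kind == Kind::Identifier)
    return true;   // "x" "y" would print as the single identifier "xy"
  if (next.kind == Kind::NumericConstant)
    return next.text[0] != '.';  // "x" ".5" stays two tokens; "x" "5" does not
  return false;
}
```

With this model, `x` followed by `5` needs a space, while `x` followed by `.5` does not.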

llvm-svn: 67163
2009-03-18 03:32:24 +00:00
Gabor Greif 31a082fd7a typo
llvm-svn: 67081
2009-03-17 11:39:38 +00:00
Douglas Gregor 23d75bb326 Build system changes to use TableGen to generate the various
diagnostics. This builds on the patch that Sebastian committed and
then reverted. Major differences are:

  - We don't remove or use the current ".def" files. Instead, for now,
    we just make sure that we're building the ".inc" files.
  - Fixed CMake makefiles to run TableGen and build the ".inc" files
    when needed. Tested with both the Xcode and Makefile generators
    provided by CMake, so it should be solid.
  - Fixed normal makefiles to handle out-of-source builds that involve
    the ".inc" files.

I'll send a separate patch to the list with Sebastian's changes that
eliminate the use of the .def files.

llvm-svn: 67058
2009-03-16 23:06:59 +00:00
Anders Carlsson 5bd30395b9 (Hopefully) instantiate dependent array types correctly.
llvm-svn: 67032
2009-03-15 20:12:13 +00:00
Chris Lattner 83aba00ee8 make Preprocessor::Diags be a pointer instead of a reference.
llvm-svn: 66955
2009-03-13 21:17:43 +00:00
Chris Lattner 80c21df8ea use accessor instead of poking ivar directly
llvm-svn: 66954
2009-03-13 21:17:23 +00:00
Chris Lattner cf35ce9d0d add a callback for macro expansion, based on a patch by Paolo Bolzoni!
llvm-svn: 66799
2009-03-12 17:31:43 +00:00
Chris Lattner da248f4f30 fix PR3768, Clang does -D__STDC_HOSTED__=1, even if -ffreestanding is passed.
llvm-svn: 66474
2009-03-09 21:50:12 +00:00
Chris Lattner 794c001ad4 fix PR3764 - A redefinition of a pre-processor macro fails
Redefinition checking should ignore the leading whitespace and
start of line flags on the first token of an expansion.

llvm-svn: 66442
2009-03-09 20:33:32 +00:00
Chris Lattner 7253991f9d add \n characters to the scratch buffer *before* returned tokens.
This prevents caret diagnostics from the scratch buffer from 
including other tokens in the scratch buffer that occurred before
them.

llvm-svn: 66375
2009-03-08 08:16:41 +00:00
Chris Lattner fa217bda40 simplify some logic by making ScratchBuffer handle the application of trailing
\0's to created tokens instead of making all clients do it.  No functionality
change.

llvm-svn: 66373
2009-03-08 08:08:45 +00:00
Mike Stump 82d8d559bb Fix warnings in build on clang-x86_64-freebsd buildbot.
llvm-svn: 66344
2009-03-07 18:35:41 +00:00
Chris Lattner d4a96730c1 #import is not considered an extension for ObjC.
llvm-svn: 66246
2009-03-06 04:28:03 +00:00
Chris Lattner 8322dc809e make the token lexer allocate its temporary token buffers for
preexpanded macro arguments from the preprocessor's bump pointer.
This reduces # mallocs from 12444 to 11792.

llvm-svn: 66025
2009-03-04 06:50:57 +00:00
Chris Lattner c25d8a7e30 improve compatibility with GCC 4.4, patch by Michel Salim (PR3697)
llvm-svn: 65884
2009-03-02 22:20:04 +00:00
Douglas Gregor 96977da72c Clean up and document code modification hints.
llvm-svn: 65641
2009-02-27 17:53:17 +00:00
Chris Lattner d42c29f9a2 fix some sema problems with wide strings and hook up basic codegen for them.
llvm-svn: 65582
2009-02-26 23:01:51 +00:00
Ted Kremenek 29b8697393 Move PTHStatCache within the anonymous namespace.
llvm-svn: 65348
2009-02-23 23:27:54 +00:00
Chris Lattner 70946da73a switch the macroinfo argument lists from being allocated off the heap
to being allocated from the same bumpptr that the MacroInfo objects 
themselves are.

This speeds up -Eonly cocoa.h pth by ~4%, fsyntax-only is barely measurable.

llvm-svn: 65195
2009-02-20 22:46:43 +00:00
Chris Lattner f87c510cc9 detemplatify setArgumentList and some other cleanups.
llvm-svn: 65187
2009-02-20 22:31:31 +00:00
Chris Lattner 666f7a42d6 require that MacroInfo objects are explicitly destroyed.
llvm-svn: 65179
2009-02-20 22:19:20 +00:00
Ted Kremenek e76eb060c7 Fix another PTH warning that should not be a note.
llvm-svn: 65072
2009-02-19 22:14:49 +00:00
Ted Kremenek 36b005db45 Make PTH warnings actual warnings instead of 'notes'.
llvm-svn: 65071
2009-02-19 22:13:40 +00:00
Chris Lattner 91668def8b fix PR3609, emit:
t.c:1:10: error: missing terminating '>' character
#include <stdio.h
         ^

instead of:

t.c:1:10: error: missing terminating " character
#include <stdio.h
         ^

llvm-svn: 65052
2009-02-19 18:29:56 +00:00
Chris Lattner ddb7191920 Next step toward making string diagnostics correct: handle
escapes in the string for subtoken positioning.  This gives
us working examples like:

t.m:5:16: warning: field width should have type 'int', but argument has type 'unsigned int'
  printf("\n\n%*d", (unsigned) 1, 1);
               ^    ~~~~~~~~~~~~

where before the caret pointed two spaces to the left.

llvm-svn: 64940
2009-02-18 19:21:10 +00:00
Chris Lattner 57a09cfcbc update comment.
llvm-svn: 64939
2009-02-18 18:56:29 +00:00
Chris Lattner ec396b5114 Fix some issues handling sub-token locations that come from macro expansions.
We now emit:

t.m:6:15: warning: field width should have type 'int', but argument has type 'unsigned int'
  printf(STR, (unsigned) 1, 1);
         ^    ~~~~~~~~~~~~
t.m:3:18: note: instantiated from:
#define STR "abc%*ddef"
                 ^

which has the correct location in the string literal in the note line.

llvm-svn: 64936
2009-02-18 18:52:52 +00:00
Fariborz Jahanian eb209e7dbd define __OBJC2__ for objc's nonfragile abi.
llvm-svn: 64642
2009-02-16 18:28:48 +00:00
Chris Lattner ee4b5235e3 Add support for deprecated members of RecordDecls (e.g. struct fields).
llvm-svn: 64634
2009-02-16 17:07:21 +00:00
Chris Lattner f52c0b261c add a new SourceManager::getInstantiationRange helper method.
llvm-svn: 64606
2009-02-15 21:26:50 +00:00
Chris Lattner 5a2e9cb42d fix PR3579: __LINE__ expands to the presumed location of the
*end* of a macro instantiation, not the start of it.  This is
really all about bug-for-bug compatibility with GCC, but not
doing this breaks the FreeBSD kernel.

llvm-svn: 64604
2009-02-15 21:06:39 +00:00
Chris Lattner 9dc9c206d3 track "just a little more" location information for macro instantiations.
Now instead of just tracking the expansion history, also track the full
range of the macro that got replaced.  For object-like macros, this doesn't
change anything.  For _Pragma and function-like macros, this means we track
the locations of the ')'.

This is required for PR3579 because apparently GCC uses the line of the ')'
of a function-like macro as the location to expand __LINE__ to.

llvm-svn: 64601
2009-02-15 20:52:18 +00:00
Chris Lattner 190f64e9b8 add an assertion from Alexei Svitkine!
llvm-svn: 64503
2009-02-13 23:06:48 +00:00
Chris Lattner 7e4c81c8c6 Give TargetInfo a new IntPtrType to hold the intptr_t type for
a target.

Make Preprocessor.cpp define a new __INTPTR_TYPE__ macro based on this.

On linux/32, set intptr_t to int, instead of long.  This fixes PR3563.
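
A minimal sketch of how a header could consume this macro, assuming a GCC-compatible compiler that predefines `__INTPTR_TYPE__`. The names `my_intptr_t` and `roundTrips` are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// __INTPTR_TYPE__ is supplied by the compiler, so a stdint.h implementation
// can follow the target's choice instead of hard-coding int or long.
#ifndef __INTPTR_TYPE__
#error "this sketch assumes a compiler that predefines __INTPTR_TYPE__"
#endif

typedef __INTPTR_TYPE__ my_intptr_t;  // hypothetical name

static_assert(sizeof(my_intptr_t) == sizeof(void *),
              "the intptr_t type must be wide enough to hold a pointer");

// Converts a pointer to the integer type and back, and checks it survives.
inline bool roundTrips(int *p) {
  assert(p != nullptr);
  my_intptr_t bits = reinterpret_cast<my_intptr_t>(p);
  return reinterpret_cast<int *>(bits) == p;
}
```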

llvm-svn: 64495
2009-02-13 22:28:55 +00:00
Ted Kremenek 2fd18ec43a PTH: Cache directory and negative 'stat' calls. This gives us a 1% performance improvement on Cocoa.h (fsyntax-only+PTH).
llvm-svn: 64490
2009-02-13 22:07:44 +00:00
Chris Lattner 9ef847be12 Fix rdar://6562329, a static analyzer crash Ted noticed on
wine sources.  This was happening because HighlightMacros was 
calling EnterMainFile multiple times on the same preprocessor
object and getting an assert due to the new #line stuff (the
file in question was bison output with #line directives).

The fix for this is to not reenter the file.  Instead, 
relex the tokens in raw mode, swizzle them a bit and repreprocess
the token stream.  An added bonus of this is that rewrite macros
will now highlight the macro definition as well as its uses.  Woo.

llvm-svn: 64480
2009-02-13 19:33:24 +00:00
Ted Kremenek 29942a349c Add some boilerplate to the PTH file to prepare for the caching of stats for directories (and negative stats too).
llvm-svn: 64477
2009-02-13 19:13:46 +00:00
Mike Stump 57d7354635 Fix cmake builds.
llvm-svn: 64455
2009-02-13 15:42:50 +00:00
Eli Friedman 159a7cbc36 Fix gcc warning: gcc correctly notes that const-qualifying the return
type doesn't do anything.

llvm-svn: 64424
2009-02-13 01:02:29 +00:00
Chris Lattner 644d452de5 factor token concatenation avoidance logic out of
PrintPreprocessedOutput into its own file.  No functionality change.

llvm-svn: 64418
2009-02-13 00:46:04 +00:00
Daniel Dunbar ad027c7781 Fix assertion when input is an empty string.
llvm-svn: 64397
2009-02-12 19:31:53 +00:00
Ted Kremenek b4c85ccaaa Re-enable PTH stat caching. All tests pass now.
llvm-svn: 64356
2009-02-12 03:45:39 +00:00
Ted Kremenek c6a2a37222 Fix bad reading of bytes in ReadUnalignedLE64() (copy-paste error).
llvm-svn: 64355
2009-02-12 03:39:55 +00:00
Ted Kremenek 3280145da4 Temporarily disable PTH stat caching as it appears to be failing on some machines.
llvm-svn: 64354
2009-02-12 03:36:54 +00:00
Ted Kremenek a5c2c27ebd PTH: Cache stat information for files in the PTH file. Hook up FileManager
to use this stat information in the PTH file using a 'StatSysCallCache' object.

Performance impact (Cocoa.h, PTH):
- number of stat calls reduces from 1230 to 425
- fsyntax-only: time improves by 4.2% 

We can reduce the number of stat calls to almost zero by caching negative stat
calls and directory stat calls in the PTH file as well.
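
The idea of caching negative results alongside positive ones can be sketched as follows. This is not the `StatSysCallCache` interface from the commit; `StatResult`, `CachingStat`, and the pluggable backend are hypothetical stand-ins so the sketch does not touch the real file system.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for what a stat() call reports.
struct StatResult {
  bool exists;
  long size;
  bool isDirectory;
};

class CachingStat {
public:
  typedef std::function<StatResult(const std::string &)> Backend;

  explicit CachingStat(Backend backend) : backend_(std::move(backend)) {
    assert(backend_ != nullptr);
  }

  // Remembers every answer, including "file does not exist", so a repeated
  // failed lookup never reaches the backend again.
  StatResult stat(const std::string &path) {
    auto it = cache_.find(path);
    if (it != cache_.end())
      return it->second;
    ++backendCalls_;
    StatResult result = backend_(path);
    cache_.emplace(path, result);
    return result;
  }

  unsigned backendCalls() const { return backendCalls_; }

private:
  Backend backend_;
  std::unordered_map<std::string, StatResult> cache_;
  unsigned backendCalls_ = 0;
};
```

Header search probes many directories for each `#include`, and most probes fail, which is presumably why caching negative results removes so many calls.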

llvm-svn: 64353
2009-02-12 03:26:59 +00:00
Ted Kremenek 4c1d41f2b1 PTH: Have meta data be at the beginning of the PTH file, not the end.
llvm-svn: 64338
2009-02-11 23:34:32 +00:00
Ted Kremenek e5554deb45 PTH: Replace string identifier to persistent ID lookup with a hashtable. This is
actually *slightly* slower than the binary search. Since this is algorithmically
better, further performance tuning should be able to make this faster.

llvm-svn: 64326
2009-02-11 21:29:16 +00:00
Ted Kremenek 8527e3a727 PTH: Don't emit the PTH offset of the IdentifierInfo string data as that data is
referenced by other tables.

llvm-svn: 64304
2009-02-11 16:06:55 +00:00
Ted Kremenek 86423a9993 PTH: Replace ad hoc 'file name' -> 'PTH data' lookup table in the PTH file with an on-disk chained hash table. This data structure is implemented using templates, and will be used to replace similar data structures. This change leads to no visible performance impact on Cocoa.h, but now we only pay a price for the table on the order of the number of files accessed and not the number in the PTH file.
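
A rough sketch of what an on-disk chained hash table can look like, not the layout the PTH file actually uses. The byte layout, the hash function, and the names `buildTable` and `lookup` are all assumptions made for illustration. The point is that a lookup reads the bucket directory and a single bucket, never the whole table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<unsigned char> Bytes;

inline void emitLE32(Bytes &out, uint32_t v) {
  for (int i = 0; i < 4; ++i)
    out.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xFF));
}

inline uint32_t readLE32(const Bytes &in, size_t off) {
  assert(off + 4 <= in.size());
  return uint32_t(in[off]) | (uint32_t(in[off + 1]) << 8) |
         (uint32_t(in[off + 2]) << 16) | (uint32_t(in[off + 3]) << 24);
}

// Simple string hash chosen for the sketch; any stable hash would do.
inline uint32_t hashKey(const std::string &key) {
  uint32_t h = 5381;
  for (unsigned char c : key)
    h = h * 33 + c;
  return h;
}

// Assumed layout: [numBuckets][offset of each bucket], then for each bucket
// [numEntries] followed by numEntries records of [keyLen][key bytes][value].
inline Bytes buildTable(
    const std::vector<std::pair<std::string, uint32_t> > &items,
    uint32_t numBuckets) {
  assert(numBuckets > 0);
  std::vector<std::vector<size_t> > buckets(numBuckets);
  for (size_t i = 0; i < items.size(); ++i)
    buckets[hashKey(items[i].first) % numBuckets].push_back(i);

  Bytes out;
  emitLE32(out, numBuckets);
  const size_t dirOff = out.size();
  out.resize(out.size() + 4 * numBuckets);  // directory, patched below
  for (uint32_t b = 0; b < numBuckets; ++b) {
    const uint32_t start = static_cast<uint32_t>(out.size());
    for (int i = 0; i < 4; ++i)
      out[dirOff + 4 * b + i] =
          static_cast<unsigned char>((start >> (8 * i)) & 0xFF);
    emitLE32(out, static_cast<uint32_t>(buckets[b].size()));
    for (size_t idx : buckets[b]) {
      const std::string &key = items[idx].first;
      emitLE32(out, static_cast<uint32_t>(key.size()));
      out.insert(out.end(), key.begin(), key.end());
      emitLE32(out, items[idx].second);
    }
  }
  return out;
}

// Walks only the chain of the bucket the key hashes to.
inline bool lookup(const Bytes &table, const std::string &key,
                   uint32_t &value) {
  const uint32_t numBuckets = readLE32(table, 0);
  size_t off = readLE32(table, 4 + 4 * (hashKey(key) % numBuckets));
  const uint32_t count = readLE32(table, off);
  off += 4;
  for (uint32_t i = 0; i < count; ++i) {
    const uint32_t len = readLE32(table, off);
    off += 4;
    assert(off + len <= table.size());
    const bool match =
        len == key.size() &&
        std::equal(key.begin(), key.end(), table.begin() + off);
    off += len;
    const uint32_t v = readLE32(table, off);
    off += 4;
    if (match) {
      value = v;
      return true;
    }
  }
  return false;
}
```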
llvm-svn: 64245
2009-02-10 22:16:22 +00:00
Chris Lattner 0763cc6bfd Export __INT8_TYPE__ / __INT16_TYPE__ / __INT32_TYPE__ / __INT64_TYPE__
in a gcc 4.5 compatible way so that stdint.h can follow the compiler's 
notion of types.

llvm-svn: 63976
2009-02-06 22:59:26 +00:00
Chris Lattner a31829b5bc -funsigned-char sets __CHAR_UNSIGNED__
llvm-svn: 63942
2009-02-06 18:20:57 +00:00
Chris Lattner 1630c3c4f0 Add an implementation of -dM that follows GCC closely enough to permit
diffing the output of:
  clang -dM -o - -E -x c foo.c | sort

llvm-svn: 63926
2009-02-06 06:45:26 +00:00
Chris Lattner 5c7cd6043e add interface for walking macro table.
llvm-svn: 63925
2009-02-06 06:26:42 +00:00
Chris Lattner bba531ce99 get __WCHAR_TYPE__ from the targetinfo hook
llvm-svn: 63920
2009-02-06 05:06:07 +00:00
Chris Lattner a91c30fdb0 simplify and refactor a bunch of type definition code in Preprocessor
predefines buffer initialization.

llvm-svn: 63919
2009-02-06 05:04:11 +00:00
Chris Lattner 61898606dc remove some ad-hocery and use DefineTypeSize for more things.
Now you too can have a 47 bit long long!

llvm-svn: 63918
2009-02-06 04:55:18 +00:00
Chris Lattner a3dc5d8423 refactor some code into a DefineTypeSize function.
llvm-svn: 63917
2009-02-06 04:50:25 +00:00
Chris Lattner fafd8d1be9 correct and generalize computation of __INTMAX_MAX__.
llvm-svn: 63848
2009-02-05 07:27:41 +00:00
Chris Lattner 8181312251 fix some differences between apple gcc and clang on darwin/x86-32.
llvm-svn: 63846
2009-02-05 07:19:24 +00:00
Chris Lattner 022923a22a Fix PR3464 by searching for headers from the predefines
buffer as if the #include happened from the main file.

llvm-svn: 63764
2009-02-04 19:45:07 +00:00
Chris Lattner 1c967784f3 Implement handling of file entry/exit notifications from GNU
line markers, including maintenance of the virtual include stack.

For something like this:

# 42 "bar.c" 1
# 142 "bar2.c" 1

#warning zappa
# 92 "bar.c" 2
#warning gonzo
# 102 "foo.c" 2
#warning bonkta


we now produce these three warnings:

#1:
In file included from foo.c:3:
In file included from bar.c:42:
bar2.c:143:2: warning: #warning zappa
#warning zappa
 ^

#2:
In file included from foo.c:3:
bar.c:92:2: warning: #warning gonzo
#warning gonzo
 ^

#3:
foo.c:102:2: warning: #warning bonkta
#warning bonkta
 ^
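
The bookkeeping behind those "In file included from" lines can be sketched as a small stack. This is a simplified model, not the SourceManager implementation: `PresumedPos` and `VirtualIncludeStack` are hypothetical names, and only marker flags 1 (enter file) and 2 (return to file) are handled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A presumed (user-visible) position: file name plus line number.
struct PresumedPos {
  std::string file;
  unsigned line;
};

class VirtualIncludeStack {
public:
  explicit VirtualIncludeStack(const PresumedPos &start) : current_(start) {}

  // flag 1 = entering a new file, flag 2 = returning to a file,
  // flag 0 = plain marker. Other GNU flags are outside this sketch.
  void lineMarker(unsigned line, const std::string &file, int flag) {
    assert(flag == 0 || flag == 1 || flag == 2);
    if (flag == 1)
      includers_.push_back(current_);  // remember where we were included from
    else if (flag == 2 && !includers_.empty())
      includers_.pop_back();           // drop back to the includer
    current_.file = file;
    current_.line = line;
  }

  const PresumedPos &current() const { return current_; }
  const std::vector<PresumedPos> &includers() const { return includers_; }

private:
  PresumedPos current_;
  std::vector<PresumedPos> includers_;
};
```

Starting at foo.c:3 and feeding in the four markers from the example leaves the includer stack at foo.c:3 and bar.c:42 while inside bar2.c, which matches the include chain printed for warning #1.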

llvm-svn: 63722
2009-02-04 06:25:26 +00:00
Chris Lattner 0a1a8d8514 propagate linemarker flags down into the line table, currently
ignoring include stack push/pop info though.

llvm-svn: 63719
2009-02-04 05:21:58 +00:00
Chris Lattner 1eaa70a612 stub out basic #line handling calls.
llvm-svn: 63667
2009-02-03 21:52:55 +00:00
Chris Lattner 60f36223a9 move library-specific diagnostic headers into library private dirs. Reduce
redundant #includes.  Patch by Anders Johnsen!

llvm-svn: 63271
2009-01-29 05:15:15 +00:00
Ted Kremenek 62224c1d7f Add more PTH diagnostics for invalid PTH files, etc.
llvm-svn: 63232
2009-01-28 21:02:43 +00:00
Ted Kremenek 3b0589e4b4 Enhance PTHManager::Create() to take an optional Diagnostic* argument that can be used to report issues such as a missing PTH file.
llvm-svn: 63231
2009-01-28 20:49:33 +00:00
Chris Lattner 7368d581c1 Split the single monolithic DiagnosticKinds.def file into one
.def file for each library.  This means that adding a diagnostic
to sema doesn't require all the other libraries to be rebuilt.

Patch by Anders Johnsen!

llvm-svn: 63111
2009-01-27 18:30:58 +00:00
Chris Lattner f1ca7d3e02 Introduce a new PresumedLoc class to represent the concept of a location
as reported to the user and as manipulated by #line.  This is what __FILE__,
__INCLUDE_LEVEL__, diagnostics and other things should follow (but not 
dependency generation!).  

This patch also includes several cleanups along the way: 

- SourceLocation now has a dump method, and several other places 
  that did similar things now use it.
- I cleaned up some code in AnalysisConsumer, but it should probably be
  simplified further now that NamedDecl is better.
- TextDiagnosticPrinter is now simplified and cleaned up a bit.

This patch is a prerequisite for #line, but does not actually provide 
any #line functionality.

llvm-svn: 63098
2009-01-27 07:57:44 +00:00
Chris Lattner bf648a3a63 Fix a bug that I noticed by inspection.
llvm-svn: 63094
2009-01-27 05:34:03 +00:00
Ted Kremenek 8d178f4357 PTH: Use Token::setLiteralData() to directly store a pointer to cached spelling data in the PTH file. This removes a ton of code for looking up spellings using sourcelocations in the PTH file. This simplifies both PTH-generation and reading.
Performance impact for -fsyntax-only on Cocoa.h (with Cocoa.h in the PTH file):
- PTH generation time improves by 5%
- PTH reading improves by 0.3%.

llvm-svn: 63072
2009-01-27 00:01:05 +00:00
Chris Lattner d381721810 Fix a bug I introduced in my changes, which caused MeasureTokenLength
to crash when given an instantiation location.  Thanks to Fariborz for
the testcase.

llvm-svn: 63057
2009-01-26 22:24:27 +00:00
Ted Kremenek 327d00cd45 Silence warning.
llvm-svn: 63054
2009-01-26 22:16:12 +00:00
Ted Kremenek 978b5becea Add version number checking to PTH files.
llvm-svn: 63047
2009-01-26 21:50:21 +00:00
Ted Kremenek eb8c8fbd63 Embed the offset of the PTH table inside the prologue of the PTH file. This will help improve gradual versioning of PTH files instead of relying that the PTH table is at a fixed offset.
llvm-svn: 63045
2009-01-26 21:43:14 +00:00
Chris Lattner 357b57d749 remove my hacks that aggressively threw away multiple
instantiation history in an effort to speed up c99-intconst-1.c.
Now that multiple nested instantiations are allowed, we just
make them and don't pay the cost of lookups.  With the other
changes that went in before this, reverting this is actually
a speedup for c99-intconst-1.c, speeding it up from 1.96s to 1.80s,
and preserves much better loc info.

llvm-svn: 63036
2009-01-26 20:24:53 +00:00
Chris Lattner 7e20927756 allow _Pragmas formed from #defines to keep their full instantiation
history

llvm-svn: 63035
2009-01-26 20:15:46 +00:00
Chris Lattner 5a7971e0c3 This change refactors some of the low-level lexer interfaces a bit.
Token now has a class of kinds for "literals", which include 
numeric constants, strings, etc.  These tokens can optionally have
a pointer to the start of the token in the lexer buffer.  This 
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.

This change is performance neutral, but will make other changes
more feasible down the road.

llvm-svn: 63028
2009-01-26 19:29:26 +00:00
Chris Lattner b5fba6f8d8 start plumbing together the line table information. So far we just
unique the Filenames in #line directives, assigning them UIDs.

llvm-svn: 63010
2009-01-26 07:57:50 +00:00
Chris Lattner 76e689636b add parsing and constraint enforcement for GNU line marker directives.
llvm-svn: 63003
2009-01-26 06:19:46 +00:00
Chris Lattner 38d7fd252a a few minor cleanups
llvm-svn: 63000
2009-01-26 05:30:54 +00:00
Chris Lattner 100c65e810 parse and enforce required constraints on #line directives. Right now
we just discard them.

llvm-svn: 62999
2009-01-26 05:29:08 +00:00
Chris Lattner ad13cf4e7a eagerly resolve the spelling locations of macro argument preexpansions.
This reduces fsyntax-only time on c99-intconst-1.c from 2.43s down to 
2.01s (20%), reducing the number of fileid lookups from 2529040 linear 
and 64771121 binary to 5625902 linear and 4151182 binary.

This knocks getFileID down to only 4.6% of compile time on this testcase.
At this point, malloc/free is over 35% of compile time, primarily allocating
MacroArgs objects and their argument preexpansion vectors.

I don't feel like malloc avoiding right now, so I'm just going to call
this good.

llvm-svn: 62994
2009-01-26 04:33:10 +00:00
Chris Lattner 5a5d67101b Eagerly resolve the spelling location of the tokens in a definition
of a macro.  Since these tokens may themselves be from macro 
expansions, we need to resolve down to the spelling loc when the
macro ends up being instantiated.  Instead of resolving this for
each token expanded from the macro definition, just do it once when
the macro is defined.  This speeds up clang on c99-intconst-1.c from
2.66s to 2.43s (9.5%), reducing the FileID lookups from 407244 linear and
114175649 binary to 2529040 linear and 64771121 binary.

llvm-svn: 62993
2009-01-26 04:06:48 +00:00
Chris Lattner dd9babc79a Only resolve a macro's instantiation loc once per macro, instead of once
per token lexed from it.  This speeds up clang on c99-intconst-1.c from
the GCC testsuite from 3.64s to 2.66s (36%).  This reduces the number of
binary search FileID lookups from 251570522 to 114175649 on this testcase.

llvm-svn: 62992
2009-01-26 03:46:22 +00:00
Chris Lattner 4fa23625ab Check in the long promised SourceLocation rewrite. This lays the
ground work for implementing #line, and fixes the "out of macro ID's" 
problem.

There is nothing particularly tricky about the code, other than the
very performance sensitive SourceManager::getFileID() method.

llvm-svn: 62978
2009-01-26 00:43:02 +00:00
Chris Lattner 1f6c7fe6a8 This is a follow-up to r62675:
Refactor how the preprocessor changes a token from being a tok::identifier to a 
keyword (e.g. tok::kw_for).  Instead of doing this in HandleIdentifier, hoist this
common case out into the caller, so that every keyword doesn't have to go through
HandleIdentifier.  This drops time in HandleIdentifier from 1.25ms to .62ms, and
speeds up clang -Eonly with PTH by about 1%.

llvm-svn: 62855
2009-01-23 18:35:48 +00:00
Chris Lattner f8ccb4f9e3 Update comment.
llvm-svn: 62819
2009-01-23 00:13:28 +00:00
Chris Lattner 34eab390b9 remove my gross #ifdef's, using portable abstractions now that the 32-bit
load is always aligned.

I verified that the bswap doesn't occur in the assembly code on x86.

llvm-svn: 62815
2009-01-22 23:50:07 +00:00
Chris Lattner fec5470f03 remove Read8/Read24, which are dead. Rename Read16/Read32 to be more
descriptive.

llvm-svn: 62775
2009-01-22 19:48:26 +00:00
Ted Kremenek ae54f2f590 Fix <rdar://problem/6512717> by correctly reading the right offset in the token data in PTHLexer::getSourceLocation().
llvm-svn: 62725
2009-01-21 22:41:38 +00:00
Chris Lattner 3029b35faa merge two checks for identifiers in the pth loop into one.
llvm-svn: 62677
2009-01-21 07:50:06 +00:00
Chris Lattner 8256b970a3 a trivial micro optimization to save a load.
llvm-svn: 62676
2009-01-21 07:45:14 +00:00
Chris Lattner ad89ec013f Add a bit to IdentifierInfo that acts as a simple predicate which
tells us whether Preprocessor::HandleIdentifier needs to be called.
Because this method is only rarely needed, this saves a call and a
bunch of random checks.  This drops the time in HandleIdentifier 
from 3.52ms to .98ms on cocoa.h on my machine.
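
The predicate-bit idea can be sketched like this. The class names and the particular set of properties folded into the bit are assumptions for illustration and do not reproduce the real `IdentifierInfo`.

```cpp
#include <cassert>

// Hypothetical identifier record: the cached bit is recomputed whenever one
// of the rarely-set properties changes, so the hot path tests a single flag.
class IdentifierSketch {
public:
  void setHasMacroDefinition(bool v) { hasMacro_ = v; recompute(); }
  void setPoisoned(bool v) { poisoned_ = v; recompute(); }
  bool needsHandling() const { return needsHandling_; }

private:
  void recompute() { needsHandling_ = hasMacro_ || poisoned_; }

  bool hasMacro_ = false;
  bool poisoned_ = false;
  bool needsHandling_ = false;
};

struct LexerSketch {
  unsigned slowPathCalls = 0;

  // Slow path: only reached for identifiers that asked for it.
  void handleIdentifier(const IdentifierSketch &id) {
    assert(id.needsHandling());
    ++slowPathCalls;
  }

  // Fast path: one flag test replaces a call plus several separate checks.
  void lex(const IdentifierSketch &id) {
    if (id.needsHandling())
      handleIdentifier(id);
  }
};
```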

llvm-svn: 62675
2009-01-21 07:43:11 +00:00
Ted Kremenek 8d6c828728 Don't crash on empty PTH files. This fixes <rdar://problem/6512714>.
llvm-svn: 62673
2009-01-21 07:34:28 +00:00
Chris Lattner c950296006 really we only need one Read24!
llvm-svn: 62672
2009-01-21 07:28:57 +00:00
Chris Lattner 47def9787e revert my previous patch, it assumed endianness.
llvm-svn: 62671
2009-01-21 07:21:56 +00:00
Chris Lattner a74f7cbb9d minor cleanups: now that tokens are 4-byte aligned in a PTH
file, just load them directly as ints.

llvm-svn: 62668
2009-01-21 07:06:08 +00:00
Ted Kremenek 52f73cad4a Fix: <rdar://problem/6510344> [pth] PTH slows down regular lexer considerably (when it has substantial work)
Changes to IdentifierTable:
- High-level summary: StringMap never owns IdentifierInfos.  It just
references them.
- The string map now has StringMapEntry<IdentifierInfo*> instead of
  StringMapEntry<IdentifierInfo>.  The IdentifierInfo object is
  allocated using the same bump pointer allocator as used by the
  StringMap.

Changes to IdentifierInfo:
- Added an extra pointer to point to the
  StringMapEntry<IdentifierInfo*> in the string map.  This pointer
  will be null if the IdentifierInfo* is *only* used by the PTHLexer
  (that is it isn't in the StringMap).

Algorithmic changes:
- Non-PTH case:
   IdentifierInfo::get() will always consult the StringMap first to
   see if we have an IdentifierInfo object.  If that StringMapEntry
   references a null pointer, we allocate a new one from the BumpPtrAllocator
   and update the reference in the StringMapEntry.
- PTH case:
   We do the same lookup as with the non-PTH case, but if we don't get
   a hit in the StringMap we do a secondary lookup in the PTHManager for
   the IdentifierInfo.  If we don't find an IdentifierInfo we create a
   new one as in the non-PTH case.  If we do find an IdentifierInfo
   in the PTHManager, we update the StringMapEntry to refer to it so
   that the IdentifierInfo will be found on the next StringMap lookup.
   This way we only do a binary search in the PTH file at most once
   for a given IdentifierInfo.  This greatly speeds things up for source
   files containing a non-trivial amount of code.
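
The two lookup paths above can be sketched with standard containers. This is a simplified model, not the clang code: `std::unordered_map` stands in for the StringMap, `std::map` stands in for the binary search in the PTH file, and all class names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct IdentInfo {
  std::string name;
  bool fromExternal;  // true if it came from the secondary (PTH-like) source
};

// Stand-in for the PTHManager side: counts how often it is searched.
class ExternalLookup {
public:
  void add(const std::string &name) { infos_[name] = IdentInfo{name, true}; }

  IdentInfo *find(const std::string &name) {
    ++searches;
    auto it = infos_.find(name);
    return it == infos_.end() ? nullptr : &it->second;
  }

  unsigned searches = 0;

private:
  std::map<std::string, IdentInfo> infos_;
};

class IdentTable {
public:
  explicit IdentTable(ExternalLookup *external = nullptr)
      : external_(external) {}

  IdentInfo &get(const std::string &name) {
    IdentInfo *&slot = map_[name];  // the map references infos, never owns them
    if (slot)
      return *slot;                 // hit: no secondary lookup needed
    if (external_)
      slot = external_->find(name); // at most one secondary search per name
    if (!slot) {
      owned_.push_back(
          std::unique_ptr<IdentInfo>(new IdentInfo{name, false}));
      slot = owned_.back().get();
    }
    assert(slot != nullptr);
    return *slot;
  }

private:
  ExternalLookup *external_;
  std::unordered_map<std::string, IdentInfo *> map_;
  std::vector<std::unique_ptr<IdentInfo> > owned_;
};
```

Constructing `IdentTable` without an `ExternalLookup` gives the non-PTH behavior. With one, the first `get` for a name may search the external source and every later `get` is served from the map.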

Performance impact:
   While these changes do add some extra indirection in
   IdentifierTable to access an IdentifierInfo*, I saw speedups even
   in the non-PTH case as well.

   Non-PTH: For -fsyntax-only on Cocoa.h, we see a 6% speedup.
   PTH (with Cocoa.h in token cache): 11% speedup.

   I also did an experiment where we did -fsyntax-only on a source file
   including a large header and Cocoa.h, but the token cache did not
   contain the larger header.  For this file, we were seeing a performance
   *regression* when using PTH of 3% over non-PTH.  Now we are seeing
   a performance improvement of 9%!

Tests:
   The serialization tests are now failing.  I looked at this extensively,
   and my belief is that this change is unmasking a bug rather than
   introducing a new one.  I have disabled the serialization tests for now.

llvm-svn: 62636
2009-01-20 23:28:34 +00:00
Ted Kremenek 8433f0b400 PTH: Emitted tokens now consist of 12 bytes that are loaded using three 32-bit loads. This reduces user time but increases system time because of the slightly larger PTH file. Although there is no performance win on Cocoa.h and -Eonly, overall this seems like a good step.
llvm-svn: 62542
2009-01-19 23:13:15 +00:00
Chris Lattner 4fd8b958be do not use SourceManager::getFileCharacteristic(FileID), it is not
safe because a #line can change the file characteristic on a per-loc
basis.

llvm-svn: 62502
2009-01-19 08:01:53 +00:00
Chris Lattner c033416639 do not use SourceManager::getFileCharacteristic(FileID), it is not
safe because a #line can change the file characteristic on a per-loc
basis.

llvm-svn: 62501
2009-01-19 07:59:15 +00:00
Chris Lattner cbc35ecb04 Rename SourceManager::getCanonicalFileID -> getFileID. There is
no longer any such thing as a non-canonical FileID.

llvm-svn: 62499
2009-01-19 07:46:45 +00:00
Ted Kremenek 8c3b812148 Run destructors of MacroInfo objects to free memory they allocate. This addresses <rdar://problem/6506035>.
llvm-svn: 62498
2009-01-19 07:45:44 +00:00
Chris Lattner 02495d80ef Make some enums in SourceLocation private, remove a useless assertion from ScratchBuffer.
llvm-svn: 62492
2009-01-19 06:57:37 +00:00
Chris Lattner 29a2a191f2 Make SourceLocation::getFileLoc private to reduce the API exposure of
SourceLocation.  This requires making some cleanups to token pasting
and _Pragma expansion.

llvm-svn: 62490
2009-01-19 06:46:35 +00:00
Chris Lattner fc014f80e5 fix rdar://6505352 - Bogus warning with -Wundef, a case
Anders noticed.

llvm-svn: 62472
2009-01-18 21:18:58 +00:00
Chris Lattner 144aacd19e rearrange GetIdentifierInfo so that the fast path can be partially inlined into PTHLexer::Lex. This speeds up the user time of PTH -Eonly by another 2ms (4.4%)
llvm-svn: 62454
2009-01-18 02:57:21 +00:00
Chris Lattner 18fc6ceb56 rename some variables, only set a tokens identifierinfo if non-null.
llvm-svn: 62450
2009-01-18 02:34:01 +00:00
Chris Lattner 9cdd877436 On i386 and x86-64, just do unaligned loads
instead of assembling from bytes.  This speeds up -Eonly PTH reading 
of cocoa.h by about 2ms, which is 4.2%.
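
The two ways of reading a 32-bit value can be sketched as below. The function names are hypothetical and this is not the PTH reader code. The direct load only matches the byte-assembled result on a little-endian host, so it cannot replace the portable version everywhere.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Portable version: assemble a little-endian 32-bit value byte by byte.
inline uint32_t read32FromBytes(const unsigned char *p) {
  assert(p != nullptr);
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) |
         (uint32_t(p[3]) << 24);
}

// Direct load: memcpy expresses an unaligned load without undefined
// behavior, and compilers typically turn it into a single load on x86.
// The value is only correct when the host is little-endian.
inline uint32_t read32Direct(const unsigned char *p) {
  assert(p != nullptr);
  uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

inline bool hostIsLittleEndian() {
  const uint32_t one = 1;
  unsigned char first;
  std::memcpy(&first, &one, 1);
  return first == 1;
}
```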

llvm-svn: 62447
2009-01-18 02:19:16 +00:00
Chris Lattner 137d6492a8 switch PTHLexer to use Read32 and friends instead of lots of inlined
copies.  I verified that this causes no performance change in PTH.

llvm-svn: 62445
2009-01-18 02:10:31 +00:00
Chris Lattner eb09754a9d switch PTH lexer from using "const char*"s to "const unsigned char*"s
internally.  This is just a cleanup that reduces the need to cast to
unsigned char before assembling a larger integer.

llvm-svn: 62442
2009-01-18 01:57:14 +00:00
Chris Lattner 71dc14b9f0 Rename SourceLocation::getFileID to getChunkID, because it returns
the chunk ID not the file ID.  This exposes problems in 
TextDiagnosticPrinter where it should have been using the canonical
file ID but wasn't.  Fix these along the way.

llvm-svn: 62427
2009-01-17 08:45:21 +00:00
Chris Lattner 5509d533f6 simplify some lookups.
llvm-svn: 62426
2009-01-17 08:30:10 +00:00
Chris Lattner 757169b60f Change the Lexer ctor used to lex _Pragma directives into a static factory
method.  This lets us clean up the interface and make it more obvious that
this method is *really really* _Pragma specific.

Note that _Pragma handling uglifies the Lexer in the critical path.  It would
be very interesting to consider making _Pragma remapping be a new special
lexer class of its own.

llvm-svn: 62425
2009-01-17 08:27:52 +00:00
Chris Lattner ab1d4b8abd simplify PTHManager::CreateLexer
llvm-svn: 62424
2009-01-17 08:06:50 +00:00
Chris Lattner c809089b26 Change the Lexer ctor used in the non _Pragma case to take a FileID instead
of a SourceLocation.  This should speed it up and definitely simplifies it.

llvm-svn: 62422
2009-01-17 08:03:42 +00:00
Chris Lattner 8ddb5cf0cf in Preprocessor::AdvanceToTokenCharacter, don't actually bother
creating a whole lexer when we just want one static method.

llvm-svn: 62420
2009-01-17 07:57:25 +00:00
Chris Lattner 5965a28a4b More simplifications to the lexer ctors.
llvm-svn: 62419
2009-01-17 07:56:59 +00:00
Chris Lattner fcf6452eb4 make the verbose raw-lexer ctor fully explicit instead of having
embedded magic.

llvm-svn: 62417
2009-01-17 07:42:27 +00:00
Chris Lattner 08354fef13 add a simplified lexer ctor that sets up the lexer to raw-lex an
entire file.

llvm-svn: 62414
2009-01-17 07:35:14 +00:00
Chris Lattner f76b92092e refactor some common initialization code out of the two lexer ctors.
llvm-svn: 62411
2009-01-17 06:55:17 +00:00
Chris Lattner 3793bba26f suck the call to "getSpellingLoc" that all clients do into
the implementation of PTHManager::getSpelling.

llvm-svn: 62408
2009-01-17 06:29:33 +00:00
Chris Lattner d32480d3db this massive patch introduces a simple new abstraction: it makes
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.

This is an important distinction from the "FileID" currently tracked by
SourceLocation.  *That* FileID may refer to the start of a file or to a
chunk within it.  The new FileID *only* refers to the file (and its 
#include stack and eventually #line data), it cannot refer to a chunk.

FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.
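
The opaque-handle pattern described here can be sketched as follows. `SourceManagerSketch` and its members are hypothetical. The point is only that a bare `unsigned` no longer converts to a `FileID`, so the type checker catches accidental mixing.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

class SourceManagerSketch;

// Opaque handle: clients can copy and compare it, but cannot build one from
// an integer or look inside it.
class FileID {
public:
  FileID() : id_(0) {}
  bool isInvalid() const { return id_ == 0; }
  bool operator==(FileID rhs) const { return id_ == rhs.id_; }
  bool operator!=(FileID rhs) const { return id_ != rhs.id_; }

private:
  friend class SourceManagerSketch;  // the only class allowed to mint IDs
  explicit FileID(unsigned id) : id_(id) {}
  unsigned id_;
};

class SourceManagerSketch {
public:
  FileID createFileID(const std::string &name) {
    names_.push_back(name);
    return FileID(static_cast<unsigned>(names_.size()));
  }

  const std::string &getName(FileID fid) const {
    assert(!fid.isInvalid() && fid.id_ <= names_.size());
    return names_[fid.id_ - 1];
  }

private:
  std::vector<std::string> names_;
};

// A random unsigned is no longer usable where a FileID is expected.
static_assert(!std::is_convertible<unsigned, FileID>::value,
              "FileID must not be implicitly constructible from unsigned");
```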

llvm-svn: 62407
2009-01-17 06:22:33 +00:00
Chris Lattner 1abd20901b Instead of iterating over FileID's, have PTH generation iterate over the
content cache directly.  Content cache has a 1-1 mapping with fileentries,
whereas multiple FileIDs can be the same FileEntry.

llvm-svn: 62401
2009-01-17 03:48:08 +00:00
Chris Lattner 5882771102 Fix PR2477 - clang misparses "//*" in C89 mode
llvm-svn: 62368
2009-01-16 22:39:25 +00:00
Chris Lattner 5244f34e75 As a performance optimization, don't bother calling MacroInfo::isIdenticalTo
if warnings in system headers are disabled.  isIdenticalTo can end up 
calling the expensive getSpelling method, and other bad stuff and is 
completely unneeded if the warning will be discarded anyway. rdar://6502956

llvm-svn: 62347
2009-01-16 19:50:11 +00:00
Chris Lattner f49775dc81 only notify callbacks if they exist.
llvm-svn: 62334
2009-01-16 19:01:46 +00:00
Chris Lattner 262d4e31b9 Improve #pragma comment support by building the string argument and
notifying PPCallbacks about it.

llvm-svn: 62333
2009-01-16 18:59:23 +00:00
Chris Lattner 8a24e588d7 minor cleanups to StringLiteralParser: no need to pass target info
into its ctor.  Also, make it handle validity checking of pascal
strings instead of making clients do it.

llvm-svn: 62332
2009-01-16 18:51:42 +00:00
Chris Lattner 2ff698df60 Implement basic support for parsing #pragma comment, a microsoft extension
documented here:
http://msdn.microsoft.com/en-us/library/7f0aews7(VS.80).aspx

This is according to my understanding reading the docs, I don't know if it
really agrees fully with what VC++ allows.

llvm-svn: 62317
2009-01-16 08:21:25 +00:00
Chris Lattner 8a42586c54 more SourceLocation lexicon change: instead of referring to the
"logical" location, refer to the "instantiation" location.

llvm-svn: 62316
2009-01-16 07:36:28 +00:00
Chris Lattner 7c8556e7bc remove obsolete comment which happened to go over 80 cols.
llvm-svn: 62313
2009-01-16 07:04:11 +00:00
Chris Lattner 15af77f679 remove an unneeded const_cast.
llvm-svn: 62311
2009-01-16 07:02:14 +00:00
Chris Lattner 53e384f633 Change some terminology in SourceLocation: instead of referring to
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful; tokens aren't really physical objects!

llvm-svn: 62309
2009-01-16 07:00:02 +00:00
Ted Kremenek 4bbb79a642 PTH: Fix termination condition in binary search.
llvm-svn: 62277
2009-01-15 19:28:38 +00:00
Ted Kremenek a705b04d7f IdentifierInfo:
- IdentifierInfo can now (optionally) have its string data not be
  co-located with itself.  This is for use with PTH.  This aspect is a
  little gross, as getName() and getLength() now make assumptions
  about a possible alternate representation of IdentifierInfo.
  Perhaps we should make IdentifierInfo have virtual methods?

IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
  IdentifierTable to perform "string -> IdentifierInfo" lookups using
  an auxiliary data structure.  This is used by PTH.
- Performance tests show that IdentifierTable::get() does not slow down
  because of the extra check for the IdentifierInfoLookup object (the
  regular StringMap lookup does enough work to mitigate the impact of
  an extra null pointer check).
- The upshot is that some IdentifierInfo objects might now be
  owned by the IdentifierInfoLookup object.  This should be reviewed.

PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
  IdentifierTable's string map, and instead create IdentifierInfo
  objects on the fly when mapping from persistent IDs to
  IdentifierInfos.  This saves a ton of work with string copies,
  hashing, and StringMap lookup and resizing.  This change was
  motivated because when processing source files in the PTH cache we
  don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
  IdentifierTable to transparently use IdentifierInfo objects managed
  by the PTH file.  PTHManager resolves "string -> IdentifierInfo"
  queries by doing a binary search over a sorted table of identifier
  strings in the PTH file (the exact algorithm we use can be changed
  as needed).
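
The lookup arrangement described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Clang implementation: `IdentInfo`, `ExternalLookup`, `SortedTableLookup`, and `IdentTable` are hypothetical stand-ins for IdentifierInfo, IdentifierInfoLookup, PTHManager, and IdentifierTable, and the sorted table lives in memory rather than in a PTH file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an identifier record; not clang's IdentifierInfo.
struct IdentInfo {
  std::string name;
};

// Hypothetical external lookup hook ("string -> IdentInfo").
class ExternalLookup {
public:
  virtual ~ExternalLookup() = default;
  // Returns nullptr when the name is unknown to the external source.
  virtual IdentInfo *get(const std::string &name) = 0;
};

// Resolves names by binary search over a table sorted by identifier string.
class SortedTableLookup : public ExternalLookup {
  std::vector<IdentInfo> table_; // must be sorted by name

public:
  explicit SortedTableLookup(std::vector<IdentInfo> table)
      : table_(std::move(table)) {
    assert(std::is_sorted(table_.begin(), table_.end(),
                          [](const IdentInfo &a, const IdentInfo &b) {
                            return a.name < b.name;
                          }));
  }

  IdentInfo *get(const std::string &name) override {
    auto it = std::lower_bound(
        table_.begin(), table_.end(), name,
        [](const IdentInfo &e, const std::string &key) { return e.name < key; });
    return (it != table_.end() && it->name == name) ? &*it : nullptr;
  }
};

// Table that consults the external source first, then falls back to its map.
class IdentTable {
  ExternalLookup *external_; // may be null; costs one pointer check per get()
  std::unordered_map<std::string, std::unique_ptr<IdentInfo>> map_;

public:
  explicit IdentTable(ExternalLookup *external = nullptr)
      : external_(external) {}

  IdentInfo &get(const std::string &name) {
    if (external_) {
      if (IdentInfo *found = external_->get(name))
        return *found; // owned by the external source, not by this table
    }
    auto &slot = map_[name];
    if (!slot)
      slot = std::make_unique<IdentInfo>(IdentInfo{name});
    return *slot;
  }

  std::size_t ownedCount() const { return map_.size(); }
};
```

In this sketch, an identifier found in the sorted table never enters the hash map, which mirrors the ownership question the message raises: some records belong to the external lookup object rather than to the table.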

These changes lead to the following performance changes when using PTH on Cocoa.h:
- -fsyntax-only: 10% performance improvement
- -Eonly: 30% performance improvement

llvm-svn: 62273
2009-01-15 18:47:46 +00:00
Ted Kremenek bef9fc2240 PTH: Embed a persistentID side-table in the PTH file that is sorted in the
lexical order of the corresponding identifier strings. This will be used for a
forthcoming optimization. This slows down PTH generation time by 7%. We can
revert this change if the optimization proves to not be valuable.
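
A side-table of this kind can be sketched as below. The sketch assumes that persistent IDs are simply indices into an in-memory vector of identifier strings; the function names are hypothetical and the real PTH file layout is not modeled.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Builds a table of persistent IDs (here: indices into `names`) ordered by
// the lexical order of the strings they refer to.
std::vector<std::uint32_t>
buildSortedIdTable(const std::vector<std::string> &names) {
  std::vector<std::uint32_t> ids(names.size());
  std::iota(ids.begin(), ids.end(), 0u);
  std::sort(ids.begin(), ids.end(), [&](std::uint32_t a, std::uint32_t b) {
    return names[a] < names[b];
  });
  return ids;
}

// Binary search through the side-table; returns -1 when the key is absent.
long findPersistentId(const std::vector<std::string> &names,
                      const std::vector<std::uint32_t> &sortedIds,
                      const std::string &key) {
  auto it = std::lower_bound(
      sortedIds.begin(), sortedIds.end(), key,
      [&](std::uint32_t id, const std::string &k) { return names[id] < k; });
  if (it != sortedIds.end() && names[*it] == key)
    return static_cast<long>(*it);
  return -1;
}
```

Sorting the IDs rather than the strings leaves the original ID assignment untouched, at the cost of one extra sort when the table is generated.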

llvm-svn: 62248
2009-01-15 01:26:25 +00:00
Ted Kremenek e9814186ac PTH:
- Use canonical FileID when using getSpelling() caching.  This
  addresses some cache misses we were seeing with -fsyntax-only on
  Cocoa.h
- Added Preprocessor::getPhysicalCharacterAt() utility method for
  clients to grab the first character at a specified SourceLocation.
  This uses the PTH spelling cache.
- Modified Sema::ActOnNumericConstant() to use
  Preprocessor::getPhysicalCharacterAt() instead of
  SourceManager::getCharacterData() (to get PTH hits).

These changes cause -fsyntax-only to not page in any sources from
Cocoa.h.  We see a speedup of 27%.

llvm-svn: 62193
2009-01-13 23:19:12 +00:00
Ted Kremenek 7cbdcc25d4 Fix corner cases in PTH getSpelling() binary search.
llvm-svn: 62187
2009-01-13 22:16:45 +00:00
Ted Kremenek b0b4f74b6b PTH: Fix remaining cases where the spelling cache in the PTH file was being missed when it shouldn't. This shaves another 7% off PTH time for -Eonly on Cocoa.h
llvm-svn: 62186
2009-01-13 22:05:50 +00:00
Ted Kremenek 47b8cf6deb Enhance PTH 'getSpelling' caching:
- Refactor caching logic into a helper class PTHSpellingSearch
- Allow "random accesses" in the spelling cache, thus catching the remaining
  cases where 'getSpelling' wasn't hitting the PTH cache
  
For -Eonly, PTH, Cocoa.h:
- This reduces wall time by 3% (user time unchanged, sys time reduced)
- This reduces the amount of paged source by 1112K.
  The remaining 1112K still being paged in is from somewhere else
  (investigating).

llvm-svn: 62009
2009-01-09 22:05:30 +00:00
Ted Kremenek 8ae06625b5 Invert assertion condition.
llvm-svn: 61961
2009-01-09 00:36:11 +00:00
Ted Kremenek d5e6e16d0d PTH: Hook up getSpelling() caching in PTHLexer. This results in a nice
performance gain. Here's what we see for -Eonly on Cocoa.h (using PTH):

- wall time decreases by 21% (26% speedup overall)
- system time decreases by 35%
- user time decreases by 6%
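
The relation between the wall-time reduction and the quoted speedup can be worked out as follows, assuming "speedup" means the ratio of old to new wall time minus one (the message does not define the term):

```latex
T_{\text{new}} = (1 - 0.21)\,T_{\text{old}}
\quad\Longrightarrow\quad
\text{speedup} = \frac{T_{\text{old}}}{T_{\text{new}}} - 1
             = \frac{1}{0.79} - 1 \approx 0.27
```

This is in line with the roughly 26% quoted; the small difference would be expected if the quoted figures were computed from unrounded timings.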

These reductions are due to not paging source files just to get spellings for
literals. The solution in place doesn't appear to be 100% yet, as we still see
some of the pages for source files getting mapped in. Using -print-stats, we see
that SourceManager maps in 7179K fewer bytes of source text (reduction of 75%).
Will investigate why the remaining 25% are getting paged in.

With these changes, here's how PTH compares to non-PTH on Cocoa.h:
  -Eonly: PTH takes 64% of the time as non-PTH (54% speedup)
  -fsyntax-only: PTH takes 89% of the time as non-PTH (11% speedup)

llvm-svn: 61913
2009-01-08 04:30:32 +00:00
Ted Kremenek 884a558441 PTH:
- Added stub PTHLexer::getSpelling() that will be used for fetching cached
  spellings from the PTH file.  This doesn't do anything yet.
- Added a hook in Preprocessor::getSpelling() to call PTHLexer::getSpelling()
  when using a PTHLexer.
- Updated PTHLexer to read the offsets of spelling tables in the PTH file.

llvm-svn: 61911
2009-01-08 02:47:16 +00:00
Chris Lattner 933d5ffc11 Optimize stringification a bit to avoid std::string thrashing and
avoid the version of Preprocessor::getSpelling that returns an 
std::string.

llvm-svn: 61769
2009-01-05 23:04:18 +00:00
Chris Lattner 07ebf302e5 simplify Preprocessor::getSpelling now that identifiers carry around
their length.

llvm-svn: 61734
2009-01-05 19:44:41 +00:00
Steve Naroff f9c29d4200 Add parser support for __forceinline, __w64, __ptr64.
llvm-svn: 61431
2008-12-25 14:41:26 +00:00
Steve Naroff 44ac777741 Add parser support for __cdecl, __stdcall, and __fastcall.
Change preprocessor implementation of _cdecl to reference __cdecl.

llvm-svn: 61430
2008-12-25 14:16:32 +00:00
Steve Naroff 3a9b7e0cff Add explicit "fuzzy" parse support for Microsoft declspec.
Remove previous __declspec macro that would effectively erase the construct prior to parsing.

llvm-svn: 61422
2008-12-24 20:59:21 +00:00
Ted Kremenek b0051a9955 Remove old PTH token-generation test harness.
llvm-svn: 61382
2008-12-23 19:25:33 +00:00