Commit Graph

48 Commits

Author SHA1 Message Date
Ted Kremenek d6e190d899 PTH: Cache *un-cleaned* spellings for literals instead of cleaned spellings.
This allows the PTH file to stay 100% in fidelity with the source code and
defines away some weird cosmetic bugs for operations such as '-E' where
maintaining knowledge of the original literal representation is useful.

llvm-svn: 65361
2009-02-24 01:26:56 +00:00
Ted Kremenek 1f11c65a27 PTH: When emitting tokens for literals with cached spellings, change the token
size to that of the *cleaned* spelling. This way 'getSpelling()' for literals in
the Preprocessor just works and doesn't read beyond the bounds of the cached
spelling buffer.

llvm-svn: 65354
2009-02-24 00:30:21 +00:00
Ted Kremenek 01144480fd PTH generation: Clear the cleaning bit for literals (whose spellings are cached).
llvm-svn: 65148
2009-02-20 20:32:39 +00:00
Cedric Venet d3c80de95f Fix the build on win32.
llvm-svn: 64556
2009-02-14 16:15:20 +00:00
Ted Kremenek 2fd18ec43a PTH: Cache directory and negative 'stat' calls. This gives us a 1% performance improvement on Cocoa.h (fsyntax-only+PTH).
llvm-svn: 64490
2009-02-13 22:07:44 +00:00
Ted Kremenek 29942a349c Add some boilerplate to the PTH file to prepare for the caching of stats for directories (and negative stats too).
llvm-svn: 64477
2009-02-13 19:13:46 +00:00
Ted Kremenek a5c2c27ebd PTH: Cache stat information for files in the PTH file. Hook up FileManager
to use this stat information in the PTH file using a 'StatSysCallCache' object.

Performance impact (Cocoa.h, PTH):
- number of stat calls reduces from 1230 to 425
- fsyntax-only: time improves by 4.2% 

We can reduce the number of stat calls to almost zero by caching negative stat
calls and directory stat calls in the PTH file as well.

llvm-svn: 64353
2009-02-12 03:26:59 +00:00
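A minimal sketch of the idea in the commit above: consult preloaded stat results first, and fall back to the real call only on a miss. `StatInfo`, `CachedStat`, `preload`, and the fallback callback are hypothetical names for illustration; they are not the actual 'StatSysCallCache' interface. Negative results (file does not exist) are stored the same way as positive ones, which is what lets the number of real calls approach zero.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for the result of a stat call.
struct StatInfo {
  bool exists;    // false models a cached *negative* stat
  uint64_t size;  // only meaningful when exists is true
};

// Hypothetical cache: answers from preloaded entries, else asks the fallback.
class CachedStat {
  std::map<std::string, StatInfo> cache_;
  std::function<StatInfo(const std::string &)> fallback_;
  unsigned fallbackCalls_ = 0;

public:
  explicit CachedStat(std::function<StatInfo(const std::string &)> fallback)
      : fallback_(std::move(fallback)) {}

  // Entries would come from the cache file; here they are added by hand.
  void preload(const std::string &path, StatInfo info) { cache_[path] = info; }

  StatInfo stat(const std::string &path) {
    std::map<std::string, StatInfo>::const_iterator it = cache_.find(path);
    if (it != cache_.end())
      return it->second;  // hit: no system call needed
    ++fallbackCalls_;     // miss: pay for the real call (result is not stored)
    return fallback_(path);
  }

  unsigned fallbackCalls() const { return fallbackCalls_; }
};
```

Counting fallback calls mirrors the "number of stat calls" metric quoted in the commit message.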
Ted Kremenek 4c1d41f2b1 PTH: Have meta data be at the beginning of the PTH file, not the end.
llvm-svn: 64338
2009-02-11 23:34:32 +00:00
Ted Kremenek e5554deb45 PTH: Replace string identifier to persistent ID lookup with a hashtable. This is
actually *slightly* slower than the binary search. Since this is algorithmically
better, further performance tuning should be able to make this faster.

llvm-svn: 64326
2009-02-11 21:29:16 +00:00
Ted Kremenek 8527e3a727 PTH: Don't emit the PTH offset of the IdentifierInfo string data as that data is
referenced by other tables.

llvm-svn: 64304
2009-02-11 16:06:55 +00:00
Ted Kremenek c7d6964a38 PTH generation: Discard tokens that appear after and on the same line as '#endif'.
llvm-svn: 64250
2009-02-10 22:43:16 +00:00
Ted Kremenek 99c7275118 PTH generation: Don't call 'EmitToken' in the loop condition. This is preparing for other changes within the loop.
llvm-svn: 64247
2009-02-10 22:27:09 +00:00
Ted Kremenek 86423a9993 PTH: Replace ad hoc 'file name' -> 'PTH data' lookup table in the PTH file with an on-disk chained hash table. This data structure is implemented using templates, and will be used to replace similar data structures. This change leads to no visible performance impact on Cocoa.h, but now we only pay a price for the table on the order of the number of files accessed and not the number in the PTH file.
llvm-svn: 64245
2009-02-10 22:16:22 +00:00
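To illustrate why an on-disk chained hash table costs only as much as the lookups performed, here is a small sketch that serializes buckets into a byte buffer and reads back just the one bucket a key hashes to. The layout, the FNV-1a hash, and the names `buildTable` and `lookup` are assumptions for this example, not the format or template interface used in the PTH file.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Little-endian 32-bit helpers (byte order is an assumption of this sketch).
static void put32(Bytes &b, uint32_t v) {
  for (int i = 0; i < 4; ++i) b.push_back(static_cast<unsigned char>(v >> (8 * i)));
}
static uint32_t get32(const Bytes &b, size_t at) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) v |= static_cast<uint32_t>(b[at + i]) << (8 * i);
  return v;
}

// Hypothetical string hash (FNV-1a).
static uint32_t hashKey(const std::string &s) {
  uint32_t h = 2166136261u;
  for (size_t i = 0; i < s.size(); ++i) {
    h ^= static_cast<unsigned char>(s[i]);
    h *= 16777619u;
  }
  return h;
}

// Layout: [numBuckets][offset of each bucket][bucket payloads]
// Bucket payload: [entry count] then per entry [keyLen][key bytes][value].
Bytes buildTable(const std::vector<std::pair<std::string, uint32_t> > &items,
                 uint32_t numBuckets) {
  assert(numBuckets > 0);
  std::vector<std::vector<size_t> > chains(numBuckets);
  for (size_t i = 0; i < items.size(); ++i)
    chains[hashKey(items[i].first) % numBuckets].push_back(i);

  Bytes out;
  put32(out, numBuckets);
  const size_t offsetsAt = out.size();
  for (uint32_t b = 0; b < numBuckets; ++b) put32(out, 0);  // patched below

  for (uint32_t b = 0; b < numBuckets; ++b) {
    const uint32_t start = static_cast<uint32_t>(out.size());
    for (int i = 0; i < 4; ++i)  // patch this bucket's offset slot
      out[offsetsAt + 4 * b + i] = static_cast<unsigned char>(start >> (8 * i));
    put32(out, static_cast<uint32_t>(chains[b].size()));
    for (size_t j = 0; j < chains[b].size(); ++j) {
      const std::pair<std::string, uint32_t> &kv = items[chains[b][j]];
      put32(out, static_cast<uint32_t>(kv.first.size()));
      out.insert(out.end(), kv.first.begin(), kv.first.end());
      put32(out, kv.second);
    }
  }
  return out;
}

// Reads only the header, one offset slot, and one bucket chain.
bool lookup(const Bytes &table, const std::string &key, uint32_t &value) {
  const uint32_t numBuckets = get32(table, 0);
  size_t p = get32(table, 4 + 4 * (hashKey(key) % numBuckets));
  const uint32_t count = get32(table, p);
  p += 4;
  for (uint32_t i = 0; i < count; ++i) {
    const uint32_t len = get32(table, p);
    p += 4;
    const bool match =
        len == key.size() && std::memcmp(table.data() + p, key.data(), len) == 0;
    p += len;
    const uint32_t v = get32(table, p);
    p += 4;
    if (match) {
      value = v;
      return true;
    }
  }
  return false;
}
```

Because `lookup` never walks buckets it does not need, the work scales with the number of files queried rather than the number stored.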
Ted Kremenek 351f78822b Rearrange code. No functionality change.
llvm-svn: 64193
2009-02-10 01:14:45 +00:00
Ted Kremenek 5b229bcea8 Fix potential padding error in PTH file and add stub code for emitting an on-disk chained hash table.
llvm-svn: 64192
2009-02-10 01:06:17 +00:00
Chris Lattner c8233df64b switch SourceManager from using an std::map and std::list of
ContentCache objects to using a densemap and list, and allocating
the ContentCache objects from a bump pointer.  This does not speed
up or slow down things substantially, but gives us control over 
their alignment.

llvm-svn: 63628
2009-02-03 07:30:45 +00:00
Chris Lattner c360bf2e48 rename getFullFilePos -> getFileOffset.
llvm-svn: 63097
2009-01-27 06:27:13 +00:00
Ted Kremenek 8d178f4357 PTH: Use Token::setLiteralData() to directly store a pointer to cached spelling data in the PTH file. This removes a ton of code for looking up spellings using sourcelocations in the PTH file. This simplifies both PTH-generation and reading.
Performance impact for -fsyntax-only on Cocoa.h (with Cocoa.h in the PTH file):
- PTH generation time improves by 5%
- PTH reading improves by 0.3%.

llvm-svn: 63072
2009-01-27 00:01:05 +00:00
Ted Kremenek 978b5becea Add version number checking to PTH files.
llvm-svn: 63047
2009-01-26 21:50:21 +00:00
Ted Kremenek eb8c8fbd63 Embed the offset of the PTH table inside the prologue of the PTH file. This will help with gradual versioning of PTH files, instead of relying on the PTH table being at a fixed offset.
llvm-svn: 63045
2009-01-26 21:43:14 +00:00
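A sketch of a prologue that carries a version number and the table's offset, so a reader can reject mismatched files and find the table wherever it lands. The magic bytes, the version value, the 12-byte prologue, and the names `writeFile` and `locateTable` are all hypothetical; the real PTH prologue layout is not given in these commits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

typedef std::vector<unsigned char> Bytes;

const char kMagic[4] = {'d', 'e', 'm', 'o'};  // hypothetical magic bytes
const uint32_t kVersion = 1;                  // hypothetical format version
const size_t kPrologueSize = 12;              // magic + version + table offset

static void put32(Bytes &b, uint32_t v) {
  for (int i = 0; i < 4; ++i) b.push_back(static_cast<unsigned char>(v >> (8 * i)));
}
static uint32_t get32(const Bytes &b, size_t at) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i) v |= static_cast<uint32_t>(b[at + i]) << (8 * i);
  return v;
}

// File layout in this sketch: [magic][version][table offset][body][table]
Bytes writeFile(const Bytes &body, const Bytes &table) {
  Bytes out(kMagic, kMagic + 4);
  put32(out, kVersion);
  put32(out, static_cast<uint32_t>(kPrologueSize + body.size()));
  assert(out.size() == kPrologueSize);
  out.insert(out.end(), body.begin(), body.end());
  out.insert(out.end(), table.begin(), table.end());
  return out;
}

// Returns false on a bad magic, a version mismatch, or an out-of-range offset.
bool locateTable(const Bytes &file, uint32_t expectedVersion, uint32_t &tableOffset) {
  if (file.size() < kPrologueSize) return false;
  if (std::memcmp(file.data(), kMagic, 4) != 0) return false;
  if (get32(file, 4) != expectedVersion) return false;
  const uint32_t off = get32(file, 8);
  if (off < kPrologueSize || off > file.size()) return false;
  tableOffset = off;
  return true;
}
```

Since the offset is read from the prologue, the body can grow or shrink between versions without the reader hard-coding where the table starts.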
Chris Lattner 5a7971e0c3 This change refactors some of the low-level lexer interfaces a bit.
Token now has a class of kinds for "literals", which include 
numeric constants, strings, etc.  These tokens can optionally have
a pointer to the start of the token in the lexer buffer.  This 
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.

This change is performance neutral, but will make other changes
more feasible down the road.

llvm-svn: 63028
2009-01-26 19:29:26 +00:00
Ted Kremenek 8433f0b400 PTH: Emitted tokens now consist of 12 bytes that are loaded using three 32-bit loads. This reduces user time but increases system time because of the slightly larger PTH file. Although there is no performance win on Cocoa.h and -Eonly, overall this seems like a good step.
llvm-svn: 62542
2009-01-19 23:13:15 +00:00
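One way a 12-byte token could be split across three 32-bit words is sketched here. The field assignment (kind, flags, and length packed into the first word, then an identifier ID, then a file offset) is an assumption made for illustration; the commit states only the total size and the number of loads. `TokenRecord`, `encodeToken`, and `decodeToken` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of one cached token.
struct TokenRecord {
  uint8_t kind;
  uint8_t flags;
  uint16_t length;
  uint32_t identifierID;
  uint32_t fileOffset;
};

static_assert(sizeof(uint32_t[3]) == 12, "three 32-bit words make 12 bytes");

// Pack: word 0 holds kind (bits 0-7), flags (bits 8-15), length (bits 16-31).
void encodeToken(const TokenRecord &t, uint32_t words[3]) {
  words[0] = static_cast<uint32_t>(t.kind) |
             (static_cast<uint32_t>(t.flags) << 8) |
             (static_cast<uint32_t>(t.length) << 16);
  words[1] = t.identifierID;
  words[2] = t.fileOffset;
}

// Unpack: three word reads, then shifts and masks on the first word only.
TokenRecord decodeToken(const uint32_t words[3]) {
  TokenRecord t;
  t.kind = static_cast<uint8_t>(words[0] & 0xFF);
  t.flags = static_cast<uint8_t>((words[0] >> 8) & 0xFF);
  t.length = static_cast<uint16_t>(words[0] >> 16);
  t.identifierID = words[1];
  t.fileOffset = words[2];
  return t;
}
```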
Chris Lattner 08354fef13 add a simplified lexer ctor that sets up the lexer to raw-lex an
entire file.

llvm-svn: 62414
2009-01-17 07:35:14 +00:00
Chris Lattner d32480d3db this massive patch introduces a simple new abstraction: it makes
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.

This is an important distinction from the "FileID" currently tracked by
SourceLocation.  *That* FileID may refer to the start of a file or to a
chunk within it.  The new FileID *only* refers to the file (and its 
#include stack and eventually #line data), it cannot refer to a chunk.

FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.

llvm-svn: 62407
2009-01-17 06:22:33 +00:00
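The "enforced by the type checker" point can be sketched with an opaque wrapper whose raw value only one class may construct or read. This is a simplified illustration, not the actual FileID or SourceManager code; `SourceManagerSketch`, `createFileID`, and `getName` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

class SourceManagerSketch;

// Opaque handle: clients can copy and compare it, but cannot see the number.
class FileID {
  unsigned id_ = 0;  // 0 means "invalid" in this sketch
  explicit FileID(unsigned id) : id_(id) {}
  friend class SourceManagerSketch;  // the only class allowed to poke at id_

public:
  FileID() = default;
  bool isInvalid() const { return id_ == 0; }
  bool operator==(const FileID &rhs) const { return id_ == rhs.id_; }
  bool operator!=(const FileID &rhs) const { return id_ != rhs.id_; }
};

// The compiler now rejects mixing plain unsigned values with FileIDs.
static_assert(!std::is_convertible<unsigned, FileID>::value,
              "an unsigned must not silently become a FileID");
static_assert(!std::is_convertible<FileID, unsigned>::value,
              "a FileID must not silently become an unsigned");

// Hypothetical owner that hands out handles and resolves them.
class SourceManagerSketch {
  std::vector<std::string> names_;

public:
  FileID createFileID(std::string name) {
    names_.push_back(std::move(name));
    return FileID(static_cast<unsigned>(names_.size()));
  }
  const std::string &getName(FileID f) const {
    assert(!f.isInvalid() && f.id_ <= names_.size());
    return names_[f.id_ - 1];
  }
};
```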
Chris Lattner 1ed28ce3fd no need to check this: content cache is already 1-1 map with fileentries.
llvm-svn: 62402
2009-01-17 03:49:48 +00:00
Chris Lattner 1abd20901b Instead of iterating over FileID's, have PTH generation iterate over the
content cache directly.  Content cache has a 1-1 mapping with fileentries,
whereas multiple FileIDs can be the same FileEntry.

llvm-svn: 62401
2009-01-17 03:48:08 +00:00
Ted Kremenek a705b04d7f IdentifierInfo:
- IdentifierInfo can now (optionally) have its string data not be
  co-located with itself.  This is for use with PTH.  This aspect is a
  little gross, as getName() and getLength() now make assumptions
  about a possible alternate representation of IdentifierInfo.
  Perhaps we should make IdentifierInfo have virtual methods?

IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
  IdentifierTable to perform "string -> IdentifierInfo" lookups using
  an auxiliary data structure.  This is used by PTH.
- Performance tests show that IdentifierTable::get() does not slow down
  because of the extra check for the IdentifierInfoLookup object (the
  regular StringMap lookup does enough work to mitigate the impact of
  an extra null pointer check).
- The upshot is that some IdentifierInfo objects might now be
  owned by the IdentifierInfoLookup object.  This should be reviewed.

PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
  IdentifierTable's string map, and instead create IdentifierInfo
  objects on the fly when mapping from persistent IDs to
  IdentifierInfos.  This saves a ton of work with string copies,
  hashing, and StringMap lookup and resizing.  This change was
  motivated because when processing source files in the PTH cache we
  don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
  IdentifierTable to transparently use IdentifierInfo objects managed
  by the PTH file.  PTHManager resolves "string -> IdentifierInfo"
  queries by doing a binary search over a sorted table of identifier
  strings in the PTH file (the exact algorithm we use can be changed
  as needed).

These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement

llvm-svn: 62273
2009-01-15 18:47:46 +00:00
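The "string -> IdentifierInfo" query via binary search over a sorted table can be sketched as follows. The in-memory `SortedEntry` vector stands in for the sorted table in the PTH file, and the convention that 0 means "not found" is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical row of the sorted side table: identifier text plus its ID.
struct SortedEntry {
  std::string name;
  uint32_t persistentID;
};

// Orders a table row against a query string, as std::lower_bound requires.
static bool entryLess(const SortedEntry &e, const std::string &n) {
  return e.name < n;
}

// Returns the persistent ID for 'name', or 0 if absent (assumed convention).
// 'table' must already be sorted by name.
uint32_t findPersistentID(const std::vector<SortedEntry> &table,
                          const std::string &name) {
  std::vector<SortedEntry>::const_iterator it =
      std::lower_bound(table.begin(), table.end(), name, entryLess);
  if (it != table.end() && it->name == name)
    return it->persistentID;
  return 0;
}
```

The lookup touches O(log n) entries, and identifiers that are never queried are never materialized.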
Ted Kremenek bef9fc2240 PTH: Embed a persistentID side-table in the PTH file that is sorted in the
lexical order of the corresponding identifier strings. This will be used for a
forthcoming optimization. This slows down PTH generation time by 7%. We can
revert this change if the optimization proves to not be valuable.

llvm-svn: 62248
2009-01-15 01:26:25 +00:00
Ted Kremenek 6697945069 Simpler solution to LiteralSupport compatibility: just add one whitespace character after each cached string.
llvm-svn: 61962
2009-01-09 00:37:37 +00:00
Ted Kremenek d4a4fd99de PTH: For the cached spellings of literals, store one whitespace character after the spelling to accommodate sanity checking in LiteralSupport.cpp.
llvm-svn: 61956
2009-01-08 23:40:50 +00:00
Ted Kremenek 5423c53b59 Remove debugging variable I forgot to remove in my last commit.
llvm-svn: 61910
2009-01-08 02:44:52 +00:00
Ted Kremenek 17f09da0a4 Cache the "spellings" of string, character, and numeric literals in the PTH
file. For Cocoa.h, this enlarges the PTH file by 310K (4%).

llvm-svn: 61909
2009-01-08 02:44:06 +00:00
Ted Kremenek 352b8bacdc Refactor CacheTokens to use a PTHWriter class that creates and manages most of the PTH generation data structures. No functionality change.
llvm-svn: 61902
2009-01-08 01:17:37 +00:00
Chris Lattner c7594438c7 use getBuffer() to fix compile error. Ted, please review.
llvm-svn: 61786
2009-01-06 04:47:20 +00:00
Ted Kremenek a754c40390 PTH: Use 3 bytes instead of 4 bytes to encode the persistent ID for a token.
- This reduces the PTH size for Cocoa.h by 7%.
- This increases PTH -Eonly speed for Cocoa.h by 0.8%.

llvm-svn: 61377
2008-12-23 18:41:34 +00:00
Ted Kremenek 1bd0a550d0 PTH:
- Encode the token length with 2 bytes instead of 4.
- This reduces the size of the .pth file for Cocoa.h by 12%.
- This speeds up PTH time (-Eonly) on Cocoa.h by 1.6%.

llvm-svn: 61364
2008-12-23 02:52:12 +00:00
Ted Kremenek 1b18ad240c PTH:
- Embed 'eom' tokens in PTH file.
- Use embedded 'eom' tokens to not lazily generate them in the PTHLexer.
  This means that PTHLexer can always advance to the next token after
  reading a token (instead of buffering tokens using a copy).
- Moved logic of 'ReadToken' into Lex.  GetToken & ReadToken no longer exist.
- These changes result in a 3.3% speedup (-Eonly) on Cocoa.h.
- The code is a little gross.  Many cleanups are possible and should be done.

llvm-svn: 61360
2008-12-23 01:30:52 +00:00
Ted Kremenek ee87e0bd31 Enhance PTH preprocessor-condition-block side table to track #else information as well.
llvm-svn: 60955
2008-12-12 18:31:09 +00:00
Ted Kremenek 864eb39233 PTH:
- Added a side-table for each token-cached file with the preprocessor conditional stack.  This tracks which #ifs are matched with which #endifs and where their respective tokens are in the PTH file.  This will allow for quick skipping of excluded conditional branches in the Preprocessor.
- Performance testing shows the addition of this information (without actually utilizing it) leads to no performance regressions.

llvm-svn: 60911
2008-12-11 23:36:38 +00:00
Ted Kremenek bf28bceb10 Remove unneeded assertion. We already know that FE->getName() is an absolute path.
llvm-svn: 60558
2008-12-04 22:36:44 +00:00
Ted Kremenek 73a4d28758 PTH:
Use an array instead of a DenseMap to cache persistent IDs -> IdentifierInfo*.  This leads to a 4% speedup at -fsyntax-only using PTH.

llvm-svn: 60452
2008-12-03 01:16:39 +00:00
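Because persistent IDs are dense integers, a flat array indexed by ID can replace a hash map. The sketch below creates entries lazily on first use; `IdentSketch` and `PersistentIDCache` are hypothetical stand-ins, and the string vector stands in for identifier data stored in the PTH file.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for IdentifierInfo.
struct IdentSketch {
  std::string name;
};

class PersistentIDCache {
  std::vector<std::string> names_;                   // stand-in for file data
  std::vector<std::unique_ptr<IdentSketch> > slots_; // one slot per persistent ID
  unsigned created_ = 0;

public:
  explicit PersistentIDCache(std::vector<std::string> names)
      : names_(std::move(names)), slots_(names_.size()) {}

  // Direct index, no hashing or probing; the object is built on first request.
  const IdentSketch *get(uint32_t id) {
    assert(id < slots_.size());
    if (!slots_[id]) {
      slots_[id].reset(new IdentSketch{names_[id]});
      ++created_;
    }
    return slots_[id].get();
  }

  unsigned created() const { return created_; }
};
```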
Ted Kremenek ccaab17e2c PTH emission:
- Output 32 bit integers using bit-shifting + write of individual bytes.
  This is needed because 32-bit ints in the mmapped PTH file aren't guaranteed to sit at 4-byte-aligned offsets.
- Don't emit flags for IdentifierInfos.  These are lazily populated by the Preprocessor/Parser.
- Only write out tokens for files with absolute paths.  This is potentially temporary, but simplifies things for now.

llvm-svn: 60435
2008-12-02 19:44:08 +00:00
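A sketch of writing a 32-bit integer with shifts and single-byte writes, paired with a reader that reassembles it byte by byte so it works at any offset. Little-endian byte order and the names `emit32` and `read32` are assumptions for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append a 32-bit value one byte at a time, least significant byte first.
void emit32(std::vector<unsigned char> &out, uint32_t v) {
  out.push_back(static_cast<unsigned char>(v));
  out.push_back(static_cast<unsigned char>(v >> 8));
  out.push_back(static_cast<unsigned char>(v >> 16));
  out.push_back(static_cast<unsigned char>(v >> 24));
}

// Reassemble from individual bytes; 'p' does not need to be 4-byte aligned.
uint32_t read32(const unsigned char *p) {
  return static_cast<uint32_t>(p[0]) |
         (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) |
         (static_cast<uint32_t>(p[3]) << 24);
}
```

This also makes the file's byte order independent of the host that wrote it.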
Ted Kremenek b1ef37589c - Enhance PTH generation to write out IdentifierInfo table in two parts:
  - a table including the IdentifierInfo data
  - an index from persistent IdentifierInfo IDs to indices within this file.
- Enhance PTH generation to write out file map information, mapping inodes to tokens.

llvm-svn: 60132
2008-11-26 23:58:26 +00:00
Ted Kremenek bf2d60145d Re-apply r60071 now that raw_fd_ostream::tell has been committed.
llvm-svn: 60086
2008-11-26 03:36:26 +00:00
Daniel Dunbar 3ea9485335 Revert 60071, depends on uncommitted LLVM changes.
llvm-svn: 60077
2008-11-26 02:18:33 +00:00
Ted Kremenek 22d4b885fd Migrate token-cache generation logic from dummy harness in PPLexerChange.cpp to CacheTokens.cpp.
llvm-svn: 60071
2008-11-26 00:57:55 +00:00
Daniel Dunbar f3502dbc14 [LLVM up] Update for raw_fd_ostream change. This fixes a FIXME that
the Backend output should be done in binary mode.
 - I'd appreciate it if someone who has a Windows build could verify
   this.

llvm-svn: 59221
2008-11-13 05:09:21 +00:00
Ted Kremenek 8ab23ade80 Added the start of a prototype implementation of PCH based on token caching.
llvm-svn: 57863
2008-10-21 00:54:44 +00:00