Add 'openFile' bool to FileManager::getFile to specify whether we want to have the file opened or not, have it
false by default, and enable it only in HeaderSearch.cpp where the open+fstat optimization matters.
Fixes rdar://9139899.
llvm-svn: 127748
clients to observe the exact path through which an #included file was
located. This is very useful when trying to record and replay inclusion
operations without it beind influenced by the aggressive caching done
inside the FileManager to avoid redundant system calls and filesystem
operations.
The work to compute and return this is only done in the presence of
callbacks, so it should have no effect on normal compilation.
Patch by Manuel Klimek.
llvm-svn: 127742
Find out that our C++0x status has only one field for noexcept expression and specification together, and that it was accidentally already marked as fully implemented.
This completes noexcept specification work.
llvm-svn: 127701
of an Objective-C method to be overridden on a case-by-case basis. This
is a higher-level tool than ns_returns_retained &c.; it lets users specify
that not only does a method have different retain/release semantics, but
that it semantically acts differently than one might assume from its name.
This in turn is quite useful to static analysis.
llvm-svn: 126839
The previous name was inaccurate as this token in fact appears at
the end of every preprocessing directive, not just macro definitions.
No functionality change, except for a diagnostic tweak.
llvm-svn: 126631
AST/PCH files more lazy:
- Don't preload all of the file source-location entries when reading
the AST file. Instead, load them lazily, when needed.
- Only look up header-search information (whether a header was already
#import'd, how many times it's been included, etc.) when it's needed
by the preprocessor, rather than pre-populating it.
Previously, we would pre-load all of the file source-location entries,
which also populated the header-search information structure. This was
a relatively minor performance issue, since we would end up stat()'ing
all of the headers stored within a AST/PCH file when the AST/PCH file
was loaded. In the normal PCH use case, the stat()s were cached, so
the cost--of preloading ~860 source-location entries in the Cocoa.h
case---was relatively low.
However, the recent optimization that replaced stat+open with
open+fstat turned this into a major problem, since the preloading of
source-location entries would now end up opening those files. Worse,
those files wouldn't be closed until the file manager was destroyed,
so just opening a Cocoa.h PCH file would hold on to ~860 file
descriptors, and it was easy to blow through the process's limit on
the number of open file descriptors.
By eliminating the preloading of these files, we neither open nor stat
the headers stored in the PCH/AST file until they're actually needed
for something. Concretely, we went from
*** HeaderSearch Stats:
835 files tracked.
364 #import/#pragma once files.
823 included exactly once.
6 max times a file is included.
3 #include/#include_next/#import.
0 #includes skipped due to the multi-include optimization.
1 framework lookups.
0 subframework lookups.
*** Source Manager Stats:
835 files mapped, 3 mem buffers mapped.
37460 SLocEntry's allocated, 11215575B of Sloc address space used.
62 bytes of files mapped, 0 files with line #'s computed.
with a trivial program that uses a chained PCH including a Cocoa PCH
to
*** HeaderSearch Stats:
4 files tracked.
1 #import/#pragma once files.
3 included exactly once.
2 max times a file is included.
3 #include/#include_next/#import.
0 #includes skipped due to the multi-include optimization.
1 framework lookups.
0 subframework lookups.
*** Source Manager Stats:
3 files mapped, 3 mem buffers mapped.
37460 SLocEntry's allocated, 11215575B of Sloc address space used.
62 bytes of files mapped, 0 files with line #'s computed.
for the same program.
llvm-svn: 125286
- Don't publicize a C++0x feature through __has_feature if we aren't
in C++0x mode (even if the feature is available only with a
warning).
- "auto" is not implemented well enough for its __has_feature to be
turned on.
- Fix the test of C++0x __has_feature to actually test what we're
trying to test. Searching for the substring "foo" when our options
are "foo" and "no_foo" doesn't work :)
llvm-svn: 124291
and turn on __has_feature(cxx_rvalue_references). The core rvalue
references proposal seems to be fully implemented now, pending lots
more testing.
llvm-svn: 124169
Turn on the __has_feature switch for variadic templates, document
their completion, and put the ExtWarn into the c++0x-extensions
warning group.
llvm-svn: 123854
new gcc warning that complains on self-assignments and
self-initializations. Fix one bug found by the warning, in which one
clang::OverloadCandidate constructor failed to initialize its
FunctionTemplate member.
llvm-svn: 122459
Diagnostic pragmas are broken because we don't keep track of the diagnostic state changes and we only check the current/latest state.
Problems manifest if a diagnostic is emitted for a source line that has different diagnostic state than the current state; this can affect
a lot of places, like C++ inline methods, template instantiations, the lexer, etc.
Fix the issue by having the Diagnostic object keep track of the source location of the pragmas so that it is able to know what is the diagnostic state at any given source location.
Fixes rdar://8365684.
llvm-svn: 121873
pointer that is passed down through the APIs, and make
FileSystemStatCache::get be the one that filters out
directory lookups that hit files. This also paves the
way to have stat queries be able to return opened files.
llvm-svn: 120060
FileSystemOpts through a ton of apis, simplifying a lot of code.
This also fixes a latent bug in ASTUnit where it would invoke
methods on FileManager without creating one in some code paths
in cindextext.
llvm-svn: 120010
and use a better and more general approach, where NullStmt has a flag to indicate whether it was preceded by an empty macro.
Thanks to Abramo Bagnara for the hint!
llvm-svn: 119887
than a Token that holds the same information all in one easy-to-use
package. There's no technical reason to prefer the former -- the
information comes from a Token originally -- and it's clumsier to use,
so I've changed the code to use tokens everywhere.
Approved by clattner
llvm-svn: 119845
The callback info for #if/#elif is not great -- ideally it would give
us a list of tokens in the #if, or even better, a little parse tree.
But that's a lot more work. Instead, clients can retokenize using
Lexer::LexFromRawLexer().
Reviewed by nlewycky.
llvm-svn: 118318
When -working-directory is passed in command line, file paths are resolved relative to the specified directory.
This helps both when using libclang (where we can't require the user to actually change the working directory)
and to help reproduce test cases when the reproduction work comes along.
--FileSystemOptions is introduced which controls how file system operations are performed (currently it just contains
the working directory value if set).
--FileSystemOptions are passed around to various interfaces that perform file operations.
--Opening & reading the content of files should be done only through FileManager. This is useful in general since
file operations will be abstracted in the future for the reproduction mechanism.
FileSystemOptions is independent of FileManager so that we can have multiple translation units sharing the same
FileManager but with different FileSystemOptions.
Addresses rdar://8583824.
llvm-svn: 118203
load identifiers without loading their corresponding macro
definitions. This is likely to improve PCH performance slightly, and
reduces deserialization stack depth considerably when using
preprocessor metaprogramming.
llvm-svn: 117750
inclusion directives, keeping track of every #include, #import,
etc. in the translation unit. We keep track of the source location and
kind of the inclusion, how the file name was spelled, and the
underlying file to which the inclusion resolved.
llvm-svn: 116952
Now MICache is a linked list (per the FIXME), where we tradeoff between MacroInfo objects being in MICache
and MIChainHead. MacroInfo objects in the MICache chain are already "Destroy()'ed", so they can be reused. When
inserting into MICache, we need to remove them from the regular linked list so that they aren't destroyed more than
once.
llvm-svn: 116869
The problem was not the management of MacroInfo objects, but that when we recycle them
via the MICache the memory of the underlying SmallVector (within MacroInfo) was not getting
released. This is because objects stashed into MICache simply are reused with a placement
new, and never have their destructor called.
llvm-svn: 116862
list of allocated MacroInfos. This requires only 1 extra pointer per MacroInfo object, and allows us to blow them
away in one place. This fixes an elusive memory leak with MacroInfos (whose exact location I couldn't still figure
out despite substantial digging).
Fixes <rdar://problem/8361834>.
llvm-svn: 116842
spelled (#pragma, _Pragma, __pragma). In -E mode, use that information
to add appropriate newlines when translating _Pragma and __pragma into
#pragma, like GCC does. Fixes <rdar://problem/8412013>.
llvm-svn: 113553
The extra data stored on user-defined literal Tokens is stored in extra
allocated memory, which is managed by the PreprocessorLexer because there isn't
a better place to put it that makes sure it gets deallocated, but only after
it's used up. My testing has shown no significant slowdown as a result, but
independent testing would be appreciated.
llvm-svn: 112458
__debug overflow_stack'.
- For testing crash reporting stuff... you'd think I could just use some C++
code but Doug keeps fixing stuff!
llvm-svn: 109587
reparsing an ASTUnit. When saving a preamble, create a buffer larger
than the actual file we're working with but fill everything from the
end of the preamble to the end of the file with spaces (so the lexer
will quickly skip them). When we load the file, create a buffer of the
same size, filling it with the file and then spaces. Then, instruct
the lexer to start lexing after the preamble, therefore continuing the
parse from the spot where the preamble left off.
It's now possible to perform a simple preamble build + parse (+
reparse) with ASTUnit. However, one has to disable a bunch of checking
in the PCH reader to do so. That part isn't committed; it will likely
be handled with some other kind of flag (e.g., -fno-validate-pch).
As part of this, fix some issues with null termination of the memory
buffers created for the preamble; we were trying to explicitly
NULL-terminate them, even though they were also getting implicitly
NULL terminated, leading to excess warnings about NULL characters in
source files.
llvm-svn: 109445
is present.
Rather than using clang_getCursorExtent(), which requires
us to lex the token at the ending position to determine its
length. Then, we'd be comparing [a, b) source ranges that cover the
characters in the range rather than the normal behavior for Clang's
source ranges, which covers the tokens in the range. However, relexing
causes us to read the source file (which may come from a precompiled
header), which is rather unfortunate and affects performance.
In the new scheme, we only use Clang-style source ranges that cover
the tokens in the range. At the entry points where this matters
(clang_annotateTokens, clang_getCursor), we make sure to move source
locations to the start of the token.
Addresses most of <rdar://problem/8049381>.
llvm-svn: 109134
which is the part of the file that contains all of the initial
comments, includes, and preprocessor directives that occur before any
of the actual code. Added a new -print-preamble cc1 action that is
only used for testing.
llvm-svn: 108913
When loading the PCH, IdentifierInfos that are associated with pragmas cause declarations that use these identifiers to be deserialized (e.g. the "clang" pragma causes the "clang" namespace to be loaded).
We can avoid this if we just use StringRefs for the pragmas.
As a bonus, since we don't have to create and pass IdentifierInfos, the pragma interfaces get a bit more simplified.
llvm-svn: 108237
In the case of backtracking, the cached token lexer will be the only
lexer on the stack, without this the token stack will be empty and EOF
won't be returned.
This fixes PR7072.
llvm-svn: 108124
Fix string concatenation to treat escapes in concatenated strings that
are wide because of other string chunks to process the escapes as wide
themselves. Before we would warn about and miscompile the attached testcase.
This fixes rdar://8040728 - miscompile + warning: hex escape sequence out of range
llvm-svn: 106012
diagnostics. That would be while we're parsing string literals for the
sole purpose of producing a diagnostic about them. Fixes
<rdar://problem/8026030>.
llvm-svn: 104684
1) Suppress diagnostics as soon as we form the code-completion
token, so we don't get any error/warning spew from the early
end-of-file.
2) If we consume a code-completion token when we weren't expecting
one, go into a code-completion recovery path that produces the best
results it can based on the context that the parser is in.
llvm-svn: 104585
make it miss (invalid) things like:
<<<<<<<
>>>>>>>
and crash if
<<<<<<<
was at the end of the line. When we find a >>>>>>> that is not at the
end of the line, make sure to reset Pos so we don't crash on something
like:
<<<<<<< >>>>>>>
This isn't worth making testcases for, since each would require a new file.
rdar://7987078 - signal 11 compiling "<<<<<<<<<<"
llvm-svn: 103968
in an input file like this:
# 42
int x;
we were emitting:
# <something>
int x;
(with a space before the int) because we weren't clearing the leading
whitespace flag properly after the \n from the directive was handled.
llvm-svn: 101084
instantiations when we have the corresponding macro definition and by
removing macro definition information from our table when the macro is
undefined.
llvm-svn: 99004
record (which includes all macro instantiations and definitions). As
with all lay deserialization, this introduces a new external source
(here, an external preprocessing record source) that loads all of the
preprocessed entities prior to iterating over the entities.
The preprocessing record is an optional part of the precompiled header
that is disabled by default (enabled with
-detailed-preprocessing-record). When the preprocessor given to the
PCH writer has a preprocessing record, that record is written into the
PCH file. When the PCH reader is given a PCH file that contains a
preprocessing record, it will be lazily loaded (which, effectively,
implicitly adds -detailed-preprocessing-record). This is the first
case where we have sections of the precompiled header that are
added/removed based on a compilation flag, which is
unfortunate. However, this data consumes ~550k in the PCH file for
Cocoa.h (out of ~9.9MB), and there is a non-trivial cost to gathering
this detailed preprocessing information, so it's too expensive to turn
on by default. In the future, we should investigate a better encoding
of this information.
llvm-svn: 99002
eliminating the extra PopulatePreprocessingRecord object. This will
become useful once we start writing the preprocessing record to
precompiled headers.
llvm-svn: 98966
preprocessing record. Use that link with clang_getCursorReferenced()
and clang_getCursorDefinition() to match instantiations of a macro to
the definition of the macro.
llvm-svn: 98842
the macro definitions and macro instantiations that are found
during preprocessing. Preprocessing records are *not* generated by
default; rather, we provide a PPCallbacks subclass that hooks into the
existing callback mechanism to record this activity.
The only client of preprocessing records is CIndex, which keeps track
of macro definitions and instantations so that they can be exposed via
cursors. At present, only token annotation uses these facilities, and
only for macro instantiations; both will change in the near
future. However, with this change, token annotation properly annotates
macro instantiations that do not produce any tokens and instantiations
of macros that are later undef'd, improving our consistency.
Preprocessing directives that are not macro definitions are still
handled by clang_annotateTokens() via re-lexing, so that we don't have
to track every preprocessing directive in the preprocessing record.
Performance impact of preprocessing records is still TBD, although it
is limited to CIndex and therefore out of the path of the main compiler.
llvm-svn: 98836
SourceManager's getBuffer() and, therefore, could fail, along with
Preprocessor::getSpelling(). Use the Invalid parameters in the literal
parsers (string, floating point, integral, character) to make them
robust against errors that stem from, e.g., PCH files that are not
consistent with the underlying file system.
I still need to audit every use caller to all of these routines, to
determine which ones need specific handling of error conditions.
llvm-svn: 98608
SourceManager's getBuffer() (and similar) operations. This abstract
can be used to force callers to cope with errors in getBuffer(), such
as missing files and changed files. Fix a bunch of callers to use the
new interface.
Add some very basic checks for file consistency (file size,
modification time) into ContentCache::getBuffer(), although these
checks don't help much until we've updated the main callers (e.g.,
SourceManager::getSpelling()).
llvm-svn: 98585