- We really shouldn't compute line numbers for every file that is asked if it's
the main file, it destroys the lazy computation.
- Invalid locations are no longer accounted to the main file, no other
functionality change.
llvm-svn: 191535
The main benefit is to speed-up SourceManager::isBeforeInTranslationUnit which is common to query
the included/expanded location of the same FileID multiple times.
llvm-svn: 179435
isBeforeInTranslationUnit() uses a cache to reduce the expensive work
to compute a common ancestor for two FileIDs. This work is very
expensive, so even caching the latest used FileIDs was a big win.
A closer analysis of the cache before, however, shows that the cache
access pattern would oscillate between a working set of FileIDs, and
thus caching more pairs would be profitable.
This patch adds a side table for extending caching. This side table
is bounded in size (experimentally determined in this case from
a simple Objective-C project), and when the table gets too large
we fall back to the single entry caching before as before.
On Sketch (a small example Objective-C project), this optimization
reduces -fsyntax-only time on SKTGraphicView.m by 5%. This is
for a project that is already using PCH.
Fixes <rdar://problem/13299847>
llvm-svn: 176142
This may seem counter-intuitive but the POD-like optimization helps when the
vectors grow into multimegabyte buffers. SmallVector calls realloc which knows
how to twiddle virtual memory bits and avoids large copies.
llvm-svn: 175906
Previously, -Wunused-comparison ignored comparisons in both macro bodies and
macro arguments, but we would still emit a -Wunused-value warning for either.
Now we correctly emit -Wunused-comparison for expressions in macro arguments.
Also, add isMacroBodyExpansion to SourceManager, to go along with
isMacroArgExpansion.
llvm-svn: 172279
uncovered.
This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.
I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.
llvm-svn: 169237
working with preprocessed testcases. This causes source locations in
diagnostics to point at the spelling location instead of the presumed location,
while still keeping the semantic effects of the line directives (entering and
leaving system-header mode, primarily).
llvm-svn: 168004
macro expansion ranges, make sure to check all the FileID
entries that are contained in the spelling range of the
expansion for the macro argument.
Fixes rdar://12537982
llvm-svn: 166359
as "volatile", meaning there's a high enough chance that they may
change while we are trying to use them.
This flag is only enabled by libclang.
Currently "volatile" source files will be stat'ed immediately
before opening them, because the file size stat info
may not be accurate since when we got it (e.g. from the PCH).
This avoids crashes when trying to reference mmap'ed memory
from a file whose size is not what we expect.
Note that there's still a window for a racing issue to occur
but the window for it should be way smaller than before.
We can consider later on to avoid mmap completely on such files.
rdar://11612916
llvm-svn: 160074
r158085 added some logic to track predefined declarations. The main reason we
had predefined declarations in the input was because the __builtin_va_list
declarations were injected into the preprocessor input. As of r158592 we
explicitly build the __builtin_va_list declarations. Therefore the predefined
decl tracking is no longer needed.
llvm-svn: 158732
In standard C since C89, a 'translation-unit' is syntactically defined to have
at least one "external-declaration", which is either a decl or a function
definition. In Clang the latter gives us a declaration as well.
The tricky bit about this warning is that our predefines can contain external
declarations (__builtin_va_list and the 128-bit integer types). Therefore our
AST parser now makes sure we have at least one declaration that doesn't come
from the predefines buffer.
Also, remove bogus warning about empty source files. This doesn't catch source
files that only contain comments, and never fired anyway because of our
predefines.
PR12665 and <rdar://problem/9165548>
llvm-svn: 158085
validate that we didn't override the contents of any of such files.
If this is detected, emit a diagnostic error and recover gracefully
by using the contents of the original file that the PCH was built from.
Part of rdar://11305263
llvm-svn: 156107
This method is very hot, it is called when emitting diagnostics, in -E mode
and for many #pragma handlers. It scans through the whole source file to
count newlines, records and caches them in a vector.
The speedup from vectorization isn't very large, as we fall back to bytewise
scanning when we hit a newline. There might be a way to avoid leaving the sse
loop but everything I tried didn't work out because a call to push_back
clobbers xmm registers.
About 2% speedup on average on "clang -E > /dev/null" of all .cpp files in
clang's lib/Sema.
llvm-svn: 154204
from the one stored in the PCH/AST, while trying to load a SLocEntry.
We verify that all files of the PCH did not change before loading it but this is not enough because:
- The AST may have been 1) kept around, 2) to do queries on it.
- We may have 1) verified the PCH and 2) started parsing.
Between 1) and 2) files may change and we are going to have crashes because the rest of clang
cannot deal with the ASTReader failing to read a SLocEntry.
Handle this by recovering gracefully in such a case, by initializing the SLocEntry
with the info from the PCH/AST as well as reporting failure by the ASTReader.
rdar://10888929
llvm-svn: 151004
file in the source manager. This allows us to properly create and use
modules described by module map files without umbrella headers (or
with incompletely umbrella headers). More generally, we can actually
build a PCH file that makes use of file -> buffer remappings, which
could be useful in libclang in the future.
llvm-svn: 144830
of a ContentCache, since multiple FileIDs can have the same ContentCache
but the expanded macro arguments locations will be different.
llvm-svn: 140521
check whether the requested location points inside the precompiled preamble,
in which case the returned source location will be a "loaded" one.
llvm-svn: 140060
to increased calls to SourceManager::getFileID. (rdar://9992664)
Use a slightly different approach that is more efficient both in terms of speed
(no extra getFileID calls) and in SLocEntries reduction.
Comparing pre-r138129 and this patch we get:
For compiling SemaExpr.cpp reduction of SLocEntries by 26%.
For the boost enum library:
-SLocEntries -34% (note that this was -5% for r138129)
-Memory consumption -50%
-PCH size -31%
Reduced SLocEntries also benefit the hot function SourceManager::getFileID,
evident by the reduced "FileID scans".
llvm-svn: 138380
Currently getMacroArgExpandedLocation is very inefficient and for the case
of a location pointing at the main file it will end up checking almost all of
the SLocEntries. Make it faster:
-Use a map of macro argument chunks to their expanded source location. The map
is for a single source file, it's stored in the file's ContentCache and lazily
computed, like the source lines cache.
-In SLocEntry's FileInfo add an 'unsigned NumCreatedFIDs' field that keeps track
of the number of FileIDs (files and macros) that were created during preprocessing
of that particular file SLocEntry. This is useful when computing the macro argument
map in skipping included files while scanning for macro arg FileIDs that lexed from
a specific source file. Due to padding, the new field does not increase the size
of SLocEntry.
llvm-svn: 138225
If we pass it a source location that points inside a function macro argument,
the returned location will be the macro location in which the argument was expanded.
If a macro argument is used multiple times, the expanded location will
be at the first expansion of the argument.
e.g.
MY_MACRO(foo);
^
Passing a file location pointing at 'foo', will yield a macro location
where 'foo' was expanded into.
Make SourceManager::getLocation call getMacroArgExpandedLocation as well.
llvm-svn: 137794
etc. With this I think essentially all of the SourceManager APIs are
converted. Comments and random other bits of cleanup should be all thats
left.
llvm-svn: 136057
and various other 'expansion' based terms. I've tried to reformat where
appropriate and catch as many references in comments but I'm going to do
several more passes. Also I've tried to expand parameter names to be
more clear where appropriate.
llvm-svn: 136056
source locations from source locations loaded from an AST/PCH file.
Previously, loading an AST/PCH file involved carefully pre-allocating
space at the beginning of the source manager for the source locations
and FileIDs that correspond to the prefix, and then appending the
source locations/FileIDs used for parsing the remaining translation
unit. This design forced us into loading PCH files early, as a prefix,
whic has become a rather significant limitation.
This patch splits the SourceManager space into two parts: for source
location "addresses", the lower values (growing upward) are used to
describe parsed code, while upper values (growing downward) are used
for source locations loaded from AST/PCH files. Similarly, positive
FileIDs are used to describe parsed code while negative FileIDs are
used to file/macro locations loaded from AST/PCH files. As a result,
we can load PCH/AST files even during parsing, making various
improvemnts in the future possible, e.g., teaching #include <foo.h> to
look for and load <foo.h.gch> if it happens to be already available.
This patch was originally written by Sebastian Redl, then brought
forward to the modern age by Jonathan Turner, and finally
polished/finished by me to be committed.
llvm-svn: 135484
instantiation and improve diagnostics which are stem from macro
arguments to trace the argument itself back through the layers of macro
expansion.
This requires some tricky handling of the source locations, as the
argument appears to be expanded in the opposite direction from the
surrounding macro. This patch provides helper routines that encapsulate
the logic and explain the reasoning behind how we step through macros
during diagnostic printing.
This fixes the rest of the test cases originially in PR9279, and later
split out into PR10214 and PR10215.
There is still some more work we can do here to improve the macro
backtrace, but those will follow as separate patches.
llvm-svn: 134660
When a macro instantiation occurs, reserve a SLocEntry chunk with length the
full length of the macro definition source. Set the spelling location of this chunk
to point to the start of the macro definition and any tokens that are lexed directly
from the macro definition will get a location from this chunk with the appropriate offset.
For any tokens that come from argument expansion, '##' paste operator, etc. have their
instantiation location point at the appropriate place in the instantiated macro definition
(the argument identifier and the '##' token respectively).
This improves macro instantiation diagnostics:
Before:
t.c:5:9: error: invalid operands to binary expression ('struct S' and 'int')
int y = M(/);
^~~~
t.c:5:11: note: instantiated from:
int y = M(/);
^
After:
t.c:5:9: error: invalid operands to binary expression ('struct S' and 'int')
int y = M(/);
^~~~
t.c:3:20: note: instantiated from:
\#define M(op) (foo op 3);
~~~ ^ ~
t.c:5:11: note: instantiated from:
int y = M(/);
^
The memory savings for a candidate boost library that abuses the preprocessor are:
- 32% less SLocEntries (37M -> 25M)
- 30% reduction in PCH file size (900M -> 635M)
- 50% reduction in memory usage for the SLocEntry table (1.6G -> 800M)
llvm-svn: 134587
It would add up relative (decomposed) offsets like in getDecomposedSpellingLocSlowCase, but while
it makes sense to preserve the offset among lexed spelling locations, it doesn't make
sense to add anything to the offset of the instantiation location. The instantiation
location will be the same regardless of the relative offset in the tokens that were
instantiated.
This bug didn't actually affect anything because, currently, in practice we never create macro
locations with relative offset greater than 0.
llvm-svn: 134586
The small number of elements was determined by taking the median
file length in clang+llvm and /usr/include on OS X with xcode installed.
llvm-svn: 134496
during deserialization from a precompiled header, and update all of
its callers to note when this problem occurs and recover (more)
gracefully. Fixes <rdar://problem/9119249>.
llvm-svn: 129839
should report the original file name for contents of files that were overriden by other files,
otherwise it should report the name of the new file. Default is true.
Also add similar field in PreprocessorOptions and pass similar parameter in ASTUnit::LoadFromCommandLine.
llvm-svn: 127289
Allow remapping a file by specifying another filename whose contents should be loaded if the original
file gets loaded. This allows to override files without having to create & load buffers in advance.
llvm-svn: 127052
whose inode has changed since the file was first created and that is
being seen through a different path name (e.g., due to symlinks or
relative path elements), such that its FileEntry pointer doesn't match
a known FileEntry pointer. Since this requires a system call (to
stat()), we only perform this deeper checking if we can't find the
file by comparing FileEntry pointers.
Also, add a micro-optimization where we don't bother to compute line
numbers when given the location (1, 1). This improves the
efficiency of clang_getLocationForOffset().
llvm-svn: 124800