Implemented DiagCollector::IncludeInDiagnosticCounts() to return 'false' so that the batching of diagnostics for use with BugReporter doesn't mess up the count of real diagnostics.
llvm-svn: 62873
tells us whether Preprocessor::HandleIdentifier needs to be called.
Because this method is only rarely needed, this saves a call and a
bunch of random checks. This drops the time in HandleIdentifier
from 3.52ms to .98ms on cocoa.h on my machine.
llvm-svn: 62675
Changes to IdentifierTable:
- High-level summary: StringMap never owns IdentifierInfos. It just
references them.
- The string map now has StringMapEntry<IdentifierInfo*> instead of
StringMapEntry<IdentifierInfo>. The IdentifierInfo object is
allocated using the same bump pointer allocator as used by the
StringMap.
Changes to IdentifierInfo:
- Added an extra pointer to point to the
StringMapEntry<IdentifierInfo*> in the string map. This pointer
will be null if the IdentifierInfo* is *only* used by the PTHLexer
(that is it isn't in the StringMap).
Algorithmic changes:
- Non-PTH case:
IdentifierInfo::get() will always consult the StringMap first to
see if we have an IdentifierInfo object. If that StringMapEntry
references a null pointer, we allocate a new one from the BumpPtrAllocator
and update the reference in the StringMapEntry.
- PTH case:
We do the same lookup as with the non-PTH case, but if we don't get
a hit in the StringMap we do a secondary lookup in the PTHManager for
the IdentifierInfo. If we don't find an IdentifierInfo we create a
new one as in the non-PTH case. If we do find and IdentifierInfo
in the PTHManager, we update the StringMapEntry to refer to it so
that the IdentifierInfo will be found on the next StringMap lookup.
This way we only do a binary search in the PTH file at most once
for a given IdentifierInfo. This greatly speeds things up for source
files containing a non-trivial amount of code.
Performance impact:
While these changes do add some extra indirection in
IdentifierTable to access an IdentifierInfo*, I saw speedups even
in the non-PTH case as well.
Non-PTH: For -fsyntax-only on Cocoa.h, we see a 6% speedup.
PTH (with Cocoa.h in token cache): 11% speedup.
I also did an experiment where we did -fsyntax-only on a source file
including a large header and Cocoa.h, but the token cache did not
contain the larger header. For this file, we were seeing a performance
*regression* when using PTH of 3% over non-PTH. Now we are seeing
a performance improvement of 9%!
Tests:
The serialization tests are now failing. I looked at this extensively,
and I my belief is that this change is unmasking a bug rather than
introducing a new one. I have disabled the serialization tests for now.
llvm-svn: 62636
the chunk ID not the file ID. This exposes problems in
TextDiagnosticPrinter where it should have been using the canonical
file ID but wasn't. Fix these along the way.
llvm-svn: 62427
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.
This is an important distinction from the "FileID" currently tracked by
SourceLocation. *That* FileID may refer to the start of a file or to a
chunk within it. The new FileID *only* refers to the file (and its
#include stack and eventually #line data), it cannot refer to a chunk.
FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.
llvm-svn: 62407
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful, tokens aren't really physical objects!
llvm-svn: 62309
- IdentifierInfo can now (optionally) have its string data not be
co-located with itself. This is for use with PTH. This aspect is a
little gross, as getName() and getLength() now make assumptions
about a possible alternate representation of IdentifierInfo.
Perhaps we should make IdentifierInfo have virtual methods?
IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
IdentifierTable to perform "string -> IdentifierInfo" lookups using
an auxilliary data structure. This is used by PTH.
- Perform tests show that IdentifierTable::get() does not slow down
because of the extra check for the IdentiferInfoLookup object (the
regular StringMap lookup does enough work to mitigate the impact of
an extra null pointer check).
- The upshot is that now that some IdentifierInfo objects might be
owned by the IdentiferInfoLookup object. This should be reviewed.
PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
IdentifierTable's string map, and instead create IdentifierInfo
objects on the fly when mapping from persistent IDs to
IdentifierInfos. This saves a ton of work with string copies,
hashing, and StringMap lookup and resizing. This change was
motivated because when processing source files in the PTH cache we
don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
IdentifierTable to transparently use IdentifierInfo objects managed
by the PTH file. PTHManager resolves "string -> IdentifierInfo"
queries by doing a binary search over a sorted table of identifier
strings in the PTH file (the exact algorithm we use can be changed
as needed).
These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement
llvm-svn: 62273
- Big Idea:
Source files are now mmaped when ContentCache::getBuffer() is first called.
While this doesn't change the functionality when lexing regular source files,
it can result in source files not being paged in when using PTH.
- Performance change:
- No observable difference (-fsyntax-only/-Eonly) on Cocoa.h when doing
regular source lexing.
- No observable time difference (-fsyntax-only/-Eonly) on Cocoa.h when using
PTH. We do observe, however, a reduction of 279K in memory mapped source
code (3% reduction). The majority of pages from Cocoa.h (and friends) are
still being pulled in, however, because any literal will cause
Preprocessor::getSpelling() to be called (causing the source for the file to
get pulled in). The next possible optimization is to cache literal strings
in the PTH file to avoid the need for the original header sources entirely.
- Right now there is a preprocessor directive to toggle between "lazy" and
"eager" creation of MemBuffers. This is not permanent, and is there in the
short term to just test additional optimizations.
llvm-svn: 61827
- 'Buffer' is now private and must be accessed via 'getBuffer()'.
This paves the way for lazily mapping in source files on demand.
- Added 'getSize()' (which gets the size of the content without
necessarily accessing the MemBuffer) and 'getSizeBytesMapped()'.
- Modifed SourceManager to use these new methods. This reduces the
number of places that actually access the MemBuffer object for a file
to those that actually look at the character data.
These changes result in no performance change for -fsyntax-only on Cocoa.h.
llvm-svn: 61782
specific targets default them to on. Default blocks to on on 10.6 and later.
Add a -fblocks option that allows the user to override the target's default.
Use -fblocks in the various testcases that use blocks.
llvm-svn: 60563
a new NamedDecl::getAsString() method.
Change uses of Selector::getName() to just pass in a Selector
where possible (e.g. to diagnostics) instead of going through
an std::string.
This also adds new formatters for objcinstance and objcclass
as described in the dox.
llvm-svn: 59933
with implicit quotes around them. This has a bunch of follow-on
effects and requires tweaking to a whole lot of code. This causes
a regression in two tests (xfailed) by causing it to emit things like:
Line 10: duplicate interface declaration for category 'MyClass1' ('Category1')
instead of:
Line 10: duplicate interface declaration for category 'MyClass1(Category1)'
I will fix this in a follow-up commit.
As part of this, I had to start switching stuff to use ->getDeclName() instead
of Decl::getName() for consistency. This is good, but I was planning to do this
as an independent patch. There will be several follow-on patches
to clean up some of the mess, but this patch is already too big.
llvm-svn: 59917
diags over to use this. QualTypes implicitly print single quotes around
them for uniformity and future extension.
Doing this requires a little function pointer dance to prevent libbasic
from depending on libast.
llvm-svn: 59907
one for building up the diagnostic that is in flight (DiagnosticBuilder)
and one for pulling structured information out of the diagnostic when
formatting and presenting it.
There is no functionality change with this patch.
llvm-svn: 59849
strings. This allows us to have considerable flexibility in how
these things are displayed and provides extra information that
allows us to merge away diagnostics that are very similar.
Diagnostic modifiers are a string of characters with the regex
[-a-z]+ that occur between the % and digit. They may
optionally have an argument that can parameterize them.
For now, I've added two example modifiers. One is a very useful
tool that allows you to factor commonality across diagnostics
that need single words or phrases combined. Basically you can
use %select{a|b|c}4 with with an integer argument that selects
either a/b/c based on an integer value in the range [0..3).
The second modifier is also an integer modifier, aimed to help
English diagnostics handle plurality. "%s3" prints to 's' if
integer argument #3 is not 1, otherwise it prints to nothing.
I'm fully aware that 's' is an English concept and doesn't
apply to all situations (mouse vs mice). However, this is very
useful and we can add other crazy modifiers once we add support
for polish! ;-)
I converted a couple C++ diagnostics over to use this as an
example, I'd appreciate it if others could merge the other
likely candiates. If you have other modifiers that you want,
lets talk on cfe-dev.
llvm-svn: 59803
const char*'s are now not converted to std::strings when the diagnostic
is formed, we just hold onto their pointer and format as needed.
This commit makes DiagnosticClient::FormatDiagnostic even more of a
mess, I'll fix it in the next commit.
llvm-svn: 59593
operator+, directly, using the same mechanism as all other special
names.
Removed the "special" identifiers for the overloaded operators from
the identifier table and IdentifierInfo data structure. IdentifierInfo
is back to representing only real identifiers.
Added a new Action, ActOnOperatorFunctionIdExpr, that builds an
expression from an parsed operator-function-id (e.g., "operator
+"). ActOnIdentifierExpr used to do this job, but
operator-function-ids are no longer represented by IdentifierInfo's.
Extended Declarator to store overloaded operator names.
Sema::GetNameForDeclarator now knows how to turn the operator
name into a DeclarationName for the overloaded operator.
Except for (perhaps) consolidating the functionality of
ActOnIdentifier, ActOnOperatorFunctionIdExpr, and
ActOnConversionFunctionExpr into a common routine that builds an
appropriate DeclRefExpr by looking up a DeclarationName, all of the
work on normalizing declaration names should be complete with this
commit.
llvm-svn: 59526
are formed. In particular, a diagnostic with all its strings and ranges is now
packaged up and sent to DiagnosticClients as a DiagnosticInfo instead of as a
ton of random stuff. This has the benefit of simplifying the interface, making
it more extensible, and allowing us to do more checking for things like access
past the end of the various arrays passed in.
In addition to introducing DiagnosticInfo, this also substantially changes how
Diagnostic::Report works. Instead of being passed in all of the info required
to issue a diagnostic, Report now takes only the required info (a location and
ID) and returns a fresh DiagnosticInfo *by value*. The caller is then free to
stuff strings and ranges into the DiagnosticInfo with the << operator. When
the dtor runs on the DiagnosticInfo object (which should happen at the end of
the statement), the diagnostic is actually emitted with all of the accumulated
information. This is a somewhat tricky dance, but it means that the
accumulated DiagnosticInfo is allowed to keep pointers to other expression
temporaries without those pointers getting invalidated.
This is just the minimal change to get this stuff working, but this will allow
us to eliminate the zillions of variant "Diag" methods scattered throughout
(e.g.) sema. For example, instead of calling:
Diag(BuiltinLoc, diag::err_overload_no_match, typeNames,
SourceRange(BuiltinLoc, RParenLoc));
We will soon be able to just do:
Diag(BuiltinLoc, diag::err_overload_no_match)
<< typeNames << SourceRange(BuiltinLoc, RParenLoc));
This scales better to support arbitrary types being passed in (not just
strings) in a type-safe way. Go operator overloading?!
llvm-svn: 59502
strings instead of array of strings. This reduces string copying
in some not-very-important cases, but paves the way for future
improvements.
llvm-svn: 59494
destructors, and conversion functions. The placeholders were used to
work around the fact that the parser and some of Sema really wanted
declarators to have simple identifiers; now, the code that deals with
declarators will use DeclarationNames.
llvm-svn: 59469
representing the names of declarations in the C family of
languages. DeclarationName is used in NamedDecl to store the name of
the declaration (naturally), and ObjCMethodDecl is now a NamedDecl.
llvm-svn: 59441
function call created in response to the use of operator syntax that
resolves to an overloaded operator in C++, e.g., "str1 +
str2" that resolves to std::operator+(str1, str2)". We now build a
CXXOperatorCallExpr in C++ when we pick an overloaded operator. (But
only for binary operators, where we actually implement overloading)
I decided *not* to refactor the current CallExpr to make it abstract
(with FunctionCallExpr and CXXOperatorCallExpr as derived
classes). Doing so would allow us to make CXXOperatorCallExpr a little
bit smaller, at the cost of making the argument and callee accessors
virtual. We won't know if this is going to be a win until we can parse
lots of C++ code to determine how much memory we'll save by making
this change vs. the performance penalty due to the extra virtual
calls.
llvm-svn: 59306
conversion functions. Instead, we just use a placeholder identifier
for these (e.g., "<constructor>") and override NamedDecl::getName() to
provide a human-readable name.
This is one potential solution to the problem; another solution would
be to replace the use of IdentifierInfo* in NamedDecl with a different
class that deals with identifiers better. I'm also prototyping that to
see how it compares, but this commit is better than what we had
previously.
llvm-svn: 59193
operators in C++. Overloaded operators can be called directly via
their operator-function-ids, e.g., "operator+(foo, bar)", but we don't
yet implement the semantics of operator overloading to handle, e.g.,
"foo + bar".
llvm-svn: 58817
the types for size_t and ptrdiff_t more accurate. I think all of these
are correct, but please compare the defines for __PTRDIFF_TYPE__ and
__SIZE_TYPE__ to gcc to double-check; this particularly applies to
those on BSD variants, since I'm not sure what they do here; I assume
here that they're the same as on Linux.
Fixes wchar_t to be "int", not "unsigned int" (which I think is
correct on everything but Windows).
Fixes ptrdiff_t to be "int" rather than "short" on PIC16; "short" is an
somewhat strange choice because it normally gets promoted, and it's not
consistent with the choice for size_t.
llvm-svn: 58556
etc more generic. For some targets, long may not be equal to pointer size. For
example: PIC16 has int as i16, ptr as i16 but long as i32.
Also fixed a few build warnings in assert() functions in CFRefCount.cpp,
CGDecl.cpp, SemaDeclCXX.cpp and ParseDeclCXX.cpp.
llvm-svn: 58501
target indep code.
Note that this changes functionality on PIC16: it defines __INT_MAX__
correctly for it, and it changes sizeof(long) to 16-bits (to match
the size of pointer).
llvm-svn: 57132
to whether the fileid is a 'extern c system header' in addition to whether it
is a system header, most of this is spreading plumbing around. Once we have that,
PPLexerChange bases its "file enter/exit" notifications to PPCallbacks to
base the system header state on FileIDInfo instead of HeaderSearch. Finally,
in Preprocessor::HandleIncludeDirective, mirror logic in GCC: the system headerness
of a file being entered can be set due to the #includer or the #includee.
llvm-svn: 56688
- For investigating warnings in system headers / builtins.
- Currently also enables the behavior that allows silent redefinition
of types in system headers. Conceptually these are separate but I
didn't feel it was worth two options (or changing LangOptions).
llvm-svn: 56163
If you're on some other platform, the correct definition for this macro
would be appreciated; to find the correct definition, just run the
following command:
echo | gcc -dM -E - | grep USER_LABEL_PREFIX
llvm-svn: 55869
- Used to autoselect runtime when neither -fnext-runtime nor
-fgnu-runtime is specified.
- Default impl is false, all darwin targets set it to true.
llvm-svn: 55231
difference from generic x86 is the defines. The rest is non-trivial to
implement.
I'm not planning on adding any more targets myself; if there are any
targets anyone is currently using that are missing, feel free to add
them, or ask me to add them.
This concludes the work I'm planning for the TargetInfo
implementations at the moment; all the other issues with TargetInfo require
some API changes, and I haven't really thought it through. Some of the
remaining issues: allowing targets to define size_t and wchar_t properly,
adding some sort of __builtin_type_info intrinsic so we can finish clang's
limits.h and float.h and get rid of a massive number of macro
definitions, allowing target-specific command-line options, allowing
target-specific defaults for certain command-line options like
-fms-extensions, exposing vector alignment outside of the description
string, exposing endianness outside of the description string, allowing
targets to expose special bit-field layout requirements, exposing some
sort of custom hook for call generation in CodeGen, and adding CPU
selection to control defines like __SSE__.
llvm-svn: 55098
This approach allows adding OS-specific targets/defines/etc. without
completely breaking unknown subtargets. No new subtargets yet, although
I plan to add x86-Linux soon. Others can add targets that they use as
needed; adding a new subtarget takes very little code.
Also does some fixups for description strings; a lot of them were
unspecified. I think all the ones I added are correct, but
they're unverified; corrections are welcome.
llvm-svn: 55091
cleaned it up a bit, including fixing the definition of va_list; this
shouldn't break anything, but anyone using Sparc should watch for
regressions.
llvm-svn: 55041
visible effects, but this will significantly reduce the amount of
boilerplate code necessary to add subtargets.
If this looks okay, I'll do the rest of the processors (PPC, Sparc, ARM)
soon.
llvm-svn: 55036
- Kill unnecessary #includes in .cpp files. This is an automatic
sweep so some things removed are actually used, but happen to be
included by a previous header. I tried to get rid of the obvious
examples and this was the easiest way to trim the #includes in one
fell swoop.
- We now return to regularly scheduled development.
llvm-svn: 54632
* Move FormatError() from TextDiagnostic up to DiagClient, remove now
empty class TextDiagnostic
* Make DiagClient optional for Diagnostic
This fixes the following problems:
* -html-diags (and probably others) does now output the same set of
warnings as console clang does
* nothing crashes if one forgets to call setHeaderSearch() on
TextDiagnostic
* some code duplication is removed
llvm-svn: 54620
- Like EXTENSION but always generates a warning (even without
-pedantic).
- Updated ptr -> int, int -> ptr, and incompatible cast warnings to
be EXTWARN.
- Other EXTENSION level diagnostics should be audited for upgrade.
- Updated several test cases to fix code which produced unanticipated
warnings.
llvm-svn: 54335
hardcoded data layout in getTargetDescription. Hopefully fixes a test
failure.
Of course, this should be fixed properly, but that's a bigger fix.
llvm-svn: 51948
the files of different SourceLocations. These methods correctly handle the
case where a file may have multiple FileIDs due to it being large enough
to be spread across several chunks.
llvm-svn: 49682
This is a temporary solution to avoid running out of file descriptors (which defaults to 256).
Need to benchmark to understand the speed benefit. If the benefit is small, the simple solution is to avoid memory mapping files. If the benefit is significant, more thought is necessary.
llvm-svn: 48991
lib dir and move all the libraries into it. This follows the main
llvm tree, and allows the libraries to be built in parallel. The
top level now enforces that all the libs are built before Driver,
but we don't care what order the libs are built in. This speeds
up parallel builds, particularly incremental ones.
llvm-svn: 48402