and into the ScheduleDAGInstrs class, so that they don't get
destructed and re-constructed for each block. This fixes a
compile-time hot spot in the post-pass scheduler.
To help facilitate this, tidy and do some minor reorganization
in the scheduler constructor functions.
llvm-svn: 62275
- IdentifierInfo can now (optionally) have its string data not be
co-located with itself. This is for use with PTH. This aspect is a
little gross, as getName() and getLength() now make assumptions
about a possible alternate representation of IdentifierInfo.
Perhaps we should make IdentifierInfo have virtual methods?
IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
IdentifierTable to perform "string -> IdentifierInfo" lookups using
an auxiliary data structure. This is used by PTH.
- Performance tests show that IdentifierTable::get() does not slow down
because of the extra check for the IdentifierInfoLookup object (the
regular StringMap lookup does enough work to mitigate the impact of
an extra null pointer check).
- The upshot is that some IdentifierInfo objects might now be
owned by the IdentifierInfoLookup object. This should be reviewed.
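A minimal sketch of the lookup hook described above, under stated assumptions. The names InfoSketch, LookupSketch, FixedLookupSketch and TableSketch are hypothetical stand-ins and not the real Clang classes; std::unordered_map stands in for StringMap, and consulting the map before the external object is an assumption of this sketch, not something the message specifies.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for IdentifierInfo.
struct InfoSketch {
  std::string Name;
};

// Hypothetical stand-in for IdentifierInfoLookup: an optional external
// source that may own some of the info objects.
class LookupSketch {
public:
  virtual ~LookupSketch() = default;
  // Returns nullptr when the external source does not know the name.
  virtual InfoSketch *get(const std::string &Name) = 0;
};

// Trivial external source that knows exactly one identifier.
class FixedLookupSketch : public LookupSketch {
  InfoSketch Known;

public:
  explicit FixedLookupSketch(const std::string &Name) : Known{Name} {}
  InfoSketch *get(const std::string &Name) override {
    return Name == Known.Name ? &Known : nullptr;
  }
};

// Hypothetical stand-in for IdentifierTable.
class TableSketch {
  std::unordered_map<std::string, InfoSketch> Map; // stands in for StringMap
  LookupSketch *External;                          // may be null

public:
  explicit TableSketch(LookupSketch *L = nullptr) : External(L) {}

  InfoSketch &get(const std::string &Name) {
    assert(!Name.empty() && "identifiers are expected to be non-empty");
    auto It = Map.find(Name);
    if (It != Map.end())
      return It->second;
    // The extra null pointer check: cheap next to the hashing done above.
    if (External)
      if (InfoSketch *II = External->get(Name))
        return *II; // owned by the external lookup object, not the table
    return Map.emplace(Name, InfoSketch{Name}).first->second;
  }
};
```

With this shape, clients only ever call TableSketch::get() and cannot tell whether the returned object lives in the table or in the external source, which is the ownership question flagged for review above.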
PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
IdentifierTable's string map, and instead create IdentifierInfo
objects on the fly when mapping from persistent IDs to
IdentifierInfos. This saves a ton of work with string copies,
hashing, and StringMap lookup and resizing. This change was
motivated because when processing source files in the PTH cache we
don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
IdentifierTable to transparently use IdentifierInfo objects managed
by the PTH file. PTHManager resolves "string -> IdentifierInfo"
queries by doing a binary search over a sorted table of identifier
strings in the PTH file (the exact algorithm we use can be changed
as needed).
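The binary search over a sorted identifier table can be sketched as follows. This is an illustration only: findIdentifierSketch is a hypothetical name, an in-memory std::vector stands in for the table stored in the PTH file, and the returned index merely stands in for a persistent ID.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns the position of Name in the sorted table, or -1 if it is absent.
int findIdentifierSketch(const std::vector<std::string> &Sorted,
                         const std::string &Name) {
  assert(std::is_sorted(Sorted.begin(), Sorted.end()) &&
         "binary search requires a sorted table");
  // lower_bound yields the first entry that is not less than Name.
  auto It = std::lower_bound(Sorted.begin(), Sorted.end(), Name);
  if (It == Sorted.end() || *It != Name)
    return -1;
  return static_cast<int>(It - Sorted.begin());
}
```

A lookup costs O(log n) string comparisons and needs no hashing or insertion, which fits the goal of not populating the string map for identifiers that come from the PTH file.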
These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement
llvm-svn: 62273
- Look at the number of sign bits of a sext instruction to determine whether a new trunc + sext pair should be added when its source is being evaluated in a different type.
llvm-svn: 62263
sequences in SPUDAGToDAGISel.cpp and SPU64InstrInfo.td, killing custom
DAG node types as needed.
- i64 mul is now a legal instruction, but emits an instruction sequence
that stretches tblgen and the imagination, as well as violating laws of
several small countries and most southern US states (just kidding, but
looking at a function with 80+ parameters is really weird and just plain
wrong.)
- Update tests as needed.
llvm-svn: 62254
- Mostly written as an entertaining exercise in enumerating large or
(countably, naturally) infinite sets. But hey, it's useful too!
- Idea is to number all C-types so that the N-th type can quickly be
computed, with a good deal of flexibility about what types to
include, and taking some care so that the (N+1)-th type is
interestingly different from the N-th type. For example, using the
default generator, the 1,000,000-th function type is:
--
typedef _Complex int T0;
typedef char T1 __attribute__ ((vector_size (4)));
typedef int T2 __attribute__ ((vector_size (4)));
T2 fn1000000(T0 arg0, signed long long arg1, T1 arg2, T0 arg3);
--
and the 1,000,001-th type is:
--
typedef _Complex char T0;
typedef _Complex char T2;
typedef struct T1 { T2 field0; T2 field1; T2 field2; } T1;
typedef struct T3 { } T3;
unsigned short fn1000001(T0 arg0, T1 arg1, T3 arg2);
--
Computing the 10^1600-th type takes a little less than 1s. :)
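As an illustration of how the N-th element of an infinite set can be computed directly, here is a sketch using the Cantor pairing function to enumerate pairs of naturals. This is a standard technique chosen for the example and not necessarily the scheme the generator uses; pairIndexSketch and nthPairSketch are hypothetical names. The sketch is limited to 64-bit indices, so reaching something like the 10^1600-th element would need arbitrary-precision integers.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Cantor pairing: maps the pair (X, Y) to a unique index.
std::uint64_t pairIndexSketch(std::uint64_t X, std::uint64_t Y) {
  std::uint64_t W = X + Y;
  return W * (W + 1) / 2 + Y;
}

// Inverse: recover the N-th pair without enumerating the earlier pairs.
std::pair<std::uint64_t, std::uint64_t> nthPairSketch(std::uint64_t N) {
  assert(N < (std::uint64_t(1) << 61) && "sketch avoids 64-bit overflow");
  // Binary search for the largest W with W * (W + 1) / 2 <= N.
  std::uint64_t Lo = 0, Hi = std::uint64_t(1) << 31;
  while (Lo < Hi) {
    std::uint64_t Mid = Lo + (Hi - Lo + 1) / 2;
    if (Mid * (Mid + 1) / 2 <= N)
      Lo = Mid;
    else
      Hi = Mid - 1;
  }
  // Y is the offset of N within diagonal Lo; X is the remainder.
  std::uint64_t Y = N - Lo * (Lo + 1) / 2;
  return {Lo - Y, Y};
}
```

Composite objects (a function type with a return type and argument types, say) can then be numbered by applying such a pairing repeatedly to the indices of their components.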
llvm-svn: 62253
lexical order of the corresponding identifier strings. This will be used for a
forthcoming optimization. This slows down PTH generation time by 7%. We can
revert this change if the optimization proves to not be valuable.
llvm-svn: 62248
This requires some hackery, as gcc's PCH mechanism changes behavior,
whereas PTH is simply a cache. Notably:
- Automatically cause clang to load a .pth file if we find one that
matches a command line -include argument (similar to how gcc
looks for .gch files).
- When generating precompiled headers, translate the suffix from .gch
to .pth (so we do not conflict with actual gcc PCH files).
- When generating precompiled headers, copy the input header to the
same location as the output PTH file. This is necessary because gcc
supports -include xxx.h even if xxx.h doesn't exist, but for clang
we need to actually have the contents of this file available.
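The suffix translation can be sketched as below. translatePCHSuffixSketch is a hypothetical helper written for this illustration and is not the code added to the driver; it only shows the .gch to .pth rewrite on an output path.

```cpp
#include <cassert>
#include <string>

// Rewrites a trailing ".gch" to ".pth"; other paths are returned unchanged.
std::string translatePCHSuffixSketch(const std::string &Path) {
  assert(!Path.empty() && "expected an output file name");
  const std::string From = ".gch";
  const std::string To = ".pth";
  if (Path.size() >= From.size() &&
      Path.compare(Path.size() - From.size(), From.size(), From) == 0)
    return Path.substr(0, Path.size() - From.size()) + To;
  return Path;
}
```

For example, an output name of Cocoa.h.gch becomes Cocoa.h.pth, so the generated cache never overwrites a real gcc PCH file with the same stem.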
llvm-svn: 62246
This change refactors and cleans up our handling of name lookup with
LookupDecl. There are several aspects to this refactoring:
- The criteria for name lookup are now encapsulated in the class
LookupCriteria, which replaces the hideous set of boolean values
that LookupDecl currently has.
- The results of name lookup are returned in a new class
LookupResult, which can lazily build OverloadedFunctionDecls for
overloaded function sets (and, eventually, eliminate the need to
allocate memory for OverloadedFunctionDecls) and contains a
placeholder for handling ambiguous name lookup (for C++).
- The primary entry points for name lookup are now LookupName (for
unqualified name lookup) and LookupQualifiedName (for qualified
name lookup). There is also a convenience function
LookupParsedName that handles qualified/unqualified name lookup
when given a scope specifier. Together, these routines are meant
to gradually replace the kludgy LookupDecl, but this won't happen
until after we have base class lookup (which forces us to cope
with ambiguities).
- Documented the heck out of name lookup. Experimenting a little
with using Doxygen's member groups to make some sense of the Sema
class. Feedback welcome!
- Fixes some lingering issues with name lookup for
nested-name-specifiers, which now goes through
LookupName/LookupQualifiedName.
llvm-svn: 62245
frame index. eliminateFrameIndex will replace these instructions with
(LDWSP|STWSP|LDAWSP) or (LDW|STW|LDAWF) if a frame pointer is in use.
This fixes PR 3324. Previously we used LDWSP, STWSP, LDAWSP before frame
pointer elimination. However since they were marked as implicitly using
SP they could not be rematerialised.
llvm-svn: 62238
Small cleanup in the handling of user-defined conversions.
Also, implement an optimization when constructing a call. We avoid
recomputing implicit conversion sequences and instead use those
conversion sequences that we computed as part of overload resolution.
llvm-svn: 62231
my earlier patch to this file.
The issue there was that all uses of an IV inside a loop
are actually references to Base[IV*2], and there was one
use outside that was the same but LSR didn't see the base
or the scaling because it didn't recurse into uses outside
the loop; thus, it used base+IV*scale mode inside the loop
instead of pulling base out of the loop. This was extra bad
because register pressure later forced both base and IV into
memory. Doing that recursion, at least enough
to figure out addressing modes, is a good idea in general;
the change in AddUsersIfInteresting does this. However,
there were side effects....
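For illustration, a source-level loop with the shape described above might look like the following. This is a hypothetical example, not the testcase in question: sumEvenSlotsSketch is an invented name, and whether LSR actually picks a given addressing mode for it depends on the target.

```cpp
#include <cassert>

// Every use of IV inside the loop is Base[IV * 2]; the one use after the
// loop has the same base and the same scale.
int sumEvenSlotsSketch(const int *Base, int N) {
  assert(N >= 0 && "element count must be non-negative");
  int Sum = 0;
  int IV = 0;
  for (; IV < N; ++IV)
    Sum += Base[IV * 2]; // in-loop use: base + IV * scale
  // Out-of-loop use: seen only if the analysis recurses into users
  // outside the loop.
  return Sum + Base[IV * 2];
}
```

The caller must supply at least 2 * N + 1 elements, since the use after the loop reads Base[2 * N].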
It is also possible for recursing outside the loop to
introduce another IV where there was only 1 before (if
the refs inside are not scaled and the ref outside is).
I don't think this is a common case, but it's in the testsuite.
It is right to be very aggressive about getting rid of
such introduced IVs (CheckForIVReuse and the handling of
nonzero RewriteFactor in StrengthReduceStridedIVUsers).
In the testcase in question the new IV produced this way
has both a nonconstant stride and a nonzero base, neither
of which was handled before. And when inserting
new code that feeds into a PHI, it's right to put such
code at the original location rather than in the PHI's
immediate predecessor(s) when the original location is outside
the loop (a case that couldn't happen before)
(RewriteInstructionToUseNewBase); better to avoid making
multiple copies of it in this case.
Also, the mechanism for keeping SCEV's corresponding to GEP's
no longer works, as the GEP might change after its SCEV
is remembered, invalidating the SCEV, and we might get a bad
SCEV value when looking up the GEP again for a later loop.
This also couldn't happen before, as we weren't recursing
into GEP's outside the loop.
Also, when we build an expression that involves a (possibly
non-affine) IV from a different loop as well as an IV from
the one we're interested in (containsAddRecFromDifferentLoop),
don't recurse into that. We can't do much with it and will
get in trouble if we try to create new non-affine IVs or something.
More testcases are coming.
llvm-svn: 62212