mutated by recursive simplification. This also enhances
ReplaceAndSimplifyAllUses to actually do a real RAUW
at the end of it, which updates any value handles
pointing to "From" to start pointing to "To". This
seems useful for debug info and random other VH users.
llvm-svn: 108415
that was actually useful here.
Chris, please double check that this is the correct interpretation. I was
pretty sure, and ran it by Nick as well.
llvm-svn: 108129
interface needs implementations to be consistent, so any code which
wants to support different semantics must use a different interface.
It's not currently worthwhile to add a new interface for this new
concept.
Document that AliasAnalysis doesn't support cross-function queries.
llvm-svn: 107776
entries associated with the value being erased in the
folding set map. These entries used to be harmless, because
a SCEVUnknown doesn't store any information about its Value*,
so having a new Value allocated at the old Value's address
wasn't a problem. But now that ScalarEvolution is storing more
information about values, this is no longer safe.
llvm-svn: 107316
on ScalarEvolution successfully folding and preserving
range information for both A-B and B-A. Now, if it gets
either one, it's sufficient.
llvm-svn: 107249
instruction to an add scev, it's not safe to blindly transfer the
inbounds flag from a gep instruction to an nsw on the scev for the
gep.
llvm-svn: 107117
assuming that loops are in canonical form, as ScalarEvolution doesn't
depend on LoopSimplify itself. Also, with indirectbr not all loops can
be simplified. This fixes PR7416.
llvm-svn: 106389
scrounging through SCEVUnknown contents and SCEVNAryExpr operands;
instead just do a simple deterministic comparison of the precomputed
hash data.
Also, since this is more precise, it eliminates the need for the slow
N^2 duplicate detection code.
llvm-svn: 105540
encapsulation to force the users of these classes to know about the internal
data structure of the Operands structure. It also can lead to errors, like in
the MSIL writer.
llvm-svn: 105539
on RAUW of functions, this is a correctness issue instead of a mere memory
usage problem.
No testcase until the new MergeFunctions can land.
llvm-svn: 103653
Also, pass true for isSigned even when creating constants for unsigned
comparisons, because the point is to create an all-ones constant,
rather than UINT64_MAX, even for integers wider than 64 bits.
llvm-svn: 102946
SimplifyICmpOperands will simplify such cases to EQ or NE. This makes
the correcntess of the code independent on SimplifyICmpOperands doing
certain simplifications.
llvm-svn: 102927
that can have a big effect :). The first is to enable the
iterative SCC passmanager juice that kicks in when the
scc passmgr detects that a function pass has devirtualized
a call. In this case, it will rerun all the passes it
manages on the SCC, up to the iteration count limit (4). This
is useful because a function pass may devirualize a call, and
we want the inliner to inline it, or pruneeh to infer stuff
about it, etc.
The second patch is to add *all* call sites to the
DevirtualizedCalls list the inliner uses. This list is
about to get renamed, but the jist of this is that the
inliner now reconsiders *all* inlined call sites as candidates
for further inlining. The intuition is this that in cases
like this:
f() { g(1); } g(int x) { h(x); }
We analyze this bottom up, and may decide that it isn't
profitable to inline H into G. Next step, we decide that it is
profitable to inline G into F, and do so, which means that F
now calls H. Even though the call from G -> H may not have been
profitable to inline, the call from F -> H may be (in this case
because a constant allows folding etc).
In my spot checks, this doesn't have a big impact on code. For
example, the LLC output for 252.eon grew from 0.02% (from
317252 to 317308) and 176.gcc actually shrunk by .3% (from 1525612
to 1520964 bytes). 252.eon never iterated in the SCC Passmgr,
176.gcc iterated at most 1 time.
llvm-svn: 102823
were still inlining self-recursive functions into other functions.
Inlining a recursive function into itself has the potential to
reduce recursion depth by a factor of 2, inlining a recursive
function into something else reduces recursion depth by exactly
1. Since inlining a recursive function into something else is a
weird form of loop peeling, turn this off.
The deleted testcase was added by Dale in r62107, since then
we're leaning towards not inlining recursive stuff ever. In any
case, if we like inlining recursive stuff, it should be done
within the recursive function itself to get the algorithm
recursion depth win.
llvm-svn: 102798
doesn't dominate the header is needed, don't check whether the increment
expression has computable loop evolution. While the operands of an
addrec are required to be loop-invariant, they're not required to
dominate any part of the loop. This fixes PR6914.
llvm-svn: 102389
Also, generalize ScalarEvolutions's min and max recognition to handle
some new forms of min and max that this change makes more common.
llvm-svn: 102234
Add the instruction pointer value for debuggability.
We now get dump output that looks like this:
Call graph node for function: 'f1'<<0x1017086b0>> #uses=1
CS<0x1017046f8> calls external node
Call graph node for function: '_ZNSt6vectorIdSaIdEEC1EmRKdRKS0_'<<0x1017086f0>> #uses=1
CS<0x0> calls external node
Call graph node for function: 'f4'<<0x1017087a0>> #uses=1
CS<0x101708c88> calls function 'f3'
llvm-svn: 102194
Fix RefreshCallGraph to use CGN->replaceCallEdge instead of hand
rolling its own loop. replaceCallEdge properly maintains the
reference counts of the nodes, fixing a crash exposed by the
iterative callgraph stuff.
llvm-svn: 102120
we have RefreshCallGraph detect when a function pass devirtualizes
a call, and have CGSCCPassMgr iterate (up to a count) when this
happens. This allows (in the example) GVN to devirtualize the
call in foo, then the inliner to inline it away.
This is not currently enabled because I haven't done any analysis
on the (potentially substantial) code size or performance impact of
doing this, and guess what, it exposes callgraph updating bugs in
various passes. This is progress though, and you can play with it
by passing -max-cg-scc-iterations=5 to opt.
llvm-svn: 101973
recursive callsites, inlining can reduce the number of calls by
exponential factors, as it does in
MultiSource/Benchmarks/Olden/treeadd. More involved heuristics
will be needed.
llvm-svn: 101969
just ask ScalarEvolution for it on demand. This helps IVUsers be more robust
in the case of expressions changing underneath it. This fixes PR6862.
llvm-svn: 101819
by switching CachedFunctionInfo from a std::map to a
ValueMap (which is implemented in terms of a DenseMap).
DenseMap has different iterator invalidation semantics
than std::map.
This should hopefully fix the dragonegg builder.
llvm-svn: 101658
CGSCC can delete nodes in regions of the callgraph that
have already been visited. If new CG nodes are allocated
to the same pointer, we shouldn't abort, just handle it
correctly by assigning a new number. This should restore
stability by removing invalidated pointers that *will* be
reused from the densemap in the iterator.
llvm-svn: 101628
to keep the node entries in scc_iterator up to date instead of dangling as
the SCC mutates.
This is a really terrible problem which was causing -g to affect codegen
because it would permute the memory image of the compiler process.
Thanks to Dale for expertly hunting it down.
llvm-svn: 101565
to CallGraphSCCPass's instead of passing around a
std::vector<CallGraphNode*>. No functionality change,
but now we have a much tidier interface.
llvm-svn: 101558
with a fix for self-hosting
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101465
with a fix
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101397
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101364
The information is already available with "opt -analyze". The DominatorTree
does also not have this in its runOnFunction. So they behave now
more consistent.
llvm-svn: 101038
AddRecs so that it checks for overflow in the computation that it is
performing, rather than just checking hasNo{Signed,Unsigned}Wrap, since
those flags are for a different computation. This fixes a bug that
impacts an upcoming change.
llvm-svn: 101028
explicitly split into stride-and-offset pairs. Also, add the
ability to track multiple post-increment loops on the same expression.
This refines the concept of "normalizing" SCEV expressions used for
to post-increment uses, and introduces a dedicated utility routine for
normalizing and denormalizing expressions.
This fixes the expansion of expressions which are post-increment users
of more than one loop at a time. More broadly, this takes LSR another
step closer to being able to reason about more than one loop at a time.
llvm-svn: 100699
source addition. Apparently the buildbots were wrong about failures.
---
Add some switches helpful for debugging:
-print-before=<Pass Name>
Dump IR before running pass <Pass Name>.
-print-before-all
Dump IR before running each pass.
-print-after-all
Dump IR after running each pass.
These are helpful when tracking down a miscompilation. It is easy to
get IR dumps and do diffs on them, etc.
To make this work well, add a new getPrinterPass API to Pass so that
each kind of pass (ModulePass, FunctionPass, etc.) can create a Pass
suitable for dumping out the kind of object the Pass works on.
llvm-svn: 100249
materializing an MDNode for every debugloc. don't do that! :)
"clang -g -S t.c" really no longer makes mdnodes for location
tuples now.
llvm-svn: 100224
representation. This eliminates the 'DILocation' MDNodes for
file/line/col tuples from -O0 -g codegen.
This remove the old DebugLoc class, making it a typedef for DebugLoc,
I'll rename NewDebugLoc next.
I didn't update the JIT to use the new apis, so it will continue to
work, but be as slow as before. Someone should eventually do this
or, better yet, rip out the JIT debug info stuff and build the JIT
on top of MC.
llvm-svn: 100209
<string> include. For some reason the buildbot choked on this while my
builds did not. It's probably due to a difference in system headers.
---
Add some switches helpful for debugging:
-print-before=<Pass Name>
Dump IR before running pass <Pass Name>.
-print-before-all
Dump IR before running each pass.
-print-after-all
Dump IR after running each pass.
These are helpful when tracking down a miscompilation. It is easy to
get IR dumps and do diffs on them, etc.
To make this work well, add a new getPrinterPass API to Pass so that
each kind of pass (ModulePass, FunctionPass, etc.) can create a Pass
suitable for dumping out the kind of object the Pass works on.
llvm-svn: 100204
-print-before=<Pass Name>
Dump IR before running pass <Pass Name>.
-print-before-all
Dump IR before running each pass.
-print-after-all
Dump IR after running each pass.
These are helpful when tracking down a miscompilation. It is easy to
get IR dumps and do diffs on them, etc.
To make this work well, add a new getPrinterPass API to Pass so that
each kind of pass (ModulePass, FunctionPass, etc.) can create a Pass
suitable for dumping out the kind of object the Pass works on.
llvm-svn: 100143
I have audited all getOperandNo calls now, fixing
hidden assumptions. CallSite related uglyness will
be eliminated successively.
Note this patch has a long and griveous history,
for all the back-and-forths have a look at
CallSite.h's log.
llvm-svn: 99399
use-before-def errors in SCEVExpander-produced code in sqlite3 when debug
info with optimization is enabled, though the testcases for this are
dependent on use-list order.
llvm-svn: 99001
This time I did a self-hosted bootstrap on Linux x86-64,
with no problems. Let's see how darwin 64-bit self-hosting
goes. At the first sign of failure I'll back this out.
Maybe the valgrind bots give me a hint of what may be wrong
(it at all).
llvm-svn: 98957
BumpPtrAllocator-allocated region to allow it to be stored in a more
compact form and to avoid the need for a non-trivial destructor call.
Use this new mechanism in ScalarEvolution instead of
FastFoldingSetNode to avoid leaking memory in the case where a
FoldingSetNodeID uses heap storage, and to reduce overall memory
usage.
llvm-svn: 98829
pointer and length, and allocate the arrays in ScalarEvolution's
BumpPtrAllocator, so that they get released when their owning
SCEV gets released. SCEVs are immutable, so they don't need to worry
about operand array resizing. This fixes a memory leak reported
in PR6637.
llvm-svn: 98755
The Caller cost info would be reset everytime a callee was inlined. If the
caller has lots of calls and there is some mutual recursion going on, the
caller cost info could be calculated many times.
This patch reduces inliner runtime from 240s to 0.5s for a function with 20000
small function calls.
This is a more conservative version of r98089 that doesn't break the clang
test CodeGenCXX/temp-order.cpp. That test relies on rather extreme inlining
for constant folding.
llvm-svn: 98099
The Caller cost info would be reset everytime a callee was inlined. If the
caller has lots of calls and there is some mutual recursion going on, the
caller cost info could be calculated many times.
This patch reduces inliner runtime from 240s to 0.5s for a function with 20000
small function calls.
llvm-svn: 98089