Anyone interested in more general PRE would be better served by implementing it separately, to get real
anticipation calculation, etc.
llvm-svn: 115337
Splitting critical edges at the merge point only addressed part of the issue; it is also possible for non-post-domination
to occur when the path from the load to the merge has branches in it. Unfortunately, full anticipation analysis is
time-consuming, so for now approximate it. This is strictly more conservative than real anticipation, so we will miss
some cases that real PRE would allow, but we also no longer insert loads into paths where they didn't exist before. :-)
This is a very slight net positive on SPEC for me (0.5% on average). Most of the benchmarks are largely unaffected, but
when it pays off it pays off decently: 181.mcf improves by 4.5% on my machine.
llvm-svn: 114785
I'm sure it is harmless. Original commit message:
If PrototypeValue is erased in the middle of using the SSAUpdator
then the SSAUpdator may access freed memory. Instead, simply pass
in the type and name explicitly, which is all that was used anyway.
llvm-svn: 112810
indirect branches in all the predecessors. This avoids unnecessarily
splitting edges in cases where load PRE is not possible anyway.
Thanks to Jakub Staszak for pointing this out.
llvm-svn: 103034
with a fix for self-hosting
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101465
with a fix
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101397
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101364
predecessors before returning. Otherwise, if multiple predecessor edges need
splitting, we only get one of them per iteration. This makes a small but
measurable compile time improvement with -enable-full-load-pre.
llvm-svn: 97521
argument is non-null, pass it along to PHITranslateSubExpr so that it can
prefer using existing values that dominate the PredBB, instead of just
blindly picking the first equivalent value that it finds on a uselist.
Also when the DominatorTree is specified, have PHITranslateValue filter
out any result that does not dominate the PredBB. This is basically just
refactoring the check that used to be in GetAvailablePHITranslatedSubExpr
and also in GVN.
Despite my initial expectations, this change does not affect the results
of GVN for any testcases that I could find, but it should help compile time.
Before this change, if PHITranslateSubExpr picked a value that does not
dominate, PHITranslateWithInsertion would then insert a new value, which GVN
would later determine to be redundant and would replace. By picking a good
value to begin with, we save GVN the extra work of inserting and then
replacing a new value.
llvm-svn: 97010