Add statistics for regular edge profiling, this enables the comparation of the
number of edges inserted by regular and optimal edge profiling.
llvm-svn: 80668
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
llvm-svn: 80663
changes: SimplifyDemandedBits can't use the builder yet because it
has the wrong insertion point. This fixes a crash building
MultiSource/Benchmarks/PAQ8p
llvm-svn: 80537
instead of CallGraphNode*'s. This also papers over a callgraph
problem where a pass (in this case, MemCpyOpt) introduces a new
function into the module (llvm.memset.i64) but doesn't add it to
the call graph (nor should it, since it is a function pass).
While it might be a good idea for MemCpyOpt to not synthesize
functions in a runOnFunction(), there is no need for FunctionAttrs
to be boneheaded, so fix it there. This fixes an assertion building
176.gcc.
llvm-svn: 80535
indirect function pointer, inline it, then go to delete the body.
The problem is that the callgraph had other references to the function,
though the inliner had no way to know it, so we got a dangling pointer
and an invalid iterator out of the deal.
The fix to this is pretty simple: stop the inliner from deleting the
function by knowing that there are references to it. Do this by making
CallGraphNodes contain a refcount. This requires moving deletion of
available_externally functions to the module-level cleanup sweep where
it belongs.
llvm-svn: 80533
argpromotion and structretpromote. Basically, when replacing
a function, they used the 'changeFunction' api which changes
the entry in the function map (and steals/reuses the callgraph
node).
This has some interesting effects: first, the problem is that it doesn't
update the "callee" edges in any callees of the function in the call graph.
Second, this covers for a major problem in all the CGSCC pass stuff, which
is that it is completely broken when functions are deleted if they *don't*
reuse a CGN. (there is a cute little fixme about this though :).
This patch changes the protocol that CGSCC passes must obey: now the CGSCC
pass manager copies the SCC and preincrements its iterator to avoid passes
invalidating it. This allows CGSCC passes to mutate the current SCC. However
multiple passes may be run on that SCC, so if passes do this, they are now
required to *update* the SCC to be current when they return.
Other less interesting parts of this patch are that it makes passes update
the CG more directly, eliminates changeFunction, and requires clients of
replaceCallSite to specify the new callee CGN if they are changing it.
llvm-svn: 80527
is itself a bitcast. Since we have gep(bitcast(bitcast(y))) in this
case, just wait for the two bitcasts to get zapped. This prevents
instcombine from confusing some aliasing stuff, and allows it to
directly eliminate the load in the testcase.
llvm-svn: 80508
workslist and is set to insert new instructions before the current one.
Convert a bunch of stuff that used to call InsertNewInstBefore over to
use it, greatly simplifying code and making it more natural.
There is still a lot more to go, but this is a good start.
llvm-svn: 80492
if the operand is not an instruction.
Simplify most uses of AddOperandsToWorkList to use AddValue and
inline it into the one remaining callsite.
llvm-svn: 80488
former looks too much like AddUsersToWorkList and keeps
confusing me.
Remove AddSoonDeadInstToWorklist and change its two callers
to do the same thing in a simpler way.
llvm-svn: 80486
into their callers. simplify ReplaceInstUsesWith. Make
EraseInstFromFunction only add operands to the worklist if
there aren't too many of them (this was a scalability win
for crazy programs that was only infrequently enforced).
Switch more code to using EraseInstFromFunction instead of
duplicating it inline. Change some fcmp/icmp optimizations
to modify fcmp/icmp in place instead of creating a new one
and deleting the old one just to change the predicate.
llvm-svn: 80483
This implements the maximum spanning tree algorithm on CFGs according to
weights given by the ProfileEstimator. This is then used to implement Optimal
Edge Profiling.
llvm-svn: 80358
calls into a function and if the calls bring in arrays, try to merge
them together to reduce stack size. For example, in the testcase
we'd previously end up with 4 allocas, now we end up with 2 allocas.
As described in the comments, this is not really the ideal solution
to this problem, but it is surprisingly effective. For example, on
176.gcc, we end up eliminating 67 arrays at "gccas" time and another
24 at "llvm-ld" time.
One piece of concern that I didn't look into: at -O0 -g with
forced inlining this will almost certainly result in worse debug
info. I think this is acceptable though given that this is a case
of "debugging optimized code", and we don't want debug info to
prevent the optimizer from doing things anyway.
llvm-svn: 80215
and introduce a new Instruction::isIdenticalTo which tests for full
identity, including the SubclassOptionalData flags. Also, fix the
Instruction::clone implementations to preserve the SubclassOptionalData
flags. Finally, teach several optimizations how to handle
SubclassOptionalData correctly, given these changes.
This fixes the counterintuitive behavior of isIdenticalTo not comparing
the full value, and clone not returning an identical clone, as well as
some subtle bugs that could be caused by these.
Thanks to Nick Lewycky for reporting this, and for an initial patch!
llvm-svn: 80038
sinking code, since they are special. If the loop preheader happens
to be the entry block of a function, don't sink static allocas
out of it. This fixes PR4775.
llvm-svn: 80010
of an extracted block contains a PHI using a value defined in the extracted region.
With this patch, the partial inliner now passes MultiSource/Applications.
llvm-svn: 79963
try to use i686-darwin to build for arm-eabi, you'll quickly run into
several false assumptions that the target OS must be the same as the
host OS. These patches split $(OS) into $(HOST_OS) and $(TARGET_OS) to
help builds like "make check" and the test-suite able to cross
compile. Along the way a target of *-unknown-eabi is defined as
"Freestanding" so that TARGET_OS checks have something to work with.
Patch by Sandeep Patel!
llvm-svn: 79296
vector (&Formals[0]). With this change llvm-gcc builds
with expensive checking enabled for C, C++ and Fortran.
While there, change a std::vector into a SmallVector.
This is partly gratuitous, but mostly because not all
STL vector implementations define the data method (and
it should be faster).
llvm-svn: 79237
unfoldable references to a PHI node in the block being folded, and disable
the transformation in that case. The correct transformation of such PHI
nodes depends on whether BB dominates Succ, and dominance is expensive
to compute here. (Alternatively, it's possible to check whether any
uses are live, but that's also essentially a dominance calculation.
Another alternative is to use reg2mem, but it probably isn't a good idea to
use that in simplifycfg.)
Also, remove some incorrect code from CanPropagatePredecessorsForPHIs
which is made unnecessary with this patch: it didn't consider the case
where a PHI node in BB has multiple uses.
llvm-svn: 79174
the new load by the old load instead of by the extract element because
a store could have occurred between the load and extract element.
llvm-svn: 78891
- Part of optimal static profiling patch sequence by Andreas Neustifter.
- Store edge, block, and function information separately for each functions
(instead of in one giant map).
- Return frequencies as double instead of int, and use a sentinel value for
missing information.
llvm-svn: 78477
appropriate. Patch per report on llvmdev. No testcase because the
original report didn't come with a testcase, and I can't come up with a case
that actually fails.
llvm-svn: 77986
a Twine, e.g., for names).
- I am a little ambivalent about this; we don't want the string conversion of
utostr, but using overload '+' mixed with string and integer arguments is
sketchy. On the other hand, this particular usage is something of an idiom.
llvm-svn: 77579
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
llvm-svn: 77019
- Yay for '-'s and simplifications!
- I kept StringMap::GetOrCreateValue for compatibility purposes, this can
eventually go away. Likewise the StringMapEntry Create functions still follow
the old style.
- NIFC.
llvm-svn: 76888
also apply to vectors. This allows us to compile this:
#include <emmintrin.h>
__m128i a(__m128 a, __m128 b) { return a==a & b==b; }
__m128i b(__m128 a, __m128 b) { return a!=a | b!=b; }
to:
_a:
cmpordps %xmm1, %xmm0
ret
_b:
cmpunordps %xmm1, %xmm0
ret
with clang instead of to a ton of horrible code.
llvm-svn: 76863
functions with a single use; eliminating the single use may eliminate
the function from the current module, but usually doesn't eliminate
it from the final program.
llvm-svn: 76730
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.
Remove a few optimizations that depended on this flag.
llvm-svn: 76437
"private" symbols which the assember shouldn't strip, but which the linker may
remove after evaluation. This is mostly useful for Objective-C metadata.
This is plumbing, so we don't have a use of it yet. More to come, etc.
llvm-svn: 76385
insertelement/extractelement.
I'm not entirely sure this is precisely what we want to do: should we
prefer bitcast(insertelement) or insertelement(bitcast)? Similarly. should we
prefer extractelement(bitcast) or bitcast(extractelement)?
llvm-svn: 76345
all values belonging to the intersection will belong to the resulting range.
The former was inconsistent about that point (either way is fine, just pick
one.) This is part of PR4545.
llvm-svn: 76289
GEPOperator's hasNoPointer0verflow(), and make a few places in instcombine
that create GEPs that may overflow clear the NoOverflow value. Among
other things, this partially addresses PR2831.
llvm-svn: 76252
in a convenient manner, factoring out some common code from
InstructionCombining and ValueTracking. Move the contents of
BinaryOperators.h into Operator.h and use Operator to generalize them
to support ConstantExprs as well as Instructions.
llvm-svn: 76232
isSafeToSpeculativelyExecute. The new method is a bit closer to what
the callers actually care about in that it rejects more things callers
don't want. It also adds more precise handling for integer
division, and unifies code for analyzing the legality of a speculative
load.
llvm-svn: 76150
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
llvm-svn: 75640
(I think it's reasonably clear that we want to have a canonical form for
constructs like this; if anyone thinks that a select is not the best
canonical form, please tell me.)
llvm-svn: 75531
using the Curiously Recurring Template Pattern with LoopBase.
This will help further refactoring, and future functionality for
Loop. Also, Headers can now foward-declare Loop, instead of pulling
in LoopInfo.h or doing tricks.
llvm-svn: 75519
the changes are allowed by not calling this function for bitcasts.
The Instruction::AShr case is dead because
SimplifyDemandedInstructionBits handles that case.
llvm-svn: 75514
This involves temporarily hard wiring some parts to use the global context. This isn't ideal, but it's
the only way I could figure out to make this process vaguely incremental.
llvm-svn: 75445
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
llvm-svn: 75379
per icmp predicate out of predsimplify and into ConstantRange.
Add another utility method that determines whether one range is a subset of
another. Combine with the former to determine whether icmp pred range, range
is known to be true or not.
llvm-svn: 75357