fix: add a flag to MapValue and friends which indicates whether
any module-level mappings are being made. In the common case of
inlining, no module-level mappings are needed, so MapValue doesn't
need to examine non-function-local metadata, which can be very
expensive in the case of a large module with really deep metadata
(e.g. a large C++ program compiled with -g).
This flag is a little awkward; perhaps eventually it can be moved
into the ClonedCodeInfo class.
llvm-svn: 112190
that can have a big effect :). The first is to enable the
iterative SCC passmanager juice that kicks in when the
scc passmgr detects that a function pass has devirtualized
a call. In this case, it will rerun all the passes it
manages on the SCC, up to the iteration count limit (4). This
is useful because a function pass may devirualize a call, and
we want the inliner to inline it, or pruneeh to infer stuff
about it, etc.
The second patch is to add *all* call sites to the
DevirtualizedCalls list the inliner uses. This list is
about to get renamed, but the jist of this is that the
inliner now reconsiders *all* inlined call sites as candidates
for further inlining. The intuition is this that in cases
like this:
f() { g(1); } g(int x) { h(x); }
We analyze this bottom up, and may decide that it isn't
profitable to inline H into G. Next step, we decide that it is
profitable to inline G into F, and do so, which means that F
now calls H. Even though the call from G -> H may not have been
profitable to inline, the call from F -> H may be (in this case
because a constant allows folding etc).
In my spot checks, this doesn't have a big impact on code. For
example, the LLC output for 252.eon grew from 0.02% (from
317252 to 317308) and 176.gcc actually shrunk by .3% (from 1525612
to 1520964 bytes). 252.eon never iterated in the SCC Passmgr,
176.gcc iterated at most 1 time.
llvm-svn: 102823
This fixes a bug where calls inlined into an invoke would get
changed into an invoke but the array would keep pointing to
the (now dead) call. The improved inliner behavior is still
disabled for now.
llvm-svn: 102196
that appear in the SCC as a result of inlining as candidates
for inlining. Change this so that it *does* consider call
sites that change from being indirect to being direct as a
result of inlining. This allows it to completely
"devirtualize" the testcase.
llvm-svn: 102146
arguments are handled with a new InlineFunctionInfo class. This
makes it easier to extend InlineFunction to return more info in the
future.
llvm-svn: 102137
define void @f3(void (i8*)* %__f) ssp {
entry:
call void %__f(i8* undef)
unreachable
}
define void @f4(i8* %this) ssp align 2 {
entry:
call void @f3(void (i8*)* @f2) ssp
ret void
}
The inliner is turning the indirect call to %__f into a direct
call to F2. Make the call graph more precise when this happens.
The inliner doesn't revisit call sites introduced by inlining,
so there isn't an easy way to test for this, but a more precise
callgraph is a good thing.
llvm-svn: 102131
with a fix for self-hosting
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101465
with a fix
rotate CallInst operands, i.e. move callee to the back
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101397
of the operand array
the motivation for this patch are laid out in my mail to llvm-commits:
more efficient access to operands and callee, faster callgraph-construction,
smaller compiler binary
llvm-svn: 101364
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
llvm-svn: 100304
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
llvm-svn: 100191
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.
llvm-svn: 99928
for the noinline attribute, and make the inliner refuse to
inline a call site when the call site is marked noinline even
if the callee isn't. This fixes PR6682.
llvm-svn: 99341
with multiple return values it inserts a PHI to merge them all together.
However, if the return values are all the same, it ends up with a pointless
PHI and this pointless PHI happens to really block SRoA from happening in
at least a silly C++ example written by Doug, but probably others. This
fixes rdar://7339069.
llvm-svn: 85206
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
llvm-svn: 80663
llvm.dbg.region.end instrinsic. This nested llvm.dbg.func.start/llvm.dbg.region.end pair now enables DW_TAG_inlined_subroutine support in code generator.
llvm-svn: 69118
g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic'
make[3]: *** [llvm-convert.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2
llvm-svn: 59809
s/ParamAttr/Attribute/g
s/PAList/AttrList/g
s/FnAttributeWithIndex/AttributeWithIndex/g
s/FnAttr/Attribute/g
This sets the stage
- to implement function notes as function attributes and
- to distinguish between function attributes and return value attributes.
This requires corresponding changes in llvm-gcc and clang.
llvm-svn: 56622
because it does not maintain a correct list
of callsites. I discovered (see following
commit) that the inliner will create a wrong
callgraph if it is fed a callgraph with
correct edges but incorrect callsites. These
were created by Prune-EH, and while it wasn't
done via removeCallEdgeTo, it could have been
done via removeCallEdgeTo, which is an accident
waiting to happen. Use removeCallEdgeFor
instead.
llvm-svn: 55859
In particular, Collector was confusing to implementors. Several
thought that this compile-time class was the place to implement
their runtime GC heap. Of course, it doesn't even exist at runtime.
Specifically, the renames are:
Collector -> GCStrategy
CollectorMetadata -> GCFunctionInfo
CollectorModuleMetadata -> GCModuleInfo
CollectorRegistry -> GCRegistry
Function::getCollector -> getGC (setGC, hasGC, clearGC)
Several accessors and nested types have also been renamed to be
consistent. These changes should be obvious.
llvm-svn: 54899
Remove the GetResultInst instruction. It is still accepted in LLVM assembly
and bitcode, where it is now auto-upgraded to ExtractValueInst. Also, remove
support for return instructions with multiple values. These are auto-upgraded
to use InsertValueInst instructions.
The IRBuilder still accepts multiple-value returns, and auto-upgrades them
to InsertValueInst instructions.
llvm-svn: 53941
needs to be fixed here - a previous commit made sure
that intrinsics always get the right attributes.
So remove no-longer needed code, and while there use
Intrinsic::getDeclaration rather than getOrInsertFunction.
llvm-svn: 49337
define void @f() {
...
call i32 @g()
...
}
define void @g() {
...
}
The hazards are:
- @f and @g have GC, but they differ GC. Inlining is invalid. This
may never occur.
- @f has no GC, but @g does. g's GC must be propagated to @f.
The other scenarios are safe:
- @f and @g have the same GC.
- @f and @g have no GC.
- @g has no GC.
This patch adds inliner checks for the former two scenarios.
llvm-svn: 45351
calls 'nounwind'. It is important for correct C++
exception handling that nounwind markings do not get
lost, so this transformation is actually needed for
correctness.
llvm-svn: 45218
calls. Remove special casing of inline asm from the
inliner. There is a potential problem: the verifier
rejects invokes of inline asm (not sure why). If an
asm call is not marked "nounwind" in some .ll, and
instcombine is not run, but the inliner is run, then
an illegal module will be created. This is bad but
I'm not sure what the best approach is. I'm tempted
to remove the check in the verifier...
llvm-svn: 45073
throw exceptions", just mark intrinsics with the nounwind
attribute. Likewise, mark intrinsics as readnone/readonly
and get rid of special aliasing logic (which didn't use
anything more than this anyway).
llvm-svn: 44544
the function type, instead they belong to functions
and function calls. This is an updated and slightly
corrected version of Reid Spencer's original patch.
The only known problem is that auto-upgrading of
bitcode files doesn't seem to work properly (see
test/Bitcode/AutoUpgradeIntrinsics.ll). Hopefully
a bitcode guru (who might that be? :) ) will fix it.
llvm-svn: 44359
This patch replaces signed integer types with signless ones:
1. [US]Byte -> Int8
2. [U]Short -> Int16
3. [U]Int -> Int32
4. [U]Long -> Int64.
5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion
and other methods related to signedness. In a few places this warranted
identifying the signedness information from other sources.
llvm-svn: 32785
target CG node. This allows the inliner to properly update the callgraph
when using the pruning inliner. The pruning inliner may not copy over all
call sites from a callee to a caller, so the edges corresponding to those
call sites should not be copied over either.
This fixes PR827 and Transforms/Inline/2006-07-12-InlinePruneCGUpdate.ll
llvm-svn: 29120
makes it so that it constant folds instructions on the fly. This is good
for several reasons:
0. Many instructions are constant foldable after inlining, particularly if
inlining a call with constant arguments.
1. Without this, the inliner has to allocate memory for all of the instructions
that can be constant folded, then a subsequent pass has to delete them. This
gets the job done without this extra work.
2. This makes the inliner *pass* a bit more aggressive: in particular, it
partially solves a phase order issue where the inliner would inline lots
of code that folds away to nothing, but think that the resultant function
is big because of this code that will be gone. Now the code never exists.
This is the first part of a 2-step process. The second part will be smart
enough to see when this implicit constant folding propagates a constant into
a branch or switch instruction, making CFG edges dead.
This implements Transforms/Inline/inline_constprop.ll
llvm-svn: 28521
it doesn't contain any calls. This is a fairly common case for C++ code,
so it will probably speed up the inliner marginally in these cases.
llvm-svn: 25284
function was not an alloca, we wouldn't check the entry block for any allocas,
leading to increased stack space in some cases. In practice, allocas are almost
always at the top of the block, so this was never noticed.
llvm-svn: 25280
Basically we were using SimplifyCFG as a huge sledgehammer for a simple
optimization. Because simplifycfg does so many things, we can't use it
for this purpose.
llvm-svn: 12977
allowed in invoke instructions. Thus, if we are inlining a call to an intrinsic
function into an invoke site, we don't need to turn the call into an invoke!
llvm-svn: 11384
1. Don't scan to the end of alloca instructions in the caller function to
insert inlined allocas, just insert at the top. This saves a lot of
time inlining into functions with a lot of allocas.
2. Use splice to move the alloca instructions over, instead of remove/insert.
This allows us to transfer a block at a time, and eliminates a bunch of
silly symbol table manipulations.
This speeds up the inliner on the testcase in PR209 from 1.73s -> 1.04s (67%)
llvm-svn: 11118
and that basic block ends with a return instruction. In this case, we can just splice
the cloned "body" of the function directly into the source basic block, avoiding a lot
of rearrangement and splitBasicBlock's linear scan over the split block. This speeds up
the inliner on the testcase in PR209 from 2.3s to 1.7s, a 35% reduction.
llvm-svn: 11116
process. The only optimization we did so far is to avoid creating a
PHI node, then immediately destroying it in the common case where the
callee has one return statement. Instead, we just don't create the return
value. This has no noticable performance impact, but paves the way for
future improvements.
llvm-svn: 11108
PHI node entries for unwind instructions just like for call instructions which
became invokes! This fixes PR57, tested by
Inline/2003-10-26-InlineInvokeExceptionDestPhi.ll
llvm-svn: 9526
break dominance relationships, and is otherwise bad. This fixes bug:
Inline/2003-10-13-AllocaDominanceProblem.ll. This also fixes miscompilation
of 3 176.gcc source files (reload1.c, global.c, flow.c)
llvm-svn: 9109
Running the inliner on 252.eon used to take 48.4763s, now it takes 14.4148s.
In release mode, it went from taking 25.8741s to taking 11.5712s.
This also fixes a FIXME.
llvm-svn: 8890
... by making sure to update PHI nodes to take into consideration the
extra edges we get if we inline a call instruction through an invoke.
llvm-svn: 8664