Chris Lattner
c1a682dda0
Fix an extremely serious thinko I made in revision 1.60 of this file.
...
llvm-svn: 13106
2004-04-22 14:59:40 +00:00
Chris Lattner
af532f27e7
Implement a todo, rewriting all possible scev expressions inside of the
...
loop. This eliminates the extra add from the previous case, but it's
not clear that this will be a performance win overall. Tommorows test
results will tell. :)
llvm-svn: 13103
2004-04-21 23:36:08 +00:00
Chris Lattner
fb9a299f68
This code really wants to iterate over the OPERANDS of an instruction, not
...
over its USES. If it's dead it doesn't have any uses! :)
Thanks to the fabulous and mysterious Bill Wendling for pointing this out. :)
llvm-svn: 13102
2004-04-21 22:29:37 +00:00
Chris Lattner
dc7cc35088
Implement a fixme. The helps loops that have induction variables of different
...
types in them. Instead of creating an induction variable for all types, it
creates a single induction variable and casts to the other sizes. This generates
this code:
no_exit: ; preds = %entry, %no_exit
%indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=4]
*** %j.0.0 = cast uint %indvar to short ; <short> [#uses=1]
%indvar = cast uint %indvar to int ; <int> [#uses=1]
%tmp.7 = getelementptr short* %P, uint %indvar ; <short*> [#uses=1]
store short %j.0.0, short* %tmp.7
%inc.0 = add int %indvar, 1 ; <int> [#uses=2]
%tmp.2 = setlt int %inc.0, %N ; <bool> [#uses=1]
%indvar.next = add uint %indvar, 1 ; <uint> [#uses=1]
br bool %tmp.2, label %no_exit, label %loopexit
instead of:
no_exit: ; preds = %entry, %no_exit
%indvar = phi ushort [ %indvar.next, %no_exit ], [ 0, %entry ] ; <ushort> [#uses=2]
*** %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=3]
%indvar = cast uint %indvar to int ; <int> [#uses=1]
%indvar = cast ushort %indvar to short ; <short> [#uses=1]
%tmp.7 = getelementptr short* %P, uint %indvar ; <short*> [#uses=1]
store short %indvar, short* %tmp.7
%inc.0 = add int %indvar, 1 ; <int> [#uses=2]
%tmp.2 = setlt int %inc.0, %N ; <bool> [#uses=1]
%indvar.next = add uint %indvar, 1
*** %indvar.next = add ushort %indvar, 1
br bool %tmp.2, label %no_exit, label %loopexit
This is an improvement in register pressure, but probably doesn't happen that
often.
The more important fix will be to get rid of the redundant add.
llvm-svn: 13101
2004-04-21 22:22:01 +00:00
Chris Lattner
be8bb804c5
Fix an incredibly nasty iterator invalidation problem. I am too spoiled by ilists :)
...
Eventually it would be nice if CallGraph maintained an ilist of CallGraphNode's instead
of a vector of pointers to them, but today is not that day.
llvm-svn: 13100
2004-04-21 20:44:33 +00:00
Alkis Evlogimenos
f68f40ea42
Include cerrno (gcc-3.4 fix)
...
llvm-svn: 13091
2004-04-21 16:11:40 +00:00
Chris Lattner
a9691fe70d
Fix typeo
...
llvm-svn: 13089
2004-04-21 14:23:18 +00:00
Chris Lattner
c87784f1fc
REALLY fix PR324: don't delete linkonce functions until after the SCC traversal
...
is done, which avoids invalidating iterators in the SCC traversal routines
llvm-svn: 13088
2004-04-20 22:06:53 +00:00
Chris Lattner
c1aa21f5a7
Fix PR325
...
llvm-svn: 13081
2004-04-20 20:26:03 +00:00
Chris Lattner
514934051a
Fix PR324 and testcase: Inline/2004-04-20-InlineLinkOnce.llx
...
llvm-svn: 13080
2004-04-20 20:20:59 +00:00
Chris Lattner
f48f777d4c
Initial checkin of a simple loop unswitching pass. It still needs work,
...
but it's a start, and seems to do it's basic job.
llvm-svn: 13068
2004-04-19 18:07:02 +00:00
Chris Lattner
bc02177fdc
Add #include
...
llvm-svn: 13057
2004-04-19 03:01:23 +00:00
Chris Lattner
fc44a25bcb
Move isLoopInvariant to the Loop class
...
llvm-svn: 13051
2004-04-18 22:46:08 +00:00
Chris Lattner
827826320d
Correct rewriting of exit blocks after my last patch
...
llvm-svn: 13048
2004-04-18 22:27:10 +00:00
Chris Lattner
35eaa55cfc
Loop exit sets are no longer explicitly held, they are dynamically computed on demand.
...
llvm-svn: 13046
2004-04-18 22:15:13 +00:00
Chris Lattner
d72c3eb54e
Change the ExitBlocks list from being explicitly contained in the Loop
...
structure to being dynamically computed on demand. This makes updating
loop information MUCH easier.
llvm-svn: 13045
2004-04-18 22:14:10 +00:00
Chris Lattner
d15250240c
Reduce the unrolling limit
...
llvm-svn: 13040
2004-04-18 18:06:14 +00:00
Chris Lattner
30ae18155d
If the preheader of the loop was the entry block of the function, make sure
...
that the exit block of the loop becomes the new entry block of the function.
This was causing a verifier assertion on 252.eon.
llvm-svn: 13039
2004-04-18 17:38:42 +00:00
Chris Lattner
230bcb6b35
Be much more careful about how we update instructions outside of the loop
...
using instructions inside of the loop. This should fix the MishaTest failure
from last night.
llvm-svn: 13038
2004-04-18 17:32:39 +00:00
Chris Lattner
4d52e1e401
After unrolling our single basic block loop, fold it into the preheader and exit
...
block. The primary motivation for doing this is that we can now unroll nested loops.
This makes a pretty big difference in some cases. For example, in 183.equake,
we are now beating the native compiler with the CBE, and we are a lot closer
with LLC.
I'm now going to play around a bit with the unroll factor and see what effect
it really has.
llvm-svn: 13034
2004-04-18 06:27:43 +00:00
Chris Lattner
f2cc841619
Fix a bug: this does not preserve the CFG!
...
While we're at it, add support for updating loop information correctly.
llvm-svn: 13033
2004-04-18 05:38:37 +00:00
Chris Lattner
946b255977
Initial checkin of a simple loop unroller. This pass is extremely basic and
...
limited. Even in it's extremely simple state (it can only *fully* unroll single
basic block loops that execute a constant number of times), it already helps improve
performance a LOT on some benchmarks, particularly with the native code generators.
llvm-svn: 13028
2004-04-18 05:20:17 +00:00
Chris Lattner
c14da9600b
Make the tail duplication threshold accessible from the command line instead of hardcoded
...
llvm-svn: 13025
2004-04-18 00:52:43 +00:00
Chris Lattner
a814080025
If the loop executes a constant number of times, try a bit harder to replace
...
exit values.
llvm-svn: 13018
2004-04-17 18:44:09 +00:00
Chris Lattner
1e9ac1a45e
Fix a HUGE pessimization on X86. The indvars pass was taking this
...
(familiar) function:
int _strlen(const char *str) {
int len = 0;
while (*str++) len++;
return len;
}
And transforming it to use a ulong induction variable, because the type of
the pointer index was left as a constant long. This is obviously very bad.
The fix is to shrink long constants in getelementptr instructions to intptr_t,
making the indvars pass insert a uint induction variable, which is much more
efficient.
Here's the before code for this function:
int %_strlen(sbyte* %str) {
entry:
%tmp.13 = load sbyte* %str ; <sbyte> [#uses=1]
%tmp.24 = seteq sbyte %tmp.13, 0 ; <bool> [#uses=1]
br bool %tmp.24, label %loopexit, label %no_exit
no_exit: ; preds = %entry, %no_exit
*** %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=2]
*** %indvar = phi ulong [ %indvar.next, %no_exit ], [ 0, %entry ] ; <ulong> [#uses=2]
%indvar1 = cast ulong %indvar to uint ; <uint> [#uses=1]
%inc.02.sum = add uint %indvar1, 1 ; <uint> [#uses=1]
%inc.0.0 = getelementptr sbyte* %str, uint %inc.02.sum ; <sbyte*> [#uses=1]
%tmp.1 = load sbyte* %inc.0.0 ; <sbyte> [#uses=1]
%tmp.2 = seteq sbyte %tmp.1, 0 ; <bool> [#uses=1]
%indvar.next = add ulong %indvar, 1 ; <ulong> [#uses=1]
%indvar.next = add uint %indvar, 1 ; <uint> [#uses=1]
br bool %tmp.2, label %loopexit.loopexit, label %no_exit
loopexit.loopexit: ; preds = %no_exit
%indvar = cast uint %indvar to int ; <int> [#uses=1]
%inc.1 = add int %indvar, 1 ; <int> [#uses=1]
ret int %inc.1
loopexit: ; preds = %entry
ret int 0
}
Here's the after code:
int %_strlen(sbyte* %str) {
entry:
%inc.02 = getelementptr sbyte* %str, uint 1 ; <sbyte*> [#uses=1]
%tmp.13 = load sbyte* %str ; <sbyte> [#uses=1]
%tmp.24 = seteq sbyte %tmp.13, 0 ; <bool> [#uses=1]
br bool %tmp.24, label %loopexit, label %no_exit
no_exit: ; preds = %entry, %no_exit
*** %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ] ; <uint> [#uses=3]
%indvar = cast uint %indvar to int ; <int> [#uses=1]
%inc.0.0 = getelementptr sbyte* %inc.02, uint %indvar ; <sbyte*> [#uses=1]
%inc.1 = add int %indvar, 1 ; <int> [#uses=1]
%tmp.1 = load sbyte* %inc.0.0 ; <sbyte> [#uses=1]
%tmp.2 = seteq sbyte %tmp.1, 0 ; <bool> [#uses=1]
%indvar.next = add uint %indvar, 1 ; <uint> [#uses=1]
br bool %tmp.2, label %loopexit, label %no_exit
loopexit: ; preds = %entry, %no_exit
%len.0.1 = phi int [ 0, %entry ], [ %inc.1, %no_exit ] ; <int> [#uses=1]
ret int %len.0.1
}
llvm-svn: 13016
2004-04-17 18:16:10 +00:00
Chris Lattner
885a6eb74d
Even if there are not any induction variables in the loop, if we can compute
...
the trip count for the loop, insert one so that we can canonicalize the exit
condition.
llvm-svn: 13015
2004-04-17 18:08:33 +00:00
Chris Lattner
a43312d30b
Add support for evaluation of exp/log/log10/pow
...
llvm-svn: 13011
2004-04-16 22:35:33 +00:00
Chris Lattner
284d3b0311
Fix some really nasty dominance bugs that were exposed by my patch to
...
make the verifier more strict. This fixes building zlib
llvm-svn: 13002
2004-04-16 18:08:07 +00:00
Brian Gaeke
174633b078
Include <cmath> for compatibility with gcc 3.0.x (the system compiler on
...
Debian.)
llvm-svn: 12986
2004-04-16 15:57:32 +00:00
Chris Lattner
9e9b2b7474
Fix some of the strange CBE-only failures that happened last night.
...
llvm-svn: 12980
2004-04-16 06:03:17 +00:00
Chris Lattner
0328d75c83
Fix Inline/2004-04-15-InlineDeletesCall.ll
...
Basically we were using SimplifyCFG as a huge sledgehammer for a simple
optimization. Because simplifycfg does so many things, we can't use it
for this purpose.
llvm-svn: 12977
2004-04-16 05:17:59 +00:00
Chris Lattner
d7a559e353
Fix a bug in the previous checkin: if the exit block is not the same as
...
the back-edge block, we must check the preincremented value.
llvm-svn: 12968
2004-04-15 20:26:22 +00:00
Chris Lattner
0cec5cb92c
Change the canonical induction variable that we insert.
...
Instead of producing code like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X != N-1) goto Loop
We now generate code that looks like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X2 != N) goto Loop
This has two big advantages:
1. The trip count of the loop is now explicit in the code, allowing
the direct implementation of Loop::getTripCount()
2. This reduces register pressure in the loop, and allows X and X2 to be
put into the same register.
As a consequence of the second point, the code we generate for loops went
from:
.LBB2: # no_exit.1
...
mov %EDI, %ESI
inc %EDI
cmp %ESI, 2
mov %ESI, %EDI
jne .LBB2 # PC rel: no_exit.1
To:
.LBB2: # no_exit.1
...
inc %ESI
cmp %ESI, 3
jne .LBB2 # PC rel: no_exit.1
... which has two fewer moves, and uses one less register.
llvm-svn: 12961
2004-04-15 15:21:43 +00:00
Chris Lattner
6679e46b59
ADd a trivial instcombine: load null -> null
...
llvm-svn: 12940
2004-04-14 03:28:36 +00:00
Chris Lattner
ff9362a8da
Add SCCP support for constant folding calls, implementing:
...
test/Regression/Transforms/SCCP/calltest.ll
llvm-svn: 12921
2004-04-13 19:43:54 +00:00
Chris Lattner
ca52d0468e
Add a simple call constant propagation interface.
...
llvm-svn: 12919
2004-04-13 19:28:52 +00:00
Chris Lattner
d0dc6d5295
Constant propagation should remove the dead instructions
...
llvm-svn: 12917
2004-04-13 19:28:20 +00:00
Chris Lattner
89e959bb1f
Fix LoopSimplify/2004-04-13-LoopSimplifyUpdateDomFrontier.ll
...
LoopSimplify was not updating dominator frontiers correctly in some cases.
llvm-svn: 12890
2004-04-13 16:23:25 +00:00
Chris Lattner
a6e22814ab
Refactor code a bit to make it simpler and eliminate the goto
...
llvm-svn: 12888
2004-04-13 15:21:18 +00:00
Chris Lattner
8417052938
This patch addresses PR35: Loop simplify should reconstruct nested loops.
...
This is fairly straight-forward, but was a real nightmare to get just
perfect. aarg. :)
llvm-svn: 12884
2004-04-13 05:05:33 +00:00
Chris Lattner
be43544429
Actually update the call graph as the inliner changes it. This allows us to
...
execute other CallGraphSCCPasses after the inliner without crashing.
llvm-svn: 12861
2004-04-12 05:37:29 +00:00
Chris Lattner
494a685449
Add support for removing invoke instructions
...
llvm-svn: 12858
2004-04-12 05:15:13 +00:00
Chris Lattner
08f201bee5
Stop printing Function*
...
llvm-svn: 12857
2004-04-12 04:06:56 +00:00
Chris Lattner
d041dcd92f
Simplify code a bit, and be sure to mark the external node as potentially throwing
...
llvm-svn: 12856
2004-04-12 04:06:38 +00:00
Chris Lattner
24cf0200c7
Fix a bug in my select transformation
...
llvm-svn: 12826
2004-04-11 01:39:19 +00:00
Chris Lattner
f16fe7206c
Update the value numbering interface.
...
llvm-svn: 12824
2004-04-10 22:33:34 +00:00
Chris Lattner
623fba1107
Implement InstCombine/select.ll:test13*
...
llvm-svn: 12821
2004-04-10 22:21:27 +00:00
Chris Lattner
cf4a996cba
Implement InstCombine/add.ll:test20
...
Canonicalize add of sign bit constant into a xor
llvm-svn: 12819
2004-04-10 22:01:55 +00:00
Chris Lattner
69c4900512
Rewrite the GCSE pass to be *substantially* simpler, a bit more efficient,
...
and a bit more powerful
llvm-svn: 12817
2004-04-10 21:11:11 +00:00
Chris Lattner
f9d9665138
Fix spurious warning in release mode
...
llvm-svn: 12816
2004-04-10 19:15:56 +00:00
Chris Lattner
d95ef7eff0
Simplify code a bit, and fix a bug that was breaking perlbmk
...
llvm-svn: 12814
2004-04-10 18:06:21 +00:00
Chris Lattner
7ebfe61dc1
Fix a bug in my checkin last night that was breaking programs using invoke.
...
llvm-svn: 12813
2004-04-10 16:53:29 +00:00
Chris Lattner
5093213c40
Fix previous patch
...
llvm-svn: 12811
2004-04-10 07:27:48 +00:00
Chris Lattner
6149ac8991
Correctly update counters
...
llvm-svn: 12810
2004-04-10 07:02:02 +00:00
Chris Lattner
cfa1adcdb8
Simplify code a bit, and use alias analysis to allow us to delete unused
...
call and invoke instructions that are known to not write to memory.
llvm-svn: 12807
2004-04-10 06:53:09 +00:00
Chris Lattner
56e4d3d8ad
Implement select.ll:test12*
...
This transforms code like this:
%C = or %A, %B
%D = select %cond, %C, %A
into:
%C = select %cond, %B, 0
%D = or %A, %C
Since B is often a constant, the select can often be eliminated. In any case,
this reduces the usage count of A, allowing subsequent optimizations to happen.
This xform applies when the operator is any of:
add, sub, mul, or, xor, and, shl, shr
llvm-svn: 12800
2004-04-09 23:46:01 +00:00
Chris Lattner
0aa565647c
Fold code like:
...
if (C)
V1 |= V2;
into:
Vx = V1 | V2;
V1 = select C, V1, Vx
when the expression can be evaluated unconditionally and is *cheap* to
execute. This limited form of if conversion is quite handy in lots of cases.
For example, it turns this testcase into straight-line code:
int in0 ; int in1 ; int in2 ; int in3 ;
int in4 ; int in5 ; int in6 ; int in7 ;
int in8 ; int in9 ; int in10; int in11;
int in12; int in13; int in14; int in15;
long output;
void mux(void) {
output =
(in0 ? 0x00000001 : 0) | (in1 ? 0x00000002 : 0) |
(in2 ? 0x00000004 : 0) | (in3 ? 0x00000008 : 0) |
(in4 ? 0x00000010 : 0) | (in5 ? 0x00000020 : 0) |
(in6 ? 0x00000040 : 0) | (in7 ? 0x00000080 : 0) |
(in8 ? 0x00000100 : 0) | (in9 ? 0x00000200 : 0) |
(in10 ? 0x00000400 : 0) | (in11 ? 0x00000800 : 0) |
(in12 ? 0x00001000 : 0) | (in13 ? 0x00002000 : 0) |
(in14 ? 0x00004000 : 0) | (in15 ? 0x00008000 : 0) ;
}
llvm-svn: 12798
2004-04-09 22:50:22 +00:00
Chris Lattner
183b336a54
Fold binary operators with a constant operand into select instructions
...
that have a constant operand. This implements
add.ll:test19, shift.ll:test15*, and others that are not tested
llvm-svn: 12794
2004-04-09 19:05:30 +00:00
Chris Lattner
cf7baf3519
Implement select.ll:test11
...
llvm-svn: 12793
2004-04-09 18:19:44 +00:00
Chris Lattner
e228ee5870
Implement InstCombine/cast-propagate.ll
...
llvm-svn: 12784
2004-04-08 20:39:49 +00:00
Chris Lattner
3b3861d305
Implement ScalarRepl/select_promote.ll
...
llvm-svn: 12779
2004-04-08 19:59:34 +00:00
Chris Lattner
4d25c86b52
Remove the "really gross hacks" that are there to deal with recursive functions.
...
Now we collect all of the call sites we are interested in inlining, then inline
them. This entirely avoids issues with trying to inline a call site we got by
inlining another call site. This also eliminates iterator invalidation issues.
llvm-svn: 12770
2004-04-08 06:34:31 +00:00
Chris Lattner
1c631e813d
Implement InstCombine/select.ll:test[7-10]
...
llvm-svn: 12769
2004-04-08 04:43:23 +00:00
Chris Lattner
2b2412d0c8
Implement test/Regression/Transforms/InstCombine/getelementptr_index.ll
...
llvm-svn: 12762
2004-04-07 18:38:20 +00:00
Chris Lattner
4d1fcf1dcd
Fix a bug in yesterdays checkins which broke siod. siod is a great testcase! :)
...
llvm-svn: 12659
2004-04-05 16:02:41 +00:00
Chris Lattner
8953b90aaa
Fix InstCombine/2004-04-04-InstCombineReplaceAllUsesWith.ll
...
llvm-svn: 12658
2004-04-05 02:10:19 +00:00
Chris Lattner
69193f93b6
Support getelementptr instructions which use uint's to index into structure
...
types and can have arbitrary 32- and 64-bit integer types indexing into
sequential types.
llvm-svn: 12653
2004-04-05 01:30:19 +00:00
Chris Lattner
e61b67d7d5
Rewrite the indvars pass to use the ScalarEvolution analysis.
...
This also implements some new features for the indvars pass, including
linear function test replacement, exit value substitution, and it works with
a much more general class of induction variables and loops.
llvm-svn: 12620
2004-04-02 20:24:31 +00:00
Chris Lattner
eed034bcd3
Fix the obvious bug in my previous checkin
...
llvm-svn: 12618
2004-04-02 18:15:10 +00:00
Chris Lattner
9f0db32625
Implement Transforms/SimplifyCFG/return-merge.ll
...
This actually causes us to turn code like:
return C ? A : B;
into a select instruction.
llvm-svn: 12617
2004-04-02 18:13:43 +00:00
Chris Lattner
c24019c825
Fix PR310 and TailDup/2004-04-01-DemoteRegToStack.llx
...
llvm-svn: 12597
2004-04-01 20:28:45 +00:00
Chris Lattner
59fdf74968
Remove some assertions that are now bogus with the last patch I put in
...
llvm-svn: 12595
2004-04-01 19:21:46 +00:00
Chris Lattner
146d0df5e4
Fix PR306: Loop simplify incorrectly updates dominator information
...
Testcase: LoopSimplify/2004-04-01-IncorrectDomUpdate.ll
llvm-svn: 12592
2004-04-01 19:06:07 +00:00
Chris Lattner
61fab1409d
Add warning
...
llvm-svn: 12573
2004-03-31 22:00:30 +00:00
Chris Lattner
709f03e2dd
Fix linking of constant expr casts due to type resolution changes. With
...
this and the other patches 253.perlbmk links again.
llvm-svn: 12565
2004-03-31 02:58:28 +00:00
Brian Gaeke
ef327be6ed
Start cleaning up this pass so that I can debug it.
...
llvm-svn: 12548
2004-03-30 19:53:46 +00:00
Chris Lattner
81bdcb90ce
Now that all the code generators support the select instruction, and the instcombine
...
pass can eliminate many nasty cases of them, start generating them in the optimizers
llvm-svn: 12545
2004-03-30 19:44:05 +00:00
Chris Lattner
533bc49775
Implement select.ll:test[3-6]
...
llvm-svn: 12544
2004-03-30 19:37:13 +00:00
Chris Lattner
059f390257
Add a simple select instruction lowering pass
...
llvm-svn: 12540
2004-03-30 18:41:10 +00:00
Chris Lattner
56b5051428
X % -1 == X % 1 == 0
...
llvm-svn: 12520
2004-03-26 16:11:24 +00:00
Chris Lattner
57c67b06e9
Two changes:
...
#1 is to unconditionally strip constantpointerrefs out of
instruction operands where they are absolutely pointless and inhibit
optimization. GRRR!
#2 is to implement InstCombine/getelementptr_const.ll
llvm-svn: 12519
2004-03-25 22:59:29 +00:00
Chris Lattner
abb77c9959
Teach the optimizer to delete zero sized alloca's (but not mallocs!)
...
llvm-svn: 12507
2004-03-19 06:08:10 +00:00
Chris Lattner
232155dc1b
Fix bug: CodeExtractor/2004-03-17-MissedLiveIns.ll
...
With this fix we now successfully extract all 149 loops from 256.bzip2 without
crashing or miscompiling the program!
llvm-svn: 12493
2004-03-18 05:56:32 +00:00
Chris Lattner
e83693560a
Add statistics to the loop extractor. The loop extractor has successfully
...
extracted all 63 loops for Olden/bh without crashing and without
miscompiling the program!!!
llvm-svn: 12491
2004-03-18 05:46:10 +00:00
Chris Lattner
5bce0c807d
Fix problem with PHI nodes having multiple predecessors from different
...
exit nodes
llvm-svn: 12490
2004-03-18 05:43:18 +00:00
Chris Lattner
acd75986ee
Fix CodeExtractor/2004-03-17-UpdatePHIsOutsideRegion.ll
...
llvm-svn: 12489
2004-03-18 05:38:31 +00:00
Chris Lattner
320d59f4cd
Seriously simplify and correct the PHI node handling code.
...
llvm-svn: 12487
2004-03-18 05:28:49 +00:00
Chris Lattner
d8017a340d
Fix CodeExtractor/2004-03-17-OutputMismatch.ll
...
llvm-svn: 12486
2004-03-18 04:12:05 +00:00
Chris Lattner
37de257ef0
Fix several bugs in the extractor:
...
1. Names were not put on the new arguments created (ok, this just helps sanity :)
2. Fix outgoing pointer values
3. Do not insert stores for values that had not been computed
4. Fix some wierd problems with the outset calculation
This fixes CodeExtractor/2004-03-14-DominanceProblem.ll, making the extractor
work on at least one simple case!
llvm-svn: 12484
2004-03-18 03:49:40 +00:00
Chris Lattner
e9235d2dde
The code extractor needs dominator info. Provide it
...
llvm-svn: 12483
2004-03-18 03:48:06 +00:00
Chris Lattner
cee3404d0a
Prune #includes, moving the module interface to the front. Note that this
...
exposed the fact that the header was not self-contained. There is a reason
we do things :)
llvm-svn: 12481
2004-03-18 03:15:29 +00:00
Chris Lattner
a078f47b39
Fix compilation of mesa, which I broke earlier today
...
llvm-svn: 12465
2004-03-17 02:02:47 +00:00
Chris Lattner
684fa5ac64
Be more accurate
...
llvm-svn: 12464
2004-03-17 01:59:27 +00:00
Chris Lattner
a3783a577e
Fix bug in previous checkin
...
llvm-svn: 12458
2004-03-16 23:36:49 +00:00
Chris Lattner
95057f6ad1
Okay, so there is no reasonable way for tail duplication to update SSA form,
...
as it is making effectively arbitrary modifications to the CFG and we don't
have a domset/domfrontier implementations that can handle the dynamic updates.
Instead of having a bunch of code that doesn't actually work in practice,
just demote any potentially tricky values to the stack (causing the problem
to go away entirely). Later invocations of mem2reg will rebuild SSA for us.
This fixes all of the major performance regressions with tail duplication
from LLVM 1.1. For example, this loop:
---
int popcount(int x) {
int result = 0;
while (x != 0) {
result = result + (x & 0x1);
x = x >> 1;
}
return result;
}
---
Used to be compiled into:
int %popcount(int %X) {
entry:
br label %loopentry
loopentry: ; preds = %entry, %no_exit
%x.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ] ; <int> [#uses=3]
%result.1.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ] ; <int> [#uses=2]
%tmp.1 = seteq int %x.0, 0 ; <bool> [#uses=1]
br bool %tmp.1, label %loopexit, label %no_exit
no_exit: ; preds = %loopentry
%tmp.4 = and int %x.0, 1 ; <int> [#uses=1]
%tmp.6 = add int %tmp.4, %result.1.0 ; <int> [#uses=1]
%tmp.9 = shr int %x.0, ubyte 1 ; <int> [#uses=1]
br label %loopentry
loopexit: ; preds = %loopentry
ret int %result.1.0
}
And is now compiled into:
int %popcount(int %X) {
entry:
br label %no_exit
no_exit: ; preds = %entry, %no_exit
%x.0.0 = phi int [ %X, %entry ], [ %tmp.9, %no_exit ] ; <int> [#uses=2]
%result.1.0.0 = phi int [ 0, %entry ], [ %tmp.6, %no_exit ] ; <int> [#uses=1]
%tmp.4 = and int %x.0.0, 1 ; <int> [#uses=1]
%tmp.6 = add int %tmp.4, %result.1.0.0 ; <int> [#uses=2]
%tmp.9 = shr int %x.0.0, ubyte 1 ; <int> [#uses=2]
%tmp.1 = seteq int %tmp.9, 0 ; <bool> [#uses=1]
br bool %tmp.1, label %loopexit, label %no_exit
loopexit: ; preds = %no_exit
ret int %tmp.6
}
llvm-svn: 12457
2004-03-16 23:29:09 +00:00
Chris Lattner
bb1a2cc7ab
This code was both incredibly complex and incredibly broken. Fix it.
...
llvm-svn: 12456
2004-03-16 23:23:11 +00:00
Chris Lattner
fa48edfb7d
Punt if we see gigantic PHI nodes. This improves a huge interpreter loop
...
testcase from 32.5s in -raise to take .3s
llvm-svn: 12443
2004-03-16 19:52:53 +00:00
Chris Lattner
7a7b114871
Do not try to optimize PHI nodes with incredibly high degree. This reduces SCCP
...
time from 615s to 1.49s on a large testcase that has a gigantic switch statement
that all of the blocks in the function go to (an intepreter).
llvm-svn: 12442
2004-03-16 19:49:59 +00:00
Chris Lattner
a64923ad26
Do not copy gigantic switch instructions
...
llvm-svn: 12441
2004-03-16 19:45:22 +00:00
Chris Lattner
db5b8f4d6b
Fix a regression from this patch:
...
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040308/013095.html
Basically, this patch only updated the immediate dominatees of the header node
to tell them that the preheader also dominated them. In practice, ALL
dominatees of the header node are also dominated by the preheader.
This fixes: LoopSimplify/2004-03-15-IncorrectDomUpdate.
and PR293
llvm-svn: 12434
2004-03-16 06:00:15 +00:00
Chris Lattner
95ce36da0d
Restore old inlining heuristic. As the comment indicates, this is a nasty
...
horrible hack.
llvm-svn: 12423
2004-03-15 06:38:14 +00:00
Chris Lattner
cd83282df1
Add counters for the number of calls elimianted
...
llvm-svn: 12420
2004-03-15 05:46:59 +00:00
Chris Lattner
20cda2645e
Implement LICM of calls in simple cases. This is sufficient to move around
...
sin/cos/strlen calls and stuff. This implements:
LICM/call_sink_pure_function.ll
LICM/call_sink_const_function.ll
llvm-svn: 12415
2004-03-15 04:11:30 +00:00
Chris Lattner
fb87cdecd8
Mostly cosmetic improvements. Do fix the bug where a global value was considered an input.
...
llvm-svn: 12406
2004-03-15 01:26:44 +00:00
Chris Lattner
73ab1fa7c8
Assert that input blocks meet the invariants we expect
...
Simplify the input/output finder. All elements of a basic block are
instructions. Any used arguments are also inputs. An instruction can only
be used by another instruction.
llvm-svn: 12405
2004-03-15 01:18:23 +00:00
Chris Lattner
2f155d8734
Fix several bugs in the loop extractor. In particular, subloops were never
...
extracted, and a function that contained a single top-level loop never had
the loop extracted, regardless of how much non-loop code there was.
llvm-svn: 12403
2004-03-15 00:02:02 +00:00
Chris Lattner
5b2072ecd3
No correctness fixes here, just minor qoi fixes:
...
* Don't insert a branch to the switch instruction after the call, just
make it a single block.
* Insert the new alloca instructions in the entry block of the original
function instead of having them execute dynamically
* Don't make the default edge of the switch instruction go back to the switch.
The loop extractor shouldn't create new loops!
* Give meaningful names to the alloca slots and the reload instructions
* Some minor code simplifications
llvm-svn: 12402
2004-03-14 23:43:24 +00:00
Chris Lattner
b4d8bf365c
Simplify code a bit, and fix bug CodeExtractor/2004-03-14-NoSwitchSupport.ll
...
This also implements a two minor improvements:
* Don't insert live-out stores IN the region, insert them on the code path
that exits the region
* If the region is exited to the same block from multiple paths, share the
switch statement entry, live-out store code, and the basic block.
llvm-svn: 12401
2004-03-14 23:05:49 +00:00
Chris Lattner
9c431f6c44
Simplify the code a bit by making the collection of basic blocks to extract
...
a member of the class. While we're at it, turn the collection into a set
instead of a vector to improve efficiency and make queries simpler.
llvm-svn: 12400
2004-03-14 22:34:55 +00:00
Chris Lattner
a1672c1bd8
Split into two passes. Now there is the general loop extractor, usable on
...
the command line, and the single loop extractor, usable by bugpoint
llvm-svn: 12390
2004-03-14 20:01:36 +00:00
Chris Lattner
0137de5ecb
Passes don't print stuff!
...
llvm-svn: 12385
2004-03-14 04:17:53 +00:00
Chris Lattner
b68659552a
Do not create empty basic blocks when the lowerswitch pass expects blocks to
...
be non-empty! This fixes LowerSwitch/2004-03-13-SwitchIsDefaultCrash.ll
llvm-svn: 12384
2004-03-14 04:14:31 +00:00
Chris Lattner
4fca71eb44
Minor random cleanups
...
llvm-svn: 12382
2004-03-14 04:01:47 +00:00
Chris Lattner
6c3e8c78cf
FunctionPass's should not define their own 'run' method.
...
Require 'simplified' loops, not just raw natural loops. This fixes
CodeExtractor/2004-03-13-LoopExtractorCrash.ll
llvm-svn: 12381
2004-03-14 04:01:06 +00:00
Chris Lattner
d078812f96
If a block is dead, dominators will not be calculated for it. Because of this
...
loop information won't see it, and we could have unreachable blocks pointing to
the non-header node of blocks in a natural loop. This isn't tidy, so have the
loopsimplify pass clean it up.
llvm-svn: 12380
2004-03-14 03:59:22 +00:00
Chris Lattner
3684469326
Verify functions as they are produced if -debug is specified. Reduce
...
curly braceage
llvm-svn: 12378
2004-03-14 03:17:22 +00:00
Chris Lattner
78a996aec4
Move prototype to IPO.h instead of Scalar.h
...
Make sure that the file interface header (IPO.h) is included first
remove dead #incldue
llvm-svn: 12375
2004-03-14 02:37:16 +00:00
Chris Lattner
692a47aeb9
Indent anon namespace properly, add copyright block
...
llvm-svn: 12373
2004-03-14 02:34:07 +00:00
Chris Lattner
41ec709e00
Move to the IPO library. Utils shouldn't contain passes.
...
llvm-svn: 12372
2004-03-14 02:32:27 +00:00
Chris Lattner
8eebc49884
DemoteRegToStack got moved from DemoteRegToStack.h to Local.h
...
llvm-svn: 12368
2004-03-14 02:13:38 +00:00
Chris Lattner
7d2a539735
Add some debugging output
...
Fix InstCombine/2004-03-13-InstCombineInfLoop.ll which caused an infinite
loop compiling (I think) povray.
llvm-svn: 12365
2004-03-13 23:54:27 +00:00
Chris Lattner
2dc85b27e4
This change makes two big adjustments.
...
* Be a lot more accurate about what the effects will be when inlining a call
to a function when an argument is an alloca.
* Dramatically reduce the penalty for inlining a call in a large function.
This heuristic made it almost impossible to inline a function into a large
function, no matter how small the callee is.
llvm-svn: 12363
2004-03-13 23:15:45 +00:00
Chris Lattner
797cb2f6c1
This little patch speeds up the loop used to update the dominator set analysis.
...
On the testcase from GCC PR12440, which has a LOT of loops (1392 of which require
preheaders to be inserted), this speeds up the loopsimplify pass from 1.931s to
0.1875s. The loop in question goes from 1.65s -> 0.0097s, which isn't bad. All of
these times are a debug build.
This adds a dependency on DominatorTree analysis that was not there before, but
we always had dominatortree available anyway, because LICM requires both loop
simplify and DT, so this doesn't add any extra analysis in practice.
llvm-svn: 12362
2004-03-13 22:01:26 +00:00
Chris Lattner
022167f13b
Implement sub.ll:test14
...
llvm-svn: 12355
2004-03-13 00:11:49 +00:00
Chris Lattner
92295c5031
Implement InstCombine/sub.ll:test12 & test13
...
llvm-svn: 12353
2004-03-12 23:53:13 +00:00
Chris Lattner
cb015ee6c0
Add constant folding wrapper support for select instructions.
...
llvm-svn: 12319
2004-03-12 05:53:03 +00:00
Chris Lattner
59db22dcd4
Add sccp support for select instructions
...
llvm-svn: 12318
2004-03-12 05:52:44 +00:00
Chris Lattner
b909e8b0d4
Add trivial optimizations for select instructions
...
llvm-svn: 12317
2004-03-12 05:52:32 +00:00
Chris Lattner
721264aecc
Initial support for edge profiling
...
llvm-svn: 12225
2004-03-08 17:54:34 +00:00
Chris Lattner
dae48f93b0
Split utility functions out of BlockProfiling.cpp
...
llvm-svn: 12224
2004-03-08 17:06:13 +00:00
Chris Lattner
d91e676700
finegrainify namespacification
...
llvm-svn: 12221
2004-03-08 16:45:53 +00:00
Chris Lattner
fe6f2e3e80
Implement ArgumentPromotion/aggregate-promote.ll
...
This allows pointers to aggregate objects, whose elements are only read, to
be promoted and passed in by element instead of by reference. This can
enable a LOT of subsequent optimizations in the caller function.
It's worth pointing out that this stuff happens a LOT of C++ programs, because
objects in templates are generally passed around by reference. When these
templates are instantiated on small aggregate or scalar types, however, it is
more efficient to pass them in by value than by reference.
This transformation triggers most on C++ codes (e.g. 334 times on eon), but
does happen on C codes as well. For example, on mesa it triggers 72 times,
and on gcc it triggers 35 times. this is amazingly good considering that
we are using 'basicaa' so far.
llvm-svn: 12202
2004-03-08 01:04:36 +00:00
Chris Lattner
cc544e57f3
Implement: ArgumentPromotion/chained.ll
...
llvm-svn: 12200
2004-03-07 22:52:53 +00:00
Chris Lattner
64b8d697ad
Fix another minor bug, exposed by perlbmk
...
llvm-svn: 12198
2004-03-07 22:43:27 +00:00
Chris Lattner
538fee7aa2
Since 'load null' is undefined, we can make it do whatever we want. Returning
...
a zero value is the most likely way to cause further simplification, so we do it.
llvm-svn: 12197
2004-03-07 22:16:24 +00:00
Chris Lattner
6770842b67
Fix a minor bug and turn debug output into, well, debug output.
...
llvm-svn: 12195
2004-03-07 21:54:50 +00:00
Chris Lattner
483ae01c9c
New LLVM pass: argument promotion. This version only handles simple scalar
...
variables.
llvm-svn: 12193
2004-03-07 21:29:54 +00:00
Chris Lattner
7abcc387de
Don't emit things like malloc(16*1). Allocation instructions are fixed arity now.
...
llvm-svn: 12086
2004-03-03 01:40:53 +00:00
Misha Brukman
f44acae31e
Implement ExtractCodeRegion()
...
llvm-svn: 12070
2004-03-02 00:20:57 +00:00
Misha Brukman
f272f9b3d5
Make a note that this is usually used via bugpoint.
...
llvm-svn: 12068
2004-03-02 00:19:09 +00:00
Misha Brukman
5af2be7d09
* Add implementation of ExtractBasicBlock()
...
* Add comments to ExtractLoop()
llvm-svn: 12053
2004-03-01 18:28:34 +00:00
Chris Lattner
5cf39339d1
Disable tail duplication in a case that breaks on Olden/tsp
...
llvm-svn: 12021
2004-03-01 01:12:13 +00:00
Misha Brukman
c91e1ff50d
* Remove function to find "main" in a Module, there's a method for that
...
* Removing extraneous empty space and empty comment lines
llvm-svn: 12014
2004-02-29 23:09:10 +00:00
Chris Lattner
2de229f31b
Fix bug: test/Regression/Transforms/LowerInvoke/2004-02-29-PHICrash.llx
...
... which tickled the lowerinvoke pass because it used the BCE routines.
llvm-svn: 12012
2004-02-29 22:24:41 +00:00
Chris Lattner
bf2963ef91
Fix PR255: [tailduplication] Single basic block loops are very rare
...
Note that this is a band-aid put over a band-aid. This just undisables
tail duplication in on very specific case that it seems to work in.
llvm-svn: 11989
2004-02-29 06:41:20 +00:00
Chris Lattner
d3e6ae263c
Implement switch->br and br->switch folding by ripping out the switch->switch
...
and br->br code and generalizing it. This allows us to compile code like this:
int test(Instruction *I) {
if (isa<CastInst>(I))
return foo(7);
else if (isa<BranchInst>(I))
return foo(123);
else if (isa<UnwindInst>(I))
return foo(1241);
else if (isa<SetCondInst>(I))
return foo(1);
else if (isa<VAArgInst>(I))
return foo(42);
return foo(-1);
}
into:
int %_Z4testPN4llvm11InstructionE("struct.llvm::Instruction"* %I) {
entry:
%tmp.1.i.i.i.i.i.i.i = getelementptr "struct.llvm::Instruction"* %I, long 0, ubyte 4 ; <uint*> [#uses=1]
%tmp.2.i.i.i.i.i.i.i = load uint* %tmp.1.i.i.i.i.i.i.i ; <uint> [#uses=2]
%tmp.2.i.i.i.i.i.i = seteq uint %tmp.2.i.i.i.i.i.i.i, 27 ; <bool> [#uses=0]
switch uint %tmp.2.i.i.i.i.i.i.i, label %endif.0 [
uint 27, label %then.0
uint 2, label %then.1
uint 5, label %then.2
uint 14, label %then.3
uint 15, label %then.3
uint 16, label %then.3
uint 17, label %then.3
uint 18, label %then.3
uint 19, label %then.3
uint 32, label %then.4
]
...
As well as handling the cases in 176.gcc and many other programs more effectively.
llvm-svn: 11964
2004-02-28 21:28:10 +00:00
Chris Lattner
772eafa332
if there is already a prototype for malloc/free, use it, even if it's incorrect.
...
Do not just inject a new prototype.
llvm-svn: 11951
2004-02-28 18:51:45 +00:00
Chris Lattner
51ea127bf3
Rename AddUsesToWorkList -> AddUsersToWorkList, as that is what it does.
...
Create a new AddUsesToWorkList method
optimize memmove/set/cpy of zero bytes to a noop.
llvm-svn: 11941
2004-02-28 05:22:00 +00:00
Chris Lattner
f3a366062c
Turn 'free null' into nothing
...
llvm-svn: 11940
2004-02-28 04:57:37 +00:00
Misha Brukman
8a2c28fdda
Right, it's really Extractor, not Extraction.
...
llvm-svn: 11939
2004-02-28 03:37:58 +00:00