186.crafty, fhourstones and 132.ijpeg.
Bugpoint makes really nasty miscompilations embarassingly easy to find. It
narrowed it down to the instcombiner and this testcase (from fhourstones):
bool %l7153_l4706_htstat_loopentry_2E_4_no_exit_2E_4(int* %i, [32 x int]* %works, int* %tmp.98.out) {
newFuncRoot:
%tmp.96 = load int* %i ; <int> [#uses=1]
%tmp.97 = getelementptr [32 x int]* %works, long 0, int %tmp.96 ; <int*> [#uses=1]
%tmp.98 = load int* %tmp.97 ; <int> [#uses=2]
%tmp.99 = load int* %i ; <int> [#uses=1]
%tmp.100 = and int %tmp.99, 7 ; <int> [#uses=1]
%tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=2]
%tmp.102 = cast bool %tmp.101 to int ; <int> [#uses=0]
br bool %tmp.101, label %codeRepl4.exitStub, label %codeRepl3.exitStub
codeRepl4.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool true
codeRepl3.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool false
}
... which only has one combination performed on it:
$ llvm-as < t.ll | opt -instcombine -debug | llvm-dis
IC: Old = %tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=1]
New = setne int %tmp.100, 0 ; <bool>:<badref> [#uses=0]
IC: MOD = br bool %tmp.101, label %codeRepl3.exitStub, label %codeRepl4.exitStub
IC: MOD = %tmp.97 = getelementptr [32 x int]* %works, uint 0, int %tmp.96 ; <int*> [#uses=1]
It doesn't get much better than this. :)
llvm-svn: 14109
collapse this:
bool %le(int %A, int %B) {
%c1 = setgt int %A, %B
%tmp = select bool %c1, int 1, int 0
%c2 = setlt int %A, %B
%result = select bool %c2, int -1, int %tmp
%c3 = setle int %result, 0
ret bool %c3
}
into:
bool %le(int %A, int %B) {
%c3 = setle int %A, %B ; <bool> [#uses=1]
ret bool %c3
}
which is handy, because the Java FE makes these sequences all over the place.
This is tested as: test/Regression/Transforms/InstCombine/JavaCompare.ll
llvm-svn: 14086
This code hadn't been updated after the "structs with more than 256 elements"
related changes to the GEP instruction. Also it was not handling the
ConstantAggregateZero class.
Now it does!
llvm-svn: 13834
Add support for acos/asin/atan. 188.ammp contains three calls to acos with
constant arguments. Constant folding it allows elimination of those 3 calls
and three FP divisions of the results.
llvm-svn: 13821
into (X & (C2 << C1)) != (C3 << C1), where the shift may be either left or
right and the compare may be any one.
This triggers 1546 times in 176.gcc alone, as it is a common pattern that
occurs for bitfield accesses.
llvm-svn: 13740
CloneTrace, and because it is primarily an operation on ValueMaps. It
is now a global (non-static) function which can be pulled in using
ValueMapper.h.
llvm-svn: 13600
Add better comments, including a better head-of-file comment.
Prune #includes.
Fix a FIXME that Chris put here by using doInitialization().
Use DEBUG() to print out debug msgs.
Give names to basic blocks inserted by this pass.
Expand tabs.
Use InsertProfilingInitCall() from ProfilingUtils to insert the initialize call.
llvm-svn: 13581
in the size calculation.
This is not something you want to see:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - UNROLLING!
The problem was that 2*2147483648 == 0.
Now we get:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - TOO LARGE: 4294967296>100
Thanks to some anonymous person playing with the demo page that repeatedly
caused zion to go into swapping land. That's one way to ensure you'll get
a quick bugfix. :)
Testcase here: Transforms/LoopUnroll/2004-05-13-DontUnrollTooMuch.ll
llvm-svn: 13564
PHI node entries from multiple outside-the-region blocks. This also fixes
extraction of the entry block in a function. Yaay.
This has successfully block extracted all (but one) block from the score_move
function in obsequi (out of 33). Hrm, I wonder which block the bug is in. :)
llvm-svn: 13489
* Add a stub for the severSplitPHINodes which will allow us to bbextract
bb's with PHI nodes in them soon.
* Remove unused arguments from findInputsOutputs
* Dramatically simplify the code in findInputsOutputs. In particular,
nothing really cares whether or not a PHI node is using something.
* Move moveCodeToFunction to after emitCallAndSwitchStatement as that's the
order they get called.
* Fix a bug where we would code extract a region that included a call to
vastart. Like 'alloca', calls to vastart must stay in the function that
they are defined in.
* Add some comments.
llvm-svn: 13482
from the extracted region. If the return has 0 or 1 exit blocks, the new
function returns void. If it has 2 exits, it returns bool, otherwise it
returns a ushort as before.
This allows us to use a conditional branch instruction when there are two
exit blocks, as often happens during block extraction.
llvm-svn: 13481
1. Get rid of the silly abort block. When doing bb extraction, we get one
abort block for every block extracted, which is kinda annoying.
2. If the switch ends up having a single destination, turn it into an
unconditional branch.
I would like to add support for conditional branches, but to do this we will
want to have the function return a bool instead of a ushort.
llvm-svn: 13478
%tmp.0 = getelementptr [50 x sbyte]* %ar, uint 0, int 5 ; <sbyte*> [#uses=2]
%tmp.7 = getelementptr sbyte* %tmp.0, int 8 ; <sbyte*> [#uses=1]
together. This patch actually allows us to simplify and generalize the code.
llvm-svn: 13415