Chris Lattner
a31f0a622b
Generalize (zext (truncate x)) and (sext (truncate x)) folding to work when
...
the src/dst are not the same size. This catches things like "truncate
32-bit X to 8 bits, then zext to 16", which happens a bit on X86.
llvm-svn: 30557
2006-09-21 06:00:20 +00:00
Chris Lattner
c8cd62d381
Compile:
...
int test3(int a, int b) { return (a < 0) ? a : 0; }
to:
_test3:
srawi r2, r3, 31
and r3, r2, r3
blr
instead of:
_test3:
cmpwi cr0, r3, 1
li r2, 0
blt cr0, LBB2_2 ;entry
LBB2_1: ;entry
mr r3, r2
LBB2_2: ;entry
blr
This implements: PowerPC/select_lt0.ll:seli32_a_a
llvm-svn: 30517
2006-09-20 06:41:35 +00:00
Chris Lattner
8746e2cd57
Fold the full generality of (any_extend (truncate x))
...
llvm-svn: 30514
2006-09-20 06:29:17 +00:00
Chris Lattner
8b68decb27
Two things:
...
1. teach SimplifySetCC that '(srl (ctlz x), 5) == 0' is really x != 0.
2. Teach visitSELECT_CC to use SimplifySetCC instead of calling it and
ignoring the result. This allows us to compile:
bool %test(ulong %x) {
%tmp = setlt ulong %x, 4294967296
ret bool %tmp
}
to:
_test:
cntlzw r2, r3
cmplwi cr0, r3, 1
srwi r2, r2, 5
li r3, 0
beq cr0, LBB1_2 ;
LBB1_1: ;
mr r3, r2
LBB1_2: ;
blr
instead of:
_test:
addi r2, r3, -1
cntlzw r2, r2
cntlzw r3, r3
srwi r2, r2, 5
cmplwi cr0, r2, 0
srwi r2, r3, 5
li r3, 0
bne cr0, LBB1_2 ;
LBB1_1: ;
mr r3, r2
LBB1_2: ;
blr
This isn't wonderful, but it's an improvement.
llvm-svn: 30513
2006-09-20 06:19:26 +00:00
Chris Lattner
46d710e6ea
Fold (X & C1) | (Y & C2) -> (X|Y) & C3 when possible.
...
This implements CodeGen/X86/and-or-fold.ll
llvm-svn: 30379
2006-09-14 21:11:37 +00:00
Chris Lattner
97614c86ce
Split rotate matching code out to its own function. Make it stronger, by
...
matching things like ((x >> c1) & c2) | ((x << c3) & c4) to (rot x, c5) & c6
llvm-svn: 30376
2006-09-14 20:50:57 +00:00
Evan Cheng
31305c45da
DAG combiner fix for rotates. Previously the outer-most condition checks
...
for ROTL availability. This prevents it from forming ROTR for targets that
has ROTR only.
llvm-svn: 29997
2006-08-31 07:41:12 +00:00
Evan Cheng
e5570a4c3f
Move isCommutativeBinOp from SelectionDAG.cpp and DAGCombiner.cpp out. Make it a static method of SelectionDAG.
...
llvm-svn: 29951
2006-08-29 06:42:35 +00:00
Chris Lattner
3d27be1333
s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
...
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Chris Lattner
6f22ebd8be
change internal impl of dag combiner so that calls to CombineTo never have to
...
make a temporary vector.
llvm-svn: 29618
2006-08-11 17:56:38 +00:00
Chris Lattner
a2f4086828
Change one ReplaceAllUsesWith method to take an array of operands to replace
...
instead of a vector of operands.
llvm-svn: 29616
2006-08-11 17:46:28 +00:00
Chris Lattner
c24a1d3093
Start eliminating temporary vectors used to create DAG nodes. Instead, pass
...
in the start of an array and a count of operands where applicable. In many
cases, the number of operands is known, so this static array can be allocated
on the stack, avoiding the heap. In many other cases, a SmallVector can be
used, which has the same benefit in the common cases.
I updated a lot of code calling getNode that takes a vector, but ran out of
time. The rest of the code should be updated, and these methods should be
removed.
We should also do the same thing to eliminate the methods that take a
vector of MVT::ValueTypes.
It would be extra nice to convert the dagiselemitter to avoid creating vectors
for operands when calling getTargetNode.
llvm-svn: 29566
2006-08-08 02:23:42 +00:00
Reid Spencer
658b9476f0
Initialize some variables the compiler warns about.
...
llvm-svn: 29277
2006-07-25 20:44:41 +00:00
Evan Cheng
7c970b98d0
If a shuffle is a splat, check if the argument is a build_vector with all elements being the same. If so, return the argument.
...
llvm-svn: 29242
2006-07-21 08:25:53 +00:00
Evan Cheng
8472e0c4af
If a shuffle is unary, i.e. one of the vector argument is not needed, turn the
...
operand into a undef and adjust mask accordingly.
llvm-svn: 29232
2006-07-20 22:44:41 +00:00
Andrew Lenharth
ec104a2b41
80 cols
...
llvm-svn: 29221
2006-07-20 17:43:27 +00:00
Andrew Lenharth
c496b418b5
Reduce number of exported symbols
...
llvm-svn: 29220
2006-07-20 17:28:38 +00:00
Chris Lattner
54a34cd20b
Mark these two classes as hidden, shrinking libllbmgcc.dylib by 25K
...
llvm-svn: 28970
2006-06-28 21:58:30 +00:00
Andrew Lenharth
0e57b2cb92
Start on my todo list
...
llvm-svn: 28752
2006-06-12 16:07:18 +00:00
Evan Cheng
64d2846017
visitVBinOp: Can't fold divide by zero!
...
llvm-svn: 28584
2006-05-31 06:08:35 +00:00
Chris Lattner
8f872d2091
Fix a nasty dag combiner bug that caused nondeterminstic crashes (MY FAVORITE!):
...
SimplifySelectOps would eliminate a Select, delete it, then return true.
The clients would see that it did something and return null.
The top level would see a null return, and decide that nothing happened,
proceeding to process the node in other ways: boom.
The fix is simple: clients of SimplifySelectOps should return the select
node itself.
In order to catch really obnoxious boogs like this in the future, add an
assert that nodes are not deleted. We do this by checking for a sentry node
type that the SDNode dtor sets when a node is destroyed.
llvm-svn: 28514
2006-05-27 00:43:02 +00:00
Andrew Lenharth
1dc9ec5874
Move this code to a common place
...
llvm-svn: 28329
2006-05-16 17:42:15 +00:00
Chris Lattner
afe72481f6
Comment out dead variables
...
llvm-svn: 28252
2006-05-12 17:57:54 +00:00
Chris Lattner
66adee93aa
Two simplifications for token factor nodes: simplify tf(x,x) -> x.
...
simplify tf(x,y,y,z) -> tf(x,y,z).
llvm-svn: 28233
2006-05-12 05:01:37 +00:00
Evan Cheng
2c74848af1
Debugging info
...
llvm-svn: 28200
2006-05-09 06:55:15 +00:00
Chris Lattner
446e1ef26a
Make the case I just checked in stronger. Now we compile this:
...
short test2(short X, short x) {
int Y = (short)(X+x);
return Y >> 1;
}
to:
_test2:
add r2, r3, r4
extsh r2, r2
srawi r3, r2, 1
blr
instead of:
_test2:
add r2, r3, r4
extsh r2, r2
srwi r2, r2, 1
extsh r3, r2
blr
llvm-svn: 28175
2006-05-08 21:18:59 +00:00
Chris Lattner
29062da0ac
Implement and_sext.ll:test3, generating:
...
_test4:
srawi r3, r3, 16
blr
instead of:
_test4:
srwi r2, r3, 16
extsh r3, r2
blr
for:
short test4(unsigned X) {
return (X >> 16);
}
llvm-svn: 28174
2006-05-08 20:59:41 +00:00
Chris Lattner
2935d8190c
Compile this:
...
short test4(unsigned X) {
return (X >> 16);
}
to:
_test4:
movl 4(%esp), %eax
sarl $16, %eax
ret
instead of:
_test4:
movl $-65536, %eax
andl 4(%esp), %eax
sarl $16, %eax
ret
llvm-svn: 28171
2006-05-08 20:51:54 +00:00
Nate Begeman
e5ce5bb6da
Fix PR772
...
llvm-svn: 28161
2006-05-08 01:35:01 +00:00
Chris Lattner
7e7bcf3a54
Simplify some code, add a couple minor missed folds
...
llvm-svn: 28152
2006-05-06 23:06:26 +00:00
Chris Lattner
2a4d7b845b
remove cases handled elsewhere
...
llvm-svn: 28150
2006-05-06 22:43:44 +00:00
Chris Lattner
1ecb2a2dac
Use the new TargetLowering::ComputeNumSignBits method to eliminate
...
sign_extend_inreg operations. Though ComputeNumSignBits is still rudimentary,
this is enough to compile this:
short test(short X, short x) {
int Y = X+x;
return (Y >> 1);
}
short test2(short X, short x) {
int Y = (short)(X+x);
return Y >> 1;
}
into:
_test:
add r2, r3, r4
srawi r3, r2, 1
blr
_test2:
add r2, r3, r4
extsh r2, r2
srawi r3, r2, 1
blr
instead of:
_test:
add r2, r3, r4
srawi r2, r2, 1
extsh r3, r2
blr
_test2:
add r2, r3, r4
extsh r2, r2
srawi r2, r2, 1
extsh r3, r2
blr
llvm-svn: 28146
2006-05-06 09:30:03 +00:00
Chris Lattner
907e392dba
Fold trunc(any_ext). This gives stuff like:
...
27,28c27
< movzwl %di, %edi
< movl %edi, %ebx
---
> movw %di, %bx
llvm-svn: 28137
2006-05-05 22:56:26 +00:00
Chris Lattner
57f8c5a387
Shrink shifts when possible.
...
llvm-svn: 28136
2006-05-05 22:53:17 +00:00
Chris Lattner
3d26577396
Fold (fpext (load x)) -> (extload x)
...
llvm-svn: 28130
2006-05-05 21:34:35 +00:00
Chris Lattner
25a5283a86
Fold some common code.
...
llvm-svn: 28124
2006-05-05 06:32:04 +00:00
Chris Lattner
002ee91457
Implement:
...
// fold (and (sext x), (sext y)) -> (sext (and x, y))
// fold (or (sext x), (sext y)) -> (sext (or x, y))
// fold (xor (sext x), (sext y)) -> (sext (xor x, y))
// fold (and (aext x), (aext y)) -> (aext (and x, y))
// fold (or (aext x), (aext y)) -> (aext (or x, y))
// fold (xor (aext x), (aext y)) -> (aext (xor x, y))
llvm-svn: 28123
2006-05-05 06:31:05 +00:00
Chris Lattner
5ac4293606
Pull and through and/or/xor. This compiles some bitfield code to:
...
mov EAX, DWORD PTR [ESP + 4]
mov ECX, DWORD PTR [EAX]
mov EDX, ECX
add EDX, EDX
or EDX, ECX
and EDX, -2147483648
and ECX, 2147483647
or EDX, ECX
mov DWORD PTR [EAX], EDX
ret
instead of:
sub ESP, 4
mov DWORD PTR [ESP], ESI
mov EAX, DWORD PTR [ESP + 8]
mov ECX, DWORD PTR [EAX]
mov EDX, ECX
add EDX, EDX
mov ESI, ECX
and ESI, -2147483648
and EDX, -2147483648
or EDX, ESI
and ECX, 2147483647
or EDX, ECX
mov DWORD PTR [EAX], EDX
mov ESI, DWORD PTR [ESP]
add ESP, 4
ret
llvm-svn: 28122
2006-05-05 06:10:43 +00:00
Chris Lattner
812646aa0c
Implement a variety of simplifications for ANY_EXTEND.
...
llvm-svn: 28121
2006-05-05 05:58:59 +00:00
Chris Lattner
8d6fc20181
Factor some code, add these transformations:
...
// fold (and (trunc x), (trunc y)) -> (trunc (and x, y))
// fold (or (trunc x), (trunc y)) -> (trunc (or x, y))
// fold (xor (trunc x), (trunc y)) -> (trunc (xor x, y))
llvm-svn: 28120
2006-05-05 05:51:50 +00:00
Chris Lattner
2b48a94413
Remove a bogus transformation. This fixes SingleSource/UnitTests/2006-01-23-InitializedBitField.c
...
with some changes I have to the new CFE.
llvm-svn: 28022
2006-04-28 23:33:20 +00:00
Chris Lattner
662e940f73
Fix a couple more memory issues
...
llvm-svn: 27930
2006-04-21 15:32:26 +00:00
Chris Lattner
cc47ab3305
Fix a really subtle and obnoxious memory bug that caused issues with an
...
llvm-gcc4 boostrap. Whenever a node is deleted by the dag combiner, it
*must* be returned by the visit function, or the dag combiner will not
know that the node has been processed (and will, e.g., send it to the
target dag combine xforms).
llvm-svn: 27922
2006-04-20 23:55:59 +00:00
Evan Cheng
a320abc494
Turn a VAND into a VECTOR_SHUFFLE is applicable.
...
DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. vector clear elements,
into a vector shuffle with a zero vector. It only does so when TLI tells it
the xform is profitable.
llvm-svn: 27874
2006-04-20 08:56:16 +00:00
Chris Lattner
e1401e3610
Canonicalize vvector_shuffle(x,x) -> vvector_shuffle(x,undef) to enable patterns
...
to match again :)
llvm-svn: 27533
2006-04-08 05:34:25 +00:00
Chris Lattner
098c01e94e
Codegen shufflevector as VVECTOR_SHUFFLE
...
llvm-svn: 27529
2006-04-08 04:15:24 +00:00
Evan Cheng
613996c55e
1. If both vector operands of a vector_shuffle are undef, turn it into an undef.
...
2. A shuffle mask element can also be an undef.
llvm-svn: 27472
2006-04-06 23:20:43 +00:00
Chris Lattner
4ea52cac01
Do not create ZEXTLOAD's unless we are before legalize or the operation is
...
legal.
llvm-svn: 27402
2006-04-04 17:39:18 +00:00
Chris Lattner
e1e3adf802
Add a missing check, this fixes UnitTests/Vector/sumarray.c
...
llvm-svn: 27375
2006-04-03 17:29:28 +00:00
Chris Lattner
04c00fc844
Add a missing check, which broke a bunch of vector tests.
...
llvm-svn: 27374
2006-04-03 17:21:50 +00:00
Andrew Lenharth
94f012f606
back this out
...
llvm-svn: 27367
2006-04-03 03:16:50 +00:00
Andrew Lenharth
015eaf5f33
This should be a win of every arch
...
llvm-svn: 27364
2006-04-02 21:42:45 +00:00
Chris Lattner
4993249a04
Add a little dag combine to compile this:
...
int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
entry:
%tmp1 = load <4 x float>* %in ; <<4 x float>> [#uses=1]
%tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 ) ; <int> [#uses=1]
%tmp = seteq int %tmp, 0 ; <bool> [#uses=1]
%tmp3 = cast bool %tmp to int ; <int> [#uses=1]
ret int %tmp3
}
into this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
mtspr 256, r2
blr
instead of this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
xori r3, r3, 1
cntlzw r3, r3
srwi r3, r3, 5
mtspr 256, r2
blr
llvm-svn: 27356
2006-04-02 06:11:11 +00:00
Chris Lattner
0442a18758
Constant fold all of the vector binops. This allows us to compile this:
...
"vector unsigned char mergeLowHigh = (vector unsigned char)
( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"
aka:
void %test2(<16 x sbyte>* %P) {
store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
ret void
}
into this:
_test2:
mfspr r2, 256
oris r4, r2, 32768
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
lvx v0, r5, r4
stvx v0, 0, r3
mtspr 256, r2
blr
instead of this:
_test2:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
vspltisb v0, 8
lvx v1, r5, r4
vxor v0, v1, v0
stvx v0, 0, r3
mtspr 256, r2
blr
... which occurs here:
http://developer.apple.com/hardware/ve/calcspeed.html
llvm-svn: 27343
2006-04-02 03:25:57 +00:00
Chris Lattner
e4e64b6b85
Implement constant folding of bit_convert of arbitrary constant vbuild_vector nodes.
...
llvm-svn: 27341
2006-04-02 02:53:43 +00:00
Chris Lattner
39dcf1a9e2
Delete identity shuffles, implementing CodeGen/Generic/vector-identity-shuffle.ll
...
llvm-svn: 27317
2006-03-31 22:16:43 +00:00
Chris Lattner
7e30af3887
Remove dead *extloads. This allows us to codegen vector.ll:test_extract_elt
...
to:
test_extract_elt:
alloc r3 = ar.pfs,0,1,0,0
adds r8 = 12, r32
;;
ldfs f8 = [r8]
mov ar.pfs = r3
br.ret.sptk.many rp
instead of:
test_extract_elt:
alloc r3 = ar.pfs,0,1,0,0
adds r8 = 28, r32
adds r9 = 24, r32
adds r10 = 20, r32
adds r11 = 16, r32
;;
ldfs f6 = [r8]
;;
ldfs f6 = [r9]
adds r8 = 12, r32
adds r9 = 8, r32
adds r14 = 4, r32
;;
ldfs f6 = [r10]
;;
ldfs f6 = [r11]
ldfs f8 = [r8]
;;
ldfs f6 = [r9]
;;
ldfs f6 = [r14]
;;
ldfs f6 = [r32]
mov ar.pfs = r3
br.ret.sptk.many rp
llvm-svn: 27297
2006-03-31 18:10:41 +00:00
Chris Lattner
2d8551c85b
Delete dead loads in the dag. This allows us to compile
...
vector.ll:test_extract_elt2 into:
_test_extract_elt2:
lfd f1, 32(r3)
blr
instead of:
_test_extract_elt2:
lfd f0, 56(r3)
lfd f0, 48(r3)
lfd f0, 40(r3)
lfd f1, 32(r3)
lfd f0, 24(r3)
lfd f0, 16(r3)
lfd f0, 8(r3)
lfd f0, 0(r3)
blr
llvm-svn: 27296
2006-03-31 18:06:18 +00:00
Chris Lattner
20e619fba3
When building a VVECTOR_SHUFFLE node from extract_element operations, make
...
sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask).
The later is not canonical form, and prevents the PPC splat pattern from
matching. For a particular splat, we go from generating this:
li r10, lo16(LCPI1_0)
lis r11, ha16(LCPI1_0)
lvx v3, r11, r10
vperm v3, v2, v2, v3
to generating:
vspltw v3, v2, 3
llvm-svn: 27236
2006-03-28 22:19:47 +00:00
Chris Lattner
a46dfe80c8
Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X,undef,Y')
...
llvm-svn: 27235
2006-03-28 22:11:53 +00:00
Chris Lattner
c9992548fc
Turn a series of extract_element's feeding a build_vector into a
...
vector_shuffle node. For this:
void test(__m128 *res, __m128 *A, __m128 *B) {
*res = _mm_unpacklo_ps(*A, *B);
}
we now produce this code:
_test:
movl 8(%esp), %eax
movaps (%eax), %xmm0
movl 12(%esp), %eax
unpcklps (%eax), %xmm0
movl 4(%esp), %eax
movaps %xmm0, (%eax)
ret
instead of this:
_test:
subl $76, %esp
movl 88(%esp), %eax
movaps (%eax), %xmm0
movaps %xmm0, (%esp)
movaps %xmm0, 32(%esp)
movss 4(%esp), %xmm0
movss 32(%esp), %xmm1
unpcklps %xmm0, %xmm1
movl 84(%esp), %eax
movaps (%eax), %xmm0
movaps %xmm0, 16(%esp)
movaps %xmm0, 48(%esp)
movss 20(%esp), %xmm0
movss 48(%esp), %xmm2
unpcklps %xmm0, %xmm2
unpcklps %xmm1, %xmm2
movl 80(%esp), %eax
movaps %xmm2, (%eax)
addl $76, %esp
ret
GCC produces this (with -fomit-frame-pointer):
_test:
subl $12, %esp
movl 20(%esp), %eax
movaps (%eax), %xmm0
movl 24(%esp), %eax
unpcklps (%eax), %xmm0
movl 16(%esp), %eax
movaps %xmm0, (%eax)
addl $12, %esp
ret
llvm-svn: 27233
2006-03-28 20:28:38 +00:00
Chris Lattner
b7163598f9
Don't crash on X^X if X is a vector. Instead, produce a vector of zeros.
...
llvm-svn: 27229
2006-03-28 19:11:05 +00:00
Chris Lattner
dc1eab5886
Don't call SimplifyDemandedBits on vectors
...
llvm-svn: 27128
2006-03-25 22:19:00 +00:00
Chris Lattner
5336a59e4b
fold insertelement(buildvector) -> buildvector if the inserted element # is
...
a constant. This implements test_constant_insert in CodeGen/Generic/vector.ll
llvm-svn: 26851
2006-03-19 01:27:56 +00:00
Nate Begeman
bb01d4f272
Remove BRTWOWAY*
...
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.
llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner
68ac09d5cb
make sure dead token factor nodes are removed by the dag combiner.
...
llvm-svn: 26731
2006-03-13 18:37:30 +00:00
Chris Lattner
d8c2a48d58
Fold X+Y -> X|Y when safe. This implements:
...
Regression/CodeGen/PowerPC/and_add.ll
a case that occurs with dynamic allocas of constant size.
llvm-svn: 26727
2006-03-13 06:51:27 +00:00
Chris Lattner
8bb6cb7d7b
add a couple of missing folds
...
llvm-svn: 26724
2006-03-13 06:26:26 +00:00
Chris Lattner
bdaf4f38b5
Reinstate this now that the offending opposite xform has been removed.
...
llvm-svn: 26548
2006-03-05 19:53:55 +00:00
Evan Cheng
d428e22c07
Back out fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2) for now.
...
It's causing an infinite loop compiling ldecod on x86 / Darwin.
llvm-svn: 26544
2006-03-05 07:30:16 +00:00
Chris Lattner
3bc4050217
Add some simple copysign folds
...
llvm-svn: 26543
2006-03-05 05:30:57 +00:00
Chris Lattner
f29f5204cc
fold (mul (add x, c1), c2) -> (add (mul x, c2), c1*c2)
...
fold (shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)
This allows us to compile CodeGen/PowerPC/addi-reassoc.ll into:
_test1:
slwi r2, r4, 4
add r2, r2, r3
lwz r3, 36(r2)
blr
_test2:
mulli r2, r4, 5
add r2, r2, r3
lbz r2, 11(r2)
extsb r3, r2
blr
instead of:
_test1:
addi r2, r4, 2
slwi r2, r2, 4
add r2, r3, r2
lwz r3, 4(r2)
blr
_test2:
addi r2, r4, 2
mulli r2, r2, 5
add r2, r3, r2
lbz r2, 1(r2)
extsb r3, r2
blr
llvm-svn: 26535
2006-03-04 23:33:26 +00:00
Chris Lattner
0db2f2c689
Fix CodeGen/Generic/2006-03-01-dagcombineinfloop.ll, an infinite loop
...
in the dag combiner on 176.gcc on x86.
llvm-svn: 26459
2006-03-01 21:47:21 +00:00
Chris Lattner
232024edb8
Fix a typo evan noticed
...
llvm-svn: 26454
2006-03-01 19:55:35 +00:00
Chris Lattner
bc1c85beea
Add support for target-specific dag combines
...
llvm-svn: 26443
2006-03-01 04:53:38 +00:00
Chris Lattner
fbcd62d3bb
Add a new AddToWorkList method, start using it
...
llvm-svn: 26441
2006-03-01 04:03:14 +00:00
Chris Lattner
324871ef1a
Pull shifts by a constant through multiplies (a form of reassociation),
...
implementing Regression/CodeGen/X86/mul-shift-reassoc.ll
llvm-svn: 26440
2006-03-01 03:44:24 +00:00
Evan Cheng
b97aab4371
Vector ops lowering.
...
llvm-svn: 26436
2006-03-01 01:09:54 +00:00
Chris Lattner
f0032b350c
Compile:
...
unsigned foo4(unsigned short *P) { return *P & 255; }
unsigned foo5(short *P) { return *P & 255; }
to:
_foo4:
lbz r3,1(r3)
blr
_foo5:
lbz r3,1(r3)
blr
not:
_foo4:
lhz r2, 0(r3)
rlwinm r3, r2, 0, 24, 31
blr
_foo5:
lhz r2, 0(r3)
rlwinm r3, r2, 0, 24, 31
blr
llvm-svn: 26419
2006-02-28 06:49:37 +00:00
Chris Lattner
bdbc4476d9
Fold "and (LOAD P), 255" -> zextload. This allows us to compile:
...
unsigned foo3(unsigned *P) { return *P & 255; }
as:
_foo3:
lbz r3, 3(r3)
blr
instead of:
_foo3:
lwz r2, 0(r3)
rlwinm r3, r2, 0, 24, 31
blr
and:
unsigned short foo2(float a) { return a; }
as:
_foo2:
fctiwz f0, f1
stfd f0, -8(r1)
lhz r3, -2(r1)
blr
instead of:
_foo2:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
rlwinm r3, r2, 0, 16, 31
blr
llvm-svn: 26417
2006-02-28 06:35:35 +00:00
Chris Lattner
0f8a727c49
fold (sra (sra x, c1), c2) -> (sra x, c1+c2)
...
llvm-svn: 26416
2006-02-28 06:23:04 +00:00
Chris Lattner
47ee42829d
remove some completed notes
...
llvm-svn: 26390
2006-02-27 00:39:31 +00:00
Chris Lattner
301f45cf6f
Fix a problem Nate and Duraid reported where simplifying nodes can cause
...
them to get ressurected, in which case, deleting the undead nodes is
unfriendly.
llvm-svn: 26291
2006-02-20 06:51:04 +00:00
Nate Begeman
abac61603f
Add checks to make sure we don't create bogus extend nodes, and fix a bug
...
where we were doing exactly that which was causing failures on x86 and
alpha.
llvm-svn: 26284
2006-02-18 02:40:58 +00:00
Chris Lattner
375e1a71cc
Fix a tricky issue in the SimplifyDemandedBits code where CombineTo wasn't
...
exactly the API we wanted to call into. This fixes the crash on crafty last
night.
llvm-svn: 26269
2006-02-17 21:58:01 +00:00
Nate Begeman
fb5dbadf15
Clean up DemandedBitsAreZero interface
...
Make more use of the new mask helpers in valuetypes.h
Combine (sra (srl x, c1), c1) -> sext_inreg if legal
llvm-svn: 26263
2006-02-17 19:54:08 +00:00
Nate Begeman
57b3567552
Don't expand sdiv by power of two before legalize, since it will likely
...
generate illegal nodes.
llvm-svn: 26261
2006-02-17 07:26:20 +00:00
Nate Begeman
5965bd19f8
kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC
...
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.
llvm-svn: 26255
2006-02-17 05:43:56 +00:00
Nate Begeman
8a77efe4f7
Rework the SelectionDAG-based implementations of SimplifyDemandedBits
...
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.
llvm-svn: 26238
2006-02-16 21:11:51 +00:00
Chris Lattner
471627c49d
Lowering of sdiv X, pow2 was broken, this fixes it. This patch is written
...
by Nate, I'm just committing it for him.
llvm-svn: 26230
2006-02-16 08:02:36 +00:00
Jim Laskey
2eea436192
Should not combine ISD::LOCATIONs until we have scheme to remove from
...
MachineDebugInfo tables.
llvm-svn: 26216
2006-02-15 19:34:44 +00:00
Chris Lattner
a10e23c19f
Compile this:
...
xori r6, r2, 1
rlwinm r6, r6, 0, 31, 31
cmpwi cr0, r6, 0
bne cr0, LBB1_3 ; endif
to this:
rlwinm r6, r2, 0, 31, 31
cmpwi cr0, r6, 0
beq cr0, LBB1_3 ; endif
llvm-svn: 26047
2006-02-08 02:13:15 +00:00
Nate Begeman
8c9cd461df
Back out previous commit, it isn't safe.
...
llvm-svn: 26006
2006-02-05 08:23:00 +00:00
Nate Begeman
3dc8b89493
fold c1 << (x + c2) into (c1 << c2) << x. fix a warning.
...
llvm-svn: 26005
2006-02-05 08:07:24 +00:00
Nate Begeman
c89fdf1eb3
Handle urem by shifted powers of 2.
...
llvm-svn: 26001
2006-02-05 07:36:48 +00:00
Nate Begeman
25d178bece
handle combining A / (B << N) into A >>u (log2(B)+N) when B is a power of 2
...
llvm-svn: 26000
2006-02-05 07:20:23 +00:00
Nate Begeman
dc7bba9ffe
Add a framework for eliminating instructions that produces undemanded bits.
...
llvm-svn: 25945
2006-02-03 22:24:05 +00:00
Nate Begeman
22e251abf1
Add common code for reassociating ops in the dag combiner
...
llvm-svn: 25934
2006-02-03 06:46:56 +00:00
Chris Lattner
49beaf40fc
Turn any_extend nodes into zero_extend nodes when it allows us to remove an
...
and instruction. This allows us to compile stuff like this:
bool %X(int %X) {
%Y = add int %X, 14
%Z = setne int %Y, 12345
ret bool %Z
}
to this:
_X:
cmpl $12331, 4(%esp)
setne %al
movzbl %al, %eax
ret
instead of this:
_X:
cmpl $12331, 4(%esp)
setne %al
movzbl %al, %eax
andl $1, %eax
ret
This occurs quite a bit with the X86 backend. For example, 25 times in
lambda, 30 times in 177.mesa, 14 times in galgel, 70 times in fma3d,
25 times in vpr, several hundred times in gcc, ~45 times in crafty,
~60 times in parser, ~140 times in eon, 110 times in perlbmk, 55 on gap,
16 times on bzip2, 14 times on twolf, and 1-2 times in many other SPEC2K
programs.
llvm-svn: 25901
2006-02-02 07:17:31 +00:00
Chris Lattner
49ce35542f
add two dag combines:
...
(C1-X) == C2 --> X == C1-C2
(X+C1) == C2 --> X == C2-C1
This allows us to compile this:
bool %X(int %X) {
%Y = add int %X, 14
%Z = setne int %Y, 12345
ret bool %Z
}
into this:
_X:
cmpl $12331, 4(%esp)
setne %al
movzbl %al, %eax
andl $1, %eax
ret
not this:
_X:
movl $14, %eax
addl 4(%esp), %eax
cmpl $12345, %eax
setne %al
movzbl %al, %eax
andl $1, %eax
ret
Testcase here: Regression/CodeGen/X86/compare-add.ll
nukage of the and coming up next.
llvm-svn: 25898
2006-02-02 06:36:13 +00:00
Nate Begeman
7e7f439f85
Fix some of the stuff in the PPC README file, and clean up legalization
...
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.
llvm-svn: 25875
2006-02-01 07:19:44 +00:00
Chris Lattner
f0b24d2dc0
Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface,making isMaskedValueZeroForTargetNode simpler, and useable from other partsof the compiler.
...
llvm-svn: 25803
2006-01-30 04:09:27 +00:00
Chris Lattner
3b40e64aa3
pass the address of MaskedValueIsZero into isMaskedValueZeroForTargetNode,
...
to permit recursion
llvm-svn: 25799
2006-01-30 03:49:37 +00:00
Chris Lattner
678da98835
eliminate uses of SelectionDAG::getBR2Way_CC
...
llvm-svn: 25767
2006-01-29 06:00:45 +00:00
Nate Begeman
af397cec0b
Add a missing case to the dag combiner.
...
llvm-svn: 25723
2006-01-28 01:06:30 +00:00
Chris Lattner
de02d7727f
Add explicit #includes of <iostream>
...
llvm-svn: 25515
2006-01-22 23:41:00 +00:00
Nate Begeman
569c439567
Get rid of code in the DAGCombiner that is duplicated in SelectionDAG.cpp
...
Now all constant folding in the code generator is in one place.
llvm-svn: 25426
2006-01-18 22:35:16 +00:00
Chris Lattner
5fee908be5
Fix a backwards conditional that caused an inf loop in some cases. This
...
fixes: test/Regression/CodeGen/Generic/2005-01-18-SetUO-InfLoop.ll
llvm-svn: 25419
2006-01-18 19:13:41 +00:00
Chris Lattner
fcdb420baf
Disable two transformations that contribute to bus errors on SparcV8.
...
llvm-svn: 25339
2006-01-15 18:58:59 +00:00
Chris Lattner
3470b5dee6
Add a simple missing fold to produce this:
...
subfic r3, r2, 33
instead of this:
subfic r2, r2, 32
addi r3, r2, 1
llvm-svn: 25255
2006-01-12 20:22:43 +00:00
Chris Lattner
b1ee616de9
Don't create rotate instructions in unsupported types, because we don't have
...
promote/expand code yet. This fixes the 177.mesa failure on PPC.
llvm-svn: 25250
2006-01-12 18:57:33 +00:00
Nate Begeman
1b8121b227
Add bswap, rotl, and rotr nodes
...
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl
Targets should add rotl/rotr patterns if they have them
llvm-svn: 25222
2006-01-11 21:21:00 +00:00
Evan Cheng
85c973cda9
Revert the previous check-in. Leave shl x, 1 along for target to deal with.
...
llvm-svn: 25121
2006-01-06 01:56:02 +00:00
Evan Cheng
b03f9b32d2
fold (shl x, 1) -> (add x, x)
...
llvm-svn: 25120
2006-01-06 01:06:31 +00:00
Jim Laskey
762e9ec06c
Added initial support for DEBUG_LABEL allowing debug specific labels to be
...
inserted in the code.
llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Jim Laskey
0da76a676a
Add unique id to debug location for debug label use (work in progress.)
...
llvm-svn: 25096
2006-01-04 15:04:11 +00:00
Jim Laskey
bdba3e2a46
Remove redundant debug locations.
...
llvm-svn: 24995
2005-12-23 20:08:28 +00:00
Chris Lattner
26943b9691
Simplify store(bitconv(x)) to store(x). This allows us to compile this:
...
void bar(double Y, double *X) {
*X = Y;
}
to this:
bar:
save -96, %o6, %o6
st %i1, [%i2+4]
st %i0, [%i2]
restore %g0, %g0, %g0
retl
nop
instead of this:
bar:
save -104, %o6, %o6
st %i1, [%i6+-4]
st %i0, [%i6+-8]
ldd [%i6+-8], %f0
std %f0, [%i2]
restore %g0, %g0, %g0
retl
nop
on sparcv8.
llvm-svn: 24983
2005-12-23 05:48:07 +00:00
Chris Lattner
54560f6887
fold (conv (load x)) -> (load (conv*)x).
...
This allows us to compile this:
void foo(double);
void bar(double *X) { foo(*X); }
To this:
bar:
save -96, %o6, %o6
ld [%i0+4], %o1
ld [%i0], %o0
call foo
nop
restore %g0, %g0, %g0
retl
nop
instead of this:
bar:
save -104, %o6, %o6
ldd [%i0], %f0
std %f0, [%i6+-8]
ld [%i6+-4], %o1
ld [%i6+-8], %o0
call foo
nop
restore %g0, %g0, %g0
retl
nop
on SparcV8.
llvm-svn: 24982
2005-12-23 05:44:41 +00:00
Chris Lattner
efbbedbf4a
Fold bitconv(bitconv(x)) -> x. We now compile this:
...
void foo(double);
void bar(double X) { foo(X); }
to this:
bar:
save -96, %o6, %o6
or %g0, %i0, %o0
or %g0, %i1, %o1
call foo
nop
restore %g0, %g0, %g0
retl
nop
instead of this:
bar:
save -112, %o6, %o6
st %i1, [%i6+-4]
st %i0, [%i6+-8]
ldd [%i6+-8], %f0
std %f0, [%i6+-16]
ld [%i6+-12], %o1
ld [%i6+-16], %o0
call foo
nop
restore %g0, %g0, %g0
retl
nop
on V8.
llvm-svn: 24981
2005-12-23 05:37:50 +00:00
Chris Lattner
a187460552
constant fold bits_convert in getNode and in the dag combiner for fp<->int
...
conversions. This allows V8 to compiles this:
void %test() {
call float %test2( float 1.000000e+00, float 2.000000e+00, double 3.000000e+00, double* null )
ret void
}
into:
test:
save -96, %o6, %o6
sethi 0, %o3
sethi 1049088, %o2
sethi 1048576, %o1
sethi 1040384
, %o0
or %g0, %o3, %o4
call test2
nop
restore %g0, %g0, %g0
retl
nop
instead of:
test:
save -112, %o6, %o6
sethi 0, %o4
sethi 1049088, %l0
st %o4, [%i6+-12]
st %l0, [%i6+-16]
ld [%i6+-12], %o3
ld [%i6+-16], %o2
sethi 1048576, %o1
sethi 1040384
, %o0
call test2
nop
restore %g0, %g0, %g0
retl
nop
llvm-svn: 24980
2005-12-23 05:30:37 +00:00
Evan Cheng
9cdc16c6d3
* Fix a GlobalAddress lowering bug.
...
* Teach DAG combiner about X86ISD::SETCC by adding a TargetLowering hook.
llvm-svn: 24921
2005-12-21 23:05:39 +00:00
Chris Lattner
83e4407379
Don't create SEXTLOAD/ZEXTLOAD instructions that the target doesn't support
...
if after legalize. This fixes IA64 failures.
llvm-svn: 24725
2005-12-15 19:02:38 +00:00
Chris Lattner
d39c60fcc8
When folding loads into ops, immediately replace uses of the op with the
...
load. This reduces number of worklist iterations and avoid missing optimizations
depending on folding of things into sext_inreg nodes (which aren't supported by
all targets).
Tested by Regression/CodeGen/X86/extend.ll:test2
llvm-svn: 24712
2005-12-14 19:25:30 +00:00
Chris Lattner
7dac1083da
Fix the (zext (zextload)) case to trigger, similarly for sign extends.
...
Allow (zext (truncate)) to apply after legalize if the target supports
AND (which all do).
This compiles
short %foo() {
%tmp.0 = load ubyte* %X ; <ubyte> [#uses=1]
%tmp.3 = cast ubyte %tmp.0 to short ; <short> [#uses=1]
ret short %tmp.3
}
to:
_foo:
movzbl _X, %eax
ret
instead of:
_foo:
movzbl _X, %eax
movzbl %al, %eax
ret
thanks to Evan for pointing this out.
llvm-svn: 24709
2005-12-14 19:05:06 +00:00
Chris Lattner
f753d1a574
Fix a miscompilation in crafty due to a recent patch
...
llvm-svn: 24706
2005-12-14 07:58:38 +00:00
Evan Cheng
bce7c47306
Fold (zext (load x) to (zextload x).
...
llvm-svn: 24702
2005-12-14 02:19:23 +00:00
Chris Lattner
57c882edf8
Only transform (sext (truncate x)) -> (sextinreg x) if before legalize or
...
if the target supports the resultant sextinreg
llvm-svn: 24632
2005-12-07 18:02:05 +00:00
Chris Lattner
cbd3d01a43
Teach the dag combiner to turn a truncate/sign_extend pair into a sextinreg
...
when the types match up. This allows the X86 backend to compile:
sbyte %toggle_value(sbyte* %tmp.1) {
%tmp.2 = load sbyte* %tmp.1
ret sbyte %tmp.2
}
to this:
_toggle_value:
mov %EAX, DWORD PTR [%ESP + 4]
movsx %EAX, BYTE PTR [%EAX]
ret
instead of this:
_toggle_value:
mov %EAX, DWORD PTR [%ESP + 4]
movsx %EAX, BYTE PTR [%EAX]
movsx %EAX, %AL
ret
noticed in Shootout/objinst.
-Chris
llvm-svn: 24630
2005-12-07 07:11:03 +00:00
Jeff Cohen
cf1f782a2f
Fix operator precedence bug caught by VC++.
...
llvm-svn: 24318
2005-11-12 00:59:01 +00:00
Chris Lattner
bf4f233214
Switch the allnodes list from a vector of pointers to an ilist of nodes.This eliminates the vector, allows constant time removal of a node froma graph, and makes iteration over the all nodes list stable when adding
...
nodes to the graph.
llvm-svn: 24263
2005-11-09 23:47:37 +00:00
Nate Begeman
ee065281e8
Fix a crash that Andrew noticed, and add a pair of braces to unfconfuse
...
XCode's indenting.
llvm-svn: 24159
2005-11-02 18:42:59 +00:00
Chris Lattner
17df608719
Fix a source of undefined behavior when dealing with 64-bit types. This
...
may fix PR652. Thanks to Andrew for tracking down the problem.
llvm-svn: 24145
2005-11-02 01:47:04 +00:00
Chris Lattner
a70878d4fb
Codegen mul by negative power of two with a shift and negate.
...
This implements test/Regression/CodeGen/PowerPC/mul-neg-power-2.ll,
producing:
_foo:
slwi r2, r3, 1
subfic r3, r2, 63
blr
instead of:
_foo:
mulli r2, r3, -2
addi r3, r2, 63
blr
llvm-svn: 24106
2005-10-30 06:41:49 +00:00
Chris Lattner
4b6d583d7a
Fix DSE to not nuke dead stores unless they redundant store is the same
...
VT as the killing one. Fix fixes PR491
llvm-svn: 24034
2005-10-27 07:10:34 +00:00
Chris Lattner
d8c5c066a1
Add a simple xform that is useful for bitfield operations.
...
llvm-svn: 24029
2005-10-27 05:06:38 +00:00
Chris Lattner
3b409a85eb
Clear a bit in this file that was causing a miscompilation of 178.galgel.
...
llvm-svn: 23980
2005-10-25 18:57:30 +00:00
Chris Lattner
9faa5b7a9a
BuildSDIV and BuildUDIV only work for i32/i64, but they don't check that
...
the input is that type, this caused a failure on gs on X86 last night.
Move the hard checks into Build[US]Div since that is where decisions like
this should be made.
llvm-svn: 23881
2005-10-22 18:50:15 +00:00
Chris Lattner
75ea5b10bf
add a case missing from the dag combiner that exposed the failure on
...
2005-10-21-longlonggtu.ll.
llvm-svn: 23875
2005-10-21 21:23:25 +00:00
Nate Begeman
8f62cd32ad
Fix a typo in the dag combiner, so that this can work on i64 targets
...
llvm-svn: 23856
2005-10-21 01:51:45 +00:00
Nate Begeman
4dd383120f
Invert the TargetLowering flag that controls divide by consant expansion.
...
Add a new flag to TargetLowering indicating if the target has really cheap
signed division by powers of two, make ppc use it. This will probably go
away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.
llvm-svn: 23853
2005-10-21 00:02:42 +00:00
Nate Begeman
7efe53d90b
Fix a couple bugs in the const div stuff where we'd generate MULHS/MULHU
...
for types that aren't legal, and fail a divisor is less than zero
comparison, which would cause us to drop a subtract.
llvm-svn: 23846
2005-10-20 17:45:03 +00:00
Chris Lattner
a6efeb01f9
don't use llabs with apparently VC++ doesn't have
...
llvm-svn: 23845
2005-10-20 17:01:00 +00:00
Nate Begeman
c6f067a8c4
Move the target constant divide optimization up into the dag combiner, so
...
that the nodes can be folded with other nodes, and we can not duplicate
code in every backend. Alpha will probably want this too.
llvm-svn: 23835
2005-10-20 02:15:44 +00:00
Chris Lattner
6c14c35bd7
Fold (select C, load A, load B) -> load (select C, A, B). This happens quite
...
a lot throughout many programs. In particular, specfp triggers it a bunch for
constant FP nodes when you have code like cond ? 1.0 : -1.0.
If the PPC ISel exposed the loads implicit in pic references to external globals,
we would be able to eliminate a load in cases like this as well:
%X = external global int
%Y = external global int
int* %test4(bool %C) {
%G = select bool %C, int* %X, int* %Y
ret int* %G
}
Note that this breaks things that use SrcValue's (see the fixme), but since nothing
uses them yet, this is ok.
Also, simplify some code to use hasOneUse() on an SDOperand instead of hasNUsesOfValue directly.
llvm-svn: 23781
2005-10-18 06:04:22 +00:00
Nate Begeman
418c6e4045
Implement some feedback from Chris re: constant canonicalization
...
llvm-svn: 23777
2005-10-18 00:28:13 +00:00
Nate Begeman
ec48a1bfbd
fold fmul X, +2.0 -> fadd X, X;
...
llvm-svn: 23774
2005-10-17 20:40:11 +00:00
Chris Lattner
eeb2bda2fa
add a trivial fold
...
llvm-svn: 23764
2005-10-17 01:07:11 +00:00
Chris Lattner
e540800d5a
Fix this logic.
...
llvm-svn: 23756
2005-10-15 22:35:40 +00:00
Chris Lattner
17cc9edd33
Add a case we were missing that was causing us to fail CodeGen/PowerPC/rlwinm.ll:test3
...
llvm-svn: 23755
2005-10-15 22:18:08 +00:00