Dan Gohman
30f060be80
Fix the alias analysis query in DAGCombiner to not add in two
...
offsets. The SrcValueOffset values are the real offsets from the
SrcValue base pointers.
llvm-svn: 40534
2007-07-26 16:14:06 +00:00
Dan Gohman
80f9f077e3
Don't call SimplifyVBinOp for non-vector operations, following earlier review
...
feedback. This theoretically makes the common (scalar) case more efficient.
llvm-svn: 39823
2007-07-13 20:03:40 +00:00
Dan Gohman
adb3d37c07
Fix a bug in the folding of binary operators to undef.
...
Thanks to Lauro for spotting this!
llvm-svn: 38491
2007-07-10 15:19:29 +00:00
Dan Gohman
fa91282dbf
Fix the folding of undef in several binary operators to recognize
...
undef in either the left or right operand.
llvm-svn: 38489
2007-07-10 14:20:37 +00:00
Dan Gohman
2af3063337
Preserve volatililty and alignment information when lowering or
...
simplifying loads and stores.
llvm-svn: 38473
2007-07-09 22:18:38 +00:00
Chris Lattner
6caf8fdd04
Fix this warning:
...
DAGCombiner.cpp: In member function 'llvm::SDOperand<unnamed>::DAGCombiner::visitOR(llvm::SDNode*)':
DAGCombiner.cpp:1608: warning: passing negative value '-0x00000000000000001' for argument 1 to 'llvm::SDOperand llvm::SelectionDAG::getConstant(uint64_t, llvm::MVT::ValueType, bool)'
oiy.
llvm-svn: 38458
2007-07-09 16:16:34 +00:00
Dan Gohman
06563a8702
Fix several over-aggressive folds for undef nodes in dagcombine, to
...
follow the rules for undef used in instcombine.
llvm-svn: 37851
2007-07-03 14:03:57 +00:00
Dan Gohman
9a70823375
Teach GetNegatedExpression to negate 0-B to B in UnsafeFPMath mode, and
...
visitFSUB to fold 0-B to -B in UnsafeFPMath mode. Also change visitFNEG
to use isNegatibleForFree/GetNegatedExpression instead of doing a subset
of the same thing manually.
This fixes test/CodeGen/X86/negative-sin.ll.
llvm-svn: 37842
2007-07-02 15:48:56 +00:00
Dan Gohman
a866514528
Generalize MVT::ValueType and associated functions to be able to represent
...
extended vector types. Remove the special SDNode opcodes used for pre-legalize
vector operations, and the special MVT::Vector type used with them. Adjust
lowering and legalize to work with the normal SDNode kinds instead, and to
use the normal MVT functions to work with vector types instead of using the
two special operands that the pre-legalize nodes held.
This allows pre-legalize and post-legalize DAGs, and the code that operates
on them, to be more consistent. Pre-legalize vector operators can be handled
more consistently with scalar operators. And, -view-dag-combine1-dags and
-view-legalize-dags now look prettier for vector code.
llvm-svn: 37719
2007-06-25 16:23:39 +00:00
Dan Gohman
309d3d51b3
Move ComputeMaskedBits, MaskedValueIsZero, and ComputeNumSignBits from
...
TargetLowering to SelectionDAG so that they have more convenient
access to the current DAG, in preparation for the ValueType routines
being changed from standalone functions to members of SelectionDAG for
the pre-legalize vector type changes.
llvm-svn: 37704
2007-06-22 14:59:07 +00:00
Evan Cheng
aa5f5d960d
Xforms:
...
(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))
(sub x, (select cc, 0, c)) -> (select cc, x, (sub, x, c))
llvm-svn: 37685
2007-06-21 07:39:16 +00:00
Dan Gohman
a7644dd9b9
Pass a SelectionDAG into SDNode::dump everywhere it's used, in prepration
...
for needing the DAG node to print pre-legalize extended value types, and
to get better debug messages with target-specific nodes.
llvm-svn: 37656
2007-06-19 14:13:56 +00:00
Dan Gohman
5c4413120f
Rename MVT::getVectorBaseType to MVT::getVectorElementType.
...
llvm-svn: 37579
2007-06-14 22:58:02 +00:00
Chris Lattner
4698083b96
tighten up recursion depth again
...
llvm-svn: 37330
2007-05-25 02:19:06 +00:00
Evan Cheng
a4d187b8ce
Fix a typo that caused combiner to create mal-formed pre-indexed store where value store is the same as the base pointer.
...
llvm-svn: 37318
2007-05-24 02:35:39 +00:00
Chris Lattner
6509c0673f
prevent exponential recursion in isNegatibleForFree
...
llvm-svn: 37310
2007-05-23 07:35:22 +00:00
Dan Gohman
b539df3389
Qualify calls to getTypeForValueType with MVT:: too.
...
llvm-svn: 37233
2007-05-18 18:41:29 +00:00
Dale Johannesen
7a6c175e7a
Don't fold bitconvert(load) for preinc/postdec loads. Likewise stores.
...
llvm-svn: 37130
2007-05-16 22:45:30 +00:00
Chris Lattner
48fb92f75d
Use a ptr set instead of a linear search to unique TokenFactor operands.
...
This fixes PR1423
llvm-svn: 37102
2007-05-16 06:37:59 +00:00
Evan Cheng
288f133c71
Bug fix: should check ABI alignment, not pref. alignment.
...
llvm-svn: 37094
2007-05-16 02:04:50 +00:00
Lauro Ramos Venancio
3f142cbca2
Fix an infinite recursion in GetNegatedExpression.
...
llvm-svn: 37086
2007-05-15 17:05:43 +00:00
Chris Lattner
e49c974a7c
implement a simple fneg optimization/propagation thing. This compiles:
...
CodeGen/PowerPC/fneg.ll into:
_t4:
fmul f0, f3, f4
fmadd f1, f1, f2, f0
blr
instead of:
_t4:
fneg f0, f3
fmul f0, f0, f4
fmsub f1, f1, f2, f0
blr
llvm-svn: 37054
2007-05-14 22:04:50 +00:00
Evan Cheng
f325c2a65e
Can't fold the bit_convert is the store is a truncating store.
...
llvm-svn: 36962
2007-05-09 21:49:47 +00:00
Evan Cheng
562e45692e
Forgot a check.
...
llvm-svn: 36910
2007-05-07 21:36:06 +00:00
Evan Cheng
a4cf58a103
Enable a couple of xforms:
...
- (store (bitconvert v)) -> (store v) if resultant store does not require
higher alignment
- (bitconvert (load v)) -> (load (bitconvert*)v) if resultant load does not
require higher alignment
llvm-svn: 36908
2007-05-07 21:27:48 +00:00
Evan Cheng
044a0a8cfb
Don't create indexed load / store with zero offset!
...
llvm-svn: 36716
2007-05-03 23:52:19 +00:00
Evan Cheng
b68343cdd8
Forgot about chain result; also UNDEF cannot have multiple values.
...
llvm-svn: 36622
2007-05-01 08:53:39 +00:00
Evan Cheng
a684cd23a5
* Only turn a load to UNDEF if all of its outputs have no uses (indexed loads
...
produce two results.)
* Do not touch volatile loads.
llvm-svn: 36604
2007-05-01 00:38:21 +00:00
Christopher Lamb
8af6d5896f
PR400 phase 2. Propagate attributed load/store information through DAGs.
...
llvm-svn: 36356
2007-04-22 23:15:30 +00:00
Reid Spencer
0c1349e6bc
Revert Christopher Lamb's load/store alignment changes.
...
llvm-svn: 36309
2007-04-21 18:36:27 +00:00
Christopher Lamb
bff50208c8
add support for alignment attributes on load/store instructions
...
llvm-svn: 36301
2007-04-21 08:16:25 +00:00
Chris Lattner
f03c90bee6
allow SRL to simplify its operands, as it doesn't demand all bits as input.
...
llvm-svn: 36245
2007-04-18 03:06:49 +00:00
Chris Lattner
bf14f20632
When replacing a node in SimplifyDemandedBits, if the old node used any
...
single-use nodes, they will be dead soon. Make sure to remove them before
processing other nodes. This implements CodeGen/X86/shl_elim.ll
llvm-svn: 36244
2007-04-18 03:05:22 +00:00
Chris Lattner
9ad5915559
SIGN_EXTEND_INREG does not demand its top bits. Give SimplifyDemandedBits
...
a chance to hack on it. This compiles:
int baz(long long a) { return (short)(((int)(a >>24)) >> 9); }
into:
_baz:
slwi r2, r3, 8
srwi r2, r2, 9
extsh r3, r2
blr
instead of:
_baz:
srwi r2, r4, 24
rlwimi r2, r3, 8, 0, 23
srwi r2, r2, 9
extsh r3, r2
blr
This implements CodeGen/PowerPC/sign_ext_inreg1.ll
llvm-svn: 36212
2007-04-17 19:03:21 +00:00
Chris Lattner
18e4ac4107
fix an infinite loop compiling ldecod, notice by JeffC.
...
llvm-svn: 35910
2007-04-11 16:51:53 +00:00
Chris Lattner
a083ffcad7
Fix this harder.
...
llvm-svn: 35888
2007-04-11 06:50:51 +00:00
Chris Lattner
c5f85d3738
don't create shifts by zero, fix some problems with my previous patch
...
llvm-svn: 35887
2007-04-11 06:43:25 +00:00
Chris Lattner
65786b078c
Teach the codegen to turn [aez]ext (setcc) -> selectcc of 1/0, which often
...
allows other simplifications. For example, this compiles:
int isnegative(unsigned int X) {
return !(X < 2147483648U);
}
Into this code:
x86:
movl 4(%esp), %eax
shrl $31, %eax
ret
arm:
mov r0, r0, lsr #31
bx lr
thumb:
lsr r0, r0, #31
bx lr
instead of:
x86:
cmpl $0, 4(%esp)
sets %al
movzbl %al, %eax
ret
arm:
mov r3, #0
cmp r0, #0
movlt r3, #1
mov r0, r3
bx lr
thumb:
mov r2, #1
mov r1, #0
cmp r0, #0
blt LBB1_2 @entry
LBB1_1: @entry
cpy r2, r1
LBB1_2: @entry
cpy r0, r2
bx lr
Testcase here: test/CodeGen/Generic/ispositive.ll
llvm-svn: 35883
2007-04-11 05:32:27 +00:00
Chris Lattner
41189c63cc
Codegen integer abs more efficiently using the trick from the PPC CWG. This
...
improves codegen on many architectures. Tests committed as CodeGen/*/iabs.ll
X86 Old: X86 New:
_test: _test:
movl 4(%esp), %ecx movl 4(%esp), %eax
movl %ecx, %eax movl %eax, %ecx
negl %eax sarl $31, %ecx
testl %ecx, %ecx addl %ecx, %eax
cmovns %ecx, %eax xorl %ecx, %eax
ret ret
PPC Old: PPC New:
_test: _test:
cmpwi cr0, r3, -1 srawi r2, r3, 31
neg r2, r3 add r3, r3, r2
bgt cr0, LBB1_2 ; xor r3, r3, r2
LBB1_1: ; blr
mr r3, r2
LBB1_2: ;
blr
ARM Old: ARM New:
_test: _test:
rsb r3, r0, #0 add r3, r0, r0, asr #31
cmp r0, #0 eor r0, r3, r0, asr #31
movge r3, r0 bx lr
mov r0, r3
bx lr
Thumb Old: Thumb New:
_test: _test:
neg r2, r0 asr r2, r0, #31
cmp r0, #0 add r0, r0, r2
bge LBB1_2 eor r0, r2
LBB1_1: @ bx lr
cpy r0, r2
LBB1_2: @
bx lr
Sparc Old: Sparc New:
test: test:
save -96, %o6, %o6 save -96, %o6, %o6
sethi 0, %l0 sra %i0, 31, %l0
sub %l0, %i0, %l0 add %i0, %l0, %l1
subcc %i0, -1, %l1 xor %l1, %l0, %i0
bg .BB1_2 restore %g0, %g0, %g0
nop retl
.BB1_1: nop
or %g0, %l0, %i0
.BB1_2:
restore %g0, %g0, %g0
retl
nop
It also helps alpha/ia64 :)
llvm-svn: 35881
2007-04-11 05:11:38 +00:00
Scott Michel
16627a542f
1. Insert custom lowering hooks for ISD::ROTR and ISD::ROTL.
...
2. Help DAGCombiner recognize zero/sign/any-extended versions of ROTR and ROTL
patterns. This was motivated by the X86/rotate.ll testcase, which should now
generate code for other platforms (and soon-to-come platforms.) Rewrote code
slightly to make it easier to read.
llvm-svn: 35605
2007-04-02 21:36:32 +00:00
Dale Johannesen
4bbd2eefba
Fix incorrect combination of different loads. Reenable zext-over-truncate
...
combination.
llvm-svn: 35517
2007-03-30 21:38:07 +00:00
Evan Cheng
ccee35fd0d
Disable load width reduction xform of variant (zext (truncate load x)) for
...
big endian targets until llvm-gcc build issue has been resolved.
llvm-svn: 35449
2007-03-29 07:56:46 +00:00
Evan Cheng
8275f0e0af
SIGN_EXTEND_INREG requires one extra operand, a ValueType node.
...
llvm-svn: 35350
2007-03-26 07:12:51 +00:00
Evan Cheng
b7051f596a
Adjust offset to compensate for big endian machines.
...
llvm-svn: 35293
2007-03-24 00:02:43 +00:00
Evan Cheng
a883b58caf
Make sure SEXTLOAD of the specific type is supported on the target.
...
llvm-svn: 35289
2007-03-23 22:13:36 +00:00
Evan Cheng
e2f5f24e8e
Also replace uses of SRL if that's also folded during ReduceLoadWidth().
...
llvm-svn: 35286
2007-03-23 20:55:21 +00:00
Evan Cheng
a824e79f06
A couple of bug fixes for reducing load width xform:
...
1. Address offset is in bytes.
2. Make sure truncate node uses are replaced with new load.
llvm-svn: 35274
2007-03-23 02:16:52 +00:00
Evan Cheng
464dc9b74c
More opportunities to reduce load size.
...
llvm-svn: 35254
2007-03-22 01:54:19 +00:00
Evan Cheng
d63baead9b
fold (truncate (srl (load x), c)) -> (smaller load (x+c/vt bits))
...
llvm-svn: 35239
2007-03-21 20:14:05 +00:00
Evan Cheng
8a1d09d079
Avoid combining indexed load further.
...
llvm-svn: 35005
2007-03-07 08:07:03 +00:00
Chris Lattner
47206667c0
fold away addc nodes when we know there cannot be a carry-out.
...
llvm-svn: 34913
2007-03-04 20:40:38 +00:00
Chris Lattner
2dcc6e7f58
generalize
...
llvm-svn: 34910
2007-03-04 20:08:45 +00:00
Chris Lattner
e2e13caeb2
canonicalize constants to the RHS of addc/adde. If nothing uses the carry out of
...
addc, turn it into add.
This allows us to compile:
long long test(long long A, unsigned B) {
return (A + ((long long)B << 32)) & 123;
}
into:
_test:
movl $123, %eax
andl 4(%esp), %eax
xorl %edx, %edx
ret
instead of:
_test:
xorl %edx, %edx
movl %edx, %eax
addl 4(%esp), %eax ;; add of zero
andl $123, %eax
ret
llvm-svn: 34909
2007-03-04 20:03:15 +00:00
Chris Lattner
fce448f856
Fold (sext (truncate x)) more aggressively, by avoiding creation of a
...
sextinreg if not needed. This is useful in two cases: before legalize,
it avoids creating a sextinreg that will be trivially removed. After legalize
if the target doesn't support sextinreg, the trunc/sext would not have been
removed before.
llvm-svn: 34621
2007-02-26 03:13:59 +00:00
Evan Cheng
92658d5648
Move SimplifySetCC to TargetLowering and allow it to be shared with legalizer.
...
llvm-svn: 34065
2007-02-08 22:13:59 +00:00
Evan Cheng
00a640dbe0
Fix for PR1108: type of insert_vector_elt index operand is PtrVT, not MVT::i32.
...
llvm-svn: 33398
2007-01-20 10:10:26 +00:00
Evan Cheng
9201100b29
Remove this xform:
...
(shl (add x, c1), c2) -> (add (shl x, c2), c1<<c2)
Replace it with:
(add (shl (add x, c1), c2), ) -> (add (add (shl x, c2), c1<<c2), )
This fixes test/CodeGen/ARM/smul.ll
llvm-svn: 33361
2007-01-19 17:51:44 +00:00
Chris Lattner
4dc4489286
Fix PR1114 and CodeGen/Generic/2007-01-15-LoadSelectCycle.ll by being
...
careful when folding "c ? load p : load q" that C doesn't reach either load.
If so, folding this into load (c ? p : q) will induce a cycle in the graph.
llvm-svn: 33251
2007-01-16 05:59:59 +00:00
Chris Lattner
f70c5cd5db
add options to view the dags before the first or second pass of dag combine.
...
llvm-svn: 33249
2007-01-16 04:55:25 +00:00
Chris Lattner
0199fd6d59
Implement some trivial FP foldings when -enable-unsafe-fp-math is specified.
...
This implements CodeGen/PowerPC/unsafe-math.ll
llvm-svn: 33024
2007-01-08 23:04:05 +00:00
Chris Lattner
aee775a6b7
Eliminate static ctors from Statistics
...
llvm-svn: 32698
2006-12-19 22:41:21 +00:00
Evan Cheng
28cf4277bb
Cannot combine an indexed load / store any further.
...
llvm-svn: 32629
2006-12-16 06:25:23 +00:00
Jim Laskey
26df19ace6
This code was usurping the sextload expand in teh legalizer. Just make
...
sure the right conditions are checked.
llvm-svn: 32611
2006-12-15 21:38:30 +00:00
Chris Lattner
b7524b6d0e
make this code more aggressive about turning store fpimm into store int imm.
...
This is not sufficient to fix X86/store-fp-constant.ll
llvm-svn: 32465
2006-12-12 04:16:14 +00:00
Evan Cheng
218369881f
Don't convert store double C, Ptr to store long C, Ptr if i64 is not a legal type.
...
llvm-svn: 32434
2006-12-11 17:25:19 +00:00
Nate Begeman
8e20c760fa
Move something that should be in the dag combiner from the legalizer to the
...
dag combiner.
llvm-svn: 32431
2006-12-11 02:23:46 +00:00
Chris Lattner
d9f04e4875
Fix CodeGen/PowerPC/2006-12-07-SelectCrash.ll on PPC64
...
llvm-svn: 32336
2006-12-07 22:36:47 +00:00
Bill Wendling
22e978a736
Removing even more <iostream> includes.
...
llvm-svn: 32320
2006-12-07 20:04:42 +00:00
Chris Lattner
700b873130
Detemplatize the Statistic class. The only type it is instantiated with
...
is 'unsigned'.
llvm-svn: 32279
2006-12-06 17:46:33 +00:00
Chris Lattner
3da631f29a
For better or worse, load from i1 is assumed to be zero extended. Do not
...
form a load from i1 from larger loads that may not be zext'd.
llvm-svn: 31933
2006-11-27 04:40:53 +00:00
Chris Lattner
3676a994ca
Fix PR1011 and CodeGen/Generic/2006-11-20-DAGCombineCrash.ll
...
llvm-svn: 31878
2006-11-20 18:05:46 +00:00
Evan Cheng
f64da389f8
Fix an incorrectly inverted condition.
...
llvm-svn: 31773
2006-11-16 00:08:20 +00:00
Chris Lattner
a0a8003f59
disallow preinc of a frameindex. This is not profitable and causes 2-addr
...
pass to explode. This fixes a bunch of llc-beta failures on ppc last night.
llvm-svn: 31661
2006-11-11 01:00:15 +00:00
Chris Lattner
eabc15c1d8
reduce indentation by using early exits. No functionality change.
...
llvm-svn: 31660
2006-11-11 00:56:29 +00:00
Chris Lattner
ffad2166e1
move big chunks of code out-of-line, no functionality change.
...
llvm-svn: 31658
2006-11-11 00:39:41 +00:00
Chris Lattner
4eac5f59e6
Fix a dag combiner bug exposed by my recent instcombine patch. This fixes
...
CodeGen/Generic/2006-11-10-DAGCombineMiscompile.ll and PPC gsm/toast
llvm-svn: 31644
2006-11-10 21:37:15 +00:00
Evan Cheng
13440b025c
When forming a pre-indexed store, make sure ptr isn't the same or is a pred of value being stored. It would cause a cycle.
...
llvm-svn: 31631
2006-11-10 08:28:11 +00:00
Evan Cheng
6878378390
Don't attempt expensive pre-/post- indexed dag combine if target does not support them.
...
llvm-svn: 31598
2006-11-09 19:10:46 +00:00
Evan Cheng
b15000736c
Rename ISD::MemOpAddrMode to ISD::MemIndexedMode
...
llvm-svn: 31595
2006-11-09 17:55:04 +00:00
Evan Cheng
b58e06bc9e
getPostIndexedAddressParts change: passes in load/store instead of its loaded / stored VT.
...
llvm-svn: 31584
2006-11-09 04:29:46 +00:00
Evan Cheng
85e54223cd
Match more post-indexed ops.
...
llvm-svn: 31569
2006-11-08 20:27:27 +00:00
Jim Laskey
61feeb90f9
Remove redundant <cmath>.
...
llvm-svn: 31561
2006-11-08 19:16:44 +00:00
Evan Cheng
0303cb9b33
- When performing pre-/post- indexed load/store transformation, do not worry
...
about whether the new base ptr would be live below the load/store. Let two
address pass split it back to non-indexed ops.
- Minor tweaks / fixes.
llvm-svn: 31544
2006-11-08 08:30:28 +00:00
Evan Cheng
6072435756
Fixed a minor bug preventing some pre-indexed load / store transformation.
...
llvm-svn: 31543
2006-11-08 06:56:05 +00:00
Evan Cheng
d48f7dd250
Fix a obscure post-indexed load / store dag combine bug.
...
llvm-svn: 31537
2006-11-08 02:38:55 +00:00
Evan Cheng
60c6846d21
Add post-indexed load / store transformations.
...
llvm-svn: 31498
2006-11-07 09:03:05 +00:00
Evan Cheng
eb99bd736a
Add comment.
...
llvm-svn: 31473
2006-11-06 08:14:30 +00:00
Jeff Cohen
7d6f3db3e2
Unbreak VC++ build.
...
llvm-svn: 31464
2006-11-05 19:31:28 +00:00
Evan Cheng
33157700d9
Added pre-indexed store support.
...
llvm-svn: 31459
2006-11-05 09:31:14 +00:00
Evan Cheng
1dfd26a151
Rename
...
llvm-svn: 31413
2006-11-03 07:21:16 +00:00
Reid Spencer
52f958741a
Remove dead variable. Fix 80 column violations.
...
llvm-svn: 31412
2006-11-03 03:30:34 +00:00
Evan Cheng
357017f4a9
Added DAG combiner transformation to generate pre-indexed loads.
...
llvm-svn: 31410
2006-11-03 03:06:21 +00:00
Reid Spencer
de46e48420
For PR786:
...
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.
llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Jim Laskey
55e4dcad36
Add option for controlling inclusion of global AA.
...
llvm-svn: 31040
2006-10-18 19:08:31 +00:00
Jim Laskey
a15b0ebb5e
Use global info for alias analysis.
...
llvm-svn: 31035
2006-10-18 12:29:57 +00:00
Chris Lattner
327b88b102
Fix CodeGen/PowerPC/2006-10-17-brcc-miscompile.ll
...
llvm-svn: 31019
2006-10-17 21:24:15 +00:00
Jim Laskey
e7d2c24a7d
Make it simplier to dump DAGs while in DAGCombiner. Remove a nasty optimization.
...
llvm-svn: 31009
2006-10-17 19:33:52 +00:00
Evan Cheng
1e3a39cd08
Make sure operand does have size and element type operands.
...
llvm-svn: 30999
2006-10-17 17:06:35 +00:00
Evan Cheng
f3ae00a64a
Be careful when looking through a vbit_convert. Optimizing this:
...
(vector_shuffle
(vbitconvert (vbuildvector (copyfromreg v4f32), 1, v4f32), 4, f32),
(undef, undef, undef, undef), (0, 0, 0, 0), 4, f32)
to the
vbitconvert
is a very bad idea.
llvm-svn: 30989
2006-10-16 22:49:37 +00:00
Jim Laskey
dcb2b83886
Pass AliasAnalysis thru to DAGCombiner.
...
llvm-svn: 30984
2006-10-16 20:52:31 +00:00
Jim Laskey
3bf4f3bd60
Tidy up after truncstore changes.
...
llvm-svn: 30961
2006-10-14 12:14:27 +00:00
Chris Lattner
6a1b2de8c4
Make sure that the node returned by SimplifySetCC is added to the worklist
...
so that it can be deleted if unused.
llvm-svn: 30955
2006-10-14 03:52:46 +00:00
Chris Lattner
0626bd2fbc
fold setcc of a setcc.
...
llvm-svn: 30953
2006-10-14 01:02:29 +00:00
Chris Lattner
bd9acad805
When SimplifySetCC was moved to the DAGCombiner, it was never removed from
...
SelectionDAG and it has since bitrotted. Remove the copy from SelectionDAG.
Next, remove the constant folding piece of DAGCombiner::SimplifySetCC into
a new FoldSetCC method which can be used by getNode() and SimplifySetCC.
This fixes obscure bugs.
llvm-svn: 30952
2006-10-14 00:41:01 +00:00
Jim Laskey
dcf983ce41
Reduce the workload by not adding chain users to work list.
...
llvm-svn: 30948
2006-10-13 23:32:28 +00:00
Evan Cheng
ab51cf2e78
Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode.
...
llvm-svn: 30945
2006-10-13 21:14:26 +00:00
Chris Lattner
d0620d2773
Lower X%C into X/C+stuff. This allows the 'division by a constant' logic to
...
apply to rems as well as divs. This fixes PR945 and speeds up ReedSolomon
from 14.57s to 10.90s (which is now faster than gcc).
It compiles CodeGen/X86/rem.ll into:
_test1:
subl $4, %esp
movl %esi, (%esp)
movl $2155905153, %ecx
movl 8(%esp), %esi
movl %esi, %eax
imull %ecx
addl %esi, %edx
movl %edx, %eax
shrl $31, %eax
sarl $7, %edx
addl %eax, %edx
imull $255, %edx, %eax
subl %eax, %esi
movl %esi, %eax
movl (%esp), %esi
addl $4, %esp
ret
_test2:
movl 4(%esp), %eax
movl %eax, %ecx
sarl $31, %ecx
shrl $24, %ecx
addl %eax, %ecx
andl $4294967040, %ecx
subl %ecx, %eax
ret
_test3:
subl $4, %esp
movl %esi, (%esp)
movl $2155905153, %ecx
movl 8(%esp), %esi
movl %esi, %eax
mull %ecx
shrl $7, %edx
imull $255, %edx, %eax
subl %eax, %esi
movl %esi, %eax
movl (%esp), %esi
addl $4, %esp
ret
instead of div/idiv instructions.
llvm-svn: 30920
2006-10-12 20:58:32 +00:00
Chris Lattner
2e33fb453b
add a minor dag combine noticed when looking at PR945
...
llvm-svn: 30915
2006-10-12 20:23:19 +00:00
Jim Laskey
df2ccc395e
D'oh - need to use the rigth kind of store.
...
llvm-svn: 30903
2006-10-12 15:22:24 +00:00
Jim Laskey
a13b9c7aa4
Alias analysis of TRUNCSTORE.
...
llvm-svn: 30889
2006-10-11 18:55:16 +00:00
Jim Laskey
0f7c328ae7
Handle aliasing of loadext.
...
llvm-svn: 30883
2006-10-11 17:47:52 +00:00
Jim Laskey
08edf332ed
Fix regression in combiner alias analysis.
...
llvm-svn: 30880
2006-10-11 13:47:09 +00:00
Evan Cheng
d35734bd1f
Naming consistency.
...
llvm-svn: 30878
2006-10-11 07:10:22 +00:00
Evan Cheng
e71fe34d75
Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
...
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Chris Lattner
5ab6d8b3fc
Eliminate more token factors by taking advantage of transitivity:
...
if TF depends on A and B, and A depends on B, TF just needs to depend on
A. With Jim's alias-analysis stuff enabled, this compiles the testcase in
PR892 into:
__Z4test3Val:
subl $44, %esp
call L__Z3foov$stub
movl %edx, 28(%esp)
movl %eax, 32(%esp)
movl %eax, 24(%esp)
movl %edx, 36(%esp)
movl 52(%esp), %ecx
movl %ecx, 4(%esp)
movl %eax, 8(%esp)
movl %edx, 12(%esp)
movl 48(%esp), %eax
movl %eax, (%esp)
call L__Z3bar3ValS_$stub
addl $44, %esp
ret
instead of:
__Z4test3Val:
subl $44, %esp
call L__Z3foov$stub
movl %eax, 24(%esp)
movl %edx, 28(%esp)
movl 24(%esp), %eax
movl %eax, 32(%esp)
movl 28(%esp), %eax
movl %eax, 36(%esp)
movl 32(%esp), %eax
movl 36(%esp), %ecx
movl 52(%esp), %edx
movl %edx, 4(%esp)
movl %eax, 8(%esp)
movl %ecx, 12(%esp)
movl 48(%esp), %eax
movl %eax, (%esp)
call L__Z3bar3ValS_$stub
addl $44, %esp
ret
llvm-svn: 30821
2006-10-08 22:57:01 +00:00
Jim Laskey
0463e08005
Combiner alias analysis passes Multisource (release-asserts.)
...
llvm-svn: 30818
2006-10-07 23:37:56 +00:00
Evan Cheng
df9ac47e5e
Make use of getStore().
...
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Jim Laskey
6549d22ef9
Alias analysis code clean ups.
...
llvm-svn: 30753
2006-10-05 15:07:25 +00:00
Jim Laskey
708d0db2d8
More extensive alias analysis.
...
llvm-svn: 30721
2006-10-04 16:53:27 +00:00
Evan Cheng
5d9fd977d3
Combine ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD into ISD::LOADX. Add an
...
extra operand to LOADX to specify the exact value extension type.
llvm-svn: 30714
2006-10-04 00:56:09 +00:00
Jim Laskey
60832693a7
Load chain check is not needed
...
llvm-svn: 30613
2006-09-26 17:44:58 +00:00
Jim Laskey
dde51671e5
Chain can be any operand
...
llvm-svn: 30611
2006-09-26 09:32:41 +00:00
Jim Laskey
5f3e0af9d0
Wrong size for load
...
llvm-svn: 30610
2006-09-26 08:14:06 +00:00
Jim Laskey
b4a864d533
Can't move a load node if it's chain is not used.
...
llvm-svn: 30609
2006-09-26 07:37:42 +00:00
Jim Laskey
7aa0638aa9
Accidental enable of bad code
...
llvm-svn: 30601
2006-09-25 21:11:32 +00:00
Jim Laskey
b5534e5c28
Fix chain dropping in load and drop unused stores in ret blocks.
...
llvm-svn: 30600
2006-09-25 19:32:58 +00:00
Jim Laskey
d07be232ba
Core antialiasing for load and store.
...
llvm-svn: 30597
2006-09-25 16:29:54 +00:00
Evan Cheng
449a0c7e33
Make it work for DAG combine of multi-value nodes.
...
llvm-svn: 30573
2006-09-21 19:04:05 +00:00
Jim Laskey
35f7eebb49
core corrections
...
llvm-svn: 30570
2006-09-21 17:35:47 +00:00
Jim Laskey
5d19d59017
Basic "in frame" alias analysis.
...
llvm-svn: 30568
2006-09-21 16:28:59 +00:00
Chris Lattner
082db3f9aa
fold (aext (and (trunc x), cst)) -> (and x, cst).
...
llvm-svn: 30561
2006-09-21 06:40:43 +00:00
Chris Lattner
fa9f92cf65
Check the right value type. This fixes 186.crafty on x86
...
llvm-svn: 30560
2006-09-21 06:17:39 +00:00
Chris Lattner
8d8a3bf9c9
Compile:
...
int %test(ulong *%tmp) {
%tmp = load ulong* %tmp ; <ulong> [#uses=1]
%tmp.mask = shr ulong %tmp, ubyte 50 ; <ulong> [#uses=1]
%tmp.mask = cast ulong %tmp.mask to ubyte
%tmp2 = and ubyte %tmp.mask, 3 ; <ubyte> [#uses=1]
%tmp2 = cast ubyte %tmp2 to int ; <int> [#uses=1]
ret int %tmp2
}
to:
_test:
movl 4(%esp), %eax
movl 4(%eax), %eax
shrl $18, %eax
andl $3, %eax
ret
instead of:
_test:
movl 4(%esp), %eax
movl 4(%eax), %eax
shrl $18, %eax
# TRUNCATE movb %al, %al
andb $3, %al
movzbl %al, %eax
ret
llvm-svn: 30558
2006-09-21 06:14:31 +00:00
Chris Lattner
a31f0a622b
Generalize (zext (truncate x)) and (sext (truncate x)) folding to work when
...
the src/dst are not the same size. This catches things like "truncate
32-bit X to 8 bits, then zext to 16", which happens a bit on X86.
llvm-svn: 30557
2006-09-21 06:00:20 +00:00
Chris Lattner
c8cd62d381
Compile:
...
int test3(int a, int b) { return (a < 0) ? a : 0; }
to:
_test3:
srawi r2, r3, 31
and r3, r2, r3
blr
instead of:
_test3:
cmpwi cr0, r3, 1
li r2, 0
blt cr0, LBB2_2 ;entry
LBB2_1: ;entry
mr r3, r2
LBB2_2: ;entry
blr
This implements: PowerPC/select_lt0.ll:seli32_a_a
llvm-svn: 30517
2006-09-20 06:41:35 +00:00
Chris Lattner
8746e2cd57
Fold the full generality of (any_extend (truncate x))
...
llvm-svn: 30514
2006-09-20 06:29:17 +00:00
Chris Lattner
8b68decb27
Two things:
...
1. teach SimplifySetCC that '(srl (ctlz x), 5) == 0' is really x != 0.
2. Teach visitSELECT_CC to use SimplifySetCC instead of calling it and
ignoring the result. This allows us to compile:
bool %test(ulong %x) {
%tmp = setlt ulong %x, 4294967296
ret bool %tmp
}
to:
_test:
cntlzw r2, r3
cmplwi cr0, r3, 1
srwi r2, r2, 5
li r3, 0
beq cr0, LBB1_2 ;
LBB1_1: ;
mr r3, r2
LBB1_2: ;
blr
instead of:
_test:
addi r2, r3, -1
cntlzw r2, r2
cntlzw r3, r3
srwi r2, r2, 5
cmplwi cr0, r2, 0
srwi r2, r3, 5
li r3, 0
bne cr0, LBB1_2 ;
LBB1_1: ;
mr r3, r2
LBB1_2: ;
blr
This isn't wonderful, but it's an improvement.
llvm-svn: 30513
2006-09-20 06:19:26 +00:00
Chris Lattner
46d710e6ea
Fold (X & C1) | (Y & C2) -> (X|Y) & C3 when possible.
...
This implements CodeGen/X86/and-or-fold.ll
llvm-svn: 30379
2006-09-14 21:11:37 +00:00
Chris Lattner
97614c86ce
Split rotate matching code out to its own function. Make it stronger, by
...
matching things like ((x >> c1) & c2) | ((x << c3) & c4) to (rot x, c5) & c6
llvm-svn: 30376
2006-09-14 20:50:57 +00:00
Evan Cheng
31305c45da
DAG combiner fix for rotates. Previously the outer-most condition checks
...
for ROTL availability. This prevents it from forming ROTR for targets that
has ROTR only.
llvm-svn: 29997
2006-08-31 07:41:12 +00:00
Evan Cheng
e5570a4c3f
Move isCommutativeBinOp from SelectionDAG.cpp and DAGCombiner.cpp out. Make it a static method of SelectionDAG.
...
llvm-svn: 29951
2006-08-29 06:42:35 +00:00
Chris Lattner
3d27be1333
s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
...
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Chris Lattner
6f22ebd8be
change internal impl of dag combiner so that calls to CombineTo never have to
...
make a temporary vector.
llvm-svn: 29618
2006-08-11 17:56:38 +00:00
Chris Lattner
a2f4086828
Change one ReplaceAllUsesWith method to take an array of operands to replace
...
instead of a vector of operands.
llvm-svn: 29616
2006-08-11 17:46:28 +00:00
Chris Lattner
c24a1d3093
Start eliminating temporary vectors used to create DAG nodes. Instead, pass
...
in the start of an array and a count of operands where applicable. In many
cases, the number of operands is known, so this static array can be allocated
on the stack, avoiding the heap. In many other cases, a SmallVector can be
used, which has the same benefit in the common cases.
I updated a lot of code calling getNode that takes a vector, but ran out of
time. The rest of the code should be updated, and these methods should be
removed.
We should also do the same thing to eliminate the methods that take a
vector of MVT::ValueTypes.
It would be extra nice to convert the dagiselemitter to avoid creating vectors
for operands when calling getTargetNode.
llvm-svn: 29566
2006-08-08 02:23:42 +00:00
Reid Spencer
658b9476f0
Initialize some variables the compiler warns about.
...
llvm-svn: 29277
2006-07-25 20:44:41 +00:00
Evan Cheng
7c970b98d0
If a shuffle is a splat, check if the argument is a build_vector with all elements being the same. If so, return the argument.
...
llvm-svn: 29242
2006-07-21 08:25:53 +00:00
Evan Cheng
8472e0c4af
If a shuffle is unary, i.e. one of the vector argument is not needed, turn the
...
operand into a undef and adjust mask accordingly.
llvm-svn: 29232
2006-07-20 22:44:41 +00:00
Andrew Lenharth
ec104a2b41
80 cols
...
llvm-svn: 29221
2006-07-20 17:43:27 +00:00
Andrew Lenharth
c496b418b5
Reduce number of exported symbols
...
llvm-svn: 29220
2006-07-20 17:28:38 +00:00
Chris Lattner
54a34cd20b
Mark these two classes as hidden, shrinking libllbmgcc.dylib by 25K
...
llvm-svn: 28970
2006-06-28 21:58:30 +00:00
Andrew Lenharth
0e57b2cb92
Start on my todo list
...
llvm-svn: 28752
2006-06-12 16:07:18 +00:00
Evan Cheng
64d2846017
visitVBinOp: Can't fold divide by zero!
...
llvm-svn: 28584
2006-05-31 06:08:35 +00:00
Chris Lattner
8f872d2091
Fix a nasty dag combiner bug that caused nondeterminstic crashes (MY FAVORITE!):
...
SimplifySelectOps would eliminate a Select, delete it, then return true.
The clients would see that it did something and return null.
The top level would see a null return, and decide that nothing happened,
proceeding to process the node in other ways: boom.
The fix is simple: clients of SimplifySelectOps should return the select
node itself.
In order to catch really obnoxious boogs like this in the future, add an
assert that nodes are not deleted. We do this by checking for a sentry node
type that the SDNode dtor sets when a node is destroyed.
llvm-svn: 28514
2006-05-27 00:43:02 +00:00
Andrew Lenharth
1dc9ec5874
Move this code to a common place
...
llvm-svn: 28329
2006-05-16 17:42:15 +00:00
Chris Lattner
afe72481f6
Comment out dead variables
...
llvm-svn: 28252
2006-05-12 17:57:54 +00:00
Chris Lattner
66adee93aa
Two simplifications for token factor nodes: simplify tf(x,x) -> x.
...
simplify tf(x,y,y,z) -> tf(x,y,z).
llvm-svn: 28233
2006-05-12 05:01:37 +00:00
Evan Cheng
2c74848af1
Debugging info
...
llvm-svn: 28200
2006-05-09 06:55:15 +00:00
Chris Lattner
446e1ef26a
Make the case I just checked in stronger. Now we compile this:
...
short test2(short X, short x) {
int Y = (short)(X+x);
return Y >> 1;
}
to:
_test2:
add r2, r3, r4
extsh r2, r2
srawi r3, r2, 1
blr
instead of:
_test2:
add r2, r3, r4
extsh r2, r2
srwi r2, r2, 1
extsh r3, r2
blr
llvm-svn: 28175
2006-05-08 21:18:59 +00:00
Chris Lattner
29062da0ac
Implement and_sext.ll:test3, generating:
...
_test4:
srawi r3, r3, 16
blr
instead of:
_test4:
srwi r2, r3, 16
extsh r3, r2
blr
for:
short test4(unsigned X) {
return (X >> 16);
}
llvm-svn: 28174
2006-05-08 20:59:41 +00:00
Chris Lattner
2935d8190c
Compile this:
...
short test4(unsigned X) {
return (X >> 16);
}
to:
_test4:
movl 4(%esp), %eax
sarl $16, %eax
ret
instead of:
_test4:
movl $-65536, %eax
andl 4(%esp), %eax
sarl $16, %eax
ret
llvm-svn: 28171
2006-05-08 20:51:54 +00:00
Nate Begeman
e5ce5bb6da
Fix PR772
...
llvm-svn: 28161
2006-05-08 01:35:01 +00:00
Chris Lattner
7e7bcf3a54
Simplify some code, add a couple minor missed folds
...
llvm-svn: 28152
2006-05-06 23:06:26 +00:00
Chris Lattner
2a4d7b845b
remove cases handled elsewhere
...
llvm-svn: 28150
2006-05-06 22:43:44 +00:00
Chris Lattner
1ecb2a2dac
Use the new TargetLowering::ComputeNumSignBits method to eliminate
...
sign_extend_inreg operations. Though ComputeNumSignBits is still rudimentary,
this is enough to compile this:
short test(short X, short x) {
int Y = X+x;
return (Y >> 1);
}
short test2(short X, short x) {
int Y = (short)(X+x);
return Y >> 1;
}
into:
_test:
add r2, r3, r4
srawi r3, r2, 1
blr
_test2:
add r2, r3, r4
extsh r2, r2
srawi r3, r2, 1
blr
instead of:
_test:
add r2, r3, r4
srawi r2, r2, 1
extsh r3, r2
blr
_test2:
add r2, r3, r4
extsh r2, r2
srawi r2, r2, 1
extsh r3, r2
blr
llvm-svn: 28146
2006-05-06 09:30:03 +00:00
Chris Lattner
907e392dba
Fold trunc(any_ext). This gives stuff like:
...
27,28c27
< movzwl %di, %edi
< movl %edi, %ebx
---
> movw %di, %bx
llvm-svn: 28137
2006-05-05 22:56:26 +00:00
Chris Lattner
57f8c5a387
Shrink shifts when possible.
...
llvm-svn: 28136
2006-05-05 22:53:17 +00:00
Chris Lattner
3d26577396
Fold (fpext (load x)) -> (extload x)
...
llvm-svn: 28130
2006-05-05 21:34:35 +00:00
Chris Lattner
25a5283a86
Fold some common code.
...
llvm-svn: 28124
2006-05-05 06:32:04 +00:00
Chris Lattner
002ee91457
Implement:
...
// fold (and (sext x), (sext y)) -> (sext (and x, y))
// fold (or (sext x), (sext y)) -> (sext (or x, y))
// fold (xor (sext x), (sext y)) -> (sext (xor x, y))
// fold (and (aext x), (aext y)) -> (aext (and x, y))
// fold (or (aext x), (aext y)) -> (aext (or x, y))
// fold (xor (aext x), (aext y)) -> (aext (xor x, y))
llvm-svn: 28123
2006-05-05 06:31:05 +00:00
Chris Lattner
5ac4293606
Pull and through and/or/xor. This compiles some bitfield code to:
...
mov EAX, DWORD PTR [ESP + 4]
mov ECX, DWORD PTR [EAX]
mov EDX, ECX
add EDX, EDX
or EDX, ECX
and EDX, -2147483648
and ECX, 2147483647
or EDX, ECX
mov DWORD PTR [EAX], EDX
ret
instead of:
sub ESP, 4
mov DWORD PTR [ESP], ESI
mov EAX, DWORD PTR [ESP + 8]
mov ECX, DWORD PTR [EAX]
mov EDX, ECX
add EDX, EDX
mov ESI, ECX
and ESI, -2147483648
and EDX, -2147483648
or EDX, ESI
and ECX, 2147483647
or EDX, ECX
mov DWORD PTR [EAX], EDX
mov ESI, DWORD PTR [ESP]
add ESP, 4
ret
llvm-svn: 28122
2006-05-05 06:10:43 +00:00
Chris Lattner
812646aa0c
Implement a variety of simplifications for ANY_EXTEND.
...
llvm-svn: 28121
2006-05-05 05:58:59 +00:00
Chris Lattner
8d6fc20181
Factor some code, add these transformations:
...
// fold (and (trunc x), (trunc y)) -> (trunc (and x, y))
// fold (or (trunc x), (trunc y)) -> (trunc (or x, y))
// fold (xor (trunc x), (trunc y)) -> (trunc (xor x, y))
llvm-svn: 28120
2006-05-05 05:51:50 +00:00
Chris Lattner
2b48a94413
Remove a bogus transformation. This fixes SingleSource/UnitTests/2006-01-23-InitializedBitField.c
...
with some changes I have to the new CFE.
llvm-svn: 28022
2006-04-28 23:33:20 +00:00
Chris Lattner
662e940f73
Fix a couple more memory issues
...
llvm-svn: 27930
2006-04-21 15:32:26 +00:00
Chris Lattner
cc47ab3305
Fix a really subtle and obnoxious memory bug that caused issues with an
...
llvm-gcc4 boostrap. Whenever a node is deleted by the dag combiner, it
*must* be returned by the visit function, or the dag combiner will not
know that the node has been processed (and will, e.g., send it to the
target dag combine xforms).
llvm-svn: 27922
2006-04-20 23:55:59 +00:00
Evan Cheng
a320abc494
Turn a VAND into a VECTOR_SHUFFLE is applicable.
...
DAG combiner can turn a VAND V, <-1, 0, -1, -1>, i.e. vector clear elements,
into a vector shuffle with a zero vector. It only does so when TLI tells it
the xform is profitable.
llvm-svn: 27874
2006-04-20 08:56:16 +00:00
Chris Lattner
e1401e3610
Canonicalize vvector_shuffle(x,x) -> vvector_shuffle(x,undef) to enable patterns
...
to match again :)
llvm-svn: 27533
2006-04-08 05:34:25 +00:00
Chris Lattner
098c01e94e
Codegen shufflevector as VVECTOR_SHUFFLE
...
llvm-svn: 27529
2006-04-08 04:15:24 +00:00
Evan Cheng
613996c55e
1. If both vector operands of a vector_shuffle are undef, turn it into an undef.
...
2. A shuffle mask element can also be an undef.
llvm-svn: 27472
2006-04-06 23:20:43 +00:00
Chris Lattner
4ea52cac01
Do not create ZEXTLOAD's unless we are before legalize or the operation is
...
legal.
llvm-svn: 27402
2006-04-04 17:39:18 +00:00
Chris Lattner
e1e3adf802
Add a missing check, this fixes UnitTests/Vector/sumarray.c
...
llvm-svn: 27375
2006-04-03 17:29:28 +00:00
Chris Lattner
04c00fc844
Add a missing check, which broke a bunch of vector tests.
...
llvm-svn: 27374
2006-04-03 17:21:50 +00:00
Andrew Lenharth
94f012f606
back this out
...
llvm-svn: 27367
2006-04-03 03:16:50 +00:00
Andrew Lenharth
015eaf5f33
This should be a win of every arch
...
llvm-svn: 27364
2006-04-02 21:42:45 +00:00
Chris Lattner
4993249a04
Add a little dag combine to compile this:
...
int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
entry:
%tmp1 = load <4 x float>* %in ; <<4 x float>> [#uses=1]
%tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 ) ; <int> [#uses=1]
%tmp = seteq int %tmp, 0 ; <bool> [#uses=1]
%tmp3 = cast bool %tmp to int ; <int> [#uses=1]
ret int %tmp3
}
into this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
mtspr 256, r2
blr
instead of this:
_AreSecondAndThirdElementsBothNegative:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI1_0)
lis r5, ha16(LCPI1_0)
lvx v0, 0, r3
lvx v1, r5, r4
vcmpgefp. v0, v1, v0
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
xori r3, r3, 1
cntlzw r3, r3
srwi r3, r3, 5
mtspr 256, r2
blr
llvm-svn: 27356
2006-04-02 06:11:11 +00:00
Chris Lattner
0442a18758
Constant fold all of the vector binops. This allows us to compile this:
...
"vector unsigned char mergeLowHigh = (vector unsigned char)
( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"
aka:
void %test2(<16 x sbyte>* %P) {
store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
ret void
}
into this:
_test2:
mfspr r2, 256
oris r4, r2, 32768
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
lvx v0, r5, r4
stvx v0, 0, r3
mtspr 256, r2
blr
instead of this:
_test2:
mfspr r2, 256
oris r4, r2, 49152
mtspr 256, r4
li r4, lo16(LCPI2_0)
lis r5, ha16(LCPI2_0)
vspltisb v0, 8
lvx v1, r5, r4
vxor v0, v1, v0
stvx v0, 0, r3
mtspr 256, r2
blr
... which occurs here:
http://developer.apple.com/hardware/ve/calcspeed.html
llvm-svn: 27343
2006-04-02 03:25:57 +00:00
Chris Lattner
e4e64b6b85
Implement constant folding of bit_convert of arbitrary constant vbuild_vector nodes.
...
llvm-svn: 27341
2006-04-02 02:53:43 +00:00
Chris Lattner
39dcf1a9e2
Delete identity shuffles, implementing CodeGen/Generic/vector-identity-shuffle.ll
...
llvm-svn: 27317
2006-03-31 22:16:43 +00:00
Chris Lattner
7e30af3887
Remove dead *extloads. This allows us to codegen vector.ll:test_extract_elt
...
to:
test_extract_elt:
alloc r3 = ar.pfs,0,1,0,0
adds r8 = 12, r32
;;
ldfs f8 = [r8]
mov ar.pfs = r3
br.ret.sptk.many rp
instead of:
test_extract_elt:
alloc r3 = ar.pfs,0,1,0,0
adds r8 = 28, r32
adds r9 = 24, r32
adds r10 = 20, r32
adds r11 = 16, r32
;;
ldfs f6 = [r8]
;;
ldfs f6 = [r9]
adds r8 = 12, r32
adds r9 = 8, r32
adds r14 = 4, r32
;;
ldfs f6 = [r10]
;;
ldfs f6 = [r11]
ldfs f8 = [r8]
;;
ldfs f6 = [r9]
;;
ldfs f6 = [r14]
;;
ldfs f6 = [r32]
mov ar.pfs = r3
br.ret.sptk.many rp
llvm-svn: 27297
2006-03-31 18:10:41 +00:00
Chris Lattner
2d8551c85b
Delete dead loads in the dag. This allows us to compile
...
vector.ll:test_extract_elt2 into:
_test_extract_elt2:
lfd f1, 32(r3)
blr
instead of:
_test_extract_elt2:
lfd f0, 56(r3)
lfd f0, 48(r3)
lfd f0, 40(r3)
lfd f1, 32(r3)
lfd f0, 24(r3)
lfd f0, 16(r3)
lfd f0, 8(r3)
lfd f0, 0(r3)
blr
llvm-svn: 27296
2006-03-31 18:06:18 +00:00
Chris Lattner
20e619fba3
When building a VVECTOR_SHUFFLE node from extract_element operations, make
...
sure to build it as SHUFFLE(X, undef, mask), not SHUFFLE(X, X, mask).
The later is not canonical form, and prevents the PPC splat pattern from
matching. For a particular splat, we go from generating this:
li r10, lo16(LCPI1_0)
lis r11, ha16(LCPI1_0)
lvx v3, r11, r10
vperm v3, v2, v2, v3
to generating:
vspltw v3, v2, 3
llvm-svn: 27236
2006-03-28 22:19:47 +00:00
Chris Lattner
a46dfe80c8
Canonicalize VECTOR_SHUFFLE(X, X, Y) -> VECTOR_SHUFFLE(X,undef,Y')
...
llvm-svn: 27235
2006-03-28 22:11:53 +00:00
Chris Lattner
c9992548fc
Turn a series of extract_element's feeding a build_vector into a
...
vector_shuffle node. For this:
void test(__m128 *res, __m128 *A, __m128 *B) {
*res = _mm_unpacklo_ps(*A, *B);
}
we now produce this code:
_test:
movl 8(%esp), %eax
movaps (%eax), %xmm0
movl 12(%esp), %eax
unpcklps (%eax), %xmm0
movl 4(%esp), %eax
movaps %xmm0, (%eax)
ret
instead of this:
_test:
subl $76, %esp
movl 88(%esp), %eax
movaps (%eax), %xmm0
movaps %xmm0, (%esp)
movaps %xmm0, 32(%esp)
movss 4(%esp), %xmm0
movss 32(%esp), %xmm1
unpcklps %xmm0, %xmm1
movl 84(%esp), %eax
movaps (%eax), %xmm0
movaps %xmm0, 16(%esp)
movaps %xmm0, 48(%esp)
movss 20(%esp), %xmm0
movss 48(%esp), %xmm2
unpcklps %xmm0, %xmm2
unpcklps %xmm1, %xmm2
movl 80(%esp), %eax
movaps %xmm2, (%eax)
addl $76, %esp
ret
GCC produces this (with -fomit-frame-pointer):
_test:
subl $12, %esp
movl 20(%esp), %eax
movaps (%eax), %xmm0
movl 24(%esp), %eax
unpcklps (%eax), %xmm0
movl 16(%esp), %eax
movaps %xmm0, (%eax)
addl $12, %esp
ret
llvm-svn: 27233
2006-03-28 20:28:38 +00:00
Chris Lattner
b7163598f9
Don't crash on X^X if X is a vector. Instead, produce a vector of zeros.
...
llvm-svn: 27229
2006-03-28 19:11:05 +00:00
Chris Lattner
dc1eab5886
Don't call SimplifyDemandedBits on vectors
...
llvm-svn: 27128
2006-03-25 22:19:00 +00:00
Chris Lattner
5336a59e4b
fold insertelement(buildvector) -> buildvector if the inserted element # is
...
a constant. This implements test_constant_insert in CodeGen/Generic/vector.ll
llvm-svn: 26851
2006-03-19 01:27:56 +00:00
Nate Begeman
bb01d4f272
Remove BRTWOWAY*
...
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.
llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner
68ac09d5cb
make sure dead token factor nodes are removed by the dag combiner.
...
llvm-svn: 26731
2006-03-13 18:37:30 +00:00
Chris Lattner
d8c2a48d58
Fold X+Y -> X|Y when safe. This implements:
...
Regression/CodeGen/PowerPC/and_add.ll
a case that occurs with dynamic allocas of constant size.
llvm-svn: 26727
2006-03-13 06:51:27 +00:00