Chris Lattner
84b49d51be
Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
...
llvm-svn: 28017
2006-04-28 21:56:10 +00:00
Nate Begeman
4ca2ea5b43
JumpTable support! What this represents is working asm and jit support for
...
x86 and ppc for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC, and less dense forms of jump tables.
llvm-svn: 27947
2006-04-22 18:53:45 +00:00
Chris Lattner
518834c67e
Fix a crash on:
...
void foo2(vector float *A, vector float *B) {
vector float C = (vector float)vec_cmpeq(*A, *B);
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
*A = C;
}
llvm-svn: 27808
2006-04-18 18:28:22 +00:00
Chris Lattner
1e174c87c3
pretty print node name
...
llvm-svn: 27806
2006-04-18 18:05:58 +00:00
Chris Lattner
9754d142a4
Implement an important entry from README_ALTIVEC:
...
If an altivec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back to CR's. Instead, just branch on CR6 directly. :)
For example, for:
void foo2(vector float *A, vector float *B) {
if (!vec_any_eq(*A, *B))
*B = (vector float){0,0,0,0};
}
We now generate:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
instead of:
_foo2:
mfspr r2, 256
oris r5, r2, 12288
mtspr 256, r5
lvx v2, 0, r4
lvx v3, 0, r3
vcmpeqfp. v2, v3, v2
mfcr r3, 2
rlwinm r3, r3, 27, 31, 31
cmpwi cr0, r3, 0
beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
vxor v2, v2, v2
stvx v2, 0, r4
mtspr 256, r2
blr
LBB1_2: ; UnifiedReturnBlock
mtspr 256, r2
blr
This implements CodeGen/PowerPC/vec_br_cmp.ll.
llvm-svn: 27804
2006-04-18 17:59:36 +00:00
Chris Lattner
96d50487c9
Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
...
even/odd halves. Thanks to Nate telling me what's what.
llvm-svn: 27793
2006-04-18 04:28:57 +00:00
Chris Lattner
d6d82aa889
Implement v16i8 multiply with this code:
...
vmuloub v5, v3, v2
vmuleub v2, v3, v2
vperm v2, v2, v5, v4
This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are
6.79x faster than before.
Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.
Remove the 'integer multiplies' todo from the README file.
llvm-svn: 27792
2006-04-18 03:57:35 +00:00
Chris Lattner
7e439874cb
Lower v8i16 multiply into this code:
...
li r5, lo16(LCPI1_0)
lis r6, ha16(LCPI1_0)
lvx v4, r6, r5
vmulouh v5, v3, v2
vmuleuh v2, v3, v2
vperm v2, v2, v5, v4
where v4 is:
LCPI1_0: ; <16 x ubyte>
.byte 2
.byte 3
.byte 18
.byte 19
.byte 6
.byte 7
.byte 22
.byte 23
.byte 10
.byte 11
.byte 26
.byte 27
.byte 14
.byte 15
.byte 30
.byte 31
This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.
llvm-svn: 27789
2006-04-18 03:43:48 +00:00
Chris Lattner
a2cae1bb10
Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
...
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.
This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements
PowerPC/vec_mul.ll
llvm-svn: 27788
2006-04-18 03:24:30 +00:00
Chris Lattner
e54133cfba
Make sure to check splats of every constant we can, handle splat(31) by
...
being a bit more clever, add support for odd splats from -31 to -17.
llvm-svn: 27764
2006-04-17 18:09:22 +00:00
Chris Lattner
264c908e3a
Teach the ppc backend to use rol and vsldoi to generate splatted constants.
...
This implements vec_constants.ll:test_vsldoi and test_rol
llvm-svn: 27760
2006-04-17 17:55:10 +00:00
Chris Lattner
1b3806ace5
Make some code more general, adding support for constant formation of several
...
new patterns.
llvm-svn: 27754
2006-04-17 06:58:41 +00:00
Chris Lattner
f8dd76df5b
Learn how to make odd splatted constants in range [17,29]. This implements
...
PowerPC/vec_constants.ll:test_29.
llvm-svn: 27752
2006-04-17 06:07:44 +00:00
Chris Lattner
2a099c04c1
Pull some code out into a helper function.
...
Effeciently codegen even splats in the range [-32,30].
This allows us to codegen <30,30,30,30> as:
vspltisw v0, 15
vadduwm v2, v0, v0
instead of as a cp load.
llvm-svn: 27750
2006-04-17 06:00:21 +00:00
Chris Lattner
071ad01ceb
Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
...
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll
llvm-svn: 27748
2006-04-17 05:28:54 +00:00
Chris Lattner
06a21ba96b
Implement a TODO: have the legalizer canonicalize a bunch of operations to
...
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.
llvm-svn: 27731
2006-04-16 01:37:57 +00:00
Chris Lattner
fa5aa396c2
Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
...
Remove some done items from the todo list.
llvm-svn: 27729
2006-04-16 01:01:29 +00:00
Chris Lattner
24acbe46c0
Fix a crash when faced with a shuffle vector that has an undef in its mask.
...
llvm-svn: 27726
2006-04-15 23:48:05 +00:00
Chris Lattner
559c8ba466
Allow undef in a shuffle mask
...
llvm-svn: 27714
2006-04-14 23:19:08 +00:00
Chris Lattner
4211ca9108
Move the rest of the PPCTargetLowering::LowerOperation cases out into
...
separate functions, for simplicity and code clarity.
llvm-svn: 27693
2006-04-14 06:01:58 +00:00
Chris Lattner
19e9055eb5
Pull the VECTOR_SHUFFLE and BUILD_VECTOR lowering code out into separate
...
functions, which makes the code much cleaner :)
llvm-svn: 27692
2006-04-14 05:19:18 +00:00
Chris Lattner
883fb053bd
Force non-darwin targets to use a static relo model. This fixes PR734,
...
tested by CodeGen/Generic/vector.ll
llvm-svn: 27657
2006-04-13 17:10:48 +00:00
Chris Lattner
147e50e1c5
Add a new way to match vector constants, which make it easier to bang bits of
...
different types.
Codegen spltw(0x7FFFFFFF) and spltw(0x80000000) without a constant pool load,
implementing PowerPC/vec_constants.ll:test1. This compiles:
typedef float vf __attribute__ ((vector_size (16)));
typedef int vi __attribute__ ((vector_size (16)));
void test(vi *P1, vi *P2, vf *P3) {
*P1 &= (vi){0x80000000,0x80000000,0x80000000,0x80000000};
*P2 &= (vi){0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF,0x7FFFFFFF};
*P3 = vec_abs((vector float)*P3);
}
to:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
vspltisw v0, -1
vslw v0, v0, v0
lvx v1, 0, r3
vand v1, v1, v0
stvx v1, 0, r3
lvx v1, 0, r4
vandc v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vandc v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
instead of (with two constant pool entries):
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
li r6, lo16(LCPI1_0)
lis r7, ha16(LCPI1_0)
li r8, lo16(LCPI1_1)
lis r9, ha16(LCPI1_1)
lvx v0, r7, r6
lvx v1, 0, r3
vand v0, v1, v0
stvx v0, 0, r3
lvx v0, r9, r8
lvx v1, 0, r4
vand v1, v1, v0
stvx v1, 0, r4
lvx v1, 0, r5
vand v0, v1, v0
stvx v0, 0, r5
mtspr 256, r2
blr
GCC produces (with 2 cp entries):
_test:
mfspr r0,256
stw r0,-4(r1)
oris r0,r0,0xc00c
mtspr 256,r0
lis r2,ha16(LC0)
lis r9,ha16(LC1)
la r2,lo16(LC0)(r2)
lvx v0,0,r3
lvx v1,0,r5
la r9,lo16(LC1)(r9)
lwz r12,-4(r1)
lvx v12,0,r2
lvx v13,0,r9
vand v0,v0,v12
stvx v0,0,r3
vspltisw v0,-1
vslw v12,v0,v0
vandc v1,v1,v12
stvx v1,0,r5
lvx v0,0,r4
vand v0,v0,v13
stvx v0,0,r4
mtspr 256,r12
blr
llvm-svn: 27624
2006-04-12 19:07:14 +00:00
Chris Lattner
74cf9ff761
Rename get_VSPLI_elt -> get_VSPLTI_elt
...
Canonicalize BUILD_VECTOR's that match VSPLTI's into a single type for each
form, eliminating a bunch of Pat patterns in the .td file and allowing us to
CSE stuff more aggressively. This implements
PowerPC/buildvec_canonicalize.ll:VSPLTI
llvm-svn: 27614
2006-04-12 17:37:20 +00:00
Chris Lattner
e318a7574e
Ensure that zero vectors are always v4i32, which forces them to CSE with
...
each other. This implements CodeGen/PowerPC/vxor-canonicalize.ll
llvm-svn: 27609
2006-04-12 16:53:28 +00:00
Chris Lattner
e4db08a2f1
Vector function results go into V2 according to GCC. The darwin ABI doc
...
doesn't say where they go :-/
llvm-svn: 27579
2006-04-11 01:38:39 +00:00
Chris Lattner
92533cfb4a
Move some return-handling code from lowerarguments to the ISD::RET handling stuff.
...
No functionality change.
llvm-svn: 27577
2006-04-11 01:21:43 +00:00
Chris Lattner
3a68f3c3ca
properly mark vector selects as expanded to select_cc
...
llvm-svn: 27544
2006-04-08 22:59:15 +00:00
Chris Lattner
0a3d1bbca4
Add VRRC select support
...
llvm-svn: 27543
2006-04-08 22:45:08 +00:00
Chris Lattner
d9e80f4516
Implement PowerPC/CodeGen/vec_splat.ll:spltish to use vsplish instead of a
...
constant pool load.
llvm-svn: 27538
2006-04-08 07:14:26 +00:00
Chris Lattner
d71a1f946d
Change the interface to the predicate that determines if vsplti* can be used.
...
No functionality changes.
llvm-svn: 27536
2006-04-08 06:46:53 +00:00
Chris Lattner
466841ddc7
Make sure to return the result in the right type.
...
llvm-svn: 27469
2006-04-06 23:12:19 +00:00
Chris Lattner
a4bbfaed5c
Match vpku[hw]um(x,x).
...
Convert vsldoi(x,x) to work the same way other (x,x) cases work.
llvm-svn: 27467
2006-04-06 22:28:36 +00:00
Chris Lattner
f38e033270
Add support for matching vmrg(x,x) patterns
...
llvm-svn: 27463
2006-04-06 22:02:42 +00:00
Chris Lattner
d1dcb52093
Pattern match vmrg* instructions, which are now lowered by the CFE into shuffles.
...
llvm-svn: 27457
2006-04-06 21:11:54 +00:00
Chris Lattner
1d33819194
Support pattern matching vsldoi(x,y) and vsldoi(x,x), which allows the f.e. to
...
lower it and LLVM to have one fewer intrinsic. This implements
CodeGen/PowerPC/vec_shuffle.ll
llvm-svn: 27450
2006-04-06 18:26:28 +00:00
Chris Lattner
e8b83b4206
Compile the vpkuhum/vpkuwum intrinsics into vpkuhum/vpkuwum instead of into
...
vperm with a perm mask lvx'd from the constant pool.
llvm-svn: 27448
2006-04-06 17:23:16 +00:00
Chris Lattner
39cc717c65
Fix CodeGen/PowerPC/2006-04-05-splat-ish.ll
...
llvm-svn: 27439
2006-04-05 17:39:25 +00:00
Evan Cheng
2cf4232ced
Fallthrough to expand if a VECTOR_SHUFFLE cannot be custom lowered.
...
llvm-svn: 27433
2006-04-05 06:09:26 +00:00
Chris Lattner
4a744e5c9d
Fix some broken logic that would cause us to codegen {2147483647,2147483647,2147483647,2147483647} as 'vspltisb v0, -1'.
...
llvm-svn: 27413
2006-04-04 22:28:35 +00:00
Chris Lattner
95c7adc7cb
Ask legalize to promote all vector shuffles to be v16i8 instead of having to
...
handle all 4 PPC vector types. This simplifies the matching code and allows
us to eliminate a bunch of patterns. This also adds cases we were missing,
such as CodeGen/PowerPC/vec_splat.ll:splat_h.
llvm-svn: 27400
2006-04-04 17:25:31 +00:00
Chris Lattner
447a7968af
Revert accidentally committed hunks.
...
llvm-svn: 27386
2006-04-03 23:58:04 +00:00
Chris Lattner
533aed9a35
Make sure to mark unsupported SCALAR_TO_VECTOR operations as expand.
...
llvm-svn: 27385
2006-04-03 23:55:43 +00:00
Chris Lattner
c5287c0ece
Inform the dag combiner that the predicate compares only return a low bit.
...
llvm-svn: 27359
2006-04-02 06:26:07 +00:00
Chris Lattner
9b2d6e7886
Custom lower all BUILD_VECTOR's so that we can compile vec_splat_u8(8) into
...
"vspltisb v0, 8" instead of a constant pool load.
llvm-svn: 27335
2006-04-02 00:43:36 +00:00
Chris Lattner
baa73e0d91
Rearrange code a bit
...
llvm-svn: 27306
2006-03-31 19:52:36 +00:00
Chris Lattner
754b41c84b
Add, sub and shuffle are legal for all vector types
...
llvm-svn: 27305
2006-03-31 19:48:58 +00:00
Chris Lattner
829a061abf
note to self: *save* file, then check it in
...
llvm-svn: 27291
2006-03-31 06:04:53 +00:00
Chris Lattner
d4058a59d4
Implement an item from the readme, folding vcmp/vcmp. instructions with
...
identical instructions into a single instruction. For example, for:
void test(vector float *x, vector float *y, int *P) {
int v = vec_any_out(*x, *y);
*x = (vector float)vec_cmpb(*x, *y);
*P = v;
}
we now generate:
_test:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v0, v1, v0
mfcr r4, 2
stvx v0, 0, r3
rlwinm r3, r4, 27, 31, 31
xori r3, r3, 1
stw r3, 0(r5)
mtspr 256, r2
blr
instead of:
_test:
mfspr r2, 256
oris r6, r2, 57344
mtspr 256, r6
lvx v0, 0, r4
lvx v1, 0, r3
vcmpbfp. v2, v1, v0
mfcr r4, 2
*** vcmpbfp v0, v1, v0
rlwinm r4, r4, 27, 31, 31
stvx v0, 0, r3
xori r3, r4, 1
stw r3, 0(r5)
mtspr 256, r2
blr
Testcase here: CodeGen/PowerPC/vcmp-fold.ll
llvm-svn: 27290
2006-03-31 06:02:07 +00:00
Chris Lattner
d7495ae7e9
Lower vector compares to VCMP nodes, just like we lower vector comparison
...
predicates to VCMPo nodes.
llvm-svn: 27285
2006-03-31 05:13:27 +00:00
Chris Lattner
bca5fbe914
Mark INSERT_VECTOR_ELT as expand
...
llvm-svn: 27276
2006-03-31 01:48:55 +00:00
Nate Begeman
1b3928765d
Add a few more altivec intrinsics
...
llvm-svn: 27215
2006-03-28 04:15:58 +00:00
Chris Lattner
cb5ec07cc3
Use normal lvx for scalar_to_vector instead of lve*x. They do the exact
...
same thing and we have a dag node for the former.
llvm-svn: 27205
2006-03-28 01:43:22 +00:00
Chris Lattner
e55d171ccd
Tblgen doesn't like multiple SDNode<> definitions that map to the sameenum value. Split them into separate enums.
...
llvm-svn: 27201
2006-03-28 00:40:33 +00:00
Nate Begeman
ed728c1291
SelectionDAGISel can now natively handle Switch instructions, in the same
...
manner that the LowerSwitch LLVM to LLVM pass does: emitting a binary
search tree of basic blocks. The new approach has several advantages:
it is faster, it generates significantly smaller code in many cases, and
it paves the way for implementing dense switch tables as a jump table by
handling switches directly in the instruction selector.
This functionality is currently only enabled on x86, but should be safe for
every target. In anticipation of making it the default, the cfg is now
properly updated in the x86, ppc, and sparc select lowering code.
llvm-svn: 27156
2006-03-27 01:32:24 +00:00
Chris Lattner
6961fc76bb
Codegen vector predicate compares.
...
llvm-svn: 27151
2006-03-26 10:06:40 +00:00
Evan Cheng
b1ddc988af
Remove PPC:isZeroVector, use ISD::isBuildVectorAllZeros instead
...
llvm-svn: 27149
2006-03-26 09:52:32 +00:00
Chris Lattner
1cb91b3cd9
Add some basic patterns for other datatypes
...
llvm-svn: 27116
2006-03-25 07:39:07 +00:00
Chris Lattner
2771e2c960
Codegen things like:
...
<int -1, int -1, int -1, int -1>
and
<int 65537, int 65537, int 65537, int 65537>
Using things like:
vspltisb v0, -1
and:
vspltish v0, 1
instead of using constant pool loads.
This implements CodeGen/PowerPC/vec_splat.ll:splat_imm_i{32|16}.
llvm-svn: 27106
2006-03-25 06:12:06 +00:00
Chris Lattner
a90b7141ed
Disable the i32->float G5 optimization. It is unsafe, as documented in the
...
comment.
This fixes 177.mesa, and McCat/09-vor with the td scheduler.
llvm-svn: 27060
2006-03-24 07:53:47 +00:00
Chris Lattner
ab882abce8
add support for using vxor to build zero vectors. This implements
...
Regression/CodeGen/PowerPC/vec_zero.ll
llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner
4a66d69433
When possible, custom lower 32-bit SINT_TO_FP to this:
...
_foo2:
extsw r2, r3
std r2, -8(r1)
lfd f0, -8(r1)
fcfid f0, f0
frsp f1, f0
blr
instead of this:
_foo2:
lis r2, ha16(LCPI2_0)
lis r4, 17200
xoris r3, r3, 32768
stw r3, -4(r1)
stw r4, -8(r1)
lfs f0, lo16(LCPI2_0)(r2)
lfd f1, -8(r1)
fsub f0, f1, f0
frsp f1, f0
blr
This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).
llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner
00f4683bf6
These targets don't support EXTRACT_VECTOR_ELT, though, in time, X86 will.
...
llvm-svn: 26930
2006-03-21 20:51:05 +00:00
Chris Lattner
6d74b09da7
remove dead variable
...
llvm-svn: 26907
2006-03-20 22:37:23 +00:00
Chris Lattner
a1bc294f0c
Fix a couple of bugs in permute/splat generate, thanks to Nate for actually
...
figuring these out! :)
llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner
a9a1313386
Add support for generating vspltw, instead of a vperm instruction with a
...
constant pool load. This generates significantly nicer code for splats.
When tblgen gets bugfixed, we can remove the custom selection code.
llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner
a8fbb6dd3d
Implement PPC::isSplatShuffleMask and PPC::getVSPLTImmediate.
...
llvm-svn: 26897
2006-03-20 06:37:44 +00:00
Chris Lattner
ffc475689b
fix duplicate definition errors
...
llvm-svn: 26896
2006-03-20 06:33:01 +00:00
Chris Lattner
a8713b1ee6
Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
...
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*
llvm-svn: 26887
2006-03-20 01:53:53 +00:00
Chris Lattner
7e9440a4fc
Custom lower SCALAR_TO_VECTOR into lve*x.
...
llvm-svn: 26868
2006-03-19 06:55:52 +00:00
Chris Lattner
b1ee9c7e24
PPC doesn't have SCALAR_TO_VECTOR
...
llvm-svn: 26865
2006-03-19 06:17:19 +00:00
Chris Lattner
f7b6e7212f
rename these nodes
...
llvm-svn: 26848
2006-03-19 01:13:28 +00:00
Nate Begeman
bb01d4f272
Remove BRTWOWAY*
...
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.
llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Evan Cheng
2dd2c652b2
Added getTargetLowering() to TargetMachine. Refactored targets to support this.
...
llvm-svn: 26742
2006-03-13 23:20:37 +00:00
Chris Lattner
9c7f50376a
Copysign needs to be expanded everywhere. Note that Alpha and IA64 should
...
implement copysign as a native op if they have it.
llvm-svn: 26541
2006-03-05 05:08:37 +00:00
Chris Lattner
27f5345b1f
Compile this:
...
void foo(float a, int *b) { *b = a; }
to this:
_foo:
fctiwz f0, f1
stfiwx f0, 0, r4
blr
instead of this:
_foo:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
stw r2, 0(r4)
blr
This implements CodeGen/PowerPC/stfiwx.ll, and also incidentally does the
right thing for GCC bugzilla 26505.
llvm-svn: 26447
2006-03-01 05:50:56 +00:00
Chris Lattner
f418435819
Use a target-specific dag-combine to implement CodeGen/PowerPC/fp-int-fp.ll.
...
llvm-svn: 26445
2006-03-01 04:57:39 +00:00
Evan Cheng
1926427351
Vector op lowering.
...
llvm-svn: 26438
2006-03-01 01:11:20 +00:00
Evan Cheng
73136dfecc
- Added option -relocation-model to set relocation model. Valid values include static, pic,
...
dynamic-no-pic, and default.
PPC and x86 default is dynamic-no-pic for Darwin, pic for others.
- Removed options -enable-pic and -ppc-static.
llvm-svn: 26315
2006-02-22 20:19:42 +00:00
Chris Lattner
7ad77dfc2a
split register class handling from explicit physreg handling.
...
llvm-svn: 26308
2006-02-22 00:56:39 +00:00
Chris Lattner
7bb4696dc3
Updates to match change of getRegForInlineAsmConstraint prototype
...
llvm-svn: 26305
2006-02-21 23:11:00 +00:00
Evan Cheng
5f99760ae7
Moved PICEnabled to include/llvm/Target/TargetOptions.h
...
llvm-svn: 26272
2006-02-18 00:08:58 +00:00
Chris Lattner
3a0ad47b39
Switch to using getCALLSEQ_START instead of using our own creation calls
...
llvm-svn: 26142
2006-02-13 08:55:29 +00:00
Chris Lattner
203b2f1288
Implement getConstraintType for PPC.
...
llvm-svn: 26042
2006-02-07 20:16:30 +00:00
Chris Lattner
15a6c4c444
Add the simple PPC integer constraints
...
llvm-svn: 26027
2006-02-07 00:47:13 +00:00
Nate Begeman
7e7f439f85
Fix some of the stuff in the PPC README file, and clean up legalization
...
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.
llvm-svn: 25875
2006-02-01 07:19:44 +00:00
Evan Cheng
32be2dc0af
Allow the specification of explicit alignments for constant pool entries.
...
llvm-svn: 25855
2006-01-31 22:23:14 +00:00
Chris Lattner
0151361d21
add info about the inline asm register constraints for PPC
...
llvm-svn: 25853
2006-01-31 19:20:21 +00:00
Nate Begeman
a162f208ee
Codegen
...
bool %test(int %X) {
%Y = seteq int %X, 13
ret bool %Y
}
as
_test:
addi r2, r3, -13
cntlzw r2, r2
srwi r3, r2, 5
blr
rather than
_test:
cmpwi cr7, r3, 13
mfcr r2
rlwinm r3, r2, 31, 31, 31
blr
This has very little effect on most code, but speeds up analyzer 23% and
mason 11%
llvm-svn: 25848
2006-01-31 08:17:29 +00:00
Chris Lattner
32058cfb7b
Functions that are lazily streamed in from the .bc file are *not* external.
...
This fixes llvm-test/SingleSource/UnitTests/2006-01-29-SimpleIndirectCall.c
and PR704
llvm-svn: 25793
2006-01-29 20:49:17 +00:00
Chris Lattner
3072af4d4f
Now that OpActions is big enough, we can specify actions for vector types
...
llvm-svn: 25784
2006-01-29 08:41:37 +00:00
Chris Lattner
d7738e6b32
disable this for now
...
llvm-svn: 25778
2006-01-29 07:31:33 +00:00
Chris Lattner
d33c60b52b
Request expansion of ConstantVec nodes.
...
llvm-svn: 25773
2006-01-29 06:32:58 +00:00
Chris Lattner
61c9a8e942
Targets all now request ConstantFP to be legalized into TargetConstantFP.
...
'fpimm' in .td files is now TargetConstantFP.
llvm-svn: 25771
2006-01-29 06:26:08 +00:00
Chris Lattner
30432e07f0
Fix a bug in my elimination of ISD::CALL this morning. PPC now has to
...
provide the expansion for i64 calls itself
llvm-svn: 25735
2006-01-28 07:33:03 +00:00
Chris Lattner
f424a66524
Use PPCISD::CALL instead of ISD::CALL
...
llvm-svn: 25717
2006-01-27 23:34:02 +00:00
Chris Lattner
4d967a4cbb
Make llvm.frame/returnaddr not crash on ppc
...
llvm-svn: 25710
2006-01-27 22:25:06 +00:00
Nate Begeman
8c47c3a3b1
Remove TLI.LowerReturnTo, and just let targets custom lower ISD::RET for
...
the same functionality. This addresses another piece of bug 680. Next,
on to fixing Alpha VAARG, which I broke last time.
llvm-svn: 25696
2006-01-27 21:09:22 +00:00
Evan Cheng
030e002fb9
Set SchedulingForLatency to be the default scheduling preference for all.
...
llvm-svn: 25607
2006-01-25 18:52:42 +00:00
Nate Begeman
e74795cd70
First part of bug 680:
...
Remove TLI.LowerVA* and replace it with SDNodes that are lowered the same
way as everything else.
llvm-svn: 25606
2006-01-25 18:21:52 +00:00
Evan Cheng
1092a02619
Default scheduling preference is SchedulingForLatency.
...
llvm-svn: 25603
2006-01-25 09:15:54 +00:00
Chris Lattner
ce5066c863
Don't assert on 'select_cc SETUO'
...
llvm-svn: 25423
2006-01-18 19:42:35 +00:00
Chris Lattner
5bd514d7b0
Use the default impl of DYNAMIC_STACKALLOC, allowing us to delete some code.
...
llvm-svn: 25334
2006-01-15 09:02:48 +00:00
Nate Begeman
2fba8a3aaa
bswap implementation
...
llvm-svn: 25312
2006-01-14 03:14:10 +00:00
Chris Lattner
776c326c96
implement stacksave/stackrestore on PPC
...
llvm-svn: 25277
2006-01-13 17:52:03 +00:00
Chris Lattner
8e2f52e645
expand unsupported stacksave/stackrestore nodes
...
llvm-svn: 25272
2006-01-13 02:42:53 +00:00
Nate Begeman
1b8121b227
Add bswap, rotl, and rotr nodes
...
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl
Targets should add rotl/rotr patterns if they have them
llvm-svn: 25222
2006-01-11 21:21:00 +00:00
Chris Lattner
602dfea79c
Fix calls that need to store values in stack slots, to not copy the stack
...
pointer. This allows us to emit stuff like this:
li r10, 0
stw r10, 56(r1)
or r3, r10, r10
or r4, r10, r10
or r5, r10, r10
or r6, r10, r10
or r7, r10, r10
or r8, r10, r10
or r9, r10, r10
bl L_bar$stub
instead of this:
or r2, r1, r1 ;; Extraneous copy.
li r10, 0
stw r10, 56(r2)
or r3, r10, r10
or r4, r10, r10
or r5, r10, r10
or r6, r10, r10
or r7, r10, r10
or r8, r10, r10
or r9, r10, r10
bl L_bar$stub
wowness.
llvm-svn: 25221
2006-01-11 19:55:07 +00:00
Chris Lattner
66f63f72f3
Dead FP arguments still use an incoming FP reg. This fixes
...
Regression/CodeGen/PowerPC/2006-01-11-darwin-fp-argument.ll, which was
distilled from a miscompilation in 252.eon.
llvm-svn: 25217
2006-01-11 18:21:25 +00:00
Chris Lattner
347ed8a581
Give PPCISD:: nodes legible names in dumps.
...
llvm-svn: 25166
2006-01-09 23:52:17 +00:00
Chris Lattner
b87030358d
linkonce symbols have an extra indirection, just like weak ones do. This fixes
...
Prolangs-C++/family and Prolangs-C++/primes.
llvm-svn: 25119
2006-01-06 01:04:03 +00:00
Jim Laskey
deeafa0f00
Had expand logic backward.
...
llvm-svn: 25105
2006-01-05 01:47:43 +00:00
Jim Laskey
762e9ec06c
Added initial support for DEBUG_LABEL allowing debug specific labels to be
...
inserted in the code.
llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Nate Begeman
c2c8a6202f
Remove a fixme
...
llvm-svn: 25045
2005-12-30 00:11:07 +00:00
Nate Begeman
9aea6e4691
Fix one of the things in the todo file, and get a bit closer to folding
...
constant offsets from statics into the address arithmetic.
llvm-svn: 24999
2005-12-24 01:00:15 +00:00
Chris Lattner
c46fc2482c
make sure bit_converts are expanded
...
llvm-svn: 24978
2005-12-23 05:13:35 +00:00
Chris Lattner
f474034432
Simplify some code by using BIT_CONVERT
...
llvm-svn: 24974
2005-12-23 00:59:59 +00:00
Nate Begeman
b11b8e44fa
Pattern-match return. Includes gross hack!
...
llvm-svn: 24874
2005-12-20 00:26:01 +00:00
Nate Begeman
8e6a8af205
Convert load/store over to being pattern matched
...
llvm-svn: 24871
2005-12-19 23:25:09 +00:00
Nate Begeman
4e56db674c
Add support for TargetConstantPool nodes to the dag isel emitter, and use
...
them in the PPC backend, to simplify some logic out of Select and
SelectAddr.
llvm-svn: 24657
2005-12-10 02:36:00 +00:00
Chris Lattner
fea33f7e64
Use new PPC-specific nodes to represent shifts which require the 6-bit
...
amount handling that PPC provides. These are generated by the lowering code
and prevents the dag combiner from assuming (rightfully) that the shifts
don't only look at 5 bits. This fixes a miscompilation of crafty with
the new front-end.
llvm-svn: 24615
2005-12-06 02:10:38 +00:00
Chris Lattner
3713e6b49c
Fix Regression/CodeGen/PowerPC/2005-11-30-vastart-crash.ll
...
llvm-svn: 24547
2005-11-30 20:40:54 +00:00
Nate Begeman
3e7db9c6d5
Hook up one type, v4f32, to the VR RegisterClass for now.
...
llvm-svn: 24517
2005-11-29 08:17:20 +00:00
Chris Lattner
9c415364cf
No targets support line number info yet.
...
llvm-svn: 24513
2005-11-29 06:16:21 +00:00
Chris Lattner
3570cf456b
add an option to generate completely non-pic code, corresponding to what
...
gcc -static produces on PPC. This is used for building kexts and other things.
With this, materializing the address of a global looks like:
lis r2, ha16(L_H$non_lazy_ptr)
la r3, lo16(L_H$non_lazy_ptr)(r2)
we're still emitting stubs for functions, which is wrong. That is next.
llvm-svn: 24399
2005-11-17 18:55:48 +00:00
Chris Lattner
8f8ed28a64
Fix a bug that resistor on IRC hit where we tried to create token factor
...
nodes of load results, not of their chain results.
llvm-svn: 24398
2005-11-17 18:30:17 +00:00
Chris Lattner
5aba6ae3b3
Enable global address legalization, fixing a todo and allowing the removal
...
of some code. This exposes the implicit load from the stubs to the DAG, allowing
them to be optimized by the dag combiner. It also moves darwin specific stuff
out of the isel into the legalizer, and allows more to be moved to the .td file.
llvm-svn: 24397
2005-11-17 18:26:56 +00:00
Chris Lattner
3648c20472
Use the right accessor to create this node
...
llvm-svn: 24394
2005-11-17 17:51:38 +00:00
Chris Lattner
595088aa0f
Add an initial hack at legalizing GlobalAddress into the appropriate nodes
...
on Darwin to remove smarts from the isel. This is currently disabled by
default (uncomment setOperationAction(ISD::GlobalAddress to enable it).
tblgen needs to become smarter about tglobaladdr nodes and bigger patterns
needed to be added to the .td file. However, we can currently emit stuff like
this: :)
li r2, lo16(L_x$non_lazy_ptr)
lis r3, ha16(L_x$non_lazy_ptr)
lwzx r2, r3, r2
The obvious improvements will follow.
llvm-svn: 24390
2005-11-17 07:30:41 +00:00
Chris Lattner
b7025749e1
When lowering direct calls, lower them to use a targetglobaladress directly
...
instead of a globaladdress. This has no effect on the generated code at all.
llvm-svn: 24386
2005-11-17 05:56:14 +00:00
Chris Lattner
f718a9e17b
Fix an assert compiling MallocBench/gs
...
llvm-svn: 24017
2005-10-26 18:01:11 +00:00
Nate Begeman
762bf809b5
Correctly Expand or Promote FP_TO_UINT based on the capabilities of the
...
machine. This allows us to generate great code for i32 FP_TO_UINT now on
targets with 64 bit extensions.
llvm-svn: 23993
2005-10-25 23:48:36 +00:00
Chris Lattner
65845a2f7c
Expose the fextend on the DAG instead of doing it in the matcher
...
llvm-svn: 23986
2005-10-25 20:54:57 +00:00
Nate Begeman
4dd383120f
Invert the TargetLowering flag that controls divide by consant expansion.
...
Add a new flag to TargetLowering indicating if the target has really cheap
signed division by powers of two, make ppc use it. This will probably go
away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.
llvm-svn: 23853
2005-10-21 00:02:42 +00:00
Nate Begeman
c6f067a8c4
Move the target constant divide optimization up into the dag combiner, so
...
that the nodes can be folded with other nodes, and we can not duplicate
code in every backend. Alpha will probably want this too.
llvm-svn: 23835
2005-10-20 02:15:44 +00:00
Nate Begeman
78afac2ddd
Add the ability to lower return instructions to TargetLowering. This
...
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).
llvm-svn: 23802
2005-10-18 23:23:37 +00:00
Nate Begeman
e74dfbb9ce
Do the right thing and enable 64 bit regs under the control of a subtarget
...
option. Currently the only way to enable this is to specify the
64bitregs mattr flag. It is never enabled by default on any config yet.
llvm-svn: 23779
2005-10-18 00:56:42 +00:00
Nate Begeman
0b71e007ef
First bits of 64 bit PowerPC stuff, currently disabled. A lot of this is
...
purely mechanical.
llvm-svn: 23778
2005-10-18 00:28:58 +00:00
Nate Begeman
6cca84e43c
More PPC32 -> PPC changes, as well as merging some classes that were
...
redundant after the change.
llvm-svn: 23759
2005-10-16 05:39:50 +00:00
Chris Lattner
6f3b954662
Rename PPC32*.h to PPC*.h
...
This completes the grand PPC file renaming
llvm-svn: 23745
2005-10-14 23:59:06 +00:00
Chris Lattner
a17e6c486c
fix an f32/f64 type mismatch
...
llvm-svn: 23587
2005-10-02 06:37:13 +00:00
Chris Lattner
d3eee1a09b
Modify the ppc backend to use two register classes for FP: F8RC and F4RC.
...
These are used to represent float and double values, and the two regclasses
contain the same physical registers.
llvm-svn: 23577
2005-10-01 01:35:02 +00:00
Chris Lattner
d3ea19b51a
Add FP versions of the binary operators, keeping the int and fp worlds seperate.
...
llvm-svn: 23506
2005-09-28 22:29:58 +00:00
Chris Lattner
a028e7a39c
Darwin, like many BSD systems, has a setjmp/longjmp which saves the signal mask
...
on setjmp calls and restores it on longjmp calls (both of which require syscalls).
This makes the calls REALLY slow. Use _setjmp/_longjmp instead. This speeds up
hexxagon from 120.31s to 15.68s: from 5.53x slower than GCC to 28% faster than GCC.
llvm-svn: 23482
2005-09-27 22:18:25 +00:00
Chris Lattner
0f965a615e
Change the arg lowering code to use copyfromreg from vregs associated
...
with incoming arguments instead of the pregs themselves. This fixes
the scheduler from causing problems by moving a copyfromreg for an argument
to after a select_cc node (now it can, and bad things won't happen).
llvm-svn: 23334
2005-09-13 19:33:40 +00:00
Chris Lattner
aa6cbd90c5
Remove some dead vectors
...
llvm-svn: 23329
2005-09-13 18:47:49 +00:00
Chris Lattner
4309c3a785
PowerPC cannot truncstore i1 natively
...
llvm-svn: 23304
2005-09-10 00:21:06 +00:00
Nate Begeman
6095214bf0
Implement i64<->fp using the fctidz/fcfid instructions on PowerPC when we
...
are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.
This speeds up 189.lucas from 81.99 to 32.64 seconds.
llvm-svn: 23250
2005-09-06 22:03:27 +00:00
Chris Lattner
aa3b1fcc58
Decouple fsqrt from gpul optimizations, implementing fsqrt.ll.
...
Remove the -enable-gpopt option which is subsumed by feature flags.
llvm-svn: 23218
2005-09-02 18:33:05 +00:00
Chris Lattner
763a3a0fa7
Restore this patch now that the latent bug has been fixed
...
llvm-svn: 23209
2005-09-02 01:24:55 +00:00
Chris Lattner
06d440f2ee
Revert the previous patch which causes a mysterious regression in toast.
...
llvm-svn: 23207
2005-09-02 00:47:05 +00:00
Chris Lattner
9ee867b93b
Implement small-arguments.ll:test3 by teaching the DAG optimizer that
...
the results of calls to functions returning small values are properly
sign/zero extended.
llvm-svn: 23198
2005-09-01 23:44:32 +00:00
Chris Lattner
da2e04c69d
Move FCTIWZ handling out of the instruction selectors and into legalization,
...
getting them out of the business of making stack slots.
llvm-svn: 23180
2005-08-31 21:09:52 +00:00
Chris Lattner
e675a08e10
Move SHL,SHR i64 -> legalizer
...
llvm-svn: 23178
2005-08-31 20:23:54 +00:00
Chris Lattner
2f03896a0f
lower sra_parts on the dag, implementing it for the dag isel, and exposing
...
the ops to dag optimization.
llvm-svn: 23176
2005-08-31 19:09:57 +00:00
Nate Begeman
e3287b85b7
Enable generation of AssertSext and AssertZext in the PPC backend.
...
llvm-svn: 23168
2005-08-31 01:58:39 +00:00
Chris Lattner
e75b5e63a7
Fix a bug in my patch for legalizing to fsel. It cannot handle seteq/setne,
...
which I failed to include when I moved the code over. This fixes
MallocBench/gs.
llvm-svn: 23140
2005-08-30 00:45:18 +00:00
Chris Lattner
62b9a5d1f8
Fix some really strange indentation that xcode likes to use.
...
no xcode, this is not right:
if (!foo) break;
X;
llvm-svn: 23138
2005-08-30 00:19:00 +00:00
Chris Lattner
9b577f108a
implement SELECT_CC fully for the DAG->DAG isel!
...
llvm-svn: 23101
2005-08-26 21:23:58 +00:00
Chris Lattner
b2854fadda
Make fsel emission work with both the pattern and dag-dag selectors, by
...
giving it a non-instruction opcode. The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.
llvm-svn: 23091
2005-08-26 20:25:03 +00:00
Chris Lattner
7f1fa8eaef
implement the other half of the select_cc -> fsel lowering, which handles
...
when the RHS of the comparison is 0.0. Turn this on by default.
llvm-svn: 23083
2005-08-26 17:36:52 +00:00
Chris Lattner
f3d06c6417
add initial support for converting select_cc -> fsel in the legalizer
...
instead of in the backend. This currently handles fsel cases with registers,
but doesn't have the 0.0 and -0.0 optimization enabled yet.
Once this is finished, special hack for fp immediates can go away.
llvm-svn: 23075
2005-08-26 00:52:45 +00:00
Nate Begeman
65ffd8fbf4
Remove option to make SetCC illegal on PowerPC after long discussion with
...
Chris. This will be accomplished through correctly modeling CR's and
subregs.
llvm-svn: 23056
2005-08-25 20:01:10 +00:00
Nate Begeman
f3ce09b36e
Ack, typo
...
llvm-svn: 22981
2005-08-23 05:45:10 +00:00
Nate Begeman
7216ad415b
Add an option to make SetCC illegal as a beta option
...
llvm-svn: 22979
2005-08-23 05:42:36 +00:00
Jim Laskey
6267b2c97c
Make UINT_TO_FP and SINT_TO_FP use generic expansion.
...
llvm-svn: 22815
2005-08-17 00:40:22 +00:00
Chris Lattner
79f5ebc7b9
updates for changes in nodes
...
llvm-svn: 22808
2005-08-16 21:58:15 +00:00
Nate Begeman
371e49515d
Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty
...
fixme from the PowerPC backend. Emit slightly better code for legalizing
select_cc.
llvm-svn: 22805
2005-08-16 19:49:35 +00:00
Chris Lattner
f22556d3ad
Pull the LLVM -> DAG lowering code out of the pattern selector so that it
...
can be shared with the DAG->DAG selector.
llvm-svn: 22799
2005-08-16 17:14:42 +00:00