Chris Lattner
ab882abce8
add support for using vxor to build zero vectors. This implements
...
Regression/CodeGen/PowerPC/vec_zero.ll
llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner
f5efddf80b
Gabor points out that we can't spell. :)
...
llvm-svn: 27049
2006-03-24 07:12:19 +00:00
Chris Lattner
81137629e0
Add PPC vector bit-convert support
...
llvm-svn: 26995
2006-03-23 19:54:27 +00:00
Chris Lattner
4a66d69433
When possible, custom lower 32-bit SINT_TO_FP to this:
...
_foo2:
extsw r2, r3
std r2, -8(r1)
lfd f0, -8(r1)
fcfid f0, f0
frsp f1, f0
blr
instead of this:
_foo2:
lis r2, ha16(LCPI2_0)
lis r4, 17200
xoris r3, r3, 32768
stw r3, -4(r1)
stw r4, -8(r1)
lfs f0, lo16(LCPI2_0)(r2)
lfd f1, -8(r1)
fsub f0, f1, f0
frsp f1, f0
blr
This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01->2.18s
with llcbeta (16.7% and 38.1% respectively).
llvm-svn: 26943
2006-03-22 05:30:33 +00:00
Chris Lattner
4e7371758f
Fix the JIT encoding of the VAForm_1 instructions, including vmaddfp
...
llvm-svn: 26935
2006-03-22 01:44:36 +00:00
Chris Lattner
d2132f87d7
When codegen'ing vector MUL using VFMADD, *add* the 0, don't *mul* the 0.
...
llvm-svn: 26913
2006-03-21 00:51:38 +00:00
Chris Lattner
a1bc294f0c
Fix a couple of bugs in permute/splat generate, thanks to Nate for actually
...
figuring these out! :)
llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner
f96d523b8f
Fix the pattern for VADDUWM, add i32 splat
...
llvm-svn: 26901
2006-03-20 17:51:58 +00:00
Evan Cheng
89f3cff0f5
Use tblgen'd VECTOR_SHUFFLE selection code.
...
llvm-svn: 26900
2006-03-20 08:14:16 +00:00
Chris Lattner
a9a1313386
Add support for generating vspltw, instead of a vperm instruction with a
...
constant pool load. This generates significantly nicer code for splats.
When tblgen gets bugfixed, we can remove the custom selection code.
llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner
382f356bd9
Check in some intermediate code that adds a skeleton for matching vsplt*
...
instructions
llvm-svn: 26894
2006-03-20 06:15:45 +00:00
Chris Lattner
93d99f9928
fix typo
...
llvm-svn: 26889
2006-03-20 05:05:55 +00:00
Chris Lattner
366b2514fa
add vsplat instructions, fix sched description for vperm
...
llvm-svn: 26888
2006-03-20 04:47:33 +00:00
Chris Lattner
a8713b1ee6
Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
...
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*
llvm-svn: 26887
2006-03-20 01:53:53 +00:00
Chris Lattner
e7a058de7d
add the vperm instruction
...
llvm-svn: 26883
2006-03-20 01:00:56 +00:00
Chris Lattner
7e9440a4fc
Custom lower SCALAR_TO_VECTOR into lve*x.
...
llvm-svn: 26868
2006-03-19 06:55:52 +00:00
Chris Lattner
5b595af956
add support for vector undef
...
llvm-svn: 26863
2006-03-19 06:10:09 +00:00
Chris Lattner
0c9eb670bb
minor fixes
...
llvm-svn: 26857
2006-03-19 05:43:01 +00:00
Chris Lattner
431c90c9fa
we don't use lmw/stmw. When we want them they are easy enough to add
...
llvm-svn: 26853
2006-03-19 04:33:37 +00:00
Nate Begeman
21f87d0e4c
Fix subfic to match subc by default instead of sub so that it is correctly
...
cost-modeled as producing a flag. This fixes the test I just added for neg
llvm-svn: 26835
2006-03-17 22:41:37 +00:00
Nate Begeman
bb01d4f272
Remove BRTWOWAY*
...
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.
llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner
1e6dfa4c1f
Strangely, calls clobber call-clobbered vector regs. Whodathoughtit?
...
llvm-svn: 26808
2006-03-16 22:35:59 +00:00
Chris Lattner
fd9f3e8ed3
Add support for copying registers. still needed: spilling and reloading them
...
llvm-svn: 26800
2006-03-16 20:03:58 +00:00
Nate Begeman
2e1fde7c5c
Update scheduling info for vrsave instruction
...
llvm-svn: 26776
2006-03-15 05:25:05 +00:00
Chris Lattner
02e2c18c9c
For functions that use vector registers, save VRSAVE, mark used
...
registers, and update it on entry to each function, then restore it on exit.
This compiles:
void func(vfloat *a, vfloat *b, vfloat *c) {
*a = *b * *c + *c;
}
to this:
_func:
mfspr r2, 256
oris r6, r2, 49152
mtspr 256, r6
lvx v0, 0, r5
lvx v1, 0, r4
vmaddfp v0, v1, v0, v0
stvx v0, 0, r3
mtspr 256, r2
blr
GCC produces this (which has additional stack accesses):
_func:
mfspr r0,256
stw r0,-4(r1)
oris r0,r0,0xc000
mtspr 256,r0
lvx v0,0,r5
lvx v1,0,r4
lwz r12,-4(r1)
vmaddfp v0,v0,v1,v0
stvx v0,0,r3
mtspr 256,r12
blr
llvm-svn: 26733
2006-03-13 21:52:10 +00:00
Chris Lattner
7579cfb1a0
Mark instructions that are cracked by the PPC970 decoder as such.
...
llvm-svn: 26720
2006-03-13 05:15:10 +00:00
Chris Lattner
51348c5f27
Several big changes:
...
1. Use flags on the instructions in the .td file to indicate the PPC970 unit
type instead of a table in the .cpp file. Much cleaner.
2. Change the hazard recognizer to build d-groups according to the actual
algorithm used, not my flawed understanding of it.
3. Model "must be in the first slot" and "must be the only instr in a group"
accurately.
llvm-svn: 26719
2006-03-12 09:13:49 +00:00
Chris Lattner
ea79d9fd73
implement TII::insertNoop
...
llvm-svn: 26562
2006-03-05 23:49:55 +00:00
Chris Lattner
27f5345b1f
Compile this:
...
void foo(float a, int *b) { *b = a; }
to this:
_foo:
fctiwz f0, f1
stfiwx f0, 0, r4
blr
instead of this:
_foo:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
stw r2, 0(r4)
blr
This implements CodeGen/PowerPC/stfiwx.ll, and also incidentally does the
right thing for GCC bugzilla 26505.
llvm-svn: 26447
2006-03-01 05:50:56 +00:00
Nate Begeman
5965bd19f8
kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC
...
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.
llvm-svn: 26255
2006-02-17 05:43:56 +00:00
Nate Begeman
bc3ec1d37b
Add missing patterns for andi. and andis., fixing test/Regression/CodeGen/
...
PowerPC/and-imm.ll
llvm-svn: 26136
2006-02-12 09:09:52 +00:00
Chris Lattner
1240574609
PHI and INLINEASM are now built-in instructions provided by Target.td
...
llvm-svn: 25674
2006-01-27 01:46:15 +00:00
Chris Lattner
268d3584fc
ahem :)
...
llvm-svn: 25239
2006-01-12 02:05:36 +00:00
Nate Begeman
1b8121b227
Add bswap, rotl, and rotr nodes
...
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl
Targets should add rotl/rotr patterns if they have them
llvm-svn: 25222
2006-01-11 21:21:00 +00:00
Nate Begeman
477933cfbd
Remove a comment that no longer applies.
...
llvm-svn: 25167
2006-01-10 00:15:59 +00:00
Chris Lattner
bfb2de9030
add ret void support back
...
llvm-svn: 25164
2006-01-09 23:20:37 +00:00
Evan Cheng
7785e5b3a4
New DAG node properties SNDPInFlag, SNDPOutFlag, and SNDPOptInFlag to replace
...
hasInFlag, hasOutFlag.
llvm-svn: 25155
2006-01-09 18:28:21 +00:00
Jim Laskey
762e9ec06c
Added initial support for DEBUG_LABEL allowing debug specific labels to be
...
inserted in the code.
llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Jim Laskey
0da76a676a
Add unique id to debug location for debug label use (work in progress.)
...
llvm-svn: 25096
2006-01-04 15:04:11 +00:00
Nate Begeman
336dba6fb1
Add support for generating v4i32 altivec code
...
llvm-svn: 25046
2005-12-30 00:12:56 +00:00
Evan Cheng
14c53b45f5
Added field noResults to Instruction.
...
Currently tblgen cannot tell which operands in the operand list are results so
it assumes the first one is a result. This is bad. Ideally we would fix this
by separating results from inputs, e.g. (res R32:$dst),
(ops R32:$src1, R32:$src2). But that's a more distruptive change. Adding
'let noResults = 1' is the workaround to tell tblgen that the instruction does
not produces a result. It works for now since tblgen does not support
instructions which produce multiple results.
llvm-svn: 25017
2005-12-26 09:11:45 +00:00
Evan Cheng
9ae486047e
* Removed the use of FLAG. Now use hasFlagIn and hasFlagOut instead.
...
* Added a pseudo instruction (for each target) that represent "return void".
This is a workaround for lack of optional flag operand (return void is not
lowered so it does not have a flag operand.)
llvm-svn: 24997
2005-12-23 22:14:32 +00:00
Evan Cheng
82285c55aa
Flip the meaning of FPContractions to reflect Requires<[]> change.
...
llvm-svn: 24884
2005-12-20 20:08:53 +00:00
Nate Begeman
b11b8e44fa
Pattern-match return. Includes gross hack!
...
llvm-svn: 24874
2005-12-20 00:26:01 +00:00
Nate Begeman
8e6a8af205
Convert load/store over to being pattern matched
...
llvm-svn: 24871
2005-12-19 23:25:09 +00:00
Jim Laskey
7c462768ed
Added source file/line correspondence for dwarf (PowerPC only at this point.)
...
llvm-svn: 24748
2005-12-16 22:45:29 +00:00
Nate Begeman
672578bd94
Add a second vector type to the VRRC register class, and fix some patterns
...
so that tablegen can infer all types.
llvm-svn: 24746
2005-12-16 09:19:13 +00:00
Nate Begeman
e37cb604c1
Use the new predicate support that Evan Cheng added to remove some code
...
from the DAGToDAG cpp file. This adds pattern support for vector and
scalar fma, which passes test/Regression/CodeGen/PowerPC/fma.ll, and
does the right thing in the presence of -disable-excess-fp-precision.
Allows us to match:
void %foo(<4 x float> * %a) {
entry:
%tmp1 = load <4 x float> * %a;
%tmp2 = mul <4 x float> %tmp1, %tmp1
%tmp3 = add <4 x float> %tmp2, %tmp1
store <4 x float> %tmp3, <4 x float> *%a
ret void
}
As:
_foo:
li r2, 0
lvx v0, r2, r3
vmaddfp v0, v0, v0, v0
stvx v0, r2, r3
blr
Or, with llc -disable-excess-fp-precision,
_foo:
li r2, 0
lvx v0, r2, r3
vxor v1, v1, v1
vmaddfp v1, v0, v0, v1
vaddfp v0, v1, v0
stvx v0, r2, r3
blr
llvm-svn: 24719
2005-12-14 22:54:33 +00:00
Evan Cheng
3db275d996
Added predicate !NoExcessFPPrecision to FMADD, FMADDS, FMSUB, and FMSUBS.
...
llvm-svn: 24716
2005-12-14 22:07:12 +00:00
Nate Begeman
40f081d8e0
Add support for fmul node of type v4f32.
...
void %foo(<4 x float> * %a) {
entry:
%tmp1 = load <4 x float> * %a;
%tmp2 = mul <4 x float> %tmp1, %tmp1
store <4 x float> %tmp2, <4 x float> *%a
ret void
}
Is selected to:
_foo:
li r2, 0
lvx v0, r2, r3
vxor v1, v1, v1
vmaddfp v0, v0, v0, v1
stvx v0, r2, r3
blr
llvm-svn: 24701
2005-12-14 00:34:09 +00:00
Nate Begeman
69caef2b78
Prepare support for AltiVec multiply, divide, and sqrt.
...
llvm-svn: 24700
2005-12-13 22:55:22 +00:00
Chris Lattner
090eed0483
Remove type casts that are no longer needed
...
llvm-svn: 24661
2005-12-11 07:45:47 +00:00
Nate Begeman
4e56db674c
Add support for TargetConstantPool nodes to the dag isel emitter, and use
...
them in the PPC backend, to simplify some logic out of Select and
SelectAddr.
llvm-svn: 24657
2005-12-10 02:36:00 +00:00
Nate Begeman
ade6f9a255
Add support patterns to many load and store instructions which will
...
hopefully use patterns in the near future.
llvm-svn: 24651
2005-12-09 23:54:18 +00:00
Chris Lattner
fea33f7e64
Use new PPC-specific nodes to represent shifts which require the 6-bit
...
amount handling that PPC provides. These are generated by the lowering code
and prevents the dag combiner from assuming (rightfully) that the shifts
don't only look at 5 bits. This fixes a miscompilation of crafty with
the new front-end.
llvm-svn: 24615
2005-12-06 02:10:38 +00:00
Chris Lattner
f3322af5c6
Add some explicit type casts so that tblgen knows the type of the shift
...
amount, which is not necessarily the same as the type being shifted.
llvm-svn: 24594
2005-12-05 02:34:05 +00:00
Chris Lattner
f979794717
Autogen matching code for ADJCALLSTACK[UP|DOWN], thanks to Evan's tblgen
...
improvements.
llvm-svn: 24591
2005-12-04 19:01:59 +00:00
Chris Lattner
fd857daa0d
Finish moving uncond br over to .td file, remove from .cpp file.
...
llvm-svn: 24590
2005-12-04 18:48:01 +00:00
Chris Lattner
d9d18aff6a
Define BR in the .td file now that Evan made tblgen smarter.
...
llvm-svn: 24589
2005-12-04 18:42:54 +00:00
Nate Begeman
048b26387b
Represent the encoding of the SPR instructions as they actually are, so
...
that we can use the correct SPR numbers in the InstrInfo.td file. This is
necessary to support VRsave.
llvm-svn: 24521
2005-11-29 22:42:50 +00:00
Nate Begeman
c138118cdb
Add the remainder of the AltiVec 4 x float instructions. Further
...
enhancements will be necessary to teach the code generator that since
there is no fmul, it will have to do vmaddfp, adding +0.0.
llvm-svn: 24516
2005-11-29 08:04:45 +00:00
Nate Begeman
11fd6b22b1
Small tweaks noticed while on the plane.
...
llvm-svn: 24492
2005-11-26 22:39:34 +00:00
Nate Begeman
8492fd30ab
Some first bits of AltiVec stuff: Instruction Formats, Encodings, and
...
Registers. Apologies to Jim if the scheduling info so far isn't accurate.
There's a few more things like VRsave support that need to be finished up
in my local tree before I can commit code that Does The Right Thing for
turning 4 x float into the various altivec packed float instructions.
llvm-svn: 24489
2005-11-23 05:29:52 +00:00
Chris Lattner
bd9efdb64c
disentangle call operands from branch operands a bit
...
llvm-svn: 24400
2005-11-17 19:16:08 +00:00
Chris Lattner
4b11fa284d
Generate LA and ADDIS when possible.
...
llvm-svn: 24395
2005-11-17 17:52:01 +00:00
Chris Lattner
595088aa0f
Add an initial hack at legalizing GlobalAddress into the appropriate nodes
...
on Darwin to remove smarts from the isel. This is currently disabled by
default (uncomment setOperationAction(ISD::GlobalAddress to enable it).
tblgen needs to become smarter about tglobaladdr nodes and bigger patterns
needed to be added to the .td file. However, we can currently emit stuff like
this: :)
li r2, lo16(L_x$non_lazy_ptr)
lis r3, ha16(L_x$non_lazy_ptr)
lwzx r2, r3, r2
The obvious improvements will follow.
llvm-svn: 24390
2005-11-17 07:30:41 +00:00
Chris Lattner
63ed749ce0
LI could theoretically be used for the lo-part of a global address, just like
...
lis can be used for the high part.
llvm-svn: 24388
2005-11-17 07:04:43 +00:00
Nate Begeman
a171f6b20c
Patch to clean up function call pseudos and support the BLA instruction,
...
which branches to an absolute address. This is required to support objc
direct dispatch.
llvm-svn: 24370
2005-11-16 00:48:01 +00:00
Chris Lattner
5d6cb604de
add support for branch on ordered/unordered.
...
llvm-svn: 24067
2005-10-28 20:32:44 +00:00
Chris Lattner
81ff73ec46
autogen undef
...
llvm-svn: 23991
2005-10-25 21:03:41 +00:00
Chris Lattner
b439dad538
Allow pseudos to have patterns, no functionality change
...
llvm-svn: 23988
2005-10-25 20:58:43 +00:00
Chris Lattner
261009a4df
Autogen fsel
...
llvm-svn: 23987
2005-10-25 20:55:47 +00:00
Chris Lattner
cd7f101c9a
Autogen a few new ppc-specific nodes
...
llvm-svn: 23985
2005-10-25 20:41:46 +00:00
Chris Lattner
e296949fbe
Instead of aborting if not a case we can handle specially, break out and
...
let the generic code handle it. This fixes CodeGen/Generic/2005-10-21-longlonggtu.ll on ppc.
also, reindent this code
llvm-svn: 23874
2005-10-21 21:17:10 +00:00
Nate Begeman
fd0d55ec69
Match rotate. This does actually match the rotates in an rc5 cipher, but I
...
haven't seen it fire on our testsuite.
llvm-svn: 23863
2005-10-21 06:36:18 +00:00
Nate Begeman
60bbe2d1e5
Add some more patterns for i64 on ppc
...
llvm-svn: 23842
2005-10-20 07:51:08 +00:00
Jim Laskey
74ab9960f2
Added InstrSchedClass to each of the PowerPC Instructions.
...
Note that when adding new instructions that you should refer to the table at the
bottom of PPCSchedule.td.
llvm-svn: 23830
2005-10-19 19:51:16 +00:00
Nate Begeman
9f3c26c4ea
Write patterns for the various shl and srl patterns that don't involve
...
doing something clever.
llvm-svn: 23824
2005-10-19 18:42:01 +00:00
Chris Lattner
c16b0c387f
now that tblgen is smarter, use integers directly. This should help Andrew too
...
llvm-svn: 23818
2005-10-19 04:32:04 +00:00
Chris Lattner
5b6f4dc623
Convert these cases to patterns
...
llvm-svn: 23811
2005-10-19 01:38:02 +00:00
Nate Begeman
9eaa6bac06
Woo, it kinda works. We now generate this atrociously bad, but correct,
...
code for long long foo(long long a, long long b) { return a + b; }
_foo:
or r2, r3, r3
or r3, r4, r4
or r4, r5, r5
or r5, r6, r6
rldicr r2, r2, 32, 31
rldicl r3, r3, 0, 32
rldicr r4, r4, 32, 31
rldicl r5, r5, 0, 32
or r2, r3, r2
or r3, r5, r4
add r4, r3, r2
rldicl r2, r4, 32, 32
or r4, r4, r4
or r3, r2, r2
blr
llvm-svn: 23809
2005-10-19 01:12:32 +00:00
Nate Begeman
92e77502f3
Make a new reg class for 64 bit regs that aliases the 32 bit regs. This
...
will have to tide us over until we get real subreg support, but it prevents
the PrologEpilogInserter from spilling 8 byte GPRs on a G4 processor.
Add some initial support for TRUNCATE and ANY_EXTEND, but they don't
currently work due to issues with ScheduleDAG. Something wll have to be
figured out.
llvm-svn: 23803
2005-10-19 00:05:37 +00:00
Chris Lattner
5a2fb9787b
Fix the JIT encoding of LWA, LD, STD, and STDU.
...
llvm-svn: 23787
2005-10-18 16:51:22 +00:00
Nate Begeman
0b71e007ef
First bits of 64 bit PowerPC stuff, currently disabled. A lot of this is
...
purely mechanical.
llvm-svn: 23778
2005-10-18 00:28:58 +00:00
Chris Lattner
286c1d7cfa
Add a pattern for FSQRTS
...
llvm-svn: 23750
2005-10-15 21:44:15 +00:00
Chris Lattner
7503d46feb
Rename PowerPC*.td -> PPC*.td
...
llvm-svn: 23740
2005-10-14 23:40:39 +00:00