386 Commits

Author SHA1 Message Date
Chris Lattner ab882abce8 add support for using vxor to build zero vectors. This implements
Regression/CodeGen/PowerPC/vec_zero.ll

llvm-svn: 27059
2006-03-24 07:48:08 +00:00
Chris Lattner f5efddf80b Gabor points out that we can't spell. :)
llvm-svn: 27049
2006-03-24 07:12:19 +00:00
Chris Lattner 81137629e0 Add PPC vector bit-convert support
llvm-svn: 26995
2006-03-23 19:54:27 +00:00
Chris Lattner 4a66d69433 When possible, custom lower 32-bit SINT_TO_FP to this:
_foo2:
        extsw r2, r3
        std r2, -8(r1)
        lfd f0, -8(r1)
        fcfid f0, f0
        frsp f1, f0
        blr

instead of this:

_foo2:
        lis r2, ha16(LCPI2_0)
        lis r4, 17200
        xoris r3, r3, 32768
        stw r3, -4(r1)
        stw r4, -8(r1)
        lfs f0, lo16(LCPI2_0)(r2)
        lfd f1, -8(r1)
        fsub f0, f1, f0
        frsp f1, f0
        blr

This speeds up Misc/pi from 2.44s->2.09s with LLC and from 3.01s->2.18s
with llcbeta (16.7% and 38.1% respectively).

llvm-svn: 26943
2006-03-22 05:30:33 +00:00
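
A minimal sketch of why the two SINT_TO_FP sequences in llvm-svn 26943 should produce the same value, modelled in Python rather than PowerPC assembly. The function names are hypothetical. The constants 0x43300000 (`lis r4, 17200`) and 0x80000000 (`xoris r3, r3, 32768`) are taken from the listing; the value 2^52 + 2^31 is an assumption about what LCPI2_0 holds, since the commit does not show it.

```python
import struct

def to_f32(x: float) -> float:
    """Round a double to single precision, as frsp does."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def sint_to_fp_old(x: int) -> float:
    """Old sequence: assemble a double from two 32-bit words, then subtract a bias."""
    hi = 0x43300000                      # lis r4, 17200: exponent field for 2**52
    lo = (x & 0xFFFFFFFF) ^ 0x80000000   # xoris r3, r3, 32768: flip the sign bit
    biased = struct.unpack(">d", struct.pack(">II", hi, lo))[0]  # lfd f1, -8(r1)
    magic = float(2**52 + 2**31)         # assumed contents of LCPI2_0
    return to_f32(biased - magic)        # fsub, then frsp

def sint_to_fp_new(x: int) -> float:
    """New sequence: sign-extend to 64 bits and convert the integer directly."""
    return to_f32(float(x))              # extsw/std/lfd/fcfid, then frsp
```

Calling both helpers on values such as -1, 0, and the 32-bit extremes is a quick way to check that the bias trick and the direct conversion agree under these assumptions.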
Chris Lattner 4e7371758f Fix the JIT encoding of the VAForm_1 instructions, including vmaddfp
llvm-svn: 26935
2006-03-22 01:44:36 +00:00
Chris Lattner d2132f87d7 When codegen'ing vector MUL using VFMADD, *add* the 0, don't *mul* the 0.
llvm-svn: 26913
2006-03-21 00:51:38 +00:00
Chris Lattner a1bc294f0c Fix a couple of bugs in permute/splat generation, thanks to Nate for actually
figuring these out! :)

llvm-svn: 26904
2006-03-20 18:26:51 +00:00
Chris Lattner f96d523b8f Fix the pattern for VADDUWM, add i32 splat
llvm-svn: 26901
2006-03-20 17:51:58 +00:00
Evan Cheng 89f3cff0f5 Use tblgen'd VECTOR_SHUFFLE selection code.
llvm-svn: 26900
2006-03-20 08:14:16 +00:00
Chris Lattner a9a1313386 Add support for generating vspltw, instead of a vperm instruction with a
constant pool load.  This generates significantly nicer code for splats.

When tblgen gets bugfixed, we can remove the custom selection code.

llvm-svn: 26898
2006-03-20 06:51:10 +00:00
Chris Lattner 382f356bd9 Check in some intermediate code that adds a skeleton for matching vsplt*
instructions

llvm-svn: 26894
2006-03-20 06:15:45 +00:00
Chris Lattner 93d99f9928 fix typo
llvm-svn: 26889
2006-03-20 05:05:55 +00:00
Chris Lattner 366b2514fa add vsplat instructions, fix sched description for vperm
llvm-svn: 26888
2006-03-20 04:47:33 +00:00
Chris Lattner a8713b1ee6 Custom lower arbitrary VECTOR_SHUFFLE's to VPERM.
TODO: leave specific ones as VECTOR_SHUFFLE's and turn them into specialized
operations like vsplt*

llvm-svn: 26887
2006-03-20 01:53:53 +00:00
Chris Lattner e7a058de7d add the vperm instruction
llvm-svn: 26883
2006-03-20 01:00:56 +00:00
Chris Lattner 7e9440a4fc Custom lower SCALAR_TO_VECTOR into lve*x.
llvm-svn: 26868
2006-03-19 06:55:52 +00:00
Chris Lattner 5b595af956 add support for vector undef
llvm-svn: 26863
2006-03-19 06:10:09 +00:00
Chris Lattner 0c9eb670bb minor fixes
llvm-svn: 26857
2006-03-19 05:43:01 +00:00
Chris Lattner 431c90c9fa we don't use lmw/stmw. When we want them they are easy enough to add
llvm-svn: 26853
2006-03-19 04:33:37 +00:00
Nate Begeman 21f87d0e4c Fix subfic to match subc by default instead of sub so that it is correctly
cost-modeled as producing a flag.  This fixes the test I just added for neg

llvm-svn: 26835
2006-03-17 22:41:37 +00:00
Nate Begeman bb01d4f272 Remove BRTWOWAY*
Make the PPC backend not dependent on BRTWOWAY_CC and make the branch
selector smarter about the code it generates, fixing a case in the
readme.

llvm-svn: 26814
2006-03-17 01:40:33 +00:00
Chris Lattner 1e6dfa4c1f Strangely, calls clobber call-clobbered vector regs. Whodathoughtit?
llvm-svn: 26808
2006-03-16 22:35:59 +00:00
Chris Lattner fd9f3e8ed3 Add support for copying registers. still needed: spilling and reloading them
llvm-svn: 26800
2006-03-16 20:03:58 +00:00
Nate Begeman 2e1fde7c5c Update scheduling info for vrsave instruction
llvm-svn: 26776
2006-03-15 05:25:05 +00:00
Chris Lattner 02e2c18c9c For functions that use vector registers, save VRSAVE, mark used
registers, and update it on entry to each function, then restore it on exit.

This compiles:

void func(vfloat *a, vfloat *b, vfloat *c) {
        *a = *b * *c + *c;
}

to this:

_func:
        mfspr r2, 256
        oris r6, r2, 49152
        mtspr 256, r6
        lvx v0, 0, r5
        lvx v1, 0, r4
        vmaddfp v0, v1, v0, v0
        stvx v0, 0, r3
        mtspr 256, r2
        blr

GCC produces this (which has additional stack accesses):

_func:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc000
        mtspr 256,r0
        lvx v0,0,r5
        lvx v1,0,r4
        lwz r12,-4(r1)
        vmaddfp v0,v0,v1,v0
        stvx v0,0,r3
        mtspr 256,r12
        blr

llvm-svn: 26733
2006-03-13 21:52:10 +00:00
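
The `oris r6, r2, 49152` in llvm-svn 26733 can be read as marking v0 and v1 as live in VRSAVE. A small sketch of that bookkeeping follows, assuming the usual PowerPC convention that v0 maps to the most significant bit of the register. The helper names are hypothetical and are not taken from the backend.

```python
def vrsave_mask(used_vregs):
    """Return the 32-bit VRSAVE mask for a collection of vector register numbers.

    Assumes big-endian bit numbering: v0 is the most significant bit.
    """
    mask = 0
    for n in used_vregs:
        if not 0 <= n <= 31:
            raise ValueError(f"no such vector register: v{n}")
        mask |= 0x80000000 >> n
    return mask

def oris_immediate(mask):
    """Upper 16 bits of the mask, i.e. the immediate an oris would OR in."""
    return mask >> 16
```

For the function in the commit, which touches only v0 and v1, `oris_immediate(vrsave_mask([0, 1]))` gives 49152 (0xC000), matching both the LLVM and GCC listings.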
Chris Lattner 7579cfb1a0 Mark instructions that are cracked by the PPC970 decoder as such.
llvm-svn: 26720
2006-03-13 05:15:10 +00:00
Chris Lattner 51348c5f27 Several big changes:
1. Use flags on the instructions in the .td file to indicate the PPC970 unit
   type instead of a table in the .cpp file.  Much cleaner.
2. Change the hazard recognizer to build d-groups according to the actual
   algorithm used, not my flawed understanding of it.
3. Model "must be in the first slot" and "must be the only instr in a group"
   accurately.

llvm-svn: 26719
2006-03-12 09:13:49 +00:00
Chris Lattner ea79d9fd73 implement TII::insertNoop
llvm-svn: 26562
2006-03-05 23:49:55 +00:00
Chris Lattner 27f5345b1f Compile this:
void foo(float a, int *b) { *b = a; }

to this:

_foo:
        fctiwz f0, f1
        stfiwx f0, 0, r4
        blr

instead of this:

_foo:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        stw r2, 0(r4)
        blr

This implements CodeGen/PowerPC/stfiwx.ll, and also incidentally does the
right thing for GCC bugzilla 26505.

llvm-svn: 26447
2006-03-01 05:50:56 +00:00
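
A rough model of the two store sequences in llvm-svn 26447, to show why dropping the stack round trip does not change the stored value. The function names are hypothetical. The upper word of the register after `fctiwz` is treated as zero here for simplicity, and saturation of out-of-range inputs is not modelled.

```python
import struct

def fctiwz(x: float) -> bytes:
    """8-byte image of the FPR after fctiwz: truncated integer in the low word."""
    return struct.pack(">Ii", 0, int(x))        # int() truncates toward zero

def store_via_stack(fpr: bytes) -> int:
    """Old sequence: stfd to an 8-byte stack slot, then lwz of its last 4 bytes."""
    slot = bytearray(fpr)                        # stfd f0, -8(r1)
    return struct.unpack(">i", slot[4:8])[0]     # lwz r2, -4(r1)

def store_via_stfiwx(fpr: bytes) -> int:
    """New sequence: stfiwx writes the low word of the FPR directly."""
    return struct.unpack(">i", fpr[4:8])[0]      # stfiwx f0, 0, r4
```

Both helpers read the same four bytes, which is the point: the new sequence only removes the intermediate store and load.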
Nate Begeman 5965bd19f8 kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.

llvm-svn: 26255
2006-02-17 05:43:56 +00:00
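
What the ADDC and ADDE nodes from llvm-svn 26255 expose can be sketched as an explicit carry chain over 32-bit halves. This is an illustrative model with hypothetical function names, not the SelectionDAG API.

```python
MASK32 = 0xFFFFFFFF

def addc(a, b):
    """Add two 32-bit words; return (sum, carry_out)."""
    total = a + b
    return total & MASK32, total >> 32

def adde(a, b, carry_in):
    """Add two 32-bit words plus an incoming carry; return (sum, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32

def add64(a, b):
    """64-bit add as ADDC on the low halves followed by ADDE on the high halves."""
    lo, carry = addc(a & MASK32, b & MASK32)
    hi, _ = adde((a >> 32) & MASK32, (b >> 32) & MASK32, carry)
    return (hi << 32) | lo
```

SUBC and SUBE would follow the same shape with a borrow in place of the carry.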
Nate Begeman bc3ec1d37b Add missing patterns for andi. and andis., fixing test/Regression/CodeGen/
PowerPC/and-imm.ll

llvm-svn: 26136
2006-02-12 09:09:52 +00:00
Chris Lattner 1240574609 PHI and INLINEASM are now built-in instructions provided by Target.td
llvm-svn: 25674
2006-01-27 01:46:15 +00:00
Chris Lattner 268d3584fc ahem :)
llvm-svn: 25239
2006-01-12 02:05:36 +00:00
Nate Begeman 1b8121b227 Add bswap, rotl, and rotr nodes
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl

Targets should add rotl/rotr patterns if they have them

llvm-svn: 25222
2006-01-11 21:21:00 +00:00
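
The commit does not spell out which idiom the dag combiner recognizes for llvm-svn 25222. A common one, assumed here, is a pair of opposite shifts joined by an OR. The sketch below shows the 32-bit form with hypothetical helper names.

```python
def rotl32(x, n):
    """Rotate a 32-bit value left by n bits, for 0 <= n < 32."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def rotr32(x, n):
    """Rotate right by n, expressed as a left rotate by the complement."""
    return rotl32(x, (32 - n) % 32)
```

Expressing `rotr32` through `rotl32` mirrors how a target with only a rotate-left instruction can still match both nodes.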
Nate Begeman 477933cfbd Remove a comment that no longer applies.
llvm-svn: 25167
2006-01-10 00:15:59 +00:00
Chris Lattner bfb2de9030 add ret void support back
llvm-svn: 25164
2006-01-09 23:20:37 +00:00
Evan Cheng 7785e5b3a4 New DAG node properties SNDPInFlag, SNDPOutFlag, and SNDPOptInFlag to replace
hasInFlag, hasOutFlag.

llvm-svn: 25155
2006-01-09 18:28:21 +00:00
Jim Laskey 762e9ec06c Added initial support for DEBUG_LABEL allowing debug specific labels to be
inserted in the code.

llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Jim Laskey 0da76a676a Add unique id to debug location for debug label use (work in progress.)
llvm-svn: 25096
2006-01-04 15:04:11 +00:00
Nate Begeman 336dba6fb1 Add support for generating v4i32 altivec code
llvm-svn: 25046
2005-12-30 00:12:56 +00:00
Evan Cheng 14c53b45f5 Added field noResults to Instruction.
Currently tblgen cannot tell which operands in the operand list are results so
it assumes the first one is a result. This is bad. Ideally we would fix this
by separating results from inputs, e.g. (res R32:$dst),
(ops R32:$src1, R32:$src2). But that's a more disruptive change. Adding
'let noResults = 1' is the workaround to tell tblgen that the instruction does
not produce a result. It works for now since tblgen does not support
instructions which produce multiple results.

llvm-svn: 25017
2005-12-26 09:11:45 +00:00
Evan Cheng 9ae486047e * Removed the use of FLAG. Now use hasFlagIn and hasFlagOut instead.
* Added a pseudo instruction (for each target) that represents "return void".
  This is a workaround for lack of optional flag operand (return void is not
  lowered so it does not have a flag operand.)

llvm-svn: 24997
2005-12-23 22:14:32 +00:00
Evan Cheng 82285c55aa Flip the meaning of FPContractions to reflect Requires<[]> change.
llvm-svn: 24884
2005-12-20 20:08:53 +00:00
Nate Begeman b11b8e44fa Pattern-match return. Includes gross hack!
llvm-svn: 24874
2005-12-20 00:26:01 +00:00
Nate Begeman 8e6a8af205 Convert load/store over to being pattern matched
llvm-svn: 24871
2005-12-19 23:25:09 +00:00
Jim Laskey 7c462768ed Added source file/line correspondence for dwarf (PowerPC only at this point.)
llvm-svn: 24748
2005-12-16 22:45:29 +00:00
Nate Begeman 672578bd94 Add a second vector type to the VRRC register class, and fix some patterns
so that tablegen can infer all types.

llvm-svn: 24746
2005-12-16 09:19:13 +00:00
Nate Begeman e37cb604c1 Use the new predicate support that Evan Cheng added to remove some code
from the DAGToDAG cpp file.  This adds pattern support for vector and
scalar fma, which passes test/Regression/CodeGen/PowerPC/fma.ll, and
does the right thing in the presence of -disable-excess-fp-precision.

Allows us to match:
void %foo(<4 x float> * %a) {
entry:
  %tmp1 = load <4 x float> * %a;
  %tmp2 = mul <4 x float> %tmp1, %tmp1
  %tmp3 = add <4 x float> %tmp2, %tmp1
  store <4 x float> %tmp3, <4 x float> *%a
  ret void
}

As:

_foo:
        li r2, 0
        lvx v0, r2, r3
        vmaddfp v0, v0, v0, v0
        stvx v0, r2, r3
        blr

Or, with llc -disable-excess-fp-precision,

_foo:
        li r2, 0
        lvx v0, r2, r3
        vxor v1, v1, v1
        vmaddfp v1, v0, v0, v1
        vaddfp v0, v1, v0
        stvx v0, r2, r3
        blr

llvm-svn: 24719
2005-12-14 22:54:33 +00:00
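
The two listings in llvm-svn 24719 differ in how often the result is rounded: a single `vmaddfp` rounds once, while the `-disable-excess-fp-precision` sequence rounds the product before adding. The sketch below illustrates the difference for one lane. The function names are hypothetical, and the fused version is only modelled by doing the arithmetic in a double, which is exact for the inputs chosen here but is not a faithful fused multiply-add in general.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def fused_madd(a, b, c):
    """a*b + c rounded to single precision once, at the end (see assumption above)."""
    return f32(a * b + c)

def separate_mul_add(a, b, c):
    """Round the product to single precision first, then add and round again."""
    return f32(f32(a * b) + c)

# Inputs picked so that the product needs one more bit than single precision has.
a = 1.0 + 2.0**-12
c = -(1.0 + 2.0**-11)
```

With these inputs `fused_madd(a, a, c)` keeps the low-order bit of the product and `separate_mul_add(a, a, c)` loses it, which is the kind of excess precision the flag is meant to rule out.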
Evan Cheng 3db275d996 Added predicate !NoExcessFPPrecision to FMADD, FMADDS, FMSUB, and FMSUBS.
llvm-svn: 24716
2005-12-14 22:07:12 +00:00
Nate Begeman 40f081d8e0 Add support for fmul node of type v4f32.
void %foo(<4 x float> * %a) {
entry:
  %tmp1 = load <4 x float> * %a;
  %tmp2 = mul <4 x float> %tmp1, %tmp1
  store <4 x float> %tmp2, <4 x float> *%a
  ret void
}

Is selected to:

_foo:
        li r2, 0
        lvx v0, r2, r3
        vxor v1, v1, v1
        vmaddfp v0, v0, v0, v1
        stvx v0, r2, r3
        blr

llvm-svn: 24701
2005-12-14 00:34:09 +00:00
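
The selection in llvm-svn 24701 relies on AltiVec having a multiply-add but no plain multiply, so the multiply is written as a multiply-add with a zero addend. A lane-wise sketch with hypothetical helper names follows. The operand order `a * c + b` follows the listing, where the last operand of `vmaddfp` is the zeroed register.

```python
import struct

def to_f32(x):
    """Round a double to single precision."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def vmaddfp(a, c, b):
    """Lane-wise a*c + b on 4 x float vectors, rounded to single precision."""
    return [to_f32(x * y + z) for x, y, z in zip(a, c, b)]

def vec_fmul(a, b):
    """Multiply two vectors by adding a zero vector, as the selected code does."""
    zero = [0.0, 0.0, 0.0, 0.0]      # stands in for vxor v1, v1, v1
    return vmaddfp(a, b, zero)       # vmaddfp v0, v0, v0, v1
```

In the commit both multiplicands are the same register, so the call corresponding to the listing is `vec_fmul(v, v)`.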
Nate Begeman 69caef2b78 Prepare support for AltiVec multiply, divide, and sqrt.
llvm-svn: 24700
2005-12-13 22:55:22 +00:00
Chris Lattner 090eed0483 Remove type casts that are no longer needed
llvm-svn: 24661
2005-12-11 07:45:47 +00:00
Nate Begeman 4e56db674c Add support for TargetConstantPool nodes to the dag isel emitter, and use
them in the PPC backend, to simplify some logic out of Select and
SelectAddr.

llvm-svn: 24657
2005-12-10 02:36:00 +00:00
Nate Begeman ade6f9a255 Add support patterns to many load and store instructions which will
hopefully use patterns in the near future.

llvm-svn: 24651
2005-12-09 23:54:18 +00:00
Chris Lattner fea33f7e64 Use new PPC-specific nodes to represent shifts which require the 6-bit
amount handling that PPC provides.  These are generated by the lowering code
and prevent the dag combiner from assuming (rightfully) that the shifts
don't only look at 5 bits.  This fixes a miscompilation of crafty with
the new front-end.

llvm-svn: 24615
2005-12-06 02:10:38 +00:00
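
A sketch of the 6-bit amount handling that llvm-svn 24615 refers to, based on my reading of the PowerPC `slw` instruction: the low six bits of the amount are examined, and amounts of 32 to 63 yield zero. The function names are hypothetical. The second helper shows a shift that looks at only five bits, for contrast.

```python
def ppc_slw(value, amount):
    """Model of slw: uses the low 6 bits of the amount; 32..63 produce 0."""
    n = amount & 0x3F
    if n >= 32:
        return 0
    return (value << n) & 0xFFFFFFFF

def shl_low_5_bits(value, amount):
    """A 32-bit shift that only looks at the low 5 bits of the amount."""
    return (value << (amount & 0x1F)) & 0xFFFFFFFF
```

The two helpers agree for amounts below 32 and diverge at 32 and above, which is why the distinction has to be visible to the optimizer.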
Chris Lattner f3322af5c6 Add some explicit type casts so that tblgen knows the type of the shift
amount, which is not necessarily the same as the type being shifted.

llvm-svn: 24594
2005-12-05 02:34:05 +00:00
Chris Lattner f979794717 Autogen matching code for ADJCALLSTACK[UP|DOWN], thanks to Evan's tblgen
improvements.

llvm-svn: 24591
2005-12-04 19:01:59 +00:00
Chris Lattner fd857daa0d Finish moving uncond br over to .td file, remove from .cpp file.
llvm-svn: 24590
2005-12-04 18:48:01 +00:00
Chris Lattner d9d18aff6a Define BR in the .td file now that Evan made tblgen smarter.
llvm-svn: 24589
2005-12-04 18:42:54 +00:00
Nate Begeman 048b26387b Represent the encoding of the SPR instructions as they actually are, so
that we can use the correct SPR numbers in the InstrInfo.td file.  This is
necessary to support VRsave.

llvm-svn: 24521
2005-11-29 22:42:50 +00:00
Nate Begeman c138118cdb Add the remainder of the AltiVec 4 x float instructions. Further
enhancements will be necessary to teach the code generator that since
there is no fmul, it will have to do vmaddfp, adding +0.0.

llvm-svn: 24516
2005-11-29 08:04:45 +00:00
Nate Begeman 11fd6b22b1 Small tweaks noticed while on the plane.
llvm-svn: 24492
2005-11-26 22:39:34 +00:00
Nate Begeman 8492fd30ab Some first bits of AltiVec stuff: Instruction Formats, Encodings, and
Registers.  Apologies to Jim if the scheduling info so far isn't accurate.

There are a few more things like VRsave support that need to be finished up
in my local tree before I can commit code that Does The Right Thing for
turning 4 x float into the various altivec packed float instructions.

llvm-svn: 24489
2005-11-23 05:29:52 +00:00
Chris Lattner bd9efdb64c disentangle call operands from branch operands a bit
llvm-svn: 24400
2005-11-17 19:16:08 +00:00
Chris Lattner 4b11fa284d Generate LA and ADDIS when possible.
llvm-svn: 24395
2005-11-17 17:52:01 +00:00
Chris Lattner 595088aa0f Add an initial hack at legalizing GlobalAddress into the appropriate nodes
on Darwin to remove smarts from the isel.  This is currently disabled by
default (uncomment setOperationAction(ISD::GlobalAddress to enable it).
tblgen needs to become smarter about tglobaladdr nodes and bigger patterns
need to be added to the .td file.  However, we can currently emit stuff like
this:  :)

        li r2, lo16(L_x$non_lazy_ptr)
        lis r3, ha16(L_x$non_lazy_ptr)
        lwzx r2, r3, r2

The obvious improvements will follow.

llvm-svn: 24390
2005-11-17 07:30:41 +00:00
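
The `lo16`/`ha16` pair in the listing for llvm-svn 24390 splits a 32-bit address so that a sign-extended low half plus a shifted high half gives back the original address. A sketch of that arithmetic follows, assuming the usual "high adjusted" definition of `ha16`. The helper names are hypothetical.

```python
def lo16(addr):
    """Low 16 bits of a 32-bit address."""
    return addr & 0xFFFF

def ha16(addr):
    """High 16 bits, adjusted to compensate for sign extension of the low half."""
    return ((addr + 0x8000) >> 16) & 0xFFFF

def sign_extend16(v):
    """Interpret a 16-bit field as a signed immediate."""
    return v - 0x10000 if v & 0x8000 else v

def rebuild(addr):
    """Recombine the halves the way the three-instruction sequence does."""
    high = (ha16(addr) << 16) & 0xFFFFFFFF   # lis r3, ha16(sym)
    low = sign_extend16(lo16(addr))          # li r2, lo16(sym)
    return (high + low) & 0xFFFFFFFF         # address formed by lwzx r2, r3, r2
```

The adjustment matters only when bit 15 of the address is set; for such addresses `ha16` is one larger than the plain upper half.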
Chris Lattner 63ed749ce0 LI could theoretically be used for the lo-part of a global address, just like
lis can be used for the high part.

llvm-svn: 24388
2005-11-17 07:04:43 +00:00
Nate Begeman a171f6b20c Patch to clean up function call pseudos and support the BLA instruction,
which branches to an absolute address.  This is required to support objc
direct dispatch.

llvm-svn: 24370
2005-11-16 00:48:01 +00:00
Chris Lattner 5d6cb604de add support for branch on ordered/unordered.
llvm-svn: 24067
2005-10-28 20:32:44 +00:00
Chris Lattner 81ff73ec46 autogen undef
llvm-svn: 23991
2005-10-25 21:03:41 +00:00
Chris Lattner b439dad538 Allow pseudos to have patterns, no functionality change
llvm-svn: 23988
2005-10-25 20:58:43 +00:00
Chris Lattner 261009a4df Autogen fsel
llvm-svn: 23987
2005-10-25 20:55:47 +00:00
Chris Lattner cd7f101c9a Autogen a few new ppc-specific nodes
llvm-svn: 23985
2005-10-25 20:41:46 +00:00
Chris Lattner e296949fbe Instead of aborting if not a case we can handle specially, break out and
let the generic code handle it.  This fixes CodeGen/Generic/2005-10-21-longlonggtu.ll on ppc.

also, reindent this code

llvm-svn: 23874
2005-10-21 21:17:10 +00:00
Nate Begeman fd0d55ec69 Match rotate. This does actually match the rotates in an rc5 cipher, but I
haven't seen it fire on our testsuite.

llvm-svn: 23863
2005-10-21 06:36:18 +00:00
Nate Begeman 60bbe2d1e5 Add some more patterns for i64 on ppc
llvm-svn: 23842
2005-10-20 07:51:08 +00:00
Jim Laskey 74ab9960f2 Added InstrSchedClass to each of the PowerPC Instructions.
Note that when adding new instructions that you should refer to the table at the
bottom of PPCSchedule.td.

llvm-svn: 23830
2005-10-19 19:51:16 +00:00
Nate Begeman 9f3c26c4ea Write patterns for the various shl and srl patterns that don't involve
doing something clever.

llvm-svn: 23824
2005-10-19 18:42:01 +00:00
Chris Lattner c16b0c387f now that tblgen is smarter, use integers directly. This should help Andrew too
llvm-svn: 23818
2005-10-19 04:32:04 +00:00
Chris Lattner 5b6f4dc623 Convert these cases to patterns
llvm-svn: 23811
2005-10-19 01:38:02 +00:00
Nate Begeman 9eaa6bac06 Woo, it kinda works. We now generate this atrociously bad, but correct,
code for long long foo(long long a, long long b) { return a + b; }

_foo:
        or r2, r3, r3
        or r3, r4, r4
        or r4, r5, r5
        or r5, r6, r6
        rldicr r2, r2, 32, 31
        rldicl r3, r3, 0, 32
        rldicr r4, r4, 32, 31
        rldicl r5, r5, 0, 32
        or r2, r3, r2
        or r3, r5, r4
        add r4, r3, r2
        rldicl r2, r4, 32, 32
        or r4, r4, r4
        or r3, r2, r2
        blr

llvm-svn: 23809
2005-10-19 01:12:32 +00:00
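
The listing in llvm-svn 23809 glues each pair of 32-bit argument registers into one 64-bit value, adds once, and splits the result again. A sketch of the rotate-and-mask steps follows, with `rldicr` and `rldicl` modelled from their documented behaviour (IBM bit numbering, bit 0 is the most significant). The final function name is hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def rotl64(x, n):
    """Rotate a 64-bit value left by n bits, for 0 <= n < 64."""
    x &= MASK64
    return ((x << n) | (x >> (64 - n))) & MASK64 if n else x

def rldicr(x, sh, me):
    """Rotate left by sh, then keep bits 0..me."""
    return rotl64(x, sh) & ((MASK64 << (63 - me)) & MASK64)

def rldicl(x, sh, mb):
    """Rotate left by sh, then keep bits mb..63."""
    return rotl64(x, sh) & (MASK64 >> mb)

def add_i64_from_halves(a_hi, a_lo, b_hi, b_lo):
    """Follow the listing: combine halves, add, return (high word, low word)."""
    a = rldicr(a_hi, 32, 31) | rldicl(a_lo, 0, 32)
    b = rldicr(b_hi, 32, 31) | rldicl(b_lo, 0, 32)
    total = (a + b) & MASK64
    return rldicl(total, 32, 32), total & 0xFFFFFFFF
```

The register-to-register `or` moves in the listing are not modelled, since they only copy values.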
Nate Begeman 92e77502f3 Make a new reg class for 64 bit regs that aliases the 32 bit regs. This
will have to tide us over until we get real subreg support, but it prevents
the PrologEpilogInserter from spilling 8 byte GPRs on a G4 processor.

Add some initial support for TRUNCATE and ANY_EXTEND, but they don't
currently work due to issues with ScheduleDAG.  Something will have to be
figured out.

llvm-svn: 23803
2005-10-19 00:05:37 +00:00
Chris Lattner 5a2fb9787b Fix the JIT encoding of LWA, LD, STD, and STDU.
llvm-svn: 23787
2005-10-18 16:51:22 +00:00
Nate Begeman 0b71e007ef First bits of 64 bit PowerPC stuff, currently disabled. A lot of this is
purely mechanical.

llvm-svn: 23778
2005-10-18 00:28:58 +00:00
Chris Lattner 286c1d7cfa Add a pattern for FSQRTS
llvm-svn: 23750
2005-10-15 21:44:15 +00:00
Chris Lattner 7503d46feb Rename PowerPC*.td -> PPC*.td
llvm-svn: 23740
2005-10-14 23:40:39 +00:00