Commit Graph

394 Commits

Author SHA1 Message Date
Chris Lattner 259e6c76f2 Add a recursive-iterative hybrid stage to attempt to reduce stack space, this
helps but not enough.

Start pulling cases out of PPC32DAGToDAGISel::Select.  With GCC 4, this function
required 8512 bytes of stack space for each invocation (GCC 3 required less
than 700 bytes).  Pulling this first function out gets us down to 8224.  More
to come :(

llvm-svn: 23647
2005-10-06 18:45:51 +00:00
Chris Lattner 3734d204b8 another solution to the fsel issue. Instead of having 4 variants, just force
the comparison to be 64-bits.  This is fine because extensions from float
to double are free.

llvm-svn: 23589
2005-10-02 07:07:49 +00:00
Chris Lattner 9e98672962 fsel can take a different FP type for the comparison and for the result. As such
split the FSEL family into 4 things instead of just two.

llvm-svn: 23588
2005-10-02 06:58:23 +00:00
Chris Lattner 5ab9d42bb4 Minor tweak to the branch selector. When emitting a two-way branch, and if
we're in a single-mbb loop, make sure to emit the backwards branch as the
conditional branch instead of the uncond branch.  For example, emit this:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        ble cr0, LBBl29_z__44
        b LBBl29_z__48                   *** NOT PART OF LOOP

Instead of:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        bgt cr0, LBBl29_z__48            *** PART OF LOOP!
        b LBBl29_z__44

The former sequence has one fewer dispatch group for the loop body.

llvm-svn: 23582
2005-10-01 23:06:26 +00:00
Chris Lattner 8713ebf37c fix typo
llvm-svn: 23578
2005-10-01 02:51:36 +00:00
Chris Lattner d3eee1a09b Modify the ppc backend to use two register classes for FP: F8RC and F4RC.
These are used to represent float and double values, and the two regclasses
contain the same physical registers.

llvm-svn: 23577
2005-10-01 01:35:02 +00:00
Jim Laskey f61232354f Should be using flag and not chain.
llvm-svn: 23572
2005-09-30 23:43:37 +00:00
Chris Lattner 1de5706e68 Remove code for patterns that are autogenerated
llvm-svn: 23532
2005-09-29 23:33:31 +00:00
Chris Lattner 08c319fbdd Never rely on ReplaceAllUsesWith when selecting, use CodeGenMap instead.
ReplaceAllUsesWith does not replace scalars SDOperand floating around on
the stack, permitting things to be selected multiple times.

llvm-svn: 23515
2005-09-29 00:59:32 +00:00
Chris Lattner b9b2e77295 Autogen MUL, move FP cases together
llvm-svn: 23512
2005-09-28 22:53:16 +00:00
Chris Lattner 5769311c92 disentangle FP from INT versions of div/mul
llvm-svn: 23511
2005-09-28 22:50:24 +00:00
Chris Lattner 585131baaf Use the autogenerated matcher for ADD/SUB
llvm-svn: 23510
2005-09-28 22:47:28 +00:00
Chris Lattner d3ea19b51a Add FP versions of the binary operators, keeping the int and fp worlds seperate.
llvm-svn: 23506
2005-09-28 22:29:58 +00:00
Chris Lattner fab48b3285 All (xor *) cases are autogenerated now
llvm-svn: 23497
2005-09-28 18:12:37 +00:00
Chris Lattner 33f8e08c8f Implement PowerPC/eqv-andc-orc-nor.ll:EQV3
llvm-svn: 23494
2005-09-28 18:04:52 +00:00
Chris Lattner bb5939a436 These nodes are all autogenerated
llvm-svn: 23489
2005-09-28 17:07:09 +00:00
Chris Lattner c628f00845 Make sure to clear the CodeGenMap after each basic block is selected to avoid
cross MBB pollution.

llvm-svn: 23470
2005-09-27 17:45:33 +00:00
Chris Lattner b011cb2746 we don't need this proto any longer
llvm-svn: 23342
2005-09-13 22:05:21 +00:00
Chris Lattner 03e08eefc7 move the #include for the generated code into the isel class body so we
can use/define class methods

llvm-svn: 23339
2005-09-13 22:03:06 +00:00
Chris Lattner 4309c3a785 PowerPC cannot truncstore i1 natively
llvm-svn: 23304
2005-09-10 00:21:06 +00:00
Chris Lattner 498915dafa Remove some cases handled by the generated portion of the isel
llvm-svn: 23262
2005-09-07 23:45:15 +00:00
Nate Begeman 6095214bf0 Implement i64<->fp using the fctidz/fcfid instructions on PowerPC when we
are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.

This speeds up 189.lucas from 81.99 to 32.64 seconds.

llvm-svn: 23250
2005-09-06 22:03:27 +00:00
Chris Lattner 8ae9525bd0 include the dag isel fragment
llvm-svn: 23239
2005-09-03 01:17:22 +00:00
Chris Lattner 5f12cf14be Change the isel to not break out of the big giant switch. Instead, the
switch should never be exited, so its bottom is now unreachable.

llvm-svn: 23234
2005-09-03 00:53:47 +00:00
Chris Lattner a305d28cf6 Implement dynamic allocas correctly. In particular, because we were copying
directly out of R1 (without using a CopyFromReg, which uses a chain), multiple
allocas were getting CSE'd together, producing bogus code.  For this:

int %foo(bool %X, int %A, int %B) {
        br bool %X, label %T, label %F
F:
        %G = alloca int
        %H = alloca int
        store int %A, int* %G
        store int %B, int* %H
        %R = load int* %G
        ret int %R
T:
        ret int 0
}

We were generating:

_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        li r2, 16
        subf r2, r2, r1   ;; One alloca
        or r1, r2, r2
        or r3, r1, r1
        or r1, r2, r2
        or r2, r1, r1
        stw r4, 0(r3)
        stw r5, 0(r2)
        lwz r3, 0(r3)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr

Now we generate:

_foo:
        stwu r1, -16(r1)
        stw r31, 4(r1)
        or r31, r1, r1
        stw r1, 12(r31)
        cmpwi cr0, r3, 0
        bne cr0, .LBB_foo_2     ; T
.LBB_foo_1:     ; F
        or r2, r1, r1
        li r3, 16
        subf r2, r3, r2  ;; Alloca 1
        or r1, r2, r2
        or r2, r1, r1
        or r6, r1, r1
        subf r3, r3, r6  ;; Alloca 2
        or r1, r3, r3
        or r3, r1, r1
        stw r4, 0(r2)
        stw r5, 0(r3)
        lwz r3, 0(r2)
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr
.LBB_foo_2:     ; T
        li r3, 0
        lwz r1, 12(r31)
        lwz r31, 4(r31)
        lwz r1, 0(r1)
        blr

This fixes Povray and SPASS with the dag isel, the last two failing cases.
Tommorow we will hopefully turn it on by default! :)

llvm-svn: 23190
2005-09-01 21:31:30 +00:00
Chris Lattner 293b3a68e0 Fix a bug where we were useing HA to get the high part, which seems like it
could cause a miscompile.  Fixing this didn't fix the two programs that fail
though.  :(

This also changes the implementation to follow the pattern selector more
closely, causing us to select 0 to li instead of lis.

llvm-svn: 23189
2005-09-01 19:38:28 +00:00
Chris Lattner 34182aff7f Do not select the operands being passed into SelectCC. IT does this itself
and selecting early prevents folding immediates into the cmpw* instructions

llvm-svn: 23188
2005-09-01 19:20:44 +00:00
Chris Lattner da2e04c69d Move FCTIWZ handling out of the instruction selectors and into legalization,
getting them out of the business of making stack slots.

llvm-svn: 23180
2005-08-31 21:09:52 +00:00
Chris Lattner 6bad1fb19e Remove dead code
llvm-svn: 23179
2005-08-31 20:25:15 +00:00
Chris Lattner 2bd2af8ecd add assert zext/sext to the dag isel
llvm-svn: 23171
2005-08-31 18:08:46 +00:00
Chris Lattner f4d594370b Fix 'ret long' to return the high and lo parts in the right registers. This
fixes crafty and probably others.

llvm-svn: 23167
2005-08-31 01:34:29 +00:00
Chris Lattner 69e9a9a94c now that physregs can exist in the same dag with multiple types, remove some
ugly hacks

llvm-svn: 23162
2005-08-30 22:59:48 +00:00
Chris Lattner 8f8d539746 Fix type mismatches when passing f32 values to calls
llvm-svn: 23159
2005-08-30 21:28:19 +00:00
Chris Lattner 9f23ae226f Fix some indentation (first hunks).
Remove code (last hunk) that miscompiled immediate and's, such as
  and uint %tmp.30, 4294958079

into

 andi. r8, r8, 56319
 andis. r8, r8, 65535

instead of:

 li r9, -9217
 and r8, r8, r9

The first always generates zero.

This fixes espresso.

llvm-svn: 23155
2005-08-30 18:37:48 +00:00
Chris Lattner 6a41fd75cd Fix a problem Nate found where we swapped the operands of SHL/SHR_PARTS. This
fixes fourinarow

llvm-svn: 23153
2005-08-30 17:42:59 +00:00
Chris Lattner bdf3d3defb codegen ADD_PARTS correctly: put the results in the right registers! This
fixes fhourstones

llvm-svn: 23152
2005-08-30 17:40:13 +00:00
Chris Lattner 45706e9fb8 add operands in the right order, fixing McCat/18-imp with the dag isel
llvm-svn: 23150
2005-08-30 17:13:58 +00:00
Chris Lattner 7a59b1cf90 Make sure the selector emits register register copies with flag operands
linking them to calls when appropriate, this prevents the scheduler from
pulling these copies away from the call.

This fixes Ptrdist/yacr2

llvm-svn: 23143
2005-08-30 01:57:02 +00:00
Chris Lattner e413b60632 The first operand to AND does not always have more than two operands. This
fixes MediaBench/toast with the dag selector

llvm-svn: 23141
2005-08-30 00:59:16 +00:00
Chris Lattner 61f7c3e843 emit FMR instructions to convert f64<->f32 instructions, so things like
STOREs, know the right type to store.

llvm-svn: 23139
2005-08-30 00:30:43 +00:00
Chris Lattner 12357281b8 fix a crash in cfrac
llvm-svn: 23137
2005-08-29 23:49:25 +00:00
Chris Lattner 1cbbe1015a Implement DYNAMIC_STACKALLOC, wrap some long lines
llvm-svn: 23136
2005-08-29 23:30:11 +00:00
Chris Lattner b2b418509b Fix a dumb bug of mine where we were mishandling the PPC ABI (undef handling).
This fixes voronoi and bh in Olden, allowing all of olden to pass!

llvm-svn: 23133
2005-08-29 22:22:57 +00:00
Chris Lattner c429ab2fb1 Fix a bug the last patch exposed in treeadd among others
llvm-svn: 23127
2005-08-29 01:07:02 +00:00
Chris Lattner d4d683a47b A hack to fix a problem folding immedaites. This fixes Olden/power.
llvm-svn: 23126
2005-08-29 01:01:01 +00:00
Chris Lattner 3ccad3fb8c Fix order of operands for copytoreg node when emitting calls. This fixes
Olden/msFix order of operands for copytoreg node when emitting calls.  This fixes
Olden/mstt.

llvm-svn: 23125
2005-08-29 00:26:57 +00:00
Chris Lattner 66ddc8d3bf add operands in the correct order
llvm-svn: 23123
2005-08-29 00:02:01 +00:00
Chris Lattner dfcde88d07 Fix a bug in FP_EXTEND, implement FP_TO_SINT
llvm-svn: 23121
2005-08-28 23:59:09 +00:00
Chris Lattner 38660c6666 fix an assertion failure in treeadd
llvm-svn: 23120
2005-08-28 23:39:22 +00:00
Chris Lattner 9b577f108a implement SELECT_CC fully for the DAG->DAG isel!
llvm-svn: 23101
2005-08-26 21:23:58 +00:00
Chris Lattner b2854fadda Make fsel emission work with both the pattern and dag-dag selectors, by
giving it a non-instruction opcode.  The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.

llvm-svn: 23091
2005-08-26 20:25:03 +00:00
Chris Lattner bec817ce6f implement the fold for:
bool %test(int %X, int %Y) {
        %C = setne int %X, 0
        ret bool %C
}

to:

_test:
        addic r2, r3, -1
        subfe r3, r2, r3
        blr

llvm-svn: 23089
2005-08-26 18:46:49 +00:00
Chris Lattner a9e6a82d66 Changes to adjust to new ReplaceAllUsesWith syntax. Change FP_EXTEND to
just return its input, instead of emitting an explicit copy.

llvm-svn: 23088
2005-08-26 18:37:23 +00:00
Chris Lattner c75e047245 now that fsel is formed during legalization, this code is dead
llvm-svn: 23084
2005-08-26 17:40:39 +00:00
Chris Lattner c30405e0ee Change ConstantPoolSDNode to actually hold the Constant itself instead of
putting it into the constant pool.  This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.

llvm-svn: 23081
2005-08-26 17:15:30 +00:00
Chris Lattner 7bbdae53d6 Fix some warnings in an optimized build
llvm-svn: 23080
2005-08-26 16:38:51 +00:00
Chris Lattner 2091a36631 Fix a huge annoyance: SelectNodeTo took types before the opcode unlike
every other SD API.  Fix it to take the opcode before the types.

llvm-svn: 23079
2005-08-26 16:36:26 +00:00
Nate Begeman 89093ca62a SUBFIC produces two results, not one.
llvm-svn: 23073
2005-08-26 00:34:06 +00:00
Nate Begeman bed4f2b982 Implement SHL_PARTS and SRL_PARTS
llvm-svn: 23072
2005-08-26 00:28:00 +00:00
Chris Lattner b81431b012 Emit the lo/hi parts in the right order :)
llvm-svn: 23068
2005-08-25 23:36:49 +00:00
Chris Lattner 02884fe41c implement support for 64-bit add/sub, fix a broken assertion for 64-bit
return.  Allow the udiv breaker-upper to work with any non-zero constant
operand.

llvm-svn: 23066
2005-08-25 23:21:06 +00:00
Chris Lattner 6e184f2b3d Finish implementing SDIV/UDIV by copying over the majik constant code from
ISelPattern

llvm-svn: 23062
2005-08-25 22:04:30 +00:00
Chris Lattner b746dd1cf6 Implement setcc correctly for G5 and non-G5 systems
llvm-svn: 23060
2005-08-25 21:39:42 +00:00
Chris Lattner 3dcd75bc54 implement setcc on the G5. We're still missing the non-g5 specific bits, but
they will come later.

llvm-svn: 23059
2005-08-25 20:08:18 +00:00
Chris Lattner dc66457022 Add support for sdiv by 2^k and -2^k. Producing code like:
_test:
        srawi r2, r3, 2
        addze r3, r2
        blr

llvm-svn: 23052
2005-08-25 17:50:06 +00:00
Chris Lattner 25db699671 Implement support for taking the address of constant pool indices, which
is used by the int -> FP code among other things.  This gets
2005-05-12-Int64ToFP past that failure, to dying on lack of support for add_parts

llvm-svn: 23042
2005-08-25 05:04:11 +00:00
Chris Lattner 666512c832 Add support for FP constants, fixing UnitTests/2004-02-02-NegativeZero
llvm-svn: 23038
2005-08-25 04:47:18 +00:00
Chris Lattner e4c338d0d8 Fully implement frame index, so that we can pass the address of alloca's
around to functions and stuff

llvm-svn: 23036
2005-08-25 00:45:43 +00:00
Chris Lattner 66a6a13225 implement unconditional branches, fixing UnitTests/2003-05-02-DependentPHI.c
llvm-svn: 23034
2005-08-25 00:29:58 +00:00
Chris Lattner 794eb6684d Fix a broken assertion
llvm-svn: 23032
2005-08-25 00:19:12 +00:00
Chris Lattner a3fbdae515 Split IMPLICIT_DEF into IMPLICIT_DEF_GPR and IMPLICIT_DEF_FP, so that the
instructions take a consistent reg class.  Implement ISD::UNDEF in the dag->dag
selector to generate this, fixing UnitTests/2003-07-06-IntOverflow.

llvm-svn: 23028
2005-08-24 23:08:16 +00:00
Chris Lattner d83cd354bd implement support for calls
llvm-svn: 23026
2005-08-24 22:45:17 +00:00
Nate Begeman a1e0a2f72b Remove unused statistic
Prefer 'neg X' to 'subfic 0, X' since neg does not set XER[CA]

llvm-svn: 23001
2005-08-24 05:03:20 +00:00
Chris Lattner b6d034a841 Add callseq_begin/end support
Call stil not supported yet

llvm-svn: 22998
2005-08-24 00:47:15 +00:00
Chris Lattner ca0c0d7550 Implement stores.
llvm-svn: 22963
2005-08-22 01:27:59 +00:00
Chris Lattner 1d634b2f44 Fix compilation of:
float %test2(float* %P) {
        %Q = load float* %P
        %R = add float %Q, %Q
        ret float %R
}

By returning the right result.

llvm-svn: 22961
2005-08-22 00:59:14 +00:00
Chris Lattner c5292ec9de Implement most of load support. There is still a bug though.
llvm-svn: 22959
2005-08-21 22:31:09 +00:00
Chris Lattner 2a1823d178 Implement selection for branches.
llvm-svn: 22951
2005-08-21 18:50:37 +00:00
Chris Lattner 4564039498 add support for global address, including PIC support.
This REALLY should be lowered by the legalizer!

llvm-svn: 22941
2005-08-19 22:38:53 +00:00
Chris Lattner 65d66797a5 Fix a typeo, no wonder all tokenfactor edges were the same!
llvm-svn: 22935
2005-08-19 21:33:02 +00:00
Nate Begeman 93c4bc6dca ISD::OR, and it's accompanying SelectBitfieldInsert
llvm-svn: 22889
2005-08-19 00:38:14 +00:00
Nate Begeman 33acb2c135 Add shifts.
llvm-svn: 22884
2005-08-18 23:38:00 +00:00
Chris Lattner 4e00ff6e70 Move this to the emitter
llvm-svn: 22877
2005-08-18 20:08:53 +00:00
Chris Lattner 015d73996d After selecting the instructions for a basic block, emit the instructions
llvm-svn: 22869
2005-08-18 18:46:06 +00:00
Chris Lattner 15b5c7ca84 remove some unused stuff
llvm-svn: 22866
2005-08-18 18:34:00 +00:00
Nate Begeman d32638706a Improve ISD::Constant codegen.
Now for int foo() { return -1; } we generate:
_foo:
        li r3, -1
        blr

instead of
_foo:
        lis r2, -1
        ori r3, r2, 65535
        blr

llvm-svn: 22864
2005-08-18 18:01:39 +00:00
Nate Begeman b3821a3943 Add support for ISD::AND, and its various optimized forms.
llvm-svn: 22857
2005-08-18 07:30:46 +00:00
Nate Begeman cfb9a74c2e Maintain consistency in negating things
llvm-svn: 22855
2005-08-18 05:44:50 +00:00
Nate Begeman 72d6f8800d Implement XOR, remove a broken sign_extend_inreg case
llvm-svn: 22854
2005-08-18 05:00:13 +00:00
Nate Begeman 4bfb4a215d Add a bunch more simple nodes.
llvm-svn: 22851
2005-08-18 03:04:18 +00:00
Nate Begeman 457367f14c Add a couple more nodes that are easy to handle
llvm-svn: 22850
2005-08-18 00:53:47 +00:00
Nate Begeman 74d5529b88 Be fruitful and multiply!
llvm-svn: 22849
2005-08-18 00:21:41 +00:00
Nate Begeman 3fcf47d8f0 Teach the DAG->DAG ISel about FNEG, and how it can be used to invert
several of the PowerPC opcodes that come in both negated and non-negated
forms.

llvm-svn: 22845
2005-08-17 23:46:35 +00:00
Chris Lattner 43ff01e2e6 initial hack at a dag->dag instruction selector. This is obviously woefully
incomplete, but it is a start.  It handles basic argument/retval stuff, immediates,
add and sub.

llvm-svn: 22836
2005-08-17 19:33:03 +00:00