Chris Lattner
259e6c76f2
Add a recursive-iterative hybrid stage to attempt to reduce stack space, this
...
helps but not enough.
Start pulling cases out of PPC32DAGToDAGISel::Select. With GCC 4, this function
required 8512 bytes of stack space for each invocation (GCC 3 required less
than 700 bytes). Pulling this first function out gets us down to 8224. More
to come :(
llvm-svn: 23647
2005-10-06 18:45:51 +00:00
Chris Lattner
3734d204b8
another solution to the fsel issue. Instead of having 4 variants, just force
...
the comparison to be 64-bits. This is fine because extensions from float
to double are free.
llvm-svn: 23589
2005-10-02 07:07:49 +00:00
Chris Lattner
9e98672962
fsel can take a different FP type for the comparison and for the result. As such
...
split the FSEL family into 4 things instead of just two.
llvm-svn: 23588
2005-10-02 06:58:23 +00:00
Chris Lattner
5ab9d42bb4
Minor tweak to the branch selector. When emitting a two-way branch, and if
...
we're in a single-mbb loop, make sure to emit the backwards branch as the
conditional branch instead of the uncond branch. For example, emit this:
LBBl29_z__44:
stw r9, 0(r15)
stw r9, 4(r15)
stw r9, 8(r15)
stw r9, 12(r15)
addi r15, r15, 16
addi r8, r8, 1
cmpw cr0, r8, r28
ble cr0, LBBl29_z__44
b LBBl29_z__48 *** NOT PART OF LOOP
Instead of:
LBBl29_z__44:
stw r9, 0(r15)
stw r9, 4(r15)
stw r9, 8(r15)
stw r9, 12(r15)
addi r15, r15, 16
addi r8, r8, 1
cmpw cr0, r8, r28
bgt cr0, LBBl29_z__48 *** PART OF LOOP!
b LBBl29_z__44
The former sequence has one fewer dispatch group for the loop body.
llvm-svn: 23582
2005-10-01 23:06:26 +00:00
Chris Lattner
8713ebf37c
fix typo
...
llvm-svn: 23578
2005-10-01 02:51:36 +00:00
Chris Lattner
d3eee1a09b
Modify the ppc backend to use two register classes for FP: F8RC and F4RC.
...
These are used to represent float and double values, and the two regclasses
contain the same physical registers.
llvm-svn: 23577
2005-10-01 01:35:02 +00:00
Jim Laskey
f61232354f
Should be using flag and not chain.
...
llvm-svn: 23572
2005-09-30 23:43:37 +00:00
Chris Lattner
1de5706e68
Remove code for patterns that are autogenerated
...
llvm-svn: 23532
2005-09-29 23:33:31 +00:00
Chris Lattner
08c319fbdd
Never rely on ReplaceAllUsesWith when selecting, use CodeGenMap instead.
...
ReplaceAllUsesWith does not replace scalars SDOperand floating around on
the stack, permitting things to be selected multiple times.
llvm-svn: 23515
2005-09-29 00:59:32 +00:00
Chris Lattner
b9b2e77295
Autogen MUL, move FP cases together
...
llvm-svn: 23512
2005-09-28 22:53:16 +00:00
Chris Lattner
5769311c92
disentangle FP from INT versions of div/mul
...
llvm-svn: 23511
2005-09-28 22:50:24 +00:00
Chris Lattner
585131baaf
Use the autogenerated matcher for ADD/SUB
...
llvm-svn: 23510
2005-09-28 22:47:28 +00:00
Chris Lattner
d3ea19b51a
Add FP versions of the binary operators, keeping the int and fp worlds seperate.
...
llvm-svn: 23506
2005-09-28 22:29:58 +00:00
Chris Lattner
fab48b3285
All (xor *) cases are autogenerated now
...
llvm-svn: 23497
2005-09-28 18:12:37 +00:00
Chris Lattner
33f8e08c8f
Implement PowerPC/eqv-andc-orc-nor.ll:EQV3
...
llvm-svn: 23494
2005-09-28 18:04:52 +00:00
Chris Lattner
bb5939a436
These nodes are all autogenerated
...
llvm-svn: 23489
2005-09-28 17:07:09 +00:00
Chris Lattner
c628f00845
Make sure to clear the CodeGenMap after each basic block is selected to avoid
...
cross MBB pollution.
llvm-svn: 23470
2005-09-27 17:45:33 +00:00
Chris Lattner
b011cb2746
we don't need this proto any longer
...
llvm-svn: 23342
2005-09-13 22:05:21 +00:00
Chris Lattner
03e08eefc7
move the #include for the generated code into the isel class body so we
...
can use/define class methods
llvm-svn: 23339
2005-09-13 22:03:06 +00:00
Chris Lattner
4309c3a785
PowerPC cannot truncstore i1 natively
...
llvm-svn: 23304
2005-09-10 00:21:06 +00:00
Chris Lattner
498915dafa
Remove some cases handled by the generated portion of the isel
...
llvm-svn: 23262
2005-09-07 23:45:15 +00:00
Nate Begeman
6095214bf0
Implement i64<->fp using the fctidz/fcfid instructions on PowerPC when we
...
are allowed to generate 64-bit-only PowerPC instructions for 32 bit hosts,
such as the PowerPC 970.
This speeds up 189.lucas from 81.99 to 32.64 seconds.
llvm-svn: 23250
2005-09-06 22:03:27 +00:00
Chris Lattner
8ae9525bd0
include the dag isel fragment
...
llvm-svn: 23239
2005-09-03 01:17:22 +00:00
Chris Lattner
5f12cf14be
Change the isel to not break out of the big giant switch. Instead, the
...
switch should never be exited, so its bottom is now unreachable.
llvm-svn: 23234
2005-09-03 00:53:47 +00:00
Chris Lattner
a305d28cf6
Implement dynamic allocas correctly. In particular, because we were copying
...
directly out of R1 (without using a CopyFromReg, which uses a chain), multiple
allocas were getting CSE'd together, producing bogus code. For this:
int %foo(bool %X, int %A, int %B) {
br bool %X, label %T, label %F
F:
%G = alloca int
%H = alloca int
store int %A, int* %G
store int %B, int* %H
%R = load int* %G
ret int %R
T:
ret int 0
}
We were generating:
_foo:
stwu r1, -16(r1)
stw r31, 4(r1)
or r31, r1, r1
stw r1, 12(r31)
cmpwi cr0, r3, 0
bne cr0, .LBB_foo_2 ; T
.LBB_foo_1: ; F
li r2, 16
subf r2, r2, r1 ;; One alloca
or r1, r2, r2
or r3, r1, r1
or r1, r2, r2
or r2, r1, r1
stw r4, 0(r3)
stw r5, 0(r2)
lwz r3, 0(r3)
lwz r1, 12(r31)
lwz r31, 4(r31)
lwz r1, 0(r1)
blr
.LBB_foo_2: ; T
li r3, 0
lwz r1, 12(r31)
lwz r31, 4(r31)
lwz r1, 0(r1)
blr
Now we generate:
_foo:
stwu r1, -16(r1)
stw r31, 4(r1)
or r31, r1, r1
stw r1, 12(r31)
cmpwi cr0, r3, 0
bne cr0, .LBB_foo_2 ; T
.LBB_foo_1: ; F
or r2, r1, r1
li r3, 16
subf r2, r3, r2 ;; Alloca 1
or r1, r2, r2
or r2, r1, r1
or r6, r1, r1
subf r3, r3, r6 ;; Alloca 2
or r1, r3, r3
or r3, r1, r1
stw r4, 0(r2)
stw r5, 0(r3)
lwz r3, 0(r2)
lwz r1, 12(r31)
lwz r31, 4(r31)
lwz r1, 0(r1)
blr
.LBB_foo_2: ; T
li r3, 0
lwz r1, 12(r31)
lwz r31, 4(r31)
lwz r1, 0(r1)
blr
This fixes Povray and SPASS with the dag isel, the last two failing cases.
Tommorow we will hopefully turn it on by default! :)
llvm-svn: 23190
2005-09-01 21:31:30 +00:00
Chris Lattner
293b3a68e0
Fix a bug where we were useing HA to get the high part, which seems like it
...
could cause a miscompile. Fixing this didn't fix the two programs that fail
though. :(
This also changes the implementation to follow the pattern selector more
closely, causing us to select 0 to li instead of lis.
llvm-svn: 23189
2005-09-01 19:38:28 +00:00
Chris Lattner
34182aff7f
Do not select the operands being passed into SelectCC. IT does this itself
...
and selecting early prevents folding immediates into the cmpw* instructions
llvm-svn: 23188
2005-09-01 19:20:44 +00:00
Chris Lattner
da2e04c69d
Move FCTIWZ handling out of the instruction selectors and into legalization,
...
getting them out of the business of making stack slots.
llvm-svn: 23180
2005-08-31 21:09:52 +00:00
Chris Lattner
6bad1fb19e
Remove dead code
...
llvm-svn: 23179
2005-08-31 20:25:15 +00:00
Chris Lattner
2bd2af8ecd
add assert zext/sext to the dag isel
...
llvm-svn: 23171
2005-08-31 18:08:46 +00:00
Chris Lattner
f4d594370b
Fix 'ret long' to return the high and lo parts in the right registers. This
...
fixes crafty and probably others.
llvm-svn: 23167
2005-08-31 01:34:29 +00:00
Chris Lattner
69e9a9a94c
now that physregs can exist in the same dag with multiple types, remove some
...
ugly hacks
llvm-svn: 23162
2005-08-30 22:59:48 +00:00
Chris Lattner
8f8d539746
Fix type mismatches when passing f32 values to calls
...
llvm-svn: 23159
2005-08-30 21:28:19 +00:00
Chris Lattner
9f23ae226f
Fix some indentation (first hunks).
...
Remove code (last hunk) that miscompiled immediate and's, such as
and uint %tmp.30, 4294958079
into
andi. r8, r8, 56319
andis. r8, r8, 65535
instead of:
li r9, -9217
and r8, r8, r9
The first always generates zero.
This fixes espresso.
llvm-svn: 23155
2005-08-30 18:37:48 +00:00
Chris Lattner
6a41fd75cd
Fix a problem Nate found where we swapped the operands of SHL/SHR_PARTS. This
...
fixes fourinarow
llvm-svn: 23153
2005-08-30 17:42:59 +00:00
Chris Lattner
bdf3d3defb
codegen ADD_PARTS correctly: put the results in the right registers! This
...
fixes fhourstones
llvm-svn: 23152
2005-08-30 17:40:13 +00:00
Chris Lattner
45706e9fb8
add operands in the right order, fixing McCat/18-imp with the dag isel
...
llvm-svn: 23150
2005-08-30 17:13:58 +00:00
Chris Lattner
7a59b1cf90
Make sure the selector emits register register copies with flag operands
...
linking them to calls when appropriate, this prevents the scheduler from
pulling these copies away from the call.
This fixes Ptrdist/yacr2
llvm-svn: 23143
2005-08-30 01:57:02 +00:00
Chris Lattner
e413b60632
The first operand to AND does not always have more than two operands. This
...
fixes MediaBench/toast with the dag selector
llvm-svn: 23141
2005-08-30 00:59:16 +00:00
Chris Lattner
61f7c3e843
emit FMR instructions to convert f64<->f32 instructions, so things like
...
STOREs, know the right type to store.
llvm-svn: 23139
2005-08-30 00:30:43 +00:00
Chris Lattner
12357281b8
fix a crash in cfrac
...
llvm-svn: 23137
2005-08-29 23:49:25 +00:00
Chris Lattner
1cbbe1015a
Implement DYNAMIC_STACKALLOC, wrap some long lines
...
llvm-svn: 23136
2005-08-29 23:30:11 +00:00
Chris Lattner
b2b418509b
Fix a dumb bug of mine where we were mishandling the PPC ABI (undef handling).
...
This fixes voronoi and bh in Olden, allowing all of olden to pass!
llvm-svn: 23133
2005-08-29 22:22:57 +00:00
Chris Lattner
c429ab2fb1
Fix a bug the last patch exposed in treeadd among others
...
llvm-svn: 23127
2005-08-29 01:07:02 +00:00
Chris Lattner
d4d683a47b
A hack to fix a problem folding immedaites. This fixes Olden/power.
...
llvm-svn: 23126
2005-08-29 01:01:01 +00:00
Chris Lattner
3ccad3fb8c
Fix order of operands for copytoreg node when emitting calls. This fixes
...
Olden/msFix order of operands for copytoreg node when emitting calls. This fixes
Olden/mstt.
llvm-svn: 23125
2005-08-29 00:26:57 +00:00
Chris Lattner
66ddc8d3bf
add operands in the correct order
...
llvm-svn: 23123
2005-08-29 00:02:01 +00:00
Chris Lattner
dfcde88d07
Fix a bug in FP_EXTEND, implement FP_TO_SINT
...
llvm-svn: 23121
2005-08-28 23:59:09 +00:00
Chris Lattner
38660c6666
fix an assertion failure in treeadd
...
llvm-svn: 23120
2005-08-28 23:39:22 +00:00
Chris Lattner
9b577f108a
implement SELECT_CC fully for the DAG->DAG isel!
...
llvm-svn: 23101
2005-08-26 21:23:58 +00:00
Chris Lattner
b2854fadda
Make fsel emission work with both the pattern and dag-dag selectors, by
...
giving it a non-instruction opcode. The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.
llvm-svn: 23091
2005-08-26 20:25:03 +00:00
Chris Lattner
bec817ce6f
implement the fold for:
...
bool %test(int %X, int %Y) {
%C = setne int %X, 0
ret bool %C
}
to:
_test:
addic r2, r3, -1
subfe r3, r2, r3
blr
llvm-svn: 23089
2005-08-26 18:46:49 +00:00
Chris Lattner
a9e6a82d66
Changes to adjust to new ReplaceAllUsesWith syntax. Change FP_EXTEND to
...
just return its input, instead of emitting an explicit copy.
llvm-svn: 23088
2005-08-26 18:37:23 +00:00
Chris Lattner
c75e047245
now that fsel is formed during legalization, this code is dead
...
llvm-svn: 23084
2005-08-26 17:40:39 +00:00
Chris Lattner
c30405e0ee
Change ConstantPoolSDNode to actually hold the Constant itself instead of
...
putting it into the constant pool. This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.
llvm-svn: 23081
2005-08-26 17:15:30 +00:00
Chris Lattner
7bbdae53d6
Fix some warnings in an optimized build
...
llvm-svn: 23080
2005-08-26 16:38:51 +00:00
Chris Lattner
2091a36631
Fix a huge annoyance: SelectNodeTo took types before the opcode unlike
...
every other SD API. Fix it to take the opcode before the types.
llvm-svn: 23079
2005-08-26 16:36:26 +00:00
Nate Begeman
89093ca62a
SUBFIC produces two results, not one.
...
llvm-svn: 23073
2005-08-26 00:34:06 +00:00
Nate Begeman
bed4f2b982
Implement SHL_PARTS and SRL_PARTS
...
llvm-svn: 23072
2005-08-26 00:28:00 +00:00
Chris Lattner
b81431b012
Emit the lo/hi parts in the right order :)
...
llvm-svn: 23068
2005-08-25 23:36:49 +00:00
Chris Lattner
02884fe41c
implement support for 64-bit add/sub, fix a broken assertion for 64-bit
...
return. Allow the udiv breaker-upper to work with any non-zero constant
operand.
llvm-svn: 23066
2005-08-25 23:21:06 +00:00
Chris Lattner
6e184f2b3d
Finish implementing SDIV/UDIV by copying over the majik constant code from
...
ISelPattern
llvm-svn: 23062
2005-08-25 22:04:30 +00:00
Chris Lattner
b746dd1cf6
Implement setcc correctly for G5 and non-G5 systems
...
llvm-svn: 23060
2005-08-25 21:39:42 +00:00
Chris Lattner
3dcd75bc54
implement setcc on the G5. We're still missing the non-g5 specific bits, but
...
they will come later.
llvm-svn: 23059
2005-08-25 20:08:18 +00:00
Chris Lattner
dc66457022
Add support for sdiv by 2^k and -2^k. Producing code like:
...
_test:
srawi r2, r3, 2
addze r3, r2
blr
llvm-svn: 23052
2005-08-25 17:50:06 +00:00
Chris Lattner
25db699671
Implement support for taking the address of constant pool indices, which
...
is used by the int -> FP code among other things. This gets
2005-05-12-Int64ToFP past that failure, to dying on lack of support for add_parts
llvm-svn: 23042
2005-08-25 05:04:11 +00:00
Chris Lattner
666512c832
Add support for FP constants, fixing UnitTests/2004-02-02-NegativeZero
...
llvm-svn: 23038
2005-08-25 04:47:18 +00:00
Chris Lattner
e4c338d0d8
Fully implement frame index, so that we can pass the address of alloca's
...
around to functions and stuff
llvm-svn: 23036
2005-08-25 00:45:43 +00:00
Chris Lattner
66a6a13225
implement unconditional branches, fixing UnitTests/2003-05-02-DependentPHI.c
...
llvm-svn: 23034
2005-08-25 00:29:58 +00:00
Chris Lattner
794eb6684d
Fix a broken assertion
...
llvm-svn: 23032
2005-08-25 00:19:12 +00:00
Chris Lattner
a3fbdae515
Split IMPLICIT_DEF into IMPLICIT_DEF_GPR and IMPLICIT_DEF_FP, so that the
...
instructions take a consistent reg class. Implement ISD::UNDEF in the dag->dag
selector to generate this, fixing UnitTests/2003-07-06-IntOverflow.
llvm-svn: 23028
2005-08-24 23:08:16 +00:00
Chris Lattner
d83cd354bd
implement support for calls
...
llvm-svn: 23026
2005-08-24 22:45:17 +00:00
Nate Begeman
a1e0a2f72b
Remove unused statistic
...
Prefer 'neg X' to 'subfic 0, X' since neg does not set XER[CA]
llvm-svn: 23001
2005-08-24 05:03:20 +00:00
Chris Lattner
b6d034a841
Add callseq_begin/end support
...
Call stil not supported yet
llvm-svn: 22998
2005-08-24 00:47:15 +00:00
Chris Lattner
ca0c0d7550
Implement stores.
...
llvm-svn: 22963
2005-08-22 01:27:59 +00:00
Chris Lattner
1d634b2f44
Fix compilation of:
...
float %test2(float* %P) {
%Q = load float* %P
%R = add float %Q, %Q
ret float %R
}
By returning the right result.
llvm-svn: 22961
2005-08-22 00:59:14 +00:00
Chris Lattner
c5292ec9de
Implement most of load support. There is still a bug though.
...
llvm-svn: 22959
2005-08-21 22:31:09 +00:00
Chris Lattner
2a1823d178
Implement selection for branches.
...
llvm-svn: 22951
2005-08-21 18:50:37 +00:00
Chris Lattner
4564039498
add support for global address, including PIC support.
...
This REALLY should be lowered by the legalizer!
llvm-svn: 22941
2005-08-19 22:38:53 +00:00
Chris Lattner
65d66797a5
Fix a typeo, no wonder all tokenfactor edges were the same!
...
llvm-svn: 22935
2005-08-19 21:33:02 +00:00
Nate Begeman
93c4bc6dca
ISD::OR, and it's accompanying SelectBitfieldInsert
...
llvm-svn: 22889
2005-08-19 00:38:14 +00:00
Nate Begeman
33acb2c135
Add shifts.
...
llvm-svn: 22884
2005-08-18 23:38:00 +00:00
Chris Lattner
4e00ff6e70
Move this to the emitter
...
llvm-svn: 22877
2005-08-18 20:08:53 +00:00
Chris Lattner
015d73996d
After selecting the instructions for a basic block, emit the instructions
...
llvm-svn: 22869
2005-08-18 18:46:06 +00:00
Chris Lattner
15b5c7ca84
remove some unused stuff
...
llvm-svn: 22866
2005-08-18 18:34:00 +00:00
Nate Begeman
d32638706a
Improve ISD::Constant codegen.
...
Now for int foo() { return -1; } we generate:
_foo:
li r3, -1
blr
instead of
_foo:
lis r2, -1
ori r3, r2, 65535
blr
llvm-svn: 22864
2005-08-18 18:01:39 +00:00
Nate Begeman
b3821a3943
Add support for ISD::AND, and its various optimized forms.
...
llvm-svn: 22857
2005-08-18 07:30:46 +00:00
Nate Begeman
cfb9a74c2e
Maintain consistency in negating things
...
llvm-svn: 22855
2005-08-18 05:44:50 +00:00
Nate Begeman
72d6f8800d
Implement XOR, remove a broken sign_extend_inreg case
...
llvm-svn: 22854
2005-08-18 05:00:13 +00:00
Nate Begeman
4bfb4a215d
Add a bunch more simple nodes.
...
llvm-svn: 22851
2005-08-18 03:04:18 +00:00
Nate Begeman
457367f14c
Add a couple more nodes that are easy to handle
...
llvm-svn: 22850
2005-08-18 00:53:47 +00:00
Nate Begeman
74d5529b88
Be fruitful and multiply!
...
llvm-svn: 22849
2005-08-18 00:21:41 +00:00
Nate Begeman
3fcf47d8f0
Teach the DAG->DAG ISel about FNEG, and how it can be used to invert
...
several of the PowerPC opcodes that come in both negated and non-negated
forms.
llvm-svn: 22845
2005-08-17 23:46:35 +00:00
Chris Lattner
43ff01e2e6
initial hack at a dag->dag instruction selector. This is obviously woefully
...
incomplete, but it is a start. It handles basic argument/retval stuff, immediates,
add and sub.
llvm-svn: 22836
2005-08-17 19:33:03 +00:00