Reid Spencer
dfb3fb4a25
Implement PR614:
...
These changes modify the makefiles so that the output of flex and bison are
placed in the SRC directory, not the OBJ directory. It is intended that they
be checked in as any other LLVM source so that platforms without convenient
access to flex/bison can be compiled. From now on, if you change a .y or
.l file you *must* also commit the generated .cpp and .h files.
llvm-svn: 23115
2005-08-27 18:50:39 +00:00
Chris Lattner
075250bda1
Disable this code, which broke many tests last night
...
llvm-svn: 23114
2005-08-27 16:16:51 +00:00
Chris Lattner
5ee85e89b6
fix PHI node emission for basic blocks that have select_cc's in them on ppc32
...
llvm-svn: 23113
2005-08-27 00:58:02 +00:00
Chris Lattner
787e962795
The condition register being branched on may not be cr0, as such, print it.
...
This fixes: UnitTests/2005-07-17-INT-To-FP.c
llvm-svn: 23112
2005-08-26 23:42:05 +00:00
Chris Lattner
29bfaa7ef0
Propagate cr# from COND_BRANCH to the actual branch instruction as appropriate
...
llvm-svn: 23111
2005-08-26 23:41:27 +00:00
Chris Lattner
56ca46ee04
Nate noticed that Andrew never did this. This fixes PR600
...
llvm-svn: 23110
2005-08-26 22:50:40 +00:00
Chris Lattner
e7a2998064
Don't copy regs that are only used in the entry block into a vreg. This
...
changes the code generated for:
short %test(short %A) {
%B = xor short %A, -32768
ret short %B
}
to:
_test:
xori r2, r3, 32768
xoris r2, r2, 65535
extsh r3, r2
blr
instead of:
_test:
rlwinm r2, r3, 0, 16, 31
xori r2, r3, 32768
xoris r2, r2, 65535
extsh r3, r2
blr
llvm-svn: 23109
2005-08-26 22:49:59 +00:00
Chris Lattner
d4f43f7967
Make this code safe for when loadRegFromStackSlot inserts multiple instructions.
...
llvm-svn: 23108
2005-08-26 22:18:32 +00:00
Chris Lattner
422e23dd02
allow code using mtcrf to assemble
...
llvm-svn: 23107
2005-08-26 22:05:54 +00:00
Nate Begeman
72f23815bc
Remove operand type 'crbit', since it is no longer used
...
llvm-svn: 23106
2005-08-26 22:04:17 +00:00
Chris Lattner
c3d1bdd0a9
teach getClass what a condition reg is
...
llvm-svn: 23105
2005-08-26 21:51:29 +00:00
Chris Lattner
97345405a6
Minor cleanups:
...
* avoid calling getClass() multiple times (it is relatively expensive)
* Allow -disable-fp-elim to turn of frame pointer elimination.
llvm-svn: 23104
2005-08-26 21:49:18 +00:00
Chris Lattner
4a5ebe94ba
Checking types here is not safe, because multiple types can map to the same
...
register class.
llvm-svn: 23103
2005-08-26 21:39:15 +00:00
Chris Lattner
9b577f108a
implement SELECT_CC fully for the DAG->DAG isel!
...
llvm-svn: 23101
2005-08-26 21:23:58 +00:00
Chris Lattner
c6a0338c04
spell this right
...
llvm-svn: 23099
2005-08-26 20:55:40 +00:00
Chris Lattner
13d7c252e5
Call the InsertAtEndOfBasicBlock hook if the usesCustomDAGSchedInserter
...
flag is set on an instruction.
llvm-svn: 23098
2005-08-26 20:54:47 +00:00
Chris Lattner
0081dfa91e
Add a flag
...
llvm-svn: 23092
2005-08-26 20:29:01 +00:00
Chris Lattner
b2854fadda
Make fsel emission work with both the pattern and dag-dag selectors, by
...
giving it a non-instruction opcode. The dag->dag selector used to not
select the operands of the fsel, because it thought that whole tree was
already selected.
llvm-svn: 23091
2005-08-26 20:25:03 +00:00
Chris Lattner
bec817ce6f
implement the fold for:
...
bool %test(int %X, int %Y) {
%C = setne int %X, 0
ret bool %C
}
to:
_test:
addic r2, r3, -1
subfe r3, r2, r3
blr
llvm-svn: 23089
2005-08-26 18:46:49 +00:00
Chris Lattner
a9e6a82d66
Changes to adjust to new ReplaceAllUsesWith syntax. Change FP_EXTEND to
...
just return its input, instead of emitting an explicit copy.
llvm-svn: 23088
2005-08-26 18:37:23 +00:00
Chris Lattner
373f048a79
Revampt ReplaceAllUsesWith to be more efficient and easier to use.
...
llvm-svn: 23087
2005-08-26 18:36:28 +00:00
Nate Begeman
76eea9a480
Remove some code made dead by the fsel patch
...
llvm-svn: 23085
2005-08-26 17:45:06 +00:00
Chris Lattner
c75e047245
now that fsel is formed during legalization, this code is dead
...
llvm-svn: 23084
2005-08-26 17:40:39 +00:00
Chris Lattner
7f1fa8eaef
implement the other half of the select_cc -> fsel lowering, which handles
...
when the RHS of the comparison is 0.0. Turn this on by default.
llvm-svn: 23083
2005-08-26 17:36:52 +00:00
Chris Lattner
d0dc6f4299
Fix a bug in my previous checkin
...
llvm-svn: 23082
2005-08-26 17:18:44 +00:00
Chris Lattner
c30405e0ee
Change ConstantPoolSDNode to actually hold the Constant itself instead of
...
putting it into the constant pool. This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.
llvm-svn: 23081
2005-08-26 17:15:30 +00:00
Chris Lattner
7bbdae53d6
Fix some warnings in an optimized build
...
llvm-svn: 23080
2005-08-26 16:38:51 +00:00
Chris Lattner
2091a36631
Fix a huge annoyance: SelectNodeTo took types before the opcode unlike
...
every other SD API. Fix it to take the opcode before the types.
llvm-svn: 23079
2005-08-26 16:36:26 +00:00
Nate Begeman
7b809f593b
Fix JIT encoding of conditional branches
...
llvm-svn: 23076
2005-08-26 04:11:42 +00:00
Chris Lattner
f3d06c6417
add initial support for converting select_cc -> fsel in the legalizer
...
instead of in the backend. This currently handles fsel cases with registers,
but doesn't have the 0.0 and -0.0 optimization enabled yet.
Once this is finished, special hack for fp immediates can go away.
llvm-svn: 23075
2005-08-26 00:52:45 +00:00
Chris Lattner
c6d481db7a
the 5th operand is the 4th number
...
llvm-svn: 23074
2005-08-26 00:43:46 +00:00
Nate Begeman
89093ca62a
SUBFIC produces two results, not one.
...
llvm-svn: 23073
2005-08-26 00:34:06 +00:00
Nate Begeman
bed4f2b982
Implement SHL_PARTS and SRL_PARTS
...
llvm-svn: 23072
2005-08-26 00:28:00 +00:00
Chris Lattner
5f573416cd
Add support for targets that want to custom expand select_cc in some cases.
...
llvm-svn: 23071
2005-08-26 00:23:59 +00:00
Chris Lattner
dff50cadaa
Allow LowerOperation to return a null SDOperand in case it wants to lower
...
some things given to it, but not all.
llvm-svn: 23070
2005-08-26 00:14:16 +00:00
Chris Lattner
1cb550c603
Fix a nasty bug from a previous patch of mine
...
llvm-svn: 23069
2005-08-26 00:13:12 +00:00
Chris Lattner
b81431b012
Emit the lo/hi parts in the right order :)
...
llvm-svn: 23068
2005-08-25 23:36:49 +00:00
Chris Lattner
02884fe41c
implement support for 64-bit add/sub, fix a broken assertion for 64-bit
...
return. Allow the udiv breaker-upper to work with any non-zero constant
operand.
llvm-svn: 23066
2005-08-25 23:21:06 +00:00
Chris Lattner
abbd8ea048
simplify the add/sub_parts code
...
llvm-svn: 23065
2005-08-25 23:19:58 +00:00
Chris Lattner
6e184f2b3d
Finish implementing SDIV/UDIV by copying over the majik constant code from
...
ISelPattern
llvm-svn: 23062
2005-08-25 22:04:30 +00:00
Chris Lattner
717f97a5c8
Simplify some code. It's not clear why the UDIV expanded sequence
...
doesn't work for large uint constants, but we'll keep the current behavior
llvm-svn: 23061
2005-08-25 22:03:50 +00:00
Chris Lattner
b746dd1cf6
Implement setcc correctly for G5 and non-G5 systems
...
llvm-svn: 23060
2005-08-25 21:39:42 +00:00
Chris Lattner
3dcd75bc54
implement setcc on the G5. We're still missing the non-g5 specific bits, but
...
they will come later.
llvm-svn: 23059
2005-08-25 20:08:18 +00:00
Nate Begeman
33840c3268
New fold for SELECT_CC
...
llvm-svn: 23058
2005-08-25 20:04:38 +00:00
Nate Begeman
65ffd8fbf4
Remove option to make SetCC illegal on PowerPC after long discussion with
...
Chris. This will be accomplished through correctly modeling CR's and
subregs.
llvm-svn: 23056
2005-08-25 20:01:10 +00:00
Chris Lattner
f9c19157df
Don't auto-cse nodes that return flags
...
llvm-svn: 23055
2005-08-25 19:12:10 +00:00
Chris Lattner
12756be53b
add printer support for flag operands
...
llvm-svn: 23054
2005-08-25 17:59:23 +00:00
Chris Lattner
9d28a56d55
simplify the code a bit using isOperationLegal
...
llvm-svn: 23053
2005-08-25 17:54:58 +00:00
Chris Lattner
dc66457022
Add support for sdiv by 2^k and -2^k. Producing code like:
...
_test:
srawi r2, r3, 2
addze r3, r2
blr
llvm-svn: 23052
2005-08-25 17:50:06 +00:00
Chris Lattner
4bd2aab6c1
fit in 80 cols
...
llvm-svn: 23051
2005-08-25 17:49:31 +00:00
Chris Lattner
8a93f64efa
Add support for flag operands
...
llvm-svn: 23050
2005-08-25 17:48:54 +00:00
Chris Lattner
d24ad52efa
add an enum value
...
llvm-svn: 23048
2005-08-25 17:07:09 +00:00
Chris Lattner
25db699671
Implement support for taking the address of constant pool indices, which
...
is used by the int -> FP code among other things. This gets
2005-05-12-Int64ToFP past that failure, to dying on lack of support for add_parts
llvm-svn: 23042
2005-08-25 05:04:11 +00:00
Chris Lattner
407c6415b4
ADd support for TargetConstantPool nodes
...
llvm-svn: 23041
2005-08-25 05:03:06 +00:00
Chris Lattner
666512c832
Add support for FP constants, fixing UnitTests/2004-02-02-NegativeZero
...
llvm-svn: 23038
2005-08-25 04:47:18 +00:00
Chris Lattner
e4c338d0d8
Fully implement frame index, so that we can pass the address of alloca's
...
around to functions and stuff
llvm-svn: 23036
2005-08-25 00:45:43 +00:00
Chris Lattner
bbe0e7df2c
add a new TargetFrameIndex node
...
llvm-svn: 23035
2005-08-25 00:43:01 +00:00
Chris Lattner
66a6a13225
implement unconditional branches, fixing UnitTests/2003-05-02-DependentPHI.c
...
llvm-svn: 23034
2005-08-25 00:29:58 +00:00
Chris Lattner
4ae278a760
LFS/STFS load and store FP values, not integer ones. This change allows us
...
to codegen this: float foo() { return 1.245; }
into this:
_foo:
lis r2, ha16(.CPI_foo_0)
lfs f1, lo16(.CPI_foo_0)(r2)
blr
instead of this:
_foo:
lis r2, ha16(.CPI_foo_0)
lfs r2, lo16(.CPI_foo_0)(r2) <-- ouch
or f1, r2, r2 <-- ouch
blr
with the dag isel.
llvm-svn: 23033
2005-08-25 00:26:22 +00:00
Chris Lattner
794eb6684d
Fix a broken assertion
...
llvm-svn: 23032
2005-08-25 00:19:12 +00:00
Chris Lattner
c146940f0d
Fix a warning
...
llvm-svn: 23031
2005-08-25 00:05:15 +00:00
Chris Lattner
daae1e10f7
fix a warning in optimized build
...
llvm-svn: 23030
2005-08-25 00:03:21 +00:00
Chris Lattner
751c6c3944
Fix some warnings
...
llvm-svn: 23029
2005-08-25 00:00:26 +00:00
Chris Lattner
a3fbdae515
Split IMPLICIT_DEF into IMPLICIT_DEF_GPR and IMPLICIT_DEF_FP, so that the
...
instructions take a consistent reg class. Implement ISD::UNDEF in the dag->dag
selector to generate this, fixing UnitTests/2003-07-06-IntOverflow.
llvm-svn: 23028
2005-08-24 23:08:16 +00:00
Chris Lattner
45e1ce4e28
add a method
...
llvm-svn: 23027
2005-08-24 23:00:29 +00:00
Chris Lattner
d83cd354bd
implement support for calls
...
llvm-svn: 23026
2005-08-24 22:45:17 +00:00
Chris Lattner
d7ee4d8671
Add ReplaceAllUsesWith that can take a vector of replacement values.
...
Add some foldings to hopefully help the illegal setcc issue, and move some code around.
llvm-svn: 23025
2005-08-24 22:44:39 +00:00
Chris Lattner
1fc2a7f006
Remove some dead cases.
...
Emit the indcall sequence as:
mtctr inreg
mr R12, inreg
btctr
If inreg and R12 aren't coallesced, this reduces the odds of having the mtctr
and btctr in the same dispatch group. :)
llvm-svn: 23023
2005-08-24 22:21:47 +00:00
Chris Lattner
ad9565dfbe
Add support for external symbols, and support for variable arity instructions
...
llvm-svn: 23022
2005-08-24 22:02:41 +00:00
Chris Lattner
bb8cc0acb2
Fix pasto that prevented VT ndoes from showing up in -view-isel-dags correctly
...
llvm-svn: 23021
2005-08-24 18:30:00 +00:00
Chris Lattner
1e98a330f2
add an idea
...
llvm-svn: 23020
2005-08-24 18:15:24 +00:00
Chris Lattner
8ca5b2a6d2
Fix Regression/Transforms/Reassociate/2005-08-24-Crash.ll
...
llvm-svn: 23019
2005-08-24 17:55:32 +00:00
Chris Lattner
4201cd1bbc
Transform floor((double)FLT) -> (double)floorf(FLT), implementing
...
Regression/Transforms/SimplifyLibCalls/floor.ll. This triggers 19 times in
177.mesa.
llvm-svn: 23017
2005-08-24 17:22:17 +00:00
Chris Lattner
898e50ecb3
floor/ceil don't read/write memory. This allows gcse to eliminate 6 calls
...
in mesa.
llvm-svn: 23015
2005-08-24 16:58:56 +00:00
Chris Lattner
86b1658d58
teach selection dag mask tracking about the fact that select_cc operates like
...
select. Also teach it that the bit count instructions can only set the low bits
of the result, depending on the size of the input.
This allows us to compile this:
int %eq0(int %a) {
%tmp.1 = seteq int %a, 0 ; <bool> [#uses=1]
%tmp.2 = cast bool %tmp.1 to int ; <int> [#uses=1]
ret int %tmp.2
}
To this:
_eq0:
cntlzw r2, r3
srwi r3, r2, 5
blr
instead of this:
_eq0:
cntlzw r2, r3
rlwinm r3, r2, 27, 31, 31
blr
when setcc is marked illegal on ppc (which restores parity to non-illegal
setcc). Thanks to Nate for pointing this out.
llvm-svn: 23013
2005-08-24 16:46:55 +00:00
Chris Lattner
f12eb4d676
Start using isOperationLegal and isTypeLegal to simplify the code
...
llvm-svn: 23012
2005-08-24 16:35:28 +00:00
Chris Lattner
ade525491f
Adjust to new interface
...
llvm-svn: 23010
2005-08-24 16:34:12 +00:00
Reid Spencer
f85fabeb71
For PR616:
...
These patches make threading optional in LLVM. The configuration scripts are now
modified to accept a --disable-threads switch. If this is used, the Mutex class
will be implemented with all functions as no-op. Furthermore, linking against
libpthread will not be done. Finally, the ParallelJIT example needs libpthread
so its makefile was changed to always add -lpthread to the link line.
llvm-svn: 23003
2005-08-24 10:07:20 +00:00
Nate Begeman
7c1ba938be
Whoops, fix a thinko. All cases except SETNE are now handled by the
...
target independent code in SelectionDAG.cpp
llvm-svn: 23002
2005-08-24 05:06:48 +00:00
Nate Begeman
a1e0a2f72b
Remove unused statistic
...
Prefer 'neg X' to 'subfic 0, X' since neg does not set XER[CA]
llvm-svn: 23001
2005-08-24 05:03:20 +00:00
Nate Begeman
6948b79b26
Add the "ppc specific" setcc-equivalent select_cc cases
...
Prefer 'neg X' to 'subfic 0, X' since it does not set XER[CA]
llvm-svn: 23000
2005-08-24 04:59:21 +00:00
Nate Begeman
45bbbb3f11
Teach SelectionDAG how to simplify a few more setcc-equivalent select_cc
...
nodes so that backends don't have to.
llvm-svn: 22999
2005-08-24 04:57:57 +00:00
Chris Lattner
b6d034a841
Add callseq_begin/end support
...
Call stil not supported yet
llvm-svn: 22998
2005-08-24 00:47:15 +00:00
Chris Lattner
99282c7b92
Make -view-isel-dags show the dag before instruction selecting, in case
...
the target isel crashes due to unimplemented features like calls :)
llvm-svn: 22997
2005-08-24 00:34:29 +00:00
Nate Begeman
72eab5dd5c
Fix optimization of select_cc seteq X, 0, 1, 0 -> srl (ctlz X), log2 X size
...
llvm-svn: 22995
2005-08-24 00:21:28 +00:00
Chris Lattner
eeacce5a60
Implement LiveVariables.h change
...
llvm-svn: 22994
2005-08-24 00:09:33 +00:00
Chris Lattner
469652752c
adjust to new live variables interface
...
llvm-svn: 22992
2005-08-23 23:42:17 +00:00
Chris Lattner
cdc0cbbcd0
Adjust to new livevars interface
...
llvm-svn: 22991
2005-08-23 23:41:14 +00:00
Chris Lattner
774158239b
Simplify this code by using higher-level LiveVariables methods
...
llvm-svn: 22989
2005-08-23 22:51:41 +00:00
Chris Lattner
7c1c6e06f3
Simplify this code by using LiveVariables::KillsRegister
...
llvm-svn: 22988
2005-08-23 22:49:55 +00:00
Chris Lattner
22e91cc3b5
Keep track of which registers are related to which other registers.
...
Use this information to avoid doing expensive interval intersections for
registers that could not possible be interesting. This speeds up linscan
on ia64 compiling kc++ in release mode from taking 7.82s to 4.8s(!), total
itanium llc time on this program is 27.3s now. This marginally speeds up
PPC and X86, but they appear to be limited by other parts of linscan, not
this code.
On this program, on itanium, live intervals now takes 41% of llc time.
llvm-svn: 22986
2005-08-23 22:27:31 +00:00
Chris Lattner
9c0a243ce5
Fix PR618 and Regression/CodeGen/CBackend/2005-08-23-Fmod.ll by not emitting
...
x%y for 'rem' on fp values.
llvm-svn: 22984
2005-08-23 20:22:50 +00:00
Chris Lattner
5e3953d761
add a note
...
llvm-svn: 22982
2005-08-23 06:27:59 +00:00
Nate Begeman
f3ce09b36e
Ack, typo
...
llvm-svn: 22981
2005-08-23 05:45:10 +00:00
Nate Begeman
7216ad415b
Add an option to make SetCC illegal as a beta option
...
llvm-svn: 22979
2005-08-23 05:42:36 +00:00
Nate Begeman
bf8c3939d7
Teach the SelectionDAG how to transform select_cc eq, X, 0, 1, 0 into
...
either seteq X, 0 or srl (ctlz X), size(X-1), depending on what's legal
for the target.
llvm-svn: 22978
2005-08-23 05:41:12 +00:00
Nate Begeman
987121a61a
Teach Legalize how to turn setcc into select_cc
...
llvm-svn: 22977
2005-08-23 04:29:48 +00:00
Nate Begeman
06436b2b7d
Remove some instructions we no longer generate
...
llvm-svn: 22976
2005-08-23 01:16:46 +00:00
Chris Lattner
46323cf0e2
Remove some regs that are not used.
...
llvm-svn: 22975
2005-08-22 22:32:13 +00:00
Chris Lattner
956820d989
Nate noticed that 30% of the malloc/frees in llc come from calls to LowercaseString
...
in the asmprinter. This changes the .td files to use lower case register names,
avoiding the need to do this call. This speeds up the asmprinter from 1.52s
to 1.06s on kc++ in a release build.
llvm-svn: 22974
2005-08-22 22:00:02 +00:00
Chris Lattner
d2f2aff484
Fix a crash I introduced into the IA64 backend with my copyfromreg change.
...
It used to crash on any function that took float arguments.
llvm-svn: 22973
2005-08-22 21:33:11 +00:00
Chris Lattner
834a2316a3
Try to avoid scanning the fixed list. On architectures with a non-stupid
...
number of regs (e.g. most riscs), many functions won't need to use callee
clobbered registers. Do a speculative check to see if we can get a free
register without processing the fixed list (which has all of these). This
saves a lot of time on machines with lots of callee clobbered regs (e.g.
ppc and itanium, also x86).
This reduces ppc llc compile time from 184s -> 172s on kc++. This is probably
worth FAR FAR more on itanium though.
llvm-svn: 22972
2005-08-22 20:59:30 +00:00
Chris Lattner
95a157ae1a
Move some code in the register assignment case that only needs to happen if
...
we spill out of the fast path. The scan of active_ and the calls to
updateSpillWeights don't need to happen unless a spill occurs. This reduces
debug llc time of kc++ with ppc from 187.3s to 183.2s.
llvm-svn: 22971
2005-08-22 20:20:42 +00:00
Chris Lattner
9d46518e5c
Add a pass name for -time-passes output
...
llvm-svn: 22970
2005-08-22 18:28:09 +00:00
Chris Lattner
7f9e078d11
Fix a problem where constant expr shifts would not have their shift amount
...
promoted to the right type. This fixes: IA64/2005-08-22-LegalizerCrash.ll
llvm-svn: 22969
2005-08-22 17:28:31 +00:00
Chris Lattner
83b821b584
Speed up this loop a bit, based on some observations that Nate made, and
...
add some comments. This loop really needs to be reevaluated!
llvm-svn: 22966
2005-08-22 16:55:22 +00:00
Chris Lattner
ca0c0d7550
Implement stores.
...
llvm-svn: 22963
2005-08-22 01:27:59 +00:00
Chris Lattner
92626b9bc5
Add a fast-path for register values. Add support for constant pool entries,
...
allowing us to compile this:
float %test2(float* %P) {
%Q = load float* %P
%R = add float %Q, 10.1
ret float %R
}
to this:
_test2:
lfs r2, 0(r3)
lis r3, ha16(.CPI_test2_0)
lfs r3, lo16(.CPI_test2_0)(r3)
fadds f1, r2, r3
blr
llvm-svn: 22962
2005-08-22 01:04:32 +00:00
Chris Lattner
1d634b2f44
Fix compilation of:
...
float %test2(float* %P) {
%Q = load float* %P
%R = add float %Q, %Q
ret float %R
}
By returning the right result.
llvm-svn: 22961
2005-08-22 00:59:14 +00:00
Chris Lattner
b676e5a666
Make sure expressions only have one use before emitting them into a place that is conditionally executed
...
llvm-svn: 22960
2005-08-22 00:47:28 +00:00
Chris Lattner
c5292ec9de
Implement most of load support. There is still a bug though.
...
llvm-svn: 22959
2005-08-21 22:31:09 +00:00
Chris Lattner
466fecee19
add anew method
...
llvm-svn: 22957
2005-08-21 22:30:30 +00:00
Chris Lattner
4866356907
Add support for frame index nodes
...
llvm-svn: 22956
2005-08-21 19:56:04 +00:00
Chris Lattner
0548f50501
add a method
...
llvm-svn: 22955
2005-08-21 19:48:59 +00:00
Chris Lattner
968aeb18b0
Don't print out the MBB label for the entry mbb
...
llvm-svn: 22953
2005-08-21 19:09:33 +00:00
Chris Lattner
519acbfb76
Simplify the logic for BRTWOWAY_CC handling. The isel code already
...
simplifies BRTWOWAY into BR if one of the results is a fall-through.
Unless I'm missing something, there is no reason to duplicate this
in the target-specific code.
llvm-svn: 22952
2005-08-21 19:03:28 +00:00
Chris Lattner
2a1823d178
Implement selection for branches.
...
llvm-svn: 22951
2005-08-21 18:50:37 +00:00
Chris Lattner
707b39fb8c
add a method
...
llvm-svn: 22949
2005-08-21 18:49:33 +00:00
Chris Lattner
154b2bc59b
Add support for basic blocks, fix a bug in result # computation
...
llvm-svn: 22948
2005-08-21 18:49:29 +00:00
Chris Lattner
539c3fa863
When legalizing brcond ->brcc or select -> selectcc, make sure to truncate
...
the old condition to a one bit value. The incoming value must have been
promoted, and the top bits are undefined. This causes us to generate:
_test:
rlwinm r2, r3, 0, 31, 31
li r3, 17
cmpwi cr0, r2, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
li r3, 1
.LBB_test_2: ;
blr
instead of:
_test:
rlwinm r2, r3, 0, 31, 31
li r2, 17
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
li r2, 1
.LBB_test_2: ;
or r3, r2, r2
blr
for:
int %test(bool %c) {
%retval = select bool %c, int 17, int 1
ret int %retval
}
llvm-svn: 22947
2005-08-21 18:03:09 +00:00
Chris Lattner
0500e362bf
If the false value for a select_cc is really simple (has no inputs), evaluate
...
it in the block. This codegens:
int %test(bool %c) {
%retval = select bool %c, int 17, int 1
ret int %retval
}
as:
_test:
rlwinm r2, r3, 0, 31, 31
li r2, 17
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
li r2, 1
.LBB_test_2: ;
or r3, r2, r2
blr
instead of:
_test:
rlwinm r2, r3, 0, 31, 31
li r2, 17
li r4, 1
cmpwi cr0, r3, 0
bne .LBB_test_2 ;
.LBB_test_1: ;
or r2, r4, r4
.LBB_test_2: ;
or r3, r2, r2
blr
... which is one fewer instruction. The savings are more significant for
global address and constantfp nodes.
llvm-svn: 22946
2005-08-21 17:41:11 +00:00
Duraid Madina
3588ea9bf5
reenable collapse of loadimm+AND -> dep.z (thanks guys)
...
llvm-svn: 22944
2005-08-21 15:43:53 +00:00
Chris Lattner
4b08ba26d8
fix bogus warning
...
llvm-svn: 22943
2005-08-20 18:07:27 +00:00
Jim Laskey
9b0a275f04
Repair an out by one error for IA64.
...
llvm-svn: 22942
2005-08-20 11:05:23 +00:00
Chris Lattner
4564039498
add support for global address, including PIC support.
...
This REALLY should be lowered by the legalizer!
llvm-svn: 22941
2005-08-19 22:38:53 +00:00
Chris Lattner
319e65696d
Add support for global address nodes
...
llvm-svn: 22940
2005-08-19 22:38:24 +00:00
Chris Lattner
1be7eddecf
Add support for TargetGlobalAddress nodes
...
llvm-svn: 22938
2005-08-19 22:31:04 +00:00
Chris Lattner
6d7f814b01
Implement CopyFromReg, TokenFactor, and fix a bug in CopyToReg. This allows
...
us to compile stuff like this:
double %test(double %A, double %B, double %C, double %E) {
%F = mul double %A, %A
%G = add double %F, %B
%H = sub double -0.0, %G
%I = mul double %H, %C
%J = add double %I, %E
ret double %J
}
to:
_test:
fnmadd f0, f1, f1, f2
fmadd f1, f0, f3, f4
blr
woot!
llvm-svn: 22937
2005-08-19 21:43:53 +00:00
Chris Lattner
0875d1ab89
Fix a bug in previous commit
...
llvm-svn: 22936
2005-08-19 21:34:13 +00:00
Chris Lattner
65d66797a5
Fix a typeo, no wonder all tokenfactor edges were the same!
...
llvm-svn: 22935
2005-08-19 21:33:02 +00:00
Chris Lattner
4990335eb8
Print physreg register nodes with target names (e.g. F1) instead of numbers
...
llvm-svn: 22934
2005-08-19 21:21:16 +00:00
Chris Lattner
78b200eb74
Before implementing copyfromreg, we'll implement copytoreg correctly.
...
This gets us this for the previous testcase:
_test:
lis r2, 0
ori r3, r2, 65535
blr
Note that we actually write to r3 (the return reg) correctly now :)
llvm-svn: 22933
2005-08-19 20:50:53 +00:00
Chris Lattner
cc3035e989
Now that we have operand info for machine instructions, use it to create
...
temporary registers for things that define a register. This allows dag->dag
isel to compile this:
int %test() { ret int 65535 }
into:
_test:
lis r2, 0
ori r2, r2, 65535
blr
Next up, getting CopyFromReg to work, allowing arguments and cross-bb values.
llvm-svn: 22932
2005-08-19 20:45:43 +00:00
Chris Lattner
bd26a82051
Split RegisterClass 'Methods' into MethodProtos and MethodBodies
...
llvm-svn: 22929
2005-08-19 19:13:20 +00:00
Chris Lattner
248933eb39
put reg classes into namespace
...
llvm-svn: 22927
2005-08-19 18:53:43 +00:00
Chris Lattner
1dee9b1685
Put reg classes into namespaces
...
llvm-svn: 22926
2005-08-19 18:52:55 +00:00
Chris Lattner
757a770a57
Put register classes into namespaces
...
llvm-svn: 22925
2005-08-19 18:51:57 +00:00
Chris Lattner
8975de187f
Put register classes in namespaces
...
llvm-svn: 22924
2005-08-19 18:50:46 +00:00
Chris Lattner
dc89d6be04
Fix code that assumes the register info will be dumped into a target
...
namespace instead of the reg class namespace. Update getRegClassForType()
to use modified names due to tblgen change.
llvm-svn: 22923
2005-08-19 18:50:11 +00:00
Chris Lattner
cca0503c24
put reg classes in namespaces
...
llvm-svn: 22922
2005-08-19 18:49:22 +00:00
Chris Lattner
3fb85f2702
Require that targets specify a namespace for their register classes.
...
llvm-svn: 22921
2005-08-19 18:48:48 +00:00
Chris Lattner
3262673610
The skeleton target has never had an isel
...
llvm-svn: 22917
2005-08-19 18:35:41 +00:00
Chris Lattner
f391f286d7
This code has always been dead on itanium
...
llvm-svn: 22916
2005-08-19 18:34:37 +00:00
Chris Lattner
0faf0b7313
This code has always been dead for alpha
...
llvm-svn: 22915
2005-08-19 18:33:26 +00:00
Chris Lattner
8ad3700a3e
The simple isel being gone makes this dead!
...
llvm-svn: 22914
2005-08-19 18:32:03 +00:00
Chris Lattner
ef6d8d8e94
Now that the simple isels are dead, so is this.
...
llvm-svn: 22913
2005-08-19 18:30:39 +00:00
Chris Lattner
63143a0609
Sparcv9 gets no operand info
...
llvm-svn: 22909
2005-08-19 16:56:56 +00:00
Jeff Cohen
486e36cfde
Fix VC++ constant truncation warning.
...
llvm-svn: 22907
2005-08-19 16:19:21 +00:00
Duraid Madina
fcbb38077b
a bugfix (up top) and a quick repair job: disable generation of dep.z
...
(which died about a week ago) so we're back to load-(2^n-1)-then-AND
sequences. slow, but things should now be Almost Completely Working,
modulo those pesky alignment/ABI issues.
llvm-svn: 22904
2005-08-19 13:25:50 +00:00
Jeff Cohen
d1f22b1282
Fix VC++ precedence warning.
...
llvm-svn: 22902
2005-08-19 04:39:48 +00:00
Nate Begeman
ce400dac21
Fix a bug where we were passing the wrong number of arguments to an
...
instruction.
llvm-svn: 22901
2005-08-19 03:42:28 +00:00
Chris Lattner
d18beab94c
Fix computation of # operands, add a temporary hack for CopyToReg
...
llvm-svn: 22896
2005-08-19 01:01:34 +00:00
Chris Lattner
8cbddfc8c5
mark variable arity instructions as such. Alpha wins the battle for
...
cleanest backend in this metric :)
llvm-svn: 22893
2005-08-19 00:51:37 +00:00
Chris Lattner
3e0335c9d1
Mark some instructions as variable_ops, and PSEUDO_ALLOC as taking a GPR.
...
I'm not convinced this is all of them, but I can't do much testing, because
IA64 LLC crashes on big programs :(
llvm-svn: 22892
2005-08-19 00:47:42 +00:00
Chris Lattner
423d7cbbf8
add a few missing cases
...
llvm-svn: 22891
2005-08-19 00:41:29 +00:00
Chris Lattner
e2967ac53d
Give ADJCALLSTACKDOWN/UP the correct operands.
...
Give a whole bunch of other stuff variable operands, particularly FP. The
FP stackifier is playing fast and loose with operands here, so we have to
mark them all as variable. This will have to be fixed before we can dag->dag
the X86 backend. The solution is for the pre-stackifier and post-stackifier
instructions to all be disjoint.
llvm-svn: 22890
2005-08-19 00:38:22 +00:00
Nate Begeman
93c4bc6dca
ISD::OR, and it's accompanying SelectBitfieldInsert
...
llvm-svn: 22889
2005-08-19 00:38:14 +00:00
Chris Lattner
a9d68f140e
The variable SAR's only take one operand too
...
llvm-svn: 22888
2005-08-19 00:31:37 +00:00
Chris Lattner
145695927a
Stop adding bogus operands to variable shifts on X86. These instructions
...
only take one operand. The other comes implicitly in through CL.
llvm-svn: 22887
2005-08-19 00:16:17 +00:00
Nate Begeman
be1f314a47
Remove the X86 and PowerPC Simple instruction selectors; their time has
...
passed.
llvm-svn: 22886
2005-08-18 23:53:15 +00:00
Nate Begeman
33acb2c135
Add shifts.
...
llvm-svn: 22884
2005-08-18 23:38:00 +00:00
Chris Lattner
4bd805e785
Fix operand numbers by marking variable arity nodes as such and by fixing
...
the operand lists of a few other nodes.
llvm-svn: 22883
2005-08-18 23:25:33 +00:00
Chris Lattner
51e851d0e6
MFLR doesn't take an operand, the LR register is implicit
...
llvm-svn: 22882
2005-08-18 23:24:50 +00:00
Chris Lattner
5cfa377947
Add a new flag
...
llvm-svn: 22881
2005-08-18 23:17:07 +00:00
Chris Lattner
0c8c2c102d
add a new -view-sched-dags option to view dags as they are sent to the scheduler.
...
llvm-svn: 22878
2005-08-18 20:11:49 +00:00
Chris Lattner
4e00ff6e70
Move this to the emitter
...
llvm-svn: 22877
2005-08-18 20:08:53 +00:00
Chris Lattner
d342de9aaa
Implement the first chunk of a code emitter. This is sophisticated enough to
...
codegen:
_empty:
.LBB_empty_0: ;
blr
but can't do anything more (yet). :)
llvm-svn: 22876
2005-08-18 20:07:59 +00:00
Jim Laskey
18b9b8df86
More optimal solution for loading constants.
...
llvm-svn: 22870
2005-08-18 18:58:23 +00:00
Chris Lattner
015d73996d
After selecting the instructions for a basic block, emit the instructions
...
llvm-svn: 22869
2005-08-18 18:46:06 +00:00
Chris Lattner
1b4727de7d
new file, obviously just a stub
...
llvm-svn: 22868
2005-08-18 18:45:24 +00:00
Chris Lattner
15b5c7ca84
remove some unused stuff
...
llvm-svn: 22866
2005-08-18 18:34:00 +00:00
Nate Begeman
d16a26a8d4
Fix int foo() { return 65535; } by using the top 16 bits of the constant
...
as the argument to LIS rather than the result of HA16(constant).
The DAG->DAG ISel was already doing the right thing.
llvm-svn: 22865
2005-08-18 18:14:49 +00:00
Nate Begeman
d32638706a
Improve ISD::Constant codegen.
...
Now for int foo() { return -1; } we generate:
_foo:
li r3, -1
blr
instead of
_foo:
lis r2, -1
ori r3, r2, 65535
blr
llvm-svn: 22864
2005-08-18 18:01:39 +00:00
Chris Lattner
1a908c8920
Enable critical edge splitting by default
...
llvm-svn: 22863
2005-08-18 17:35:14 +00:00
Chris Lattner
37faf35b35
replace switch stmt with an assert, generate li 0 instead of lis 0 for 0,
...
to make the code follow people's expectations better.
llvm-svn: 22861
2005-08-18 17:16:52 +00:00
Jim Laskey
32d4c85278
Handle loading of 0x????0000 constants with a single instruction.
...
llvm-svn: 22858
2005-08-18 15:52:30 +00:00
Nate Begeman
b3821a3943
Add support for ISD::AND, and its various optimized forms.
...
llvm-svn: 22857
2005-08-18 07:30:46 +00:00
Nate Begeman
19a271a67b
Add support for target DAG nodes that take 4 operands, such as PowerPC's
...
rlwinm.
llvm-svn: 22856
2005-08-18 07:30:15 +00:00
Nate Begeman
cfb9a74c2e
Maintain consistency in negating things
...
llvm-svn: 22855
2005-08-18 05:44:50 +00:00
Nate Begeman
72d6f8800d
Implement XOR, remove a broken sign_extend_inreg case
...
llvm-svn: 22854
2005-08-18 05:00:13 +00:00
Chris Lattner
802080d812
Fix printing of VTSDNodes
...
llvm-svn: 22853
2005-08-18 03:31:02 +00:00
Nate Begeman
4bfb4a215d
Add a bunch more simple nodes.
...
llvm-svn: 22851
2005-08-18 03:04:18 +00:00
Nate Begeman
457367f14c
Add a couple more nodes that are easy to handle
...
llvm-svn: 22850
2005-08-18 00:53:47 +00:00
Nate Begeman
74d5529b88
Be fruitful and multiply!
...
llvm-svn: 22849
2005-08-18 00:21:41 +00:00
Jim Laskey
04160c6d8d
Better version of isIntImmediate.
...
llvm-svn: 22848
2005-08-18 00:15:15 +00:00
Nate Begeman
3fcf47d8f0
Teach the DAG->DAG ISel about FNEG, and how it can be used to invert
...
several of the PowerPC opcodes that come in both negated and non-negated
forms.
llvm-svn: 22845
2005-08-17 23:46:35 +00:00
Chris Lattner
ea7dfd53d6
Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash
...
on 177.mesa
llvm-svn: 22843
2005-08-17 21:22:41 +00:00
Jim Laskey
d66e616545
Move the code dependency for MathExtras.h from SelectionDAGNodes.h.
...
Added some class dividers in SelectionDAG.cpp.
llvm-svn: 22841
2005-08-17 20:08:02 +00:00
Jim Laskey
8ad8f71447
Move code dependency for MathExtras.h out of Constants.h.
...
llvm-svn: 22840
2005-08-17 20:06:22 +00:00
Jim Laskey
17e7599ecb
Promote dependency for MathExtras.h out of Constants.h.
...
llvm-svn: 22839
2005-08-17 20:04:34 +00:00
Jim Laskey
b74c666186
Culling out use of unions for converting FP to bits and vice versa.
...
llvm-svn: 22838
2005-08-17 19:34:49 +00:00
Chris Lattner
c6aa80668e
add a beta option for turning on dag->dag isel
...
llvm-svn: 22837
2005-08-17 19:33:30 +00:00
Chris Lattner
43ff01e2e6
initial hack at a dag->dag instruction selector. This is obviously woefully
...
incomplete, but it is a start. It handles basic argument/retval stuff, immediates,
add and sub.
llvm-svn: 22836
2005-08-17 19:33:03 +00:00
Chris Lattner
f61cce952b
add prototype, remove dead proto
...
llvm-svn: 22835
2005-08-17 19:32:03 +00:00
Chris Lattner
ab0de9d7fc
Fix a bug in RemoveDeadNodes where it would crash when its "optional"
...
argument is not specified.
Implement ReplaceAllUsesWith.
llvm-svn: 22834
2005-08-17 19:00:20 +00:00
Jim Laskey
686d6a1cb2
Switched to using BitsToDouble for int_to_float to avoid aliasing problem.
...
llvm-svn: 22831
2005-08-17 17:42:52 +00:00
Chris Lattner
33900811ee
Fix some bugs in the alpha backend, some of which I introduced yesterday,
...
and some that were preexisting. All alpha regtests pass now.
llvm-svn: 22829
2005-08-17 17:08:24 +00:00
Jim Laskey
898ba557d0
Change hex float constants for the sake of VC++.
...
llvm-svn: 22828
2005-08-17 09:44:59 +00:00
Chris Lattner
c9950c11a9
Add a new beta option for critical edge splitting, to avoid a problem that
...
Nate noticed in yacr2 (and I know occurs in other places as well).
This is still rough, as the critical edge blocks are not intelligently placed
but is added to get some idea to see if this improves performance.
llvm-svn: 22825
2005-08-17 06:37:43 +00:00
Chris Lattner
2bf7cb5213
Use a new helper to split critical edges, making the code simpler.
...
Do not claim to not change the CFG. We do change the cfg to split critical
edges. This isn't causing us a problem now, but could likely do so in the
future.
llvm-svn: 22824
2005-08-17 06:35:16 +00:00
Chris Lattner
ba28c2733f
Fix a regression on X86, where FP values can be promoted too.
...
llvm-svn: 22822
2005-08-17 06:06:25 +00:00
Chris Lattner
63f774ec6e
Fix a few small typos I noticed when converting this over to the DAG->DAG
...
selector. Also, there is no difference between addSImm and addImm, so just
use addImm, folding some branches.
llvm-svn: 22819
2005-08-17 01:25:14 +00:00
Jim Laskey
9828f26cf1
Removed UINT_TO_FP and SINT_TO_FP from ISel outright.
...
llvm-svn: 22818
2005-08-17 01:14:38 +00:00
Andrew Lenharth
73370ba5fd
thinko. Should fix s4addl.ll regression
...
llvm-svn: 22817
2005-08-17 00:47:24 +00:00
Jim Laskey
5909c8b10a
Remove ISel code generation for UINT_TO_FP and SINT_TO_FP. Now asserts if
...
marked as legal.
llvm-svn: 22816
2005-08-17 00:41:40 +00:00
Jim Laskey
6267b2c97c
Make UINT_TO_FP and SINT_TO_FP use generic expansion.
...
llvm-svn: 22815
2005-08-17 00:40:22 +00:00
Jim Laskey
f2516a9180
Added generic code expansion for [signed|unsigned] i32 to [f32|f64] casts in the
...
legalizer. PowerPC now uses this expansion instead of ISel version.
Example:
// signed integer to double conversion
double f1(signed x) {
return (double)x;
}
// unsigned integer to double conversion
double f2(unsigned x) {
return (double)x;
}
// signed integer to float conversion
float f3(signed x) {
return (float)x;
}
// unsigned integer to float conversion
float f4(unsigned x) {
return (float)x;
}
Byte Code:
internal fastcc double %_Z2f1i(int %x) {
entry:
%tmp.1 = cast int %x to double ; <double> [#uses=1]
ret double %tmp.1
}
internal fastcc double %_Z2f2j(uint %x) {
entry:
%tmp.1 = cast uint %x to double ; <double> [#uses=1]
ret double %tmp.1
}
internal fastcc float %_Z2f3i(int %x) {
entry:
%tmp.1 = cast int %x to float ; <float> [#uses=1]
ret float %tmp.1
}
internal fastcc float %_Z2f4j(uint %x) {
entry:
%tmp.1 = cast uint %x to float ; <float> [#uses=1]
ret float %tmp.1
}
internal fastcc double %_Z2g1i(int %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.2 = cast int %x to uint ; <uint> [#uses=1]
%tmp.3 = xor uint %tmp.2, 2147483648 ; <uint> [#uses=1]
%tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %tmp.3, uint* %tmp.5
%tmp.9 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.10 = load double* %tmp.9 ; <double> [#uses=1]
%tmp.13 = load double* cast (long* %signed_bias to double*) ; <double> [#uses=1]
%tmp.14 = sub double %tmp.10, %tmp.13 ; <double> [#uses=1]
ret double %tmp.14
}
internal fastcc double %_Z2g2j(uint %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %x, uint* %tmp.1
%tmp.4 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.5 = load double* %tmp.4 ; <double> [#uses=1]
%tmp.8 = load double* cast (long* %unsigned_bias to double*) ; <double> [#uses=1]
%tmp.9 = sub double %tmp.5, %tmp.8 ; <double> [#uses=1]
ret double %tmp.9
}
internal fastcc float %_Z2g3i(int %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.2 = cast int %x to uint ; <uint> [#uses=1]
%tmp.3 = xor uint %tmp.2, 2147483648 ; <uint> [#uses=1]
%tmp.5 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %tmp.3, uint* %tmp.5
%tmp.9 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.10 = load double* %tmp.9 ; <double> [#uses=1]
%tmp.13 = load double* cast (long* %signed_bias to double*) ; <double> [#uses=1]
%tmp.14 = sub double %tmp.10, %tmp.13 ; <double> [#uses=1]
%tmp.16 = cast double %tmp.14 to float ; <float> [#uses=1]
ret float %tmp.16
}
internal fastcc float %_Z2g4j(uint %x) {
entry:
%buffer = alloca [2 x uint] ; <[2 x uint]*> [#uses=3]
%tmp.0 = getelementptr [2 x uint]* %buffer, int 0, int 0 ; <uint*> [#uses=1]
store uint 1127219200, uint* %tmp.0
%tmp.1 = getelementptr [2 x uint]* %buffer, int 0, int 1 ; <uint*> [#uses=1]
store uint %x, uint* %tmp.1
%tmp.4 = cast [2 x uint]* %buffer to double* ; <double*> [#uses=1]
%tmp.5 = load double* %tmp.4 ; <double> [#uses=1]
%tmp.8 = load double* cast (long* %unsigned_bias to double*) ; <double> [#uses=1]
%tmp.9 = sub double %tmp.5, %tmp.8 ; <double> [#uses=1]
%tmp.11 = cast double %tmp.9 to float ; <float> [#uses=1]
ret float %tmp.11
}
PowerPC Code:
.machine ppc970
.const
.align 2
.CPIl1__Z2f1i_0: ; float 0x4330000080000000
.long 1501560836 ; float 4.5036e+15
.text
.align 2
.globl l1__Z2f1i
l1__Z2f1i:
.LBBl1__Z2f1i_0: ; entry
xoris r2, r3, 32768
stw r2, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl1__Z2f1i_0)
lfs f1, lo16(.CPIl1__Z2f1i_0)(r2)
fsub f1, f0, f1
blr
.const
.align 2
.CPIl2__Z2f2j_0: ; float 0x4330000000000000
.long 1501560832 ; float 4.5036e+15
.text
.align 2
.globl l2__Z2f2j
l2__Z2f2j:
.LBBl2__Z2f2j_0: ; entry
stw r3, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl2__Z2f2j_0)
lfs f1, lo16(.CPIl2__Z2f2j_0)(r2)
fsub f1, f0, f1
blr
.const
.align 2
.CPIl3__Z2f3i_0: ; float 0x4330000080000000
.long 1501560836 ; float 4.5036e+15
.text
.align 2
.globl l3__Z2f3i
l3__Z2f3i:
.LBBl3__Z2f3i_0: ; entry
xoris r2, r3, 32768
stw r2, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl3__Z2f3i_0)
lfs f1, lo16(.CPIl3__Z2f3i_0)(r2)
fsub f0, f0, f1
frsp f1, f0
blr
.const
.align 2
.CPIl4__Z2f4j_0: ; float 0x4330000000000000
.long 1501560832 ; float 4.5036e+15
.text
.align 2
.globl l4__Z2f4j
l4__Z2f4j:
.LBBl4__Z2f4j_0: ; entry
stw r3, -4(r1)
lis r2, 17200
stw r2, -8(r1)
lfd f0, -8(r1)
lis r2, ha16(.CPIl4__Z2f4j_0)
lfs f1, lo16(.CPIl4__Z2f4j_0)(r2)
fsub f0, f0, f1
frsp f1, f0
blr
llvm-svn: 22814
2005-08-17 00:39:29 +00:00
Chris Lattner
0d2456e1f0
add a new TargetConstant node
...
llvm-svn: 22813
2005-08-17 00:34:06 +00:00
Nate Begeman
784c8068a7
Implement a couple improvements:
...
Remove dead code in ISD::Constant handling
Add support for add long, imm16
We now codegen 'long long foo(long long a) { return ++a; }'
as:
addic r4, r4, 1
addze r3, r3
blr
instead of:
li r2, 1
li r5, 0
addc r2, r4, r2
adde r3, r3, r5
blr
llvm-svn: 22811
2005-08-17 00:20:08 +00:00
Chris Lattner
5a1d5e30e2
This is a dummy, it doesn't matter what the ValueType is
...
llvm-svn: 22809
2005-08-16 21:59:52 +00:00
Chris Lattner
79f5ebc7b9
updates for changes in nodes
...
llvm-svn: 22808
2005-08-16 21:58:15 +00:00
Chris Lattner
7c76278242
update the backends to work with the new CopyFromReg/CopyToReg/ImplicitDef nodes
...
llvm-svn: 22807
2005-08-16 21:56:37 +00:00
Chris Lattner
33182325f5
Eliminate the RegSDNode class, which 3 nodes (CopyFromReg/CopyToReg/ImplicitDef)
...
used to tack a register number onto the node.
Instead of doing this, make a new node, RegisterSDNode, which is a leaf
containing a register number. These three operations just become normal
DAG nodes now, instead of requiring special handling.
Note that with this change, it is no longer correct to make illegal
CopyFromReg/CopyToReg nodes. The legalizer will not touch them, and this
is bad, so don't do it. :)
llvm-svn: 22806
2005-08-16 21:55:35 +00:00
Nate Begeman
371e49515d
Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty
...
fixme from the PowerPC backend. Emit slightly better code for legalizing
select_cc.
llvm-svn: 22805
2005-08-16 19:49:35 +00:00
Chris Lattner
bc89226527
Allow passing a dag into dump and getOperationName. If one is available
...
when printing a node, use it to render target operations with their
target instruction name instead of "<<unknown>>".
llvm-svn: 22804
2005-08-16 18:33:07 +00:00
Chris Lattner
7e57d18b79
Use a extant helper to do this.
...
llvm-svn: 22802
2005-08-16 18:31:23 +00:00
Chris Lattner
1973278b38
Add some methods for dag->dag isel.
...
Split RemoveNodeFromCSEMaps out of DeleteNodesIfDead to do it.
llvm-svn: 22801
2005-08-16 18:17:10 +00:00
Chris Lattner
f22556d3ad
Pull the LLVM -> DAG lowering code out of the pattern selector so that it
...
can be shared with the DAG->DAG selector.
llvm-svn: 22799
2005-08-16 17:14:42 +00:00
Chris Lattner
5cf983ee0f
Fix a bad case in gzip where we put lots of things in registers across the
...
loop, because a IV-dependent value was used outside of the loop and didn't
have immediate-folding capability
llvm-svn: 22798
2005-08-16 00:38:11 +00:00
Chris Lattner
e515416396
Fix Transforms/LoopStrengthReduce/2005-08-15-AddRecIV.ll
...
llvm-svn: 22797
2005-08-16 00:37:01 +00:00
Chris Lattner
73785d2ef2
Turn loop strength reduction on by default.
...
Only run createLowerConstantExpressionsPass for the simple isel. The DAG
isel has no need for it.
llvm-svn: 22794
2005-08-15 23:47:04 +00:00
Chris Lattner
587a75b6e0
Teach LLVM to know how many times a loop executes when constructed with
...
a < expression, e.g.: for (i = m; i < n; ++i)
llvm-svn: 22793
2005-08-15 23:33:51 +00:00
Jim Laskey
24b84072ea
Broke 80 column rule.
...
llvm-svn: 22792
2005-08-15 17:35:26 +00:00
Jim Laskey
42623a9539
Changed code gen for int to f32 to use rounding. This makes FP results
...
consistent with gcc.
llvm-svn: 22791
2005-08-15 17:14:19 +00:00
Andrew Lenharth
b65b1568ae
isIntImmediate is a good Idea. Add a flavor that checks bounds while it is at it
...
llvm-svn: 22790
2005-08-15 14:31:37 +00:00
Nate Begeman
d5e739dcc2
Fix last night's PPC32 regressions by
...
1. Not selecting the false value of a select_cc in the false arm, which
isn't legal for nested selects.
2. Actually returning the node we created and Legalized in the FP_TO_UINT
Expander.
llvm-svn: 22789
2005-08-14 18:38:32 +00:00
Nate Begeman
e5394d453d
Fix last night's X86 regressions by putting code for SSE in the if(SSE)
...
block. nur.
llvm-svn: 22788
2005-08-14 18:37:02 +00:00
Andrew Lenharth
ed07233868
only build .a on alpha
...
llvm-svn: 22787
2005-08-14 15:14:34 +00:00
Nate Begeman
4d959f6627
Fix FP_TO_UINT with Scalar SSE2 now that the legalizer can handle it. We
...
now generate the relatively good code sequences:
unsigned short foo(float a) { return a; }
_foo:
movss 4(%esp), %xmm0
cvttss2si %xmm0, %eax
movzwl %ax, %eax
ret
and
unsigned bar(float a) { return a; }
_bar:
movss .CPI_bar_0, %xmm0
movss 4(%esp), %xmm1
movapd %xmm1, %xmm2
subss %xmm0, %xmm2
cvttss2si %xmm2, %eax
xorl $-2147483648, %eax
cvttss2si %xmm1, %ecx
ucomiss %xmm0, %xmm1
cmovb %ecx, %eax
ret
llvm-svn: 22786
2005-08-14 04:36:51 +00:00
Nate Begeman
36853ee1fd
Teach the legalizer how to legalize FP_TO_UINT.
...
Teach the legalizer to promote FP_TO_UINT to FP_TO_SINT if the wider
FP_TO_UINT is also illegal. This allows us on PPC to codegen
unsigned short foo(float a) { return a; }
as:
_foo:
.LBB_foo_0: ; entry
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
rlwinm r3, r2, 0, 16, 31
blr
instead of:
_foo:
.LBB_foo_0: ; entry
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
lis r3, ha16(.CPI_foo_0)
lfs f0, lo16(.CPI_foo_0)(r3)
fcmpu cr0, f1, f0
blt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
fsubs f0, f1, f0
fctiwz f0, f0
stfd f0, -16(r1)
lwz r2, -12(r1)
xoris r2, r2, 32768
.LBB_foo_2: ; entry
rlwinm r3, r2, 0, 16, 31
blr
llvm-svn: 22785
2005-08-14 01:20:53 +00:00
Nate Begeman
83f6b98c42
Make FP_TO_UINT Illegal. This allows us to generate significantly better
...
codegen for FP_TO_UINT by using the legalizer's SELECT variant.
Implement a codegen improvement for SELECT_CC, selecting the false node in
the MBB that feeds the phi node. This allows us to codegen:
void foo(int *a, int b, int c) { int d = (a < b) ? 5 : 9; *a = d; }
as:
_foo:
li r2, 5
cmpw cr0, r4, r3
bgt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
li r2, 9
.LBB_foo_2: ; entry
stw r2, 0(r3)
blr
insted of:
_foo:
li r2, 5
li r5, 9
cmpw cr0, r4, r3
bgt .LBB_foo_2 ; entry
.LBB_foo_1: ; entry
or r2, r5, r5
.LBB_foo_2: ; entry
stw r2, 0(r3)
blr
llvm-svn: 22784
2005-08-14 01:17:16 +00:00
Andrew Lenharth
107a0a7690
Testing a variable before it is defined doesn't work so well. It is a fairly small thing, so just let everyone build the .a file
...
llvm-svn: 22783
2005-08-13 14:58:23 +00:00
Chris Lattner
47d3ec3525
Ooops, don't forget to clear this. The real inner loop is now:
...
.LBB_foo_3: ; no_exit.1
lfd f2, 0(r9)
lfd f3, 8(r9)
fmul f4, f1, f2
fmadd f4, f0, f3, f4
stfd f4, 8(r9)
fmul f3, f1, f3
fmsub f2, f0, f2, f3
stfd f2, 0(r9)
addi r9, r9, 16
addi r8, r8, 1
cmpw cr0, r8, r4
ble .LBB_foo_3 ; no_exit.1
llvm-svn: 22782
2005-08-13 07:42:01 +00:00
Chris Lattner
5949d49032
Recursively scan scev expressions for common subexpressions. This allows us
...
to handle nested loops much better, for example, by being able to tell that
these two expressions:
{( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1>
{(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1>
Have the following common part that can be shared:
{(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1>
This allows us to codegen an important inner loop in 168.wupwise as:
.LBB_foo_4: ; no_exit.1
lfd f2, 16(r9)
fmul f3, f0, f2
fmul f2, f1, f2
fadd f4, f3, f2
stfd f4, 8(r9)
fsub f2, f3, f2
stfd f2, 16(r9)
addi r8, r8, 1
addi r9, r9, 16
cmpw cr0, r8, r4
ble .LBB_foo_4 ; no_exit.1
instead of:
.LBB_foo_3: ; no_exit.1
lfdx f2, r6, r9
add r10, r6, r9
lfd f3, 8(r10)
fmul f4, f1, f2
fmadd f4, f0, f3, f4
stfd f4, 8(r10)
fmul f3, f1, f3
fmsub f2, f0, f2, f3
stfdx f2, r6, r9
addi r9, r9, 16
addi r8, r8, 1
cmpw cr0, r8, r4
ble .LBB_foo_3 ; no_exit.1
llvm-svn: 22781
2005-08-13 07:27:18 +00:00
Nate Begeman
dc3154ec66
Remove an unncessary argument to SimplifySelectCC and add an additional
...
assert when creating a select_cc node.
llvm-svn: 22780
2005-08-13 06:14:17 +00:00
Nate Begeman
b6651e81a0
Fix the fabs regression on x86 by abstracting the select_cc optimization
...
out into SimplifySelectCC. This allows both ISD::SELECT and ISD::SELECT_CC
to use the same set of simplifying folds.
llvm-svn: 22779
2005-08-13 06:00:21 +00:00
Nate Begeman
a22bf778c9
Remove support for 64b PPC, it's been broken for a long time. It'll be
...
back once a DAG->DAG ISel exists.
llvm-svn: 22778
2005-08-13 05:59:16 +00:00
Andrew Lenharth
6b62b479fa
Fix oversized GOT problem with gcc-4 on alpha
...
llvm-svn: 22777
2005-08-13 05:09:50 +00:00
Chris Lattner
89c1dfc733
Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes
...
a problem in LoopStrengthReduction, where it would split critical edges
then confused itself with outdated loop information.
llvm-svn: 22776
2005-08-13 01:38:43 +00:00
Chris Lattner
79396539d3
remove dead code. The exit block list is computed on demand, thus does not
...
need to be updated. This code is a relic from when it did.
llvm-svn: 22775
2005-08-13 01:30:36 +00:00
Chris Lattner
21381e8424
implement a couple of simple shift foldings.
...
e.g. (X & 7) >> 3 -> 0
llvm-svn: 22774
2005-08-12 23:54:58 +00:00
Jim Laskey
35960708b7
Fix for 2005-08-12-rlwimi-crash.ll. Make allowance for masks being shifted to
...
zero.
llvm-svn: 22773
2005-08-12 23:52:46 +00:00
Jim Laskey
a568700618
1. This changes handles the cases of (~x)&y and x&(~y) yielding ANDC, and
...
(~x)|y and x|(~y) yielding ORC.
llvm-svn: 22771
2005-08-12 23:38:02 +00:00
Chris Lattner
8447b49526
When splitting critical edges, make sure not to leave the new block in the
...
middle of the loop. This turns a critical loop in gzip into this:
.LBB_test_1: ; loopentry
or r27, r28, r28
add r28, r3, r27
lhz r28, 3(r28)
add r26, r4, r27
lhz r26, 3(r26)
cmpw cr0, r28, r26
bne .LBB_test_8 ; loopentry.loopexit_crit_edge
.LBB_test_2: ; shortcirc_next.0
add r28, r3, r27
lhz r28, 5(r28)
add r26, r4, r27
lhz r26, 5(r26)
cmpw cr0, r28, r26
bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge
.LBB_test_3: ; shortcirc_next.1
add r28, r3, r27
lhz r28, 7(r28)
add r26, r4, r27
lhz r26, 7(r26)
cmpw cr0, r28, r26
bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge
.LBB_test_4: ; shortcirc_next.2
add r28, r3, r27
lhz r26, 9(r28)
add r28, r4, r27
lhz r25, 9(r28)
addi r28, r27, 8
cmpw cr7, r26, r25
mfcr r26, 1
rlwinm r26, r26, 31, 31, 31
add r25, r8, r27
cmpw cr7, r25, r7
mfcr r25, 1
rlwinm r25, r25, 29, 31, 31
and. r26, r26, r25
bne .LBB_test_1 ; loopentry
instead of this:
.LBB_test_1: ; loopentry
or r27, r28, r28
add r28, r3, r27
lhz r28, 3(r28)
add r26, r4, r27
lhz r26, 3(r26)
cmpw cr0, r28, r26
beq .LBB_test_3 ; shortcirc_next.0
.LBB_test_2: ; loopentry.loopexit_crit_edge
add r2, r30, r27
add r8, r29, r27
b .LBB_test_9 ; loopexit
.LBB_test_3: ; shortcirc_next.0
add r28, r3, r27
lhz r28, 5(r28)
add r26, r4, r27
lhz r26, 5(r26)
cmpw cr0, r28, r26
beq .LBB_test_5 ; shortcirc_next.1
.LBB_test_4: ; shortcirc_next.0.loopexit_crit_edge
add r2, r11, r27
add r8, r12, r27
b .LBB_test_9 ; loopexit
.LBB_test_5: ; shortcirc_next.1
add r28, r3, r27
lhz r28, 7(r28)
add r26, r4, r27
lhz r26, 7(r26)
cmpw cr0, r28, r26
beq .LBB_test_7 ; shortcirc_next.2
.LBB_test_6: ; shortcirc_next.1.loopexit_crit_edge
add r2, r9, r27
add r8, r10, r27
b .LBB_test_9 ; loopexit
.LBB_test_7: ; shortcirc_next.2
add r28, r3, r27
lhz r26, 9(r28)
add r28, r4, r27
lhz r25, 9(r28)
addi r28, r27, 8
cmpw cr7, r26, r25
mfcr r26, 1
rlwinm r26, r26, 31, 31, 31
add r25, r8, r27
cmpw cr7, r25, r7
mfcr r25, 1
rlwinm r25, r25, 29, 31, 31
and. r26, r26, r25
bne .LBB_test_1 ; loopentry
Next up, improve the code for the loop.
llvm-svn: 22769
2005-08-12 22:22:17 +00:00
Chris Lattner
e09bbc800c
Add a helper method
...
llvm-svn: 22768
2005-08-12 22:14:06 +00:00
Chris Lattner
4fec86d348
Fix a FIXME: if we are inserting code for a PHI argument, split the critical
...
edge so that the code is not always executed for both operands. This
prevents LSR from inserting code into loops whose exit blocks contain
PHI uses of IV expressions (which are outside of loops). On gzip, for
example, we turn this ugly code:
.LBB_test_1: ; loopentry
add r27, r3, r28
lhz r27, 3(r27)
add r26, r4, r28
lhz r26, 3(r26)
add r25, r30, r28 ;; Only live if exiting the loop
add r24, r29, r28 ;; Only live if exiting the loop
cmpw cr0, r27, r26
bne .LBB_test_5 ; loopexit
into this:
.LBB_test_1: ; loopentry
or r27, r28, r28
add r28, r3, r27
lhz r28, 3(r28)
add r26, r4, r27
lhz r26, 3(r26)
cmpw cr0, r28, r26
beq .LBB_test_3 ; shortcirc_next.0
.LBB_test_2: ; loopentry.loopexit_crit_edge
add r2, r30, r27
add r8, r29, r27
b .LBB_test_9 ; loopexit
.LBB_test_2: ; shortcirc_next.0
...
blt .LBB_test_1
into this:
.LBB_test_1: ; loopentry
or r27, r28, r28
add r28, r3, r27
lhz r28, 3(r28)
add r26, r4, r27
lhz r26, 3(r26)
cmpw cr0, r28, r26
beq .LBB_test_3 ; shortcirc_next.0
.LBB_test_2: ; loopentry.loopexit_crit_edge
add r2, r30, r27
add r8, r29, r27
b .LBB_t_3: ; shortcirc_next.0
.LBB_test_3: ; shortcirc_next.0
...
blt .LBB_test_1
Next step: get the block out of the loop so that the loop is all
fall-throughs again.
llvm-svn: 22766
2005-08-12 22:06:11 +00:00
Chris Lattner
b7ebe65c56
Change break critical edges to not remove, then insert, PHI node entries.
...
Instead, just update the BB in-place. This is both faster, and it prevents
split-critical-edges from shuffling the PHI argument list unneccesarily.
llvm-svn: 22765
2005-08-12 21:58:07 +00:00
Andrew Lenharth
8c6701be6e
match gcc's use of tabs, makes diffs easier
...
llvm-svn: 22764
2005-08-12 16:14:08 +00:00
Andrew Lenharth
ca94102d3e
.section cleanup, patch from Nicholas Riley
...
llvm-svn: 22763
2005-08-12 16:13:43 +00:00
Jim Laskey
a50f770a2c
1. Added the function isOpcWithIntImmediate to simplify testing of operand with
...
specified opcode and an integer constant right operand.
2. Modified ISD::SHL, ISD::SRL, ISD::SRA to use rlwinm when applied after a mask.
llvm-svn: 22761
2005-08-11 21:59:23 +00:00
Chris Lattner
d418d752f4
Tidied up the use of dyn_cast<ConstantSDNode> by using isIntImmediate more.
...
Patch by Jim Laskey.
llvm-svn: 22760
2005-08-11 17:56:50 +00:00
Chris Lattner
c5e1312baa
Use a more efficient method of creating integer and float virtual registers
...
(avoids an extra level of indirection in MakeReg).
defined MakeIntReg using RegMap->createVirtualRegister(PPC32::GPRCRegisterClass)
defined MakeFPReg using RegMap->createVirtualRegister(PPC32::FPRCRegisterClass)
s/MakeReg(MVT::i32)/MakeIntReg/
s/MakeReg(MVT::f64)/MakeFPReg/
Patch by Jim Laskey!
llvm-svn: 22759
2005-08-11 17:15:31 +00:00
Nate Begeman
5c7656fd53
Add a select_cc optimization for recognizing abs(int). This speeds up an
...
integer MPEG encoding loop by a factor of two.
llvm-svn: 22758
2005-08-11 02:18:13 +00:00
Nate Begeman
180b08897f
Some SELECT_CC cleanups:
...
1. move assertions for node creation to getNode()
2. legalize the values returned in ExpandOp immediately
3. Move select_cc optimizations from SELECT's getNode() to SELECT_CC's,
allowing them to be cleaned up significantly.
This paves the way to pick up additional optimizations on SELECT_CC, such
as sum-of-absolute-differences.
llvm-svn: 22757
2005-08-11 01:12:20 +00:00
Nate Begeman
5646b181e8
Make SELECT illegal on PPC32, switch to using SELECT_CC, which more closely
...
reflects what the hardware is capable of. This significantly simplifies
the CC handling logic throughout the ISel.
llvm-svn: 22756
2005-08-10 20:52:09 +00:00
Nate Begeman
e5b86d7442
Add new node, SELECT_CC. This node is for targets that don't natively
...
implement SELECT.
llvm-svn: 22755
2005-08-10 20:51:12 +00:00
Chris Lattner
3428b95634
Changes for PPC32ISelPattern.cpp
...
1. Clean up how SelectIntImmediateExpr handles use counts.
2. "Subtract from" was not clearing hi 16 bits.
Patch by Jim Laskey
llvm-svn: 22754
2005-08-10 18:11:33 +00:00
Chris Lattner
21c0fd9e8f
Fix an oversight that may be causing PR617.
...
llvm-svn: 22753
2005-08-10 17:37:53 +00:00
Chris Lattner
62df798919
remove some trickiness that broke yacr2 and some other programs last night
...
llvm-svn: 22751
2005-08-10 17:15:20 +00:00
Chris Lattner
aeedcc7fc2
Changed the XOR case to use the isOprNot predicate.
...
Patch by Jim Laskey!
llvm-svn: 22750
2005-08-10 16:35:46 +00:00
Chris Lattner
67d0753773
1. Refactored handling of integer immediate values for add, or, xor and sub.
...
New routine: ISel::SelectIntImmediateExpr
2. Now checking use counts of large constants. If use count is > 2 then drop
thru so that the constant gets loaded into a register.
Source:
int %test1(int %a) {
entry:
%tmp.1 = add int %a, 123456789 ; <int> [#uses=1]
%tmp.2 = or int %tmp.1, 123456789 ; <int> [#uses=1]
%tmp.3 = xor int %tmp.2, 123456789 ; <int> [#uses=1]
%tmp.4 = sub int %tmp.3, -123456789 ; <int> [#uses=1]
ret int %tmp.4
}
Did Emit:
.machine ppc970
.text
.align 2
.globl _test1
_test1:
.LBB_test1_0: ; entry
addi r2, r3, -13035
addis r2, r2, 1884
ori r2, r2, 52501
oris r2, r2, 1883
xori r2, r2, 52501
xoris r2, r2, 1883
addi r2, r2, 52501
addis r3, r2, 1883
blr
Now Emits:
.machine ppc970
.text
.align 2
.globl _test1
_test1:
.LBB_test1_0: ; entry
lis r2, 1883
ori r2, r2, 52501
add r3, r3, r2
or r3, r3, r2
xor r3, r3, r2
add r3, r3, r2
blr
Patch by Jim Laskey!
llvm-svn: 22749
2005-08-10 16:34:52 +00:00
Duraid Madina
1c2f9fdf71
sorry!! this is temporary; for some reason the nasty constmul code seems to
...
be an infinite loop when using g++-4.0.1*, this kills the ia64 nightly
tester. A proper fix shall be forthcoming!!! thanks for not killing me. :)
llvm-svn: 22748
2005-08-10 12:38:57 +00:00
Chris Lattner
5f56d71cd7
Fix a bug compiling: select (i32 < i32), f32, f32
...
llvm-svn: 22747
2005-08-10 03:40:09 +00:00
Chris Lattner
f83ce5faee
Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y]
...
into just Y. This often occurs when it seperates loops that have collapsed loop
headers. This implements LoopSimplify/phi-node-simplify.ll
llvm-svn: 22746
2005-08-10 02:07:32 +00:00
Chris Lattner
677d85784a
Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with
...
constant stride. This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll
llvm-svn: 22744
2005-08-10 01:12:06 +00:00
Chris Lattner
35c0e2ee33
Fix an obvious oops
...
llvm-svn: 22742
2005-08-10 00:59:40 +00:00
Chris Lattner
edff91a49a
Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride.
...
For code like this:
void foo(float *a, float *b, int n, int stride_a, int stride_b) {
int i;
for (i=0; i<n; i++)
a[i*stride_a] = b[i*stride_b];
}
we now emit:
.LBB_foo2_2: ; no_exit
lfs f0, 0(r4)
stfs f0, 0(r3)
addi r7, r7, 1
add r4, r2, r4
add r3, r6, r3
cmpw cr0, r7, r5
blt .LBB_foo2_2 ; no_exit
instead of:
.LBB_foo_2: ; no_exit
mullw r8, r2, r7 ;; multiply!
slwi r8, r8, 2
lfsx f0, r4, r8
mullw r8, r2, r6 ;; multiply!
slwi r8, r8, 2
stfsx f0, r3, r8
addi r2, r2, 1
cmpw cr0, r2, r5
blt .LBB_foo_2 ; no_exit
loops with variable strides occur pretty often. For example, in SPECFP2K
there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp,
56 in 168.wupwise, 36 in 172.mgrid.
Now we can allow indvars to turn functions written like this:
void foo2(float *a, float *b, int n, int stride_a, int stride_b) {
int i, ai = 0, bi = 0;
for (i=0; i<n; i++)
{
a[ai] = b[bi];
ai += stride_a;
bi += stride_b;
}
}
into code like the above for better analysis. With this patch, they generate
identical code.
llvm-svn: 22740
2005-08-10 00:45:21 +00:00
Chris Lattner
dde7dc525e
Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll
...
by being more careful about updating PHI nodes
llvm-svn: 22739
2005-08-10 00:35:32 +00:00
Chris Lattner
c6c4d99a21
Fix some 80 column violations.
...
Once we compute the evolution for a GEP, tell SE about it. This allows users
of the GEP to know it, if the users are not direct. This allows us to compile
this testcase:
void fbSolidFillmmx(int w, unsigned char *d) {
while (w >= 64) {
*(unsigned long long *) (d + 0) = 0;
*(unsigned long long *) (d + 8) = 0;
*(unsigned long long *) (d + 16) = 0;
*(unsigned long long *) (d + 24) = 0;
*(unsigned long long *) (d + 32) = 0;
*(unsigned long long *) (d + 40) = 0;
*(unsigned long long *) (d + 48) = 0;
*(unsigned long long *) (d + 56) = 0;
w -= 64;
d += 64;
}
}
into:
.LBB_fbSolidFillmmx_2: ; no_exit
li r2, 0
stw r2, 0(r4)
stw r2, 4(r4)
stw r2, 8(r4)
stw r2, 12(r4)
stw r2, 16(r4)
stw r2, 20(r4)
stw r2, 24(r4)
stw r2, 28(r4)
stw r2, 32(r4)
stw r2, 36(r4)
stw r2, 40(r4)
stw r2, 44(r4)
stw r2, 48(r4)
stw r2, 52(r4)
stw r2, 56(r4)
stw r2, 60(r4)
addi r4, r4, 64
addi r3, r3, -64
cmpwi cr0, r3, 63
bgt .LBB_fbSolidFillmmx_2 ; no_exit
instead of:
.LBB_fbSolidFillmmx_2: ; no_exit
li r11, 0
stw r11, 0(r4)
stw r11, 4(r4)
stwx r11, r10, r4
add r12, r10, r4
stw r11, 4(r12)
stwx r11, r9, r4
add r12, r9, r4
stw r11, 4(r12)
stwx r11, r8, r4
add r12, r8, r4
stw r11, 4(r12)
stwx r11, r7, r4
add r12, r7, r4
stw r11, 4(r12)
stwx r11, r6, r4
add r12, r6, r4
stw r11, 4(r12)
stwx r11, r5, r4
add r12, r5, r4
stw r11, 4(r12)
stwx r11, r2, r4
add r12, r2, r4
stw r11, 4(r12)
addi r4, r4, 64
addi r3, r3, -64
cmpwi cr0, r3, 63
bgt .LBB_fbSolidFillmmx_2 ; no_exit
llvm-svn: 22737
2005-08-09 23:39:36 +00:00
Chris Lattner
b310ac4a86
implement two helper methods
...
llvm-svn: 22736
2005-08-09 23:36:33 +00:00
Chris Lattner
679f5b0b40
Fix spelling, fix some broken canonicalizations by my last patch
...
llvm-svn: 22734
2005-08-09 23:09:05 +00:00
Chris Lattner
54ee86aca7
add a optimization note
...
llvm-svn: 22732
2005-08-09 22:30:57 +00:00
Chris Lattner
14e060f743
add cc nodes to the AllNodes list so they show up in Graphviz output
...
llvm-svn: 22731
2005-08-09 20:40:02 +00:00
Chris Lattner
6ec7745e80
Update the targets to the new SETCC/CondCodeSDNode interfaces.
...
llvm-svn: 22729
2005-08-09 20:21:10 +00:00
Chris Lattner
d47675ed24
Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls the
...
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf. This will make it possible for other node to use
CC's as operands in the future...
llvm-svn: 22728
2005-08-09 20:20:18 +00:00
Chris Lattner
2035c4f7f8
Minor cleanup patch, no functionality changes. Written by Jim Laskey.
...
llvm-svn: 22727
2005-08-09 18:29:55 +00:00
Chris Lattner
4c62c647c2
Fix CodeGen/Generic/div-neg-power-2.ll, a regression from last night.
...
llvm-svn: 22726
2005-08-09 18:08:41 +00:00
Chris Lattner
02742710f3
SCEVAddExpr::get() of an empty list is invalid.
...
llvm-svn: 22724
2005-08-09 01:13:47 +00:00
Chris Lattner
a091ff1764
Implement: LoopStrengthReduce/share_ivs.ll
...
Two changes:
* Only insert one PHI node for each stride. Other values are live in
values. This cannot introduce higher register pressure than the
previous approach, and can take advantage of reg+reg addressing modes.
* Factor common base values out of uses before moving values from the
base to the immediate fields. This improves codegen by starting the
stride-specific PHI node out at a common place for each IV use.
As an example, we used to generate this for a loop in swim:
.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i
lfd f0, 0(r8)
stfd f0, 0(r3)
lfd f0, 0(r6)
stfd f0, 0(r7)
lfd f0, 0(r2)
stfd f0, 0(r5)
addi r9, r9, 1
addi r2, r2, 8
addi r5, r5, 8
addi r6, r6, 8
addi r7, r7, 8
addi r8, r8, 8
addi r3, r3, 8
cmpw cr0, r9, r4
bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1
now we emit:
.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i
lfdx f0, r8, r2
stfdx f0, r9, r2
lfdx f0, r5, r2
stfdx f0, r7, r2
lfdx f0, r3, r2
stfdx f0, r6, r2
addi r10, r10, 1
addi r2, r2, 8
cmpw cr0, r10, r4
bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1
As another more dramatic example, we used to emit this:
.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19
lfd f0, 8(r21)
lfd f4, 8(r3)
lfd f5, 8(r27)
lfd f6, 8(r22)
lfd f7, 8(r5)
lfd f8, 8(r6)
lfd f9, 8(r30)
lfd f10, 8(r11)
lfd f11, 8(r12)
fsub f10, f10, f11
fadd f5, f4, f5
fmul f5, f5, f1
fadd f6, f6, f7
fadd f6, f6, f8
fadd f6, f6, f9
fmadd f0, f5, f6, f0
fnmsub f0, f10, f2, f0
stfd f0, 8(r4)
lfd f0, 8(r25)
lfd f5, 8(r26)
lfd f6, 8(r23)
lfd f9, 8(r28)
lfd f10, 8(r10)
lfd f12, 8(r9)
lfd f13, 8(r29)
fsub f11, f13, f11
fadd f4, f4, f5
fmul f4, f4, f1
fadd f5, f6, f9
fadd f5, f5, f10
fadd f5, f5, f12
fnmsub f0, f4, f5, f0
fnmsub f0, f11, f3, f0
stfd f0, 8(r24)
lfd f0, 8(r8)
fsub f4, f7, f8
fsub f5, f12, f10
fnmsub f0, f5, f2, f0
fnmsub f0, f4, f3, f0
stfd f0, 8(r2)
addi r20, r20, 1
addi r2, r2, 8
addi r8, r8, 8
addi r10, r10, 8
addi r12, r12, 8
addi r6, r6, 8
addi r29, r29, 8
addi r28, r28, 8
addi r26, r26, 8
addi r25, r25, 8
addi r24, r24, 8
addi r5, r5, 8
addi r23, r23, 8
addi r22, r22, 8
addi r3, r3, 8
addi r9, r9, 8
addi r11, r11, 8
addi r30, r30, 8
addi r27, r27, 8
addi r21, r21, 8
addi r4, r4, 8
cmpw cr0, r20, r7
bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1
we now emit:
.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19
lfdx f0, r21, r20
lfdx f4, r3, r20
lfdx f5, r27, r20
lfdx f6, r22, r20
lfdx f7, r5, r20
lfdx f8, r6, r20
lfdx f9, r30, r20
lfdx f10, r11, r20
lfdx f11, r12, r20
fsub f10, f10, f11
fadd f5, f4, f5
fmul f5, f5, f1
fadd f6, f6, f7
fadd f6, f6, f8
fadd f6, f6, f9
fmadd f0, f5, f6, f0
fnmsub f0, f10, f2, f0
stfdx f0, r4, r20
lfdx f0, r25, r20
lfdx f5, r26, r20
lfdx f6, r23, r20
lfdx f9, r28, r20
lfdx f10, r10, r20
lfdx f12, r9, r20
lfdx f13, r29, r20
fsub f11, f13, f11
fadd f4, f4, f5
fmul f4, f4, f1
fadd f5, f6, f9
fadd f5, f5, f10
fadd f5, f5, f12
fnmsub f0, f4, f5, f0
fnmsub f0, f11, f3, f0
stfdx f0, r24, r20
lfdx f0, r8, r20
fsub f4, f7, f8
fsub f5, f12, f10
fnmsub f0, f5, f2, f0
fnmsub f0, f4, f3, f0
stfdx f0, r2, r20
addi r19, r19, 1
addi r20, r20, 8
cmpw cr0, r19, r7
bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1
llvm-svn: 22722
2005-08-09 00:18:09 +00:00
Chris Lattner
37c24cc98c
Suck the base value out of the UsersToProcess vector into the BasedUser
...
class to simplify the code. Fuse two loops.
llvm-svn: 22721
2005-08-08 22:56:21 +00:00
Chris Lattner
37ed895bf1
Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The
...
first is a correctness thing, and the later is an optzn thing. This also
is needed to support a future change.
llvm-svn: 22720
2005-08-08 22:32:34 +00:00
Nate Begeman
c92787e1f5
Factor out some common code, and be smarter about when to emit load hi/lo
...
code sequences.
llvm-svn: 22719
2005-08-08 22:22:56 +00:00
Chris Lattner
d09a9a788b
Allow tools with "consume after" options (like lli) to take more positional
...
opts than they take directly. Thanks to John C for pointing this problem
out to me!
llvm-svn: 22717
2005-08-08 21:57:27 +00:00
Chris Lattner
64068eb7da
Remove getImmediateForOpcode, which is now dead.
...
Patch by Jim Laskey.
llvm-svn: 22716
2005-08-08 21:34:13 +00:00
Chris Lattner
25388199a2
Add new immediate handling support for mul/div.
...
Patch by Jim Laskey!
llvm-svn: 22715
2005-08-08 21:33:23 +00:00
Chris Lattner
8e9dc31928
Add support for OR/XOR/SUB immediates that are handled with the new immediate
...
way. This allows ORI/ORIS pairs, for example.
llvm-svn: 22714
2005-08-08 21:30:29 +00:00
Chris Lattner
fd0fe76ba6
Modify the ISD::AND opcode case to use new immediate constant predicates.
...
Includes wider support for rotate and mask cases.
Patch by Jim Laskey.
I've requested that Jim add new regression tests the newly handled cases.
llvm-svn: 22712
2005-08-08 21:24:57 +00:00
Chris Lattner
81e0e3e933
Modify the ISD::ADD opcode case to use new immediate constant predicates.
...
Includes support for 32-bit constants using addi/addis.
Patch by Jim Laskey.
llvm-svn: 22711
2005-08-08 21:21:03 +00:00
Chris Lattner
4c54dae243
Modify existing support functions to use new immediate constant predicates.
...
Patch by Jim Laskey
llvm-svn: 22710
2005-08-08 21:12:35 +00:00
Chris Lattner
3cc070cc36
Add support predicates for future immediate constant changes.
...
Patch by Jim Laskey
llvm-svn: 22709
2005-08-08 21:10:27 +00:00
Chris Lattner
f2267ed5c5
Move IsRunOfOnes to a more logical place and rename to a proper predicate form
...
(lowercase isXXX).
Patch by Jim Laskey.
llvm-svn: 22708
2005-08-08 21:08:09 +00:00
Nate Begeman
9a838678b0
Fix JIT encoding of ppc mfocrf instruction; the operands were reversed
...
llvm-svn: 22707
2005-08-08 20:04:52 +00:00
Chris Lattner
9f269e40c9
Use the new 'moveBefore' method to simplify some code. Really, which is
...
easier to understand? :)
llvm-svn: 22706
2005-08-08 19:11:57 +00:00
Chris Lattner
d380d8412d
Reject command lines that have too many positional arguments passed (e.g.,
...
'opt x y'). This fixes PR493.
Patch contributed by Owen Anderson!
llvm-svn: 22705
2005-08-08 17:25:38 +00:00
Chris Lattner
14203e85b2
Not all constants are legal immediates in load/store instructions.
...
llvm-svn: 22704
2005-08-08 06:25:50 +00:00
Chris Lattner
c70bbc0c41
Implement LoopStrengthReduce/share_code_in_preheader.ll by having one
...
rewriter for all code inserted into the preheader, which is never flushed.
llvm-svn: 22702
2005-08-08 05:47:49 +00:00
Chris Lattner
9bfa6f8784
Implement a simple optimization for the termination condition of the loop.
...
The termination condition actually wants to use the post-incremented value
of the loop, not a new indvar with an unusual base.
On PPC, for example, this allows us to compile
LoopStrengthReduce/exit_compare_live_range.ll to:
_foo:
li r2, 0
.LBB_foo_1: ; no_exit
li r5, 0
stw r5, 0(r3)
addi r2, r2, 1
cmpw cr0, r2, r4
bne .LBB_foo_1 ; no_exit
blr
instead of:
_foo:
li r2, 1 ;; IV starts at 1, not 0
.LBB_foo_1: ; no_exit
li r5, 0
stw r5, 0(r3)
addi r5, r2, 1
cmpw cr0, r2, r4
or r2, r5, r5 ;; Reg-reg copy, extra live range
bne .LBB_foo_1 ; no_exit
blr
This implements LoopStrengthReduce/exit_compare_live_range.ll
llvm-svn: 22699
2005-08-08 05:28:22 +00:00
Chris Lattner
24a0a43cb0
add new helper function
...
llvm-svn: 22698
2005-08-08 05:21:50 +00:00
Chris Lattner
88e2d2ee6b
Handle 64-bit constant exprs on 64-bit targets.
...
llvm-svn: 22696
2005-08-08 04:26:32 +00:00
Chris Lattner
579b20b747
All stats are "Number of ..."
...
llvm-svn: 22694
2005-08-07 20:02:04 +00:00
Chris Lattner
2c14cf7b74
Add some simple folds that occur in bitfield cases. Fix a minor bug in
...
isHighOnes, where it would consider 0 to have high ones.
llvm-svn: 22693
2005-08-07 07:03:10 +00:00
Chris Lattner
134ebd0801
Fix typoCVS: ----------------------------------------------------------------------
...
llvm-svn: 22692
2005-08-07 07:00:52 +00:00
Chris Lattner
0c26a0b902
add a small simplification that can be exposed after promotion/expansion
...
llvm-svn: 22691
2005-08-07 05:00:44 +00:00
Chris Lattner
f4dd8c445c
* Use the new PHINode::hasConstantValue method to simplify some code
...
* Teach this code to move allocas out of the loop when tail call eliminating
a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll
* Do not perform this transformation if a call is marked 'tail' and if there
are allocas that we cannot move out of the loop in #2 . Doing so would increase
the stack usage of the function. This implements fixes
PR615 and TailCallElim/dont-tce-tail-marked-call.ll.
llvm-svn: 22690
2005-08-07 04:27:41 +00:00
Chris Lattner
983a415b6a
Consolidate the GPOpt stuff to all use the Subtarget, instead of still
...
depending on the command line option. Now the command line option just
sets the subtarget as appropriate. G5 opts will now default to on on
G5-enabled nightly testers among other machines.
llvm-svn: 22688
2005-08-05 22:05:03 +00:00
Chris Lattner
158acab986
adjust to change in getSubtarget() api
...
llvm-svn: 22687
2005-08-05 21:54:27 +00:00
Chris Lattner
431b8d80bd
Enable gp optimizations by default when available, even when a target triple
...
is available, since the target triple doesn't specify whether to use gpopts
or not.
llvm-svn: 22685
2005-08-05 21:25:13 +00:00
Chris Lattner
11fc319b5d
add a note
...
llvm-svn: 22681
2005-08-05 19:18:32 +00:00
Chris Lattner
96ad31321a
Change FindEarliestCallSeqEnd (used by libcall insertion) to use a set to
...
avoid revisiting nodes more than once. This eliminates a source of
potentially exponential behavior. For a small function in 191.fma3d
(hexah_stress_divergence_), this speeds up isel from taking > 20mins to
taking 0.07s.
llvm-svn: 22680
2005-08-05 18:10:27 +00:00
Chris Lattner
1095dc94a9
Fix a use-of-dangling-pointer bug, from the introduction of SrcValue's.
...
llvm-svn: 22679
2005-08-05 16:55:31 +00:00
Chris Lattner
cabdc34563
Fix a latent bug in the libcall inserter that was exposed by Nate's patch
...
yesterday. This fixes whetstone and a bunch of programs in the External tests.
llvm-svn: 22678
2005-08-05 16:23:57 +00:00
Chris Lattner
8c636bf8b2
don't crash when running the PPC backend on non-ppc hosts without specifying
...
a subtarget.
llvm-svn: 22677
2005-08-05 16:17:22 +00:00
Chris Lattner
6e709c1318
PHINode::hasConstantValue should never return the PHI itself, even if the
...
PHI is its only operand.
llvm-svn: 22676
2005-08-05 15:37:31 +00:00
Chris Lattner
1749aaa5e6
Fix an iterator invalidation problem when we decide a phi has a constant value
...
llvm-svn: 22675
2005-08-05 15:34:10 +00:00
Chris Lattner
11e7a5eda7
Make sure to clean CastedPointers after casts are potentially deleted.
...
This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas
llvm-svn: 22673
2005-08-05 01:30:11 +00:00
Chris Lattner
9f9c260b8c
now that hasConstantValue defaults to only returning values that dominate
...
the PHI node, this ugly code can vanish.
llvm-svn: 22672
2005-08-05 01:04:30 +00:00
Chris Lattner
37774affb1
Invoke instructions do not dominate all successors
...
llvm-svn: 22671
2005-08-05 01:03:27 +00:00
Chris Lattner
6f58350daf
Now that hasConstantValue is more careful w.r.t. returning values that only
...
dominate the PHI node, this code can go away. This also makes passes more
aggressive, e.g. implementing Transforms/CondProp/phisimplify2.ll
llvm-svn: 22670
2005-08-05 01:02:04 +00:00
Chris Lattner
bcd8d2c6e5
Use the bool argument to hasConstantValue to decide whether the client is
...
prepared to deal with return values that do not dominate the PHI. If we
cannot prove that the result dominates the PHI node, do not return it if
the client can't cope.
llvm-svn: 22669
2005-08-05 01:00:58 +00:00
Chris Lattner
257efb2ad3
This code can handle non-dominating instructions
...
llvm-svn: 22667
2005-08-05 00:57:45 +00:00
Chris Lattner
1d8b24878f
Mark hasConstantValue as a const method
...
llvm-svn: 22666
2005-08-05 00:49:06 +00:00
Nate Begeman
0a94dec78a
Add an extra parameter that Chris requested
...
llvm-svn: 22665
2005-08-04 23:50:43 +00:00
Nate Begeman
b392321cae
Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization into
...
BasicBlock's removePredecessor routine. This requires shuffling around
the definition and implementation of hasContantValue from Utils.h,cpp into
Instructions.h,cpp
llvm-svn: 22664
2005-08-04 23:24:19 +00:00
Chris Lattner
45f8b6e7aa
Modify how immediates are removed from base expressions to deal with the fact
...
that the symbolic evaluator is not always able to use subtraction to remove
expressions. This makes the code faster, and fixes the last crash on 178.galgel.
Finally, add a statistic to see how many phi nodes are inserted.
On 178.galgel, we get the follow stats:
2562 loop-reduce - Number of PHIs inserted
3927 loop-reduce - Number of GEPs strength reduced
llvm-svn: 22662
2005-08-04 22:34:05 +00:00
Nate Begeman
77558da546
Fix a fixme in LegalizeDAG
...
llvm-svn: 22661
2005-08-04 21:43:28 +00:00
Nate Begeman
e3cbe1027d
Hack to naturally align doubles in the constant pool. Remove this once we
...
know what The Right Thing To Do is.
llvm-svn: 22660
2005-08-04 21:04:09 +00:00
Nate Begeman
295ea90634
Use the new subtarget support to automatically choose the correct ABI
...
and asm printer for PowerPC if one is not specified.
llvm-svn: 22659
2005-08-04 20:49:48 +00:00
Chris Lattner
a6d7c355bc
* Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase
...
method.
* Fix a crash on 178.galgel, where we would insert expressions before PHI
nodes instead of into the PHI node predecessor blocks.
llvm-svn: 22657
2005-08-04 20:03:32 +00:00
Chris Lattner
0f7c0fa2a7
Fix a case that caused this to crash on 178.galgel
...
llvm-svn: 22653
2005-08-04 19:26:19 +00:00
Chris Lattner
acc42c4df1
Teach LSR about loop-variant expressions, such as loops like this:
...
for (i = 0; i < N; ++i)
A[i][foo()] = 0;
here we still want to strength reduce the A[i] part, even though foo() is
l-v.
This also simplifies some of the 'CanReduce' logic.
This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll
llvm-svn: 22652
2005-08-04 19:08:16 +00:00
Nate Begeman
456044b724
Remove some more dead code.
...
llvm-svn: 22650
2005-08-04 18:13:56 +00:00
Chris Lattner
eaf24725b2
Refactor this code substantially with the following improvements:
...
1. We only analyze instructions once, guaranteed
2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with
something much simpler.
The next step is to handle expressions that are not all indvar+loop-invariant
values (e.g. handling indvar+loopvariant).
llvm-svn: 22649
2005-08-04 17:40:30 +00:00
Andrew Lenharth
5adb830b30
No, IDEFs shouldn't be JITed
...
llvm-svn: 22648
2005-08-04 15:32:36 +00:00
Misha Brukman
a54e201edf
* Unbreak release build
...
* Add comments to #endif pragmas for readability
llvm-svn: 22647
2005-08-04 14:22:41 +00:00
Misha Brukman
41acd5e08d
* Unbreak optimized build (noticed by Eric van Riet Paap)
...
* Comment #endif clauses for readability
llvm-svn: 22646
2005-08-04 14:16:48 +00:00
Nate Begeman
3bcfcd9474
Add Subtarget support to PowerPC. Next up, using it.
...
llvm-svn: 22644
2005-08-04 07:12:09 +00:00
Chris Lattner
6f286b760f
refactor some code
...
llvm-svn: 22643
2005-08-04 01:19:13 +00:00
Chris Lattner
6510749050
invert to if's to make the logic simpler
...
llvm-svn: 22641
2005-08-04 00:40:47 +00:00
Chris Lattner
a0102fbc4f
When processing outer loops and we find uses of an IV in inner loops, make
...
sure to handle the use, just don't recurse into it.
This permits us to generate this code for a simple nested loop case:
.LBB_foo_0: ; entry
stwu r1, -48(r1)
stw r29, 44(r1)
stw r30, 40(r1)
mflr r11
stw r11, 56(r1)
lis r2, ha16(L_A$non_lazy_ptr)
lwz r30, lo16(L_A$non_lazy_ptr)(r2)
li r29, 1
.LBB_foo_1: ; no_exit.0
bl L_bar$stub
li r2, 1
or r3, r30, r30
.LBB_foo_2: ; no_exit.1
lfd f0, 8(r3)
stfd f0, 0(r3)
addi r4, r2, 1
addi r3, r3, 8
cmpwi cr0, r2, 100
or r2, r4, r4
bne .LBB_foo_2 ; no_exit.1
.LBB_foo_3: ; loopexit.1
addi r30, r30, 800
addi r2, r29, 1
cmpwi cr0, r29, 100
or r29, r2, r2
bne .LBB_foo_1 ; no_exit.0
.LBB_foo_4: ; return
lwz r11, 56(r1)
mtlr r11
lwz r30, 40(r1)
lwz r29, 44(r1)
lwz r1, 0(r1)
blr
instead of this:
_foo:
.LBB_foo_0: ; entry
stwu r1, -48(r1)
stw r28, 44(r1) ;; uses an extra register.
stw r29, 40(r1)
stw r30, 36(r1)
mflr r11
stw r11, 56(r1)
li r30, 1
li r29, 0
or r28, r29, r29
.LBB_foo_1: ; no_exit.0
bl L_bar$stub
mulli r2, r28, 800 ;; unstrength-reduced multiply
lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation
lwz r3, lo16(L_A$non_lazy_ptr)(r3)
add r2, r2, r3
mulli r4, r29, 800 ;; unstrength-reduced multiply
addi r3, r3, 8
add r3, r4, r3
li r4, 1
.LBB_foo_2: ; no_exit.1
lfd f0, 0(r3)
stfd f0, 0(r2)
addi r5, r4, 1
addi r2, r2, 8 ;; multiple stride 8 IV's
addi r3, r3, 8
cmpwi cr0, r4, 100
or r4, r5, r5
bne .LBB_foo_2 ; no_exit.1
.LBB_foo_3: ; loopexit.1
addi r28, r28, 1 ;;; Many IV's with stride 1
addi r29, r29, 1
addi r2, r30, 1
cmpwi cr0, r30, 100
or r30, r2, r2
bne .LBB_foo_1 ; no_exit.0
.LBB_foo_4: ; return
lwz r11, 56(r1)
mtlr r11
lwz r30, 36(r1)
lwz r29, 40(r1)
lwz r28, 44(r1)
lwz r1, 0(r1)
blr
llvm-svn: 22640
2005-08-04 00:14:11 +00:00
Chris Lattner
fc62470466
Teach loop-reduce to see into nested loops, to pull out immediate values
...
pushed down by SCEV.
In a nested loop case, this allows us to emit this:
lis r3, ha16(L_A$non_lazy_ptr)
lwz r3, lo16(L_A$non_lazy_ptr)(r3)
add r2, r2, r3
li r3, 1
.LBB_foo_2: ; no_exit.1
lfd f0, 8(r2) ;; Uses offset of 8 instead of 0
stfd f0, 0(r2)
addi r4, r3, 1
addi r2, r2, 8
cmpwi cr0, r3, 100
or r3, r4, r4
bne .LBB_foo_2 ; no_exit.1
instead of this:
lis r3, ha16(L_A$non_lazy_ptr)
lwz r3, lo16(L_A$non_lazy_ptr)(r3)
add r2, r2, r3
addi r3, r3, 8
li r4, 1
.LBB_foo_2: ; no_exit.1
lfd f0, 0(r3)
stfd f0, 0(r2)
addi r5, r4, 1
addi r2, r2, 8
addi r3, r3, 8
cmpwi cr0, r4, 100
or r4, r5, r5
bne .LBB_foo_2 ; no_exit.1
llvm-svn: 22639
2005-08-03 23:44:42 +00:00
Chris Lattner
bb78c97e24
improve debug output
...
llvm-svn: 22638
2005-08-03 23:30:08 +00:00
Nate Begeman
8d394eb703
Scalar SSE: load +0.0 -> xorps/xorpd
...
Scalar SSE: a < b ? c : 0.0 -> cmpss, andps
Scalar SSE: float -> i16 needs to be promoted
llvm-svn: 22637
2005-08-03 23:26:28 +00:00
Chris Lattner
db23c74e5e
Move from Stage 0 to Stage 1.
...
Only emit one PHI node for IV uses with identical bases and strides (after
moving foldable immediates to the load/store instruction).
This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing
us to generate this PPC code for test1:
or r30, r3, r3
.LBB_test1_1: ; Loop
li r2, 0
stw r2, 0(r30)
stw r2, 4(r30)
bl L_pred$stub
addi r30, r30, 8
cmplwi cr0, r3, 0
bne .LBB_test1_1 ; Loop
instead of this code:
or r30, r3, r3
or r29, r3, r3
.LBB_test1_1: ; Loop
li r2, 0
stw r2, 0(r29)
stw r2, 4(r30)
bl L_pred$stub
addi r30, r30, 8 ;; Two iv's with step of 8
addi r29, r29, 8
cmplwi cr0, r3, 0
bne .LBB_test1_1 ; Loop
llvm-svn: 22635
2005-08-03 22:51:21 +00:00
Andrew Lenharth
3a18a39587
Alpha ABI specifies stack is always 16 byte alligned, and gcc does it, so I will too
...
llvm-svn: 22634
2005-08-03 22:33:21 +00:00
Chris Lattner
430d0022df
Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to
...
unify some parallel vectors and get field names more descriptive than
"first" and "second". This isn't lisp afterall :)
llvm-svn: 22633
2005-08-03 22:21:05 +00:00
Chris Lattner
84e9baa925
Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a
...
map from instruction* to SCEVHandles. When we delete instructions, we have
to tell it about it. We would run into nasty cases where new instructions
were reallocated at old instruction addresses and get the old map values.
Bad bad bad :(
llvm-svn: 22632
2005-08-03 21:36:09 +00:00
Chris Lattner
8191442548
Fix PR611, codegen'ing SREM of FP operands to fmod or fmodf instead of
...
the sequence used for integer ops
llvm-svn: 22629
2005-08-03 20:31:37 +00:00
Chris Lattner
3de05cc930
The correct fix for PR612, which also fixes
...
Transforms/LowerInvoke/2005-08-03-InvokeWithPHIUse.ll
llvm-svn: 22628
2005-08-03 18:51:44 +00:00
Chris Lattner
f8a81a9886
When inserting code, make sure not to insert it before PHI nodes. This
...
fixes PR612 and Transforms/LowerInvoke/2005-08-03-InvokeWithPHI.ll
llvm-svn: 22626
2005-08-03 18:34:29 +00:00
Chris Lattner
d683bdd0f8
Fix Transforms/SimplifyCFG/2005-08-03-PHIFactorCrash.ll, a problem that
...
occurred while bugpointing another testcase
llvm-svn: 22621
2005-08-03 17:59:45 +00:00