Commit Graph

105 Commits

Author SHA1 Message Date
Jeff Cohen 5f4ef3c5a8 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Nate Begeman b9ce12b4bb Fix an optimization put in for accessing static globals. This obviates
the need to build PIC.

llvm-svn: 22512
2005-07-25 21:15:28 +00:00
Chris Lattner 12f5d12b21 PowerPC no-pic code is not quite ready for prime-time
llvm-svn: 22507
2005-07-22 22:58:34 +00:00
Nate Begeman a9443f29b0 Support building non-PIC
Remove the LoadHiAddr pseudo-instruction.
Optimization of stores to and loads from statics.
Force JIT to use new non-PIC codepaths.

llvm-svn: 22494
2005-07-21 20:44:43 +00:00
Nate Begeman 8465fe8b4b Generate mfocrf when targeting g5. Generate fsqrt/fsqrts when targetin g5.
8-byte align doubles.

llvm-svn: 22486
2005-07-20 22:42:00 +00:00
Nate Begeman 0851f1aaa1 Integrate SelectFPExpr into SelectExpr. This gets PPC32 closer to being
automatically generated from a target description.

llvm-svn: 22470
2005-07-19 16:51:05 +00:00
Chris Lattner 53676dfd33 Change *EXTLOAD to use an VTSDNode operand instead of being an MVTSDNode.
This is the last MVTSDNode.

This allows us to eliminate a bunch of special case code for handling
MVTSDNodes.

Also, remove some uses of dyn_cast that should really be cast (which is
cheaper in a release build).

llvm-svn: 22368
2005-07-10 01:56:13 +00:00
Chris Lattner 36db1ed06f Change TRUNCSTORE to use a VTSDNode operand instead of being an MVTSTDNode
llvm-svn: 22366
2005-07-10 00:29:18 +00:00
Chris Lattner a7220851c0 Make several cleanups to Andrews varargs change:
1. Pass Value*'s into lowering methods so that the proper pointers can be
   added to load/stores from the valist
2. Intrinsics that return void should only return a token chain, not a token
   chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.
4. Now that we have Value*'s available in the lowering methods, pass them
   into any load/stores from the valist that are emitted

llvm-svn: 22339
2005-07-05 19:58:54 +00:00
Chris Lattner 1239d2d7ff Fix PowerPC varargs
llvm-svn: 22335
2005-07-05 17:48:31 +00:00
Chris Lattner d313b92b66 Varargs is apparently currently broken on PPC. This hacks it so that it
is at least overloading the right virtual methods.  The implementations
are currently wrong though.  This fixes Ptrdist/bc, but not other programs
(e.g. siod).

llvm-svn: 22326
2005-07-01 23:11:56 +00:00
Nate Begeman 1bf8927518 Commit fix for generating conditional branch pseudo instructions that
avoids dereferencing the end() iterator when selecting the fallthrough
block.  This requires an ilist change.

llvm-svn: 22212
2005-06-15 18:22:43 +00:00
Nate Begeman 54c022156d Commit a small improvement that is already in the x86 and ia64 backends to
not generate unnecessary register copies.  This improves compile time by
2-5% depending on the test.

llvm-svn: 22210
2005-06-14 03:55:23 +00:00
Nate Begeman 60cf00c982 Handle some more real world cases of rlwimi. These don't come up that
regularly in "normal" code, but for things like software graphics, they
make a big difference.

For the following code:
unsigned short Trans16Bit(unsigned srcA,unsigned srcB,unsigned alpha)
{
	unsigned tmpA,tmpB,mixed;
	tmpA = ((srcA & 0x03E0) << 15) | (srcA & 0x7C1F);
	tmpB = ((srcB & 0x03E0) << 15) | (srcB & 0x7C1F);
	mixed = (tmpA * alpha) + (tmpB * (32 - alpha));
	return ((mixed >> 5) & 0x7C1F) | ((mixed >> 20) & 0x03E0);
}

We now generate:
_Trans16Bit:
.LBB_Trans16Bit_0:      ; entry
        andi. r2, r4, 31775
        rlwimi r2, r4, 15, 7, 11
        subfic r4, r5, 32
        mullw r2, r2, r4
        andi. r4, r3, 31775
        rlwimi r4, r3, 15, 7, 11
        mullw r3, r4, r5
        add r2, r2, r3
        srwi r3, r2, 5
        andi. r3, r3, 31775
        rlwimi r3, r2, 12, 22, 26
        blr

Instead of:
_Trans16Bit:
.LBB_Trans16Bit_0:      ; entry
        slwi r2, r4, 15
        rlwinm r2, r2, 0, 7, 11
        andi. r4, r4, 31775
        or r2, r2, r4
        subfic r4, r5, 32
        mullw r2, r2, r4
        slwi r4, r3, 15
        rlwinm r4, r4, 0, 7, 11
        andi. r3, r3, 31775
        or r3, r4, r3
        mullw r3, r3, r5
        add r2, r2, r3
        srwi r3, r2, 5
        andi. r3, r3, 31775
        srwi r2, r2, 20
        rlwimi r3, r2, 0, 22, 26
        blr

llvm-svn: 22201
2005-06-08 04:14:27 +00:00
Chris Lattner 0ae9b08916 Fix andrews changes to fit in 80 columns
llvm-svn: 22064
2005-05-15 19:54:37 +00:00
Chris Lattner 8abab9b0c7 treat TAILCALL nodes identically to CALL nodes
llvm-svn: 21977
2005-05-13 20:29:26 +00:00
Chris Lattner 2e77db6af6 Add an isTailCall flag to LowerCallTo
llvm-svn: 21958
2005-05-13 18:50:42 +00:00
Chris Lattner 6756f2f795 Realize that we don't support fmod directly, fixing CodeGen/Generic/print-arith-fp.ll
llvm-svn: 21939
2005-05-13 16:20:22 +00:00
Chris Lattner 2dce703710 rename the ADJCALLSTACKDOWN/ADJCALLSTACKUP nodes to be CALLSEQ_START/BEGIN.
llvm-svn: 21915
2005-05-12 23:24:06 +00:00
Chris Lattner 36674a123e Pass in Calling Convention to use into LowerCallTo
llvm-svn: 21899
2005-05-12 19:56:45 +00:00
Chris Lattner f80969f29b These targets don't like setcc
llvm-svn: 21884
2005-05-12 02:06:00 +00:00
Nate Begeman 99fa5bc1fa Necessary changes to codegen cttz efficiently on PowerPC
1. Teach LegalizeDAG how to better legalize CTTZ if the target doesn't have
   CTPOP, but does have CTLZ
2. Teach PPC32 how to do sub x, const -> add x, -const for valid consts
3. Teach PPC32 how to do and (xor a, -1) b -> andc b, a
4. Teach PPC32 that ISD::CTLZ -> PPC::CNTLZW

llvm-svn: 21880
2005-05-11 23:43:56 +00:00
Chris Lattner 129c5fea44 fold and (shl X, C1), C2 -> rlwinm when possible. Many other cases are possible,
include and (srl)    and the inverses (shl and) etc.

llvm-svn: 21820
2005-05-09 17:39:48 +00:00
Andrew Lenharth b8e94c3499 fix typo
llvm-svn: 21693
2005-05-04 19:25:37 +00:00
Andrew Lenharth 5e177826fd Implement count leading zeros (ctlz), count trailing zeros (cttz), and count
population (ctpop).  Generic lowering is implemented, however only promotion
is implemented for SelectionDAG at the moment.

More coming soon.

llvm-svn: 21676
2005-05-03 17:19:30 +00:00
Chris Lattner 9c6bbafc15 This target doesn't support the FSIN/FCOS/FSQRT nodes yet
llvm-svn: 21633
2005-04-30 04:26:06 +00:00
Andrew Lenharth 4a73c2cfdc Implement Value* tracking for loads and stores in the selection DAG. This enables one to use alias analysis in the backends.
(TRUNK)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode which contains the Value*.  Note that if the operation is introduced by the backend, it will still have the operand, but the value* will be null.

llvm-svn: 21599
2005-04-27 20:10:01 +00:00
Misha Brukman e73e76dc42 Convert tabs to spaces
llvm-svn: 21452
2005-04-22 17:54:37 +00:00
Misha Brukman b440243e94 Remove trailing whitespace
llvm-svn: 21425
2005-04-21 23:30:14 +00:00
Chris Lattner 3590ef1164 Match another form of eqv
llvm-svn: 21413
2005-04-21 21:09:11 +00:00
Nate Begeman 2331c061ee Next round of PPC CR optimizations. For the following code:
int %bar(float %a, float %b, float %c, float %d) {
entry:
    %tmp.1 = setlt float %a, %d
    %tmp.2 = setlt float %b, %d
    %or = or bool %tmp.1, %tmp.2
    %tmp.3 = setgt float %c, %d
    %tmp.4 = or bool %or, %tmp.3
    %tmp.5 = and bool %tmp.4, true
    %retval = cast bool %tmp.5 to int
    ret int %retval
}

We now emit:

_bar:
.LBB_bar_0:     ; entry
        fcmpu cr0, f1, f4
        fcmpu cr1, f2, f4
        cror 0, 0, 4
        fcmpu cr1, f3, f4
        cror 28, 0, 5
        mfcr r2
        rlwinm r3, r2, 29, 31, 31
        blr

Instead of:

_bar:
.LBB_bar_0:     ; entry
        fcmpu cr7, f1, f4
        mfcr r2
        rlwinm r2, r2, 29, 31, 31
        fcmpu cr7, f2, f4
        mfcr r3
        rlwinm r3, r3, 29, 31, 31
        or r2, r2, r3
        fcmpu cr7, f3, f4
        mfcr r3
        rlwinm r3, r3, 30, 31, 31
        or r3, r2, r3
        blr

llvm-svn: 21321
2005-04-18 07:48:09 +00:00
Nate Begeman 602a45f415 Change codegen for setcc to read the bit directly out of the condition
register.  Added support in the .td file for the g5-specific variant
  of cr -> gpr moves that executes faster, but we currently don't
  generate it.

llvm-svn: 21314
2005-04-18 02:43:24 +00:00
Nate Begeman 779c5cbb44 Make pattern isel default for ppc
Add new ppc beta option related to using condition registers
Make pattern isel control flag (-enable-pattern-isel) global and tristate
  0 == off
  1 == on
  2 == target default

llvm-svn: 21309
2005-04-15 22:12:16 +00:00
Nate Begeman 53d3eccbe2 Implement multi-way branches through logical ops on condition registers.
This can generate considerably shorter code, reducing the size of crafty
by almost 1%.  Also fix the printing of mcrf.  The code is currently
disabled until it gets a bit more testing, but should work as-is.

llvm-svn: 21298
2005-04-14 09:45:08 +00:00
Nate Begeman be21011911 Start allocating condition registers. Almost all explicit uses of CR0 are
now gone.  Next step is to get rid of the remaining ones and then start
allocating bools to CRs where appropriate.

llvm-svn: 21294
2005-04-13 23:15:44 +00:00
Nate Begeman dd1bb81d04 Implement the fold shift X, zext(Y) -> shift X, Y at the target level,
where it is safe to do so.

llvm-svn: 21293
2005-04-13 22:14:14 +00:00
Nate Begeman 4ddd81657b Disbale the broken fold of shift + sz[ext] for now
Move the transform for select (a < 0) ? b : 0 into the dag from ppc isel
Enable the dag to fold and (setcc, 1) -> setcc for targets where setcc
  always produces zero or one.

llvm-svn: 21291
2005-04-13 21:23:31 +00:00
Chris Lattner e0efd1fa72 remove one more occurance of this that snuck in
llvm-svn: 21271
2005-04-13 02:46:17 +00:00
Chris Lattner 83075510ee Elimate handling of ZERO_EXTEND_INREG. This causes the PPC backend to emit
andi instructions instead of rlwinm instructions for zero extend, but they
seem like they would take the same time.

llvm-svn: 21268
2005-04-13 02:40:26 +00:00
Nate Begeman af1c0f7a00 Fold shift by size larger than type size to undef
Make llvm undef values generate ISD::UNDEF nodes

llvm-svn: 21261
2005-04-12 23:12:17 +00:00
Nate Begeman 818eb6ddd2 Implement setcc op, -1 sequences
Remove dead setcc op, 0 sequences
Coming later: generalization of op, imm

llvm-svn: 21260
2005-04-12 21:22:28 +00:00
Nate Begeman 79a3bea4ca Implement bitfield clears
Implement divide by negative power of two

llvm-svn: 21240
2005-04-12 00:10:02 +00:00
Nate Begeman bebefac791 Add recording variants of ISD::AND and ISD::OR. This kills almost 1000
(1.5%) instructions in 186.crafty

llvm-svn: 21222
2005-04-11 06:34:10 +00:00
Nate Begeman 492370311d Fix another fixme: factor out the constant fp generation code.
llvm-svn: 21207
2005-04-10 06:06:10 +00:00
Nate Begeman 941a01802f Fix 64 bit argument loading that straddles the args in regs / args on stack
boundary.

llvm-svn: 21206
2005-04-10 05:53:14 +00:00
Nate Begeman 6566e8ac06 Make sure that BRCOND branches can be converted into long branches too.
llvm-svn: 21198
2005-04-10 01:48:29 +00:00
Nate Begeman 3345eadc37 Don't hand ISD::CALL nodes off to SelectExprFP. This fixes siod.
llvm-svn: 21197
2005-04-10 01:14:13 +00:00
Nate Begeman 2121a54868 fix ISD::BRCONDTWOWAY codegen to not deference the end() iterator
llvm-svn: 21193
2005-04-09 23:35:05 +00:00
Chris Lattner e8e070dbfb do not set the root to null if an argument is dead
llvm-svn: 21188
2005-04-09 21:23:24 +00:00
Nate Begeman 8309a333dd Add rlwnm instruction for variable rotate
Generate rotate left/right immediate
Generate code for brcondtwoway
Use new livein/liveout functionality

llvm-svn: 21187
2005-04-09 20:09:12 +00:00