Commit Graph

1495 Commits

Author SHA1 Message Date
Chris Lattner ffad2166e1 move big chunks of code out-of-line, no functionality change.
llvm-svn: 31658
2006-11-11 00:39:41 +00:00
Chris Lattner 4eac5f59e6 Fix a dag combiner bug exposed by my recent instcombine patch. This fixes
CodeGen/Generic/2006-11-10-DAGCombineMiscompile.ll and PPC gsm/toast

llvm-svn: 31644
2006-11-10 21:37:15 +00:00
Evan Cheng 8c9c6d71ed Add implicit def / use operands to MachineInstr.
llvm-svn: 31633
2006-11-10 08:43:01 +00:00
Evan Cheng 13440b025c When forming a pre-indexed store, make sure ptr isn't the same or is a pred of value being stored. It would cause a cycle.
llvm-svn: 31631
2006-11-10 08:28:11 +00:00
Chris Lattner d5e604dbb2 commentate
llvm-svn: 31627
2006-11-10 04:41:34 +00:00
Evan Cheng 6878378390 Don't attempt expensive pre-/post- indexed dag combine if target does not support them.
llvm-svn: 31598
2006-11-09 19:10:46 +00:00
Evan Cheng d550248f2c Add a mechanism to specify whether a target supports a particular indexed load / store.
llvm-svn: 31597
2006-11-09 18:56:43 +00:00
Evan Cheng c034f14fbe Rename ISD::MemOpAddrMode to ISD::MemIndexedMode
llvm-svn: 31596
2006-11-09 18:44:21 +00:00
Evan Cheng b15000736c Rename ISD::MemOpAddrMode to ISD::MemIndexedMode
llvm-svn: 31595
2006-11-09 17:55:04 +00:00
Evan Cheng b58e06bc9e getPostIndexedAddressParts change: passes in load/store instead of its loaded / stored VT.
llvm-svn: 31584
2006-11-09 04:29:46 +00:00
Evan Cheng 85e54223cd Match more post-indexed ops.
llvm-svn: 31569
2006-11-08 20:27:27 +00:00
Jim Laskey 61feeb90f9 Remove redundant <cmath>.
llvm-svn: 31561
2006-11-08 19:16:44 +00:00
Evan Cheng 0303cb9b33 - When performing pre-/post- indexed load/store transformation, do not worry
about whether the new base ptr would be live below the load/store. Let two
  address pass split it back to non-indexed ops.
- Minor tweaks / fixes.

llvm-svn: 31544
2006-11-08 08:30:28 +00:00
Evan Cheng 6072435756 Fixed a minor bug preventing some pre-indexed load / store transformation.
llvm-svn: 31543
2006-11-08 06:56:05 +00:00
Reid Spencer fdff938a7e For PR950:
This patch converts the old SHR instruction into two instructions,
AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not
dependent on the sign of their operands.

llvm-svn: 31542
2006-11-08 06:47:33 +00:00
Evan Cheng d48f7dd250 Fix a obscure post-indexed load / store dag combine bug.
llvm-svn: 31537
2006-11-08 02:38:55 +00:00
Evan Cheng 60c6846d21 Add post-indexed load / store transformations.
llvm-svn: 31498
2006-11-07 09:03:05 +00:00
Chris Lattner 94c231f453 Fix PR988 and CodeGen/Generic/2006-11-06-MemIntrinsicExpand.ll.
The low part goes in the first operand of expandop, not the second one.

llvm-svn: 31487
2006-11-07 04:11:44 +00:00
Evan Cheng f24d15f969 Remove dead code; added a missing null ptr check.
llvm-svn: 31478
2006-11-06 21:33:46 +00:00
Evan Cheng eb99bd736a Add comment.
llvm-svn: 31473
2006-11-06 08:14:30 +00:00
Jeff Cohen 7d6f3db3e2 Unbreak VC++ build.
llvm-svn: 31464
2006-11-05 19:31:28 +00:00
Evan Cheng 33157700d9 Added pre-indexed store support.
llvm-svn: 31459
2006-11-05 09:31:14 +00:00
Evan Cheng 1a1e23eff7 Added getIndexedStore.
llvm-svn: 31458
2006-11-05 09:30:09 +00:00
Evan Cheng fd2c5dd806 Changes to use operand constraints to process two-address instructions.
llvm-svn: 31453
2006-11-04 09:44:31 +00:00
Evan Cheng 9456dd8b81 Fix comments.
llvm-svn: 31414
2006-11-03 07:31:32 +00:00
Evan Cheng 1dfd26a151 Rename
llvm-svn: 31413
2006-11-03 07:21:16 +00:00
Reid Spencer 52f958741a Remove dead variable. Fix 80 column violations.
llvm-svn: 31412
2006-11-03 03:30:34 +00:00
Evan Cheng 357017f4a9 Added DAG combiner transformation to generate pre-indexed loads.
llvm-svn: 31410
2006-11-03 03:06:21 +00:00
Evan Cheng c176f038b9 Added isPredecessor.
llvm-svn: 31409
2006-11-03 03:05:24 +00:00
Chris Lattner cd7b92251d silence warning
llvm-svn: 31397
2006-11-03 01:28:29 +00:00
Reid Spencer de46e48420 For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380
2006-11-02 20:25:50 +00:00
Reid Spencer 7eb55b395f For PR950:
Replace the REM instruction with UREM, SREM and FREM.

llvm-svn: 31369
2006-11-02 01:53:59 +00:00
Chris Lattner 55402d4403 Allow the getRegForInlineAsmConstraint method to return a register class with
no fixes physreg.  Treat this as permission to use any register in the register
class.  When this happens and it is safe, allow the llvm register allcoator to
allocate the register instead of doing it at isel time.  This eliminates a ton
of copies around common inline asms.  For example:

int test2(int Y, int X) {
  asm("foo %0, %1" : "=r"(X): "r"(X));
  return X;
}

now compiles to:

_test2:
        foo r3, r4
        blr

instead of:

_test2:
        mr r2, r4
        foo r2, r2
        mr r3, r2
        blr

GCC produces:

_test2:
        foo r4, r4
        mr r3,r4
        blr

llvm-svn: 31366
2006-11-02 01:41:49 +00:00
Evan Cheng 1359196c4e Clean up.
llvm-svn: 31359
2006-11-01 22:39:30 +00:00
Evan Cheng 47218fab42 CopyFromReg starts a live range so its use should not be considered a floater.
llvm-svn: 31356
2006-11-01 22:17:06 +00:00
Evan Cheng 415f365e5c Print jumptable index.
llvm-svn: 31340
2006-11-01 04:48:30 +00:00
Chris Lattner fe43befeda Compile CodeGen/PowerPC/fp-branch.ll to:
_intcoord_cond_next55:
LBB1_3: ;cond_next55
        lis r2, ha16(LCPI1_0)
        lfs f0, lo16(LCPI1_0)(r2)
        fcmpu cr0, f1, f0
        blt cr0, LBB1_2 ;cond_next62.exitStub
LBB1_1: ;bb72.exitStub
        li r3, 1
        blr
LBB1_2: ;cond_next62.exitStub
        li r3, 0
        blr

instead of:

_intcoord_cond_next55:
LBB1_3: ;cond_next55
        lis r2, ha16(LCPI1_0)
        lfs f0, lo16(LCPI1_0)(r2)
        fcmpu cr0, f1, f0
        bge cr0, LBB1_1 ;bb72.exitStub
LBB1_4: ;cond_next55
        lis r2, ha16(LCPI1_0)
        lfs f0, lo16(LCPI1_0)(r2)
        fcmpu cr0, f1, f0
        bnu cr0, LBB1_2 ;cond_next62.exitStub
LBB1_1: ;bb72.exitStub
        li r3, 1
        blr
LBB1_2: ;cond_next62.exitStub
        li r3, 0
        blr

llvm-svn: 31330
2006-10-31 23:06:00 +00:00
Chris Lattner 427301fdae look through isunordered to inline it into branch blocks.
llvm-svn: 31328
2006-10-31 22:37:42 +00:00
Chris Lattner 1fd360e13a handle global address constant sdnodes
llvm-svn: 31323
2006-10-31 20:01:56 +00:00
Chris Lattner 6f043b90ea TargetLowering::isOperandValidForConstraint
llvm-svn: 31319
2006-10-31 19:41:18 +00:00
Chris Lattner 8c6949e5b2 Change the prototype for TargetLowering::isOperandValidForConstraint
llvm-svn: 31318
2006-10-31 19:40:43 +00:00
Chris Lattner 968f803928 Turn an assert into an error message. This is commonly triggered when
we don't support a specific constraint yet.  When this happens, print the
unsupported constraint.

llvm-svn: 31310
2006-10-31 07:33:13 +00:00
Evan Cheng e6d584765f Fix a typo which can break jumptables.
llvm-svn: 31305
2006-10-31 02:31:00 +00:00
Evan Cheng 84a28d4e76 Lower jumptable to BR_JT. The legalizer can lower it to a BRIND or let the target custom lower it.
llvm-svn: 31293
2006-10-30 08:00:44 +00:00
Evan Cheng c3e695137d Added a new SDNode type: BR_JT for jumptable branch.
llvm-svn: 31292
2006-10-30 07:59:36 +00:00
Chris Lattner e60ae823e8 fix Generic/2006-10-29-Crash.ll
llvm-svn: 31281
2006-10-29 21:01:20 +00:00
Chris Lattner f31b9ef458 Fix a load folding issue that Evan noticed: there is no need to export values
used by comparisons in the main block.

llvm-svn: 31279
2006-10-29 18:23:37 +00:00
Evan Cheng 7ab6123c42 VLOAD is not the LoadSDNode opcode.
llvm-svn: 31276
2006-10-29 06:14:47 +00:00
Nick Lewycky dc146a9fb9 Remove spurious case. EXTLOAD is not one of the node opcodes.
llvm-svn: 31275
2006-10-29 02:26:30 +00:00
Chris Lattner bba52191fa split critical edges more carefully and intelligently. In particular, critical
edges whose destinations are not phi nodes don't bother us.  Also, share
split edges, since the split edge can't have a phi.  This significantly
reduces the complexity of generated code in some cases.

llvm-svn: 31274
2006-10-28 19:22:10 +00:00
Jim Laskey eef273a16f Load and stores have not been uniqued properly.
llvm-svn: 31261
2006-10-28 17:25:28 +00:00
Chris Lattner 3e6b1c6157 Split *all* critical edges before isel. This resolves issues with spill code
being inserted on unsplit critical edges, which introduces (sometimes large
amounts of) partially dead spill code.

This also fixes PR925 + CodeGen/Generic/switch-crit-edge-constant.ll

llvm-svn: 31260
2006-10-28 17:04:37 +00:00
Chris Lattner b78eb6c8d1 Fix a serious bug that caused any x86 vector stuff to infinite loop
llvm-svn: 31254
2006-10-28 06:15:26 +00:00
Jim Laskey bd0f088743 Clean up.
llvm-svn: 31243
2006-10-27 23:52:51 +00:00
Chris Lattner 84a035056e Fix a bug in merged condition handling (CodeGen/Generic/2006-10-27-CondFolding.ll).
Add many fewer CFG edges and PHI node entries.  If there is a switch which has
the same block as multiple destinations, only add that block once as a successor/phi
node (in the jumptable case)

llvm-svn: 31242
2006-10-27 23:50:33 +00:00
Jim Laskey f576b42bb2 Switch over from SelectionNodeCSEMap to FoldingSet.
llvm-svn: 31240
2006-10-27 23:46:08 +00:00
Chris Lattner b9392fb635 remove debug code
llvm-svn: 31233
2006-10-27 21:58:03 +00:00
Chris Lattner f1b54fd7a5 Codegen cond&cond with two branches. This compiles (f.e.) PowerPC/and-branch.ll to:
cmpwi cr0, r4, 4
        bgt cr0, LBB1_2 ;UnifiedReturnBlock
LBB1_3: ;entry
        cmplwi cr0, r3, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock

instead of:

        cmpwi cr7, r4, 4
        mfcr r2
        addic r4, r3, -1
        subfe r3, r4, r3
        rlwinm r2, r2, 30, 31, 31
        or r2, r2, r3
        cmplwi cr0, r2, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock
LBB1_1: ;cond_true

llvm-svn: 31232
2006-10-27 21:54:23 +00:00
Chris Lattner ed0110b949 Turn conditions like x<Y|z==q into multiple blocks.
This compiles Regression/CodeGen/X86/or-branch.ll into:

_foo:
        subl $12, %esp
        call L_bar$stub
        movl 20(%esp), %eax
        movl 16(%esp), %ecx
        cmpl $5, %eax
        jl LBB1_1       #cond_true
LBB1_3: #entry
        testl %ecx, %ecx
        jne LBB1_2      #UnifiedReturnBlock
LBB1_1: #cond_true
        call L_bar$stub
        addl $12, %esp
        ret
LBB1_2: #UnifiedReturnBlock
        addl $12, %esp
        ret

instead of:

_foo:
        subl $12, %esp
        call L_bar$stub
        movl 20(%esp), %eax
        movl 16(%esp), %ecx
        cmpl $4, %eax
        setg %al
        testl %ecx, %ecx
        setne %cl
        testb %cl, %al
        jne LBB1_2      #UnifiedReturnBlock
LBB1_1: #cond_true
        call L_bar$stub
        addl $12, %esp
        ret
LBB1_2: #UnifiedReturnBlock
        addl $12, %esp
        ret

And on ppc to:

        cmpwi cr0, r29, 5
        blt cr0, LBB1_1 ;cond_true
LBB1_3: ;entry
        cmplwi cr0, r30, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock

instead of:

        cmpwi cr7, r4, 4
        mfcr r2
        addic r4, r3, -1
        subfe r30, r4, r3
        rlwinm r29, r2, 30, 31, 31
        and r2, r29, r30
        cmplwi cr0, r2, 0
        bne cr0, LBB1_2 ;UnifiedReturnBlock

llvm-svn: 31230
2006-10-27 21:36:01 +00:00
Evan Cheng 96d6bf50ae getPreIndexedLoad -> getIndexedLoad.
llvm-svn: 31209
2006-10-26 21:53:40 +00:00
Reid Spencer 7e80b0b31e For PR950:
Make necessary changes to support DIV -> [SUF]Div. This changes llvm to
have three division instructions: signed, unsigned, floating point. The
bytecode and assembler are bacwards compatible, however.

llvm-svn: 31195
2006-10-26 06:15:43 +00:00
Chris Lattner 61bcf9154d visitSwitchCase knows how to insert conditional branches well. Change
visitBr to just call visitSwitchCase, eliminating duplicate logic.

llvm-svn: 31167
2006-10-24 18:07:37 +00:00
Chris Lattner 963ddad31a Generalize CaseBlock a bit more:
Rename LHSBB/RHSBB to TrueBB/FalseBB.  Allow the RHS value to be null,
in which case the LHS is treated as a bool.

llvm-svn: 31166
2006-10-24 17:57:59 +00:00
Chris Lattner 3f179d24c6 generalize 'CaseBlock'. It really allows any comparison to be inserted.
llvm-svn: 31161
2006-10-24 17:03:35 +00:00
Chris Lattner 4c931502cc Minor tweak. Instead of generating:
movl 32(%esp), %eax
        cmpl $1, %eax
        je LBB1_1       #bb
LBB1_4: #entry
        cmpl $2, %eax
        je LBB1_2       #bb2
        jmp LBB1_3      #UnifiedReturnBlock
LBB1_1: #bb

notice that we would miss the fall through and emit this instead:

        movl 32(%esp), %eax
        cmpl $2, %eax
        je LBB1_2       #bb2
LBB1_4: #entry
        cmpl $1, %eax
        jne LBB1_3      #UnifiedReturnBlock
LBB1_1: #bb

llvm-svn: 31130
2006-10-23 18:38:22 +00:00
Chris Lattner 76a7bc8c55 Fix phi node updating for switches lowered to linear sequences of branches.
llvm-svn: 31125
2006-10-22 23:00:53 +00:00
Chris Lattner 4c3ef4782d disable this code for now, it's not yet safely updating phi nodes
llvm-svn: 31124
2006-10-22 22:47:10 +00:00
Chris Lattner 6d6fc26257 Implement PR964 and Regression/CodeGen/Generic/SwitchLowering.ll
llvm-svn: 31119
2006-10-22 21:36:53 +00:00
Chris Lattner c5ab6ce613 Make flag and chain edges visually distinguishable from value edges in DOT
output.

llvm-svn: 31067
2006-10-20 18:06:09 +00:00
Reid Spencer e0fc4dfc22 For PR950:
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.

llvm-svn: 31063
2006-10-20 07:07:24 +00:00
Bill Wendling be96e1cd09 Partially in response to PR926: insert the newly created machine basic
blocks into the basic block list when lowering the switch inst. into a
binary tree of if-then statements. This allows the "visitSwitchCase" func
to allow for fall-through behavior.

llvm-svn: 31057
2006-10-19 21:46:38 +00:00
Jim Laskey 55e4dcad36 Add option for controlling inclusion of global AA.
llvm-svn: 31040
2006-10-18 19:08:31 +00:00
Jim Laskey a15b0ebb5e Use global info for alias analysis.
llvm-svn: 31035
2006-10-18 12:29:57 +00:00
Chris Lattner 78fd0f83ff Trivial patch to speed up legalizing common i64 constants.
llvm-svn: 31020
2006-10-17 21:47:13 +00:00
Chris Lattner 327b88b102 Fix CodeGen/PowerPC/2006-10-17-brcc-miscompile.ll
llvm-svn: 31019
2006-10-17 21:24:15 +00:00
Evan Cheng 2f4ddce75c Fix printer for StoreSDNode.
llvm-svn: 31017
2006-10-17 21:18:26 +00:00
Evan Cheng 1839d76f69 Reflect MemOpAddrMode change; added a helper to create pre-indexed load.
llvm-svn: 31016
2006-10-17 21:14:32 +00:00
Jim Laskey e7d2c24a7d Make it simplier to dump DAGs while in DAGCombiner. Remove a nasty optimization.
llvm-svn: 31009
2006-10-17 19:33:52 +00:00
Evan Cheng 1e3a39cd08 Make sure operand does have size and element type operands.
llvm-svn: 30999
2006-10-17 17:06:35 +00:00
Evan Cheng f3ae00a64a Be careful when looking through a vbit_convert. Optimizing this:
(vector_shuffle
  (vbitconvert (vbuildvector (copyfromreg v4f32), 1, v4f32), 4, f32),
  (undef, undef, undef, undef), (0, 0, 0, 0), 4, f32)
to the
  vbitconvert
is a very bad idea.

llvm-svn: 30989
2006-10-16 22:49:37 +00:00
Jim Laskey dcb2b83886 Pass AliasAnalysis thru to DAGCombiner.
llvm-svn: 30984
2006-10-16 20:52:31 +00:00
Jim Laskey 3bf4f3bd60 Tidy up after truncstore changes.
llvm-svn: 30961
2006-10-14 12:14:27 +00:00
Evan Cheng 47fbeda5ce Debug tweak.
llvm-svn: 30959
2006-10-14 08:34:06 +00:00
Chris Lattner 6a1b2de8c4 Make sure that the node returned by SimplifySetCC is added to the worklist
so that it can be deleted if unused.

llvm-svn: 30955
2006-10-14 03:52:46 +00:00
Chris Lattner 0626bd2fbc fold setcc of a setcc.
llvm-svn: 30953
2006-10-14 01:02:29 +00:00
Chris Lattner bd9acad805 When SimplifySetCC was moved to the DAGCombiner, it was never removed from
SelectionDAG and it has since bitrotted.  Remove the copy from SelectionDAG.
Next, remove the constant folding piece of DAGCombiner::SimplifySetCC into
a new FoldSetCC method which can be used by getNode() and SimplifySetCC.

This fixes obscure bugs.

llvm-svn: 30952
2006-10-14 00:41:01 +00:00
Jim Laskey dcf983ce41 Reduce the workload by not adding chain users to work list.
llvm-svn: 30948
2006-10-13 23:32:28 +00:00
Chris Lattner 45ffb1eb70 Fix a bug where we incorrectly turned '(X & 0) == 0' into '(X & 0) >> -1',
which is undefined.  "0" isn't a power of 2.

llvm-svn: 30947
2006-10-13 22:46:18 +00:00
Evan Cheng ab51cf2e78 Merge ISD::TRUNCSTORE to ISD::STORE. Switch to using StoreSDNode.
llvm-svn: 30945
2006-10-13 21:14:26 +00:00
Chris Lattner d0620d2773 Lower X%C into X/C+stuff. This allows the 'division by a constant' logic to
apply to rems as well as divs.  This fixes PR945 and speeds up ReedSolomon
from 14.57s to 10.90s (which is now faster than gcc).

It compiles CodeGen/X86/rem.ll into:

_test1:
        subl $4, %esp
        movl %esi, (%esp)
        movl $2155905153, %ecx
        movl 8(%esp), %esi
        movl %esi, %eax
        imull %ecx
        addl %esi, %edx
        movl %edx, %eax
        shrl $31, %eax
        sarl $7, %edx
        addl %eax, %edx
        imull $255, %edx, %eax
        subl %eax, %esi
        movl %esi, %eax
        movl (%esp), %esi
        addl $4, %esp
        ret
_test2:
        movl 4(%esp), %eax
        movl %eax, %ecx
        sarl $31, %ecx
        shrl $24, %ecx
        addl %eax, %ecx
        andl $4294967040, %ecx
        subl %ecx, %eax
        ret
_test3:
        subl $4, %esp
        movl %esi, (%esp)
        movl $2155905153, %ecx
        movl 8(%esp), %esi
        movl %esi, %eax
        mull %ecx
        shrl $7, %edx
        imull $255, %edx, %eax
        subl %eax, %esi
        movl %esi, %eax
        movl (%esp), %esi
        addl $4, %esp
        ret

instead of div/idiv instructions.

llvm-svn: 30920
2006-10-12 20:58:32 +00:00
Evan Cheng a731cb674a Add RemoveDeadNode to remove a dead node and its (potentially) dead operands.
llvm-svn: 30916
2006-10-12 20:34:05 +00:00
Chris Lattner 2e33fb453b add a minor dag combine noticed when looking at PR945
llvm-svn: 30915
2006-10-12 20:23:19 +00:00
Jim Laskey df2ccc395e D'oh - need to use the rigth kind of store.
llvm-svn: 30903
2006-10-12 15:22:24 +00:00
Jim Laskey a13b9c7aa4 Alias analysis of TRUNCSTORE.
llvm-svn: 30889
2006-10-11 18:55:16 +00:00
Jim Laskey 6a4c6d3a7a Typo
llvm-svn: 30884
2006-10-11 17:52:19 +00:00
Jim Laskey 0f7c328ae7 Handle aliasing of loadext.
llvm-svn: 30883
2006-10-11 17:47:52 +00:00
Jim Laskey 08edf332ed Fix regression in combiner alias analysis.
llvm-svn: 30880
2006-10-11 13:47:09 +00:00
Evan Cheng d35734bd1f Naming consistency.
llvm-svn: 30878
2006-10-11 07:10:22 +00:00
Andrew Lenharth a6bbf33cbf Jimptables working again on alpha.
As a bonus, use the GOT node instead of the AlphaISD::GOT for internal stuff.

llvm-svn: 30873
2006-10-11 04:29:42 +00:00
Chris Lattner 6df349676e add two helper methods.
llvm-svn: 30869
2006-10-11 03:58:02 +00:00
Evan Cheng 2da4671e05 FindModifiedNodeSlot needs to add LoadSDNode ivars to create proper SelectionDAGCSEMap ID.
llvm-svn: 30866
2006-10-11 01:47:58 +00:00
Evan Cheng 7994aec7b5 Also update getNodeLabel for LoadSDNode.
llvm-svn: 30861
2006-10-10 20:11:26 +00:00
Evan Cheng fe858538c0 SDNode::dump should also print out extension type and VT.
llvm-svn: 30860
2006-10-10 20:05:10 +00:00
Chris Lattner 8438429c96 Fix another bug in extload promotion.
llvm-svn: 30857
2006-10-10 18:54:19 +00:00
Evan Cheng dc6a3aab71 Fix a bug introduced by my LOAD/LOADX changes.
llvm-svn: 30853
2006-10-10 07:51:21 +00:00
Evan Cheng e71fe34d75 Reflects ISD::LOAD / ISD::LOADX / LoadSDNode changes.
llvm-svn: 30844
2006-10-09 20:57:25 +00:00
Chris Lattner 5ab6d8b3fc Eliminate more token factors by taking advantage of transitivity:
if TF depends on A and B, and A depends on B, TF just needs to depend on
A.  With Jim's alias-analysis stuff enabled, this compiles the testcase in
PR892 into:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %edx, 28(%esp)
        movl %eax, 32(%esp)
        movl %eax, 24(%esp)
        movl %edx, 36(%esp)
        movl 52(%esp), %ecx
        movl %ecx, 4(%esp)
        movl %eax, 8(%esp)
        movl %edx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

instead of:

__Z4test3Val:
        subl $44, %esp
        call L__Z3foov$stub
        movl %eax, 24(%esp)
        movl %edx, 28(%esp)
        movl 24(%esp), %eax
        movl %eax, 32(%esp)
        movl 28(%esp), %eax
        movl %eax, 36(%esp)
        movl 32(%esp), %eax
        movl 36(%esp), %ecx
        movl 52(%esp), %edx
        movl %edx, 4(%esp)
        movl %eax, 8(%esp)
        movl %ecx, 12(%esp)
        movl 48(%esp), %eax
        movl %eax, (%esp)
        call L__Z3bar3ValS_$stub
        addl $44, %esp
        ret

llvm-svn: 30821
2006-10-08 22:57:01 +00:00
Jim Laskey 0463e08005 Combiner alias analysis passes Multisource (release-asserts.)
llvm-svn: 30818
2006-10-07 23:37:56 +00:00
Chris Lattner f9f90bc239 Fix a bug legalizing zero-extending i64 loads into 32-bit loads. The bottom
part was always forced to be sextload, even when we needed an zextload.

llvm-svn: 30782
2006-10-07 00:58:36 +00:00
Chris Lattner a389a612bb initialize ivar
llvm-svn: 30780
2006-10-06 22:52:08 +00:00
Chris Lattner 9d75324ddf jump tables handle pic
llvm-svn: 30776
2006-10-06 22:32:29 +00:00
Chris Lattner f5839a0816 Fix a miscompilation of:
long long foo(long long X) {
  return (long long)(signed char)(int)X;
}

Instead of:

_foo:
        extsb r2, r4
        srawi r3, r4, 31
        mr r4, r2
        blr

we now produce:

_foo:
        extsb r4, r4
        srawi r3, r4, 31
        blr

This fixes a miscompilation in ConstantFolding.cpp.

llvm-svn: 30768
2006-10-06 17:34:12 +00:00
Evan Cheng df9ac47e5e Make use of getStore().
llvm-svn: 30759
2006-10-05 23:01:46 +00:00
Evan Cheng af309d29b1 Add getStore() helper function to create ISD::STORE nodes.
llvm-svn: 30758
2006-10-05 22:57:11 +00:00
Jim Laskey 6549d22ef9 Alias analysis code clean ups.
llvm-svn: 30753
2006-10-05 15:07:25 +00:00
Evan Cheng f80dfa83a0 Fix some typos that can cause a flag value to have more than one use.
llvm-svn: 30727
2006-10-04 22:23:53 +00:00
Jim Laskey 708d0db2d8 More extensive alias analysis.
llvm-svn: 30721
2006-10-04 16:53:27 +00:00
Evan Cheng 5d9fd977d3 Combine ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD into ISD::LOADX. Add an
extra operand to LOADX to specify the exact value extension type.

llvm-svn: 30714
2006-10-04 00:56:09 +00:00
Evan Cheng 91d76cb27f Fix an obvious typo.
llvm-svn: 30711
2006-10-03 23:08:27 +00:00
Jim Laskey e73a22514d Debugging kruft
llvm-svn: 30688
2006-10-02 13:01:17 +00:00
Jim Laskey 1368c265da Add ability to annotate (color) nodes in a viewGraph.
llvm-svn: 30686
2006-10-02 12:26:53 +00:00
Chris Lattner a9caf95591 refactor critical edge breaking out into the SplitCritEdgesForPHIConstants method.
This is a baby step towards fixing PR925.

llvm-svn: 30643
2006-09-28 06:17:10 +00:00
Andrew Lenharth c19ef92403 Comments on JumpTableness
llvm-svn: 30615
2006-09-26 20:02:30 +00:00
Jim Laskey 60832693a7 Load chain check is not needed
llvm-svn: 30613
2006-09-26 17:44:58 +00:00
Jim Laskey dde51671e5 Chain can be any operand
llvm-svn: 30611
2006-09-26 09:32:41 +00:00
Jim Laskey 5f3e0af9d0 Wrong size for load
llvm-svn: 30610
2006-09-26 08:14:06 +00:00
Jim Laskey b4a864d533 Can't move a load node if it's chain is not used.
llvm-svn: 30609
2006-09-26 07:37:42 +00:00
Jim Laskey 7aa0638aa9 Accidental enable of bad code
llvm-svn: 30601
2006-09-25 21:11:32 +00:00
Jim Laskey b5534e5c28 Fix chain dropping in load and drop unused stores in ret blocks.
llvm-svn: 30600
2006-09-25 19:32:58 +00:00
Jim Laskey d07be232ba Core antialiasing for load and store.
llvm-svn: 30597
2006-09-25 16:29:54 +00:00
Andrew Lenharth 783a4a9d86 Add support for other relocation bases to jump tables, as well as custom asm directives
llvm-svn: 30593
2006-09-24 19:45:58 +00:00
Evan Cheng 77c0757f8b PIC jump table entries are always 32-bit. This fixes PIC jump table support on X86-64.
llvm-svn: 30590
2006-09-24 05:22:38 +00:00
Evan Cheng 449a0c7e33 Make it work for DAG combine of multi-value nodes.
llvm-svn: 30573
2006-09-21 19:04:05 +00:00
Jim Laskey 35f7eebb49 core corrections
llvm-svn: 30570
2006-09-21 17:35:47 +00:00
Jim Laskey 5d19d59017 Basic "in frame" alias analysis.
llvm-svn: 30568
2006-09-21 16:28:59 +00:00
Chris Lattner 082db3f9aa fold (aext (and (trunc x), cst)) -> (and x, cst).
llvm-svn: 30561
2006-09-21 06:40:43 +00:00
Chris Lattner fa9f92cf65 Check the right value type. This fixes 186.crafty on x86
llvm-svn: 30560
2006-09-21 06:17:39 +00:00
Chris Lattner 8d8a3bf9c9 Compile:
int %test(ulong *%tmp) {
        %tmp = load ulong* %tmp         ; <ulong> [#uses=1]
        %tmp.mask = shr ulong %tmp, ubyte 50            ; <ulong> [#uses=1]
        %tmp.mask = cast ulong %tmp.mask to ubyte
        %tmp2 = and ubyte %tmp.mask, 3          ; <ubyte> [#uses=1]
        %tmp2 = cast ubyte %tmp2 to int         ; <int> [#uses=1]
        ret int %tmp2
}

to:

_test:
        movl 4(%esp), %eax
        movl 4(%eax), %eax
        shrl $18, %eax
        andl $3, %eax
        ret

instead of:

_test:
        movl 4(%esp), %eax
        movl 4(%eax), %eax
        shrl $18, %eax
        # TRUNCATE movb %al, %al
        andb $3, %al
        movzbl %al, %eax
        ret

llvm-svn: 30558
2006-09-21 06:14:31 +00:00
Chris Lattner a31f0a622b Generalize (zext (truncate x)) and (sext (truncate x)) folding to work when
the src/dst are not the same size.  This catches things like "truncate
32-bit X to 8 bits, then zext to 16", which happens a bit on X86.

llvm-svn: 30557
2006-09-21 06:00:20 +00:00
Chris Lattner c8cd62d381 Compile:
int test3(int a, int b) { return (a < 0) ? a : 0; }

to:

_test3:
        srawi r2, r3, 31
        and r3, r2, r3
        blr

instead of:

_test3:
        cmpwi cr0, r3, 1
        li r2, 0
        blt cr0, LBB2_2 ;entry
LBB2_1: ;entry
        mr r3, r2
LBB2_2: ;entry
        blr


This implements: PowerPC/select_lt0.ll:seli32_a_a

llvm-svn: 30517
2006-09-20 06:41:35 +00:00
Chris Lattner 8746e2cd57 Fold the full generality of (any_extend (truncate x))
llvm-svn: 30514
2006-09-20 06:29:17 +00:00
Chris Lattner 8b68decb27 Two things:
1. teach SimplifySetCC that '(srl (ctlz x), 5) == 0' is really x != 0.
2. Teach visitSELECT_CC to use SimplifySetCC instead of calling it and
   ignoring the result.  This allows us to compile:

bool %test(ulong %x) {
  %tmp = setlt ulong %x, 4294967296
  ret bool %tmp
}

to:

_test:
        cntlzw r2, r3
        cmplwi cr0, r3, 1
        srwi r2, r2, 5
        li r3, 0
        beq cr0, LBB1_2 ;
LBB1_1: ;
        mr r3, r2
LBB1_2: ;
        blr

instead of:

_test:
        addi r2, r3, -1
        cntlzw r2, r2
        cntlzw r3, r3
        srwi r2, r2, 5
        cmplwi cr0, r2, 0
        srwi r2, r3, 5
        li r3, 0
        bne cr0, LBB1_2 ;
LBB1_1: ;
        mr r3, r2
LBB1_2: ;
        blr

This isn't wonderful, but it's an improvement.

llvm-svn: 30513
2006-09-20 06:19:26 +00:00
Chris Lattner 875ea0cdbd Expand 64-bit shifts more optimally if we know that the high bit of the
shift amount is one or zero.  For example, for:

long long foo1(long long X, int C) {
  return X << (C|32);
}

long long foo2(long long X, int C) {
  return X << (C&~32);
}

we get:

_foo1:
        movb $31, %cl
        movl 4(%esp), %edx
        andb 12(%esp), %cl
        shll %cl, %edx
        xorl %eax, %eax
        ret
_foo2:
        movb $223, %cl
        movl 4(%esp), %eax
        movl 8(%esp), %edx
        andb 12(%esp), %cl
        shldl %cl, %eax, %edx
        shll %cl, %eax
        ret

instead of:

_foo1:
        subl $4, %esp
        movl %ebx, (%esp)
        movb $32, %bl
        movl 8(%esp), %eax
        movl 12(%esp), %edx
        movb %bl, %cl
        orb 16(%esp), %cl
        shldl %cl, %eax, %edx
        shll %cl, %eax
        xorl %ecx, %ecx
        testb %bl, %bl
        cmovne %eax, %edx
        cmovne %ecx, %eax
        movl (%esp), %ebx
        addl $4, %esp
        ret
_foo2:
        subl $4, %esp
        movl %ebx, (%esp)
        movb $223, %cl
        movl 8(%esp), %eax
        movl 12(%esp), %edx
        andb 16(%esp), %cl
        shldl %cl, %eax, %edx
        shll %cl, %eax
        xorl %ecx, %ecx
        xorb %bl, %bl
        testb %bl, %bl
        cmovne %eax, %edx
        cmovne %ecx, %eax
        movl (%esp), %ebx
        addl $4, %esp
        ret

llvm-svn: 30506
2006-09-20 03:38:48 +00:00
Chris Lattner 5a42ebcff3 Fold extract_element(cst) to cst
llvm-svn: 30478
2006-09-19 05:02:39 +00:00
Chris Lattner 4c059f4962 Minor speedup for legalize by avoiding some malloc traffic
llvm-svn: 30477
2006-09-19 04:51:23 +00:00
Evan Cheng 1fc7c363e6 Fix a typo.
llvm-svn: 30474
2006-09-18 23:28:33 +00:00
Evan Cheng 4bfaf0bd2c Allow i32 UDIV, SDIV, UREM, SREM to be expanded into libcalls.
llvm-svn: 30470
2006-09-18 21:49:04 +00:00
Andrew Lenharth c50458fb90 absolute addresses must match pointer size
llvm-svn: 30461
2006-09-18 17:59:35 +00:00
Chris Lattner e50f5d1fb1 Oh yeah, this is needed too
llvm-svn: 30407
2006-09-16 05:08:34 +00:00
Chris Lattner 1b63391fdf simplify control flow, no functionality change
llvm-svn: 30403
2006-09-16 00:21:44 +00:00
Chris Lattner fbadbda6ba Allow custom expand of mul
llvm-svn: 30402
2006-09-16 00:09:24 +00:00
Chris Lattner 46d710e6ea Fold (X & C1) | (Y & C2) -> (X|Y) & C3 when possible.
This implements CodeGen/X86/and-or-fold.ll

llvm-svn: 30379
2006-09-14 21:11:37 +00:00
Chris Lattner 97614c86ce Split rotate matching code out to its own function. Make it stronger, by
matching things like ((x >> c1) & c2) | ((x << c3) & c4) to (rot x, c5) & c6

llvm-svn: 30376
2006-09-14 20:50:57 +00:00
Chris Lattner 84cc1f7cb8 If LSR went through a lot of trouble to put constants (e.g. the addr of a global
in a specific BB, don't undo this!).  This allows us to compile
CodeGen/X86/loop-hoist.ll into:

_foo:
        xorl %eax, %eax
***     movl L_Arr$non_lazy_ptr, %ecx
        movl 4(%esp), %edx
LBB1_1: #cond_true
        movl %eax, (%ecx,%eax,4)
        incl %eax
        cmpl %edx, %eax
        jne LBB1_1      #cond_true
LBB1_2: #return
        ret

instead of:

_foo:
        xorl %eax, %eax
        movl 4(%esp), %ecx
LBB1_1: #cond_true
***     movl L_Arr$non_lazy_ptr, %edx
        movl %eax, (%edx,%eax,4)
        incl %eax
        cmpl %ecx, %eax
        jne LBB1_1      #cond_true
LBB1_2: #return
        ret

This was noticed in 464.h264ref.  This doesn't usually affect PPC,
but strikes X86 all the time.

llvm-svn: 30290
2006-09-13 06:02:42 +00:00
Chris Lattner 72b503bcad Compile X << 1 (where X is a long-long) to:
addl %ecx, %ecx
        adcl %eax, %eax

instead of:

        movl %ecx, %edx
        addl %edx, %edx
        shrl $31, %ecx
        addl %eax, %eax
        orl %ecx, %eax

and to:

        addc r5, r5, r5
        adde r4, r4, r4

instead of:

        slwi r2,r9,1
        srwi r0,r11,31
        slwi r3,r11,1
        or r2,r0,r2

on PPC.

llvm-svn: 30284
2006-09-13 03:50:39 +00:00
Evan Cheng 45fe3bc72c Added support for machine specific constantpool values. These are useful for
representing expressions that can only be resolved at link time, etc.

llvm-svn: 30278
2006-09-12 21:00:35 +00:00
Chris Lattner 2e0dfb0b16 This code was trying too hard. By eliminating redundant edges in the CFG
due to switch cases going to the same place, it make #pred != #phi entries,
breaking live interval analysis.

This fixes 458.sjeng on x86 with llc.

llvm-svn: 30236
2006-09-10 06:36:57 +00:00
Chris Lattner f0359b343a Implement the fpowi now by lowering to a libcall
llvm-svn: 30225
2006-09-09 06:03:30 +00:00
Chris Lattner e4bbb6c341 Allow targets to custom lower expanded BIT_CONVERT's
llvm-svn: 30217
2006-09-09 00:20:27 +00:00
Chris Lattner 707339a57b Fix CodeGen/Generic/2006-09-06-SwitchLowering.ll, a bug where SDIsel inserted
too many phi operands when lowering a switch to branches in some cases.

llvm-svn: 30142
2006-09-07 01:59:34 +00:00
Chris Lattner 0dce3311c4 Change the default to 0, which means 'default'.
llvm-svn: 30114
2006-09-05 17:39:15 +00:00
Chris Lattner af23f9b5f6 Completely eliminate def&use operands. Now a register operand is EITHER a
def operand or a use operand.

llvm-svn: 30109
2006-09-05 02:31:13 +00:00
Duraid Madina 373be1d1a2 forgot this
llvm-svn: 30097
2006-09-04 07:44:11 +00:00
Evan Cheng e93762d36e Allow legalizer to expand ISD::MUL using only MULHS in the rare case that is
possible and the target only supports MULHS.

llvm-svn: 30022
2006-09-01 18:17:58 +00:00
Evan Cheng 31305c45da DAG combiner fix for rotates. Previously the outer-most condition checks
for ROTL availability. This prevents it from forming ROTR for targets that
has ROTR only.

llvm-svn: 29997
2006-08-31 07:41:12 +00:00
Evan Cheng e5570a4c3f Move isCommutativeBinOp from SelectionDAG.cpp and DAGCombiner.cpp out. Make it a static method of SelectionDAG.
llvm-svn: 29951
2006-08-29 06:42:35 +00:00
Chris Lattner 3d27be1333 s|llvm/Support/Visibility.h|llvm/Support/Compiler.h|
llvm-svn: 29911
2006-08-27 12:54:02 +00:00
Evan Cheng 849f4bf8dd Eliminate SelectNodeTo() and getTargetNode() variants which take more than
3 SDOperand operands. They are replaced by versions which take an array
of SDOperand and the number of operands.

llvm-svn: 29905
2006-08-27 08:08:54 +00:00
Evan Cheng 34b70eea5c SelectNodeTo now returns a SDNode*.
llvm-svn: 29901
2006-08-26 08:00:10 +00:00
Chris Lattner 451b099113 Fix PR861
llvm-svn: 29796
2006-08-21 20:24:53 +00:00
Chris Lattner d86418ab20 switch the SUnit pred/succ sets from being std::sets to being smallvectors.
This reduces selectiondag time on kc++ from 5.43s to 4.98s (9%).  More
significantly, this speeds up the default ppc scheduler from ~1571ms to 1063ms,
a 33% speedup.

llvm-svn: 29743
2006-08-17 00:09:56 +00:00
Chris Lattner 65879caf07 minor changes.
llvm-svn: 29740
2006-08-16 22:57:46 +00:00
Chris Lattner a4f3625c23 Use the appropriate typedef
llvm-svn: 29730
2006-08-16 20:59:32 +00:00
Chris Lattner a5a3eafbd0 Start using SDVTList more consistently
llvm-svn: 29711
2006-08-15 19:11:05 +00:00
Chris Lattner f98411a220 add a new SDVTList type and new SelectionDAG::getVTList methods to streamline
the creation of canonical VTLists.

llvm-svn: 29709
2006-08-15 17:46:01 +00:00
Chris Lattner bd8877744b eliminate use of getNode that takes vector of valuetypes.
llvm-svn: 29687
2006-08-14 23:53:35 +00:00
Chris Lattner 3bf4be453f Add a new getNode() method that takes a pointer to an already-intern'd list
of value-type nodes.  This avoids having to do mallocs for std::vectors of
valuetypes when a node returns more than one type.

llvm-svn: 29685
2006-08-14 23:31:51 +00:00
Chris Lattner e93a39f2d7 remove SelectionDAG::InsertISelMapEntry, it is dead
llvm-svn: 29677
2006-08-14 22:24:39 +00:00
Chris Lattner 63268f0672 Add code to resize the CSEMap hash table. This doesn't speedup codegen of
kimwitu, but seems like a good idea from a "avoid performance cliffs" standpoint :)

llvm-svn: 29675
2006-08-14 22:19:25 +00:00
Chris Lattner 8e37283d8b Add the actual constant to the hash for ConstantPool nodes. Thanks to
Rafael Espindola for pointing this out.

llvm-svn: 29669
2006-08-14 20:12:44 +00:00
Chris Lattner 0a60294fa0 Switch to using SuperFastHash instead of adding all elements together. This
doesn't significantly improve performance but it helps a small amount.

llvm-svn: 29642
2006-08-12 01:07:10 +00:00
Chris Lattner 04aa034f38 Switch NodeID to track 32-bit chunks instead of 8-bit chunks, for a 2.5%
speedup in isel time.

llvm-svn: 29640
2006-08-11 23:55:53 +00:00
Chris Lattner 0c2e5412bb Remove 8 more std::map's.
llvm-svn: 29631
2006-08-11 21:55:30 +00:00
Chris Lattner 3f16b201e2 Move the BBNodes, GlobalValues, TargetGlobalValues, Constants, TargetConstants,
RegNodes, and ValueNodes maps into the CSEMap.

llvm-svn: 29626
2006-08-11 21:01:22 +00:00
Chris Lattner fcb16470ec eliminate the NullaryOps map, use CSEMap instead.
llvm-svn: 29621
2006-08-11 18:38:11 +00:00
Chris Lattner 6f22ebd8be change internal impl of dag combiner so that calls to CombineTo never have to
make a temporary vector.

llvm-svn: 29618
2006-08-11 17:56:38 +00:00
Chris Lattner a2f4086828 Change one ReplaceAllUsesWith method to take an array of operands to replace
instead of a vector of operands.

llvm-svn: 29616
2006-08-11 17:46:28 +00:00
Chris Lattner c24a1d3093 Start eliminating temporary vectors used to create DAG nodes. Instead, pass
in the start of an array and a count of operands where applicable.  In many
cases, the number of operands is known, so this static array can be allocated
on the stack, avoiding the heap.  In many other cases, a SmallVector can be
used, which has the same benefit in the common cases.

I updated a lot of code calling getNode that takes a vector, but ran out of
time.  The rest of the code should be updated, and these methods should be
removed.

We should also do the same thing to eliminate the methods that take a
vector of MVT::ValueTypes.

It would be extra nice to convert the dagiselemitter to avoid creating vectors
for operands when calling getTargetNode.

llvm-svn: 29566
2006-08-08 02:23:42 +00:00
Chris Lattner 97af9d5d3a Eliminate some malloc traffic by allocating vectors on the stack. Change some
method that took std::vector<SDOperand> to take a pointer to a first operand
and #operands.

This speeds up isel on kc++ by about 3%.

llvm-svn: 29561
2006-08-08 01:09:31 +00:00
Chris Lattner 1ee75ce65d Revamp the "CSEMap" datastructure used in the SelectionDAG class. This
eliminates a bunch of std::map's in the SelectionDAG, replacing them with a
home-grown hashtable.

This is still a work in progress: not all the maps have been moved over and the
hashtable never resizes.  That said, this still speeds up llc 20% on kimwitu++
with -fast -regalloc=local using a release build.

llvm-svn: 29550
2006-08-07 23:03:03 +00:00
Evan Cheng 445b91a041 Clear TopOrder before assigning topological order. Some clean ups.
llvm-svn: 29546
2006-08-07 22:13:29 +00:00
Evan Cheng 1640ae5a84 Reverse the FlaggedNodes after scanning up for flagged preds or else the order would be reversed.
llvm-svn: 29545
2006-08-07 22:12:12 +00:00
Chris Lattner 8927c875bb Make SelectionDAG::RemoveDeadNodes iterative instead of recursive, which
also make it simpler.

llvm-svn: 29524
2006-08-04 17:45:20 +00:00
Jim Laskey a5b707e3ad Copy the liveins for the first block. PR859
llvm-svn: 29511
2006-08-03 20:51:06 +00:00
Chris Lattner 524c1a21f2 Work around a GCC 3.3.5 bug noticed by a user.
llvm-svn: 29490
2006-08-03 00:18:59 +00:00
Evan Cheng bba1ebda32 - Change AssignTopologicalOrder to return vector of SDNode* by reference.
- Tweak implementation to avoid using std::map.

llvm-svn: 29479
2006-08-02 22:00:34 +00:00
Jim Laskey 29e635d3c9 Final polish on machine pass registries.
llvm-svn: 29471
2006-08-02 12:30:23 +00:00
Jim Laskey 17c67efe8a Now that the ISel is available, it's possible to create a default instruction
scheduler creator.

llvm-svn: 29452
2006-08-01 19:14:14 +00:00
Jim Laskey 03593f72db 1. Change use of "Cache" to "Default".
2. Added argument to instruction scheduler creators so the creators can do
special things.
3. Repaired target hazard code.
4. Misc.

More to follow.

llvm-svn: 29450
2006-08-01 18:29:48 +00:00
Jim Laskey 95eda5b1f3 Introducing plugable register allocators and instruction schedulers.
llvm-svn: 29434
2006-08-01 14:21:23 +00:00