llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	33182325f5	Eliminate the RegSDNode class, which 3 nodes (CopyFromReg/CopyToReg/ImplicitDef) used to tack a register number onto the node. Instead of doing this, make a new node, RegisterSDNode, which is a leaf containing a register number. These three operations just become normal DAG nodes now, instead of requiring special handling. Note that with this change, it is no longer correct to make illegal CopyFromReg/CopyToReg nodes. The legalizer will not touch them, and this is bad, so don't do it. :) llvm-svn: 22806	2005-08-16 21:55:35 +00:00
Nate Begeman	371e49515d	Implement BR_CC and BRTWOWAY_CC. This allows the removal of a rather nasty fixme from the PowerPC backend. Emit slightly better code for legalizing select_cc. llvm-svn: 22805	2005-08-16 19:49:35 +00:00
Chris Lattner	bc89226527	Allow passing a dag into dump and getOperationName. If one is available when printing a node, use it to render target operations with their target instruction name instead of "<<unknown>>". llvm-svn: 22804	2005-08-16 18:33:07 +00:00
Chris Lattner	577af48731	allow passing a dag into getOperationName and dump llvm-svn: 22803	2005-08-16 18:32:18 +00:00
Chris Lattner	7e57d18b79	Use a extant helper to do this. llvm-svn: 22802	2005-08-16 18:31:23 +00:00
Chris Lattner	1973278b38	Add some methods for dag->dag isel. Split RemoveNodeFromCSEMaps out of DeleteNodesIfDead to do it. llvm-svn: 22801	2005-08-16 18:17:10 +00:00
Chris Lattner	ba19325eae	add some methods for dag->dag isel llvm-svn: 22800	2005-08-16 18:16:24 +00:00
Chris Lattner	f22556d3ad	Pull the LLVM -> DAG lowering code out of the pattern selector so that it can be shared with the DAG->DAG selector. llvm-svn: 22799	2005-08-16 17:14:42 +00:00
Chris Lattner	5cf983ee0f	Fix a bad case in gzip where we put lots of things in registers across the loop, because a IV-dependent value was used outside of the loop and didn't have immediate-folding capability llvm-svn: 22798	2005-08-16 00:38:11 +00:00
Chris Lattner	e515416396	Fix Transforms/LoopStrengthReduce/2005-08-15-AddRecIV.ll llvm-svn: 22797	2005-08-16 00:37:01 +00:00
Chris Lattner	3cf8ef170a	testcase that crashes lsr, distilled from 175.vpr llvm-svn: 22796	2005-08-16 00:36:12 +00:00
Chris Lattner	73785d2ef2	Turn loop strength reduction on by default. Only run createLowerConstantExpressionsPass for the simple isel. The DAG isel has no need for it. llvm-svn: 22794	2005-08-15 23:47:04 +00:00
Chris Lattner	587a75b6e0	Teach LLVM to know how many times a loop executes when constructed with a < expression, e.g.: for (i = m; i < n; ++i) llvm-svn: 22793	2005-08-15 23:33:51 +00:00
Jim Laskey	24b84072ea	Broke 80 column rule. llvm-svn: 22792	2005-08-15 17:35:26 +00:00
Jim Laskey	42623a9539	Changed code gen for int to f32 to use rounding. This makes FP results consistent with gcc. llvm-svn: 22791	2005-08-15 17:14:19 +00:00
Andrew Lenharth	b65b1568ae	isIntImmediate is a good Idea. Add a flavor that checks bounds while it is at it llvm-svn: 22790	2005-08-15 14:31:37 +00:00
Nate Begeman	d5e739dcc2	Fix last night's PPC32 regressions by 1. Not selecting the false value of a select_cc in the false arm, which isn't legal for nested selects. 2. Actually returning the node we created and Legalized in the FP_TO_UINT Expander. llvm-svn: 22789	2005-08-14 18:38:32 +00:00
Nate Begeman	e5394d453d	Fix last night's X86 regressions by putting code for SSE in the if(SSE) block. nur. llvm-svn: 22788	2005-08-14 18:37:02 +00:00
Andrew Lenharth	ed07233868	only build .a on alpha llvm-svn: 22787	2005-08-14 15:14:34 +00:00
Nate Begeman	4d959f6627	Fix FP_TO_UINT with Scalar SSE2 now that the legalizer can handle it. We now generate the relatively good code sequences: unsigned short foo(float a) { return a; } _foo: movss 4(%esp), %xmm0 cvttss2si %xmm0, %eax movzwl %ax, %eax ret and unsigned bar(float a) { return a; } _bar: movss .CPI_bar_0, %xmm0 movss 4(%esp), %xmm1 movapd %xmm1, %xmm2 subss %xmm0, %xmm2 cvttss2si %xmm2, %eax xorl $-2147483648, %eax cvttss2si %xmm1, %ecx ucomiss %xmm0, %xmm1 cmovb %ecx, %eax ret llvm-svn: 22786	2005-08-14 04:36:51 +00:00
Nate Begeman	36853ee1fd	Teach the legalizer how to legalize FP_TO_UINT. Teach the legalizer to promote FP_TO_UINT to FP_TO_SINT if the wider FP_TO_UINT is also illegal. This allows us on PPC to codegen unsigned short foo(float a) { return a; } as: _foo: .LBB_foo_0: ; entry fctiwz f0, f1 stfd f0, -8(r1) lwz r2, -4(r1) rlwinm r3, r2, 0, 16, 31 blr instead of: _foo: .LBB_foo_0: ; entry fctiwz f0, f1 stfd f0, -8(r1) lwz r2, -4(r1) lis r3, ha16(.CPI_foo_0) lfs f0, lo16(.CPI_foo_0)(r3) fcmpu cr0, f1, f0 blt .LBB_foo_2 ; entry .LBB_foo_1: ; entry fsubs f0, f1, f0 fctiwz f0, f0 stfd f0, -16(r1) lwz r2, -12(r1) xoris r2, r2, 32768 .LBB_foo_2: ; entry rlwinm r3, r2, 0, 16, 31 blr llvm-svn: 22785	2005-08-14 01:20:53 +00:00
Nate Begeman	83f6b98c42	Make FP_TO_UINT Illegal. This allows us to generate significantly better codegen for FP_TO_UINT by using the legalizer's SELECT variant. Implement a codegen improvement for SELECT_CC, selecting the false node in the MBB that feeds the phi node. This allows us to codegen: void foo(int a, int b, int c) { int d = (a < b) ? 5 : 9; a = d; } as: _foo: li r2, 5 cmpw cr0, r4, r3 bgt .LBB_foo_2 ; entry .LBB_foo_1: ; entry li r2, 9 .LBB_foo_2: ; entry stw r2, 0(r3) blr insted of: _foo: li r2, 5 li r5, 9 cmpw cr0, r4, r3 bgt .LBB_foo_2 ; entry .LBB_foo_1: ; entry or r2, r5, r5 .LBB_foo_2: ; entry stw r2, 0(r3) blr llvm-svn: 22784	2005-08-14 01:17:16 +00:00
Andrew Lenharth	107a0a7690	Testing a variable before it is defined doesn't work so well. It is a fairly small thing, so just let everyone build the .a file llvm-svn: 22783	2005-08-13 14:58:23 +00:00
Chris Lattner	47d3ec3525	Ooops, don't forget to clear this. The real inner loop is now: .LBB_foo_3: ; no_exit.1 lfd f2, 0(r9) lfd f3, 8(r9) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r9) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfd f2, 0(r9) addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22782	2005-08-13 07:42:01 +00:00
Chris Lattner	5949d49032	Recursively scan scev expressions for common subexpressions. This allows us to handle nested loops much better, for example, by being able to tell that these two expressions: {( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1> {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> Have the following common part that can be shared: {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> This allows us to codegen an important inner loop in 168.wupwise as: .LBB_foo_4: ; no_exit.1 lfd f2, 16(r9) fmul f3, f0, f2 fmul f2, f1, f2 fadd f4, f3, f2 stfd f4, 8(r9) fsub f2, f3, f2 stfd f2, 16(r9) addi r8, r8, 1 addi r9, r9, 16 cmpw cr0, r8, r4 ble .LBB_foo_4 ; no_exit.1 instead of: .LBB_foo_3: ; no_exit.1 lfdx f2, r6, r9 add r10, r6, r9 lfd f3, 8(r10) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r10) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfdx f2, r6, r9 addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22781	2005-08-13 07:27:18 +00:00
Nate Begeman	dc3154ec66	Remove an unncessary argument to SimplifySelectCC and add an additional assert when creating a select_cc node. llvm-svn: 22780	2005-08-13 06:14:17 +00:00
Nate Begeman	b6651e81a0	Fix the fabs regression on x86 by abstracting the select_cc optimization out into SimplifySelectCC. This allows both ISD::SELECT and ISD::SELECT_CC to use the same set of simplifying folds. llvm-svn: 22779	2005-08-13 06:00:21 +00:00
Nate Begeman	a22bf778c9	Remove support for 64b PPC, it's been broken for a long time. It'll be back once a DAG->DAG ISel exists. llvm-svn: 22778	2005-08-13 05:59:16 +00:00
Andrew Lenharth	6b62b479fa	Fix oversized GOT problem with gcc-4 on alpha llvm-svn: 22777	2005-08-13 05:09:50 +00:00
Chris Lattner	89c1dfc733	Teach SplitCriticalEdge to update LoopInfo if it is alive. This fixes a problem in LoopStrengthReduction, where it would split critical edges then confused itself with outdated loop information. llvm-svn: 22776	2005-08-13 01:38:43 +00:00
Chris Lattner	79396539d3	remove dead code. The exit block list is computed on demand, thus does not need to be updated. This code is a relic from when it did. llvm-svn: 22775	2005-08-13 01:30:36 +00:00
Chris Lattner	21381e8424	implement a couple of simple shift foldings. e.g. (X & 7) >> 3 -> 0 llvm-svn: 22774	2005-08-12 23:54:58 +00:00
Jim Laskey	35960708b7	Fix for 2005-08-12-rlwimi-crash.ll. Make allowance for masks being shifted to zero. llvm-svn: 22773	2005-08-12 23:52:46 +00:00
Jim Laskey	461edda709	Added test cases to guarantee use of ORC and ANDC. llvm-svn: 22772	2005-08-12 23:40:14 +00:00
Jim Laskey	a568700618	1. This changes handles the cases of (~x)&y and x&(~y) yielding ANDC, and (~x)\|y and x\|(~y) yielding ORC. llvm-svn: 22771	2005-08-12 23:38:02 +00:00
Chris Lattner	f6a762ada1	testcase that crashed the ppc backend, distilled from crafty llvm-svn: 22770	2005-08-12 23:34:03 +00:00
Chris Lattner	8447b49526	When splitting critical edges, make sure not to leave the new block in the middle of the loop. This turns a critical loop in gzip into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 bne .LBB_test_8 ; loopentry.loopexit_crit_edge .LBB_test_2: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge .LBB_test_3: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge .LBB_test_4: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry instead of this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_3: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 beq .LBB_test_5 ; shortcirc_next.1 .LBB_test_4: ; shortcirc_next.0.loopexit_crit_edge add r2, r11, r27 add r8, r12, r27 b .LBB_test_9 ; loopexit .LBB_test_5: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 beq .LBB_test_7 ; shortcirc_next.2 .LBB_test_6: ; shortcirc_next.1.loopexit_crit_edge add r2, r9, r27 add r8, r10, r27 b .LBB_test_9 ; loopexit .LBB_test_7: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry Next up, improve the code for the loop. llvm-svn: 22769	2005-08-12 22:22:17 +00:00
Chris Lattner	e09bbc800c	Add a helper method llvm-svn: 22768	2005-08-12 22:14:06 +00:00
Chris Lattner	1344253e3f	add a helper method llvm-svn: 22767	2005-08-12 22:13:27 +00:00
Chris Lattner	4fec86d348	Fix a FIXME: if we are inserting code for a PHI argument, split the critical edge so that the code is not always executed for both operands. This prevents LSR from inserting code into loops whose exit blocks contain PHI uses of IV expressions (which are outside of loops). On gzip, for example, we turn this ugly code: .LBB_test_1: ; loopentry add r27, r3, r28 lhz r27, 3(r27) add r26, r4, r28 lhz r26, 3(r26) add r25, r30, r28 ;; Only live if exiting the loop add r24, r29, r28 ;; Only live if exiting the loop cmpw cr0, r27, r26 bne .LBB_test_5 ; loopexit into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_2: ; shortcirc_next.0 ... blt .LBB_test_1 into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_t_3: ; shortcirc_next.0 .LBB_test_3: ; shortcirc_next.0 ... blt .LBB_test_1 Next step: get the block out of the loop so that the loop is all fall-throughs again. llvm-svn: 22766	2005-08-12 22:06:11 +00:00
Chris Lattner	b7ebe65c56	Change break critical edges to not remove, then insert, PHI node entries. Instead, just update the BB in-place. This is both faster, and it prevents split-critical-edges from shuffling the PHI argument list unneccesarily. llvm-svn: 22765	2005-08-12 21:58:07 +00:00
Andrew Lenharth	8c6701be6e	match gcc's use of tabs, makes diffs easier llvm-svn: 22764	2005-08-12 16:14:08 +00:00
Andrew Lenharth	ca94102d3e	.section cleanup, patch from Nicholas Riley llvm-svn: 22763	2005-08-12 16:13:43 +00:00
Chris Lattner	4ac8b3f781	First rev of Xcode 2.1 project llvm-svn: 22762	2005-08-11 22:19:26 +00:00
Jim Laskey	a50f770a2c	1. Added the function isOpcWithIntImmediate to simplify testing of operand with specified opcode and an integer constant right operand. 2. Modified ISD::SHL, ISD::SRL, ISD::SRA to use rlwinm when applied after a mask. llvm-svn: 22761	2005-08-11 21:59:23 +00:00
Chris Lattner	d418d752f4	Tidied up the use of dyn_cast<ConstantSDNode> by using isIntImmediate more. Patch by Jim Laskey. llvm-svn: 22760	2005-08-11 17:56:50 +00:00
Chris Lattner	c5e1312baa	Use a more efficient method of creating integer and float virtual registers (avoids an extra level of indirection in MakeReg). defined MakeIntReg using RegMap->createVirtualRegister(PPC32::GPRCRegisterClass) defined MakeFPReg using RegMap->createVirtualRegister(PPC32::FPRCRegisterClass) s/MakeReg(MVT::i32)/MakeIntReg/ s/MakeReg(MVT::f64)/MakeFPReg/ Patch by Jim Laskey! llvm-svn: 22759	2005-08-11 17:15:31 +00:00
Nate Begeman	5c7656fd53	Add a select_cc optimization for recognizing abs(int). This speeds up an integer MPEG encoding loop by a factor of two. llvm-svn: 22758	2005-08-11 02:18:13 +00:00
Nate Begeman	180b08897f	Some SELECT_CC cleanups: 1. move assertions for node creation to getNode() 2. legalize the values returned in ExpandOp immediately 3. Move select_cc optimizations from SELECT's getNode() to SELECT_CC's, allowing them to be cleaned up significantly. This paves the way to pick up additional optimizations on SELECT_CC, such as sum-of-absolute-differences. llvm-svn: 22757	2005-08-11 01:12:20 +00:00
Nate Begeman	5646b181e8	Make SELECT illegal on PPC32, switch to using SELECT_CC, which more closely reflects what the hardware is capable of. This significantly simplifies the CC handling logic throughout the ISel. llvm-svn: 22756	2005-08-10 20:52:09 +00:00

1 2 3 4 5 ...

19501 Commits