Commit Graph

19478 Commits

Author SHA1 Message Date
Chris Lattner 9af011cc11 new testcase
llvm-svn: 22743
2005-08-10 01:11:24 +00:00
Chris Lattner 35c0e2ee33 Fix an obvious oops
llvm-svn: 22742
2005-08-10 00:59:40 +00:00
Chris Lattner 08c7fd19e1 new testcase we handle
llvm-svn: 22741
2005-08-10 00:48:11 +00:00
Chris Lattner edff91a49a Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride.
For code like this:

void foo(float *a, float *b, int n, int stride_a, int stride_b) {
  int i;
  for (i=0; i<n; i++)
      a[i*stride_a] = b[i*stride_b];
}

we now emit:

.LBB_foo2_2:    ; no_exit
        lfs f0, 0(r4)
        stfs f0, 0(r3)
        addi r7, r7, 1
        add r4, r2, r4
        add r3, r6, r3
        cmpw cr0, r7, r5
        blt .LBB_foo2_2 ; no_exit

instead of:

.LBB_foo_2:     ; no_exit
        mullw r8, r2, r7     ;; multiply!
        slwi r8, r8, 2
        lfsx f0, r4, r8
        mullw r8, r2, r6     ;; multiply!
        slwi r8, r8, 2
        stfsx f0, r3, r8
        addi r2, r2, 1
        cmpw cr0, r2, r5
        blt .LBB_foo_2  ; no_exit

loops with variable strides occur pretty often.  For example, in SPECFP2K
there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp,
56 in 168.wupwise, 36 in 172.mgrid.

Now we can allow indvars to turn functions written like this:

void foo2(float *a, float *b, int n, int stride_a, int stride_b) {
  int i, ai = 0, bi = 0;
  for (i=0; i<n; i++)
    {
      a[ai] = b[bi];
      ai += stride_a;
      bi += stride_b;
    }
}

into code like the above for better analysis.  With this patch, they generate
identical code.

llvm-svn: 22740
2005-08-10 00:45:21 +00:00
Chris Lattner dde7dc525e Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll
by being more careful about updating PHI nodes

llvm-svn: 22739
2005-08-10 00:35:32 +00:00
Chris Lattner 832105d511 new testcase
llvm-svn: 22738
2005-08-10 00:33:01 +00:00
Chris Lattner c6c4d99a21 Fix some 80 column violations.
Once we compute the evolution for a GEP, tell SE about it.  This allows users
of the GEP to know it, if the users are not direct.  This allows us to compile
this testcase:

void fbSolidFillmmx(int w, unsigned char *d) {
    while (w >= 64) {
        *(unsigned long long *) (d +  0) = 0;
        *(unsigned long long *) (d +  8) = 0;
        *(unsigned long long *) (d + 16) = 0;
        *(unsigned long long *) (d + 24) = 0;
        *(unsigned long long *) (d + 32) = 0;
        *(unsigned long long *) (d + 40) = 0;
        *(unsigned long long *) (d + 48) = 0;
        *(unsigned long long *) (d + 56) = 0;
        w -= 64;
        d += 64;
    }
}

into:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r2, 0
        stw r2, 0(r4)
        stw r2, 4(r4)
        stw r2, 8(r4)
        stw r2, 12(r4)
        stw r2, 16(r4)
        stw r2, 20(r4)
        stw r2, 24(r4)
        stw r2, 28(r4)
        stw r2, 32(r4)
        stw r2, 36(r4)
        stw r2, 40(r4)
        stw r2, 44(r4)
        stw r2, 48(r4)
        stw r2, 52(r4)
        stw r2, 56(r4)
        stw r2, 60(r4)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

instead of:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r11, 0
        stw r11, 0(r4)
        stw r11, 4(r4)
        stwx r11, r10, r4
        add r12, r10, r4
        stw r11, 4(r12)
        stwx r11, r9, r4
        add r12, r9, r4
        stw r11, 4(r12)
        stwx r11, r8, r4
        add r12, r8, r4
        stw r11, 4(r12)
        stwx r11, r7, r4
        add r12, r7, r4
        stw r11, 4(r12)
        stwx r11, r6, r4
        add r12, r6, r4
        stw r11, 4(r12)
        stwx r11, r5, r4
        add r12, r5, r4
        stw r11, 4(r12)
        stwx r11, r2, r4
        add r12, r2, r4
        stw r11, 4(r12)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

llvm-svn: 22737
2005-08-09 23:39:36 +00:00
Chris Lattner b310ac4a86 implement two helper methods
llvm-svn: 22736
2005-08-09 23:36:33 +00:00
Chris Lattner 67017db5c6 add two helper methods
llvm-svn: 22735
2005-08-09 23:36:18 +00:00
Chris Lattner 679f5b0b40 Fix spelling, fix some broken canonicalizations by my last patch
llvm-svn: 22734
2005-08-09 23:09:05 +00:00
Chris Lattner fc070f1fd2 I can't believe I caught this before Misha! :)
llvm-svn: 22733
2005-08-09 23:08:53 +00:00
Chris Lattner 54ee86aca7 add a optimization note
llvm-svn: 22732
2005-08-09 22:30:57 +00:00
Chris Lattner 14e060f743 add cc nodes to the AllNodes list so they show up in Graphviz output
llvm-svn: 22731
2005-08-09 20:40:02 +00:00
Chris Lattner 080f741f50 Add testcases for new rlwinm cases handled, patch by Jim Laskey!
llvm-svn: 22730
2005-08-09 20:24:16 +00:00
Chris Lattner 6ec7745e80 Update the targets to the new SETCC/CondCodeSDNode interfaces.
llvm-svn: 22729
2005-08-09 20:21:10 +00:00
Chris Lattner d47675ed24 Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls the
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf.  This will make it possible for other node to use
CC's as operands in the future...

llvm-svn: 22728
2005-08-09 20:20:18 +00:00
Chris Lattner 2035c4f7f8 Minor cleanup patch, no functionality changes. Written by Jim Laskey.
llvm-svn: 22727
2005-08-09 18:29:55 +00:00
Chris Lattner 4c62c647c2 Fix CodeGen/Generic/div-neg-power-2.ll, a regression from last night.
llvm-svn: 22726
2005-08-09 18:08:41 +00:00
Chris Lattner 91fca092db new reg test for a failure last night on ppc/darwin
llvm-svn: 22725
2005-08-09 18:07:45 +00:00
Chris Lattner 02742710f3 SCEVAddExpr::get() of an empty list is invalid.
llvm-svn: 22724
2005-08-09 01:13:47 +00:00
Chris Lattner 23e3fb9e83 This is now implemented
llvm-svn: 22723
2005-08-09 00:19:44 +00:00
Chris Lattner a091ff1764 Implement: LoopStrengthReduce/share_ivs.ll
Two changes:
  * Only insert one PHI node for each stride.  Other values are live in
    values.  This cannot introduce higher register pressure than the
    previous approach, and can take advantage of reg+reg addressing modes.
  * Factor common base values out of uses before moving values from the
    base to the immediate fields.  This improves codegen by starting the
    stride-specific PHI node out at a common place for each IV use.

As an example, we used to generate this for a loop in swim:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfd f0, 0(r8)
        stfd f0, 0(r3)
        lfd f0, 0(r6)
        stfd f0, 0(r7)
        lfd f0, 0(r2)
        stfd f0, 0(r5)
        addi r9, r9, 1
        addi r2, r2, 8
        addi r5, r5, 8
        addi r6, r6, 8
        addi r7, r7, 8
        addi r8, r8, 8
        addi r3, r3, 8
        cmpw cr0, r9, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

now we emit:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfdx f0, r8, r2
        stfdx f0, r9, r2
        lfdx f0, r5, r2
        stfdx f0, r7, r2
        lfdx f0, r3, r2
        stfdx f0, r6, r2
        addi r10, r10, 1
        addi r2, r2, 8
        cmpw cr0, r10, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

As another more dramatic example, we used to emit this:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfd f0, 8(r21)
        lfd f4, 8(r3)
        lfd f5, 8(r27)
        lfd f6, 8(r22)
        lfd f7, 8(r5)
        lfd f8, 8(r6)
        lfd f9, 8(r30)
        lfd f10, 8(r11)
        lfd f11, 8(r12)
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfd f0, 8(r4)
        lfd f0, 8(r25)
        lfd f5, 8(r26)
        lfd f6, 8(r23)
        lfd f9, 8(r28)
        lfd f10, 8(r10)
        lfd f12, 8(r9)
        lfd f13, 8(r29)
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfd f0, 8(r24)
        lfd f0, 8(r8)
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfd f0, 8(r2)
        addi r20, r20, 1
        addi r2, r2, 8
        addi r8, r8, 8
        addi r10, r10, 8
        addi r12, r12, 8
        addi r6, r6, 8
        addi r29, r29, 8
        addi r28, r28, 8
        addi r26, r26, 8
        addi r25, r25, 8
        addi r24, r24, 8
        addi r5, r5, 8
        addi r23, r23, 8
        addi r22, r22, 8
        addi r3, r3, 8
        addi r9, r9, 8
        addi r11, r11, 8
        addi r30, r30, 8
        addi r27, r27, 8
        addi r21, r21, 8
        addi r4, r4, 8
        cmpw cr0, r20, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

we now emit:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfdx f0, r21, r20
        lfdx f4, r3, r20
        lfdx f5, r27, r20
        lfdx f6, r22, r20
        lfdx f7, r5, r20
        lfdx f8, r6, r20
        lfdx f9, r30, r20
        lfdx f10, r11, r20
        lfdx f11, r12, r20
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfdx f0, r4, r20
        lfdx f0, r25, r20
        lfdx f5, r26, r20
        lfdx f6, r23, r20
        lfdx f9, r28, r20
        lfdx f10, r10, r20
        lfdx f12, r9, r20
        lfdx f13, r29, r20
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfdx f0, r24, r20
        lfdx f0, r8, r20
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfdx f0, r2, r20
        addi r19, r19, 1
        addi r20, r20, 8
        cmpw cr0, r19, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

llvm-svn: 22722
2005-08-09 00:18:09 +00:00
Chris Lattner 37c24cc98c Suck the base value out of the UsersToProcess vector into the BasedUser
class to simplify the code.  Fuse two loops.

llvm-svn: 22721
2005-08-08 22:56:21 +00:00
Chris Lattner 37ed895bf1 Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The
first is a correctness thing, and the later is an optzn thing.  This also
is needed to support a future change.

llvm-svn: 22720
2005-08-08 22:32:34 +00:00
Nate Begeman c92787e1f5 Factor out some common code, and be smarter about when to emit load hi/lo
code sequences.

llvm-svn: 22719
2005-08-08 22:22:56 +00:00
Chris Lattner 319292a666 A testcase I don't want to break in the future
llvm-svn: 22718
2005-08-08 22:13:49 +00:00
Chris Lattner d09a9a788b Allow tools with "consume after" options (like lli) to take more positional
opts than they take directly.  Thanks to John C for pointing this problem
out to me!

llvm-svn: 22717
2005-08-08 21:57:27 +00:00
Chris Lattner 64068eb7da Remove getImmediateForOpcode, which is now dead.
Patch by Jim Laskey.

llvm-svn: 22716
2005-08-08 21:34:13 +00:00
Chris Lattner 25388199a2 Add new immediate handling support for mul/div.
Patch by Jim Laskey!

llvm-svn: 22715
2005-08-08 21:33:23 +00:00
Chris Lattner 8e9dc31928 Add support for OR/XOR/SUB immediates that are handled with the new immediate
way.  This allows ORI/ORIS pairs, for example.

llvm-svn: 22714
2005-08-08 21:30:29 +00:00
Chris Lattner fd0fe76ba6 Modify the ISD::AND opcode case to use new immediate constant predicates.
Includes wider support for rotate and mask cases.

Patch by Jim Laskey.

I've requested that Jim add new regression tests the newly handled cases.

llvm-svn: 22712
2005-08-08 21:24:57 +00:00
Chris Lattner 81e0e3e933 Modify the ISD::ADD opcode case to use new immediate constant predicates.
Includes support for 32-bit constants using addi/addis.

Patch by Jim Laskey.

llvm-svn: 22711
2005-08-08 21:21:03 +00:00
Chris Lattner 4c54dae243 Modify existing support functions to use new immediate constant predicates.
Patch by Jim Laskey

llvm-svn: 22710
2005-08-08 21:12:35 +00:00
Chris Lattner 3cc070cc36 Add support predicates for future immediate constant changes.
Patch by Jim Laskey

llvm-svn: 22709
2005-08-08 21:10:27 +00:00
Chris Lattner f2267ed5c5 Move IsRunOfOnes to a more logical place and rename to a proper predicate form
(lowercase isXXX).

Patch by Jim Laskey.

llvm-svn: 22708
2005-08-08 21:08:09 +00:00
Nate Begeman 9a838678b0 Fix JIT encoding of ppc mfocrf instruction; the operands were reversed
llvm-svn: 22707
2005-08-08 20:04:52 +00:00
Chris Lattner 9f269e40c9 Use the new 'moveBefore' method to simplify some code. Really, which is
easier to understand?  :)

llvm-svn: 22706
2005-08-08 19:11:57 +00:00
Chris Lattner d380d8412d Reject command lines that have too many positional arguments passed (e.g.,
'opt x y').  This fixes PR493.

Patch contributed by Owen Anderson!

llvm-svn: 22705
2005-08-08 17:25:38 +00:00
Chris Lattner 14203e85b2 Not all constants are legal immediates in load/store instructions.
llvm-svn: 22704
2005-08-08 06:25:50 +00:00
Chris Lattner ac86efb6a3 new testcase, not implemented yet
llvm-svn: 22703
2005-08-08 06:23:47 +00:00
Chris Lattner c70bbc0c41 Implement LoopStrengthReduce/share_code_in_preheader.ll by having one
rewriter for all code inserted into the preheader, which is never flushed.

llvm-svn: 22702
2005-08-08 05:47:49 +00:00
Chris Lattner 03b6275fb8 It is better to not depend on CSE to share multiplies due to IV insertion.
This testcase checks that only one mul is present in the output code, as it
should be.

llvm-svn: 22701
2005-08-08 05:46:51 +00:00
Chris Lattner 7eecc6cb24 These are both implemented by a recent LSR patch
llvm-svn: 22700
2005-08-08 05:29:51 +00:00
Chris Lattner 9bfa6f8784 Implement a simple optimization for the termination condition of the loop.
The termination condition actually wants to use the post-incremented value
of the loop, not a new indvar with an unusual base.

On PPC, for example, this allows us to compile
LoopStrengthReduce/exit_compare_live_range.ll to:

_foo:
        li r2, 0
.LBB_foo_1:     ; no_exit
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        cmpw cr0, r2, r4
        bne .LBB_foo_1  ; no_exit
        blr

instead of:

_foo:
        li r2, 1                ;; IV starts at 1, not 0
.LBB_foo_1:     ; no_exit
        li r5, 0
        stw r5, 0(r3)
        addi r5, r2, 1
        cmpw cr0, r2, r4
        or r2, r5, r5           ;; Reg-reg copy, extra live range
        bne .LBB_foo_1  ; no_exit
        blr

This implements LoopStrengthReduce/exit_compare_live_range.ll

llvm-svn: 22699
2005-08-08 05:28:22 +00:00
Chris Lattner 24a0a43cb0 add new helper function
llvm-svn: 22698
2005-08-08 05:21:50 +00:00
Chris Lattner 9dae86a54d add a new helper method
llvm-svn: 22697
2005-08-08 05:21:33 +00:00
Chris Lattner 88e2d2ee6b Handle 64-bit constant exprs on 64-bit targets.
llvm-svn: 22696
2005-08-08 04:26:32 +00:00
Chris Lattner 579b20b747 All stats are "Number of ..."
llvm-svn: 22694
2005-08-07 20:02:04 +00:00
Chris Lattner 2c14cf7b74 Add some simple folds that occur in bitfield cases. Fix a minor bug in
isHighOnes, where it would consider 0 to have high ones.

llvm-svn: 22693
2005-08-07 07:03:10 +00:00
Chris Lattner 134ebd0801 Fix typoCVS: ----------------------------------------------------------------------
llvm-svn: 22692
2005-08-07 07:00:52 +00:00