Commit Graph

20502 Commits

Author SHA1 Message Date
Chris Lattner 6bd8fd09b6 make sure that -view-isel-dags is the input to the isel, not the input to
the second phase of dag combining

llvm-svn: 23631
2005-10-05 06:09:10 +00:00
Chris Lattner 746d50a01a Fix a crash compiling Olden/tsp
llvm-svn: 23630
2005-10-05 04:45:43 +00:00
Chris Lattner bb08795695 Add some rules for building preprocessed files
llvm-svn: 23629
2005-10-05 00:28:41 +00:00
Chris Lattner 3b793c6521 refactor a bit of code.
When moving constant entries in 'Map' if the entry is the representative
constant for the abstractypemap, make sure to update it as well.  This
fixes the bcreader failures from last night on several C++ apps.

llvm-svn: 23628
2005-10-04 21:35:50 +00:00
Chris Lattner dff59118c6 Minor speedup to avoid array searches given a Use*. This speeds up bc reading
of the python test from 1:00 to 54s.

llvm-svn: 23627
2005-10-04 18:47:09 +00:00
Chris Lattner 7a1450dbc6 Change the signature of replaceUsesOfWithOnConstant. The bool was always
true dynamically.  Finally, pass the Use* that replaceAllUsesWith has into
the method for future use.

llvm-svn: 23626
2005-10-04 18:13:04 +00:00
Chris Lattner 5188716344 Change the signature of replaceUsesOfWithOnConstant to take a Use* and not
take the bool.  The bool is always true dynamically.

llvm-svn: 23625
2005-10-04 18:12:13 +00:00
Chris Lattner 935aa922e3 For large constants (e.g. arrays and structs with many elements) just
creating the keys and doing comparisons to index into 'Map' takes a lot
of time.  For these large constants, keep an inverse map so that 'remove'
and move operations are much faster.

This speeds up a release build of the bc reader on Eric's nasty python
bytecode file from 1:39 to 1:00s.

llvm-svn: 23624
2005-10-04 17:48:46 +00:00
Chris Lattner 5bbf60a5b6 minor cleanup/fastpath for the bcreader. This speeds up the bcreader
from 1:41 -> 1:39 on the large python .bc file in a release build.

llvm-svn: 23623
2005-10-04 16:52:46 +00:00
Jim Laskey 327d4298e1 Reverting to version - until problem isolated.
llvm-svn: 23622
2005-10-04 16:41:51 +00:00
Chris Lattner d1a5bc8dbd Add a forward def
llvm-svn: 23621
2005-10-04 05:09:20 +00:00
Nate Begeman 5da6908d65 Fix some faulty logic in the libcall inserter.
Since calls return more than one value, don't bail if one of their uses
happens to be a node that's not an MVT::Other when following the chain
from CALLSEQ_START to CALLSEQ_END.

Once we've found a CALLSEQ_START, we can just return; there's no need to
tail-recurse further up the graph.

Most importantly, just because something only has one use doesn't mean we
should use it's one use to follow from start to end.  This faulty logic
caused us to follow a chain of one-use FP operations back to a much earlier
call, putting a cycle in the graph from a later start to an earlier end.

This is a better fix that reverting to the workaround committed earlier
today.

llvm-svn: 23620
2005-10-04 02:10:55 +00:00
Chris Lattner 8760ec73d8 implement the struct version of the array speedup, speeding up the
testcase a bit more from 1:48 -> 1.40.

llvm-svn: 23619
2005-10-04 01:17:50 +00:00
Chris Lattner 20b0754c41 Fix DemoteRegToStack on an invoke. This fixes PR634.
llvm-svn: 23618
2005-10-04 00:44:01 +00:00
Nate Begeman 54fb5002e5 Add back a workaround that fixes some breakages from chris's last change.
Neither of us have yet figured out why this code is necessary, but stuff
breaks if its not there.  Still tracking this down...

llvm-svn: 23617
2005-10-04 00:37:37 +00:00
Chris Lattner 4c3b2b536c Clean up the code a bit. Use isInstructionTriviallyDead to be more aggressive
and more correct than use_empty().  This fixes PR635 and
SimplifyCFG/2005-10-02-InvokeSimplify.ll

llvm-svn: 23616
2005-10-03 23:43:43 +00:00
Chris Lattner a6e98f2e85 new testcase for PR635
llvm-svn: 23615
2005-10-03 23:42:54 +00:00
Chris Lattner b64419ac40 Change ConstantArray::replaceUsesOfWithOnConstant to attempt to update
constant arrays in place instead of reallocating them and replaceAllUsesOf'ing
the result.  This speeds up a release build of the bcreader from:

136.987u 120.866s 4:24.38
to
49.790u 49.890s 1:40.14

... a 2.6x speedup parsing a large python bc file.

llvm-svn: 23614
2005-10-03 22:51:37 +00:00
Chris Lattner c4062ba65f move some methods, no other changes
llvm-svn: 23613
2005-10-03 21:58:36 +00:00
Chris Lattner 0144fadc17 minor microoptimizations
llvm-svn: 23612
2005-10-03 21:56:24 +00:00
Chris Lattner bad09e71d0 Use a map to cache the ModuleType information, so we can do logarithmic
lookups instead of linear time lookups.  This speeds up bc parsing of a
large file from

137.834u 118.256s 4:27.96
to
132.611u 114.436s 4:08.53

with a release build.

llvm-svn: 23611
2005-10-03 21:26:53 +00:00
Jim Laskey 409a6b204e Refactor gathering node info and emission.
llvm-svn: 23610
2005-10-03 12:30:32 +00:00
Chris Lattner 57b21f9f10 clean up this code a bit, no functionality change
llvm-svn: 23609
2005-10-03 07:22:07 +00:00
Chris Lattner afef68baff Speed up the asm printer a lot by not printing formatted LLVM asm output
for globals

llvm-svn: 23608
2005-10-03 07:08:36 +00:00
Chris Lattner 5f096e2847 Break the body of the loop out into a new method
llvm-svn: 23606
2005-10-03 04:47:08 +00:00
Chris Lattner 1687459559 Fix case of path
llvm-svn: 23605
2005-10-03 03:32:39 +00:00
Chris Lattner f07a587c79 Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In
particular, it should realize that phi's use their values in the pred block
not the phi block itself.  This change turns our em3d loop from this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_6    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; endif.loopexit.loopexit_crit_edge
        addi r3, r2, 1
        blr
LBB_test_6:     ; loopexit
        or r3, r2, r2
        blr

into:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r6, r6
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        or r2, r6, r6
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r2, r2
        blr


Unfortunately, this is actually worse code, because the register coallescer
is getting confused somehow.  If it were doing its job right, it could turn the
code into this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r6, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r6, r6
        blr

... which I'll work on next. :)

llvm-svn: 23604
2005-10-03 02:50:05 +00:00
Chris Lattner e4ed42a426 Refactor some code into a function
llvm-svn: 23603
2005-10-03 01:04:44 +00:00
Chris Lattner 360928dbed This break is bogus and I have no idea why it was there. Basically it prevents
memoizing code when IV's are used by phinodes outside of loops.  In a simple
example, we were getting this code before (note that r6 and r7 are isomorphic
IV's):

        li r6, 0
        or r7, r6, r6
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r7, r7
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r2, r7, 1
        addi r7, r7, 1
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

Now we get:

        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

this was noticed in em3d.

llvm-svn: 23602
2005-10-03 00:37:33 +00:00
Chris Lattner 8fcce170cf when checking if we should move a split edge block outside of a loop,
check the presplit pred, not the post-split pred.  This was causing us
to make the wrong decision in some cases, leaving the critical edge block
in the loop.

llvm-svn: 23601
2005-10-03 00:31:52 +00:00
Chris Lattner 77676d5bc2 This member can be const too
llvm-svn: 23600
2005-10-03 00:21:25 +00:00
Chris Lattner e51d6a9f70 put the right labels on the data
llvm-svn: 23599
2005-10-02 21:51:38 +00:00
Chris Lattner 9cfccfb517 Fix a problem where the legalizer would run out of stack space on extremely
large basic blocks because it was purely recursive.  This switches it to an
iterative/recursive hybrid.

llvm-svn: 23596
2005-10-02 17:49:46 +00:00
Chris Lattner 7f718e61e8 silence a bogus warning
llvm-svn: 23595
2005-10-02 16:30:51 +00:00
Chris Lattner 9982da2703 silence some warnings
llvm-svn: 23594
2005-10-02 16:29:36 +00:00
Chris Lattner c0e655b65d silence a warning
llvm-svn: 23593
2005-10-02 16:27:59 +00:00
Chris Lattner 68303a78ff add patterns for float binops and fma ops
llvm-svn: 23592
2005-10-02 07:46:28 +00:00
Chris Lattner 98da1d9910 Sort the cpu and features table, so that the alpha backend doesn't fail EVERY
compile with an assertion that the tables are not sorted!

llvm-svn: 23591
2005-10-02 07:13:52 +00:00
Chris Lattner 704d97f8b2 Add assertions to the trivial scheduler to check that the value types match
up between defs and uses.

llvm-svn: 23590
2005-10-02 07:10:55 +00:00
Chris Lattner 3734d204b8 another solution to the fsel issue. Instead of having 4 variants, just force
the comparison to be 64-bits.  This is fine because extensions from float
to double are free.

llvm-svn: 23589
2005-10-02 07:07:49 +00:00
Chris Lattner 9e98672962 fsel can take a different FP type for the comparison and for the result. As such
split the FSEL family into 4 things instead of just two.

llvm-svn: 23588
2005-10-02 06:58:23 +00:00
Chris Lattner a17e6c486c fix an f32/f64 type mismatch
llvm-svn: 23587
2005-10-02 06:37:13 +00:00
Chris Lattner a038d901fb Codegen CopyFromReg using the regclass that matches the valuetype of the
destination vreg.

llvm-svn: 23586
2005-10-02 06:34:16 +00:00
Chris Lattner 4155ae0f74 Adjust to change in ctor
llvm-svn: 23585
2005-10-02 06:23:51 +00:00
Chris Lattner d4ff3c1324 Emit the value type for each register class.
llvm-svn: 23584
2005-10-02 06:23:37 +00:00
Chris Lattner 0bc697eae7 Expose the actual valuetype of each register class
llvm-svn: 23583
2005-10-02 06:23:19 +00:00
Chris Lattner 5ab9d42bb4 Minor tweak to the branch selector. When emitting a two-way branch, and if
we're in a single-mbb loop, make sure to emit the backwards branch as the
conditional branch instead of the uncond branch.  For example, emit this:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        ble cr0, LBBl29_z__44
        b LBBl29_z__48                   *** NOT PART OF LOOP

Instead of:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        bgt cr0, LBBl29_z__48            *** PART OF LOOP!
        b LBBl29_z__44

The former sequence has one fewer dispatch group for the loop body.

llvm-svn: 23582
2005-10-01 23:06:26 +00:00
Chris Lattner 6f4dc51d6f like the comment says, enable this
llvm-svn: 23581
2005-10-01 23:02:40 +00:00
Chris Lattner 5a7bfe0b72 Add some very paranoid checking for operand/result reg class matchup
For instructions that define multiple results, use the right regclass
to define the result, not always the rc of result #0

llvm-svn: 23580
2005-10-01 07:45:09 +00:00
Jeff Cohen f8a5e5ae6e Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner 8713ebf37c fix typo
llvm-svn: 23578
2005-10-01 02:51:36 +00:00
Chris Lattner d3eee1a09b Modify the ppc backend to use two register classes for FP: F8RC and F4RC.
These are used to represent float and double values, and the two regclasses
contain the same physical registers.

llvm-svn: 23577
2005-10-01 01:35:02 +00:00
Chris Lattner afdc9d25db Annotate nodes with their addresses if a graph requests it.
This is Jim's feature implemented so that graphs could 'opt-in' and get
this behavior.  This is currently used by selection dags.

llvm-svn: 23576
2005-10-01 00:19:21 +00:00
Chris Lattner fda6944c5b add a method
llvm-svn: 23575
2005-10-01 00:17:07 +00:00
Jim Laskey d3850457a1 typo
llvm-svn: 23574
2005-10-01 00:08:23 +00:00
Jim Laskey 9d96932879 1. Simplify the gathering of node groups.
2. Printing node groups when displaying nodes.

llvm-svn: 23573
2005-10-01 00:03:07 +00:00
Jim Laskey f61232354f Should be using flag and not chain.
llvm-svn: 23572
2005-09-30 23:43:37 +00:00
Nate Begeman fbfad0b565 Remove some now-dead code.
llvm-svn: 23571
2005-09-30 21:28:27 +00:00
Andrew Lenharth 5b8bd94ab2 more specific tests of subtarget stuff
llvm-svn: 23570
2005-09-30 20:30:24 +00:00
Andrew Lenharth 49e48f6234 subtarget support for CIX and FIX extentions (the only 2 I care about right now)
llvm-svn: 23569
2005-09-30 20:24:38 +00:00
Jim Laskey 90b34c1865 Reverting change moving to selection dag graph.
llvm-svn: 23568
2005-09-30 19:33:41 +00:00
Jim Laskey 3059965a4b Added allnodes_size for scheduling support.
llvm-svn: 23567
2005-09-30 19:27:01 +00:00
Jim Laskey 3fe3841c2a 1. Made things node-centric (from operand).
2. Added node groups to handle flagged nodes.

3. Started weaning simple scheduling off existing emitter.

llvm-svn: 23566
2005-09-30 19:15:27 +00:00
Jim Laskey fe59ae2b11 Add the node name (thus the address) to node label.
llvm-svn: 23565
2005-09-30 19:11:53 +00:00
Chris Lattner c9f4219cfc Rename MRegisterDesc -> TargetRegisterDesc for consistency
llvm-svn: 23564
2005-09-30 17:49:27 +00:00
Chris Lattner 57b8ae71e0 Update the discussion of TargetRegisterDesc
llvm-svn: 23563
2005-09-30 17:46:55 +00:00
Chris Lattner 3e020bb619 remove some more initializers
llvm-svn: 23562
2005-09-30 17:41:05 +00:00
Chris Lattner 81f32a2acb trim down the target info structs now that we have a preferred spill register class for each callee save register
Why is V9 maintaining these tables manually? ugh!

llvm-svn: 23561
2005-09-30 17:38:36 +00:00
Chris Lattner ddc69bbbba trim down the target info structs now that we have a preferred spill register class for each callee save register
llvm-svn: 23560
2005-09-30 17:35:22 +00:00
Chris Lattner 2e794c9198 now that we have a reg class to spill with, get this info from the regclass
llvm-svn: 23559
2005-09-30 17:19:22 +00:00
Chris Lattner 88025e17c5 constant fold these calls
llvm-svn: 23558
2005-09-30 17:16:59 +00:00
Chris Lattner bb1c9ecb17 simplify this code using the new regclass info passed in
llvm-svn: 23557
2005-09-30 17:12:38 +00:00
Chris Lattner 51878189c5 Now that we have getCalleeSaveRegClasses() info, use it to pass the register
class into the spill/reload methods.  Targets can now rely on that argument.

llvm-svn: 23556
2005-09-30 16:59:07 +00:00
Chris Lattner fbc60722b9 expose a new virtual method
llvm-svn: 23555
2005-09-30 07:06:37 +00:00
Chris Lattner 8688b92b86 stub out a virtual method
llvm-svn: 23554
2005-09-30 06:55:18 +00:00
Chris Lattner da6fcc9f49 Compute a preferred spill register class for each callee-save register
llvm-svn: 23553
2005-09-30 06:44:45 +00:00
Chris Lattner 4984e99b83 CR registers are not used by this "target"
llvm-svn: 23552
2005-09-30 06:43:58 +00:00
Chris Lattner 6169a78f46 these registers don't belong to any register classes, so don't mark them
as callee save.  They can never be generated by the compiler.

llvm-svn: 23551
2005-09-30 06:42:24 +00:00
Chris Lattner 26f5fb1277 Fix a warning
llvm-svn: 23550
2005-09-30 06:09:50 +00:00
Chris Lattner 1916ef75cf Regenerate
llvm-svn: 23549
2005-09-30 04:53:25 +00:00
Chris Lattner b509577605 Refactor this a bit to move ParsingTemplateArgs to only apply to classes,
not defs.

Implement support for forward definitions of classes.  This implements
TableGen/ForwardRef.td.

llvm-svn: 23548
2005-09-30 04:53:04 +00:00
Chris Lattner 41815f2aa2 Add a test that you can forward ref a class.
llvm-svn: 23547
2005-09-30 04:52:43 +00:00
Chris Lattner 20b0e3cee4 Regenerate
llvm-svn: 23546
2005-09-30 04:42:56 +00:00
Chris Lattner ad61925e27 Generate a parse error instead of a checked exception if template args are
used on a def.

llvm-svn: 23545
2005-09-30 04:42:31 +00:00
Chris Lattner 33ce5f8a73 Now that self referential classes are supported, get rid of a work-around.
llvm-svn: 23544
2005-09-30 04:13:23 +00:00
Chris Lattner 6e60c8fe05 regenerate
llvm-svn: 23543
2005-09-30 04:11:27 +00:00
Chris Lattner e04e1384fc Refactor the grammar a bit to implement TableGen/ForwardRef.td
llvm-svn: 23542
2005-09-30 04:10:49 +00:00
Chris Lattner 08321aa8cb Check that we can refer to the same class we are defining.
llvm-svn: 23541
2005-09-30 04:10:17 +00:00
Chris Lattner 2a6fd61dfc allow regs to be in multiple reg classes
llvm-svn: 23540
2005-09-30 01:33:48 +00:00
Chris Lattner f6d4173f75 pass extra args
llvm-svn: 23539
2005-09-30 01:31:52 +00:00
Chris Lattner 64ca7cda3f these methods get extra args
llvm-svn: 23538
2005-09-30 01:30:55 +00:00
Chris Lattner a654525c1c Pass extra regclasses into spilling code
llvm-svn: 23537
2005-09-30 01:29:42 +00:00
Chris Lattner 5a6199f387 Change this code ot pass register classes into the stack slot spiller/reloader
code.  PrologEpilogInserter hasn't been updated yet though, so targets cannot
use this info.

llvm-svn: 23536
2005-09-30 01:29:00 +00:00
Chris Lattner b7d89db484 Change these methods to take RC's
llvm-svn: 23535
2005-09-30 01:28:14 +00:00
Chris Lattner 08f157c5b2 Use the 32-bit version for now
llvm-svn: 23534
2005-09-30 00:05:05 +00:00
Chris Lattner 027a2671ef Add a bunch of patterns for F64 FP ops, add some more integer ops
llvm-svn: 23533
2005-09-29 23:34:24 +00:00
Chris Lattner 1de5706e68 Remove code for patterns that are autogenerated
llvm-svn: 23532
2005-09-29 23:33:31 +00:00
Andrew Lenharth a7a83b9255 begining alpha subtarget support
llvm-svn: 23531
2005-09-29 22:54:56 +00:00
Chris Lattner 0a1cd715d4 tblgen autogens this pattern now
llvm-svn: 23530
2005-09-29 22:37:24 +00:00
Chris Lattner 366fe04301 Teach tablegen to reassociate operators when possible. This allows it to
find all of teh pattern matches for EQV from one definition

llvm-svn: 23529
2005-09-29 22:36:54 +00:00
Andrew Lenharth bae1f9d790 copy and paste error
llvm-svn: 23528
2005-09-29 21:11:57 +00:00
Chris Lattner a748e3ae5b now that tblgen is smarter, this pattern is not needed. Also, tblgen
now inverts commuted versions of ANDC/ORC with the current .td file.

llvm-svn: 23527
2005-09-29 19:29:15 +00:00
Chris Lattner e86824e57a Teach tblgen to build permutations of instructions, so that the target author
doesn't have to specify them manually.  It currently handles associativity,
e.g. knowing that (X*Y)+Z  also matches  X+(Y*Z)  and will be extended in
the future.

It is smart enough to not introduce duplicate patterns or patterns that can
never match.

llvm-svn: 23526
2005-09-29 19:28:10 +00:00
Chris Lattner a554c9470b Insert stores after phi nodes in the normal dest. This fixes
LowerInvoke/2005-08-03-InvokeWithPHI.ll

llvm-svn: 23525
2005-09-29 17:44:20 +00:00
Chris Lattner 02d3ba3db8 consistency with other cases, no functionality change
llvm-svn: 23524
2005-09-29 17:38:52 +00:00
Chris Lattner eca4f56646 Make the JIT default to the DAG isel instead of the pattern isel, like LLC.
The Pattern isel has some strange memory corruption issues going on. :(

This should have been converted over anyway, but it got forgotten somehow
when switching to the dag isel.

llvm-svn: 23523
2005-09-29 17:31:03 +00:00
Chris Lattner 5b2be1f890 Fix two bugs in my patch earlier today that broke int->fp conversion on X86.
llvm-svn: 23522
2005-09-29 06:44:39 +00:00
Chris Lattner 87ef943a4c Fold isascii into a simple comparison. This speeds up 197.parser by 7.4%,
bringing the LLC time down to the CBE time.

llvm-svn: 23521
2005-09-29 06:17:27 +00:00
Chris Lattner 5de939e791 new testcase for isascii
llvm-svn: 23520
2005-09-29 06:16:37 +00:00
Chris Lattner 5f6035feb0 remove a bunch of unneeded stuff, or self evident comments
llvm-svn: 23519
2005-09-29 06:16:11 +00:00
Chris Lattner e94e6a9e62 add a new testcase
llvm-svn: 23518
2005-09-29 06:11:34 +00:00
Chris Lattner c244e7c178 Implement a couple of memcmp folds from the todo list
llvm-svn: 23517
2005-09-29 04:54:20 +00:00
Jeff Cohen b01a41a06d Silence VC++ redeclaration warnings.
llvm-svn: 23516
2005-09-29 01:59:49 +00:00
Chris Lattner 08c319fbdd Never rely on ReplaceAllUsesWith when selecting, use CodeGenMap instead.
ReplaceAllUsesWith does not replace scalars SDOperand floating around on
the stack, permitting things to be selected multiple times.

llvm-svn: 23515
2005-09-29 00:59:32 +00:00
Chris Lattner d4e9e8b7ec Codegen ADD X, IMM -> addis/addi if needed.
This implements PowerPC/fold-li.ll

llvm-svn: 23514
2005-09-28 23:07:13 +00:00
Chris Lattner a22f7a2e16 add a testcase for a feature we regressed on because noone wrote the test! :(
llvm-svn: 23513
2005-09-28 23:03:11 +00:00
Chris Lattner b9b2e77295 Autogen MUL, move FP cases together
llvm-svn: 23512
2005-09-28 22:53:16 +00:00
Chris Lattner 5769311c92 disentangle FP from INT versions of div/mul
llvm-svn: 23511
2005-09-28 22:50:24 +00:00
Chris Lattner 585131baaf Use the autogenerated matcher for ADD/SUB
llvm-svn: 23510
2005-09-28 22:47:28 +00:00
Chris Lattner f023b2cda2 add a patter for SUBFIC
llvm-svn: 23509
2005-09-28 22:47:06 +00:00
Chris Lattner 21551ea5ab Mark int binops as int-only, add FP binops. Mark FADD/FMUL as commutative but
not associative.  Add [SU]REM.

llvm-svn: 23508
2005-09-28 22:38:27 +00:00
Chris Lattner cd002b2461 wrap a long line
llvm-svn: 23507
2005-09-28 22:30:58 +00:00
Chris Lattner d3ea19b51a Add FP versions of the binary operators, keeping the int and fp worlds seperate.
llvm-svn: 23506
2005-09-28 22:29:58 +00:00
Chris Lattner 0815dcae3f Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23505
2005-09-28 22:29:17 +00:00
Chris Lattner 6f3b577ee6 Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23504
2005-09-28 22:28:18 +00:00
Chris Lattner 7fe6734dff Mark associative nodes as associative
llvm-svn: 23503
2005-09-28 20:58:39 +00:00
Chris Lattner 492e70f4ec add support for an associative marker
llvm-svn: 23502
2005-09-28 20:58:06 +00:00
Chris Lattner 8bb25cd68a Emit an error if instructions or patterns are defined but can never match.
Currently we check that immediate values live on the RHS of commutative
operators.  Defining ORI like this, for example:

def ORI   : DForm_4<24, (ops GPRC:$dst, GPRC:$src1, u16imm:$src2),
                    "ori $dst, $src1, $src2",
                    [(set GPRC:$dst, (or immZExt16:$src2, GPRC:$src1))]>;

results in:

tblgen: In ORI: Instruction can never match: Immediate values must be on the RHS of commutative operators!
llvm-svn: 23501
2005-09-28 19:27:25 +00:00
Chris Lattner b97b054ba7 Nate pointed out that mulh[us] are commutative as well. Thanks!
llvm-svn: 23500
2005-09-28 19:01:44 +00:00
Chris Lattner f74c30c281 collect commutativity information
llvm-svn: 23499
2005-09-28 18:28:29 +00:00
Chris Lattner 89d168ceb3 expose commutativity information
llvm-svn: 23498
2005-09-28 18:27:58 +00:00
Chris Lattner fab48b3285 All (xor *) cases are autogenerated now
llvm-svn: 23497
2005-09-28 18:12:37 +00:00
Chris Lattner 037d69a404 add support for missed eqv tests
llvm-svn: 23496
2005-09-28 18:10:51 +00:00
Chris Lattner afc5ba4f3a add testcase for nand
llvm-svn: 23495
2005-09-28 18:08:58 +00:00
Chris Lattner 33f8e08c8f Implement PowerPC/eqv-andc-orc-nor.ll:EQV3
llvm-svn: 23494
2005-09-28 18:04:52 +00:00
Chris Lattner 380fd4a413 Consolidate the eqv.ll and nor.ll files together.
Add a missed eqv case.

llvm-svn: 23493
2005-09-28 18:04:22 +00:00
Chris Lattner 3622f15491 Prefer cheaper patterns to more expensive ones. Print the costs to the generated
file

llvm-svn: 23492
2005-09-28 17:57:56 +00:00
Chris Lattner e2b772b0ae simple tests for nor generation
llvm-svn: 23491
2005-09-28 17:55:10 +00:00
Chris Lattner 8cd7b88a88 learn to codegen not as NOR instead of xoris/xori
llvm-svn: 23490
2005-09-28 17:13:15 +00:00
Chris Lattner bb5939a436 These nodes are all autogenerated
llvm-svn: 23489
2005-09-28 17:07:09 +00:00
Chris Lattner 75b4c5d868 Select Constant nodes to TargetConstant nodes
llvm-svn: 23488
2005-09-28 16:58:06 +00:00
Chris Lattner ea7214b23d Constant fold llvm.sqrt
llvm-svn: 23487
2005-09-28 01:34:32 +00:00
Chris Lattner 3b63bb375c add a note about a way to improve this code further, that I won't be getting
to right now.

llvm-svn: 23485
2005-09-27 22:44:59 +00:00
Chris Lattner eb953f0ef8 Fix a regression in my previous patch, fixing GlobalOpt/2005-09-27-Crash.ll
and PR632.

llvm-svn: 23484
2005-09-27 22:28:11 +00:00
Chris Lattner b1fb4da271 Testcase for PR632
llvm-svn: 23483
2005-09-27 22:27:19 +00:00
Chris Lattner a028e7a39c Darwin, like many BSD systems, has a setjmp/longjmp which saves the signal mask
on setjmp calls and restores it on longjmp calls (both of which require syscalls).

This makes the calls REALLY slow.  Use _setjmp/_longjmp instead.  This speeds up
hexxagon from 120.31s to 15.68s: from 5.53x slower than GCC to 28% faster than GCC.

llvm-svn: 23482
2005-09-27 22:18:25 +00:00
Chris Lattner 0fd8f9fbc9 If the target prefers it, use _setjmp/_longjmp should be used instead of setjmp/longjmp for llvm.setjmp/llvm.longjmp.
llvm-svn: 23481
2005-09-27 22:15:53 +00:00
Chris Lattner 59dc1e082c initialize new flag
llvm-svn: 23480
2005-09-27 22:13:56 +00:00
Chris Lattner a458a2e472 Add a new flag for targets where setjmp/longjmp saves/restores the signal mask,
and _setjmp/_longjmp should be used instead (for llvm.setjmp/llvm.longjmp).

llvm-svn: 23479
2005-09-27 22:13:36 +00:00
Chris Lattner e285f5ed8f Avoid spilling stack slots... to stack slots.
llvm-svn: 23478
2005-09-27 21:33:12 +00:00
Chris Lattner 87eb249300 Completely rewrite 'correct' eh support. This changes how setjmp insertion
is performed so it is only at most once per function that contains an invoke
instead of once per invoke in the function.  This patch has the following perks:

1. It fixes PR631, which complains about slowness.
2. If fixes PR240, which complains about non-volatile vars being live across
   setjmp/longjmps.
3. It improves (but does not fix) the jmpbuf alignment issue on itanium by not
   forcing the jmpbufs to always be 8-bytes off the alignment of the structure.
4. It speeds up 253.perlbmk from 338s to 13.70s (a 25x improvement!), making us
   now about 4% faster than GCC.

Further improvements are also possible.

llvm-svn: 23477
2005-09-27 21:18:17 +00:00
Chris Lattner 92233d2175 Make the pass name simpler
llvm-svn: 23476
2005-09-27 21:10:32 +00:00
Chris Lattner 5635cc069f fix CBackend/2005-09-27-VolatileFuncPtr.ll
llvm-svn: 23475
2005-09-27 20:52:44 +00:00
Chris Lattner e338f05ba6 new testcase the CBE creates invalid C code for
llvm-svn: 23474
2005-09-27 20:52:30 +00:00
Chris Lattner 16cd356fb2 allow demotion to volatile values, add support for invoke
llvm-svn: 23473
2005-09-27 19:39:00 +00:00
Chris Lattner cce0355fc0 allow demotion to volatile values
llvm-svn: 23472
2005-09-27 19:38:43 +00:00
Chris Lattner 6f726d2c1a Add a simple testcase for lowerinvoke
llvm-svn: 23471
2005-09-27 18:34:31 +00:00
Chris Lattner c628f00845 Make sure to clear the CodeGenMap after each basic block is selected to avoid
cross MBB pollution.

llvm-svn: 23470
2005-09-27 17:45:33 +00:00
Jim Laskey 63523f98d5 Remove some redundancies.
llvm-svn: 23469
2005-09-27 17:32:45 +00:00
Chris Lattner 57432e717e Make this slightly more efficient by pushing actual type information down
into the evaluator.  This shrinks a release build of instcombine's text
section from 216363 to 215975 bytes (on PPC).

llvm-svn: 23468
2005-09-27 06:38:05 +00:00
Chris Lattner e7e139e8e8 Split SimpleConstantVal up into its components, so each Constant subclass getsa different enum value. This allows 'classof' for these to be really simple,not needing to call getType() anymore.
This speeds up isa/dyncast/etc for constants, and also makes them smaller.
For example, the text section of a release build of InstCombine.cpp shrinks
from 230037 bytes to 216363 bytes, a 6% reduction.

llvm-svn: 23467
2005-09-27 06:09:08 +00:00
Chris Lattner 555fb9c984 Split SimpleConstantVal up into its components, so each Constant subclass gets
a different enum value.  This allows 'classof' for these to be really simple,
not needing to call getType() anymore.

This speeds up isa/dyncast/etc for constants, and also makes them smaller.
For example, the text section of a release build of InstCombine.cpp shrinks
from 230037 bytes to 216363 bytes, a 6% reduction.

llvm-svn: 23466
2005-09-27 06:08:32 +00:00
Chris Lattner 3d27e7f27f Add support for external calls that we know how to constant fold. This implements
ctor-list-opt.ll:CTOR8

llvm-svn: 23465
2005-09-27 05:02:43 +00:00
Chris Lattner 1f1fd227fb add a new testcase for constant foldable calls
llvm-svn: 23464
2005-09-27 05:02:03 +00:00
Chris Lattner 29b2780c8a Fix a bug where we would evaluate stores into linkonce objects which could be
potentially replaced at link-time.

llvm-svn: 23463
2005-09-27 04:50:03 +00:00
Chris Lattner 65a3a0918f Implement support for static constructors with calls in them. This is useful
because gccas runs globalopt before inlining.

This implements ctor-list-opt.ll:CTOR7

llvm-svn: 23462
2005-09-27 04:45:34 +00:00
Chris Lattner 3803cbb196 Add a more difficult testcase which uses a call to a helper function to do
the initialization

llvm-svn: 23461
2005-09-27 04:44:04 +00:00
Chris Lattner da1889b778 Refactor this code a bit, no functionality changes.
llvm-svn: 23460
2005-09-27 04:27:01 +00:00
Chris Lattner 54ec5f2089 Move the post-lsr simplify cfg pass after lowereh, so it can clean up after
eh lowering as well.

llvm-svn: 23459
2005-09-27 00:14:41 +00:00
Chris Lattner 4435b149a0 minor pattern shuffling
llvm-svn: 23458
2005-09-26 22:20:16 +00:00
Chris Lattner d455c36c91 memoize the assert results
llvm-svn: 23457
2005-09-26 22:10:24 +00:00
Chris Lattner c9153266c6 Emit the switch stmt cases in alphabetical order instead of pointer order,
which is not stable.

llvm-svn: 23456
2005-09-26 21:59:35 +00:00
Jim Laskey 5f2443c8a3 Addition of a simple two pass scheduler. This version is currently hacked up
for testing and will require target machine info to do a proper scheduling.
The simple scheduler can be turned on using -sched=simple (defaults
to -sched=none)

llvm-svn: 23455
2005-09-26 21:57:04 +00:00
Chris Lattner d5de8544f8 implement a fixme: only select values once, even if used multiple times.
llvm-svn: 23454
2005-09-26 21:53:26 +00:00
Chris Lattner f2f89af69a Remove some dead code. ctor evaluation subsumes empty ctor elim
llvm-svn: 23453
2005-09-26 20:38:20 +00:00
Chris Lattner 6bf2cd5735 Add support for alloca, implementing ctor-list-opt.ll:CTOR6
llvm-svn: 23452
2005-09-26 17:07:09 +00:00
Chris Lattner 46eeed89e5 Testcase that uses an alloca
llvm-svn: 23451
2005-09-26 17:06:32 +00:00
Chris Lattner 46d9ff081d Add a debug printout, fix a crash on kc++
llvm-svn: 23450
2005-09-26 07:34:35 +00:00
Chris Lattner 46af55e0e4 Implement loads/stores through GEP's of globals. This implements
ctor-list-opt.ll:CTOR5.

llvm-svn: 23449
2005-09-26 06:52:44 +00:00
Chris Lattner 636fa212b9 add another case, this one that uses getelementptr instructions
llvm-svn: 23448
2005-09-26 06:51:50 +00:00
Chris Lattner 61ff32cd70 Replace TraverseGEPInitializer with ConstantFoldLoadThroughGEPConstantExpr
llvm-svn: 23447
2005-09-26 05:34:07 +00:00
Chris Lattner 02ae21e1e0 Eliminate GetGEPGlobalInitializer in favor of the more powerful
ConstantFoldLoadThroughGEPConstantExpr function in the utils lib.

llvm-svn: 23446
2005-09-26 05:28:52 +00:00
Chris Lattner 0b011ec8e2 Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils
as ConstantFoldLoadThroughGEPConstantExpr.

llvm-svn: 23445
2005-09-26 05:28:06 +00:00
Chris Lattner c13c7b9376 Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine
pass.

llvm-svn: 23444
2005-09-26 05:27:10 +00:00
Chris Lattner 348a39982e add a new function
llvm-svn: 23443
2005-09-26 05:26:32 +00:00
Chris Lattner b009663e27 add a comment
llvm-svn: 23442
2005-09-26 05:16:34 +00:00
Chris Lattner 4b05c322d5 Add support for getelementptr, load, and correctly reject volatile stores.
llvm-svn: 23441
2005-09-26 05:15:37 +00:00
Chris Lattner 05035fe970 add a test for load
llvm-svn: 23440
2005-09-26 05:14:48 +00:00
Chris Lattner 3e9ea5ffec Add support for br/brcond/switch and phi
llvm-svn: 23439
2005-09-26 04:57:38 +00:00
Chris Lattner 543efbb71f add another testcase with simple control flow
llvm-svn: 23438
2005-09-26 04:57:10 +00:00
Chris Lattner 99e23fa74c Add a simple interpreter to this code, allowing us to statically evaluate
global ctors that are simple enough.  This implements ctor-list-opt.ll:CTOR2.

llvm-svn: 23437
2005-09-26 04:44:35 +00:00
Chris Lattner 6debcf3071 make this harder: put some code into it
llvm-svn: 23436
2005-09-26 04:43:01 +00:00
Chris Lattner 696beefabb factor some code into a InstallGlobalCtors method, add comments. No functionality change.
llvm-svn: 23435
2005-09-26 02:31:18 +00:00
Chris Lattner 838bdc1836 Make the global opt optimizer work on modules with a null terminator, by
accepting the null even with a non-65535 init prio

llvm-svn: 23434
2005-09-26 02:19:27 +00:00
Chris Lattner 41b6a5a693 Factor this code out into a few methods.
Implement the start of global ctor optimization.  It is currently smart
enough to remove the global ctor for cases like this:

struct foo {
  foo() {}
} x;

... saving a bit of startup time for the program.

llvm-svn: 23433
2005-09-26 01:43:45 +00:00
Chris Lattner 9db3c91a51 new testcase for static ctor list optimizations
llvm-svn: 23432
2005-09-26 01:42:03 +00:00
Jeff Cohen 23b1d28e69 Fix VC++ build errors.
llvm-svn: 23431
2005-09-25 19:04:43 +00:00
Chris Lattner f487768062 Fix some logic I broke that caused a regression on
SimplifyLibCalls/2005-05-20-sprintf-crash.ll

llvm-svn: 23430
2005-09-25 07:06:48 +00:00
Chris Lattner 0b3557f54a Move MaskedValueIsZero up.
Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll

llvm-svn: 23428
2005-09-24 23:43:33 +00:00
Chris Lattner 04d4a725ca All of these should turn into sign extends (e.g. extsh/extsb on PPC)
llvm-svn: 23427
2005-09-24 23:42:18 +00:00
Chris Lattner 699c80eebe Add long-overdue helpers for getting constants with known upper bits
llvm-svn: 23426
2005-09-24 22:57:28 +00:00
Chris Lattner 175463a165 Simplify this code a bit by relying on recursive simplification. Support
sprintf("%s", P)'s that have uses.

s/hasNUses(0)/use_empty()/

llvm-svn: 23425
2005-09-24 22:17:06 +00:00
Chris Lattner 906d705644 Enhance this to check for a crash, add a case that crashes simplifylibcalls,
and add a case that has uses.

llvm-svn: 23424
2005-09-24 22:16:04 +00:00
Chris Lattner 379dea1999 new testcase that crashes the CFE
llvm-svn: 23423
2005-09-24 20:54:33 +00:00
Chris Lattner a88736647b new testcase for PR630
llvm-svn: 23422
2005-09-24 08:38:28 +00:00
Chris Lattner cc9c03386f Add support for a marker byte that indicates that we shouldn't add the user
prefix to a symbol name

llvm-svn: 23421
2005-09-24 08:24:28 +00:00
Chris Lattner 7cd3c2d151 change proto slightly
llvm-svn: 23420
2005-09-24 08:23:53 +00:00
Chris Lattner cc1d38160d memoize translations
llvm-svn: 23419
2005-09-24 00:50:51 +00:00
Chris Lattner 6736a6cdd2 Teach the dag isel generator how to construct arbitrary immediates. The
generated isel now tries li then lis, then lis+ori.

llvm-svn: 23418
2005-09-24 00:41:58 +00:00
Chris Lattner 0afb14cade Teach the DAG isel generator to emit code that creates nodes.
Fix a few corner cases parsing things like (i32 imm:$foo)

llvm-svn: 23417
2005-09-24 00:40:24 +00:00
Chris Lattner cd093e868e Emit better code (no more copies for var references), and support DAG patterns
(e.g. things like rotates).

llvm-svn: 23416
2005-09-23 23:16:51 +00:00
Chris Lattner 8ffb99b4fe Fix a fixme by passing around SDOperand's instead of SDNode*'s
llvm-svn: 23415
2005-09-23 21:53:45 +00:00
Chris Lattner cc8a564cb1 Emit code that matches the incoming DAG pattern and checks predicates.
This does not check that types match yet, but PPC only has one integer type
;-).

This also doesn't have the code to build the resultant dag.

llvm-svn: 23414
2005-09-23 21:33:23 +00:00
Chris Lattner 323a47970e emit information about the order patterns are to be matched.
llvm-svn: 23413
2005-09-23 20:52:47 +00:00
Chris Lattner abb430bad2 start filling in the switch stmt
llvm-svn: 23412
2005-09-23 19:36:15 +00:00
Chris Lattner 499e33646e remove some debugging code
llvm-svn: 23411
2005-09-23 18:49:09 +00:00
Chris Lattner c59a371d45 Fold two consequtive branches that share a common destination between them.
This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy
code

llvm-svn: 23410
2005-09-23 18:47:20 +00:00
Chris Lattner 62f565d198 new testcase
llvm-svn: 23409
2005-09-23 18:43:57 +00:00
Chris Lattner 3a978bf66d simplify some logic further
llvm-svn: 23408
2005-09-23 07:23:18 +00:00
Chris Lattner cc14ebc17b pull a bunch of logic out of SimplifyCFG into a helper fn
llvm-svn: 23407
2005-09-23 06:39:30 +00:00
Chris Lattner 1e3d3148bb speed up Archive::isBytecodeArchive in the case when the archive doesn't have
an llvm-ranlib symtab.  This speeds up gccld -native on an almost empty .o file
from 1.63s to 0.18s.

llvm-svn: 23406
2005-09-23 06:22:58 +00:00
Chris Lattner f20941116b Speed up isBytecodeLPath from 20s to .01s in common cases. This makes -native
not completely painful to use.  Once we decide a directory has a bytecode
library, we know it this function returns true, no need to scan entire directories.

llvm-svn: 23405
2005-09-23 06:11:24 +00:00
Chris Lattner 9b9b510084 1. Do not use .c_str() to keep a persistent handle on a temporary string.
2. Concatenate -lfoo and -L/bar options into a single option instead of
   passing "-L /bar" (for example) which doesn't work on Darwin.
3. Send -v output to stderr instead of stdout

llvm-svn: 23404
2005-09-23 06:05:46 +00:00
Chris Lattner 59a05bdde6 Turn (X^C1) == C2 into X == C1^C2 iff X&~C1 = 0 (and move a function)
This happens all the time on PPC for bool values, e.g. eliminating a xori
in inverted-bool-compares.ll.

This should be added to the dag combiner as well.

llvm-svn: 23403
2005-09-23 00:55:52 +00:00
Chris Lattner c619d43155 new testcase
llvm-svn: 23402
2005-09-23 00:53:06 +00:00
Chris Lattner 5ff606401b Testcase for PR629
llvm-svn: 23401
2005-09-21 06:53:56 +00:00
Chris Lattner b1f8982ff0 Expose the LiveInterval interfaces as public headers.
llvm-svn: 23400
2005-09-21 04:19:09 +00:00
Chris Lattner 993a2ec38c Recommend what I actually test
llvm-svn: 23398
2005-09-21 03:56:26 +00:00
Chris Lattner 6c70106053 Start threading across blocks with code in them, so long as the code does
not define a value that is used outside of it's block.  This catches many
more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc.

This implements branch-phi-thread.ll:test3.ll

llvm-svn: 23397
2005-09-20 01:48:40 +00:00
Chris Lattner cb6d8173d2 make this test harder: add a case where instructions are in the bb to be
threaded over

llvm-svn: 23396
2005-09-20 01:43:41 +00:00
Chris Lattner f0bd8d0107 Implement merging of blocks with the same condition if the block has multiple
predecessors.  This implements branch-phi-thread.ll::test1

llvm-svn: 23395
2005-09-20 00:43:16 +00:00
Chris Lattner 168d2e5343 new testcase
llvm-svn: 23394
2005-09-20 00:41:55 +00:00
Chris Lattner 049cb4482f Reject a case we don't handle yet
llvm-svn: 23393
2005-09-19 23:57:04 +00:00
Chris Lattner a160924d57 remove debugging code :-/
llvm-svn: 23392
2005-09-19 23:50:15 +00:00
Chris Lattner 748f903046 Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading
control across branches with determined outcomes.  More generality to follow.
This triggers a couple thousand times in specint.

llvm-svn: 23391
2005-09-19 23:49:37 +00:00
Chris Lattner b2a9e8115b new testcase.
llvm-svn: 23390
2005-09-19 23:48:04 +00:00
Nate Begeman c760f80fed Stub out the rest of the DAG Combiner. Just need to fill in the
select_cc bits and then wrap it in a convenience function for  use with
regular select.

llvm-svn: 23389
2005-09-19 22:34:01 +00:00
Chris Lattner 2f838f2192 Teach the local spiller to turn stack slot loads into register-register copies
when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.

llvm-svn: 23388
2005-09-19 06:56:21 +00:00
Chris Lattner de3c87a2ab Implement the isLoadFromStackSlot interface
llvm-svn: 23387
2005-09-19 05:23:44 +00:00
Chris Lattner b4b2530a1a Refactor this code a bit and make it more general. This now compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) { b.j += x; }

To:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        slwi r3, r3, 6
        add r3, r4, r3
        rlwimi r3, r4, 0, 26, 14
        stw r3, 0(r2)
        blr


instead of:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 26, 21, 31
        add r3, r5, r3
        rlwimi r4, r3, 6, 15, 25
        stw r4, 0(r2)
        blr

by eliminating an 'and'.

I'm pretty sure this is as small as we can go :)

llvm-svn: 23386
2005-09-18 07:22:02 +00:00
Chris Lattner 797dee7705 Compile
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) {
  b.j += x;
}

to:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        and %ECX, 131008
        mov %EDX, DWORD PTR [%ESP + 4]
        shl %EDX, 6
        add %EDX, %ECX
        and %EDX, 131008
        and %EAX, -131009
        or %EDX, %EAX
        mov DWORD PTR [b], %EDX
        ret

instead of:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        shr %ECX, 6
        and %ECX, 2047
        add %ECX, DWORD PTR [%ESP + 4]
        shl %ECX, 6
        and %ECX, 131008
        and %EAX, -131009
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23385
2005-09-18 06:30:59 +00:00
Chris Lattner 01f56c68e9 Generalize this transform, using MaskedValueIsZero, allowing us to compile:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) { b.k += x; }

To:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        add DWORD PTR [b], %EAX
        ret

instead of:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        mov %ECX, DWORD PTR [b]
        add %EAX, %ECX
        and %EAX, -131072
        and %ECX, 131071
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23384
2005-09-18 06:02:59 +00:00
Chris Lattner 4ebc8ab4e0 fix typeo
llvm-svn: 23383
2005-09-18 05:25:20 +00:00
Chris Lattner e5b23a6d67 Remove unintentionally committed code
llvm-svn: 23382
2005-09-18 05:12:51 +00:00
Chris Lattner 27cb9dbd35 implement shift.ll:test25. This compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) {
  b.k += x;
}

to:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r3, 0(r2)
        rlwinm r4, r3, 0, 0, 14
        add r4, r4, r3
        rlwimi r4, r3, 0, 15, 31
        stw r4, 0(r2)
        blr

instead of:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        srwi r5, r4, 17
        add r3, r5, r3
        slwi r3, r3, 17
        rlwimi r3, r4, 0, 15, 31
        stw r3, 0(r2)
        blr

llvm-svn: 23381
2005-09-18 05:12:10 +00:00
Chris Lattner 1813aabcf2 new testcase
llvm-svn: 23380
2005-09-18 05:10:39 +00:00
Chris Lattner af517574ce Implement add.ll:test29. Codegening:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus1 (unsigned int x) {
  b.i += x;
}

as:
_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        add r3, r4, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

instead of:

_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 0, 26, 31
        add r3, r5, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

llvm-svn: 23379
2005-09-18 04:24:45 +00:00
Chris Lattner 9136c832c4 new testcase
llvm-svn: 23378
2005-09-18 04:22:59 +00:00
Chris Lattner 027eaf01cf remove debug output
llvm-svn: 23377
2005-09-18 03:50:25 +00:00
Chris Lattner 1521298993 Implement or.ll:test21. This teaches instcombine to be able to turn this:
struct {
   unsigned int bit0:1;
   unsigned int ubyte:31;
} sdata;

void foo() {
  sdata.ubyte++;
}

into this:

foo:
        add DWORD PTR [sdata], 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [sdata]
        mov %ECX, %EAX
        add %ECX, 2
        and %ECX, -2
        and %EAX, 1
        or %EAX, %ECX
        mov DWORD PTR [sdata], %EAX
        ret

llvm-svn: 23376
2005-09-18 03:42:07 +00:00