Commit Graph

Chris Lattner 2c0956bcea Two changes:
1. Treat FMOVD as a copy instruction, to help with coalescing in V9 mode
2. When in V9 mode, insert FMOVD instead of FpMOVD instructions, as we don't
   ever rewrite FpMOVD instructions into FMOVS instructions, thus we just end
   up with commented out copies!
This should fix a bunch of failures in V9 mode on sparc.

llvm-svn: 25961
2006-02-04 06:58:46 +00:00
Evan Cheng f9adce90bf Get rid of some memory leaks identified by Valgrind
llvm-svn: 25960
2006-02-04 06:49:00 +00:00
Chris Lattner 2d2e2e3c0e Let bugpoint work on sparc with v9 instructions enabled.
llvm-svn: 25958
2006-02-04 05:02:27 +00:00
Jeff Cohen 57a004abfe Fix VC++ warning.
llvm-svn: 25957
2006-02-04 03:27:39 +00:00
Chris Lattner 3b48431333 Add initial support for immediates. This allows us to compile this:
int %rlwnm(int %A, int %B) {
  %C = call int asm "rlwnm $0, $1, $2, $3, $4", "=r,r,r,n,n"(int %A, int %B, int 4, int 17)
  ret int %C
}

into:

_rlwnm:
        or r2, r3, r3
        or r3, r4, r4
        rlwnm r2, r2, r3, 4, 17    ;; note the immediates :)
        or r3, r2, r2
        blr

llvm-svn: 25955
2006-02-04 02:26:14 +00:00
Evan Cheng 0a977c95aa Remove an unnecessary predicate.
llvm-svn: 25954
2006-02-04 02:23:01 +00:00
Evan Cheng 11613a5219 Separate FILD and FILD_FLAG; the latter is only used for SSE2. It produces a
flag so it can be flagged to an FST.

llvm-svn: 25953
2006-02-04 02:20:30 +00:00
Chris Lattner 65ad53feb3 Initial early support for non-register operands, like immediates
llvm-svn: 25952
2006-02-04 02:16:44 +00:00
Chris Lattner ee1dadbccf implement some methods for InlineAsm
llvm-svn: 25951
2006-02-04 02:13:02 +00:00
Chris Lattner c93403a7fb Handle another case exposed on X86.
llvm-svn: 25949
2006-02-03 23:50:46 +00:00
Chris Lattner 71d20c4e18 Fix a nasty problem on two-address machines in the following situation:
store EAX -> [ss#0]
[ss#0] += 1
...
use(EAX)

In this case, it is not valid to rewrite this as:


store EAX -> [ss#0]
EAX += 1
store EAX -> [ss#0]  ;;; this would also delete the store above
...
use(EAX)

... because EAX is not dead at that point.  Keep track of which registers
we are allowed to clobber, and which ones we aren't, and don't clobber the
ones we're not supposed to.  :)

This should resolve the issues on X86 last night.
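
A minimal sketch of the bookkeeping this implies (hypothetical C++, not the actual spiller code): a register that still holds a live value may be reused read-only, and may be rewritten in place only when it is marked clobberable.

#include <set>

// Hypothetical sketch: track which physregs the rewriter may clobber.
struct ClobberTracker {
  std::set<unsigned> Clobberable;  // regs whose current value is dead

  void markDead(unsigned Reg) { Clobberable.insert(Reg); }
  void markLive(unsigned Reg) { Clobberable.erase(Reg); }

  // "[ss#0] += 1" may be rewritten to operate on Reg directly only if Reg
  // is clobberable; otherwise a later use (like use(EAX) above) would see
  // the incremented value instead of the original.
  bool canRewriteInPlace(unsigned Reg) const {
    return Clobberable.count(Reg) != 0;
  }
};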

llvm-svn: 25948
2006-02-03 23:28:46 +00:00
Chris Lattner 507a3a7bd1 significantly simplify the VirtRegMap code by pulling the SpillSlotsAvailable
and PhysRegsAvailable maps out into a new AvailableSpills struct.  No
functionality change.

This paves the way for a bugfix, coming up next.
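
A rough sketch of that struct's shape, reconstructed from the names in this message (the actual VirtRegMap code may differ); the multimap direction is what lets one physreg serve several slots at once:

#include <map>

// Assumed shape of AvailableSpills, inferred from the description above.
struct AvailableSpills {
  std::map<int, unsigned> SpillSlotsAvailable;     // stack slot -> physreg
  std::multimap<unsigned, int> PhysRegsAvailable;  // physreg -> stack slots

  void addAvailable(int Slot, unsigned PhysReg) {
    SpillSlotsAvailable[Slot] = PhysReg;
    PhysRegsAvailable.insert(std::make_pair(PhysReg, Slot));
  }

  // When PhysReg is modified, every slot value it held becomes unavailable.
  void clobberPhysReg(unsigned PhysReg) {
    auto Range = PhysRegsAvailable.equal_range(PhysReg);
    for (auto I = Range.first; I != Range.second; ++I)
      SpillSlotsAvailable.erase(I->second);
    PhysRegsAvailable.erase(PhysReg);
  }
};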

llvm-svn: 25947
2006-02-03 23:13:58 +00:00
Nate Begeman 20a894282d Implement some feedback from sabre
llvm-svn: 25946
2006-02-03 22:38:07 +00:00
Nate Begeman dc7bba9ffe Add a framework for eliminating instructions that produce undemanded bits.
llvm-svn: 25945
2006-02-03 22:24:05 +00:00
Chris Lattner 81e66abd1e add a note
llvm-svn: 25944
2006-02-03 22:06:45 +00:00
Chris Lattner d079dbb9b0 another case Nate came up with
llvm-svn: 25943
2006-02-03 22:05:41 +00:00
Chris Lattner 277462e20f add a note
llvm-svn: 25942
2006-02-03 21:25:23 +00:00
Chris Lattner f68fd20286 remove some #ifdef'd out code, which should properly be in the dag combiner anyway.
llvm-svn: 25941
2006-02-03 20:13:59 +00:00
Chris Lattner a1d312c6ea remove an old comment
llvm-svn: 25940
2006-02-03 18:59:39 +00:00
Chris Lattner 23d55f2547 Remove the X86PeepholeOptimizerPass, a truly horrible old hack that is now
obsolete.  yaay :)

llvm-svn: 25939
2006-02-03 18:54:24 +00:00
Chris Lattner c408558638 When rewriting frame instructions, emit the appropriate small-immediate
instruction when possible.

llvm-svn: 25938
2006-02-03 18:20:04 +00:00
Chris Lattner ca76917388 Teach sparc to fold loads/stores into copies.
Remove the dead getRegClassForType method.
Minor formatting changes.

llvm-svn: 25936
2006-02-03 07:06:25 +00:00
Chris Lattner 6091407783 remove dead fn
llvm-svn: 25935
2006-02-03 06:51:34 +00:00
Nate Begeman 22e251abf1 Add common code for reassociating ops in the dag combiner
llvm-svn: 25934
2006-02-03 06:46:56 +00:00
Chris Lattner d7d98611ca Implement isLoadFromStackSlot and isStoreToStackSlot
llvm-svn: 25932
2006-02-03 06:44:54 +00:00
Chris Lattner a23b04acdb remove some target-independent and already-implemented notes
llvm-svn: 25930
2006-02-03 06:22:11 +00:00
Chris Lattner d1aaee03ce target independent notes
llvm-svn: 25929
2006-02-03 06:21:43 +00:00
Nate Begeman fc567d85d5 Flesh out a couple of the items in the README
llvm-svn: 25928
2006-02-03 05:17:06 +00:00
Jeff Cohen 3276ff7ac6 Fix VC++ compilation error caused by using a std::map iterator variable to receive
a std::multimap iterator value.  For some reason, GCC doesn't have a problem with this.
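
For illustration, a reconstructed example of the pattern in question (not the actual code); VC++ is right to reject the commented-out line, since the two containers have distinct iterator types:

#include <map>

int main() {
  std::multimap<int, int> MM;
  // Ill-formed: a std::map iterator cannot hold a std::multimap iterator
  // value, even if some standard libraries happen to accept it.
  // std::map<int, int>::iterator Bad = MM.begin();
  std::multimap<int, int>::iterator OK = MM.begin();  // the portable fix
  (void)OK;
  return 0;
}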

llvm-svn: 25927
2006-02-03 03:48:54 +00:00
Chris Lattner e18ef0d4a6 Remove more copies and dead instructions by not clobbering the result reg of a noop copy.
llvm-svn: 25926
2006-02-03 03:16:14 +00:00
Andrew Lenharth 1318240fd0 isStoreToStackSlot
llvm-svn: 25925
2006-02-03 03:07:37 +00:00
Chris Lattner 774d4a190b Simplify some code
llvm-svn: 25924
2006-02-03 03:06:49 +00:00
Chris Lattner a1eac9b978 the X86 backend no longer needs to delete its own noop copies
llvm-svn: 25923
2006-02-03 02:59:58 +00:00
Chris Lattner 1ef239afb4 Add code that checks for noop copies, which triggers when either:
1. a target doesn't know how to fold load/stores into copies, or
2. the spiller rewrites the input to a copy to the same register as the dest
   instead of to the reloaded reg.

This will be moved/improved in the near future, but allows elimination of
some ancient x86 hacks.  This eliminates 92 copies from SMG2000 on X86 and
163 copies from 252.eon.
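
The check itself is tiny; a hypothetical sketch (the real code works on MachineInstrs):

// A copy whose source and destination registers ended up identical
// (e.g. "or r3, r3, r3" on PPC) does nothing and can be erased.
struct CopyInfo { unsigned DstReg, SrcReg; };

bool isNoopCopy(const CopyInfo &Copy) {
  return Copy.DstReg == Copy.SrcReg;
}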

llvm-svn: 25922
2006-02-03 02:02:59 +00:00
Chris Lattner f0a2d66d1c Add a note
llvm-svn: 25921
2006-02-03 01:49:49 +00:00
Evan Cheng 02b5b9cdd6 Added case HANDLENODE to getOperationName().
llvm-svn: 25920
2006-02-03 01:33:01 +00:00
Chris Lattner b7f24de4c8 Physregs may hold multiple stack slot values at the same time. Keep track
of this, and use it to our advantage (bwahahah).  This allows us to eliminate another
60 instructions from smg2000 on PPC (probably significantly more on X86).  A common
old-new diff looks like this:

        stw r2, 3304(r1)
-       lwz r2, 3192(r1)
        stw r2, 3300(r1)
-       lwz r2, 3192(r1)
        stw r2, 3296(r1)
-       lwz r2, 3192(r1)
        stw r2, 3200(r1)
-       lwz r2, 3192(r1)
        stw r2, 3196(r1)
-       lwz r2, 3192(r1)
+       or r2, r2, r2
        stw r2, 3188(r1)

and

-       lwz r31, 604(r1)
-       lwz r13, 604(r1)
-       lwz r14, 604(r1)
-       lwz r15, 604(r1)
-       lwz r16, 604(r1)
-       lwz r30, 604(r1)
+       or r31, r30, r30
+       or r13, r30, r30
+       or r14, r30, r30
+       or r15, r30, r30
+       or r16, r30, r30
+       or r30, r30, r30

Removal of the R = R copies is coming next...

llvm-svn: 25919
2006-02-03 00:36:31 +00:00
Chris Lattner 9b178ce225 update a note
llvm-svn: 25918
2006-02-02 23:50:22 +00:00
Chris Lattner f3aef1b004 Fix a deficiency in the spiller that Evan noticed. In particular, consider
this code:

  store [stack slot #0],  R10
    = add R14, [stack slot #0]

The spiller didn't know that the store made the value of [stackslot#0] available
in R10 *IF* the store came from a copy instruction with the store folded into it.

This patch teaches VirtRegMap to look at these stores and recognize the values
they make available.  In one case Evan provided, this code:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
1)      movsd QWORD PTR [%ESP + 48], %XMM1
2)      movsd %XMM1, QWORD PTR [%ESP + 48]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

turns into:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

In this case, instruction #2 was removed because of the value made
available by #1, and inst #1 was later deleted because it is now
never used before the stack slot is redefined by #3.

This occurs here and there in a lot of code with high spilling, on PPC
most of the removed loads/stores are LSU-reject-causing loads, which is
nice.

On X86, things are much better (because it spills more), where we nuke
about 1% of the instructions from SMG2000 and several hundred from eon.

More improvements to come...
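
A simplified sketch of the new rule (hypothetical C++, not the VirtRegMap code): when visiting a store of Reg into a slot, record that the slot's value is now also available in Reg, so a later reload can be elided.

#include <map>

// "store [slot], Reg" makes the slot's value available in Reg, so a
// following "Reg2 = load [slot]" can become a reuse of Reg instead.
struct SlotAvailability {
  std::map<int, unsigned> SlotInReg;  // stack slot -> physreg holding it

  void noteStore(int Slot, unsigned SrcReg) { SlotInReg[Slot] = SrcReg; }

  // Returns true and sets Reg if a reload from Slot can be elided.
  bool canReuse(int Slot, unsigned &Reg) const {
    std::map<int, unsigned>::const_iterator I = SlotInReg.find(Slot);
    if (I == SlotInReg.end()) return false;
    Reg = I->second;
    return true;
  }
};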

llvm-svn: 25917
2006-02-02 23:29:36 +00:00
Nate Begeman 4efb328926 add 64b gpr store to the list of possible isStoreToStackSlot opcodes.
llvm-svn: 25916
2006-02-02 21:07:50 +00:00
Chris Lattner 5123346708 fix operand numbers
llvm-svn: 25915
2006-02-02 20:38:12 +00:00
Chris Lattner c327d71e06 implement isStoreToStackSlot for PPC
llvm-svn: 25914
2006-02-02 20:16:12 +00:00
Chris Lattner bb53acd03c Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo, a far more logical place. Other methods should also be moved if anyone is interested. :)
llvm-svn: 25913
2006-02-02 20:12:32 +00:00
Chris Lattner 246ee44c8f implement isStoreToStackSlot
llvm-svn: 25911
2006-02-02 20:00:41 +00:00
Chris Lattner 0acc90c67e add a method
llvm-svn: 25910
2006-02-02 19:57:16 +00:00
Chris Lattner d8208c3665 more notes
llvm-svn: 25908
2006-02-02 19:43:28 +00:00
Chris Lattner d3f033e8e0 add a note, I have no idea how important this is.
llvm-svn: 25907
2006-02-02 19:16:34 +00:00
Chris Lattner e10e1024bc %fcc is not an alias for %fcc0
llvm-svn: 25906
2006-02-02 08:02:20 +00:00
Chris Lattner cb34968d19 correct an opcode
llvm-svn: 25905
2006-02-02 07:56:15 +00:00
Chris Lattner 9dd7df7ee7 new example
llvm-svn: 25903
2006-02-02 07:37:11 +00:00
Nate Begeman cd018525f8 Update the README
llvm-svn: 25902
2006-02-02 07:27:56 +00:00
Chris Lattner 49beaf40fc Turn any_extend nodes into zero_extend nodes when it allows us to remove an
and instruction.  This allows us to compile stuff like this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

to this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        ret

instead of this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

This occurs quite a bit with the X86 backend.  For example, 25 times in
lambda, 30 times in 177.mesa, 14 times in galgel,  70 times in fma3d,
25 times in vpr, several hundred times in gcc, ~45 times in crafty,
~60 times in parser, ~140 times in eon, 110 times in perlbmk, 55 on gap,
16 times on bzip2, 14 times on twolf, and 1-2 times in many other SPEC2K
programs.
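
The reasoning, as a check one can actually run (plain C++ standing in for the DAG-level facts): a setcc result is 0 or 1, so once it is zero-extended the mask changes nothing.

#include <cassert>
#include <cstdint>

int main() {
  // A zero-extended 0/1 boolean already has all high bits clear, so the
  // trailing "andl $1, %eax" is a no-op: (b & 1) == b for b in {0, 1}.
  for (uint32_t b : {0u, 1u})
    assert((b & 1u) == b);
  return 0;
}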

llvm-svn: 25901
2006-02-02 07:17:31 +00:00
Chris Lattner e0c60d63b1 Implement MaskedValueIsZero for ANY_EXTEND nodes
llvm-svn: 25900
2006-02-02 06:43:15 +00:00
Chris Lattner 4b2ec8af23 implemented, testcase here: test/Regression/CodeGen/X86/compare-add.ll
llvm-svn: 25899
2006-02-02 06:36:48 +00:00
Chris Lattner 49ce35542f add two dag combines:
(C1-X) == C2 --> X == C1-C2
(X+C1) == C2 --> X == C2-C1

This allows us to compile this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

into this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

not this:

_X:
        movl $14, %eax
        addl 4(%esp), %eax
        cmpl $12345, %eax
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

Testcase here: Regression/CodeGen/X86/compare-add.ll

nukage of the and coming up next.
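
The identity holds for all inputs because the arithmetic wraps; a quick sanity check in plain C++ (not the DAGCombiner code itself):

#include <cassert>
#include <cstdint>

// (X + C1) == C2  -->  X == C2 - C1, valid under unsigned wraparound.
int main() {
  const uint32_t C1 = 14, C2 = 12345;
  for (uint32_t X : {0u, 12331u, 12345u, 0xFFFFFFFFu})
    assert(((X + C1) == C2) == (X == C2 - C1));
  return 0;
}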

llvm-svn: 25898
2006-02-02 06:36:13 +00:00
Evan Cheng d3908f79cb Update.
llvm-svn: 25896
2006-02-02 02:40:17 +00:00
Chris Lattner 0bd74558ae make -debug output less newliney
llvm-svn: 25895
2006-02-02 00:38:08 +00:00
Evan Cheng d8fba3a1ee Fix an erroneous comment.
llvm-svn: 25894
2006-02-02 00:28:23 +00:00
Chris Lattner 7f5880b1c7 Implement matching constraints. We can now say things like this:
%C = call int asm "xyz $0, $1, $2, $3", "=r,r,r,0"(int %A, int %B, int 4)

and get:

xyz r2, r3, r4, r2

note that the r2's are pinned together.  Yaay for 2-address instructions.
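
For comparison, the GCC/Clang extended-asm spelling of the same matching constraint ("xyz" is the placeholder mnemonic from this message, so this illustrates the constraint syntax but won't assemble as-is):

int test(int A, int B) {
  int C;
  // The "0" constraint ties the last input to operand 0, pinning both to
  // the same register; exactly what the LLVM-level "=r,r,r,0" requests.
  __asm__("xyz %0, %1, %2, %3" : "=r"(C) : "r"(A), "r"(B), "0"(4));
  return C;
}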

llvm-svn: 25893
2006-02-02 00:25:23 +00:00
Chris Lattner 2f34a9e332 validate matching constraints and remember when we see them.
llvm-svn: 25892
2006-02-02 00:23:53 +00:00
Chris Lattner 6132a87cf4 more notes
llvm-svn: 25890
2006-02-01 23:38:08 +00:00
Evan Cheng b3ea2677a4 Tell codegen MOVAPSrr and MOVAPDrr are copies.
llvm-svn: 25889
2006-02-01 23:03:16 +00:00
Evan Cheng f1ed826c2a Added SSE entries to foldMemoryOperand().
llvm-svn: 25888
2006-02-01 23:02:25 +00:00
Evan Cheng 8b40cde148 Rearrange code to my liking. :)
llvm-svn: 25887
2006-02-01 23:01:57 +00:00
Chris Lattner aa23fa9f43 Implement smart printing of inline asm strings, handling variants and
substituted operands.  For this testcase:

int %test(int %A, int %B) {
  %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B)
  ret int %C
}

we now emit:

_test:
        or r2, r3, r3
        or r3, r4, r4
        xyz r2, r2, r3  ;; look here
        or r3, r2, r2
        blr

... note the substituted operands. :)

llvm-svn: 25886
2006-02-01 22:41:11 +00:00
Chris Lattner f7f056751c add a method
llvm-svn: 25884
2006-02-01 22:38:46 +00:00
Chris Lattner 2f7650f9dc another note
llvm-svn: 25883
2006-02-01 21:44:48 +00:00
Andrew Lenharth 4b1c726fbb Add immediate forms of cmov and remove some cruft
llvm-svn: 25882
2006-02-01 19:37:33 +00:00
Nate Begeman 01bd9d9911 *** empty log message ***
llvm-svn: 25879
2006-02-01 19:05:15 +00:00
Chris Lattner 1558fc64f9 Implement simple register assignment for inline asms. This allows us to compile:
int %test(int %A, int %B) {
  %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B)
  ret int %C
}

into:

 (0x8906130, LLVM BB @0x8902220):
        %r2 = OR4 %r3, %r3
        %r3 = OR4 %r4, %r4
        INLINEASM <es:xyz $0, $1, $2>, %r2<def>, %r2, %r3
        %r3 = OR4 %r2, %r2
        BLR

which asmprints as:

_test:
        or r2, r3, r3
        or r3, r4, r4
        xyz $0, $1, $2      ;; need to print the operands now :)
        or r3, r2, r2
        blr

llvm-svn: 25878
2006-02-01 18:59:47 +00:00
Chris Lattner ba56b5dc35 Finegrainify namespacification
llvm-svn: 25877
2006-02-01 18:10:56 +00:00
Chris Lattner a983beab37 add a note
llvm-svn: 25876
2006-02-01 17:54:23 +00:00
Nate Begeman 7e7f439f85 Fix some of the stuff in the PPC README file, and clean up legalization
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.

llvm-svn: 25875
2006-02-01 07:19:44 +00:00
Chris Lattner 3da1bb520e add a note, I'll take care of this after nate commits his big patch
llvm-svn: 25873
2006-02-01 06:40:32 +00:00
Evan Cheng 9e350cd6ad - Use xor to clear integer registers (set R, 0).
- Added a new format for instructions where the source register is implied
  and it is the same as the destination register. Used for pseudo instructions
  that clear the destination register.

llvm-svn: 25872
2006-02-01 06:13:50 +00:00
Evan Cheng c404b5748c Remove another entry.
llvm-svn: 25871
2006-02-01 06:08:48 +00:00
Jeff Cohen b24b66f209 Fix VC++ compilation error.
llvm-svn: 25869
2006-02-01 04:37:04 +00:00
Chris Lattner b0a76b0981 Another regression from the pattern isel
llvm-svn: 25867
2006-02-01 01:44:25 +00:00
Chris Lattner 7ed3101d14 Beef up the interface to inline asm constraint parsing, making it more general, useful, and easier to use.
llvm-svn: 25866
2006-02-01 01:29:47 +00:00
Chris Lattner 3a5ed55187 adjust to changes in InlineAsm interface. Fix a few minor bugs.
llvm-svn: 25865
2006-02-01 01:28:23 +00:00
Evan Cheng a24617f5d4 Return's chain should match either the chain produced by the
value or the chain going into the load.

llvm-svn: 25863
2006-02-01 01:19:32 +00:00
Chris Lattner a0527473ac another testcase.
llvm-svn: 25862
2006-02-01 00:28:12 +00:00
Evan Cheng e1ce4d7115 When folding a load into a return of SSE value, check the chain to
ensure the memory location has not been clobbered.

llvm-svn: 25861
2006-02-01 00:20:21 +00:00
Evan Cheng bc1fcd074e Remove an item. It's done.
llvm-svn: 25860
2006-02-01 00:15:53 +00:00
Evan Cheng 5659ca8f47 Be smarter about whether to store the SSE return value in memory. If
it is already available in memory, do a fld directly from there.

llvm-svn: 25859
2006-01-31 23:19:54 +00:00
Chris Lattner 64387c3e9c turning these into 'adds' would require extra copies
llvm-svn: 25858
2006-01-31 22:59:46 +00:00
Evan Cheng 72d5c256c9 - Allow XMM load (for scalar use) to be folded into ANDP* and XORP*.
- Use XORP* to implement fneg.

llvm-svn: 25857
2006-01-31 22:28:30 +00:00
Evan Cheng a91eb48547 Remove entries on fabs and fneg. These are done.
llvm-svn: 25856
2006-01-31 22:26:21 +00:00
Evan Cheng 32be2dc0af Allow the specification of explicit alignments for constant pool entries.
llvm-svn: 25855
2006-01-31 22:23:14 +00:00
Chris Lattner c642aa5e1c * Fix 80-column violations
* Rename hasSSE -> hasSSE1 to avoid my continual confusion with 'has any SSE'.
* Add inline asm constraint specification.

llvm-svn: 25854
2006-01-31 19:43:35 +00:00
Chris Lattner 0151361d21 add info about the inline asm register constraints for PPC
llvm-svn: 25853
2006-01-31 19:20:21 +00:00
Evan Cheng 2443ab932d Allow custom lowering of fabs. I forgot to check in this change, which
caused several test failures.

llvm-svn: 25852
2006-01-31 18:14:25 +00:00
Chris Lattner 0962ffc4a6 add a missing break that caused a lot of failures last night :(
llvm-svn: 25851
2006-01-31 17:20:06 +00:00
Nate Begeman a162f208ee Codegen
bool %test(int %X) {
  %Y = seteq int %X, 13
  ret bool %Y
}

as

_test:
        addi r2, r3, -13
        cntlzw r2, r2
        srwi r3, r2, 5
        blr

rather than

_test:
        cmpwi cr7, r3, 13
        mfcr r2
        rlwinm r3, r2, 31, 31, 31
        blr

This has very little effect on most code, but speeds up analyzer by 23% and
mason by 11%.
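
The trick: cntlzw returns 32 exactly when its input is zero, so (X == 13) can be computed branch-free as clz(X - 13) >> 5. A C++ rendering of the identity (using the GCC/Clang builtin, with the zero case handled explicitly because __builtin_clz(0) is undefined):

#include <cassert>
#include <cstdint>

static uint32_t clz32(uint32_t v) {  // cntlzw semantics: clz(0) == 32
  return v ? (uint32_t)__builtin_clz(v) : 32u;
}

static bool eq13(uint32_t x) {
  // clz32(x - 13) is 32 only when x == 13; ">> 5" maps 32 to 1, 0..31 to 0.
  return (clz32(x - 13u) >> 5) != 0;
}

int main() {
  for (uint32_t x : {0u, 12u, 13u, 14u, 0xFFFFFFFFu})
    assert(eq13(x) == (x == 13u));
  return 0;
}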

llvm-svn: 25848
2006-01-31 08:17:29 +00:00
Chris Lattner ac9892ccaf okay, one more
llvm-svn: 25847
2006-01-31 07:45:45 +00:00
Chris Lattner 882611dc25 another note
llvm-svn: 25846
2006-01-31 07:45:08 +00:00
Chris Lattner 24b0742476 More notes
llvm-svn: 25845
2006-01-31 07:43:33 +00:00
Chris Lattner 57480d0634 another one
llvm-svn: 25844
2006-01-31 07:38:32 +00:00
Chris Lattner 17cd988419 add a note
llvm-svn: 25843
2006-01-31 07:37:20 +00:00
Chris Lattner 799716141b add conditional moves of float and double values on int/fp condition codes.
llvm-svn: 25842
2006-01-31 07:26:55 +00:00
Chris Lattner b0fe138b65 example nate pointed out
llvm-svn: 25841
2006-01-31 07:16:34 +00:00
Chris Lattner 6f9bf658a7 treat conditional branches the same way as conditional moves (giving them
an operand that contains the condcode), making things significantly simpler.

llvm-svn: 25840
2006-01-31 06:56:30 +00:00
Chris Lattner 21ec192419 compactify all of the integer conditional moves into one instruction that takes
a CC as an operand.  Much smaller, much happier.

llvm-svn: 25839
2006-01-31 06:49:09 +00:00
Chris Lattner 196d58373c Add immediate forms of integer cmovs
llvm-svn: 25838
2006-01-31 06:24:29 +00:00
Chris Lattner 283492b4fe Shrinkify
llvm-svn: 25837
2006-01-31 06:18:16 +00:00
Chris Lattner 70c9e42593 Add the full complement of conditional moves of integer registers.
llvm-svn: 25834
2006-01-31 05:26:36 +00:00
Chris Lattner b6493b3165 Compile this:
void %X(int %A) {
        %C = setlt int %A, 123          ; <bool> [#uses=1]
        br bool %C, label %T, label %F

T:              ; preds = %0
        call int %main( int 0 )         ; <int>:0 [#uses=0]
        ret void

F:              ; preds = %0
        ret void
}

to this:

X:
        save -96, %o6, %o6
        subcc %i0, 122, %l0
        bg .LBBX_2      ! F
        nop
...

not this:

X:
        save -96, %o6, %o6
        sethi 0, %l0
        or %g0, 1, %l1
        subcc %i0, 122, %l2
        bg .LBBX_4      !
        nop
.LBBX_3:        !
        or %g0, %l0, %l1
.LBBX_4:        !
        subcc %l1, 0, %l0
        bne .LBBX_2     ! F
        nop

llvm-svn: 25833
2006-01-31 05:05:52 +00:00
Chris Lattner e9721b2984 Only insert an AND when converting from BR_COND to BRCC if needed.
llvm-svn: 25832
2006-01-31 05:04:52 +00:00
Evan Cheng 2dd217b88f Added custom lowering of fabs
llvm-svn: 25831
2006-01-31 03:14:29 +00:00
Chris Lattner a9bfca8d1e add the 'lucas' optimization
llvm-svn: 25830
2006-01-31 02:55:28 +00:00
Chris Lattner 0e70729e83 I don't see why this optimization isn't safe, but it isn't, so disable it
llvm-svn: 25829
2006-01-31 02:45:52 +00:00
Chris Lattner d916e78b0a Another high-prio selection performance bug
llvm-svn: 25828
2006-01-31 02:10:06 +00:00
Chris Lattner 2e56e89452 Handle physreg inputs/outputs. We now compile this:
int %test_cpuid(int %op) {
        %B = alloca int
        %C = alloca int
        %D = alloca int
        %A = call int asm "cpuid", "=eax,==ebx,==ecx,==edx,eax"(int* %B, int* %C, int* %D, int %op)
        %Bv = load int* %B
        %Cv = load int* %C
        %Dv = load int* %D
        %x = add int %A, %Bv
        %y = add int %x, %Cv
        %z = add int %y, %Dv
        ret int %z
}

to this:

_test_cpuid:
        sub %ESP, 16
        mov DWORD PTR [%ESP], %EBX
        mov %EAX, DWORD PTR [%ESP + 20]
        cpuid
        mov DWORD PTR [%ESP + 8], %ECX
        mov DWORD PTR [%ESP + 12], %EBX
        mov DWORD PTR [%ESP + 4], %EDX
        mov %ECX, DWORD PTR [%ESP + 12]
        add %EAX, %ECX
        mov %ECX, DWORD PTR [%ESP + 8]
        add %EAX, %ECX
        mov %ECX, DWORD PTR [%ESP + 4]
        add %EAX, %ECX
        mov %EBX, DWORD PTR [%ESP]
        add %ESP, 16
        ret

... note the proper register allocation.  :)

It is unclear to me why the loads aren't folded into the adds.

llvm-svn: 25827
2006-01-31 02:03:41 +00:00
Chris Lattner 2b70a6f853 more mumbling
llvm-svn: 25826
2006-01-31 00:45:37 +00:00
Chris Lattner b521361fb9 add some notes
llvm-svn: 25825
2006-01-31 00:20:38 +00:00
Evan Cheng 45df7f84ff Don't generate complex sequences for SETOLE, SETOLT, SETULT, and SETUGT. Flip
the order of the compare operands and generate SETOGT, SETOGE, SETUGE, and
SETULE instead.
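
The underlying identity: swapping the compare operands turns each "<" predicate into the corresponding ">" predicate without disturbing the ordered/unordered (NaN) behavior. A quick check in C++, with the helpers standing in for the DAG condition codes:

#include <cassert>
#include <cmath>

static bool OLT(double a, double b) { return a < b; }      // ordered and <
static bool OGT(double a, double b) { return a > b; }      // ordered and >
static bool ULT(double a, double b) { return !(a >= b); }  // unordered or <
static bool UGT(double a, double b) { return !(a <= b); }  // unordered or >

int main() {
  const double nan = std::nan("");
  for (double a : {1.0, 2.0, nan})
    for (double b : {1.0, 2.0, nan}) {
      assert(OLT(a, b) == OGT(b, a));  // e.g. SETOLT(a,b) == SETOGT(b,a)
      assert(ULT(a, b) == UGT(b, a));
    }
  return 0;
}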

llvm-svn: 25824
2006-01-30 23:41:35 +00:00
Chris Lattner 57ecb561c6 Print the most trivial inline asms.
llvm-svn: 25822
2006-01-30 23:00:08 +00:00
Chris Lattner f263a23735 Fix a bug in my legalizer reworking that caused the X86 backend to not get
a chance to custom legalize setcc, which broke a bunch of C++ Codes.
Testcase here: CodeGen/X86/2006-01-30-LongSetcc.ll

llvm-svn: 25821
2006-01-30 22:43:50 +00:00
Chris Lattner 9a90572374 Fix FP constants, and the SparcV8/2006-01-22-BitConvertLegalize.ll failure from last night
llvm-svn: 25819
2006-01-30 22:20:49 +00:00
Evan Cheng 08390f6a21 i64 -> f32, f32 -> i64 and some clean up.
llvm-svn: 25818
2006-01-30 22:13:22 +00:00
Evan Cheng 5b97fcf0f5 Always use FP stack instructions to perform i64 to f64 as well as f64 to i64
conversions. SSE does not have instructions to handle these tasks.

llvm-svn: 25817
2006-01-30 08:02:57 +00:00
Chris Lattner 37faeb2b02 Revamp the ICC/FCC reading instructions to be parameterized in terms of the
SPARC condition codes, not in terms of the DAG condcodes.  This allows us to
write nice clean patterns for cmovs/branches.

llvm-svn: 25815
2006-01-30 07:43:04 +00:00
Chris Lattner 33a79cae7c Compile:
uint %test(uint %X) {
        %Y = call uint %llvm.ctpop.i32(uint %X)
        ret uint %Y
}

to:

test:
        save -96, %o6, %o6
        sll %i0, 0, %l0
        popc %l0, %i0
        restore %g0, %g0, %g0
        retl
        nop

instead of to 40 logical ops.  Note the shift-by-zero that clears the top
part of the 64-bit V9 register.

Testcase here: CodeGen/SparcV8/ctpop.ll

llvm-svn: 25814
2006-01-30 06:14:02 +00:00
Chris Lattner 321e337d95 If the target has V9 instructions, this pass is a noop, so don't bother
running it.

llvm-svn: 25811
2006-01-30 05:51:14 +00:00
Chris Lattner 90d3fd9e7c When in v9 mode, emit fabsd/fnegd/fmovd
llvm-svn: 25810
2006-01-30 05:48:37 +00:00
Chris Lattner 99dcb95e14 First step towards V9 instructions in the V8 backend, two conditional move
patterns.  This allows emission of this code:

t1:
        save -96, %o6, %o6
        subcc %i0, %i1, %l0
        move %icc, %i0, %i2
        or %g0, %i2, %i0
        restore %g0, %g0, %g0
        retl
        nop

instead of this:

t1:
        save -96, %o6, %o6
        subcc %i0, %i1, %l0
        be .LBBt1_2     !
        nop
.LBBt1_1:       !
        or %g0, %i2, %i0
.LBBt1_2:       !
        restore %g0, %g0, %g0
        retl
        nop

for this:

int %t1(int %a, int %b, int %c) {
        %tmp.2 = seteq int %a, %b
        %tmp3 = select bool %tmp.2, int %a, int %c
        ret int %tmp3
}

llvm-svn: 25809
2006-01-30 05:35:57 +00:00
Chris Lattner 238fe93242 Two changes:
1. Default to having V9 instructions, instead of just V8.
2. Unless -enable-sparc-v9-insts is passed, disable V9 (for use with llcbeta)

llvm-svn: 25807
2006-01-30 04:57:43 +00:00
Chris Lattner af209b8b13 When lowering SELECT_CC, see if the input is a lowered SETCC. If so, fold
the two operations together.  This allows us to compile this:

void %two(int %a, int* %b) {
        %tmp.2 = seteq int %a, 0
        %tmp.0.0 = select bool %tmp.2, int 10, int 20
        store int %tmp.0.0, int* %b
        ret void
}

into:

two:
        save -96, %o6, %o6
        or %g0, 20, %l0
        or %g0, 10, %l1
        subcc %i0, 0, %l2
        be .LBBtwo_2    ! entry
        nop
.LBBtwo_1:      ! entry
        or %g0, %l0, %l1
.LBBtwo_2:      ! entry
        st %l1, [%i1]
        restore %g0, %g0, %g0
        retl
        nop

instead of:

two:
        save -96, %o6, %o6
        sethi 0, %l0
        or %g0, 1, %l1
        or %g0, 20, %l2
        or %g0, 10, %l3
        subcc %i0, 0, %l4
        be .LBBtwo_2    ! entry
        nop
.LBBtwo_1:      ! entry
        or %g0, %l0, %l1
.LBBtwo_2:      ! entry
        subcc %l1, 0, %l0
        bne .LBBtwo_4   ! entry
        nop
.LBBtwo_3:      ! entry
        or %g0, %l2, %l3
.LBBtwo_4:      ! entry
        st %l3, [%i1]
        restore %g0, %g0, %g0
        retl
        nop

llvm-svn: 25806
2006-01-30 04:34:44 +00:00
Jeff Cohen baeb39c969 Add an AddSymbol() method to DynamicLibrary to work around the Windows
limitation of being unable to search for symbols in an EXE.  It will also allow other
existing hacks to be improved.
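
Usage would look roughly like this (a sketch: the header path and exact signature are assumed from later LLVM trees and may have differed at the time):

#include "llvm/System/DynamicLibrary.h"  // assumed header location

extern "C" int MyHelper(int X) { return X + 1; }

void RegisterHelper() {
  // Pre-register the symbol explicitly, since Windows cannot search the
  // running EXE for symbols the way dlsym(NULL, ...) can on Unix.
  llvm::sys::DynamicLibrary::AddSymbol("MyHelper", (void*)&MyHelper);
}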

llvm-svn: 25805
2006-01-30 04:33:51 +00:00
Chris Lattner d6f5ae4455 don't insert an and node if it isn't needed here, since it can prevent folding
of lowered target nodes.

llvm-svn: 25804
2006-01-30 04:22:28 +00:00
Chris Lattner f0b24d2dc0 Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface, making isMaskedValueZeroForTargetNode simpler and usable from other parts of the compiler.
llvm-svn: 25803
2006-01-30 04:09:27 +00:00
Chris Lattner 4ac0fa2aa5 Implement isMaskedValueZeroForTargetNode for the various v8 selectcc nodes,
allowing redundant and's to be eliminated by the dag combiner.

llvm-svn: 25800
2006-01-30 03:51:45 +00:00
Chris Lattner 3b40e64aa3 pass the address of MaskedValueIsZero into isMaskedValueZeroForTargetNode,
to permit recursion

llvm-svn: 25799
2006-01-30 03:49:37 +00:00
Chris Lattner c6fa0282d2 adjust prototype
llvm-svn: 25798
2006-01-30 03:49:07 +00:00
Jeff Cohen 8ee89c774b Fix indentation.
llvm-svn: 25795
2006-01-29 22:02:52 +00:00
Chris Lattner 4d1ea71a31 Fix RET of promoted values on targets that custom expand RET to a target node.
llvm-svn: 25794
2006-01-29 21:02:23 +00:00
Chris Lattner 32058cfb7b Functions that are lazily streamed in from the .bc file are *not* external.
This fixes llvm-test/SingleSource/UnitTests/2006-01-29-SimpleIndirectCall.c
and PR704

llvm-svn: 25793
2006-01-29 20:49:17 +00:00
Chris Lattner 3c6a950653 add another note
llvm-svn: 25789
2006-01-29 09:46:06 +00:00
Chris Lattner dabee1f655 add some performance notes from looking at sgefa
llvm-svn: 25788
2006-01-29 09:42:20 +00:00
Chris Lattner 7c7cbde0e5 add a high-priority SSE issue from sgefa
llvm-svn: 25787
2006-01-29 09:14:47 +00:00
Chris Lattner 5a7a22c9dd add a missed optimization
llvm-svn: 25786
2006-01-29 09:08:15 +00:00
Chris Lattner 2c748afd6c cleanups to the ValueTypeActions interface
llvm-svn: 25785
2006-01-29 08:42:06 +00:00
Chris Lattner 3072af4d4f Now that OpActions is big enough, we can specify actions for vector types
llvm-svn: 25784
2006-01-29 08:41:37 +00:00
Chris Lattner 8a4a3deaf9 clean up interface to ValueTypeActions
llvm-svn: 25783
2006-01-29 08:41:12 +00:00
Chris Lattner ccb4476c87 Remove some special case hacks for CALLSEQ_*, using UpdateNodeOperands
instead.

llvm-svn: 25780
2006-01-29 07:58:15 +00:00
Chris Lattner d7738e6b32 disable this for now
llvm-svn: 25778
2006-01-29 07:31:33 +00:00
Reid Spencer 0c05a2c99c Add a note about lowering llvm.memset, llvm.memcpy, and llvm.memmove to a
few stores under certain conditions.
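
A hand-written rendering of the idea (illustrative only; the signature below is not an LLVM API): an llvm.memcpy with a small constant length and known alignment can become a couple of load/store pairs instead of a library call.

#include <cstdint>

// What lowering an 8-byte, 4-byte-aligned llvm.memcpy to stores amounts to.
void copy8(uint32_t *Dst, const uint32_t *Src) {
  Dst[0] = Src[0];  // first 4-byte load/store pair
  Dst[1] = Src[1];  // second 4-byte load/store pair
}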

llvm-svn: 25777
2006-01-29 06:48:25 +00:00
Chris Lattner 35d20a4c00 remove now-dead code; the legalizer takes care of this for us
llvm-svn: 25776
2006-01-29 06:45:31 +00:00
Chris Lattner 132177e103 The FP stack doesn't support UNDEF; ask the legalizer to legalize it
instead of lying and saying we have it.

llvm-svn: 25775
2006-01-29 06:44:22 +00:00
Chris Lattner 2f292789dc Allow custom expansion of ConstantVec nodes. PPC will use this in the future.
llvm-svn: 25774
2006-01-29 06:34:16 +00:00