Commit Graph

153 Commits

Author SHA1 Message Date
Evan Cheng 9fee442e63 X86 integer register classes naming changes. Make them consistent with FP, vector classes.
llvm-svn: 28324
2006-05-16 07:21:53 +00:00
Chris Lattner b19ce6c810 More coverity fixes
llvm-svn: 28266
2006-05-12 21:14:20 +00:00
Evan Cheng 9733bde74c Fixing truncate. Previously we were emitting truncate from r16 to r8 as
movw. That is we promote the destination operand to r16. So
        %CH = TRUNC_R16_R8 %BP
is emitted as
        movw %bp, %cx.

This is incorrect. If %cl is live, it would be clobbered.
Ideally we want to do the opposite, that is emitted it as
        movb ??, %ch
But this is not possible since %bp does not have a r8 sub-register.

We are now defining a new register class R16_ which is a subclass of R16
containing only those 16-bit registers that have r8 sub-registers (i.e.
AX - DX). We isel the truncate to two instructions, a MOV16to16_ to copy the
value to the R16_ class, followed by a TRUNC_R16_R8.

Due to bug 770, the register colaescer is not going to coalesce between R16 and
R16_. That will be fixed later so we can eliminate the MOV16to16_. Right now, it
can only be eliminated if we are lucky that source and destination registers are
the same.

llvm-svn: 28164
2006-05-08 08:01:26 +00:00
Evan Cheng ddb6cc1d8e Better implementation of truncate. ISel matches it to a pseudo instruction
that gets emitted as movl (for r32 to i16, i8) or a movw (for r16 to i8). And
if the destination gets allocated a subregister of the source operand, then
the instruction will not be emitted at all.

llvm-svn: 28119
2006-05-05 05:40:20 +00:00
Chris Lattner 469647bf38 Remove and simplify some more machineinstr/machineoperand stuff.
llvm-svn: 28105
2006-05-04 18:16:01 +00:00
Chris Lattner 10d6341618 Move some methods out of MachineInstr into MachineOperand
llvm-svn: 28102
2006-05-04 17:52:23 +00:00
Chris Lattner fef7a2d0f5 There shalt be only one "immediate" operand type!
llvm-svn: 28099
2006-05-04 17:21:20 +00:00
Chris Lattner 940cc978ef Remove a bunch more SparcV9 specific stuff
llvm-svn: 28093
2006-05-04 01:15:02 +00:00
Evan Cheng f0157cb0bc Use movaps instead of movapd for spill / restore.
llvm-svn: 28005
2006-04-28 02:23:35 +00:00
Evan Cheng ab0ee6340c MakeMIInst() should handle jump table index operands.
llvm-svn: 27955
2006-04-24 05:37:35 +00:00
Evan Cheng 3823aa1d0f - PEXTRW cannot take a memory location as its first source operand.
- PINSRWrmi encoding bug.

llvm-svn: 27818
2006-04-18 21:59:43 +00:00
Evan Cheng 43f4ef4ffb SHUFP{S|D}, PSHUF* encoding bugs. Left out the mask immediate operand.
llvm-svn: 27817
2006-04-18 21:56:36 +00:00
Evan Cheng 09e36ef710 Encoding bug: CMPPSrmi, CMPPDrmi dropped operand 2 (condtion immediate).
llvm-svn: 27815
2006-04-18 21:31:08 +00:00
Evan Cheng bf0d13c54f Incorrect foldMemoryOperand entries
llvm-svn: 27763
2006-04-17 18:06:12 +00:00
Evan Cheng 685ddd8152 Can't fold loads into alias vector SSE ops used for scalar operation. The load
address has to be 16-byte aligned but the values aren't spilled to 128-bit
locations.

llvm-svn: 27732
2006-04-16 06:58:19 +00:00
Evan Cheng 0ba896c75b Added SSE (and other) entries to foldMemoryOperand().
llvm-svn: 27716
2006-04-14 23:33:27 +00:00
Evan Cheng e349d01acf We were not adjusting the frame size to ensure proper alignment when alloca /
vla are present in the function. This causes a crash when a leaf function
allocates space on the stack used to store / load with 128-bit SSE
instructions.

llvm-svn: 27698
2006-04-14 07:26:43 +00:00
Evan Cheng c9ed8e4c1a Use movaps to do VR128 reg-to-reg copies for now. It's shorter and available for SSE1.
llvm-svn: 27554
2006-04-10 07:21:31 +00:00
Jim Laskey 2d7298c362 Foundation for call frame information.
llvm-svn: 27491
2006-04-07 16:34:46 +00:00
Evan Cheng 8f3b6b8d8a Minor fixes + naming changes.
llvm-svn: 27410
2006-04-04 19:12:30 +00:00
Jim Laskey d1aa1638c6 Expose base register for DwarfWriter. Refactor code accordingly.
llvm-svn: 27225
2006-03-28 13:48:33 +00:00
Jim Laskey fa53b276d0 Translate llvm target registers to dwarf register numbers properly.
llvm-svn: 27180
2006-03-27 20:18:45 +00:00
Jim Laskey 3c43609f1f Add support to locate local variables in frames (early version.)
llvm-svn: 26994
2006-03-23 18:12:57 +00:00
Evan Cheng 9bf978dc20 Use the generic vector register classes VR64 / VR128 rather than V4F32,
V8I16, etc.

llvm-svn: 26838
2006-03-18 01:23:20 +00:00
Evan Cheng bfc2e97383 Also fold MOV8r0, MOV16r0, MOV32r0 + store to MOV8mi, MOV16mi, and MOV32mi.
llvm-svn: 26817
2006-03-17 02:36:22 +00:00
Evan Cheng aca7915b70 Add some missing entries to X86RegisterInfo::foldMemoryOperand(). e.g.
ADD32ri8.

llvm-svn: 26816
2006-03-17 02:25:01 +00:00
Evan Cheng 42d5ac557c Fix an obvious bug exposed when we are doing
ADD X, 4
==>
MOV32ri $X+4, ...

llvm-svn: 26366
2006-02-25 01:37:02 +00:00
Evan Cheng fa57a0add9 Added SSE2 128-bit integer packed types: V16I8, V8I16, V4I32, and V2I64.
Added generic vector types: VR64 and VR128.

llvm-svn: 26295
2006-02-21 01:38:21 +00:00
Evan Cheng 43070b7541 Added x86 integer vector types: 64-bit packed byte integer (v16i8), 64-bit
packed word integer (v8i16), and 64-bit packed doubleword integer (v2i32).

llvm-svn: 26294
2006-02-20 22:34:53 +00:00
Evan Cheng 24c461b51e 1. Use pxor instead of xoraps / xorapd to clear FR32 / FR64 registers. This
proves to be worth 20% on Ptrdist/ks. Might be related to dependency
   breaking support.
2. Added FsMOVAPSrr and FsMOVAPDrr as aliases to MOVAPSrr and MOVAPDrr. These
   are used for FR32 / FR64 reg-to-reg copies.
3. Tell reg-allocator to generate MOVSSrm / MOVSDrm and MOVSSmr / MOVSDmr to
   spill / restore FsMOVAPSrr and FsMOVAPDrr.

llvm-svn: 26241
2006-02-16 22:45:17 +00:00
Evan Cheng 3f99628939 Use movaps / movapd to spill / restore V4F4 / V2F8 registers.
llvm-svn: 26240
2006-02-16 21:20:26 +00:00
Evan Cheng ae82498e81 Use movaps / movapd (instead of movss / movsd) to do FR32 / FR64 reg to reg
transfer.

According to the Intel P4 Optimization Manual:

Moves that write a portion of a register can introduce unwanted
dependences. The movsd reg, reg instruction writes only the bottom
64 bits of a register, not to all 128 bits. This introduces a dependence on
the preceding instruction that produces the upper 64 bits (even if those
bits are not longer wanted). The dependence inhibits register renaming,
and thereby reduces parallelism.

Not to mention movaps is shorter than movss.

llvm-svn: 26226
2006-02-16 01:50:02 +00:00
Chris Lattner c408558638 When rewriting frame instructions, emit the appropriate small-immediate
instruction when possible.

llvm-svn: 25938
2006-02-03 18:20:04 +00:00
Chris Lattner bb53acd03c Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo,a far more logical place. Other methods should also be moved if anyoneis interested. :)
llvm-svn: 25913
2006-02-02 20:12:32 +00:00
Chris Lattner 246ee44c8f implement isStoreToStackSlot
llvm-svn: 25911
2006-02-02 20:00:41 +00:00
Evan Cheng f1ed826c2a Added SSE entries to foldMemoryOperand().
llvm-svn: 25888
2006-02-01 23:02:25 +00:00
Evan Cheng 9c249c37f8 Support for ADD_PARTS, SUB_PARTS, SHL_PARTS, SHR_PARTS, and SRA_PARTS.
llvm-svn: 25158
2006-01-09 18:33:28 +00:00
Evan Cheng 172fce7050 * Fast call support.
* FP cmp, setcc, etc.

llvm-svn: 25117
2006-01-06 00:43:03 +00:00
Evan Cheng 782b654e6f Let the helper functions know about X86::FR32RegClass and X86::FR64RegClass.
llvm-svn: 25004
2005-12-24 09:48:35 +00:00
Evan Cheng 9ae486047e * Removed the use of FLAG. Now use hasFlagIn and hasFlagOut instead.
* Added a pseudo instruction (for each target) that represent "return void".
  This is a workaround for lack of optional flag operand (return void is not
  lowered so it does not have a flag operand.)

llvm-svn: 24997
2005-12-23 22:14:32 +00:00
Chris Lattner f431ad4477 Rewrite FP stackifier support in the X86InstrInfo.td file, splitting patterns
that were overloaded to work before and after the stackifier runs.  With the
new clean world, it is possible to write patterns for these instructions: woo!

This also adds a few simple patterns here and there, though there are a lot
still missing.  These should be easy to add though. :)

See the comments under "Floating Point Stack Support" for more details on
the new world order.

This patch as absolutely no effect on the generated code, woo!

llvm-svn: 24899
2005-12-21 07:47:04 +00:00
Nate Begeman 9d7008b08d Properly split f32 and f64 into separate register classes for scalar sse fp
fixing a bunch of nasty hackery

llvm-svn: 23735
2005-10-14 22:06:00 +00:00
Chris Lattner bb1c9ecb17 simplify this code using the new regclass info passed in
llvm-svn: 23557
2005-09-30 17:12:38 +00:00
Chris Lattner a654525c1c Pass extra regclasses into spilling code
llvm-svn: 23537
2005-09-30 01:29:42 +00:00
Chris Lattner de3c87a2ab Implement the isLoadFromStackSlot interface
llvm-svn: 23387
2005-09-19 05:23:44 +00:00
Chris Lattner 8ad3700a3e The simple isel being gone makes this dead!
llvm-svn: 22914
2005-08-19 18:32:03 +00:00
Jeff Cohen 5f4ef3c5a8 Eliminate all remaining tabs and trailing spaces.
llvm-svn: 22523
2005-07-27 06:12:32 +00:00
Nate Begeman 8a0933608a First round of support for doing scalar FP using the SSE2 ISA extension and
XMM registers.  There are many known deficiencies and fixmes, which will be
addressed ASAP.  The major benefit of this work is that it will allow the
LLVM register allocator to allocate FP registers across basic blocks.

The x86 backend will still default to x87 style FP.  To enable this work,
you must pass -enable-sse-scalar-fp and either -sse2 or -sse3 to llc.

An example before and after would be for:
double foo(double *P) { double Sum = 0; int i; for (i = 0; i < 1000; ++i)
                        Sum += P[i]; return Sum; }

The inner loop looks like the following:
x87:
.LBB_foo_1:     # no_exit
        fldl (%esp)
        faddl (%eax,%ecx,8)
        fstpl (%esp)
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

SSE2:
        addsd (%eax,%ecx,8), %xmm0
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

llvm-svn: 22340
2005-07-06 18:59:04 +00:00
Chris Lattner 97e3b65652 Teach reginfo how to deal with ADJSTACKPTRri, allowing us to generate:
add %ESP, 20
        jmp %EDX  # TAIL CALL

instead of:
        add %ESP, -8
        add %ESP, 28
        jmp %EDX  # TAIL CALL

llvm-svn: 22047
2005-05-15 05:49:58 +00:00
Chris Lattner 5366c859a7 When emitting the function epilog, check to see if there already a stack
adjustment.  If so, we merge the adjustment into the existing one.  This
allows us to generate:

caller2:
        sub %ESP, 12
        mov DWORD PTR [%ESP], 0
        mov %EAX, 1234567890
        mov %EDX, 0
        call func2
        add %ESP, 8
        ret 4

intead of:

caller2:
        sub %ESP, 12
        mov DWORD PTR [%ESP], 0
        mov %EAX, 1234567890
        mov %EDX, 0
        call func2
        sub %ESP, 4
        add %ESP, 12
        ret 4

for X86/fast-cc-merge-stack-adj.ll

llvm-svn: 22038
2005-05-14 23:53:43 +00:00