Commit Graph

783 Commits

Author SHA1 Message Date
Evan Cheng 82241c86e9 - FCOPYSIGN custom lowering bug. Clear the sign bit of operand 0 first before
or'ing in the sign bit of operand 1.
- Tweaking: rather than left shift the sign bit, fp_extend operand 1 first
  before taking its sign bit if its type is smaller than that of operand 0.

llvm-svn: 32932
2007-01-05 21:37:56 +00:00
Evan Cheng 4363e884c0 With SSE2, expand FCOPYSIGN to a series of SSE bitwise operations.
llvm-svn: 32900
2007-01-05 07:55:56 +00:00
Evan Cheng ae1cd75af7 - Use a different wrapper node for RIP-relative GV, etc.
- Proper support for both small static and PIC modes under X86-64
- Some (non-optimal) support for medium modes.

llvm-svn: 32046
2006-11-30 21:55:46 +00:00
Evan Cheng 49683ba236 Don't dag combine floating point select to max and min intrinsics. Those
take v4f32 / v2f64 operands and may end up causing larger spills / restores.
Added X86 specific nodes X86ISD::FMAX, X86ISD::FMIN instead.

This fixes PR996.

llvm-svn: 31645
2006-11-10 21:43:37 +00:00
Evan Cheng 922e191116 Fixed a bug which causes x86 be to incorrectly match
shuffle v, undef, <2, ?, 3, ?>
to movhlps
It should match to unpckhps instead.

Added proper matching code for
shuffle v, undef, <2, 3, 2, 3>

llvm-svn: 31519
2006-11-07 22:14:24 +00:00
Chris Lattner 44daa50bed allow the address of a global to be used with the "i" constraint when in
-static mode.  This implements PR882.

llvm-svn: 31326
2006-10-31 20:13:11 +00:00
Evan Cheng e056dd5928 Fixed a significant bug where unpcklpd is incorrectly used to extract element 1 from a v2f64 value.
llvm-svn: 31228
2006-10-27 21:08:32 +00:00
Chris Lattner c0fb567e23 Implement branch analysis/xform hooks required by the branch folding pass.
llvm-svn: 31065
2006-10-20 17:42:20 +00:00
Chris Lattner f4aeff00c2 fit in 80 cols
llvm-svn: 31039
2006-10-18 18:26:48 +00:00
Chris Lattner d9e4bf5285 update comments
llvm-svn: 30663
2006-09-28 23:33:12 +00:00
Anton Korobeynikov 3c5b3df6a0 Adding codegeneration for StdCall & FastCall calling conventions
llvm-svn: 30549
2006-09-20 22:03:51 +00:00
Evan Cheng 4259a0f654 X86ISD::CMP now produces a chain as well as a flag. Make that the chain
operand of a conditional branch to allow load folding into CMP / TEST
instructions.

llvm-svn: 30241
2006-09-11 02:19:56 +00:00
Evan Cheng 11b0a5dbd4 Committing X86-64 support.
llvm-svn: 30177
2006-09-08 06:48:29 +00:00
Chris Lattner 524129dd64 Fix PR850 and CodeGen/X86/2006-07-31-SingleRegClass.ll.
The CFE refers to all single-register constraints (like "A") by their 16-bit
name, even though the 8 or 32-bit version of the register may be needed.
The X86 backend should realize what is going on and redecode the name back
to its proper form.

llvm-svn: 29420
2006-07-31 23:26:50 +00:00
Chris Lattner 298ef37e02 Implement the inline asm 'A' constraint. This implements PR825 and
CodeGen/X86/2006-07-10-InlineAsmAConstraint.ll

llvm-svn: 29101
2006-07-11 02:54:03 +00:00
Evan Cheng 5987cfb7b1 X86 target specific DAG combine: turn build_vector (load x), (load x+4),
(load x+8), (load x+12), <0, 1, 2, 3> to a single 128-bit load (aligned and
unaligned).

e.g.

__m128 test(float a, float b, float c, float d) {
  return _mm_set_ps(d, c, b, a);
}

_test:
        movups 4(%esp), %xmm0
        ret

llvm-svn: 29042
2006-07-07 08:33:52 +00:00
Evan Cheng 38c5aee959 Simplify X86CompilationCallback: always align to 16-byte boundary; don't save EAX/EDX if unnecessary.
llvm-svn: 28910
2006-06-24 08:36:10 +00:00
Evan Cheng 2a33094284 Switch X86 over to a call-selection model where the lowering code creates
the copyto/fromregs instead of making the X86ISD::CALL selection code create
them.

llvm-svn: 28463
2006-05-25 00:59:30 +00:00
Chris Lattner aa2372562e Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov!  This is a step towards closing PR786.

llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Evan Cheng 17e734f0a6 Remove PreprocessCCCArguments and PreprocessFastCCArguments now that
FORMAL_ARGUMENTS nodes include a token operand.

llvm-svn: 28439
2006-05-23 21:06:34 +00:00
Chris Lattner 8be5be817c Implement an annoying part of the Darwin/X86 abi: the callee of a struct
return argument pops the hidden struct pointer if present, not the caller.

For example, in this testcase:

struct X { int D, E, F, G; };
struct X bar() {
  struct X a;
  a.D = 0;
  a.E = 1;
  a.F = 2;
  a.G = 3;
  return a;
}
void foo(struct X *P) {
  *P = bar();
}

We used to emit:

_foo:
        subl $28, %esp
        movl 32(%esp), %eax
        movl %eax, (%esp)
        call _bar
        addl $28, %esp
        ret
_bar:
        movl 4(%esp), %eax
        movl $0, (%eax)
        movl $1, 4(%eax)
        movl $2, 8(%eax)
        movl $3, 12(%eax)
        ret

This is correct on Linux/X86 but not Darwin/X86.  With this patch, we now
emit:

_foo:
        subl $28, %esp
        movl 32(%esp), %eax
        movl %eax, (%esp)
        call _bar
***     addl $24, %esp
        ret
_bar:
        movl 4(%esp), %eax
        movl $0, (%eax)
        movl $1, 4(%eax)
        movl $2, 8(%eax)
        movl $3, 12(%eax)
***     ret $4

For the record, GCC emits (which is functionally equivalent to our new code):

_bar:
        movl    4(%esp), %eax
        movl    $3, 12(%eax)
        movl    $2, 8(%eax)
        movl    $1, 4(%eax)
        movl    $0, (%eax)
        ret     $4
_foo:
        pushl   %esi
        subl    $40, %esp
        movl    48(%esp), %esi
        leal    16(%esp), %eax
        movl    %eax, (%esp)
        call    _bar
        subl    $4, %esp
        movl    16(%esp), %eax
        movl    %eax, (%esi)
        movl    20(%esp), %eax
        movl    %eax, 4(%esi)
        movl    24(%esp), %eax
        movl    %eax, 8(%esi)
        movl    28(%esp), %eax
        movl    %eax, 12(%esi)
        addl    $40, %esp
        popl    %esi
        ret

This fixes SingleSource/Benchmarks/CoyoteBench/fftbench with LLC and the
JIT, and fixes the X86-backend portion of PR729.  The CBE still needs to
be updated.

llvm-svn: 28438
2006-05-23 18:50:38 +00:00
Evan Cheng 8c6b234ce8 Should pass by reference.
llvm-svn: 28357
2006-05-17 19:07:40 +00:00
Evan Cheng 48940d16b2 - Clean up formal argument lowering code. Prepare for vector pass by value work.
- Fixed vararg support.

llvm-svn: 27985
2006-04-27 01:32:22 +00:00
Evan Cheng e0bcfbe811 Switching over FORMAL_ARGUMENTS mechanism to lower call arguments.
llvm-svn: 27975
2006-04-26 01:20:17 +00:00
Evan Cheng a9467aab0a Separate LowerOperation() into multiple functions, one per opcode.
llvm-svn: 27972
2006-04-25 20:13:52 +00:00
Evan Cheng e8b5180044 Now generating perfect (I think) code for "vector set" with a single non-zero
scalar value.

e.g.
        _mm_set_epi32(0, a, 0, 0);
==>
	movd 4(%esp), %xmm0
	pshufd $69, %xmm0, %xmm0

        _mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
	movzbw 4(%esp), %ax
	movzwl %ax, %eax
	pxor %xmm0, %xmm0
	pinsrw $5, %eax, %xmm0

llvm-svn: 27923
2006-04-21 01:05:10 +00:00
Evan Cheng 60f0b8998e - Added support to turn "vector clear elements", e.g. pand V, <-1, -1, 0, -1>
to a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen
of vector shuffle with zero (or any splat) vector.

llvm-svn: 27875
2006-04-20 08:58:49 +00:00
Evan Cheng 7855e4d032 Commute vector_shuffle to match more movlhps, movlp{s|d} cases.
llvm-svn: 27840
2006-04-19 20:35:22 +00:00
Evan Cheng 5d247f81c1 Last few SSE3 intrinsics.
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Evan Cheng 12ba3e23d0 Added support for _mm_move_ss and _mm_move_sd.
llvm-svn: 27575
2006-04-11 00:19:04 +00:00
Evan Cheng c995b45f67 - movlp{s|d} and movhp{s|d} support.
- Normalize shuffle nodes so result vector lower half elements come from the
  first vector, the rest come from the second vector. (Except for the
  exceptions :-).
- Other minor fixes.

llvm-svn: 27474
2006-04-06 23:23:56 +00:00
Evan Cheng 780382946e Support for comi / ucomi intrinsics.
llvm-svn: 27444
2006-04-05 23:38:46 +00:00
Evan Cheng f3b52c84ea Handle canonical form of e.g.
vector_shuffle v1, v1, <0, 4, 1, 5, 2, 6, 3, 7>

This is turned into
vector_shuffle v1, <undef>, <0, 0, 1, 1, 2, 2, 3, 3>
by dag combiner.

It would match a {p}unpckl on x86.

llvm-svn: 27437
2006-04-05 07:20:06 +00:00
Evan Cheng 5fd7c69473 Use a X86 target specific node X86ISD::PINSRW instead of a mal-formed
INSERT_VECTOR_ELT to insert a 16-bit value in a 128-bit vector.

llvm-svn: 27314
2006-03-31 21:55:24 +00:00
Evan Cheng cbffa4656b Add support to use pextrw and pinsrw to extract and insert a word element
from a 128-bit vector.

llvm-svn: 27304
2006-03-31 19:22:53 +00:00
Evan Cheng b7fedffc78 - Added some SSE2 128-bit packed integer ops.
- Added SSE2 128-bit integer pack with signed saturation ops.
- Added pshufhw and pshuflw ops.

llvm-svn: 27252
2006-03-29 23:07:14 +00:00
Evan Cheng 1a194a5264 * Prefer using operation of matching types. e.g unpcklpd rather than movlhps.
* Bug fixes.

llvm-svn: 27218
2006-03-28 06:50:32 +00:00
Evan Cheng 2bc3280659 - Clean up / consoladate various shuffle masks.
- Some misc. bug fixes.
- Use MOVHPDrm to load from m64 to upper half of a XMM register.

llvm-svn: 27210
2006-03-28 02:43:26 +00:00
Evan Cheng 5df75889db Model unpack lower and interleave as vector_shuffle so we can lower the
intrinsics as such.

llvm-svn: 27200
2006-03-28 00:39:58 +00:00
Evan Cheng ed6184aef2 Remove X86:isZeroVector, use ISD::isBuildVectorAllZeros instead; some fixes / cleanups
llvm-svn: 27150
2006-03-26 09:53:12 +00:00
Evan Cheng 2bc0941e2a Build arbitrary vector with more than 2 distinct scalar elements with a
series of unpack and interleave ops.

llvm-svn: 27119
2006-03-25 09:37:23 +00:00
Evan Cheng e7ee6a5e32 Support for scalar to vector with zero extension.
llvm-svn: 27091
2006-03-24 23:15:12 +00:00
Evan Cheng 082c8785ef Handle BUILD_VECTOR with all zero elements.
llvm-svn: 27056
2006-03-24 07:29:27 +00:00
Evan Cheng 2595a687da More efficient v2f64 shuffle using movlhps, movhlps, unpckhpd, and unpcklpd.
llvm-svn: 27040
2006-03-24 02:58:06 +00:00
Evan Cheng d27fb3e85e Handle more shuffle cases with SHUFP* instructions.
llvm-svn: 27024
2006-03-24 01:18:28 +00:00
Evan Cheng 021bb7c956 Added a ValueType operand to isShuffleMaskLegal(). For now, x86 will not do
64-bit vector shuffle.

llvm-svn: 26964
2006-03-22 22:07:06 +00:00
Evan Cheng 68ad48bd1a - Implement X86ISelLowering::isShuffleMaskLegal(). We currently only support
splat and PSHUFD cases.
- Clean up shuffle / splat matching code.

llvm-svn: 26954
2006-03-22 18:59:22 +00:00
Evan Cheng 8fdbdf20cd - VECTOR_SHUFFLE of v4i32 / v4f32 with undef second vector always matches
PSHUFD. We can make permutes entries which point to the undef pointing
  anything we want.
- Change some names to appease Chris.

llvm-svn: 26951
2006-03-22 08:01:21 +00:00
Evan Cheng d097e67544 Some splat and shuffle support.
llvm-svn: 26940
2006-03-22 02:53:00 +00:00
Evan Cheng d5e905d762 - Use movaps to store 128-bit vector integers.
- Each scalar to vector v8i16 and v16i8 is a any_extend followed by a movd.

llvm-svn: 26932
2006-03-21 23:01:21 +00:00
Evan Cheng 2dd2c652b2 Added getTargetLowering() to TargetMachine. Refactored targets to support this.
llvm-svn: 26742
2006-03-13 23:20:37 +00:00
Evan Cheng e0ed6ec13f - Clean up the lowering and selection code of ConstantPool, GlobalAddress,
and ExternalSymbol.
- Use C++ code (rather than tblgen'd selection code) to match the above
  mentioned leaf nodes. Do not mutate and nodes and do not record the
  selection in CodeGenMap. These nodes should be safe to duplicate. This is
  a performance win.

llvm-svn: 26335
2006-02-23 20:41:18 +00:00
Evan Cheng 1f342c2884 PIC related bug fixes.
1. Various asm printer bug.
2. Lowering bug. Now TargetGlobalAddress is wrapped in X86ISD::TGAWrapper.

llvm-svn: 26324
2006-02-23 02:43:52 +00:00
Chris Lattner 7ad77dfc2a split register class handling from explicit physreg handling.
llvm-svn: 26308
2006-02-22 00:56:39 +00:00
Chris Lattner 7bb4696dc3 Updates to match change of getRegForInlineAsmConstraint prototype
llvm-svn: 26305
2006-02-21 23:11:00 +00:00
Evan Cheng 5588de9415 x86 / Darwin PIC support.
llvm-svn: 26273
2006-02-18 00:15:05 +00:00
Nate Begeman 5965bd19f8 kill ADD_PARTS & SUB_PARTS and replace them with fancy new ADDC, ADDE, SUBC
and SUBE nodes that actually expose what's going on and allow for
significant simplifications in the targets.

llvm-svn: 26255
2006-02-17 05:43:56 +00:00
Nate Begeman 8a77efe4f7 Rework the SelectionDAG-based implementations of SimplifyDemandedBits
and ComputeMaskedBits to match the new improved versions in instcombine.
Tested against all of multisource/benchmarks on ppc.

llvm-svn: 26238
2006-02-16 21:11:51 +00:00
Evan Cheng 11613a5219 Separate FILD and FILD_FLAG, the later is only used for SSE2. It produces a
flag so it can be flagged to a FST.

llvm-svn: 25953
2006-02-04 02:20:30 +00:00
Evan Cheng 72d5c256c9 - Allow XMM load (for scalar use) to be folded into ANDP* and XORP*.
- Use XORP* to implement fneg.

llvm-svn: 25857
2006-01-31 22:28:30 +00:00
Chris Lattner c642aa5e1c * Fix 80-column violations
* Rename hasSSE -> hasSSE1 to avoid my continual confusion with 'has any SSE'.
* Add inline asm constraint specification.

llvm-svn: 25854
2006-01-31 19:43:35 +00:00
Evan Cheng 2dd217b88f Added custom lowering of fabs
llvm-svn: 25831
2006-01-31 03:14:29 +00:00
Evan Cheng 5b97fcf0f5 Always use FP stack instructions to perform i64 to f64 as well as f64 to i64
conversions. SSE does not have instructions to handle these tasks.

llvm-svn: 25817
2006-01-30 08:02:57 +00:00
Chris Lattner f0b24d2dc0 Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface,making isMaskedValueZeroForTargetNode simpler, and useable from other partsof the compiler.
llvm-svn: 25803
2006-01-30 04:09:27 +00:00
Chris Lattner c6fa0282d2 adjust prototype
llvm-svn: 25798
2006-01-30 03:49:07 +00:00
Nate Begeman 8c47c3a3b1 Remove TLI.LowerReturnTo, and just let targets custom lower ISD::RET for
the same functionality.  This addresses another piece of bug 680.  Next,
on to fixing Alpha VAARG, which I broke last time.

llvm-svn: 25696
2006-01-27 21:09:22 +00:00
Evan Cheng cde9e30bc6 x86 CPU detection and proper subtarget support
llvm-svn: 25679
2006-01-27 08:10:46 +00:00
Nate Begeman e74795cd70 First part of bug 680:
Remove TLI.LowerVA* and replace it with SDNodes that are lowered the same
way as everything else.

llvm-svn: 25606
2006-01-25 18:21:52 +00:00
Evan Cheng 6305e50ee1 Fix sint_to_fp (fild*) support.
llvm-svn: 25257
2006-01-12 22:54:21 +00:00
Evan Cheng ae986f1f1e Support for MEMCPY and MEMSET.
llvm-svn: 25226
2006-01-11 22:15:48 +00:00
Evan Cheng 339edad775 SSE cmov support.
llvm-svn: 25190
2006-01-11 00:33:36 +00:00
Evan Cheng 9c249c37f8 Support for ADD_PARTS, SUB_PARTS, SHL_PARTS, SHR_PARTS, and SRA_PARTS.
llvm-svn: 25158
2006-01-09 18:33:28 +00:00
Evan Cheng 172fce7050 * Fast call support.
* FP cmp, setcc, etc.

llvm-svn: 25117
2006-01-06 00:43:03 +00:00
Evan Cheng 45e19098a6 DAG based isel call support.
llvm-svn: 25103
2006-01-05 00:27:02 +00:00
Evan Cheng 5c59d49630 More X86 floating point patterns.
llvm-svn: 24990
2005-12-23 07:31:11 +00:00
Evan Cheng 9cdc16c6d3 * Fix a GlobalAddress lowering bug.
* Teach DAG combiner about X86ISD::SETCC by adding a TargetLowering hook.

llvm-svn: 24921
2005-12-21 23:05:39 +00:00
Evan Cheng c1583dbd63 * Added support for X86 RET with an additional operand to specify number of
bytes to pop off stack.
* Added support for X86 SETCC.

llvm-svn: 24917
2005-12-21 20:21:51 +00:00
Evan Cheng a74ce62746 * Added lowering hook for external weak global address. It inserts a load
for Darwin.
* Added lowering hook for ISD::RET. It inserts CopyToRegs for the return
  value (or store / fld / copy to ST(0) for floating point value). This
  eliminate the need to write C++ code to handle RET with variable number
  of operands.

llvm-svn: 24888
2005-12-21 02:39:21 +00:00
Evan Cheng 6af02635a7 Added a hook to print out names of target specific DAG nodes.
llvm-svn: 24877
2005-12-20 06:22:03 +00:00
Evan Cheng 6fc31046aa X86 conditional branch support.
llvm-svn: 24870
2005-12-19 23:12:38 +00:00
Evan Cheng 225a4d0d6d X86 lowers SELECT to a cmp / test followed by a conditional move.
llvm-svn: 24754
2005-12-17 01:21:05 +00:00
Andrew Lenharth 0bf68ae434 The second patch of X86 support for read cycle counter.
llvm-svn: 24430
2005-11-20 21:41:10 +00:00
Chris Lattner 76ac068568 Separate X86ISelLowering stuff out from the X86ISelPattern.cpp file. Patch
contributed by Evan Cheng.

llvm-svn: 24358
2005-11-15 00:40:23 +00:00