Evan Cheng
82241c86e9
- FCOPYSIGN custom lowering bug. Clear the sign bit of operand 0 first before
or'ing in the sign bit of operand 1.
- Tweak: rather than left-shifting the sign bit, fp_extend operand 1 first
before taking its sign bit if its type is smaller than that of operand 0.
llvm-svn: 32932
2007-01-05 21:37:56 +00:00
Evan Cheng
4363e884c0
With SSE2, expand FCOPYSIGN to a series of SSE bitwise operations.
llvm-svn: 32900
2007-01-05 07:55:56 +00:00
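A minimal C sketch of the corrected expansion, assuming a hypothetical
copysign_sketch helper (the backend emits the equivalent SSE bitwise AND/OR
node sequence rather than scalar integer code):

#include <stdint.h>
#include <string.h>

/* Clear the sign bit of operand 0 first, then or in the sign bit of
   operand 1; this is the ordering the fix above establishes. */
static double copysign_sketch(double mag, double sgn) {
    uint64_t m, s;
    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);
    m &= ~(1ULL << 63);     /* clear sign bit of operand 0 */
    m |= s & (1ULL << 63);  /* or in sign bit of operand 1 */
    memcpy(&mag, &m, sizeof mag);
    return mag;
}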
Evan Cheng
ae1cd75af7
- Use a different wrapper node for RIP-relative GV, etc.
- Proper support for both small static and PIC modes under X86-64
- Some (non-optimal) support for medium modes.
llvm-svn: 32046
2006-11-30 21:55:46 +00:00
Evan Cheng
49683ba236
Don't DAG-combine floating-point select to the max and min intrinsics. Those
take v4f32 / v2f64 operands and may end up causing larger spills / restores.
Added X86-specific nodes X86ISD::FMAX and X86ISD::FMIN instead.
This fixes PR996.
llvm-svn: 31645
2006-11-10 21:43:37 +00:00
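For illustration, the kind of source pattern affected (hypothetical function
name): a scalar floating-point select that now lowers to the new X86ISD::FMAX
node, i.e. a single maxss, instead of a 128-bit min/max intrinsic:

/* Matches maxss semantics exactly (b is returned on unordered inputs),
   so no v4f32 value is created and no 128-bit spill slot is needed. */
float select_max(float a, float b) {
    return a > b ? a : b;
}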
Evan Cheng
922e191116
Fixed a bug which caused x86 to incorrectly match
shuffle v, undef, <2, ?, 3, ?>
to movhlps. It should match unpckhps instead.
Added proper matching code for
shuffle v, undef, <2, 3, 2, 3>
llvm-svn: 31519
2006-11-07 22:14:24 +00:00
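A hedged intrinsics illustration of the two masks involved (hypothetical
function names):

#include <xmmintrin.h>

/* mask <2, 3, 2, 3>: the high pair replicated; matches movhlps v, v */
__m128 high_pair(__m128 v) {
    return _mm_movehl_ps(v, v);
}

/* mask <2, 2, 3, 3>, i.e. <2, ?, 3, ?>: a self-interleave of the high
   half; matches unpckhps v, v */
__m128 high_unpack(__m128 v) {
    return _mm_unpackhi_ps(v, v);
}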
Chris Lattner
44daa50bed
Allow the address of a global to be used with the "i" constraint when in
-static mode. This implements PR882.
llvm-svn: 31326
2006-10-31 20:13:11 +00:00
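A hedged sketch of the pattern PR882 covers (hypothetical names): under
-static the global's address is a link-time constant, so it may be used as an
immediate:

int G;

void load_global_addr(void) {
    /* "i" lets &G be emitted as the immediate $G in static mode */
    __asm__ volatile ("movl %0, %%eax" : : "i"(&G) : "eax");
}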
Evan Cheng
e056dd5928
Fixed a significant bug where unpcklpd was incorrectly used to extract
element 1 from a v2f64 value.
llvm-svn: 31228
2006-10-27 21:08:32 +00:00
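For reference, a hedged intrinsics sketch of the correct pattern (hypothetical
function name): element 1 is the high half, so a high unpack must feed the
extract:

#include <emmintrin.h>

double extract_elt1(__m128d v) {
    /* unpckhpd v, v moves element 1 into the low slot; unpcklpd would
       duplicate element 0 instead */
    return _mm_cvtsd_f64(_mm_unpackhi_pd(v, v));
}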
Chris Lattner
c0fb567e23
Implement branch analysis/xform hooks required by the branch folding pass.
llvm-svn: 31065
2006-10-20 17:42:20 +00:00
Chris Lattner
f4aeff00c2
fit in 80 cols
llvm-svn: 31039
2006-10-18 18:26:48 +00:00
Chris Lattner
d9e4bf5285
update comments
llvm-svn: 30663
2006-09-28 23:33:12 +00:00
Anton Korobeynikov
3c5b3df6a0
Adding code generation for the StdCall & FastCall calling conventions
llvm-svn: 30549
2006-09-20 22:03:51 +00:00
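Hedged C declarations using the two conventions (hypothetical names):

/* stdcall: the callee pops its arguments off the stack.
   fastcall: the first two integer arguments travel in ECX/EDX and the
   callee pops the rest. */
int __attribute__((stdcall)) StdProc(int a, int b);
int __attribute__((fastcall)) FastProc(int a, int b);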
Evan Cheng
4259a0f654
X86ISD::CMP now produces a chain as well as a flag. Make that the chain
operand of a conditional branch to allow load folding into CMP / TEST
instructions.
llvm-svn: 30241
2006-09-11 02:19:56 +00:00
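A hedged example of the folding this enables (hypothetical function name):
the load feeds the compare directly, so the backend can emit something like
cmpl $0, (%eax) with no separate movl:

int branch_on_load(int *p) {
    if (*p == 0)
        return 1;
    return 0;
}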
Evan Cheng
11b0a5dbd4
Committing X86-64 support.
llvm-svn: 30177
2006-09-08 06:48:29 +00:00
Chris Lattner
524129dd64
Fix PR850 and CodeGen/X86/2006-07-31-SingleRegClass.ll.
The CFE refers to all single-register constraints (like "A") by their 16-bit
name, even though the 8 or 32-bit version of the register may be needed.
The X86 backend should realize what is going on and redecode the name back
to its proper form.
llvm-svn: 29420
2006-07-31 23:26:50 +00:00
Chris Lattner
298ef37e02
Implement the inline asm 'A' constraint. This implements PR825 and
CodeGen/X86/2006-07-10-InlineAsmAConstraint.ll
llvm-svn: 29101
2006-07-11 02:54:03 +00:00
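The 'A' constraint binds a 64-bit value to the EDX:EAX register pair; a
hedged usage example (hypothetical function name):

unsigned long long read_tsc(void) {
    unsigned long long v;
    /* rdtsc writes EDX:EAX, which "=A" maps onto */
    __asm__ volatile ("rdtsc" : "=A"(v));
    return v;
}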
Evan Cheng
5987cfb7b1
X86 target-specific DAG combine: turn build_vector (load x), (load x+4),
(load x+8), (load x+12), <0, 1, 2, 3> into a single 128-bit load (aligned or
unaligned).
e.g.
__m128 test(float a, float b, float c, float d) {
  return _mm_set_ps(d, c, b, a);
}
_test:
movups 4(%esp), %xmm0
ret
llvm-svn: 29042
2006-07-07 08:33:52 +00:00
Evan Cheng
38c5aee959
Simplify X86CompilationCallback: always align to 16-byte boundary; don't save EAX/EDX if unnecessary.
llvm-svn: 28910
2006-06-24 08:36:10 +00:00
Evan Cheng
2a33094284
Switch X86 over to a call-selection model where the lowering code creates
the copyto/fromregs instead of making the X86ISD::CALL selection code create
them.
llvm-svn: 28463
2006-05-25 00:59:30 +00:00
Chris Lattner
aa2372562e
Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov! This is a step towards closing PR786.
llvm-svn: 28447
2006-05-24 17:04:05 +00:00
Evan Cheng
17e734f0a6
Remove PreprocessCCCArguments and PreprocessFastCCArguments now that
FORMAL_ARGUMENTS nodes include a token operand.
llvm-svn: 28439
2006-05-23 21:06:34 +00:00
Chris Lattner
8be5be817c
Implement an annoying part of the Darwin/X86 ABI: the callee of a function
returning a struct pops the hidden struct pointer if present, not the caller.
For example, in this testcase:
struct X { int D, E, F, G; };
struct X bar() {
  struct X a;
  a.D = 0;
  a.E = 1;
  a.F = 2;
  a.G = 3;
  return a;
}
void foo(struct X *P) {
  *P = bar();
}
We used to emit:
_foo:
subl $28, %esp
movl 32(%esp), %eax
movl %eax, (%esp)
call _bar
addl $28, %esp
ret
_bar:
movl 4(%esp), %eax
movl $0, (%eax)
movl $1, 4(%eax)
movl $2, 8(%eax)
movl $3, 12(%eax)
ret
This is correct on Linux/X86 but not Darwin/X86. With this patch, we now
emit:
_foo:
subl $28, %esp
movl 32(%esp), %eax
movl %eax, (%esp)
call _bar
*** addl $24, %esp
ret
_bar:
movl 4(%esp), %eax
movl $0, (%eax)
movl $1, 4(%eax)
movl $2, 8(%eax)
movl $3, 12(%eax)
*** ret $4
For the record, GCC emits (which is functionally equivalent to our new code):
_bar:
movl 4(%esp), %eax
movl $3, 12(%eax)
movl $2, 8(%eax)
movl $1, 4(%eax)
movl $0, (%eax)
ret $4
_foo:
pushl %esi
subl $40, %esp
movl 48(%esp), %esi
leal 16(%esp), %eax
movl %eax, (%esp)
call _bar
subl $4, %esp
movl 16(%esp), %eax
movl %eax, (%esi)
movl 20(%esp), %eax
movl %eax, 4(%esi)
movl 24(%esp), %eax
movl %eax, 8(%esi)
movl 28(%esp), %eax
movl %eax, 12(%esi)
addl $40, %esp
popl %esi
ret
This fixes SingleSource/Benchmarks/CoyoteBench/fftbench with LLC and the
JIT, and fixes the X86-backend portion of PR729. The CBE still needs to
be updated.
llvm-svn: 28438
2006-05-23 18:50:38 +00:00
Evan Cheng
8c6b234ce8
Should pass by reference.
llvm-svn: 28357
2006-05-17 19:07:40 +00:00
Evan Cheng
48940d16b2
- Clean up formal argument lowering code. Prepare for vector pass-by-value work.
- Fixed vararg support.
llvm-svn: 27985
2006-04-27 01:32:22 +00:00
Evan Cheng
e0bcfbe811
Switching over to the FORMAL_ARGUMENTS mechanism to lower call arguments.
llvm-svn: 27975
2006-04-26 01:20:17 +00:00
Evan Cheng
a9467aab0a
Separate LowerOperation() into multiple functions, one per opcode.
llvm-svn: 27972
2006-04-25 20:13:52 +00:00
Evan Cheng
e8b5180044
Now generating perfect (I think) code for "vector set" with a single non-zero
scalar value.
e.g.
_mm_set_epi32(0, a, 0, 0);
==>
movd 4(%esp), %xmm0
pshufd $69, %xmm0, %xmm0
_mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
==>
movzbw 4(%esp), %ax
movzwl %ax, %eax
pxor %xmm0, %xmm0
pinsrw $5, %eax, %xmm0
llvm-svn: 27923
2006-04-21 01:05:10 +00:00
Evan Cheng
60f0b8998e
- Added support to turn "vector clear elements", e.g. pand V, <-1, -1, 0, -1>,
into a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen
of vector shuffles with a zero (or any splat) vector.
llvm-svn: 27875
2006-04-20 08:58:49 +00:00
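A hedged intrinsics example of the "vector clear elements" pattern
(hypothetical function name); note that _mm_set_epi32 lists elements from
high to low:

#include <emmintrin.h>

/* pand with <-1, -1, 0, -1>: clears element 2, keeps the rest */
__m128i clear_elt2(__m128i v) {
    return _mm_and_si128(v, _mm_set_epi32(-1, 0, -1, -1));
}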
Evan Cheng
7855e4d032
Commute vector_shuffle to match more movlhps, movlp{s|d} cases.
llvm-svn: 27840
2006-04-19 20:35:22 +00:00
Evan Cheng
5d247f81c1
Last few SSE3 intrinsics.
llvm-svn: 27711
2006-04-14 21:59:03 +00:00
Evan Cheng
12ba3e23d0
Added support for _mm_move_ss and _mm_move_sd.
llvm-svn: 27575
2006-04-11 00:19:04 +00:00
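What the two intrinsics compute, as a hedged sketch (hypothetical function
names):

#include <emmintrin.h>

/* movss: result is <b0, a1, a2, a3> */
__m128 move_ss(__m128 a, __m128 b) { return _mm_move_ss(a, b); }

/* movsd: result is <b0, a1> */
__m128d move_sd(__m128d a, __m128d b) { return _mm_move_sd(a, b); }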
Evan Cheng
c995b45f67
- movlp{s|d} and movhp{s|d} support.
- Normalize shuffle nodes so the result vector's lower-half elements come from
the first vector and the rest come from the second. (Except for the
exceptions :-).
- Other minor fixes.
llvm-svn: 27474
2006-04-06 23:23:56 +00:00
Evan Cheng
780382946e
Support for comi / ucomi intrinsics.
llvm-svn: 27444
2006-04-05 23:38:46 +00:00
Evan Cheng
f3b52c84ea
Handle the canonical form of, e.g.,
vector_shuffle v1, v1, <0, 4, 1, 5, 2, 6, 3, 7>
This is turned into
vector_shuffle v1, <undef>, <0, 0, 1, 1, 2, 2, 3, 3>
by the DAG combiner; it then matches a {p}unpckl on x86.
llvm-svn: 27437
2006-04-05 07:20:06 +00:00
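The v4f32 analogue of that canonical form, written with intrinsics as a
hedged example (hypothetical function name):

#include <xmmintrin.h>

/* self-interleave of the low half: mask <0, 0, 1, 1>, matching
   unpcklps v, v */
__m128 dup_lo(__m128 v) {
    return _mm_unpacklo_ps(v, v);
}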
Evan Cheng
5fd7c69473
Use an X86 target-specific node X86ISD::PINSRW instead of a malformed
INSERT_VECTOR_ELT to insert a 16-bit value into a 128-bit vector.
llvm-svn: 27314
2006-03-31 21:55:24 +00:00
Evan Cheng
cbffa4656b
Add support to use pextrw and pinsrw to extract and insert a word element
from a 128-bit vector.
llvm-svn: 27304
2006-03-31 19:22:53 +00:00
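Hedged intrinsics equivalents of the two operations (hypothetical function
names):

#include <emmintrin.h>

/* pextrw $5: read word element 5 */
int get_word5(__m128i v) {
    return _mm_extract_epi16(v, 5);
}

/* pinsrw $5: write word element 5 */
__m128i set_word5(__m128i v, int w) {
    return _mm_insert_epi16(v, w, 5);
}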
Evan Cheng
b7fedffc78
- Added some SSE2 128-bit packed integer ops.
- Added SSE2 128-bit integer pack with signed saturation ops.
- Added pshufhw and pshuflw ops.
llvm-svn: 27252
2006-03-29 23:07:14 +00:00
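Hedged examples of the ops named above (hypothetical function names):

#include <emmintrin.h>

/* packssdw: pack 32-bit ints to 16-bit with signed saturation */
__m128i pack_sat(__m128i a, __m128i b) {
    return _mm_packs_epi32(a, b);
}

/* pshufhw $0x1B: reverse the four high words, leave the low four alone */
__m128i shuf_hi(__m128i v) {
    return _mm_shufflehi_epi16(v, 0x1B);
}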
Evan Cheng
1a194a5264
* Prefer using operations of matching types, e.g. unpcklpd rather than movlhps.
* Bug fixes.
llvm-svn: 27218
2006-03-28 06:50:32 +00:00
Evan Cheng
2bc3280659
- Clean up / consolidate various shuffle masks.
- Some misc. bug fixes.
- Use MOVHPDrm to load from m64 into the upper half of an XMM register.
llvm-svn: 27210
2006-03-28 02:43:26 +00:00
Evan Cheng
5df75889db
Model unpack lower and interleave as vector_shuffle so we can lower the
intrinsics as such.
llvm-svn: 27200
2006-03-28 00:39:58 +00:00
Evan Cheng
ed6184aef2
Remove X86::isZeroVector, use ISD::isBuildVectorAllZeros instead; some fixes / cleanups
llvm-svn: 27150
2006-03-26 09:53:12 +00:00
Evan Cheng
2bc0941e2a
Build an arbitrary vector with more than 2 distinct scalar elements with a
series of unpack and interleave ops.
llvm-svn: 27119
2006-03-25 09:37:23 +00:00
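The strategy, sketched with intrinsics as a hedged example (hypothetical
function name): pairwise unpacks merge the distinct scalars in log depth:

#include <emmintrin.h>

__m128i build4(int a, int b, int c, int d) {
    __m128i va = _mm_cvtsi32_si128(a);
    __m128i vb = _mm_cvtsi32_si128(b);
    __m128i vc = _mm_cvtsi32_si128(c);
    __m128i vd = _mm_cvtsi32_si128(d);
    __m128i ab = _mm_unpacklo_epi32(va, vb);  /* <a, b, 0, 0> */
    __m128i cd = _mm_unpacklo_epi32(vc, vd);  /* <c, d, 0, 0> */
    return _mm_unpacklo_epi64(ab, cd);        /* <a, b, c, d> */
}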
Evan Cheng
e7ee6a5e32
Support for scalar to vector with zero extension.
llvm-svn: 27091
2006-03-24 23:15:12 +00:00
Evan Cheng
082c8785ef
Handle BUILD_VECTOR with all zero elements.
llvm-svn: 27056
2006-03-24 07:29:27 +00:00
Evan Cheng
2595a687da
More efficient v2f64 shuffle using movlhps, movhlps, unpckhpd, and unpcklpd.
llvm-svn: 27040
2006-03-24 02:58:06 +00:00
Evan Cheng
d27fb3e85e
Handle more shuffle cases with SHUFP* instructions.
llvm-svn: 27024
2006-03-24 01:18:28 +00:00
Evan Cheng
021bb7c956
Added a ValueType operand to isShuffleMaskLegal(). For now, x86 will not do
64-bit vector shuffles.
llvm-svn: 26964
2006-03-22 22:07:06 +00:00
Evan Cheng
68ad48bd1a
- Implement X86ISelLowering::isShuffleMaskLegal(). We currently only support
splat and PSHUFD cases.
- Clean up shuffle / splat matching code.
llvm-svn: 26954
2006-03-22 18:59:22 +00:00
Evan Cheng
8fdbdf20cd
- VECTOR_SHUFFLE of v4i32 / v4f32 with an undef second vector always matches
PSHUFD. We can make mask entries which point into the undef operand point
anywhere we want.
- Change some names to appease Chris.
llvm-svn: 26951
2006-03-22 08:01:21 +00:00
Evan Cheng
d097e67544
Some splat and shuffle support.
llvm-svn: 26940
2006-03-22 02:53:00 +00:00
Evan Cheng
d5e905d762
- Use movaps to store 128-bit vector integers.
- Each scalar_to_vector of v8i16 and v16i8 is an any_extend followed by a movd.
llvm-svn: 26932
2006-03-21 23:01:21 +00:00