Elena Demikhovsky
8952974e29
AVX-512: implemented extractelement with variable index.
...
Added parsing of the mask register and the "zeroing" semantics, e.g. {%k1} {z}.
llvm-svn: 190595
2013-09-12 08:55:00 +00:00
Elena Demikhovsky
980c6b08b1
AVX-512: added extend and truncate instructions.
...
llvm-svn: 189580
2013-08-29 11:56:53 +00:00
Craig Topper
6269f49505
Make sure x86 instructions using ssmem/sdmem operand types are only able to parse memory operands of the proper size in Intel syntax. Primarily affects some of sse cvt instructions.
...
llvm-svn: 189206
2013-08-26 00:39:04 +00:00
Elena Demikhovsky
33d447a2d6
AVX-512: Added SHIFT instructions.
...
llvm-svn: 188899
2013-08-21 09:36:02 +00:00
Elena Demikhovsky
1490c5eb5b
AVX-512: added arithmetic and logical operations.
...
ADD, SUB, MUL integer and FP types. OR, AND, XOR.
Added embedded broadcast form for these instructions.
llvm-svn: 188673
2013-08-19 13:26:14 +00:00
Elena Demikhovsky
3ce8dbbac2
AVX-512: Added VMOVD, VMOVQ, VMOVSS, VMOVSD instructions.
...
llvm-svn: 188637
2013-08-18 13:08:57 +00:00
Craig Topper
8c929627d9
Don't use v16i32 for load pattern matching. All 512-bit loads are cast to v8i64.
...
llvm-svn: 188534
2013-08-16 06:07:34 +00:00
Elena Demikhovsky
60b1f289f2
AVX-512: Added CMP and BLEND instructions.
...
Lowering for SETCC.
llvm-svn: 188265
2013-08-13 13:24:07 +00:00
Elena Demikhovsky
cf5b1458e6
AVX-512: Added VPERM* instructions and MOV* zmm-to-zmm instructions.
...
Added a test for shuffles using VPERM.
llvm-svn: 188147
2013-08-11 07:55:09 +00:00
Elena Demikhovsky
45c54ad8dc
AVX-512 set: Added BROADCAST instructions
...
with lowering logic and a test.
llvm-svn: 187884
2013-08-07 12:34:55 +00:00
Elena Demikhovsky
40864b690b
AVX-512 set: added mask operations, lowering BUILD_VECTOR for i1 vector types.
...
Added intrinsics and tests.
llvm-svn: 187717
2013-08-05 08:52:21 +00:00
Benjamin Kramer
5bc180c14f
X86: Turn fp selects into mask operations.
...
double test(double a, double b, double c, double d) { return a<b ? c : d; }
before:
_test:
ucomisd %xmm0, %xmm1
ja LBB0_2
movaps %xmm3, %xmm2
LBB0_2:
movaps %xmm2, %xmm0
after:
_test:
cmpltsd %xmm1, %xmm0
andpd %xmm0, %xmm2
andnpd %xmm3, %xmm0
orpd %xmm2, %xmm0
Small speedup on Benchmarks/SmallPT
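The "after" sequence above builds an all-ones or all-zeros mask from the compare and then combines the two candidate values with and/andn/or. A minimal C sketch of the same idea on scalar bit patterns follows. The helper names (bits_of, double_of, select_masked, check_select) are hypothetical and are not part of LLVM; this only models the logic, not the code the backend emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret a double as its 64-bit pattern and back. */
static uint64_t bits_of(double x) { uint64_t u; memcpy(&u, &x, sizeof u); return u; }
static double double_of(uint64_t u) { double x; memcpy(&x, &u, sizeof x); return x; }

/* Branchy form, matching the C function in the commit message. */
double select_branchy(double a, double b, double c, double d) {
    return a < b ? c : d;
}

/* Mask form, mirroring cmpltsd / andpd / andnpd / orpd. */
double select_masked(double a, double b, double c, double d) {
    uint64_t mask = (a < b) ? UINT64_MAX : 0;  /* cmpltsd: all ones if a < b */
    uint64_t keep_c = mask & bits_of(c);       /* andpd:  c where mask is set */
    uint64_t keep_d = ~mask & bits_of(d);      /* andnpd: d where mask is clear */
    return double_of(keep_c | keep_d);         /* orpd:   merge the two halves */
}

/* Both forms agree whichever way the compare goes. */
void check_select(void) {
    assert(select_masked(1.0, 2.0, 3.0, 4.0) == select_branchy(1.0, 2.0, 3.0, 4.0));
    assert(select_masked(2.0, 1.0, 3.0, 4.0) == select_branchy(2.0, 1.0, 3.0, 4.0));
}
```

Because the mask form has no control flow, it trades a possibly mispredicted branch for a few cheap logical operations.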
llvm-svn: 187706
2013-08-04 12:05:16 +00:00
Elena Demikhovsky
67b05fc0b3
Added INSERT and EXTRACT instructions from AVX-512 ISA.
...
All insertf*/extractf* functions were replaced with insert/extract, since we have both insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.
llvm-svn: 187491
2013-07-31 11:35:14 +00:00
Craig Topper
8fb09f0abb
Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction.
...
llvm-svn: 173667
2013-01-28 06:48:25 +00:00
Benjamin Kramer
4669d18893
X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics
...
This is very mechanical, no functionality change. Preparation for PR14667.
llvm-svn: 170898
2012-12-21 14:04:55 +00:00
Benjamin Kramer
b16ccde7a4
X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
...
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.
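The pattern "x >= y ? x-y : 0" is an unsigned saturating subtract, which can also be written as max(x, y) - y. The sketch below checks that equivalence for one 16-bit lane. All function names are hypothetical, and the loop is only an illustration of the kind of code that could produce such selects once vectorized; it is not taken from gzip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The select form that the combine looks for: x >= y ? x - y : 0. */
uint16_t sub_or_zero(uint16_t x, uint16_t y) {
    return x >= y ? (uint16_t)(x - y) : 0;
}

/* Unsigned saturating subtract for one 16-bit lane: max(x, y) - y. */
uint16_t subus16(uint16_t x, uint16_t y) {
    uint16_t m = x > y ? x : y;
    return (uint16_t)(m - y);
}

/* Hypothetical element-wise loop built on the select form. */
void saturating_sub_array(uint16_t *dst, const uint16_t *x,
                          const uint16_t *y, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = sub_or_zero(x[i], y[i]);
}

/* The two forms agree on a grid of sample inputs. */
void check_subus(void) {
    for (unsigned x = 0; x <= 65535u; x += 257u)
        for (unsigned y = 0; y <= 65535u; y += 251u)
            assert(sub_or_zero((uint16_t)x, (uint16_t)y) ==
                   subus16((uint16_t)x, (uint16_t)y));
}
```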
llvm-svn: 170273
2012-12-15 16:47:44 +00:00
Elena Demikhovsky
cd3c1c4a16
Simplified BLEND pattern matching for shuffles.
...
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.
llvm-svn: 169366
2012-12-05 09:24:57 +00:00
Michael Liao
1be96bb5ce
Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
...
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Michael Liao
e999b865dd
Add support for FP_ROUND from v2f64 to v2f32
...
- Due to the current matching-vector-elements constraint in
ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
v2f32) is scalarized. Add a customized v2f32 widening to convert it
into a target-specific X86ISD::VFPROUND to work around this
constraint.
llvm-svn: 165631
2012-10-10 16:53:28 +00:00
Michael Liao
400f7ef871
Enhance PR11334 fix to support extload from v2f32/v4f32
...
- Fix a remaining issue of PR11674 as well
llvm-svn: 163528
2012-09-10 18:33:51 +00:00
Craig Topper
a999c66292
Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3.
...
llvm-svn: 162829
2012-08-29 07:18:25 +00:00
Nadav Rotem
178250ad87
When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
...
this allows for better code generation.
Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and
FMINC, which are commutative.
For example:
movaps %xmm0, %xmm1
movsd LC(%rip), %xmm0
minsd %xmm1, %xmm0
becomes:
minsd LC(%rip), %xmm0
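The reason the operands cannot be swapped without unsafe math is that minsd returns its second source operand whenever either input is NaN or both inputs are zeros. The C sketch below models that rule as "dst < src ? dst : src". The function names are hypothetical, and the checks assume the sketch itself is compiled with strict IEEE semantics (no fast-math flags).

```c
#include <assert.h>
#include <math.h>

/* Model of minsd with the destination as first source:
   the result is dst only when dst < src, otherwise src. */
double x86_minsd(double dst, double src) {
    return dst < src ? dst : src;
}

void check_min_commutativity(void) {
    /* Ordinary values: operand order does not matter. */
    assert(x86_minsd(1.0, 2.0) == x86_minsd(2.0, 1.0));

    /* NaN: the second operand is returned, so order matters. */
    assert(x86_minsd(NAN, 1.0) == 1.0);
    assert(isnan(x86_minsd(1.0, NAN)));

    /* Signed zeros: again the second operand is returned. */
    assert(signbit(x86_minsd(0.0, -0.0)) != 0);
    assert(signbit(x86_minsd(-0.0, 0.0)) == 0);
}
```

When NaNs and the sign of zero may be ignored, both orders give the same result, which is what lets the memory operand be folded directly into minsd.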
llvm-svn: 162187
2012-08-19 13:06:16 +00:00
Michael Liao
34107b9177
fix PR11334
...
- FP_EXTEND only supports extending between vectors with matching element counts.
This results in the scalarization of extending to v2f64 from v2f32,
since v2f32 is legalized to v4f32, which does not match v2f64.
- Add X86-specific VFPEXT supporting extending from v4f32 to v2f64.
- Add a BUILD_VECTOR lowering helper to recover the original
extension from v4f32 to v2f64.
- The test case is enhanced to include different vector widths.
llvm-svn: 161894
2012-08-14 21:24:47 +00:00
Craig Topper
ab47fe4e16
Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
...
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Elena Demikhovsky
3cb3b0045c
Added FMA functionality to X86 target.
...
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Bill Wendling
ea6397f67b
Remove tabs.
...
llvm-svn: 160477
2012-07-19 00:11:40 +00:00
Craig Topper
a54893c662
Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type.
...
llvm-svn: 158279
2012-06-09 17:02:24 +00:00
Elena Demikhovsky
8d7e56c409
ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
...
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Craig Topper
26d7a94981
Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps.
...
llvm-svn: 154798
2012-04-16 06:43:40 +00:00
Craig Topper
b86fa404d3
Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
...
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper
b04fe34030
Fix SDTypeProfile for vpermps. The mask operand should be v8i32.
...
llvm-svn: 154781
2012-04-16 00:12:20 +00:00
Elena Demikhovsky
779a72b49e
Added VPERM optimization for AVX2 shuffles
...
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
Nadav Rotem
9bc178ac5c
Reapply 154396 after fixing a test.
...
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On Sandy Bridge they still have the same latency and execute on the same execution ports.
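The difference between the two blend forms can be modeled per lane: the immediate form picks lanes by the bits of a constant, while the variable form picks lanes by the sign bit of each element of a mask register. The sketch below assumes four 32-bit lanes (as in blendps/blendvps). The function names are hypothetical and this is a lane-level model only, not the lowering code.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 4 };

/* Immediate-controlled blend: bit i of imm selects lane i from b. */
void blend_imm(int32_t *out, const int32_t *a, const int32_t *b, unsigned imm) {
    for (int i = 0; i < LANES; ++i)
        out[i] = ((imm >> i) & 1u) ? b[i] : a[i];
}

/* Variable blend: the sign bit of mask lane i selects lane i from b. */
void blend_var(int32_t *out, const int32_t *a, const int32_t *b,
               const int32_t *mask) {
    for (int i = 0; i < LANES; ++i)
        out[i] = (mask[i] < 0) ? b[i] : a[i];
}

/* A mask known at compile time can be folded into an immediate. */
unsigned imm_from_mask(const int32_t *mask) {
    unsigned imm = 0;
    for (int i = 0; i < LANES; ++i)
        if (mask[i] < 0)
            imm |= 1u << i;
    return imm;
}

void check_blend(void) {
    const int32_t a[LANES] = {1, 2, 3, 4};
    const int32_t b[LANES] = {10, 20, 30, 40};
    const int32_t mask[LANES] = {-1, 0, -1, 0};
    int32_t r1[LANES], r2[LANES];

    blend_var(r1, a, b, mask);
    blend_imm(r2, a, b, imm_from_mask(mask));
    for (int i = 0; i < LANES; ++i)
        assert(r1[i] == r2[i]);
}
```

Since a shuffle mask is constant, the immediate form avoids materializing the selector in a register.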
llvm-svn: 154483
2012-04-11 06:40:27 +00:00
Eric Christopher
65ada95b84
Temporarily revert this patch to see if it brings the buildbots back.
...
llvm-svn: 154425
2012-04-10 19:33:16 +00:00
Nadav Rotem
f934f91709
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
...
blendv uses a register for the selection while vblend uses an immediate.
On Sandy Bridge they still have the same latency and execute on the same execution ports.
llvm-svn: 154396
2012-04-10 14:33:13 +00:00
Chad Rosier
a281afc676
Fix a regression from r147481.
...
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
than a vxorps + vinsertf128 pair if the original vector came from a load.
Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078
llvm-svn: 152366
2012-03-09 02:00:48 +00:00
Jia Liu
e1d619691b
some comment fix for X86 and ARM
...
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Jia Liu
b22310fda6
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
...
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Craig Topper
ba172d2d59
Remove the last of the old vector_shuffle patterns from X86 isel.
...
llvm-svn: 150795
2012-02-17 07:02:34 +00:00
Craig Topper
cfad98f745
Move old movl vector_shuffle patterns. Not needed anymore since vector_shuffles shouldn't reach isel.
...
llvm-svn: 150462
2012-02-14 08:14:53 +00:00
Craig Topper
8b19d78808
Still more vector_shuffle pattern removal.
...
llvm-svn: 150365
2012-02-13 07:23:41 +00:00
Craig Topper
6d471c9e49
Recommit r150328. Previous test failures should be fixed by r150360.
...
llvm-svn: 150362
2012-02-13 05:10:10 +00:00
NAKAMURA Takumi
0826c17d00
Revert r150328, "Remove more vector_shuffle patterns."
...
It caused 3 failures on pre-penryn and non-x86(generic) hosts.
llvm-svn: 150357
2012-02-13 00:10:15 +00:00
Craig Topper
e24c94af81
Remove more vector_shuffle patterns.
...
llvm-svn: 150328
2012-02-12 08:14:35 +00:00
Craig Topper
d40d9eb2b3
Remove more vector_shuffle patterns.
...
llvm-svn: 150321
2012-02-12 01:07:34 +00:00
Craig Topper
981c6cf7b3
Remove some patterns for matching vector_shuffle instructions since vector_shuffles should be custom lowered before isel.
...
llvm-svn: 150299
2012-02-11 07:43:35 +00:00
Craig Topper
1d471e31ba
Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies.
...
llvm-svn: 149807
2012-02-05 03:14:49 +00:00
Elena Demikhovsky
fb44980b41
Optimization for SIGN_EXTEND operation on AVX.
...
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.
llvm-svn: 149600
2012-02-02 09:10:43 +00:00
Craig Topper
ca29bcfc10
Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic patterns with custom lowering to target specific nodes.
...
llvm-svn: 149216
2012-01-30 01:10:15 +00:00
Craig Topper
7834900950
Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns.
...
llvm-svn: 148933
2012-01-25 06:43:11 +00:00