Elena Demikhovsky
45c54ad8dc
AVX-512 set: Added BROADCAST instructions
...
with lowering logic and a test.
llvm-svn: 187884
2013-08-07 12:34:55 +00:00
Elena Demikhovsky
40864b690b
AVX-512 set: added mask operations, lowering BUILD_VECTOR for i1 vector types.
...
Added intrinsics and tests.
llvm-svn: 187717
2013-08-05 08:52:21 +00:00
Benjamin Kramer
5bc180c14f
X86: Turn fp selects into mask operations.
...
double test(double a, double b, double c, double d) { return a<b ? c : d; }
before:
_test:
ucomisd %xmm0, %xmm1
ja LBB0_2
movaps %xmm3, %xmm2
LBB0_2:
movaps %xmm2, %xmm0
after:
_test:
cmpltsd %xmm1, %xmm0
andpd %xmm0, %xmm2
andnpd %xmm3, %xmm0
orpd %xmm2, %xmm0
Small speedup on Benchmarks/SmallPT
llvm-svn: 187706
2013-08-04 12:05:16 +00:00
Elena Demikhovsky
67b05fc0b3
Added INSERT and EXTRACT intructions from AVX-512 ISA.
...
All insertf*/extractf* functions replaced with insert/extract since we have insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.
llvm-svn: 187491
2013-07-31 11:35:14 +00:00
Craig Topper
8fb09f0abb
Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction.
...
llvm-svn: 173667
2013-01-28 06:48:25 +00:00
Benjamin Kramer
4669d18893
X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics
...
This is very mechanical, no functionality change. Preparation for PR14667.
llvm-svn: 170898
2012-12-21 14:04:55 +00:00
Benjamin Kramer
b16ccde7a4
X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
...
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.
llvm-svn: 170273
2012-12-15 16:47:44 +00:00
Elena Demikhovsky
cd3c1c4a16
Simplified BLEND pattern matching for shuffles.
...
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.
llvm-svn: 169366
2012-12-05 09:24:57 +00:00
Michael Liao
1be96bb5ce
Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
...
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Michael Liao
e999b865dd
Add support for FP_ROUND from v2f64 to v2f32
...
- Due to the current matching vector elements constraints in
ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
v2f32) is scalarized. Add a customized v2f32 widening to convert it
into a target-specific X86ISD::VFPROUND to work around this
constraints.
llvm-svn: 165631
2012-10-10 16:53:28 +00:00
Michael Liao
400f7ef871
Enhance PR11334 fix to support extload from v2f32/v4f32
...
- Fix an remaining issue of PR11674 as well
llvm-svn: 163528
2012-09-10 18:33:51 +00:00
Craig Topper
a999c66292
Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3.
...
llvm-svn: 162829
2012-08-29 07:18:25 +00:00
Nadav Rotem
178250ad87
When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
...
this allows for better code generation.
Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.
For example:
movaps %xmm0, %xmm1
movsd LC(%rip), %xmm0
minsd %xmm1, %xmm0
becomes:
minsd LC(%rip), %xmm0
llvm-svn: 162187
2012-08-19 13:06:16 +00:00
Michael Liao
34107b9177
fix PR11334
...
- FP_EXTEND only support extending from vectors with matching elements.
This results in the scalarization of extending to v2f64 from v2f32,
which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.
llvm-svn: 161894
2012-08-14 21:24:47 +00:00
Craig Topper
ab47fe4e16
Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
...
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Elena Demikhovsky
3cb3b0045c
Added FMA functionality to X86 target.
...
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Bill Wendling
ea6397f67b
Remove tabs.
...
llvm-svn: 160477
2012-07-19 00:11:40 +00:00
Craig Topper
a54893c662
Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type.
...
llvm-svn: 158279
2012-06-09 17:02:24 +00:00
Elena Demikhovsky
8d7e56c409
ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
...
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Craig Topper
26d7a94981
Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps.
...
llvm-svn: 154798
2012-04-16 06:43:40 +00:00
Craig Topper
b86fa404d3
Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
...
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper
b04fe34030
Fix SDTypeProfile for vpermps. The mask operand should be v8i32.
...
llvm-svn: 154781
2012-04-16 00:12:20 +00:00
Elena Demikhovsky
779a72b49e
Added VPERM optimization for AVX2 shuffles
...
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
Nadav Rotem
9bc178ac5c
Reapply 154396 after fixing a test.
...
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
llvm-svn: 154483
2012-04-11 06:40:27 +00:00
Eric Christopher
65ada95b84
Temporarily revert this patch to see if it brings the buildbots back.
...
llvm-svn: 154425
2012-04-10 19:33:16 +00:00
Nadav Rotem
f934f91709
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
...
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
llvm-svn: 154396
2012-04-10 14:33:13 +00:00
Chad Rosier
a281afc676
Fix a regression from r147481.
...
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.
Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078
llvm-svn: 152366
2012-03-09 02:00:48 +00:00
Jia Liu
e1d619691b
some comment fix for X86 and ARM
...
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Jia Liu
b22310fda6
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
...
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Craig Topper
ba172d2d59
Remove the last of the old vector_shuffle patterns from X86 isel.
...
llvm-svn: 150795
2012-02-17 07:02:34 +00:00
Craig Topper
cfad98f745
Move old movl vector_shuffle patterns. Not needed anymore since vector_shuffles shouldn't reach isel.
...
llvm-svn: 150462
2012-02-14 08:14:53 +00:00
Craig Topper
8b19d78808
Still more vector_shuffle pattern removal.
...
llvm-svn: 150365
2012-02-13 07:23:41 +00:00
Craig Topper
6d471c9e49
Recommit r150328. Previous test failures should be fixed by r150360.
...
llvm-svn: 150362
2012-02-13 05:10:10 +00:00
NAKAMURA Takumi
0826c17d00
Revert r150328, "Remove more vector_shuffle patterns."
...
It caused 3 failures on pre-penryn and non-x86(generic) hosts.
llvm-svn: 150357
2012-02-13 00:10:15 +00:00
Craig Topper
e24c94af81
Remove more vector_shuffle patterns.
...
llvm-svn: 150328
2012-02-12 08:14:35 +00:00
Craig Topper
d40d9eb2b3
Remove more vector_shuffle patterns.
...
llvm-svn: 150321
2012-02-12 01:07:34 +00:00
Craig Topper
981c6cf7b3
Remove some patterns for matching vector_shuffle instructions since vector_shuffles should be custom lowered before isel.
...
llvm-svn: 150299
2012-02-11 07:43:35 +00:00
Craig Topper
1d471e31ba
Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies.
...
llvm-svn: 149807
2012-02-05 03:14:49 +00:00
Elena Demikhovsky
fb44980b41
Optimization for SIGN_EXTEND operation on AVX.
...
Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32
extensions.
llvm-svn: 149600
2012-02-02 09:10:43 +00:00
Craig Topper
ca29bcfc10
Move some XOP patterns into instruction definition. Replae VPCMOV intrinsic patterns with custom lowering to a target specific nodes.
...
llvm-svn: 149216
2012-01-30 01:10:15 +00:00
Craig Topper
7834900950
Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns.
...
llvm-svn: 148933
2012-01-25 06:43:11 +00:00
Craig Topper
0d8e67aebd
Add comments near load pattern fragments indicating that all integer vector loads are promoted to v2i64 or v4i64 so that no one tries to reintroduce pattern fragments for other types.
...
llvm-svn: 148771
2012-01-24 03:03:17 +00:00
Craig Topper
20c98df340
Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32 loads. All integer vector loads are promoted to v2i64 or v4i64 so these pattern fragments can never match. Fix or remove patterns that used these fragments.
...
llvm-svn: 148672
2012-01-23 00:06:44 +00:00
Craig Topper
0b7ad76bd0
Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching.
...
llvm-svn: 148670
2012-01-22 23:36:02 +00:00
Craig Topper
bd4884371b
Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. Simplifying opcode selection and pattern matching.
...
llvm-svn: 148667
2012-01-22 22:42:16 +00:00
Craig Topper
094626414d
Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.
...
llvm-svn: 148664
2012-01-22 19:15:14 +00:00
Craig Topper
80576e8d1f
Merge 128-bit and 256-bit SHUFPS/SHUFPD handling.
...
llvm-svn: 148466
2012-01-19 08:19:12 +00:00
Craig Topper
6e54ba7eee
Merge X86 SHUFPS and SHUFPD node types.
...
llvm-svn: 147394
2011-12-31 23:50:21 +00:00
Craig Topper
a913dde0ef
Remove an unused X86ISD node type.
...
llvm-svn: 146833
2011-12-17 19:16:44 +00:00
Craig Topper
1fdfec63a4
Remove some remants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller already had done the cast.
...
llvm-svn: 146344
2011-12-11 19:12:35 +00:00