llvm-project


Author	SHA1	Message	Date
Elena Demikhovsky	45c54ad8dc	AVX-512 set: Added BROADCAST instructions with lowering logic and a test. llvm-svn: 187884	2013-08-07 12:34:55 +00:00
Elena Demikhovsky	40864b690b	AVX-512 set: added mask operations, lowering BUILD_VECTOR for i1 vector types. Added intrinsics and tests. llvm-svn: 187717	2013-08-05 08:52:21 +00:00
Benjamin Kramer	5bc180c14f	X86: Turn fp selects into mask operations. double test(double a, double b, double c, double d) { return a<b ? c : d; } before: _test: ucomisd %xmm0, %xmm1 ja LBB0_2 movaps %xmm3, %xmm2 LBB0_2: movaps %xmm2, %xmm0 after: _test: cmpltsd %xmm1, %xmm0 andpd %xmm0, %xmm2 andnpd %xmm3, %xmm0 orpd %xmm2, %xmm0 Small speedup on Benchmarks/SmallPT llvm-svn: 187706	2013-08-04 12:05:16 +00:00
Elena Demikhovsky	67b05fc0b3	Added INSERT and EXTRACT instructions from AVX-512 ISA. All insertf/extractf functions replaced with insert/extract since we have insertf and inserti forms. Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors. Added lowering for EXTRACT/INSERT subvector for 512-bit vectors. Added a test. llvm-svn: 187491	2013-07-31 11:35:14 +00:00
Craig Topper	8fb09f0abb	Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction. llvm-svn: 173667	2013-01-28 06:48:25 +00:00
Benjamin Kramer	4669d18893	X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics This is very mechanical, no functionality change. Preparation for PR14667. llvm-svn: 170898	2012-12-21 14:04:55 +00:00
Benjamin Kramer	b16ccde7a4	X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible. We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases. The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613. llvm-svn: 170273	2012-12-15 16:47:44 +00:00
Elena Demikhovsky	cd3c1c4a16	Simplified BLEND pattern matching for shuffles. Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2. llvm-svn: 169366	2012-12-05 09:24:57 +00:00
Michael Liao	1be96bb5ce	Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1 llvm-svn: 166486	2012-10-23 17:34:00 +00:00
Michael Liao	e999b865dd	Add support for FP_ROUND from v2f64 to v2f32 - Due to the current matching vector elements constraints in ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from v2f32) is scalarized. Add a customized v2f32 widening to convert it into a target-specific X86ISD::VFPROUND to work around this constraint. llvm-svn: 165631	2012-10-10 16:53:28 +00:00
Michael Liao	400f7ef871	Enhance PR11334 fix to support extload from v2f32/v4f32 - Fix a remaining issue of PR11674 as well llvm-svn: 163528	2012-09-10 18:33:51 +00:00
Craig Topper	a999c66292	Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. llvm-svn: 162829	2012-08-29 07:18:25 +00:00
Nadav Rotem	178250ad87	When unsafe math is used, we can use commutative FMAX and FMIN. In some cases this allows for better code generation. Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and FMINC, which are commutative. For example: movaps %xmm0, %xmm1 movsd LC(%rip), %xmm0 minsd %xmm1, %xmm0 becomes: minsd LC(%rip), %xmm0 llvm-svn: 162187	2012-08-19 13:06:16 +00:00
Michael Liao	34107b9177	fix PR11334 - FP_EXTEND only supports extending from vectors with matching elements. This results in the scalarization of extending to v2f64 from v2f32, which will be legalized to v4f32 not matching with v2f64. - add X86-specific VFPEXT supporting extending from v4f32 to v2f64. - add BUILD_VECTOR lowering helper to recover back the original extending from v4f32 to v2f64. - test case is enhanced to include different vector width. llvm-svn: 161894	2012-08-14 21:24:47 +00:00
Craig Topper	ab47fe4e16	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Elena Demikhovsky	3cb3b0045c	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Bill Wendling	ea6397f67b	Remove tabs. llvm-svn: 160477	2012-07-19 00:11:40 +00:00
Craig Topper	a54893c662	Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type. llvm-svn: 158279	2012-06-09 17:02:24 +00:00
Elena Demikhovsky	8d7e56c409	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Craig Topper	26d7a94981	Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps. llvm-svn: 154798	2012-04-16 06:43:40 +00:00
Craig Topper	b86fa404d3	Merge vpermps/vpermd and vpermpd/vpermq SD nodes. llvm-svn: 154782	2012-04-16 00:41:45 +00:00
Craig Topper	b04fe34030	Fix SDTypeProfile for vpermps. The mask operand should be v8i32. llvm-svn: 154781	2012-04-16 00:12:20 +00:00
Elena Demikhovsky	779a72b49e	Added VPERM optimization for AVX2 shuffles llvm-svn: 154761	2012-04-15 11:18:59 +00:00
Nadav Rotem	9bc178ac5c	Reapply 154396 after fixing a test. Original message: Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendV uses a register for the selection while Vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154483	2012-04-11 06:40:27 +00:00
Eric Christopher	65ada95b84	Temporarily revert this patch to see if it brings the buildbots back. llvm-svn: 154425	2012-04-10 19:33:16 +00:00
Nadav Rotem	f934f91709	Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendv uses a register for the selection while vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154396	2012-04-10 14:33:13 +00:00
Chad Rosier	a281afc676	Fix a regression from r147481. Original commit message from r147481: DAGCombine for transforming 128->256 casts into a vmovaps, rather than a vxorps + vinsertf128 pair if the original vector came from a load. Fix: Unaligned loads need to generate a vmovups. rdar://10974078 llvm-svn: 152366	2012-03-09 02:00:48 +00:00
Jia Liu	e1d619691b	some comment fix for X86 and ARM llvm-svn: 150902	2012-02-19 02:03:36 +00:00
Jia Liu	b22310fda6	Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore. llvm-svn: 150878	2012-02-18 12:03:15 +00:00
Craig Topper	ba172d2d59	Remove the last of the old vector_shuffle patterns from X86 isel. llvm-svn: 150795	2012-02-17 07:02:34 +00:00
Craig Topper	cfad98f745	Move old movl vector_shuffle patterns. Not needed anymore since vector_shuffles shouldn't reach isel. llvm-svn: 150462	2012-02-14 08:14:53 +00:00
Craig Topper	8b19d78808	Still more vector_shuffle pattern removal. llvm-svn: 150365	2012-02-13 07:23:41 +00:00
Craig Topper	6d471c9e49	Recommit r150328. Previous test failures should be fixed by r150360. llvm-svn: 150362	2012-02-13 05:10:10 +00:00
NAKAMURA Takumi	0826c17d00	Revert r150328, "Remove more vector_shuffle patterns." It caused 3 failures on pre-penryn and non-x86(generic) hosts. llvm-svn: 150357	2012-02-13 00:10:15 +00:00
Craig Topper	e24c94af81	Remove more vector_shuffle patterns. llvm-svn: 150328	2012-02-12 08:14:35 +00:00
Craig Topper	d40d9eb2b3	Remove more vector_shuffle patterns. llvm-svn: 150321	2012-02-12 01:07:34 +00:00
Craig Topper	981c6cf7b3	Remove some patterns for matching vector_shuffle instructions since vector_shuffles should be custom lowered before isel. llvm-svn: 150299	2012-02-11 07:43:35 +00:00
Craig Topper	1d471e31ba	Add target specific node for PMULUDQ. Change patterns to use it and custom lower intrinsics to it. Use it instead of intrinsic to handle 64-bit vector multiplies. llvm-svn: 149807	2012-02-05 03:14:49 +00:00
Elena Demikhovsky	fb44980b41	Optimization for SIGN_EXTEND operation on AVX. Special handling was added for v4i32 -> v4i64 and v8i16 -> v8i32 extensions. llvm-svn: 149600	2012-02-02 09:10:43 +00:00
Craig Topper	ca29bcfc10	Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic patterns with custom lowering to target specific nodes. llvm-svn: 149216	2012-01-30 01:10:15 +00:00
Craig Topper	7834900950	Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns. llvm-svn: 148933	2012-01-25 06:43:11 +00:00
Craig Topper	0d8e67aebd	Add comments near load pattern fragments indicating that all integer vector loads are promoted to v2i64 or v4i64 so that no one tries to reintroduce pattern fragments for other types. llvm-svn: 148771	2012-01-24 03:03:17 +00:00
Craig Topper	20c98df340	Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32 loads. All integer vector loads are promoted to v2i64 or v4i64 so these pattern fragments can never match. Fix or remove patterns that used these fragments. llvm-svn: 148672	2012-01-23 00:06:44 +00:00
Craig Topper	0b7ad76bd0	Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching. llvm-svn: 148670	2012-01-22 23:36:02 +00:00
Craig Topper	bd4884371b	Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. Simplifying opcode selection and pattern matching. llvm-svn: 148667	2012-01-22 22:42:16 +00:00
Craig Topper	094626414d	Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead. llvm-svn: 148664	2012-01-22 19:15:14 +00:00
Craig Topper	80576e8d1f	Merge 128-bit and 256-bit SHUFPS/SHUFPD handling. llvm-svn: 148466	2012-01-19 08:19:12 +00:00
Craig Topper	6e54ba7eee	Merge X86 SHUFPS and SHUFPD node types. llvm-svn: 147394	2011-12-31 23:50:21 +00:00
Craig Topper	a913dde0ef	Remove an unused X86ISD node type. llvm-svn: 146833	2011-12-17 19:16:44 +00:00
Craig Topper	1fdfec63a4	Remove some remnants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller already had done the cast. llvm-svn: 146344	2011-12-11 19:12:35 +00:00
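
Two of the messages above describe transformations that are easy to restate in scalar C: 5bc180c14f turns the floating-point select `a<b ? c : d` into a compare that produces a mask followed by and/andn/or, and b16ccde7a4 matches `x >= y ? x-y : 0` as an unsigned saturating subtract. The sketch below is only an illustration of those two ideas, not code from either commit. The function names `select_lt_mask` and `subus_u8` are hypothetical, and the byte-wide lane in `subus_u8` is an assumption chosen for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: branchless form of `a < b ? c : d`.
 * Mirrors the cmpltsd / andpd / andnpd / orpd sequence from the
 * commit message, using 64-bit integer masks instead of SSE registers. */
static double select_lt_mask(double a, double b, double c, double d) {
    /* Like cmpltsd: all ones when a < b, all zeros otherwise
     * (an unordered compare involving NaN yields zeros). */
    uint64_t mask = (uint64_t)0 - (uint64_t)(a < b);

    uint64_t cbits, dbits, rbits;
    memcpy(&cbits, &c, sizeof cbits);
    memcpy(&dbits, &d, sizeof dbits);

    /* and + andn + or: keep c where the mask is set, d elsewhere. */
    rbits = (cbits & mask) | (dbits & ~mask);

    double r;
    memcpy(&r, &rbits, sizeof r);
    return r;
}

/* Hypothetical helper: one unsigned byte lane of a saturating subtract,
 * written in the `x >= y ? x - y : 0` form the dag combine recognizes. */
static uint8_t subus_u8(uint8_t x, uint8_t y) {
    return x >= y ? (uint8_t)(x - y) : 0;
}
```

For instance, `select_lt_mask(1.0, 2.0, 3.0, 4.0)` takes the `c` operand and `subus_u8(3, 10)` clamps to zero instead of wrapping around.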


113 Commits