llvm-project

Commit Graph

Author	SHA1	Message	Date
Bruno Cardoso Lopes	957a6a13e0	Cleanup movsldup/movshdup matching. 27 insertions(+), 62 deletions(-) llvm-svn: 136047	2011-07-26 02:39:13 +00:00
Bruno Cardoso Lopes	123dff0f58	- Handle special scalar_to_vector case: splats. Using a native 128-bit shuffle before inserting on a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoid emiting on more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Bruno Cardoso Lopes	b878caa5e2	Add support for 256-bit versions of VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32 or 64 elements inside a lane, and restricts the second lane to have the same permutation of the first one. With the improved splat support introduced early today, adding codegen for this instruction enable more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Benjamin Kramer	9654eef493	Port operand types for ARM and X86 over from EDIS to the .td files. llvm-svn: 135198	2011-07-14 21:47:22 +00:00
Bruno Cardoso Lopes	9613b64916	Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more general version of X86ISD::ANDNP also opened the room for a little bit of refactoring. llvm-svn: 135088	2011-07-13 21:36:51 +00:00
Bruno Cardoso Lopes	7ba479d22f	The target specific node PANDN name is misleading. That happens because it's later selected to a ANDNPD/ANDNPS instruction instead of the PANDN instruction. Rename it. llvm-svn: 135087	2011-07-13 21:36:47 +00:00
Bruno Cardoso Lopes	1021b4a9dd	AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd llvm-svn: 135023	2011-07-13 01:15:33 +00:00
Stuart Hastings	be605494ac	Reapply 132424 with fixes. This fixes PR10068. rdar://problem/5993888 llvm-svn: 132606	2011-06-03 23:53:54 +00:00
Rafael Espindola	aa318ae495	Revert 132424 to fix PR10068. llvm-svn: 132479	2011-06-02 19:57:47 +00:00
Stuart Hastings	7adc95f69e	Recommit 132404 with fixes. rdar://problem/5993888 llvm-svn: 132424	2011-06-01 21:33:14 +00:00
Stuart Hastings	aab130d995	Revert 132404 to appease a buildbot. rdar://problem/5993888 llvm-svn: 132419	2011-06-01 19:52:20 +00:00
Stuart Hastings	7b7c102f2c	Add support for x86 CMPEQSS and friends. These instructions do a floating-point comparison, generate a mask of 0s or 1s, and generally DTRT with NaNs. Only profitable when the user wants a materialized 0 or 1 at runtime. rdar://problem/5993888 llvm-svn: 132404	2011-06-01 17:17:45 +00:00
Stuart Hastings	9f20804216	FGETSIGN support for x86, using movmskps/pd. Will be enabled with a patch to TargetLowering.cpp. rdar://problem/5660695 llvm-svn: 132388	2011-06-01 04:39:42 +00:00
David Greene	dd567b214b	[AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement missing patterns for them. Add a SIMD test subdirectory to hold tests for SIMD instruction selection correctness and quality. ' llvm-svn: 126845	2011-03-02 17:23:43 +00:00
David Greene	653f1eed2d	[AVX] Support VSINSERTF128 with more patterns and appropriate infrastructure. This makes lowering 256-bit vectors to 128-bit vectors simple when 256-bit vector support is not available. llvm-svn: 124868	2011-02-04 16:08:29 +00:00
David Greene	c4da110fd2	[AVX] VEXTRACTF128 support. This commit includes patterns for matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines to examine and translate index values. VINSERTF128 comes next. With these two in place we can begin supporting more AVX operations as INSERT/EXTRACT can be used as a fallback when 256-bit support is not available. llvm-svn: 124797	2011-02-03 15:50:00 +00:00
Nate Begeman	4b9db07b02	Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert lowering to use it. Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing. llvm-svn: 122277	2010-12-20 22:04:24 +00:00
Nate Begeman	97b72c99d2	Add support for matching psign & plendvb to the x86 target Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts llvm-svn: 122098	2010-12-17 22:55:37 +00:00
Dale Johannesen	dd224d2333	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result casted back to x86_mmx, provided the result feeds into a previous existing x86_mmx operation. The point of all this is prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	0e023ea02a	fix a long standing wart: all the ComplexPattern's were being passed the root of the match, even though only a few patterns actually needed this (one in X86, several in ARM [which should be refactored anyway], and some in CellSPU that I don't feel like detangling). Instead of requiring all ComplexPatterns to take the dead root, have targets opt into getting the root by putting SDNPWantRoot on the ComplexPattern. llvm-svn: 114471	2010-09-21 20:31:19 +00:00
Dale Johannesen	1eea351920	Fix typos. 128-bit PSHUFB takes 128-bit memory op. v8i16 is not an MMX type; put it where it belongs. llvm-svn: 113785	2010-09-13 21:15:43 +00:00
Bill Wendling	798725b24d	Reapply r113585. The msvc machine is mercurial. llvm-svn: 113610	2010-09-10 20:20:28 +00:00
Bill Wendling	a9c9aaa839	r113585 was causing clang-i686-xp-msvc9 to fail in mysterious ways that I can't understand (the log file was no help). llvm-svn: 113605	2010-09-10 19:20:47 +00:00
Bill Wendling	638a098f72	Mark the sse_load_f32 and sse_load_f64 load patterns as having memoperands so that the memoperands are properly set after DAG building and general mucking about. llvm-svn: 113585	2010-09-10 10:34:22 +00:00
Bruno Cardoso Lopes	f0ea222255	Remove unused target specific node llvm-svn: 113224	2010-09-07 17:38:55 +00:00
Bruno Cardoso Lopes	b3825216ce	Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment llvm-svn: 112694	2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes	4b56d87290	Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes llvm-svn: 112661	2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes	03e4c35302	Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes llvm-svn: 112642	2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes	6f3b38a851	This is the first step towards refactoring the x86 vector shuffle code. The general idea here is to have a group of x86 target specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a bunch of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target specific nodes can change as we move forward with this work. llvm-svn: 111691	2010-08-20 22:55:05 +00:00
Bruno Cardoso Lopes	160be2936b	Add comments to some pattern fragments in x86 llvm-svn: 111041	2010-08-13 20:39:01 +00:00
Bruno Cardoso Lopes	91d61df3eb	Add AVX matching patterns to Packed Bit Test intrinsics. Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests comming with the other 256 intrinsics tests. llvm-svn: 110744	2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes	3d6a3a0ede	Patterns to match AVX 256-bit permutation intrinsics llvm-svn: 110468	2010-08-06 20:03:27 +00:00
Bruno Cardoso Lopes	e3acfd4d58	Add more 256-bit forms for a bunch of regular AVX instructions Add 64-bit (GR64) versions of some instructions (which are not described in their SSE forms, but are described in AVX) llvm-svn: 109063	2010-07-21 23:53:50 +00:00
Bruno Cardoso Lopes	9de0ca73d4	Add 256-bit vaddsub, vhadd, vhsub, vblend and vdpp instructions! llvm-svn: 108769	2010-07-19 23:32:44 +00:00
David Greene	03264efe30	Move some SIMD fragment code into X86InstrFragmentsSIMD so that the utility classes can be used from multiple files. This will aid transitioning to a new refactored x86 SIMD specification. llvm-svn: 108213	2010-07-12 23:41:28 +00:00
David Greene	509be1fe5e	TableGen fragment refactoring. Move some utility TableGen defs, classes, etc. into a common file so they may be used my multiple pattern files. We will use this for the AVX specification to help with the transition from the current SSE specification. llvm-svn: 95727	2010-02-09 23:52:19 +00:00

37 Commits