Commit Graph

189 Commits

Author SHA1 Message Date
Craig Topper a984729f8a Remove unneeded MMX instruction definition by moving pattern to an equivalent instruction definition and removing the filtering from the disassembler table building.
llvm-svn: 192175
2013-10-08 06:30:39 +00:00
Preston Gurd 3fe264d625 Adds support for Atom Silvermont (SLM) - -march=slm
Implements Instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.

Auto detects SLM.

Turns on post RA scheduler when generating code for SLM.

llvm-svn: 190717
2013-09-13 19:23:28 +00:00
Benjamin Kramer b289319fb8 X86: cvtpi2ps is just an SSE instruction with MMX operands. It has no AVX equivalent.
Give it the right register format so we can also emit it when AVX is enabled.

llvm-svn: 183971
2013-06-14 09:31:41 +00:00
Eric Christopher b27cd8bea6 Reapply "Subtract isn't commutative, fix this for MMX psub." with
a somewhat randomly chosen cpu that will minimize cpu specific
differences on bots.

llvm-svn: 181814
2013-05-14 18:33:40 +00:00
Eric Christopher 3eee7454cf Temporarily revert "Subtract isn't commutative, fix this for MMX psub."
It's causing failures on the atom bot.

llvm-svn: 181812
2013-05-14 18:20:42 +00:00
Eric Christopher 0344f495f9 Subtract isn't commutative, fix this for MMX psub.
Patch by Andrea DiBiagio.

llvm-svn: 181809
2013-05-14 17:52:05 +00:00
Jakob Stoklund Olesen 267dd946f6 Annotate x87 and mmx instructions with SchedRW lists.
This only covers the instructions that were given itinerary classes for
the Atom model.

llvm-svn: 178050
2013-03-26 18:24:20 +00:00
Jakob Stoklund Olesen 4d39e81fb8 Remove IIC_DEFAULT from X86Schedule.td
All the instructions tagged with IIC_DEFAULT had nothing in common, and
we already have a NoItineraries class to represent untagged
instructions.

llvm-svn: 177937
2013-03-25 23:12:41 +00:00
Manman Ren acb8becc73 X86 MMX: optimize transfer from mmx to i32
We used to generate a store (movq) + a load.
Now we use movd.

rdar://9946746

llvm-svn: 167056
2012-10-30 22:15:38 +00:00
Michael Liao bbd10792c2 Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
  enabled.

  As the penalty of inter-mixing SSE and AVX instructions, we need
  prevent SSE legacy insn from being generated except explicitly
  specified through some intrinsics. For patterns supported by both
  SSE and AVX, so far, we force AVX insn will be tried first relying on
  AddedComplexity or position in td file. It's error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, and etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to complicated pattern and fast-isel
   fallback to materialize it from constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.

llvm-svn: 162919
2012-08-30 16:54:46 +00:00
Craig Topper a7aaa62d54 Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result.
llvm-svn: 161742
2012-08-13 01:23:55 +00:00
Craig Topper f881d385da Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1.
llvm-svn: 160941
2012-07-30 02:14:02 +00:00
Preston Gurd 09de6ae399 Added X86 Atom latencies to X86InstrMMX.td.
llvm-svn: 156615
2012-05-11 14:27:12 +00:00
Jia Liu b22310fda6 Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Andrew Trick 8523b16ff5 Instruction scheduling itinerary for Intel Atom.
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.

Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.

Adds a test to verify that the scheduler is working.

Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.

Patch by Preston Gurd!

llvm-svn: 149558
2012-02-01 23:20:51 +00:00
Craig Topper eb8f9e9e5b Instruction selection priority fixes to remove the XMM/XMMInt/orAVX predicates. Another commit will remove orAVX functions from X86SubTarget.
llvm-svn: 147841
2012-01-10 06:30:56 +00:00
Craig Topper 744f6311d3 Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level.
llvm-svn: 147762
2012-01-09 00:11:29 +00:00
Eli Friedman f1e2b50a30 PR9848: pandn is not commutative.
No test because I can't think of any way to write one that won't break quickly.

llvm-svn: 130932
2011-05-05 17:45:31 +00:00
Bill Wendling 402e54822b The pshufw instruction came about in MMX2 when SSE was introduced. Don't place
it in with the SSSE3 instructions.

Steward! Could you place this chair by the aft sun deck? I'm trying to get away
from the Astors. They are such boors!

llvm-svn: 115552
2010-10-04 20:24:01 +00:00
Chris Lattner d3593c3a8e the immediate field of pshufw is actually an 8-bit field, not a 8-bit field that is sign extended. This fixes PR8288
llvm-svn: 115473
2010-10-03 19:09:13 +00:00
Chris Lattner b44b202d66 add support for the prefetch/prefetchw instructions, move femms into
the right file.  The assembler supports all the 3dnow instructions now,
but not the "3dnowa" ones.

llvm-svn: 115468
2010-10-03 18:42:30 +00:00
Chris Lattner ae1a9de083 stub out a header to put 3dNow! instructions into.
llvm-svn: 115429
2010-10-02 23:06:23 +00:00
Chris Lattner 4756bbeba0 fix a regression introduced in r115243, in which the instruction
backing int_x86_ssse3_pshuf_w got removed.  This caused PR8280.

llvm-svn: 115422
2010-10-02 21:32:15 +00:00
Dale Johannesen dd224d2333 Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result casted back to x86_mmx, provided the result feeds into a
previous existing x86_mmx operation.

The point of all this is prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.

llvm-svn: 115243
2010-09-30 23:57:10 +00:00
Dale Johannesen 0ec303b97b Move remaining MMX instructions from SSE to MMX.
llvm-svn: 113501
2010-09-09 17:13:07 +00:00
Dale Johannesen 5f4a6f295c Move most MMX instructions (defined as anything that
uses MMX, even if it also uses other things) from InstrSSE
into InstrMMX.  No (intended) functional change.

llvm-svn: 113462
2010-09-09 01:02:39 +00:00
Dale Johannesen 0d2e6ad504 Add intrinsic-based patterns for MMX PINSRW and PEXTRW.
llvm-svn: 113420
2010-09-08 22:08:40 +00:00
Dale Johannesen 4dae01781f Slight cleanup, use only one form of MMXI_binop_rm_int.
llvm-svn: 113406
2010-09-08 20:54:00 +00:00
Dale Johannesen d79bb127dd Add intrinsic forms of mmx<->sse conversions. Notes:
Omission of memory form of PI2PD is intentional; this
does not use an MMX register and does not put the chip
into MMX mode (PI2PS, oddly enough, does).
Operands of PI2PS follow the gcc builtin, not Intel.

llvm-svn: 113388
2010-09-08 19:15:38 +00:00
Dale Johannesen 605acfe533 Add patterns for MMX that use the new intrinsics.
Enable palignr intrinsic.
These may need adjustment for a new VT in due course.

llvm-svn: 113233
2010-09-07 18:10:56 +00:00
Chris Lattner 620693806a fix the encoding of MMX_MOVFR642Qrr, it starts with 0xF2 not 0xF3,
this fixes rdar://8192860.  Unfortunately it can only be triggered
with llc because llvm-mc matches another (correctly encoded) version
of this, so no testcase.

llvm-svn: 108454
2010-07-15 20:13:34 +00:00
Chris Lattner 6d60a14251 rip out even more sporadic v2f32 support.
llvm-svn: 107610
2010-07-05 04:38:33 +00:00
Dan Gohman 79b6a0f140 Fix an mmx movd encoding.
llvm-svn: 104552
2010-05-24 20:51:08 +00:00
Dan Gohman 098a47931c Delete MMX_MOVQ64gmr. It was the same as MMX_MOVQ64mr, but it didn't
have a pattern and it had an invalid encoding.

llvm-svn: 104244
2010-05-20 18:05:01 +00:00
Kevin Enderby e3a1726034 Fixed the encoding of two of the X86 movq instuctions. The Move quadword from
mm to mm/m64 and the Move quadword from xmm2/mem64 to xmm1 had the incorrect
encodings.

llvm-svn: 102952
2010-05-03 21:03:31 +00:00
Stuart Hastings 24b63f1597 Add some missing x86 patterns for movdq2q. Fixes two (LLVM-)GCC DejaGNU testcases. Radar 6881029.
llvm-svn: 102199
2010-04-23 19:03:32 +00:00
Chris Lattner be980f2df7 remove a bunch of dead patterns.
llvm-svn: 99748
2010-03-28 07:38:00 +00:00
Chris Lattner 26e6273772 fix a few more ambiguous types.
llvm-svn: 98531
2010-03-15 05:53:30 +00:00
Chris Lattner d8045649a6 fix some more ambiguous patterns, remove another nontemporalstore
pattern which is broken (source and address swapped).

llvm-svn: 97958
2010-03-08 18:57:56 +00:00
Dan Gohman 8c5d683aa9 The mayHaveSideEffects flag is no longer used.
llvm-svn: 97348
2010-02-27 23:47:46 +00:00
Chris Lattner 7489838a89 remove a confused pattern that is trying to match an address
then use it as an MMX register (!?).

llvm-svn: 96901
2010-02-23 07:16:12 +00:00
Chris Lattner a828850b4d X86InstrInfoSSE.td declares PINSRW as having type v8i16,
don't alis it in the MMX .td file with a different width,
split into two X86ISD opcodes.  This fixes an x86 testcase.

llvm-svn: 96859
2010-02-23 02:07:48 +00:00
David Greene 509be1fe5e TableGen fragment refactoring.
Move some utility TableGen defs, classes, etc. into a common file so
they may be used my multiple pattern files.  We will use this for
the AVX specification to help with the transition from the current
SSE specification.

llvm-svn: 95727
2010-02-09 23:52:19 +00:00
Chris Lattner e96d534ce0 lower the last of the MRMInitReg instructions in MCInstLower.
llvm-svn: 95435
2010-02-05 21:30:49 +00:00
Mon P Wang 586d997e98 Improved widening loads by adding support for wider loads if
the alignment allows.  Fixed a bug where we didn't use a
vector load/store for PR5626.

llvm-svn: 94338
2010-01-24 00:05:03 +00:00
Sean Callanan 04d8cb74f3 Instruction fixes, added instructions, and AsmString changes in the
X86 instruction tables.

Also (while I was at it) cleaned up the X86 tables, removing tabs and
80-line violations.

This patch was reviewed by Chris Lattner, but please let me know if
there are any problems.

* X86*.td
	Removed tabs and fixed 80-line violations

* X86Instr64bit.td
	(IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW)
		Added
	(CALL, CMOV) Added qualifiers
	(JMP) Added PC-relative jump instruction
	(POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate
		that it is 64-bit only (ambiguous since it has no
		REX prefix)
	(MOV) Added rr form going the other way, which is encoded
		differently
	(MOV) Changed immediates to offsets, which is more correct;
		also fixed MOV64o64a to have to a 64-bit offset
	(MOV) Fixed qualifiers
	(MOV) Added debug-register and condition-register moves
	(MOVZX) Added more forms
	(ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which
		(as with MOV) are encoded differently
	(ROL) Made REX.W required
	(BT) Uncommented mr form for disassembly only
	(CVT__2__) Added several missing non-intrinsic forms
	(LXADD, XCHG) Reordered operands to make more sense for
		MRMSrcMem
	(XCHG) Added register-to-register forms
	(XADD, CMPXCHG, XCHG) Added non-locked forms
* X86InstrSSE.td
	(CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ)
		Added
* X86InstrFPStack.td
	(COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP,
	 FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X,
	 FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM,
	 FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE,
	 FXRSTOR)
		Added
	(FCOM, FCOMP) Added qualifiers
	(FSTENV, FSAVE, FSTSW) Fixed opcode names
	(FNSTSW) Added implicit register operand
* X86InstrInfo.td
	(opaque512mem) Added for FXSAVE/FXRSTOR
	(offset8, offset16, offset32, offset64) Added for MOV
	(NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR,
	 LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS,
	 LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT,
	 LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC,
	 CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC,
	 SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL,
	 VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD,
	 VMWRITE, VMXOFF, VMXON) Added
	(NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier
	(JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL,
	 JGE, JLE, JG, JCXZ) Added 32-bit forms
	(MOV) Changed some immediate forms to offset forms
	(MOV) Added reversed reg-reg forms, which are encoded
		differently
	(MOV) Added debug-register and condition-register moves
	(CMOV) Added qualifiers
	(AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV
	(BT) Uncommented memory-register forms for disassembler
	(MOVSX, MOVZX) Added forms
	(XCHG, LXADD) Made operand order make sense for MRMSrcMem
	(XCHG) Added register-register forms
	(XADD, CMPXCHG) Added unlocked forms
* X86InstrMMX.td
	(MMX_MOVD, MMV_MOVQ) Added forms
* X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table
	change

* X86RegisterInfo.td: Added debug and condition register sets
* x86-64-pic-3.ll: Fixed testcase to reflect call qualifier
* peep-test-3.ll: Fixed testcase to reflect test qualifier
* cmov.ll: Fixed testcase to reflect cmov qualifier
* loop-blocks.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-11.ll: Fixed testcase to reflect call qualifier
* 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call
  qualifier
* x86-64-pic-2.ll: Fixed testcase to reflect call qualifier
* live-out-reg-info.ll: Fixed testcase to reflect test qualifier
* tail-opts.ll: Fixed testcase to reflect call qualifiers
* x86-64-pic-10.ll: Fixed testcase to reflect call qualifier
* bss-pagealigned.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-1.ll: Fixed testcase to reflect call qualifier
* widen_load-1.ll: Fixed testcase to reflect call qualifier

llvm-svn: 91638
2009-12-18 00:01:26 +00:00
Dan Gohman 453d64c9f5 Rename usesCustomDAGSchedInserter to usesCustomInserter, and update a
bunch of associated comments, because it doesn't have anything to do
with DAGs or scheduling. This is another step in decoupling MachineInstr
emitting from scheduling.

llvm-svn: 85517
2009-10-29 18:10:34 +00:00
Daniel Dunbar c4f8ea4ccb Add 'isCodeGenOnly' bit to Instruction .td records.
- Used to mark fake instructions which don't correspond to an actual machine
   instruction (or are duplicates of a real instruction). This is to be used for
   "special cases" in the .td files, which should be ignored by things like the
   assembler and disassembler. We still need a good solution to handle pervasive
   duplication, like with the Int_ instructions.

 - Set the bit on fake "mov 0" style instructions, which allows turning an
   assembler matcher warning into a hard error.

 - -2 FIXMEs.

llvm-svn: 78731
2009-08-11 22:17:52 +00:00
Eric Christopher d91dceea0f Whitespace, 80-column, and isTwoAddress -> Constraints = "" changes.
No functional change.

llvm-svn: 78608
2009-08-10 22:37:37 +00:00
Evan Cheng 3aa1e77572 Remove neverHasSideEffects on MMX_MOVD64rrv164 since it has a matching pattern.
llvm-svn: 77978
2009-08-03 18:07:19 +00:00
Rafael Espindola 70e9816624 Use movd instead of movq
llvm-svn: 77956
2009-08-03 05:21:05 +00:00
Rafael Espindola 7bdf4c2cec Fix the instruction encoding.
llvm-svn: 77944
2009-08-03 03:27:05 +00:00
Rafael Espindola 18ba271a79 Use movq to move 64 bits in and out of mmx registers.
Fixes PR4669

llvm-svn: 77940
2009-08-03 02:45:34 +00:00
Eli Friedman caccc0081a Add support for MMX VSETCC.
llvm-svn: 76713
2009-07-22 01:06:52 +00:00
Eli Friedman 5911537b68 Misc encoding fixes; reported on llvmdev.
llvm-svn: 75142
2009-07-09 16:49:25 +00:00
Bill Wendling f6e8f6b0f4 "The MMX_MASKMOVQ and MMX_MASKMOVQ64 instructions are labeled as MRMDestMem
instructions, which implies that there is an explicit memory operand.  There is
(however) no explicit memory operand; although this is a store, the only memory
operand is implicit, indicated by DS:EDI.  This causes the table-generation code
for the disassembler to report an error."

Patch by Sean Callanan!

llvm-svn: 73989
2009-06-23 19:52:59 +00:00
Eli Friedman 1b1844ad1f Get rid of some bogus patterns for X86vzmovl. Don't create VZEXT_MOVL
nodes for vectors with an i16 element type.  Add an optimization for 
building a vector which is all zeros/undef except for the bottom 
element, where the bottom element is an i8 or i16.

llvm-svn: 72988
2009-06-06 06:05:10 +00:00
Eli Friedman 6c101ebfa8 Get rid of a bogus pattern that interferes with optimization.
llvm-svn: 72985
2009-06-06 04:17:04 +00:00
Stuart Hastings 2797e7a483 Evan says it's wrong; back out 72808.
llvm-svn: 72817
2009-06-03 22:59:34 +00:00
Stuart Hastings 679ec6917c Recognize another euphemism for MOVDQ2Q.
llvm-svn: 72808
2009-06-03 21:39:14 +00:00
Bill Wendling 0feb0e6071 "The instructions MMX_PSADBWrm and MMX_PSADBWrr have opcode 0b11100000 (e0), but
the Intel manual (screenshot) says it should be 0b11110110 (f6).  The existing
encoding causes a disassembly conflict with MMX_PAVGBrm, which really should be
0f e0."

Patch by Sean Callanan!

llvm-svn: 72508
2009-05-28 02:04:00 +00:00
Nate Begeman 8d6d4b9289 2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan.
PR2957

ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask.  A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to 
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

llvm-svn: 70225
2009-04-27 18:41:29 +00:00
Rafael Espindola b93db668b3 Revert 69952. Causes testsuite failures on linux x86-64.
llvm-svn: 69967
2009-04-24 12:40:33 +00:00
Nate Begeman bb881d66f4 PR2957
ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle
mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes
as the shuffle mask.  A value of -1 represents UNDEF.

In addition to eliminating the creation of illegal BUILD_VECTORS just to 
represent shuffle masks, we are better about canonicalizing the shuffle mask,
resulting in substantially better code for some classes of shuffles.

A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next.

llvm-svn: 69952
2009-04-24 03:42:54 +00:00
Evan Cheng 9f8fddeed8 Only v1i16 (i.e. _m64) is returned via RAX / RDX.
llvm-svn: 65313
2009-02-23 09:03:22 +00:00
Mon P Wang 9c2d26d208 Added support for SELECT v8i8 v4i16 for X86 (MMX)
Added support for TRUNC v8i16 to v8i8 for X86 (MMX)

llvm-svn: 60916
2008-12-12 01:25:51 +00:00
Evan Cheng 1339e72d97 Use mmx (punpckldq VR64, (mmx_v_set0)) to clear high 32-bits of a VR64 register.
llvm-svn: 60499
2008-12-03 19:38:05 +00:00
Dan Gohman 69cc2cbbff Rename isSimpleLoad to canFoldAsLoad, to better reflect its meaning.
llvm-svn: 60487
2008-12-03 18:15:48 +00:00
Evan Cheng 27889ab29f Add more vector move low and zero-extend patterns.
llvm-svn: 58752
2008-11-05 06:04:51 +00:00
Bill Wendling 76105a4a4f Make "movdq2q" and "movq2dq" dependent upon having SSE2 because they use the
SSE2 registers as well as the MMX registers.

llvm-svn: 55436
2008-08-27 21:32:04 +00:00
Bill Wendling 6cfd3830fb Nevermind. This broke the bootstrap (?!).
llvm-svn: 55318
2008-08-25 18:32:39 +00:00
Bill Wendling dd6759aea7 MOVQ2DQ and MOVQ2DQ use SSE2. We should conditionalize the use of these
instructions on having SSE2.

llvm-svn: 55317
2008-08-25 18:20:52 +00:00
Anton Korobeynikov 31099519d0 Provide a 64 bit variant of mmx.maskmovq intrinsic lowering.
Is there way to avoid explicit target check?

llvm-svn: 55238
2008-08-23 15:53:19 +00:00
Nate Begeman 628ab8c673 Remove dead PatLeaf; there are a number of issues around MMX movl that need to be fixed.
llvm-svn: 54026
2008-07-25 17:25:04 +00:00
Dale Johannesen e5f4ffbdf1 Add v2f32 (MMX) type to X86. Support is primitive:
load,store,call,return,bitcast.  This is enough to
make call and return work.

llvm-svn: 52691
2008-06-24 22:01:44 +00:00
Evan Cheng 5e28227dbd Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq.
llvm-svn: 51667
2008-05-29 08:22:04 +00:00
Evan Cheng 961339bbdb Handle a few more cases of folding load i64 into xmm and zero top bits.
Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch.

llvm-svn: 50918
2008-05-09 21:53:03 +00:00
Evan Cheng 78af38c392 Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine.
llvm-svn: 50838
2008-05-08 00:57:18 +00:00
Evan Cheng cdf22f2953 Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allow us to simplify the horribly complicated matching code.
llvm-svn: 50601
2008-05-03 00:52:09 +00:00
Evan Cheng 5ba02020e6 Fix illegal MMX_MOVDQ2Qrr pattern. vector_extract result must be a scalar value.
llvm-svn: 50291
2008-04-25 20:12:46 +00:00
Evan Cheng ccde6dd016 Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers.
llvm-svn: 50289
2008-04-25 19:11:04 +00:00
Evan Cheng 6d653b58f9 Fix MMX_MOVQ2DQrr pattern. It's illegal to do a bitconvert from a smaller type to a larger one.
llvm-svn: 50278
2008-04-25 18:19:54 +00:00
Dan Gohman db08f5218e Fix the encoding of the MMX movd that moves from MMX to 64-bit GPR.
llvm-svn: 50053
2008-04-21 19:52:29 +00:00
Dan Gohman 01a5d36d9d Add movd instructions to move from MMX registers
to 64-bit GPR registers on x86-64.

llvm-svn: 49757
2008-04-15 23:55:07 +00:00
Evan Cheng 92b4488202 Undo 48570. Correctly match mmx shift instructions with an immediate operand.
llvm-svn: 48627
2008-03-21 00:40:09 +00:00
Evan Cheng bbba76fc99 Add intrinsics to match mmx shift builtin's with immediate operand.
llvm-svn: 48569
2008-03-19 23:38:52 +00:00
Evan Cheng 0e7b00d79f Replace all target specific implicit def instructions with a target independent one: TargetInstrInfo::IMPLICIT_DEF.
llvm-svn: 48380
2008-03-15 00:03:38 +00:00
Evan Cheng 99ee78ef63 Clean up my own mess.
X86 lowering normalize vector 0 to v4i32. However DAGCombine can fold (sub x, x) -> 0 after legalization. It can create a zero vector of a type that's not expected (e.g. v8i16). We don't want to disable the optimization since leaving a (sub x, x) is really bad. Add isel patterns for other types of vector 0 to ensure correctness. It's highly unlikely to happen other than in bugpoint reduced test cases.

llvm-svn: 48279
2008-03-12 07:02:50 +00:00
Anders Carlsson 17df4cd397 Use the correct instruction encodings for the 64-bit MMX movd.
llvm-svn: 47740
2008-02-29 01:35:12 +00:00
Evan Cheng 6200c225e0 - When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type.
- X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.

llvm-svn: 47290
2008-02-18 23:04:32 +00:00
Chris Lattner 317332fc2a Start inferring side effect information more aggressively, and fix many bugs in the
x86 backend where instructions were not marked maystore/mayload, and perf issues where
instructions were not marked neverHasSideEffects.  It would be really nice if we could
write patterns for copy instructions.

I have audited all the x86 instructions down to MOVDQAmr.  The flags on others and on
other targets are probably not right in all cases, but no clients currently use this
info that are enabled by default.

llvm-svn: 45829
2008-01-10 07:59:24 +00:00
Chris Lattner aca7ca3730 remove explicit sets of 'neverHasSideEffects' that can now be
inferred from the instr patterns.

llvm-svn: 45824
2008-01-10 05:45:39 +00:00
Chris Lattner a4ce4f6987 rename isLoad -> isSimpleLoad due to evan's desire to have such a predicate.
llvm-svn: 45667
2008-01-06 23:38:27 +00:00
Chris Lattner f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Bill Wendling b3d85a5d4b Add "mayHaveSideEffects" and "neverHasSideEffects" flags to some instructions. I
based what flag to set on whether it was already marked as
"isRematerializable". If there was a further check to determine if it's "really"
rematerializable, then I marked it as "mayHaveSideEffects" and created a check
in the X86 back-end similar to the remat one.

llvm-svn: 45132
2007-12-17 23:07:56 +00:00
Evan Cheng 6e68381e02 Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled.
llvm-svn: 44960
2007-12-12 23:12:09 +00:00
Chris Lattner 5728bdd4db Fix a long standing deficiency in the X86 backend: we would
sometimes emit "zero" and "all one" vectors multiple times,
for example:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M2
	ret

instead of:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	movq	%mm0, _M2
	ret

This patch fixes this by always arranging for zero/one vectors
to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be
any random type.  This ensures they get trivially CSE'd on the dag.
This fix is also important for LegalizeDAGTypes, as it gets unhappy
when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when
'i64' isn't legal.

This patch makes the following changes:

1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into
   their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or
   immAllZerosV in the wrong form now use *_bc to match them with a
   bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle 
   bitcast'd zero vectors, which simplifies the code actually.
5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that
   is legal, instead of generating one that is illegal and expecting
   a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcast of zeros.
7) several other minor tweaks.

This patch is definite goodness, but has the potential to cause random
code quality regressions.  Please be on the lookout for these and let 
me know if they happen.

llvm-svn: 44310
2007-11-25 00:24:49 +00:00
Evan Cheng 3e18e504ae Remove (somewhat confusing) Imp<> helper, use let Defs = [], Uses = [] instead.
llvm-svn: 41863
2007-09-11 19:55:27 +00:00
Evan Cheng c2081fe573 Mark load instructions with isLoad = 1.
llvm-svn: 41595
2007-08-30 05:49:43 +00:00
Dan Gohman fa3eeeedc0 Mark the SSE and MMX load instructions that
X86InstrInfo::isReallyTriviallyReMaterializable knows how to handle
with the isReMaterializable flag so that it is given a chance to handle
them. Without hoisting constant-pool loads from loops this isn't very
visible, though it does keep CodeGen/X86/constant-pool-remat-0.ll from
making a copy of the constant pool on the stack.

llvm-svn: 40736
2007-08-02 14:27:55 +00:00