llvm-project

Commit Graph

Author	SHA1	Message	Date
Eric Christopher	64831c6a4c	Remove the palignr intrinsics now that we lower them to vector shuffles, shifts and null vectors. Autoupgrade these to what we'd lower them to. Add a testcase to exercise this. llvm-svn: 101851	2010-04-20 00:59:54 +00:00
Chris Lattner	7f5088e6de	a bunch of ssse3 instructions are misencoded to think they have an i8 field when they really do not. This fixes rdar://7840289 llvm-svn: 101629	2010-04-17 07:38:24 +00:00
Eric Christopher	eabc9623da	Allow lowering for palignr instructions for mmx sized vectors. Add patterns to handle the lowering. llvm-svn: 101331	2010-04-15 01:40:20 +00:00
Eric Christopher	c0f63cf7a9	mpsadbw is not commutative. Fixes PR3440. llvm-svn: 100736	2010-04-08 00:52:02 +00:00
Eric Christopher	000e502eb1	Rewrite aesimc handling. It only takes a single input and has a single dest. llvm-svn: 100252	2010-04-02 23:48:33 +00:00
Eric Christopher	2ef63183a5	Separate out the AES-NI instructions from the SSE4.2 instructions. Add a new subtarget option for AES and check for the support. Add "westmere" line of processors and add AES-NI support to the core i7. Add a couple of TODOs for information I couldn't verify. llvm-svn: 100231	2010-04-02 21:54:27 +00:00
Eric Christopher	9002ac5d93	Add aeskeygenassist intrinsic and rename all of the aes intrinsics to aes instead of sse4.2. Add a brief todo for a subtarget flag and rework the aeskeygenassist instruction to more closely match the docs. llvm-svn: 100078	2010-04-01 03:05:45 +00:00
Jakob Stoklund Olesen	9986ba954c	Replace V_SET0 with variants for each SSE execution domain. llvm-svn: 99975	2010-03-31 00:40:13 +00:00
Jakob Stoklund Olesen	3493398f13	V_SETALLONES is an integer instruction. Since it is just a pxor in disguise, we should probably expand it to a full polymorphic triple. llvm-svn: 99953	2010-03-30 22:46:55 +00:00
Eric Christopher	6ad8167714	Remove the pmulld intrinsic and autoupdate it as a vector multiply. Rewrite the pmulld patterns, and make sure that they fold in loads of arguments into the instruction. llvm-svn: 99910	2010-03-30 18:49:01 +00:00
Eric Christopher	9bdadf0d99	We'll never match these as instructions, just as intrinsics so remove the SDNodes. llvm-svn: 99835	2010-03-29 20:41:51 +00:00
Chris Lattner	11f85ccf7d	zap an extra line that Eli noticed! llvm-svn: 99770	2010-03-28 18:52:28 +00:00
Chris Lattner	505849d277	remove a pattern with no testcase that doesn't appear to be matchable: it seems like it would always constant fold. llvm-svn: 99758	2010-03-28 08:40:48 +00:00
Chris Lattner	ec5fe65838	fix some modelling problems exposed by a patch I'm working on. bsr/bsf/ptest nodes all have an EFLAGS result when made by isel lowering. llvm-svn: 99736	2010-03-28 05:07:17 +00:00
Jakob Stoklund Olesen	3758ff917e	Tag SSE2 integer instructions as SSEPackedInt. llvm-svn: 99540	2010-03-25 18:52:04 +00:00
Bob Wilson	e543e7fcb1	Reapply Kevin's change 94440, now that Chris has fixed the limitation on opcode values fitting in one byte (svn r99494). llvm-svn: 99514	2010-03-25 16:36:14 +00:00
Bob Wilson	5b2da69f6d	Speculatively revert this to see if it fixes buildbot failures. --- Reverse-merging r99440 into '.': U test/MC/AsmParser/X86/x86_32-bit_cat.s U test/MC/AsmParser/X86/x86_32-encoding.s U include/llvm/IntrinsicsX86.td U include/llvm/CodeGen/SelectionDAGNodes.h U lib/Target/X86/X86InstrSSE.td U lib/Target/X86/X86ISelLowering.h llvm-svn: 99450	2010-03-24 23:26:29 +00:00
Kevin Enderby	f5584a7397	Added the Advanced Encryption Standard (AES) Instructions. llvm-svn: 99440	2010-03-24 22:33:33 +00:00
Kevin Enderby	cf0843ed93	Fixed the encoding problems of the crc32 instructions. All had the Operand size override prefix and only the r/m16 forms should have had that. Also for variant one, the AT&T syntax, added suffixes to all forms. Also added the missing 64-bit form for 'CRC32 r64, r/m8'. Plus added test cases for all forms and tweaked one test case to add the needed suffixes. llvm-svn: 98980	2010-03-19 20:04:42 +00:00
Chris Lattner	83facb0812	Now that tblgen can handle matching implicit defs of instructions to input patterns, we can fix X86ISD::CMP and X86ISD::BT as taking two inputs (which have to be the same type) and returning an i32. This is how the SDNodes get made in the graph, but we weren't able to model it this way due to deficiencies in the pattern language. Now we can change things like this: def UCOM_FpIr80: FpI_<(outs), (ins RFP80:$lhs, RFP80:$rhs), CompareFP, - [(X86cmp RFP80:$lhs, RFP80:$rhs), - (implicit EFLAGS)]>; // CC = ST(0) cmp ST(i) + [(set EFLAGS, (X86cmp RFP80:$lhs, RFP80:$rhs))]>; and fix terrible crimes like this: -def : Pat<(parallel (X86cmp GR8:$src1, 0), (implicit EFLAGS)), +def : Pat<(X86cmp GR8:$src1, 0), (TEST8rr GR8:$src1, GR8:$src1)>; This relies on matching the result of TEST8rr (which is EFLAGS, which is an implicit def) to the result of X86cmp, an i32. llvm-svn: 98903	2010-03-19 00:01:11 +00:00
Chris Lattner	26e6273772	fix a few more ambiguous types. llvm-svn: 98531	2010-03-15 05:53:30 +00:00
Chris Lattner	d8045649a6	fix some more ambiguous patterns, remove another nontemporalstore pattern which is broken (source and address swapped). llvm-svn: 97958	2010-03-08 18:57:56 +00:00
Chris Lattner	ca8d590c28	remove a non-temporal store pattern which is not tested and could never have matched because the operand list was backwards. llvm-svn: 97933	2010-03-08 03:18:28 +00:00
Dan Gohman	bdd6405f29	Implement XMM subregs. Extracting the low element of a vector is now done with EXTRACT_SUBREG, and the zero-extension performed by load movss is now modeled with SUBREG_TO_REG, and so on. Register-to-register movss and movsd are no longer considered copies; they are two-address instructions which insert a scalar into a vector. llvm-svn: 97354	2010-02-28 00:17:42 +00:00
Dan Gohman	8c5d683aa9	The mayHaveSideEffects flag is no longer used. llvm-svn: 97348	2010-02-27 23:47:46 +00:00
Dan Gohman	9300486d68	Delete a bunch of redundant predicates. llvm-svn: 97201	2010-02-26 01:14:30 +00:00
Chris Lattner	d17089231a	remove a bunch of dead named arguments in input patterns, though some look dubious afaict, these are all ok. llvm-svn: 96899	2010-02-23 06:54:29 +00:00
Chris Lattner	fd47c79774	add a missing type cast. llvm-svn: 96574	2010-02-18 06:33:42 +00:00
David Greene	9641d06809	Add support for emitting non-temporal stores for DAGs marked non-temporal. Fix from r96241 for botched encoding of MOVNTDQ. Add documentation for !nontemporal metadata. Add a simpler movnt testcase. llvm-svn: 96386	2010-02-16 20:50:18 +00:00
Chris Lattner	bcbaaba532	revert r96241. It breaks two regression tests, isn't documented, and the testcase needs improvement. llvm-svn: 96265	2010-02-15 20:53:01 +00:00
David Greene	63cedef74b	Add support for emitting non-temporal stores for DAGs marked non-temporal. llvm-svn: 96241	2010-02-15 17:02:56 +00:00
Chris Lattner	064e926362	Remove special cases for [LM]FENCE, MONITOR and MWAIT from encoder and decoder by using new MRM_ forms. llvm-svn: 96048	2010-02-12 23:54:57 +00:00
Nate Begeman	c780af6471	Add a missing pattern for movhps so that we get: movq (%ecx,%edx,2), %xmm2 movhps (%ecx,%eax,2), %xmm2 rather than: movq (%eax, %edx, 2), %xmm2 movq (%eax, %ebx, 2), %xmm3 movlhps %xmm3, %xmm2 Testcase forthcoming. llvm-svn: 95948	2010-02-12 01:10:45 +00:00
Kevin Enderby	a7c1d6cfd1	Fix the encoding of the movntdqa X86 instruction. It was missing the 0x66 prefix which is part of the opcode encoding. llvm-svn: 95729	2010-02-10 00:10:31 +00:00
Chris Lattner	86bd194234	really kill off the last MRMInitReg inst, remove logic from encoder. llvm-svn: 95437	2010-02-05 21:34:18 +00:00
Chris Lattner	e96d534ce0	lower the last of the MRMInitReg instructions in MCInstLower. llvm-svn: 95435	2010-02-05 21:30:49 +00:00
David Greene	206351a1ff	Implement a feature (-vector-unaligned-mem) to allow targets to ignore alignment requirements for SIMD memory operands. This is useful on architectures like the AMD 10h that do not trap on unaligned references if a status bit is twiddled at startup time. llvm-svn: 93151	2010-01-11 16:29:42 +00:00
Evan Cheng	71d7eaa87e	Remove target attribute break-sse-dep. Instead, do not fold load into sse partial update instructions unless optimizing for size. llvm-svn: 91910	2009-12-22 17:47:23 +00:00
Evan Cheng	4cf30b72bf	On recent Intel u-arch's, folding loads into some unary SSE instructions can be non-optimal. To be precise, we should avoid folding loads if the instructions only update part of the destination register, and the non-updated part is not needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks the partial register dependency and it can improve performance. e.g. movss (%rdi), %xmm0 cvtss2sd %xmm0, %xmm0 instead of cvtss2sd (%rdi), %xmm0 An alternative method to break dependency is to clear the register first. e.g. xorps %xmm0, %xmm0 cvtss2sd (%rdi), %xmm0 llvm-svn: 91672	2009-12-18 07:40:29 +00:00
Sean Callanan	04d8cb74f3	Instruction fixes, added instructions, and AsmString changes in the X86 instruction tables. Also (while I was at it) cleaned up the X86 tables, removing tabs and 80-line violations. This patch was reviewed by Chris Lattner, but please let me know if there are any problems. * X86.td Removed tabs and fixed 80-line violations X86Instr64bit.td (IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW) Added (CALL, CMOV) Added qualifiers (JMP) Added PC-relative jump instruction (POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate that it is 64-bit only (ambiguous since it has no REX prefix) (MOV) Added rr form going the other way, which is encoded differently (MOV) Changed immediates to offsets, which is more correct; also fixed MOV64o64a to have to a 64-bit offset (MOV) Fixed qualifiers (MOV) Added debug-register and condition-register moves (MOVZX) Added more forms (ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which (as with MOV) are encoded differently (ROL) Made REX.W required (BT) Uncommented mr form for disassembly only (CVT__2__) Added several missing non-intrinsic forms (LXADD, XCHG) Reordered operands to make more sense for MRMSrcMem (XCHG) Added register-to-register forms (XADD, CMPXCHG, XCHG) Added non-locked forms * X86InstrSSE.td (CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ) Added * X86InstrFPStack.td (COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP, FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X, FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM, FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE, FXRSTOR) Added (FCOM, FCOMP) Added qualifiers (FSTENV, FSAVE, FSTSW) Fixed opcode names (FNSTSW) Added implicit register operand * X86InstrInfo.td (opaque512mem) Added for FXSAVE/FXRSTOR (offset8, offset16, offset32, offset64) Added for MOV (NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR, LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS, LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT, LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC, CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC, SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL, VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD, VMWRITE, VMXOFF, VMXON) Added (NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier (JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL, JGE, JLE, JG, JCXZ) Added 32-bit forms (MOV) Changed some immediate forms to offset forms (MOV) Added reversed reg-reg forms, which are encoded differently (MOV) Added debug-register and condition-register moves (CMOV) Added qualifiers (AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV (BT) Uncommented memory-register forms for disassembler (MOVSX, MOVZX) Added forms (XCHG, LXADD) Made operand order make sense for MRMSrcMem (XCHG) Added register-register forms (XADD, CMPXCHG) Added unlocked forms * X86InstrMMX.td (MMX_MOVD, MMV_MOVQ) Added forms * X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table change * X86RegisterInfo.td: Added debug and condition register sets * x86-64-pic-3.ll: Fixed testcase to reflect call qualifier * peep-test-3.ll: Fixed testcase to reflect test qualifier * cmov.ll: Fixed testcase to reflect cmov qualifier * loop-blocks.ll: Fixed testcase to reflect call qualifier * x86-64-pic-11.ll: Fixed testcase to reflect call qualifier * 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call qualifier * x86-64-pic-2.ll: Fixed testcase to reflect call qualifier * live-out-reg-info.ll: Fixed testcase to reflect test qualifier * tail-opts.ll: Fixed testcase to reflect call qualifiers * x86-64-pic-10.ll: Fixed testcase to reflect call qualifier * bss-pagealigned.ll: Fixed testcase to reflect call qualifier * x86-64-pic-1.ll: Fixed testcase to reflect call qualifier * widen_load-1.ll: Fixed testcase to reflect call qualifier llvm-svn: 91638	2009-12-18 00:01:26 +00:00
Evan Cheng	493b882f80	Optimize splat of a scalar load into a shuffle of a vector load when it's legal. e.g. vector_shuffle (scalar_to_vector (i32 load (ptr + 4))), undef, <0, 0, 0, 0> => vector_shuffle (v4i32 load ptr), undef, <1, 1, 1, 1> iff ptr is 16-byte aligned (or can be made into 16-byte aligned). llvm-svn: 90984	2009-12-09 21:00:30 +00:00
Sean Callanan	c1f532e930	Recommitting PALIGNR shift width fixes. Thanks to Daniel Dunbar for fixing clang intrinsics: http://llvm.org/viewvc/llvm-project?view=rev&revision=89499 llvm-svn: 89500	2009-11-20 22:28:42 +00:00
Sean Callanan	19d92728d0	Reverting PALIGNR fix until I figure out how this broke the Clang testsuite. llvm-svn: 89495	2009-11-20 22:09:28 +00:00
Sean Callanan	fbed130173	Fixed PALIGNR to take 8-bit rotations in all cases. Also fixed the corresponding testcase, and the PALIGNR intrinsic (tested for correctness with llvm-gcc). llvm-svn: 89491	2009-11-20 21:40:28 +00:00
Evan Cheng	5392cc9d14	Re-apply 89011. It's not to be blamed. llvm-svn: 89081	2009-11-17 09:51:18 +00:00
Evan Cheng	05938e819b	Revert 89011. Buildbot thinks it might be breaking stuff. llvm-svn: 89076	2009-11-17 09:20:28 +00:00
Evan Cheng	ce28f6f478	A few more instructions that should be marked re-materializable. llvm-svn: 89011	2009-11-17 00:23:22 +00:00
Evan Cheng	f25ef4ffb0	- Check memoperand alignment instead of checking stack alignment. Most load / store folding instructions are not referencing spill stack slots. - Mark MOVUPSrm re-materializable. llvm-svn: 88974	2009-11-16 21:56:03 +00:00
Nate Begeman	3a313df69b	x86 vector shuffle cleanup/fixes: 1. rename the movhp patfrag to movlhps, since thats what it actually matches 2. eliminate the bogus movhps load and store patterns, they were incorrect. The load transforms are already handled (correctly) by shufps/unpack. 3. revert a recent test change to its correct form. llvm-svn: 86415	2009-11-07 23:17:15 +00:00
Eric Christopher	bd05185ef1	Fix a couple of shuffle patterns to use movhlps instead of movhps as the constraint. Changes optimizations so update testcases as appropriate as well. llvm-svn: 86360	2009-11-07 08:45:53 +00:00
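
Note on commit 64831c6a4c (palignr lowered to "vector shuffles, shifts and null vectors"): the three cases in that message follow from the instruction's byte-shift semantics. The following is a minimal reference model of 128-bit PALIGNR, not the code from the commit. The function names palignr and lowering_kind are hypothetical, chosen only for illustration, and the exact boundaries the autoupgrade code uses are an assumption based on the ISA behavior.

```python
def palignr(a, b, imm):
    """Model 128-bit PALIGNR: concatenate a (high) and b (low),
    shift right by imm bytes, and keep the low 16 bytes."""
    assert len(a) == 16 and len(b) == 16
    # Bytes 0..15 come from b, 16..31 from a; everything above is zero.
    concat = list(b) + list(a) + [0] * 32
    # Any shift of 32 bytes or more leaves only zeros.
    imm = min(imm, 32)
    return concat[imm:imm + 16]


def lowering_kind(imm):
    """Classify which simpler operation reproduces the result
    (hypothetical helper; boundaries assumed from the semantics)."""
    if imm >= 32:
        return "null vector"   # every result byte is zero
    if imm > 16:
        return "shift"         # byte shift right of a by imm - 16, zero fill
    return "shuffle"           # lanes imm..imm+15 of the pair (b, a)


if __name__ == "__main__":
    a = [100 + i for i in range(16)]
    b = list(range(16))
    for imm in (0, 4, 16, 20, 32):
        print(imm, lowering_kind(imm), palignr(a, b, imm))
```

For an immediate of at most 16, every result byte is an existing byte of one of the two inputs, so a shuffle suffices. Between 17 and 31, only bytes of the first operand survive and zeros fill the top, which is a plain byte shift. From 32 upward nothing survives.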

393 Commits