Commit Graph

77 Commits

Author SHA1 Message Date
Evan Cheng f8b4c0035b Disable auto-detection of AVX support since AVX codegen support is not ready.
llvm-svn: 121677
2010-12-13 04:23:53 +00:00
Nate Begeman 8b08f5232b Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
llvm-svn: 121439
2010-12-10 00:26:57 +00:00
Benjamin Kramer 2f489236ab Add patterns for the x86 popcnt instruction.
- Also adds a new POPCNT subtarget feature that is currently enabled if the target
  supports SSE4.2 (nehalem) or SSE4A (barcelona).

llvm-svn: 120917
2010-12-04 20:32:23 +00:00
Jim Grosbach 4cf25f5ba9 Clean up comments.
llvm-svn: 117785
2010-10-30 13:48:28 +00:00
Jim Grosbach c6e13f7383 Clean up asm writer usage for x86 and msp430 to flag that the writer should
use MC instructions in the printInstruction() method via the tablegen flag
for it rather than a #define prior to including the autogenerated bits.

llvm-svn: 115238
2010-09-30 23:40:25 +00:00
Daniel Dunbar 167b9d7f30 tblgen/AsmMatcher: Always emit the match function as 'MatchInstructionImpl',
target specific parsers can adapt the TargetAsmParser to this.

llvm-svn: 110888
2010-08-12 00:55:32 +00:00
Bruno Cardoso Lopes d618c8ac64 Declare CLMUL as a subtarget feature
llvm-svn: 109207
2010-07-23 01:22:45 +00:00
Daniel Dunbar b82cd9319b MC/X86: We now match instructions like "incl %eax" correctly for the arch we are
assembling; remove crufty custom cleanup code.

llvm-svn: 108681
2010-07-19 06:14:54 +00:00
Daniel Dunbar 9b816a1bb3 MC/X86: Add "support" for matching ATT style mnemonic prefixes.
- The idea is that when a match fails, we just try to match each of +'b', +'w',
   +'l'. If exactly one matches, we assume this is a mnemonic prefix and accept
   it. If all match, we assume it is width generic, and take the 'l' form.

 - This would be a horrible hack, if it weren't so simple. Therefore it is an
   elegant solution! Chris gets the credit for this particular elegant
   solution. :)

 - Next step to making this more robust is to have the X86 matcher generate the
   mnemonic prefix information. Ideally we would also compute up-front exactly
   which mnemonic to attempt to match, but this may require more custom code in
   the matcher than is really worth it.

llvm-svn: 103012
2010-05-04 16:12:42 +00:00
Jakob Stoklund Olesen b93331f3be Replace TSFlagsFields and TSFlagsShifts with a simpler TSFlags field.
When a target instruction wants to set target-specific flags, it should simply
set bits in the TSFlags bit vector defined in the Instruction TableGen class.

This works well because TableGen resolves member references late:

class I : Instruction {
  AddrMode AM = AddrModeNone;
  let TSFlags{3-0} = AM.Value;
}

let AM = AddrMode4 in
def ADD : I;

TSFlags gets the expected bits from AddrMode4 in this example.

llvm-svn: 100384
2010-04-05 03:10:20 +00:00
Eric Christopher 2ef63183a5 Separate out the AES-NI instructions from the SSE4.2 instructions. Add
a new subtarget option for AES and check for the support.  Add "westmere"
line of processors and add AES-NI support to the core i7.

Add a couple of TODOs for information I couldn't verify.

llvm-svn: 100231
2010-04-02 21:54:27 +00:00
Evan Cheng 738b0f9ec7 Nehalem unaligned memory access is fast.
llvm-svn: 100089
2010-04-01 05:58:17 +00:00
Jakob Stoklund Olesen f8d7eda663 Teach TableGen to understand X.Y notation in the TSFlagsFields strings.
Remove much horribleness from X86InstrFormats as a result. Similar
simplifications are probably possible for other targets.

llvm-svn: 99539
2010-03-25 18:52:01 +00:00
Jakob Stoklund Olesen 49e121d5e4 Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings.
On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
in a different domain than where it was defined. Some instructions have
equvivalents for different domains, like por/orps/orpd.

The SSEDomainFix pass tries to minimize the number of domain crossings by
changing between equvivalent opcodes where possible.

This is a work in progress, in particular the pass doesn't do anything yet. SSE
instructions are tagged with their execution domain in TableGen using the last
two bits of TSFlags. Note that not all instructions are tagged correctly. Life
just isn't that simple.

The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
issue handled by NEONMoveFixPass. This pass may become target independent to
handle both.

llvm-svn: 99524
2010-03-25 17:25:00 +00:00
Daniel Dunbar 63ec093b6e MC/X86/AsmMatcher: Use the new instruction cleanup routine to implement a
temporary workaround for matching inc/dec on x86_64 to the correct instruction.
 - This hack will eventually be replaced with a robust mechanism for handling
   matching instructions based on the available target features.

llvm-svn: 98858
2010-03-18 20:06:02 +00:00
Chris Lattner 77f7dba60c all 64-bit cpus have cmov, this should fix CodeGen/X86/cmov.ll
(at least) on non-x86 builders.

llvm-svn: 98520
2010-03-14 22:24:34 +00:00
Chris Lattner 44ac89f517 revert r95949, it turns out that adding new prefixes is not a
great solution for the disassembler, we'll go with "plan b".

llvm-svn: 95957
2010-02-12 01:55:31 +00:00
Chris Lattner 336f9abb45 add another bit of space for new kinds of instruction prefixes.
llvm-svn: 95949
2010-02-12 01:15:16 +00:00
David Greene 206351a1ff Implement a feature (-vector-unaligned-mem) to allow targets to
ignore alignment requirements for SIMD memory operands.  This
is useful on architectures like the AMD 10h that do not trap on
unaligned references if a status bit is twiddled at startup time.

llvm-svn: 93151
2010-01-11 16:29:42 +00:00
Evan Cheng 71d7eaa87e Remove target attribute break-sse-dep. Instead, do not fold load into sse partial update instructions unless optimizing for size.
llvm-svn: 91910
2009-12-22 17:47:23 +00:00
Evan Cheng 4cf30b72bf On recent Intel u-arch's, folding loads into some unary SSE instructions can
be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.

movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0

instead of
cvtss2sd (%rdi), %xmm0

An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0

llvm-svn: 91672
2009-12-18 07:40:29 +00:00
Sean Callanan 04d8cb74f3 Instruction fixes, added instructions, and AsmString changes in the
X86 instruction tables.

Also (while I was at it) cleaned up the X86 tables, removing tabs and
80-line violations.

This patch was reviewed by Chris Lattner, but please let me know if
there are any problems.

* X86*.td
	Removed tabs and fixed 80-line violations

* X86Instr64bit.td
	(IRET, POPCNT, BT_, LSL, SWPGS, PUSH_S, POP_S, L_S, SMSW)
		Added
	(CALL, CMOV) Added qualifiers
	(JMP) Added PC-relative jump instruction
	(POPFQ/PUSHFQ) Added qualifiers; renamed PUSHFQ to indicate
		that it is 64-bit only (ambiguous since it has no
		REX prefix)
	(MOV) Added rr form going the other way, which is encoded
		differently
	(MOV) Changed immediates to offsets, which is more correct;
		also fixed MOV64o64a to have to a 64-bit offset
	(MOV) Fixed qualifiers
	(MOV) Added debug-register and condition-register moves
	(MOVZX) Added more forms
	(ADC, SUB, SBB, AND, OR, XOR) Added reverse forms, which
		(as with MOV) are encoded differently
	(ROL) Made REX.W required
	(BT) Uncommented mr form for disassembly only
	(CVT__2__) Added several missing non-intrinsic forms
	(LXADD, XCHG) Reordered operands to make more sense for
		MRMSrcMem
	(XCHG) Added register-to-register forms
	(XADD, CMPXCHG, XCHG) Added non-locked forms
* X86InstrSSE.td
	(CVTSS2SI, COMISS, CVTTPS2DQ, CVTPS2PD, CVTPD2PS, MOVQ)
		Added
* X86InstrFPStack.td
	(COM_FST0, COMP_FST0, COM_FI, COM_FIP, FFREE, FNCLEX, FNOP,
	 FXAM, FLDL2T, FLDL2E, FLDPI, FLDLG2, FLDLN2, F2XM1, FYL2X,
	 FPTAN, FPATAN, FXTRACT, FPREM1, FDECSTP, FINCSTP, FPREM,
	 FYL2XP1, FSINCOS, FRNDINT, FSCALE, FCOMPP, FXSAVE,
	 FXRSTOR)
		Added
	(FCOM, FCOMP) Added qualifiers
	(FSTENV, FSAVE, FSTSW) Fixed opcode names
	(FNSTSW) Added implicit register operand
* X86InstrInfo.td
	(opaque512mem) Added for FXSAVE/FXRSTOR
	(offset8, offset16, offset32, offset64) Added for MOV
	(NOOPW, IRET, POPCNT, IN, BTC, BTR, BTS, LSL, INVLPG, STR,
	 LTR, PUSHFS, PUSHGS, POPFS, POPGS, LDS, LSS, LES, LFS,
	 LGS, VERR, VERW, SGDT, SIDT, SLDT, LGDT, LIDT, LLDT,
	 LODSD, OUTSB, OUTSW, OUTSD, HLT, RSM, FNINIT, CLC, STC,
	 CLI, STI, CLD, STD, CMC, CLTS, XLAT, WRMSR, RDMSR, RDPMC,
	 SMSW, LMSW, CPUID, INVD, WBINVD, INVEPT, INVVPID, VMCALL,
	 VMCLEAR, VMLAUNCH, VMRESUME, VMPTRLD, VMPTRST, VMREAD,
	 VMWRITE, VMXOFF, VMXON) Added
	(NOOPL, POPF, POPFD, PUSHF, PUSHFD) Added qualifier
	(JO, JNO, JB, JAE, JE, JNE, JBE, JA, JS, JNS, JP, JNP, JL,
	 JGE, JLE, JG, JCXZ) Added 32-bit forms
	(MOV) Changed some immediate forms to offset forms
	(MOV) Added reversed reg-reg forms, which are encoded
		differently
	(MOV) Added debug-register and condition-register moves
	(CMOV) Added qualifiers
	(AND, OR, XOR, ADC, SUB, SBB) Added reverse forms, like MOV
	(BT) Uncommented memory-register forms for disassembler
	(MOVSX, MOVZX) Added forms
	(XCHG, LXADD) Made operand order make sense for MRMSrcMem
	(XCHG) Added register-register forms
	(XADD, CMPXCHG) Added unlocked forms
* X86InstrMMX.td
	(MMX_MOVD, MMV_MOVQ) Added forms
* X86InstrInfo.cpp: Changed PUSHFQ to PUSHFQ64 to reflect table
	change

* X86RegisterInfo.td: Added debug and condition register sets
* x86-64-pic-3.ll: Fixed testcase to reflect call qualifier
* peep-test-3.ll: Fixed testcase to reflect test qualifier
* cmov.ll: Fixed testcase to reflect cmov qualifier
* loop-blocks.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-11.ll: Fixed testcase to reflect call qualifier
* 2009-11-04-SubregCoalescingBug.ll: Fixed testcase to reflect call
  qualifier
* x86-64-pic-2.ll: Fixed testcase to reflect call qualifier
* live-out-reg-info.ll: Fixed testcase to reflect test qualifier
* tail-opts.ll: Fixed testcase to reflect call qualifiers
* x86-64-pic-10.ll: Fixed testcase to reflect call qualifier
* bss-pagealigned.ll: Fixed testcase to reflect call qualifier
* x86-64-pic-1.ll: Fixed testcase to reflect call qualifier
* widen_load-1.ll: Fixed testcase to reflect call qualifier

llvm-svn: 91638
2009-12-18 00:01:26 +00:00
Chris Lattner 13306a1fff remove a temporary hack.
llvm-svn: 82395
2009-09-20 07:47:59 +00:00
Chris Lattner 1cbd3ded33 split MCInst printing out of the X86ATTInstPrinter
class into its own X86ATTInstPrinter class.  The inst
printer now has just one dependence on the code generator
(TRI).

llvm-svn: 81703
2009-09-13 19:30:11 +00:00
Chris Lattner cc8c581a5b Add support for modeling whether or not the processor has support for
conditional moves as a subtarget feature.  This is the easy part of 
PR4841.

llvm-svn: 80763
2009-09-02 05:53:04 +00:00
Daniel Dunbar e431871851 llvm-mc/AsmParser: Allow target to specific a comment delimiter, which will be
used to strip hard coded comments out of .td assembly strings.

llvm-svn: 78716
2009-08-11 20:59:47 +00:00
Daniel Dunbar 0033199c50 Match X86 register names to number.
llvm-svn: 77404
2009-07-29 00:02:19 +00:00
David Greene 46b56ffae3 Add processor descriptions for Istanbul and Shanghai.
llvm-svn: 74429
2009-06-29 16:54:06 +00:00
David Greene 8f6f72cc99 Add feature flags for AVX and FMA and fix some SSE4A feature flag
initialization problems.

llvm-svn: 74350
2009-06-26 22:46:54 +00:00
Dale Johannesen 5234d3795f Revert 72707 and 72709, for the moment.
llvm-svn: 72712
2009-06-02 03:12:52 +00:00
Dale Johannesen 7fde88cce8 Add missing file.
llvm-svn: 72709
2009-06-01 23:48:58 +00:00
Stefanus Du Toit 96180b5387 Update CPU capabilities for AMD machines
- added processors k8-sse3, opteron-sse3, athlon64-sse3, amdfam10, and
barcelona with appropriate sse3/4a levels
- added FeatureSSE4A for amdfam10 processors
in X86Subtarget:
- added hasSSE4A
- updated AutoDetectSubtargetFeatures to detect SSE4A
- updated GetCurrentX86CPU to detect family 15 with sse3 as k8-sse3 and
family 10h as amdfam10

New processor names match those used by gcc.

Patch by Paul Redmond!

llvm-svn: 72434
2009-05-26 21:04:35 +00:00
Dan Gohman 7403751e16 Change Feature64Bit to not imply FeatureSSE2. All x86-64 hardware has
SSE2, however it's possible to disable SSE2, and the subtarget support
code thinks that if 64-bit implies SSE2 and SSE2 is disabled then
64-bit should also be disabled. Instead, just mark all the 64-bit
subtargets as explicitly supporting SSE2.

Also, move the code that makes -march=x86-64 enable 64-bit support by
default to only apply when there is no explicit subtarget. If you
need to specify a subtarget and you want 64-bit code, you'll need to
select a subtarget that supports 64-bit code.

llvm-svn: 63575
2009-02-03 00:04:43 +00:00
Evan Cheng 6e100a62b1 Add Intel processors core i7 and atom.
llvm-svn: 61603
2009-01-03 04:24:44 +00:00
Evan Cheng 4c91aa3418 Do not isel load folding bt instructions for pentium m, core, core2, and AMD processors. These are significantly slower than a load followed by a bt of a register.
llvm-svn: 61557
2009-01-02 05:35:45 +00:00
Evan Cheng 977e7be9d4 Move target independent td files from lib/Target/ to include/llvm/Target so they can be distributed along with the header files.
llvm-svn: 59953
2008-11-24 07:34:46 +00:00
Dale Johannesen 28106756af Accept -march=i586, because gcc does (a synonym
for pentium).  Fixes
gcc.target/i386/20000720-1.c
gcc.target/i386/pr26826.c

llvm-svn: 57528
2008-10-14 22:06:33 +00:00
Anton Korobeynikov 2589777f3f Add ability to override segment (mostly for code emitter purposes).
llvm-svn: 57380
2008-10-11 19:09:15 +00:00
Andrew Lenharth 0070dd1de3 Add lock prefix support to x86. Also add the instructions necessary for the atomic ops. They are still marked pseudo, since I cannot figure out what format to use, but they are the correct opcode.
llvm-svn: 47795
2008-03-01 13:37:02 +00:00
Dale Johannesen 401a4d72d5 nocona, core2 and penryn support 64 bit.
llvm-svn: 47149
2008-02-15 01:22:41 +00:00
Nate Begeman e14fdfaecd SSE 4.1 Intrinsics and detection
llvm-svn: 46681
2008-02-03 07:18:54 +00:00
Chris Lattner f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Arnold Schwaighofer 1f0da1fefb Corrected many typing errors. And removed 'nest' parameter handling
for fastcc from X86CallingConv.td.  This means that nested functions
are not supported for calling convention 'fastcc'.

llvm-svn: 42934
2007-10-12 21:30:57 +00:00
Bill Wendling 3fb7fdfded We only need to specify the most-implied feature for an architecture.
llvm-svn: 37275
2007-05-22 05:15:37 +00:00
Bill Wendling f985c492e1 3DNowA implies 3DNow. 64-bit implies SSE1, SSE2, and I assume MMX.
llvm-svn: 36860
2007-05-06 07:56:19 +00:00
Bill Wendling e6182267d7 Add an "implies" field to features. This indicates that, if the current
feature is set, then the features in the implied list should be set also.
The opposite is also enforced: if a feature in the implied list isn't set,
then the feature that owns that implies list shouldn't be set either.

llvm-svn: 36756
2007-05-04 20:38:40 +00:00
Bill Wendling 157d7ee7e5 Add SSSE3 as a feature of Core2. Add MMX registers to the list of registers
clobbered by a call.

llvm-svn: 36448
2007-04-25 21:31:48 +00:00
Bill Wendling f099841573 Add support for our first SSSE3 instruction "pmulhrsw".
llvm-svn: 35869
2007-04-10 22:10:25 +00:00
Chris Lattner 5d00a0b8a9 Add a description of the X86-64 calling convention and the return
conventions.  This doesn't do anything yet, but may in the future.

llvm-svn: 34636
2007-02-26 18:17:14 +00:00
Evan Cheng ff1beda569 Still need to support -mcpu=<> or cross compilation will fail. Doh.
llvm-svn: 30764
2006-10-06 09:17:41 +00:00