Commit Graph

6727 Commits

Author SHA1 Message Date
Dale Johannesen 0d2e6ad504 Add intrinsic-based patterns for MMX PINSRW and PEXTRW.
llvm-svn: 113420
2010-09-08 22:08:40 +00:00
Dale Johannesen e54dba94f9 Check in forgotten file. Should fix build.
llvm-svn: 113409
2010-09-08 21:09:48 +00:00
Dale Johannesen 4dae01781f Slight cleanup, use only one form of MMXI_binop_rm_int.
llvm-svn: 113406
2010-09-08 20:54:00 +00:00
Dale Johannesen d79bb127dd Add intrinsic forms of mmx<->sse conversions. Notes:
Omission of memory form of PI2PD is intentional; this
does not use an MMX register and does not put the chip
into MMX mode (PI2PS, oddly enough, does).
Operands of PI2PS follow the gcc builtin, not Intel.

llvm-svn: 113388
2010-09-08 19:15:38 +00:00
Bruno Cardoso Lopes 99a9f4661a Minor change. Fix comments and remove unused and redundant code
llvm-svn: 113378
2010-09-08 18:12:31 +00:00
Bruno Cardoso Lopes f7fee1c185 x86 vector shuffle lowering now relies only on target specific
nodes to emit shuffles and no longer does isel mask matching.
- Add the selection of the remaining shuffle opcode (movddup)
- Introduce two new functions to "recognize" where we may get
potential folds and add several comments to them explaining why
they are not yet in the desired shape.
- Add more patterns to fall back in the case where we select
a specific shuffle opcode as if it could fold a load, but it
can't, so remap to a valid instruction.
- Add a couple of FIXMEs to address in the following days once
there's a good solution to the current folding problem.

llvm-svn: 113369
2010-09-08 17:43:25 +00:00
Chris Lattner 2907d2e419 add support for the commuted form of the test instruction, rdar://8018260.
llvm-svn: 113352
2010-09-08 05:51:12 +00:00
Chris Lattner a9ca7837e4 implement proper support for sysret{,l,q}, rdar://8403907
llvm-svn: 113350
2010-09-08 05:45:34 +00:00
Chris Lattner 063363fa80 implement the iret suite of instructions properly,
fixing rdar://8403974

llvm-svn: 113349
2010-09-08 05:38:31 +00:00
Chris Lattner 086a83afb1 add support for instruction prefixes on the same line as the instruction,
implementing rdar://8033482 and PR7254.

llvm-svn: 113348
2010-09-08 05:17:37 +00:00
Chris Lattner 91689c1d0f change the MC "ParseInstruction" interface to make it the
implementation's job to check for and lex the EndOfStatement
marker.

llvm-svn: 113347
2010-09-08 05:10:46 +00:00
Chris Lattner 8caea68a4f gas accepts xchg <mem>, <reg> as a synonym for xchg <reg>, <mem>.
Add this to the mc assembler, fixing PR8061

llvm-svn: 113346
2010-09-08 04:53:27 +00:00
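
For illustration (hypothetical input, not from the commit's tests), both operand orders now assemble to the same instruction:

	xchgl	%ebx, (%rax)	# canonical operand order
	xchgl	(%rax), %ebx	# gas-style synonym, now accepted by llvm-mc as well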
Chris Lattner 4703cb4a96 fix the encoding of the "jump on *cx" family of instructions,
rdar://8061602

llvm-svn: 113343
2010-09-08 04:30:51 +00:00
Bruno Cardoso Lopes 6b1d62c529 Factor out some x86 vector shuffle rewriting and add comments about the direction the shuffle lowering is heading
llvm-svn: 113286
2010-09-07 21:03:14 +00:00
Bruno Cardoso Lopes 7c483028fb Move code around to prepare for moving some of the logic together to another function
llvm-svn: 113267
2010-09-07 20:20:27 +00:00
Bill Wendling 353802114f Add an MVT::x86mmx type. It will take the place of all current MMX vector types.
llvm-svn: 113261
2010-09-07 20:03:56 +00:00
Evan Cheng 5444b36e01 Remove a dead comment.
llvm-svn: 113259
2010-09-07 20:01:10 +00:00
Bruno Cardoso Lopes 5a45db3e6c decouple MMX check from regular splat checks. Some refactoring is coming, and MMX should be left alone to be easily removed after moving to intrinsics
llvm-svn: 113247
2010-09-07 18:41:45 +00:00
Bruno Cardoso Lopes 4f5d4b4a6e Remove now useless check, because the code can be matched below, no need to leave it for isel
llvm-svn: 113242
2010-09-07 18:29:03 +00:00
Bruno Cardoso Lopes c9b3316fea Minor change. Since the checks are equivalent, use isMMX
llvm-svn: 113239
2010-09-07 18:24:00 +00:00
Dale Johannesen 605acfe533 Add patterns for MMX that use the new intrinsics.
Enable palignr intrinsic.
These may need adjustment for a new VT in due course.

llvm-svn: 113233
2010-09-07 18:10:56 +00:00
Bruno Cardoso Lopes f0ea222255 Remove unused target specific node
llvm-svn: 113224
2010-09-07 17:38:55 +00:00
Benjamin Kramer 1ecb978214 Don't leak the old operand when transforming "sldt" into "sldtw".
llvm-svn: 113200
2010-09-07 14:40:58 +00:00
Chris Lattner 30bb384944 add missing cmov aliases, this resolves rdar://8208499
llvm-svn: 113189
2010-09-07 00:05:45 +00:00
Chris Lattner 3ae9398d5f remove duplicated entry
llvm-svn: 113188
2010-09-06 23:57:24 +00:00
Chris Lattner 7ece716da2 "sldt <mem>" is ambiguous in 64-bit mode, but should
always be disambiguated as sldtw.  sldtw and sldtq with
a mem operands have the same effect, but sldtw is more
compact.  Force it to sldtw, resolving rdar://8017530

llvm-svn: 113186
2010-09-06 23:51:44 +00:00
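
For illustration (hypothetical input, not from the commit's tests):

	sldt	(%rax)	# now always assembled as sldtw (%rax); sldtq (%rax)
			# has the same effect but needs an extra REX.W byte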
Chris Lattner 415e04fad2 fix rdar://8017621 - llvm-mc can't guess encoding for "push $(1000)"
llvm-svn: 113184
2010-09-06 23:40:56 +00:00
Chris Lattner 34e366b45c fix the operand constraints of the immediate form of in/out,
allowing unsigned 8-bit operands.  This fixes rdar://8208481

llvm-svn: 113182
2010-09-06 23:29:05 +00:00
Chris Lattner 339cc7bfef in the case where an instruction only has one implementation
of a mnemonic, report operand errors with better location
info.  For example, we now report:

t.s:6:14: error: invalid operand for instruction
        cwtl $1
             ^

but we fail for common cases like:

t.s:11:4: error: invalid operand for instruction
   addl $1, $1
   ^

because we don't know if this is supposed to be the reg/imm or imm/reg
form.

llvm-svn: 113178
2010-09-06 22:11:18 +00:00
Chris Lattner 628fbecf4f Now that we know if we had a total fail on the instruction mnemonic,
give a more detailed error.  Before:

t.s:11:4: error: unrecognized instruction
   addl $1, $1
   ^
t.s:12:4: error: unrecognized instruction
   f2efqefa $1
   ^

After:

t.s:11:4: error: invalid operand for instruction
   addl $1, $1
   ^
t.s:12:4: error: invalid instruction mnemonic 'f2efqefa'
   f2efqefa $1
   ^

This fixes rdar://8017912 - llvm-mc says "unrecognized instruction" when it means "invalid operands"

llvm-svn: 113176
2010-09-06 21:54:15 +00:00
Chris Lattner 31c63fb518 simplify the hacks around jrcxz.
llvm-svn: 113167
2010-09-06 20:10:12 +00:00
Chris Lattner b4be28f33d have tblgen detect when an instruction would have matched, but
failed because a subtarget feature was not enabled.  Use this to
remove a bunch of hacks from the X86AsmParser for rejecting things
like popfl in 64-bit mode.  Previously these hacks weren't needed,
but were important to get a message better than "invalid instruction"
when used in the wrong mode.

This also fixes bugs where pushal would not be rejected correctly in
32-bit mode (just pusha).

llvm-svn: 113166
2010-09-06 20:08:02 +00:00
Chris Lattner a22a368e7c change MatchInstructionImpl to return an enum instead of bool.
llvm-svn: 113165
2010-09-06 19:22:17 +00:00
Chris Lattner 3e4582ada5 have AsmMatcherEmitter.cpp produce the hunk of code that gets included
into the middle of the class, and rework how the different sections of
the generated file are conditionally included for simplicity.

llvm-svn: 113163
2010-09-06 19:11:01 +00:00
Roman Divacky e1278b57f9 Redefine LOOP* instructions from I to Ii8PCRel as they take an i8 argument.
llvm-svn: 113158
2010-09-06 18:43:14 +00:00
Chris Lattner 4cfbcdc7b6 random cleanups
llvm-svn: 113157
2010-09-06 18:32:06 +00:00
Chris Lattner 5cac0f71ca update this.
llvm-svn: 113116
2010-09-05 20:22:09 +00:00
Chris Lattner eeba0c73e5 implement rdar://6653118 - fastisel should fold loads where possible.
Since mem2reg isn't run at -O0, we get a ton of reloads from the stack,
for example, before, this code:

int foo(int x, int y, int z) {
  return x+y+z;
}

used to compile into:

_foo:                                   ## @foo
	subq	$12, %rsp
	movl	%edi, 8(%rsp)
	movl	%esi, 4(%rsp)
	movl	%edx, (%rsp)
	movl	8(%rsp), %edx
	movl	4(%rsp), %esi
	addl	%edx, %esi
	movl	(%rsp), %edx
	addl	%esi, %edx
	movl	%edx, %eax
	addq	$12, %rsp
	ret

Now we produce:

_foo:                                   ## @foo
	subq	$12, %rsp
	movl	%edi, 8(%rsp)
	movl	%esi, 4(%rsp)
	movl	%edx, (%rsp)
	movl	8(%rsp), %edx
	addl	4(%rsp), %edx    ## Folded load
	addl	(%rsp), %edx     ## Folded load
	movl	%edx, %eax
	addq	$12, %rsp
	ret

Fewer instructions and less register use = faster compiles.

llvm-svn: 113102
2010-09-05 02:18:34 +00:00
Chris Lattner 65b48b5dfc zap dead code.
llvm-svn: 113073
2010-09-04 18:12:00 +00:00
Bruno Cardoso Lopes c6accda78e Remove the last bit of isShuffleMaskLegal checks and improve the comment regarding mmx shuffles
llvm-svn: 113059
2010-09-04 02:58:56 +00:00
Bruno Cardoso Lopes 731bcc1abf make explicit that we do not handle several mmx shuffles
llvm-svn: 113058
2010-09-04 02:50:13 +00:00
Bruno Cardoso Lopes 20779ee157 Emit target specific nodes to handle palignr. Do not touch it for MMX versions yet.
llvm-svn: 113056
2010-09-04 02:36:07 +00:00
Bruno Cardoso Lopes cff7cd18ab Emit target specific nodes to handle splats starting at zero indices
llvm-svn: 113055
2010-09-04 02:02:14 +00:00
Bruno Cardoso Lopes 95759917eb Emit target specific nodes for isPSHUFHWMask and isPSHUFLWMask
llvm-svn: 113050
2010-09-04 01:36:45 +00:00
Bruno Cardoso Lopes 2b57008c72 Emit target specific nodes for isSHUFPMask
llvm-svn: 113048
2010-09-04 01:22:57 +00:00
Bruno Cardoso Lopes 2f7af36134 Previous isMOVLMask matching already emits target nodes, remove check
llvm-svn: 113047
2010-09-04 00:50:08 +00:00
Bruno Cardoso Lopes 9f8e704151 One more check from the original isShuffleMaskLegal goes away
llvm-svn: 113045
2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes 16959372bb Remove a duplicated but useless check that I've inserted in the previous commit.
llvm-svn: 113044
2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes 44578d38d3 Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef
llvm-svn: 113043
2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes 7829d0e74b Remove check for unpckh mask
llvm-svn: 113035
2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes d1dacc57aa Remove check for unpckl mask
llvm-svn: 113034
2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes 207b9d6218 Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start
checking each standalone condition and decide whether emit target
specific nodes or remove the condition if it's already matched before.

llvm-svn: 113031
2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes 2bef20eda7 Reapply the considered-harmful part of r112934 and r112942:
"Use target specific nodes instead of relying on unpckl and
unpckh pattern fragments during isel time. Also place a
depth limit in getShuffleScalarElt."

llvm-svn: 113020
2010-09-03 22:09:41 +00:00
Dale Johannesen 367afb5a00 Remove the rest of the nonexistent 64-bit AVX instructions.
Bruno, please review.

llvm-svn: 113014
2010-09-03 21:23:00 +00:00
Bruno Cardoso Lopes a750d994fe Reapply last harmless part of r112934, the pattern fragment to match X86Unpcklpd
llvm-svn: 113009
2010-09-03 20:44:26 +00:00
Bruno Cardoso Lopes fe8717c573 Reintroduce a simple function refactoring done in r112934, also without any functionality changes
llvm-svn: 113008
2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes 48e589b122 Reapply pieces of r112942 and r112934 which don't make
functional changes

llvm-svn: 113007
2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes 6979cf0808 Reapply "Fix comment"
llvm-svn: 113006
2010-09-03 19:55:05 +00:00
Daniel Dunbar 6f3da24d70 Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced
some infinite loop and select failures.
 - Apologies for eager reverting, but it's branch day.

llvm-svn: 113000
2010-09-03 19:38:11 +00:00
Daniel Dunbar f1aacd55c0 Revert r112938 "Fix comment", which depends on r112934, which introduced some
infinite loop and select failures.

llvm-svn: 112999
2010-09-03 19:38:08 +00:00
Daniel Dunbar 0ffe4db45c Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh
mask pattern fragment", which depends on r112934, which introduced some infinite
loop and select failures.

llvm-svn: 112998
2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes d6634a5b2e AVX supports neither MMX operations nor their intrinsics.
The AVX versions of PALIGN and PABS* should only exist for
128-bit. Remove the unnecessary stuff.

llvm-svn: 112944
2010-09-03 02:08:45 +00:00
Bruno Cardoso Lopes a85ec10483 Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment
llvm-svn: 112942
2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes adc6bca2dd Fix comment
llvm-svn: 112938
2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes cce44678b4 - Use specific nodes to match unpckl masks.
- Teach getShuffleScalarElt how to handle more target
specific nodes, so the DAGCombine can make use of it.
- Add another hack to avoid the node update problem
during legalization. More description in the comments.

llvm-svn: 112934
2010-09-03 01:24:00 +00:00
Jakob Stoklund Olesen 08aede2538 Don't call Predicate_* from X86 target.
llvm-svn: 112921
2010-09-03 00:35:18 +00:00
Anton Korobeynikov a5a645559c Properly emit __chkstk call instead of __alloca on non-mingw windows targets.
Patch by Cameron Esfahani!

llvm-svn: 112902
2010-09-02 23:03:46 +00:00
Bruno Cardoso Lopes 02a05a6a89 Move insertps mask decoding to header file
llvm-svn: 112896
2010-09-02 22:43:39 +00:00
Anton Korobeynikov a689c5b2c0 Revert win64 changes. They seem to be incomplete
llvm-svn: 112885
2010-09-02 22:31:32 +00:00
Anton Korobeynikov 56291f7e53 Properly allocate win64 shadow reg area.
Patch by Jan Sjodin!

llvm-svn: 112875
2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes 814a69c330 Move decoding of insertps back to avoid unused warnings in x86 isel lowering, and fix movlhps/movhlps to decode 4-element shuffles
llvm-svn: 112869
2010-09-02 21:51:11 +00:00
Dan Gohman 3c9b5f394b Don't narrow the load and store in a load+twiddle+store sequence unless
there are clearly no stores between the load and the store. This fixes
the miscompile reported as PR7833.

This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is
safe, but awkward to prove safe. Move it to X86's README.txt.

llvm-svn: 112861
2010-09-02 21:18:42 +00:00
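
A hedged C sketch of the load+twiddle+store shape in question (hypothetical code, not from the commit or its tests):

	void twiddle(unsigned *p) {
	  unsigned v = *p;      /* wide 32-bit load */
	  v |= 0x00ff0000;      /* twiddle a single byte of it */
	  *p = v;               /* wide store: narrowing this pair to a one-byte
	                           access is only safe when no other store can
	                           touch *p between the load and the store */
	}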
Bruno Cardoso Lopes c79f50170a Move x86 specific shuffle mask decoding to its own header, it's also going to be used elsewhere. Also trim trailing whitespaces
llvm-svn: 112846
2010-09-02 18:40:13 +00:00
Bruno Cardoso Lopes 489613f1e5 Replace unpckl_undef and unpckh_undef matching with target specific opcodes
llvm-svn: 112806
2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes e4e4be3885 Move condition out to prepare for more matching
llvm-svn: 112805
2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes bf7fd146c7 Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it
llvm-svn: 112804
2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes 6a7f634487 become more strict about when it's safe to use X86ISD::MOVLPS
llvm-svn: 112799
2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes 04c25c15c7 Revert r112689, avoid those kind of checks cause they mess up with mmx
llvm-svn: 112760
2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes fea81b4831 Using target specific nodes for shuffle nodes makes the mask
check more strict, breaking some cases not checked in the
testsuite, but also exposes some foldings not done before,
as in this example:

  movaps  (%rdi), %xmm0
  movaps  (%rax), %xmm1
  movaps  %xmm0, %xmm2
  movss %xmm1, %xmm2
  shufps  $36, %xmm2, %xmm0

now is generated as:

  movaps  (%rdi), %xmm0
  movaps  %xmm0, %xmm1
  movlps  (%rax), %xmm1
  shufps  $36, %xmm1, %xmm0

llvm-svn: 112753
2010-09-01 22:33:20 +00:00
Bruno Cardoso Lopes b3825216ce Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
llvm-svn: 112694
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes 6aaebe877b minor change, simplify some logic
llvm-svn: 112689
2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes 2b025707a2 Move some functions around so they can be used by another function to come
llvm-svn: 112687
2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes 4b56d87290 Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
llvm-svn: 112661
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes 61996ef835 Use x86 specific MOVSHDUP node and add more patterns to match it
llvm-svn: 112657
2010-08-31 22:22:11 +00:00
Jakob Stoklund Olesen 33e9fce2d6 Make %EFLAGS unallocatable.
No CCR virtual registers should exist, and %EFLAGS is used in ways that can
surprise RegAllocFast.

llvm-svn: 112650
2010-08-31 21:51:07 +00:00
Bruno Cardoso Lopes 5de15ce468 Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments
llvm-svn: 112644
2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes 03e4c35302 Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes dfd9dd5d75 Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles
llvm-svn: 112570
2010-08-31 02:26:40 +00:00
Eli Friedman f75de6eae7 A couple of small missed optimizations.
llvm-svn: 112411
2010-08-29 05:07:40 +00:00
Chris Lattner 38ccc8b884 add a bunch more common shuffles to the instprinter.
llvm-svn: 112397
2010-08-29 03:08:08 +00:00
Chris Lattner 7a05e6dca2 I have manually decoded the imm field of an insertps one too many
times.  This patch causes llc and llvm-mc (which both default to
verbose-asm) to print out comments after a few common shuffle 
instructions which indicates the shuffle mask, e.g.:

	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]

This is carefully factored to keep the information extraction (of the
shuffle mask) separate from the printing logic.  I plan to move the
extraction part out somewhere else at some point for other parts of
the x86 backend that want to introspect on the behavior of shuffles.

llvm-svn: 112387
2010-08-28 20:42:31 +00:00
Chris Lattner 94656b1c8c fix the buildvector->insertp[sd] logic to not always create a redundant
insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner bcb6090ad0 fix the BuildVector -> unpcklps logic to not do pointless shuffles
when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner 96db6e66f4 improve comments in the unpcklps generating logic, introduce
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes a982aa24ef Clean up the logic of vector shuffles -> vector shifts.
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348
2010-08-28 02:46:39 +00:00
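
A hedged example of the kind of shuffle this logic targets (hypothetical IR, not from the commit): a shuffle that pulls zero lanes in from one operand is really a whole-vector byte shift:

	; takes elements 1,2,3 of %x plus one zero lane, i.e. psrldq $4
	%r = shufflevector <4 x i32> %x, <4 x i32> zeroinitializer,
	                   <4 x i32> <i32 1, i32 2, i32 3, i32 4>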
Anton Korobeynikov c0b36921c2 Properly handle passing of FP stuff to varargs function on Win64:
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
2010-08-27 14:43:06 +00:00
Daniel Dunbar 1844a71e66 X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler.
llvm-svn: 112250
2010-08-27 01:30:14 +00:00
Jim Grosbach 6a77066913 Simplify eliminateFrameIndex() interface back down now that PEI doesn't need
to try to re-use scavenged frame index reference registers. rdar://8277890

llvm-svn: 112241
2010-08-26 23:32:16 +00:00
Bruno Cardoso Lopes e25ba0c7c2 zap the now unused MVT::getIntVectorWithNumElements
llvm-svn: 112218
2010-08-26 20:53:12 +00:00
Bob Wilson a967c42a3d Fix comment typos.
llvm-svn: 112202
2010-08-26 18:08:11 +00:00
Chris Lattner eb2cc0ce0e implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Chris Lattner cc60609cb4 fix sse1 only codegen in x86-64 mode, which is something we
apparently try to support.

llvm-svn: 112168
2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes 184eaea855 Fix PR7748 without using microsoft extensions
llvm-svn: 112128
2010-08-26 01:02:53 +00:00
Chris Lattner aecf47a5cb we should pattern match the SSE complex arithmetic ops.
llvm-svn: 112109
2010-08-25 23:31:42 +00:00
Bruno Cardoso Lopes d4085f6e91 Revert this for now, PUNPCKLDQ doesn't operate on v4f32
llvm-svn: 112090
2010-08-25 21:26:37 +00:00
Daniel Dunbar 3d148ac089 X86: Fix misencode of RI64mi8. This fixes OpenSSL / x86_64-apple-darwin10 / clang -O3.
llvm-svn: 112089
2010-08-25 21:11:02 +00:00
Benjamin Kramer f1f2133ac0 Remove dead recursive function. Yay for clang -Wunused-function.
llvm-svn: 112060
2010-08-25 17:27:58 +00:00
Anton Korobeynikov b3b53ecac0 Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there.
Mark the _alloca call as clobbering EFLAGS, otherwise some DCE might remove
other flags-clobbering stuff (e.g. cmp instructions) occurring after the
_alloca call.

llvm-svn: 112034
2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes 0770d25758 PUNPCKLDQ should also be used for v4f32
llvm-svn: 112020
2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes 2e45d522c1 teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests
llvm-svn: 112017
2010-08-25 02:35:37 +00:00
Daniel Dunbar 1c8d777c93 MC/X86: Tweak imul recognition, previous hack only applies for the imul form
taking immediates.

llvm-svn: 111950
2010-08-24 19:37:56 +00:00
Daniel Dunbar 09392785b4 MC/X86: Add custom hack for recognizing "imul $12, %eax" and friends.
llvm-svn: 111947
2010-08-24 19:24:18 +00:00
Daniel Dunbar 94b84a19b9 MC/X86: Warn on scale factors > 1 without index register, instead of erroring,
for 'as' compatibility.

llvm-svn: 111945
2010-08-24 19:13:38 +00:00
Dan Gohman c88fda477a Fix X86's isLegalAddressingMode to recognize that static addresses
need not be RIP-relative in small mode.

llvm-svn: 111917
2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes 758d7b1f5c Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments
llvm-svn: 111890
2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes 264d90fff7 Start using target speficic nodes for shuffles: pshufhw and pshuflw
llvm-svn: 111837
2010-08-23 20:41:02 +00:00
Gabor Greif 21fed6616c typos
llvm-svn: 111835
2010-08-23 20:30:51 +00:00
Chris Lattner 58bd73a5a7 Add a new llvm.x86.int intrinsic, allowing access to the
x86 int and int3 instructions.  Patch by Peter Housel!

llvm-svn: 111831
2010-08-23 19:39:25 +00:00
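
A minimal usage sketch (hypothetical IR; it assumes the intrinsic takes the interrupt number as an i8, per the description above):

	declare void @llvm.x86.int(i8)

	define void @breakpoint() {
	  call void @llvm.x86.int(i8 3)	; the breakpoint interrupt, int3 / int $3
	  ret void
	}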
Chris Lattner a42202e0e4 random improvement for variable shift codegen.
llvm-svn: 111813
2010-08-23 17:30:29 +00:00
Anton Korobeynikov cbbe4501df Revert invalid r111792. Jump tables are not broken on x86-64 / COFF;
it's the COFF emitter which does not support differences of two symbols
(and needs to be fixed). GAS is fine with the code produced.

llvm-svn: 111801
2010-08-23 07:38:51 +00:00
Michael J. Spencer e87231232a Workaround broken jump tables on x86-64 COFF.
llvm-svn: 111792
2010-08-23 04:45:37 +00:00
Anton Korobeynikov db9820ecaa Use rip-rel addressing on win64 by default. For this we just
default to the small PIC code model.

llvm-svn: 111741
2010-08-21 17:21:11 +00:00
Michael J. Spencer 377aa20e6e MC: Add partial x86-64 support to COFF.
llvm-svn: 111728
2010-08-21 05:58:13 +00:00
Dan Gohman 42ef669d81 Fix x86 fast-isel's cmp+branch folding to avoid folding when the
comparison is in a different basic block from the branch. In such
cases, the comparison's operands may not have initialized virtual
registers available.

llvm-svn: 111709
2010-08-21 02:32:36 +00:00
Bruno Cardoso Lopes 9f20e7a1bf Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly
llvm-svn: 111704
2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes 6f3b38a851 This is the first step towards refactoring the x86 vector shuffle code. The
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.

llvm-svn: 111691
2010-08-20 22:55:05 +00:00
Chris Lattner f547740d3f fix PR7465, mishandling of lcall and ljmp: intersegment long
call and jumps.

llvm-svn: 111496
2010-08-19 01:18:43 +00:00
Chris Lattner beb506eeed minor progress towards fixing PR7465
llvm-svn: 111494
2010-08-19 01:00:34 +00:00
Bill Wendling 817e857b13 Marked with ATTRIBUTE_USED so that clang doesn't complain.
llvm-svn: 111383
2010-08-18 18:40:57 +00:00
Chris Lattner 3e3e63efe1 remove some code that is dead now that lea's are modeled with segment registers.
llvm-svn: 111343
2010-08-18 02:40:44 +00:00
Anton Korobeynikov 88c09879c7 Revert part of one of the prev. patches - tailjmp will follow later.
llvm-svn: 111291
2010-08-17 21:08:28 +00:00
Anton Korobeynikov 231ab847ca More fixes for win64:
- Do not clobber al during variadic calls; this is an AMD64 ABI-only feature
  - Emit wincall64, where necessary
Patch by Cameron Esfahani!

llvm-svn: 111289
2010-08-17 21:06:07 +00:00
Anton Korobeynikov cd78af6e3c Enable more win64 calls folding opportunities.
Patch by Cameron Esfahani!

llvm-svn: 111288
2010-08-17 21:06:01 +00:00
Eli Friedman 2444da0652 Comment out some broken/unused/useless instructions which mess up disassembly.
llvm-svn: 111185
2010-08-16 21:18:51 +00:00
Eli Friedman 51ec745509 Don't attempt to SimplifyShortMoveForm in 64-bit mode.
llvm-svn: 111182
2010-08-16 21:03:32 +00:00
Matt Fleming f751d856f0 Hookup ELF support for X86.
llvm-svn: 111173
2010-08-16 18:36:14 +00:00
Jakob Stoklund Olesen 2cd00737c0 Partially revert r111155. It looks like MSVC is calling an operator<() that
clang says is unused.

llvm-svn: 111167
2010-08-16 18:24:54 +00:00
Jakob Stoklund Olesen b7f872197a Remove unused functions.
llvm-svn: 111155
2010-08-16 17:18:18 +00:00
Argyrios Kyrtzidis d0fcc9a818 Revert r111082. No warnings for this common pattern.
llvm-svn: 111102
2010-08-15 10:27:23 +00:00
Eric Christopher 54194bd127 Rework how the non-sse2 memory barrier is lowered so that the
encoding is correct for the built-in assembler.

Based on a patch from Chris.

llvm-svn: 111083
2010-08-14 21:51:50 +00:00
Argyrios Kyrtzidis 7c09ddf0ae Add ATTRIBUTE_UNUSED to methods that are not supposed to be used.
llvm-svn: 111082
2010-08-14 21:35:10 +00:00
Chris Lattner 2f6c3434ac improve indentation
llvm-svn: 111073
2010-08-14 17:26:09 +00:00
Bruno Cardoso Lopes 160be2936b Add comments to some pattern fragments in x86
llvm-svn: 111041
2010-08-13 20:39:01 +00:00
Dale Johannesen 8d3c89e765 Revert 110491. While not wrong, it was based on a
misanalysis and is undesirable.

llvm-svn: 111028
2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes 081861b6b7 Fix comment to reflect code, and remove an unused argument
llvm-svn: 111022
2010-08-13 17:50:47 +00:00
Bruno Cardoso Lopes 1187e3f09b Improve comment to make explicit why not to touch this code before the JIT goes MC
llvm-svn: 111021
2010-08-13 17:44:10 +00:00
Eric Christopher 6e5b67ccc4 Revert last patch and r110954 as I meant to.
llvm-svn: 111001
2010-08-13 02:37:50 +00:00
Eric Christopher 5e027fe113 Revert r110954 for now, pseudo instructions can't make it through to the JIT.
llvm-svn: 111000
2010-08-13 02:30:00 +00:00
Bruno Cardoso Lopes cc20fe5937 Some small clean-up: use of pseudo instructions
llvm-svn: 110954
2010-08-12 20:55:18 +00:00
Bruno Cardoso Lopes 7f704b31a9 - Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency; in the future we can remove this if it's unnecessary.
- Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too.
- Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX.
- Add a testcase for a simple 128-bit zero vector creation.

llvm-svn: 110946
2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes 7e1a30c0d3 Define AVX 128-bit pattern versions of SET0PS/PD.
llvm-svn: 110937
2010-08-12 18:20:59 +00:00
Bruno Cardoso Lopes 1401e040eb Fix comment order
llvm-svn: 110898
2010-08-12 02:08:52 +00:00
Bruno Cardoso Lopes 7306c86886 Begin to support some vector operations for AVX 256-bit instructions. The long
term goal here is to be able to match enough of vector_shuffle and build_vector
so all AVX intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step: support
building zeroed vectors.

llvm-svn: 110897
2010-08-12 02:06:36 +00:00
Daniel Dunbar 7d7b4d1b0f MC/X86/AsmParser: Give an explicit error message when we reject an instruction
because it could have an ambiguous suffix.

llvm-svn: 110890
2010-08-12 00:55:42 +00:00
Daniel Dunbar 2ecc3bb4f7 MC/AsmParser: Push the burden of emitting diagnostics about unmatched
instructions onto the target specific parser, which can do a better job.

llvm-svn: 110889
2010-08-12 00:55:38 +00:00
Daniel Dunbar 167b9d7f30 tblgen/AsmMatcher: Always emit the match function as 'MatchInstructionImpl',
target specific parsers can adapt the TargetAsmParser to this.

llvm-svn: 110888
2010-08-12 00:55:32 +00:00
Jakob Stoklund Olesen 9c473e46f3 Fix <rdar://problem/8282498> even if it doesn't reproduce on trunk.
When a register is defined by a partial load:

  %reg1234:sub_32 = MOV32mr <fi#-1>; GR64:%reg1234

That load cannot be folded into an instruction using the full 64-bit register.
It would become a 64-bit load.

This is related to the recent change to have isLoadFromStackSlot return false on
a sub-register load.

llvm-svn: 110874
2010-08-11 23:08:22 +00:00
Dan Gohman 5531aa4de1 Use ISD::ADD instead of ISD::SUB with a negated constant. This
avoids trouble if the return type of TD->getPointerSize() is
changed to something which doesn't promote to a signed type,
and is simpler anyway.

Also, use getCopyFromReg instead of getRegister to read a
physical register's value.

llvm-svn: 110835
2010-08-11 18:14:00 +00:00
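
A hedged sketch of the idiom (hypothetical variable names, not the commit's actual code):

	// Before: subtract the size; breaks if Size's type promotes to unsigned.
	//   SDValue Off = DAG.getNode(ISD::SUB, dl, VT, Ptr, DAG.getConstant(Size, VT));
	// After: add the negated size; independent of how Size promotes.
	//   SDValue Off = DAG.getNode(ISD::ADD, dl, VT, Ptr, DAG.getConstant(-Size, VT));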
Daniel Dunbar ebace2248f MCAsmParser: Add dump() hook to MCParsedAsmOperand.
llvm-svn: 110790
2010-08-11 06:37:04 +00:00
Bruno Cardoso Lopes 91d61df3eb Add AVX matching patterns to Packed Bit Test intrinsics.
Apply the same approach as the SSE4.1 ptest intrinsics, but
create a new x86 node "testp" since AVX introduces
vtestps/vtestpd instructions which set ZF and CF depending
on the sign-bit AND and ANDN of packed floating-point sources.

This is slightly different from what "ptest" does.
Tests are coming with the other 256-bit intrinsics tests.

llvm-svn: 110744
2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes 39f215bd33 Add AVX movnt{pd,ps,dq} 256-bit intrinsics
llvm-svn: 110650
2010-08-10 02:49:24 +00:00
Bruno Cardoso Lopes cedf23dfe5 Add AVX movmsk 256-bit intrinsics
llvm-svn: 110648
2010-08-10 02:34:56 +00:00
Bruno Cardoso Lopes 85da72a88f Support AVX 256-bit load and store intrinsics
llvm-svn: 110645
2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes b2b6b65b86 Patterns to match AVX cmp instructions
llvm-svn: 110633
2010-08-10 00:13:20 +00:00
Bruno Cardoso Lopes 001d6fa174 Add matching patterns for vblend AVX intrinsics
llvm-svn: 110630
2010-08-10 00:02:05 +00:00
Eric Christopher b9627ee79b Wording.
llvm-svn: 110618
2010-08-09 22:52:47 +00:00
Bruno Cardoso Lopes 685cb32d2b Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics
llvm-svn: 110608
2010-08-09 21:51:56 +00:00
Bruno Cardoso Lopes 3e9b567643 Add patterns to AVX conversion instructions. Do that instead of declaring more instructions whenever possible; more coming
llvm-svn: 110605
2010-08-09 21:24:59 +00:00
Oscar Fuentes 212cfde6ec CMake: eliminated unnecessary target_link_libraries.
Next time the build is broken due to wrong library dependencies, just
try building again (if you are on some Unix and are building all LLVM
targets) or ask someone to commit the regenerated LLVMLibDeps.cmake.

llvm-svn: 110593
2010-08-09 20:33:08 +00:00
Bruno Cardoso Lopes c33940b3aa Memory version of vcvtdq2pd intrinsic
llvm-svn: 110582
2010-08-09 18:20:14 +00:00
Bruno Cardoso Lopes 828f6aeced Patterns to match vinsert, vbroadcast, vmovmask and vcvtdq2pd AVX intrinsics
llvm-svn: 110580
2010-08-09 18:03:43 +00:00
Dale Johannesen a3bd31a923 Use sdmem and sse_load_f64 (etc.) for the vector
form of CMPSD (etc.)  Matching a 128-bit memory
operand is wrong, the instruction uses only 64 bits
(same as ADDSD etc.)  8193553.

llvm-svn: 110491
2010-08-07 00:33:42 +00:00
Bruno Cardoso Lopes 93cc666a58 Patterns to match AVX 256-bit vzero intrinsics
llvm-svn: 110480
2010-08-06 22:10:01 +00:00
Bruno Cardoso Lopes 3d6a3a0ede Patterns to match AVX 256-bit permutation intrinsics
llvm-svn: 110468
2010-08-06 20:03:27 +00:00
Owen Anderson a7aed18624 Reapply r110396, with fixes to appease the Linux buildbot gods.
llvm-svn: 110460
2010-08-06 18:33:48 +00:00
Bruno Cardoso Lopes 1cf067cb3d Patterns to match AVX 256-bit horizontal arithmetic intrinsics
llvm-svn: 110427
2010-08-06 02:10:30 +00:00
Bruno Cardoso Lopes b9ad94fbf7 Patterns to match AVX 256-bit arithmetic intrinsics
llvm-svn: 110425
2010-08-06 01:52:29 +00:00
Owen Anderson bda59bd247 Revert r110396 to fix buildbots.
llvm-svn: 110410
2010-08-06 00:23:35 +00:00
Eric Christopher e1fb772aa5 Add an option to always emit realignment code for a particular module.
llvm-svn: 110404
2010-08-05 23:57:43 +00:00
Owen Anderson 755aceb5d0 Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static
ID member as the sole unique type identifier.  Clean up APIs related to this change.

llvm-svn: 110396
2010-08-05 23:42:04 +00:00
Bruno Cardoso Lopes 77954bdf7a Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX
llvm-svn: 110394
2010-08-05 23:35:51 +00:00
Eric Christopher 4d9c3400f3 Handle the memory barrier pseudo that goes to nothing for the JIT.
llvm-svn: 110371
2010-08-05 20:04:36 +00:00
Eric Christopher 7fd06eb8ce Set hasSideEffects on the 64-bit no-sse memory barrier.
llvm-svn: 110369
2010-08-05 19:54:59 +00:00
Eric Christopher 32f5d6b9be Be a little bit more specific about target for the memory barrier
instructions.

llvm-svn: 110360
2010-08-05 18:36:20 +00:00
Eric Christopher 4abffad17c Handle the pseudo in MCInstLower.
llvm-svn: 110359
2010-08-05 18:34:30 +00:00
Eric Christopher 2db8464282 Make x86-64 membarriers work without sse and clean up some of the
uses.

llvm-svn: 110274
2010-08-04 23:03:04 +00:00
Eli Friedman 39d0f57cab PR7814: Truncates cannot be ignored for signed comparisons.
llvm-svn: 110268
2010-08-04 22:40:58 +00:00
Devang Patel 2bf0f3ceff Add DEBUG message.
llvm-svn: 110224
2010-08-04 18:06:05 +00:00
Benjamin Kramer a53a4eefa6 Enable COFF writer on mingw32 and cygwin.
llvm-svn: 110200
2010-08-04 15:32:40 +00:00
Benjamin Kramer 61c8e6dc16 Print an error message when someone tries -integrated-as on an unsupported target.
- The COFF backend doesn't support MingW/Cygwin at the moment, it'll report an
  error, but it's still much better than random assertions from the MachO backend.
- We want to make ELF the default eventually, it's what the majority of targets use.

llvm-svn: 110197
2010-08-04 13:16:30 +00:00
Chris Lattner 53befe7bc1 fix a win64 encoding problem, patch by Cameron Esfahani!
llvm-svn: 110164
2010-08-03 22:49:22 +00:00
Michael J. Spencer ed80f361b3 MC: Remove HasAbsolutizedSet from WindowsX86AsmBackend.
llvm-svn: 109949
2010-07-31 07:21:44 +00:00
Michael J. Spencer 6b4925e223 Add relax all support to the COFF object streamer.
llvm-svn: 109947
2010-07-31 06:22:29 +00:00
Bruno Cardoso Lopes 349165b48f Support all 128-bit AVX vector intrinsics. Most of them I already
declared during the addition of the assembler support; the additional
changes are:
- Add missing intrinsics
- Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
- Duplicate some patterns to AVX mode.
- Step into PCMPEST/PCMPIST custom inserter and add AVX versions.

llvm-svn: 109878
2010-07-30 19:54:33 +00:00
Bruno Cardoso Lopes 405405bbfe Fix typo!
llvm-svn: 109877
2010-07-30 19:41:24 +00:00
Jakob Stoklund Olesen ba0e124aaf Revert r109652, and remove the offending assert in loadRegFromStackSlot instead.
We do sometimes load from a too small stack slot when dealing with x86 arguments
(varargs and smaller-than-32-bit args). It looks like we know what we are doing
in those cases, so I am going to remove the assert instead of artificially
enlarging stack slot sizes.

The assert in storeRegToStackSlot stays in. We don't want to write beyond the
bounds of a stack slot.

llvm-svn: 109764
2010-07-29 17:42:27 +00:00
Jakob Stoklund Olesen f2234fbe70 Create a fixed stack object for varargs that is as large as any register.
The size of this object isn't used for anything - technically it is of variable
size.

This avoids a false positive from the assert in
X86InstrInfo::loadRegFromStackSlot, and fixes PR7735.

llvm-svn: 109652
2010-07-28 20:55:38 +00:00
Nate Begeman 53afc8f06a Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
2010-07-28 00:21:48 +00:00
Nate Begeman 269a6da023 ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549
2010-07-27 22:37:06 +00:00
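
The pslld/paddd/cvttps2dq/pmulld sequence works because, for a lane value a in [0, 31], the float with bit pattern (a + 127) << 23 is exactly 2^a, so r << a can be computed as r * 2^a. A hedged restatement with SSE intrinsics (hypothetical function name; the commit itself does this at the DAG level):

	#include <smmintrin.h>	/* SSE4.1, for _mm_mullo_epi32 */

	__m128i shl_v4i32(__m128i r, __m128i a) {
	  __m128i e = _mm_slli_epi32(a, 23);                    /* pslld $23: a into exponent field */
	  e = _mm_add_epi32(e, _mm_set1_epi32(127 << 23));      /* paddd: add the exponent bias */
	  __m128i p = _mm_cvttps_epi32(_mm_castsi128_ps(e));    /* cvttps2dq: 2^a as integers */
	  return _mm_mullo_epi32(r, p);                         /* pmulld: r * 2^a == r << a */
	}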
Michael J. Spencer f8270bdb2d Make MC use Windows COFF on Windows and add tests.
llvm-svn: 109494
2010-07-27 06:46:15 +00:00
Jakob Stoklund Olesen 96a890a7f8 The isLoadFromStackSlot and isStoreToStackSlot have no way of reporting
subregister operands like this:

%reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8)

Make them return false when subreg operands are present. VirtRegRewriter is
making bad assumptions otherwise.

This fixes PR7713.

llvm-svn: 109489
2010-07-27 04:17:01 +00:00
Jakob Stoklund Olesen c3c05ed02e Add assertions that expose the PR7713 miscompilation: Accessing a stack slot
with a too-big register class.

llvm-svn: 109488
2010-07-27 04:16:58 +00:00
Evan Cheng d4218b8793 On x86, f32 / f64 nodes share the same registers as 128-bit vector values.
llvm-svn: 109450
2010-07-26 21:50:05 +00:00
Bruno Cardoso Lopes 36c2ea6c7a Temporary hack to let codegen assert or generate poor code in case
we are using AVX and no AVX version of the desired instruction is present;
this is better for incremental dev (without fallbacks it's easier to spot
what's missing). Not sure this is the best hack though (we could also disable
all HasSSE* predicates by dynamically marking them 'false' if AVX is present)

llvm-svn: 109434
2010-07-26 21:01:18 +00:00
Evan Cheng 37b740c4bf Add an ILP scheduler. This is a register pressure aware scheduler that's
appropriate for targets without detailed instruction itineraries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situations; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.

llvm-svn: 109300
2010-07-24 00:39:05 +00:00
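
Assuming the new scheduler is registered under the name "ilp" like the other list schedulers, it would be selected with:

	llc -pre-RA-sched=ilp foo.bc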
Bruno Cardoso Lopes 306a1f9721 Support x86 "eiz" and "riz" pseudo index registers in the assembler.
llvm-svn: 109295
2010-07-24 00:06:39 +00:00
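
Hypothetical inputs now accepted: %eiz/%riz spell an explicit "no index register" (they force a SIB byte), as in gas's multi-byte nop idiom:

	leal	0x0(%esi,%eiz,1), %esi	# same operation as leal (%esi), %esi, longer encoding
	movq	(%rax,%riz,1), %rbx	# the index field encodes "none"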
Bruno Cardoso Lopes d65cd1d581 Remove trailing whitespace
llvm-svn: 109276
2010-07-23 22:15:26 +00:00
Bruno Cardoso Lopes ea0e05a3ce Add AVX version of CLMUL instructions
llvm-svn: 109248
2010-07-23 18:41:12 +00:00
Bruno Cardoso Lopes d618c8ac64 Declare CLMUL as a subtarget feature
llvm-svn: 109207
2010-07-23 01:22:45 +00:00
Bruno Cardoso Lopes 09dc24beac Add x86 CLMUL (Carry-less multiplication) cpu feature
llvm-svn: 109206
2010-07-23 01:17:51 +00:00
Bruno Cardoso Lopes acd9230b1b Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual
llvm-svn: 109204
2010-07-23 00:54:35 +00:00
Dale Johannesen f2d75670b7 The only supported calling convention for X86-64 uses
SSE, so we can't return floating point values if this
is disabled.  Detect this error for clang.

With SSE1 only, f64 is a problem; it can be done, but
neither llvm-gcc nor clang has ever generated correct
code for it.  Since nobody noticed this I think it's
OK to treat it as an error for now.

This also handles SSE-sized vectors of floating point.
8207686, 8204109.

llvm-svn: 109201
2010-07-23 00:30:35 +00:00
Bruno Cardoso Lopes e29e389678 Fix some AVX instructions which didn't have the HasAVX prefix. Also fix a problem with PINSRW, which was totally wrong because of a typo I introduced previously
llvm-svn: 109198
2010-07-23 00:14:54 +00:00
Bruno Cardoso Lopes 0710c74f29 Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA. But we still miss instructions behind the FMA3 and CLMUL feature flags, which are now the next step
llvm-svn: 109168
2010-07-22 21:18:49 +00:00
Chris Lattner 8f3adc9057 remove the JIT "NeedsExactSize" feature and supporting logic.
llvm-svn: 109167
2010-07-22 21:17:55 +00:00
Chris Lattner b3f608bbba X86MCInstLower now depends on AsmPrinter being around.
llvm-svn: 109154
2010-07-22 21:10:04 +00:00
Chris Lattner 083be4d384 instead of migrating it to the MC instruction encoder, just
rip out the implementation of X86InstrInfo::GetInstSizeInBytes.
The code being ripped out just implemented a copy and hacked up
version of the (old) instruction encoder, and is buggy and 
terrible in other ways.  Since "GetInstSizeInBytes" is really 
only there to support the JIT's "NeedsExactSize" hook (which
no one is using), just rip out the code.  I will rip out the
NeedsExactSize hook next.

This resolves rdar://7617809 - switch X86InstrInfo::GetInstSizeInBytes to use X86MCCodeEmitter

llvm-svn: 109149
2010-07-22 21:05:13 +00:00
Chandler Carruth 3180f9f55f Attempt to fix linking issues with CMake. Other CMake users, please review,
especially on other platforms. Is there a better way to fix this?

llvm-svn: 109084
2010-07-22 06:27:45 +00:00
Eric Christopher 9a77382685 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.

llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Eric Christopher a4c435f1fa 80-columns.
llvm-svn: 109070
2010-07-22 00:26:08 +00:00
Nate Begeman 68a069a188 Make fast isel win64-aware w.r.t. call-clobbered regs
llvm-svn: 109069
2010-07-22 00:09:39 +00:00
Bruno Cardoso Lopes e3acfd4d58 Add more 256-bit forms for a bunch of regular AVX instructions
Add 64-bit (GR64) versions of some instructions (which are not
described in their SSE forms, but are described in AVX)

llvm-svn: 109063
2010-07-21 23:53:50 +00:00
Rafael Espindola 350b1a449f Fixes win64. It was broken by a previous patch where I missed the !isWin64
check and then forced every register to be a VR128 on win64.

llvm-svn: 109060
2010-07-21 23:19:57 +00:00
Chris Lattner 5c91a5e747 add some rough support for making mcinst lowering work without an
asmprinter or mangler around.  This is option #B for killing off 
X86InstrInfo::GetInstSizeInBytes.  Option #A (killing 
"needsexactsize") was sent for consideration to llvmdev.

llvm-svn: 109056
2010-07-21 23:03:35 +00:00
Bruno Cardoso Lopes 6238c1d102 Add missing AVX convert instructions. Those instructions are not described in their SSE forms (although they exist), but add the AVX forms anyway, so the assembler can benefit from them
llvm-svn: 109039
2010-07-21 21:37:59 +00:00
Nate Begeman 784e062b2a Fix a couple issues with Win64 ABI
1) all registers were spilled as xmm, regardless of actual size
2) win64 abi doesn't do the varargs-size-in-%al thing

Still to look into:

xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't.

llvm-svn: 109035
2010-07-21 20:49:52 +00:00
Bruno Cardoso Lopes 19b3830142 Avoid AVX instructions being selected instead of their SSE forms
llvm-svn: 109032
2010-07-21 20:38:42 +00:00
Eric Christopher d27913e516 Pulling out previous patch, must've run the tests in
the wrong directory.

llvm-svn: 109005
2010-07-21 09:23:56 +00:00
Eric Christopher b2d1067024 Lower MEMBARRIER on x86 and support processors without SSE2.
Fixes a pile of libgomp failures in the llvm-gcc testsuite due
to the libcall not existing.

llvm-svn: 109004
2010-07-21 09:05:23 +00:00
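
A hedged sketch of the usual fallback (assuming this lowering follows the common idiom; mfence itself requires SSE2):

	lock orl	$0, (%esp)	# a locked RMW on the stack acts as a full memory barrier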
Bruno Cardoso Lopes cdbec62510 Add AVX only vzeroall and vzeroupper instructions
llvm-svn: 109002
2010-07-21 08:56:24 +00:00
Bruno Cardoso Lopes 3499934da6 Add new AVX vpermilps, vpermilpd and vperm2f128 instructions
llvm-svn: 108984
2010-07-21 03:07:42 +00:00
Bruno Cardoso Lopes 3ceaf7a0a2 Add new AVX vmaskmov instructions, and also fix the VEX encoding bits to support it
llvm-svn: 108983
2010-07-21 02:46:58 +00:00
Bruno Cardoso Lopes e706501975 Add new AVX vextractf128 instructions
llvm-svn: 108964
2010-07-20 23:19:02 +00:00
Chris Lattner 41ff5d4d91 make asmprinter optional, even though passing in null will cause things to explode right now.
llvm-svn: 108955
2010-07-20 22:45:33 +00:00
Chris Lattner b4dc58975b continue pushing dependencies around.
llvm-svn: 108952
2010-07-20 22:35:40 +00:00
Chris Lattner 2366d95af9 reduce X86MCInstLower dependencies on asmprinter.
llvm-svn: 108950
2010-07-20 22:30:53 +00:00
Chris Lattner 7fbdd7c852 pass around MF, not MMI.
llvm-svn: 108949
2010-07-20 22:26:07 +00:00
Chris Lattner d3f3a89425 cleanups.
llvm-svn: 108947
2010-07-20 22:23:57 +00:00
Chris Lattner 5ca516b87c move two asmprinter methods into the asmprinter .cpp file.
llvm-svn: 108945
2010-07-20 22:18:19 +00:00
Bruno Cardoso Lopes 3b505848fd Add new AVX instruction vinsertf128
llvm-svn: 108892
2010-07-20 19:44:51 +00:00
Eric Christopher 4adaccf0bf Constify some arguments.
llvm-svn: 108812
2010-07-20 06:52:21 +00:00
Bruno Cardoso Lopes 14c5fd437c Add AVX vbroadcast new instruction
llvm-svn: 108788
2010-07-20 00:11:13 +00:00
Daniel Dunbar 0aff8033c6 Update CMake files.
llvm-svn: 108787
2010-07-20 00:08:13 +00:00
Chris Lattner 64fffadad3 fix a layering problem by moving the x86 implementation
of AsmPrinter and InstLowering into libx86 and out of the
asmprinter subdirectory.  Now X86/AsmPrinter just depends on
MC stuff, not all of codegen and LLVM IR.

llvm-svn: 108782
2010-07-19 23:41:57 +00:00
Bruno Cardoso Lopes 9de0ca73d4 Add 256-bit vaddsub, vhadd, vhsub, vblend and vdpp instructions!
llvm-svn: 108769
2010-07-19 23:32:44 +00:00
Daniel Dunbar 9db7d0addd X86: Mark JMP{32,64}[mr] as requiring 32-bit/64-bit mode. They are the same
instruction, we only want to allow the one for the current subtarget.
 - This also fixes suffix matching for jmp instructions, because it eliminates
   the ambiguity between 'jmpl' and 'jmpq'.

llvm-svn: 108746
2010-07-19 20:44:16 +00:00
Daniel Dunbar 9aefb8ee4c X86-64: Mark WINCALL and more tail call instructions as code gen only.
llvm-svn: 108685
2010-07-19 07:21:07 +00:00
Daniel Dunbar 2e9f58517d X86: Mark some tail call pseduo instruction as code gen only.
llvm-svn: 108684
2010-07-19 07:21:04 +00:00
Daniel Dunbar 1cd02510d3 X86: Mark In32/64BitMode on LEAVE[64] and SYSEXIT[64].
llvm-svn: 108683
2010-07-19 07:21:01 +00:00
Daniel Dunbar b82cd9319b MC/X86: We now match instructions like "incl %eax" correctly for the arch we are
assembling; remove crufty custom cleanup code.

llvm-svn: 108681
2010-07-19 06:14:54 +00:00
Daniel Dunbar 150d948d3a X86: Mark MOV.*_{TC,NOREX} instruction as code gen only, they aren't real.
llvm-svn: 108680
2010-07-19 06:14:49 +00:00
Daniel Dunbar 961543377d X86: MOV8o8a, MOV8ao8, etc. are only valid in 32-bit mode.
llvm-svn: 108679
2010-07-19 06:14:44 +00:00
Daniel Dunbar eefe8616be TblGen/AsmMatcher: Add support for honoring instruction Requires<[]> attributes as part of the matcher.
- Currently includes a hack to limit ourselves to "In32BitMode" and "In64BitMode", because we don't have the other infrastructure to properly deal with setting SSE, etc. features on X86.

llvm-svn: 108677
2010-07-19 05:44:09 +00:00
Daniel Dunbar 419197cc4d Target: Give the TargetAsmParser access to the TargetMachine.
- Unfortunate, but necessary for now to handle subtarget instruction matching. Eventually we should factor out the lower level target machine information so we don't need to do this.

llvm-svn: 108664
2010-07-19 00:33:49 +00:00
Chris Lattner 5218343970 the stackifier is global!
llvm-svn: 108626
2010-07-17 17:42:04 +00:00
Chris Lattner 8f440bb9b0 doxygenify some comments.
llvm-svn: 108625
2010-07-17 17:40:51 +00:00
Eric Christopher 83f250f005 Remove unnecessary check that was subsumed into canRealignStack.
llvm-svn: 108588
2010-07-17 00:33:04 +00:00
Eric Christopher c0be37287c Make the comment a bit more clear, as well as the return statement, since
needsStackRealignment currently checks the 'can realign' conditions as well.

llvm-svn: 108581
2010-07-17 00:25:41 +00:00
Jakob Stoklund Olesen 8289f78569 Remove the isMoveInstr() hook.
llvm-svn: 108567
2010-07-16 22:35:46 +00:00
Jakob Stoklund Olesen 2c130b8ead Use MI.isCopy.
llvm-svn: 108565
2010-07-16 22:35:34 +00:00
Bill Wendling 499f797cdd Rename DBG_LABEL PROLOG_LABEL, because it's only used during prolog emission and
thus is a much more meaningful name.

llvm-svn: 108563
2010-07-16 22:20:36 +00:00
Jakob Stoklund Olesen 8d51149102 Keep valgrind quiet.
The isLive() method can read uninitialized memory, but it still gives correct
results.

llvm-svn: 108561
2010-07-16 22:00:33 +00:00
Dale Johannesen da3e05db70 Accept registers with P modifier. PR 5314.
llvm-svn: 108545
2010-07-16 18:35:46 +00:00
Jakob Stoklund Olesen c30b4ddc58 Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill
pass that inserted it.

It is no longer necessary to limit the live ranges of FP registers to a single
basic block.

llvm-svn: 108536
2010-07-16 17:41:44 +00:00
Jakob Stoklund Olesen f0af236874 Search for a free FP register instead of just assuming FP7 is not in use.
llvm-svn: 108535
2010-07-16 17:41:40 +00:00
Jakob Stoklund Olesen 0e5fb020a0 Allow x87 FP registers to be alive globally in a function.
FP_REG_KILL instructions are still inserted, but can be disabled by passing
-live-x87 to llc. The X87FPRegKillInserterPass is going to be removed shortly.

CFG edges are partitioned into bundles where the x87 stack must be allocated
identically. Code is inserted at the end of each basic block to shuffle the
live FP registers to match the outgoing bundle's expectations.

This fix is in preparation for some upcoming register allocator improvements
that may extend the live range of registers beyond a basic block, similar to
LICM. It also provides a nice runtime speedup if you are building with
-mfpmath=387.

llvm-svn: 108529
2010-07-16 16:38:12 +00:00
Evan Cheng 55f0c6b9fc Split -enable-finite-only-fp-math to two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetic's arguments and results can never be NaN.

llvm-svn: 108465
2010-07-15 22:07:12 +00:00
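
Hypothetical usage, enabling both relaxations independently of each other:

	llc -enable-no-nans-fp-math -enable-no-infs-fp-math foo.bc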
Chris Lattner 620693806a fix the encoding of MMX_MOVFR642Qrr, it starts with 0xF2 not 0xF3,
this fixes rdar://8192860.  Unfortunately it can only be triggered
with llc because llvm-mc matches another (correctly encoded) version
of this, so no testcase.

llvm-svn: 108454
2010-07-15 20:13:34 +00:00
Jakob Stoklund Olesen 8b1bb8cfbd Last COPY conversion.
llvm-svn: 108387
2010-07-14 23:58:21 +00:00
Jakob Stoklund Olesen 9b449d5a92 Use TargetOpcode::COPY instead of X86-native register copy instructions when
lowering atomics. This will allow those copies to still be coalesced after
TII::isMoveInstr is removed.

llvm-svn: 108385
2010-07-14 23:50:27 +00:00
Chris Lattner 769aedd523 fix indentation
llvm-svn: 108368
2010-07-14 23:04:59 +00:00
Benjamin Kramer 92d8998348 Don't pass StringRef by reference.
llvm-svn: 108366
2010-07-14 22:38:02 +00:00
Chris Lattner 254858031a Merge lib/Target/X86/X86COFF.h into include/llvm/Support/COFF.h,
patch by Michael Spencer!

llvm-svn: 108342
2010-07-14 18:14:33 +00:00
Evan Cheng a8e8874552 Fix for PR7193 was overly conservative. The only case where sibcall callee
address cannot be allocated a register is in 32-bit mode where the first
three arguments are marked inreg. In that case EAX, EDX, and ECX will be
used for argument passing.

This fixes PR7610.

llvm-svn: 108327
2010-07-14 06:44:01 +00:00
Dan Gohman 1f471435f8 Don't propagate debug locations to instructions for materializing
constants, since they may not be emitted near the other instructions
which get the same line, and this confuses debug info.

llvm-svn: 108302
2010-07-14 01:07:44 +00:00
Bruno Cardoso Lopes 6c6c14a55c Add AVX 256-bit compare instructions and a bunch of testcases
llvm-svn: 108286
2010-07-13 22:06:38 +00:00
Bruno Cardoso Lopes fd8bfcd6e1 AVX 256-bit conversion instructions
Add the x86 VEX_L form to handle special cases where VEX_L must be set.

llvm-svn: 108274
2010-07-13 21:07:28 +00:00
Kevin Enderby 76a6b663a3 Added a check that pusha cannot be encoded in 64-bit mode.
llvm-svn: 108265
2010-07-13 20:05:41 +00:00
Chris Lattner 55595fb291 my work on adding segment registers to LEA missed the
disassembler.  Remove some code from the disassembler to
compensate, unbreaking disassembly of lea's.

llvm-svn: 108226
2010-07-13 04:23:55 +00:00
Bruno Cardoso Lopes dff283e146 Add AVX 256-bit packed logical forms
llvm-svn: 108224
2010-07-13 02:38:35 +00:00
Bruno Cardoso Lopes 36b32aeaa5 Add AVX 256-bit unop arithmetic instructions
llvm-svn: 108223
2010-07-13 01:53:31 +00:00
Bruno Cardoso Lopes 77a3c4462f Since AVX is a superset of all SSE versions, only use HasAVX for AVX instructions
llvm-svn: 108222
2010-07-13 00:38:47 +00:00
David Greene 03264efe30 Move some SIMD fragment code into X86InstrFragmentsSIMD so that the
utility classes can be used from multiple files.  This will aid
transitioning to a new refactored x86 SIMD specification.

llvm-svn: 108213
2010-07-12 23:41:28 +00:00
Bruno Cardoso Lopes 8e67a0482e Add AVX 256 binary arithmetic instructions
llvm-svn: 108207
2010-07-12 23:04:15 +00:00
Bruno Cardoso Lopes 91806311c9 More refactoring of basic SSE arith instructions. Open room for 256-bit instructions
llvm-svn: 108204
2010-07-12 22:41:32 +00:00
Dan Gohman 51e6d9bbf6 Apply the SSE dependence idiom for SSE unary operations to
SD instructions as well as SS instructions, and add a comment about it.
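
A sketch of the kind of code affected (hypothetical):

  double root(const double *p) {
    // sqrtsd writes only the low lane of its destination XMM register,
    // leaving a false dependence on the register's previous contents;
    // the idiom zeroes the register first (e.g. with xorps) to break it.
    return __builtin_sqrt(*p);
  }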

llvm-svn: 108191
2010-07-12 20:46:04 +00:00
Bruno Cardoso Lopes f9bcaad76d Add AVX 256-bit MOVMSK forms
llvm-svn: 108184
2010-07-12 20:06:32 +00:00
Dan Gohman 425b35681f Check begin!=end, rather than !begin.
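
In sketch form (hypothetical names):

  // With an iterator range, emptiness is "begin() == end()"; "!begin()"
  // only makes sense for a pointer with a null sentinel.
  if (Range.begin() != Range.end())
    process(*Range.begin());
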
llvm-svn: 108167
2010-07-12 18:12:35 +00:00
Dan Gohman 68d7424a65 Don't fast-isel an x87 comparison opcode, as fast-isel doesn't
support branching on x87 comparisons yet. This fixes PR7624.
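
A case that would have exercised this (hypothetical):

  // Branching on an x87 (long double) comparison; fast-isel cannot
  // select the branch yet, so it must fall back to the DAG selector.
  int less(long double a, long double b) {
    if (a < b) return 1;
    return 0;
  }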

llvm-svn: 108149
2010-07-12 15:46:30 +00:00
Rafael Espindola 6635f9838e Convert getLoadStoreRegOpcode to use a switch.
llvm-svn: 108123
2010-07-12 03:43:04 +00:00
Jakob Stoklund Olesen de7201545e A basic block that only uses RFP registers still needs the FP_REG_KILL marker.
This fixes PR7375.

llvm-svn: 108120
2010-07-12 02:12:47 +00:00
Rafael Espindola e35d70fafa Convert the last getPhysicalRegisterRegClass in VirtRegRewriter.cpp to
getMinimalPhysRegClass. It was used to produce spills, and it is better to
use the most specific class if possible.

Update getLoadStoreRegOpcode to handle GR32_AD.

llvm-svn: 108115
2010-07-12 00:52:33 +00:00
Jakob Stoklund Olesen f6c7d7fb3f Use target independent COPY instructions for the fake fextend and fround
operations in x87 code.

llvm-svn: 108098
2010-07-11 18:19:39 +00:00
Jakob Stoklund Olesen 98ee37d878 Remove obsolete README_SSE note.
We are generating movaps for all XMM register copies, including scalar
floating point values. This is known to be at least as good as movss and movsd
for all known architectures up to and including Nehalem because it avoids a
partial register stall.

The SSEDomainFix pass will switch movaps to movdqa when appropriate (i.e., when
operands come from the integer unit). We don't now that switching movaps to
movapd has any benefit.

The same applies to andps -> pand.
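
For example (illustrative), a scalar double copy is emitted as

  movaps %xmm1, %xmm0

rather than movsd %xmm1, %xmm0, which writes only the low 64 bits and must
merge with the old contents of %xmm0.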

llvm-svn: 108096
2010-07-11 17:13:42 +00:00
Jakob Stoklund Olesen 4806848799 Avoid SSE instructions in FastISel when SSE is not available.
llvm-svn: 108091
2010-07-11 16:22:13 +00:00
Jakob Stoklund Olesen e46f3eb0c4 X86InstrInfo::copyRegToReg is dead. Long live copyPhysReg!
llvm-svn: 108076
2010-07-11 05:44:30 +00:00
Jakob Stoklund Olesen 8969657f0c Use COPY in X86FastISel::X86SelectRet.
Don't try a cross-class copy. That is very unlikely anyway, since return-value
registers are usually register-class friendly (%EAX, %XMM0, etc.).

llvm-svn: 108074
2010-07-11 05:17:02 +00:00
Jakob Stoklund Olesen 3bb1267431 Use COPY in FastISel everywhere it is safe and trivial.
The remaining copyRegToReg calls actually check the return value (shock!), so we
cannot trivially replace them with COPY instructions.

llvm-svn: 108069
2010-07-11 03:31:00 +00:00
Jakob Stoklund Olesen de457896b6 Don't emit st(0)/st(1) copies as FpMOV instructions. Use FpSET_ST? instead.
Based on a patch by Rafael Espíndola.

Attempt to make the FpSET_ST1 hack more robust, but we are still relying on
FpSET_ST0 preceding it. This is only for supporting really weird x87 inline
asm.

We support:

  FpSET_ST0
  INLINEASM

  FpSET_ST0
  FpSET_ST1
  INLINEASM

with and without kills on the arguments. We don't support:

  FpSET_ST1
  FpSET_ST0
  INLINEASM

nor

  FpSET_ST1
  INLINEASM

Just Don't Do It!

llvm-svn: 108047
2010-07-10 17:42:34 +00:00
Dan Gohman d7b5ce3312 Reapply bottom-up fast-isel, with several fixes for x86-32:
 - Check getBytesToPopOnReturn().
 - Eschew ST0 and ST1 for return values.
 - Fix the PIC base register initialization so that it doesn't ever
   fail to end up at the top of the entry block.

llvm-svn: 108039
2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen be8d9b0bb8 An x86 function returns a floating point value in st(0), and we must make sure
it is popped, even if it is unused. A CopyFromReg node is too weak to represent
the required side effect, so insert an FpGET_ST0 instruction directly instead.

This will matter when CopyFromReg gets lowered to a generic COPY instruction.
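
A minimal case (hypothetical C):

  // The x87 calling convention leaves the return value in st(0); even
  // though the caller ignores it, codegen must still pop it (fstp) to
  // keep the x87 register stack balanced.
  long double produce(void);
  void consume(void) {
    (void)produce();
  }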

llvm-svn: 108037
2010-07-10 04:04:25 +00:00
Bruno Cardoso Lopes 5e6c2155a3 Declare YMM subregisters in the right way! Thanks Jakob
llvm-svn: 108022
2010-07-09 21:46:19 +00:00
Bruno Cardoso Lopes 2419606bfb Add AVX 256-bit packed MOVNT variants
llvm-svn: 108021
2010-07-09 21:42:42 +00:00
Jakob Stoklund Olesen e2614a9979 Remember the *_TC opcodes for load/store
llvm-svn: 108020
2010-07-09 21:27:55 +00:00
Bruno Cardoso Lopes 6bc772eec7 Add AVX 256-bit unpack and interleave
llvm-svn: 108017
2010-07-09 21:20:35 +00:00
Jakob Stoklund Olesen 7a7b55eb67 Automatically fold COPY instructions into stack load/store.
llvm-svn: 108012
2010-07-09 20:43:13 +00:00
Jakob Stoklund Olesen 51702ec46b Fix a few tests
llvm-svn: 108011
2010-07-09 20:43:09 +00:00
Bruno Cardoso Lopes 792e906bef Start the support for AVX instructions with 256-bit %ymm registers. A couple of
notes:
- The instructions are being added with dummy placeholder patterns using some
  256-bit specifiers; this is not meant to work now, but since some multiclasses
  are generic enough to accept them, the groundwork will already be there when
  we go for codegen.
- Add VEX encoding bits to support YMM
- Add MOVUPS and MOVAPS in the first round
- Use "Y" as suffix for those Instructions: MOVUPSYrr, ...
- All AVX instructions in X86InstrSSE.td will move soon to a new X86InstrAVX
  file.

llvm-svn: 107996
2010-07-09 18:27:43 +00:00
Bob Wilson 6586e9b203 --- Reverse-merging r107947 into '.':
U    utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U    test/CodeGen/X86/fast-isel.ll
U    test/CodeGen/X86/fast-isel-loads.ll
U    include/llvm/Target/TargetLowering.h
U    include/llvm/Support/PassNameParser.h
U    include/llvm/CodeGen/FunctionLoweringInfo.h
U    include/llvm/CodeGen/CallingConvLower.h
U    include/llvm/CodeGen/FastISel.h
U    include/llvm/CodeGen/SelectionDAGISel.h
U    lib/CodeGen/LLVMTargetMachine.cpp
U    lib/CodeGen/CallingConvLower.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U    lib/CodeGen/SelectionDAG/FastISel.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.h
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 107987
2010-07-09 16:37:18 +00:00
Bruno Cardoso Lopes 992d25da71 Merge VEX enums with other x86 enum forms. Also fix all checks of which VEX
fields to use. 

llvm-svn: 107952
2010-07-09 01:56:45 +00:00
Dan Gohman 0a7d155d67 Fix the memoperand offsets in code generated for va_start.
llvm-svn: 107948
2010-07-09 01:06:48 +00:00
Chris Lattner 88c185617c have the mc lowering process handle a few tail call forms, lowering them to
jumps where possible and turning the TAILCALL marker in the instruction
asm string into a proper comment.

This eliminates a FIXME and is on the path to finishing:
rdar://7639610 - eliminate encoding and asm info for TAILJMPd TAILJMPr TAILJMPn, etc.

However, I can't eliminate the encodings for these instructions because the JIT
still exists and has its own copy of the encoder, sigh.
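
For example (illustrative), a sibling call such as

  int bar(int);
  int foo(int x) { return bar(x + 1); }

can now be printed as a plain jump with the marker as an assembly comment,
roughly: jmp _bar  ## TAILCALL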

llvm-svn: 107946
2010-07-09 00:49:41 +00:00
Dan Gohman 0b5aa1cdd3 Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.

llvm-svn: 107943
2010-07-09 00:39:23 +00:00
Bruno Cardoso Lopes e6cc0d33bb Factor out x86 segment override prefix encoding, and also use it for VEX
llvm-svn: 107942
2010-07-09 00:38:14 +00:00
Chris Lattner 061d70ad2c reject pseudo instructions early in the encoder.
llvm-svn: 107939
2010-07-09 00:17:50 +00:00
Bruno Cardoso Lopes b652c1a145 Remove trailing whitespaces from file
llvm-svn: 107937
2010-07-09 00:07:19 +00:00
Chris Lattner f469307c77 Change LEA to have 5 operands for its memory operand, just
like all other instructions, even though a segment is not
allowed.  This resolves a bunch of gross hacks in the 
encoder and makes LEA more consistent with the rest of the
instruction set.

No functionality change.

llvm-svn: 107934
2010-07-08 23:46:44 +00:00
Chris Lattner ec536276f0 add some long-overdue enums to refer to the parts of the 5-operand
X86 memory operand.
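
A sketch of what they look like (names illustrative):

  enum {
    AddrBaseReg     = 0,  // base register
    AddrScaleAmt    = 1,  // scale amount
    AddrIndexReg    = 2,  // index register
    AddrDisp        = 3,  // displacement
    AddrSegmentReg  = 4,  // segment register
    AddrNumOperands = 5   // total slots in a memory operand
  };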

llvm-svn: 107925
2010-07-08 22:41:28 +00:00
Jakob Stoklund Olesen ec58a43d81 Remember the VR64 register class
llvm-svn: 107920
2010-07-08 22:30:35 +00:00
Chris Lattner 9f034c1e5d Rework segment prefix emission code to handle segments
in memory operands at the same time as hard-coded segments.
This fixes problems where we'd emit the segment override after
the REX prefix on instructions like:
mov %gs:(%rdi), %rax
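
whose bytes should come out as (illustrative):

  0x65 0x48 0x8b 0x07    # gs override, then REX.W, then opcode/ModRM

with the 0x65 segment override first, not sandwiched after the REX prefix.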

This fixes rdar://8127102.  I have several cleanup patches coming
next.

llvm-svn: 107917
2010-07-08 22:28:12 +00:00
Chris Lattner 1dd82c7dc2 introduce a new X86II::getMemoryOperandNo method, which
returns the start of the memory operand for an instruction.

Introduce a new "X86AddrSegment" enum to reduce # magic numbers
referring to X86 memory operand layout.

llvm-svn: 107916
2010-07-08 22:27:06 +00:00
Jakob Stoklund Olesen 63a622b768 Teach the x86 floating point stackifier to handle COPY instructions.
This pass runs before COPY instructions are passed to copyPhysReg, so we simply
translate COPY to the proper pseudo instruction. Note that copyPhysReg does not
handle floating point stack copies.

Once COPY is used everywhere, this can be cleaned up a bit, and most of the
pseudo instructions can be removed.

llvm-svn: 107899
2010-07-08 19:46:30 +00:00
Jakob Stoklund Olesen 930f8082c3 Implement X86InstrInfo::copyPhysReg
llvm-svn: 107898
2010-07-08 19:46:25 +00:00
Jakob Stoklund Olesen 00264624a9 Convert EXTRACT_SUBREG to COPY when emitting machine instrs.
EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead.

Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg().
The isMoveInstr hook will be removed later.
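
In sketch form (machine-instruction syntax approximate):

  before:  %dst = EXTRACT_SUBREG %src, sub_32bit
  after:   %dst = COPY %src:sub_32bit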

llvm-svn: 107879
2010-07-08 16:40:22 +00:00
Jakob Stoklund Olesen a1e883dcf6 Remove references to INSERT_SUBREG after de-SSA.
Fix X86InstrInfo::convertToThreeAddressWithLEA to generate COPY instead of
INSERT_SUBREG.

llvm-svn: 107878
2010-07-08 16:40:15 +00:00
Eric Christopher e796253217 Slightly rework the custom patterns for x86-64 tpoff codegen and
correct the testcase so that it is valid assembly.

Needs more tests.

llvm-svn: 107860
2010-07-08 07:36:46 +00:00
Dan Gohman e75704369d Revert 107840 107839 107813 107804 107800 107797 107791.
Debug info intrinsics win for now.

llvm-svn: 107850
2010-07-08 01:00:56 +00:00
Jakob Stoklund Olesen 6213ab789f fix copies to/from GR8_ABCD_H even more
llvm-svn: 107832
2010-07-07 23:04:56 +00:00
Chris Lattner 05ea2a4791 finish up support for callw: PR7195
llvm-svn: 107826
2010-07-07 22:35:13 +00:00
Chris Lattner ac5881295c Implement the major chunk of PR7195: support for 'callw'
in the integrated assembler.  Still some discussion to be
done.

llvm-svn: 107825
2010-07-07 22:27:31 +00:00
Bruno Cardoso Lopes 6c61451011 Add more assembly opcodes for SSE compare instructions
llvm-svn: 107823
2010-07-07 22:24:03 +00:00
Evan Cheng 1c349f18f8 Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency's sake.
llvm-svn: 107820
2010-07-07 22:15:37 +00:00
Devang Patel 32a600b494 Print undefined/unknown debug value as "undef".
llvm-svn: 107818
2010-07-07 21:52:21 +00:00
Jakob Stoklund Olesen ddaf0099a5 Allow copies between GR8_ABCD_L and GR8_ABCD_H.
This fixes PR7540.

llvm-svn: 107809
2010-07-07 20:33:27 +00:00
Dan Gohman e7ccc51cc1 Implement bottom-up fast-isel. This has the advantage of not requiring
a separate DCE pass over MachineInstrs.

llvm-svn: 107804
2010-07-07 19:20:32 +00:00
Dan Gohman 2d4d01d0de Add X86FastISel support for return statements. This entails refactoring
a bunch of stuff, to allow the target-independent calling convention
logic to be employed.

llvm-svn: 107800
2010-07-07 18:32:53 +00:00
Bruno Cardoso Lopes fd8060335b Add AVX AES instructions
llvm-svn: 107798
2010-07-07 18:24:20 +00:00
Dan Gohman ffe64b1ee5 Give FunctionLoweringInfo an MBB member, avoiding the need to pass it
around everywhere, and also give it an InsertPt member, to enable isel
to operate at an arbitrary position within a block, rather than just
appending to a block.
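
Roughly (a sketch; only the members described here are shown):

  struct FunctionLoweringInfo {
    MachineBasicBlock *MBB;                // block currently being selected
    MachineBasicBlock::iterator InsertPt;  // where the next instruction goes
    // ... existing members unchanged ...
  };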

llvm-svn: 107791
2010-07-07 16:47:08 +00:00
Dan Gohman 87fb4e8fcd Simplify FastISel's constructor by giving it a FunctionLoweringInfo
instance, rather than pointers to all of FunctionLoweringInfo's
members.

This eliminates an NDEBUG ABI sensitivity.

llvm-svn: 107789
2010-07-07 16:29:44 +00:00
Dan Gohman fe7532a308 Split the SDValue out of OutputArg so that SelectionDAG-independent
code can do calling-convention queries. This obviates OutputArgReg.

llvm-svn: 107786
2010-07-07 15:54:55 +00:00
Bruno Cardoso Lopes 6d122aef97 Add AVX SSE4.2 instructions
llvm-svn: 107752
2010-07-07 03:39:29 +00:00
Bruno Cardoso Lopes 3df55b2d6f Use only one multiclass for pinsrq instructions
llvm-svn: 107750
2010-07-07 01:43:01 +00:00
Bruno Cardoso Lopes fd6c808154 Now that almost all SSE4.1 AVX instructions are added, move code around to more appropriate sections. No functionality changes
llvm-svn: 107749
2010-07-07 01:33:38 +00:00
Bruno Cardoso Lopes 8f5472a8e8 Add AVX SSE4.1 insertps, ptest and movntdqa instructions
llvm-svn: 107747
2010-07-07 01:14:56 +00:00
Bruno Cardoso Lopes 6430c7350d Add AVX SSE4.1 extractps and pinsr instructions
llvm-svn: 107746
2010-07-07 01:01:13 +00:00
Bruno Cardoso Lopes f3116ebe96 Add AVX SSE4.1 Extract Integer instructions
llvm-svn: 107740
2010-07-07 00:07:24 +00:00
Dale Johannesen ce65663330 Accept RIP-relative symbols with 'i' constraint, and
print the (%rip) only if the 'a' modifier is present.
PR7528.
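
A hypothetical use (GCC-style inline asm; names illustrative):

  extern int var;
  void f(void) {
    // 'i' accepts the symbolic constant; the 'a' (address) modifier
    // requests the memory-reference form, printing "var(%rip)" on x86-64.
    __asm__ volatile ("lea %a0, %%rax" : : "i" (&var) : "rax");
  }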

llvm-svn: 107727
2010-07-06 23:27:00 +00:00
Bruno Cardoso Lopes 1f9ad516c6 Add the rest of AVX SSE4.1 packed move with sign/zero extend instructions
llvm-svn: 107723
2010-07-06 23:15:17 +00:00
Bruno Cardoso Lopes 35702d27c4 Add part of AVX SSE4.1 packed move with sign/zero extend instructions
llvm-svn: 107720
2010-07-06 23:01:41 +00:00
Bruno Cardoso Lopes 13f0260e76 Fix comment from previous patch
llvm-svn: 107717
2010-07-06 22:38:32 +00:00