llvm-project

Commit Graph

Author	SHA1	Message	Date
Dale Johannesen	0d2e6ad504	Add intrinsic-based patterns for MMX PINSRW and PEXTRW. llvm-svn: 113420	2010-09-08 22:08:40 +00:00
Dale Johannesen	e54dba94f9	Check in forgotten file. Should fix build. llvm-svn: 113409	2010-09-08 21:09:48 +00:00
Dale Johannesen	4dae01781f	Slight cleanup, use only one form of MMXI_binop_rm_int. llvm-svn: 113406	2010-09-08 20:54:00 +00:00
Dale Johannesen	d79bb127dd	Add intrinsic forms of mmx<->sse conversions. Notes: Omission of memory form of PI2PD is intentional; this does not use an MMX register and does not put the chip into MMX mode (PI2PS, oddly enough, does). Operands of PI2PS follow the gcc builtin, not Intel. llvm-svn: 113388	2010-09-08 19:15:38 +00:00
Bruno Cardoso Lopes	99a9f4661a	Minor change. Fix comments and remove unused and redundant code llvm-svn: 113378	2010-09-08 18:12:31 +00:00
Bruno Cardoso Lopes	f7fee1c185	x86 vector shuffle lowering now relies only on target specific nodes to emit shuffles and don't do isel mask matching anymore. - Add the selection of the remaining shuffle opcode (movddup) - Introduce two new functions to "recognize" where we may get potential folds and add several comments to them explaining why they are not yet in the desidered shape. - Add more patterns to fallback the case where we select a specific shuffle opcode as if it could fold a load, but it can't, so remap to a valid instruction. - Add a couple of FIXMEs to address in the following days once there's a good solution to the current folding problem. llvm-svn: 113369	2010-09-08 17:43:25 +00:00
Chris Lattner	2907d2e419	add support for the commuted form of the test instruction, rdar://8018260. llvm-svn: 113352	2010-09-08 05:51:12 +00:00
Chris Lattner	a9ca7837e4	implement proper support for sysret{,l,q}, rdar://8403907 llvm-svn: 113350	2010-09-08 05:45:34 +00:00
Chris Lattner	063363fa80	implement the iret suite of instructions properly, fixing rdar://8403974 llvm-svn: 113349	2010-09-08 05:38:31 +00:00
Chris Lattner	086a83afb1	add support for instruction prefixes on the same line as the instruction, implementing rdar://8033482 and PR7254. llvm-svn: 113348	2010-09-08 05:17:37 +00:00
Chris Lattner	91689c1d0f	change the MC "ParseInstruction" interface to make it the implementation's job to check for and lex the EndOfStatement marker. llvm-svn: 113347	2010-09-08 05:10:46 +00:00
Chris Lattner	8caea68a4f	gas accepts xchg <mem>, <reg> as a synonym for xchg <reg>, <mem>. Add this to the mc assembler, fixing PR8061 llvm-svn: 113346	2010-09-08 04:53:27 +00:00
Chris Lattner	4703cb4a96	fix the encoding of the "jump on *cx" family of instructions, rdar://8061602 llvm-svn: 113343	2010-09-08 04:30:51 +00:00
Bruno Cardoso Lopes	6b1d62c529	Factor out some x86 vector shuffle rewriting and add comments about the direction the shuffle lowering is heading to llvm-svn: 113286	2010-09-07 21:03:14 +00:00
Bruno Cardoso Lopes	7c483028fb	Move code around to prepare for moving some of the logic together to another function llvm-svn: 113267	2010-09-07 20:20:27 +00:00
Bill Wendling	353802114f	Add an MVT::x86mmx type. It will take the place of all current MMX vector types. llvm-svn: 113261	2010-09-07 20:03:56 +00:00
Evan Cheng	5444b36e01	Remove a dead comment. llvm-svn: 113259	2010-09-07 20:01:10 +00:00
Bruno Cardoso Lopes	5a45db3e6c	decouple MMX check from regular splat checks. Some refactoring is coming, and MMX should be left alone to be easily removed after moving to intrinsics llvm-svn: 113247	2010-09-07 18:41:45 +00:00
Bruno Cardoso Lopes	4f5d4b4a6e	Remove now useless check, because the code can be matched below, no need to leave it for isel llvm-svn: 113242	2010-09-07 18:29:03 +00:00
Bruno Cardoso Lopes	c9b3316fea	Minor change. Since the checks are equivalent, use isMMX llvm-svn: 113239	2010-09-07 18:24:00 +00:00
Dale Johannesen	605acfe533	Add patterns for MMX that use the new intrinsics. Enable palignr intrinsic. These may need adjustment for a new VT in due course. llvm-svn: 113233	2010-09-07 18:10:56 +00:00
Bruno Cardoso Lopes	f0ea222255	Remove unused target specific node llvm-svn: 113224	2010-09-07 17:38:55 +00:00
Benjamin Kramer	1ecb978214	Don't leak the old operand when transforming "sldt" into "sldtw". llvm-svn: 113200	2010-09-07 14:40:58 +00:00
Chris Lattner	30bb384944	add missing cmov aliases, this resolves rdar://8208499 llvm-svn: 113189	2010-09-07 00:05:45 +00:00
Chris Lattner	3ae9398d5f	remove duplicated entry llvm-svn: 113188	2010-09-06 23:57:24 +00:00
Chris Lattner	7ece716da2	"sldt <mem>" is ambiguous in 64-bit mode, but should always be disambiguated as sldtw. sldtw and sldtq with a mem operands have the same effect, but sldtw is more compact. Force it to sldtw, resolving rdar://8017530 llvm-svn: 113186	2010-09-06 23:51:44 +00:00
Chris Lattner	415e04fad2	fix rdar://8017621 - llvm-mc can't guess encoding for "push $(1000)" llvm-svn: 113184	2010-09-06 23:40:56 +00:00
Chris Lattner	34e366b45c	fix the operand constraints of the immediate form of in/out, allowing unsigned 8-bit operands. This fixes rdar://8208481 llvm-svn: 113182	2010-09-06 23:29:05 +00:00
Chris Lattner	339cc7bfef	in the case where an instruction only has one implementation of a mneumonic, report operand errors with better location info. For example, we now report: t.s:6:14: error: invalid operand for instruction cwtl $1 ^ but we fail for common cases like: t.s:11:4: error: invalid operand for instruction addl $1, $1 ^ because we don't know if this is supposed to be the reg/imm or imm/reg form. llvm-svn: 113178	2010-09-06 22:11:18 +00:00
Chris Lattner	628fbecf4f	Now that we know if we had a total fail on the instruction mnemonic, give a more detailed error. Before: t.s:11:4: error: unrecognized instruction addl $1, $1 ^ t.s:12:4: error: unrecognized instruction f2efqefa $1 ^ After: t.s:11:4: error: invalid operand for instruction addl $1, $1 ^ t.s:12:4: error: invalid instruction mnemonic 'f2efqefa' f2efqefa $1 ^ This fixes rdar://8017912 - llvm-mc says "unrecognized instruction" when it means "invalid operands" llvm-svn: 113176	2010-09-06 21:54:15 +00:00
Chris Lattner	31c63fb518	simplify the hacks around jrcxz. llvm-svn: 113167	2010-09-06 20:10:12 +00:00
Chris Lattner	b4be28f33d	have tblgen detect when an instruction would have matched, but failed because a subtarget feature was not enabled. Use this to remove a bunch of hacks from the X86AsmParser for rejecting things like popfl in 64-bit mode. Previously these hacks weren't needed, but were important to get a message better than "invalid instruction" when used in the wrong mode. This also fixes bugs where pushal would not be rejected correctly in 32-bit mode (just pusha). llvm-svn: 113166	2010-09-06 20:08:02 +00:00
Chris Lattner	a22a368e7c	change MatchInstructionImpl to return an enum instead of bool. llvm-svn: 113165	2010-09-06 19:22:17 +00:00
Chris Lattner	3e4582ada5	have AsmMatcherEmitter.cpp produce the hunk of code that gets included into the middle of the class, and rework how the different sections of the generated file are conditionally included for simplicity. llvm-svn: 113163	2010-09-06 19:11:01 +00:00
Roman Divacky	e1278b57f9	Redefine LOOP* instructions from I to Ii8PCRel as they take an i8 argument. llvm-svn: 113158	2010-09-06 18:43:14 +00:00
Chris Lattner	4cfbcdc7b6	random cleanups llvm-svn: 113157	2010-09-06 18:32:06 +00:00
Chris Lattner	5cac0f71ca	update this. llvm-svn: 113116	2010-09-05 20:22:09 +00:00
Chris Lattner	eeba0c73e5	implement rdar://6653118 - fastisel should fold loads where possible. Since mem2reg isn't run at -O0, we get a ton of reloads from the stack, for example, before, this code: int foo(int x, int y, int z) { return x+y+z; } used to compile into: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx movl 4(%rsp), %esi addl %edx, %esi movl (%rsp), %edx addl %esi, %edx movl %edx, %eax addq $12, %rsp ret Now we produce: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx addl 4(%rsp), %edx ## Folded load addl (%rsp), %edx ## Folded load movl %edx, %eax addq $12, %rsp ret Fewer instructions and less register use = faster compiles. llvm-svn: 113102	2010-09-05 02:18:34 +00:00
Chris Lattner	65b48b5dfc	zap dead code. llvm-svn: 113073	2010-09-04 18:12:00 +00:00
Bruno Cardoso Lopes	c6accda78e	Remove the last bit of isShuffleMaskLegal checks and improve the comment regarding mmx shuffles llvm-svn: 113059	2010-09-04 02:58:56 +00:00
Bruno Cardoso Lopes	731bcc1abf	make explicit that we not handle several mmx shuffles llvm-svn: 113058	2010-09-04 02:50:13 +00:00
Bruno Cardoso Lopes	20779ee157	Emit target specific nodes to handle palignr. Do not touch it for MMX versions yet. llvm-svn: 113056	2010-09-04 02:36:07 +00:00
Bruno Cardoso Lopes	cff7cd18ab	Emit target specific nodes to handle splats starting at zero indicies llvm-svn: 113055	2010-09-04 02:02:14 +00:00
Bruno Cardoso Lopes	95759917eb	Emit target specific nodes for isPSHUFHWMask and isPSHUFLWMask llvm-svn: 113050	2010-09-04 01:36:45 +00:00
Bruno Cardoso Lopes	2b57008c72	Emit target specific nodes for isSHUFPMask llvm-svn: 113048	2010-09-04 01:22:57 +00:00
Bruno Cardoso Lopes	2f7af36134	Previous isMOVLMask matching already emits targets nodes, remove check llvm-svn: 113047	2010-09-04 00:50:08 +00:00
Bruno Cardoso Lopes	9f8e704151	One more check from the original isShuffleMaskLegal goes away llvm-svn: 113045	2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes	16959372bb	Remove a duplicated but useless check that i've inserted in the previous commit. llvm-svn: 113044	2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes	44578d38d3	Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef llvm-svn: 113043	2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes	7829d0e74b	Remove check for unpckh mask llvm-svn: 113035	2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes	d1dacc57aa	Remove check for unpckl mask llvm-svn: 113034	2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes	207b9d6218	Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start checking each standalone condition and decide whether emit target specific nodes or remove the condition if it's already matched before. llvm-svn: 113031	2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes	2bef20eda7	Reapply considered harmfull part of rr112934 and r112942. "Use target specific nodes instead of relying in unpckl and unpckh pattern fragments during isel time. Also place a depth limit in getShuffleScalarElt. llvm-svn: 113020	2010-09-03 22:09:41 +00:00
Dale Johannesen	367afb5a00	Remove the rest of the nonexistent 64-bit AVX instructions. Bruno, please review. llvm-svn: 113014	2010-09-03 21:23:00 +00:00
Bruno Cardoso Lopes	a750d994fe	Reapply last harmless part of r112934, the pattern fragment to match X86Unpcklpd llvm-svn: 113009	2010-09-03 20:44:26 +00:00
Bruno Cardoso Lopes	fe8717c573	Reintroduce a simple function refactoring done in r112934, also without any functionality changes llvm-svn: 113008	2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes	48e589b122	Reapply piecies of r112942 and r112934 which don't do functional changes llvm-svn: 113007	2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes	6979cf0808	Reapply Fix comment llvm-svn: 113006	2010-09-03 19:55:05 +00:00
Daniel Dunbar	6f3da24d70	Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced some infinite loop and select failures. - Apologies for eager reverting, but its branch day. llvm-svn: 113000	2010-09-03 19:38:11 +00:00
Daniel Dunbar	f1aacd55c0	Revert r112938 "Fix comment", which depends on r112934, which introduced some infinite loop and select failures. llvm-svn: 112999	2010-09-03 19:38:08 +00:00
Daniel Dunbar	0ffe4db45c	Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment", which depends on r112934, which introduced some infinite loop and select failures. llvm-svn: 112998	2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes	d6634a5b2e	AVX doesn't support mm operations neither its instrinsics. The AVX versions of PALIGN and PABS* should only exist for 128-bit. Remove the unnecessary stuff. llvm-svn: 112944	2010-09-03 02:08:45 +00:00
Bruno Cardoso Lopes	a85ec10483	Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment llvm-svn: 112942	2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes	adc6bca2dd	Fix comment llvm-svn: 112938	2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes	cce44678b4	- Use specific nodes to match unpckl masks. - Teach getShuffleScalarElt how to handle more target specific nodes, so the DAGCombine can make use of it. - Add another hack to avoid the node update problem during legalization. More description on the comments llvm-svn: 112934	2010-09-03 01:24:00 +00:00
Jakob Stoklund Olesen	08aede2538	Don't call Predicate_* from X86 target. llvm-svn: 112921	2010-09-03 00:35:18 +00:00
Anton Korobeynikov	a5a645559c	Properly emit __chkstk call instead of __alloca on non-mingw windows targets. Patch by Cameron Esfahani! llvm-svn: 112902	2010-09-02 23:03:46 +00:00
Bruno Cardoso Lopes	02a05a6a89	Move insertps mask decoding to header file llvm-svn: 112896	2010-09-02 22:43:39 +00:00
Anton Korobeynikov	a689c5b2c0	Revert win64 changes. They seem to be incomplete llvm-svn: 112885	2010-09-02 22:31:32 +00:00
Anton Korobeynikov	56291f7e53	Properly allocate win64 shadow reg area. Patch by Jan Sjodin! llvm-svn: 112875	2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes	814a69c330	Move decoding of insertps back to avoid unused warnings in x86 isel lowering, and fix movlhps/movhlps to decode 4 elements shuffles llvm-svn: 112869	2010-09-02 21:51:11 +00:00
Dan Gohman	3c9b5f394b	Don't narrow the load and store in a load+twiddle+store sequence unless there are clearly no stores between the load and the store. This fixes this miscompile reported as PR7833. This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is safe, but awkward to prove safe. Move it to X86's README.txt. llvm-svn: 112861	2010-09-02 21:18:42 +00:00
Bruno Cardoso Lopes	c79f50170a	Move x86 specific shuffle mask decoding to its own header, it's also going to be used elsewhere. Also trim trailing whitespaces llvm-svn: 112846	2010-09-02 18:40:13 +00:00
Bruno Cardoso Lopes	489613f1e5	Replace unpckl_undef and unpckh_undef matching with target specific opcodes llvm-svn: 112806	2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes	e4e4be3885	Move condition out to prepare for more matching llvm-svn: 112805	2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes	bf7fd146c7	Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it llvm-svn: 112804	2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes	6a7f634487	become more strict about when it's safe to use X86ISD::MOVLPS llvm-svn: 112799	2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes	04c25c15c7	Revert r112689, avoid those kind of checks cause they mess up with mmx llvm-svn: 112760	2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes	fea81b4831	Using target specific nodes for shuffle nodes makes the mask check more strict, breaking some cases not checked in the testsuite, but also exposes some foldings not done before, as this example: movaps (%rdi), %xmm0 movaps (%rax), %xmm1 movaps %xmm0, %xmm2 movss %xmm1, %xmm2 shufps $36, %xmm2, %xmm0 now is generated as: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 movlps (%rax), %xmm1 shufps $36, %xmm1, %xmm0 llvm-svn: 112753	2010-09-01 22:33:20 +00:00
Bruno Cardoso Lopes	b3825216ce	Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment llvm-svn: 112694	2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes	6aaebe877b	minor change, simplify some logic llvm-svn: 112689	2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes	2b025707a2	Move some functions around so they can be used for some other to come function llvm-svn: 112687	2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes	4b56d87290	Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes llvm-svn: 112661	2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes	61996ef835	Use x86 specific MOVSHDUP node and add more patterns to match it llvm-svn: 112657	2010-08-31 22:22:11 +00:00
Jakob Stoklund Olesen	33e9fce2d6	Make %EFLAGS unallocatable. No CCR virtual registers should exist, and %EFLAGS is used in ways that can surprise RegAllocFast. llvm-svn: 112650	2010-08-31 21:51:07 +00:00
Bruno Cardoso Lopes	5de15ce468	Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments llvm-svn: 112644	2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes	03e4c35302	Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes llvm-svn: 112642	2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes	dfd9dd5d75	Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles llvm-svn: 112570	2010-08-31 02:26:40 +00:00
Eli Friedman	f75de6eae7	A couple of small missed optimizations. llvm-svn: 112411	2010-08-29 05:07:40 +00:00
Chris Lattner	38ccc8b884	add a bunch more common shuffles to the instprinter. llvm-svn: 112397	2010-08-29 03:08:08 +00:00
Chris Lattner	7a05e6dca2	I have manually decoded the imm field of an insertps one too many times. This patch causes llc and llvm-mc (which both default to verbose-asm) to print out comments after a few common shuffle instructions which indicates the shuffle mask, e.g.: insertps $113, %xmm3, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm3[1] unpcklps %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] pshufd $1, %xmm1, %xmm1 ## xmm1 = xmm1[1,0,0,0] This is carefully factored to keep the information extraction (of the shuffle mask) separate from the printing logic. I plan to move the extraction part out somewhere else at some point for other parts of the x86 backend that want to introspect on the behavior of shuffles. llvm-svn: 112387	2010-08-28 20:42:31 +00:00
Chris Lattner	94656b1c8c	fix the buildvector->insertp[sd] logic to not always create a redundant insertp[sd] $0, which is a noop. Before: _f32: ## @f32 pshufd $1, %xmm1, %xmm2 pshufd $1, %xmm0, %xmm3 addss %xmm2, %xmm3 addss %xmm1, %xmm0 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm3, %xmm0 ret after: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movdqa %xmm2, %xmm0 insertps $16, %xmm3, %xmm0 ret The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379	2010-08-28 17:59:08 +00:00
Chris Lattner	bcb6090ad0	fix the BuildVector -> unpcklps logic to not do pointless shuffles when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } We used to produce (with SSE2, SSE4.1+ uses insertps): _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret We now produce: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movaps %xmm2, %xmm0 unpcklps %xmm3, %xmm0 ret This implements rdar://8368414 llvm-svn: 112378	2010-08-28 17:28:30 +00:00
Chris Lattner	96db6e66f4	improve comments in the unpcklps generating logic, introduce a new EltStride variable instead of reusing NumElems variable for a non-obvious purpose. No functionality change. llvm-svn: 112377	2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes	a982aa24ef	Clean up the logic of vector shuffles -> vector shifts. Also teach this logic how to handle target specific shuffles if needed, this is necessary while searching recursively for zeroed scalar elements in vector shuffle operands. llvm-svn: 112348	2010-08-28 02:46:39 +00:00
Anton Korobeynikov	c0b36921c2	Properly handle passing of FP stuff to varargs function on Win64: value should be copied to the corresponding shadow reg as well. Patch by Cameron Esfahani! llvm-svn: 112262	2010-08-27 14:43:06 +00:00
Daniel Dunbar	1844a71e66	X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler. llvm-svn: 112250	2010-08-27 01:30:14 +00:00
Jim Grosbach	6a77066913	Simplify eliminateFrameIndex() interface back down now that PEI doesn't need to try to re-use scavenged frame index reference registers. rdar://8277890 llvm-svn: 112241	2010-08-26 23:32:16 +00:00
Bruno Cardoso Lopes	e25ba0c7c2	zap the now unused MVT::getIntVectorWithNumElements llvm-svn: 112218	2010-08-26 20:53:12 +00:00
Bob Wilson	a967c42a3d	Fix comment typos. llvm-svn: 112202	2010-08-26 18:08:11 +00:00
Chris Lattner	eb2cc0ce0e	implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1. llvm-svn: 112171	2010-08-26 05:51:22 +00:00
Chris Lattner	cc60609cb4	fix sse1 only codegen in x86-64 mode, which is something we apparently try to support. llvm-svn: 112168	2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes	184eaea855	Fix PR7748 without using microsoft extensions llvm-svn: 112128	2010-08-26 01:02:53 +00:00
Chris Lattner	aecf47a5cb	we should pattern match the SSE complex arithmetic ops. llvm-svn: 112109	2010-08-25 23:31:42 +00:00
Bruno Cardoso Lopes	d4085f6e91	Revert this for now, PUNPCKLDQ dont operate on v4f32 llvm-svn: 112090	2010-08-25 21:26:37 +00:00
Daniel Dunbar	3d148ac089	X86: Fix misencode of RI64mi8. This fixes OpenSSL / x86_64-apple-darwin10 / clang -O3. llvm-svn: 112089	2010-08-25 21:11:02 +00:00
Benjamin Kramer	f1f2133ac0	Remove dead recursive function. Yay for clang -Wunused-function. llvm-svn: 112060	2010-08-25 17:27:58 +00:00
Anton Korobeynikov	b3b53ecac0	Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there. Mark _alloca call as clobberring EFLAGS, otherwise some DCE might remove other flags-clobberring stuff (e.g. cmp instructions) occuring after _alloca call. llvm-svn: 112034	2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes	0770d25758	PUNPCKLDQ should also be used for v4f32 llvm-svn: 112020	2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes	2e45d522c1	teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests llvm-svn: 112017	2010-08-25 02:35:37 +00:00
Daniel Dunbar	1c8d777c93	MC/X86: Tweak imul recognition, previous hack only applies for the imul form taking immediates. llvm-svn: 111950	2010-08-24 19:37:56 +00:00
Daniel Dunbar	09392785b4	MC/X86: Add custom hack for recognizing "imul $12, %eax" and friends. llvm-svn: 111947	2010-08-24 19:24:18 +00:00
Daniel Dunbar	94b84a19b9	MC/X86: Warn on scale factors > 1 without index register, instead of erroring, for 'as' compatibility. llvm-svn: 111945	2010-08-24 19:13:38 +00:00
Dan Gohman	c88fda477a	Fix X86's isLegalAddressingMode to recognize that static addresses need not be RIP-relative in small mode. llvm-svn: 111917	2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes	758d7b1f5c	Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments llvm-svn: 111890	2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes	264d90fff7	Start using target speficic nodes for shuffles: pshufhw and pshuflw llvm-svn: 111837	2010-08-23 20:41:02 +00:00
Gabor Greif	21fed6616c	tyops llvm-svn: 111835	2010-08-23 20:30:51 +00:00
Chris Lattner	58bd73a5a7	Add a new llvm.x86.int intrinsic, allowing access to the x86 int and int3 instructions. Patch by Peter Housel! llvm-svn: 111831	2010-08-23 19:39:25 +00:00
Chris Lattner	a42202e0e4	random improvement for variable shift codegen. llvm-svn: 111813	2010-08-23 17:30:29 +00:00
Anton Korobeynikov	cbbe4501df	Revert invalid r111792. Jump tables are not broken on x86-64 / coff, it's COFF emitter which does not support differences of two symbols (and needs to be fixed). GAS is pretty fine with code produced. llvm-svn: 111801	2010-08-23 07:38:51 +00:00
Michael J. Spencer	e87231232a	Workaround broken jump tables on x86-64 COFF. llvm-svn: 111792	2010-08-23 04:45:37 +00:00
Anton Korobeynikov	db9820ecaa	Use rip-rel addressing on win64 by default. For this we just defaults to small pic code model. llvm-svn: 111741	2010-08-21 17:21:11 +00:00
Michael J. Spencer	377aa20e6e	MC: Add partial x86-64 support to COFF. llvm-svn: 111728	2010-08-21 05:58:13 +00:00
Dan Gohman	42ef669d81	Fix x86 fast-isel's cmp+branch folding to avoid folding when the comparison is in a different basic block from the branch. In such cases, the comparison's operands may not have initialized virtual registers available. llvm-svn: 111709	2010-08-21 02:32:36 +00:00
Bruno Cardoso Lopes	9f20e7a1bf	Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly llvm-svn: 111704	2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes	6f3b38a851	This is the first step towards refactoring the x86 vector shuffle code. The general idea here is to have a group of x86 target specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a bunch of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target specific nodes can change as we move forward with this work. llvm-svn: 111691	2010-08-20 22:55:05 +00:00
Chris Lattner	f547740d3f	fix PR7465, mishandling of lcall and ljmp: intersegment long call and jumps. llvm-svn: 111496	2010-08-19 01:18:43 +00:00
Chris Lattner	beb506eeed	minor progress towards fixing PR7465 llvm-svn: 111494	2010-08-19 01:00:34 +00:00
Bill Wendling	817e857b13	Marked with ATTRIBUTE_USED so that clang doesn't complain. llvm-svn: 111383	2010-08-18 18:40:57 +00:00
Chris Lattner	3e3e63efe1	remove some code that is dead now that lea's are modeled with segment registers. llvm-svn: 111343	2010-08-18 02:40:44 +00:00
Anton Korobeynikov	88c09879c7	Revert part of one of the prev. patches - tailjmp will follow later. llvm-svn: 111291	2010-08-17 21:08:28 +00:00
Anton Korobeynikov	231ab847ca	More fixes for win64: - Do not clobber al during variadic calls, this is AMD64 ABI-only feature - Emit wincall64, where necessary Patch by Cameron Esfahani! llvm-svn: 111289	2010-08-17 21:06:07 +00:00
Anton Korobeynikov	cd78af6e3c	Enable more win64 calls folding opportunities. Patch by Cameron Esfahani! llvm-svn: 111288	2010-08-17 21:06:01 +00:00
Eli Friedman	2444da0652	Comment out some broken/unused/useless instructions which mess up disassembly. llvm-svn: 111185	2010-08-16 21:18:51 +00:00
Eli Friedman	51ec745509	Don't attempt to SimplifyShortMoveForm in 64-bit mode. llvm-svn: 111182	2010-08-16 21:03:32 +00:00
Matt Fleming	f751d856f0	Hookup ELF support for X86. llvm-svn: 111173	2010-08-16 18:36:14 +00:00
Jakob Stoklund Olesen	2cd00737c0	Partially revert r111155. It looks like MSVC is calling an operator<() that clang says is unused. llvm-svn: 111167	2010-08-16 18:24:54 +00:00
Jakob Stoklund Olesen	b7f872197a	Remove unused functions. llvm-svn: 111155	2010-08-16 17:18:18 +00:00
Argyrios Kyrtzidis	d0fcc9a818	Revert r111082. No warnings for this common pattern. llvm-svn: 111102	2010-08-15 10:27:23 +00:00
Eric Christopher	54194bd127	Rework how the non-sse2 memory barrier is lowered so that the encoding is correct for the built-in assembler. Based on a patch from Chris. llvm-svn: 111083	2010-08-14 21:51:50 +00:00
Argyrios Kyrtzidis	7c09ddf0ae	Add ATTRIBUTE_UNUSED to methods that are not supposed to be used. llvm-svn: 111082	2010-08-14 21:35:10 +00:00
Chris Lattner	2f6c3434ac	improve indentation llvm-svn: 111073	2010-08-14 17:26:09 +00:00
Bruno Cardoso Lopes	160be2936b	Add comments to some pattern fragments in x86 llvm-svn: 111041	2010-08-13 20:39:01 +00:00
Dale Johannesen	8d3c89e765	Revert 110491. While not wrong, it was based on a misanalysis and is undesirable. llvm-svn: 111028	2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes	081861b6b7	Fix comment to reflect code, and remove an unused argument llvm-svn: 111022	2010-08-13 17:50:47 +00:00
Bruno Cardoso Lopes	1187e3f09b	Improve comment to make explicit why not to touch this could before JIT goes MC llvm-svn: 111021	2010-08-13 17:44:10 +00:00
Eric Christopher	6e5b67ccc4	Revert last patch and r110954 as I meant to. llvm-svn: 111001	2010-08-13 02:37:50 +00:00
Eric Christopher	5e027fe113	Revert r110954 for now, pseudo instructions can't make it through to the JIT. llvm-svn: 111000	2010-08-13 02:30:00 +00:00
Bruno Cardoso Lopes	cc20fe5937	Some small clean-up: use of pseudo instructions llvm-svn: 110954	2010-08-12 20:55:18 +00:00
Bruno Cardoso Lopes	7f704b31a9	- Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency and in the future we remove if it's unnecessary. - Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too. - Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX. - Add a testcase for a simple 128-bit zero vector creation. llvm-svn: 110946	2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes	7e1a30c0d3	Define AVX 128-bit pattern versions of SET0PS/PD. llvm-svn: 110937	2010-08-12 18:20:59 +00:00
Bruno Cardoso Lopes	1401e040eb	Fix comment order llvm-svn: 110898	2010-08-12 02:08:52 +00:00
Bruno Cardoso Lopes	7306c86886	Begin to support some vector operations for AVX 256-bit intructions. The long term goal here is to be able to match enough of vector_shuffle and build_vector so all avx intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step, support building zeroed vectors. llvm-svn: 110897	2010-08-12 02:06:36 +00:00
Daniel Dunbar	7d7b4d1b0f	MC/X86/AsmParser: Give an explicit error message when we reject an instruction because it could have an ambiguous suffix. llvm-svn: 110890	2010-08-12 00:55:42 +00:00
Daniel Dunbar	2ecc3bb4f7	MC/AsmParser: Push the burdon of emitting diagnostics about unmatched instructions onto the target specific parser, which can do a better job. llvm-svn: 110889	2010-08-12 00:55:38 +00:00
Daniel Dunbar	167b9d7f30	tblgen/AsmMatcher: Always emit the match function as 'MatchInstructionImpl', target specific parsers can adapt the TargetAsmParser to this. llvm-svn: 110888	2010-08-12 00:55:32 +00:00
Jakob Stoklund Olesen	9c473e46f3	Fix <rdar://problem/8282498> even if it doesn't reproduce on trunk. When a register is defined by a partial load: %reg1234:sub_32 = MOV32mr <fi#-1>; GR64:%reg1234 That load cannot be folded into an instruction using the full 64-bit register. It would become a 64-bit load. This is related to the recent change to have isLoadFromStackSlot return false on a sub-register load. llvm-svn: 110874	2010-08-11 23:08:22 +00:00
Dan Gohman	5531aa4de1	Use ISD::ADD instead of ISD::SUB with a negated constant. This avoids trouble if the return type of TD->getPointerSize() is changed to something which doesn't promote to a signed type, and is simpler anyway. Also, use getCopyFromReg instead of getRegister to read a physical register's value. llvm-svn: 110835	2010-08-11 18:14:00 +00:00
Daniel Dunbar	ebace2248f	MCAsmParser: Add dump() hook to MCParsedAsmOperand. llvm-svn: 110790	2010-08-11 06:37:04 +00:00
Bruno Cardoso Lopes	91d61df3eb	Add AVX matching patterns to Packed Bit Test intrinsics. Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests comming with the other 256 intrinsics tests. llvm-svn: 110744	2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes	39f215bd33	Add AVX movnt{pd,ps,dq} 256-bit intrinsics llvm-svn: 110650	2010-08-10 02:49:24 +00:00
Bruno Cardoso Lopes	cedf23dfe5	Add AVX movmsk 256-bit intrinsics llvm-svn: 110648	2010-08-10 02:34:56 +00:00
Bruno Cardoso Lopes	85da72a88f	Support AVX 256-bit load and store intrinsics llvm-svn: 110645	2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes	b2b6b65b86	Patterns to match AVX cmp instructions llvm-svn: 110633	2010-08-10 00:13:20 +00:00
Bruno Cardoso Lopes	001d6fa174	Add matching patterns for vblend AVX intrinsics llvm-svn: 110630	2010-08-10 00:02:05 +00:00
Eric Christopher	b9627ee79b	Wording. llvm-svn: 110618	2010-08-09 22:52:47 +00:00
Bruno Cardoso Lopes	685cb32d2b	Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics llvm-svn: 110608	2010-08-09 21:51:56 +00:00
Bruno Cardoso Lopes	3e9b567643	Add patterns to AVX conversions instructions. Do that instead of declaring more intructions whenever is possible, more coming llvm-svn: 110605	2010-08-09 21:24:59 +00:00
Oscar Fuentes	212cfde6ec	CMake: eliminated unnecessary target_link_libraries. Next time the build is broken due to wrong library dependencies, just try building again (if you are on some Unix and are building all LLVM targets) or ask someone to commit the regenerated LLVMLibDeps.cmake. llvm-svn: 110593	2010-08-09 20:33:08 +00:00
Bruno Cardoso Lopes	c33940b3aa	Memory version of vcvtdq2pd intrinsic llvm-svn: 110582	2010-08-09 18:20:14 +00:00
Bruno Cardoso Lopes	828f6aeced	Patterns to match vinsert, vbroadcast, vmovmask and vcvtdq2pd AVX intrinsics llvm-svn: 110580	2010-08-09 18:03:43 +00:00
Dale Johannesen	a3bd31a923	Use sdmem and sse_load_f64 (etc.) for the vector form of CMPSD (etc.) Matching a 128-bit memory operand is wrong, the instruction uses only 64 bits (same as ADDSD etc.) 8193553. llvm-svn: 110491	2010-08-07 00:33:42 +00:00
Bruno Cardoso Lopes	93cc666a58	Patterns to match AVX 256-bit vzero intrinsics llvm-svn: 110480	2010-08-06 22:10:01 +00:00
Bruno Cardoso Lopes	3d6a3a0ede	Patterns to match AVX 256-bit permutation intrinsics llvm-svn: 110468	2010-08-06 20:03:27 +00:00
Owen Anderson	a7aed18624	Reapply r110396, with fixes to appease the Linux buildbot gods. llvm-svn: 110460	2010-08-06 18:33:48 +00:00
Bruno Cardoso Lopes	1cf067cb3d	Patterns to match AVX 256-bit horizontal arithmetic intrinsics llvm-svn: 110427	2010-08-06 02:10:30 +00:00
Bruno Cardoso Lopes	b9ad94fbf7	Patterns to match AVX 256-bit arithmetic intrinsics llvm-svn: 110425	2010-08-06 01:52:29 +00:00
Owen Anderson	bda59bd247	Revert r110396 to fix buildbots. llvm-svn: 110410	2010-08-06 00:23:35 +00:00
Eric Christopher	e1fb772aa5	Add an option to always emit realignment code for a particular module. llvm-svn: 110404	2010-08-05 23:57:43 +00:00
Owen Anderson	755aceb5d0	Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396	2010-08-05 23:42:04 +00:00
Bruno Cardoso Lopes	77954bdf7a	Support very basic (doesn't include ABI support in the front-end, varags, ...) 256-bit argument passing and return for AVX llvm-svn: 110394	2010-08-05 23:35:51 +00:00
Eric Christopher	4d9c3400f3	Handle the memory barrier pseudo that goes to nothing for the JIT. llvm-svn: 110371	2010-08-05 20:04:36 +00:00
Eric Christopher	7fd06eb8ce	Set hasSideEffects on the 64-bit no-sse memory barrier. llvm-svn: 110369	2010-08-05 19:54:59 +00:00
Eric Christopher	32f5d6b9be	Be a little bit more specific about target for the memory barrier instructions. llvm-svn: 110360	2010-08-05 18:36:20 +00:00
Eric Christopher	4abffad17c	Handle the pseudo in MCInstLower. llvm-svn: 110359	2010-08-05 18:34:30 +00:00
Eric Christopher	2db8464282	Make x86-64 membarriers work without sse and clean up some of the uses. llvm-svn: 110274	2010-08-04 23:03:04 +00:00
Eli Friedman	39d0f57cab	PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268	2010-08-04 22:40:58 +00:00
Devang Patel	2bf0f3ceff	Add DEBUG message. llvm-svn: 110224	2010-08-04 18:06:05 +00:00
Benjamin Kramer	a53a4eefa6	Enable COFF writer on mingw32 and cygwin. llvm-svn: 110200	2010-08-04 15:32:40 +00:00
Benjamin Kramer	61c8e6dc16	Print an error message when someone tries -integrated-as on an unsupported target. - The COFF backend doesn't support MingW/Cygwin at the moment, it'll report an error, but it's still much better than random assertions from the MachO backend. - We want to make ELF the default eventually, it's what the majority of targets use. llvm-svn: 110197	2010-08-04 13:16:30 +00:00
Chris Lattner	53befe7bc1	fix a win64 encoding problem, patch by Cameron Esfahani! llvm-svn: 110164	2010-08-03 22:49:22 +00:00
Michael J. Spencer	ed80f361b3	MC: Remove HasAbsolutizedSet from WindowsX86AsmBackend. llvm-svn: 109949	2010-07-31 07:21:44 +00:00
Michael J. Spencer	6b4925e223	Add relax all support to the COFF object streamer. llvm-svn: 109947	2010-07-31 06:22:29 +00:00
Bruno Cardoso Lopes	349165b48f	Support all 128-bit AVX vector intrinsics. Most part of them I already declared during the addition of the assembler support, the additional changes are: - Add missing intrinsics - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file. - Duplicate some patterns to AVX mode. - Step into PCMPEST/PCMPIST custom inserter and add AVX versions. llvm-svn: 109878	2010-07-30 19:54:33 +00:00
Bruno Cardoso Lopes	405405bbfe	Fix typo! llvm-svn: 109877	2010-07-30 19:41:24 +00:00
Jakob Stoklund Olesen	ba0e124aaf	Revert r109652, and remove the offending assert in loadRegFromStackSlot instead. We do sometimes load from a too small stack slot when dealing with x86 arguments (varargs and smaller-than-32-bit args). It looks like we know what we are doing in those cases, so I am going to remove the assert instead of artifically enlarging stack slot sizes. The assert in storeRegToStackSlot stays in. We don't want to write beyond the bounds of a stack slot. llvm-svn: 109764	2010-07-29 17:42:27 +00:00
Jakob Stoklund Olesen	f2234fbe70	Create a fixed stack object for varargs that is as large as any register. The size of this object isn't used for anything - technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot, and fixes PR7735. llvm-svn: 109652	2010-07-28 20:55:38 +00:00
Nate Begeman	53afc8f06a	Implement a vectorized algorithm for <16 x i8> << <16 x i8> This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566	2010-07-28 00:21:48 +00:00
Nate Begeman	269a6da023	~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549	2010-07-27 22:37:06 +00:00
Michael J. Spencer	f8270bdb2d	Make MC use Windows COFF on Windows and add tests. llvm-svn: 109494	2010-07-27 06:46:15 +00:00
Jakob Stoklund Olesen	96a890a7f8	The isLoadFromStackSlot and isStoreToStackSlot have no way of reporting subregister operands like this: %reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8) Make them return false when subreg operands are present. VirtRegRewriter is making bad assumptions otherwise. This fixes PR7713. llvm-svn: 109489	2010-07-27 04:17:01 +00:00
Jakob Stoklund Olesen	c3c05ed02e	Add assertions that expose the PR7713 miscompilation: Accessing a stack slot with a too-big register class. llvm-svn: 109488	2010-07-27 04:16:58 +00:00
Evan Cheng	d4218b8793	On x86, f32 / f64 nodes share the same registers as 128-bit vector values. llvm-svn: 109450	2010-07-26 21:50:05 +00:00
Bruno Cardoso Lopes	36c2ea6c7a	Temporary hack to let codegen assert or generate poor code in case we are using AVX and no AVX version of the desired intruction is present, this is better for incremental dev (without fallbacks it's easier to spot what's missing). Not sure this is the best hack thought (we can also disable all HasSSE* predicates by dinamically marking them 'false' if AVX is present) llvm-svn: 109434	2010-07-26 21:01:18 +00:00
Evan Cheng	37b740c4bf	Add an ILP scheduler. This is a register pressure aware scheduler that's appropriate for targets without detailed instruction iterineries. The scheduler schedules for increased instruction level parallelism in low register pressure situation; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300	2010-07-24 00:39:05 +00:00
Bruno Cardoso Lopes	306a1f9721	Support x86 "eiz" and "riz" pseudo index registers in the assembler. llvm-svn: 109295	2010-07-24 00:06:39 +00:00
Bruno Cardoso Lopes	d65cd1d581	Remove trailing whitespace llvm-svn: 109276	2010-07-23 22:15:26 +00:00
Bruno Cardoso Lopes	ea0e05a3ce	Add AVX version of CLMUL instructions llvm-svn: 109248	2010-07-23 18:41:12 +00:00
Bruno Cardoso Lopes	d618c8ac64	Declare CLMUL as a subtarget feature llvm-svn: 109207	2010-07-23 01:22:45 +00:00
Bruno Cardoso Lopes	09dc24beac	Add x86 CLMUL (Carry-less multiplication) cpu feature llvm-svn: 109206	2010-07-23 01:17:51 +00:00
Bruno Cardoso Lopes	acd9230b1b	Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual llvm-svn: 109204	2010-07-23 00:54:35 +00:00
Dale Johannesen	f2d75670b7	The only supported calling convention for X86-64 uses SSE, so we can't return floating point values if this is disabled. Detect this error for clang. With SSE1 only, f64 is a problem; it can be done, but neither llvm-gcc nor clang has ever generated correct code for it. Since nobody noticed this I think it's OK to treat it as an error for now. This also handles SSE-sized vectors of floating point. 8207686, 8204109. llvm-svn: 109201	2010-07-23 00:30:35 +00:00
Bruno Cardoso Lopes	e29e389678	Fix some AVX instructions which didnt had HasAVX prefix. And also a problem with PINSRW, which was totally wrong because of a typo I introduced previously llvm-svn: 109198	2010-07-23 00:14:54 +00:00
Bruno Cardoso Lopes	0710c74f29	Add remaining AVX instructions (most of them dealing with GR64 destinations. This complete the assembler support for the general AVX ISA. But we still miss instructions from FMA3 and CLMUL specific feature flags, which are now the next step llvm-svn: 109168	2010-07-22 21:18:49 +00:00
Chris Lattner	8f3adc9057	remove the JIT "NeedsExactSize" feature and supporting logic. llvm-svn: 109167	2010-07-22 21:17:55 +00:00
Chris Lattner	b3f608bbba	X86MCInstLower now depends on AsmPrinter being around. llvm-svn: 109154	2010-07-22 21:10:04 +00:00
Chris Lattner	083be4d384	instead of migrating it to the MC instruction encoder, just rip out the implementation of X86InstrInfo::GetInstSizeInBytes. The code being ripped out just implemented a copy and hacked up version of the (old) instruction encoder, and is buggy and terrible in other ways. Since "GetInstSizeInBytes" is really only there to support the JIT's "NeedsExactSize" hook (which noone is using), just rip out the code. I will rip out the NeedsExactSize hook next. This resolves rdar://7617809 - switch X86InstrInfo::GetInstSizeInBytes to use X86MCCodeEmitter llvm-svn: 109149	2010-07-22 21:05:13 +00:00
Chandler Carruth	3180f9f55f	Attempt to fix linking issues with CMake. Please review other CMake users, especially on other platforms. Is there a better way to fix this. llvm-svn: 109084	2010-07-22 06:27:45 +00:00
Eric Christopher	9a77382685	Custom lower the memory barrier instructions and add support for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078	2010-07-22 02:48:34 +00:00
Eric Christopher	a4c435f1fa	80-columns. llvm-svn: 109070	2010-07-22 00:26:08 +00:00
Nate Begeman	68a069a188	Make fast isel win64-aware w.r.t. call-clobbered regs llvm-svn: 109069	2010-07-22 00:09:39 +00:00
Bruno Cardoso Lopes	e3acfd4d58	Add more 256-bit forms for a bunch of regular AVX instructions Add 64-bit (GR64) versions of some instructions (which are not described in their SSE forms, but are described in AVX) llvm-svn: 109063	2010-07-21 23:53:50 +00:00
Rafael Espindola	350b1a449f	Fixes win64. It was broken by a previous patch where I missed the !isWin64 and then forced every register to be a vr128 on win64. llvm-svn: 109060	2010-07-21 23:19:57 +00:00
Chris Lattner	5c91a5e747	add some rough support for making mcinst lowering work without an asmprinter or mangler around. This is option #B for killing off X86InstrInfo::GetInstSizeInBytes. Option #A (killing "needsexactsize") was sent for consideration to llvmdev. llvm-svn: 109056	2010-07-21 23:03:35 +00:00
Bruno Cardoso Lopes	6238c1d102	Add missing AVX convert instructions. Those instructions are not described in their SSE forms (although they exist), but add the AVX forms anyway, so the assembler can benefit from it llvm-svn: 109039	2010-07-21 21:37:59 +00:00
Nate Begeman	784e062b2a	Fix a couple issues with Win64 ABI 1) all registers were spilled as xmm, regardless of actual size 2) win64 abi doesn't do the varargs-size-in-%al thing Still to look into: xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't. llvm-svn: 109035	2010-07-21 20:49:52 +00:00
Bruno Cardoso Lopes	19b3830142	Avoid AVX instructions to be selected instead of its SSE form llvm-svn: 109032	2010-07-21 20:38:42 +00:00
Eric Christopher	d27913e516	Pulling out previous patch, must've run the tests in the wrong directory. llvm-svn: 109005	2010-07-21 09:23:56 +00:00
Eric Christopher	b2d1067024	Lower MEMBARRIER on x86 and support processors without SSE2. Fixes a pile of libgomp failures in the llvm-gcc testsuite due to the libcall not existing. llvm-svn: 109004	2010-07-21 09:05:23 +00:00
Bruno Cardoso Lopes	cdbec62510	Add AVX only vzeroall and vzeroupper instructions llvm-svn: 109002	2010-07-21 08:56:24 +00:00
Bruno Cardoso Lopes	3499934da6	Add new AVX vpermilps, vpermilpd and vperm2f128 instructions llvm-svn: 108984	2010-07-21 03:07:42 +00:00
Bruno Cardoso Lopes	3ceaf7a0a2	Add new AVX vmaskmov instructions, and also fix the VEX encoding bits to support it llvm-svn: 108983	2010-07-21 02:46:58 +00:00
Bruno Cardoso Lopes	e706501975	Add new AVX vextractf128 instructions llvm-svn: 108964	2010-07-20 23:19:02 +00:00
Chris Lattner	41ff5d4d91	make asmprinter optional, even though passing in null will cause things to explode right now. llvm-svn: 108955	2010-07-20 22:45:33 +00:00
Chris Lattner	b4dc58975b	continue pushing dependencies around. llvm-svn: 108952	2010-07-20 22:35:40 +00:00
Chris Lattner	2366d95af9	reduce X86MCInstLower dependencies on asmprinter. llvm-svn: 108950	2010-07-20 22:30:53 +00:00
Chris Lattner	7fbdd7c852	pass around MF, not MMI. llvm-svn: 108949	2010-07-20 22:26:07 +00:00
Chris Lattner	d3f3a89425	cleanups. llvm-svn: 108947	2010-07-20 22:23:57 +00:00
Chris Lattner	5ca516b87c	move two asmprinter methods into the asmprinter .cpp file. llvm-svn: 108945	2010-07-20 22:18:19 +00:00
Bruno Cardoso Lopes	3b505848fd	Add new AVX instruction vinsertf128 llvm-svn: 108892	2010-07-20 19:44:51 +00:00
Eric Christopher	4adaccf0bf	Constify some arguments. llvm-svn: 108812	2010-07-20 06:52:21 +00:00
Bruno Cardoso Lopes	14c5fd437c	Add AVX vbroadcast new instruction llvm-svn: 108788	2010-07-20 00:11:13 +00:00
Daniel Dunbar	0aff8033c6	Update CMake files. llvm-svn: 108787	2010-07-20 00:08:13 +00:00
Chris Lattner	64fffadad3	fix a layering problem by moving the x86 implementation of AsmPrinter and InstLowering into libx86 and out of the asmprinter subdirectory. Now X86/AsmPrinter just depends on MC stuff, not all of codegen and LLVM IR. llvm-svn: 108782	2010-07-19 23:41:57 +00:00
Bruno Cardoso Lopes	9de0ca73d4	Add 256-bit vaddsub, vhadd, vhsub, vblend and vdpp instructions! llvm-svn: 108769	2010-07-19 23:32:44 +00:00
Daniel Dunbar	9db7d0addd	X86: Mark JMP{32,64}[mr] as requires 32-bit/64-bit mode. They are the same instruction, we only want to allow the one for the current subtarget. - This also fixes suffix matching for jmp instructions, because it eliminates the ambiguity between 'jmpl' and 'jmpq'. llvm-svn: 108746	2010-07-19 20:44:16 +00:00
Daniel Dunbar	9aefb8ee4c	X86-64: Mark WINCALL and more tail call instructions as code gen only. llvm-svn: 108685	2010-07-19 07:21:07 +00:00
Daniel Dunbar	2e9f58517d	X86: Mark some tail call pseduo instruction as code gen only. llvm-svn: 108684	2010-07-19 07:21:04 +00:00
Daniel Dunbar	1cd02510d3	X86: Mark In32/64BitMode on LEAVE[64] and SYSEXIT[64]. llvm-svn: 108683	2010-07-19 07:21:01 +00:00
Daniel Dunbar	b82cd9319b	MC/X86: We now match instructions like "incl %eax" correctly for the arch we are assembling; remove crufty custom cleanup code. llvm-svn: 108681	2010-07-19 06:14:54 +00:00
Daniel Dunbar	150d948d3a	X86: Mark MOV.*_{TC,NOREX} instruction as code gen only, they aren't real. llvm-svn: 108680	2010-07-19 06:14:49 +00:00
Daniel Dunbar	961543377d	X86: MOV8o8a, MOV8ao8, etc. are only valid in 32-bit mode. llvm-svn: 108679	2010-07-19 06:14:44 +00:00
Daniel Dunbar	eefe8616be	TblGen/AsmMatcher: Add support for honoring instruction Requires<[]> attributes as part of the matcher. - Currently includes a hack to limit ourselves to "In32BitMode" and "In64BitMode", because we don't have the other infrastructure to properly deal with setting SSE, etc. features on X86. llvm-svn: 108677	2010-07-19 05:44:09 +00:00
Daniel Dunbar	419197cc4d	Target: Give the TargetAsmParser access to the TargetMachine. - Unfortunate, but necessary for now to handle subtarget instruction matching. Eventually we should factor out the lower level target machine information so we don't need to do this. llvm-svn: 108664	2010-07-19 00:33:49 +00:00
Chris Lattner	5218343970	the stackifier is global! llvm-svn: 108626	2010-07-17 17:42:04 +00:00
Chris Lattner	8f440bb9b0	doxygenify some comments. llvm-svn: 108625	2010-07-17 17:40:51 +00:00
Eric Christopher	83f250f005	Remove unnecessary check that was subsumed into canRealignStack. llvm-svn: 108588	2010-07-17 00:33:04 +00:00
Eric Christopher	c0be37287c	Make comment a bit more clear as well as return statement since needsStackRealignment is currently checking the can conditions as well. llvm-svn: 108581	2010-07-17 00:25:41 +00:00
Jakob Stoklund Olesen	8289f78569	Remove the isMoveInstr() hook. llvm-svn: 108567	2010-07-16 22:35:46 +00:00
Jakob Stoklund Olesen	2c130b8ead	Use MI.isCopy. llvm-svn: 108565	2010-07-16 22:35:34 +00:00
Bill Wendling	499f797cdd	Rename DBG_LABEL PROLOG_LABEL, because it's only used during prolog emission and thus is a much more meaningful name. llvm-svn: 108563	2010-07-16 22:20:36 +00:00
Jakob Stoklund Olesen	8d51149102	Keep valgrind quiet. The isLive() method can read uninitialized memory, but it still gives correct results. llvm-svn: 108561	2010-07-16 22:00:33 +00:00
Dale Johannesen	da3e05db70	Accept registers with P modifier. PR 5314. llvm-svn: 108545	2010-07-16 18:35:46 +00:00
Jakob Stoklund Olesen	c30b4ddc58	Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill pass that inserted it. It is no longer necessary to limit the live ranges of FP registers to a single basic block. llvm-svn: 108536	2010-07-16 17:41:44 +00:00
Jakob Stoklund Olesen	f0af236874	Search for a free FP register instead of just assuming FP7 is not in use. llvm-svn: 108535	2010-07-16 17:41:40 +00:00
Jakob Stoklund Olesen	0e5fb020a0	Allow x87 FP registers to be alive globally in a function. FP_REG_KILL instructions are still inserted, but can be disabled by passing -live-x87 to llc. The X87FPRegKillInserterPass is going to be removed shortly. CFG edges are partioned into bundles where the x87 stack must be allocated identically. Code is insertad at the end of each basic block that shuffles the live FP registers to match the outgoing bundles expectations. This fix is in preparation for some upcoming register allocator improvements that may extend the live range of registers beyond a basic block, similar to LICM. It also provides a nice runtime speedup if you are building with -mfpmath=387. llvm-svn: 108529	2010-07-16 16:38:12 +00:00
Evan Cheng	55f0c6b9fc	Split -enable-finite-only-fp-math to two options: -enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN. llvm-svn: 108465	2010-07-15 22:07:12 +00:00
Chris Lattner	620693806a	fix the encoding of MMX_MOVFR642Qrr, it starts with 0xF2 not 0xF3, this fixes rdar://8192860. Unfortunately it can only be triggered with llc because llvm-mc matches another (correctly encoded) version of this, so no testcase. llvm-svn: 108454	2010-07-15 20:13:34 +00:00
Jakob Stoklund Olesen	8b1bb8cfbd	Last COPY conversion. llvm-svn: 108387	2010-07-14 23:58:21 +00:00
Jakob Stoklund Olesen	9b449d5a92	Use TargetOpcode::COPY instead of X86-native register copy instructions when lowering atomics. This will allow those copies to still be coalesced after TII::isMoveInstr is removed. llvm-svn: 108385	2010-07-14 23:50:27 +00:00
Chris Lattner	769aedd523	fix indentation llvm-svn: 108368	2010-07-14 23:04:59 +00:00
Benjamin Kramer	92d8998348	Don't pass StringRef by reference. llvm-svn: 108366	2010-07-14 22:38:02 +00:00
Chris Lattner	254858031a	Merge lib/Target/X86/X86COFF.h into include/llvm/Support/COFF.h, patch by Michael Spencer! llvm-svn: 108342	2010-07-14 18:14:33 +00:00
Evan Cheng	a8e8874552	Fix for PR7193 was overly conservative. The only case where sibcall callee address cannot be allocated a register is in 32-bit mode where the first three arguments are marked inreg. In that case EAX, EDX, and ECX will be used for argument passing. This fixes PR7610. llvm-svn: 108327	2010-07-14 06:44:01 +00:00
Dan Gohman	1f471435f8	Don't propagate debug locations to instructions for materializing constants, since they may not be emited near the other instructions which get the same line, and this confuses debug info. llvm-svn: 108302	2010-07-14 01:07:44 +00:00
Bruno Cardoso Lopes	6c6c14a55c	Add AVX 256-bit compare instructions and a bunch of testcases llvm-svn: 108286	2010-07-13 22:06:38 +00:00
Bruno Cardoso Lopes	fd8bfcd6e1	AVX 256-bit conversion instructions Add the x86 VEX_L form to handle special cases where VEX_L must be set. llvm-svn: 108274	2010-07-13 21:07:28 +00:00
Kevin Enderby	76a6b663a3	Added a check that pusha cannot be encoded in 64-bit mode. llvm-svn: 108265	2010-07-13 20:05:41 +00:00
Chris Lattner	55595fb291	my work on adding segment registers to LEA missed the disassembler. Remove some code from the disassembler to compensate, unbreaking disassembly of lea's. llvm-svn: 108226	2010-07-13 04:23:55 +00:00
Bruno Cardoso Lopes	dff283e146	Add AVX 256-bit packed logical forms llvm-svn: 108224	2010-07-13 02:38:35 +00:00
Bruno Cardoso Lopes	36b32aeaa5	Add AVX 256-bit unop arithmetic instructions llvm-svn: 108223	2010-07-13 01:53:31 +00:00
Bruno Cardoso Lopes	77a3c4462f	Since AVX is a superset of all SSE versions, only use HasAVX for AVX instructions llvm-svn: 108222	2010-07-13 00:38:47 +00:00
David Greene	03264efe30	Move some SIMD fragment code into X86InstrFragmentsSIMD so that the utility classes can be used from multiple files. This will aid transitioning to a new refactored x86 SIMD specification. llvm-svn: 108213	2010-07-12 23:41:28 +00:00
Bruno Cardoso Lopes	8e67a0482e	Add AVX 256 binary arithmetic instructions llvm-svn: 108207	2010-07-12 23:04:15 +00:00
Bruno Cardoso Lopes	91806311c9	More refactoring of basic SSE arith instructions. Open room for 256-bit instructions llvm-svn: 108204	2010-07-12 22:41:32 +00:00
Dan Gohman	51e6d9bbf6	Apply the SSE dependence idiom for SSE unary operations to SD instructions too, in addition to SS instructions. And add a comment about it. llvm-svn: 108191	2010-07-12 20:46:04 +00:00
Bruno Cardoso Lopes	f9bcaad76d	Add AVX 256-bit MOVMSK forms llvm-svn: 108184	2010-07-12 20:06:32 +00:00
Dan Gohman	425b35681f	Check begin!=end, rather than !begin. llvm-svn: 108167	2010-07-12 18:12:35 +00:00
Dan Gohman	68d7424a65	Don't fast-isel an x87 comparison opcode, as fast-isel doesn't support branching on x87 comparisons yet. This fixes PR7624. llvm-svn: 108149	2010-07-12 15:46:30 +00:00
Rafael Espindola	6635f9838e	Convert getLoadStoreRegOpcode to use a switch. llvm-svn: 108123	2010-07-12 03:43:04 +00:00
Jakob Stoklund Olesen	de7201545e	A basic block that only uses RFP registers still needs the FP_REG_KILL marker. This fixes PR7375. llvm-svn: 108120	2010-07-12 02:12:47 +00:00
Rafael Espindola	e35d70fafa	Convert the last getPhysicalRegisterRegClass in VirtRegRewriter.cpp to getMinimalPhysRegClass. It was used to produce spills, and it is better to use the most specific class if possible. Update getLoadStoreRegOpcode to handle GR32_AD. llvm-svn: 108115	2010-07-12 00:52:33 +00:00
Jakob Stoklund Olesen	f6c7d7fb3f	Use target independent COPY instructions for the fake fextend and fround operations in x87 code. llvm-svn: 108098	2010-07-11 18:19:39 +00:00
Jakob Stoklund Olesen	98ee37d878	Remove obsolete README_SSE note. We are generating movaps for all XMM register copies, including scalar floating point values. This is known to be at least as good as movss and movsd for all known architectures up to and including Nehalem because it avoids a partial register stall. The SSEDomainFix pass will switch movaps to movdqa when appropriate (i.e., when operands come from the integer unit). We don't now that switching movaps to movapd has any benefit. The same applies to andps -> pand. llvm-svn: 108096	2010-07-11 17:13:42 +00:00
Jakob Stoklund Olesen	4806848799	Avoid SSE instructions in FastIsel when it is not available. llvm-svn: 108091	2010-07-11 16:22:13 +00:00
Jakob Stoklund Olesen	e46f3eb0c4	X86InstrInfo::copyRegToReg is dead. Long live copyPhysReg! llvm-svn: 108076	2010-07-11 05:44:30 +00:00
Jakob Stoklund Olesen	8969657f0c	Use COPY in X86FastISel::X86SelectRet. Don't try a cross-class copy. That is very unlikely anywy since return value registers are usually register class friendly. (%EAX, %XMM0, etc). llvm-svn: 108074	2010-07-11 05:17:02 +00:00
Jakob Stoklund Olesen	3bb1267431	Use COPY in FastISel everywhere it is safe and trivial. The remaining copyRegToReg calls actually check the return value (shock!), so we cannot trivially replace them with COPY instructions. llvm-svn: 108069	2010-07-11 03:31:00 +00:00
Jakob Stoklund Olesen	de457896b6	Don't emit st(0)/st(1) copies as FpMOV instructions. Use FpSET_ST? instead. Based on a patch by Rafael Espíndola. Attempt to make the FpSET_ST1 hack more robust, but we are still relying on FpSET_ST0 preceeding it. This is only for supporting really weird x87 inline asm. We support: FpSET_ST0 INLINEASM FpSET_ST0 FpSET_ST1 INLINEASM with and without kills on the arguments. We don't support: FpSET_ST1 FpSET_ST0 INLINEASM nor FpSET_ST1 INLINEASM Just Don't Do It! llvm-svn: 108047	2010-07-10 17:42:34 +00:00
Dan Gohman	d7b5ce3312	Reapply bottom-up fast-isel, with several fixes for x86-32: - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up the top of the entry block. llvm-svn: 108039	2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen	be8d9b0bb8	An x86 function returns a floating point value in st(0), and we must make sure it is popped, even if it is ununsed. A CopyFromReg node is too weak to represent the required sideeffect, so insert an FpGET_ST0 instruction directly instead. This will matter when CopyFromReg gets lowered to a generic COPY instruction. llvm-svn: 108037	2010-07-10 04:04:25 +00:00
Bruno Cardoso Lopes	5e6c2155a3	Declare YMM subregisters in the right way! Thanks Jakob llvm-svn: 108022	2010-07-09 21:46:19 +00:00
Bruno Cardoso Lopes	2419606bfb	Add AVX 256-bit packed MOVNT variants llvm-svn: 108021	2010-07-09 21:42:42 +00:00
Jakob Stoklund Olesen	e2614a9979	Remember the *_TC opcodes for load/store llvm-svn: 108020	2010-07-09 21:27:55 +00:00
Bruno Cardoso Lopes	6bc772eec7	Add AVX 256-bit unpack and interleave llvm-svn: 108017	2010-07-09 21:20:35 +00:00
Jakob Stoklund Olesen	7a7b55eb67	Automatically fold COPY instructions into stack load/store. llvm-svn: 108012	2010-07-09 20:43:13 +00:00
Jakob Stoklund Olesen	51702ec46b	Fix a few tests llvm-svn: 108011	2010-07-09 20:43:09 +00:00
Bruno Cardoso Lopes	792e906bef	Start the support for AVX instructions with 256-bit %ymm registers. A couple of notes: - The instructions are being added with dummy placeholder patterns using some 256 specifiers, this is not meant to work now, but since there are some multiclasses generic enough to accept them, when we go for codegen, the stuff will be already there. - Add VEX encoding bits to support YMM - Add MOVUPS and MOVAPS in the first round - Use "Y" as suffix for those Instructions: MOVUPSYrr, ... - All AVX instructions in X86InstrSSE.td will move soon to a new X86InstrAVX file. llvm-svn: 107996	2010-07-09 18:27:43 +00:00
Bob Wilson	6586e9b203	--- Reverse-merging r107947 into '.': U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987	2010-07-09 16:37:18 +00:00
Bruno Cardoso Lopes	992d25da71	Merge VEX enums with other x86 enum forms. Also fix all checks of which VEX fields to use. llvm-svn: 107952	2010-07-09 01:56:45 +00:00
Dan Gohman	0a7d155d67	Fix the memoperand offsets in code generated for va_start. llvm-svn: 107948	2010-07-09 01:06:48 +00:00
Chris Lattner	88c185617c	have the mc lowering process handle a few tail call forms, lowering them to jumps where possible and turning the TAILCALL marker in the instruction asm string into a proper comment. This eliminates a FIXME and is on the path to finishing: rdar://7639610 - eliminate encoding and asm info for TAILJMPd TAILJMPr TAILJMPn, etc. However, I can't eliminate the encodings for these instructions because the JIT still exists and has its own copy of the encoder, sigh. llvm-svn: 107946	2010-07-09 00:49:41 +00:00
Dan Gohman	0b5aa1cdd3	Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943	2010-07-09 00:39:23 +00:00
Bruno Cardoso Lopes	e6cc0d33bb	Factor out x86 segment override prefix encoding, and also use it for VEX llvm-svn: 107942	2010-07-09 00:38:14 +00:00
Chris Lattner	061d70ad2c	reject pseudo instructions early in the encoder. llvm-svn: 107939	2010-07-09 00:17:50 +00:00
Bruno Cardoso Lopes	b652c1a145	Remove trailing whitespaces from file llvm-svn: 107937	2010-07-09 00:07:19 +00:00
Chris Lattner	f469307c77	Change LEA to have 5 operands for its memory operand, just like all other instructions, even though a segment is not allowed. This resolves a bunch of gross hacks in the encoder and makes LEA more consistent with the rest of the instruction set. No functionality change. llvm-svn: 107934	2010-07-08 23:46:44 +00:00
Chris Lattner	ec536276f0	add some long-overdue enums to refer to the parts of the 5-operand X86 memory operand. llvm-svn: 107925	2010-07-08 22:41:28 +00:00
Jakob Stoklund Olesen	ec58a43d81	Remember the VR64 register class llvm-svn: 107920	2010-07-08 22:30:35 +00:00
Chris Lattner	9f034c1e5d	Rework segment prefix emission code to handle segments in memory operands at the same type as hard coded segments. This fixes problems where we'd emit the segment override after the REX prefix on instructions like: mov %gs:(%rdi), %rax This fixes rdar://8127102. I have several cleanup patches coming next. llvm-svn: 107917	2010-07-08 22:28:12 +00:00
Chris Lattner	1dd82c7dc2	introduce a new X86II::getMemoryOperandNo method, which returns the start of the memory operand for an instruction. Introduce a new "X86AddrSegment" enum to reduce # magic numbers referring to X86 memory operand layout. llvm-svn: 107916	2010-07-08 22:27:06 +00:00
Jakob Stoklund Olesen	63a622b768	Teach the x86 floating point stackifier to handle COPY instructions. This pass runs before COPY instructions are passed to copyPhysReg, so we simply translate COPY to the proper pseudo instruction. Note that copyPhysReg does not handle floating point stack copies. Once COPY is used everywhere, this can be cleaned up a bit, and most of the pseudo instructions can be removed. llvm-svn: 107899	2010-07-08 19:46:30 +00:00
Jakob Stoklund Olesen	930f8082c3	Implement X86InstrInfo::copyPhysReg llvm-svn: 107898	2010-07-08 19:46:25 +00:00
Jakob Stoklund Olesen	00264624a9	Convert EXTRACT_SUBREG to COPY when emitting machine instrs. EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead. Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg(). The isMoveInstr hook will be removed later. llvm-svn: 107879	2010-07-08 16:40:22 +00:00
Jakob Stoklund Olesen	a1e883dcf6	Remove references to INSERT_SUBREG after de-SSA. Fix X86InstrInfo::convertToThreeAddressWithLEA to generate COPY instead of INSERT_SUBREG. llvm-svn: 107878	2010-07-08 16:40:15 +00:00
Eric Christopher	e796253217	A slight reworking of the custom patterns for x86-64 tpoff codegen and correct the testcase for valid assembly. Needs more tests. llvm-svn: 107860	2010-07-08 07:36:46 +00:00
Dan Gohman	e75704369d	Revert 107840 107839 107813 107804 107800 107797 107791. Debug info intrinsics win for now. llvm-svn: 107850	2010-07-08 01:00:56 +00:00
Jakob Stoklund Olesen	6213ab789f	fix copies to/from GR8_ABCD_H even more llvm-svn: 107832	2010-07-07 23:04:56 +00:00
Chris Lattner	05ea2a4791	finish up support for callw: PR7195 llvm-svn: 107826	2010-07-07 22:35:13 +00:00
Chris Lattner	ac5881295c	Implement the major chunk of PR7195: support for 'callw' in the integrated assembler. Still some discussion to be done. llvm-svn: 107825	2010-07-07 22:27:31 +00:00
Bruno Cardoso Lopes	6c61451011	Add more assembly opcodes for SSE compare instructions llvm-svn: 107823	2010-07-07 22:24:03 +00:00
Evan Cheng	1c349f18f8	Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake. llvm-svn: 107820	2010-07-07 22:15:37 +00:00
Devang Patel	32a600b494	Print undefined/unknown debug value as "undef". llvm-svn: 107818	2010-07-07 21:52:21 +00:00
Jakob Stoklund Olesen	ddaf0099a5	Allow copies between GR8_ABCD_L and GR8_ABCD_H. This fixes PR7540. llvm-svn: 107809	2010-07-07 20:33:27 +00:00
Dan Gohman	e7ccc51cc1	Implement bottom-up fast-isel. This has the advantage of not requiring a separate DCE pass over MachineInstrs. llvm-svn: 107804	2010-07-07 19:20:32 +00:00
Dan Gohman	2d4d01d0de	Add X86FastISel support for return statements. This entails refactoring a bunch of stuff, to allow the target-independent calling convention logic to be employed. llvm-svn: 107800	2010-07-07 18:32:53 +00:00
Bruno Cardoso Lopes	fd8060335b	Add AVX AES instructions llvm-svn: 107798	2010-07-07 18:24:20 +00:00
Dan Gohman	ffe64b1ee5	Give FunctionLoweringInfo an MBB member, avoiding the need to pass it around everywhere, and also give it an InsertPt member, to enable isel to operate at an arbitrary position within a block, rather than just appending to a block. llvm-svn: 107791	2010-07-07 16:47:08 +00:00
Dan Gohman	87fb4e8fcd	Simplify FastISel's constructor by giving it a FunctionLoweringInfo instance, rather than pointers to all of FunctionLoweringInfo's members. This eliminates an NDEBUG ABI sensitivity. llvm-svn: 107789	2010-07-07 16:29:44 +00:00
Dan Gohman	fe7532a308	Split the SDValue out of OutputArg so that SelectionDAG-independent code can do calling-convention queries. This obviates OutputArgReg. llvm-svn: 107786	2010-07-07 15:54:55 +00:00
Bruno Cardoso Lopes	6d122aef97	Add AVX SSE4.2 instructions llvm-svn: 107752	2010-07-07 03:39:29 +00:00
Bruno Cardoso Lopes	3df55b2d6f	Use only one multiclass to pinsrq instructions llvm-svn: 107750	2010-07-07 01:43:01 +00:00
Bruno Cardoso Lopes	fd6c808154	Now that almost all SSE4.1 AVX instructions are added, move code around to more appropriate sections. No functionality changes llvm-svn: 107749	2010-07-07 01:33:38 +00:00
Bruno Cardoso Lopes	8f5472a8e8	Add AVX SSE4.1 insertps, ptest and movntdqa instructions llvm-svn: 107747	2010-07-07 01:14:56 +00:00
Bruno Cardoso Lopes	6430c7350d	Add AVX SSE4.1 extractps and pinsr instructions llvm-svn: 107746	2010-07-07 01:01:13 +00:00
Bruno Cardoso Lopes	f3116ebe96	Add AVX SSE4.1 Extract Integer instructions llvm-svn: 107740	2010-07-07 00:07:24 +00:00
Dale Johannesen	ce65663330	Accept RIP-relative symbols with 'i' constraint, and print the (%rip) only if the 'a' modifier is present. PR 7528. llvm-svn: 107727	2010-07-06 23:27:00 +00:00
Bruno Cardoso Lopes	1f9ad516c6	Add the rest of AVX SSE4.1 packed move with sign/zero extend instructions llvm-svn: 107723	2010-07-06 23:15:17 +00:00
Bruno Cardoso Lopes	35702d27c4	Add part of AVX SSE4.1 packed move with sign/zero extend instructions llvm-svn: 107720	2010-07-06 23:01:41 +00:00
Bruno Cardoso Lopes	13f0260e76	Fix comment from previous patch llvm-svn: 107717	2010-07-06 22:38:32 +00:00

... 5 6 7 8 9 ...

6727 Commits