llvm-project

Commit Graph

Author	SHA1	Message	Date
Bruno Cardoso Lopes	c9b3316fea	Minor change. Since the checks are equivalent, use isMMX llvm-svn: 113239	2010-09-07 18:24:00 +00:00
Dale Johannesen	605acfe533	Add patterns for MMX that use the new intrinsics. Enable palignr intrinsic. These may need adjustment for a new VT in due course. llvm-svn: 113233	2010-09-07 18:10:56 +00:00
Bruno Cardoso Lopes	f0ea222255	Remove unused target specific node llvm-svn: 113224	2010-09-07 17:38:55 +00:00
Benjamin Kramer	1ecb978214	Don't leak the old operand when transforming "sldt" into "sldtw". llvm-svn: 113200	2010-09-07 14:40:58 +00:00
Chris Lattner	30bb384944	add missing cmov aliases, this resolves rdar://8208499 llvm-svn: 113189	2010-09-07 00:05:45 +00:00
Chris Lattner	3ae9398d5f	remove duplicated entry llvm-svn: 113188	2010-09-06 23:57:24 +00:00
Chris Lattner	7ece716da2	"sldt <mem>" is ambiguous in 64-bit mode, but should always be disambiguated as sldtw. sldtw and sldtq with a mem operands have the same effect, but sldtw is more compact. Force it to sldtw, resolving rdar://8017530 llvm-svn: 113186	2010-09-06 23:51:44 +00:00
Chris Lattner	415e04fad2	fix rdar://8017621 - llvm-mc can't guess encoding for "push $(1000)" llvm-svn: 113184	2010-09-06 23:40:56 +00:00
Chris Lattner	34e366b45c	fix the operand constraints of the immediate form of in/out, allowing unsigned 8-bit operands. This fixes rdar://8208481 llvm-svn: 113182	2010-09-06 23:29:05 +00:00
Chris Lattner	339cc7bfef	in the case where an instruction only has one implementation of a mnemonic, report operand errors with better location info. For example, we now report: t.s:6:14: error: invalid operand for instruction cwtl $1 ^ but we fail for common cases like: t.s:11:4: error: invalid operand for instruction addl $1, $1 ^ because we don't know if this is supposed to be the reg/imm or imm/reg form. llvm-svn: 113178	2010-09-06 22:11:18 +00:00
Chris Lattner	628fbecf4f	Now that we know if we had a total fail on the instruction mnemonic, give a more detailed error. Before: t.s:11:4: error: unrecognized instruction addl $1, $1 ^ t.s:12:4: error: unrecognized instruction f2efqefa $1 ^ After: t.s:11:4: error: invalid operand for instruction addl $1, $1 ^ t.s:12:4: error: invalid instruction mnemonic 'f2efqefa' f2efqefa $1 ^ This fixes rdar://8017912 - llvm-mc says "unrecognized instruction" when it means "invalid operands" llvm-svn: 113176	2010-09-06 21:54:15 +00:00
Chris Lattner	31c63fb518	simplify the hacks around jrcxz. llvm-svn: 113167	2010-09-06 20:10:12 +00:00
Chris Lattner	b4be28f33d	have tblgen detect when an instruction would have matched, but failed because a subtarget feature was not enabled. Use this to remove a bunch of hacks from the X86AsmParser for rejecting things like popfl in 64-bit mode. Previously these hacks weren't needed, but were important to get a message better than "invalid instruction" when used in the wrong mode. This also fixes bugs where pushal would not be rejected correctly in 32-bit mode (just pusha). llvm-svn: 113166	2010-09-06 20:08:02 +00:00
Chris Lattner	a22a368e7c	change MatchInstructionImpl to return an enum instead of bool. llvm-svn: 113165	2010-09-06 19:22:17 +00:00
Chris Lattner	3e4582ada5	have AsmMatcherEmitter.cpp produce the hunk of code that gets included into the middle of the class, and rework how the different sections of the generated file are conditionally included for simplicity. llvm-svn: 113163	2010-09-06 19:11:01 +00:00
Roman Divacky	e1278b57f9	Redefine LOOP* instructions from I to Ii8PCRel as they take an i8 argument. llvm-svn: 113158	2010-09-06 18:43:14 +00:00
Chris Lattner	4cfbcdc7b6	random cleanups llvm-svn: 113157	2010-09-06 18:32:06 +00:00
Chris Lattner	5cac0f71ca	update this. llvm-svn: 113116	2010-09-05 20:22:09 +00:00
Chris Lattner	eeba0c73e5	implement rdar://6653118 - fastisel should fold loads where possible. Since mem2reg isn't run at -O0, we get a ton of reloads from the stack, for example, before, this code: int foo(int x, int y, int z) { return x+y+z; } used to compile into: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx movl 4(%rsp), %esi addl %edx, %esi movl (%rsp), %edx addl %esi, %edx movl %edx, %eax addq $12, %rsp ret Now we produce: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx addl 4(%rsp), %edx ## Folded load addl (%rsp), %edx ## Folded load movl %edx, %eax addq $12, %rsp ret Fewer instructions and less register use = faster compiles. llvm-svn: 113102	2010-09-05 02:18:34 +00:00
Chris Lattner	65b48b5dfc	zap dead code. llvm-svn: 113073	2010-09-04 18:12:00 +00:00
Bruno Cardoso Lopes	c6accda78e	Remove the last bit of isShuffleMaskLegal checks and improve the comment regarding mmx shuffles llvm-svn: 113059	2010-09-04 02:58:56 +00:00
Bruno Cardoso Lopes	731bcc1abf	make explicit that we do not handle several mmx shuffles llvm-svn: 113058	2010-09-04 02:50:13 +00:00
Bruno Cardoso Lopes	20779ee157	Emit target specific nodes to handle palignr. Do not touch it for MMX versions yet. llvm-svn: 113056	2010-09-04 02:36:07 +00:00
Bruno Cardoso Lopes	cff7cd18ab	Emit target specific nodes to handle splats starting at zero indices llvm-svn: 113055	2010-09-04 02:02:14 +00:00
Bruno Cardoso Lopes	95759917eb	Emit target specific nodes for isPSHUFHWMask and isPSHUFLWMask llvm-svn: 113050	2010-09-04 01:36:45 +00:00
Bruno Cardoso Lopes	2b57008c72	Emit target specific nodes for isSHUFPMask llvm-svn: 113048	2010-09-04 01:22:57 +00:00
Bruno Cardoso Lopes	2f7af36134	Previous isMOVLMask matching already emits targets nodes, remove check llvm-svn: 113047	2010-09-04 00:50:08 +00:00
Bruno Cardoso Lopes	9f8e704151	One more check from the original isShuffleMaskLegal goes away llvm-svn: 113045	2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes	16959372bb	Remove a duplicated but useless check that I inserted in the previous commit. llvm-svn: 113044	2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes	44578d38d3	Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef llvm-svn: 113043	2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes	7829d0e74b	Remove check for unpckh mask llvm-svn: 113035	2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes	d1dacc57aa	Remove check for unpckl mask llvm-svn: 113034	2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes	207b9d6218	Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start checking each standalone condition and decide whether to emit target specific nodes or remove the condition if it's already matched before. llvm-svn: 113031	2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes	2bef20eda7	Reapply considered harmful part of r112934 and r112942. "Use target specific nodes instead of relying on unpckl and unpckh pattern fragments during isel time. Also place a depth limit in getShuffleScalarElt. llvm-svn: 113020	2010-09-03 22:09:41 +00:00
Dale Johannesen	367afb5a00	Remove the rest of the nonexistent 64-bit AVX instructions. Bruno, please review. llvm-svn: 113014	2010-09-03 21:23:00 +00:00
Bruno Cardoso Lopes	a750d994fe	Reapply last harmless part of r112934, the pattern fragment to match X86Unpcklpd llvm-svn: 113009	2010-09-03 20:44:26 +00:00
Bruno Cardoso Lopes	fe8717c573	Reintroduce a simple function refactoring done in r112934, also without any functionality changes llvm-svn: 113008	2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes	48e589b122	Reapply pieces of r112942 and r112934 which don't do functional changes llvm-svn: 113007	2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes	6979cf0808	Reapply Fix comment llvm-svn: 113006	2010-09-03 19:55:05 +00:00
Daniel Dunbar	6f3da24d70	Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced some infinite loop and select failures. - Apologies for eager reverting, but it's branch day. llvm-svn: 113000	2010-09-03 19:38:11 +00:00
Daniel Dunbar	f1aacd55c0	Revert r112938 "Fix comment", which depends on r112934, which introduced some infinite loop and select failures. llvm-svn: 112999	2010-09-03 19:38:08 +00:00
Daniel Dunbar	0ffe4db45c	Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment", which depends on r112934, which introduced some infinite loop and select failures. llvm-svn: 112998	2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes	d6634a5b2e	AVX supports neither mm operations nor their intrinsics. The AVX versions of PALIGN and PABS* should only exist for 128-bit. Remove the unnecessary stuff. llvm-svn: 112944	2010-09-03 02:08:45 +00:00
Bruno Cardoso Lopes	a85ec10483	Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment llvm-svn: 112942	2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes	adc6bca2dd	Fix comment llvm-svn: 112938	2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes	cce44678b4	- Use specific nodes to match unpckl masks. - Teach getShuffleScalarElt how to handle more target specific nodes, so the DAGCombine can make use of it. - Add another hack to avoid the node update problem during legalization. More description on the comments llvm-svn: 112934	2010-09-03 01:24:00 +00:00
Jakob Stoklund Olesen	08aede2538	Don't call Predicate_* from X86 target. llvm-svn: 112921	2010-09-03 00:35:18 +00:00
Anton Korobeynikov	a5a645559c	Properly emit __chkstk call instead of __alloca on non-mingw windows targets. Patch by Cameron Esfahani! llvm-svn: 112902	2010-09-02 23:03:46 +00:00
Bruno Cardoso Lopes	02a05a6a89	Move insertps mask decoding to header file llvm-svn: 112896	2010-09-02 22:43:39 +00:00
Anton Korobeynikov	a689c5b2c0	Revert win64 changes. They seem to be incomplete llvm-svn: 112885	2010-09-02 22:31:32 +00:00
Anton Korobeynikov	56291f7e53	Properly allocate win64 shadow reg area. Patch by Jan Sjodin! llvm-svn: 112875	2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes	814a69c330	Move decoding of insertps back to avoid unused warnings in x86 isel lowering, and fix movlhps/movhlps to decode 4 elements shuffles llvm-svn: 112869	2010-09-02 21:51:11 +00:00
Dan Gohman	3c9b5f394b	Don't narrow the load and store in a load+twiddle+store sequence unless there are clearly no stores between the load and the store. This fixes this miscompile reported as PR7833. This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is safe, but awkward to prove safe. Move it to X86's README.txt. llvm-svn: 112861	2010-09-02 21:18:42 +00:00
Bruno Cardoso Lopes	c79f50170a	Move x86 specific shuffle mask decoding to its own header, it's also going to be used elsewhere. Also trim trailing whitespaces llvm-svn: 112846	2010-09-02 18:40:13 +00:00
Bruno Cardoso Lopes	489613f1e5	Replace unpckl_undef and unpckh_undef matching with target specific opcodes llvm-svn: 112806	2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes	e4e4be3885	Move condition out to prepare for more matching llvm-svn: 112805	2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes	bf7fd146c7	Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it llvm-svn: 112804	2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes	6a7f634487	become more strict about when it's safe to use X86ISD::MOVLPS llvm-svn: 112799	2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes	04c25c15c7	Revert r112689, avoid those kinds of checks because they mess up mmx llvm-svn: 112760	2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes	fea81b4831	Using target specific nodes for shuffle nodes makes the mask check more strict, breaking some cases not checked in the testsuite, but also exposes some foldings not done before, as this example: movaps (%rdi), %xmm0 movaps (%rax), %xmm1 movaps %xmm0, %xmm2 movss %xmm1, %xmm2 shufps $36, %xmm2, %xmm0 now is generated as: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 movlps (%rax), %xmm1 shufps $36, %xmm1, %xmm0 llvm-svn: 112753	2010-09-01 22:33:20 +00:00
Bruno Cardoso Lopes	b3825216ce	Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment llvm-svn: 112694	2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes	6aaebe877b	minor change, simplify some logic llvm-svn: 112689	2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes	2b025707a2	Move some functions around so they can be used by another function to come llvm-svn: 112687	2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes	4b56d87290	Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes llvm-svn: 112661	2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes	61996ef835	Use x86 specific MOVSHDUP node and add more patterns to match it llvm-svn: 112657	2010-08-31 22:22:11 +00:00
Jakob Stoklund Olesen	33e9fce2d6	Make %EFLAGS unallocatable. No CCR virtual registers should exist, and %EFLAGS is used in ways that can surprise RegAllocFast. llvm-svn: 112650	2010-08-31 21:51:07 +00:00
Bruno Cardoso Lopes	5de15ce468	Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments llvm-svn: 112644	2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes	03e4c35302	Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes llvm-svn: 112642	2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes	dfd9dd5d75	Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles llvm-svn: 112570	2010-08-31 02:26:40 +00:00
Eli Friedman	f75de6eae7	A couple of small missed optimizations. llvm-svn: 112411	2010-08-29 05:07:40 +00:00
Chris Lattner	38ccc8b884	add a bunch more common shuffles to the instprinter. llvm-svn: 112397	2010-08-29 03:08:08 +00:00
Chris Lattner	7a05e6dca2	I have manually decoded the imm field of an insertps one too many times. This patch causes llc and llvm-mc (which both default to verbose-asm) to print out comments after a few common shuffle instructions which indicates the shuffle mask, e.g.: insertps $113, %xmm3, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm3[1] unpcklps %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1] pshufd $1, %xmm1, %xmm1 ## xmm1 = xmm1[1,0,0,0] This is carefully factored to keep the information extraction (of the shuffle mask) separate from the printing logic. I plan to move the extraction part out somewhere else at some point for other parts of the x86 backend that want to introspect on the behavior of shuffles. llvm-svn: 112387	2010-08-28 20:42:31 +00:00
Chris Lattner	94656b1c8c	fix the buildvector->insertp[sd] logic to not always create a redundant insertp[sd] $0, which is a noop. Before: _f32: ## @f32 pshufd $1, %xmm1, %xmm2 pshufd $1, %xmm0, %xmm3 addss %xmm2, %xmm3 addss %xmm1, %xmm0 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm3, %xmm0 ret after: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movdqa %xmm2, %xmm0 insertps $16, %xmm3, %xmm0 ret The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379	2010-08-28 17:59:08 +00:00
Chris Lattner	bcb6090ad0	fix the BuildVector -> unpcklps logic to not do pointless shuffles when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } We used to produce (with SSE2, SSE4.1+ uses insertps): _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret We now produce: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movaps %xmm2, %xmm0 unpcklps %xmm3, %xmm0 ret This implements rdar://8368414 llvm-svn: 112378	2010-08-28 17:28:30 +00:00
Chris Lattner	96db6e66f4	improve comments in the unpcklps generating logic, introduce a new EltStride variable instead of reusing NumElems variable for a non-obvious purpose. No functionality change. llvm-svn: 112377	2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes	a982aa24ef	Clean up the logic of vector shuffles -> vector shifts. Also teach this logic how to handle target specific shuffles if needed, this is necessary while searching recursively for zeroed scalar elements in vector shuffle operands. llvm-svn: 112348	2010-08-28 02:46:39 +00:00
Anton Korobeynikov	c0b36921c2	Properly handle passing of FP stuff to varargs function on Win64: value should be copied to the corresponding shadow reg as well. Patch by Cameron Esfahani! llvm-svn: 112262	2010-08-27 14:43:06 +00:00
Daniel Dunbar	1844a71e66	X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler. llvm-svn: 112250	2010-08-27 01:30:14 +00:00
Jim Grosbach	6a77066913	Simplify eliminateFrameIndex() interface back down now that PEI doesn't need to try to re-use scavenged frame index reference registers. rdar://8277890 llvm-svn: 112241	2010-08-26 23:32:16 +00:00
Bruno Cardoso Lopes	e25ba0c7c2	zap the now unused MVT::getIntVectorWithNumElements llvm-svn: 112218	2010-08-26 20:53:12 +00:00
Bob Wilson	a967c42a3d	Fix comment typos. llvm-svn: 112202	2010-08-26 18:08:11 +00:00
Chris Lattner	eb2cc0ce0e	implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1. llvm-svn: 112171	2010-08-26 05:51:22 +00:00
Chris Lattner	cc60609cb4	fix sse1 only codegen in x86-64 mode, which is something we apparently try to support. llvm-svn: 112168	2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes	184eaea855	Fix PR7748 without using microsoft extensions llvm-svn: 112128	2010-08-26 01:02:53 +00:00
Chris Lattner	aecf47a5cb	we should pattern match the SSE complex arithmetic ops. llvm-svn: 112109	2010-08-25 23:31:42 +00:00
Bruno Cardoso Lopes	d4085f6e91	Revert this for now, PUNPCKLDQ doesn't operate on v4f32 llvm-svn: 112090	2010-08-25 21:26:37 +00:00
Daniel Dunbar	3d148ac089	X86: Fix misencode of RI64mi8. This fixes OpenSSL / x86_64-apple-darwin10 / clang -O3. llvm-svn: 112089	2010-08-25 21:11:02 +00:00
Benjamin Kramer	f1f2133ac0	Remove dead recursive function. Yay for clang -Wunused-function. llvm-svn: 112060	2010-08-25 17:27:58 +00:00
Anton Korobeynikov	b3b53ecac0	Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there. Mark _alloca call as clobbering EFLAGS, otherwise some DCE might remove other flags-clobbering stuff (e.g. cmp instructions) occurring after _alloca call. llvm-svn: 112034	2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes	0770d25758	PUNPCKLDQ should also be used for v4f32 llvm-svn: 112020	2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes	2e45d522c1	teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests llvm-svn: 112017	2010-08-25 02:35:37 +00:00
Daniel Dunbar	1c8d777c93	MC/X86: Tweak imul recognition, previous hack only applies for the imul form taking immediates. llvm-svn: 111950	2010-08-24 19:37:56 +00:00
Daniel Dunbar	09392785b4	MC/X86: Add custom hack for recognizing "imul $12, %eax" and friends. llvm-svn: 111947	2010-08-24 19:24:18 +00:00
Daniel Dunbar	94b84a19b9	MC/X86: Warn on scale factors > 1 without index register, instead of erroring, for 'as' compatibility. llvm-svn: 111945	2010-08-24 19:13:38 +00:00
Dan Gohman	c88fda477a	Fix X86's isLegalAddressingMode to recognize that static addresses need not be RIP-relative in small mode. llvm-svn: 111917	2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes	758d7b1f5c	Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments llvm-svn: 111890	2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes	264d90fff7	Start using target specific nodes for shuffles: pshufhw and pshuflw llvm-svn: 111837	2010-08-23 20:41:02 +00:00
Gabor Greif	21fed6616c	tyops llvm-svn: 111835	2010-08-23 20:30:51 +00:00
Chris Lattner	58bd73a5a7	Add a new llvm.x86.int intrinsic, allowing access to the x86 int and int3 instructions. Patch by Peter Housel! llvm-svn: 111831	2010-08-23 19:39:25 +00:00
Chris Lattner	a42202e0e4	random improvement for variable shift codegen. llvm-svn: 111813	2010-08-23 17:30:29 +00:00
Anton Korobeynikov	cbbe4501df	Revert invalid r111792. Jump tables are not broken on x86-64 / coff, it's COFF emitter which does not support differences of two symbols (and needs to be fixed). GAS is pretty fine with code produced. llvm-svn: 111801	2010-08-23 07:38:51 +00:00
Michael J. Spencer	e87231232a	Workaround broken jump tables on x86-64 COFF. llvm-svn: 111792	2010-08-23 04:45:37 +00:00
Anton Korobeynikov	db9820ecaa	Use rip-rel addressing on win64 by default. For this we just default to the small pic code model. llvm-svn: 111741	2010-08-21 17:21:11 +00:00
Michael J. Spencer	377aa20e6e	MC: Add partial x86-64 support to COFF. llvm-svn: 111728	2010-08-21 05:58:13 +00:00
Dan Gohman	42ef669d81	Fix x86 fast-isel's cmp+branch folding to avoid folding when the comparison is in a different basic block from the branch. In such cases, the comparison's operands may not have initialized virtual registers available. llvm-svn: 111709	2010-08-21 02:32:36 +00:00
Bruno Cardoso Lopes	9f20e7a1bf	Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly llvm-svn: 111704	2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes	6f3b38a851	This is the first step towards refactoring the x86 vector shuffle code. The general idea here is to have a group of x86 target specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a bunch of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target specific nodes can change as we move forward with this work. llvm-svn: 111691	2010-08-20 22:55:05 +00:00
Chris Lattner	f547740d3f	fix PR7465, mishandling of lcall and ljmp: intersegment long call and jumps. llvm-svn: 111496	2010-08-19 01:18:43 +00:00
Chris Lattner	beb506eeed	minor progress towards fixing PR7465 llvm-svn: 111494	2010-08-19 01:00:34 +00:00
Bill Wendling	817e857b13	Marked with ATTRIBUTE_USED so that clang doesn't complain. llvm-svn: 111383	2010-08-18 18:40:57 +00:00
Chris Lattner	3e3e63efe1	remove some code that is dead now that lea's are modeled with segment registers. llvm-svn: 111343	2010-08-18 02:40:44 +00:00
Anton Korobeynikov	88c09879c7	Revert part of one of the prev. patches - tailjmp will follow later. llvm-svn: 111291	2010-08-17 21:08:28 +00:00
Anton Korobeynikov	231ab847ca	More fixes for win64: - Do not clobber al during variadic calls, this is AMD64 ABI-only feature - Emit wincall64, where necessary Patch by Cameron Esfahani! llvm-svn: 111289	2010-08-17 21:06:07 +00:00
Anton Korobeynikov	cd78af6e3c	Enable more win64 calls folding opportunities. Patch by Cameron Esfahani! llvm-svn: 111288	2010-08-17 21:06:01 +00:00
Eli Friedman	2444da0652	Comment out some broken/unused/useless instructions which mess up disassembly. llvm-svn: 111185	2010-08-16 21:18:51 +00:00
Eli Friedman	51ec745509	Don't attempt to SimplifyShortMoveForm in 64-bit mode. llvm-svn: 111182	2010-08-16 21:03:32 +00:00
Matt Fleming	f751d856f0	Hookup ELF support for X86. llvm-svn: 111173	2010-08-16 18:36:14 +00:00
Jakob Stoklund Olesen	2cd00737c0	Partially revert r111155. It looks like MSVC is calling an operator<() that clang says is unused. llvm-svn: 111167	2010-08-16 18:24:54 +00:00
Jakob Stoklund Olesen	b7f872197a	Remove unused functions. llvm-svn: 111155	2010-08-16 17:18:18 +00:00
Argyrios Kyrtzidis	d0fcc9a818	Revert r111082. No warnings for this common pattern. llvm-svn: 111102	2010-08-15 10:27:23 +00:00
Eric Christopher	54194bd127	Rework how the non-sse2 memory barrier is lowered so that the encoding is correct for the built-in assembler. Based on a patch from Chris. llvm-svn: 111083	2010-08-14 21:51:50 +00:00
Argyrios Kyrtzidis	7c09ddf0ae	Add ATTRIBUTE_UNUSED to methods that are not supposed to be used. llvm-svn: 111082	2010-08-14 21:35:10 +00:00
Chris Lattner	2f6c3434ac	improve indentation llvm-svn: 111073	2010-08-14 17:26:09 +00:00
Bruno Cardoso Lopes	160be2936b	Add comments to some pattern fragments in x86 llvm-svn: 111041	2010-08-13 20:39:01 +00:00
Dale Johannesen	8d3c89e765	Revert 110491. While not wrong, it was based on a misanalysis and is undesirable. llvm-svn: 111028	2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes	081861b6b7	Fix comment to reflect code, and remove an unused argument llvm-svn: 111022	2010-08-13 17:50:47 +00:00
Bruno Cardoso Lopes	1187e3f09b	Improve comment to make explicit why not to touch this code before JIT goes MC llvm-svn: 111021	2010-08-13 17:44:10 +00:00
Eric Christopher	6e5b67ccc4	Revert last patch and r110954 as I meant to. llvm-svn: 111001	2010-08-13 02:37:50 +00:00
Eric Christopher	5e027fe113	Revert r110954 for now, pseudo instructions can't make it through to the JIT. llvm-svn: 111000	2010-08-13 02:30:00 +00:00
Bruno Cardoso Lopes	cc20fe5937	Some small clean-up: use of pseudo instructions llvm-svn: 110954	2010-08-12 20:55:18 +00:00
Bruno Cardoso Lopes	7f704b31a9	- Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency and in the future we remove if it's unnecessary. - Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too. - Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX. - Add a testcase for a simple 128-bit zero vector creation. llvm-svn: 110946	2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes	7e1a30c0d3	Define AVX 128-bit pattern versions of SET0PS/PD. llvm-svn: 110937	2010-08-12 18:20:59 +00:00
Bruno Cardoso Lopes	1401e040eb	Fix comment order llvm-svn: 110898	2010-08-12 02:08:52 +00:00
Bruno Cardoso Lopes	7306c86886	Begin to support some vector operations for AVX 256-bit instructions. The long term goal here is to be able to match enough of vector_shuffle and build_vector so all avx intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step, support building zeroed vectors. llvm-svn: 110897	2010-08-12 02:06:36 +00:00
Daniel Dunbar	7d7b4d1b0f	MC/X86/AsmParser: Give an explicit error message when we reject an instruction because it could have an ambiguous suffix. llvm-svn: 110890	2010-08-12 00:55:42 +00:00
Daniel Dunbar	2ecc3bb4f7	MC/AsmParser: Push the burden of emitting diagnostics about unmatched instructions onto the target specific parser, which can do a better job. llvm-svn: 110889	2010-08-12 00:55:38 +00:00
Daniel Dunbar	167b9d7f30	tblgen/AsmMatcher: Always emit the match function as 'MatchInstructionImpl', target specific parsers can adapt the TargetAsmParser to this. llvm-svn: 110888	2010-08-12 00:55:32 +00:00
Jakob Stoklund Olesen	9c473e46f3	Fix <rdar://problem/8282498> even if it doesn't reproduce on trunk. When a register is defined by a partial load: %reg1234:sub_32 = MOV32mr <fi#-1>; GR64:%reg1234 That load cannot be folded into an instruction using the full 64-bit register. It would become a 64-bit load. This is related to the recent change to have isLoadFromStackSlot return false on a sub-register load. llvm-svn: 110874	2010-08-11 23:08:22 +00:00
Dan Gohman	5531aa4de1	Use ISD::ADD instead of ISD::SUB with a negated constant. This avoids trouble if the return type of TD->getPointerSize() is changed to something which doesn't promote to a signed type, and is simpler anyway. Also, use getCopyFromReg instead of getRegister to read a physical register's value. llvm-svn: 110835	2010-08-11 18:14:00 +00:00
Daniel Dunbar	ebace2248f	MCAsmParser: Add dump() hook to MCParsedAsmOperand. llvm-svn: 110790	2010-08-11 06:37:04 +00:00
Bruno Cardoso Lopes	91d61df3eb	Add AVX matching patterns to Packed Bit Test intrinsics. Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests coming with the other 256 intrinsics tests. llvm-svn: 110744	2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes	39f215bd33	Add AVX movnt{pd,ps,dq} 256-bit intrinsics llvm-svn: 110650	2010-08-10 02:49:24 +00:00
Bruno Cardoso Lopes	cedf23dfe5	Add AVX movmsk 256-bit intrinsics llvm-svn: 110648	2010-08-10 02:34:56 +00:00
Bruno Cardoso Lopes	85da72a88f	Support AVX 256-bit load and store intrinsics llvm-svn: 110645	2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes	b2b6b65b86	Patterns to match AVX cmp instructions llvm-svn: 110633	2010-08-10 00:13:20 +00:00
Bruno Cardoso Lopes	001d6fa174	Add matching patterns for vblend AVX intrinsics llvm-svn: 110630	2010-08-10 00:02:05 +00:00
Eric Christopher	b9627ee79b	Wording. llvm-svn: 110618	2010-08-09 22:52:47 +00:00
Bruno Cardoso Lopes	685cb32d2b	Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics llvm-svn: 110608	2010-08-09 21:51:56 +00:00
Bruno Cardoso Lopes	3e9b567643	Add patterns to AVX conversion instructions. Do that instead of declaring more instructions whenever possible, more coming llvm-svn: 110605	2010-08-09 21:24:59 +00:00
Oscar Fuentes	212cfde6ec	CMake: eliminated unnecessary target_link_libraries. Next time the build is broken due to wrong library dependencies, just try building again (if you are on some Unix and are building all LLVM targets) or ask someone to commit the regenerated LLVMLibDeps.cmake. llvm-svn: 110593	2010-08-09 20:33:08 +00:00
Bruno Cardoso Lopes	c33940b3aa	Memory version of vcvtdq2pd intrinsic llvm-svn: 110582	2010-08-09 18:20:14 +00:00
Bruno Cardoso Lopes	828f6aeced	Patterns to match vinsert, vbroadcast, vmovmask and vcvtdq2pd AVX intrinsics llvm-svn: 110580	2010-08-09 18:03:43 +00:00
Dale Johannesen	a3bd31a923	Use sdmem and sse_load_f64 (etc.) for the vector form of CMPSD (etc.) Matching a 128-bit memory operand is wrong, the instruction uses only 64 bits (same as ADDSD etc.) 8193553. llvm-svn: 110491	2010-08-07 00:33:42 +00:00
Bruno Cardoso Lopes	93cc666a58	Patterns to match AVX 256-bit vzero intrinsics llvm-svn: 110480	2010-08-06 22:10:01 +00:00
Bruno Cardoso Lopes	3d6a3a0ede	Patterns to match AVX 256-bit permutation intrinsics llvm-svn: 110468	2010-08-06 20:03:27 +00:00
Owen Anderson	a7aed18624	Reapply r110396, with fixes to appease the Linux buildbot gods. llvm-svn: 110460	2010-08-06 18:33:48 +00:00
Bruno Cardoso Lopes	1cf067cb3d	Patterns to match AVX 256-bit horizontal arithmetic intrinsics llvm-svn: 110427	2010-08-06 02:10:30 +00:00
Bruno Cardoso Lopes	b9ad94fbf7	Patterns to match AVX 256-bit arithmetic intrinsics llvm-svn: 110425	2010-08-06 01:52:29 +00:00
Owen Anderson	bda59bd247	Revert r110396 to fix buildbots. llvm-svn: 110410	2010-08-06 00:23:35 +00:00
Eric Christopher	e1fb772aa5	Add an option to always emit realignment code for a particular module. llvm-svn: 110404	2010-08-05 23:57:43 +00:00
Owen Anderson	755aceb5d0	Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396	2010-08-05 23:42:04 +00:00
Bruno Cardoso Lopes	77954bdf7a	Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX llvm-svn: 110394	2010-08-05 23:35:51 +00:00
Eric Christopher	4d9c3400f3	Handle the memory barrier pseudo that goes to nothing for the JIT. llvm-svn: 110371	2010-08-05 20:04:36 +00:00
Eric Christopher	7fd06eb8ce	Set hasSideEffects on the 64-bit no-sse memory barrier. llvm-svn: 110369	2010-08-05 19:54:59 +00:00
Eric Christopher	32f5d6b9be	Be a little bit more specific about target for the memory barrier instructions. llvm-svn: 110360	2010-08-05 18:36:20 +00:00
Eric Christopher	4abffad17c	Handle the pseudo in MCInstLower. llvm-svn: 110359	2010-08-05 18:34:30 +00:00
Eric Christopher	2db8464282	Make x86-64 membarriers work without sse and clean up some of the uses. llvm-svn: 110274	2010-08-04 23:03:04 +00:00
Eli Friedman	39d0f57cab	PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268	2010-08-04 22:40:58 +00:00
Devang Patel	2bf0f3ceff	Add DEBUG message. llvm-svn: 110224	2010-08-04 18:06:05 +00:00
Benjamin Kramer	a53a4eefa6	Enable COFF writer on mingw32 and cygwin. llvm-svn: 110200	2010-08-04 15:32:40 +00:00
Benjamin Kramer	61c8e6dc16	Print an error message when someone tries -integrated-as on an unsupported target. - The COFF backend doesn't support MingW/Cygwin at the moment, it'll report an error, but it's still much better than random assertions from the MachO backend. - We want to make ELF the default eventually, it's what the majority of targets use. llvm-svn: 110197	2010-08-04 13:16:30 +00:00
Chris Lattner	53befe7bc1	fix a win64 encoding problem, patch by Cameron Esfahani! llvm-svn: 110164	2010-08-03 22:49:22 +00:00
Michael J. Spencer	ed80f361b3	MC: Remove HasAbsolutizedSet from WindowsX86AsmBackend. llvm-svn: 109949	2010-07-31 07:21:44 +00:00
Michael J. Spencer	6b4925e223	Add relax all support to the COFF object streamer. llvm-svn: 109947	2010-07-31 06:22:29 +00:00
Bruno Cardoso Lopes	349165b48f	Support all 128-bit AVX vector intrinsics. Most part of them I already declared during the addition of the assembler support, the additional changes are: - Add missing intrinsics - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file. - Duplicate some patterns to AVX mode. - Step into PCMPEST/PCMPIST custom inserter and add AVX versions. llvm-svn: 109878	2010-07-30 19:54:33 +00:00
Bruno Cardoso Lopes	405405bbfe	Fix typo! llvm-svn: 109877	2010-07-30 19:41:24 +00:00
Jakob Stoklund Olesen	ba0e124aaf	Revert r109652, and remove the offending assert in loadRegFromStackSlot instead. We do sometimes load from a too small stack slot when dealing with x86 arguments (varargs and smaller-than-32-bit args). It looks like we know what we are doing in those cases, so I am going to remove the assert instead of artificially enlarging stack slot sizes. The assert in storeRegToStackSlot stays in. We don't want to write beyond the bounds of a stack slot. llvm-svn: 109764	2010-07-29 17:42:27 +00:00
Jakob Stoklund Olesen	f2234fbe70	Create a fixed stack object for varargs that is as large as any register. The size of this object isn't used for anything - technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot, and fixes PR7735. llvm-svn: 109652	2010-07-28 20:55:38 +00:00
Nate Begeman	53afc8f06a	Implement a vectorized algorithm for <16 x i8> << <16 x i8> This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566	2010-07-28 00:21:48 +00:00
Nate Begeman	269a6da023	~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549	2010-07-27 22:37:06 +00:00
Michael J. Spencer	f8270bdb2d	Make MC use Windows COFF on Windows and add tests. llvm-svn: 109494	2010-07-27 06:46:15 +00:00
Jakob Stoklund Olesen	96a890a7f8	The isLoadFromStackSlot and isStoreToStackSlot have no way of reporting subregister operands like this: %reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8) Make them return false when subreg operands are present. VirtRegRewriter is making bad assumptions otherwise. This fixes PR7713. llvm-svn: 109489	2010-07-27 04:17:01 +00:00
Jakob Stoklund Olesen	c3c05ed02e	Add assertions that expose the PR7713 miscompilation: Accessing a stack slot with a too-big register class. llvm-svn: 109488	2010-07-27 04:16:58 +00:00
Evan Cheng	d4218b8793	On x86, f32 / f64 nodes share the same registers as 128-bit vector values. llvm-svn: 109450	2010-07-26 21:50:05 +00:00
Bruno Cardoso Lopes	36c2ea6c7a	Temporary hack to let codegen assert or generate poor code in case we are using AVX and no AVX version of the desired instruction is present, this is better for incremental dev (without fallbacks it's easier to spot what's missing). Not sure this is the best hack though (we can also disable all HasSSE* predicates by dynamically marking them 'false' if AVX is present) llvm-svn: 109434	2010-07-26 21:01:18 +00:00
Evan Cheng	37b740c4bf	Add an ILP scheduler. This is a register pressure aware scheduler that's appropriate for targets without detailed instruction itineraries. The scheduler schedules for increased instruction level parallelism in low register pressure situations; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300	2010-07-24 00:39:05 +00:00
Bruno Cardoso Lopes	306a1f9721	Support x86 "eiz" and "riz" pseudo index registers in the assembler. llvm-svn: 109295	2010-07-24 00:06:39 +00:00
Bruno Cardoso Lopes	d65cd1d581	Remove trailing whitespace llvm-svn: 109276	2010-07-23 22:15:26 +00:00
Bruno Cardoso Lopes	ea0e05a3ce	Add AVX version of CLMUL instructions llvm-svn: 109248	2010-07-23 18:41:12 +00:00
Bruno Cardoso Lopes	d618c8ac64	Declare CLMUL as a subtarget feature llvm-svn: 109207	2010-07-23 01:22:45 +00:00
Bruno Cardoso Lopes	09dc24beac	Add x86 CLMUL (Carry-less multiplication) cpu feature llvm-svn: 109206	2010-07-23 01:17:51 +00:00
Bruno Cardoso Lopes	acd9230b1b	Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual llvm-svn: 109204	2010-07-23 00:54:35 +00:00
Dale Johannesen	f2d75670b7	The only supported calling convention for X86-64 uses SSE, so we can't return floating point values if this is disabled. Detect this error for clang. With SSE1 only, f64 is a problem; it can be done, but neither llvm-gcc nor clang has ever generated correct code for it. Since nobody noticed this I think it's OK to treat it as an error for now. This also handles SSE-sized vectors of floating point. 8207686, 8204109. llvm-svn: 109201	2010-07-23 00:30:35 +00:00
Bruno Cardoso Lopes	e29e389678	Fix some AVX instructions which didn't have the HasAVX prefix. And also a problem with PINSRW, which was totally wrong because of a typo I introduced previously llvm-svn: 109198	2010-07-23 00:14:54 +00:00
Bruno Cardoso Lopes	0710c74f29	Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA. But we still miss instructions from FMA3 and CLMUL specific feature flags, which are now the next step llvm-svn: 109168	2010-07-22 21:18:49 +00:00
Chris Lattner	8f3adc9057	remove the JIT "NeedsExactSize" feature and supporting logic. llvm-svn: 109167	2010-07-22 21:17:55 +00:00
Chris Lattner	b3f608bbba	X86MCInstLower now depends on AsmPrinter being around. llvm-svn: 109154	2010-07-22 21:10:04 +00:00
Chris Lattner	083be4d384	instead of migrating it to the MC instruction encoder, just rip out the implementation of X86InstrInfo::GetInstSizeInBytes. The code being ripped out just implemented a copy and hacked up version of the (old) instruction encoder, and is buggy and terrible in other ways. Since "GetInstSizeInBytes" is really only there to support the JIT's "NeedsExactSize" hook (which no one is using), just rip out the code. I will rip out the NeedsExactSize hook next. This resolves rdar://7617809 - switch X86InstrInfo::GetInstSizeInBytes to use X86MCCodeEmitter llvm-svn: 109149	2010-07-22 21:05:13 +00:00
Chandler Carruth	3180f9f55f	Attempt to fix linking issues with CMake. Please review other CMake users, especially on other platforms. Is there a better way to fix this. llvm-svn: 109084	2010-07-22 06:27:45 +00:00
Eric Christopher	9a77382685	Custom lower the memory barrier instructions and add support for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078	2010-07-22 02:48:34 +00:00

6558 Commits