llvm-project

Commit Graph

Author	SHA1	Message	Date
Dale Johannesen	ab60ae3cf3	Disable these tests for now; it's not obvious why they fail on Linux. llvm-svn: 115257	2010-10-01 00:59:21 +00:00
Dale Johannesen	c6f17f7420	Make test not sensitive to register choice. llvm-svn: 115250	2010-10-01 00:16:17 +00:00
Dale Johannesen	dd224d2333	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result cast back to x86_mmx, provided the result feeds into a previously existing x86_mmx operation. The point of all this is to prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
NAKAMURA Takumi	bb995ae261	test/CodeGen/X86/sibcall.ll: Add explicit triplets and remove XFAIL: apple-darwin8. llvm-svn: 115215	2010-09-30 22:02:06 +00:00
Jakob Stoklund Olesen	eb12f49fb7	Try again to disable critical edge splitting in CodeGenPrepare. The bug that broke i386 linux has been fixed in r115191. llvm-svn: 115204	2010-09-30 20:51:52 +00:00
Jakob Stoklund Olesen	665aa6efcc	When isel is emitting instructions for an x86 target without CMOV, the CFG is edited during emission. If the basic block ends in a switch that gets lowered to a jump table, any phis at the default edge were getting updated wrong. The jump table data structure keeps a pointer to the header blocks that wasn't getting updated after the MBB is split. This bug was exposed on 32-bit Linux when disabling critical edge splitting in codegen prepare. The fix is to update stale MBB pointers whenever a block is split during emission. llvm-svn: 115191	2010-09-30 19:44:31 +00:00
Bill Wendling	cc91601211	And remove r114997's test. llvm-svn: 115003	2010-09-28 23:24:18 +00:00
Bill Wendling	b0b2c57149	Revert r114997. It was causing a failure on darwin10-selfhost. llvm-svn: 115002	2010-09-28 23:11:55 +00:00
Bill Wendling	d848beb1e5	Fix a FIXME. _foo.eh symbols are currently always exported so that the linker knows about them. This is not necessary on 10.6 and later. llvm-svn: 114997	2010-09-28 22:36:56 +00:00
Jakob Stoklund Olesen	415a7a6fec	Revert "Disable codegen prepare critical edge splitting. Machine instruction passes now" This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost. It seems there is a downstream bug that is exposed by -cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back in. Note that the changes to tailcallfp2.ll are not reverted. They were good and are required. llvm-svn: 114859	2010-09-27 18:43:48 +00:00
Evan Cheng	794aaa79e2	Disable codegen prepare critical edge splitting. Machine instruction passes now break critical edges on demand. llvm-svn: 114633	2010-09-23 06:55:34 +00:00
Owen Anderson	3231d13ddd	A select between a constant and zero, when fed by a bit test, can be efficiently lowered using a series of shifts. Fixes <rdar://problem/8285015>. llvm-svn: 114599	2010-09-22 22:58:22 +00:00
Cameron Esfahani	bbb9287080	Fix PR8201: Update the code to call via X86::CALL64pcrel32 in the 64-bit case. llvm-svn: 114597	2010-09-22 22:35:21 +00:00
Chris Lattner	bd85725341	Fix an inconsistency in the x86 backend that led it to reject "calll foo" on x86-32: 32-bit calls were named "call" not "calll". 64-bit calls were correctly named "callq", so this only impacted x86-32. This fixes rdar://8456370 - llvm-mc rejects 'calll' This also exposes that mingw/64 is generating a 32-bit call instead of a 64-bit call, I will file a bugzilla. llvm-svn: 114534	2010-09-22 05:49:14 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	505af598d0	linux has a different stack alignment than the mac, relax this a bit. llvm-svn: 114519	2010-09-22 00:46:26 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	07827ba978	revert r114386 now that address modes work correctly, we get a nice call through gs-relative memory now. llvm-svn: 114510	2010-09-22 00:11:31 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Chris Lattner	0cefa51114	filecheckize llvm-svn: 114507	2010-09-21 23:57:27 +00:00
Devang Patel	d92f42d1d0	Use FileCheck llvm-svn: 114475	2010-09-21 20:50:32 +00:00
Owen Anderson	f4b1a5bdc4	When adding the carry bit to another value on X86, exploit the fact that the carry-materialization (sbbl x, x) sets the registers to 0 or ~0. Combined with two's complement arithmetic, we can fold the intermediate AND and the ADD into a single SUB. This fixes <rdar://problem/8449754>. llvm-svn: 114460	2010-09-21 18:41:19 +00:00
Chris Lattner	bb0a1c44bf	fix rdar://8453210, a crash handling a call through a GS relative load. For now, just disable folding the load into the call. llvm-svn: 114386	2010-09-21 03:37:00 +00:00
Evan Cheng	f3e9a48584	Enable machine sinking critical edge splitting. e.g. define double @foo(double %x, double %y, i1 %c) nounwind { %a = fdiv double %x, 3.2 %z = select i1 %c, double %a, double %y ret double %z } Was: _foo: divsd LCPI0_0(%rip), %xmm0 testb $1, %dil jne LBB0_2 movaps %xmm1, %xmm0 LBB0_2: ret Now: _foo: testb $1, %dil je LBB0_2 divsd LCPI0_0(%rip), %xmm0 ret LBB0_2: movaps %xmm1, %xmm0 ret This avoids the divsd when early exit is taken. rdar://8454886 llvm-svn: 114372	2010-09-20 22:52:00 +00:00
Owen Anderson	272ff94916	When TCO is turned on, it is possible to end up with aliasing FrameIndex's. Therefore, CombinerAA cannot assume that different FrameIndex's never alias, but can instead use MachineFrameInfo to get the actual offsets of these slots and check for actual aliasing. This fixes CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll and CodeGen/X86/tailcallstack64.ll when CombinerAA is enabled, modulo a different register allocation sequence. llvm-svn: 114348	2010-09-20 20:39:59 +00:00
NAKAMURA Takumi	b912c27fc9	test/CodeGen/X86: Add explicit triplet -mtriple=i686-linux to 3 tests incompatible to Win32 codegen. r114297 raises 3 failures. They might fail also on mingw. llvm-svn: 114317	2010-09-19 21:58:55 +00:00
Owen Anderson	b92b13d8a0	Invert the logic of reachesChainWithoutSideEffects(). What we want to check is that there is NO path to the destination containing side effects, not that SOME path contains no side effects. In practice, this only manifests with CombinerAA enabled, because otherwise the chain has little to no branching, so "any" is effectively equivalent to "all". llvm-svn: 114268	2010-09-18 04:45:14 +00:00
Evan Cheng	e53ab6dffc	Teach machine sink to 1) Do forward copy propagation. This makes it easier to estimate the cost of the instruction being sunk. 2) Break critical edges on demand, including cases where the value is used by PHI nodes. Critical edge splitting is not yet enabled by default. llvm-svn: 114227	2010-09-17 22:28:18 +00:00
Dan Gohman	534db8a5c8	Avoid emitting a PIC base register if no PIC addresses are needed. This fixes rdar://8396318. llvm-svn: 114201	2010-09-17 20:24:24 +00:00
Dale Johannesen	f95f59a0c2	When substituting sunkaddrs into indirect arguments of an asm, we were walking the asm arguments once and stashing their Values. This is wrong because the same memory location can be in the list twice, and if the first one has a sunkaddr substituted, the stashed value for the second one will be wrong (use-after-free). PR 8154. llvm-svn: 114104	2010-09-16 18:30:55 +00:00
Bruno Cardoso Lopes	e8501a468c	Add one more pattern to fallback movddup llvm-svn: 113522	2010-09-09 18:48:34 +00:00
Devang Patel	3f4abf397c	remove these tests for now. llvm-svn: 113293	2010-09-07 22:03:44 +00:00
Devang Patel	b0af23a1f6	There is no need to force target if the test is going to run on other x86 platforms. llvm-svn: 113285	2010-09-07 20:59:09 +00:00
Devang Patel	e50b23e223	Fix command line used to link these test cases. llvm-svn: 113237	2010-09-07 18:17:56 +00:00
Devang Patel	9dc0e5be58	Reintroduce dbg-declare tests. llvm-svn: 113232	2010-09-07 18:01:49 +00:00
Devang Patel	688338eec3	Remove last three tests. I need to make them independent of my setup. llvm-svn: 113213	2010-09-07 17:08:57 +00:00
Devang Patel	55a3bab0d2	Add a test case to check handling of dbg-declare during hybrid mode where we begin using fast-isel but switch back to DAG building at some point. llvm-svn: 113210	2010-09-07 17:03:44 +00:00
Devang Patel	29a775adf1	Add a test case to check handling of dbg-declare by selection DAG builder. llvm-svn: 113209	2010-09-07 16:56:35 +00:00
Devang Patel	184c81c3e2	Add a test case to check handling of dbg-declare by fast-isel. llvm-svn: 113208	2010-09-07 16:40:53 +00:00
Chris Lattner	eeba0c73e5	implement rdar://6653118 - fastisel should fold loads where possible. Since mem2reg isn't run at -O0, we get a ton of reloads from the stack, for example, before, this code: int foo(int x, int y, int z) { return x+y+z; } used to compile into: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx movl 4(%rsp), %esi addl %edx, %esi movl (%rsp), %edx addl %esi, %edx movl %edx, %eax addq $12, %rsp ret Now we produce: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx addl 4(%rsp), %edx ## Folded load addl (%rsp), %edx ## Folded load movl %edx, %eax addq $12, %rsp ret Fewer instructions and less register use = faster compiles. llvm-svn: 113102	2010-09-05 02:18:34 +00:00
Dale Johannesen	367afb5a00	Remove the rest of the nonexistent 64-bit AVX instructions. Bruno, please review. llvm-svn: 113014	2010-09-03 21:23:00 +00:00
NAKAMURA Takumi	24d039ebe3	test/CodeGen/X86: Add explicit -mtriple=(i686|x86_64)-linux for Win32 host. llvm-svn: 112947	2010-09-03 03:24:08 +00:00
Bruno Cardoso Lopes	d6634a5b2e	AVX doesn't support mm operations, nor their intrinsics. The AVX versions of PALIGN and PABS* should only exist for 128-bit. Remove the unnecessary stuff. llvm-svn: 112944	2010-09-03 02:08:45 +00:00
Anton Korobeynikov	a5a645559c	Properly emit __chkstk call instead of __alloca on non-mingw windows targets. Patch by Cameron Esfahani! llvm-svn: 112902	2010-09-02 23:03:46 +00:00
Dan Gohman	3c9b5f394b	Don't narrow the load and store in a load+twiddle+store sequence unless there are clearly no stores between the load and the store. This fixes this miscompile reported as PR7833. This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is safe, but awkward to prove safe. Move it to X86's README.txt. llvm-svn: 112861	2010-09-02 21:18:42 +00:00
NAKAMURA Takumi	a224e5563e	test/loop-strength-reduce4: Add explicit triplet for Win32 host. llvm-svn: 112802	2010-09-02 03:45:58 +00:00
NAKAMURA Takumi	54ce546865	test/twoaddr-coalesce: Do not use @main. Win32 codegen emits an implicit call to __main into it, causing the test to fail. llvm-svn: 112801	2010-09-02 03:45:51 +00:00
Bruno Cardoso Lopes	fea81b4831	Using target specific nodes for shuffle nodes makes the mask check more strict, breaking some cases not checked in the testsuite, but also exposes some foldings not done before, as this example: movaps (%rdi), %xmm0 movaps (%rax), %xmm1 movaps %xmm0, %xmm2 movss %xmm1, %xmm2 shufps $36, %xmm2, %xmm0 now is generated as: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 movlps (%rax), %xmm1 shufps $36, %xmm1, %xmm0 llvm-svn: 112753	2010-09-01 22:33:20 +00:00
Jakob Stoklund Olesen	4b6fd48bba	Teach RemoveCopyByCommutingDef to check all aliases, not just subregisters. This caused a miscompilation in WebKit where %RAX had conflicting defs when RemoveCopyByCommutingDef was commuting a %EAX use. llvm-svn: 112751	2010-09-01 22:15:35 +00:00
Dan Gohman	110ed64fbb	Revert 112442 and 112440 until the compile time problems introduced by 112440 are resolved. llvm-svn: 112692	2010-09-01 01:45:53 +00:00
Chris Lattner	34bfab0ad5	two changes: 1) nuke ConstDataCoalSection, which is dead. 2) revise my previous patch for rdar://8018335, which was completely wrong. Specifically, it doesn't make sense to mark __TEXT,__const_coal as PURE_INSTRUCTIONS, because it is for readonly data. templates (it turns out) go to const_coal_nt. The real fix for rdar://8018335 was to give ConstTextCoalSection a section kind of ReadOnly instead of Text. llvm-svn: 112496	2010-08-30 18:12:35 +00:00
Duncan Sands	68c30907cc	Correct bogus module triple specifications. llvm-svn: 112469	2010-08-30 10:48:29 +00:00
Dan Gohman	3a08ed7904	Make IVUsers iterative instead of recursive. This has the side effect of reversing the order of most of IVUser's results. llvm-svn: 112442	2010-08-29 16:40:03 +00:00
Dan Gohman	6665550bca	Make this test less dependent on register allocation choices. llvm-svn: 112426	2010-08-29 14:49:42 +00:00
Chris Lattner	c2887bc283	merge a bunch of shuffle tests into sse2.ll llvm-svn: 112398	2010-08-29 03:19:04 +00:00
Chris Lattner	b1ff978406	add some nounwind's llvm-svn: 112396	2010-08-29 03:07:47 +00:00
Chris Lattner	94656b1c8c	fix the buildvector->insertp[sd] logic to not always create a redundant insertp[sd] $0, which is a noop. Before: _f32: ## @f32 pshufd $1, %xmm1, %xmm2 pshufd $1, %xmm0, %xmm3 addss %xmm2, %xmm3 addss %xmm1, %xmm0 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm3, %xmm0 ret after: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movdqa %xmm2, %xmm0 insertps $16, %xmm3, %xmm0 ret The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379	2010-08-28 17:59:08 +00:00
Chris Lattner	bcb6090ad0	fix the BuildVector -> unpcklps logic to not do pointless shuffles when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } We used to produce (with SSE2, SSE4.1+ uses insertps): _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret We now produce: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movaps %xmm2, %xmm0 unpcklps %xmm3, %xmm0 ret This implements rdar://8368414 llvm-svn: 112378	2010-08-28 17:28:30 +00:00
Dan Gohman	e06905d1f0	Completely disable tail calls when fast-isel is enabled, as fast-isel doesn't currently support dealing with this. llvm-svn: 112341	2010-08-28 00:51:03 +00:00
Chris Lattner	7413e87b6d	get this test passing on linux builders. llvm-svn: 112280	2010-08-27 18:49:08 +00:00
Daniel Dunbar	1844a71e66	X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler. llvm-svn: 112250	2010-08-27 01:30:14 +00:00
Chris Lattner	af23e9a798	Add a hackaround for PR7993 which is causing failures on x86 builders that lack sse2. llvm-svn: 112175	2010-08-26 06:57:07 +00:00
Chris Lattner	66afba7aa4	I think enough general codegen bugs are fixed to allow this to work on random hosts, lets see! llvm-svn: 112172	2010-08-26 05:52:42 +00:00
Chris Lattner	eb2cc0ce0e	implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1. llvm-svn: 112171	2010-08-26 05:51:22 +00:00
Chris Lattner	825294b85f	Make sure this forces the x86 targets llvm-svn: 112169	2010-08-26 05:25:05 +00:00
Chris Lattner	cc60609cb4	fix sse1 only codegen in x86-64 mode, which is something we apparently try to support. llvm-svn: 112168	2010-08-26 05:24:29 +00:00
Chris Lattner	c7fb446a9d	temporarily disable this, which started failing on the llvm-i686-linux builder. I will investigate tonight. llvm-svn: 112113	2010-08-25 23:43:14 +00:00
Chris Lattner	75ff053497	Change handling of illegal vector types to widen when possible instead of expanding: e.g. <2 x float> -> <4 x float> instead of -> 2 floats. This affects two places in the code: handling cross block values and handling function return and arguments. Since vectors are already widened by legalizetypes, this gives us much better code and unblocks x86-64 abi and SPU abi work. For example, this (which is a silly example of a cross-block value): define <4 x float> @test2(<4 x float> %A) nounwind { %B = shufflevector <4 x float> %A, <4 x float> undef, <2 x i32> <i32 0, i32 1> %C = fadd <2 x float> %B, %B br label %BB BB: %D = fadd <2 x float> %C, %C %E = shufflevector <2 x float> %D, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef> ret <4 x float> %E } Now compiles into: _test2: ## @test2 ## BB#0: addps %xmm0, %xmm0 addps %xmm0, %xmm0 ret previously it compiled into: _test2: ## @test2 ## BB#0: addps %xmm0, %xmm0 pshufd $1, %xmm0, %xmm1 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm1, %xmm0 addps %xmm0, %xmm0 ret This implements rdar://8230384 llvm-svn: 112101	2010-08-25 22:49:25 +00:00
Bruno Cardoso Lopes	0bc919fa35	Convert test to use filecheck and make it more specific llvm-svn: 112016	2010-08-25 01:47:16 +00:00
Dan Gohman	c88fda477a	Fix X86's isLegalAddressingMode to recognize that static addresses need not be RIP-relative in small mode. llvm-svn: 111917	2010-08-24 15:55:12 +00:00
Chris Lattner	58bd73a5a7	Add a new llvm.x86.int intrinsic, allowing access to the x86 int and int3 instructions. Patch by Peter Housel! llvm-svn: 111831	2010-08-23 19:39:25 +00:00
Dan Gohman	42ef669d81	Fix x86 fast-isel's cmp+branch folding to avoid folding when the comparison is in a different basic block from the branch. In such cases, the comparison's operands may not have initialized virtual registers available. llvm-svn: 111709	2010-08-21 02:32:36 +00:00
Evan Cheng	361b9be7c6	It's possible to sink a def if its local uses are PHI's. llvm-svn: 111537	2010-08-19 18:33:29 +00:00
Dan Gohman	2470818942	When sending stats output to stdout for grepping, don't emit normal output to standard output also. llvm-svn: 111401	2010-08-18 20:32:46 +00:00
Dan Gohman	ed2b005842	Tweak IVUsers' concept of "interesting" to exclude add recurrences where the step value is an induction variable from an outer loop, to avoid trouble trying to re-expand such expressions. This effectively hides such expressions from indvars and lsr, which prevents them from getting into trouble. llvm-svn: 111317	2010-08-17 22:50:37 +00:00
Evan Cheng	efdc74ea59	Add nounwind. llvm-svn: 111312	2010-08-17 22:35:20 +00:00
Dale Johannesen	16f96445c3	Make fast scheduler handle asm clobbers correctly. PR 7882. Follows suggestion by Amaury Pouly, thanks. llvm-svn: 111306	2010-08-17 22:17:24 +00:00
Evan Cheng	f259efde47	PHI elimination should not break back edge. It can cause some significant code placement issues. rdar://8263994 good: LBB0_2: mov r2, r0 . . . mov r1, r2 bne LBB0_2 bad: LBB0_2: mov r2, r0 . . . @ BB#3: mov r1, r2 b LBB0_2 llvm-svn: 111221	2010-08-17 01:20:36 +00:00
Benjamin Kramer	cbc55d9dc0	Test expects SSE, give him SSE. llvm-svn: 111115	2010-08-15 23:32:03 +00:00
Benjamin Kramer	4566466b7f	Restore arch on these tests, they fail on arm. llvm-svn: 111109	2010-08-15 20:42:56 +00:00
Dale Johannesen	339423c460	Mark as XFAIL on darwin 8. PR 7886. llvm-svn: 111108	2010-08-15 19:40:29 +00:00
Dale Johannesen	8d3c89e765	Revert 110491. While not wrong, it was based on a misanalysis and is undesirable. llvm-svn: 111028	2010-08-13 18:43:45 +00:00
Bruno Cardoso Lopes	7f704b31a9	- Teach SSEDomainFix to switch between different levels of AVX instructions. Here we guess that AVX will have domain issues, so just implement them for consistency and in the future we remove if it's unnecessary. - Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too. - Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX. - Add a testcase for a simple 128-bit zero vector creation. llvm-svn: 110946	2010-08-12 20:20:53 +00:00
Bruno Cardoso Lopes	7306c86886	Begin to support some vector operations for AVX 256-bit instructions. The long term goal here is to be able to match enough of vector_shuffle and build_vector so all avx intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step, support building zeroed vectors. llvm-svn: 110897	2010-08-12 02:06:36 +00:00
Devang Patel	48595bf2bc	This is x86 only test. llvm-svn: 110887	2010-08-12 00:17:38 +00:00
Bruno Cardoso Lopes	1675ee7a02	Add testcases for all AVX 256-bit intrinsics added in the last couple days llvm-svn: 110854	2010-08-11 21:12:09 +00:00
Bruno Cardoso Lopes	29c8818ad9	Reapply r109881 using a more strict command line for llc. llvm-svn: 110833	2010-08-11 17:39:23 +00:00
Jakob Stoklund Olesen	5730846c2f	Fix test for more architectures. Patch by Tobias Grosser. llvm-svn: 110685	2010-08-10 16:48:24 +00:00
Tobias Grosser	fedeff8015	Fix failing testcase. Those look like typos to me. llvm-svn: 110664	2010-08-10 09:54:29 +00:00
Devang Patel	b219746c80	Handle TAG_constant for integers. llvm-svn: 110656	2010-08-10 07:11:13 +00:00
Dale Johannesen	a3bd31a923	Use sdmem and sse_load_f64 (etc.) for the vector form of CMPSD (etc.) Matching a 128-bit memory operand is wrong, the instruction uses only 64 bits (same as ADDSD etc.) 8193553. llvm-svn: 110491	2010-08-07 00:33:42 +00:00
Eric Christopher	e1fb772aa5	Add an option to always emit realignment code for a particular module. llvm-svn: 110404	2010-08-05 23:57:43 +00:00
Devang Patel	cc3f3b341d	Move x86 specific tests into test/CodeGen/X86. llvm-svn: 110372	2010-08-05 20:25:37 +00:00
Dan Gohman	c53ee449a5	Move x86-specific tests out of test/Transforms/LoopStrengthReduce and into test/CodeGen/X86, so that they aren't run when the x86 target is not enabled. Fix uglygep.ll to not be x86-specific. llvm-svn: 110343	2010-08-05 17:04:15 +00:00
Daniel Dunbar	e62e664656	tests: CodeGen/X86/GC tests require X86. llvm-svn: 110338	2010-08-05 15:45:33 +00:00
Bill Wendling	ca1cb13646	The lower invoke pass needs to have unreachable code elimination run after it because it could create such things. This fixes a MingW buildbot test failure. llvm-svn: 110279	2010-08-04 23:36:02 +00:00
Eli Friedman	39d0f57cab	PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268	2010-08-04 22:40:58 +00:00
Stuart Hastings	cba0d06b7c	call-imm.ll test case regex fix. Patch by Dimitry Andric! llvm-svn: 110199	2010-08-04 15:31:35 +00:00
Jakob Stoklund Olesen	011ff9bec9	OK, that's it. This test is going away now. But don't worry, I am taking it to a nice farm in the country where it can play with other tests. And bunnies. It is not clear what is being tested, and the revision history shows a bunch of random changes to the expected instruction count. Clearly, we are just fudging it to pass whenever it fails. llvm-svn: 110118	2010-08-03 17:21:14 +00:00
Bob Wilson	66161f5eb4	Revert new AVX intrinsic tests. They are breaking buildbots and Bruno is away from a computer now. --- Reverse-merging r109881 into '.': D test/CodeGen/X86/avx-intrinsics-x86.ll D test/CodeGen/X86/avx-intrinsics-x86_64.ll llvm-svn: 109959	2010-07-31 22:36:03 +00:00
Bruno Cardoso Lopes	92941fdb26	A bunch of tests for AVX intrinsics llvm-svn: 109881	2010-07-30 19:57:56 +00:00
Eli Friedman	ffe64c06ef	Fix for bug reported by Evzen Muller on llvm-commits: make sure to correctly check the range of the constant when optimizing a comparison between a constant and a sign_extend_inreg node. llvm-svn: 109854	2010-07-30 06:44:31 +00:00
Nate Begeman	53afc8f06a	Implement a vectorized algorithm for <16 x i8> << <16 x i8> This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566	2010-07-28 00:21:48 +00:00
Nate Begeman	269a6da023	~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549	2010-07-27 22:37:06 +00:00
Dan Gohman	55e244698a	Use the proper type for shift counts. This fixes a bootstrap error. llvm-svn: 109265	2010-07-23 21:08:12 +00:00
Dan Gohman	0818684a70	DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits are not demanded. This often allows the anyext to be folded away. llvm-svn: 109242	2010-07-23 18:03:30 +00:00
Eric Christopher	9a77382685	Custom lower the memory barrier instructions and add support for lowering without sse2. Add a couple of new testcases. Fixes a few libgomp tests and latent bugs. Remove a few todos. llvm-svn: 109078	2010-07-22 02:48:34 +00:00
Dan Gohman	625fd2292d	Fix SCEV denormalization of expressions where the exit value from one loop is involved in the increment of an addrec for another loop. This fixes rdar://8168938. llvm-svn: 108863	2010-07-20 17:06:20 +00:00
Duncan Sands	2e839de377	The same problem was being tracked in PR7652. llvm-svn: 108843	2010-07-20 15:52:32 +00:00
Dan Gohman	b5e918dc05	After a custom inserter, in a block which has constant instructions, update the current basic block in addition to the current insert position, so that they remain consistent. This fixes rdar://8204072. llvm-svn: 108765	2010-07-19 22:48:56 +00:00
Owen Anderson	9c271e2835	Remove r108639 now that it is handled by InstCombine instead. llvm-svn: 108688	2010-07-19 08:10:24 +00:00
Owen Anderson	41670a11a8	Add a testcase for r108639. llvm-svn: 108640	2010-07-18 08:57:19 +00:00
Bill Wendling	bf8370ff36	Consider this function: void foo() { __builtin_unreachable(); } It will output the following on Darwin X86: _func1: Leh_func_begin0: pushq %rbp Ltmp0: movq %rsp, %rbp Ltmp1: Leh_func_end0: This prolog adds a new Call Frame Information (CFI) row to the FDE with an address that is not within the address range of the code it describes -- part is equal to the end of the function -- and therefore results in an invalid EH frame. If we emit a nop in this situation, then the CFI row is now within the address range. llvm-svn: 108568	2010-07-16 22:51:10 +00:00
Jakob Stoklund Olesen	c30b4ddc58	Remove the X86::FP_REG_KILL pseudo-instruction and the X86FloatingPointRegKill pass that inserted it. It is no longer necessary to limit the live ranges of FP registers to a single basic block. llvm-svn: 108536	2010-07-16 17:41:44 +00:00
Jakob Stoklund Olesen	b1671271ab	Add forgotten test case. llvm-svn: 108506	2010-07-16 04:45:35 +00:00
Dan Gohman	103c4ebea5	Use the source-order scheduler instead of the "fast" scheduler at -O0, because it's more likely to keep debug line information in its original order. llvm-svn: 108496	2010-07-16 02:01:19 +00:00
Bill Wendling	4bda1c8e68	Revert. This isn't the correct way to go. llvm-svn: 108478	2010-07-15 23:42:21 +00:00
Bill Wendling	973dc3b1d8	Handle code gen for the unreachable instruction if it's the only instruction in the function. We'll just turn it into a "trap" instruction instead. The problem with not handling this is that it might generate a prologue without the equivalent epilogue to go with it: $ cat t.ll define void @foo() { entry: unreachable } $ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables .section __TEXT,__text,regular,pure_instructions .globl _foo .align 4, 0x90 _foo: ## @foo Leh_func_begin0: ## BB#0: ## %entry pushq %rbp Ltmp0: movq %rsp, %rbp Ltmp1: Leh_func_end0: ... The unwind tables then have bad data in them causing all sorts of problems. Fixes <rdar://problem/8096481>. llvm-svn: 108473	2010-07-15 23:32:40 +00:00
Evan Cheng	55f0c6b9fc	Split -enable-finite-only-fp-math to two options: -enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN. llvm-svn: 108465	2010-07-15 22:07:12 +00:00
Chris Lattner	60b131654b	fix the definitions of ConstTextCoalSection/ConstDataCoalSection to keep "Text" in sync with the "pure instructions" section attribute. Lack of this attribute was preventing the assembler from emitting multibyte noops instructions for templates (and inlines, and other coalesced stuff) and was causing the assembler to mismatch .o files. This fixes rdar://8018335 llvm-svn: 108461	2010-07-15 21:22:00 +00:00
Devang Patel	df09db62e2	Fix crash reported in PR7653. llvm-svn: 108441	2010-07-15 18:45:27 +00:00
Dan Gohman	4afd412d6b	Watch out for a constant offset cancelling out a base register, forming a zero. This situation arises in Fortran code with induction variables that start at 1 instead of 0. This fixes PR7651. llvm-svn: 108424	2010-07-15 15:14:45 +00:00
Devang Patel	29168baf4b	Make it a .ll test case. llvm-svn: 108370	2010-07-14 23:12:52 +00:00
Dan Gohman	042523340b	Delete fast-isel's trivial load optimization; it breaks debugging because it can look past points where a debugger might modify user variables. llvm-svn: 108336	2010-07-14 17:25:37 +00:00
Evan Cheng	a8e8874552	Fix for PR7193 was overly conservative. The only case where sibcall callee address cannot be allocated a register is in 32-bit mode where the first three arguments are marked inreg. In that case EAX, EDX, and ECX will be used for argument passing. This fixes PR7610. llvm-svn: 108327	2010-07-14 06:44:01 +00:00
Evan Cheng	c893115312	Re-enable the test with fix. llvm-svn: 108319	2010-07-14 05:49:23 +00:00
Chris Lattner	711338fb04	temporarily disable the test to fix buildbots. llvm-svn: 108310	2010-07-14 02:21:59 +00:00
Evan Cheng	d542414945	Teach ProcessImplicitDefs to transform more COPY instructions into IMPLICIT_DEF (and subsequently eliminate them). This allows machine LICM to hoist IMPLICIT_DEF's. PR7620. llvm-svn: 108304	2010-07-14 01:22:19 +00:00
Dale Johannesen	caca5488dc	In inline asm treat indirect 'X' constraint as 'm'. This may not be right in all cases, but it's better than asserting which it was doing before. PR 7528. llvm-svn: 108268	2010-07-13 20:17:05 +00:00
Evan Cheng	f43961007c	-enable-unsafe-fp-math should not imply -enable-finite-only-fp-math. llvm-svn: 108254	2010-07-13 18:46:14 +00:00
Dale Johannesen	f241d4626c	Fix PR number. llvm-svn: 108251	2010-07-13 18:14:47 +00:00
Dan Gohman	51e6d9bbf6	Apply the SSE dependence idiom for SSE unary operations to SD instructions too, in addition to SS instructions. And add a comment about it. llvm-svn: 108191	2010-07-12 20:46:04 +00:00
Dan Gohman	79be2b9be5	Fix this test. llvm-svn: 108059	2010-07-10 22:42:12 +00:00
Jakob Stoklund Olesen	c4b3bcc051	FileCheckize inline asm FP stack tests llvm-svn: 108046	2010-07-10 16:30:25 +00:00
Dan Gohman	d7b5ce3312	Reapply bottom-up fast-isel, with several fixes for x86-32: - Check getBytesToPopOnReturn(). - Eschew ST0 and ST1 for return values. - Fix the PIC base register initialization so that it doesn't ever fail to end up at the top of the entry block. llvm-svn: 108039	2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen	51702ec46b	Fix a few tests llvm-svn: 108011	2010-07-09 20:43:09 +00:00
Dan Gohman	ea9ae3e6ed	Add a target triple. llvm-svn: 108003	2010-07-09 19:17:36 +00:00
Dan Gohman	7929c448fc	Fix MachineLICM to actually visit inner loops. llvm-svn: 108001	2010-07-09 18:49:45 +00:00
Bob Wilson	6586e9b203	--- Reverse-merging r107947 into '.': U utils/TableGen/FastISelEmitter.cpp --- Reverse-merging r107943 into '.': U test/CodeGen/X86/fast-isel.ll U test/CodeGen/X86/fast-isel-loads.ll U include/llvm/Target/TargetLowering.h U include/llvm/Support/PassNameParser.h U include/llvm/CodeGen/FunctionLoweringInfo.h U include/llvm/CodeGen/CallingConvLower.h U include/llvm/CodeGen/FastISel.h U include/llvm/CodeGen/SelectionDAGISel.h U lib/CodeGen/LLVMTargetMachine.cpp U lib/CodeGen/CallingConvLower.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp U lib/CodeGen/SelectionDAG/FastISel.cpp U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp U lib/CodeGen/SelectionDAG/InstrEmitter.cpp U lib/CodeGen/SelectionDAG/TargetLowering.cpp U lib/Target/XCore/XCoreISelLowering.cpp U lib/Target/XCore/XCoreISelLowering.h U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86ISelLowering.h llvm-svn: 107987	2010-07-09 16:37:18 +00:00
Dan Gohman	0b5aa1cdd3	Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL. llvm-svn: 107943	2010-07-09 00:39:23 +00:00
Bill Wendling	a992445ff2	Extension of r107506. Make sure that we don't mark a function as having a call if the inline ASM doesn't need a stack frame. llvm-svn: 107922	2010-07-08 22:38:02 +00:00
Eric Christopher	e796253217	A slight reworking of the custom patterns for x86-64 tpoff codegen and correct the testcase for valid assembly. Needs more tests. llvm-svn: 107860	2010-07-08 07:36:46 +00:00
Dan Gohman	e75704369d	Revert 107840 107839 107813 107804 107800 107797 107791. Debug info intrinsics win for now. llvm-svn: 107850	2010-07-08 01:00:56 +00:00
Jakob Stoklund Olesen	ddaf0099a5	Allow copies between GR8_ABCD_L and GR8_ABCD_H. This fixes PR7540. llvm-svn: 107809	2010-07-07 20:33:27 +00:00
Dan Gohman	e7ccc51cc1	Implement bottom-up fast-isel. This has the advantage of not requiring a separate DCE pass over MachineInstrs. llvm-svn: 107804	2010-07-07 19:20:32 +00:00
Dan Gohman	2d4d01d0de	Add X86FastISel support for return statements. This entails refactoring a bunch of stuff, to allow the target-independent calling convention logic to be employed. llvm-svn: 107800	2010-07-07 18:32:53 +00:00
Dale Johannesen	ce65663330	Accept RIP-relative symbols with 'i' constraint, and print the (%rip) only if the 'a' modifier is present. PR 7528. llvm-svn: 107727	2010-07-06 23:27:00 +00:00
Dale Johannesen	6f01541ae6	Make test not hang waiting for input. llvm-svn: 107721	2010-07-06 23:06:58 +00:00
Jakob Stoklund Olesen	a64c0a3d22	Be more forgiving when calculating alias interference for physreg coalescing. It is OK for an alias live range to overlap if there is a copy to or from the physical register. CoalescerPair can work out if the copy is coalescable independently of the alias. This means that we can join with the actual destination interval instead of using the getOrigDstReg() hack. It is no longer necessary to merge clobber ranges into subregisters. llvm-svn: 107695	2010-07-06 20:31:51 +00:00
Devang Patel	23a7593534	Fix PR7545 crash. llvm-svn: 107678	2010-07-06 18:18:32 +00:00
Eric Christopher	8f06b4a294	Remove mistakenly added test. llvm-svn: 107641	2010-07-06 05:20:13 +00:00
Eric Christopher	2ad0c779c3	Fix up -fstack-protector on linux to use the segment registers. Split out testcases per architecture and os now. Patch from Nelson Elhage. llvm-svn: 107640	2010-07-06 05:18:56 +00:00
Chris Lattner	60db4557cd	another v2f32 case, in this case showing poor codegen. llvm-svn: 107614	2010-07-05 05:52:56 +00:00
Chris Lattner	431e81f2fb	fix test on non-x86 hosts. llvm-svn: 107608	2010-07-05 03:56:55 +00:00
Chris Lattner	45cc4d74a3	Just rip v2f32 support completely out of the X86 backend. In the example in the testcase, we now generate: _test1: ## @test1 movss 4(%esp), %xmm0 addss 8(%esp), %xmm0 movl 12(%esp), %eax movss %xmm0, (%eax) ret instead of: _test1: ## @test1 subl $20, %esp movl 24(%esp), %eax movq %mm0, (%esp) movq %mm0, 8(%esp) movss (%esp), %xmm0 addss 12(%esp), %xmm0 movss %xmm0, (%eax) addl $20, %esp ret v2f32 support did not work reliably because most of the X86 backend didn't know it was legal. It was apparently only added to support returning source-level v2f32 values in MMX registers in x86-32 mode. If ABI compatibility is important on this GCC-extended-vector type for some reason, then the frontend should generate IR that returns v2i32 instead of v2f32. However, we generally don't try very hard to be abi compatible on gcc extended vectors. llvm-svn: 107601	2010-07-04 23:07:25 +00:00
Chris Lattner	681b926d54	fix PR7518 - terrible codegen of <2 x float>, by only marking v2f32 as legal in 32-bit mode. It is just as terrible there, but I just care about x86-64 and no one claims it is valuable in 64-bit mode. llvm-svn: 107600	2010-07-04 22:57:10 +00:00
Evan Cheng	0ce84486c3	- Two-address pass should not assume unfolding is always successful. - X86 unfolding should check if the instruction being unfolded has memoperands. If there are no memoperands, then it must assume conservative alignment. If this would introduce an expensive sse unaligned load / store, then unfoldMemoryOperand etc. should not unfold the instruction. llvm-svn: 107509	2010-07-02 20:36:18 +00:00
Dale Johannesen	4d887f7ca7	Propagate the AlignStack bit in InlineAsm's to the PrologEpilog code, and use it to determine whether the asm forces stack alignment or not. gcc consistently does not do this for GCC-style asms; Apple gcc inconsistently sometimes does it for asm blocks. There is no convenient place to put a bit in either the SDNode or the MachineInstr form, so I've added an extra operand to each; unlovely, but it does allow for expansion for more bits, should we need it. PR 5125. Some existing testcases are affected. The operand lists of the SDNode and MachineInstr forms are indexed with awesome mnemonics, like "2"; I may fix this someday, but not now. I'm not making it any worse. If anyone is inspired I think you can find all the right places from this patch. llvm-svn: 107506	2010-07-02 20:16:09 +00:00
Bill Wendling	03bcd6ecc8	Implement the "linker_private_weak" linkage type. This will be used for Objective-C metadata types which should be marked as "weak", but which the linker will remove upon final linkage. However, this linkage isn't specific to Objective-C. For example, the "objc_msgSend_fixup_alloc" symbol is defined like this: .globl l_objc_msgSend_fixup_alloc .weak_definition l_objc_msgSend_fixup_alloc .section __DATA, __objc_msgrefs, coalesced .align 3 l_objc_msgSend_fixup_alloc: .quad _objc_msgSend_fixup .quad L_OBJC_METH_VAR_NAME_1 This is different from the "linker_private" linkage type, because it can't have the metadata defined with ".weak_definition". Currently only supported on Darwin platforms. llvm-svn: 107433	2010-07-01 21:55:59 +00:00
Dan Gohman	d2965c10a1	Temporarily disable on-demand fast-isel. llvm-svn: 107393	2010-07-01 12:15:30 +00:00
Dan Gohman	aef3d140b7	Teach fast-isel to avoid loading a value from memory when it's already available in a register. This is pretty primitive, but it reduces the number of instructions in common testcases by 4%. llvm-svn: 107380	2010-07-01 03:49:38 +00:00
Dan Gohman	722f5fc567	Enable on-demand fast-isel. llvm-svn: 107377	2010-07-01 02:58:57 +00:00
Dan Gohman	7937d5606d	Teach X86FastISel to fold constant offsets and scaled indices in the same address. llvm-svn: 107373	2010-07-01 02:27:15 +00:00
Dale Johannesen	17feb07c53	In asm's, output operands with matching input constraints have to be registers, per gcc documentation. This affects the logic for determining what "g" should lower to. PR 7393. A couple of existing testcases are affected. llvm-svn: 107079	2010-06-28 22:09:45 +00:00
Jakob Stoklund Olesen	fde9c348e9	Don't write temporary files in test directory llvm-svn: 107049	2010-06-28 20:01:15 +00:00
Jakob Stoklund Olesen	0117091c16	Add a triple so test runs on Linux as well. llvm-svn: 107045	2010-06-28 19:31:15 +00:00
Jakob Stoklund Olesen	0d94d7af78	Add more special treatment for inline asm in RegAllocFast. When an instruction has tied operands and physreg defines, we must take extra care that the tied operands conflict with neither physreg defs nor uses. The special treatment is given to inline asm and instructions with tied operands / early clobbers and physreg defines. This fixes PR7509. llvm-svn: 107043	2010-06-28 18:34:34 +00:00
Benjamin Kramer	3bbc52ce3e	Fix some tests that didn't test anything. llvm-svn: 106954	2010-06-26 20:05:06 +00:00
Jakob Stoklund Olesen	d7d0d4e882	When creating X86 MUL8 and DIV8 instructions, make sure we don't produce CopyFromReg nodes for aliasing registers (AX and AL). This confuses the fast register allocator. Instead of CopyFromReg(AL), use ExtractSubReg(CopyFromReg(AX), sub_8bit). This fixes PR7312. llvm-svn: 106934	2010-06-26 00:39:23 +00:00
Dale Johannesen	ce97d55ad9	The hasMemory argument is irrelevant to how the argument for an "i" constraint should get lowered; PR 6309. While this argument was passed around a lot, this is the only place it was used, so it goes away from a lot of other places. llvm-svn: 106893	2010-06-25 21:55:36 +00:00
Dan Gohman	8de1fe3ccf	pcmpeqd and friends are Commutable. llvm-svn: 106886	2010-06-25 21:05:35 +00:00
Bill Wendling	e41e40f689	- Reapply r106066 now that the bzip2 build regression has been fixed. - 2010-06-25-CoalescerSubRegDefDead.ll is the testcase for r106878. llvm-svn: 106880	2010-06-25 20:48:10 +00:00
Dan Gohman	600658a4ba	Don't write an output file to cwd, and put an rdar prefix on an rdar number. llvm-svn: 106810	2010-06-24 23:45:15 +00:00
Dan Gohman	9a2f0473b2	Teach EmitLiveInCopies to omit copies for unused virtual registers, and to clean up unused incoming physregs from the live-in list. llvm-svn: 106805	2010-06-24 22:23:02 +00:00
Dale Johannesen	5ad5226c58	Disallow matching "i" constraint to symbol addresses when address requires a register or secondary load to compute (most PIC modes). This improves "g" constraint handling. 8015842. The test from 2007 is attempting to test the fix for PR1761, but since -relocation-model=static doesn't work on Darwin x86-64, it was not testing what it was supposed to be testing and was passing erroneously. Fixed to use Linux x86-64. llvm-svn: 106779	2010-06-24 20:14:51 +00:00
Dan Gohman	463f26b4be	Eliminate the other half of the BRCOND optimization, and update as many tests as possible. llvm-svn: 106749	2010-06-24 15:24:03 +00:00
Dan Gohman	df6b33e778	Eliminate the first half of the optimization which eliminates BRCOND when the condition is constant. This optimization shouldn't be necessary, because codegen shouldn't be able to find dead control paths that the IR-level optimizer can't find. And it's undesirable, because it encourages bugpoint to leave "br i1 false" branches in its output. And it wasn't updating the CFG. I updated all the tests I could, but some tests are too reduced and I wasn't able to meaningfully preserve them. llvm-svn: 106748	2010-06-24 15:04:11 +00:00
Dan Gohman	600f62b3ba	Reapply r106634, now that the bug it exposed is fixed. llvm-svn: 106746	2010-06-24 14:30:44 +00:00
Dan Gohman	0695e09b09	Optimize the "bit test" code path for switch lowering in the case where the bit mask has exactly one bit. llvm-svn: 106716	2010-06-24 02:06:24 +00:00
Bill Wendling	a136521a17	MorphNodeTo doesn't preserve the memory operands. Because we're morphing a node into the same node, but with different non-memory operands, we need to replace the memory operands after it's finished morphing. llvm-svn: 106643	2010-06-23 18:16:24 +00:00
Daniel Dunbar	4df321b7ad	Revert r106263, "Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,"... it was causing both 'file' (with clang) and 176.gcc (with llvm-gcc) to be miscompiled. llvm-svn: 106634	2010-06-23 17:09:26 +00:00
Daniel Dunbar	ef5a4383ad	Revert r106066, "Create a more targeted fix for not sinking instructions into a range where it"... it causes bzip2 to be miscompiled by Clang. Conflicts: lib/CodeGen/MachineSink.cpp llvm-svn: 106614	2010-06-23 00:48:25 +00:00
Dan Gohman	f1cf963c64	Loosen up this test so that it doesn't depend as much on register allocation details. llvm-svn: 106599	2010-06-22 23:32:47 +00:00
Dan Gohman	1081f1a0f5	Fix OptimizeMax to handle an odd case where one of the max operands is another max which folds. This fixes PR7454. llvm-svn: 106594	2010-06-22 23:07:13 +00:00
Dale Johannesen	6d4802ba6c	Add SSE so these actually pass on non-X86 hosts. llvm-svn: 106575	2010-06-22 20:54:03 +00:00
Mon P Wang	825639e849	Move v-binop-widen tests to X86 since they don't work on all platforms llvm-svn: 106562	2010-06-22 19:40:50 +00:00
Jakob Stoklund Olesen	9c47dac677	Remove the SimpleJoin optimization from SimpleRegisterCoalescing. Measurements show that it does not speed up coalescing, so there is no reason to keep the added complexity around. Also clean out some unused methods and static functions. llvm-svn: 106548	2010-06-22 16:13:57 +00:00
Dan Gohman	3c1b3c61e9	Teach two-address lowering how to unfold a load to open up commuting opportunities. For example, this lets it emit this: movq (%rax), %rcx addq %rdx, %rcx instead of this: movq %rdx, %rcx addq (%rax), %rcx in the case where %rdx has subsequent uses. It's the same number of instructions, and usually the same encoding size on x86, but it appears faster, and in general, it may allow better scheduling for the load. llvm-svn: 106493	2010-06-21 22:17:20 +00:00
Dan Gohman	2dd1d3d182	Make this test more robust in case LLVM ever decides to align the global variable differently. llvm-svn: 106454	2010-06-21 19:56:27 +00:00
Eric Christopher	bf572c7cea	Add some codegen patterns for x86_64-linux-gnu tls codegen matching. Based on a patch by Patrick Marlier! llvm-svn: 106433	2010-06-21 18:21:27 +00:00
Dan Gohman	51d00092b6	Include the use kind along with the expression in the key of the use sharing map. The reconcileNewOffset logic already forces a separate use if the kinds differ, so incorporating the kind in the key means we can track more sharing opportunities. More sharing means fewer total uses to track, which means smaller problem sizes, which means the conservative throttles don't kick in as often. llvm-svn: 106396	2010-06-19 21:29:59 +00:00
Dan Gohman	99ba4dac59	Don't maintain a set of deleted nodes; instead, use a HandleSDNode to track a node over CSE events. This fixes PR7368. llvm-svn: 106266	2010-06-18 01:24:29 +00:00
Dan Gohman	b92156d5e4	Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass, which is faster, simpler, and less surprising. llvm-svn: 106263	2010-06-18 01:05:21 +00:00
Dan Gohman	30d7a51d6c	Make this test less fragile. llvm-svn: 106255	2010-06-18 00:06:03 +00:00
Bill Wendling	8c0cf0994d	Create a more targeted fix for not sinking instructions into a range where it will conflict with another live range. The place which creates this scenario is the code in X86 that lowers a select instruction by splitting the MBBs. This eliminates the need to check from the bottom up in an MBB for live pregs. llvm-svn: 106066	2010-06-15 23:46:31 +00:00
Jakob Stoklund Olesen	ec2e964fd6	Remove the local register allocator. Please use the fast allocator instead. llvm-svn: 106051	2010-06-15 21:58:33 +00:00
Chris Lattner	874c92bd47	fix fastisel to handle GS and FS relative pointers. Patch by Nelson Elhage! llvm-svn: 106031	2010-06-15 19:08:40 +00:00
Jakob Stoklund Olesen	246e9a07a2	Avoid processing early clobbers twice in RegAllocFast. Early clobbers defining a virtual register were first allocated to a physreg and then processed as a physreg EC, spilling the virtreg. This fixes PR7382. llvm-svn: 105998	2010-06-15 16:20:57 +00:00
Chris Lattner	00ab615406	apparently lots of dupes. llvm-svn: 105956	2010-06-14 20:19:03 +00:00
Chris Lattner	faa7bdccbf	fix a nasty bug where we were not treating available_externally symbols as declarations in the X86 backend. This would manifest on darwin x86-32 as errors like this with -fvisibility=hidden: symbol '__ZNSbIcED1Ev' can not be undefined in a subtraction expression This fixes PR7353. llvm-svn: 105954	2010-06-14 20:11:56 +00:00
Chris Lattner	bbb798c7d1	remove old test. llvm-svn: 105953	2010-06-14 20:07:43 +00:00
Chris Lattner	b30f87b74e	rename test llvm-svn: 105952	2010-06-14 20:07:34 +00:00
Bill Wendling	d53a2cb4ac	Testcase for r105741. llvm-svn: 105750	2010-06-09 20:30:22 +00:00
Jakob Stoklund Olesen	8bc5eca331	Mark physregs defined by inline asm as implicit. This is a bit of a hack to make inline asm look more like call instructions. It would be better to produce correct dead flags during isel. llvm-svn: 105749	2010-06-09 20:05:00 +00:00
Dan Gohman	bbfb6aca92	LSR needs to remember inserted instructions even in postinc mode, because there could be multiple subexpressions within a single expansion which require insert point adjustment. This fixes PR7306. llvm-svn: 105510	2010-06-05 00:33:07 +00:00
Dan Gohman	538b413ccb	Fix normalization and de-normalization of non-affine SCEVs. llvm-svn: 105480	2010-06-04 19:16:34 +00:00
Mon P Wang	622cdd2297	Fixed a bug during widening where we would avoid legalizing a node. When we replace an OpA with a widened OpB, it is possible to get new uses of OpA due to CSE when recursively updating nodes. Since OpA has been processed, the new uses are not examined again. The patch checks if this occurred and if it did, updates the new uses of OpA to use OpB. llvm-svn: 105453	2010-06-04 01:20:10 +00:00
Dan Gohman	8fdda8a655	This test doesn't need the ssp attribute. llvm-svn: 105440	2010-06-04 00:14:48 +00:00
Dan Gohman	d83e3e7750	Fix SimplifyDemandedBits' AssertZext logic to demand all the bits. It needs to demand the high bits because it's asserting that they're zero. llvm-svn: 105406	2010-06-03 20:21:33 +00:00
Bill Wendling	f82aea634c	Machine sink could potentially sink instructions into a block where the physical registers it defines then interfere with an existing preg live range. For instance, if we had something like these machine instructions: BB#0 ... = imul ... EFLAGS<imp-def,dead> test ..., EFLAGS<imp-def> jcc BB#2 EFLAGS<imp-use> BB#1 ... ; fallthrough to BB#2 BB#2 ... ; No code that defines EFLAGS jcc ... EFLAGS<imp-use> Machine sink will come along, see that imul implicitly defines EFLAGS, but because it's "dead", it assumes that it can move imul into BB#2. But when it does, imul's "dead" imp-def of EFLAGS is raised from the dead (a zombie) and messes up the condition code for the jump (and pretty much anything else which relies upon it being correct). The solution is to know which pregs are live going into a basic block. However, that information isn't calculated at this point. Nor does the LiveVariables pass take into account non-allocatable physical registers. In lieu of this, we do a very conservative pass through the basic block to determine if a preg is live coming out of it. llvm-svn: 105387	2010-06-03 07:54:20 +00:00
Eric Christopher	f67fe3b1e8	One underscore, not two. llvm-svn: 105379	2010-06-03 04:02:59 +00:00
Dan Gohman	b782caa393	Fill in missing support for ISD::FEXP, ISD::FPOWI, and friends. llvm-svn: 105283	2010-06-01 18:35:14 +00:00
Chris Lattner	14c46517b5	fix PR6623: when optimizing for size, don't inline memcpy/memsets that are too large. This causes the freebsd bootloader to be too large apparently. It's unclear if this should be an -Os or -Oz thing. Thoughts welcome. llvm-svn: 105228	2010-05-31 17:30:14 +00:00
Chris Lattner	291a189cda	upgrade and filecheckize this test. llvm-svn: 105227	2010-05-31 17:27:17 +00:00
Evan Cheng	707b7cc429	Remove schedule-livein-copies. It's not being used. llvm-svn: 105095	2010-05-29 02:23:39 +00:00
Evan Cheng	27c4933e02	Fix PR7193: if sibling call address can take a register, make sure there are enough registers available by counting inreg arguments. llvm-svn: 105092	2010-05-29 01:35:22 +00:00
Jakob Stoklund Olesen	2085089c49	Fix more tests that depended on the default register allocator choice. llvm-svn: 104961	2010-05-28 17:06:30 +00:00
Dan Gohman	2140a74979	Eliminate the restriction that the array size in an alloca must be i32. This will help reduce the amount of casting required on 64-bit targets. llvm-svn: 104911	2010-05-28 01:14:11 +00:00
Jakob Stoklund Olesen	b613ae2c89	Add a -regalloc=default option that chooses a register allocator based on the -O optimization level. This only really affects llc for now because both the llvm-gcc and clang front ends override the default register allocator. I intend to remove that code later. llvm-svn: 104904	2010-05-27 23:57:25 +00:00
Devang Patel	6b9a9fe207	Simplify. Eliminate unneeded debug_loc entry. llvm-svn: 104785	2010-05-26 23:55:23 +00:00
Devang Patel	1b08572a66	Update debug info when live-in reg is copied into a vreg. llvm-svn: 104732	2010-05-26 20:18:50 +00:00
Dale Johannesen	053dd21c84	Testcase for 104624/104619/PR7191/8023512. Reduced from one provided by Duncan Sands, thanks! llvm-svn: 104710	2010-05-26 17:55:45 +00:00
Dale Johannesen	cd4ba6caba	Removing test; Chris thinks it's better to have the bug go untested than have a testcase this large. So be it. llvm-svn: 104632	2010-05-25 20:40:10 +00:00
Dale Johannesen	60fe2cdc4f	Fix another variant of PR 7191. Also add a testcase Mon Ping provided; unfortunately bugpoint failed to reduce it, but I think it's important to have a test for this in the suite. 8023512. llvm-svn: 104624	2010-05-25 18:47:23 +00:00
Eric Christopher	64087cd346	This test is darwin only. Make it so(tm). llvm-svn: 104418	2010-05-22 00:55:55 +00:00
Eric Christopher	6fdea1bda8	Add full bss data support for darwin tls variables. llvm-svn: 104414	2010-05-22 00:10:22 +00:00
Chris Lattner	0735ecfe17	now that fp reg kill insertion stuff happens as a separate pass after isel instead of being interlaced with it, we can trust that all the code for a function has been isel'd before it is run. The practical impact of this is that we can scan for machine instr phis instead of doing a fuzzy match on the LLVM BB for phi nodes. Doing the fuzzy match required knowing when isel would produce an fp reg stack phi which was gross. It was also wrong in cases where select got lowered to a branch tree because cmovs aren't available (PR6828). Just do the scan on machine phis which is simpler, faster and more correct. This fixes PR6828. llvm-svn: 104333	2010-05-21 18:17:54 +00:00
Dale Johannesen	b3b9c8ac48	Fix i64->f64 conversion, x86-64, -no-sse. A bit tricky since there's a 3rd 64-bit type, MMX vectors. PR 7135. llvm-svn: 104308	2010-05-21 00:52:33 +00:00
Dan Gohman	ee2fea3cd7	When canonicalizing icmp operand order to put the loop invariant operand on the left, the interesting operand is on the right. This fixes a bug where LSR was failing to recognize ICmpZero uses, which led it to be unable to reverse the induction variable in the attached testcase. Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test is extremely fragile and hard to meaningfully update. llvm-svn: 104262	2010-05-20 19:26:52 +00:00
Dan Gohman	887dd1cd31	When converting a test to a cmp to fold a load, use the cmp that has an 8-bit immediate field rather than one with a wider immediate field. llvm-svn: 104064	2010-05-18 21:42:03 +00:00
Daniel Dunbar	a4820fcc78	MC/X86: Implement custom lowering to make sure we match things like X86::ADC32ri $0, %eax to X86::ADC32i32 $0 llvm-svn: 104030	2010-05-18 17:22:24 +00:00
Dale Johannesen	f92c344167	Removing as part of previous reversion. llvm-svn: 103915	2010-05-16 20:19:40 +00:00
Dale Johannesen	2ef974ee0e	Revert 103911; it broke a test that expects bitconvert <1xi64> -> i64 to work in MMX registers on hosts where -no-sse is the default (not mine). The right thing is to accept this and make i64->f64 conversions go through memory, but I don't have time right now. llvm-svn: 103914	2010-05-16 20:19:04 +00:00
Dale Johannesen	fc1492d71b	Make x86-64 64-bit bitconvert work when SSE is not available. (This worked as of about 6 months ago and I didn't track down exactly what broke it; I think this fix is appropriate.) llvm-svn: 103911	2010-05-16 18:22:38 +00:00
Anton Korobeynikov	8f35fabbc1	Add support for thiscall calling convention. Patch by Charles Davis and Steven Watanabe! llvm-svn: 103902	2010-05-16 09:08:45 +00:00
Jakob Stoklund Olesen	4d5c1061e3	Simplify the handling of physreg defs and uses in RegAllocFast. This adds extra security against using clobbered physregs, and it adds kill markers to physreg uses. llvm-svn: 103784	2010-05-14 18:03:25 +00:00
Jakob Stoklund Olesen	0ba2e2a568	Take allocation hints from copy instructions to/from physregs. This causes way more identity copies to be generated, ripe for coalescing. llvm-svn: 103686	2010-05-13 00:19:43 +00:00
Jakob Stoklund Olesen	955a0e71e9	Make sure to add kill flags to the last use of a virtreg when it is redefined. The X86 floating point stack pass and others depend on good kill flags. llvm-svn: 103635	2010-05-12 18:46:03 +00:00
Jakob Stoklund Olesen	e6e39dc310	Enable a bunch more -regalloc=fast tests llvm-svn: 103531	2010-05-12 00:11:24 +00:00
Jakob Stoklund Olesen	84c881e593	One more -regalloc=fast test llvm-svn: 103509	2010-05-11 20:51:07 +00:00
Jakob Stoklund Olesen	3f0241e0f9	Simplify the tracking of used physregs to a bulk bitor followed by a transitive closure after allocating all blocks. Add a few more test cases for -regalloc=fast. llvm-svn: 103500	2010-05-11 20:30:28 +00:00
Jakob Stoklund Olesen	f1b3029a54	Mostly rewrite RegAllocFast. Sorry for the big change. The path leading up to this patch had some TableGen changes that I didn't want to commit before I knew they were useful. They weren't, and this version does not need them. The fast register allocator now does no liveness calculations. Instead it relies on kill flags provided by isel. (Currently those kill flags are also ignored due to isel bugs). The allocation algorithm is supposed to work with any subset of valid kill flags. More kill flags simply means fewer spills inserted. Registers are allocated from a working set that contains no aliases. That means most allocations can be done directly without expensive alias checks. When the working set runs out of registers we do the full alias check to find new free registers. llvm-svn: 103488	2010-05-11 18:54:45 +00:00
Evan Cheng	02947a4551	Be careful with operand promotion. For a binary operation, the source operands may be the same. PR7018. rdar://7939869. llvm-svn: 103419	2010-05-10 19:03:57 +00:00
Bill Wendling	cd476b6760	Readd testcase. llvm-svn: 103335	2010-05-08 04:47:54 +00:00
Dan Gohman	d0800241d2	When pruning candidate formulae out of an LSRUse, update the LSRUse's Regs set after all pruning is done, rather than trying to do it on the fly, which can produce an incomplete result. This fixes a case where heuristic pruning was stripping all formulae from a use, which led the solver to enter an infinite loop. Also, add a few asserts to diagnose this kind of situation. llvm-svn: 103328	2010-05-07 23:36:59 +00:00
Bill Wendling	6b5897b4de	Remove. Don't XFAIL. llvm-svn: 103321	2010-05-07 23:09:17 +00:00
Bill Wendling	32d8981ec0	Temporarily revert r101984. llvm-svn: 103314	2010-05-07 22:45:36 +00:00
Dale Johannesen	51c1695a0a	Fix PR 7087, and probably other things, by extending getConstantFP to accept the two supported long double target types. This was not the original intent, but there are other places that assume this works and it's easy enough to do. llvm-svn: 103299	2010-05-07 21:35:53 +00:00
Duncan Sands	ebf838274f	Correct some bogus target triples. llvm-svn: 103265	2010-05-07 17:03:48 +00:00
Nick Lewycky	45f530db39	Revert r103133 and add testcase from PR7066. llvm-svn: 103233	2010-05-07 01:45:38 +00:00


2286 Commits