llvm-project

Commit Graph

Author	SHA1	Message	Date
NAKAMURA Takumi	4c14a5cc2c	Triple::MinGW64 is deprecated and removed. We can use Triple::MinGW32 generally. No one uses *-mingw64. mingw-w64 is represented as {i686\|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way. llvm-svn: 125747	2011-02-17 12:24:17 +00:00
Eric Christopher	ef72141a75	The change for PR9190 wasn't quite right. We need to avoid making the transformation if we can't legally create a build vector of the correct type. Check that we can make the transformation first, and add a TODO to refactor this code with similar cases. Fixes: PR9223 and rdar://9000350 llvm-svn: 125631	2011-02-16 01:10:03 +00:00
Eric Christopher	58d6556fae	Add testcase for PR9190. llvm-svn: 125630	2011-02-16 01:08:31 +00:00
Devang Patel	d12c0a2764	Ignore DBG_VALUE machine instructions while constructing instruction ranges based on location info. Machine instruction range consisting of only DBG_VALUE MIs only contributes consecutive labels in assembly output, which is harmless, and empty scope entry in DebugInfo, which confuses debugger tools. llvm-svn: 125577	2011-02-15 17:56:09 +00:00
Rafael Espindola	70d8015063	Switch llvm to using comdats. For now always use groups with a single section. llvm-svn: 125526	2011-02-14 22:23:49 +00:00
Chris Lattner	eff248ca7f	fix PR9210 by implementing some type legalization logic for vector fp conversions. llvm-svn: 125482	2011-02-14 06:30:45 +00:00
Chris Lattner	46c01a30f4	Enhance ComputeMaskedBits to know that aligned frameindexes have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI\|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. llvm-svn: 125470	2011-02-13 22:25:43 +00:00
Chris Lattner	d5f0b1148a	when legalizing extremely wide shifts, make sure that the shift amounts are in a suitably wide type so that we don't generate out of range constant shift amounts. This fixes PR9028. llvm-svn: 125458	2011-02-13 09:10:56 +00:00
Evan Cheng	d4fcc05304	After 3-addressifying a two-address instruction, update the register maps; add a missing check when considering whether it's profitable to commute. rdar://8977508. llvm-svn: 125259	2011-02-10 02:20:55 +00:00
Devang Patel	389971b318	Reduce test case, smaller is better. llvm-svn: 125019	2011-02-07 18:24:18 +00:00
NAKAMURA Takumi	1850c80afb	Target/X86: Tweak allocating shadow area (aka home) on Win64. It must be enough for caller to allocate one. llvm-svn: 124949	2011-02-05 15:11:32 +00:00
Devang Patel	116a9d7c38	Merge .debug_loc entries whenever possible to reduce debug_loc size. llvm-svn: 124904	2011-02-04 22:57:18 +00:00
Nick Lewycky	d650b30488	Mark that the return is using EAX so that we don't use it for some other purpose. Fixes PR9080! llvm-svn: 124903	2011-02-04 22:44:08 +00:00
Devang Patel	26ffa01889	DebugLoc associated with a machine instruction is used to emit location entries. DebugLoc associated with a DBG_VALUE is used to identify lexical scope of the variable. After register allocation, while inserting DBG_VALUE remember original debug location for the first instruction and reuse it, otherwise dwarf writer may be mislead in identifying the variable's scope. llvm-svn: 124845	2011-02-04 01:43:25 +00:00
Rafael Espindola	f5754b851c	Add -march to fix the bots. llvm-svn: 124774	2011-02-03 04:21:01 +00:00
Rafael Espindola	d11311f291	Fix PR9127 by reversing the operands even if they have more then one use. Reversing the operands allows us to fold, but doesn't force us to. Also, at this point the DAG is still being optimized, so the check for hasOneUse is not very precise. llvm-svn: 124773	2011-02-03 03:58:05 +00:00
Devang Patel	56cc5fdf09	Keep track of incoming argument's location while emitting LiveIns. llvm-svn: 124611	2011-01-31 21:38:14 +00:00
Benjamin Kramer	946e1522b6	Teach DAGCombine to fold fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x, c1+c2) when c1 equals the amount of bits that are truncated off. This happens all the time when a smul is promoted to a larger type. On x86-64 we now compile "int test(int x) { return x/10; }" into movslq %edi, %rax imulq $1717986919, %rax, %rax movq %rax, %rcx shrq $63, %rcx sarq $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax" addl %ecx, %eax This fires 96 times in gcc.c on x86-64. llvm-svn: 124559	2011-01-30 16:38:43 +00:00
Evan Cheng	d983eba7dc	Re-apply r124518 with fix. Watch out for invalidated iterator. llvm-svn: 124526	2011-01-29 04:46:23 +00:00
Evan Cheng	65b8ccf6ac	Revert r124518. It broke Linux self-host. llvm-svn: 124522	2011-01-29 02:43:04 +00:00
Evan Cheng	d4eff31476	Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand. llvm-svn: 124518	2011-01-29 01:29:26 +00:00
Evan Cheng	aaa9606b2f	Revert r124462. There are a few big regressions that I need to fix first. llvm-svn: 124478	2011-01-28 07:12:38 +00:00
Rafael Espindola	2f72a84284	Add a triple. llvm-svn: 124471	2011-01-28 03:57:55 +00:00
Rafael Espindola	6c17d54891	Print the visibility of declarations. llvm-svn: 124468	2011-01-28 03:20:10 +00:00
Evan Cheng	417fca86c4	- Stop simplifycfg from duplicating "ret" instructions into unconditional branches. PR8575, rdar://5134905, rdar://8911460. - Allow codegen tail duplication to dup small return blocks after register allocation is done. llvm-svn: 124462	2011-01-28 02:19:21 +00:00
NAKAMURA Takumi	0cfdac078e	Target/X86: Tweak win64's tailcall. llvm-svn: 124272	2011-01-26 02:04:09 +00:00
NAKAMURA Takumi	9d29eff198	Fix whitespace. llvm-svn: 124270	2011-01-26 02:03:37 +00:00
Devang Patel	70f8e5962a	Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic. llvm-svn: 124203	2011-01-25 18:09:58 +00:00
Devang Patel	533479544b	Speculatively revert r124138. llvm-svn: 124142	2011-01-24 20:04:37 +00:00
Devang Patel	8cc5355c90	Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic. llvm-svn: 124138	2011-01-24 19:24:37 +00:00
Chris Lattner	bf638d2a0d	fix a missing shuffle pattern, PR9009. Patch by Artiom Myaskouvskey! llvm-svn: 124102	2011-01-24 03:42:46 +00:00
Eric Christopher	785db078b4	Expand invalid return values for umulo and smulo. Handle these similarly to add/sub by doing the normal operation and then checking for overflow afterwards. This generally relies on the DAG handling the later invalid operations as well. Fixes the 64-bit part of rdar://8622122 and rdar://8774702. llvm-svn: 123908	2011-01-20 08:54:28 +00:00
Benjamin Kramer	45d183ccf0	Fix an off-by-one error in ctpop combining. llvm-svn: 123664	2011-01-17 18:00:28 +00:00
Benjamin Kramer	24c5184dca	Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0. This shaves off 4 popcounts from the hacked 186.crafty source. This is enabled even when a native popcount instruction is available. The combined code is one operation longer but it should be faster nevertheless. llvm-svn: 123621	2011-01-17 12:04:57 +00:00
Rafael Espindola	ec517cdf24	Update tests. llvm-svn: 123591	2011-01-16 18:02:57 +00:00
Chris Lattner	35a2e65bcb	fix PR8514, a bug where the "heroic" transformation of shift/and into and/shift would cause nodes to move around and a dangling pointer to happen. The code tried to avoid this with a HandleSDNode, but got the details wrong. llvm-svn: 123578	2011-01-16 08:48:11 +00:00
Chris Lattner	218092e68e	fix PR8981, a crash trying to form a conditional inc with a floating point compare. llvm-svn: 123560	2011-01-16 02:56:53 +00:00
Chris Lattner	2d186574a6	reapply my fix for PR8961 with a tweak to properly handle multi-instruction sequences like calls. Many thanks to Jakob for finding a testcase. llvm-svn: 123559	2011-01-16 02:27:38 +00:00
Chris Lattner	e93e4f118c	revert my fastisel patch again which apparently still gives the llvm-gcc-i386-linux-selfhost buildbot heartburn... llvm-svn: 123431	2011-01-14 06:14:33 +00:00
Chris Lattner	5ca1391003	reapply r123414 now that the botz are calmed down and the fix is already in. llvm-svn: 123427	2011-01-14 04:24:28 +00:00
Chris Lattner	21a64979f1	r123414 broke llvm-gcc bootstrap apparently, revert llvm-svn: 123422	2011-01-14 02:07:32 +00:00
Chris Lattner	0c34cb429e	fix PR8961 - a fast isel miscompilation where we'd insert a new instruction after sext's generated for addressing that got folded. Previously we compiled test5 into: _test5: ## @test5 ## BB#0: movq -8(%rsp), %rax ## 8-byte Reload movq (%rdi,%rax), %rdi addq %rdx, %rdi movslq %esi, %rax movq %rax, -8(%rsp) ## 8-byte Spill movq %rdi, %rax ret which is insane and wrong. Now we produce: _test5: ## @test5 ## BB#0: movslq %esi, %rax movq (%rdi,%rax), %rax addq %rdx, %rax ret llvm-svn: 123414	2011-01-14 00:01:01 +00:00
Eric Christopher	da2d2f4d1f	Experiment with changing the default 32-bit linux stack alignment to 16 bytes for PR8969. Update all testcases accordingly. llvm-svn: 123367	2011-01-13 06:47:10 +00:00
Jakob Stoklund Olesen	74ded57bb8	Try again enabling LiveDebugVariables. llvm-svn: 123342	2011-01-12 23:36:21 +00:00
Jakob Stoklund Olesen	43812bfa92	The world is not ready for LiveDebugVariables yet. llvm-svn: 123290	2011-01-11 23:20:33 +00:00
Jakob Stoklund Olesen	8c98495f43	Enable LiveDebugVariables by default. llvm-svn: 123282	2011-01-11 22:45:28 +00:00
Dale Johannesen	d2b48119b0	Fix PR 8916 (qv for analysis), at least the immediate problem. There's an inherent tension in DAGCombine between assuming that things will be put in canonical form, and the Depth mechanism that disables transformations when recursion gets too deep. It would not surprise me if there's a lot of little bugs like this one waiting to be discovered. The mechanism seems fragile and I'd suggest looking at it from a design viewpoint. llvm-svn: 123191	2011-01-10 21:53:07 +00:00
Evan Cheng	078b0b095e	Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Evan Cheng	6eb516dbea	Do not model all INLINEASM instructions as having unmodelled side effects. Instead encode llvm IR level property "HasSideEffects" in an operand (shared with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check the operand when the instruction is an INLINEASM. This allows memory instructions to be moved around INLINEASM instructions. llvm-svn: 123044	2011-01-07 23:50:32 +00:00
Devang Patel	acbee0b0d9	Speculatively revert r123032. llvm-svn: 123039	2011-01-07 22:33:41 +00:00
Devang Patel	6381e1584c	Appropriately truncate debug info range in dwarf output. Enable live debug variables pass. llvm-svn: 123032	2011-01-07 21:30:41 +00:00
Evan Cheng	a048c83fe4	Revert r122955. It seems using movups to lower memcpy can cause massive regression (even on Nehalem) in edge cases. I also didn't see any real performance benefit. llvm-svn: 123015	2011-01-07 19:35:30 +00:00
Benjamin Kramer	1ec7ecce86	Try to unbreak the arm buildbot. llvm-svn: 122999	2011-01-07 11:35:21 +00:00
Duncan Sands	61c5708b51	Fix the other problem reported in PR8582. Testcase and patch by Nadav Rotem. llvm-svn: 122983	2011-01-06 23:45:22 +00:00
Evan Cheng	7998b1d6fe	Use movups to lower memcpy and memset even if it's not fast (like corei7). The theory is it's still faster than a pair of movq / a quad of movl. This will probably hurt older chips like P4 but should run faster on current and future Intel processors. rdar://8817010 llvm-svn: 122955	2011-01-06 07:58:36 +00:00
Evan Cheng	3ae2b79aa3	Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy etc. takes an option OptSize. If OptSize is true, it would return the inline limit for functions with attribute OptSize. llvm-svn: 122952	2011-01-06 06:52:41 +00:00
Evan Cheng	c052ba7ff3	Revert r122936. I'll re-implement the change. llvm-svn: 122949	2011-01-06 06:17:53 +00:00
Bill Wendling	2b898548e8	Fix test to coincide with r122934 change from PR8919. llvm-svn: 122937	2011-01-06 01:09:35 +00:00
Evan Cheng	06536e7158	r105228 reduced the memcpy / memset inline limit to 4 with -Os to avoid blowing up freebsd bootloader. However, this doesn't make much sense for Darwin, whose -Os is meant to optimize for size only if it doesn't hurt performance. rdar://8821501 llvm-svn: 122936	2011-01-06 01:04:47 +00:00
Evan Cheng	ac730dd2d1	Avoid zero extend bit test operands to pointer type if all the masks fit in the original type of the switch statement key. rdar://8781238 llvm-svn: 122935	2011-01-06 01:02:44 +00:00
Evan Cheng	260acf32ee	Optimize: r1025 = s/zext r1024, 4 r1026 = extract_subreg r1025, 4 to: r1026 = copy r1024 llvm-svn: 122925	2011-01-05 23:06:49 +00:00
Chris Lattner	872908fdeb	fix PR8900, a shuffle miscompilation. Patch by Nadav Rotem! llvm-svn: 122921	2011-01-05 22:28:46 +00:00
Evan Cheng	65089fc6c7	Use pushq / popq instead of subq $8, %rsp / addq $8, %rsp to adjust stack in prologue and epilogue if the adjustment is 8. Similarly, use pushl / popl if the adjustment is 4 in 32-bit mode. In the epilogue, takes care to pop to a caller-saved register that's not live at the exit (either return or tailcall instruction). rdar://8771137 llvm-svn: 122783	2011-01-03 22:53:22 +00:00
Benjamin Kramer	25e6e06e42	Try to reuse the value when lowering memset. This allows us to compile: void test(char *s, int a) { __builtin_memset(s, a, 15); } into 1 mul + 3 stores instead of 3 muls + 3 stores. llvm-svn: 122710	2011-01-02 19:57:05 +00:00
Benjamin Kramer	2fdea4c8f1	Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors. We could implement a DAGCombine to turn x * 0x0101 back into logic operations on targets that doesn't support the multiply or it is slow (p4) if someone cares enough. Example code: void test(char *s, int a) { __builtin_memset(s, a, 4); } before: _test: ## @test movzbl 8(%esp), %eax movl %eax, %ecx shll $8, %ecx orl %eax, %ecx movl %ecx, %eax shll $16, %eax orl %ecx, %eax movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret after: _test: ## @test movzbl 8(%esp), %eax imull $16843009, %eax, %eax ## imm = 0x1010101 movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret llvm-svn: 122707	2011-01-02 19:44:58 +00:00
Rafael Espindola	abe3eaa481	Fix darwin bots. llvm-svn: 122672	2011-01-01 21:58:41 +00:00
Rafael Espindola	d606e54757	Add support for the 'H' modifier. llvm-svn: 122667	2011-01-01 20:58:46 +00:00
NAKAMURA Takumi	dbacfb73e0	test/CodeGen/X86/negative-sin.ll: FileCheck-ize. llvm-svn: 122619	2010-12-29 03:58:47 +00:00
NAKAMURA Takumi	74835a22d8	test/CodeGen/X86/fp-in-intregs.ll: FileCheck-ize. llvm-svn: 122618	2010-12-29 03:58:36 +00:00
Benjamin Kramer	1f4dfbbcb0	DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. example code: unsigned foo(unsigned x, unsigned y) { if (x != 0) y--; return y; } before: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] sbbl %eax, %eax ## encoding: [0x19,0xc0] notl %eax ## encoding: [0xf7,0xd0] addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08] ret ## encoding: [0xc3] after: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08] adcl $-1, %eax ## encoding: [0x83,0xd0,0xff] ret ## encoding: [0xc3] llvm-svn: 122455	2010-12-22 23:17:45 +00:00
Benjamin Kramer	6020ed9d99	X86: Lower a select directly to a setcc_carry if possible. int test(unsigned long a, unsigned long b) { return -(a < b); } compiles to _test: ## @test cmpq %rsi, %rdi ## encoding: [0x48,0x39,0xf7] sbbl %eax, %eax ## encoding: [0x19,0xc0] ret ## encoding: [0xc3] instead of _test: ## @test xorl %ecx, %ecx ## encoding: [0x31,0xc9] cmpq %rsi, %rdi ## encoding: [0x48,0x39,0xf7] movl $-1, %eax ## encoding: [0xb8,0xff,0xff,0xff,0xff] cmovael %ecx, %eax ## encoding: [0x0f,0x43,0xc1] ret ## encoding: [0xc3] llvm-svn: 122451	2010-12-22 23:09:28 +00:00
Chris Lattner	cafc1e60bb	Fix a bug in ReduceLoadWidth that wasn't handling extending loads properly. We miscompiled the testcase into: _test: ## @test movl $128, (%rdi) movzbl 1(%rdi), %eax ret Now we get a proper: _test: ## @test movl $128, (%rdi) movsbl (%rdi), %eax movzbl %ah, %eax ret This fixes PR8757. llvm-svn: 122392	2010-12-22 08:02:57 +00:00
Dale Johannesen	a94e36bbee	Reapply 122353-122355 with fixes. 122354 was wrong; the shift type was needed one place, the shift count type another. The transform in 123555 had the same problem. llvm-svn: 122366	2010-12-21 21:55:50 +00:00
Benjamin Kramer	f6ddc4a1de	Add some x86 specific dagcombines for conditional increments. (add Y, (sete X, 0)) -> cmp X, 1; adc 0, Y (add Y, (setne X, 0)) -> cmp X, 1; sbb -1, Y (sub (sete X, 0), Y) -> cmp X, 1; sbb 0, Y (sub (setne X, 0), Y) -> cmp X, 1; adc -1, Y for unsigned foo(unsigned a, unsigned b) { if (a == 0) b++; return b; } we now get: foo: cmpl $1, %edi movl %esi, %eax adcl $0, %eax ret instead of: foo: testl %edi, %edi sete %al movzbl %al, %eax addl %esi, %eax ret llvm-svn: 122364	2010-12-21 21:41:44 +00:00
Dale Johannesen	87c47499c6	Revert 122353-122355 for the moment, they broke stuff. llvm-svn: 122360	2010-12-21 21:22:27 +00:00
Dale Johannesen	caf42aa6a4	Add a new transform to DAGCombiner. llvm-svn: 122355	2010-12-21 20:10:51 +00:00
Dale Johannesen	fa5dc82fda	Get the type of a shift from the shift, not from its shift count operand. These should be the same but apparently are not always, and this is cleaner anyway. This improves the code in an existing test. llvm-svn: 122354	2010-12-21 20:06:19 +00:00
Dale Johannesen	0a291a36f2	Cosmetic changes. llvm-svn: 122259	2010-12-20 20:10:50 +00:00
Chris Lattner	5c00d41688	now that addc/adde are gone, "ADDC" in the X86 backend uses EFLAGS results, the same as setcc. Optimize ADDC(0,0,FLAGS) -> SET_CARRY(FLAGS). This is a step towards finishing off PR5443. In the testcase in that bug we now get: movq %rdi, %rax addq %rsi, %rax sbbq %rcx, %rcx testb $1, %cl setne %dl ret instead of: movq %rdi, %rax addq %rsi, %rax movl $0, %ecx adcq $0, %rcx testq %rcx, %rcx setne %dl ret llvm-svn: 122219	2010-12-20 01:37:09 +00:00
Chris Lattner	46b9efcad7	We lower setb to sbb with the hope that the and will go away, when it doesn't, match it back to setb. On a 64-bit version of the testcase before we'd get: movq %rdi, %rax addq %rsi, %rax sbbb %dl, %dl andb $1, %dl ret now we get: movq %rdi, %rax addq %rsi, %rax setb %dl ret llvm-svn: 122217	2010-12-20 01:16:03 +00:00
Mon P Wang	075a16b09e	Add comment for testcase for 122206 llvm-svn: 122210	2010-12-20 00:54:26 +00:00
Mon P Wang	1064992c84	Prevents PerformShuffleCombine from creating a node with an illegal type after legalize types has run, e.g., prevent creating an i64 node from a v2i64 when i64 is not a legal type. llvm-svn: 122206	2010-12-19 23:55:53 +00:00
Chris Lattner	9edf3f50bf	improve the setcc -> setcc_carry optimization to happen more consistently by moving it out of lowering into dag combine. Add some missing patterns for matching away extended versions of setcc_c. llvm-svn: 122201	2010-12-19 22:08:31 +00:00
Chris Lattner	ff392ab3ed	now that generic vector types aren't selected onto MMX registers, these tests don't need -disable-mmx. llvm-svn: 122188	2010-12-19 20:12:58 +00:00
Chris Lattner	77a8a71414	fix PR8642: if a critical edge has a PHI value that can trap, isel is required to split the edge. PHI values get evaluated on the edge, not in their predecessor block. llvm-svn: 122170	2010-12-19 04:58:57 +00:00
Benjamin Kramer	4b4f3557de	Just rename the functions, relying on matching a instruction that has the same name as a symbol is way too fragile. llvm-svn: 122154	2010-12-18 14:23:57 +00:00
Benjamin Kramer	5f6219d929	Test more than just label names and make test work on non-x86 hosts. llvm-svn: 122153	2010-12-18 14:07:28 +00:00
Nate Begeman	97b72c99d2	Add support for matching psign & plendvb to the x86 target Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts llvm-svn: 122098	2010-12-17 22:55:37 +00:00
Dale Johannesen	cd538afa52	Add a transform to DAG Combiner. This improves the code for the case where 32-bit divide by constant is turned into 64-bit multiply by constant. 8771012. llvm-svn: 122090	2010-12-17 21:45:49 +00:00
Evan Cheng	b7ff5a0f20	Teach machine cse to commute instructions. llvm-svn: 121903	2010-12-15 22:16:21 +00:00
Chris Lattner	15090e1eb0	take care of some todos, transforming [us]mul_lohi into a wider mul if the wider mul is legal. llvm-svn: 121848	2010-12-15 06:04:19 +00:00
Chris Lattner	c3301e970c	merge two tests llvm-svn: 121847	2010-12-15 05:58:59 +00:00
Evan Cheng	19dc77cec6	Fix a minor bug in two-address pass. It was missing a commute opportunity. regB = move RCX regA = op regB, regC RAX = move regA where both regB and regC are killed. If regB is constrainted to non-compatible physical registers but regC is not constrainted at all, then it's better to commute the instruction. movl %edi, %eax shlq $32, %rcx leaq (%rcx,%rax), %rax => movl %edi, %eax shlq $32, %rcx orq %rcx, %rax rdar://8762995 llvm-svn: 121793	2010-12-14 21:34:53 +00:00
Chris Lattner	8e21a02c19	rename test llvm-svn: 121697	2010-12-13 08:39:40 +00:00
Chris Lattner	10bd29f1d4	Add a couple dag combines to transform mulhi/mullo into a wider multiply when the wider type is legal. This allows us to compile: define zeroext i16 @test1(i16 zeroext %x) nounwind { entry: %div = udiv i16 %x, 33 ret i16 %div } into: test1: # @test1 movzwl 4(%esp), %eax imull $63551, %eax, %eax # imm = 0xF83F shrl $21, %eax ret instead of: test1: # @test1 movw $-1985, %ax # imm = 0xFFFFFFFFFFFFF83F mulw 4(%esp) andl $65504, %edx # imm = 0xFFE0 movl %edx, %eax shrl $5, %eax ret Implementing rdar://8760399 and example #4 from: http://blog.regehr.org/archives/320 We should implement the same thing for [su]mul_hilo, but I don't have immediate plans to do this. llvm-svn: 121696	2010-12-13 08:39:01 +00:00
Nate Begeman	8b08f5232b	Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX". Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable. llvm-svn: 121439	2010-12-10 00:26:57 +00:00
Eric Christopher	a8aaaee379	Rewrite the darwin tlv support to use a chain and return to copying the output to the correct register. Fixes a hidden problem uncovered by the last patch where we'd try to DAG combine our MVT::Other node oddly. llvm-svn: 121358	2010-12-09 06:25:53 +00:00
Eric Christopher	d84970ae8b	Remove extraneous copy from DAG conversion for darwin tls. This was popping up at O0 when it wasn't folded and the fast allocator would complain. llvm-svn: 121330	2010-12-09 00:27:58 +00:00
Eric Christopher	6a21b40bd6	Move this test to tlv* to make it easier to notice versus linux tls support. llvm-svn: 121316	2010-12-08 23:33:23 +00:00
Devang Patel	c24048a718	If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0. llvm-svn: 121059	2010-12-06 22:39:26 +00:00
Rafael Espindola	dee3062373	Revert previous two patches while I try to find out how to make both linux and darwin assemblers happy :-( llvm-svn: 121004	2010-12-06 15:35:15 +00:00
Rafael Espindola	884d58a798	Update test for the extra =. llvm-svn: 121001	2010-12-06 15:05:36 +00:00
Chris Lattner	6886171792	Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. This allows us to compile: void *test12(long count) { return new int[count]; } into: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx movq $-1, %rdi cmovnoq %rax, %rdi jmp __Znam ## TAILCALL instead of: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx seto %cl testb %cl, %cl movq $-1, %rdi cmoveq %rax, %rdi jmp __Znam Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'. llvm-svn: 120936	2010-12-05 07:49:54 +00:00
Chris Lattner	364bb0a081	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Chris Lattner	183ddd8ed3	fix the rest of the linux miscompares :) llvm-svn: 120933	2010-12-05 02:08:07 +00:00
Chris Lattner	116580a11c	generalize the previous check to handle -1 on either side of the select, inserting a not to compensate. Add a missing isZero check that I lost somehow. This improves codegen of: void *func(long count) { return new int[count]; } from: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] to: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] cmpq $1, %rdx ## encoding: [0x48,0x83,0xfa,0x01] sbbq %rdi, %rdi ## encoding: [0x48,0x19,0xff] notq %rdi ## encoding: [0x48,0xf7,0xd7] orq %rax, %rdi ## encoding: [0x48,0x09,0xc7] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] llvm-svn: 120932	2010-12-05 02:00:51 +00:00
Chris Lattner	77a11c6174	relax this to handle linux defaulting to -static. llvm-svn: 120930	2010-12-05 01:31:13 +00:00
Chris Lattner	342e6ea5f9	Improve an integer select optimization in two ways: 1. generalize (select (x == 0), -1, 0) -> (sign_bit (x - 1)) to: (select (x == 0), -1, y) -> (sign_bit (x - 1)) \| y 2. Handle the identical pattern that happens with !=: (select (x != 0), y, -1) -> (sign_bit (x - 1)) \| y cmov is often high latency and can't fold immediates or memory operands. For example for (x == 0) ? -1 : 1, before we got: < testb %sil, %sil < movl $-1, %ecx < movl $1, %eax < cmovel %ecx, %eax now we get: > cmpb $1, %sil > sbbl %eax, %eax > orl $1, %eax llvm-svn: 120929	2010-12-05 01:23:24 +00:00
Chris Lattner	0523388d60	merge some tests into select.ll and make them more specific. llvm-svn: 120928	2010-12-05 01:13:58 +00:00
Chris Lattner	b89b6f17da	rename test llvm-svn: 120927	2010-12-05 01:02:23 +00:00
Chris Lattner	d4f8c9641a	remove two tests that aren't really testing anything. llvm-svn: 120926	2010-12-05 01:02:13 +00:00
Benjamin Kramer	2f489236ab	Add patterns for the x86 popcnt instruction. - Also adds a new POPCNT subtarget feature that is currently enabled if the target supports SSE4.2 (nehalem) or SSE4A (barcelona). llvm-svn: 120917	2010-12-04 20:32:23 +00:00
Devang Patel	88d794c628	Hide tests, that check .loc, .file in output assembly, from darwin9 buildbot. llvm-svn: 120750	2010-12-02 23:29:58 +00:00
Devang Patel	8cabd938ed	Use set directive for StartMinusEndExpr. This is a fix for llvm-gcc-i386-darwin9 buildbot failure. llvm-svn: 120742	2010-12-02 21:32:30 +00:00
Evan Cheng	5709254bd5	Fix test. llvm-svn: 120730	2010-12-02 20:17:34 +00:00
Evan Cheng	419ea286ee	Fix and re-enable tail call optimization of expanded libcalls. llvm-svn: 120622	2010-12-01 22:59:46 +00:00
Evan Cheng	a695abde49	Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot. llvm-svn: 120549	2010-12-01 03:27:20 +00:00
Evan Cheng	d4b0873c06	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Eric Christopher	8e9fbcf0f0	Not all platforms use _<func>. Duh. llvm-svn: 120418	2010-11-30 09:23:54 +00:00
Eric Christopher	fa6657cec0	Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573. llvm-svn: 120404	2010-11-30 07:20:12 +00:00
Benjamin Kramer	e6840ef4b3	Fix some broken CHECK lines. llvm-svn: 120332	2010-11-29 22:34:55 +00:00
Rafael Espindola	5d882894d8	Lower TLS_addr32 and TLS_addr64. llvm-svn: 120225	2010-11-27 20:43:02 +00:00
Benjamin Kramer	24656c9583	Implement the "if (X == 6 \|\| X == 4)" -> "if ((X\|2) == 6)" optimization. This currently only catches the most basic case, a two-case switch, but can be extended later. llvm-svn: 119964	2010-11-22 09:45:38 +00:00
Dale Johannesen	c9242c577e	Prefetch has a MemOperand now. FileCheckize a test. This finishes up 8460971. llvm-svn: 119848	2010-11-19 21:49:38 +00:00
Mon P Wang	88ff56caa3	Make isScalarToVector to return false if the node is a scalar. This will prevent DAGCombine from making an illegal transformation of bitcast of a scalar to a vector into a scalar_to_vector. llvm-svn: 119819	2010-11-19 19:08:12 +00:00
Duncan Sands	12f3b3b44f	The DAGCombiner was threading select over pairs of extending loads even if the extension types were not the same. The result was that if you fed a select with sext and zext loads, as in the testcase, then it would get turned into a zext (or sext) of the select, which is wrong in the cases when it should have been an sext (resp. zext). Reported and diagnosed by Sebastien Deldon. llvm-svn: 119728	2010-11-18 20:05:18 +00:00
Rafael Espindola	67c6ab8865	Change CodeGen to use .loc directives. This produces a lot more readable output and testing is easier. A good example is the unknown-location.ll test that now can just look for ".loc 1 0 0". We also don't use a DW_LNE_set_address for every address change anymore. llvm-svn: 119613	2010-11-18 02:04:25 +00:00
Dale Johannesen	ed0d840838	Do not throw away alignment when generating the DAG for memset; we may need it to decide between MOVAPS and MOVUPS later. Adjust a test that was looking for wrong code. PR 3866 / 8675131. llvm-svn: 119605	2010-11-18 01:35:23 +00:00
John Thompson	488051f1a6	Fixed to use input redirection for source - to eliminate .s output. llvm-svn: 119599	2010-11-18 00:50:20 +00:00
John Thompson	ddc7ce548c	Bug 8621 fix - pointer cast stripped from inline asm constraint argument. llvm-svn: 119590	2010-11-17 23:58:47 +00:00
Peter Collingbourne	feea10bcdf	Recognise 32-bit ror-based bswap implementation used by uclibc llvm-svn: 119007	2010-11-13 19:54:30 +00:00
Dan Gohman	6cf9bb45ad	Remove the memmove->memcpy optimization from CodeGen. MemCpyOpt does this. llvm-svn: 118789	2010-11-11 16:24:49 +00:00
Duncan Sands	e5276f11ee	Testcase for PR8211 (llc crash at -O0). llvm-svn: 118509	2010-11-09 16:22:27 +00:00
Dan Gohman	5db8921422	Fix DAGCombiner to avoid folding a sext-in-reg or similar through a shl in order to fold it into a load. llvm-svn: 118471	2010-11-09 01:54:35 +00:00
Dan Gohman	4677bafd85	Delete an extraneous svn:executable property. llvm-svn: 118470	2010-11-09 01:51:06 +00:00
Dale Johannesen	f11ea9ce61	Fix an inline asm pasto from 117667; was preventing {i64, i64} from matching i128. llvm-svn: 118465	2010-11-09 01:15:07 +00:00
Chris Lattner	ca7801e472	go to great lengths to work around a GAS bug my previous patch exposed: GAS doesn't accept "fcomip %st(1)", it requires "fcomip %st(1), %st(0)" even though st(0) is implicit in all other fp stack instructions. Fortunately, there is an alias for fcomip named "fcompi" and gas does accept the default argument for the alias (boggle!). As such, switch the canonical form of this instruction to "pi" instead of "ip". This makes the code generator and disassembler generate pi, avoiding the gas bug. llvm-svn: 118356	2010-11-06 21:37:06 +00:00
Dale Johannesen	c7d82d58b5	This test assumes SSE is present; that is not the default on non-X86 hosts. Hopefully fixes ppc-host buildbot. llvm-svn: 118182	2010-11-03 18:08:41 +00:00
Dan Gohman	68fb004616	Fix DAGCombiner to avoid going into an infinite loop when it encounters (and:i64 (shl:i64 (load:i64), 1), 0xffffffff). This fixes rdar://8606584. llvm-svn: 118143	2010-11-03 01:47:46 +00:00
John Thompson	beffa5bef1	Inline asm mult-alt constraint tests. llvm-svn: 118107	2010-11-02 23:01:44 +00:00
Devang Patel	94f2a2578c	Use frameindex, if available, as a last resort to emit debug info for a parameter. llvm-svn: 118020	2010-11-02 17:01:30 +00:00
Dale Johannesen	9c3f6bf2bf	Fix pastos in handling of AVX cvttsd2si, PR8491. Bruno, please review, but I'm pretty sure this is right. Patch by Alex Mac! llvm-svn: 117514	2010-10-28 00:35:54 +00:00
Dale Johannesen	ec57ac1c3c	An stdcall function calling a non-stdcall function cannot use tailcall. PR 8461. llvm-svn: 117322	2010-10-25 22:17:05 +00:00
Michael J. Spencer	aa19ee17c0	X86: Emit _fltused instead of __fltused on Windows x64. llvm-svn: 117205	2010-10-23 09:06:59 +00:00
Evan Cheng	87066f0677	More accurate estimate / tracking of register pressure. - Initial register pressure in the loop should be all the live defs into the loop. Not just those from loop preheader which is often empty. - When an instruction is hoisted, update register pressure from loop preheader to the original BB. - Treat only use of a virtual register as kill since the code is still SSA. llvm-svn: 116956	2010-10-20 22:03:58 +00:00
Evan Cheng	63c7608c34	Re-enable register pressure aware machine licm with fixes. Hoist() may have erased the instruction during LICM so UpdateRegPressureAfter() should not reference it afterwards. llvm-svn: 116845	2010-10-19 18:58:51 +00:00
Daniel Dunbar	418204e523	Revert r116781 "- Add a hook for target to determine whether an instruction def is", which breaks some nightly tests. llvm-svn: 116816	2010-10-19 17:14:24 +00:00
Evan Cheng	8249dfe6ce	- Add a hook for target to determine whether an instruction def is "long latency" enough to hoist even if it may increase spilling. Reloading a value from spill slot is often cheaper than performing an expensive computation in the loop. For X86, that means machine LICM will hoist SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON instructions. - Enable register pressure aware machine LICM by default. llvm-svn: 116781	2010-10-19 00:55:07 +00:00
Michael J. Spencer	5e683250ee	X86-Windows: Emit an undefined global __fltused symbol when targeting Windows if any floating point arguments are passed to an external function. llvm-svn: 116665	2010-10-16 08:25:41 +00:00
Rafael Espindola	2216af3fa8	Fix another case where we were preferring instructions with large immediates instead of 8 bits ones. llvm-svn: 116410	2010-10-13 17:14:25 +00:00
Rafael Espindola	8ea9b0eb32	Fix PR8365 by adding a more specialized Pat that checks if an 'and' with 8 bit constants can be used. llvm-svn: 116403	2010-10-13 13:31:20 +00:00
Eric Christopher	a237bdbe52	FileCheckize this in a hope to quiet a valgrind warning on grep. llvm-svn: 116376	2010-10-12 23:47:58 +00:00
Andrew Trick	3e02306fed	PR8297 llvm-svn: 116223	2010-10-11 21:08:42 +00:00
Andrew Trick	e01c9001c9	Fixes bug 8297: i386 cmpxchg8b, missing MachineMemOperand llvm-svn: 116214	2010-10-11 19:02:04 +00:00
Michael J. Spencer	00765e5be0	X86: MinGW should always use libgcc on Windows. llvm-svn: 116177	2010-10-10 23:11:06 +00:00
Michael J. Spencer	7a573a5e1f	X86: Call _alldiv instead of __divdi3 on Windows (excluding cygwin). llvm-svn: 116174	2010-10-10 22:04:34 +00:00
Cameron Esfahani	d57f9ecd4a	Recommit 116056, now with the missing file... llvm-svn: 116083	2010-10-08 19:24:18 +00:00
Andrew Trick	cf97db2402	reverting 116056: win64_params.ll may need to be conditionalized? llvm-svn: 116063	2010-10-08 17:22:42 +00:00
Cameron Esfahani	a07b5c291d	Small patch to restore home register stack space allocation for the Win64 case. Add test case. This code eventually needs to be tighter, since it's always allocating it, even in leaf routines. llvm-svn: 116056	2010-10-08 10:31:30 +00:00
Chris Lattner	3e210eb398	testcase that goes with r116053 llvm-svn: 116054	2010-10-08 05:12:30 +00:00
Chris Lattner	8ed76f87cf	rename test llvm-svn: 116052	2010-10-08 05:05:06 +00:00
Chris Lattner	420cf26d99	merge tests llvm-svn: 116051	2010-10-08 05:04:58 +00:00
Chris Lattner	6a8a65cb43	filecheckize. llvm-svn: 116050	2010-10-08 05:02:29 +00:00
Chris Lattner	dd77477690	reapply: Use the new TB_NOT_REVERSABLE flag instead of special reapply: reimplement the second half of the or/add optimization. We should now with no changes. Turns out that one missing "Defs = [EFLAGS]" can upset things a bit. llvm-svn: 116040	2010-10-08 03:57:25 +00:00
Daniel Dunbar	efdf08b5b8	Revert "reimplement the second half of the or/add optimization. We should now", which depends on r116007, which I am about to revert. llvm-svn: 116031	2010-10-08 02:07:26 +00:00
Chris Lattner	134f415bf8	reimplement the second half of the or/add optimization. We should now only end up emitting LEA instead of OR. If we aren't able to promote something into an LEA, we should never be emitting it as an ADD. Add some testcases that we emit "or" in cases where we used to produce an "add". llvm-svn: 116026	2010-10-08 01:05:10 +00:00
Chris Lattner	ae8d67d3bb	convert cmp to use a multipattern llvm-svn: 115978	2010-10-07 20:56:25 +00:00
Evan Cheng	5c31bf0619	Canonicalize X86ISD::MOVDDUP nodes to v2f64 to make sure all cases match. Also eliminate unneeded isel patterns. rdar://8520311 llvm-svn: 115977	2010-10-07 20:50:20 +00:00
Bill Wendling	10a0fdeab5	PSHUFW is in SSE, not SSSE3. llvm-svn: 115691	2010-10-05 21:58:12 +00:00
Owen Anderson	d8d1dcc09a	Use a more efficient lowering of uint64_t --> float that can take advantage of hardware signed integer conversion without having to do a double cast (uint64_t --> double --> float). This is based on the algorithm from compiler_rt's __floatundisf for X86-64. llvm-svn: 115634	2010-10-05 17:24:05 +00:00
NAKAMURA Takumi	7681b41720	test/CodeGen/X86/atomic_op.ll: Rename @main to @func. Extra sequences will be inserted to @main as prologue on cygming, to fail. llvm-svn: 115611	2010-10-05 11:16:24 +00:00
Anton Korobeynikov	d77a443631	va_args support for Win64. Patch by Cameron! llvm-svn: 115480	2010-10-03 22:52:07 +00:00
Anton Korobeynikov	ff85688559	Properly emit stack probe on win64 (for non-mingw targets). Based on the patch by Cameron Esfahani! llvm-svn: 115479	2010-10-03 22:02:38 +00:00
Chris Lattner	f909b07340	unbreak buildbot llvm-svn: 115476	2010-10-03 20:02:48 +00:00
Bill Wendling	5d9089ae14	Add test to make sure that the MMX intrinsic calls make it out the other end in tact. llvm-svn: 115458	2010-10-03 03:30:30 +00:00
Bill Wendling	bf73fe5e8d	Need to specify SSE4 for machines which don't have SSE4. The code checked for is generated by SSE4. Otherwise, we get something else. llvm-svn: 115352	2010-10-01 21:39:35 +00:00
Bill Wendling	a39904e6b9	We must check for something. llvm-svn: 115309	2010-10-01 10:20:10 +00:00
Bill Wendling	0e5e4b7b76	Disable tests until I can figure out why they're failing on just two machines but not others. llvm-svn: 115308	2010-10-01 10:01:10 +00:00
Bill Wendling	b3a1022572	Try adding an mtriple. llvm-svn: 115307	2010-10-01 09:40:50 +00:00
Bill Wendling	3b2b1e7942	FileCheck-ize this test. llvm-svn: 115304	2010-10-01 08:55:48 +00:00
Bill Wendling	9b6853c6eb	FileCheck-ize this test. llvm-svn: 115303	2010-10-01 08:50:12 +00:00
Chris Lattner	a205055857	fix rdar://8494845 + PR8244 - a miscompile exposed by my patch in r101350 llvm-svn: 115294	2010-10-01 05:36:09 +00:00
Dale Johannesen	f419de0852	One more +sse2. llvm-svn: 115293	2010-10-01 05:08:18 +00:00
Dale Johannesen	bb6b961867	Mark all these as needing SSE2. Should fix PPC and maybe even Linux. llvm-svn: 115291	2010-10-01 04:17:55 +00:00
Dale Johannesen	ab60ae3cf3	Disable these tests for now; it's not obvious why they fail on Linux. llvm-svn: 115257	2010-10-01 00:59:21 +00:00
Dale Johannesen	c6f17f7420	Make test not sensitive to register choice. llvm-svn: 115250	2010-10-01 00:16:17 +00:00
Dale Johannesen	dd224d2333	Massive rewrite of MMX: The x86_mmx type is used for MMX intrinsics, parameters and return values where these use MMX registers, and is also supported in load, store, and bitcast. Only the above operations generate MMX instructions, and optimizations do not operate on or produce MMX intrinsics. MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into smaller pieces. Optimizations may occur on these forms and the result casted back to x86_mmx, provided the result feeds into a previous existing x86_mmx operation. The point of all this is prevent optimizations from introducing MMX operations, which is unsafe due to the EMMS problem. llvm-svn: 115243	2010-09-30 23:57:10 +00:00
NAKAMURA Takumi	bb995ae261	test/CodeGen/X86/sibcall.ll: Add explicit triplets and remove XFAIL: apple-darwin8. llvm-svn: 115215	2010-09-30 22:02:06 +00:00
Jakob Stoklund Olesen	eb12f49fb7	Try again to disable critical edge splitting in CodeGenPrepare. The bug that broke i386 linux has been fixed in r115191. llvm-svn: 115204	2010-09-30 20:51:52 +00:00
Jakob Stoklund Olesen	665aa6efcc	When isel is emitting instructions for an x86 target without CMOV, the CFG is edited during emission. If the basic block ends in a switch that gets lowered to a jump table, any phis at the default edge were getting updated wrong. The jump table data structure keeps a pointer to the header blocks that wasn't getting updated after the MBB is split. This bug was exposed on 32-bit Linux when disabling critical edge splitting in codegen prepare. The fix is to uipdate stale MBB pointers whenever a block is split during emission. llvm-svn: 115191	2010-09-30 19:44:31 +00:00
Bill Wendling	cc91601211	And remove r114997's test. llvm-svn: 115003	2010-09-28 23:24:18 +00:00
Bill Wendling	b0b2c57149	Revert r114997. It was causing a failure on darwin10-selfhost. llvm-svn: 115002	2010-09-28 23:11:55 +00:00
Bill Wendling	d848beb1e5	Fix a FIXME. _foo.eh symbols are currently always exported so that the linker knows about them. This is not necessary on 10.6 and later. llvm-svn: 114997	2010-09-28 22:36:56 +00:00
Jakob Stoklund Olesen	415a7a6fec	Revert "Disable codegen prepare critical edge splitting. Machine instruction passes now" This reverts revision 114633. It was breaking llvm-gcc-i386-linux-selfhost. It seems there is a downstream bug that is exposed by -cgp-critical-edge-splitting=0. When that bug is fixed, this patch can go back in. Note that the changes to tailcallfp2.ll are not reverted. They were good are required. llvm-svn: 114859	2010-09-27 18:43:48 +00:00
Evan Cheng	794aaa79e2	Disable codegen prepare critical edge splitting. Machine instruction passes now break critical edges on demand. llvm-svn: 114633	2010-09-23 06:55:34 +00:00
Owen Anderson	3231d13ddd	A select between a constant and zero, when fed by a bit test, can be efficiently lowered using a series of shifts. Fixes <rdar://problem/8285015>. llvm-svn: 114599	2010-09-22 22:58:22 +00:00
Cameron Esfahani	bbb9287080	Fix PR8201: Update the code to call via X86::CALL64pcrel32 in the 64-bit case. llvm-svn: 114597	2010-09-22 22:35:21 +00:00
Chris Lattner	bd85725341	Fix an inconsistency in the x86 backend that led it to reject "calll foo" on x86-32: 32-bit calls were named "call" not "calll". 64-bit calls were correctly named "callq", so this only impacted x86-32. This fixes rdar://8456370 - llvm-mc rejects 'calll' This also exposes that mingw/64 is generating a 32-bit call instead of a 64-bit call, I will file a bugzilla. llvm-svn: 114534	2010-09-22 05:49:14 +00:00
Chris Lattner	8a236b63d8	reimplement elf TLS support in terms of addressing modes, eliminating SegmentBaseAddress. llvm-svn: 114529	2010-09-22 04:39:11 +00:00
Chris Lattner	505af598d0	linux has a different stack alignment than the mac, relax this a bit. llvm-svn: 114519	2010-09-22 00:46:26 +00:00
Chris Lattner	54e5329545	give VZEXT_LOAD a memory operand, it now works with segment registers. llvm-svn: 114515	2010-09-22 00:34:38 +00:00
Chris Lattner	07827ba978	revert r114386 now that address modes work correctly, we get a nice call through gs-relative memory now. llvm-svn: 114510	2010-09-22 00:11:31 +00:00
Chris Lattner	e479e9643b	give LCMPXCHG_DAG[8] a memory operand, allowing it to work with addrspace 256/257 llvm-svn: 114508	2010-09-21 23:59:42 +00:00
Chris Lattner	0cefa51114	filecheckize llvm-svn: 114507	2010-09-21 23:57:27 +00:00
Devang Patel	d92f42d1d0	Use FileCheck llvm-svn: 114475	2010-09-21 20:50:32 +00:00
Owen Anderson	f4b1a5bdc4	When adding the carry bit to another value on X86, exploit the fact that the carry-materialization (sbbl x, x) sets the registers to 0 or ~0. Combined with two's complement arithmetic, we can fold the intermediate AND and the ADD into a single SUB. This fixes <rdar://problem/8449754>. llvm-svn: 114460	2010-09-21 18:41:19 +00:00
Chris Lattner	bb0a1c44bf	fix rdar://8453210, a crash handling a call through a GS relative load. For now, just disable folding the load into the call. llvm-svn: 114386	2010-09-21 03:37:00 +00:00
Evan Cheng	f3e9a48584	Enable machine sinking critical edge splitting. e.g. define double @foo(double %x, double %y, i1 %c) nounwind { %a = fdiv double %x, 3.2 %z = select i1 %c, double %a, double %y ret double %z } Was: _foo: divsd LCPI0_0(%rip), %xmm0 testb $1, %dil jne LBB0_2 movaps %xmm1, %xmm0 LBB0_2: ret Now: _foo: testb $1, %dil je LBB0_2 divsd LCPI0_0(%rip), %xmm0 ret LBB0_2: movaps %xmm1, %xmm0 ret This avoids the divsd when early exit is taken. rdar://8454886 llvm-svn: 114372	2010-09-20 22:52:00 +00:00
Owen Anderson	272ff94916	When TCO is turned on, it is possible to end up with aliasing FrameIndex's. Therefore, CombinerAA cannot assume that different FrameIndex's never alias, but can instead use MachineFrameInfo to get the actual offsets of these slots and check for actual aliasing. This fixes CodeGen/X86/2010-02-19-TailCallRetAddrBug.ll and CodeGen/X86/tailcallstack64.ll when CombinerAA is enabled, modulo a different register allocation sequence. llvm-svn: 114348	2010-09-20 20:39:59 +00:00
NAKAMURA Takumi	b912c27fc9	test/CodeGen/X86: Add explicit triplet -mtriple=i686-linux to 3 tests incompatible to Win32 codegen. r114297 raises 3 failures. They might fail also on mingw. llvm-svn: 114317	2010-09-19 21:58:55 +00:00
Owen Anderson	b92b13d8a0	Invert the logic of reachesChainWithoutSideEffects(). What we want to check is that there is NO path to the destination containing side effects, not that SOME path contains no side effects. In practice, this only manifests with CombinerAA enabled, because otherwise the chain has little to no branching, so "any" is effectively equivalent to "all". llvm-svn: 114268	2010-09-18 04:45:14 +00:00
Evan Cheng	e53ab6dffc	Teach machine sink to 1) Do forward copy propagation. This makes it easier to estimate the cost of the instruction being sunk. 2) Break critical edges on demand, including cases where the value is used by PHI nodes. Critical edge splitting is not yet enabled by default. llvm-svn: 114227	2010-09-17 22:28:18 +00:00
Dan Gohman	534db8a5c8	Avoid emitting a PIC base register if no PIC addresses are needed. This fixes rdar://8396318. llvm-svn: 114201	2010-09-17 20:24:24 +00:00
Dale Johannesen	f95f59a0c2	When substituting sunkaddrs into indirect arguments an asm, we were walking the asm arguments once and stashing their Values. This is wrong because the same memory location can be in the list twice, and if the first one has a sunkaddr substituted, the stashed value for the second one will be wrong (use-after-free). PR 8154. llvm-svn: 114104	2010-09-16 18:30:55 +00:00
Bruno Cardoso Lopes	e8501a468c	Add one more pattern to fallback movddup llvm-svn: 113522	2010-09-09 18:48:34 +00:00
Devang Patel	3f4abf397c	remove these tests for now. llvm-svn: 113293	2010-09-07 22:03:44 +00:00
Devang Patel	b0af23a1f6	There is no need to force target if the test is going to run on other x86 platforms. llvm-svn: 113285	2010-09-07 20:59:09 +00:00
Devang Patel	e50b23e223	Fix command line used to link these test cases. llvm-svn: 113237	2010-09-07 18:17:56 +00:00
Devang Patel	9dc0e5be58	Reintroduce dbg-declare tests. llvm-svn: 113232	2010-09-07 18:01:49 +00:00
Devang Patel	688338eec3	Remove last three tests. I need to make them independent of my setup. llvm-svn: 113213	2010-09-07 17:08:57 +00:00
Devang Patel	55a3bab0d2	Add a test case to check handling of dbg-declare during hybrid mode where we begin using fast-isel but switch back to DAG building at some point. llvm-svn: 113210	2010-09-07 17:03:44 +00:00
Devang Patel	29a775adf1	Add a test case to check handling of dbg-declare by selection DAG builder. llvm-svn: 113209	2010-09-07 16:56:35 +00:00
Devang Patel	184c81c3e2	Add a test case to check handling of dbg-declare by fast-isel. llvm-svn: 113208	2010-09-07 16:40:53 +00:00
Chris Lattner	eeba0c73e5	implement rdar://6653118 - fastisel should fold loads where possible. Since mem2reg isn't run at -O0, we get a ton of reloads from the stack, for example, before, this code: int foo(int x, int y, int z) { return x+y+z; } used to compile into: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx movl 4(%rsp), %esi addl %edx, %esi movl (%rsp), %edx addl %esi, %edx movl %edx, %eax addq $12, %rsp ret Now we produce: _foo: ## @foo subq $12, %rsp movl %edi, 8(%rsp) movl %esi, 4(%rsp) movl %edx, (%rsp) movl 8(%rsp), %edx addl 4(%rsp), %edx ## Folded load addl (%rsp), %edx ## Folded load movl %edx, %eax addq $12, %rsp ret Fewer instructions and less register use = faster compiles. llvm-svn: 113102	2010-09-05 02:18:34 +00:00
Dale Johannesen	367afb5a00	Remove the rest of the nonexistent 64-bit AVX instructions. Bruno, please review. llvm-svn: 113014	2010-09-03 21:23:00 +00:00
NAKAMURA Takumi	24d039ebe3	test/CodeGen/X86: Add explicit -mtriple=(i686\|x86_64)-linux for Win32 host. llvm-svn: 112947	2010-09-03 03:24:08 +00:00
Bruno Cardoso Lopes	d6634a5b2e	AVX doesn't support mm operations neither its instrinsics. The AVX versions of PALIGN and PABS* should only exist for 128-bit. Remove the unnecessary stuff. llvm-svn: 112944	2010-09-03 02:08:45 +00:00
Anton Korobeynikov	a5a645559c	Properly emit __chkstk call instead of __alloca on non-mingw windows targets. Patch by Cameron Esfahani! llvm-svn: 112902	2010-09-02 23:03:46 +00:00
Dan Gohman	3c9b5f394b	Don't narrow the load and store in a load+twiddle+store sequence unless there are clearly no stores between the load and the store. This fixes this miscompile reported as PR7833. This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is safe, but awkward to prove safe. Move it to X86's README.txt. llvm-svn: 112861	2010-09-02 21:18:42 +00:00
NAKAMURA Takumi	a224e5563e	test/loop-strength-reduce4: Add explicit triplet for Win32 host. llvm-svn: 112802	2010-09-02 03:45:58 +00:00
NAKAMURA Takumi	54ce546865	test/twoaddr-coalesce: Do not use @main . Win32 codegen emits implicit invoking __main into, to fail. llvm-svn: 112801	2010-09-02 03:45:51 +00:00
Bruno Cardoso Lopes	fea81b4831	Using target specific nodes for shuffle nodes makes the mask check more strict, breaking some cases not checked in the testsuite, but also exposes some foldings not done before, as this example: movaps (%rdi), %xmm0 movaps (%rax), %xmm1 movaps %xmm0, %xmm2 movss %xmm1, %xmm2 shufps $36, %xmm2, %xmm0 now is generated as: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 movlps (%rax), %xmm1 shufps $36, %xmm1, %xmm0 llvm-svn: 112753	2010-09-01 22:33:20 +00:00
Jakob Stoklund Olesen	4b6fd48bba	Teach RemoveCopyByCommutingDef to check all aliases, not just subregisters. This caused a miscompilation in WebKit where %RAX had conflicting defs when RemoveCopyByCommutingDef was commuting a %EAX use. llvm-svn: 112751	2010-09-01 22:15:35 +00:00
Dan Gohman	110ed64fbb	Revert 112442 and 112440 until the compile time problems introduced by 112440 are resolved. llvm-svn: 112692	2010-09-01 01:45:53 +00:00
Chris Lattner	34bfab0ad5	two changes: 1) nuke ConstDataCoalSection, which is dead. 2) revise my previous patch for rdar://8018335, which was completely wrong. Specifically, it doesn't make sense to mark __TEXT,__const_coal as PURE_INSTRUCTIONS, because it is for readonly data. templates (it turns out) go to const_coal_nt. The real fix for rdar://8018335 was to give ConstTextCoalSection a section kind of ReadOnly instead of Text. llvm-svn: 112496	2010-08-30 18:12:35 +00:00
Duncan Sands	68c30907cc	Correct bogus module triple specifications. llvm-svn: 112469	2010-08-30 10:48:29 +00:00
Dan Gohman	3a08ed7904	Make IVUsers iterative instead of recursive. This has the side effect of reversing the order of most of IVUser's results. llvm-svn: 112442	2010-08-29 16:40:03 +00:00
Dan Gohman	6665550bca	Make this test less dependent on register allocation choices. llvm-svn: 112426	2010-08-29 14:49:42 +00:00
Chris Lattner	c2887bc283	merge a bunch of shuffle tests into sse2.ll llvm-svn: 112398	2010-08-29 03:19:04 +00:00
Chris Lattner	b1ff978406	add some nounwind's llvm-svn: 112396	2010-08-29 03:07:47 +00:00
Chris Lattner	94656b1c8c	fix the buildvector->insertp[sd] logic to not always create a redundant insertp[sd] $0, which is a noop. Before: _f32: ## @f32 pshufd $1, %xmm1, %xmm2 pshufd $1, %xmm0, %xmm3 addss %xmm2, %xmm3 addss %xmm1, %xmm0 ## kill: XMM0<def> XMM0<kill> XMM0<def> insertps $0, %xmm0, %xmm0 insertps $16, %xmm3, %xmm0 ret after: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movdqa %xmm2, %xmm0 insertps $16, %xmm3, %xmm0 ret The extra movs are due to a random (poor) scheduling decision. llvm-svn: 112379	2010-08-28 17:59:08 +00:00
Chris Lattner	bcb6090ad0	fix the BuildVector -> unpcklps logic to not do pointless shuffles when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff because only the low 2 elements of a 4 element vector are defined. For example, on: _Complex float f32(_Complex float A, _Complex float B) { return A+B; } We used to produce (with SSE2, SSE4.1+ uses insertps): _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $16, %xmm2, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm0 addss %xmm1, %xmm0 pshufd $16, %xmm0, %xmm1 movdqa %xmm2, %xmm0 unpcklps %xmm1, %xmm0 ret We now produce: _f32: ## @f32 movdqa %xmm0, %xmm2 addss %xmm1, %xmm2 pshufd $1, %xmm1, %xmm1 pshufd $1, %xmm0, %xmm3 addss %xmm1, %xmm3 movaps %xmm2, %xmm0 unpcklps %xmm3, %xmm0 ret This implements rdar://8368414 llvm-svn: 112378	2010-08-28 17:28:30 +00:00
Dan Gohman	e06905d1f0	Completely disable tail calls when fast-isel is enabled, as fast-isel doesn't currently support dealing with this. llvm-svn: 112341	2010-08-28 00:51:03 +00:00
Chris Lattner	7413e87b6d	get this test passing on linux builders. llvm-svn: 112280	2010-08-27 18:49:08 +00:00
Daniel Dunbar	1844a71e66	X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler. llvm-svn: 112250	2010-08-27 01:30:14 +00:00
Chris Lattner	af23e9a798	Add a hackaround for PR7993 which is causing failures on x86 builders that lack sse2. llvm-svn: 112175	2010-08-26 06:57:07 +00:00
Chris Lattner	66afba7aa4	I think enough general codegen bugs are fixed to allow this to work on random hosts, lets see! llvm-svn: 112172	2010-08-26 05:52:42 +00:00
Chris Lattner	eb2cc0ce0e	implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1. llvm-svn: 112171	2010-08-26 05:51:22 +00:00
Chris Lattner	825294b85f	Make sure this forces the x86 targets llvm-svn: 112169	2010-08-26 05:25:05 +00:00
Chris Lattner	cc60609cb4	fix sse1 only codegen in x86-64 mode, which is something we apparently try to support. llvm-svn: 112168	2010-08-26 05:24:29 +00:00

... 3 4 5 6 7 ...

2470 Commits