llvm-project

Commit Graph

Author	SHA1	Message	Date
Dan Gohman	0695e09b09	Optimize the "bit test" code path for switch lowering in the case where the bit mask has exactly one bit. llvm-svn: 106716	2010-06-24 02:06:24 +00:00
Bill Wendling	a136521a17	MorphNodeTo doesn't preserve the memory operands. Because we're morphing a node into the same node, but with different non-memory operands, we need to replace the memory operands after it's finished morphing. llvm-svn: 106643	2010-06-23 18:16:24 +00:00
Daniel Dunbar	4df321b7ad	Revert r106263, "Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass,"... it was causing both 'file' (with clang) and 176.gcc (with llvm-gcc) to be miscompiled. llvm-svn: 106634	2010-06-23 17:09:26 +00:00
Daniel Dunbar	ef5a4383ad	Revert r106066, "Create a more targeted fix for not sinking instructions into a range where it"... it causes bzip2 to be miscompiled by Clang. Conflicts: lib/CodeGen/MachineSink.cpp llvm-svn: 106614	2010-06-23 00:48:25 +00:00
Dan Gohman	f1cf963c64	Loosen up this test so that it doesn't depend as much on register allocation details. llvm-svn: 106599	2010-06-22 23:32:47 +00:00
Dan Gohman	1081f1a0f5	Fix OptimizeMax to handle an odd case where one of the max operands is another max which folds. This fixes PR7454. llvm-svn: 106594	2010-06-22 23:07:13 +00:00
Dale Johannesen	6d4802ba6c	Add SSE so these actually pass on non-X86 hosts. llvm-svn: 106575	2010-06-22 20:54:03 +00:00
Mon P Wang	825639e849	Move v-binop-widen tests to X86 since they don't work on all platforms llvm-svn: 106562	2010-06-22 19:40:50 +00:00
Jakob Stoklund Olesen	9c47dac677	Remove the SimpleJoin optimization from SimpleRegisterCoalescing. Measurements show that it does not speed up coalescing, so there is no reason to keep the added complexity around. Also clean out some unused methods and static functions. llvm-svn: 106548	2010-06-22 16:13:57 +00:00
Dan Gohman	3c1b3c61e9	Teach two-address lowering how to unfold a load to open up commuting opportunities. For example, this lets it emit this: movq (%rax), %rcx addq %rdx, %rcx instead of this: movq %rdx, %rcx addq (%rax), %rcx in the case where %rdx has subsequent uses. It's the same number of instructions, and usually the same encoding size on x86, but it appears faster, and in general, it may allow better scheduling for the load. llvm-svn: 106493	2010-06-21 22:17:20 +00:00
Dan Gohman	2dd1d3d182	Make this test more robust in case LLVM ever decides to align the global variable differently. llvm-svn: 106454	2010-06-21 19:56:27 +00:00
Eric Christopher	bf572c7cea	Add some codegen patterns for x86_64-linux-gnu tls codegen matching. Based on a patch by Patrick Marlier! llvm-svn: 106433	2010-06-21 18:21:27 +00:00
Dan Gohman	51d00092b6	Include the use kind along with the expression in the key of the use sharing map. The reconcileNewOffset logic already forces a separate use if the kinds differ, so incorporating the kind in the key means we can track more sharing opportunities. More sharing means fewer total uses to track, which means smaller problem sizes, which means the conservative throttles don't kick in as often. llvm-svn: 106396	2010-06-19 21:29:59 +00:00
Dan Gohman	99ba4dac59	Don't maintain a set of deleted nodes; instead, use a HandleSDNode to track a node over CSE events. This fixes PR7368. llvm-svn: 106266	2010-06-18 01:24:29 +00:00
Dan Gohman	b92156d5e4	Fold the ShrinkDemandedOps pass into the regular DAGCombiner pass, which is faster, simpler, and less surprising. llvm-svn: 106263	2010-06-18 01:05:21 +00:00
Dan Gohman	30d7a51d6c	Make this test less fragile. llvm-svn: 106255	2010-06-18 00:06:03 +00:00
Bill Wendling	8c0cf0994d	Create a more targeted fix for not sinking instructions into a range where it will conflict with another live range. The place which creates this scenario is the code in X86 that lowers a select instruction by splitting the MBBs. This eliminates the need to check from the bottom up in an MBB for live pregs. llvm-svn: 106066	2010-06-15 23:46:31 +00:00
Jakob Stoklund Olesen	ec2e964fd6	Remove the local register allocator. Please use the fast allocator instead. llvm-svn: 106051	2010-06-15 21:58:33 +00:00
Chris Lattner	874c92bd47	fix fastisel to handle GS and FS relative pointers. Patch by Nelson Elhage! llvm-svn: 106031	2010-06-15 19:08:40 +00:00
Jakob Stoklund Olesen	246e9a07a2	Avoid processing early clobbers twice in RegAllocFast. Early clobbers defining a virtual register were first allocated to a physreg and then processed as a physreg EC, spilling the virtreg. This fixes PR7382. llvm-svn: 105998	2010-06-15 16:20:57 +00:00
Chris Lattner	00ab615406	apparently lots of dupes. llvm-svn: 105956	2010-06-14 20:19:03 +00:00
Chris Lattner	faa7bdccbf	fix a nasty bug where we were not treating available_externally symbols as declarations in the X86 backend. This would manifest on darwin x86-32 as errors like this with -fvisibility=hidden: symbol '__ZNSbIcED1Ev' can not be undefined in a subtraction expression This fixes PR7353. llvm-svn: 105954	2010-06-14 20:11:56 +00:00
Chris Lattner	bbb798c7d1	remove old test. llvm-svn: 105953	2010-06-14 20:07:43 +00:00
Chris Lattner	b30f87b74e	rename test llvm-svn: 105952	2010-06-14 20:07:34 +00:00
Bill Wendling	d53a2cb4ac	Testcase for r105741. llvm-svn: 105750	2010-06-09 20:30:22 +00:00
Jakob Stoklund Olesen	8bc5eca331	Mark physregs defined by inline asm as implicit. This is a bit of a hack to make inline asm look more like call instructions. It would be better to produce correct dead flags during isel. llvm-svn: 105749	2010-06-09 20:05:00 +00:00
Dan Gohman	bbfb6aca92	LSR needs to remember inserted instructions even in postinc mode, because there could be multiple subexpressions within a single expansion which require insert point adjustment. This fixes PR7306. llvm-svn: 105510	2010-06-05 00:33:07 +00:00
Dan Gohman	538b413ccb	Fix normalization and de-normalization of non-affine SCEVs. llvm-svn: 105480	2010-06-04 19:16:34 +00:00
Mon P Wang	622cdd2297	Fixed a bug during widening where we would avoid legalizing a node. When we replace an OpA with a widened OpB, it is possible to get new uses of OpA due to CSE when recursively updating nodes. Since OpA has been processed, the new uses are not examined again. The patch checks if this occurred and if it did, updates the new uses of OpA to use OpB. llvm-svn: 105453	2010-06-04 01:20:10 +00:00
Dan Gohman	8fdda8a655	This test doesn't need the ssp attribute. llvm-svn: 105440	2010-06-04 00:14:48 +00:00
Dan Gohman	d83e3e7750	Fix SimplifyDemandedBits' AssertZext logic to demand all the bits. It needs to demand the high bits because it's asserting that they're zero. llvm-svn: 105406	2010-06-03 20:21:33 +00:00
Bill Wendling	f82aea634c	Machine sink could potentially sink instructions into a block where the physical registers it defines then interfere with an existing preg live range. For instance, if we had something like these machine instructions: BB#0 ... = imul ... EFLAGS<imp-def,dead> test ..., EFLAGS<imp-def> jcc BB#2 EFLAGS<imp-use> BB#1 ... ; fallthrough to BB#2 BB#2 ... ; No code that defines EFLAGS jcc ... EFLAGS<imp-use> Machine sink will come along, see that imul implicitly defines EFLAGS, but because it's "dead", it assumes that it can move imul into BB#2. But when it does, imul's "dead" imp-def of EFLAGS is raised from the dead (a zombie) and messes up the condition code for the jump (and pretty much anything else which relies upon it being correct). The solution is to know which pregs are live going into a basic block. However, that information isn't calculated at this point. Nor does the LiveVariables pass take into account non-allocatable physical registers. In lieu of this, we do a very conservative pass through the basic block to determine if a preg is live coming out of it. llvm-svn: 105387	2010-06-03 07:54:20 +00:00
Eric Christopher	f67fe3b1e8	One underscore, not two. llvm-svn: 105379	2010-06-03 04:02:59 +00:00
Dan Gohman	b782caa393	Fill in missing support for ISD::FEXP, ISD::FPOWI, and friends. llvm-svn: 105283	2010-06-01 18:35:14 +00:00
Chris Lattner	14c46517b5	fix PR6623: when optimizing for size, don't inline memcpy/memsets that are too large. This causes the freebsd bootloader to be too large apparently. It's unclear if this should be an -Os or -Oz thing. Thoughts welcome. llvm-svn: 105228	2010-05-31 17:30:14 +00:00
Chris Lattner	291a189cda	upgrade and filecheckize this test. llvm-svn: 105227	2010-05-31 17:27:17 +00:00
Evan Cheng	707b7cc429	Remove schedule-livein-copies. It's not being used. llvm-svn: 105095	2010-05-29 02:23:39 +00:00
Evan Cheng	27c4933e02	Fix PR7193: if sibling call address can take a register, make sure there are enough registers available by counting inreg arguments. llvm-svn: 105092	2010-05-29 01:35:22 +00:00
Jakob Stoklund Olesen	2085089c49	Fix more tests that depended on the default register allocator choice. llvm-svn: 104961	2010-05-28 17:06:30 +00:00
Dan Gohman	2140a74979	Eliminate the restriction that the array size in an alloca must be i32. This will help reduce the amount of casting required on 64-bit targets. llvm-svn: 104911	2010-05-28 01:14:11 +00:00
Jakob Stoklund Olesen	b613ae2c89	Add a -regalloc=default option that chooses a register allocator based on the -O optimization level. This only really affects llc for now because both the llvm-gcc and clang front ends override the default register allocator. I intend to remove that code later. llvm-svn: 104904	2010-05-27 23:57:25 +00:00
Devang Patel	6b9a9fe207	Simplify. Eliminate unneeded debug_loc entry. llvm-svn: 104785	2010-05-26 23:55:23 +00:00
Devang Patel	1b08572a66	Update debug info when live-in reg is copied into a vreg. llvm-svn: 104732	2010-05-26 20:18:50 +00:00
Dale Johannesen	053dd21c84	Testcase for 104624/104619/PR7191/8023512. Reduced from one provided by Duncan Sands, thanks! llvm-svn: 104710	2010-05-26 17:55:45 +00:00
Dale Johannesen	cd4ba6caba	Removing test; Chris thinks it's better to have the bug go untested than have a testcase this large. So be it. llvm-svn: 104632	2010-05-25 20:40:10 +00:00
Dale Johannesen	60fe2cdc4f	Fix another variant of PR 7191. Also add a testcase Mon Ping provided; unfortunately bugpoint failed to reduce it, but I think it's important to have a test for this in the suite. 8023512. llvm-svn: 104624	2010-05-25 18:47:23 +00:00
Eric Christopher	64087cd346	This test is darwin only. Make it so(tm). llvm-svn: 104418	2010-05-22 00:55:55 +00:00
Eric Christopher	6fdea1bda8	Add full bss data support for darwin tls variables. llvm-svn: 104414	2010-05-22 00:10:22 +00:00
Chris Lattner	0735ecfe17	now that fp reg kill insertion stuff happens as a separate pass after isel instead of being interlaced with it, we can trust that all the code for a function has been isel'd before it is run. The practical impact of this is that we can scan for machine instr phis instead of doing a fuzzy match on the LLVM BB for phi nodes. Doing the fuzzy match required knowing when isel would produce an fp reg stack phi which was gross. It was also wrong in cases where select got lowered to a branch tree because cmovs aren't available (PR6828). Just do the scan on machine phis which is simpler, faster and more correct. This fixes PR6828. llvm-svn: 104333	2010-05-21 18:17:54 +00:00
Dale Johannesen	b3b9c8ac48	Fix i64->f64 conversion, x86-64, -no-sse. A bit tricky since there's a 3rd 64-bit type, MMX vectors. PR 7135. llvm-svn: 104308	2010-05-21 00:52:33 +00:00
Dan Gohman	ee2fea3cd7	When canonicalizing icmp operand order to put the loop invariant operand on the left, the interesting operand is on the right. This fixes a bug where LSR was failing to recognize ICmpZero uses, which led it to be unable to reverse the induction variable in the attached testcase. Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test is extremely fragile and hard to meaningfully update. llvm-svn: 104262	2010-05-20 19:26:52 +00:00
Dan Gohman	887dd1cd31	When converting a test to a cmp to fold a load, use the cmp that has an 8-bit immediate field rather than one with a wider immediate field. llvm-svn: 104064	2010-05-18 21:42:03 +00:00
Daniel Dunbar	a4820fcc78	MC/X86: Implement custom lowering to make sure we match things like X86::ADC32ri $0, %eax to X86::ADC32i32 $0 llvm-svn: 104030	2010-05-18 17:22:24 +00:00
Dale Johannesen	f92c344167	Removing as part of previous reversion. llvm-svn: 103915	2010-05-16 20:19:40 +00:00
Dale Johannesen	2ef974ee0e	Revert 103911; it broke a test that expects bitconvert <1xi64> -> i64 to work in MMX registers on hosts where -no-sse is the default (not mine). The right thing is to accept this and make i64->f64 conversions go through memory, but I don't have time right now. llvm-svn: 103914	2010-05-16 20:19:04 +00:00
Dale Johannesen	fc1492d71b	Make x86-64 64-bit bitconvert work when SSE is not available. (This worked as of about 6 months ago and I didn't track down exactly what broke it; I think this fix is appropriate.) llvm-svn: 103911	2010-05-16 18:22:38 +00:00
Anton Korobeynikov	8f35fabbc1	Add support for thiscall calling convention. Patch by Charles Davis and Steven Watanabe! llvm-svn: 103902	2010-05-16 09:08:45 +00:00
Jakob Stoklund Olesen	4d5c1061e3	Simplify the handling of physreg defs and uses in RegAllocFast. This adds extra security against using clobbered physregs, and it adds kill markers to physreg uses. llvm-svn: 103784	2010-05-14 18:03:25 +00:00
Jakob Stoklund Olesen	0ba2e2a568	Take allocation hints from copy instructions to/from physregs. This causes way more identity copies to be generated, ripe for coalescing. llvm-svn: 103686	2010-05-13 00:19:43 +00:00
Jakob Stoklund Olesen	955a0e71e9	Make sure to add kill flags to the last use of a virtreg when it is redefined. The X86 floating point stack pass and others depend on good kill flags. llvm-svn: 103635	2010-05-12 18:46:03 +00:00
Jakob Stoklund Olesen	e6e39dc310	Enable a bunch more -regalloc=fast tests llvm-svn: 103531	2010-05-12 00:11:24 +00:00
Jakob Stoklund Olesen	84c881e593	One more -regalloc=fast test llvm-svn: 103509	2010-05-11 20:51:07 +00:00
Jakob Stoklund Olesen	3f0241e0f9	Simplify the tracking of used physregs to a bulk bitor followed by a transitive closure after allocating all blocks. Add a few more test cases for -regalloc=fast. llvm-svn: 103500	2010-05-11 20:30:28 +00:00
Jakob Stoklund Olesen	f1b3029a54	Mostly rewrite RegAllocFast. Sorry for the big change. The path leading up to this patch had some TableGen changes that I didn't want to commit before I knew they were useful. They weren't, and this version does not need them. The fast register allocator now does no liveness calculations. Instead it relies on kill flags provided by isel. (Currently those kill flags are also ignored due to isel bugs). The allocation algorithm is supposed to work with any subset of valid kill flags. More kill flags simply means fewer spills inserted. Registers are allocated from a working set that contains no aliases. That means most allocations can be done directly without expensive alias checks. When the working set runs out of registers we do the full alias check to find new free registers. llvm-svn: 103488	2010-05-11 18:54:45 +00:00
Evan Cheng	02947a4551	Be careful with operand promotion. For a binary operation, the source operands may be the same. PR7018. rdar://7939869. llvm-svn: 103419	2010-05-10 19:03:57 +00:00
Bill Wendling	cd476b6760	Readd testcase. llvm-svn: 103335	2010-05-08 04:47:54 +00:00
Dan Gohman	d0800241d2	When pruning candidate formulae out of an LSRUse, update the LSRUse's Regs set after all pruning is done, rather than trying to do it on the fly, which can produce an incomplete result. This fixes a case where heuristic pruning was stripping all formulae from a use, which led the solver to enter an infinite loop. Also, add a few asserts to diagnose this kind of situation. llvm-svn: 103328	2010-05-07 23:36:59 +00:00
Bill Wendling	6b5897b4de	Remove. Don't XFAIL. llvm-svn: 103321	2010-05-07 23:09:17 +00:00
Bill Wendling	32d8981ec0	Temporarily revert r101984. llvm-svn: 103314	2010-05-07 22:45:36 +00:00
Dale Johannesen	51c1695a0a	Fix PR 7087, and probably other things, by extending getConstantFP to accept the two supported long double target types. This was not the original intent, but there are other places that assume this works and it's easy enough to do. llvm-svn: 103299	2010-05-07 21:35:53 +00:00
Duncan Sands	ebf838274f	Correct some bogus target triples. llvm-svn: 103265	2010-05-07 17:03:48 +00:00
Nick Lewycky	45f530db39	Revert r103133 and add testcase from PR7066. llvm-svn: 103233	2010-05-07 01:45:38 +00:00
Dan Gohman	7421ae48bf	Disable the new unknown-location code for now. It causes a major increase in the debug line info section, and it's causing regressions in a gdb testsuite. llvm-svn: 103226	2010-05-07 01:08:53 +00:00
Dan Gohman	779c69bbc5	Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that it doesn't have to guess. llvm-svn: 103194	2010-05-06 20:33:48 +00:00
Dan Gohman	cb4e3e51a9	Add a testcase for r103135, explicitly representing unknown locations in debug line info. llvm-svn: 103189	2010-05-06 17:49:17 +00:00
Chris Lattner	35096e82c5	Fix PR7054 - Assertion `Symbol->isUndefined() && "Cannot define a symbol twice!"' failed. Users can write broken code that emits the same label twice with asm renaming, detect this and emit a fatal backend error instead of aborting. llvm-svn: 103140	2010-05-06 00:05:37 +00:00
Jakob Stoklund Olesen	1b6f698e85	Fix PR6520. An earlyclobber physreg must not be allocated to anything else. llvm-svn: 103133	2010-05-05 23:07:41 +00:00
Jakob Stoklund Olesen	f4e4e84115	Check that subregisters don't have independent values in RemoveCopyByCommutingDef(). This fixes PR6941. llvm-svn: 102970	2010-05-03 22:40:32 +00:00
Dan Gohman	0553acff5e	Fix tests to use fadd, fsub, and fmul, instead of add, sub, and mul, when the type is floating-point. llvm-svn: 102969	2010-05-03 22:36:46 +00:00
Dan Gohman	2ad68de4aa	Fix a bug which prevented tail merging of return instructions in beneficial cases. See the changes in test/CodeGen/X86/tail-opts.ll and test/CodeGen/ARM/ifcvt2.ll for details. The fix is to change HashEndOfMBB to hash at most one instruction, instead of trying to apply heuristics about when it will be profitable to consider more than one instruction. The regular tail-merging heuristics are already prepared to handle the same cases, and they're more precise. Also, make test/CodeGen/ARM/ifcvt5.ll and test/CodeGen/Thumb2/thumb2-branch.ll slightly more complex so that they continue to test what they're intended to test. And, this eliminates the problem in test/CodeGen/Thumb2/2009-10-15-ITBlockBranch.ll, the testcase from PR5204. Update it accordingly. llvm-svn: 102907	2010-05-03 14:35:47 +00:00
Duncan Sands	211427bda9	Remove the -enable-sjlj-eh option, which doesn't do anything. Remove the -enable-eh option which is only used by the JIT, and replace it with -jit-enable-eh. llvm-svn: 102865	2010-05-02 15:36:26 +00:00
Bill Wendling	02bc6787ca	Test failing too much on too many platforms. llvm-svn: 102812	2010-05-01 00:12:33 +00:00
Bill Wendling	06cacb1291	Maybe it needs sse2? llvm-svn: 102802	2010-04-30 23:19:29 +00:00
Bill Wendling	613fb7daa6	Force 64-bit. llvm-svn: 102800	2010-04-30 22:45:20 +00:00
Bill Wendling	de4b225093	EXTRACT_VECTOR_ELT of an INSERT_VECTOR_ELT may have the same index, but the indexes could be of a different value type. Or not even using the same SDNode for the constant (weird, I know). Compare the actual values instead of the pointers. llvm-svn: 102791	2010-04-30 22:19:17 +00:00
Jakob Stoklund Olesen	9afed0f98b	The local register allocator has to spill dirty callee saved registers before a call that might throw. The landing pad assumes that all registers are in stack slots. We used to spill those dirty CSRs after the call, and the stack slots would be wrong when arriving at the landing pad. llvm-svn: 102770	2010-04-30 21:19:29 +00:00
Evan Cheng	5f2314f3a3	Fix test. llvm-svn: 102694	2010-04-30 06:00:56 +00:00
Evan Cheng	5117a555e0	Another sibcall bug. If caller and callee calling conventions differ, then it's only safe to do a tail call if the results are returned in the same way. llvm-svn: 102683	2010-04-30 01:12:32 +00:00
Jakob Stoklund Olesen	8d4214578d	Reject really weird coalescer case when trying to merge identical subregisters of different register classes. e.g. %reg1048:3<def> = EXTRACT_SUBREG %RAX<kill>, 3 Where %reg1048 is a GR32 register. This is not impossible to handle, but it is pretty hard and very rare. This should unbreak the dragonegg builder. llvm-svn: 102672	2010-04-29 23:47:46 +00:00
Evan Cheng	38dfa5cf20	Load folding tail call should not use ebp / rbp after it's popped. PEI should use esp / rsp to reference frame instead. llvm-svn: 102596	2010-04-29 05:08:22 +00:00
Chris Lattner	08e9e72fa9	Rework global alignment computation again. Now we do round up alignment of globals to the preferred alignment, but only when there is no section specified on the global (by far the common case). llvm-svn: 102515	2010-04-28 19:58:07 +00:00
Evan Cheng	050df1b8de	Enable i16 to i32 promotion by default. llvm-svn: 102493	2010-04-28 08:30:49 +00:00
Evan Cheng	fe420adde0	Update tests. llvm-svn: 102487	2010-04-28 01:53:13 +00:00
Devang Patel	50c9431203	Emit debug info for byval parameters. llvm-svn: 102486	2010-04-28 01:39:28 +00:00
Evan Cheng	eb828b6391	Do not count kill, implicit_def instructions as printed instructions. llvm-svn: 102453	2010-04-27 19:38:45 +00:00
Chris Lattner	64d43d80be	round zero-byte .zerofill directives up to 1 byte. This should fix some "g++.dg-struct-layout-1" failures, rdar://7886017 llvm-svn: 102421	2010-04-27 07:41:44 +00:00
Chris Lattner	6a5e706e3c	on darwin empty functions need to codegen into something of non-zero length, otherwise labels get incorrectly merged. We handled this by emitting a ".byte 0", but this isn't correct on thumb/arm targets where the text segment needs to be a multiple of 2/4 bytes. Handle this by emitting a noop. This is more gross than it should be because arm/ppc are not fully mc'ized yet. This fixes rdar://7908505 llvm-svn: 102400	2010-04-26 23:37:21 +00:00
Dan Gohman	58b0470592	When checking whether the special handling for an addrec increment which doesn't dominate the header is needed, don't check whether the increment expression has computable loop evolution. While the operands of an addrec are required to be loop-invariant, they're not required to dominate any part of the loop. This fixes PR6914. llvm-svn: 102389	2010-04-26 21:46:36 +00:00
Chris Lattner	f740a8ceeb	fix PR6921 a different way. Instead of increasing the alignment of globals with a specified alignment, we fix common variables to obey their alignment. Add a comment explaining why this behavior is important. llvm-svn: 102365	2010-04-26 18:46:46 +00:00
Chris Lattner	e80442aa6d	Revert r102300/102301, which seriously broke objc apps. llvm-svn: 102359	2010-04-26 18:30:45 +00:00
Chris Lattner	386a220f70	Fix PR6921: globals were not getting correctly rounded up to their preferred alignment unless they were common or some other special case. llvm-svn: 102300	2010-04-25 05:30:43 +00:00
Dan Gohman	534ba376f6	Generalize LSR's OptimizeMax to handle the new kinds of max expressions that indvars may use, now that indvars is recognizing le and ge loops. llvm-svn: 102235	2010-04-24 03:13:44 +00:00
Stuart Hastings	c8b2fc0909	Per Chris, fuse four trivial tests using grep (r102199) into one that uses FileCheck. llvm-svn: 102216	2010-04-23 22:12:57 +00:00
Dan Gohman	e1931fa676	Change TargetData's algorithm for computing default vector type alignment to match what's used in clang and GCC for __alignof, rather than trying to guess what Legalize is going to be doing. llvm-svn: 102206	2010-04-23 19:41:15 +00:00
Stuart Hastings	24b63f1597	Add some missing x86 patterns for movdq2q. Fixes two (LLVM-)GCC DejaGNU testcases. Radar 6881029. llvm-svn: 102199	2010-04-23 19:03:32 +00:00
Dan Gohman	997bbc54d6	Fix LSR to tolerate cases where ScalarEvolution initially misses an opportunity to fold add operands, but folds them after LSR has separated them out. This fixes rdar://7886751. llvm-svn: 102157	2010-04-23 01:55:05 +00:00
Evan Cheng	02e816b317	Do not try to optimize a copy that has already been marked for deletion. llvm-svn: 102027	2010-04-21 20:57:54 +00:00
Evan Cheng	4158a0ff6b	Implement -disable-non-leaf-fp-elim which disables the frame pointer elimination optimization for non-leaf functions. This will be hooked up to gcc's -momit-leaf-frame-pointer option. rdar://7886181 llvm-svn: 101984	2010-04-21 03:18:23 +00:00
Evan Cheng	2034d9f2da	- Clean up some crappy code which deals with coalescing of copies which look at extract_subreg / insert_subreg, etc. - Add support for more aggressive insert_subreg coalescing. llvm-svn: 101971	2010-04-21 00:44:22 +00:00
Dan Gohman	ad33d33719	Add another variant of this test which found a place where CodeGen's ComputeMaskedBits was being over-conservative when computing bits for an ADD. llvm-svn: 101963	2010-04-21 00:19:28 +00:00
Chris Lattner	84776786a7	teach the x86 address matching stuff to handle (shl (or x,c), 3) the same as (shl (add x, c), 3) when x doesn't have any bits from c set. This finishes off PR1135. Before, we compiled the block to: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) leaq 2(%rdx), %r9 movl %esi, (%rdi,%r9,4) leaq 1(%rdx), %r9 movl %esi, (%rdi,%r9,4) addq $3, %rdx movl %esi, (%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 Now we produce: LBB0_3: ## %bb cmpb $4, %dl sete %dl addb %dl, %cl movb %cl, %dl shlb $2, %dl addb %r8b, %dl shlb $2, %dl movzbl %dl, %edx movl %esi, (%rdi,%rdx,4) movl %esi, 8(%rdi,%rdx,4) movl %esi, 4(%rdi,%rdx,4) movl %esi, 12(%rdi,%rdx,4) incb %r8b decb %al movb %r8b, %dl jne LBB0_1 llvm-svn: 101958	2010-04-20 23:18:40 +00:00
Bill Wendling	a8ae1783b4	Move CodeGen/X86/2010-04-19-DAGCombineCrash.ll into CodeGen/X86/crash.ll. Also reduce. llvm-svn: 101925	2010-04-20 18:14:47 +00:00
Bill Wendling	467e6c2deb	The visitXOR method can return the same SDNode. If so, we don't want to delete it as it's not dead. llvm-svn: 101855	2010-04-20 01:25:01 +00:00
Dan Gohman	4fee6f3bdd	Start function numbering at 0. llvm-svn: 101638	2010-04-17 16:29:15 +00:00
Evan Cheng	3af19e80c9	Add nounwind. llvm-svn: 101613	2010-04-17 03:43:36 +00:00
Jakob Stoklund Olesen	dc6d42dbf8	Add test case for machine-sink on critical edges llvm-svn: 101416	2010-04-15 23:19:16 +00:00
Chris Lattner	3245afdf05	enhance the load/store narrowing optimization to handle a tokenfactor in between the load/store. This allows us to optimize test7 into: _test7: ## @test7 ## BB#0: ## %entry movl (%rdx), %eax ## kill: SIL<def> ESI<kill> movb %sil, 5(%rdi) ret instead of: _test7: ## @test7 ## BB#0: ## %entry movl 4(%esp), %ecx movl $-65281, %eax ## imm = 0xFFFFFFFFFFFF00FF andl 4(%ecx), %eax movzbl 8(%esp), %edx shll $8, %edx addl %eax, %edx movl 12(%esp), %eax movl (%eax), %eax movl %edx, 4(%ecx) ret llvm-svn: 101355	2010-04-15 06:10:49 +00:00
Chris Lattner	6ebd8674eb	teach codegen to turn trunc(zextload) into load when possible. This doesn't occur much at all, it only seems to be formed in the case when the trunc optimization kicks in due to phase ordering. In that case it saves a few bytes on x86-32. llvm-svn: 101350	2010-04-15 05:40:59 +00:00
Chris Lattner	4041ab6e00	Implement rdar://7860110 (also in target/readme.txt) narrowing a load/or/and/store sequence into a narrower store when it is safe. Daniel tells me that clang will start producing this sort of thing with bitfields, and this does trigger a few dozen times on 176.gcc produced by llvm-gcc even now. This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll into: movl %eax, 36(%rdi) instead of: movl $4294967295, %eax ## imm = 0xFFFFFFFF andq 32(%rdi), %rax shlq $32, %rcx addq %rax, %rcx movq %rcx, 32(%rdi) and each of the testcases into a single store. Each of them used to compile into craziness like this: _test4: movl $65535, %eax ## imm = 0xFFFF andl (%rdi), %eax shll $16, %esi addl %eax, %esi movl %esi, (%rdi) ret llvm-svn: 101343	2010-04-15 04:48:01 +00:00
Chris Lattner	60bbb8c356	further tweak this to do something useful. llvm-svn: 101341	2010-04-15 04:31:42 +00:00
Chris Lattner	9ebaf531ab	remove undef control flow. llvm-svn: 101340	2010-04-15 04:30:19 +00:00
Jakob Stoklund Olesen	938f2ae310	Remove unneeded types from test. llvm-svn: 101286	2010-04-14 20:56:09 +00:00
Evan Cheng	6c35893aa6	Add test for post-ra machine licm. llvm-svn: 101182	2010-04-13 22:10:03 +00:00
Evan Cheng	4d89dd8353	Fix test on non-x86 hosts. llvm-svn: 101163	2010-04-13 18:54:04 +00:00
Evan Cheng	4ca4bc6f95	Re-apply 101075 and fix it properly. Just reuse the debug info of the branch instruction being optimized. There is no need to --I which can deref off start of the BB. llvm-svn: 101162	2010-04-13 18:50:27 +00:00
Eric Christopher	d67f66dc0c	Temporarily revert r101075, it's causing invalid iterator assertions in a nightly tester. llvm-svn: 101158	2010-04-13 18:37:58 +00:00
Chris Lattner	5b212a31a2	add llvm codegen support for -ffunction-sections and -fdata-sections, patch by Sylvere Teissier! llvm-svn: 101106	2010-04-13 00:36:43 +00:00
Evan Cheng	d0d8e3343a	Use .set expression for x86 pic jump table reference to reduce assembly relocation. rdar://7738756 llvm-svn: 101085	2010-04-12 23:07:17 +00:00
Bill Wendling	caaf445a01	Third time's a charm... llvm-svn: 101081	2010-04-12 22:43:21 +00:00
Bill Wendling	4fc5a4d8b8	Genericize the label test. llvm-svn: 101079	2010-04-12 22:40:37 +00:00
Bill Wendling	4627917b9a	Correct test to test what I mean it to test. llvm-svn: 101077	2010-04-12 22:25:42 +00:00
Bill Wendling	b02bbe416f	Micro-optimization: If we have this situation: jCC L1 jmp L2 L1: ... L2: ... We can get a small performance boost by emitting this instead: jnCC L2 L1: ... L2: ... This testcase shows an example of this: float func(float x, float y) { double product = (double)x * y; if (product == 0.0) return product; return product - 1.0; } llvm-svn: 101075	2010-04-12 22:19:57 +00:00
Evan Cheng	250283916d	Enable post regalloc machine licm by default. llvm-svn: 101023	2010-04-12 06:25:28 +00:00
Dan Gohman	d23fa7d90d	Merge a few fast-isel tests. llvm-svn: 100860	2010-04-09 15:03:55 +00:00
Evan Cheng	b083c47c21	Coalescer should not delete copy instructions whose defs are partially dead. e.g. %RDI<def,dead> = MOV64rr %RAX<kill>, %EDI<imp-def> llvm-svn: 100804	2010-04-08 20:02:37 +00:00
Evan Cheng	ebe47c872f	Avoid using f64 to lower memcpy from constant string. It's cheaper to use i32 store of immediates. llvm-svn: 100751	2010-04-08 07:37:57 +00:00
Dan Gohman	4506539d84	When expanding expressions which are using post-inc mode for multiple loops, ensure that the expansion is dominated by the increments of those loops. llvm-svn: 100748	2010-04-08 05:57:57 +00:00
Chris Lattner	3ae2dd2ba5	add newlines at the end of files. llvm-svn: 100705	2010-04-07 22:53:17 +00:00
Dan Gohman	d006ab90dd	Generalize IVUsers to track arbitrary expressions rather than expressions explicitly split into stride-and-offset pairs. Also, add the ability to track multiple post-increment loops on the same expression. This refines the concept of "normalizing" SCEV expressions used for post-increment uses, and introduces a dedicated utility routine for normalizing and denormalizing expressions. This fixes the expansion of expressions which are post-increment users of more than one loop at a time. More broadly, this takes LSR another step closer to being able to reason about more than one loop at a time. llvm-svn: 100699	2010-04-07 22:27:08 +00:00
Dale Johannesen	f118f9788b	Split big test into multiple directories to cater to those who don't build all targets. llvm-svn: 100688	2010-04-07 20:43:35 +00:00
Chris Lattner	2c88f8a8c4	this has a pr! llvm-svn: 100637	2010-04-07 18:04:56 +00:00
Chris Lattner	f839ee0c13	fix a latent bug my inline asm stuff exposed: MachineOperand::isIdenticalTo wasn't handling metadata operands. llvm-svn: 100636	2010-04-07 18:03:19 +00:00
Jakob Stoklund Olesen	41051a0bfe	Don't try to collapse DomainValues onto an incompatible SSE domain. This fixes the Bullet regression on i386/nocona. llvm-svn: 100553	2010-04-06 19:48:56 +00:00
Evan Cheng	b7a20ee5b5	Add nounwind. llvm-svn: 100482	2010-04-05 22:30:05 +00:00
Dan Gohman	918a90a3ca	Don't do code sinking on unreachable blocks. It's unprofitable and hazardous. llvm-svn: 100455	2010-04-05 19:17:22 +00:00
Chris Lattner	4e4549deea	resolve a fixme. llvm-svn: 100346	2010-04-04 19:28:59 +00:00
Evan Cheng	61399375a2	Correctly lower memset / memcpy of undef. It should be a nop. PR6767. llvm-svn: 100208	2010-04-02 19:36:14 +00:00
Dan Gohman	4bd755419f	Revert the recent alignment changes. They're broken for -Os because, in particular, they end up aligning strings at 16-byte boundaries, and there's no way for GlobalOpt to check OptForSize. llvm-svn: 100172	2010-04-02 03:04:37 +00:00
Dan Gohman	8ceeeb444e	Remove this initializer so that the optimizer doesn't convert unaligned loads into aligned loads. llvm-svn: 100166	2010-04-02 01:26:13 +00:00
Dan Gohman	ffb9c71174	Update this test for the new preferred alignment heuristics. llvm-svn: 100165	2010-04-02 01:24:08 +00:00
Evan Cheng	f997c31598	In 64-bit mode, use i64 to lower memcpy / memset instead of f64. llvm-svn: 100137	2010-04-01 20:27:45 +00:00
Evan Cheng	4c014c892a	- Avoid using floating point stores to implement memset unless the value is zero. - Do not try to infer GV alignment unless its type is sized. It's not possible to infer alignment if it has opaque type. llvm-svn: 100118	2010-04-01 18:19:11 +00:00
Evan Cheng	1e8ee79957	Add -mcpu to memcpy / memset tests to ensure they behave the same on all hosts / targets. llvm-svn: 100101	2010-04-01 08:25:26 +00:00
Evan Cheng	43cd9e3845	Fix sdisel memcpy, memset, memmove lowering: 1. Makes it possible to lower with floating point loads and stores. 2. Avoid unaligned loads / stores unless it's fast. 3. Fix some memcpy lowering logic bug related to when to optimize a load from constant string into a constant. 4. Adjust x86 memcpy lowering threshold to make it more sane. 5. Fix x86 target hook so it uses vector and floating point memory ops more effectively. rdar://7774704 llvm-svn: 100090	2010-04-01 06:04:33 +00:00
Jakob Stoklund Olesen	9986ba954c	Replace V_SET0 with variants for each SSE execution domain. llvm-svn: 99975	2010-03-31 00:40:13 +00:00
Jakob Stoklund Olesen	710c6892be	Fix typo. Thank you, valgrind. llvm-svn: 99974	2010-03-31 00:40:08 +00:00
Jakob Stoklund Olesen	19aa6f72a0	Not all platforms start symbols with _ llvm-svn: 99959	2010-03-30 23:12:48 +00:00
Jakob Stoklund Olesen	6f6ebb663c	Enable -sse-domain-fix by default. Now with tests! llvm-svn: 99954	2010-03-30 22:47:00 +00:00
Eric Christopher	6ad8167714	Remove the pmulld intrinsic and autoupdate it as a vector multiply. Rewrite the pmulld patterns, and make sure that they fold in loads of arguments into the instruction. llvm-svn: 99910	2010-03-30 18:49:01 +00:00
Chris Lattner	a787c9e23a	teach tblgen to allow patterns like (add (i32 (bitconvert (i32 GPR))), 4), transforming it into (add (i32 GPR), 4). This allows us to write type generic multi patterns and have tblgen automatically drop the bitconvert in the case when the types align. This allows us to fold an extra load in the changed testcase. llvm-svn: 99756	2010-03-28 08:38:32 +00:00
Evan Cheng	3365fb1412	Do not sibcall if stack needs to be dynamically aligned. llvm-svn: 99620	2010-03-26 16:26:03 +00:00
Evan Cheng	00a620c61e	Allow trivial sibcall of vararg callee when no arguments are being passed. llvm-svn: 99598	2010-03-26 02:13:13 +00:00
Evan Cheng	7b4a1a221b	Try trivial remat before the coalescer gives up on a vr / physreg coalescing for fear of tying up a physical register. llvm-svn: 99575	2010-03-26 00:07:25 +00:00
Evan Cheng	dbcf861a96	Add nounwind. llvm-svn: 99546	2010-03-25 20:01:07 +00:00
Chris Lattner	4690af8567	Make the NDEBUG assertion stronger and more clear what is happening. Enhance scheduling to set the DEAD flag on implicit defs more aggressively. Before, we'd set an implicit def operand to dead if it were present in the SDNode corresponding to the machineinstr but had no use. Now we do it in this case AND if the implicit def does not exist in the SDNode at all. This exposes a couple of problems: one is the FIXME, which causes a live intervals crash on CodeGen/X86/sibcall.ll. The second is that it makes machinecse and licm more aggressive (which is a good thing) but also exposes a case where licm hoists a set0 and then it doesn't get resunk. Talking to codegen folks about both these issues, but I need this patch in in the meantime. llvm-svn: 99485	2010-03-25 05:40:48 +00:00
Nate Begeman	583e05d8ce	BUILD_VECTOR was missing out on some prime opportunities to use SSE 4.1 inserts. llvm-svn: 99423	2010-03-24 20:49:50 +00:00
Evan Cheng	b8d1fd0553	Stupid svn. Add back the lost sibcall tests. llvm-svn: 99033	2010-03-20 03:17:05 +00:00
Kevin Enderby	cf0843ed93	Fixed the encoding problems of the crc32 instructions. All had the Operand size override prefix and only the r/m16 forms should have had that. Also for variant one, the AT&T syntax, added suffixes to all forms. Also added the missing 64-bit form for 'CRC32 r64, r/m8'. Plus added test cases for all forms and tweaked one test case to add the needed suffixes. llvm-svn: 98980	2010-03-19 20:04:42 +00:00
Mon P Wang	7ad43f8768	Fixed a widening bug where we were not using the correct size for the load llvm-svn: 98920	2010-03-19 01:19:52 +00:00
Evan Cheng	bf724b9ee0	Turning off post-ra scheduling for x86. It isn't a consistent win. llvm-svn: 98810	2010-03-18 06:55:42 +00:00
Evan Cheng	68333f5c6e	X86 address mode matching code MatchAddressRecursively does an aggressive hack which requires doing a RAUW. It may end up deleting some SDNode upstream. It should avoid referencing deleted nodes. llvm-svn: 98780	2010-03-17 23:58:35 +00:00
Dan Gohman	5a6dc1dd09	Add an rdar number to this test. llvm-svn: 98654	2010-03-16 19:08:20 +00:00
Bill Wendling	31d7f0d96a	Forgot testcase for r98599. llvm-svn: 98602	2010-03-16 01:54:20 +00:00
Daniel Dunbar	5599256415	MC: Allow modifiers in MCSymbolRefExpr, and eliminate X86MCTargetExpr. - Although it would be nice to allow this decoupling, the assembler needs to be able to reason about MCSymbolRefExprs in too many places to make this viable. We can use a target specific encoding of the variant if this becomes an issue. - This patch also extends llvm-mc to support parsing of the modifiers, as opposed to lumping them in with the symbol. llvm-svn: 98592	2010-03-15 23:51:06 +00:00
Dan Gohman	c6ddebd6d1	Recognize code for doing vector gather/scatter index calculations with 32-bit indices. Instead of shuffling each element out of the index vector, when all indices are needed, just store the input vector to the stack and load the elements out. llvm-svn: 98588	2010-03-15 23:23:03 +00:00
Chris Lattner	561334a81f	Implement support for the case when a reference to an addr-of-bb label is generated, but then the block is deleted. Since the value is undefined, we just emit the label right after the entry label of the function. It might matter that the label is in the same section as the function was after all. llvm-svn: 98579	2010-03-15 20:39:00 +00:00
Chris Lattner	347a0eb85c	Fix the case when a reference to an address taken BB is emitted in one function, then the BB is RAUW'd before the definition is emitted. There are still two cases not being handled, but this should improve us back to the situation before I touched anything. llvm-svn: 98566	2010-03-15 19:09:43 +00:00
Chris Lattner	d03a956a01	filecheckize a test and mark these with a cpu so it passes on hosts without cmovs. llvm-svn: 98521	2010-03-14 22:31:16 +00:00
Chris Lattner	f71cb6c439	fix ShrinkDemandedOps to not leave dead nodes around, fixing PR6607 llvm-svn: 98512	2010-03-14 19:46:02 +00:00
Chris Lattner	5049f23592	don't have i386-specific tests in CodeGen/Generic, PR6601. llvm-svn: 98508	2010-03-14 18:51:18 +00:00
Chris Lattner	6feb7e3325	fix PR6605, X86ISD::CMP always returns i32 (EFLAGS), not the operand type. llvm-svn: 98507	2010-03-14 18:44:35 +00:00
Chris Lattner	6e52e9db31	get MMI out of the label uniquing business, just go to MCContext to get unique assembler temporary labels. llvm-svn: 98489	2010-03-14 08:36:50 +00:00
Evan Cheng	d703df67ce	Do not force indirect tailcall through fixed registers: eax, r11. Add support to allow loads to be folded to tail call instructions. llvm-svn: 98465	2010-03-14 03:48:46 +00:00
Chris Lattner	d75813970a	simplify code to use OutContext.GetOrCreateTemporarySymbol with no arguments instead of having to come up with a unique name. This also makes the code less fragile. llvm-svn: 98364	2010-03-12 18:47:50 +00:00
Chris Lattner	53ebf8a7ca	fix PR6577, a bug in sdbuilder lowering select instructions whose true value was not Val#0. llvm-svn: 98336	2010-03-12 07:15:36 +00:00
Bill Wendling	00810c39da	revert r98270. llvm-svn: 98281	2010-03-11 19:50:31 +00:00
Evan Cheng	31fe835bf2	Bad bad bug. x86 force indirect tail call address into eax when it's meant to force it into a call preserved register instead. Change it to ecx for now. llvm-svn: 98270	2010-03-11 18:49:14 +00:00
Evan Cheng	8c4df8160e	The check for coalescing a virtual register to a physical register, e.g. cl = EXTRACT_SUBREG reg1024, 1, is overly conservative. It should check for overlaps of vr's live interval with the super registers of the physical register (ECX in this case) and let JoinIntervals() handle checking the coalescing feasibility against the physical register (cl in this case). llvm-svn: 98251	2010-03-11 08:20:21 +00:00
Eric Christopher	304f13c637	Have fast-isel understand llvm.objectsize. Update testcase for slightly different codegen. llvm-svn: 98244	2010-03-11 06:20:22 +00:00
Chris Lattner	a179e4d0a8	add support, testcases, and dox for the new GHC calling convention. Patch by David Terei! llvm-svn: 98212	2010-03-11 00:22:57 +00:00
Chris Lattner	4ec0b670d5	fix PR6533 by updating the br(xor) code to remember the case when it looked past a trunc. llvm-svn: 98203	2010-03-10 23:46:44 +00:00
Evan Cheng	72811e8714	Fix typo. llvm-svn: 98142	2010-03-10 07:07:55 +00:00
Evan Cheng	a3b6739749	Unbreak test on Linux. llvm-svn: 98141	2010-03-10 07:07:45 +00:00
Evan Cheng	80ad113731	Enable machine cse pass. llvm-svn: 98132	2010-03-10 03:07:41 +00:00
Chris Lattner	9889c1eb9e	move .set generation out of DwarfPrinter into AsmPrinter and MCize it. llvm-svn: 98010	2010-03-08 23:58:37 +00:00
Chris Lattner	27a9732450	simplify EmitSectionOffset to always use .set if it is available, the only thing this affects is that we produce .set in one case we didn't before, which shouldn't harm anything. Make EmitSectionOffset call EmitDifference instead of duplicating it. llvm-svn: 98005	2010-03-08 23:23:25 +00:00
Evan Cheng	5967649780	Add documentation on sibling call optimization. Rename tailcall2.ll test to sibcall.ll. llvm-svn: 97980	2010-03-08 21:05:02 +00:00
Charles Davis	8545afe0b0	Don't emit global symbols into the (__TEXT,__ustring) section on Darwin. This is a workaround for <rdar://problem/7672401/> (which I filed). This lets us build Wine on Darwin, and it gets the Qt build there a little bit further (so Doug says). llvm-svn: 97845	2010-03-05 22:28:45 +00:00
Jakob Stoklund Olesen	2664d295cb	Better handling of dead super registers in LiveVariables. We used to do this: CALL ... %RAX<imp-def> ... [not using %RAX] %EAX = ..., %RAX<imp-use, kill> RET %EAX<imp-use,kill> Now we do this: CALL ... %RAX<imp-def, dead> ... [not using %RAX] %EAX = ... RET %EAX<imp-use,kill> By not artificially keeping %RAX alive, we lower register pressure a bit. The correct number of instructions for 2008-08-05-SpillerBug.ll is obviously 55, anybody can see that. Sheesh. llvm-svn: 97838	2010-03-05 21:49:17 +00:00
Jakob Stoklund Olesen	8c5b8db5cd	We don't really care about correct register liveness information after the post-ra scheduler has run. Disable the verifier checks that late in the game. llvm-svn: 97837	2010-03-05 21:49:13 +00:00

2058 Commits