llvm-project

Author	SHA1	Message	Date
Cameron Zwarich	76dfa226cf	The bitcast case here is actually handled uniformly earlier in the function, so delete it. llvm-svn: 129877	2011-04-20 21:48:34 +00:00
Cameron Zwarich	4cd9a4a975	Cleanup some code to better use an early return style in preparation for adding more cases. llvm-svn: 129876	2011-04-20 21:48:16 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Owen Anderson	92651ec374	Fix an infinite alternation in JumpThreading where two transforms would repeatedly undo each other. The solution is to perform more aggressive constant folding to make one of the edges just folded away rather than trying to thread it. Fixes <rdar://problem/9284786>. Discovered with CSmith. llvm-svn: 129538	2011-04-14 21:35:50 +00:00
Mon P Wang	1cde91674a	Cleanup r129509 based on comments by Chris llvm-svn: 129532	2011-04-14 19:20:42 +00:00
Mon P Wang	0f6bad7b6e	Cleanup r129472 by using a utility routine as suggested by Eli. llvm-svn: 129509	2011-04-14 08:04:01 +00:00
Chris Lattner	35a65b2aa6	fix a couple -Wsign-compare warnings. llvm-svn: 129501	2011-04-14 02:27:25 +00:00
Mon P Wang	2e5528f0b2	Vectors with different number of elements of the same element type can have the same allocation size but different primitive sizes (e.g., <3xi32> and <4xi32>). When ScalarRepl promotes them, it can't use a bit cast but should use a shuffle vector instead. llvm-svn: 129472	2011-04-13 21:40:02 +00:00
Junjie Gu	377cc31a74	Fixed the revision 129449. llvm-svn: 129450	2011-04-13 16:45:49 +00:00
Junjie Gu	7c3b4593b5	Passing unroll parameters (unroll-count, threshold, and partial unroll) via LoopUnroll class's ctor. Doing so will allow multiple contexts with different loop unroll parameters to run. This is a minor change and has no effect on existing applications. llvm-svn: 129449	2011-04-13 16:15:29 +00:00
Rafael Espindola	6aafb64daf	Add the alias analysis to the C api. llvm-svn: 129447	2011-04-13 15:44:58 +00:00
Bill Wendling	b902f1dd88	Reapply r129401 with patch for clang. llvm-svn: 129419	2011-04-13 00:36:11 +00:00
Bill Wendling	dbfde42468	Revert r129401 for now. Clang is using the old way of doing things. llvm-svn: 129403	2011-04-12 22:59:27 +00:00
Bill Wendling	47c24875a1	Remove the unaligned load intrinsics in favor of using native unaligned loads. Now that we have a first-class way to represent unaligned loads, the unaligned load intrinsics are superfluous. First part of <rdar://problem/8460511>. llvm-svn: 129401	2011-04-12 22:46:31 +00:00
Dan Gohman	1c6c34834b	Fix reassociate to use a worklist instead of recursing when new reassociation opportunities are exposed. This fixes a bug where the nested reassociation expects the IR to be consistent, but it isn't, because the outer reassociation has disconnected some of the operands. rdar://9167457 llvm-svn: 129324	2011-04-12 00:11:56 +00:00
Jay Foad	7c14a558fe	Don't include Operator.h from InstrTypes.h. llvm-svn: 129271	2011-04-11 09:35:34 +00:00
Chris Lattner	88974f4625	fix PR9523, a crash in looprotate on a non-canonical loop made out of indirectbr. llvm-svn: 129203	2011-04-09 07:25:58 +00:00
Chris Lattner	af1bccec68	Fix a bug where RecursivelyDeleteTriviallyDeadInstructions could delete the instruction pointed to by CGP's current instruction iterator, leading to a crash on the testcase. This fixes PR9578. llvm-svn: 129200	2011-04-09 07:05:44 +00:00
Rafael Espindola	e4e4e37580	Expose more passes to the C API. llvm-svn: 129087	2011-04-07 18:20:46 +00:00
Eli Friedman	c5f22a7815	PR9634: Don't unconditionally tell the AliasSetTracker that the PreheaderLoad is equivalent to any other relevant value; it isn't true in general. If it is equivalent, the LoopPromoter will tell the AST the equivalence. Also, delete the PreheaderLoad if it is unused. Chris, since you were the last one to make major changes here, can you check that this is sane? llvm-svn: 129049	2011-04-07 01:35:06 +00:00
Bill Wendling	5034159c5f	* The DSE code that tested for overlapping needed to take into account the fact that one of the numbers is signed while the other is unsigned. This could lead to a wrong result when the signed was promoted to an unsigned int. * Add the data layout line to the testcase so that it will test the appropriate thing. Patch by David Terei! llvm-svn: 128577	2011-03-30 21:37:19 +00:00
Jay Foad	52131344a2	Remove PHINode::reserveOperandSpace(). Instead, add a parameter to PHINode::Create() giving the (known or expected) number of operands. llvm-svn: 128537	2011-03-30 11:28:46 +00:00
Jay Foad	e0938d8a87	(Almost) always call reserveOperandSpace() on newly created PHINodes. llvm-svn: 128535	2011-03-30 11:19:20 +00:00
Benjamin Kramer	e41395ac24	DSE: Remove an early exit optimization that depended on the ordering of a SmallPtrSet. Fixes PR9569 and will hopefully make selfhost on ASLR-enabled systems more deterministic. llvm-svn: 128482	2011-03-29 20:28:57 +00:00
Cameron Zwarich	ff811cc475	Do some simple copy propagation through integer loads and stores when promoting vector types. This helps a lot with inlined functions when using the ARM soft float ABI. Fixes <rdar://problem/9184212>. llvm-svn: 128453	2011-03-29 05:19:52 +00:00
Bill Wendling	b5139920d6	Simplification noticed by Frits. llvm-svn: 128333	2011-03-26 09:32:07 +00:00
Bill Wendling	19f33b9393	Rework the logic that determines if a store completely overlaps an earlier store. There are two ways that a later store can completely overlap a previous store: 1. They both start at the same offset, but the earlier store's size is <= the later's size, or 2. The earlier store's offset is > the later's offset, but its offset + size doesn't extend past the later's offset + size. llvm-svn: 128332	2011-03-26 08:02:59 +00:00
Cameron Zwarich	d4174ee43e	Fix a typo and add a test. llvm-svn: 128331	2011-03-26 04:58:50 +00:00
Bill Wendling	db40b5c899	PR9561: A store with a negative offset (via GEP) could erroneously say that it completely overlaps a previous store, thus mistakenly deleting that store. Check for this condition. llvm-svn: 128319	2011-03-26 01:20:37 +00:00
Cameron Zwarich	74157ab3e5	Debug intrinsics must be skipped at the beginning and ends of blocks, lest they affect the generated code. llvm-svn: 128217	2011-03-24 16:34:59 +00:00
Cameron Zwarich	2edfe778ec	It is enough for the CallInst to have no uses to be made a tail call with a ret void; it doesn't need to have a void type. llvm-svn: 128212	2011-03-24 15:54:11 +00:00
Devang Patel	8f606d7b9b	s/UpdateDT/ModifiedDT/g llvm-svn: 128211	2011-03-24 15:35:25 +00:00
Cameron Zwarich	4649f17db1	Do early taildup of ret in CodeGenPrepare for potential tail calls that have a void return type. This fixes PR9487. llvm-svn: 128197	2011-03-24 04:52:10 +00:00
Cameron Zwarich	0e331c05ae	Use an early return instead of a long if block. llvm-svn: 128196	2011-03-24 04:52:07 +00:00
Cameron Zwarich	dd84bcce8f	When UpdateDT is set, DT is invalid, which could cause problems when trying to use it later. I couldn't make a test that hits this with the current code. llvm-svn: 128195	2011-03-24 04:52:04 +00:00
Cameron Zwarich	47e7175fe9	Check for TLI so that -codegenprepare can be used from opt. llvm-svn: 128194	2011-03-24 04:51:51 +00:00
Cameron Zwarich	10ebc189ee	Fix PR9464 by correcting some math that just happened to be right in most cases that were hit in practice. llvm-svn: 128146	2011-03-23 05:25:55 +00:00
Evan Cheng	0663f23bd8	Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. llvm-svn: 127981	2011-03-21 01:19:09 +00:00
Daniel Dunbar	327cd36f74	Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR", it broke a lot of things. llvm-svn: 127954	2011-03-19 21:47:14 +00:00
Evan Cheng	824a711305	SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953	2011-03-19 17:17:39 +00:00
Andrew Trick	f8f67f0188	Remove TargetData and ValueTracking includes. I didn't mean for them to sneak in my last checkin. llvm-svn: 127842	2011-03-18 00:36:39 +00:00
Andrew Trick	87716c93c2	Added isValidRewrite() to check the result of ScalarEvolutionExpander. SCEV may generate expressions composed of multiple pointers, which can lead to invalid GEP expansion. Until we can teach SCEV to follow strict pointer rules, make sure no bad GEPs creep into IR. Fixes rdar://problem/9038671. llvm-svn: 127839	2011-03-17 23:51:11 +00:00
Andrew Trick	e44f0d94f6	whitespace llvm-svn: 127837	2011-03-17 23:46:48 +00:00
Cameron Zwarich	7599b106b7	Fix a comment. llvm-svn: 127728	2011-03-16 08:13:42 +00:00
Cameron Zwarich	0454253d7a	Only convert allocas to scalars if it is profitable. The profitability metric I chose is having a non-memcpy/memset use and being larger than any native integer type. Originally I chose having an access of a size smaller than the total size of the alloca, but this caused some minor issues on the spirit benchmark where SRoA runs again after some inlining. This fixes <rdar://problem/8613163>. llvm-svn: 127718	2011-03-16 00:13:44 +00:00
Cameron Zwarich	b51c830f7c	Better use initializer lists. llvm-svn: 127716	2011-03-16 00:13:37 +00:00
Cameron Zwarich	63062ccf85	Add a clarifying comment. llvm-svn: 127715	2011-03-16 00:13:35 +00:00
Andrew Trick	8b55b736b1	Added SCEV::NoWrapFlags to manage unsigned, signed, and self wrap properties. Added the self-wrap flag for SCEV::AddRecExpr. A slew of temporary FIXMEs indicate the intention of the no-self-wrap flag without changing behavior in this revision. llvm-svn: 127590	2011-03-14 16:50:06 +00:00
Andrew Trick	328b223bb1	whitespace llvm-svn: 127589	2011-03-14 16:48:10 +00:00
Cameron Zwarich	338d362200	Roll r127459 back in: Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127498	2011-03-11 21:52:04 +00:00
Daniel Dunbar	94ccb27b43	Revert r127459, "Optimize trivial branches in CodeGenPrepare, which often get created from the", it broke some GCC test suite tests. llvm-svn: 127477	2011-03-11 19:30:30 +00:00
Cameron Zwarich	cc27b3acc4	Optimize trivial branches in CodeGenPrepare, which often get created from the lowering of objectsize intrinsics. Unfortunately, a number of tests were relying on llc not optimizing trivial branches, so I had to add an option to allow them to continue to test what they originally tested. This fixes <rdar://problem/8785296> and <rdar://problem/9112893>. llvm-svn: 127459	2011-03-11 04:54:27 +00:00
Dan Gohman	affbc66f60	RecursivelyDeleteTriviallyDeadInstructions only needs a Value, not an Instruction, so casting is not necessary. Also, it's theoretically possible that the Value is not an Instruction, since WeakVH follows RAUWs. llvm-svn: 127427	2011-03-10 20:57:44 +00:00
Dan Gohman	154ed49784	Fix reassociate to postpone certain instruction deletions until after it has finished all of its reassociations, because its habit of unlinking operands and holding them in a datastructure while working means that it's not easy to determine when an instruction is really dead until after all its regular work is done. rdar://9096268. llvm-svn: 127424	2011-03-10 19:51:54 +00:00
Devang Patel	13f8c7d48e	Preserve line number information while simplifying libcalls. llvm-svn: 127362	2011-03-09 21:27:52 +00:00
Cameron Zwarich	19f2b3c652	Fix a crasher introduced by r127317 that is seen on the bots when using an alloca as both integer and floating-point vectors of the same size. Bugpoint is not cooperating with me, but I'll try to find a manual testcase tomorrow. llvm-svn: 127320	2011-03-09 07:34:11 +00:00
Cameron Zwarich	3b649f4d01	Add support to scalar replacement for partial vector accesses of an alloca, e.g. a union of a float, <2 x float>, and <4 x float>. This mostly comes up with the use of vector intrinsics, especially in NEON when programmers know the layout of the register file. This enables codegen to eliminate a lot of the subregister traffic it would otherwise generate. This commit only enables this for a small number of floating-point cases, but a lot more integer cases. I assume this is okay for all ports, but I did not do extensive testing of the quality of code involving i512 vectors and the like. If there is a use case where this generates worse code than before, let me know and we can scale it back. This fixes <rdar://problem/9036264>. llvm-svn: 127317	2011-03-09 05:43:05 +00:00
Cameron Zwarich	43a241fa06	Move vector type merging to a separate function in preparation for it getting more complicated. llvm-svn: 127316	2011-03-09 05:43:01 +00:00
Devang Patel	97d0be8ee1	While sinking an instruction, do not lose llvm.dbg.value intrinsic. llvm-svn: 127214	2011-03-08 03:06:19 +00:00
Devang Patel	d00c628f8f	Preserve line no. info. Radar `9097659` llvm-svn: 127182	2011-03-07 22:43:45 +00:00
Cameron Zwarich	13c885d193	Fix PR9398 - 10% of llc compile time is spent in Value::getNumUses. This reduces the percentage of time spent in CodeGenPrepare when llcing 403.gcc from 12.6% to 1.8% of total llc time. llvm-svn: 127069	2011-03-05 08:12:26 +00:00
Richard Osborne	5003782293	Fix typo in comment. llvm-svn: 126941	2011-03-03 14:21:22 +00:00
Richard Osborne	af52c52569	Optimize fprintf -> fiprintf if there are no floating point arguments and fiprintf is available on the target. llvm-svn: 126940	2011-03-03 14:20:22 +00:00
Richard Osborne	2dfb888392	Optimize sprintf -> siprintf if there are no floating point arguments and siprintf is available on the target. llvm-svn: 126937	2011-03-03 14:09:28 +00:00
Richard Osborne	815de536e5	Optimize printf -> iprintf if there are no floating point arguments and iprintf is available on the target. Currently iprintf is only marked as being available on the XCore. llvm-svn: 126935	2011-03-03 13:17:51 +00:00
Cameron Zwarich	86ade9510f	Remove some more unused code that I missed. llvm-svn: 126826	2011-03-02 03:48:29 +00:00
Cameron Zwarich	5dd2aa2615	Eliminate the unused CodeGenPrepare option to split critical edges. llvm-svn: 126825	2011-03-02 03:31:46 +00:00
Cameron Zwarich	b7f8eaafa3	Stop computing the number of uses twice per value in CodeGenPrepare's sinking of addressing code. On 403.gcc this almost halves CodeGenPrepare time and reduces total llc time by 9.5%. Unfortunately, getNumUses() is still the hottest function in llc. llvm-svn: 126782	2011-03-01 21:13:53 +00:00
Ted Kremenek	20164dcc68	Unbreak CMake build. llvm-svn: 126715	2011-02-28 23:56:33 +00:00
Chris Lattner	1ac5e0c5c6	update cmake llvm-svn: 126694	2011-02-28 22:45:25 +00:00
Dan Gohman	06d70015ce	Delete the GEPSplitter experiment. llvm-svn: 126671	2011-02-28 19:47:47 +00:00
Dan Gohman	b8a25f49f3	Delete the SimplifyHalfPowrLibCalls pass, which was unused, and only existed as the result of a misunderstanding. llvm-svn: 126669	2011-02-28 19:41:14 +00:00
Chris Lattner	eddb33ebd0	wire TargetLibraryInfo into simplify libcalls and use it in a couple of trivial places. This pass needs a lot of work. llvm-svn: 126367	2011-02-24 07:16:14 +00:00
Chris Lattner	2e56e20662	move a massive amount of code out into its own helper function to reduce nesting. This needs to be turned into a table. llvm-svn: 126366	2011-02-24 07:12:12 +00:00
Cameron Zwarich	826308586c	Make LoopDeletion work on loops with multiple edges, as long as the incoming values from all of the loop's exiting blocks are equal. Patch by Andrew Clinton. llvm-svn: 126253	2011-02-22 22:25:39 +00:00
Chris Lattner	2333ac279f	fix a crasher in disabled code (on variable stride loops) llvm-svn: 126125	2011-02-21 17:02:55 +00:00
Chris Lattner	bc661d6686	Add some (disabled) code to print out negative strides. llvm-svn: 126102	2011-02-21 02:08:54 +00:00
Chris Lattner	72a35fb974	rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte constant, including globals. This makes us generate much more "pretty" pattern globals as well because it doesn't break it down to an array of bytes all the time. This enables us to handle stores of relocatable globals. This kicks in about 48 times in 254.gap, giving us stuff like this: @.memset_pattern40 = internal constant [2 x %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)] [%struct.TypHeader (%struct.TypHeader, %struct.TypHeader)* @IsFalse, %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)* @IsFalse], align 16 ... call void @memset_pattern16(i8* %scevgep5859, i8* bitcast ([2 x %struct.TypHeader* (%struct.TypHeader, %struct.TypHeader)] @.memset_pattern40 to i8*), i64 %tmp75) nounwind llvm-svn: 126044	2011-02-19 19:56:44 +00:00
Chris Lattner	0f4a64011e	Implement rdar://9009151, transforming strided loop stores of unsplatable values into memset_pattern16 when it is available (recent darwins). This transforms lots of strided loop stores of ints for example, like 5 in vpr: Formed memset: call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25) from store to: {%3,+,4}<%11> at: store i32 3, i32* %scevgep, align 4, !tbaa !4 llvm-svn: 126040	2011-02-19 19:31:39 +00:00
Chris Lattner	e6b261fec5	Make loop-idiom use TargetLibraryInfo to determine whether it is allowed to hack on memset, memcpy etc. llvm-svn: 125974	2011-02-18 22:22:15 +00:00
Chris Lattner	1a924e770a	prevent jump threading from merging blocks when their address is taken (and used!). This prevents merging the blocks (invalidating the block addresses) in a case like this: #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; }) void foo() { printf("%p\n", _THIS_IP_); printf("%p\n", _THIS_IP_); printf("%p\n", _THIS_IP_); } which fixes PR4151. llvm-svn: 125829	2011-02-18 04:43:06 +00:00
Chris Lattner	3eb0af94c4	fix PR9215, preventing -reassociate from clearing nsw/nuw when it swaps the LHS/RHS of a single binop. llvm-svn: 125700	2011-02-17 01:29:24 +00:00
Duncan Sands	75b5d27b84	Spelling fix: consequtive -> consecutive. llvm-svn: 125563	2011-02-15 09:23:02 +00:00
Chris Lattner	69229316aa	convert ConstantVector::get to use ArrayRef. llvm-svn: 125537	2011-02-15 00:14:00 +00:00
Devang Patel	3058398655	Do not hoist @llvm.dbg.value. Here, @llvm.dbg.value is "referring" a value that is modified inside loop. llvm-svn: 125529	2011-02-14 23:03:23 +00:00
Chris Lattner	34442e6ebf	revert my ConstantVector patch, it seems to have made the llvm-gcc builders unhappy. llvm-svn: 125504	2011-02-14 18:15:46 +00:00
Chris Lattner	d9f5b88548	Switch ConstantVector::get to use ArrayRef instead of a pointer+size idiom. Change various clients to simplify their code. llvm-svn: 125487	2011-02-14 07:55:32 +00:00
Daniel Dunbar	210ce0feb5	SimplifyLibCalls: Add missing legalize check on various printf to puts and putchar transforms, their return values are not compatible. llvm-svn: 125442	2011-02-12 18:19:57 +00:00
Cameron Zwarich	99de19b3cb	Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about a loop when unswitching it. It only does this in the complex case, because everything should be fine already in the simple case. llvm-svn: 125369	2011-02-11 06:08:28 +00:00
Cameron Zwarich	25cb63c791	LoopInstSimplify preserves ScalarEvolution. llvm-svn: 125368	2011-02-11 06:08:25 +00:00
Cameron Zwarich	97dae4d361	If we can't avoid running loop-simplify twice for now, at least avoid running iv-users twice. llvm-svn: 125318	2011-02-10 23:53:14 +00:00
Eric Christopher	da6bd45088	Revert this in an attempt to bring the builders back. llvm-svn: 125257	2011-02-10 01:48:24 +00:00
Cameron Zwarich	58c8670ab2	Turn this pass ordering: Natural Loop Information Loop Pass Manager Canonicalize natural loops Scalar Evolution Analysis Loop Pass Manager Induction Variable Users Canonicalize natural loops Induction Variable Users Loop Strength Reduction into this: Scalar Evolution Analysis Loop Pass Manager Canonicalize natural loops Induction Variable Users Loop Strength Reduction This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of thing automatically, but it seems easier to just change the ordering of the passes if this is the only case. llvm-svn: 125254	2011-02-10 01:07:54 +00:00
Dan Gohman	de7f699754	Don't split any loop backedges, including backedges of loops other than the active loop. This is generally desirable, and it avoids trouble in situations such as the testcase in PR9123, though the failure mode depends on use-list order, so it is infeasible to test. llvm-svn: 125065	2011-02-08 00:55:13 +00:00
Dan Gohman	08d2c98c23	Fix reassociate to clear optional flags, such as nsw. llvm-svn: 124712	2011-02-02 02:02:34 +00:00
Francois Pichet	326e4a2966	Unbreak the MSVC build. The DEBUG() call at line 606 demands to see raw_ostream's definition. I have no idea why this seems to only break MSVC. llvm-svn: 124545	2011-01-29 20:06:16 +00:00
Evan Cheng	73c29178ac	Add a test for TCE return duplication. llvm-svn: 124527	2011-01-29 04:53:35 +00:00
Evan Cheng	d983eba7dc	Re-apply r124518 with fix. Watch out for invalidated iterator. llvm-svn: 124526	2011-01-29 04:46:23 +00:00
Evan Cheng	65b8ccf6ac	Revert r124518. It broke Linux self-host. llvm-svn: 124522	2011-01-29 02:43:04 +00:00
Evan Cheng	d4eff31476	Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand. llvm-svn: 124518	2011-01-29 01:29:26 +00:00
Duncan Sands	69bdb585b2	Fix PR9039, a use-after-free in reassociate. The issue was that the operand being factorized (and erased) could occur several times in Ops, resulting in freed memory being used when the next occurrence in Ops was analyzed. llvm-svn: 124287	2011-01-26 10:08:38 +00:00
Dan Gohman	0f124e1987	Give GetUnderlyingObject a TargetData, to keep it in sync with BasicAA's DecomposeGEPExpression, which recently began using a TargetData. This fixes PR8968, though the testcase is awkward to reduce. Also, update several of GetUnderlyingObject's users which happen to have a TargetData handy to pass it in. llvm-svn: 124134	2011-01-24 18:53:32 +00:00
Chris Lattner	d83e7b0ff6	enhance SRoA to promote allocas that are used by PHI nodes. This often occurs because instcombine sinks loads and inserts phis. This kicks in on such apps as 175.vpr, eon, 403.gcc, xalancbmk and a bunch of times in spec2006 in some app that uses std::deque. This resolves the last of rdar://7339113. llvm-svn: 124090	2011-01-24 01:07:11 +00:00
Chris Lattner	a960725d18	Enhance SRoA to promote allocas that are used by selects in some common cases. This triggers a surprising number of times in SPEC2K6 because min/max idioms end up doing this. For example, code from the STL ends up looking like this to SRoA: %202 = load i64* %__old_size, align 8, !tbaa !3 %203 = load i64* %__old_size, align 8, !tbaa !3 %204 = load i64* %__n, align 8, !tbaa !3 %205 = icmp ult i64 %203, %204 %storemerge.i = select i1 %205, i64* %__n, i64* %__old_size %206 = load i64* %storemerge.i, align 8, !tbaa !3 We can now promote both the __n and the __old_size allocas. This addresses another chunk of rdar://7339113, poor codegen on stringswitch. llvm-svn: 124088	2011-01-23 22:04:55 +00:00
Chris Lattner	9491dee24e	Enhance SRoA to be more aggressive about scalarization of aggregate allocas that have PHI or select uses of their element pointers. This can often happen when instcombine sinks two loads into a successor, inserting a phi or select. With this patch, we can scalarize the alloca, but the pinned elements are not yet promoted. This is still a win for large aggregates where only one element is used. This fixes rdar://8904039 and part of rdar://7339113 (poor codegen on stringswitch). llvm-svn: 124070	2011-01-23 08:27:54 +00:00
Chris Lattner	8acbb79506	have AllocaInfo store the alloca being inspected, simplifying callers. No functionality change. llvm-svn: 124067	2011-01-23 07:29:29 +00:00
Chris Lattner	3e56c29068	Rearrange some code a bit. Change MarkUnsafe to handle the "Transformation preventing inst" printing, so that -scalarrepl -debug will always print the rejected instruction. No functionality change. llvm-svn: 124066	2011-01-23 07:05:44 +00:00
Chris Lattner	a587ab7b94	remove an old hack that avoided creating MMX datatypes. The X86 backend has been fixed. llvm-svn: 124064	2011-01-23 06:40:33 +00:00
Dan Gohman	19e30d5a7d	Actually check memcpy lengths, instead of just commenting about how they should be checked. llvm-svn: 123999	2011-01-21 22:07:57 +00:00
Nick Lewycky	ae0275e018	SCCP doesn't actually preserve the CFG. It will delete and insert terminator instructions. llvm-svn: 123973	2011-01-21 08:38:09 +00:00
Chris Lattner	86d56c651d	fix rdar://8878965, a regression I introduced with the recent llvm.objectsize changes. llvm-svn: 123771	2011-01-18 20:53:04 +00:00
Cameron Zwarich	b703654edc	Remove code for updating dominance frontiers and some outdated references to dominance and post-dominance frontiers. llvm-svn: 123725	2011-01-18 04:11:31 +00:00
Cameron Zwarich	4694e69540	Remove outdated references to dominance frontiers. llvm-svn: 123724	2011-01-18 03:53:26 +00:00
Owen Anderson	459e079912	Remove dead code, that I apparently wrote a while back. We seem to be doing well enough without whatever this was trying to do. When/if someone has the time to do some empirical evaluations, it might be worth it to figure out what this code was trying to do and see if it's worth resurrecting/fixing. llvm-svn: 123684	2011-01-17 22:39:54 +00:00
Cameron Zwarich	b410858a5f	Roll r123609 back in with two changes that fix test failures with expensive checks enabled: 1) Use '<' to compare integers in a comparison function rather than '<='. 2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize the priority queue. The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at just under 16% rather than 17%. llvm-svn: 123662	2011-01-17 17:38:41 +00:00
Cameron Zwarich	67431d7943	Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot. llvm-svn: 123618	2011-01-17 07:26:51 +00:00
Cameron Zwarich	814cd9233e	Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to eliminating a potentially quadratic data structure, this also gives a 17% speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial experiment gave a greater speedup around 25%, but I moved the dominator tree level computation from dominator tree construction to PromoteMemToReg. Since this approach to computing IDFs has a much lower overhead than the old code using precomputed DFs, it is worth looking at using this new code for the second scalarrepl pass as well. llvm-svn: 123609	2011-01-17 01:08:59 +00:00
Chris Lattner	7c9f4c9c2b	tidy up a comment, as suggested by duncan llvm-svn: 123590	2011-01-16 17:46:19 +00:00
Chris Lattner	ed1fb92cfe	simplify a little llvm-svn: 123573	2011-01-16 07:11:21 +00:00
Chris Lattner	6fab2e9418	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Chris Lattner	7cd8cf7d24	Use an irbuilder to get some trivial constant folding when doing a store of a constant. llvm-svn: 123570	2011-01-16 05:58:24 +00:00
Chris Lattner	d55581ded8	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568	2011-01-16 05:28:59 +00:00
Chris Lattner	af26390790	temporarily revert r123526. While working on a follow-on patch I realize that ConstantFoldTerminator doesn't preserve dominfo. llvm-svn: 123527	2011-01-15 07:51:19 +00:00
Chris Lattner	8df83c4a24	fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code. The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch of dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. llvm-svn: 123526	2011-01-15 07:36:13 +00:00
Chris Lattner	ee588defc6	simplify code, no functionality change. llvm-svn: 123525	2011-01-15 07:29:01 +00:00
Chris Lattner	1b93be501d	Now that instruction optzns can update the iterator as they go, we can have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. llvm-svn: 123524	2011-01-15 07:25:29 +00:00
Chris Lattner	7a2771440f	make the current instruction iterator an ivar, allowing xforms that potentially invalidate it (like inline asm lowering) to be sunk into their proper place, cleaning up a ton of code. llvm-svn: 123523	2011-01-15 07:14:54 +00:00
Chris Lattner	b68ec5c339	Generalize LoadAndStorePromoter a bit and switch LICM to use it. llvm-svn: 123501	2011-01-15 00:12:35 +00:00
Chris Lattner	b498f9aff3	switch SRoA to use LoadAndStorePromoter instead of its own copy of the code. llvm-svn: 123457	2011-01-14 19:50:47 +00:00
Chris Lattner	9987a6f49b	split SROA into two passes: one that uses DomFrontiers (-scalarrepl) and one that uses SSAUpdater (-scalarrepl-ssa) llvm-svn: 123436	2011-01-14 08:13:00 +00:00
Chris Lattner	543384efb4	Implement full support for promoting allocas to registers using SSAUpdater instead of DomTree/DomFrontier. This may be interesting for reducing compile time. This is currently disabled, but seems to work just fine. When this is enabled, we eliminate two runs of dominator frontier, one in the "early per-function" optimizations and one in the "interlaced with inliner" function passes. llvm-svn: 123434	2011-01-14 07:50:47 +00:00
Bob Wilson	328e91bbe1	Fix whitespace. llvm-svn: 123396	2011-01-13 20:59:44 +00:00
Bob Wilson	c8056a952e	Check for empty structs, and for consistency, zero-element arrays. llvm-svn: 123383	2011-01-13 18:26:59 +00:00
Bob Wilson	08713d3c5f	Extend SROA to handle arrays accessed as homogeneous structs and vice versa. This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. llvm-svn: 123381	2011-01-13 17:45:11 +00:00
Bob Wilson	12eec40c83	Make SROA more aggressive with allocas containing padding. SROA only splits up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements. llvm-svn: 123380	2011-01-13 17:45:08 +00:00
Devang Patel	30f3ebbc1f	Use SmallVector instead of SmallPtrSet and avoid non-deterministic behavior. llvm-svn: 123318	2011-01-12 19:12:45 +00:00
Chris Lattner	dd5f60b7a7	revert 123144, reenabling the rest of memset formation. llvm-svn: 123302	2011-01-12 03:25:15 +00:00
Chris Lattner	654098f411	revert r123146 which disabled code that wasn't the root cause of the bootstrap miscompare issue. llvm-svn: 123299	2011-01-12 01:52:23 +00:00
Chris Lattner	fa7c29d255	revert r123149, reenabling an improvement to memcpyopt that wasn't the source of the bootstrap problem. llvm-svn: 123298	2011-01-12 01:43:46 +00:00
Jakob Stoklund Olesen	12cc296bd4	Remove the PR8954 workaround. llvm-svn: 123288	2011-01-11 22:56:41 +00:00
Cameron Zwarich	cb9c4f85ec	Dial back the speculative fix for PR8954 a bit, so that we only recompute dominators once at the beginning of GVN instead of once per iteration. llvm-svn: 123278	2011-01-11 22:14:42 +00:00
Cameron Zwarich	51eb403907	Attempt to fix the bootstrap buildbot. Rafael says this works for him on x86-64 Linux. llvm-svn: 123270	2011-01-11 20:23:34 +00:00
Chris Lattner	193ce7c4d1	update memdep when an instruction is deleted. This code isn't actually reached in the testcase in PR8954, but it's safe and good practice. llvm-svn: 123224	2011-01-11 08:19:16 +00:00
Chris Lattner	f6ae904e34	Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes phi nodes. It is called from MergeBlockIntoPredecessor which is called from GVN, which claims to preserve these. I'm skeptical that this is the actual problem behind PR8954, but this is a stab in the right direction. llvm-svn: 123222	2011-01-11 08:13:40 +00:00
Chris Lattner	dfcfcb49fa	random cleanups llvm-svn: 123221	2011-01-11 08:00:40 +00:00
Chris Lattner	63fe78de68	remove a bogus assertion: the latch block of a loop is not necessarily an uncond branch to the header. This fixes PR8955 (the assertion tripping). llvm-svn: 123219	2011-01-11 07:47:59 +00:00
Chris Lattner	88bc848ab6	another random stab in the dark trying to fix llvm-gcc-i386-linux-selfhost llvm-svn: 123149	2011-01-10 02:34:11 +00:00
Chris Lattner	4662bd4b13	another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost back to life. llvm-svn: 123146	2011-01-10 00:47:34 +00:00
Chris Lattner	1017fa6746	temporarily disable memset formation from memsets in an effort to restore buildbot stability. llvm-svn: 123144	2011-01-09 23:52:48 +00:00
Chris Lattner	caf5c0d037	fix a few old bugs (found by inspection) where we would zap instructions without informing memdep. This could cause nondeterministic weirdness based on where instructions happen to get allocated, and will hopefully breathe some life into some broken testers. llvm-svn: 123124	2011-01-09 19:26:10 +00:00
Cameron Zwarich	a42e5915bf	LoopInstSimplify preserves LoopSimplify. llvm-svn: 123117	2011-01-09 12:35:16 +00:00
Chris Lattner	a337f5ec5c	reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's that have the bit set. llvm-svn: 123104	2011-01-09 02:16:18 +00:00
Chris Lattner	7d6433ae76	fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't updating memdep when fusing stores together. This fixes the crash optimizing the bullet benchmark. llvm-svn: 123091	2011-01-08 22:19:21 +00:00
Chris Lattner	ff6ed2ac5f	tryMergingIntoMemset can only handle constant length memsets. llvm-svn: 123090	2011-01-08 22:11:56 +00:00
Chris Lattner	9a1d63ba9f	Merge memsets followed by neighboring memsets and other stores into larger memsets. Among other things, this fixes rdar://8760394 and allows us to handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into a single 4096-byte memset: _mad_synth_mute: ## @mad_synth_mute ## BB#0: ## %entry pushq %rax movl $4096, %esi ## imm = 0x1000 callq ___bzero popq %rax ret llvm-svn: 123089	2011-01-08 21:19:19 +00:00
Chris Lattner	5120ebf184	fix an issue in IsPointerOffset that prevented us from recognizing that P and P+1 are relative to the same base pointer. llvm-svn: 123087	2011-01-08 21:07:56 +00:00
Chris Lattner	4dc1fd938f	enhance memcpyopt to merge a store and a subsequent memset into a single larger memset. llvm-svn: 123086	2011-01-08 20:54:51 +00:00
Chris Lattner	c638147e9f	constify TargetData references. Split memset formation logic out into its own "tryMergingIntoMemset" helper function. llvm-svn: 123081	2011-01-08 20:24:01 +00:00
Chris Lattner	59c82f850d	When loop rotation happens, it is very common for the duplicated condbr to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this: define void @test(i32 %N, double* %G) nounwind ssp { entry: %cmp1 = icmp slt i64 1, 1000 br i1 %cmp1, label %bb.nph, label %for.end bb.nph: ; preds = %entry br label %for.body for.body: ; preds = %bb.nph, %for.cond %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ] %arrayidx = getelementptr inbounds double* %G, i64 %j.02 %tmp3 = load double* %arrayidx %sub = sub i64 %j.02, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.02, 1 br label %for.cond for.cond: ; preds = %for.body %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge for.cond.for.end_crit_edge: ; preds = %for.cond br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry ret void } Now we get the much nicer: define void @test(i32 %N, double* %G) nounwind ssp { entry: br label %for.body for.body: ; preds = %entry, %for.body %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds double* %G, i64 %j.01 %tmp3 = load double* %arrayidx %sub = sub i64 %j.01, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.01, 1 %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body ret void } With all of these recent changes, we are now able to compile: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well. llvm-svn: 123079	2011-01-08 19:59:06 +00:00
Chris Lattner	30f318e5d1	split ssa updating code out to its own helper function. Don't bother moving the OrigHeader block anymore: we just merge it away anyway so its code layout doesn't matter. llvm-svn: 123077	2011-01-08 19:26:33 +00:00
Chris Lattner	2615130e1d	Implement a TODO: Enhance loopinfo to merge away the unconditional branch that it was leaving in loops after rotation (between the original latch block and the original header). With this change, it is possible for rotated loops to have just a single basic block, which is useful. llvm-svn: 123075	2011-01-08 19:10:28 +00:00
Chris Lattner	fee37c5fa3	inline preserveCanonicalLoopForm now that it is simple. llvm-svn: 123073	2011-01-08 18:55:50 +00:00
Chris Lattner	063dca0f6a	Three major changes: 1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross. 2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block?? 3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it. No behavior change, except for the name of the inserted preheader. llvm-svn: 123072	2011-01-08 18:52:51 +00:00
Chris Lattner	7fab23bc1d	LoopRotate requires canonical loop form, so it always has preheaders and latch blocks. Reorder entry conditions to make the pass faster and more logical. llvm-svn: 123069	2011-01-08 18:06:22 +00:00
Chris Lattner	d62691f4e8	use the LI ivar. llvm-svn: 123068	2011-01-08 17:49:51 +00:00
Chris Lattner	385f2ec6d8	some cleanups: remove dead arguments and eliminate ivars that are just passed to one function. llvm-svn: 123067	2011-01-08 17:48:33 +00:00
Chris Lattner	25ba40a0cc	fix an issue duncan pointed out, which could cause loop rotate to violate LCSSA form llvm-svn: 123066	2011-01-08 17:38:45 +00:00
Cameron Zwarich	b4ab257bcc	Fix coding style issues. llvm-svn: 123065	2011-01-08 17:07:11 +00:00
Cameron Zwarich	84986b298a	Make more passes preserve dominators (or state that they preserve dominators if they already do). This removes two dominator recomputations prior to isel, which is a 1% improvement in total llc time for 403.gcc. The only potentially suspect thing is making GCStrategy recompute dominators if it used a custom lowering strategy. llvm-svn: 123064	2011-01-08 17:01:52 +00:00
Cameron Zwarich	80bd9af7c5	Contract subloop bodies. However, it is still important to visit the phis at the top of subloop headers, as the phi uses logically occur outside of the subloop. llvm-svn: 123062	2011-01-08 15:52:22 +00:00
Chris Lattner	8c5defd0b0	Have loop-rotate simplify instructions (yay instsimplify!) as it clones them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch often turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. llvm-svn: 123060	2011-01-08 08:24:46 +00:00
Chris Lattner	43f8d16482	Revamp the ValueMapper interfaces in a couple ways: 1. Take a flags argument instead of a bool. This makes it more clear to the reader what it is used for. 2. Add a flag that says that "remapping a value not in the map is ok". 3. Reimplement MapValue to share a bunch of code and be a lot more efficient. For lookup failures, don't drop null values into the map. 4. Using the new flag a bunch of code can vaporize in LinkModules and LoopUnswitch, kill it. No functionality change. llvm-svn: 123058	2011-01-08 08:15:20 +00:00
Chris Lattner	2b3f20e6ec	two minor changes: switch to the standard ValueToValueMapTy map from ValueMapper.h (giving us access to its utilities) and add a fastpath in the loop rotation code, avoiding expensive ssa updator manipulation for values with nothing to update. llvm-svn: 123057	2011-01-08 07:21:31 +00:00
Cameron Zwarich	9ec19ea06a	Add the CallInst optimizations that don't involve expanding inline assembly to OptimizeInst() so that they can be used on a worklist instruction. llvm-svn: 122945	2011-01-06 02:56:42 +00:00
Cameron Zwarich	d28c78eb4f	Move the GEP handling in CodeGenPrepare to OptimizeInst(). llvm-svn: 122944	2011-01-06 02:44:52 +00:00
Cameron Zwarich	14ac865ca9	Split the optimizations in CodeGenPrepare that don't manipulate the iterators into a separate function, so that it can be called from a loop using a worklist rather than a loop traversing a whole basic block. llvm-svn: 122943	2011-01-06 02:37:26 +00:00
Jakob Stoklund Olesen	70be93a200	Zap the last two -Wself-assign warnings in llvm. Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there. llvm-svn: 122940	2011-01-06 01:33:22 +00:00
Cameron Zwarich	ce3b930a98	Stop reallocating SunkAddrs for each basic block. When we move to an instruction worklist, the key will need to become std::pair<BasicBlock, Value>. llvm-svn: 122932	2011-01-06 00:42:50 +00:00
Cameron Zwarich	b62ccb241b	Add some more statistics to CodeGenPrepare. llvm-svn: 122891	2011-01-05 17:47:38 +00:00
Cameron Zwarich	ced753fadf	Add some stats to CodeGenPrepare to make it easier to speed it up without regressing code quality. llvm-svn: 122887	2011-01-05 17:27:27 +00:00
Cameron Zwarich	6a78995369	Use pop_back_val instead of back followed by pop_back. llvm-svn: 122876	2011-01-05 16:08:47 +00:00
Cameron Zwarich	5a2bb998ac	Use a worklist for later iterations just like ordinary instsimplify. The next step is to only process instructions in subloops if they have been modified by an earlier simplification. llvm-svn: 122869	2011-01-05 05:47:47 +00:00
Cameron Zwarich	4c51d122d5	Change LoopInstSimplify back to a LoopPass. It revisits subloops rather than skipping them, but it should probably use a worklist and only revisit those instructions in subloops that have actually changed. It should probably also use a worklist after the first iteration like instsimplify now does. Regardless, it's only 0.3% of opt -O2 time on 403.gcc if it replaces the instcombine placed in the middle of the loop passes. llvm-svn: 122868	2011-01-05 05:15:53 +00:00
Owen Anderson	7b25ff04bd	Don't bother value numbering instructions with void types in GVN. In theory this should allow us to insert fewer things into the value numbering maps, but any speedup is beneath the noise threshold on my machine on 403.gcc. llvm-svn: 122844	2011-01-04 22:15:21 +00:00
Owen Anderson	e39cb57b09	Complete the NumberTable --> LeaderTable rename. llvm-svn: 122828	2011-01-04 19:29:46 +00:00
Owen Anderson	d7d06d3aaf	Fix typo in a comment. llvm-svn: 122827	2011-01-04 19:25:18 +00:00
Owen Anderson	51489b3b28	Prune #include's. llvm-svn: 122826	2011-01-04 19:24:57 +00:00
Owen Anderson	c7c3bc63f7	Clarify terminology, settling on referring to what was the "number table" as the "leader table", and rename methods to make it much more clear what they're doing. llvm-svn: 122823	2011-01-04 19:13:25 +00:00
Owen Anderson	83546f2fe0	When removing a value from GVN's leaders list, don't drop the Next pointer in a corner case. llvm-svn: 122822	2011-01-04 19:10:54 +00:00
Owen Anderson	41a1550ef5	Branch instructions don't produce values, so there's no need to generate a value number for them. This avoids adding them to the various value numbering tables, resulting in a minor (~3%) speedup for GVN on 403.gcc. llvm-svn: 122819	2011-01-04 18:54:18 +00:00
Owen Anderson	22c53e277a	Remove commented out code. llvm-svn: 122817	2011-01-04 18:22:08 +00:00
Cameron Zwarich	b2a41e9388	Switch to the new style of asterisk placement. llvm-svn: 122815	2011-01-04 18:19:19 +00:00
Chris Lattner	8643810ede	Teach loop-idiom to turn a loop containing a memset into a larger memset when safe. The testcase is basically this nested loop: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet. llvm-svn: 122806	2011-01-04 07:46:33 +00:00
Chris Lattner	a62b01dc37	restructure this a bit. Initialize the WeakVH with "I", the instruction after the store. The store will always be deleted if the transformation kicks in, so we'd do an N^2 scan of every loop block. Whoops. llvm-svn: 122805	2011-01-04 07:27:30 +00:00
Cameron Zwarich	f4e13699e7	Avoid finding loop back edges when we are not splitting critical edges in CodeGenPrepare (which is the default behavior). llvm-svn: 122801	2011-01-04 04:43:31 +00:00
Cameron Zwarich	e924969380	Address most of Duncan's review comments. Also, make LoopInstSimplify a simple FunctionPass. It probably doesn't have a reason to be a LoopPass, as it will probably drop the simple fixed point and either use RPO iteration or Duncan's approach in instsimplify of only revisiting instructions that have changed. The next step is to preserve LoopSimplify. This looks like it won't be too hard, although the pass manager doesn't actually seem to respect when non-loop passes claim to preserve LCSSA or LoopSimplify. This will have to be fixed. llvm-svn: 122791	2011-01-04 00:12:46 +00:00
Chris Lattner	0ba473c218	use the very-handy getTruncateOrZeroExtend helper function, and stop setting NSW: signed overflow is possible. Thanks to Dan for pointing these out. llvm-svn: 122790	2011-01-04 00:06:55 +00:00
Owen Anderson	0839d3930a	Fix comment. llvm-svn: 122788	2011-01-03 23:51:56 +00:00
Owen Anderson	d62d37225a	Use the new addEscapingValue callback to update GlobalsModRef when GVN adds PHIs of GEPs. For the moment, have GlobalsModRef handle this conservatively by simply removing the value from its maps. llvm-svn: 122787	2011-01-03 23:51:43 +00:00
Chris Lattner	bde6ec1db6	Duncan deftly points out that readnone functions aren't invalidated by stores, so they can be handled as 'simple' operations. llvm-svn: 122785	2011-01-03 23:38:13 +00:00
Owen Anderson	3a33d0cc4a	Simplify GVN's value expression structure, allowing the elimination of a lot of almost-but-not-quite-identical code. No intended functionality change. llvm-svn: 122760	2011-01-03 19:00:11 +00:00
Chris Lattner	16ca19ffc5	strength reduce my previous patch a bit. The only instructions that are allowed to have metadata operands are intrinsic calls, and the only ones that take metadata currently return void. Just reject all void instructions, which should not be value numbered anyway. To future proof things, add an assert to the getHashValue impl for calls to check that metadata operands aren't present. llvm-svn: 122759	2011-01-03 18:43:03 +00:00
Chris Lattner	142f1cd251	fix PR8895: metadata operands don't have a strong use of their nested values, so they can change and drop to null, which can change the hash and cause havoc. It turns out that it isn't a good idea to value number stuff with metadata operands anyway, so... don't. llvm-svn: 122758	2011-01-03 18:28:15 +00:00
Cameron Zwarich	43cecb1200	Switch a worklist in CodeGenPrepare to SmallVector and increase the inline capacity on the Visited SmallPtrSet. On 403.gcc, this is about a 4.5% speedup of CodeGenPrepare time (which itself is 10% of time spent in the backend). This is progress towards PR8889. llvm-svn: 122741	2011-01-03 06:33:01 +00:00
Chris Lattner	9e5e9ed79a	earlycse can do trivial within-a-block dead store elimination as well. This deletes 60 stores in 176.gcc that largely come from bitfield code. llvm-svn: 122736	2011-01-03 04:17:24 +00:00
Chris Lattner	4b9a525742	switch the load table to use a recycling bump pointer allocator, speeding earlycse up by 6%. llvm-svn: 122733	2011-01-03 03:53:50 +00:00
Chris Lattner	e0e32a9ef0	now that loads are in their own table, we can implement store->load forwarding. This allows EarlyCSE to zap 600 more loads from 176.gcc. llvm-svn: 122732	2011-01-03 03:46:34 +00:00
Chris Lattner	92bb0f9f9d	split loads and calls into separate tables. Loads are now just indexed by their pointer instead of using MemoryValue to wrap it. llvm-svn: 122731	2011-01-03 03:41:27 +00:00
Chris Lattner	4cb365414f	various cleanups, no functionality change. llvm-svn: 122729	2011-01-03 03:28:23 +00:00
Chris Lattner	b9a8efc960	Teach EarlyCSE to do trivial CSE of loads and read-only calls. On 176.gcc, this catches 13090 loads and calls, and increases the number of simple instructions CSE'd from 29658 to 36208. llvm-svn: 122727	2011-01-03 03:18:43 +00:00
Chris Lattner	79d83067ee	rename InstValue to SimpleValue, add some comments. llvm-svn: 122725	2011-01-03 02:20:48 +00:00
Michael J. Spencer	edb5bcdde5	CMake: Add missing source file. llvm-svn: 122724	2011-01-03 02:13:05 +00:00
Chris Lattner	d815f69b30	Allocate nodes for the scoped hash table from a recycling bump pointer allocator. This speeds up early cse by about 20% llvm-svn: 122723	2011-01-03 01:42:46 +00:00
Chris Lattner	02a9776b64	reduce redundancy in the hashing code and other misc cleanups. llvm-svn: 122720	2011-01-03 01:10:08 +00:00
Cameron Zwarich	cab9a0abab	Add a new loop-instsimplify pass, with the intention of replacing the instance of instcombine that is currently in the middle of the loop pass pipeline. This commit only checks in the pass; it will hopefully be enabled by default later. llvm-svn: 122719	2011-01-03 00:25:16 +00:00
Chris Lattner	0844c76f9a	fix some pastos llvm-svn: 122718	2011-01-02 23:29:58 +00:00
Chris Lattner	8fac5db251	add DEBUG and -stats output to earlycse. Teach it to CSE the rest of the non-side-effecting instructions. llvm-svn: 122716	2011-01-02 23:19:45 +00:00
Chris Lattner	18ae5436b1	Enhance earlycse to do CSE of casts, instsimplify and die. Add a testcase. llvm-svn: 122715	2011-01-02 23:04:14 +00:00
Chris Lattner	bf0aa927cc	split dom frontier handling stuff out to its own DominanceFrontier header, so that Dominators.h is just domtree. Also prune #includes a bit. llvm-svn: 122714	2011-01-02 22:09:33 +00:00
Chris Lattner	704541bb23	sketch out a new early cse pass. No functionality yet. llvm-svn: 122713	2011-01-02 21:47:05 +00:00
Chris Lattner	9c69406f2b	fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. llvm-svn: 122712	2011-01-02 21:14:18 +00:00
Chris Lattner	5702a43c09	If a loop iterates exactly once (has backedge count = 0) then don't mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. llvm-svn: 122711	2011-01-02 20:24:21 +00:00
Chris Lattner	8455b6e45e	enhance loop idiom recognition to scan all unconditionally executed blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples. llvm-svn: 122704	2011-01-02 19:01:03 +00:00
Chris Lattner	0cdc6f62a5	make inSubLoop much more efficient. llvm-svn: 122703	2011-01-02 18:53:08 +00:00
Chris Lattner	27497ece96	rip out isExitBlockDominatedByBlockInLoop, calling DomTree::dominates instead. isExitBlockDominatedByBlockInLoop is a relic of the days when domtree was just a tree and didn't have DFS numbers. Checking DFS numbers is faster and easier than "limiting the search of the tree". llvm-svn: 122702	2011-01-02 18:45:39 +00:00
Chris Lattner	0469e01c02	add a list of opportunities for future improvement. llvm-svn: 122701	2011-01-02 18:32:09 +00:00
Chris Lattner	ddf58010bd	Allow loop-idiom to run on multiple BB loops, but still only scan the loop header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but DOESN'T MERGE THE BLOCKS, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many many more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685	2011-01-02 07:58:36 +00:00
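The "history" loop in the commit above copies `n` consecutive elements to a destination shifted by one slot. A minimal sketch of the loop and a hand-written `memcpy` equivalent, assuming hypothetical function names and a flattened one-row view of the arrays (the real benchmark's types are not shown in the log):

```cpp
#include <cstddef>
#include <cstring>

// Loop form: dst[j+1] = src[j] for j in [0, n).
void copy_loop(int *dst, const int *src, std::size_t n) {
  for (std::size_t j = 0; j < n; ++j)
    dst[j + 1] = src[j];
}

// Hand-written equivalent. Only valid when the n ints at src do not overlap
// the n ints starting at dst + 1; memcpy has undefined behavior on overlap.
void copy_memcpy(int *dst, const int *src, std::size_t n) {
  if (n != 0)
    std::memcpy(dst + 1, src, n * sizeof(int));
}
```

The no-overlap precondition matters: calling `copy_loop(a, a, n)` on a single array propagates `a[0]` into every later slot, which no block copy reproduces. That is why a transformation like this needs the aliasing checks added in neighboring commits.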
Chris Lattner	5b5a043d82	remove debugging code. llvm-svn: 122683	2011-01-02 07:37:13 +00:00
Chris Lattner	12f91befce	add some -stats output. llvm-svn: 122682	2011-01-02 07:36:44 +00:00
Chris Lattner	679572e584	improve loop rotation to use CodeMetrics to analyze the size of a loop header instead of its own code size estimator. This allows it to handle bitcasts etc more precisely. llvm-svn: 122681	2011-01-02 07:35:53 +00:00
Chris Lattner	85b6d81d41	teach loop idiom recognition to form memcpy's from simple loops. llvm-svn: 122678	2011-01-02 03:37:56 +00:00
Chris Lattner	a3514441e0	add a validity check that was missed, fixing a crash on the new testcase. llvm-svn: 122662	2011-01-01 20:12:04 +00:00
Chris Lattner	91a4435875	improve validity check to handle constant-trip-count loops more aggressively. In practice, this doesn't help anything though, see the todo. llvm-svn: 122660	2011-01-01 19:54:22 +00:00
Chris Lattner	8b3baf6d75	implement the "no aliasing accesses in loop" safety check. This pass should be correct now. llvm-svn: 122659	2011-01-01 19:39:01 +00:00
Chris Lattner	65a699d4d0	simplify this, isBytewiseValue handles the extra check. We still check for "multiple of a byte" in size to make it clear that the >> 3 below is safe. llvm-svn: 122604	2010-12-28 18:53:48 +00:00
Duncan Sands	5cf10e691b	Silence gcc warning about an unused variable when doing a release build. llvm-svn: 122593	2010-12-28 09:41:15 +00:00
Chris Lattner	cb18bfa3d2	fix some issues Frits noticed, add AliasAnalysis as a dependency llvm-svn: 122585	2010-12-27 18:39:08 +00:00
Benjamin Kramer	7cba269dfb	SimplifyLibCalls: Use IRBuilder to simplify code. llvm-svn: 122575	2010-12-27 00:16:46 +00:00
Chris Lattner	b9fe685b9a	have loop-idiom nuke instructions that feed stores that get removed. llvm-svn: 122574	2010-12-27 00:03:23 +00:00
Chris Lattner	29e14edc8d	implement enough of the memset inference algorithm to recognize and insert memsets. This is still missing one important validity check, but this is enough to compile stuff like this: void test0(std::vector<char> &X) { for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I) *I = 0; } void test1(std::vector<int> &X) { for (long i = 0, e = X.size(); i != e; ++i) X[i] = 0x01010101; } With: $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc to: __Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rsi cmpq %rsi, %rax je LBB0_2 ## BB#1: ## %bb.nph subq %rax, %rsi movq %rax, %rdi callq ___bzero LBB0_2: ## %for.end addq $8, %rsp ret ... __Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rdx subq %rax, %rdx cmpq $4, %rdx jb LBB1_2 ## BB#1: ## %for.body.preheader andq $-4, %rdx movl $1, %esi movq %rax, %rdi callq _memset LBB1_2: ## %for.end addq $8, %rsp ret llvm-svn: 122573	2010-12-26 23:42:51 +00:00
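The `test1` case above works because every byte of `0x01010101` is the same, so a 32-bit store loop can be expressed as a byte-wise `memset` (note the `movl $1, %esi` in the generated code). A minimal sketch of that reasoning, assuming hypothetical names `fill_loop`, `fill_memset` and `is_bytewise`; the last is an illustration only and is not LLVM's `isBytewiseValue`:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Loop form, modeled on test1 from the commit message.
void fill_loop(std::vector<std::int32_t> &X) {
  for (std::size_t i = 0, e = X.size(); i != e; ++i)
    X[i] = 0x01010101;
}

// Hand-written equivalent: each byte of 0x01010101 is 0x01, so one memset
// with byte value 1 writes the same bit pattern.
void fill_memset(std::vector<std::int32_t> &X) {
  if (!X.empty())
    std::memset(X.data(), 1, X.size() * sizeof(std::int32_t));
}

// Hypothetical check: true when all four bytes of V are identical, which is
// the condition under which the store can become a single-byte memset.
bool is_bytewise(std::uint32_t V) {
  std::uint32_t B = V & 0xFF;
  return V == B * 0x01010101u;
}
```

A value such as `0x01020304` fails the check, so a loop storing it could not be rewritten as a plain `memset`.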
Chris Lattner	6cf8d6cc6e	start using irbuilder to make mem intrinsics in a few passes. llvm-svn: 122572	2010-12-26 22:57:41 +00:00
Chris Lattner	7c5f9c35d1	sketch more of this out. llvm-svn: 122567	2010-12-26 20:45:45 +00:00
Chris Lattner	9cb1035f94	move isBytewiseValue out to ValueTracking.h/cpp llvm-svn: 122565	2010-12-26 20:15:01 +00:00
Chris Lattner	81ae3f299a	actually add the file... llvm-svn: 122563	2010-12-26 19:39:38 +00:00
Chris Lattner	2ef535a4e4	Start of a pass for recognizing memset and memcpy idioms. No functionality yet. llvm-svn: 122562	2010-12-26 19:32:44 +00:00
Benjamin Kramer	30342fb1fd	Simplify code. llvm-svn: 122561	2010-12-26 15:23:45 +00:00
Benjamin Kramer	b90b2f0635	Fix a thinko pointed out by Frits van Bommel: looking through global variables in isBytewiseValue is not safe. llvm-svn: 122550	2010-12-24 22:23:59 +00:00
Benjamin Kramer	ea9152e551	MemCpyOpt: Turn memcpys from a constant into a memset if possible. This allows us to compile "int cst[] = {-1, -1, -1};" into movl $-1, 16(%rsp) movq $-1, 8(%rsp) instead of movl _cst+8(%rip), %eax movl %eax, 16(%rsp) movq _cst(%rip), %rax movq %rax, 8(%rsp) llvm-svn: 122548	2010-12-24 21:17:12 +00:00
Owen Anderson	5d690d4168	It is possible for SimplifyCFG to cause PHI nodes to become redundant too late in the optimization pipeline to be caught by instcombine, and it's not feasible to catch them in SimplifyCFG because the use-lists are in an inconsistent state at the point where it could know that it needs to simplify them. Instead, have CodeGenPrepare look for trivially redundant PHIs as part of its general cleanup effort. llvm-svn: 122516	2010-12-23 20:57:35 +00:00
Mon P Wang	18b762a946	Preserve the address space when generating bitcasts for MemTransferInst in ConvertToScalarInfo llvm-svn: 122462	2010-12-23 01:41:32 +00:00
Jeffrey Yasskin	9b43f33620	Change all self assignments X=X to (void)X, so that we can turn on a new gcc warning that complains on self-assignments and self-initializations. llvm-svn: 122458	2010-12-23 00:58:24 +00:00
Owen Anderson	5ab8d4b5e5	Give GVN back the ability to perform simple conditional propagation on conditional branch values. I still think that LVI should be handling this, but that capability is some ways off in the future, and this matters for some significant benchmarks. llvm-svn: 122378	2010-12-21 23:54:34 +00:00
Owen Anderson	12470778d7	Remove dead code. llvm-svn: 122371	2010-12-21 22:31:24 +00:00
Benjamin Kramer	43493c089f	GVN's Expression is not POD-like (it contains a SmallVector). Simplify code while at it. llvm-svn: 122362	2010-12-21 21:30:19 +00:00
Chris Lattner	b6252a376a	tidy up llvm-svn: 122190	2010-12-19 20:24:28 +00:00
Chris Lattner	408a684d29	Enhance LICM to promote alias sets whose pointers themselves are stored, which doesn't affect the memory address being promoted. llvm-svn: 122172	2010-12-19 05:57:25 +00:00
Chris Lattner	3337a81450	fix PR8602, a bug in an assertion: a volatile store of a pointer does not make the alias set for that pointer volatile, just stores to the pointer. llvm-svn: 122171	2010-12-19 05:51:54 +00:00
Chris Lattner	fb888622c3	revert r122164, I'm going to go with a different approach. llvm-svn: 122168	2010-12-19 04:23:03 +00:00
Chris Lattner	583ec6fa44	first step to fixing PR8642: don't fold away empty basic blocks which have trapping constant exprs in them due to PHI nodes. Eliminating them can cause the constant expr to be evaluated on new paths if the input edges are critical. llvm-svn: 122164	2010-12-19 03:02:34 +00:00
Dan Gohman	93dc2b808f	Revert r64460. strtol and friends cannot be marked readonly, even with a null endptr argument, because they may write to errno. This fixes a selfhost miscompile observed on Linux targets when TBAA was enabled. llvm-svn: 122014	2010-12-17 01:09:43 +00:00
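The errno write mentioned in the commit above is easy to observe from user code. A minimal sketch, assuming a hypothetical helper name `errno_after_strtol`; it passes a null `endptr`, as in the commit message:

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: clear errno, parse Text with a null endptr, and
// report what errno holds afterwards.
int errno_after_strtol(const char *Text) {
  errno = 0;
  (void)std::strtol(Text, nullptr, 10);
  return errno;
}
```

For an input too large for `long`, the C standard requires `strtol` to set `errno` to `ERANGE`. The call therefore has a side effect on memory even when its only pointer output is null, which is why it cannot be treated as readonly.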
Frits van Bommel	9bbe849fc3	Fix a bug in the loop in JumpThreading::ProcessThreadableEdges() where it could falsely produce a MultipleDestSentinel value if the first predecessor ended with an 'indirectbr'. If that happened, it caused an unnecessary FindMostPopularDest() call. This wasn't a correctness problem, but it broke the fast path for single-predecessor blocks. llvm-svn: 121966	2010-12-16 12:16:00 +00:00
Dan Gohman	e1a17a3473	Make memcpyopt TBAA-aware. llvm-svn: 121944	2010-12-16 02:51:19 +00:00
Dan Gohman	4467aa5294	Preserve TBAA tags when doing load PRE. llvm-svn: 121921	2010-12-15 23:53:55 +00:00
Dan Gohman	a4fcd2418d	Move Value::getUnderlyingObject to be a standalone function so that it can live in Analysis instead of VMCore. llvm-svn: 121885	2010-12-15 20:02:24 +00:00
Frits van Bommel	3d1803495e	Teach jump threading to "look through" a select when the branch direction of a terminator depends on it. When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately. llvm-svn: 121859	2010-12-15 09:51:20 +00:00
Owen Anderson	35609d97ae	Fix PR8790, another instance where unreachable code can cause instruction simplification to fail; this case involves a select that simplifies to itself. llvm-svn: 121817	2010-12-15 00:55:35 +00:00
Owen Anderson	15c85c916f	Cleanup trailing whitespace. llvm-svn: 121816	2010-12-15 00:52:44 +00:00
Chris Lattner	73a58627c3	simplify code and reduce indentation llvm-svn: 121670	2010-12-13 02:38:13 +00:00
Chris Lattner	bc4457e317	enhance memcpyopt to zap memcpy's that have the same src/dst. llvm-svn: 121362	2010-12-09 07:45:45 +00:00
Chris Lattner	fd51c52ef6	fix PR8753, eliminating a case where we'd infinitely make a substitution because it doesn't actually change the IR. Patch by Jakub Staszak! llvm-svn: 121361	2010-12-09 07:39:50 +00:00
Frits van Bommel	d2f4b09e10	Remove some dead code from the jump threading pass. The last uses of these functions were removed in r113852 when LazyValueInfo was permanently enabled and removed the need for them. llvm-svn: 121133	2010-12-07 13:08:07 +00:00
Jay Foad	583abbc4df	PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method trunc(), to be const and to return a new value instead of modifying the object in place. llvm-svn: 121120	2010-12-07 08:25:19 +00:00
Frits van Bommel	d9df6eaa9c	Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInts or BlockAddresses. llvm-svn: 121066	2010-12-06 23:36:56 +00:00
Chris Lattner	4dc53e37d9	Use a stronger predicate here, pointed out by Duncan llvm-svn: 121040	2010-12-06 21:48:10 +00:00
Chris Lattner	ca335e38cf	add some DEBUG statements. llvm-svn: 121038	2010-12-06 21:13:51 +00:00
Chris Lattner	94fbdf3814	Fix PR8728, a miscompilation I recently introduced. When optimizing memcpy's like: memcpy(A, B) memcpy(A, C) we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get: memcpy(A, B) memcpy(A, A) which is not correct to transform into: memcpy(A, A) This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks Jakub! llvm-svn: 120974	2010-12-06 01:48:06 +00:00
Frits van Bommel	76244867cf	Refactor jump threading. Should have no functional change other than the order of two transformations that are mutually-exclusive and the exact formatting of debug output. Internally, it now stores the ConstantInts as Constants, and actual undef values instead of nulls. llvm-svn: 120946	2010-12-05 19:06:41 +00:00
Frits van Bommel	5e75ef4a8e	Remove trailing whitespace. llvm-svn: 120945	2010-12-05 19:02:47 +00:00
Chris Lattner	1c577b54b0	fix a bozo bug I introduced in r119930, causing a miscompile of 20040709-1.c from the gcc testsuite. I was using the size of a pointer instead of the pointee. This fixes rdar://8713376 llvm-svn: 120519	2010-12-01 01:24:55 +00:00
Chris Lattner	903add84d9	Enhance DSE to handle the variable index case in PR8657. llvm-svn: 120498	2010-11-30 23:43:23 +00:00
Chris Lattner	c0f3379ae0	teach DSE to use GetPointerBaseWithConstantOffset to analyze may-aliasing stores that partially overlap with different base pointers. This implements PR6043 and the non-variable part of PR8657 llvm-svn: 120485	2010-11-30 23:05:20 +00:00
Chris Lattner	e28618de59	move GetPointerBaseWithConstantOffset out of GVN into ValueTracking.h llvm-svn: 120476	2010-11-30 22:25:26 +00:00
Chris Lattner	50162e3c2a	remove a fixed fixme llvm-svn: 120474	2010-11-30 22:18:11 +00:00
Chris Lattner	6712251f41	Make DeleteDeadInstruction be a static function, move some code around. llvm-svn: 120471	2010-11-30 21:58:14 +00:00
Chris Lattner	51d67ce2ff	switch RemoveAccessedObjects to use AliasAnalysis::Location to simplify the code. We now get accurate sizes on Loads, though it surely doesn't matter in practice. llvm-svn: 120469	2010-11-30 21:47:58 +00:00
Chris Lattner	f80b39986f	two improvements to RemoveAccessedObjects: 1. if the underlying pointer passed in can be resolved to any argument or alloca, then we don't need to scan. Previously we would only avoid the scan if the alloca or byval was actually considered dead. 2. The dead store processing code is itself completely dead and didn't handle volatile stores right anyway, so delete it. This allows simplifying the interface to RemoveAccessedObjects. llvm-svn: 120467	2010-11-30 21:38:30 +00:00
Chris Lattner	7fe08b67fa	remove the "undead" terminology, which is nonstandard and never made sense to me. We now have a set of dead stack objects, and they become live when loaded. Fix a theoretical problem where we'd pass in the wrong pointer to the alias query. llvm-svn: 120465	2010-11-30 21:32:12 +00:00
Chris Lattner	127818d746	move call handling in handleEndBlock up a bit, and simplify it. If the call might read all the allocas, stop scanning early. Convert a vector to smallvector, shrink SmallPtrSet to 16 instead of 64 to avoid crazy linear scans. llvm-svn: 120463	2010-11-30 21:18:46 +00:00
Dale Johannesen	d3a58c8fa1	Avoid exponential growth of a table. It feels like there should be a better way to do this. PR 8679. llvm-svn: 120457	2010-11-30 20:23:21 +00:00
Chris Lattner	60a8b3dab8	various cleanups and code simplification llvm-svn: 120454	2010-11-30 19:48:15 +00:00
Chris Lattner	51c28a93cc	make getPointerSize a static function. Add ivars to DSE for AA and MD pass info instead of using getAnalysis<> all over. llvm-svn: 120453	2010-11-30 19:34:42 +00:00
Chris Lattner	77d79fa25f	reduce indentation, clean up TD use a bit. llvm-svn: 120452	2010-11-30 19:28:23 +00:00
Chris Lattner	b63ba73b1b	enhance isRemovable to refuse to delete volatile mem transfers now that DSE hacks on them. This fixes a regression I introduced, by generalizing DSE to hack on transfers. llvm-svn: 120445	2010-11-30 19:12:10 +00:00
Chris Lattner	58b779e9c2	Rewrite the main DSE loop to be written in terms of reasoning about pairs of AA::Location's instead of looking for MemDep's "Def" predicate. This is more powerful and general, handling memset/memcpy/store all uniformly, and implementing PR8701 and probably obsoleting parts of memcpyoptimizer. This also fixes an obscure bug with init.trampoline and i8 stores, but I'm not surprised it hasn't been hit yet. Enhancing init.trampoline to carry the size that it stores would allow DSE to be much more aggressive about optimizing them. llvm-svn: 120406	2010-11-30 07:23:21 +00:00
Anders Carlsson	e3ea1cba79	Add a puts optimization that converts puts() to putchar('\n'). llvm-svn: 120398	2010-11-30 06:19:18 +00:00
Chris Lattner	3590ef817c	rename a function and reduce some indentation, no functionality change. llvm-svn: 120391	2010-11-30 05:30:45 +00:00
Chris Lattner	2227a8a192	rename doesClobberMemory -> hasMemoryWrite to be more specific, and remove an actively-wrong comment. llvm-svn: 120378	2010-11-30 01:37:52 +00:00
Chris Lattner	9d179d911d	clean up handling of 'free', detangling it from everything else. It can be seriously improved, but at least now it isn't intertwined with the other logic. llvm-svn: 120377	2010-11-30 01:28:33 +00:00
Chris Lattner	9a146372b5	Teach basicaa that memset's modref set is at worst "mod" and never contains "ref". Enhance DSE to use a modref query instead of a store-specific hack to generalize the "ignore may-alias stores" optimization to handle memset and memcpy. llvm-svn: 120368	2010-11-30 00:28:45 +00:00
Chris Lattner	c3c754f750	my previous patch would cause us to start deleting some volatile stores, fix and add a testcase. llvm-svn: 120363	2010-11-30 00:12:39 +00:00

5050 Commits