llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	1aafe4cece	reduce indentation, (~XorCST->getValue()).isSignBit() -> isMaxSignedValue() llvm-svn: 63500	2009-02-02 07:15:30 +00:00
Nick Lewycky	f23908151a	Reinstate this optimization to fold icmp of xor when possible. Don't try to turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This may have been increasing register pressure leading to the bzip2 slowdown. llvm-svn: 63487	2009-01-31 21:30:05 +00:00
Chris Lattner	9e2b9f3234	Fix PR3452 (an infinite loop bootstrapping) by disabling the recent improvements to the EvaluateInDifferentType code. This code works by just inserted a bunch of new code and then seeing if it is useful. Instcombine is not allowed to do this: it can only insert new code if it is useful, and only when it is converging to a more canonical fixed point. Now that we iterate when DCE makes progress, this causes an infinite loop when the code ends up not being used. llvm-svn: 63483	2009-01-31 19:05:27 +00:00
Chris Lattner	76a63ed099	now that all the pieces are in place, teach instcombine's simplifydemandedbits to simplify instructions with multiple uses in contexts where it can get away with it. This allows it to simplify the code in multi-use-or.ll into a single 'add double'. This change is particularly interesting because it will cover up for some common codegen bugs with large integers created due to the recent SROA patch. When working on fixing those bugs, this should be disabled. llvm-svn: 63481	2009-01-31 08:40:03 +00:00
Chris Lattner	3e2cb66c56	simplify/clarify control flow and improve comments, no functionality change. llvm-svn: 63480	2009-01-31 08:24:16 +00:00
Chris Lattner	83c6a141b8	make some fairly meaty internal changes to how SimplifyDemandedBits works. Now, if it detects that "V" is the same as some other value, SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately. This has two benefits: 1) simpler code in the recursive SimplifyDemandedBits routine. 2) it allows future fun stuff in instcombine where an operation has multiple uses and can be simplified in one context, but not all. #2 isn't implemented yet, this patch should have no functionality change. llvm-svn: 63479	2009-01-31 08:15:18 +00:00
Chris Lattner	585cfb2ce7	minor cleanups llvm-svn: 63477	2009-01-31 07:26:06 +00:00
Chris Lattner	94cfb281c3	make sure to set Changed=true when instcombine hacks on the code, not doing so prevents it from properly iterating and prevents it from deleting the entire body of dce-iterate.ll llvm-svn: 63476	2009-01-31 07:04:22 +00:00
Chris Lattner	ec99c46d44	Simplify and generalize the SROA "convert to scalar" transformation to be able to handle ANY alloca that is poked by loads and stores of bitcasts and GEPs with constant offsets. Before the code had a number of annoying limitations and caused it to miss cases such as storing into holes in structs and complex casts (as in bitfield-sroa) where we had unions of bitfields etc. This also handles a number of important cases that are exposed due to the ABI lowering stuff we do to pass stuff by value. One case that is pretty great is that we compile 2006-11-07-InvalidArrayPromote.ll into: define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind { %tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1) %tmp105 = bitcast <4 x i32> %tmp10 to i128 %tmp1056 = zext i128 %tmp105 to i256 %tmp.upgrd.43 = lshr i256 %tmp1056, 96 %tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32 ret i32 %tmp.upgrd.44 } which turns into: _func: subl $28, %esp cvttps2dq %xmm1, %xmm0 movaps %xmm0, (%esp) movl 12(%esp), %eax addl $28, %esp ret Which is pretty good code all things considering :). One effect of this is that SROA will start generating arbitrary bitwidth integers that are a multiple of 8 bits. In the case above, we got a 256 bit integer, but the codegen guys assure me that it can handle the simple and/or/shift/zext stuff that we're doing on these operations. This addresses rdar://6532315 llvm-svn: 63469	2009-01-31 02:28:54 +00:00
Chris Lattner	df17987c19	Fix some issues with volatility, move "CanConvertToScalar" check after the others. llvm-svn: 63227	2009-01-28 20:16:43 +00:00
Duncan Sands	5a913d61e3	Rename getAnalysisToUpdate to getAnalysisIfAvailable. llvm-svn: 63198	2009-01-28 13:14:17 +00:00
Mon P Wang	3537a62704	Fixed optimization of combining two shuffles where the first shuffle inputs has a different number of elements than the output. llvm-svn: 62998	2009-01-26 04:39:00 +00:00
Chris Lattner	9449991c4f	Handle single-entry phi nodes gracefully in condprop. llvm-svn: 62985	2009-01-26 02:18:20 +00:00
Chris Lattner	7b6647c178	Fix PR3408 by making a non-obvious assumption very obvious, and handling the flaw inherent in that assumption. :) llvm-svn: 62984	2009-01-26 02:11:30 +00:00
Chris Lattner	57cb472b56	More cleanups and simplifications, no functionality change. llvm-svn: 62983	2009-01-26 01:57:01 +00:00
Chris Lattner	d67aaa6560	tidy asserts llvm-svn: 62982	2009-01-26 01:38:24 +00:00
Torok Edwin	f4395ea97a	testcase for PR3381. Also it was an empty struct, not a void after all. llvm-svn: 62920	2009-01-24 17:16:04 +00:00
Torok Edwin	73ff92272f	void* is represented as pointer to empty struct {}. Thus we need to check whether the struct is empty before trying to index into it. This fixes PR3381. llvm-svn: 62918	2009-01-24 11:30:49 +00:00
Chris Lattner	72cd68fe64	Make InstCombineStoreToCast handle aggregates more aggressively, handling the case in Transforms/InstCombine/cast-store-gep.ll, which is a heavily reduced testcase from Clang on x86-64. llvm-svn: 62904	2009-01-24 01:00:13 +00:00
Gabor Greif	eb61fcf2a1	Simplify the logic of getting hold of a PHI predecessor block. There is now a direct way from value-use-iterator to incoming block in PHINode's API. This way we avoid the iterator->index->iterator trip, and especially the costly getOperandNo() invocation. Additionally there is now an assertion that the iterator really refers to one of the PHI's Uses. llvm-svn: 62869	2009-01-23 19:40:15 +00:00
Chris Lattner	77527f5812	Remove uses of uint32_t in favor of 'unsigned' for better compatibility with cygwin. Patch by Jay Foad! llvm-svn: 62695	2009-01-21 18:09:24 +00:00
Dale Johannesen	b5721632ee	Make special cases (0 inf nan) work for frem. Besides APFloat, this involved removing code from two places that thought they knew the result of frem(0., x) but were wrong. llvm-svn: 62645	2009-01-21 00:35:19 +00:00
Chris Lattner	73d7fe5a34	improve compatibility with cygwin, patch by Jay Foad! llvm-svn: 62535	2009-01-19 22:00:18 +00:00
Chris Lattner	6f34e317e9	Fix PR3353, infinitely jump threading an infinite loop make from switches. llvm-svn: 62529	2009-01-19 21:20:34 +00:00
Chris Lattner	64b7bd7f9e	Fix rdar://6505632, an llc crash on 483.xalancbmk llvm-svn: 62470	2009-01-18 20:35:00 +00:00
Nick Lewycky	3ced0dfa69	Fix copy and pasted typos that prevented strtok_r, realloc, getenv, ungetc, putc, puts, perror, vscanf and vsscanf from getting annotations. Add annotations for eight printf functions, memalign, pread and pwrite. On Linux, llvm-gcc sometimes renames strdup, getc, putc, strtok_r, scanf and sscanf. Match the alternate function names. Fix a crash annotating opendir. Don't mark fsetpos's second parameter as nocapture. It's supposed to be captured. Do mark fopen's path and mode strings as nocapture. Mark ferror as readonly, but not fileno which may set errno. llvm-svn: 62456	2009-01-18 04:34:36 +00:00
Chris Lattner	db2d9613d2	Fix PR3335 by not turning a store to one address space into a store to another. llvm-svn: 62351	2009-01-16 20:12:52 +00:00
Chris Lattner	733256fe31	reduce indentation by using early exits, no functionality change. llvm-svn: 62350	2009-01-16 20:08:59 +00:00
Evan Cheng	beac6f8b0c	Clean up previous cast optimization a bit. Also make zext elimination a bit more aggressive: if it's not necessary to emit an AND (i.e. high bits are already zero), it's profitable to evaluate the operand at a different type. llvm-svn: 62297	2009-01-16 02:11:43 +00:00
Rafael Espindola	6de96a1b5d	Add the private linkage. llvm-svn: 62279	2009-01-15 20:18:42 +00:00
Evan Cheng	ff716cb342	Eliminate a redundant check. llvm-svn: 62264	2009-01-15 17:09:07 +00:00
Evan Cheng	60e19a46f2	- Teach CanEvaluateInDifferentType of this xform: sext (zext ty1), ty2 -> zext ty2 - Looking at the number of sign bits of the a sext instruction to determine whether new trunc + sext pair should be added when its source is being evaluated in a different type. llvm-svn: 62263	2009-01-15 17:01:23 +00:00
Chris Lattner	8fb9480ed2	Fix PR3325, a miscompilation of invokes by IPSCCP. Patch by Jay Foad! llvm-svn: 62244	2009-01-14 21:01:16 +00:00
Dale Johannesen	1f0e0e7c9c	Fix the time regression I introduced in 464.h264ref with my earlier patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. Also, when we build an expression that involves a (possibly non-affine) IV from a different loop as well as an IV from the one we're interested in (containsAddRecFromDifferentLoop), don't recurse into that. We can't do much with it and will get in trouble if we try to create new non-affine IVs or something. More testcases are coming. llvm-svn: 62212	2009-01-14 02:35:31 +00:00
Dan Gohman	59af77376c	Make instcombine ensure that all allocas are explicitly aligned at at least their preferred alignment. llvm-svn: 62176	2009-01-13 20:18:38 +00:00
Duncan Sands	dc020f9c3c	Rename getABITypeSize to getTypePaddedSize, as suggested by Chris. llvm-svn: 62099	2009-01-12 20:38:59 +00:00
Chris Lattner	bd3c7c8b52	Duncan is nervous about undefinedness of % with negatives. I'm not thrilled about 64-bit % in general, so rewrite to use * instead. llvm-svn: 62047	2009-01-11 20:41:36 +00:00
Chris Lattner	b19151686f	do not generated GEPs into vectors where they don't already exist. We should treat vectors as atomic types, not like arrays. llvm-svn: 62046	2009-01-11 20:23:52 +00:00
Chris Lattner	171d2d474f	Make a couple of cleanups to the instcombine bitcast/gep canonicalization transform based on duncan's comments: 1) improve the comment about %. 2) within our index loop make sure the offset stays within the type size, instead of within the abi size. This allows us to reason explicitly about landing in tail padding and means that issues like non-zero offsets into [0 x foo] types don't occur anymore. llvm-svn: 62045	2009-01-11 20:15:20 +00:00
Chris Lattner	5f54d50917	fix typo Duncan noticed. llvm-svn: 61997	2009-01-09 18:31:39 +00:00
Chris Lattner	ae0e857b98	Fix PR3304 llvm-svn: 61995	2009-01-09 18:18:43 +00:00
Misha Brukman	5cbf223916	Removed trailing whitespace from Makefiles. llvm-svn: 61991	2009-01-09 16:44:42 +00:00
Chris Lattner	f50aa6ae5c	Implement rdar://6480391, extending of equality icmp's to avoid a truncation. I noticed this in the code compiled for a routine using std::map, which produced this code: %25 = tail call i32 @memcmp(i8* %24, i8* %23, i32 6) nounwind readonly %.lobit.i = lshr i32 %25, 31 ; <i32> [#uses=1] %tmp.i = trunc i32 %.lobit.i to i8 ; <i8> [#uses=1] %toBool = icmp eq i8 %tmp.i, 0 ; <i1> [#uses=1] br i1 %toBool, label %bb3, label %bb4 which compiled to: call L_memcmp$stub shrl $31, %eax testb %al, %al jne LBB1_11 ## with this change, we compile it to: call L_memcmp$stub testl %eax, %eax js LBB1_11 This triggers all the time in common code, with patters like this: %169 = and i32 %ply, 1 ; <i32> [#uses=1] %170 = trunc i32 %169 to i8 ; <i8> [#uses=1] %toBool = icmp ne i8 %170, 0 ; <i1> [#uses=1] %7 = lshr i32 %6, 24 ; <i32> [#uses=1] %9 = trunc i32 %7 to i8 ; <i8> [#uses=1] %10 = icmp ne i8 %9, 0 ; <i1> [#uses=1] etc llvm-svn: 61985	2009-01-09 07:47:06 +00:00
Chris Lattner	0f7cf1d7e1	Remove some old code that looks like a remanant from signed-types days. llvm-svn: 61984	2009-01-09 07:10:58 +00:00
Chris Lattner	482eb70a10	Fix PR3298, a crash in Jump Threading. Apparently even jump threading can have bugs, who knew? ;-) llvm-svn: 61983	2009-01-09 06:08:12 +00:00
Chris Lattner	fef138b140	Fix part 3/2 of PR3290, making instcombine zap (gep(bitcast)) when possible. llvm-svn: 61980	2009-01-09 05:44:56 +00:00
Chris Lattner	a784a2ce01	move some code, check to see if the input to the GEP is a bitcast (which is constant time and cheap) before checking hasAllZeroIndices. llvm-svn: 61976	2009-01-09 04:53:57 +00:00
Chris Lattner	c518dfd11b	This implements the second half of the fix for PR3290, handling loads from allocas that cover the entire aggregate. This handles some memcpy/byval cases that are produced by llvm-gcc. This triggers a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator <kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon). llvm-svn: 61915	2009-01-08 05:42:05 +00:00
Chris Lattner	f2b8c82ad1	Implement the first half of PR3290: if there is a store of an integer to a (transitive) bitcast the alloca and if that integer has the full size of the alloca, then it clobbers the whole thing. Handle this by extracting pieces out of the stored integer and filing them away in the SROA'd elements. This triggers fairly frequently because the CFE uses integers to pass small structs by value and the inliner exposes these. For example, in kimwitu++, I see a bunch of these with i64 stores to "%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>" In 176.gcc I see a few i32 stores to "%struct..0anon". In the testcase, this is a difference between compiling test1 to: _test1: subl $12, %esp movl 20(%esp), %eax movl %eax, 4(%esp) movl 16(%esp), %eax movl %eax, (%esp) movl (%esp), %eax addl 4(%esp), %eax addl $12, %esp ret vs: _test1: movl 8(%esp), %eax addl 4(%esp), %eax ret The second half of this will be to handle loads of the same form. llvm-svn: 61853	2009-01-07 08:11:13 +00:00
Chris Lattner	9a2de65fd6	Factor a bunch of code out into a helper method. llvm-svn: 61852	2009-01-07 07:18:45 +00:00
Chris Lattner	db561146aa	use continue to simplify code and reduce nesting, no functionality change. llvm-svn: 61851	2009-01-07 06:39:58 +00:00
Chris Lattner	938b54f383	Get TargetData once up front and cache as an ivar instead of requerying it all over the place. llvm-svn: 61850	2009-01-07 06:34:28 +00:00
Chris Lattner	a63dba9e6c	Use the hasAllZeroIndices predicate to simplify some code, no functionality change. llvm-svn: 61849	2009-01-07 06:25:07 +00:00
Chris Lattner	2fdcc59bb6	Change m_ConstantInt and m_SelectCst to take their constant integers as template arguments instead of as instance variables, exposing more optimization opportunities to the compiler earlier. llvm-svn: 61776	2009-01-05 23:53:12 +00:00
Evan Cheng	8804293fe9	Find loop back edges only after empty blocks are eliminated. llvm-svn: 61752	2009-01-05 21:17:27 +00:00
Nick Lewycky	e4e5532e05	Move the libcall annotating part from doFinalization to doInitialization. Finalization occurs after all the FunctionPasses in the group have run, which is clearly not what we want. This also means that we have to make sure that we apply the right param attributes when creating a new function. Also, add a missed optimization: strdup and strndup. NoCapture and NoAlias return! llvm-svn: 61658	2009-01-05 00:07:50 +00:00
Nick Lewycky	959af7ba30	Run a post-pass that marks known function declarations by name. llvm-svn: 61632	2009-01-04 20:27:34 +00:00
Bill Wendling	0c04f9fdc3	Revert this transform. It was causing some dramatic slowdowns in a few tests. See PR3266. llvm-svn: 61623	2009-01-04 06:19:11 +00:00
Bill Wendling	0fcff2c203	Fix comment. llvm-svn: 61538	2009-01-01 01:19:59 +00:00
Bill Wendling	aedb54a947	Add transformation: xor (or (icmp, icmp), true) -> and(icmp, icmp) This is possible because of De Morgan's law. llvm-svn: 61537	2009-01-01 01:18:23 +00:00
Dale Johannesen	656237beca	Revert 61362 and 61402 until SPEC breakage is fixed. llvm-svn: 61403	2008-12-23 23:21:35 +00:00
Dale Johannesen	f8b161bcd1	This fixes the bug in 175.vpr. It doesn't fix the other SPEC breakage. I'll be reverting all recent changes shortly, this checking is mostly so this change doesn't get lost. llvm-svn: 61402	2008-12-23 23:05:26 +00:00
Dale Johannesen	93b9aa8799	Fix the time regression I introduced in 464.h264ref with my last patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. I owe some testcases for this, want to get it in for nightly runs. llvm-svn: 61362	2008-12-23 02:12:52 +00:00
Owen Anderson	164274eeb1	Don't forget to remove phi nodes from the value numbering table after we collapse them. llvm-svn: 61358	2008-12-23 00:49:51 +00:00
Bill Wendling	456e885382	Comment clean-ups. No functionality change. llvm-svn: 61354	2008-12-22 22:32:22 +00:00
Bill Wendling	e7f08e7250	Check that the instruction isn't in the value numbering scope. llvm-svn: 61353	2008-12-22 22:28:56 +00:00
Bill Wendling	86f01cb9f6	Simplification: Negate the operator== method instead of implementing a full operator!= method. llvm-svn: 61352	2008-12-22 22:16:31 +00:00
Bill Wendling	3c793441cb	Add verification that deleted instruction isn't hiding in the PHI map. llvm-svn: 61350	2008-12-22 22:14:07 +00:00
Bill Wendling	ebb6a543fa	Verify removed in a few more places. llvm-svn: 61349	2008-12-22 21:57:30 +00:00
Bill Wendling	6b18a3994b	Add verification functions to GVN which check to see that an instruction was truely deleted. These will be expanded with further checks of all of the data structures. llvm-svn: 61347	2008-12-22 21:36:08 +00:00
Nick Lewycky	10eb8e533f	Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2). llvm-svn: 61297	2008-12-21 00:19:21 +00:00
Nick Lewycky	4bc10c9e77	Remove redundant test for vector-nature. Scan the vector first to see whether our optz'n will apply to it, then build the replacement vector only if needed. llvm-svn: 61279	2008-12-20 16:48:00 +00:00
Evan Cheng	3b3de7c228	- CodeGenPrepare does not split loop back edges but it only knows about back edges of single block loops. It now does a DFS walk to find loop back edges. - Use SplitBlockPredecessors to factor out common predecessors of the critical edge destination. This is disabled for now due to some regressions. llvm-svn: 61248	2008-12-19 18:03:11 +00:00
Bill Wendling	070de29fcf	Didn't mean to commit this. llvm-svn: 61222	2008-12-18 22:19:50 +00:00
Bill Wendling	4c13e77d49	Re-XFAIL this test until debug stuff settles down. llvm-svn: 61219	2008-12-18 22:13:31 +00:00
Nick Lewycky	c3a70ade66	Oops! Left out a line. Simplifying the sdiv might allow further simplifications for our users. llvm-svn: 61196	2008-12-18 06:42:28 +00:00
Nick Lewycky	0f0e63fe73	Make all the vector elements positive in an srem of constant vector. llvm-svn: 61195	2008-12-18 06:31:11 +00:00
Dale Johannesen	3e5843b992	Revert previous patch, appears to break bootstrap. llvm-svn: 61181	2008-12-18 01:23:41 +00:00
Dale Johannesen	12d031b716	Fix the time regression I introduced in 464.h264ref with my last patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. (This patch does not handle all the cases where this can happen.) And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Everything above is exercised in CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is the same IR). llvm-svn: 61178	2008-12-18 00:57:22 +00:00
Chris Lattner	b6372933b5	reapply this hunk from Bill's reversion in r61169, it is conservative and safe and orthogonal from turning off load pre. llvm-svn: 61177	2008-12-18 00:51:32 +00:00
Bill Wendling	be4fb8a25f	Temporarily revert r61027. It was causing a bootstrap failure in "release" mode with everyone's favorite error messages: Comparing stages 2 and 3 warning: ./cc1-checksum.o differs warning: ./cc1plus-checksum.o differs Bootstrap comparison failure! ./c-decl.o differs ./cp/decl.o differs ./df-core.o differs ./gcc.o differs ./i386.o differs ./stor-layout.o differs ./tree-pretty-print.o differs ./tree.o differs make[2]: * [compare] Error 1 make[1]: * [stage3-bubble] Error 2 See PR3227. llvm-svn: 61169	2008-12-17 23:31:20 +00:00
Dale Johannesen	904ce8120d	Clarify that the scale factor from CheckForIVReuse can be negative. Keep track of whether all uses of an IV are outside the loop. Some cosmetics; no functional change. llvm-svn: 61109	2008-12-16 22:16:28 +00:00
Chris Lattner	0c68ae0603	Enable Load PRE. This teaches GVN to push partially redundant loads up the CFG when there is exactly one predecessor where the load is not available. This is designed to not increase code size but still eliminate partially redundant loads. This fires 1765 times on 403.gcc even though it doesn't do critical edge splitting yet (the most common reason for it to fail). llvm-svn: 61027	2008-12-15 05:28:29 +00:00
Owen Anderson	03aacbae90	Ifdef out some code that I didn't mean to enable by default yet. llvm-svn: 61024	2008-12-15 03:52:17 +00:00
Chris Lattner	69131fd872	make GVN try to rename inputs to the resultant replaced values, which cleans up the generated code a bit. This should have the added benefit of not randomly renaming functions/globals like my previous patch did. :) llvm-svn: 61023	2008-12-15 03:46:38 +00:00
Owen Anderson	bfe133e4ac	Add support for slow-path GVN with full phi construction for scalars. This is disabled for now, as it actually pessimizes code in the abscence of phi translation for load elimination. This slow down GVN a bit, by about 2% on 403.gcc. llvm-svn: 61021	2008-12-15 02:03:00 +00:00
Chris Lattner	f5eef9f6db	eliminate warning when asserts disabled. llvm-svn: 61012	2008-12-14 21:36:23 +00:00
Owen Anderson	e34c2399de	Generalize GVN's phi construciton routine to work for things other than loads. llvm-svn: 61009	2008-12-14 19:10:35 +00:00
Bill Wendling	293b9181e5	Temporarily revert r60973. It's inexplicably causing a failure when self-hosting LLVM: llvm[2]: Linking Release executable opt (without symbols) ... Undefined symbols: "llvm::APFloat::IEEEsingle", referenced from: __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) "llvm::APFloat::IEEEdouble", referenced from: __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o) __ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o) ld: symbol(s) not found This is in release mode. To replicate, compile llvm and llvm-gcc in optimized mode. Then build llvm, in optimized mode, with the newly created compiler. llvm-svn: 60977	2008-12-13 09:28:44 +00:00
Chris Lattner	1e29f7c97d	make RLE preserve the name of the load that it replaces. This is just a pretification of the IR. llvm-svn: 60973	2008-12-13 07:22:47 +00:00
Chris Lattner	fa9f99aa12	Teach GVN to invalidate some memdep information when it does an RAUW of a pointer. This allows is to catch more equivalencies. For example, the type_lists_compatible_p function used to require two iterations of the gvn pass (!) to delete its 18 redundant loads because the first pass would CSE all the addressing computation cruft, which would unblock the second memdep/gvn passes from recognizing them. This change allows memdep/gvn to catch all 18 when run just once on the function (as is typical :) instead of just 3. On all of 403.gcc, this bumps up the # reundandancies found from: 63 gvn - Number of instructions PRE'd 153991 gvn - Number of instructions deleted 50069 gvn - Number of loads deleted to: 63 gvn - Number of instructions PRE'd 154137 gvn - Number of instructions deleted 50185 gvn - Number of loads deleted +120 loads deleted isn't bad. llvm-svn: 60799	2008-12-09 22:06:23 +00:00
Chris Lattner	254314e6bc	rename getNonLocalDependency -> getNonLocalCallDependency, and remove pointer stuff from it, simplifying the code a bit. llvm-svn: 60783	2008-12-09 19:38:05 +00:00
Chris Lattner	b6fc4b8d92	Switch GVN::processNonLocalLoad to using the new MemDep::getNonLocalPointerDependency method. There are some open issues with this (missed optimizations) and plenty of future work, but this does allow GVN to eliminate slightly more loads (49246 vs 49033). Switching over now allows simplification of the other code path in memdep. llvm-svn: 60780	2008-12-09 19:25:07 +00:00
Chris Lattner	0a5a8d54a9	random cleanups, no functionality change. llvm-svn: 60779	2008-12-09 19:21:47 +00:00
Chris Lattner	56b20ffc5f	Fix a really subtle off-by-one bug that Duncan noticed with valgrind on test/CodeGen/Generic/2007-06-06-CriticalEdgeLandingPad. llvm-svn: 60739	2008-12-09 04:47:21 +00:00
Chris Lattner	e598370ae9	remove DebugIterations option. Despite the accusations, jump threading has been shown to only expose problems not have bugs itself. I'm sure it's completely bug free! ;-) llvm-svn: 60725	2008-12-08 22:44:07 +00:00
Devang Patel	2bb8a2f80f	Fix spelling. Thanks Duncan! llvm-svn: 60702	2008-12-08 17:07:24 +00:00
Devang Patel	1c469d36b0	Undo previous patch. llvm-svn: 60701	2008-12-08 17:02:37 +00:00
Chris Lattner	5df5b4cc2e	don't bother touching volatile stores, they will just return clobber on everything interesting anyway. llvm-svn: 60640	2008-12-07 00:25:15 +00:00
Chris Lattner	57e91eaf61	Reimplement the inner loop of DSE. It now uniformly uses getDependence(), doesn't do its own local caching, and is slightly more aggressive about free/store dse (see testcase). This eliminates the last external client of MemDep::getDependenceFrom(). llvm-svn: 60619	2008-12-06 00:53:22 +00:00
Dale Johannesen	9efd2ce55b	Make LoopStrengthReduce smarter about hoisting things out of loops when they can be subsumed into addressing modes. Change X86 addressing mode check to realize that some PIC references need an extra register. (I believe this is correct for Linux, if not, I'm sure someone will tell me.) llvm-svn: 60608	2008-12-05 21:47:27 +00:00
Chris Lattner	0e3d6337c6	Make a few major changes to memdep and its clients: 1. Merge the 'None' result into 'Normal', making loads and stores return their dependencies on allocations as Normal. 2. Split the 'Normal' result into 'Clobber' and 'Def' to distinguish between the cases when memdep knows the value is produced from when we just know if may be changed. 3. Move some of the logic for determining whether readonly calls are CSEs into memdep instead of it being in GVN. This still leaves verification that the arguments are hte same to GVN to let it know about value equivalences in different contexts. 4. Change memdep's call/call dependency analysis to use getModRefInfo(CallSite,CallSite) instead of doing something very weak. This only really matters for things like DSA, but someday maybe we'll have some other decent context sensitive analyses :) 5. This reimplements the guts of memdep to handle the new results. 6. This simplifies GVN significantly: a) readonly call CSE is slightly simpler b) I eliminated the "getDependencyFrom" chaining for load elimination and load CSE doesn't have to worry about volatile (they are always clobbers) anymore. c) GVN no longer does any 'lastLoad' caching, leaving it to memdep. 7. The logic in DSE is simplified a bit and sped up. A potentially unsafe case was eliminated. llvm-svn: 60607	2008-12-05 21:04:20 +00:00
Anton Korobeynikov	24600bf05a	Revert invalid r60393. It causes llvm-gcc bootstrap fails in release builds. See PR3160 for details llvm-svn: 60604	2008-12-05 19:38:49 +00:00
Chris Lattner	c100828026	Fix test/Transforms/GVN/pre-load.ll llvm-svn: 60594	2008-12-05 17:04:12 +00:00
Chris Lattner	d2a653af0c	Make IsValueFullyAvailableInBlock safe. llvm-svn: 60588	2008-12-05 07:49:08 +00:00
Devang Patel	c56423b500	Rewrite code that 1) filters loops and 2) calculates new loop bounds. This fixes many bugs. I will add more test cases in a separate check-in. Some day, the code that manipulates CFG and updates dom. info could use refactoring help. llvm-svn: 60554	2008-12-04 21:38:42 +00:00
Chris Lattner	8f723670ce	Start simplifying a switch that has a successor that is a switch. llvm-svn: 60534	2008-12-04 06:31:07 +00:00
Chris Lattner	75c2661d24	add a debugging option to help track down j-t problems. llvm-svn: 60514	2008-12-04 00:07:59 +00:00
Dale Johannesen	4e9e6ea604	Remove an unused field. llvm-svn: 60508	2008-12-03 22:43:56 +00:00
Dale Johannesen	f7a588b909	Fix a misspelled function name. llvm-svn: 60506	2008-12-03 20:56:12 +00:00
Chris Lattner	dc3f6f2c12	Factor some code into a new FoldSingleEntryPHINodes method. llvm-svn: 60501	2008-12-03 19:44:02 +00:00
Dale Johannesen	d49ceff6ba	Fix a really wrong comment. llvm-svn: 60494	2008-12-03 19:25:46 +00:00
Chris Lattner	595c7279bd	Teach jump threading some more simple tricks: 1) have it fold "br undef", which does occur with surprising frequency as jump threading iterates. 2) teach j-t to delete dead blocks. This removes the successor edges, reducing the in-edges of other blocks, allowing recursive simplification. 3) Fold things like: br COND, BBX, BBY BBX: br COND, BBZ, BBW which also happens because jump threading iterates. llvm-svn: 60470	2008-12-03 07:48:08 +00:00
Dale Johannesen	4d2ecb8f68	Minor rewrite per review feedback. llvm-svn: 60442	2008-12-02 21:17:11 +00:00
Dale Johannesen	70060013d2	Make the code do what the comment says it does. llvm-svn: 60431	2008-12-02 18:40:09 +00:00
Chris Lattner	1db9bbe802	Implement PRE of loads in the GVN pass with a pretty cheap and straight-forward implementation. This does not require any extra alias analysis queries beyond what we already do for non-local loads. Some programs really really like load PRE. For example, SPASS triggers this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc. The biggest limitation to the implementation is that it does not split critical edges. This is a huge killer on many programs and should be addressed after the initial patch is enabled by default. The implementation of this should incidentally speed up rejection of non-local loads because it avoids creating the repl densemap in cases when it won't be used for fully redundant loads. This is currently disabled by default. Before I turn this on, I need to fix a couple of miscompilations in the testsuite, look at compile time performance numbers, and look at perf impact. This is pretty close to ready though. llvm-svn: 60408	2008-12-02 08:16:11 +00:00
Bill Wendling	87beb9b909	Remove some errors that crept in. No functionality change. llvm-svn: 60403	2008-12-02 06:24:20 +00:00
Bill Wendling	790b4bf9a9	Merge two if-statements into one. llvm-svn: 60402	2008-12-02 06:22:04 +00:00
Bill Wendling	5635295266	More styalistic changes. No functionality change. llvm-svn: 60401	2008-12-02 06:18:11 +00:00
Bill Wendling	85de4b35ca	- Remove the buggy -X/C -> X/-C transform. This isn't valid when X isn't a constant. If X is a constant, then this is folded elsewhere. - Added a note to Target/README.txt to indicate that we'd like to implement this when we're able. llvm-svn: 60399	2008-12-02 05:12:47 +00:00
Bill Wendling	5369db5917	Improve comment. llvm-svn: 60398	2008-12-02 05:09:00 +00:00
Bill Wendling	21716dff5e	- Reduce nesting. - No need to do a swap on a canonicalized pattern. No functionality change. llvm-svn: 60397	2008-12-02 05:06:43 +00:00
Chris Lattner	ead1a61b47	some random comment improvements. llvm-svn: 60395	2008-12-02 04:52:26 +00:00
Owen Anderson	d930420ccf	Fix an issue that Chris noticed, where local PRE was not properly instantiating a new value numbering set after splitting a critical edge. This increases the number of instances of PRE on 403.gcc from ~60 to ~570. llvm-svn: 60393	2008-12-02 04:09:22 +00:00
Dale Johannesen	069a4eee55	Consider only references to an IV within the loop when figuring out the base of the IV. This produces better code in the example. (Addresses use (IV) instead of (BASE,IV) - a significant improvement on low-register machines like x86). llvm-svn: 60374	2008-12-01 22:00:01 +00:00
Bill Wendling	6f71bce4cf	Don't rebuild RHSNeg. Just use the one that's already there. llvm-svn: 60370	2008-12-01 21:06:30 +00:00
Bill Wendling	84f6f2539f	Document what this check is doing. Also, no need to cast to ConstantInt. llvm-svn: 60369	2008-12-01 21:03:43 +00:00
Bill Wendling	e6c87a4952	Use a simple comparison. Overflow on integer negation can only occur when the integer is "minint". llvm-svn: 60366	2008-12-01 19:46:27 +00:00
Bill Wendling	47f733e4ea	Generalize the FoldOrWithConstant method to fold for any two constants which don't have overlapping bits. llvm-svn: 60344	2008-12-01 08:32:40 +00:00
Bill Wendling	22e761b302	Reduce copy-and-paste code by splitting out the code into its own function. llvm-svn: 60343	2008-12-01 08:23:25 +00:00
Bill Wendling	582fe6b0ca	Use m_Specific() instead of double matching. llvm-svn: 60341	2008-12-01 08:09:47 +00:00
Bill Wendling	4eecfb655b	Move pattern check outside of the if-then statement. This prevents us from fiddling with constants unless we have to. llvm-svn: 60340	2008-12-01 07:47:02 +00:00
Chris Lattner	6f5bf6a718	Rename some variables, only increment BI once at the start of the loop instead of throughout it. llvm-svn: 60339	2008-12-01 07:35:54 +00:00
Chris Lattner	f00aae4968	pull the predMap densemap out of the inner loop of performPRE, so that it isn't reallocated all the time. This is a tiny speedup for GVN: 3.90->3.88s llvm-svn: 60338	2008-12-01 07:29:03 +00:00
Chris Lattner	2b07d3ccde	switch a couple more calls to use array_pod_sort. llvm-svn: 60337	2008-12-01 06:52:57 +00:00
Chris Lattner	2c2dd15a85	Introduce a new array_pod_sort function and switch LSR to use it instead of std::sort. This shrinks the release-asserts LSR.o file by 1100 bytes of code on my system. We should start using array_pod_sort where possible. llvm-svn: 60335	2008-12-01 06:49:59 +00:00
Chris Lattner	2aebea5735	Eliminate use of setvector for the DeadInsts set, just use a smallvector. This is a lot cheaper and conceptually simpler. llvm-svn: 60332	2008-12-01 06:27:41 +00:00
Chris Lattner	4da78e3774	DeleteTriviallyDeadInstructions is always passed the DeadInsts ivar, just use it directly. llvm-svn: 60330	2008-12-01 06:14:28 +00:00
Chris Lattner	a68a5a4784	simplify DeleteTriviallyDeadInstructions again, unlike my previous buggy rewrite, this notifies ScalarEvolution of a pending instruction about to be removed and then erases it, instead of erasing it then notifying. llvm-svn: 60329	2008-12-01 06:11:32 +00:00
Chris Lattner	9e6b243428	simplify these patterns using m_Specific. No need to grep for xor in testcase (or is a substring). llvm-svn: 60328	2008-12-01 05:16:26 +00:00
Chris Lattner	88a1f0213d	Teach jump threading to clean up after itself, DCE and constfolding the new instructions it simplifies. Because we're threading jumps on edges with constants coming in from PHI's, we inherently are exposing a lot more constants to the new block. Folding them and deleting dead conditions allows the cost model in jump threading to be more accurate as it iterates. llvm-svn: 60327	2008-12-01 04:48:07 +00:00
Chris Lattner	084b3a47d3	Change instcombine to use FoldPHIArgGEPIntoPHI to fold two operand PHIs instead of using FoldPHIArgBinOpIntoPHI. In addition to being more obvious, this also fixes a problem where instcombine wouldn't merge two phis that had different variable indices. This prevented instcombine from factoring big chunks of code in 403.gcc. For example: insn_cuid.exit: - %tmp336 = load i32** @uid_cuid, align 4 - %tmp337 = getelementptr %struct.rtx_def* %insn_addr.0.ph.i, i32 0, i32 3 - %tmp338 = bitcast [1 x %struct.rtunion]* %tmp337 to i32* - %tmp339 = load i32* %tmp338, align 4 - %tmp340 = getelementptr i32* %tmp336, i32 %tmp339 br label %bb62 bb61: - %tmp341 = load i32** @uid_cuid, align 4 - %tmp342 = getelementptr %struct.rtx_def* %insn, i32 0, i32 3 - %tmp343 = bitcast [1 x %struct.rtunion]* %tmp342 to i32* - %tmp344 = load i32* %tmp343, align 4 - %tmp345 = getelementptr i32* %tmp341, i32 %tmp344 br label %bb62 bb62: - %iftmp.62.0.in = phi i32* [ %tmp345, %bb61 ], [ %tmp340, %insn_cuid.exit ] + %insn.pn2 = phi %struct.rtx_def* [ %insn, %bb61 ], [ %insn_addr.0.ph.i, %insn_cuid.exit ] + %tmp344.pn.in.in = getelementptr %struct.rtx_def* %insn.pn2, i32 0, i32 3 + %tmp344.pn.in = bitcast [1 x %struct.rtunion]* %tmp344.pn.in.in to i32* + %tmp341.pn = load i32** @uid_cuid + %tmp344.pn = load i32* %tmp344.pn.in + %iftmp.62.0.in = getelementptr i32* %tmp341.pn, i32 %tmp344.pn %iftmp.62.0 = load i32* %iftmp.62.0.in llvm-svn: 60325	2008-12-01 03:42:51 +00:00
Chris Lattner	9d02a70a7d	Teach inst combine to merge GEPs through PHIs. This is really important because it is sinking the loads using the GEPs, but not the GEPs themselves. This triggers 647 times on 403.gcc and makes the .s file much much nicer. For example before: je LBB1_87 ## bb78 LBB1_62: ## bb77 leal 84(%esi), %eax LBB1_63: ## bb79 movl (%eax), %eax ... LBB1_87: ## bb78 movl $0, 4(%esp) movl %esi, (%esp) call L_make_decl_rtl$stub jmp LBB1_62 ## bb77 after: jne LBB1_63 ## bb79 LBB1_62: ## bb78 movl $0, 4(%esp) movl %esi, (%esp) call L_make_decl_rtl$stub LBB1_63: ## bb79 movl 84(%esi), %eax The input code was (and the GEPs are merged and the PHI is now eliminated by instcombine): br i1 %tmp233, label %bb78, label %bb77 bb77: %tmp234 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22 br label %bb79 bb78: call void @make_decl_rtl(%struct.tree_node* %t_addr.3, i8* null) nounwind %tmp235 = getelementptr %struct.tree_node* %t_addr.3, i32 0, i32 0, i32 22 br label %bb79 bb79: %iftmp.12.0.in = phi %struct.rtx_def [ %tmp235, %bb78 ], [ %tmp234, %bb77 ] %iftmp.12.0 = load %struct.rtx_def %iftmp.12.0.in llvm-svn: 60322	2008-12-01 02:34:36 +00:00
Chris Lattner	9ce8995d24	Make GVN be more intelligent about redundant load elimination: when finding dependent load/stores, realize that they are the same if aliasing claims must alias instead of relying on the pointers to be exactly equal. This makes load elimination more aggressive. For example, on 403.gcc, we had: < 68 gvn - Number of instructions PRE'd < 152718 gvn - Number of instructions deleted < 49699 gvn - Number of loads deleted < 6153 memdep - Number of dirty cached non-local responses < 169336 memdep - Number of fully cached non-local responses < 162428 memdep - Number of uncached non-local responses now we have: > 64 gvn - Number of instructions PRE'd > 153623 gvn - Number of instructions deleted > 49856 gvn - Number of loads deleted > 5022 memdep - Number of dirty cached non-local responses > 159030 memdep - Number of fully cached non-local responses > 162443 memdep - Number of uncached non-local responses That's an extra 157 loads deleted and extra 905 other instructions nuked. This slows down GVN very slightly, from 3.91 to 3.96s. llvm-svn: 60314	2008-12-01 01:31:36 +00:00
Chris Lattner	7e61dafc95	Reimplement the non-local dependency data structure in terms of a sorted vector instead of a densemap. This shrinks the memory usage of this thing substantially (the high water mark) as well as making operations like scanning it faster. This speeds up memdep slightly, gvn goes from 3.9376 to 3.9118s on 403.gcc This also splits out the statistics for the cached non-local case to differentiate between the dirty and clean cached case. Here's the stats for 403.gcc: 6153 memdep - Number of dirty cached non-local responses 169336 memdep - Number of fully cached non-local responses 162428 memdep - Number of uncached non-local responses yay for caching :) llvm-svn: 60313	2008-12-01 01:15:42 +00:00
Bill Wendling	5b902c5b1e	Implement ((A\|B)&1)\|(B&-2) -> (A&1) \| B transformation. This also takes care of permutations of this pattern. llvm-svn: 60312	2008-12-01 01:07:11 +00:00
Chris Lattner	8541edec44	Cache analyses in ivars and add some useful DEBUG output. This speeds up GVN from 4.0386s to 3.9376s. llvm-svn: 60310	2008-12-01 00:40:32 +00:00
Chris Lattner	80c7d81e81	improve indentation, do cheap checks before expensive ones, remove some fixme's. This speeds up GVN very slightly on 403.gcc (4.06->4.03s) llvm-svn: 60309	2008-11-30 23:39:23 +00:00
Eli Friedman	11c15a5de7	Minor cleanup: use getTrue and getFalse where appropriate. No functional change. llvm-svn: 60307	2008-11-30 22:48:49 +00:00
Eli Friedman	55e4becba9	Some minor cleanups to instcombine; no functionality change. Note that the FoldOpIntoPhi call is dead because it's impossible for the first operand of a subtraction to be both a ConstantInt and a PHINode. llvm-svn: 60306	2008-11-30 21:09:11 +00:00
Bill Wendling	de89bc275c	Add instruction combining for ((A&~B)\|(~A&B)) -> A^B and all permutations. llvm-svn: 60291	2008-11-30 13:52:49 +00:00
Bill Wendling	9eef421e12	Implement (A&((~A)\|B)) -> A&B transformation in the instruction combiner. This takes care of all permutations of this pattern. llvm-svn: 60290	2008-11-30 13:08:13 +00:00
Bill Wendling	2fe3229824	Forgot one remaining call to getSExtValue(). llvm-svn: 60289	2008-11-30 12:41:09 +00:00
Bill Wendling	2d2e7861b5	getSExtValue() doesn't work for ConstantInts with bitwidth > 64 bits. Use all APInt calls instead. This fixes PR3144. llvm-svn: 60288	2008-11-30 12:38:24 +00:00
Eli Friedman	09bc610945	Optimize memmove and memset into the LLVM builtins. Note that these only show up in code from front-ends besides llvm-gcc, like clang. llvm-svn: 60287	2008-11-30 08:32:11 +00:00
Bill Wendling	7abf352f44	Don't make TwoToExp signed by default. llvm-svn: 60279	2008-11-30 05:29:33 +00:00
Bill Wendling	af200e9237	From Hacker's Delight: "For signed integers, the determination of overflow of xy is not so simple. If x and y have the same sign, then overflow occurs iff xy > 231 - 1. If they have opposite signs, then overflow occurs iff xy < -2*31." In this case, x == -1. llvm-svn: 60278	2008-11-30 05:01:05 +00:00
Bill Wendling	70635adea3	Instcombine was illegally transforming -X/C into X/-C when either X or C overflowed on negation. This commit checks to make sure that neithe C nor X overflows. This requires that the RHS of X (a subtract instruction) be a constant integer. llvm-svn: 60275	2008-11-30 03:42:12 +00:00
Chris Lattner	3ff6d01586	Fix a fixme by making memdep's handling of allocations more logical. If we see that a load depends on the allocation of its memory with no intervening stores, we now return a 'None' depedency instead of "Normal". This tweaks GVN to do its optimization with the new result. llvm-svn: 60267	2008-11-30 01:39:32 +00:00
Chris Lattner	63bd586d35	Eliminate the dropInstruction method, which is not needed any more. Fix a subtle iterator invalidation bug I introduced in the last commit. llvm-svn: 60258	2008-11-29 23:30:39 +00:00
Chris Lattner	1c6b62eb4d	Change MemDep::getNonLocalDependency to return its results as a smallvector instead of a DenseMap. This speeds up GVN by 5% on 403.gcc. llvm-svn: 60255	2008-11-29 21:33:22 +00:00
Chris Lattner	f280b0c729	reimplement getNonLocalDependency with a simpler worklist formulation that is faster and doesn't require nonLazyHelper. Much less code. llvm-svn: 60253	2008-11-29 21:22:42 +00:00
Chris Lattner	8c5ff516c6	Fix a thinko that manifested as a crash on clamav last night. llvm-svn: 60251	2008-11-29 20:29:04 +00:00
Chris Lattner	51ba8d0630	Split getDependency into getDependency and getDependencyFrom, the former does caching, the later doesn't. This dramatically simplifies the logic in getDependency and getDependencyFrom. llvm-svn: 60234	2008-11-29 03:47:00 +00:00
Bill Wendling	469e3aa696	Temporarily revert r60195. It's causing an optimized bootstrap of llvm-gcc to fail. llvm-svn: 60233	2008-11-29 03:43:04 +00:00
Chris Lattner	7f9c8a0f05	Introduce and use a new MemDepResult class to hold the results of a memdep query. This makes it crystal clear what cases can escape from MemDep that the clients have to handle. This also gives the clients a nice simplified interface to it that is easy to poke at. This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType private, yay. llvm-svn: 60231	2008-11-29 02:29:27 +00:00
Chris Lattner	de04e1173a	Reimplement the internal abstraction used by MemDep in terms of a pointer/int pair instead of a manually bitmangled pointer. This forces clients to think a little more about checking the appropriate pieces and will be useful for internal implementation improvements later. I'm not particularly happy with this. After going through this I don't think that the clients of memdep should be exposed to the internal type at all. I'll fix this in a subsequent commit. This has no functionality change. llvm-svn: 60230	2008-11-29 01:43:36 +00:00
Chris Lattner	f3f6a801cc	don't revisit instructions off the beginning of the block. llvm-svn: 60221	2008-11-28 22:50:08 +00:00
Chris Lattner	f2a8ba4cf0	simplify some code, remove escaped newline. llvm-svn: 60213	2008-11-28 21:29:52 +00:00
Chris Lattner	8a172daa55	don't call MergeBasicBlockIntoOnlyPred on a block whose only predecessor is itself. This doesn't make sense, and this is a dead infinite loop anyway. llvm-svn: 60210	2008-11-28 19:54:49 +00:00
Chris Lattner	1adb6759ef	rewrite a big chunk of how DSE does recursive dead operand elimination to use more modern infrastructure. Also do a bunch of small cleanups. llvm-svn: 60201	2008-11-28 00:27:14 +00:00
Chris Lattner	c077a2a535	Simplify LoopStrengthReduce::DeleteTriviallyDeadInstructions by making it use RecursivelyDeleteTriviallyDeadInstructions to do the heavy lifting. llvm-svn: 60195	2008-11-27 23:23:35 +00:00
Chris Lattner	96e2dbe008	use continue to reduce indentation llvm-svn: 60192	2008-11-27 23:00:20 +00:00
Chris Lattner	c6c481cdfc	remove doConstantPropagation and dceInstruction, they are just wrappers around the interesting code and use an obscure iterator abstraction that dates back many many years. Move EraseDeadInstructions to Transforms/Utils and name it RecursivelyDeleteTriviallyDeadInstructions. llvm-svn: 60191	2008-11-27 22:57:53 +00:00
Chris Lattner	5ef9ebf787	simplify code. llvm-svn: 60190	2008-11-27 22:56:14 +00:00
Chris Lattner	c92fa42ddd	simplify this logic. llvm-svn: 60189	2008-11-27 22:46:09 +00:00
Nick Lewycky	4ab50b93c8	Chris prefers icmp/select over udiv! llvm-svn: 60187	2008-11-27 22:41:10 +00:00
Nick Lewycky	69941fd0a0	Add a couple of missed optimizations on integer vectors. Multiply and divide by 1, as well as multiply by -1. llvm-svn: 60182	2008-11-27 20:21:08 +00:00
Chris Lattner	4059f43b74	defensive patch: if CGP is merging a block with the entry block, make sure it ends up being the entry block. llvm-svn: 60180	2008-11-27 19:29:14 +00:00
Chris Lattner	5dfbfcd80d	Fix PR3138: if we merge the entry block into another block, make sure to move the other block back up into the entry position! llvm-svn: 60179	2008-11-27 19:25:19 +00:00
Chris Lattner	e0d019def6	switch InstCombine::visitLoadInst to use FindAvailableLoadedValue llvm-svn: 60169	2008-11-27 08:56:30 +00:00
Chris Lattner	72f16e70f0	move FindAvailableLoadedValue from JumpThreading to Transforms/Utils. llvm-svn: 60166	2008-11-27 08:10:05 +00:00
Chris Lattner	206250284d	Use the new MergeBasicBlockIntoOnlyPred function. llvm-svn: 60163	2008-11-27 07:54:12 +00:00
Chris Lattner	99d6809ac1	move MergeBasicBlockIntoOnlyPred to Transforms/Utils. llvm-svn: 60162	2008-11-27 07:43:12 +00:00
Chris Lattner	240051aace	rename ThreadBlock to ProcessBlock, since it does other things than just simple threading. llvm-svn: 60157	2008-11-27 07:20:04 +00:00
Chris Lattner	98d89d1b1b	Make jump threading substantially more powerful, in the following ways: 1. Make it fold blocks separated by an unconditional branch. This enables jump threading to see a broader scope. 2. Make jump threading able to eliminate locally redundant loads when they feed the branch condition of a block. This frequently occurs due to reg2mem running. 3. Make jump threading able to eliminate partially redundant loads when they feed the branch condition of a block. This is common in code with lots of loads and stores like C++ code and 255.vortex. This implements thread-loads.ll and rdar://6402033. Per the fixme's, several pieces of this should be moved into Transforms/Utils. llvm-svn: 60148	2008-11-27 05:07:53 +00:00
Chris Lattner	397a11ccd8	Turn on my codegen prepare heuristic by default. It doesn't affect performance in most cases on the Grawp tester, but does speed some things up (like shootout/hash by 15%). This also doesn't impact compile time in a noticable way on the Grawp tester. It also, of course, gets the testcase it was designed for right :) llvm-svn: 60120	2008-11-26 22:16:44 +00:00
Chris Lattner	fef04acc50	teach the new heuristic how to handle inline asm. llvm-svn: 60088	2008-11-26 04:59:11 +00:00
Chris Lattner	6d71b7fb95	Improve ValueAlreadyLiveAtInst with a cheap and dirty, but effective heuristic: the value is already live at the new memory operation if it is used by some other instruction in the memop's block. This is cheap and simple to compute (moreso than full liveness). This improves the new heuristic even more. For example, it cuts two out of three new instructions out of 255.vortex:DbmFileInGrpHdr, which is one of the functions that the heuristic regressed. This overall eliminates another 40 instructions from 403.gcc and visibly reduces register pressure in 255.vortex (though this only actually ends up saving the 2 instructions from the whole program). llvm-svn: 60084	2008-11-26 03:20:37 +00:00
Chris Lattner	e34fe2c52d	Start rewroking a subpiece of the profitability heuristic to be phrased in terms of liveness instead of as a horrible hack. :) In pratice, this doesn't change the generated code for either 255.vortex or 403.gcc, but it could cause minor code changes in theory. This is framework for coming changes. llvm-svn: 60082	2008-11-26 03:02:41 +00:00
Chris Lattner	383a797f42	add a comment, make save/restore logic more obvious. llvm-svn: 60076	2008-11-26 02:11:11 +00:00
Chris Lattner	eb3e4fb6fb	This adds in some code (currently disabled unless you pass -enable-smarter-addr-folding to llc) that gives CGP a better cost model for when to sink computations into addressing modes. The basic observation is that sinking increases register pressure when part of the addr computation has to be available for other reasons, such as having a use that is a non-memory operation. In cases where it works, it can substantially reduce register pressure. This code is currently an overall win on 403.gcc and 255.vortex (the two things I've been looking at), but there are several things I want to do before enabling it by default: 1. This isn't doing any caching of results, so it is much slower than it could be. It currently slows down release-asserts llc by 1.7% on 176.gcc: 27.12s -> 27.60s. 2. This doesn't think about inline asm memory operands yet. 3. The cost model botches the case when the needed value is live across the computation for other reasons. I'll continue poking at this, and eventually turn it on as llcbeta. llvm-svn: 60074	2008-11-26 02:00:14 +00:00
Evan Cheng	496b042e20	Revert r60042. IndVarSimplify should check if APFloat is PPCDoubleDouble first before trying to convert it to an integer. llvm-svn: 60072	2008-11-26 01:11:57 +00:00
Chris Lattner	a9ab165b08	Teach CodeGenPrepare to look through Bitcast instructions when attempting to optimize addressing modes. This allows us to optimize things like isel-sink2.ll into: movl 4(%esp), %eax cmpb $0, 4(%eax) jne LBB1_2 ## F LBB1_1: ## TB movl $4, %eax ret LBB1_2: ## F movzbl 7(%eax), %eax ret instead of: _test: movl 4(%esp), %eax cmpb $0, 4(%eax) leal 4(%eax), %eax jne LBB1_2 ## F LBB1_1: ## TB movl $4, %eax ret LBB1_2: ## F movzbl 3(%eax), %eax ret This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s. Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best, I doubt it is really testing what it thinks it is. llvm-svn: 60068	2008-11-26 00:26:16 +00:00
Chris Lattner	f3e95505c5	Teach MatchScaledValue to handle Scales by 1 with MatchAddr (which can recursively match things) and scales by 0 by ignoring them. This triggers once in 403.gcc, saving 1 (!!!!) instruction in the whole huge app. llvm-svn: 60013	2008-11-25 07:25:26 +00:00
Chris Lattner	728f90220a	significantly refactor all the addressing mode matching logic into a new AddressingModeMatcher class. This makes it easier to reason about and reduces passing around of stuff, but has no functionality change. llvm-svn: 60012	2008-11-25 07:09:13 +00:00
Chris Lattner	58f49d2916	refactor all the constantexpr/instruction handling code out into a new FindMaximalLegalAddressingModeForOperation helper method. llvm-svn: 60011	2008-11-25 05:15:49 +00:00
Chris Lattner	a3fbff15b9	another minor tweak llvm-svn: 60010	2008-11-25 04:47:41 +00:00
Chris Lattner	d616ef5683	minor cleanups no functionality change. llvm-svn: 60009	2008-11-25 04:42:10 +00:00
Chris Lattner	6416a6b7a0	rearrange and tidy some code, no functionality change. llvm-svn: 59990	2008-11-24 22:44:16 +00:00
Chris Lattner	d917c8c8fe	minor cleanups to debug code, no functionality change. llvm-svn: 59989	2008-11-24 22:40:05 +00:00
Chris Lattner	d78894197a	reenable the right part of the code. llvm-svn: 59985	2008-11-24 21:26:21 +00:00
Chris Lattner	992a541002	revert an accidental commit, this fixes the regression on test/CodeGen/X86/isel-sink.ll llvm-svn: 59976	2008-11-24 19:40:34 +00:00
Chris Lattner	53d6a07869	Fix 3113: If we have a dead cyclic PHI, replace the whole thing with an undef. llvm-svn: 59972	2008-11-24 19:25:36 +00:00
Devang Patel	702f45df58	Fix build failure. llvm-svn: 59844	2008-11-21 21:00:20 +00:00
Devang Patel	cb181bb203	Silence unused variable warnings. llvm-svn: 59841	2008-11-21 20:00:59 +00:00
Chris Lattner	dd7083452f	reapply Sanjiv's patch to genericize memcpy/memset/memmove to take an arbitrary integer width for the count. llvm-svn: 59823	2008-11-21 16:42:48 +00:00
Bill Wendling	4bce2bff88	Revert r59802. It was breaking the build of llvm-gcc: g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o ../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value, llvm::Value, llvm::Value, unsigned int)': ../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value, llvm::Value, llvm::Value, unsigned int)': ../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value, llvm::Value, llvm::Value, unsigned int)': ../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic' ../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic' make[3]: [llvm-convert.o] Error 1 make[3]: * Waiting for unfinished jobs.... rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod make[2]: * [all-stage1-gcc] Error 2 make[1]: * [stage1-bubble] Error 2 make: *** [all] Error 2 llvm-svn: 59809	2008-11-21 09:09:41 +00:00
Sanjiv Gupta	09a203765a	Make mem[cpy,move,set] intrinsics overloaded. llvm-svn: 59802	2008-11-21 07:49:09 +00:00
Nick Lewycky	07d726ec4d	Optimize (x/y)*y into x-(x%y) in general. Div and rem are about the same, and a subtract is cheaper than a multiply. This generalizes an existing transform. llvm-svn: 59800	2008-11-21 07:33:58 +00:00
Devang Patel	45f1ae028e	Fix unused variable warnings. llvm-svn: 59778	2008-11-21 01:52:59 +00:00
Devang Patel	827bced2b1	Let instcombiner remove redundant dbg intrinsics. llvm-svn: 59658	2008-11-19 18:59:41 +00:00
Devang Patel	7ed6c5317c	If there are two consecutive llvm.dbg.stoppoint calls then it is likely that the optimizer deleted code in between these two intrinsics. Keep only the last llvm.dbg.stoppoint in this case. llvm-svn: 59657	2008-11-19 18:56:50 +00:00
Bill Wendling	cf194e9a27	Cast to remove warning about comparing signed and unsigned. llvm-svn: 59518	2008-11-18 10:57:27 +00:00
Devang Patel	f1e9329209	Give SIToFPInst preference over UIToFPInst because it is faster on platforms that are widely used. llvm-svn: 59476	2008-11-18 00:40:02 +00:00
Devang Patel	180afd2c55	While handling floating point IVs lift restrictions on initial value and increment value. llvm-svn: 59471	2008-11-17 23:27:13 +00:00
Devang Patel	aa3d68d301	Handle floating point ivs during doInitialization(). llvm-svn: 59466	2008-11-17 21:32:02 +00:00
Chris Lattner	7917b43a28	eliminate some std::set's. llvm-svn: 59409	2008-11-16 07:17:51 +00:00
Chris Lattner	44152742a0	simplify a bunch more instcombines to use m_Specific etc. llvm-svn: 59403	2008-11-16 05:38:51 +00:00
Chris Lattner	d397fef50d	factor the code for simplifying (icmp)\|(icmp) into its own function. llvm-svn: 59402	2008-11-16 05:20:07 +00:00
Chris Lattner	909b969b18	do some computation with apints instead of ConstantInts. llvm-svn: 59401	2008-11-16 05:14:43 +00:00
Chris Lattner	feaea9bdf7	merge a check into a place where it is simpler. llvm-svn: 59400	2008-11-16 05:10:52 +00:00
Chris Lattner	269cbd5770	factor a whole bunch of code out into a helper function. llvm-svn: 59398	2008-11-16 05:06:21 +00:00
Chris Lattner	b37b6e7e96	simplify the conditions on two gigantic if's, decreasing indentation a bit. Next step is to factor out into their own helper functions. llvm-svn: 59397	2008-11-16 04:55:20 +00:00
Chris Lattner	f1be285134	simplify some instcombine matches by using m_Specific llvm-svn: 59395	2008-11-16 04:46:19 +00:00
Chris Lattner	fae5e33111	Use new m_SelectCst template to eliminate macros. llvm-svn: 59392	2008-11-16 04:33:38 +00:00
Chris Lattner	569d78cbb5	simplify code. llvm-svn: 59390	2008-11-16 04:26:55 +00:00
Chris Lattner	c3f3b059d0	Handle the case where there is no "not". It is possible it got folded into the select. llvm-svn: 59389	2008-11-16 04:25:26 +00:00
Chris Lattner	5f6d9a313b	factor a bunch of copy/paste code out into a helper function. Eliminate the cases checking for cond?0:-1, since that is already handled by commutative checking. llvm-svn: 59388	2008-11-16 04:24:12 +00:00
Chris Lattner	68d2da2a19	rearrange some code, no functionality change. llvm-svn: 59381	2008-11-16 03:56:24 +00:00
Chris Lattner	e02c7c7ad2	if we're going to use a macro, use it maximally. no functionality change. llvm-svn: 59380	2008-11-16 03:54:57 +00:00
Devang Patel	53b39b5467	Cleanup debug info. assocated with deleted instructions. llvm-svn: 59012	2008-11-11 00:54:10 +00:00
Devang Patel	d0ce981372	If the sign of exit condition and split condition does not match then do not split loop index. llvm-svn: 58995	2008-11-10 19:48:34 +00:00
Bill Wendling	7ef7314d1a	Third time's a charm. The previous patches didn't match correctly. Also, we need to make sure that the conditional is the same before doing the transformation. llvm-svn: 58978	2008-11-10 06:59:06 +00:00
Mon P Wang	25f0106fd9	Added support for the following definition of shufflevector <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> llvm-svn: 58964	2008-11-10 04:46:22 +00:00
Bill Wendling	4fb13c051d	Correction for the last patch. Should match the conditional in the first part of the select match, not the select instruction itself. llvm-svn: 58947	2008-11-09 23:37:53 +00:00
Bill Wendling	1579287550	The method of doing the matching with a 'select' instruction was wrong. The original code was matching like this: if (match(A, m_Not(m_Value(B)))) B was already matched as a 'select' instruction. However, this isn't matching what we think it's matching. It would match B as a 'Value', so basically anything would match to it. In this case, a Constant matched. B was replaced with a constant representation. And then the wrong value would be used in the SelectInst::Create statement, causing a crash. After thinking on this for a moment, and after Nick L. told me how the pattern matching stuff was supposed to work, the solution was to match NOT an m_Value, but an m_Select. llvm-svn: 58946	2008-11-09 23:17:42 +00:00
Nuno Lopes	2e42927e7c	fix leakage of ValueNumbering llvm-svn: 58933	2008-11-09 12:45:23 +00:00
Bill Wendling	3f547be28f	If the LHS of the FCMP is coming from a UIToFP instruction, then we don't want to generate signed ICMP instructions to replace the FCMP. This would violate the following: define i1 @test1(i32 %val) { %1 = uitofp i32 %val to double %2 = fcmp ole double %1, 0.000000e+00 ret i1 %2 } would be transformed into: define i1 @test1(i32 %val) { %1 = icmp slt i33 %val, 1 ret i1 %1 } which is obviously wrong. This patch modifes InstCombiner::FoldFCmp_IntToFP_Cst to handle when the LHS comes from UIToFP. llvm-svn: 58929	2008-11-09 04:26:50 +00:00
Mon P Wang	5ca2ec65bd	Fixed scalarizing an extract subvector and prevent an infinite loop when simplify a vector. llvm-svn: 58820	2008-11-06 22:52:21 +00:00
Oscar Fuentes	076e048cf7	CMake: updated list of source files. llvm-svn: 58736	2008-11-05 00:11:22 +00:00
Dan Gohman	8cdea717a3	Add a new pass to simplify specific half_powr function calls. This is a specialized pass that it not likely to be generally useful. llvm-svn: 58732	2008-11-04 23:41:45 +00:00
Dale Johannesen	0a7b4f5800	Allow SROA of vectors. Removing this caused a huge performance regression in something we care about. This may not be final fix. llvm-svn: 58718	2008-11-04 20:54:03 +00:00
Devang Patel	fe57d109b6	Ignore conditions that are outside the loop. llvm-svn: 58631	2008-11-03 19:38:07 +00:00
Devang Patel	c1631db93b	Turn floating point IVs into integer IVs where possible. This allows SCEV users to effectively calculate trip count. LSR later on transforms back integer IVs to floating point IVs later on to avoid int-to-float casts inside the loop. llvm-svn: 58625	2008-11-03 18:32:19 +00:00
Nick Lewycky	d73806a9cc	Replace explicit loop with utility function. llvm-svn: 58593	2008-11-03 03:49:14 +00:00
Nick Lewycky	8d8acf327b	Fix demanded bits analysis with srem by negative number. Based on a patch by Richard Osborne. llvm-svn: 58555	2008-11-02 02:41:50 +00:00
Dan Gohman	83eea0b17f	Fix this recently moved code to use the correct type. CI is now a ConstantInt, and SI is the original cast instruction. This fixes PR2996. llvm-svn: 58549	2008-11-02 00:17:33 +00:00
Dan Gohman	13cbcf1c18	Canonicalize sext(i1) to i1?-1:0, and update various instcombine optimizations accordingly. llvm-svn: 58457	2008-10-30 20:40:10 +00:00
Dan Gohman	2c34c130bf	(A & sext(C)) \| (B & ~sext(C) -> C ? A : B llvm-svn: 58351	2008-10-28 22:38:57 +00:00
Nick Lewycky	f6e4dca67e	Add value range analyzing of Add and Sub. Understand that mul %x, 1 = %x. llvm-svn: 58069	2008-10-24 04:00:26 +00:00
Daniel Dunbar	7f39e2d85a	Change createPass factory functions to return Pass instead of LoopPass*. - Although less precise, this means they can be used in clients without RTTI (who would otherwise need to include LoopPass.h, which eventually includes things using dynamic_cast). This was the simplest solution that presented itself, but I am happy to use a better one if available. llvm-svn: 58010	2008-10-22 23:32:42 +00:00
Dan Gohman	215742a966	Use 0 instead of false to return a null pointer. llvm-svn: 57660	2008-10-17 00:56:52 +00:00
Dan Gohman	bc0278400c	Teach instcombine's visitLoad to scan back several instructions to find opportunities for store-to-load forwarding or load CSE, in the same way that visitStore scans back to do DSE. Also, define a new helper function for testing whether the addresses of two memory accesses are known to have the same value, and use it in both visitStore and visitLoad. These two changes allow instcombine to eliminate loads in code produced by front-ends that frequently emit obviously redundant addressing for memory references. llvm-svn: 57608	2008-10-15 23:19:35 +00:00
Evan Cheng	d885f6e139	Combine (fcmp cc0 x, y) \| (fcmp cc1 x, y) into a single fcmp when possible. llvm-svn: 57515	2008-10-14 18:44:08 +00:00
Evan Cheng	ce70752b11	- Somehow I forgot about one / une. - Renumber fcmp predicates to match their icmp counterparts. - Try swapping operands to expose more optimization opportunities. llvm-svn: 57513	2008-10-14 18:13:38 +00:00
Evan Cheng	67786cce66	Optimize anding of two fcmp into a single fcmp if the operands are the same. e.g. uno && ueq -> ueq ord && olt -> olt ord && ueq -> oeq llvm-svn: 57507	2008-10-14 17:15:11 +00:00
Matthijs Kooijman	f7d3cb5435	Make InstructionCombining::getBitCastOperand() recognize GEP instructions and constant expression with all zero indices as being the same as a bitcast. llvm-svn: 57442	2008-10-13 15:17:01 +00:00
Chris Lattner	da435910e8	Fix PR2697 by rewriting the '(X / pos) op neg' logic. This also changes a couple other cases for clarity, but shouldn't affect correctness. Patch by Eli Friedman! llvm-svn: 57387	2008-10-11 22:55:00 +00:00
Devang Patel	647a1e532b	Check loop exit predicate properly while eliminating one iteration loop. This patch fixes PR 2869 llvm-svn: 57369	2008-10-10 22:02:57 +00:00
Nuno Lopes	e3127f3f80	fix memleak by cleaning the global sets on pass exit llvm-svn: 57353	2008-10-10 16:25:50 +00:00
Dale Johannesen	4f0bd68cfe	Add a "loses information" return value to APFloat::convert and APFloat::convertToInteger. Restore return value to IEEE754. Adjust all users accordingly. llvm-svn: 57329	2008-10-09 23:00:39 +00:00
Duncan Sands	26ff6f9c54	Add <cstdio> include where needed by gcc-4.4. Patch by Samuel Tardieu. llvm-svn: 57291	2008-10-08 07:23:46 +00:00
Chris Lattner	42d5785dbd	Add parentheses to avoid warnings in GCC 4.4.0, patch by Samuel Tardieu! llvm-svn: 57288	2008-10-08 06:42:28 +00:00
Devang Patel	40aafce00d	Fix typo, fix PR 2865. llvm-svn: 57221	2008-10-06 23:22:54 +00:00
Matthijs Kooijman	cbe5e16eb5	Allow scalarrepl to treat an all-zero GEP just as bitcast. This includes not marking a GEP involving a vector as unsafe, but only when it has all zero indices. This allows scalarrepl to work in a few more cases. llvm-svn: 57177	2008-10-06 16:23:31 +00:00
Chris Lattner	917a6c1343	rewrite bswap matching to be more general, allowing arbitrary shifting and masking inside a bswap expr. This allows it to handle the cases from PR2842, which involve the intermediate 'or' expressions being shifted, not just the input value. llvm-svn: 57095	2008-10-05 02:13:19 +00:00
Chris Lattner	ca91f265c4	fix a bug where the bswap matcher could match a case involving ashr. It should only apply to lshr. llvm-svn: 57089	2008-10-05 00:50:57 +00:00
Duncan Sands	d65a4daeea	Factorize code: remove variants of "strip off pointer bitcasts and GEP's", and centralize the logic in Value::getUnderlyingObject. The difference with stripPointerCasts is that stripPointerCasts only strips GEPs if all indices are zero, while getUnderlyingObject strips GEPs no matter what the indices are. llvm-svn: 56922	2008-10-01 15:25:41 +00:00
Dan Gohman	67d90de2b0	Call ScalarEvolution's deleteValueFromRecords before deleting an instruction, not after. This fixes some uses of free'd memory. llvm-svn: 56908	2008-10-01 02:02:03 +00:00
Nick Lewycky	e8ced3ec19	Fix misoptimization of: xor i1 (icmp eq (X, C1), icmp s[lg]t (X, C2)) llvm-svn: 56834	2008-09-30 06:08:34 +00:00
Devang Patel	9eb525d4f9	Implement function notes as function attributes. llvm-svn: 56716	2008-09-26 23:51:19 +00:00
Devang Patel	a05633e105	Now Attributes are divided in three groups - return attributes - inreg, zext and sext - parameter attributes - function attributes - nounwind, readonly, readnone, noreturn Return attributes use 0 as the index. Function attributes use ~0U as the index. This patch requires corresponding changes in llvm-gcc and clang. llvm-svn: 56704	2008-09-26 22:53:05 +00:00
Devang Patel	4c758ea3e0	Large mechanical patch. s/ParamAttr/Attribute/g s/PAList/AttrList/g s/FnAttributeWithIndex/AttributeWithIndex/g s/FnAttr/Attribute/g This sets the stage - to implement function notes as function attributes and - to distinguish between function attributes and return value attributes. This requires corresponding changes in llvm-gcc and clang. llvm-svn: 56622	2008-09-25 21:00:45 +00:00
Evan Cheng	25dd4a2daf	Commit CodeGenPrepare.cpp changes which was accidentially left out of 56526. llvm-svn: 56549	2008-09-24 06:48:55 +00:00
Eric Christopher	c1ea149dcd	Fix fallout in CodeGenPrepare from 56526. Will likely need more work. llvm-svn: 56546	2008-09-24 05:32:41 +00:00
Devang Patel	6402c7236f	s/ParamAttrsWithIndex/FnAttributeWithIndex/g llvm-svn: 56535	2008-09-24 00:55:02 +00:00
Devang Patel	e15607b7bb	Put FN_NOTE_AlwaysInline and others in FnAttr namespace. llvm-svn: 56527	2008-09-24 00:06:15 +00:00
Devang Patel	e87abd26ba	Move FN_NOTE_AlwaysInline and other out of ParamAttrs namespace. Do not check isDeclaration() in hasNote(). It is clients' responsibility. llvm-svn: 56524	2008-09-23 23:52:03 +00:00
Devang Patel	ba3fa6c6e1	s/ParameterAttributes/Attributes/g llvm-svn: 56513	2008-09-23 23:03:40 +00:00
Devang Patel	82fed6702b	Use parameter attribute store (soon to be renamed) for Function Notes also. Function notes are stored at index ~0. llvm-svn: 56511	2008-09-23 22:35:17 +00:00
Devang Patel	329fe728b5	Add hasNote() to check note associated with a function. llvm-svn: 56477	2008-09-22 22:32:29 +00:00
Oscar Fuentes	a229b3c9a7	Initial support for the CMake build system. llvm-svn: 56419	2008-09-22 01:08:49 +00:00
Duncan Sands	310077034a	Remove the MarkModRef pass (use AddReadAttrs instead). Unfortunately this means removing one regression test of GlobalsModRef because I couldn't work out how to perform it without MarkModRef. llvm-svn: 56342	2008-09-19 08:23:44 +00:00
Devang Patel	c25be3b2de	splitLoop does not handle split condition EQ. Fixes PR 2805 llvm-svn: 56321	2008-09-18 23:45:14 +00:00
Bill Wendling	a00fa322b1	Decrementing the iterator here could be wrong if the worklist is empty after the "erase". Thanks to Ji Young Park for the patch! llvm-svn: 56316	2008-09-18 23:04:18 +00:00
Devang Patel	dca8d3b183	Do not ignore iv uses outside the loop. This one slipped through cracks very well. llvm-svn: 56284	2008-09-17 17:53:47 +00:00
Dan Gohman	dafa9c6e85	Improve instcombine's handling of integer min and max in two ways: - Recognize expressions like "x > -1 ? x : 0" as min/max and turn them into expressions like "x < 0 ? 0 : x", which is easily recognizable as a min/max operation. - Refrain from folding expression like "y/2 < 1" to "y < 2" when the comparison is being used as part of a min or max idiom, like "y/2 < 1 ? 1 : y/2". In that case, the division has another use, so folding doesn't eliminate it, and obfuscates the min/max, making it harder to recognize as a min/max operation. These benefit ScalarEvolution, CodeGen, and anything else that wants to recognize integer min and max. llvm-svn: 56246	2008-09-16 18:46:06 +00:00
Dan Gohman	68e7735a38	Teach LSR to optimize away SMAX operations for tripcounts in common cases. See the comment above OptimizeSMax for the full story, and the testcase for an example. This cancels out a pessimization commonly attributed to indvars, and will allow us to lift some of the artificial throttles in indvars, rather than add new ones. llvm-svn: 56230	2008-09-15 21:22:06 +00:00
Dan Gohman	eff71f2953	On 64-bit targets, change 32-bit getelementptr indices to be 64-bit getelementptr indices, inserting an explicit cast if necessary. This helps expose the sign-extension operation to other optimizations. llvm-svn: 56133	2008-09-11 23:06:38 +00:00
Dan Gohman	7d01c0654c	Fix a vectorshuffle instcombine bug introduced by r55995. Patch by Nicolas Capens! llvm-svn: 56129	2008-09-11 22:47:57 +00:00
Dan Gohman	9b9d547a5c	Fix a copy+paste bug that Duncan spotted. For several cases it was still getting lucky and detecting overflow but it was clearly incorrect. llvm-svn: 56113	2008-09-11 18:53:02 +00:00
Dan Gohman	9d9a4be588	In my analysis for r56076 I missed the case where the original multiplication overflows. llvm-svn: 56082	2008-09-11 00:25:00 +00:00
Dan Gohman	c1ae01688f	Fix an icmp+sdiv optimization to check for and handle an overflow condition. This fixes PR2740. llvm-svn: 56076	2008-09-10 23:30:57 +00:00
Devang Patel	728c44ab56	fix white spaces. llvm-svn: 56056	2008-09-10 14:49:55 +00:00
Dan Gohman	97f0a0f28d	Fix a warning about comparing signed and unsigned values. llvm-svn: 56040	2008-09-10 01:09:32 +00:00
Devang Patel	92b032f3e6	if loop induction variable is always sign or zero extended then extend the type of induction variable. llvm-svn: 56017	2008-09-09 21:41:07 +00:00
Devang Patel	92c5367705	fix overflow check. llvm-svn: 56011	2008-09-09 20:54:34 +00:00
Dan Gohman	86fb5b48de	Make SimplifyDemandedVectorElts simplify vectors with multiple users, and teach it about shufflevector instructions. Also, fix a subtle bug in SimplifyDemandedVectorElts' insertelement code. This is a patch that was originally written by Eli Friedman, with some fixes and cleanup by me. llvm-svn: 55995	2008-09-09 18:11:14 +00:00
Devang Patel	3d56051f70	s/RemoveUnreachableBlocks/RemoveUnreachableBlocksFromFn/g llvm-svn: 55965	2008-09-08 22:14:17 +00:00
Devang Patel	7518f250b9	Remove unused counter. llvm-svn: 55924	2008-09-08 17:14:54 +00:00
Devang Patel	538a7f479a	Remove OptimizeIVType() llvm-svn: 55913	2008-09-08 16:13:27 +00:00
Devang Patel	d94269f906	Remove unused map. llvm-svn: 55861	2008-09-05 21:55:33 +00:00
Devang Patel	40519f0370	A loop may be unswitched multiple times. Reconstruct dom info. at the end. llvm-svn: 55806	2008-09-04 22:43:59 +00:00
Devang Patel	00ec74616b	Initialize loop data first. llvm-svn: 55792	2008-09-04 20:36:36 +00:00
Devang Patel	d52071540c	Do not unswitch if the function notes say we're optimizing this function for size. llvm-svn: 55786	2008-09-04 18:55:13 +00:00
Dale Johannesen	fe1bb7964c	Add intrinsic forms of pow and exp2. The non-intrinsic forms remain to handle older IR files, but will go away soon. llvm-svn: 55781	2008-09-04 18:30:46 +00:00
Dan Gohman	a79db30d28	Tidy up several unbeseeming casts from pointer to intptr_t. llvm-svn: 55779	2008-09-04 17:05:41 +00:00
Owen Anderson	2fbfb70530	Fix a bug that prevented PRE from applying in some cases. llvm-svn: 55744	2008-09-03 23:06:07 +00:00
Nick Lewycky	2fcb26cc75	Don't apply this transform to vectors. Fixes PR2756. llvm-svn: 55690	2008-09-03 06:24:21 +00:00
Devang Patel	bcd39345de	Add additional check to ensure that iv is canonicalized. llvm-svn: 55682	2008-09-03 00:29:13 +00:00
Devang Patel	b530f08122	Check iteration count. llvm-svn: 55680	2008-09-03 00:10:56 +00:00
Devang Patel	81fed043c5	While removing PHI, use basicblock to identify incoming value. llvm-svn: 55678	2008-09-03 00:02:42 +00:00
Devang Patel	43c5a52e07	If all IV uses are extending integer IV then change the type of IV itself, if possible. llvm-svn: 55674	2008-09-02 22:18:08 +00:00
Duncan Sands	130d9efec3	Add a small pass that sets the readnone/readonly attributes on functions, based on the result of alias analysis. It's not hardwired to use GlobalsModRef even though this is the only (AFAIK) alias analysis that results in this pass actually doing something. Enable as follows: opt ... -globalsmodref-aa -markmodref ... Advantages of this pass: (1) records the result of globalsmodref in the bitcode, meaning it is available for use by later passes (currently the pass manager isn't smart enough to magically make an advanced alias analysis available to all later passes), which may expose more optimization opportunities; (2) hopefully speeds up compilation when code is optimized twice, for example when a file is compiled to bitcode, then later LTO is done on it: marking functions readonly/readnone when producing the initial bitcode should speed up alias analysis during LTO; (3) good for discovering that globalsmodref doesn't work very well :) Not currently turned on by default. llvm-svn: 55604	2008-09-01 11:40:11 +00:00
Devang Patel	d6adbb6a0f	Do not apply the transformation if the target does not support DestTy natively. llvm-svn: 55433	2008-08-27 20:55:23 +00:00
Devang Patel	cf7ca5d0ba	Fix typos and whitespaces. Other cosmetic changes based on feedback. llvm-svn: 55424	2008-08-27 17:50:18 +00:00
Owen Anderson	b39e0decf8	Put a heuristic in place to prevent GVN from falling into bad cases with massively complicated CFGs. This speeds up a particular testcase from 12+ hours to 5 seconds with little perceptible loss of quality. llvm-svn: 55391	2008-08-26 22:07:42 +00:00
Devang Patel	4310d39844	If IV is used in a int-to-float cast inside the loop then try to eliminate the cast operation. llvm-svn: 55374	2008-08-26 17:57:54 +00:00
Chris Lattner	add44f3fb7	improve encapsulation of the BBExecutable set. llvm-svn: 55271	2008-08-23 23:39:31 +00:00
Chris Lattner	65938fc69a	Switch an assortment of maps, sets and vectors to more efficient versions, patch contributed by m-s! llvm-svn: 55270	2008-08-23 23:36:38 +00:00
Chris Lattner	0c19df4871	Switch the asmprinter (.ll) and all the stuff it requires over to use raw_ostream instead of std::ostream. Among other goodness, this speeds up llvm-dis of kc++ with a release build from 0.85s to 0.49s (88% faster). Other interesting changes: 1) This makes Value::print be non-virtual. 2) AP[S]Int and ConstantRange can no longer print to ostream directly, use raw_ostream instead. 3) This fixes a bug in raw_os_ostream where it didn't flush itself when destroyed. 4) This adds a new SDNode::print method, instead of only allowing "dump". A lot of APIs have both std::ostream and raw_ostream versions, it would be useful to go through and systematically anihilate the std::ostream versions. This passes dejagnu, but there may be minor fallout, plz let me know if so and I'll fix it. llvm-svn: 55263	2008-08-23 22:23:09 +00:00
Chris Lattner	3f972c9150	Fix PR2423 by checking all indices for out of range access, not only indices that start with an array subscript. x->field[10000] is just as bad as (*X)[14][10000]. llvm-svn: 55226	2008-08-23 05:21:06 +00:00
Chris Lattner	5fc8ab6d18	consolidate DenseMapInfo implementations, and add one for std::pair. Patch contributed by m-s. llvm-svn: 55167	2008-08-22 05:08:25 +00:00
Nick Lewycky	99f4558117	Revert r54876 r54877 r54906 and r54907. Evan found that these caused a 20% slowdown in bzip2. llvm-svn: 55113	2008-08-21 05:56:10 +00:00
Evan Cheng	f5a7e51c81	Silence a compiler warning. llvm-svn: 55087	2008-08-20 23:36:48 +00:00
Mon P Wang	1b2c061b73	Fixed shuffle optimizations to handle non power of 2 vectors llvm-svn: 55035	2008-08-20 02:23:25 +00:00
Chris Lattner	57693dda1d	don't use the result of WriteAsOperand llvm-svn: 54979	2008-08-19 04:45:19 +00:00
Nick Lewycky	75d4a83f2f	Make this comment clearer. Instead of using an ambiguous ~ (not) on an icmp predicate, swap the order of the operands. llvm-svn: 54907	2008-08-17 20:02:02 +00:00
Nick Lewycky	53b44029d6	Consider the case where xor by -1 and xor by 128 have been combined already to produce an xor by 127. llvm-svn: 54906	2008-08-17 19:58:24 +00:00
Evan Cheng	5dabe042a6	Revert 54821. It's miscompiling 252.eon and 447.dealII llvm-svn: 54878	2008-08-17 08:07:31 +00:00
Nick Lewycky	18c6f56c76	I found a better place for this optz'n. llvm-svn: 54877	2008-08-17 07:54:14 +00:00
Nick Lewycky	18f50b2637	Xor'ing both sides of icmp by sign-bit is equivalent to swapping signedness of the predicate. Also, make this optz'n apply in more cases where it's safe to do so. llvm-svn: 54876	2008-08-17 07:34:14 +00:00
Owen Anderson	affe0267f8	Remove GCSE, ValueNumbering, and LoadValueNumbering. These have been deprecated for almost a year; it's finally time for them to go away. llvm-svn: 54822	2008-08-15 21:31:02 +00:00
Devang Patel	f2a03d5a4b	Reapply 54786. Add overflow and number of mantissa bits checks. llvm-svn: 54821	2008-08-15 21:21:34 +00:00
Evan Cheng	86834d29f3	Revert 54786. It's not checking for overflows, etc. llvm-svn: 54813	2008-08-15 08:12:11 +00:00
Chris Lattner	1d23915a8f	use smallvector instead of vector for a couple worklists. This speeds up instcombine by ~10% on some testcases. llvm-svn: 54811	2008-08-15 04:03:01 +00:00
Bill Wendling	861bec78f8	Temporarily revert r54792. It's causing an ICE during bootstrapping. llvm-svn: 54804	2008-08-14 23:05:24 +00:00
Devang Patel	52dc07b01a	Use DenseMap. Patch by Pratik Solanki. llvm-svn: 54792	2008-08-14 21:31:10 +00:00
Devang Patel	054a833dd4	If IV is used in a int-to-float cast inside the loop then try to eliminate the cast opeation. llvm-svn: 54786	2008-08-14 20:58:31 +00:00
Dan Gohman	8de6d22392	Use empty() instead of begin() == end(). llvm-svn: 54780	2008-08-14 18:13:49 +00:00
Dan Gohman	6134fbccef	Fix a bogus srem rule - a negative value srem'd by a power-of-2 can have a non-negative result; for example, -16%16 is 0. Also, clarify the related comments. This fixes PR2670. llvm-svn: 54767	2008-08-13 23:12:35 +00:00
Dan Gohman	8ded5d5884	Fix SCCP's handling of struct value loads and stores. SCCP doesn't track individual leaf values in such cases, so it needs to treat struct values as normal values in this case. llvm-svn: 54760	2008-08-13 21:22:48 +00:00
Devang Patel	6369a798ba	Rename. s/FindIVForUser/FindIVUserForCond/g llvm-svn: 54754	2008-08-13 20:31:11 +00:00
Devang Patel	97387e6615	Check sign to detect overflow before changing compare stride. llvm-svn: 54710	2008-08-13 02:05:14 +00:00
Chris Lattner	2aa0ff27aa	Implement support for simplifying vector comparisons by 0.0 and 1.0 like we do for scalars. Patch contributed by Nicolas Capens This also generalizes the previous xforms to work on long double, now that isExactlyValue works for long double. llvm-svn: 54653	2008-08-11 22:06:05 +00:00
Eric Christopher	5927883970	Have IRBuilder take a template argument on whether or not to preserve names. This can save a lot of allocations if you aren't going to be looking at the output. llvm-svn: 54546	2008-08-08 19:39:37 +00:00
Dan Gohman	ac22cfcae9	Fix a shufflevector instcombine that was emitting invalid masks indices when it meant to be emitting undef indices. llvm-svn: 54417	2008-08-06 18:17:32 +00:00
Evan Cheng	907dc2bc37	Fix PR2355: bug in ChangeCompareStride. When the loop termination compare is the only use of its iv stride, the stride can be eliminated by moving it to another stride. If the scale is negative, swap the predicate instead of using a inverse predicate. llvm-svn: 54415	2008-08-06 18:04:43 +00:00
Chris Lattner	f5b353c1fd	optimize a common idiom generated by clang for bitfield access, PR2638. llvm-svn: 54408	2008-08-06 07:35:52 +00:00

... 5 6 7 8 9 ...

3328 Commits