llvm-project

Commit Graph

Author	SHA1	Message	Date
Dan Gohman	cc31110b95	Re-apply r73718, now that the fix in r73787 is in, and add a hand-crafted testcase which demonstrates the bug that was exposed in 254.gap. llvm-svn: 73793	2009-06-19 23:23:27 +00:00
Dan Gohman	55e3dd9174	Fix LSR's OptimizeSMax to ignore max operators with more than 2 operands, which it isn't prepared to handle. llvm-svn: 73787	2009-06-19 23:03:46 +00:00
Evan Cheng	86076c9e30	Revert 73718. It's breaking 254.gap. llvm-svn: 73783	2009-06-19 21:15:06 +00:00
Dan Gohman	8c9ac59455	Generalize LSR's OptimizeSMax to handle unsigned max tests as well as signed max tests. Along with r73717, this helps CodeGen avoid emitting code for a maximum operation for this class of loop. llvm-svn: 73718	2009-06-18 20:23:18 +00:00
Dan Gohman	a0348809b6	Remove the code from IVUsers that attempted to handle casted induction variables in cases where the cast isn't foldable. It ended up being a pessimization in many cases. This could be fixed, but it would require a bunch of complicated code in IVUsers' clients. The advantages of this approach aren't visible enough to justify it at this time. llvm-svn: 73706	2009-06-18 16:54:06 +00:00
Dan Gohman	d8329e8378	Update comments to use doxygen syntax. llvm-svn: 73621	2009-06-17 17:51:33 +00:00
Dan Gohman	7ccc52f131	Support vector casts in more places, fixing a variety of assertion failures. To support this, add some utility functions to Type to help support vector/scalar-independent code. Change ConstantInt::get and ConstantFP::get to support vector types, and add an overload to ConstantInt::get that uses a static IntegerType type, for convenience. Introduce a new getConstant method for ScalarEvolution, to simplify common use cases. llvm-svn: 73431	2009-06-15 22:12:54 +00:00
Dan Gohman	0652fd59ff	Convert several parts of the ScalarEvolution framework to use SmallVector instead of std::vector. llvm-svn: 73357	2009-06-14 22:47:23 +00:00
Devang Patel	50fc5a3cd7	Simplify. llvm-svn: 72965	2009-06-05 22:39:21 +00:00
Dan Gohman	a5b9645c4b	Split the Add, Sub, and Mul instruction opcodes into separate integer and floating-point opcodes, introducing FAdd, FSub, and FMul. For now, the AsmParser, BitcodeReader, and IRBuilder all preserve backwards compatibility, and the Core LLVM APIs preserve backwards compatibility for IR producers. Most front-ends won't need to change immediately. This implements the first step of the plan outlined here: http://nondot.org/sabre/LLVMNotes/IntegerOverflow.txt llvm-svn: 72897	2009-06-04 22:49:04 +00:00
Dan Gohman	4d1823680d	Revert 72493 and replace it with a more conservative fix, for now: don't rewrite the comparison if there is any implicit extension or truncation on the induction variable. I'm planning for IVUsers to eventually take over some of the work of this code, and for it to be generalized. llvm-svn: 72496	2009-05-27 21:10:47 +00:00
Dan Gohman	f4d85325c0	In ChangeCompareStride, when the stride to be reused is truncated to a smaller type, promote its offset back up to the type of the new comparison. This fixes PR4222. llvm-svn: 72493	2009-05-27 20:00:18 +00:00
Dan Gohman	7248923a5d	Suppress the IV reversal transformation in the case that the RHS of the comparison is defined inside the loop. This fixes a use-before-def problem, because the transformation puts a use of the RHS outside the loop. llvm-svn: 72149	2009-05-20 00:34:08 +00:00
Dan Gohman	97f70add3c	Add some more comments to the top of this file. llvm-svn: 72131	2009-05-19 20:37:36 +00:00
Dan Gohman	adc70d6806	Trim unneeded #includes. llvm-svn: 72130	2009-05-19 20:35:26 +00:00
Dan Gohman	2649491f9c	Teach SCEVExpander to expand arithmetic involving pointers into GEP instructions. It attempts to create high-level multi-operand GEPs, though in cases where this isn't possible it falls back to casting the pointer to i8* and emitting a GEP with that. Using GEP instructions instead of ptrtoint+arithmetic+inttoptr helps pointer analyses that don't use ScalarEvolution, such as BasicAliasAnalysis. Also, make the AddrModeMatcher more aggressive in handling GEPs. Previously it assumed that operand 0 of a GEP would require a register in almost all cases. It now does extra checking and can do more matching if operand 0 of the GEP is foldable. This fixes a problem that was exposed by SCEVExpander using GEPs. llvm-svn: 72093	2009-05-19 02:15:55 +00:00
Dan Gohman	14d1339579	Rename UseTy to AccessTy, for consistency with getAccessType, and to avoid ambiguity with the word "use" in IVStrideUse. llvm-svn: 72012	2009-05-18 16:45:28 +00:00
Dale Johannesen	536de01bcf	Add an int64_t variant of abs, for host environments without one. Use it where we were using abs on int64_t objects. (I strongly suspect the casts to unsigned in the fragments in LoopStrengthReduce are not doing whatever the original intent was, but the obvious change to uint64_t doesn't work. Maybe later.) llvm-svn: 71612	2009-05-13 00:24:22 +00:00
Dan Gohman	d76d71a291	Factor the code for collecting IV users out of LSR into an IVUsers class, and generalize it so that it can be used by IndVarSimplify. Implement the base IndVarSimplify transformation code using IVUsers. This removes TestOrigIVForWrap and associated code, as ScalarEvolution now has enough builtin overflow detection and folding logic to handle all the same cases, and more. Run "opt -iv-users -analyze -disable-output" on your favorite loop for an example of what IVUsers does. This lets IndVarSimplify eliminate IV casts and compute trip counts in more cases. Also, this happens to finally fix the remaining testcases in PR1301. Now that IndVarSimplify is being more aggressive, it occasionally runs into the problem where ScalarEvolutionExpander's code for avoiding duplicate expansions makes it difficult to ensure that all expanded instructions dominate all the instructions that will use them. As a temporary measure, IndVarSimplify now uses a FixUsesBeforeDefs function to fix up instructions inserted by SCEVExpander. Fortunately, this code is contained, and can be easily removed once a more comprehensive solution is available. llvm-svn: 71535	2009-05-12 02:17:14 +00:00
Evan Cheng	78a4eb844b	Teach LSR to optimize more loop exit compares, i.e. change them to use postinc iv value. Previously LSR would only optimize those which are in the loop latch block. However, if LSR can prove it is safe (and profitable), it's now possible to change those not in the latch blocks to use postinc values. Also, if the compare is the only use, LSR would place the iv increment instruction before the compare instead of in the latch. llvm-svn: 71485	2009-05-11 22:33:01 +00:00
Dale Johannesen	02cb2bf2e3	Reverse a loop that is counting up to a maximum to count down to 0 instead, under very restricted circumstances. Adjust 4 testcases in which this optimization fires. llvm-svn: 71439	2009-05-11 17:15:42 +00:00
Evan Cheng	b9dcc2c0c9	Factor out code that optimizes loop terminating condition. llvm-svn: 71305	2009-05-09 01:08:24 +00:00
Evan Cheng	342053cd27	Unbreak the build. llvm-svn: 71091	2009-05-06 18:00:56 +00:00
David Greene	0dec5b9a75	Make sure to use signed arithmetic in APInt to fix a regression. llvm-svn: 71090	2009-05-06 17:39:26 +00:00
Dan Gohman	e58fc20f8d	Fix a copy+pasto in a comment. llvm-svn: 71035	2009-05-05 23:02:38 +00:00
Dan Gohman	96b18ccdd3	Delete a FIXME which is no longer relevant, and add a FIXME that is. llvm-svn: 71033	2009-05-05 22:59:55 +00:00
Bill Wendling	5e2ac0cd9c	Temporarily reverting r71008. It was causing this failure: Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/ CodeGen/X86/dg.exp ... FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/test/ CodeGen/X86/change-compare-stride-1.ll Failed with exit(1) at line 2 while running: grep {cmpq $-478,} change-compare-stride-1.ll.tmp child process exited abnormally llvm-svn: 71013	2009-05-05 20:49:46 +00:00
David Greene	246a3dfb10	Handle overflow of 64-bit loop conditions. llvm-svn: 71008	2009-05-05 20:22:36 +00:00
Dan Gohman	48f8222293	Re-apply 70645, converting ScalarEvolution to use CallbackVH, with fixes. allUsesReplacedWith needs to walk the def-use chains and invalidate all users of a value that is replaced. SCEVs of users need to be recalculated even if the new value is equivalent. Also, make forgetLoopPHIs walk def-use chains, since any SCEV that depends on a PHI should be recalculated when more information about that PHI becomes available. llvm-svn: 70927	2009-05-04 22:30:44 +00:00
Dan Gohman	a30370bc33	Constify a bunch of SCEV-using code. llvm-svn: 70919	2009-05-04 22:02:23 +00:00
Dan Gohman	5036695c32	Revert r70645 for now; it's causing a variety of regressions. llvm-svn: 70661	2009-05-03 05:46:20 +00:00
Dan Gohman	e9a38d16fe	Convert ScalarEvolution to use CallbackVH for its internal map. This makes ScalarEvolution::deleteValueFromRecords, and it's code that subtly needed to be called before ReplaceAllUsesWith, unnecessary. It also makes ValueDeletionListener unnecessary. llvm-svn: 70645	2009-05-02 21:19:20 +00:00
Dan Gohman	ff08995589	Previously, RecursivelyDeleteDeadInstructions provided an option of returning a list of pointers to Values that are deleted. This was unsafe, because the pointers in the list are, by nature of what RecursivelyDeleteDeadInstructions does, always dangling. Replace this with a simple callback mechanism. This may eventually be removed if all clients can reasonably be expected to use CallbackVH. Use this to factor out the dead-phi-cycle-elimination code from LSR utility function, and generalize it to use the RecursivelyDeleteTriviallyDeadInstructions utility function. This makes LSR more aggressive about eliminating dead PHI cycles; adjust tests to either be less trivial or to simply expect fewer instructions. llvm-svn: 70636	2009-05-02 18:29:22 +00:00
Dan Gohman	6409e7d4e9	Don't split critical edges during the AddUsersIfInteresting phase of LSR. This makes the AddUsersIfInteresting phase of LSR a pure analysis instead of a phase that potentially does CFG modifications. The conditions where this code would actually perform a split are rare, and in the cases where it actually would do a split the split is usually undone by CodeGenPrepare, and in cases where splits actually survive into codegen, they appear to hurt more often than they help. llvm-svn: 70625	2009-05-02 05:36:01 +00:00
Dan Gohman	65dbe7874f	Make RequiresTypeConversion canonicalize the types before calling the target hooks canLosslesslyBitCastTo and isTruncateFree. This allows targets to avoid worrying about handling all combinations of integer and pointer types. llvm-svn: 70555	2009-05-01 17:07:43 +00:00
Dan Gohman	d3aa4215ef	Minor whitespace fix. llvm-svn: 70551	2009-05-01 16:56:32 +00:00
Dan Gohman	6be8530158	Fix some code to work if TargetLowering is not available. llvm-svn: 70546	2009-05-01 16:29:14 +00:00
Dale Johannesen	f4031bd01e	Print correct instruction in dump. llvm-svn: 70427	2009-04-29 22:57:20 +00:00
Dan Gohman	e99f98262c	Permit ChangeCompareStride to rewrite a comparison when the factor between the comparison's iv stride and the candidate stride is exactly -1. llvm-svn: 70244	2009-04-27 20:35:32 +00:00
Dan Gohman	4860db61be	Factor out a common base class from SCEVTruncateExpr, SCEVZeroExtendExpr, and SCEVSignExtendExpr. llvm-svn: 69649	2009-04-21 01:25:57 +00:00
Dan Gohman	b397e1a7a2	Introduce encapsulation for ScalarEvolution's TargetData object, and refactor the code to minimize dependencies on TargetData. llvm-svn: 69644	2009-04-21 01:07:12 +00:00
Dan Gohman	056857aa21	Use more const qualifiers with SCEV interfaces. llvm-svn: 69450	2009-04-18 17:56:28 +00:00
Dan Gohman	d2d6fd806c	Don't create ConstantInts with pointer type. This fixes a regression in 403.gcc in PIC_CODEGEN=1 and DISABLE_LTO=1 mode. llvm-svn: 69344	2009-04-17 02:02:52 +00:00
Dan Gohman	fec1d086e0	Use TargetData::getTypeSizeInBits instead of getPrimitiveSizeInBits() to get the correct answer for pointer types. llvm-svn: 69321	2009-04-16 22:35:57 +00:00
Dan Gohman	8b6ebb1112	Minor code simplifications. Don't attempt LSR on theoretical targets with pointers larger than 64 bits, due to the code not yet being APInt clean. llvm-svn: 69296	2009-04-16 16:49:48 +00:00
Dan Gohman	e2ead2c328	LSR is no longer a GEP optimizer. It is now an IV expression optimizer, which just happens to frequently involve optimizing GEPs. llvm-svn: 69295	2009-04-16 16:46:01 +00:00
Dan Gohman	a8be04b2db	Use ConstantExpr::getIntToPtr instead of SCEVExpander::InsertCastOfTo, since the operand is always a constant. llvm-svn: 69291	2009-04-16 15:48:38 +00:00
Dan Gohman	71bccd3e0e	Use a SCEV expression cast instead of immediately inserting a new instruction with SCEVExpander::InsertCastOfTo. llvm-svn: 69290	2009-04-16 15:47:35 +00:00
Dan Gohman	0a40ad93a9	Expand GEPs in ScalarEvolution expressions. SCEV expressions can now have pointer types, though in contrast to C pointer types, SCEV addition is never implicitly scaled. This not only eliminates the need for special code like IndVars' EliminatePointerRecurrence and LSR's own GEP expansion code, it also does a better job because it lets the normal optimizations handle pointer expressions just like integer expressions. Also, since LLVM IR GEPs can't directly index into multi-dimensional VLAs, moving the GEP analysis out of client code and into the SCEV framework makes it easier for clients to handle multi-dimensional VLAs the same way as other arrays. Some existing regression tests show improved optimization. test/CodeGen/ARM/2007-03-13-InstrSched.ll in particular improved to the point where if-conversion started kicking in; I turned it off for this test to preserve the intent of the test. llvm-svn: 69258	2009-04-16 03:18:22 +00:00
Chris Lattner	42e9ca42ce	LSR shouldn't ever try to hack on integer IV's larger than 64-bits. Right now it is not APInt clean, but even when it is it needs to be evaluated carefully to determine whether it is actually profitable. This fixes a crash on PR3806 llvm-svn: 67134	2009-03-17 23:58:30 +00:00
Dan Gohman	f12436891e	Don't record the increment instruction; just recompute it from the Phi if needed. This simplifies the code a little, and is needed for an upcoming refactoring. llvm-svn: 66479	2009-03-09 22:04:01 +00:00
Dan Gohman	b855164751	Fix a few more places where induction variable types were used where memory access types are needed. llvm-svn: 66470	2009-03-09 21:22:12 +00:00
Dan Gohman	5a4e31666d	Use ReplacedTy instead of recomputing the same value. llvm-svn: 66469	2009-03-09 21:19:58 +00:00
Dan Gohman	34e52ddb7d	Use LoopInfo's getLoopLatch() instead of doing what it does manually. llvm-svn: 66467	2009-03-09 21:14:16 +00:00
Dan Gohman	70cc9875d8	Don't use an induction variable type as a memory access type. Use VoidTy instead, to be properly conservative. llvm-svn: 66463	2009-03-09 21:04:19 +00:00
Dan Gohman	917ffe4592	Factor out the code that determines the memory access type of an instruction into a helper function. llvm-svn: 66460	2009-03-09 21:01:17 +00:00
Dan Gohman	e201f8ff1d	Move the sorting of the StrideOrder array earlier so that it doesn't have to be done twice. llvm-svn: 66449	2009-03-09 20:46:50 +00:00
Dan Gohman	b5001909b0	Delete the isOnlyStride argument, which is unused. llvm-svn: 66446	2009-03-09 20:41:15 +00:00
Dan Gohman	85875f7120	Tidy some LSR debug output: announce the loop it's about to process before it does any processing. llvm-svn: 66443	2009-03-09 20:34:59 +00:00
Dan Gohman	66476b582d	Fix this comment. llvm-svn: 66065	2009-03-04 20:50:23 +00:00
Dan Gohman	ae0035ee15	Add an assertion for a condition that's always true, and not immediately obvious. llvm-svn: 66062	2009-03-04 20:49:01 +00:00
Dan Gohman	0bddac16a8	Rename ScalarEvolution's getIterationCount to getBackedgeTakenCount, to more accurately describe what it does. Expand its doxygen comment to describe what the backedge-taken count is and how it differs from the actual iteration count of the loop. Adjust names and comments in associated code accordingly. llvm-svn: 65382	2009-02-24 18:55:53 +00:00
Dan Gohman	5d1f458f0f	Generalize the ChangeCompareStride code, in preparation for handling non-constant strides. No functionality change. llvm-svn: 65363	2009-02-24 01:58:00 +00:00
Dan Gohman	f394e58af5	Properly parenthesize this expression, fixing a real bug in the new -full-lsr code, as well as a GCC warning. llvm-svn: 65288	2009-02-22 16:40:52 +00:00
Evan Cheng	69decbf0b2	Only try to sink immediate when TLI is not null. It needs to check if immediate would fit in target addressing field. llvm-svn: 65268	2009-02-22 07:31:19 +00:00
Evan Cheng	107b06c4b9	Teach LSR sink to sink the immediate portion of the common expression back into uses if they fit in address modes of all the uses. llvm-svn: 65215	2009-02-21 02:06:47 +00:00
Evan Cheng	8a9481d50d	Fix strange logic in CollectIVUsers used to determine whether all uses are addresses, part 1. This fixes an obvious logic bug. Previously if the only in-loop use is a PHI, it would return AllUsesAreAddresses as true. llvm-svn: 65178	2009-02-20 22:16:49 +00:00
Dan Gohman	5e309a5bbb	Simplify code and reduce indentation. No functionality change. llvm-svn: 65167	2009-02-20 21:27:23 +00:00
Dan Gohman	2c8cb5b4ec	Fix 80-column violations. llvm-svn: 65159	2009-02-20 21:06:57 +00:00
Dan Gohman	addc50b4ee	It's not necessary to check if Base is null here. llvm-svn: 65157	2009-02-20 21:05:23 +00:00
Dan Gohman	1608df5319	Add a comment about how Imm can be used for loop-variant values. llvm-svn: 65147	2009-02-20 20:29:04 +00:00
Dan Gohman	2a12ae7d1f	Implement "superhero" strength reduction, or full strength reduction of address calculations down to basic pointer arithmetic. This is currently off by default, as it needs a few other features before it becomes generally useful. And even when enabled, full strength reduction is only performed when it doesn't increase register pressure, and when several other conditions are true. This also factors out a bunch of existing LSR code out of StrengthReduceStridedIVUsers into separate functions, and tidies up IV insertion. This actually decreases register pressure even in non-superhero mode. The change in iv-users-in-other-loops.ll is an example of this; there are two more adds because there are two fewer leas, and there is less spilling. llvm-svn: 65108	2009-02-20 04:17:46 +00:00
Dan Gohman	a34d7adefb	Use DEBUG() instead of passing *DOUT to WriteAsOperand, since the latter just passes a null reference when debugging is not enabled. llvm-svn: 65060	2009-02-19 19:32:06 +00:00
Dan Gohman	30a2959367	Make the debug output of LSR less cryptic and more informative. llvm-svn: 65057	2009-02-19 19:23:27 +00:00
Dan Gohman	d0b1fbd983	Fix a typo in a comment. llvm-svn: 64859	2009-02-18 00:08:39 +00:00
Evan Cheng	161861deb0	Strengthen the "non-constant stride must dominate loop preheader" check. llvm-svn: 64703	2009-02-17 00:13:06 +00:00
Evan Cheng	e79841adbb	Fix pr3571: If stride is a value defined by an instruction, make sure it dominates the loop preheader. When IV users are strength reduced, the stride is inserted into the preheader. It could create a use before def situation. llvm-svn: 64579	2009-02-15 06:06:15 +00:00
Evan Cheng	fe151ba135	ifdef out unneeded if statement. llvm-svn: 64575	2009-02-15 03:20:37 +00:00
Dan Gohman	a2730abaaa	Complete the sentence in this comment. I have reservations about the code it describes, but at least now the comment is right. llvm-svn: 64465	2009-02-13 17:36:42 +00:00
Dan Gohman	f71a473720	Fix the code that checked if a SCEVAddRecExpr Start contains an addrec in a different loop to check the value being added to the accumulated Start value, not the Start value before it has the new value added to it. This prevents LSR from going crazy on the included testcase. Dale, please review. llvm-svn: 64440	2009-02-13 03:58:31 +00:00
Dan Gohman	ba83228cdb	Fix LSR's IV sorting function to explicitly sort by bitwidth after sorting by stride value. This prevents it from missing IV reuse opportunities in a host-sensitive manner. llvm-svn: 64415	2009-02-13 00:26:43 +00:00
Dale Johannesen	cd19967754	Fix PR 3471, and some cleanups. llvm-svn: 64177	2009-02-09 22:14:15 +00:00
Dale Johannesen	1f0e0e7c9c	Fix the time regression I introduced in 464.h264ref with my earlier patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. Also, when we build an expression that involves a (possibly non-affine) IV from a different loop as well as an IV from the one we're interested in (containsAddRecFromDifferentLoop), don't recurse into that. We can't do much with it and will get in trouble if we try to create new non-affine IVs or something. More testcases are coming. llvm-svn: 62212	2009-01-14 02:35:31 +00:00
Duncan Sands	dc020f9c3c	Rename getABITypeSize to getTypePaddedSize, as suggested by Chris. llvm-svn: 62099	2009-01-12 20:38:59 +00:00
Dale Johannesen	656237beca	Revert 61362 and 61402 until SPEC breakage is fixed. llvm-svn: 61403	2008-12-23 23:21:35 +00:00
Dale Johannesen	f8b161bcd1	This fixes the bug in 175.vpr. It doesn't fix the other SPEC breakage. I'll be reverting all recent changes shortly, this checkin is mostly so this change doesn't get lost. llvm-svn: 61402	2008-12-23 23:05:26 +00:00
Dale Johannesen	93b9aa8799	Fix the time regression I introduced in 464.h264ref with my last patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Also, the mechanism for keeping SCEV's corresponding to GEP's no longer works, as the GEP might change after its SCEV is remembered, invalidating the SCEV, and we might get a bad SCEV value when looking up the GEP again for a later loop. This also couldn't happen before, as we weren't recursing into GEP's outside the loop. I owe some testcases for this, want to get it in for nightly runs. llvm-svn: 61362	2008-12-23 02:12:52 +00:00
Dale Johannesen	3e5843b992	Revert previous patch, appears to break bootstrap. llvm-svn: 61181	2008-12-18 01:23:41 +00:00
Dale Johannesen	12d031b716	Fix the time regression I introduced in 464.h264ref with my last patch to this file. The issue there was that all uses of an IV inside a loop are actually references to Base[IV2], and there was one use outside that was the same but LSR didn't see the base or the scaling because it didn't recurse into uses outside the loop; thus, it used base+IVscale mode inside the loop instead of pulling base out of the loop. This was extra bad because register pressure later forced both base and IV into memory. Doing that recursion, at least enough to figure out addressing modes, is a good idea in general; the change in AddUsersIfInteresting does this. However, there were side effects.... It is also possible for recursing outside the loop to introduce another IV where there was only 1 before (if the refs inside are not scaled and the ref outside is). I don't think this is a common case, but it's in the testsuite. It is right to be very aggressive about getting rid of such introduced IVs (CheckForIVReuse and the handling of nonzero RewriteFactor in StrengthReduceStridedIVUsers). In the testcase in question the new IV produced this way has both a nonconstant stride and a nonzero base, neither of which was handled before. (This patch does not handle all the cases where this can happen.) And when inserting new code that feeds into a PHI, it's right to put such code at the original location rather than in the PHI's immediate predecessor(s) when the original location is outside the loop (a case that couldn't happen before) (RewriteInstructionToUseNewBase); better to avoid making multiple copies of it in this case. Everything above is exercised in CodeGen/X86/lsr-negative-stride.ll (and ifcvt4 in ARM which is the same IR). llvm-svn: 61178	2008-12-18 00:57:22 +00:00
Dale Johannesen	904ce8120d	Clarify that the scale factor from CheckForIVReuse can be negative. Keep track of whether all uses of an IV are outside the loop. Some cosmetics; no functional change. llvm-svn: 61109	2008-12-16 22:16:28 +00:00
Chris Lattner	56b20ffc5f	Fix a really subtle off-by-one bug that Duncan noticed with valgrind on test/CodeGen/Generic/2007-06-06-CriticalEdgeLandingPad. llvm-svn: 60739	2008-12-09 04:47:21 +00:00
Dale Johannesen	9efd2ce55b	Make LoopStrengthReduce smarter about hoisting things out of loops when they can be subsumed into addressing modes. Change X86 addressing mode check to realize that some PIC references need an extra register. (I believe this is correct for Linux, if not, I'm sure someone will tell me.) llvm-svn: 60608	2008-12-05 21:47:27 +00:00
Dale Johannesen	4e9e6ea604	Remove an unused field. llvm-svn: 60508	2008-12-03 22:43:56 +00:00
Dale Johannesen	f7a588b909	Fix a misspelled function name. llvm-svn: 60506	2008-12-03 20:56:12 +00:00
Dale Johannesen	d49ceff6ba	Fix a really wrong comment. llvm-svn: 60494	2008-12-03 19:25:46 +00:00
Dale Johannesen	4d2ecb8f68	Minor rewrite per review feedback. llvm-svn: 60442	2008-12-02 21:17:11 +00:00
Dale Johannesen	70060013d2	Make the code do what the comment says it does. llvm-svn: 60431	2008-12-02 18:40:09 +00:00
Chris Lattner	ead1a61b47	some random comment improvements. llvm-svn: 60395	2008-12-02 04:52:26 +00:00
Dale Johannesen	069a4eee55	Consider only references to an IV within the loop when figuring out the base of the IV. This produces better code in the example. (Addresses use (IV) instead of (BASE,IV) - a significant improvement on low-register machines like x86). llvm-svn: 60374	2008-12-01 22:00:01 +00:00
Chris Lattner	2c2dd15a85	Introduce a new array_pod_sort function and switch LSR to use it instead of std::sort. This shrinks the release-asserts LSR.o file by 1100 bytes of code on my system. We should start using array_pod_sort where possible. llvm-svn: 60335	2008-12-01 06:49:59 +00:00
Chris Lattner	2aebea5735	Eliminate use of setvector for the DeadInsts set, just use a smallvector. This is a lot cheaper and conceptually simpler. llvm-svn: 60332	2008-12-01 06:27:41 +00:00
Chris Lattner	4da78e3774	DeleteTriviallyDeadInstructions is always passed the DeadInsts ivar, just use it directly. llvm-svn: 60330	2008-12-01 06:14:28 +00:00
Chris Lattner	a68a5a4784	simplify DeleteTriviallyDeadInstructions again, unlike my previous buggy rewrite, this notifies ScalarEvolution of a pending instruction about to be removed and then erases it, instead of erasing it then notifying. llvm-svn: 60329	2008-12-01 06:11:32 +00:00
Bill Wendling	469e3aa696	Temporarily revert r60195. It's causing an optimized bootstrap of llvm-gcc to fail. llvm-svn: 60233	2008-11-29 03:43:04 +00:00
Chris Lattner	c077a2a535	Simplify LoopStrengthReduce::DeleteTriviallyDeadInstructions by making it use RecursivelyDeleteTriviallyDeadInstructions to do the heavy lifting. llvm-svn: 60195	2008-11-27 23:23:35 +00:00
Chris Lattner	96e2dbe008	use continue to reduce indentation llvm-svn: 60192	2008-11-27 23:00:20 +00:00
Daniel Dunbar	7f39e2d85a	Change createPass factory functions to return Pass instead of LoopPass*. - Although less precise, this means they can be used in clients without RTTI (who would otherwise need to include LoopPass.h, which eventually includes things using dynamic_cast). This was the simplest solution that presented itself, but I am happy to use a better one if available. llvm-svn: 58010	2008-10-22 23:32:42 +00:00
Dan Gohman	67d90de2b0	Call ScalarEvolution's deleteValueFromRecords before deleting an instruction, not after. This fixes some uses of free'd memory. llvm-svn: 56908	2008-10-01 02:02:03 +00:00
Dan Gohman	68e7735a38	Teach LSR to optimize away SMAX operations for tripcounts in common cases. See the comment above OptimizeSMax for the full story, and the testcase for an example. This cancels out a pessimization commonly attributed to indvars, and will allow us to lift some of the artificial throttles in indvars, rather than add new ones. llvm-svn: 56230	2008-09-15 21:22:06 +00:00
Devang Patel	92c5367705	fix overflow check. llvm-svn: 56011	2008-09-09 20:54:34 +00:00
Devang Patel	7518f250b9	Remove unused counter. llvm-svn: 55924	2008-09-08 17:14:54 +00:00
Devang Patel	538a7f479a	Remove OptimizeIVType() llvm-svn: 55913	2008-09-08 16:13:27 +00:00
Dan Gohman	a79db30d28	Tidy up several unbeseeming casts from pointer to intptr_t. llvm-svn: 55779	2008-09-04 17:05:41 +00:00
Devang Patel	bcd39345de	Add additional check to ensure that iv is canonicalized. llvm-svn: 55682	2008-09-03 00:29:13 +00:00
Devang Patel	b530f08122	Check iteration count. llvm-svn: 55680	2008-09-03 00:10:56 +00:00
Devang Patel	81fed043c5	While removing PHI, use basicblock to identify incoming value. llvm-svn: 55678	2008-09-03 00:02:42 +00:00
Devang Patel	43c5a52e07	If all IV uses are extending integer IV then change the type of IV itself, if possible. llvm-svn: 55674	2008-09-02 22:18:08 +00:00
Devang Patel	d6adbb6a0f	Do not apply the transformation if the target does not support DestTy natively. llvm-svn: 55433	2008-08-27 20:55:23 +00:00
Devang Patel	cf7ca5d0ba	Fix typos and whitespaces. Other cosmetic changes based on feedback. llvm-svn: 55424	2008-08-27 17:50:18 +00:00
Devang Patel	4310d39844	If IV is used in an int-to-float cast inside the loop then try to eliminate the cast operation. llvm-svn: 55374	2008-08-26 17:57:54 +00:00
Evan Cheng	5dabe042a6	Revert 54821. It's miscompiling 252.eon and 447.dealII llvm-svn: 54878	2008-08-17 08:07:31 +00:00
Devang Patel	f2a03d5a4b	Reapply 54786. Add overflow and number of mantissa bits checks. llvm-svn: 54821	2008-08-15 21:21:34 +00:00
Evan Cheng	86834d29f3	Revert 54786. It's not checking for overflows, etc. llvm-svn: 54813	2008-08-15 08:12:11 +00:00
Devang Patel	054a833dd4	If IV is used in an int-to-float cast inside the loop then try to eliminate the cast operation. llvm-svn: 54786	2008-08-14 20:58:31 +00:00
Devang Patel	6369a798ba	Rename. s/FindIVForUser/FindIVUserForCond/g llvm-svn: 54754	2008-08-13 20:31:11 +00:00
Devang Patel	97387e6615	Check sign to detect overflow before changing compare stride. llvm-svn: 54710	2008-08-13 02:05:14 +00:00
Evan Cheng	907dc2bc37	Fix PR2355: bug in ChangeCompareStride. When the loop termination compare is the only use of its iv stride, the stride can be eliminated by moving it to another stride. If the scale is negative, swap the predicate instead of using an inverse predicate. llvm-svn: 54415	2008-08-06 18:04:43 +00:00
Dan Gohman	7ad3cd8c9d	Fix a bug in LSR's dead-PHI cleanup. If a PHI has a def-use chain that leads into a cycle involving a different PHI, LSR got stuck running around that cycle looking for the original PHI. To avoid this, keep track of visited PHIs and stop searching if we see one more than once. This fixes PR2570. llvm-svn: 53879	2008-07-21 21:45:02 +00:00
Dan Gohman	162668fa78	Fix uninitialized use of the Changed variable. llvm-svn: 53564	2008-07-14 17:55:01 +00:00
Evan Cheng	03001cb820	Fix two serious LSR bugs. 1. LSR runOnLoop is always returning false regardless if any transformation is made. 2. AddUsersIfInteresting can create new instructions that are added to DeadInsts. But there is a later early exit which prevents them from being freed. llvm-svn: 53193	2008-07-07 19:51:32 +00:00
Dan Gohman	ac563833ae	Fix spelling and grammar in a comment. llvm-svn: 52648	2008-06-23 22:11:52 +00:00
Dan Gohman	5ca5e02480	Improve LSR's dead-phi detection to handle use-def cycles with more than two nodes. llvm-svn: 52617	2008-06-22 20:44:02 +00:00
Dan Gohman	be928e3b21	Move LSR's private isZero function to a public SCEV member function, and make use of it in several places. llvm-svn: 52463	2008-06-18 16:23:07 +00:00
Dan Gohman	ab0dccba6b	Refine the change in r52258 for avoiding use-before-def conditions when changing the stride of a comparison so that it's slightly more precise, by having it scan the instruction list to determine if there is a use of the condition after the point where the condition will be inserted. llvm-svn: 52371	2008-06-16 22:34:15 +00:00
Evan Cheng	319e9a4f63	Switch over to SetVector to ensure the order of iteration does not vary across runs. llvm-svn: 52361	2008-06-16 21:08:17 +00:00
Evan Cheng	a72cdcd1a2	Iterating over SmallPtrSet is not deterministic. llvm-svn: 52339	2008-06-16 18:17:09 +00:00
Dan Gohman	9ad8c54aab	Protect ChangeCompareStride from situations in which it is possible for it to generate use-before-def IR, such as in this testcase. llvm-svn: 52258	2008-06-13 21:43:41 +00:00
Gabor Greif	0babc61631	op_iterator-ify some loops, fix 80col violations llvm-svn: 52226	2008-06-11 21:38:51 +00:00
Evan Cheng	02912418f1	Remove x86.sse2.loadh.pd and x86.sse2.loadl.pd. These will be lowered into load and shuffle instructions. llvm-svn: 51521	2008-05-24 00:07:06 +00:00
Dan Gohman	e62632e0bb	When LSR is replacing an instruction, call ScalarEvolution::deleteValueFromRecords on it before doing the replaceAllUsesWith, because ScalarEvolution looks at the instruction's users to find SCEV references to the instruction's SCEV object in its internal maps. Move all of LSR's loop-related state clearing after processing the loop and before cleaning up dead PHI nodes. This eliminates all of LSR's SCEV references just before the calls to ScalarEvolution::deleteValueFromRecords so that when ScalarEvolution drops its own SCEV references, the reference counts will reach zero and the SCEVs will be deleted immediately. These changes fix some compiler aborts involving ScalarEvolution holding onto and reusing SCEV objects for instructions that have been deleted. No regression test unfortunately; because the symptoms were due to dangling pointers, reduced testcases ended up being fairly arbitrary. llvm-svn: 51359	2008-05-21 00:54:12 +00:00
Dan Gohman	e5572706e8	Refine the fix in r51169 to only apply when the operand val being replaced is a PHI. This prevents it from inserting uses before defs in the case that it isn't a PHI and it depends on other instructions later in the block. This fixes the 447.dealII regression on x86-64. llvm-svn: 51292	2008-05-20 03:01:48 +00:00
Dan Gohman	0a0fa7cf78	Fix a bug in LoopStrengthReduce that caused it to emit IR with use-before-def. The problem comes up in code with multiple PHIs where one PHI is being rewritten in terms of the other, but the other needs to be casted first. LLVM rules require the cast instruction to be inserted after any PHI instructions, but when instructions were inserted to replace the second PHI value with a function of the first, they ended up going before the cast instruction. Avoid this problem by remembering the location of the cast instruction, when one is needed, and inserting the expansion of the new value after it. This fixes a bug that surfaced in 255.vortex on x86-64 when instcombine was removed from the middle of the loop optimization passes. llvm-svn: 51169	2008-05-15 23:26:57 +00:00
Dan Gohman	d78c400b5b	Clean up the use of static and anonymous namespaces. This turned up several things that were neither in an anonymous namespace nor static but not intended to be global. llvm-svn: 51017	2008-05-13 00:00:25 +00:00
Dan Gohman	e36714c0b4	Minor whitespace and comment cleanups. llvm-svn: 49671	2008-04-14 18:26:16 +00:00
Gabor Greif	e9ecc68d8f	API changes for class Use size reduction, wave 1. Specifically, introduction of XXX::Create methods for Users that have a potentially variable number of Uses. llvm-svn: 49277	2008-04-06 20:25:17 +00:00
Evan Cheng	a90fdc4340	Remove dead options. llvm-svn: 48556	2008-03-19 22:02:26 +00:00
Dan Gohman	70de4cb1cd	Use empty() instead of comparing size() with zero. llvm-svn: 46514	2008-01-29 13:02:09 +00:00
Chris Lattner	f3ebc3f3d2	Remove attribution from file headers, per discussion on llvmdev. llvm-svn: 45418	2007-12-29 20:36:04 +00:00
Evan Cheng	26ee54eb05	Clean up previous patch: PHI uses should not prevent iv reuse if all other uses are addresses. This trades a constant multiply for one fewer iv. llvm-svn: 45251	2007-12-20 02:20:53 +00:00
Evan Cheng	e2a8ba7fec	Allow iv reuse if the user is a PHI node which is in turn used as addresses. llvm-svn: 45230	2007-12-19 23:33:23 +00:00
Dale Johannesen	7d97662467	Remove indeterminism from a loop. We think this will fix an occasional nonrepeatable bootstrap failure we've been seeing on Darwin. llvm-svn: 44202	2007-11-17 02:48:01 +00:00
Evan Cheng	240c1adade	At end of LSR, replace uses of now constant (as result of SplitCriticalEdge) PHI node with the constant value. llvm-svn: 43533	2007-10-30 23:45:15 +00:00
Evan Cheng	c2dbfee43f	It's not safe to tell SplitCriticalEdge to merge identical edges. It may delete the phi instruction that's being processed. llvm-svn: 43524	2007-10-30 22:27:26 +00:00
Evan Cheng	b024c4c81d	- Bug fixes. - Allow icmp rewrite using an iv / stride of a smaller integer type. llvm-svn: 43480	2007-10-29 22:07:18 +00:00
Dan Gohman	7414e21ec0	Update a comment to reflect the current code. llvm-svn: 43463	2007-10-29 19:32:39 +00:00
Dan Gohman	f5feb01056	Remove an unused function argument. llvm-svn: 43462	2007-10-29 19:31:25 +00:00
Dan Gohman	50d42224d0	Fix a typo in a comment. llvm-svn: 43461	2007-10-29 19:26:14 +00:00
Dan Gohman	8e8adada83	Avoid calling ValidStride when not all uses are addresses. llvm-svn: 43460	2007-10-29 19:23:53 +00:00
Evan Cheng	9dbe99dcd6	A number of LSR fixes: - ChangeCompareStride only reuse stride that is larger than current stride. It will let the general reuse mechanism to try to reuse a smaller stride. - Watch out for multiplication overflow in ChangeCompareStride. - Replace std::set with SmallPtrSet. llvm-svn: 43408	2007-10-26 23:08:19 +00:00
Evan Cheng	d78a3e5555	Fix a crash. Make sure TLI is not null. llvm-svn: 43384	2007-10-26 17:24:46 +00:00
Evan Cheng	7f3d02471d	Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free. e.g. Turns this loop: LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx movw %dx, %si LBB1_2: # bb movl L_X$non_lazy_ptr, %edi movw %si, (%edi) movl L_Y$non_lazy_ptr, %edi movw %dx, (%edi) addw $4, %dx incw %si incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb into LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx LBB1_2: # bb movl L_X$non_lazy_ptr, %esi movw %cx, (%esi) movl L_Y$non_lazy_ptr, %esi movw %dx, (%esi) addw $4, %dx incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb llvm-svn: 43375	2007-10-26 01:56:11 +00:00
Evan Cheng	29e29e63bd	Do not rewrite compare instruction using iv of a different stride if the new stride may be rewritten using the stride of the compare instruction. llvm-svn: 43367	2007-10-25 22:45:20 +00:00
Evan Cheng	5a38108374	Remove code that's commented out. llvm-svn: 43356	2007-10-25 18:38:24 +00:00
Evan Cheng	133694db06	If a loop termination compare instruction is the only use of its stride, and the comparison is against a constant value, try to eliminate the stride by moving the compare instruction to another stride and change its constant operand accordingly. e.g. loop: ... v1 = v1 + 3 v2 = v2 + 1 if (v2 < 10) goto loop => loop: ... v1 = v1 + 3 if (v1 < 30) goto loop llvm-svn: 43336	2007-10-25 09:11:16 +00:00
Dan Gohman	e0c3d9f338	Strength reduction improvements. - Avoid attempting stride-reuse in the case that there are users that aren't addresses. In that case, there will be places where the multiplications won't be folded away, so it's better to try to strength-reduce them. - Several SSE intrinsics have operands that strength-reduction can treat as addresses. The previous item makes this more visible, as any non-address use of an IV can inhibit stride-reuse. - Make ValidStride aware of whether there's likely to be a base register in the address computation. This prevents it from thinking that things like stride 9 are valid on x86 when the base register is already occupied. Also, XFAIL the 2007-08-10-LEA16Use32.ll test; the new logic to avoid stride-reuse eliminates the LEA in the loop, so the test is no longer testing what it was intended to test. llvm-svn: 43231	2007-10-22 20:40:42 +00:00
Dan Gohman	a37eaf2bf9	Move the SCEV object factors from being static members of the individual SCEV subclasses to being non-static member functions of the ScalarEvolution class. llvm-svn: 43224	2007-10-22 18:31:58 +00:00
Dale Johannesen	b6c05b1f90	Fix stride computations for long double arrays. llvm-svn: 42508	2007-10-01 23:08:35 +00:00
Chris Lattner	2740694450	wrap some long lines. Major offenders that are left include gvn, gvnpre, dse, and predsimplify. To see these, use: make check-line-length llvm-svn: 40738	2007-08-02 16:53:43 +00:00
Dan Gohman	34d442f274	More explicit keywords. llvm-svn: 40673	2007-08-01 15:32:29 +00:00
Dan Gohman	8c4da37b1f	Use SCEVExpander::InsertCastOfTo instead of calling new IntToPtrInst directly, because the insert point used by the SCEVExpander may vary from what LSR originally computes. llvm-svn: 40641	2007-07-31 17:22:27 +00:00
Dan Gohman	32f53bbd85	Rename ScalarEvolution::deleteInstructionFromRecords to deleteValueFromRecords and loosen the types to allow it to accept Value* instead of just Instruction*, since this is what ScalarEvolution uses internally anyway. This allows more flexibility for future uses. llvm-svn: 37657	2007-06-19 14:28:31 +00:00
Dan Gohman	cb9e09ad57	Add a SCEV class and supporting code for sign-extend expressions. This created an ambiguity for expandInTy to decide when to use sign-extension or zero-extension, but it turns out that most of its callers don't actually need a type conversion, now that LLVM types don't have explicit signedness. Drop expandInTy in favor of plain expand, and change the few places that actually need a type conversion to do it themselves. llvm-svn: 37591	2007-06-15 14:38:12 +00:00
Devang Patel	df6355ccf8	Use DominatorTree instead of ETForest. llvm-svn: 37499	2007-06-07 21:42:15 +00:00
Chris Lattner	1b7b6e76ec	Fix PR1495 and CodeGen/X86/2007-06-05-LSR-Dominator.ll llvm-svn: 37454	2007-06-06 01:23:55 +00:00
Chris Lattner	e8bd53c36a	Handle negative strides much more optimally. This compiles X86/lsr-negative-stride.ll into: _t: movl 8(%esp), %ecx movl 4(%esp), %eax cmpl %ecx, %eax je LBB1_3 #bb17 LBB1_1: #bb cmpl %ecx, %eax jg LBB1_4 #cond_true LBB1_2: #cond_false subl %eax, %ecx cmpl %ecx, %eax jne LBB1_1 #bb LBB1_3: #bb17 ret LBB1_4: #cond_true subl %ecx, %eax cmpl %ecx, %eax jne LBB1_1 #bb jmp LBB1_3 #bb17 instead of: _t: subl $4, %esp movl %esi, (%esp) movl 12(%esp), %ecx movl 8(%esp), %eax cmpl %ecx, %eax je LBB1_4 #bb17 LBB1_1: #bb.outer movl %ecx, %edx negl %edx LBB1_2: #bb cmpl %ecx, %eax jle LBB1_5 #cond_false LBB1_3: #cond_true addl %edx, %eax cmpl %ecx, %eax jne LBB1_2 #bb LBB1_4: #bb17 movl (%esp), %esi addl $4, %esp ret LBB1_5: #cond_false movl %ecx, %edx subl %eax, %edx movl %eax, %esi addl %esi, %esi cmpl %ecx, %esi je LBB1_4 #bb17 LBB1_6: #cond_false.bb.outer_crit_edge movl %edx, %ecx jmp LBB1_1 #bb.outer llvm-svn: 37252	2007-05-19 01:22:21 +00:00
Chris Lattner	1480e16596	significantly improve debug output of lsr llvm-svn: 36996	2007-05-11 22:40:34 +00:00
Dan Gohman	2bcbd5b7ca	Use IntrinsicInst to test for prefetch instructions, which is ever so slightly nicer than using CallInst with an extra check; thanks Chris. llvm-svn: 36743	2007-05-04 14:59:09 +00:00
Dan Gohman	3fbb18d1b6	Allow strength reduction to make use of addressing modes for the address operand in a prefetch intrinsic. llvm-svn: 36713	2007-05-03 23:20:33 +00:00
Devang Patel	8c78a0bff0	Drop 'const' llvm-svn: 36662	2007-05-03 01:11:54 +00:00
Devang Patel	e95c6ad802	Use 'static const char' instead of 'static const int'. Due to darwin gcc bug, one version of darwin linker coalesces static const int, which defeats PassID based pass identification. llvm-svn: 36652	2007-05-02 21:39:20 +00:00
Devang Patel	09f162ca6a	Do not use typeinfo to identify pass in pass manager. llvm-svn: 36632	2007-05-01 21:15:47 +00:00
Devang Patel	38bc86f057	Fix http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070423/048333.html llvm-svn: 36380	2007-04-23 22:42:03 +00:00
Owen Anderson	f35a1dbc7a	Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for constructing ImmediateDominator is now folded into DomTree construction. This is part of the ongoing work for PR217. llvm-svn: 36063	2007-04-15 08:47:27 +00:00
Chris Lattner	efd3051d60	Now that codegen prepare isn't defeating me, I can finally fix what I set out to do! :) This fixes a problem where LSR would insert a bunch of code into each MBB that uses a particular subexpression (e.g. IV+base+C). The problem is that this code cannot be CSE'd back together if inserted into different blocks. This patch changes LSR to attempt to insert a single copy of this code and share it, allowing codegenprepare to duplicate the code if it can be sunk into various addressing modes. On CodeGen/ARM/lsr-code-insertion.ll, for example, this gives us code like: add r8, r0, r5 str r6, [r8, #+4] .. ble LBB1_4 @cond_next LBB1_3: @cond_true str r10, [r8, #+4] LBB1_4: @cond_next ... LBB1_5: @cond_true55 ldr r6, LCPI1_1 str r6, [r8, #+4] instead of: add r10, r0, r6 str r8, [r10, #+4] ... ble LBB1_4 @cond_next LBB1_3: @cond_true add r8, r0, r6 str r10, [r8, #+4] LBB1_4: @cond_next ... LBB1_5: @cond_true55 add r8, r0, r6 ldr r10, LCPI1_1 str r10, [r8, #+4] Besides being smaller and more efficient, this makes it immediately obvious that it is profitable to predicate LBB1_3 now :) llvm-svn: 35972	2007-04-13 20:42:26 +00:00
Chris Lattner	780c009756	switch LSR to use isLegalAddressingMode instead of other simpler hooks llvm-svn: 35837	2007-04-09 22:20:14 +00:00
Owen Anderson	8763ba1b88	Completely purge DomSet. This is the (hopefully) final patch for PR1171. llvm-svn: 35731	2007-04-07 07:17:27 +00:00
Chris Lattner	81e0707552	split some code out into a helper function llvm-svn: 35615	2007-04-03 05:11:24 +00:00
Chris Lattner	f3197a7d53	allow -1 strides to reuse "1" strides. llvm-svn: 35607	2007-04-02 22:51:58 +00:00
Chris Lattner	28e0e4e11e	Pass the type of the store access, not the type of the store, into the target hook. This allows us to codegen a loop as: LBB1_1: @cond_next mov r2, #0 str r2, [r0, +r3, lsl #2] add r3, r3, #1 cmn r3, #1 bne LBB1_1 @cond_next instead of: LBB1_1: @cond_next mov r2, #0 str r2, [r0], #+4 add r3, r3, #1 cmn r3, #1 bne LBB1_1 @cond_next This looks the same, but has one fewer induction variable (and therefore, one fewer register) live in the loop. llvm-svn: 35592	2007-04-02 06:34:44 +00:00
Chris Lattner	8fe3cbe6bd	print the type of an inserted IV in -debug mode. llvm-svn: 35563	2007-04-01 22:21:39 +00:00
Dale Johannesen	e5866e7b89	Look through bitcast when finding IVs. (Chris' patch really.) llvm-svn: 35347	2007-03-26 03:01:27 +00:00
Dale Johannesen	bacf4acf65	do not share old induction variables when this would result in invalid instructions (that would have to be split later) llvm-svn: 35227	2007-03-20 21:54:54 +00:00
Jeff Cohen	1baf5c84ab	Fix some VC++ warnings. llvm-svn: 35224	2007-03-20 20:43:18 +00:00
Dale Johannesen	e3a02be5f1	use types of loads and stores, not address, in CheckForIVReuse llvm-svn: 35197	2007-03-20 00:47:50 +00:00
Evan Cheng	b5eb932c93	Correct type info for isLegalAddressImmediate() check. llvm-svn: 35086	2007-03-13 20:34:37 +00:00
Evan Cheng	720acdfb31	Use new TargetLowering addressing modes hooks. llvm-svn: 35072	2007-03-12 23:27:37 +00:00
Devang Patel	58818c530f	Increment iterator now because IVUseShouldUsePostIncValue may remove User from the list of I users. llvm-svn: 35051	2007-03-09 21:19:53 +00:00
Devang Patel	b0743b5d6a	Now LoopStrengthReduce is a LoopPass. llvm-svn: 34984	2007-03-06 21:14:09 +00:00
Reid Spencer	53a3739c80	Finally get this patch right :) Replace expensive getZExtValue() == 0 calls with isZero() calls. llvm-svn: 34861	2007-03-02 23:51:25 +00:00
Reid Spencer	ba547cbb2a	Dang, I've done that twice now! Undo previous commit. llvm-svn: 34860	2007-03-02 23:37:53 +00:00
Reid Spencer	558990e189	Use more efficient test for one value in a ConstantInt. llvm-svn: 34859	2007-03-02 23:35:28 +00:00
Reid Spencer	197adfaa0a	Reverse a premature commital. llvm-svn: 34822	2007-03-02 00:31:39 +00:00
Reid Spencer	2e54a15943	Prefer non-virtual calls to ConstantInt::isZero over virtual calls to Constant::isNullValue() in situations where it is possible. llvm-svn: 34821	2007-03-02 00:28:52 +00:00
Chris Lattner	c473d8e431	Privatize StructLayout::MemberOffsets, adding an accessor llvm-svn: 34156	2007-02-10 19:55:17 +00:00
Reid Spencer	557ab15e71	Apply the VISIBILITY_HIDDEN field to the remaining anonymous classes in the Transforms library. This reduces debug library size by 132 KB, debug binary size by 376 KB, and reduces link time for llvm tools slightly. llvm-svn: 33939	2007-02-05 23:32:05 +00:00
Chris Lattner	03c4953cdd	rename Type::isIntegral to Type::isInteger, eliminating the old Type::isInteger. rename Type::getIntegralTypeMask to Type::getIntegerTypeMask. This makes naming much more consistent. For example, there are now no longer any instances of IntegerType that are not considered isInteger! :) llvm-svn: 33225	2007-01-15 02:27:26 +00:00
Chris Lattner	1942249c5b	Eliminate calls to isInteger, generalizing code and tightening checks as needed. llvm-svn: 33218	2007-01-15 01:55:30 +00:00
Reid Spencer	bf96e02a54	For PR1097: Enable complex addressing modes on 64-bit platforms involving two induction variables by keeping a size and scale in 64-bits not 32. Patch by Dan Gohman. llvm-svn: 33011	2007-01-08 16:17:51 +00:00
Chris Lattner	3fe98ae10a	no need to worry about int vs uint any more. llvm-svn: 32946	2007-01-06 01:37:35 +00:00
Reid Spencer	c635f47d9a	For PR950: This patch replaces signed integer types with signless ones: 1. [US]Byte -> Int8 2. [U]Short -> Int16 3. [U]Int -> Int32 4. [U]Long -> Int64. 5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion and other methods related to signedness. In a few places this warranted identifying the signedness information from other sources. llvm-svn: 32785	2006-12-31 05:48:39 +00:00
Reid Spencer	266e42b312	For PR950: This patch removes the SetCC instructions and replaces them with the ICmp and FCmp instructions. The SetCondInst instruction has been removed and been replaced with ICmpInst and FCmpInst. llvm-svn: 32751	2006-12-23 06:05:41 +00:00
Chris Lattner	79a42ac941	Switch over Transforms/Scalar to use the STATISTIC macro. For each statistic converted, we lose a static initializer. This also allows GCC to emit warnings about unused statistics. llvm-svn: 32690	2006-12-19 21:40:18 +00:00
Reid Spencer	df1f19a8ef	Change the interface to SCEVExpander::InsertCastOfTo to take a cast opcode so the decision of which opcode to use is pushed upward to the caller. Adjust the callers to pass the expected opcode. llvm-svn: 32535	2006-12-13 08:06:42 +00:00
Reid Spencer	b341b0861d	Change inferred getCast into specific getCast. Passes all tests. llvm-svn: 32469	2006-12-12 05:05:00 +00:00
Bill Wendling	f3baad3ee1	Changed llvm_ostream et al. to OStream. llvm_cerr, llvm_cout, llvm_null, are now cerr, cout, and NullStream resp. llvm-svn: 32298	2006-12-07 01:30:32 +00:00
Chris Lattner	700b873130	Detemplatize the Statistic class. The only type it is instantiated with is 'unsigned'. llvm-svn: 32279	2006-12-06 17:46:33 +00:00
Reid Spencer	6c38f0bb07	For PR950: The long awaited CAST patch. This introduces 12 new instructions into LLVM to replace the cast instruction. Corresponding changes throughout LLVM are provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the exception of 175.vpr which fails only on a slight floating point output difference. llvm-svn: 31931	2006-11-27 01:05:10 +00:00
Bill Wendling	5dbf43c983	Removed #include <iostream> and replaced with llvm_* streams. llvm-svn: 31923	2006-11-26 09:46:52 +00:00
Chris Lattner	21eba2da26	If an indvar with a variable stride is used by the exit condition, go ahead and handle it like constant stride vars. This fixes some bad codegen in variable stride cases. For example, it compiles this: void foo(int k, int i) { for (k=i+i; k <= 8192; k+=i) flags2[k] = 0; } to: LBB1_1: #bb.preheader movl %eax, %ecx addl %ecx, %ecx movl L_flags2$non_lazy_ptr, %edx LBB1_2: #bb movb $0, (%edx,%ecx) addl %eax, %ecx cmpl $8192, %ecx jle LBB1_2 #bb LBB1_5: #return ret or (if the array is local and we are in dynamic-nonpic or static mode): LBB3_2: #bb movb $0, _flags2(%ecx) addl %eax, %ecx cmpl $8192, %ecx jle LBB3_2 #bb and: lis r2, ha16(L_flags2$non_lazy_ptr) lwz r2, lo16(L_flags2$non_lazy_ptr)(r2) slwi r3, r4, 1 LBB1_2: ;bb li r5, 0 add r6, r4, r3 stbx r5, r2, r3 cmpwi cr0, r6, 8192 bgt cr0, LBB1_5 ;return instead of: leal (%eax,%eax,2), %ecx movl %eax, %edx addl %edx, %edx addl L_flags2$non_lazy_ptr, %edx xorl %esi, %esi LBB1_2: #bb movb $0, (%edx,%esi) movl %eax, %edi addl %esi, %edi addl %ecx, %esi cmpl $8192, %esi jg LBB1_5 #return and: lis r2, ha16(L_flags2$non_lazy_ptr) lwz r2, lo16(L_flags2$non_lazy_ptr)(r2) mulli r3, r4, 3 slwi r5, r4, 1 li r6, 0 add r2, r2, r5 LBB1_2: ;bb li r5, 0 add r7, r3, r6 stbx r5, r2, r6 add r6, r4, r6 cmpwi cr0, r7, 8192 ble cr0, LBB1_2 ;bb This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and implements LoopStrengthReduce/var_stride_used_by_compare.ll llvm-svn: 31809	2006-11-17 06:17:33 +00:00
Reid Spencer	de46e48420	For PR786: Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting fall out by removing unused variables. Remaining warnings have to do with unused functions (I didn't want to delete code without review) and unused variables in generated code. Maintainers should clean up the remaining issues when they see them. All changes pass DejaGnu tests and Olden. llvm-svn: 31380	2006-11-02 20:25:50 +00:00
Chris Lattner	a6eb7e0803	break edges more intelligently llvm-svn: 31257	2006-10-28 06:45:33 +00:00
Chris Lattner	5191c65485	prepare for a change I'm about to make llvm-svn: 31248	2006-10-28 00:59:20 +00:00
Reid Spencer	e0fc4dfc22	For PR950: This patch implements the first increment for the Signless Types feature. All changes pertain to removing the ConstantSInt and ConstantUInt classes in favor of just using ConstantInt. llvm-svn: 31063	2006-10-20 07:07:24 +00:00
Chris Lattner	c2d3d3112e	eliminate RegisterOpt. It does the same thing as RegisterPass. llvm-svn: 29925	2006-08-27 22:42:52 +00:00
Chris Lattner	3d27be1333	s\|llvm/Support/Visibility.h\|llvm/Support/Compiler.h\| llvm-svn: 29911	2006-08-27 12:54:02 +00:00
Chris Lattner	3ff620178b	Changes: 1. Update an obsolete comment. 2. Make the sorting by base an explicit (though still N^2) step, so that the code is more clear on what it is doing. 3. Partition uses so that uses inside the loop are handled before uses outside the loop. Note that none of these changes currently changes the code inserted by LSR, but they are a stepping stone to getting there. This code is the result of some crazy pair programming with Nate. :) llvm-svn: 29493	2006-08-03 06:34:50 +00:00
Evan Cheng	e9c68f52e1	Only reuse a previous IV if it would not require a type conversion. llvm-svn: 29186	2006-07-18 19:07:58 +00:00
Chris Lattner	996795b0dd	Use hidden visibility to make symbols in an anonymous namespace get dropped. This shrinks libllvmgcc.dylib another 67K llvm-svn: 28975	2006-06-28 23:17:24 +00:00
Evan Cheng	398f70292c	RewriteExpr, either the new PHI node of induction variable or the post-increment value, should be first cast to the appropriate type (to the type of the common expr). Otherwise, the rewrite of a use based on (common + iv) may end up with an incorrect type. llvm-svn: 28735	2006-06-09 00:12:42 +00:00
Reid Spencer	13a1a7a4a6	Get rid of a signed/unsigned compare warning. llvm-svn: 27625	2006-04-12 19:28:15 +00:00
Chris Lattner	f365f5f0c1	Fix spello llvm-svn: 27052	2006-03-24 07:14:34 +00:00
Chris Lattner	7d80b4f366	silence a bogus gcc warning llvm-svn: 26953	2006-03-22 17:27:24 +00:00
Evan Cheng	c28282bd87	- Fixed a bogus if condition. - Added more debugging info. - Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride. llvm-svn: 26841	2006-03-18 08:03:12 +00:00
Evan Cheng	f09f0ebd48	Sort StrideOrder so we can process the smallest strides first. This allows for more IV reuses. llvm-svn: 26837	2006-03-18 00:44:49 +00:00
Evan Cheng	4520698820	Allow users of iv / stride to be rewritten with expression that is a multiply of a smaller stride even if they have a common loop invariant expression part. llvm-svn: 26828	2006-03-17 19:52:23 +00:00
Evan Cheng	3df447d354	For each loop, keep track of all the IV expressions inserted indexed by stride. For a set of uses of the IV of a stride which is a multiple of another stride, do not insert a new IV expression. Rather, reuse the previous IV and rewrite the uses as uses of IV expression multiplied by the factor. e.g. x = 0 ...; x ++ y = 0 ...; y += 4 then use of y can be rewritten as use of 4*x for x86. llvm-svn: 26803	2006-03-16 21:53:05 +00:00
Evan Cheng	c567c4efbb	Added target lowering hooks which LSR consults to make more intelligent transformation decisions. llvm-svn: 26738	2006-03-13 23:14:23 +00:00
Chris Lattner	d30c4991a1	Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces #LLVM LOC, and auto-cse's cast instructions. llvm-svn: 25974	2006-02-04 09:52:43 +00:00
Chris Lattner	2959f0003e	Fix two significant bugs in LSR: 1. When rewriting code in outer loops, sometimes we would insert code into inner loops that is invariant in that loop. 2. Notice that 4(2+x) is 8+4x and use that to simplify expressions. This is a performance neutral change. llvm-svn: 25964	2006-02-04 07:36:50 +00:00
Chris Lattner	c597b8a55e	Make iostream #inclusion explicit llvm-svn: 25514	2006-01-22 23:32:06 +00:00
Chris Lattner	cb36710ff9	Switch these to using ETForest instead of DominatorSet to compute itself. Patch written by Daniel Berlin! llvm-svn: 25202	2006-01-11 05:10:20 +00:00
Chris Lattner	077200737c	getRawValue zero extends for unsigned values, use getsextvalue so that we know that small negative values fit into the immediate field of addressing modes. llvm-svn: 24608	2005-12-05 18:23:57 +00:00
Chris Lattner	5df0e36e98	My previous patch was too conservative. Reject FP and void types, but do allow pointer types. llvm-svn: 23859	2005-10-21 05:45:41 +00:00
Chris Lattner	0c0b38bb4c	Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an inner loop like this: LBB_RateConvertMono8AltiVec_2: ; no_exit lis r2, ha16(.CPI_RateConvertMono8AltiVec_0) lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2) fmr f3, f3 fadd f0, f2, f0 fadd f3, f0, f3 fcmpu cr0, f3, f1 bge cr0, LBB_RateConvertMono8AltiVec_2 ; no_exit to an inner loop like this: LBB_RateConvertMono8AltiVec_1: ; no_exit fsub f2, f2, f1 fcmpu cr0, f2, f1 fmr f0, f2 bge cr0, LBB_RateConvertMono8AltiVec_1 ; no_exit Doh! good catch! llvm-svn: 23838	2005-10-20 04:47:10 +00:00
Chris Lattner	192cd18f53	Fix (hopefully the last) issue where LSR is nondeterministic. When pulling out CSE's of base expressions it could build a result whose order was nondet. llvm-svn: 23698	2005-10-11 18:41:04 +00:00
Chris Lattner	5c9d63da31	Fix another problem where LSR was being nondeterministic. Also remove elements from the end of a vector instead of the beginning llvm-svn: 23697	2005-10-11 18:30:57 +00:00
Chris Lattner	b7a3894e7c	Fix another lsr-is-nondeterministic case llvm-svn: 23695	2005-10-11 18:17:57 +00:00
Chris Lattner	eb4be8b942	Hrm, you didn't see this. llvm-svn: 23673	2005-10-09 06:24:02 +00:00
Chris Lattner	4ea0a3eaac	Fix a source of non-determinism in the backend: the order of processing IV strides depended on the pointer order of the strides in memory. Non-determinism is bad. llvm-svn: 23672	2005-10-09 06:20:55 +00:00
Chris Lattner	f07a587c79	Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In particular, it should realize that phi's use their values in the pred block not the phi block itself. This change turns our em3d loop from this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_6 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; endif.loopexit.loopexit_crit_edge addi r3, r2, 1 blr LBB_test_6: ; loopexit or r3, r2, r2 blr into: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r2, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r6, r6 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 or r2, r6, r6 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r2, r2 blr Unfortunately, this is actually worse code, because the register coalescer is getting confused somehow. If it were doing its job right, it could turn the code into this: _test: cmpwi cr0, r4, 0 bgt cr0, LBB_test_2 ; entry.no_exit_crit_edge LBB_test_1: ; entry.loopexit_crit_edge li r6, 0 b LBB_test_5 ; loopexit LBB_test_2: ; entry.no_exit_crit_edge li r6, 0 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit LBB_test_5: ; loopexit or r3, r6, r6 blr ... which I'll work on next. :) llvm-svn: 23604	2005-10-03 02:50:05 +00:00
Chris Lattner	e4ed42a426	Refactor some code into a function llvm-svn: 23603	2005-10-03 01:04:44 +00:00
Chris Lattner	360928dbed	This break is bogus and I have no idea why it was there. Basically it prevents memoizing code when IV's are used by phinodes outside of loops. In a simple example, we were getting this code before (note that r6 and r7 are isomorphic IV's): li r6, 0 or r7, r6, r6 LBB_test_3: ; no_exit lwz r2, 0(r3) cmpw cr0, r2, r5 or r2, r7, r7 beq cr0, LBB_test_5 ; loopexit LBB_test_4: ; endif addi r2, r7, 1 addi r7, r7, 1 addi r3, r3, 4 addi r6, r6, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit Now we get: li r6, 0 LBB_test_3: ; no_exit or r2, r6, r6 lwz r6, 0(r3) cmpw cr0, r6, r5 beq cr0, LBB_test_6 ; loopexit LBB_test_4: ; endif addi r3, r3, 4 addi r6, r2, 1 cmpw cr0, r6, r4 blt cr0, LBB_test_3 ; no_exit this was noticed in em3d. llvm-svn: 23602	2005-10-03 00:37:33 +00:00
Chris Lattner	8fcce170cf	when checking if we should move a split edge block outside of a loop, check the presplit pred, not the post-split pred. This was causing us to make the wrong decision in some cases, leaving the critical edge block in the loop. llvm-svn: 23601	2005-10-03 00:31:52 +00:00
Chris Lattner	92233d2175	Make the pass name simpler llvm-svn: 23476	2005-09-27 21:10:32 +00:00
Chris Lattner	fd018c8dfe	Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. This fixes up a dot-product loop in galgel, speeding it up from 18.47s to 16.13s. llvm-svn: 23327	2005-09-13 02:09:55 +00:00
Chris Lattner	8048b85e8f	Fix a regression from last night, which caused this pass to create invalid code for IV uses outside of loops that are not dominated by the latch block. We should only convert these uses to use the post-inc value if they ARE dominated by the latch block. Also use a new LoopInfo method to simplify some code. This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll llvm-svn: 23318	2005-09-12 17:11:27 +00:00
Chris Lattner	a67648396a	_test: li r2, 0 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r2, 1 stw r2, 0(r4) blr [zion ~/llvm]$ cat > ~/xx Uses of IV's outside of the loop should use the post-incremented version of the IV, not the preincremented version. This helps many loops (e.g. in sixtrack) which used to generate code like this (this is the code from the dont-hoist-simple-loop-constants.ll testcase): _test: li r2, 0 ** IV starts at 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 Copy for loop exit li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 IV+2 cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 IV+2 stw r2, 0(r4) blr And now generate code like this: _test: li r2, 1 * IV starts at 1 LBB_test_1: ; no_exit.2 li r5, 0 stw r5, 0(r3) addi r2, r2, 1 addi r3, r3, 4 cmpwi cr0, r2, 701 * IV.postinc + 0 blt cr0, LBB_test_1 LBB_test_2: ; loopexit.2.loopexit stw r2, 0(r4) * IV.postinc + 0 blr llvm-svn: 23313	2005-09-12 06:04:47 +00:00
Chris Lattner	530fe6ab30	implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. We used to emit this code for it: _test: li r2, 1 ;; Value tying up a register for the whole loop li r5, 0 LBB_test_1: ; no_exit.2 or r6, r5, r5 li r5, 0 stw r5, 0(r3) addi r5, r6, 1 addi r3, r3, 4 add r7, r2, r5 ;; should be addi r7, r5, 1 cmpwi cr0, r7, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r6, 2 stw r2, 0(r4) blr now we emit this: _test: li r2, 0 LBB_test_1: ; no_exit.2 or r5, r2, r2 li r2, 0 stw r2, 0(r3) addi r3, r3, 4 addi r2, r5, 1 addi r6, r5, 2 ;; whoa, fold those adds! cmpwi cr0, r6, 701 blt cr0, LBB_test_1 ; no_exit.2 LBB_test_2: ; loopexit.2.loopexit addi r2, r5, 2 stw r2, 0(r4) blr more improvement coming. llvm-svn: 23306	2005-09-10 01:18:45 +00:00
Chris Lattner	ea7dfd53d6	Fix Transforms/LoopStrengthReduce/2005-08-17-OutOfLoopVariant.ll, a crash on 177.mesa llvm-svn: 22843	2005-08-17 21:22:41 +00:00
Chris Lattner	2bf7cb5213	Use a new helper to split critical edges, making the code simpler. Do not claim to not change the CFG. We do change the cfg to split critical edges. This isn't causing us a problem now, but could likely do so in the future. llvm-svn: 22824	2005-08-17 06:35:16 +00:00
Chris Lattner	5cf983ee0f	Fix a bad case in gzip where we put lots of things in registers across the loop, because an IV-dependent value was used outside of the loop and didn't have immediate-folding capability llvm-svn: 22798	2005-08-16 00:38:11 +00:00
Chris Lattner	47d3ec3525	Ooops, don't forget to clear this. The real inner loop is now: .LBB_foo_3: ; no_exit.1 lfd f2, 0(r9) lfd f3, 8(r9) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r9) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfd f2, 0(r9) addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22782	2005-08-13 07:42:01 +00:00
Chris Lattner	5949d49032	Recursively scan scev expressions for common subexpressions. This allows us to handle nested loops much better, for example, by being able to tell that these two expressions: {( 8 + ( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp 12)}<loopentry.1> {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> Have the following common part that can be shared: {(( 16 * ( 1 + %Tmp11 + %Tmp12)) + %c_),+,( 16 * %Tmp12)}<loopentry.1> This allows us to codegen an important inner loop in 168.wupwise as: .LBB_foo_4: ; no_exit.1 lfd f2, 16(r9) fmul f3, f0, f2 fmul f2, f1, f2 fadd f4, f3, f2 stfd f4, 8(r9) fsub f2, f3, f2 stfd f2, 16(r9) addi r8, r8, 1 addi r9, r9, 16 cmpw cr0, r8, r4 ble .LBB_foo_4 ; no_exit.1 instead of: .LBB_foo_3: ; no_exit.1 lfdx f2, r6, r9 add r10, r6, r9 lfd f3, 8(r10) fmul f4, f1, f2 fmadd f4, f0, f3, f4 stfd f4, 8(r10) fmul f3, f1, f3 fmsub f2, f0, f2, f3 stfdx f2, r6, r9 addi r9, r9, 16 addi r8, r8, 1 cmpw cr0, r8, r4 ble .LBB_foo_3 ; no_exit.1 llvm-svn: 22781	2005-08-13 07:27:18 +00:00
Chris Lattner	8447b49526	When splitting critical edges, make sure not to leave the new block in the middle of the loop. This turns a critical loop in gzip into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 bne .LBB_test_8 ; loopentry.loopexit_crit_edge .LBB_test_2: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 bne .LBB_test_7 ; shortcirc_next.0.loopexit_crit_edge .LBB_test_3: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 bne .LBB_test_6 ; shortcirc_next.1.loopexit_crit_edge .LBB_test_4: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry instead of this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_3: ; shortcirc_next.0 add r28, r3, r27 lhz r28, 5(r28) add r26, r4, r27 lhz r26, 5(r26) cmpw cr0, r28, r26 beq .LBB_test_5 ; shortcirc_next.1 .LBB_test_4: ; shortcirc_next.0.loopexit_crit_edge add r2, r11, r27 add r8, r12, r27 b .LBB_test_9 ; loopexit .LBB_test_5: ; shortcirc_next.1 add r28, r3, r27 lhz r28, 7(r28) add r26, r4, r27 lhz r26, 7(r26) cmpw cr0, r28, r26 beq .LBB_test_7 ; shortcirc_next.2 .LBB_test_6: ; shortcirc_next.1.loopexit_crit_edge add r2, r9, r27 add r8, r10, r27 b .LBB_test_9 ; loopexit .LBB_test_7: ; shortcirc_next.2 add r28, r3, r27 lhz r26, 9(r28) add r28, r4, r27 lhz r25, 9(r28) addi r28, r27, 8 cmpw cr7, r26, r25 mfcr r26, 1 rlwinm r26, r26, 31, 31, 31 add r25, r8, r27 cmpw cr7, r25, r7 mfcr 
r25, 1 rlwinm r25, r25, 29, 31, 31 and. r26, r26, r25 bne .LBB_test_1 ; loopentry Next up, improve the code for the loop. llvm-svn: 22769	2005-08-12 22:22:17 +00:00
Chris Lattner	4fec86d348	Fix a FIXME: if we are inserting code for a PHI argument, split the critical edge so that the code is not always executed for both operands. This prevents LSR from inserting code into loops whose exit blocks contain PHI uses of IV expressions (which are outside of loops). On gzip, for example, we turn this ugly code: .LBB_test_1: ; loopentry add r27, r3, r28 lhz r27, 3(r27) add r26, r4, r28 lhz r26, 3(r26) add r25, r30, r28 ;; Only live if exiting the loop add r24, r29, r28 ;; Only live if exiting the loop cmpw cr0, r27, r26 bne .LBB_test_5 ; loopexit into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_test_9 ; loopexit .LBB_test_2: ; shortcirc_next.0 ... blt .LBB_test_1 into this: .LBB_test_1: ; loopentry or r27, r28, r28 add r28, r3, r27 lhz r28, 3(r28) add r26, r4, r27 lhz r26, 3(r26) cmpw cr0, r28, r26 beq .LBB_test_3 ; shortcirc_next.0 .LBB_test_2: ; loopentry.loopexit_crit_edge add r2, r30, r27 add r8, r29, r27 b .LBB_t_3: ; shortcirc_next.0 .LBB_test_3: ; shortcirc_next.0 ... blt .LBB_test_1 Next step: get the block out of the loop so that the loop is all fall-throughs again. llvm-svn: 22766	2005-08-12 22:06:11 +00:00
Chris Lattner	edff91a49a	Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride. For code like this: void foo(float *a, float *b, int n, int stride_a, int stride_b) { int i; for (i=0; i<n; i++) a[i*stride_a] = b[i*stride_b]; } we now emit: .LBB_foo2_2: ; no_exit lfs f0, 0(r4) stfs f0, 0(r3) addi r7, r7, 1 add r4, r2, r4 add r3, r6, r3 cmpw cr0, r7, r5 blt .LBB_foo2_2 ; no_exit instead of: .LBB_foo_2: ; no_exit mullw r8, r2, r7 ;; multiply! slwi r8, r8, 2 lfsx f0, r4, r8 mullw r8, r2, r6 ;; multiply! slwi r8, r8, 2 stfsx f0, r3, r8 addi r2, r2, 1 cmpw cr0, r2, r5 blt .LBB_foo_2 ; no_exit loops with variable strides occur pretty often. For example, in SPECFP2K there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp, 56 in 168.wupwise, 36 in 172.mgrid. Now we can allow indvars to turn functions written like this: void foo2(float *a, float *b, int n, int stride_a, int stride_b) { int i, ai = 0, bi = 0; for (i=0; i<n; i++) { a[ai] = b[bi]; ai += stride_a; bi += stride_b; } } into code like the above for better analysis. With this patch, they generate identical code. llvm-svn: 22740	2005-08-10 00:45:21 +00:00
Chris Lattner	dde7dc525e	Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll by being more careful about updating PHI nodes llvm-svn: 22739	2005-08-10 00:35:32 +00:00
Chris Lattner	c6c4d99a21	Fix some 80 column violations. Once we compute the evolution for a GEP, tell SE about it. This allows users of the GEP to know it, if the users are not direct. This allows us to compile this testcase: void fbSolidFillmmx(int w, unsigned char *d) { while (w >= 64) { *(unsigned long long *) (d + 0) = 0; *(unsigned long long *) (d + 8) = 0; *(unsigned long long *) (d + 16) = 0; *(unsigned long long *) (d + 24) = 0; *(unsigned long long *) (d + 32) = 0; *(unsigned long long *) (d + 40) = 0; *(unsigned long long *) (d + 48) = 0; *(unsigned long long *) (d + 56) = 0; w -= 64; d += 64; } } into: .LBB_fbSolidFillmmx_2: ; no_exit li r2, 0 stw r2, 0(r4) stw r2, 4(r4) stw r2, 8(r4) stw r2, 12(r4) stw r2, 16(r4) stw r2, 20(r4) stw r2, 24(r4) stw r2, 28(r4) stw r2, 32(r4) stw r2, 36(r4) stw r2, 40(r4) stw r2, 44(r4) stw r2, 48(r4) stw r2, 52(r4) stw r2, 56(r4) stw r2, 60(r4) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit instead of: .LBB_fbSolidFillmmx_2: ; no_exit li r11, 0 stw r11, 0(r4) stw r11, 4(r4) stwx r11, r10, r4 add r12, r10, r4 stw r11, 4(r12) stwx r11, r9, r4 add r12, r9, r4 stw r11, 4(r12) stwx r11, r8, r4 add r12, r8, r4 stw r11, 4(r12) stwx r11, r7, r4 add r12, r7, r4 stw r11, 4(r12) stwx r11, r6, r4 add r12, r6, r4 stw r11, 4(r12) stwx r11, r5, r4 add r12, r5, r4 stw r11, 4(r12) stwx r11, r2, r4 add r12, r2, r4 stw r11, 4(r12) addi r4, r4, 64 addi r3, r3, -64 cmpwi cr0, r3, 63 bgt .LBB_fbSolidFillmmx_2 ; no_exit llvm-svn: 22737	2005-08-09 23:39:36 +00:00
Chris Lattner	02742710f3	SCEVAddExpr::get() of an empty list is invalid. llvm-svn: 22724	2005-08-09 01:13:47 +00:00
Chris Lattner	a091ff1764	Implement: LoopStrengthReduce/share_ivs.ll Two changes: * Only insert one PHI node for each stride. Other values are live in values. This cannot introduce higher register pressure than the previous approach, and can take advantage of reg+reg addressing modes. * Factor common base values out of uses before moving values from the base to the immediate fields. This improves codegen by starting the stride-specific PHI node out at a common place for each IV use. As an example, we used to generate this for a loop in swim: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfd f0, 0(r8) stfd f0, 0(r3) lfd f0, 0(r6) stfd f0, 0(r7) lfd f0, 0(r2) stfd f0, 0(r5) addi r9, r9, 1 addi r2, r2, 8 addi r5, r5, 8 addi r6, r6, 8 addi r7, r7, 8 addi r8, r8, 8 addi r3, r3, 8 cmpw cr0, r9, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 now we emit: .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2: ; no_exit.7.i lfdx f0, r8, r2 stfdx f0, r9, r2 lfdx f0, r5, r2 stfdx f0, r7, r2 lfdx f0, r3, r2 stfdx f0, r6, r2 addi r10, r10, 1 addi r2, r2, 8 cmpw cr0, r10, r4 bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1 As another more dramatic example, we used to emit this: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfd f0, 8(r21) lfd f4, 8(r3) lfd f5, 8(r27) lfd f6, 8(r22) lfd f7, 8(r5) lfd f8, 8(r6) lfd f9, 8(r30) lfd f10, 8(r11) lfd f11, 8(r12) fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfd f0, 8(r4) lfd f0, 8(r25) lfd f5, 8(r26) lfd f6, 8(r23) lfd f9, 8(r28) lfd f10, 8(r10) lfd f12, 8(r9) lfd f13, 8(r29) fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfd f0, 8(r24) lfd f0, 8(r8) fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfd f0, 8(r2) addi r20, r20, 1 addi r2, r2, 8 addi r8, r8, 8 addi 
r10, r10, 8 addi r12, r12, 8 addi r6, r6, 8 addi r29, r29, 8 addi r28, r28, 8 addi r26, r26, 8 addi r25, r25, 8 addi r24, r24, 8 addi r5, r5, 8 addi r23, r23, 8 addi r22, r22, 8 addi r3, r3, 8 addi r9, r9, 8 addi r11, r11, 8 addi r30, r30, 8 addi r27, r27, 8 addi r21, r21, 8 addi r4, r4, 8 cmpw cr0, r20, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 we now emit: .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2: ; no_exit.1.i19 lfdx f0, r21, r20 lfdx f4, r3, r20 lfdx f5, r27, r20 lfdx f6, r22, r20 lfdx f7, r5, r20 lfdx f8, r6, r20 lfdx f9, r30, r20 lfdx f10, r11, r20 lfdx f11, r12, r20 fsub f10, f10, f11 fadd f5, f4, f5 fmul f5, f5, f1 fadd f6, f6, f7 fadd f6, f6, f8 fadd f6, f6, f9 fmadd f0, f5, f6, f0 fnmsub f0, f10, f2, f0 stfdx f0, r4, r20 lfdx f0, r25, r20 lfdx f5, r26, r20 lfdx f6, r23, r20 lfdx f9, r28, r20 lfdx f10, r10, r20 lfdx f12, r9, r20 lfdx f13, r29, r20 fsub f11, f13, f11 fadd f4, f4, f5 fmul f4, f4, f1 fadd f5, f6, f9 fadd f5, f5, f10 fadd f5, f5, f12 fnmsub f0, f4, f5, f0 fnmsub f0, f11, f3, f0 stfdx f0, r24, r20 lfdx f0, r8, r20 fsub f4, f7, f8 fsub f5, f12, f10 fnmsub f0, f5, f2, f0 fnmsub f0, f4, f3, f0 stfdx f0, r2, r20 addi r19, r19, 1 addi r20, r20, 8 cmpw cr0, r19, r7 bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1 llvm-svn: 22722	2005-08-09 00:18:09 +00:00
Chris Lattner	37c24cc98c	Suck the base value out of the UsersToProcess vector into the BasedUser class to simplify the code. Fuse two loops. llvm-svn: 22721	2005-08-08 22:56:21 +00:00
Chris Lattner	37ed895bf1	Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The first is a correctness thing, and the latter is an optimization thing. This also is needed to support a future change. llvm-svn: 22720	2005-08-08 22:32:34 +00:00
Chris Lattner	14203e85b2	Not all constants are legal immediates in load/store instructions. llvm-svn: 22704	2005-08-08 06:25:50 +00:00
Chris Lattner	c70bbc0c41	Implement LoopStrengthReduce/share_code_in_preheader.ll by having one rewriter for all code inserted into the preheader, which is never flushed. llvm-svn: 22702	2005-08-08 05:47:49 +00:00
Chris Lattner	9bfa6f8784	Implement a simple optimization for the termination condition of the loop. The termination condition actually wants to use the post-incremented value of the loop, not a new indvar with an unusual base. On PPC, for example, this allows us to compile LoopStrengthReduce/exit_compare_live_range.ll to: _foo: li r2, 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r2, r2, 1 cmpw cr0, r2, r4 bne .LBB_foo_1 ; no_exit blr instead of: _foo: li r2, 1 ;; IV starts at 1, not 0 .LBB_foo_1: ; no_exit li r5, 0 stw r5, 0(r3) addi r5, r2, 1 cmpw cr0, r2, r4 or r2, r5, r5 ;; Reg-reg copy, extra live range bne .LBB_foo_1 ; no_exit blr This implements LoopStrengthReduce/exit_compare_live_range.ll llvm-svn: 22699	2005-08-08 05:28:22 +00:00
Chris Lattner	11e7a5eda7	Make sure to clean CastedPointers after casts are potentially deleted. This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas llvm-svn: 22673	2005-08-05 01:30:11 +00:00
Chris Lattner	45f8b6e7aa	Modify how immediates are removed from base expressions to deal with the fact that the symbolic evaluator is not always able to use subtraction to remove expressions. This makes the code faster, and fixes the last crash on 178.galgel. Finally, add a statistic to see how many phi nodes are inserted. On 178.galgel, we get the following stats: 2562 loop-reduce - Number of PHIs inserted 3927 loop-reduce - Number of GEPs strength reduced llvm-svn: 22662	2005-08-04 22:34:05 +00:00
Chris Lattner	a6d7c355bc	* Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase method. * Fix a crash on 178.galgel, where we would insert expressions before PHI nodes instead of into the PHI node predecessor blocks. llvm-svn: 22657	2005-08-04 20:03:32 +00:00
Chris Lattner	0f7c0fa2a7	Fix a case that caused this to crash on 178.galgel llvm-svn: 22653	2005-08-04 19:26:19 +00:00
Chris Lattner	acc42c4df1	Teach LSR about loop-variant expressions, such as loops like this: for (i = 0; i < N; ++i) A[i][foo()] = 0; here we still want to strength reduce the A[i] part, even though foo() is l-v. This also simplifies some of the 'CanReduce' logic. This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll llvm-svn: 22652	2005-08-04 19:08:16 +00:00
Nate Begeman	456044b724	Remove some more dead code. llvm-svn: 22650	2005-08-04 18:13:56 +00:00
Chris Lattner	eaf24725b2	Refactor this code substantially with the following improvements: 1. We only analyze instructions once, guaranteed 2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with something much simpler. The next step is to handle expressions that are not all indvar+loop-invariant values (e.g. handling indvar+loopvariant). llvm-svn: 22649	2005-08-04 17:40:30 +00:00
Chris Lattner	6f286b760f	refactor some code llvm-svn: 22643	2005-08-04 01:19:13 +00:00
Chris Lattner	6510749050	invert two if's to make the logic simpler llvm-svn: 22641	2005-08-04 00:40:47 +00:00
Chris Lattner	a0102fbc4f	When processing outer loops and we find uses of an IV in inner loops, make sure to handle the use, just don't recurse into it. This permits us to generate this code for a simple nested loop case: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r29, 44(r1) stw r30, 40(r1) mflr r11 stw r11, 56(r1) lis r2, ha16(L_A$non_lazy_ptr) lwz r30, lo16(L_A$non_lazy_ptr)(r2) li r29, 1 .LBB_foo_1: ; no_exit.0 bl L_bar$stub li r2, 1 or r3, r30, r30 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r3) stfd f0, 0(r3) addi r4, r2, 1 addi r3, r3, 8 cmpwi cr0, r2, 100 or r2, r4, r4 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r30, r30, 800 addi r2, r29, 1 cmpwi cr0, r29, 100 or r29, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 40(r1) lwz r29, 44(r1) lwz r1, 0(r1) blr instead of this: _foo: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r28, 44(r1) ;; uses an extra register. stw r29, 40(r1) stw r30, 36(r1) mflr r11 stw r11, 56(r1) li r30, 1 li r29, 0 or r28, r29, r29 .LBB_foo_1: ; no_exit.0 bl L_bar$stub mulli r2, r28, 800 ;; unstrength-reduced multiply lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 mulli r4, r29, 800 ;; unstrength-reduced multiply addi r3, r3, 8 add r3, r4, r3 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 ;; multiple stride 8 IV's addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r28, r28, 1 ;;; Many IV's with stride 1 addi r29, r29, 1 addi r2, r30, 1 cmpwi cr0, r30, 100 or r30, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 36(r1) lwz r29, 40(r1) lwz r28, 44(r1) lwz r1, 0(r1) blr llvm-svn: 22640	2005-08-04 00:14:11 +00:00
Chris Lattner	fc62470466	Teach loop-reduce to see into nested loops, to pull out immediate values pushed down by SCEV. In a nested loop case, this allows us to emit this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 li r3, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r2) ;; Uses offset of 8 instead of 0 stfd f0, 0(r2) addi r4, r3, 1 addi r2, r2, 8 cmpwi cr0, r3, 100 or r3, r4, r4 bne .LBB_foo_2 ; no_exit.1 instead of this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 addi r3, r3, 8 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 llvm-svn: 22639	2005-08-03 23:44:42 +00:00
Chris Lattner	bb78c97e24	improve debug output llvm-svn: 22638	2005-08-03 23:30:08 +00:00
Chris Lattner	db23c74e5e	Move from Stage 0 to Stage 1. Only emit one PHI node for IV uses with identical bases and strides (after moving foldable immediates to the load/store instruction). This implements LoopStrengthReduce/dont_insert_redundant_ops.ll, allowing us to generate this PPC code for test1: or r30, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r30) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop instead of this code: or r30, r3, r3 or r29, r3, r3 .LBB_test1_1: ; Loop li r2, 0 stw r2, 0(r29) stw r2, 4(r30) bl L_pred$stub addi r30, r30, 8 ;; Two iv's with step of 8 addi r29, r29, 8 cmplwi cr0, r3, 0 bne .LBB_test1_1 ; Loop llvm-svn: 22635	2005-08-03 22:51:21 +00:00
Chris Lattner	430d0022df	Rename IVUse to IVUsersOfOneStride, use a struct instead of a pair to unify some parallel vectors and get field names more descriptive than "first" and "second". This isn't lisp after all :) llvm-svn: 22633	2005-08-03 22:21:05 +00:00
Chris Lattner	84e9baa925	Fix a nasty dangling pointer issue. The ScalarEvolution pass would keep a map from instruction* to SCEVHandles. When we delete instructions, we have to tell it about it. We would run into nasty cases where new instructions were reallocated at old instruction addresses and get the old map values. Bad bad bad :( llvm-svn: 22632	2005-08-03 21:36:09 +00:00
Chris Lattner	351b891cbc	Like the comment says, do not insert cast instructions before phi nodes llvm-svn: 22586	2005-08-02 03:31:14 +00:00
Chris Lattner	75a44e154e	add a comment, make a check more lenient llvm-svn: 22581	2005-08-02 02:52:02 +00:00
Chris Lattner	dcce49e006	Simplify for loop, clear a per-loop map after processing each loop llvm-svn: 22580	2005-08-02 02:44:31 +00:00
Chris Lattner	9ef1294210	Add a comment Make LSR ignore GEP's that have loop variant base values, as we currently cannot codegen them llvm-svn: 22576	2005-08-02 01:32:29 +00:00
Chris Lattner	564900e5e5	Fix an iterator invalidation problem llvm-svn: 22575	2005-08-02 00:41:11 +00:00
Jeff Cohen	546fd5944e	Keep tabs and trailing spaces out. llvm-svn: 22565	2005-07-30 18:33:25 +00:00
Jeff Cohen	c500991055	Fix VC++ build problems. llvm-svn: 22564	2005-07-30 18:22:27 +00:00
Nate Begeman	17a0e2afea	Ack, typo llvm-svn: 22560	2005-07-30 00:21:31 +00:00
Nate Begeman	e68bcd1946	Commit a new LoopStrengthReduce pass that can use scalar evolutions and target data to decide which loop induction variables to strength reduce and how to do so. This work is mostly by Chris Lattner, with tweaks by me to get it working on some of MultiSource. llvm-svn: 22558	2005-07-30 00:15:07 +00:00
Misha Brukman	b1c9317bb4	Remove trailing whitespace llvm-svn: 21427	2005-04-21 23:48:37 +00:00

562 Commits