llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	fdea2e420c	[loop-unroll] Factor out code to update LoopInfo (NFC). Move the code to update LoopInfo for cloned basic blocks to addClonedBlockToLoopInfo, as suggested in https://reviews.llvm.org/D28482. llvm-svn: 291614	2017-01-10 23:24:54 +00:00
Philip Reames	fac031a178	Add a comment for a todo in LoopUnroll post cleanup llvm-svn: 290769	2016-12-30 22:10:19 +00:00
Haicheng Wu	b29dd0107c	[LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC. Make it clear that TripCount is the upper bound of the iteration on which control exits LatchBlock. Differential Revision: https://reviews.llvm.org/D26675 llvm-svn: 290199	2016-12-20 20:23:48 +00:00
Daniel Jasper	aec2fa352f	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086	2016-12-19 08:22:17 +00:00
Hal Finkel	3ca4a6bcf1	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756	2016-12-15 03:02:15 +00:00
Michael Kuperstein	b151a641aa	[LoopUnroll] Implement profile-based loop peeling This implements PGO-driven loop peeling. The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we can expect a performance win by peeling off the first several iterations of that loop. Unlike unrolling based on a known trip count, or a trip count multiple, this doesn't save us the conditional check and branch on each iteration. However, it does allow us to simplify the straight-line code we get (constant-folding, etc.). This is important given that we know that we will usually only hit this code, and not the actual loop. This is currently disabled by default. Differential Revision: https://reviews.llvm.org/D25963 llvm-svn: 288274	2016-11-30 21:13:57 +00:00
John Brawn	84b21835f1	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818	2016-10-21 11:08:48 +00:00
Haicheng Wu	1ef17e90b2	Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop" Reapply r284044 after revert in r284051. Krzysztof fixed the error in r284049. The original summary: This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. llvm-svn: 284053	2016-10-12 21:29:38 +00:00
Haicheng Wu	45e4ef737d	Revert "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop" This reverts commit r284044. llvm-svn: 284051	2016-10-12 21:02:22 +00:00
Haicheng Wu	6cac34fd41	[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop This patch tries to fully unroll loops having break statement like this for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. Differential Revision: https://reviews.llvm.org/D24790 llvm-svn: 284044	2016-10-12 20:24:32 +00:00
Adam Nemet	f57cc62abf	[LoopUnroll] Port to the new streaming interface for opt remarks. llvm-svn: 282834	2016-09-30 03:44:16 +00:00
David Majnemer	110522bc0f	[LoopUnroll] Don't clear out the AssumptionCache on each loop Clearing out the AssumptionCache can cause us to rescan the entire function for assumes. If there are many loops, then we are scanning over the entire function many times. Instead of clearing out the AssumptionCache, register all cloned assumes. llvm-svn: 278854	2016-08-16 21:09:46 +00:00
David Majnemer	0a16c22846	Use range algorithms instead of unpacking begin/end No functionality change is intended. llvm-svn: 278417	2016-08-11 21:15:00 +00:00
Michael Zolotukhin	2f50725dbd	[LoopUnroll] Simplify loops created by unrolling. Summary: Currently loop-unrolling doesn't preserve loop-simplified form. This patch fixes it by resimplifying affected loops. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23148 llvm-svn: 278038	2016-08-08 19:02:15 +00:00
Michael Zolotukhin	b2738e41bf	[LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its original value. As agreed in post-commit review of r265388, I'm switching the flag to its original value until the 90% runtime performance regression on SingleSource/Benchmarks/Stanford/Bubblesort is addressed. llvm-svn: 277524	2016-08-02 21:24:14 +00:00
Adam Nemet	12937c361f	[LoopUnroll] Include hotness of region in opt remark LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter is added to the common function analysis passes that loop passes depend on. The BFI and indirectly BPI used in this pass is computed lazily so no overhead should be observed unless -pass-remarks-with-hotness is used. This is how the patch affects the O3 pipeline: Dominator Tree Construction Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Rotate Loops Loop Invariant Code Motion Unswitch loops Simplify the CFG Dominator Tree Construction Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Combine redundant instructions Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Induction Variable Simplification Recognize loop idioms Delete dead loops Unroll loops ... llvm-svn: 277203	2016-07-29 19:29:47 +00:00
Davide Italiano	cd96cfd8df	[PM] Port LoopSimplify to the new pass manager. While here move simplifyLoop() function to the new header, as suggested by Chandler in the review. Differential Revision: http://reviews.llvm.org/D21404 llvm-svn: 274959	2016-07-09 03:03:01 +00:00
David Majnemer	b8da3a2bb2	Reinstate r273711 r273711 was reverted by r273743. The inliner needs to know about any call sites in the inlined function. These were obscured if we replaced a call to undef with an undef but kept the call around. This fixes PR28298. llvm-svn: 273753	2016-06-25 00:04:10 +00:00
Nico Weber	ae2ef4ccd4	Revert r273711, it caused PR28298. llvm-svn: 273743	2016-06-24 22:52:39 +00:00
David Majnemer	3b3e954ea2	SimplifyInstruction does not imply DCE We cannot remove an instruction with no uses just because SimplifyInstruction succeeds. It may have side effects. llvm-svn: 273711	2016-06-24 19:34:46 +00:00
Michael Zolotukhin	aa547616d2	[LoopUnroll] Check that DT is available before trying to verify it. llvm-svn: 272221	2016-06-08 22:49:59 +00:00
Evgeny Stupachenko	ea2aef4a1d	The patch refactors unroll pass. Summary: Unroll factor (Count) calculations moved to a new function. Early exits on pragma and "-unroll-count" defined factor added. New type of unrolling "Force" introduced (previously used implicitly). New unroll preference "AllowRemainder" introduced and set "true" by default. (should be set to false for architectures that suffers from it). Reviewers: hfinkel, mzolotukhin, zzheng Differential Revision: http://reviews.llvm.org/D19553 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 271071	2016-05-27 23:15:06 +00:00
Justin Lebar	50deb6d028	Minor formatting fixes in LoopUnroll.cpp. llvm-svn: 268995	2016-05-10 00:31:23 +00:00
Michael Zolotukhin	56ad4048ae	Follow-up for r265605: don't mutate vector we're iterating. llvm-svn: 265625	2016-04-07 00:09:42 +00:00
Michael Zolotukhin	97567e141e	[LoopUnroll] Fix the way we update DT after complete unrolling. Updating dominators for exit-blocks of the unrolled loops is not enough, as shown in PR27157. The proper way is to update dominators for all dominance-children of original loop blocks. llvm-svn: 265605	2016-04-06 21:47:12 +00:00
David L Kreitzer	188de5ae69	Adds the ability to use an epilog remainder loop during loop unrolling and makes this the default behavior. Patch by Evgeny Stupachenko (evstupac@gmail.com). Differential Revision: http://reviews.llvm.org/D18158 llvm-svn: 265388	2016-04-05 12:19:35 +00:00
Eric Christopher	257338ff0f	Use some braces to format this a little better. llvm-svn: 263527	2016-03-15 03:01:31 +00:00
Eric Christopher	ee00abe5e6	Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest parentheses around '&&' within '||' [-Werror=parentheses]. llvm-svn: 263525	2016-03-15 02:19:06 +00:00
Justin Lebar	6827de19b2	[LoopUnroll] Respect the convergent attribute. Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have to emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509	2016-03-14 23:15:34 +00:00
Sanjay Patel	eaf06851d0	rangify, fix function names; NFCI llvm-svn: 262940	2016-03-08 17:12:32 +00:00
Sanjay Patel	5b8d741632	don't repeat function names in documentation comments; NFC llvm-svn: 262937	2016-03-08 16:26:39 +00:00
Michael Zolotukhin	792a885537	Follow up for r261597: Add the * to the auto. llvm-svn: 261600	2016-02-23 00:57:48 +00:00
Michael Zolotukhin	4fdf974e3e	Follow-up for r261595: use range loop. llvm-svn: 261597	2016-02-23 00:48:44 +00:00
Michael Zolotukhin	de19ed1eb1	[LoopUnroll] Avoid unnecessary DT recomputation. Summary: When we completely unroll a loop, it's pretty easy to update DT in-place and thus avoid rebuilding it. DT recalculation is one of the most time-consuming tasks in loop-unroll, so avoiding it at least in case of full unroll should be beneficial. On some extreme (but still real-world) tests this patch improves compile time by ~2x. Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc Subscribers: joker.eph, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D17473 llvm-svn: 261595	2016-02-23 00:30:50 +00:00
Michael Zolotukhin	d734bea8ba	[LoopUnrolling] Fix a bug introduced in r259869 (PR26688). The issue was that we only required LCSSA rebuilding if the immediate parent-loop had values used outside of it. The fix is to enable the same logic for all outer loops, not only immediate parent. llvm-svn: 261575	2016-02-22 21:21:45 +00:00
Michael Zolotukhin	73957179d3	[LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible. In r255133 (reapplied r253126) we started to avoid redundant recomputation of LCSSA after loop-unrolling. This patch moves one step further in this direction - now we can avoid it for much wider range of loops, as we start to look at IR and try to figure out if the transformation actually breaks LCSSA phis or makes it necessary to insert new ones. Differential Revision: http://reviews.llvm.org/D16838 llvm-svn: 259869	2016-02-05 02:17:36 +00:00
Justin Bogner	e9fb228d59	LoopInfo: Simplify ownership of Loop objects It's strange that LoopInfo mostly owns the Loop objects, but that it defers deleting them to the loop pass manager. Instead, change the oddly named "updateUnloop" to "markAsRemoved" and have it queue the Loop object for deletion. We can't delete the Loop immediately when we remove it, since we need its pointer identity still, so we'll mark the object as "invalid" so that clients can see what's going on. llvm-svn: 257191	2016-01-08 19:08:53 +00:00
Justin Bogner	883a3ea67f	LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC As of r255720, the loop pass manager will DTRT when passes update the loop info for removed loops, so they no longer need to reach into LPPassManager APIs to do this kind of transformation. This change very nearly removes the need for the LPPassManager to even be passed into loop passes - the only remaining pass that uses the LPM argument is LoopUnswitch. llvm-svn: 255797	2015-12-16 18:40:20 +00:00
Justin Bogner	843fb204b7	LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this: - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do. - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available. - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there. Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable. llvm-svn: 255669	2015-12-15 19:40:57 +00:00
Michael Zolotukhin	78760ee73d	Revert "Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible."" The bug in IndVarSimplify was fixed in r254976, r254977, so I'm reapplying the original patch for avoiding redundant LCSSA recomputation. This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466. llvm-svn: 255133	2015-12-09 18:20:28 +00:00
Michael Zolotukhin	6c11c04db3	Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible." The change exposed a bug in IndVarSimplify (PR25578), which led to a failure (PR25538). When the bug is fixed, this patch can be reapplied. The tests are kept in tree, as they're useful anyway, and will not break with this revert. llvm-svn: 253596	2015-11-19 20:28:32 +00:00
Michael Zolotukhin	927bdba29d	[PR25538]: Fix a failure caused by r253126. In r253126 we stopped to recompute LCSSA after loop unrolling in all cases, except the unrolling is full and at least one of the loop exits is outside the parent loop. In other cases the transformation should not break LCSSA, but it turned out, that we also call SimplifyLoop on the parent loop, which might break LCSSA by itself. This fix just triggers LCSSA recomputation in this case as well. I'm committing it without a test case for now, but I'll try to invent one. It's a bit tricky because in an isolated test LoopSimplify would be scheduled before LoopUnroll, and thus will change the test and hide the problem. llvm-svn: 253253	2015-11-16 21:17:26 +00:00
Michael Zolotukhin	8ef44f93ca	Don't recompute LCSSA after loop-unrolling when possible. Summary: Currently we always recompute LCSSA for outer loops after unrolling an inner loop. That leads to compile time problem when we have big loop nests, and we can solve it by avoiding unnecessary work. For instance, if we only do partial unrolling, we don't break LCSSA, so we don't need to rebuild it. Also, if all exits from the inner loop are inside the enclosing loop, then complete unrolling won't break LCSSA either. I replaced unconditional LCSSA recomputation with conditional recomputation + unconditional assert and added several tests, which were failing when I experimented with it. Soon I plan to follow up with a similar patch for recalculation of dominators tree. Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14526 llvm-svn: 253126	2015-11-14 05:51:41 +00:00
Duncan P. N. Exon Smith	5b4c837c58	TransformUtils: Remove implicit ilist iterator conversions, NFC Continuing the work from last week to remove implicit ilist iterator conversions. First related commit was probably r249767, with some more motivation in r249925. This edition gets LLVMTransformUtils compiling without the implicit conversions. No functional change intended. llvm-svn: 250142	2015-10-13 02:39:05 +00:00
Sanjoy Das	5c8bead46d	[IndVars] Don't break dominance in `eliminateIdentitySCEV` Summary: After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and Y are related in the dominator tree, even if X is an operand to Y (I've included a toy example in comments, and a real example as a test case). This commit changes `SimplifyIndVar` to require a `DominatorTree`. I don't think this is a problem because `ScalarEvolution` requires it anyway. Fixes PR25051. Depends on D13459. Reviewers: atrick, hfinkel Subscribers: joker.eph, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13460 llvm-svn: 249471	2015-10-06 21:44:49 +00:00
Michael Zolotukhin	d56ee06d1f	[Unroll] When completely unrolling the loop, replace conditional branches with unconditional. Nothing is expected to change, except we do less redundant work in clean-up. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12951 llvm-svn: 248444	2015-09-23 23:12:43 +00:00
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Chandler Carruth	96ada25bf3	[PM/AA] Remove all of the dead AliasAnalysis pointers being threaded through APIs that are no longer necessary now that the update API has been removed. This will make changes to the AA interfaces significantly less disruptive (I hope). Either way, it seems like a really nice cleanup. llvm-svn: 242882	2015-07-22 09:52:54 +00:00
Sanjoy Das	e178f46965	[LoopUnrollRuntime] Avoid high-cost trip count computation. Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive. Depends on D8993. Reviewers: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8994 llvm-svn: 234846	2015-04-14 03:20:38 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
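
Note on r284044 / r284053 (upper-bound full unrolling): the loop quoted in those messages has a data-dependent exit, so its exact trip count is unknown, but it can never run more than 8 times. The sketch below is a hand-written, source-level illustration of what fully unrolling such a loop means. It is not output of the pass, and the function names containsLoop, containsUnrolled, and checkAgreement are hypothetical, chosen only for this example.

```cpp
#include <cassert>

// Original form from the commit message: at most 8 iterations, but the
// exact trip count depends on where (or whether) `value` occurs.
bool containsLoop(const int (&a)[8], int value) {
  bool found = false;
  for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
      found = true;
      break;
    }
  }
  return found;
}

// Hand-unrolled equivalent: one copy of the body per possible iteration.
// Each copy keeps its early exit, and the back edge is gone.
bool containsUnrolled(const int (&a)[8], int value) {
  if (a[0] == value) return true;
  if (a[1] == value) return true;
  if (a[2] == value) return true;
  if (a[3] == value) return true;
  if (a[4] == value) return true;
  if (a[5] == value) return true;
  if (a[6] == value) return true;
  if (a[7] == value) return true;
  return false;
}

// Checks that both forms agree for values that are present and absent.
void checkAgreement() {
  const int a[8] = {3, 1, 4, 1, 5, 9, 2, 6};
  for (int v = 0; v <= 10; ++v)
    assert(containsLoop(a, v) == containsUnrolled(a, v));
}
```

The unrolled form still tests the exit condition in every copy; the saving is the removed induction variable, bound check, and back edge, plus whatever later simplification becomes possible on straight-line code.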


124 Commits
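
Note on r265388 (epilog remainder loop): runtime unrolling handles a trip count that is only known at run time by running an unrolled main loop and a separate remainder loop for the leftover iterations. The sketch below shows the epilog arrangement at source level with an unroll count of 2. It is an illustration under assumptions, not the code the pass generates; sumSimple and sumUnrolledEpilog are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference loop: one element per iteration.
long sumSimple(const std::vector<int>& v) {
  long total = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    total += v[i];
  return total;
}

// Unrolled by 2 with an epilog remainder loop.
long sumUnrolledEpilog(const std::vector<int>& v) {
  const std::size_t n = v.size();
  std::size_t i = 0;
  long total = 0;
  // Main loop: two copies of the body per trip, runs while a full pair remains.
  for (; i + 1 < n; i += 2) {
    total += v[i];
    total += v[i + 1];
  }
  // Epilog: executes the leftover iteration (at most one for a count of 2).
  for (; i < n; ++i)
    total += v[i];
  return total;
}
```

With a prolog (the "prelude" described in r263509) the leftover iterations run before the main loop instead of after it; the epilog form keeps the first iterations inside the unrolled loop.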