Commit Graph

174 Commits

Author SHA1 Message Date
Florian Hahn fdea2e420c [loop-unroll] Factor out code to update LoopInfo (NFC).
Move the code to update LoopInfo for cloned basic blocks to
addClonedBlockToLoopInfo, as suggested in 
https://reviews.llvm.org/D28482.

llvm-svn: 291614
2017-01-10 23:24:54 +00:00
Philip Reames fac031a178 Add a comment for a todo in LoopUnroll post cleanup
llvm-svn: 290769
2016-12-30 22:10:19 +00:00
Haicheng Wu b29dd0107c [LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC.
Make it clear that TripCount is the upper bound of the iteration on which
control exits LatchBlock.

Differential Revision: https://reviews.llvm.org/D26675

llvm-svn: 290199
2016-12-20 20:23:48 +00:00
Daniel Jasper aec2fa352f Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Hal Finkel 3ca4a6bcf1 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Michael Kuperstein b151a641aa [LoopUnroll] Implement profile-based loop peeling
This implements PGO-driven loop peeling.

The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we can expect a performance win by peeling off the
first several iterations of that loop.
Unlike unrolling based on a known trip count, or a trip count multiple, this
doesn't save us the conditional check and branch on each iteration. However,
it does allow us to simplify the straight-line code we get (constant-folding,
etc.). This is important given that we know that we will usually only hit this
code, and not the actual loop.

This is currently disabled by default.

Differential Revision: https://reviews.llvm.org/D25963

llvm-svn: 288274
2016-11-30 21:13:57 +00:00
John Brawn 84b21835f1 [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682

llvm-svn: 284818
2016-10-21 11:08:48 +00:00
Haicheng Wu 1ef17e90b2 Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

llvm-svn: 284053
2016-10-12 21:29:38 +00:00
Haicheng Wu 45e4ef737d Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
This reverts commit r284044.

llvm-svn: 284051
2016-10-12 21:02:22 +00:00
Haicheng Wu 6cac34fd41 [LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop
This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

Differential Revision: https://reviews.llvm.org/D24790

llvm-svn: 284044
2016-10-12 20:24:32 +00:00
Adam Nemet f57cc62abf [LoopUnroll] Port to the new streaming interface for opt remarks.
llvm-svn: 282834
2016-09-30 03:44:16 +00:00
David Majnemer 110522bc0f [LoopUnroll] Don't clear out the AssumptionCache on each loop
Clearing out the AssumptionCache can cause us to rescan the entire
function for assumes.  If there are many loops, then we are scanning
over the entire function many times.

Instead of clearing out the AssumptionCache, register all cloned
assumes.

llvm-svn: 278854
2016-08-16 21:09:46 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Michael Zolotukhin 2f50725dbd [LoopUnroll] Simplify loops created by unrolling.
Summary:
Currently loop-unrolling doesn't preserve loop-simplified form. This patch
fixes it by resimplifying affected loops.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23148

llvm-svn: 278038
2016-08-08 19:02:15 +00:00
Michael Zolotukhin b2738e41bf [LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its original value.
As agreed in post-commit review of r265388, I'm switching the flag to
its original value until the 90% runtime performance regression on
SingleSource/Benchmarks/Stanford/Bubblesort is addressed.

llvm-svn: 277524
2016-08-02 21:24:14 +00:00
Adam Nemet 12937c361f [LoopUnroll] Include hotness of region in opt remark
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.

The BFI and indirectly BPI used in this pass is computed lazily so no
overhead should be observed unless -pass-remarks-with-hotness is used.

This is how the patch affects the O3 pipeline:

         Dominator Tree Construction
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Rotate Loops
           Loop Invariant Code Motion
           Unswitch loops
         Simplify the CFG
         Dominator Tree Construction
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Combine redundant instructions
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Induction Variable Simplification
           Recognize loop idioms
           Delete dead loops
           Unroll loops
...

llvm-svn: 277203
2016-07-29 19:29:47 +00:00
Davide Italiano cd96cfd8df [PM] Port LoopSimplify to the new pass manager.
While here move simplifyLoop() function to the new header, as
suggested by Chandler in the review.

Differential Revision:  http://reviews.llvm.org/D21404

llvm-svn: 274959
2016-07-09 03:03:01 +00:00
David Majnemer b8da3a2bb2 Reinstate r273711
r273711 was reverted by r273743.  The inliner needs to know about any
call sites in the inlined function.  These were obscured if we replaced
a call to undef with an undef but kept the call around.

This fixes PR28298.

llvm-svn: 273753
2016-06-25 00:04:10 +00:00
Nico Weber ae2ef4ccd4 Revert r273711, it caused PR28298.
llvm-svn: 273743
2016-06-24 22:52:39 +00:00
David Majnemer 3b3e954ea2 SimplifyInstruction does not imply DCE
We cannot remove an instruction with no uses just because
SimplifyInstruction succeeds.  It may have side effects.

llvm-svn: 273711
2016-06-24 19:34:46 +00:00
Michael Zolotukhin aa547616d2 [LoopUnroll] Check that DT is available before trying to verify it.
llvm-svn: 272221
2016-06-08 22:49:59 +00:00
Evgeny Stupachenko ea2aef4a1d The patch refactors unroll pass.
Summary:
Unroll factor (Count) calculations moved to a new function.
Early exits on pragma and "-unroll-count" defined factor added.
New type of unrolling "Force" introduced (previously used implicitly).
New unroll preference "AllowRemainder" introduced and set "true" by default.
(should be set to false for architectures that suffers from it).

Reviewers: hfinkel, mzolotukhin, zzheng

Differential Revision: http://reviews.llvm.org/D19553

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 271071
2016-05-27 23:15:06 +00:00
Justin Lebar 50deb6d028 Minor formatting fixes in LoopUnroll.cpp.
llvm-svn: 268995
2016-05-10 00:31:23 +00:00
Michael Zolotukhin 56ad4048ae Follow-up for r265605: don't mutate vector we're iterating.
llvm-svn: 265625
2016-04-07 00:09:42 +00:00
Michael Zolotukhin 97567e141e [LoopUnroll] Fix the way we update DT after complete unrolling.
Updating dominators for exit-blocks of the unrolled loops is not enough,
as shown in PR27157. The proper way is to update dominators for all
dominance-children of original loop blocks.

llvm-svn: 265605
2016-04-06 21:47:12 +00:00
David L Kreitzer 188de5ae69 Adds the ability to use an epilog remainder loop during loop unrolling and makes
this the default behavior.

Patch by Evgeny Stupachenko (evstupac@gmail.com).

Differential Revision: http://reviews.llvm.org/D18158

llvm-svn: 265388
2016-04-05 12:19:35 +00:00
Eric Christopher 257338ff0f Use some braces to format this a little better.
llvm-svn: 263527
2016-03-15 03:01:31 +00:00
Eric Christopher ee00abe5e6 Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest
parentheses around '&&' within '||' [-Werror=parentheses].

llvm-svn: 263525
2016-03-15 02:19:06 +00:00
Justin Lebar 6827de19b2 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have emit a prelude, which occurs when the unroll count
divides the trip multiple.

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

llvm-svn: 263509
2016-03-14 23:15:34 +00:00
Sanjay Patel eaf06851d0 rangify, fix function names; NFCI
llvm-svn: 262940
2016-03-08 17:12:32 +00:00
Sanjay Patel 5b8d741632 don't repeat function names in documentation comments; NFC
llvm-svn: 262937
2016-03-08 16:26:39 +00:00
Michael Zolotukhin 792a885537 Follow up for r261597: Add the * to the auto.
llvm-svn: 261600
2016-02-23 00:57:48 +00:00
Michael Zolotukhin 4fdf974e3e Follow-up for r261595: use range loop.
llvm-svn: 261597
2016-02-23 00:48:44 +00:00
Michael Zolotukhin de19ed1eb1 [LoopUnroll] Avoid unnecessary DT recomputation.
Summary:
When we completely unroll a loop, it's pretty easy to update DT in-place and
thus avoid rebuilding it. DT recalculation is one of the most time-consuming
tasks in loop-unroll, so avoiding it at least in case of full unroll should be
beneficial.

On some extreme (but still real-world) tests this patch improves compile time by
~2x.

Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc

Subscribers: joker.eph, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D17473

llvm-svn: 261595
2016-02-23 00:30:50 +00:00
Michael Zolotukhin d734bea8ba [LoopUnrolling] Fix a bug introduced in r259869 (PR26688).
The issue was that we only required LCSSA rebuilding if the immediate
parent-loop had values used outside of it. The fix is to enaable the
same logic for all outer loops, not only immediate parent.

llvm-svn: 261575
2016-02-22 21:21:45 +00:00
Michael Zolotukhin 73957179d3 [LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible.
In r255133 (reapplied r253126) we started to avoid redundant
recomputation of LCSSA after loop-unrolling. This patch moves one step
further in this direction - now we can avoid it for much wider range of
loops, as we start to look at IR and try to figure out if the
transformation actually breaks LCSSA phis or makes it necessary to
insert new ones.

Differential Revision: http://reviews.llvm.org/D16838

llvm-svn: 259869
2016-02-05 02:17:36 +00:00
Justin Bogner e9fb228d59 LoopInfo: Simplify ownership of Loop objects
It's strange that LoopInfo mostly owns the Loop objects, but that it
defers deleting them to the loop pass manager. Instead, change the
oddly named "updateUnloop" to "markAsRemoved" and have it queue the
Loop object for deletion. We can't delete the Loop immediately when we
remove it, since we need its pointer identity still, so we'll mark the
object as "invalid" so that clients can see what's going on.

llvm-svn: 257191
2016-01-08 19:08:53 +00:00
Justin Bogner 883a3ea67f LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC
As of r255720, the loop pass manager will DTRT when passes update the
loop info for removed loops, so they no longer need to reach into
LPPassManager APIs to do this kind of transformation. This change very
nearly removes the need for the LPPassManager to even be passed into
loop passes - the only remaining pass that uses the LPM argument is
LoopUnswitch.

llvm-svn: 255797
2015-12-16 18:40:20 +00:00
Justin Bogner 843fb204b7 LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC
A large number of loop utility functions take a `Pass *` and reach
into it to find out which analyses to preserve. There are a number of
problems with this:

- The APIs have access to pretty well any Pass state they want, so
  it's hard to tell what they may or may not do.

- Other APIs have copied these and pass around a `Pass *` even though
  they don't even use it. Some of these just hand a nullptr to the API
  since the callers don't even have a pass available.

- Passes in the new pass manager don't work like the current ones, so
  the APIs can't be used as is there.

Instead, we should explicitly thread the analysis results that we
actually care about through these APIs. This is both simpler and more
reusable.

llvm-svn: 255669
2015-12-15 19:40:57 +00:00
Michael Zolotukhin 78760ee73d Revert "Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible.""
The bug in IndVarSimplify was fixed in r254976, r254977, so I'm
reapplying the original patch for avoiding redundant LCSSA recomputation.

This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466.

llvm-svn: 255133
2015-12-09 18:20:28 +00:00
Michael Zolotukhin 6c11c04db3 Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible."
The change exposed a bug in IndVarSimplify (PR25578), which led to a
failure (PR25538). When the bug is fixed, this patch can be reapplied.

The tests are kept in tree, as they're useful anyway, and will not break
with this revert.

llvm-svn: 253596
2015-11-19 20:28:32 +00:00
Michael Zolotukhin 927bdba29d [PR25538]: Fix a failure caused by r253126.
In r253126 we stopped to recompute LCSSA after loop unrolling in all
cases, except the unrolling is full and at least one of the loop exits
is outside the parent loop. In other cases the transformation should not
break LCSSA, but it turned out, that we also call SimplifyLoop on the
parent loop, which might break LCSSA by itself. This fix just triggers
LCSSA recomputation in this case as well.

I'm committing it without a test case for now, but I'll try to invent
one. It's a bit tricky because in an isolated test LoopSimplify would
be scheduled before LoopUnroll, and thus will change the test and hide
the problem.

llvm-svn: 253253
2015-11-16 21:17:26 +00:00
Michael Zolotukhin 8ef44f93ca Don't recompute LCSSA after loop-unrolling when possible.
Summary:
Currently we always recompute LCSSA for outer loops after unrolling an
inner loop. That leads to compile time problem when we have big loop
nests, and we can solve it by avoiding unnecessary work. For instance,
if w eonly do partial unrolling, we don't break LCSSA, so we don't need
to rebuild it. Also, if all exits from the inner loop are inside the
enclosing loop, then complete unrolling won't break LCSSA either.

I replaced unconditional LCSSA recomputation with conditional recomputation +
unconditional assert and added several tests, which were failing when I
experimented with it.

Soon I plan to follow up with a similar patch for recalculation of dominators
tree.

Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14526

llvm-svn: 253126
2015-11-14 05:51:41 +00:00
Duncan P. N. Exon Smith 5b4c837c58 TransformUtils: Remove implicit ilist iterator conversions, NFC
Continuing the work from last week to remove implicit ilist iterator
conversions.  First related commit was probably r249767, with some more
motivation in r249925.  This edition gets LLVMTransformUtils compiling
without the implicit conversions.

No functional change intended.

llvm-svn: 250142
2015-10-13 02:39:05 +00:00
Sanjoy Das 5c8bead46d [IndVars] Don't break dominance in `eliminateIdentitySCEV`
Summary:
After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and
Y are related in the dominator tree, even if X is an operand to Y (I've
included a toy example in comments, and a real example as a test case).

This commit changes `SimplifyIndVar` to require a `DominatorTree`.  I
don't think this is a problem because `ScalarEvolution` requires it
anyway.

Fixes PR25051.

Depends on D13459.

Reviewers: atrick, hfinkel

Subscribers: joker.eph, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13460

llvm-svn: 249471
2015-10-06 21:44:49 +00:00
Michael Zolotukhin d56ee06d1f [Unroll] When completely unrolling the loop, replace conditinal branches with unconditional.
Nothing is expected to change, except we do less redundant work in
clean-up.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12951

llvm-svn: 248444
2015-09-23 23:12:43 +00:00
Chandler Carruth 2f1fd1658f [PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193
2015-08-17 02:08:17 +00:00
Chandler Carruth 96ada25bf3 [PM/AA] Remove all of the dead AliasAnalysis pointers being threaded
through APIs that are no longer necessary now that the update API has
been removed.

This will make changes to the AA interfaces significantly less
disruptive (I hope). Either way, it seems like a really nice cleanup.

llvm-svn: 242882
2015-07-22 09:52:54 +00:00
Sanjoy Das e178f46965 [LoopUnrollRuntime] Avoid high-cost trip count computation.
Summary:
Runtime unrolling of loops needs to emit an expression to compute the
loop's runtime trip-count.  Avoid runtime unrolling if this computation
will be expensive.

Depends on D8993.

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8994

llvm-svn: 234846
2015-04-14 03:20:38 +00:00
Benjamin Kramer 799003bf8c Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.
llvm-svn: 232998
2015-03-23 19:32:43 +00:00
Mehdi Amini a28d91d81b DataLayout is mandatory, update the API to reflect it with references.
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.

This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.

I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.

I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.

Test Plan:

Reviewers: echristo

Subscribers: llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
2015-03-10 02:37:25 +00:00
Mehdi Amini 46a43556db Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
2015-03-04 18:43:29 +00:00
Jingyue Wu 49a766e468 Resurrect the assertion removed by r227717
Summary: MSVC can compile "LoopID->getOperand(0) == LoopID" when LoopID is MDNode*.

Test Plan: no regression

Reviewers: mkuper

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7327

llvm-svn: 227853
2015-02-02 20:41:11 +00:00
Michael Kuperstein a691f3e921 Removed assert that doesn't typecheck and breaks debug MSVC build.
llvm-svn: 227717
2015-02-01 08:46:20 +00:00
Jingyue Wu 0220df0dfd [NVPTX] Emit .pragma "nounroll" for loops marked with nounroll
Summary:
CUDA driver can unroll loops when jit-compiling PTX. To prevent CUDA
driver from unrolling a loop marked with llvm.loop.unroll.disable is not
unrolled by CUDA driver, we need to emit .pragma "nounroll" at the
header of that loop.

This patch also extracts getting unroll metadata from loop ID metadata
into a shared helper function.

Test Plan: test/CodeGen/NVPTX/nounroll.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D7041

llvm-svn: 227703
2015-02-01 02:27:45 +00:00
Chandler Carruth 691addc25f [PM] Now that LoopInfo isn't in the Pass type hierarchy, it is much
cleaner to derive from the generic base.

Thise removes a ton of boiler plate code and somewhat strange and
pointless indirections. It also remove a bunch of the previously needed
friend declarations. To fully remove these, I also lifted the verify
logic into the generic LoopInfoBase, which seems good anyways -- it is
generic and useful logic even for the machine side.

llvm-svn: 226385
2015-01-18 01:25:51 +00:00
Chandler Carruth 66b3130cda [PM] Split the AssumptionTracker immutable pass into two separate APIs:
a cache of assumptions for a single function, and an immutable pass that
manages those caches.

The motivation for this change is two fold. Immutable analyses are
really hacks around the current pass manager design and don't exist in
the new design. This is usually OK, but it requires that the core logic
of an immutable pass be reasonably partitioned off from the pass logic.
This change does precisely that. As a consequence it also paves the way
for the *many* utility functions that deal in the assumptions to live in
both pass manager worlds by creating an separate non-pass object with
its own independent API that they all rely on. Now, the only bits of the
system that deal with the actual pass mechanics are those that actually
need to deal with the pass mechanics.

Once this separation is made, several simplifications become pretty
obvious in the assumption cache itself. Rather than using a set and
callback value handles, it can just be a vector of weak value handles.
The callers can easily skip the handles that are null, and eventually we
can wrap all of this up behind a filter iterator.

For now, this adds boiler plate to the various passes, but this kind of
boiler plate will end up making it possible to port these passes to the
new pass manager, and so it will end up factored away pretty reasonably.

llvm-svn: 225131
2015-01-04 12:03:27 +00:00
Bruno Cardoso Lopes bad65c3b70 [LCSSA] Handle PHI insertion in disjoint loops
Take two disjoint Loops L1 and L2.

LoopSimplify fails to simplify some loops (e.g. when indirect branches
are involved). In such situations, it can happen that an exit for L1 is
the header of L2. Thus, when we create PHIs in one of such exits we are
also inserting PHIs in L2 header.

This could break LCSSA form for L2 because these inserted PHIs can also
have uses in L2 exits, which are never handled in the current
implementation. Provide a fix for this corner case and test that we
don't assert/crash on that.

Differential Revision: http://reviews.llvm.org/D6624

rdar://problem/19166231

llvm-svn: 224740
2014-12-22 22:35:46 +00:00
David Blaikie 70573dcd9f Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool>
This is to be consistent with StringSet and ultimately with the standard
library's associative container insert function.

This lead to updating SmallSet::insert to return pair<iterator, bool>,
and then to update SmallPtrSet::insert to return pair<iterator, bool>,
and then to update all the existing users of those functions...

llvm-svn: 222334
2014-11-19 07:49:26 +00:00
Duncan P. N. Exon Smith c46cfcbbc6 LoopUnroll: Create sub-loops in LoopInfo
`LoopUnrollPass` says that it preserves `LoopInfo` -- make it so.  In
particular, tell `LoopInfo` about copies of inner loops when unrolling
the outer loop.

Conservatively, also tell `ScalarEvolution` to forget about the original
versions of these loops, since their inputs may have changed.

Fixes PR20987.

llvm-svn: 219241
2014-10-07 21:19:00 +00:00
Duncan P. N. Exon Smith 9b4d37e8f5 LoopUnroll: Only check for ScalarEvolution analysis once, NFC
A follow-up commit will add use to a tight loop.  We might as well just
find it once anyway.

llvm-svn: 219239
2014-10-07 21:12:44 +00:00
Duncan P. N. Exon Smith e5d7d9797b LoopUnroll: Change code order of changes to new basic blocks
Add new basic blocks to `LoopInfo` earlier.  No functionality change
intended (simplifies upcoming bugfix patch).

llvm-svn: 219150
2014-10-06 22:05:02 +00:00
Duncan P. N. Exon Smith 0bbf5418c6 Sink comment, NFC
llvm-svn: 219149
2014-10-06 22:04:59 +00:00
Hal Finkel 60db05896a Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.

As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.

The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.

Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.

This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).

llvm-svn: 217342
2014-09-07 18:57:58 +00:00
Hal Finkel 74c2f355d2 Add an Assumption-Tracking Pass
This adds an immutable pass, AssumptionTracker, which keeps a cache of
@llvm.assume call instructions within a module. It uses callback value handles
to keep stale functions and intrinsics out of the map, and it relies on any
code that creates new @llvm.assume calls to notify it of the new instructions.
The benefit is that code needing to find @llvm.assume intrinsics can do so
directly, without scanning the function, thus allowing the cost of @llvm.assume
handling to be negligible when none are present.

The current design is intended to be lightweight. We don't keep track of
anything until we need a list of assumptions in some function. The first time
this happens, we scan the function. After that, we add/remove @llvm.assume
calls from the cache in response to registration calls and ValueHandle
callbacks.

There are no new direct test cases for this pass, but because it calls it
validation function upon module finalization, we'll pick up detectable
inconsistencies from the other tests that touch @llvm.assume calls.

This pass will be used by follow-up commits that make use of @llvm.assume.

llvm-svn: 217334
2014-09-07 12:44:26 +00:00
Duncan P. N. Exon Smith 6c99015fe2 Revert "[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges."
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build.  I'll reply on the list in a moment.

llvm-svn: 213562
2014-07-21 17:06:51 +00:00
Manuel Jacob d11beffef4 [C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges.
Summary: This patch introduces two new iterator ranges and updates existing code to use it.  No functional change intended.

Test Plan: All tests (make check-all) still pass.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4481

llvm-svn: 213474
2014-07-20 09:10:11 +00:00
Mark Heffernan 675d401a26 Partially fix PR20058: reduce compile time for loop unrolling with very high count by reducing calls to SE->forgetLoop
llvm-svn: 212782
2014-07-10 23:30:06 +00:00
Hal Finkel a995f92627 Feeding isSafeToSpeculativelyExecute its DataLayout pointer
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferencable loads, and will also be important if/when we add a
dereferencable pointer attribute.

This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.

llvm-svn: 212720
2014-07-10 14:41:31 +00:00
Benjamin Kramer cccdadca45 Fix some Twine locals.
Two of those are use after frees. Found by clang-tidy, fixed by me.

llvm-svn: 212537
2014-07-08 14:55:06 +00:00
Dinesh Dwivedi d266cb1a0b LCSSA should be performed on the outermost affected loop while unrolling loop.
During loop-unroll, loop exits from the current loop may end up in in different
outer loop. This requires to re-form LCSSA recursively for one level down from
the outer most loop where loop exits are landed during unroll. This fixes PR18861.

Differential Revision: http://reviews.llvm.org/D2976

llvm-svn: 209796
2014-05-29 06:47:23 +00:00
Diego Novillo 7f8af8bf91 Add support for missed and analysis optimization remarks.
Summary:
This adds two new diagnostics: -pass-remarks-missed and
-pass-remarks-analysis. They take the same values as -pass-remarks but
are intended to be triggered in different contexts.

-pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed,
which passes call when they tried to apply a transformation but
couldn't.

-pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis,
which passes call when they want to inform the user about analysis
results.

The patch also:

1- Adds support in the inliner for the two new remarks and a
   test case.

2- Moves emitOptimizationRemark* functions to the llvm namespace.

3- Adds an LLVMContext argument instead of making them member functions
   of LLVMContext.

Reviewers: qcolombet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3682

llvm-svn: 209442
2014-05-22 14:19:46 +00:00
Diego Novillo 34fc8a7c4c Add optimization remarks to the loop unroller and vectorizer.
Summary:
This calls emitOptimizationRemark from the loop unroller and vectorizer
at the point where they make a positive transformation. For the
vectorizer, it reports vectorization and interleave factors. For the
loop unroller, it reports all the different supported types of
unrolling.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3456

llvm-svn: 207528
2014-04-29 14:27:31 +00:00
Craig Topper f40110f4d8 [C++] Use 'nullptr'. Transforms edition.
llvm-svn: 207196
2014-04-25 05:29:35 +00:00
Chandler Carruth 964daaaf19 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

llvm-svn: 206844
2014-04-22 02:55:47 +00:00
Chandler Carruth d84f776e8a [LPM] Fix PR18616 where the shifts to the loop pass manager to extract
LCSSA from it caused a crasher with the LoopUnroll pass.

This crasher is really nasty. We destroy LCSSA form in a suprising way.
When unrolling a loop into an outer loop, we not only need to restore
LCSSA form for the outer loop, but for all children of the outer loop.
This is somewhat obvious in retrospect, but hey!

While this seems pretty heavy-handed, it's not that bad. Fundamentally,
we only do this when we unroll a loop, which is already a heavyweight
operation. We're unrolling all of these hypothetical inner loops as
well, so their size and complexity is already on the critical path. This
is just adding another pass over them to re-canonicalize.

I have a test case from PR18616 that is great for reproducing this, but
pretty useless to check in as it relies on many 10s of nested empty
loops that get unrolled and deleted in just the right order. =/ What's
worse is that investigating this has exposed another source of failure
that is likely to be even harder to test. I'll try to come up with test
cases for these fixes, but I want to get the fixes into the tree first
as they're causing crashes in the wild.

llvm-svn: 200273
2014-01-28 01:25:38 +00:00
Chandler Carruth aa7fa5e4b2 [LPM] Make LoopSimplify no longer a LoopPass and instead both a utility
function and a FunctionPass.

This has many benefits. The motivating use case was to be able to
compute function analysis passes *after* running LoopSimplify (to avoid
invalidating them) and then to run other passes which require
LoopSimplify. Specifically passes like unrolling and vectorization are
critical to wire up to BranchProbabilityInfo and BlockFrequencyInfo so
that they can be profile aware. For the LoopVectorize pass the only
things in the way are LoopSimplify and LCSSA. This fixes LoopSimplify
and LCSSA is next on my list.

There are also a bunch of other benefits of doing this:
- It is now very feasible to make more passes *preserve* LoopSimplify
  because they can simply run it after changing a loop. Because
  subsequence passes can assume LoopSimplify is preserved we can reduce
  the runs of this pass to the times when we actually mutate a loop
  structure.
- The new pass manager should be able to more easily support loop passes
  factored in this way.
- We can at long, long last observe that LoopSimplify is preserved
  across SCEV. This *halves* the number of times we run LoopSimplify!!!

Now, getting here wasn't trivial. First off, the interfaces used by
LoopSimplify are all over the map regarding how analysis are updated. We
end up with weird "pass" parameters as a consequence. I'll try to clean
at least some of this up later -- I'll have to have it all clean for the
new pass manager.

Next up I discovered a really frustrating bug. LoopUnroll *claims* to
preserve LoopSimplify. That's actually a lie. But the way the
LoopPassManager ends up running the passes, it always ran LoopSimplify
on the unrolled-into loop, rectifying this oversight before any
verification could kick in and point out that in fact nothing was
preserved. So I've added code to the unroller to *actually* simplify the
surrounding loop when it succeeds at unrolling.

The only functional change in the test suite is that we now catch a case
that was previously missed because SCEV and other loop transforms see
their containing loops as simplified and thus don't miss some
opportunities. One test case has been converted to check that we catch
this case rather than checking that we miss it but at least don't get
the wrong answer.

Note that I have #if-ed out all of the verification logic in
LoopSimplify! This is a temporary workaround while extracting these bits
from the LoopPassManager. Currently, there is no way to have a pass in
the LoopPassManager which preserves LoopSimplify along with one which
does not. The LPM will try to verify on each loop in the nest that
LoopSimplify holds but the now-Function-pass cannot distinguish what
loop is being verified and so must try to verify all of them. The inner
most loop is clearly no longer simplified as there is a pass which
didn't even *attempt* to preserve it. =/ Once I get LCSSA out (and maybe
LoopVectorize and some other fixes) I'll be able to re-enable this check
and catch any places where we are still failing to preserve
LoopSimplify. If this causes problems I can back this out and try to
commit *all* of this at once, but so far this seems to work and allow
much more incremental progress.

llvm-svn: 199884
2014-01-23 11:23:19 +00:00
Chandler Carruth 73523021d0 [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other passes run!!!) to
directly recomputing the domtree.

llvm-svn: 199104
2014-01-13 13:07:17 +00:00
Chandler Carruth 5ad5f15cff [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that evn Clang's CFGBlock's can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which both
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

llvm-svn: 199082
2014-01-13 09:26:24 +00:00
Jakub Staszak 3ab283c157 Don't #include heavy Dominators.h file in LoopInfo.h. This change reduces
overall time of LLVM compilation by ~1%.

llvm-svn: 196667
2013-12-07 21:20:17 +00:00
NAKAMURA Takumi f9c8339a4e Utils/LoopUnroll.cpp: Tweak (StringRef)OldName to be valid until it is used, since r194601.
eraseFromParent() invalidates OldName.

llvm-svn: 194970
2013-11-17 18:05:34 +00:00
Jakub Staszak 86a7492f0d Use StringRef instead of std::string
llvm-svn: 194601
2013-11-13 20:09:11 +00:00
Benjamin Kramer 7d6052687e Replace some unnecessary vector copies with references.
llvm-svn: 190770
2013-09-15 22:04:42 +00:00
Chandler Carruth 9fb823bbd4 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Andrew Trick a6fb910fad LoopUnroll: always check for NULL LoopPassManager
llvm-svn: 158007
2012-06-05 17:51:05 +00:00
Andrew Trick d29cd732d4 Allow NULL LoopPassManager argument in UnrollLoop. PR12734.
llvm-svn: 156358
2012-05-08 02:52:09 +00:00
Andrew Trick 4442bfe559 Fix 12513: Loop unrolling breaks with indirect branches.
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.

llvm-svn: 154386
2012-04-10 05:14:42 +00:00
Andrew Trick ca3417e932 Avoid a confusing assert for silly options: -unroll-runtime -unroll-count=1.
No need for an explicit test case for an unsupported combination of options.

llvm-svn: 146721
2011-12-16 02:03:48 +00:00
Andrew Trick d04d152998 Add -unroll-runtime for unrolling loops with run-time trip counts.
Patch by Brendon Cahoon!

This extends the existing LoopUnroll and LoopUnrollPass. Brendon
measured no regressions in the llvm test suite with -unroll-runtime
enabled. This implementation works by using the existing loop
unrolling code to unroll the loop by a power-of-two (default 8). It
generates an if-then-else sequence of code prior to the loop to
execute the extra iterations before entering the unrolled loop.

llvm-svn: 146245
2011-12-09 06:19:40 +00:00
Andrew Trick 6dbb060778 Comments. Thanks for the spell check Nick!
Also, my apologies for spoiling the autocomplete on SimplifyInstructions.cpp. I couldn't think of a better filename.

llvm-svn: 137229
2011-08-10 18:07:05 +00:00
Andrew Trick 4d0040baf8 Invoke SimplifyIndVar when we partially unroll a loop. Fixes PR10534.
llvm-svn: 137203
2011-08-10 04:29:49 +00:00
Andrew Trick 78b40c3f3a Cleanup. Added LoopBlocksDFS::perform for simple clients.
llvm-svn: 137195
2011-08-10 01:59:05 +00:00
Andrew Trick b72bbe2a92 Fix the LoopUnroller to handle nontrivial loops and partial unrolling.
These are not individual bug fixes. I had to rewrite a good chunk of
the unroller to make it sane. I think it was getting lucky on trivial
completely unrolled loops with no early exits. I included some fairly
simple unit tests for partial unrolling. I didn't do much stress
testing, so it may not be perfect, but should be usable now.

llvm-svn: 137190
2011-08-10 00:28:10 +00:00
Andrew Trick 5e0ee1c7f2 LoopUnroll looks like it has some stale code. Remove it to prove my sanity and avoid further confusion.
llvm-svn: 137106
2011-08-09 03:11:29 +00:00
Andrew Trick bf69d03382 SCEV: Use AssertingVH to catch dangling BasicBlock* when passes forget
to notify SCEV of a change. Add forgetLoop in a couple of those places.

llvm-svn: 136797
2011-08-03 18:32:11 +00:00
Andrew Trick 990f771a9a Add clarifying comments for the new arguments to UnrollLoop.
llvm-svn: 135988
2011-07-25 22:17:47 +00:00
Andrew Trick 1cabe54fab Move trip count discovery outside of the generic LoopUnroll helper. This
removes its dependence on canonical induction variables.

llvm-svn: 135829
2011-07-23 00:33:05 +00:00
Andrew Trick 279e7a6c83 whitespace
llvm-svn: 135828
2011-07-23 00:29:16 +00:00
Jay Foad 61ea0e4692 Reinstate r133513 (reverted in r133700) with an additional fix for a
-Wshorten-64-to-32 warning in Instructions.h.

llvm-svn: 133708
2011-06-23 09:09:15 +00:00