Commit Graph

3443 Commits

Dehao Chen 820372c0ed revert r280432:
r280432 | dehao | 2016-09-01 16:51:37 -0700 (Thu, 01 Sep 2016) | 9 lines

Explicitly require DominatorTreeAnalysis pass for instsimplify pass.

Summary: DominatorTreeAnalysis is always required by instsimplify.
llvm-svn: 280452
2016-09-02 01:47:13 +00:00
Dehao Chen e573b77772 Explicitly require DominatorTreeAnalysis pass for instsimplify pass.
Summary: DominatorTreeAnalysis is always required by instsimplify.

Reviewers: davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24173

llvm-svn: 280432
2016-09-01 23:51:37 +00:00
Dehao Chen ddd0c125e3 Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself.
Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking.

Reviewers: chandlerc, davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24170

llvm-svn: 280427
2016-09-01 23:26:48 +00:00
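
D24170's actual interface isn't reproduced here; as a hedged illustration of the idea, a dominance-gated replacement helper with such a flag could look like the following (the name IncludeSelf and the whole signature are invented for illustration):

    #include "llvm/ADT/STLExtras.h"
    #include "llvm/IR/Dominators.h"
    #include "llvm/IR/Instruction.h"

    using namespace llvm;

    // Hypothetical sketch: replace uses of From that are dominated by Root,
    // with an invented IncludeSelf flag controlling whether uses inside Root
    // itself are replaced too. Assumes all users are instructions.
    static unsigned replaceDominatedUses(Value *From, Value *To,
                                         BasicBlock *Root, DominatorTree &DT,
                                         bool IncludeSelf) {
      unsigned Count = 0;
      for (Use &U : make_early_inc_range(From->uses())) {
        BasicBlock *UserBB = cast<Instruction>(U.getUser())->getParent();
        bool Replace =
            (UserBB == Root) ? IncludeSelf : DT.dominates(Root, UserBB);
        if (Replace) {
          U.set(To);
          ++Count;
        }
      }
      return Count;
    }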
James Molloy 88cad7e5cf [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
This was a real restriction in the original version of SinkIfThenCodeToEnd. Now that it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to [end] because no call to x() happens on the third incoming arc (assume for the sake of argument that x() has side effects; if something is safe to speculate we could indeed sink it regardless, but that cannot happen in the general case and would create many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

llvm-svn: 280364
2016-09-01 12:58:13 +00:00
James Molloy eec6df3193 [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single-pass greedy algorithm: sink instructions until one requires more than one PHI node or we run out of instructions to sink.

This had the problem that sinking instructions whose operands were non-identical but trivially the same required extra logic before we could sink them aggressively. For example:

    %a = load i32* %b          %d = load i32* %b
    %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However, it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) into two linear scans.

llvm-svn: 280351
2016-09-01 10:44:35 +00:00
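
A toy model of the scan stage (illustrative only, not the committed code; LLVM instructions are reduced to opcode-plus-operand-name records, and blocks list their instructions from the bottom upwards):

    #include <set>
    #include <string>
    #include <vector>

    struct Inst { std::string Op; std::vector<std::string> Operands; };
    using Block = std::vector<Inst>;

    // Scan stage: at each depth, the candidate instructions must agree on
    // opcode and operand count to be sinkable; any operand slot where the
    // blocks disagree would need a PHI merge if that depth were sunk.
    static std::vector<std::set<size_t>> scan(const std::vector<Block> &Blocks) {
      std::vector<std::set<size_t>> PhiOps; // per depth: slots needing PHIs
      if (Blocks.empty())
        return PhiOps;
      for (size_t D = 0;; ++D) {
        for (const Block &B : Blocks)
          if (D >= B.size() || B[D].Op != Blocks[0][D].Op ||
              B[D].Operands.size() != Blocks[0][D].Operands.size())
            return PhiOps; // ran out of sinkable depth
        std::set<size_t> Diff;
        for (size_t I = 0, E = Blocks[0][D].Operands.size(); I != E; ++I)
          for (const Block &B : Blocks)
            if (B[D].Operands[I] != Blocks[0][D].Operands[I])
              Diff.insert(I);
        PhiOps.push_back(Diff);
      }
    }

The sink stage then walks these records front to back, discounting operand slots whose producing instructions were themselves sunk at an earlier depth (they collapse to one value), and stops when the remaining PHI count exceeds the budget.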
James Molloy 21744689b9 [SimplifyCFG] Fix nondeterministic iteration order
We iterate over the result from SafeToMergeTerminators, so make it a SmallSetVector instead of a SmallPtrSet.

Should fix stage3 convergence builds.

llvm-svn: 280342
2016-09-01 09:01:34 +00:00
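
For context: SmallPtrSet iterates in a pointer-derived order that varies run to run, while SmallSetVector iterates in insertion order. A minimal illustration:

    #include "llvm/ADT/SetVector.h"
    #include "llvm/ADT/SmallPtrSet.h"
    #include "llvm/IR/Instruction.h"

    void example(llvm::Instruction *A, llvm::Instruction *B) {
      llvm::SmallPtrSet<llvm::Instruction *, 4> Set;
      Set.insert(A);
      Set.insert(B);
      // Iteration order here depends on pointer values: nondeterministic.

      llvm::SmallSetVector<llvm::Instruction *, 4> SetVec;
      SetVec.insert(A);
      SetVec.insert(B);
      // Iteration order here is insertion order: A, then B, on every run.
    }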
James Molloy e656642295 [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases
A very important case is not handled here: multiple arcs to a single block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs.

llvm-svn: 280338
2016-09-01 07:45:25 +00:00
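
One way to perform such a split is the stock BasicBlockUtils helper; a hedged sketch of the idea (the commit's internal mechanics may differ):

    #include "llvm/IR/Dominators.h"
    #include "llvm/Transforms/Utils/BasicBlockUtils.h"

    using namespace llvm;

    // Route one conflicting predecessor through a fresh block so that each
    // arc into the common destination carries a single, unambiguous PHI value.
    BasicBlock *splitConflictingArc(BasicBlock *Pred, BasicBlock *Succ,
                                    DominatorTree *DT) {
      return SplitEdge(Pred, Succ, DT);
    }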
James Molloy 3c1137c639 Revert "[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases"
This reverts commit r280218. This *also* causes buildbot errors. Sigh. Not a successful day all around!

llvm-svn: 280239
2016-08-31 13:32:28 +00:00
James Molloy cacfc16109 Revert "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd"
This reverts commit r280216 - it caused buildbot failures.

llvm-svn: 280234
2016-08-31 13:16:52 +00:00
James Molloy 76c9d423a7 Revert "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"
This reverts commit r280217. r280216 caused buildbot failures - backing out the entire chain.

llvm-svn: 280233
2016-08-31 13:16:45 +00:00
James Molloy 06a45483a1 Revert "[SimplifyCFG] Add a workaround to fix PR30188"
This reverts commit r280219. r280216 caused buildbot failures - backing out the entire chain.

llvm-svn: 280232
2016-08-31 13:16:36 +00:00
James Molloy 8a66a39cbf Revert "[SimplifyCFG] Fix bootstrap failure after r280220"
This reverts commit r280228. r280216 caused buildbot failures - backing out the entire sequence.

llvm-svn: 280231
2016-08-31 13:16:30 +00:00
James Molloy b7efa6c227 [SimplifyCFG] Fix bootstrap failure after r280220
We check that a sinking candidate is used by only one PHI node during our legality checks. However, for instructions that are used by other sinking candidates, our heuristic is less conservative. This can result in a candidate actually being illegal when we come to sink it, because of how we sunk a predecessor. Do the used-by-only-one-PHI checks again during sinking to ensure we don't crash.

llvm-svn: 280228
2016-08-31 12:33:48 +00:00
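
A hedged sketch of the kind of re-check described (helper name invented):

    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // Re-verify, at sinking time, that I feeds at most one PHI node; earlier
    // sinking may have invalidated the answer computed during the scan.
    static bool usedByAtMostOnePhi(Instruction *I) {
      PHINode *OnlyPhi = nullptr;
      for (User *U : I->users())
        if (auto *PN = dyn_cast<PHINode>(U)) {
          if (OnlyPhi && OnlyPhi != PN)
            return false;
          OnlyPhi = PN;
        }
      return true;
    }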
James Molloy 171fdac7ce [SimplifyCFG] Add a workaround to fix PR30188
We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

llvm-svn: 280219
2016-08-31 10:46:45 +00:00
James Molloy 8e69b032e5 [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases
A very important case is not handled here: multiple arcs to a single block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs.

llvm-svn: 280218
2016-08-31 10:46:39 +00:00
James Molloy c53b40b509 [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
This was a real restriction in the original version of SinkIfThenCodeToEnd. Now that it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to [end] because no call to x() happens on the third incoming arc (assume for the sake of argument that x() has side effects; if something is safe to speculate we could indeed sink it regardless, but that cannot happen in the general case and would create many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

llvm-svn: 280217
2016-08-31 10:46:33 +00:00
James Molloy 55bd04cd20 [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single-pass greedy algorithm: sink instructions until one requires more than one PHI node or we run out of instructions to sink.

This had the problem that sinking instructions whose operands were non-identical but trivially the same required extra logic before we could sink them aggressively. For example:

    %a = load i32* %b          %d = load i32* %b
    %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However, it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) into two linear scans.

llvm-svn: 280216
2016-08-31 10:46:23 +00:00
James Molloy 923e98c232 [SimplifyCFG] Tail-merge calls with sideeffects
This was deliberately disabled during my rewrite of SinkIfThenToEnd to keep behaviour
at least vaguely consistent with the previous version and keep it as close to NFC as
I could.

There's no real reason not to merge side-effecting calls though, so let's do
it! Small fixup along the way to ensure we don't create indirect calls.

Should fix PR28964.

llvm-svn: 280215
2016-08-31 10:46:16 +00:00
James Molloy d13b1239e4 [SimplifyCFG] Properly CSE metadata in SinkThenElseCodeToEnd
This was missing, meaning the metadata in sunk instructions was potentially bogus and could cause miscompiles.

llvm-svn: 280072
2016-08-30 10:56:08 +00:00
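
Local.h has a helper for merging metadata when one instruction stands in for another; a sketch against the API of this era (the metadata-kind list is illustrative, not the committed one):

    #include "llvm/IR/Instruction.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/Transforms/Utils/Local.h"

    using namespace llvm;

    // When instruction K stands in for a duplicate J, intersect/merge the
    // metadata kinds that must stay conservative on the surviving copy.
    static void mergeMetadataForSink(Instruction *K, const Instruction *J) {
      unsigned KnownIDs[] = {LLVMContext::MD_tbaa, LLVMContext::MD_range,
                             LLVMContext::MD_alias_scope,
                             LLVMContext::MD_noalias, LLVMContext::MD_fpmath};
      combineMetadata(K, J, KnownIDs);
    }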
Tim Northover c10c33444e ASan: remove variable only used in assertions build
llvm-svn: 279990
2016-08-29 19:12:20 +00:00
Vitaly Buka db331d8be7 [asan] Separate calculation of ShadowBytes from calculating ASanStackFrameLayout
Summary: No functional changes, just refactoring to make D23947 simpler.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23954

llvm-svn: 279982
2016-08-29 17:41:29 +00:00
David Majnemer e8fd5f9ffd [SimplifyCFG] Hoisting invalidates metadata
We forgot to remove optimization metadata when performing hoisting during
FoldTwoEntryPHINode.

This fixes PR29163.

llvm-svn: 279980
2016-08-29 17:14:08 +00:00
Adam Nemet 4f155b6e91 [LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass
We can't mark ORE (a function pass) preserved as required by the loop
passes because that is how we ensure that the required passes like
LazyBFI are all available any time ORE is used.  See the new comments in
the patch.

Instead we use it directly just like the inliner does in D22694.

As expected there is some additional overhead after removing the caching
provided by analysis passes.  The worst case I measured was
LNT/CINT2006_ref/401.bzip2, which regresses by 12%.  As before, this only
affects -Rpass-with-hotness and not default compilation.

llvm-svn: 279829
2016-08-26 15:58:34 +00:00
Wei Mi 59ca96636d [UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded
when unrolling a runtime-trip-count loop.

In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner
loop inside a loop nest, the scalar evolution needs to be dropped for its
parent loop, which is done by ScalarEvolution::forgetLoop. However, we can
postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so that
TripCountSC expansion can still reuse existing values.

Differential Revision: https://reviews.llvm.org/D23572

llvm-svn: 279748
2016-08-25 16:17:18 +00:00
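
The ordering matters because SCEVExpander can reuse values SCEV already knows about, and forgetLoop discards them. A simplified sketch of the postponed order (signatures reduced; header location as of this era):

    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/Analysis/ScalarEvolution.h"
    #include "llvm/Analysis/ScalarEvolutionExpander.h"

    using namespace llvm;

    // Expand the trip count while SCEV's cached values are still alive and
    // reusable; only afterwards invalidate the parent loop's analysis.
    static Value *expandTripCountThenForget(const SCEV *TripCountSC, Type *Ty,
                                            Instruction *InsertPt, Loop *L,
                                            ScalarEvolution &SE,
                                            const DataLayout &DL) {
      SCEVExpander Expander(SE, DL, "loop-unroll");
      Value *TripCount = Expander.expandCodeFor(TripCountSC, Ty, InsertPt);
      if (Loop *Parent = L->getParentLoop())
        SE.forgetLoop(Parent); // postponed until after expansion
      return TripCount;
    }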
George Burgess IV 7f414b90ab [MemorySSA] Remove unused field. NFC.
Given that we're not currently using blocker info, and it is unclear whether
we will end up using it, don't waste 8 (or 4) bytes of memory per path node.

llvm-svn: 279493
2016-08-22 23:40:01 +00:00
Daniel Berlin 3d512a2dc2 MSSA: Factor out phi node placement
llvm-svn: 279462
2016-08-22 19:14:30 +00:00
Daniel Berlin 868381bff6 MSSA: Only rename accesses whose defining access is nullptr
llvm-svn: 279461
2016-08-22 19:14:16 +00:00
James Molloy 5bf2114265 [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
[Recommitting now an unrelated assertion in SROA is sorted out]

The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables.

This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup.

llvm-svn: 279460
2016-08-22 19:07:15 +00:00
James Molloy 475f4a763f Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"
This reverts commit r279443. It caused buildbot failures.

llvm-svn: 279447
2016-08-22 18:13:12 +00:00
James Molloy 353052698a [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables.

This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup.

llvm-svn: 279443
2016-08-22 17:40:23 +00:00
Vitaly Buka f9fd63ad39 [asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout
Summary:
We are going to combine poisoning of red zones and scope poisoning.

PR27453

Reviewers: kcc, eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23623

llvm-svn: 279373
2016-08-20 16:48:24 +00:00
Vitaly Buka e149b392a8 Revert "[asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout"
This reverts commit r279020.

Speculative revert in hope to fix asan test on arm.

llvm-svn: 279332
2016-08-19 22:12:58 +00:00
Reid Kleckner 98a48afa5d Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"
This reverts commit r279229. It breaks intrinsic function calls in
diamonds.

llvm-svn: 279313
2016-08-19 20:22:39 +00:00
David Majnemer 5554edabef [CloneFunction] Don't remove unrelated nodes from the CGSSC
CGSCC passes use a WeakVH to track call sites.  RAUWing a call within a
function can result in that WeakVH getting confused about whether or not the
call site is still around.

llvm-svn: 279268
2016-08-19 16:37:40 +00:00
James Molloy 11a1936b70 [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

llvm-svn: 279229
2016-08-19 10:10:27 +00:00
Vitaly Buka d5ec14989d [asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout
Summary:
We are going to combine poisoning of red zones and scope poisoning.

PR27453

Reviewers: kcc, eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23623

llvm-svn: 279020
2016-08-18 00:56:58 +00:00
Chandler Carruth 67fc52f067 [PM] Port the always inliner to the new pass manager in a much more
minimal and boring form than the old pass manager's version.

This pass does the very minimal amount of work necessary to inline
functions declared as always-inline. It doesn't support a wide array of
things that the legacy pass manager did support, but is also ... about
20 lines of code. So it has that going for it. Notably things this
doesn't support:

- Array alloca merging
  - To support the above, bottom-up inlining with careful history
    tracking and call graph updates
- DCE of the functions that become dead after this inlining.
- Inlining through call instructions with the always_inline attribute.
  Instead, it focuses on inlining functions with that attribute.

The first I've omitted because I'm hoping to just turn it off for the
primary pass manager. If that doesn't pan out, I can add it here but it
will be reasonably expensive to do so.

The second should really be handled by running global-dce after the
inliner. I don't want to re-implement the non-trivial logic necessary to
do comdat-correct DCE of functions. This means the -O0 pipeline will
have to be at least 'always-inline,global-dce', but that seems
reasonable to me. If others are seriously worried about this I'd like to
hear about it and understand why. Again, this is all solvable by
factoring that logic into a utility and calling it here, but I'd like to
wait to do that until there is a clear reason why the existing
pass-based factoring won't work.

The final point is a serious one. I can fairly easily add support for
this, but it seems both costly and a confusing construct for the use
case of the always inliner running at -O0. This attribute can of course
still impact the normal inliner easily (although I find that
a questionable re-use of the same attribute). I've started a discussion
to sort out what semantics we want here and based on that can figure out
if it makes sense to have this complexity at O0 or not.

One other advantage of this design is that it should be quite a bit
faster due to checking for whether the function is a viable candidate
for inlining exactly once per function instead of doing it for each call
site.

Anyways, hopefully a reasonable starting point for this pass.

Differential Revision: https://reviews.llvm.org/D23299

llvm-svn: 278896
2016-08-17 02:56:20 +00:00
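
The gist of such a minimal always-inliner, sketched against the CallSite-era API (not the committed pass verbatim; the real pass also checks inlining viability):

    #include "llvm/ADT/SmallVector.h"
    #include "llvm/IR/CallSite.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Transforms/Utils/Cloning.h"

    using namespace llvm;

    // Decide viability once per function, then inline every direct call to
    // it. No alloca merging, no DCE of newly-dead functions.
    static bool inlineAlwaysInlineFunctions(Module &M) {
      bool Changed = false;
      for (Function &F : M) {
        if (F.isDeclaration() || !F.hasFnAttribute(Attribute::AlwaysInline))
          continue;
        SmallVector<CallSite, 16> Calls;
        for (User *U : F.users())
          if (auto CS = CallSite(U))
            if (CS.getCalledFunction() == &F) // direct calls only
              Calls.push_back(CS);
        for (CallSite CS : Calls) {
          InlineFunctionInfo IFI;
          Changed |= InlineFunction(CS, IFI);
        }
      }
      return Changed;
    }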
Duncan P. N. Exon Smith 0a12729f99 SimplifyCFG: Avoid dereferencing end()
When comparing a User* to a BasicBlock::iterator in
passingValueIsAlwaysUndefined, don't dereference the iterator in case it
is end().

llvm-svn: 278872
2016-08-16 23:57:56 +00:00
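
The general shape of the fix (illustrative helper):

    #include "llvm/IR/BasicBlock.h"

    using namespace llvm;

    // Only dereference the iterator once we know it isn't end().
    static bool userIsAt(const User *U, BasicBlock::iterator It,
                         BasicBlock &BB) {
      return It != BB.end() && U == &*It;
    }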
David Majnemer 744a8753db Preserve the assumption cache more often
We were clearing it out in LoopUnswitch and InlineFunction instead of
attempting to preserve it.

llvm-svn: 278860
2016-08-16 22:07:32 +00:00
David Majnemer 110522bc0f [LoopUnroll] Don't clear out the AssumptionCache on each loop
Clearing out the AssumptionCache can cause us to rescan the entire
function for assumes.  If there are many loops, then we are scanning
over the entire function many times.

Instead of clearing out the AssumptionCache, register all cloned
assumes.

llvm-svn: 278854
2016-08-16 21:09:46 +00:00
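
Conceptually (a hedged sketch; registerAssumption is the real entry point in this era's AssumptionCache, the surrounding loop is illustrative):

    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/Analysis/AssumptionCache.h"
    #include "llvm/IR/IntrinsicInst.h"

    using namespace llvm;

    // Rather than clearing the cache (which forces a full-function rescan of
    // assumes on the next query), hand each cloned llvm.assume to the cache.
    static void registerClonedAssumptions(AssumptionCache &AC,
                                          ArrayRef<CallInst *> Cloned) {
      for (CallInst *CI : Cloned)
        if (auto *II = dyn_cast<IntrinsicInst>(CI))
          if (II->getIntrinsicID() == Intrinsic::assume)
            AC.registerAssumption(CI);
    }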
Reid Kleckner 70a600b8bb Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"
This reverts commit r278660.

It causes downstream assertion failure in InstCombine on shuffle
instructions. Comes up in __mm_swizzle_epi32.

llvm-svn: 278672
2016-08-15 15:42:31 +00:00
James Molloy 9a3c82f5cf [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 3, i32 4
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

llvm-svn: 278660
2016-08-15 08:04:56 +00:00
Reid Kleckner 6ee00a2602 [Inliner] Don't treat inalloca allocas as static
They aren't static, and moving them to the entry block across something
else will only result in tears.

Root cause of http://crbug.com/636558.

llvm-svn: 278571
2016-08-12 22:23:04 +00:00
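
Both predicates involved are real AllocaInst API; the combined guard below is a sketch of the idea:

    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // An alloca is only safe to treat as "static" (hoistable to the entry
    // block) if it really is static AND isn't an inalloca argument slot,
    // whose lifetime is tied to a particular call.
    bool safeToHoistToEntry(const AllocaInst *AI) {
      return AI->isStaticAlloca() && !AI->isUsedWithInAlloca();
    }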
David L Kreitzer 9667417a1a Fixed typo.
llvm-svn: 278565
2016-08-12 21:06:53 +00:00
Michael Kuperstein 31b8399beb [PM] Port LowerInvoke to the new pass manager
llvm-svn: 278531
2016-08-12 17:28:27 +00:00
Teresa Johnson 4223dd8559 [PM] Port NameAnonFunction pass to new pass manager
Summary:
Port the NameAnonFunction pass and add a test.

Depends on D23439.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23440

llvm-svn: 278509
2016-08-12 14:03:36 +00:00
David Majnemer c700490f48 Use the range variant of remove_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278475
2016-08-12 04:32:37 +00:00
David Majnemer 42531260b3 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
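
The pattern common to these cleanups, shown with LLVM's STLExtras range helpers:

    #include "llvm/ADT/STLExtras.h"
    #include <vector>

    // Before: std::find(V.begin(), V.end(), X) != V.end()
    // After:  llvm::is_contained(V, X)
    bool has(const std::vector<int> &V, int X) {
      return llvm::is_contained(V, X);
    }

    // Before: V.erase(std::remove_if(V.begin(), V.end(), Pred), V.end())
    // After:  the range variant spares the begin/end unpacking.
    void dropNegatives(std::vector<int> &V) {
      V.erase(llvm::remove_if(V, [](int X) { return X < 0; }), V.end());
    }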
Daniel Berlin da2f38e0f4 [MSSA] Use is_contained
llvm-svn: 278418
2016-08-11 21:26:50 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Rong Xu 63f970ee24 Fix LCSSA increased compile time
We are seeing r276077 drastically increase compile time for our larger
benchmarks in PGO profile generation builds (both clang-based and IR-based
mode) -- it can be 20x slower than without the patch (like from 30 secs to
780 secs).

The increased time is all in the LCSSA pass. The problematic code is the
PostProcessPHIs handling after use-rewriting. Note that the InsertedPhis list
from ssa_updater keeps accumulating (it is never cleared). Since the inserted
PHIs are added to the candidates for each rewrite, the earlier ones are
repeatedly re-added. Later, when adding the new PHIs to the work-list, we
don't check for duplicates either. This can result in an extremely long
work-list containing tons of duplicated PHIs.

This patch fixes the issue by hoisting the code out of the loop.

Differential Revision: http://reviews.llvm.org/D23344

llvm-svn: 278250
2016-08-10 17:49:11 +00:00
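
A reduced sketch of the work-list pattern at issue (illustrative, not the LCSSA code):

    #include <set>
    #include <vector>

    // Appending candidates to a work-list without deduplication lets earlier
    // entries be re-added each round. A visited set keeps the total work
    // linear in the number of unique items.
    static void drainWorkList(std::vector<int> WorkList) {
      std::set<int> Visited;
      while (!WorkList.empty()) {
        int Item = WorkList.back();
        WorkList.pop_back();
        if (!Visited.insert(Item).second)
          continue; // already processed; skip duplicates
        // ... process Item, possibly pushing new candidates ...
      }
    }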
Davide Italiano 873219c406 [SimplifyLibCalls] Restore the old behaviour, emit a libcall.
Hal pointed out that the semantics of our intrinsic and the libc
call are slightly different. Add a comment while I'm here to
explain why we can't emit an intrinsic. Thanks Hal!

llvm-svn: 278200
2016-08-10 06:33:32 +00:00
Michael Zolotukhin aae168f993 [LoopSimplify] Rebuild LCSSA for the inner loop after separating nested loops.
Summary:
This hopefully fixes PR28825. The problem now was that a value from the
original loop was used in a subloop, which became a sibling after separation.
While a subloop doesn't need an lcssa phi node, a sibling does, and that's
where we broke LCSSA. The most natural way to fix this now is to simply call
formLCSSA on the original loop: it'll do what we've been doing before plus
it'll cover situations described above.

I think we don't need to run formLCSSARecursively here, and we have an assert
to verify this (I've tried testing it on LLVM testsuite + SPECs). I'd be happy
to be corrected here though.

I also changed a run line in the test from '-lcssa -loop-unroll' to
'-lcssa -loop-simplify -indvars', because it exercises LCSSA
preservation to the same extent, but also makes less unrelated
transformation on the CFG, which makes it easier to verify.

Reviewers: chandlerc, sanjoy, silvas

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23288

llvm-svn: 278173
2016-08-09 22:44:56 +00:00
Sean Silva 36e0d01e13 Consistently use FunctionAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

llvm-svn: 278077
2016-08-09 00:28:15 +00:00
Michael Zolotukhin 2f50725dbd [LoopUnroll] Simplify loops created by unrolling.
Summary:
Currently loop-unrolling doesn't preserve loop-simplified form. This patch
fixes it by resimplifying affected loops.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23148

llvm-svn: 278038
2016-08-08 19:02:15 +00:00
Geoff Berry 290a13e7c7 [MemorySSA] Fix windows build breakage caused by r278028
r278028: [MemorySSA] Ensure address stability of MemorySSA object.
llvm-svn: 278035
2016-08-08 18:27:22 +00:00
Geoff Berry cdf5333f6f [MemorySSA] Ensure address stability of MemorySSA object.
Summary:
Ensure that the MemorySSA object never changes address when using the
new pass manager since the walkers contained by MemorySSA cache pointers
to it at construction time.  This is achieved by wrapping the
MemorySSAAnalysis result in a unique_ptr.  Also add some asserts that
check for this bug.

Reviewers: george.burgess.iv, dberlin

Subscribers: mcrosier, hfinkel, chandlerc, silvas, llvm-commits

Differential Revision: https://reviews.llvm.org/D23171

llvm-svn: 278028
2016-08-08 17:52:01 +00:00
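
The shape of the fix in plain C++: anything that caches a pointer to an object must not let that object move when its owner is moved. A minimal sketch:

    #include <memory>

    // Result is what the analysis manager moves around. Because the
    // MemorySSA-like object lives behind a unique_ptr, its address (which
    // internal walkers cache) survives any number of moves of Result.
    struct MemorySSALike {
      MemorySSALike *SelfCheck = this; // stands in for walker back-pointers
    };

    struct Result {
      std::unique_ptr<MemorySSALike> MSSA = std::make_unique<MemorySSALike>();
    };

    // After Result R2 = std::move(R1), R2.MSSA->SelfCheck still equals
    // R2.MSSA.get(): the pointee never moved.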
Sean Silva 0873e7d218 Add some comments linking back to PR28400.
Thanks to Mehdi for the suggestion!

llvm-svn: 277984
2016-08-08 07:03:49 +00:00
Sean Silva 7f21f4b264 [PM] More workaround for PR28400
llvm-svn: 277982
2016-08-08 05:38:06 +00:00
Daniel Berlin 4b4c722e79 [MSSA] Fix PR28880 by fixing use optimizer's lower bound tracking behavior.
Summary:
In the use optimizer, we need to keep track of whether the lower bound still
dominates us, or else we may decide a lower bound is still valid when it is
not, due to intervening pushes/pops.  Fixes PR28880 (and probably a bunch of
other things).

Reviewers: george.burgess.iv

Subscribers: MatzeB, llvm-commits, sebpop

Differential Revision: https://reviews.llvm.org/D23237

llvm-svn: 277978
2016-08-08 04:44:53 +00:00
Eli Friedman 02419a9849 [JumpThreading] Fix handling of aliasing metadata.
Summary:
The correctness fix here is that when we CSE a load with another load,
we need to combine the metadata on the two loads. This matches the
behavior of other passes, like instcombine and GVN.

There's also a minor optimization improvement here: for load PRE, the
aliasing metadata on the inserted load should be the same as the
metadata on the original load. Not sure why the old code was throwing
it away.

Issue found by inspection.

Differential Revision: http://reviews.llvm.org/D21460

llvm-svn: 277977
2016-08-08 04:10:22 +00:00
Davide Italiano e3b916d164 [SimplifyLibCalls] Emit sqrt intrinsic instead of a libcall.
llvm-svn: 277972
2016-08-08 03:23:01 +00:00
Davide Italiano 27da131f32 [SLC] Emit an intrinsic instead of a libcall for pow.
Differential Revision:  https://reviews.llvm.org/D22104

llvm-svn: 277963
2016-08-07 20:27:03 +00:00
Michael Zolotukhin 442b82f0eb Revert "Revert "[LoopSimplify] Fix updating LCSSA after separating nested loops.""
This reverts commit r277901. Reaaply the commit as it looks like it has
nothing to do with the bots failures.

llvm-svn: 277946
2016-08-07 01:56:54 +00:00
Benjamin Kramer b7d3311c77 Move helpers into anonymous namespaces. NFC.
llvm-svn: 277916
2016-08-06 11:13:10 +00:00
Michael Zolotukhin 09cf304ebc Revert "[LoopSimplify] Fix updating LCSSA after separating nested loops."
This reverts commit r277877.
Try to appease clang-x64-ninja-win7 buildbot.

llvm-svn: 277901
2016-08-06 01:48:51 +00:00
Daniel Berlin 7ac3d74017 [MSSA] Use depth first iterator instead of custom version.
Summary:
Originally we had a custom worklist because the plan was to do some block
popping, and because we don't actually need a visited set. The custom
worklist we have here is slightly broken, and it's not worth fixing versus
using depth_first_iterator, since we aren't going to go the route we
originally planned.

Fixes PR28874
Reviewers: george.burgess.iv

Subscribers: llvm-commits, gberry

Differential Revision: https://reviews.llvm.org/D23187

llvm-svn: 277880
2016-08-05 22:09:14 +00:00
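
For reference, the ADT facility in question (real API); depth_first maintains its own visited set, which is why the custom worklist wasn't buying anything:

    #include "llvm/ADT/DepthFirstIterator.h"
    #include "llvm/IR/CFG.h"
    #include "llvm/IR/Function.h"

    using namespace llvm;

    // Visit blocks in depth-first preorder from the entry block.
    void visitDFS(Function &F) {
      for (BasicBlock *BB : depth_first(&F.getEntryBlock())) {
        (void)BB; // process block
      }
    }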
Michael Zolotukhin 4c65c3596a [LoopSimplify] Fix updating LCSSA after separating nested loops.
This fixes PR28825. The problem was that we only checked if a value from
a created inner loop is used in the outer loop, and fixed LCSSA for
them. But we missed to fixup LCSSA for values used in exits of the outer
loop.

llvm-svn: 277877
2016-08-05 21:52:58 +00:00
Daniel Berlin 7af95876cf [MSSA] Match assert vs llvm_unreachable style in verification functions.
llvm-svn: 277873
2016-08-05 21:47:20 +00:00
Daniel Berlin 2919b1c41b Rewrite domination verifier to handle local domination as well.
Summary:
Rewrite domination verifier to handle local domination as well.
This catches a bug Geoff Berry noticed.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23184

llvm-svn: 277872
2016-08-05 21:46:52 +00:00
Davide Italiano 500929df9c [FlattenCFG] Simplify + remove unused variable. NFCI.
llvm-svn: 277864
2016-08-05 20:53:35 +00:00
Dehao Chen 17c6afc35b Do not assign new discriminator for all intrinsics.
Summary: We do not care about intrinsic calls when assigning discriminators.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23212

llvm-svn: 277843
2016-08-05 17:56:49 +00:00
Benjamin Kramer aa160c22f7 [SimplifyCFG] Make range reduction code deterministic.
This generated IR based on the order of evaluation, which is different
between GCC and Clang. With that in mind you get bootstrap miscompares
if you compare a Clang built with GCC-built Clang vs. Clang built with
Clang-built Clang. Diagnosing that made my head hurt.

This also reverts commit r277337, which "fixed" the test case.

llvm-svn: 277820
2016-08-05 14:55:02 +00:00
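
The underlying C++ trap, reduced to a standalone example:

    #include <cstdio>

    int counter = 0;
    int next() { return ++counter; }

    int main() {
      // The two calls to next() may run in either order: unspecified in
      // C++. GCC and Clang pick differently, so any IR emitted based on
      // such an expression differs by host compiler, and a stage2/stage3
      // bootstrap comparison flags the mismatch.
      std::printf("%d %d\n", next(), next());
    }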
David Majnemer 4eefd6bca4 Forgot the dyn_cast_or_null intended for r277691.
llvm-svn: 277693
2016-08-04 04:47:18 +00:00
David Majnemer 909793fa63 Reinstate "[CloneFunction] Don't remove side effecting calls"
This reinstates r277611 + r277614 and reverts r277642.  A cast_or_null
should have been a dyn_cast_or_null.

llvm-svn: 277691
2016-08-04 04:24:02 +00:00
George Burgess IV 363da6f589 [MSSA] Fix a bug in MemorySSA's move ctor.
Not a correctness issue, but it would be nice if we didn't have to
recompute our block numbering (worst-case) every time we move MSSA.

llvm-svn: 277652
2016-08-03 21:07:52 +00:00
Reid Kleckner a6be60871f Revert "[CloneFunction] Don't remove side effecting calls"
This reverts commit r277611 and the followup r277614.

Bootstrap builds and chromium builds are crashing during inlining after
this change.

llvm-svn: 277642
2016-08-03 20:01:01 +00:00
George Burgess IV f7672854f0 [MSSA] clang-format. NFC.
Didn't want to fold this in with r277640, since it touches bits that
aren't entirely related to r277640.

llvm-svn: 277641
2016-08-03 19:59:11 +00:00
George Burgess IV 024f3d2683 [MSSA] Add special handling for invariant/constant loads.
This is a follow-up to r277637. It teaches MemorySSA that invariant
loads (and loads of provably constant memory) are always liveOnEntry.

llvm-svn: 277640
2016-08-03 19:57:02 +00:00
George Burgess IV 82e355ce48 [MSSA] Add logic for special handling of atomics/volatiles.
This patch makes MemorySSA recognize atomic/volatile loads, and makes
MSSA treat said loads specially. This allows us to be a bit more
aggressive in some cases.

Administrative note: Revision was LGTM'ed by reames in person.
Additionally, this doesn't include the `invariant.load` recognition in
the differential revision, because I feel it's better to commit that
separately. Will commit soon.

Differential Revision: https://reviews.llvm.org/D16875

llvm-svn: 277637
2016-08-03 19:39:54 +00:00
David Majnemer fa8ef91748 [CloneFunction] Don't crash if the value map doesn't hold something
It is possible for the value map to not have an entry for some value
that has already been removed.

I don't have a testcase, this is fall-out from a buildbot.

llvm-svn: 277614
2016-08-03 17:37:10 +00:00
David Majnemer fad0490869 [CloneFunction] Don't remove side effecting calls
We were able to figure out that the result of a call is some constant.
While propagating that fact, we added the constant to the value map.
This is problematic because it results in us losing the call site when
processing the value map.

This fixes PR28802.

llvm-svn: 277611
2016-08-03 17:12:47 +00:00
George Burgess IV 14633b5cd3 [MSSA] Fix a caching bug.
This fixes a bug where we'd sometimes cache overly-conservative results
with our walker. This bug was made more obvious by r277480, which makes
our cache far more spotty than it was. Test case is llvm-unit, because
we're likely going to use CachingWalker only for def optimization in the
future.

The bug stems from the fact that there was a place where the walker assumed that
`DefNode.Last` was a valid target to cache to when failing to optimize
phis. This is sometimes incorrect if we have a cache hit. The fix is to
use the thing we *can* assume is a valid target to cache to. :)

llvm-svn: 277559
2016-08-03 01:22:19 +00:00
Daniel Berlin df10119e4e Support for lifetime begin/end markers in the MemorySSA use optimizer
Summary: Depends on D23072

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23076

llvm-svn: 277553
2016-08-03 00:01:46 +00:00
Piotr Padlewski 47509f6185 Imported statistics types changes
Reviewers: tejohnson, eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22980

llvm-svn: 277534
2016-08-02 22:18:47 +00:00
Daniel Berlin dff31deb1e Move to having a single real instructionClobbersQuery
Summary: We really want to move towards MemoryLocOrCall (or fix AA) everywhere, but for now, this lets us have a single instructionClobbersQuery.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23072

llvm-svn: 277530
2016-08-02 21:57:52 +00:00
Michael Zolotukhin b2738e41bf [LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its original value.
As agreed in post-commit review of r265388, I'm switching the flag to
its original value until the 90% runtime performance regression on
SingleSource/Benchmarks/Stanford/Bubblesort is addressed.

llvm-svn: 277524
2016-08-02 21:24:14 +00:00
Daniel Berlin 26fcea91f6 Fixes for post-commit review comments on r277480
llvm-svn: 277510
2016-08-02 20:02:21 +00:00
Michael Zolotukhin d9b6ad3c01 [LoopUnroll] Ensure we create prolog loops in simplified form.
llvm-svn: 277502
2016-08-02 19:19:31 +00:00
Daniel Berlin de4be65313 MSVC 2013 does not implement C++11 unions properly, so remove the anoymous union for now,
and leave a FIXME.

llvm-svn: 277485
2016-08-02 16:59:51 +00:00
Daniel Berlin c43aa5a5b6 Rewrite the use optimizer to be less memory intensive and 50% faster.
Fixes PR28670

Summary:
Rewrite the use optimizer to be less memory intensive and 50% faster.
Fixes PR28670

The new use optimizer works like a standard SSA renaming pass, storing
all possible versions a MemorySSA use could get in a stack, and just
tracking indexes into the stack.
This uses much less memory than caching N^2 alias query results.
It's also a lot faster.

The current version defers phi node walking to the normal walker.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23032

llvm-svn: 277480
2016-08-02 16:24:03 +00:00
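
A toy of the version-stack discipline described (illustrative, not MemorySSA code):

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Walking the dominator tree: push at each def, record the current
    // index at each use, and pop back to the saved depth when leaving a
    // subtree. Memory is O(defs on the current path), not O(N^2) cached
    // alias query results.
    struct VersionStack {
      std::vector<int> Defs; // defs on the current dominator-tree path

      void pushDef(int DefID) { Defs.push_back(DefID); }
      std::size_t depth() const { return Defs.size(); }
      void popTo(std::size_t SavedDepth) { Defs.resize(SavedDepth); }

      // A use simply records this index; it resolves to Defs[index].
      std::size_t currentVersion() const {
        assert(!Defs.empty() && "use before any def; liveOnEntry in MSSA");
        return Defs.size() - 1;
      }
    };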
Junmo Park db8f6eebee Minor code cleanups. NFC.
llvm-svn: 277415
2016-08-02 04:38:27 +00:00
Sean Silva f801575fd0 CodeExtractor : Add ability to preserve profile data.
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22744

llvm-svn: 277411
2016-08-02 02:15:45 +00:00
James Molloy bade86cedc [SimplifyCFG] Fix nasty RAUW bug from r277325
Using RAUW was wrong here; if we have a switch transform such as:
  18 -> 6 then
  6 -> 0

If we use RAUW, then while performing the second transform the *transformed* 6
from the first will also be replaced, so we end up with:
  18 -> 0
  6 -> 0

Found by clang stage2 bootstrap; testcase added.

llvm-svn: 277332
2016-08-01 09:34:48 +00:00
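
A reduced illustration of the hazard and the rewrite-each-use-once fix (sketch):

    #include <utility>
    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/ADT/SmallVector.h"
    #include "llvm/IR/Value.h"

    using namespace llvm;

    // The hazard, applying {18 -> 6, 6 -> 0} via RAUW:
    //   V18->replaceAllUsesWith(V6); // every old 18 is now a 6
    //   V6->replaceAllUsesWith(V0);  // ...which this rewrites too: 18 -> 0
    // Instead, plan all rewrites against the *original* use lists, then
    // apply, so each original use is rewritten exactly once.
    static void applyMappingOnce(ArrayRef<std::pair<Value *, Value *>> Mapping) {
      SmallVector<std::pair<Use *, Value *>, 16> Planned;
      for (const auto &FromTo : Mapping)
        for (Use &U : FromTo.first->uses())
          Planned.push_back({&U, FromTo.second});
      for (auto &P : Planned)
        P.first->set(P.second);
    }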
James Molloy b2e436de42 [SimplifyCFG] Range reduce switches
If a switch is sparse and all the cases (once sorted) are in arithmetic progression, we can extract the common factor out of the switch and create a dense switch. For example:

    switch (i) {
    case 5: ...
    case 9: ...
    case 13: ...
    case 17: ...
    }

can become:

    if ( (i - 5) % 4 ) goto default;
    switch ((i - 5) / 4) {
    case 0: ...
    case 1: ...
    case 2: ...
    case 3: ...
    }

or even better:

    switch (ROTR(i - 5, 2)) {
   case 0: ...
   case 1: ...
   case 2: ...
   case 3: ...
   }

The division and remainder operations could be costly so we only do this if the factor is a power of two, and emit a right-rotate instead of a divide/remainder sequence. Dense switches can be lowered significantly better than sparse switches and can even be transformed into lookup tables.

llvm-svn: 277325
2016-08-01 07:45:11 +00:00
Sean Silva 423c7149dc Revert r277313 and r277314.
They seem to trigger an LSan failure:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/15140/steps/check-llvm%20asan/logs/stdio

Revert "Add the tests for r277313"

This reverts commit r277314.

Revert "CodeExtractor : Add ability to preserve profile data."

This reverts commit r277313.

llvm-svn: 277317
2016-08-01 04:16:09 +00:00
Sean Silva a0a802abe3 Fix - CodeExtractor : Inherit Target Dependent Attributes from the parent function.
When extracting a set of blocks, make sure to inherit all of the
target-dependent attributes so that the function will be valid for lowering.
One example is the "target-features" attribute for x86: if the extracted
region has functionality that relies on a specific feature, it will fail to
be lowered.
This also allows for extracted functions to be valid for inlining, at
least back into the parent function, as the target attributes are tested
when inlining for compatibility.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22713

llvm-svn: 277315
2016-08-01 03:15:32 +00:00
Sean Silva 6208924323 CodeExtractor : Add ability to preserve profile data.
Added ability to estimate the entry count of the extracted function and
the branch probabilities of the exit branches.

Patch by River Riddle!

Differential Revision: https://reviews.llvm.org/D22744

llvm-svn: 277313
2016-08-01 02:59:26 +00:00
Daniel Berlin 5130cc831a Fix the MemorySSA updating API to enable people to create memory accesses before removing old ones
llvm-svn: 277309
2016-07-31 21:08:20 +00:00