Commit Graph

4043 Commits

Author SHA1 Message Date
Dehao Chen ed2d5402cb Do not add discriminator encoding for debug intrinsics.
Summary: There are certain requirements for debug location of debug intrinsics, e.g. the scope of the DILocalVariable should be the same as the scope of its debug location. As a result, we should not add discriminator encoding for debug intrinsics.

Reviewers: dblaikie, aprantl

Reviewed By: aprantl

Subscribers: JDevlieghere, aprantl, bjope, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D39343

llvm-svn: 316703
2017-10-26 21:20:52 +00:00
Balaram Makam 9ee942f481 Reapply r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: This reverts r316612 to reapply r316582. The buildbot failure was unrelated to this commit.

Reviewers:

Subscribers:

llvm-svn: 316669
2017-10-26 15:04:53 +00:00
Eugene Zelenko 5adb96cc92 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316630
2017-10-26 00:55:39 +00:00
Balaram Makam 52252fe20d Revert r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: This reverts commit r316582. It looks like this commit broke tests on one buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/5719

. . .
Failing Tests (1):
    LLVM :: Transforms/CalledValuePropagation/simple-arguments.ll

Reviewers:

Subscribers:

llvm-svn: 316612
2017-10-25 21:32:54 +00:00
Balaram Makam 925ddf1a93 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: For some irreducible CFG the domtree nodes might be dead, do not update domtree for dead nodes.

Reviewers: kuhar, dberlin, hfinkel

Reviewed By: kuhar

Subscribers: llvm-commits, mcrosier

Differential Revision: https://reviews.llvm.org/D38960

llvm-svn: 316582
2017-10-25 14:55:48 +00:00
Adrian Prantl d20442d383 Delete unused instantiations of DIBuilder. NFC
llvm-svn: 316494
2017-10-24 20:26:17 +00:00
Sanjay Patel b80daf0b48 [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early. 
In fact, it's questionable whether this transform belongs in SimplifyCFG 
at all. I'll look at moving this to codegen as a follow-up step.

llvm-svn: 316298
2017-10-22 19:10:07 +00:00
Sanjay Patel 24226504a7 [SimplifyCFG] try harder to forward switch condition to phi (PR34471)
The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen:

  int switcher(int x) {
      switch(x) {
      case 17: return 17;
      case 19: return 19;
      case 42: return 42;
      default: break;
      }
      return 0;
    }

  int comparator(int x) {
    if (x == 17) return 17;
    if (x == 19) return 19;
    if (x == 42) return 42;
    return 0;
  }

For the first example, we use a bit-test optimization to avoid a series of compare-and-branch:
https://godbolt.org/g/BivDsw

Differential Revision: https://reviews.llvm.org/D39011

llvm-svn: 316293
2017-10-22 16:51:03 +00:00
Eugene Zelenko fce435764e [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316253
2017-10-21 00:57:46 +00:00
Eugene Zelenko 6cadde7f40 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316034
2017-10-17 21:27:42 +00:00
Vitaly Buka 524c0a639d Fix signed overflow detected by ubsan
This overflow does not affect algorithm, so just suppress it.

llvm-svn: 316018
2017-10-17 18:33:15 +00:00
Sanjay Patel 30f30d37fb [SimplifyCFG] use range-for-loops, tidy; NFCI
There seems to be something missing here as shown in PR34471:
https://bugs.llvm.org/show_bug.cgi?id=34471 

llvm-svn: 315855
2017-10-15 14:43:39 +00:00
Hongbin Zheng 73f650435b [LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC.
This avoid code duplication and allow us to add the disable unroll metadata elsewhere.

Differential Revision: https://reviews.llvm.org/D38928

llvm-svn: 315850
2017-10-15 07:31:02 +00:00
Hongbin Zheng d36f2030e2 [SimplifyIndVar] Replace IVUsers with loop invariant whenever possible
Differential Revision: https://reviews.llvm.org/D38415

llvm-svn: 315551
2017-10-12 02:54:11 +00:00
Eugene Zelenko 6f1ae631f7 [Transforms] Revert r315516 changes in PredicateInfo to fix Windows build bots (NFC).
llvm-svn: 315519
2017-10-11 21:56:44 +00:00
Eugene Zelenko 286d5897d6 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 315516
2017-10-11 21:41:43 +00:00
Vivek Pandya 9590658fb8 [NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure
parameterized emit() calls

Summary: This is not functional change to adopt new emit() API added in r313691.

Reviewed By: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38285

llvm-svn: 315476
2017-10-11 17:12:59 +00:00
Adam Nemet 0965da2055 Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*
Sync it up with the name of the class actually defined here.  This has been
bothering me for a while...

llvm-svn: 315249
2017-10-09 23:19:02 +00:00
Jakub Kuderski cbe9fae99d [CodeExtractor] Fix multiple bugs under certain shape of extracted region
Summary:
If the extracted region has multiple exported data flows toward the same BB which is not included in the region, correct resotre instructions and PHI nodes won't be generated inside the exitStub. The solution is simply put the restore instructions right after the definition of output values instead of putting in exitStub.
Unittest for this bug is included.

Author: myhsu

Reviewers: chandlerc, davide, lattner, silvas, davidxl, wmi, kuhar

Subscribers: dberlin, kuhar, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D37902

llvm-svn: 315041
2017-10-06 03:37:06 +00:00
Peter Collingbourne 715bcfe0c9 ModuleUtils: Stop using comdat members to generate unique module ids.
It is possible for two modules to define the same set of external
symbols without causing a duplicate symbol error at link time,
as long as each of the symbols is a comdat member. So we cannot
use them as part of a unique id for the module.

Differential Revision: https://reviews.llvm.org/D38602

llvm-svn: 315026
2017-10-05 21:54:53 +00:00
Hans Wennborg 899809d531 Fix a -Wparentheses warning. NFC.
llvm-svn: 314936
2017-10-04 21:14:07 +00:00
Marcello Maggioni df3e71e037 [LoopDeletion] Move deleteDeadLoop to to LoopUtils. NFC
llvm-svn: 314934
2017-10-04 20:42:46 +00:00
Sanjay Patel 4c33d5213b [SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI
This is a follow-up to https://reviews.llvm.org/D38138. 

I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping 
any options by using default param values.

llvm-svn: 314930
2017-10-04 20:26:25 +00:00
Dehao Chen f464627f28 Update getMergedLocation to check the instruction type and merge properly.
Summary: If the merged instruction is call instruction, we need to set the scope to the closes common scope between 2 locations, otherwise it will cause trouble when the call is getting inlined.

Reviewers: dblaikie, aprantl

Reviewed By: dblaikie, aprantl

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37877

llvm-svn: 314694
2017-10-02 18:13:14 +00:00
Daniel Jasper 0a51ec29c9 Revert r314435: "[JumpThreading] Preserve DT and LVI across the pass"
Causes a segfault on a builtbot (and in our internal bootstrapping of
Clang). See Eli's response on the commit thread.

llvm-svn: 314589
2017-09-30 11:57:19 +00:00
Hongbin Zheng c8abdf5f25 [SimplifyIndVar] Do not fail when we constant fold an IV user to ConstantPointerNull
The type of a SCEVConstant may not match the corresponding LLVM Value.
In this case, we skip the constant folding for now.

TODO: Replace ConstantInt Zero by ConstantPointerNull
llvm-svn: 314531
2017-09-29 16:32:12 +00:00
Sanjoy Das 0ac5ba5ade Revert "[BypassSlowDivision] Improve our handling of divisions by constants"
This reverts commit r314253.  It causes a miscompile on P100 in an internal
benchmark.  Reverting while I investigate.

llvm-svn: 314482
2017-09-29 00:54:16 +00:00
Evandro Menezes 3701df55c6 [JumpThreading] Preserve DT and LVI across the pass
JumpThreading now preserves dominance and lazy value information across the
entire pass.  The pass manager is also informed of this preservation with
the goal of DT and LVI being recalculated fewer times overall during
compilation.

This change prepares JumpThreading for enhanced opportunities; particularly
those across loop boundaries.

Patch by: Brian Rzycki <b.rzycki@samsung.com>,
          Sebastian Pop <s.pop@samsung.com>

Differential revision: https://reviews.llvm.org/D37528

llvm-svn: 314435
2017-09-28 17:24:40 +00:00
Sanjoy Das def1729dc4 Use a BumpPtrAllocator for Loop objects
Summary:
And now that we no longer have to explicitly free() the Loop instances, we can
(with more ease) use the destructor of LoopBase to do what LoopBase::clear() was
doing.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38201

llvm-svn: 314375
2017-09-28 02:45:42 +00:00
Sanjoy Das 3567d3d2ec Rename LoopUnrollStatus to LoopUnrollResult; NFC
A "Result" suffix is more appropriate here

llvm-svn: 314350
2017-09-27 21:45:19 +00:00
Sanjay Patel 0f9b4773c1 [SimplifyCFG] add a struct to house optional folds (PR34603)
This was intended to be no-functional-change, but it's not - there's a test diff.

So I thought I should stop here and post it as-is to see if this looks like what was expected 
based on the discussion in PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603

Notes:
 1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried 
    through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'. 
    The parameter isn't passed down, so we pick up the default value from the function signature 
    after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the 
    'SimplifyCFG' calls.

 2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops. 
    This would theoretically allow us to differentiate the transforms controlled by those params 
    independently.

 3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too. 
    I just stopped here to minimize the diffs.

 4. Similarly, I stopped short of messing with the pass manager layer. I have another question that 
    could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG 
    set to true no matter where in the pipeline it's creating SimplifyCFG passes?

    // Create an early function pass manager to cleanup the output of the
    // frontend.
    EarlyFPM.addPass(SimplifyCFGPass());

    -->

    /// \brief Construct a pass with the default thresholds
    /// and switch optimizations.
    SimplifyCFGPass::SimplifyCFGPass()
       : BonusInstThreshold(UserBonusInstThreshold),
         LateSimplifyCFG(true) {}   <-- switches get converted to lookup tables and loops may not be in canonical form

    If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG' 
    setting via recursion was masking this bug.

Differential Revision: https://reviews.llvm.org/D38138

llvm-svn: 314308
2017-09-27 14:54:16 +00:00
Hongbin Zheng d1b7b2efba [SimplifyIndVar] Constant fold IV users
This patch tries to transform cases like:

for (unsigned i = 0; i < N; i += 2) {
  bool c0 = (i & 0x1) == 0;
  bool c1 = ((i + 1) & 0x1) == 1;
}
To

for (unsigned i = 0; i < N; i += 2) {
  bool c0 = true;
  bool c1 = true;
}

This commit also update test/Transforms/IndVarSimplify/replace-srem-by-urem.ll to prevent constant folding.

Differential Revision: https://reviews.llvm.org/D38272

llvm-svn: 314266
2017-09-27 03:11:46 +00:00
Sanjoy Das eda7a86d42 [BypassSlowDivision] Improve our handling of divisions by constants
Summary:
Don't bail out on constant divisors for divisions that can be narrowed without
introducing control flow .  This gives us a 32 bit multiply instead of an
emulated 64 bit multiply in the generated PTX assembly.

Reviewers: jlebar

Subscribers: jholewinski, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38265

llvm-svn: 314253
2017-09-26 21:54:27 +00:00
Matthias Braun cc603ee3d5 TargetLibraryInfo: Stop guessing wchar_t size
Usually the frontend communicates the size of wchar_t via metadata and
we can optimize wcslen (and possibly other calls in the future). In
cases without the wchar_size metadata we would previously try to guess
the correct size based on the target triple; however this is fragile to
keep up to date and may miss users manually changing the size via flags.
Better be safe and stop guessing and optimizing if the frontend didn't
communicate the size.

Differential Revision: https://reviews.llvm.org/D38106

llvm-svn: 314185
2017-09-26 02:36:57 +00:00
Hongbin Zheng bbe448abd8 [SimplifyIndvar] Minor change to refine r314125, NFC
llvm-svn: 314130
2017-09-25 18:10:36 +00:00
Hongbin Zheng f0093e45c4 [SimplifyIndvar] Replace the srem used by IV if we can prove both of its operands are non-negative
Since now SCEV can handle 'urem', an 'urem' is a better canonical form than an 'srem' because it has well-defined behavior

This is a follow up of D34598

Differential Revision: https://reviews.llvm.org/D38072

llvm-svn: 314125
2017-09-25 17:39:40 +00:00
Sanjoy Das 388b012f4e Rename markAsErased to erase, as pointed out in a previous review; NFC
llvm-svn: 313951
2017-09-22 01:47:41 +00:00
Reid Kleckner 0fe506bc5e Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
The fix is to avoid invalidating our insertion point in
replaceDbgDeclare:
     Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore);
+    if (DII == InsertBefore)
+      InsertBefore = &*std::next(InsertBefore->getIterator());
     DII->eraseFromParent();

I had to write a unit tests for this instead of a lit test because the
use list order matters in order to trigger the bug.

The reduced C test case for this was:
  void useit(int*);
  static inline void inlineme() {
    int x[2];
    useit(x);
  }
  void f() {
    inlineme();
    inlineme();
  }

llvm-svn: 313905
2017-09-21 19:52:03 +00:00
Daniel Jasper 7d2f38d600 Revert r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
.. as well as the two subsequent changes r313826 and r313875.

This leads to segfaults in combination with ASAN. Will forward repro
instructions to the original author (rnk).

llvm-svn: 313876
2017-09-21 12:07:33 +00:00
Sanjay Patel 73811a152a [SimplifyCFG] don't create a no-op subtract
I noticed this inefficiency while investigating PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603

This fix will likely push another bug (we don't maintain state of 'LateSimplifyCFG') 
into hiding, but I'll try to clean that up with a follow-up patch anyway.

llvm-svn: 313829
2017-09-20 22:31:35 +00:00
Reid Kleckner 3f547e87b2 [IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare
Summary:
This implements the design discussed on llvm-dev for better tracking of
variables that live in memory through optimizations:
  http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html

This is tracked as PR34136

llvm.dbg.addr is intended to be produced and used in almost precisely
the same way as llvm.dbg.declare is today, with the exception that it is
control-dependent. That means that dbg.addr should always have a
position in the instruction stream, and it will allow passes that
optimize memory operations on local variables to insert llvm.dbg.value
calls to reflect deleted stores. See SourceLevelDebugging.rst for more
details.

The main drawback to generating DBG_VALUE machine instrs is that they
usually cause LLVM to emit a location list for DW_AT_location. The next
step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE
ranges as not needing a location list, and possibly start setting
DW_AT_start_offset for variables whose lifetimes begin mid-scope.

Reviewers: aprantl, dblaikie, probinson

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37768

llvm-svn: 313825
2017-09-20 21:52:33 +00:00
Sanjoy Das 09613b122e Tighten the invariants around LoopBase::invalidate
Summary:
With this change:
 - Methods in LoopBase trip an assert if the receiver has been invalidated
 - LoopBase::clear frees up the memory held the LoopBase instance

This change also shuffles things around as necessary to work with this stricter invariant.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38055

llvm-svn: 313708
2017-09-20 02:31:57 +00:00
Sanjoy Das 76ab23234c [LoopInfo] Make LoopBase and Loop destructors non-public
Summary:
See comment for why I think this is a good idea.

This change also:

 - Removes an SCEV test case.  The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop.
 - Updates the loop pass manager test case to deal with a private ~Loop::Loop.
 - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37996

llvm-svn: 313695
2017-09-19 23:19:00 +00:00
Adam Nemet 15fccf0009 Allow ORE.emit to take a closure to delay building the remark object
In the lambda we are now returning the remark by value so we need to preserve
its type in the insertion operator.  This requires making the insertion
operator generic.

I've also converted a few cases to use the new API.  It seems to work pretty
well.  See the LoopUnroller for a slightly more interesting case.

llvm-svn: 313691
2017-09-19 23:00:55 +00:00
Sanjay Patel ca14697c2b [SimplifyCFG] fix typos/formatting; NFC
llvm-svn: 313671
2017-09-19 20:58:14 +00:00
Anna Thomas f34537dff8 [RuntimeUnroll] Add heuristic for unrolling multi-exit loop
Add a profitability heuristic to enable runtime unrolling of multi-exit
loop: There can be atmost two unique exit blocks for the loop and the
second exit block should be a deoptimizing block. Also, there can be one
other exiting block other than the latch exiting block. The reason for
the latter is so that we limit the number of branches in the unrolled
code to being at most the unroll factor.  Deoptimizing blocks are rarely
taken so these additional number of branches created due to the
unrolling are predictable, since one of their target is the deopt block.

Reviewers: apilipenko, reames, evstupac, mkuper

Subscribers: llvm-commits

Reviewed by: reames

Differential Revision: https://reviews.llvm.org/D35380

llvm-svn: 313363
2017-09-15 15:56:05 +00:00
Anna Thomas 512dde77ba [RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup
During runtime unrolling on loops with multiple exits, we update the
exit blocks with the correct phi values from both original and remainder
loop.
In this process, we lookup the VMap for the mapped incoming phi values,
but did not update the VMap if a default entry was generated in the VMap
during the lookup. This default value is generated when constants or
values outside the current loop are looked up.
This patch fixes the assertion failure when null entries are present in
the VMap because of this lookup. Added a testcase that showcases the
problem.

llvm-svn: 313358
2017-09-15 13:29:33 +00:00
Alina Sbirlea 7ed5856a32 Refactor collectChildrenInLoop to LoopUtils [NFC]
Summary: Move to LoopUtils method that collects all children of a node inside a loop.

Reviewers: majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37870

llvm-svn: 313322
2017-09-15 00:04:16 +00:00
Nuno Lopes 404f106d71 Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non null

Differential Revision: https://reviews.llvm.org/D37628

llvm-svn: 312869
2017-09-09 18:23:11 +00:00
Sam Parker 7cd826a321 [LoopUnroll][DebugInfo] Don't add metadata to unrolled remainder loop
Debug information can be, and was, corrupted when the runtime
remainder loop was fully unrolled. This is because a !null node can
be created instead of a unique one describing the loop. In this case,
the original node gets incorrectly updated with the NewLoopID
metadata.

In the case when the remainder loop is going to be quickly fully
unrolled, there isn't the need to add loop metadata for it anyway.

Differential Revision: https://reviews.llvm.org/D37338

llvm-svn: 312471
2017-09-04 08:12:16 +00:00
Alexey Bataev 978e2e4760 [SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores.
Summary:
If SimplifyCFG pass is able to merge conditional stores into single one,
it loses the alignment. This may lead to incorrect codegen. Patch
sets the alignment of the new instruction if it is set in the original
one.

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36841

llvm-svn: 312030
2017-08-29 20:06:24 +00:00
Davide Italiano 20cb7e887f [LoopUnroll] Properly update loop structure in case of successful peeling.
When peeling kicks in, it updates the loop preheader.
Later, a successful full unroll of the loop needs to update a PHI
which i-th argument comes from the loop preheader, so it'd better look
at the correct block. Fixes PR33437.

Differential Revision:  https://reviews.llvm.org/D37153

llvm-svn: 311922
2017-08-28 20:29:33 +00:00
Dehao Chen 191b24d3d2 revert r310985 which breaks for the following case:
struct string {
  ~string();
};
void f2();
void f1(int) { f2(); }
void run(int c) {
  string body;
  while (true) {
    if (c)
      f1(c);
    else
      f1(c);
  }
}

Will recommit once the issue is fixed.

llvm-svn: 311864
2017-08-27 22:22:39 +00:00
Sanjay Patel 5d67d8916e [BypassSlowDivision] move map helper code to header; NFC
We can reuse this code with other div/rem transforms as shown in:
https://reviews.llvm.org/D31037 
https://bugs.llvm.org/show_bug.cgi?id=31028

llvm-svn: 311661
2017-08-24 14:43:33 +00:00
Sanjay Patel 82ec872990 [LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try)
The 1st try was reverted because it could inf-loop by creating a dead instruction.
Fixed that to not happen and added a test case to verify.

Original commit message:

Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

llvm-svn: 311366
2017-08-21 19:13:14 +00:00
Sanjay Patel 707f786cc5 revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments
We're getting lots of compile-timeout bot failures like:
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux

llvm-svn: 311340
2017-08-21 15:16:25 +00:00
Sanjay Patel 7756edfa93 [LibCallSimplifier] try harder to fold memcmp with constant arguments
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data, 
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion 
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're 
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to 
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

llvm-svn: 311333
2017-08-21 13:55:49 +00:00
Benjamin Kramer 2e5be849cc [Mem2Reg] Modernize code a bit.
No functionality change intended.

llvm-svn: 311290
2017-08-20 14:34:44 +00:00
Chandler Carruth 4f3aa29a46 [Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.

We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.

This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.

The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.

Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]

llvm-svn: 311229
2017-08-19 06:56:11 +00:00
Jakub Kuderski e35a449140 [Dominators] Teach LoopUnswitch to use the incremental API
Summary:
This patch makes LoopUnswitch use new incremental API for updating dominators.
It also updates SplitCriticalEdge, as it is called in LoopUnswitch.

There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch.

Reviewers: dberlin, davide, sanjoy, grosser, chandlerc

Reviewed By: davide, grosser

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35528

llvm-svn: 311093
2017-08-17 16:45:35 +00:00
Dehao Chen 84d412035a Merge debug info when hoist then-else code to if.
Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached.

Reviewers: aprantl, probinson, dblaikie, echristo, loladiro

Reviewed By: aprantl

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D36778

llvm-svn: 310985
2017-08-16 01:55:26 +00:00
Ayal Zaks 25e2800e20 [LV] Minor savings to Sink casts to unravel first order recurrence
Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it
is not needed.

Differential Revision: https://reviews.llvm.org/D36408

llvm-svn: 310910
2017-08-15 08:32:59 +00:00
Craig Topper 0aa3a19512 Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
This recommits r310869, with the moved files and no extra changes.

Original commit message:

This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310889
2017-08-14 21:39:51 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Craig Topper 69fa8e0d99 Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything.

llvm-svn: 310873
2017-08-14 19:09:32 +00:00
Craig Topper 2f0b450666 [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.

I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.

I also had to make decomposeBitTest support vectors since InstSimplify needs that.

As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.

Differential Revision: https://reviews.llvm.org/D36593

llvm-svn: 310869
2017-08-14 18:49:42 +00:00
Sam Parker 718c8a6a2a [LoopUnroll] Enable option to peel remainder loop
On some targets, the penalty of executing runtime unrolling checks
and then not the unrolled loop can be significantly detrimental to
performance. This results in the need to be more conservative with
the unroll count, keeping a trip count of 2 reduces the overhead as
well as increasing the chance of the unrolled body being executed. But
being conservative leaves performance gains on the table.

This patch enables the unrolling of the remainder loop introduced by
runtime unrolling. This can help reduce the overhead of misunrolled
loops because the cost of non-taken branches is much less than the
cost of the backedge that would normally be executed in the remainder
loop. This allows larger unroll factors to be used without suffering
performance loses with smaller iteration counts.

Differential Revision: https://reviews.llvm.org/D36309

llvm-svn: 310824
2017-08-14 09:25:26 +00:00
Craig Topper 9cd976d041 [DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use.
This make it consistent with STATISTIC which it will often appears near.

While there move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable so the namespace is unnecessary.

llvm-svn: 310637
2017-08-10 17:48:11 +00:00
Ewan Crawford e18490c8be [Cloning] Move distinct GlobalVariable debug info metadata in CloneModule
Duplicating the distinct Subprogram and CU metadata nodes seems like the incorrect thing to do in CloneModule for GlobalVariable debug info. As it results in the scope of the GlobalVariable DI no longer being consistent with the rest of the module, and the new CU is absent from llvm.dbg.cu.

Fixed by adding RF_MoveDistinctMDs to MapMetadata flags for GlobalVariables.

Current unit test IR after clone:
```
@gv = global i32 1, comdat($comdat), !dbg !0, !type !5

define private void @f() comdat($comdat) personality void ()* @persfn !dbg !14 {

!llvm.dbg.cu = !{!10}

!0 = !DIGlobalVariableExpression(var: !1)
!1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
!2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5)
!3 = !DIFile(filename: "filename.c", directory: "/file/dir/")
!4 = !DISubroutineType(types: !5)
!5 = !{}
!6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8)
!7 = !DIFile(filename: "filename.c", directory: "/file/dir")
!8 = !{!0}
!9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
!10 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !11)
!11 = !{!12}
!12 = !DIGlobalVariableExpression(var: !13)
!13 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !14, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
!14 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !10, variables: !5)
```

Patched IR after clone:
```
@gv = global i32 1, comdat($comdat), !dbg !0, !type !5

define private void @f() comdat($comdat) personality void ()* @persfn !dbg !2 {

!llvm.dbg.cu = !{!6}

!0 = !DIGlobalVariableExpression(var: !1)
!1 = distinct !DIGlobalVariable(name: "gv", linkageName: "gv", scope: !2, file: !3, line: 1, type: !9, isLocal: false, isDefinition: true)
!2 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !3, line: 4, type: !4, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: false, unit: !6, variables: !5)
!3 = !DIFile(filename: "filename.c", directory: "/file/dir/")
!4 = !DISubroutineType(types: !5)
!5 = !{}
!6 = distinct !DICompileUnit(language: DW_LANG_C99, file: !7, producer: "CloneModule", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !5, globals: !8)
!7 = !DIFile(filename: "filename.c", directory: "/file/dir")
!8 = !{!0}
!9 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
```

Reviewers: aprantl, probinson, dblaikie, echristo, loladiro
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36082

llvm-svn: 309928
2017-08-03 09:23:03 +00:00
Craig Topper 36af40c115 [SimplifyCFG] Fix typo in comment. NFC
llvm-svn: 309785
2017-08-02 02:34:16 +00:00
Chad Rosier dfd1de687d [Value Tracking] Default argument to true and rename accordingly. NFC.
IMHO this is a bit more readable.

llvm-svn: 309739
2017-08-01 20:18:54 +00:00
Davide Italiano 72c4285bd6 [MetaRenamer] Leave `@main` alone.
To the best of my knowledge -metarenamer is used in two cases:
1) obfuscate names, when e.g. they contain informations that
can't be shared.
2) Improve clarity of the textual IR for testcases.

One of the usecases if getting the output of `opt` and passing it
to the lli interpreter to run the test. If metarenamer renames
@main, lli can't find an entry point.

llvm-svn: 309657
2017-08-01 05:14:45 +00:00
Sumanth Gundapaneni 8d50a50e98 [SimplifyCFG] Make the no-jump-tables attribute also disable switch lookup tables
Differential Revision: https://reviews.llvm.org/D35579

llvm-svn: 309444
2017-07-28 22:25:40 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Daniel Neilson 2574d7cbf6 All libcalls should be considered to be GC-leaf functions.
Summary:
It is possible for some passes to materialize a call to a libcall (ex: ldexp, exp2, etc),
but these passes will not mark the call as a gc-leaf-function. All libcalls are
actually gc-leaf-functions, so we change llvm::callsGCLeafFunction() to tell us that
available libcalls are equivalent to gc-leaf-function calls.

Reviewers: sanjoy, anna, reames

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35840

llvm-svn: 309291
2017-07-27 16:49:39 +00:00
David Blaikie 72c0b1cc1b Fix assert from r309278
llvm-svn: 309281
2017-07-27 15:28:10 +00:00
David Blaikie 2f0cc477ab ThinLTO: Don't import aliases of any kind (even linkonce_odr)
Summary:
Until a more advanced version of importing can be implemented for
aliases (one that imports an alias as an available_externally definition
of the aliasee), skip the narrow subset of cases that was possible but
came at a cost: aliases of linkonce_odr functions could be imported
because the linkonce_odr function could be safely duplicated from the
source module. This came/comes at the cost of not being able to 'home'
imported linkonce functions (they had to be emitted linkonce_odr in all
the destination modules (even if they weren't used by an alias) rather
than as available_externally - causing extra object size).

Tangentially, this also was the only reason ThinLTO would emit multiple
CUs in to the resulting DWARF - which happens to be a problem for
Fission (there's a fix for this in GDB but not released yet, etc).
(actually it's not the only reason - but I'm sending a patch to fix the
other reason shortly)

There's no reason to believe this particularly narrow alias importing
was especially/meaningfully important, only that it was /possible/ to
implement in this way. When a more general solution is done, it should
still satisfy the DWARF concerns above, since the import will still be
available_externally, and thus not create extra CUs.

Since now all aliases are treated the same, I removed/simplified some
test cases since they were testing corner cases where there are no
longer any corners.

Reviewers: tejohnson, mehdi_amini

Differential Revision: https://reviews.llvm.org/D35875

llvm-svn: 309278
2017-07-27 15:09:06 +00:00
Adam Nemet ea06e6e865 Migrate SimplifyLibCalls to new OptimizationRemarkEmitter
Summary:
This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter
API.

In fact, as SimplifyLibCalls is only ever called via InstCombine,
(as far as I can tell) the OptimizationRemarkEmitter is added there,
and then passed through to SimplifyLibCalls later.

I have avoided changing any remark text.

This closes PR33787

Patch by Sam Elliott!

Reviewers: anemet, davide

Reviewed By: anemet

Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D35608

llvm-svn: 309158
2017-07-26 19:03:18 +00:00
Anna Thomas 5c07a4c5de [RuntimeUnroll] NFC: Add a profitability function for mutliexit loop
Separated out the profitability from the safety analysis for multiexit
loop unrolling. Currently, this is an NFC because profitability is true
only if the unroll-runtime-multi-exit is set to true (off-by-default).

This is to ease adding the profitability heuristic up for review at
D35380.

llvm-svn: 308753
2017-07-21 16:30:38 +00:00
Dinar Temirbulatov a61f4b8957 [LoopUtils] Add an extra parameter OpValue to propagateIRFlags function,
If OpValue is non-null, we only consider operations similar to OpValue
when intersecting.

Differential Revision: https://reviews.llvm.org/D35292

llvm-svn: 308428
2017-07-19 10:02:07 +00:00
Balaram Makam b05a55787a [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary:
When simplifying unconditional branches from empty blocks, we pre-test if the
BB belongs to a set of loop headers and keep the block to prevent passes from
destroying canonical loop structure. However, the current algorithm fails if
the destination of the branch is a loop header. Especially when such a loop's
latch block is folded into loop header it results in additional backedges and
LoopSimplify turns it into a nested loop which prevent later optimizations
from being applied (e.g., loop  unrolling and loop interleaving).

This patch augments the existing algorithm by further checking if the
destination of the branch belongs to a set of loop headers and defer
eliminating it if yes to LateSimplifyCFG.

Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl

Reviewed By: efriedma

Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35411

llvm-svn: 308422
2017-07-19 08:53:34 +00:00
Simon Pilgrim f32f4be957 Fix unused variable warning on EXPENSIVE_CHECKS release builds. NFCI.
llvm-svn: 307929
2017-07-13 17:10:12 +00:00
Anna Thomas ec9b326569 [RuntimeUnrolling] Update DomTree correctly when exit blocks have successors
Summary:
When we runtime unroll with multiple exit blocks, we also need to update the
immediate dominators of the immediate successors of the exit blocks.

Reviewers: reames, mkuper, mzolotukhin, apilipenko

Reviewed by: mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35304

llvm-svn: 307909
2017-07-13 13:21:23 +00:00
Anna Thomas 8e431a9851 [LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loop
Refactored the code and separated out a function
`canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it
easier to add profitability heuristics later.
Added tests to runtime unrolling to make sure that unrolling for
multi-exit loops is not done unless the option
-unroll-runtime-multi-exit is true.

llvm-svn: 307843
2017-07-12 20:55:43 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Anna Thomas bafe766f5d [LoopUnrollRuntime] NFC: Add some debugging trace messages for why loop wasn't unrolled.
llvm-svn: 307705
2017-07-11 20:44:37 +00:00
Anna Thomas 5526a33f4f [LoopUnrollRuntime] Avoid multi-exit nested loop with epilog generation
The loop structure for the outer loop does not contain the epilog
preheader when we try to unroll inner loop with multiple exits and
epilog code is generated. For now, we just bail out in such cases.
Added a test case that shows the problem. Without this bailout, we would
trip on assert saying LCSSA form is incorrect for outer loop.

llvm-svn: 307676
2017-07-11 17:16:33 +00:00
Leo Li 93abd7d915 [ConstantHoisting] Remove dupliate logic in constant hoisting
Summary:
As metioned in https://reviews.llvm.org/D34576, checkings in
`collectConstantCandidates` can be replaced by using
`llvm::canReplaceOperandWithVariable`.

The only special case is that `collectConstantCandidates` return false for
all `IntrinsicInst` but it is safe for us to collect constant candidates from
`IntrinsicInst`.

Reviewers: pirama, efriedma, srhines

Reviewed By: efriedma

Subscribers: llvm-commits, javed.absar

Differential Revision: https://reviews.llvm.org/D34921

llvm-svn: 307587
2017-07-10 20:45:34 +00:00
Anna Thomas 70ffd65ca9 [LoopUnrollRuntime] Remove strict assert about VMap requirement
When unrolling under multiple exits which is under off-by-default option,
the assert that checks for VMap entry in loop exit values is too strong.
(assert if VMap entry did not exist, the value should be a
constant). However, values derived from
constants or from values outside loop, does not have a VMap entry too.

Removed the assert and added a testcase showcasing the property for
non-constant values.

llvm-svn: 307542
2017-07-10 15:29:38 +00:00
Craig Topper 95d2347ae1 [IR] Make use of Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC
llvm-svn: 307491
2017-07-09 07:04:00 +00:00
Max Kazantsev b9edcbcb1d Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars"
The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp
instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate.
In this patch we use the original predicate to do the transformation.

Also added a test case that exercises this situation.

Differentian Revision: https://reviews.llvm.org/D35107

llvm-svn: 307477
2017-07-08 17:17:30 +00:00
Anna Thomas e3872003d0 [LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog remainder generated
With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic
is in place to support multiple exit/exiting blocks when prolog
remainder is generated.
This patch removed the assert that multiple exit blocks unrolling is only
supported when epilog remainder is generated.

Also, added test runs and checks with PROLOG prefix in
runtime-loop-multiple-exits.ll test cases.

llvm-svn: 307435
2017-07-07 20:12:32 +00:00
Davide Italiano 4eb210bdeb [Local] Update the comment for removeUnreachableBlocks.
It referenced a wrong function name, and didn't mention what the
second argument did. This should be slightly more accurate now.

llvm-svn: 307425
2017-07-07 18:54:14 +00:00
Gor Nishanov 8cdf648795 [cloning] Do not duplicate types when cloning functions
Summary:
This is an addon to the change rl304488 cloning fixes. (Originally rl304226 reverted rl304228 and reapplied rl304488 https://reviews.llvm.org/D33655)

rl304488 works great when DILocalVariables that comes from the inlined function has a 'unique-ed' type, but,
in the case when the variable type is distinct we will create a second DILocalVariable in the scope of the original function that was inlined.

Consider cloning of the following function:
```
define private void @f() !dbg !5 {
  %1 = alloca i32, !dbg !11
  call void @llvm.dbg.declare(metadata i32* %1, metadata !14, metadata !12), !dbg !18
  ret void, !dbg !18
}

!14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) ; came from an inlined function
!15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16)
!16 = !{!14}
!17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32)
```

Without this fix, when function 'f' is cloned, we will create another DILocalVariable for "inlined", due to its type being distinct.

```
define private void @f.1() !dbg !23 {
  %1 = alloca i32, !dbg !26
  call void @llvm.dbg.declare(metadata i32* %1, metadata !28, metadata !12), !dbg !30
  ret void, !dbg !30
}

!14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17)
!15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16)
!16 = !{!14}
!17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32)
 ;
!28 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !29) ; OOPS second DILocalVariable
!29 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32)
```

Now we have two DILocalVariable for "inlined" within the same scope. This result in assert in AsmPrinter/DwarfDebug.h:131: void llvm::DbgVariable::addMMIEntry(const llvm::DbgVariable &): Assertion `V.Var == Var && "conflicting variable"' failed.
(Full example: See: https://bugs.llvm.org/show_bug.cgi?id=33492)

In this change we prevent duplication of types so that when a metadata for DILocalVariable is cloned it will get uniqued to the same metadate node as an original variable.

Reviewers: loladiro, dblaikie, aprantl, echristo

Reviewed By: loladiro

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D35106

llvm-svn: 307418
2017-07-07 18:24:20 +00:00
Anna Thomas 734ab3f75c [LoopUnrollRuntime] NFC: use the precomputed loop exit in ConnectProlog
Minor refactoring to use the preexisting loop exit that's already
calculated. We do not need to recompute the loop exit in ConnectProlog.
Apart from avoiding redundant computation, this is required for
supporting multiple loop exits when Prolog remainder loops are generated.

llvm-svn: 307417
2017-07-07 18:05:28 +00:00
Sean Fertile 9cd1cdf814 Extend memcpy expansion in Transform/Utils to handle wider operand types.
Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the
target to provide the operand types through TTI callbacks. The default values
for the TTI callbacks use int8 operand types and matches the existing behaviour
if they aren't overridden by the target.

Differential revision: https://reviews.llvm.org/D32536

llvm-svn: 307346
2017-07-07 02:00:06 +00:00
Leo Li 5499b1b8be Modify constraints in `llvm::canReplaceOperandWithVariable`
Summary:
`Instruction::Switch`: only first operand can be set to a non-constant value.
`Instruction::InsertValue` both the first and the second operand can be set to a non-constant value.
`Instruction::Alloca` return true for non-static allocation.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: srhines, pirama, llvm-commits

Differential Revision: https://reviews.llvm.org/D34905

llvm-svn: 307294
2017-07-06 18:47:05 +00:00
Craig Topper 79ab643da8 [Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.

llvm-svn: 307292
2017-07-06 18:39:47 +00:00
Anna Thomas eb6d5d1950 [LoopUnrollRuntime] Bailout when multiple exiting blocks to the unique latch exit block
Currently, we do not support multiple exiting blocks to the
latch exit block. However, this bailout wasn't triggered when we had a
unique exit block (which is the latch exit), with multiple exiting
blocks to that unique exit.

Moved the bailout so that it's triggered in both cases and added
testcase.

llvm-svn: 307291
2017-07-06 18:39:26 +00:00
Craig Topper dfd01ea9ed [SimplifyCFG] Move a portion of an if statement that should already be implied to an assert
Summary: In this code we got to Dom by following the predecessor link of BB. So it stands to reason that BB should also show up as a successor of Dom's terminator right? There isn't a way to have the CFG connect in only one direction is there?

Reviewers: jmolloy, davide, mcrosier

Reviewed By: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D35025

llvm-svn: 307276
2017-07-06 16:29:43 +00:00
Max Kazantsev 98838527c6 Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"""
It appears that the problem is still there. Needs more analysis to understand why
SaturatedMultiply test fails.

llvm-svn: 307249
2017-07-06 10:47:13 +00:00
Max Kazantsev c8db20b78c Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""
It seems that the patch was reverted by mistake. Clang testing showed failure of the
MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the
fresh code base and was able to confirm that the transformation introduced by the change
does not happen in the said test. This gives a strong confidence that the actual reason of
the failure of the initial patch was somewhere else, and that problem now seems to be
fixed. Re-submitting the change to confirm that.

llvm-svn: 307244
2017-07-06 09:57:41 +00:00
David Green b26a0a460c [IndVarSimplify] Add AShr exact flags using induction variables ranges.
This adds exact flags to AShr/LShr flags where we can statically
prove it is valid using the range of induction variables. This
allows further optimisations to remove extra loads.

Differential Revision: https://reviews.llvm.org/D34207

llvm-svn: 307157
2017-07-05 13:25:58 +00:00
Max Kazantsev ebe56283bc Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"
This patch seems to cause failures of test MathExtras.SaturatingMultiply on
multiple buildbots. Reverting until the reason of that is clarified.

Differential Revision: https://reviews.llvm.org/rL307126

llvm-svn: 307135
2017-07-05 09:44:41 +00:00
Max Kazantsev 80bc4a5554 [IndVars] Canonicalize comparisons between non-negative values and indvars
-If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative,
then signed and unsigned comparisons between them produce the same result. Both of those can be
seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like

  %c = icmp slt i32 %iv, %b
to

  %c = icmp ult i32 %iv, %b

if both %iv and %b are known to be non-negative.

Differential Revision: https://reviews.llvm.org/D34979

llvm-svn: 307126
2017-07-05 06:38:49 +00:00
Davide Italiano e3f7dda1fb [CodeExtractor] Remove unneded and commented out debugging stmts.
llvm-svn: 306966
2017-07-02 00:07:18 +00:00
Davide Italiano 9282f1aece [Cloner] Re-map simplfied cloned instructions.
This commit pretty much rolls back the logic added in r306495
as in the testcase provided we simplify an `icmp` looking through
a PHI that hasn't been mapped yet.

I think instsimplify shouldn't do threading over select/phis or
just looking through phis in general, but this is what we have
now. Also, add a test to prevent this from happening in case somebody
wants to modify this code again.

Briefly discussed with Kyle Butt (thanks Kyle!).

llvm-svn: 306938
2017-07-01 03:29:33 +00:00
Teresa Johnson 32d95742b8 Recommit "r306541 - Add zero-length check to memcpy/memset load store loop expansion""
With fix for use-after-free errors. We can't add the new branch and
remove the old one until we are done with the Builder constructed for
the block.

llvm-svn: 306937
2017-07-01 03:24:10 +00:00
Ayal Zaks 2ff59d4350 [LV] Sink casts to unravel first order recurrence
Check if a single cast is preventing handling a first-order-recurrence Phi,
because the scheduling constraints it imposes on the first-order-recurrence
shuffle are infeasible; but they can be made feasible by moving the cast
downwards. Record such casts and move them when vectorizing the loop.

Differential Revision: https://reviews.llvm.org/D33058

llvm-svn: 306884
2017-06-30 21:05:06 +00:00
Sumanth Gundapaneni 5372f0a73e [SimplifyCFG] Update the name of switch generated lookup table.
This patch appends the name of the function to the switch generated lookup
table. This will ease the visual debugging in identifying the function the table
is generated from.

Differential Revision: https://reviews.llvm.org/D34817

llvm-svn: 306867
2017-06-30 20:00:01 +00:00
Anna Thomas e5e5e59d8b [RuntimeUnrolling] Add logic for loops with multiple exit blocks
Summary:
Runtime unrolling is done for loops with a single exit block and a
single exiting block (and this exiting block should be the latch block).
This patch adds logic to support unrolling in the presence of multiple exit
blocks (which also means multiple exiting blocks).
Currently this is under an off-by-default option and is supported when
epilog code is generated. Support in presence of prolog code will be in
a future patch (we just need to add more tests, and update comments).

This patch is essentially an implementation patch. I have not added any
heuristic (in terms of branches added or code size) to decide when
this should be enabled.

Reviewers: mkuper, sanjoy, reames, evstupac

Reviewed by: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33001

llvm-svn: 306846
2017-06-30 17:57:07 +00:00
Daniel Jasper 3b704ceba1 Revert "r306541 - Add zero-length check to memcpy/memset load store loop expansion"
Segfaults in non-optimized builds. I'll get a stack trace and a
reproducer to Teresa.

llvm-svn: 306793
2017-06-30 06:37:33 +00:00
Max Kazantsev 8d0322e612 [SCEV] Use depth limit instead of local cache for SExt and ZExt
In rL300494 there was an attempt to deal with excessive compile time on
invocations of getSign/ZeroExtExpr using local caching. This approach only
helps if we request the same SCEV multiple times throughout recursion. But
in the bug PR33431 we see a case where we request different values all the time,
so caching does not help and the size of the cache grows enormously.

In this patch we remove the local cache for this methods and add the recursion
depth limit instead, as we do for arithmetics. This gives us a guarantee that the
invocation sequence is limited and reasonably short.

Differential Revision: https://reviews.llvm.org/D34273

llvm-svn: 306785
2017-06-30 05:04:09 +00:00
Xin Tong 02008c30b5 Remove useless header. NFC
llvm-svn: 306712
2017-06-29 17:48:12 +00:00
Daniel Berlin b7df17ec59 PredicateInfo: Use OrderedInstructions instead of our homemade
version.

llvm-svn: 306703
2017-06-29 17:01:14 +00:00
Daniel Berlin 7c757aee38 Remove unneeded else from OrderedInstructions::dominates.
llvm-svn: 306701
2017-06-29 17:01:03 +00:00
Teresa Johnson 538b8d25f0 Add zero-length check to memcpy/memset load store loop expansion
Summary:
I was testing using this expansion logic in other cases besides
NVPTX, and found some runtime failures due to the lack of a check
for a zero length memcpy/memset before the loop. There is already
such a check in the memmove expansion code though.

Reviewers: hfinkel

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D34707

llvm-svn: 306541
2017-06-28 13:07:37 +00:00
Kyle Butt f73c8a06a9 Inlining: Don't re-map simplified cloned instructions.
When simplifying an instruction that has been re-mapped, it should never
simplify to an instruction in the original function. In the edge case
where we are inlining a function into itself, the existing code led to
incorrect behavior. Replace the incorrect code with an assert verifying
that we never expect simplification to produce an instruction in the old
function, unless the functions are the same.

Differential Revision: https://reviews.llvm.org/D33850

llvm-svn: 306495
2017-06-28 01:41:25 +00:00
Serge Guelton 7bc405aa4c [CodeExtractor] Prevent extraction of block involving blockaddress
BlockAddress are only valid within their function context, which does not
interact well with CodeExtractor. Detect this case and prevent it.

Differential Revision: https://reviews.llvm.org/D33839

llvm-svn: 306448
2017-06-27 18:57:53 +00:00
Anna Thomas dc935a6eb6 [LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCI
Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block
(which is proven to be the only exiting block in the loop to be unrolled).

llvm-svn: 306410
2017-06-27 14:14:35 +00:00
Chandler Carruth 2abb65ae11 [InstCombine] Factor the logic for propagating !nonnull and !range
metadata out of InstCombine and into helpers.

NFC, this just exposes the logic used by InstCombine when propagating
metadata from one load instruction to another. The plan is to use this
in SROA to address PR32902.

If anyone has better ideas about how to factor this or name variables,
I'm all ears, but this seemed like a pretty good start and lets us make
progress on the PR.

This is based on a patch by Ariel Ben-Yehuda (D34285).

llvm-svn: 306267
2017-06-26 03:31:31 +00:00
Chandler Carruth 4a000883c7 [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.
This was reverted in r306252, but I already had the bug fixed and was
just trying to form a test case.

The original commit factored the logic for forming dedicated exits
inside of LoopSimplify into a helper that could be used elsewhere and
with an approach that required fewer intermediate data structures. See
that commit for full details including the change to the statistic, etc.

The code looked fine to me and my reviewers, but in fact didn't handle
indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty.

If you have code that looks *just* right, you can end up leaking these
predecessors into a subsequent rewrite, and crash deep down when trying
to update PHI nodes for predecessors that don't exist.

I've added an assert that makes the bug much more obvious, and then
changed the code to reliably clear the vector so we don't get this bug
again in some other form as the code changes.

I've also added a test case that *does* manage to catch this while also
giving some nice positive coverage in the face of indirectbr.

The real code that found this came out of what I think is CPython's
interpreter loop, but any code with really "creative" interpreter loops
mixing indirectbr and other exit paths could manage to tickle the bug.
I was hard to reduce the original test case because in addition to
having a particular pattern of IR, the whole thing depends on the order
of the predecessors which is in turn depends on use list order. The test
case added here was designed so that in multiple different predecessor
orderings it should always end up going down the same path and tripping
the same bug. I hope. At least, it tripped it for me without
manipulating the use list order which is better than anything bugpoint
could do...

llvm-svn: 306257
2017-06-25 22:45:31 +00:00
Daniel Jasper 4c6cd4ccb7 Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."
This leads to a segfault. Chandler already has a test case and should be
able to recommit with a fix soon.

llvm-svn: 306252
2017-06-25 17:58:25 +00:00
Craig Topper 72ee6945af [Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC
llvm-svn: 306204
2017-06-24 06:24:01 +00:00
Anna Thomas 91eed9ac1a [RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFC
The single exit block allowed in runtime unrolling is guaranteed to be
the Latch's successor, so rename it as LatchExitBlock.

llvm-svn: 306105
2017-06-23 14:28:01 +00:00
Chandler Carruth 4ab0f4910a [LoopSimplify] Factor the logic to form dedicated exits into a utility.
I want to use the same logic as LoopSimplify to form dedicated exits in
another pass (SimpleLoopUnswitch) so I wanted to factor it out here.

I also noticed that there is a pretty significantly more efficient way
to implement this than the way the code in LoopSimplify worked. We don't
need to actually retain the set of unique exit blocks, we can just
rewrite them as we find them and use only a set to deduplicate.

This did require changing one part of LoopSimplify to not re-use the
unique set of exits, but it only used it to check that there was
a single unique exit. That part of the code is about to walk the exiting
blocks anyways, so it seemed better to rewrite it to use those exiting
blocks to compute this property on-demand.

I also had to ditch a statistic, but it doesn't seem terribly valuable.

Differential Revision: https://reviews.llvm.org/D34049

llvm-svn: 306081
2017-06-23 04:03:04 +00:00
Xin Tong 9d2a5b1cf7 Add argmononly attribute to strlen and wcslen, i.e. they only read memory (string) passed to them.
Summary:
This allows strlen to be moved out of the loop in case its argument is
not modified in the loop in LICM.

Reviewers: hfinkel, davide, sanjoy, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34323

llvm-svn: 305641
2017-06-18 03:10:26 +00:00
Max Kazantsev dc80366d52 [ScalarEvolution] Apply Depth limit to getMulExpr
This is a fix for PR33292 that shows a case of extremely long compilation
of a single .c file with clang, with most time spent within SCEV.

We have a mechanism of limiting recursion depth for getAddExpr to avoid
long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr
and back that do not propagate the info about depth. As result of this, a chain

  getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr

can be extremely long, with every segment of getAddExpr's being up to max depth long.
This leads either to long compilation or crash by stack overflow. We face this situation while
analyzing big SCEVs in the test of PR33292.

This patch applies the same limit on max expression depth for getAddExpr and getMulExpr.

Differential Revision: https://reviews.llvm.org/D33984

llvm-svn: 305463
2017-06-15 11:48:21 +00:00
Daniel Berlin 6d2db9edb2 PredicateInfo: Don't insert conditional info when a conditional branch jumps to the same target regardless of condition
llvm-svn: 305416
2017-06-14 21:19:52 +00:00
Xinliang David Li 7ed6cd32ea [PartialInlining] Support shrinkwrap life_range markers
Differential Revision: http://reviews.llvm.org/D33847

llvm-svn: 305170
2017-06-11 20:46:05 +00:00
Andrew Kaylor 647025f9e1 [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin
Differential Revision: https://reviews.llvm.org/D33737

llvm-svn: 305132
2017-06-09 23:18:11 +00:00
Sanjay Patel 70db424601 [SimplifyLibCalls] fix formatting; NFC
llvm-svn: 305081
2017-06-09 14:22:03 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Xin Tong 9d6f08a8d4 Add a dominanance check interface that uses caching for instructions within same basic block.
Summary:
This problem stems from the fact that instructions are allocated using new
in LLVM, i.e. there is no relationship that can be derived by just looking
at the pointer value.

This interface dispatches to appropriate dominance check given 2 instructions,
i.e. in case the instructions are in the same basic block, ordered basicblock
(with instruction numbering and caching) are used. Otherwise, dominator tree
is used.

This is a preparation patch for https://reviews.llvm.org/D32720

Reviewers: dberlin, hfinkel, davide

Subscribers: davide, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D33380

llvm-svn: 304764
2017-06-06 02:34:41 +00:00
Keno Fischer fa635d730f Reapply "[Cloning] Take another pass at properly cloning debug info"
This was rL304226, reverted in 304228 due to a clang assertion failure
on the build bots. That problem should have been addressed by clang
commit rL304470.

llvm-svn: 304488
2017-06-01 23:02:12 +00:00
Mandeep Singh Grang 33a1b73600 [PredicateInfo] Fix non-determinism in codegen uncovered by reverse iterating SmallPtrSet
Summary:
Sort OpsToRename before iterating to make iteration order deterministic.

Thanks to Daniel Berlin for the sorting logic.

Reviewers: dberlin, RKSimon, efriedma, davide

Reviewed By: dberlin, davide

Subscribers: sanjoy, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D33265

llvm-svn: 304447
2017-06-01 18:36:24 +00:00
Zaara Syeda 3a7578c658 [PPC] Inline expansion of memcmp
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.

Differential Revision: https://reviews.llvm.org/D28637

llvm-svn: 304313
2017-05-31 17:12:38 +00:00
Xinliang David Li 74480adafd [PartialInlining] Shrinkwrap allocas with live range contained in outline region.
Differential Revision: http://reviews.llvm.org/D33618

llvm-svn: 304245
2017-05-30 21:22:18 +00:00
Keno Fischer 3fa5db4c04 Revert "[Cloning] Take another pass at properly cloning debug info"
At least one build bot is complaining. Will investigate after lunch.

llvm-svn: 304228
2017-05-30 18:56:26 +00:00
Keno Fischer 945dc1d2d1 [Cloning] Take another pass at properly cloning debug info
Summary:
In rL302576, DISubprograms gained the constraint that a !dbg attachments to functions must
have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support
was adjusted to attempt to enforce this invariant during cloning. However, there
were several problems with the implementation. Part of these were fixed in rL304079.
However, there was a more fundamental problem with these changes, namely that it
bypasses the matadata value map, causing the cloned metadata to be a mix of metadata
pointing to the new suprogram (where manual code was added to fix those up) and the
old suprogram (where this was not the case). This mismatch could cause a number of
different assertion failures in the DWARF emitter. Some of these are given at
https://github.com/JuliaLang/julia/issues/22069, but some others have been observed
as well. Attempt to rectify this by partially reverting the manual DI metadata fixup,
and instead using the standard value map approach. To retain the desired semantics
of not duplicating the compilation unit and inlined subprograms, explicitly freeze
these in the value map.

Reviewers: dblaikie, aprantl, GorNishanov, echristo

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33655

llvm-svn: 304226
2017-05-30 18:28:30 +00:00
Gor Nishanov ffbeb22b6f Cloning: Fix debug info cloning
Summary:
I believe https://reviews.llvm.org/rL302576 introduced two bugs:

1) it produces duplicate distinct variables for every: dbg.value describing the same variable.
    To fix the problme I switched form getDistinct() to get() in DebugLoc.cpp: auto reparentVar = [&](DILocalVariable *Var) {
    return DILocalVariable::getDistinct(

2) It passes NewFunction plain name as a linkagename parameter to Subprogram constructor. Breaks assert in:

 || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed.
#9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3
#
(Edit: reproducer added)

Here how https://reviews.llvm.org/rL302576 broke coroutine debug info.
Coroutine body of the original function is split into several parts by cloning and removing unneeded code.
All parts describe the original function and variables present in the original function.

For a simple case, prior to Split, original function has these two blocks:

```
PostSpill:                                        ; preds = %AllocaSpillBB
  call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !14, metadata !15), !dbg !13
  store i32 %x, i32* %x.addr, align 4
  ...
and

sw.epilog:                                        ; preds = %sw.bb
  %x.addr.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4, !dbg !20
  %4 = load i32, i32* %x.addr.reload.addr, align 4, !dbg !20
  call void @llvm.dbg.value(metadata i32 %4, i64 0, metadata !14, metadata !15), !dbg !13

!14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11)

```

Note that in two blocks different expression represent the same original user variable X.

Before rL302576, for every cloned function there was exactly one cloned DILocalVariable(name: "x" as in:

```
define i8* @f(i32 %x) #0 !dbg !6 {
  ...
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,
...
!14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11)

define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 {
...
!25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, isOptimized: false, unit: !0, variables: !2)
!28 = !DILocalVariable(name: "x", arg: 1, scope: !25, file: !7, line: 55, type: !11)
```
After rL302576, for every cloned function there were as many DILocalVariable(name: "x" as there were "call void @llvm.dbg.value" for that variable.
This was causing asserts in VerifyDebugInfo and AssemblyPrinter.

Example:

```
!27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55,
!29 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
!39 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
!41 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11)
```

Second problem:

Prior to rL302576, all clones were described by DISubprogram referring to original function.

```
define i8* @f(i32 %x) #0 !dbg !6 {
...
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,

define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 {
...
!25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped,
```

After rL302576, DISubprogram for clones is of two minds, plain name refers to the original name, linkageName refers to plain name of the clone.

```
!27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55,
```

I think the assumption in AsmPrinter is that both name and linkageName should refer to the same entity. It asserts here when they are not:

```
 || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed.
#9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3
```
After this fix, behavior (with respect to coroutines) reverts to exactly as it was before and therefore making them debuggable again, or even more importantly, compilable, with "-g"

Reviewers: dblaikie, echristo, aprantl

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33614

llvm-svn: 304079
2017-05-27 19:41:09 +00:00
James Molloy a929063233 [GVNSink] GVNSink pass
This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts.

This pass attempts to sink instructions into successors, reducing static
instruction count and enabling if-conversion.
We use a variant of global value numbering to decide what can be sunk.
Consider:

[ %a1 = add i32 %b, 1  ]   [ %c1 = add i32 %d, 1  ]
[ %a2 = xor i32 %a1, 1 ]   [ %c2 = xor i32 %c1, 1 ]
                 \           /
           [ %e = phi i32 %a2, %c2 ]
           [ add i32 %e, 4         ]

GVN would number %a1 and %c1 differently because they compute different
results - the VN of an instruction is a function of its opcode and the
transitive closure of its operands. This is the key property for hoisting
and CSE.

What we want when sinking however is for a numbering that is a function of
the *uses* of an instruction, which allows us to answer the question "if I
replace %a1 with %c1, will it contribute in an equivalent way to all
successive instructions?". The (new) PostValueTable class in GVN provides this
mapping.

This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG.

Differential revision: https://reviews.llvm.org/D24805

llvm-svn: 303850
2017-05-25 12:51:11 +00:00
Craig Topper 8205a1a9b6 [ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object.
This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits.

Differential Revision: https://reviews.llvm.org/D33431

llvm-svn: 303773
2017-05-24 16:53:07 +00:00
Reid Kleckner 8bf67fe98f [IR] Switch AttributeList to use an array for O(1) access
Summary:
Before this change, AttributeLists stored a pair of index and
AttributeSet. This is memory efficient if most arguments do not have
attributes. However, it requires doing a search over the pairs to test
an argument or function attribute. Profiling shows that this loop was
0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly
tests values for nullability.

This was worth about 2.5% of mid-level optimization cycles on the
sqlite3 amalgamation. Here are the full perf results:
https://reviews.llvm.org/P7995

Here are just the before and after cycle counts:
```
$ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null
    13,274,181,184      cycles                    #    3.047 GHz                      ( +-  0.28% )
$ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null
    12,906,927,263      cycles                    #    3.043 GHz                      ( +-  0.51% )
```

This patch *does not* change the indices used to query attributes, as
requested by reviewers. Tracking whether an index is usable for array
indexing is a huge pain that affects many of the internal APIs, so it
would be good to come back later and do a cleanup to remove this
internal adjustment.

Reviewers: pete, chandlerc

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D32819

llvm-svn: 303654
2017-05-23 17:01:48 +00:00
Anna Thomas c07d5544dd [JumpThreading] Safely replace uses of condition
This patch builds over https://reviews.llvm.org/rL303349 and replaces
the use of the condition only if it is safe to do so.

We should not blindly RAUW the condition if experimental.guard or assume
is a use of that
condition. This is because LVI may have used the guard/assume to
identify the
value of the condition, and RUAWing will fold the guard/assume and uses
before the guards/assumes.

Reviewers: sanjoy, reames, trentxintong, mkazantsev

Reviewed by: sanjoy, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33257

llvm-svn: 303633
2017-05-23 13:36:25 +00:00
Teresa Johnson 525dcb617b Fix update VP metadata after inlining for instrumentation PGO
Summary:
With instrumentation profiling, when updating the VP metadata after
an inline, VP metadata on the inlined copy was inadvertantly having
all counts zeroed out. This was causing indirect calls from code inlined
during the call step to be marked as cold in the ThinLTO summaries and
not imported.

The CallerBFI needs to be passed down so that the CallSiteCount can be
computed from the profile summary info. With Sample PGO this was working
since the count is extracted from the branch weight metadata on the
call being inlined (even before we stopped looking at metadata for
non-sample PGO in r302844 this largely wasn't working for instrumentation
PGO since only promoted indirect calls would be getting inlined and have
the metadata).

Added an instrumentation PGO test and renamed the sample PGO test.

Reviewers: danielcdh, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D33389

llvm-svn: 303574
2017-05-22 20:28:18 +00:00
Craig Topper e777fed152 [SimplifyCFG] Prevent a few APInt copies on method calls that return const reference. NFCI
llvm-svn: 303523
2017-05-22 00:49:35 +00:00
Xin Tong 9fbfeefadf Revert "Add pthread_self function prototype and make it speculatable."
This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21.

Build breaking

llvm-svn: 303496
2017-05-21 00:37:55 +00:00
Xin Tong 75af3af957 Add pthread_self function prototype and make it speculatable.
Summary: This allows pthread_self to be pulled out of a loop by LICM.

Reviewers: hfinkel, arsenm, davide

Reviewed By: davide

Subscribers: davide, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D32782

llvm-svn: 303495
2017-05-20 22:40:25 +00:00
Matthias Braun 50ec0b5dce SimplifyLibCalls: Optimize wcslen
Refactor the strlen optimization code to work for both strlen and wcslen.

This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.

This also fixes a lingerind API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer which
did not work well in the TrimAtNul==false causing the PR mentioned
below.

Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases for out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.

The refactoring "accidentally" fixes http://llvm.org/PR32124

Differential Revision: https://reviews.llvm.org/D32839

llvm-svn: 303461
2017-05-19 22:37:09 +00:00
Reid Kleckner 96ab8726a3 [IR] De-virtualize ~Value to save a vptr
Summary:
Implements PR889

Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:

https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing

This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal.  However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue.  I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.

I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.

Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv

Reviewed By: chandlerc

Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D31261

llvm-svn: 303362
2017-05-18 17:24:10 +00:00
David Blaikie 441cfee780 PR32288: Describe a bool parameter's DWARF location with a simple register
There's no need (& a bit incorrect) to mask off the high bits of the
register reference when describing a simple bool value.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D31062

llvm-svn: 303117
2017-05-15 21:34:01 +00:00
Craig Topper 8df66c602a [KnownBits] Add bit counting methods to KnownBits struct and use them where possible
This patch adds min/max population count, leading/trailing zero/one bit counting methods.

The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.

Differential Revision: https://reviews.llvm.org/D32931

llvm-svn: 302925
2017-05-12 17:20:30 +00:00
Serge Guelton f4dc59ba8e Remove spurious cast of nullptr. NFC.
Conversion rules allow automatic casting of nullptr to any pointer type.

llvm-svn: 302780
2017-05-11 08:53:00 +00:00
Amara Emerson 836b0f48c1 Add a late IR expansion pass for the experimental reduction intrinsics.
This pass uses a new target hook to decide whether or not to expand a particular
intrinsic to the shuffevector sequence.

Differential Revision: https://reviews.llvm.org/D32245

llvm-svn: 302631
2017-05-10 09:42:49 +00:00
Easwaran Raman f5f9160072 [ProfileSummary] Make getProfileCount a non-static member function.
This change is required because the notion of count is different for
sample profiling and getProfileCount will need to determine the
underlying profile type.

Differential revision: https://reviews.llvm.org/D33012

llvm-svn: 302597
2017-05-09 23:21:10 +00:00
Keno Fischer 06f962c1e8 [GVN] Fix a crash on encountering non-integral pointers
Summary:
This fixes the immediate crash caused by introducing an incorrect inttoptr
before attempting the conversion. There may still be a legality
check missing somewhere earlier for non-integral pointers, but this change
seems necessary in any case.

Reviewers: sanjoy, dberlin

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32623

llvm-svn: 302587
2017-05-09 21:07:20 +00:00
Adrian Prantl c10d0e5ccd Make it illegal for two Functions to point to the same DISubprogram
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>

Differential Revision: https://reviews.llvm.org/D32975

This reapplies r302469 with a fix for a bot failure (reparentDebugInfo
now checks for the case the orig and new function are identical).

llvm-svn: 302576
2017-05-09 19:47:37 +00:00
Piotr Padlewski d979c1f806 NFC: refactor replaceDominatedUsesWith
Summary:
Since I will post patch with some changes to
replaceDominatedUsesWith, it would be good to avoid
duplicating code again.

Reviewers: davide, dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32798

llvm-svn: 302575
2017-05-09 19:39:44 +00:00
Serge Guelton e38003f839 Suppress all uses of LLVM_END_WITH_NULL. NFC.
Use variadic templates instead of relying on <cstdarg> + sentinel.
This enforces better type checking and makes code more readable.

Differential Revision: https://reviews.llvm.org/D32541

llvm-svn: 302571
2017-05-09 19:31:13 +00:00
Hans Wennborg 66fb0d9768 Revert r302469 "Make it illegal for two Functions to point to the same DISubprogram"
This caused PR32977.

Original commit message:

> Make it illegal for two Functions to point to the same DISubprogram
>
> As recently discussed on llvm-dev [1], this patch makes it illegal for
> two Functions to point to the same DISubprogram and updates
> FunctionCloner to also clone the debug info of a function to conform
> to the new requirement. To simplify the implementation it also factors
> out the creation of inlineAt locations from the Inliner into a
> general-purpose utility in DILocation.
>
> [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
> <rdar://problem/31926379>
>
> Differential Revision: https://reviews.llvm.org/D32975

llvm-svn: 302533
2017-05-09 14:44:15 +00:00
Amara Emerson cf9daa33a7 Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2
  shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
  factored out into LoopUtils, and now have a unified interface for generating
  reductions regardless of the preference of the target. LoopUtils now uses TTI
  to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.

Differential Revision: https://reviews.llvm.org/D30086

llvm-svn: 302514
2017-05-09 10:43:25 +00:00
Sanjoy Das 76bbdd1a16 [InstNamer] Use range-for
llvm-svn: 302481
2017-05-08 23:18:43 +00:00
Sanjoy Das 0ac3bf1cc8 [InstNamer] Don't check type of arguments (they're never void)
llvm-svn: 302480
2017-05-08 23:18:39 +00:00
Sanjoy Das 7be961b081 Delete trailing whitespace
llvm-svn: 302479
2017-05-08 23:18:36 +00:00
Adrian Prantl 200a5ef526 Make it illegal for two Functions to point to the same DISubprogram
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>

Differential Revision: https://reviews.llvm.org/D32975

llvm-svn: 302469
2017-05-08 21:17:08 +00:00
Craig Topper 7e3e7afca8 [ConstantRange][SimplifyCFG] Add a helper method to allow SimplifyCFG to determine if a ConstantRange has more than 8 elements without requiring an allocation if the ConstantRange is 64-bits wide.
Previously SimplifyCFG used getSetSize which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64-bits wide, this requires returning a 65-bit APInt. APInt's can only store 64-bits without a memory allocation so this is inefficient.

The new method takes the 8 as an input and tells if the range contains more than that many elements without requiring any wider math.

llvm-svn: 302385
2017-05-07 22:22:11 +00:00
Matthias Braun 60b40b8fec TargetLibraryInfo: Introduce wcslen
wcslen is part of the C99 and C++98 standards.

- This introduces the function to TargetLibraryInfo.
- Also set attributes for wcslen in llvm::inferLibFuncAttributes().

Differential Revision: https://reviews.llvm.org/D32837

llvm-svn: 302278
2017-05-05 20:25:50 +00:00
Evgeniy Stepanov 9aff829f78 Remap metadata attached to global variables.
Fix for PR32577.
Global variables may have !associated metadata, which includes a reference to another global. It needs remapping.

llvm-svn: 302203
2017-05-04 23:29:39 +00:00
Anna Thomas f475fa3575 Avoid warning of unused variable in release builds. NFC
llvm-svn: 302068
2017-05-03 19:25:04 +00:00
Anna Thomas d4c0295cc8 Fix PPC64 warning for missing parantheses. NFC.
llvm-svn: 302061
2017-05-03 18:25:43 +00:00
Reid Kleckner a0b45f4bfc [IR] Abstract away ArgNo+1 attribute indexing as much as possible
Summary:
Do three things to help with that:
- Add AttributeList::FirstArgIndex, which is an enumerator currently set
  to 1. It allows us to change the indexing scheme with fewer changes.
- Add addParamAttr/removeParamAttr. This just shortens addAttribute call
  sites that would otherwise need to spell out FirstArgIndex.
- Remove some attribute-specific getters and setters from Function that
  take attribute list indices.  Most of these were only used from
  BuildLibCalls, and doesNotAlias was only used to test or set if the
  return value is malloc-like.

I'm happy to split the patch, but I think they are probably easier to
review when taken together.

This patch should be NFC, but it sets the stage to change the indexing
scheme to this, which is more convenient when indexing into an array:
  0: func attrs
  1: retattrs
  2...: arg attrs

Reviewers: chandlerc, pete, javed.absar

Subscribers: david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D32811

llvm-svn: 302060
2017-05-03 18:17:31 +00:00
Anna Thomas ac0ec2240b [RuntimeLoopUnroller] Add assert that we dont unroll non-rotated loops
Summary:
Cloning basic blocks in the loop for runtime loop unroller depends on loop being
in rotated form (i.e. loop latch target is the exit block).
Assert that this is true, so that callers of runtime loop unroller pass in
canonical loops.
The single caller of this function has that check recently added:
https://reviews.llvm.org/rL301239

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32801

llvm-svn: 302058
2017-05-03 17:43:59 +00:00
Matt Arsenault 6a288c1e32 Replace hardcoded intrinsic list with speculatable attribute.
No change in which intrinsics should be speculated.

llvm-svn: 301995
2017-05-03 02:26:10 +00:00
Sanjoy Das e6bca0eecb Rename WeakVH to WeakTrackingVH; NFC
This relands r301424.

llvm-svn: 301812
2017-05-01 17:07:49 +00:00
Reid Kleckner 859f8b544a Make getParamAlignment use argument numbers
The method is called "get *Param* Alignment", and is only used for
return values exactly once, so it should take argument indices, not
attribute indices.

Avoids confusing code like:
  IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
  Alignment  = CS->getParamAlignment(ArgIdx + 1);

Add getRetAlignment to handle the one case in Value.cpp that wants the
return value alignment.

This is a potentially breaking change for out-of-tree backends that do
their own call lowering.

llvm-svn: 301682
2017-04-28 20:34:27 +00:00
Daniel Berlin 4d0fe64ae3 Kill off the old SimplifyInstruction API by converting remaining users.
llvm-svn: 301673
2017-04-28 19:55:38 +00:00
Adrian Prantl 109b236850 Clean up DIExpression::prependDIExpr a little. (NFC)
llvm-svn: 301662
2017-04-28 17:51:05 +00:00
Andrew Ng 03e35b6bc0 [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values.
This is a follow up to the fix in r298360 to improve the handling of debug
values when redundant LEAs are removed. The fix in r298360 effectively
discarded the debug values. This patch now attempts to preserve the debug
values by using the DWARF DW_OP_stack_value operation via prependDIExpr.

Moved functions appendOffset and prependDIExpr from Local.cpp to
DebugInfoMetadata.cpp and made them available as static member functions of
DIExpression.

Differential Revision: https://reviews.llvm.org/D31604

llvm-svn: 301630
2017-04-28 08:44:30 +00:00
Evgeniy Stepanov 964f4663c4 [asan] Fix dead stripping of globals on Linux.
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.

Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.

This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.

At this moment it does not work on Gold (as in the globals are never
stripped).

This is a second re-land of r298158. This time, this feature is
limited to -fdata-sections builds.

llvm-svn: 301587
2017-04-27 20:27:27 +00:00
Davide Italiano d7b2a9981c [LibCallsShrinkWrap] Remove an unnecessary class member variable.
llvm-svn: 301477
2017-04-26 21:28:40 +00:00
Davide Italiano 11817ba2ea [LibCallsShrinkWrap] More descriptive assertion messages.
Fix a typo while I'm here.

llvm-svn: 301474
2017-04-26 21:21:02 +00:00
Davide Italiano 3c3785fd1f [LibCallsShrinkWrap] Remove some temporary cl::opt(s).
The pass has been on and working for a while.

llvm-svn: 301473
2017-04-26 21:19:05 +00:00
Davide Italiano 6abada8ab8 [LibCallsShrinkWrap] Teach the pass how to preserve the dominator.
llvm-svn: 301471
2017-04-26 21:05:40 +00:00
Craig Topper b45eabcf82 [ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.

Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.

I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.

Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.

Differential Revision: https://reviews.llvm.org/D32376

llvm-svn: 301432
2017-04-26 16:39:58 +00:00
Sanjoy Das 2cbeb00f38 Reverts commit r301424, r301425 and r301426
Commits were:

"Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
"Add a new WeakVH value handle; NFC"
"Rename WeakVH to WeakTrackingVH; NFC"

The changes assumed pointers are 8 byte aligned on all architectures.

llvm-svn: 301429
2017-04-26 16:37:05 +00:00
Sanjoy Das 01de557738 Rename WeakVH to WeakTrackingVH; NFC
Summary:
I plan to use WeakVH to mean "nulls itself out on deletion, but does
not track RAUW" in a subsequent commit.

Reviewers: dblaikie, davide

Reviewed By: davide

Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D32266

llvm-svn: 301424
2017-04-26 16:20:52 +00:00
Daniel Berlin 954006fde8 Convert SimplifyInstructions to use the SimplifyQuery version of SimplifyInstruction
llvm-svn: 301406
2017-04-26 13:52:16 +00:00
Matthias Braun c36a78c3f3 SimplifyLibCalls: Fix crash on memset(notmalloc())
rdar://31520787

llvm-svn: 301352
2017-04-25 19:44:25 +00:00
Craig Topper f3dbd17d0a [APInt] Use isSubsetOf, intersects, and bit counting methods to reduce temporary APInts
This patch uses various APInt methods to reduce temporary APInt creation.

This should be all of the unrelated cleanups that got buried in D32376(creating a KnownBits struct) as well as some pointed out by Simon during the review of that. Plus a few improvements to use counting instead of masking.

I've left out any places where we do something like (KnownZero & KnownOne) != 0 as I plan to add a helper method to KnownBits to ask that question and didn't want to thrash that code an additional time.

Differential Revision: https://reviews.llvm.org/D32495

llvm-svn: 301338
2017-04-25 17:46:30 +00:00
Andrew Ng 1606fc0bf9 [SimplifyLibCalls] Fix infinite loop with fast-math optimization.
One of the fast-math optimizations is to replace calls to standard double
functions with their float equivalents, e.g. exp -> expf. However, this can
cause infinite loops for the following:

  float expf(float val) { return (float) exp((double) val); }

A similar inline declaration exists in the MinGW-w64 math.h header file which
when compiled with -O2/3 and fast-math generates infinite loops.

So this fix checks that the calling function to the standard double function
that is being replaced does not match the float equivalent.

Differential Revision: https://reviews.llvm.org/D31806

llvm-svn: 301304
2017-04-25 12:36:14 +00:00
Xinliang David Li f12a0faf88 [CodeExtractor]: Fixup use refs of the old phi.
Differential Revision: http://reviews.llvm.org/D32468

llvm-svn: 301291
2017-04-25 04:51:19 +00:00
Davide Italiano 5b65f12bfa [SimplifyLibCalls] Remove a cl::opt that's been `true` for a long time.
llvm-svn: 301288
2017-04-25 03:48:47 +00:00
Davide Italiano ca81fbcadb [LoopUnroll] Remove spurious newline.
Eli pointed out in the review, but I didn't squash the two commits
correctly. Pointy-hat to me.

llvm-svn: 301241
2017-04-24 20:17:38 +00:00
Davide Italiano 0f62eea7ff [LoopUnroll] Don't try to unroll non canonical loops.
The current Loop Unroll implementation works with loops having a
single latch that contains a conditional branch to a block outside
the loop (the other successor is, by defition of latch, the header).
If this precondition doesn't hold, avoid unrolling the loop as
the code is not ready to handle such circumstances.

Differential Revision:  https://reviews.llvm.org/D32261

llvm-svn: 301239
2017-04-24 20:14:11 +00:00
Mandeep Singh Grang 799a2edb3d [SimplifyCFG] Fix for non-determinism in codegen
Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718

Reviewers: majnemer, chenli, davide

Reviewed By: davide

Subscribers: davide, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D26726

llvm-svn: 301222
2017-04-24 19:20:45 +00:00
Evgeniy Stepanov 58ccc0949a Revert "Compute safety information in a much finer granularity."
Use-after-free in llvm::isGuaranteedToExecute.

llvm-svn: 301214
2017-04-24 18:25:07 +00:00
Adrian Prantl f2c7997013 Use DW_OP_stack_value when reconstructing variable values with arithmetic.
When the location description of a source variable involves arithmetic
on the value itself, it needs to be marked with DW_OP_stack_value since it
is not describing the variable's location, but rather its value.

This is a follow-up to r297971 and fixes the source testcase quoted in
the comment in debuginfo-dce.ll.

rdar://problem/30725338

This reapplies r301093 without modifications.

llvm-svn: 301210
2017-04-24 18:11:42 +00:00
Xin Tong a266923d57 Compute safety information in a much finer granularity.
Summary:
Instead of keeping a variable indicating whether there are early exits
in the loop.  We keep all the early exits. This improves LICM's ability to
move instructions out of the loop based on is-guaranteed-to-execute.

I am going to update compilation time as well soon.

Reviewers: hfinkel, sanjoy, efriedma, mkuper

Reviewed By: hfinkel

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D32433

llvm-svn: 301196
2017-04-24 17:12:22 +00:00
Adrian Prantl 4677205010 Revert "Use DW_OP_stack_value when reconstructing variable values with arithmetic."
This reverts commit r301093 while investigating stage2 bot breakage.

llvm-svn: 301099
2017-04-23 00:44:40 +00:00