Commit Graph

3725 Commits

Author SHA1 Message Date
David Blaikie 795dc94614 Fix UB found by -Wtautological-undefined-compare
llvm-svn: 298279
2017-03-20 18:01:07 +00:00
Dehao Chen e593049fb0 Updates branch_weights annotation for call instructions during inlining.
Summary: Inliner should update the branch_weights annotation to scale it to proper value.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: zzheng, llvm-commits

Differential Revision: https://reviews.llvm.org/D30767

llvm-svn: 298270
2017-03-20 16:40:44 +00:00
Adrian Prantl 6d80a262d5 Use isa<> instead of dyn_cast<> (NFC).
llvm-svn: 298268
2017-03-20 16:39:41 +00:00
Daniel Berlin 12883b1673 Templatize parts of VNCoercion, and add constant-only versions of the functions to be used in NewGVN.
NFCI.

Summary:
This is ground work for the changes to enable coercion in NewGVN.
GVN doesn't care if they end up constant because it eliminates as it goes.
NewGVN cares.

IRBuilder and ConstantFolder deliberately present the same interface,
so we use this to our advantage to templatize our functions to make
them either constant only or not.

Reviewers: davide

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30928

llvm-svn: 298262
2017-03-20 16:08:29 +00:00
Craig Topper b5c2bfa869 [IR] Remove some unneeded includes from Operator.h and fix cpp files that were transitively depending on it. NFC
llvm-svn: 298235
2017-03-20 05:08:41 +00:00
Craig Topper 0f5063c754 [BuildLibCalls] emitPutChar should infer function attributes for putchar
When InstCombine calls into SimplifyLibCalls and it createa putChar calls, we don't infer the attributes. And since SimplifyLibCalls doesn't use InstCombine's IRBuilder the calls doesn't end up in the worklist on this iteration of InstCombine. So it gets picked up on the next iteration where it causes an IR change. This of course causes InstCombine to run another iteration.

So this patch just gets the attributes right the first time. We already did this for puts and some other libcalls.

Differential Revision: https://reviews.llvm.org/D31094

llvm-svn: 298171
2017-03-17 23:48:02 +00:00
Evgeniy Stepanov c5aa6b9411 [asan] Fix dead stripping of globals on Linux.
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.

Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.

This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.

At this moment it does not work on Gold (as in the globals are never
stripped).

Differential Revision: https://reviews.llvm.org/D30121

llvm-svn: 298158
2017-03-17 22:17:29 +00:00
Reid Kleckner 45707d4d5a Remove getArgumentList() in favor of arg_begin(), args(), etc
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.

In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).

Reviewed By: chandlerc

Differential Revision: https://reviews.llvm.org/D31052

llvm-svn: 298010
2017-03-16 22:59:15 +00:00
Adrian Prantl 47ea6478ed Salvage debug info from instructions about to be deleted
[Reapplies r297971 and punting on finding a better API for findDbgValues()]

This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value instrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.

In the example in the testcase (which was extracted from XNU) there is a sequence of

 %4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
 %5 = bitcast %struct.entry* %4 to i8*, !dbg !42
 %add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
 %6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
 call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg 34

When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:

- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic

The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.

rdar://problem/30725338

Differential Revision: https://reviews.llvm.org/D30919

llvm-svn: 297994
2017-03-16 21:14:09 +00:00
Michael Kuperstein 2da2bfa088 [LoopUnroll] Don't peel loops where the latch isn't the exiting block
Peeling assumed this doesn't happen, but didn't check it.
This fixes PR32178.

Differential Revision: https://reviews.llvm.org/D30757

llvm-svn: 297993
2017-03-16 21:07:48 +00:00
Adrian Prantl fa9e84eb6d Revert commit r297971 because of issues reported by msan.
llvm-svn: 297982
2017-03-16 20:11:54 +00:00
Adrian Prantl 4a7781aa38 Fix unused variable warnings.
llvm-svn: 297973
2017-03-16 18:33:01 +00:00
Adrian Prantl 4377314a98 Salvage debug info from instructions about to be deleted
This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value instrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.

In the example in the testcase (which was extracted from XNU) there is a sequence of

  %4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
  %5 = bitcast %struct.entry* %4 to i8*, !dbg !42
  %add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
  %6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
  call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg 34

When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:

- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic

The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.

rdar://problem/30725338

Differential Revision: https://reviews.llvm.org/D30919

llvm-svn: 297971
2017-03-16 18:22:52 +00:00
Aditya Kumar 24f6ad51bb Fix: Refactor SimplifyCFG:canSinkInstructions [NFC]
Differential Revision: https://reviews.llvm.org/D30116

llvm-svn: 297955
2017-03-16 14:09:18 +00:00
Eric Liu 8c7d28b2f1 Revert "Refactor SimplifyCFG:canSinkInstructions [NFC]"
This reverts commit r297839, which breaks Transforms/SimplifyCFG/sink-common-code.ll

llvm-svn: 297845
2017-03-15 15:29:42 +00:00
Aditya Kumar ee55bf3e34 Refactor SimplifyCFG:canSinkInstructions [NFC]
llvm-svn: 297839
2017-03-15 14:26:45 +00:00
Adrian Prantl 140a8569ce API gardening: Rename FindAllocaDbgValue to findDbgValue (NFC)
and use have it use SmallVectorImpl.

There is nothing specific about allocas in this function.

llvm-svn: 297643
2017-03-13 17:20:47 +00:00
Daniel Berlin cd07a0f685 VNCoercion: Make the function signatures all consistent
llvm-svn: 297537
2017-03-11 00:51:01 +00:00
Daniel Berlin 5ac9179f6c Move memory coercion functions from GVN.cpp to VNCoercion.cpp so they can be shared between GVN and NewGVN.
Summary:
These are the functions used to determine when values of loads can be
extracted from stores, etc, and to perform the necessary insertions to
do this.  There are no changes to the functions themselves except
reformatting, and one case where memdep was informed of a removed load
(which was pushed into the caller).

Reviewers: davide

Subscribers: mgorny, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30478

llvm-svn: 297438
2017-03-10 04:54:10 +00:00
Daniel Berlin e3e69e1680 NewGVN: Rewrite DCE during elimination so we do it as well as old GVN did.
llvm-svn: 297428
2017-03-10 00:32:33 +00:00
Adrian Prantl d4056501fb Revert "Strip debug info when inlining into a nodebug function."
This reverts commit r296488.

As noted by David Blaikie on llvm-commits, I overlooked the case of a
debug function being inlined into a nodebug function being inlined
into a debug function.

llvm-svn: 297163
2017-03-07 17:28:57 +00:00
Sanjoy Das 30c3538e2e [LoopUnrolling] Fix loop size check for peeling
Summary:
We should check if loop size allows us to peel at least one iteration
before we do so.

Patch by Max Kazantsev!

Reviewers: sanjoy, mkuper, efriedma

Reviewed By: mkuper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30632

llvm-svn: 297122
2017-03-07 06:03:15 +00:00
Michael Kruse 811de8a619 [BasicBlockUtils] Check for nullptr before updating LoopInfo.
LoopInfo::getLoopFor returns nullptr if a BB is not in a loop and only
then can the loop be updated to contain the newly created BBs. Add the
missing nullptr check to SplitBlockAndInsertIfThen.

Within LLVM, the only user of this function that also passes a LoopInfo
to be updated is InnerLoopVectorizer::predicateInstructions().
As the method's name implies, the BB operataten on will always be within
a loop, but out-of-tree users may also use it differently (here: Polly).

All other uses of LoopInfo::getLoopFor in the file properly check its
return value for nullptr.

llvm-svn: 297016
2017-03-06 15:33:05 +00:00
Craig Topper b9dbd4d596 [SimplifyCFG] Use APInt::operator| instead of APInt::Or. NFC
I'm looking to improve operator| to support rvalue references and may remove APInt::Or.

llvm-svn: 296982
2017-03-05 01:08:19 +00:00
Sanjoy Das 0a4ec554c1 Fix a compiler warning
llvm-svn: 296903
2017-03-03 18:53:09 +00:00
Sanjoy Das 664c925a57 [LoopUnrolling] Peel loops with invariant backedge Phi input
Summary:
If a loop contains a Phi node which has an invariant input from back
edge, it is profitable to peel such loops (rather than unroll them) to
use the advantage that this Phi is always invariant starting from 2nd
iteration. After the 1st iteration is peeled, other optimizations can
potentially simplify calculations with this invariant.

Patch by Max Kazantsev!

Reviewers: sanjoy, apilipenko, igor-laevsky, anna, mkuper, reames

Reviewed By: mkuper

Subscribers: mkuper, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D30161

llvm-svn: 296898
2017-03-03 18:19:15 +00:00
Sanjoy Das eed71b9e1c [LoopUnrolling] Re-prioritize Peeling and Partial unrolling
Summary:
In current implementation the loop peeling happens after trip-count based partial unrolling and may
sometimes not happen at all due to it (for example, if trip count is known, but UP.Partial = false). This
is generally bad, the more than there are some situations where peeling is profitable even if the partial
unrolling is disabled.

This patch is a NFC which reorders peeling and partial unrolling application and prepares the code for
implementation of the said optimizations.

Patch by Max Kazantsev!

Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper

Reviewed By: mkuper

Subscribers: mkuper, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D30243

llvm-svn: 296897
2017-03-03 18:19:10 +00:00
Daniel Berlin dcb004fdf1 Move defClobbersUseOrDef to being a protected member of a class since we don't want anyone else using it
llvm-svn: 296838
2017-03-02 23:06:46 +00:00
Nikolai Bozhenov 4a04fb9e90 [BypassSlowDivision] Use ValueTracking to simplify run-time checks
ValueTracking is used for more thorough analysis of operands. Based on the
analysis, either run-time checks can be simplified (e.g. check only one operand
instead of two) or the transformation can be avoided. For example, it is quite
often the case that a divisor is promoted from a shorter type and run-time
checks for it are redundant.

With additional compile-time analysis of values, two special cases naturally
arise and are addressed by the patch:

 1) Both operands are known to be short enough. Then, the long division can be
    simply replaced with a short one without CFG modification.

 2) If a division is unsigned and the dividend is known to be short then the
    long division is not needed at all. Because if the divisor is too big for
    short division then the quotient is obviously zero (and the remainder is
    equal to the dividend). Actually, the division is not needed when
    (divisor > dividend).

Differential Revision: https://reviews.llvm.org/D29897

llvm-svn: 296832
2017-03-02 22:12:15 +00:00
Nikolai Bozhenov d4b12b3348 [BypassSlowDivision] Refactor fast division insertion logic (NFC)
The most important goal of the patch is to break large insertFastDiv function
into separate pieces, so that later a different fast insertion logic can be
implemented using some of these pieces.

Differential Revision: https://reviews.llvm.org/D29896

llvm-svn: 296828
2017-03-02 22:05:07 +00:00
Evgeny Stupachenko 21bef2cb3c The patch turns on epilogue unroll for loops with constant recurency start.
Summary:

Set unroll remainder to epilog if a loop contains a phi with constant parameter:

  loop:
  pn = phi [Const, PreHeader], [pn.next, Latch]
  ...

Reviewer: hfinkel

Differential Revision: http://reviews.llvm.org/D27004

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 296770
2017-03-02 17:38:46 +00:00
Adrian Prantl 80d0c93436 Strip debug info when inlining into a nodebug function.
The LLVM backend cannot produce any debug info for an llvm::Function
without a DISubprogram attachment. When inlining a debug-info-carrying
function into a nodebug function, there is therefore no reason to keep
any debug info intrinsic calls or debug locations on the instructions.

This fixes a problem discovered in PR32042.

rdar://problem/30679307

llvm-svn: 296488
2017-02-28 16:58:13 +00:00
Hans Wennborg 2d5841fa73 Revert r296366 "[InlineFunction] add nonnull assumptions based on argument attributes"
It causes miscompiles e.g. during self-host of Clang (PR32082).

llvm-svn: 296398
2017-02-27 22:33:02 +00:00
Sanjay Patel 40975e05eb [InlineFunction] add nonnull assumptions based on argument attributes
This was suggested in D27855: have the inliner add assumptions, so we don't 
lose nonnull info provided by argument attributes.

This still doesn't solve PR28430 (dyn_cast), but this gets us closer.

https://reviews.llvm.org/D29999

llvm-svn: 296366
2017-02-27 18:13:48 +00:00
Daniel Berlin fccbda967a PredicateInfo: Support switch statements
Summary:
Depends on D29606 and D29682

Makes us pass GVN's edge.ll (we also will pass a few other testcases
they just need cleaning up).

Thoughts on the Predicate* hiearchy of classes especially welcome :)
(it's not clear to me how best to organize it, and currently, the getBlock* seems ... uglier than maybe wasting a field somewhere or something).

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29747

llvm-svn: 295889
2017-02-22 22:20:58 +00:00
Daniel Berlin 17e8d0eae2 Move updating functions to MemorySSAUpdater.
Add updater to passes that now need it.
Move around code in MemorySSA to expose needed functions.

Summary: Mostly cleanup

Reviewers: george.burgess.iv

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30221

llvm-svn: 295887
2017-02-22 22:19:55 +00:00
Sean Silva 9011aca5f4 Use const-ref in range-loop for to avoid copying pairs of std::string
No reason to create temporaries.

Differential Revision: https://reviews.llvm.org/D29871

Patch by sergio.martins!

llvm-svn: 295807
2017-02-22 06:34:04 +00:00
Xin Tong ebfe01c121 [LoopSimplify] Simplify how we compute UniqueExit
Summary: Simplify how we compute UniqueExit. Reuse ExitBlockSet.

Reviewers: sanjoy, efriedma, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30182

llvm-svn: 295751
2017-02-21 19:10:58 +00:00
Daniel Berlin 78cbd28102 MemorySSA: Add support for renaming uses in the updater.
Summary:
This lets one add aliasing stores to the updater.
(i'm next going to move the creation/etc functions to the updater)

Reviewers: george.burgess.iv

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30154

llvm-svn: 295677
2017-02-20 22:26:03 +00:00
Simon Pilgrim 5798180947 Removed extra ';'
llvm-svn: 295603
2017-02-19 12:32:44 +00:00
Daniel Berlin a4b5c01dd2 Add a DebugCounter for PredicateInfo renaming, and an associated test
llvm-svn: 295594
2017-02-19 04:29:01 +00:00
Simon Pilgrim dba9011942 Fix unused variable warning when assertions are disabled.
llvm-svn: 295587
2017-02-19 00:33:37 +00:00
Daniel Berlin 588e0be39d PredicateInfo: Clean up predicate info a little, using insertion
helpers, and fixing support for the renaming the comparison.

llvm-svn: 295581
2017-02-18 23:06:38 +00:00
Piotr Padlewski cc5868c186 [MemorySSA] NFC small fixes
Summary:
2 small fixes extracted from
https://reviews.llvm.org/D29064

Reviewers: kuhar, davide, dberlin, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30109

llvm-svn: 295566
2017-02-18 20:34:36 +00:00
Sanjoy Das 8b859c26ec [JumpThreading] Re-enable JumpThreading for guards
Summary:
JumpThreading for guards feature has been reverted at https://reviews.llvm.org/rL295200
due to the following problem: the feature used the following algorithm for detection of
diamond patters:

1. Find a block with 2 predecessors;
2. Check that these blocks have a common single parent;
3. Check that the parent's terminator is a branch instruction.

The problem is that these checks are insufficient. They may pass for a non-diamond
construction in case if those two predecessors are actually the same block. This may
happen if parent's terminator is a br (either conditional or unconditional) to a block
that ends with "switch" instruction with exactly two branches going to one block.

This patch re-enables the JumpThreading for guards and fixes this issue by adding the
check that those found predecessors are actually different blocks. This guarantees that
parent's terminator is a conditional branch with exactly 2 different successors, which
is now ensured by assertions. It also adds two more tests for this situation (with parent's
terminator being a conditional and an unconditional branch).

Patch by Max Kazantsev!

Reviewers: anna, sanjoy, reames

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30036

llvm-svn: 295410
2017-02-17 04:21:14 +00:00
Anna Thomas 94c8d4976c Revert "[JumpThreading] Thread through guards"
This reverts commit r294617.

We fail on an assert while trying to get a condition from an
unconditional branch.

llvm-svn: 295200
2017-02-15 17:08:29 +00:00
Sanjay Patel 288f075f8e [InlineFunction] use getFunction(); NFC
llvm-svn: 295185
2017-02-15 15:22:18 +00:00
Sanjay Patel 32d753cae3 [InlineFunction] use getCaller(); NFCI
llvm-svn: 295181
2017-02-15 15:08:38 +00:00
Sanjay Patel ada717e25b [InlineFunction] use range-for loop; NFCI
llvm-svn: 295179
2017-02-15 14:56:11 +00:00
Peter Collingbourne 0609acc10d SimplifyCFG: Register cloned assume intrinsics with assumption cache when creating critical edge.
Differential Revision: https://reviews.llvm.org/D29976

llvm-svn: 295145
2017-02-15 03:01:11 +00:00
Easwaran Raman 5a12f236c6 Fix a bug in caller's BFI update code after inlining.
Multiple blocks in the callee can be mapped to a single cloned block
since we prune the callee as we clone it. The existing code
iterates over the value map and clones the block frequency (and
eventually scales the frequencies of the cloned blocks). Value map's
iteration is not deterministic and so the cloned block might get the
frequency of any of the original blocks. The fix is to set the max of
the original frequencies to the cloned block. The first block in the
sequence must have this max frequency and, in the call context,
subsequent blocks must have its frequency.

Differential Revision: https://reviews.llvm.org/D29696

llvm-svn: 295115
2017-02-14 22:49:28 +00:00
Taewook Oh 2e945ebb13 [BasicBlockUtils] Use getFirstNonPHIOrDbg to set debugloc for instructions created in SplitBlockPredecessors
Summary:
When setting debugloc for instructions created in SplitBlockPredecessors, current implementation copies debugloc from the first-non-phi instruction of the original basic block. However, if the first-non-phi instruction is a call for @llvm.dbg.value, the debugloc of the instruction may point the location outside of the block itself. For the example code of

```
  1 typedef struct _node_t {
  2   struct _node_t *next;
  3 } node_t;
  4
  5 extern node_t *root;
  6
  7 int foo() {
  8   node_t *node, *tmp;
  9   int ret = 0;
 10
 11   node = tmp = root->next;
 12   while (node != root) {
 13     while (node) {
 14       tmp = node;
 15       node = node->next;
 16       ret++;
 17     }
 18   }
 19
 20   return ret;
 21 }
```

, below is the basicblock corresponding to line 12 after Reassociate expressions pass:

```
while.cond:                                       ; preds = %while.cond2, %entry
  %node.0 = phi %struct._node_t* [ %1, %entry ], [ null, %while.cond2 ]
  %ret.0 = phi i32 [ 0, %entry ], [ %ret.1, %while.cond2 ]
  tail call void @llvm.dbg.value(metadata i32 %ret.0, i64 0, metadata !19, metadata !20), !dbg !21
  tail call void @llvm.dbg.value(metadata %struct._node_t* %node.0, i64 0, metadata !11, metadata !20), !dbg !31
  %cmp = icmp eq %struct._node_t* %node.0, %0, !dbg !33
  br i1 %cmp, label %while.end5, label %while.cond2, !dbg !35
```

As you can see, the first-non-phi instruction is a call for @llvm.dbg.value, and the debugloc is

```
!21 = !DILocation(line: 9, column: 7, scope: !6)
```

, which is a definition of 'ret' variable and outside of the scope of the basicblock itself. However, current implementation picks up this debugloc for the instructions created in SplitBlockPredecessors. This patch addresses this problem by picking up debugloc from the first-non-phi-non-dbg instruction.

Reviewers: dblaikie, samsonov, eugenis

Reviewed By: eugenis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29867

llvm-svn: 295106
2017-02-14 21:10:40 +00:00
Daniel Berlin dbe8264c93 PredicateInfo: Handle critical edges
Summary:
This adds support for placing predicateinfo such that it affects critical edges.

This fixes the issues mentioned by Nuno on the mailing list.

Depends on D29519

Reviewers: davide, nlopes

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29606

llvm-svn: 294921
2017-02-12 22:12:20 +00:00
Dehao Chen fb02f7140a Encode duplication factor from loop vectorization and loop unrolling to discriminator.
Summary:
This patch starts the implementation as discuss in the following RFC: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html

When optimization duplicates code that will scale down the execution count of a basic block, we will record the duplication factor as part of discriminator so that the offline process tool can find the duplication factor and collect the accurate execution frequency of the corresponding source code. Two important optimization that fall into this category is loop vectorization and loop unroll. This patch records the duplication factor for these 2 optimizations.

The recording will be guarded by a flag encode-duplication-in-discriminators, which is off by default.

Reviewers: probinson, aprantl, davidxl, hfinkel, echristo

Reviewed By: hfinkel

Subscribers: mehdi_amini, anemet, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26420

llvm-svn: 294782
2017-02-10 21:09:07 +00:00
Sanjoy Das 74bda4d591 [JumpThreading] Thread through guards
Summary:
This patch allows JumpThreading also thread through guards.
Virtually, guard(cond) is equivalent to the following construction:

  if (cond) { do something } else {deoptimize}

Yet it is not explicitly converted into IFs before lowering.
This patch enables early threading through guards in simple cases.
Currently it covers the following situation:

  if (cond1) {
    // code A
  } else {
    // code B
  }
  // code C
  guard(cond2)
  // code D

If there is implication cond1 => cond2 or !cond1 => cond2, we can transform
this construction into the following:

  if (cond1) {
    // code A
    // code C
  } else {
    // code B
    // code C
    guard(cond2)
  }
  // code D

Thus, removing the guard from one of execution branches.

Patch by Max Kazantsev!

Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29620

llvm-svn: 294617
2017-02-09 19:40:22 +00:00
Matt Arsenault 560665250f NVPTX: Extract mem intrinsic expansions into utilities
llvm-svn: 294490
2017-02-08 17:49:52 +00:00
Daniel Berlin c763fd1ab4 PredicateInfo: Some compilers are unhappy with naming Use *'s Use. Change the name.
llvm-svn: 294364
2017-02-07 22:11:43 +00:00
Daniel Berlin 439042b7ad Add PredicateInfo utility and printing pass
Summary:
This patch adds a utility to build extended SSA (see "ABCD: eliminating
array bounds checks on demand"), and an intrinsic to support it. This
is then used to get functionality equivalent to propagateEquality in
GVN, in NewGVN (without having to replace instructions as we go). It
would work similarly in SCCP or other passes. This has been talked
about a few times, so i built a real implementation and tried to
productionize it.

Copies are inserted for operands used in assumes and conditional
branches that are based on comparisons (see below for more)

Every use affected by the predicate is renamed to the appropriate
intrinsic result.

E.g.
%cmp = icmp eq i32 %x, 50
br i1 %cmp, label %true, label %false
true:
ret i32 %x
false:
ret i32 1

will become

%cmp = icmp eq i32, %x, 50
br i1 %cmp, label %true, label %false
true:
; Has predicate info
; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 }
%x.0 = call @llvm.ssa_copy.i32(i32 %x)
ret i32 %x.0
false:
ret i23 1

(you can use -print-predicateinfo to get an annotated-with-predicateinfo dump)

This enables us to easily determine what operations are affected by a
given predicate, and how operations affected by a chain of
predicates.

Reviewers: davide, sanjoy

Subscribers: mgorny, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D29519

Update for review comments

Fix a bug Nuno noticed where we are giving information about and/or on edges where the info is not useful and easy to use wrong

Update for review comments

llvm-svn: 294351
2017-02-07 21:10:46 +00:00
Sanjay Patel 54656ca7db [ValueTracking] emit a remark when we detect a conflicting assumption (PR31809)
This is a follow-up to D29395 where we try to be good citizens and let the user know that
we've probably gone off the rails.

This should allow us to resolve:
https://llvm.org/bugs/show_bug.cgi?id=31809

Differential Revision: https://reviews.llvm.org/D29404

llvm-svn: 294208
2017-02-06 18:26:06 +00:00
Anna Thomas b555cc8cb6 NFC: [LoopUnroll] More meaningful message in tracing
llvm-svn: 294017
2017-02-03 17:12:43 +00:00
Peter Collingbourne 6d8f817f8b FunctionImport: Use IRMover directly.
The importer was previously using ModuleLinker in a sort of "IRMover mode". Use
IRMover directly instead in order to remove a level of indirection.

I will remove all importing support from ModuleLinker in a separate
change.

Differential Revision: https://reviews.llvm.org/D29468

llvm-svn: 294014
2017-02-03 16:56:27 +00:00
Michael Kuperstein 3c6b3ba258 Shut up another GCC warning about operator precedence. NFC.
llvm-svn: 293812
2017-02-01 21:06:33 +00:00
Florian Hahn a35b8a4852 [LoopUnroll] Use addClonedBlockToLoopInfo to add loop header to LI (NFC).
Summary:
I have a similar patch up for review already (D29173). If you prefer I
can squash them both together.

Also I think there more potential for code sharing between
LoopUnroll.cpp and LoopUnrollRuntime.cpp. Do you think patches for
that would be worthwhile? 

Reviewers: mkuper, mzolotukhin

Reviewed By: mkuper, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29311

llvm-svn: 293758
2017-02-01 10:39:35 +00:00
Florian Hahn 5364cf3b56 [LoopUnroll] Use addClonedBlockToLoopInfo to clone the top level loop (NFC)
Summary:
rL293124 added the necessary infrastructure to properly add the cloned
top level loop to LoopInfo, which means we do not have to do it manually
in CloneLoopBlocks.

@mkuper sorry for not pointing this out during my review of D29156, I just
realized that today.


Reviewers: mzolotukhin, chandlerc, mkuper

Reviewed By: mkuper

Subscribers: llvm-commits, mkuper

Differential Revision: https://reviews.llvm.org/D29173

llvm-svn: 293615
2017-01-31 11:13:44 +00:00
Daniel Berlin 9d8a335ce0 Revert "[MemorySSA] Revert r293361 and r293363, as the tests fail under asan."
This reverts commit r293471, reapplying r293361 and r293363 with a fix
for an out-of-bounds read.

llvm-svn: 293474
2017-01-30 11:35:39 +00:00
Sam McCall b9d6c10c2d [MemorySSA] Revert r293361 and r293363, as the tests fail under asan.
llvm-svn: 293471
2017-01-30 09:19:50 +00:00
Davide Italiano 6c77de0367 [MemorySSA] Correct an assertion surrounding with parentheses.
llvm-svn: 293453
2017-01-30 03:16:43 +00:00
Taewook Oh 505a25aec5 [InstCombine] Merge DebugLoc when speculatively hoisting store instruction
Summary: Along with https://reviews.llvm.org/D27804, debug locations need to be merged when hoisting store instructions as well. Not sure if just dropping debug locations would make more sense for this case, but as the branch instruction will have at least different discriminator with the hoisted store instruction, I think there will be no difference in practice.

Reviewers: aprantl, andreadb, danielcdh

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29062

llvm-svn: 293372
2017-01-28 07:05:43 +00:00
Daniel Berlin ee6e3a598a MemorySSA: Allow movement to arbitrary places
Summary: Extend the MemorySSAUpdater API to allow movement to arbitrary places

Reviewers: davide, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29239

llvm-svn: 293363
2017-01-28 02:26:39 +00:00
Daniel Berlin 2f1ab4ba79 MemorySSA: Fix block numbering invalidation and replacement bugs discovered by updater
llvm-svn: 293361
2017-01-28 02:22:52 +00:00
Matthias Braun 8c209aa877 Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

llvm-svn: 293359
2017-01-28 02:02:38 +00:00
Daniel Berlin ae6b8b6933 MemorySSA: Move updater to its own file
llvm-svn: 293357
2017-01-28 01:35:02 +00:00
Daniel Berlin 60ead05f80 Introduce a basic MemorySSA updater, that supports insertDef,
insertUse, moveBefore and moveAfter operations.

Summary:
This creates a basic MemorySSA updater that handles arbitrary
insertion of uses and defs into MemorySSA, as well as arbitrary
movement around the CFG. It replaces the current splice API.

It can be made to handle arbitrary control flow changes.
Currently, it uses the same updater algorithm from D28934.

The main difference is because MemorySSA is single variable, we have
the complete def and use list, and don't need anyone to give it to us
as part of the API.  We also have to rename stores below us in some
cases.

If we go that direction in that patch, i will merge all the updater
implementations (using an updater_traits or something to provide the
get* functions we use, called read*/write* in that patch).

Sadly, the current SSAUpdater algorithm is way too slow to use for
what we are doing here.

I have updated the tests we have to basically build memoryssa
incrementally using the updater api, and make sure it still comes out
the same.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29047

llvm-svn: 293356
2017-01-28 01:23:13 +00:00
Anna Thomas e7d865e34e NFC: Add debug tracing for more cases where loop unrolling fails.
llvm-svn: 293313
2017-01-27 17:57:05 +00:00
Justin Lebar cb9b41dd76 [LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.
Summary:
Some frontends emit a speculate-and-select idiom for sqrt, wherein they compute
sqrt(x), check if x is negative, and select NaN if it is:

  %cmp = fcmp olt double %a, -0.000000e+00
  %sqrt = call double @llvm.sqrt.f64(double %a)
  %ret = select i1 %cmp, double 0x7FF8000000000000, double %sqrt

This is technically UB as the LangRef is written today if %a is ever less than
-0.  But emitting code that's compliant with the current definition of sqrt
would require a branch, which would then prevent us from matching this idiom in
SelectionDAG (which we do today -- ISD::FSQRT has defined behavior on negative
inputs), because SelectionDAG looks at one BB at a time.

Nothing in LLVM takes advantage of this undefined behavior, as far as we can
tell, and the fact that llvm.sqrt has UB dates from its initial addition to the
LangRef.

Reviewers: arsenm, mehdi_amini, hfinkel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D28797

llvm-svn: 293242
2017-01-27 00:58:03 +00:00
Michael Kuperstein 5dd55e8405 [LoopUnroll] Properly update loopinfo for runtime unrolling by 2
Even when we don't create a remainder loop (that is, when we unroll by 2), we
may duplicate nested loops into the remainder. This is complicated by the fact
the remainder may itself be either inserted into an outer loop, or at the top
level. In the latter case, we may need to create new top-level loops.

Differential Revision: https://reviews.llvm.org/D29156

llvm-svn: 293124
2017-01-26 01:04:11 +00:00
Daniel Berlin d602e04c9e MemorySSA: Link all defs together into an intrusive defslist, to make updater easier
Summary:
This is the first in a series of patches to add a simple, generalized updater to MemorySSA.

For MemorySSA, every def is may-def, instead of the normal must-def.
(the best way to think of memoryssa is "everything is really one variable, with different versions of that variable at different points in the program).
This means when updating, we end up having to do a bunch of work to touch defs below and above us.

In order to support this quickly, i have ilist'd all the defs for each block.  ilist supports tags, so this is quite easy. the only slightly messy part is that you can't have two iplists for the same type that differ only whether they have the ownership part enabled or not, because the traits are for the value type.

The verifiers have been updated to test that the def order is correct.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29046

llvm-svn: 293085
2017-01-25 20:56:19 +00:00
Akira Hatanaka 4ec7b20ef6 [SimplifyCFG] Do not sink and merge inline-asm instructions.
Conservatively disable sinking and merging inline-asm instructions as doing so
can potentially create arguments that cannot satisfy the inline-asm constraints.

For example, SimplifyCFG used to do the following transformation:

(before)
if.then:
  %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 8)
  br label %if.end
if.else:
  %1 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 6)
  br label %if.end

(after)
  %.sink = select i1 %tobool, i32 6, i32 8
  %0 = call i32 asm "rorl $2, $0", "=&r,0,n"(i32 %r6, i32 %.sink)

This would result in a crash in the backend since only immediate integer operands
are permitted for constraint "n".

rdar://problem/30110806

Differential Revision: https://reviews.llvm.org/D29111

llvm-svn: 293025
2017-01-25 06:21:51 +00:00
Chandler Carruth 6acdca78a0 [PH] Replace uses of AssertingVH from members of analysis results with
a lazy-asserting PoisoningVH.

AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.

This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.

The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.

The rest is straight cleanup.

I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.

Differential Revision: https://reviews.llvm.org/D29006

llvm-svn: 292928
2017-01-24 12:55:57 +00:00
Serge Pavlov 098ee2fe02 Update domtree incrementally in loop peeling.
With this change dominator tree remains in sync after each step of loop
peeling.

Differential Revision: https://reviews.llvm.org/D29029

llvm-svn: 292895
2017-01-24 06:58:39 +00:00
Matt Arsenault 954a624fb9 SimplifyLibCalls: Replace more unary libcalls with intrinsics
llvm-svn: 292855
2017-01-23 23:55:08 +00:00
Michael Kuperstein 461aa57ad3 [LoopUnroll] First form LCSSA, then loop-simplify
Running non-LCSSA-preserving LoopSimplify followed by LCSSA on (roughly) the
same loop is incorrect, since LoopSimplify may break LCSSA arbitrarily higher
in the loop nest. Instead, run LCSSA first, and then run LCSSA-preserving
LoopSimplify on the result.

This fixes PR31718.

Differential Revision: https://reviews.llvm.org/D29055

llvm-svn: 292854
2017-01-23 23:45:42 +00:00
David L. Jones d21529fa0d [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).

Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.

The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)

There are additional changes required in clang.

Reviewers: rsmith

Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28476

llvm-svn: 292848
2017-01-23 23:16:46 +00:00
Amaury Sechet 2fec7e4f44 Tweak ASCII art in Simplify CFG. NFC
llvm-svn: 292792
2017-01-23 15:13:01 +00:00
Chandler Carruth 7fd29cef42 [PM] Sink an LCSSA preservation assert from the LoopSimplify pass into
the library routine shared with the new PM and other code.

This assert checks that when LCSSA preservation is requested we start in
LCSSA form. Without this early assert, given *very* complex test cases
we can hit an assert or crash much later on when trying to preserve
LCSSA.

The new PM's loop simplify doesn't need to (and indeed can't) preserve
LCSSA as the new PM doesn't deal in transforms in the dependency graph.
But we asked the library to and shockingly, this didn't work very well!
Stop doing that. Now the assert will tell us immediately with existing
test cases. Before this, it took a pretty convoluted input to trigger
this.

However, sinking the assert also found a bug in LoopUnroll where we
asked simplifyLoop to preserve LCSSA *right before we reform it*. That's
kinda silly and unsurprising that it wasn't available. =D Stop doing
that too.

We also would assert that the unrolled loop was in LCSSA even if
preserving LCSSA was never requested! I don't have a test case or
anything here. I spotted it by inspection and it seems quite obvious. No
logic change anyways, that's just avoiding a spurrious assert.

llvm-svn: 292710
2017-01-21 04:16:53 +00:00
Easwaran Raman 12585b0148 Improve PGO support for the new inliner
This adds the following to the new PM based inliner in PGO mode:

* Use block frequency analysis to derive callsite's profile count and use
that to adjust thresholds of hot and cold callsites.

* Incrementally update the BFI of the caller after a callee gets inlined
into it. This incremental update is only within an invocation of the run
method - BFI is not preserved across calls to run.
Update the function entry count of the callee after inlining it into a
caller.

* I've tuned the thresholds for the hot and cold callsites using a hacked
up version of the old inliner that explicitly computes BFI on a set of
internal benchmarks and spec. Once the new PM based pipeline stabilizes
(IIRC Chandler mentioned there are known issues) I'll benchmark this
again and adjust the thresholds if required.
Inliner PGO support.

Differential revision: https://reviews.llvm.org/D28331

llvm-svn: 292666
2017-01-20 22:44:04 +00:00
Eli Friedman 0a2174533e Preserve domtree and loop-simplify for runtime unrolling.
Mostly straightforward changes; we just didn't do the computation before.
One sort of interesting change in LoopUnroll.cpp: we weren't handling
dominance for children of the loop latch correctly, but
foldBlockIntoPredecessor hid the problem for complete unrolling.

Currently punting on loop peeling; made some minor changes to isolate
that problem to LoopUnrollPeel.cpp.

Adds a flag -unroll-verify-domtree; it verifies the domtree immediately
after we finish updating it. This is on by default for +Asserts builds.

Differential Revision: https://reviews.llvm.org/D28073

llvm-svn: 292447
2017-01-18 23:26:37 +00:00
Peter Collingbourne 10e3b12c7a Cloning: Copy comdats when cloning globals.
Differential Revision: https://reviews.llvm.org/D28838

llvm-svn: 292430
2017-01-18 20:02:31 +00:00
Michael Kuperstein 0de990da16 Fix up a comment. NFC.
llvm-svn: 292425
2017-01-18 19:05:48 +00:00
Michael Kuperstein 7cefb409b0 [LV] Allow reductions that have several uses outside the loop
We currently check whether a reduction has a single outside user. We don't
really need to require that - we just need to make sure a single value is
used externally. The number of external users of that value shouldn't actually
matter.

Differential Revision: https://reviews.llvm.org/D28830

llvm-svn: 292424
2017-01-18 19:02:52 +00:00
Eugene Zelenko 34c23279c2 [Target, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292320
2017-01-18 00:57:48 +00:00
Matt Arsenault b948b4d8df SimplifyLibCalls: Remove checks for fabs
Use the intrinsic instead of emitting the libcall which
will be replaced by the intrinsic.

llvm-svn: 292176
2017-01-17 00:30:31 +00:00
Matt Arsenault 7233344c28 SimplifyLibCalls: Replace fabs libcalls with intrinsics
Add missing fabs(fpext) optimzation that worked with the call,
and also fixes it creating a second fpext when there were multiple
uses.

llvm-svn: 292172
2017-01-17 00:10:40 +00:00
Chandler Carruth ca68a3ec47 [PM] Introduce an analysis set used to preserve all analyses over
a function's CFG when that CFG is unchanged.

This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.

I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.

Differential Revision: https://reviews.llvm.org/D28627

llvm-svn: 292054
2017-01-15 06:32:49 +00:00
Eugene Zelenko 5fa43960f3 [Transforms/Utils] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 291983
2017-01-14 00:32:38 +00:00
David L. Jones 41cecba8e9 "Use" lambda captures which are otherwise only used in asserts. NFC
Summary:
The LLVM coding standards recommend "using" values that are only
needed by asserts:
http://llvm.org/docs/CodingStandards.html#assert-liberally

Without this change, LLVM cannot bootstrap with -Werror as the second
stage fails with this new warning:
https://reviews.llvm.org/rL291905

See also the previous fixes:
https://reviews.llvm.org/rL291916
https://reviews.llvm.org/rL291939
https://reviews.llvm.org/rL291940
https://reviews.llvm.org/rL291941

Reviewers: rsmith

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28695

llvm-svn: 291957
2017-01-13 21:02:41 +00:00
Robert Lougher b0124c1eb8 [DebugInfo] Remove redundant check in SimplifyCFG; NFC.
llvm-svn: 291813
2017-01-12 21:11:09 +00:00
Robert Lougher 6717a6fe54 [DebugInfo] DILocation variable declaration should be const; NFC.
llvm-svn: 291787
2017-01-12 18:33:49 +00:00
Florian Hahn 4f9d6d56c0 [loop-unroll] Properly populate LoopInfo for loops cloned in LoopUnrollRuntime.
Summary:
This fixes Transforms/LoopUnroll/runtime-loop3.ll which failed with
EXTENSIVE_DEBUG, because the cloned basic blocks were not added to the
correct sub-loops in LoopUnrollRuntime.cpp.


Reviewers: dexonsmith, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28482

llvm-svn: 291619
2017-01-10 23:43:35 +00:00
Florian Hahn fdea2e420c [loop-unroll] Factor out code to update LoopInfo (NFC).
Move the code to update LoopInfo for cloned basic blocks to
addClonedBlockToLoopInfo, as suggested in 
https://reviews.llvm.org/D28482.

llvm-svn: 291614
2017-01-10 23:24:54 +00:00
Michael Kuperstein ee31cbe35f [LV] Don't panic when encountering the IV of an outer loop.
Bail out instead of asserting when we encounter this situation,
which can actually happen.

The reason the test uses the new PM is that the "bad" phi, incidentally, gets
cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve
loop simplify form, so the cleanup has no chance to run.

This fixes PR31190.
We may want to solve this in a less conservative manner, since this phi is
actually uniform within the inner loop (or we may want LICM to output a cleaner
promotion to begin with).

Differential Revision: https://reviews.llvm.org/D28490

llvm-svn: 291589
2017-01-10 19:32:30 +00:00
Davide Italiano f8711f093e [SimplifyLibCalls] Propagate fast math flags while optimizing pow().
llvm-svn: 291577
2017-01-10 18:02:05 +00:00
Davide Italiano 472684eaf5 [SimplifyLibCalls] pow(x, -0.5) -> 1.0 / sqrt(x).
Differential Revision:  https://reviews.llvm.org/D28479

llvm-svn: 291486
2017-01-09 21:55:23 +00:00
Matt Arsenault a7d2194168 SimplifyLibCalls: Remove incorrect optimization of fabs
fabs(x * x) is not generally safe to assume x is positive if x is a NaN.
This is also less general than it could be, so this will be replaced
with a transformation on the intrinsic.

llvm-svn: 291359
2017-01-07 19:55:12 +00:00
Teresa Johnson 9006d52651 [ThinLTO] Handle conflicting local names gracefully
Summary:
r285871 introduced an assert that was overly aggressive in the case
of a same-named local in different same-named files (in different
directories), where the source name and therefore the GUID ended up
the same because the files were compiled in their own directory without
any leading path. Change the handling in the promotion logic to get
the summary for the version in that module.

This also exposed an issue where we are not always importing the
right copy, which is a performance not correctness issue (because
the renaming is based on the module hash which must be different,
see the bug report for details). I will fix that as a follow-on.

Fixes PR31561.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28411

llvm-svn: 291304
2017-01-06 23:38:41 +00:00
Teresa Johnson 2b60384581 [ThinLTO] Add parenthesis as per build warning
Fixes a warning about "||" and "&&" due to r291108.

llvm-svn: 291119
2017-01-05 15:10:10 +00:00
Teresa Johnson 519465b993 [ThinLTO] Subsume all importing checks into a single flag
Summary:
This adds a new summary flag NotEligibleToImport that subsumes
several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal
and IsNotViableToInline). It also subsumes the checking of references
on the summary that was being done during the thin link by
eligibleForImport() for each candidate. It is much more efficient to
do that checking once during the per-module summary build and record
it in the summary.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28169

llvm-svn: 291108
2017-01-05 14:32:16 +00:00
Robert Lougher 5bf0416f45 Reapply "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
This reapplies r289828 (reverted in r289833 as it broke the address sanitizer).  The
debugloc is now only set when the instruction is not a call, as this causes the
verifier to assert (the inliner requires an inlinable callsite to have a debug loc
if the caller and callee have debug info).

Original commit message:

Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Original review: https://reviews.llvm.org/D27590

llvm-svn: 290973
2017-01-04 17:40:32 +00:00
Xin Tong 2940231ff0 Make sure total loop body weight is preserved in loop peeling
Summary:
Regardless how the loop body weight is distributed, we should preserve
total loop body weight. i.e. we should have same weight reaching the body of the loop
or its duplicates in peeled and unpeeled case.

Reviewers: mkuper, davidxl, anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28179

llvm-svn: 290833
2017-01-02 20:27:23 +00:00
Sanjay Patel 65d533ca42 fix typo; NFC
llvm-svn: 290827
2017-01-02 19:05:11 +00:00
Sanjay Patel aea60846c4 [Inliner] remove unnecessary null checks from AddAlignmentAssumptions(); NFCI
We bail out on the 1st line if the assumption cache is not set, so there's
no need to check it after that.

llvm-svn: 290787
2016-12-31 17:54:05 +00:00
Philip Reames fac031a178 Add a comment for a todo in LoopUnroll post cleanup
llvm-svn: 290769
2016-12-30 22:10:19 +00:00
Chandler Carruth 0ee8bb11c3 [PM] Move the collection of call sites to a more appropriate place
inside of `InlineFunction`. Prior to this, call instructions are
specifically being rewritten and replaced within the inlined region,
invalidating some of the call sites.

Several of these regions are using the same technique to walk the
inlined region so this seems clearly safe up to this point.

I've also added a short circuit to the scan for call sites based on what
other code is doing.

With this, the most common crash I've found in the new inliner code is
fixed. I've turned it on for another test case that covers this
scenario.

I'll make my way through most of the other inliner test cases
just to get some easy coverage next.

llvm-svn: 290562
2016-12-27 01:24:50 +00:00
Chandler Carruth 6e9bb7e064 [PM] Teach the always inliner in the new pass manager to support
removing fully-dead comdats without removing dead entries in comdats
with live members.

This factors the core logic out of the current inliner's internals to
a reusable utility and leverages that in both places. The factored out
code should also be (minorly) more efficient in cases where we have very
few dead functions or dead comdats to consider.

I've added a test case to cover this behavior of the always inliner.
This is the last significant bug in the new PM's always inliner I've
found (so far).

llvm-svn: 290557
2016-12-26 23:43:27 +00:00
Bryant Wong 4213d94142 [MemorySSA] Define a restricted upward AccessList splice.
Differential Revision: https://reviews.llvm.org/D26661

llvm-svn: 290527
2016-12-25 23:34:07 +00:00
Adrian Prantl 49797ca6be Refactor the DIExpression fragment query interface (NFC)
... so it becomes available to DIExpressionCursor.

llvm-svn: 290322
2016-12-22 05:27:12 +00:00
Haicheng Wu b29dd0107c [LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC.
Make it clear that TripCount is the upper bound of the iteration on which
control exits LatchBlock.

Differential Revision: https://reviews.llvm.org/D26675

llvm-svn: 290199
2016-12-20 20:23:48 +00:00
Chandler Carruth 1d96311447 [PM] Provide an initial, minimal port of the inliner to the new pass manager.
This doesn't implement *every* feature of the existing inliner, but
tries to implement the most important ones for building a functional
optimization pipeline and beginning to sort out bugs, regressions, and
other problems.

Notable, but intentional omissions:
- No alloca merging support. Why? Because it isn't clear we want to do
  this at all. Active discussion and investigation is going on to remove
  it, so for simplicity I omitted it.
- No support for trying to iterate on "internally" devirtualized calls.
  Why? Because it adds what I suspect is inappropriate coupling for
  little or no benefit. We will have an outer iteration system that
  tracks devirtualization including that from function passes and
  iterates already. We should improve that rather than approximate it
  here.
- Optimization remarks. Why? Purely to make the patch smaller, no other
  reason at all.

The last one I'll probably work on almost immediately. But I wanted to
skip it in the initial patch to try to focus the change as much as
possible as there is already a lot of code moving around and both of
these *could* be skipped without really disrupting the core logic.

A summary of the different things happening here:

1) Adding the usual new PM class and rigging.

2) Fixing minor underlying assumptions in the inline cost analysis or
   inline logic that don't generally hold in the new PM world.

3) Adding the core pass logic which is in essence a loop over the calls
   in the nodes in the call graph. This is a bit duplicated from the old
   inliner, but only a handful of lines could realistically be shared.
   (I tried at first, and it really didn't help anything.) All told,
   this is only about 100 lines of code, and most of that is the
   mechanics of wiring up analyses from the new PM world.

4) Updating the LazyCallGraph (in the new PM) based on the *newly
   inlined* calls and references. This is very minimal because we cannot
   form cycles.

5) When inlining removes the last use of a function, eagerly nuking the
   body of the function so that any "one use remaining" inline cost
   heuristics are immediately refined, and queuing these functions to be
   completely deleted once inlining is complete and the call graph
   updated to reflect that they have become dead.

6) After all the inlining for a particular function, updating the
   LazyCallGraph and the CGSCC pass manager to reflect the
   function-local simplifications that are done immediately and
   internally by the inline utilties. These are the exact same
   fundamental set of CG updates done by arbitrary function passes.

7) Adding a bunch of test cases to specifically target CGSCC and other
   subtle aspects in the new PM world.

Many thanks to the careful review from Easwaran and Sanjoy and others!

Differential Revision: https://reviews.llvm.org/D24226

llvm-svn: 290161
2016-12-20 03:15:32 +00:00
Florian Hahn 2e03213f90 [LoopVersioning] Require loop-simplify form for loop versioning.
Summary:
Requiring loop-simplify form for loop versioning ensures that the
runtime check block always dominates the exit block.
    
This patch closes #30958 (https://llvm.org/bugs/show_bug.cgi?id=30958).

Reviewers: silviu.baranga, hfinkel, anemet, ashutosh.nema

Subscribers: ashutosh.nema, mzolotukhin, efriedma, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D27469

llvm-svn: 290116
2016-12-19 17:13:37 +00:00
Daniel Jasper aec2fa352f Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Michael Kuperstein 3ca147ea3d Preserve loop metadata when folding branches to a common destination.
Differential Revision: https://reviews.llvm.org/D27830

llvm-svn: 289992
2016-12-16 21:23:59 +00:00
Davide Italiano f024a56cb8 [SimplifyLibCalls] Use a lambda. NFCI.
llvm-svn: 289911
2016-12-16 02:28:38 +00:00
Davide Italiano 85ad36b0e0 [SimplifyLibCalls] Lower fls() to llvm.ctlz().
Differential Revision:  https://reviews.llvm.org/D14590

llvm-svn: 289894
2016-12-15 23:45:11 +00:00
Davide Italiano 890e850348 [SimplifyLibCalls] Remove redundant folding logic for ffs().
Lowering to llvm.cttz() will result in constant folding anyway
if the argument to ffs is a constant. Pointed out by Eli for
fls() in D14590.

llvm-svn: 289888
2016-12-15 23:11:00 +00:00
Andrea Di Biagio f20c57eca9 [SimplifyCFG] Merge debug locations when hoisting an instruction from a then/else branch. NFC.
Now that a new API to merge debug locations has been committed at r289661 (see
review D26256 for more details), we can use it to "improve" the code added by
revision r280995.

Instead of nulling the debugloc of a commoned instruction, we use the 'merged'
debug location. At the moment, this is just a no functional change since
function `DILocation::getMergedLocation()` is just a stub and would always
return a null location.

Differential Revision: https://reviews.llvm.org/D27804

llvm-svn: 289862
2016-12-15 20:01:26 +00:00
Robert Lougher 6ea759a83e Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"
Reverting as it is causing buildbot failures (address sanitizer).

llvm-svn: 289833
2016-12-15 16:59:13 +00:00
Robert Lougher cf17674211 [SimplifyCFG] In sinkLastInstruction correctly set debugloc of "common" inst
Simplify CFG will try to sink the last instruction in a series of basic blocks,
creating a "common" instruction in the successor block (sinkLastInstruction).
When it does this, the debug location of the single instruction should be the
merged debug locations of the commoned instructions.

Differential Revision: https://reviews.llvm.org/D27590

llvm-svn: 289828
2016-12-15 16:17:53 +00:00
Hal Finkel 3ca4a6bcf1 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Stephan Bergmann 17c7f70362 Replace APFloatBase static fltSemantics data members with getter functions
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.

Differential Revision: https://reviews.llvm.org/D26671

llvm-svn: 289647
2016-12-14 11:57:17 +00:00
Andrea Di Biagio eff22832c0 [InlineFunction] Refactor code in function `fixupLineNumbers' as suggested by David in D27462. NFC
llvm-svn: 288901
2016-12-07 12:01:45 +00:00
Andrea Di Biagio 32d5aedd5b [InlineFunction] Do not propagate the callsite debug location to instructions inlined from functions with debug info.
When a function F is inlined, InlineFunction extends the debug location of every
instruction inlined from F by adding an InlinedAt.

However, if an instruction has a 'null' debug location, InlineFunction would
propagate the callsite debug location to it. This behavior existed since
revision 210459.

Revision 210459 was originally committed specifically to workaround the lack of
debug information for instructions inlined from intrinsic functions (which are
usually declared with attributes `__always_inline__, __nodebug__`).

The problem with revision 210459 is that it doesn't make any sort of distinction
between instructions inlined from a 'nodebug' function and instructions which
are inlined from a function built with debug info. This issue may lead to
incorrect stepping in the debugger.

This patch works under the assumption that a nodebug function does not have a
DISubprogram. When a function F is inlined into another function G,
InlineFunction checks if F has debug info associated with it.

For nodebug functions, the InlineFunction logic is unchanged (i.e. it would
still propagate the callsite debugloc to the inlined instructions). Otherwise,
InlineFunction no longer propagates the callsite debug location.

Differential Revision: https://reviews.llvm.org/D27462

llvm-svn: 288895
2016-12-07 10:37:26 +00:00
Adrian Prantl 941fa7588b [DIExpression] Introduce a dedicated DW_OP_LLVM_fragment operation
so we can stop using DW_OP_bit_piece with the wrong semantics.

The entire back story can be found here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161114/405934.html

The gist is that in LLVM we've been misinterpreting DW_OP_bit_piece's
offset field to mean the offset into the source variable rather than
the offset into the location at the top the DWARF expression stack. In
order to be able to fix this in a subsequent patch, this patch
introduces a dedicated DW_OP_LLVM_fragment operation with the
semantics that we used to apply to DW_OP_bit_piece, which is what we
actually need while inside of LLVM. This patch is complete with a
bitcode upgrade for expressions using the old format. It does not yet
fix the DWARF backend to use DW_OP_bit_piece correctly.

Implementation note: We discussed several options for implementing
this, including reserving a dedicated field in DIExpression for the
fragment size and offset, but using an custom operator at the end of
the expression works just fine and is more efficient because we then
only pay for it when we need it.

Differential Revision: https://reviews.llvm.org/D27361
rdar://problem/29335809

llvm-svn: 288683
2016-12-05 18:04:47 +00:00
Michael Kuperstein 997dac8709 Remove stale comment. NFC.
llvm-svn: 288572
2016-12-03 01:59:13 +00:00
Peter Collingbourne bc0705240e IR: Move NumElements field from {Array,Vector}Type to SequentialType.
Now that PointerType is no longer a SequentialType, all SequentialTypes
have an associated number of elements, so we can move that information to
the base class, allowing for a number of simplifications.

Differential Revision: https://reviews.llvm.org/D27122

llvm-svn: 288464
2016-12-02 03:20:58 +00:00
Peter Collingbourne ab85225be4 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Teresa Johnson 185b4ab6d4 [ThinLTO] Stop importing constant global vars as copies in the backend
Summary:
We were doing an optimization in the ThinLTO backends of importing
constant unnamed_addr globals unconditionally as a local copy (regardless
of whether the thin link decided to import them). This should be done in
the thin link instead, so that resulting exported references are marked
and promoted appropriately, but will need a summary enhancement to mark
these variables as constant unnamed_addr.

The function import logic during the thin link was trying to handle
this proactively, by conservatively marking all values referenced in
the initializer lists of exported global variables as also exported.
However, this only handled values referenced directly from the
initializer list of an exported global variable. If the value is itself
a constant unnamed_addr variable, we could end up exporting its
references as well. This caused multiple issues. The first is that the
transitively exported references weren't promoted. Secondly, some could
not be promoted/renamed (e.g. they had a section or other constraint).
recursively, instead of just adding the first level of initializer list
references to the ExportList directly.

Remove this optimization and the associated handling in the function
import backend. SPEC measurements indicate we weren't getting much
from it in any case.

Fixes PR31052.

Reviewers: mehdi_amini

Subscribers: krasin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26880

llvm-svn: 288446
2016-12-02 01:02:30 +00:00
Michael Kuperstein b151a641aa [LoopUnroll] Implement profile-based loop peeling
This implements PGO-driven loop peeling.

The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we can expect a performance win by peeling off the
first several iterations of that loop.
Unlike unrolling based on a known trip count, or a trip count multiple, this
doesn't save us the conditional check and branch on each iteration. However,
it does allow us to simplify the straight-line code we get (constant-folding,
etc.). This is important given that we know that we will usually only hit this
code, and not the actual loop.

This is currently disabled by default.

Differential Revision: https://reviews.llvm.org/D25963

llvm-svn: 288274
2016-11-30 21:13:57 +00:00
Eugene Zelenko a3fe70d233 Fix some Clang-tidy and Include What You Use warnings; other minor fixes (NFC).
This preparation to remove SetVector.h dependency on SmallSet.h.

llvm-svn: 288256
2016-11-30 17:48:10 +00:00
Sanjay Patel da9f7bf0fc fix formatting; NFC
llvm-svn: 287997
2016-11-27 15:53:48 +00:00
Chandler Carruth dab4eae274 [PM] Change the static object whose address is used to uniquely identify
analyses to have a common type which is enforced rather than using
a char object and a `void *` type when used as an identifier.

This has a number of advantages. First, it at least helps some of the
confusion raised in Justin Lebar's code review of why `void *` was being
used everywhere by having a stronger type that connects to documentation
about this.

However, perhaps more importantly, it addresses a serious issue where
the alignment of these pointer-like identifiers was unknown. This made
it hard to use them in pointer-like data structures. We were already
dodging this in dangerous ways to create the "all analyses" entry. In
a subsequent patch I attempted to use these with TinyPtrVector and
things fell apart in a very bad way.

And it isn't just a compile time or type system issue. Worse than that,
the actual alignment of these pointer-like opaque identifiers wasn't
guaranteed to be a useful alignment as they were just characters.

This change introduces a type to use as the "key" object whose address
forms the opaque identifier. This both forces the objects to have proper
alignment, and provides type checking that we get it right everywhere.
It also makes the types somewhat less mysterious than `void *`.

We could go one step further and introduce a truly opaque pointer-like
type to return from the `ID()` static function rather than returning
`AnalysisKey *`, but that didn't seem to be a clear win so this is just
the initial change to get to a reliably typed and aligned object serving
is a key for all the analyses.

Thanks to Richard Smith and Justin Lebar for helping pick plausible
names and avoid making this refactoring many times. =] And thanks to
Sean for the super fast review!

While here, I've tried to move away from the "PassID" nomenclature
entirely as it wasn't really helping and is overloaded with old pass
manager constructs. Now we have IDs for analyses, and key objects whose
address can be used as IDs. Where possible and clear I've shortened this
to just "ID". In a few places I kept "AnalysisID" to make it clear what
was being identified.

Differential Revision: https://reviews.llvm.org/D27031

llvm-svn: 287783
2016-11-23 17:53:26 +00:00
Mandeep Singh Grang 73f0095d71 [MemorySSA] Fix for non-determinism in codegen
This patch fixes the non-determinism caused due to iterating SmallPtrSet's
which was uncovered due to the experimental "reverse iteration order " patch:
https://reviews.llvm.org/D26718

The following unit tests failed because of the undefined order of iteration.
LLVM :: Transforms/Util/MemorySSA/cyclicphi.ll
LLVM :: Transforms/Util/MemorySSA/many-dom-backedge.ll
LLVM :: Transforms/Util/MemorySSA/many-doms.ll
LLVM :: Transforms/Util/MemorySSA/phi-translation.ll

Reviewers: dberlin, mgrang

Subscribers: dberlin, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D26704

llvm-svn: 287563
2016-11-21 19:33:02 +00:00
Benjamin Kramer ffd3715d16 Give some helper classes/functions internal linkage. NFC.
llvm-svn: 287462
2016-11-19 20:44:26 +00:00
Michael Zolotukhin 5020c9971b [LoopSimplify] Preserve LCSSA when removing edges from unreachable blocks.
This fixes PR30454.

llvm-svn: 287379
2016-11-18 21:01:12 +00:00
Florian Hahn 77382be56b [simplifycfg][loop-simplify] Preserve loop metadata in 2 transformations.
insertUniqueBackedgeBlock in lib/Transforms/Utils/LoopSimplify.cpp now
propagates existing llvm.loop metadata to newly the added backedge.

llvm::TryToSimplifyUncondBranchFromEmptyBlock in lib/Transforms/Utils/Local.cpp
now propagates existing llvm.loop metadata to the branch instructions in the
predecessor blocks of the empty block that is removed.

Differential Revision: https://reviews.llvm.org/D26495

llvm-svn: 287341
2016-11-18 13:12:07 +00:00
Chris Bieneman 05c279fc4b [CMake] NFC. Updating CMake dependency specifications
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.

llvm-svn: 287206
2016-11-17 04:36:50 +00:00
Dehao Chen 41d72a8632 Use profile info to adjust loop unroll threshold.
Summary:
For flat loop, even if it is hot, it is not a good idea to unroll in runtime, thus we set a lower partial unroll threshold.
For hot loop, we set a higher unroll threshold and allows expensive tripcount computation to allow more aggressive unrolling.

Reviewers: davidxl, mzolotukhin

Subscribers: sanjoy, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D26527

llvm-svn: 287186
2016-11-17 01:17:02 +00:00
Justin Lebar 2860573529 [BypassSlowDivision] Handle division by constant numerators better.
Summary:
We don't do BypassSlowDivision when the denominator is a constant, but
we do do it when the numerator is a constant.

This patch makes two related changes to BypassSlowDivision when the
numerator is a constant:

 * If the numerator is too large to fit into the bypass width, don't
   bypass slow division (because we'll never run the smaller-width
   code).

 * If we bypass slow division where the numerator is a constant, don't
   OR together the numerator and denominator when determining whether
   both operands fit within the bypass width.  We need to check only the
   denominator.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D26699

llvm-svn: 287062
2016-11-16 00:44:47 +00:00
Justin Lebar 583b8687eb [BypassSlowDivision] Simplify partially-tautological if statement.
if (A || (B && A)) --> if (A).

llvm-svn: 287061
2016-11-16 00:44:43 +00:00
Kuba Brecka ddfdba3b01 [tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part
This adds support for TSan C++ exception handling, where we need to add extra calls to __tsan_func_exit when a function is exitted via exception mechanisms. Otherwise the shadow stack gets corrupted (leaked). This patch moves and enhances the existing implementation of EscapeEnumerator that finds all possible function exit points, and adds extra EH cleanup blocks where needed.

Differential Revision: https://reviews.llvm.org/D26177

llvm-svn: 286893
2016-11-14 21:41:13 +00:00
Teresa Johnson 4fef68cb8d [ThinLTO] Only promote exported locals as marked in index
Summary:
We have always speculatively promoted all renamable local values
(except const non-address taken variables) for both the exporting
and importing module. We would then internalize them back based on
the ThinLink results if they weren't actually exported. This is
inefficient, and results in unnecessary renames. It also meant we
had to check the non-renamability of a value in the summary, which
was already checked during function importing analysis in the ThinLink.

Made renameModuleForThinLTO (which does the promotion/renaming) instead
use the index when exporting, to avoid unnecessary renames/promotions.
For importing modules, we can simply promoted all values as any local
we import by definition is exported and needs promotion.

This required changes to the method used by the FunctionImport pass
(only invoked from 'opt' for testing) and when invoked from llvm-link,
since neither does a ThinLink. We simply conservatively mark all locals
in the index as promoted, which preserves the current aggressive
promotion behavior.

I also needed to change an llvm-lto based test where we had previously
been aggressively promoting values that weren't importable (aliasees),
but now will not promote.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26467

llvm-svn: 286871
2016-11-14 19:21:41 +00:00