Commit Graph

533 Commits

Author SHA1 Message Date
Geoff Berry 5bf4a5eafa [EarlyCSE] Add debug counter for debugging mis-optimizations. NFC.
Reviewers: reames, spatel, davide, dberlin

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D45162

llvm-svn: 329443
2018-04-06 18:47:33 +00:00
Fedor Sergeev 6660fd0f95 [PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass
Summary:
This was motivated by absence of PrunEH functionality in new PM.
It was decided that a proper way to do PruneEH is to add NoUnwind inference
into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top.

This change generalizes attribute handling implemented for (a removal of)
Convergent attribute, by introducing a generic builder-like class
   AttributeInferer

It registers all the attribute inference requests, storing per-attribute
predicates into a vector, and then goes through an SCC Node, scanning all
the instructions for not breaking attribute assumptions.

The main idea is that as soon all the instructions from all the functions
of SCC Node conform to attribute assumptions then we are free to infer
the attribute as set for all the functions of SCC Node.

It handles two distinct cases of attributes:
   - those that might break due to derefinement of the function code

     for these attributes we are allowed to apply inference only if all the
     functions are "exact definitions". Example - NoUnwind.

   - those that do not care about derefinement

     for these attributes we are allowed to apply inference as soon as we see
     any function definition. Example - removal of Convergent attribute.

Also in this commit:
* Converted all the FunctionAttrs tests to use FileCheck and added new-PM
  invocations to them

* FunctionAttrs/convergent.ll test demonstrates a difference in behavior between
   new and old PM implementations. Marked with FIXME.

* PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg
  combo as intended

* some of "other" tests were updated since function-attrs now infers 'nounwind'
  even for old PM pipeline

* -disable-nounwind-inference hidden option added as a possible workaround for a supposedly
  rare case when nounwind being inferred by default presents a problem

Reviewers: chandlerc, jlebar

Reviewed By: jlebar

Subscribers: eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D44415

llvm-svn: 328377
2018-03-23 21:46:16 +00:00
Matthew Simpson 4316a2623a [test] Allow for optional No-Op Barrier Pass in O0 pipeline
llvm-svn: 328310
2018-03-23 12:47:54 +00:00
Michael Zolotukhin cc83994bba [test] Try to unbreak hexagon bots after r328160.
llvm-svn: 328167
2018-03-21 22:57:33 +00:00
Michael Zolotukhin 1dce44e8e8 [test] Add tests for opt passes pipelines for O0, O2, O3, and Os.
llvm-svn: 328160
2018-03-21 22:17:31 +00:00
Florian Hahn b4e3bad89b Recommit r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call.
For basic blocks with instructions between the beginning of the block
and a call we have to duplicate the instructions before the call in all
split blocks and add PHI nodes for uses of the duplicated instructions
after the call.

Currently, the threshold for the number of instructions before a call
is quite low, to keep the impact on binary size low.

Reviewers: junbuml, mcrosier, davidxl, davide

Reviewed By: junbuml

Differential Revision: https://reviews.llvm.org/D41860

llvm-svn: 325126
2018-02-14 13:59:12 +00:00
Florian Hahn 35d744d388 Revert r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call.
Due to memsan not being happy with the array of ValueToValue maps.

llvm-svn: 325009
2018-02-13 14:48:39 +00:00
Florian Hahn 6c69732c05 [CallSiteSplitting] Fix new-pm test, as TargetIRAnalysis is run earlier now
llvm-svn: 325002
2018-02-13 12:22:32 +00:00
David Blaikie 359006f192 REQUIRES: shell a couple of tests that require the shell
One test uses diff, the other tries to change the PATH which doesn't
seem to work well ('not' is no longer accessible/found after the PATH is
changed - I think $PATH isn't expanded when setting PATH).

llvm-svn: 324787
2018-02-10 00:14:54 +00:00
Zaara Syeda 1f59ae311b Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This recommits r322721 reverted due to sanitizer memory leak build bot failures.

Original commit message:
This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 323778
2018-01-30 16:17:22 +00:00
Amjad Aboud f1f57a3137 Another try to commit 323321 (aggressive instruction combine).
llvm-svn: 323416
2018-01-25 12:06:32 +00:00
Amjad Aboud d53504e379 Reverted 323321.
llvm-svn: 323326
2018-01-24 14:48:49 +00:00
Amjad Aboud e4453233d7 [InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine).
Combine expression patterns to form expressions with fewer, simple instructions.
This pass does not modify the CFG.

For example, this pass reduce width of expressions post-dominated by TruncInst
into smaller width when applicable.

It differs from instcombine pass in that it contains pattern optimization that
requires higher complexity than the O(1), thus, it should run fewer times than
instcombine pass.

Differential Revision: https://reviews.llvm.org/D38313

llvm-svn: 323321
2018-01-24 12:42:42 +00:00
David Blaikie 0c64f5a2eb NewPM: Add an extension point for the start of the pipeline.
This applies to most pipelines except the LTO and ThinLTO backend
actions - it is for use at the beginning of the overall pipeline.

This extension point will be used to add the GCOV pass when enabled in
Clang.

llvm-svn: 323166
2018-01-23 01:25:20 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Rafael Espindola 9fbc040599 Make GlobalValues with non-default visibilility dso_local.
This is similar to r322317, but for visibility. It is not as neat
because we have to special case extern_weak.

The idea is the same as the previous change, make the transition to
explicit dso_local easier for the frontends. With this they only have
to add dso_local to symbols where we need some external information to
decide if it is dso_local (like it being part of an ELF executable).

llvm-svn: 322806
2018-01-18 02:08:23 +00:00
Zaara Syeda c9dc7b451b Revert [PowerPC] This reverts commit rL322721
Failing build bots. Revert the commit now.

llvm-svn: 322748
2018-01-17 20:00:15 +00:00
Zaara Syeda 8e951fd2f6 [PowerPC] Add handling for ColdCC calling convention and a pass to mark
candidates with coldcc attribute.

This patch adds support for the coldcc calling convention for Power.
This changes the set of non-volatile registers. It includes a pass to stress
test the implementation by marking all static directly called functions with
the coldcc attribute through the option -enable-coldcc-stress-test. It also
includes an option, -ppc-enable-coldcc, to add the coldcc attribute to
functions which are cold at all call sites based on BlockFrequencyInfo when
the containing function does not call any non cold functions.

Differential Revision: https://reviews.llvm.org/D38413

llvm-svn: 322721
2018-01-17 18:22:55 +00:00
Rafael Espindola e4b0231c63 Make internal/private GVs implicitly dso_local.
While updating clang tests for having clang set dso_local I noticed
that:

- There are *a lot* of tests to update.
- Many of the updates are redundant.

They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.

llvm-svn: 322317
2018-01-11 22:15:05 +00:00
Fedor Sergeev 02e7f0247b [PM] pass -debug-pass-manager flag into FunctionToLoopPassAdaptor's canonicalization PM
Summary:
New pass manager driver passes DebugPM (-debug-pass-manager) flag into
individual PassManager constructors in order to enable debug logging.
FunctionToLoopPassAdaptor has its own internal LoopCanonicalizationPM
which never gets its debug logging enabled and that means canonicalization
passes like LoopSimplify are never present in -debug-pass-manager output.

Extending FunctionToLoopPassAdaptor's constructor and
createFunctionToLoopPassAdaptor wrapper with an optional
boolean DebugLogging argument.

Passing debug-logging flags there as appropriate.

Reviewers: chandlerc, davide

Reviewed By: davide

Subscribers: mehdi_amini, eraman, llvm-commits, JDevlieghere

Differential Revision: https://reviews.llvm.org/D41586

llvm-svn: 321548
2017-12-29 08:16:06 +00:00
Dimitry Andric e4f5d01033 Fix more inconsistent line endings. NFC.
llvm-svn: 321016
2017-12-18 19:46:56 +00:00
Sanjay Patel 0ab0c1a201 [SimplifyCFG] don't sink common insts too soon (PR34603)
This should solve:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run.
It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the
sinking transform later in the optimization pipeline.

Differential Revision: https://reviews.llvm.org/D38566

llvm-svn: 320749
2017-12-14 22:05:20 +00:00
Fedor Sergeev 83bcc68afa [PM][InstCombine] fixing omission of AliasAnalysis in new-pass-manager's version of InstCombine
Summary:
Passing AliasAnalysis results instead of nullptr appears to work just fine.
A couple new-pass-manager tests updated to align with new order of analyses.

Reviewers: chandlerc, spatel, craig.topper

Reviewed By: chandlerc

Subscribers: mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D41203

llvm-svn: 320687
2017-12-14 10:36:31 +00:00
Fedor Sergeev 3b459c3847 IR printing improvement for loop passes - handle -print-module-scope
Summary:
Adding support for -print-module-scope similar to how it is
being done for function passes. This option causes loop-pass printer
to emit a whole-module IR instead of just a loop itself.

Reviewers: sanjoy, silvas, weimingz

Reviewed By: sanjoy

Subscribers: apilipenko, skatkov, llvm-commits

Differential Revision: https://reviews.llvm.org/D40247

llvm-svn: 319566
2017-12-01 18:33:58 +00:00
Fedor Sergeev 94dca7c7ea IR printing improvement for function passes - introducing -print-module-scope
Summary:
When debugging function passes it happens to be rather useful to dump
the whole module before the transformation and then use this dump
to analyze this single transformation by running it separately
on that particular module state.

Introducing
    -print-module-scope
debugging option that forces all the function-level IR dumps
to become whole-module dumps.

This option builds on top of normal dumping controls like
   -print-before/after
   -filter-print-funcs

The plan is to eventually extend this option to cover other local passes
(at least loop passes) but that should go as a separate change.

Reviewers: sanjoy, weimingz, silvas, fedor.sergeev

Reviewed By: weimingz

Subscribers: apilipenko, skatkov, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D40245

llvm-svn: 319561
2017-12-01 17:42:46 +00:00
Chandler Carruth c34f789e38 Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable.
The core idea is to (re-)introduce some redundancies where their cost is
hidden by the cost of materializing immediates for constant operands of
PHI nodes. When the cost of the redundancies is covered by this,
avoiding materializing the immediate has numerous benefits:
1) Less register pressure
2) Potential for further folding / combining
3) Potential for more efficient instructions due to immediate operand

As a motivating example, consider the remarkably different cost on x86
of a SHL instruction with an immediate operand versus a register
operand.

This pattern turns up surprisingly frequently, but is somewhat rarely
obvious as a significant performance problem.

The pass is entirely target independent, but it does rely on the target
cost model in TTI to decide when to speculate things around the PHI
node. I've included x86-focused tests, but any target that sets up its
immediate cost model should benefit from this pass.

There is probably more that can be done in this space, but the pass
as-is is enough to get some important performance on our internal
benchmarks, and should be generally performance neutral, but help with
more extensive benchmarking is always welcome.

One awkward part is that this pass has to be scheduled after
*everything* that can eliminate these kinds of redundancies. This
includes SimplifyCFG, GVN, etc. I'm open to suggestions about better
places to put this. We could in theory make it part of the codegen pass
pipeline, but there doesn't really seem to be a good reason for that --
it isn't "lowering" in any sense and only relies on pretty standard cost
model based TTI queries, so it seems to fit well with the "optimization"
pipeline model. Still, further thoughts on the pipeline position are
welcome.

I've also only implemented this in the new pass manager. If folks are
very interested, I can try to add it to the old PM as well, but I didn't
really see much point (my use case is already switched over to the new
PM).

I've tested this pretty heavily without issue. A wide range of
benchmarks internally show no change outside the noise, and I don't see
any significant changes in SPEC either. However, the size class
computation in tcmalloc is substantially improved by this, which turns
into a 2% to 4% win on the hottest path through tcmalloc for us, so
there are definitely important cases where this is going to make
a substantial difference.

Differential revision: https://reviews.llvm.org/D37467

llvm-svn: 319164
2017-11-28 11:32:31 +00:00
Fedor Sergeev 61975b49fe IR printing improvement for loop passes
Summary:
Loop-pass printing is somewhat deficient since it does not provide the
context around the loop (e.g. preheader). This context information becomes
pretty essential when analyzing transformations that move stuff out of the loop.

Extending printLoop to cover preheader and exit blocks (if any).

Reviewers: sanjoy, silvas, weimingz

Reviewed By: sanjoy

Subscribers: apilipenko, skatkov, llvm-commits

Differential Revision: https://reviews.llvm.org/D40246

llvm-svn: 318878
2017-11-22 20:59:53 +00:00
Yaxun Liu 407ca36b27 Let llvm.invariant.group.barrier accepts pointer to any address space
llvm.invariant.group.barrier may accept pointers to arbitrary address space.

This patch let it accept pointers to i8 in any address space and returns
pointer to i8 in the same address space.

Differential Revision: https://reviews.llvm.org/D39973

llvm-svn: 318413
2017-11-16 16:32:16 +00:00
Jun Bum Lim 0c99007db1 Recommit r317351 : Add CallSiteSplitting pass
This recommit r317351 after fixing a buildbot failure.

Original commit message:

    Summary:
    This change add a pass which tries to split a call-site to pass
    more constrained arguments if its argument is predicated in the control flow
    so that we can expose better context to the later passes (e.g, inliner, jump
    threading, or IPA-CP based function cloning, etc.).
    As of now we support two cases :

    1) If a call site is dominated by an OR condition and if any of its arguments
    are predicated on this OR condition, try to split the condition with more
    constrained arguments. For example, in the code below, we try to split the
    call site since we can predicate the argument (ptr) based on the OR condition.

    Split from :
          if (!ptr || c)
            callee(ptr);
    to :
          if (!ptr)
            callee(null ptr)  // set the known constant value
          else if (c)
            callee(nonnull ptr)  // set non-null attribute in the argument

    2) We can also split a call-site based on constant incoming values of a PHI
    For example,
    from :
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2, label %BB1
          BB1:
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
           call void @bar(i32 %p)
    to
          BB0:
           %c = icmp eq i32 %i1, %i2
           br i1 %c, label %BB2-split0, label %BB1
          BB1:
           br label %BB2-split1
          BB2-split0:
           call void @bar(i32 0)
           br label %BB2
          BB2-split1:
           call void @bar(i32 1)
           br label %BB2
          BB2:
           %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

llvm-svn: 317362
2017-11-03 20:41:16 +00:00
Jun Bum Lim 0eb1c2d63a Revert "Add CallSiteSplitting pass"
Revert due to Buildbot failure.

This reverts commit r317351.

llvm-svn: 317353
2017-11-03 19:17:11 +00:00
Jun Bum Lim 2a58933519 Add CallSiteSplitting pass
Summary:
This change add a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :

1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.

Split from :
      if (!ptr || c)
        callee(ptr);
to :
      if (!ptr)
        callee(null ptr)  // set the known constant value
      else if (c)
        callee(nonnull ptr)  // set non-null attribute in the argument

2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2, label %BB1
      BB1:
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
       call void @bar(i32 %p)
to
      BB0:
       %c = icmp eq i32 %i1, %i2
       br i1 %c, label %BB2-split0, label %BB1
      BB1:
       br label %BB2-split1
      BB2-split0:
       call void @bar(i32 0)
       br label %BB2
      BB2-split1:
       call void @bar(i32 1)
       br label %BB2
      BB2:
       %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]

Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide

Reviewed By: davidxl

Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39137

llvm-svn: 317351
2017-11-03 19:01:57 +00:00
Matthew Simpson cb58558c2f Add CalledValuePropagation pass
This patch adds a new pass for attaching !callees metadata to indirect call
sites. The pass propagates values to call sites by performing an IPSCCP-like
analysis using the generic sparse propagation solver. For indirect call sites
having a small set of possible callees, the attached metadata indicates what
those callees are. The metadata can be used to facilitate optimizations like
intersecting the function attributes of the possible callees, refining the call
graph, performing indirect call promotion, etc.

Differential Revision: https://reviews.llvm.org/D37355

llvm-svn: 316576
2017-10-25 13:40:08 +00:00
Rong Xu e1f4245f8d [PM] Add pgo-memop-opt pass to the new pass manager
This pass adds pgo-memop-opt pass to the new pass manager.
It is in the old pass manager but somehow left out in the new pass manager.

Differential Revision: http://reviews.llvm.org/D39145

llvm-svn: 316384
2017-10-23 22:21:29 +00:00
Davide Italiano e070721308 [NewPassManager] Run global dead code elimination after the inliner.
This is the same exact change we did for the current pass manager
in rL314997, but the new pass manager pipeline already happened
to run GlobalOpt after the inliner, so we just insert a run of
GDCE here.

llvm-svn: 315003
2017-10-05 18:36:01 +00:00
Davide Italiano c8708e59e8 [PassManager] Improve the interaction between -O2 and ThinLTO.
Run GDCE slightly later so that we don't have to repeat it
twice when preparing for Thin. Thanks to Mehdi for the suggestion.

llvm-svn: 314999
2017-10-05 18:23:25 +00:00
Davide Italiano ff829cea8b [PassManager] Run global optimizations after the inliner.
The inliner performs some kind of dead code elimination as it goes,
but there are cases that are not really caught by it. We might
at some point consider teaching the inliner about them, but it
is OK for now to run GlobalOpt + GlobalDCE in tandem as their
benefits generally outweight the cost, making the whole pipeline
faster.

This fixes PR34652.

Differential Revision: https://reviews.llvm.org/D38154

llvm-svn: 314997
2017-10-05 18:06:37 +00:00
Sanjoy Das 005b88c0a6 Do not call Loop::getName on possibly dead loops
This fixes PR34832.

llvm-svn: 314938
2017-10-04 22:02:27 +00:00
Chandler Carruth 7376ae88eb [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle
invalidated SCCs even when we do not have an updated SCC to redirect
towards.

This comes up in a fairly subtle and surprising circumstance: we need to
have a connected but internal node in the call graph which later becomes
a disconnected island, and then gets deleted. All of this needs to
happen mid-CGSCC walk. Because it is disconnected, we have no way of
computing a new "current" SCC when it gets deleted. Instead, we need to
explicitly check for a deleted "current" SCC and bail out of the current
CGSCC step. This will bubble all the way up to the post-order walk and
then resume correctly.

I've included minimal tests for this bug. The specific behavior
matches something we've seen in the wild with the new PM combined with
ThinLTO and sample PGO, but I've not yet confirmed whether this is the
only issue there.

llvm-svn: 313242
2017-09-14 08:33:57 +00:00
Nuno Lopes 404f106d71 Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non null

Differential Revision: https://reviews.llvm.org/D37628

llvm-svn: 312869
2017-09-09 18:23:11 +00:00
Sanjay Patel 6fd4391ddd [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 

llvm-svn: 312862
2017-09-09 13:38:18 +00:00
Victor Leschuk ee7d232a41 revert failing test
llvm-svn: 311238
2017-08-19 12:24:41 +00:00
Victor Leschuk ba0954c4e2 Add temporary test to verify that win10 builder hangs on error
llvm-svn: 311236
2017-08-19 12:02:39 +00:00
Kuba Mracek 17ee427ef3 [llvm] Get rid of "%T" expansions
The %T lit expansion expands to a common directory shared between all the tests in the same directory, which is unexpected and unintuitive, and more importantly, it's been a source of subtle race conditions and flaky tests. In https://reviews.llvm.org/D35396, it was agreed that it would be best to simply ban %T and only keep %t, which is unique to each test. When a test needs a temporary directory, it can just create one using mkdir %t.

This patch removes %T in llvm.

Differential Revision: https://reviews.llvm.org/D36495

llvm-svn: 310953
2017-08-15 20:29:24 +00:00
Chandler Carruth 19913b22c0 [PM] Switch the CGSCC debug messages to use the standard LLVM debug
printing techniques with a DEBUG_TYPE controlling them.

It was a mistake to start re-purposing the pass manager `DebugLogging`
variable for generic debug printing -- those logs are intended to be
very minimal and primarily used for testing. More detailed and
comprehensive logging doesn't make sense there (it would only make for
brittle tests).

Moreover, we kept forgetting to propagate the `DebugLogging` variable to
various places making it also ineffective and/or unavailable. Switching
to `DEBUG_TYPE` makes this a non-issue.

llvm-svn: 310695
2017-08-11 05:47:13 +00:00
Dehao Chen 2f4e2e2758 Revert part of r310296 to make it really NFC for instrumentation PGO.
Summary: Part of r310296 will disable PGOIndirectCallPromotion in ThinLTO backend if PGOOpt is None. However, as PGOOpt is not passed down to ThinLTO backend for instrumentation based PGO, that change would actually disable ICP entirely in ThinLTO backend, making it behave differently in instrumentation PGO mode. This change reverts that change, and only disable ICP there when it is SamplePGO.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36566

llvm-svn: 310550
2017-08-10 05:10:32 +00:00
Dehao Chen 34cfcb29aa Make ICP uses PSI to check for hotness.
Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted.

Reviewers: davidxl, tejohnson, eraman

Reviewed By: davidxl

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36341

llvm-svn: 310416
2017-08-08 20:57:33 +00:00
Chandler Carruth 6e35c31d2d [PM] Fix a likely more critical infloop bug in the CGSCC pass manager.
This was just a bad oversight on my part. The code in question should
never have worked without this fix. But it turns out, there are
relatively few places that involve libfunctions that participate in
a single SCC, and unless they do, this happens to not matter.

The effect of not having this correct is that each time through this
routine, the edge from write_wrapper to write was toggled between a call
edge and a ref edge. First time through, it becomes a demoted call edge
and is turned into a ref edge. Next time it is a promoted call edge from
a ref edge. On, and on it goes forever.

I've added the asserts which should have always been here to catch silly
mistakes like this in the future as well as a test case that will
actually infloop without the fix.

The other (much scarier) infinite-inlining issue I think didn't actually
occur in practice, and I simply misdiagnosed this minor issue as that
much more scary issue. The other issue *is* still a real issue, but I'm
somewhat relieved that so far it hasn't happened in real-world code
yet...

llvm-svn: 310342
2017-08-08 10:13:23 +00:00
Dehao Chen 08f8831e57 Move the SampleProfileLoader right after EarlyFPM.
Summary: SampleProfileLoader pass do need to happen after some early cleanup passes so that inlining can happen correctly inside the SampleProfileLoader pass.

Reviewers: chandlerc, davidxl, tejohnson

Reviewed By: chandlerc, tejohnson

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36333

llvm-svn: 310296
2017-08-07 20:23:20 +00:00
Teresa Johnson 8482e56920 Use profile summary to disable peeling for huge working sets
Summary:
Detect when the working set size of a profiled application is huge,
by comparing the number of counts required to reach the hot percentile
in the profile summary to a large threshold*.

When the working set size is determined to be huge, disable peeling
to avoid bloating the working set further.

*Note that the selected threshold (15K) is significantly larger than the
largest working set value in SPEC cpu2006 (which is gcc at around 11K).

Reviewers: davidxl

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36288

llvm-svn: 310005
2017-08-03 23:42:58 +00:00
Teresa Johnson ecd901314d [PM] Split LoopUnrollPass and make partial unroller a function pass
Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I am only doing the
splitting for the New PM, but I can do the same for the legacy PM as
a follow-on if this looks good.

*Not NFC since for partial unrolling we lose the updates done to the
loop traversal (adding new sibling and child loops) - according to
Chandler this is not very useful for partial unrolling, but it also
means that the debugging flag -unroll-revisit-child-loops no longer
works for partial unrolling.

Reviewers: chandlerc

Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36157

llvm-svn: 309886
2017-08-02 20:35:29 +00:00
Dehao Chen 4194ebff8d Update the test to make windows bot pass.
llvm-svn: 309482
2017-07-29 07:01:25 +00:00
Dehao Chen 246254b97d update the test file that was omitted in r309478.
llvm-svn: 309479
2017-07-29 04:11:20 +00:00
Dehao Chen ce0842ce9c Refine the PGOOpt and SamplePGOSupport handling.
Summary:
Now that SamplePGOSupport is part of PGOOpt, there are several places that need tweaking:
1. AddDiscriminator pass should *not* be invoked at ThinLTOBackend (as it's already invoked in the PreLink phase)
2. addPGOInstrPasses should only be invoked when either ProfileGenFile or ProfileUseFile is non-empty.
3. SampleProfileLoaderPass should only be invoked when SampleProfileFile is non-empty.
4. PGOIndirectCallPromotion should only be invoked in ProfileUse phase, or in ThinLTOBackend of SamplePGO.

Reviewers: chandlerc, tejohnson, davidxl

Reviewed By: chandlerc

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36040

llvm-svn: 309478
2017-07-29 04:10:24 +00:00
Adam Nemet a67dfe3b04 Relax the matching in these tests
Looks like the template arguments are displayed differently depending on the
host compiler(?).  E.g.:

InnerAnalysisManagerProxy<CGSCCAnalysisManager
InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::LazyCallGraph::SCC, ...

Fix fallout after r309294

llvm-svn: 309297
2017-07-27 17:45:02 +00:00
Adam Nemet 0d8b5d6f69 [ICP] Migrate to OptimizationRemarkEmitter
This is a module pass so for the old PM, we can't use ORE, the function
analysis pass.  Instead ORE is created on the fly.

A few notes:

- isPromotionLegal is folded in the caller since we want to emit the Function
in the remark but we can only do that if the symbol table look-up succeeded.

- There was good test coverage for remarks in this pass.

- promoteIndirectCall uses ORE conditionally since it's also used from
SampleProfile which does not use ORE yet.

Fixes PR33792.

Differential Revision: https://reviews.llvm.org/D35929

llvm-svn: 309294
2017-07-27 16:54:15 +00:00
Adam Nemet ea06e6e865 Migrate SimplifyLibCalls to new OptimizationRemarkEmitter
Summary:
This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter
API.

In fact, as SimplifyLibCalls is only ever called via InstCombine,
(as far as I can tell) the OptimizationRemarkEmitter is added there,
and then passed through to SimplifyLibCalls later.

I have avoided changing any remark text.

This closes PR33787

Patch by Sam Elliott!

Reviewers: anemet, davide

Reviewed By: anemet

Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D35608

llvm-svn: 309158
2017-07-26 19:03:18 +00:00
Dehao Chen e90d0153ca Make new PM honor -fdebug-info-for-profiling
Summary: The new PM needs to invoke add-discriminator pass when building with -fdebug-info-for-profiling.

Reviewers: chandlerc, davidxl

Reviewed By: chandlerc

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D35744

llvm-svn: 309121
2017-07-26 15:01:20 +00:00
Dehao Chen 7b05a2712a Add test coverage for new PM PGOOpt handling.
Summary: This patch adds flags and tests to cover the PGOOpt handling logic in new PM.

Reviewers: chandlerc, davide

Reviewed By: chandlerc

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D35807

llvm-svn: 309076
2017-07-26 02:00:43 +00:00
Davide Italiano 4b8c8eae32 [TRE] Move to the new OptRemark API.
Fixes PR33788.

Differential Revision:  https://reviews.llvm.org/D35570

llvm-svn: 308524
2017-07-19 21:13:22 +00:00
Chandler Carruth 06a86301a1 [PM/LCG] Follow-up fix to r308088 to handle deletion of library
functions.

In the prior commit, we provide ordering to the LCG between functions
and library function definitions that they might begin to call through
transformations. But we still would delete these library functions from
the call graph if they became dead during inlining.

While this immediately crashed, it also exposed a loss of information.
We shouldn't remove definitions of library functions that can still
usefully participate in the LCG-powered CGSCC optimization process. If
new call edges are formed, we want to have definitions to be called.

We can still remove these functions if truly dead using global-dce, etc,
but removing them during the CGSCC walk is premature.

This fixes a crash in the new PM when optimizing some unusual libraries
that end up with "internal" lib functions such as the code in the "R"
language's libraries.

llvm-svn: 308417
2017-07-19 04:12:25 +00:00
Chandler Carruth f59a838720 [PM/LCG] Teach the LazyCallGraph to maintain reference edges from every
function to every defined function known to LLVM as a library function.

LLVM can introduce calls to these functions either by replacing other
library calls or by recognizing patterns (such as memset_pattern or
vector math patterns) and replacing those with calls. When these library
functions are actually defined in the module, we need to have reference
edges to them initially so that we visit them during the CGSCC walk in
the right order and can effectively rebuild the call graph afterward.

This was discovered when building code with Fortify enabled as that is
a common case of both inline definitions of library calls and
simplifications of code into calling them.

This can in extreme cases of LTO-ing with libc introduce *many* more
reference edges. I discussed a bunch of different options with folks but
all of them are unsatisfying. They either make the graph operations
substantially more complex even when there are *no* defined libfuncs, or
they introduce some other complexity into the callgraph. So this patch
goes with the simplest possible solution of actual synthetic reference
edges. If this proves to be a memory problem, I'm happy to implement one
of the clever techniques to save memory here.

llvm-svn: 308088
2017-07-15 08:08:19 +00:00
Kamil Rytarowski cce21c1dfe Make shell redirection construct portable
Summary:
NetBSD shell sh(1) does not support ">& /dev/null" construct.
This is bashism. The portable and POSIX solution is to use:
"> /dev/null 2>&1".

This change fixes 22 Unexpected Failures on NetBSD/amd64
for the "check-llvm" target.

Sponsored by <The NetBSD Foundation>

Reviewers: joerg, dim, rnk

Reviewed By: joerg, rnk

Subscribers: rnk, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D35277

llvm-svn: 307789
2017-07-12 13:24:46 +00:00
Hiroshi Inoue 0ca79dcf4b fix typos in comments; NFC
llvm-svn: 307626
2017-07-11 06:04:59 +00:00
Philip Pfaffe 730f2f9bb6 [PM] Enable registration of out-of-tree passes with PassBuilder
Summary:
This patch adds a callback registration API to the PassBuilder,
enabling registering out-of-tree passes with it.

Through the Callback API, callers may register callbacks with the
various stages at which passes are added into pass managers, including
parsing of a pass pipeline as well as at extension points within the
default -O pipelines.

Registering utilities like `require<>` and `invalidate<>` needs to be
handled manually by the caller, but a helper is provided.

Additionally, adding passes at pipeline extension points is exposed
through the opt tool. This patch adds a `-passes-ep-X` commandline
option for every extension point X, which opt parses into pipelines
inserted into that extension point.

Reviewers: chandlerc

Reviewed By: chandlerc

Subscribers: lksbhm, grosser, davide, mehdi_amini, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D33464

llvm-svn: 307532
2017-07-10 10:57:55 +00:00
Zachary Turner e9db96e6d9 Revert "[lit] Clean output directories before running tests."
This reverts commit da6318a92fba793e4f2447ec478b001392d57d43.

This is causing failures on some build bots due to what appears
to be some kind of lit ordering dependency.

llvm-svn: 306833
2017-06-30 16:05:03 +00:00
Zachary Turner 0955739b36 [lit] Clean output directories before running tests.
Presently lit leaks files in the tests' output directories.
Specifically, if a test creates output files, lit makes no
effort to remove them prior to the next test run.  This is
problematic because it leads to false positives whenever a
test passes because stale  files were present.  In general
it is a source of flakiness that should be removed.

This patch addresses this by building the list of all test
directories that are part of the current run set, and then
deleting those directories and recreating them anew.  This
gives each test a clean baseline to start from.

Differential Revision: https://reviews.llvm.org/D34732

llvm-svn: 306832
2017-06-30 16:01:30 +00:00
Tim Shen 664706916b [ThinkLTO] Invoke build(Thin)?LTOPreLinkDefaultPipeline.
Previously it doesn't actually invoke the designated new PM builder
functions.

This patch moves NameAnonGlobalPass out from PassBuilder, as Chandler
points out that PassBuilder is used for non-O0 builds, and for
optimizations only.

Differential Revision: https://reviews.llvm.org/D34728

llvm-svn: 306756
2017-06-29 23:08:38 +00:00
Geoff Berry 2573a19fe6 [EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE.
llvm-svn: 306477
2017-06-27 22:25:02 +00:00
Chandler Carruth 8b3be4e59d [PM/ThinLTO] Port the ThinLTO pipeline (both components) to the new PM.
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.

This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.

I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:

1) I sunk the PGO indirect call promotion to only be run when we have
   PGO enabled (or as part of the special ThinLTO pipeline).

2) I made the extra GlobalOpt run in ThinLTO just happen all the time
   and at a slightly more powerful place (before we remove available
   externaly functions). This seems like general goodness and not a big
   compile time sink, so it didn't make sense to *only* use it in
   ThinLTO. Fewer differences in the pipeline makes everything simpler
   IMO.

3) I hoisted the ThinLTO stop point pre-link above the the RPO function
   attr inference. The RPO inference won't infer anything terribly
   meaningful pre-link (recursiveness?) so it didn't make a lot of
   sense. But if the placement of RPO inference starts to matter, we
   should move it to the canonicalization phase anyways which seems like
   a better place for it (and there is a FIXME to this effect!). But
   that seemed a bridge too far for this patch.

If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.

I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.

Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.

Differential Revision: https://reviews.llvm.org/D33540

llvm-svn: 304407
2017-06-01 11:39:39 +00:00
Chandler Carruth 86248d5632 [PM] Enable the new simple loop unswitch pass in the new pass manager
(where it is the only realistic option).

This passes the LLVM test suite for me, but I'm clearly still hammering
on this.

llvm-svn: 303952
2017-05-26 01:24:11 +00:00
Easwaran Raman 5e6f9bd4f8 [PM] Add ProfileSummaryAnalysis as a required pass in the new pipeline.
Differential revision: https://reviews.llvm.org/D32768

llvm-svn: 302170
2017-05-04 16:58:45 +00:00
Chandler Carruth c246a4c973 Disable GVN Hoist due to still more bugs being found in it. There is
also a discussion about exactly what we should do prior to re-enabling
it.

The current bug is http://llvm.org/PR32821 and the discussion about this
is in the review thread for r300200.

llvm-svn: 301505
2017-04-27 00:28:03 +00:00
Filipe Cabecinhas 92dc348773 Simplify the CFG after loop pass cleanup.
Summary:
Otherwise we might end up with some empty basic blocks or
single-entry-single-exit basic blocks.

This fixes PR32085

Reviewers: chandlerc, danielcdh

Subscribers: mehdi_amini, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D30468

llvm-svn: 301395
2017-04-26 12:02:41 +00:00
Piotr Padlewski 610c966a4e Handle invariant.group.barrier in BasicAA
Summary:
llvm.invariant.group.barrier returns pointer that mustalias
pointer it takes. It can't be marked with `returned` attribute,
because it would be remove easily. The other reason is that
only Alias Analysis can know about this, because if any other
pass would know it, then the result would be replaced with it's
argument, which would be invalid.

We can think about returned pointer as something that mustalias, but
it doesn't have to be bitwise the same as the argument.

Reviewers: dberlin, chandlerc, hfinkel, sanjoy

Subscribers: reames, nlewycky, rsmith, anna, amharc

Differential Revision: https://reviews.llvm.org/D31585

llvm-svn: 301227
2017-04-24 19:37:17 +00:00
Piotr Padlewski 04aee46779 Remove readnone from invariant.group.barrier
Summary:
Readnone attribute would cause CSE of two barriers with
the same argument, which is invalid by example:

    struct Base {
          virtual int foo() { return 42; }
    };

    struct Derived1 : Base {
          int foo() override { return 50; }
    };

    struct Derived2 : Base {
          int foo() override { return 100; }
    };

    void foo() {
        Base *x = new Base{};
        new (x) Derived1{};
        int a = std::launder(x)->foo();
        new (x) Derived2{};
        int b = std::launder(x)->foo();
    }

Here 2 calls of std::launder will produce @llvm.invariant.group.barrier,
which would be merged into one call, causing devirtualization
to devirtualize second call into Derived1::foo() instead of
Derived2::foo()

Reviewers: chandlerc, dberlin, hfinkel

Subscribers: llvm-commits, rsmith, amharc

Differential Revision: https://reviews.llvm.org/D31531

llvm-svn: 300101
2017-04-12 20:45:12 +00:00
Rafael Espindola d31f04b319 Bring back r297624.
The issues was just a missing REQUIRES in the test.

llvm-svn: 297661
2017-03-13 20:00:25 +00:00
Rafael Espindola 3978b877d7 Revert "Fix crash when multiple raw_fd_ostreams to stdout are created."
This reverts commit r297624.
It was failing on the bots.

llvm-svn: 297657
2017-03-13 19:38:32 +00:00
Rafael Espindola 82d55239ea Fix crash when multiple raw_fd_ostreams to stdout are created.
If raw_fd_ostream is constructed with the path of "-", it claims
ownership of the stdout file descriptor. This means that it closes
stdout when it is destroyed. If there are multiple users of
raw_fd_ostream wrapped around stdout, then a crash can occur because
of operations on a closed stream.

An example of this would be running something like "clang -S -o - -MD
-MF - test.cpp". Alternatively, using outs() (which creates a local
version of raw_fd_stream to stdout) anywhere combined with such a
stream usage would cause the crash.

The fix duplicates the stdout file descriptor when used within
raw_fd_ostream, so that only that particular descriptor is closed when
the stream is destroyed.

Patch by James Henderson!

llvm-svn: 297624
2017-03-13 14:45:06 +00:00
Chandler Carruth 20e588e1af [PM/Inliner] Make the new PM's inliner process call edges across an
entire SCC before iterating on newly-introduced call edges resulting
from any inlined function bodies.

This more closely matches the behavior of the old PM's inliner. While it
wasn't really clear to me initially, this behavior is actually essential
to the inliner behaving reasonably in its current design.

Because the inliner is fundamentally a bottom-up inliner and all of its
cost modeling is designed around that it often runs into trouble within
an SCC where we don't have any meaningful bottom-up ordering to use. In
addition to potentially cyclic, infinite inlining that we block with the
inline history mechanism, it can also take seemingly simple call graph
patterns within an SCC and turn them into *insanely* large functions by
accidentally working top-down across the SCC without any of the
threshold limitations that traditional top-down inliners use.

Consider this diabolical monster.cpp file that Richard Smith came up
with to help demonstrate this issue:
```
template <int N> extern const char *str;

void g(const char *);

template <bool K, int N> void f(bool *B, bool *E) {
  if (K)
    g(str<N>);
  if (B == E)
    return;
  if (*B)
    f<true, N + 1>(B + 1, E);
  else
    f<false, N + 1>(B + 1, E);
}
template <> void f<false, MAX>(bool *B, bool *E) { return f<false, 0>(B, E); }
template <> void f<true, MAX>(bool *B, bool *E) { return f<true, 0>(B, E); }

extern bool *arr, *end;
void test() { f<false, 0>(arr, end); }
```

When compiled with '-DMAX=N' for various values of N, this will create an SCC
with a reasonably large number of functions. Previously, the inliner would try
to exhaust the inlining candidates in a single function before moving on. This,
unfortunately, turns it into a top-down inliner within the SCC. Because our
thresholds were never built for that, we will incrementally decide that it is
always worth inlining and proceed to flatten the entire SCC into that one
function.

What's worse, we'll then proceed to the next function, and do the exact same
thing except we'll skip the first function, and so on. And at each step, we'll
also make some of the constant factors larger, which is awesome.

The fix in this patch is the obvious one which makes the new PM's inliner use
the same technique used by the old PM: consider all the call edges across the
entire SCC before beginning to process call edges introduced by inlining. The
result of this is essentially to distribute the inlining across the SCC so that
every function incrementally grows toward the inline thresholds rather than
allowing the inliner to grow one of the functions vastly beyond the threshold.
The code for this is a bit awkward, but it works out OK.

We could consider in the future doing something more powerful here such as
prioritized order (via lowest cost and/or profile info) and/or a code-growth
budget per SCC. However, both of those would require really substantial work
both to design the system in a way that wouldn't break really useful
abstraction decomposition properties of the current inliner and to be tuned
across a reasonably diverse set of code and workloads. It also seems really
risky in many ways. I have only found a single real-world file that triggers
the bad behavior here and it is generated code that has a pretty pathological
pattern. I'm not worried about the inliner not doing an *awesome* job here as
long as it does *ok*. On the other hand, the cases that will be tricky to get
right in a prioritized scheme with a budget will be more common and idiomatic
for at least some frontends (C++ and Rust at least). So while these approaches
are still really interesting, I'm not in a huge rush to go after them. Staying
even closer to the existing PM's behavior, especially when this easy to do,
seems like the right short to medium term approach.

I don't really have a test case that makes sense yet... I'll try to find a
variant of the IR produced by the monster template metaprogram that is both
small enough to be sane and large enough to clearly show when we get this wrong
in the future. But I'm not confident this exists. And the behavior change here
*should* be unobservable without snooping on debug logging. So there isn't
really much to test.

The test case updates come from two incidental changes:
1) We now visit functions in an SCC in the opposite order. I don't think there
   really is a "right" order here, so I just update the test cases.
2) We no longer compute some analyses when an SCC has no call instructions that
   we consider for inlining.

llvm-svn: 297374
2017-03-09 11:35:40 +00:00
Zachary Turner b471d4f25a Teach lit to expand glob expressions.
This will enable removing hacks throughout the codebase
in clang and compiler-rt that feed multiple inputs to a
testing utility by globbing, all of which are either disabled
on Windows currently or using xargs / find hacks.

Differential Revision: https://reviews.llvm.org/D30380

llvm-svn: 296904
2017-03-03 18:55:24 +00:00
Daniel Berlin 283a60875e NewGVN: Add debug counter for value numbering
llvm-svn: 296665
2017-03-01 19:59:26 +00:00
Daniel Jasper 5a51f8cae4 s/REQUIRES: Asserts/REQUIRES: asserts/
Other than this, we consistently use lower case.

llvm-svn: 295623
2017-02-19 23:26:00 +00:00
Daniel Berlin 17b1375299 Re-add debugcounter.ll with Requires: Asserts so that it only triggers when asserts are on
llvm-svn: 295598
2017-02-19 06:45:02 +00:00
Daniel Berlin d46dfb3d0e Which, in turn, causes build bots to fail that have it unexpectedly passing. So remove debugcounter.ll for now
llvm-svn: 295597
2017-02-19 04:56:07 +00:00
Daniel Berlin 1ad7f5cab0 XFAIL this test until we figure out what to do here, since it will fail if NDEBUG defined
llvm-svn: 295596
2017-02-19 04:55:02 +00:00
Daniel Berlin a4b5c01dd2 Add a DebugCounter for PredicateInfo renaming, and an associated test
llvm-svn: 295594
2017-02-19 04:29:01 +00:00
Peter Collingbourne 10c500ddc0 opt: Rename -default-data-layout flag to -data-layout and make it always override the layout.
There isn't much point in a flag that only works if the data layout is empty.

Differential Revision: https://reviews.llvm.org/D30014

llvm-svn: 295468
2017-02-17 17:36:52 +00:00
Brian Cain 6dedf65cc9 Correct a typo, s/hosting/hoisting/
llvm-svn: 295066
2017-02-14 16:41:10 +00:00
Davide Italiano 513dfaa0a3 [PM] Hook up the instrumented PGO machinery in the new PM.
Differential Revision:  https://reviews.llvm.org/D29308

llvm-svn: 294955
2017-02-13 15:26:22 +00:00
Chandler Carruth 719ffe1a66 [PM] Add devirtualization-based iteration utility into the new PM's
default pipeline.

A clang with this patch built with ASan and asserts can build all of the
test-suite as well, so it seems to not uncover any latent problems.

Differential Revision: https://reviews.llvm.org/D29853

llvm-svn: 294888
2017-02-12 05:38:04 +00:00
Chandler Carruth e87fc8cb71 [PM] Enable GlobalsAA in the new PM's pipeline by default.
All the invalidation issues and bugs in this seem to be fixed, it has
survived a full build of the test suite plus SPEC with asserts and ASan
enabled on the Clang binary used.

Differential Revision: https://reviews.llvm.org/D29815

llvm-svn: 294887
2017-02-12 05:34:04 +00:00
Chandler Carruth 7bc6028d7d [PM] Relax the patterns used in the new test I added because some
compilers don't print the typedef name.

llvm-svn: 294729
2017-02-10 08:48:50 +00:00
Chandler Carruth f425292721 [PM] Fix a bug in the new loop PM when handling functions with no loops.
Without any loops, we don't even bother to build the standard analyses
used by loop passes. Without these, we can't run loop analyses or
invalidate them properly. Unfortunately, we did these things in the
wrong order which would allow a loop analysis manager's proxy to be
built but then not have the standard analyses built. When we went to do
the invalidation in the proxy thing would fall apart. In the test case
provided, it would actually crash.

The fix is to carefully check for loops first, and to in fact build the
standard analyses before building the proxy. This allows it to
correctly trigger invalidation for those standard analyses.

An alternative might seem to be  to look at whether there are any loops
when doing invalidation, but this doesn't work when during the loop
pipeline run we delete the last loop. I've even included that as a test
case. It is both simpler and more robust to defer building the proxy
until there are definitely the standard set of analyses and indeed
loops.

This bug was uncovered by enabling GlobalsAA in the pipeline.

llvm-svn: 294728
2017-02-10 08:26:58 +00:00
Chandler Carruth 0ede22e1c0 [PM] Add Argument Promotion to the pass pipeline.
This needs explicit requires of the optimization remark emission before
loop pass pipelines containing LICM as we no longer get it from the
inliner -- Argument Promotion may invalidate it. Technically the inliner
could also have broken this, but it never came up in testing.

Differential Revision: https://reviews.llvm.org/D29595

llvm-svn: 294670
2017-02-09 23:54:57 +00:00
Chandler Carruth baabda9317 [PM] Port LoopLoadElimination to the new pass manager and wire it into
the main pipeline.

This is a very straight forward port. Nothing weird or surprising.

This brings the number of missing passes from the new PM's pipeline down
to three.

llvm-svn: 293249
2017-01-27 01:32:26 +00:00
Chandler Carruth a95ff38924 [PM] Flesh out almost all of the late loop passes.
With this the per-module pass pipeline is *extremely* close to the
legacy PM. The missing pieces are:
- PruneEH (or some equivalent)
- ArgumentPromotion
- LoopLoadElimination
- LoopUnswitch

I'm going to work through those in essentially that order but this seems
like a worthwhile incremental step toward the end state.

One difference in what I have here from the legacy PM is that I've
consolidated some of the per-function passes at the very end of the
pipeline into the main optimization function pipeline. The intervening
passes are *really* uninteresting and so this seems very likely to have
any effect other than minor improvement to locality.

Note that there are still some failures in the test suite, but the
compiler doesn't crash or assert.

Differential Revision: https://reviews.llvm.org/D29114

llvm-svn: 293241
2017-01-27 00:50:21 +00:00
Chandler Carruth 79b733bc6b [PM] Enable the main loop pass pipelines with everything but
loop-unswitch in the main pipelines for the new PM.

All of these now work, and Clang built using this pipeline can build the
test suite and SPEC without hitting any asserts of ASan failures.

There are still some bugs hiding though -- 7 tests regress with the new
PM. I'm going to be investigating these, but it seems worthwhile to at
least get the pipelines in place so that others can play with them, and
they aren't completely broken.

Differential Revision: https://reviews.llvm.org/D29113

llvm-svn: 293225
2017-01-26 23:21:17 +00:00
Chandler Carruth 6acdca78a0 [PH] Replace uses of AssertingVH from members of analysis results with
a lazy-asserting PoisoningVH.

AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.

This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.

The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.

The rest is straight cleanup.

I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.

Differential Revision: https://reviews.llvm.org/D29006

llvm-svn: 292928
2017-01-24 12:55:57 +00:00
Chandler Carruth d7e0e6b514 [PM] Further fixes to the test case in r292863.
This should hopefully fix the MSVC failures remaining.

llvm-svn: 292887
2017-01-24 05:30:41 +00:00
Davide Italiano ea2dc02668 [PM] Try to make all three compilers happy when it comes to pretty printing.
Modeled after a similar change from Michael Kuperstein. Let's hope this
sticks together.

llvm-svn: 292872
2017-01-24 01:45:53 +00:00
Davide Italiano 089a912365 [PM] Flesh out the new pass manager LTO pipeline.
Differential Revision:  https://reviews.llvm.org/D28996

llvm-svn: 292863
2017-01-24 00:57:39 +00:00
Chandler Carruth e8c66b2766 [PM] Replace the hard invalidate in JumpThreading for LVI with correct
invalidation of deleted functions in GlobalDCE.

This was always testing a bug really triggered in GlobalDCE. Right now
we have analyses with asserting value handles into IR. As long as those
remain, when *deleting* an IR unit, we cannot wait for the normal
invalidation scheme to kick in even though it was designed to work
correctly in the face of these kinds of deletions. Instead, the pass
needs to directly handle invalidating the analysis results pointing at
that IR unit.

I've tought the Inliner about this and this patch teaches GlobalDCE.
This will handle the asserting VH case in the existing test as well as
other issues of the same fundamental variety. I've moved the test into
the GlobalDCE directory and added a comment explaining what is going on.

Note that we cannot simply require LVI here because LVI is too lazy.

llvm-svn: 292773
2017-01-23 08:33:24 +00:00
Chandler Carruth a504f2b8e8 [PM] Teach LVI to correctly invalidate itself when its dependencies
become unavailable.

The AssumptionCache is now immutable but it still needs to respond to
DomTree invalidation if it ended up caching one.

This lets us remove one of the explicit invalidates of LVI but the
other one continues to avoid hitting a latent bug.

llvm-svn: 292769
2017-01-23 06:35:12 +00:00
Chandler Carruth b698d5964d [PM] Fix a really nasty bug introduced when adding PGO support to the
new PM's inliner.

The bug happens when we refine an SCC after having computed a proxy for
the FunctionAnalysisManager, and then proceed to compute fresh analyses
for functions in the *new* SCC using the manager provided by the old
SCC's proxy. *And* when we manage to mutate a function in this new SCC
in a way that invalidates those analyses. This can be... challenging to
reproduce.

I've managed to contrive a set of functions that trigger this and added
a test case, but it is a bit brittle. I've directly checked that the
passes run in the expected ways to help avoid the test just becoming
silently irrelevant.

This gets the new PM back to passing the LLVM test suite after the PGO
improvements landed.

llvm-svn: 292757
2017-01-22 10:34:01 +00:00
Chandler Carruth 17350de1ca [PM] Teach the loop PM to run LoopSimplify prior to the loop pipeline.
This adds the last remaining core feature of the loop pass pipeline in
the new PM and removes the last of the really egregious hacks in the
LICM tests.

Sadly, this requires really substantial changes in the unittests in
order to provide and maintain simplified loops. This is particularly
hard because for example LoopSimplify will try to fold undef branches to
an ideal direction and simplify the loop accordingly.

Differential Revision: https://reviews.llvm.org/D28766

llvm-svn: 292709
2017-01-21 03:48:51 +00:00
Chandler Carruth 3cdf650770 [PM] Tidy up the spacing of this new, much nicer test file.
llvm-svn: 292592
2017-01-20 09:30:03 +00:00
Michael Kuperstein 568027aabb [PM] Attempt to pacify windows bots.
Another difference in type pretty-printing, this one windows-specific.

llvm-svn: 292556
2017-01-20 00:47:32 +00:00
Michael Kuperstein 853e3337db [PM] Make default pipeline test for the new PM strict
Use CHECK-NEXT to verify that a test breaks whenever unexpected passes,
analyses, or invalidations show up in default pipelines. The test case
is constructed so that we don't expect to invalidate anything, and needs
to be kept that way.

The test is slightly less strict than we'd like because of differences
in type pretty-printing.

(Right now it does show some invalidations - all of those are intentional
and temporary.)

Differential Revision: https://reviews.llvm.org/D28887

llvm-svn: 292536
2017-01-19 23:39:28 +00:00
Michael Kuperstein c9bb572b73 Revert r292530 since it breaks buildbots.
llvm-svn: 292534
2017-01-19 23:22:55 +00:00
Michael Kuperstein 5a52af0f63 [PM] Make default pipeline test for the new PM strict
Use CHECK-NEXT to verify that a test breaks whenever unexpected passes,
analyses, or invalidations show up in default pipelines. The test case
is constructed so that we don't expect to invalidate anything, and needs
to be kept that way.

(Right now it does show some invalidations - all of those are intentional
and temporary.)

Differential Revision: https://reviews.llvm.org/D28887

llvm-svn: 292530
2017-01-19 22:55:46 +00:00
Michael Kuperstein 8ecc38ef85 [PM] Add LoopVectorize to the default module pipeline
LV no longer "requires" LCSSA and LoopSimplify, and instead forms
them internally as required. So, there's nothing preventing it from
being enabled.

llvm-svn: 292464
2017-01-19 02:21:54 +00:00
Chandler Carruth b6e32daa81 [PM] Teach the LoopPassManager to automatically canonicalize loops by
runnig LCSSA over them prior to running the loop pipeline.

This also teaches the loop PM to verify that LCSSA form is preserved
throughout the pipeline's run across the loop nest.

Most of the test updates just leverage this new functionality. One has to be
relaxed with the new PM as IVUsers is less powerful when it sees LCSSA input.

Differential Revision: https://reviews.llvm.org/D28743

llvm-svn: 292241
2017-01-17 19:18:12 +00:00
Chandler Carruth 1ae34c35ba [PM] Teach the optimization remarks emitter to handle invalidation
events.

This pass sometimes has a pointer to BlockFrequencyInfo so it needs
custom invalidation logic. It is also otherwise immutable so we can
reduce the number of invalidations that happen substantially.

llvm-svn: 292058
2017-01-15 08:20:50 +00:00
Adam Nemet 6117caab58 Move test of lazy BFI with ORE to a generic directory
llvm-svn: 291862
2017-01-13 00:16:23 +00:00
Chandler Carruth 410eaeb064 [PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

Differential Revision: https://reviews.llvm.org/D28292

llvm-svn: 291651
2017-01-11 06:23:21 +00:00
Chandler Carruth 05ca5acc9e [PM] Introduce a devirtualization iteration layer for the new PM.
This is an orthogonal and separated layer instead of being embedded
inside the pass manager. While it adds a small amount of complexity, it
is fairly minimal and the composability and control seems worth the
cost.

The logic for this ends up being nicely isolated and targeted. It should
be easy to experiment with different iteration strategies wrapped around
the CGSCC bottom-up walk using this kind of facility.

The mechanism used to track devirtualization is the simplest one I came
up with. I think it handles most of the cases the existing iteration
machinery handles, but I haven't done a *very* in depth analysis. It
does however match the basic intended semantics, and we can tweak or
tune its exact behavior incrementally as necessary. One thing that we
may want to revisit is freshly building the value handle set on each
iteration. While I don't think this will be a significant cost (it is
strictly fewer value handles but more churn of value handes than the old
call graph), it is conceivable that we'll want a somewhat more clever
tracking mechanism. My hope is to layer that on as a follow up patch
with data supporting any implementation complexity it adds.

This code also provides for a basic count heuristic: if the number of
indirect calls decreases and the number of direct calls increases for
a given function in the SCC, we assume devirtualization is responsible.
This matches the heuristics currently used in the legacy pass manager.

Differential Revision: https://reviews.llvm.org/D23114

llvm-svn: 290665
2016-12-28 11:07:33 +00:00
Chandler Carruth 69c5cc69ed [PM] Actually commit the test update that was supposed to accompany
r290644. Sorry for this.

llvm-svn: 290646
2016-12-28 02:31:24 +00:00
Chandler Carruth aa35167578 [PM] Teach BasicAA how to invalidate its result object.
This requires custom handling because BasicAA caches handles to other
analyses and so it needs to trigger indirect invalidation.

This fixes one of the common crashes when using the new PM in real
pipelines. I've also tweaked a regression test to check that we are at
least handling the most immediate case.

I'm going to work at re-structuring this test some to both scale better
(rather than all being in one file) and check more invalidation paths in
a follow-up commit, but I wanted to get the basic bug fix in place.

llvm-svn: 290603
2016-12-27 10:30:45 +00:00
Chandler Carruth 81c8edaf5c [PM] Disable more of the loop passes -- LCSSA and LoopSimplify are also
not really wired into the loop pass manager in a way that will let us
productively use these passes yet.

This lets the new PM get farther in basic testing which is useful for
establishing a good baseline of "doesn't explode". There are still
plenty of crashers in basic testing though, this just gets rid of some
noise that is well understood and not representing a specific or narrow
bug.

llvm-svn: 290601
2016-12-27 10:16:46 +00:00
Chandler Carruth 17c630a09c [PM] Teach the AAManager and AAResults layer (the worst offender for
inter-analysis dependencies) to use the new invalidation infrastructure.

This teaches it to invalidate itself when any of the peer function
AA results that it uses become invalid. We do this by just tracking the
originating IDs. I've kept it in a somewhat clunky API since some users
of AAResults are outside the new PM right now. We can clean this API up
if/when those users go away.

Secondly, it uses the registration on the outer analysis manager proxy
to trigger deferred invalidation when a module analysis result becomes
invalid.

I've included test cases that specifically try to trigger use-after-free
in both of these cases and they would crash or hang pretty horribly for
me even without ASan. Now they work nicely.

The `InvalidateAnalysis` utility pass required some tweaking to be
useful in this context and it still is pretty garbage. I'd like to
switch it back to the previous implementation and teach the explicit
invalidate method on the AnalysisManager to take care of correctly
triggering indirect invalidation, but I wanted to go ahead and send this
out so folks could see how all of this stuff works together in practice.
And, you know, that it does actually work. =]

Differential Revision: https://reviews.llvm.org/D27205

llvm-svn: 290595
2016-12-27 08:44:39 +00:00
Chandler Carruth 060ad61fbe [PM] Add support for building a default AA pipeline to the PassBuilder.
Pretty boring and lame as-is but necessary. This is definitely a place
we'll end up with extension hooks longer term. =]

Differential Revision: https://reviews.llvm.org/D28076

llvm-svn: 290449
2016-12-23 20:38:19 +00:00
Chandler Carruth 0d1d49507b [PM] Loosen the check ever so slightly -- MSVC appears to not include
a space after the comma in template arguments with our hacky type name
system.

llvm-svn: 290331
2016-12-22 07:53:20 +00:00
Chandler Carruth ee6865f425 [PM] Make a couple of CHECK lines a bit more precise, NFC.
I was staring at these and didn't realize these were module-layer
proxies as opposed to some other layer. Justin and I have a plan to
rename things to make the names themselves much easier to reason about,
but I at least want the CHECK lines to be precise for now.

llvm-svn: 290328
2016-12-22 07:14:35 +00:00
Chandler Carruth e3f5064b72 [PM] Introduce a reasonable port of the main per-module pass pipeline
from the old pass manager in the new one.

I'm not trying to support (initially) the numerous options that are
currently available to customize the pass pipeline. If we end up really
wanting them, we can add them later, but I suspect many are no longer
interesting. The simplicity of omitting them will help a lot as we sort
out what the pipeline should look like in the new PM.

I've also documented to the best of my ability *why* each pass or group
of passes is used so that reading the pipeline is more helpful. In many
cases I think we have some questionable choices of ordering and I've
left FIXME comments in place so we know what to come back and revisit
going forward. But for now, I've left it as similar to the current
pipeline as I could.

Lastly, I've had to comment out several places where passes are not
ported to the new pass manager or where the loop pass infrastructure is
not yet ready. I did at least fix a few bugs in the loop pass
infrastructure uncovered by running the full pipeline, but I didn't want
to go too far in this patch -- I'll come back and re-enable these as the
infrastructure comes online. But I'd like to keep the comments in place
because I don't want to lose track of which passes need to be enabled
and where they go.

One thing that seemed like a significant API improvement was to require
that we don't build pipelines for O0. It seems to have no real benefit.

I've also switched back to returning pass managers by value as at this
API layer it feels much more natural to me for composition. But if
others disagree, I'm happy to go back to an output parameter.

I'm not 100% happy with the testing strategy currently, but it seems at
least OK. I may come back and try to refactor or otherwise improve this
in subsequent patches but I wanted to at least get a good starting point
in place.

Differential Revision: https://reviews.llvm.org/D28042

llvm-svn: 290325
2016-12-22 06:59:15 +00:00
Chandler Carruth cef2482875 [PM] Further broaden this test's regex as both the CGSCC and Function
inner AM proxies are now being rendered differently.

llvm-svn: 289319
2016-12-10 07:59:59 +00:00
Chandler Carruth d8aecb0e5c [PM] Try to support the new spelling of one of the proxy names that are
showing up on the build bots.

llvm-svn: 289318
2016-12-10 07:46:51 +00:00
Chandler Carruth 6b9816477b [PM] Support invalidation of inner analysis managers from a pass over the outer IR unit.
Summary:
This never really got implemented, and was very hard to test before
a lot of the refactoring changes to make things more robust. But now we
can test it thoroughly and cleanly, especially at the CGSCC level.

The core idea is that when an inner analysis manager proxy receives the
invalidation event for the outer IR unit, it needs to walk the inner IR
units and propagate it to the inner analysis manager for each of those
units. For example, each function in the SCC needs to get an
invalidation event when the SCC gets one.

The function / module interaction is somewhat boring here. This really
becomes interesting in the face of analysis-backed IR units. This patch
effectively handles all of the CGSCC layer's needs -- both invalidating
SCC analysis and invalidating function analysis when an SCC gets
invalidated.

However, this second aspect doesn't really handle the
LoopAnalysisManager well at this point. That one will need some change
of design in order to fully integrate, because unlike the call graph,
the entire function behind a LoopAnalysis's results can vanish out from
under us, and we won't even have a cached API to access. I'd like to try
to separate solving the loop problems into a subsequent patch though in
order to keep this more focused so I've adapted them to the API and
updated the tests that immediately fail, but I've not added the level of
testing and validation at that layer that I have at the CGSCC layer.

An important aspect of this change is that the proxy for the
FunctionAnalysisManager at the SCC pass layer doesn't work like the
other proxies for an inner IR unit as it doesn't directly manage the
FunctionAnalysisManager and invalidation or clearing of it. This would
create an ever worsening problem of dual ownership of this
responsibility, split between the module-level FAM proxy and this
SCC-level FAM proxy. Instead, this patch changes the SCC-level FAM proxy
to work in terms of the module-level proxy and defer to it to handle
much of the updates. It only does SCC-specific invalidation. This will
become more important in subsequent patches that support more complex
invalidaiton scenarios.

Reviewers: jlebar

Subscribers: mehdi_amini, mcrosier, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D27197

llvm-svn: 289317
2016-12-10 06:34:44 +00:00
Peter Collingbourne 0a4fc46321 Analysis: gep inbounds (gep inbounds (...)) is inbounds.
Differential Revision: https://reviews.llvm.org/D26441

llvm-svn: 287604
2016-11-22 01:03:40 +00:00
Matthias Braun db39fd6c53 Statistic/Timer: Include timers in PrintStatisticsJSON().
Differential Revision: https://reviews.llvm.org/D25588

llvm-svn: 287370
2016-11-18 19:43:24 +00:00
Dehao Chen 947dbe1254 Enable Loop Sink pass for functions that has profile.
Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code.

Reviewers: davidxl, hfinkel

Subscribers: mehdi_amini, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26155

llvm-svn: 286325
2016-11-09 00:58:19 +00:00
Reid Kleckner 4500f74858 [lit] Work around Windows MSys command line tokenization bug
Summary:
This will allow us to revert LLD r284768, which added spaces to get MSys
echo to print what we want.

Reviewers: ruiu, inglorion, rafael

Subscribers: modocache, llvm-commits

Differential Revision: https://reviews.llvm.org/D26009

llvm-svn: 285237
2016-10-26 20:29:27 +00:00
Sriraman Tallam 06a67ba57d [PM] Port CFGViewer and CFGPrinter to the new Pass Manager
Differential Revision: https://reviews.llvm.org/D24592

llvm-svn: 281640
2016-09-15 18:35:27 +00:00
Chandler Carruth 8882346842 [PM] Introduce basic update capabilities to the new PM's CGSCC pass
manager, including both plumbing and logic to handle function pass
updates.

There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
   CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
   the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.

I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.

The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.

I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.

The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:

- We operate at three levels within the infrastructure: RefSCC, SCC, and
  Node. In each case, we are working bottom up and so we want to
  continue to iterate on the "lowest" node as the graph changes. Look at
  how we iterate over nodes in an SCC running function passes as those
  function passes mutate the CG. We continue to iterate on the "lowest"
  SCC, which is the one that continues to contain the function just
  processed.

- The call graph structure re-uses SCCs (and RefSCCs) during mutation
  events for the *highest* entry in the resulting new subgraph, not the
  lowest. This means that it is necessary to continually update the
  current SCC or RefSCC as it shifts. This is really surprising and
  subtle, and took a long time for me to work out. I actually tried
  changing the call graph to provide the opposite behavior, and it
  breaks *EVERYTHING*. The graph update algorithms are really deeply
  tied to this particualr pattern.

- When SCCs or RefSCCs are split apart and refined and we continually
  re-pin our processing to the bottom one in the subgraph, we need to
  enqueue the newly formed SCCs and RefSCCs for subsequent processing.
  Queuing them presents a few challenges:
  1) SCCs and RefSCCs use wildly different iteration strategies at
     a high level. We end up needing to converge them on worklist
     approaches that can be extended in order to be able to handle the
     mutations.
  2) The order of the enqueuing need to remain bottom-up post-order so
     that we don't get surprising order of visitation for things like
     the inliner.
  3) We need the worklists to have set semantics so we don't duplicate
     things endlessly. We don't need a *persistent* set though because
     we always keep processing the bottom node!!!! This is super, super
     surprising to me and took a long time to convince myself this is
     correct, but I'm pretty sure it is... Once we sink down to the
     bottom node, we can't re-split out the same node in any way, and
     the postorder of the current queue is fixed and unchanging.
  4) We need to make sure that the "current" SCC or RefSCC actually gets
     enqueued here such that we re-visit it because we continue
     processing a *new*, *bottom* SCC/RefSCC.

- We also need the ability to *skip* SCCs and RefSCCs that get merged
  into a larger component. We even need the ability to skip *nodes* from
  an SCC that are no longer part of that SCC.

This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.

We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.

Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:

- It is really nice to do this a function at a time because that
  function is likely hot in the cache. This means we want even the
  function pass adaptor to support online updates to the call graph!

- To update the call graph after arbitrary function pass mutations is
  quite hard. We have to build a fairly comprehensive set of
  data structures and then process them. Fortunately, some of this code
  is related to the code for building the cal graph in the first place.
  Unfortunately, very little of it makes any sense to share because the
  nature of what we're doing is so very different. I've factored out the
  one part that made sense at least.

- We need to transfer these updates into the various structures for the
  CGSCC pass manager. Once those were more sanely worked out, this
  became relatively easier. But some of those needs necessitated changes
  to the LazyCallGraph interface to make it significantly easier to
  extract the changed SCCs from an update operation.

- We also need to update the CGSCC analysis manager as the shape of the
  graph changes. When an SCC is merged away we need to clear analyses
  associated with it from the analysis manager which we didn't have
  support for in the analysis manager infrsatructure. New SCCs are easy!
  But then we have the case that the original SCC has its shape changed
  but remains in the call graph. There we need to *invalidate* the
  analyses associated with it.

- We also need to invalidate analyses after we *finish* processing an
  SCC. But the analyses we need to invalidate here are *only those for
  the newly updated SCC*!!! Because we only continue processing the
  bottom SCC, if we split SCCs apart the original one gets invalidated
  once when its shape changes and is not processed farther so its
  analyses will be correct. It is the bottom SCC which continues being
  processed and needs to have the "normal" invalidation done based on
  the preserved analyses set.

All of this is mostly background and context for the changes here.

Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.

Differential Revision: http://reviews.llvm.org/D21464

llvm-svn: 279618
2016-08-24 09:37:14 +00:00
Chandler Carruth 8abdf75d6b [PM] Introduce an abstraction for all the analyses over a particular IR
unit for use in the PreservedAnalyses set.

This doesn't have any important functional change yet but it cleans
things up and makes the analysis substantially more efficient by
avoiding querying through the type erasure for every analysis.

I also think it makes it much easier to reason about how analyses are
preserved when walking across pass managers and across IR unit
abstractions.

Thanks to Sean and Mehdi both for the comments and suggestions.

Differential Revision: https://reviews.llvm.org/D23691

llvm-svn: 279360
2016-08-20 04:57:28 +00:00
Chandler Carruth a053a88df5 [PM] Change the name of the repeating utility to something less
overloaded (and simpler).

Sean rightly pointed out in code review that we've started using
"wrapper pass" as a specific part of the old pass manager, and in fact
it is more applicable there. Here, we really have a pass *template* to
build a repeated pass, so call it that.

llvm-svn: 277689
2016-08-04 03:52:53 +00:00
Matthias Braun 4dc6933d44 opt-bisect-legacy-pass-manager.ll: Test only works with default triple configured
llvm-svn: 277645
2016-08-03 20:28:19 +00:00
Chandler Carruth 241bf2456f [PM] Add a generic 'repeat N times' pass wrapper to the new pass
manager.

While this has some utility for debugging and testing on its own, it is
primarily intended to demonstrate the technique for adding custom
wrappers that can provide more interesting interation behavior in
a nice, orthogonal, and composable layer.

Being able to write these kinds of very dynamic and customized controls
for running passes was one of the motivating use cases of the new pass
manager design, and this gives a hint at how they might look. The actual
logic is tiny here, and most of this is just wiring in the pipeline
parsing so that this can be widely used.

I'm adding this now to show the wiring without a lot of business logic.
This is a precursor patch for showing how a "iterate up to N times as
long as we devirtualize a call" utility can be added as a separable and
composable component along side the CGSCC pass management.

Differential Revision: https://reviews.llvm.org/D22405

llvm-svn: 277581
2016-08-03 07:44:48 +00:00
Chandler Carruth 6cb2ab2c60 [PM] Significantly refactor the pass pipeline parsing to be easier to
reason about and less error prone.

The core idea is to fully parse the text without trying to identify
passes or structure. This is done with a single state machine. There
were various bugs in the logic around this previously that were repeated
and scattered across the code. Having a single routine makes it much
easier to fix and get correct. For example, this routine doesn't suffer
from PR28577.

Then the actual pass construction is handled using *much* easier to read
code and simple loops, with particular pass manager construction sunk to
live with other pass construction. This is especially nice as the pass
managers *are* in fact passes.

Finally, the "implicit" pass manager synthesis is done much more simply
by forming "pre-parsed" structures rather than having to duplicate tons
of logic.

One of the bugs fixed by this was evident in the tests where we accepted
a pipeline that wasn't really well formed. Another bug is PR28577 for
which I have added a test case.

The code is less efficient than the previous code but I'm really hoping
that's not a priority. ;]

Thanks to Sean for the review!

Differential Revision: https://reviews.llvm.org/D22724

llvm-svn: 277561
2016-08-03 03:21:41 +00:00
Andrew Kaylor 8b8805c94c Temporarily remove one test run line to unblock PPC bots.
llvm-svn: 274812
2016-07-08 00:32:58 +00:00
Andrew Kaylor 65fa0704aa Include SelectionDAGISel in the opt-bisect process
Differential Revision: http://reviews.llvm.org/D21143

llvm-svn: 274786
2016-07-07 18:55:02 +00:00
Chandler Carruth dca834089a [PM] Improve the debugging and logging facilities of the CGSCC bits of
the new pass manager.

This adds operator<< overloads for the various bits of the
LazyCallGraph, dump methods for use from the debugger, and debug logging
using them to the CGSCC pass manager.

Having this was essential for debugging the call graph update patch, and
I've extracted what I could from that patch here to minimize the delta.

llvm-svn: 273961
2016-06-27 23:26:08 +00:00
Matthias Braun 98ea88be42 Statistic: Add machine parseable json output
- We lacked a short unique identifier for a statistics, so I renamed the
  current "Name" field that just contained the DEBUG_TYPE name of the
  current file to DebugType and added a new "Name" field that contains
  the C++ identifier of the statistic variable.
- Add the -stats-json option which outputs statistics in json format.

Differential Revision: http://reviews.llvm.org/D20995

llvm-svn: 272826
2016-06-15 20:19:16 +00:00
Peter Collingbourne 96efdd6107 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

llvm-svn: 272709
2016-06-14 21:01:22 +00:00
Manuel Jacob a485984c0c [PM] Schedule InstSimplify after late LICM run, to clean up LCSSA nodes.
Summary:
The module pass pipeline includes a late LICM run after loop
unrolling.  LCSSA is implicitly run as a pass dependency of LICM.  However no
cleanup pass was run after this, so the LCSSA nodes ended in the optimized output.

Reviewers: hfinkel, mehdi_amini

Subscribers: majnemer, bruno, mzolotukhin, mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D20606

llvm-svn: 271602
2016-06-02 22:14:26 +00:00
Andrew Kaylor 04f8e06696 Update the stack coloring pass to remove lifetime intrinsics in the optnone/opt-bisect skip case.
Differential Revision: http://reviews.llvm.org/D20453

llvm-svn: 271068
2016-05-27 22:56:49 +00:00
Andrew Kaylor 50271f787e Add opt-bisect support to additional passes that can be skipped
Differential Revision: http://reviews.llvm.org/D19882

llvm-svn: 268457
2016-05-03 22:32:30 +00:00
Mehdi Amini 7f7d8be518 Move "Eliminate Available Externally" immediately after the inliner
This pass is supposed to reduce the size of the IR for compile time
purpose. We should run it ASAP, except when we prepare for LTO or
ThinLTO, and we want to keep them available for link-time inline.

Differential Revision: http://reviews.llvm.org/D19813

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268394
2016-05-03 15:46:00 +00:00
Mehdi Amini 45c7b3ecb5 Move createReversePostOrderFunctionAttrsPass right after the inliner is done
This is where it was originally, until LoopVersioningLICM was
inserted before in r259986, I don't believe it was on purpose.

Differential Revision: http://reviews.llvm.org/D19809

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268252
2016-05-02 16:53:16 +00:00
Nico Weber 2f1459cbb7 Try to get ResponseFile.ll passing on Windows after r267556.
llvm-svn: 267599
2016-04-26 20:32:51 +00:00
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Chandler Carruth 4c660f7087 [CG] Add a new pass manager printer pass for the old call graph and
actually finish wiring up the old call graph.

There were bugs in the old call graph that hadn't been caught because it
wasn't being tested. It wasn't being tested because it wasn't in the
pipeline system and we didn't have a printing pass to run in tests. This
fixes all of that.

As for why I'm still keeping the old call graph alive its so that I can
port GlobalsAA to the new pass manager with out forking it to work with
the lazy call graph. That's clearly the right eventual design, but it
seems pragmatic to defer that until its necessary. The old call graph
works just fine for GlobalsAA.

llvm-svn: 263104
2016-03-10 11:24:11 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Chandler Carruth 8b5a7419b8 [PM] Wire up optimization levels and default pipeline construction APIs
in the PassBuilder.

These are really just stubs for now, but they give a nice API surface
that Clang or other tools can start learning about and enabling for
experimentation.

I've also wired up parsing various synthetic module pass names to
generate these set pipelines. This allows the pipelines to be combined
with other passes and have their order controlled, with clear separation
between the *kind* of canned pipeline, and the *level* of optimization
to be used within that canned pipeline.

The most interesting part of this patch is almost certainly the spec for
the different optimization levels. I don't think we can ever have hard
and fast rules that would make it easy to determine whether a particular
optimization makes sense at a particular level -- it will always be in
large part a judgement call. But hopefully this will outline the
expected rationale that should be used, and the direction that the
pipelines should be taken. Much of this was based on a long llvm-dev
discussion I started years ago to try and crystalize the intent behind
these pipelines, and now, at long long last I'm returning to the task of
actually writing it down somewhere that we can cite and try to be
consistent with.

Differential Revision: http://reviews.llvm.org/D12826

llvm-svn: 262196
2016-02-28 22:16:03 +00:00
Chandler Carruth 30811a4dde [PM] Loosen the regex for the proxy template name even further to cope
with 'class' keywords in the template arguments and other silliness.

llvm-svn: 262130
2016-02-27 11:07:16 +00:00
Chandler Carruth 08a25ce0e3 [PM] Use a boring regex instead of explicitly naming the analysis
manager as some compilers print the typedef name and others print the
"canonical" name of the underlying class template.

This isn't really an important artifact of the test anyways so it seems
fine to just loosen the test assertions here.

llvm-svn: 262129
2016-02-27 10:48:14 +00:00
Chandler Carruth 2a54094d40 [PM] Provide two templates for the two directionalities of analysis
manager proxies and use those rather than repeating their definition
four times.

There are real differences between the two directions: outer AMs are
const and don't need to have invalidation tracked. But every proxy in
a particular direction is identical except for the analysis manager type
and the IR unit they proxy into. This makes them prime candidates for
nice templates.

I've started introducing explicit template instantiation declarations
and definitions as well because we really shouldn't be emitting all this
everywhere. I'm going to go back and add the same for the other
templates like this in a follow-up patch.

I've left the analysis manager as an opaque type rather than using two
IR units and requiring it to be an AnalysisManager template
specialization. I think its important that users retain the ability to
provide their own custom analysis management layer and provided it has
the appropriate API everything should Just Work.

llvm-svn: 262127
2016-02-27 10:38:10 +00:00
Chandler Carruth 3a63435551 [PM] Introduce CRTP mixin base classes to help define passes and
analyses in the new pass manager.

These just handle really basic stuff: turning a type name into a string
statically that is nice to print in logs, and getting a static unique ID
for each analysis.

Sadly, the format of passes in anonymous namespaces makes using their
names in tests really annoying so I've customized the names of the no-op
passes to keep tests sane to read.

This is the first of a few simplifying refactorings for the new pass
manager that should reduce boilerplate and confusion.

llvm-svn: 262004
2016-02-26 11:44:45 +00:00
Chandler Carruth 395fe57374 [PM] Add the IR unit type to the pass manager's logging and make all of
the testing more more explicit.

This will currently fail on platforms without support for getTypeName.
While an assert failure seems too harsh, I'm hoping we're OK with the
regression test failure, and I'd like to find out about what platforms
actually exist in this state if there are any so we can get
implementations in place for them.

But if we just can't fix all the host compilers to have a reasonably
portable variant of getTypeName and are worried about xfailing this test
on those platforms, I can add the horrible regular expression magic to
make the tests support "unknown" here as well.

llvm-svn: 261853
2016-02-25 10:27:39 +00:00
Justin Bogner eecc3c826a PM: Implement a basic loop pass manager
This creates the new-style LoopPassManager and wires it up with dummy
and print passes.

This version doesn't support modifying the loop nest at all. It will
be far easier to discuss and evaluate the approaches to that with this
in place so that the boilerplate is out of the way.

llvm-svn: 261831
2016-02-25 07:23:08 +00:00
Chandler Carruth 31088a9d58 [LPM] Factor all of the loop analysis usage updates into a common helper
routine.

We were getting this wrong in small ways and generally being very
inconsistent about it across loop passes. Instead, let's have a common
place where we do this. One minor downside is that this will require
some analyses like SCEV in more places than they are strictly needed.
However, this seems benign as these analyses are complete no-ops, and
without this consistency we can in many cases end up with the legacy
pass manager scheduling deciding to split up a loop pass pipeline in
order to run the function analysis half-way through. It is very, very
annoying to fix these without just being very pedantic across the board.

The only loop passes I've not updated here are ones that use
AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer.
They seemed less relevant.

With this patch, almost all of the problems in PR24804 around loop pass
pipelines are fixed. The one remaining issue is that we run simplify-cfg
and instcombine in the middle of the loop pass pipeline. We've recently
added some loop variants of these passes that would seem substantially
cleaner to use, but this at least gets us much closer to the previous
state. Notably, the seven loop pass managers is down to three.

I've not updated the loop passes using LoopAccessAnalysis because that
analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't
clear that those transforms want to support those forms anyways. They
all run late anyways, so this is harmless. Similarly, LSR is left alone
because it already carefully manages its forms and doesn't need to get
fused into a single loop pass manager with a bunch of other loop passes.

LoopReroll didn't use loop simplified form previously, and I've updated
the test case to match the trivially different output.

Finally, I've also factored all the pass initialization for the passes
that use this technique as well, so that should be done regularly and
reliably.

Thanks to James for the help reviewing and thinking about this stuff,
and Ben for help thinking about it as well!

Differential Revision: http://reviews.llvm.org/D17435

llvm-svn: 261316
2016-02-19 10:45:18 +00:00
Chandler Carruth 1aff022c9b [LPM] Actually test what the O2 pass pipeline consists of in key places,
especially the *structure* of it with respect to various pass managers.

This uncovers an absolute horror show of problems. This test shows just
how bad PR24804 is: we have a totaly of *seven* loop pass managers in
the main optimization pipeline.

I've tried to comment the various bits to the best of my knowledge, but
more enhancements here would be great.

Also great would be folks adding various test for other pipelines, I'm
focused on trying to fix the O2 pipeline. I just wanted a test to show
what I'm changing.

llvm-svn: 261305
2016-02-19 04:09:40 +00:00
Chandler Carruth edf5996b06 [PM/AA] Teach the new pass manager to use pass-by-lambda for registering
analysis passes, support pre-registering analyses, and use that to
implement parsing and pre-registering a custom alias analysis pipeline.

With this its possible to configure the particular alias analysis
pipeline used by the AAManager from the commandline of opt. I've updated
the test to show this effectively in use to build a pipeline including
basic-aa as part of it.

My big question for reviewers are around the APIs that are used to
expose this functionality. Are folks happy with pass-by-lambda to do
pass registration? Are folks happy with pre-registering analyses as
a way to inject customized instances of an analysis while still using
the registry for the general case?

Other thoughts of course welcome. The next round of patches will be to
add the rest of the alias analyses into the new pass manager and wire
them up here so that they can be used from opt. This will require
extending the (somewhate limited) functionality of AAManager w.r.t.
module passes.

Differential Revision: http://reviews.llvm.org/D17259

llvm-svn: 261197
2016-02-18 09:45:17 +00:00
Chandler Carruth bece8d517d [PM/AA] Wire BasicAA's new pass manager class up to the pass registry.
This ensures that all of the various pieces are working. The next patch
will wire up commandline-driven alias analysis chain building and allow
BasicAA to work with the AAManager.

llvm-svn: 260838
2016-02-13 23:46:24 +00:00
Chandler Carruth 6f5770b10f [PM/AA] Actually wire the AAManager I built for the new pass manager
into the new pass manager and fix the latent bugs there.

This lets everything live together nicely, but it isn't really useful
yet. I never finished wiring the AA layer up for the new pass manager,
and so subsequent patches will change this to do that wiring and get AA
stuff more fully integrated into the new pass manager. Turns out this is
necessary even to get functionattrs ported over. =]

llvm-svn: 260836
2016-02-13 23:32:00 +00:00
Weiming Zhao 0f1762caf9 Recommit r256952 "Filtering IR printing for print-after-all/print-before-all"
Fix lit test fail due to outputting an extra line.

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256987
2016-01-06 22:55:03 +00:00
Weiming Zhao b243c95c6a Revert r256952 due to lit test fails.
llvm-svn: 256954
2016-01-06 18:31:44 +00:00
Weiming Zhao eac0636805 Filtering IR printing for print-after-all/print-before-all
Summary:
This patch implements "-print-funcs" option to support function filtering for IR printing like -print-after-all, -print-before etc.
Examples:
  -print-after-all -print-funcs=foo,bar

Reviewers: mcrosier, joker.eph

Subscribers: tejohnson, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256952
2016-01-06 18:20:25 +00:00
Keno Fischer 04464cf731 [llc/opt] Add an option to run all passes twice
Summary: Lately, I have submitted a number of patches to fix bugs that
only occurred when using the same pass manager to compile multiple
modules (generally these bugs are failure to reset some persistent
state). Unfortunately I don't think there is currently a way to test
that from the command line. This adds a very simple flag to both llc
and opt, under which the tools will simply re-run their respective
pass pipelines using the same pass manager on (a clone of the same
module). Additionally, we verify that both outputs are bitwise the
same.

Reviewers: yaron.keren

Subscribers: loladiro, yaron.keren, kcc, llvm-commits

Differential Revision: http://reviews.llvm.org/D14965

llvm-svn: 254774
2015-12-04 21:56:46 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Mehdi Amini d178f4fc89 Make the default triple optional by allowing an empty string
When building LLVM as a (potentially dynamic) library that can be linked against
by multiple compilers, the default triple is not really meaningful.
We allow to explicitely set it to an empty string when configuring LLVM.
In this case, said "target independent" tests in the test suite that are using
the default triple are disabled by matching the newly available feature
"default_triple".

Reviewers: probinson, echristo
Differential Revision: http://reviews.llvm.org/D12660

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247775
2015-09-16 05:34:32 +00:00
David Blaikie 2f40830dde [opaque pointer type] Add textual IR support for explicit type parameter for global aliases
update.py:
import fileinput
import sys
import re

alias_match_prefix = r"(.*(?:=|:|^)\s*(?:external |)(?:(?:private|internal|linkonce|linkonce_odr|weak|weak_odr|common|appending|extern_weak|available_externally) )?(?:default |hidden |protected )?(?:dllimport |dllexport )?(?:unnamed_addr |)(?:thread_local(?:\([a-z]*\))? )?alias"
plain = re.compile(alias_match_prefix + r" (.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|addrspacecast|\[\[[a-zA-Z]|\{\{).*$)")
cast  = re.compile(alias_match_prefix + r") ((?:bitcast|inttoptr|addrspacecast)\s*\(.* to (.*?)(| addrspace\(\d+\) *)\*\)\s*(?:;.*)?$)")
gep   = re.compile(alias_match_prefix + r") ((?:getelementptr)\s*(?:inbounds)?\s*\((?P<type>.*), (?P=type)(?:\s*addrspace\(\d+\)\s*)?\* .*\)\s*(?:;.*)?$)")

def conv(line):
  m = re.match(cast, line)
  if m:
    return m.group(1) + " " + m.group(3) + ", " + m.group(2)
  m = re.match(gep, line)
  if m:
    return m.group(1) + " " + m.group(3) + ", " + m.group(2)
  m = re.match(plain, line)
  if m:
    return m.group(1) + ", " + m.group(2) + m.group(3) + "*" + m.group(4) + "\n"
  return line

for line in sys.stdin:
  sys.stdout.write(conv(line))

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

llvm-svn: 247378
2015-09-11 03:22:04 +00:00
Mehdi Amini c8d5783114 Update test suite to make "ninja check" succeed without native backend builtin
Requires "native" feature in most places that were failing.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 243960
2015-08-04 06:32:54 +00:00
Reid Kleckner fc0f93832b [llvm-extract] Drop comdats from declarations
The verifier rejects comdats on declarations.

llvm-svn: 241483
2015-07-06 18:48:02 +00:00
David Majnemer 7fddeccb8b Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

llvm-svn: 239940
2015-06-17 20:52:32 +00:00
Akira Hatanaka 3058d0f080 Let llc and opt override "-target-cpu" and "-target-features" via command line
options.

This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't
override function attributes "-target-cpu" and "-target-features" in the IR.

Differential Revision: http://reviews.llvm.org/D9537

llvm-svn: 236677
2015-05-06 23:54:14 +00:00
David Blaikie 23af64846f [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction
See r230786 and r230794 for similar changes to gep and load
respectively.

Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.

When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.

This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.

This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).

No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.

This leaves /only/ the varargs case where the explicit type is required.

Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.

About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.

import fileinput
import sys
import re

pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")

def conv(match, line):
  if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
    return line
  return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

for line in sys.stdin:
  sys.stdout.write(conv(re.search(pat, line), line))

llvm-svn: 235145
2015-04-16 23:24:18 +00:00
David Blaikie f72d05bc7b [opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.

Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.

(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)

import fileinput
import sys
import re

rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)

def conv(match):
  line = match.group(1)
  line += match.group(4)
  line += ", "
  line += match.group(2)
  return line

line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
  sys.stdout.write(line[off:match.start()])
  sys.stdout.write(conv(match))
  off = match.end()
sys.stdout.write(line[off:])

llvm-svn: 232184
2015-03-13 18:20:45 +00:00
Mehdi Amini 46a43556db Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
2015-03-04 18:43:29 +00:00
David Blaikie a79ac14fa6 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

llvm-svn: 230794
2015-02-27 21:17:42 +00:00
David Blaikie 79e6c74981 [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

llvm-svn: 230786
2015-02-27 19:29:02 +00:00
Reid Kleckner 96d011315a Don't promote asynch EH invokes of nounwind functions to calls
If the landingpad of the invoke is using a personality function that
catches asynch exceptions, then it can catch a trap.

Also add some landingpads to invalid LLVM IR test cases that lack them.

Over-the-shoulder reviewed by David Majnemer.

llvm-svn: 228782
2015-02-11 01:23:16 +00:00
Chandler Carruth 9f8d9b613c [PM] Teach the module-to-function adaptor to not run function passes
over declarations.

This is both quite unproductive and causes things to crash, for example
domtree would just assert.

I've added a declaration and a domtree run to the basic high-level tests
for the new pass manager.

llvm-svn: 227724
2015-02-01 10:47:25 +00:00
Chandler Carruth e038552c8a [PM] Port TTI to the new pass manager, introducing a TargetIRAnalysis to
produce it.

This adds a function to the TargetMachine that produces this analysis
via a callback for each function. This in turn faves the way to produce
a *different* TTI per-function with the correct subtarget cached.

I've also done the necessary wiring in the opt tool to thread the target
machine down and make it available to the pass registry so that we can
construct this analysis from a target machine when available.

llvm-svn: 227721
2015-02-01 10:11:22 +00:00
Yunzhong Gao a8cf495a15 If we see UTF-8 BOM sequence at the beginning of a response file, we shall
remove these bytes before parsing.

Phabricator Revision: http://reviews.llvm.org/D7156

llvm-svn: 226988
2015-01-24 04:23:08 +00:00
Chandler Carruth 8ca43224db [PM] Port TargetLibraryInfo to the new pass manager, provided by the
TargetLibraryAnalysis pass.

There are actually no direct tests of this already in the tree. I've
added the most basic test that the pass manager bits themselves work,
and the TLI object produced will be tested by an upcoming patches as
they port passes which rely on TLI.

This is starting to point out the awkwardness of the invalidate API --
it seems poorly fitting on the *result* object. I suspect I will change
it to live on the analysis instead, but that's not for this change, and
I'd rather have a few more passes ported in order to have more
experience with how this plays out.

I believe there is only one more analysis required in order to start
porting instcombine. =]

llvm-svn: 226160
2015-01-15 11:39:46 +00:00
Chandler Carruth 703378f156 [PM] Remove the defunt CGSCC-specific debug flag.
Even before I sunk the debug flag into the opt tool this had been made
obsolete by factoring the pass and analysis managers into a single set
of templates that all used the core flag. No functionality changed here.

llvm-svn: 225842
2015-01-13 22:45:13 +00:00
Chandler Carruth 816702ffe0 [PM] Refactor the new pass manager to use a single template to implement
the generic functionality of the pass managers themselves.

In the new infrastructure, the pass "manager" isn't actually interesting
at all. It just pipelines a single chunk of IR through N passes. We
don't need to know anything about the IR or the passes to do this really
and we can replace the 3 implementations of the exact same functionality
with a single generic PassManager template, complementing the single
generic AnalysisManager template.

I've left typedefs in place to give convenient names to the various
obvious instantiations of the template.

With this, I think I've nuked almost all of the redundant logic in the
managers, and I think the overall design is actually simpler for having
single templates that clearly indicate there is no special logic here.
The logging is made somewhat more annoying by this change, but I don't
think the difference is worth having heavy-weight traits to help log
things.

llvm-svn: 225783
2015-01-13 11:13:56 +00:00
Chandler Carruth 7ad6d620b7 [PM] Fold all three analysis managers into a single AnalysisManager
template.

This consolidates three copies of nearly the same core logic. It adds
"complexity" to the ModuleAnalysisManager in that it makes it possible
to share a ModuleAnalysisManager across multiple modules... But it does
so by deleting *all of the code*, so I'm OK with that. This will
naturally make fixing bugs in this code much simpler, etc.

The only down side here is that we have to use 'typename' and 'this->'
in various places, and the implementation is lifted into the header.
I'll take that for the code size reduction.

The convenient names are still typedef-ed and used throughout so that
users can largely ignore this aspect of the implementation.

The follow-up change to this will do the exact same refactoring for the
PassManagers. =D

It turns out that the interesting different code is almost entirely in
the adaptors. At the end, that should be essentially all that is left.

llvm-svn: 225757
2015-01-13 02:51:47 +00:00
Chandler Carruth e5b0a9cf3d [PM] Give slightly less horrible names to the utility pass templates for
requiring and invalidating specific analyses. Also make their printed
names match their class names. Writing these out as prose really doesn't
make sense to me any more.

llvm-svn: 225346
2015-01-07 11:14:51 +00:00
Chandler Carruth fdb4180514 [PM] Fix a pretty nasty bug where the new pass manager would invalidate
passes too many time.

I think this is actually the issue that someone raised with me at the
developer's meeting and in an email, but that we never really got to the
bottom of. Having all the testing utilities made it much easier to dig
down and uncover the core issue.

When a pass manager is running many passes over a single function, we
need it to invalidate the analyses between each run so that they can be
re-computed as needed. We also need to track the intersection of
preserved higher-level analyses across all the passes that we run (for
example, if there is one module analysis which all the function analyses
preserve, we want to track that and propagate it). Unfortunately, this
interacted poorly with any enclosing pass adaptor between two IR units.
It would see the intersection of preserved analyses, and need to
invalidate any other analyses, but some of the un-preserved analyses
might have already been invalidated *and recomputed*! We would fail to
propagate the fact that the analysis had already been invalidated.

The solution to this struck me as really strange at first, but the more
I thought about it, the more natural it seemed. After a nice discussion
with Duncan about it on IRC, it seemed even nicer. The idea is that
invalidating an analysis *causes* it to be preserved! Preserving the
lack of result is trivial. If it is recomputed, great. Until something
*else* invalidates it again, we're good.

The consequence of this is that the invalidate methods on the analysis
manager which operate over many passes now consume their
PreservedAnalyses object, update it to "preserve" every analysis pass to
which it delivers an invalidation (regardless of whether the pass
chooses to be removed, or handles the invalidation itself by updating
itself). Then we return this augmented set from the invalidate routine,
letting the pass manager take the result and use the intersection of
*that* across each pass run to compute the final preserved set. This
accounts for all the places where the early invalidation of an analysis
has already "preserved" it for a future run.

I've beefed up the testing and adjusted the assertions to show that we
no longer repeatedly invalidate or compute the analyses across nested
pass managers.

llvm-svn: 225333
2015-01-07 01:58:35 +00:00
Chandler Carruth 4e107caf2e [PM] Introduce a utility pass that preserves no analyses.
Use this to test that path of invalidation. This test actually shows
redundant invalidation here that is really bad. I'm going to work on
fixing that next, but wanted to commit the test harness now that its all
working.

llvm-svn: 225257
2015-01-06 09:06:35 +00:00
Chandler Carruth ea368f1ee4 [PM] Simplify how we parse the outer layer of the pass pipeline text and
remove an extra, redundant pass manager wrapping every run.

I had kept seeing these when manually testing, but it was getting really
annoying and was going to cause problems with overly eager invalidation.
The root cause was an overly complex and unnecessary pile of code for
parsing the outer layer of the pass pipeline. We can instead delegate
most of this to the recursive pipeline parsing.

I've added some somewhat more basic and precise tests to catch this.

llvm-svn: 225253
2015-01-06 08:37:58 +00:00
Chandler Carruth 3472ffb37e [PM] Add a utility pass template that synthesizes the invalidation of
a specific analysis result.

This is quite handy to test things, and will also likely be very useful
for debugging issues. You could narrow down pass validation failures by
walking these invalidate pass runs up and down the pass pipeline, etc.
I've added support to the pass pipeline parsing to be able to create one
of these for any analysis pass desired.

Just adding this class uncovered one latent bug where the
AnalysisManager CRTP base class had a hard-coded Module type rather than
using IRUnitT.

I've also added tests for invalidation and caching of analyses in
a basic way across all the pass managers. These in turn uncovered two
more bugs where we failed to correctly invalidate an analysis -- its
results were invalidated but the key for re-running the pass was never
cleared and so it was never re-run. Quite nasty. I'm very glad to debug
this here rather than with a full system.

Also, yes, the naming here is horrid. I'm going to update some of the
names to be slightly less awful shortly. But really, I've no "good"
ideas for naming. I'll be satisfied if I can get it to "not bad".

llvm-svn: 225246
2015-01-06 04:49:44 +00:00
Chandler Carruth 0b576b377f [PM] Add a collection of no-op analysis passes and switch the new pass
manager tests to use them and be significantly more comprehensive.

This, naturally, uncovered a bug where the CGSCC pass manager wasn't
printing analyses when they were run.

The only remaining core manipulator is I think an invalidate pass
similar to the require pass. That'll be next. =]

llvm-svn: 225240
2015-01-06 02:50:06 +00:00
Chandler Carruth 628503e4d4 [PM] Add a utility to the new pass manager for generating a pass which
is a no-op other than requiring some analysis results be available.

This can be used in real pass pipelines to force the usually lazy
analysis running to eagerly compute something at a specific point, and
it can be used to test the pass manager infrastructure (my primary use
at the moment).

I've also added bit of pipeline parsing magic to support generating
these directly from the opt command so that you can directly use these
when debugging your analysis. The syntax is:

  require<analysis-name>

This can be used at any level of the pass manager. For example:

  cgscc(function(require<my-analysis>,no-op-function))

This would produce a no-op function pass requiring my-analysis, followed
by a fully no-op function pass, both of these in a function pass manager
which is nested inside of a bottom-up CGSCC pass manager which is in the
top-level (implicit) module pass manager.

I have zero attachment to the particular syntax I'm using here. Consider
it a straw man for use while I'm testing and fleshing things out.
Suggestions for better syntax welcome, and I'll update everything based
on any consensus that develops.

I've used this new functionality to more directly test the analysis
printing rather than relying on the cgscc pass manager running an
analysis for me. This is still minimally tested because I need to have
analyses to run first! ;] That patch is next, but wanted to keep this
one separate for easier review and discussion.

llvm-svn: 225236
2015-01-06 02:10:51 +00:00
Chandler Carruth 539dc4b9d5 [PM] Don't run the machinery of invalidating all the analysis passes
when all are being preserved.

We want to short-circuit this for a couple of reasons. One, I don't
really want passes to grow a dependency on actually receiving their
invalidate call when they've been preserved. I'm thinking about removing
this entirely. But more importantly, preserving everything is likely to
be the common case in a lot of scenarios, and it would be really good to
bypass all of the invalidation and preservation machinery there.
Avoiding calling N opaque functions to try to invalidate things that are
by definition still valid seems important. =]

This wasn't really inpsired by much other than seeing the spam in the
logging for analyses, but it seems better ot get it checked in rather
than forgetting about it.

llvm-svn: 225163
2015-01-05 12:32:11 +00:00
Chandler Carruth e5e8fb3bf6 [PM] Add names and debug logging for analysis passes to the new pass
manager.

This starts to allow us to test analyses more easily, but it's really
only the beginning. Some of the code here is still untestable without
manual changes to create analysis passes, but I wanted to factor it into
a small of chunks as possible.

Next up in order to be able to test things are, in no particular order:
- No-op analyses passes so we don't have to use real ones to exercise
  the pass maneger itself.
- Automatic way of generating dummy passes that require an analysis be
  run, including a variant that calls a 'print' method on a pass to make
  it even easier to print out the results of an analysis.
- Dummy passes that invalidate all analyses for their IR unit so we can
  test invalidation and re-runs.
- Automatic way to print each analysis pass as it is re-run.
- Automatic but optional verification of analysis passes everywhere
  possible.

I'm not claiming I'll get to all of these immediately, but that's what
is in the pipeline at some stage. I'm fleshing out exactly what I need
and what to prioritize by working on converting analyses and then trying
to test the conversion. =]

llvm-svn: 225162
2015-01-05 12:21:44 +00:00