Commit Graph

781 Commits

Author SHA1 Message Date
Amy Huang ea693a1627 [NPM] Port module-debuginfo pass to the new pass manager
Port pass to NPM and update tests in DebugInfo/Generic.

Differential Revision: https://reviews.llvm.org/D89730
2020-10-19 14:31:17 -07:00
Hans Wennborg 0628bea513 Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting"
This broke Chromium's PGO build, it seems because hot-cold-splitting got turned
on unintentionally. See comment on the code review for repro etc.

> This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
> the splitting pass to be toggled on/off. The current method of passing
> `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
> correctly (say, with `-O0` or `-Oz`).
>
> To implement the -fsplit-cold-code option, an attribute is applied to
> functions to indicate that they may be considered for splitting. This
> removes some complexity from the old/new PM pipeline builders, and
> behaves as expected when LTO is enabled.
>
> Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
> Differential Revision: https://reviews.llvm.org/D57265
> Reviewed By: Aditya Kumar, Vedant Kumar
> Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar

This reverts commit 273c299d5d.
2020-10-19 12:31:14 +02:00
Vedant Kumar 273c299d5d [PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting
This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
the splitting pass to be toggled on/off. The current method of passing
`-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
correctly (say, with `-O0` or `-Oz`).

To implement the -fsplit-cold-code option, an attribute is applied to
functions to indicate that they may be considered for splitting. This
removes some complexity from the old/new PM pipeline builders, and
behaves as expected when LTO is enabled.

Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
Differential Revision: https://reviews.llvm.org/D57265
Reviewed By: Aditya Kumar, Vedant Kumar
Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
2020-10-15 23:13:33 +00:00
Arthur Eubanks 518ec05a10 [LoopExtract][NewPM] Port -loop-extract to NPM
-loop-extract-single is just -loop-extract on one loop.

-loop-extract depended on -break-crit-edges and -loop-simplify in the
legacy PM, but the NPM doesn't allow specifying pass dependencies like
that, so manually add those passes to the RUN lines where necessary.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89016
2020-10-13 22:55:42 -07:00
Arthur Eubanks 0689dab844 [FixIrreducible][NewPM] Port -fix-irreducible to NPM
In the NPM, a pass cannot depend on another non-analysis pass. So pin
the test that tests that -lowerswitch is run automatically to legacy PM.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D89051
2020-10-09 09:22:09 -07:00
Arthur Eubanks 9c21c6c966 [LoopInterchange][NewPM] Port -loop-interchange to NPM
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89058
2020-10-09 09:21:31 -07:00
Arthur Eubanks 6dcbea877b [NewPM] Use PassInstrumentation for -verify-each
This removes "VerifyEachPass" parameters from a lot of functions which is nice.

Don't verify after special passes or VerifierPass.

This introduces verification on loop and cgscc passes, verifying the corresponding function/module.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D88764
2020-10-07 19:24:25 -07:00
Reid Kleckner 940d7aaea9 Port StripGCRelocates pass to NPM
Fixes one test under NPM

Differential Revision: https://reviews.llvm.org/D88766
2020-10-07 14:41:29 -07:00
Reid Kleckner da48fe1732 [NPM] Port strip nonlinetable debuginfo pass to the new pass manager
Fixes a few tests in llvm/test/Transforms/Utils.

Differential Revision: https://reviews.llvm.org/D88762
2020-10-07 14:35:36 -07:00
Arthur Eubanks d4e08c95e5 [NewPM] Set -enable-npm-optnone to true by default
This makes the NPM skip not required passes on functions marked optnone.

If this causes a pass that should be required but has not been marked
required to be skipped, add
`static bool isRequired() { return true; }`
to the pass class. AlwaysInlinerPass is an example.

clang/test/CodeGen/O0-no-skipped-passes.c is useful for checking that
no passes are skipped under -O0.

The -enable-npm-optnone option will be removed once this has been stable
for long enough without issues.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D87869
2020-10-05 18:42:32 -07:00
Arthur Eubanks 321986fe68 [MetaRenamer][NewPM] Port metarenamer to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D88690
2020-10-02 15:42:25 -07:00
Jamie Schmeiser 71124a9dbd Reland No.3: Add new hidden option -print-changed which only reports changes to IR
A new hidden option -print-changed is added along with code to support
printing the IR as it passes through the opt pipeline in the new pass
manager. Only those passes that change the IR are reported, with others
only having the banner reported, indicating that they did not change the
IR, were filtered out or ignored. Filtering of output via the
-filter-print-funcs is supported and a new supporting hidden option
-filter-passes is added. The latter takes a comma separated list of pass
names and filters the output to only show those passes in the list that
change the IR. The output can also be modified via the -print-module-scope
function.

The code introduces an abstract template base class that generalizes the
comparison of IRs that takes an IR representation as template parameter.
Derived classes provide overrides that provide an event based API
for generalized reporting of IRs as they are changed in the opt pipeline
through the new pass manager.

The first of several instantiations is provided that prints the IR
in a form similar to that produced by -print-after-all with the above
mentioned filtering capabilities. This version, and the others to
follow will be introduced at the upcoming developer's conference.

Reviewed By: aeubanks (Arthur Eubanks), yrouban (Yevgeny Rouban), ychen (Yuanfang Chen), MaskRay (Fangrui Song)

Differential Revision: https://reviews.llvm.org/D86360
2020-10-01 17:39:13 +00:00
Sjoerd Meijer d53b4bee0c [LoopFlatten] Add a loop-flattening pass
This is a simple pass that flattens nested loops.  The intention is to optimise
loop nests like this, which together access an array linearly:

  for (int i = 0; i < N; ++i)
    for (int j = 0; j < M; ++j)
      f(A[i*M+j]);

into one loop:

  for (int i = 0; i < (N*M); ++i)
    f(A[i]);

It can also flatten loops where the induction variables are not used in the
loop. This can help with codesize and runtime, especially on simple cpus
without advanced branch prediction.

This is only worth flattening if the induction variables are only used in an
expression like i*M+j. If they had any other uses, we would have to insert a
div/mod to reconstruct the original values, so this wouldn't be profitable.

This partially fixes PR40581 as this pass triggers on one of the two cases. I
will follow up on this to learn LoopFlatten a few more (small) tricks. Please
note that LoopFlatten is not yet enabled by default.

Patch by Oliver Stannard, with minor tweaks from Dave Green and myself.

Differential Revision: https://reviews.llvm.org/D42365
2020-10-01 13:54:45 +01:00
Arthur Eubanks 460dda071e [WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass
The legacy pass's default constructor sets UseCommandLine = true and
goes down a separate testing route. Match that in the NPM pass.

This fixes all tests in llvm/test/Transforms/WholeProgramDevirt under NPM.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D88588
2020-09-30 17:27:37 -07:00
Arthur Eubanks 4fbd83c716 [ObjCARCAA][NewPM] Add already ported objc-arc-aa to PassRegistry.def
Also add missing AnalysisKey definition.
2020-09-30 08:50:44 -07:00
Chuanqi Xu b3a722e66b [Coroutines] Reuse storage for local variables with non-overlapping lifetimes
bug 45566 shows the process of building coroutine frame won't consider
that the lifetimes of different local variables are not overlapped,
which means the compiler could generates smaller frame.

This patch calculate the lifetime range of each alloca by StackLifetime
class. Then the patch build non-overlapped sets for allocas whose
lifetime ranges are not overlapped. We use the largest type in a
non-overlapped set as the field type in the frame. In insertSpills
process, if we find the type of field is not the same with the alloca,
we cast the pointer to the field type to the pointer to the alloca type.
Since the lifetime range of alloca in one non-overlapped set is not
overlapped with each other, it should be ok to reuse the storage space
in the frame.

Test plan: check-llvm, check-clang, cppcoro, folly

Reviewers: junparser, lxfind, modocache

Differential Revision: https://reviews.llvm.org/D87596
2020-09-28 15:48:00 +08:00
Fangrui Song 50bd71e1d7 [NewPM] Port ConstraintElimination to the new pass manager
If -enable-constraint-elimination is specified, add it to the -O2/-O3 pipeline.
(-O1 uses a separate function now.)

Reviewed By: fhahn, aeubanks

Differential Revision: https://reviews.llvm.org/D88365
2020-09-27 11:12:26 -07:00
Arthur Eubanks 83e3ea2cfc [LowerTypeTests][NewPM] Add constructor that uses command line flags
This matches the legacy PM pass by having one constructor use command
line flags, and the other use parameters to the pass.

This fixes all tests under Transforms/LowerTypeTests using NPM.

Reviewed By: ychen, pcc

Differential Revision: https://reviews.llvm.org/D87845
2020-09-25 17:39:59 -07:00
Arthur Eubanks d3f6972abb [LoopReroll][NewPM] Port -loop-reroll to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87957
2020-09-25 12:09:06 -07:00
Hans Wennborg 4f1897c6f0 Move PassBuilder::registerParseTopLevelPipelineCallback out-of-line
For some mysterious reason it doesn't build with clang-cl when compiled
as part of the includes in clang's CodeGenAction.cpp
(crbug.com/1132292).
2020-09-25 19:55:40 +02:00
Andrew Litteken f02c4c87b4 [IRSim] Adding wrapper pass for IRSimilarityIdentfier
This introduces an analysis pass that wraps IRSimilarityIdentifier,
and adds a printer pass to examine in what function similarities are
being found.

Test for what the printer pass can find are in
test/Analysis/IRSimilarityIdentifier.

Reviewed by: paquette, jroelofs

Differential Revision: https://reviews.llvm.org/D86973
2020-09-24 14:59:41 -05:00
Arthur Eubanks 29aaa18848 Revert "[NewPM] Add callbacks to PassBuilder to run before/after parsing a pass"
This reverts commit 111aa4e366.
2020-09-23 18:43:13 -07:00
Arthur Eubanks 111aa4e366 [NewPM] Add callbacks to PassBuilder to run before/after parsing a pass
This is in preparation for supporting -debugify-each, which adds a debug
info pass before and after each pass.

Switch VerifyEach to use this.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D88107
2020-09-23 15:25:40 -07:00
Arthur Eubanks a031ef6f3a [GVNSink][NewPM] Add GVNSinkPass to PassRegistry.def 2020-09-22 08:24:09 -07:00
Arthur Eubanks 9db0c572c1 [Delinearization][NewPM] Port delinearization to NPM
Also make tests in Analysis/Delinearization work under NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87741
2020-09-21 17:59:08 -07:00
Arthur Eubanks 024979b7b6 [ObjCARC][NewPM] Port objc-arc-contract to NPM
Similar to https://reviews.llvm.org/D86178.

This is a module pass instead of a function pass since
ARCRuntimeEntryPoints can lazily add function declarations.

Reviewed By: ahatanak

Differential Revision: https://reviews.llvm.org/D87806
2020-09-21 09:40:14 -07:00
Arthur Eubanks 5249e6f248 [LoopSimplifyCFG][NewPM] Rename simplify-cfg -> loop-simplifycfg
This matches the legacy PM name and makes all tests in
Transforms/LoopSimplifyCFG pass under NPM.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D87948
2020-09-21 08:27:19 -07:00
Douglas Yung b03c2b8395 Revert "Re-land: Add new hidden option -print-changed which only reports changes to IR"
The test added in this commit is failing on Windows bots:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/1269

This reverts commit f9e6d1edc0 and follow-up commit 6859d95ea2.
2020-09-17 01:32:29 -07:00
Arthur Eubanks f4ea0f9814 [NewPM] Port -print-alias-sets to NPM
Really it should be named print<alias-sets>, but for the sake of
changing fewer tests, added a TODO to rename after NPM switch and test
cleanup.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87713
2020-09-16 18:34:56 -07:00
Michael Liao 6859d95ea2 Fix build. 2020-09-16 14:52:00 -04:00
Jamie Schmeiser f9e6d1edc0 Re-land: Add new hidden option -print-changed which only reports changes to IR
A new hidden option -print-changed is added along with code to support
printing the IR as it passes through the opt pipeline in the new pass
manager. Only those passes that change the IR are reported, with others
only having the banner reported, indicating that they did not change the
IR, were filtered out or ignored. Filtering of output via the
-filter-print-funcs is supported and a new supporting hidden option
-filter-passes is added. The latter takes a comma separated list of pass
names and filters the output to only show those passes in the list that
change the IR. The output can also be modified via the -print-module-scope
function.

The code introduces a template base class that generalizes the comparison
of IRs that takes an IR representation as template parameter. The
constructor takes a series of lambdas that provide an event based API
for generalized reporting of IRs as they are changed in the opt pipeline
through the new pass manager.

The first of several instantiations is provided that prints the IR
in a form similar to that produced by -print-after-all with the above
mentioned filtering capabilities. This version, and the others to
follow will be introduced at the upcoming developer's conference.

Reviewed By: aeubanks (Arthur Eubanks), yrouban (Yevgeny Rouban), ychen (Yuanfang Chen)

Differential Revision: https://reviews.llvm.org/D86360
2020-09-16 17:25:18 +00:00
Arthur Eubanks 09c342493d [NPM] Translate alias analysis into require<> as well
'require<globals-aa>' is needed to make globals-aa work in NPM, since
globals-aa is a module analysis but function passes cannot run module
analyses on demand.
So don't skip translating alias analyses to 'require<>'.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87743
2020-09-16 08:54:09 -07:00
Arthur Eubanks ba12e77ec1 [NewPM] Port strip* passes to NPM
strip-nondebug and strip-debug-declare have no existing associated tests

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87639
2020-09-15 18:25:12 -07:00
Arthur Eubanks f7aa1563eb [LowerSwitch][NewPM] Port lowerswitch to NPM
Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87726
2020-09-15 18:18:31 -07:00
Wenlei He 2c391a5a14 [LICM] Make Loop ICM profile aware again
D65060 was reverted because it introduced non-determinism by using BFI counts from already freed blocks. The parent of this revision fixes that by using a VH callback on blocks to prevent this from happening and makes sure BFI data is passed correctly in LoopStandardAnalysisResults.

This re-introduces the previous optimization of using BFI data to prevent LICM from hoisting/sinking if the instruction will end up moving to a colder block.

Internally at Facebook this change results in a ~7% win in a CPU related metric in one of our big services by preventing hoisting cold code into a hot pre-header like the added test case demonstrates.

Testing:
ninja check

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87551
2020-09-15 17:21:58 -07:00
Wenlei He 2ea4c2c598 [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults
~~D65060 uncovered that trying to use BFI in loop passes can lead to non-deterministic behavior when blocks are re-used while retaining old BFI data.~~

~~To make sure BFI is preserved through loop passes a Value Handle (VH) callback is registered on blocks themselves. When a block is freed it now also wipes out the accompanying BFI entry such that stale BFI data can no longer persist resolving the determinism issue. ~~

~~An optimistic approach would be to incrementally update BFI information throughout the loop passes rather than only invalidating them on removed blocks. The issues with that are:~~
~~1. It is not clear how BFI information should be incrementally updated: If a block is duplicated does its BFI information come with? How about if it's split/modified/moved around? ~~
~~2. Assuming we can address these problems the implementation here will be a massive undertaking. ~~

~~There's a known need of BFI in LICM analysis which requires correct but not incrementally updated BFI data. A follow-up change can register BFI in all loop passes so this preserved but potentially lossy data is available to any loop pass that wants it.~~

See: D75341 for an identical implementation of preserving BFI via VH callbacks. The previous statements do still apply but this change no longer has to be in this diff because it's already upstream 😄 .

This diff also moves BFI to be a part of LoopStandardAnalysisResults since the previous method using getCachedResults now (correctly!) statically asserts (D72893) that this data isn't static through the loop passes.

Testing
Ninja check

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D86156
2020-09-15 16:16:24 -07:00
Arthur Eubanks 3f69b2140f [NewPM][opt] Fix -globals-aa not being recognized as alias analysis in NPM
Was missing MODULE_ALIAS_ANALYSIS, previously only FUNCTION_ALIAS_ANALYSIS was taken into account.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D87664
2020-09-15 11:18:19 -07:00
Arthur Eubanks 10b12d4035 Reland [docs][NewPM] Add docs for writing NPM passes
As to not conflict with the legacy PM example passes under
llvm/lib/Transforms/Hello, this is under HelloNew. This makes the
CMakeLists.txt and general directory structure less confusing for people
following the example.

Much of the doc structure was taken from WritinAnLLVMPass.rst.

This adds a HelloWorld pass which simply prints out each function name.

More will follow after this, e.g. passes over different units of IR, analyses.
https://llvm.org/docs/WritingAnLLVMPass.html contains a lot more.

Relanded with missing "Support" dependency in LLVMBuild.txt.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D86979
2020-09-14 16:06:19 -07:00
Arthur Eubanks 39ec36415d Revert "[docs][NewPM] Add docs for writing NPM passes"
This reverts commit c2590de30d.

Breaks shared libs build
2020-09-14 15:55:17 -07:00
Arthur Eubanks c2590de30d [docs][NewPM] Add docs for writing NPM passes
As to not conflict with the legacy PM example passes under
llvm/lib/Transforms/Hello, this is under HelloNew. This makes the
CMakeLists.txt and general directory structure less confusing for people
following the example.

Much of the doc structure was taken from WritinAnLLVMPass.rst.

This adds a HelloWorld pass which simply prints out each function name.

More will follow after this, e.g. passes over different units of IR, analyses.
https://llvm.org/docs/WritingAnLLVMPass.html contains a lot more.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D86979
2020-09-14 13:26:03 -07:00
Teresa Johnson 226d80ebe2 [MemProf] Rename HeapProfiler to MemProfiler for consistency
This is consistent with the clang option added in
7ed8124d46, and the comments on the
runtime patch in D87120.

Differential Revision: https://reviews.llvm.org/D87622
2020-09-14 13:14:57 -07:00
Yevgeny Rouban 28012e00d8 [NewPM] Introduce PreserveCFG check
Check that all passes, which report they preserve CFG,
are really preserving CFG.
A new standard instrumentation is introduced. It can be
switched on/off by the flag verify-cfg-preserved, which
is on by default for debug builds.

Reviewers: kuhar, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81558
2020-09-11 14:32:21 +07:00
Juneyoung Lee 1b9884df8d Enable InsertFreeze flag of JumpThreading when used in LTO
This patch enables inserting freeze when JumpThreading converts a select to
a conditional branch when it is run in LTO.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D85534
2020-09-10 19:05:49 +09:00
Roman Lebedev bb7d3af113
Reland [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline
This was reverted in 503deec218
because it caused gigantic increase (3x) in branch mispredictions
in certain benchmarks on certain CPU's,
see https://reviews.llvm.org/D84108#2227365.

It has since been investigated and here are the results:
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200907/827578.html
> It's an amazingly severe regression, but it's also all due to branch
> mispredicts (about 3x without this). The code layout looks ok so there's
> probably something else to deal with. I'm not sure there's anything we can
> reasonably do so we'll just have to take the hit for now and wait for
> another code reorganization to make the branch predictor a bit more happy :)
>
> Thanks for giving us some time to investigate and feel free to recommit
> whenever you'd like.
>
> -eric

So let's just reland this.
Original commit message:


I've been looking at missed vectorizations in one codebase.
One particular thing that stands out is that some of the loops
reach vectorizer in a rather mangled form, with weird PHI's,
and some of the loops aren't even in a rotated form.

After taking a more detailed look, that happened because
the loop's headers were too big by then. It is evident that
SimplifyCFG's common code hoisting transform is at fault there,
because the pattern it handles is precisely the unrotated
loop basic block structure.

Surprizingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled
by default, and is always run, unlike it's friend, common code sinking
transform, `SinkCommonCodeFromPredecessors()`, which is not enabled
by default and is only run once very late in the pipeline.

I'm proposing to harmonize this, and disable common code hoisting
until //late// in pipeline. Definition of //late// may vary,
here currently i've picked the same one as for code sinking,
but i suppose we could enable it as soon as right after
loop rotation happens.

Experimentation shows that this does indeed unsurprizingly help,
more loops got rotated, although other issues remain elsewhere.

Now, this undoubtedly seriously shakes phase ordering.
This will undoubtedly be a mixed bag in terms of both compile- and
run- time performance, codesize. Since we no longer aggressively
hoist+deduplicate common code, we don't pay the price of said hoisting
(which wasn't big). That may allow more loops to be rotated,
so we pay that price. That, in turn, that may enable all the transforms
that require canonical (rotated) loop form, including but not limited to
vectorization, so we pay that too. And in general, no deduplication means
more [duplicate] instructions going through the optimizations. But there's still
late hoisting, some of them will be caught late.

As per benchmarks i've run {F12360204}, this is mostly within the noise,
there are some small improvements, some small regressions.
One big regression i saw i fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but i'm sure
this will expose many more pre-existing missed optimizations, as usual :S

llvm-compile-time-tracker.com thoughts on this:
http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions
* this does regress compile-time by +0.5% geomean (unsurprizingly)
* size impact varies; for ThinLTO it's actually an improvement

The largest fallout appears to be in GVN's load partial redundancy
elimination, it spends *much* more time in
`MemoryDependenceResults::getNonLocalPointerDependency()`.
Non-local `MemoryDependenceResults` is widely-known to be, uh, costly.
There does not appear to be a proper solution to this issue,
other than silencing the compile-time performance regression
by tuning cut-off thresholds in `MemoryDependenceResults`,
at the cost of potentially regressing run-time performance.
D84609 attempts to move in that direction, but the path is unclear
and is going to take some time.

If we look at stats before/after diffs, some excerpts:
* RawSpeed (the target) {F12360200}
  * -14 (-73.68%) loops not rotated due to the header size (yay)
  * -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer
  * -3937 (-64.19%) common instructions hoisted
  * +561 (+0.06%) x86 asm instructions
  * -2 basic blocks
  * +2418 (+0.11%) IR instructions
* vanilla test-suite + RawSpeed + darktable  {F12360201}
  * -36396 (-65.29%) common instructions hoisted
  * +1676 (+0.02%) x86 asm instructions
  * +662 (+0.06%) basic blocks
  * +4395 (+0.04%) IR instructions

It is likely to be sub-optimal for when optimizing for code size,
so one might want to change tune pipeline by enabling sinking/hoisting
when optimizing for size.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D84108

This reverts commit 503deec218.
2020-09-08 00:24:03 +03:00
Arthur Eubanks c9771391ce [NewPM][Lint] Port -lint to NewPM
This also changes -lint from an analysis to a pass. It's similar to
-verify, and that is a normal pass, and lives in llvm/IR.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87057
2020-09-03 13:03:44 -07:00
Jamie Schmeiser b2e65cf950 Revert "Add new hidden option -print-changed which only reports changes to IR"
This reverts commit 7bc9924cb2 due to
failure caused by missing a space between trailing >>, required by some
versions of C++:wq.
2020-09-03 18:41:20 +00:00
Jamie Schmeiser 7bc9924cb2 Add new hidden option -print-changed which only reports changes to IR
A new hidden option -print-changed is added along with code to support
printing the IR as it passes through the opt pipeline in the new pass
manager. Only those passes that change the IR are reported, with others
only having the banner reported, indicating that they did not change the
IR, were filtered out or ignored. Filtering of output via the
-filter-print-funcs is supported and a new supporting hidden option
-filter-passes is added. The latter takes a comma separated list of pass
names and filters the output to only show those passes in the list that
change the IR. The output can also be modified via the -print-module-scope
function.

The code introduces a template base class that generalizes the comparison
of IRs that takes an IR representation as template parameter. The
constructor takes a series of lambdas that provide an event based API
for generalized reporting of IRs as they are changed in the opt pipeline
through the new pass manager.

The first of several instantiations is provided that prints the IR
in a form similar to that produced by -print-after-all with the above
mentioned filtering capabilities. This version, and the others to
follow will be introduced at the upcoming developer's conference.
See https://hotcrp.llvm.org/usllvm2020/paper/29 for more information.

Reviewed By: yrouban (Yevgeny Rouban)

Differential Revision: https://reviews.llvm.org/D86360
2020-09-03 15:52:35 +00:00
Arthur Eubanks e440b4933a Revert "[NewPM][Lint] Port -lint to NewPM"
This reverts commit 883399c840.
2020-09-02 21:34:29 -07:00
Arthur Eubanks 883399c840 [NewPM][Lint] Port -lint to NewPM
This also changes -lint from an analysis to a pass. It's similar to
-verify, and that is a normal pass, and lives in llvm/IR.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D87057
2020-09-02 21:13:01 -07:00
Arthur Eubanks cfde93e5d6 [ObjCARCOpt] Port objc-arc to NPM
Since doInitialization() in the legacy pass modifies the module, the NPM
pass is a Module pass.

Reviewed By: ahatanak, ychen

Differential Revision: https://reviews.llvm.org/D86178
2020-08-28 12:59:33 -07:00
Teresa Johnson 7ed8124d46 [HeapProf] Clang and LLVM support for heap profiling instrumentation
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.

Differential Revision: https://reviews.llvm.org/D85948
2020-08-27 08:50:35 -07:00
Roman Lebedev 503deec218
Temporairly revert "[SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline"
As disscussed in post-commit review starting with
	https://reviews.llvm.org/D84108#2227365
while this appears to be mostly a win overall, especially code-size-wise,
this appears to shake //certain// code pattens in a way that is extremely
unfavorable for performance (+30% runtime regression)
on certain CPU's (i personally can't reproduce).

So until the behaviour is better understood, and a path forward is mapped,
let's back this out for now.

This reverts commit 1d51dc38d8.
2020-08-22 00:33:22 +03:00
Roman Lebedev 5d7c5a5e99
[NFC] Port InstCount pass to new pass manager 2020-08-21 12:39:42 +03:00
Yevgeny Rouban 18bc400f97 [NewPM][PassInstrumentation] Add PreservedAnalyses parameter to AfterPass* callbacks
Both AfterPass and AfterPassInvalidated pass instrumentation
callbacks get additional parameter of type PreservedAnalyses.
This patch was created by @fedor.sergeev. I have just slightly
changed it.

Reviewers: fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81555
2020-08-21 16:10:42 +07:00
Mircea Trofin 364cd768a2 [NFC] Expose the -Oz module optimization pipeline to opt
This exposes the module optimization pipeline as a pass that can be
applied stand-alone when using 'opt'. This helps ml inliner training
scenarios, where we start with IR captured right before inlining,
perform the inlining (-scc-oz-module-inliner) and then want to continue
and observe the final IR (where this patch comes into play). We can then
apply llc on the resulting IR to continue compilation down to native.

Differential Revision: https://reviews.llvm.org/D86224
2020-08-20 09:28:58 -07:00
Jamie Schmeiser 645c6856a6 [NFC] Add raw_ostream parameter to printIR routines
This is a non-functional-change to generalize the printIR routines so that
the output can be saved and manipulated rather than being directly output
to dbgs(). This is a prerequisite change for many upcoming changes that
allow new ways of examining changes made to the IR in the new pass manager.

Reviewed By: aeubanks (Arthur Eubanks)

Differential Revision: https://reviews.llvm.org/D85999
2020-08-18 16:05:27 +00:00
Arthur Eubanks 7abef41674 [NewPM] Print 'Skipping pass' as pass instrumentation
If OptNoneInstrumentation prints it instead, 'Skipping pass' will print for even required passes.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85493
2020-08-07 15:02:02 -07:00
Arthur Eubanks 9e6a1e5781 [NewPM][LoopRotate] Rename rotate -> loop-rotate
To match legacy pass name.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D85338
2020-08-05 12:25:01 -07:00
Arthur Eubanks 7c19c89dd5 [NewPM][LoopVersioning] Port LoopVersioning to NPM
Reviewed By: ychen, fhahn

Differential Revision: https://reviews.llvm.org/D85063
2020-08-03 10:32:09 -07:00
Arthur Eubanks 47acbcf09a [tbaa] Rename type-based-aa -> tbaa
For consistency with legacy pass name.
Helps with 37 instances of "unknown pass name 'tbaa'" in check-llvm under NPM.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D84967
2020-07-30 19:51:35 -07:00
Yuanfang Chen 555cf42f38 [NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations
Problem:
Right now, our "Running pass" is not accurate when passes are wrapped in adaptor because adaptor is never skipped and a pass could be skipped. The other problem is that "Running pass" for a adaptor is before any "Running pass" of passes/analyses it depends on. (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order.

Solution:
Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder)

This patch move debug logging for pass as a PassInstrument callback. We could be sure that all running passes are logged and in the correct order.

This could also be used to implement hierarchy pass logging in legacy PM. We could also move logging of pass manager to this if we want.

The test fixes looks messy. It includes changes:
- Remove PassInstrumentationAnalysis
- Remove PassAdaptor
- If a PassAdaptor is for a real pass, the pass is added
- Pass reorder (to the correct order), related to PassAdaptor
- Add missing passes (due to Debuglogging not passed down)

Reviewed By: asbirlea, aeubanks

Differential Revision: https://reviews.llvm.org/D84774
2020-07-30 10:07:57 -07:00
Arthur Eubanks 71d0a2b8a3 [DFSan][NewPM] Port DataFlowSanitizer to NewPM
Reviewed By: ychen, morehouse

Differential Revision: https://reviews.llvm.org/D84707
2020-07-29 10:19:15 -07:00
Roman Lebedev 1d51dc38d8
[SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline
I've been looking at missed vectorizations in one codebase.
One particular thing that stands out is that some of the loops
reach vectorizer in a rather mangled form, with weird PHI's,
and some of the loops aren't even in a rotated form.

After taking a more detailed look, that happened because
the loop's headers were too big by then. It is evident that
SimplifyCFG's common code hoisting transform is at fault there,
because the pattern it handles is precisely the unrotated
loop basic block structure.

Surprizingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled
by default, and is always run, unlike it's friend, common code sinking
transform, `SinkCommonCodeFromPredecessors()`, which is not enabled
by default and is only run once very late in the pipeline.

I'm proposing to harmonize this, and disable common code hoisting
until //late// in pipeline. Definition of //late// may vary,
here currently i've picked the same one as for code sinking,
but i suppose we could enable it as soon as right after
loop rotation happens.

Experimentation shows that this does indeed unsurprizingly help,
more loops got rotated, although other issues remain elsewhere.

Now, this undoubtedly seriously shakes phase ordering.
This will undoubtedly be a mixed bag in terms of both compile- and
run- time performance, codesize. Since we no longer aggressively
hoist+deduplicate common code, we don't pay the price of said hoisting
(which wasn't big). That may allow more loops to be rotated,
so we pay that price. That, in turn, that may enable all the transforms
that require canonical (rotated) loop form, including but not limited to
vectorization, so we pay that too. And in general, no deduplication means
more [duplicate] instructions going through the optimizations. But there's still
late hoisting, some of them will be caught late.

As per benchmarks i've run {F12360204}, this is mostly within the noise,
there are some small improvements, some small regressions.
One big regression i saw i fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but i'm sure
this will expose many more pre-existing missed optimizations, as usual :S

llvm-compile-time-tracker.com thoughts on this:
http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions
* this does regress compile-time by +0.5% geomean (unsurprizingly)
* size impact varies; for ThinLTO it's actually an improvement

The largest fallout appears to be in GVN's load partial redundancy
elimination, it spends *much* more time in
`MemoryDependenceResults::getNonLocalPointerDependency()`.
Non-local `MemoryDependenceResults` is widely-known to be, uh, costly.
There does not appear to be a proper solution to this issue,
other than silencing the compile-time performance regression
by tuning cut-off thresholds in `MemoryDependenceResults`,
at the cost of potentially regressing run-time performance.
D84609 attempts to move in that direction, but the path is unclear
and is going to take some time.

If we look at stats before/after diffs, some excerpts:
* RawSpeed (the target) {F12360200}
  * -14 (-73.68%) loops not rotated due to the header size (yay)
  * -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer
  * -3937 (-64.19%) common instructions hoisted
  * +561 (+0.06%) x86 asm instructions
  * -2 basic blocks
  * +2418 (+0.11%) IR instructions
* vanilla test-suite + RawSpeed + darktable  {F12360201}
  * -36396 (-65.29%) common instructions hoisted
  * +1676 (+0.02%) x86 asm instructions
  * +662 (+0.06%) basic blocks
  * +4395 (+0.04%) IR instructions

It is likely to be sub-optimal for when optimizing for code size,
so one might want to change tune pipeline by enabling sinking/hoisting
when optimizing for size.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D84108
2020-07-29 20:05:30 +03:00
Yuanfang Chen 7a2e1122ae [NewPM][PassInstrument] Make PrintIR and TimePasses to use before-pass-run callback
Reviewed By: asbirlea, aeubanks

Differential Revision: https://reviews.llvm.org/D84773
2020-07-29 08:26:36 -07:00
Zahira Ammarguellat 80bd6ae13e On Windows build, making the /bigobj flag global , instead of passing it per file.
To avoid having this flag be passed in per/file manner, we are instead
passing it globally.

This fixes this bug: https://bugs.llvm.org/show_bug.cgi?id=46733

Reviewed-by: aaron.ballman, beanz, meinersbur

Differential Revision: https://reviews.llvm.org/D84038
2020-07-28 18:04:36 -05:00
Arthur Eubanks 2ca6c422d2 [FunctionAttrs] Rename functionattrs -> function-attrs
To match NewPM pass name, and also for readability.
Also rename rpo-functionattrs -> rpo-function-attrs while we're here.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D84694
2020-07-28 09:09:13 -07:00
Tarindu Jayatilaka ee6f0e109c Add a Printer to the FunctionPropertiesAnalysis
A printer pass and a lit test case was added.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D82523
2020-07-23 11:57:11 -07:00
Tarindu Jayatilaka 418121c30a Reapply "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"
(This reverts commit a5e0194709, and
corrects author).

Rename the pass to be able to extend it to function properties other than inliner features.

    Reviewed By: mtrofin

    Differential Revision: https://reviews.llvm.org/D82044
2020-07-22 10:07:35 -07:00
Mircea Trofin a5e0194709 Revert "Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis"
This reverts commit 44a6bda19b. I forgot
to correctly attibute it to tarinduj. Fixing and resubmitting.
2020-07-22 09:42:17 -07:00
Mircea Trofin 44a6bda19b Rename InlineFeatureAnalysis to FunctionPropertiesAnalysis
Rename the pass to be able to extend it to function properties other than inliner features.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D82044
2020-07-22 09:24:15 -07:00
Sjoerd Meijer 5567c62afa [Matrix] Add LowerMatrixIntrinsics to the NPM
Pass LowerMatrixIntrinsics wasn't running yet running under the new pass
manager, and this adds LowerMatrixIntrinsics to the pipeline (to the
same place as where it is running in the old PM).

Differential Revision: https://reviews.llvm.org/D84180
2020-07-22 09:47:53 +01:00
Arthur Eubanks b13b858182 [NewPM] Support optnone under new pass manager
OptNoneInstrumentation is part of StandardInstrumentations. It skips
functions (or loops) that are marked optnone.

The feature of skipping optional passes for optnone functions under NPM
is gated on a -enable-npm-optnone flag. Currently it is by default
false. That is because we still need to mark all required passes to be
required. Otherwise optnone functions will start having incorrect
semantics.  After that is done in following changes, we can remove the
flag and always enable this.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D83519
2020-07-21 09:53:43 -07:00
Roman Lebedev 04b729d076
[NFCI][SimplifyCFG] Guard common code hoisting with a (default-on) flag
Common code sinking is already guarded with a (with default-off!) flag,
so add a flag for hoisting, too.

D84108 will hopefully make hoisting off-by-default too.
2020-07-20 10:29:57 +03:00
Yuanfang Chen 606e756bb1 [NewPM] make parsePassPipeline parse adaptor-wrapped user passes
Currently, when parsing text pipeline, different kinds of passes always
introduce nested pass managers. This makes it impossible to test the
adaptor-wrapped user passes from the text pipeline interface which is needed
by D82344 test cases. This also seems useful in general. See comments above
`parsePassPipeline`.

The syntax would be like mixing passes of different types, but it is
not the same as inferring the correct pass type and then adding the
matching nested pass managers. Strictly speaking, the resulted pipelines
are different.

Reviewed By: asbirlea, aeubanks

Differential Revision: https://reviews.llvm.org/D82698
2020-07-18 22:26:37 -07:00
Mircea Trofin 9870f77441 [llvm] Moved InlineSizeEstimatorAnalysis test to .ll
Summary:
Following guidance in
https://llvm.org/docs/TestingGuide.html#testing-analysis

Reviewers: mehdi_amini

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83918
2020-07-16 12:25:16 -07:00
Arthur Eubanks f413b53a67 [NPM][IVUsers] Rename ivusers -> iv-users
LPM passes were named iv-users, which seems nicer than ivusers.

Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D83803
2020-07-15 09:38:21 -07:00
Teresa Johnson 6014c46c80 Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This restores commit 80d0a137a5, and the
follow on fix in 873c0d0786, with a new
fix for test failures after a 2-stage clang bootstrap, and a more robust
fix for the Chromium build failure that an earlier version partially
fixed. See also discussion on D75201.

Reviewers: evgeny777

Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73242
2020-07-14 12:16:57 -07:00
Mircea Trofin caf395ee8c Reapply "[llvm] Native size estimator for training -Oz inliner"
This reverts commit 9908a3b9f5.

The fix was to exclude the content of TFUtils.h (automatically
included in the LLVM_Analysis module, when LLVM_ENABLE_MODULES is enabled).

Differential Revision: https://reviews.llvm.org/D82817
2020-07-13 16:26:26 -07:00
Davide Italiano 9908a3b9f5 Revert "[llvm] Native size estimator for training -Oz inliner"
This reverts commit 83080a294a as
it breaks the macOS modules build.
2020-07-13 13:13:36 -07:00
Arthur Eubanks fefe7555e9 [NewPM][opt] Translate -foo-analysis to require<foo-analysis>
Fixes 53 check-llvm tests under NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D83633
2020-07-13 11:24:59 -07:00
Mircea Trofin 83080a294a [llvm] Native size estimator for training -Oz inliner
Summary:
This is an experimental ML-based native size estimator, necessary for
computing partial rewards during -Oz inliner policy training. Data
extraction for model training will be provided in a separate patch.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, jdoerfert

Subscribers: mgorny, hiraditya, mgrang, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82817
2020-07-13 10:13:56 -07:00
Arthur Eubanks 21b4cc1db9 Reland [NFC] Derive from PassInfoMixin for no-op/printing passes
PassInfoMixin should be used for all NPM passes, rater than a custom
`name()`.

This caused ambiguous references in LegacyPassManager.cpp, so had to
remove "using namespace llvm::legacy" and move some things around.

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D83498
2020-07-10 12:51:28 -07:00
Davide Italiano fdb7856d54 Revert "[NFC] Derive from PassInfoMixin for no-op/printing passes"
This reverts commit 8039d2c3bf as
it breaks the modules build on macOS.
2020-07-10 11:19:13 -07:00
Zequan Wu 1fbb719470 [LPM] Port CGProfilePass from NPM to LPM
Reviewers: hans, chandlerc!, asbirlea, nikic

Reviewed By: hans, nikic

Subscribers: steven_wu, dexonsmith, nikic, echristo, void, zhizhouy, cfe-commits, aeubanks, MaskRay, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D83013
2020-07-10 09:04:51 -07:00
Arthur Eubanks 8039d2c3bf [NFC] Derive from PassInfoMixin for no-op/printing passes
PassInfoMixin should be used for all NPM passes, rater than a custom
`name()`.

This caused ambiguous references in LegacyPassManager.cpp, so had to
remove "using namespace llvm::legacy" and move some things around.

The passes had to be moved to the llvm namespace, or else they would get
printed as "(anonymous namespace)::FooPass".

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D83498
2020-07-09 16:58:30 -07:00
Fangrui Song c025bdf25a Revert D83013 "[LPM] Port CGProfilePass from NPM to LPM"
This reverts commit c92a8c0a0f.

It breaks builds and has unaddressed review comments.
2020-07-09 13:34:04 -07:00
Zequan Wu c92a8c0a0f [LPM] Port CGProfilePass from NPM to LPM
Reviewers: hans, chandlerc!, asbirlea, nikic

Reviewed By: hans, nikic

Subscribers: steven_wu, dexonsmith, nikic, echristo, void, zhizhouy, cfe-commits, aeubanks, MaskRay, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D83013
2020-07-09 13:03:42 -07:00
Arthur Eubanks 0b2536d0bd [NewPM] Add PredicateInfoPrinterPass to PassRegistry.def
Fixes tests under NPM in Transforms/Util/PredicateInfo.
2020-07-08 09:32:46 -07:00
Arthur Eubanks 1143f09678 [NewPM][LoopFusion] Rename loop-fuse -> loop-fusion
The legacy pass name is "loop-fusion".

Fixes most tests under Transforms/LoopFusion under NPM.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D83066
2020-07-07 10:43:07 -07:00
Arthur Eubanks 3d12e79094 [NewPM][LSR] Rename strength-reduce -> loop-reduce
The legacy pass was called "loop-reduce".

This lowers the number of check-llvm failures under NPM by 83.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D82925
2020-07-02 11:15:29 -07:00
dfukalov 1a6cebb4d1 [PM] Fix new PM to perform SpeculativeExecution as in old PM
Summary:
Old PM runs SpeculativeExecutionPass for targets that have divergent branches.
It uses `createSpeculativeExecutionIfHasBranchDivergencePass` that creates
the pass with `OnlyIfDivergentTarget=true`, whereas new PM just created the
pass with default `OnlyIfDivergentTarget=fase` so it unexpectedly runs and
causes buildbot test fails.

Reviewers: chandlerc, arsenm

Reviewed By: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82735
2020-06-30 15:21:04 +03:00
Yuanfang Chen 2b8a09e1ed [MemorySSA] Update comment in PassBuilder
Is teaching the LoopFullUnrollPass to preserve MemorySSA very hard or
just impossible?

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D82618
2020-06-26 14:55:31 -07:00
Arthur Eubanks a95796a380 [NewPM][LoopUnroll] Rename unroll* to loop-unroll*
The legacy pass is called "loop-unroll", but in the new PM it's called "unroll".
Also applied to unroll-and-jam and unroll-full.

Fixes various check-llvm tests when NPM is turned on.

Reviewed By: Whitney, dmgreen

Differential Revision: https://reviews.llvm.org/D82590
2020-06-26 09:28:32 -07:00
Arthur Eubanks 85ff5b524e [NewPM] Separate out alias analysis passes in opt
Summary:
This somewhat matches the --aa-pipeline option, which separates out any
AA analyses to make sure they run before other passes.

Makes check-llvm failures under new PM go from 2356 -> 2303.

AA passes are not handled by PassBuilder::parsePassPipeline() but rather
PassBuilder::parseAAPipeline(), which is why this fixes some failures.

Reviewers: asbirlea, hans, ychen, leonardchan

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82488
2020-06-25 08:53:57 -07:00
Yuanfang Chen ebc88811b5 Remove Passes dependency on CodeGen
The dependency was introduced in
5134020ea6. The only functional change
from this removal would be the new PM interface for the two codegen
passes. This is not necessary since we don't have codegen pipeline using
new PM yet. This removal is to break the potential circular dependency between
Passes and CodeGen once the codegen begins to gain new PM support.
2020-06-24 14:52:46 -07:00
Kirill Naumov 6a5d7d498c [InlineCost] InlineCostAnnotationWriterPass introduced
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".

This is the second attempt to integrate the patch.
The problem from the first try has been discussed and
fixed in D82205.

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-24 21:27:07 +00:00
Arthur Eubanks b5979a383a [NewPM] Add SimpleLoopUnswitchPass to PassRegistry.def
Summary:
Seems to just be missing from PassRegistry.def.

Makes the number of check-llvm failures under new PM go from 2619 to 2581.

Reviewers: hans, ychen, asbirlea, leonardchan

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82422
2020-06-24 08:20:34 -07:00
Arthur Eubanks fcf0741262 [NewPM] Handle -simplifycfg in opt
Summary:
-simplifycfg is the legacy pass name for SimplifyCFGPass.

There is already -simplify-cfg in FUNCTION_PASS_WITH_PARAMS which
handles options for SimplifyCFGPass. Maybe that should be renamed to
-simplifycfg as well?

This reduces the number of check-llvm failures under NewPM from 2619 to 2392.

Reviewers: hans, leonardchan, asbirlea, ychen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82421
2020-06-24 08:20:08 -07:00
Vitaly Buka fcd67665a8 [StackSafety] Add "Must Live" logic
Summary:
Extend StackLifetime with option to calculate liveliness
where alloca is only considered alive on basic block entry
if all non-dead predecessors had it alive at terminators.

Depends on D82043.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82124
2020-06-18 16:53:37 -07:00
Vitaly Buka f672791e08 [StackSafety] Add pass for StackLifetime testing
Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82043
2020-06-18 16:34:18 -07:00
Kirill Naumov ea844c7520 Revert "[InlineCost] InlineCostAnnotationWriterPass introduced"
This reverts commit 37e06e8f5c.
2020-06-17 14:02:34 +00:00
Kirill Naumov 37e06e8f5c [InlineCost] InlineCostAnnotationWriterPass introduced
This class allows to see the inliner's decisions for better
optimization verifications and tests. To use, use flag
"-passes="print<inline-cost>"".

Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev

Reviewed By: mtrofin

Differential revision: https://reviews.llvm.org/D81743
2020-06-17 13:40:17 +00:00
Mircea Trofin e2cc854015 [llvm][NFC] Move content of ML subdirectory into Analysis
The initial intent was to organize ML stuff in its own directory, but
it turns out that conflicts with llvm component layering policies: it
is not a component, because subsequent changes want to rely on other
analyses, which would create a cycle; and we don't have a reliable,
cross-platform mechanism to compile files in a subdirectory, and fit in
the existing LLVM build structure.

This change moves the files into Analysis, and subsequent changes will
leverage conditional compilation for those that have optional
dependencies.
2020-06-15 14:35:33 -07:00
Mircea Trofin 29e5722949 Revert "[llvm] Added support for stand-alone cmake object libraries."
This reverts commit 695c7d6313.

Breaks windows (e.g.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/16497)

Likely to cause problems with XCode.
2020-06-15 12:15:39 -07:00
Mircea Trofin 695c7d6313 [llvm] Added support for stand-alone cmake object libraries.
Summary:
Currently, add_llvm_library would create an OBJECT library alongside
of a STATIC / SHARED library, but losing the link interface (its
elements would become dependencies instead). To support scenarios
where linking an object library also brings in its usage
requirements, this patch adds support for 'stand-alone' OBJECT
libraries - i.e. without an accompanying SHARED/STATIC library, and
maintaining the link interface defined by the user.

The support is via a new option, OBJECT_ONLY, to avoid breaking changes
- since just specifying "OBJECT" would currently imply also STATIC or
SHARED, depending on BUILD_SHARED_LIBS.

This is useful for cases where, for example, we want to build a part
of a component separately. Using a STATIC target would incur the risk
that symbols not referenced in the consumer would be dropped (which may
be undesirable).

The current application is the ML part of Analysis. It should be part
of the Analysis component, so it may reference other analyses; and (in
upcoming changes) it has dependencies on optional libraries.

Reviewers: karies, davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81447
2020-06-15 12:01:43 -07:00
Kazu Hirata 347a599e5f [Inlining] Introduce -enable-npm-pgo-inline-deferral
Summary:
Experiments show that inline deferral past pre-inlining slightly
pessimizes the performance.

This patch introduces an option to control inline deferral during PGO.
The option defaults to true for now (that is, NFC).

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80776
2020-06-04 00:40:58 -07:00
Vitaly Buka 232d348c6e [MTE] Convert StackSafety into analysis
This lets us to remove !stack-safe metadata and
better controll when to perform StackSafety
analysis.

Reviewers: eugenis

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80771
2020-06-02 16:08:14 -07:00
Eric Christopher c2bb26d861 NFC: Simplify O1 pass pipeline construction.
Pull O1 pass pipeline out into a separate function and simplify
buildFunctionSimplificationPipeline accordingly.
2020-05-29 20:08:22 -07:00
Eric Christopher c554c5e159 Fix full unrolling with new pass manager.
Last we looked at this and couldn't come up with a reason to change
it, but with a pragma for full loop unrolling we bypass every other
loop unroll and then fail to fully unroll a loop when the pragma is set.

Move the OnlyWhenForced out of the check and into the initialization
of the full unroll pass in the new pass manager. This doesn't show up
with the old pass manager.

Add a new option to opt so that we can turn off loop unrolling
manually since this is a difference between clang and opt.

Tested with check-clang and check-llvm.
2020-05-29 20:08:21 -07:00
Arthur Eubanks 1285e8bcac Run Coverage pass before other *San passes under new pass manager, round 2
Summary:
This was attempted once before in https://reviews.llvm.org/D79698, but
was reverted due to the coverage pass running in the wrong part of the
pipeline. This commit puts it in the same place as the other sanitizers.

This changes PassBuilder.OptimizerLastEPCallbacks to work on a
ModulePassManager instead of a FunctionPassManager. That is because
SanitizerCoverage cannot (easily) be split into a module pass and a
function pass like some of the other sanitizers since in its current
implementation it conditionally inserts module constructors based on
whether or not it successfully modified functions.

This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass
manager (last check-msan test).

Currently sanitizers + LTO don't work together under the new pass
manager, so I removed tests that checked that this combination works for
sancov.

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80692
2020-05-28 17:04:47 -07:00
Arthur Eubanks e3fb8446f2 Revert "Run Coverage pass before other *San passes under new pass manager, round 2"
This reverts commit 922fa2fce3.
2020-05-28 14:38:05 -07:00
Arthur Eubanks 922fa2fce3 Run Coverage pass before other *San passes under new pass manager, round 2
Summary:
This was attempted once before in https://reviews.llvm.org/D79698, but
was reverted due to the coverage pass running in the wrong part of the
pipeline. This commit puts it in the same place as the other sanitizers.

This changes PassBuilder.OptimizerLastEPCallbacks to work on a
ModulePassManager instead of a FunctionPassManager. That is because
SanitizerCoverage cannot (easily) be split into a module pass and a
function pass like some of the other sanitizers since in its current
implementation it conditionally inserts module constructors based on
whether or not it successfully modified functions.

This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass
manager (last check-msan test).

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80692
2020-05-28 14:25:23 -07:00
Fangrui Song be6bffe729 [CMake] Revert cf86a234ba
It is unnecessary after 993bbaf6a3
2020-05-27 15:29:22 -07:00
Fangrui Song 993bbaf6a3 [MLPolicies] Fix dependency and -DBUILD_SHARED_LIBS=on builds after D80579 2020-05-27 15:26:13 -07:00
Mircea Trofin cf86a234ba Fix shared libs build break introduced in rG98ef93eabd76 2020-05-27 15:12:16 -07:00
Mircea Trofin 98ef93eabd [llvm] Add function feature extraction analysis
Summary:
This patch introduces an analysis pass to extract function features,
which will be needed by the ML InlineAdvisor.

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, dblaikie, jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80579
2020-05-27 13:38:50 -07:00
Sanjay Patel 57bb4787d7 [Pass Manager] remove EarlyCSE as clean-up for VectorCombine
EarlyCSE was added with D75145, but the motivating test is
not regressed by removing the extra pass now. That might be
because VectorCombine altered the way it processes instructions,
or it might be from (re)moving VectorCombine in the pipeline.

The extra round of EarlyCSE appears to cost approximately
0.26% in compile-time as discussed in D80236, so we need some
evidence to justify its inclusion here, but we do not have
that (yet).

I suspect that between SLP and VectorCombine, we are creating
patterns that InstCombine and/or codegen are not prepared for,
but we will need to reduce those examples and include them as
PhaseOrdering and/or test-suite benchmarks.
2020-05-24 12:36:21 -04:00
Sanjay Patel 6438ea45e0 [VectorCombine] position pass after SLP in the optimization pipeline rather than before
There are 2 known problem patterns shown in the test diffs here:
vector horizontal ops (an x86 specialization) and vector reductions.

SLP has greater ability to match and fold those than vector-combine,
so let SLP have first chance at that.

This is a quick fix while we continue to improve vector-combine and
possibly canonicalize to reduction intrinsics.

In the longer term, we should improve matching of these patterns
because if they were created in the "bad" forms shown here, then we
would miss optimizing them.

I'm not sure what is happening with alias analysis on the addsub test.
The old pass manager now shows an extra line for that, and we see an
improvement that comes from SLP vectorizing a store. I don't know
what's missing with the new pass manager to make that happen.
Strangely, I can't reproduce the behavior if I compile from C++ with
clang and invoke the new PM with "-fexperimental-new-pass-manager".

Differential Revision: https://reviews.llvm.org/D80236
2020-05-22 12:22:44 -04:00
Juneyoung Lee d9a4a24413 Add CanonicalizeFreezeInLoops pass
Summary:
If an induction variable is frozen and used, SCEV yields imprecise result
because it doesn't say anything about frozen variables.

Due to this reason, performance degradation happened after
https://reviews.llvm.org/D76483 is merged, causing
SCEV yield imprecise result and preventing LSR to optimize a loop.

The suggested solution here is to add a pass which canonicalizes frozen variables
inside a loop. To be specific, it pushes freezes out of the loop by freezing
the initial value and step values instead & dropping nsw/nuw flags from instructions used by freeze.
This solution was also mentioned at https://reviews.llvm.org/D70623 .

Reviewers: spatel, efriedma, lebedev.ri, fhahn, jdoerfert

Reviewed By: fhahn

Subscribers: nikic, mgorny, hiraditya, javed.absar, llvm-commits, sanwou01, nlopes

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77523
2020-05-21 09:29:29 +09:00
Mircea Trofin d6695e1876 [llvm] Add interface to drive inlining decision using ML model
Summary:

This change introduces InliningAdvisor (and related APIs), the interface
that abstracts decision making away from the inlining pass. We will use
this interface to delegate decision making to a trained ML model,
subsequently (see referenced RFC).

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, eraman, dblaikie

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79042
2020-05-13 13:27:29 -07:00
Whitney Tsang 5c10c6e012 [PassBuilder] Moved ProfileSummaryAnalysis in buildInlinerPipeline.
Summary:
As commented in the code, ProfileSummaryAnalysis is required for inliner
pass to query, so this patch moved
RequireAnalysisPass<ProfileSummaryAnalysis> in the recently created
buildInlinerPipeline.
Reviewer: mtrofin, davidxl, tejohnson, dblaikie, jdoerfert, sstefan1
Reviewed By: mtrofin, davidxl, jdoerfert
Subscribers: hiraditya, steven_wu, dexonsmith, wuzish, llvm-commits,
jsji
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D79696
2020-05-12 16:00:40 +00:00
Eric Christopher a42e53cccf Fix typos encountered while working on pass pipeline for O1. 2020-05-12 00:45:15 -07:00
OCHyams da100de0a6 [NFC][DwarfDebug] Add test for variables with a single location which
don't span their entire scope.

The previous commit (6d1c40c171) is an older version of the test.

Reviewed By: aprantl, vsk

Differential Revision: https://reviews.llvm.org/D79573
2020-05-11 11:49:11 +02:00
Mircea Trofin c3770c5d6d [llvm][NFC] Factor out inlining pipeline as a module pipeline.
Summary:
This simplifies testing in scenarios where we want to set up module-wide
analyses for inlining. The patch enables treating inlining and its
function cleanups, as a module pass. The alternative would be for tests
to describe the pipeline, which is tedious and adds maintenance
overhead.

Reviewers: davidxl, dblaikie, jdoerfert, sstefan1

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78512
2020-04-24 09:24:12 -07:00
Johannes Doerfert c5794f77eb [Attributor][PM] Introduce `-attributor-enable={none,cgscc,module,all}`
The old command line option `-attributor-disable` was too coarse grained
as we want to measure the effects of the module or cgscc pass without
the other as well.

Since `none` is the default there is no real functional change.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78571
2020-04-21 15:22:10 -05:00
Simon Pilgrim a140a9b112 [cmake] LLVMPasses - add include/llvm header path
Pick up all the Pass headers in the root for MSVC projects
2020-04-18 12:31:41 +01:00
Teresa Johnson 33ffb62e23 Allow disabling of vectorization using internal options
Summary:
Currently, the internal options -vectorize-loops, -vectorize-slp, and
-interleave-loops do not have much practical effect. This is because
they are used to initialize the corresponding flags in the pass
managers, and those flags are then unconditionally overwritten when
compiling via clang or via LTO from the linkers. The only exception was
-vectorize-loops via opt because of some special hackery there.

While vectorization could still be disabled when compiling via clang,
using -fno-[slp-]vectorize, this meant that there was no way to disable
it when compiling in LTO mode via the linkers. This only affected
ThinLTO, since for regular LTO vectorization is done during the compile
step for scalability reasons. For ThinLTO it is invoked in the LTO
backends. See also the discussion on PR45434.

This patch makes it so the internal options can actually be used to
disable these optimizations. Ultimately, the best long term solution is
to mark the loops with metadata (similar to the approach used to fix
-fno-unroll-loops in D77058), but this enables a shorter term
workaround, and actually makes these internal options useful.

I constant propagated the initial values of these internal flags into
the pass manager flags (for some reasons vectorize-loops and
interleave-loops were initialized to true, while vectorize-slp was
initialized to false). As mentioned above, they are overwritten
unconditionally so this doesn't have any real impact, and these initial
values aren't particularly meaningful.

I then changed the passes to check the internl values and return without
performing the associated optimization when false (I changed the default
of -vectorize-slp to true so the options behave similarly). I was able
to remove the hackery in opt used to get -vectorize-loops=false to work,
as well as a special option there used to disable SLP vectorization.

Finally, I changed thinlto-slp-vectorize-pm.c to:
a) Only test SLP (moved the loop vectorization checking to a new test).
b) Use code that is slp vectorized when it is enabled, and check that
instead of whether the pass is enabled.
c) Test the new behavior of -vectorize-slp.
d) Test both pass managers.

The loop vectorization (and associated interleaving) testing I moved to
a new thinlto-loop-vectorize-pm.c test, with several changes:
a) Changed the flags on the interleaving testing so that it will
actually interleave, and check that.
b) Test the new behavior of -vectorize-loops and -interleave-loops.
c) Test both pass managers.

Reviewers: fhahn, wmi

Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, davezarzycki, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77989
2020-04-14 18:09:10 -07:00
Mehdi Amini 384ca190ae Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object"
This reverts commit 10df1563d6.

Some buildbots are broken.
2020-04-14 00:27:08 +00:00
Mehdi Amini 10df1563d6 Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.

Differential Revision: https://reviews.llvm.org/D77994
2020-04-13 23:12:11 +00:00
Masoud Ataei jaliseh 9ed0612cca Add InjectTLIMappings pass to new pass manager
This pass is created in d6de5f12d4 and tested
for new and legacy pass manager but never added to new pass manager pipeline.
I am adding it to new pass manager pipeline.

This pass is get used in Vector Function Database (VFDatabase) and without
this pass in new pass manager pipeline, none of the vector libraries are work
ing with new pass manager.

Related passes:
66c120f025
https://reviews.llvm.org/D74944

Differential revision: https://reviews.llvm.org/D75354
2020-04-06 13:16:48 -05:00
Tarindu Jayatilaka b43b59fcc0 Expose `attributor-disable` to the new and old pass managers
The new and old pass managers (PassManagerBuilder.cpp and
PassBuilder.cpp) are exposed to an `extern` declaration of
`attributor-disable` option which will guard the addition of the
attributor passes to the pass pipelines.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76871
2020-04-05 22:29:34 -05:00
Tyker c00cb76274 [NFC] Split Knowledge retention and place it more appropriatly
Summary:
Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils
allows Queries and Transform/Utils to use Analysis.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77171
2020-04-02 15:01:41 +02:00
zhizhouy 94d912296d [NFC] Do not run CGProfilePass when not using integrated assembler
Summary:
CGProfilePass is run by default in certain new pass manager optimization pipeline. Assemblers other than llvm as (such as gnu as) cannot recognize the .cgprofile entries generated and emitted from this pass, causing build time error.

This patch adds new options in clang CodeGenOpts and PassBuilder options so that we can turn cgprofile off when not using integrated assembler.

Reviewers: Bigcheese, xur, george.burgess.iv, chandlerc, manojgupta

Reviewed By: manojgupta

Subscribers: manojgupta, void, hiraditya, dexonsmith, llvm-commits, tcwang, llozano

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D62627
2020-03-31 10:31:31 -07:00
Evgenii Stepanov 2a3723ef11 [memtag] Plug in stack safety analysis.
Summary:
Run StackSafetyAnalysis at the end of the IR pipeline and annotate
proven safe allocas with !stack-safe metadata. Do not instrument such
allocas in the AArch64StackTagging pass.

Reviewers: pcc, vitalybuka, ostannard

Reviewed By: vitalybuka

Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, gilang, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73513
2020-03-16 16:35:25 -07:00
Tyker a4cde9ad7b Fixed [AssumeBundles] Move to IR so it can be used by Analysis
This is a recommit of 57c964aaa7
after fixing modules build.
2020-03-10 18:02:39 +01:00
Jonas Devlieghere 882f589e20 Revert "[AssumeBundles] Move to IR so it can be used by Analysis"
This breaks the modules build:

http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/
http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/

This reverts commit 57c964aaa7.
2020-03-09 09:02:47 -07:00
Tyker 57c964aaa7 [AssumeBundles] Move to IR so it can be used by Analysis
Summary:
Assume bundles need to be usable by Analysis and Transforms/Utils isn't.
so this commit moves utilities to deal with asusme bundles to IR.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75618
2020-03-08 12:21:50 +01:00
Sanjay Patel 71a316883d [PassManager] adjust VectorCombine placement
The initial placement of vector-combine in the opt pipeline revealed phase ordering bugs:
https://bugs.llvm.org/show_bug.cgi?id=45015
https://bugs.llvm.org/show_bug.cgi?id=42022

This patch contains a few independent changes:

1. Move the pass up in the pipeline, so it happens just after loop-vectorization.
   This is only to keep vectorization passes together in the pipeline at the moment.
   I don't have evidence of interaction between these yet.
2. Add an -early-cse pass after -vector-combine to clean up redundant ops. This was
   partly proposed as far back as rL219644 (which is why it's effectively being moved
   in the old PM code). This is important because the subsequent -instcombine doesn't
   work as well without EarlyCSE. With the CSE, -instcombine is able to squash
   shuffles together in 1 of the tests (because those are simple "select" shuffles).
3. Remove the -vector-combine pass that was running after SLP. We may want to do that
   eventually, but I don't have a test case to support it yet.

Differential Revision: https://reviews.llvm.org/D75145
2020-03-04 11:10:49 -05:00
Whitney Tsang c84532a70a [LoopNest]: Analysis to discover properties of a loop nest.
Summary: This patch adds an analysis pass to collect loop nests and
summarize properties of the nest (e.g the nest depth, whether the nest
is perfect, what's the innermost loop, etc...).

The motivation for this patch was discussed at the latest meeting of the
LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we
discussed
the unimodular loop transformation framework ( “A Loop Transformation
Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and
Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework
provides a convenient way to unify legality checking and code generation
for several loop nest transformations (e.g. loop reversal, loop
interchange, loop skewing) and their compositions. Given that the
unimodular framework is applicable to perfect loop nests this is one
property of interest we expose in this analysis. Several other utility
functions are also provided. In the future other properties of interest
can be added in a centralized place.
Authored By: etiotto
Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn,
reames, hfinkel, jdoerfert, ppc-slack
Reviewed By: Meinersbur
Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D68789
2020-03-03 18:25:19 +00:00
Whitney Tsang 613f791131 Revert "[LoopNest]: Analysis to discover properties of a loop nest."
This reverts commit 3a063d68e3.

Broke the build with modules enabled:
http://green.lab.llvm.org/green/job/lldb-cmake/10655/console .
2020-03-03 14:07:49 +00:00
Whitney Tsang 3a063d68e3 [LoopNest]: Analysis to discover properties of a loop nest.
Summary: This patch adds an analysis pass to collect loop nests and
summarize properties of the nest (e.g the nest depth, whether the nest
is perfect, what's the innermost loop, etc...).

The motivation for this patch was discussed at the latest meeting of the
LLVM loop group (https://ibm.box.com/v/llvm-loop-nest-analysis) where we
discussed
the unimodular loop transformation framework ( “A Loop Transformation
Theory and an Algorithm to Maximize Parallelism”, Michael E. Wolf and
Monica S. Lam, IEEE TPDS, October 1991). The unimodular framework
provides a convenient way to unify legality checking and code generation
for several loop nest transformations (e.g. loop reversal, loop
interchange, loop skewing) and their compositions. Given that the
unimodular framework is applicable to perfect loop nests this is one
property of interest we expose in this analysis. Several other utility
functions are also provided. In the future other properties of interest
can be added in a centralized place.
Authored By: etiotto
Reviewer: Meinersbur, bmahjour, kbarton, Whitney, dmgreen, fhahn,
reames, hfinkel, jdoerfert, ppc-slack
Reviewed By: Meinersbur
Subscribers: bryanpkc, ppc-slack, mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D68789
2020-03-03 13:25:28 +00:00
Teresa Johnson 80bf137fa1 Revert "Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP""
This reverts commit 80d0a137a5, and the
follow on fix in 873c0d0786. It is
causing test failures after a multi-stage clang bootstrap. See
discussion on D73242 and D75201.
2020-03-02 14:02:13 -08:00
Jun Ma 624dbfcc1b [Coroutines][New pass manager] Move CoroElide pass to right position
Differential Revision: https://reviews.llvm.org/D75345
2020-03-01 21:48:24 +08:00
Jun Ma 44d83671c5 Revert "[Coroutines][new pass manager] Move CoroElide pass to right position"
This reverts commit 4c0a133a41.
2020-03-01 21:37:41 +08:00
Jun Ma 4c0a133a41 [Coroutines][new pass manager] Move CoroElide pass to right position
Differential Revision: https://reviews.llvm.org/D75345
2020-03-01 20:55:38 +08:00
Hongtao Yu bae33a7c5a IR printing for single function with the new pass manager.
Summary:
The IR printing always prints out all functions in a module with the new pass manager, even with -filter-print-funcs specified. This is being fixed in this change. However, there are two exceptions, i.e, with user-specified wildcast switch -filter-print-funcs=* or -print-module-scope, under which IR of all functions should be printed.

Test Plan:
make check-clang
make check-llvm

Reviewers: wenlei

Reviewed By: wenlei

Subscribers: wenlei, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74814
2020-02-23 15:28:57 -08:00
Brian Gesiak 72961071f3 [Coroutines][5/6] Add coroutine passes to pipeline
Summary:
Depends on https://reviews.llvm.org/D71901.

The fifth in a series of patches that ports the LLVM coroutines passes
to the new pass manager infrastructure.

The first 4 patches allow users to run coroutine passes by invoking, for
example `opt -passes=coro-early`. However, most of LLVM's tests for
coroutines use an option, `opt -enable-coroutines`, which adds all 4
coroutine passes to the appropriate legacy pass manager extension points.
This patch does the same, but using the new pass manager: when
coroutine features are enabled and the new pass manager is being used,
this adds the new-pass-manager-compliant coroutine passes to the pass
builder's pipeline.

This allows us to run all coroutine tests using the new pass manager
(besides those that use the coroutine retcon ABI used by the Swift
compiler, which is not yet supported in the new pass manager).

Reviewers: GorNishanov, lewissbaker, chandlerc, junparser, wenlei

Subscribers: wenlei, EricWF, Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71902
2020-02-19 00:57:14 -05:00
Brian Gesiak 5a187d8ed1 [Coroutines][4/6] New pass manager: coro-cleanup
Summary:
Depends on https://reviews.llvm.org/D71900.

The fourth in a series of patches that ports the LLVM coroutines passes
to the new pass manager infrastructure. This patch implements
'coro-cleanup'.

No existing regression tests check the behavior of coro-cleanup on its
own, so this patch adds one. (A test named 'coro-cleanup.ll' exists, but
it relies on the entire coroutines pipeline being run. It's updated to
test the new pass manager in the 5th patch of this series.)

Reviewers: GorNishanov, lewissbaker, chandlerc, junparser, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: wenlei, EricWF, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71901
2020-02-19 00:30:27 -05:00
Brian Gesiak 2365238b9d Re-land new pass manager coro-split and coro-elide
This re-applies patches https://reviews.llvm.org/D71899 and
https://reviews.llvm.org/D71900, which were reverted in
https://reviews.llvm.org/rG11053a1cc61 and
https://reviews.llvm.org/rGe999aa38d16. The underlying problem that
caused two buildbots to fail with these patches is explained in
https://reviews.llvm.org/rG26f356350bd -- older compliers disagree with
the order in which the left- and right-hand side of an assignment in
LazyCallGraph ought to be evaluated, which caused an assertion in
SmallVector::operator[] to fire when the test suite was run.
2020-02-19 00:11:23 -05:00
Brian Gesiak 11053a1cc6 Revert new pass manager coro-split and coro-elide
This reverts
https://reviews.llvm.org/rG7125d66f9969605d886b5286780101a45b5bed67 and
https://reviews.llvm.org/rG00fec8004aca6588d8d695a2c3827c3754c380a0 due
to buildbot failures:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/34004
2020-02-17 23:55:10 -05:00
Brian Gesiak 00fec8004a [Coroutines][3/6] New pass manager: coro-elide
Summary:
Depends on https://reviews.llvm.org/D71899.

The third in a series of patches that ports the LLVM coroutines passes
to the new pass manager infrastructure. This patch implements 'coro-elide'.

The new pass manager infrastructure does not implicitly repeat CGSCC
pass pipelines when a function is devirtualized, and so the tests
for the new pass manager that rely on that behavior now explicitly
specify `repeat<2>`.

Reviewers: GorNishanov, lewissbaker, chandlerc, jdoerfert, junparser, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: wenlei, EricWF, Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71900
2020-02-17 23:41:57 -05:00
Brian Gesiak 7125d66f99 [Coroutines][2/6] New pass manager: coro-split
Summary:
This patch has four dependencies:

1. The first in this series of patches that implement coroutine passes in the
   new pass manager: https://reviews.llvm.org/D71898.
2. A patch that introduces an API for CGSCC passes to add new reference
   edges to a `LazyCallGraph`, `updateCGAndAnalysisManagerForCGSCCPass`:
   https://reviews.llvm.org/D72025.
3. A patch that introduces a `CallGraphUpdater` helper class that is
   capable of mutating internal `LazyCallGraph` state in order to insert
   new function nodes into a specific SCC: https://reviews.llvm.org/D70927.
4. And finally, a small edge case fix for updating `LazyCallGraph` that
   patch 3 above happens to run into: https://reviews.llvm.org/D72226.

This is the second in a series of patches that ports the LLVM coroutines
passes to the new pass manager infrastructure. This patch implements
'coro-split'.

Some notes:
* Using the new CGSCC pass manager resulted in IR being printed in the
  reverse order in some tests. To prevent FileCheck checks from failing due
  to these reversed orders, this patch splits up test files that test
  multiple different coroutine functions: specifically
  coro-alloc-with-param.ll, coro-split-eh.ll, and coro-eh-aware-edge-split.ll.
* CoroSplit.cpp contained 2 overloads of `splitCoroutine`, one of which
  dispatched to the other based on the coroutine ABI being used (C++20
  switch-based versus Swift returned-continuation-based). I found this
  confusing, especially with the additional branching based on `CallGraph`
  vs. `LazyCallGraph`, so I removed the ABI-checking overload of
  `splitCoroutine`.

Reviewers: GorNishanov, lewissbaker, chandlerc, jdoerfert, junparser, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: wenlei, qcolombet, EricWF, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71899
2020-02-17 23:35:27 -05:00
Brian Gesiak e9849d5195 [Coroutines][1/6] New pass manager: coro-early
Summary:
The first in a series of patches that ports the LLVM coroutines passes
to the new pass manager infrastructure. This patch implements
'coro-early'.

NB: All coroutines passes begin by checking that coroutine intrinsics are
declared within the LLVM IR module they're operating on. To do so, they call
`coro::declaresIntrinsics`. The next 3 patches in this series, which add new
pass manager implementations of the 'coro-split', 'coro-elide', and
'coro-cleanup' passes, use a similar pattern as the one used here: a static
function is shared across both old and new passes to detect if relevant
coroutine intrinsics are delcared. To make this pattern easier to read, this
patch adds `const` keywords to the parameters of `coro::declaresIntrinsics`.

Reviewers: GorNishanov, lewissbaker, junparser, chandlerc, deadalnix, wenlei

Reviewed By: wenlei

Subscribers: ychen, wenlei, EricWF, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71898
2020-02-17 13:27:48 -05:00
Teresa Johnson 80d0a137a5 Restore "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This restores commit 748bb5a0f1, along
with a fix for a Chromium test suite build issue (and a new test for
that case).

Differential Revision: https://reviews.llvm.org/D73242
2020-02-11 10:48:05 -08:00
Sanjay Patel a17f03bd93 [VectorCombine] new IR transform pass for partial vector ops
We have several bug reports that could be characterized as "reducing scalarization",
and this topic was also raised on llvm-dev recently:
http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html
...so I'm proposing that we deal with these patterns in a new, lightweight IR vector
pass that runs before/after other vectorization passes.

There are 4 alternate options that I can think of to deal with this kind of problem
(and we've seen various attempts at all of these), but they all have flaws:

    InstCombine - can't happen without TTI, but we don't want target-specific
                  folds there.
    SDAG - too late to assist other vectorization passes; TLI is not equipped
           for these kind of cost queries; limited to a single basic block.
    CGP - too late to assist other vectorization passes; would need to re-implement
          basic cleanups like CSE/instcombine.
    SLP - doesn't fit with existing transforms; limited to a single basic block.

This initial patch/transform is based on existing code in AggressiveInstCombine:
we walk backwards through the function looking for a pattern match. But we diverge
from that cost-independent IR canonicalization pass by using TTI to decide if the
vector alternative is profitable.

We probably have at least 10 similar bug reports/patterns (binops, constants,
inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements.
It's possible that we could iterate on a worklist to fix-point like InstCombine does,
but it's safer to start with a most basic case and evolve from there, so I didn't
try to do anything fancy with this initial implementation.

Differential Revision: https://reviews.llvm.org/D73480
2020-02-09 10:04:41 -05:00
Johannes Doerfert b0c77c36d2 [Attributor] Add an Attributor CGSCC pass and run it
In addition to the module pass, this patch introduces a CGSCC pass that
runs the Attributor on a strongly connected component of the call graph
(both old and new PM). The Attributor was always design to be used on a
subset of functions which makes this patch mostly mechanical.

The one change is that we give up `norecurse` deduction in the module
pass in favor of doing it during the CGSCC pass. This makes the
interfaces simpler but can be revisited if needed.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D70767
2020-02-08 21:27:34 -06:00
Johannes Doerfert 9548b74a83 [OpenMP] Introduce the OpenMPOpt transformation pass
The OpenMPOpt pass is a CGSCC pass in which OpenMP specific
optimizations can reside.

The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime
calls and their uses. This allows targeted transformations and eases
their implementation.

This initial patch deduplicates `__kmpc_global_thread_num` and
`omp_get_thread_num` calls. We can also identify arguments that are
equivalent to such a call result and use it instead. Later we can
determine "gtid" arguments based on the use in kernel functions etc.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69930
2020-02-08 14:47:03 -06:00
Teresa Johnson 25aa2eef99 Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This reverts commit 748bb5a0f1.

Due to Chromium CFI+ThinLTO test crashes reported on patch.
2020-02-05 19:27:32 -08:00
Alina Sbirlea 67904db23c [IRCE] Make IRCE a Function pass.
Summary: Make InductiveRangeCheckElimination a FunctionPass.

Reviewers: reames, mkazantsev

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73592
2020-02-05 09:22:41 -08:00
Teresa Johnson 748bb5a0f1 [WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP
Summary:
Currently type test assume sequences inserted for devirtualization are
removed during WPD. This patch delays their removal until later in the
optimization pipeline. This is an enabler for upcoming enhancements to
indirect call promotion, for example streamlined promotion guard
sequences that compare against vtable address instead of the target
function, when there are small number of possible vtables (either
determined via WPD or by in-progress type profiling). We need the type
tests to correlate the callsites with the address point offset needed in
the compare sequence, and optionally to associated type summary info
computed during WPD.

This depends on work in D71913 to enable invocation of LowerTypeTests to
drop type test assume sequences, which will now be invoked following ICP
in the ThinLTO post-LTO link pipelines, and also after the existing
export phase LowerTypeTests invocation in regular LTO (which is already
after ICP). We cannot simply move the existing import phase
LowerTypeTests pass later in the ThinLTO post link pipelines, as the
comment in PassBuilder.cpp notes (it must run early because when
performing CFI other passes may disturb the sequences it looks for).

This necessitated adding a new type test resolution "Unknown" that we
can use on the type test assume sequences previously removed by WPD,
that we now want LTT to ignore.

Depends on D71913.

Reviewers: pcc, evgeny777

Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73242
2020-02-05 08:59:48 -08:00
Tyker a7bbe45a3e Build assume from call
Fix attempt

this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.
2020-02-02 19:43:36 +01:00
Tyker 7cb5d96fbe Revert "[WIP] Build assume from call"
casued buildbot failure

This reverts commit 8ebe001553.
2020-02-02 18:35:19 +01:00
Tyker 8ebe001553 [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 18:15:50 +01:00
Tyker c2d0336208 Revert "[WIP] Build assume from call"
caused build bot failure

This reverts commit 780d2c532f.
2020-02-02 18:09:06 +01:00
Tyker 780d2c532f [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 17:54:31 +01:00
Tyker ad8ffc5010 Revert "[WIP] Build assume from call"
This reverts commit 355e4bfd78.
2020-02-02 17:49:23 +01:00
Tyker 355e4bfd78 [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 17:17:46 +01:00
Tyker 0adda3df92 Revert "[WIP] Build assume from call"
This reverts commit 2ff5602cb5.
2020-02-02 15:05:33 +01:00
Tyker 2ff5602cb5 [WIP] Build assume from call
Summary:
this is part of the implementation of http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

this patch gives the basis of building an assume to preserve all information from an instruction and add support for building an assume that preserve the information from a call.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: mgrang, fhahn, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72475
2020-02-02 14:50:31 +01:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Fedor Sergeev 1f2dad1fd5 [GVN] add GVN parameters parsing to new pass manager
Introduce parsing, add a few instances of parameter use into GVN-PRE tests.

Reviewers: skatkov, asbirlea
Reviewed By: skatkov

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72752
2020-01-16 23:53:46 +03:00
Mircea Trofin 7acfda633f [llvm] Make new pass manager's OptimizationLevel a class
Summary:
The old pass manager separated speed optimization and size optimization
levels into two unsigned values. Coallescing both in an enum in the new
pass manager may lead to unintentional casts and comparisons.

In particular, taking a look at how the loop unroll passes were constructed
previously, the Os/Oz are now (==new pass manager) treated just like O3,
likely unintentionally.

This change disallows raw comparisons between optimization levels, to
avoid such unintended effects. As an effect, the O{s|z} behavior changes
for loop unrolling and loop unroll and jam, matching O2 rather than O3.

The change also parameterizes the threshold values used for loop
unrolling, primarily to aid testing.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72547
2020-01-16 09:00:56 -08:00
Nikita Popov 410331869d [NewPM] Port MergeFunctions pass
This ports the MergeFunctions pass to the NewPM. This was rather
straightforward, as no analyses are used.

Additionally MergeFunctions needs to be conditionally enabled in
the PassBuilder, but I left that part out of this patch.

Differential Revision: https://reviews.llvm.org/D72537
2020-01-14 20:55:41 +01:00
Kazu Hirata 6b686703e6 [Inlining] Add PreInlineThreshold for the new pass manager
Summary:
This patch makes it easy to try out different preinlining thresholds
with a command-line switch just like -preinline-threshold for the
legacy pass manager.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72618
2020-01-13 07:59:42 -08:00
Wei Mi 21a4710c67 [ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP
down to pass builder in ltobackend.

Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.

Differential Revision: https://reviews.llvm.org/D72386
2020-01-09 21:13:11 -08:00
Whitney Tsang d27a15fed7 [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function
pass.

Summary: This patch changes LoopUnrollAndJamPass to a function pass, and
keeps the loops traversal order same as defined in
FunctionToLoopPassAdaptor LoopPassManager.h.

The next patch will change the loop traversal to outer to inner order,
so more loops can be transform.

Discussion in llvm-dev mailing list:
https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: dmgreen
Subscribers: hiraditya, zzheng, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D72230
2020-01-09 16:18:36 +00:00
Florian Hahn 526244b187 [Matrix] Add first set of matrix intrinsics and initial lowering pass.
This is the first patch adding an initial set of matrix intrinsics and a
corresponding lowering pass. This has been discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html

The first patch introduces four new intrinsics (transpose, multiply,
columnwise load and store) and a LowerMatrixIntrinsics pass, that
lowers those intrinsics to vector operations.

Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix
embedded in a <16 x float> vector) and the intrinsics take the dimension
information as parameters. Those parameters need to be ConstantInt.
For the memory layout, we initially assume column-major, but in the RFC
we also described how to extend the intrinsics to support row-major as
well.

For the initial lowering, we split the input of the intrinsics into a
set of column vectors, transform those column vectors and concatenate
the result columns to a flat result vector.

This allows us to lower the intrinsics without any shape propagation, as
mentioned in the RFC. In follow-up patches, we plan to submit the
following improvements:
 * Shape propagation to eliminate the embedding/splitting for each
   intrinsic.
 * Fused & tiled lowering of multiply and other operations.
 * Optimization remarks highlighting matrix expressions and costs.
 * Generate loops for operations on large matrixes.
 * More general block processing for operation on large vectors,
   exploiting shape information.

We would like to add dedicated transpose, columnwise load and store
intrinsics, even though they are not strictly necessary. For example, we
could instead emit a large shufflevector instruction instead of the
transpose. But we expect that to
  (1) become unwieldy for larger matrixes (even for 16x16 matrixes,
      the resulting shufflevector masks would be huge),
  (2) risk instcombine making small changes, causing us to fail to
      detect the transpose, preventing better lowerings

For the load/store, we are additionally planning on exploiting the
intrinsics for better alias analysis.

Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D70456
2019-12-12 15:42:18 +00:00
Daniil Suchkov 259ca0418e [SCEV] Make SCEV verification available from command line with new PM
New pass manager doesn't use verifyAnalysis, so currently there is no
way to call SCEV verification from command line when new PM is used.
This patch adds a pass that allows you to do that.

Reviewers: reames, fhahn, sanjoy.google, nikic

Reviewed By: fhahn

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70423
2019-12-02 13:08:20 +07:00
Eric Christopher fd39b1bb20 Revert "Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.""
This reapplies: 8ff85ed905

Original commit message:

As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.

This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.

Test updates: Outside of the newpm tests most of the updates are coming from either
optimization passes not run anymore (and without a compelling argument at the moment)
that were largely used for canonicalization in clang.

Original post:

http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410

This reverts commit c9ddb02659.
2019-11-26 20:28:52 -08:00
Muhammad Omair Javaid c9ddb02659 Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."
This reverts commit 8ff85ed905.

This commit introduced 9 new failures on lldb buildbot host at http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu

Following tests were failing:
    lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq1/TestAmbiguousTailCallSeq1.py
    lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq2/TestAmbiguousTailCallSeq2.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_call_site/TestDisambiguateCallSite.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_paths_to_common_sink/TestDisambiguatePathsToCommonSink.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_tail_call_seq/TestDisambiguateTailCallSeq.py
    lldb-api :: functionalities/tail_call_frames/inlining_and_tail_calls/TestInliningAndTailCalls.py
    lldb-api :: functionalities/tail_call_frames/sbapi_support/TestTailCallFrameSBAPI.py
    lldb-api :: functionalities/tail_call_frames/thread_step_out_message/TestArtificialFrameStepOutMessage.py
    lldb-api :: functionalities/tail_call_frames/thread_step_out_or_return/TestSteppingOutWithArtificialFrames.py
    lldb-api :: functionalities/tail_call_frames/unambiguous_sequence/TestUnambiguousTailCalls.py

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
2019-11-26 09:32:13 +05:00
Eric Christopher 8ff85ed905 As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.

Test updates: Outside of the newpm tests most of the updates are coming from either
optimization passes not run anymore (and without a compelling argument at the moment)
that were largely used for canonicalization in clang.

Original post:

http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
2019-11-25 17:16:46 -08:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Francesco Petrogalli d6de5f12d4 [SVFS] Inject TLI Mappings in VFABI attribute.
This patch introduces a function pass to inject the scalar-to-vector
mappings stored in the TargetLIbraryInfo (TLI) into the Vector
Function ABI (VFABI) variants attribute.

The test is testing the injection for three vector libraries supported
by the TLI (Accelerate, SVML, MASSV).

The pass does not change any of the analysis associated to the
function.

Differential Revision: https://reviews.llvm.org/D70107
2019-11-15 18:42:56 +00:00
Reid Kleckner 4c1a1d3cf9 Add missing includes needed to prune LLVMContext.h include, NFC
These are a pre-requisite to removing #include "llvm/Support/Options.h"
from LLVMContext.h: https://reviews.llvm.org/D70280
2019-11-14 15:23:15 -08:00
Guozhi Wei cecc0d27ad [NewPM] Add an SROA pass after loop unroll
If there is a small local array accessed in a loop, SROA can't handle memory
accesses with variant offset inside a loop, after the loop is fully unrolled,
all memory accesses to the array are with fixed offset, so now they can be
processed by SROA. But there is no more SROA passes after loop unroll. This
patch add an SROA pass after loop unroll to handle this pattern.

Differential Revision: https://reviews.llvm.org/D68593
2019-11-01 14:59:08 -07:00
Saleem Abdulrasool 418d1ea555 PM: silence `-Wpessimizing-move` from GCC 9.2.1 (NFC)
Remove the explicit move enabling NVRO.
2019-10-27 18:33:09 -04:00
Eric Christopher c3649a0871 In the new pass manager use PTO.LoopUnrolling to determine when and how
we will unroll loops. Also comment a few occasions where we need to
know whether or not we're forcing the unwinder or not.

The default before and after this patch is for LoopUnroll to be enabled,
and for it to use a cost model to determine whether to unroll the loop
(`OnlyWhenForced = false`). Before this patch, disabling loop unroll
would not run the LoopUnroll pass. After this patch, the LoopUnroll pass
is being run, but it restricts unrolling to only the loops marked by a
pragma (`OnlyWhenForced = true`).

In addition, this patch disables the UnrollAndJam pass when disabling unrolling.

Testcase is in clang because it's controlling how the loop optimizer
is being set up and there's no other way to trigger the behavior.

llvm-svn: 374838
2019-10-14 22:56:07 +00:00
Joerg Sonnenberger 9681ea9560 Reapply r374743 with a fix for the ocaml binding
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374784
2019-10-14 16:15:14 +00:00
Dmitri Gribenko 1a21f98ac3 Revert "Add a pass to lower is.constant and objectsize intrinsics"
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

llvm-svn: 374768
2019-10-14 12:22:48 +00:00
Joerg Sonnenberger e4300c392d Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374743
2019-10-13 23:00:15 +00:00
Vitaly Buka b46dd6e92a Insert module constructors in a module pass
Summary:
If we insert them from function pass some analysis may be missing or invalid.
Fixes PR42877.

Reviewers: eugenis, leonardchan

Reviewed By: leonardchan

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68832

> llvm-svn: 374481
Signed-off-by: Vitaly Buka <vitalybuka@google.com>

llvm-svn: 374527
2019-10-11 08:47:03 +00:00
Nico Weber d38332981f Revert 374481 "[tsan,msan] Insert module constructors in a module pass"
CodeGen/sanitizer-module-constructor.c fails on mac and windows, see e.g.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11424

llvm-svn: 374503
2019-10-11 02:44:20 +00:00
Vitaly Buka 5c72aa232e [tsan,msan] Insert module constructors in a module pass
Summary:
If we insert them from function pass some analysis may be missing or invalid.
Fixes PR42877.

Reviewers: eugenis, leonardchan

Reviewed By: leonardchan

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68832

llvm-svn: 374481
2019-10-10 23:49:10 +00:00
Yuanfang Chen cc382cf727 [NewPM] Port MachineModuleInfo to the new pass manager.
Existing clients are converted to use MachineModuleInfoWrapperPass. The
new interface is for defining a new pass manager API in CodeGen.

Reviewers: fedor.sergeev, philip.pfaffe, chandlerc, arsenm

Reviewed By: arsenm, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D64183

llvm-svn: 373240
2019-09-30 17:54:50 +00:00
Thomas Preud'homme 5f738940b5 Regex: Make "match" and "sub" const member functions
Summary:
The Regex "match" and "sub" member functions were previously not "const"
because they wrote to the "error" member variable. This commit removes
those assignments, and instead assumes that the validity of the regex
is already known after the initial compilation of the regular
expression. As a result, these member functions were possible to make
"const". This makes it easier to do things like pre-compile Regexes
up-front, and makes "match" and "sub" thread-safe. The error status is
now returned as an optional output, which also makes the API of "match"
and "sub" more consistent with each other.

Also, some uses of Regex that could be refactored to be const were made const.

Patch by Nicolas Guillemot

Reviewers: jankratochvil, thopre

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67241

llvm-svn: 372764
2019-09-24 14:42:36 +00:00
Serguei Katkov a44768858c [Unroll] Add an option to control complete unrolling
Add an ability to specify the max full unroll count for LoopUnrollPass pass
in pass options.

Reviewers: fhahn, fedor.sergeev
Reviewed By: fedor.sergeev
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D67701

llvm-svn: 372305
2019-09-19 06:57:29 +00:00
Bardia Mahjour db800c267d Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372238
2019-09-18 17:43:45 +00:00
Bardia Mahjour 6476d7cf0b Revert "Data Dependence Graph Basics"
This reverts commit c98ec60993, which broke the sphinx-docs build.

llvm-svn: 372168
2019-09-17 19:22:01 +00:00
Bardia Mahjour c98ec60993 Data Dependence Graph Basics
Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

llvm-svn: 372162
2019-09-17 18:55:44 +00:00
Teresa Johnson 9c27b59cec Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

llvm-svn: 371284
2019-09-07 03:09:36 +00:00
Denis Bakhvalov 58f172f05a [MergedLoadStoreMotion] Sink stores to BB with more than 2 predecessors
If we have:

bb5:
  br i1 %arg3, label %bb6, label %bb7

bb6:
  %tmp = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp, align 4
  br label %bb9

bb7:
  %tmp8 = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp8, align 4
  br label %bb9

bb9:  ; preds = %bb4, %bb6, %bb7
  ...

We can't sink stores directly into bb9.
This patch creates new BB that is successor of %bb6 and %bb7
and sinks stores into that block.

SplitFooterBB is the parameter to the pass that controls
that behavior.

Change-Id: I7fdf50a772b84633e4b1b860e905bf7e3e29940f
Differential: https://reviews.llvm.org/D66234
llvm-svn: 371089
2019-09-05 17:00:32 +00:00
Leonard Chan eca01b031d [NewPM][Sancov] Make Sancov a Module Pass instead of 2 Passes
This patch merges the sancov module and funciton passes into one module pass.

The reason for this is because we ran into an out of memory error when
attempting to run asan fuzzer on some protobufs (pc.cc files). I traced the OOM
error to the destructor of SanitizerCoverage where we only call
appendTo[Compiler]Used which calls appendToUsedList. I'm not sure where precisely
in appendToUsedList causes the OOM, but I am able to confirm that it's calling
this function *repeatedly* that causes the OOM. (I hacked sancov a bit such that
I can still create and destroy a new sancov on every function run, but only call
appendToUsedList after all functions in the module have finished. This passes, but
when I make it such that appendToUsedList is called on every sancov destruction,
we hit OOM.)

I don't think the OOM is from just adding to the SmallSet and SmallVector inside
appendToUsedList since in either case for a given module, they'll have the same
max size. I suspect that when the existing llvm.compiler.used global is erased,
the memory behind it isn't freed. I could be wrong on this though.

This patch works around the OOM issue by just calling appendToUsedList at the
end of every module run instead of function run. The same amount of constants
still get added to llvm.compiler.used, abd we make the pass usage and logic
simpler by not having any inter-pass dependencies.

Differential Revision: https://reviews.llvm.org/D66988

llvm-svn: 370971
2019-09-04 20:30:29 +00:00
Alina Sbirlea 7425179fee [LoopPassManager + MemorySSA] Only enable use of MemorySSA for LPMs known to preserve it.
Summary:
Add a flag to the FunctionToLoopAdaptor that allows enabling MemorySSA only for the loop pass managers that are known to preserve it.

If an LPM is known to have only loop transforms that *all* preserve MemorySSA, then use MemorySSA if `EnableMSSALoopDependency` is set.
If an LPM has loop passes that do not preserve MemorySSA, then the flag passed is `false`, regardless of the value of `EnableMSSALoopDependency`.

When using a custom loop pass pipeline via `passes=...`, use keyword `loop` vs `loop-mssa` to use MemorySSA in that LPM. If a loop that does not preserve MemorySSA is added while using the `loop-mssa` keyword, that's an error.

Add the new `loop-mssa` keyword to a few tests where a difference occurs when enabling MemorySSA.

Reviewers: chandlerc

Subscribers: mehdi_amini, Prazek, george.burgess.iv, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66376

llvm-svn: 369548
2019-08-21 17:00:57 +00:00
Whitney Tsang dd3b6498b0 Title: Loop Cache Analysis
Summary: Implement a new analysis to estimate the number of cache lines
required by a loop nest.
The analysis is largely based on the following paper:

Compiler Optimizations for Improving Data Locality
By: Steve Carr, Katherine S. McKinley, Chau-Wen Tseng
http://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf
The analysis considers temporal reuse (accesses to the same memory
location) and spatial reuse (accesses to memory locations within a cache
line). For simplicity the analysis considers memory accesses in the
innermost loop in a loop nest, and thus determines the number of cache
lines used when the loop L in loop nest LN is placed in the innermost
position.

The result of the analysis can be used to drive several transformations.
As an example, loop interchange could use it determine which loops in a
perfect loop nest should be interchanged to maximize cache reuse.
Similarly, loop distribution could be enhanced to take into
consideration cache reuse between arrays when distributing a loop to
eliminate vectorization inhibiting dependencies.

The general approach taken to estimate the number of cache lines used by
the memory references in the inner loop of a loop nest is:

Partition memory references that exhibit temporal or spatial reuse into
reference groups.
For each loop L in the a loop nest LN: a. Compute the cost of the
reference group b. Compute the 'cache cost' of the loop nest by summing
up the reference groups costs
For further details of the algorithm please refer to the paper.
Authored By: etiotto
Reviewers: hfinkel, Meinersbur, jdoerfert, kbarton, bmahjour, anemet,
fhahn
Reviewed By: Meinersbur
Subscribers: reames, nemanjai, MaskRay, wuzish, Hahnfeld, xusx595,
venkataramanan.kumar.llvm, greened, dmgreen, steleman, fhahn, xblvaOO,
Whitney, mgorny, hiraditya, mgrang, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63459

llvm-svn: 368439
2019-08-09 13:56:29 +00:00
Serguei Katkov de67affd00 [Loop Peeling] Introduce an option for profile based peeling disabling.
This patch adds an ability to disable profile based peeling 
causing the peeling of all iterations and as a result prohibits
further unroll/peeling attempts on that loop.

The motivation to get an ability to separate peeling usage in
pipeline where in the first part we peel only separate iterations if needed
and later in pipeline we apply the full peeling which will prohibit further peeling.

Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D64983

llvm-svn: 367668
2019-08-02 09:32:52 +00:00
Rong Xu ca161fa008 [PGO] Add PGO support at -O0 in the experimental new pass manager
Add PGO support at -O0 in the experimental new pass manager to sync the
behavior of the legacy pass manager.

Also change the test of gcc-flag-compatibility.c for more complete test:
(1) change the match string to "profc" and "profd" to ensure the
    instrumentation is happening.
(2) add IR format proftext so that PGO use compilation is tested.

Differential Revision: https://reviews.llvm.org/D64029

llvm-svn: 367628
2019-08-01 22:36:34 +00:00
Leonard Chan 007f674c6a Reland the "[NewPM] Port Sancov" patch from rL365838. No functional
changes were made to the patch since then.

--------

[NewPM] Port Sancov

This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

llvm-svn: 367053
2019-07-25 20:53:15 +00:00
Peter Collingbourne 3b82b92c6b hwasan: Initialize the pass only once.
This will let us instrument globals during initialization. This required
making the new PM pass a module pass, which should still provide access to
analyses via the ModuleAnalysisManager.

Differential Revision: https://reviews.llvm.org/D64843

llvm-svn: 366379
2019-07-17 21:45:19 +00:00
Leonard Chan bb147aabc6 Revert "[NewPM] Port Sancov"
This reverts commit 5652f35817.

llvm-svn: 366153
2019-07-15 23:18:31 +00:00
Leonard Chan 5652f35817 [NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into 2 SanitizerCoverage for passing over
  functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
  functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.

Differential Revision: https://reviews.llvm.org/D62888

llvm-svn: 365838
2019-07-11 22:35:40 +00:00
Philip Reames f47a313e71 Add a transform pass to make the executable semantics of poison explicit in the IR
Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language.

The target audience for this tool is developers working on or targetting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program.

At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there's a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs involved UB being triggered by integer induction variables with nsw/nuw flags would be reported by the current code.

(See comment in PoisonChecking.cpp for full explanation and context)

Differential Revision: https://reviews.llvm.org/D64215

llvm-svn: 365536
2019-07-09 18:49:29 +00:00
Leonard Chan 97dc622ab3 [clang][NewPM] Do not eliminate available_externally durng `-O2 -flto` runs
This fixes CodeGen/available-externally-suppress.c when the new pass manager is
turned on by default. available_externally was not emitted during -O2 -flto
runs when it should still be retained for link time inlining purposes. This can
be fixed by checking that we aren't LTOPrelinking when adding the
EliminateAvailableExternallyPass.

Differential Revision: https://reviews.llvm.org/D63580

llvm-svn: 363971
2019-06-20 19:44:51 +00:00
Johannes Doerfert aade782a98 [Attributor] Pass infrastructure and fixpoint framework
NOTE: Note that no attributes are derived yet. This patch will not go in
      alone but only with others that derive attributes. The framework is
      split for review purposes.

This commit introduces the Attributor pass infrastructure and fixpoint
iteration framework. Further patches will introduce abstract attributes
into this framework.

In a nutshell, the Attributor will update instances of abstract
arguments until a fixpoint, or a "timeout", is reached. Communication
between the Attributor and the abstract attributes that are derived is
restricted to the AbstractState and AbstractAttribute interfaces.

Please see the file comment in Attributor.h for detailed information
including design decisions and typical use case. Also consider the class
documentation for Attributor, AbstractState, and AbstractAttribute.

Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames

Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59918

llvm-svn: 362578
2019-06-05 03:02:24 +00:00
Alina Sbirlea d82ddfa7c3 [NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61612

llvm-svn: 361560
2019-05-23 21:52:59 +00:00
Alina Sbirlea e4b27869c6 [NewPassManager] Add tuning option: LoopUnrolling [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: jlebar, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61618

llvm-svn: 361540
2019-05-23 19:35:40 +00:00
Clement Courbet 43882b16a3 [MergeICmps] Make the pass compatible with the new pass manager.
Reviewers: gchatelet, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62287

llvm-svn: 361490
2019-05-23 12:35:26 +00:00
Joerg Sonnenberger ec6ee797ec Fix typos in comment.
llvm-svn: 360921
2019-05-16 18:01:57 +00:00
Leonard Chan 0cdd3b1d81 [NewPM] Port HWASan and Kernel HWASan
Port hardware assisted address sanitizer to new PM following the same guidelines as msan and tsan.

Changes:
- Separate HWAddressSanitizer into a pass class and a sanitizer class.
- Create new PM wrapper pass for the sanitizer class.
- Use the getOrINsert pattern for some module level initialization declarations.
- Also enable kernel-kwasan in new PM
- Update llvm tests and add clang test.

Differential Revision: https://reviews.llvm.org/D61709

llvm-svn: 360707
2019-05-14 21:17:21 +00:00
Petr Hosek 366cda03a8 [NewPM] Setup Passes for KASan and KMSan
While ASan and MSan passes were already ported to new PM, the kernel
variants weren't setup in the pipeline which makes the KASan and KMSan
tests in Clang fail.

Differential Revision: https://reviews.llvm.org/D61664

llvm-svn: 360313
2019-05-09 06:09:35 +00:00
Alina Sbirlea 458c7339e1 [NewPassManager] Add tuning option: SLPVectorization [NFC].
Summary: Mirror tuning option from old pass manager in new pass manager.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61616

llvm-svn: 360276
2019-05-08 17:58:35 +00:00
Eric Christopher 86e2f169bb Tidy up a comment, fix a typo, remove a comment that's obsolete.
llvm-svn: 359852
2019-05-03 00:15:23 +00:00
Teresa Johnson 867bc3951b [ThinLTO] Pass down opt level to LTO backend and handle -O0 LTO in new PM
Summary:
The opt level was not being passed down to the ThinLTO backend when
invoked via clang (for distributed ThinLTO).

This exposed an issue where the new PM was asserting if the Thin or
regular LTO backend pipelines were invoked with -O0 (not a new issue,
could be provoked by invoking in-process *LTO backends via linker using
new PM and -O0). Fix this similar to the old PM where -O0 only does the
necessary lowering of type metadata (WPD and LowerTypeTest passes) and
then quits, rather than asserting.

Reviewers: xur

Subscribers: mehdi_amini, inglorion, eraman, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits, pcc

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D61022

llvm-svn: 359025
2019-04-23 18:56:19 +00:00
Serguei Katkov 40a3b96196 [NewPM] Add Option handling for SimpleLoopUnswitch
This patch enables passing options to SimpleLoopUnswitch via the passes pipeline.

Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60676

llvm-svn: 358880
2019-04-22 10:35:07 +00:00
Eric Christopher dfebd84eb3 Remove the EnableEarlyCSEMemSSA set of options from the legacy
and new pass managers. They were default to true and not being
used.

Differential Revision: https://reviews.llvm.org/D60747

llvm-svn: 358789
2019-04-19 22:18:53 +00:00
Alina Sbirlea 43709f7233 [LICM & MemorySSA] Make limit flags pass tuning options.
Summary:
Make the flags in LICM + MemorySSA tuning options in the old and new
pass managers.

Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60490

llvm-svn: 358772
2019-04-19 17:46:50 +00:00
Alina Sbirlea 0499a2f961 [NewPassManager] Adding pass tuning options: loop vectorize.
Summary:
Trying to add the plumbing necessary to add tuning options to the new pass manager.
Testing with the flags for loop vectorize.

Reviewers: chandlerc

Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59723

llvm-svn: 358763
2019-04-19 16:11:59 +00:00
Philip Reames 137995d8da [GuardWidening] Wire up a NPM version of the LoopGuardWidening pass
llvm-svn: 358704
2019-04-18 19:17:14 +00:00
Serguei Katkov ca6c03a22f [NewPM] Add Option handling for LoopVectorize
This patch enables passing options to LoopVectorizePass via the passes pipeline.

Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60681

llvm-svn: 358647
2019-04-18 08:46:11 +00:00
Eric Christopher eff3b6fe7f Elaborate why we have an option on by default for enabling chr.
llvm-svn: 358641
2019-04-18 06:17:40 +00:00
Kit Barton 3cdf87940f Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Differential Revision: https://reviews.llvm.org/D55851

llvm-svn: 358607
2019-04-17 18:53:27 +00:00
Eric Christopher e29874eaa0 Revert "Add basic loop fusion pass." Per request.
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358553
2019-04-17 04:55:24 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Kit Barton ab70da0728 Add basic loop fusion pass.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Phabricator: https://reviews.llvm.org/D55851
llvm-svn: 358543
2019-04-17 01:37:00 +00:00
Hiroshi Yamauchi 09e539fcae [PGO] Profile guided code size optimization.
Summary:
Enable some of the existing size optimizations for cold code under PGO.

A ~5% code size saving in big internal app under PGO.

The way it gets BFI/PSI is discussed in the RFC thread

http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html 

Note it doesn't currently touch loop passes.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59514

llvm-svn: 358422
2019-04-15 16:49:00 +00:00
Serguei Katkov f54328372b [NewPM] Add Option handling for SimplifyCFG
This patch enables passing options to SimplifyCFGPass via the passes pipeline.

Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60675

llvm-svn: 358379
2019-04-15 08:57:53 +00:00
Fedor Sergeev a2ed448bf2 SafepointIRVerifier port to new Pass Manager
Straightforward port of StatepointIRVerifier pass to new Pass Manager framework.

Fix By: skatkov
Reviewed By: fedor.sergeev
Differential Revision: https://reviews.llvm.org/D59825

This is a re-land of r357147/r357148 with LLVM_ENABLE_MODULES build fixed.
Adding IR/SafepointIRVerifier.h into its own module.

llvm-svn: 357361
2019-03-31 10:15:39 +00:00
Adrian Prantl 119fdeded8 Temporarily revert "SafepointIRVerifier port to new Pass Manager"
to unbreak the modular bots and its follow-up commit.

This reverts commit https://reviews.llvm.org/D59825
because it introduced a

fatal error: cyclic dependency in module 'LLVM_intrinsic_gen': LLVM_intrinsic_gen -> LLVM_IR -> LLVM_intrinsic_gen

llvm-svn: 357201
2019-03-28 18:34:34 +00:00
Serguei Katkov 93432be304 SafepointIRVerifier port to new Pass Manager
Straightforward port of StatepointIRVerifier pass to new Pass Manager framework.

Reviewers: fedor.sergeev, reames
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D59825

llvm-svn: 357147
2019-03-28 06:00:09 +00:00
Robert Lougher f2158a8ef0 Resubmit r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"
Failing LLD tests have been fixed in r356593.

llvm-svn: 356594
2019-03-20 19:08:18 +00:00
Robert Lougher c67a759c99 Revert r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"
Due to buildbot failures (LLD tests).

llvm-svn: 356516
2019-03-19 20:54:20 +00:00
Robert Lougher de548ccab9 [TailCallElim] Add tailcall elimination pass to LTO pipelines
LTO provides additional opportunities for tailcall elimination due to
link-time inlining and visibility of nocapture attribute. Testing showed
negligible impact on compilation times.

Differential Revision: https://reviews.llvm.org/D58391

llvm-svn: 356511
2019-03-19 20:24:28 +00:00
Rong Xu db29a3a438 [PGO] Context sensitive PGO (part 3)
Part 3 of CSPGO changes (mostly related to PassMananger).

Differential Revision: https://reviews.llvm.org/D54175

llvm-svn: 355330
2019-03-04 20:21:27 +00:00
Manman Ren 1829512dd3 Add a module pass for order file instrumentation
The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use a 8-byte hash to log the function symbol name.

In this pass, we add three global variables:
(1) an order file buffer: a circular buffer at its own llvm section.
(2) a bitmap for each module: one byte for each function to say if the function is already executed.
(3) a global index to the order file buffer.

At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index.

Differential Revision:  https://reviews.llvm.org/D57463

llvm-svn: 355133
2019-02-28 20:13:38 +00:00
Rong Xu 6cdf3d8086 Recommit r354930 "[PGO] Context sensitive PGO (part 1)"
Fixed UBSan failures.

llvm-svn: 355005
2019-02-27 17:24:33 +00:00
Vlad Tsyrklevich c01643087e Revert "[PGO] Context sensitive PGO (part 1)"
This reverts commit r354930, it was causing UBSan failures.

llvm-svn: 354953
2019-02-27 03:45:28 +00:00
Rong Xu 35d2d51369 [PGO] Context sensitive PGO (part 1)
Current PGO profile counts are not context sensitive. The branch probabilities
for the inlined functions are kept the same for all call-sites, and they might
be very different from the actual branch probabilities. These suboptimal
profiles can greatly affect some downstream optimizations, in particular for
the machine basic block placement optimization.

In this patch, we propose to have a post-inline PGO instrumentation/use pass,
which we called Context Sensitive PGO (CSPGO). For the users who want the best
possible performance, they can perform a second round of PGO instrument/use on
the top of the regular PGO. They will have two sets of profile counts. The
first pass profile will be manly for inline, indirect-call promotion, and
CGSCC simplification pass optimizations. The second pass profile is for
post-inline optimizations and code-gen optimizations.

A typical usage:
// Regular PGO instrumentation and generate pass1 profile.
> clang -O2 -fprofile-generate source.c -o gen
> ./gen
> llvm-profdata merge default.*profraw -o pass1.profdata
// CSPGO instrumentation.
> clang -O2 -fprofile-use=pass1.profdata -fcs-profile-generate -o gen2
> ./gen2
// Merge two sets of profiles
> llvm-profdata merge default.*profraw pass1.profdata -o profile.profdata
// Use the combined profile. Pass manager will invoke two PGO use passes.
> clang -O2 -fprofile-use=profile.profdata -o use

This change touches many components in the compiler. The reviewed patch
(D54175) will committed in phrases.

Differential Revision: https://reviews.llvm.org/D54175

llvm-svn: 354930
2019-02-26 22:37:46 +00:00
Eric Christopher 721eaeff3a Fix a small comment typo.
llvm-svn: 354923
2019-02-26 20:33:22 +00:00
Vedant Kumar 47a0c9b69c [HotColdSplit] Schedule splitting late to fix perf regression
With or without PGO data applied, splitting early in the pipeline
(either before the inliner or shortly after it) regresses performance
across SPEC variants. The cause appears to be that splitting hides
context for subsequent optimizations.

Schedule splitting late again, in effect reversing r352080, which
scheduled the splitting pass early for code size benefits (documented in
https://reviews.llvm.org/D57082).

Differential Revision: https://reviews.llvm.org/D58258

llvm-svn: 354158
2019-02-15 18:46:44 +00:00
Leonard Chan 436fb2bd82 [NewPM] Second attempt at porting ASan
This is the second attempt to port ASan to new PM after D52739. This takes the
initialization requried by ASan from the Module by moving it into a separate
class with it's own analysis that the new PM ASan can use.

Changes:
- Split AddressSanitizer into 2 passes: 1 for the instrumentation on the
  function, and 1 for the pass itself which creates an instance of the first
  during it's run. The same is done for AddressSanitizerModule.
- Add new PM AddressSanitizer and AddressSanitizerModule.
- Add legacy and new PM analyses for reading data needed to initialize ASan with.
- Removed DominatorTree dependency from ASan since it was unused.
- Move GlobalsMetadata and ShadowMapping out of anonymous namespace since the
  new PM analysis holds these 2 classes and will need to expose them.

Differential Revision: https://reviews.llvm.org/D56470

llvm-svn: 353985
2019-02-13 22:22:48 +00:00