This is split off from D102002, and I think it is clear that
the difference in behavior was not intended. Options were
added to SimplifyCFG over time, but different chunks of
the pass pipelines were not kept in sync.
This patch changes LoopFlattenPass from FunctionPass to LoopNestPass.
Utilize LoopNest and let function 'Flatten' generate information from it.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D102904
This patch changes LoopFlattenPass from FunctionPass to LoopNestPass.
Utilize LoopNest and let function 'Flatten' generate information from it.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D102904
This patch changes LoopFlattenPass from FunctionPass to LoopNestPass.
Utilize LoopNest and let function 'Flatten' generate information from it.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D102904
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D99149
Summary:
Make the file name and descriptors static so that they are reused by
print-changed=diff. This avoids errors about being unable to create
temporary files when doing the later comparisons in a large compile.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D100116
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D99149
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D99149
To bring D99599's implementation in line with the existing
PrintPassInstrumentation, and to fix a FIXME, add more customizability
to PrintPassInstrumentation.
Introduce three new options. The first takes over the existing
"-debug-pass-manager-verbose" cl::opt.
The second and third option are specific to -fdebug-pass-structure. They
allow indentation, and also don't print analysis queries.
To avoid more golden file tests than necessary, prune down the
-fdebug-pass-structure tests.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102196
This patch contains the bare minimum to run the new Pass Manager from the LLVM-C APIs. It does not feature PGOOptions, PassPlugins or Debugify in its current state. Bugzilla: PR48499
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102136
This patch contains the bare minimum to run the new Pass Manager from the LLVM-C APIs. It does not feature PGOOptions, PassPlugins or Debugify in its current state. Bugzilla: PR48499
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102136
This patch adjusts the LTO pipeline in the new PM to run GlobalsAA
before LICM to match the legacy PM.
This fixes a regression where the new PM failed to vectorize loops that
require hoisting/sinking by LICM depending on GlobalsAA info.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D102345
Split off from D102345 to commit this separately from other changes in
the patch. This aligns the behavior of the new PM with the legacy PM
for LTO, with respect to running LICM.
Together with the remaining changes in D102345, this fixes new PM
regressions where we fail to vectorize loops that are vectorized with
the legacy PM.
This is better no-functional-change-intended than the 1st attempt.
As noted in D102002, there were at least 2 diffs that went
unchecked in pass manager regressions tests: different pass
parameters (SimplifyCFG) and an extension point/callback.
Those should be lifted from the original code blocks correctly
now.
This reverts commit fefcb1f878.
It was supposed to be NFC, but as noted in the post-commit
comments in D102002, that was not true: SimplifyCFG uses
different parameters and there's a difference in an
extension point / callback.
Printing pass manager invocations is fairly verbose and not super
useful.
This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.
This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D101797
We're trying to move DebugLogging into instrumentation, rather than
being part of PassManagers/AnalysisManagers.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D102093
The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1].
This patch allows the function simplification pipeline be skipped if it was already run and the function was not modified since.
The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios.
[1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions | compile time tracker ]] run with the option enabled.
Differential Revision: https://reviews.llvm.org/D98103
GlobalsAA is only created at the beginning of the inliner pipeline. If
an AAManager is cached from previous passes, it won't get rebuilt to
include the newly created GlobalsAA.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D101379
Relative look table converter pass caused an issue when full lto
is enabled (reported in https://reviews.llvm.org/D94355).
This patch disables that pass from full lto pre-link phase optimization
pipeline until the issue is fixed.
Differential Revision: https://reviews.llvm.org/D101664
Hoisting and sinking instructions out of conditional blocks enables
additional vectorization by:
1. Executing memory accesses unconditionally.
2. Reducing the number of instructions that need predication.
After disabling early hoisting / sinking, we miss out on a few
vectorization opportunities. One of those is causing a ~10% performance
regression in one of the Geekbench benchmarks on AArch64.
This patch tires to recover the regression by running hoisting/sinking
as part of a SimplifyCFG run after LoopRotate and before LoopVectorize.
Note that in the legacy pass-manager, we run LoopRotate just before
vectorization again and there's no SimplifyCFG run in between, so the
sinking/hoisting may impact the later run on LoopRotate. But the impact
should be limited and the benefit of hosting/sinking at this stage
should outweigh the risk of not rotating.
Compile-time impact looks slightly positive for most cases.
http://llvm-compile-time-tracker.com/compare.php?from=2ea7fb7b1c045a7d60fcccf3df3ebb26aa3699e5&to=e58b4a763c691da651f25996aad619cb3d946faf&stat=instructions
NewPM-O3: geomean -0.19%
NewPM-ReleaseThinLTO: geoman -0.54%
NewPM-ReleaseLTO-g: geomean -0.03%
With a few benchmarks seeing a notable increase, but also some
improvements.
Alternative to D101290.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D101468
Summary:
This patch registers OpenMPOpt as a Module pass in addition to a CGSCC
pass. This is so certain optimzations that are sensitive to intact
call-sites can happen before inlining. The old `openmpopt` pass name is
changed to `openmp-opt-cgscc` and `openmp-opt` calls the Module pass.
The current module pass only runs a single check but will be expanded in
the future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D99202
Being lazy with printing the banner seems hard to reason with, we should print it
unconditionally first (it could also lead to duplicate banners if we
have multiple functions in -filter-print-funcs).
The printIR() functions were doing too many things. I separated out the
call from PrintPassInstrumentation since we were essentially doing two
completely separate things in printIR() from different callers.
There were multiple ways to generate the name of some IR. That's all
been moved to getIRName(). The printing of the IR name was also
inconsistent, now it's always "IR Dump on $foo" where "$foo" is the
name. For a function, it's the function name. For a loop, it's what's
printed by Loop::print(), which is more detailed. For an SCC, it's the
list of functions in parentheses. For a module it's "[module]", to
differentiate between a possible SCC with a function called "module".
To preserve D74814, we have to check if we're going to print anything at
all first. This is unfortunate, but I would consider this a special
case that shouldn't be handled in the core logic.
Reviewed By: jamieschmeiser
Differential Revision: https://reviews.llvm.org/D100231
Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244
This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.
Differential Revision: https://reviews.llvm.org/D94355
Retry of 330619a3a6 that includes a clang test update.
Original commit message:
If we run passes before lowering llvm.expect intrinsics to metadata,
then those passes have no way to act on the hints provided by llvm.expect.
SimplifyCFG is the known offender, and we made it smarter about profile
metadata in D98898 <https://reviews.llvm.org/D98898>.
In the motivating example from https://llvm.org/PR49336 , this means we
were ignoring the recommended method for a programmer to tell the compiler
that a compare+branch is expensive. This change appears to solve that case -
the metadata survives to the backend, the compare order is as expected in IR,
and the backend does not do anything to reverse it.
We make the same change to the old pass manager to keep things synchronized.
Differential Revision: https://reviews.llvm.org/D100213
-filter-print-funcs -print-changed was crashing after the filter func
was removed by a pass with
Assertion failed: After.find("*** IR Dump") == 0 && "Unexpected banner format."
We weren't printing the banner because when we have -filter-print-funcs,
we print each function separately, letting the print function filter out
unwanted functions.
Reviewed By: jamieschmeiser
Differential Revision: https://reviews.llvm.org/D100237
If we run passes before lowering llvm.expect intrinsics to metadata,
then those passes have no way to act on the hints provided by llvm.expect.
SimplifyCFG is the known offender, and we made it smarter about profile
metadata in D98898.
In the motivating example from https://llvm.org/PR49336 , this means we
were ignoring the recommended method for a programmer to tell the compiler
that a compare+branch is expensive. This change appears to solve that case -
the metadata survives to the backend, the compare order is as expected in IR,
and the backend does not do anything to reverse it.
We make the same change to the old pass manager to keep things synchronized.
Differential Revision: https://reviews.llvm.org/D100213
The reason for the NewPM redesign is described in the commit
cba3e783389a: [NewPM] Disable PreservedCFGChecker ...
The checker introduces an internal custom CFG analysis that tracks
current up-to date CFG snapshot. The analysis is invalidated along
any other CFG related analysis (the key is CFGAnalyses). If the CFG
analysis is not invalidated at a functional pass exit then the checker
asserts that the CFG snapshot taken from this analysis is equals to
a snapshot of the current CFG.
Along the way:
- the function CFG::printDiff() is simplified by removing function
name calculation. The name is printed by the caller;
- fixed CFG invalidated condition (see CFG::invalidate());
- StandardInstrumentations::registerCallbacks() gets additional
optional parameter of type FunctionAnalysisManager*, which is
needed by the checker to get the custom CFG analysis;
- several PM related tests updated to explicitly set
-verify-cfg-preserved=1 as they need.
This patch is safe to land as the CFGChecker is left switched off
(the options -verify-cfg-preserved is false by default). It will be
switched on by a separate patch to minimize possible reverts.
Reviewed By: skatkov, kuhar
Differential Revision: https://reviews.llvm.org/D91327
Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244
This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.
Differential Revision: https://reviews.llvm.org/D94355
Summary:
The colour characters currently added to the output of -print-changed=diff
and -print-changed=diff-quiet cause difficulties when capturing the output
and examining it in an editor. Change the function to not have the colour
characters and add 2 new choices (-print-changed=cdiff and
-print-changed=cdiff-quiet) to retain the existing functionality of adding
the colour characters.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks) yrouban (Yevgeny Rouban)
Differential Revision: https://reviews.llvm.org/D97398
Lookup tables generate non PIC-friendly code, which requires dynamic relocation as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244
This patch adds a new pass that converts lookup tables to relative lookup tables to make them PIC-friendly.
Differential Revision: https://reviews.llvm.org/D94355