Summary:
It turns out that we need an OptimizerLast PassBuilder extension point
after all. I missed the relevance of this EP the first time. By legacy PM magic,
function passes added at this EP get added to the last _Function_ PM, which is a
feature we lost when dropping this EP for the new PM.
A key difference between this and the legacy PassManager's OptimizerLast
callback is that this extension point is not triggered at O0. Extensions
to the O0 pipeline should append their passes to the end of the overall
pipeline.
Differential Revision: https://reviews.llvm.org/D54374
llvm-svn: 346645
Unlike its legacy counterpart new pass manager's LoopUnrollPass does
not provide any means to select which flavors of unroll to run
(runtime, peeling, partial), relying on global defaults.
In some cases having ability to run a restricted LoopUnroll that
does more than LoopFullUnroll is needed.
Introduced LoopUnrollOptions to select optional unroll behaviors.
Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.
Reviewers: chandlerc, tejohnson
Differential Revision: https://reviews.llvm.org/D53440
llvm-svn: 345723
This reverts commit 8d6af840396f2da2e4ed6aab669214ae25443204 and commit
b78d19c287b6e4a9abc9fb0545de9a3106d38d3d which causes slower build times
by initializing the AddressSanitizer on every function run.
The corresponding revisions are https://reviews.llvm.org/D52814 and
https://reviews.llvm.org/D52739.
llvm-svn: 345433
Summary:
Fix the new PM to only perform hot cold splitting once during ThinLTO,
by skipping it in the pre-link phase.
This was already fixed in the old PM by the move of the hot cold split
pass later (after the early return when PrepareForThinLTO) by r344869.
Reviewers: vsk, sebpop, hiraditya
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D53611
llvm-svn: 345096
Summary:
In the new+old pass manager, hot cold splitting was schedule too early.
Thanks to Vedant for pointing this out.
Reviewers: sebpop, vsk
Reviewed By: sebpop, vsk
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D53437
llvm-svn: 344869
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.
TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.
Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246
llvm-svn: 344685
Summary:
All the PassBuilder::parse interfaces now return descriptive StringError
instead of a plain bool. It allows to make -passes/aa-pipeline parsing
errors context-specific and thus less confusing.
TODO: ideally we should also make suggestions for misspelled pass names,
but that requires some extensions to PassBuilder.
Reviewed By: philip.pfaffe, chandlerc
Differential Revision: https://reviews.llvm.org/D53246
llvm-svn: 344519
Removing deficiency of initial implementation of -print-before-all/-after-all
- it was effectively skipping IR printing for all the SCC passes.
Now LazyCallGraph:SCC gets its IR printed.
Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D53270
llvm-svn: 344505
This patch ports the legacy pass manager to the new one to take advantage of
the benefits of the new PM. This involved moving a lot of the declarations for
`AddressSantizer` to a header so that it can be publicly used via
PassRegistry.def which I believe contains all the passes managed by the new PM.
This patch essentially decouples the instrumentation from the legacy PM such
hat it can be used by both legacy and new PM infrastructure.
Differential Revision: https://reviews.llvm.org/D52739
llvm-svn: 344274
This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.
The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.
Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.
Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.
This is the LLVM side of Clang r344199.
Reviewers: davidxl, tejohnson, dlj, erik.pilkington
Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51249
llvm-svn: 344200
Enable time-passes functionality through PassInstrumentation callbacks
for passes and analyses.
TimePassesHandler class keeps all the callbacks, the timing data as it
is being collected as well as the stack of currently active timers.
Parts of the fix that might be somewhat unobvious:
- mapping of passes into Timer (TimingData) can not be done per-instance.
PassID name provided into the callback is common for all the pass invocations.
Thus the only way to get a timing with reasonable granularity is to collect
timing data per pass invocation, getting a new timer for each BeforePass.
Hence the key for TimingData uses a pair of <StringRef/unsigned count> to
uniquely identify a pass invocation.
- consequently, this new-pass-manager implementation performs no aggregation
of timing data, reporting timings for each pass invocation separately.
In that it differs from legacy-pass-manager time-passes implementation that
reports timing data aggregated per pass instance.
- pass managers and adaptors are not tracked, similar to how pass managers are
not tracked in legacy time-passes.
- TimerStack tracks timers that are active, each BeforePass pushes the new timer
on stack, each AfterPass pops active timer from stack and stops it.
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D51276
llvm-svn: 343898
Modified the testcases to use both pass managers
Use single commandline flag for both pass managers.
Differential Revision: https://reviews.llvm.org/D52708
Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: tejohnson, brzycki
llvm-svn: 343662
This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.
llvm-svn: 343522
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.
- PrintIR routines implement printing callbacks.
- StandardInstrumentations class provides a central place to manage all
the "standard" in-tree pass instrumentations. Currently it registers
PrintIR callbacks.
Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923
llvm-svn: 342896
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@
The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.
Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
and access to them.
* PassInstrumentation class that handles instrumentation-point interfaces
that call into PassInstrumentationCallbacks.
* Callbacks accept StringRef which is just a name of the Pass right now.
There were some ideas to pass an opaque wrapper for the pointer to pass instance,
however it appears that pointer does not actually identify the instance
(adaptors and managers might have the same address with the pass they govern).
Hence it was decided to go simple for now and then later decide on what the proper
mental model of identifying a "pass in a phase of pipeline" is.
* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
on different IRUnits (e.g. Analyses).
* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
usual AnalysisManager::getResult. All pass managers were updated to run that
to get PassInstrumentation object for instrumentation calls.
* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
args out of a generic PassManager's extra args. This is the only way I was able to explicitly
run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
RepeatedPass::run.
TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
and then get rid of getAnalysisResult by improving RepeatedPass implementation.
* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
PassInstrumentationAnalysis. Callbacks registration should be performed directly
through PassInstrumentationCallbacks.
* new-pm tests updated to account for PassInstrumentationAnalysis being run
* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.
Made getName helper to return std::string (instead of StringRef initially) to fix
asan builtbot failures on CGSCC tests.
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858
llvm-svn: 342664
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@
The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.
Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
and access to them.
* PassInstrumentation class that handles instrumentation-point interfaces
that call into PassInstrumentationCallbacks.
* Callbacks accept StringRef which is just a name of the Pass right now.
There were some ideas to pass an opaque wrapper for the pointer to pass instance,
however it appears that pointer does not actually identify the instance
(adaptors and managers might have the same address with the pass they govern).
Hence it was decided to go simple for now and then later decide on what the proper
mental model of identifying a "pass in a phase of pipeline" is.
* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
on different IRUnits (e.g. Analyses).
* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
usual AnalysisManager::getResult. All pass managers were updated to run that
to get PassInstrumentation object for instrumentation calls.
* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
args out of a generic PassManager's extra args. This is the only way I was able to explicitly
run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
RepeatedPass::run.
TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
and then get rid of getAnalysisResult by improving RepeatedPass implementation.
* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
PassInstrumentationAnalysis. Callbacks registration should be performed directly
through PassInstrumentationCallbacks.
* new-pm tests updated to account for PassInstrumentationAnalysis being run
* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858
llvm-svn: 342597
Summary:
Pass Execution Instrumentation interface enables customizable instrumentation
of pass execution, as per "RFC: Pass Execution Instrumentation interface"
posted 06/07/2018 on llvm-dev@
The intent is to provide a common machinery to implement all
the pass-execution-debugging features like print-before/after,
opt-bisect, time-passes etc.
Here we get a basic implementation consisting of:
* PassInstrumentationCallbacks class that handles registration of callbacks
and access to them.
* PassInstrumentation class that handles instrumentation-point interfaces
that call into PassInstrumentationCallbacks.
* Callbacks accept StringRef which is just a name of the Pass right now.
There were some ideas to pass an opaque wrapper for the pointer to pass instance,
however it appears that pointer does not actually identify the instance
(adaptors and managers might have the same address with the pass they govern).
Hence it was decided to go simple for now and then later decide on what the proper
mental model of identifying a "pass in a phase of pipeline" is.
* Callbacks accept llvm::Any serving as a wrapper for const IRUnit*, to remove direct dependencies
on different IRUnits (e.g. Analyses).
* PassInstrumentationAnalysis analysis is explicitly requested from PassManager through
usual AnalysisManager::getResult. All pass managers were updated to run that
to get PassInstrumentation object for instrumentation calls.
* Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra
args out of a generic PassManager's extra args. This is the only way I was able to explicitly
run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or
RepeatedPass::run.
TODO: Upon lengthy discussions we agreed to accept this as an initial implementation
and then get rid of getAnalysisResult by improving RepeatedPass implementation.
* PassBuilder takes PassInstrumentationCallbacks object to pass it further into
PassInstrumentationAnalysis. Callbacks registration should be performed directly
through PassInstrumentationCallbacks.
* new-pm tests updated to account for PassInstrumentationAnalysis being run
* Added PassInstrumentation tests to PassBuilderCallbacks unit tests.
Other unit tests updated with registration of the now-required PassInstrumentationAnalysis.
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D47858
llvm-svn: 342544
Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912
has been fixed by rL342055.
Precommit testing performed:
* Overnight runs of csmith comparing the output between programs
compiled with gvn-hoist enabled/disabled.
* Bootstrap builds of clang with UbSan/ASan configurations.
llvm-svn: 342387
This reverts rL341954.
The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:
[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill
llvm-svn: 342001
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.
if (hot_cond1) { // Likely true.
do_stg_hot1();
}
if (hot_cond2) { // Likely true.
do_stg_hot2();
}
->
if (hot_cond1 && hot_cond2) { // Hot path.
do_stg_hot1();
do_stg_hot2();
} else { // Cold path.
if (hot_cond1) {
do_stg_hot1();
}
if (hot_cond2) {
do_stg_hot2();
}
}
This speeds up some internal benchmarks up to ~30%.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D50591
llvm-svn: 341386
Rebase rL338240 since the excessive memory usage observed when using
GVNHoist with UBSan has been fixed by rL340818.
Differential Revision: https://reviews.llvm.org/D49858
llvm-svn: 340922
Summary:
Enable these passes for CFI and WPD in ThinLTO and LTO with the new pass
manager. Add a couple of tests for both PMs based on the clang tests
tools/clang/test/CodeGen/thinlto-distributed-cfi*.ll, but just test
through llvm-lto2 and not with distributed ThinLTO.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D49429
llvm-svn: 337461
This is a simple implementation of the unroll-and-jam classical loop
optimisation.
The basic idea is that we take an outer loop of the form:
for i..
ForeBlocks(i)
for j..
SubLoopBlocks(i, j)
AftBlocks(i)
Instead of doing normal inner or outer unrolling, we unroll as follows:
for i... i+=2
ForeBlocks(i)
ForeBlocks(i+1)
for j..
SubLoopBlocks(i, j)
SubLoopBlocks(i+1, j)
AftBlocks(i)
AftBlocks(i+1)
Remainder Loop
So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.
To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).
Differential Revision: https://reviews.llvm.org/D41953
llvm-svn: 336062
and diretory.
Also cleans up all the associated naming to be consistent and removes
the public access to the pass ID which was unused in LLVM.
Also runs clang-format over parts that changed, which generally cleans
up a bunch of formatting.
This is in preparation for doing some internal cleanups to the pass.
Differential Revision: https://reviews.llvm.org/D47352
llvm-svn: 336028
Extends the CFGPrinter and CallPrinter with heat colors based on heuristics or
profiling information. The colors are enabled by default and can be toggled
on/off for CFGPrinter by using the option -cfg-heat-colors for both
-dot-cfg[-only] and -view-cfg[-only]. Similarly, the colors can be toggled
on/off for CallPrinter by using the option -callgraph-heat-colors for both
-dot-callgraph and -view-callgraph.
Patch by Rodrigo Caetano Rocha!
Differential Revision: https://reviews.llvm.org/D40425
llvm-svn: 335996
This pass is being added in order to make the information available to BasicAA,
which can't do caching of this information itself, but possibly this information
may be useful for other passes.
Incorporates code based on Daniel Berlin's implementation of Tarjan's algorithm.
Differential Revision: https://reviews.llvm.org/D47893
llvm-svn: 335857
=== Generating the CG Profile ===
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```
Differential Revision: https://reviews.llvm.org/D48105
llvm-svn: 335794
loop-cleanup passes at the beginning of the loop pass pipeline, and
re-enqueue loops after even trivial unswitching.
This will allow us to much more consistently avoid simplifying code
while doing trivial unswitching. I've also added a test case that
specifically shows effective iteration using this technique.
I've unconditionally updated the new PM as that is always using the
SimpleLoopUnswitch pass, and I've made the pipeline changes for the old
PM conditional on using this new unswitch pass. I added a bunch of
comments to the loop pass pipeline in the old PM to make it more clear
what is going on when reviewing.
Hopefully this will unblock doing *partial* unswitching instead of just
full unswitching.
Differential Revision: https://reviews.llvm.org/D47408
llvm-svn: 333493
This is a simple implementation of the unroll-and-jam classical loop
optimisation.
The basic idea is that we take an outer loop of the form:
for i..
ForeBlocks(i)
for j..
SubLoopBlocks(i, j)
AftBlocks(i)
Instead of doing normal inner or outer unrolling, we unroll as follows:
for i... i+=2
ForeBlocks(i)
ForeBlocks(i+1)
for j..
SubLoopBlocks(i, j)
SubLoopBlocks(i+1, j)
AftBlocks(i)
AftBlocks(i+1)
Remainder
So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now-jammed loops.
To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).
Differential Revision: https://reviews.llvm.org/D41953
llvm-svn: 333358
The plan had always been to move towards using this rather than so much
in-pass simplification within the loop pipeline, but we never got around
to it.... until only a couple months after it was removed due to disuse.
=/
This commit is just a pure revert of the removal. I will add tests and
do some basic cleanup in follow-up commits. Then I'll wire it into the
loop pass pipeline.
Differential Revision: https://reviews.llvm.org/D47353
llvm-svn: 333250
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272