This is the same exact change we did for the current pass manager
in rL314997, but the new pass manager pipeline already happened
to run GlobalOpt after the inliner, so we just insert a run of
GDCE here.
llvm-svn: 315003
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented
as an independent pass, so there's no stretching of scope and feature creep for an existing pass.
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028
The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.
Decomposing remainder may allow removing some code from the backend (PPC and possibly others).
Differential Revision: https://reviews.llvm.org/D37121
llvm-svn: 312862
Summary: Part of r310296 will disable PGOIndirectCallPromotion in ThinLTO backend if PGOOpt is None. However, as PGOOpt is not passed down to ThinLTO backend for instrumentation based PGO, that change would actually disable ICP entirely in ThinLTO backend, making it behave differently in instrumentation PGO mode. This change reverts that change, and only disable ICP there when it is SamplePGO.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36566
llvm-svn: 310550
Summary: SampleProfileLoader pass do need to happen after some early cleanup passes so that inlining can happen correctly inside the SampleProfileLoader pass.
Reviewers: chandlerc, davidxl, tejohnson
Reviewed By: chandlerc, tejohnson
Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36333
llvm-svn: 310296
Summary:
Detect when the working set size of a profiled application is huge,
by comparing the number of counts required to reach the hot percentile
in the profile summary to a large threshold*.
When the working set size is determined to be huge, disable peeling
to avoid bloating the working set further.
*Note that the selected threshold (15K) is significantly larger than the
largest working set value in SPEC cpu2006 (which is gcc at around 11K).
Reviewers: davidxl
Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36288
llvm-svn: 310005
Summary:
This is largely NFC*, in preparation for utilizing ProfileSummaryInfo
and BranchFrequencyInfo analyses. In this patch I am only doing the
splitting for the New PM, but I can do the same for the legacy PM as
a follow-on if this looks good.
*Not NFC since for partial unrolling we lose the updates done to the
loop traversal (adding new sibling and child loops) - according to
Chandler this is not very useful for partial unrolling, but it also
means that the debugging flag -unroll-revisit-child-loops no longer
works for partial unrolling.
Reviewers: chandlerc
Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36157
llvm-svn: 309886
Summary:
Now that SamplePGOSupport is part of PGOOpt, there are several places that need tweaking:
1. AddDiscriminator pass should *not* be invoked at ThinLTOBackend (as it's already invoked in the PreLink phase)
2. addPGOInstrPasses should only be invoked when either ProfileGenFile or ProfileUseFile is non-empty.
3. SampleProfileLoaderPass should only be invoked when SampleProfileFile is non-empty.
4. PGOIndirectCallPromotion should only be invoked in ProfileUse phase, or in ThinLTOBackend of SamplePGO.
Reviewers: chandlerc, tejohnson, davidxl
Reviewed By: chandlerc
Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36040
llvm-svn: 309478
Looks like the template arguments are displayed differently depending on the
host compiler(?). E.g.:
InnerAnalysisManagerProxy<CGSCCAnalysisManager
InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::LazyCallGraph::SCC, ...
Fix fallout after r309294
llvm-svn: 309297
This is a module pass so for the old PM, we can't use ORE, the function
analysis pass. Instead ORE is created on the fly.
A few notes:
- isPromotionLegal is folded in the caller since we want to emit the Function
in the remark but we can only do that if the symbol table look-up succeeded.
- There was good test coverage for remarks in this pass.
- promoteIndirectCall uses ORE conditionally since it's also used from
SampleProfile which does not use ORE yet.
Fixes PR33792.
Differential Revision: https://reviews.llvm.org/D35929
llvm-svn: 309294
Summary:
This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter
API.
In fact, as SimplifyLibCalls is only ever called via InstCombine,
(as far as I can tell) the OptimizationRemarkEmitter is added there,
and then passed through to SimplifyLibCalls later.
I have avoided changing any remark text.
This closes PR33787
Patch by Sam Elliott!
Reviewers: anemet, davide
Reviewed By: anemet
Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D35608
llvm-svn: 309158
Previously it doesn't actually invoke the designated new PM builder
functions.
This patch moves NameAnonGlobalPass out from PassBuilder, as Chandler
points out that PassBuilder is used for non-O0 builds, and for
optimizations only.
Differential Revision: https://reviews.llvm.org/D34728
llvm-svn: 306756
Based on the original patch by Davide, but I've adjusted the API exposed
to just be different entry points rather than exposing more state
parameters. I've factored all the common logic out so that we don't have
any duplicate pipelines, we just stitch them together in different ways.
I think this makes the build easier to reason about and understand.
This adds a direct method for getting the module simplification pipeline
as well as a method to get the optimization pipeline. While not my
express goal, this seems nice and gives a good place comment about the
restrictions that are imposed on them.
I did make some minor changes to the way the pipelines are structured
here, but hopefully not ones that are significant or controversial:
1) I sunk the PGO indirect call promotion to only be run when we have
PGO enabled (or as part of the special ThinLTO pipeline).
2) I made the extra GlobalOpt run in ThinLTO just happen all the time
and at a slightly more powerful place (before we remove available
externaly functions). This seems like general goodness and not a big
compile time sink, so it didn't make sense to *only* use it in
ThinLTO. Fewer differences in the pipeline makes everything simpler
IMO.
3) I hoisted the ThinLTO stop point pre-link above the the RPO function
attr inference. The RPO inference won't infer anything terribly
meaningful pre-link (recursiveness?) so it didn't make a lot of
sense. But if the placement of RPO inference starts to matter, we
should move it to the canonicalization phase anyways which seems like
a better place for it (and there is a FIXME to this effect!). But
that seemed a bridge too far for this patch.
If we ever need to parameterize these pipelines more heavily, we can
always sink the logic to helper functions with parameters to keep those
parameters out of the public API. But the changes above seemed minor
that we could possible get away without the parameters entirely.
I added support for parsing 'thinlto' and 'thinlto-pre-link' names in
pass pipelines to make it easy to test these routines and play with them
in larger pipelines. I also added a really basic manifest of passes test
that will show exactly how the pipelines behave and work as well as
making updates to them clear.
Lastly, this factoring does introduce a nesting layer of module pass
managers in the default pipeline. I don't think this is a big deal and
the flexibility of decoupling the pipelines seems easily worth it.
Differential Revision: https://reviews.llvm.org/D33540
llvm-svn: 304407