forked from OSchip/llvm-project
ddc702376a
With D107249 I saw huge compile time regressions on a module (150s -> 5700s). This turned out to be due to a huge RefSCC in the module. As we ran the function simplification pipeline on functions in the SCCs in the RefSCC, some of those SCCs would be split out to their RefSCC, a child of the current RefSCC. We'd skip the remaining SCCs in the huge RefSCC because the current RefSCC is now the RefSCC just split out, then revisit the original huge RefSCC from the beginning. This happened many times because many functions in the RefSCC were optimizable to the point of becoming their own RefSCC.

This patch makes it so we don't skip SCCs not in the current RefSCC so that we split out all the child RefSCCs on the first iteration of RefSCC. When we split out a RefSCC, we invalidate the original RefSCC and add the remainder of the SCCs into a new RefSCC in RCWorklist. This happens repeatedly until we finish visiting all SCCs, at which point there is only one valid RefSCC in RCWorklist from the original RefSCC containing all the SCCs that were not split out, and we visit that.

For example, in the newly added test cgscc-refscc-mutation-order.ll, we'd previously run instcombine in this order:

f1, f2, f1, f3, f1, f4, f1

Now it's:

f1, f2, f3, f4, f1

This can cause more passes to be run in some specific cases, e.g. if f1<->f2 gets optimized to f1<-f2, we'd previously run f1, f2; now we run f1, f2, f2.

This improves kimwitu++ compile times by a lot (12-15% for various -O3 configs): https://llvm-compile-time-tracker.com/compare.php?from=2371c5a0e06d22b48da0427cebaf53a5e5c54635&to=00908f1d67400cab1ad7bcd7cacc7558d1672e97&stat=instructions

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D121953

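
The visiting orders quoted in the commit message can be reproduced with a small toy model. This is only an illustrative sketch, not LLVM's CGSCC pass manager: `visit_order`, `splitters`, and `skip_after_split` are hypothetical names, each SCC is assumed to hold a single function, and which functions split out of the RefSCC when visited is an assumption inferred from the orders given above.

```python
def visit_order(sccs, splitters, skip_after_split):
    """Toy model of visiting the SCCs of one RefSCC (not LLVM's real code).

    sccs:             function names, one per SCC, in visitation order
    splitters:        names assumed to split into their own RefSCC when visited
    skip_after_split: True models the old behavior (abandon the rest of the
                      RefSCC after a split), False models the patched behavior
    """
    order = []
    worklist = [list(sccs)]  # stands in for RCWorklist
    while worklist:
        refscc = worklist.pop()
        remainder = list(refscc)  # SCCs that have not been split out
        split_seen = False
        for fn in refscc:
            order.append(fn)  # "run the pipeline" on this function
            if fn in splitters:
                remainder.remove(fn)
                split_seen = True
                if skip_after_split:
                    break  # old behavior: skip the remaining SCCs
        # The original RefSCC is now invalid; queue what is left of it.
        if split_seen and remainder:
            worklist.append(remainder)
    return order


# Assumed scenario for cgscc-refscc-mutation-order.ll: f2, f3, f4 split out.
fns = ["f1", "f2", "f3", "f4"]
print(visit_order(fns, {"f2", "f3", "f4"}, skip_after_split=True))
print(visit_order(fns, {"f2", "f3", "f4"}, skip_after_split=False))

# Assumed scenario for f1<->f2 becoming f1<-f2: f1 splits out when visited.
print(visit_order(["f1", "f2"], {"f1"}, skip_after_split=True))
print(visit_order(["f1", "f2"], {"f1"}, skip_after_split=False))
```

Under these assumptions the first pair of calls yields the old and new orders from the test, and the second pair shows the case where the patched behavior runs one extra pass on f2.
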
||
---|---|---|
ChangePrinters | ||
Inputs | ||
X86 | ||
2002-01-31-CallGraph.ll | ||
2002-02-24-InlineBrokePHINodes.ll | ||
2004-08-16-PackedConstantInlineStore.ll | ||
2004-08-16-PackedGlobalConstant.ll | ||
2004-08-16-PackedSelect.ll | ||
2004-08-16-PackedSimple.ll | ||
2004-08-20-PackedControlFlow.ll | ||
2006-02-05-PassManager.ll | ||
2007-09-10-PassManager.ll | ||
2008-02-14-PassManager.ll | ||
2008-06-04-FieldSizeInPacked.ll | ||
2008-10-06-RemoveDeadPass.ll | ||
2008-10-15-MissingSpace.ll | ||
2009-06-05-no-implicit-float.ll | ||
2009-09-14-function-elements.ll | ||
2010-05-06-Printer.ll | ||
FileCheck-space.txt | ||
ResponseFile.ll | ||
attribute-comment.ll | ||
available-externally-lto.ll | ||
bb-badref.ll | ||
bcanalyzer-block-info.txt | ||
bcanalyzer-dump-blockinfo-option.txt | ||
bcanalyzer-dump-option.txt | ||
can-execute.txt | ||
cfg-printer-branch-weights-percent.ll | ||
cfg-printer-branch-weights.ll | ||
cfg-printer-filter.ll | ||
cfg_deopt_unreach.ll | ||
cgscc-devirt-iteration.ll | ||
cgscc-disconnected-invalidation.ll | ||
cgscc-iterate-function-mutation.ll | ||
cgscc-libcall-update.ll | ||
cgscc-observe-devirt.ll | ||
cgscc-refscc-mutation-order.ll | ||
change-printer.ll | ||
cleanup-lcssa.ll | ||
codegenprepare-and-debug.ll | ||
constant-fold-gep-address-spaces.ll | ||
constant-fold-gep.ll | ||
copy-metadata-of-declaration.ll | ||
debug-pass-manager.ll | ||
debugcounter-dce.ll | ||
debugcounter-earlycse.ll | ||
debugcounter-newgvn.ll | ||
debugcounter-predicateinfo.ll | ||
devirt-invalidated.ll | ||
devirtualization-undef.ll | ||
force-opaque-ptrs.ll | ||
heat-colors-graphs.ll | ||
invalid-commandline-option.ll | ||
invariant.group.ll | ||
lint.ll | ||
lit-globbing.ll | ||
lit-quoting.txt | ||
lit-unicode.txt | ||
llvm-nm-without-aliases.ll | ||
loop-deletion-printer.ll | ||
loop-mssa-not-preserved.ll | ||
loop-pass-ordering.ll | ||
loop-pass-printer.ll | ||
loop-pm-invalidation.ll | ||
loopnest-callback.ll | ||
loopnest-pass-ordering.ll | ||
machine-size-remarks.ll | ||
mixed-opaque-ptrs-2.ll | ||
mixed-opaque-ptrs.ll | ||
module-pass-printer.ll | ||
new-pass-manager-verify-each.ll | ||
new-pass-manager.ll | ||
new-pm-O0-defaults.ll | ||
new-pm-O0-ep-callbacks.ll | ||
new-pm-cspgo.ll | ||
new-pm-defaults.ll | ||
new-pm-eager-invalidate.ll | ||
new-pm-lto-defaults.ll | ||
new-pm-pgo-O0.ll | ||
new-pm-pgo-preinline.ll | ||
new-pm-pgo.ll | ||
new-pm-pr42726-cgscc.ll | ||
new-pm-print-pipeline.ll | ||
new-pm-pseudo-probe.ll | ||
new-pm-thinlto-defaults.ll | ||
new-pm-thinlto-postlink-pgo-defaults.ll | ||
new-pm-thinlto-postlink-samplepgo-defaults.ll | ||
new-pm-thinlto-prelink-pgo-defaults.ll | ||
new-pm-thinlto-prelink-samplepgo-defaults.ll | ||
new-pm-time-trace.ll | ||
no-rerun-function-simplification-pipeline.ll | ||
opt-On.ll | ||
opt-bisect-helper.py | ||
opt-bisect-new-pass-manager.ll | ||
opt-hot-cold-split.ll | ||
opt-old-new-pm-passes.ll | ||
opt-override-denormal-fp-math-f32.ll | ||
opt-override-denormal-fp-math-mixed.ll | ||
opt-override-denormal-fp-math.ll | ||
opt-override-frame-pointer.ll | ||
opt-override-mcpu-mattr.ll | ||
opt-pipeline-vector-passes.ll | ||
opt-twice.ll | ||
optimization-remarks-auto.ll | ||
optimization-remarks-inline.ll | ||
optimization-remarks-invalidation.ll | ||
optimization-remarks-lazy-bfi.ll | ||
optimize-inrange-gep.ll | ||
pass-pipeline-parsing.ll | ||
pipefail.txt | ||
pr32085.ll | ||
print-before-after.ll | ||
print-changed-deleted.ll | ||
print-debug-counter.ll | ||
print-module-scope.ll | ||
print-passes.ll | ||
print-slotindexes.ll | ||
printer.ll | ||
scalable-vector-array.ll | ||
scalable-vector-struct-intrinsic.ll | ||
scalable-vectors-core-ir.ll | ||
scc-deleted-printer.ll | ||
scc-pass-printer.ll | ||
spir_cc.ll | ||
statistic.ll | ||
time-passes.ll | ||
unroll-sroa.ll | ||
writing-to-stdout.ll |