Summary:
eraseBlock is trying to erase all probability info for the given BB.
This info is stored in a DenseMap organized like so:
using Edge = std::pair<const BasicBlock *, unsigned>;
DenseMap<Edge, BranchProbability> Probs;
where the unsigned in the Edge key is the successor id.
It was walking through every single map entry, checking if the BB in the
key's pair matched the given BB. Much more efficient is to do what
another method (getEdgeProbability) was already doing, which is to walk
the successors of the BB, and simply do a map lookup on the key formed
from each <BB, successor id> pair.
Doing this dropped the overall compile time for a file containing a
very large function by around 32%.
Reviewers: davidxl, xur
Subscribers: llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83596
Allow InvokeInst to have the second optional prof branch weight for
its unwind branch. InvokeInst is a terminator with two successors.
It might have its unwind branch taken many times. If so
the BranchProbabilityInfo unwind branch heuristic can be inaccurate.
This patch allows a higher accuracy calculated with both branch
weights set.
Changes:
- A new section about InvokeInst is added to
the BranchWeightMetadata page. It states the old information that
missed in the doc and adds new about the second branch weight.
- Verifier is changed to allow either 1 or 2 branch weights
for InvokeInst.
- A new test is written for BranchProbabilityInfo to demonstrate
the main improvement of the simple fix in calcMetadataWeights().
- Several new testcases are created for Inliner. Those check that
both weights are accounted for invoke instruction weight
calculation.
- PGOUseFunc::setBranchWeights() is fixed to be applicable to
InvokeInst.
Reviewers: davidxl, reames, xur, yamauchi
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80618
When fixing probability of unreachable edges in
BranchProbabilityInfo::calcMetadataWeights() proportionally distribute
remainder probability over the reachable edges. The old implementation
distributes the remainder probability evenly.
See examples in the fixed tests.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80611
Hide the method that allows setting probability for particular edge
and introduce a public method that sets probabilities for all
outgoing edges at once.
Setting individual edge probability is error prone. More over it is
difficult to check that the total probability is 1.0 because there is
no easy way to know when the user finished setting all
the probabilities.
Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights().
Changing unreachable branch probabilities to raw(1) and distributing
the rest (oldProbability - raw(1)) over the reachable branches could
introduce total probability inaccuracy bigger than 1/numOfBranches.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
Summary:
By design 'BranchProbabilityInfo:: getEdgeProbability(const BasicBlock *Src, const BasicBlock *Dst) const' should return sum of probabilities over all edges from Src to Dst. Current implementation is buggy and returns 1/num_of_successors if probabilities are not explicitly set.
Note current implementation of BPI printing has an issue as well and annotates each edge with sum of probabilities over all ages from one basic block to another. That's why 30% probability reported (instead of 10%) in the lit test. This is not urgent issue since only printing is affected.
Note also current implementation assumes that either all or none edges have probabilities set. This is not the only place which uses such assumption. At least we should assert that in verifier. In addition we can think on a more robust API of BPI which would prevent situations.
Reviewers: skatkov, yrouban, taewookoh
Reviewed By: skatkov
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79071
Summary: Currenlty BPI unconditionally creates post dominator tree each time. While this is not incorrect we can save compile time by reusing existing post dominator tree (when it's valid) provided by analysis manager.
Reviewers: skatkov, taewookoh, yrouban
Reviewed By: skatkov
Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78987
Summary:
Rename `succ_const_iterator` to `const_succ_iterator` and
`succ_const_range` to `const_succ_range` for consistency with the
predecessor iterators, and the corresponding iterators in
MachineBasicBlock.
Reviewers: nicholas, dblaikie, nlewycky
Subscribers: hiraditya, bmahjour, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75952
Summary: b19ec1eb3d has been reverted because of the test failures
with PowerPC targets. This patch addresses the issues from the previous
commit.
Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll
and CodeGen/PowerPC/sms-cpy-1.ll pass
Subscribers: llvm-commits
Summary:
While updatePostDominatedByUnreachable attemps to find basic blocks that are post-domianted by unreachable blocks, it currently cannot handle loops precisely, because it doesn't use the actual post dominator tree analysis but relies on heuristics of visiting basic blocks in post-order. More precisely, when the entire loop is post-dominated by the unreachable block, current algorithm fails to detect the entire loop as post-dominated by the unreachable because when the algorithm reaches to the loop latch it fails to tell all its successors (including the loop header) will "eventually" be post-domianted by the unreachable block, because the algorithm hasn't visited the loop header yet. This makes BPI for the loop latch to assume that loop backedges are taken with 100% of probability. And because of this, block frequency info sometimes marks virtually dead loops (which are post dominated by unreachable blocks) super hot, because 100% backedge-taken probability makes the loop iteration count the max value. updatePostDominatedByColdCall has the exact same problem as well.
To address this problem, this patch makes PostDominatedByUnreachable/PostDominatedByColdCall to be computed with the actual post-dominator tree.
Reviewers: skatkov, chandlerc, manmanren
Reviewed By: skatkov
Subscribers: manmanren, vsk, apilipenko, Carrot, qcolombet, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70104
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Since NaN is very rare in normal programs, so the probability for floating point unordered comparison should be extremely small. Current probability is 3/8, it is too large, this patch changes it to a tiny number.
Differential Revision: https://reviews.llvm.org/D65303
llvm-svn: 371541
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.
This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.
Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.
There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.
Reviewers: chandlerc, hfinkel
Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66428
llvm-svn: 371284
Constant hoisting may have hidden a constant behind a bitcast so that
it isn't folded into its users. However, this prevents BPI from
calculating some of its heuristics that are based upon constant
values. So, I've added a simple helper function to look through these
casts.
Differential Revision: https://reviews.llvm.org/D58166
llvm-svn: 354119
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
by `getTerminator()` calls instead be declared as `Instruction`.
This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).
llvm-svn: 344502
Currently the loop branch heuristic is applied before the invoke heuristic which makes us overestimate the probability of the unwind destination of invokes inside loops. This in turn makes us grossly underestimate the frequencies of loops with invokes.
Reviewed By: skatkov, vsk
Differential Revision: https://reviews.llvm.org/D47371
llvm-svn: 334285
Summary:
Require DominatorTree when requiring/preserving LoopInfo in the old pass manager
BreakCriticalEdges tries to keep LoopInfo and DominatorTree updated if they
exist. However, since commit r321653 and r321805, to update LoopInfo we
must have a DominatorTree, or we will hit an assert.
To fix this we now make a couple of passes that only required/preserved
LoopInfo also require DominatorTree.
This solves PR37334.
Reviewers: eli.friedman, efriedma
Reviewed By: efriedma
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D46829
llvm-svn: 332583
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.
Differential Revision: https://reviews.llvm.org/D46668
llvm-svn: 332057
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
If we have a loop like this:
int n = 0;
while (...) {
if (++n >= MAX) {
n = 0;
}
}
then the body of the 'if' statement will only be executed once every MAX
iterations. Detect this by looking for branches in loops where taking the branch
makes the branch condition evaluate to 'not taken' in the next iteration of the
loop, and reduce the probability of such branches.
This slightly improves EEMBC benchmarks on cortex-m4/cortex-m33 due to making
better choices in if-conversion, but has no effect on any other cpu/benchmark
that I could detect.
Differential Revision: https://reviews.llvm.org/D35804
llvm-svn: 325925
Summary:
Compute the strongly connected components of the CFG and fall back to
use these for blocks that are in loops that are not detected by
LoopInfo when computing loop back-edge and exit branch probabilities.
Reviewers: dexonsmith, davidxl
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D39385
llvm-svn: 317094
Summary:
Add options -print-bfi/-print-bpi that dump block frequency and branch
probability info like -view-block-freq-propagation-dags and
-view-machine-block-freq-propagation-dags do but in text.
This is useful when the graph is very large and complex (the dot command
crashes, lines/edges too close to tell apart, hard to navigate without textual
search) or simply when text is preferred.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37165
llvm-svn: 311822
Current PGO only annotates the edge weight for branch and switch instructions
with profile counts. We should also annotate the indirectbr instruction as
all the information is there. This patch enables the annotating for indirectbr
instructions. Also uses this annotation in branch probability analysis.
Differential Revision: https://reviews.llvm.org/D37074
llvm-svn: 311604
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.
llvm-svn: 307292
The zero heuristic assumes that integers are more likely positive than negative,
but this also has the effect of assuming that strcmp return values are more
likely positive than negative. Given that for nonzero strcmp return values it's
the ordering of arguments that determines the sign of the result there's no
reason to assume that's true.
Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to
decide if it's strcmp-like, and if so only assume that nonzero is more likely
than zero i.e. strings are more often different than the same. This causes a
slight code generation change in the spec2006 benchmark 403.gcc, but with no
noticeable performance impact. The intent of this patch is to allow better
optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are
also some changes that need to be made to if-conversion.
Differential Revision: https://reviews.llvm.org/D33934
llvm-svn: 304970
The probability of edge coming to unreachable block should be as low as possible.
The change reduces the probability to minimal value greater than zero.
The bug https://bugs.llvm.org/show_bug.cgi?id=32214 show the example when
the probability of edge coming to unreachable block is greater than for edge
coming to out of the loop and it causes incorrect loop rotation.
Please note that with this change the behavior of unreachable heuristic is a bit different
than others. Specifically, before this change the sum of probabilities
coming to unreachable blocks have the same weight for all branches
(it was just split over all edges of this block coming to unreachable blocks).
With this change it might be slightly different but not to much due to probability of
taken branch to unreachable block is really small.
Reviewers: chandlerc, sanjoy, vsk, congh, junbuml, davidxl, dexonsmith
Reviewed By: chandlerc, dexonsmith
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D30633
llvm-svn: 303327
This is a follow up patch for https://reviews.llvm.org/rL300440
to address a comment.
To make implementation to be consistent with other cases we just
ignore the remainder after distribution of remaining probability between
reachable edges.
If we reduced the probability of some edges coming to unreachable
blocks we should distribute the remaining part across other edges
coming to reachable blocks to satisfy the condition that sum of all
probabilities should be equal to one. If this remaining part is not
divided by number of "reachable" edges then we get this remainder.
This remainder probability should be pretty small. Other cases just ignore
if the sum of probabilities is not equal to one so we do the same.
Reviewers: chandlerc, sanjoy, vsk, junbuml, reames
Reviewed By: reames
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D32124
llvm-svn: 302883
This is non-functional change to re-order if statements to bail out earlier
from unreachable and ColdCall heuristics.
Reviewers: sanjoy, reames, junbuml, vsk, chandlerc
Reviewed By: chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31704
llvm-svn: 300442
Metadata potentially is more precise than any heuristics we use, so
it makes sense to use first metadata info if it is available. However it makes
sense to examine it against other strong heuristics like unreachable one.
If edge coming to unreachable block has higher probability then it is expected
by unreachable heuristic then we use heuristic and remaining probability is
distributed among other reachable blocks equally.
An example where metadata might be more strong then unreachable heuristic is
as follows: it is possible that there are two branches and for the branch A
metadata says that its probability is (0, 2^25). For the branch B
the probability is (1, 2^25).
So the expectation is that first edge of B is hotter than first edge of A
because first edge of A did not executed at least once.
If first edge of A points to the unreachable block then using the unreachable
heuristics we'll set the probability for A to (1, 2^20) and now edge of A
becomes hotter than edge of B.
This is unexpected behavior.
This fixed the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214
Reviewers: sanjoy, junbuml, vsk, chandlerc
Reviewed By: chandlerc
Subscribers: llvm-commits, reames, davidxl
Differential Revision: https://reviews.llvm.org/D30631
llvm-svn: 300440
Collection of PostDominatedByUnreachable and PostDominatedByColdCall have been
split out of heuristics itself. Update of the data happens now for each basic
block (before update for PostDominatedByColdCall might be skipped if
unreachable or matadata heuristic handled this basic block).
This separation allows re-ordering of heuristics without loosing
the post-domination information.
Reviewers: sanjoy, junbuml, vsk, chandlerc, reames
Reviewed By: chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31701
llvm-svn: 300029
BPI may trigger signed overflow UB while computing branch probabilities for
cold calls or to unreachables. For example, with our current choice of weights,
we'll crash if there are >= 2^12 branches to an unreachable.
Use a safer BranchProbability constructor which is better at handling fractions
with large denominators.
Changes since the initial commit:
- Use explicit casts to ensure that multiplication operands are 64-bit
ints.
rdar://problem/29368161
Differential Revision: https://reviews.llvm.org/D27862
llvm-svn: 290022
This reverts commit r290016. It breaks this bot, even though the test
passes locally:
http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/32956/
AnalysisTests: /home/bb/ninja-x64-msvc-RA-centos6/llvm-project/llvm/lib/Support/BranchProbability.cpp:52: static llvm::BranchProbability llvm::BranchProbability::getBranchProbability(uint64_t, uint64_t): Assertion `Numerator <= Denominator && "Probability cannot be bigger than 1!"' failed.
llvm-svn: 290019
BPI may trigger signed overflow UB while computing branch probabilities
for cold calls or to unreachables. For example, with our current choice
of weights, we'll crash if there are >= 2^12 branches to an unreachable.
Use a safer BranchProbability constructor which is better at handling
fractions with large denominators.
rdar://problem/29368161
Differential Revision: https://reviews.llvm.org/D27862
llvm-svn: 290016
analyses to have a common type which is enforced rather than using
a char object and a `void *` type when used as an identifier.
This has a number of advantages. First, it at least helps some of the
confusion raised in Justin Lebar's code review of why `void *` was being
used everywhere by having a stronger type that connects to documentation
about this.
However, perhaps more importantly, it addresses a serious issue where
the alignment of these pointer-like identifiers was unknown. This made
it hard to use them in pointer-like data structures. We were already
dodging this in dangerous ways to create the "all analyses" entry. In
a subsequent patch I attempted to use these with TinyPtrVector and
things fell apart in a very bad way.
And it isn't just a compile time or type system issue. Worse than that,
the actual alignment of these pointer-like opaque identifiers wasn't
guaranteed to be a useful alignment as they were just characters.
This change introduces a type to use as the "key" object whose address
forms the opaque identifier. This both forces the objects to have proper
alignment, and provides type checking that we get it right everywhere.
It also makes the types somewhat less mysterious than `void *`.
We could go one step further and introduce a truly opaque pointer-like
type to return from the `ID()` static function rather than returning
`AnalysisKey *`, but that didn't seem to be a clear win so this is just
the initial change to get to a reliably typed and aligned object serving
is a key for all the analyses.
Thanks to Richard Smith and Justin Lebar for helping pick plausible
names and avoid making this refactoring many times. =] And thanks to
Sean for the super fast review!
While here, I've tried to move away from the "PassID" nomenclature
entirely as it wasn't really helping and is overloaded with old pass
manager constructs. Now we have IDs for analyses, and key objects whose
address can be used as IDs. Where possible and clear I've shortened this
to just "ID". In a few places I kept "AnalysisID" to make it clear what
was being identified.
Differential Revision: https://reviews.llvm.org/D27031
llvm-svn: 287783
Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst.
Reviewers: mcrosier, hfinkel, davidxl
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D24868
llvm-svn: 282262
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
llvm-svn: 278077
Most possibly problem was caused by the same reason as PR28400. This change
bypasses it by using CallbackVH instead of AssertingVH.
Differential Revision: https://reviews.llvm.org/D20957
llvm-svn: 275563