Commit Graph

1062 Commits

Author SHA1 Message Date
Roman Lebedev b5822026dd
[SimplifyCFG] 'Fold branch to common dest': don't overestimate the cost
`FoldBranchToCommonDest()` has a certain budget (`-bonus-inst-threshold=`)
for bonus instruction duplication. And currently it calculates the cost
as-if it will actually duplicate into each predecessor.

But ignoring the budget, it won't always duplicate into each predecessor,
there are some correctness and profitability checks.
So when calculating the cost, we should first check into which blocks
will we *actually* duplicate, and only then use that block count
to do budgeting.
2021-03-23 18:30:26 +03:00
Roman Lebedev 514bc01ca3
[SimplifyCFG] FoldBranchToCommonDest(): properly handle same-block external uses (PR49510/PR49689)
We clone bonus instructions to the end of the predecessor block,
and then use `SSAUpdater::RewriteUseAfterInsertions()`.
But that only deals with the cases where the use-to-be-rewritten
are either in different block from the def, or come after the def.

But in some loop cases, the external use may be in the beginning of
predecessor block, before the newly cloned bonus instruction.
`SSAUpdater::RewriteUseAfterInsertions()` does not deal with that.
Notably, the external use can't happen to be both in the same block
and *after* the newly-cloned instruction, because of the fold preconditions.

To properly handle these cases, when the use is in the same block,
we should instead use `SSAUpdater::RewriteUse()`.
TBN, they do the same thing for PHI users.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49510
Likely Fixes https://bugs.llvm.org/show_bug.cgi?id=49689
2021-03-23 17:37:28 +03:00
Sanjay Patel 1bf8f9e228 [SimplifyCFG] use profile metadata to refine merging branch conditions
2nd try (original: 27ae17a6b0) with fix/test for crash. We must make
sure that TTI is available before trying to use it because it is not
required (might be another bug).

Original commit message:

This is one step towards solving:
https://llvm.org/PR49336

In that example, we disregard the recommended usage of builtin_expect,
so an expensive (unpredictable) branch is folded into another branch
that is guarding it.
Here, we read the profile metadata to see if the 1st (predecessor)
condition is likely to cause execution to bypass the 2nd (successor)
condition before merging conditions by using logic ops.

Differential Revision: https://reviews.llvm.org/D98898
2021-03-23 10:19:37 -04:00
Juneyoung Lee 5c2e50b5d2 Reland "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe"
This relands commit 99108c791d (D95026) which was
reverted by 8d5a981a13 because the underlying
problem (https://llvm.org/pr49495) is fixed.
2021-03-23 09:19:53 +09:00
Sanjay Patel 95f7f7c21b Revert "[SimplifyCFG] use profile metadata to refine merging branch conditions"
This reverts commit 27ae17a6b0.
There are bot failures that end with:
 #4 0x00007fff7ae3c9b8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #5 0x00007fff84e504d8 (linux-vdso64.so.1+0x4d8)
 #6 0x00007fff7c419a5c llvm::TargetTransformInfo::getPredictableBranchThreshold() const (/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMAnalysis.so.13git+0x479a5c)

...but not sure how to trigger that yet.
2021-03-22 17:48:06 -04:00
Sanjay Patel 27ae17a6b0 [SimplifyCFG] use profile metadata to refine merging branch conditions
This is one step towards solving:
https://llvm.org/PR49336

In that example, we disregard the recommended usage of builtin_expect,
so an expensive (unpredictable) branch is folded into another branch
that is guarding it.
Here, we read the profile metadata to see if the 1st (predecessor)
condition is likely to cause execution to bypass the 2nd (successor)
condition before merging conditions by using logic ops.

Differential Revision: https://reviews.llvm.org/D98898
2021-03-22 16:49:21 -04:00
Sanjay Patel bd197ed0a5 [SimplifyCFG] avoid sinking insts within an infinite-loop
The test is reduced from a C source example in:
https://llvm.org/PR49541

It's possible that the test could be reduced further or
the predicate generalized further, but it seems to require
a few ingredients (including the "late" SimplifyCFG options
on the RUN line) to fall into the infinite-loop trap.
2021-03-12 08:04:57 -05:00
Mehdi Amini 8d5a981a13 Revert "[SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe"
This reverts commit 99108c791d.
Clang is miscompiling LLVM with this change, a stage-2 build hits
multiple failures.

As a repro, I built clang in a stage1 directory and used it this way:

cmake -G Ninja ../llvm \
  -DCMAKE_CXX_COMPILER=`pwd`/../build-stage1/bin/clang++ \
  -DCMAKE_C_COMPILER=`pwd`/../build-stage1/bin/clang \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX;AMDGPU" \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DLLVM_BUILD_EXAMPLES=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=On
ninja check-mlir
2021-03-08 00:15:47 +00:00
Juneyoung Lee 99108c791d [SimplifyCFG] Update FoldBranchToCommonDest to be poison-safe
This patch makes FoldBranchToCommonDest merge branch conditions into `select i1` rather than `and/or i1` when it is called by SimplifyCFG.
It is known that merging conditions into and/or is poison-unsafe, and this is towards making things *more* correct by removing possible miscompilations.
Currently, InstCombine simply consumes these selects into and/or of i1 (which is also unsafe), so the visible effect would be very small. The unsafe select -> and/or transformation will be removed in the future.
There has been efforts for updating optimizations to support the select form as well, and they are linked to D93065.

The safe transformation is fired when it is called by SimplifyCFG only. This is done by setting the new `PoisonSafe` argument as true.
Another place that calls FoldBranchToCommonDest is LoopSimplify. `PoisonSafe` flag is set to false in this case because enabling it has a nontrivial impact in performance because SCEV is more conservative with select form and InductiveRangeCheckElimination isn't aware of select form of and/or i1.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95026
2021-03-08 01:38:03 +09:00
Hongtao Yu 8985515822 [CSSPGO] Unblocking optimizations by dangling pseudo probes.
This change fixes a couple places where the pseudo probe intrinsic blocks optimizations because they are not naturally removable. To unblock those optimizations, the blocking pseudo probes are moved out of the original blocks and tagged dangling, instead of allowing pseudo probes to be literally removed. The reason is that when the original block is removed, we won't be able to sample it. Instead of assigning it a zero weight, moving all its pseudo probes into another block and marking them dangling should allow the counts inference a chance to assign them a more reasonable weight. We have not seen counts quality degradation from our experiments.

The optimizations being unblocked are:

	1. Removing conditional probes for if-converted branches. Conditional probes are tagged dangling when their homing branch arms are folded so that they will not be over-counted.
	2. Unblocking jump threading from removing empty blocks. Pseudo probe prevents jump threading from removing logically empty blocks that only has one unconditional jump instructions.
	3. Unblocking SimplifyCFG and MIR tail duplicate to thread empty blocks and blocks with redundant branch checks.

Since dangling probes are logically deleted, they should not consume any samples in LTO postLink. This can be achieved by setting their distribution factors to zero when dangled.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D97481
2021-03-03 22:44:42 -08:00
Juneyoung Lee 5419b67137 [SimplifyCFG] Update FoldTwoEntryPHINode to handle and/or of select and binop equally
This is a minor change that fixes FoldTwoEntryPHINode to handle
phis with and/ors of select form and binop form equally.
2021-03-01 13:34:51 +09:00
Kazu Hirata 1d4a2f3778 [Transforms/Utils] Use range-based for loops (NFC) 2021-02-26 22:36:40 -08:00
Juneyoung Lee 56d228a14e [SimplifyCFG] Update passingValueIsAlwaysUndefined to check more attributes
This is a simple patch to update SimplifyCFG's passingValueIsAlwaysUndefined to inspect more attributes.

A new function `CallBase::isPassingUndefUB` checks attributes that imply noundef.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97244
2021-02-24 10:40:50 +09:00
Kazu Hirata be23012d5a [Transforms/Utils] Use range-based for loops (NFC) 2021-02-07 09:49:36 -08:00
Sander de Smalen bf294953e7 NFC: Migrate SimplifyCFG to work on InstructionCost
This patch migrates cost values and arithmetic to work on InstructionCost.
When the interfaces to TargetTransformInfo are changed, any InstructionCost
state will propagate naturally.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D95351
2021-02-01 16:14:05 +00:00
Sander de Smalen 880b64aa22 [SimplifyCFG] NFC: Rename static methods to clang-tidy standards.
This patch is a precursor to D95351, which changes the signature
of these methods.
2021-02-01 16:14:05 +00:00
Roman Lebedev 8cfa963463
[SimplifyCFG] If provided, preserve Dominator Tree
SimplifyCFG is an utility pass, and the fact that it does not
preserve DomTree's, forces it's users to somehow workaround that,
likely by not preserving DomTrees's themselves.

Indeed, simplifycfg pass didn't know how to preserve dominator tree,
it took me just under a month (starting with e113317958)
do rectify that, now it fully knows how to,
there's likely some problems with that still,
but i've dealt with everything i can spot so far.

I think we now can flip the switch.

Note that this is functionally an NFC change,
since this doesn't change the users to pass in the DomTree,
that is a separate question.

Reviewed By: kuhar, nikic

Differential Revision: https://reviews.llvm.org/D94827
2021-01-28 14:11:34 +03:00
Roman Lebedev 6f2753273e
[NFC][SimplifyCFG] Extract CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses() out of PerformBranchToCommonDestFolding()
To be used in PerformValueComparisonIntoPredecessorFolding()
2021-01-24 00:54:55 +03:00
Roman Lebedev 67f9c87a65
[NFC][SimplifyCFG] Perform early-continue in FoldValueComparisonIntoPredecessors() per-pred loop 2021-01-24 00:54:54 +03:00
Roman Lebedev a4e6c2e647
[NFC][SimplifyCFG] Extract PerformValueComparisonIntoPredecessorFolding() out of FoldValueComparisonIntoPredecessors()
Less nested code is much easier to follow and modify.
2021-01-24 00:54:54 +03:00
Roman Lebedev 022da61f6b
[SimplifyCFG] Change 'LoopHeaders' to be ArrayRef<WeakVH>, not a naked set, thus avoiding dangling pointers
If i change it to AssertingVH instead, a number of existing tests fail,
which means we don't consistently remove from the set when deleting blocks,
which means newly-created blocks may happen to appear in that set
if they happen to occupy the same memory chunk as did some block
that was in the set originally.

There are many places where we delete blocks,
and while we could probably consistently delete from LoopHeaders
when deleting a block in transforms located in SimplifyCFG.cpp itself,
transforms located elsewhere (Local.cpp/BasicBlockUtils.cpp) also may
delete blocks, and it doesn't seem good to teach them to deal with it.

Since we at most only ever delete from LoopHeaders,
let's just delegate to WeakVH to do that automatically.

But to be honest, personally, i'm not sure that the idea
behind LoopHeaders is sound.
2021-01-23 16:48:35 +03:00
Roman Lebedev 1742203844
[SimplifyCFG] FoldBranchToCommonDest(): re-lift restrictions on liveout uses of bonus instructions
I have previously tried doing that in
b33fbbaa34 / d38205144f,
but eventually it was pointed out that the approach taken there
was just broken wrt how the uses of bonus instructions are updated
to account for the fact that they should now use either bonus instruction
or the cloned bonus instruction. In particluar, all that manual handling
of PHI nodes in successors was just wrong.

But, the fix is actually much much simpler than my initial approach:
just tell SSAUpdate about both instances of bonus instruction,
and let it deal with all the PHI handling.

Alive2 confirms that the reproducers from the original bugs (@pr48450*)
are now handled correctly.

This effectively reverts commit 59560e8589,
effectively relanding b33fbbaa34.
2021-01-23 01:29:05 +03:00
Roman Lebedev eae1cc0de5
[NFC][SimplifyCFG] PerformBranchToCommonDestFolding(): move instruction cloning to after CFG update
This simplifies follow-up patch, and is NFC otherwise.
2021-01-23 01:29:04 +03:00
Roman Lebedev 9bd8bcf993
[NFC][SimplifyCFG] PerformBranchToCommonDestFolding(): fix instruction name preservation
NewBonusInst just took name from BonusInst, so BonusInst has no name,
so BonusInst.getName() makes no sense.
So we need to ask NewBonusInst for the name.
2021-01-23 01:29:03 +03:00
Roman Lebedev 85e7578c6d
Revert "[NFCI-ish][SimplifyCFG] FoldBranchToCommonDest(): really don't deal with uncond branches"
Does not build in XCode:
http://green.lab.llvm.org/green/job/clang-stage1-RA/17963/consoleFull#-1704658317a1ca8a51-895e-46c6-af87-ce24fa4cd561

This reverts commit aabed3718a.
2021-01-22 17:37:11 +03:00
Roman Lebedev efeb8caf8b
[NFC][SimplifyCFG] FoldBranchToCommonDest(): extract the actual transform into helper function
I'm intentionally structuring it this way, so that the actual fold only
does the fold, and no legality/correctness checks, all of which must be
done by the caller. This allows for the fold code to be more compact
and more easily grokable.
2021-01-22 17:23:53 +03:00
Roman Lebedev b482560a59
[NFC][SimplifyCFG] FoldBranchToCommonDest(): extract check for destination sharing into a helper function
As a follow-up, i'll extract the actual transform into a function,
and this helper will be called from both places,
so this avoids code duplication.
2021-01-22 17:23:53 +03:00
Roman Lebedev 7b89efb55e
[NFC][SimplifyCFG] FoldBranchToCommonDest(): somewhat better structure weight updating code
Hoist the successor updating out of the code that deals with branch
weight updating, and hoist the 'has weights' check from the latter,
making code more consistent and easier to follow.
2021-01-22 17:23:41 +03:00
Roman Lebedev 256a035752
[NFC][SimplifyCFG] FoldBranchToCommonDest(): unclutter Cond/CondInPred handling
We don't need those variables, we can just get the final value directly.
2021-01-22 17:23:11 +03:00
Roman Lebedev aabed3718a
[NFCI-ish][SimplifyCFG] FoldBranchToCommonDest(): really don't deal with uncond branches
While we already ignore uncond branches, we could still potentially
end up with a conditional branches with identical destinations
due to the visitation order, or because we were called as an utility.
But if we have such a disguised uncond branch,
we still probably shouldn't deal with it here.
2021-01-22 17:23:10 +03:00
Roman Lebedev 0895b836d7
[SimplifyCFG] FoldBranchToCommonDest(): don't deal with unconditional branches
The case where BB ends with an unconditional branch,
and has a single predecessor w/ conditional branch
to BB and a single successor of BB is exactly the pattern
SpeculativelyExecuteBB() transform deals with.
(and in this case they both allow speculating only a single instruction)

Well, or FoldTwoEntryPHINode(), if the final block
has only those two predecessors.

Here, in FoldBranchToCommonDest(), only a weird subset of that
transform is supported, and it's glued on the side in a weird way.
  In particular, it took me a bit to understand that the Cond
isn't actually a branch condition in that case, but just the value
we allow to speculate (otherwise it reads as a miscompile to me).
  Additionally, this only supports for the speculated instruction
to be an ICmp.

So let's just unclutter FoldBranchToCommonDest(), and leave
this transform up to SpeculativelyExecuteBB(). As far as i can tell,
this shouldn't really impact optimization potential, but if it does,
improving SpeculativelyExecuteBB() will be more beneficial anyways.

Notably, this only affects a single test,
but EarlyCSE should have run beforehand in the pipeline,
and then FoldTwoEntryPHINode() would have caught it.

This reverts commit rL158392 / commit d33f4efbfd.
2021-01-22 17:22:49 +03:00
Juneyoung Lee 395c737d9f [SimplifyCFG] Update SimplifyBranchOnICmpChain to recognize select form of and/or
This patch teaches SimplifyCFG::SimplifyBranchOnICmpChain to understand select form of
(x == C1 || x == C2 || ...) / (x != C1 && x != C2 && ...) and optimize them into switch if possible.
D93065 has more context about the transition, including links to the list of optimizations being updated.

Differential Revision: https://reviews.llvm.org/D93943
2021-01-19 08:53:40 +09:00
Dávid Bolvanský a1500105ee [SimplifyCFG] Optimize CFG when null is passed to a function with nonnull argument
Example:

```
__attribute__((nonnull,noinline)) char * pinc(char *p)  {
  return ++p;
}

char * foo(bool b, char *a) {
       return pinc(b ? 0 : a);

}
```

optimize to

```
char * foo(bool b, char *a) {
       return pinc(a);

}
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94180
2021-01-15 23:53:43 +01:00
Roman Lebedev a14c36fe27
[SimplifyCFG] switchToSelect(): don't forget to insert DomTree edge iff needed
DestBB might or might not already be a successor of SelectBB,
and it wasn't we need to ensure that we record the fact in DomTree.

The testcase used to crash in lazy domtree updater mode + non-per-function
domtree validity checks disabled.
2021-01-15 23:35:57 +03:00
Roman Lebedev c6654a4cda
[SimplifyCFG][BasicBlockUtils] Port SplitBlockPredecessors()/SplitLandingPadPredecessors() to DomTreeUpdater
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.

The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.

This function is one of last three that not operate on DomTreeUpdater.
2021-01-15 23:35:56 +03:00
Roman Lebedev 286cf6cb02
[SimplifyCFG] Port SplitBlockAndInsertIfThen() to DomTreeUpdater
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.

The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.

This function is one of last three that not operate on DomTreeUpdater.
2021-01-15 23:35:56 +03:00
Roman Lebedev c845c724c2
[Utils][SimplifyCFG] Port SplitBlock() to DomTreeUpdater
This is not nice, but it's the best transient solution possible,
and is better than just duplicating the whole function.

The problem is, this function is widely used,
and it is not at all obvious that all the users
could be painlessly switched to operate on DomTreeUpdater,
and somehow i don't feel like porting all those users first.

This function is one of last three that not operate on DomTreeUpdater.
2021-01-15 23:35:56 +03:00
Roman Lebedev f9ba347706
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): don't insert a DomTree edge if it already exists
When we are adding edges to the terminator and potentially turning it
into a switch (if it wasn't already), it is possible that the
case we're adding will share it's destination with one of the
preexisting cases, in which case there is no domtree edge to add.

Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
2021-01-12 02:09:47 +03:00
Roman Lebedev c0de0a1b72
[SimplifyCFG] SimplifyBranchOnICmpChain(): don't insert a DomTree edge that already exists
BB was already always branching to EdgeBB, there is no edge to add.

Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
2021-01-12 02:09:46 +03:00
Roman Lebedev c22bc5f1f8
[SimplifyCFG] SwitchToLookupTable(): don't insert a DomTree edge that already exists
SI is the terminator of BB, so the edge we are adding obviously already existed.

Indeed, this change does not have a test coverage change.
This failure has been exposed in an existing test coverage
by a follow-up patch that switches to lazy domtreeupdater mode,
and removes domtree verification from
SimplifyCFGOpt::simplifyOnce()/SimplifyCFGOpt::run(),
IOW it does not appear feasible to add dedicated test coverage here.
2021-01-12 02:09:46 +03:00
Roman Lebedev 8e8d214c4a
[NFCI][SimplifyCFG] Prefer to add Insert edges before Delete edges into DomTreeUpdater, if reasonable
This has a measurable impact on the number of DomTree recalculations.
While this doesn't handle all the cases,
it deals with the most obvious ones.
2021-01-11 00:30:44 +03:00
Florian Hahn d98fc62ae6 [SimplifyCFG] Keep !dgb metadata of moved instruction, if they match.
Currently SimplifyCFG drops the debug locations of 'bonus' instructions.
Such instructions are moved before the first branch. The reason for the
current behavior is that this could lead to surprising debug stepping,
if the block that's folded is dead.

In case the first branch and the instructions to be folded have the same
debug location, this shouldn't be an issue and we can keep the debug
location.

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D93662
2021-01-09 19:15:16 +00:00
Roman Lebedev 6be1fd6b20
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): drop reachable errneous assert
I have added it in d15d81c because it *seemed* correct, was holding
for all the tests so far, and was validating the fix added in the same
commit, but as David Major is pointing out (with a reproducer),
the assertion isn't really correct after all. So remove it.

Note that the d15d81c still fine.
2021-01-07 18:05:04 +03:00
Roman Lebedev a14945c1db
[SimplifyCFG] SimplifyEqualityComparisonWithOnlyPredecessor(): really don't delete DomTree edges multiple times 2021-01-06 01:52:39 +03:00
Roman Lebedev 2b437fcd47
[SimplifyCFG] SwitchToLookupTable(): switch to non-permissive DomTree updates
... which requires not deleting a DomTree edge that we just deleted.
2021-01-06 01:52:38 +03:00
Roman Lebedev fa5447aa3f
[NFC][SimplifyCFG] SwitchToLookupTable(): pull out SI->getParent() into a variable 2021-01-06 01:52:38 +03:00
Roman Lebedev d15d81ce15
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): deal with each predecessor only once
If the predecessor is a switch, and BB is not the default destination,
multiple cases could have the same destination. and it doesn't
make sense to re-process the predecessor, because we won't make any changes,
once is enough.

I'm not sure this can be really tested, other than via the assertion
being added here, which fires without the fix.
2021-01-06 01:52:37 +03:00
Roman Lebedev fc96cb2dad
[SimplifyCFG] FoldValueComparisonIntoPredecessors(): switch to non-permissive DomTree updates
... which requires not adding a DomTree edge that we just added.
2021-01-06 01:52:37 +03:00
Roman Lebedev 29ca7d5a1a
[SimplifyCFG] simplifyUnreachable(): fix handling of degenerate same-destination conditional branch
One would hope that it would have been already canonicalized into an
unconditional branch, but that isn't really guaranteed to happen
with SimplifyCFG's visitation order.
2021-01-06 01:52:36 +03:00
Roman Lebedev 3460719f58
[NFC][SimplifyCFG] Add a test with same-destination condidional branch
Reported by Mikael Holmén as post-commit feedback on
https://reviews.llvm.org/rG2d07414ee5f74a09fb89723b4a9bb0818bdc2e18#968162
2021-01-06 01:52:36 +03:00