This patch updates ValueLattice to distinguish between ranges that are
guaranteed to not include undef and ranges that may include undef.
A constant range guaranteed to not contain undef can be used to simplify
instructions to arbitrary values. A constant range that may contain
undef can only be used to simplify to a constant. If the value can be
undef, it might take a value outside the range. For example, consider
the snipped below
define i32 @f(i32 %a, i1 %c) {
br i1 %c, label %true, label %false
true:
%a.255 = and i32 %a, 255
br label %exit
false:
br label %exit
exit:
%p = phi i32 [ %a.255, %true ], [ undef, %false ]
%f.1 = icmp eq i32 %p, 300
call void @use(i1 %f.1)
%res = and i32 %p, 255
ret i32 %res
}
In the exit block, %p would be a constant range [0, 256) including undef as
%p could be undef. We can use the range information to replace %f.1 with
false because we remove the compare, effectively forcing the use of the
constant to be != 300. We cannot replace %res with %p however, because
if %a would be undef %cond may be true but the second use might not be
< 256.
Currently LazyValueInfo uses the new behavior just when simplifying AND
instructions and does not distinguish between constant ranges with and
without undef otherwise. I think we should address the remaining issues
in LVI incrementally.
Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D76931
CVP currently does not simplify cmps with instructions in the same
block, because LVI getPredicateAt() currently does not provide
much useful information for that case (D69686 would change that,
but is stuck.) However, if the instruction is a Phi node, then
LVI can compute the result of the predicate by threading it into
the predecessor blocks, which allows it simplify some conditions
that nothing else can handle. Relevant code:
6d6a4590c5/llvm/lib/Analysis/LazyValueInfo.cpp (L1904-L1927)
Differential Revision: https://reviews.llvm.org/D72169
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
This phi simplification transform was added with:
D45448
However as shown in PR43802:
https://bugs.llvm.org/show_bug.cgi?id=43802
...we must be careful not to propagate poison when we do the substitution.
There might be some more complicated analysis possible to retain the overflow flag,
but it should always be safe and easy to drop flags (we have similar behavior in
instcombine and other passes).
Differential Revision: https://reviews.llvm.org/D69442
Summary:
CVP, unlike InstCombine, does not run till exaustion.
It only does a single pass.
When dealing with those special binops, if we prove that they can
safely be demoted into their usual binop form,
we do set the no-wrap we deduced. But when dealing with usual binops,
we try to deduce both no-wraps.
So if we convert e.g. @llvm.uadd.with.overflow() to `add nuw`,
we won't attempt to check whether it can be `add nuw nsw`.
This patch proposes to call `processBinOp()` on newly-created binop,
which is identical to what we do for div/rem already.
Reviewers: nikic, spatel, reames
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69183
llvm-svn: 375273
Summary:
It looks like this is the only missing statistic in the CVP pass.
Since we prove NSW and NUW separately i'd think we should count them separately too.
Reviewers: nikic, spatel, reames
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68740
llvm-svn: 375230
This is really a known bits style transformation, but known bits isn't context sensitive. The particular case which comes up happens to involve a range which allows range based reasoning to eliminate the mask pattern, so handle that case specifically in CVP.
InstCombine likes to generate the mask-by-low-bits pattern when widening an arithmetic expression which includes a zext in the middle.
Differential Revision: https://reviews.llvm.org/D68811
llvm-svn: 374506
Use a { iN undef, i1 false } struct as the base, and only insert
the first operand, instead of using { iN undef, i1 undef } as the
base and inserting both. This is the same as what we do in InstCombine.
Differential Revision: https://reviews.llvm.org/D67034
llvm-svn: 370573
Inference of nowrap flags in CVP has been disabled, because it
triggered a bug in LFTR (https://bugs.llvm.org/show_bug.cgi?id=31181).
This issue has been fixed in D60935, so we should be able to reenable
nowrap flag inference now.
Differential Revision: https://reviews.llvm.org/D62776
llvm-svn: 364228
This reverts commit 5b32f60ec3.
The fix is in commit 4f9e68148b.
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126
llvm-svn: 362583
If we can determine that a saturating add/sub will not overflow based
on range analysis, convert it into a simple binary operation. This is
a sibling transform to the existing with.overflow handling.
Reapplying this with an additional check that the saturating intrinsic
has integer type, as LVI currently does not support vector types.
Differential Revision: https://reviews.llvm.org/D62703
llvm-svn: 362263
Noticed on D62703. LVI only handles plain integers, not vectors of
integers. This was previously not an issue, because vector support
for with.overflow is only a relatively recent addition.
llvm-svn: 362261
If we can determine that a saturating add/sub will not overflow
based on range analysis, convert it into a simple binary operation.
This is a sibling transform to the existing with.overflow handling.
Differential Revision: https://reviews.llvm.org/D62703
llvm-svn: 362242
This reverts commit 53f2f32865.
As reported on D62126, this causes assertion failures if the switch
has incorrect branch_weights metadata, which may happen as a result
of other transforms not handling it correctly yet.
llvm-svn: 361881
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126
llvm-svn: 361808
The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.
To make this more obviously true, replace the conversative return
in makeGNWR with an assertion.
llvm-svn: 361698
This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.
After thinking about this more, this isn't right, the range is not exact
in the same sense as makeExactICmpRegion(). This needs a separate
function.
llvm-svn: 358876
Following D60632 makeGuaranteedNoWrapRegion() always returns an
exact nowrap region. Rename the function accordingly. This is in
line with the naming of makeExactICmpRegion().
llvm-svn: 358875
Summary:
Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative.
Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181)
Reviewers: sanjoy, apilipenko, nikic
Reviewed By: nikic
Subscribers: hiraditya, jfb, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60036
llvm-svn: 358816
If a umul.with.overflow or smul.with.overflow operation cannot
overflow, simplify it to a simple mul nuw / mul nsw. After the
refactoring in D60668 this is just a matter of removing an
explicit check against multiplications.
Differential Revision: https://reviews.llvm.org/D60791
llvm-svn: 358521
This adds a WithOverflowInst class with a few helper methods to get
the underlying binop, signedness and nowrap type and makes use of it
where sensible. There will be two more uses in D60650/D60656.
The refactorings are all NFC, though I left some TODOs where things
could be improved. In particular we have two places where add/sub are
handled but mul isn't.
Differential Revision: https://reviews.llvm.org/D60668
llvm-svn: 358512
When CVP determines that a with.overflow intrinsic cannot overflow,
it currently inserts a simple add/sub. As we already determined that
there can be no overflow, we should add the appropriate NUW/NSW flag.
Differential Revision: https://reviews.llvm.org/D60585
llvm-svn: 358298
Summary:
This patch separates two semantics of `applyUpdates`:
1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update.
2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated.
Logic changes:
Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example,
```
DTU(Lazy) and Edge A->B exists.
1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued
2. Remove A->B
3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended)
```
But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue.
Interface changes:
The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`.
These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`.
Reviewers: kuhar, brzycki, dmgreen, grosser
Reviewed By: brzycki
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58170
llvm-svn: 354669
DomTreeUpdater depends on headers from Analysis, but is in IR. This is a
layering violation since Analysis depends on IR. Relocate this code from IR
to Analysis to fix the layering violation.
llvm-svn: 353265
Deopt operands are generally intended to record information about a site in code with minimal perturbation of the surrounding code. Idiomatically, they also tend to appear down rare paths. Putting these together, we have an obvious case for extending CVP w/deopt operand constant folding. Arguably, we should be doing this for all operands on all instructions, but that's definitely a much larger and risky change.
Differential Revision: https://reviews.llvm.org/D55678
llvm-svn: 351774
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
1. The variables were confusing: 'C' typically refers to a constant, but here it was the Cmp.
2. Formatting violations.
3. Simplify code to return true/false constant.
llvm-svn: 347868
Fix all of the missing debug location errors in CVP found by debugify.
This includes the missing-location-after-udiv-truncation case described
in llvm.org/PR38178.
llvm-svn: 347147
Summary:
This patch is the second in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
It converts passes (e.g. adce/jump-threading) and various functions which currently accept DDT in local.cpp and BasicBlockUtils.cpp to use the new DomTreeUpdater class.
These converted functions in utils can accept DomTreeUpdater with either UpdateStrategy and can deal with both DT and PDT held by the DomTreeUpdater.
Reviewers: brzycki, kuhar, dmgreen, grosser, davide
Reviewed By: brzycki
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48967
llvm-svn: 338814
Summary:
We only modify CFG in a couple of places, and we can preserve DT there
with a little effort.
Reviewers: davide, vsk
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48059
llvm-svn: 334895
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)
llvm-svn: 333954
We were previously using a DT in CVP through SimplifyQuery, but not requiring it in
the new pass manager. Hence it would crash if DT was not already available. This now
gets DT directly and plumbs it through to where it is used (instead of using it
through SQ).
llvm-svn: 332836
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
llvm-svn: 332240
This is based on an example that was recently posted on llvm-dev:
void *propagate_null(void* b, int* g) {
if (!b) {
return 0;
}
(*g)++;
return b;
}
https://godbolt.org/g/xYk3qG
The original code or constant propagation in other passes has obscured the fact
that the phi can be removed completely.
Differential Revision: https://reviews.llvm.org/D45448
llvm-svn: 329755
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.
Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.
Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.
llvm-svn: 328165
Summary:
If the operands of a udiv/urem can be proved to fit within a smaller
power-of-two-sized type, reduce the width of the udiv/urem.
Backed out for causing performance regressions. Re-landing
because we've determined that these regressions were noise.
Original Differential Revision: https://reviews.llvm.org/D44102
llvm-svn: 328096
This reverts r326908, originally landed as D44102.
Reverted for causing performance regressions on x86. (These regressions
are not yet understood.)
llvm-svn: 327252