As pointed out in D60518 folding mulo(%x, undef) to {undef, undef}
isn't correct. As a correct version of this already exists in
InstructionSimplify (bd8056ef32/lib/Analysis/InstructionSimplify.cpp (L4750-L4757)) this is just
dead code though. Drop it together with the mul(%x, 0) -> {0, false}
fold that is also already handled by InstSimplify.
Differential Revision: https://reviews.llvm.org/D60649
llvm-svn: 358339
Following D60483 and D60497, this adds support for AlwaysOverflows
handling for ssubo. This is the last case we can handle right now.
Differential Revision: https://reviews.llvm.org/D60518
llvm-svn: 358100
Check AlwaysOverflow condition for usubo. The implementation is the
same as the existing handling for uaddo and umulo. Handling for saddo
and ssubo will follow (smulo doesn't have the necessary ValueTracking
support).
Differential Revision: https://reviews.llvm.org/D60483
llvm-svn: 358052
Change the code to always handle the unsigned+signed cases together
with the same basic structure for add/sub/mul. The simple folds are
always handled first and then the ValueTracking overflow checks are
used.
llvm-svn: 358025
This fixes a class of bugs introduced by D44367,
which transforms various cases of icmp (bitcast ([su]itofp X)), Y to icmp X, Y.
If the bitcast is between vector types with a different number of elements,
the current code will produce bad IR along the lines of: icmp <N x i32> ..., <M x i32> <...>.
This patch suppresses the transform if the bitcast changes the number of vector elements.
Patch by: @AndrewScheidecker (Andrew Scheidecker)
Differential Revision: https://reviews.llvm.org/D57871
llvm-svn: 353467
We should canonicalize to one of these forms,
and compare-with-zero could be more conducive
to follow-on transforms. This also leads to
generally better codegen as shown in PR40611:
https://bugs.llvm.org/show_bug.cgi?id=40611
llvm-svn: 353313
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57170
llvm-svn: 352909
Followup to D55745, this time handling comparisons with ugt and ult
predicates (which are the canonical forms for non-equality predicates).
For ctlz we can convert into a simple icmp, for cttz we can convert
into a mask check.
Differential Revision: https://reviews.llvm.org/D56355
llvm-svn: 351645
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Checking whether a number has a certain number of trailing / leading
zeros means checking whether it is of the form XXXX1000 / 0001XXXX,
which can be done with an and+icmp.
Related to https://bugs.llvm.org/show_bug.cgi?id=28668. As a next
step, this can be extended to non-equality predicates.
Differential Revision: https://reviews.llvm.org/D55745
llvm-svn: 349530
This fixes https://bugs.llvm.org/show_bug.cgi?id=39908.
The evaluateGEPOffsetExpression() function simplifies GEP offsets for
use in comparisons against zero, basically by converting X*Scale+Offset==0
to X+Offset/Scale==0 if Scale divides Offset. However, before this is done,
Offset is masked down to the pointer size. This results in incorrect
results for negative Offsets, because we basically end up dividing the
32-bit offset *zero* extended to 64-bit bits (rather than sign extended).
Fix this by explicitly sign extending the truncated value.
Differential Revision: https://reviews.llvm.org/D55449
llvm-svn: 348987
I was finally able to quantify what i thought was missing in the fix,
it was vector constants. If we have a scalar (and %x, -1),
it will be instsimplified before we reach this code,
but if it is a vector, we may still have a -1 element.
Thus, we want to avoid the fold if *at least one* element is -1.
Or in other words, ignoring the undef elements, no sign bits
should be set. Thus, m_NonNegative().
A follow-up for rL348181
https://bugs.llvm.org/show_bug.cgi?id=39861
llvm-svn: 348462
The tests here are based on the motivating cases from D54827.
More background:
1. We don't get these cases in general with SimplifyCFG because the root
of the pattern match is an icmp, not a branch. I'm not sure how often
we encounter this pattern vs. the seemingly more likely case with
branches, but I don't see evidence to leave the minimal pattern
unoptimized.
2. This has a chance of increasing compile-time because we're using a
ValueTracking call to handle the match. The motivating cases could be
handled with a simpler pair of calls to isImpliedTrueByMatchingCmp/
isImpliedFalseByMatchingCmp, but I saw that we have a more
comprehensive wrapper around those, so we might as well use it here
unless there's evidence that it's significantly slower.
3. Ideally, we'd handle the fold to constants in InstSimplify, but as
with the existing code here, we could extend this to handle cases
where the result is not a constant, but a new combined predicate.
That would mean splitting the logic across the 2 passes and possibly
duplicating the pattern-matching cost.
4. As mentioned in D54827, this seems like the kind of thing that should
be handled in Correlated Value Propagation, but that pass is currently
limited to dealing with instructions with constant operands, so extending
this bit of InstCombine is the smallest/easiest way to get these patterns
optimized.
llvm-svn: 348367
Move it out from under the constant check, reorder
predicates, add comments. This makes it easier to
extend to handle the non-constant case.
llvm-svn: 348284
There's a potential small enhancement to this code that could
solve the cases currently under proposal in D54827 via SimplifyCFG.
Whether instcombine should be doing this kind of semi-non-local
analysis in the first place is an open question, but separating
the logic out can only help if/when we decide to move it to a
different pass.
AFAICT, any proposal to do this in SimplifyCFG could also be seen
as an overreach + it would be incomplete to start the fold from a
branch rather than an icmp.
There's another question here about the code for processUGT_ADDCST_ADD().
That part may be completely dead after rL234638 ?
llvm-svn: 348273
By morphing the instruction rather than deleting and creating a new one,
we retain fast-math-flags and potentially other metadata (profile info?).
llvm-svn: 346331
The sibling fold for 'oge' --> 'ord' was already here,
but this half was missing.
The result of fabs() must be positive or nan, so asking
if the result is negative or nan is the same as asking
if the result is nan.
This is another step towards fixing:
https://bugs.llvm.org/show_bug.cgi?id=39475
llvm-svn: 346321
As shown, this is used to eliminate redundant code in InstCombine,
and there are more cases where we should be using this pattern, but
we're currently unintentionally dropping flags.
llvm-svn: 346282
This is another part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
This might be enough to fix that particular issue, but as noted
with the FIXME, we're still dropping FMF on other folds around here.
llvm-svn: 346234
As stated in IEEE-754 and discussed in:
https://bugs.llvm.org/show_bug.cgi?id=38086
...the sign of zero does not affect any FP compare predicate.
Known regressions were fixed with:
rL346097 (D54001)
rL346143
The transform will help reduce pattern-matching complexity to solve:
https://bugs.llvm.org/show_bug.cgi?id=39475
...as well as improve CSE and codegen (a zero constant is almost always
easier to produce than 0x80..00).
llvm-svn: 346147
The 'OLT' case was updated at rL266175, so I assume it was just an
oversight that 'UGE' was not included because that patch handled
both predicates in InstSimplify.
llvm-svn: 345727
Summary:
This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.)
This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type.
The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger.
Reviewers: lebedev.ri, majnemer, spatel
Reviewed By: lebedev.ri
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52494
llvm-svn: 343486
When C is not zero and infinites are not allowed (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.
E.g.:
foo(double X) {
if ((-2.0 / X) <= 0) ...
}
=>
foo(double X) {
if (X >= 0) ...
}
Patch by: @marels (Martin Elshuber)
Differential Revision: https://reviews.llvm.org/D51942
llvm-svn: 343228
Summary:
Same as to D52146.
`((1 << y)+(-1))` is simply non-canoniacal version of `~(-1 << y)`: https://rise4fun.com/Alive/0vl
We can not canonicalize it due to the extra uses. But we can handle it here.
Reviewers: spatel, craig.topper, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52147
llvm-svn: 342547
Summary:
Two folds are happening here:
1. https://rise4fun.com/Alive/oaFX
2. And then `foldICmpWithHighBitMask()` (D52001): https://rise4fun.com/Alive/wsP4
This change doesn't just add the handling for eq/ne predicates,
it actually builds upon the previous `foldICmpWithLowBitMaskedVal()` work,
so **all** the 16 fold variants* are immediately supported.
I'm indeed only testing these two predicates.
I do not feel like re-proving all 16 folds*, because they were already proven
for the general case of constant with all-ones in low bits. So as long as
the mask produces all-ones in low bits, i'm pretty sure the fold is valid.
But required, i can re-prove, let me know.
* eq/ne are commutative - 4 folds; ult/ule/ugt/uge - are not commutative (the commuted variant is InstSimplified), 4 folds; slt/sle/sgt/sge are not commutative - 4 folds. 12 folds in total.
https://bugs.llvm.org/show_bug.cgi?id=38123https://bugs.llvm.org/show_bug.cgi?id=38708
Reviewers: spatel, craig.topper, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52146
llvm-svn: 342546
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only n bits wide (where n is a variable.)
There are many ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)
More complicated, canonical pattern:
https://rise4fun.com/Alive/uhA
We do need to have two `switch()`'es like this,
to not mismatch the swappable predicates.
https://bugs.llvm.org/show_bug.cgi?id=38708
Reviewers: spatel, craig.topper, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52001
llvm-svn: 342173
Summary:
It is sometimes important to check that some newly-computed value
is non-negative and only `n` bits wide (where `n` is a variable.)
There are **many** ways to check that:
https://godbolt.org/z/o4RB8D
The last variant seems best?
(I'm sure there are some other variations i haven't thought of..)
Let's handle the second variant first, since it is much simpler.
https://rise4fun.com/Alive/LYjYhttps://bugs.llvm.org/show_bug.cgi?id=38708
Reviewers: spatel, craig.topper, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51985
llvm-svn: 342067