Summary:
If any vector divisor element is undef, we can arbitrarily choose it be
zero which would make the div/rem an undef value by definition.
Reviewers: spatel, reames
Reviewed By: spatel
Subscribers: magabari, llvm-commits
Differential Revision: https://reviews.llvm.org/D42485
llvm-svn: 323343
This is the 'rem' counterpart to D42032 and would be folded by
D42341.
Patch by Anton Bikineev.
Differential Revision: https://reviews.llvm.org/D42342
llvm-svn: 323067
This doesn't handle the more complicated case in the bug report yet:
https://bugs.llvm.org/show_bug.cgi?id=35790
For that, we have to match / look through a cast.
llvm-svn: 322327
In one case, we were handling out of bounds, but not undef indices. In the other, we were handling undef (with the comment making the analogy to out of bounds), but not out of bounds. Be consistent and treat both undef and constant out of bounds indices as producing undefined results.
As a side effect, this also protects instcombine from having to handle large constant indices as we always simplify first.
llvm-svn: 321575
Summary:
An undef extract index can be arbitrarily chosen to be an
out-of-range index value, which would result in the instruction being undef.
This change closes a gap identified while working on lowering vector permute intrinsics
with variable index vectors to pure LLVM IR.
Reviewers: arsenm, spatel, majnemer
Reviewed By: arsenm, spatel
Subscribers: fhahn, nhaehnle, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D40231
llvm-svn: 319910
Follow-up of r316824. This patch supports the vector type for both current and
previous index when factoring out the current one into the previous one.
Differential Revision: https://reviews.llvm.org/D39556
llvm-svn: 319683
The 'ord' and 'uno' predicates have a logic operation for NAN built into their definitions:
FCMP_ORD = 7, ///< 0 1 1 1 True if ordered (no nans)
FCMP_UNO = 8, ///< 1 0 0 0 True if unordered: isnan(X) | isnan(Y)
So we can simplify patterns like this:
(fcmp ord (known NNAN), X) && (fcmp ord X, Y) --> fcmp ord X, Y
(fcmp uno (known NNAN), X) || (fcmp uno X, Y) --> fcmp uno X, Y
It might be better to split this into (X uno 0) | (Y uno 0) as a canonicalization, but that
would be another patch.
Differential Revision: https://reviews.llvm.org/D40130
llvm-svn: 318627
Call ConstantFoldSelectInstruction() to fold cases like below
select <2 x i1><i1 true, i1 false>, <2 x i8> <i8 0, i8 1>, <2 x i8> <i8 2, i8 3>
All operands are constants and the condition has mixed true and false conditions.
Differential Revision: https://reviews.llvm.org/D38369
llvm-svn: 314741
This should bring signed div/rem analysis up to the same level as unsigned.
We use icmp simplification to determine when the divisor is known greater than the dividend.
Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits.
There are extra tests for the signed-min-value special cases.
Alive proofs:
http://rise4fun.com/Alive/WI5
Differential Revision: https://reviews.llvm.org/D37713
llvm-svn: 313264
As noted in PR34517, the handling of signed div/rem is not on par with
unsigned div/rem. Signed is harder to reason about, but it should be
possible to handle at least some of these using the same technique that
we use for unsigned: use icmp logic to see if there's a relationship
between the quotient and divisor.
llvm-svn: 312938
This removes some duplicated code and makes it easier to support signed div/rem
in a similar way if we want to do that. Note that the existing comments were not
accurate - we don't need a constant divisor to simplify; icmp simplification does
more than that. But as the added tests show, it could go even further.
llvm-svn: 312885
This code is double-dead:
1. We simplify all selects with constant true/false condition in InstSimplify.
I've minimized/moved the tests to show that works as expected.
2. All remaining vector selects with a constant condition are canonicalized to
shufflevector, so we really can't see this pattern.
llvm-svn: 312123
This adds support non-canonical compare predicates. InstSimplify can't rely on canonicalization to have occurred.
Differential Revision: https://reviews.llvm.org/D36646
llvm-svn: 310893
This recommits r310869, with the moved files and no extra changes.
Original commit message:
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.
I also had to make decomposeBitTest support vectors since InstSimplify needs that.
As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
Differential Revision: https://reviews.llvm.org/D36593
llvm-svn: 310889
Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything.
llvm-svn: 310873
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.
I also had to make decomposeBitTest support vectors since InstSimplify needs that.
As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
Differential Revision: https://reviews.llvm.org/D36593
llvm-svn: 310869
The code in ConstantFoldGetElementPtr() assumes integers, and
therefore it crashes trying to get the integer bidwith of a vector
type (in this case <4 x i32>. I just changed the code to prevent
the folding in case of vectors and I didn't bother to generalize
as this doesn't seem to me something that really happens in
practice, but I'm willing to change the patch if you think
it's worth it.
This is hard to trigger from -instsimplify or -instcombine
only as the second instruction is dead, so the test uses loop-unroll.
Differential Revision: https://reviews.llvm.org/D35956
llvm-svn: 309330
Summary:
The constant folding code currently assumes that the constant expression will always be on the left and the simple null will be on the right. But that's not true at least on the path from InstSimplify.
This patch adds support to ConstantFolding to detect the reversed case.
Reviewers: spatel, dberlin, majnemer, davide, joey
Reviewed By: joey
Subscribers: joey, llvm-commits
Differential Revision: https://reviews.llvm.org/D33801
llvm-svn: 304559
The tests here are have operands commuted to provide more coverage. I also commuted one of the instructions in the scalar tests so the 4 tests cover the 4 commuted variations
Differential Revision: https://reviews.llvm.org/D33599
llvm-svn: 304021
Summary: This code was migrated from InstCombine a few years ago. InstCombine had nearby code that would move Constants to the RHS for these, but InstSimplify doesn't have such code on this path.
Reviewers: spatel, majnemer, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33473
llvm-svn: 303774
Currently m_Not only works the canonical xor X, -1 form that InstCombine produces. InstSimplify can't rely on this canonicalization.
Differential Revision: https://reviews.llvm.org/D33331
llvm-svn: 303379
We already handled all of the new tests identically, but several
of those went through a lot of unnecessary processing before
getting folded.
Another motivation for grouping these cases together is that
InstCombine needs a similar fold. Currently, it handles the
'not' cases inefficiently which can lead to bugs as described
in the post-commit comments of:
https://reviews.llvm.org/D32143
llvm-svn: 303295
We would eventually catch these via demanded bits and computing known bits in InstCombine,
but I think it's better to handle the simple cases as soon as possible as a matter of efficiency.
This fold allows further simplifications based on distributed ops transforms. eg:
%a = lshr i8 %x, 7
%b = or i8 %a, 2
%c = and i8 %b, 1
InstSimplify can directly fold this now:
%a = lshr i8 %x, 7
Differential Revision: https://reviews.llvm.org/D33221
llvm-svn: 303213
Summary:
Re-applying r301766 with a fix to a typo and a regression test.
The log message for r301766 was:
==================================================================================
InstructionSimplify: Canonicalize shuffle operands. NFC-ish.
Summary:
Apply canonicalization rules:
1. Input vectors with no elements selected from can be replaced with undef.
2. If only one input vector is constant it shall be the second one.
This allows constant-folding to cover more ad-hoc simplifications that
were in place and avoid duplication for RHS and LHS checks.
There are more rules we may want to add in the future when we see a
justification. e.g. mask elements that select undef elements can be
replaced with undef.
==================================================================================
Reviewers: spatel, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32863
llvm-svn: 302373
We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases.
I had to check some of these with Alive to prove to myself it's right, but everything seems
to check out. Eg, the deleted code in instcombine was completely ignoring predicates with
mismatched signedness.
This is a follow-up to:
https://reviews.llvm.org/rL301260https://reviews.llvm.org/D32143
llvm-svn: 302370
The sibling folds for 'and' with casts were added with https://reviews.llvm.org/rL273200.
This is a preliminary step for adding the 'or' variants for the folds added with https://reviews.llvm.org/rL301260.
The reason for the strange form with constant LHS in the 1st test is because there's another missing fold in that
case for the inverted predicate. That should be fixed when we add the ConstantRange functionality for 'or-of-icmps'
that already exists for 'and-of-icmps'.
I'm hoping to share more code for the and/or cases, so we won't have these differences. This will allow us to remove
code from InstCombine. It's also possible that we can remove some code here in InstSimplify. I think we have some
duplicated folds because patterns are not matched in a general way.
Differential Revision: https://reviews.llvm.org/D32876
llvm-svn: 302189
This change caused buildbot failures, apparently because we're not
passing around types that InstSimplify is used to seeing. I'm not overly
familiar with InstSimplify, so I'm reverting this until I can figure out
what exactly is wrong.
llvm-svn: 301885
In particular (since it wouldn't fit nicely in the summary):
(select (icmp eq V 0) P (getelementptr P V)) -> (getelementptr P V)
Differential Revision: https://reviews.llvm.org/D31435
llvm-svn: 301880
The code Sanjay Patel moved over from InstCombine doesn't work properly if the 'and' has both inputs as nots because we used a commuted op matcher on the 'and' first. But this will bind to the first 'not' on 'and' when there could be two 'not's. InstCombine could rely on DeMorgan to ensure the 'and' wouldn't have two 'not's eventually, but InstSimplify can't rely on that.
This patch matches the xor first then checks for the ands and allows a not of either operand of the xor.
Differential Revision: https://reviews.llvm.org/D32458
llvm-svn: 301329
We can simplify (and (icmp X, C1), (icmp X, C2)) to one of the icmps in many cases.
I had to check some of these with Alive to prove to myself it's right, but everything
seems to check out. Eg, the code in instcombine was completely ignoring predicates with
mismatched signedness.
Handling or-of-icmps would be a follow-up step.
Differential Revision: https://reviews.llvm.org/D32143
llvm-svn: 301260
This is a straight cut and paste, but there's a bigger problem: if this
fold exists for simplifyOr, there should be a DeMorganized version for
simplifyAnd. But more than that, we have a patchwork of ad hoc logic
optimizations in InstCombine. There should be some structure to ensure
that we're not missing sibling folds across and/or/xor.
llvm-svn: 301213
This patch simplifies the examples from D31509 and D31927 (PR30630) and catches
the basic identity shuffle tests that Zvi recently added.
I'm not sure if we have something like this in DAGCombiner, but we should?
It's worth noting that "MaxRecurse / RecursionLimit" is only 3 on entry at the moment.
We might want to bump that up if there are longer shuffle chains like this in the wild.
For now, we're ignoring shuffles that have undef mask elements because it's not
clear how those should be handled.
Differential Revision: https://reviews.llvm.org/D31960
llvm-svn: 300714
InstSimplify returned the wrong type when simplifying a vector GEP
and we ended up crashing when trying to replace all uses with the
new value. Fixes PR32697.
Differential Revision: https://reviews.llvm.org/D32180
llvm-svn: 300693
Summary:
Add a hook for simplification of shufflevector's with the following rules:
- Constant folding - NFC, as it was already being done by the default handler.
- If only one of the operands is constant, constant fold the shuffle if the
mask does not select elements from the variable operand - to show the hook is firing and affecting the test-cases.
Reviewers: RKSimon, craig.topper, spatel, sanjoy, nlopes, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31525
llvm-svn: 299393
The change to InstCombine in:
https://reviews.llvm.org/D29729
...exposes this missing fold in InstSimplify, so adding this
first to avoid a regression.
llvm-svn: 295573
A program may contain llvm.assume info that disagrees with other analysis.
This may be caused by UB in the program, so we must not crash because of that.
As noted in the code comments:
https://llvm.org/bugs/show_bug.cgi?id=31809
...we can do better, but this at least avoids the assert/crash in the bug report.
Differential Revision: https://reviews.llvm.org/D29395
llvm-svn: 293773
Summary:
Previously we assumed that the result of sqrt(x) always had 0 as its
sign bit. But sqrt(-0) == -0.
Reviewers: hfinkel, efriedma, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28928
llvm-svn: 293115
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).
Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.
At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.
Differential Revision: https://reviews.llvm.org/D27259
llvm-svn: 289755
As Eli noted in the post-commit thread for r288833, the use of
swapOperands() may not be allowed in InstSimplify, so I'm
removing those calls here pending further review.
The swap mutates the icmp, and there doesn't appear to be precedent
for instruction mutation in InstSimplify.
I didn't actually have any tests for those cases, so I'm adding
a few here.
llvm-svn: 288855
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.
This is the 'and' sibling of the earlier 'or' patch:
https://reviews.llvm.org/rL288833
llvm-svn: 288841
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.
llvm-svn: 288833
Summary:
Extends InstSimplify to handle both `x >=u x >> y` and `x >=u x udiv y`.
This is a folloup of rL258422 and
https://github.com/rust-lang/rust/pull/30917 where llvm failed to
optimize away the bounds checking in a binary search.
Patch by Arthur Silva!
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25941
llvm-svn: 285228
0 - X --> X, if X is 0 or the minimum signed value
0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW
I noticed this pattern might be created in the backend after the change from D25485,
so we'll want to add a similar fold for the DAG.
The use of computeKnownBits in InstSimplify may be something to investigate if the
compile time of InstSimplify is noticeable. We could replace computeKnownBits with
specific pattern matchers or limit the recursion.
Differential Revision: https://reviews.llvm.org/D25785
llvm-svn: 284649
The constant folder didn't know how to always fold bitcasts of constant integer
vectors. In particular, it was unable to handle the case where a constant vector
had some undef elements, and the resulting (i.e. bitcasted) vector type had more
elements than the original vector type.
Example:
%cast = bitcast <2 x i64><i64 undef, i64 2> to <4 x i32>
On a little endian target, %cast could have been folded to:
<4 x i32><i32 undef, i32 undef, i32 2, i32 0>
This patch improves the folding logic by teaching how to correctly propagate
undef elements in the folded vector.
Differential Revision: https://reviews.llvm.org/D24301
llvm-svn: 281343
InstSimplify doesn't always know how to fold a bitcast of a constant vector.
In particular, the logic in InstSimplify doesn't know how to handle the case
where the constant vector in input contains some undef elements, and the
number of elements is smaller than the number of elements of the bitcast
vector type.
llvm-svn: 281332
This patch fixes a crash caused by an incorrect folding of an ordered comparison
between a packed floating point vector and a splat vector of NaN.
An ordered comparison between a vector and a constant vector of NaN, should
always be folded into a constant vector where each element is i1 false.
Since revision 266175, SimplifyFCmpInst folds the ordered fcmp into a scalar
'false'. Later on, this would cause an assertion failure, since the value type
of the folded value doesn't match the expected value type of the uses of the
original instruction: "Assertion failed: New->getType() == getType() &&
"replaceAllUses of value with new value of different type!".
This patch fixes the issue and adds a test case to the already existing test
InstSimplify/floating-point-compares.ll.
Differential Revision: https://reviews.llvm.org/D24143
llvm-svn: 280488
...because like the corresponding code, this is just too big to keep adding to.
And the next step is to add a vector version of each of these tests to show
missed folds.
Also, auto-generate CHECK lines and add comments for the tests that correspond to
the source code.
llvm-svn: 279530
I'm removing a misplaced pair of more specific folds from InstCombine in this patch as well,
so we know where those folds are happening in InstSimplify.
llvm-svn: 277738
ConstantExpr::getWithOperands does much of the hard work that
ConstantFoldInstOperandsImpl tries to do but more completely.
This lets us fold ExtractValue/InsertValue expressions.
llvm-svn: 277100
When folding an expression, we run ConstantFoldConstantExpression on
each operand of that expression.
However, ConstantFoldConstantExpression can fail and retur nullptr.
Previously, we would bail on further refining the expression.
Instead, use the original operand and see if we can refine a later
operand.
llvm-svn: 276959
rL245171 exposed a hole in InstSimplify that manifested in a strange way in PR28466:
https://llvm.org/bugs/show_bug.cgi?id=28466
It's possible to use trunc + icmp sgt/slt in place of an and + icmp eq/ne, so we need to
recognize that pattern to eliminate selects that are choosing between some value and some
bitmasked version of that value.
Note that there is significant room for improvement (refactoring) and enhancement (more
patterns, possibly in InstCombine rather than here).
Differential Revision: https://reviews.llvm.org/D22537
llvm-svn: 276341