Commit Graph

592 Commits

Author SHA1 Message Date
Roman Lebedev 9c5a4a4527 [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251)
Summary:
This is split off from D67356, since these cases produce a constant,
no real need to keep them in instcombine.

Alive proofs:
https://rise4fun.com/Alive/u7Fk
https://rise4fun.com/Alive/4lV

https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67498

llvm-svn: 371921
2019-09-14 13:47:27 +00:00
Roman Lebedev 4cb267f9f5 [NFC][InstSimplify] Add some more tests for D67498/D67502
llvm-svn: 371877
2019-09-13 17:58:24 +00:00
Roman Lebedev 80a8a85758 [InstCombine][InstSimplify] Move constant-folding tests in result-of-usub-is-non-zero-and-no-overflow.ll
llvm-svn: 371737
2019-09-12 14:12:31 +00:00
Roman Lebedev b3e0937f0a [NFC][InstCombine][InstSimplify] Add test for "add-of-negative is non-zero and no overflow" (PR43259)
https://rise4fun.com/Alive/ska
https://rise4fun.com/Alive/9iX

https://bugs.llvm.org/show_bug.cgi?id=43259

llvm-svn: 371736
2019-09-12 14:12:20 +00:00
Roman Lebedev f1286621eb [InstSimplify] simplifyUnsignedRangeCheck(): handle more cases (PR43251)
Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.

This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.

The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..

Proofs: https://rise4fun.com/Alive/21b

Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67411

llvm-svn: 371718
2019-09-12 09:26:17 +00:00
Roman Lebedev 00c1ee48e4 [InstSimplify] Pass SimplifyQuery into simplifyUnsignedRangeCheck() and use it for isKnownNonZero()
This was actually the original intention in D67332,
but i messed up and forgot about it.
This patch was originally part of D67411, but precommitting this.

llvm-svn: 371630
2019-09-11 15:32:46 +00:00
Roman Lebedev 8aeb7bb013 [NFC][InstSimplify] Add extra test for D67411 with @llvm.assume
llvm-svn: 371629
2019-09-11 15:28:03 +00:00
Sanjay Patel 9c4047f267 [ConstProp] move test file from InstSimplify; NFC
These are constant folding tests; there is no code
directly in InstSimplify for this.

llvm-svn: 371619
2019-09-11 14:01:11 +00:00
Sanjay Patel 29ba5e0817 [InstSimplify] regenerate test CHECKs; NFC
llvm-svn: 371617
2019-09-11 13:56:07 +00:00
Roman Lebedev 870ffe3cee [NFC][InstSimplify] rewrite test added in r371537 to use non-null pointer instead
I only want to ensure that %offset is non-zero there,
it doesn't matter how that info is conveyed.
As filed in PR43267, the assumption way does not work.

llvm-svn: 371546
2019-09-10 18:40:00 +00:00
Roman Lebedev 880657c97c [NFC][InstCombine][InstSimplify] PR43251 - and some patterns with offset != 0
https://rise4fun.com/Alive/21b

llvm-svn: 371537
2019-09-10 17:13:59 +00:00
Roman Lebedev 6e2c5c8710 [InstSimplify] simplifyUnsignedRangeCheck(): if we know that X != 0, handle more cases (PR43246)
Summary:
This is motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.

In this particular case, given
```
char* test(char& base, unsigned long offset) {
  return &base + offset;
}
```
it will end up producing something like
https://godbolt.org/z/LK5-iH
which after optimizations reduces down to roughly
```
define i1 @t0(i8* nonnull %base, i64 %offset) {
  %base_int = ptrtoint i8* %base to i64
  %adjusted = add i64 %base_int, %offset
  %non_null_after_adjustment = icmp ne i64 %adjusted, 0
  %no_overflow_during_adjustment = icmp uge i64 %adjusted, %base_int
  %res = and i1 %non_null_after_adjustment, %no_overflow_during_adjustment
  ret i1 %res
}
```
Without D67122 there was no `%non_null_after_adjustment`,
and in this particular case we can get rid of the overhead:

Here we add some offset to a non-null pointer,
and check that the result does not overflow and is not a null pointer.
But since the base pointer is already non-null, and we check for overflow,
that overflow check will already catch the null pointer,
so the separate null check is redundant and can be dropped.

Alive proofs:
https://rise4fun.com/Alive/WRzq

There are more patterns of "unsigned-add-with-overflow", they are not handled here,
but this is the main pattern, that we currently consider canonical,
so it makes sense to handle it.

https://bugs.llvm.org/show_bug.cgi?id=43246

Reviewers: spatel, nikic, vsk

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits, reames

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67332

llvm-svn: 371349
2019-09-08 20:14:15 +00:00
Roman Lebedev 64965430db [NFC][InstSimplify] Some tests for dropping null check after uadd.with.overflow of non-null (PR43246)
https://rise4fun.com/Alive/WRzq

Name: C <= Y && Y != 0  -->  C <= Y  iff C != 0
Pre: C != 0
  %y_is_nonnull = icmp ne i64 %y, 0
  %no_overflow = icmp ule i64 C, %y
  %r = and i1 %y_is_nonnull, %no_overflow
=>
  %r = %no_overflow

Name: C <= Y || Y != 0  -->  Y != 0  iff C != 0
Pre: C != 0
  %y_is_nonnull = icmp ne i64 %y, 0
  %no_overflow = icmp ule i64 C, %y
  %r = or i1 %y_is_nonnull, %no_overflow
=>
  %r = %y_is_nonnull

Name: C > Y || Y == 0  -->  C > Y  iff C != 0
Pre: C != 0
  %y_is_null = icmp eq i64 %y, 0
  %overflow = icmp ugt i64 C, %y
  %r = or i1 %y_is_null, %overflow
=>
  %r = %overflow

Name: C > Y && Y == 0  -->  Y == 0  iff C != 0
Pre: C != 0
  %y_is_null = icmp eq i64 %y, 0
  %overflow = icmp ugt i64 C, %y
  %r = and i1 %y_is_null, %overflow
=>
  %r = %y_is_null

https://bugs.llvm.org/show_bug.cgi?id=43246

llvm-svn: 371339
2019-09-08 17:50:40 +00:00
Sanjay Patel 4a2cd7be5a [InstSimplify] guard against unreachable code (PR43218)
This would crash:
https://bugs.llvm.org/show_bug.cgi?id=43218

llvm-svn: 370911
2019-09-04 15:12:55 +00:00
Roman Lebedev c584786854 [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` inverted overflow bit
Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow or zero
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %retval.0
=>
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %umul.ov.not

Done: 1
Optimization is correct!
```
Note that this is inverted from what we have in a previous patch,
here we are looking for the inverted overflow bit.
And that inversion is kinda problematic - given this particular
pattern we neither hoist that `not` closer to `ret` (then the pattern
would have been identical to the one without inversion,
and would have been handled by the previous patch), neither
do the opposite transform. But regardless, we should handle this too.
I've filled [[ https://bugs.llvm.org/show_bug.cgi?id=42720 | PR42720 ]].

Reviewers: nikic, spatel, xbolva00, RKSimon

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65151

llvm-svn: 370351
2019-08-29 12:48:04 +00:00
Roman Lebedev aaf6ab4410 [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` overflow bit
Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow and not zero
  %iszero = icmp ne i4 %y, 0
  %umul = umul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %retval.0 = and i1 %iszero, %umul.ov
  ret i1 %retval.0
=>
  %iszero = icmp ne i4 %y, 0
  %umul = umul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %retval.0 = and i1 %iszero, %umul.ov
  ret %umul.ov

Done: 1
Optimization is correct!
```

Reviewers: nikic, spatel, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65150

llvm-svn: 370350
2019-08-29 12:47:50 +00:00
Bjorn Pettersson d218a3326e [InstSimplify] Report "Changed" also when only deleting dead instructions
Summary:
Make sure that we report that changes has been made
by InstSimplify also in situations when only trivially
dead instructions has been removed. If for example a call
is removed the call graph must be updated.

Bug seem to have been introduced by llvm-svn r367173
(commit 02b9e45a7e), since the code in question
was rewritten in that commit.

Reviewers: spatel, chandlerc, foad

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65973

llvm-svn: 368401
2019-08-09 07:08:25 +00:00
Craig Topper 66c08430f6 [ValueTracking] When calculating known bits for integer abs, make sure we're looking at a negate and not just any instruction with the nsw flag set.
The matchSelectPattern code can match patterns like (x >= 0) ? x : -x
for absolute value. But it can also match ((x-y) >= 0) ? (x-y) : (y-x).
If the latter form was matched we can only use the nsw flag if its
set on both subtracts.

This match makes sure we're looking at the former case only.

Differential Revision: https://reviews.llvm.org/D65692

llvm-svn: 368195
2019-08-07 18:28:16 +00:00
Craig Topper aa2810b6e7 [InstSimplify] Add test case to show bad sign bit handling for integer abs idiom in computeKnownBits.
computeKnownBits will indicate the sign bit of abs is 0 if the
the RHS operand returned by matchSelectPattern has the nsw flag set.
For abs idioms like (X >= 0) ? X : -X, the RHS returns -X. But
we can also match ((X-Y) >= 0 ? X-Y : Y-X as abs. In this case
RHS will be the Y-X operand. According to Alive, the sign bit for
this is only 0 if both the X-Y and Y-X operands have the nsw flag.
But we're only checking the Y-X operand.

llvm-svn: 367747
2019-08-03 02:54:54 +00:00
Sanjay Patel 02b9e45a7e [InstSimplify] remove quadratic time looping (PR42771)
The test case from:
https://bugs.llvm.org/show_bug.cgi?id=42771
...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is
seemingly done just to avoid invalidating the instruction iterator. We can instead
delay instruction deletion until we reach the end of the block (or we could delay
until we reach the end of all blocks).

There's a test diff here for a degenerate case with llvm.assume that is not
meaningful in itself, but serves to verify this change in logic.

This change probably doesn't result in much overall compile-time improvement
because we call '-instsimplify' as a standalone pass only once in the standard
-O2 opt pipeline currently.

Differential Revision: https://reviews.llvm.org/D65336

llvm-svn: 367173
2019-07-27 14:05:51 +00:00
Roman Lebedev 4153f17181 [InstSimplify][NFC] Tests for skipping 'div-by-0' checks before inverted @llvm.umul.with.overflow
It would be already handled by the non-inverted case if we were hoisting
the `not` in InstCombine, but we don't (granted, we don't sink it
in this case either), so this is a separate case.

llvm-svn: 366801
2019-07-23 12:42:49 +00:00
Roman Lebedev 0689427280 [InstSimplify][NFC] Tests for skipping 'div-by-0' checks before @llvm.umul.with.overflow
These may remain after @llvm.umul.with.overflow was canonicalized
from the code that was originally doing the check via division.

llvm-svn: 366751
2019-07-22 22:09:11 +00:00
Michael Liao 543ba4e9e0 [InstructionSimplify] Apply sext/trunc after pointer stripping
Summary:
- As the pointer stripping could trace through `addrspacecast` now, need
  to sext/trunc the offset to ensure it has the same width as the
  pointer after stripping.

Reviewers: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64768

llvm-svn: 366162
2019-07-16 01:03:06 +00:00
David Bolvansky 5dca95bc4e [NFC] Revisited tests for D64285
llvm-svn: 365815
2019-07-11 19:39:20 +00:00
David Bolvansky e195a91d2d [NFC] Updated tests for D64285
llvm-svn: 365765
2019-07-11 12:51:33 +00:00
David Bolvansky 901d91e5f0 [NFC] Fixed tests
llvm-svn: 365506
2019-07-09 15:31:36 +00:00
David Bolvansky e625eb9def [NFC] Added tests for D64285
llvm-svn: 365501
2019-07-09 15:12:01 +00:00
Sanjay Patel b342f026a4 [InstSimplify] simplify power-of-2 (single bit set) sequences
As discussed in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

Improving the canonicalization for these patterns:
rL363956
...means we should adjust/enhance the related simplification.

https://rise4fun.com/Alive/w1cp

  Name: isPow2 or zero
  %x = and i32 %xx, 2048
  %a = add i32 %x, -1
  %r = and i32 %a, %x
  =>
  %r = i32 0

llvm-svn: 363997
2019-06-20 22:55:28 +00:00
Sanjay Patel 3207566dd6 [InstSimplify] add tests for known-not-a-power-of-2; NFC
I added a canonicalization to create this general pattern in:
rL363956

But as noted in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314#c11

...we have a (potentially expensive) simplification for the version
of the code that we just canonicalized away from, so we should
add/adjust that code to match.

llvm-svn: 363981
2019-06-20 21:04:14 +00:00
Sanjay Patel 3e03bf6921 [InstSimplify] add a phi test with 1 incoming value; NFC
D63489 proposes to change this behavior, but there's no
direct -instsimplify test to verify that the transform exists.

llvm-svn: 363842
2019-06-19 17:23:29 +00:00
Jay Foad 45d19fb470 [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Summary:
The test case does an (out of bounds) load from a global constant with
type <3 x float>. InstSimplify tried to turn this into an integer load
of the whole alloc size of the vector, which is 128 bits due to
alignment padding, and then bitcast this to <3 x vector> which failed
an assertion due to the type size mismatch.

The fix is to do an integer load of the normal size of the vector, with
no alignment padding.

Reviewers: tpr, arsenm, majnemer, dstuttard

Reviewed By: arsenm

Subscribers: hfinkel, wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63375

llvm-svn: 363784
2019-06-19 10:28:48 +00:00
Roman Lebedev 5a663bd77a [InstSimplify] Fix addo/subo undef folds (PR42209)
Fix folds of addo and subo with an undef operand to be:

`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
 as per LLVM undef rules.
Same for commuted variants.

Based on the original version of the patch by @nikic.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]

Differential Revision: https://reviews.llvm.org/D63065

llvm-svn: 363522
2019-06-16 20:39:45 +00:00
Sanjay Patel 73f5a855b3 [InstSimplify] enhance fcmp fold with never-nan operand
This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

This is a continuation of D62979 / rL362879.

llvm-svn: 362903
2019-06-09 13:48:59 +00:00
Sanjay Patel de4d4d5049 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
Opposite predicate for rL362742 / rL362879 / D62979

llvm-svn: 362902
2019-06-09 13:30:14 +00:00
Sanjay Patel 4329c15f11 [InstSimplify] enhance fcmp fold with never-nan operand
This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case below here as a follow-up assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

llvm-svn: 362879
2019-06-08 15:12:33 +00:00
Sanjay Patel 38c5ee1802 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
llvm-svn: 362742
2019-06-06 20:14:06 +00:00
Sanjay Patel 8869a98e82 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

llvm-svn: 361576
2019-05-24 00:13:58 +00:00
Sanjay Patel 3e15f83381 [InstSimplify] add tests for insert-of-extract; NFC
llvm-svn: 361575
2019-05-24 00:11:23 +00:00
Sanjay Patel e60cb7d1be [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

llvm-svn: 361559
2019-05-23 21:49:47 +00:00
Cameron McInally 2d2a46db8e [InstSimplify] Teach fsub -0.0, (fneg X) ==> X about unary fneg
Differential Revision: https://reviews.llvm.org/D62077

llvm-svn: 361151
2019-05-20 13:13:35 +00:00
Sanjay Patel 9ef99b4b11 [InstSimplify] fold fcmp (maxnum, X, C1), C2
This is the sibling transform for rL360899 (D61691):

  maxnum(X, GreaterC) == C --> false
  maxnum(X, GreaterC) <= C --> false
  maxnum(X, GreaterC) <  C --> false
  maxnum(X, GreaterC) >= C --> true
  maxnum(X, GreaterC) >  C --> true
  maxnum(X, GreaterC) != C --> true

llvm-svn: 361118
2019-05-19 14:26:39 +00:00
Cameron McInally 12de5425c1 [NFC][InstSimplify] Add more unary fneg tests to floating-point-arithmetic.ll
llvm-svn: 361076
2019-05-17 21:10:11 +00:00
Cameron McInally bebc7d6a4e [NFC][InstSimplify] Precommit new unary fneg test
llvm-svn: 361060
2019-05-17 18:34:35 +00:00
Cameron McInally 19dc8c7280 [NFC][InstSImplify] Fix flip-flopped comments and test names
In test/Transforms/InstSimplify/floating-point-arithmetic.ll

Differential Revision: https://reviews.llvm.org/D62069

llvm-svn: 361057
2019-05-17 17:59:17 +00:00
Cameron McInally 067e946859 [InstSimplify] Add unary fneg to `fsub 0.0, (fneg X) ==> X` transform
Differential Revision: https://reviews.llvm.org/D62013

llvm-svn: 361047
2019-05-17 16:47:00 +00:00
Cameron McInally f637bb6ebd [NFC][InstSimplify] Update fast-math.ll tests I botched in r360808.
These were new tests I added in r360808. I made a mistake while converting the exisiting binary FNeg test into the new unary FNeg tests. Correct that.

llvm-svn: 360928
2019-05-16 19:00:57 +00:00
Sanjay Patel 3413035477 [InstSimplify] add tests for fcmp of maxnum with constants; NFC
Sibling tests for rL360899 (D61691).

llvm-svn: 360905
2019-05-16 15:00:11 +00:00
Sanjay Patel 152f81fae8 [InstSimplify] fold fcmp (minnum, X, C1), C2
minnum(X, LesserC) == C --> false
   minnum(X, LesserC) >= C --> false
   minnum(X, LesserC) >  C --> false
   minnum(X, LesserC) != C --> true
   minnum(X, LesserC) <= C --> true
   minnum(X, LesserC) <  C --> true

maxnum siblings will follow if there are no problems here.

We should be able to perform some other combines when the constants
are equal or greater-than too, but that would go in instcombine.

We might also generalize this by creating an FP ConstantRange
(similar to what we do for integers).

Differential Revision: https://reviews.llvm.org/D61691

llvm-svn: 360899
2019-05-16 14:03:10 +00:00
Cameron McInally 14a90661f8 Revert llvm-svn: 360807
Somehow submitted this patch twice.

llvm-svn: 360812
2019-05-15 20:48:50 +00:00
Cameron McInally b8df789ff3 Pre-commit unary fneg tests to InstSimplify
llvm-svn: 360808
2019-05-15 20:27:37 +00:00