Commit Graph

565 Commits

Author SHA1 Message Date
Sanjay Patel b342f026a4 [InstSimplify] simplify power-of-2 (single bit set) sequences
As discussed in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

Improving the canonicalization for these patterns:
rL363956
...means we should adjust/enhance the related simplification.

https://rise4fun.com/Alive/w1cp

  Name: isPow2 or zero
  %x = and i32 %xx, 2048
  %a = add i32 %x, -1
  %r = and i32 %a, %x
  =>
  %r = i32 0

llvm-svn: 363997
2019-06-20 22:55:28 +00:00
Sanjay Patel 3207566dd6 [InstSimplify] add tests for known-not-a-power-of-2; NFC
I added a canonicalization to create this general pattern in:
rL363956

But as noted in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314#c11

...we have a (potentially expensive) simplification for the version
of the code that we just canonicalized away from, so we should
add/adjust that code to match.

llvm-svn: 363981
2019-06-20 21:04:14 +00:00
Sanjay Patel 3e03bf6921 [InstSimplify] add a phi test with 1 incoming value; NFC
D63489 proposes to change this behavior, but there's no
direct -instsimplify test to verify that the transform exists.

llvm-svn: 363842
2019-06-19 17:23:29 +00:00
Jay Foad 45d19fb470 [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Summary:
The test case does an (out of bounds) load from a global constant with
type <3 x float>. InstSimplify tried to turn this into an integer load
of the whole alloc size of the vector, which is 128 bits due to
alignment padding, and then bitcast this to <3 x vector> which failed
an assertion due to the type size mismatch.

The fix is to do an integer load of the normal size of the vector, with
no alignment padding.

Reviewers: tpr, arsenm, majnemer, dstuttard

Reviewed By: arsenm

Subscribers: hfinkel, wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63375

llvm-svn: 363784
2019-06-19 10:28:48 +00:00
Roman Lebedev 5a663bd77a [InstSimplify] Fix addo/subo undef folds (PR42209)
Fix folds of addo and subo with an undef operand to be:

`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
 as per LLVM undef rules.
Same for commuted variants.

Based on the original version of the patch by @nikic.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]

Differential Revision: https://reviews.llvm.org/D63065

llvm-svn: 363522
2019-06-16 20:39:45 +00:00
Sanjay Patel 73f5a855b3 [InstSimplify] enhance fcmp fold with never-nan operand
This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

This is a continuation of D62979 / rL362879.

llvm-svn: 362903
2019-06-09 13:48:59 +00:00
Sanjay Patel de4d4d5049 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
Opposite predicate for rL362742 / rL362879 / D62979

llvm-svn: 362902
2019-06-09 13:30:14 +00:00
Sanjay Patel 4329c15f11 [InstSimplify] enhance fcmp fold with never-nan operand
This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case below here as a follow-up assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

llvm-svn: 362879
2019-06-08 15:12:33 +00:00
Sanjay Patel 38c5ee1802 [InstSimplify] add tests for fcmp with known-never-nan operands; NFC
llvm-svn: 362742
2019-06-06 20:14:06 +00:00
Sanjay Patel 8869a98e82 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

llvm-svn: 361576
2019-05-24 00:13:58 +00:00
Sanjay Patel 3e15f83381 [InstSimplify] add tests for insert-of-extract; NFC
llvm-svn: 361575
2019-05-24 00:11:23 +00:00
Sanjay Patel e60cb7d1be [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

llvm-svn: 361559
2019-05-23 21:49:47 +00:00
Cameron McInally 2d2a46db8e [InstSimplify] Teach fsub -0.0, (fneg X) ==> X about unary fneg
Differential Revision: https://reviews.llvm.org/D62077

llvm-svn: 361151
2019-05-20 13:13:35 +00:00
Sanjay Patel 9ef99b4b11 [InstSimplify] fold fcmp (maxnum, X, C1), C2
This is the sibling transform for rL360899 (D61691):

  maxnum(X, GreaterC) == C --> false
  maxnum(X, GreaterC) <= C --> false
  maxnum(X, GreaterC) <  C --> false
  maxnum(X, GreaterC) >= C --> true
  maxnum(X, GreaterC) >  C --> true
  maxnum(X, GreaterC) != C --> true

llvm-svn: 361118
2019-05-19 14:26:39 +00:00
Cameron McInally 12de5425c1 [NFC][InstSimplify] Add more unary fneg tests to floating-point-arithmetic.ll
llvm-svn: 361076
2019-05-17 21:10:11 +00:00
Cameron McInally bebc7d6a4e [NFC][InstSimplify] Precommit new unary fneg test
llvm-svn: 361060
2019-05-17 18:34:35 +00:00
Cameron McInally 19dc8c7280 [NFC][InstSImplify] Fix flip-flopped comments and test names
In test/Transforms/InstSimplify/floating-point-arithmetic.ll

Differential Revision: https://reviews.llvm.org/D62069

llvm-svn: 361057
2019-05-17 17:59:17 +00:00
Cameron McInally 067e946859 [InstSimplify] Add unary fneg to `fsub 0.0, (fneg X) ==> X` transform
Differential Revision: https://reviews.llvm.org/D62013

llvm-svn: 361047
2019-05-17 16:47:00 +00:00
Cameron McInally f637bb6ebd [NFC][InstSimplify] Update fast-math.ll tests I botched in r360808.
These were new tests I added in r360808. I made a mistake while converting the exisiting binary FNeg test into the new unary FNeg tests. Correct that.

llvm-svn: 360928
2019-05-16 19:00:57 +00:00
Sanjay Patel 3413035477 [InstSimplify] add tests for fcmp of maxnum with constants; NFC
Sibling tests for rL360899 (D61691).

llvm-svn: 360905
2019-05-16 15:00:11 +00:00
Sanjay Patel 152f81fae8 [InstSimplify] fold fcmp (minnum, X, C1), C2
minnum(X, LesserC) == C --> false
   minnum(X, LesserC) >= C --> false
   minnum(X, LesserC) >  C --> false
   minnum(X, LesserC) != C --> true
   minnum(X, LesserC) <= C --> true
   minnum(X, LesserC) <  C --> true

maxnum siblings will follow if there are no problems here.

We should be able to perform some other combines when the constants
are equal or greater-than too, but that would go in instcombine.

We might also generalize this by creating an FP ConstantRange
(similar to what we do for integers).

Differential Revision: https://reviews.llvm.org/D61691

llvm-svn: 360899
2019-05-16 14:03:10 +00:00
Cameron McInally 14a90661f8 Revert llvm-svn: 360807
Somehow submitted this patch twice.

llvm-svn: 360812
2019-05-15 20:48:50 +00:00
Cameron McInally b8df789ff3 Pre-commit unary fneg tests to InstSimplify
llvm-svn: 360808
2019-05-15 20:27:37 +00:00
Cameron McInally 94f16bfaba Add unary fneg to InstSimplify/fp-nan.ll
llvm-svn: 360807
2019-05-15 20:27:35 +00:00
Cameron McInally a4d29b8e20 Add unary fneg to InstSimplify/fp-nan.ll
llvm-svn: 360797
2019-05-15 19:37:03 +00:00
Cameron McInally 0c82d9b5a2 Teach InstSimplify -X + X --> 0.0 about unary FNeg
Differential Revision: https://reviews.llvm.org/D61916

llvm-svn: 360777
2019-05-15 14:31:33 +00:00
Sanjay Patel b64c48597f [InstSimplify] add tests for fcmp+minnum; NFC
llvm-svn: 360275
2019-05-08 17:53:18 +00:00
Nikita Popov 9fd02a71a3 Revert "[ValueTracking] Improve isKnowNonZero for Ints"
This reverts commit 3b137a4956.

As reported in https://reviews.llvm.org/D60846, this is causing
miscompiles.

llvm-svn: 360260
2019-05-08 14:50:01 +00:00
Dan Robertson 3b137a4956 [ValueTracking] Improve isKnowNonZero for Ints
Improve isKnownNonZero for integers in order to improve cttz
optimizations.

Differential Revision: https://reviews.llvm.org/D60846

llvm-svn: 360222
2019-05-08 02:25:08 +00:00
Sanjay Patel e088d03b9c [ValueTracking] add logic for known-never-nan with minnum/maxnum
From the LangRef: "Returns NaN only if both operands are NaN."

llvm-svn: 360206
2019-05-07 22:58:31 +00:00
Sanjay Patel 9a1c2b7776 [InstSimplify] add tests for minnum/maxnum and NaN; NFC
llvm-svn: 360197
2019-05-07 21:50:09 +00:00
Cameron McInally c3167696bc Add FNeg support to InstructionSimplify
Differential Revision: https://reviews.llvm.org/D61573

llvm-svn: 360053
2019-05-06 16:05:10 +00:00
Dan Robertson 9e441aee50 [NFC] Add baseline tests for int isKnownNonZero
Add baseline tests for improvements of isKnownNonZero for integer types.

Differential Revision: https://reviews.llvm.org/D60932

llvm-svn: 359267
2019-04-26 02:55:54 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Fangrui Song d5c404246f [ConstantFold] Don't evaluate FP or FP vector casts or truncations when simplifying icmp
Fix PR41476

llvm-svn: 358262
2019-04-12 07:34:30 +00:00
Nikita Popov 3db93ac5d6 Reapply [ValueTracking] Support min/max selects in computeConstantRange()
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This fixes an infinite InstCombine loop, with the
test case taken from D59378.

Relative to the previous iteration, this contains some adjustments for
AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen,
which ends up constant folding some existing med3 tests after this
change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option
is added, which allows disabling scalar IR passes (that use InstSimplify)
for testing purposes.

Differential Revision: https://reviews.llvm.org/D59506

llvm-svn: 357870
2019-04-07 17:22:16 +00:00
Matt Arsenault 03e7492876 InstSimplify: Fold round intrinsics from sitofp/uitofp
https://godbolt.org/z/gEMRZb

llvm-svn: 357549
2019-04-03 00:25:06 +00:00
Matt Arsenault fa0a2c529b InstSimplify: Add missing case from r357386
llvm-svn: 357443
2019-04-02 00:46:19 +00:00
Matt Arsenault 0276b94356 InstSimplify: Add baseline test for upcoming change
llvm-svn: 357386
2019-04-01 14:03:44 +00:00
Nikita Popov b86576a5b9 [InstSimplify] Add tests for signed icmp of and/or; NFC
Even if a signed predicate is used, the ranges computed for and/or
are unsigned, resulting in missed simplifications.

llvm-svn: 356720
2019-03-21 21:13:08 +00:00
Nikita Popov 00b5ecab5d [ValueTracking] Compute range for abs without nsw
This is a small followup to D59511. The code that was moved into
computeConstantRange() there is a bit overly conversative: If the
abs is not nsw, it does not compute any range. However, abs without
nsw still has a well-defined contiguous unsigned range from 0 to
SIGNED_MIN. This is a lot less useful than the usual 0 to SIGNED_MAX
range, but if we're already here we might as well specify it...

Differential Revision: https://reviews.llvm.org/D59563

llvm-svn: 356586
2019-03-20 18:16:02 +00:00
Nikita Popov 2dd1566e8b [InstSimplify] Add additional cmp of abs without nsw tests; NFC
llvm-svn: 356520
2019-03-19 21:12:21 +00:00
Nikita Popov 3e9770d2dc Revert "[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()"
This reverts commit 106f0cdefb.

This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests.

llvm-svn: 356424
2019-03-18 22:26:27 +00:00
Nikita Popov 106f0cdefb [ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This was suggested by spatel as an alternative
approach to D59378. I've also added the infinite looping test from
that revision here.

Differential Revision: https://reviews.llvm.org/D59506

llvm-svn: 356415
2019-03-18 21:35:19 +00:00
Nikita Popov 05baa9ee1a [InstSimplify] Add additional icmp of min/max tests; NFC
These are baseline tests for D59506.

llvm-svn: 356408
2019-03-18 21:19:56 +00:00
Sanjay Patel 9dada83d6c [InstSimplify] remove zero-shift-guard fold for general funnel shift
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html

We can't remove the compare+select in the general case because
we are treating funnel shift like a standard instruction (as
opposed to a special instruction like select/phi).

That means that if one of the operands of the funnel shift is
poison, the result is poison regardless of whether we know that
the operand is actually unused based on the instruction's
particular semantics.

The motivating case for this transform is the more specific
rotate op (rather than funnel shift), and we are preserving the
fold for that case because there is no chance of introducing
extra poison when there is no anonymous extra operand to the
funnel shift.

llvm-svn: 354905
2019-02-26 18:26:56 +00:00
Sanjay Patel 421c6e6864 [InstSimplify] add tests for rotate; NFC
Rotate is a special-case of funnel shift that has different
poison constraints than the general case. That's not visible
yet in the existing tests, but it needs to be corrected.

llvm-svn: 354894
2019-02-26 16:44:08 +00:00
Sanjay Patel 68171e3cd6 [InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201

The previous attempt at this in rL354406 had a logic bug that
actually triggered a regression test failure, but I failed to
notice it the first time.

llvm-svn: 354467
2019-02-20 14:34:00 +00:00
Sanjay Patel 49f97395ab Revert "[InstSimplify] use any-zero matcher for fcmp folds"
This reverts commit 058bb83513.
Forgot to update another test affected by this change.

llvm-svn: 354408
2019-02-20 00:20:38 +00:00