Commit Graph

532 Commits

Author SHA1 Message Date
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Fangrui Song d5c404246f [ConstantFold] Don't evaluate FP or FP vector casts or truncations when simplifying icmp
Fix PR41476

llvm-svn: 358262
2019-04-12 07:34:30 +00:00
Nikita Popov 3db93ac5d6 Reapply [ValueTracking] Support min/max selects in computeConstantRange()
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This fixes an infinite InstCombine loop, with the
test case taken from D59378.

Relative to the previous iteration, this contains some adjustments for
AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen,
which ends up constant folding some existing med3 tests after this
change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option
is added, which allows disabling scalar IR passes (that use InstSimplify)
for testing purposes.

Differential Revision: https://reviews.llvm.org/D59506

llvm-svn: 357870
2019-04-07 17:22:16 +00:00
Matt Arsenault 03e7492876 InstSimplify: Fold round intrinsics from sitofp/uitofp
https://godbolt.org/z/gEMRZb

llvm-svn: 357549
2019-04-03 00:25:06 +00:00
Matt Arsenault fa0a2c529b InstSimplify: Add missing case from r357386
llvm-svn: 357443
2019-04-02 00:46:19 +00:00
Matt Arsenault 0276b94356 InstSimplify: Add baseline test for upcoming change
llvm-svn: 357386
2019-04-01 14:03:44 +00:00
Nikita Popov b86576a5b9 [InstSimplify] Add tests for signed icmp of and/or; NFC
Even if a signed predicate is used, the ranges computed for and/or
are unsigned, resulting in missed simplifications.

llvm-svn: 356720
2019-03-21 21:13:08 +00:00
Nikita Popov 00b5ecab5d [ValueTracking] Compute range for abs without nsw
This is a small followup to D59511. The code that was moved into
computeConstantRange() there is a bit overly conversative: If the
abs is not nsw, it does not compute any range. However, abs without
nsw still has a well-defined contiguous unsigned range from 0 to
SIGNED_MIN. This is a lot less useful than the usual 0 to SIGNED_MAX
range, but if we're already here we might as well specify it...

Differential Revision: https://reviews.llvm.org/D59563

llvm-svn: 356586
2019-03-20 18:16:02 +00:00
Nikita Popov 2dd1566e8b [InstSimplify] Add additional cmp of abs without nsw tests; NFC
llvm-svn: 356520
2019-03-19 21:12:21 +00:00
Nikita Popov 3e9770d2dc Revert "[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()"
This reverts commit 106f0cdefb.

This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests.

llvm-svn: 356424
2019-03-18 22:26:27 +00:00
Nikita Popov 106f0cdefb [ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This was suggested by spatel as an alternative
approach to D59378. I've also added the infinite looping test from
that revision here.

Differential Revision: https://reviews.llvm.org/D59506

llvm-svn: 356415
2019-03-18 21:35:19 +00:00
Nikita Popov 05baa9ee1a [InstSimplify] Add additional icmp of min/max tests; NFC
These are baseline tests for D59506.

llvm-svn: 356408
2019-03-18 21:19:56 +00:00
Sanjay Patel 9dada83d6c [InstSimplify] remove zero-shift-guard fold for general funnel shift
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2019-February/130491.html

We can't remove the compare+select in the general case because
we are treating funnel shift like a standard instruction (as
opposed to a special instruction like select/phi).

That means that if one of the operands of the funnel shift is
poison, the result is poison regardless of whether we know that
the operand is actually unused based on the instruction's
particular semantics.

The motivating case for this transform is the more specific
rotate op (rather than funnel shift), and we are preserving the
fold for that case because there is no chance of introducing
extra poison when there is no anonymous extra operand to the
funnel shift.

llvm-svn: 354905
2019-02-26 18:26:56 +00:00
Sanjay Patel 421c6e6864 [InstSimplify] add tests for rotate; NFC
Rotate is a special-case of funnel shift that has different
poison constraints than the general case. That's not visible
yet in the existing tests, but it needs to be corrected.

llvm-svn: 354894
2019-02-26 16:44:08 +00:00
Sanjay Patel 68171e3cd6 [InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201

The previous attempt at this in rL354406 had a logic bug that
actually triggered a regression test failure, but I failed to
notice it the first time.

llvm-svn: 354467
2019-02-20 14:34:00 +00:00
Sanjay Patel 49f97395ab Revert "[InstSimplify] use any-zero matcher for fcmp folds"
This reverts commit 058bb83513.
Forgot to update another test affected by this change.

llvm-svn: 354408
2019-02-20 00:20:38 +00:00
Sanjay Patel 058bb83513 [InstSimplify] use any-zero matcher for fcmp folds
The m_APFloat matcher does not work with anything but strict
splat vector constants, so we could miss these folds and then
trigger an assertion in instcombine:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13201

llvm-svn: 354406
2019-02-20 00:09:50 +00:00
Sanjay Patel 9cf04addf3 [InstSimplify] add vector tests for fcmp+fabs; NFC
llvm-svn: 354404
2019-02-19 23:58:02 +00:00
Dmitry Venikov aaa709f2ec [InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == x, log2(pow(2.0,x)) == x
Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x

Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D41940

llvm-svn: 352981
2019-02-03 03:48:30 +00:00
Dmitry Venikov 4c81a2b2ec Commit tests for changes in revision D41940
llvm-svn: 352734
2019-01-31 07:38:19 +00:00
Nikita Popov f17421e595 [ConstantFolding] Consolidate and extend bitcount intrinsic tests; NFC
Move constant folding tests into ConstantFolding/bitcount.ll and drop
various tests in other places. Add coverage for undefs.

llvm-svn: 349806
2018-12-20 19:46:52 +00:00
Nikita Popov 221f3fc750 [InstSimplify] Simplify saturating add/sub + icmp
If a saturating add/sub has one constant operand, then we can
determine the possible range of outputs it can produce, and simplify
an icmp comparison based on that.

The implementation is based on a similar existing mechanism for
simplifying binary operator + icmps.

Differential Revision: https://reviews.llvm.org/D55735

llvm-svn: 349369
2018-12-17 17:45:18 +00:00
Nikita Popov 32083bd072 [InstCombine] Add additional saturating add/sub + icmp tests; NFC
These test comparisons with saturating add/sub in non-canonical
form.

llvm-svn: 349309
2018-12-16 17:45:25 +00:00
Nikita Popov f61567f42a [InstSimplify] Add tests for saturating add/sub + icmp; NFC
If a saturating add/sub with a constant operand is compared to
another constant, we should be able to determine that the condition
is always true/false in some cases (but currently don't).

llvm-svn: 349261
2018-12-15 10:37:01 +00:00
Sanjay Patel 998ececef0 [InstCombine] remove dead code from visitExtractElement
Extracting from a splat constant is always handled by InstSimplify.
Move the test for this from InstCombine to InstSimplify to make
sure that stays true.

llvm-svn: 348423
2018-12-05 23:09:33 +00:00
Sanjay Patel 398728732e [InstSimplify] add tests for undef + partial undef constant folding; NFC
These tests should probably go under a separate test file because they
should fold with just -constprop, but they're similar to the scalar
tests already in here.

llvm-svn: 348045
2018-11-30 22:51:34 +00:00
Sanjay Patel d802270808 [InstSimplify] fold select with implied condition
This is an almost direct move of the functionality from InstCombine to 
InstSimplify. There's no reason not to do this in InstSimplify because 
we never create a new value with this transform.

(There's a question of whether any dominance-based transform belongs in
either of these passes, but that's a separate issue.)

I've changed 1 of the conditions for the fold (1 of the blocks for the 
branch must be the block we started with) into an assert because I'm not 
sure how that could ever be false.

We need 1 extra check to make sure that the instruction itself is in a
basic block because passes other than InstCombine may be using InstSimplify
as an analysis on values that are not wired up yet.

The 3-way compare changes show that InstCombine has some kind of 
phase-ordering hole. Otherwise, we would have already gotten the intended
final result that we now show here.

llvm-svn: 347896
2018-11-29 18:44:39 +00:00
Sanjay Patel 14ab9170b8 [InstSimplify] fold funnel shifts with undef operands
Splitting these off from the D54666.

Patch by: nikic (Nikita Popov)

llvm-svn: 347332
2018-11-20 17:34:59 +00:00
Sanjay Patel 2778f56a40 [InstSimplify] add tests for funnel shift with undef operands; NFC
These are part of D54666, so adding them here before the patch to
show the baseline (currently unoptimized) results.

Patch by: @nikic (Nikita Popov)

llvm-svn: 347331
2018-11-20 17:30:09 +00:00
Sanjay Patel eea21da12a [InstructionSimplify] Add support for saturating add/sub
Add support for saturating add/sub in InstructionSimplify. In particular, the following simplifications are supported:

    sat(X + 0) -> X
    sat(X + undef) -> -1
    sat(X uadd MAX) -> MAX
    (and commutative variants)

    sat(X - 0) -> X
    sat(X - X) -> 0
    sat(X - undef) -> 0
    sat(undef - X) -> 0
    sat(0 usub X) -> 0
    sat(X usub MAX) -> 0

Patch by: @nikic (Nikita Popov)

Differential Revision: https://reviews.llvm.org/D54532

llvm-svn: 347330
2018-11-20 17:20:26 +00:00
Sanjay Patel f5ead29b78 [PatternMatch] Handle undef vectors consistently
This patch fixes the issue noticed in D54532. 
The problem is that cst_pred_ty-based matchers like m_Zero() currently do not match 
scalar undefs (as expected), but *do* match vector undefs. This may lead to optimization 
inconsistencies in rare cases.

There is only one existing test for which output changes, reverting the change from D53205. 
The reason here is that vector fsub undef, %x is no longer matched as an m_FNeg(). While I 
think that the new output is technically worse than the previous one, it is consistent with 
scalar, and I don't think it's really important either way (generally that undef should have 
been folded away prior to reassociation.)

I've also added another test case for this issue based on InstructionSimplify. It took some 
effort to find that one, as in most cases undef folds are either checked first -- and in the 
cases where they aren't it usually happens to not make a difference in the end. This is the 
only case I was able to come up with. Prior to this patch the test case simplified to undef 
in the scalar case, but zeroinitializer in the vector case.

Patch by: @nikic (Nikita Popov)

Differential Revision: https://reviews.llvm.org/D54631

llvm-svn: 347318
2018-11-20 16:08:19 +00:00
Sanjay Patel f967328e24 [InstSimplify] add tests for saturating add/sub; NFC
These are baseline tests for D54532.
Patch based on the original tests by:
@nikic (Nikita Popov)

llvm-svn: 347060
2018-11-16 16:32:34 +00:00
Sanjay Patel 5ebd2a785e [InstSimplify] add test to demonstrate undef matching differences; NFC
This is a baseline test for D54631.
Patch by:
@nikic (Nikita Popov)

llvm-svn: 347055
2018-11-16 15:35:58 +00:00
Sanjay Patel e98ec77a95 [InstSimplify] delete shift-of-zero guard ops around funnel shifts
This is a problem seen in common rotate idioms as noted in:
https://bugs.llvm.org/show_bug.cgi?id=34924

Note that we are not canonicalizing standard IR (shifts and logic) to the intrinsics yet. 
(Although I've written this before...) I think this is the last step before we enable 
that transform. Ie, we could regress code by doing that transform without this 
simplification in place.

In PR34924, I questioned whether this is a valid transform for target-independent IR, 
but I convinced myself this is ok. If we're speculating a funnel shift by turning cmp+br 
into select, then SimplifyCFG has already determined that the transform is justified. 
It's possible that SimplifyCFG is not taking into account profile or other metadata, 
but if that's true, then it's a bug independent of funnel shifts.

Also, we do have CGP code to restore a guard like this around an intrinsic if it can't 
be lowered cheaply. But that isn't necessary for funnel shift because the default 
expansion in SelectionDAGBuilder includes this same cmp+select.

Differential Revision: https://reviews.llvm.org/D54552

llvm-svn: 346960
2018-11-15 14:53:37 +00:00
Sanjay Patel 4832ffee39 [InstSimplify] add more tests for funnel shift with select; NFC
The cases are just different enough that we should have 
complete tests to avoid bugs from typos in the code.

llvm-svn: 346902
2018-11-14 22:34:25 +00:00
Sanjay Patel 7d028670f6 [InstSimplify] add tests for funnel shift with select; NFC
llvm-svn: 346881
2018-11-14 19:12:54 +00:00
Sanjay Patel 1440107821 [InstSimplify] fold select (fcmp X, Y), X, Y
This is NFCI for InstCombine because it calls InstSimplify, 
so I left the tests for this transform there. As noted in
the code comment, we can allow this fold more often by using
FMF and/or value tracking.

llvm-svn: 346169
2018-11-05 21:51:39 +00:00
Sanjay Patel 72c2d355b7 [InstSimplify] add tests for select+fcmp; NFC
These are translated from InstCombine's test file with the same name.
We should move the transform from InstCombine to InstSimplify.

llvm-svn: 346168
2018-11-05 21:42:01 +00:00
Sanjay Patel 746ebb4ee8 [InstSimplify] fold icmp based on range of abs/nabs (2nd try)
This is retrying the fold from rL345717 
(reverted at rL347780)
...with a fix for the miscompile
demonstrated by PR39510:
https://bugs.llvm.org/show_bug.cgi?id=39510

Original commit message:

This is a fix for PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

We managed to get some of these patterns using computeKnownBits in https://reviews.llvm.org/D47041, but that
can't be used for nabs(). Instead, put in some range-based logic, so we can fold
both abs/nabs with icmp with a constant value.

Alive proofs:
https://rise4fun.com/Alive/21r

Name: abs_nsw_is_positive

  %cmp = icmp slt i32 %x, 0
  %negx = sub nsw i32 0, %x
  %abs = select i1 %cmp, i32 %negx, i32 %x
  %r = icmp sgt i32 %abs, -1
    =>
  %r = i1 true


Name: abs_nsw_is_not_negative

  %cmp = icmp slt i32 %x, 0
  %negx = sub nsw i32 0, %x
  %abs = select i1 %cmp, i32 %negx, i32 %x
  %r = icmp slt i32 %abs, 0
    =>
  %r = i1 false


Name: nabs_is_negative_or_0

  %cmp = icmp slt i32 %x, 0
  %negx = sub i32 0, %x
  %nabs = select i1 %cmp, i32 %x, i32 %negx
  %r = icmp slt i32 %nabs, 1
    =>
  %r = i1 true

Name: nabs_is_not_over_0

  %cmp = icmp slt i32 %x, 0
  %negx = sub i32 0, %x
  %nabs = select i1 %cmp, i32 %x, i32 %negx
  %r = icmp sgt i32 %nabs, 0
    =>
  %r = i1 false

Differential Revision: https://reviews.llvm.org/D53844

llvm-svn: 345832
2018-11-01 14:07:39 +00:00
Sanjay Patel 056807b01e [InstSimplify] add tests for icmp fold bug (PR39510); NFC
Verify that set intersection/subset are not confused.

llvm-svn: 345831
2018-11-01 14:03:22 +00:00
Sanjay Patel 72fe03f93b revert rL345717 : [InstSimplify] fold icmp based on range of abs/nabs
This can miscompile as shown in PR39510:
https://bugs.llvm.org/show_bug.cgi?id=39510

llvm-svn: 345780
2018-10-31 21:37:40 +00:00
Sanjay Patel d4dc30c20d [InstSimplify] fold 'fcmp nnan ult X, 0.0' when X is not negative
This is the inverted case for the transform added with D53874 / rL345725.

llvm-svn: 345728
2018-10-31 15:35:46 +00:00
Sanjay Patel 85cba3b6fb [InstSimplify] fold 'fcmp nnan oge X, 0.0' when X is not negative
This re-raises some of the open questions about how to apply and use fast-math-flags in IR from PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086
...but given the current implementation (no FMF on casts), this is likely the only way to predicate the 
transform.

This is part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

Differential Revision: https://reviews.llvm.org/D53874

llvm-svn: 345725
2018-10-31 14:57:23 +00:00
Sanjay Patel 1cd9917edf [InstSimplify] add tests for fcmp and known positive; NFC
llvm-svn: 345722
2018-10-31 14:29:21 +00:00
Sanjay Patel 2efccd2cf2 [InstSimplify] fold icmp based on range of abs/nabs
This is a fix for PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

We managed to get some of these patterns using computeKnownBits in D47041, but that 
can't be used for nabs(). Instead, put in some range-based logic, so we can fold 
both abs/nabs with icmp with a constant value.

Alive proofs:
https://rise4fun.com/Alive/21r

Name: abs_nsw_is_positive
  %cmp = icmp slt i32 %x, 0
  %negx = sub nsw i32 0, %x
  %abs = select i1 %cmp, i32 %negx, i32 %x
  %r = icmp sgt i32 %abs, -1
    =>
  %r = i1 true
 
Name: abs_nsw_is_not_negative
  %cmp = icmp slt i32 %x, 0
  %negx = sub nsw i32 0, %x
  %abs = select i1 %cmp, i32 %negx, i32 %x
  %r = icmp slt i32 %abs, 0
    =>
  %r = i1 false
 
Name: nabs_is_negative_or_0
  %cmp = icmp slt i32 %x, 0
  %negx = sub i32 0, %x
  %nabs = select i1 %cmp, i32 %x, i32 %negx
  %r = icmp slt i32 %nabs, 1
    =>
  %r = i1 true

Name: nabs_is_not_over_0
  %cmp = icmp slt i32 %x, 0
  %negx = sub i32 0, %x
  %nabs = select i1 %cmp, i32 %x, i32 %negx
  %r = icmp sgt i32 %nabs, 0
    =>
  %r = i1 false

Differential Revision: https://reviews.llvm.org/D53844

llvm-svn: 345717
2018-10-31 13:25:10 +00:00
Sanjay Patel bcde5afcac [InstSimplify] add tests for fcmp folds; NFC
This is part of a problem noted in PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

llvm-svn: 345615
2018-10-30 16:58:43 +00:00
Sanjay Patel 33603198f2 [InstSimplify] add tests for abs/nabs+icmp folding; NFC
llvm-svn: 345541
2018-10-29 21:05:41 +00:00
Thomas Lively c339250e12 [InstCombine] InstCombine and InstSimplify for minimum and maximum
Summary: Depends on D52765

Reviewers: aheejin, dschuff

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52766

llvm-svn: 344799
2018-10-19 19:01:26 +00:00
Sanjay Patel ce3f1915f3 [InstCombine] move/add tests for sub/neg; NFC
These should all be handled using "dyn_castNegVal",
but that misses vectors with undef elements.

llvm-svn: 344790
2018-10-19 17:26:22 +00:00