Commit Graph

5080 Commits

Author SHA1 Message Date
Sebastian Neubauer 2c659082bd [AMDGPU] Don't combine memory intrs to v3i16
v3i16 and v3f16 currently cannot be legalized and lowered so they should
not be emitted by inst combining.

Moved the check down to still allow extracting 1 or 2 elements via the dmask.

Fixes image intrinsics being combined to return v3x16.

Differential Revision: https://reviews.llvm.org/D84223
2020-07-22 12:44:01 +02:00
Sanjay Patel 750f4c591d [InstCombine] allow peeking through zext of shift amount to match rotate idioms (PR45701)
We might want to also allow trunc of the shift amount, but that seems less likely?

  define i32 @src(i32 %x, i1 %y) {
  %0:
    %rem = and i1 %y, 1
    %cmp = icmp eq i1 %rem, 0
    %sh_prom = zext i1 %rem to i32
    %sub = sub nsw nuw i1 0, %rem
    %sh_prom1 = zext i1 %sub to i32
    %shr = lshr i32 %x, %sh_prom1
    %shl = shl i32 %x, %sh_prom
    %or = or i32 %shl, %shr
    %r = select i1 %cmp, i32 %x, i32 %or
    ret i32 %r
  }
  =>
  define i32 @tgt(i32 %x, i1 %y) {
  %0:
    %t = zext i1 %y to i32
    %r = fshl i32 %x, i32 %x, i32 %t
    ret i32 %r
  }

  Transformation seems to be correct!

https://alive2.llvm.org/ce/z/xgMvE3

http://bugs.llvm.org/PR45701
2020-07-20 16:18:11 -04:00
Sanjay Patel 92ec0c5da6 [InstCombine] add tests for funnel shift/rotate with narrow shift amount; NFC 2020-07-20 16:18:11 -04:00
Nikita Popov d12ec0f752 [InstCombine] Fix store merge worklist management (PR46680)
Fixes https://bugs.llvm.org/show_bug.cgi?id=46680.

Just like insertions through IRBuilder, InsertNewInstBefore()
should be using the deferred worklist mechanism, so that processing
of newly added instructions is prioritized.

There's one side-effect of the worklist order change which could be
classified as a regression. An add op gets pushed through a select
that at the time is not a umax. We could add a reverse transform
that tries to push adds in the reverse direction to restore a min/max,
but that seems like a sure way of getting infinite loops... Seems
like something that should best wait on min/max intrinsics.

Differential Revision: https://reviews.llvm.org/D84109
2020-07-19 15:05:45 +02:00
Nikita Popov 13ae440de4 [InstCombine] Add test for PR46680 (NFC) 2020-07-18 23:37:16 +02:00
Xinan Jiang d8e0baf29d [InstCombine] Fix typo in comment.
Reviewers: fhahn

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D83951
2020-07-17 20:57:45 +01:00
Roman Lebedev 0fdcca07ad
[InstCombine] Fold X sdiv (-1 << C) -> -(X u>> Y) iff X is non-negative
This is the one i'm seeing as missed optimization,
although there are likely other possibilities, as usual.

There are 4 variants of a general sdiv->udiv fold:
https://rise4fun.com/Alive/VS6

Name: v0
Pre: C0 >= 0 && C1 >= 0
%r = sdiv i8 C0, C1
  =>
%r = udiv i8 C0, C1

Name: v1
Pre: C0 <= 0 && C1 >= 0
%r = sdiv i8 C0, C1
  =>
%t0 = udiv i8 -C0, C1
%r = sub i8 0, %t0

Name: v2
Pre: C0 >= 0 && C1 <= 0
%r = sdiv i8 C0, C1
  =>
%t0 = udiv i8 C0, -C1
%r = sub i8 0, %t0

Name: v3
Pre: C0 <= 0 && C1 <= 0
%r = sdiv i8 C0, C1
  =>
%r = udiv i8 -C0, -C1


If we really don't like sdiv (more than udiv that is),
and are okay with increasing instruction count (2 new negations),
and we ensure that we don't undo the fold,
then we could just implement these..
2020-07-17 22:50:09 +03:00
Roman Lebedev 66b66988e6
[NFC][InstCombine] Add some tests with sdiv-by-negative-power-of-two 2020-07-17 22:50:09 +03:00
Max Kazantsev df6e185e8f [InstCombine][Test] Test for fix of replacing select with Phis when branch has the same labels
An additional test that allows to check the correctness of handling the case of the same
branch labels in the dominator when trying to replace select with phi-node.

Patch By: Kirill Polushin
Differential Revision: https://reviews.llvm.org/D84006
Reviewed By: mkazantsev
2020-07-17 17:16:28 +07:00
Eric Christopher 7bfaa40086 Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions"
due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753.

An SROA change soon may obviate some of these problems.

This reverts commit 8d09f20798.
2020-07-16 11:54:04 -07:00
Max Kazantsev 90798e09e2 Re-enable "[InstCombine] Simplify boolean Phis with const inputs using CFG"
This reverts commit b893822e32.

+ Clang test fixes
+ Insertion point fix for landing pads
2020-07-16 16:09:08 +07:00
Max Kazantsev b893822e32 Revert "[InstCombine] Simplify boolean Phis with const inputs using CFG"
This reverts commit 00472067c3.

Need to fix failing clang tests.
2020-07-16 12:58:39 +07:00
Max Kazantsev 00472067c3 [InstCombine] Simplify boolean Phis with const inputs using CFG
This patch adds simplification for pattern:
```
  if (cond)
  /       \
 ...      ...
  \       /
p = phi [true] [false]
...
br p, succ_1, succ_2
```
If we can prove that top block's branches dominate respective
inputs of a block that has a Phi with constant inputs, we can
use the branch condition (maybe inverted) instead of Phi.
This will make proofs of implication for further jump threading
more transparent.

Differential Revision: https://reviews.llvm.org/D81375
Reviewed By: xbolva00
2020-07-16 12:06:10 +07:00
Craig Topper 00f3579aea Revert "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and subsequent patches
This reverts most of the following patches due to reports of miscompiles.
I've left the added test cases with comments updated to be FIXMEs.

1cf6f210a2 [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2 [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2 [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
2020-07-15 22:02:33 -07:00
Sanjay Patel d8b268680d [InstCombine] prevent infinite looping in or-icmp fold (PR46712)
I'm not sure if the test is truly minimal, but we need to
induce a situation where a value becomes a constant but is
not immediately folded before getting to the 'or' transform.
2020-07-15 14:12:12 -04:00
Sanjay Patel efc30e591b [InstCombine] update datalayout in test file; NFC
We need to specify legal integer widths to trigger PR46712,
so add those here. This doesn't appear to affect any existing
tests, and it's not clear why a datalayout would not include
any legal integer widths.

While here, change some variable names that include 'tmp' to
avoid warnings from the auto-generating script for CHECK lines.
2020-07-15 14:12:12 -04:00
Christopher Tetreault 9c87c55805 [SVE] Make cstfp_pred_ty and cst_pred_ty work with scalable splats
Reviewers: efriedma, lebedev.ri, fhahn, c-rhodes, david-arm

Reviewed By: efriedma, david-arm

Subscribers: tschuett, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83001
2020-07-14 14:20:39 -07:00
Sanjay Patel 34d35d4a42 [ValueTracking] fix miscompile in maxnum case of cannotBeOrderedLessThanZeroImpl (PR46627)
A miscompile with -0.0 is shown in:
http://bugs.llvm.org/PR46627

This is because maxnum(-0.0, +0.0) does not specify a fixed result:
http://llvm.org/docs/LangRef.html#llvm-maxnum-intrinsic

So we need to tighten the constraints for when it is ok to say the
result of maxnum is positive (including +0.0).

Differential Revision: https://reviews.llvm.org/D83601
2020-07-14 08:08:09 -04:00
Sanjay Patel 9cc669d22d [InstCombine][InstSimplify] add tests for sign of maxnum; NFC
More coverage for D83601.
2020-07-14 08:08:09 -04:00
Tyker 8d09f20798 [AssumeBundles] Use operand bundles to encode alignment assumptions
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complemantary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignemnt
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as a "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.

Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: thopre, yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71739
2020-07-14 01:05:58 +02:00
Gui Andrade bfa3b627c6 [InstCombine] Erase attribute lists for simplified libcalls
Currently, a transformation like pow(2.0, x) -> exp2(x) copies the pow
attribute list verbatim and applies it to exp2. This works out fine
when the attribute list is empty, but when it isn't clang may error due
due to the mismatch.

The source function and destination don't necessarily have anything
to do with one another, attribute-wise. So it makes sense to remove
the attribute lists (this is similar to what IPO does in this
situation).

This was discovered after implementing the `noundef` param attribute.

Differential Revision: https://reviews.llvm.org/D82820
2020-07-13 22:32:33 +00:00
Vedant Kumar 3d52b1e81b Revert "[InstCombine] Drop debug loc in TryToSinkInstruction (reland)"
This reverts commit 9649c2095f. See
discussion on the llvm-commits thread: if it's OK to preserve the
location when sinking a call, it's probably OK to always preserve the
location.
2020-07-13 15:17:07 -07:00
Max Kazantsev e808cab824 [InstCombine] Improve select -> phi canonicalization: consider more blocks
We can try to replace select with a Phi not in its parent block alone,
but also in blocks of its arguments. We benefit from it when select's
argument is a Phi.

Differential Revision: https://reviews.llvm.org/D83284
Reviewed By: nikic
2020-07-13 11:40:32 +07:00
Sanjay Patel 4458973347 [InstCombine] fold mul of zext/sext bools to 'and'
Similar to rG40fcc42:
The base case only worked because we were relying on a
poison-unsafe select transform; if that is fixed, we
would regress on patterns like this.

The extra use tests show that the select transform can't
be applied consistently. So it may be a regression to have
an extra instruction on 1 test, but that result was not
created safely and does not happen reliably.
2020-07-12 15:56:26 -04:00
Roman Lebedev 2655a70a04
[InstCombine] After merging store into successor, queue prev. store to be visited (PR46661)
We can happen to have a situation with many stores eligible for transform,
but due to our visitation order (top to bottom), when we have processed
the first eligible instruction, we would not try to reprocess the previous
instructions that are now also eligible.

So after we've successfully merged a store that was second-to-last instruction
into successor, if the now-second-to-last instruction is also a such store
that is eligible, add it to worklist to be revisited.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46661
2020-07-10 17:49:16 +03:00
Roman Lebedev ef0ecb7b03
[NFCI][InstCombine] PR46661: multiple stores eligible for merging into successor - worklist issue
The testcase should pass with a single instcombine iteration.
2020-07-10 17:49:16 +03:00
Craig Topper 9b1e95329a [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
As noted here https://lists.llvm.org/pipermail/llvm-dev/2016-October/106182.html and by alive2, this transform isn't valid. If X is poison this potentially propagates poison when it shouldn't.

This same transform still exists in DAGCombiner.

Differential Revision: https://reviews.llvm.org/D83360
2020-07-08 12:53:05 -07:00
Max Kazantsev 094e99d264 [Test] Add one more missing optimization opportunity test 2020-07-07 13:04:15 +07:00
Roman Lebedev 7ea46aee36
Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions"
Assume bundle can have more than one entry with the same name,
but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses
getOperandBundle("align"), which internally assumes that it isn't the
case, and happily crashes otherwise.

Minimal reduced reproducer: run `opt -alignment-from-assumptions` on

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%0 = type { i64, %1*, i8*, i64, %2, i32, %3*, i8* }
%1 = type opaque
%2 = type { i8, i8, i16 }
%3 = type { i32, i32, i32, i32 }

; Function Attrs: nounwind
define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 {
bb:
  call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ]
  ret i32 0
}

; Function Attrs: nounwind willreturn
declare void @llvm.assume(i1) #1

attributes #0 = { nounwind "reciprocal-estimates"="none" }
attributes #1 = { nounwind willreturn }


This is what we'd have with -mllvm -enable-knowledge-retention

This reverts commit c95ffadb24.
2020-07-04 23:49:23 +03:00
Sanjay Patel 3b8ae1001f [InstCombine] fix miscompile from umul_with_overflow matching
As noted in PR46561:
https://bugs.llvm.org/show_bug.cgi?id=46561
...it takes something beyond a minimal IR example to trigger
this bug because it relies on matching non-canonical IR.

There are no tests that show the need for matching this
pattern, so I'm just deleting it to fix the miscompile.
2020-07-04 11:16:23 -04:00
Roman Lebedev c3b8bd1eea
[InstCombine] Always try to invert non-canonical predicate of an icmp
Summary:
The actual transform i was going after was:
https://rise4fun.com/Alive/Tp9H
```
Name: zz
Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0
%t0 = and i8 %x, C0
%r = icmp eq i8 %t0, C1
  =>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1

Name: zz
Pre: isPowerOf2(C0)
%t0 = and i8 %x, C0
%r = icmp ne i8 %t0, 0
  =>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
```
but as it can be seen from the current tests, we already canonicalize most of it,
and we are only missing handling multi-use non-canonical icmp predicates.

If we have both `!=0` and `==0`, even though we can CSE them,
we end up being stuck with them. We should canonicalize to the `==0`.

I believe this is one of the cleanup steps i'll need after `-scalarizer`
if i end up proceeding with my WIP alloca promotion helper pass.

Reviewers: spatel, jdoerfert, nikic

Reviewed By: nikic

Subscribers: zzheng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83139
2020-07-04 18:12:04 +03:00
Sanjay Patel ef70cc9d1a [InstCombine] improve debug value names; NFC
The use of 'tmp' can trigger warnings from the update_test_checks.py
script. That's evidence of a flaw in the script's logic, but we
can always do better than naming variables 'tmp' in LLVM too.

The phi test file should be updated with auto-generated regex CHECK
lines, so it isn't affected by cosmetic diffs, but I don't have
time to do that right now.
2020-07-04 11:06:30 -04:00
Sanjay Patel 14936e01e2 [InstCombine] add test for miscompile (PR46561); NFC 2020-07-04 11:06:30 -04:00
Roman Lebedev 341ab51149
[NFCI][InstCombine] shift.ll: s/%tmp/%i/ to silence update script warning 2020-07-04 00:39:35 +03:00
Sanjay Patel 7fd8af1de0 [InstCombine] fold mul of sext bools to 'and'
Alive2:
  define i32 @src(i1 %x, i1 %y) {
  %0:
  %zx = sext i1 %x to i32
  %zy = sext i1 %y to i32
  %r = mul i32 %zx, %zy
  ret i32 %r
  }
  =>
  define i32 @tgt(i1 %x, i1 %y) {
  %0:
  %a = and i1 %x, %y
  %r = zext i1 %a to i32
  ret i32 %r
  }
  Transformation seems to be correct!

https://alive2.llvm.org/ce/z/gaPQxA
2020-07-03 17:28:40 -04:00
Sanjay Patel 5504d8b04a [InstCombine] add more tests for mul of bools; NFC 2020-07-03 17:28:22 -04:00
Florian Hahn 31971ca1c6 [InstCombine] Try to narrow expr if trunc cannot be removed.
Narrowing an input expression of a truncate to a type larger than the
result of the truncate won't allow removing the truncate, but it may
enable further optimizations, e.g. allowing for larger vectorization
factors.

For now this is intentionally limited to integer types only, to avoid
producing new vector ops that might not be suitable for the target.

If we know that the only user is a trunc, we can also be allow more
cases, e.g. also shortening expressions with some additional shifts.

I would appreciate feedback on the best place to do such a narrowing.

This fixes PR43580.

Reviewers: spatel, RKSimon, lebedev.ri, xbolva00

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D82973
2020-07-03 20:22:51 +01:00
Sanjay Patel 40fcc42498 [InstCombine] fold mul of zext bools to 'and'
The base case only works because we are relying on a
poison-unsafe select transform; if that is fixed, we
would regress on patterns like this.

The extra use tests show that the select transform can't
be applied consistently. So it may be a regression to have
an extra instruction on 1 test, but that result was not
created safely and does not happen reliably.
2020-07-03 13:14:18 -04:00
Sanjay Patel 5d60377864 [InstCombine] add tests for mul of bools; NFC 2020-07-03 13:14:18 -04:00
Roman Lebedev 4dd784000e
[NFC][InstCombine] Add some more tests for select based on non-canonical bit-test 2020-07-03 20:12:46 +03:00
Florian Hahn 7a1161767b [InstCombine] Precommit tests for PR43580. 2020-07-03 17:14:02 +01:00
Sanjay Patel 63774642af [InstCombine] add one-use check to cast+select narrowing transform
Prevent increasing the instruction count.
2020-07-03 11:54:09 -04:00
Sanjay Patel 0cd0ae1f29 [InstCombine] add tests to show missing one-use checks; NFC 2020-07-03 11:54:09 -04:00
Simon Pilgrim eb0e7acbd4 [InstCombine] canEvaluateTruncated - use KnownBits to check for inrange shift amounts
Currently canEvaluateTruncated can only attempt to truncate shifts if they are scalar/uniform constant amounts that are in range.

This patch replaces the constant extraction code with KnownBits handling, using the KnownBits::getMaxValue to check that the amounts are inrange.

This enables support for nonuniform constant cases, and also variable shift amounts that have been masked somehow. Annoyingly, this still won't work for vectors with (demanded) undefs as KnownBits returns nothing in those cases, but its a definite improvement on what we currently have.

Differential Revision: https://reviews.llvm.org/D83127
2020-07-03 16:02:10 +01:00
Simon Pilgrim 1ab88de0ed Add tests for trunc(shl/lshr/ashr(*ext(x),zext(and(y,c)))) patterns with variable shifts with clamped shift amounts 2020-07-03 13:39:16 +01:00
Simon Pilgrim b18405fbc0 Add vector trunc(or(shl(zext(x),c1),zext(x))) tests 2020-07-03 13:32:00 +01:00
Simon Pilgrim 80d4f33479 Regenerate apint-cast tests and replace %tmp variable names to silence update_test_checks warnings 2020-07-03 11:42:16 +01:00
Simon Pilgrim b3a2882dbc Add nonuniform vector trunc(or(shl(zext(x),c1),srl(zext(x),c2))) tests 2020-07-03 11:42:15 +01:00
Simon Pilgrim 029046dc32 Regenerate mul-trunc tests, add vector variants and replace %tmp variable names to silence update_test_checks warnings 2020-07-03 11:42:15 +01:00
Simon Pilgrim 3da42f4810 [InstCombine] Add sext(ashr(shl(trunc(x),c),c)) folding support for vectors
Replacing m_ConstantInt with m_Constant permits folding of vectors as well as scalars.

Differential Revision: https://reviews.llvm.org/D83058
2020-07-03 10:04:37 +01:00