5421 Commits

Author SHA1 Message Date
Sanjay Patel 5a6e66ec72 [InstCombine] add folds for icmp+ctpop
https://alive2.llvm.org/ce/z/XjFPQJ

  define void @src(i64 %value) {
    %t0 = call i64 @llvm.ctpop.i64(i64 %value)
    %gt = icmp ugt i64 %t0, 63
    %lt = icmp ult i64 %t0, 64
    call void @use(i1 %gt, i1 %lt)
    ret void
  }

  define void @tgt(i64 %value) {
    %eq = icmp eq i64 %value, -1
    %ne = icmp ne i64 %value, -1
    call void @use(i1 %eq, i1 %ne)
    ret void
  }

  declare i64 @llvm.ctpop.i64(i64) #1
  declare void @use(i1, i1)
2020-10-26 16:48:56 -04:00
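The ctpop folds in 5a6e66ec72 above can be sanity-checked by brute force. The following is only a sketch, not the InstCombine implementation: it assumes an 8-bit width instead of i64 so every value can be enumerated, and the helper names `ctpop` and `folds_hold` are hypothetical.

```python
# Brute-force check of the icmp+ctpop folds at a small bit width.
WIDTH = 8                    # assumption: 8 bits instead of the commit's i64
ALL_ONES = (1 << WIDTH) - 1  # unsigned bit pattern of -1


def ctpop(value: int) -> int:
    """Count the set bits of a WIDTH-bit value, like llvm.ctpop."""
    return bin(value & ALL_ONES).count("1")


def folds_hold() -> bool:
    """Compare both icmp forms against the eq/ne -1 replacements."""
    for value in range(1 << WIDTH):
        gt = ctpop(value) > WIDTH - 1  # icmp ugt ctpop(x), WIDTH-1
        lt = ctpop(value) < WIDTH      # icmp ult ctpop(x), WIDTH
        if gt != (value == ALL_ONES) or lt != (value != ALL_ONES):
            return False
    return True


print(folds_hold())
```

The population count reaches the bit width only when every bit is set, which is why both comparisons reduce to a test against -1.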
Sanjay Patel 05f011b2b6 [InstCombine] add tests for ctpop at bitwidth limit; NFC 2020-10-26 16:48:56 -04:00
Joe Ellis 0f83505593 [SVE][InstCombine] Fix TypeSize warning in canReplaceGEPIdxWithZero
The warning would fire when calling canReplaceGEPIdxWithZero on a GEP
whose source element type is a scalable vector. The size of scalable
vector types is not known, so this optimization cannot be performed.

This patch fixes the issue by:

- bailing out early in this routine if the GEP instruction's source
  element type is a scalable vector.

- making use of getFixedSize -- this removes the dependency on the
  deprecated interface.

Reviewed By: fpetrogalli

Differential Revision: https://reviews.llvm.org/D89968
2020-10-26 17:40:26 +00:00
Simon Pilgrim 0ef6a25e19 [InstCombine] Add bswap test pattern using truncates 2020-10-26 16:11:03 +00:00
Simon Pilgrim 532f3bec3e [InstCombine] collectBitParts - add bitreverse intrinsic support. 2020-10-26 14:36:36 +00:00
Simon Pilgrim 16f126df43 [InstCombine] Add bswap test pattern using bitreverse intrinsic
This is mainly to help with future better bitreverse folding support but we can test it via bswap matching for now.
2020-10-26 14:13:18 +00:00
Simon Pilgrim 6b2eb31e1e [InstCombine] Add support for zext(and(neg(amt),width-1)) rotate shift amount patterns
Alive2: https://alive2.llvm.org/ce/z/bCvvHd
2020-10-26 11:22:41 +00:00
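The shift amount pattern named in 6b2eb31e1e above can be modelled as follows. This is a sketch under stated assumptions: the rotated value is 8 bits wide, the amount is negated and masked in a hypothetical 4-bit type before being zero extended, and the left shift amount is assumed to be `and(amt, width-1)`. The Alive2 link in the commit is the authoritative proof.

```python
# Exhaustive check that a negated, masked, zero-extended amount forms a rotate.
VALUE_WIDTH = 8  # assumption: width of the rotated value
AMT_WIDTH = 4    # assumption: narrower type holding the shift amount
VALUE_MASK = (1 << VALUE_WIDTH) - 1
AMT_MASK = (1 << AMT_WIDTH) - 1


def rotl_reference(x: int, amt: int) -> int:
    """Rotate x left by amt modulo VALUE_WIDTH."""
    s = amt % VALUE_WIDTH
    return ((x << s) | (x >> ((VALUE_WIDTH - s) % VALUE_WIDTH))) & VALUE_MASK


def rotl_pattern(x: int, amt: int) -> int:
    """or(shl(x, zext(and(amt, w-1))), lshr(x, zext(and(neg(amt), w-1))))."""
    left = amt & (VALUE_WIDTH - 1)
    neg_amt = (-amt) & AMT_MASK          # neg wraps in the narrow type
    right = neg_amt & (VALUE_WIDTH - 1)  # zext changes nothing for Python ints
    return ((x << left) | (x >> right)) & VALUE_MASK


def pattern_is_rotate() -> bool:
    return all(
        rotl_pattern(x, amt) == rotl_reference(x, amt)
        for x in range(1 << VALUE_WIDTH)
        for amt in range(1 << AMT_WIDTH)
    )


print(pattern_is_rotate())
```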
Simon Pilgrim 821f3b763a [InstCombine] Add rotate tests where the shift amount is zero extended after masking 2020-10-26 11:22:40 +00:00
Tyker d3205bbca3 [Annotation] Allows annotation to carry some additional constant arguments.
This allows annotations to be used in many more contexts than is currently possible,
especially when an annotation is combined with templates or constexpr.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D88645
2020-10-26 10:50:05 +01:00
Simon Pilgrim 3052e474ec [InstCombine] matchBSwapOrBitReverse - recognise or(fshl(),fshl()) bswap patterns.
I'm not certain InstCombinerImpl::matchBSwapOrBitReverse needs to filter the or(op0(),op1()) ops - there are just too many cases that recognizeBSwapOrBitReverseIdiom/collectBitParts handle now (and quickly).
2020-10-25 10:17:45 +00:00
Simon Pilgrim 5e9f172295 [InstCombine] Add test for or(fshl(),fshl()) bswap pattern.
Currently InstCombinerImpl::matchBSwapOrBitReverse won't match starting from funnel shifts.
2020-10-25 10:07:19 +00:00
Simon Pilgrim 310f62b4ff [InstCombine] narrowFunnelShift - fold trunc/zext or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) (PR35155)
As discussed on PR35155, this extends narrowFunnelShift (recently renamed from narrowRotate) to support basic funnel shift patterns.

Unlike matchFunnelShift, we don't include the computeKnownBits limitation, as extracting the pattern from the zext/trunc layers should be an indicator of reasonable funnel shift codegen; in D89139 we demonstrated how to efficiently promote funnel shifts to wider types.

Differential Revision: https://reviews.llvm.org/D89542
2020-10-24 12:42:43 +01:00
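The narrow funnel shift fold in 310f62b4ff above can be illustrated with a small model. This is a sketch only: it assumes bw = 8, a shift amount already in the range [0, bw), and that the shl/lshr are done in a wider type before truncation. The helper names are hypothetical and the real preconditions in narrowFunnelShift are not modelled.

```python
# Check trunc(or(shl(zext a, x), lshr(zext b, bw - x))) against fshl(a, b, x).
NARROW = 8  # assumption: bw, the width of a, b and the result
NARROW_MASK = (1 << NARROW) - 1


def fshl_reference(a: int, b: int, x: int) -> int:
    """High NARROW bits of the concatenation a:b shifted left by x."""
    concat = (a << NARROW) | b
    return ((concat << x) >> NARROW) & NARROW_MASK


def wide_pattern(a: int, b: int, x: int) -> int:
    """The or(shl, lshr) form computed in a wider type, then truncated."""
    wide = (a << x) | (b >> (NARROW - x))  # Python ints are wide enough
    return wide & NARROW_MASK              # trunc back to NARROW bits


def pattern_is_fshl() -> bool:
    return all(
        wide_pattern(a, b, x) == fshl_reference(a, b, x)
        for a in range(1 << NARROW)
        for b in range(1 << NARROW)
        for x in range(NARROW)
    )


print(pattern_is_fshl())
```

Because the shifts happen in the wider type, a shift amount of zero is well defined in this model: the lshr by bw simply clears the zero-extended `b`.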
Jay Foad 958130dfda [AMDGPU] Add simplification/combines for llvm.amdgcn.fma.legacy
This follows on from D89558 which added the new intrinsic and D88955
which added similar combines for llvm.amdgcn.fmul.legacy.

Differential Revision: https://reviews.llvm.org/D90028
2020-10-23 16:16:13 +01:00
Simon Pilgrim a6ad077f5d [InstCombine] Add i8 bitreverse by multiplication test patterns
Pulled from bit twiddling hacks webpage
2020-10-23 15:39:57 +01:00
Simon Pilgrim 61d1847b12 [InstCombine] Add 8/16/32/64 bitreverse test coverage
Use typical codegen for the traditional pairwise lgN bitreverse algorithm
2020-10-23 15:39:56 +01:00
Simon Pilgrim 9e7667e2ad [InstCombine] Add initial bitreverse test coverage 2020-10-23 15:39:56 +01:00
Sanjay Patel c72198079d [ValueTracking] add range limits for cttz
As discussed in D89952,
instcombine can sometimes find a way to reduce similar patterns,
but it is incomplete.
InstSimplify uses the computeConstantRange() ValueTracking analysis
via simplifyICmpWithConstant(), so we just need to fill in the max
value of cttz to process any "icmp pred cttz(X), C" pattern (the
min value is initialized to zero automatically).

https://alive2.llvm.org/ce/z/Z_SLWZ

Follow-up to D89976.
2020-10-23 08:43:45 -04:00
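The range limit described in c72198079d above can be seen by enumerating a small type. This is a sketch assuming an 8-bit width and the variant of cttz that is defined for zero input; `cttz` and `cttz_range` are hypothetical helpers, not the ValueTracking code.

```python
# Enumerate cttz over every 8-bit value to find its constant range.
WIDTH = 8  # assumption: small width so all values can be enumerated
MASK = (1 << WIDTH) - 1


def cttz(value: int) -> int:
    """Count trailing zeros; returns WIDTH for zero input."""
    value &= MASK
    if value == 0:
        return WIDTH
    return (value & -value).bit_length() - 1


def cttz_range() -> tuple:
    """Return the (min, max) of cttz over all WIDTH-bit values."""
    results = [cttz(v) for v in range(1 << WIDTH)]
    return min(results), max(results)


print(cttz_range())  # (0, 8)
```

With the maximum known to be the bit width, a comparison such as "cttz(X) ugt 8" on an 8-bit value can never be true, which is the kind of "icmp pred cttz(X), C" pattern the commit message refers to.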
Jay Foad 86a480e9ce [AMDGPU] Add simplification/combines for llvm.amdgcn.fmul.legacy
Differential Revision: https://reviews.llvm.org/D88955
2020-10-23 09:31:00 +01:00
Vedant Kumar 3419252a79 [InstCombine] Remove dbg.values describing contents of dead allocas
When InstCombine removes an alloca, it erases the dbg.{addr,declare}
instructions which refer to the alloca. It would be better to instead
remove all debug intrinsics which describe the contents of the dead
alloca, namely all dbg.value(<dead alloca>, ..., DW_OP_deref)'s.

This effectively undoes work performed in an InstCombine run earlier in
the pipeline by LowerDbgDeclare, which inserts DW_OP_deref dbg.values
before CallInst users of an alloca. The motivating example looks like:

```
  define void @foo(i32 %0) {
    %a = alloca i32              ; This alloca is erased.
    store i32 %0, i32* %a
    dbg.value(i32 %0, "arg0")    ; This dbg.value survives.
    dbg.value(i32* %a, "arg0", DW_OP_deref)
    call void @trivially_inlinable_no_op(i32* %a)
    ret void
  }
```

If the DW_OP_deref dbg.value is not erased, it becomes dbg.value(undef)
after inlining, making "arg0" unavailable. But we already have dbg.value
descriptions of the alloca's value (from LowerDbgDeclare), so the
DW_OP_deref dbg.value cannot serve its purpose of describing an
initialization of the alloca by some callee. It invalidates other useful
dbg.values, causing large gaps in location coverage, so we should delete
it (even though doing so may cause stale dbg.values to appear, if
there's a dead store to `%a` in @trivially_inlinable_no_op).

OTOH, it wouldn't be correct to delete all dbg.value descriptions of an
alloca. Note that it's possible to describe a variable that takes on
different pointer values, e.g.:

```
  void use(int *);
  void t(int a, int b) {
    int *local = &a;     // dbg.value(i32* %a.addr, "local")
    local = &b;          // dbg.value(i32* undef, "local")
    use(&a);             //           (note: %b.addr is optimized out)
    local = &a;          // dbg.value(i32* %a.addr, "local")
  }
```

In this example, the alloca for "b" is erased, but we need to describe
the value of "local" as <unavailable> before the call to "use". This
prevents "local" from appearing to be equal to "&a" at the callsite.

rdar://66592859

Differential Revision: https://reviews.llvm.org/D85555
2020-10-22 10:00:13 -07:00
Quentin Colombet ee6abef532 [ValueTracking] Interpret GEPs as a series of adds multiplied by the related scaling factor
Prior to this patch, computeKnownBits would only try to deduce trailing zero
bits for getelementptrs. This patch adds the logic to treat GEPs as a series
of adds, each multiplied by its scaling factor.

Thanks to this patch, using a gep or performing an address computation
directly "by hand" (ptrtoint followed by adds and mul followed by inttoptr)
offers the same computeKnownBits information.

Previously, the "by hand" approach would have given more information.

This is related to https://llvm.org/PR47241.

Differential Revision: https://reviews.llvm.org/D86364
2020-10-21 15:07:04 -07:00
Martin Storsjö 4de215ff18 Revert "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support"
Also revert "[InstCombine] foldOrOfICmps - use m_Specific instead of
explicit comparisons. NFCI." to make the primarily intended revert
work.

This reverts commits ce13549761 and
e372a5f86f.

This commit caused failed asserts e.g. like this:

$ cat repro.cpp
bool a(char b) {
  return b >= '0' && b <= '9' || (b | 32) >= 'a' && (b | 32) <= 'z';
}
$ clang++ -target x86_64-linux-gnu -c -O2 repro.cpp
clang++: ../include/llvm/ADT/APInt.h:1151: bool llvm::APInt::operator==(const
llvm::APInt&) const: Assertion `BitWidth == RHS.BitWidth && "Comparison
requires equal bit widths"' failed.
2020-10-21 09:47:18 +03:00
Shimin Cui 95bda510fb [ConstantFold] Fold the comparison of bitcasted global values
This simplifies icmp instructions of the form:

%cmp = icmp eq i32 (i8*, i8*)* bitcast (i32 (i32**, i32**)* @f32 to i32
(i8*, i8*)*), bitcast (i32 (i64**, i64**)* @f64 to i32 (i8*, i8*)*)

Here @f32 and @f64 are two functions.

Differential Revision: https://reviews.llvm.org/D87850
2020-10-20 12:41:49 -07:00
Simon Pilgrim bf540a64f3 [InstCombine] Add (icmp ult (X + CA), C1) | (icmp eq X, C2) -> (icmp ule (X + CA), C1) test coverage
Add both commuted variants and vector uniform/nonuniform examples
2020-10-20 15:16:47 +01:00
Simon Pilgrim e372a5f86f [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support
Reapplied rGa704d8238c86 with a check for integer/integervector types to prevent matching with pointer types
2020-10-20 14:14:26 +01:00
sstefan1 fbfb1c7909 [IR] Make nosync, nofree and willreturn default for intrinsics.
D70365 allows us to make attributes default. This is a follow up to
actually make nosync, nofree and willreturn default. The approach we
chose, for now, is to opt-in to default attributes to avoid introducing
problems to target specific intrinsics. Intrinsics with default
attributes can be created using `DefaultAttrsIntrinsic` class.
2020-10-20 11:57:19 +02:00
Simon Pilgrim 482e6f0041 Revert rGa704d8238c86bac: "[InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support"
This reverts commit a704d8238c.

Causing stage2 build failures on some bots.
2020-10-19 16:03:36 +01:00
Simon Pilgrim de885f1b2a [InstCombine] Add (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) vector support
Scalar cases were already being handled by foldLogOpOfMaskedICmps (so this was dead code), but refactoring to support non-uniform vectors will take some time, so tweak this fold in the meantime.
2020-10-19 15:41:21 +01:00
Simon Pilgrim ecd25086d1 [InstCombine] Add (icmp eq B, 0) | (icmp ult/gt A, B) -> (icmp ule A, B-1) vector support 2020-10-19 15:23:48 +01:00
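The scalar form of the fold in ecd25086d1 above can be checked exhaustively. This is a sketch assuming unsigned 8-bit values, with `B-1` wrapping as it would in IR; the helper names are hypothetical and the vector handling added by the commit is not modelled.

```python
# Check (B == 0) | (A u< B)  ==  A u<= (B - 1) with wrapping subtraction.
WIDTH = 8  # assumption: small width so all pairs can be enumerated
MASK = (1 << WIDTH) - 1


def before(a: int, b: int) -> bool:
    """(icmp eq B, 0) | (icmp ult A, B)."""
    return b == 0 or a < b


def after(a: int, b: int) -> bool:
    """icmp ule A, B-1, where B-1 wraps to all-ones when B is zero."""
    return a <= ((b - 1) & MASK)


def fold_holds() -> bool:
    return all(
        before(a, b) == after(a, b)
        for a in range(1 << WIDTH)
        for b in range(1 << WIDTH)
    )


print(fold_holds())
```

The B == 0 case is absorbed by the wrap: B-1 becomes the maximum unsigned value, so the ule comparison is always true.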
Simon Pilgrim a704d8238c [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) uniform vector support 2020-10-19 14:55:18 +01:00
Simon Pilgrim 3ad9361254 [InstCombine] Add or((icmp ult/ule (A + C1), C3), (icmp ult/ule (A + C2), C3)) vector tests 2020-10-19 14:28:08 +01:00
Simon Pilgrim aba7275bb3 [InstCombine] Add (icmp ne A, 0) | (icmp ne B, 0) --> (icmp ne (A|B), 0) tests 2020-10-19 13:42:53 +01:00
Simon Pilgrim 3dd2f02bb0 [InstCombine] Add (icmp eq B, 0) | (icmp ult A, B) -> (icmp ule A, B-1) vector tests 2020-10-19 11:48:32 +01:00
Simon Pilgrim 0b7b446a40 [InstCombine] Support vectors-with-undef in and(logicalshift(1,X),1) --> zext(X == 0) fold 2020-10-19 11:10:32 +01:00
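The fold named in 0b7b446a40 above is easy to confirm for in-range shift amounts. This is a scalar sketch assuming an 8-bit width and shift amounts below the bit width (larger amounts are not modelled); undef vector lanes, which are the subject of the commit, are outside its scope.

```python
# Check and(shl(1, X), 1) and and(lshr(1, X), 1) against zext(X == 0).
WIDTH = 8  # assumption: small width; shift amounts are limited to [0, WIDTH)
MASK = (1 << WIDTH) - 1


def shl_form(x: int) -> int:
    """and(shl(1, X), 1)."""
    return ((1 << x) & MASK) & 1


def lshr_form(x: int) -> int:
    """and(lshr(1, X), 1)."""
    return (1 >> x) & 1


def zext_is_zero(x: int) -> int:
    """zext(X == 0)."""
    return 1 if x == 0 else 0


def fold_holds() -> bool:
    return all(
        shl_form(x) == zext_is_zero(x) and lshr_form(x) == zext_is_zero(x)
        for x in range(WIDTH)
    )


print(fold_holds())
```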
Simon Pilgrim 2d1fea2923 [InstCombine] Add vectors-with-undef tests for and(logicalshift(1,X),1) --> zext(X == 0) 2020-10-19 11:10:31 +01:00
Sanjay Patel 53e92b4c0e [InstCombine] (~A & B) ^ A -> A | B
Differential Revision: https://reviews.llvm.org/D86395
2020-10-17 12:20:18 -04:00
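The identity behind 53e92b4c0e above, (~A & B) ^ A -> A | B, can be verified by enumeration. This is a sketch assuming 8-bit operands; `fold_holds` is a hypothetical helper.

```python
# Exhaustively check (~A & B) ^ A == A | B on 8-bit values.
WIDTH = 8  # assumption: small width so all pairs can be enumerated
MASK = (1 << WIDTH) - 1


def before(a: int, b: int) -> int:
    """(~A & B) ^ A, with ~A confined to WIDTH bits."""
    return ((~a & MASK) & b) ^ a


def after(a: int, b: int) -> int:
    """A | B."""
    return a | b


def fold_holds() -> bool:
    return all(
        before(a, b) == after(a, b)
        for a in range(1 << WIDTH)
        for b in range(1 << WIDTH)
    )


print(fold_holds())
```

The two xor operands never share a set bit, since `~A & B` only keeps bits where A is clear, so the xor behaves like an or.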
Simon Pilgrim fe8281e2d0 [InstCombine] visitAnd - add some ((val OP C1) & C2) vector test coverage 2020-10-16 15:43:11 +01:00
Simon Pilgrim ef0ab3cdfe [InstCombine] Fix typo in narrow funnel shift test 2020-10-16 12:18:16 +01:00
Simon Pilgrim 76996470ef [InstCombine] Add trunc+zext 'narrow' funnel shift tests (PR35155)
Based on the rotation equivalents in rotate.ll
2020-10-16 12:06:47 +01:00
Simon Pilgrim 55991b44b7 [InstCombine] foldAndOrOfICmpsOfAndWithPow2 - add vector support
Support vector cases for folding:

 (iszero(A & K1) | iszero(A & K2)) -> (A & (K1 | K2)) != (K1 | K2)
 (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) == (K1 | K2)
2020-10-16 10:41:40 +01:00
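Both folds listed in 55991b44b7 above can be checked in scalar form. This is a sketch assuming 8-bit values and that K1 and K2 are powers of two, as the function name foldAndOrOfICmpsOfAndWithPow2 suggests; the helper names are hypothetical.

```python
# Check the iszero/pow2 folds for every 8-bit A and every pair of powers of two.
WIDTH = 8  # assumption: small width so all cases can be enumerated
POWERS_OF_TWO = [1 << i for i in range(WIDTH)]


def or_form_holds(a: int, k1: int, k2: int) -> bool:
    """(iszero(A & K1) | iszero(A & K2)) vs (A & (K1 | K2)) != (K1 | K2)."""
    before = (a & k1) == 0 or (a & k2) == 0
    after = (a & (k1 | k2)) != (k1 | k2)
    return before == after


def and_form_holds(a: int, k1: int, k2: int) -> bool:
    """(!iszero(A & K1) & !iszero(A & K2)) vs (A & (K1 | K2)) == (K1 | K2)."""
    before = (a & k1) != 0 and (a & k2) != 0
    after = (a & (k1 | k2)) == (k1 | k2)
    return before == after


def folds_hold() -> bool:
    return all(
        or_form_holds(a, k1, k2) and and_form_holds(a, k1, k2)
        for a in range(1 << WIDTH)
        for k1 in POWERS_OF_TWO
        for k2 in POWERS_OF_TWO
    )


print(folds_hold())
```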
Sanjay Patel 77fb8cbd60 [InstCombine] update tests for logic folds to exercise commuted patterns; NFC
This was the intent for D88551.
I also varied the types a bit for extra coverage
and tried to make better test/value names.
2020-10-15 14:37:49 -04:00
Simon Pilgrim 60ba9233d1 Revert rG25a97c3a43d7 - "[InstCombine] visitCallInst - retain undefs in vector funnel shift amounts"
This reverts commit 25a97c3a43.

We have other constant folds that fold undef funnel shift amounts to 0 - so we need to be consistent.

If we end up with regressions where we lose a splat shift amount pattern we'll have to investigate other canonicalizations, but matchFunnelShift currently protects us from that.
2020-10-14 18:14:37 +01:00
Matt Arsenault 6a9484f4bf InstCombine: Fix losing load properties in copy-constant-to-alloca
Preserve the alignment and metadata. Atomic loads are skipped for
this, but pass along the properties for consistency.
2020-10-14 12:55:25 -04:00
Matt Arsenault 6da31fa4a6 InstCombine: Fix infinite loop in copy-constant-to-alloca transform
This was broken by 16295d521e, when
instructions started being handled and not just constant
expressions. This was re-inserting an equivalent bitcast to the
original memcpy operand, which made a non-functional IR change on
every iteration.

This also fixes a secondary problem where it was inserting
addrspacecasts which may not have been legal (i.e. it changed the
source address space). Start visiting all pointer users and fail out
if we can't process them. Also start handling the relevant memory
intrinsic users. These cases can be dealt with by running
InferAddressSpaces separately.
2020-10-14 12:55:25 -04:00
Simon Pilgrim 25a97c3a43 [InstCombine] visitCallInst - retain undefs in vector funnel shift amounts
By always performing a modulo on the shift amount constants, this was causing undef amounts to be replaced with zero, meaning we were losing funnel shift by splat (with undef) patterns.

Tweaked the shift amount bounds check to support (passthrough) undefs, and use Constant::mergeUndefsWith to preserve the undefs after folding.
2020-10-14 14:38:21 +01:00
Tim Northover 630d264798 Analysis: only query size of sized objects.
Recently we started looking into sret parameters, though the issue could crop
up elsewhere. If the pointee type is opaque, we should not try to compute its
size because that leads to an assertion failure.
2020-10-14 12:16:05 +01:00
Simon Pilgrim 9b4db7f733 [InstCombine] Add undef funnel shift amount test coverage 2020-10-14 11:58:37 +01:00
Simon Pilgrim 1e4d882f9a [InstCombine] matchFunnelShift - add support for non-uniform vectors containing undefs.
Replace m_SpecificInt with m_APIntAllowUndef to match splats containing undefs, then use Constant::mergeUndefsWith to merge the undefs together in the result.

The undef funnel shift amounts are getting replaced with zero later on - I'll address this in a later patch, otherwise we lose potential shift by splat value patterns.
2020-10-14 10:42:27 +01:00
Simon Pilgrim 9c3138bd6d [InstCombine] visitTrunc - pass through undefs for trunc(shift(trunc/ext(x),c)) patterns
Based on the recent patches D88475 and D88429 where we are losing undef values due to extension/comparisons.

I've added a Constant::mergeUndefsWith method that merges the undef scalar/elements from another Constant into a specific Constant.

Differential Revision: https://reviews.llvm.org/D88687
2020-10-13 14:35:18 +01:00
Simon Pilgrim 5df61724a1 [InstCombine] Support uniform vector splats in ((((X >> C) & CC) + Y) << C) folds.
Add support for uniform vector splats (no undefs).
2020-10-13 09:28:39 +01:00
Dávid Bolvanský 2f66bfac28 [Tests] Regenerate test checks; NFC 2020-10-12 17:55:00 +02:00