Commit Graph

3436 Commits

Author SHA1 Message Date
Roman Lebedev 7a67ed5795 [InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251)
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.

In this particular case, given
```
char* test(char& base, unsigned long offset) {
  return &base - offset;
}
```
it will end up producing something like
https://godbolt.org/z/luGEju
which after optimizations reduces down to roughly
```
declare void @use64(i64)
define i1 @test(i8* dereferenceable(1) %base, i64 %offset) {
  %base_int = ptrtoint i8* %base to i64
  %adjusted = sub i64 %base_int, %offset
  call void @use64(i64 %adjusted)
  %not_null = icmp ne i64 %adjusted, 0
  %no_underflow = icmp ule i64 %adjusted, %base_int
  %no_underflow_and_not_null = and i1 %not_null, %no_underflow
  ret i1 %no_underflow_and_not_null
}
```
Without D67122 there was no `%not_null`,
and in this particular case we can "get rid of it", by merging two checks:
Here we are checking: `Base u>= Offset && (Base u- Offset) != 0`, but that is simply `Base u> Offset`

Alive proofs:
https://rise4fun.com/Alive/QOs

The `@llvm.usub.with.overflow` pattern itself is not handled here
because this is the main pattern, that we currently consider canonical.

https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00, majnemer

Reviewed By: xbolva00, majnemer

Subscribers: vsk, majnemer, xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67356

llvm-svn: 372341
2019-09-19 17:25:19 +00:00
Roman Lebedev b646dd92c2 [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251)
Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.

This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.

The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..

Proofs: https://rise4fun.com/Alive/21b

Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)

With this, i see no other missing folds for
https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67412

llvm-svn: 372257
2019-09-18 20:10:07 +00:00
Roman Lebedev ba4cad9039 [InstCombine] dropRedundantMaskingOfLeftShiftInput(): some cleanup before upcoming patch
llvm-svn: 372245
2019-09-18 18:38:40 +00:00
Roman Lebedev 97bc5ae993 [NFC][InstCombine] dropRedundantMaskingOfLeftShiftInput(): some NFC diff shaving
llvm-svn: 372171
2019-09-17 19:32:26 +00:00
David Bolvansky be2487a2ba [InstCombine] Annotate strdup with deref_or_null
llvm-svn: 372098
2019-09-17 10:12:48 +00:00
Sanjay Patel 3961a143e1 [InstCombine] remove unneeded one-use checks for icmp fold
Related folds were added in:
rL125734
...the code comment about register pressure is discussed in
more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698

But 10 years later, perf testing bzip2 with this change now
shows a slight (0.2% average) improvement on Haswell although
that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

llvm-svn: 372007
2019-09-16 16:15:25 +00:00
Sanjay Patel c5cd808156 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

llvm-svn: 371981
2019-09-16 12:54:34 +00:00
Sanjay Patel 91c2cd0691 [InstCombine] fix comments to match code; NFC
This blob was written before match() existed, so it
could probably be reduced significantly.

But I suspect it isn't well tested, so tests would have
to be added to reduce risk from logic changes.

llvm-svn: 371978
2019-09-16 12:12:05 +00:00
Sanjay Patel 3daf168fa9 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

There are similar checks as noted with the TODO comments. I'm
hoping to remove those restrictions too, but if any of these
does cause a regression, it should be easier to correct by making
small, individual commits.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

llvm-svn: 371940
2019-09-15 20:56:34 +00:00
Sanjay Patel 2bfb955c51 [InstCombine] rename variable for readability; NFC
There's more that can be done here, but "OpI"
doesn't convey that we casted to BinaryOperator.

llvm-svn: 371682
2019-09-11 22:31:34 +00:00
Florian Hahn 51de22c8ee Revert [InstCombine] Use SimplifyFMulInst to simplify multiply in fma.
This introduces additional rounding error in some cases. See D67434.

This reverts r371518 (git commit 18a1f0818b)

llvm-svn: 371634
2019-09-11 16:17:03 +00:00
Sanjay Patel 80bea345d1 [InstCombine] fold sign-bit compares of srem
(srem X, pow2C) sgt/slt 0 can be reduced using bit hacks by masking
off the sign bit and the module (low) bits:
https://rise4fun.com/Alive/jSO
A '2' divisor allows slightly more folding:
https://rise4fun.com/Alive/tDBM

Any chance to remove an 'srem' use is probably worthwhile, but this is limited
to the one-use improvement case because doing more may expose other missing
folds. That means it does nothing for PR21929 yet:
https://bugs.llvm.org/show_bug.cgi?id=21929

Differential Revision: https://reviews.llvm.org/D67334

llvm-svn: 371610
2019-09-11 12:04:26 +00:00
David Bolvansky 4dae283cd3 [InstCombine] Fixed handling of isOpNewLike (PR11748)
llvm-svn: 371602
2019-09-11 10:37:03 +00:00
Alexey Lapshin 6b1c6c1287 [Debuginfo][Instcombiner] Do not clone dbg.declare.
TryToSinkInstruction() has a bug: While updating debug info for
sunk instruction, it could clone dbg.declare intrinsic.
That is wrong. There could be only one dbg.declare.
The fix is to not clone dbg.declare intrinsic and to update
it`s arguments, to not to point to sunk instruction.

Differential Revision: https://reviews.llvm.org/D67217

llvm-svn: 371587
2019-09-11 06:07:16 +00:00
Florian Hahn 18a1f0818b [InstCombine] Use SimplifyFMulInst to simplify multiply in fma.
This allows us to fold fma's that multiply with 0.0. Also, the
multiply by 1.0 case is handled there as well. The fneg/fabs cases
are not handled by SimplifyFMulInst, so we need to keep them.

Reviewers: spatel, anemet, lebedev.ri

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D67351

llvm-svn: 371518
2019-09-10 13:10:28 +00:00
Sanjay Patel aff5bee35f [InstCombine] fold extract+insert into identity shuffle
This is similar to the existing fold for splats added with:
rL365379

If we can adjust the shuffle mask to include another element
in an identity mask (if it changes vector length, that's an
extract/insert subvector operation in the backend), then that
can eliminate extractelement/insertelement pairs in IR.

All targets are expected to lower shuffles with identity masks
efficiently.

llvm-svn: 371340
2019-09-08 19:03:01 +00:00
Teresa Johnson 9c27b59cec Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

llvm-svn: 371284
2019-09-07 03:09:36 +00:00
Matt Arsenault 524a9d5774 InstCombine: Fix crash on icmp of gep with addrspacecasted null
llvm-svn: 371146
2019-09-05 23:39:21 +00:00
Roman Lebedev 8360c42e25 [InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned sub overflow' check
A follow-up for r329011.
This may be changed to produce @llvm.sub.with.overflow in a later patch,
but for now just make things more consistent overall.

A few observations stem from this:
* There does not seem to be a similar one-instruction fold for uadd-overflow
* I'm not sure we'll want to canonicalize `B u> A` as `usub.with.overflow`,
  so since the `icmp` here no longer refers to `sub`,
  reconstructing `usub.with.overflow` will be problematic,
  and will likely require standalone pass (similar to DivRemPairs).

https://rise4fun.com/Alive/Zqs

Name: (A - B) u> A --> B u> A
  %t0 = sub i8 %A, %B
  %r = icmp ugt i8 %t0, %A
=>
  %r = icmp ugt i8 %B, %A

Name: (A - B) u<= A --> B u<= A
  %t0 = sub i8 %A, %B
  %r = icmp ule i8 %t0, %A
=>
  %r = icmp ule i8 %B, %A

Name: C u< (C - D) --> C u< D
  %t0 = sub i8 %C, %D
  %r = icmp ult i8 %C, %t0
=>
  %r = icmp ult i8 %C, %D

Name: C u>= (C - D) --> C u>= D
  %t0 = sub i8 %C, %D
  %r = icmp uge i8 %C, %t0
=>
  %r = icmp uge i8 %C, %D

llvm-svn: 371101
2019-09-05 17:41:02 +00:00
Roman Lebedev ecb7ea1ae7 [InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned add overflow' check
A follow-up for r342004.
This will be changed to produce @llvm.add.with.overflow in a later patch,
but for now just make things more consistent overall.

https://rise4fun.com/Alive/qxE

Name: (Op1 + X) u< Op1 --> ~Op1 u< X
  %t0 = add i8 %Op1, %X
  %r = icmp ult i8 %t0, %Op1
=>
  %n = xor i8 %Op1, -1
  %r = icmp ult i8 %n, %X

Name: (Op1 + X) u>= Op1 --> ~Op1 u>= X
  %t0 = add i8 %Op1, %X
  %r = icmp uge i8 %t0, %Op1
=>
  %n = xor i8 %Op1, -1
  %r = icmp uge i8 %n, %X

;-------------------------------------------------------------------------------

Name: Op0 u> (Op0 + X) --> X u> ~Op0
  %t0 = add i8 %Op0, %X
  %r = icmp ugt i8 %Op0, %t0
=>
  %n = xor i8 %Op0, -1
  %r = icmp ugt i8 %X, %n

Name: Op0 u<= (Op0 + X) --> X u<= ~Op0
  %t0 = add i8 %Op0, %X
  %r = icmp ule i8 %Op0, %t0
=>
  %n = xor i8 %Op0, -1
  %r = icmp ule i8 %X, %n

llvm-svn: 371100
2019-09-05 17:40:49 +00:00
David Bolvansky 420cbb6190 [InstCombine] sub(xor(x, y), or(x, y)) -> neg(and(x, y))
Summary:
```
Name: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
%or = or i32 %y, %x
%xor = xor i32 %x, %y
%sub = sub i32 %xor, %or
  =>
%sub1 = and i32 %x, %y
%sub = sub i32 0, %sub1

Optimization: sub(xor(x, y), or(x, y)) -> neg(and(x, y))
Done: 1
Optimization is correct!
```

https://rise4fun.com/Alive/8OI

Reviewers: lebedev.ri

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67188

llvm-svn: 370945
2019-09-04 18:03:21 +00:00
David Bolvansky 0e07248704 [InstCombine] Fold sub (and A, B) (or A, B)) to neg (xor A, B)
Summary:
```
Name: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
%or = or i32 %y, %x
%and = and i32 %x, %y
%sub = sub i32 %and, %or
  =>
%sub1 = xor i32 %x, %y
%sub = sub i32 0, %sub1

Optimization: sub(and(x, y), or(x, y)) -> neg(xor(x, y))
Done: 1
Optimization is correct!
```

https://rise4fun.com/Alive/VI6

Found by @lebedev.ri. Also author of the proof.

Reviewers: lebedev.ri, spatel

Reviewed By: lebedev.ri

Subscribers: llvm-commits, lebedev.ri

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67155

llvm-svn: 370934
2019-09-04 17:30:53 +00:00
David Bolvansky 358b80b340 [InstCombine] Fold sub (or A, B) (and A, B) to (xor A, B)
Summary:
```
Name: sub or and to xor
%or = or i32 %y, %x
%and = and i32 %x, %y
%sub = sub i32 %or, %and
  =>
%sub = xor i32 %x, %y

Optimization: sub or and to xor
Done: 1
Optimization is correct!
```
https://rise4fun.com/Alive/eJu

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67153

llvm-svn: 370883
2019-09-04 12:00:33 +00:00
Sanjay Patel 561c39994b [InstCombine] recognize bswap disguised as shufflevector
bitcast <N x i8> (shuf X, undef, <N, N-1,...0>) to i{N*8} --> bswap (bitcast X to i{N*8})

In PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146
...we have a more complicated case where SLP is making a mess of bswap. This patch won't
do anything for that currently, but we need to improve bswap recognition in instcombine,
SLP, and/or a standalone pass to avoid that problem.

This is limited using the data-layout so we don't try to do this transform with actual
vector types. The backend does not appear to have folds to convert in either direction,
so we don't want to mess up something that is actually better lowered as a shuffle.

On x86, we're trading something like this:

  vmovd	%edi, %xmm0
  vpshufb	LCPI0_0(%rip), %xmm0, %xmm0 ## xmm0 = xmm0[3,2,1,0,u,u,u,u,u,u,u,u,u,u,u,u]
  vmovd	%xmm0, %eax

For:

  movl	%edi, %eax
  bswapl	%eax

Differential Revision: https://reviews.llvm.org/D66965

llvm-svn: 370659
2019-09-02 13:33:20 +00:00
Piotr Sobczak 67b979466a [InstCombine][AMDGPU] Simplify tbuffer loads
Summary: Add missing tbuffer loads intrinsics in SimplifyDemandedVectorElts.

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66926

llvm-svn: 370475
2019-08-30 14:20:04 +00:00
Sanjay Patel 65f1c04000 [InstCombine] reduce duplicated code; NFC
llvm-svn: 370399
2019-08-29 19:36:18 +00:00
Roman Lebedev 473a063a5e [InstCombine] Fold '((%x * %y) u/ %x) != %y' to '@llvm.umul.with.overflow' + overflow bit extraction
Summary:
`((%x * %y) u/ %x) != %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
$ /repositories/alive2/build-Clang-unknown/alive -root-only ~/llvm-patch1.ll
Processing /home/lebedevri/llvm-patch1.ll..

----------------------------------------
Name: no overflow
  %o0 = mul i4 %y, %x
  %o1 = udiv i4 %o0, %x
  %r = icmp ne i4 %o1, %y
  ret i1 %r
=>
  %n0 = umul_overflow i4 %x, %y
  %o0 = extractvalue {i4, i1} %n0, 0
  %o1 = udiv %o0, %x
  %r = extractvalue {i4, i1} %n0, 1
  ret %r

Done: 1
Optimization is correct!

----------------------------------------
Name: no overflow
  %o0 = mul i4 %y, %x
  %o1 = udiv i4 %o0, %x
  %r = icmp eq i4 %o1, %y
  ret i1 %r
=>
  %n0 = umul_overflow i4 %x, %y
  %o0 = extractvalue {i4, i1} %n0, 0
  %o1 = udiv %o0, %x
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1
  ret i1 %r

Done: 1
Optimization is correct!

```

Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65144

llvm-svn: 370348
2019-08-29 12:47:20 +00:00
Roman Lebedev fb38b7aab3 [InstCombine] Fold '(-1 u/ %x) u< %y' to '@llvm.umul.with.overflow' + overflow bit extraction
Summary:
`(-1 u/ %x) u< %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
----------------------------------------
Name: no overflow
  %o0 = udiv i4 -1, %x
  %r = icmp ult i4 %o0, %y
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %n0, 1

Done: 1
Optimization is correct!

----------------------------------------
Name: no overflow, swapped
  %o0 = udiv i4 -1, %x
  %r = icmp ugt i4 %y, %o0
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %n0, 1

Done: 1
Optimization is correct!

----------------------------------------
Name: overflow
  %o0 = udiv i4 -1, %x
  %r = icmp uge i4 %o0, %y
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1

Done: 1
Optimization is correct!

----------------------------------------
Name: overflow
  %o0 = udiv i4 -1, %x
  %r = icmp ule i4 %y, %o0
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1

Done: 1
Optimization is correct!
```

As it can be observed from tests, while simply forming the `@llvm.umul.with.overflow`
is easy, if we were looking for the inverted answer, then more work needs to be done
to cleanup the now-pointless control-flow that was guarding against division-by-zero.
This is being addressed in follow-up patches.

Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon

Reviewed By: nikic, xbolva00

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65143

llvm-svn: 370347
2019-08-29 12:47:08 +00:00
Roman Lebedev f13b0e3ed8 [InstCombine] Shift amount reassociation in bittest: trunc-of-lshr (PR42399)
Summary:
Finally, the fold i was looking forward to :)

The legality check is muddy, i doubt  i've groked the full generalization,
but it handles all the cases i care about, and can come up with:
https://rise4fun.com/Alive/26j

I.e. we can perform the fold if **any** of the following is true:
* The shift amount is either zero or one less than widest bitwidth
* Either of the values being shifted has at most lowest bit set
* The value that is being shifted by `shl` (which is not truncated) should have no less leading zeros than the total shift amount;
* The value that is being shifted by `lshr` (which **is** truncated) should have no less leading zeros than the widest bit width minus total shift amount minus one

I strongly suspect there is some better generalization, but i'm not aware of it as of right now.
For now i also avoided using actual `computeKnownBits()`, but restricted it to constants.

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66383

llvm-svn: 370324
2019-08-29 10:26:23 +00:00
Simon Pilgrim ef9c6a7077 Fix variable set but no used warning on NDEBUG builds. NFCI.
llvm-svn: 370317
2019-08-29 09:58:47 +00:00
Sanjay Patel 2d4b6777c4 [InstCombine] clean up wrap propagation for reassociated ops; NFCI
Always true/false checks were flagged by static analysis;
https://bugs.llvm.org/show_bug.cgi?id=43143

I have not confirmed the logic difference in propagating nsw vs. nuw,
but presumably we would have noticed a bug by now if that was wrong.

llvm-svn: 370248
2019-08-28 18:58:06 +00:00
Simon Pilgrim 1d8a886c59 Reduce scope of variable only used in a local pattern match. NFCI.
llvm-svn: 370224
2019-08-28 16:06:33 +00:00
Craig Topper f79d8a064c [InstCombine] Disable recursion in foldGEPICmp for vector pointer GEPs
Due to missing vector support in this function, recursion can
generate worse code in some cases.

llvm-svn: 370221
2019-08-28 15:40:34 +00:00
Simon Pilgrim b569624049 Fix uninitialized variable warning in cppcheck. NFCI.
InstCombiner::MaxArraySizeForCombine is set outside the constructor so we need to ensure it has a default initialization value.

llvm-svn: 370220
2019-08-28 15:19:49 +00:00
David Bolvansky af118bb6d0 [NFC] Added a comment to avoid possible confusion
llvm-svn: 370217
2019-08-28 15:04:48 +00:00
Simon Pilgrim 316bfb0f48 Remove duplicate 'BitWidth' variable. NFCI.
llvm-svn: 370212
2019-08-28 14:37:44 +00:00
Simon Pilgrim 284118ce3b InstCombiner::visitSelectInst - rename Pred to MinMaxPred to stop shadow variable warning. NFCI.
We have a lot of Predicate variables, all similarly named....

llvm-svn: 370207
2019-08-28 14:05:38 +00:00
David Bolvansky 05bda8b4e5 Annotate return values of allocation functions with dereferenceable_or_null
Summary:
Example
define dso_local noalias i8* @_Z6maixxnv() local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(64) i8* @malloc(i64 64) #6
  ret i8* %call
}


Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: aaron.ballman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66651

llvm-svn: 370168
2019-08-28 08:28:20 +00:00
Craig Topper 5bbb604bb5 [InstCombine] Disable some portions of foldGEPICmp for GEPs that return a vector of pointers. Fix other portions.
llvm-svn: 370114
2019-08-27 21:38:56 +00:00
David Bolvansky 0c2692108c [InstCombine] Fold select with ctlz to cttz
Summary:
Handle pattern [0]:

int ctz(unsigned int a)
{
  int c = __clz(a & -a);
  return a ? 31 - c : c;
}

In reality, the compiler can generate much better code for cttz, so fold away this pattern.

https://godbolt.org/z/c5kPtV

 [0] https://community.arm.com/community-help/f/discussions/2114/count-trailing-zeros

Reviewers: spatel, nikic, lebedev.ri, dmgreen, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66308

llvm-svn: 370037
2019-08-27 10:22:40 +00:00
Vitaly Buka aeca56964f msan, codegen, instcombine: Keep more lifetime markers used for msan
Reviewers: eugenis

Subscribers: hiraditya, cfe-commits, #sanitizers, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D66695

llvm-svn: 369979
2019-08-26 22:15:50 +00:00
Philip Reames b92c971099 [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null for vectors
Extend the transform introduced in https://reviews.llvm.org/D66608 to work for vector geps as well.

Differential Revision: https://reviews.llvm.org/D66671

llvm-svn: 369949
2019-08-26 19:11:49 +00:00
Roman Lebedev 9cf08c6de1 [Constant] Add 'isElementWiseEqual()' method
Promoting it from InstCombine's tryToReuseConstantFromSelectInComparison().

Return true if this constant and a constant 'Y' are element-wise equal.
This is identical to just comparing the pointers, with the exception that
for vectors, if only one of the constants has an `undef` element in some
lane, the constants still match.

llvm-svn: 369842
2019-08-24 06:49:51 +00:00
Roman Lebedev de19f749e0 [InstCombine] matchThreeWayIntCompare(): commutativity awareness
Summary:
`matchThreeWayIntCompare()` looks for
```
   select i1 (a == b),
          i32 Equal,
          i32 (select i1 (a < b), i32 Less, i32 Greater)
```
but both of these selects/compares can be in it's commuted form,
so out of 8 variants, only the two most basic ones is handled.
This fixes regression being introduced in D66232.

Reviewers: spatel, nikic, efriedma, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66607

llvm-svn: 369841
2019-08-24 06:49:36 +00:00
Roman Lebedev 2c75fe7f2a [InstCombine] Try to reuse constant from select in leading comparison
Summary:
If we have e.g.:
```
  %t = icmp ult i32 %x, 65536
  %r = select i1 %t, i32 %y, i32 65535
```
the constants `65535` and `65536` are suspiciously close.
We could perform a transformation to deduplicate them:
```
Name: ult
%t = icmp ult i32 %x, 65536
%r = select i1 %t, i32 %y, i32 65535
  =>
%t.inv = icmp ugt i32 %x, 65535
%r = select i1 %t.inv, i32 65535, i32 %y
```
https://rise4fun.com/Alive/avb

While this may seem esoteric, this should certainly be good for vectors
(less constant pool usage) and for opt-for-size - need to have only one constant.

But the real fun part here is that it allows further transformation,
in particular it finishes cleaning up the `clamp` folding,
see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`.
We start with e.g.
```
  %dont_need_to_clamp_positive = icmp sle i32 %X, 32767
  %dont_need_to_clamp_negative = icmp sge i32 %X, -32768
  %clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767
  %dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative
  %R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit
```
without this patch we currently produce
```
  %1 = icmp slt i32 %X, 32768
  %2 = icmp sgt i32 %X, -32768
  %3 = select i1 %2, i32 %X, i32 -32768
  %R = select i1 %1, i32 %3, i32 32767
```
which isn't really a `clamp` - both comparisons are performed on the original value,
this patch changes it into
```
  %1.inv = icmp sgt i32 %X, 32767
  %2 = icmp sgt i32 %X, -32768
  %3 = select i1 %2, i32 %X, i32 -32768
  %R = select i1 %1.inv, i32 32767, i32 %3
```
and then the magic happens! Some further transform finishes polishing it and we finally get:
```
  %t1 = icmp sgt i32 %X, -32768
  %t2 = select i1 %t1, i32 %X, i32 -32768
  %t3 = icmp slt i32 %t2, 32767
  %R = select i1 %t3, i32 %t2, i32 32767
```
which is beautiful and just what we want.

Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization:
https://rise4fun.com/Alive/THl
Proofs for the fold itself: https://rise4fun.com/Alive/THl

Reviewers: spatel, dmgreen, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66232

llvm-svn: 369840
2019-08-24 06:49:25 +00:00
Philip Reames 9cb059fdcc Fix a bug in just submitted rL369789
Started implementing the vector case and realized the scalar case hadn't handled the GEP producing a different type than the base correctly.  It's entertaining seeing what slips through review when we're focused on the 'hard' parts.  :(

Also adding an extra vector test as it happened to be in workspace and wasn't worth separating.

llvm-svn: 369795
2019-08-23 18:27:57 +00:00
Philip Reames 5b02cfa0b3 [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null
This generalizes the isGEPKnownNonNull rule from ValueTracking to apply when we do not know if the base is non-null, and thus need to replace one condition with another.

The core notion is that since an inbounds GEP can only form null if the base pointer is null and the offset is zero. However, if the offset is non-zero, the the "inbounds" marker makes the result poison. Thus, we're free to ignore the case where the offset is non-zero. Similarly, there's no case under which a non-null base can result in a null result without generating poison.

Differential Revision: https://reviews.llvm.org/D66608

llvm-svn: 369789
2019-08-23 17:58:58 +00:00
Philip Reames 764b0fd5a3 [instcombine] icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0
Noticed while looking at pr43028.  

llvm-svn: 369541
2019-08-21 15:51:57 +00:00
Sanjay Patel e728259278 [InstCombine] narrow icmp with extended operands of different widths
An intermediate extend is used to widen the narrow operand to the width of
the other (wider) operand. At that point, we have the same logic as the
existing transform that was restricted to folds of equal width zext/sext.

This mostly solves PR42700:
https://bugs.llvm.org/show_bug.cgi?id=42700

llvm-svn: 369519
2019-08-21 11:56:08 +00:00
Sanjay Patel 292b1087f4 [InstCombine] add helper function for icmp+zext/sext; NFC
llvm-svn: 369421
2019-08-20 18:15:17 +00:00