Commit Graph

3606 Commits

Bob Haarman 055779a9ac Revert "[InstCombine] keep assumption before sinking calls"
Summary:
This reverts commit c3b06d0c39.

Reason for revert: Caused miscompiles when inserting assume for undef.

Also adds a test to prevent similar breakage in future.

Fixes PR44154.

Reviewers: rnk, jdoerfert, efriedma, xbolva00

Reviewed By: rnk

Subscribers: thakis, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70933
2019-12-05 10:39:34 -08:00
Roman Lebedev 796fa662f1
[InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to `sub A, zext B -> add A, sext B`)
Summary:
D68408 proposes to greatly improve our negation sinking abilities.
But under the current canonicalization we produce `sub A, zext(B)`,
which D68408 would consider non-canonical and try to sink as a negation,
undoing the existing canonicalization.
So unless we explicitly stop producing the previous canonicalization,
we will have two conflicting folds and will end up endlessly looping.

This inverts canonicalization, and adds back the obvious fold
that we'd miss:
* `sub [nsw] Op0, sext/zext (bool Y) -> add [nsw] Op0, zext/sext (bool Y)`
  https://rise4fun.com/Alive/xx4
* `sext(bool) + C -> bool ? C - 1 : C`
  https://rise4fun.com/Alive/fBl

As expected, `@ossfuzz_9880()` / `@lshr_out_of_range()` / `@ashr_out_of_range()`
(oss-fuzz 4871) are no longer folded as much, though those cases aren't really worrying.
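
An illustrative IR sketch of the new canonical direction (hypothetical values, not taken from the patch):

```
define i32 @src(i32 %a, i1 %b) {
  %z = zext i1 %b to i32
  %r = sub i32 %a, %z
  ret i32 %r
}
; => %s = sext i1 %b to i32
;    %r = add i32 %a, %s
```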

Reviewers: spatel, efriedma, t.p.northover, hfinkel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71064
2019-12-05 21:21:30 +03:00
Sanjay Patel 3c6b5d3674 [InstCombine] narrow select with FP casts
Select doesn't change values, so truncate of extended operand cancels out.
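
An illustrative IR sketch of the fold (hypothetical values, not from the commit):

```
define float @src(i1 %c, float %x, float %y) {
  %xe = fpext float %x to double
  %ye = fpext float %y to double
  %sel = select i1 %c, double %xe, double %ye
  %r = fptrunc double %sel to float
  ret float %r
}
; => %r = select i1 %c, float %x, float %y
```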
2019-12-05 11:12:44 -05:00
Sanjay Patel 51e420c27e [InstCombine] add FMF guard to builder in fptrunc transform; NFC
This makes no difference currently because we don't apply FMF
to FP casts, but that may change.

This could also be a place to add a fold for select with fptrunc,
so it will make that patch easier/smaller.
2019-12-05 10:55:07 -05:00
Roman Lebedev 09311459e3
[InstCombine] Extend `0 - (X sdiv C) -> (X sdiv -C)` fold to non-splat vectors
Split off from https://reviews.llvm.org/D68408
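
An illustrative IR sketch of the non-splat case (hypothetical values; the usual restrictions on the constant apply, e.g. no element may be 1 or INT_MIN):

```
define <2 x i32> @src(<2 x i32> %x) {
  %d = sdiv <2 x i32> %x, <i32 3, i32 -5>
  %r = sub <2 x i32> zeroinitializer, %d
  ret <2 x i32> %r
}
; => %r = sdiv <2 x i32> %x, <i32 -3, i32 5>
```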
2019-12-05 15:48:29 +03:00
Ehud Katz 2b6b8cb10c [APFloat] Prevent construction of APFloat with Semantics and FP value
Constructor invocations such as `APFloat(APFloat::IEEEdouble(), 0.0)`
may seem like they accept a FP (floating point) value, but the overload
they reach is actually the `integerPart` one, not a `float` or `double`
overload (which only exists when `fltSemantics` isn't passed).

This may lead to possible loss of data, by the conversion from `float`
or `double` to `integerPart`.

To prevent future mistakes, a new constructor overload is added that accepts
any FP value and is marked `delete`, to prevent its usage.

Fixes PR34095.

Differential Revision: https://reviews.llvm.org/D70425
2019-12-04 12:02:04 +02:00
Craig Topper 5ebbabc1af [InstCombine] Revert aafde063aa and 6749dc3446 related to bitcast handling of x86_mmx
This reverts these two commits
[InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double.
[InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement

We're seeing at least one internal test failure related to a
bitcast that was previously placed before an inline assembly block
containing emms now being placed after it. This leads to the mmx
state ending up not empty after the emms. IR has no way to
make any specific guarantees about this, so revert these patches
to get back to the previous behavior, which at least worked for this
test.
2019-12-03 14:02:22 -08:00
Sanjay Patel af4e59949c [InstCombine] fix undef propagation for vector urem transform (PR44186)
As described here:
https://bugs.llvm.org/show_bug.cgi?id=44186

The match() code safely allows undef values, but we can't safely
propagate a vector constant that contains an undef to the new
compare instruction.
2019-12-02 12:17:38 -05:00
Simon Tatham 01aefae4a1 [ARM,MVE] Add an InstCombine rule permitting VPNOT.
Summary:
If a user writing C code using the ACLE MVE intrinsics generates a
predicate and then complements it, then the resulting IR will use the
`pred_v2i` IR intrinsic to turn some `<n x i1>` vector into a 16-bit
integer; complement that integer; and convert back. This will generate
machine code that moves the predicate out of the `P0` register,
complements it in an integer GPR, and moves it back in again.

This InstCombine rule replaces `i2v(~v2i(x))` with a direct complement
of the original predicate vector, which we can already instruction-
select as the VPNOT instruction which complements P0 in place.
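
An illustrative IR sketch of the round trip being folded (the `.v8i1` overload suffix on the intrinsics is assumed here for concreteness):

```
declare i32 @llvm.arm.mve.pred.v2i.v8i1(<8 x i1>)
declare <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32)

define <8 x i1> @src(<8 x i1> %p) {
  %int = call i32 @llvm.arm.mve.pred.v2i.v8i1(<8 x i1> %p)
  %not = xor i32 %int, 65535            ; complement the 16-bit predicate
  %r = call <8 x i1> @llvm.arm.mve.pred.i2v.v8i1(i32 %not)
  ret <8 x i1> %r
}
; => %r = xor <8 x i1> %p, <all-true>   ; selectable as a single VPNOT
```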

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70484
2019-12-02 16:20:30 +00:00
Roman Lebedev 0f22e783a0
[InstCombine] Revert rL341831: relax one-use check in foldICmpAddConstant() (PR44100)
rL341831 moved the one-use check higher up, restricting a few folds
that produced a single instruction from two instructions to the case
where the inner instruction would go away.

Original commit message:
> InstCombine: move hasOneUse check to the top of foldICmpAddConstant
>
> There were two combines not covered by the check before now,
> neither of which actually differed from normal in the benefit analysis.
>
> The most recent seems to be because it was just added at the top of the
> function (naturally). The older is from way back in 2008 (r46687)
> when we just didn't put those checks in so routinely, and has been
> diligently maintained since.

From the commit message alone, there doesn't seem to be a
deeper motivation, a deeper problem it was trying to solve,
other than 'fixing the wrong one-use check'.

As i have briefly discussed on IRC with Tim, the original motivation
can no longer be recovered, too much time has passed.

However i believe that the original fold was doing the right thing:
we should be performing such a transformation even if the inner `add`
will not go away - that will still unchain the comparison from the `add`,
so it will no longer need to wait for the `add` to compute.

Doing so doesn't seem to break any particular idioms,
at least as far as i can see.

References https://bugs.llvm.org/show_bug.cgi?id=44100
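
A hypothetical sketch of the kind of fold this revert re-enables (values are illustrative only):

```
define i1 @src(i32 %x, i32* %p) {
  %a = add nsw i32 %x, 5
  store i32 %a, i32* %p        ; extra use keeps the add alive
  %c = icmp slt i32 %a, 10
  ret i1 %c
}
; => %c = icmp slt i32 %x, 5  ; the compare no longer waits on the add
```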
2019-12-02 18:06:15 +03:00
Sanjay Patel af0babc90a [InstCombine] fold copysign with constant sign argument to (fneg+)fabs
If the sign of the sign argument is known (this could be extended to use ValueTracking),
then we can use fneg+fabs to clear/set the sign bit of the magnitude argument.
http://llvm.org/docs/LangRef.html#llvm-copysign-intrinsic

This transform is already done in DAGCombiner, but we can do it sooner in IR as
suggested in PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153

We have effectively no analysis for copysign in IR, so we are taking the unusual step
of increasing the number of IR instructions for the negative constant case.
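
An illustrative IR sketch of the negative-constant case (hypothetical values):

```
declare double @llvm.copysign.f64(double, double)

define double @src(double %x) {
  %r = call double @llvm.copysign.f64(double %x, double -1.0)
  ret double %r
}
; => %a = call double @llvm.fabs.f64(double %x)
;    %r = fneg double %a
```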

Differential Revision: https://reviews.llvm.org/D70792
2019-12-02 09:23:12 -05:00
Bjorn Pettersson a9d6b0e544 [InstCombine] Fix big-endian miscompile of (bitcast (zext/trunc (bitcast)))
Summary:
optimizeVectorResize is rewriting patterns like:
  %1 = bitcast vector %src to integer
  %2 = trunc/zext %1
  %dst = bitcast %2 to vector

Since bitcasting between integer and vector types gives
different integer values depending on endianness, we need
to take endianness into account. As it happens, the old
implementation only produced the correct result for little
endian targets.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=44178

Reviewers: spatel, lattner, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, hiraditya, uabelho, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70844
2019-12-02 11:05:25 +01:00
David Green 59b56e5c57 [InstCombine] Expand usub_sat patterns to handle constants
The constants come through as add %x, -C, not a sub as would be
expected. They need some extra matchers to canonicalise them towards
usub_sat.
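
An illustrative IR sketch of the constant pattern (hypothetical values):

```
define i32 @src(i32 %x) {
  %a = add i32 %x, -10       ; the sub appears as an add of a negative constant
  %c = icmp ult i32 %x, 10
  %r = select i1 %c, i32 0, i32 %a
  ret i32 %r
}
; => %r = call i32 @llvm.usub.sat.i32(i32 %x, i32 10)
```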

Differential Revision: https://reviews.llvm.org/D69514
2019-11-30 16:58:01 +00:00
David Green 3a1bef5616 [InstCombine] Adjust usub_sat fold one use checks
This adjusts the one-use checks in the usub_sat fold code to not
increase instruction count, but otherwise do the fold. Reviewed as a
part of D69514.
2019-11-30 16:58:00 +00:00
Sanjay Patel 35827164c4 [InstCombine] remove shuffle mask canonicalization that creates undef elements
This is NFC-intended because SimplifyDemandedVectorElts() does the same
transform later. As discussed in D70641, we may want to change that
behavior, so we need to isolate where it happens.
2019-11-25 13:33:56 -05:00
Sanjay Patel e85d2e4981 [InstCombine] prevent infinite loop from conflicting shuffle mask transforms
The pattern in question is currently not possible because we
aggressively (wrongly) transform mask elements to undef values
if they choose from an undef operand. That, however, would
change if we tighten our semantics for shuffles as discussed
in D70641. Adding this check gives us the flexibility to make
that change with minimal overhead for current definitions.
2019-11-25 12:00:41 -05:00
Sanjay Patel fc31b58eff [InstCombine] simplify code for shuffle mask canonicalization; NFC
We never use the local 'Mask' before returning, so that was dead code.
2019-11-25 11:11:12 -05:00
Sanjay Patel 847aabf11f [InstCombine] remove dead code from shuffle mask canonicalization; NFC 2019-11-25 10:54:18 -05:00
Sanjay Patel 20684092ab [InstCombine] simplify loop for shuffle mask canonicalization; NFC 2019-11-25 10:41:50 -05:00
Sanjay Patel f575f12c64 [InstCombine] remove identity shuffle simplification for mask with undefs
And simultaneously enhance SimplifyDemandedVectorElts() to recognize that
pattern. That preserves some of the old optimizations in IR.

Given a shuffle that includes undef elements in an otherwise identity mask like:

define <4 x float> @shuffle(<4 x float> %arg) {
  %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3>
  ret <4 x float> %shuf
}

We were simplifying that to the input operand.

But as discussed in PR43958:
https://bugs.llvm.org/show_bug.cgi?id=43958
...that means that per-vector-element poison that would be stopped by the shuffle can now
leak to the result.

Also note that we still have (and there are tests for) the same transform with no undef
elements in the mask (a fully-defined identity mask). I don't think there's any
controversy about that case - it's a valid transform under any interpretation of
shufflevector/undef/poison.

Looking at a few of the diffs into codegen, I don't see any difference in final asm. So
depending on your perspective, that's good (no real loss of optimization power) or bad
(poison exists in the DAG, so we only partially fixed the bug).

Differential Revision: https://reviews.llvm.org/D70246
2019-11-24 10:06:26 -05:00
Davide Italiano c32f0ff92f [InstCombine] Fix call guard difference with dbg
Patch by Chris Ye!

Differential Revision: https://reviews.llvm.org/D68004
2019-11-22 13:35:53 -08:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraries:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined in lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Sanjay Patel 4ae0a13256 [InstCombine] add assert in SimplifyDemandedVectorElts and improve readability; NFC 2019-11-21 11:16:36 -05:00
Simon Tatham f4f77aa53e [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i.
If you're writing C code using the ACLE MVE intrinsics that passes the
result of a vcmp as input to a predicated intrinsic, e.g.

  mve_pred16_t pred = vcmpeqq(v1, v2);
  v_out = vaddq_m(v_inactive, v3, v4, pred);

then clang's codegen for the compare intrinsic will create calls to
`@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an
`mve_pred16_t` integer representation, and then the next intrinsic
will call `@llvm.arm.mve.pred.i2v` to convert it straight back again.
This will be visible in the generated code as a `vmrs`/`vmsr` pair
that moves the predicate value pointlessly out of `p0` and back into it again.

To prevent that, I've added InstCombine rules to remove round trips of
the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine
about the known and demanded bits of those intrinsics. As a result,
you now get just the generated code you wanted:

  vpt.u16 eq, q1, q2
  vaddt.u16 q0, q3, q4

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70313
2019-11-18 10:39:30 +00:00
Sanjay Patel 5d67d81f48 [InstCombine] prevent crashing/assert on shift constant expression (PR44028)
The cast to a binary operator implies an instruction, but the matcher for shift does not guarantee one:
https://bugs.llvm.org/show_bug.cgi?id=44028
2019-11-17 17:31:09 -05:00
David Green 08390c52a2 [InstCombine] Canonicalize ssub.with.overflow with clamp to ssub.sat
Working on top of D69252, this adds canonicalisation patterns for ssub.with.overflow to ssub.sat.

Differential Revision: https://reviews.llvm.org/D69753
2019-11-17 10:45:11 +00:00
David Green 03fce6b12e [InstCombine] Canonicalize sadd.with.overflow with clamp to sadd.sat
This adds to D69245, adding extra signed patterns for folding from a
sadd_with_overflow to a sadd_sat. These are more complex than the
unsigned patterns, as the overflow can occur in either direction.

For the add case, the positive overflow can only occur if both of the
values are positive (same for both the values being negative). So there
is an extra select on whether to use the positive or negative overflow
limit.
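
An illustrative IR sketch of the signed pattern, including the extra select on the overflow limit (hypothetical values):

```
declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

define i32 @src(i32 %x, i32 %y) {
  %ao = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %x, i32 %y)
  %s = extractvalue { i32, i1 } %ao, 0
  %o = extractvalue { i32, i1 } %ao, 1
  %neg = icmp slt i32 %y, 0
  %lim = select i1 %neg, i32 -2147483648, i32 2147483647
  %r = select i1 %o, i32 %lim, i32 %s
  ret i32 %r
}
; => %r = call i32 @llvm.sadd.sat.i32(i32 %x, i32 %y)
```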

Differential Revision: https://reviews.llvm.org/D69252
2019-11-17 10:42:39 +00:00
Francis Visoiu Mistrih a4c76be506 [InstCombine] Don't use getFirstNonPHI in FoldIntegerTypedPHI
getFirstNonPHI iterates over all the instructions in a block until it
finds a non-PHI.

Then, the loop starts from the beginning of the block and goes through
all the instructions until it reaches the instruction found by
getFirstNonPHI.

Instead of doing that, just stop when a non-PHI is found.

This reduces the compile-time of a test case discussed in
https://reviews.llvm.org/D47023 by 13x.

Not entirely sure how to come up with a test case for this since it's a
compile time issue that would significantly slow down running the tests.

Differential Revision: https://reviews.llvm.org/D70016
2019-11-14 17:52:01 -08:00
Reid Kleckner 4c1a1d3cf9 Add missing includes needed to prune LLVMContext.h include, NFC
These are a pre-requisite to removing #include "llvm/Support/Options.h"
from LLVMContext.h: https://reviews.llvm.org/D70280
2019-11-14 15:23:15 -08:00
Sanjay Patel 385572ccfe [InstCombine] remove duplicate code for simplifying a shuffle; NFCI
The transform is already handled by InstSimplify or earlier
in InstCombine, so trying to do it again is not necessary.
2019-11-14 13:12:25 -05:00
Daniil Suchkov 4c9d0da838 Revert "[InstCombine] Fold PHIs with equal incoming pointers"
This reverts commit a2f6ae9abf.
It is reverted due to clang-cmake-armv7-selfhost buildbot failure.
2019-11-14 17:42:01 +07:00
Daniil Suchkov a2f6ae9abf [InstCombine] Fold PHIs with equal incoming pointers
This is a resubmission of bbb29738b5 that
was reverted due to clang tests failures. It includes the fix and
additional IR tests for the missed case.

Summary:
In the case when all incoming values of a PHI are equal pointers, this
transformation inserts a definition of such a pointer right after the
definition of the base pointer and replaces both the PHI and all of its
incoming pointers with this value. The primary goal of this transformation
is canonicalization of this pattern in order to enable optimizations that
can't handle PHIs. Non-inbounds pointers aren't currently supported.
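
An illustrative IR sketch of the pattern (hypothetical values):

```
define i32 @src(i32* %base, i1 %c) {
entry:
  br i1 %c, label %bb1, label %bb2
bb1:
  %p1 = getelementptr inbounds i32, i32* %base, i64 4
  br label %exit
bb2:
  %p2 = getelementptr inbounds i32, i32* %base, i64 4
  br label %exit
exit:
  %phi = phi i32* [ %p1, %bb1 ], [ %p2, %bb2 ]
  %v = load i32, i32* %phi
  ret i32 %v
}
; => a single GEP emitted right after the definition of %base replaces
;    the PHI and both of its incoming pointers.
```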

Reviewers: spatel, RKSimon, lebedev.ri, apilipenko

Reviewed By: apilipenko

Tags: #llvm

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D68128
2019-11-14 17:04:32 +07:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
causes lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Sanjay Patel 3d6b53980c [InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select
As noted by the FIXME comment, this is not correct based on our current FMF semantics.
We should be propagating FMF from the final value in a sequence (in this case the
'select'). So the behavior even without this patch is wrong, but we did not allow FMF
on 'select' until recently.

But if we do the correct thing right now in this patch, we'll inevitably introduce
regressions because we have not wired up FMF propagation for 'phi' and 'select' in
other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a
better incremental way to make progress.

That said, the potential extra damage over the existing wrong behavior from this
patch is very limited. AFAIK, the only way to have different FMF on IR in the same
function is if we have LTO inlined IR from 2 modules that were compiled using
different fast-math settings.

As seen in the tests, we may actually see some improvements with this patch because
adding the FMF to the 'select' allows matching to min/max intrinsics that were
previously missed (in the common case, the 'fcmp' and 'select' should have identical
FMF to begin with).

Next steps in the transition:

* Make similar changes in instcombine as needed.
* Enable phi-to-select FMF propagation in SimplifyCFG.
* Remove dependencies on fcmp with FMF.
* Deprecate FMF on fcmp.

Differential Revision: https://reviews.llvm.org/D69720
2019-11-13 10:38:42 -05:00
Florian Hahn f7499011ca [InstCombine] Avoid moving ops that do restrict undef across shuffles.
I think we have to be a bit more careful when it comes to moving
ops across shuffles, if the op does restrict undef. For example, without
this patch, we would move 'and %v, <0, 0, -1, -1>' over a
'shufflevector %a, undef, <undef, undef, 1, 2>'. As a result, the first
2 lanes of the result are undef after the combine, but they really
should be 0, unless I am missing something.

For ops that do fold to undef on undef operands, the current behavior
should be fine. I've added a conservative check, OpDoesRestrictUndef - maybe
there's a better existing utility?

Reviewers: spatel, RKSimon, lebedev.ri

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D70093
2019-11-13 13:40:34 +00:00
Daniil Suchkov cba4a27745 Temporarily revert "[InstCombine] Fold PHIs with equal incoming pointers"
Revert due to sanitizer-windows buildbot failure.

This reverts commit bbb29738b5.
2019-11-13 17:14:11 +07:00
Daniil Suchkov bbb29738b5 [InstCombine] Fold PHIs with equal incoming pointers
In the case when all incoming values of a PHI are equal pointers, this
transformation inserts a definition of such a pointer right after the
definition of the base pointer and replaces both the PHI and all of its
incoming pointers with this value. The primary goal of this transformation
is canonicalization of this pattern in order to enable optimizations that
can't handle PHIs. Non-inbounds pointers aren't currently supported.

Reviewers: spatel, RKSimon, lebedev.ri, apilipenko

Reviewed By: apilipenko

Tags: #llvm

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D68128
2019-11-13 17:00:34 +07:00
Diana Picus 7f1dcc8952 [InstCombine] Skip scalable vectors in combineLoadToOperationType
Don't try to canonicalize loads to scalable vector types to loads
of integers.

This removes one assertion when trying to use a TypeSize as a parameter
to DataLayout::isLegalInteger. It does not handle the second part of the
function (which looks at bitcasts).

This patch also contains a NFC fix for Load Analysis, where a variable
initialization that would cause the same assertion is moved closer to
its use. This allows us to run the new test for InstCombine without
having to teach LocationSize to play nicely with scalable vectors.

Differential Revision: https://reviews.llvm.org/D70075
2019-11-12 12:27:09 +01:00
aqjune 4187cb138b Add InstCombine/InstructionSimplify support for Freeze Instruction
Summary:
- Add llvm::SimplifyFreezeInst
- Add InstCombiner::visitFreeze
- Add llvm tests

Reviewers: majnemer, sanjoy, reames, lebedev.ri, spatel

Reviewed By: reames, lebedev.ri

Subscribers: reames, lebedev.ri, filcab, regehr, trentxintong, llvm-commits

Differential Revision: https://reviews.llvm.org/D29013
2019-11-12 12:13:26 +09:00
Sanjay Patel 29f5d1670c Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (3rd try)"
This reverts commit 3db8a3ef86.
This caused a different memory-sanitizer failure than earlier attempts,
but it's still not right.
2019-11-11 09:56:03 -05:00
Sanjay Patel 3db8a3ef86 [InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (3rd try)
Re-try because earlier attempts were reverted due to use-after-free.
Hopefully, diagnosed correctly this time - we replace/remove the
invariant.start first rather than the invariant.end to avoid angering
worklist-based iteration.

We gather a set of white-listed instructions in isAllocSiteRemovable() and then
replace/erase them. But we don't know in general if the instructions in the set
have uses amongst themselves, so order of deletion makes a difference.

There's already a special-case for the llvm.objectsize intrinsic, so add another
for llvm.invariant.start.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=43723

Differential Revision: https://reviews.llvm.org/D69977
2019-11-11 09:29:40 -05:00
Jay Foad 9323ef4ecc [InstCombine] Simplify binary op when only one operand is a select
Summary:
SimplifySelectsFeedingBinaryOp simplified binary ops when both operands
were selects with the same condition. This patch extends it to handle
these cases where only one operand is a select:

X op (C ? P : Q) -> C ? (X op P) : (X op Q)
  // if X op P and X op Q both simplify
(C ? P : Q) op Y -> C ? (P op Y) : (Q op Y)
  // if P op Y and Q op Y both simplify

For example: X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0

Reviewers: mcberg2017, majnemer, craig.topper, qcolombet, mcrosier

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64713
2019-11-11 10:01:59 +00:00
Craig Topper aafde063aa [InstCombine] Turn (extractelement <1 x i64/double> (bitcast (x86_mmx))) into a single bitcast from x86_mmx to i64/double.
The _m64 type is represented in IR as <1 x i64>. The x86-64 ABI
on Linux passes <1 x i64> as a double. MMX intrinsics use the x86_mmx
type in IR. These things result in a lot of bitcasts in mmx code.
There's another instcombine that tries to turn bitcast <1 x i64>
to double into extractelement and a bitcast.

The combine here tries to reverse this extractelement conversion
if we see an mmx type.
2019-11-10 16:25:25 -08:00
Sanjay Patel d115b9fd4a Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (2nd try)"
This reverts commit 56b2aee187.
Still causes a use-after-free on sanitizer bots.
2019-11-10 18:47:49 -05:00
Sanjay Patel 56b2aee187 [InstCombine] avoid crash from deleting an instruction that still has uses (PR43723) (2nd try)
Re-try rGef02831f0a4e (reverted due to use-after-free), but bail out completely
if we encounter an unexpected llvm.invariant.start.

We gather a set of white-listed instructions in isAllocSiteRemovable() and then
replace/erase them. But we don't know in general if the instructions in the set
have uses amongst themselves, so order of deletion makes a difference.

There's already a special-case for the llvm.objectsize intrinsic, so add another
for llvm.invariant.end.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=43723

Differential Revision: https://reviews.llvm.org/D69977
2019-11-10 17:26:36 -05:00
Sanjay Patel b0ac26a632 Revert "[InstCombine] avoid crash from deleting an instruction that still has uses (PR43723)"
This reverts commit ef02831f0a.
Sanitizer bots fail with this change.
2019-11-10 11:18:05 -05:00
Sanjay Patel ef02831f0a [InstCombine] avoid crash from deleting an instruction that still has uses (PR43723)
We gather a set of white-listed instructions in isAllocSiteRemovable() and then
replace/erase them. But we don't know in general if the instructions in the set
have uses amongst themselves, so order of deletion makes a difference.

There's already a special-case for the llvm.objectsize intrinsic, so add another
for llvm.invariant.end.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=43723

Differential Revision: https://reviews.llvm.org/D69977
2019-11-10 09:18:11 -05:00
Jay Foad d162e02cee Refactor SimplifySelectsFeedingBinaryOp for D64713. NFC. 2019-11-09 09:28:22 +00:00
Craig Topper 6749dc3446 [InstCombine] Don't transform bitcasts between x86_mmx and v1i64 into insertelement/extractelement
x86_mmx is conceptually a vector already. Don't introduce an extra conversion between it and scalar i64.

I'm using VectorType::isValidElementType which checks for floating point, integer, and pointers to hopefully make this more readable than just blacklisting x86_mmx.

Differential Revision: https://reviews.llvm.org/D69964
2019-11-07 15:14:13 -08:00
Vedant Kumar a087b78bc4 Wrong debug info generated at -O2 (-O0 is correct)
The InstCombine pass was erasing trivially dead instructions without updating
dependent llvm.dbg.value intrinsics, which meant the debugger did not show the
programmer the current state of variables.
As a part of this fix, before deleting such an instruction, we iterate through
all of its llvm.dbg users and set each of them to undef.
The user will now see "optimized out" when trying to print those variables.
This fixes
https://bugs.llvm.org/show_bug.cgi?id=43893

This is my first fix to llvm.

Patch by kamlesh kumar!

Differential Revision: https://reviews.llvm.org/D69809
2019-11-07 11:19:41 -08:00
Sanjay Patel d9ccb6367a [InstCombine] canonicalize shift+logic+shift to reduce dependency chain
shift (logic (shift X, C0), Y), C1 --> logic (shift X, C0+C1), (shift Y, C1)
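
An illustrative IR sketch of the fold (hypothetical values):

```
define i32 @src(i32 %x, i32 %y) {
  %s1 = shl i32 %x, 2
  %a = and i32 %s1, %y
  %r = shl i32 %a, 3
  ret i32 %r
}
; => %s  = shl i32 %x, 5
;    %y3 = shl i32 %y, 3
;    %r  = and i32 %s, %y3
```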

This is an IR translation of an existing SDAG transform added here:
rL370617

So we again have 9 possible patterns with a commuted IR variant of each pattern:
https://rise4fun.com/Alive/VlI
https://rise4fun.com/Alive/n1m
https://rise4fun.com/Alive/1Vn

Part of the motivation is to allow easier recognition and subsequent
canonicalization of bswap patterns as discussed in PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146

We had to delay this transform because it used to allow the SLP vectorizer
to create awful reductions out of simple load-combines.
That problem was fixed with:
rL375025
(we'll bring back load combining in IR someday...)

The backend is also better equipped to deal with these patterns now
using hooks like TLI.getShiftAmountThreshold().

The only remaining potential controversy is that the -reassociate pass
tends to reverse this kind of pattern (to help GVN?). But since -reassociate
doesn't do anything with these specific patterns, there is no conflict currently.

Finally, there's a new pass proposal at D67383 for general tree-height-reduction
reassociation, and it could use a cost model to decide how to optimally rearrange
these kinds of ops for a target. That patch appears to be stalled.

Differential Revision: https://reviews.llvm.org/D69842
2019-11-07 12:09:45 -05:00
Roman Lebedev ccf1a5f4bb
[InstCombine] dropRedundantMaskingOfLeftShiftInput(): truncation (PR42563)
Summary:
That fold keeps growing and growing :(
I think this may be one of the last pieces for it.

Since D67677/D67725, the fold knows the general form
of the pattern - where some masking is needed:
https://rise4fun.com/Alive/F5R
https://rise4fun.com/Alive/gslRa

But there is one more huge piece missing - if you are extracting some bits,
it is not impossible that the origin is wider than the extraction,
i.e. there may be a truncation. And we don't deal with that yet.

But we can, and the generalization remains fully identical:
https://rise4fun.com/Alive/Uar
https://rise4fun.com/Alive/5SW

After a preparatory cleanup i think the diff looks rather clean.

One missing piece is that in some patterns (especially pat. b),
`-1` only needs to be `-1` in final type, but that is for later..

https://bugs.llvm.org/show_bug.cgi?id=42563

Reviewers: spatel, nikic

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69125
2019-11-05 12:41:26 +03:00
Dávid Bolvanský 058b5028de Reland '[InstructionCombining] Fixed null check after dereferencing warning. NFCI.' 2019-11-03 20:34:54 +01:00
Dávid Bolvanský 5b37c018d5 Revert "[InstructionCombining] Fixed null check after dereferencing warning. NFCI."
This reverts commit 8308187fd9. This exposed a bug.
2019-11-03 20:31:05 +01:00
Dávid Bolvanský d825ed24d2 Revert "[InstructionCompares] Fixed null check after dereferencing warning. NFCI."
This reverts commit b8685cf304.
2019-11-03 20:24:01 +01:00
Dávid Bolvanský b8685cf304 [InstructionCompares] Fixed null check after dereferencing warning. NFCI. 2019-11-03 20:13:45 +01:00
Dávid Bolvanský 8308187fd9 [InstructionCombining] Fixed null check after dereferencing warning. NFCI. 2019-11-03 20:10:46 +01:00
Sanjay Patel a2240f57e7 [InstCombine] simplify fcmp+select canonicalization; NFCI
We had 2 blocks of code that are nearly identical. Existing
regression tests should cover both of the patterns.
2019-10-31 13:13:32 -04:00
David Green a5f7bc0de7 [InstCombine] Canonicalize uadd.with.overflow to uadd.sat
This adds some patterns to transform uadd.with.overflow to uadd.sat
(with usub.with.overflow to usub.sat too). The patterns select
UINTMAX (or 0 for subs) depending on whether the operation overflowed.

Signed patterns are a little more involved (they can wrap in two
directions), but can be added here in a followup patch too.
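
An illustrative IR sketch of the unsigned pattern (hypothetical values):

```
declare { i32, i1 } @llvm.uadd.with.overflow.i32(i32, i32)

define i32 @src(i32 %x, i32 %y) {
  %ao = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %x, i32 %y)
  %s = extractvalue { i32, i1 } %ao, 0
  %o = extractvalue { i32, i1 } %ao, 1
  %r = select i1 %o, i32 -1, i32 %s   ; UINTMAX on overflow
  ret i32 %r
}
; => %r = call i32 @llvm.uadd.sat.i32(i32 %x, i32 %y)
```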

Differential Revision: https://reviews.llvm.org/D69245
2019-10-31 12:45:38 +00:00
tyker c3b06d0c39 [InstCombine] keep assumption before sinking calls
Summary:
In the following C code, the branch is not removed by clang at -O3.
```
int f1(char* p) {
    int i1 = __builtin_strlen(p);
    if (!p)
        return -1;
    return i1;
}
```
The issue is that the call to strlen is sunk to the following block by instcombine. In its new place, the call to strlen no longer dominates the use in the icmp, so value tracking can't see that p cannot be null.
This patch resolves the issue by inserting an assumption at the original place of the call before sinking it, when that call can be used to prove an argument to be nonnull.
This resolves the issue at -O3.

Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69477
2019-10-31 00:15:19 +01:00
Sanjay Patel a22282be54 [InstCombine] make icmp vector canonicalization safe for constant with undef elements
This is a fix for:
https://bugs.llvm.org/show_bug.cgi?id=43730
...and as shown there, we have existing test cases that show potential miscompiles.

We could just bail out for vector constants that contain any undef elements, or we can do as shown here:
allow the transform, but replace the undefs with a safe value.

For most of the tests shown, this results in a full splat constant (no undefs) which is probably a win
for further IR analysis because we conservatively don't match undefs in most cases. Codegen can probably
recover these kinds of undef lanes via demanded elements analysis if that's profitable.

Differential Revision: https://reviews.llvm.org/D69519
2019-10-29 10:58:14 -04:00
Sanjay Patel a1e8ad4f2f [IR] move helper function to replace undef constant (elements) with fixed constants
This is the NFC part of D69519.
We had this functionality locally in instcombine, but it can be used
elsewhere, so hoisting it to Constant class.
2019-10-29 08:52:10 -04:00
David Green bf21f0d489 [InstCombine] Extra combine for uadd_sat
This is an extra fold for a canonical form of uadd_sat, as shown in
D68651. It essentially forms uadd.sat from an add and a select.

Differential Revision: https://reviews.llvm.org/D69244
2019-10-28 15:21:16 +00:00
Benjamin Kramer 6f0bb77037 [InstCombine] Fold one-use variable into assert
Avoids warnings in Release builds. NFC.
2019-10-24 17:57:24 +02:00
Simon Tatham e5f485c3bd [InstCombine] Known-bits optimization for ARM MVE VADC.
The MVE VADC instruction reads and writes the carry bit at bit 29 of
the FPSCR register. The corresponding ACLE intrinsic is specified to
work with an integer in which the carry bit is stored at bit 0. So if
a user writes a code sequence in C that passes the carry from one VADC
to the next, like this,

    s0 = vadcq_u32(a0, b0, &carry);
    s1 = vadcq_u32(a1, b1, &carry);

then clang will generate IR for each of those operations that shifts
the carry bit up into bit 29 before the VADC, and after it, shifts it
back down and masks off all but the low bit. But in this situation
what you really wanted was two consecutive VADC instructions, so that
the second one directly reads the value left in FPSCR by the first,
without wasting several instructions on pointlessly clearing the other
flag bits in between.

This commit explains to InstCombine that the other bits of the flags
operand don't matter, and adds a test that demonstrates that all the
code between the two VADC instructions can be optimized away as a
result.

Reviewers: dmgreen, miyuki, ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67162
2019-10-24 16:33:13 +01:00
David Green 186155b89c [InstCombine] Signed saturation patterns
This adds an instcombine matcher for code that attempts to perform signed
saturating arithmetic by casting to a higher type. Unsigned cases are already
matched; this adds extra matches for the more complex signed cases, which
involve matching the min(max(add a, b)) nodes with proper extends to ensure
legality.
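
An illustrative IR sketch of the signed pattern being matched (hypothetical values):

```
define i16 @src(i16 %x, i16 %y) {
  %xe = sext i16 %x to i32
  %ye = sext i16 %y to i32
  %s = add i32 %xe, %ye
  %c1 = icmp sgt i32 %s, 32767
  %hi = select i1 %c1, i32 32767, i32 %s
  %c2 = icmp slt i32 %hi, -32768
  %lo = select i1 %c2, i32 -32768, i32 %hi
  %r = trunc i32 %lo to i16
  ret i16 %r
}
; => %r = call i16 @llvm.sadd.sat.i16(i16 %x, i16 %y)
```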

Differential Revision: https://reviews.llvm.org/D68651

llvm-svn: 375505
2019-10-22 15:39:47 +00:00
Guillaume Chatelet 5b99c189b3 [Alignment][NFC] Convert StoreInst to MaybeAlign
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69303

llvm-svn: 375499
2019-10-22 12:55:32 +00:00
Guillaume Chatelet 734c74ba14 [Alignment][NFC] Convert LoadInst to MaybeAlign
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69302

llvm-svn: 375498
2019-10-22 12:35:55 +00:00
Guillaume Chatelet 301b4128ac [Alignment][NFC] Finish transition for `Loads`
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69253

llvm-svn: 375419
2019-10-21 15:10:26 +00:00
Roman Lebedev 9948fac6c1 [NFC][InstCombine] Fixup comments
As noted in post-commit review of rL375378.

llvm-svn: 375397
2019-10-21 08:21:54 +00:00
Piotr Sobczak a861c9aef9 [InstCombine] Allow values with multiple users in SimplifyDemandedVectorElts
Summary:
Allow ignoring the check for a single use in SimplifyDemandedVectorElts
so that operands can be simplified if DemandedElts is known to contain
the union of elements used by all users.
It is the responsibility of the caller of SimplifyDemandedVectorElts to
supply correct DemandedElts.

Simplify a series of extractelement instructions if only a subset of
elements is used.

Reviewers: reames, arsenm, majnemer, nhaehnle

Reviewed By: nhaehnle

Subscribers: wdng, jvesely, nhaehnle, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67345

llvm-svn: 375395
2019-10-21 08:12:47 +00:00
Roman Lebedev 7015a5c54b [InstCombine] conditional sign-extend of high-bit-extract: 'or' pattern.
In this pattern, all the "magic" bits that we'd `add` are
high sign bits, and in the value we'd be adding to they are all unset -
not unexpectedly - so we can use an `or` there:
https://rise4fun.com/Alive/ups

It is possible that `haveNoCommonBitsSet()` should be taught about this
pattern so that we never have an `add` variant, but the reasoning would
need to be recursive (because of that `select`), so i'm not really sure
that would be worth it just yet.

llvm-svn: 375378
2019-10-20 20:52:06 +00:00
Nikita Popov b1b7a2f7b6 [InstCombine] Fold uadd.sat(a, b) == 0 and usub.sat(a, b) == 0
This adds folds for comparing uadd.sat/usub.sat with zero:

 * uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0
 * usub.sat(a, b) == 0 => a <= b

And inverted forms for !=.
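
An illustrative IR sketch of the usub.sat case (hypothetical values):

```
declare i32 @llvm.usub.sat.i32(i32, i32)

define i1 @src(i32 %a, i32 %b) {
  %s = call i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
  %c = icmp eq i32 %s, 0
  ret i1 %c
}
; => %c = icmp ule i32 %a, %b
```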

Differential Revision: https://reviews.llvm.org/D69224

llvm-svn: 375374
2019-10-20 20:19:42 +00:00
Roman Lebedev 49483a3bc2 [InstCombine] Shift amount reassociation in shifty sign bit test (PR43595)
Summary:
This problem consists of several parts:
* Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`.
  This is trivial, and easy to do, we have a fold for it.
* Shift amount reassociation - if we have two identical shifts,
  and we can simplify-add their shift amounts together,
  then we likely can just perform them as a single shift.
  But this is finicky, has one-use restrictions,
  and shift opcodes must be identical.

But there is a super-pattern where both of these work together
to produce a sign bit test from two shifts + comparison.
We do indeed already handle this in most cases.
But since we get that fold transitively, it has one-use restrictions.
And what's worse, in this case the right-shifts aren't required to be
identical, and we can't handle that transitively:

If the total shift amount is bitwidth-1, only a sign bit will remain
in the output value. But if we look at this from the perspective of
two shifts, we can't fold - we can't possibly know what bit pattern
we'd produce via two shifts, it will be *some* kind of a mask
produced from original sign bit, but we just can't tell it's shape:
https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN

But it will *only* contain sign bit and zeros.
So from the perspective of sign bit test, we're good:
https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU
Superb!

So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a
pseudo-analysis mode that will ignore extra uses, and will only check
whether a) those are two right shifts and b) they end up with a bitwidth(x)-1
shift amount, and return either the original value that we are sign-checking,
or null.

This does not have any functionality change for
the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`.

All that being said, as discussed in the review, this yet again
increases usage of instsimplify in instcombine as a utility.
Some day that may need to be reevaluated.

https://bugs.llvm.org/show_bug.cgi?id=43595

Reviewers: spatel, efriedma, vsk

Reviewed By: spatel

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68930

llvm-svn: 375371
2019-10-20 19:38:50 +00:00
Bjorn Pettersson 6456252dbf [InstCombine] Fix miscompile bug in canEvaluateShuffled
Summary:
Add restrictions in canEvaluateShuffled to prevent us from, for example,
transforming

  %0 = insertelement <2 x i16> undef, i16 %a, i32 0
  %1 = srem <2 x i16> %0, <i16 2, i16 1>
  %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>

into

   %1 = insertelement <2 x i16> undef, i16 %a, i32 1
   %2 = srem <2 x i16> %1, <i16 undef, i16 2>

as having an undef denominator makes the srem undefined (for all
vector elements).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689

Reviewers: spatel, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69038

llvm-svn: 375208
2019-10-18 07:42:02 +00:00
Roman Lebedev 31a691e2a2 [NFC][InstCombine] Some more preparatory cleanup for dropRedundantMaskingOfLeftShiftInput()
llvm-svn: 375153
2019-10-17 18:30:03 +00:00
Piotr Sobczak 79769a4475 [InstCombine][AMDGPU] Fix crash with v3i16/v3f16 buffer intrinsics
Summary:
This is something of a workaround to avoid a crash later on in type
legalizer (WidenVectorResult()).
Also added some f16 tests, including a non-working v3f16 case with
a FIXME.

Reviewers: arsenm, tpr, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68865

llvm-svn: 374993
2019-10-16 11:14:01 +00:00
Sanjay Patel 455ce7816c [InstCombine] fold a shifted bool zext to a select (2nd try)
The 1st attempt at rL374828 inserted the code
at the wrong position (outside of the constant-shift-amount
block). Trying again with an additional test to verify
const-ness.

For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

https://rise4fun.com/Alive/IZ9
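
An illustrative IR sketch of the fold (hypothetical shift amount):

```
define i32 @src(i1 %x) {
  %z = zext i1 %x to i32
  %r = shl i32 %z, 3
  ret i32 %r
}
; => %r = select i1 %x, i32 8, i32 0
```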

Fixes PR42257.

Based on original patch by @zvi (Zvi Rackover)

Differential Revision: https://reviews.llvm.org/D63382

llvm-svn: 374886
2019-10-15 13:12:44 +00:00
Sanjay Patel 4335d8f0e8 Revert [InstCombine] fold a shifted bool zext to a select
This reverts r374828 (git commit 1f40f15d54) due to bot breakage

llvm-svn: 374851
2019-10-14 23:55:39 +00:00
Sanjay Patel 1f40f15d54 [InstCombine] fold a shifted bool zext to a select
For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

https://rise4fun.com/Alive/IZ9

Fixes PR42257.

Based on original patch by @zvi (Zvi Rackover)

Differential Revision: https://reviews.llvm.org/D63382

llvm-svn: 374828
2019-10-14 21:56:40 +00:00
Roman Lebedev 7a9fa897ec [NFC][InstCombine] Some preparatory cleanup in dropRedundantMaskingOfLeftShiftInput()
llvm-svn: 374734
2019-10-13 20:15:00 +00:00
Sanjay Patel f90728c322 [InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space
Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501

Add one more restriction: if the pointer is deref-or-null and in a non-default
(non-zero) address space, we can't assume inbounds.

Differential Revision: https://reviews.llvm.org/D68706

llvm-svn: 374728
2019-10-13 17:19:08 +00:00
Roman Lebedev 7cdeac43e5 [InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389)
This can come up in Bit Stream abstractions.

The pattern looks big/scary, but it can't be simplified any further.
It is only this simple because a number of my preparatory folds have
already happened (shift amount reassociation / shift amount
reassociation in bit test, sign bit test detection).

Highlights:
* There are two main flavors: https://rise4fun.com/Alive/zWi
  The difference is add vs. sub, and left-shift of -1 vs. 1
* Since we only change the shift opcode,
  we can preserve the exact-ness: https://rise4fun.com/Alive/4u4
* There can be truncation after high-bit-extraction:
  https://rise4fun.com/Alive/slHc1   (the main pattern i'm after!)
  Which means that we need to ignore zext of shift amounts and of NBits.
* The sign-extending magic can be extended itself (in add pattern
  via sext, in sub pattern via zext, not the other way around!)
  https://rise4fun.com/Alive/NhG
  (or those sext/zext can be sinked into `select`!)
  Which again means we should pay attention when matching NBits.
* We can have both truncation of extraction and widening of magic:
  https://rise4fun.com/Alive/XTw
  In other words, i don't believe we need to have any checks on
  bitwidths of any of these constructs.

This is worsened in general by the fact that we may have `sext` instead
of `zext` for shift amounts, and we don't yet canonicalize to `zext`,
although we should. I have not done anything about that here.

Also, we really should have something to weed out `sub` like these,
by folding them into `add` variant.

https://bugs.llvm.org/show_bug.cgi?id=42389

llvm-svn: 373964
2019-10-07 20:53:27 +00:00
Roman Lebedev 0c73be590e [InstCombine] Move isSignBitCheck(), handle rest of the predicates
True, no test coverage is being added here. But those non-canonical
predicates that are already handled here have no test coverage
as far as i can tell. I tried to add tests for them, but all the patterns
already get handled elsewhere.

llvm-svn: 373962
2019-10-07 20:53:08 +00:00
Roman Lebedev cb6d851bb6 [InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask
Summary:
Currently, we pre-check whether we need to produce a mask or not.
This involves some rather magical constants.
I'd like to extend this fold to also handle the situation
when there's also a `trunc` before outer shift.
That will require another set of magical constants.
It's ugly.

Instead, we can just compute the mask, and check
whether the mask is a pass-through (all-ones) or not.
This way we don't need to have any magical numbers.

This change is NFC other than the fact that we now compute
the mask and then check if we need (and can!) apply it.

Reviewers: spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68470

llvm-svn: 373961
2019-10-07 20:53:00 +00:00
Roman Lebedev c3b394ffba [InstCombine] dropRedundantMaskingOfLeftShiftInput(): propagate undef shift amounts
Summary:
When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`,
which means that for patterns a/b we'd assume that we must not produce
any bits for that channel, while in reality we simply didn't care
about that channel - i.e. we don't need to mask it.

Reviewers: spatel

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68239

llvm-svn: 373960
2019-10-07 20:52:52 +00:00
Sanjay Patel aab8b3ab9c [InstCombine] fold fneg disguised as select+fmul (PR43497)
Extends rL373230 and solves the motivating bug (although in a narrow way):
https://bugs.llvm.org/show_bug.cgi?id=43497

llvm-svn: 373851
2019-10-06 14:15:48 +00:00
Sanjay Patel c38881a6b7 [InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform (PR43501)
https://bugs.llvm.org/show_bug.cgi?id=43501
We can't declare a GEP 'inbounds' in general. But we may salvage that information if
we have known dereferenceable bytes on the source pointer.

Differential Revision: https://reviews.llvm.org/D68244

llvm-svn: 373847
2019-10-06 13:08:08 +00:00
Roman Lebedev fb5af8b9b9 [InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> 'icmp sge/slt %x, 0'
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
  https://rise4fun.com/Alive/kPg
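
An illustrative IR sketch of the direct match (hypothetical types):

```
define i1 @src(i64 %x) {
  %s = lshr i64 %x, 63
  %t = trunc i64 %s to i32
  %c = icmp ne i32 %t, 0
  ret i1 %c
}
; => %c = icmp slt i64 %x, 0
```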

llvm-svn: 373802
2019-10-04 22:16:22 +00:00
Roman Lebedev f304d4d185 [InstCombine] Right-shift shift amount reassociation with truncation (PR43564, PR42391)
Initially (D65380) i believed that if we have rightshift-trunc-rightshift,
we can't do any folding. But as it usually happens, i was wrong.

https://rise4fun.com/Alive/GEw
https://rise4fun.com/Alive/gN2O

In https://bugs.llvm.org/show_bug.cgi?id=43564 we happen to have
this very sequence, of two right shifts separated by trunc.
And "just" so that happens, we apparently can fold the pattern
if the total shift amount is either 0, or it's equal to the bitwidth
of the innermost widest shift - i.e. if we are left with only the
original sign bit. Which is exactly what is wanted there.

llvm-svn: 373801
2019-10-04 22:16:11 +00:00
Guillaume Chatelet d400d45150 [Alignment][NFC] Remove StoreInst::setAlignment(unsigned)
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu, jdoerfert

Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68268

llvm-svn: 373595
2019-10-03 13:17:21 +00:00
Roman Lebedev ae3315af07 [InstCombine] Bypass high bit extract before variable sign-extension (PR43523)
https://rise4fun.com/Alive/8BY - valid for lshr+trunc+variable sext
https://rise4fun.com/Alive/7jk - the variable sext can be redundant

https://rise4fun.com/Alive/Qslu - 'exact'-ness of first shift can be preserved

https://rise4fun.com/Alive/IF63 - without trunc we could view this as
                                  more general "drop redundant mask before right-shift",
                                  but let's handle it here for now
https://rise4fun.com/Alive/iip - likewise, without trunc, variable sext can be redundant.

There's more patterns for sure - e.g. we can have 'lshr' as the final shift,
but that might be best handled by some more generic transform, e.g.
"drop redundant masking before right-shift" (PR42456)

I'm singling-out this sext patch because you can only extract
high bits with `*shr` (unlike abstract bit masking),
and i *know* this fold is wanted by existing code.

I don't believe there is much to review here,
so i'm gonna opt into post-review mode here.

https://bugs.llvm.org/show_bug.cgi?id=43523

llvm-svn: 373542
2019-10-02 23:02:12 +00:00
Roman Lebedev 053014f8f9 [InstCombine] Deal with -(trunc(X >>u 63)) -> trunc(X >>s 63)
Identical to its trunc-less variant: just pretend to hoist the
trunc, and everything else still holds:
https://rise4fun.com/Alive/JRU
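
An illustrative IR sketch (hypothetical types):

```
define i32 @src(i64 %x) {
  %s = lshr i64 %x, 63
  %t = trunc i64 %s to i32
  %r = sub i32 0, %t
  ret i32 %r
}
; => %s2 = ashr i64 %x, 63
;    %r  = trunc i64 %s2 to i32
```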

llvm-svn: 373364
2019-10-01 17:50:20 +00:00
Roman Lebedev 65144149d0 [InstCombine] Preserve 'exact' in -(X >>u 31) -> (X >>s 31) fold
https://rise4fun.com/Alive/yR4

llvm-svn: 373363
2019-10-01 17:50:09 +00:00
Roman Lebedev faa90eca63 [InstCombine][NFC] visitShl(): call SimplifyQuery::getWithInstruction() once
llvm-svn: 373249
2019-09-30 19:16:00 +00:00
Sanjay Patel 712b7c2463 [InstCombine] fold negate disguised as select+mul
Name: negate if true
  %sel = select i1 %cond, i32 -1, i32 1
  %r = mul i32 %sel, %x
  =>
  %m = sub i32 0, %x
  %r = select i1 %cond, i32 %m, i32 %x

  Name: negate if false
  %sel = select i1 %cond, i32 1, i32 -1
  %r = mul i32 %sel, %x
  =>
  %m = sub i32 0, %x
  %r = select i1 %cond, i32 %x, i32 %m

https://rise4fun.com/Alive/Nlh

llvm-svn: 373230
2019-09-30 17:02:26 +00:00
Guillaume Chatelet ab11b9188d [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
2019-09-30 13:34:44 +00:00
Guillaume Chatelet 17380227e8 [Alignment][NFC] Remove LoadInst::setAlignment(unsigned)
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jdoerfert

Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68142

llvm-svn: 373195
2019-09-30 09:37:05 +00:00
Roman Lebedev 269f1bea0d [InstCombine] Simplify shift-by-sext to shift-by-zext
Summary:
This is valid for any `sext` bitwidth pair:
```
Processing /tmp/opt.ll..

----------------------------------------
  %signed = sext %y
  %r = shl %x, %signed
  ret %r
=>
  %unsigned = zext %y
  %r = shl %x, %unsigned
  ret %r
  %signed = sext %y

Done: 2016
Optimization is correct!
```

(This isn't so for funnel shifts, there it's illegal for e.g. i6->i7.)

Main motivation is the C++ semantics:
```
int shl(int a, char b) {
    return a << b;
}
```
ends as
```
  %3 = sext i8 %1 to i32
  %4 = shl i32 %0, %3
```
https://godbolt.org/z/0jgqUq
which is, as this shows, too pessimistic.

There is another problem here - we can only do the fold
if sext is one-use. But we can trivially have cases
where several shifts have the same sext shift amount.
This should be resolved, later.

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: efriedma, hiraditya, nlopes, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68103

llvm-svn: 373106
2019-09-27 18:12:15 +00:00
Craig Topper 46721bb7f5 [InstCombine] Use m_Zero instead of isNullValue() when checking if a GEP index is all zeroes to prevent an infinite loop.
The test case here previously looped infinitely. Only one element from the GEP is used, so SimplifyDemandedVectorElts would replace the other lanes in each index with undef, leading to the first index being <0, undef, undef, undef>. But there's a GEP transform that tries to replace an index into a 0-sized type with a zero index, and that zero-index check only works on ConstantInt 0 or ConstantAggregateZero, so it would turn the index back into zeroinitializer - resulting in a loop.

The fix is to use m_Zero() to allow a vector of zeroes and undefs.

Differential Revision: https://reviews.llvm.org/D67977

llvm-svn: 373000
2019-09-26 17:20:50 +00:00