Commit Graph

160 Commits

Author SHA1 Message Date
Sanjay Patel 8869a98e82 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

llvm-svn: 361576
2019-05-24 00:13:58 +00:00
Sanjay Patel 093c922205 [InstCombine] remove redundant fold for extractelement; NFC
The out-of-bounds index pattern is handled by InstSimplify,
so the extractelement should be eliminated next time it is
visited.

llvm-svn: 361570
2019-05-23 23:33:38 +00:00
Sanjay Patel 4d4df6f144 [InstCombine] remove redundant fold for insertelement; NFC
The out-of-bounds index pattern is handled by InstSimplify.

llvm-svn: 361569
2019-05-23 23:33:34 +00:00
Sanjay Patel e60cb7d1be [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

llvm-svn: 361559
2019-05-23 21:49:47 +00:00
Sanjay Patel 3249be1e03 [InstCombine] be more careful when transforming a shuffle mask
This is reduced from a fuzzer test:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890

Usually, demanded elements should be able to simplify shuffle
mask elements that are pointing to undef elements of its source
operands, but that doesn't happen in the test case.

llvm-svn: 361533
2019-05-23 18:46:03 +00:00
Sanjay Patel 6a554188aa [InstCombine] fold shuffles of insert_subvectors
This should be a valid exception to the general rule of not creating new shuffle masks in IR...
because we already do it. :)
Also, DAG combining/legalization will undo this by widening the shuffle back out if needed.

Explanation for how we already do this: SLP or vector source can create chains of insert/extract
as shown in 1 of the examples from PR16739:
https://godbolt.org/z/NlK7rA
https://bugs.llvm.org/show_bug.cgi?id=16739

And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles.

Differential Revision: https://reviews.llvm.org/D62024

llvm-svn: 361338
2019-05-22 00:32:25 +00:00
Sanjay Patel 926e47751b [InstCombine] move bitcast after insertelement-with-bitcasted-operands
llvm-svn: 361058
2019-05-17 18:06:12 +00:00
Sanjay Patel b33938df7a [InstCombine] remove overzealous assert for shuffles (PR41419)
As the TODO indicates, instsimplify could be improved.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=41419

llvm-svn: 357910
2019-04-08 13:28:29 +00:00
Mikael Holmen 150a7ec2dc [InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder
Summary:
This fixes PR41270.

The recursive function evaluateInDifferentElementOrder expects to be called
on a vector Value, so when we call it on a vector GEP's arguments, we must
first check that the argument is indeed a vector.

Reviewers: reames, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60058

llvm-svn: 357389
2019-04-01 14:10:10 +00:00
Mikael Holmen 3e527cd823 Revert "[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder"
This reverts commit 75216a6dbcfe5fb55039ef06a07e419fa875f4a5.

I'll recommit with a better commit message with reference to the
phabricator review.

llvm-svn: 357387
2019-04-01 14:06:45 +00:00
Mikael Holmen d66a47f90a [InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder
This fixes PR41270.

The recursive function evaluateInDifferentElementOrder expects to be called
on a vector Value, so when we call it on a vector GEP's arguments, we must
first check that the argument is indeed a vector.

llvm-svn: 357385
2019-04-01 13:48:56 +00:00
Sanjay Patel b276dd195a [InstCombine] canonicalize select shuffles by commuting
In PR41304:
https://bugs.llvm.org/show_bug.cgi?id=41304
...we have a case where we want to fold a binop of select-shuffle (blended) values.

Rather than try to match commuted variants of the pattern, we can canonicalize the
shuffles and check for mask equality with commuted operands.

We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a
special case that the backend is required to handle because we already canonicalize
vector select to this shuffle form.

So there should be no codegen difference from this change. It's possible that this
improves CSE in IR though.

Differential Revision: https://reviews.llvm.org/D60016

llvm-svn: 357366
2019-03-31 15:01:30 +00:00
Sanjay Patel 3f4d1b4abd [InstCombine] move shuffle canonicalizations before other transforms
This may not be NFC, but I'm not sure how to expose any diffs in
tests. In theory, it should be slightly more efficient and possibly
more profitable to do the canonicalizations (which can increase the
undef elements in the mask) ahead of SimplifyDemandedVectorElts().

llvm-svn: 357272
2019-03-29 16:49:38 +00:00
Sanjay Patel cddb1e5469 [InstCombine] limit extracting shuffle transform based on uses
As discussed in D53037, this can lead to worse codegen, and we
don't generally expect the backend to be able to optimize
arbitrary shuffles. If there's only one use of the 1st shuffle,
that means it's getting removed, so that should always be
safe.

llvm-svn: 353235
2019-02-05 22:58:45 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Sanjay Patel e51d5bdb3c [InstCombine] refactor isCheapToScalarize(); NFC
As the FIXME indicates, this has the potential to go
overboard. So I'm not sure if it's even worth keeping 
this vs. iteratively doing simple matches, but we might 
as well clean it up.

llvm-svn: 349523
2018-12-18 19:07:38 +00:00
Matt Arsenault 9ccde61f81 InstCombine: Scalarize single use icmp/fcmp
llvm-svn: 348801
2018-12-10 21:50:54 +00:00
Sanjay Patel 998ececef0 [InstCombine] remove dead code from visitExtractElement
Extracting from a splat constant is always handled by InstSimplify.
Move the test for this from InstCombine to InstSimplify to make
sure that stays true.

llvm-svn: 348423
2018-12-05 23:09:33 +00:00
Sanjay Patel 47b3b4b5aa [InstCombine] reduce duplication in visitExtractElementInst; NFC
llvm-svn: 348418
2018-12-05 21:57:51 +00:00
Sanjay Patel b12e410082 [InstCombine] try to turn shuffle into insertelement
shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'

The motivating case is at least a couple of steps away: I noticed that
SLPVectorizer does not analyze shuffles as well as sequences of 
insert/extract in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
...so SLP may fail to vectorize when source code has shuffles to start 
with or instcombine has converted insert/extract to shuffles.

Independent of that, an insertelement is always a simpler op for IR 
analysis vs. a shuffle, so we should transform to insert when possible.

I don't think there's any codegen concern here - if a target can't insert 
a scalar directly to some fixed element in a vector (x86?), then this 
should get expanded to the insert+shuffle that we started with.

Differential Revision: https://reviews.llvm.org/D53507

llvm-svn: 345607
2018-10-30 15:26:39 +00:00
Sanjay Patel 0522b0da31 [InstCombine] use 'match' to simplify code; NFC
llvm-svn: 344855
2018-10-20 17:15:57 +00:00
Sanjay Patel ec572ade20 [InstCombine] make code more flexible with lambda; NFC
I couldn't tell from svn history when these checks were added,
but it pre-dates the split of instcombine into its own directory
at rL92459.

The motivation for changing the check is partly shown by the
code in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724

There are also existing regression tests for SLPVectorizer with
sequences of extract+insert that are likely assumed to become
shuffles by the vectorizer cost models.

llvm-svn: 344854
2018-10-20 16:58:27 +00:00
Sanjay Patel 729c4362cf [InstCombine] add explanatory comment for strange vector logic; NFC
llvm-svn: 344852
2018-10-20 16:25:55 +00:00
Sanjay Patel 7181146c6c [InstCombine] combine a shuffle and an extract subvector shuffle
This is part of the missing IR-level folding noted in D52912.
This should be ok as a canonicalization because the new shuffle mask can't
be any more complicated than the existing shuffle mask. If there's some 
target where the shorter vector shuffle is not legal, it should just end up 
expanding to something like the pair of shuffles that we're starting with here.

Differential Revision: https://reviews.llvm.org/D53037

llvm-svn: 344476
2018-10-14 15:25:06 +00:00
Sanjay Patel 58fc00d0bc revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
This commit accidentally included the diffs from D53057.

llvm-svn: 344178
2018-10-10 20:39:39 +00:00
Sanjay Patel e9ca7ea3e5 [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
icmp ne (and X, 1), 0 --> trunc X to N x i1

Ideally, we'd do the same for scalars, but there will likely be 
regressions unless we add more trunc folds as we're doing here 
for vectors.

The motivating vector case is from PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
  %c = fcmp ole <4 x float> %x, %y
  %s = sext <4 x i1> %c to <4 x i32>
  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
  %cond = or <4 x i32> %s1, %s2
  %condtr = trunc <4 x i32> %cond to <4 x i1>
  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
  ret <4 x float> %r
}

Here's a sampling of the vector codegen for that case using 
mask+icmp (current behavior) vs. trunc (with this patch):

AVX before:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vandps	LCPI0_0(%rip), %xmm0, %xmm0
vxorps	%xmm1, %xmm1, %xmm1
vpcmpeqd	%xmm1, %xmm0, %xmm0
vblendvps	%xmm0, %xmm3, %xmm2, %xmm0

AVX after:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vblendvps	%xmm0, %xmm2, %xmm3, %xmm0

AVX512f before:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vpbroadcastd	LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
vptestnmd	%zmm1, %zmm0, %k1
vblendmps	%zmm3, %zmm2, %zmm0 {%k1}

AVX512f after:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vpslld	$31, %xmm0, %xmm0
vptestmd	%zmm0, %zmm0, %k1
vblendmps	%zmm2, %zmm3, %zmm0 {%k1}

AArch64 before:

fcmge	v0.4s, v1.4s, v0.4s
zip1	v1.4s, v0.4s, v0.4s
zip2	v0.4s, v0.4s, v0.4s
orr	v0.16b, v1.16b, v0.16b
movi	v1.4s, #1
and	v0.16b, v0.16b, v1.16b
cmeq	v0.4s, v0.4s, #0
bsl	v0.16b, v3.16b, v2.16b

AArch64 after:

fcmge	v0.4s, v1.4s, v0.4s
zip1	v1.4s, v0.4s, v0.4s
zip2	v0.4s, v0.4s, v0.4s
orr	v0.16b, v1.16b, v0.16b
bsl	v0.16b, v2.16b, v3.16b

PowerPC-le before:

xvcmpgesp 34, 35, 34
vspltisw 0, 1
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxlxor 35, 35, 35
xxland 34, 0, 32
vcmpequw 2, 2, 3
xxsel 34, 36, 37, 34

PowerPC-le after:

xvcmpgesp 34, 35, 34
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxsel 34, 37, 36, 0

Differential Revision: https://reviews.llvm.org/D52747

llvm-svn: 344082
2018-10-09 21:26:01 +00:00
Sanjay Patel 88194dfe1a [InstCombine] make helper function 'static'; NFC
llvm-svn: 344056
2018-10-09 15:29:26 +00:00
Sanjay Patel 3746e11abe [InstCombine] allow bitcast to/from FP for vector insert/extract transform
This is a follow-up to rL343482 / D52439.
This was a pattern that initially caused the commit to be reverted because
the transform requires a bitcast as shown here.

llvm-svn: 343794
2018-10-04 16:25:05 +00:00
Sanjay Patel 31b07198f1 [InstCombine] try to convert vector insert+extract to trunc; 2nd try
This was originally committed at rL343407, but reverted at 
rL343458 because it crashed trying to handle a case where
the destination type is FP. This version of the patch adds
a check for that possibility. Tests added at rL343480.

Original commit message:

This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the
extract index doesn't correspond to the LSBs of the scalar, then we
have to shift-right before the truncate. Endian-ness makes this tricky,
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343482
2018-10-01 14:40:00 +00:00
Hans Wennborg a60aa91374 Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"
This caused Chromium builds to fail with "Illegal Trunc" assertion.
See https://crbug.com/890723 for repro.

> This transform is requested for the backend in:
> https://bugs.llvm.org/show_bug.cgi?id=39016
> ...but I figured it was worth doing in IR too, and it's probably
> easier to implement here, so that's this patch.
>
> In the simplest case, we are just truncating a scalar value. If the
> extract index doesn't correspond to the LSBs of the scalar, then we
> have to shift-right before the truncate. Endian-ness makes this tricky,
> but hopefully the ASCII-art helps visualize the transform.
>
> Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343458
2018-10-01 12:07:45 +00:00
Sanjay Patel 1e0f1f645a [InstCombine] try to convert vector insert+extract to trunc
This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably 
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the 
extract index doesn't correspond to the LSBs of the scalar, then we 
have to shift-right before the truncate. Endian-ness makes this tricky, 
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439

llvm-svn: 343407
2018-09-30 14:34:01 +00:00
Sanjay Patel 26c119a9c2 [InstCombine] allow lengthening of insertelement to eliminate shuffles
As noted in post-commit comments for D52548, the limitation on 
increasing vector length can be applied by opcode.
As a first step, this patch only allows insertelement to be
widened because that has no logical downsides for IR and has 
little risk of pessimizing codegen.

This may cause PR39132 to go into hiding during a full compile,
but that bug is not fixed.

llvm-svn: 343406
2018-09-30 13:50:42 +00:00
Sanjay Patel 54d31ef87e [InstCombine] fix formatting in vector evaluators; NFC
We need to alter the functionality as shown in D52548.

llvm-svn: 343379
2018-09-29 15:05:24 +00:00
Sanjay Patel 242f90fe82 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548

llvm-svn: 343329
2018-09-28 15:24:41 +00:00
Sanjay Patel 4674c7765d [InstCombine] add bitcast+extelt helper function; NFC
We can handle patterns where the elements have different 
sizes, so refactoring ahead of trying to add another blob
within these clauses.

llvm-svn: 342918
2018-09-24 20:41:22 +00:00
Sanjay Patel 7a52626a08 [InstCombine] improve variable name and use 'match'; NFC
'width' of a vector usually refers to the bit-width.

https://bugs.llvm.org/show_bug.cgi?id=39016
shows a case where we could extend this fold to handle
a case where the number of elements in the bitcasted
vector is not equal to the resulting value.

llvm-svn: 342902
2018-09-24 16:39:03 +00:00
Sanjay Patel c1416b60f2 [InstCombine] narrow vector select with padded condition and extracted result (PR38691)
shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.

Differential Revision: https://reviews.llvm.org/D51496

llvm-svn: 341708
2018-09-07 21:03:34 +00:00
Sanjay Patel d4e19d272a [InstCombine] move declarations closer to uses; NFC
llvm-svn: 340930
2018-08-29 14:42:12 +00:00
Sanjay Patel 7a05641fa8 [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.

llvm-svn: 340926
2018-08-29 13:24:34 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Sanjay Patel c8d9d812ec [InstCombine] allow flag propagation when using safe constant
This corresponds with the code for the single binop pattern
added in rL336684.

llvm-svn: 336696
2018-07-10 16:09:49 +00:00
Sanjay Patel 509a1e7a9b [InstCombine] safely allow non-commutative binop identity constant folds
This was originally intended with D48893, but as discussed there, we
have to make the folds safe from producing extra poison. This should
give the single binop folds the same capabilities as the existing
folds for 2-binops+shuffle.

LLVM binary opcode review: there are a total of 18 binops. There are 7 
commutative binops (add, mul, and, or, xor, fadd, fmul) which we already 
fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr,
fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't 
bother with sub/fsub with constant operand 1 because those are 
canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18.

llvm-svn: 336684
2018-07-10 15:12:31 +00:00
Sanjay Patel 3333106a62 [InstCombine] drop poison flags when shuffle mask undef propagates to constant
llvm-svn: 336679
2018-07-10 14:27:55 +00:00
Sanjay Patel 06ea4206ad [InstCombine] allow more shuffle-binop folds with safe constants
The case with 2 variables is more complicated than the case where
we eliminate the shuffle entirely because a shuffle with an undef 
mask element creates an undef result. 

I'm not aware of any current analysis/transform that recognizes that 
undef propagating to a div/rem/shift, but we have to guard against 
the possibility.

llvm-svn: 336668
2018-07-10 13:33:26 +00:00
Sanjay Patel 69faf464ed [InstCombine] allow more shuffle folds using safe constants
getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming
that the constant we wanted was operand 1 (RHS). That's wrong, but I
don't think we could expose a bug or even a suboptimal fold from that
because the callers have other guards for any binop that would have
been affected.

llvm-svn: 336617
2018-07-09 23:22:47 +00:00
Sanjay Patel a62725317b [InstCombine] generalize safe vector constant utility
This is almost NFC, but there could be some case where the original
code had undefs in the constants (rather than just the shuffle mask),
and we'll use safe constants rather than undefs now.

The FIXME noted in foldShuffledBinop() is already visible in existing
tests, so correcting that is the next step.

llvm-svn: 336558
2018-07-09 16:16:51 +00:00
Sanjay Patel 5bd36644c8 [InstCombine] fix shuffle-of-binops transform to avoid poison/undef
As noted in D48987, there are many different ways for this transform to go wrong. 
In particular, the poison potential for shifts means we have to more careful with those ops. 
I added tests to make that behavior visible for all of the different cases that I could find.

This is a partial fix. To make this review easier, I did not make changes for the single binop 
pattern (handled in foldSelectShuffleWith1Binop()). I also left out some potential optimizations 
noted with TODO comments. I'll follow-up once we're confident that things are correct here.

The goal is to correct all marked FIXME tests to either avoid the shuffle transform or do it safely.

Note that distinguishing when the shuffle mask contains undefs and using getBinOpIdentity() allows 
for some improvements to div/rem patterns, so there are wins along with the missed opportunities 
and fixes.

Differential Revision: https://reviews.llvm.org/D49047

llvm-svn: 336546
2018-07-09 13:21:46 +00:00
Sanjay Patel 3074b9e53f [InstCombine] fold shuffle-with-binop and common value
This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments 
in this patch to complete this transform.

It's possible that a binop feeding a select-shuffle has been eliminated 
by earlier transforms (or the code was just written like this in the 1st 
place), so we'll fail to match the patterns that have 2 binops from: 
D48401, 
D48678, 
D48662, 
D48485.

In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass 
through the original values of the source operand).

I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. 
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).

Differential Revision: https://reviews.llvm.org/D48830

llvm-svn: 336196
2018-07-03 13:44:22 +00:00
Sanjay Patel b999d74132 [InstCombine] reverse canonicalization of add --> or to allow more shuffle folding
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)

Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving 
to where other passes could use that functionality.

Differential Revision: https://reviews.llvm.org/D48662

llvm-svn: 336128
2018-07-02 17:42:29 +00:00
Sanjay Patel da66753e01 [InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
This was discussed in D48401 as another improvement for:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have 2 different variable values, then we shuffle (select) those lanes, 
shuffle (select) the constants, and then perform the binop. This eliminates a binop.

The new shuffle uses the same shuffle mask as the existing shuffle, so there's no 
danger of creating a difficult shuffle.

All of the earlier constraints still apply, but we also check for extra uses to 
avoid creating more instructions than we'll remove.

Additionally, we're disallowing the fold for div/rem because that could expose a
UB hole.

Differential Revision: https://reviews.llvm.org/D48678

llvm-svn: 335974
2018-06-29 13:44:06 +00:00