Commit Graph

132 Commits

Author SHA1 Message Date
Sanjay Patel f7b851fe84 [InstCombine] allow non-splat folds of select cond (ext X), C
llvm-svn: 282906
2016-09-30 19:49:22 +00:00
Sanjay Patel 453ceff261 [InstCombine] fix function names; NFC
Also, make foldSelectExtConst() a member of InstCombiner, remove
unnecessary parameters from its interface, and group visitSelectInst
helpers together in the header file.

llvm-svn: 282796
2016-09-29 22:18:30 +00:00
Sanjay Patel ccc2927b69 fix formatting; NFC
llvm-svn: 282737
2016-09-29 17:48:19 +00:00
Sanjay Patel f26710d97d [InstCombine] canonicalize vector select with constant vector condition to shuffle
As discussed on llvm-dev ( http://lists.llvm.org/pipermail/llvm-dev/2016-August/104210.html ): 
turn a vector select with constant condition operand into a shuffle as a canonicalization step.
Shuffles may be easier to reason about in conjunction with other shuffles and insert/extract.

Possible known (minor?) regressions from this change are filed as:
https://llvm.org/bugs/show_bug.cgi?id=28530 
https://llvm.org/bugs/show_bug.cgi?id=28531 
https://llvm.org/bugs/show_bug.cgi?id=30371

If something terrible happens to perf after this commit, feel free to revert until a backend
fix is in place.

Differential Revision: https://reviews.llvm.org/D24279

llvm-svn: 281787
2016-09-16 22:16:18 +00:00
Sanjay Patel 4e463b4a2c fix formatting; NFC
llvm-svn: 280727
2016-09-06 18:16:31 +00:00
Xinliang David Li cad3a995a4 [Profile] Propagate branch metadata properly in instcombine
Differential Revision: http://reviews.llvm.org/D23590

llvm-svn: 279693
2016-08-25 00:26:32 +00:00
Nicolai Haehnle 870bf1788c [InstCombine] try to fold (select C, (sext A), B) into logical ops
Summary:
Turn (select C, (sext A), B) into (sext (select C, A, B')) when A is i1 and
B is a compatible constant, also for zext instead of sext. This will then be
further folded into logical operations.

The transformation would be valid for non-i1 types as well, but other parts of
InstCombine prefer to have sext from non-i1 as an operand of select.

Motivated by the shader compiler frontend in Mesa for AMDGPU, which emits i32
for boolean operations. With this change, the boolean logic is fully
recovered.

Reviewers: majnemer, spatel, tstellarAMD

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22747

llvm-svn: 277801
2016-08-05 08:22:29 +00:00
Justin Bogner 9979840f59 InstCombine: Replace some never-null pointers with references. NFC
llvm-svn: 277792
2016-08-05 01:06:44 +00:00
Sanjay Patel 5f3c70307d [InstSimplify][InstCombine] don't crash when folding vector selects of icmp
Differential Revision: https://reviews.llvm.org/D22602

llvm-svn: 276209
2016-07-20 23:40:01 +00:00
Sanjay Patel 25600f39eb save type in local var; NFCI
llvm-svn: 274760
2016-07-07 15:28:17 +00:00
Sanjay Patel 65a51c25c1 [InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors
By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we
allow these transforms for splat vectors.

Differential Revision: http://reviews.llvm.org/D21899

llvm-svn: 274696
2016-07-06 22:23:01 +00:00
Sanjay Patel ea23436638 [InstCombine] use more specific pattern matchers; NFCI
Follow-up from r274465: we don't need to capture the value in these cases, 
so just match the constant that we're looking for. m_One/m_Zero work with
vector splats as well as scalars.

llvm-svn: 274670
2016-07-06 21:01:26 +00:00
Sanjay Patel cbaac41856 [InstCombine] enable vector select of bools -> logic folds
llvm-svn: 274465
2016-07-03 14:34:39 +00:00
Sanjay Patel a1a4e100be fix formatting; NFC
llvm-svn: 274463
2016-07-03 14:08:19 +00:00
Sanjay Patel 216d8cf720 [InstCombine] allow more than one use for vector bitcast folding with selects
The motivating example for this transform is similar to D20774 where bitcasts interfere
with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to 
produce min and max ops:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %bc1 = bitcast <4 x float> %a to <4 x i32>
  %bc2 = bitcast <4 x float> %b to <4 x i32>
  %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2
  %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1
  %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>*
  store <4 x i32> %sel1, <4 x i32>* %bc3
  %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>*
  store <4 x i32> %sel2, <4 x i32>* %bc4
  ret void
}

With this patch, we move the selects up to use the input args which allows getting rid of
all of the bitcasts:

define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
  %cmp = fcmp olt <4 x float> %a, %b
  %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
  %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
  store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16
  store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16
  ret void
}

The asm for x86 SSE then improves from:

movaps  %xmm0, %xmm2
cmpltps %xmm1, %xmm2
movaps  %xmm2, %xmm3
andnps  %xmm1, %xmm3
movaps  %xmm2, %xmm4
andnps  %xmm0, %xmm4
andps %xmm2, %xmm0
orps  %xmm3, %xmm0
andps %xmm1, %xmm2
orps  %xmm4, %xmm2
movaps  %xmm0, (%rdi)
movaps  %xmm2, (%rsi)

To:

movaps  %xmm0, %xmm2
minps %xmm1, %xmm2
maxps %xmm0, %xmm1
movaps  %xmm2, (%rdi)
movaps  %xmm1, (%rsi)

The TODO comments show that we're limiting this transform only to vectors and only to bitcasts
because we need to improve other transforms or risk creating worse codegen.

Differential Revision: http://reviews.llvm.org/D21190

llvm-svn: 273011
2016-06-17 16:46:50 +00:00
Sanjay Patel 3929313811 [InstCombine] move fold of select of add/sub to helper function; NFCI
llvm-svn: 272199
2016-06-08 21:10:01 +00:00
Sanjay Patel 384d0f219d [InstCombine] fix outdated comment, simplify logic; NFCI
llvm-svn: 272196
2016-06-08 20:31:52 +00:00
Sanjay Patel 10a2c38d83 [InstCombine] reduce indent; NFC
llvm-svn: 272193
2016-06-08 20:09:04 +00:00
Sanjay Patel 916f8a0cdb [InstCombine] use copyIRFlags() ; NFCI
llvm-svn: 272191
2016-06-08 19:33:52 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Chad Rosier cd62bf5821 [InstCombine] Determine the result of a select based on a dominating condition.
Differential Revision: http://reviews.llvm.org/D19550

llvm-svn: 268104
2016-04-29 21:12:31 +00:00
David Majnemer 56737722e4 [InstCombine] Fix miscompile in FoldSPFofSPF
We had a select of a cast of a select but attempted to replace the outer
select with the inner select dispite their incompatible types.

Patch by Anton Korobeynikov!

This fixes PR27236.

llvm-svn: 265805
2016-04-08 16:51:49 +00:00
Junmo Park 820964e9c6 Minor code cleanup. NFC.
llvm-svn: 264124
2016-03-23 01:38:35 +00:00
Sanjay Patel 4b198802b3 function names start with a lowercase letter; NFC
llvm-svn: 259425
2016-02-01 22:23:39 +00:00
Sanjay Patel a252815bc1 function names start with a lower case letter ; NFC
llvm-svn: 257496
2016-01-12 18:03:37 +00:00
Sanjoy Das 9fe86d90ab [InstCombine] Call getCmpPredicateForMinMax only with a valid SPF
Summary:
There are `SelectPatternFlavor`s that don't represent min or max idioms,
and we should not be passing those to `getCmpPredicateForMinMax`.

Fixes PR25745.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15249

llvm-svn: 254869
2015-12-05 23:44:22 +00:00
Sanjay Patel 6eccf487c9 don't repeat function names in comments; NFC
llvm-svn: 247154
2015-09-09 15:24:36 +00:00
James Molloy 134bec2722 Add support for floating-point minnum and maxnum
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.

matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.

C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.

ARM and AArch64 at least have different instructions for these different cases.

llvm-svn: 244580
2015-08-11 09:12:57 +00:00
David Majnemer 3f0fb98d01 [InstCombine, InstSimplify] Move xforms from Combine to Simplify
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.

llvm-svn: 239229
2015-06-06 22:40:21 +00:00
David Majnemer 468f670021 [InstCombine] Don't miscompile select to poison
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand.  However, doing so is only valid if the
computation doesn't inject poison into the computation.

It might be helpful to consider the following example:
  (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)

The select is equivalent to (add %i, 1) but not (add nsw %i, 1).

Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.

llvm-svn: 239215
2015-06-06 02:30:43 +00:00
Renato Golin 3dabb23384 Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least  on ARM, increasing the
compilation time (stage 2, clang's) by 5x.

llvm-svn: 239175
2015-06-05 18:24:12 +00:00
David Majnemer 6d8081835d [InstCombine] Rephrase fix to SimplifyWithOpReplaced
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
   it's simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
   rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
   the value.  One of the instructions is the original select, causing
   LLVM to never reach fixpoint.

Instead, strip the flags only when we are sure we are going to perform
the simplification.

llvm-svn: 239141
2015-06-05 09:57:57 +00:00
Daniel Jasper 917fa5ee66 Revert "[InstCombine] Don't miscompile safe increment idiom"
This is breaking a lot of build bots and is causing very long-running
compiles (infinite loops)?

Likely, we shouldn't return nullptr?

llvm-svn: 239139
2015-06-05 09:31:20 +00:00
David Majnemer 00f7d9ecc8 [InstCombine] Don't miscompile safe increment idiom
We cleverly handle cases where computation done in one argument of a select
instruction is suitable for the other operand, thus obviating the need
of the select and the comparison.  However, the other operand cannot
have flags.

This fixes PR23757.

llvm-svn: 239115
2015-06-04 23:11:30 +00:00
James Molloy 2b21a7cf36 Reapply r237539 with a fix for the Chromium build.
Make sure if we're truncating a constant that would then be sign extended
that the sign extension of the truncated constant is the same as the
original constant.

> Canonicalize min/max expressions correctly.
>
> This patch introduces a canonical form for min/max idioms where one operand
> is extended or truncated. This often happens when the other operand is a
> constant. For example:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = sext i32 %a to i64
> %3 = select i1 %1, i64 %2, i64 0
>
> Would now be canonicalized into:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = select i1 %1, i32 %a, i32 0
> %3 = sext i32 %2 to i64
>
> This builds upon a patch posted by David Majenemer
> (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
> passively stopped instcombine from ruining canonical patterns. This
> patch additionally actively makes instcombine canonicalize too.
>
> Canonicalization of expressions involving a change in type from int->fp
> or fp->int are not yet implemented.

llvm-svn: 237821
2015-05-20 18:41:25 +00:00
Hans Wennborg 2f21b8760e Revert r237539: "Reapply r237520 with another fix for infinite looping"
This caused PR23583.

llvm-svn: 237739
2015-05-19 23:06:30 +00:00
James Molloy 53958e187a Reapply r237520 with another fix for infinite looping
SimplifyDemandedBits was "simplifying" a constant by removing just sign bits.
This caused a canonicalization race between different parts of instcombine.

Fix and regression test added - third time lucky?

llvm-svn: 237539
2015-05-17 08:27:27 +00:00
James Molloy e8698ae3e1 Revert commits r237521 and r237520.
The AArch64 LNT bot is unhappy - I've found that the problem is in
SimpliftDemandedBits, but that's going to require another code review
so reverting in the meantime.

llvm-svn: 237528
2015-05-16 21:27:14 +00:00
James Molloy b5aa200a33 Reapply r237453 with a fix for the test timeouts.
The test timeouts were due to instcombine fighting itself. Regression test added.
Original log message:

Canonicalize min/max expressions correctly.

This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
    %2 = sext i32 %a to i64
      %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
    %2 = select i1 %1, i32 %a, i32 0
      %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237520
2015-05-16 13:10:45 +00:00
James Molloy 1675b4a57f Revert "Canonicalize min/max expressions correctly."
This reverts r237453 - it was causing timeouts on some bots. Reverting
while I investigate (it's probably InstCombine fighting itself...)

llvm-svn: 237458
2015-05-15 17:45:09 +00:00
James Molloy 6edf0b4cd4 Canonicalize min/max expressions correctly.
This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
  %2 = sext i32 %a to i64
  %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
  %2 = select i1 %1, i32 %a, i32 0
  %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237453
2015-05-15 16:10:59 +00:00
James Molloy 71b91c2dba Rip min/max pattern matching out of InstCombine and into
ValueTracking.

This matching functionality is useful in more than just InstCombine, so
make it available in ValueTracking.

NFC.

llvm-svn: 236998
2015-05-11 14:42:20 +00:00
Sanjoy Das 08e95b4703 [InstCombine] Add new rule for MIN(MAX(~A, ~B), ~C) et. al.
Summary:
Optimizing these well are especially interesting for IRCE since it
"clamps" values by generating this sort of pattern through SCEV
expressions.

Depends on D9352.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9353

llvm-svn: 236203
2015-04-30 04:56:04 +00:00
Sanjoy Das a8c178f280 [InstCombine] Add a new formula for SMIN.
Summary:
After this change `MatchSelectPattern` recognizes the following form
of SMIN:

  Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C)

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9352

llvm-svn: 236202
2015-04-30 04:56:00 +00:00
Mehdi Amini a28d91d81b DataLayout is mandatory, update the API to reflect it with references.
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.

This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.

I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.

I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.

Test Plan:

Reviewers: echristo

Subscribers: llvm-commits

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
2015-03-10 02:37:25 +00:00
David Blaikie dc3f01e9cf Simplify expressions involving boolean constants with clang-tidy
Patch by Richard (legalize at xmission dot com).

Differential Revision: http://reviews.llvm.org/D8154

llvm-svn: 231617
2015-03-09 01:57:13 +00:00
David Majnemer 1bacc0abc9 InstCombine: Ensure select condition types are identical before merging
Selection conditions may be vectors or scalars.  Make sure InstCombine
doesn't indiscriminately assume that a select which is value dependent
on another select have identical select condition types.

This fixes PR22773.

llvm-svn: 231156
2015-03-03 22:40:36 +00:00
Sanjoy Das 82ea3d45b5 New instcombine rule: max(~a,~b) -> ~min(a, b)
This case is interesting because ScalarEvolutionExpander lowers min(a,
b) as ~max(~a,~b).  I think the profitability heuristics can be made
more clever/aggressive, but this is a start.

Differential Revision: http://reviews.llvm.org/D7821

llvm-svn: 230285
2015-02-24 00:08:41 +00:00
Andrea Di Biagio 30d471f6aa [InstCombine] Fix regression introduced at r227197.
This patch fixes a problem I accidentally introduced in an instruction combine
on select instructions added at r227197. That revision taught the instruction
combiner how to fold a cttz/ctlz followed by a icmp plus select into a single
cttz/ctlz with flag 'is_zero_undef' cleared.

However, the new rule added at r227197 would have produced wrong results in the
case where a cttz/ctlz with flag 'is_zero_undef' cleared was follwed by a
zero-extend or truncate. In that case, the folded instruction would have
been inserted in a wrong location thus leaving the CFG in an inconsistent
state.

This patch fixes the problem and add two reproducible test cases to
existing test 'InstCombine/select-cmp-cttz-ctlz.ll'.

llvm-svn: 229124
2015-02-13 16:33:34 +00:00
Matthias Braun 2e404597f4 InstCombine: Combine select sequences into a single select
Normalize
select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b)
select(C0, a, select(C1, a, b)) -> select((C0 | C1), a, b)

This normal form may enable further combines on the And/Or and shortens
paths for the values. Many targets prefer the other but can go back
easily in CodeGen.

Differential Revision: http://reviews.llvm.org/D7399

llvm-svn: 228409
2015-02-06 17:49:36 +00:00