Commit Graph

3579 Commits

Author SHA1 Message Date
Bjorn Pettersson 8dd6cf711f [DebugInfo] Corrections for salvageDebugInfo
Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).

Consider this example
  %vla = alloca i32, i64 2
  call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())

Instcombine will turn it into
  %vla1 = alloca [2 x i32]
  %vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla, i64 0, i64 0
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression())

If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
  %vla1 = alloca [2 x i32]
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())

The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))

I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid to, unneccessarily, turn a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).

Reviewers: aprantl, vsk

Reviewed By: aprantl, vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48837

llvm-svn: 336191
2018-07-03 11:29:00 +00:00
Max Kazantsev 3097b76e8c [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
This patch changes order of transform in InstCombineCompares to avoid
performing transforms based on ranges which produce complex bit arithmetics
before more simple things (like folding with constants) are done. See PR37636
for the motivating example.

Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri

llvm-svn: 336172
2018-07-03 06:23:57 +00:00
Sanjay Patel b999d74132 [InstCombine] reverse canonicalization of add --> or to allow more shuffle folding
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)

Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving 
to where other passes could use that functionality.

Differential Revision: https://reviews.llvm.org/D48662

llvm-svn: 336128
2018-07-02 17:42:29 +00:00
Sanjay Patel 284ba0c18f [ValueTracking] allow undef elements when matching vector abs
llvm-svn: 336111
2018-07-02 14:43:40 +00:00
Sanjay Patel 951f617e16 [InstCombine] adjust shuffle tests with IR flags; NFC
Due to current limitations in constant analysis, we need flags
on add or mul to show propagation for the potential transform
suggested in these tests (no other binops currently report 
identity constants).

llvm-svn: 336101
2018-07-02 13:40:54 +00:00
Sanjay Patel d980084597 [InstCombine] add tests for shuffle-binop; NFC
This is another pattern mentioned in PR37806.

llvm-svn: 336096
2018-07-02 12:30:46 +00:00
Max Kazantsev 66da390506 [NFC] Test that shows unprofitability of instcombine with bit ranges
llvm-svn: 336078
2018-07-02 06:55:00 +00:00
Piotr Padlewski 5b3db45e8f Implement strip.invariant.group
Summary:
This patch introduce new intrinsic -
strip.invariant.group that was described in the
RFC: Devirtualization v2

Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar

Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47103

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 336073
2018-07-02 04:49:30 +00:00
Sanjay Patel 279a1a39ad [InstCombine] add abs tests with undef elts; NFC
llvm-svn: 336065
2018-07-01 17:14:37 +00:00
Sanjay Patel a9fdb9fd37 [PatternMatch] allow undef elements in vectors with m_Neg
This is similar to the m_Not change from D44076.

llvm-svn: 336064
2018-07-01 13:42:57 +00:00
Sanjay Patel 16a42ca274 [InstCombine] add tests for negate vector with undef elts; NFC
llvm-svn: 336050
2018-06-30 14:11:46 +00:00
Sanjay Patel 4491c0dd45 [InstCombine] add more tests for shuffle-binop folds; NFC
The mul+shl tests add coverage for the fold enabled with D48678.
The and+or tests are not handled yet; that's D48662.

llvm-svn: 335984
2018-06-29 15:28:11 +00:00
Sanjay Patel da66753e01 [InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
This was discussed in D48401 as another improvement for:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have 2 different variable values, then we shuffle (select) those lanes, 
shuffle (select) the constants, and then perform the binop. This eliminates a binop.

The new shuffle uses the same shuffle mask as the existing shuffle, so there's no 
danger of creating a difficult shuffle.

All of the earlier constraints still apply, but we also check for extra uses to 
avoid creating more instructions than we'll remove.

Additionally, we're disallowing the fold for div/rem because that could expose a
UB hole.

Differential Revision: https://reviews.llvm.org/D48678

llvm-svn: 335974
2018-06-29 13:44:06 +00:00
Sanjay Patel 019d3cd3b4 [InstCombine] adjust shuffle tests; NFC
Use xor for the extra uses test because div/rem have other problems.

llvm-svn: 335924
2018-06-28 21:14:02 +00:00
Sanjay Patel 57bda365bf [InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)
This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806

We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other 
direction because that's generally better of course). This allows us to remove the shuffle 
as we do in the regular opcodes-are-the-same cases.

This requires a small hack to make sure we don't introduce any extra poison:
https://rise4fun.com/Alive/ZGv

Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already 
canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There 
are planned enhancements for opcode transforms such or -> add.

Note that there's a different fold needed if we've already managed to simplify away a binop 
as seen in the test based on PR37806, but we manage to get that one case here because this 
fold is positioned above the demanded elements fold currently.

Differential Revision: https://reviews.llvm.org/D48485

llvm-svn: 335888
2018-06-28 17:48:04 +00:00
Sanjay Patel 1ef49be8b6 [InstCombine] add tests for vector-select-of-binops with 2 variables; NFC
llvm-svn: 335778
2018-06-27 20:23:47 +00:00
Sanjay Patel 7e45aebe55 [InstCombine] add more tests for shuffle with different binops; NFC
llvm-svn: 335756
2018-06-27 17:21:57 +00:00
Craig Topper 31cbe75b3b [X86] Rename the autoupgraded of packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name.
I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking builtin. When we reach that goal, we should have no intrinsics named "avx512.mask".

llvm-svn: 335744
2018-06-27 15:57:53 +00:00
Vedant Kumar f6c0b41fb7 [InstCombine] Avoid creating mis-sized dbg.values in commonCastTransforms()
This prevents InstCombine from creating mis-sized dbg.values when
replacing a sequence of casts with a simpler cast. For example, in:

  (fptrunc (floor (fpext X))) -> (floorf X)

We no longer emit dbg.value(X) (with a 32-bit float operand) to describe
(fpext X) (which is a 64-bit float).

This was diagnosed by the debugify check added in r335682.

llvm-svn: 335696
2018-06-27 00:47:53 +00:00
Matt Arsenault 7e991d30c0 ConstantFold: Don't fold global address vs. null for addrspace != 0
Not sure why this logic seems to be repeated in 2 different places,
one called by the other.

On AMDGPU addrspace(3) globals start allocating at 0, so these
checks will be incorrect (not that real code actually tries
to compare these addresses)

llvm-svn: 335649
2018-06-26 18:55:43 +00:00
Sanjay Patel 3575f0c0b3 [InstCombine] fold urem with sext bool divisor
Similar to other patches in this series:
https://reviews.llvm.org/rL335512
https://reviews.llvm.org/rL335527
https://reviews.llvm.org/rL335597
https://reviews.llvm.org/rL335616

...this is filling a gap in analysis that is exposed by an unrelated select-of-constants transform.
I didn't see a way to unify the sext cases because each div/rem opcode results in a different fold.

Note that in this case, the backend might want to convert the select into math:
Name: sext urem
%e = sext i1 %x to i32
%r = urem i32 %y, %e
=>
%c = icmp eq i32 %y, -1
%z = zext i1 %c to i32
%r = add i32 %z, %y

llvm-svn: 335622
2018-06-26 16:30:00 +00:00
Sanjay Patel 0f44759b0d [InstCombine] add tests for urem with sext bool divisor; NFC
llvm-svn: 335619
2018-06-26 16:01:24 +00:00
Sanjay Patel 7c45debaea [InstCombine] fold udiv with sext bool divisor
Note: I didn't add a hasOneUse() check because the existing,
related fold doesn't have that check. I suspect that the
improved analysis and codegen make these some of the rare
canonicalization cases where we allow an increase in
instructions.

llvm-svn: 335597
2018-06-26 12:41:15 +00:00
Gil Rapaport da2e2caa6c [InstCombine] (A + 1) + (B ^ -1) --> A - B
Turn canonicalized subtraction back into (-1 - B) and combine it with (A + 1) into (A - B).
This is similar to the folding already done for (B ^ -1) + Const into (-1 + Const) - B.

Differential Revision: https://reviews.llvm.org/D48535

llvm-svn: 335579
2018-06-26 05:31:18 +00:00
Sanjay Patel 0c90400bf2 [InstCombine] add/move tests for udiv; NFC
llvm-svn: 335544
2018-06-25 22:27:36 +00:00
Sanjay Patel 6a96d90acd [InstCombine] fold sdiv with sext bool divisor
llvm-svn: 335527
2018-06-25 21:39:41 +00:00
Sanjay Patel 46f9b8c333 [InstCombine] add tests for sdiv with sext bool divisor; NFC
llvm-svn: 335526
2018-06-25 21:36:09 +00:00
Sanjay Patel 1e911fa746 [InstSimplify] fold div/rem of zexted bool
I was looking at an unrelated fold and noticed that
we don't have this simplification (because the other
fold would break existing tests).

Name: zext udiv
  %z = zext i1 %x to i32
  %r = udiv i32 %y, %z
=>
  %r = %y

Name: zext urem
  %z = zext i1 %x to i32
  %r = urem i32 %y, %z
=>
  %r = 0

Name: zext sdiv
  %z = zext i1 %x to i32
  %r = sdiv i32 %y, %z
=>
  %r = %y

Name: zext srem
  %z = zext i1 %x to i32
  %r = srem i32 %y, %z
=>
  %r = 0

https://rise4fun.com/Alive/LZ9

llvm-svn: 335512
2018-06-25 18:51:21 +00:00
Sanjay Patel 2e8babb4fa [InstCombine] add tests for add-of-sext-bool; NFC
We canonicalize to select with a zext-add and either zext-sub or sext-sub,
so this shows a pattern that's not conforming to the general trend.

llvm-svn: 335506
2018-06-25 17:52:10 +00:00
Sanjay Patel c2b37f73f5 [InstCombine] add shuffle+binops test from PR37806; NFC
This one shows another pattern that we'll need to match
in some cases, but the current ordering of folds allows
us to match this as 2 binops before simplification takes
place.

llvm-svn: 335347
2018-06-22 13:44:42 +00:00
Sanjay Patel cfd9da038c [InstCombine] add tests for shuffle-with-different-binops; NFC
llvm-svn: 335345
2018-06-22 13:19:25 +00:00
Sanjay Patel 4784e1506e [InstCombine] fix shuffle-of-binops bug
With non-commutative binops, we could be using the same
variable value as operand 0 in 1 binop and operand 1 in 
the other, so we have to check for that possibility and
bail out.

llvm-svn: 335312
2018-06-21 23:56:59 +00:00
Sanjay Patel 23408c25f3 [InstCombine] add test for shuffle-of-binops; NFC
This shows a miscompile that was missed in rL335283.

llvm-svn: 335311
2018-06-21 23:53:01 +00:00
Sanjay Patel a76b70069d [InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806)
This is the simplest case from PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have a common variable operand used in a pair of binops with vector constants 
that are vector selected together, then we can constant shuffle the constant vectors 
to eliminate the shuffle instruction.

This has some tricky parts that are hopefully addressed in the tests and their 
respective comments:

  1. If the shuffle mask contains an undef element, then that lane of the result is 
     undef:
     http://llvm.org/docs/LangRef.html#shufflevector-instruction

     Therefore, we can replace the constant in that lane with an undef value except 
     for div/rem. With div/rem, an undef in the divisor would cause the whole op to 
     be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'.

  2. Intersect the wrapping and FMF of the original binops for the new binop. There 
     should be no extra poison or fast-math potential in the new binop that wasn't 
     possible in the original code.

  3. Disregard other uses. Given that we're eliminating uses (shortening the 
     dependency chain), I think that's always the right IR canonicalization. But 
     I purposely chose the udiv test to demonstrate the scenario where both 
     intermediate values have other uses because that seems likely worse for 
     codegen with an expensive math op. This seems like a very rare possibility to 
     me, so I don't think it requires a backend patch first.

Differential Revision: https://reviews.llvm.org/D48401

llvm-svn: 335283
2018-06-21 20:15:09 +00:00
Sanjay Patel 3382dc644e [InstCombine] add tests for shuffled cmps; NFC
llvm-svn: 335266
2018-06-21 18:07:38 +00:00
Sanjay Patel 3244537a3c [InstCombine] use constant pattern matchers with icmp+sext
The previous code worked with vectors, but it failed when the
vector constants contained undef elements. 
The matchers handle those cases.

llvm-svn: 335262
2018-06-21 17:51:44 +00:00
Sanjay Patel 5522e968ad [InstCombine] add vector icmp tests with undefs; NFC
llvm-svn: 335261
2018-06-21 17:37:14 +00:00
Nicolai Haehnle b29ee70122 InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded
Summary:
Use the expanded features of the TableGen generic tables to avoid manually
adding the combinatorially exploded set of intrinsics. The
getAMDGPUImageDimIntrinsic lookup function is early-out,
i.e. non-AMDGPU intrinsics will never look at the underlying table.

Use a generic approach for getting the new intrinsic overload to keep the
code simple, and make the image dmask handling more generic:
- handle non-sampler image loads
- handle the case where the set of demanded elements is not a prefix

There is some overlap between this code and an optimization that happens
in the backend during code generation. They currently complement each other:

- only the codegen optimization can generate vec3 loads
- only the InstCombine optimization can handle D16

The InstCombine optimization also likely covers more cases since the
codegen optimization is fairly ad-hoc. Ideally, we'll remove the optimization
in codegen once the infrastructure for vec3 is in place (which will probably
take a long time).

Modify the test cases to use dimension-aware intrinsics. This makes it
easier to see that the test coverage for the new intrinsics is equivalent,
and the old style intrinsics will be removed in a follow-up commit anyway.

Change-Id: I4b91ea661413d13004956fe4ef7d13d41b8ce3ad

Reviewers: arsenm, rampitec, majnemer

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48165

llvm-svn: 335230
2018-06-21 13:37:31 +00:00
Sanjay Patel 88de778b9b [InstCombine] fix typo in test comment; NFC
llvm-svn: 335165
2018-06-20 20:16:45 +00:00
Sanjay Patel 74442061f2 [InstCombine] add vector select of binops tests (PR37806)
These represent the most basic requested transform - a matching
operand and 2 constant operands.

llvm-svn: 335151
2018-06-20 17:48:43 +00:00
Sanjay Patel 825a4faa8d [InstCombine] ignore debuginfo when removing redundant assumes (PR37726)
This is similar to:
rL335083

Fixes::
https://bugs.llvm.org/show_bug.cgi?id=37726

llvm-svn: 335121
2018-06-20 13:22:26 +00:00
Mikhail Dvoretckii 8393f90717 [InstCombine] Replacing X86-specific rounding intrinsics with generic floor-ceil
This patch replaces calls to X86-specific intrinsics with floor-ceil semantics
with calls to target-independent @llvm.floor.* and @llvm.ceil.* intrinsics. This
doesn't affect the resulting machine code, as those intrinsics are lowered to
the same instructions, but exposes these specific rounding cases to generic
optimizations.

Differential Revision: https://reviews.llvm.org/D48067

llvm-svn: 335039
2018-06-19 10:49:12 +00:00
Tomasz Krupa bcaab53d47 [X86] Lowering sqrt intrinsics to native IR
Summary: Complementary patch to lowering sqrt intrinsics in Clang.

Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k

Reviewed By: craig.topper

Subscribers: tkrupa, mike.dvoretsky, llvm-commits

Differential Revision: https://reviews.llvm.org/D41599

llvm-svn: 334849
2018-06-15 18:05:24 +00:00
Joseph Tremoulet 6f406d4f02 [InstCombine] Avoid iteration/mutation conflict
Summary:
When iterating users of a multiply in processUMulZExtIdiom, the
call to setOperand in the truncation case may replace the use
being visited; make sure the iterator has been advanced before
doing that replacement.

Reviewers: majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48192

llvm-svn: 334844
2018-06-15 16:52:40 +00:00
Bjorn Pettersson 428caf988b Re-apply "[DebugInfo] Check size of variable in ConvertDebugDeclareToDebugValue"
This is r334704 (which was reverted in r334732) with a fix for
types like x86_fp80. We need to use getTypeAllocSizeInBits and
not getTypeStoreSizeInBits to avoid dropping debug info for
such types.

Original commit msg:
> Summary:
> Do not convert a DbgDeclare to DbgValue if the store
> instruction only refer to a fragment of the variable
> described by the DbgDeclare.
>
> Problem was seen when for example having an alloca for an
> array or struct, and there were stores to individual elements.
> In the past we inserted a DbgValue intrinsics for each store,
> just as if the store wrote the whole variable.
>
> When handling store instructions we insert a DbgValue that
> indicates that the variable is "undefined", as we do not know
> which part of the variable that is updated by the store.
>
> When ConvertDebugDeclareToDebugValue is used with a load/phi
> instruction we assert that the referenced value is large enough
> to cover the whole variable. Afaict this should be true for all
> scenarios where those methods are used on trunk. If the assert
> blows in the future I guess we could simply skip to insert a
> dbg.value instruction.
>
> In the future I think we should examine which part of the variable
> that is accessed, and add a DbgValue instrinsic with an appropriate
> DW_OP_LLVM_fragment expression.
>
> Reviewers: dblaikie, aprantl, rnk
>
> Reviewed By: aprantl
>
> Subscribers: JDevlieghere, llvm-commits
>
> Tags: #debug-info
>
> Differential Revision: https://reviews.llvm.org/D48024

llvm-svn: 334830
2018-06-15 13:48:55 +00:00
Roman Lebedev 84c11aed10 [InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y)
Summary:
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.

The original commit was reverted because
it broke tests for amdgpu backend, which i didn't check.
Now, the backed was updated to recognize these new
patterns, so we are good.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX

Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm

Reviewed By: spatel, rampitec, nhaehnle

Subscribers: wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D47980

llvm-svn: 334818
2018-06-15 09:56:52 +00:00
Mikhail Dvoretckii 0531ec654a NFC: Regenerating x86-sse41.ll test for InstCombine
Test regenerated to reduce noise in further patches.

llvm-svn: 334806
2018-06-15 07:59:29 +00:00
Bjorn Pettersson 972fd1c9e7 Revert rL334704: "[DebugInfo] Check size of variable in ConvertDebugDeclareToDebugValue"
This reverts commit r334704.

Buildbots detected an assertion in "test tsan in debug compiler-rt build".

llvm-svn: 334732
2018-06-14 16:08:22 +00:00
Bjorn Pettersson e406b29c22 [DebugInfo] Check size of variable in ConvertDebugDeclareToDebugValue
Summary:
Do not convert a DbgDeclare to DbgValue if the store
instruction only refer to a fragment of the variable
described by the DbgDeclare.

Problem was seen when for example having an alloca for an
array or struct, and there were stores to individual elements.
In the past we inserted a DbgValue intrinsics for each store,
just as if the store wrote the whole variable.

When handling store instructions we insert a DbgValue that
indicates that the variable is "undefined", as we do not know
which part of the variable that is updated by the store.

When ConvertDebugDeclareToDebugValue is used with a load/phi
instruction we assert that the referenced value is large enough
to cover the whole variable. Afaict this should be true for all
scenarios where those methods are used on trunk. If the assert
blows in the future I guess we could simply skip to insert a
dbg.value instruction.

In the future I think we should examine which part of the variable
that is accessed, and add a DbgValue instrinsic with an appropriate
DW_OP_LLVM_fragment expression.

Reviewers: dblaikie, aprantl, rnk

Reviewed By: aprantl

Subscribers: JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D48024

llvm-svn: 334704
2018-06-14 11:23:42 +00:00
Roman Lebedev ebb3252f00 Revert rL334371 / D47980: "[InstCombine] Fold (x << y) >> y -> x & (-1 >> y)"
test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll broke,
and i did not notice because i did not build that backend.

llvm-svn: 334373
2018-06-10 20:32:03 +00:00
Roman Lebedev eb795a0661 [InstCombine] Fold (x >> y) << y -> x & (-1 << y)
Summary:
We already do it for matching splat constants, but not just values.

Further improvements for non-matching splat constants, as noted in
https://reviews.llvm.org/D46760#1123713 will be needed,
but i'd prefer to do that as a follow-up.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX
https://rise4fun.com/Alive/0HF

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47981

llvm-svn: 334372
2018-06-10 20:10:13 +00:00
Roman Lebedev 4cdc59ecf2 [InstCombine] Fold (x << y) >> y -> x & (-1 >> y)
Summary:
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47980

llvm-svn: 334371
2018-06-10 20:10:06 +00:00
Roman Lebedev 1e9457ee76 [NFC][InstCombine] Revisit tests for D47980 / D47981 once more.
llvm-svn: 334370
2018-06-10 20:10:00 +00:00
Craig Topper 98a79934af [X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead.
llvm-svn: 334358
2018-06-10 06:01:36 +00:00
Roman Lebedev ff7652e7ab [NFC][InstCombine] More tests for (x >> y) << y -> x & (-1 << y) fold.
Followup for rL334347.
The fold is also valid for ashr.
https://rise4fun.com/Alive/0HF

https://bugs.llvm.org/show_bug.cgi?id=37603
https://reviews.llvm.org/D46760#1123713
https://rise4fun.com/Alive/cplX

llvm-svn: 334349
2018-06-09 14:01:46 +00:00
Roman Lebedev 8aada65f81 [NFC][InstCombine] Tests for (x >> y) << y -> x & (-1 << y) fold.
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://reviews.llvm.org/D46760#1123713
https://rise4fun.com/Alive/cplX

llvm-svn: 334347
2018-06-09 09:27:43 +00:00
Roman Lebedev 794e29f964 [NFC][InstCombine] Tests for (x << y) >> y -> x & (-1 >> y) fold.
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX

llvm-svn: 334346
2018-06-09 09:27:39 +00:00
Davide Italiano 189c2cf114 [InstCombine] Skip dbg.value(s) when looking at stack{save,restore}.
Fixes PR37713.

llvm-svn: 334317
2018-06-08 20:42:36 +00:00
Sanjay Patel afcf39e1f9 [InstCombine] add llvm.assume + debuginfo test (PR37726); NFC
llvm-svn: 334314
2018-06-08 18:47:33 +00:00
Roman Lebedev b060ce45ca [InstSimplify] add nuw %x, -1 -> -1 fold.
Summary:
`%ret = add nuw i8 %x, C`
From [[ https://llvm.org/docs/LangRef.html#add-instruction | langref ]]:
    nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”,
    respectively. If the nuw and/or nsw keywords are present,
    the result value of the add is a poison value if unsigned
    and/or signed overflow, respectively, occurs.

So if `C` is `-1`, `%x` can only be `0`, and the result is always `-1`.

I'm not sure we want to use `KnownBits`/`LVI` here, because there is
exactly one possible value (all bits set, `-1`), so some other pass
should take care of replacing the known-all-ones with constant `-1`.

The `test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll` change *is* confusing.
What happening is, before this: (omitting `nuw` for simplicity)
1. First, InstCombine D47428/rL334127 folds `shl i32 1, %NBits`) to `shl nuw i32 -1, %NBits`
2. Then, InstSimplify D47883/rL334222 folds `shl nuw i32 -1, %NBits` to `-1`,
3. `-1` is inverted to `0`.
But now:
1. *This* InstSimplify fold `%ret = add nuw i32 %setbit, -1` -> `-1` happens first,
   before InstCombine D47428/rL334127 fold could happen.
Thus we now end up with the opposite constant,
and it is all good: https://rise4fun.com/Alive/OA9

https://rise4fun.com/Alive/sldC
Was mentioned in D47428 review.
Follow-up for D47883.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47908

llvm-svn: 334298
2018-06-08 15:44:47 +00:00
Roman Lebedev 2683802ba0 [InstSimplify] shl nuw C, %x -> C iff signbit is set on C.
Summary:
`%r = shl nuw i8 C, %x`

As per langref:
```
If the nuw keyword is present, then the shift produces
a poison value if it shifts out any non-zero bits.
```
Thus, if the sign bit is set on `C`, then `%x` can only be `0`,
which means that `%r` can only be `C`.
Or in other words, set sign bit means that the signed value
is negative, so the constant is `<= 0`.

https://rise4fun.com/Alive/WMk
https://rise4fun.com/Alive/udv

Was mentioned in D47428 review.

We already handle the `0` constant, https://godbolt.org/g/UZq1sJ, so this only handles negative constants.

Could use computeKnownBits() / LazyValueInfo,
but the cost-benefit analysis (https://reviews.llvm.org/D47891)
suggests it isn't worth it.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47883

llvm-svn: 334222
2018-06-07 20:03:45 +00:00
Sanjay Patel 3cd1aa88f9 [InstCombine] fold another shifty abs pattern to cmp+sel (PR36036)
The bug report:
https://bugs.llvm.org/show_bug.cgi?id=36036

...requests a DAG change for this, but an IR canonicalization
probably handles most cases. If we still want to match this
pattern in the backend, there's a proposal for that too:
D47831

Alive proofs including nsw/nuw cases that were first noted in:
D46988

https://rise4fun.com/Alive/Kmp

This patch is largely copied from the existing code that was
initially added with:
D40984
...but I didn't see much gain from trying to share code.

llvm-svn: 334137
2018-06-06 21:58:12 +00:00
Sanjay Patel 6fda6b1210 [InstCombine] add tests for another abs() pattern (PR36036); NFC
llvm-svn: 334133
2018-06-06 21:32:42 +00:00
Roman Lebedev cbf8446359 [InstCombine] PR37603: low bit mask canonicalization
Summary:
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37603 | PR37603 ]].

https://godbolt.org/g/VCMNpS
https://rise4fun.com/Alive/idM

When doing bit manipulations, it is quite common to calculate some bit mask,
and apply it to some value via `and`.

The typical C code looks like:
```
int mask_signed_add(int nbits) {
    return (1 << nbits) - 1;
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 1, %0
  %3 = add nsw i32 %2, -1
  ret i32 %3
}
```

But there is a second, less readable variant:
```
int mask_signed_xor(int nbits) {
    return ~(-(1 << nbits));
}
```
which is translated into (with `-O3`)
```
define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 {
  %2 = shl i32 -1, %0
  %3 = xor i32 %2, -1
  ret i32 %3
}
```

Since we created such a mask, it is quite likely that we will use it in `and` next.
And then we may get rid of `not` op by folding into `andn`.

But now that i have actually looked:
https://godbolt.org/g/VTUDmU
_some_ backend changes will be needed too.
We clearly loose `bzhi` recognition.

Reviewers: spatel, craig.topper, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47428

llvm-svn: 334127
2018-06-06 19:38:27 +00:00
Roman Lebedev 4771bc6c35 [InstCombine][NFC] PR37603: low bit mask canonicalization tests
Differential Revision: https://reviews.llvm.org/D47427

llvm-svn: 334126
2018-06-06 19:38:21 +00:00
Sanjay Patel 0e8b90da0c [ConstProp] move tests for fp <--> int; NFC
These were added for D5603 / rL219542, and there's a proposal to 
change one side in D47807.

These are tests of constant propagation, so they shouldn't have
ever been tested/housed under InstCombine.

llvm-svn: 334107
2018-06-06 16:53:56 +00:00
Tim Northover 9b80060d7b InstCombine: ignore debug instructions during fence combine
We should never get different CodeGen based on whether the code is being
compiled in debug mode so we must skip over @llvm.dbg.value (and similar)
calls.

Should fix at least the worst part of PR37690.

llvm-svn: 334090
2018-06-06 12:46:02 +00:00
John Brawn e4ff0bd401 [InstCombine] Correct the cmp operand type used when canonicalizing abs/nabs
When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need
to use the type of the existing operand when creating a new operand not the
type of a select operand, as the two may be different.

This fixes PR37686.

llvm-svn: 334019
2018-06-05 14:10:55 +00:00
Sanjay Patel dcb8d304c3 [InstCombine] refine UB-handling in shuffle-binop transform
As noted in rL333782, we can be both better for optimization and
safer with this transform:
BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask

The only potentially unsafe-to-speculate binops are integer div/rem.
All other binops are always safe (although I don't see a way to
assert that in code here).

For opcodes like shifts that can produce poison, it can't matter
here because we know the lanes with undef are dropped by the
subsequent shuffle.

Differential Revision: https://reviews.llvm.org/D47686

llvm-svn: 333962
2018-06-04 22:26:45 +00:00
Vedant Kumar 7dda22115e [Debugify] Add debug intrinsics before terminating musttail calls
After r333856, opt -debugify would just stop emitting debug value
intrinsics after encountering a musttail call. This wasn't sufficient to
avoid verifier failures.

Debug value intrinicss for all instructions preceding a musttail call
must also be emitted before the musttail call.

llvm-svn: 333866
2018-06-04 03:33:01 +00:00
Serguei Katkov d894fb4288 [InstCombine] Fix div handling
When we optimize select basing on fact that div by 0 is undef
we should not traverse the instruction which are not guaranteed to
transfer execution to next instruction. Guard intrinsic is an example.

Reviewers: spatel, craig.topper
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47576

llvm-svn: 333864
2018-06-04 02:52:36 +00:00
Sanjay Patel 3bd957b7ae [InstCombine] improve sub with bool folds
There's a patchwork of existing transforms trying to handle
these cases, but as seen in the changed test, we weren't
catching them all.

llvm-svn: 333845
2018-06-03 16:35:26 +00:00
Sanjay Patel bbc6d60677 [InstCombine] call simplify before trying vector folds
As noted in the review thread for rL333782, we could have
made a bug harder to hit if we were simplifying instructions
before trying other folds. 

The shuffle transform in question isn't ever a simplification;
it's just a canonicalization. So I've renamed that to make that 
clearer.

This is NFCI at this point, but I've regenerated the test file 
to show the cosmetic value naming difference of using 
instcombine's RAUW vs. the builder.

Possible follow-ups:
1. Move reassociation folds after simplifies too.
2. Refactor common code; we shouldn't have so much repetition.

llvm-svn: 333820
2018-06-02 16:27:44 +00:00
Sanjay Patel b6333486f4 [InstCombine] add more tests for shuffle-binop; NFC
As noted in the review thread for rL333782, we're lacking coverage
for this transform, so add tests for each binop opcode with constant 
operand.

llvm-svn: 333818
2018-06-02 16:16:42 +00:00
Sanjay Patel 66f7e19f6a [InstCombine] fix vector shuffle transform to replace undef elements (PR37648)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37648
...was created with the enhancement to this transform with rL332479.

The urem test shows the disaster potential: any undef divisor lane makes
the whole op undef.

The test diffs show that vector demanded elements turns some of the potential, 
but not all, unused binop operands back into undef already.

llvm-svn: 333782
2018-06-01 19:23:18 +00:00
Sanjay Patel 3883fcd0c0 [InstCombine] add tests for broken shuffle transform (PR37648)
llvm-svn: 333779
2018-06-01 18:52:38 +00:00
Sanjay Patel 284fe7a8ec [InstCombine] add baseline test for bug with div+select transform (D47576)
llvm-svn: 333756
2018-06-01 14:39:05 +00:00
Sanjay Patel 26368cd5d9 [InstCombine] narrow select to match condition operands' size
This is the planned enhancement to D47163 / rL333611.
We want to match cmp/select sizes because that will be recognized
as min/max more easily and lead to better codegen (especially for
vector types).

As mentioned in D47163, this improves some of the tests that would
also be folded by D46380, so we may want to adjust that patch to
match the new patterns where the extend op occurs after the select.

llvm-svn: 333689
2018-05-31 19:55:27 +00:00
Sanjay Patel dfbe6b49f0 [InstCombine] regenerate checks; NFC
llvm-svn: 333682
2018-05-31 19:25:02 +00:00
Alexandros Lamprineas 61f0ba1fcc [InstCombine, ARM] Convert vld1 to llvm load
Convert a vector load intrinsic into an llvm load instruction.
This is beneficial when the underlying object being addressed
comes from a constant, since we get constant-folding for free.

Differential Revision: https://reviews.llvm.org/D46273

llvm-svn: 333643
2018-05-31 12:19:18 +00:00
Roman Lebedev c0ecd06428 Revert rL333106 / D46814: [InstCombine] Fold unfolded masked merge pattern with variable mask!
In post-commit review, Eric Christopher notes that many
new MSan warnings are being observed with this patch.

The probable reason is: if 'y' is undef here and we could
evaluate it twice and get different results.
We can't increase the number of uses of a value.

llvm-svn: 333631
2018-05-31 06:00:36 +00:00
Sanjay Patel e5bc441791 [InstCombine] don't change the size of a select if it would mismatch its condition operands' sizes
Don't always:
cast (select (cmp x, y), z, C) --> select (cmp x, y), (cast z), C'

This is something that came up as far back as D26556, and I lost track of it. 
I suspect that this transform is part of the underlying problem that is 
inspiring some of the recent proposals that seek to match larger patterns 
that include a cast op. Even if that's not true, this transform causes
problems for codegen (particularly with vector types).

A transform to actively match the size of cmp and select operand sizes should
follow. This patch just removes the harmful canonicalization in the other
direction.

Differential Revision: https://reviews.llvm.org/D47163

llvm-svn: 333611
2018-05-31 00:16:58 +00:00
Sanjay Patel ceb595b04e [InstCombine] don't negate constant expression with fsub (PR37605)
X + (-C) would be transformed back into X - C, so infinite loop:
https://bugs.llvm.org/show_bug.cgi?id=37605

llvm-svn: 333610
2018-05-30 23:55:12 +00:00
Alexandros Lamprineas 52457d33b2 [InstCombine, ARM, AArch64] Convert table lookup to shuffle vector
Turning a table lookup intrinsic into a shuffle vector instruction
can be beneficial. If the mask used for the lookup is the constant
vector {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse
instructions instead.

Differential Revision: https://reviews.llvm.org/D46133

llvm-svn: 333550
2018-05-30 14:38:50 +00:00
Craig Topper 8f77dcadff Recommit r333226 "[ValueTracking] Teach computeKnownBits that the result of an absolute value pattern that uses nsw flag is always positive."
Libfuzzer tests have been fixed to prevent being optimized.

Original commit message:

If the nsw flag is used in the absolute value then it is undefined for INT_MIN. For all other value it will produce a positive number. So we can assume the result is positive.

This breaks some InstCombine abs/nabs combining tests because we simplify the second compare from known bits rather than as the whole pattern. Looks like we can probably fix it by adding a neg+abs/nabs combine to just swap the select operands. N

Differential Revision: https://reviews.llvm.org/D47041

llvm-svn: 333300
2018-05-25 19:18:09 +00:00
Craig Topper 8174281b93 Revert r333226 "[ValueTracking] Teach computeKnownBits that the result of an absolute value pattern that uses nsw flag is always positive."
This breaks some libFuzzer tests. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/15589/steps/check-fuzzer/logs/stdio

Reverting to investigate

llvm-svn: 333253
2018-05-25 04:01:56 +00:00
Vedant Kumar 4872535eb9 [Debugify] Set a DI version module flag for llc compatibility
Setting the "Debug Info Version" module flag makes it possible to pipe
synthetic debug info into llc, which is useful for testing backends.

llvm-svn: 333237
2018-05-24 23:00:23 +00:00
Craig Topper 49f23fe349 [ValueTracking] Teach computeKnownBits that the result of an absolute value pattern that uses nsw flag is always positive.
If the nsw flag is used in the absolute value then it is undefined for INT_MIN. For all other value it will produce a positive number. So we can assume the result is positive.

This breaks some InstCombine abs/nabs combining tests because we simplify the second compare from known bits rather than as the whole pattern. Looks like we can probably fix it by adding a neg+abs/nabs combine to just swap the select operands. Need to check alive to make sure there are no corner cases.

Differential Revision: https://reviews.llvm.org/D47041

llvm-svn: 333226
2018-05-24 21:22:51 +00:00
Warren Ristow d3efa9429f [InstCombine] Enable more reassociations using FMF 'reassoc' + 'nsz'
Reassociation of math ops in some contexts (especially vector contexts)
has generally only been happening when the 'fast' FMF was set.  This
enables reassoication when only the finer grained controls 'reassoc' and
'nsz' are set.

Differential Revision: https://reviews.llvm.org/D47335

llvm-svn: 333221
2018-05-24 20:16:43 +00:00
Chad Rosier 274d72faad [InstCombine] Combine XOR and AES instructions on ARM/ARM64.
The ARM/ARM64 AESE and AESD instructions have a builtin XOR as the first step in
the instruction. Therefore, if the AES key is zero and the AES data was
previously XORed, it can be combined into a single instruction.

Differential Revision: https://reviews.llvm.org/D47239
Patch by Michael Brase!

llvm-svn: 333193
2018-05-24 15:26:42 +00:00
Roman Lebedev 6b6c553bb8 [InstCombine] Fold unfolded masked merge pattern with variable mask!
Summary:
Finally fixes [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]].

Now that the backend is all done, we can finally fold it!

The canonical unfolded masked merge pattern is
```(x &  m) | (y & ~m)```
There is a second, equivalent variant:
```(x | ~m) & (y |  m)```
Only one of them (the or-of-and's i think) is canonical.
And if the mask is not a constant, we should fold it to:
```((x ^ y) & M) ^ y```

https://rise4fun.com/Alive/ndQw

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: nicholas, RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D46814

llvm-svn: 333106
2018-05-23 17:47:52 +00:00
Craig Topper 3b768e8602 [InstCombine] Negate ABS/NABS patterns by swapping the select operands to remove the negation
Differential Revision: https://reviews.llvm.org/D47236

llvm-svn: 333101
2018-05-23 17:29:03 +00:00
David Bolvansky cd3eb99016 [InstCombine] [NFC] Added more tests for unlocked IO transformation
Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47243

llvm-svn: 333057
2018-05-23 03:01:45 +00:00
Sanjay Patel 4b96935bd7 [InstCombine] use nsw negation for abs libcalls
Also, produce the canonical IR abs (s<0) to be more efficient. 

This is the libcall equivalent of the clang builtin change from:
rL333038

Pasting from that commit message:
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."

That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.

llvm-svn: 333042
2018-05-22 23:29:40 +00:00
Sanjay Patel 3ef8f858da [InstCombine] move misplaced test file and regenerate checks; NFC
llvm-svn: 333039
2018-05-22 23:15:56 +00:00
David Bolvansky 88e262bcdd Delete empty test file
Differential Revision: https://reviews.llvm.org/D47230

llvm-svn: 333031
2018-05-22 21:47:08 +00:00
David Bolvansky 1f343fa0e0 [InstCombine] Remove calloc transformations
Summary: Previous patch does not care if a value is changed between calloc and strlen. This needs to be removed from InstCombine and maybe moved to DSE later after some rework.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47218

llvm-svn: 333022
2018-05-22 20:27:36 +00:00
Sanjay Patel 9781679f0f [InstCombine] move/add tests for sub with bool op; NFC
llvm-svn: 333012
2018-05-22 18:50:06 +00:00
Sanjay Patel dd5fb8f03f [InstCombine] fix broken test
Looks like the last line got chopped off from rL332990.

llvm-svn: 332992
2018-05-22 16:14:16 +00:00
David Bolvansky 41f4b64ee1 [InstCombine] Calloc-ed strings optimizations
Summary:
Example cases:
strlen(calloc(...)) -> 0

Reviewers: efriedma, bkramer

Reviewed By: bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47059

llvm-svn: 332990
2018-05-22 15:41:23 +00:00
Stanislav Mekhanoshin 0e132dca53 [AMDGPU] Optimze old value of v_mov_b32_dpp
We can eliminate old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf.
This is alternative implementation working with the intrinsic in InstCombine.
Original review for past-ISel optimization: D46570.

Differential Revision: https://reviews.llvm.org/D46596

llvm-svn: 332956
2018-05-22 08:04:33 +00:00
Sanjay Patel ec50effbd6 [InstCombine] regenerate checks; NFC
llvm-svn: 332894
2018-05-21 21:09:14 +00:00
Sanjay Patel 94b1f846b2 [InstCombine] add tests for cast-of-select; NFC
In all cases, we're pulling the cast above the select.
That's not a good canonicalization if we're creating 
a select that then mismatches the operand size of its
condition.

llvm-svn: 332883
2018-05-21 20:23:58 +00:00
Alexey Bataev 7c9ad0db3d [InstCombine] Fix PR37526: MinMax patterns produce an infinite loop.
Summary:
This patch fixes PR37526 by simplifying the newly generated LoadInst
instructions. If the pointer address is a bitcast from the pointer to
the NewType, we can just remove this extra bitcast instead of creating
the new one. This fixes the PR37526 + may speed up the whole compilation
process.

Reviewers: spatel, RKSimon, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47144

llvm-svn: 332855
2018-05-21 17:46:34 +00:00
Craig Topper e4c045b7df [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead.
Someday maybe we'll use selects for all intrinsics.

llvm-svn: 332824
2018-05-20 23:34:04 +00:00
Sanjay Patel a003c728a5 [InstCombine] choose 1 form of abs and nabs as canonical
We already do this for min/max (see the blob above the diff), 
so we should do the same for abs/nabs.
A sign-bit check (<s 0) is used as a predicate for other IR 
transforms and it's likely the best for codegen.

This might solve the motivating cases for D47037 and D47041, 
but I think those patches still make sense. We can't guarantee 
this canonicalization if the icmp has more than one use.

Differential Revision: https://reviews.llvm.org/D47076

llvm-svn: 332819
2018-05-20 14:23:23 +00:00
Piotr Padlewski a26a08cb52 Constant fold launder of null and undef
Summary:
This might be useful because clang will add
some barriers for pointer comparisons.

Reviewers: majnemer, dberlin, hfinkel, nlewycky, davide, rsmith, amharc,
kuhar

Subscribers: davide, amharc, llvm-commits

Differential Revision: https://reviews.llvm.org/D32423

llvm-svn: 332786
2018-05-18 23:52:57 +00:00
Sanjay Patel 56e09c6928 [InstCombine] add tests for lack of abs/nabs canonicalization; NFC
llvm-svn: 332726
2018-05-18 15:26:38 +00:00
Sanjay Patel fa3e4601c6 [InstCombine] regenerate checks; NFC
There were a combination of auto-generated styles in use
here because the scripts have evolved. 

llvm-svn: 332725
2018-05-18 15:22:19 +00:00
Craig Topper bd332588bd [InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs pattern to the sub in the select version.
According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT__MIN, but by preserving the flags we can make doing this to INT_MIN UB.

The nuw flags is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far.

Differential Revision: https://reviews.llvm.org/D46988

llvm-svn: 332623
2018-05-17 16:29:52 +00:00
Martin Storsjo c10788728b [Analysis] Only use _unlocked stdio functions on linux
The existing comment said that the functions were available only
on GNU/Linux (and on certain Android versions), but only checked
T.isGNUEnvironment() which also is true on MinGW (for arch-windows-gnu
triplets), which doesn't have such functions.

Existing checks in the initialize function in TargetLibraryInfo.cpp
also use only T.isOSLinux() to check for glibc features.

This fixes use of stdio on MinGW.

Differential Revision: https://reviews.llvm.org/D47002

llvm-svn: 332581
2018-05-17 08:16:08 +00:00
Benjamin Kramer 8ac15bf4dc [InstCombine] Fix the signature of fgets_unlocked.
It returns a pointer, not an int. This miscompiles all code that uses
the return value of fgets.

llvm-svn: 332531
2018-05-16 21:45:39 +00:00
Sanjay Patel 2eb3512090 [InstCombine] allow more binop (shuffle X), C transforms
The canonicalization was restricted to shuffle masks with
a 1-to-1 mapping to the constant vector, but that disqualifies
the common splat pattern. This is part of solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 332479
2018-05-16 15:15:22 +00:00
David Bolvansky ca22d427b9 [SimplifyLibcalls] Replace locked IO with unlocked IO
Summary: If file stream arg is not captured and source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed,

Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer, lebedev.ri, rja

Reviewed By: rja

Subscribers: rja, srhines, efriedma, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D45736

llvm-svn: 332452
2018-05-16 11:39:52 +00:00
Sanjay Patel 919882638e [InstCombine] fix binop (shuffle X), C --> shuffle (binop X, C') to check uses
llvm-svn: 332407
2018-05-15 22:00:37 +00:00
Sanjay Patel cc0fbdb6e4 [InstCombine] add more tests for binop-shuffle; NFC
The splat pattern is part of PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 332393
2018-05-15 20:34:09 +00:00
Sanjay Patel 3c35290c58 [InstCombine] fix binop-of-shuffles to check uses
llvm-svn: 332375
2018-05-15 17:14:23 +00:00
Sanjay Patel c5b4401cf7 [InstCombine] add multi-use shuffle tests and regenerate checks; NFC
llvm-svn: 332373
2018-05-15 16:47:47 +00:00
Keno Fischer de577af8c0 [InstCombine] fix crash due to ignored addrspacecast
Summary:
Part of the InstCombine code for simplifying GEPs looks through
addrspacecasts. However, this was done by updating a variable
also used by the next transformation, for marking GEPs as
inbounds. This led to replacing a GEP with a similar instruction
in a different addrspace, which caused an assertion failure in RAUW.

This caused julia issue https://github.com/JuliaLang/julia/issues/27055

Patch by Jeff Bezanson <jeff@juliacomputing.com>

Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D46722

llvm-svn: 332302
2018-05-14 22:05:01 +00:00
Craig Topper 911025b1cd [X86] Extend instcombine folds for pclmuldq intrinsics to the 256 and 512 bit version.
llvm-svn: 332202
2018-05-13 21:56:32 +00:00
Craig Topper f170b85d40 [X86] Add missing test for the InstCombines of pclmulqdq.
Apparently this test was lost when r293151 was committed. It was present in the review, but not the commit.

llvm-svn: 332199
2018-05-13 18:26:06 +00:00
Craig Topper a17d627abb [X86] Remove and autoupgrade a bunch of FMA instrinsics that are no longer used by clang.
llvm-svn: 332146
2018-05-11 21:59:34 +00:00
Daniel Neilson f6651d4d94 [InstCombine] Handle atomic memset in the same way as regular memset
Summary:
This change adds handling of the atomic memset intrinsic to the
code path that simplifies the regular memset. In practice this means
that we will now also expand a small constant-length atomic memset
into a single unordered atomic store.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46660

llvm-svn: 332132
2018-05-11 20:04:50 +00:00
David Bolvansky cd93c4ef1a [InstCombine] snprintf optimizations
Reviewers: spatel, efriedma, majnemer, rja, bkramer

Reviewed By: rja, bkramer

Subscribers: mstorsjo, rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 332110
2018-05-11 17:50:49 +00:00
Martin Storsjo 0d7c37756b [Analysis] Validate the return type of s(n)printf like libcalls
If the sprintf function is static (as on mingw-w64, where many stdio
functions are static inline wrappers), earlier optimization passes
could optimize out the return value altogether, and make it void,
which could break optimizations of this libcall that touch the
return value.

This fixes the issue discussed in PR37408 for the sprintf function.

Differential Revision: https://reviews.llvm.org/D46752

llvm-svn: 332106
2018-05-11 16:53:56 +00:00
Daniel Neilson 8f30ec65b0 [InstCombine] Unify handling of atomic memtransfer with non-atomic memtransfer
Summary:
This change reworks the handling of atomic memcpy within the instcombine pass.
Previously, a constant length atomic memcpy would be lowered into loads & stores
as long as no more than 16 load/store pairs are created. This is quite different
from the lowering done for a non-atomic memcpy; which only ever lowers into a single
load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are
expanded to load/stores in later passes, such as SelectionDAG lowering.

In this change the behaviour for atomic memcpy is unified with non-atomic memcpy;
atomic memcpy is now treated in the same was as non-atomic memcpy has always been.
We leave it to later passes to lower longer-length atomic memcpy calls.

Due to the structure of the pass's handling of memtransfer intrinsics, this change
also gives us handling of atomic memmove that we did not previously have.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46658

llvm-svn: 332093
2018-05-11 14:30:02 +00:00
Craig Topper 4b026e5ebd [InstCombine] Add tests for cases where we don't recognize type promoted rotate idioms.
These rotates take the form

(x << (n & mask)) | (x >> (-n & mask)) where mask is bitwidth - 1.

If x has been promoted to a wider type than its original bit width due to type promotion we fail to narrower it and therefore don't recognize it as a rotate.

llvm-svn: 332068
2018-05-11 00:46:09 +00:00
Martin Storsjo 86e6742c17 Revert "[InstCombine] snprintf optimizations"
This reverts commit SVN r331889, which could trigger failed
assertions for cases where the snprintf function is declared
with a vaguely differing signature (e.g. being defined as
static inline), see PR37408.

llvm-svn: 332043
2018-05-10 21:23:36 +00:00
Sanjay Patel c7bb14301a [InstCombine] add folds for minnum(-a, -b) --> -maxnum(a, b)
This is similar to what we do for integer min/max with 'not'
ops (rL321882).

This should fix:
https://bugs.llvm.org/show_bug.cgi?id=37404
https://bugs.llvm.org/show_bug.cgi?id=37405

llvm-svn: 332031
2018-05-10 20:03:13 +00:00
Sanjay Patel b5322e385e [InstCombine] add minnum/maxnum tests (PR37404, PR37405); NFC
llvm-svn: 332025
2018-05-10 19:21:08 +00:00
Sanjay Patel dc4bb3fd36 [InstCombine] regenerate full checks; NFC
llvm-svn: 331998
2018-05-10 17:05:38 +00:00
Philip Reames 79e917d117 [InstCombine] Widen guards with conditions between
The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed.

Differential Revision: https://reviews.llvm.org/D46203

llvm-svn: 331935
2018-05-09 22:56:32 +00:00
Benjamin Kramer 0d2fc1a501 [InstCombine] Teach SimplifyDemandedBits that udiv doesn't demand low dividend bits that are zero in the divisor
This is safe as long as the udiv is not exact. The pattern is not common in
C++ code, but comes up all the time in code generated by XLA's GPU backend.

Differential Revision: https://reviews.llvm.org/D46647

llvm-svn: 331933
2018-05-09 22:27:34 +00:00
David Bolvansky 9b5e6e8288 [InstCombine] snprintf optimizations
Reviewers: spatel, efriedma, majnemer, rja, bkramer

Reviewed By: rja, bkramer

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 331889
2018-05-09 16:09:31 +00:00
Benjamin Kramer ccb0fbe9a0 Revert "[InstCombine] snprintf optimizations"
This reverts commit r331849. It miscompiles
snprintf(buf, sizeof(buf), "%s", "any constant string); into
memcpy(buf, "%s", sizeof("any constant string"));

llvm-svn: 331866
2018-05-09 11:38:57 +00:00
David Bolvansky 44a37f04b2 [InstCombine] snprintf optimizations
Reviewers: spatel, efriedma, majnemer, rja

Reviewed By: rja

Subscribers: rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 331849
2018-05-09 06:34:20 +00:00
Shiva Chen 2c864551df [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is

!DILabel(scope: !1, name: "foo", file: !2, line: 3)

We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is

llvm.dbg.label(metadata !1)

It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.

We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.

Differential Revision: https://reviews.llvm.org/D45024

Patch by Hsiangkai Wang.

llvm-svn: 331841
2018-05-09 02:40:45 +00:00
Roman Lebedev ffec58bce6 [InstCombine][NFC] Add tests for one more masked merge pattern.
This pattern came up in D46494.
I'm pretty sure we want to canonicalize it from
	(x | ~m) & (y &  m)
to
	(x &  m) | (y & ~m)

https://rise4fun.com/Alive/TEM

llvm-svn: 331625
2018-05-07 09:42:45 +00:00
Sanjay Patel e7b6654711 [InstCombine] refine select-of-constants to bitwise ops
Add logic for the special case when a cmp+select can clearly be
reduced to just a bitwise logic instruction, and remove an 
over-reaching chunk of general purpose bit magic. The primary goal 
is to remove cases where we are not improving the IR instruction 
count when doing these select transforms, and in all cases here that 
is true.

In the motivating 3-way compare tests, there are further improvements
because we can combine/propagate select values (not sure if that
belongs in instcombine, but it's there for now).

DAGCombiner has folds to turn some of these selects into bit magic,
so there should be no difference in the end result in those cases.
Not all constant combinations are handled there yet, however, so it
is possible that some targets will see more cmov/csel codegen with
this change in IR canonicalization. 

Ideally, we'll go further to *not* turn selects into multiple 
logic/math ops in instcombine, and we'll canonicalize to selects.
But we should make sure that this step does not result in regressions
first (and if it does, we should fix those in the backend).

The general direction for this change was discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html
http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html

Alive proofs for the new bit magic:
https://rise4fun.com/Alive/XG7

Differential Revision: https://reviews.llvm.org/D46086

llvm-svn: 331486
2018-05-03 21:58:44 +00:00
Omer Paparo Bivas 54596e1c1a [InstCombine] new testcases for OverflowingBinaryOperators and PossiblyExactOperators transformations; NFC
instcombine should transform the relevant cases if the OverflowingBinaryOperator/PossiblyExactOperator can be proven to be safe.

Change-Id: I7aec62a31a894e465e00eb06aed80c3ea0c9dd45
llvm-svn: 331265
2018-05-01 14:27:10 +00:00
Omer Paparo Bivas 82ef8e19ef [InstCombine] Adjusting bswap pattern matching to hold for And/Shift mixed case
Differential Revision: https://reviews.llvm.org/D45731

Change-Id: I85d4226504e954933c41598327c91b2d08192a9d
llvm-svn: 331257
2018-05-01 12:25:46 +00:00
Sanjay Patel 1934cb1c49 [InstCombine] fix test to restore intent
This test had values that differed in only in capitalization,
and that causes problems for the auto-generating check line
script. So I changed that in rL331226, but I accidentally
forgot to change a subsequent use of a param.

llvm-svn: 331228
2018-04-30 21:28:18 +00:00
Sanjay Patel dab8515370 [InstCombine] add tests, update checks; NFC
llvm-svn: 331226
2018-04-30 21:03:36 +00:00
Roman Lebedev aa4faec114 [InstCombine] Unfold masked merge with constant mask
Summary:
As discussed in D45733, we want to do this in InstCombine.

https://rise4fun.com/Alive/LGk

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: chandlerc, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D45867

llvm-svn: 331205
2018-04-30 17:59:33 +00:00
Roman Lebedev b7ae4ef82b [InstCombine][NFC] Add tests for unfolding masked merge with constant mask
Summary: As discussed in D45733, we want to do this in InstCombine.

Differential Revision: https://reviews.llvm.org/D45866

llvm-svn: 331204
2018-04-30 17:59:26 +00:00
Roman Lebedev 136867931a [InstCombine] Canonicalize variable mask in masked merge
Summary:
Masked merge has a pattern of: `((x ^ y) & M) ^ y`.
But, there is no difference between `((x ^ y) & M) ^ y` and `((x ^ y) & ~M) ^ x`,
We should canonicalize the pattern to non-inverted mask.

https://rise4fun.com/Alive/Yol

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45664

llvm-svn: 331112
2018-04-28 15:45:07 +00:00
Roman Lebedev 6b1e66b188 [InstCombine][NFC] Add tests for variable mask canonicalization in masked merge
Summary:
Masked merge has a pattern of: `((x ^ y) & M) ^ y`.
But, there is no difference between `((x ^ y) & M) ^ y` and `((x ^ y) & ~M) ^ x`,
We should canonicalize the pattern to non-inverted mask.

Differential Revision: https://reviews.llvm.org/D45663

llvm-svn: 331111
2018-04-28 15:45:00 +00:00
Max Kazantsev 303572f3df [NFC] Add some tests that demonstrate unrecognized three-way comparison patterns
llvm-svn: 331100
2018-04-28 04:38:21 +00:00
Roman Lebedev 6959b8e76f [PatternMatch] Stabilize the matching order of commutative matchers
Summary:
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the `LHS` and `RHS` matchers:
1. match `RHS` matcher to the `first` operand of binary operator,
2. and then match `LHS` matcher to the `second` operand of binary operator.

This works ok.
But it complicates writing of commutative matchers, where one would like to match
(`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side.

This is additionally complicated by the fact that `m_Specific()` stores the `Value *`,
not `Value **`, so it won't work at all out of the box.

The last problem is trivially solved by adding a new `m_c_Specific()` that stores the
`Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing
one because i guess all the current users are ok with existing behavior,
and this additional pointer indirection may have performance drawbacks.
Also, i'm storing pointer, not reference, because for some mysterious-to-me reason
it did not work with the reference.

The first one appears trivial, too.
Currently, we
1. match `LHS` matcher to the `first` operand of binary operator,
2. and then match `RHS` matcher to the `second` operand of binary operator.
If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ **operands**:
1. match ~~`RHS`~~ **`LHS`** matcher to the ~~`first`~~ **`second`** operand of binary operator,
2. and then match ~~`LHS`~~ **`RHS`** matcher to the ~~`second`~ **`first`** operand of binary operator.

Surprisingly, `$ ninja check-llvm` still passes with this.
But i expect the bots will disagree..

The motivational unittest is included.
I'd like to use this in D45664.

Reviewers: spatel, craig.topper, arsenm, RKSimon

Reviewed By: craig.topper

Subscribers: xbolva00, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D45828

llvm-svn: 331085
2018-04-27 21:23:20 +00:00
Matt Morehouse 1ae1febfde Revert "[SimplifyLibcalls] Replace locked IO with unlocked IO"
This reverts r331002 due to sanitizer bot breakage.

llvm-svn: 331011
2018-04-27 01:48:09 +00:00