Commit Graph

234 Commits

Author SHA1 Message Date
Sanjay Patel eb5d046890 revert r326502: [InstCombine] allow fmul fold with less than 'fast'
I forgot that I added tests for 'reassoc' to -reassociate, but
suprisingly that file calls -instcombine too, so it is affected.
I'll update that file and try again.

llvm-svn: 326510
2018-03-01 23:39:24 +00:00
Sanjay Patel 7373ae5c9a [InstCombine] allow fmul fold with less than 'fast'
llvm-svn: 326502
2018-03-01 22:53:47 +00:00
Sanjay Patel f3b1af7aa4 [InstCombine] simplify code for (X*Y) * X => (X*X) * Y ; NFCI
llvm-svn: 326444
2018-03-01 15:50:26 +00:00
Sanjay Patel eaf5a120ed [InstCombine] simplify code for X * -1.0 --> -X; NFC
I've added random FMF to one of the tests to show those are propagated.

llvm-svn: 326377
2018-02-28 22:30:04 +00:00
Sanjay Patel b3f4f62698 [InstCombine] move invariant call out of loop; NFC
We really shouldn't need a 2-loop here at all, but that's another cleanup.

llvm-svn: 326330
2018-02-28 16:50:51 +00:00
Sanjay Patel 8fdd87f929 [InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI
Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly 
vague. The constant check is redundant in some cases, but it allows 
removing duplication for most of the calls.

llvm-svn: 326329
2018-02-28 16:36:24 +00:00
Sanjay Patel 31a90468e1 [InstCombine] allow fdiv folds with less than fully 'fast' ops
Note: gcc appears to allow this fold with -freciprocal-math alone, 
but clang/llvm require more than that with this patch. The wording
in the definitions seems fuzzy enough that it could go either way,
but we'll err on the conservative side of FMF interpretation.

This patch also changes the newly created fmul to have FMF propagated
by the last fdiv rather than intersecting the FMF of the fdivs. This
matches the behavior of other folds near here. The new fmul is only 
used to produce an intermediate op for the final fdiv result, so it
shouldn't be any stricter than that result. The previous behavior
could result in dropping FMF via other folds in instcombine or CSE.

Differential Revision: https://reviews.llvm.org/D43398

llvm-svn: 326098
2018-02-26 16:02:45 +00:00
Sanjay Patel 2db2769499 [InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFC
llvm-svn: 325968
2018-02-23 22:38:10 +00:00
Sanjay Patel db53d1847b [InstSimplify] sqrt(X) * sqrt(X) --> X
This was misplaced in InstCombine. We can loosen the FMF as a follow-up step.

llvm-svn: 325965
2018-02-23 22:20:13 +00:00
Sanjay Patel d32104e1b2 [InstCombine] allow fmul-sqrt folds with less than full -ffast-math
Also, add a Builder method for intrinsics to reduce code duplication for clients.

llvm-svn: 325960
2018-02-23 21:16:12 +00:00
Sanjay Patel 6b9c7a9c83 [InstCombine] refactor fmul with negated op folds; NFCI
The existing code was inefficiently looking for 'nsz' variants.
That's unnecessary because we canonicalize those to the expected
form with -0.0.

We may also want to adjust or remove the fold that sinks negation.
We don't do that for fdiv (or integer ops?). That should be uniform?
It may also lead to missed optimization as in PR21914:
https://bugs.llvm.org/show_bug.cgi?id=21914
...or we just have to fix other passes to avoid that problem.

llvm-svn: 325924
2018-02-23 17:14:28 +00:00
Sanjay Patel 5a6f904520 [InstCombine] add and use Create*FMF functions; NFC
llvm-svn: 325730
2018-02-21 22:18:55 +00:00
Sanjay Patel 6f716a7c5e [InstCombine] C / -X --> -C / X
We already do this in DAGCombiner, but it should
also be good to eliminate the fsub use in IR.

This is similar to rL325648.

llvm-svn: 325649
2018-02-21 00:01:45 +00:00
Sanjay Patel d8dd0151fc [InstCombine] -X / C --> X / -C for FP
We already do this in DAGCombiner, but it should 
also be good to eliminate the fsub use in IR.

llvm-svn: 325648
2018-02-20 23:51:16 +00:00
Sanjay Patel 7365b44b85 [InstCombine] remove unneeded operand swap: NFCI
FMul is commutative, so complexity-based canonicalization should always 
take care of the swap via SimplifyAssociativeOrCommutative(). 

llvm-svn: 325628
2018-02-20 21:52:46 +00:00
Sanjay Patel 29b98ae337 [InstCombine] remove unneeded dyn_cast to prevent unused variable warning
llvm-svn: 325597
2018-02-20 17:14:53 +00:00
Sanjay Patel b2d978682b [InstCombine] remove compound fdiv pattern folds
These are fdiv-with-constant-divisor, so they already become
reciprocal multiplies. The last gap for vector ops should be
closed with rL325590.

It's possible that we're missing folds for some edge cases 
with denormal intermediate constants after deleting these,
but there are no tests for those patterns, and it would be 
better to handle denormals more consistently (and less 
conservatively) as noted in TODO comments.

llvm-svn: 325595
2018-02-20 16:52:17 +00:00
Sanjay Patel 90f4c8ec29 [InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C)
llvm-svn: 325590
2018-02-20 16:08:15 +00:00
Sanjay Patel 2816560b2c [InstCombine] use CreateWithCopiedFlags to reduce code; NFCI
Also, move the folds with constants closer to make it easier to follow. 

llvm-svn: 325541
2018-02-19 23:09:03 +00:00
Sanjay Patel 1d14779aed [InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math
It's possible that we could allow this either 'arcp' or 'reassoc' alone, but this
should be conservatively better than what we have right now. GCC allows this with
only -freciprocal-math.

The last test is changed to show a case that is expected to fold, but we need D43398.

llvm-svn: 325533
2018-02-19 21:46:52 +00:00
Sanjay Patel e412954953 [InstCombine] refactor fdiv with constant dividend folds; NFC
The last fold that used to be here was not necessary. That's a 
combination of 2 folds (and there's a regression test to show that).

The transforms are guarded by isFast(), but that should be loosened.

llvm-svn: 325531
2018-02-19 21:17:58 +00:00
Sanjay Patel 08868e494e [Constant] add floating-point helpers for normal/finite-nz; NFC
...and delete the equivalent local functiona from InstCombine.

These might be useful to other InstCombine files or other passes
and makes FP queries more similar to integer constant queries.

llvm-svn: 325398
2018-02-16 22:32:54 +00:00
Sanjay Patel 91bb775087 [InstCombine] clean up fdiv-with-fdiv folds; NFCI
llvm-svn: 325366
2018-02-16 17:52:32 +00:00
Sanjay Patel e16b0cfba9 [InstCombine] remove redundant debug info setting; NFC
The IRBuilder sets debuginfo in Insert(), so this was duplicating what already happened.

llvm-svn: 325358
2018-02-16 16:42:04 +00:00
Sanjay Patel 65da14d6c8 [InstCombine] reduce code duplication; NFC
llvm-svn: 325353
2018-02-16 16:13:20 +00:00
Sanjay Patel 1e04511e16 [InstCombine] use m_OneUse to reduce code; NFC
llvm-svn: 325263
2018-02-15 16:30:10 +00:00
Sanjay Patel 339b4d338d [InstCombine] allow sin/cos transforms with 'reassoc'
The variable name 'AllowReassociate' is a lie at this point because
it's set to 'isFast()' which is more than the 'reassoc' FMF after
rL317488.

In D41286, we showed that this transform may be valid even with strict
math by brute force checking every 32-bit float result.

There's a potential problem here because we're replacing with a tan()
libcall rather than a hypothetical LLVM tan intrinsic. So we might
set errno when we should be guaranteed not to do that. But that's
independent of this change.

llvm-svn: 325247
2018-02-15 15:07:12 +00:00
Sanjay Patel 6a0f667077 [InstCombine] allow X / C -> X * (1.0/C) for vector splat FP constants
llvm-svn: 325237
2018-02-15 13:55:52 +00:00
Sanjay Patel b39bcc0437 [InstCombine] clean up fold for X / C -> X * (1.0/C); NFCI
This should work with vector constants too, but it's currently limited to scalar.

llvm-svn: 325187
2018-02-14 23:04:17 +00:00
Sanjay Patel 5df4d8892f [InstCombine] simplify isFMulOrFDivWithConstant(); NFCI
llvm-svn: 325142
2018-02-14 17:16:33 +00:00
Sanjay Patel 58dab856f7 [InstCombine] replace isa/cast with dyn_cast; NFC
llvm-svn: 325141
2018-02-14 16:56:44 +00:00
Sanjay Patel 604cb9e3ed [InstCombine] refactor folds for mul with negated operands; NFCI
This keeps with our current usage of 'match' and is easier to see that
the optional NSW only applies in the non-constant operand case. 

llvm-svn: 325140
2018-02-14 16:50:55 +00:00
Sanjay Patel 7558d860af [InstCombine] (lshr X, 31) * Y --> (ashr X, 31) & Y
This replaces the bit-tracking based fold that did the same thing,
but it only worked for scalars and not directly. 

There is no evidence in existing regression tests that the greater 
power of bit-tracking was needed here, but we should be aware of 
this potential loss of optimization.

llvm-svn: 325062
2018-02-13 22:24:37 +00:00
Sanjay Patel cb8ac00f73 [InstCombine] (bool X) * Y --> X ? Y : 0
This is both a functional improvement for vectors and an
efficiency improvement for scalars. The existing code below
the new folds does the same thing for scalars, but in an 
indirect and expensive way.

llvm-svn: 325048
2018-02-13 20:41:22 +00:00
Simon Pilgrim be0dd72620 [InstCombine] Simplify getLogBase2 case for scalar/splats. NFCI.
llvm-svn: 325003
2018-02-13 13:16:26 +00:00
Sanjay Patel 4a4f35f324 [InstCombine] X / (X * Y) --> 1.0 / Y
This is similar to the instsimplify fold added with D42385 
( rL323716 )
...but this can't be in instsimplify because we're creating/morphing
a different instruction.

llvm-svn: 324927
2018-02-12 19:39:21 +00:00
Sanjay Patel 1998cc6a47 [InstCombine] various clean-ups for div transforms; NFC
llvm-svn: 324922
2018-02-12 18:38:35 +00:00
Sanjay Patel 39059d2630 [InstCombine] various clean-ups for commonIDivTransforms; NFC
llvm-svn: 324891
2018-02-12 14:14:56 +00:00
Sanjay Patel 510d647a4d [InstCombine] X / (X * Y) -> 1 / Y if the multiplication does not overflow
The related cases for (X * Y) / X were handled in rL124487.

https://rise4fun.com/Alive/6k9

The division in these tests is subsequently eliminated by existing instcombines
for 1/X.

llvm-svn: 324843
2018-02-11 17:20:32 +00:00
Simon Pilgrim 9620f4b746 [InstCombine] Add constant vector support for X udiv C, where C >= signbit
llvm-svn: 324728
2018-02-09 10:43:59 +00:00
Simon Pilgrim a54e8e429b [InstCombine] visitSRem - use m_Negative(APInt) helper. NFCI.
llvm-svn: 324636
2018-02-08 19:00:45 +00:00
Simon Pilgrim 1889f26b94 [InstCombine] Add m_Negative pattern matching
Allows us to add non-uniform constant vector support for "X urem C -> X < C ? X : X - C, where C >= signbit."

llvm-svn: 324631
2018-02-08 18:36:01 +00:00
Simon Pilgrim 2a90acd17a [InstCombine] Fix issue with X udiv (POW2_C1 << N) for non-splat constant vectors
foldUDivShl was assuming that the input was a scalar or a splat constant

llvm-svn: 324613
2018-02-08 15:19:38 +00:00
Simon Pilgrim 94cc89d5f2 [InstCombine] Fix issue with X udiv 2^C -> X >> C for non-splat constant vectors
foldUDivPow2Cst was assuming that the input was a scalar or a splat constant

llvm-svn: 324608
2018-02-08 14:46:10 +00:00
Simon Pilgrim 4039dbea77 Fix unused variable warning.
llvm-svn: 324605
2018-02-08 14:24:26 +00:00
Simon Pilgrim 0b9f3912ce [InstCombine] Improve mul(x, pow2) -> shl combine for vector constants
Refactor getLogBase2Vector into getLogBase2 to accept all scalars/vectors. Generalize from ConstantDataVector to support all constant vectors.

llvm-svn: 324603
2018-02-08 14:10:01 +00:00
Sanjay Patel 9530f18864 [InstCombine] (X << Y) / X -> 1 << Y
...when the shift is known to not overflow with the matching
signed-ness of the division.

This closes an optimization gap caused by canonicalizing mul
by power-of-2 to shl as shown in PR35709:
https://bugs.llvm.org/show_bug.cgi?id=35709

Patch by Anton Bikineev!

Differential Revision: https://reviews.llvm.org/D42032

llvm-svn: 323068
2018-01-21 16:14:51 +00:00
Benjamin Kramer 738e6e7cb0 [InstCombine] Apply the fix from r322284 for sin / cos -> tan too
llvm-svn: 322285
2018-01-11 15:33:21 +00:00
Benjamin Kramer 44993ede60 [InstCombine] For cos/sin -> tan copy attributes from cos instead of the
parent function

Ideally we should merge the attributes from the functions somehow, but
this is obviously an improvement over taking random attributes from the
caller which will trip up the verifier if they're nonsensical for an
unary intrinsic call.

llvm-svn: 322284
2018-01-11 15:19:02 +00:00
Dmitry Venikov e5fbf591a7 [InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x)
Summary: This patch enables folding sin(x) / cos(x) -> tan(x), cos(x) / sin(x) -> 1 / tan(x) under -ffast-math flag

Reviewers: hfinkel, spatel

Reviewed By: spatel

Subscribers: andrew.w.kaylor, efriedma, scanon, llvm-commits

Differential Revision: https://reviews.llvm.org/D41286

llvm-svn: 322255
2018-01-11 06:33:00 +00:00