Commit Graph

13087 Commits

Author SHA1 Message Date
Roman Lebedev 826db453d1 [NFC][InstCombine] onehot_merge.ll: add last few tests in the state they regress to in D62818
llvm-svn: 365056
2019-07-03 16:48:53 +00:00
Sanjay Patel c1c86adb16 [SLP] add tests for bitcasted vector pointer load; NFC
I'm not sure if this falls within the scope of SLP,
but we could create vector loads for some of these
patterns.

llvm-svn: 365055
2019-07-03 16:46:14 +00:00
Roman Lebedev 9f0c83902d [InstCombine] Y - ~X --> X + Y + 1 fold (PR42457)
Summary:
I *think* we'd want this new variant, because we obviously
have better handling for `add` as compared to `sub`/`not`.

https://rise4fun.com/Alive/WMn

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]]

Reviewers: spatel, nikic, huihuiz, efriedma

Reviewed By: spatel

Subscribers: RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63992

llvm-svn: 365011
2019-07-03 09:41:50 +00:00
Eugene Leviant ac407a7b4a [SCEV][LSR] Prevent using undefined value in binops
On some occasions ReuseOrCreateCast may convert previously
expanded value to undefined. That value may be passed by
SCEVExpander as an argument to InsertBinop making IV chain
undefined.

Differential revision: https://reviews.llvm.org/D63928 

llvm-svn: 365009
2019-07-03 09:36:32 +00:00
Jordan Rupprecht 02647f73d4 Revert [InlineCost] cleanup calculations of Cost and Threshold
This reverts r364422 (git commit 1a3dc76186)

The inlining cost calculation is incorrect, leading to stack overflow due to large stack frames from heavy inlining.

llvm-svn: 365000
2019-07-03 04:01:51 +00:00
Vasileios Porpodas cf47ff5ffb [SLP] Recommit: Look-ahead operand reordering heuristic.
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk

Reviewed By: RKSimon, dtemirbulatov

Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364964
2019-07-02 20:20:28 +00:00
David Bolvansky cb1a5a705c [SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n)
Summary:
Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190



Reviewers: spatel, nikic, efriedma

Reviewed By: efriedma

Subscribers: efriedma, nikic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63038

llvm-svn: 364940
2019-07-02 15:58:45 +00:00
Roman Lebedev 0bde7c6527 [InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484)
I was actually wondering if there was some nicer way than m_Value()+cast,
but apparently what i was really "subconsciously" thinking about
was correctness issue.

hasNoUnsignedWrap()/hasNoUnsignedWrap() exist for Instruction,
not for BinaryOperator, so let's just use m_Instruction(),
thus both avoiding a cast, and a crash.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42484,
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587

llvm-svn: 364915
2019-07-02 12:54:48 +00:00
Roman Lebedev 7928fea4a7 [NFC][InstCombine] Revisit tests for "redundant shift input masking" (PR42456)
llvm-svn: 364897
2019-07-02 10:02:25 +00:00
Roman Lebedev 377dfb0226 [NFC][InstCombine] Add tests for "redundant shift input masking" (PR42456)
https://bugs.llvm.org/show_bug.cgi?id=42456
https://rise4fun.com/Alive/Vf1p

llvm-svn: 364894
2019-07-02 09:27:34 +00:00
Reid Kleckner d72163947a [PGO] Update ICP pass for recent byval type changes
Fixes verifier errors encountered in PR42413.

Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D63842

llvm-svn: 364861
2019-07-01 22:43:39 +00:00
Huihui Zhang 8e1051b3a0 [InstCombine][NFCI] Update test cases in onehot_merge.ll
Use both one bit and signbit shifting to check for one bit merge.

Reviewers: lebedev.ri, spatel, efriedma, craig.topper

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63903

llvm-svn: 364857
2019-07-01 22:00:32 +00:00
Sanjay Patel ddc1b40f26 [InstCombine] reduce more checks for power-of-2-or-zero using ctpop
Extends the transform from:
rL364341
...to include another (more common?) pattern that tests whether a
value is a power-of-2 (including or excluding zero).

llvm-svn: 364856
2019-07-01 22:00:00 +00:00
Jordan Rupprecht a7972dc04a Revert [SLP] Look-ahead operand reordering heuristic.
This reverts r364478 (git commit 574cb0eb3a)

The patch is causing compilation timeouts.

llvm-svn: 364846
2019-07-01 21:10:43 +00:00
Roman Lebedev 975120a21b [NFC][InstCombine] More commutative tests for "shift direction in bittest" (PR42466)
'and' is commutative, if we don't want to touch shift-of-const,
we still need to check the other hand of 'and'.

llvm-svn: 364844
2019-07-01 20:33:56 +00:00
Roman Lebedev e62857786f [NFC][InstCombine] Add tests for "shift direction in bittest" (PR42466)
https://rise4fun.com/Alive/8O1
https://bugs.llvm.org/show_bug.cgi?id=42466

llvm-svn: 364824
2019-07-01 18:11:32 +00:00
Roman Lebedev 04d3d3bbff [InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)
Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

llvm-svn: 364792
2019-07-01 15:55:24 +00:00
Roman Lebedev 72b8d41ce8 [InstCombine] Shift amount reassociation in bittest (PR42399)
Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0`  iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

llvm-svn: 364791
2019-07-01 15:55:15 +00:00
Roman Lebedev 34a0b16e29 [NFC][InstCombine] Better commutative tests for "shift amount reassociation in bittest" pattern.
As discussed in https://reviews.llvm.org/D63829
*if* *both* shifts are one-use, we'd most likely want to produce `lshr`,
and not rely on ordering.

Also, there should likely be a *separate* fold to do this reordering.

llvm-svn: 364772
2019-07-01 14:28:24 +00:00
Roman Lebedev 9f3645869c [NFC][InstCombine] Improve test coverage for ((~x) + y) + 1 -> y - x fold fold (PR42459)
So we indeed to have this fold, but only if +1 is not the last operation..

llvm-svn: 364764
2019-07-01 13:31:06 +00:00
Roman Lebedev d5c3e34cb7 [NFC][InstCombine] Tests for ((~x) + y) + 1 -> y - x fold fold (PR42459)
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

https://bugs.llvm.org/show_bug.cgi?id=42459
https://rise4fun.com/Alive/EPO0

llvm-svn: 364749
2019-07-01 12:22:06 +00:00
Roman Lebedev 4f878fe3a7 [NFC][InstCombine] Tests for x - ~(y) -> x + y + 1 fold (PR42457)
https://bugs.llvm.org/show_bug.cgi?id=42457
https://rise4fun.com/Alive/iFhE

llvm-svn: 364739
2019-07-01 09:57:53 +00:00
Roman Lebedev f55818e3a7 [InstCombine] Omit 'urem' where possible
This was added in D63390 / rL364286 to backend,
but it makes sense to also handle it in middle-end.
https://rise4fun.com/Alive/Zsln

llvm-svn: 364738
2019-07-01 09:41:43 +00:00
Roman Lebedev 0f82f64c83 [NFC][InstCombine] Copy test for omit urem when possible from TargetLowering
Was added in D63390 / rL364286 to backend, but it makes sense to also handle it here.
https://rise4fun.com/Alive/Zsln

llvm-svn: 364737
2019-07-01 09:41:27 +00:00
Yevgeny Rouban d4097b4a93 [SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst
Differential Revision: https://reviews.llvm.org/D60606

llvm-svn: 364734
2019-07-01 08:43:53 +00:00
Sam Parker 98722691b0 [ARM] WLS/LE Code Generation
Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
   to generate intrinsics that guard the loop entry, as well as setting
   the loop trip count.
2) Lower the BRCOND that uses said intrinsic to an Arm specific node:
   ARMWLS.
3) ISelDAGToDAG the node to a new pseudo instruction:
   t2WhileLoopStart.
4) Add support in ArmLowOverheadLoops to handle the new pseudo
   instruction.

Differential Revision: https://reviews.llvm.org/D63816

llvm-svn: 364733
2019-07-01 08:21:28 +00:00
Sanjay Patel 706b48251f [InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics
This is the opposite direction of D62158 (we have to choose 1 form or the other).
Now that we have FMF on the select, this becomes more palatable. And the benefits
of having a single IR instruction for this operation (less chances of missing folds
based on extra uses, etc) overcome my previous comments about the potential advantage
of larger pattern matching/analysis.

Differential Revision: https://reviews.llvm.org/D62414

llvm-svn: 364721
2019-06-30 13:40:31 +00:00
Sanjay Patel 77dc1e8568 [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum
This transform came up in D62414, but we should deal with it first.
We have LLVM intrinsics that correspond exactly to libm calls (unlike
most libm calls, these libm calls never set errno).
This holds without any fast-math-flags, so we should always canonicalize
to those intrinsics directly for better optimization.
Currently, we convert to fcmp+select only when we have FMF (nnan) because
fcmp+select does not preserve the semantics of the call in the general case.

Differential Revision: https://reviews.llvm.org/D63214

llvm-svn: 364714
2019-06-29 14:28:54 +00:00
Roman Lebedev e3a94ba4a9 [InstCombine] Shift amount reassociation (PR42391)
Summary:
Given pattern:
`(x shiftopcode Q) shiftopcode K`
we should rewrite it as
`x shiftopcode (Q+K)`  iff `(Q+K) u< bitwidth(x)`
This is valid for any shift, but they must be identical.

* https://rise4fun.com/Alive/9E2
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 | PR42391]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: nikic

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63812

llvm-svn: 364712
2019-06-29 11:51:50 +00:00
Nikita Popov 2d756c4feb [LFTR] Fix post-inc pointer IV with truncated exit count (PR41998)
Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we
have a truncated exit count we'll truncate the IV when comparing
against the limit, in which case exit count overflow in post-inc
form doesn't matter. However, for pointer IVs we don't do that, so
we have to be careful about incrementing the IV in the wide type.

I'm fixing this by removing the IVCount variable (which was
ExitCount or ExitCount+1) and replacing it with a UsePostInc flag,
and then moving the actual limit adjustment to the individual cases
(which are: pointer IV where we add to the wide type, integer IV
where we add to the narrow type, and constant integer IV where we
add to the wide type).

Differential Revision: https://reviews.llvm.org/D63686

llvm-svn: 364709
2019-06-29 09:24:12 +00:00
Cameron McInally b671535983 [NFC][NewGVN] Explicitly check fpmath metadata in fpmath.ll
Suggested in D63933.

llvm-svn: 364685
2019-06-28 21:39:08 +00:00
Cameron McInally 30e5cf1d8f [NewGVN] Add unary FNeg support to NewGVN pass
Differential Revision: https://reviews.llvm.org/D63933

llvm-svn: 364680
2019-06-28 20:09:32 +00:00
Cameron McInally ab4b2364e5 [GVNSink] Add unary FNeg support to GVNSink pass
Differential Revision: https://reviews.llvm.org/D63900

llvm-svn: 364678
2019-06-28 19:57:31 +00:00
Roman Lebedev 3b4f086df4 [NFC][InstCombine] Shift amount reassociation: revisit flag preservation tests
llvm-svn: 364657
2019-06-28 16:36:53 +00:00
Roman Lebedev 9f1dffdb02 [NFC][InstCombine] Shift amount reassociation: add flag preservation test
As discussed in https://reviews.llvm.org/D63812#inline-569870
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg

So basically if the same flag is set on both original shifts -> set it on new shift.
Don't think we can do anything with non-matching flags on shl.

llvm-svn: 364652
2019-06-28 15:32:52 +00:00
Cameron McInally 9fab46ca0b [NFC][Float2Int] Pre-commit unary FNeg test to basic.ll
llvm-svn: 364649
2019-06-28 15:12:15 +00:00
Cameron McInally 13d9c723c8 [NFC][NewGVN] Pre-commit unary FNeg test to fpmath.ll
llvm-svn: 364646
2019-06-28 14:39:58 +00:00
Sam Parker 9a92be1b35 [HardwareLoops] Loop counter guard intrinsic
Introduce llvm.test.set.loop.iterations which sets the loop counter
and also produces an i1 after testing that the count is not zero.

Differential Revision: https://reviews.llvm.org/D63809

llvm-svn: 364628
2019-06-28 07:38:16 +00:00
Cameron McInally 30cab5d6ee [NFC][GVNSink] Pre-commit unary FNeg test to fpmath.ll
llvm-svn: 364597
2019-06-27 21:23:07 +00:00
Cameron McInally 6e62a796d5 [GVN] Add support for unary FNeg to GVN pass
Differential Revision: https://reviews.llvm.org/D63896

llvm-svn: 364592
2019-06-27 21:05:02 +00:00
Cameron McInally 22afca2ce0 [NFC][GVN] Pre-commit unary FNeg tests to fpmath.ll
llvm-svn: 364587
2019-06-27 20:33:44 +00:00
David Green 152dd3b854 [ARM] Move low overhead loop codegen tests into a separate file. NFC
llvm-svn: 364565
2019-06-27 16:56:41 +00:00
Johannes Doerfert 3b77583e95 [Attr] Add "willreturn" function attribute
This patch introduces a new function attribute, willreturn, to indicate
that a call of this function will either exhibit undefined behavior or
comes back and continues execution at a point in the existing call stack
that includes the current invocation.

This attribute guarantees that the function does not have any endless
loops, endless recursion, or terminating functions like abort or exit.

Patch by Hideto Ueno (@uenoku)

Reviewers: jdoerfert

Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62801

llvm-svn: 364555
2019-06-27 15:51:40 +00:00
Sanjay Patel d0e098696f [InstCombine] remove 'tmp' names and regenerate checks; NFC
llvm-svn: 364546
2019-06-27 14:20:10 +00:00
Tim Northover 22c96a966b IR: compare type attributes deeply when looking into functions.
FunctionComparator attempts to produce a stable comparison of two Function
instances by looking at all available properties. Since ByVal attributes now
contain a Type pointer, they are not trivially ordered and FunctionComparator
should use its own Type comparison logic to sort them.

llvm-svn: 364523
2019-06-27 11:44:45 +00:00
Stefan Stipanovic 5360589b7d [Attributor] Deducing existing nounwind attribute.
Adding nounwind deduction in new attributor framework.

Reviewers: jdoerfert, uenoku

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D63379

llvm-svn: 364521
2019-06-27 11:27:54 +00:00
Huihui Zhang 9f69052394 [InstCombine][NFCI] Fix test comments.
For fold
(X & (signbit l>> Y)) ==/!= 0 -> (X << Y) >=/< 0
(X & (signbit << Y)) ==/!= 0 -> (X l>> Y) >=/< 0

Test cases of X being constant are positive tests not negative.

Prep work for D62818.

llvm-svn: 364497
2019-06-27 05:46:06 +00:00
Vasileios Porpodas 574cb0eb3a [SLP] Look-ahead operand reordering heuristic.
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk

Reviewed By: RKSimon, dtemirbulatov

Subscribers: rnk, rcorcs, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364478
2019-06-26 21:25:24 +00:00
Sanjay Patel b5999f17d4 [InstCombine] change 'tmp' variable names; NFC
I don't think there was anything going wrong here,
but the auto-generating CHECK line script is known
to have problems with 'TMP' because it uses that
to match nameless values.

This is a retry of rL364452.

llvm-svn: 364477
2019-06-26 21:19:31 +00:00
Sanjay Patel 46a3dbf9a6 Revert [InstCombine] change 'tmp' variable names; NFC
This reverts r364452 (git commit 6083ae0b4a)

llvm-svn: 364455
2019-06-26 18:06:51 +00:00
Sanjay Patel 6083ae0b4a [InstCombine] change 'tmp' variable names; NFC
I don't think there was anything going wrong here,
but the auto-generating CHECK line script is known
to have problems with 'TMP' because it uses that
to match nameless values.

llvm-svn: 364452
2019-06-26 17:43:30 +00:00
Sanjay Patel dfdee7bc15 [InstCombine] regenerate test checks; NFC
llvm-svn: 364437
2019-06-26 15:24:08 +00:00
Roman Lebedev 3f3eacfec1 [NFC][InstCombine] Revisit one-use tests in shift-amount-reassociation-in-bittest.ll
llvm-svn: 364433
2019-06-26 14:42:39 +00:00
Roman Lebedev 78edfc4bf0 [NFC][InstCombine] Add shift amount reassociation in bittest tests (PR42399)
https://bugs.llvm.org/show_bug.cgi?id=42399
https://rise4fun.com/Alive/kBb
https://rise4fun.com/Alive/1SB

llvm-svn: 364430
2019-06-26 14:24:41 +00:00
Fedor Sergeev 1a3dc76186 [InlineCost] cleanup calculations of Cost and Threshold
Summary:
Doing better separation of Cost and Threshold.
Cost counts the abstract complexity of live instructions, while Threshold is an upper bound of complexity that inlining is comfortable to pay.
There are two parts:
     - huge 15K last-call-to-static bonus is no longer subtracted from Cost
       but rather is now added to Threshold.

       That makes much more sense, as the cost of inlining (Cost) is not changed by the fact
       that internal function is called once. It only changes the likelyhood of this inlining
       being profitable (Threshold).

     - bonus for calls proved-to-be-inlinable into callee is no longer subtracted from Cost
       but added to Threshold instead.

While calculations are somewhat different,  overall InlineResult should stay the same since Cost >= Threshold compares the same.

Reviewers: eraman, greened, chandlerc, yrouban, apilipenko
Reviewed By: apilipenko
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60740

llvm-svn: 364422
2019-06-26 13:24:24 +00:00
Clement Courbet 2851248fa1 Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
Breaks sanitizers:
    libFuzzer :: cxxstring.test
    libFuzzer :: memcmp.test
    libFuzzer :: recommended-dictionary.test
    libFuzzer :: strcmp.test
    libFuzzer :: value-profile-mem.test
    libFuzzer :: value-profile-strcmp.test

llvm-svn: 364416
2019-06-26 12:13:13 +00:00
Clement Courbet 7b3a5f0e6d [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
This allows later passes (in particular InstCombine) to optimize more
cases.

One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0.

llvm-svn: 364412
2019-06-26 11:50:18 +00:00
Florian Hahn 4c11b5268c [LoopUnroll] Add support for loops with exiting headers and uncond latches.
This patch generalizes the UnrollLoop utility to support loops that exit
from the header instead of the latch. Usually, LoopRotate would take care
of must of those cases, but in some cases (e.g. -Oz), LoopRotate does
not kick in.

Codesize impact looks relatively neutral on ARM64 with -Oz + LTO.

Program                                         master     patch     diff
 External/S.../CFP2006/447.dealII/447.dealII   629060.00  627676.00  -0.2%
 External/SPEC/CINT2000/176.gcc/176.gcc        1245916.00 1244932.00 -0.1%
 MultiSourc...Prolangs-C/simulator/simulator   86100.00   86156.00    0.1%
 MultiSourc...arks/Rodinia/backprop/backprop   66212.00   66252.00    0.1%
 MultiSourc...chmarks/Prolangs-C++/life/life   67276.00   67312.00    0.1%
 MultiSourc...s/Prolangs-C/compiler/compiler   69824.00   69788.00   -0.1%
 MultiSourc...Prolangs-C/assembler/assembler   86672.00   86696.00    0.0%

Reviewers: efriedma, vsk, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D61962

llvm-svn: 364398
2019-06-26 09:16:57 +00:00
Roman Lebedev 567eea44c2 [NFC][InstCombine] Add shift amount reassociation tests (PR42391)
https://bugs.llvm.org/show_bug.cgi?id=42391
https://rise4fun.com/Alive/9E2

llvm-svn: 364393
2019-06-26 08:17:05 +00:00
Huihui Zhang b90cb57b63 [InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0.
Simplify 'shl' inequality test into 'and' equality test.

This pattern happens in the middle-end while simplifying bitfield access,
Exposed in https://reviews.llvm.org/D63505

https://rise4fun.com/Alive/6uz

Reviewers: lebedev.ri, efriedma

Reviewed By: lebedev.ri

Subscribers: spatel, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63675

llvm-svn: 364348
2019-06-25 20:44:52 +00:00
Sanjay Patel fcfa056ceb [InstCombine] reduce checks for power-of-2-or-zero using ctpop
This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero.

This is matching the isPowerOf2() patterns used in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

But there's at least 1 instcombine follow-up needed to match the alternate form:

(v & (v - 1)) == 0;

We should have all of the backend expansions handled with:
rL364319
(x86-specific changes still needed for optimal code based on subtarget)

And the larger patterns to exclude zero as a power-of-2 are joining with this change after:
rL364153 ( D63660 )
rL364246

Differential Revision: https://reviews.llvm.org/D63777

llvm-svn: 364341
2019-06-25 18:51:44 +00:00
Simon Tatham a4b415a683 [ARM] Code-generation infrastructure for MVE.
This provides the low-level support to start using MVE vector types in
LLVM IR, loading and storing them, passing them to __asm__ statements
containing hand-written MVE vector instructions, and *if* you have the
hard-float ABI turned on, using them as function parameters.

(In the soft-float ABI, vector types are passed in integer registers,
and combining all those 32-bit integers into a q-reg requires support
for selection DAG nodes like insert_vector_elt and build_vector which
aren't implemented yet for MVE. In fact I've also had to add
`arm_aapcs_vfpcc` to a couple of existing tests to avoid that
problem.)

Specifically, this commit adds support for:

 * spills, reloads and register moves for MVE vector registers

 * ditto for the VPT predication mask that lives in VPR.P0

 * make all the MVE vector types legal in ISel, and provide selection
   DAG patterns for BITCAST, LOAD and STORE

 * make loads and stores of scalar FP types conditional on
   `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few
   existing tests needed their llc command lines updating to use
   `-mattr=-fpregs` as their method of turning off all hardware FP
   support.

Reviewers: dmgreen, samparker, SjoerdMeijer

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60708

llvm-svn: 364329
2019-06-25 16:48:46 +00:00
Sam Parker bcf0eb7a64 [ARM] Fix for DLS/LE CodeGen
The expensive buildbots highlighted the mir tests were broken, which
I've now updated and added --verify-machineinstrs to them. This also
uncovered a couple of bugs in the backend pass, so these have also
been fixed.

llvm-svn: 364323
2019-06-25 15:11:17 +00:00
Simon Pilgrim e98f8cf78f [SLPVectorizer] Precommit of supernode.ll test for D63661
This is a pre-commit of the tests introduced by the SuperNode SLP patch D63661.

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D63664

llvm-svn: 364320
2019-06-25 14:58:20 +00:00
Sam Parker a6fd919cb3 [ARM] DLS/LE low-overhead loop code generation
Introduce three pseudo instructions to be used during DAG ISel to
represent v8.1-m low-overhead loops. One maps to set_loop_iterations
while loop_decrement_reg is lowered to two, so that we can separate
the decrement and branching operations. The pseudo instructions are
expanded pre-emission, where we can still decide whether we actually
want to generate a low-overhead loop, in a new pass:
ARMLowOverheadLoops. The pass currently bails, reverting to an sub,
icmp and br, in the cases where a call or stack spill/restore happens
between the decrement and branching instructions, or if the loop is
too large.

Differential Revision: https://reviews.llvm.org/D63476

llvm-svn: 364288
2019-06-25 10:45:51 +00:00
Huihui Zhang 2cc3b3856e [InstCombine][NFC] Add test to show missing fold for icmp ult/uge (shl %x, C2), C1.
Summary:
'shl' inequality test

```
  icmp ult/uge (shl %x, C2), C1 iff C1 is power of two
```

can be simplified as 'and' equality test

```
  icmp eq/ne (and %x, (lshr -C1, C2)), 0.
```

Reviewers: lebedev.ri, efriedma

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63670

llvm-svn: 364256
2019-06-25 00:14:02 +00:00
Huihui Zhang 4626613ffe [InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier.
Summary:
To generate simplified IR, make sure fold
  (X & ~C) ==/!= 0 --> X u</u>= C+1

is scheduled before fold
  ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.

https://rise4fun.com/Alive/7ZN

Reviewers: lebedev.ri, efriedma, spatel, craig.topper

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63505

llvm-svn: 364255
2019-06-25 00:09:10 +00:00
Sanjay Patel 2675b0c8ab [InstCombine] squash is-not-power-of-2 using ctpop
This is the Demorgan'd 'not' of the pattern handled in:
D63660 / rL364153

This is another intermediate IR step towards solving PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

We can test if a value is not a power-of-2 using ctpop(X) > 1,
so combining that with an is-zero check of the input is the
same as testing if not exactly 1 bit is set:

(X == 0) || (ctpop(X) u> 1) --> ctpop(X) != 1

llvm-svn: 364246
2019-06-24 22:35:26 +00:00
Matt Arsenault 8025842599 InstCombine: Preserve nuw when reassociating nuw ops [3/3]
Alive says this is OK.

llvm-svn: 364235
2019-06-24 21:37:03 +00:00
Matt Arsenault 5d82ecd5d9 InstCombine: Preserve nuw when reassociating nuw ops [2/3]
Alive says this is OK.

llvm-svn: 364234
2019-06-24 21:37:02 +00:00
Matt Arsenault 5a89ba7343 InstCombine: Preserve nuw when reassociating nuw ops [1/3]
Alive says this is OK.

llvm-svn: 364233
2019-06-24 21:36:59 +00:00
Cameron McInally 1e5116cbb3 [NFC][Reassociate] Add unary FNeg tests to fast-ReassociateVector.ll
llvm-svn: 364232
2019-06-24 21:36:09 +00:00
Nikita Popov f1ffc4305d [CVP] Reenable nowrap flag inference
Inference of nowrap flags in CVP has been disabled, because it
triggered a bug in LFTR (https://bugs.llvm.org/show_bug.cgi?id=31181).
This issue has been fixed in D60935, so we should be able to reenable
nowrap flag inference now.

Differential Revision: https://reviews.llvm.org/D62776

llvm-svn: 364228
2019-06-24 20:13:13 +00:00
Sanjay Patel 2aa800052a [InstCombine] add tests for more variants of isPowerOf2; NFC
llvm-svn: 364227
2019-06-24 20:11:40 +00:00
Huihui Zhang 94b4316096 [InstCombine] Regenerate test pr17827. NFCI.
Prep work for upcoming patch D63505.

llvm-svn: 364224
2019-06-24 19:49:42 +00:00
Philip Reames b2f09391cf [Tests] Add cases where we're failing to discharge provably loop exits (tests for D63733)
llvm-svn: 364220
2019-06-24 19:26:17 +00:00
Cameron McInally fe3f15cf90 [SLP] Support unary FNeg vectorization
Differential Revision: https://reviews.llvm.org/D63609

llvm-svn: 364219
2019-06-24 19:24:23 +00:00
Sanjay Patel 89efefb170 [InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X
Prefer the more exact intrinsic to remove a use of the input value
and possibly make further transforms easier (we will still need
to match patterns with funnel-shift of wider types as pieces of
bswap, especially if we want to canonicalize to funnel-shift with
constant shift amount). Discussed in D46760.

llvm-svn: 364187
2019-06-24 15:20:49 +00:00
Sanjay Patel f27f794d47 [InstCombine] add tests for funnel-shift to bswap; NFC
llvm-svn: 364184
2019-06-24 14:47:02 +00:00
Simon Pilgrim b617b0808d [InstCombine] SliceUpIllegalIntegerPHI - bail on out of range shifts
trunc(lshr) handling - if the shift is out of range (undefined) then bail like we do for non-constant shifts.

Fixes OSS Fuzz #15217

llvm-svn: 364181
2019-06-24 13:13:36 +00:00
Bjorn Pettersson 512b118779 [Scalarizer] Add scalarizer support for smul.fix.sat
Summary:
Handle smul.fix.sat in the scalarizer. This is done by
adding smul.fix.sat to the set of "isTriviallyVectorizable"
intrinsics.

The addition of smul.fix.sat in isTriviallyVectorizable and
hasVectorInstrinsicScalarOpd can also be seen as a preparation
to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding.

Reviewers: rengolin, RKSimon, dblaikie

Reviewed By: rengolin

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63704

llvm-svn: 364177
2019-06-24 12:07:11 +00:00
Philip Reames 3f8264b062 [Tests] Autogen and improve test readability
llvm-svn: 364156
2019-06-23 17:13:53 +00:00
Philip Reames d22a2a9a72 [IndVars] Remove dead instructions after folding trivial loop exit
In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration).  This extends that to eliminate the dead comparison left around.  

llvm-svn: 364155
2019-06-23 17:06:57 +00:00
Sanjay Patel 13a5ae58fc [InstCombine] squash is-power-of-2 that uses ctpop
This is another intermediate IR step towards solving PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

We can test if a value is power-of-2-or-0 using ctpop(X) < 2,
so combining that with a non-zero check of the input is the
same as testing if exactly 1 bit is set:

(X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1

Differential Revision: https://reviews.llvm.org/D63660

llvm-svn: 364153
2019-06-23 14:22:37 +00:00
Philip Reames 8deb84c8ef Exploit a zero LoopExit count to eliminate loop exits
This turned out to be surprisingly effective. I was originally doing this just for completeness sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than it's isKnownPredicate reasoning.

Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here.

Differential Revision: https://reviews.llvm.org/D63618

llvm-svn: 364135
2019-06-22 17:54:25 +00:00
Nikita Popov b89d7e52db [LFTR] Add tests for PR41998; NFC
The limit for the pointer case is incorrect.

llvm-svn: 364128
2019-06-22 09:57:59 +00:00
Reid Kleckner 592a193285 Revert [SLP] Look-ahead operand reordering heuristic.
This reverts r364084 (git commit 5698921be2)

It caused crashes while compiling a file in Chrome. Reduction
forthcoming.

llvm-svn: 364111
2019-06-21 23:10:25 +00:00
Simon Pilgrim 5698921be2 [SLP] Look-ahead operand reordering heuristic.
This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D60897

llvm-svn: 364084
2019-06-21 17:57:01 +00:00
David Bolvansky 2441a4074c [NFC] Update shl-sub tests
llvm-svn: 364083
2019-06-21 17:51:18 +00:00
Sanjay Patel f483617256 [InstCombine] add tests for ctpop folds; NFC
llvm-svn: 364082
2019-06-21 17:44:09 +00:00
David Bolvansky dbcdad51ff [InstCombine] (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1
Summary:
```
%a = sub i32 31, %x
%r = shl i32 1, %a
  =>
%d = shl i32 1, 31
%r = lshr i32 %d, %x

Done: 1
Optimization is correct!
```

https://rise4fun.com/Alive/btZm

Reviewers: spatel, lebedev.ri, nikic

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63652

llvm-svn: 364073
2019-06-21 16:25:32 +00:00
David Bolvansky 045b0f60b6 [NFC] Added more tests for D63652
llvm-svn: 364069
2019-06-21 16:14:13 +00:00
David Bolvansky 4b28478389 [InstCombine] cttz(abs(x)) -> cttz(x)
Summary: Signedness does not change number of trailing zeros.

Reviewers: spatel, lebedev.ri, nikic

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D63546

llvm-svn: 364064
2019-06-21 15:26:22 +00:00
Sanjay Patel ddb9093684 [GVNSink] prevent crashing on mismatched instructions (PR42346)
Patch based on suggestion by James Molloy (@jmolloy) in:
https://bugs.llvm.org/show_bug.cgi?id=42346

llvm-svn: 364062
2019-06-21 15:17:24 +00:00
David Bolvansky b0ba049f58 [NFC] Added tests for (1 << (C - x)) -> ((1 << C) >> x)
llvm-svn: 364060
2019-06-21 15:00:31 +00:00
Jay Foad d9d3c91b48 [Scalarizer] Propagate IR flags
Summary:
The motivation for this was to propagate fast-math flags like nnan and
ninf on vector floating point operations to the corresponding scalar
operations to take advantage of follow-on optimizations. But I think
the same argument applies to all of our IR flags: if they apply to the
vector operation then they also apply to all the individual scalar
operations, and they might enable follow-on optimizations.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63593

llvm-svn: 364051
2019-06-21 14:10:18 +00:00
Sam Elliott 96c8bc7956 [RISCV] Add RISCV-specific TargetTransformInfo
Summary:
LLVM Allows Targets to provide information that guides optimisations
made to LLVM IR. This is done with callbacks on a TargetTransformInfo object.

This patch adds a TargetTransformInfo class for RISC-V. This will allow us to
implement RISC-V specific callbacks as they become necessary.

This commit also adds the getIntImmCost callbacks, and tests them with a simple
constant hoisting test. Our immediate costs are on the conservative side, for
the moment, but we prevent hoisting in most circumstances anyway.

Previous review was on D63007

Reviewers: asb, luismarques

Reviewed By: asb

Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63433

llvm-svn: 364046
2019-06-21 13:36:09 +00:00
Cameron McInally 1c0bd6dd2c [Reassociate] Remove bogus assert reported in PR42349.
Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete.

The bogus assert with introduced in D63445.

llvm-svn: 363998
2019-06-20 23:03:55 +00:00
Sanjay Patel b342f026a4 [InstSimplify] simplify power-of-2 (single bit set) sequences
As discussed in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

Improving the canonicalization for these patterns:
rL363956
...means we should adjust/enhance the related simplification.

https://rise4fun.com/Alive/w1cp

  Name: isPow2 or zero
  %x = and i32 %xx, 2048
  %a = add i32 %x, -1
  %r = and i32 %a, %x
  =>
  %r = i32 0

llvm-svn: 363997
2019-06-20 22:55:28 +00:00
Alina Sbirlea d0b11698cd [LICM & MSSA] Limit unsafe sinking and hoisting.
Summary:
The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being
clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch).
If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink.
Given:
```
for i
  load a[i]
  store a[i]
```
The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it.
In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either
Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.

Caught by PR42294.

This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict
hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63582

llvm-svn: 363982
2019-06-20 21:09:09 +00:00
Sanjay Patel 3207566dd6 [InstSimplify] add tests for known-not-a-power-of-2; NFC
I added a canonicalization to create this general pattern in:
rL363956

But as noted in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314#c11

...we have a (potentially expensive) simplification for the version
of the code that we just canonicalized away from, so we should
add/adjust that code to match.

llvm-svn: 363981
2019-06-20 21:04:14 +00:00
Cameron McInally 9589db7a98 [NFC][SLP] Pre-commit unary FNeg test to X86/propagate_ir_flags.ll
llvm-svn: 363978
2019-06-20 20:53:51 +00:00
David Bolvansky e0c1c3baf9 [NFC] Updated tests for D63546
llvm-svn: 363967
2019-06-20 19:30:56 +00:00
Sanjay Patel 63311bfb83 [InstCombine] canonicalize check for power-of-2
The form that compares against 0 is better because:
1. It removes a use of the input value.
2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS).

This is a root cause for PR42314, but probably doesn't completely answer the codegen request:
https://bugs.llvm.org/show_bug.cgi?id=42314

Alive proof:
https://rise4fun.com/Alive/9kG

  Name: is power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp eq i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp eq i32 %a2, 0

  Name: is not power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp ne i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp ne i32 %a2, 0

llvm-svn: 363956
2019-06-20 17:41:15 +00:00
Philip Reames 8c80d08052 [Tests] Add a tricky LFTR case for documentation purposes
Thought of this case while working on something else.  We appear to get it right in all of the variations I tried, but that's by accident.  So, add a test which would catch the potential bug.

llvm-svn: 363953
2019-06-20 17:16:53 +00:00
David Bolvansky 01511192b2 [InstCombine] cttz(-x) -> cttz(x)
Summary: Signedness does not change number of trailing zeros.

Reviewers: spatel, lebedev.ri, nikic

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63534

llvm-svn: 363951
2019-06-20 17:04:14 +00:00
Sanjay Patel d729ed8d44 [InstCombine] add commuted variants for power-of-2 checks; NFC
llvm-svn: 363945
2019-06-20 16:27:23 +00:00
Sanjay Patel 345473c791 [InstCombine] add tests for checking power-of-2; NFC
llvm-svn: 363938
2019-06-20 15:25:18 +00:00
Cameron McInally 4452c3b490 [NFC][SLP] Pre-commit unary FNeg test to X86/phi3.ll
llvm-svn: 363937
2019-06-20 15:17:17 +00:00
Simon Pilgrim 72186a2494 [SLP][X86] Add lookahead reordering tests from D60897
llvm-svn: 363925
2019-06-20 12:52:58 +00:00
Philip Reames eda1ba65ca LFTR for multiple exit loops
Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop.

This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.)

I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred.

I do want to note that I added one bit after the review.  When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one.  This was theoretically possible with a single exit, but the zero case covered it in practice.  With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached.  Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done.  The solution seemed obvious, so I didn't bother with another round of review.

Differential Revision: https://reviews.llvm.org/D62625

llvm-svn: 363883
2019-06-19 21:58:25 +00:00
Philip Reames 80eb1ce7a0 [Tests] Autogen a test so that future changes are understandable
llvm-svn: 363882
2019-06-19 21:39:07 +00:00
Huihui Zhang 670778c762 [InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier
Summary:
To generate simplified IR, make sure fold
```
  (X & signbit) ==/!= 0) -> X s>=/s< 0;
```
is scheduled before fold
```
  ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.
```

https://rise4fun.com/Alive/fbdh

Reviewers: lebedev.ri, efriedma, spatel, craig.topper

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63026

llvm-svn: 363845
2019-06-19 17:31:39 +00:00
Sanjay Patel 3e03bf6921 [InstSimplify] add a phi test with 1 incoming value; NFC
D63489 proposes to change this behavior, but there's no
direct -instsimplify test to verify that the transform exists.

llvm-svn: 363842
2019-06-19 17:23:29 +00:00
Hubert Tong e9983eed5a [NFC][LSR] Avoid undefined grep in pr2570.ll
greater-than-sign is not a BRE special character.

POSIX.1-2017 XBD Section 9.3.2 indicates that the interpretation of `\>`
is undefined. This patch replaces the pattern.

llvm-svn: 363828
2019-06-19 16:02:54 +00:00
Cameron McInally a027cf4764 [Reassociate] Handle unary FNeg in the Reassociate pass
Differential Revision: https://reviews.llvm.org/D63445

llvm-svn: 363813
2019-06-19 14:59:14 +00:00
David Bolvansky e3cd19d330 [NFC] Added tests for D63534
llvm-svn: 363796
2019-06-19 12:59:37 +00:00
David Bolvansky 21fd232385 [NFC] Added tests for cttz(abs(x)) -> cttz(x) fold
llvm-svn: 363795
2019-06-19 12:55:39 +00:00
Orlando Cazalet-Hyams 1251cac62a [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

> llvm-svn: 363046

llvm-svn: 363786
2019-06-19 10:50:47 +00:00
Jay Foad 45d19fb470 [ConstantFolding] Fix assertion failure on non-power-of-two vector load.
Summary:
The test case does an (out of bounds) load from a global constant with
type <3 x float>. InstSimplify tried to turn this into an integer load
of the whole alloc size of the vector, which is 128 bits due to
alignment padding, and then bitcast this to <3 x vector> which failed
an assertion due to the type size mismatch.

The fix is to do an integer load of the normal size of the vector, with
no alignment padding.

Reviewers: tpr, arsenm, majnemer, dstuttard

Reviewed By: arsenm

Subscribers: hfinkel, wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63375

llvm-svn: 363784
2019-06-19 10:28:48 +00:00
Michael Liao 4f7f70e262 Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas
[SROA] Enhance SROA to handle `addrspacecast`ed allocas

- Fix typo in original change
- Add additional handling to ensure all return pointers are properly
  casted.

Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363743
2019-06-18 21:41:13 +00:00
Matt Arsenault e8d8bb5170 InstCombine: Pre-commit test for reassociating nuw
D39417

llvm-svn: 363741
2019-06-18 21:32:51 +00:00
Adrian Prantl 1db8d4a866 Fix broken debug info in in an !llvm.loop attachment in this testcase.
llvm-svn: 363730
2019-06-18 20:07:53 +00:00
Jordan Rupprecht 33e85ad956 Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas
This reverts r363711 (git commit 76a149ef81)

This causes stage2 build failures, e.g.:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio
http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio

llvm-svn: 363718
2019-06-18 18:40:04 +00:00
Michael Liao 76a149ef81 [SROA] Enhance SROA to handle `addrspacecast`ed allocas
Summary:
- After `addrspacecast` is allowed to be eliminated in SROA, the
  adjusting of storage pointer (from `alloca) needs to handle the
  potential different address spaces between the storage pointer (from
  alloca) and the pointer being used.

Reviewers: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63501

llvm-svn: 363711
2019-06-18 17:58:49 +00:00
Philip Reames 44475363e8 Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges
This patch really contains two pieces:
    Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

Differential Revision: https://reviews.llvm.org/D63224

llvm-svn: 363619
2019-06-17 21:06:17 +00:00
Philip Reames fe8bd96ebd Fix a bug w/inbounds invalidation in LFTR (recommit)
Recommit r363289 with a bug fix for crash identified in pr42279.  Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case.  Test case previously committed in r363601.

Original commit comment follows:

This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363613
2019-06-17 20:32:22 +00:00
Philip Reames 58c75565f3 Reduced test case for pr42279 in advance of the relevant re-commit + fix
llvm-svn: 363601
2019-06-17 19:27:45 +00:00
Joseph Tremoulet daa1ae6142 [EarlyCSE] Fix hashing of self-compares
Summary:
Update compare normalization in SimpleValue hashing to break ties (when
the same value is being compared to itself) by switching to the swapped
predicate if it has a lower numerical value.  This brings the hashing in
line with isEqual, which already recognizes the self-compares with
swapped predicates as equal.

Fixes PR 42280.

Reviewers: spatel, efriedma, nikic, fhahn, uabelho

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63349

llvm-svn: 363598
2019-06-17 19:11:28 +00:00
Matt Arsenault 6d741f29ec AMDGPU: Fold readlane/readfirstlane calls
llvm-svn: 363587
2019-06-17 17:52:35 +00:00
Warren Ristow 6452bdd29b [LV] Suppress vectorization in some nontemporal cases
When considering a loop containing nontemporal stores or loads for
vectorization, suppress the vectorization if the corresponding
vectorized store or load with the aligment of the original scaler
memory op is not supported with the nontemporal hint on the target.

This adds two new functions:
  bool isLegalNTStore(Type *DataType, unsigned Alignment) const;
  bool isLegalNTLoad(Type *DataType, unsigned Alignment) const;

to TTI, leaving the target independent default implementation as
returning true, but with overriding implementations for X86 that
check the legality based on available Subtarget features.

This fixes https://llvm.org/PR40759

Differential Revision: https://reviews.llvm.org/D61764

llvm-svn: 363581
2019-06-17 17:20:08 +00:00
Matt Arsenault 1df203d78e InferAddressSpaces: Fix cloning original addrspacecast
If an addrspacecast needed to be inserted again, this was creating a
clone of the original cast for each user. Just use the original, which
also saves losing the value name.

llvm-svn: 363562
2019-06-17 14:13:29 +00:00
Matt Arsenault b10f097833 AMDGPU: Ignore subtarget for InferAddressSpaces
Even if the target doesn't have flat instructions, addrspace(0) is
still flat. It just happens to not work.

llvm-svn: 363561
2019-06-17 14:13:24 +00:00
Matt Arsenault f3b64d80bc AMDGPU: Mark exp/exp.compr as inaccessiblememonly
Should also be marked writeonly, but I think that would require
splitting the version with done set to a separate intrinsic

Test change is only from renumbering the attribute group numbers,
which for some reason the generated check lines consider.

llvm-svn: 363560
2019-06-17 13:52:24 +00:00
Sam Parker 1bd3d00e7e [CodeGen] Check for HardwareLoop Latch ExitBlock
The HardwareLoops pass finds exit blocks with a scevable exit count.
If the target specifies to update the loop counter in a register,
through a phi, we need to ensure that the exit block is a latch so
that we can insert the phi with the correct value for the incoming
edge.

Differential Revision: https://reviews.llvm.org/D63336

llvm-svn: 363556
2019-06-17 13:39:28 +00:00
Bjorn Pettersson 83773b77a5 [LV] Deny irregular types in interleavedAccessCanBeWidened
Summary:
Avoid that loop vectorizer creates loads/stores of vectors
with "irregular" types when interleaving. An example of
an irregular type is x86_fp80 that is 80 bits, but that
may have an allocation size that is 96 bits. So an array
of x86_fp80 is not bitcast compatible with a vector
of the same type.

Not sure if interleavedAccessCanBeWidened is the best
place for this check, but it solves the problem seen
in the added test case. And it is the same kind of check
that already exists in memoryInstructionCanBeWidened.

Reviewers: fhahn, Ayal, craig.topper

Reviewed By: fhahn

Subscribers: hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63386

llvm-svn: 363547
2019-06-17 12:02:24 +00:00
Sam Parker 60d6fb2a63 [SCEV] Use NoWrapFlags when expanding a simple mul
Second functional change following on from rL362687. Pass the
NoWrapFlags from the MulExpr to InsertBinop when we're generating a
shl or mul.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 363540
2019-06-17 10:05:18 +00:00
Fangrui Song ac14f7b10c [lit] Delete empty lines at the end of lit.local.cfg NFC
llvm-svn: 363538
2019-06-17 09:51:07 +00:00
Hans Wennborg a9e5d2f35d Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
Third time's the charm.

This was reverted in r363220 due to being suspected of an internal benchmark
regression and a test failure, none of which turned out to be caused by this.

llvm-svn: 363529
2019-06-17 07:47:28 +00:00
Yevgeny Rouban ee62c40eae [SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases
SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata
if unreachable switch cases are removed. This patch fixes this bug by making use
of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122).
A new test is created.

Differential Revision: https://reviews.llvm.org/D62186

llvm-svn: 363527
2019-06-17 05:55:12 +00:00
Roman Lebedev 5a663bd77a [InstSimplify] Fix addo/subo undef folds (PR42209)
Fix folds of addo and subo with an undef operand to be:

`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
 as per LLVM undef rules.
Same for commuted variants.

Based on the original version of the patch by @nikic.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]

Differential Revision: https://reviews.llvm.org/D63065

llvm-svn: 363522
2019-06-16 20:39:45 +00:00
Sanjay Patel c8d88ad1a9 [CodeGenPrepare][x86] shift both sides of a vector select when profitable
This is based on the example/discussion in PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428

Proper vector shift instructions don't appear until AVX2, so we may generate several
extra instructions within a loop trying to compensate for that. It's difficult to
recover from that shift expansion later than this, so use the existing TLI hook and
splat analysis to enable better codegen.

This extends CGP functionality introduced with:
rL201655

Differential Revision: https://reviews.llvm.org/D63233

llvm-svn: 363511
2019-06-16 15:29:03 +00:00
Nikita Popov 9145562b48 [SimplifyIndVar] Simplify non-overflowing saturating add/sub
If we can detect that saturating math that depends on an IV cannot
overflow, replace it with simple math. This is similar to the CVP
optimization from D62703, just based on a different underlying
analysis (SCEV vs LVI) that catches different cases.

Differential Revision: https://reviews.llvm.org/D62792

llvm-svn: 363489
2019-06-15 08:48:52 +00:00
Huihui Zhang dc2fd6a14e [InstCombine] Add tests to show missing fold opportunity for "icmp and shift" (nfc).
Summary:
For icmp pred (and (sh X, Y), C), 0

  When C is signbit, expect to fold (X << Y) & signbit ==/!= 0 into (X << Y) >=/< 0,
  rather than (X & (signbit >> Y)) != 0.

  When C+1 is power of 2, expect to fold (X << Y) & ~C ==/!= 0 into (X << Y) </>= C+1,
  rather than (X & (~C >> Y)) == 0.

For icmp pred (and X, (sh signbit, Y)), 0

  Expect to fold (X & (signbit l>> Y)) ==/!= 0 into (X << Y) >=/< 0
  Expect to fold (X & (signbit << Y)) ==/!= 0 into (X l>> Y) >=/< 0

  Reviewers: lebedev.ri, efriedma, spatel, craig.topper

  Reviewed By: lebedev.ri

  Subscribers: llvm-commits

  Tags: #llvm

  Differential Revision: https://reviews.llvm.org/D63025

llvm-svn: 363479
2019-06-15 00:33:41 +00:00
Akira Hatanaka a704a8f28c [ObjC][ARC] Delete ObjC runtime calls on global variables annotated
with 'objc_arc_inert'

Those calls are no-ops, so they can be safely deleted.

rdar://problem/49839633

Differential Revision: https://reviews.llvm.org/D62433

llvm-svn: 363468
2019-06-14 22:06:32 +00:00
Matt Arsenault 282dac717e SROA: Allow eliminating addrspacecasted allocas
There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.

This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.

For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.

llvm-svn: 363462
2019-06-14 21:38:31 +00:00
Matt Arsenault e6efb6433f SROA: Add baseline test for addrspacecast changes
llvm-svn: 363460
2019-06-14 21:22:26 +00:00
Florian Hahn dcdd12b68c Revert Fix a bug w/inbounds invalidation in LFTR
Reverting because it breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363289 (git commit eb88badff9)

llvm-svn: 363427
2019-06-14 17:23:09 +00:00
Sanjay Patel 7ea378b940 [CodeGenPrepare] propagate debuginfo when copying a shuffle
llvm-svn: 363409
2019-06-14 15:05:35 +00:00
Matt Arsenault 492d71cc99 AMDGPU: Fold readlane intrinsics of constants
I'm not 100% sure about this, since I'm worried about IR transforms
that might end up introducing divergence downstream once replaced with
a constant, but I haven't come up with an example yet.

llvm-svn: 363406
2019-06-14 14:51:26 +00:00
Sam Parker 0cf9639a9c [SCEV] Pass NoWrapFlags when expanding an AddExpr
InsertBinop now accepts NoWrapFlags, so pass them through when
expanding a simple add expression.

This is the first re-commit of the functional changes from rL362687,
which was previously reverted.

Differential Revision: https://reviews.llvm.org/D61934

llvm-svn: 363364
2019-06-14 09:19:41 +00:00
Stanislav Mekhanoshin 68a2fef9ae [AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32
Differential Revision: https://reviews.llvm.org/D63301

llvm-svn: 363339
2019-06-13 23:47:36 +00:00
Shawn Landden 24f4085811 [SimplifyCFG] NFC, update Switch tests as a baseline.
Also add baseline tests to show effect of later patches.

There were a couple of regressions here that were never caught,
but my patch set that this is a preparation to will fix them.

This is the third attempt to land this patch.

Differential Revision: https://reviews.llvm.org/D61150

llvm-svn: 363319
2019-06-13 19:36:38 +00:00
Sanjay Patel 5bf7f81aa8 [InstCombine] add test for failed libfunction prototype matching; NFC
llvm-svn: 363291
2019-06-13 18:26:10 +00:00
Philip Reames eb88badff9 Fix a bug w/inbounds invalidation in LFTR
This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying.  As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939

llvm-svn: 363289
2019-06-13 18:23:13 +00:00
Sanjay Patel 4d93fb528e [InstCombine] auto-generate complete test checks; NFC
llvm-svn: 363286
2019-06-13 18:14:49 +00:00
David Bolvansky a9d8388e80 [NFC] Updated testcase for D54411/rL363284
llvm-svn: 363285
2019-06-13 18:13:03 +00:00
Joseph Tremoulet 3bc6e2a7aa [EarlyCSE] Ensure equal keys have the same hash value
Summary:
The logic in EarlyCSE that looks through 'not' operations in the
predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is
equivalent to `select (cmp sgt X, Y), Y, X`.  Without this change,
however, only the latter is recognized as a form of `smin X, Y`, so the
two expressions receive different hash codes.  This leads to missed
optimization opportunities when the quadratic probing for the two hashes
doesn't happen to collide, and assertion failures when probing doesn't
collide on insertion but does collide on a subsequent table grow
operation.

This change inverts the order of some of the pattern matching, checking
first for the optional `not` and then for the min/max/abs patterns, so
that e.g. both expressions above are recognized as a form of `smin X, Y`.

It also adds an assertion to isEqual verifying that it implies equal
hash codes; this fires when there's a collision during insertion, not
just grow, and so will make it easier to notice if these functions fall
out of sync again.  A new flag --earlycse-debug-hash is added which can
be used when changing the hash function; it forces hash collisions so
that any pair of values inserted which compare as equal but hash
differently will be caught by the isEqual assertion.

Reviewers: spatel, nikic

Reviewed By: spatel, nikic

Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62644

llvm-svn: 363274
2019-06-13 15:24:11 +00:00
Sander de Smalen 51c2fa0e2a Improve reduction intrinsics by overloading result value.
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.

For example:

  ; The LLVM LangRef specifies that the scalar result must equal the
  ; vector element type, but this is not checked/enforced by LLVM.
  declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

This patch changes that into:

  declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)

Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.

Reviewers: RKSimon, arsenm, rnk, greened, aemerson

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D62996

llvm-svn: 363240
2019-06-13 09:37:38 +00:00
Sam Parker 9d28473a35 [ARM][TTI] Scan for existing loop intrinsics
TTI should report that it's not profitable to generate a hardware loop
if it, or one of its child loops, has already been converted.

Differential Revision: https://reviews.llvm.org/D63212

llvm-svn: 363234
2019-06-13 08:28:46 +00:00
Shawn Landden 8b142bcc3f [SimplifyCFG] reverting preliminary Switch patches again
This reverts 363226 and 363227, both NFC intended

I swear I fixed the test case that is failing, and ran
the tests, but I will look into it again.

llvm-svn: 363229
2019-06-13 05:26:17 +00:00
Shawn Landden c54b2011bd [SimplifyCFG] NFC, update Switch tests to better examine successive patches
Also add baseline tests to show effect of later patches.

There were a couple of regressions here that were never caught,
but my patch set that this is a preparation to will fix them.

Differential Revision: https://reviews.llvm.org/D61150

llvm-svn: 363226
2019-06-13 04:51:35 +00:00
Shawn Landden c6cba2957d [SimplifyCFG] revert the last commit.
I ran ALL the test suite locally, so I will look into this...

llvm-svn: 363223
2019-06-13 02:47:47 +00:00
Shawn Landden f93b99b2b6 [SimplifyCFG] NFC, update Switch tests to HEAD so I can
see if my changes change anything

Also add baseline tests to show effect of later patches.

Differential Revision: https://reviews.llvm.org/D61150

llvm-svn: 363222
2019-06-13 02:24:24 +00:00
David L. Jones c73fadaa84 Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...'
We have observed some failures with internal builds with this revision.

- Performance regressions:
  - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings).
  - Benchmarks for Abseil's SwissTable.
- Correctness:
  - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP).

hwennborg has already been notified, and is aware of reproducer failures.

llvm-svn: 363220
2019-06-13 02:04:45 +00:00
Dinar Temirbulatov b2f45ba1e8 [SLP] Update propagate_ir_flags.ll test to check that we do retain the common subset, NFC.
llvm-svn: 363218
2019-06-13 00:19:50 +00:00
Philip Reames 0bded8442f [Tests] Highlight impact of multiple exit LFTR (D62625) as requested by reviewer
llvm-svn: 363217
2019-06-12 23:39:49 +00:00
Sanjay Patel a1421e8347 [x86] add tests for vector shifts; NFC
llvm-svn: 363203
2019-06-12 21:30:06 +00:00
Philip Reames 00e481b75d [Tests] Autogen RLEV test and add tests for a future enhancement
llvm-svn: 363193
2019-06-12 19:23:10 +00:00
Philip Reames 851adc000c [Tests] Add tests to highlight sibling loop optimization order issue for exit rewriting
The issue addressed in r363180 is more broadly relevant.  For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEExpander needing to preserve LCSSA, and b) don't iterate between loops.

llvm-svn: 363192
2019-06-12 19:04:51 +00:00
Philip Reames e51c3d8b82 [SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673
SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA.  It's not entirely clear this restriction is neccessary, but for the moment it exists.   For this reason, we don't analyze single-entry phi inputs.  However it is possible that when an this input leaves the loop through LCSSA Phi, it is a provable constant.  Missing that results in an order of optimization issue in loop exit value rewriting where we miss some oppurtunities based on order in which we visit sibling loops.

This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values.  We should probably also add similiar logic directly in the SCEV construction path itself.

Patch by: mkazantsev (with revised commit message by me)
Differential Revision: https://reviews.llvm.org/D58113

llvm-svn: 363180
2019-06-12 17:21:47 +00:00
Sanjay Patel 64006896ac [InstCombine] add tests for fmin/fmax libcalls; NFC
llvm-svn: 363175
2019-06-12 15:29:40 +00:00
Sam Parker 3d42959dd8 Revert rL363156.
The patch was to fix buildbots, but rL363157 should now be fixing it
in a cleaner way.

llvm-svn: 363174
2019-06-12 15:28:00 +00:00
Matt Arsenault f29366b1f5 StackProtector: Use PointerMayBeCaptured
This was using its own, outdated list of possible captures. This was
at minimum not catching cmpxchg and addrspacecast captures.

One change is now any volatile access is treated as capturing. The
test coverage for this pass is quite inadequate, but this required
removing volatile in the lifetime capture test.

Also fixes some infrastructure issues to allow running just the IR
pass.

Fixes bug 42238.

llvm-svn: 363169
2019-06-12 14:23:33 +00:00
Matt Arsenault aa6bdf9dcd LoopVersioning: Respect convergent
This changes the standalone pass only. Arguably the utility class
itself should assert there are no convergent calls. However, a target
pass with additional context may still be able to version a loop if
all of the dynamic conditions are sufficiently uniform.

llvm-svn: 363165
2019-06-12 14:05:58 +00:00
Sanjay Patel 082a41994a [InstCombine] add tests for fcmp+select with FMF (minnum/maxnum); NFC
llvm-svn: 363163
2019-06-12 13:51:33 +00:00
Matt Arsenault 86325be3d7 LoopLoadElim: Respect convergent
llvm-svn: 363162
2019-06-12 13:50:47 +00:00
Matt Arsenault 2466ba97bc LoopDistribute/LAA: Respect convergent
This case is slightly tricky, because loop distribution should be
allowed in some cases, and not others. As long as runtime dependency
checks don't need to be introduced, this should be OK. This is further
complicated by the fact that LoopDistribute partially ignores if LAA
says that vectorization is safe, and then does its own runtime pointer
legality checks.

Note this pass still does not handle noduplicate correctly, as this
should always be forbidden with it. I'm not going to bother trying to
fix it, as it would require more effort and I think noduplicate should
be removed.

https://reviews.llvm.org/D62607

llvm-svn: 363160
2019-06-12 13:34:19 +00:00
Matt Arsenault 1e21181aee LoopDistribute/LAA: Add tests to catch regressions
I broke 2 of these with a patch, but were not covered by existing
tests.

https://reviews.llvm.org/D63035

llvm-svn: 363158
2019-06-12 13:15:59 +00:00
Sam Parker 52d7326f32 [NFC] Add HardwareLoops lit.local.cfg file
Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't
an arm target.

llvm-svn: 363157
2019-06-12 12:54:19 +00:00
Sam Parker ece316b56a Attempt to fix non-Arm buildbots
Adding REQUIRES: arm to failing tests

llvm-svn: 363156
2019-06-12 12:47:35 +00:00
Sam Parker 757ac02dc8 [ARM] Implement TTI::isHardwareLoopProfitable
Implement the backend target hook to drive the HardwareLoops pass.
The low-overhead branch extension for Arm M-class cores is flexible
enough that we don't have to ensure correctness at this point, except
checking that the loop counter variable can be stored in LR - a
32-bit register. For it to be profitable, we want to avoid loops that
contain function calls, or any other instruction that alters the PC.
    
This implementation uses TargetLoweringInfo, to query type and
operation actions, looks at intrinsic calls and also performs some
manual checks for remainder/division and FP operations.
    
I think this should be a good base to start and extra details can be
filled out later.

Differential Revision: https://reviews.llvm.org/D62907

llvm-svn: 363149
2019-06-12 12:00:42 +00:00
Orlando Cazalet-Hyams a947156396 Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion"
This reverts commit 1a0f7a2077.
See phabricator thread for D60831.

llvm-svn: 363132
2019-06-12 08:34:51 +00:00
Philip Reames 082cd30327 Generalize icmp matching in IndVars' eliminateTrunc
We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases.

Differential Revision: https://reviews.llvm.org/D63112

llvm-svn: 363108
2019-06-11 22:43:25 +00:00
Cameron McInally 08200d6d26 [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ
Differential Revision: https://reviews.llvm.org/D62612

llvm-svn: 363082
2019-06-11 16:21:21 +00:00
Cameron McInally 796de11331 [InstCombine] Update fptrunc (fneg x)) -> (fneg (fptrunc x) for unary FNeg
Differential Revision: https://reviews.llvm.org/D62629

llvm-svn: 363080
2019-06-11 15:45:41 +00:00
Orlando Cazalet-Hyams 1a0f7a2077 [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

llvm-svn: 363046
2019-06-11 10:37:20 +00:00
Matt Arsenault c5830f5f05 AtomicExpand: Don't crash on non-0 alloca
This now produces garbage on AMDGPU with a call to an nonexistent,
anonymous libcall but won't assert.

llvm-svn: 363022
2019-06-11 01:35:07 +00:00
Matt Arsenault 383e72fcfe AMDGPU: Expand < 32-bit atomics
Also fix AtomicExpand asserting on atomicrmw fadd/fsub.

llvm-svn: 363021
2019-06-11 01:35:00 +00:00
Philip Reames efb14f9005 [Tests] Adjust LFTR dead-iv tests to bypass undef cases
As pointed out by Nikita in review, undef and poison need to be handled separately.  Since we're no longer expecting any test improvements - just fixes for miscompiles - update the tests to bypass the existing undef check.

llvm-svn: 363002
2019-06-10 23:17:10 +00:00
Rong Xu e44fa83c37 [PGO] Handle cases of non-instrument BBs
As shown in PR41279, some basic blocks (such as catchswitch) cannot be
instrumented. This patch filters out these BBs in PGO instrumentation.
It also sets the profile count to the fail-to-instrument edge, so that we
can propagate the counts in the CFG.

Differential Revision: https://reviews.llvm.org/D62700

llvm-svn: 362995
2019-06-10 22:36:27 +00:00
Philip Reames 1d322ccaac [Tests] Split an LFTR dead-iv case
There are two interesting sub-cases here.  1) Switching IVs is legal, but only in pre-increment form.  and 2) Switching IVs is legal, and so is post-increment form.

llvm-svn: 362993
2019-06-10 22:33:20 +00:00
Philip Reames 78c0d75697 [Tests] Add tests for D62939 (miscompiles around dead pointer IVs)
Flesh out a collection of tests for switching to a dead IV within LFTR, both for the current miscompile, and for some cases which we should be able to handle via simple reasoning.

llvm-svn: 362976
2019-06-10 19:45:59 +00:00
Philip Reames a9633d5f0b [LFTR] Use recomputed BE count
This was discussed as part of D62880.  The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening.  Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion.  This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.

For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

Note that before is an i32 type, and the after is an i64.  Truncating the i64 produces the i32. 

llvm-svn: 362975
2019-06-10 19:18:53 +00:00
Sanjay Patel 9650c95b7e [InstCombine] allow unordered preds when canonicalizing to fabs()
We have a known-never-nan value via 'nnan', so an unordered predicate
is the same as its ordered sibling.

Similar to:
rL362937

llvm-svn: 362954
2019-06-10 15:39:00 +00:00
Sanjay Patel 07bba68889 [InstCombine] add tests for fabs() with unordered preds; NFC
llvm-svn: 362949
2019-06-10 15:08:22 +00:00
Sanjay Patel 85de9634e6 [InstCombine] fix bug in canonicalization to fabs()
Forgot to translate the predicate clauses in rL362943.

llvm-svn: 362945
2019-06-10 14:57:45 +00:00
Sanjay Patel 8b6d9f60ed [InstCombine] change canonicalization to fabs() to use FMF on fsub
Similar to rL362909:
This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fsub because they have the
same operand.

llvm-svn: 362943
2019-06-10 14:46:36 +00:00
Sanjay Patel 8cd8c5784b [InstCombine] allow unordered preds when canonicalizing to fabs()
PR42179:
https://bugs.llvm.org/show_bug.cgi?id=42179

llvm-svn: 362937
2019-06-10 14:14:51 +00:00
Sanjay Patel 4cdd3ceb57 [InstCombine] add tests for fcmp unordered pred -> fabs (PR42179); NFC
llvm-svn: 362936
2019-06-10 14:04:10 +00:00