llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	826db453d1	[NFC][InstCombine] onehot_merge.ll: add last few tests in the state they regress to in D62818 llvm-svn: 365056	2019-07-03 16:48:53 +00:00
Sanjay Patel	c1c86adb16	[SLP] add tests for bitcasted vector pointer load; NFC I'm not sure if this falls within the scope of SLP, but we could create vector loads for some of these patterns. llvm-svn: 365055	2019-07-03 16:46:14 +00:00
Roman Lebedev	9f0c83902d	[InstCombine] Y - ~X --> X + Y + 1 fold (PR42457) Summary: I think we'd want this new variant, because we obviously have better handling for `add` as compared to `sub`/`not`. https://rise4fun.com/Alive/WMn Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42457 \| PR42457 ]] Reviewers: spatel, nikic, huihuiz, efriedma Reviewed By: spatel Subscribers: RKSimon, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63992 llvm-svn: 365011	2019-07-03 09:41:50 +00:00
Eugene Leviant	ac407a7b4a	[SCEV][LSR] Prevent using undefined value in binops On some occasions ReuseOrCreateCast may convert previously expanded value to undefined. That value may be passed by SCEVExpander as an argument to InsertBinop making IV chain undefined. Differential revision: https://reviews.llvm.org/D63928 llvm-svn: 365009	2019-07-03 09:36:32 +00:00
Jordan Rupprecht	02647f73d4	Revert [InlineCost] cleanup calculations of Cost and Threshold This reverts r364422 (git commit `1a3dc76186`) The inlining cost calculation is incorrect, leading to stack overflow due to large stack frames from heavy inlining. llvm-svn: 365000	2019-07-03 04:01:51 +00:00
Vasileios Porpodas	cf47ff5ffb	[SLP] Recommit: Look-ahead operand reordering heuristic. Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: hiraditya, phosek, rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364964	2019-07-02 20:20:28 +00:00
David Bolvansky	cb1a5a705c	[SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n) Summary: Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190 Reviewers: spatel, nikic, efriedma Reviewed By: efriedma Subscribers: efriedma, nikic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63038 llvm-svn: 364940	2019-07-02 15:58:45 +00:00
Roman Lebedev	0bde7c6527	[InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484) I was actually wondering if there was some nicer way than m_Value()+cast, but apparently what I was really "subconsciously" thinking about was a correctness issue. hasNoUnsignedWrap()/hasNoSignedWrap() exist for Instruction, not for BinaryOperator, so let's just use m_Instruction(), thus avoiding both a cast and a crash. Fixes https://bugs.llvm.org/show_bug.cgi?id=42484, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587 llvm-svn: 364915	2019-07-02 12:54:48 +00:00
Roman Lebedev	7928fea4a7	[NFC][InstCombine] Revisit tests for "redundant shift input masking" (PR42456) llvm-svn: 364897	2019-07-02 10:02:25 +00:00
Roman Lebedev	377dfb0226	[NFC][InstCombine] Add tests for "redundant shift input masking" (PR42456) https://bugs.llvm.org/show_bug.cgi?id=42456 https://rise4fun.com/Alive/Vf1p llvm-svn: 364894	2019-07-02 09:27:34 +00:00
Reid Kleckner	d72163947a	[PGO] Update ICP pass for recent byval type changes Fixes verifier errors encountered in PR42413. Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv Differential Revision: https://reviews.llvm.org/D63842 llvm-svn: 364861	2019-07-01 22:43:39 +00:00
Huihui Zhang	8e1051b3a0	[InstCombine][NFCI] Update test cases in onehot_merge.ll Use both one bit and signbit shifting to check for one bit merge. Reviewers: lebedev.ri, spatel, efriedma, craig.topper Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63903 llvm-svn: 364857	2019-07-01 22:00:32 +00:00
Sanjay Patel	ddc1b40f26	[InstCombine] reduce more checks for power-of-2-or-zero using ctpop Extends the transform from: rL364341 ...to include another (more common?) pattern that tests whether a value is a power-of-2 (including or excluding zero). llvm-svn: 364856	2019-07-01 22:00:00 +00:00
Jordan Rupprecht	a7972dc04a	Revert [SLP] Look-ahead operand reordering heuristic. This reverts r364478 (git commit `574cb0eb3a`) The patch is causing compilation timeouts. llvm-svn: 364846	2019-07-01 21:10:43 +00:00
Roman Lebedev	975120a21b	[NFC][InstCombine] More commutative tests for "shift direction in bittest" (PR42466) 'and' is commutative, if we don't want to touch shift-of-const, we still need to check the other hand of 'and'. llvm-svn: 364844	2019-07-01 20:33:56 +00:00
Roman Lebedev	e62857786f	[NFC][InstCombine] Add tests for "shift direction in bittest" (PR42466) https://rise4fun.com/Alive/8O1 https://bugs.llvm.org/show_bug.cgi?id=42466 llvm-svn: 364824	2019-07-01 18:11:32 +00:00
Roman Lebedev	04d3d3bbff	[InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459) Summary: Note that this pattern is not unhandled by instcombine per se; it somehow does end up being folded when one runs opt -O3, but not with just -instcombine. Regardless, that fold is indirect, depends on some other folds, and is thus blind when there are extra uses. This does address the regression being exposed in D63992. https://godbolt.org/z/7DGltU https://rise4fun.com/Alive/EPO0 Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 \| PR42459 ]] Reviewers: spatel, nikic, huihuiz Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63993 llvm-svn: 364792	2019-07-01 15:55:24 +00:00
Roman Lebedev	72b8d41ce8	[InstCombine] Shift amount reassociation in bittest (PR42399) Summary: Given pattern: `icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0` we should move shifts to the same hand of 'and', i.e. rewrite as `icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)` It might be tempting to not restrict this to situations where we know we'd fold two shifts together, but i'm not sure what rules should there be to avoid endless combine loops. We pick the same shift that was originally used to shift the variable we picked to shift: https://rise4fun.com/Alive/6x1v Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 \| PR42399]]. Reviewers: spatel, nikic, RKSimon Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63829 llvm-svn: 364791	2019-07-01 15:55:15 +00:00
Roman Lebedev	34a0b16e29	[NFC][InstCombine] Better commutative tests for "shift amount reassociation in bittest" pattern. As discussed in https://reviews.llvm.org/D63829 if both shifts are one-use, we'd most likely want to produce `lshr`, and not rely on ordering. Also, there should likely be a separate fold to do this reordering. llvm-svn: 364772	2019-07-01 14:28:24 +00:00
Roman Lebedev	9f3645869c	[NFC][InstCombine] Improve test coverage for ((~x) + y) + 1 -> y - x fold (PR42459) So we do indeed have this fold, but only if +1 is not the last operation. llvm-svn: 364764	2019-07-01 13:31:06 +00:00
Roman Lebedev	d5c3e34cb7	[NFC][InstCombine] Tests for ((~x) + y) + 1 -> y - x fold (PR42459) Note that this pattern is not unhandled by instcombine per se; it somehow does end up being folded when one runs opt -O3, but not with just -instcombine. Regardless, that fold is indirect, depends on some other folds, and is thus blind when there are extra uses. https://bugs.llvm.org/show_bug.cgi?id=42459 https://rise4fun.com/Alive/EPO0 llvm-svn: 364749	2019-07-01 12:22:06 +00:00
Roman Lebedev	4f878fe3a7	[NFC][InstCombine] Tests for x - ~(y) -> x + y + 1 fold (PR42457) https://bugs.llvm.org/show_bug.cgi?id=42457 https://rise4fun.com/Alive/iFhE llvm-svn: 364739	2019-07-01 09:57:53 +00:00
Roman Lebedev	f55818e3a7	[InstCombine] Omit 'urem' where possible This was added in D63390 / rL364286 to backend, but it makes sense to also handle it in middle-end. https://rise4fun.com/Alive/Zsln llvm-svn: 364738	2019-07-01 09:41:43 +00:00
Roman Lebedev	0f82f64c83	[NFC][InstCombine] Copy test for omit urem when possible from TargetLowering Was added in D63390 / rL364286 to backend, but it makes sense to also handle it here. https://rise4fun.com/Alive/Zsln llvm-svn: 364737	2019-07-01 09:41:27 +00:00
Yevgeny Rouban	d4097b4a93	[SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst Differential Revision: https://reviews.llvm.org/D60606 llvm-svn: 364734	2019-07-01 08:43:53 +00:00
Sam Parker	98722691b0	[ARM] WLS/LE Code Generation Backend changes to enable WLS/LE low-overhead loops for armv8.1-m: 1) Use TTI to communicate to the HardwareLoop pass that we should try to generate intrinsics that guard the loop entry, as well as setting the loop trip count. 2) Lower the BRCOND that uses said intrinsic to an Arm specific node: ARMWLS. 3) ISelDAGToDAG the node to a new pseudo instruction: t2WhileLoopStart. 4) Add support in ArmLowOverheadLoops to handle the new pseudo instruction. Differential Revision: https://reviews.llvm.org/D63816 llvm-svn: 364733	2019-07-01 08:21:28 +00:00
Sanjay Patel	706b48251f	[InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics This is the opposite direction of D62158 (we have to choose 1 form or the other). Now that we have FMF on the select, this becomes more palatable. And the benefits of having a single IR instruction for this operation (less chances of missing folds based on extra uses, etc) overcome my previous comments about the potential advantage of larger pattern matching/analysis. Differential Revision: https://reviews.llvm.org/D62414 llvm-svn: 364721	2019-06-30 13:40:31 +00:00
Sanjay Patel	77dc1e8568	[InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum This transform came up in D62414, but we should deal with it first. We have LLVM intrinsics that correspond exactly to libm calls (unlike most libm calls, these libm calls never set errno). This holds without any fast-math-flags, so we should always canonicalize to those intrinsics directly for better optimization. Currently, we convert to fcmp+select only when we have FMF (nnan) because fcmp+select does not preserve the semantics of the call in the general case. Differential Revision: https://reviews.llvm.org/D63214 llvm-svn: 364714	2019-06-29 14:28:54 +00:00
Roman Lebedev	e3a94ba4a9	[InstCombine] Shift amount reassociation (PR42391) Summary: Given pattern: `(x shiftopcode Q) shiftopcode K` we should rewrite it as `x shiftopcode (Q+K)` iff `(Q+K) u< bitwidth(x)` This is valid for any shift, but they must be identical. * https://rise4fun.com/Alive/9E2 * exact on both lshr => exact https://rise4fun.com/Alive/plHk * exact on both ashr => exact https://rise4fun.com/Alive/QDAA * nuw on both shl => nuw https://rise4fun.com/Alive/5Uk * nsw on both shl => nsw https://rise4fun.com/Alive/0plg Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 \| PR42391]]. Reviewers: spatel, nikic, RKSimon Reviewed By: nikic Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63812 llvm-svn: 364712	2019-06-29 11:51:50 +00:00
Nikita Popov	2d756c4feb	[LFTR] Fix post-inc pointer IV with truncated exit count (PR41998) Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we have a truncated exit count we'll truncate the IV when comparing against the limit, in which case exit count overflow in post-inc form doesn't matter. However, for pointer IVs we don't do that, so we have to be careful about incrementing the IV in the wide type. I'm fixing this by removing the IVCount variable (which was ExitCount or ExitCount+1) and replacing it with a UsePostInc flag, and then moving the actual limit adjustment to the individual cases (which are: pointer IV where we add to the wide type, integer IV where we add to the narrow type, and constant integer IV where we add to the wide type). Differential Revision: https://reviews.llvm.org/D63686 llvm-svn: 364709	2019-06-29 09:24:12 +00:00
Cameron McInally	b671535983	[NFC][NewGVN] Explicitly check fpmath metadata in fpmath.ll Suggested in D63933. llvm-svn: 364685	2019-06-28 21:39:08 +00:00
Cameron McInally	30e5cf1d8f	[NewGVN] Add unary FNeg support to NewGVN pass Differential Revision: https://reviews.llvm.org/D63933 llvm-svn: 364680	2019-06-28 20:09:32 +00:00
Cameron McInally	ab4b2364e5	[GVNSink] Add unary FNeg support to GVNSink pass Differential Revision: https://reviews.llvm.org/D63900 llvm-svn: 364678	2019-06-28 19:57:31 +00:00
Roman Lebedev	3b4f086df4	[NFC][InstCombine] Shift amount reassociation: revisit flag preservation tests llvm-svn: 364657	2019-06-28 16:36:53 +00:00
Roman Lebedev	9f1dffdb02	[NFC][InstCombine] Shift amount reassociation: add flag preservation test As discussed in https://reviews.llvm.org/D63812#inline-569870 * exact on both lshr => exact https://rise4fun.com/Alive/plHk * exact on both ashr => exact https://rise4fun.com/Alive/QDAA * nuw on both shl => nuw https://rise4fun.com/Alive/5Uk * nsw on both shl => nsw https://rise4fun.com/Alive/0plg So basically if the same flag is set on both original shifts -> set it on new shift. Don't think we can do anything with non-matching flags on shl. llvm-svn: 364652	2019-06-28 15:32:52 +00:00
Cameron McInally	9fab46ca0b	[NFC][Float2Int] Pre-commit unary FNeg test to basic.ll llvm-svn: 364649	2019-06-28 15:12:15 +00:00
Cameron McInally	13d9c723c8	[NFC][NewGVN] Pre-commit unary FNeg test to fpmath.ll llvm-svn: 364646	2019-06-28 14:39:58 +00:00
Sam Parker	9a92be1b35	[HardwareLoops] Loop counter guard intrinsic Introduce llvm.test.set.loop.iterations which sets the loop counter and also produces an i1 after testing that the count is not zero. Differential Revision: https://reviews.llvm.org/D63809 llvm-svn: 364628	2019-06-28 07:38:16 +00:00
Cameron McInally	30cab5d6ee	[NFC][GVNSink] Pre-commit unary FNeg test to fpmath.ll llvm-svn: 364597	2019-06-27 21:23:07 +00:00
Cameron McInally	6e62a796d5	[GVN] Add support for unary FNeg to GVN pass Differential Revision: https://reviews.llvm.org/D63896 llvm-svn: 364592	2019-06-27 21:05:02 +00:00
Cameron McInally	22afca2ce0	[NFC][GVN] Pre-commit unary FNeg tests to fpmath.ll llvm-svn: 364587	2019-06-27 20:33:44 +00:00
David Green	152dd3b854	[ARM] Move low overhead loop codegen tests into a separate file. NFC llvm-svn: 364565	2019-06-27 16:56:41 +00:00
Johannes Doerfert	3b77583e95	[Attr] Add "willreturn" function attribute This patch introduces a new function attribute, willreturn, to indicate that a call of this function will either exhibit undefined behavior or come back and continue execution at a point in the existing call stack that includes the current invocation. This attribute guarantees that the function does not have any endless loops, endless recursion, or terminating functions like abort or exit. Patch by Hideto Ueno (@uenoku) Reviewers: jdoerfert Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62801 llvm-svn: 364555	2019-06-27 15:51:40 +00:00
Sanjay Patel	d0e098696f	[InstCombine] remove 'tmp' names and regenerate checks; NFC llvm-svn: 364546	2019-06-27 14:20:10 +00:00
Tim Northover	22c96a966b	IR: compare type attributes deeply when looking into functions. FunctionComparator attempts to produce a stable comparison of two Function instances by looking at all available properties. Since ByVal attributes now contain a Type pointer, they are not trivially ordered and FunctionComparator should use its own Type comparison logic to sort them. llvm-svn: 364523	2019-06-27 11:44:45 +00:00
Stefan Stipanovic	5360589b7d	[Attributor] Deducing existing nounwind attribute. Adding nounwind deduction in new attributor framework. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D63379 llvm-svn: 364521	2019-06-27 11:27:54 +00:00
Huihui Zhang	9f69052394	[InstCombine][NFCI] Fix test comments. For fold (X & (signbit l>> Y)) ==/!= 0 -> (X << Y) >=/< 0 (X & (signbit << Y)) ==/!= 0 -> (X l>> Y) >=/< 0 Test cases of X being constant are positive tests not negative. Prep work for D62818. llvm-svn: 364497	2019-06-27 05:46:06 +00:00
Vasileios Porpodas	574cb0eb3a	[SLP] Look-ahead operand reordering heuristic. Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364478	2019-06-26 21:25:24 +00:00
Sanjay Patel	b5999f17d4	[InstCombine] change 'tmp' variable names; NFC I don't think there was anything going wrong here, but the auto-generating CHECK line script is known to have problems with 'TMP' because it uses that to match nameless values. This is a retry of rL364452. llvm-svn: 364477	2019-06-26 21:19:31 +00:00
Sanjay Patel	46a3dbf9a6	Revert [InstCombine] change 'tmp' variable names; NFC This reverts r364452 (git commit `6083ae0b4a`) llvm-svn: 364455	2019-06-26 18:06:51 +00:00
Sanjay Patel	6083ae0b4a	[InstCombine] change 'tmp' variable names; NFC I don't think there was anything going wrong here, but the auto-generating CHECK line script is known to have problems with 'TMP' because it uses that to match nameless values. llvm-svn: 364452	2019-06-26 17:43:30 +00:00
Sanjay Patel	dfdee7bc15	[InstCombine] regenerate test checks; NFC llvm-svn: 364437	2019-06-26 15:24:08 +00:00
Roman Lebedev	3f3eacfec1	[NFC][InstCombine] Revisit one-use tests in shift-amount-reassociation-in-bittest.ll llvm-svn: 364433	2019-06-26 14:42:39 +00:00
Roman Lebedev	78edfc4bf0	[NFC][InstCombine] Add shift amount reassociation in bittest tests (PR42399) https://bugs.llvm.org/show_bug.cgi?id=42399 https://rise4fun.com/Alive/kBb https://rise4fun.com/Alive/1SB llvm-svn: 364430	2019-06-26 14:24:41 +00:00
Fedor Sergeev	1a3dc76186	[InlineCost] cleanup calculations of Cost and Threshold Summary: Doing better separation of Cost and Threshold. Cost counts the abstract complexity of live instructions, while Threshold is an upper bound of complexity that inlining is comfortable to pay. There are two parts: - huge 15K last-call-to-static bonus is no longer subtracted from Cost but rather is now added to Threshold. That makes much more sense, as the cost of inlining (Cost) is not changed by the fact that an internal function is called once. It only changes the likelihood of this inlining being profitable (Threshold). - bonus for calls proved-to-be-inlinable into callee is no longer subtracted from Cost but added to Threshold instead. While calculations are somewhat different, overall InlineResult should stay the same since Cost >= Threshold compares the same. Reviewers: eraman, greened, chandlerc, yrouban, apilipenko Reviewed By: apilipenko Tags: #llvm Differential Revision: https://reviews.llvm.org/D60740 llvm-svn: 364422	2019-06-26 13:24:24 +00:00
Clement Courbet	2851248fa1	Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline." Breaks sanitizers: libFuzzer :: cxxstring.test libFuzzer :: memcmp.test libFuzzer :: recommended-dictionary.test libFuzzer :: strcmp.test libFuzzer :: value-profile-mem.test libFuzzer :: value-profile-strcmp.test llvm-svn: 364416	2019-06-26 12:13:13 +00:00
Clement Courbet	7b3a5f0e6d	[ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline. This allows later passes (in particular InstCombine) to optimize more cases. One that's important to us is `memcmp(p, q, constant) < 0` and `memcmp(p, q, constant) > 0`. llvm-svn: 364412	2019-06-26 11:50:18 +00:00
Florian Hahn	4c11b5268c	[LoopUnroll] Add support for loops with exiting headers and uncond latches. This patch generalizes the UnrollLoop utility to support loops that exit from the header instead of the latch. Usually, LoopRotate would take care of most of those cases, but in some cases (e.g. -Oz), LoopRotate does not kick in. Codesize impact looks relatively neutral on ARM64 with -Oz + LTO. Program master patch diff External/S.../CFP2006/447.dealII/447.dealII 629060.00 627676.00 -0.2% External/SPEC/CINT2000/176.gcc/176.gcc 1245916.00 1244932.00 -0.1% MultiSourc...Prolangs-C/simulator/simulator 86100.00 86156.00 0.1% MultiSourc...arks/Rodinia/backprop/backprop 66212.00 66252.00 0.1% MultiSourc...chmarks/Prolangs-C++/life/life 67276.00 67312.00 0.1% MultiSourc...s/Prolangs-C/compiler/compiler 69824.00 69788.00 -0.1% MultiSourc...Prolangs-C/assembler/assembler 86672.00 86696.00 0.0% Reviewers: efriedma, vsk, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D61962 llvm-svn: 364398	2019-06-26 09:16:57 +00:00
Roman Lebedev	567eea44c2	[NFC][InstCombine] Add shift amount reassociation tests (PR42391) https://bugs.llvm.org/show_bug.cgi?id=42391 https://rise4fun.com/Alive/9E2 llvm-svn: 364393	2019-06-26 08:17:05 +00:00
Huihui Zhang	b90cb57b63	[InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0. Simplify 'shl' inequality test into 'and' equality test. This pattern happens in the middle-end while simplifying bitfield access, Exposed in https://reviews.llvm.org/D63505 https://rise4fun.com/Alive/6uz Reviewers: lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: spatel, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63675 llvm-svn: 364348	2019-06-25 20:44:52 +00:00
Sanjay Patel	fcfa056ceb	[InstCombine] reduce checks for power-of-2-or-zero using ctpop This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero. This is matching the isPowerOf2() patterns used in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 But there's at least 1 instcombine follow-up needed to match the alternate form: (v & (v - 1)) == 0; We should have all of the backend expansions handled with: rL364319 (x86-specific changes still needed for optimal code based on subtarget) And the larger patterns to exclude zero as a power-of-2 are joining with this change after: rL364153 ( D63660 ) rL364246 Differential Revision: https://reviews.llvm.org/D63777 llvm-svn: 364341	2019-06-25 18:51:44 +00:00
Simon Tatham	a4b415a683	[ARM] Code-generation infrastructure for MVE. This provides the low-level support to start using MVE vector types in LLVM IR, loading and storing them, passing them to __asm__ statements containing hand-written MVE vector instructions, and if you have the hard-float ABI turned on, using them as function parameters. (In the soft-float ABI, vector types are passed in integer registers, and combining all those 32-bit integers into a q-reg requires support for selection DAG nodes like insert_vector_elt and build_vector which aren't implemented yet for MVE. In fact I've also had to add `arm_aapcs_vfpcc` to a couple of existing tests to avoid that problem.) Specifically, this commit adds support for: * spills, reloads and register moves for MVE vector registers * ditto for the VPT predication mask that lives in VPR.P0 * make all the MVE vector types legal in ISel, and provide selection DAG patterns for BITCAST, LOAD and STORE * make loads and stores of scalar FP types conditional on `hasFPRegs()` rather than `hasVFP2Base()`. As a result a few existing tests needed their llc command lines updating to use `-mattr=-fpregs` as their method of turning off all hardware FP support. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60708 llvm-svn: 364329	2019-06-25 16:48:46 +00:00
Sam Parker	bcf0eb7a64	[ARM] Fix for DLS/LE CodeGen The expensive buildbots highlighted the mir tests were broken, which I've now updated and added --verify-machineinstrs to them. This also uncovered a couple of bugs in the backend pass, so these have also been fixed. llvm-svn: 364323	2019-06-25 15:11:17 +00:00
Simon Pilgrim	e98f8cf78f	[SLPVectorizer] Precommit of supernode.ll test for D63661 This is a pre-commit of the tests introduced by the SuperNode SLP patch D63661. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D63664 llvm-svn: 364320	2019-06-25 14:58:20 +00:00
Sam Parker	a6fd919cb3	[ARM] DLS/LE low-overhead loop code generation Introduce three pseudo instructions to be used during DAG ISel to represent v8.1-m low-overhead loops. One maps to set_loop_iterations while loop_decrement_reg is lowered to two, so that we can separate the decrement and branching operations. The pseudo instructions are expanded pre-emission, where we can still decide whether we actually want to generate a low-overhead loop, in a new pass: ARMLowOverheadLoops. The pass currently bails, reverting to a sub, icmp and br, in the cases where a call or stack spill/restore happens between the decrement and branching instructions, or if the loop is too large. Differential Revision: https://reviews.llvm.org/D63476 llvm-svn: 364288	2019-06-25 10:45:51 +00:00
Huihui Zhang	2cc3b3856e	[InstCombine][NFC] Add test to show missing fold for icmp ult/uge (shl %x, C2), C1. Summary: 'shl' inequality test ``` icmp ult/uge (shl %x, C2), C1 iff C1 is power of two ``` can be simplified as 'and' equality test ``` icmp eq/ne (and %x, (lshr -C1, C2)), 0. ``` Reviewers: lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63670 llvm-svn: 364256	2019-06-25 00:14:02 +00:00
Huihui Zhang	4626613ffe	[InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier. Summary: To generate simplified IR, make sure fold (X & ~C) ==/!= 0 --> X u</u>= C+1 is scheduled before fold ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. https://rise4fun.com/Alive/7ZN Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63505 llvm-svn: 364255	2019-06-25 00:09:10 +00:00
Sanjay Patel	2675b0c8ab	[InstCombine] squash is-not-power-of-2 using ctpop This is the Demorgan'd 'not' of the pattern handled in: D63660 / rL364153 This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is not a power-of-2 using ctpop(X) > 1, so combining that with an is-zero check of the input is the same as testing if not exactly 1 bit is set: (X == 0) \|\| (ctpop(X) u> 1) --> ctpop(X) != 1 llvm-svn: 364246	2019-06-24 22:35:26 +00:00
Matt Arsenault	8025842599	InstCombine: Preserve nuw when reassociating nuw ops [3/3] Alive says this is OK. llvm-svn: 364235	2019-06-24 21:37:03 +00:00
Matt Arsenault	5d82ecd5d9	InstCombine: Preserve nuw when reassociating nuw ops [2/3] Alive says this is OK. llvm-svn: 364234	2019-06-24 21:37:02 +00:00
Matt Arsenault	5a89ba7343	InstCombine: Preserve nuw when reassociating nuw ops [1/3] Alive says this is OK. llvm-svn: 364233	2019-06-24 21:36:59 +00:00
Cameron McInally	1e5116cbb3	[NFC][Reassociate] Add unary FNeg tests to fast-ReassociateVector.ll llvm-svn: 364232	2019-06-24 21:36:09 +00:00
Nikita Popov	f1ffc4305d	[CVP] Reenable nowrap flag inference Inference of nowrap flags in CVP has been disabled, because it triggered a bug in LFTR (https://bugs.llvm.org/show_bug.cgi?id=31181). This issue has been fixed in D60935, so we should be able to reenable nowrap flag inference now. Differential Revision: https://reviews.llvm.org/D62776 llvm-svn: 364228	2019-06-24 20:13:13 +00:00
Sanjay Patel	2aa800052a	[InstCombine] add tests for more variants of isPowerOf2; NFC llvm-svn: 364227	2019-06-24 20:11:40 +00:00
Huihui Zhang	94b4316096	[InstCombine] Regenerate test pr17827. NFCI. Prep work for upcoming patch D63505. llvm-svn: 364224	2019-06-24 19:49:42 +00:00
Philip Reames	b2f09391cf	[Tests] Add cases where we're failing to discharge provably loop exits (tests for D63733) llvm-svn: 364220	2019-06-24 19:26:17 +00:00
Cameron McInally	fe3f15cf90	[SLP] Support unary FNeg vectorization Differential Revision: https://reviews.llvm.org/D63609 llvm-svn: 364219	2019-06-24 19:24:23 +00:00
Sanjay Patel	89efefb170	[InstCombine] reduce funnel-shift i16 X, X, 8 to bswap X Prefer the more exact intrinsic to remove a use of the input value and possibly make further transforms easier (we will still need to match patterns with funnel-shift of wider types as pieces of bswap, especially if we want to canonicalize to funnel-shift with constant shift amount). Discussed in D46760. llvm-svn: 364187	2019-06-24 15:20:49 +00:00
Sanjay Patel	f27f794d47	[InstCombine] add tests for funnel-shift to bswap; NFC llvm-svn: 364184	2019-06-24 14:47:02 +00:00
Simon Pilgrim	b617b0808d	[InstCombine] SliceUpIllegalIntegerPHI - bail on out of range shifts trunc(lshr) handling - if the shift is out of range (undefined) then bail like we do for non-constant shifts. Fixes OSS Fuzz #15217 llvm-svn: 364181	2019-06-24 13:13:36 +00:00
Bjorn Pettersson	512b118779	[Scalarizer] Add scalarizer support for smul.fix.sat Summary: Handle smul.fix.sat in the scalarizer. This is done by adding smul.fix.sat to the set of "isTriviallyVectorizable" intrinsics. The addition of smul.fix.sat in isTriviallyVectorizable and hasVectorInstrinsicScalarOpd can also be seen as a preparation to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding. Reviewers: rengolin, RKSimon, dblaikie Reviewed By: rengolin Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63704 llvm-svn: 364177	2019-06-24 12:07:11 +00:00
Philip Reames	3f8264b062	[Tests] Autogen and improve test readability llvm-svn: 364156	2019-06-23 17:13:53 +00:00
Philip Reames	d22a2a9a72	[IndVars] Remove dead instructions after folding trivial loop exit In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration). This extends that to eliminate the dead comparison left around. llvm-svn: 364155	2019-06-23 17:06:57 +00:00
Sanjay Patel	13a5ae58fc	[InstCombine] squash is-power-of-2 that uses ctpop This is another intermediate IR step towards solving PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 We can test if a value is power-of-2-or-0 using ctpop(X) < 2, so combining that with a non-zero check of the input is the same as testing if exactly 1 bit is set: (X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1 Differential Revision: https://reviews.llvm.org/D63660 llvm-svn: 364153	2019-06-23 14:22:37 +00:00
Philip Reames	8deb84c8ef	Exploit a zero LoopExit count to eliminate loop exits This turned out to be surprisingly effective. I was originally doing this just for completeness' sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than its isKnownPredicate reasoning. Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here. Differential Revision: https://reviews.llvm.org/D63618 llvm-svn: 364135	2019-06-22 17:54:25 +00:00
Nikita Popov	b89d7e52db	[LFTR] Add tests for PR41998; NFC The limit for the pointer case is incorrect. llvm-svn: 364128	2019-06-22 09:57:59 +00:00
Reid Kleckner	592a193285	Revert [SLP] Look-ahead operand reordering heuristic. This reverts r364084 (git commit `5698921be2`) It caused crashes while compiling a file in Chrome. Reduction forthcoming. llvm-svn: 364111	2019-06-21 23:10:25 +00:00
Simon Pilgrim	5698921be2	[SLP] Look-ahead operand reordering heuristic. This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364084	2019-06-21 17:57:01 +00:00
David Bolvansky	2441a4074c	[NFC] Update shl-sub tests llvm-svn: 364083	2019-06-21 17:51:18 +00:00
Sanjay Patel	f483617256	[InstCombine] add tests for ctpop folds; NFC llvm-svn: 364082	2019-06-21 17:44:09 +00:00
David Bolvansky	dbcdad51ff	[InstCombine] (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1 Summary: ``` %a = sub i32 31, %x %r = shl i32 1, %a => %d = shl i32 1, 31 %r = lshr i32 %d, %x Done: 1 Optimization is correct! ``` https://rise4fun.com/Alive/btZm Reviewers: spatel, lebedev.ri, nikic Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63652 llvm-svn: 364073	2019-06-21 16:25:32 +00:00
David Bolvansky	045b0f60b6	[NFC] Added more tests for D63652 llvm-svn: 364069	2019-06-21 16:14:13 +00:00
David Bolvansky	4b28478389	[InstCombine] cttz(abs(x)) -> cttz(x) Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D63546 llvm-svn: 364064	2019-06-21 15:26:22 +00:00
Sanjay Patel	ddb9093684	[GVNSink] prevent crashing on mismatched instructions (PR42346) Patch based on suggestion by James Molloy (@jmolloy) in: https://bugs.llvm.org/show_bug.cgi?id=42346 llvm-svn: 364062	2019-06-21 15:17:24 +00:00
David Bolvansky	b0ba049f58	[NFC] Added tests for (1 << (C - x)) -> ((1 << C) >> x) llvm-svn: 364060	2019-06-21 15:00:31 +00:00
Jay Foad	d9d3c91b48	[Scalarizer] Propagate IR flags Summary: The motivation for this was to propagate fast-math flags like nnan and ninf on vector floating point operations to the corresponding scalar operations to take advantage of follow-on optimizations. But I think the same argument applies to all of our IR flags: if they apply to the vector operation then they also apply to all the individual scalar operations, and they might enable follow-on optimizations. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63593 llvm-svn: 364051	2019-06-21 14:10:18 +00:00
Sam Elliott	96c8bc7956	[RISCV] Add RISCV-specific TargetTransformInfo Summary: LLVM Allows Targets to provide information that guides optimisations made to LLVM IR. This is done with callbacks on a TargetTransformInfo object. This patch adds a TargetTransformInfo class for RISC-V. This will allow us to implement RISC-V specific callbacks as they become necessary. This commit also adds the getIntImmCost callbacks, and tests them with a simple constant hoisting test. Our immediate costs are on the conservative side, for the moment, but we prevent hoisting in most circumstances anyway. Previous review was on D63007 Reviewers: asb, luismarques Reviewed By: asb Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny Tags: #llvm Differential Revision: https://reviews.llvm.org/D63433 llvm-svn: 364046	2019-06-21 13:36:09 +00:00
Cameron McInally	1c0bd6dd2c	[Reassociate] Remove bogus assert reported in PR42349. Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to an FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete. The bogus assert was introduced in D63445. llvm-svn: 363998	2019-06-20 23:03:55 +00:00
Sanjay Patel	b342f026a4	[InstSimplify] simplify power-of-2 (single bit set) sequences As discussed in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 Improving the canonicalization for these patterns: rL363956 ...means we should adjust/enhance the related simplification. https://rise4fun.com/Alive/w1cp Name: isPow2 or zero %x = and i32 %xx, 2048 %a = add i32 %x, -1 %r = and i32 %a, %x => %r = i32 0 llvm-svn: 363997	2019-06-20 22:55:28 +00:00
Alina Sbirlea	d0b11698cd	[LICM & MSSA] Limit unsafe sinking and hoisting. Summary: The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch). If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink. Given: ``` for i load a[i] store a[i] ``` The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it. In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs. Caught by PR42294. This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63582 llvm-svn: 363982	2019-06-20 21:09:09 +00:00
Sanjay Patel	3207566dd6	[InstSimplify] add tests for known-not-a-power-of-2; NFC I added a canonicalization to create this general pattern in: rL363956 But as noted in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314#c11 ...we have a (potentially expensive) simplification for the version of the code that we just canonicalized away from, so we should add/adjust that code to match. llvm-svn: 363981	2019-06-20 21:04:14 +00:00
Cameron McInally	9589db7a98	[NFC][SLP] Pre-commit unary FNeg test to X86/propagate_ir_flags.ll llvm-svn: 363978	2019-06-20 20:53:51 +00:00
David Bolvansky	e0c1c3baf9	[NFC] Updated tests for D63546 llvm-svn: 363967	2019-06-20 19:30:56 +00:00
Sanjay Patel	63311bfb83	[InstCombine] canonicalize check for power-of-2 The form that compares against 0 is better because: 1. It removes a use of the input value. 2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2 3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS). This is a root cause for PR42314, but probably doesn't completely answer the codegen request: https://bugs.llvm.org/show_bug.cgi?id=42314 Alive proof: https://rise4fun.com/Alive/9kG Name: is power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp eq i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp eq i32 %a2, 0 Name: is not power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp ne i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp ne i32 %a2, 0 llvm-svn: 363956	2019-06-20 17:41:15 +00:00
Philip Reames	8c80d08052	[Tests] Add a tricky LFTR case for documentation purposes Thought of this case while working on something else. We appear to get it right in all of the variations I tried, but that's by accident. So, add a test which would catch the potential bug. llvm-svn: 363953	2019-06-20 17:16:53 +00:00
David Bolvansky	01511192b2	[InstCombine] cttz(-x) -> cttz(x) Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63534 llvm-svn: 363951	2019-06-20 17:04:14 +00:00
Sanjay Patel	d729ed8d44	[InstCombine] add commuted variants for power-of-2 checks; NFC llvm-svn: 363945	2019-06-20 16:27:23 +00:00
Sanjay Patel	345473c791	[InstCombine] add tests for checking power-of-2; NFC llvm-svn: 363938	2019-06-20 15:25:18 +00:00
Cameron McInally	4452c3b490	[NFC][SLP] Pre-commit unary FNeg test to X86/phi3.ll llvm-svn: 363937	2019-06-20 15:17:17 +00:00
Simon Pilgrim	72186a2494	[SLP][X86] Add lookahead reordering tests from D60897 llvm-svn: 363925	2019-06-20 12:52:58 +00:00
Philip Reames	eda1ba65ca	LFTR for multiple exit loops Teach IndVarSimplify's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop. This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.) I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred. I do want to note that I added one bit after the review. When running tests, I saw a new failure (no idea why I didn't see it previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one. This was theoretically possible with a single exit, but the zero case covered it in practice. With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute an ExitCount for one of the exits which was guaranteed to never actually be reached. Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplification work we'd just done. The solution seemed obvious, so I didn't bother with another round of review. Differential Revision: https://reviews.llvm.org/D62625 llvm-svn: 363883	2019-06-19 21:58:25 +00:00
Philip Reames	80eb1ce7a0	[Tests] Autogen a test so that future changes are understandable llvm-svn: 363882	2019-06-19 21:39:07 +00:00
Huihui Zhang	670778c762	[InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier Summary: To generate simplified IR, make sure fold ``` (X & signbit) ==/!= 0) -> X s>=/s< 0; ``` is scheduled before fold ``` ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. ``` https://rise4fun.com/Alive/fbdh Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63026 llvm-svn: 363845	2019-06-19 17:31:39 +00:00
Sanjay Patel	3e03bf6921	[InstSimplify] add a phi test with 1 incoming value; NFC D63489 proposes to change this behavior, but there's no direct -instsimplify test to verify that the transform exists. llvm-svn: 363842	2019-06-19 17:23:29 +00:00
Hubert Tong	e9983eed5a	[NFC][LSR] Avoid undefined grep in pr2570.ll greater-than-sign is not a BRE special character. POSIX.1-2017 XBD Section 9.3.2 indicates that the interpretation of `\>` is undefined. This patch replaces the pattern. llvm-svn: 363828	2019-06-19 16:02:54 +00:00
Cameron McInally	a027cf4764	[Reassociate] Handle unary FNeg in the Reassociate pass Differential Revision: https://reviews.llvm.org/D63445 llvm-svn: 363813	2019-06-19 14:59:14 +00:00
David Bolvansky	e3cd19d330	[NFC] Added tests for D63534 llvm-svn: 363796	2019-06-19 12:59:37 +00:00
David Bolvansky	21fd232385	[NFC] Added tests for cttz(abs(x)) -> cttz(x) fold llvm-svn: 363795	2019-06-19 12:55:39 +00:00
Orlando Cazalet-Hyams	1251cac62a	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 > llvm-svn: 363046 llvm-svn: 363786	2019-06-19 10:50:47 +00:00
Jay Foad	45d19fb470	[ConstantFolding] Fix assertion failure on non-power-of-two vector load. Summary: The test case does an (out of bounds) load from a global constant with type <3 x float>. InstSimplify tried to turn this into an integer load of the whole alloc size of the vector, which is 128 bits due to alignment padding, and then bitcast this to <3 x float> which failed an assertion due to the type size mismatch. The fix is to do an integer load of the normal size of the vector, with no alignment padding. Reviewers: tpr, arsenm, majnemer, dstuttard Reviewed By: arsenm Subscribers: hfinkel, wdng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63375 llvm-svn: 363784	2019-06-19 10:28:48 +00:00
Michael Liao	4f7f70e262	Recommit [SROA] Enhance SROA to handle `addrspacecast`ed allocas [SROA] Enhance SROA to handle `addrspacecast`ed allocas - Fix typo in original change - Add additional handling to ensure all return pointers are properly cast. Summary: - After `addrspacecast` is allowed to be eliminated in SROA, the adjusting of storage pointer (from `alloca`) needs to handle the potential different address spaces between the storage pointer (from alloca) and the pointer being used. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63501 llvm-svn: 363743	2019-06-18 21:41:13 +00:00
Matt Arsenault	e8d8bb5170	InstCombine: Pre-commit test for reassociating nuw D39417 llvm-svn: 363741	2019-06-18 21:32:51 +00:00
Adrian Prantl	1db8d4a866	Fix broken debug info in an !llvm.loop attachment in this testcase. llvm-svn: 363730	2019-06-18 20:07:53 +00:00
Jordan Rupprecht	33e85ad956	Revert [SROA] Enhance SROA to handle `addrspacecast`ed allocas This reverts r363711 (git commit `76a149ef81`) This causes stage2 build failures, e.g.: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/132/steps/stage%202%20build/logs/stdio http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/87/steps/build-stage2-unified-tree/logs/stdio llvm-svn: 363718	2019-06-18 18:40:04 +00:00
Michael Liao	76a149ef81	[SROA] Enhance SROA to handle `addrspacecast`ed allocas Summary: - After `addrspacecast` is allowed to be eliminated in SROA, the adjusting of storage pointer (from `alloca`) needs to handle the potential different address spaces between the storage pointer (from alloca) and the pointer being used. Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63501 llvm-svn: 363711	2019-06-18 17:58:49 +00:00
Philip Reames	44475363e8	Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges This patch really contains two pieces: Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi. Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses. Differential Revision: https://reviews.llvm.org/D63224 llvm-svn: 363619	2019-06-17 21:06:17 +00:00
Philip Reames	fe8bd96ebd	Fix a bug w/inbounds invalidation in LFTR (recommit) Recommit r363289 with a bug fix for crash identified in pr42279. Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case. Test case previously committed in r363601. Original commit comment follows: This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363613	2019-06-17 20:32:22 +00:00
Philip Reames	58c75565f3	Reduced test case for pr42279 in advance of the relevant re-commit + fix llvm-svn: 363601	2019-06-17 19:27:45 +00:00
Joseph Tremoulet	daa1ae6142	[EarlyCSE] Fix hashing of self-compares Summary: Update compare normalization in SimpleValue hashing to break ties (when the same value is being compared to itself) by switching to the swapped predicate if it has a lower numerical value. This brings the hashing in line with isEqual, which already recognizes the self-compares with swapped predicates as equal. Fixes PR 42280. Reviewers: spatel, efriedma, nikic, fhahn, uabelho Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63349 llvm-svn: 363598	2019-06-17 19:11:28 +00:00
Matt Arsenault	6d741f29ec	AMDGPU: Fold readlane/readfirstlane calls llvm-svn: 363587	2019-06-17 17:52:35 +00:00
Warren Ristow	6452bdd29b	[LV] Suppress vectorization in some nontemporal cases When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the alignment of the original scalar memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type *DataType, unsigned Alignment) const; bool isLegalNTLoad(Type *DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 llvm-svn: 363581	2019-06-17 17:20:08 +00:00
Matt Arsenault	1df203d78e	InferAddressSpaces: Fix cloning original addrspacecast If an addrspacecast needed to be inserted again, this was creating a clone of the original cast for each user. Just use the original, which also saves losing the value name. llvm-svn: 363562	2019-06-17 14:13:29 +00:00
Matt Arsenault	b10f097833	AMDGPU: Ignore subtarget for InferAddressSpaces Even if the target doesn't have flat instructions, addrspace(0) is still flat. It just happens to not work. llvm-svn: 363561	2019-06-17 14:13:24 +00:00
Matt Arsenault	f3b64d80bc	AMDGPU: Mark exp/exp.compr as inaccessiblememonly Should also be marked writeonly, but I think that would require splitting the version with done set to a separate intrinsic Test change is only from renumbering the attribute group numbers, which for some reason the generated check lines consider. llvm-svn: 363560	2019-06-17 13:52:24 +00:00
Sam Parker	1bd3d00e7e	[CodeGen] Check for HardwareLoop Latch ExitBlock The HardwareLoops pass finds exit blocks with a scevable exit count. If the target specifies to update the loop counter in a register, through a phi, we need to ensure that the exit block is a latch so that we can insert the phi with the correct value for the incoming edge. Differential Revision: https://reviews.llvm.org/D63336 llvm-svn: 363556	2019-06-17 13:39:28 +00:00
Bjorn Pettersson	83773b77a5	[LV] Deny irregular types in interleavedAccessCanBeWidened Summary: Avoid that loop vectorizer creates loads/stores of vectors with "irregular" types when interleaving. An example of an irregular type is x86_fp80 that is 80 bits, but that may have an allocation size that is 96 bits. So an array of x86_fp80 is not bitcast compatible with a vector of the same type. Not sure if interleavedAccessCanBeWidened is the best place for this check, but it solves the problem seen in the added test case. And it is the same kind of check that already exists in memoryInstructionCanBeWidened. Reviewers: fhahn, Ayal, craig.topper Reviewed By: fhahn Subscribers: hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63386 llvm-svn: 363547	2019-06-17 12:02:24 +00:00
Sam Parker	60d6fb2a63	[SCEV] Use NoWrapFlags when expanding a simple mul Second functional change following on from rL362687. Pass the NoWrapFlags from the MulExpr to InsertBinop when we're generating a shl or mul. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363540	2019-06-17 10:05:18 +00:00
Fangrui Song	ac14f7b10c	[lit] Delete empty lines at the end of lit.local.cfg NFC llvm-svn: 363538	2019-06-17 09:51:07 +00:00
Hans Wennborg	a9e5d2f35d	Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" Third time's the charm. This was reverted in r363220 due to being suspected of an internal benchmark regression and a test failure, none of which turned out to be caused by this. llvm-svn: 363529	2019-06-17 07:47:28 +00:00
Yevgeny Rouban	ee62c40eae	[SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata if unreachable switch cases are removed. This patch fixes this bug by making use of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122). A new test is created. Differential Revision: https://reviews.llvm.org/D62186 llvm-svn: 363527	2019-06-17 05:55:12 +00:00
Roman Lebedev	5a663bd77a	[InstSimplify] Fix addo/subo undef folds (PR42209) Fix folds of addo and subo with an undef operand to be: `@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`, as per LLVM undef rules. Same for commuted variants. Based on the original version of the patch by @nikic. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 \| PR42209 ]] Differential Revision: https://reviews.llvm.org/D63065 llvm-svn: 363522	2019-06-16 20:39:45 +00:00
Sanjay Patel	c8d88ad1a9	[CodeGenPrepare][x86] shift both sides of a vector select when profitable This is based on the example/discussion in PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 Proper vector shift instructions don't appear until AVX2, so we may generate several extra instructions within a loop trying to compensate for that. It's difficult to recover from that shift expansion later than this, so use the existing TLI hook and splat analysis to enable better codegen. This extends CGP functionality introduced with: rL201655 Differential Revision: https://reviews.llvm.org/D63233 llvm-svn: 363511	2019-06-16 15:29:03 +00:00
Nikita Popov	9145562b48	[SimplifyIndVar] Simplify non-overflowing saturating add/sub If we can detect that saturating math that depends on an IV cannot overflow, replace it with simple math. This is similar to the CVP optimization from D62703, just based on a different underlying analysis (SCEV vs LVI) that catches different cases. Differential Revision: https://reviews.llvm.org/D62792 llvm-svn: 363489	2019-06-15 08:48:52 +00:00
Huihui Zhang	dc2fd6a14e	[InstCombine] Add tests to show missing fold opportunity for "icmp and shift" (nfc). Summary: For icmp pred (and (sh X, Y), C), 0 When C is signbit, expect to fold (X << Y) & signbit ==/!= 0 into (X << Y) >=/< 0, rather than (X & (signbit >> Y)) != 0. When C+1 is power of 2, expect to fold (X << Y) & ~C ==/!= 0 into (X << Y) </>= C+1, rather than (X & (~C >> Y)) == 0. For icmp pred (and X, (sh signbit, Y)), 0 Expect to fold (X & (signbit l>> Y)) ==/!= 0 into (X << Y) >=/< 0 Expect to fold (X & (signbit << Y)) ==/!= 0 into (X l>> Y) >=/< 0 Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63025 llvm-svn: 363479	2019-06-15 00:33:41 +00:00
Akira Hatanaka	a704a8f28c	[ObjC][ARC] Delete ObjC runtime calls on global variables annotated with 'objc_arc_inert' Those calls are no-ops, so they can be safely deleted. rdar://problem/49839633 Differential Revision: https://reviews.llvm.org/D62433 llvm-svn: 363468	2019-06-14 22:06:32 +00:00
Matt Arsenault	282dac717e	SROA: Allow eliminating addrspacecasted allocas There is a circular dependency between SROA and InferAddressSpaces today that requires running both multiple times in order to be able to eliminate all simple allocas and addrspacecasts. InferAddressSpaces can't remove addrspacecasts when written to memory, and SROA helps move pointers out of memory. This should avoid inserting new commuting addrspacecasts with GEPs, since there are unresolved questions about pointer wrapping between different address spaces. For now, don't replace volatile operations that don't match the alloca addrspace, as it would change the address space of the access. It may be still OK to insert an addrspacecast from the new alloca, but be more conservative for now. llvm-svn: 363462	2019-06-14 21:38:31 +00:00
Matt Arsenault	e6efb6433f	SROA: Add baseline test for addrspacecast changes llvm-svn: 363460	2019-06-14 21:22:26 +00:00
Florian Hahn	dcdd12b68c	Revert Fix a bug w/inbounds invalidation in LFTR Reverting because it breaks a green dragon build: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208 This reverts r363289 (git commit `eb88badff9`) llvm-svn: 363427	2019-06-14 17:23:09 +00:00
Sanjay Patel	7ea378b940	[CodeGenPrepare] propagate debuginfo when copying a shuffle llvm-svn: 363409	2019-06-14 15:05:35 +00:00
Matt Arsenault	492d71cc99	AMDGPU: Fold readlane intrinsics of constants I'm not 100% sure about this, since I'm worried about IR transforms that might end up introducing divergence downstream once replaced with a constant, but I haven't come up with an example yet. llvm-svn: 363406	2019-06-14 14:51:26 +00:00
Sam Parker	0cf9639a9c	[SCEV] Pass NoWrapFlags when expanding an AddExpr InsertBinop now accepts NoWrapFlags, so pass them through when expanding a simple add expression. This is the first re-commit of the functional changes from rL362687, which was previously reverted. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 363364	2019-06-14 09:19:41 +00:00
Stanislav Mekhanoshin	68a2fef9ae	[AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32 Differential Revision: https://reviews.llvm.org/D63301 llvm-svn: 363339	2019-06-13 23:47:36 +00:00
Shawn Landden	24f4085811	[SimplifyCFG] NFC, update Switch tests as a baseline. Also add baseline tests to show effect of later patches. There were a couple of regressions here that were never caught, but my patch set that this is a preparation to will fix them. This is the third attempt to land this patch. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363319	2019-06-13 19:36:38 +00:00
Sanjay Patel	5bf7f81aa8	[InstCombine] add test for failed libfunction prototype matching; NFC llvm-svn: 363291	2019-06-13 18:26:10 +00:00
Philip Reames	eb88badff9	Fix a bug w/inbounds invalidation in LFTR This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV. The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program. As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well. (Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.) Differential Revision: https://reviews.llvm.org/D62939 llvm-svn: 363289	2019-06-13 18:23:13 +00:00
Sanjay Patel	4d93fb528e	[InstCombine] auto-generate complete test checks; NFC llvm-svn: 363286	2019-06-13 18:14:49 +00:00
David Bolvansky	a9d8388e80	[NFC] Updated testcase for D54411/rL363284 llvm-svn: 363285	2019-06-13 18:13:03 +00:00
Joseph Tremoulet	3bc6e2a7aa	[EarlyCSE] Ensure equal keys have the same hash value Summary: The logic in EarlyCSE that looks through 'not' operations in the predicate recognizes e.g. that `select (not (cmp sgt X, Y)), X, Y` is equivalent to `select (cmp sgt X, Y), Y, X`. Without this change, however, only the latter is recognized as a form of `smin X, Y`, so the two expressions receive different hash codes. This leads to missed optimization opportunities when the quadratic probing for the two hashes doesn't happen to collide, and assertion failures when probing doesn't collide on insertion but does collide on a subsequent table grow operation. This change inverts the order of some of the pattern matching, checking first for the optional `not` and then for the min/max/abs patterns, so that e.g. both expressions above are recognized as a form of `smin X, Y`. It also adds an assertion to isEqual verifying that it implies equal hash codes; this fires when there's a collision during insertion, not just grow, and so will make it easier to notice if these functions fall out of sync again. A new flag --earlycse-debug-hash is added which can be used when changing the hash function; it forces hash collisions so that any pair of values inserted which compare as equal but hash differently will be caught by the isEqual assertion. Reviewers: spatel, nikic Reviewed By: spatel, nikic Subscribers: lebedev.ri, arsenm, craig.topper, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62644 llvm-svn: 363274	2019-06-13 15:24:11 +00:00
Sander de Smalen	51c2fa0e2a	Improve reduction intrinsics by overloading result value. This patch uses the mechanism from D62995 to strengthen the definitions of the reduction intrinsics by letting the scalar result/accumulator type be overloaded from the vector element type. For example: ; The LLVM LangRef specifies that the scalar result must equal the ; vector element type, but this is not checked/enforced by LLVM. declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a) This patch changes that into: declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a) Which has the type-constraint more explicit and causes LLVM to check the result type with the vector element type. Reviewers: RKSimon, arsenm, rnk, greened, aemerson Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D62996 llvm-svn: 363240	2019-06-13 09:37:38 +00:00
Sam Parker	9d28473a35	[ARM][TTI] Scan for existing loop intrinsics TTI should report that it's not profitable to generate a hardware loop if it, or one of its child loops, has already been converted. Differential Revision: https://reviews.llvm.org/D63212 llvm-svn: 363234	2019-06-13 08:28:46 +00:00
Shawn Landden	8b142bcc3f	[SimplifyCFG] reverting preliminary Switch patches again This reverts 363226 and 363227, both NFC intended I swear I fixed the test case that is failing, and ran the tests, but I will look into it again. llvm-svn: 363229	2019-06-13 05:26:17 +00:00
Shawn Landden	c54b2011bd	[SimplifyCFG] NFC, update Switch tests to better examine successive patches Also add baseline tests to show effect of later patches. There were a couple of regressions here that were never caught, but my patch set that this is a preparation to will fix them. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363226	2019-06-13 04:51:35 +00:00
Shawn Landden	c6cba2957d	[SimplifyCFG] revert the last commit. I ran ALL the test suite locally, so I will look into this... llvm-svn: 363223	2019-06-13 02:47:47 +00:00
Shawn Landden	f93b99b2b6	[SimplifyCFG] NFC, update Switch tests to HEAD so I can see if my changes change anything Also add baseline tests to show effect of later patches. Differential Revision: https://reviews.llvm.org/D61150 llvm-svn: 363222	2019-06-13 02:24:24 +00:00
David L. Jones	c73fadaa84	Revert r361811: 'Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors ...' We have observed some failures with internal builds with this revision. - Performance regressions: - llvm's SingleSource/Misc evalloop shows performance regressions (although these may be red herrings). - Benchmarks for Abseil's SwissTable. - Correctness: - Failures for particular libicu tests when building the Google AppEngine SDK (for PHP). hwennborg has already been notified, and is aware of reproducer failures. llvm-svn: 363220	2019-06-13 02:04:45 +00:00
Dinar Temirbulatov	b2f45ba1e8	[SLP] Update propagate_ir_flags.ll test to check that we do retain the common subset, NFC. llvm-svn: 363218	2019-06-13 00:19:50 +00:00
Philip Reames	0bded8442f	[Tests] Highlight impact of multiple exit LFTR (D62625) as requested by reviewer llvm-svn: 363217	2019-06-12 23:39:49 +00:00
Sanjay Patel	a1421e8347	[x86] add tests for vector shifts; NFC llvm-svn: 363203	2019-06-12 21:30:06 +00:00
Philip Reames	00e481b75d	[Tests] Autogen RLEV test and add tests for a future enhancement llvm-svn: 363193	2019-06-12 19:23:10 +00:00
Philip Reames	851adc000c	[Tests] Add tests to highlight sibling loop optimization order issue for exit rewriting The issue addressed in r363180 is more broadly relevant. For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEVExpander needing to preserve LCSSA, and b) don't iterate between loops. llvm-svn: 363192	2019-06-12 19:04:51 +00:00
Philip Reames	e51c3d8b82	[SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673 SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA. It's not entirely clear this restriction is necessary, but for the moment it exists. For this reason, we don't analyze single-entry phi inputs. However it is possible that when this input leaves the loop through LCSSA Phi, it is a provable constant. Missing that results in an order of optimization issue in loop exit value rewriting where we miss some opportunities based on order in which we visit sibling loops. This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values. We should probably also add similar logic directly in the SCEV construction path itself. Patch by: mkazantsev (with revised commit message by me) Differential Revision: https://reviews.llvm.org/D58113 llvm-svn: 363180	2019-06-12 17:21:47 +00:00
Sanjay Patel	64006896ac	[InstCombine] add tests for fmin/fmax libcalls; NFC llvm-svn: 363175	2019-06-12 15:29:40 +00:00
Sam Parker	3d42959dd8	Revert rL363156. The patch was to fix buildbots, but rL363157 should now be fixing it in a cleaner way. llvm-svn: 363174	2019-06-12 15:28:00 +00:00
Matt Arsenault	f29366b1f5	StackProtector: Use PointerMayBeCaptured This was using its own, outdated list of possible captures. This was at minimum not catching cmpxchg and addrspacecast captures. One change is now any volatile access is treated as capturing. The test coverage for this pass is quite inadequate, but this required removing volatile in the lifetime capture test. Also fixes some infrastructure issues to allow running just the IR pass. Fixes bug 42238. llvm-svn: 363169	2019-06-12 14:23:33 +00:00
Matt Arsenault	aa6bdf9dcd	LoopVersioning: Respect convergent This changes the standalone pass only. Arguably the utility class itself should assert there are no convergent calls. However, a target pass with additional context may still be able to version a loop if all of the dynamic conditions are sufficiently uniform. llvm-svn: 363165	2019-06-12 14:05:58 +00:00
Sanjay Patel	082a41994a	[InstCombine] add tests for fcmp+select with FMF (minnum/maxnum); NFC llvm-svn: 363163	2019-06-12 13:51:33 +00:00
Matt Arsenault	86325be3d7	LoopLoadElim: Respect convergent llvm-svn: 363162	2019-06-12 13:50:47 +00:00
Matt Arsenault	2466ba97bc	LoopDistribute/LAA: Respect convergent This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks. Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed. https://reviews.llvm.org/D62607 llvm-svn: 363160	2019-06-12 13:34:19 +00:00
Matt Arsenault	1e21181aee	LoopDistribute/LAA: Add tests to catch regressions I broke 2 of these with a patch, but they were not covered by existing tests. https://reviews.llvm.org/D63035 llvm-svn: 363158	2019-06-12 13:15:59 +00:00
Sam Parker	52d7326f32	[NFC] Add HardwareLoops lit.local.cfg file Set Transforms/HardwareLoops/ARM/ tests as unsupported if there isn't an arm target. llvm-svn: 363157	2019-06-12 12:54:19 +00:00
Sam Parker	ece316b56a	Attempt to fix non-Arm buildbots Adding REQUIRES: arm to failing tests llvm-svn: 363156	2019-06-12 12:47:35 +00:00
Sam Parker	757ac02dc8	[ARM] Implement TTI::isHardwareLoopProfitable Implement the backend target hook to drive the HardwareLoops pass. The low-overhead branch extension for Arm M-class cores is flexible enough that we don't have to ensure correctness at this point, except checking that the loop counter variable can be stored in LR - a 32-bit register. For it to be profitable, we want to avoid loops that contain function calls, or any other instruction that alters the PC. This implementation uses TargetLoweringInfo, to query type and operation actions, looks at intrinsic calls and also performs some manual checks for remainder/division and FP operations. I think this should be a good base to start and extra details can be filled out later. Differential Revision: https://reviews.llvm.org/D62907 llvm-svn: 363149	2019-06-12 12:00:42 +00:00
Orlando Cazalet-Hyams	a947156396	Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion" This reverts commit `1a0f7a2077`. See phabricator thread for D60831. llvm-svn: 363132	2019-06-12 08:34:51 +00:00
Philip Reames	082cd30327	Generalize icmp matching in IndVars' eliminateTrunc We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases. Differential Revision: https://reviews.llvm.org/D63112 llvm-svn: 363108	2019-06-11 22:43:25 +00:00
Cameron McInally	08200d6d26	[InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ Differential Revision: https://reviews.llvm.org/D62612 llvm-svn: 363082	2019-06-11 16:21:21 +00:00
Cameron McInally	796de11331	[InstCombine] Update (fptrunc (fneg x)) -> (fneg (fptrunc x)) for unary FNeg Differential Revision: https://reviews.llvm.org/D62629 llvm-svn: 363080	2019-06-11 15:45:41 +00:00
Orlando Cazalet-Hyams	1a0f7a2077	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. I have set up a separate review D61933 for a fix which is required for this patch. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse Reviewed By: hfinkel, jmorse Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 363046	2019-06-11 10:37:20 +00:00
Matt Arsenault	c5830f5f05	AtomicExpand: Don't crash on non-0 alloca This now produces garbage on AMDGPU with a call to a nonexistent, anonymous libcall but won't assert. llvm-svn: 363022	2019-06-11 01:35:07 +00:00
Matt Arsenault	383e72fcfe	AMDGPU: Expand < 32-bit atomics Also fix AtomicExpand asserting on atomicrmw fadd/fsub. llvm-svn: 363021	2019-06-11 01:35:00 +00:00
Philip Reames	efb14f9005	[Tests] Adjust LFTR dead-iv tests to bypass undef cases As pointed out by Nikita in review, undef and poison need to be handled separately. Since we're no longer expecting any test improvements - just fixes for miscompiles - update the tests to bypass the existing undef check. llvm-svn: 363002	2019-06-10 23:17:10 +00:00
Rong Xu	e44fa83c37	[PGO] Handle cases of non-instrument BBs As shown in PR41279, some basic blocks (such as catchswitch) cannot be instrumented. This patch filters out these BBs in PGO instrumentation. It also sets the profile count to the fail-to-instrument edge, so that we can propagate the counts in the CFG. Differential Revision: https://reviews.llvm.org/D62700 llvm-svn: 362995	2019-06-10 22:36:27 +00:00
Philip Reames	1d322ccaac	[Tests] Split an LFTR dead-iv case There are two interesting sub-cases here. 1) Switching IVs is legal, but only in pre-increment form. and 2) Switching IVs is legal, and so is post-increment form. llvm-svn: 362993	2019-06-10 22:33:20 +00:00
Philip Reames	78c0d75697	[Tests] Add tests for D62939 (miscompiles around dead pointer IVs) Flesh out a collection of tests for switching to a dead IV within LFTR, both for the current miscompile, and for some cases which we should be able to handle via simple reasoning. llvm-svn: 362976	2019-06-10 19:45:59 +00:00
Philip Reames	a9633d5f0b	[LFTR] Use recomputed BE count This was discussed as part of D62880. The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivalent codegen, that seems to be a reasonable assertion. This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable. For the nestedIV example from elim-extend.ll, we end up with the following BE counts: BEFORE: (-2 + (-1 * %innercount) + %limit) AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>) Note that before is an i32 type, and the after is an i64. Truncating the i64 produces the i32. llvm-svn: 362975	2019-06-10 19:18:53 +00:00
Sanjay Patel	9650c95b7e	[InstCombine] allow unordered preds when canonicalizing to fabs() We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling. Similar to: rL362937 llvm-svn: 362954	2019-06-10 15:39:00 +00:00
Sanjay Patel	07bba68889	[InstCombine] add tests for fabs() with unordered preds; NFC llvm-svn: 362949	2019-06-10 15:08:22 +00:00
Sanjay Patel	85de9634e6	[InstCombine] fix bug in canonicalization to fabs() Forgot to translate the predicate clauses in rL362943. llvm-svn: 362945	2019-06-10 14:57:45 +00:00
Sanjay Patel	8b6d9f60ed	[InstCombine] change canonicalization to fabs() to use FMF on fsub Similar to rL362909: This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators. I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fsub because they have the same operand. llvm-svn: 362943	2019-06-10 14:46:36 +00:00
Sanjay Patel	8cd8c5784b	[InstCombine] allow unordered preds when canonicalizing to fabs() PR42179: https://bugs.llvm.org/show_bug.cgi?id=42179 llvm-svn: 362937	2019-06-10 14:14:51 +00:00
Sanjay Patel	4cdd3ceb57	[InstCombine] add tests for fcmp unordered pred -> fabs (PR42179); NFC llvm-svn: 362936	2019-06-10 14:04:10 +00:00

13087 Commits