llvm-project

Commit Graph

Author	SHA1	Message	Date
Sander de Smalen	4ca860742d	[InstructionCost] Don't conflate Invalid costs with Unknown costs. We previously made a change to getUserCost to return a Invalid cost when one of the TTI costs returned '-1' (meaning 'unknown' or 'infinitely expensive'). It makes no sense to say that: shufflevector <2 x i8> %x, <2 x i8> %y, <4 x i32> <i32 0, i32 1, i32 2, i32 3> has an invalid cost. Perhaps the cost is not known, but the IR is valid and can be code-generated. Invalid should only be used for IR that cannot possibly be code-generated and where a cost is nonsensical. With more passes now asserting that the cost must be valid, it is possible that those assertions will fail for perfectly valid IR. An incomplete cost-model probably shouldn't be a reason for the compiler to break. It's better to consider these costs as 'very expensive' and ignore them for other reasons. At some point, we should consider replacing -1 with some other mechanism. Reviewed By: paulwalker-arm, dmgreen Differential Revision: https://reviews.llvm.org/D99502	2021-03-30 09:29:42 +01:00
Nashe Mncube	19601a4c6c	[SVE][Analysis]Instruction costs for ops on scalable-vec The following operations have no associated cost for them when applied to scalable vectors, and as a consequence can trigger a crash when a call is made to AArch64TTIImpl::getCastInstrCost(): - fptrunc - trunc - fpext - fpto(u,s)i This patch adds costs for these operations and relevant regression tests. Differential Revision: https://reviews.llvm.org/D98934	2021-03-29 11:15:50 +01:00
Nikita Popov	ce066da81c	[BasicAA] Make sure types match in constant offset heuristic This can only happen if offset types that are larger than the pointer size are involved. The previous implementation did not assert in this case because it initialized the APInts to the width of one of the variables -- though I strongly suspect it did not compute correct results in this case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32621 reported by fhahn.	2021-03-28 21:38:09 +02:00
Nikita Popov	9075864b73	[BasicAA] Refactor linear expression decomposition The current linear expression decomposition handles zext/sext by decomposing the casted operand, and then checking NUW/NSW flags to determine whether the extension can be distributed. This has some disadvantages: First, it is not possible to perform a partial decomposition. If we have zext((x + C1) +<nuw> C2) then we will fail to decompose the expression entirely, even though it would be safe and profitable to decompose it to zext(x + C1) +<nuw> zext(C2) Second, we may end up performing unnecessary decompositions, which will later be discarded because they lack nowrap flags necessary for extensions. Third, correctness of the code is not entirely obvious: At a high level, we encounter zext(x -<nuw> C) in the form of a zext on the linear expression x + (-C) with nuw flag set. Notably, this case must be treated as zext(x) + -zext(C) rather than zext(x) + zext(-C). The code handles this correctly by speculatively zexting constants to the final bitwidth, and performing additional fixup if the actual extension turns out to be an sext. This was not immediately obvious to me. This patch inverts the approach: An ExtendedValue represents a zext(sext(V)), and linear expression decomposition will try to decompose V further, either by absorbing another sext/zext into the ExtendedValue, or by distributing zext(sext(x op C)) over a binary operator with appropriate nsw/nuw flags. At each step we can determine whether distribution is legal and abort with a partial decomposition if not. We also know which extensions we need to apply to constants, and don't need to speculate or fixup.	2021-03-27 23:31:58 +01:00
Nikita Popov	b981bc30bf	[BasicAA] Correct handle implicit sext in decomposition While explicit sext instructions were handled correctly, the implicit sext that occurs if the offset is smaller than the pointer size blindly assumed that sext(X * Scale + Offset) is the same as sext(X) * Scale + Offset, which is obviously not correct. Fix this by extracting the code that handles linear expression extension and reusing it for the implicit sext as well.	2021-03-27 15:15:47 +01:00
Nikita Popov	60f3e8fbe4	[BasicAA] Clarify entry values of GetLinearExpression() (NFC) A number of variables need to be correctly initialized on entry to GetLinearExpression() for the implementation to behave reasonably. The fact that SExtBits can currenlty be non-zero on entry is a bug, as demonstrated by the added test: For implicit sexts by the GEP, we do currently skip legality checks.	2021-03-27 14:50:09 +01:00
Nikita Popov	5a5a8088cc	[BasicAA] Retain shl nowrap flags in GetLinearExpression() Nowrap flags between mul and shl differ in that mul nsw allows multiplication of 1 * INT_MIN, while shl nsw does not. This means that it is always fine to transfer shl nowrap flags to muls, but not necessarily the other way around. In this case the NUW/NSW results refer to mul/add operations, so it's fine to retain the flags from the shl.	2021-03-27 12:26:22 +01:00
Nikita Popov	fd7df0cf38	[ValueTracking] Handle shl pair in isKnownNonEqual() Handle (x << s) != (y << s) where x != y and the shifts are non-wrapping. Once again, this establishes parity with the corresponing mul fold that already exists. The shift case is more powerful because we don't need to guard against multiplies by zero.	2021-03-26 20:21:05 +01:00
Nikita Popov	9666e89d57	[ValueTracking] Handle shl in isKnownNonEqual() This handles the pattern X != X << C for non-zero X and C and a non-overflowing shift. This establishes parity with the corresponing fold for multiplies.	2021-03-26 20:21:05 +01:00
Nikita Popov	5c85c37c87	[ValueTracking] Add tests for non equal shifts (NFC)	2021-03-26 20:21:05 +01:00
Nikita Popov	caf92a8a92	[ValueTracking] Handle non-zero shl recurrence In this case we don't care about the step at all, and only require that the starting value is non-zero.	2021-03-26 18:39:06 +01:00
Nikita Popov	41234329b4	[ValueTracking] Add tests for non-zero shl recurrences (NFC)	2021-03-26 18:35:38 +01:00
Nikita Popov	938d05b814	[ValueTracking] Handle non-zero add/mul recurrences more precisely This is mainly for clarity: It doesn't make sense to do any negative/positive checks when dealing with a nuw add/mul. These only make sense to nsw add/mul.	2021-03-26 18:30:07 +01:00
Nikita Popov	eac2c94bc2	[ValueTracking] Add more non-zero add/mul recurrence tests (NFC)	2021-03-26 18:30:07 +01:00
Florian Hahn	6fc29e30dc	[BasicAA] Add a few more interesting modulo tests.	2021-03-26 16:56:49 +00:00
Florian Hahn	bcc8d80192	[BasicAA] Add a few cases with overflows in index computations. This patch adds a few test cases where currently NoAlias is returned, but the pointers can alias if the multiply overflows while computing a GEP index value.	2021-03-26 14:50:03 +00:00
Jingu Kang	3fd64cc7a3	[ValueTracking] Handle two PHIs in isKnownNonEqual() loop: %cmp.0 = phi i32 [ 3, %entry ], [ %inc, %loop ] %pos.0 = phi i32 [ 1, %entry ], [ %cmp.0, %loop ] ... %inc = add i32 %cmp.0, 1 br label %loop On above example, %pos.0 uses previous iteration's %cmp.0 with backedge according to PHI's instruction's defintion. If the %inc is not same among iterations, we can say the two PHIs are not same. Differential Revision: https://reviews.llvm.org/D98422	2021-03-25 22:56:05 +00:00
Philip Reames	e7ebb87222	[deref] Handle byval/byref/sret/inalloc/preallocated arguments for deref-at-point semantics All of these are scoped allocations which remain dereferenceable during the lifetime of the callee. Differential Revision: https://reviews.llvm.org/D99310	2021-03-25 14:47:31 -07:00
Craig Topper	5797feaa55	[RISCV] Reorder checks in RISCVTTIImpl::getGatherScatterOpCost to avoid calling getMinRVVVectorSizeInBits() when V extension is not enabled. getMinRVVVectorSizeInBits() asserts if the V extension isn't enabled. So check that gather/scatter is legal first since it already contains a check for V extension being enabled. It also already checks getMinRVVVectorSizeInBits for fixed length vectors so we don't need a check in getGatherScatterOpCost.	2021-03-25 14:20:47 -07:00
Philip Reames	4054b8322f	[deref] Implement initial set of inference rules for deref-at-point This implements a subset of the initial set of inference rules proposed in the llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree". The nolias one got moved to a separate review as there was some concerns raised which require further discussion. Differential Revision: https://reviews.llvm.org/D99135	2021-03-24 16:20:41 -07:00
Nikita Popov	a7efed5a20	[SCEV] Improve handling of not expressions in isImpliedCond() SCEV currently tries to prove implications of x pred y by also trying to imply ~y pred ~x. This is expensive in terms of compile-time (in fact, the majority of isImpliedCond compile-time is spent here) and generally not fruitful. The issue is that this also swaps the operands and thus breaks canonical ordering. If originally we were trying to prove an implication like X > C1 -> Y > C2, then we'll now try to prove X > C1 -> C3 > ~Y, which will not work. The only real case where we can get some use out of this transform is if the original conditions were in the form X > C1 -> Y < C2, were then swapped to X > C1 -> C2 > Y and are then swapped again here to X > C1 -> ~Y > C3. As such, handle this at a higher level, where we are doing the swapping in the first place. There's four different ways that we can line up a predicate and a swapped predicate, so we use some heuristics to pick some profitable way. Because we now try this transform at a higher level (isImpliedCondOperands rather than isImpliedCondOperandsHelper), we can also prove additional facts. Of the added tests, one was proven previously while the other wasn't. Differential Revision: https://reviews.llvm.org/D90926	2021-03-24 21:53:02 +01:00
Craig Topper	512bae81cc	[RISCV] Add basic cost modelling for fixed vector gather/scatter. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D99142	2021-03-24 11:14:14 -07:00
Jingu Kang	2e2740b859	[ValueTracking] Handle increasing mul recurrence in isKnownNonZero() Differential Revision: https://reviews.llvm.org/D99069	2021-03-23 23:04:41 +00:00
Nikita Popov	931b6066ac	[BasicAA] Handle assumes with operand bundles This fixes a regression reported on D99022: If a call has operand bundles, then the inaccessiblememonly attribute on the function will be ignored, as operand bundles can affect modref behavior in the general case. However, for assume operand bundles in particular this is not the case. Adjust getModRefBehavior() to always report inaccessiblememonly for assumes, regardless of presence of operand bundles.	2021-03-23 21:21:19 +01:00
Nikita Popov	b1389f6683	[BasicAA] Add test for assume with operand bundles (NFC)	2021-03-23 21:21:19 +01:00
David Sherwood	748ae5281d	[IR][SVE] Add new llvm.experimental.stepvector intrinsic This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width. For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc. that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too. I've added cost model tests for both fixed width and scalable vectors: llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll as well as codegen lowering tests for fixed width and scalable vectors: llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html	2021-03-23 10:43:35 +00:00
Philip Reames	69fae504bb	[test] precommit another test for point-in-time deref semantics	2021-03-22 19:11:19 -07:00
Philip Reames	a28fee9cb2	[tests] Expand tests for point-in-time dereferenceability	2021-03-22 18:56:57 -07:00
Philip Reames	b37d0a40a2	[deref] Split a test to show both global and pointwise semantics While doing so, also split one monster test into individually named test functions.	2021-03-22 18:34:40 -07:00
Nikita Popov	b7aae9fab1	[ValueTracking] Regenerate test checks (NFC)	2021-03-22 22:26:00 +01:00
Philip Reames	f24175fcb9	Autogen some tests for ease of update	2021-03-22 11:06:29 -07:00
Philip Reames	d4648eeaa2	[SCEV] Use trip count information to improve shift recurrence ranges This patch exploits the knowledge that we may be running many fewer than bitwidth iterations of the loop, and may be able to disallow the overflow case. This patch specifically implements only the shl case, but this can be generalized to ashr and lshr without difficulty. Differential Revision: https://reviews.llvm.org/D98222	2021-03-22 09:38:43 -07:00
Nikita Popov	d11d5d1c5f	[ValueTracking] Improve mul handling in isKnownNonEqual() X != X * C is true if: * C is not 0 or 1 * X is not 0 * mul is nsw or nuw Proof: https://alive2.llvm.org/ce/z/uwF29z This is motivated by one of the cases in D98422.	2021-03-21 18:41:35 +01:00
Nikita Popov	f5bbdf2a67	[ValueTracking] Add more tests for isKnownNonEqual() of mul (NFC) This is for the case of (x * C) == x, rather than the (x * C1) == (x * C2) variant that we already cover.	2021-03-21 18:41:35 +01:00
David Green	a2e0312cda	[ARM] Tone down the MVE scalarization overhead The scalarization overhead was set deliberately high for MVE, whilst the codegen was new. It helps protect us against the negative ramifications of mixing scalar and vector instructions. This decreases that, especially for floating point where the cost of extracting/inserting lane elements can be low. For integer the cost is still fairly high due to the cross-register-bank copy, but is no longer n^2 in the length of the vector. In general, this will decrease the cost of scalarizing floats and long integer vectors. i64 increase in cost, having a high cost before and after this patch. For floats this allows up to start doing things like vectorizing fdiv instructions, even if they are scalarized. Differential Revision: https://reviews.llvm.org/D98245	2021-03-19 18:30:11 +00:00
Alexey Bataev	14ae0cf0f5	[Cost]Canonicalize the cost for logical or/and reductions. The generic cost of logical or/and reductions should be cost of bitcast <ReduxWidth x i1> to iReduxWidth + cmp eq\|ne iReduxWidth. Differential Revision: https://reviews.llvm.org/D97961	2021-03-19 11:01:58 -07:00
Max Kazantsev	fff1363ba0	[SCEV] Add false->any implication By definition of Implication operator, `false -> true` and `false -> false`. It means that `false` implies any predicate, no matter true or false. We don't need to go any further trying to prove the statement we need and just always say that `false` implies it in this case. In practice it means that we are trying to prove something guarded by `false` condition, which means that this code is unreachable, and we can safely prove any fact or perform any transform in this code. Differential Revision: https://reviews.llvm.org/D98706 Reviewed By: lebedev.ri	2021-03-19 11:29:48 +07:00
David Green	35e0567d58	[ARM] Add VREV MVE shuffle costs This uses the shuffle mask cost from D98206 to give a better cost of MVE VREV instructions. This helps especially in VectorCombine where the cost of shuffles is used to reorder bitcasts, which this helps keep the phase ordering test for fp16 reductions producing optimal code. The isVREVMask has been moved to a header file to allow it to be used across target transform and isel lowering. Differential Revision: https://reviews.llvm.org/D98210	2021-03-17 21:21:43 +00:00
Max Kazantsev	a6074b092c	[BasicAA] Drop dependency on Loop Info. PR43276 BasicAA stores a reference to LoopInfo inside. This imposes an implicit requirement of keeping it up to date whenever we modify the IR (in particular, whenever we modify terminators of blocks that belong to loops). Failing to do so leads to incorrect state of the LoopInfo. Because general AA does not require loop info updates and provides to API to update it properly, the users of AA reasonably assume that there is no need to update the loop info. It may be a reason of bugs, as example in PR43276 shows. This patch drops dependence of BasicAA on LoopInfo to avoid this problem. This may potentially pessimize the result of queries to BasicAA. Differential Revision: https://reviews.llvm.org/D98627 Reviewed By: nikic	2021-03-17 11:43:44 +07:00
Roman Lebedev	78b8ce40ef	Reland [SCEV] Improve modelling for (null) pointer constants This reverts commit `329aeb5db4`, and relands commit `61f006ac65`. This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr`operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and add constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-13 16:05:34 +03:00
Roman Lebedev	329aeb5db4	Temporairly evert "[SCEV] Improve modelling for (null) pointer constants" This appears to have broken ubsan bot: https://lab.llvm.org/buildbot/#/builders/85/builds/3062 https://reviews.llvm.org/D98147#2623549 It looks like LSR needs some kind of a change around insertion point handling. Reverting until i have a fix. This reverts commit `61f006ac65`.	2021-03-13 09:10:28 +03:00
Philip Reames	4b8eb894bf	[tests] Cover a case brought up in review of D98222	2021-03-12 13:11:25 -08:00
Roman Lebedev	61f006ac65	[SCEV] Improve modelling for (null) pointer constants This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr`operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and add constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-12 22:11:58 +03:00
Philip Reames	387228059e	[test] precommit tests from D98222	2021-03-09 12:39:47 -08:00
Roman Lebedev	b46c085d2b	[NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions These intrinsics, not the icmp+select are the canonical form nowadays, so we might as well directly emit them. This should not cause any regressions, but if it does, then then they would needed to be fixed regardless. Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`, but that is a pessimization, not a correctness issue. Additionally, the non-intrinsic form has issues with undef, see https://reviews.llvm.org/D88287#2587863	2021-03-06 21:52:46 +03:00
Philip Reames	83ae49671d	[basicaa] Recurse through a single phi input BasicAA knows how to analyze phis, but to control compile time, we're fairly limited in doing so. This patch loosens that restriction just slightly when there is exactly one phi input (after discounting induction variable increments). The result of this is that we can handle more cases around nested and sibling loops with pointer induction variables. A few points to note. * This is deliberately extremely restrictive about recursing through at most one input of the phi. There's a known general problem with BasicAA sometimes hitting exponential compile time already, and this patch makes every effort not to compound the problem. Once the root issue is fixed, we can probably loosen the restrictions here a bit. * As seen in the test file, we're still missing cases which aren't directly based on phis (e.g. using the indvar increment). I believe this to be a separate problem and am going to explore this in another patch once this one lands. * As seen in the test file, this results in the unfortunate fact that using phivalues sometimes results in worse quality results. I believe this comes down to an oversight in how recursive phi detection was implemented for phivalues. I'm happy to tackle this in a follow up change. Differential Revision: https://reviews.llvm.org/D97401	2021-03-04 13:07:06 -08:00
Caroline Concatto	f2b749be15	[CostModel][SVE] Add cost model for shuffle reverse with i1 and scalable vector This patch adds the cost model for experimental.vector.reverse with scalable vector types: nxv16i1, nxv8i1, nxv4i1 and nxv2i1. These types are missing from the previous cost model patch D95603. The cost model for experimental.vector.reverse with 1 bit mask is used by loop vectorization in the patch D95363 Differential Revision: https://reviews.llvm.org/D97758	2021-03-04 18:52:59 +00:00
Alexey Bataev	60470ac7ff	[Cost]Add tests for boolean and/or reductions, NFC. Tests with the default costs for boolean and/or reductions. Differential Revision: https://reviews.llvm.org/D97793	2021-03-03 12:34:30 -08:00
Philip Reames	ea7d208b78	[basicaa] Rewrite isGEPBaseAtNegativeOffset in terms of index difference [mostly NFC] This is almost purely NFC, it just fits more obviously in the flow of the code now that we've standardized on the index different approach. The non-NFC bit is that because of canceling the VariableOffsets in the subtract, we can now handle the case where both sides involve a common variable offset. This isn't an "interesting" improvement; it just happens to fall out of the natural code structure. One subtle point - the placement of this above the BaseAlias check is important in the original code as this can return NoAlias even when we can't find a relation between the bases otherwise. Also added some enhancement TODOs noticed while understanding the existing code. Note: This is slightly different than the LGTMed version. I fixed the "inbounds" issue Nikita noticed with the original code in `e6e5ef4` and rebased this to include the same fix. Differential Revision: https://reviews.llvm.org/D97520	2021-03-03 09:03:28 -08:00
Philip Reames	e6e5ef40cb	[basicaa] Fix a latent bug in isGEPBaseAtNegativeOffset This was pointed out in review of D97520 by Nikita, but existed in the original code as well. The basic issue is that a decomposed GEP expression describes (potentially) more than one getelementptr. The "inbounds" derived UB which justifies this aliasing rule requires that the entire offset be composed of "inbounds" geps. Otherwise, as can be seen in the recently added and changes in this patch test, we can end up with a large commulative offset with only a small sub-offset actually being "inbounds". If that small sub-offset lies within the object, the result was unsound. We could potentially be fancier here, but for the moment, simply be conservative when any of the GEPs parsed aren't inbounds.	2021-03-03 08:43:32 -08:00

1 2 3 4 5 ...

2612 Commits