llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	156f10c840	[IR] `SCEVExpander::generateOverflowCheck()`: short-circuit `umul_with_overflow`-by-one It's a no-op, no overflow happens ever: https://alive2.llvm.org/ce/z/Zw89rZ While generally i don't like such hacks, we have a very good reason to do this: here we are expanding a run-time correctness check for the vectorization, and said `umul_with_overflow` will not be optimized out before we query the cost of the checks we've generated. Which means, the cost of run-time checks would be artificially inflated, and after https://reviews.llvm.org/D109368 that will affect the minimal trip count for which these checks are even evaluated. And if they aren't even evaluated, then the vectorized code certainly won't be run. We could consider doing this in IRBuilder, but then we'd need to also teach `CreateExtractValue()` to look into chain of `insertvalue`'s, and i'm not sure there's precedent for that. Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 19:45:55 +03:00
Alexey Bataev	64d1617d18	[SLP]Improve/fix reordering of the gathered graph nodes. Gathered loads/extractelements/extractvalue instructions should be checked if they can represent a vector reordering node too and their order should be taken into account for better graph reordering analysis. Also, if the gather node has reused scalars, they must be reordered instead of the scalars themselves. Differential Revision: https://reviews.llvm.org/D112454	2021-10-27 08:49:13 -07:00
Roman Lebedev	f3190dedee	[IR] `IRBuilderBase::CreateAnd()`: short-circuit `x & 0` --> `0` https://alive2.llvm.org/ce/z/YzPhSb Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 18:01:06 +03:00
Roman Lebedev	749581d21f	[IR] `IRBuilderBase::CreateAnd()`: fix short-circuiting for constant on LHS Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 18:01:06 +03:00
Roman Lebedev	f3df87d57e	[IR] `IRBuilderBase::CreateOr()`: fix short-circuiting for constant on LHS There is no guarantee that the constant is on RHS here, we have to handle both cases. Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 18:01:06 +03:00
Roman Lebedev	ab1dbcecd6	[IR] `IRBuilderBase::CreateSelect()`: if cond is a constant i1, short-circuit While we could emit such a tautological `select`, it will stick around until the next instsimplify invocation, which may happen after we count the cost of this redundant `select`. Which is precisely what happens with loop vectorization legality checks, and that artificially increases the cost of said checks, which is bad. There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`. Refs. https://reviews.llvm.org/D109368#3089809	2021-10-27 18:01:05 +03:00
Roman Lebedev	5a8a7b3bf8	[NFC] Re-autogenerate check lines in some tests for ease of future update	2021-10-27 18:01:05 +03:00
Alexey Bataev	9b12975cbf	Revert "[SLP]Improve/fix reordering of the gathered graph nodes." This reverts commit `f719b794bc` to fix instability in tests.	2021-10-27 07:31:36 -07:00
Alexey Bataev	f719b794bc	[SLP]Improve/fix reordering of the gathered graph nodes. Gathered loads/extractelements/extractvalue instructions should be checked if they can represent a vector reordering node too and their order should be taken into account for better graph reordering analysis. Also, if the gather node has reused scalars, they must be reordered instead of the scalars themselves. Differential Revision: https://reviews.llvm.org/D112454	2021-10-27 06:08:40 -07:00
Matt	fc28a2f8ce	[AArch64][SVE] Combine predicated FMUL/FADD into FMA Combine FADD and FMUL intrinsics into FMA when the result of the FMUL is an FADD operand with only one use and both use the same predicate. Differential Revision: https://reviews.llvm.org/D111638	2021-10-27 11:41:23 +00:00
Alexey Bataev	cb4feae7bd	[SLP]Fix logical and/or reductions. Need to emit select(cmp) instructions for poison-safe forms of select ops. Currently alive reports that `Target is more poisonous than source` for the operations we are generating for such instructions. https://alive2.llvm.org/ce/z/FiNiAA Differential Revision: https://reviews.llvm.org/D112562	2021-10-27 04:25:20 -07:00
Florian Hahn	1a2a7cca3e	[DSE] Add test case with 2 memcpys that should not be eliminated.	2021-10-27 11:15:58 +01:00
David Sherwood	3d706c20f8	[NFC][LoopVectorize] Remove setBestPlan in favour of getBestPlanFor I have removed LoopVectorizationPlanner::setBestPlan, since this function is quite aggressive because it deletes all other plans except the one containing the <VF,UF> pair required. The code is currently written to assume that all <VF,UF> pairs will live in the same vplan. This is overly restrictive, since scalable VFs live in different plans to fixed-width VFs. When we add support for vectorising epilogue loops when the main loop uses scalable vectors, the vplan for the main loop will be different to the one for the epilogue. Instead I have added a new function called LoopVectorizationPlanner::getBestPlanFor that returns the best vplan for the <VF,UF> pair requested and leaves all the vplans untouched. We then pass this best vplan to LoopVectorizationPlanner::executePlan which now takes an additional VPlanPtr argument. Differential revision: https://reviews.llvm.org/D111125	2021-10-27 09:38:27 +01:00
Sanjay Patel	acabad9ff6	[InstCombine] try to canonicalize icmp with trunc op into mask and cmp The motivating test is based on: https://llvm.org/PR52260 We have better analysis for X == 0, so try harder to form that.	2021-10-26 17:43:28 -04:00
Sanjay Patel	e8fdd030b1	[InstCombine] add tests for icmp with trunc op; NFC	2021-10-26 17:43:28 -04:00
Alexey Bataev	5db7568a6a	[SLP][NFC]Add a test for poison-free or reduction.	2021-10-26 14:04:05 -07:00
Stanislav Mekhanoshin	4faf88cc14	[InstCombine] Precommit new and-xor-or.ll tests. NFC.	2021-10-26 12:57:17 -07:00
Usman Nadeem	560dd1cdad	[NFC][Instcombine] Pre-commit some tests for negative fabs Change-Id: Idcce321c825ecc6b3a111a683e24dc10015f6872	2021-10-26 10:23:39 -07:00
Alexey Bataev	8ba8cf24f7	[SLP][NFC]Add a test for logical reduction with extra op.	2021-10-26 10:14:20 -07:00
Stanislav Mekhanoshin	6860abf748	[InstCombine] Precommit new and-xor-or.ll tests. NFC.	2021-10-26 10:12:53 -07:00
Alexey Bataev	ce14d1b690	[SLP]Do not reorder reduction nodes. The final reduction nodes should not be reordered, the order does not matter for reductions. Also, it might be profitable to vectorize smaller reduction trees, reduction cost may compensate small tree cost. Part of D111574 Differential Revision: https://reviews.llvm.org/D112467	2021-10-26 07:41:24 -07:00
Arthur Eubanks	544a21566d	[test] Make test added in D112473 check the IR The test was intended to also check the IR to be empty.	2021-10-25 14:10:58 -07:00
Arthur Eubanks	4a9db7367d	[AlwaysInliner] Invalidate analyses when we delete functions Fixes PR52292. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D112473	2021-10-25 13:36:32 -07:00
Zarko Todorovski	9769e97c35	[LLVM] Inclusive terms: remove/replace references to sanity in RewriteStatepointsForGC.cpp and test Part of work to have the LLVM backend to use more inclusive terms. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D112461	2021-10-25 16:17:41 -04:00
Philip Reames	f82cf6187f	[indvars] Fix pr52276 (missing one use check) The recently added logic to canonicalize exit conditions to unsigned relies on facts which hold about the use (i.e. exit test). Applying this blindly to the icmp is not legal, as there may be another use which never reaches the exit. Restrict ourselves to case where we have a single use.	2021-10-25 09:26:55 -07:00
Alexey Bataev	eb9b75dd4d	[SLP]Change the order of the reduction/binops args pair vectorization attempts. Need to change the order of the reduction/binops args pair vectorization attempts. Need to try to find the reduction at first and postpone vectorization of binops args. This may help to find more reduction patterns and vectorize them. Part of D111574. Differential Revision: https://reviews.llvm.org/D112224	2021-10-25 06:27:14 -07:00
Max Kazantsev	31822e0530	[Test] Add test for PR52290 Demonstrates hang in iterativelySimplifyCFG.	2021-10-25 18:25:59 +07:00
Philip Reames	3c06ecaa1e	[instcombine] Fix oss-fuzz 39934 (mul matcher can match non-instruction) Fixes a crash observed by oss-fuzz in 39934. Issue at hand is that code expects a pattern match on m_Mul to imply the operand is a mul instruction, however mul constexprs are also valid here.	2021-10-24 14:42:03 -07:00
Stanislav Mekhanoshin	55f7cc1a9a	[InstCombine] Precommit new and-xor-or.ll tests. NFC.	2021-10-22 11:59:15 -07:00
David Green	d4da71282f	[InstCombine] Various tests for truncating saturates and related patterns.	2021-10-22 18:36:08 +01:00
Philip Reames	412eb07edd	[indvars] Use fact loop must exit to canonicalize to unsigned conditions The logic in this patch is that if we find a comparison which would be unsigned except for when the loop is infinite, and we can prove that an infinite loop must be ill defined, we can still make the predicate unsigned. The eventual goal (combined with a follow on patch) is to use the fact the loop exits to remove the zext (see tests) entirely. A couple of points worth noting: * We lose the ability to prove the loop unreachable by committing to the must exit interpretation. If instead, we later proved that rhs was definitely outside the range required for finiteness, we could have killed the loop entirely. (We don't currently implement this transform, but could in theory, do so.) * simplifyAndExtend has a very limited list of users it walks. In particular, in the examples it stops at the zext and never visits the icmp. (Because we can't fold the zext to an addrec yet in SCEV.) Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts). D112170 explores fixing that, but - at least so far - appears to be too expensive compile time wise. Differential Revision: https://reviews.llvm.org/D111836	2021-10-22 10:31:36 -07:00
Quinn Pham	950f22a5e1	[llvm]Inclusive language: replace master with main [NFC] This patch fixes a url in a testcase due to the renaming of the branch.	2021-10-22 11:56:44 -05:00
Nikita Popov	3a10fe2d89	[Loads] Use more powerful constant folding API This follows up on D111023 by exporting the generic "load value from constant at given offset as given type" and using it in the store to load forwarding code. We now need to make sure that the load size is smaller than the store size, previously this was implicitly ensured by ConstantFoldLoadThroughBitcast(). Differential Revision: https://reviews.llvm.org/D112260	2021-10-22 18:33:03 +02:00
Nikita Popov	5bb7562962	[Attributor] Generalize GEP construction Make use of the getGEPIndicesForOffset() helper for creating GEPs. This handles arrays as well, uses correct GEP index types and reduces code duplication. Differential Revision: https://reviews.llvm.org/D112263	2021-10-22 18:30:43 +02:00
Piotr Sobczak	7457fe3dd4	[InstCombine][NFC] Precommit new tests	2021-10-22 17:15:53 +02:00
Florian Hahn	286e98b97e	[DSE] Add test cases with more complex redundant stores. This patch adds more complex test cases with redundant stores of an existing memset, with other stores in between. It also makes a few of the existing tests more robust.	2021-10-22 13:50:32 +01:00
Roman Lebedev	e1db72703f	[NFC] Re-harden test/Transforms/LoopVectorize/X86/pr48340.ll This test is quite fragile WRT improvements to the interleaved load cost modelling. Let's bump the stride way up so that is no longer a concern.	2021-10-22 15:07:53 +03:00
Roman Lebedev	6f6842d782	Revert "[NFC][LV] Autogenerate check lines in a test for ease of future update" This reverts commit `8ae83a1baf`.	2021-10-22 15:07:53 +03:00
Roman Lebedev	2eaef53023	[TTI] `BasicTTIImplBase::getInterleavedMemoryOpCost()`: fix load discounting The math here is: Cost of 1 load = cost of n loads / n Cost of live loads = num live loads * Cost of 1 load Cost of live loads = num live loads * (cost of n loads / n) Cost of live loads = cost of n loads * (num live loads / n) But, all the variables here are integers, and integer division rounds down, but this calculation clearly expects float semantics. Instead multiply upfront, and then perform round-up-division. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112302	2021-10-22 14:08:58 +03:00
Roman Lebedev	8ae83a1baf	[NFC][LV] Autogenerate check lines in a test for ease of future update	2021-10-22 14:08:58 +03:00
Chuanqi Xu	ddbf196194	[Coroutines] Ignore partial lifetime markers refer of an alloca When I was playing with Coroutines, I found that it is possible to generate the following IR: ``` %struct = alloca ... %sub.element = getelementptr %struct, i64 0, i64 index ; index is not %zero lifetime.marker.start(%sub.element) % use of %sub.element lifetime.marker.end(%sub.element) store %struct to xxx ; %struct is escaping! <suspend points> ``` Then the AllocaUseVisitor would collect the lifetime marker for sub.element and treat it as the lifetime markers of the alloca! So it judges that the alloca could be put on the stack instead of the frame by judging the lifetime markers only. The root cause for the bug is that AllocaUseVisitor collects wrong lifetime markers. This patch fixes this. Reviewed By: lxfind Differential Revision: https://reviews.llvm.org/D112216	2021-10-22 09:49:50 +08:00
Stanislav Mekhanoshin	c0d6e1b9e0	[InstCombine] Precommit new and-xor-or.ll tests. NFC.	2021-10-21 15:15:54 -07:00
Stanislav Mekhanoshin	969b72fb66	Add test to check we can instcombine after reassociate. NFC. The pattern became optimized after `b92412fb28`. Differential Revision: https://reviews.llvm.org/D112258	2021-10-21 12:27:26 -07:00
Nikita Popov	8262f45c73	[InstCombine] Add additional store forwarding test (NFC) Variant where the load is larger than the store. Make sure we don't forward this.	2021-10-21 20:47:48 +02:00
Nikita Popov	1848525842	[CodeMetrics] Don't require speculatability for ephemeral values As discussed in D112016, our current requirement of speculatability for ephemeral is overly strict: What we really care about is that the instruction will be DCEd once the assume is dropped. For that it is sufficient that the instruction is side-effect free and not a terminator. In particular, this allows non-dereferenceable loads to be ephemeral values. Differential Revision: https://reviews.llvm.org/D112179	2021-10-21 20:30:01 +02:00
Florian Hahn	a4b8979a81	[SLP] Add additional tests which caused crashes with versioning.	2021-10-21 18:17:31 +01:00
Sanjay Patel	66d22b4da4	[VectorCombine] fold shuffle-of-binops with common operand shuf (bo X, Y), (bo X, W) --> bo (shuf X), (shuf Y, W) This is motivated by an example in D111800 (although that patch avoids the problem for that particular example). The pattern is shown in reduced form with: https://llvm.org/PR52178 https://alive2.llvm.org/ce/z/d8zB4D There is no difference on the PhaseOrdering test from D111800 because the aarch64 cost model says that the shuffle cost is 3 while the fadd cost is 2. Differential Revision: https://reviews.llvm.org/D111901	2021-10-21 12:37:54 -04:00
Sanjay Patel	3888de9507	[InstCombine] generalize reassociated Demorgan folds This updates the recent D112108 / `b92412fb28` to handle the flipped logic ('or') sibling: https://alive2.llvm.org/ce/z/Y2L6Ch	2021-10-21 10:39:29 -04:00
Sanjay Patel	6b560a8e23	[InstCombine] add tests for DeMorgan with reassociation; NFC These are direct mutations of the tests added for D112108 - we should handle the sibling folds for 'or'.	2021-10-21 10:39:28 -04:00
Alexey Bataev	3ea7877c8b	[SLP]Unify vectorization of PHI and store nodes with improved tiny tree vectorization. Vectorization of PHIs and stores is very similar, it might be beneficial to try to revectorize stores (like PHIs) if the total number of stores with the same/alternate opcode is less than the vector size but number of stores with the same type is larger than the vector size. Differential Revision: https://reviews.llvm.org/D109831	2021-10-21 06:25:32 -07:00
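
The message of commit `2eaef53023` in the table above derives the load-discount formula (cost of live loads = cost of n loads * (num live loads / n)) and notes that integer division rounds down where the math expects fractional semantics. The sketch below illustrates that pitfall and the "multiply upfront, then round-up-divide" remedy in standalone C++. The function and parameter names are hypothetical and are not taken from `BasicTTIImplBase::getInterleavedMemoryOpCost()`; the truncating variant is an assumed reading of the pre-fix arithmetic, not a copy of it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not LLVM APIs.

// Divides first: with integers, numLive / n truncates to 0 whenever
// numLive < n, so the discounted cost collapses to 0.
uint64_t discountedCostTruncating(uint64_t costOfAllLoads, uint64_t numLive,
                                  uint64_t n) {
  assert(n != 0 && numLive <= n);
  return costOfAllLoads * (numLive / n);
}

// Multiplies first, then performs a round-up division, so the result
// approximates costOfAllLoads * (numLive / n) without losing the fraction.
uint64_t discountedCostRoundUp(uint64_t costOfAllLoads, uint64_t numLive,
                               uint64_t n) {
  assert(n != 0 && numLive <= n);
  return (costOfAllLoads * numLive + n - 1) / n;
}
```

With a total cost of 12 for 4 loads of which 2 are live, the truncating form yields 0 while the round-up form yields 6; when all loads are live, both forms return the full cost.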

20031 Commits