llvm-project

Author	SHA1	Message	Date
eopXD	6a84579243	[LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D126350	2022-05-27 15:22:23 -07:00
William Huang	35b0955aa5	[ValueTracking] Added support to deduce PHI Nodes values being a power of 2 Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D126018	2022-05-26 20:30:31 +00:00
Florian Hahn	6af5f5697c	[SCEV] Collect conditions from assumes same way as for branches. Also collect conditions from assume up-front in applyLoopGuards. This allows re-using the logic to handle logical ANDs as assume conditions. It should pave the road for a fix for #55645.	2022-05-26 18:17:13 +01:00
serge-sans-paille	fb67d683db	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `7030654296` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D126417	2022-05-26 08:12:34 +02:00
Nikita Popov	8a6698b523	[ValueTracking] Loads with !dereferenceable metadata cannot be undef/poison A load with !dereferenceable or !dereferenceable_or_null metadata must return a well-defined (non-undef/poison) value. Effectively they imply !noundef. This is the same as we do for the dereferenceable(N) attribute. This should fix https://github.com/llvm/llvm-project/issues/55672, or at least the specific case discussed there. Differential Revision: https://reviews.llvm.org/D126296	2022-05-25 09:54:04 +02:00
Sanjay Patel	e8c20d995b	[IR] add and use pattern match specialization for sqrt intrinsic; NFC This was included in D126190 originally, but it's independent and a useful change for readability.	2022-05-23 14:16:30 -04:00
Jingu Kang	bb82f74612	Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"" This reverts commit `42ebfa8269`. The commit from https://reviews.llvm.org/D125918 has fixed the stage 2 build failure. Differential Revision: https://reviews.llvm.org/D118979	2022-05-23 16:15:45 +01:00
Peter Waller	ade47bdc31	[LV] Improve register pressure estimate at high VFs Previously, `getRegUsageForType` was implemented using `getTypeLegalizationCost`. `getRegUsageForType` is used by the loop vectorizer to estimate the register pressure caused by using a vector type. However, `getTypeLegalizationCost` currently only appears to understand splitting and not scalarization, so significantly underestimates the register requirements. Instead, use `getNumRegisters`, which understands when scalarization can occur (via computeRegisterProperties). This was discovered while investigating D118979 (Set maximum VF with shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the loop vectorizer previously ends up costing a v128i1 as 2 v64i* registers where it actually occupies 128 i32 registers. I'm sending this patch early for comment, I'm still doing some sanity checking with LNT. I note that getRegisterClassForType appears to return VectorRC even though the type in question (large vNi1 types) ends up occupying scalar registers. That might be worth fixing too. Differential Revision: https://reviews.llvm.org/D125918	2022-05-23 07:57:45 +00:00
Nikita Popov	c8b675eaa1	[SCEV] Use umin_seq for BECount of multi-exit loops When computing the BECount for multi-exit loops, we need to combine individual exit counts using umin_seq rather than umin. This is because an earlier exit may exit on the first iteration, in which case later exit expressions will not be evaluated and could be poisonous. We cannot propagate potential poison values from later exits. In particular, this avoids the introduction of "branch on poison" UB when optimizing multi-exit loops. Differential Revision: https://reviews.llvm.org/D124910	2022-05-21 15:48:14 +02:00
Craig Topper	f2df53b750	[InstructionSimplify] Remove multiple 'break' after 'return'. NFC	2022-05-20 10:23:57 -07:00
Nico Weber	304a5a7a14	Revert "[ValueTracking] Added support to deduce PHI Nodes values being a power of 2" This reverts commit `d5c130f17e`. Breaks tests, see https://reviews.llvm.org/D125332#3525819	2022-05-19 15:05:30 -04:00
William Huang	d5c130f17e	[ValueTracking] Added support to deduce PHI Nodes values being a power of 2 Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D125332	2022-05-19 18:39:13 +00:00
Jay Foad	6bec3e9303	[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions. The OrSelf versions additionally have the strange behaviour of allowing extending to a smaller width, or truncating to a larger width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting. Differential Revision: https://reviews.llvm.org/D125557	2022-05-19 11:23:13 +01:00
Philip Reames	f7988d08a8	Revert "[BasicAA] Remove unneeded special case for malloc/calloc" This reverts commit `9b1e00738c`. Nikic reported in commit thread that I had forgotten history here, and that a) we'd tried this before, and b) had to revert due to an unexpected codegen impact. Current measurements confirm the same issue still exists.	2022-05-18 07:35:27 -07:00
NAKAMURA Takumi	6ca7eb2c6d	[SCEV] Part 1, Serialize function calls in function arguments. Evaluation ordering in function call arguments is implementation-dependent. In fact, gcc evaluates bottom-top and clang does top-bottom. Fixes #55283 partially. Part of https://reviews.llvm.org/D125627	2022-05-18 23:20:08 +09:00
Philip Reames	9b1e00738c	[BasicAA] Remove unneeded special case for malloc/calloc This code pre-exists the generic handling for inaccessiblememonly. If we remove it and update one test with inaccessiblememonly, nothing else changes. Note that simply running O1 on that test would annotate malloc with the missing inaccessiblememonly.	2022-05-17 20:45:14 -07:00
Nikita Popov	b9b71c2b87	[LVI] Compute range for xor We do have a non-trivial implementation for binaryXor() now.	2022-05-17 10:18:38 +02:00
Yang Keao	7dce9eb6e5	[DomPrinter] Migrate -dot-dom to the new pass manager. In D123677, @YangKeao provided an implementation of `DOTGraphTraits{Viewer,Printer}` in the new pass manager. This commit migrates the `DomPrinter` and `DomViewer` to the new pass manager. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D124904	2022-05-16 15:07:16 -05:00
Nikita Popov	356d47ccb9	[ValueTracking] Handle and/or on RHS of isImpliedCondition() isImpliedCondition() currently handles and/or on the LHS, but not on the RHS, resulting in asymmetric behavior. This patch adds two new implication rules: * LHS ==> (RHS1 || RHS2) if LHS ==> RHS1 or LHS ==> RHS2 * LHS ==> !(RHS1 && RHS2) if LHS ==> !RHS1 or LHS ==> !RHS2 Differential Revision: https://reviews.llvm.org/D125551	2022-05-16 16:30:26 +02:00
Florian Hahn	b7315ffc3c	[LAA,LV] Add initial support for pointer-diff memory checks. This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable. The main idea is to only check if source and sink of a dependency are far enough apart so the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to `VF * UF * AccessSize`. It is sufficient to check `(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is no dependence preventing vectorization, hence the overflow should not matter and using the ULT should be sufficient. Note that the initial version is restricted in multiple ways: 1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information) 2. Source and sink pointers must be add-recs, with matching steps 3. The step must be a constant. 4. abs(step) == AccessSize. Most of those restrictions can be relaxed in the future. See https://github.com/llvm/llvm-project/issues/53590. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D119078	2022-05-16 15:27:22 +01:00
NAKAMURA Takumi	da7d8de1e4	ScalarEvolution.cpp: Reformat.	2022-05-15 20:51:27 +09:00
Sanjay Patel	ee6754c277	[ValueTracking] recognize sub X, (X % Y) as not overflowing I fixed some poison-safety violations on related patterns in InstCombine and noticed that we missed adding nsw/nuw on them, so this adds clauses to the underlying analysis for that. We need the undef input restriction to make this safe according to Alive2: https://alive2.llvm.org/ce/z/48g9K8 Differential Revision: https://reviews.llvm.org/D125500	2022-05-13 09:59:41 -04:00
Nikita Popov	ddfee07519	[InstSimplify] Fold and/or using implied conditions This adds two conjugated folds: * A | B -> B if A implies B (https://alive2.llvm.org/ce/z/R6GU4j) * A & B -> A if A implies B (https://alive2.llvm.org/ce/z/EGMqyy) If A and B are icmps themselves, we will usually fold this through other logic already (though the tests show a couple additional cases we previously missed). However, isImpliedCond() also supports A being of the form X & Y, which allows us to handle cases like (X & Y) | B where X implies B. This addresses the regression from D125398. Something that notably doesn't work yet is the (X | Y) & B case. This is due to an asymmetry in the isImpliedCondition() implementation that will have to be addressed separately. Differential Revision: https://reviews.llvm.org/D125530	2022-05-13 15:09:14 +02:00
Florian Hahn	5890b30105	[LAA] Initial support for runtime checks with pointer selects. Scaffolding support for generating runtime checks for multiple SCEV expressions per pointer. The initial version just adds support for looking through a single pointer select. The more sophisticated logic for analyzing forks is in D108699 Reviewed By: huntergr Differential Revision: https://reviews.llvm.org/D114487	2022-05-12 19:33:48 +01:00
Arthur Eubanks	7e0802aeb5	[BasicAA] Fix order in which we pass MemoryLocations to alias() D98718 caused the order of Values/MemoryLocations we pass to alias() to be significant due to storing the offset in the PartialAlias case. But some callers weren't audited and were still passing swapped arguments, causing the returned PartialAlias offset to be negative in some cases. For example, the newly added unittests would return -1 instead of 1. Fixes #55343, a miscompile. Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D125328	2022-05-10 12:05:38 -07:00
Nikita Popov	c077510bb1	[InstSimplify] Handle unknown function context in pointer icmp fold (PR54615) This issue reproduces in the context of LoopDeletion, because the bitcast does not get simplified away there. For a plain -inst-simplify run the bitcast would get folded away first. Fixes https://github.com/llvm/llvm-project/issues/54615.	2022-05-10 11:48:43 +02:00
Andrew Litteken	96345f773c	[IRSim] Remove early check from similarity matching such that commutative instructions are checked correctly when using the same value. When the first commutative instruction in a region using the same value in both positions was compared to a corresponding instruction with two different values, there was an early check that determined that since the values were new, it was true that these values acted in the same way structurally. If this was not contradicted later in the program, the regions were marked as similar. This removes that check, so that it is clear that the same value cannot be mapped to two different values. Reviewer: paquette Differential Revision: https://reviews.llvm.org/D124775	2022-05-09 22:59:09 -05:00
Mircea Trofin	c35ad9ee4f	[mlgo] Support exposing more features than those supported by models This allows the compiler to support more features than those supported by a model. The only requirement (development mode only) is that the new features must be appended at the end of the list of features requested from the model. The support is transparent to compiler code: for unsupported features, we provide a valid buffer to copy their values; it's just that this buffer is disconnected from the model, so insofar as the model is concerned (AOT or development mode), these features don't exist. The buffers are allocated at setup - meaning, at steady state, there is no extra allocation (maintaining the current invariant). These buffers have 2 roles: first, keep the compiler code simple. Second, allow logging their values in development mode. The latter allows retraining a model supporting the larger feature set starting from traces produced with the old model. For release mode (AOT-ed models), this decouples compiler evolution from model evolution, which we want in scenarios where the toolchain is frequently rebuilt and redeployed: we can first deploy the new features, and continue working with the older model, until a new model is made available, which can then be picked up the next time the compiler is built. Differential Revision: https://reviews.llvm.org/D124565	2022-05-09 18:01:21 -07:00
Michael Kruse	6b3b87376b	[polly] migrate -polly-show to the new pass manager Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D123678	2022-05-09 14:04:29 -05:00
Michael Kruse	a6b399ad79	[PassManager] Implement DOTGraphTraitsViewer under NPM Rename the legacy `DOTGraphTraits{Module,}{Viewer,Printer}` to the corresponding `DOTGraphTraits...WrapperPass`, and implement a new `DOTGraphTraitsViewer` with new pass manager. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D123677	2022-05-09 14:04:28 -05:00
Alexey Bataev	9dc4ced204	[SLP]Try partial store vectorization if supported by target. We can try to vectorize number of stores less than MinVecRegSize / scalar_value_size, if it is allowed by target. Gives an extra opportunity for the vectorization. Fixes PR54985. Differential Revision: https://reviews.llvm.org/D124284	2022-05-09 09:48:15 -07:00
Nikita Popov	68e1ba8188	[SCEV] Fold umin_seq using known predicate Fold %x umin_seq %y to %x if %x ule %y. This also subsumes the special handling for constant operands, as if %y is constant this folds to umin via implied poison reasoning, and if %x is constant then either %x is not zero and it folds to umin, or it is known zero, in which case it is ule anything.	2022-05-09 16:35:08 +02:00
Nikita Popov	18eaff1510	[ScalarEvolution] Fold %x umin_seq %y if %x cannot be zero Fold %x umin_seq %y to %x umin %y if %x cannot be zero. They only differ in semantics for %x==0. More generally %x _seq %y folds to %x %y if %x cannot be the saturation fold (though currently we only have umin_seq).	2022-05-09 15:11:05 +02:00
Serge Pavlov	eb28da89a6	[InstCombine] Remove side effect of replaced constrained intrinsics If a constrained intrinsic call was replaced by some value, it was not removed in some cases. The dangling instruction resulted in useless instructions executed at run time. It happened because constrained intrinsics usually have a side effect, which is used to model the interaction with the floating-point environment. In some cases the side effect is actually absent or can be ignored. This change adds specific treatment of constrained intrinsics so that their side effect can be removed if it is actually absent. Differential Revision: https://reviews.llvm.org/D118426	2022-05-07 19:04:11 +07:00
Nikita Popov	47c559d6c1	[SCEV] Fold umin_seq to umin using implied poison reasoning Similar to how we convert logical and/or to bitwise and/or, we should also convert umin_seq to umin based on implied poison reasoning. In %x umin_seq %y, if %y being poison implies %x being poison, then we don't need the sequential evaluation: Having %y contribute towards the result will never make the result more poisonous. An important corollary of this is that if %y is never poison, we also don't need the sequential evaluation. This avoids some of the regressions in D124910. Differential Revision: https://reviews.llvm.org/D124921	2022-05-05 09:43:49 +02:00
Yangguang Li	3a8266902b	[SCEV] Removed an unnecessary assertion The assertion is to check we always get backedge taken count (`BECount`) of zero when the exit condition is in select form (`isa<BinaryOperation>(ExitCond)`) and the exit limit for the first operand is zero (`EL0.ExactNotTaken->isZero()`). However the assertion is checking that the exit condition is NOT in select form. Removing the whole assertion since we now handle select form in ScalarEvolution::getSequentialMinMaxExpr. Reviewed By: reames, nikic Differential Revision: https://reviews.llvm.org/D122835	2022-05-03 17:26:27 -04:00
Augie Fackler	1deea714b3	BuildLibCalls: simplify switch statement slightly Per feedback on D123086 after submit. Also added a test for vec_malloc et al attribute inference to show it's doing the right thing. The new tests exposed a defect, corrected by adding vec_free to the list of free functions in MemoryBuiltins.cpp, which had been overlooked all the way back in D94710, over a year ago. Differential Revision: https://reviews.llvm.org/D124859	2022-05-03 13:17:33 -04:00
Nikita Popov	47255834e7	[ValueTracking] A and (B & ~A) have no common bits set This extends haveNoCommonBitsSet() to two additional cases, allowing the following folds: * `A + (B & ~A)` --> `A | (B & ~A)` (https://alive2.llvm.org/ce/z/crxxhN) * `A + ((A & B) ^ B)` --> `A | ((A & B) ^ B)` (https://alive2.llvm.org/ce/z/A_wsH_) These should further fold to just `A | B`, though this currently only works in the first case. The reason why the second fold is necessary is that we consider this to be the canonical form if B is a constant. (I did check whether we can change that, but it looks like a number of folds depend on the current canonicalization, so I ended up adding both patterns here.) Differential Revision: https://reviews.llvm.org/D124763	2022-05-03 11:33:27 +02:00
Igor Kirillov	4e5e042d9a	[LoopVectorize] Support reductions that store intermediary result Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop. Ordered fadd reductions are not yet supported. Differential Revision: https://reviews.llvm.org/D110235	2022-05-03 10:12:30 +01:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single one. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as an overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00
Bardia Mahjour	363b3a645a	fix warning caused by `ef4ecc3cef`	2022-05-02 17:06:27 -04:00
Bardia Mahjour	ef4ecc3cef	[LoopCacheAnalysis] Consider dimension depth of the subscript reference when calculating cost Reviewed By: congzhe, etiotto Differential Revision: https://reviews.llvm.org/D123400	2022-05-02 16:49:10 -04:00
Nikita Popov	597946a4dd	[ConstantFold] Don't convert getelementptr to ptrtoint+inttoptr ConstantFolding currently converts "getelementptr i8, Ptr, (sub 0, V)" to "inttoptr (sub (ptrtoint Ptr), V)". This transform is, taken by itself, correct, but does come with two issues: 1. It unnecessarily broadens provenance by introducing an inttoptr. We generally prefer not to introduce inttoptr during optimization. 2. For the case where V == ptrtoint Ptr, this folds to inttoptr 0, which further folds to null. In that case provenance becomes incorrect. This has been observed as a real-world miscompile with rustc. We should probably address that incorrect inttoptr 0 fold at some point, but in either case we should also drop this inttoptr-introducing fold. Instead, replace it with a fold rooted at ptrtoint(getelementptr), which seems to cover the original motivation for this fold (test2 in the changed file). Differential Revision: https://reviews.llvm.org/D124677	2022-05-02 10:24:46 +02:00
Congzhe Cao	c428a3d2a0	[LoopCacheAnalysis] Enable delinearization of fixed sized arrays Currently loop cache cost (LCC) cannot analyze fixed-size arrays since it cannot delinearize them. This patch adds the capability to delinearize fixed-size arrays to LCC. Most of the code is ported from DependenceAnalysis.cpp and some refactoring will be done in a next patch. Reviewed By: #loopoptwg, Meinersbur Differential Revision: https://reviews.llvm.org/D122857	2022-04-29 16:01:27 -04:00
Roman Lebedev	981ed72a17	[NFC][SCEV] Refactor `createNodeForSelectViaUMinSeq()` out of `createNodeForSelectOrPHIViaUMinSeq()`	2022-04-29 02:37:06 +03:00
Mircea Trofin	49942d595f	[NFC] remove const from FunctionPropertiesAnalysis::run, keep on Result The goal in `75881d8b02` was just modifying what `Result` is, didn't need to also modify ::run.	2022-04-28 15:10:21 -07:00
Mircea Trofin	75881d8b02	[NFC] const-ed the return type of FunctionPropertiesAnalysis The result is a data bag, this makes sure it's signaled to a user that the data can't be mutated when, for example, doing something like: auto &R = FAM.getResult<FunctionPropertiesAnalysis>(F) ... R.Uses++	2022-04-28 12:42:16 -07:00
Alexey Bataev	75e1cf4a6a	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning of the incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-28 10:04:41 -07:00
Alexey Bataev	9861ca0c23	Revert "[COST]Improve cost model for shuffles in SLP." This reverts commit `29a470e380` to fix a crash reported in https://reviews.llvm.org/D100486#3479989.	2022-04-28 08:11:56 -07:00
Chris Jackson	c792884589	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] Reland `3f2b76ec90` with the test corrected to require x86-registered-target. Differential Revision: https://reviews.llvm.org/D120169	2022-04-28 14:21:56 +01:00
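
The bit identity behind commit `47255834e7` above ([ValueTracking] A and (B & ~A) have no common bits set) can be checked by brute force. The sketch below is standalone C++ and does not use any LLVM API; the function name `disjointAddIsOr` is hypothetical and the check is restricted to 8-bit values as an assumption to keep it exhaustive. It confirms that for both patterns named in the message the operands share no bits, so the add equals the or, and that both results equal `A | B`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): exhaustively checks, for all
// 8-bit A and B, the two patterns from the commit message.
bool disjointAddIsOr() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const std::uint8_t A = static_cast<std::uint8_t>(a);
      const std::uint8_t B = static_cast<std::uint8_t>(b);
      // Pattern 1: B & ~A. Pattern 2: (A & B) ^ B.
      const std::uint8_t M1 = static_cast<std::uint8_t>(B & ~A);
      const std::uint8_t M2 = static_cast<std::uint8_t>((A & B) ^ B);
      // A shares no set bits with either masked value.
      if ((A & M1) != 0 || (A & M2) != 0)
        return false;
      // With no common bits there is no carry, so add and or agree.
      if (static_cast<std::uint8_t>(A + M1) != static_cast<std::uint8_t>(A | M1))
        return false;
      if (static_cast<std::uint8_t>(A + M2) != static_cast<std::uint8_t>(A | M2))
        return false;
      // Both forms simplify further to A | B.
      if (static_cast<std::uint8_t>(A | M1) != static_cast<std::uint8_t>(A | B))
        return false;
      if (static_cast<std::uint8_t>(A | M2) != static_cast<std::uint8_t>(A | B))
        return false;
    }
  }
  return true;
}

// Small self-check wrapper; call it from a test or from main().
void checkDisjointAddIsOr() { assert(disjointAddIsOr()); }
```

This only illustrates why the folds are sound at one bit width; the Alive2 links in the commit message are the authoritative proofs.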

11485 Commits
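
The pointer-diff runtime check described in commit `b7315ffc3c` ([LAA,LV] Add initial support for pointer-diff memory checks) can be sketched as a single unsigned comparison. This is a minimal illustration, not LLVM's implementation: the function name `needsScalarFallback` is hypothetical, addresses are modelled as plain 64-bit integers, and reading a true result as "the vector loop is unsafe" is an interpretation of the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API). Returns true when
// (Sink - Src) <u VF * UF * AccessSize, i.e. when the sink lies less than
// one vector iteration's worth of bytes ahead of the source.
bool needsScalarFallback(std::uint64_t Src, std::uint64_t Sink,
                         std::uint64_t VF, std::uint64_t UF,
                         std::uint64_t AccessSize) {
  // Unsigned subtraction wraps when Src > Sink, producing a very large
  // value, so the comparison is false in that case, as the message notes.
  return (Sink - Src) < VF * UF * AccessSize;
}

// Small self-check wrapper with illustrative (made-up) addresses.
void checkPointerDiff() {
  // Sink is 8 bytes ahead; one vector iteration covers 4 * 1 * 4 = 16 bytes.
  assert(needsScalarFallback(1000, 1008, 4, 1, 4));
  // Sink is exactly 16 bytes ahead: far enough apart.
  assert(!needsScalarFallback(1000, 1016, 4, 1, 4));
  // Src is above Sink: the difference wraps and the check passes.
  assert(!needsScalarFallback(1016, 1000, 4, 1, 4));
}
```

Compared with a full overlap check on both access ranges, this needs one subtraction and one comparison per pointer pair, which matches the commit's claim of fewer computations when the stated restrictions hold.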