llvm-project

Commit Graph

Author	SHA1	Message	Date
Matthew Simpson	6349380fa4	Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor The default vector insert/extract cost is more profitable on Falkor than the reduced cost. llvm-svn: 303771	2017-05-24 16:48:39 +00:00
Diana Picus	183863fc3b	Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start" This reverts commit r303730 because it broke all the buildbots. llvm-svn: 303747	2017-05-24 14:16:04 +00:00
Max Kazantsev	13e016bf48	[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that the loop of our base recurrency is the bottom-lost in terms of domination. This assumption may be broken by an expression which is treated as invariant, and which depends on a complex Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence. Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike other SCEVs, SCEVUnknown are sometimes position-bound. For example, here: for (...) { // loop phi = {A,+,B} } X = load ... Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment). It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop> may be existant. This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr, if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer expect such behavior. llvm-svn: 303730	2017-05-24 08:52:18 +00:00
Sanjoy Das	036dda25a5	[SCEV] Clarify behavior around max backedge taken count This is a re-application of a r303497 that was reverted in r303498. I thought it had broken a bot when it had not (the breakage did not go away with the revert). This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303531	2017-05-22 06:46:04 +00:00
Sanjoy Das	8963650cfa	Revert "[SCEV] Clarify behavior around max backedge taken count" This reverts commit r303497 since it breaks the msan bootstrap bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/ llvm-svn: 303498	2017-05-21 05:02:12 +00:00
Sanjoy Das	5207168383	[SCEV] Clarify behavior around max backedge taken count This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303497	2017-05-21 01:47:50 +00:00
Simon Pilgrim	c74e7f0a42	Fix line-endings. llvm-svn: 303448	2017-05-19 19:47:29 +00:00
Simon Pilgrim	6bba6068be	[X86][AVX512] Add 512-bit vector ctpop costs + tests llvm-svn: 303342	2017-05-18 10:42:34 +00:00
Serguei Katkov	ba831f78fd	[BPI] Reduce the probability of unreachable edge to minimal value greater than 0 The probability of edge coming to unreachable block should be as low as possible. The change reduces the probability to minimal value greater than zero. The bug https://bugs.llvm.org/show_bug.cgi?id=32214 show the example when the probability of edge coming to unreachable block is greater than for edge coming to out of the loop and it causes incorrect loop rotation. Please note that with this change the behavior of unreachable heuristic is a bit different than others. Specifically, before this change the sum of probabilities coming to unreachable blocks have the same weight for all branches (it was just split over all edges of this block coming to unreachable blocks). With this change it might be slightly different but not to much due to probability of taken branch to unreachable block is really small. Reviewers: chandlerc, sanjoy, vsk, congh, junbuml, davidxl, dexonsmith Reviewed By: chandlerc, dexonsmith Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D30633 llvm-svn: 303327	2017-05-18 06:11:56 +00:00
Simon Pilgrim	23ef26728a	[X86][AVX512] Add 512-bit vector ctlz costs + tests llvm-svn: 303300	2017-05-17 21:02:18 +00:00
Simon Pilgrim	d0365967c4	[X86][AVX512] Add 512-bit vector cttz costs + tests llvm-svn: 303293	2017-05-17 20:22:54 +00:00
Simon Pilgrim	91b46c99be	[X86] Split ctpop/ctlz/cttz cost tests This will make things a lot easier to test all the permutations of avx512 llvm-svn: 303290	2017-05-17 19:57:20 +00:00
Simon Pilgrim	a9a92a1a6a	[X86][AVX512] Add 512-bit vector bitreverse costs + tests llvm-svn: 303283	2017-05-17 19:20:20 +00:00
Jonas Paulsson	8722ade770	[SystemZ] Modelling of costs of divisions with a constant power of 2. Such divisions will eventually be implemented with shifts which should be reflected in the cost function. Review: Ulrich Weigand llvm-svn: 303254	2017-05-17 12:46:26 +00:00
Max Kazantsev	b09b5db793	[SCEV] Fix sorting order for AddRecExprs The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs by loop depth, but does not pay attention to dominance of loops. This can lead us to the following buggy situation: for (...) { // loop1 op1 = {A,+,B} } for (...) { // loop2 op2 = {A,+,B} S = add op1, op2 } In this case there is no guarantee that in operand list of S the op2 comes before op1 (loop depth is the same, so they will be sorted just lexicographically), so we can incorrectly treat S as a recurrence of loop1, which is wrong. This patch changes the sorting logic so that it places the dominated recs before the dominating recs. This ensures that when we pick the first recurrency in the operands order, it will be the bottom-most in terms of domination tree. The attached test set includes some tests that produce incorrect SCEV estimations and crashes with oldlogic. Reviewers: sanjoy, reames, apilipenko, anna Reviewed By: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33121 llvm-svn: 303148	2017-05-16 07:27:06 +00:00
Daniel Jasper	61fa0dcac3	Add '#' to test regex that I forgot in r303025. llvm-svn: 303034	2017-05-15 04:58:27 +00:00
Daniel Jasper	54392a20a2	Fix two tests that weren't correctly copied. One didn't correctly fine the regex variable, the other still had a RUN line for FNOBUILTIN-checks, which weren't copied to the file. llvm-svn: 303025	2017-05-14 22:07:50 +00:00
Simon Pilgrim	d0ef9d8e93	[X86][AVX1] Account for cost of extract/insert of 256-bit shifts llvm-svn: 303023	2017-05-14 20:52:11 +00:00
Simon Pilgrim	f96b4ab92d	[X86][AVX2] Fix costs for v4i64 ashr by splat llvm-svn: 303022	2017-05-14 20:25:42 +00:00
Simon Pilgrim	de4467b182	[X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat llvm-svn: 303021	2017-05-14 20:02:34 +00:00
Simon Pilgrim	d3f0d03cc5	[X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul sequences llvm-svn: 303017	2017-05-14 18:52:15 +00:00
Simon Pilgrim	5bef9c627e	[X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + mask. Tweak cost model to match what lowering actually does. llvm-svn: 303013	2017-05-14 17:59:46 +00:00
Simon Pilgrim	aa8dffb69b	[X86][SSE] Account for cost of extract/insert of v32i8 vector shifts llvm-svn: 303012	2017-05-14 17:36:07 +00:00
Simon Pilgrim	4599eaa09a	[X86][XOP] Account for cost of extract/insert of 256-bit vector shifts llvm-svn: 303010	2017-05-14 13:38:53 +00:00
Justin Bogner	b713266331	AA: Use generic intrinsics for tests instead of target specific ones Update a few tests to use llvm.masked.load/store instead of arm neon vector loads and stores, and move the tests that are actually specific to those arm intrinsics to their own files. This lets us mark the tests that use target specific intrinsics as requiring those targets. llvm-svn: 302972	2017-05-13 00:12:52 +00:00
Serguei Katkov	63c9c81152	[BPI] Ignore remainder while distributing the remaining probability from unreachanble This is a follow up patch for https://reviews.llvm.org/rL300440 to address a comment. To make implementation to be consistent with other cases we just ignore the remainder after distribution of remaining probability between reachable edges. If we reduced the probability of some edges coming to unreachable blocks we should distribute the remaining part across other edges coming to reachable blocks to satisfy the condition that sum of all probabilities should be equal to one. If this remaining part is not divided by number of "reachable" edges then we get this remainder. This remainder probability should be pretty small. Other cases just ignore if the sum of probabilities is not equal to one so we do the same. Reviewers: chandlerc, sanjoy, vsk, junbuml, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D32124 llvm-svn: 302883	2017-05-12 07:50:06 +00:00
Matt Arsenault	3c5e4237c6	AMDGPU: Make some packed shuffles free VOP3P instructions can encode access to either half of the register. llvm-svn: 302730	2017-05-10 21:29:33 +00:00
Matthew Simpson	78fd46b230	[AArch64] Consider widening instructions in cost calculations The AArch64 instruction set has a few "widening" instructions (e.g., uaddl, saddl, uaddw, etc.) that take one or more doubleword operands and produce quadword results. The operands are automatically sign- or zero-extended as appropriate. However, in LLVM IR, these extends are explicit. This patch updates TTI to consider these widening instructions as single operations whose cost is attached to the arithmetic instruction. It marks extends that are part of a widening operation "free" and applies a sub-target specified overhead (zero by default) to the arithmetic instructions. Differential Revision: https://reviews.llvm.org/D32706 llvm-svn: 302582	2017-05-09 20:18:12 +00:00
Simon Pilgrim	2d1c6d6e8d	[X86][AVX1] Improve 256-bit vector costs for integer unary intrinsics. Account for subvector extraction/insertion, helps prevent the vectorizers from selecting 256-bit vectors that will have to be split anyhow on AVX1 targets. llvm-svn: 302378	2017-05-07 20:58:55 +00:00
Michael Zolotukhin	37162adf3e	[SCEV] createAddRecFromPHI: Optimize for the most common case. Summary: The existing implementation creates a symbolic SCEV expression every time we analyze a phi node and then has to remove it, when the analysis is finished. This is very expensive, and in most of the cases it's also unnecessary. According to the data I collected, ~60-70% of analyzed phi nodes (measured on SPEC) have the following form: PN = phi(Start, OP(Self, Constant)) Handling such cases separately significantly speeds this up. Reviewers: sanjoy, pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32663 llvm-svn: 302096	2017-05-03 23:53:38 +00:00
Elad Cohen	ef5798acf5	Support arbitrary address space pointers in masked gather/scatter intrinsics. Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a non-default address space pointer we fail with a "Calling a function with a bad singature!" assertion. This patch solves this by adding the 'vector of pointers' argument as an overloaded type which will determine the address space. Differential revision: https://reviews.llvm.org/D31490 llvm-svn: 302018	2017-05-03 12:28:54 +00:00
Sanjoy Das	ddebb703fc	Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts In cases where an instruction (a call site, say) is RAUW'ed with some other value (this is possible via the `returned` attribute, for instance), we want the slot in UnknownInsts to point to the original Instruction we wanted to track, not the value it got replaced by. Fixes PR32587. This relands r301426. llvm-svn: 301814	2017-05-01 17:07:56 +00:00
Sanjoy Das	08989c7ecd	Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC Summary: programUndefinedIfPoison makes more sense, given what the function does; and I'm about to add a function with a name similar to isKnownNotFullPoison (so do the rename to avoid confusion). Reviewers: broune, majnemer, bjarke.roune Reviewed By: broune Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30444 llvm-svn: 301776	2017-04-30 19:41:19 +00:00
Sanjoy Das	2cbeb00f38	Reverts commit r301424, r301425 and r301426 Commits were: "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts" "Add a new WeakVH value handle; NFC" "Rename WeakVH to WeakTrackingVH; NFC" The changes assumed pointers are 8 byte aligned on all architectures. llvm-svn: 301429	2017-04-26 16:37:05 +00:00
Sanjoy Das	8b32b81954	Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts Summary: In cases where an instruction (a call site, say) is RAUW'ed with some other value (this is possible via the `returned` attribute, amongst other things), we want the slot in UnknownInsts to point to the original Instruction we wanted to track, not the value it got replaced by. Fixes PR32587. Reviewers: davide Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D32268 llvm-svn: 301426	2017-04-26 16:21:02 +00:00
Sanjoy Das	561247a823	[IVUsers] Don't bail out of normalizing non-affine add recs Summary: In a previous change I changed SCEV's normalization / denormalization to work with non-affine add recs. So the bailout in IVUsers can be removed. Reviewers: atrick, efriedma Reviewed By: atrick Subscribers: davide, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D32105 llvm-svn: 301298	2017-04-25 06:53:25 +00:00
Sanjoy Das	bdbc4938f9	[SCEV] Fix exponential time complexity by caching llvm-svn: 301149	2017-04-24 00:09:46 +00:00
Eli Friedman	d0e6ae5678	Revert r300746 (SCEV analysis for or instructions). There have been multiple reports of this causing problems: a compile-time explosion on the LLVM testsuite, and a stack overflow for an opencl kernel. llvm-svn: 300928	2017-04-20 23:59:05 +00:00
Eli Friedman	e77d2b86b4	[SCEV] Make SCEV or modeling more aggressive. Use haveNoCommonBitsSet to figure out whether an "or" instruction is equivalent to addition. This handles more cases than just checking for a constant on the RHS. Differential Revision: https://reviews.llvm.org/D32239 llvm-svn: 300746	2017-04-19 20:19:58 +00:00
Matt Arsenault	4c1ecded63	AMDGPU: Change DivergenceAnalysis for function arguments Stop assuming all functions are kernels. llvm-svn: 300719	2017-04-19 17:42:34 +00:00
Serguei Katkov	2616bbb16d	[BPI] Use metadata info before any other heuristics Metadata potentially is more precise than any heuristics we use, so it makes sense to use first metadata info if it is available. However it makes sense to examine it against other strong heuristics like unreachable one. If edge coming to unreachable block has higher probability then it is expected by unreachable heuristic then we use heuristic and remaining probability is distributed among other reachable blocks equally. An example where metadata might be more strong then unreachable heuristic is as follows: it is possible that there are two branches and for the branch A metadata says that its probability is (0, 2^25). For the branch B the probability is (1, 2^25). So the expectation is that first edge of B is hotter than first edge of A because first edge of A did not executed at least once. If first edge of A points to the unreachable block then using the unreachable heuristics we'll set the probability for A to (1, 2^20) and now edge of A becomes hotter than edge of B. This is unexpected behavior. This fixed the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214 Reviewers: sanjoy, junbuml, vsk, chandlerc Reviewed By: chandlerc Subscribers: llvm-commits, reames, davidxl Differential Revision: https://reviews.llvm.org/D30631 llvm-svn: 300440	2017-04-17 04:33:04 +00:00
Brian Gesiak	0a7894d99c	[Analysis] Support bitreverse in -demanded-bits pass Summary: * Add a bitreverse case in the demanded bits analysis pass. * Add tests for the bitreverse (and bswap) intrinsic in the demanded bits pass. * Add a test case to the BDCE tests: that manipulations to high-order bits are eliminated once the bits are reversed and then right-shifted. Reviewers: mkuper, jmolloy, hfinkel, trentxintong Reviewed By: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31857 llvm-svn: 300215	2017-04-13 16:44:25 +00:00
Piotr Padlewski	04aee46779	Remove readnone from invariant.group.barrier Summary: Readnone attribute would cause CSE of two barriers with the same argument, which is invalid by example: struct Base { virtual int foo() { return 42; } }; struct Derived1 : Base { int foo() override { return 50; } }; struct Derived2 : Base { int foo() override { return 100; } }; void foo() { Base *x = new Base{}; new (x) Derived1{}; int a = std::launder(x)->foo(); new (x) Derived2{}; int b = std::launder(x)->foo(); } Here 2 calls of std::launder will produce @llvm.invariant.group.barrier, which would be merged into one call, causing devirtualization to devirtualize second call into Derived1::foo() instead of Derived2::foo() Reviewers: chandlerc, dberlin, hfinkel Subscribers: llvm-commits, rsmith, amharc Differential Revision: https://reviews.llvm.org/D31531 llvm-svn: 300101	2017-04-12 20:45:12 +00:00
Renato Golin	ab85113a93	[SystemZ] Fix target specific tests llvm-svn: 300078	2017-04-12 17:14:46 +00:00
Jonas Paulsson	9b4875434e	[SystemZ] Updated test fp-cast.ll This did not get included in the previous commit for SystemZ cost functions. llvm-svn: 300053	2017-04-12 12:11:41 +00:00
Jonas Paulsson	fccc7d66c3	[SystemZ] TargetTransformInfo cost functions implemented. getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052	2017-04-12 11:49:08 +00:00
Serguei Katkov	ecebc3db72	[BPI] Refactor post domination calculation and simple fix for ColdCall Collection of PostDominatedByUnreachable and PostDominatedByColdCall have been split out of heuristics itself. Update of the data happens now for each basic block (before update for PostDominatedByColdCall might be skipped if unreachable or matadata heuristic handled this basic block). This separation allows re-ordering of heuristics without loosing the post-domination information. Reviewers: sanjoy, junbuml, vsk, chandlerc, reames Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31701 llvm-svn: 300029	2017-04-12 05:42:14 +00:00
Daniel Berlin	554dcd8c89	MemorySSA: Move to Analysis, from Transforms/Utils. It's used as Analysis, it has Analysis passes, and once NewGVN is made an Analysis, this removes the cross dependency from Analysis to Transform/Utils. NFC. llvm-svn: 299980	2017-04-11 20:06:36 +00:00
Matt Arsenault	f10061ec70	Add address space mangling to lifetime intrinsics In preparation for allowing allocas to have non-0 addrspace. llvm-svn: 299876	2017-04-10 20:18:21 +00:00
Max Kazantsev	2e44d2969a	[ScalarEvolution] Re-enable Predicate implication from operations The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build. The reason of the crash was type mismatch between either a or b and RHS in the following situation: LHS = sext(a +nsw b) > RHS. This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type. But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this situation we don't need to create any non-constant SCEVs. This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not go further into range analysis etc (because in some situations these analyzes succeed even when the passed arguments have wrong types, what should not normally happen). The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong usage of predicates in recursive invocations. The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll Reviewers: reames, apilipenko, anna, sanjoy Reviewed By: sanjoy Subscribers: mzolotukhin, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31238 llvm-svn: 299205	2017-03-31 12:05:30 +00:00

1 2 3 4 5 ...

1251 Commits