llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kuperstein	e6d59fdca5	[X86] Add costs for non-AVX512 single-source permutation integer shuffles Differential Revision: https://reviews.llvm.org/D29416 llvm-svn: 293932	2017-02-02 20:27:13 +00:00
Michael Kuperstein	cb4ceeda7f	[X86] Extend single-source shuffle cost test to test more arches. NFC. llvm-svn: 293793	2017-02-01 18:09:47 +00:00
Eli Friedman	10d1ff64fe	[SCEV] Simplify/generalize howFarToZero solving. Make SolveLinEquationWithOverflow take the start as a SCEV, so we can solve more cases. With that implemented, get rid of the special case for powers of two. The additional functionality probably isn't particularly useful, but it might help a little for certain cases involving pointer arithmetic. Differential Revision: https://reviews.llvm.org/D28884 llvm-svn: 293576	2017-01-31 00:42:42 +00:00
Matt Arsenault	41c1499504	AMDGPU: Fix atomic_inc/atomic_dec + ds_swizzle not being divergent llvm-svn: 293504	2017-01-30 17:09:47 +00:00
Mehdi Amini	1726fc698c	Fix BasicAA incorrect assumption on GEP This is fixing pr31761: BasicAA is deducing NoAlias on the result of the GEP if the base pointer is itself NoAlias. This is possible only if the NoAlias on the base pointer is deduced with a non-sized query: this should guarantee that the pointers are belonging to different memory allocation and that the GEP can't legally jump from one to another. Differential Revision: https://reviews.llvm.org/D29216 llvm-svn: 293293	2017-01-27 16:12:22 +00:00
Daniil Fukalov	b09dac59fc	[SCEV] Introduce add operation inlining limit Inlining in getAddExpr() can cause abnormal computational time in some cases. New parameter -scev-addops-inline-threshold is introduced with default value 500. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D28812 llvm-svn: 293176	2017-01-26 13:33:17 +00:00
Daniel Jasper	65144c852d	Revert "[PPC] Give unaligned memory access lower cost on processor that supports it" This reverts commit r292680. It is causing significantly worse performance and test timeouts in our internal builds. I have already routed reproduction instructions your way. llvm-svn: 293092	2017-01-25 21:21:08 +00:00
Chandler Carruth	d501b18990	This test apparently requires an x86 target and is failing on numerous bots ever since d0k fixed the CHECK lines so that it did something at all. It isn't actually testing SCEV directly but LSR, so move it into LSR and the x86-specific tree of tests that already exists there. Target dependence is common and unavoidable with the current design of LSR. llvm-svn: 292774	2017-01-23 08:33:29 +00:00
Chandler Carruth	a504f2b8e8	[PM] Teach LVI to correctly invalidate itself when its dependencies become unavailable. The AssumptionCache is now immutable but it still needs to respond to DomTree invalidation if it ended up caching one. This lets us remove one of the explicit invalidates of LVI but the other one continues to avoid hitting a latent bug. llvm-svn: 292769	2017-01-23 06:35:12 +00:00
Benjamin Kramer	1fd0d44e9b	Attempt to fix test in release builds. llvm-svn: 292762	2017-01-22 21:01:19 +00:00
Benjamin Kramer	db9e0b659d	Fix some broken CHECK lines. The colon is important. llvm-svn: 292761	2017-01-22 20:28:56 +00:00
Guozhi Wei	a5c6ed5a5c	[PPC] Give unaligned memory access lower cost on processor that supports it Newer ppc supports unaligned memory access, it reduces the cost of unaligned memory access significantly. This patch handles this case in PPCTTIImpl::getMemoryOpCost. This patch fixes pr31492. Differential Revision: https://reviews.llvm.org/D28630 llvm-svn: 292680	2017-01-20 23:35:27 +00:00
Eli Friedman	f1f49c8265	[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly. To avoid regressions, make ScalarEvolution::createSCEV a bit more clever. Also get rid of some useless code in ScalarEvolution::howFarToZero which was hiding this bug. No new testcase because it's impossible to actually expose this bug: we don't have any in-tree users of getUDivExactExpr besides the two functions I just mentioned, and they both dodged the problem. I'll try to add some interesting users in a followup. Differential Revision: https://reviews.llvm.org/D28587 llvm-svn: 292449	2017-01-18 23:56:42 +00:00
Chandler Carruth	b6e32daa81	[PM] Teach the LoopPassManager to automatically canonicalize loops by running LCSSA over them prior to running the loop pipeline. This also teaches the loop PM to verify that LCSSA form is preserved throughout the pipeline's run across the loop nest. Most of the test updates just leverage this new functionality. One has to be relaxed with the new PM as IVUsers is less powerful when it sees LCSSA input. Differential Revision: https://reviews.llvm.org/D28743 llvm-svn: 292241	2017-01-17 19:18:12 +00:00
Simon Pilgrim	6ed996cdf0	[CostModel][X86] Fix AVX512BW vector shift costs for vXi16 types We already have patterns in place to support 128/256-bit shifts without AVX512VL llvm-svn: 292077	2017-01-15 20:44:00 +00:00
Simon Pilgrim	a6c1974e06	[CostModel][X86] Drop separate AVX512VL checks - they match existing AVX512 costs Keep the tests though. llvm-svn: 292076	2017-01-15 20:19:28 +00:00
Simon Pilgrim	9b169e3c22	[CostModel][X86] Update vector shift tests to correctly check by non-constant uniform values. Use shuffle( scalar_to_vector, zeroinitializer) pattern instead of shuffle( vec, zeroinitializer) llvm-svn: 292075	2017-01-15 20:10:28 +00:00
Chandler Carruth	0952750fae	[PM] Clean up the testing for IVUsers, especially with the new PM. First, I've moved a test of IVUsers from the LSR tree to a dedicated IVUsers test directory. I've also simplified its RUN line now that the new pass manager's loop PM is providing analyses on their own. No functionality changed, but it makes subsequent changes cleaner. llvm-svn: 292060	2017-01-15 09:29:27 +00:00
Chandler Carruth	2f19a324cb	[PM] The assumption cache is fundamentally designed to be self-updating, mark it as never invalidated in the new PM. The old PM already required this to work, and after a discussion with Hal this seems to really be the only sensible answer. The cache gracefully degrades as the IR is mutated, and most things which do this should already be incrementally updating the cache. This gets rid of a bunch of logic preserving and testing the invalidation of this analysis. llvm-svn: 292039	2017-01-15 00:26:18 +00:00
Simon Pilgrim	d419b73a42	[CostModel][X86] Updated vXi64 ASHR costs on AVX512 targets now that D28604 has landed llvm-svn: 292023	2017-01-14 19:24:23 +00:00
Eli Friedman	bd6dedaa7f	[SCEV] Make howFarToZero max backedge-taken count check for precondition. Refines max backedge-taken count if a loop like "for (int i = 0; i != n; ++i) { /* body */ }" is rotated. Differential Revision: https://reviews.llvm.org/D28536 llvm-svn: 291704	2017-01-11 21:07:15 +00:00
Eli Friedman	8396265655	[SCEV] Make howFarToZero use a simpler formula for max backedge-taken count. This is both easier to understand, and produces a tighter bound in certain cases. Differential Revision: https://reviews.llvm.org/D28393 llvm-svn: 291701	2017-01-11 20:55:48 +00:00
Simon Pilgrim	5a81fefad3	[X86][AVX512BW] Vectorize v64i8 vector shifts Differential Revision: https://reviews.llvm.org/D28447 llvm-svn: 291665	2017-01-11 10:36:51 +00:00
Simon Pilgrim	c22c889f77	Fix line endings llvm-svn: 291663	2017-01-11 10:25:31 +00:00
Mohammed Agabaria	2c96c43388	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch. updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657	2017-01-11 08:23:37 +00:00
Evandro Menezes	330e1b8945	[AArch64] Consider all vector types for FeatureSlowMisaligned128Store The original code considered only v2i64 as slow for this feature. This patch considers all 128-bit long vector types as slow candidates. In internal tests, extending this feature to all 128-bit vector types resulted in an overall improvement of 1% on Exynos M1. Differential revision: https://reviews.llvm.org/D27998 llvm-svn: 291616	2017-01-10 23:42:21 +00:00
Simon Pilgrim	b6d4fa6551	[CostModel][X86] Add AVX512VL vector shift cost tests. llvm-svn: 291585	2017-01-10 19:04:12 +00:00
Sanjay Patel	baac743254	[ValueTracking] regenerate checks; NFC llvm-svn: 291468	2017-01-09 19:31:20 +00:00
Chandler Carruth	082c183f06	[PM] Teach SCEV to invalidate itself when its dependencies become invalid. This fixes use-after-free bugs that will arise with any interesting use of SCEV. I've added a dedicated test that works diligently to trigger these kinds of bugs in the new pass manager and also checks for them explicitly as well as triggering ASan failures when things go squirrelly. llvm-svn: 291426	2017-01-09 07:44:34 +00:00
Simon Pilgrim	9c58950eeb	[CostModel][X86] Fixed vXi8 uniform shift costs. The 'fast' costs should only work for shifts by uniform constants (uniform non-constant are lowered using the slow default implementation). Logical shifts were not taking into account that we must mask the psrlw result, so the costs needed to be doubled. Added missing AVX2/AVX512BW costs as well. llvm-svn: 291391	2017-01-08 14:14:36 +00:00
Simon Pilgrim	1fa5487c05	[CostModel][X86] Moved legal uniform shift costs earlier. XOP was prematurely matching, doubling the cost of ashr/lshr uniform shifts. llvm-svn: 291390	2017-01-08 13:12:03 +00:00
Simon Pilgrim	9681c407b4	[CostModel][X86] Update SSE41/AVX1 vXi32 SHL costs SSE41 provides pmulld which allows the simpler pslld/paddd/cvttps2dq/pmulld pattern than SSE2's use of pmuludq. llvm-svn: 291372	2017-01-07 22:27:43 +00:00
Simon Pilgrim	a470296367	[CostModel][X86] Fix AVX2 v16i16 shift 'splat' costs. llvm-svn: 291366	2017-01-07 22:08:09 +00:00
Simon Pilgrim	82e3e05fe2	[CostModel][X86] Match 256-bit vector shift 'splat' costs for AVX2 and above We were matching against general vector shift costs before the uniform splat costs llvm-svn: 291365	2017-01-07 21:47:10 +00:00
Simon Pilgrim	a4109d6433	[CostModel][AVX512BW] Add v32i16 vector shift costs for avx512bw targets. llvm-svn: 291354	2017-01-07 17:54:10 +00:00
Simon Pilgrim	a1b8e2c725	[X86][AVX512] Use lowerShuffleAsRepeatedMaskAndLanePermute for non-VBMI v64i8 shuffles (PR31470) llvm-svn: 291347	2017-01-07 15:37:50 +00:00
Simon Pilgrim	9cbcc5ff0b	[CostModel][X86] Add AVX512 and 512-bit vector shift cost tests. llvm-svn: 291269	2017-01-06 19:41:26 +00:00
Chad Rosier	e177185e79	[AArch64] Reduce vector insert/extract cost for Falkor. Differential Revision: https://reviews.llvm.org/D28403 llvm-svn: 291254	2017-01-06 18:03:26 +00:00
Simon Pilgrim	d8333372bc	[CostModel][X86] Fix 512-bit SDIV/UDIV 'big' costs. Set the costs on the lowest target that supports the type. llvm-svn: 291229	2017-01-06 11:12:53 +00:00
Simon Pilgrim	441d1d35d2	[CostModel][X86] Add SDIV/UDIV cost tests for a wider range of targets Added a test demonstrating bug in AVX512 division costs llvm-svn: 291228	2017-01-06 11:02:40 +00:00
Simon Pilgrim	b01e844241	[CostModel][X86] Include the cost of 256-bit upper subvector extract/insertion in AVX1 v4i64 MUL Matches other MUL/ADD/SUB 256-bit case on AVX1 llvm-svn: 291149	2017-01-05 18:20:25 +00:00
Chad Rosier	e20a3a4831	[AArch64][CostModel] Add coverage for bswap intrinsics. llvm-svn: 291140	2017-01-05 16:55:32 +00:00
Simon Pilgrim	bca02f9e20	[CostModel][X86] Add support for broadcast shuffle costs Currently only for broadcasts with input and output of the same width. Differential Revision: https://reviews.llvm.org/D27811 llvm-svn: 291122	2017-01-05 15:56:08 +00:00
Chad Rosier	3ccd1dffff	[AArch64] Remove mcpu option as this test is not target specific. NFC. llvm-svn: 291117	2017-01-05 15:05:03 +00:00
Chad Rosier	e1dc73d9a7	[AArch64] Remove unused arguments from tests. NFC. llvm-svn: 291112	2017-01-05 14:48:53 +00:00
Tobias Grosser	9d88b858c8	Add missing CHECK: line to test case added in 29097 Without this CHECK line, we may not detect incorrectly detected additional regions at the end of the region tree. llvm-svn: 290994	2017-01-04 19:35:38 +00:00
Tobias Grosser	8ab80ba3a2	RegionInfo: add new test case This test case has been reduced from test/Analysis/RegionInfo/mix_1.ll and provides us with a minimal example of a test case which caused problems while working on an improved version of the RegionInfo analysis. We upstream this test case, as it certainly can be helpful in future debugging and optimization tests. Test case reduced by Pratik Bhatu <cs12b1010@iith.ac.in> llvm-svn: 290974	2017-01-04 17:50:15 +00:00
Simon Pilgrim	bb895f3e9c	[CostModel][X86] Updated vXi8 and vXi16 Reverse/Alternate shuffle costs Actual codegen is much better than the extract+insert patterns that were assumed. llvm-svn: 290962	2017-01-04 14:01:33 +00:00
Elena Demikhovsky	d96200d60a	Fixed shuffle-reverse cost on AVX-512. (This change was approved in https://reviews.llvm.org/D28118, but Simon asked to submit it separately). llvm-svn: 290812	2017-01-02 11:44:10 +00:00
Elena Demikhovsky	21706cbd24	AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns. X86 target does not provide any target specific cost calculation for interleave patterns. It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost. In this patch I calculate the cost on AVX-512. It will allow to compare interleave pattern with gather/scatter and choose a better solution (PR31426). * Shuffle-broadcast cost will be changed in Simon's upcoming patch. Differential Revision: https://reviews.llvm.org/D28118 llvm-svn: 290810	2017-01-02 10:37:52 +00:00


1174 Commits