llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	b82438872b	[CostModel][X86] We don't need a scale factor for SLM extract costs D74976 will handle larger vector types, but since SLM doesn't support AVX+ then we will always be extracting from 128-bit vectors so don't need to scale the cost.	2020-02-24 14:23:04 +00:00
Simon Pilgrim	eaa41e103c	[CostModel][X86] Try to check against common prefixes before using target-specific cpu checks SLM/GLM is still a mess so not all of them have been updated yet.	2020-02-24 11:59:07 +00:00
Jonas Paulsson	41bd9ead35	[SystemZ] Return scalarized costs for vector instructions on older archs. A cost query for a vector instruction should return a cost even without target vector support, and not trigger an assert. VectorCombine does this with an input containing source code vectors. Review: Ulrich Weigand	2020-02-21 09:17:37 -08:00
Evgeniy Brevnov	b0761bbc76	[DependenceAnalysis] Memory dependence analysis internal caching mechanism is broken in presence of TBAA (PR42733). Summary: There is a flaw in memory dependence analysis caching mechanism when memory accesses with TBAA are involved. Assume we first analysed and cached results for access with TBAA. Later we request dependence for the same memory but without TBAA (or different TBAA). By design these two queries should share one entry in the internal cache which corresponds to a general access (without TBAA). Thus upon second request internal cache is cleared and we continue analysis for access as if there is no TBAA. The problem is that even though internal cache is cleared the set of visited nodes is not. That means we won't traverse visited nodes again and populate internal cache with the corresponding dependence results. So we end up with internal cache in an incomplete state. Current implementation tries to signal that situation by resetting CacheInfo->Pair at line 1104. But that doesn't actually help since later code ignores this invalidation and relies on 'Cache->empty()' property to decide on cache completeness. Reviewers: reames, hfinkel, chandlerc, fedor.sergeev, asbirlea, fhahn, john.brawn, Prazek, sunfish Reviewed By: john.brawn Subscribers: DaniilSuchkov, kosarev, jfb, dantrushin, hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73032	2020-02-21 20:20:36 +07:00
Sanjay Patel	d799190851	[ConstantFold] fold fsub -0.0, undef to undef rather than NaN A question about this behavior came up on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139003.html ...and as part of backend improvements in D73978, but this is an IR change first because we already have fairly thorough tests in place here. We decided not to implement a more general change that would have folded any FP binop with nearly arbitrary constant + undef operand to undef because that is not theoretically correct (even if it is practically correct). Differential Revision: https://reviews.llvm.org/D74713	2020-02-21 08:03:19 -05:00
Sanjay Patel	7ddbf802cf	[ConstantFold] add/move tests for FP with undef operand; NFC	2020-02-20 15:07:11 -05:00
Hideto Ueno	e253cdda35	[MustExecute] Add backward exploration for must-be-executed-context Summary: As mentioned in D71974, it is useful for must-be-executed-context to explore CFG backwardly. This patch is ported from parts of D64975. We use a dominator tree to find the previous context if a dominator tree is available. Reviewers: jdoerfert, hfinkel, baziotis, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74817	2020-02-20 14:49:30 +09:00
Bardia Mahjour	0a2626d0cd	[DDG] Data Dependence Graph - Graph Simplification Summary: This is the last functional patch affecting the representation of DDG. Here we try to simplify the DDG to reduce the number of nodes and edges by iteratively merging pairs of nodes that satisfy the following conditions, until no such pair can be identified. A pair of nodes consisting of a and b can be merged if: 1. the only edge from a is a def-use edge to b and 2. the only edge to b is a def-use edge from a and 3. there is no cyclic edge from b to a and 4. all instructions in a and b belong to the same basic block and 5. both a and b are simple (single or multi instruction) nodes. These criteria allow us to fold many uninteresting def-use edges that commonly exist in the graph while avoiding the risk of introducing dependencies that didn't exist before. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D72350	2020-02-19 13:41:51 -05:00
Jay Foad	b329d1b06e	[AMDGPU][ConstantFolding] Fold llvm.amdgcn.fmul.legacy intrinsic Reviewers: arsenm, rampitec, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74835	2020-02-19 16:01:30 +00:00
Evgeniy Brevnov	cae643d596	Reverting D73027 [DependenceAnalysis] Dependencies for loads marked with "invariant.load" should not be shared with general accesses(PR42151).	2020-02-14 22:57:23 +07:00
Evgeniy Brevnov	5573abceab	[DependenceAnalysis] Dependencies for loads marked with "invariant.load" should not be shared with general accesses(PR42151). Summary: This is second attempt to fix the problem with incorrect dependencies reported in presence of invariant load. Initial fix (https://reviews.llvm.org/D64405) was reverted due to a regression reported in https://reviews.llvm.org/D70516. The original fix changed caching behavior for invariant loads. Namely such loads are not put into the second level cache (NonLocalDepInfo). The problem with that fix is the first level cache (CachedNonLocalPointerInfo) still works as if invariant loads were in the second level cache. The solution is in addition to not putting dependence results into the second level cache avoid putting info about invariant loads into the first level cache as well. Reviewers: jdoerfert, reames, hfinkel, efriedma Reviewed By: jdoerfert Subscribers: DaniilSuchkov, hiraditya, bmahjour, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73027	2020-02-14 12:18:31 +07:00
Huihui Zhang	5350a48931	[ConstantFold][SVE] Fix constant fold for FoldReinterpretLoadFromConstPtr. Summary: Bail out early for scalable vectors. As global variables are not expected to be scalable. Use explicit call of getFixedSize() to assert on places where scalable size doesn't make sense. Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74424	2020-02-12 10:24:50 -08:00
Ehud Katz	2470d2988a	[ConstantFolding] Fold calls to FP remainder function With the fixed implementation of the "remainder" operation in rG9d0956ebd471, we can now add support to folding calls to it. Differential Revision: https://reviews.llvm.org/D69777	2020-02-12 13:21:18 +02:00
Nicolai Hähnle	ab2f610f38	AMDGPU: llvm.amdgcn.writelane is a source of divergence Summary: Consider: %r = call i32 @llvm.amdgcn.writelane(i32 0, i32 1, i32 2) This produces a value that is 0 on lane 1, and 2 everywhere else; i.e., it is divergent. Reported-by: Marek Olsak <Marek.Olsak@amd.com> Reviewers: arsenm, foad, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74400	2020-02-12 09:12:56 +01:00
Huihui Zhang	88de9338f2	[ConstantFold][SVE] Fix constant fold for vector call. Summary: Do not iterate on scalable vectors. Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74419	2020-02-11 14:06:15 -08:00
Alina Sbirlea	0cecafd647	[BasicAA] Make BasicAA a cfg pass. Summary: Part of the changes in D44564 made BasicAA not CFG only due to it using PhiAnalysisValues which may have values invalidated. Subsequent patches (rL340613) appear to have addressed this limitation. BasicAA should not be invalidated by non-CFG-altering passes. A concrete example is MemCpyOpt which preserves CFG, but we are testing it invalidates BasicAA. llvm-dev RFC: https://groups.google.com/forum/#!topic/llvm-dev/eSPXuWnNfzM Reviewers: john.brawn, sebpop, hfinkel, brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74353	2020-02-11 11:30:08 -08:00
Rachel Craik	1f55420065	[LoopCacheAnalysis]: Add support for negative stride LoopCacheAnalysis currently assumes the loop will be iterated over in a forward direction. This patch addresses the issue by using the absolute value of the stride when iterating backwards. Note: this patch will treat negative and positive array access the same, resulting in the same cost being calculated for single and bi-directional access patterns. This should be improved in a subsequent patch. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D73064	2020-02-10 13:22:35 -05:00
Huihui Zhang	5389ca7a1f	[ConstantFold][NFC] Move scalable vector unit tests under vscale.ll	2020-02-05 16:03:51 -08:00
Huihui Zhang	801857c59e	[ConstantFold][SVE] Fix constant folding for bitcast. Do not iterate on scalable vector type in BitCastConstantVector. Continuation work of D70985, D71147. Support for folding bitcast into splat value is kept in D74095, as it depends on D71637. Differential Revision: https://reviews.llvm.org/D71389	2020-02-05 15:39:57 -08:00
Christopher Tetreault	b03f3fbd6a	Reapply: [SVE] Fix bug in simplification of scalable vector instructions This reverts commit `a05441038a`, reapplying commit `31574d38ac`	2020-02-05 10:00:09 -08:00
Matt Arsenault	096cd991ee	AMDGPU: Fix divergence analysis of control flow intrinsics The mask results of these should be uniform. The trickier part is the dummy booleans used as IR glue need to be treated as divergent. This should make the divergence analysis results correct for the IR the DAG is constructed from. This should allow us to eliminate requiresUniformRegister, which has an expensive, recursive scan over all users looking for control flow intrinsics. This should avoid recent compile time regressions.	2020-02-05 09:30:54 -08:00
Matt Arsenault	4f9f5d09de	AMDGPU: Fix isAlwaysUniform for simple asm SGPR results We were handling the case where the result was a struct with an extracted SGPR component, but not for the simple case.	2020-02-04 13:34:14 -08:00
Matt Arsenault	cb7b661d3d	AMDGPU: Analyze divergence of inline asm	2020-02-03 12:42:16 -08:00
Reid Kleckner	a05441038a	Revert "[SVE] Fix bug in simplification of scalable vector instructions" This reverts commit `31574d38ac`. The newly added shufflevector test does not pass locally on either of my workstations.	2020-02-03 11:12:09 -08:00
Christopher Tetreault	31574d38ac	[SVE] Fix bug in simplification of scalable vector instructions Summary: * Most of the simplifications in SimplifyShuffleVectorInst depend on the concrete value of, or the length of the mask vector. For scalable vectors, this cannot be known at compile time. ** for these tests, detect if the vector is scalable before attempting the transformation * The functions ShuffleVectorInst::getMaskValue and ShuffleVectorInst::getShuffleMask access the value of the constant mask. However, since the length of the mask is unknown at compile time, these function do not work for scalable vectors. Add asserts to ensure that the input mask is not scalable Reviewers: efriedma, sdesmalen, apazos, chrisj, huihuiz Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73555	2020-02-03 10:15:56 -08:00
Huihui Zhang	b0d25fff9b	[ConstantFold][SVE][NFC] Add test for select instruction in scalable vector. Side notes from D73669, no need to guard the iteration on vectors, as it is explicitly looking for a ConstantVector/ConstantDataVector, which is not expected to be scalable at the moment. So, add the test only.	2020-01-30 10:56:12 -08:00
Huihui Zhang	34e6552dcb	[ConstantFold][SVE] Fix constant folding for scalable vector unary operations. Summary: Similar to issue D71445. Scalable vector should not be evaluated element by element. Add support to handle scalable vector UndefValue. Reviewers: sdesmalen, efriedma, apazos, huntergr, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73678	2020-01-30 10:45:15 -08:00
Craig Topper	35625464c6	[X86] Fix the cost model for v16i16->v16i32 zero_extend/sign_extend with AVX2 We seem to be inheriting the cost from sse4.1. But if we have 256-bit registers we should be able to do this with just one extract to split the 16i16 and two v8i16->v8i32 operations so our cost should be 3 not 4. Differential Revision: https://reviews.llvm.org/D73646	2020-01-29 15:52:10 -08:00
Huihui Zhang	d2e2fc450e	[ConstantFold][SVE] Fix constant folding for scalable vector binary operations. Summary: Scalable vector should not be evaluated element by element. Add support to handle scalable vector UndefValue. Reviewers: sdesmalen, huntergr, spatel, lebedev.ri, apazos, efriedma, willlovett Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71445	2020-01-29 10:49:08 -08:00
Evgenii Stepanov	34ab56904e	Support zero size types in StackSafetyAnalysis. Reviewers: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73395	2020-01-27 15:22:59 -08:00
Evgenii Stepanov	c3b80adcee	Fix StackSafetyAnalysis crash with scalable vector types. Summary: Treat scalable allocas as if they have storage size of 0, and scalable-typed memory accesses as if their range is unlimited. This is not a proper support of scalable vector types in the analysis - we can do better, but not today. Reviewers: vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73394	2020-01-27 15:22:59 -08:00
Roman Lebedev	9c801c48ee	[NFC][IndVarSimplify] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668)	2020-01-27 23:34:29 +03:00
Craig Topper	b1f3a0f972	Revert `a107f86` "[GlobalsAA] Add back a check to intrinsic_addresstaken.ll to see if the AVX and AVX512 bots still fail for it." It still fails some buildbots which is what I was trying to test.	2020-01-24 13:15:23 -08:00
Craig Topper	a107f86417	[GlobalsAA] Add back a check to intrinsic_addresstaken.ll to see if the AVX and AVX512 bots still fail for it. These bots failed for this several months ago and as a result, this check was removed. If they still fail I'm going to try to see if I can figure out why.	2020-01-24 11:54:23 -08:00
Austin Kerbow	c226646337	Resubmit: [DA][TTI][AMDGPU] Add option to select GPUDA with TTI Summary: Enable the new divergence analysis by default for AMDGPU. Resubmit with test updates since GPUDA was causing failures on Windows. Reviewers: rampitec, nhaehnle, arsenm, thakis Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73315	2020-01-24 10:39:40 -08:00
Austin Kerbow	37aa16ebb7	[DA] Don't propagate from unreachable blocks Summary: Fixes crash that could occur when a divergent terminator has an unreachable parent. Reviewers: rampitec, nhaehnle, arsenm Subscribers: jvesely, wdng, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73323	2020-01-24 10:28:11 -08:00
David Green	e9c198278e	[ARM] Basic gather scatter cost model This is a very basic MVE gather/scatter cost model, based roughly on the code that we will currently produce. It does not handle truncating scatters or extending gathers correctly yet, as it is difficult to tell that they are going to be correctly extended/truncated from the limited information in the cost function. This can be improved as we extend support for these in the future. Based on code originally written by David Sherwood. Differential Revision: https://reviews.llvm.org/D73021	2020-01-22 14:41:38 +00:00
David Green	0b83e14804	[ARM] MVE Gather Scatter cost model tests. NFC	2020-01-22 14:41:38 +00:00
Florian Hahn	0b21d55262	[IR] Mark memset.* intrinsics as IntrWriteMem. llvm.memset intrinsics do only write memory, but are missing IntrWriteMem, so they doesNotReadMemory() returns false for them. The test change is due to the test checking the fn attribute ids at the call sites, which got bumped up due to a new combination with writeonly appearing in the test file. Reviewers: jdoerfert, reames, efriedma, nlopes, lebedev.ri Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D72789	2020-01-16 10:35:46 +00:00
Zheng Chen	a6342c247a	[SCEV] accurate range for addrecexpr with nuw flag If addrecexpr has nuw flag, the value should never be less than its start value and start value does not required to be SCEVConstant. Reviewed By: nikic, sanjoy Differential Revision: https://reviews.llvm.org/D71690	2020-01-12 20:22:37 -05:00
Zheng Chen	569ccfc384	[SCEV] more accurate range for addrecexpr with nsw flag. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D72436	2020-01-11 23:26:35 -05:00
Zheng Chen	a701be8f03	[SCEV] [NFC] add more test cases for range of addrecexpr with nsw flag	2020-01-10 22:44:47 -05:00
Zheng Chen	4ebb589629	[SCEV] [NFC] add testcase for constant range for addrecexpr with nsw flag	2020-01-09 01:26:57 -05:00
Simon Pilgrim	5d986a68a5	[CostModel][X86] Add missing scalar i64->f32 uitofp costs	2020-01-06 13:17:02 +00:00
Fangrui Song	a36ddf0aa9	Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351	2019-12-24 16:27:51 -08:00
Fangrui Song	eb16435b5e	Migrate function attribute "no-frame-pointer-elim-non-leaf" to "frame-pointer"="non-leaf" as cleanups after D56351	2019-12-24 16:05:15 -08:00
Fangrui Song	502a77f125	Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351	2019-12-24 15:57:33 -08:00
czhengsz	7259f04dde	[SCEV] add testcase for get accurate range for addrecexpr with nuw flag	2019-12-22 20:58:19 -05:00
Bardia Mahjour	86acaa9457	[DDG] Data Dependence Graph - Ordinals Summary: This patch associates ordinal numbers to the DDG Nodes allowing the builder to order nodes within a pi-block in program order. The algorithm works by simply assuming the order in which the BBList is fed into the builder. The builder already relies on the blocks being in program order so that it can compute the dependencies correctly. Similarly the order of instructions in their parent basic blocks determine their program order. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D70986	2019-12-19 10:57:33 -05:00
czhengsz	d588a00206	[SCEV] NFC - add testcase for get accurate range for AddExpr	2019-12-19 04:11:45 -05:00

1934 Commits