llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	b70026c43c	[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches. X86 at least is able to use movmsk or kmov to move the mask to the scalar domain. Then we can just use test instructions to test individual bits. This is more efficient than extracting each mask element individually. I special cased v1i1 to use the previous behavior. This avoids poor type legalization of bitcast of v1i1 to i1. I've skipped expandload/compressstore as I think we need to handle constant masks for those better first. Many tests end up with duplicate test instructions due to tail duplication in the branch folding pass. But the same thing happens when constructing similar code in C. So it's not unique to the scalarization. Not sure if this lowering code will also be good for other targets, but we're only testing X86 today. Differential Revision: https://reviews.llvm.org/D65319 llvm-svn: 367489	2019-07-31 22:58:15 +00:00
Philip Reames	f8e7b53657	[IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work) This is a preparatory patch for future work on supporting exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes. The test differences are caused by cases where getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop. llvm-svn: 367485	2019-07-31 21:15:21 +00:00
Alina Sbirlea	7153f2784c	[SCCP] Update condition to avoid overflow. Summary: Update condition to remove addition that may cause an overflow. Resolves PR42814. Reviewers: sanjoy, RKSimon Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65417 llvm-svn: 367461	2019-07-31 18:22:22 +00:00
Sanjay Patel	435cdecdf7	[InstCombine] canonicalize fneg before fmul/fdiv Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it easier to implement the transforms (and possibly other fneg transforms) in 1 place because we can always start the pattern match from fneg (either the legacy binop or the new unop). There's a secondary practical benefit seen in PR21914 and PR42681: https://bugs.llvm.org/show_bug.cgi?id=21914 https://bugs.llvm.org/show_bug.cgi?id=42681 ...hoisting fneg rather than sinking seems to play nicer with LICM in IR (although this change may expose analysis holes in the other direction). 1. The instcombine test changes show the expected neutral IR diffs from reversing the order. 2. The reassociation tests show that we were missing an optimization opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says that all of these transforms are allowed (regardless of binop/unop fneg version) because: "For all other operations [besides copy/abs/negate/copysign], this standard does not specify the sign bit of a NaN result." In all of these transforms, we always have some other binop (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a potential intermediate NaN operand. (If that interpretation is wrong, then we must already have a bug in the existing transforms?) 3. The clang tests shouldn't exist as-is, but that's effectively a revert of rL367149 (the test broke with an extension of the pre-existing fneg canonicalization in rL367146). Differential Revision: https://reviews.llvm.org/D65399 llvm-svn: 367447	2019-07-31 16:53:22 +00:00
Stanislav Mekhanoshin	ba1e845c21	[AMDGPU] Fix for vectorizer crash with pointers of different size When vectorizer strips pointers it can eventually end up with pointers of two different sizes, then SCEV will crash. Differential Revision: https://reviews.llvm.org/D65480 llvm-svn: 367443	2019-07-31 16:33:11 +00:00
Roman Lebedev	8d76284599	[NFC][InstCombine] Add xor-or-icmp tests with icmp having extra uses Currently InstCombiner::foldXorOfICmps() bails out if the ICMP it wants to invert has extra uses. As can be seen in the tests in the previous commit, this is super unfortunate; this is the single pattern that is left non-canonicalized. We could analyze if we can also invert all the uses of said ICMP at the same time, thus not bailing out there. I'm not seeing any nicer alternative. llvm-svn: 367439	2019-07-31 15:20:33 +00:00
Roman Lebedev	67688af5f0	[NFC][InstCombine] Add baseline tests with non-canonical CLAMP pattern As discussed in https://reviews.llvm.org/D65148#1603922 these would all need to be canonicalized to the traditional clamp pattern. llvm-svn: 367438	2019-07-31 15:20:21 +00:00
Florian Hahn	fa42f42858	[IPSCCP] Move callsite check to the beginning of the loop. We have some code that marks instructions with struct operands as overdefined, but if the instruction is a call to a function with tracked arguments, this breaks the assumption that the lattice values of all call sites are not overdefined and will be replaced by a constant. This also re-adds the assertion from D65222, with additionally skipping non-callsite uses. This patch should address the cases reported in which the assertion fired. Fixes PR42738. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65439 llvm-svn: 367430	2019-07-31 12:57:04 +00:00
Roman Lebedev	a686c60c45	[DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673) Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. The original patch was committed in rL367288 but was reverted in rL367289 because it exposed pre-existing RAUW issues in internal data structures of the pass; those now have been addressed in a previous patch. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367419	2019-07-31 12:06:51 +00:00
Roman Lebedev	5f616901f5	[DivRemPairs] Avoid RAUW pitfalls (PR42823) Summary: `DivRemPairs` internally creates two maps: * {sign, dividend, divisor} -> div instruction * {sign, dividend, divisor} -> rem instruction Then it iterates over rem map, and looks if there is an entry in div map with the same key. Then depending on some internal logic it may RAUW rem instruction with something else. But if that rem instruction is an input to other div/rem, then it was used as a key in these maps, so the old value (used in key) is now dangling, because RAUW didn't update those maps. And we can't even RAUW map keys in general, there's `ValueMap`, but we don't have a single `Value` as key... The bug was discovered via D65298, and the test there exists. Now, I'm not sure how to expose this issue in trunk. The bug is clearly there if I change the map keys to be `AssertingVH`/`PoisoningVH`, but I guess this didn't miscompile anything thus far? I really don't think this is benign without that patch. The fix is actually rather straight-forward - instead of trying to somehow shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new `ValueMap` with key being a struct of `Value`s, we can just have an intermediate data structure - a vector, each entry containing matching `Div, Rem` pair, and pre-filling it before doing any modifications. This way we won't need to query map after doing RAUW, so no bug is possible. Reviewers: spatel, bogner, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, hans, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65451 llvm-svn: 367417	2019-07-31 12:06:38 +00:00
Roman Lebedev	0d60480737	[DivRemPairs][NFC] Autogenerate all checklines llvm-svn: 367415	2019-07-31 12:06:16 +00:00
Florian Hahn	189efe295b	Recommit "[GVN] Preserve loop related analysis/canonical forms." This fixes some pipeline tests. This reverts commit `d0b6f42936`. llvm-svn: 367401	2019-07-31 09:27:54 +00:00
Florian Hahn	d0b6f42936	Revert [GVN] Preserve loop related analysis/canonical forms. This reverts r367332 (git commit `2d7227ec3a`) llvm-svn: 367335	2019-07-30 17:04:58 +00:00
Florian Hahn	2d7227ec3a	[GVN] Preserve loop related analysis/canonical forms. LoopInfo can be easily preserved by passing it to the functions that modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor). SplitCriticalEdge also preserves LoopSimplify and LCSSA form when passing in LoopInfo. The test case shows that we preserve LoopSimplify and LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some reason. Also I am not sure if it is possible to preserve those in the new pass manager, as they aren't analysis passes. Reviewers: reames, hfinkel, davide, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D65137 llvm-svn: 367332	2019-07-30 16:43:39 +00:00
Kit Barton	de0b633999	[LoopFusion] Extend use of OptimizationRemarkEmitter Summary: This patch extends the use of the OptimizationRemarkEmitter to provide information about loops that are not fused, and loops that are not eligible for fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops that are not eligible for fusion and the OptimizationRemarkMissed to identify loops that cannot be fused. It also reuses the statistics to provide the messages used in the OptimizationRemarks. This provides common message strings between the optimization remarks and the statistics. I would like feedback on this approach, in general. If people are OK with this, I will flesh out additional remarks in subsequent commits. Subscribers: hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63844 llvm-svn: 367327	2019-07-30 15:58:43 +00:00
Roman Lebedev	5e0adce40f	[DivRemPairs] Add srem-of-srem tests (PR42823, D65298, D65451) The @srem_of_srem_expanded case exposed a RAUW pitfall in D65298. Right now these don't appear to fail verification, so it should be safe to precommit them. https://reviews.llvm.org/D65298 https://bugs.llvm.org/show_bug.cgi?id=42823 https://reviews.llvm.org/D65451 llvm-svn: 367325	2019-07-30 15:46:03 +00:00
Roman Lebedev	be612ea471	[InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two Summary: I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling. While we did happen to handle this fold in most of the cases, the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two), or first turn `x s% -y` to `x u% y`; that does handle most of the cases. But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`, and thus we end up being stuck with `(x s% INT_MIN) == 0`. There is no such restriction for the more general fold: https://rise4fun.com/Alive/IIeS To be noted, the fold does not enforce that `y` is a constant, so it may indeed increase instruction count. This is consistent with what `x u% y`->`x & (y-1)` already does. I think it makes sense, it's at most one (simple) extra instruction, while `rem`ainder is really much more un-simple (and likely very costly). Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65046 llvm-svn: 367322	2019-07-30 15:28:22 +00:00
Sam Parker	e3a4a13fcc	[ARM][LowOverheadLoops] Enable by default The code is now in a good enough state to pass the bunch of tests that I have run (after fixing the bugs), so let's enable it by default. Differential Revision: https://reviews.llvm.org/D65277 llvm-svn: 367297	2019-07-30 08:14:28 +00:00
Roman Lebedev	8e0cf076ac	Revert "[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)" test-suite/MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG broke: Only PHI nodes may reference their own value! %sub33 = srem i32 %sub33, %ranks_in_i This reverts commit r367288. llvm-svn: 367289	2019-07-30 07:44:58 +00:00
Roman Lebedev	c75cdd056f	[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673) Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367288	2019-07-30 07:10:00 +00:00
Peter Collingbourne	dd9682196b	ThinLTOBitcodeWriter: Include globals associated with type metadata globals in the merged module. Globals that are associated with globals with type metadata need to appear in the merged module because they will reference the global's section directly. Differential Revision: https://reviews.llvm.org/D65312 llvm-svn: 367242	2019-07-29 17:22:40 +00:00
Cameron McInally	b32a6592eb	[NFC][FPEnv] Pre-commit tests for canonicalize negated operand of fdiv. llvm-svn: 367233	2019-07-29 16:09:56 +00:00
Sanjay Patel	e9ee7b47d4	[InstCombine] fold fadd+fneg with fdiv/fmul between The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367227	2019-07-29 13:50:25 +00:00
Hideto Ueno	98d281a99f	[ValueTracking] Remove volatile check in isGuaranteedToTransferExecutionToSuccessor Summary: As clarified in D53184, volatile load and store do not trap. Therefore, we should remove volatile checks for instructions in `isGuaranteedToTransferExecutionToSuccessor`. Reviewers: jdoerfert, efriedma, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65375 llvm-svn: 367226	2019-07-29 13:35:34 +00:00
Sanjay Patel	74c35bd6b0	[InstCombine] add tests for fadd with negated operand; NFC llvm-svn: 367222	2019-07-29 12:49:36 +00:00
Roman Lebedev	6ff633ddc4	[NFC][InstCombine] Revisit tests in shift-amount-reassociation-with-truncation-shl.ll llvm-svn: 367196	2019-07-28 21:31:58 +00:00
Sanjay Patel	99c57c6daf	[InstCombine] fold fsub+fneg with fdiv/fmul between The backend already does this via isNegatibleForFree(), but we may want to alter the fneg IR canonicalizations that currently exist, so we need to try harder to fold fneg in IR to avoid regressions. llvm-svn: 367194	2019-07-28 17:10:06 +00:00
Roman Lebedev	d5bc4b09f1	[NFC][InstCombine] Shift amount reassociation: can have trunc between shl's https://rise4fun.com/Alive/OQbM Not so simple for lshr/ashr, so those maybe later. https://bugs.llvm.org/show_bug.cgi?id=42391 llvm-svn: 367189	2019-07-28 13:13:46 +00:00
Hideto Ueno	e7bea9b73a	[Attributor] Deduce "align" attribute Summary: Deduce "align" attribute in attributor. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64152 llvm-svn: 367187	2019-07-28 07:04:01 +00:00
Hideto Ueno	cc0a4cdc89	[FunctionAttrs] Annotate "willreturn" for intrinsics Summary: In D62801, a new function attribute `willreturn` was introduced. In short, a function with `willreturn` is guaranteed to come back to the call site (a more precise definition is in LangRef). In this patch, willreturn is annotated for LLVM intrinsics. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64904 llvm-svn: 367184	2019-07-28 06:09:56 +00:00
Sanjay Patel	02b9e45a7e	[InstSimplify] remove quadratic time looping (PR42771) The test case from: https://bugs.llvm.org/show_bug.cgi?id=42771 ...shows a ~30x slowdown caused by the awkward loop iteration (rL207302) that is seemingly done just to avoid invalidating the instruction iterator. We can instead delay instruction deletion until we reach the end of the block (or we could delay until we reach the end of all blocks). There's a test diff here for a degenerate case with llvm.assume that is not meaningful in itself, but serves to verify this change in logic. This change probably doesn't result in much overall compile-time improvement because we call '-instsimplify' as a standalone pass only once in the standard -O2 opt pipeline currently. Differential Revision: https://reviews.llvm.org/D65336 llvm-svn: 367173	2019-07-27 14:05:51 +00:00
Sanjay Patel	d20a0fe203	[InstCombine] add tests for fsub with negated operand; NFC llvm-svn: 367156	2019-07-26 21:12:22 +00:00
Wei Mi	55a68a2400	[JumpThreading] Stop searching predecessor when the current bb is in an unreachable loop. updatePredecessorProfileMetadata in jumpthreading tries to find the first dominating predecessor block for a PHI value by searching up the predecessor block chain. But jumpthreading may see some temporary IR state which contains unreachable bb not being cleaned up. If an unreachable loop happens to be on the predecessor block chain, continuing to chase the predecessor block will run into an infinite loop. The patch fixes it. Differential Revision: https://reviews.llvm.org/D65310 llvm-svn: 367154	2019-07-26 20:59:22 +00:00
Sanjay Patel	a9ab31558c	[InstCombine] canonicalize negated operand of fdiv This is a transform that we use with fmul, so use it for fdiv too for consistency. llvm-svn: 367146	2019-07-26 19:56:59 +00:00
Sanjay Patel	487e957775	[InstCombine] add tests for fdiv with negated operand; NFC llvm-svn: 367145	2019-07-26 19:44:53 +00:00
Sanjay Patel	c229cfeb7a	[InstCombine] remove flop from lerp patterns (Y * (1.0 - Z)) + (X * Z) --> Y - (Y * Z) + (X * Z) --> Y + Z * (X - Y) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=42716 Factoring eliminates an instruction, so that should be a good canonicalization. The potential conversion to FMA would be handled by the backend based on target capabilities. Differential Revision: https://reviews.llvm.org/D65305 llvm-svn: 367101	2019-07-26 11:19:18 +00:00
Sanjay Patel	8f15d40555	[InstCombine] add tests for lerp patterns (PR42716); NFC llvm-svn: 367069	2019-07-25 22:25:21 +00:00
Florian Hahn	c74808b914	[PredicateInfo] Replace pointer comparisons with deterministic compares. Currently there are a few pointer comparisons in ValueDFS_Compare, which can cause non-deterministic ordering when materializing values. There are 2 cases this patch fixes: 1. Order defs before uses used to compare pointers, which guarantees defs before uses, but causes non-deterministic ordering between 2 uses or 2 defs, depending on the allocation order. By converting the pointers to booleans, we can circumvent that problem. 2. comparePHIRelated was comparing the basic block pointers of edges, which also results in a non-deterministic order and is also not really meaningful for ordering. By ordering by their destination DFS numbers we guarantee a deterministic order. For the example below, we can end up with 2 different uselist orderings, when running `opt -mem2reg -ipsccp` hundreds of times. Because the non-determinism is caused by allocation ordering, we cannot reproduce it with ipsccp alone. declare i32 @hoge() local_unnamed_addr #0 define dso_local i32 @ham(i8* %arg, i8* %arg1) #0 { bb: %tmp = alloca i32 %tmp2 = alloca i32, align 4 br label %bb19 bb4: ; preds = %bb20 br label %bb6 bb6: ; preds = %bb4 %tmp7 = call i32 @hoge() store i32 %tmp7, i32* %tmp %tmp8 = load i32, i32* %tmp %tmp9 = icmp eq i32 %tmp8, 912730082 %tmp10 = load i32, i32* %tmp br i1 %tmp9, label %bb11, label %bb16 bb11: ; preds = %bb6 unreachable bb13: ; preds = %bb20 br label %bb14 bb14: ; preds = %bb13 %tmp15 = load i32, i32* %tmp br label %bb16 bb16: ; preds = %bb14, %bb6 %tmp17 = phi i32 [ %tmp10, %bb6 ], [ 0, %bb14 ] br label %bb19 bb18: ; preds = %bb20 unreachable bb19: ; preds = %bb16, %bb br label %bb20 bb20: ; preds = %bb19 indirectbr i8* null, [label %bb4, label %bb13, label %bb18] } Reviewers: davide, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64866 llvm-svn: 367049	2019-07-25 20:48:13 +00:00
Roman Lebedev	aa205957ff	[NFC][DivRemPairs] Tests with rem in expanded form (PR42673) As discussed in https://bugs.llvm.org/show_bug.cgi?id=42673 there is a TTI hook hasDivRemOp() that matters here. While -div-rem-pairs will decompose 'rem' if that hook returns false, nothing does the opposite transform. We can't do this in InstCombine, because it does not currently access TTI, and I'm not sure we should change that. We can't really do that in DAGCombine since it also currently does not access TTI. Therefore only DivRemPairs is left. https://bugs.llvm.org/show_bug.cgi?id=42673 llvm-svn: 367046	2019-07-25 20:26:34 +00:00
Serguei Katkov	cde00c02e1	[Loop Peeling] Fix idom detection algorithm. We'd like to determine the idom of the exit block after peeling one iteration. Let Exit be the exit block. Let ExitingSet be the set of predecessors of the Exit block. They are exiting blocks. Let Latch' and ExitingSet' be copies after a peeling. We'd like to find idom'(Exit) - the idom of Exit after peeling. It is evident that idom'(Exit) will be the nearest common dominator of ExitingSet and ExitingSet'. idom(Exit) is the nearest common dominator of ExitingSet. idom(Exit)' is the nearest common dominator of ExitingSet'. Taking into account that we have a single Latch, Latch' will dominate Header and idom(Exit). So idom'(Exit) is the nearest common dominator of idom(Exit)' and Latch'. All these basic blocks are in the same loop, so what we find is (nearest common dominator of idom(Exit) and Latch)'. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D65292 llvm-svn: 367044	2019-07-25 19:31:50 +00:00
Sanjay Patel	b456310902	[SimplifyCFG] avoid crashing after simplifying a switch (PR42737) Later code in TryToSimplifyUncondBranchFromEmptyBlock() assumes that we have cleaned up unreachable blocks, but that was not happening with this switch transform. llvm-svn: 367037	2019-07-25 17:01:12 +00:00
Vlad Tsyrklevich	5d5a58317c	Revert "[InstCombine] try to narrow a truncated load" This reverts commit `bc4a63fd3c`, this is a speculative revert to fix a number of sanitizer bots (like sanitizer-x86_64-linux-bootstrap-ubsan) that have started to see stage2 compiler crashes, presumably due to a miscompile. llvm-svn: 367029	2019-07-25 15:37:57 +00:00
Florian Hahn	c0d0e3bda8	[PredicateInfo] Use SmallVector instead of SmallPtrSet. We do not need the SmallPtrSet to avoid adding duplicates to OpsToRename, because we already keep a ValueInfo mapping. If we see an op for the first time, Infos will be empty and we can also add it to OpsToRename. We process operands by visiting BBs depth-first and then iterate over all instructions & users, so the order should be deterministic. Therefore we can skip one round of sorting, which we purely needed for guaranteeing a deterministic order when iterating over the SmallPtrSet. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D64816 llvm-svn: 367028	2019-07-25 15:35:10 +00:00
Sanjay Patel	bc4a63fd3c	[InstCombine] try to narrow a truncated load trunc (load X) --> load (bitcast X to narrow type) We have this transform in DAGCombiner::ReduceLoadWidth(), but the truncated load pattern can interfere with other instcombine transforms, so I'd like to allow the fold sooner. Example: https://bugs.llvm.org/show_bug.cgi?id=16739 ...in that report, we have bitcasts bracketing these ops, so those could get eliminated too. We've generally ruled out widening of loads early in IR ( LoadCombine - http://lists.llvm.org/pipermail/llvm-dev/2016-September/105291.html ), but that reasoning may not apply to narrowing if we can preserve information such as the dereferenceable range. Differential Revision: https://reviews.llvm.org/D64432 llvm-svn: 367011	2019-07-25 12:14:27 +00:00
Craig Topper	e9abc8177a	[InstCombine] Teach foldOrOfICmps to allow icmp eq MIN_INT/MAX to be part of a range comparison. Similar for foldAndOfICmps We can treat icmp eq X, MIN_UINT as icmp ule X, MIN_UINT and allow it to merge with icmp ugt X, C. Similar for the other constants. We can do something similar for icmp ne X, (U)INT_MIN/MAX in foldAndOfICmps. And we already handled UINT_MIN there. Fixes PR42691. Differential Revision: https://reviews.llvm.org/D65017 llvm-svn: 366945	2019-07-24 20:57:29 +00:00
David Bolvansky	db913d9618	[InstCombine] Adjusted pow-exp tests for Windows [NFC] Summary: https://bugs.llvm.org/show_bug.cgi?id=42740 Reviewers: efriedma, hans Reviewed By: hans Subscribers: spatel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65220 llvm-svn: 366925	2019-07-24 17:01:20 +00:00
Matt Arsenault	0b7f226311	AMDGPU: Fix test after r366913 llvm-svn: 366916	2019-07-24 16:05:55 +00:00
Sanjay Patel	3624074426	[InstCombine] add tests for load narrowing; NFC Baseline results for D64432. llvm-svn: 366901	2019-07-24 12:44:21 +00:00
Petr Hosek	8b161bacf4	[SafeStack] Insert the deref before remaining elements This is a follow up to D64971. While we need to insert the deref after the offset, it needs to come before the remaining elements in the original expression since the deref needs to happen before the LLVM fragment if present. Differential Revision: https://reviews.llvm.org/D65172 llvm-svn: 366865	2019-07-24 00:16:23 +00:00
Philip Reames	ea5c94b497	[IndVars] Fix a subtle bug in optimizeLoopExits The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts. This could cause two separate bugs: 1) We might exit the loop early, and leave optimizations undone. This is what triggered the assertion failure in the reported test case. 2) We might optimize one exit, then exit without indicating a change. This could result in an analysis invalidation bug if no other transform is done by the rest of indvars. Note that the pointer exit counts are a really fragile concept. They show up only when we have a pointer IV w/o a datalayout to provide their size. It's really questionable to me whether the complexity implied is worth it. llvm-svn: 366829	2019-07-23 17:45:11 +00:00
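
The fold described in commit be612ea471 ("x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two) can be sanity-checked by brute force on a small bit width. The following is only an illustrative sketch, not LLVM code: the 8-bit width and the helper names `to_signed`, `srem`, and `fold_holds` are assumptions made for this example. It models `srem` as a truncating remainder and includes the INT_MIN divisor (bit pattern 0x80) that the commit message singles out.

```python
BITS = 8                      # assumed width, small enough to enumerate
MASK = (1 << BITS) - 1


def to_signed(bits):
    """Interpret an 8-bit pattern as a two's-complement signed value."""
    bits &= MASK
    return bits - (1 << BITS) if bits & (1 << (BITS - 1)) else bits


def srem(x, y):
    """Signed remainder that truncates toward zero (sign follows the dividend)."""
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return x - q * y


def fold_holds():
    """Check (x % y == 0) == ((x & (y - 1)) == 0) for every power-of-two y."""
    for shift in range(BITS):
        y_bits = 1 << shift   # shift == BITS - 1 gives 0x80, i.e. INT_MIN when signed
        for x_bits in range(1 << BITS):
            masked_is_zero = (x_bits & (y_bits - 1)) == 0
            # Unsigned remainder on the raw bit patterns.
            if ((x_bits % y_bits) == 0) != masked_is_zero:
                return False
            # Signed remainder on the two's-complement interpretation.
            if (srem(to_signed(x_bits), to_signed(y_bits)) == 0) != masked_is_zero:
                return False
    return True


if __name__ == "__main__":
    print(fold_holds())
```

The check works on bit patterns, so the mask `y - 1` is the same for the signed and unsigned comparisons; only the remainder itself depends on the interpretation.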


13087 Commits
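
The lerp rewrite in commit c229cfeb7a, (Y * (1.0 - Z)) + (X * Z) --> Y + Z * (X - Y), is an algebraic identity that saves one operation (four operations become three). As a rough sketch, the identity can be checked with exact rational arithmetic; the function names and sample values below are made up for illustration. With binary floating point the two forms can round differently, so this check says nothing about when a compiler may legally apply the rewrite.

```python
from fractions import Fraction
from itertools import product


def lerp_original(x, y, z):
    """Original form: one subtract, two multiplies, one add."""
    return y * (1 - z) + x * z


def lerp_factored(x, y, z):
    """Factored form: one subtract, one multiply, one add."""
    return y + z * (x - y)


def forms_agree_exactly(samples):
    """Compare both forms over every (x, y, z) triple drawn from samples."""
    return all(
        lerp_original(x, y, z) == lerp_factored(x, y, z)
        for x, y, z in product(samples, repeat=3)
    )


# Hypothetical sample points; exact rationals avoid rounding effects.
SAMPLES = [Fraction(n, 7) for n in range(-5, 6)]

if __name__ == "__main__":
    print(forms_agree_exactly(SAMPLES))
```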