llvm-project

Commit Graph

Author	SHA1	Message	Date
Clement Courbet	eaf4413d2d	Revert r360771 "[MergeICmps] Simplify the code." Breaks a bunch of buildbots. llvm-svn: 360776	2019-05-15 14:21:59 +00:00
Stephen Tozer	0d02f2ff4f	Revert "[Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed" This reverts r360772 due to build issues. Reverted commit: `17dd4d7403`. llvm-svn: 360773	2019-05-15 13:41:44 +00:00
Stephen Tozer	17dd4d7403	[Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=40645 Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. With the recent addition of DW_OP_LLVM_convert this salvaging is now possible, and so can be used to fix the attached bug as well as any cases where SExt instruction results are lost in the debugging metadata. This patch introduces this fix by expanding the salvage debug info method to cover these cases using the new operator. Differential revision: https://reviews.llvm.org/D61184 llvm-svn: 360772	2019-05-15 13:15:48 +00:00
Clement Courbet	157ae639fa	[MergeICmps] Simplify the code. Instead of patching the original blocks, we now generate new blocks and delete the old blocks. This results in simpler code with a less twisted control flow (see the change in `entry-block-shuffled.ll`). This will make https://reviews.llvm.org/D60318 simpler by making it more obvious where control flow is created and deleted. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits, spatel Tags: #llvm Differential Revision: https://reviews.llvm.org/D61736 llvm-svn: 360771	2019-05-15 13:04:24 +00:00
Roman Lebedev	da08fae397	[NFC][InstCombine] Regenerate trunc.ll test llvm-svn: 360759	2019-05-15 10:24:38 +00:00
Fangrui Song	5296e2809f	Fix 2-field llvm.global_ctors `REQUIRES: asserts` tests after rL360742 llvm-svn: 360743	2019-05-15 03:08:21 +00:00
Fangrui Song	f4dfd63c74	[IR] Disallow llvm.global_ctors and llvm.global_dtors of the 2-field form in textual format The 3-field form was introduced by D3499 in 2014 and the legacy 2-field form was planned to be removed in LLVM 4.0 For the textual format, this patch migrates the existing 2-field form to use the 3-field form and deletes the compatibility code. test/Verifier/global-ctors-2.ll checks we have a friendly error message. For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the 2-field form (add i8* null as the third field). Reviewed By: rnk, dexonsmith Differential Revision: https://reviews.llvm.org/D61547 llvm-svn: 360742	2019-05-15 02:35:32 +00:00
Florian Hahn	53c9d585b5	[LICM] Allow AliasSetMap to contain top-level loops. When an outer loop gets deleted by a different pass, before LICM visits it, we cannot clean up its sub-loops in AliasSetMap, because at the point we receive the deleteAnalysisLoop callback for the outer loop, the loop object is already invalid and we cannot access its sub-loops any longer. Reviewers: asbirlea, sanjoy, chandlerc Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D61904 llvm-svn: 360704	2019-05-14 19:41:36 +00:00
Nikita Popov	48c4e4fa80	[LVI][CVP] Add support for abs/nabs select pattern flavor Based on ConstantRange support added in D61084, we can now handle abs and nabs select pattern flavors in LVI. Differential Revision: https://reviews.llvm.org/D61794 llvm-svn: 360700	2019-05-14 18:53:47 +00:00
Philip Reames	bd8d309111	[IndVars] Extend reasoning about loop invariant exits to non-header blocks Noticed while glancing through the code for other reasons. The extension is trivial enough, decided to just do it. llvm-svn: 360694	2019-05-14 17:20:10 +00:00
Cameron McInally	7c5c0c9fe5	Support FNeg in SpeculativeExecution pass Differential Revision: https://reviews.llvm.org/D61910 llvm-svn: 360692	2019-05-14 16:51:18 +00:00
Philip Reames	bbe4ff10df	[Test] Autogen a test for ease of later changing llvm-svn: 360690	2019-05-14 16:37:29 +00:00
Tim Northover	ed9117f88d	GlobalOpt: do not promote globals used atomically to constants. Some atomic loads are implemented as cmpxchg (particularly if large or floating), and that usually requires write access to the memory involved or it will segfault. We can still propagate the constant value to users we understand though. llvm-svn: 360662	2019-05-14 11:03:13 +00:00
Gor Nishanov	d64455cd43	[coroutines] Fix spills of static array allocas Summary: CoroFrame was not considering static array allocas, and was only ever reserving a single element in the coroutine frame. This meant that stores to the non-zero'th element would corrupt later frame data. Store static array allocas as field arrays in the coroutine frame. Added test. Committed by Gor Nishanov on behalf of ben-clayton Reviewers: GorNishanov, modocache Reviewed By: GorNishanov Subscribers: Orlando, capn, EricWF, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61372 llvm-svn: 360636	2019-05-13 23:58:24 +00:00
Nemanja Ivanovic	1d662316cb	[Pass Pipeline][NFC] Add a test prior to committing D61726 This patch just adds a test case to show the differences in code emitted by opt before and after https://reviews.llvm.org/D61726. Previous attempt to commit this did not include the registered target requirement so it caused buildbot breaks. llvm-svn: 360620	2019-05-13 21:14:36 +00:00
Sanjay Patel	760f61ab36	[InstCombine] try harder to form rotate (funnel shift) (PR20750) We have a similar match for patterns ending in a truncate. This should be ok for all targets because the default expansion would still likely be better from replacing 2 'and' ops with 1. Attempt to show the logic equivalence in Alive (which doesn't currently have funnel-shift in its vocabulary AFAICT): %shamt = zext i8 %i to i32 %m = and i32 %shamt, 31 %neg = sub i32 0, %shamt %and4 = and i32 %neg, 31 %shl = shl i32 %v, %m %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl => %a = and i8 %i, 31 %shamt2 = zext i8 %a to i32 %neg2 = sub i32 0, %shamt2 %and4 = and i32 %neg2, 31 %shl = shl i32 %v, %shamt2 %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl https://rise4fun.com/Alive/V9r llvm-svn: 360605	2019-05-13 17:28:19 +00:00
Sanjay Patel	cb8957f718	[InstCombine] add tests for rotates with narrow shift amount (PR20750); NFC llvm-svn: 360601	2019-05-13 17:02:26 +00:00
Sanjay Patel	2de619099a	[LoopVectorizer] add tests for FP minmax; NFC llvm-svn: 360542	2019-05-12 14:53:59 +00:00
Simon Pilgrim	6b10fde69b	[CostModel][X86] Add min/max reduction costs for all SSE targets The original costs stopped at SSE42, I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference). I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1. llvm-svn: 360528	2019-05-11 17:12:52 +00:00
Teresa Johnson	37b80122bd	[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible Summary: We hit undefined references building with ThinLTO when one source file contained explicit instantiations of a template method (weak_odr) but there were also implicit instantiations in another file (linkonce_odr), and the latter was the prevailing copy. In this case the symbol was marked hidden when the prevailing linkonce_odr copy was promoted to weak_odr. It led to unsats when the resulting shared library was linked with other code that contained a reference (expecting to be resolved due to the explicit instantiation). Add a CanAutoHide flag to the GV summary to allow the thin link to identify when all copies are eligible for auto-hiding (because they were all originally linkonce_odr global unnamed addr), and only do the auto-hide in that case. Most of the changes here are due to plumbing the new flag through the bitcode and llvm assembly, and resulting test changes. I augmented the existing auto-hide test to check for this situation. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi Tags: #llvm Differential Revision: https://reviews.llvm.org/D59709 llvm-svn: 360466	2019-05-10 20:08:24 +00:00
Cameron McInally	e75412ab47	Add InstCombine::visitFNeg(...) Differential Revision: https://reviews.llvm.org/D61784 llvm-svn: 360461	2019-05-10 20:01:04 +00:00
Nikita Popov	e99486dc11	[CVP] Add tests for urem, sdiv, srem ranges; NFC We currently don't calculate result ranges for these binary operators. llvm-svn: 360460	2019-05-10 19:36:38 +00:00
Nikita Popov	d74b871504	[CVP] Add tests for abs and nabs spf; NFC One half of the bound is already computed correctly for these tests, the other isn't. llvm-svn: 360445	2019-05-10 17:39:50 +00:00
Nemanja Ivanovic	34dc3aca40	Pull r360426 as it is breaking the build bots. llvm-svn: 360437	2019-05-10 16:03:22 +00:00
Nemanja Ivanovic	7a41cd5b88	Another attempt to fix the build bot breaks after r360426 The test case checks were produced by the update_test_checks.py scripts and I assumed that is sufficient. However, the behaviour is different with different default target triples. Specify the triple explicitly in the test case. If this doesn't clean up the build bot breaks, I'll remove the test case until I can get to the bottom of why the behaviour on build bots is different from my machine. llvm-svn: 360434	2019-05-10 15:44:56 +00:00
Nemanja Ivanovic	0f991c65f2	Fix build break after r360426 llvm-svn: 360433	2019-05-10 15:11:40 +00:00
Michael Liao	b284414a1b	[InferAddressSpaces] Enhance the handling of constexpr. Summary: - Constant expressions may not be added in strict postorder as the forward instruction scan order. Thus, for a constant expression (CE0), if its operand (CE1) is used in a previous instruction, they are not in postorder. However, different from `cloneInstructionWithNewAddressSpace`, `cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred instructions for later resolving. That results in failure of inferring constant address. - This patch adds the support to infer constant expression operand recursively, since there won't be a loop, if that operand is another constant expression. Reviewers: arsenm Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61760 llvm-svn: 360431	2019-05-10 14:57:42 +00:00
Nemanja Ivanovic	cfc89896e0	[Pass Pipeline][NFC] Add a test prior to committing D61726 This patch just adds a test case to show the differences in code emitted by opt before and after https://reviews.llvm.org/D61726. llvm-svn: 360426	2019-05-10 13:47:00 +00:00
Cameron McInally	a67e387de8	Pre-commit InstCombine::visitFNeg(...) test. llvm-svn: 360424	2019-05-10 13:18:57 +00:00
Sanjay Patel	012adfbb96	[LoopVectorizer] fix test file to not run the entire -O3 pipeline This test file has a long history of edits from changes outside of vectorization, and it would happen again with the proposal in D61726. End-to-end testing shouldn't be happening in a test file that is specifically checking for vector masked load/store ops. Larger-scale testing goes in PhaseOrdering or the test-suite. I've hopefully preserved the intent by taking what was completely unoptimized IR in some tests and passing that through the -O1 pipeline. That becomes the input IR, and now we just run the loop vectorizer and verify that the vector masked ops are produced as expected. llvm-svn: 360340	2019-05-09 13:43:22 +00:00
Clement Courbet	fa18e6b080	[MergeICmps][NFC] Re-generate tests with update_test_checks. And use a more compact name for the tested struct. llvm-svn: 360319	2019-05-09 08:37:58 +00:00
Clement Courbet	fb0f66ddb3	[NFC] Fix typo. llvm-svn: 360314	2019-05-09 07:12:25 +00:00
Cameron McInally	cdaf5a069c	Precommit FNeg InstCombine tests Differential Revision: https://reviews.llvm.org/D61685 llvm-svn: 360281	2019-05-08 19:06:03 +00:00
Warren Ristow	d27b0c6247	[SCEV] Suppress hoisting insertion point of binops when unsafe InsertBinop tries to move insertion-points out of loops for expressions that are loop-invariant. This patch adds a new parameter, IsSafeToHoist, to guard that hoisting. This allows callers to suppress that hoisting for unsafe situations, such as divisions that may have a zero denominator. This fixes PR38697. Differential Revision: https://reviews.llvm.org/D55232 llvm-svn: 360280	2019-05-08 18:50:07 +00:00
Reid Kleckner	1558731607	Fix new reassociate-catchswitch.ll test llvm-svn: 360279	2019-05-08 18:39:03 +00:00
Sanjay Patel	b64c48597f	[InstSimplify] add tests for fcmp+minnum; NFC llvm-svn: 360275	2019-05-08 17:53:18 +00:00
David Greene	6c433713e9	[Reassociation] Place moved instructions after landing pads Reassociation's NegateValue moved instructions to the beginning of blocks (after PHIs) without checking for exception handling pads. It's possible for reassociation to move something into an exception handling block so we need to make sure we don't move things too early in the block. This change advances the insertion point past any exception handling pads. If the block we want to move into contains a catchswitch, we cannot move into it. In that case just create a new neg as if we had not found an existing neg to move. Differential Revision: https://reviews.llvm.org/D61089 llvm-svn: 360262	2019-05-08 15:44:24 +00:00
Nikita Popov	9fd02a71a3	Revert "[ValueTracking] Improve isKnowNonZero for Ints" This reverts commit `3b137a4956`. As reported in https://reviews.llvm.org/D60846, this is causing miscompiles. llvm-svn: 360260	2019-05-08 14:50:01 +00:00
Florian Hahn	3c696b3e7c	[SCCP] Fix crash when trying to constant-fold terminators multiple times. If we fold a branch/switch to an unconditional branch to another dead block we replace the branch with unreachable, to avoid attempting to fold the unconditional branch. Reviewers: davide, efriedma, mssimpso, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61300 llvm-svn: 360232	2019-05-08 09:09:54 +00:00
Mircea Trofin	0a753938db	[llvm] Avoid div by 0 when updating profile weights. Reviewers: davidxl Reviewed By: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61661 llvm-svn: 360223	2019-05-08 03:57:25 +00:00
Dan Robertson	3b137a4956	[ValueTracking] Improve isKnowNonZero for Ints Improve isKnownNonZero for integers in order to improve cttz optimizations. Differential Revision: https://reviews.llvm.org/D60846 llvm-svn: 360222	2019-05-08 02:25:08 +00:00
Sanjay Patel	e088d03b9c	[ValueTracking] add logic for known-never-nan with minnum/maxnum From the LangRef: "Returns NaN only if both operands are NaN." llvm-svn: 360206	2019-05-07 22:58:31 +00:00
Reid Kleckner	d028a463d5	Regenerate test case again after last revert llvm-svn: 360204	2019-05-07 22:40:40 +00:00
Reid Kleckner	a9cc7d71ac	Delete test cases added in r360162 that should have been deleted in r360190 llvm-svn: 360203	2019-05-07 22:35:56 +00:00
Sanjay Patel	9a1c2b7776	[InstSimplify] add tests for minnum/maxnum and NaN; NFC llvm-svn: 360197	2019-05-07 21:50:09 +00:00
Kostya Serebryany	b9c5768302	revert r360162 as it breaks most of the buildbots llvm-svn: 360190	2019-05-07 20:57:11 +00:00
Robert Lougher	8681ef8f41	[InstCombine] Add new combine to add folding (X \| C1) + C2 --> (X \| C1) ^ C1 iff (C1 == -C2) I verified the correctness using Alive: https://rise4fun.com/Alive/YNV This transform enables the following transform that already exists in instcombine: (X \| Y) ^ Y --> X & ~Y As a result, the full expected transform is: (X \| C1) + C2 --> X & ~C1 iff (C1 == -C2) There already exists the transform in the sub case: (X \| Y) - Y --> X & ~Y However this does not trigger in the case where Y is constant due to an earlier transform: X - (-C) --> X + C With this new add fold, both the add and sub constant cases are handled. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61517 llvm-svn: 360185	2019-05-07 19:36:41 +00:00
Sanjay Patel	6a281a7545	[InstCombine] allow sinking fneg operands through an FP min/max Fundamentally/generally, we should not have to rely on bailouts/crippling of folds. In this particular case, I think we always recognize the inverted predicate min/max pattern, so there should not be any loss of optimization. Codegen looks better because we are eliminating an fneg. llvm-svn: 360180	2019-05-07 18:58:07 +00:00
Simon Pilgrim	0ed545ebb3	Regenerate test to try and fix buildbots llvm-svn: 360173	2019-05-07 17:10:10 +00:00
Sanjay Patel	2a3d16feea	[InstCombine] add tests for FP min/max with negated operands; NFC llvm-svn: 360170	2019-05-07 16:25:43 +00:00
Orlando Cazalet-Hyams	78a6062c24	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel Reviewed By: hfinkel Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 360162	2019-05-07 15:37:38 +00:00
Keno Fischer	a1a4adf4b9	[SCEV] Add explicit representations of umin/smin Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express it as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D50167 llvm-svn: 360159	2019-05-07 15:28:47 +00:00
Robert Lougher	07298c9b1e	Precommit tests for or/add transform. NFC. llvm-svn: 360149	2019-05-07 14:14:29 +00:00
Jordan Rupprecht	8f14e7cacf	Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" This reverts r357452 (git commit `21eb771dcb`). This was causing strange optimization-related test failures on an internal test. Will followup with more details offline. llvm-svn: 360086	2019-05-06 21:55:05 +00:00
Sanjay Patel	a6019d5164	[InstCombine] sink FP negation of operands through select We don't always get this: Cond ? -X : -Y --> -(Cond ? X : Y) ...even with the legacy IR form of fneg in the case with extra uses, and we miss matching with the newer 'fneg' instruction because we are expecting binops through the rest of the path. Differential Revision: https://reviews.llvm.org/D61604 llvm-svn: 360075	2019-05-06 20:34:05 +00:00
Sanjay Patel	473dbf0301	[InstCombine] add tests for fneg+sel; NFC llvm-svn: 360058	2019-05-06 17:29:22 +00:00
Cameron McInally	c3167696bc	Add FNeg support to InstructionSimplify Differential Revision: https://reviews.llvm.org/D61573 llvm-svn: 360053	2019-05-06 16:05:10 +00:00
Sanjay Patel	3379fb599d	[InstCombine] regenerate test checks; NFC llvm-svn: 360052	2019-05-06 16:03:53 +00:00
Clement Courbet	9e1f2a7fe7	[SimplifyLibCalls] Simplify bcmp too. Summary: Fixes PR40699. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61585 llvm-svn: 360021	2019-05-06 09:15:22 +00:00
Markus Lavin	a778074165	[DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref. Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian targets for same reasons as in D59687. Differential Revision: https://reviews.llvm.org/D60611 llvm-svn: 360013	2019-05-06 07:20:56 +00:00
Cameron McInally	1d0c845d9d	Add FNeg IR constant folding support llvm-svn: 359982	2019-05-05 16:07:09 +00:00
Cameron McInally	fd254e429e	Add InstCombine tests for FNeg instruction. llvm-svn: 359970	2019-05-04 14:56:08 +00:00
Sanjay Patel	5ab41a7a05	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try) This is a subset of the original commit from rL359879 which was reverted because it could crash when using the 'RemovedInstructions' structure that enables delayed deletion of dead instructions. The motivating compile-time win does not require that change though. We should get most of that win from this change alone. Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359969	2019-05-04 12:46:32 +00:00
Evgeniy Stepanov	46ec57e576	Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic block" This reverts commit r359879, which introduced a compiler crash. llvm-svn: 359908	2019-05-03 17:31:49 +00:00
Robert Lougher	e28ab93546	Revert r359549 - incorrect update of test checks. NFC llvm-svn: 359897	2019-05-03 15:14:19 +00:00
Sanjay Patel	d3cfaae243	[LICM] auto-generate complete test checks; NFC llvm-svn: 359881	2019-05-03 13:25:06 +00:00
Sanjay Patel	8ff072e48e	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. Also, we were restarting the iterator loops when doing the overflow intrinsic transforms by marking the dominator tree for update. That was done to prevent iterating over a removed instruction. But we can postpone the deletion using the existing "RemovedInsts" structure, and that means we don't need to update the DT. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359879	2019-05-03 13:09:18 +00:00
Bob Haarman	a78ab77b6b	remove inalloca parameters in globalopt and simplify argpromotion Summary: Inalloca parameters require special handling in some optimizations. This change causes globalopt to strip the inalloca attribute from function parameters when it is safe to do so, removes the special handling for inallocas from argpromotion, and replaces it with a simple check that causes argpromotion to skip functions that receive inallocas (for when the pass is invoked on code that didn't run through globalopt first). This also avoids a case where argpromotion would incorrectly try to pass an inalloca in a register. Fixes PR41658. Reviewers: rnk, efriedma Reviewed By: rnk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61286 llvm-svn: 359743	2019-05-02 00:37:36 +00:00
Hiroshi Yamauchi	1620104034	[PGO][CHR] A bug fix. Summary: Fix a transformation bug where two scopes share a common instruction to hoist. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61405 llvm-svn: 359736	2019-05-01 22:49:52 +00:00
Hubert Tong	02d055a269	[tests] Add host-byteorder-*-endian; update XFAILs of big-endian triples Summary: Triple components in `XFAIL` lines are tested against the target triple. Various tests that are expected to fail on big-endian hosts are marked as being `XFAIL` for big-endian targets. This patch corrects these tests by having them test against a new `host-byteorder-big-endian` feature. Reviewers: xingxue, sfertile, jasonliu Reviewed By: xingxue Subscribers: jvesely, nhaehnle, fedor.sergeev, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60551 llvm-svn: 359689	2019-05-01 15:36:18 +00:00
Philip Reames	84e54eb471	[InstCombine] Limit a vector demanded elts rule which was producing invalid IR. The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs. The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes. This should fix https://bugs.llvm.org/show_bug.cgi?id=41624. llvm-svn: 359633	2019-04-30 23:09:26 +00:00
Alina Sbirlea	4e1ac95cf5	[PassManagerBuilder] Add option for interleaved loops, for loop vectorize. Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61030 llvm-svn: 359615	2019-04-30 21:29:20 +00:00
Simon Pilgrim	83098d28a1	[SLP] Lit test that cannot get vectorized due to lack of look-ahead operand reordering heuristic. The code in this test is not vectorized by SLP because its operand reordering cannot look beyond the immediate predecessors. This will get fixed in a follow-up patch that introduces the look-ahead operand reordering heuristic. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D61283 llvm-svn: 359553	2019-04-30 11:03:09 +00:00
Jeremy Morse	562f5f04f5	Update checks in an instcombine test, NFC This reduces the delta in some incoming work that changes this test. llvm-svn: 359549	2019-04-30 10:56:33 +00:00
Quentin Colombet	ae2cbb3400	[BlockExtractor] Change the basic block separator from ',' to ';' This change aims at making the file format be compatible with the way LLVM handles command line options. Differential Revision: https://reviews.llvm.org/D60970 llvm-svn: 359462	2019-04-29 16:14:00 +00:00
Simon Pilgrim	46128cdf08	[InstCombine][X86] Add PACKSS tests for truncation of sign-extended comparisons llvm-svn: 359435	2019-04-29 10:36:20 +00:00
Dan Robertson	9e441aee50	[NFC] Add baseline tests for int isKnownNonZero Add baseline tests for improvements of isKnownNonZero for integer types. Differential Revision: https://reviews.llvm.org/D60932 llvm-svn: 359267	2019-04-26 02:55:54 +00:00
Akira Hatanaka	8edf8f317b	[ObjC][ARC] Let ARC optimizer bail out if the number of pointer states it keeps track of becomes too large ARC optimizer does a top-down and a bottom-up traversal of the whole function to pair up retain and release instructions and remove them. This can be expensive if the number of instructions in the function and pointer states it tracks are large since it has to look at each pointer state and determine whether the instruction being visited can potentially use the pointer. This patch adds a command line option that sets a limit to the number of pointers it tracks. rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D61100 llvm-svn: 359226	2019-04-25 19:42:55 +00:00
Robert Lougher	d469133f95	[Evaluator] Walk initial elements when handling load through bitcast When evaluating a store through a bitcast, the evaluator tries to move the bitcast from the pointer onto the stored value. If the cast is invalid, it tries to "introspect" the type to get a valid cast by obtaining a pointer to the initial element (if the type is nested, this may require walking several initial elements). In some situations it is possible to get a bitcast on a load (e.g. with unions, where the bitcast may not be the same type as the store). However, equivalent logic to the store to introspect the type is missing. This patch adds this logic. Note, when developing the patch I was unhappy with adding similar logic directly to the load case as it could get out of step. Instead, I have abstracted the "introspection" into a helper function, with the specifics being handled by a passed-in lambda function. Differential Revision: https://reviews.llvm.org/D60793 llvm-svn: 359205	2019-04-25 17:00:01 +00:00
Simon Pilgrim	86ff9d313a	[InstCombine][X86] Add PACKSS/PACKUS tests for truncation where saturation won't occur llvm-svn: 359185	2019-04-25 12:45:11 +00:00
Roman Lebedev	445c22b7eb	[NFC][LoopIdiomRecognize] Some basic baseline tests for bcmp loop idiom Doubt this is the final test coverage, but this appears to have good coverage already, so i figure i might as well precommit it. llvm-svn: 359173	2019-04-25 08:33:47 +00:00
Alina Sbirlea	733c8c40c8	Enable LoopVectorization by default. Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables it for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization. Reviewers: chandlerc, jgorbe Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61091 llvm-svn: 359167	2019-04-25 04:49:48 +00:00
Alexey Bataev	ef3c1884ec	[SLP] Fix crash after r358519, by V. Porpodas. Summary: The code did not check if operand was undef before casting it to Instruction. Reviewers: RKSimon, ABataev, dtemirbulatov Reviewed By: ABataev Subscribers: uabelho Tags: #llvm Differential Revision: https://reviews.llvm.org/D61024 llvm-svn: 359136	2019-04-24 20:21:32 +00:00
Dmitry Mikulin	312b5f86b7	The error message for mismatched value sites is very cryptic. Make it more readable for an average user. Differential Revision: https://reviews.llvm.org/D60896 llvm-svn: 359043	2019-04-23 22:26:55 +00:00
Akira Hatanaka	5c3117b0a9	[ObjC][ARC] Check the basic block size before calling DominatorTree::dominate. ARC contract pass has an optimization that replaces the uses of the argument of an ObjC runtime function call with the call result. For example: ; Before optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %1, i8** @g0, align 8 ; After optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %2, i8** @g0, align 8 // %1 is replaced with %2 Before replacing the argument use, DominatorTree::dominate is called to determine whether the user instruction is dominated by the ObjC runtime function call instruction. The call to DominatorTree::dominate can be expensive if the two instructions belong to the same basic block and the size of the basic block is large. This patch checks the basic block size and just bails out if the size exceeds the limit set by command line option "arc-contract-max-bb-size". rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D60900 llvm-svn: 359027	2019-04-23 19:49:03 +00:00
Philip Reames	2ce017026a	[InstCombine] Convert a masked.load of a dereferenceable address to an unconditional load If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable. Differential Revision: https://reviews.llvm.org/D59703 llvm-svn: 359000	2019-04-23 15:25:14 +00:00
David Green	63a2aa715a	[LSR] Limit the recursion for setup cost In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958	2019-04-23 08:52:21 +00:00
Philip Reames	d748689c7f	[InstCombine] Eliminate stores to constant memory If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store. The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead. Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store is undefined, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date. Differential Revision: https://reviews.llvm.org/D60659 llvm-svn: 358919	2019-04-22 20:28:19 +00:00
Philip Reames	f01583d097	[Tests] Revise a test as requested by reviewer in D59703 llvm-svn: 358907	2019-04-22 18:51:58 +00:00
Philip Reames	8f47089034	[Tests] Add a negative test for masked.gather part of D59703 llvm-svn: 358906	2019-04-22 18:28:44 +00:00
Serguei Katkov	40a3b96196	[NewPM] Add Option handling for SimpleLoopUnswitch This patch enables passing options to SimpleLoopUnswitch via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60676 llvm-svn: 358880	2019-04-22 10:35:07 +00:00
Serguei Katkov	5614f4a3a5	[NewPM] Add dummy Test for LoopVectorize option parsing. llvm-svn: 358878	2019-04-22 09:53:26 +00:00
Luqman Aden	2993661cc0	[CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw. Summary: Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative. Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181) Reviewers: sanjoy, apilipenko, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60036 llvm-svn: 358816	2019-04-20 13:14:18 +00:00
Nikita Popov	d89de3f7f4	[IndVarSimplify] Generate full checks for some LFTR tests; NFC llvm-svn: 358813	2019-04-20 12:05:53 +00:00
Nikita Popov	aa0c5a022f	[IndVarSimplify] Add tests for PR31181; NFC llvm-svn: 358812	2019-04-20 12:05:43 +00:00
Nikita Popov	2e33f8de57	[CVP] Add tests for sub nowrap inference; NFC These are baseline tests for D60036. Patch by Luqman Aden. llvm-svn: 358808	2019-04-20 07:43:15 +00:00
Vedant Kumar	282b26ec4d	[GVN+LICM] Use line 0 locations for better crash attribution This is a follow-up to r291037+r291258, which used null debug locations to prevent jumpy line tables. Using line 0 locations achieves the same effect, but works better for crash attribution because it preserves the right inline scope. Differential Revision: https://reviews.llvm.org/D60913 llvm-svn: 358791	2019-04-19 22:36:40 +00:00
Fangrui Song	884f557bb2	[MergeFunc] removeUsers: call remove() only on direct users removeUsers uses a work list to collect indirect users and call remove() on those functions. However it has a bug (`if (!Visited.insert(UU).second)`). Actually, we don't have to collect indirect users. After the merge of F and G, G's callers will be considered (added to Deferred). If G's callers can be merged, G's callers' callers will be considered. Update the test unnamed-addr-reprocessing.ll to make it clear we can still merge indirect callers. llvm-svn: 358741	2019-04-19 07:57:51 +00:00
Saleem Abdulrasool	b96d9b3419	MergeFunc: preserve COMDAT information when creating a thunk We would previously drop the COMDAT on the thunk we generated when replacing a function body with the forwarding thunk. This would result in a function that may have been multiply emitted and multiply merged to be emitted with the same name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is used for the deduplication of Value Witness functions for Swift. llvm-svn: 358728	2019-04-19 01:48:36 +00:00
Philip Reames	137995d8da	[GuardWidening] Wire up a NPM version of the LoopGuardWidening pass llvm-svn: 358704	2019-04-18 19:17:14 +00:00
Quentin Colombet	ea3364bf85	[BlockExtractor] Extend the file format to support the grouping of basic blocks Prior to this patch, each basic block listed in the extract-blocks-file would be extracted to a different function. This patch adds support for a comma-separated list of basic blocks to form a group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched. Let us see this new format in action (comments are not part of the file format): ;; funcName bbName[,bbName...] foo bb1 ;; Extract bb1 in its own function foo bb2,bb3 ;; Extract bb2,bb3 in their own function bar bb1,bb4 ;; Extract bb1,bb4 in their own function bar bb2 ;; Extract bb2 in its own function Assuming all regions are extractable, this will create one function and thus one call per region. Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701	2019-04-18 18:28:30 +00:00
Philip Reames	adf288c5d9	[LoopPred] Fix a blatantly obvious bug in r358684 The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688	2019-04-18 17:01:19 +00:00
Philip Reames	92a7177e6b	[LoopPredication] Allow predication of loop invariant computations (within the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i' A = _a.length; guard (i < A) a = _a[i] B = _b.length; guard (i < B); b = _b[i]; ... Z = _z.length; guard (i < Z) z = _z[i] accum += a + b + ... + z; Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form. Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later. As a secondary benefit, this allows LoopPred to expose peeling opportunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before. A couple of subtleties in the implementation: - SCEV's isSafeToExpand guarantees speculation safety (i.e. lets us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point. - SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops. - invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but meet the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit. Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684	2019-04-18 16:33:17 +00:00
Kit Barton	3cdf87940f	Add basic loop fusion pass. This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607	2019-04-17 18:53:27 +00:00
Steven Wu	05a358cdcd	[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols Summary: Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool. ThinLTOCodeGenerator currently does not preserve llvm.used symbols and it can internalize them. In order to pass the necessary information to the legacy ThinLTOCodeGenerator, the input to the code generator is rewritten to be based on lto::InputFile. Now ThinLTO using the legacy LTO API will require data layout in Module. "internalize" thinlto action in llvm-lto is updated to run both "promote" and "internalize" with the same configuration as ThinLTOCodeGenerator. The old "promote" + "internalize" option does not produce the same output as ThinLTOCodeGenerator. This fixes: PR41236 rdar://problem/49293439 Reviewers: tejohnson, pcc, kromanova, dexonsmith Reviewed By: tejohnson Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60421 llvm-svn: 358601	2019-04-17 17:38:09 +00:00
Nikita Popov	2039581002	[LVI][CVP] Constrain values in with.overflow branches If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1) then we can constrain the value of %x inside the branch based on makeGuaranteedNoWrapRegion(). We do this by extending the edge-value handling in LVI. This allows CVP to then fold comparisons against %x, as illustrated in the tests. Differential Revision: https://reviews.llvm.org/D60650 llvm-svn: 358597	2019-04-17 16:57:42 +00:00
Florian Hahn	893aea58ea	[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size. Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586	2019-04-17 15:57:43 +00:00
Roman Lebedev	0080645846	[CVP] processOverflowIntrinsic(): don't crash if constant-folding happened As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559	2019-04-17 06:35:07 +00:00
Eric Christopher	e29874eaa0	Revert "Add basic loop fusion pass." Per request. This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553	2019-04-17 04:55:24 +00:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Kit Barton	ab70da0728	Add basic loop fusion pass. This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Phabricator: https://reviews.llvm.org/D55851 llvm-svn: 358543	2019-04-17 01:37:00 +00:00
Sanjay Patel	e08783e2f5	[EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101) This is 1 of the problems discussed in the post-commit thread for: rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html and filed as: https://bugs.llvm.org/show_bug.cgi?id=41101 Instcombine tries to canonicalize some of these cases (and there's room for improvement there independently of this patch), but it can't always do that because of extra uses. So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar to how we detect commuted compares and commuted min/max/abs. Differential Revision: https://reviews.llvm.org/D60723 llvm-svn: 358523	2019-04-16 20:41:20 +00:00
Nikita Popov	52b24ee932	[CVP] Simplify umulo and smulo that cannot overflow If a umul.with.overflow or smul.with.overflow operation cannot overflow, simplify it to a simple mul nuw / mul nsw. After the refactoring in D60668 this is just a matter of removing an explicit check against multiplications. Differential Revision: https://reviews.llvm.org/D60791 llvm-svn: 358521	2019-04-16 20:31:41 +00:00
Simon Pilgrim	82ffa88a04	[SLP] Refactoring of the operand reordering code. This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold: i. Cleanup and simplify the reordering code, and ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2. This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo . Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59973 llvm-svn: 358519	2019-04-16 19:27:00 +00:00
Nikita Popov	5a30177906	[CVP] Add tests for non-overflowing mulo; NFC Should be simplified to simple mul. llvm-svn: 358517	2019-04-16 19:25:35 +00:00
Nikita Popov	5ecd6a48b9	[InstCombine] Prune fshl/fshr with masked operands If a constant shift amount is used, then only some of the LHS/RHS operand bits are demanded and we may be able to simplify based on that. InstCombineSimplifyDemanded already had the necessary support for that, we just weren't calling it with fshl/fshr as root. In particular, this allows us to relax some masked funnel shifts into simple shifts, as shown in the tests. Patch by Shawn Landden. Differential Revision: https://reviews.llvm.org/D60660 llvm-svn: 358515	2019-04-16 19:05:49 +00:00
Nikita Popov	f700081a7d	[InstCombine] Add tests for fshl/fshr with masked operands; NFC Baseline tests for D60660. Patch by Shawn Landden. Differential Revision: https://reviews.llvm.org/D60688 llvm-svn: 358514	2019-04-16 19:05:40 +00:00
Philip Reames	c44b68e2b7	[Tests] Add branch_weights to latches so that test is not affected by future profitability patch to LoopPredication llvm-svn: 358506	2019-04-16 16:32:59 +00:00
Hans Wennborg	21eb771dcb	Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259) The original commit caused false positives from AddressSanitizer's use-after-scope checks, which have now been fixed in r358478. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 358483	2019-04-16 12:13:25 +00:00
Quentin Colombet	fda0426888	[LSR] Rewrite misses some fixup locations if it splits critical edge If LSR split critical edge during rewriting phi operands and phi node has other pending fixup operands, we need to update those pending fixups. Otherwise formulae will not be implemented completely and some instructions will not be eliminated. llvm.org/PR41445 Differential Revision: https://reviews.llvm.org/D60645 Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com> llvm-svn: 358457	2019-04-15 22:23:46 +00:00
Sanjay Patel	800a0c3e4b	[EarlyCSE] add more tests for double-negated select condition; NFC llvm-svn: 358454	2019-04-15 21:51:51 +00:00
Sanjay Patel	5ae05d810c	[EarlyCSE] add test for select condition double-negation; NFC llvm-svn: 358444	2019-04-15 20:25:31 +00:00
Philip Reames	af808ee2ee	[Tests] Add a few more tests for LoopPredication w/invariant loads Making sure to cover an important legality cornercase. llvm-svn: 358439	2019-04-15 19:45:27 +00:00
Wolfgang Pieb	4fe42214e2	[DEBUGINFO] Prevent Instcombine from dropping debuginfo when removing zexts Zexts can be treated like no-op casts when it comes to assessing whether their removal affects debug info. Reviewer: aprantl Differential Revision: https://reviews.llvm.org/D60641 llvm-svn: 358431	2019-04-15 17:36:29 +00:00
Hiroshi Yamauchi	09e539fcae	[PGO] Profile guided code size optimization. Summary: Enable some of the existing size optimizations for cold code under PGO. A ~5% code size saving in big internal app under PGO. The way it gets BFI/PSI is discussed in the RFC thread http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html Note it doesn't currently touch loop passes. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59514 llvm-svn: 358422	2019-04-15 16:49:00 +00:00
Sanjay Patel	0e0bb0e24a	[EarlyCSE] add tests for selects with commuted operands (PR41101); NFC llvm-svn: 358420	2019-04-15 16:01:05 +00:00
Philip Reames	fbe64a2cfb	[LoopPred] Hoist and of predicated checks where legal If we have multiple range checks which can be predicated, hoist the and of the results outside the loop. This minorly cleans up the resulting IR, but the main motivation is as a building block for D60093. llvm-svn: 358419	2019-04-15 15:53:25 +00:00
Sanjay Patel	c71433335a	[EarlyCSE] regenerate test checks; NFC llvm-svn: 358407	2019-04-15 14:02:37 +00:00
Sanjay Patel	5e13cd2e61	[InstCombine] canonicalize fdiv after fmul if reassociation is allowed (X / Y) * Z --> (X * Z) / Y This can allow other optimizations/reassociations as shown in the test diffs. llvm-svn: 358404	2019-04-15 13:23:38 +00:00
Serguei Katkov	f54328372b	[NewPM] Add Option handling for SimplifyCFG This patch enables passing options to SimplifyCFGPass via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60675 llvm-svn: 358379	2019-04-15 08:57:53 +00:00
Philip Reames	0eeb2cd491	[Tests] Add tests for D60659, and make adjustments to others to make diff clear Three related changes: 1) auto-gen several test files 2) Add the new tests at the bottom of said files 3) Adjust a couple of other test files not to use stores to constants when trying to test constexpr address handling llvm-svn: 358344	2019-04-13 22:12:56 +00:00
Nikita Popov	040871db48	[CVP] Add tests for range of with.overflow result; NFC Test range of with.overflow result in the no-overflow branch. llvm-svn: 358341	2019-04-13 19:43:51 +00:00
Nikita Popov	41e284b9c3	[CVP] Fix inverted predicates in test; NFC Checked the wrong direction in the umul tests... fix predicates to line up with the test name. llvm-svn: 358331	2019-04-13 11:47:36 +00:00
Nikita Popov	25c1aa15a7	[CVP] Add tests for with.overflow used as condition; NFC llvm-svn: 358330	2019-04-13 11:40:16 +00:00
Chen Zheng	87dd0e06dc	[InstCombine] Canonicalize (-X srem Y) to -(X srem Y). Differential Revision: https://reviews.llvm.org/D60647 llvm-svn: 358328	2019-04-13 09:21:22 +00:00
Chen Zheng	fc59a0326b	[InstCombine] [NFC] add testcases for canonicalizing (-X srem Y) to -(X srem Y). llvm-svn: 358327	2019-04-13 07:34:55 +00:00
Philip Reames	b091cc081d	[InstCombine] Fix a nasty miscompile introduced w/masked.gather demanded elts This fixes a miscompile which was introduced in r356510 (https://reviews.llvm.org/D57372). The problem is that the original patch removed pointer operands where the load results weren't demanded, but without considering the legality of the load itself. If the masked.gather had active, but undemanded, lanes, then we could end up creating a load which loaded from an undef address. The result could be a segfault, or, in theory, an arbitrary read from a random memory location into an unused register. llvm-svn: 358299	2019-04-12 18:26:56 +00:00
Nikita Popov	00a0d5d1de	[CVP] Set NSW/NUW flags when simplifying with.overflow When CVP determines that a with.overflow intrinsic cannot overflow, it currently inserts a simple add/sub. As we already determined that there can be no overflow, we should add the appropriate NUW/NSW flag. Differential Revision: https://reviews.llvm.org/D60585 llvm-svn: 358298	2019-04-12 18:18:17 +00:00
Philip Reames	7a60cd38af	[Tests] Checkin a test demonstrating a miscompile so that patch which fixes it shows a clear diff llvm-svn: 358296	2019-04-12 18:11:58 +00:00
Jeremy Morse	32afe6a1f8	[DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc Bug: https://bugs.llvm.org/show_bug.cgi?id=41175 In the bug test case the DSE pass is shortening the range of memory that a memset is working on. A getelementptr is generated so that the new starting address can be passed to memset. This instruction was not given a DebugLoc. To fix the bug, copy the DebugLoc from the memset instruction. Patch by Orlando Cazalet-Hyams! Differential Revision: https://reviews.llvm.org/D60556 llvm-svn: 358270	2019-04-12 09:47:35 +00:00
Fangrui Song	d5c404246f	[ConstantFold] Don't evaluate FP or FP vector casts or truncations when simplifying icmp Fix PR41476 llvm-svn: 358262	2019-04-12 07:34:30 +00:00
Nikita Popov	6ffa1511ea	[CVP] Generate full test checks for overflows.ll; NFC llvm-svn: 358229	2019-04-11 21:10:39 +00:00
Rong Xu	959ef16859	[PGO] Better handling of profile hash mismatch We currently assume profile hash conflicts will be caught by an upfront check and we assert for the cases that escape the check. The assumption is not always true as there are chances of conflict. This patch prints a warning and skips annotating the function for the escaped cases. Differential Revision: https://reviews.llvm.org/D60154 llvm-svn: 358225	2019-04-11 20:54:17 +00:00
Simon Pilgrim	8d083c5e0b	[ConstantFold] ExtractConstantBytes - handle shifts on large integer types Use APInt instead of getZExtValue from the ConstantInt until we can confirm that the shift amount is in range. Reduced from OSS-Fuzz #14169 - https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14169 llvm-svn: 358192	2019-04-11 16:39:31 +00:00
Erik Pilkington	cb5c7bd9eb	Fix a hang when lowering __builtin_dynamic_object_size If the ObjectSizeOffsetEvaluator fails to fold the object size call, then it may litter some unused instructions in the function. When done repeatedly in InstCombine, this results in an infinite loop. Fix this by tracking the set of instructions that were inserted, then removing them on failure. rdar://49172227 Differential revision: https://reviews.llvm.org/D60298 llvm-svn: 358146	2019-04-10 23:42:11 +00:00
Nikita Popov	0a8228fd28	[InstCombine] Handle ssubo always overflow Following D60483 and D60497, this adds support for AlwaysOverflows handling for ssubo. This is the last case we can handle right now. Differential Revision: https://reviews.llvm.org/D60518 llvm-svn: 358100	2019-04-10 16:32:15 +00:00
Nikita Popov	7a543c3758	[InstCombine] ssubo X, C -> saddo X, -C ssubo X, C is equivalent to saddo X, -C. Make the transformation in InstCombine and allow the logic implemented for saddo to fold prior usages of add nsw or sub nsw with constants. Patch by Dan Robertson. Differential Revision: https://reviews.llvm.org/D60061 llvm-svn: 358099	2019-04-10 16:27:36 +00:00
Nikita Popov	ef23e88480	[InstCombine] Handle saddo always overflow Followup to D60483: Handle AlwaysOverflow conditions for saddo as well. Differential Revision: https://reviews.llvm.org/D60497 llvm-svn: 358095	2019-04-10 16:18:01 +00:00
David Stenberg	fab4bdf4b9	Add REQUIRES: asserts to test using -debug-only llvm-svn: 358057	2019-04-10 08:44:57 +00:00
Florian Hahn	db1a69c250	[VPLAN] Minor improvement to testing and debug messages. 1. Use computed VF for stress testing. 2. If the computed VF does not produce vector code (VF smaller than 2), force VF to be 4. 3. Test vectorization of i64 data on AArch64 to make sure we generate VF != 4 (on X86 that was already tested on AVX). Patch by Francesco Petrogalli <francesco.petrogalli@arm.com> Differential Revision: https://reviews.llvm.org/D59952 llvm-svn: 358056	2019-04-10 08:17:28 +00:00
Nikita Popov	09020ec2a7	[InstCombine] Handle usubo always overflow Check AlwaysOverflow condition for usubo. The implementation is the same as the existing handling for uaddo and umulo. Handling for saddo and ssubo will follow (smulo doesn't have the necessary ValueTracking support). Differential Revision: https://reviews.llvm.org/D60483 llvm-svn: 358052	2019-04-10 07:10:53 +00:00
Chen Zheng	5e13ff1da2	[InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y). Differential Revision: https://reviews.llvm.org/D60395 llvm-svn: 358050	2019-04-10 06:52:09 +00:00
Akira Hatanaka	9ca9d32b6b	[ObjC][ARC] Convert the retainRV marker that is passed as a named metadata into a module flag in the auto-upgrader and make the ARC contract pass read the marker as a module flag. This is needed to fix a bug where ARC contract wasn't inserting the retainRV marker when LTO was enabled, which caused objects returned from a function to be auto-released. rdar://problem/49464214 Differential Revision: https://reviews.llvm.org/D60303 llvm-svn: 358047	2019-04-10 06:20:20 +00:00
Nikita Popov	c176b708e4	[InstCombine] Add with.overflow always overflow tests; NFC The uadd and umul cases are currently handled, the usub, sadd, ssub and smul cases are not. usub, sadd and ssub already have the necessary ValueTracking support, smul doesn't. llvm-svn: 358031	2019-04-09 20:02:23 +00:00
Nikita Popov	2f5e9de8d1	Revert "[InstCombine] [InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y)." This reverts commit `1383a91689`. sdiv-canonicalize.ll fails after this revision. The fold needs to be moved outside the branch handling constant operands. However when this is done there are further test changes, so I'm reverting this in the meantime. llvm-svn: 358026	2019-04-09 18:32:38 +00:00
Nikita Popov	4b2323d1a3	[ValueTracking] Use computeConstantRange() for signed sub overflow determination This is the same change as D60420 but for signed sub rather than signed add: Range information is intersected into the known bits result, allows to detect more no/always overflow conditions. Differential Revision: https://reviews.llvm.org/D60469 llvm-svn: 358020	2019-04-09 17:01:49 +00:00
Chen Zheng	1383a91689	[InstCombine] [InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y). Differential Revision: https://reviews.llvm.org/D60395 llvm-svn: 358017	2019-04-09 16:34:31 +00:00
Nikita Popov	10edd2b79d	[ValueTracking] Use computeConstantRange() in signed add overflow determination This is D59386 for the signed add case. The computeConstantRange() result is now intersected into the existing known bits information, allowing to detect additional no-overflow/always-overflow conditions (though the latter isn't used yet). This (finally...) covers the motivating case from D59071. Differential Revision: https://reviews.llvm.org/D60420 llvm-svn: 358014	2019-04-09 16:12:59 +00:00
Sanjay Patel	49d9d17a77	[InstCombine] prevent possible miscompile with sdiv+negate of vector op Similar to: rL358005 Forego folding arbitrary vector constants to fix a possible miscompile bug. We can enhance the transform if we do want to handle the more complicated vector case. llvm-svn: 358013	2019-04-09 15:13:03 +00:00
Sanjay Patel	d5173f5acf	[InstCombine] add tests for sdiv with negated dividend and constant divisor; NFC llvm-svn: 358010	2019-04-09 14:48:44 +00:00
Sanjay Patel	7563b65ad4	[InstCombine] add tests for sdiv-by-int-min; NFC llvm-svn: 358008	2019-04-09 14:27:07 +00:00
Sanjay Patel	d469954d61	[InstCombine] auto-generate complete test checks; NFC llvm-svn: 358007	2019-04-09 14:27:03 +00:00
Sanjay Patel	f62dcea7ed	[InstCombine] prevent possible miscompile with negate+sdiv of vector op // 0 - (X sdiv C) -> (X sdiv -C) provided the negation doesn't overflow. This fold has been around for many years and nobody noticed the potential vector miscompile from overflow until recently... So it seems unlikely that there's much demand for a vector sdiv optimization on arbitrary vector constants, so just limit the matching to splat constants to avoid the possible bug. Differential Revision: https://reviews.llvm.org/D60426 llvm-svn: 358005	2019-04-09 14:09:06 +00:00
Sanjay Patel	a230bb5fc0	[InstCombine] add tests/comments for negate+sdiv; NFC llvm-svn: 358003	2019-04-09 13:41:29 +00:00
Chen Zheng	11cf397292	[InstCombine] add more testcases for canonicalize (-X s/ Y) to -(X s/ Y). llvm-svn: 358000	2019-04-09 12:47:29 +00:00
Sanjay Patel	74ccef1f4f	[InstCombine] add tests for negate+sdiv; NFC PR41425: https://bugs.llvm.org/show_bug.cgi?id=41425 llvm-svn: 357953	2019-04-08 22:55:10 +00:00
Sanjay Patel	773e04c883	[InstCombine] peek through fdiv to find a squared sqrt A more general canonicalization between fdiv and fmul would not handle this case because that would have to be limited by uses to prevent 2 values from becoming 3 values: (x/y) * (x/y) --> (xx) / (yy) (But we probably should still have that limited -- but more general -- canonicalization independently of this change.) llvm-svn: 357943	2019-04-08 21:23:50 +00:00
Sanjay Patel	bf1417d7e4	[InstCombine] add extra-use tests for fmul+sqrt; NFC llvm-svn: 357939	2019-04-08 20:37:34 +00:00
Nikita Popov	15abd74de7	[InstCombine] Add more tests for signed saturating math overflow; NFC Overflow conditions for sadd.sat and ssub.sat which can be determined based on constant ranges, but not necessarily known bits. llvm-svn: 357938	2019-04-08 20:02:47 +00:00
Brian M. Rzycki	887865c1ad	[JumpThreading] Fix incorrect fold conditional after indirectbr/callbr Fixes bug 40992: https://bugs.llvm.org/show_bug.cgi?id=40992 There is potential for miscompiled code emitted from JumpThreading when analyzing a block with one or more indirectbr or callbr predecessors. The ProcessThreadableEdges() function incorrectly folds conditional branches into an unconditional branch. This patch prevents incorrect branch folding without fully pessimizing other potential threading opportunities through the same basic block. This IR shape was manually fed in via opt and is unclear if clang and the full pass pipeline will ever emit similar code shapes. Thanks to Matthias Liedtke for the bug report and simplified IR example. Differential Revision: https://reviews.llvm.org/D60284 llvm-svn: 357930	2019-04-08 18:20:35 +00:00
Sanjay Patel	b33938df7a	[InstCombine] remove overzealous assert for shuffles (PR41419) As the TODO indicates, instsimplify could be improved. Should fix: https://bugs.llvm.org/show_bug.cgi?id=41419 llvm-svn: 357910	2019-04-08 13:28:29 +00:00
Simon Pilgrim	b4f1bfa659	[InstCombine][X86] Expand MOVMSK to generic IR (PR39927) First step towards removing the MOVMSK intrinsics completely - this patch expands MOVMSK to the pattern: e.g. PMOVMSKB(v16i8 x): %cmp = icmp slt <16 x i8> %x, zeroinitializer %int = bitcast <16 x i8> %cmp to i16 %res = zext i16 %int to i32 Which is correctly handled by ISel and FastIsel (give or take an annoying movzx move....): https://godbolt.org/z/rkrSFW Differential Revision: https://reviews.llvm.org/D60256 llvm-svn: 357909	2019-04-08 13:17:51 +00:00
Chen Zheng	923c7c9daa	[InstCombine] sdiv exact flag fixup. Differential Revision: https://reviews.llvm.org/D60396 llvm-svn: 357904	2019-04-08 12:08:03 +00:00
Chen Zheng	edf91ed855	[InstCombine] add more testcases for sdiv exact flag fixup. llvm-svn: 357894	2019-04-08 09:19:42 +00:00
Chen Zheng	d3b1d74624	[InstCombine] add testcases for sdiv exact flag fixing - NFC. llvm-svn: 357884	2019-04-08 05:49:15 +00:00
Chen Zheng	c84107612a	[InstCombine] add testcase for sdiv canonicalization - NFC llvm-svn: 357883	2019-04-08 03:07:32 +00:00
Nikita Popov	3db93ac5d6	Reapply [ValueTracking] Support min/max selects in computeConstantRange() Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This fixes an infinite InstCombine loop, with the test case taken from D59378. Relative to the previous iteration, this contains some adjustments for AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen, which ends up constant folding some existing med3 tests after this change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option is added, which allows disabling scalar IR passes (that use InstSimplify) for testing purposes. Differential Revision: https://reviews.llvm.org/D59506 llvm-svn: 357870	2019-04-07 17:22:16 +00:00
Sanjay Patel	c538c50113	[InstCombine] add more tests for fmul+fdiv+sqrt; NFC llvm-svn: 357816	2019-04-05 20:54:35 +00:00
Sanjay Patel	79df4454e1	[InstCombine] add tests for fdiv+fmul; NFC llvm-svn: 357782	2019-04-05 17:00:57 +00:00
Sanjay Patel	7e3e7f8040	[InstCombine] add tests for sqrt+fdiv+fmul; NFC Examples based on recent llvm-dev thread. These are specific patterns of more general enhancements that would solve these. llvm-svn: 357780	2019-04-05 16:52:57 +00:00
Sanjay Patel	9965f5aa70	[InstCombine] add test to show reassociation that creates a denormal constant; NFC llvm-svn: 357776	2019-04-05 16:42:21 +00:00
Simon Pilgrim	5ad10f4df9	[SLP][X86] Regenerate operandorder tests with arguments on same line. NFCI. Stops update_test_checks.py from splitting the later arguments after the CHECKs. llvm-svn: 357679	2019-04-04 09:31:12 +00:00
Luqman Aden	8911c5be46	[InstCombine] Combine no-wrap sub and icmp w/ constant. Teach InstCombine the transformation `(icmp P (sub nuw\|nsw C2, Y), C) -> (icmp swap(P) Y, C2-C)` Reviewers: majnemer, apilipenko, sanjoy, spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: dmgreen, lebedev.ri, nikic, hiraditya, JDevlieghere, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59916 llvm-svn: 357674	2019-04-04 07:08:30 +00:00
David L. Jones	8b8a02175a	Revert r357452 - 'SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)' This revision causes tests to fail under ASAN. Since the cause of the failures is not clear (could be ASAN, could be a Clang bug, could be a bug in this revision), the safest course of action seems to be to revert while investigating. llvm-svn: 357667	2019-04-04 02:27:57 +00:00
Taewook Oh	a960f89962	[ProfileSummary] Count callsite samples when computing total samples. Summary: Currently ProfileSummaryBuilder doesn't count callsite samples when computing total samples. Considering that ProfileSummaryInfo is used to check the hotness of not only body samples but also callsite samples (from SampleProfileLoader), I think the callsite sample counts should be considered when computing total samples. Reviewers: eraman, danielcdh, wmi Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59835 llvm-svn: 357627	2019-04-03 19:54:43 +00:00
David Bolvansky	937720e75b	[InstCombine] Simplify ctpop with bitreverse/bswap Summary: Fixes PR41337 Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60148 llvm-svn: 357564	2019-04-03 08:08:44 +00:00
Matt Arsenault	f426ddbfc7	AMDGPU: Assume ECC is enabled by default if supported The test should really be checking for the property directly in the code object headers, but there are problems with this. I don't see this directly represented in the text form, and for the binary emission this is depending on a function level subtarget feature to emit a global flag. llvm-svn: 357558	2019-04-03 01:58:57 +00:00
Matt Arsenault	03e7492876	InstSimplify: Fold round intrinsics from sitofp/uitofp https://godbolt.org/z/gEMRZb llvm-svn: 357549	2019-04-03 00:25:06 +00:00
David Bolvansky	9f179b2c65	[InstCombine] Added tests for PR41337 llvm-svn: 357522	2019-04-02 20:21:26 +00:00
David Bolvansky	5ba60b22a4	[InstCombine] Simplify ctlz/cttz with bitreverse Summary: Fixes PR41273 Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60096 llvm-svn: 357521	2019-04-02 20:13:28 +00:00
David Bolvansky	9bba938de4	[InstCombine] Added tests for PR41273 llvm-svn: 357508	2019-04-02 18:33:54 +00:00
Vedant Kumar	9da8a68d6b	[ArgPromotion] Set debug location at updated callsites Set the correct debug location on instructions which load arguments in preparation for a call to an arg-promoted function. This prevents location cascade from misattributing the line/scope of one of these loads to the location of the instruction preceding the call. Differential Revision: https://reviews.llvm.org/D60113 llvm-svn: 357500	2019-04-02 17:42:17 +00:00
Vedant Kumar	c6bceec01a	[DebugInfo] Fix pr41180 : Loop Vectorization Debugify Failure Bug: https://bugs.llvm.org/show_bug.cgi?id=41180 In the bug test case the debug location was missing for the cmp instruction in the "middle block" BB. This patch fixes the bug by copying the debug location from the cmp of the scalar loop's terminator branch, if it exists. The patch also fixes the debug location on the subsequent branch instruction. It was previously using the location of the original loop's pre-header block terminator. Both of these instructions will now map to the source line of the conditional branch in the original loop. A regression test has been added that covers these issues. Patch by Orlando Cazalet-Hyams! Differential Revision: https://reviews.llvm.org/D59944 llvm-svn: 357499	2019-04-02 17:28:34 +00:00
Philip Reames	d3d5d76a7b	[WideableCond] Fix a nasty bug in detection of "explicit guards" The code was failing to actually check for the presence of the call to widenable_condition. The whole point of specifying the widenable_condition intrinsic was allowing widening transforms. A normal branch is not widenable. A normal branch leading to a deopt is not widenable (in general). I added a test case via LoopPredication, but GuardWidening has an analogous bug. Those are the only two passes actually using this utility just yet. Noticed while working on LoopPredication for non-widenable branches; POC in D60111. llvm-svn: 357493	2019-04-02 16:51:43 +00:00
Joseph Tremoulet	fb4d9f7287	[SimplifyCFG] Don't split musttail call from ret Summary: When inserting an `unreachable` after a noreturn call, we must ensure that it's not a musttail call to avoid breaking the IR invariants for musttail calls. Reviewers: fedor.sergeev, majnemer Reviewed By: majnemer Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60080 llvm-svn: 357485	2019-04-02 15:48:58 +00:00
Taewook Oh	6a27c48be2	[SampleProfile] Repeat indirect call promotion only when the target is actually hot. Summary: It is possible that multiple indirect call targets have been promoted for a single callsite from the profiled binary. Current implementation repeats promotion for all these targets as far as the callsite itself is hot (the callsite is assumed to be hot if any one of these targets was "hot" during the profiling). However, even when one of the ICPed target is hot other targets may not, and we should not repeat promotion for "cold" targets. Reviewers: danielcdh, wmi Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59940 llvm-svn: 357484	2019-04-02 15:48:21 +00:00
Joseph Tremoulet	b69afa8e9b	[PruneEH] Don't split musttail call from ret Summary: When inserting an `unreachable` after a noreturn call, we must ensure that it's not a musttail call to avoid breaking the IR invariants for musttail calls. Reviewers: fedor.sergeev, majnemer Reviewed By: majnemer Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60079 llvm-svn: 357483	2019-04-02 15:47:11 +00:00
Hans Wennborg	b669fea42f	SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259) The code was previously checking that candidates for sinking had exactly one use or were a store instruction (which can't have uses). This meant we could sink call instructions only if they had a use. That limitation seemed a bit arbitrary, so this patch changes it to "instruction has zero or one use" which seems more natural and removes the need to special-case stores. Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 357452	2019-04-02 08:01:38 +00:00
Matt Arsenault	fa0a2c529b	InstSimplify: Add missing case from r357386 llvm-svn: 357443	2019-04-02 00:46:19 +00:00
Matt Arsenault	294e07cf03	AMDGPU: Fix test filename llvm-svn: 357441	2019-04-02 00:36:04 +00:00
Philip Reames	05e3e554b4	[LoopPred] Be uniform about proving generated conditions We'd been optimizing the case where the predicate was obviously true, do the same for the false case. Mostly just for completeness sake, but also may improve compile time in loops which will exit through the guard. Such loops are presumed rare in fastpath code, but may be present down untaken paths, so optimizing for them is still useful. llvm-svn: 357408	2019-04-01 16:26:08 +00:00
Philip Reames	d109e2a7c3	[LoopPred] Delete the old condition expressions if unused LoopPredication was replacing the original condition, but leaving the instructions to compute the old conditions around. This would get cleaned up by other passes of course, but we might as well do it eagerly. That also makes the test output less confusing. llvm-svn: 357406	2019-04-01 16:05:15 +00:00
Philip Reames	7eee62b5d4	[Tests] Autogen all the LoopPredication tests I'm about to make some changes to the pass which cause widespread - but uninteresting - test diffs. Prepare the tests for easy updating. llvm-svn: 357404	2019-04-01 15:35:30 +00:00
Philip Reames	9ef7708bbb	[Tests] Add tests for a possible loop predication transform variant As highlighted by tests, if one of the operands is loop variant, but guaranteed to have the same value on all iterations, we have a missed opportunity. llvm-svn: 357403	2019-04-01 15:32:07 +00:00
Mikael Holmen	150a7ec2dc	[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder Summary: This fixes PR41270. The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector. Reviewers: reames, spatel Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60058 llvm-svn: 357389	2019-04-01 14:10:10 +00:00
Mikael Holmen	3e527cd823	Revert "[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder" This reverts commit 75216a6dbcfe5fb55039ef06a07e419fa875f4a5. I'll recommit with a better commit message with reference to the phabricator review. llvm-svn: 357387	2019-04-01 14:06:45 +00:00
Matt Arsenault	0276b94356	InstSimplify: Add baseline test for upcoming change llvm-svn: 357386	2019-04-01 14:03:44 +00:00
Mikael Holmen	d66a47f90a	[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder This fixes PR41270. The recursive function evaluateInDifferentElementOrder expects to be called on a vector Value, so when we call it on a vector GEP's arguments, we must first check that the argument is indeed a vector. llvm-svn: 357385	2019-04-01 13:48:56 +00:00
Sanjay Patel	97d1bc4454	[InstCombine] eliminate commuted select-shuffles + binop (PR41304) If we have a commutable vector binop with inverted select-shuffles, we don't care about the order of the operands in each vector lane: LHS = shuffle V1, V2, <0, 5, 6, 3> RHS = shuffle V2, V1, <0, 5, 6, 3> LHS + RHS --> <V1[0]+V2[0], V2[1]+V1[1], V2[2]+V1[2], V1[3]+V2[3]> --> V1 + V2 PR41304: https://bugs.llvm.org/show_bug.cgi?id=41304 ...is currently titled as an SLP enhancement, but at least for the given example, we can reduce that in instcombine because we are just eliminating shuffles. As noted in the TODO, this could be generalized, but I haven't thought through those patterns completely, so this is limited to what appears to be always safe. Differential Revision: https://reviews.llvm.org/D60048 llvm-svn: 357382	2019-04-01 13:36:40 +00:00
Sanjay Patel	7ac1186b58	[InstCombine] add tests for inverted select-shuffles + binop (PR41304); NFC llvm-svn: 357368	2019-03-31 15:45:47 +00:00
Sanjay Patel	b276dd195a	[InstCombine] canonicalize select shuffles by commuting In PR41304: https://bugs.llvm.org/show_bug.cgi?id=41304 ...we have a case where we want to fold a binop of select-shuffle (blended) values. Rather than try to match commuted variants of the pattern, we can canonicalize the shuffles and check for mask equality with commuted operands. We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a special case that the backend is required to handle because we already canonicalize vector select to this shuffle form. So there should be no codegen difference from this change. It's possible that this improves CSE in IR though. Differential Revision: https://reviews.llvm.org/D60016 llvm-svn: 357366	2019-03-31 15:01:30 +00:00
Luqman Aden	7c67dbdc65	[NFC][InstCombine] Add tests for combining icmp of no-wrap sub w/ constant. llvm-svn: 357360	2019-03-31 08:58:50 +00:00
Matt Arsenault	055e4dce45	AMDGPU: Remove dx10-clamp from subtarget features Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and introduce a new amdgpu-dx10-clamp attribute to override this if desired. Also introduce a new amdgpu-ieee attribute to match. The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without defining the attribute in the generic Attributes.td. Eventually the calling convention lowering will need to insert a mode switch somewhere for these. llvm-svn: 357302	2019-03-29 19:14:54 +00:00
Sanjay Patel	01c07b1a45	[InstCombine] autogenerate complete checks; NFC llvm-svn: 357291	2019-03-29 17:51:39 +00:00
Sanjay Patel	2bff8b4272	[InstCombine] regenerate test checks; NFC llvm-svn: 357288	2019-03-29 17:47:51 +00:00
Simon Pilgrim	6a75c36ea9	[SLP] Add support for commutative icmp/fcmp predicates For the cases where the icmp/fcmp predicate is commutative, use reorderInputsAccordingToOpcode to collect and commute the operands. This requires a helper to recognise commutativity in both general Instruction and CmpInst types - the CmpInst::isCommutative doesn't overload the Instruction::isCommutative method for reasons I'm not clear on (maybe because it's based on predicate not opcode?!?). Differential Revision: https://reviews.llvm.org/D59992 llvm-svn: 357266	2019-03-29 15:28:25 +00:00
Simon Pilgrim	62f0d1650a	[SLP] Add support for swapping icmp/fcmp predicates to permit vectorization We should be able to match elements with the swapped predicate as well - as long as we commute the source operands. Differential Revision: https://reviews.llvm.org/D59956 llvm-svn: 357243	2019-03-29 10:41:00 +00:00
Florian Hahn	45682fd633	[LSR] Fix signed overflow in GenerateCrossUseConstantOffsets. For the attached test case, unchecked addition of immediate starts and ends overflows, as they can be arbitrary i64 constants. Proof: https://rise4fun.com/Alive/Plqc Reviewers: qcolombet, gilr, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59218 llvm-svn: 357217	2019-03-28 22:17:29 +00:00
Eli Friedman	96f295e23b	[InterleavedAccessPass] Don't increase the number of bytes loaded. Even if the interleaving transform would otherwise be legal, we shouldn't introduce an interleaved load that is wider than the original load: it might have undefined behavior. It might be possible to perform some sort of mask-narrowing transform in some cases (using a narrower interleaved load, then extending the results using shufflevectors). But I haven't tried to implement that, at least for now. Fixes https://bugs.llvm.org/show_bug.cgi?id=41245 . Differential Revision: https://reviews.llvm.org/D59954 llvm-svn: 357212	2019-03-28 20:44:50 +00:00
Simon Pilgrim	ceb3de5d25	[SLP][X86] Add tests showing failure to commute icmp/fcmp by swapping predicate By swapping icmp/fcmp predicates we can commute their operands to improve vectorization llvm-svn: 357204	2019-03-28 19:13:38 +00:00
Simon Pilgrim	66b5e322fc	[SLP][X86] Add tests showing failure to commute icmp/fcmp operands Some predicates are fully commutative - we should be able to easily commute their operands to improve vectorization llvm-svn: 357202	2019-03-28 19:03:53 +00:00
Clement Courbet	699dc025a6	[X86MacroFusion] Handle branch fusion (AMD CPUs). Summary: This adds a BranchFusion feature to replace the usage of the MacroFusion for AMD CPUs. See D59688 for context. Reviewers: andreadb, lebedev.ri Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59872 llvm-svn: 357171	2019-03-28 14:12:46 +00:00
Florian Hahn	e21ed594d8	[VPlan] Determine Vector Width programmatically. With this change, the VPlan native path is triggered with the directive: #pragma clang loop vectorize(enable) There is no need to specify the vectorize_width(N) clause. Patch by Francesco Petrogalli <francesco.petrogalli@arm.com> Differential Revision: https://reviews.llvm.org/D57598 llvm-svn: 357156	2019-03-28 10:37:12 +00:00
Chandler Carruth	923ff550b9	[NewPM] Fix a nasty bug with analysis invalidation in the new PM. The issue here is that we actually allow CGSCC passes to mutate IR (and therefore invalidate analyses) outside of the current SCC. At a minimum, we need to support mutating parent and ancestor SCCs to support the ArgumentPromotion pass which rewrites all calls to a function. However, the analysis invalidation infrastructure is heavily based around not needing to invalidate the same IR-unit at multiple levels. With Loop passes for example, they don't invalidate other Loops. So we need to customize how we handle CGSCC invalidation. Doing this without gratuitously re-running analyses is even harder. I've avoided most of these by using an out-of-band preserved set to accumulate the cross-SCC invalidation, but it still isn't perfect in the case of re-visiting the same SCC repeatedly but it coming off the worklist. Unclear how important this use case really is, but I wanted to call it out. Another wrinkle is that in order for this to successfully propagate to function analyses, we have to make sure we have a proxy from the SCC to the Function level. That requires pre-creating the necessary proxy. The motivating test case now works cleanly and is added for ArgumentPromotion. Thanks for the review from Philip and Wei! Differential Revision: https://reviews.llvm.org/D59869 llvm-svn: 357137	2019-03-28 00:51:36 +00:00
Nikita Popov	7462303e06	[InstCombine] Use uadd.sat and usub.sat for canonicalization Start using the uadd.sat and usub.sat intrinsics for the existing canonicalizations. These intrinsics should optimize better than expanded IR, have better handling in the X86 backend and should be no worse than expanded IR in other backends, as far as we know. rL357012 already introduced use of uadd.sat for the add+umin pattern. Differential Revision: https://reviews.llvm.org/D58872 llvm-svn: 357103	2019-03-27 17:56:15 +00:00
Clement Courbet	f8666b0649	[X86MacroFusion][NFC] Add a bulldozer test. llvm-svn: 357099	2019-03-27 17:44:16 +00:00
Nikita Popov	7f15dd097e	[InstCombine] Add tests for ssubo X, C -> saddo X, -C; NFC Add baseline tests for canonicalization of ssubo X, C -> saddo X, -C. Patch by Dan Robertson. Differential Revision: https://reviews.llvm.org/D59653 llvm-svn: 357013	2019-03-26 18:05:43 +00:00
Sanjay Patel	81e8d76f5b	[InstCombine] form uaddsat from add+umin (PR14613) This is the last step towards solving the examples shown in: https://bugs.llvm.org/show_bug.cgi?id=14613 With this change, x86 should end up with psubus instructions when those are available. All known codegen issues with expanding the saturating intrinsics were resolved with: D59006 / rL356855 We also have some early evidence in D58872 that using the intrinsics will lead to better perf. If some target regresses from this, custom lowering of the intrinsics (as in the above for x86) may be needed. llvm-svn: 357012	2019-03-26 17:50:08 +00:00
Sanjay Patel	0dd67ed462	[InstCombine] add tests for uaddsat using min; NFC llvm-svn: 357005	2019-03-26 16:19:13 +00:00
Sanjay Patel	418ee7b7bb	[InstCombine] update tests to use FileCheck; NFC llvm-svn: 357004	2019-03-26 15:58:33 +00:00
Simon Pilgrim	6f96795b88	[SLPVectorizer] Merge reorderAltShuffleOperands into reorderInputsAccordingToOpcode As discussed on D59738, this generalizes reorderInputsAccordingToOpcode to handle multiple + non-commutative instructions so we can get rid of reorderAltShuffleOperands and make use of the extra canonicalizations that reorderInputsAccordingToOpcode brings. Differential Revision: https://reviews.llvm.org/D59784 llvm-svn: 356939	2019-03-25 20:05:27 +00:00
Simon Pilgrim	77749567a1	[SLPVectorizer] Update file missed in rL356913 Differential Revision: https://reviews.llvm.org/D59738 llvm-svn: 356915	2019-03-25 16:14:21 +00:00
Simon Pilgrim	ff3abef395	[SLPVectorizer] reorderInputsAccordingToOpcode - remove non-Instruction canonicalization Remove attempts to commute non-Instructions to the LHS - the codegen changes appear to rely on chance more than anything else and also have a tendency to fight existing instcombine canonicalization which moves constants to the RHS of commutable binary ops. This is prep work towards: (a) reusing reorderInputsAccordingToOpcode for alt-shuffles and removing the similar reorderAltShuffleOperands (b) improving reordering to optimized cases with commutable and non-commutable instructions to still find splat/consecutive ops. Differential Revision: https://reviews.llvm.org/D59738 llvm-svn: 356913	2019-03-25 15:53:55 +00:00
Simon Pilgrim	9eb0de8573	[X86][SLP] Show example of failure to uniformly commute splats for 'alt' shuffles. If either the main/alt opcodes isn't commutable we may end up with the splats not correctly commuted to the same side. llvm-svn: 356837	2019-03-23 16:14:04 +00:00
Daniel Sanders	ef8761fd3b	Fix non-determinism in Reassociate caused by address coincidences Summary: Between building the pair map and querying it there are a few places that erase and create Values. It's rare but the address of these newly created Values is occasionally the same as a just-erased Value that we already have in the pair map. These coincidences should be accounted for to avoid non-determinism. Thanks to Roman Tereshin for the test case. Reviewers: rtereshin, bogner Reviewed By: rtereshin Subscribers: mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59401 llvm-svn: 356803	2019-03-22 20:16:35 +00:00
Sanjay Patel	a0aaa11afc	[SLP] fix variable names in test; NFC 'tmpXXX' conflicts with the auto-generated script regex names. That could mask a bug or fail if the output changes. llvm-svn: 356790	2019-03-22 18:33:11 +00:00
James Y Knight	c0e6b8ac3a	IR: Support parsing numeric block ids, and emit them in textual output. Just as llvm IR supports explicitly specifying numeric value ids for instructions, and emits them by default in textual output, now do the same for blocks. This is a slightly incompatible change in the textual IR format. Previously, llvm would parse numeric labels as string names. E.g. define void @f() { br label %"55" 55: ret void } defined a label named "55", even without needing to be quoted, while the reference required quoting. Now, if you intend a block label which looks like a value number to be a name, you must quote it in the definition too (e.g. `"55":`). Previously, llvm would print nameless blocks only as a comment, and would omit it if there was no predecessor. This could cause confusion for readers of the IR, just as unnamed instructions did prior to the addition of "%5 = " syntax, back in 2008 (PR2480). Now, it will always print a label for an unnamed block, with the exception of the entry block. (IMO it may be better to print it for the entry-block as well. However, that requires updating many more tests.) Thus, the following is supported, and is the canonical printing: define i32 @f(i32, i32) { %3 = add i32 %0, %1 br label %4 4: ret i32 %3 } New test cases covering this behavior are added, and other tests updated as required. Differential Revision: https://reviews.llvm.org/D58548 llvm-svn: 356789	2019-03-22 18:27:13 +00:00
Philip Reames	d627048c07	[Tests] Add masked.gather tests for non-constant masks + speculation possibilities llvm-svn: 356782	2019-03-22 16:39:04 +00:00
Bixia Zheng	bdf0230cff	[ConstantFolding] Fix GetConstantFoldFPValue to avoid cast overflow. Summary: In C++, the behavior of casting a double value that is beyond the range of a single precision floating-point to a float value is undefined. This change replaces such a cast with APFloat::convert to convert the value, which is consistent with how we convert a double value to a half value. Reviewers: sanjoy Subscribers: lebedev.ri, sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59500 llvm-svn: 356781	2019-03-22 16:37:37 +00:00
Philip Reames	f032e85d64	[tests] Add a generic masked.gather test to show sometimes we can't transform llvm-svn: 356779	2019-03-22 16:30:56 +00:00
Philip Reames	e234fd6118	[tests] Add tests for converting masked.load to load speculatively llvm-svn: 356778	2019-03-22 16:26:57 +00:00
Philip Reames	4a518c7055	[Tests] Use valid alignment in masked.gather tests llvm-svn: 356775	2019-03-22 16:20:24 +00:00
Tim Renouf	94c163c34e	InstCombineSimplifyDemanded: Allow v3 results for AMDGCN buffer and image intrinsics This helps to avoid the situation where RA spots that only 3 of the v4f32 result of a load are used, and immediately reallocates the 4th register for something else, requiring a stall waiting for the load. Differential Revision: https://reviews.llvm.org/D58906 Change-Id: I947661edfd5715f62361a02b100f14aeeada29aa llvm-svn: 356768	2019-03-22 15:53:50 +00:00
Dinar Temirbulatov	f95351b918	[SLPVectorizer] Add test related to SLP Throttling support, NFCI. llvm-svn: 356754	2019-03-22 14:50:53 +00:00
Nikita Popov	b86576a5b9	[InstSimplify] Add tests for signed icmp of and/or; NFC Even if a signed predicate is used, the ranges computed for and/or are unsigned, resulting in missed simplifications. llvm-svn: 356720	2019-03-21 21:13:08 +00:00
Akira Hatanaka	b576c77a9e	Don't add a tail keyword to calls to ObjC runtime functions if the calls are annotated with notail. r356705 annotated calls to objc_retainAutoreleasedReturnValue with notail on x86-64. This commit teaches ARC optimizer to check the notail marker on the call before turning it into a tail call. rdar://problem/38675807 llvm-svn: 356707	2019-03-21 20:16:09 +00:00
Craig Topper	16dc165046	[InstCombine] Don't transform ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) if either zext or OP has another use. If they have other users we'll just end up increasing the instruction count. We might be able to weaken this to only one of them having a single use if we can prove that the and will be removed. Fixes PR41164. Differential Revision: https://reviews.llvm.org/D59630 llvm-svn: 356690	2019-03-21 17:50:49 +00:00
Craig Topper	9f0b17a248	[ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics. This adds support for scalarizing these intrinsics as well as the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle. I've omitted handling special cases for constant masks for this first pass. Though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway. Fixes PR40994 and covers most of PR3666. Might want to implement constant masks to close that. Differential Revision: https://reviews.llvm.org/D59180 llvm-svn: 356687	2019-03-21 17:38:52 +00:00
Nikita Popov	3af5b28f47	[ValueTracking] Use ConstantRange based overflow check for signed sub This is D59450, but for signed sub. This case is not NFC, because the overflow logic in ConstantRange is more powerful than the existing check. This resolves the TODO in the function. I've added two tests to show that this indeed catches more cases than the previous logic, but the main correctness test coverage here is in the existing ConstantRange unit tests. Differential Revision: https://reviews.llvm.org/D59617 llvm-svn: 356685	2019-03-21 17:23:51 +00:00


12759 Commits