llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	5b32f60ec3	Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst" This reverts commit `53f2f32865`. As reported on D62126, this causes assertion failures if the switch has incorrect branch_weights metadata, which may happen as a result of other transforms not handling it correctly yet. llvm-svn: 361881	2019-05-28 21:28:24 +00:00
Nikita Popov	2941eb6864	[InstCombine] Add tests for signed saturating always overflow; NFC llvm-svn: 361864	2019-05-28 18:59:28 +00:00
Simon Tatham	760df47b77	[ARM] Replace fp-only-sp and d16 with fp64 and d32. Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845	2019-05-28 16:13:20 +00:00
Hans Wennborg	d936e40575	Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" This was reverted in r360086 as it was suspected of causing mysterious test failures internally. However, it was never concluded that this patch was the root cause. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 361811	2019-05-28 12:19:38 +00:00
Yevgeny Rouban	53f2f32865	[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst This patch fixes the CorrelatedValuePropagation pass to keep prof branch_weights metadata of SwitchInst consistent. It makes use of SwitchInstProfUpdateWrapper. New tests are added. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D62126 llvm-svn: 361808	2019-05-28 11:33:50 +00:00
Simon Pilgrim	48c8bdad2a	[SLPVectorizer][X86] Add broadcast test case from D62427 llvm-svn: 361805	2019-05-28 11:10:56 +00:00
Florian Hahn	11b2f4fe50	[LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches. The code to preserve LCSSA PHIs currently only properly supports reduction PHIs and PHIs for values defined outside the latches. This patch improves the LCSSA PHI handling to cover PHIs for values defined in the latches. Fixes PR41725. Reviewers: efriedma, mcrosier, davide, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61576 llvm-svn: 361743	2019-05-26 23:38:25 +00:00
Shawn Landden	343578759e	[SimplifyCFG] back out all SwitchInst commits They caused the sanitizer builds to fail. My suspicion is the change to countLeadingZeros(). llvm-svn: 361736	2019-05-26 18:15:51 +00:00
Shawn Landden	7b883b7ed0	[SimplifyCFG] NFC, one more fixed test from previous push. The old test was checking for a stupid subtract one that is a transform that makes the code worse. The constant-islands-jump-table.ll test wants the code a specific way, that makes sense, so I will submit code to fix that one. Sorry that I really didn't know how to run the test suite before this. llvm-svn: 361733	2019-05-26 15:29:10 +00:00
Shawn Landden	927fe7328d	[SimplifyCFG] NFC, fix failing tests from last patches. No problems with the transforms. llvm-svn: 361730	2019-05-26 14:44:14 +00:00
Sanjay Patel	9317963920	[InstCombine] prevent crashing with invalid extractelement index This was found/reduced from a fuzzer report: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956 llvm-svn: 361729	2019-05-26 14:03:50 +00:00
Shawn Landden	fa91ab85d9	[SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger llvm-svn: 361728	2019-05-26 13:55:52 +00:00
Shawn Landden	30111c786f	[SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize Rather than gating on "isSwitchDense" (resulting in necessarily sparse lookup tables even when they were generated), always run this quite cheap transform. This transform is useful not just for generating tables. LowerSwitch also wants this: read LowerSwitch.cpp:257. Be careful to not generate worse code, by introducing a SubThreshold heuristic. Instead of just sorting by signed, generalize the finding of the best base. And now that it is run unconditionally, do not replicate its functionality in SwitchToLookupTable (which could use a Sub when having a hole is smaller, hence the SubThreshold heuristic located in a single place). This simplifies SwitchToLookupTable, and fixes some ugly corner cases due to the use of signed numbers, such as a table containing i16 32768 and 32769, of which 32769 would be interpreted as -32768, and now the code thinks the table is size 65536. (We still use unconditional subtraction when building a single-register mask, but I think this whole block should go when the more general sparse map is added, which doesn't leave empty holes in the table.) And the reason test4 and test5 did not trigger was documented wrong: it was because they were not considered sufficiently "dense". Also, fix generation of invalid LLVM-IR: shl by bit-width. llvm-svn: 361727	2019-05-26 13:55:14 +00:00
Shawn Landden	50c73a044f	[SimplifyCFG] NFC, update Switch tests to HEAD so I can see if my changes change anything Also add baseline tests to show effect of later patches. llvm-svn: 361725	2019-05-26 13:52:41 +00:00
David Bolvansky	0290a77aa8	[SimplifyCFG] Added condition assumption for unreachable blocks Summary: PR41688 Reviewers: spatel, efriedma, craig.topper, hfinkel, reames Reviewed By: hfinkel Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61409 llvm-svn: 361707	2019-05-25 22:34:27 +00:00
Nikita Popov	6bb5041e94	[LVI][CVP] Add support for saturating add/sub Adds support for the uadd.sat family of intrinsics in LVI, based on ConstantRange methods from D60946. Differential Revision: https://reviews.llvm.org/D62447 llvm-svn: 361703	2019-05-25 16:44:14 +00:00
Nikita Popov	3c7edb2de5	[LoopVectorize] Fix test by regenerating checks llvm-svn: 361699	2019-05-25 14:33:30 +00:00
David Bolvansky	2149811854	[NFC] Make tests more robust for new optimizations llvm-svn: 361697	2019-05-25 14:10:20 +00:00
David Bolvansky	bb76cf0f96	[NFC] Update test checks llvm-svn: 361695	2019-05-25 13:11:22 +00:00
Nikita Popov	9a33dc9fb8	[CVP] Add tests for saturating add/sub ranges; NFC llvm-svn: 361694	2019-05-25 09:53:51 +00:00
Nikita Popov	024b18aca7	[LVI][CVP] Calculate with.overflow result range In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0) as the range of op(%x, %y). This is mainly useful in conjunction with D60650: If the result of the operation is extracted in a branch guarded against overflow, then the value of %x will be appropriately constrained and the result range of the operation will be calculated taking that into account. Differential Revision: https://reviews.llvm.org/D60656 llvm-svn: 361693	2019-05-25 09:53:45 +00:00
Craig Topper	46e5052b8e	[X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. Support LEA64_32r properly. INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags. This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg. One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think its important. Not sure which operand its supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input. Differential Revision: https://reviews.llvm.org/D61472 llvm-svn: 361691	2019-05-25 06:17:47 +00:00
Matt Arsenault	0ff901fba0	AMDGPU: Boost inline threshold with addrspacecasted alloca arguments This was skipping GetUnderlyingObject for nonprivate addresses, but an alloca could also be found through an addrspacecast if it's flat. llvm-svn: 361649	2019-05-24 16:52:35 +00:00
Sanjay Patel	6f7734a125	[LoopVectorize] update test to be independent of instcombine; NFC This is a regression test for vectorization, so remove instcombine from the RUN line and adjust the comparison predicates to show what the vectorizer is creating rather than how instcombine cleans it up. llvm-svn: 361648	2019-05-24 16:46:09 +00:00
Neil Henning	119c31ad93	StructurizeCFG: Relax uniformity checks. This change relaxes the checks for hasOnlyUniformBranches such that our region is uniform if: 1. All conditional branches that are direct children are uniform. 2. And either: a. All sub-regions are uniform. b. There is one or less conditional branches among the direct children. Differential Revision: https://reviews.llvm.org/D62198 llvm-svn: 361610	2019-05-24 08:59:17 +00:00
Bjorn Pettersson	d63a2bb35f	[DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores Summary: The DeadStoreElimination pass now skips doing PartialStoreMerging when stores overlap according to OW_PartialEarlierWithFullLater and at least one of the stores is having a store size that is different from the size of the type being stored. This solves problems seen in https://bugs.llvm.org/show_bug.cgi?id=41949 for which we in the past could end up with mis-compiles or assertions. The content and location of the padding bits is not formally described (or undefined) in the LangRef at the moment. So the solution is chosen based on that we cannot assume anything about the padding bits when having a store that clobbers more memory than indicated by the type of the value that is stored (such as storing an i6 using an 8-bit store instruction). Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949 Reviewers: spatel, efriedma, fhahn Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62250 llvm-svn: 361605	2019-05-24 08:32:02 +00:00
Eli Friedman	052f87ae36	Revert r361460 It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll). llvm-svn: 361581	2019-05-24 01:03:51 +00:00
Sanjay Patel	8869a98e82	[InstSimplify] fold insertelement-of-extractelement This was partly handled in InstCombine (only the constant index case), so delete that and zap it more generally in InstSimplify. llvm-svn: 361576	2019-05-24 00:13:58 +00:00
Sanjay Patel	3e15f83381	[InstSimplify] add tests for insert-of-extract; NFC llvm-svn: 361575	2019-05-24 00:11:23 +00:00
Sanjay Patel	e60cb7d1be	[InstSimplify] insertelement V, undef, ? --> V This was part of InstCombine, but it's better placed in InstSimplify. InstCombine also had an unreachable but weaker fold for insertelement with undef index, so that is deleted. llvm-svn: 361559	2019-05-23 21:49:47 +00:00
Sanjay Patel	3249be1e03	[InstCombine] be more careful when transforming a shuffle mask This is reduced from a fuzzer test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890 Usually, demanded elements should be able to simplify shuffle mask elements that are pointing to undef elements of its source operands, but that doesn't happen in the test case. llvm-svn: 361533	2019-05-23 18:46:03 +00:00
Saleem Abdulrasool	7bbefb13ee	Transforms: lower fadd and fsub atomicrmw instructions `fadd` and `fsub` have recently (r351850) been added as `atomicrmw` operations. This diff adds lowering cases for them to the LowerAtomic transform. Patch by Josh Berdine! llvm-svn: 361512	2019-05-23 17:03:43 +00:00
Cameron McInally	1312225f8c	[NFC][InstCombine] Add unary FNeg tests to maximum.ll/minimum.ll llvm-svn: 361500	2019-05-23 14:53:42 +00:00
Clement Courbet	43882b16a3	[MergeICmps] Make the pass compatible with the new pass manager. Reviewers: gchatelet, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62287 llvm-svn: 361490	2019-05-23 12:35:26 +00:00
Christian Bruel	4a7da98bd9	[GlobalOpt] recognize dead struct fields and propagate values Summary: Allow struct fields SRA and dead stores. This works by considering fields accesses from getElementPtr to be considered as a possible pointer root that can be cleaned up. We check that the variable can be SRA by recursively checking the sub expressions with the new isSafeSubSROAGEP function. basically this allows the array in following C code to be optimized out struct Expr { int a[2]; int b; }; static struct Expr e; int foo (int i) { e.b = 2; e.a[i] = 1; return e.b; } Reviewers: greened, bkramer, nicholas, jmolloy Reviewed By: jmolloy Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61911 llvm-svn: 361460	2019-05-23 05:53:10 +00:00
Craig Topper	9816d55776	[X86][InstCombine] Remove InstCombine code that turns X86 round intrinsics into llvm.ceil/floor. Remove some isel patterns that existed because that was happening. We were turning roundss/sd/ps/pd intrinsics with immediates of 1 or 2 into llvm.floor/ceil. The llvm.ceil/floor intrinsics are supposed to correspond to the libm functions. For the libm functions we need to disable the precision exception so the llvm.floor/ceil functions should always map to encodings 0x9 and 0xA. We had a mix of isel patterns where some used 0x9 and 0xA and others used 0x1 and 0x2. We need to be consistent and always use 0x9 and 0xA. Since we have no way in isel of knowing where the llvm.ceil/floor came from, we can't map X86 specific intrinsics with encodings 1 or 2 to it. We could map 0x9 and 0xA to llvm.ceil/floor instead, but I'd really like to see a use case and optimization advantage first. I've left the backend test cases to show the blend we now emit without the extra isel patterns. But I've removed the InstCombine tests completely. llvm-svn: 361425	2019-05-22 20:04:55 +00:00
Hiroshi Yamauchi	dfeb797455	[PGO][CHR] Speed up following long use-def chains. Summary: Avoid visiting an instruction more than once by using a map. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62262 llvm-svn: 361416	2019-05-22 18:37:34 +00:00
Cameron McInally	adea0b6b40	[NFC][InstCombine] Add unary fneg tests to maxnum.ll/minnum.ll llvm-svn: 361415	2019-05-22 18:27:43 +00:00
Sanjay Patel	5a4f7cf2ff	[IR] allow fast-math-flags on select of FP values This is a minimal start to correcting a problem most directly discussed in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 We have been hacking around a limitation for FP select patterns by using the fast-math-flags on the condition of the select rather than the select itself. This patch just allows FMF to appear with the 'select' opcode. No changes are needed to "FPMathOperator" because it already includes select-of-FP because that definition is based on the (return) value type. Once we have this ability, we can start correcting and adding IR transforms to use the FMF on a 'select' instruction. The instcombine and vectorizer test diffs only show that the IRBuilder change is behaving as expected by applying an FMF guard value to 'select'. For reference: rL241901 - allowed FMF with fcmp rL255555 - allowed FMF with FP calls Differential Revision: https://reviews.llvm.org/D61917 llvm-svn: 361401	2019-05-22 15:50:46 +00:00
Sanjay Patel	6a554188aa	[InstCombine] fold shuffles of insert_subvectors This should be a valid exception to the general rule of not creating new shuffle masks in IR... because we already do it. :) Also, DAG combining/legalization will undo this by widening the shuffle back out if needed. Explanation for how we already do this: SLP or vector source can create chains of insert/extract as shown in 1 of the examples from PR16739: https://godbolt.org/z/NlK7rA https://bugs.llvm.org/show_bug.cgi?id=16739 And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles. Differential Revision: https://reviews.llvm.org/D62024 llvm-svn: 361338	2019-05-22 00:32:25 +00:00
Sanjay Patel	3590bae8d6	[InstCombine] add more tests for shuffle folding; NFC As discussed in D62024, we want to limit any potential IR transforms of shuffles to cases where we know the SDAG conversion would result in equivalent patterns for these IR variants. llvm-svn: 361317	2019-05-21 21:45:24 +00:00
Cameron McInally	17fdf1d383	[NFC][InstCombine] Add unary fneg tests to operand-complexity.ll. llvm-svn: 361311	2019-05-21 21:07:46 +00:00
Cameron McInally	872dc79f20	[NFC][InstCombine] Add unary FNeg tests to X86/x86-avx512.ll llvm-svn: 361308	2019-05-21 20:31:09 +00:00
Bob Haarman	032f87bbb3	Revert r360902 "Resubmit: [Salvage] Change salvage debug info ..." This reverts commit r360902. It caused an assertion failure in lib/IR/DebugInfoMetadata.cpp: Assertion `(OffsetInBits + SizeInBits <= FragmentSizeInBits) && "new fragment outside of original fragment"' failed. PR41931. llvm-svn: 361246	2019-05-21 11:53:41 +00:00
Clement Courbet	a95d95d392	[MergeICmps] Preserve the dominator tree. Summary: In preparation for D60318 . Reviewers: gchatelet, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62068 llvm-svn: 361239	2019-05-21 11:02:23 +00:00
Nikita Popov	e1d38ec811	[LFTR] Add additional PR31181 test cases One case where overflow happens in the first loop iteration, and two cases where we switch to a dynamically dead IV with post/pre increment, respectively. llvm-svn: 361189	2019-05-20 19:13:04 +00:00
Cameron McInally	2557ca296a	[InstCombine] Add visitFNeg(...) visitor for unary Fneg Also, break out a helper function, namely foldFNegIntoConstant(...), which performs transforms common between visitFNeg(...) and visitFSub(...). Differential Revision: https://reviews.llvm.org/D61693 llvm-svn: 361188	2019-05-20 19:10:30 +00:00
Sanjay Patel	d91f1dd470	[InstCombine] auto-generate test checks; NFC llvm-svn: 361181	2019-05-20 17:52:22 +00:00
Nick Desaulniers	639b29b1b5	[INLINER] allow inlining of blockaddresses if sole uses are callbrs Summary: It was supposed that Ref LazyCallGraph::Edge's were being inserted by inlining, but that doesn't seem to be the case. Instead, it seems that there was no test for a blockaddress Constant in an instruction that referenced the function that contained the instruction. Ex: ``` define void @f() { %1 = alloca i8, align 8 2: store i8 blockaddress(@f, %2), i8** %1, align 8 ret void } ``` When iterating blockaddresses, do not add the function they refer to back to the worklist if the blockaddress is referring to the contained function (as opposed to an external function). Because blockaddress has slightly different semantics than GNU C's address of labels, there are 3 cases that can occur with blockaddress, where only 1 can happen in GNU C due to C's scoping rules: * blockaddress is within the function it refers to (possible in GNU C). * blockaddress is within a different function than the one it refers to (not possible in GNU C). * blockaddress is used to declare a global (not possible in GNU C). The second case is tested in: ``` $ ./llvm/build/unittests/Analysis/AnalysisTests \ --gtest_filter=LazyCallGraphTest.HandleBlockAddress ``` This patch adjusts the iteration of blockaddresses in LazyCallGraph::visitReferences to not revisit the blockaddress's function in the first case. The Linux kernel contains code that's not semantically valid at -O0; specifically code passed to asm goto. It requires that asm goto be inline-able. This patch conservatively does not attempt to handle the more general case of inlining blockaddresses that have non-callbr users (pr/39560). 
https://bugs.llvm.org/show_bug.cgi?id=39560 https://bugs.llvm.org/show_bug.cgi?id=40722 https://github.com/ClangBuiltLinux/linux/issues/6 https://reviews.llvm.org/rL212077 Reviewers: jyknight, eli.friedman, chandlerc Reviewed By: chandlerc Subscribers: george.burgess.iv, nathanchance, mgorny, craig.topper, mengxu.gatech, void, mehdi_amini, E5ten, chandlerc, efriedma, eraman, hiraditya, haicheng, pirama, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D58260 llvm-svn: 361173	2019-05-20 16:48:09 +00:00
Cameron McInally	2d2a46db8e	[InstSimplify] Teach fsub -0.0, (fneg X) ==> X about unary fneg Differential Revision: https://reviews.llvm.org/D62077 llvm-svn: 361151	2019-05-20 13:13:35 +00:00
Orlando Cazalet-Hyams	ed67bf8d2f	Resubmit "[DebugInfo] Update loop metadata for inlined loops" This reverts commit `95805bc425`. I've squashed the test fix into this commit. [DebugInfo] Update loop metadata for inlined loops Currently, when a loop is cloned while inlining function (A) into function (B) the loop metadata is copied and then not modified at all. The loop metadata can encode the loop's start and end DILocations. Therefore, the new inlined loop in function (B) may have loop metadata which shows start and end locations residing in function (A). This patch ensures loop metadata is updated while inlining so that the start and end DILocations are given the "inlinedAt" operand. I've also added a regression test for this. This fix is required for D60831 because that patch uses loop metadata to determine the DILocation for the branches of new loop preheaders. Reviewers: aprantl, dblaikie, anemet Reviewed By: aprantl Subscribers: eraman, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D61933 llvm-svn: 361149	2019-05-20 13:02:30 +00:00
Orlando Cazalet-Hyams	95805bc425	Revert "[DebugInfo] Update loop metadata for inlined loops" This reverts commit `6e8f1a80cd`. Reverting patch while investigating build bot failure. llvm-svn: 361143	2019-05-20 11:24:39 +00:00
Orlando Cazalet-Hyams	6e8f1a80cd	[DebugInfo] Update loop metadata for inlined loops Summary: Currently, when a loop is cloned while inlining function (A) into function (B) the loop metadata is copied and then not modified at all. The loop metadata can encode the loop's start and end DILocations. Therefore, the new inlined loop in function (B) may have loop metadata which shows start and end locations residing in function (A). This patch ensures loop metadata is updated while inlining so that the start and end DILocations are given the "inlinedAt" operand. I've also added a regression test for this. This fix is required for D60831 because that patch uses loop metadata to determine the DILocation for the branches of new loop preheaders. Reviewers: aprantl, dblaikie, anemet Reviewed By: aprantl Subscribers: eraman, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D61933 llvm-svn: 361132	2019-05-20 09:40:44 +00:00
Sanjay Patel	9ef99b4b11	[InstSimplify] fold fcmp (maxnum, X, C1), C2 This is the sibling transform for rL360899 (D61691): maxnum(X, GreaterC) == C --> false maxnum(X, GreaterC) <= C --> false maxnum(X, GreaterC) < C --> false maxnum(X, GreaterC) >= C --> true maxnum(X, GreaterC) > C --> true maxnum(X, GreaterC) != C --> true llvm-svn: 361118	2019-05-19 14:26:39 +00:00
Matt Arsenault	b04f3258dd	GVN: Handle addrspacecast llvm-svn: 361103	2019-05-18 14:36:06 +00:00
Cameron McInally	12de5425c1	[NFC][InstSimplify] Add more unary fneg tests to floating-point-arithmetic.ll llvm-svn: 361076	2019-05-17 21:10:11 +00:00
Cameron McInally	bebc7d6a4e	[NFC][InstSimplify] Precommit new unary fneg test llvm-svn: 361060	2019-05-17 18:34:35 +00:00
Sanjay Patel	926e47751b	[InstCombine] move bitcast after insertelement-with-bitcasted-operands llvm-svn: 361058	2019-05-17 18:06:12 +00:00
Cameron McInally	19dc8c7280	[NFC][InstSimplify] Fix flip-flopped comments and test names In test/Transforms/InstSimplify/floating-point-arithmetic.ll Differential Revision: https://reviews.llvm.org/D62069 llvm-svn: 361057	2019-05-17 17:59:17 +00:00
Sanjay Patel	c05d85104d	[InstCombine] add tests for insertelement with bitcasted operands; NFC llvm-svn: 361051	2019-05-17 17:23:13 +00:00
Cameron McInally	067e946859	[InstSimplify] Add unary fneg to `fsub 0.0, (fneg X) ==> X` transform Differential Revision: https://reviews.llvm.org/D62013 llvm-svn: 361047	2019-05-17 16:47:00 +00:00
Roman Lebedev	3275060fe8	[InstCombine] canShiftBinOpWithConstantRHS(): drop bogus signbit check Summary: In D61918 I was looking at dropping it in DAGCombiner `visitShiftByConstant()`, but as @craig.topper pointed out, it was copied from here. That check claims that the transform is illegal otherwise. That isn't true: 1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter https://rise4fun.com/Alive/K4A 2. For `ISD::AND`, there is no restriction on constants: https://rise4fun.com/Alive/Wy3 3. For `ISD::OR`, there is no restriction on constants: https://rise4fun.com/Alive/GOH 4. For `ISD::XOR`, there is no restriction on constants: https://rise4fun.com/Alive/ml6 So, why is it there then? As far as I can tell, it dates all the way back to original check-in rL7793. I think we should just drop it. Reviewers: spatel, craig.topper, efriedma, majnemer Reviewed By: spatel Subscribers: llvm-commits, craig.topper Tags: #llvm Differential Revision: https://reviews.llvm.org/D61938 llvm-svn: 361043	2019-05-17 15:52:49 +00:00
Clement Courbet	632dfdda16	Re-land r360859: "[MergeICmps] Simplify the code." With a fix for PR41917: The predecessor list was changing under our feet. - for (BasicBlock *Pred : predecessors(EntryBlock_)) { + while (!pred_empty(EntryBlock_)) { + BasicBlock* const Pred = *pred_begin(EntryBlock_); llvm-svn: 361009	2019-05-17 09:43:45 +00:00
Clement Courbet	580ff1e72a	[MergeICmps] Add test from PR41917. llvm-svn: 361001	2019-05-17 08:52:25 +00:00
Nico Weber	d764e7c660	Revert r360859: "Reland r360771 "[MergeICmps] Simplify the code."" It caused PR41917. llvm-svn: 360963	2019-05-17 00:43:53 +00:00
Philip Reames	f0a0e8bb36	[Tests] Consolidate more lftr tests These are all of the ones involving the same data layout string. Remainder take a bit more consideration, but at least everything can be auto-updated now. llvm-svn: 360961	2019-05-17 00:19:28 +00:00
Philip Reames	087a30d527	[Tests] Expand basic lftr coverage Newly written tests to cover the simple cases. We don't appear to have broad coverage of this transform anywhere. llvm-svn: 360957	2019-05-16 23:41:28 +00:00
Philip Reames	e7b680478c	[Tests] More consolidation of lftr tests llvm-svn: 360936	2019-05-16 20:42:00 +00:00
Philip Reames	c37a86d479	[Test] Remove a bunch of cruft from a test This test hadn't been fully reduced, so do so. llvm-svn: 360935	2019-05-16 20:37:20 +00:00
Philip Reames	fb70fbaba4	[Tests] Start consolidating lftr tests into a single file llvm-svn: 360934	2019-05-16 20:33:41 +00:00
Philip Reames	c8783798f4	[Tests] Autogen the last lftr test llvm-svn: 360933	2019-05-16 20:24:57 +00:00
Philip Reames	082ec7a784	[Tests] Autogen a few more lftr tests for readability llvm-svn: 360932	2019-05-16 20:19:02 +00:00
Philip Reames	12a8ea9876	[Tests] Autogen a few lftr test in preparation for merging llvm-svn: 360931	2019-05-16 20:15:25 +00:00
Cameron McInally	f637bb6ebd	[NFC][InstSimplify] Update fast-math.ll tests I botched in r360808. These were new tests I added in r360808. I made a mistake while converting the existing binary FNeg test into the new unary FNeg tests. Correct that. llvm-svn: 360928	2019-05-16 19:00:57 +00:00
Sanjay Patel	649bffccca	[InstCombine] add tests for shuffle of insert subvectors; NFC llvm-svn: 360923	2019-05-16 18:09:47 +00:00
Sanjay Patel	3413035477	[InstSimplify] add tests for fcmp of maxnum with constants; NFC Sibling tests for rL360899 (D61691). llvm-svn: 360905	2019-05-16 15:00:11 +00:00
Matt Arsenault	df24c92c0f	AMDGPU: Assume xnack is enabled by default This is the conservatively correct default. It is always safe to assume xnack is enabled, but not the converse. Introduce a feature to blacklist targets where xnack can never be meaningfully enabled. I'm not sure the targets this is applied to is 100% correct. llvm-svn: 360903	2019-05-16 14:48:34 +00:00
Stephen Tozer	6f59b4b6d9	Resubmit: [Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=40645 Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. With the recent addition of DW_OP_LLVM_convert this salvaging is now possible, and so can be used to fix the attached bug as well as any cases where SExt instruction results are lost in the debugging metadata. This patch introduces this fix by expanding the salvage debug info method to cover these cases using the new operator. Differential revision: https://reviews.llvm.org/D61184 llvm-svn: 360902	2019-05-16 14:41:01 +00:00
Sanjay Patel	152f81fae8	[InstSimplify] fold fcmp (minnum, X, C1), C2 minnum(X, LesserC) == C --> false minnum(X, LesserC) >= C --> false minnum(X, LesserC) > C --> false minnum(X, LesserC) != C --> true minnum(X, LesserC) <= C --> true minnum(X, LesserC) < C --> true maxnum siblings will follow if there are no problems here. We should be able to perform some other combines when the constants are equal or greater-than too, but that would go in instcombine. We might also generalize this by creating an FP ConstantRange (similar to what we do for integers). Differential Revision: https://reviews.llvm.org/D61691 llvm-svn: 360899	2019-05-16 14:03:10 +00:00
Clement Courbet	c4fdd717ef	Reland r360771 "[MergeICmps] Simplify the code." This revision does not seem to be the culprit. llvm-svn: 360859	2019-05-16 06:18:02 +00:00
Yevgeny Rouban	bf6df042a5	Fix prof branch_weights in entry_counts_missing_dbginfo.ll test Removed extra parameter from !prof branch_weights metadata of a call instruction according to the spec. Differential Revision: https://reviews.llvm.org/D61932 llvm-svn: 360843	2019-05-16 03:39:09 +00:00
Roman Lebedev	4b77a6a55e	[NFC][InstCombine] Add some more tests for pulling binops through shifts The ashr variant may see relaxation in https://reviews.llvm.org/D61938 llvm-svn: 360814	2019-05-15 21:15:44 +00:00
Cameron McInally	14a90661f8	Revert llvm-svn: 360807 Somehow submitted this patch twice. llvm-svn: 360812	2019-05-15 20:48:50 +00:00
Cameron McInally	b8df789ff3	Pre-commit unary fneg tests to InstSimplify llvm-svn: 360808	2019-05-15 20:27:37 +00:00
Cameron McInally	94f16bfaba	Add unary fneg to InstSimplify/fp-nan.ll llvm-svn: 360807	2019-05-15 20:27:35 +00:00
Cameron McInally	a4d29b8e20	Add unary fneg to InstSimplify/fp-nan.ll llvm-svn: 360797	2019-05-15 19:37:03 +00:00
Taewook Oh	9d020de3e8	[PredicateInfo] Do not process unreachable operands. Summary: We should exclude unreachable operands from processing as their DFS visitation order is undefined. When the `renameUses` function sorts `OpsToRename` (https://fburl.com/d2wubn60), the comparator assumes that the parent block of the operand has a corresponding dominator tree node. This is not the case for unreachable operands and crashes the compiler. Reviewers: dberlin, mgrang, davide Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61154 llvm-svn: 360796	2019-05-15 19:35:38 +00:00
Hiroshi Yamauchi	7dfd087a9a	[JumpThreading] A bug fix for stale loop info after unfold select Summary: The return value of a TryToUnfoldSelect call was not checked, which led to an incorrectly preserved loop info and some crash. The original crash was reported on https://reviews.llvm.org/D59514. Reviewers: davidxl, amehsan Reviewed By: davidxl Subscribers: fhahn, brzycki, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61920 llvm-svn: 360780	2019-05-15 15:15:16 +00:00
Cameron McInally	0c82d9b5a2	Teach InstSimplify -X + X --> 0.0 about unary FNeg Differential Revision: https://reviews.llvm.org/D61916 llvm-svn: 360777	2019-05-15 14:31:33 +00:00
Clement Courbet	eaf4413d2d	Revert r360771 "[MergeICmps] Simplify the code." Breaks a bunch of buildbots. llvm-svn: 360776	2019-05-15 14:21:59 +00:00
Stephen Tozer	0d02f2ff4f	Revert "[Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed" This reverts r360772 due to build issues. Reverted commit: `17dd4d7403`. llvm-svn: 360773	2019-05-15 13:41:44 +00:00
Stephen Tozer	17dd4d7403	[Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=40645 Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. With the recent addition of DW_OP_LLVM_convert this salvaging is now possible, and so can be used to fix the attached bug as well as any cases where SExt instruction results are lost in the debugging metadata. This patch introduces this fix by expanding the salvage debug info method to cover these cases using the new operator. Differential revision: https://reviews.llvm.org/D61184 llvm-svn: 360772	2019-05-15 13:15:48 +00:00
Clement Courbet	157ae639fa	[MergeICmps] Simplify the code. Instead of patching the original blocks, we now generate new blocks and delete the old blocks. This results in simpler code with a less twisted control flow (see the change in `entry-block-shuffled.ll`). This will make https://reviews.llvm.org/D60318 simpler by making it more obvious where control flow is created and deleted. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits, spatel Tags: #llvm Differential Revision: https://reviews.llvm.org/D61736 llvm-svn: 360771	2019-05-15 13:04:24 +00:00
Roman Lebedev	da08fae397	[NFC][InstCombine] Regenerate trunc.ll test llvm-svn: 360759	2019-05-15 10:24:38 +00:00
Fangrui Song	5296e2809f	Fix 2-field llvm.global_ctors `REQUIRES: asserts` tests after rL360742 llvm-svn: 360743	2019-05-15 03:08:21 +00:00
Fangrui Song	f4dfd63c74	[IR] Disallow llvm.global_ctors and llvm.global_dtors of the 2-field form in textual format The 3-field form was introduced by D3499 in 2014 and the legacy 2-field form was planned to be removed in LLVM 4.0. For the textual format, this patch migrates the existing 2-field form to use the 3-field form and deletes the compatibility code. test/Verifier/global-ctors-2.ll checks we have a friendly error message. For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the 2-field form (add i8* null as the third field). Reviewed By: rnk, dexonsmith Differential Revision: https://reviews.llvm.org/D61547 llvm-svn: 360742	2019-05-15 02:35:32 +00:00
Florian Hahn	53c9d585b5	[LICM] Allow AliasSetMap to contain top-level loops. When an outer loop gets deleted by a different pass, before LICM visits it, we cannot clean up its sub-loops in AliasSetMap, because at the point we receive the deleteAnalysisLoop callback for the outer loop, the loop object is already invalid and we cannot access its sub-loops any longer. Reviewers: asbirlea, sanjoy, chandlerc Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D61904 llvm-svn: 360704	2019-05-14 19:41:36 +00:00
Nikita Popov	48c4e4fa80	[LVI][CVP] Add support for abs/nabs select pattern flavor Based on ConstantRange support added in D61084, we can now handle abs and nabs select pattern flavors in LVI. Differential Revision: https://reviews.llvm.org/D61794 llvm-svn: 360700	2019-05-14 18:53:47 +00:00
Philip Reames	bd8d309111	[IndVars] Extend reasoning about loop invariant exits to non-header blocks Noticed while glancing through the code for other reasons. The extension is trivial enough, decided to just do it. llvm-svn: 360694	2019-05-14 17:20:10 +00:00
Cameron McInally	7c5c0c9fe5	Support FNeg in SpeculativeExecution pass Differential Revision: https://reviews.llvm.org/D61910 llvm-svn: 360692	2019-05-14 16:51:18 +00:00
Philip Reames	bbe4ff10df	[Test] Autogen a test for ease of later changing llvm-svn: 360690	2019-05-14 16:37:29 +00:00
Tim Northover	ed9117f88d	GlobalOpt: do not promote globals used atomically to constants. Some atomic loads are implemented as cmpxchg (particularly if large or floating), and that usually requires write access to the memory involved or it will segfault. We can still propagate the constant value to users we understand though. llvm-svn: 360662	2019-05-14 11:03:13 +00:00
Gor Nishanov	d64455cd43	[coroutines] Fix spills of static array allocas Summary: CoroFrame was not considering static array allocas, and was only ever reserving a single element in the coroutine frame. This meant that stores to the non-zero'th element would corrupt later frame data. Store static array allocas as field arrays in the coroutine frame. Added test. Committed by Gor Nishanov on behalf of ben-clayton Reviewers: GorNishanov, modocache Reviewed By: GorNishanov Subscribers: Orlando, capn, EricWF, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61372 llvm-svn: 360636	2019-05-13 23:58:24 +00:00
Nemanja Ivanovic	1d662316cb	[Pass Pipeline][NFC] Add a test prior to committing D61726 This patch just adds a test case to show the differences in code emitted by opt before and after https://reviews.llvm.org/D61726. Previous attempt to commit this did not include the registered target requirement so it caused buildbot breaks. llvm-svn: 360620	2019-05-13 21:14:36 +00:00
Sanjay Patel	760f61ab36	[InstCombine] try harder to form rotate (funnel shift) (PR20750) We have a similar match for patterns ending in a truncate. This should be ok for all targets because the default expansion would still likely be better from replacing 2 'and' ops with 1. Attempt to show the logic equivalence in Alive (which doesn't currently have funnel-shift in its vocabulary AFAICT): %shamt = zext i8 %i to i32 %m = and i32 %shamt, 31 %neg = sub i32 0, %shamt %and4 = and i32 %neg, 31 %shl = shl i32 %v, %m %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl => %a = and i8 %i, 31 %shamt2 = zext i8 %a to i32 %neg2 = sub i32 0, %shamt2 %and4 = and i32 %neg2, 31 %shl = shl i32 %v, %shamt2 %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl https://rise4fun.com/Alive/V9r llvm-svn: 360605	2019-05-13 17:28:19 +00:00
Sanjay Patel	cb8957f718	[InstCombine] add tests for rotates with narrow shift amount (PR20750); NFC llvm-svn: 360601	2019-05-13 17:02:26 +00:00
Sanjay Patel	2de619099a	[LoopVectorizer] add tests for FP minmax; NFC llvm-svn: 360542	2019-05-12 14:53:59 +00:00
Simon Pilgrim	6b10fde69b	[CostModel][X86] Add min/max reduction costs for all SSE targets The original costs stopped at SSE42, I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference). I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1. llvm-svn: 360528	2019-05-11 17:12:52 +00:00
Teresa Johnson	37b80122bd	[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible Summary: We hit undefined references building with ThinLTO when one source file contained explicit instantiations of a template method (weak_odr) but there were also implicit instantiations in another file (linkonce_odr), and the latter was the prevailing copy. In this case the symbol was marked hidden when the prevailing linkonce_odr copy was promoted to weak_odr. It led to unsats when the resulting shared library was linked with other code that contained a reference (expecting to be resolved due to the explicit instantiation). Add a CanAutoHide flag to the GV summary to allow the thin link to identify when all copies are eligible for auto-hiding (because they were all originally linkonce_odr global unnamed addr), and only do the auto-hide in that case. Most of the changes here are due to plumbing the new flag through the bitcode and llvm assembly, and resulting test changes. I augmented the existing auto-hide test to check for this situation. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi Tags: #llvm Differential Revision: https://reviews.llvm.org/D59709 llvm-svn: 360466	2019-05-10 20:08:24 +00:00
Cameron McInally	e75412ab47	Add InstCombine::visitFNeg(...) Differential Revision: https://reviews.llvm.org/D61784 llvm-svn: 360461	2019-05-10 20:01:04 +00:00
Nikita Popov	e99486dc11	[CVP] Add tests for urem, sdiv, srem ranges; NFC We currently don't calculate result ranges for these binary operators. llvm-svn: 360460	2019-05-10 19:36:38 +00:00
Nikita Popov	d74b871504	[CVP] Add tests for abs and nabs spf; NFC One half of the bound is already computed correctly for these tests, the other isn't. llvm-svn: 360445	2019-05-10 17:39:50 +00:00
Nemanja Ivanovic	34dc3aca40	Pull r360426 as it is breaking the build bots. llvm-svn: 360437	2019-05-10 16:03:22 +00:00
Nemanja Ivanovic	7a41cd5b88	Another attempt to fix the build bot breaks after r360426 The test case checks were produced by the update_test_checks.py scripts and I assumed that is sufficient. However, the behaviour is different with different default target triples. Specify the triple explicitly in the test case. If this doesn't clean up the build bot breaks, I'll remove the test case until I can get to the bottom of why the behaviour on build bots is different from my machine. llvm-svn: 360434	2019-05-10 15:44:56 +00:00
Nemanja Ivanovic	0f991c65f2	Fix build break after r360426 llvm-svn: 360433	2019-05-10 15:11:40 +00:00
Michael Liao	b284414a1b	[InferAddressSpaces] Enhance the handling of constexpr. Summary: - Constant expressions may not be added in strict postorder as the forward instruction scan order. Thus, for a constant expression (CE0), if its operand (CE1) is used in a previous instruction, they are not in postorder. However, different from `cloneInstructionWithNewAddressSpace`, `cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred instructions for later resolving. That results in a failure to infer the constant address. - This patch adds support for inferring a constant expression operand recursively if that operand is another constant expression, since there won't be a loop. Reviewers: arsenm Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61760 llvm-svn: 360431	2019-05-10 14:57:42 +00:00
Nemanja Ivanovic	cfc89896e0	[Pass Pipeline][NFC] Add a test prior to committing D61726 This patch just adds a test case to show the differences in code emitted by opt before and after https://reviews.llvm.org/D61726. llvm-svn: 360426	2019-05-10 13:47:00 +00:00
Cameron McInally	a67e387de8	Pre-commit InstCombine::visitFNeg(...) test. llvm-svn: 360424	2019-05-10 13:18:57 +00:00
Sanjay Patel	012adfbb96	[LoopVectorizer] fix test file to not run the entire -O3 pipeline This test file has a long history of edits from changes outside of vectorization, and it would happen again with the proposal in D61726. End-to-end testing shouldn't be happening in a test file that is specifically checking for vector masked load/store ops. Larger-scale testing goes in PhaseOrdering or the test-suite. I've hopefully preserved the intent by taking what was completely unoptimized IR in some tests and passing that through the -O1 pipeline. That becomes the input IR, and now we just run the loop vectorizer and verify that the vector masked ops are produced as expected. llvm-svn: 360340	2019-05-09 13:43:22 +00:00
Clement Courbet	fa18e6b080	[MergeICmps][NFC] Re-generate tests with update_test_checks. And use a more compact name for the tested struct. llvm-svn: 360319	2019-05-09 08:37:58 +00:00
Clement Courbet	fb0f66ddb3	[NFC] Fix typo. llvm-svn: 360314	2019-05-09 07:12:25 +00:00
Cameron McInally	cdaf5a069c	Precommit FNeg InstCombine tests Differential Revision: https://reviews.llvm.org/D61685 llvm-svn: 360281	2019-05-08 19:06:03 +00:00
Warren Ristow	d27b0c6247	[SCEV] Suppress hoisting insertion point of binops when unsafe InsertBinop tries to move insertion-points out of loops for expressions that are loop-invariant. This patch adds a new parameter, IsSafeToHoist, to guard that hoisting. This allows callers to suppress that hoisting for unsafe situations, such as divisions that may have a zero denominator. This fixes PR38697. Differential Revision: https://reviews.llvm.org/D55232 llvm-svn: 360280	2019-05-08 18:50:07 +00:00
Reid Kleckner	1558731607	Fix new reassociate-catchswitch.ll test llvm-svn: 360279	2019-05-08 18:39:03 +00:00
Sanjay Patel	b64c48597f	[InstSimplify] add tests for fcmp+minnum; NFC llvm-svn: 360275	2019-05-08 17:53:18 +00:00
David Greene	6c433713e9	[Reassociation] Place moved instructions after landing pads Reassociation's NegateValue moved instructions to the beginning of blocks (after PHIs) without checking for exception handling pads. It's possible for reassociation to move something into an exception handling block so we need to make sure we don't move things too early in the block. This change advances the insertion point past any exception handling pads. If the block we want to move into contains a catchswitch, we cannot move into it. In that case just create a new neg as if we had not found an existing neg to move. Differential Revision: https://reviews.llvm.org/D61089 llvm-svn: 360262	2019-05-08 15:44:24 +00:00
Nikita Popov	9fd02a71a3	Revert "[ValueTracking] Improve isKnownNonZero for Ints" This reverts commit `3b137a4956`. As reported in https://reviews.llvm.org/D60846, this is causing miscompiles. llvm-svn: 360260	2019-05-08 14:50:01 +00:00
Florian Hahn	3c696b3e7c	[SCCP] Fix crash when trying to constant-fold terminators multiple times. If we fold a branch/switch to an unconditional branch to another dead block we replace the branch with unreachable, to avoid attempting to fold the unconditional branch. Reviewers: davide, efriedma, mssimpso, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61300 llvm-svn: 360232	2019-05-08 09:09:54 +00:00
Mircea Trofin	0a753938db	[llvm] Avoid div by 0 when updating profile weights. Reviewers: davidxl Reviewed By: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61661 llvm-svn: 360223	2019-05-08 03:57:25 +00:00
Dan Robertson	3b137a4956	[ValueTracking] Improve isKnownNonZero for Ints Improve isKnownNonZero for integers in order to improve cttz optimizations. Differential Revision: https://reviews.llvm.org/D60846 llvm-svn: 360222	2019-05-08 02:25:08 +00:00
Sanjay Patel	e088d03b9c	[ValueTracking] add logic for known-never-nan with minnum/maxnum From the LangRef: "Returns NaN only if both operands are NaN." llvm-svn: 360206	2019-05-07 22:58:31 +00:00
Reid Kleckner	d028a463d5	Regenerate test case again after last revert llvm-svn: 360204	2019-05-07 22:40:40 +00:00
Reid Kleckner	a9cc7d71ac	Delete test cases added in r360162 that should have been deleted in r360190 llvm-svn: 360203	2019-05-07 22:35:56 +00:00
Sanjay Patel	9a1c2b7776	[InstSimplify] add tests for minnum/maxnum and NaN; NFC llvm-svn: 360197	2019-05-07 21:50:09 +00:00
Kostya Serebryany	b9c5768302	revert r360162 as it breaks most of the buildbots llvm-svn: 360190	2019-05-07 20:57:11 +00:00
Robert Lougher	8681ef8f41	[InstCombine] Add new combine to add folding (X | C1) + C2 --> (X | C1) ^ C1 iff (C1 == -C2) I verified the correctness using Alive: https://rise4fun.com/Alive/YNV This transform enables the following transform that already exists in instcombine: (X | Y) ^ Y --> X & ~Y As a result, the full expected transform is: (X | C1) + C2 --> X & ~C1 iff (C1 == -C2) There already exists the transform in the sub case: (X | Y) - Y --> X & ~Y However this does not trigger in the case where Y is constant due to an earlier transform: X - (-C) --> X + C With this new add fold, both the add and sub constant cases are handled. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61517 llvm-svn: 360185	2019-05-07 19:36:41 +00:00
Sanjay Patel	6a281a7545	[InstCombine] allow sinking fneg operands through an FP min/max Fundamentally/generally, we should not have to rely on bailouts/crippling of folds. In this particular case, I think we always recognize the inverted predicate min/max pattern, so there should not be any loss of optimization. Codegen looks better because we are eliminating an fneg. llvm-svn: 360180	2019-05-07 18:58:07 +00:00
Simon Pilgrim	0ed545ebb3	Regenerate test to try and fix buildbots llvm-svn: 360173	2019-05-07 17:10:10 +00:00
Sanjay Patel	2a3d16feea	[InstCombine] add tests for FP min/max with negated operands; NFC llvm-svn: 360170	2019-05-07 16:25:43 +00:00
Orlando Cazalet-Hyams	78a6062c24	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel Reviewed By: hfinkel Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 360162	2019-05-07 15:37:38 +00:00
Keno Fischer	a1a4adf4b9	[SCEV] Add explicit representations of umin/smin Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express it as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D50167 llvm-svn: 360159	2019-05-07 15:28:47 +00:00
Robert Lougher	07298c9b1e	Precommit tests for or/add transform. NFC. llvm-svn: 360149	2019-05-07 14:14:29 +00:00
Jordan Rupprecht	8f14e7cacf	Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" This reverts r357452 (git commit `21eb771dcb`). This was causing strange optimization-related test failures on an internal test. Will follow up with more details offline. llvm-svn: 360086	2019-05-06 21:55:05 +00:00
Sanjay Patel	a6019d5164	[InstCombine] sink FP negation of operands through select We don't always get this: Cond ? -X : -Y --> -(Cond ? X : Y) ...even with the legacy IR form of fneg in the case with extra uses, and we miss matching with the newer 'fneg' instruction because we are expecting binops through the rest of the path. Differential Revision: https://reviews.llvm.org/D61604 llvm-svn: 360075	2019-05-06 20:34:05 +00:00
Sanjay Patel	473dbf0301	[InstCombine] add tests for fneg+sel; NFC llvm-svn: 360058	2019-05-06 17:29:22 +00:00
Cameron McInally	c3167696bc	Add FNeg support to InstructionSimplify Differential Revision: https://reviews.llvm.org/D61573 llvm-svn: 360053	2019-05-06 16:05:10 +00:00
Sanjay Patel	3379fb599d	[InstCombine] regenerate test checks; NFC llvm-svn: 360052	2019-05-06 16:03:53 +00:00
Clement Courbet	9e1f2a7fe7	[SimplifyLibCalls] Simplify bcmp too. Summary: Fixes PR40699. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61585 llvm-svn: 360021	2019-05-06 09:15:22 +00:00
Markus Lavin	a778074165	[DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref. Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian targets for same reasons as in D59687. Differential Revision: https://reviews.llvm.org/D60611 llvm-svn: 360013	2019-05-06 07:20:56 +00:00
Cameron McInally	1d0c845d9d	Add FNeg IR constant folding support llvm-svn: 359982	2019-05-05 16:07:09 +00:00
Cameron McInally	fd254e429e	Add InstCombine tests for FNeg instruction. llvm-svn: 359970	2019-05-04 14:56:08 +00:00
Sanjay Patel	5ab41a7a05	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try) This is a subset of the original commit from rL359879 which was reverted because it could crash when using the 'RemovedInstructions' structure that enables delayed deletion of dead instructions. The motivating compile-time win does not require that change though. We should get most of that win from this change alone. Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359969	2019-05-04 12:46:32 +00:00
Evgeniy Stepanov	46ec57e576	Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic block" This reverts commit r359879, which introduced a compiler crash. llvm-svn: 359908	2019-05-03 17:31:49 +00:00
Robert Lougher	e28ab93546	Revert r359549 - incorrect update of test checks. NFC llvm-svn: 359897	2019-05-03 15:14:19 +00:00
Sanjay Patel	d3cfaae243	[LICM] auto-generate complete test checks; NFC llvm-svn: 359881	2019-05-03 13:25:06 +00:00
Sanjay Patel	8ff072e48e	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. Also, we were restarting the iterator loops when doing the overflow intrinsic transforms by marking the dominator tree for update. That was done to prevent iterating over a removed instruction. But we can postpone the deletion using the existing "RemovedInsts" structure, and that means we don't need to update the DT. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359879	2019-05-03 13:09:18 +00:00
Bob Haarman	a78ab77b6b	remove inalloca parameters in globalopt and simplify argpromotion Summary: Inalloca parameters require special handling in some optimizations. This change causes globalopt to strip the inalloca attribute from function parameters when it is safe to do so, removes the special handling for inallocas from argpromotion, and replaces it with a simple check that causes argpromotion to skip functions that receive inallocas (for when the pass is invoked on code that didn't run through globalopt first). This also avoids a case where argpromotion would incorrectly try to pass an inalloca in a register. Fixes PR41658. Reviewers: rnk, efriedma Reviewed By: rnk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61286 llvm-svn: 359743	2019-05-02 00:37:36 +00:00
Hiroshi Yamauchi	1620104034	[PGO][CHR] A bug fix. Summary: Fix a transformation bug where two scopes share a common instruction to hoist. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61405 llvm-svn: 359736	2019-05-01 22:49:52 +00:00
Hubert Tong	02d055a269	[tests] Add host-byteorder-*-endian; update XFAILs of big-endian triples Summary: Triple components in `XFAIL` lines are tested against the target triple. Various tests that are expected to fail on big-endian hosts are marked as being `XFAIL` for big-endian targets. This patch corrects these tests by having them test against a new `host-byteorder-big-endian` feature. Reviewers: xingxue, sfertile, jasonliu Reviewed By: xingxue Subscribers: jvesely, nhaehnle, fedor.sergeev, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60551 llvm-svn: 359689	2019-05-01 15:36:18 +00:00
Philip Reames	84e54eb471	[InstCombine] Limit a vector demanded elts rule which was producing invalid IR. The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs. The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes. This should fix https://bugs.llvm.org/show_bug.cgi?id=41624. llvm-svn: 359633	2019-04-30 23:09:26 +00:00
Alina Sbirlea	4e1ac95cf5	[PassManagerBuilder] Add option for interleaved loops, for loop vectorize. Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61030 llvm-svn: 359615	2019-04-30 21:29:20 +00:00
Simon Pilgrim	83098d28a1	[SLP] Lit test that cannot get vectorized due to lack of look-ahead operand reordering heuristic. The code in this test is not vectorized by SLP because its operand reordering cannot look beyond the immediate predecessors. This will get fixed in a follow-up patch that introduces the look-ahead operand reordering heuristic. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D61283 llvm-svn: 359553	2019-04-30 11:03:09 +00:00
Jeremy Morse	562f5f04f5	Update checks in an instcombine test, NFC This reduces the delta in some incoming work that changes this test. llvm-svn: 359549	2019-04-30 10:56:33 +00:00
Quentin Colombet	ae2cbb3400	[BlockExtractor] Change the basic block separator from ',' to ';' This change aims at making the file format be compatible with the way LLVM handles command line options. Differential Revision: https://reviews.llvm.org/D60970 llvm-svn: 359462	2019-04-29 16:14:00 +00:00
Simon Pilgrim	46128cdf08	[InstCombine][X86] Add PACKSS tests for truncation of sign-extended comparisons llvm-svn: 359435	2019-04-29 10:36:20 +00:00
Dan Robertson	9e441aee50	[NFC] Add baseline tests for int isKnownNonZero Add baseline tests for improvements of isKnownNonZero for integer types. Differential Revision: https://reviews.llvm.org/D60932 llvm-svn: 359267	2019-04-26 02:55:54 +00:00
Akira Hatanaka	8edf8f317b	[ObjC][ARC] Let ARC optimizer bail out if the number of pointer states it keeps track of becomes too large ARC optimizer does a top-down and a bottom-up traversal of the whole function to pair up retain and release instructions and remove them. This can be expensive if the number of instructions in the function and pointer states it tracks are large since it has to look at each pointer state and determine whether the instruction being visited can potentially use the pointer. This patch adds a command line option that sets a limit to the number of pointers it tracks. rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D61100 llvm-svn: 359226	2019-04-25 19:42:55 +00:00
Robert Lougher	d469133f95	[Evaluator] Walk initial elements when handling load through bitcast When evaluating a store through a bitcast, the evaluator tries to move the bitcast from the pointer onto the stored value. If the cast is invalid, it tries to "introspect" the type to get a valid cast by obtaining a pointer to the initial element (if the type is nested, this may require walking several initial elements). In some situations it is possible to get a bitcast on a load (e.g. with unions, where the bitcast may not be the same type as the store). However, equivalent logic to the store to introspect the type is missing. This patch adds this logic. Note, when developing the patch I was unhappy with adding similar logic directly to the load case as it could get out of step. Instead, I have abstracted the "introspection" into a helper function, with the specifics being handled by a passed-in lambda function. Differential Revision: https://reviews.llvm.org/D60793 llvm-svn: 359205	2019-04-25 17:00:01 +00:00
Simon Pilgrim	86ff9d313a	[InstCombine][X86] Add PACKSS/PACKUS tests for truncation where saturation won't occur llvm-svn: 359185	2019-04-25 12:45:11 +00:00
Roman Lebedev	445c22b7eb	[NFC][LoopIdiomRecognize] Some basic baseline tests for bcmp loop idiom Doubt this is the final test coverage, but this appears to have good coverage already, so I figure I might as well precommit it. llvm-svn: 359173	2019-04-25 08:33:47 +00:00
Alina Sbirlea	733c8c40c8	Enable LoopVectorization by default. Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables it for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization. Reviewers: chandlerc, jgorbe Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61091 llvm-svn: 359167	2019-04-25 04:49:48 +00:00
Alexey Bataev	ef3c1884ec	[SLP] Fix crash after r358519, by V. Porpodas. Summary: The code did not check if operand was undef before casting it to Instruction. Reviewers: RKSimon, ABataev, dtemirbulatov Reviewed By: ABataev Subscribers: uabelho Tags: #llvm Differential Revision: https://reviews.llvm.org/D61024 llvm-svn: 359136	2019-04-24 20:21:32 +00:00
Dmitry Mikulin	312b5f86b7	The error message for mismatched value sites is very cryptic. Make it more readable for an average user. Differential Revision: https://reviews.llvm.org/D60896 llvm-svn: 359043	2019-04-23 22:26:55 +00:00
Akira Hatanaka	5c3117b0a9	[ObjC][ARC] Check the basic block size before calling DominatorTree::dominate. ARC contract pass has an optimization that replaces the uses of the argument of an ObjC runtime function call with the call result. For example: ; Before optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %1, i8** @g0, align 8 ; After optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %2, i8** @g0, align 8 // %1 is replaced with %2 Before replacing the argument use, DominatorTree::dominate is called to determine whether the user instruction is dominated by the ObjC runtime function call instruction. The call to DominatorTree::dominate can be expensive if the two instructions belong to the same basic block and the size of the basic block is large. This patch checks the basic block size and just bails out if the size exceeds the limit set by command line option "arc-contract-max-bb-size". rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D60900 llvm-svn: 359027	2019-04-23 19:49:03 +00:00
Philip Reames	2ce017026a	[InstCombine] Convert a masked.load of a dereferenceable address to an unconditional load If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable. Differential Revision: https://reviews.llvm.org/D59703 llvm-svn: 359000	2019-04-23 15:25:14 +00:00
David Green	63a2aa715a	[LSR] Limit the recursion for setup cost In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958	2019-04-23 08:52:21 +00:00
Philip Reames	d748689c7f	[InstCombine] Eliminate stores to constant memory If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store. The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead. Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store is undefined, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date. Differential Revision: https://reviews.llvm.org/D60659 llvm-svn: 358919	2019-04-22 20:28:19 +00:00
Philip Reames	f01583d097	[Tests] Revise a test as requested by reviewer in D59703 llvm-svn: 358907	2019-04-22 18:51:58 +00:00
Philip Reames	8f47089034	[Tests] Add a negative test for masked.gather part of D59703 llvm-svn: 358906	2019-04-22 18:28:44 +00:00
Serguei Katkov	40a3b96196	[NewPM] Add Option handling for SimpleLoopUnswitch This patch enables passing options to SimpleLoopUnswitch via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60676 llvm-svn: 358880	2019-04-22 10:35:07 +00:00
Serguei Katkov	5614f4a3a5	[NewPM] Add dummy Test for LoopVectorize option parsing. llvm-svn: 358878	2019-04-22 09:53:26 +00:00
Luqman Aden	2993661cc0	[CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw. Summary: Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative. Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181) Reviewers: sanjoy, apilipenko, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60036 llvm-svn: 358816	2019-04-20 13:14:18 +00:00
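The range reasoning behind this kind of no-wrap inference can be sketched in a few lines. This is only an illustration, not the CorrelatedValuePropagation code: it assumes 8-bit values and closed, non-wrapping operand ranges given as `(lo, hi)` pairs, and the helper names `sub_is_nuw` and `sub_is_nsw` are hypothetical.

```python
BITS = 8  # assumed bit width for this illustration


def sub_is_nuw(lhs, rhs):
    """True if lhs - rhs cannot wrap as unsigned; ranges are inclusive (lo, hi)."""
    # The smallest possible difference is lhs.lo - rhs.hi; it must not drop below 0.
    return lhs[0] >= rhs[1]


def sub_is_nsw(lhs, rhs):
    """True if lhs - rhs cannot wrap as signed; ranges are inclusive (lo, hi)."""
    smin = -(1 << (BITS - 1))
    smax = (1 << (BITS - 1)) - 1
    # Both the smallest and the largest possible difference must be representable.
    return lhs[0] - rhs[1] >= smin and lhs[1] - rhs[0] <= smax
```

Because subtraction is not commutative, swapping the two ranges can change the answer: `sub_is_nuw((10, 20), (0, 10))` holds, while `sub_is_nuw((0, 10), (10, 20))` does not.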
Nikita Popov	d89de3f7f4	[IndVarSimplify] Generate full checks for some LFTR tests; NFC llvm-svn: 358813	2019-04-20 12:05:53 +00:00
Nikita Popov	aa0c5a022f	[IndVarSimplify] Add tests for PR31181; NFC llvm-svn: 358812	2019-04-20 12:05:43 +00:00
Nikita Popov	2e33f8de57	[CVP] Add tests for sub nowrap inference; NFC These are baseline tests for D60036. Patch by Luqman Aden. llvm-svn: 358808	2019-04-20 07:43:15 +00:00
Vedant Kumar	282b26ec4d	[GVN+LICM] Use line 0 locations for better crash attribution This is a follow-up to r291037+r291258, which used null debug locations to prevent jumpy line tables. Using line 0 locations achieves the same effect, but works better for crash attribution because it preserves the right inline scope. Differential Revision: https://reviews.llvm.org/D60913 llvm-svn: 358791	2019-04-19 22:36:40 +00:00
Fangrui Song	884f557bb2	[MergeFunc] removeUsers: call remove() only on direct users removeUsers uses a work list to collect indirect users and call remove() on those functions. However it has a bug (`if (!Visited.insert(UU).second)`). Actually, we don't have to collect indirect users. After the merge of F and G, G's callers will be considered (added to Deferred). If G's callers can be merged, G's callers' callers will be considered. Update the test unnamed-addr-reprocessing.ll to make it clear we can still merge indirect callers. llvm-svn: 358741	2019-04-19 07:57:51 +00:00
Saleem Abdulrasool	b96d9b3419	MergeFunc: preserve COMDAT information when creating a thunk We would previously drop the COMDAT on the thunk we generated when replacing a function body with the forwarding thunk. This would result in a function that may have been multiply emitted and multiply merged to be emitted with the same name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is used for the deduplication of Value Witness functions for Swift. llvm-svn: 358728	2019-04-19 01:48:36 +00:00
Philip Reames	137995d8da	[GuardWidening] Wire up a NPM version of the LoopGuardWidening pass llvm-svn: 358704	2019-04-18 19:17:14 +00:00
Quentin Colombet	ea3364bf85	[BlockExtractor] Extend the file format to support the grouping of basic blocks Prior to this patch, each basic block listed in the extract-blocks-file would be extracted to a different function. This patch adds support for a comma-separated list of basic blocks to form a group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched. Let us see this new format in action (comments are not part of the file format): ;; funcName bbName[,bbName...] foo bb1 ;; Extract bb1 in its own function foo bb2,bb3 ;; Extract bb2,bb3 in their own function bar bb1,bb4 ;; Extract bb1,bb4 in their own function bar bb2 ;; Extract bb2 in its own function Assuming all regions are extractable, this will create one function and thus one call per region. Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701	2019-04-18 18:28:30 +00:00
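The `funcName bbName[,bbName...]` format described in this commit message is simple enough to parse by hand. The sketch below is not the BlockExtractor implementation: `parse_block_groups` is a hypothetical helper, and skipping blank lines and rejecting malformed ones are assumptions, since the message does not say how LLVM treats them.

```python
from collections import defaultdict


def parse_block_groups(text):
    """Map each function name to a list of basic-block groups."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 2:
            raise ValueError(f"expected 'funcName bbName[,bbName...]': {line!r}")
        func, blocks = fields
        # Each line forms one group; a group is extracted as a single region.
        groups[func].append(blocks.split(","))
    return dict(groups)


example = "foo bb1\nfoo bb2,bb3\nbar bb1,bb4\nbar bb2\n"
print(parse_block_groups(example))
```

For the four-line example from the commit message, this yields two groups for `foo` and two for `bar`, matching the "one function and thus one call per region" description.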
Philip Reames	adf288c5d9	[LoopPred] Fix a blatantly obvious bug in r358684 The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688	2019-04-18 17:01:19 +00:00
Philip Reames	92a7177e6b	[LoopPredication] Allow predication of loop invariant computations (within the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i' A = _a.length; guard (i < A) a = _a[i] B = _b.length; guard (i < B); b = _b[i]; ... Z = _z.length; guard (i < Z) z = _z[i] accum += a + b + ... + z; Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form. Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later. As a secondary benefit, this allows LoopPred to expose peeling opportunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before. A couple of subtleties in the implementation: - SCEV's isSafeToExpand guarantees speculation safety (i.e. lets us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point. - SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops. - invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but which meet the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit. Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684	2019-04-18 16:33:17 +00:00
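The effect of forming the loop-invariant version of every guard at once can be shown with a source-level analogy. This is not the LoopPredication pass, which works on IR guards and SCEV expressions: the function names are hypothetical, and falling back to the guarded loop stands in for whatever a failing widened guard would do in a real system.

```python
def sum_with_guards(arrays, n):
    """Per-iteration range checks: one guard per array per iteration."""
    accum = 0
    for i in range(n):
        for arr in arrays:
            if not i < len(arr):  # guard (i < arr.length)
                raise IndexError("guard failed")
            accum += arr[i]
    return accum


def sum_predicated(arrays, n):
    """All guards replaced by loop-invariant checks formed up front."""
    # For i in [0, n), every guard i < len(arr) holds iff n <= len(arr).
    if all(n <= len(arr) for arr in arrays):
        accum = 0
        for i in range(n):
            for arr in arrays:
                accum += arr[i]  # no per-iteration check needed
        return accum
    return sum_with_guards(arrays, n)  # fall back to the guarded loop
```

The key point is that the lengths do not change inside the loop, so all the invariant conditions can be computed together before the first iteration rather than being discovered one guard at a time.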
Kit Barton	3cdf87940f	Add basic loop fusion pass. This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607	2019-04-17 18:53:27 +00:00
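A source-level picture of what fusion does to two loops that satisfy the four conditions listed in this commit message may help. It is only an analogy in Python, not the pass, which operates on LLVM IR; `unfused` and `fused` are hypothetical names.

```python
def unfused(a, b):
    n = len(a)
    out1 = [0] * n
    out2 = [0] * n
    for i in range(n):  # first loop
        out1[i] = a[i] + 1
    for i in range(n):  # second loop: adjacent, same bounds
        out2[i] = out1[i] * b[i]
    return out1, out2


def fused(a, b):
    n = len(a)
    out1 = [0] * n
    out2 = [0] * n
    for i in range(n):  # single fused loop
        out1[i] = a[i] + 1
        out2[i] = out1[i] * b[i]  # reads out1[i] written in the same iteration
    return out1, out2
```

Here the second body only reads `out1[i]`, which the same iteration has already written, so there is no negative distance dependence. If the second loop read `out1[i + 1]` instead, the fused loop would read a value that has not been computed yet, and fusion would not be legal.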
Steven Wu	05a358cdcd	[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols Summary: Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool. ThinLTOCodeGenerator currently does not preserve llvm.used symbols and it can internalize them. In order to pass the necessary information to the legacy ThinLTOCodeGenerator, the input to the code generator is rewritten to be based on lto::InputFile. Now ThinLTO using the legacy LTO API requires a data layout in the Module. The "internalize" thinlto action in llvm-lto is updated to run both "promote" and "internalize" with the same configuration as ThinLTOCodeGenerator. The old "promote" + "internalize" option does not produce the same output as ThinLTOCodeGenerator. This fixes: PR41236 rdar://problem/49293439 Reviewers: tejohnson, pcc, kromanova, dexonsmith Reviewed By: tejohnson Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60421 llvm-svn: 358601	2019-04-17 17:38:09 +00:00
Nikita Popov	2039581002	[LVI][CVP] Constrain values in with.overflow branches If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1) then we can constrain the value of %x inside the branch based on makeGuaranteedNoWrapRegion(). We do this by extending the edge-value handling in LVI. This allows CVP to then fold comparisons against %x, as illustrated in the tests. Differential Revision: https://reviews.llvm.org/D60650 llvm-svn: 358597	2019-04-17 16:57:42 +00:00
Florian Hahn	893aea58ea	[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size. Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586	2019-04-17 15:57:43 +00:00
Roman Lebedev	0080645846	[CVP] processOverflowIntrinsic(): don't crash if constant-folding happened As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559	2019-04-17 06:35:07 +00:00
Eric Christopher	e29874eaa0	Revert "Add basic loop fusion pass." Per request. This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553	2019-04-17 04:55:24 +00:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00

12798 Commits