llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	0f9b4773c1	[SimplifyCFG] add a struct to house optional folds (PR34603) This was intended to be no-functional-change, but it's not - there's a test diff. So I thought I should stop here and post it as-is to see if this looks like what was expected based on the discussion in PR34603: https://bugs.llvm.org/show_bug.cgi?id=34603 Notes: 1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'. The parameter isn't passed down, so we pick up the default value from the function signature after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the 'SimplifyCFG' calls. 2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops. This would theoretically allow us to differentiate the transforms controlled by those params independently. 3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too. I just stopped here to minimize the diffs. 4. Similarly, I stopped short of messing with the pass manager layer. I have another question that could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG set to true no matter where in the pipeline it's creating SimplifyCFG passes? // Create an early function pass manager to cleanup the output of the // frontend. EarlyFPM.addPass(SimplifyCFGPass()); --> /// \brief Construct a pass with the default thresholds /// and switch optimizations. SimplifyCFGPass::SimplifyCFGPass() : BonusInstThreshold(UserBonusInstThreshold), LateSimplifyCFG(true) {} <-- switches get converted to lookup tables and loops may not be in canonical form If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG' setting via recursion was masking this bug. Differential Revision: https://reviews.llvm.org/D38138 llvm-svn: 314308	2017-09-27 14:54:16 +00:00
Hongbin Zheng	d1b7b2efba	[SimplifyIndVar] Constant fold IV users This patch tries to transform cases like: for (unsigned i = 0; i < N; i += 2) { bool c0 = (i & 0x1) == 0; bool c1 = ((i + 1) & 0x1) == 1; } To for (unsigned i = 0; i < N; i += 2) { bool c0 = true; bool c1 = true; } This commit also update test/Transforms/IndVarSimplify/replace-srem-by-urem.ll to prevent constant folding. Differential Revision: https://reviews.llvm.org/D38272 llvm-svn: 314266	2017-09-27 03:11:46 +00:00
Sanjoy Das	eda7a86d42	[BypassSlowDivision] Improve our handling of divisions by constants Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow . This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 llvm-svn: 314253	2017-09-26 21:54:27 +00:00
Craig Topper	8bf622174d	[InstCombine] Remove one use restriction on the shift for calls to foldICmpAndShift. If this transformation succeeds, we're going to remove our dependency on the shift by rewriting the and. So it doesn't matter how many uses the shift has. This distributes the one use check to other transforms in foldICmpAndConstConst that do need it. Differential Revision: https://reviews.llvm.org/D38206 llvm-svn: 314233	2017-09-26 18:47:25 +00:00
Sanjay Patel	1d04b5bacf	[DSE] Merge stores when the later store only writes to memory locations the early store also wrote to (2nd try) This is a 2nd attempt at: https://reviews.llvm.org/rL310055 ...which was reverted at rL310123 because of PR34074: https://bugs.llvm.org/show_bug.cgi?id=34074 In this version, we break out of the inner loop after we successfully merge and kill a pair of stores. In the earlier rev, we were continuing instead, which meant we could process the invalid info from a now dead store. Original commit message (authored by Filipe Cabecinhas): This fixes PR31777. If both stores' values are ConstantInt, we merge the two stores (shifting the smaller store appropriately) and replace the earlier (and larger) store with an updated constant. In the future we should also support vectors of integers. And maybe float/double if we can. Differential Revision: https://reviews.llvm.org/D30703 llvm-svn: 314206	2017-09-26 13:54:28 +00:00
Sylvestre Ledru	e7d4cd639b	Don't move llvm.localescape outside the entry block in the GCOV profiling pass Summary: This fixes https://bugs.llvm.org/show_bug.cgi?id=34714. Patch by Marco Castelluccio Reviewers: rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38224 llvm-svn: 314201	2017-09-26 11:56:43 +00:00
Matthias Braun	cc603ee3d5	TargetLibraryInfo: Stop guessing wchar_t size Usually the frontend communicates the size of wchar_t via metadata and we can optimize wcslen (and possibly other calls in the future). In cases without the wchar_size metadata we would previously try to guess the correct size based on the target triple; however this is fragile to keep up to date and may miss users manually changing the size via flags. Better be safe and stop guessing and optimizing if the frontend didn't communicate the size. Differential Revision: https://reviews.llvm.org/D38106 llvm-svn: 314185	2017-09-26 02:36:57 +00:00
Vlad Tsyrklevich	998b220e97	Add section headers to SpecialCaseLists Summary: Sanitizer blacklist entries currently apply to all sanitizers--there is no way to specify that an entry should only apply to a specific sanitizer. This is important for Control Flow Integrity since there are several different CFI modes that can be enabled at once. For maximum security, CFI blacklist entries should be scoped to only the specific CFI mode(s) that entry applies to. Adding section headers to SpecialCaseLists allows users to specify more information about list entries, like sanitizer names or other metadata, like so: [section1] fun:fun1 [section2\|section3] fun:fun23 The section headers are regular expressions. For backwards compatbility, blacklist entries entered before a section header are put into the '[*]' section so that blacklists without sections retain the same behavior. SpecialCaseList has been modified to also accept a section name when matching against the blacklist. It has also been modified so the follow-up change to clang can define a derived class that allows matching sections by SectionMask instead of by string. Reviewers: pcc, kcc, eugenis, vsk Reviewed By: eugenis, vsk Subscribers: vitalybuka, llvm-commits Differential Revision: https://reviews.llvm.org/D37924 llvm-svn: 314170	2017-09-25 22:11:11 +00:00
Craig Topper	30dc9797e9	[InstCombine] Move an optimization from foldICmpAndConstConst to foldICmpUsingKnownBits All this optimization cares about is knowing how many low bits of LHS is known to be zero and whether that means that the result is 0 or greater than the RHS constant. It doesn't matter where the zeros in the low bits came from. So we don't need to specifically look for an AND. Instead we can use known bits. Differential Revision: https://reviews.llvm.org/D38195 llvm-svn: 314153	2017-09-25 21:15:00 +00:00
Sanjay Patel	ecb175608f	[InstCombine] remove extract-of-select vector transform (2nd try) The 1st attempt at this: https://reviews.llvm.org/rL314117 was reverted at: https://reviews.llvm.org/rL314118 because of bot fails for clang tests that were checking optimized IR. That should be fixed with: https://reviews.llvm.org/rL314144 ...so try again. Original commit message: The transform to convert an extract-of-a-select-of-vectors was added at: https://reviews.llvm.org/rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT> Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314147	2017-09-25 20:30:53 +00:00
Hongbin Zheng	bbe448abd8	[SimplifyIndvar] Minor change to refine r314125, NFC llvm-svn: 314130	2017-09-25 18:10:36 +00:00
Hongbin Zheng	f0093e45c4	[SimplifyIndvar] Replace the srem used by IV if we can prove both of its operands are non-negative Since now SCEV can handle 'urem', an 'urem' is a better canonical form than an 'srem' because it has well-defined behavior This is a follow up of D34598 Differential Revision: https://reviews.llvm.org/D38072 llvm-svn: 314125	2017-09-25 17:39:40 +00:00
Sanjay Patel	aa7f750bec	revert r314117 because there are bogus clang tests that depend on the optimizer llvm-svn: 314118	2017-09-25 17:00:04 +00:00
Sanjay Patel	9639897d77	[InstCombine] remove extract-of-select vector transform The transform to convert an extract-of-a-select-of-vectors was added at: rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT> Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314117	2017-09-25 16:41:34 +00:00
Alexey Bataev	ccce7afee8	[SLP] Support for horizontal min/max reduction. Summary: SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Reviewers: spatel, mkuper, hfinkel, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27846 llvm-svn: 314101	2017-09-25 13:34:59 +00:00
Craig Topper	ea927baee2	[InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to equality comparisons when the min/max ranges intersect in a single value. This is the inverse of what we do for SGT/SLT/UGT/ULT. llvm-svn: 314032	2017-09-22 21:47:22 +00:00
Craig Topper	3f364aa908	[InstCombine] Add constant splat handling to one of the ICMP_SLT/SGT cases in foldICmpUsingKnownBits. llvm-svn: 314025	2017-09-22 19:54:15 +00:00
Craig Topper	3edda87c42	[InstCombine] Move the call to isSignBitCheck into getDemandedBitsLHSMask instead of calling it outside and passing its result through a flag. NFCI The result of the isSignBitCheck isn't used anywhere else and this allows us to share the m_APInt call in the likely case that it isn't a sign bit check. llvm-svn: 314018	2017-09-22 18:57:23 +00:00
Craig Topper	5b35b68785	[InstCombine] Simplify check for RHS being a splat constant in foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt. llvm-svn: 314017	2017-09-22 18:57:22 +00:00
Craig Topper	2c9b7d7894	[InstCombine] Make cases for ICMP_UGT/ICMP_ULT use similar formatting since they use similar code. NFC llvm-svn: 314016	2017-09-22 18:57:20 +00:00
Artur Pilipenko	889dc1e3a5	Rework loop predication pass We've found a serious issue with the current implementation of loop predication. The current implementation relies on SCEV and this turned out to be problematic. To fix the problem we had to rework the pass substantially. We have had the reworked implementation in our downstream tree for a while. This is the initial patch of the series of changes to upstream the new implementation. For now the transformation is limited to the following case: * The loop has a single latch with either ult or slt icmp condition. * The step of the IV used in the latch condition is 1. * The IV of the latch condition is the same as the post increment IV of the guard condition. * The guard condition is ult. See the review or the LoopPredication.cpp header for the details about the problem and the new implementation. Reviewed By: sanjoy, mkazantsev Differential Revision: https://reviews.llvm.org/D37569 llvm-svn: 313981	2017-09-22 13:13:57 +00:00
Sanjoy Das	388b012f4e	Rename markAsErased to erase, as pointed out in a previous review; NFC llvm-svn: 313951	2017-09-22 01:47:41 +00:00
Reid Kleckner	0fe506bc5e	Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare" The fix is to avoid invalidating our insertion point in replaceDbgDeclare: Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore); + if (DII == InsertBefore) + InsertBefore = &std::next(InsertBefore->getIterator()); DII->eraseFromParent(); I had to write a unit tests for this instead of a lit test because the use list order matters in order to trigger the bug. The reduced C test case for this was: void useit(int); static inline void inlineme() { int x[2]; useit(x); } void f() { inlineme(); inlineme(); } llvm-svn: 313905	2017-09-21 19:52:03 +00:00
Daniel Jasper	7d2f38d600	Revert r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare" .. as well as the two subsequent changes r313826 and r313875. This leads to segfaults in combination with ASAN. Will forward repro instructions to the original author (rnk). llvm-svn: 313876	2017-09-21 12:07:33 +00:00
Mikael Holmen	582e141007	[SROA] Really remove associated dbg.declare when removing dead alloca Summary: There already was code that tried to remove the dbg.declare, but that code was placed after we had called I->replaceAllUsesWith(UndefValue::get(I->getType())); on the alloca, so when we searched for the relevant dbg.declare, we couldn't find it. Now we do the search before we call RAUW so there is a chance to find it. An existing testcase needed update due to this. Two dbg.declare with undef were removed and then suddenly one of the two CHECKS failed. Before this patch we got call void @llvm.dbg.declare(metadata i24* undef, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 call void @llvm.dbg.declare(metadata %struct.prog_src_register* undef, metadata !14, metadata !DIExpression()), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 and with it we get call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 However, the CHECKs in the testcase checked things in a silly order, so they only passed since they found things in the first dbg.declare. Now we changed the order of the checks and the test passes. Reviewers: rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37900 llvm-svn: 313875	2017-09-21 11:14:27 +00:00
Strahinja Petrovic	29202f6dc1	Fixed reverted commit rL312318 This patch contains fix for reverted commit rL312318 which was causing failure due to use of unchecked dyn_cast to CIInit. Patch by: Nikola Prica. llvm-svn: 313870	2017-09-21 10:04:02 +00:00
Serguei Katkov	675e304ef8	Revert "Re-enable "[IRCE] Identify loops with latch comparison against current IV value"" Revert the patch causing the functional failures. The patch owner is notified with test cases which fail. Test case has been provided to Maxim offline. llvm-svn: 313857	2017-09-21 04:50:41 +00:00
Craig Topper	18887bf179	[InstCombine] Teach getDemandedBitsLHSMask to handle constant splat vectors This replaces a ConstantInt dyn_cast with m_APInt Differential Revision: https://reviews.llvm.org/D38100 llvm-svn: 313840	2017-09-20 23:48:58 +00:00
Matt Morehouse	4881a23ca8	[MSan] Disable sanitization for __sanitizer_dtor_callback. Summary: Eliminate unnecessary instrumentation at __sanitizer_dtor_callback call sites. Fixes https://github.com/google/sanitizers/issues/861. Reviewers: eugenis, kcc Reviewed By: eugenis Subscribers: vitalybuka, llvm-commits, cfe-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38063 llvm-svn: 313831	2017-09-20 22:53:08 +00:00
Sanjay Patel	73811a152a	[SimplifyCFG] don't create a no-op subtract I noticed this inefficiency while investigating PR34603: https://bugs.llvm.org/show_bug.cgi?id=34603 This fix will likely push another bug (we don't maintain state of 'LateSimplifyCFG') into hiding, but I'll try to clean that up with a follow-up patch anyway. llvm-svn: 313829	2017-09-20 22:31:35 +00:00
Reid Kleckner	3f547e87b2	[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare Summary: This implements the design discussed on llvm-dev for better tracking of variables that live in memory through optimizations: http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html This is tracked as PR34136 llvm.dbg.addr is intended to be produced and used in almost precisely the same way as llvm.dbg.declare is today, with the exception that it is control-dependent. That means that dbg.addr should always have a position in the instruction stream, and it will allow passes that optimize memory operations on local variables to insert llvm.dbg.value calls to reflect deleted stores. See SourceLevelDebugging.rst for more details. The main drawback to generating DBG_VALUE machine instrs is that they usually cause LLVM to emit a location list for DW_AT_location. The next step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE ranges as not needing a location list, and possibly start setting DW_AT_start_offset for variables whose lifetimes begin mid-scope. Reviewers: aprantl, dblaikie, probinson Subscribers: eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37768 llvm-svn: 313825	2017-09-20 21:52:33 +00:00
Craig Topper	562bf99ee6	[InstCombine] Handle (X & C2) < C1 --> (X & C2) == 0 We already did (X & C2) > C1 --> (X & C2) != 0, if any bit set in (X & C2) will produce a result greater than C1. But there is an equivalent inverse condition with <= C1 (which will be canonicalized to < C1+1) Differential Revision: https://reviews.llvm.org/D38065 llvm-svn: 313819	2017-09-20 21:18:17 +00:00
Craig Topper	a0c897f634	[InstCombine] Use APInt::getActiveBits() to avoid creating an APInt from a trailing zero count to do a comparison. NFCI llvm-svn: 313792	2017-09-20 18:49:29 +00:00
Hans Wennborg	57c3341ada	Revert r313771 "[SLP] Vectorize jumbled memory loads." This broke the buildbots, e.g. http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391 > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Subscribers: mzolotukhin > > Reviewed By: ayal > > Differential Revision: https://reviews.llvm.org/D36130 > > Review comments updated accordingly > > Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 > > Added a TODO for sortLoadAccesses API > > Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 > > Modified the TODO for sortLoadAccesses API > > Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 > > Review comment update for using OpdNum to insert the mask in respective location > > Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce > > Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase > > Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313781	2017-09-20 18:00:03 +00:00
Quentin Colombet	aa103b3d86	[InstCombine] Add select simplifications In these cases, two selects have constant selectable operands for both the true and false components and have the same conditional expression. We then create two arithmetic operations of the same type and feed a final select operation using the result of the true arithmetic for the true operand and the result of the false arithmetic for the false operand and reuse the original conditionl expression. The arithmetic operations are naturally folded as a consequence, leaving only the newly formed select to replace the old arithmetic operation. Patch by: Michael Berg <michael_c_berg@apple.com> Differential Revision: https://reviews.llvm.org/D37019 llvm-svn: 313774	2017-09-20 17:32:16 +00:00
Mohammad Shahid	2b281de576	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Subscribers: mzolotukhin Reviewed By: ayal Differential Revision: https://reviews.llvm.org/D36130 Review comments updated accordingly Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0 Added a TODO for sortLoadAccesses API Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58 Modified the TODO for sortLoadAccesses API Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565 Review comment update for using OpdNum to insert the mask in respective location Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b llvm-svn: 313771	2017-09-20 17:19:57 +00:00
Teresa Johnson	f625118ec7	[ThinLTO] Fix dead stripping analysis for SamplePGO Summary: The fix for dead stripping analysis in the case of SamplePGO indirect calls to local functions (r313151) introduced the possibility of an infinite loop. Make sure we check for the value being already live after we update it for SamplePGO indirect call handling. Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D38086 llvm-svn: 313766	2017-09-20 17:09:47 +00:00
Alexander Kornienko	6a140234ed	Revert r313736: "[SLP] Vectorize jumbled memory loads." The revision breaks buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio llvm-svn: 313758	2017-09-20 14:53:07 +00:00
Mohammad Shahid	f8db9bd857	[SLP] Vectorize jumbled memory loads. Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 Commit after rebase for patch D36130 Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab llvm-svn: 313736	2017-09-20 08:18:28 +00:00
Sanjoy Das	09613b122e	Tighten the invariants around LoopBase::invalidate Summary: With this change: - Methods in LoopBase trip an assert if the receiver has been invalidated - LoopBase::clear frees up the memory held the LoopBase instance This change also shuffles things around as necessary to work with this stricter invariant. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38055 llvm-svn: 313708	2017-09-20 02:31:57 +00:00
Daniel Berlin	064cb68d18	GVNSink: Make ModelledPHIs constructor linear (and avoid edge case it worries about) by avoiding getIncomingValueForBlock llvm-svn: 313702	2017-09-20 00:07:27 +00:00
Daniel Berlin	dd323297d0	Revert "[GVNSink] Remove dependency on SmallPtrSet iteration order." This reverts commit r312156, because now the op and block arrays are not in the same order :(. llvm-svn: 313701	2017-09-20 00:07:25 +00:00
Daniel Berlin	9632dd7376	NewGVN: Remove unused includes llvm-svn: 313700	2017-09-20 00:07:12 +00:00
Sanjoy Das	76ab23234c	[LoopInfo] Make LoopBase and Loop destructors non-public Summary: See comment for why I think this is a good idea. This change also: - Removes an SCEV test case. The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop. - Updates the loop pass manager test case to deal with a private ~Loop::Loop. - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37996 llvm-svn: 313695	2017-09-19 23:19:00 +00:00
Adam Nemet	15fccf0009	Allow ORE.emit to take a closure to delay building the remark object In the lambda we are now returning the remark by value so we need to preserve its type in the insertion operator. This requires making the insertion operator generic. I've also converted a few cases to use the new API. It seems to work pretty well. See the LoopUnroller for a slightly more interesting case. llvm-svn: 313691	2017-09-19 23:00:55 +00:00
Dehao Chen	62b9c33e1e	Import all inlined indirect call targets for SamplePGO. Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36637 llvm-svn: 313678	2017-09-19 21:18:14 +00:00
Sanjay Patel	ca14697c2b	[SimplifyCFG] fix typos/formatting; NFC llvm-svn: 313671	2017-09-19 20:58:14 +00:00
Dehao Chen	b6e60c8b80	Handle profile mismatch correctly for SamplePGO. Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash. Reviewers: davidxl, tejohnson, eraman Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38018 llvm-svn: 313658	2017-09-19 18:26:54 +00:00
Reid Kleckner	1aa4ea8104	[gcov] Emit errors when opening the notes file fails No time to write a test case, on to the next bug. =P Discovered while investigating PR34659 llvm-svn: 313571	2017-09-18 21:31:48 +00:00
Sanjay Patel	55de1ed8e6	[SLP] clean up for vector store case; NFCI llvm-svn: 313541	2017-09-18 16:20:15 +00:00
Craig Topper	f264fcc704	[X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles. I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates. llvm-svn: 313450	2017-09-16 07:36:14 +00:00
Chandler Carruth	beb22b5437	[SLP] Revert r312791 and other necessary commits, except for TTI and CostModel. The original patch added support for horizontal min/max reductions to the SLP vectorizer. This patch causes LLVM to miscompile fairly simple signed min reductions. I have attached a test progrom to http://llvm.org/PR34635 that shows the behavior change after this patch. We found this in a test for the open source Eigen library, but also in other code. Unfortunately, the revert is moderately challenging. It required reverting: r313042: [SLP] Test with multiple uses of conditional op and wrong parent. r312853: [SLP] Fix buildbots, NFC. r312793: [SLP] Fix the warning about paths not returning the value, NFC. r312791: [SLP] Support for horizontal min/max reduction. And even then, I had to completely skip reverting the changes to TTI and CostModel because r312832 rewrote so much of this code. Plus, the cost modeling changes aren implicated in the miscompile, so they should be fine and will just not be used until this gets re-introduced. llvm-svn: 313409	2017-09-15 22:23:27 +00:00
Vivek Pandya	b5ab895e2a	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313390	2017-09-15 20:10:09 +00:00
Vivek Pandya	df8598dcc4	This reverts r313381 llvm-svn: 313387	2017-09-15 19:53:54 +00:00
Vivek Pandya	00d887447b	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313382	2017-09-15 19:30:59 +00:00
Anna Thomas	f34537dff8	[RuntimeUnroll] Add heuristic for unrolling multi-exit loop Add a profitability heuristic to enable runtime unrolling of multi-exit loop: There can be atmost two unique exit blocks for the loop and the second exit block should be a deoptimizing block. Also, there can be one other exiting block other than the latch exiting block. The reason for the latter is so that we limit the number of branches in the unrolled code to being at most the unroll factor. Deoptimizing blocks are rarely taken so these additional number of branches created due to the unrolling are predictable, since one of their target is the deopt block. Reviewers: apilipenko, reames, evstupac, mkuper Subscribers: llvm-commits Reviewed by: reames Differential Revision: https://reviews.llvm.org/D35380 llvm-svn: 313363	2017-09-15 15:56:05 +00:00
Anna Thomas	512dde77ba	[RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup During runtime unrolling on loops with multiple exits, we update the exit blocks with the correct phi values from both original and remainder loop. In this process, we lookup the VMap for the mapped incoming phi values, but did not update the VMap if a default entry was generated in the VMap during the lookup. This default value is generated when constants or values outside the current loop are looked up. This patch fixes the assertion failure when null entries are present in the VMap because of this lookup. Added a testcase that showcases the problem. llvm-svn: 313358	2017-09-15 13:29:33 +00:00
Ilya Biryukov	d23faa843e	Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." This reverts commit r313348. Reason: it caused buildbot failures. llvm-svn: 313352	2017-09-15 10:15:00 +00:00
Dinar Temirbulatov	e2358b53bc	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { dst++ = src++; dst++ = src++ + 1; dst++ = src++ + 2; dst++ = src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 313348	2017-09-15 06:56:39 +00:00
Dinar Temirbulatov	bb891b864c	[SLPVectorizer] Remove duplicated functionality code in initScheduleData function, NFCI. llvm-svn: 313341	2017-09-15 04:31:54 +00:00
Alina Sbirlea	7ed5856a32	Refactor collectChildrenInLoop to LoopUtils [NFC] Summary: Move to LoopUtils method that collects all children of a node inside a loop. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37870 llvm-svn: 313322	2017-09-15 00:04:16 +00:00
Dehao Chen	3a81f84d9a	Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: vitalybuka, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313277	2017-09-14 17:29:56 +00:00
Alon Kom	682cfc1d4c	[LV] Fix maximum legal VF calculation This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237	2017-09-14 07:40:02 +00:00
Vitaly Buka	48624d327a	Revert "Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader." Patch introduced uninitialized value. This reverts commit r313195. llvm-svn: 313230	2017-09-14 05:40:33 +00:00
Peter Collingbourne	cfbd089237	Reland r313157, "ThinLTO: Correctly follow aliasee references when dead stripping." which was reverted in r313222. This reland includes a fix for the LowerTypeTests pass so that it looks past aliases when determining which type identifiers are live. Differential Revision: https://reviews.llvm.org/D37842 llvm-svn: 313229	2017-09-14 05:02:59 +00:00
Dinar Temirbulatov	df0b843875	[SLPVectorizer] Prefer auto over explicit type for VL0, NFCI. llvm-svn: 313228	2017-09-14 04:28:35 +00:00
Hans Wennborg	ae050afeb9	Revert r313157 "ThinLTO: Correctly follow aliasee references when dead stripping." This broke Chromium's CFI build; see crbug.com/765004. > We were previously handling aliases during dead stripping by adding > the aliased global's "original name" GUID to the worklist. This will > lead to incorrect behaviour if the global has local linkage because > the original name GUID will not correspond to the global's GUID in > the summary. > > Because an alias is just another name for the global that it > references, there is no need to mark the referenced global as used, > or to follow references from any other copies of the global. So all > we need to do is to follow references from the aliasee's summary > instead of the alias. > > Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313222	2017-09-14 00:40:14 +00:00
Eugene Zelenko	8002c504cd	[Transforms] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 313198	2017-09-13 21:43:53 +00:00
Dehao Chen	15c86ef970	Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader. Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37779 llvm-svn: 313195	2017-09-13 21:22:55 +00:00
Anna Thomas	19529f75b9	[LV] Avoid computing the register usage for default VF. NFC These are changes to reduce redundant computations when calculating a feasible vectorization factor: 1. early return when target has no vector registers 2. don't compute register usage for the default VF. Suggested during review for D37702. llvm-svn: 313176	2017-09-13 19:35:45 +00:00
Hiroshi Yamauchi	a43913cfaf	Add options to dump PGO counts in text. Summary: Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37776 llvm-svn: 313159	2017-09-13 17:20:38 +00:00
Peter Collingbourne	d067c8ed59	ThinLTO: Correctly follow aliasee references when dead stripping. We were previously handling aliases during dead stripping by adding the aliased global's "original name" GUID to the worklist. This will lead to incorrect behaviour if the global has local linkage because the original name GUID will not correspond to the global's GUID in the summary. Because an alias is just another name for the global that it references, there is no need to mark the referenced global as used, or to follow references from any other copies of the global. So all we need to do is to follow references from the aliasee's summary instead of the alias. Differential Revision: https://reviews.llvm.org/D37789 llvm-svn: 313157	2017-09-13 17:09:20 +00:00
Teresa Johnson	1958083d35	[ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin link Summary: SamplePGO indirect call profiles record the target as the original GUID for statics. The importer had special handling to map to the normal GUID in that case. The dead global analysis needs the same treatment or inconsistencies arise, resulting in linker unsats due to some dead symbols being exported and kept, leaving in references to other dead symbols that are removed. This can happen when a SamplePGO profile collected by one binary is used for a different binary, so the indirect call profiles may not accurately reflect live targets. Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37783 llvm-svn: 313151	2017-09-13 15:16:38 +00:00
Ayal Zaks	e2a8c0758f	[LV] Fix PR34523 - avoid generating redundant selects When converting a PHI into a series of 'select' instructions to combine the incoming values together according their edge masks, initialize the first value to the incoming value In0 of the first predecessor, instead of generating a redundant assignment 'select(Cond[0], In0, In0)'. The latter fails when the Cond[0] mask is null, representing a full mask, which can happen only when there's a single incoming value. No functional changes intended nor expected other than surviving null Cond[0]'s. This fix follows D35725, which introduced using null to represent full masks. Differential Revision: https://reviews.llvm.org/D37619 llvm-svn: 313119	2017-09-13 06:28:37 +00:00
Aditya Kumar	dfa8741c96	[GVNHoist] Factor out reachability to search for anticipable instructions quickly Factor out the reachability such that multiple queries to find reachability of values are fast. This is based on finding the ANTIC points in the CFG which do not change during hoisting. The ANTIC points are basically the dominance-frontiers in the inverse graph. So we introduce a data structure (CHI nodes) to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the function as they are the potential hoistable candidates. This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a trivial way. Relevant test cases are added to show the functionality as well as regression fixes from PR32821. Regression from previous GVNHoist: We do not hoist fully redundant expressions because fully redundant expressions are already handled by NewGVN Differential Revision: https://reviews.llvm.org/D35918 Reviewers: dberlin, sebpop, gberry, llvm-svn: 313116	2017-09-13 05:28:03 +00:00
Reid Kleckner	8a1cd91016	[InstCombine] Add a flag to disable LowerDbgDeclare Summary: This should improve optimized debug info for address-taken variables at the cost of inaccurate debug info in some situations. We patched this into clang and deployed this change to Chromium developers, and this significantly improved debuggability of optimized code. The long-term solution to PR34136 seems more and more like it's going to take a while, so I would like to commit this change under a flag so that it can be used as a stop-gap measure. This flag should really help so for C++ aggregates like std::string and std::vector, which are typically address-taken, even after inlining, and cannot be SROA-ed. Reviewers: aprantl, dblaikie, probinson, dberlin Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D36596 llvm-svn: 313108	2017-09-13 01:43:25 +00:00
Dehao Chen	f3ed14d323	Refactor the code to pass down ACT to SampleProfileLoader correctly. Summary: This change passes down ACT to SampleProfileLoader for the new PM. Also remove the default value for SampleProfileLoader class as it is not used. Reviewers: eraman, davidxl Reviewed By: eraman Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37773 llvm-svn: 313080	2017-09-12 21:55:55 +00:00
Alina Sbirlea	80b806bf30	Make promoteLoopAccessesToScalars independent of AliasSet [NFC] Summary: The current promoteLoopAccessesToScalars method receives an AliasSet, but the information used is in fact a list of Value, known to must alias. Create the list ahead of time to make this method independent of the AliasSet class. While there is no functionality change, this adds overhead for creating a set of Value, when promotion would normally exit earlier. This is meant to be as a first refactoring step in order to start replacing AliasSetTracker with MemorySSA. And while the end goal is to redesign LICM, the first few steps will focus on adding MemorySSA as an alternative to the AliasSetTracker using most of the existing functionality. Reviewers: mkuper, danielcdh, dberlin Subscribers: sanjoy, chandlerc, gberry, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D35439 llvm-svn: 313075	2017-09-12 21:18:44 +00:00
Anna Thomas	9f1be02fa3	[LV] Clamp the VF to the trip count Summary: When the MaxVectorSize > ConstantTripCount, we should just clamp the vectorization factor to be the ConstantTripCount. This vectorizes loops where the TinyTripCountThreshold >= TripCount < MaxVF. Earlier we were finding the maximum vector width, which could be greater than the trip count itself. The Loop vectorizer does all the work for generating a vectorizable loop, but in the end we would always choose the scalar loop (since the VF > trip count). This allows us to choose the VF keeping in mind the trip count if available. This is a fix on top of rL312472. Reviewers: Ayal, zvi, hfinkel, dneilson Reviewed by: Ayal Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37702 llvm-svn: 313046	2017-09-12 16:32:45 +00:00
Alexey Bataev	a26d3e834d	[SLP] Fix for PHINode during horizontal reduction scanning, NFC. Reduces number of loops during instructions analysis. llvm-svn: 313035	2017-09-12 15:13:50 +00:00
Peter Collingbourne	b9b6025328	LowerTypeTests: Add import/export support for targets without absolute symbol constants. The rationale is the same as for r312967. Differential Revision: https://reviews.llvm.org/D37408 llvm-svn: 312968	2017-09-11 22:49:10 +00:00
Peter Collingbourne	b15a35e604	WholeProgramDevirt: Add import/export support for targets without absolute symbol constants. Not all targets support the use of absolute symbols to export constants. In particular, ARM has a wide variety of constant encodings that cannot currently be relocated by linkers. So instead of exporting the constants using symbols, export them directly in the summary. The values of the constants are left as zeroes on targets that support symbolic exports. This may result in more cache misses when targeting those architectures as a result of arbitrary changes in constant values, but this seems somewhat unavoidable for now. Differential Revision: https://reviews.llvm.org/D37407 llvm-svn: 312967	2017-09-11 22:34:42 +00:00
Uriel Korach	18972237a2	Test commit llvm-svn: 312878	2017-09-10 08:31:22 +00:00
Nuno Lopes	404f106d71	Merge isKnownNonNull into isKnownNonZero It now knows the tricks of both functions. Also, fix a bug that considered allocas of non-zero address space to be always non null Differential Revision: https://reviews.llvm.org/D37628 llvm-svn: 312869	2017-09-09 18:23:11 +00:00
Sanjay Patel	6fd4391ddd	[DivRempairs] add a pass to optimize div/rem pairs (PR31028) This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 llvm-svn: 312862	2017-09-09 13:38:18 +00:00
Kostya Serebryany	4192b96313	[sanitizer-coverage] call appendToUsed once per module, not once per function (which is too slow) llvm-svn: 312855	2017-09-09 05:30:13 +00:00
Alexey Bataev	628fbcae4c	[SLP] Fix buildbots, NFC. llvm-svn: 312853	2017-09-09 02:08:45 +00:00
Dinar Temirbulatov	0d31f0af43	[SLPVectorizer] Add struct InstructionsState that holds information about analysis of vector to be vectorized. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D37212 llvm-svn: 312802	2017-09-08 17:08:17 +00:00
Alexey Bataev	bd4a361739	[SLP] Fix the warning about paths not returning the value, NFC. llvm-svn: 312793	2017-09-08 14:32:20 +00:00
Alexey Bataev	6dd29fccb8	[SLP] Support for horizontal min/max reduction. SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Differential revision: https://reviews.llvm.org/D27846 llvm-svn: 312791	2017-09-08 13:49:36 +00:00
Max Kazantsev	d7b0f74c64	Re-enable "[IRCE] Identify loops with latch comparison against current IV value" Re-applying after the found bug was fixed. Differential Revision: https://reviews.llvm.org/D36215 llvm-svn: 312783	2017-09-08 10:15:05 +00:00
Max Kazantsev	57db44838d	diff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp index f72a808..9fa49fd 100644 --- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp +++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp @@ -450,20 +450,10 @@ struct LoopStructure { // equivalent to: // // intN_ty inc = IndVarIncreasing ? 1 : -1; - // pred_ty predicate = IndVarIncreasing - // ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT - // : IsSignedPredicate ? ICMP_SGT : ICMP_UGT; + // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT; // - // - // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt); - // iv = IndVarNext) + // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase) // ... body ... - // - // Here IndVarBase is either current or next value of the induction variable. - // in the former case, IsIndVarNext = false and IndVarBase points to the - // Phi node of the induction variable. Otherwise, IsIndVarNext = true and - // IndVarBase points to IV increment instruction. - // Value IndVarBase; Value IndVarStart; @@ -471,13 +461,12 @@ struct LoopStructure { Value LoopExitAt; bool IndVarIncreasing; bool IsSignedPredicate; - bool IsIndVarNext; LoopStructure() : Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr), LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr), IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr), - IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {} + IndVarIncreasing(false), IsSignedPredicate(true) {} template <typename M> LoopStructure map(M Map) const { LoopStructure Result; @@ -493,7 +482,6 @@ struct LoopStructure { Result.LoopExitAt = Map(LoopExitAt); Result.IndVarIncreasing = IndVarIncreasing; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; return Result; } @@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, return false; }; - // `ICI` can either be a comparison against IV or a comparison of IV.next. - // Depending on the interpretation, we calculate the start value differently. + // `ICI` is interpreted as taking the backedge if the next* value of the + // induction variable satisfies some constraint. - // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch - // comparisons happens against the IV before or after its value is - // incremented. Two valid combinations for them are: - // - // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false }, - // 2) { iv.next; true }. - // - // The latch comparison happens against IndVarBase which can be either current - // or next value of the induction variable. const SCEVAddRecExpr IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV); bool IsIncreasing = false; bool IsSignedPredicate = true; - bool IsIndVarNext = false; ConstantInt StepCI; if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) { FailureReason = "LHS in icmp not induction variable"; return None; } - const SCEV IndVarStart = nullptr; - // TODO: Currently we only handle comparison against IV, but we can extend - // this analysis to be able to deal with comparison against sext(iv) and such. - if (isa<PHINode>(LeftValue) && - cast<PHINode>(LeftValue)->getParent() == Header) - // The comparison is made against current IV value. - IndVarStart = IndVarBase->getStart(); - else { - // Assume that the comparison is made against next IV value. - const SCEV StartNext = IndVarBase->getStart(); - const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); - IndVarStart = SE.getAddExpr(StartNext, Addend); - IsIndVarNext = true; - } + const SCEV StartNext = IndVarBase->getStart(); + const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); + const SCEV IndVarStart = SE.getAddExpr(StartNext, Addend); const SCEV Step = SE.getSCEV(StepCI); ConstantInt One = ConstantInt::get(IndVarTy, 1); @@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, Result.IndVarIncreasing = IsIncreasing; Result.LoopExitAt = RightValue; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; FailureReason = nullptr; @@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd( BranchToContinuation); NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader); - auto FixupValue = - LS.IsIndVarNext ? PN->getIncomingValueForBlock(LS.Latch) : PN; - NewPHI->addIncoming(FixupValue, RRI.ExitSelector); + NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch), + RRI.ExitSelector); RRI.PHIValuesAtPseudoExit.push_back(NewPHI); } @@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop L, LPPassManager &LPM) { } LoopStructure LS = MaybeLoopStructure.getValue(); const SCEVAddRecExpr IndVar = - cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase)); - if (LS.IsIndVarNext) - IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar, - SE.getSCEV(LS.IndVarStep))); + cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep))); Optional<InductiveRangeCheck::Range> SafeIterRange; Instruction ExprInsertPt = Preheader->getTerminator(); diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll deleted file mode 100644 index afea0e6..0000000 --- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll +++ /dev/null @@ -1,182 +0,0 @@ -; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 \| FileCheck %s - -; Check that IRCE is able to deal with loops where the latch comparison is -; done against current value of the IV, not the IV.next. - -; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> - -; SLT condition for increasing loop from 0 to 100. -define void @test_01(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_01 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp slt i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp slt i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; ULT condition for increasing loop from 0 to 100. -define void @test_02(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_02 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp ult i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp ult i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_01, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_03(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_03 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp slt i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_02, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_04(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_04 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp ult i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -!0 = !{i32 0, i32 50} -!1 = !{i64 0, i64 50} llvm-svn: 312775	2017-09-08 04:26:41 +00:00
Peter Collingbourne	88a58cf9e7	WholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat. This is required when targeting COFF, as the comdat name must match one of the names of the symbols in the comdat. Differential Revision: https://reviews.llvm.org/D37550 llvm-svn: 312767	2017-09-08 00:10:53 +00:00
Reid Kleckner	0e8c4bb055	Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic. llvm-svn: 312759	2017-09-07 23:27:44 +00:00
Richard Trieu	c7828ebea4	Revert r312318, r312325, r312424, r312489 r312318 - Debug info for variables whose type is shrinked to bool r312325, r312424, r312489 - Test case for r312318 Revision 312318 introduced a null dereference bug. Details in https://bugs.llvm.org/show_bug.cgi?id=34490 llvm-svn: 312758	2017-09-07 23:20:35 +00:00
Krzysztof Parzyszek	1dc313727e	Disable jump threading into loop headers Consider this type of a loop: for (...) { ... if (...) continue; ... } Normally, the "continue" would branch to the loop control code that checks whether the loop should continue iterating and which contains the (often) unique loop latch branch. In certain cases jump threading can "thread" the inner branch directly to the loop header, creating a second loop latch. Loop canonicalization would then transform this loop into a loop nest. The problem with this is that in such a loop nest neither loop is countable even if the original loop was. This may inhibit subsequent loop optimizations and be detrimental to performance. Differential Revision: https://reviews.llvm.org/D36404 llvm-svn: 312664	2017-09-06 19:36:58 +00:00
Sanjay Patel	6840c5ff75	[ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite(): https://bugs.llvm.org/show_bug.cgi?id=27145 In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno with a constant operand. But while looking at those patterns, I realized we were missing a canonicalization for nonzero constants. Rather than limiting to just folds for constants, we're adding a general value tracking method for this based on an existing DAG helper. By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps() and pick up missing vector folds. Differential Revision: https://reviews.llvm.org/D37427 llvm-svn: 312591	2017-09-05 23:13:13 +00:00
Davide Italiano	32504cf661	[GVNHoist] Move duplicated code to a helper function. NFCI. llvm-svn: 312575	2017-09-05 20:49:41 +00:00
Craig Topper	28d6d962d5	[InstCombine] Move foldSelectICmpAnd helper function earlier in the file to enable reuse in a future patch. llvm-svn: 312518	2017-09-05 05:26:37 +00:00
Craig Topper	4c766a0559	[InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know for sure we're going to use it and avoid an unnecessary call to m_APInt. Instead of creating a Constant and then calling m_APInt with it (which will always return true). Just create an APInt initially, and use that for the checks in isSelect01 function. If it turns out we do need the Constant, create it from the APInt. This is a refactor for a future patch that will do some more checks of the constant values here. llvm-svn: 312517	2017-09-05 05:26:36 +00:00
Daniel Berlin	f9c9455d3f	NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect self-cycles of phi nodes. We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant. llvm-svn: 312510	2017-09-05 02:17:43 +00:00
Daniel Berlin	54a92fcc5d	NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification llvm-svn: 312509	2017-09-05 02:17:42 +00:00
Daniel Berlin	1a58258232	NewGVN: Detect copies through predicateinfo llvm-svn: 312508	2017-09-05 02:17:41 +00:00
Daniel Berlin	4ad7e8d263	NewGVN: Change where check for original instruction in phi of ops leader finding is done. Where we had it before, we would stop looking when we hit the original instruction, but skip it. Now we skip it and keep looking. llvm-svn: 312507	2017-09-05 02:17:40 +00:00
Zvi Rackover	9a087a357a	LoopVectorize: MaxVF should not be larger than the loop trip count Summary: Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count. Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow: 1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks. 2. Compute MaxVF returned 16 on a target supporting AVX512. 3. OptForSize -> choose VF:=MaxVF. 4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop. With this patch step 2. will choose MaxVF=8 based on TripCount. Reviewers: Ayal, dorit, mkuper, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D37425 llvm-svn: 312472	2017-09-04 08:35:13 +00:00
Sam Parker	7cd826a321	[LoopUnroll][DebugInfo] Don't add metadata to unrolled remainder loop Debug information can be, and was, corrupted when the runtime remainder loop was fully unrolled. This is because a !null node can be created instead of a unique one describing the loop. In this case, the original node gets incorrectly updated with the NewLoopID metadata. In the case when the remainder loop is going to be quickly fully unrolled, there isn't the need to add loop metadata for it anyway. Differential Revision: https://reviews.llvm.org/D37338 llvm-svn: 312471	2017-09-04 08:12:16 +00:00
Sanjay Patel	bc6da4e40f	[InstCombine] replace unnecessary fcmp fold with assert See https://reviews.llvm.org/rL312411 for related InstSimplify tests. llvm-svn: 312421	2017-09-02 18:10:29 +00:00
Sanjay Patel	64fc5daf42	[InstCombine] combine foldAndOfFCmps and foldOrOfFcmps; NFCI In addition to removing chunks of duplicated code, we don't want these to diverge. If there's a fold for one, there should be a fold of the other via DeMorgan's Laws. llvm-svn: 312420	2017-09-02 17:53:33 +00:00
Sanjay Patel	275bb5a14e	[InstCombine] fix misnamed locals and use them to reduce code; NFCI We had these locals: Value Op0RHS = LHS->getOperand(1); Value Op1LHS = RHS->getOperand(0); ...so we confusingly transposed the meaning of left/right and op0/op1. llvm-svn: 312418	2017-09-02 17:17:17 +00:00
Benjamin Kramer	14ddcdfb18	[LoopVectorize] Turn static DenseSet into switch. LLVM transforms this into a bit test which is a lot faster and smaller. llvm-svn: 312417	2017-09-02 16:41:55 +00:00
Sanjay Patel	da6f9b2fee	[InstCombine] remove unnecessary code; NFC llvm-svn: 312416	2017-09-02 16:32:37 +00:00
Sanjay Patel	4c52f765a5	[InstCombine] move related functions next to each other; NFC This makes it easier to see that they're almost duplicates. As with the similar icmp functions, there should be identical folds for both logic ops because those are DeMorganized variants. llvm-svn: 312415	2017-09-02 16:30:27 +00:00
Sanjay Patel	6b139464ca	[InstCombine] use local variable to reduce code duplication; NFCI llvm-svn: 312414	2017-09-02 15:11:55 +00:00
Daniel Berlin	94090dd13b	Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders. Summary: After a discussion with Rekka, i believe this (or a small variant) should fix the remaining phi-of-ops problems. Rekka's algorithm for completeness relies on looking up expressions that should have no leader, and expecting it to fail (IE looking up expressions that can't exist in a predecessor, and expecting it to find nothing). Unfortunately, sometimes these expressions can be simplified to constants, but we need the lookup to fail anyway. Additionally, our simplifier outsmarts this by taking these "not quite right" expressions, and simplifying them into other expressions or walking through phis, etc. In the past, we've sometimes been able to find leaders for these expressions, incorrectly. This change causes us to not to try to phi of ops such expressions. We determine safety by seeing if they depend on a phi node in our block. This is not perfect, we can do a bit better, but this should be a "correctness start" that we can then improve. It also requires a bunch of caching that i'll eventually like to eliminate. The right solution, longer term, to the simplifier issues, is to make the query interface for the instruction simplifier/constant folder have the flags we need, so that we can keep most things going, but turn off the possibly-invalid parts (threading through phis, etc). This is an issue in another wrong code bug as well. Reviewers: davide, mcrosier Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37175 llvm-svn: 312401	2017-09-02 02:18:44 +00:00
Eugene Zelenko	75075efe5e	[Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 312383	2017-09-01 21:37:29 +00:00
Craig Topper	924f20262b	[InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask. This allows some code to be removed from InstSimplify that was doing this functionality. This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out. There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary. Differential Revision: https://reviews.llvm.org/D37158 llvm-svn: 312382	2017-09-01 21:27:34 +00:00
Craig Topper	d3b465606a	[InstCombine] Don't require the compare types to be the same in getMaskedTypeForICmpPair. A future patch will make the code look through truncates feeding the compare. So the compares might be different types but the pretruncated types might be the same. This should be safe because we still require the same Value* to be used truncated or not in both compares. So that serves to ensure the types are the same. llvm-svn: 312381	2017-09-01 21:27:31 +00:00
Craig Topper	085c1f4dea	[InstCombine] When converting decomposeBitTestICmp's APInt return to ConstantInt, make sure we use the type from the Value* that was also returned from decomposeBitTestICmp. Previously we used the type from the LHS of the compare, but a future patch will change decomposeBitTestICmp to look through truncates so it will return a pretruncated Value* and the type needs to match that. llvm-svn: 312380	2017-09-01 21:27:29 +00:00
Daniel Berlin	86932104db	NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context. Reviewers: davide, mcrosier Subscribers: sanjoy, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D37174 llvm-svn: 312352	2017-09-01 19:20:18 +00:00
Manoj Gupta	6b54c7e11b	[LoopVectorizer] Use two step casting for float to pointer types. Summary: LoopVectorizer is creating casts between vec<ptr> and vec<float> types on ARM when compiling OpenCV. Since, tIs is illegal to directly cast a floating point type to a pointer type even if the types have same size causing a crash. Fix the crash using a two-step casting by bitcasting to integer and integer to pointer/float. Fixes PR33804. Reviewers: mkuper, Ayal, dlj, rengolin, srhines Reviewed By: rengolin Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35498 llvm-svn: 312331	2017-09-01 15:36:00 +00:00
Clement Courbet	bc0c4459c9	[MergeICmps] Fix build of rL312315 on clang-with-thin-lto-windows: MergeICmps.cpp(68,15): error: chosen constructor is explicit in copy-initialization return {}; APInt.h(339,12): note: explicit constructor declared here explicit APInt() : BitWidth(1) { U.VAL = 0; } ^ MergeICmps.cpp(56,9): note: in implicit initialization of field 'Offset' with omitted initializer APInt Offset; ^ llvm-svn: 312326	2017-09-01 11:51:23 +00:00
Clement Courbet	65130e2d8d	Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer Add missing header. This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf. llvm-svn: 312322	2017-09-01 10:56:34 +00:00
Strahinja Petrovic	676fd0b022	Debug info for variables whose type is shrinked to bool This patch provides such debug information for integer variables whose type is shrinked to bool by providing dwarf expression which returns either constant initial value or other value. Patch by Nikola Prica. Differential Revision: https://reviews.llvm.org/D35994 llvm-svn: 312318	2017-09-01 10:05:27 +00:00
Clement Courbet	316212575b	Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer" Break build This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b. llvm-svn: 312317	2017-09-01 09:43:08 +00:00
Clement Courbet	9473c01e96	[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer comparisons into memcmp. Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==(). For now this is disabled by default until: - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete - Benchmarks show that this is always useful. Differential Revision: https://reviews.llvm.org/D33987 llvm-svn: 312315	2017-09-01 09:07:05 +00:00
Eugene Zelenko	fa6434bebb	[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC). llvm-svn: 312289	2017-08-31 21:56:16 +00:00
Akira Hatanaka	13d2beb14d	[ObjCARC] Pass the correct BasicBlock to fix assertion failure. The BasicBlock passed to FindPredecessorRetainWithSafePath should be the parent block of Autorelease. This fixes a crash that occurs in FindDependencies when StartInst is not in StartBB. rdar://problem/33866381 llvm-svn: 312266	2017-08-31 18:27:47 +00:00
Sanjay Patel	e6b48a1b02	[InstCombine] improve demanded vector elements analysis of insertelement Recurse instead of returning on the first found optimization. Also, return early in the caller instead of continuing because that allows another round of simplification before we might potentially lose undef information from a shuffle mask by eliminating the shuffle. As noted in the review, we could probably do better and be more efficient by moving all of demanded elements into a separate pass, but this is yet another quick fix to instcombine. Differential Revision: https://reviews.llvm.org/D37236 llvm-svn: 312248	2017-08-31 15:57:17 +00:00
Dinar Temirbulatov	8870a14e4e	[SLPVectorizer] Move out Entry->NeedToGather check and assert of inner loop as invariant, NFCI. llvm-svn: 312242	2017-08-31 14:10:07 +00:00
Max Kazantsev	0a9c1ef2eb	[IRCE] Identify loops with latch comparison against current IV value Current implementation of parseLoopStructure interprets the latch comparison as a comarison against `iv.next`. If the actual comparison is made against the `iv` current value then the loop may be rejected, because this misinterpretation leads to incorrect evaluation of the latch start value. This patch teaches the IRCE to distinguish this kind of loops and perform the optimization for them. Now we use `IndVarBase` variable which can be either next or current value of the induction variable (previously we used `IndVarNext` which was always the value on next iteration). Differential Revision: https://reviews.llvm.org/D36215 llvm-svn: 312221	2017-08-31 07:04:20 +00:00
Max Kazantsev	a22742be5a	[IRCE][NFC] Rename IndVarNext to IndVarBase Renaming as a preparation step to generalizing IRCE for comparison not only against the next value of an indvar, but also against the current. Differential Revision: https://reviews.llvm.org/D36509 llvm-svn: 312215	2017-08-31 05:58:15 +00:00
Adrian Prantl	504b82d44b	Don't add a fragment expression when GlobalSRA splits up a single-member struct Fixes PR34390. https://bugs.llvm.org/show_bug.cgi?id=34390 llvm-svn: 312196	2017-08-31 00:06:18 +00:00
Matt Morehouse	034126e507	[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. - Only enable on Linux. Reviewers: vitalybuka, kcc, george.karpenkov Reviewed By: kcc Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 312185	2017-08-30 22:49:31 +00:00
Adrian Prantl	b192b545c1	Refactor DIBuilder::createFragmentExpression into a static DIExpression member NFC llvm-svn: 312165	2017-08-30 20:04:17 +00:00
Daniel Berlin	23fec57e6f	NewGVN: Make sure we add the correct user if we swapped the comparison operands llvm-svn: 312162	2017-08-30 19:53:23 +00:00
Daniel Berlin	7ef26daba8	NewGVN: Allow simplification into variables llvm-svn: 312161	2017-08-30 19:52:39 +00:00
Benjamin Kramer	b99d7c9214	[GVNSink] Remove dependency on SmallPtrSet iteration order. Found by LLVM_ENABLE_REVERSE_ITERATION. llvm-svn: 312156	2017-08-30 18:46:37 +00:00
Sanjay Patel	6f7ac7e402	[InstCombine] remove unnecessary vector select fold; NFCI This code is double-dead: 1. We simplify all selects with constant true/false condition in InstSimplify. I've minimized/moved the tests to show that works as expected. 2. All remaining vector selects with a constant condition are canonicalized to shufflevector, so we really can't see this pattern. llvm-svn: 312123	2017-08-30 14:04:57 +00:00
Florian Hahn	b992feee13	[InstCombine] Fold insert sequence if first ins has multiple users. Summary: If the first insertelement instruction has multiple users and inserts at position 0, we can re-use this instruction when folding a chain of insertelement instructions. As we need to generate the first insertelement instruction anyways, this should be a strict improvement. We could get rid of the restriction of inserting at position 0 by creating a different shufflemask, but it is probably worth to keep the first insertelement instruction with position 0, as this is easier to do efficiently than at other positions I think. Reviewers: grosser, mkuper, fpetrogalli, efriedma Reviewed By: fpetrogalli Subscribers: gareevroman, llvm-commits Differential Revision: https://reviews.llvm.org/D37064 llvm-svn: 312110	2017-08-30 10:54:21 +00:00
Mandeep Singh Grang	e3bbb68b0c	[cfi] Fixed non-determinism in codegen due to DenseSet iteration order llvm-svn: 312098	2017-08-30 04:47:21 +00:00
Evgeniy Stepanov	7372b67063	[cfi] Avoid branch veneers in jump tables when possible. Summary: When jumptable encoding does not match target code encoding (arm vs thumb), a veneer is inserted by the linker. We can not avoid this in all cases, because entries within one jumptable must have the same encoding, but we can make it less common by selecting the jumptable encoding to match the majority of its targets. This change only covers FullLTO, and not ThinLTO. Reviewers: pcc Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37171 llvm-svn: 312054	2017-08-29 22:40:19 +00:00
Evgeniy Stepanov	4731ad81c7	[cfi] Build __cfi_check as Thumb when applicable. Summary: Cross-DSO CFI needs all __cfi_check exports to use the same encoding (ARM vs Thumb). Reviewers: pcc Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37243 llvm-svn: 312052	2017-08-29 22:29:15 +00:00
Matt Morehouse	ba2e61b357	Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer" This reverts r312026 due to bot breakage. llvm-svn: 312047	2017-08-29 21:56:56 +00:00
Wei Mi	ebb9327759	[LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition fails to find a loop invariant condition. This is wrong and it will disable loop unswitch for select. The patch fixes the bug. Differential Revision: https://reviews.llvm.org/D36985 llvm-svn: 312045	2017-08-29 21:45:11 +00:00
Benjamin Kramer	154411e0e7	[FunctionImport] Avoid unused variable warnings in Release builds Just skip the entire block in NDEBUG. No functionality change intended. llvm-svn: 312031	2017-08-29 20:24:39 +00:00
Alexey Bataev	978e2e4760	[SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores. Summary: If SimplifyCFG pass is able to merge conditional stores into single one, it loses the alignment. This may lead to incorrect codegen. Patch sets the alignment of the new instruction if it is set in the original one. Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36841 llvm-svn: 312030	2017-08-29 20:06:24 +00:00
Matt Morehouse	2ad8d948b2	[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. - Disable stack depth tracking on Mac. Reviewers: vitalybuka, kcc, george.karpenkov Reviewed By: kcc Subscribers: kubamracek, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 312026	2017-08-29 19:48:12 +00:00
Craig Topper	4431bfe88c	[InstCombine] Support vector splats in transformZExtICmp This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code. One test didn't vectorize optimize properly due to another bug so a TODO has been added. Differential Revision: https://reviews.llvm.org/D37253 llvm-svn: 312023	2017-08-29 18:58:13 +00:00
Teresa Johnson	2df7fc7991	[ThinLTO] Clean up stale alias import handling Summary: Remove some code that was no longer needed. The first FIXME is stale since we long ago started using the index to drive importing, rather than doing force importing based on linkage type. And now with r309278, we no longer import any aliases. Reviewers: dblaikie Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D37266 llvm-svn: 312019	2017-08-29 18:15:34 +00:00
Dehao Chen	efd007f6f4	Add null check for promoted direct call Summary: We originally assume that in pgo-icp, the promoted direct call will never be null after strip point casts. However, stripPointerCasts is so smart that it could possibly return the value of the function call if it knows that the return value is always an argument. In this case, the returned value cannot cast to Instruction. In this patch, null check is added to ensure null pointer will not be accessed. Reviewers: tejohnson, xur, davidxl, djasper Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D37252 llvm-svn: 312005	2017-08-29 15:28:12 +00:00
Sanjay Patel	674d2c23ea	[Instruction] add moveAfter() convenience function; NFCI As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(), but implemented using moveBefore() for symmetry/efficiency. Differential Revision: https://reviews.llvm.org/D37239 llvm-svn: 312001	2017-08-29 14:07:48 +00:00
Max Kazantsev	bb1d010872	[LSR] Fix Shadow IV in case of integer overflow When LSR processes code like int accumulator = 0; for (int i = 0; i < N; i++) { accummulator += i; use((double) accummulator); } It may decide to replace integer `accumulator` with a double Shadow IV to get rid of casts. The problem with that is that the `accumulator`'s value may overflow. Starting from this moment, the behavior of integer and double accumulators will differ. This patch strenghtens up the conditions of Shadow IV mechanism applicability. We only allow it for IVs that are proved to be `AddRec`s with `nsw`/`nuw` flag. Differential Revision: https://reviews.llvm.org/D37209 llvm-svn: 311986	2017-08-29 07:32:20 +00:00
Craig Topper	5d6ddda92d	[InstCombine] Teach foldSelectICmpAndOr to handle vector splats This was pretty close to working already. While I was here I went ahead and passed the ICmpInst pointer from the caller instead of doing a dyn_cast that can never fail. Differential Revision: https://reviews.llvm.org/D37237 llvm-svn: 311960	2017-08-29 00:13:49 +00:00
Justin Bogner	f1a54a47b0	[sanitizer-coverage] Mark the guard and 8-bit counter arrays as used In r311742 we marked the PCs array as used so it wouldn't be dead stripped, but left the guard and 8-bit counters arrays alone since these are referenced by the coverage instrumentation. This doesn't quite work if we want the indices of the PCs array to match the other arrays though, since elements can still end up being dead and disappear. Instead, we mark all three of these arrays as used so that they'll be consistent with one another. llvm-svn: 311959	2017-08-29 00:11:05 +00:00
Justin Bogner	873a0746f1	[sanitizer-coverage] Return the array from CreatePCArray. NFC Be more consistent with CreateFunctionLocalArrayInSection in the API of CreatePCArray, and assign the member variable in the caller like we do for the guard and 8-bit counter arrays. This also tweaks the order of method declarations to match the order of definitions in the file. llvm-svn: 311955	2017-08-28 23:46:11 +00:00
Justin Bogner	be757de2b6	[sanitizer-coverage] Clean up trailing whitespace. NFC llvm-svn: 311954	2017-08-28 23:38:12 +00:00
Kamil Rytarowski	a9f404f813	Define NetBSD/amd64 ASAN Shadow Offset Summary: Catch up after compiler-rt changes and define kNetBSD_ShadowOffset64 as (1ULL << 46). Sponsored by <The NetBSD Foundation> Reviewers: kcc, joerg, filcab, vitalybuka, eugenis Reviewed By: eugenis Subscribers: llvm-commits, #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D37234 llvm-svn: 311941	2017-08-28 22:13:52 +00:00
Craig Topper	516e39cd38	[InstCombine] Teach select01 helper of foldSelectIntoOp to handle vector splats We were handling some vectors in foldSelectIntoOp, but not if the operand of the bin op was any kind of vector constant. This patch fixes it to treat vector splats the same as scalars. Differential Revision: https://reviews.llvm.org/D37232 llvm-svn: 311940	2017-08-28 22:00:27 +00:00
Davide Italiano	20cb7e887f	[LoopUnroll] Properly update loop structure in case of successful peeling. When peeling kicks in, it updates the loop preheader. Later, a successful full unroll of the loop needs to update a PHI which i-th argument comes from the loop preheader, so it'd better look at the correct block. Fixes PR33437. Differential Revision: https://reviews.llvm.org/D37153 llvm-svn: 311922	2017-08-28 20:29:33 +00:00
Davide Italiano	9a09ae448d	[LoopUnroll] Add a cl::opt to force peeling, for testing purposes. Will be used to test the patch proposed in D37153. llvm-svn: 311915	2017-08-28 19:50:55 +00:00
Taewook Oh	572f45a3c8	Create PHI node for the return value only when the return value has uses. Summary: Currently, a phi node is created in the normal destination to unify the return values from promoted calls and the original indirect call. This patch makes this phi node to be created only when the return value has uses. This patch is necessary to generate valid code, as compiler crashes with the attached test case without this patch. Without this patch, an illegal phi node that has no incoming value from `entry`/`catch` is created in `cleanup` block. I think existing implementation is good as far as there is at least one use of the original indirect call. `insertCallRetPHI` creates a new phi node in the normal destination block only when the original indirect call dominates its use and the normal destination block. Otherwise, `fixupPHINodeForNormalDest` will handle the unification of return values naturally without creating a new phi node. However, if there's no use, `insertCallRetPHI` still creates a new phi node even when the original indirect call does not dominate the normal destination block, because `getCallRetPHINode` returns false. Reviewers: xur, davidxl, danielcdh Reviewed By: xur Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37176 llvm-svn: 311906	2017-08-28 18:57:00 +00:00
Craig Topper	3763f0e00d	[InstCombine] Call hasNoSignedWrap instead of hasNoUnsignedWrap to get the NSW flag when handling Add in SimplifyDemandedUseBits. This is a typo from r311789. This should fix PR34349. llvm-svn: 311902	2017-08-28 18:44:28 +00:00
NAKAMURA Takumi	a1e97a77f5	Untabify. llvm-svn: 311875	2017-08-28 06:47:47 +00:00
Dehao Chen	191b24d3d2	revert r310985 which breaks for the following case: struct string { ~string(); }; void f2(); void f1(int) { f2(); } void run(int c) { string body; while (true) { if (c) f1(c); else f1(c); } } Will recommit once the issue is fixed. llvm-svn: 311864	2017-08-27 22:22:39 +00:00
Ayal Zaks	1f58dda4e4	[LV] Fix PR34248 - recommit D32871 after revert r311304 Original commit r311077 of D32871 was reverted in r311304 due to failures reported in PR34248. This recommit fixes PR34248 by restricting the packing of predicated scalars into vectors only when vectorizing, avoiding doing so when unrolling w/o vectorizing. Added a test derived from the reproducer of PR34248. llvm-svn: 311849	2017-08-27 12:55:46 +00:00
Davide Italiano	9bdccb37d5	[NewGVN] Use `auto` when the type is obvious NFCI. llvm-svn: 311838	2017-08-26 22:31:10 +00:00
Daniel Berlin	de269f4620	NewGVN: Fix PR33204 - We need to add memory users when we bypass memorydefs for loads, not just when we do it for stores. llvm-svn: 311829	2017-08-26 07:37:11 +00:00
Davide Italiano	a872519dbd	[Inliner] Only compute fully inline cost when remarks are enabled. Prior to this change (and after r311371), we computed it unconditionally, causin gsevere compile time regressions (in some cases, 5 to 10x). llvm-svn: 311804	2017-08-25 22:01:42 +00:00
Matt Morehouse	6ec7595b1e	Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer" This reverts r311801 due to a bot failure. llvm-svn: 311803	2017-08-25 22:01:21 +00:00
Matt Morehouse	f42bd31323	[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer Summary: - Don't sanitize __sancov_lowest_stack. - Don't instrument leaf functions. - Add CoverageStackDepth to Fuzzer and FuzzerNoLink. Reviewers: vitalybuka, kcc Reviewed By: kcc Subscribers: cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37156 llvm-svn: 311801	2017-08-25 21:18:29 +00:00
Kostya Serebryany	d3e4b7e24a	[sanitizer-coverage] extend fsanitize-coverage=pc-table with flags for every PC llvm-svn: 311794	2017-08-25 19:29:47 +00:00
Craig Topper	35171e5d67	[InstCombine] Don't fall back to only calling computeKnownBits if the upper bit of Add/Sub is demanded. Just create an all 1s demanded mask and continue recursing like normal. The recursive calls should be able to handle an all 1s mask and do the right thing. The only time we should care about knowing whether the upper bit was demanded is when we need to know if we should clear the NSW/NUW flags. Now that we have a consistent path through the code for all cases, use KnownBits::computeForAddSub to compute the known bits at the end since we already have the LHS and RHS. My larger goal here is to move the code that turns add into xor if only 1 bit is demanded and no bits below it are non-zero from InstCombiner::OptAndOp to here. This will allow it to be more general instead of just looking for 'add' and 'and' with constant RHS. Differential Revision: https://reviews.llvm.org/D36486 llvm-svn: 311789	2017-08-25 18:39:40 +00:00
Florian Hahn	cd78345398	[LoopInterchange] Skip zext instructions when looking for induction var. Summary: SimplifyIndVar may introduce zext instructions to widen arguments of the loop exit check. They should not prevent us from splitting the loop at the induction variable, but maybe the check should be more conservative, e.g. making sure it only extends arguments used by a comparison? Reviewers: karthikthecool, mcrosier, mzolotukhin Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34879 llvm-svn: 311783	2017-08-25 16:52:29 +00:00
Amjad Aboud	22178dd33b	[InstCombine] Consider more cases where SimplifyDemandedUseBits does not convert AShr to LShr. There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit. Differential Revision: https://reviews.llvm.org/D36936 llvm-svn: 311773	2017-08-25 11:07:54 +00:00
Gor Nishanov	e29e94cf87	[coroutines] Add support for symmetric control transfer (musttail on coro.resumes followed by a suspend) Summary: Add musttail to any resume instructions that is immediately followed by a suspend (i.e. ret). We do this even in -O0 to support guaranteed tail call for symmetrical coroutine control transfer (C++ Coroutines TS extension). This transformation is done only in the resume part of the coroutine that has identical signature and calling convention as the coro.resume call. Reviewers: GorNishanov Reviewed By: GorNishanov Subscribers: EricWF, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D37125 llvm-svn: 311751	2017-08-25 02:25:10 +00:00
Justin Bogner	ad96ff1228	[sanitizer-coverage] Make sure pc-tables aren't dead stripped Add a reference to the PC array in llvm.used so that linkers that aggressively dead strip (like ld64) don't remove it. llvm-svn: 311742	2017-08-25 01:24:54 +00:00
Xinliang David Li	66531dd10a	[Profile] backward propagate profile info in JumpThreading Take-2 after fixing bugs in the original patch. Differential Revsion: http://reviews.llvm.org/D36864 llvm-svn: 311727	2017-08-24 22:54:01 +00:00
Sanjay Patel	bb789381fc	[InstCombine] fix and enhance udiv/urem narrowing There are 3 small independent changes here: 1. Account for multiple uses in the pattern matching: avoid the transform if it increases the instruction count. 2. Add a missing fold for the case where the numerator is the constant: http://rise4fun.com/Alive/E2p 3. Enable all folds for vector types. There's still one more potential change - use "shouldChangeType()" to keep from transforming to an illegal integer type. Differential Revision: https://reviews.llvm.org/D36988 llvm-svn: 311726	2017-08-24 22:54:01 +00:00
Chad Rosier	f98335e0b0	[PartialInlining] Formatting. NFC. llvm-svn: 311702	2017-08-24 21:21:09 +00:00
Chad Rosier	4cb2e82774	[PartialInlining] Type. NFC. llvm-svn: 311699	2017-08-24 20:29:02 +00:00
Sanjay Patel	5d67d8916e	[BypassSlowDivision] move map helper code to header; NFC We can reuse this code with other div/rem transforms as shown in: https://reviews.llvm.org/D31037 https://bugs.llvm.org/show_bug.cgi?id=31028 llvm-svn: 311661	2017-08-24 14:43:33 +00:00
Mikael Holmen	7a99e33b8e	[Reassociate] Do not drop debug location if replacement is missing Summary: When reassociating an expression, do not drop the instruction's original debug location in case the replacement location is missing. The debug location must at least not be dropped for inlinable callsites of debug-info-bearing functions in debug-info-bearing functions. Failing to do so would result in an "inlinable function " "call in a function with debug info must have a !dbg location" error in the verifier. As preserving the original debug location is not expected to result in overly jumpy debug line information, it is preserved for all other cases too. This fixes PR34231: https://bugs.llvm.org/show_bug.cgi?id=34231 Original patch by David Stenberg Reviewers: davide, craig.topper, mcrosier, dblaikie, aprantl Reviewed By: davide, aprantl Subscribers: aprantl Differential Revision: https://reviews.llvm.org/D36865 llvm-svn: 311642	2017-08-24 09:05:00 +00:00
Daniel Berlin	f948603a15	NewGVN: We weren't properly simplifying selects with equal arguments due to a thinko. llvm-svn: 311626	2017-08-24 02:43:17 +00:00
Rong Xu	15848e5977	[PGO] Set edge weights for indirectbr instruction with profile counts Current PGO only annotates the edge weight for branch and switch instructions with profile counts. We should also annotate the indirectbr instruction as all the information is there. This patch enables the annotating for indirectbr instructions. Also uses this annotation in branch probability analysis. Differential Revision: https://reviews.llvm.org/D37074 llvm-svn: 311604	2017-08-23 21:36:02 +00:00
Hans Wennborg	66f6fc0a49	LowerAtomic: Don't skip optnone functions; atomic still need lowering (PR34020) The lowering isn't really an optimization, so optnone shouldn't make a difference. ARM relies on the pass running when using "-mthread-model single", because in that mode, it doesn't run AtomicExpand. See bug for more details. Differential Revision: https://reviews.llvm.org/D37040 llvm-svn: 311565	2017-08-23 15:43:28 +00:00
Gor Nishanov	2f55b958b1	[coroutines] CoroBegin from inner coroutines should be considered for spills Summary: If a coroutine outer calls another coroutine inner and the inner coroutine body is inlined into the outer, coro.begin from the inner coroutine should be considered for spilling if accessed across suspends. Prior to this change, coroutine frame building code was not considering any coro.begins for spilling. With this change, we only ignore coro.begin for the current coroutine, but, any coro.begins that were inlined into the current coroutine are eligible for spills. Fixes PR34267 Reviewers: GorNishanov Subscribers: qcolombet, llvm-commits, EricWF Differential Revision: https://reviews.llvm.org/D37062 llvm-svn: 311556	2017-08-23 14:47:52 +00:00
Chad Rosier	8db41e9dbd	[Reassociate] Don't canonicalize x + (-Constant * y) -> x - (Constant * y).. ..if the resulting subtract will be broken up later. This can cause us to get into an infinite loop. x + (-5.0 * y) -> x - (5.0 * y) ; Canonicalize neg const x - (5.0 * y) -> x + (0 - (5.0 * y)) ; Break up subtract x + (0 - (5.0 * y)) -> x + (-5.0 * y) ; Replace 0-X with X*-1. PR34078 llvm-svn: 311554	2017-08-23 14:10:06 +00:00
Davide Italiano	c78885818a	[InstCombine] Fold branches with irrelevant conditions to a constant. InstCombine folds instructions with irrelevant conditions to undef. This, as Nuno confirmed is a bug. (see https://bugs.llvm.org/show_bug.cgi?id=33409#c1 ) Given the original motivation for the change is that of removing an USE, we now fold to false instead (which reaches the same goal without undesired side effects). Fixes PR33409. Differential Revision: https://reviews.llvm.org/D36975 llvm-svn: 311540	2017-08-23 09:14:37 +00:00
Craig Topper	a85f86225a	[InstCombine] Remove unused argument. NFC llvm-svn: 311529	2017-08-23 05:46:09 +00:00
Craig Topper	a94069fb4c	[InstCombine] Replace a simple matcher with a plain old dyn_cast. NFC llvm-svn: 311528	2017-08-23 05:46:08 +00:00
Craig Topper	524c44f74e	[InstCombine] Remove an unnecessary dyn_cast to Instruction and a switch over two opcodes. Just dyn_cast to the specific instruction classes individually. NFC Change the helper methods to take the more specific class as well. llvm-svn: 311527	2017-08-23 05:46:07 +00:00
Craig Topper	ec4b82571c	[InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast Looks like for 'and' and 'or' we end up performing at least some of the transformations this is bocking in a round about way anyway. For 'and sext(cmp1), sext(cmp2) we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'. With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok. Differential Revision: https://reviews.llvm.org/D36213 llvm-svn: 311508	2017-08-22 23:40:15 +00:00
Peter Collingbourne	001052a067	WholeProgramDevirt: Create bitcast to i8* at each virtual call site. We can't reuse the llvm.assume instruction's bitcast because it may not dominate every user of the vtable pointer. Differential Revision: https://reviews.llvm.org/D36994 llvm-svn: 311491	2017-08-22 21:41:19 +00:00
Matt Morehouse	b1fa8255db	[SanitizerCoverage] Optimize stack-depth instrumentation. Summary: Use the initialexec TLS type and eliminate calls to the TLS wrapper. Fixes the sanitizer-x86_64-linux-fuzzer bot failure. Reviewers: vitalybuka, kcc Reviewed By: kcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37026 llvm-svn: 311490	2017-08-22 21:28:29 +00:00
Jakub Kuderski	2724d45325	[ADCE][Dominators] Reapply: Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. This is reapplies the original patch r311057 that was reverted in r311381. The previous version wasn't using the batch update api for updating dominators, which in vary rare cases caused assertion failures. This also fixes PR34258. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311467	2017-08-22 16:30:21 +00:00
Craig Topper	775ffcc8f5	[InstCombine] Move the checks for pointer types in getMaskedTypeForICmpPair earlier in the function I don't think there's any reason to have them scattered about and on all 4 operands. We already have an early check that both compares must be the same type. And within a given compare the LHS and RHS must have the same type. Beyond that I don't think there's anyway this function returns anything valid for pointer types. So let's just return early and be done with it. Differential Revision: https://reviews.llvm.org/D36561 llvm-svn: 311383	2017-08-21 21:00:45 +00:00
Sanjoy Das	08a38fe71e	Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators" Summary: This partially reverts commit r311057 since it breaks ADCE. See PR34258. Reviewers: kuhar Subscribers: mcrosier, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D36979 llvm-svn: 311381	2017-08-21 20:39:18 +00:00
Haicheng Wu	0812c5bea3	[InlineCost] Add cl::opt to allow full inline cost to be computed for debugging purposes. Currently, the inline cost model will bail once the inline cost exceeds the inline threshold in order to avoid unnecessary compile-time. However, when debugging it is useful to compute the full cost, so this command line option is added to override the default behavior. I took over this work from Chad Rosier (mcrosier@codeaurora.org). Differential Revision: https://reviews.llvm.org/D35850 llvm-svn: 311371	2017-08-21 20:00:09 +00:00
Sanjay Patel	82ec872990	[LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try) The 1st try was reverted because it could inf-loop by creating a dead instruction. Fixed that to not happen and added a test case to verify. Original commit message: Try to fold: memcmp(X, C, ConstantLength) == 0 --> load X == *C Without this change, we're unnecessarily checking the alignment of the constant data, so we miss the transform in the first 2 tests in the patch. I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion patches. This doesn't help the example in: https://bugs.llvm.org/show_bug.cgi?id=34032#c13 ...directly, but it's worth short-circuiting more of these simple cases since we're already trying to do that. The benefit of transforming to load+cmp is that existing IR analysis/transforms may further simplify that code. For example, if the load of the variable is common to multiple memcmp calls, CSE can remove the duplicate instructions. Differential Revision: https://reviews.llvm.org/D36922 llvm-svn: 311366	2017-08-21 19:13:14 +00:00
Craig Topper	74177e1ed1	[InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely. Differential Revision: https://reviews.llvm.org/D36498 llvm-svn: 311362	2017-08-21 19:02:06 +00:00
Sam Elliott	e963c89d11	Migrate WholeProgramDevirt to new Optimization Remark API Summary: This is an attempt to move WholeProgramDevirt to the new remark API. https://bugs.llvm.org/show_bug.cgi?id=33793 Reviewers: anemet Reviewed By: anemet Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D36943 llvm-svn: 311352	2017-08-21 16:57:21 +00:00
Sam Elliott	e604b563ea	Emit only A Single Opt Remark When Inlining Summary: This updates the Inliner to only add a single Optimization Remark when Inlining, rather than an Analysis Remark and an Optimization Remark. Fixes https://bugs.llvm.org/show_bug.cgi?id=33786 Reviewers: anemet, davidxl, chandlerc Reviewed By: anemet Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D36054 llvm-svn: 311349	2017-08-21 16:45:47 +00:00
Craig Topper	cc255bcd77	[InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions Summary: If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear. Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits. Reviewers: spatel, majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36944 llvm-svn: 311343	2017-08-21 16:04:11 +00:00
Xinliang David Li	d2838fc4b9	Revert 311208, 311209 llvm-svn: 311341	2017-08-21 16:00:38 +00:00
Sanjay Patel	707f786cc5	revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments We're getting lots of compile-timeout bot failures like: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119 http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux llvm-svn: 311340	2017-08-21 15:16:25 +00:00
Sanjay Patel	7756edfa93	[LibCallSimplifier] try harder to fold memcmp with constant arguments Try to fold: memcmp(X, C, ConstantLength) == 0 --> load X == *C Without this change, we're unnecessarily checking the alignment of the constant data, so we miss the transform in the first 2 tests in the patch. I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion patches. This doesn't help the example in: https://bugs.llvm.org/show_bug.cgi?id=34032#c13 ...directly, but it's worth short-circuiting more of these simple cases since we're already trying to do that. The benefit of transforming to load+cmp is that existing IR analysis/transforms may further simplify that code. For example, if the load of the variable is common to multiple memcmp calls, CSE can remove the duplicate instructions. Differential Revision: https://reviews.llvm.org/D36922 llvm-svn: 311333	2017-08-21 13:55:49 +00:00
Chandler Carruth	bd6dc14230	Revert r311077: [LV] Using VPlan ... This causes LLVM to assert fail on PPC64 and crash / infloop in other cases. Filed http://llvm.org/PR34248 with reproducer attached. llvm-svn: 311304	2017-08-20 23:17:11 +00:00
Benjamin Kramer	2e5be849cc	[Mem2Reg] Modernize code a bit. No functionality change intended. llvm-svn: 311290	2017-08-20 14:34:44 +00:00
Benjamin Kramer	49a49fe816	Move helper classes into anonymous namespaces. No functionality change intended. llvm-svn: 311288	2017-08-20 13:03:48 +00:00
Aditya Kumar	a525fffd07	[Loop Vectorize] Added a separate metadata Added a separate metadata to indicate when the loop has already been vectorized instead of setting width and count to 1. Patch written by Divya Shanmughan and Aditya Kumar Differential Revision: https://reviews.llvm.org/D36220 llvm-svn: 311281	2017-08-20 10:32:41 +00:00
Sam Elliott	7fe0aaa140	Revert "Emit only A Single Opt Remark When Inlining" Reverting due to clang build failure llvm-svn: 311274	2017-08-20 06:55:10 +00:00
Sam Elliott	785dd75369	Emit only A Single Opt Remark When Inlining Summary: This updates the Inliner to only add a single Optimization Remark when Inlining, rather than an Analysis Remark and an Optimization Remark. Fixes https://bugs.llvm.org/show_bug.cgi?id=33786 Reviewers: anemet, davidxl, chandlerc Reviewed By: anemet Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D36054 llvm-svn: 311273	2017-08-20 06:43:34 +00:00
Teresa Johnson	73305f82e9	[ThinLTO] Fix ThinLTO crash Summary: Follow up to fix in r311023, which fixed the case where the combined index is written to disk. The same samplePGO logic exists for the in-memory index when computing imports, so we need to filter out GlobalVariable summaries there too. Reviewers: davidxl Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D36919 llvm-svn: 311254	2017-08-19 18:04:25 +00:00
Chandler Carruth	4f3aa29a46	[Inliner] Fix a nasty bug when inlining a non-recursive trace of a function into itself. We tried to fix this before in r306495 but that got reverted as the assert was actually hit. This fixes the original bug (which we seem to have lost track of with the revert) by blocking a second remapping when the function being inlined is also the caller and the remapping could succeed but erroneously. The included test case would actually load from an inlined copy of the alloca before this change, failing to load the stored value and miscompiling. Many thanks to Richard Smith for diagnosing a user miscompile to this bug, and to Kyle for the first attempt and initial analysis and David Li for remembering the issue and how to fix it and suggesting the patch. I'm just stitching it together and landing it. =] llvm-svn: 311229	2017-08-19 06:56:11 +00:00
Chandler Carruth	1f8212597d	[SLP] Fix an unused variable warning in non-asserts builds. llvm-svn: 311227	2017-08-19 05:06:23 +00:00
Dinar Temirbulatov	7aff8cfa55	[SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI. llvm-svn: 311223	2017-08-19 03:15:07 +00:00
Dinar Temirbulatov	e3ce1b455e	[SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions. Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36766 llvm-svn: 311221	2017-08-19 02:54:20 +00:00
Xinliang David Li	0d07f9d68a	Fix comment /NFC llvm-svn: 311209	2017-08-18 23:08:50 +00:00
Xinliang David Li	709ffe178e	[Profile] backward propagate profile info in JumpThreading Differential Revsion: http://reviews.llvm.org/D36864 llvm-svn: 311208	2017-08-18 23:00:05 +00:00
Max Kazantsev	0aaf8c16ac	[IRCE] Fix buggy behavior in Clamp Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations. In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned predicates, but we did not check this for `S` which can in theory be negative. This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative, and it adds a test where such situation may lead to incorrect conditions calculation. Differential Revision: https://reviews.llvm.org/D36873 llvm-svn: 311205	2017-08-18 22:50:29 +00:00
Ana Pazos	6210f27dfc	[PGO] Fixed assertion due to mismatched memcpy size type. Summary: Memcpy intrinsics have size argument of any integer type, like i32 or i64. Fixed size type along with its value when cloning the intrinsic. Reviewers: davidxl, xur Reviewed By: davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36844 llvm-svn: 311188	2017-08-18 19:17:08 +00:00
Matt Morehouse	5c7fc76983	[SanitizerCoverage] Add stack depth tracing instrumentation. Summary: Augment SanitizerCoverage to insert maximum stack depth tracing for use by libFuzzer. The new instrumentation is enabled by the flag -fsanitize-coverage=stack-depth and is compatible with the existing trace-pc-guard coverage. The user must also declare the following global variable in their code: thread_local uintptr_t __sancov_lowest_stack https://bugs.llvm.org/show_bug.cgi?id=33857 Reviewers: vitalybuka, kcc Reviewed By: vitalybuka Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D36839 llvm-svn: 311186	2017-08-18 18:43:30 +00:00
Jakub Kuderski	e608ef7635	[LoopRotate][Dominators] Use the incremental API to update DomTree Summary: This patch teaches LoopRotate to use the new incremental API to update the DominatorTree. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, davide Subscribers: hiraditya, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D35581 llvm-svn: 311125	2017-08-17 21:48:19 +00:00
Jakub Kuderski	e35a449140	[Dominators] Teach LoopUnswitch to use the incremental API Summary: This patch makes LoopUnswitch use new incremental API for updating dominators. It also updates SplitCriticalEdge, as it is called in LoopUnswitch. There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch. Reviewers: dberlin, davide, sanjoy, grosser, chandlerc Reviewed By: davide, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35528 llvm-svn: 311093	2017-08-17 16:45:35 +00:00
Simon Dardis	b5205c69d2	[dfsan] Add explicit zero extensions for shadow parameters in function wrappers. In the case where dfsan provides a custom wrapper for a function, shadow parameters are added for each parameter of the function. These parameters are i16s. For targets which do not consider this a legal type, the lack of sign extension information would cause LLVM to generate anyexts around their usage with phi variables and calling convention logic. Address this by introducing zero exts for each shadow parameter. Reviewers: pcc, slthakur Differential Revision: https://reviews.llvm.org/D33349 llvm-svn: 311087	2017-08-17 14:14:25 +00:00
Ayal Zaks	6627883369	[LV] Using VPlan to model the vectorized code and drive its transformation VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This patch introduces the VPlan model into LV and uses it to represent the vectorized code and drive the generation of vectorized IR. In this patch VPlan models the vectorized loop body: the vectorized control-flow is represented using VPlan's Hierarchical CFG, with predication refactored from being a post-vectorization-step into a vectorization planning step modeling if-then VPRegionBlocks, and generating code inline with non-predicated code. The vectorized code within each VPBasicBlock is represented as a sequence of Recipes, each responsible for modelling and generating a sequence of IR instructions. To keep the size of this commit manageable the Recipes in this patch are coarse-grained and capture large chunks of LV's code-generation logic. The constructed VPlans are dumped in dot format under -debug. This commit retains current vectorizer output, except for minor instruction reorderings; see associated modifications to lit tests. For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst and its references. Authors: Gil Rapaport and Ayal Zaks Differential Revision: https://reviews.llvm.org/D32871 llvm-svn: 311077	2017-08-17 09:29:59 +00:00
Jakub Kuderski	fd5c5c9144	Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. The patch was originally committed in r311039 and reverted in r311049. This revision fixes the problem with not adding a dependency on the DominatorTreeWrapperPass for the LegacyPassManager. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311057	2017-08-17 01:41:49 +00:00
Amjad Aboud	86111c6696	[InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount) Differential Revision: https://reviews.llvm.org/D36784 llvm-svn: 311050	2017-08-16 22:42:38 +00:00
Jakub Kuderski	cbcffb173c	Revert "[ADCE][Dominators] Teach ADCE to preserve dominators" This reverts commit r311039. The patch caused the `test/Bindings/OCaml/Output/scalar_opts.ml` to fail. llvm-svn: 311049	2017-08-16 22:10:53 +00:00
Craig Topper	882f29630b	[InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors This also uses decomposeBitTestICmp to decode the compare. Differential Revision: https://reviews.llvm.org/D36781 llvm-svn: 311044	2017-08-16 21:52:07 +00:00
Jakub Kuderski	4552e9de9f	[ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311039	2017-08-16 20:50:23 +00:00
Geoff Berry	40549ad1ac	[LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution Summary: Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving ScalarEvolution since they do not alter loop structure and should not alter any SCEV values (though LoopDataPrefetch may introduce new instructions that won't have cached SCEV values yet). This can result in slight code differences, mainly w.r.t. nsw/nuw flags on SCEVs, since these are computed somewhat lazily when a zext/sext instruction is encountered. As a result, passes after the modified passes may see SCEVs with more nsw/nuw flags present. Reviewers: sanjoy, anemet Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36716 llvm-svn: 311032	2017-08-16 19:03:16 +00:00
Hal Finkel	9e54b7093a	[BDCE] Don't check demanded bits on unsized types To clear assumptions that are potentially invalid after trivialization, we need to walk the use/def chain. Normally, the only way to reach an instruction with an unsized type is via an instruction that has side effects (or otherwise will demand its input bits). That would stop the walk. However, if we have a readnone function that returns an unsized type (e.g., void), we must avoid asking for the demanded bits of the function call's return value. A void-returning readnone function is always dead (and so we can stop walking the use/def chain here), but the check is necessary to avoid asserting. Fixes PR34211. llvm-svn: 311014	2017-08-16 16:09:22 +00:00
Dehao Chen	84d412035a	Merge debug info when hoist then-else code to if. Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached. Reviewers: aprantl, probinson, dblaikie, echristo, loladiro Reviewed By: aprantl Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D36778 llvm-svn: 310985	2017-08-16 01:55:26 +00:00
Craig Topper	0a1a276d91	[InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount We were only allowing ConstantInt before. This patch allows splat of ConstantInt too. Differential Revision: https://reviews.llvm.org/D36763 llvm-svn: 310970	2017-08-15 22:48:41 +00:00
Amjad Aboud	0464c5d958	[InstCombine] Added support for (X >>s C) << C --> X & (-1 << C) Differential Revision: https://reviews.llvm.org/D36743 llvm-svn: 310949	2017-08-15 19:33:14 +00:00
Sanjay Patel	f69b7d5c93	[InstCombine] sink sext after ashr Narrow ops are better for bit-tracking, and in the case of vectors, may enable better codegen. As the trunc test shows, this can allow follow-on simplifications. There's a block of code in visitTrunc that deals with shifted ops with FIXME comments. It may be possible to remove some of that now, but I want to make sure there are no problems with this step first. http://rise4fun.com/Alive/Y3a Name: hoist_ashr_ahead_of_sext_1 %s = sext i8 %x to i32 %r = ashr i32 %s, 3 ; shift value is < than source bit width => %a = ashr i8 %x, 3 %r = sext i8 %a to i32 Name: hoist_ashr_ahead_of_sext_2 %s = sext i8 %x to i32 %r = ashr i32 %s, 8 ; shift value is >= than source bit width => %a = ashr i8 %x, 7 ; so clamp this shift value %r = sext i8 %a to i32 Name: junc_the_trunc %a = sext i16 %v to i32 %s = ashr i32 %a, 18 %t = trunc i32 %s to i16 => %t = ashr i16 %v, 15 llvm-svn: 310942	2017-08-15 18:25:52 +00:00
Jakub Kuderski	638c085d07	[Dominators] Include infinite loops in PostDominatorTree Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG. This patch makes the following assumptions: - A sequence of updates should produce the same tree as a recalculating it. - Any sequence of the same updates should lead to the same tree. - Siblings and roots are unordered. The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call: ``` # functions: 52283 # samples: 337609 # reverse unreachable BBs: 216022 # BBs: 247840796 Percent reverse-unreachable: 0.08716159869015269 % Max(PercRevUnreachable) in a function: 87.58620689655172 % # > 25 % samples: 471 ( 0.1395104988314885 % samples ) ... in 145 ( 0.27733680163724345 % functions ) ``` Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940	2017-08-15 18:14:57 +00:00
Rui Ueyama	4a17955030	Fix -Wunused-lambda-capture for Release build. `I` and `this` are used only in assert or DEBUG, so they are unused in Release build. llvm-svn: 310934	2017-08-15 17:39:35 +00:00
Ayal Zaks	25e2800e20	[LV] Minor savings to Sink casts to unravel first order recurrence Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it is not needed. Differential Revision: https://reviews.llvm.org/D36408 llvm-svn: 310910	2017-08-15 08:32:59 +00:00
Dinar Temirbulatov	9e43d6e7b2	[SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra parameter VL0, replace E->Scalars[0] to VL0, NFCI. llvm-svn: 310904	2017-08-15 00:31:49 +00:00
Dehao Chen	45847d3612	Add missing dependency in ICP. (NFC) llvm-svn: 310896	2017-08-14 23:25:21 +00:00
Reid Kleckner	18728822d2	Remove checks for debug info intrinsics in use lists, NFC These haven't done anything since debug info intrinsics stopped appearing in Value use lists in 2014. llvm-svn: 310892	2017-08-14 22:10:54 +00:00
Craig Topper	0aa3a19512	Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889	2017-08-14 21:39:51 +00:00
Andrew Kaylor	53a5fbb45f	Add strictfp attribute to prevent unwanted optimizations of libm calls Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885	2017-08-14 21:15:13 +00:00
Craig Topper	69fa8e0d99	Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873	2017-08-14 19:09:32 +00:00
Craig Topper	2f0b450666	[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869	2017-08-14 18:49:42 +00:00
Dinar Temirbulatov	7b78f5e52d	[SLPVectorizer] Schedule bundle with different opcodes. This change let us schedule a bundle with different opcodes in it, for example : [ load, add, add, add ] Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D36518 llvm-svn: 310847	2017-08-14 15:40:16 +00:00
Sanjay Patel	a1067d9bae	[BDCE] reduce scope of an assert (PR34179) The assert was added with r310779 and is usually correct, but as the test shows, not always. The 'volatile' on the load is needed to expose the faulty path because without it, DemandedBits would return that the load is just dead rather than not demanded, and so we wouldn't hit the bogus assert. Also, since the lambda is just a single-line now, get rid of it and inline the DB.isAllOnesValue() calls. This should fix (prevent execution of a faulty assert): https://bugs.llvm.org/show_bug.cgi?id=34179 llvm-svn: 310842	2017-08-14 15:13:46 +00:00
Sam Parker	718c8a6a2a	[LoopUnroll] Enable option to peel remainder loop On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance loses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824	2017-08-14 09:25:26 +00:00

... 3 4 5 6 7 ...

18959 Commits