llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniel Berlin	d36c27bedb	NewGVN: Fix PR 34473, by not using ExactlyEqualsExpression for finding phi of ops users. llvm-svn: 314612	2017-09-30 23:51:55 +00:00
Daniel Berlin	c1305af09b	NewGVN: Evaluate phi of ops expressions before creating phi node llvm-svn: 314611	2017-09-30 23:51:54 +00:00
Daniel Berlin	9b926e90d3	NewGVN: Allow dependent PHI of ops llvm-svn: 314610	2017-09-30 23:51:53 +00:00
Daniel Berlin	de6958ee85	NewGVN: Make OpIsSafeForPhiOfOps non-recursive llvm-svn: 314609	2017-09-30 23:51:04 +00:00
Daniel Jasper	0a51ec29c9	Revert r314435: "[JumpThreading] Preserve DT and LVI across the pass" Causes a segfault on a builtbot (and in our internal bootstrapping of Clang). See Eli's response on the commit thread. llvm-svn: 314589	2017-09-30 11:57:19 +00:00
Alex Shlyapnikov	e76aa3b0b2	Revert "Use the basic cost if a GEP is not used as addressing mode" This reverts commit r314517. This commit crashes sanitizer bots, for example: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167 Stack snippet: ... /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0 llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const, llvm::ArrayRef<llvm::Value const>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0 llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const, llvm::ArrayRef<llvm::Value const>) /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0 ... llvm-svn: 314560	2017-09-29 22:04:45 +00:00
Jun Bum Lim	0e16a59e83	Use the basic cost if a GEP is not used as addressing mode Summary: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if an user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng Reviewed By: hfinkel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38085 llvm-svn: 314517	2017-09-29 14:50:16 +00:00
Evandro Menezes	3701df55c6	[JumpThreading] Preserve DT and LVI across the pass JumpThreading now preserves dominance and lazy value information across the entire pass. The pass manager is also informed of this preservation with the goal of DT and LVI being recalculated fewer times overall during compilation. This change prepares JumpThreading for enhanced opportunities; particularly those across loop boundaries. Patch by: Brian Rzycki <b.rzycki@samsung.com>, Sebastian Pop <s.pop@samsung.com> Differential revision: https://reviews.llvm.org/D37528 llvm-svn: 314435	2017-09-28 17:24:40 +00:00
Benjamin Kramer	c965b30e54	[LoopUnroll] Fix use after poison. llvm-svn: 314418	2017-09-28 14:47:39 +00:00
Sanjoy Das	def1729dc4	Use a BumpPtrAllocator for Loop objects Summary: And now that we no longer have to explicitly free() the Loop instances, we can (with more ease) use the destructor of LoopBase to do what LoopBase::clear() was doing. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38201 llvm-svn: 314375	2017-09-28 02:45:42 +00:00
Rui Ueyama	0dbb0f107e	Fix -Wunused-variable for Release build. llvm-svn: 314353	2017-09-27 22:03:15 +00:00
Sanjoy Das	4f3ebd537c	Return the LoopUnrollResult from tryToUnrollLoop; NFC I will use this in a later change. llvm-svn: 314352	2017-09-27 21:45:22 +00:00
Sanjoy Das	8e8c1bc490	LoopDeletion: use return value instead of passing in LPMUpdater; NFC I will use this refactoring in a later patch. llvm-svn: 314351	2017-09-27 21:45:21 +00:00
Sanjoy Das	3567d3d2ec	Rename LoopUnrollStatus to LoopUnrollResult; NFC A "Result" suffix is more appropriate here llvm-svn: 314350	2017-09-27 21:45:19 +00:00
Sanjay Patel	0f9b4773c1	[SimplifyCFG] add a struct to house optional folds (PR34603) This was intended to be no-functional-change, but it's not - there's a test diff. So I thought I should stop here and post it as-is to see if this looks like what was expected based on the discussion in PR34603: https://bugs.llvm.org/show_bug.cgi?id=34603 Notes: 1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'. The parameter isn't passed down, so we pick up the default value from the function signature after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the 'SimplifyCFG' calls. 2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops. This would theoretically allow us to differentiate the transforms controlled by those params independently. 3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too. I just stopped here to minimize the diffs. 4. Similarly, I stopped short of messing with the pass manager layer. I have another question that could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG set to true no matter where in the pipeline it's creating SimplifyCFG passes? // Create an early function pass manager to cleanup the output of the // frontend. EarlyFPM.addPass(SimplifyCFGPass()); --> /// \brief Construct a pass with the default thresholds /// and switch optimizations. SimplifyCFGPass::SimplifyCFGPass() : BonusInstThreshold(UserBonusInstThreshold), LateSimplifyCFG(true) {} <-- switches get converted to lookup tables and loops may not be in canonical form If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG' setting via recursion was masking this bug. Differential Revision: https://reviews.llvm.org/D38138 llvm-svn: 314308	2017-09-27 14:54:16 +00:00
Sanjay Patel	1d04b5bacf	[DSE] Merge stores when the later store only writes to memory locations the early store also wrote to (2nd try) This is a 2nd attempt at: https://reviews.llvm.org/rL310055 ...which was reverted at rL310123 because of PR34074: https://bugs.llvm.org/show_bug.cgi?id=34074 In this version, we break out of the inner loop after we successfully merge and kill a pair of stores. In the earlier rev, we were continuing instead, which meant we could process the invalid info from a now dead store. Original commit message (authored by Filipe Cabecinhas): This fixes PR31777. If both stores' values are ConstantInt, we merge the two stores (shifting the smaller store appropriately) and replace the earlier (and larger) store with an updated constant. In the future we should also support vectors of integers. And maybe float/double if we can. Differential Revision: https://reviews.llvm.org/D30703 llvm-svn: 314206	2017-09-26 13:54:28 +00:00
Artur Pilipenko	889dc1e3a5	Rework loop predication pass We've found a serious issue with the current implementation of loop predication. The current implementation relies on SCEV and this turned out to be problematic. To fix the problem we had to rework the pass substantially. We have had the reworked implementation in our downstream tree for a while. This is the initial patch of the series of changes to upstream the new implementation. For now the transformation is limited to the following case: * The loop has a single latch with either ult or slt icmp condition. * The step of the IV used in the latch condition is 1. * The IV of the latch condition is the same as the post increment IV of the guard condition. * The guard condition is ult. See the review or the LoopPredication.cpp header for the details about the problem and the new implementation. Reviewed By: sanjoy, mkazantsev Differential Revision: https://reviews.llvm.org/D37569 llvm-svn: 313981	2017-09-22 13:13:57 +00:00
Sanjoy Das	388b012f4e	Rename markAsErased to erase, as pointed out in a previous review; NFC llvm-svn: 313951	2017-09-22 01:47:41 +00:00
Reid Kleckner	0fe506bc5e	Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare" The fix is to avoid invalidating our insertion point in replaceDbgDeclare: Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore); + if (DII == InsertBefore) + InsertBefore = &std::next(InsertBefore->getIterator()); DII->eraseFromParent(); I had to write a unit tests for this instead of a lit test because the use list order matters in order to trigger the bug. The reduced C test case for this was: void useit(int); static inline void inlineme() { int x[2]; useit(x); } void f() { inlineme(); inlineme(); } llvm-svn: 313905	2017-09-21 19:52:03 +00:00
Daniel Jasper	7d2f38d600	Revert r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare" .. as well as the two subsequent changes r313826 and r313875. This leads to segfaults in combination with ASAN. Will forward repro instructions to the original author (rnk). llvm-svn: 313876	2017-09-21 12:07:33 +00:00
Mikael Holmen	582e141007	[SROA] Really remove associated dbg.declare when removing dead alloca Summary: There already was code that tried to remove the dbg.declare, but that code was placed after we had called I->replaceAllUsesWith(UndefValue::get(I->getType())); on the alloca, so when we searched for the relevant dbg.declare, we couldn't find it. Now we do the search before we call RAUW so there is a chance to find it. An existing testcase needed update due to this. Two dbg.declare with undef were removed and then suddenly one of the two CHECKS failed. Before this patch we got call void @llvm.dbg.declare(metadata i24* undef, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 call void @llvm.dbg.declare(metadata %struct.prog_src_register* undef, metadata !14, metadata !DIExpression()), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 and with it we get call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !15 call void @llvm.dbg.value(metadata i32 0, metadata !14, metadata !DIExpression(DW_OP_LLVM_fragment, 32, 24)), !dbg !15 However, the CHECKs in the testcase checked things in a silly order, so they only passed since they found things in the first dbg.declare. Now we changed the order of the checks and the test passes. Reviewers: rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37900 llvm-svn: 313875	2017-09-21 11:14:27 +00:00
Serguei Katkov	675e304ef8	Revert "Re-enable "[IRCE] Identify loops with latch comparison against current IV value"" Revert the patch causing the functional failures. The patch owner is notified with test cases which fail. Test case has been provided to Maxim offline. llvm-svn: 313857	2017-09-21 04:50:41 +00:00
Reid Kleckner	3f547e87b2	[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare Summary: This implements the design discussed on llvm-dev for better tracking of variables that live in memory through optimizations: http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html This is tracked as PR34136 llvm.dbg.addr is intended to be produced and used in almost precisely the same way as llvm.dbg.declare is today, with the exception that it is control-dependent. That means that dbg.addr should always have a position in the instruction stream, and it will allow passes that optimize memory operations on local variables to insert llvm.dbg.value calls to reflect deleted stores. See SourceLevelDebugging.rst for more details. The main drawback to generating DBG_VALUE machine instrs is that they usually cause LLVM to emit a location list for DW_AT_location. The next step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE ranges as not needing a location list, and possibly start setting DW_AT_start_offset for variables whose lifetimes begin mid-scope. Reviewers: aprantl, dblaikie, probinson Subscribers: eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37768 llvm-svn: 313825	2017-09-20 21:52:33 +00:00
Sanjoy Das	09613b122e	Tighten the invariants around LoopBase::invalidate Summary: With this change: - Methods in LoopBase trip an assert if the receiver has been invalidated - LoopBase::clear frees up the memory held the LoopBase instance This change also shuffles things around as necessary to work with this stricter invariant. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38055 llvm-svn: 313708	2017-09-20 02:31:57 +00:00
Daniel Berlin	064cb68d18	GVNSink: Make ModelledPHIs constructor linear (and avoid edge case it worries about) by avoiding getIncomingValueForBlock llvm-svn: 313702	2017-09-20 00:07:27 +00:00
Daniel Berlin	dd323297d0	Revert "[GVNSink] Remove dependency on SmallPtrSet iteration order." This reverts commit r312156, because now the op and block arrays are not in the same order :(. llvm-svn: 313701	2017-09-20 00:07:25 +00:00
Daniel Berlin	9632dd7376	NewGVN: Remove unused includes llvm-svn: 313700	2017-09-20 00:07:12 +00:00
Sanjoy Das	76ab23234c	[LoopInfo] Make LoopBase and Loop destructors non-public Summary: See comment for why I think this is a good idea. This change also: - Removes an SCEV test case. The SCEV test was not testing anything useful (most of it was `#if 0` ed out) and it would need to be updated to deal with a private ~Loop::Loop. - Updates the loop pass manager test case to deal with a private ~Loop::Loop. - Renames markAsRemoved to markAsErased to contrast with removeLoop, via the usual remove vs. erase idiom we already have for instructions and basic blocks. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37996 llvm-svn: 313695	2017-09-19 23:19:00 +00:00
Vivek Pandya	b5ab895e2a	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313390	2017-09-15 20:10:09 +00:00
Vivek Pandya	df8598dcc4	This reverts r313381 llvm-svn: 313387	2017-09-15 19:53:54 +00:00
Vivek Pandya	00d887447b	This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352 It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with command line options. The diagnostic handler used to be callback now this patch adds a class DiagnosticHandler. It has virtual method to provide custom diagnostic handler and methods to control which particular remarks are enabled. However LLVM-C API users can still provide callback function for diagnostic handler. llvm-svn: 313382	2017-09-15 19:30:59 +00:00
Alina Sbirlea	7ed5856a32	Refactor collectChildrenInLoop to LoopUtils [NFC] Summary: Move to LoopUtils method that collects all children of a node inside a loop. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37870 llvm-svn: 313322	2017-09-15 00:04:16 +00:00
Eugene Zelenko	8002c504cd	[Transforms] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 313198	2017-09-13 21:43:53 +00:00
Aditya Kumar	dfa8741c96	[GVNHoist] Factor out reachability to search for anticipable instructions quickly Factor out the reachability such that multiple queries to find reachability of values are fast. This is based on finding the ANTIC points in the CFG which do not change during hoisting. The ANTIC points are basically the dominance-frontiers in the inverse graph. So we introduce a data structure (CHI nodes) to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the function as they are the potential hoistable candidates. This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a trivial way. Relevant test cases are added to show the functionality as well as regression fixes from PR32821. Regression from previous GVNHoist: We do not hoist fully redundant expressions because fully redundant expressions are already handled by NewGVN Differential Revision: https://reviews.llvm.org/D35918 Reviewers: dberlin, sebpop, gberry, llvm-svn: 313116	2017-09-13 05:28:03 +00:00
Alina Sbirlea	80b806bf30	Make promoteLoopAccessesToScalars independent of AliasSet [NFC] Summary: The current promoteLoopAccessesToScalars method receives an AliasSet, but the information used is in fact a list of Value, known to must alias. Create the list ahead of time to make this method independent of the AliasSet class. While there is no functionality change, this adds overhead for creating a set of Value, when promotion would normally exit earlier. This is meant to be as a first refactoring step in order to start replacing AliasSetTracker with MemorySSA. And while the end goal is to redesign LICM, the first few steps will focus on adding MemorySSA as an alternative to the AliasSetTracker using most of the existing functionality. Reviewers: mkuper, danielcdh, dberlin Subscribers: sanjoy, chandlerc, gberry, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D35439 llvm-svn: 313075	2017-09-12 21:18:44 +00:00
Sanjay Patel	6fd4391ddd	[DivRempairs] add a pass to optimize div/rem pairs (PR31028) This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 llvm-svn: 312862	2017-09-09 13:38:18 +00:00
Max Kazantsev	d7b0f74c64	Re-enable "[IRCE] Identify loops with latch comparison against current IV value" Re-applying after the found bug was fixed. Differential Revision: https://reviews.llvm.org/D36215 llvm-svn: 312783	2017-09-08 10:15:05 +00:00
Max Kazantsev	57db44838d	diff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp index f72a808..9fa49fd 100644 --- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp +++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp @@ -450,20 +450,10 @@ struct LoopStructure { // equivalent to: // // intN_ty inc = IndVarIncreasing ? 1 : -1; - // pred_ty predicate = IndVarIncreasing - // ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT - // : IsSignedPredicate ? ICMP_SGT : ICMP_UGT; + // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT; // - // - // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt); - // iv = IndVarNext) + // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase) // ... body ... - // - // Here IndVarBase is either current or next value of the induction variable. - // in the former case, IsIndVarNext = false and IndVarBase points to the - // Phi node of the induction variable. Otherwise, IsIndVarNext = true and - // IndVarBase points to IV increment instruction. - // Value IndVarBase; Value IndVarStart; @@ -471,13 +461,12 @@ struct LoopStructure { Value LoopExitAt; bool IndVarIncreasing; bool IsSignedPredicate; - bool IsIndVarNext; LoopStructure() : Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr), LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr), IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr), - IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {} + IndVarIncreasing(false), IsSignedPredicate(true) {} template <typename M> LoopStructure map(M Map) const { LoopStructure Result; @@ -493,7 +482,6 @@ struct LoopStructure { Result.LoopExitAt = Map(LoopExitAt); Result.IndVarIncreasing = IndVarIncreasing; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; return Result; } @@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, return false; }; - // `ICI` can either be a comparison against IV or a comparison of IV.next. - // Depending on the interpretation, we calculate the start value differently. + // `ICI` is interpreted as taking the backedge if the next* value of the + // induction variable satisfies some constraint. - // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch - // comparisons happens against the IV before or after its value is - // incremented. Two valid combinations for them are: - // - // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false }, - // 2) { iv.next; true }. - // - // The latch comparison happens against IndVarBase which can be either current - // or next value of the induction variable. const SCEVAddRecExpr IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV); bool IsIncreasing = false; bool IsSignedPredicate = true; - bool IsIndVarNext = false; ConstantInt StepCI; if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) { FailureReason = "LHS in icmp not induction variable"; return None; } - const SCEV IndVarStart = nullptr; - // TODO: Currently we only handle comparison against IV, but we can extend - // this analysis to be able to deal with comparison against sext(iv) and such. - if (isa<PHINode>(LeftValue) && - cast<PHINode>(LeftValue)->getParent() == Header) - // The comparison is made against current IV value. - IndVarStart = IndVarBase->getStart(); - else { - // Assume that the comparison is made against next IV value. - const SCEV StartNext = IndVarBase->getStart(); - const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); - IndVarStart = SE.getAddExpr(StartNext, Addend); - IsIndVarNext = true; - } + const SCEV StartNext = IndVarBase->getStart(); + const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); + const SCEV IndVarStart = SE.getAddExpr(StartNext, Addend); const SCEV Step = SE.getSCEV(StepCI); ConstantInt One = ConstantInt::get(IndVarTy, 1); @@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, Result.IndVarIncreasing = IsIncreasing; Result.LoopExitAt = RightValue; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; FailureReason = nullptr; @@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd( BranchToContinuation); NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader); - auto FixupValue = - LS.IsIndVarNext ? PN->getIncomingValueForBlock(LS.Latch) : PN; - NewPHI->addIncoming(FixupValue, RRI.ExitSelector); + NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch), + RRI.ExitSelector); RRI.PHIValuesAtPseudoExit.push_back(NewPHI); } @@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop L, LPPassManager &LPM) { } LoopStructure LS = MaybeLoopStructure.getValue(); const SCEVAddRecExpr IndVar = - cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase)); - if (LS.IsIndVarNext) - IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar, - SE.getSCEV(LS.IndVarStep))); + cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep))); Optional<InductiveRangeCheck::Range> SafeIterRange; Instruction ExprInsertPt = Preheader->getTerminator(); diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll deleted file mode 100644 index afea0e6..0000000 --- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll +++ /dev/null @@ -1,182 +0,0 @@ -; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 \| FileCheck %s - -; Check that IRCE is able to deal with loops where the latch comparison is -; done against current value of the IV, not the IV.next. - -; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> - -; SLT condition for increasing loop from 0 to 100. -define void @test_01(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_01 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp slt i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp slt i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; ULT condition for increasing loop from 0 to 100. -define void @test_02(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_02 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp ult i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp ult i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_01, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_03(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_03 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp slt i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_02, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_04(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_04 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp ult i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -!0 = !{i32 0, i32 50} -!1 = !{i64 0, i64 50} llvm-svn: 312775	2017-09-08 04:26:41 +00:00
Reid Kleckner	0e8c4bb055	Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic. llvm-svn: 312759	2017-09-07 23:27:44 +00:00
Krzysztof Parzyszek	1dc313727e	Disable jump threading into loop headers Consider this type of a loop: for (...) { ... if (...) continue; ... } Normally, the "continue" would branch to the loop control code that checks whether the loop should continue iterating and which contains the (often) unique loop latch branch. In certain cases jump threading can "thread" the inner branch directly to the loop header, creating a second loop latch. Loop canonicalization would then transform this loop into a loop nest. The problem with this is that in such a loop nest neither loop is countable even if the original loop was. This may inhibit subsequent loop optimizations and be detrimental to performance. Differential Revision: https://reviews.llvm.org/D36404 llvm-svn: 312664	2017-09-06 19:36:58 +00:00
Davide Italiano	32504cf661	[GVNHoist] Move duplicated code to a helper function. NFCI. llvm-svn: 312575	2017-09-05 20:49:41 +00:00
Daniel Berlin	f9c9455d3f	NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect self-cycles of phi nodes. We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant. llvm-svn: 312510	2017-09-05 02:17:43 +00:00
Daniel Berlin	54a92fcc5d	NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification llvm-svn: 312509	2017-09-05 02:17:42 +00:00
Daniel Berlin	1a58258232	NewGVN: Detect copies through predicateinfo llvm-svn: 312508	2017-09-05 02:17:41 +00:00
Daniel Berlin	4ad7e8d263	NewGVN: Change where check for original instruction in phi of ops leader finding is done. Where we had it before, we would stop looking when we hit the original instruction, but skip it. Now we skip it and keep looking. llvm-svn: 312507	2017-09-05 02:17:40 +00:00
Daniel Berlin	94090dd13b	Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders. Summary: After a discussion with Rekka, i believe this (or a small variant) should fix the remaining phi-of-ops problems. Rekka's algorithm for completeness relies on looking up expressions that should have no leader, and expecting it to fail (IE looking up expressions that can't exist in a predecessor, and expecting it to find nothing). Unfortunately, sometimes these expressions can be simplified to constants, but we need the lookup to fail anyway. Additionally, our simplifier outsmarts this by taking these "not quite right" expressions, and simplifying them into other expressions or walking through phis, etc. In the past, we've sometimes been able to find leaders for these expressions, incorrectly. This change causes us to not to try to phi of ops such expressions. We determine safety by seeing if they depend on a phi node in our block. This is not perfect, we can do a bit better, but this should be a "correctness start" that we can then improve. It also requires a bunch of caching that i'll eventually like to eliminate. The right solution, longer term, to the simplifier issues, is to make the query interface for the instruction simplifier/constant folder have the flags we need, so that we can keep most things going, but turn off the possibly-invalid parts (threading through phis, etc). This is an issue in another wrong code bug as well. Reviewers: davide, mcrosier Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37175 llvm-svn: 312401	2017-09-02 02:18:44 +00:00
Eugene Zelenko	75075efe5e	[Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 312383	2017-09-01 21:37:29 +00:00
Daniel Berlin	86932104db	NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context. Reviewers: davide, mcrosier Subscribers: sanjoy, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D37174 llvm-svn: 312352	2017-09-01 19:20:18 +00:00
Clement Courbet	bc0c4459c9	[MergeICmps] Fix build of rL312315 on clang-with-thin-lto-windows: MergeICmps.cpp(68,15): error: chosen constructor is explicit in copy-initialization return {}; APInt.h(339,12): note: explicit constructor declared here explicit APInt() : BitWidth(1) { U.VAL = 0; } ^ MergeICmps.cpp(56,9): note: in implicit initialization of field 'Offset' with omitted initializer APInt Offset; ^ llvm-svn: 312326	2017-09-01 11:51:23 +00:00
Clement Courbet	65130e2d8d	Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer Add missing header. This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf. llvm-svn: 312322	2017-09-01 10:56:34 +00:00
Clement Courbet	316212575b	Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer" Break build This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b. llvm-svn: 312317	2017-09-01 09:43:08 +00:00
Clement Courbet	9473c01e96	[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer comparisons into memcmp. Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==(). For now this is disabled by default until: - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete - Benchmarks show that this is always useful. Differential Revision: https://reviews.llvm.org/D33987 llvm-svn: 312315	2017-09-01 09:07:05 +00:00
Eugene Zelenko	fa6434bebb	[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC). llvm-svn: 312289	2017-08-31 21:56:16 +00:00
Max Kazantsev	0a9c1ef2eb	[IRCE] Identify loops with latch comparison against current IV value Current implementation of parseLoopStructure interprets the latch comparison as a comarison against `iv.next`. If the actual comparison is made against the `iv` current value then the loop may be rejected, because this misinterpretation leads to incorrect evaluation of the latch start value. This patch teaches the IRCE to distinguish this kind of loops and perform the optimization for them. Now we use `IndVarBase` variable which can be either next or current value of the induction variable (previously we used `IndVarNext` which was always the value on next iteration). Differential Revision: https://reviews.llvm.org/D36215 llvm-svn: 312221	2017-08-31 07:04:20 +00:00
Max Kazantsev	a22742be5a	[IRCE][NFC] Rename IndVarNext to IndVarBase Renaming as a preparation step to generalizing IRCE for comparison not only against the next value of an indvar, but also against the current. Differential Revision: https://reviews.llvm.org/D36509 llvm-svn: 312215	2017-08-31 05:58:15 +00:00
Adrian Prantl	b192b545c1	Refactor DIBuilder::createFragmentExpression into a static DIExpression member NFC llvm-svn: 312165	2017-08-30 20:04:17 +00:00
Daniel Berlin	23fec57e6f	NewGVN: Make sure we add the correct user if we swapped the comparison operands llvm-svn: 312162	2017-08-30 19:53:23 +00:00
Daniel Berlin	7ef26daba8	NewGVN: Allow simplification into variables llvm-svn: 312161	2017-08-30 19:52:39 +00:00
Benjamin Kramer	b99d7c9214	[GVNSink] Remove dependency on SmallPtrSet iteration order. Found by LLVM_ENABLE_REVERSE_ITERATION. llvm-svn: 312156	2017-08-30 18:46:37 +00:00
Wei Mi	ebb9327759	[LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition fails to find a loop invariant condition. This is wrong and it will disable loop unswitch for select. The patch fixes the bug. Differential Revision: https://reviews.llvm.org/D36985 llvm-svn: 312045	2017-08-29 21:45:11 +00:00
Max Kazantsev	bb1d010872	[LSR] Fix Shadow IV in case of integer overflow When LSR processes code like int accumulator = 0; for (int i = 0; i < N; i++) { accummulator += i; use((double) accummulator); } It may decide to replace integer `accumulator` with a double Shadow IV to get rid of casts. The problem with that is that the `accumulator`'s value may overflow. Starting from this moment, the behavior of integer and double accumulators will differ. This patch strenghtens up the conditions of Shadow IV mechanism applicability. We only allow it for IVs that are proved to be `AddRec`s with `nsw`/`nuw` flag. Differential Revision: https://reviews.llvm.org/D37209 llvm-svn: 311986	2017-08-29 07:32:20 +00:00
Davide Italiano	9a09ae448d	[LoopUnroll] Add a cl::opt to force peeling, for testing purposes. Will be used to test the patch proposed in D37153. llvm-svn: 311915	2017-08-28 19:50:55 +00:00
NAKAMURA Takumi	a1e97a77f5	Untabify. llvm-svn: 311875	2017-08-28 06:47:47 +00:00
Davide Italiano	9bdccb37d5	[NewGVN] Use `auto` when the type is obvious NFCI. llvm-svn: 311838	2017-08-26 22:31:10 +00:00
Daniel Berlin	de269f4620	NewGVN: Fix PR33204 - We need to add memory users when we bypass memorydefs for loads, not just when we do it for stores. llvm-svn: 311829	2017-08-26 07:37:11 +00:00
Florian Hahn	cd78345398	[LoopInterchange] Skip zext instructions when looking for induction var. Summary: SimplifyIndVar may introduce zext instructions to widen arguments of the loop exit check. They should not prevent us from splitting the loop at the induction variable, but maybe the check should be more conservative, e.g. making sure it only extends arguments used by a comparison? Reviewers: karthikthecool, mcrosier, mzolotukhin Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34879 llvm-svn: 311783	2017-08-25 16:52:29 +00:00
Xinliang David Li	66531dd10a	[Profile] backward propagate profile info in JumpThreading Take-2 after fixing bugs in the original patch. Differential Revsion: http://reviews.llvm.org/D36864 llvm-svn: 311727	2017-08-24 22:54:01 +00:00
Mikael Holmen	7a99e33b8e	[Reassociate] Do not drop debug location if replacement is missing Summary: When reassociating an expression, do not drop the instruction's original debug location in case the replacement location is missing. The debug location must at least not be dropped for inlinable callsites of debug-info-bearing functions in debug-info-bearing functions. Failing to do so would result in an "inlinable function " "call in a function with debug info must have a !dbg location" error in the verifier. As preserving the original debug location is not expected to result in overly jumpy debug line information, it is preserved for all other cases too. This fixes PR34231: https://bugs.llvm.org/show_bug.cgi?id=34231 Original patch by David Stenberg Reviewers: davide, craig.topper, mcrosier, dblaikie, aprantl Reviewed By: davide, aprantl Subscribers: aprantl Differential Revision: https://reviews.llvm.org/D36865 llvm-svn: 311642	2017-08-24 09:05:00 +00:00
Daniel Berlin	f948603a15	NewGVN: We weren't properly simplifying selects with equal arguments due to a thinko. llvm-svn: 311626	2017-08-24 02:43:17 +00:00
Hans Wennborg	66f6fc0a49	LowerAtomic: Don't skip optnone functions; atomic still need lowering (PR34020) The lowering isn't really an optimization, so optnone shouldn't make a difference. ARM relies on the pass running when using "-mthread-model single", because in that mode, it doesn't run AtomicExpand. See bug for more details. Differential Revision: https://reviews.llvm.org/D37040 llvm-svn: 311565	2017-08-23 15:43:28 +00:00
Chad Rosier	8db41e9dbd	[Reassociate] Don't canonicalize x + (-Constant * y) -> x - (Constant * y).. ..if the resulting subtract will be broken up later. This can cause us to get into an infinite loop. x + (-5.0 * y) -> x - (5.0 * y) ; Canonicalize neg const x - (5.0 * y) -> x + (0 - (5.0 * y)) ; Break up subtract x + (0 - (5.0 * y)) -> x + (-5.0 * y) ; Replace 0-X with X*-1. PR34078 llvm-svn: 311554	2017-08-23 14:10:06 +00:00
Jakub Kuderski	2724d45325	[ADCE][Dominators] Reapply: Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. This is reapplies the original patch r311057 that was reverted in r311381. The previous version wasn't using the batch update api for updating dominators, which in vary rare cases caused assertion failures. This also fixes PR34258. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311467	2017-08-22 16:30:21 +00:00
Sanjoy Das	08a38fe71e	Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators" Summary: This partially reverts commit r311057 since it breaks ADCE. See PR34258. Reviewers: kuhar Subscribers: mcrosier, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D36979 llvm-svn: 311381	2017-08-21 20:39:18 +00:00
Xinliang David Li	d2838fc4b9	Revert 311208, 311209 llvm-svn: 311341	2017-08-21 16:00:38 +00:00
Benjamin Kramer	49a49fe816	Move helper classes into anonymous namespaces. No functionality change intended. llvm-svn: 311288	2017-08-20 13:03:48 +00:00
Xinliang David Li	0d07f9d68a	Fix comment /NFC llvm-svn: 311209	2017-08-18 23:08:50 +00:00
Xinliang David Li	709ffe178e	[Profile] backward propagate profile info in JumpThreading Differential Revsion: http://reviews.llvm.org/D36864 llvm-svn: 311208	2017-08-18 23:00:05 +00:00
Max Kazantsev	0aaf8c16ac	[IRCE] Fix buggy behavior in Clamp Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations. In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned predicates, but we did not check this for `S` which can in theory be negative. This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative, and it adds a test where such situation may lead to incorrect conditions calculation. Differential Revision: https://reviews.llvm.org/D36873 llvm-svn: 311205	2017-08-18 22:50:29 +00:00
Jakub Kuderski	e608ef7635	[LoopRotate][Dominators] Use the incremental API to update DomTree Summary: This patch teaches LoopRotate to use the new incremental API to update the DominatorTree. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, davide Subscribers: hiraditya, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D35581 llvm-svn: 311125	2017-08-17 21:48:19 +00:00
Jakub Kuderski	e35a449140	[Dominators] Teach LoopUnswitch to use the incremental API Summary: This patch makes LoopUnswitch use new incremental API for updating dominators. It also updates SplitCriticalEdge, as it is called in LoopUnswitch. There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch. Reviewers: dberlin, davide, sanjoy, grosser, chandlerc Reviewed By: davide, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35528 llvm-svn: 311093	2017-08-17 16:45:35 +00:00
Jakub Kuderski	fd5c5c9144	Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. The patch was originally committed in r311039 and reverted in r311049. This revision fixes the problem with not adding a dependency on the DominatorTreeWrapperPass for the LegacyPassManager. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311057	2017-08-17 01:41:49 +00:00
Jakub Kuderski	cbcffb173c	Revert "[ADCE][Dominators] Teach ADCE to preserve dominators" This reverts commit r311039. The patch caused the `test/Bindings/OCaml/Output/scalar_opts.ml` to fail. llvm-svn: 311049	2017-08-16 22:10:53 +00:00
Jakub Kuderski	4552e9de9f	[ADCE][Dominators] Teach ADCE to preserve dominators Summary: This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees. I didn't notice any performance impact when bootstrapping clang with this patch. Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki Reviewed By: davide Subscribers: grandinj, zhendongsu, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D35869 llvm-svn: 311039	2017-08-16 20:50:23 +00:00
Geoff Berry	40549ad1ac	[LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution Summary: Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving ScalarEvolution since they do not alter loop structure and should not alter any SCEV values (though LoopDataPrefetch may introduce new instructions that won't have cached SCEV values yet). This can result in slight code differences, mainly w.r.t. nsw/nuw flags on SCEVs, since these are computed somewhat lazily when a zext/sext instruction is encountered. As a result, passes after the modified passes may see SCEVs with more nsw/nuw flags present. Reviewers: sanjoy, anemet Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36716 llvm-svn: 311032	2017-08-16 19:03:16 +00:00
Hal Finkel	9e54b7093a	[BDCE] Don't check demanded bits on unsized types To clear assumptions that are potentially invalid after trivialization, we need to walk the use/def chain. Normally, the only way to reach an instruction with an unsized type is via an instruction that has side effects (or otherwise will demand its input bits). That would stop the walk. However, if we have a readnone function that returns an unsized type (e.g., void), we must avoid asking for the demanded bits of the function call's return value. A void-returning readnone function is always dead (and so we can stop walking the use/def chain here), but the check is necessary to avoid asserting. Fixes PR34211. llvm-svn: 311014	2017-08-16 16:09:22 +00:00
Jakub Kuderski	638c085d07	[Dominators] Include infinite loops in PostDominatorTree Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG. This patch makes the following assumptions: - A sequence of updates should produce the same tree as a recalculating it. - Any sequence of the same updates should lead to the same tree. - Siblings and roots are unordered. The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call: ``` # functions: 52283 # samples: 337609 # reverse unreachable BBs: 216022 # BBs: 247840796 Percent reverse-unreachable: 0.08716159869015269 % Max(PercRevUnreachable) in a function: 87.58620689655172 % # > 25 % samples: 471 ( 0.1395104988314885 % samples ) ... in 145 ( 0.27733680163724345 % functions ) ``` Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940	2017-08-15 18:14:57 +00:00
Sanjay Patel	a1067d9bae	[BDCE] reduce scope of an assert (PR34179) The assert was added with r310779 and is usually correct, but as the test shows, not always. The 'volatile' on the load is needed to expose the faulty path because without it, DemandedBits would return that the load is just dead rather than not demanded, and so we wouldn't hit the bogus assert. Also, since the lambda is just a single-line now, get rid of it and inline the DB.isAllOnesValue() calls. This should fix (prevent execution of a faulty assert): https://bugs.llvm.org/show_bug.cgi?id=34179 llvm-svn: 310842	2017-08-14 15:13:46 +00:00
Sam Parker	718c8a6a2a	[LoopUnroll] Enable option to peel remainder loop On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance loses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824	2017-08-14 09:25:26 +00:00
Sanjay Patel	fe346f9f5b	[BDCE] clear poison generators after turning a value into zero (PR33695, PR34037) nsw, nuw, and exact carry implicit assumptions about their operands, so we need to clear those after trivializing a value. We decided there was no danger for llvm.assume or metadata, so there's just a comment about that. This fixes miscompiles as shown in: https://bugs.llvm.org/show_bug.cgi?id=33695 https://bugs.llvm.org/show_bug.cgi?id=34037 Differential Revision: https://reviews.llvm.org/D36592 llvm-svn: 310779	2017-08-12 16:41:08 +00:00
Craig Topper	9cd976d041	[DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use. This make it consistent with STATISTIC which it will often appears near. While there move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable so the namespace is unnecessary. llvm-svn: 310637	2017-08-10 17:48:11 +00:00
Chad Rosier	a5508e3119	[NewGVN] Add CL option to control the generation of phi-of-ops (disable by default). Differential Revision: https://reviews.llvm.org/D36478539 llvm-svn: 310594	2017-08-10 14:12:57 +00:00
Jonas Paulsson	6228aeda65	[LSR / TTI / SystemZ] Eliminate TargetTransformInfo::isFoldableMemAccess() isLegalAddressingMode() has recently gained the extra optional Instruction* parameter, and therefore it can now do the job that previously only isFoldableMemAccess() could do. The SystemZ implementation of isLegalAddressingMode() has gained the functionality of checking for offsets, which used to be done with isFoldableMemAccess(). The isFoldableMemAccess() hook has been removed everywhere. Review: Quentin Colombet, Ulrich Weigand https://reviews.llvm.org/D35933 llvm-svn: 310463	2017-08-09 11:28:01 +00:00
Jonas Paulsson	5052771af3	[LoopStrengthReduce] Don't neglect the Fixup.Offset in isAMCompletelyFolded(). In the recursive call to isAMCompletelyFolded(), the passed offset should be the sum of F.BaseOffset and Fixup.Offset. Review: Quentin Colombet. llvm-svn: 310462	2017-08-09 11:27:46 +00:00
Wei Mi	bb9106ac4b	[GVN] Remove stale entries in phitranslate cache when new phi is generated for PRE When a new phi is generated for scalarpre of an expression, the phiTranslate cache will become stale: Before PRE, the candidate expression must not be available in a predecessor block, and phitranslate will cache the information. After PRE, the expression will become available in all predecessor blocks, so the related entries in phiTranslate cache becomes stale. The patch will simply remove the stale entries so phiTranslate can be recomputed next time. The stale entries in phitranslate cache will not affect correctness but will cause missing PRE opportunity for later instructions. Differential Revision: https://reviews.llvm.org/D36124 llvm-svn: 310421	2017-08-08 21:40:14 +00:00
Chad Rosier	4d852597f8	[NewGVN] Use a cast instead of a dyn_cast. Differential Revision: https://reviews.llvm.org/D36478 llvm-svn: 310397	2017-08-08 18:41:49 +00:00
Chandler Carruth	7c888dca46	[PM] Fix new LoopUnroll function pass by invalidating loop analysis results when a loop is completely removed. This is very hard to manifest as a visible bug. You need to arrange for there to be a subsequent allocation of a 'Loop' object which gets the exact same address as the one which the unroll deleted, and you need the LoopAccessAnalysis results to be significant in the way that they're stale. And you need a million other things to align. But when it does, you get a deeply mysterious crash due to actually finding a stale analysis result. This fixes the issue and tests for it by directly checking we successfully invalidate things. I have not been able to get any test case to reliably trigger this. Changes to LLVM itself caused the only test case I ever had to cease to crash. I've looked pretty extensively at less brittle ways of fixing this and they are actually very, very hard to do. This is a somewhat strange and unusual case as we have a pass which is deleting an IR unit, but is not running within that IR unit's pass framework (which is what handles this cleanly for the normal loop unroll). And where there isn't a definitive way to clear all of the stale cache entries. And where the pass is updating the core analysis that provides the IR units! For example, we don't have any of these problems with Function analyses because it is easy to clear out function analyses when the functions themselves may have been deleted -- we clear an entire module's worth! But that is too heavy of a hammer down here in the LoopAnalysisManager layer. A better long-term solution IMO is to require that AnalysisManager's make their keys durable to this kind of thing. Specifically, when caching an analysis for one IR unit that is conceptually "owned" by a higher level IR unit, the AnalysisManager should incorporate this into its data structures so that we can reliably clear these results without having to teach each and every pass to do so manually as we do here. But that is a change for another day as it will be a fairly invasive change to the AnalysisManager infrastructure. Until then, this fortunately seems to be quite rare. llvm-svn: 310333	2017-08-08 02:24:20 +00:00
Evgeny Stupachenko	c675290680	Reapply fix PR23384 (part 3 of 3) r304824 (was reverted in r305720). The root cause of reverting was fixed - PR33514. Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 310289	2017-08-07 19:56:34 +00:00
Davide Italiano	b53b075bb1	[Reassociate] Use a range loop for clarity. NFCI. While here, rename `i` to `Rank` as the latter is more self-explanatory (and this code also uses `I` two lines below to identify an Instruction). llvm-svn: 310238	2017-08-07 01:57:21 +00:00
Davide Italiano	a5cdc22e70	[Reassociate] Try to bail out early when canonicalizing. This commit rearranges the checks to avoid calls to getRank() when not needed (e.g. when RHS == LHS). llvm-svn: 310237	2017-08-07 01:49:09 +00:00
Nico Weber	69ca5322d4	Revert r310055, it caused PR34074. llvm-svn: 310123	2017-08-04 20:40:38 +00:00
Evgeny Stupachenko	38197c66a1	Fix PR33514 Summary: The bug was uncovered after fix of PR23384 (part 3 of 3). The patch restricts pointer multiplication in SCEV computaion for ICmpZero. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D36170 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 310092	2017-08-04 18:46:13 +00:00
Filipe Cabecinhas	fb9d2a8775	[DSE] Merge stores when the later store only writes to memory locations the early store also wrote to. Summary: This fixes PR31777. If both stores' values are ConstantInt, we merge the two stores (shifting the smaller store appropriately) and replace the earlier (and larger) store with an updated constant. In the future we should also support vectors of integers. And maybe float/double if we can. Reviewers: hfinkel, junbuml, jfb, RKSimon, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30703 llvm-svn: 310055	2017-08-04 12:28:36 +00:00
Max Kazantsev	9505470033	Do not declare a variable which is used only in assert. NFC llvm-svn: 310034	2017-08-04 07:41:24 +00:00
Max Kazantsev	2f6ae28152	[IRCE] Handle loops with step different from 1/-1 This patch generalizes IRCE to handle IV steps that are not equal to 1 or -1. Differential Revision: https://reviews.llvm.org/D35539 llvm-svn: 310032	2017-08-04 07:01:04 +00:00
Max Kazantsev	07da1ab23a	[IRCE] Recognize loops with unsigned latch conditions This patch enables recognition of loops with ult/ugt latch conditions. Differential Revision: https://reviews.llvm.org/D35302 llvm-svn: 310027	2017-08-04 05:40:20 +00:00
Teresa Johnson	8482e56920	Use profile summary to disable peeling for huge working sets Summary: Detect when the working set size of a profiled application is huge, by comparing the number of counts required to reach the hot percentile in the profile summary to a large threshold. When the working set size is determined to be huge, disable peeling to avoid bloating the working set further. Note that the selected threshold (15K) is significantly larger than the largest working set value in SPEC cpu2006 (which is gcc at around 11K). Reviewers: davidxl Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36288 llvm-svn: 310005	2017-08-03 23:42:58 +00:00
Davide Italiano	5974c31d91	[NewGVN] Fix the case where we have a phi-of-ops which goes away. Patch by Daniel Berlin, fixes PR33196 (and probably something else). llvm-svn: 309988	2017-08-03 21:17:49 +00:00
Teresa Johnson	9a18a6f08b	Disable loop peeling during full unrolling pass. Summary: Peeling should not occur during the full unrolling invocation early in the pipeline, but rather later with partial and runtime loop unrolling. The later loop unrolling invocation will also eventually utilize profile summary and branch frequency information, which we would like to use to control peeling. And for ThinLTO we want to delay peeling until the backend (post thin link) phase, just as we do for most types of unrolling. Ensure peeling doesn't occur during the full unrolling invocation by adding a parameter to the shared implementation function, similar to the way partial and runtime loop unrolling are disabled. Performance results for ThinLTO suggest this has a neutral to positive effect on some internal benchmarks. Reviewers: chandlerc, davidxl Subscribers: mzolotukhin, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36258 llvm-svn: 309966	2017-08-03 17:52:38 +00:00
Sanjay Patel	7cf745cc94	[NewGVN] fix typos; NFC llvm-svn: 309946	2017-08-03 15:18:27 +00:00
Teresa Johnson	ecd901314d	[PM] Split LoopUnrollPass and make partial unroller a function pass Summary: This is largely NFC, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good. Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling. Reviewers: chandlerc Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36157 llvm-svn: 309886	2017-08-02 20:35:29 +00:00
Jakub Kuderski	d869913f3b	[Dominators] Teach LoopDeletion to use the new incremental API Summary: This patch makes LoopDeletion use the incremental DominatorTree API. We modify LoopDeletion to perform the deletion in 5 steps: 1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch. 2. Inform the DomTree about the new edge. 3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable. 4. Inform the DomTree about the deleted edge. 5. Remove the unreachable block from the function. Creating the dummy conditional branch is necessary to perform incremental DomTree update. We should consider using the batch updater when it's ready. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35391 llvm-svn: 309850	2017-08-02 18:17:52 +00:00
Davide Italiano	c2f73b7fae	[NewGVN] Fold single-use variables. NFCI. llvm-svn: 309790	2017-08-02 04:05:49 +00:00
Davide Italiano	b13a3fa41b	[NewGVN] Remove a (now stale) comment. NFCI. llvm-svn: 309789	2017-08-02 03:51:40 +00:00
Chad Rosier	dfd1de687d	[Value Tracking] Default argument to true and rename accordingly. NFC. IMHO this is a bit more readable. llvm-svn: 309739	2017-08-01 20:18:54 +00:00
Max Kazantsev	e4c220e8f2	[IRCE][NFC] Add another assert that AddRecExpr's step is not zero One more assertion of this kind. It is a preparation step for generalizing to the case of stride not equal to +1/-1. llvm-svn: 309663	2017-08-01 06:49:29 +00:00
Max Kazantsev	85da7543f9	[IRCE][NFC] Add assert that AddRecExpr's step is not zero We should never return zero steps, ensure this fact by adding a sanity check when we are analyzing the induction variable. llvm-svn: 309661	2017-08-01 06:27:51 +00:00
Alina Sbirlea	30d8a881e8	Default MemoryLocation passed to getModRefInfo should be None (D35441) llvm-svn: 309645	2017-08-01 00:47:17 +00:00
Alina Sbirlea	967e7966fc	Allow None as a MemoryLocation to getModRefInfo Summary: Adding part of the changes in D30369 (needed to make progress): Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA. Original summary from D30369, by dberlin: Currently, we have instructions which affect memory but have no memory location. If you call, for example, MemoryLocation::get on a fence, it asserts. This means things specifically have to avoid that. It also means we end up with a copy of each API, one taking a memory location, one not. This starts to fix that. We add MemoryLocation::getOrNone as a new call, and reimplement the old asserting version in terms of it. We make MemoryLocation optional in the (Instruction, MemoryLocation) version of getModRefInfo, and kill the old one argument version in favor of passing None (it had one caller). Now both can handle fences because you can just use MemoryLocation::getOrNone on an instruction and it will return a correct answer. We use all this to clean up part of MemorySSA that had to handle this difference. Note that literally every actual getModRefInfo interface we have could be made private and replaced with: getModRefInfo(Instruction, Optional<MemoryLocation>) and getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>) and delegating to the right ones, if we wanted to. I have not attempted to do this yet. Reviewers: dberlin, davide, dblaikie Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D35441 llvm-svn: 309641	2017-08-01 00:28:29 +00:00
David Majnemer	91c6330c96	[IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializer We are not allowed to reason about an initializer value without first consulting hasDefinitiveInitializer. llvm-svn: 309594	2017-07-31 17:47:07 +00:00
Florian Hahn	94b8a87c8e	Extend ifdefs to more unused helper functions. This fixes a buildbot failure with -Werror introduced by r309553 llvm-svn: 309572	2017-07-31 16:11:43 +00:00
Florian Hahn	6b3216aad8	Guard print() functions only used by dump() functions. Summary: Since r293359, most dump() function are only defined when `!defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP)` holds. print() functions only used by dump() functions are now unused in release builds, generating lots of warnings. This patch only defines some print() functions if they are used. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D35949 llvm-svn: 309553	2017-07-31 10:07:49 +00:00
Florian Hahn	4284049dcc	[LoopInterchange] Do not interchange loops with function calls. Summary: Without any information about the called function, we cannot be sure that it is safe to interchange loops which contain function calls. For example there could be dependences that prevent interchanging between accesses in the called function and the loops. Even functions without any parameters could cause problems, as they could access memory using global pointers. For now, I think it is only safe to interchange loops with calls marked as readnone. With this patch, the LLVM test suite passes with `-O3 -mllvm -enable-loopinterchange` and LoopInterchangeProfitability::isProfitable returning true for all loops. check-llvm and check-clang also pass when bootstrapped in a similar fashion, although only 3 loops got interchanged. Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper Reviewed By: mcrosier Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35489 llvm-svn: 309547	2017-07-31 09:00:52 +00:00
Wei Mi	55c05e14af	[GVN] Recommit the patch "Add phi-translate support in scalarpre" Recommit after workaround the bug PR31652. Three bugs fixed in previous recommits: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration can not be regarded as available. The third one is an out-of-bound array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 309397	2017-07-28 15:47:25 +00:00
Davide Italiano	75a001ba78	[JumpThreading] Stop falsely preserving LazyValueInfo. JumpThreading claims to preserve LVI, but it doesn't preserve the analyses which LVI holds a reference to (e.g. the Dominator). In the current pass manager infrastructure, after JT runs, the PM frees these analyses (including DominatorTree) but preserves LVI. CorrelatedValuePropagation runs immediately after and queries a corrupted domtree, causing weird miscompiles. This commit disables the preservation of LVI for the time being. Eventually, we should either move LVI to a proper dependency tracking mechanism (i.e. an analyses shouldn't hold references to other analyses and compute them on demand if needed), or we should teach all the passes preserving LVI to preserve the analyses LVI depends on. The new pass manager has a mechanism to invalidate LVI in case one of the analyses it depends on becomes invalid, so this problem shouldn't exist (at least not in this immediate form), but handling of analyses holding references is still a very delicate subject. Fixes PR33917 (and rustc). llvm-svn: 309355	2017-07-28 03:10:43 +00:00
Davide Italiano	01cb947abb	[JumpThreading] Add an option to dump LazyValueInfo after the run. Differential Revision: https://reviews.llvm.org/D35973 llvm-svn: 309353	2017-07-28 02:57:43 +00:00
Daniel Neilson	2574d7cbf6	All libcalls should be considered to be GC-leaf functions. Summary: It is possible for some passes to materialize a call to a libcall (ex: ldexp, exp2, etc), but these passes will not mark the call as a gc-leaf-function. All libcalls are actually gc-leaf-functions, so we change llvm::callsGCLeafFunction() to tell us that available libcalls are equivalent to gc-leaf-function calls. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35840 llvm-svn: 309291	2017-07-27 16:49:39 +00:00
Wei Mi	fc0e245464	Disable loop unswitching for some patterns containing equality comparison with undef. This is a workaround for the bug described in PR31652 and http://lists.llvm.org/pipermail/llvm-dev/2017-July/115497.html. The temporary solution is to add a function EqualityPropUnSafe. In EqualityPropUnSafe, for some simple patterns we can know the equality comparison may contains undef, so we regard such comparison as unsafe and will not do loop-unswitching for them. We also need to disable the select simplification when one of select operand is undef and its result feeds into equality comparison. The patch cannot clear the safety issue caused by the bug, but it can suppress the issue from happening to some extent. Differential Revision: https://reviews.llvm.org/D35811 llvm-svn: 309059	2017-07-25 23:37:17 +00:00
Chandler Carruth	1dc34c6d80	[LIR] Teach LIR to avoid extending the BE count prior to adding one to it when safe. Very often the BE count is the trip count minus one, and the plus one here should fold with that minus one. But because the BE count might in theory be UINT_MAX or some such, adding one before we extend could in some cases wrap to zero and break when we scale things. This patch checks to see if it would be safe to add one because the specific case that would cause this is guarded for prior to entering the preheader. This should handle essentially all of the common loop idioms coming out of C/C++ code once canonicalized by LLVM. Before this patch, both forms of loop in the added test cases ended up subtracting one from the size, extending it, scaling it up by 8 and then adding 8 back onto it. This is really silly, and it turns out made it all the way into generated code very often, so this is a surprisingly important cleanup to do. Many thanks to Sanjoy for showing me how to do this with SCEV. Differential Revision: https://reviews.llvm.org/D35758 llvm-svn: 308968	2017-07-25 10:48:32 +00:00
Florian Hahn	f66efd6181	[LoopInterchange] Update code to use range-based for loops (NFC). Summary: The remaining non range-based for loops do not iterate over full ranges, so leave them as they are. Reviewers: karthikthecool, blitz.opensource, mcrosier, mkuper, aemerson Reviewed By: aemerson Subscribers: aemerson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35777 llvm-svn: 308872	2017-07-24 11:41:30 +00:00
Jonas Paulsson	024e319489	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
David Majnemer	e6bb895ab5	[LICM] Make sinkRegion and hoistRegion non-recursive Large CFGs can cause us to blow up the stack because we would have a recursive step for each basic block in a region. Instead, create a worklist and iterate it. This limits the stack usage to something more manageable. Differential Revision: https://reviews.llvm.org/D35609 llvm-svn: 308582	2017-07-20 03:27:02 +00:00
Davide Italiano	4b8c8eae32	[TRE] Move to the new OptRemark API. Fixes PR33788. Differential Revision: https://reviews.llvm.org/D35570 llvm-svn: 308524	2017-07-19 21:13:22 +00:00
Balaram Makam	b05a55787a	[SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure. Summary: When simplifying unconditional branches from empty blocks, we pre-test if the BB belongs to a set of loop headers and keep the block to prevent passes from destroying canonical loop structure. However, the current algorithm fails if the destination of the branch is a loop header. Especially when such a loop's latch block is folded into loop header it results in additional backedges and LoopSimplify turns it into a nested loop which prevent later optimizations from being applied (e.g., loop unrolling and loop interleaving). This patch augments the existing algorithm by further checking if the destination of the branch belongs to a set of loop headers and defer eliminating it if yes to LateSimplifyCFG. Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605 Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl Reviewed By: efriedma Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35411 llvm-svn: 308422	2017-07-19 08:53:34 +00:00
Weiming Zhao	984f1dc338	Fix DebugLoc propagation for unreachable LoadInst Summary: Currently, when GVN creates a load and when InstCombine creates a new store for unreachable Load, the DebugLoc info gets lost. Reviewers: dberlin, davide, aprantl Reviewed By: aprantl Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D34639 llvm-svn: 308404	2017-07-19 01:27:24 +00:00
Davide Italiano	6da7db31df	[TRE] Simplify canTRE() a bit using all_of(). NFCI. This has a ~11 years old FIXME, which may not be true today. We might consider removing this code altogether. llvm-svn: 308319	2017-07-18 15:42:59 +00:00
Max Kazantsev	2c627a97fd	[IRCE] Recognize loops with ne/eq latch conditions In some particular cases eq/ne conditions can be turned into equivalent slt/sgt conditions. This patch teaches parseLoopStructure to handle some of these cases. Differential Revision: https://reviews.llvm.org/D35010 llvm-svn: 308264	2017-07-18 04:53:48 +00:00
Florian Hahn	ad993521ac	[LoopInterchange] Add some optimization remarks. Reviewers: anemet, karthikthecool, blitz.opensource Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35122 llvm-svn: 308094	2017-07-15 13:13:19 +00:00
Geoff Berry	f7d5daa0c0	[EarlyCSE] Handle calls with no MemorySSA info. Summary: When checking for memory dependencies between calls using MemorySSA, handle cases where the calls have no MemoryAccess associated with them because the AA analysis being used has determined that the call does not read/write memory. Fixes PR33756 Reviewers: dberlin, davide Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D35317 llvm-svn: 308051	2017-07-14 20:13:21 +00:00
Haicheng Wu	476adcca6b	[JumpThreading] Add a pattern to TryToUnfoldSelectInCurrBB() Add the following pattern to TryToUnfoldSelectInCurrBB() bb: %p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ... %c = cmp %p, 0 %s = select %c, trueval, falseval The Select in the above pattern will be unfolded and then jump-threaded. The current implementation does not allow CMP in the middle of PHI and Select. Differential Revision: https://reviews.llvm.org/D34762 llvm-svn: 308050	2017-07-14 19:16:47 +00:00
Max Kazantsev	f80ffa1a78	[IRCE] Fix corner case with Start = INT_MAX When iterating through loop for (int i = INT_MAX; i > 0; i--) We fail to generate the pre-loop for it. It happens because we use the overflown value in a comparison predicate when identifying whether or not we need it. In old logic, we used SLE predicate against Greatest value which exceeds all seen values of the IV and might be overflown. Now we use the GreatestSeen value of this IV with SLT predicate. Also added a test that ensures that a pre-loop is generated for such loops. Differential Revision: https://reviews.llvm.org/D35347 llvm-svn: 308001	2017-07-14 06:35:03 +00:00
Jakub Kuderski	b323f4f173	[LoopRotate] Fix DomTree update logic for unreachable nodes. Fix PR33701. Summary: LoopRotate manually updates the DoomTree by iterating over all predecessors of a basic block and computing the Nearest Common Dominator. When a predecessor happens to be unreachable, `DT.findNearestCommonDominator` returns nullptr. This patch teaches LoopRotate to handle this case and fixes [[ https://bugs.llvm.org/show_bug.cgi?id=33701 \| PR33701 ]]. In the future, LoopRotate should be taught to use the new incremental API for updating the DomTree. Reviewers: dberlin, davide, uabelho, grosser Subscribers: efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D35074 llvm-svn: 307828	2017-07-12 18:42:16 +00:00
Konstantin Zhuravlyov	bb80d3e1d3	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
Davide Italiano	ee1c82112e	[NewGVN] Check for congruency of memory accesses. This is fine as nothing in the code relies on leader and memory leader being the same for a given congruency class. Ack'ed by Dan. Fixes PR33720. llvm-svn: 307699	2017-07-11 19:49:12 +00:00
Davide Italiano	67b0e53dc1	[NewGVN] Fix an innocent typo I found while debugging PR33720. llvm-svn: 307694	2017-07-11 19:19:45 +00:00
Davide Italiano	fb4544cd15	[NewGVN] Clarify the function invariants formatting them properly. llvm-svn: 307692	2017-07-11 19:15:36 +00:00
Leo Li	93abd7d915	[ConstantHoisting] Remove dupliate logic in constant hoisting Summary: As metioned in https://reviews.llvm.org/D34576, checkings in `collectConstantCandidates` can be replaced by using `llvm::canReplaceOperandWithVariable`. The only special case is that `collectConstantCandidates` return false for all `IntrinsicInst` but it is safe for us to collect constant candidates from `IntrinsicInst`. Reviewers: pirama, efriedma, srhines Reviewed By: efriedma Subscribers: llvm-commits, javed.absar Differential Revision: https://reviews.llvm.org/D34921 llvm-svn: 307587	2017-07-10 20:45:34 +00:00
Davide Italiano	a7a77540ef	[NewGVN] Simplify a lambda a little bit. NFCI. llvm-svn: 307586	2017-07-10 20:45:00 +00:00
Craig Topper	95d2347ae1	[IR] Make use of Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491	2017-07-09 07:04:00 +00:00
Hiroshi Inoue	713b5ba2de	fix trivial typos; NFC sucessor -> successor llvm-svn: 307488	2017-07-09 05:54:44 +00:00
Yaxun Liu	b909f11a31	[InferAddressSpaces] Fix assertion about null pointer InferAddressSpaces does not check address space in collectFlatAddressExpressions, which causes values with non flat address space put into Postorder and causes assertion in cloneValueWithNewAddressSpace. This patch fixes assertion in OpenCL 2.0 conformance test generic_address_space subtest for amdgcn target. Differential Revision: https://reviews.llvm.org/D34991 llvm-svn: 307349	2017-07-07 02:40:13 +00:00
Wei Mi	7586755013	[ConstHoisting] Turn on consthoist-with-block-frequency by default. Using profile information to guide consthoisting is generally helpful for performance, so the patch turns it on by default. No compile time or perf regression were found using spec2000 and spec2006 on x86. Some significant improvement (>20%) was seen on internal benchmarks. Differential Revision: https://reviews.llvm.org/D35063 llvm-svn: 307338	2017-07-07 00:11:05 +00:00
Wei Mi	20526b2725	[ConstHoisting] choose to hoist when frequency is the same. The patch is to adjust the strategy of frequency based consthoisting: Previously when the candidate block has the same frequency with the existing blocks containing a const, it will not hoist the const to the candidate block. For that case, now we change the strategy to hoist the const if only existing blocks have more than one block member. This is helpful for reducing code size. Differential Revision: https://reviews.llvm.org/D35084 llvm-svn: 307328	2017-07-06 22:32:27 +00:00
Craig Topper	79ab643da8	[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292	2017-07-06 18:39:47 +00:00
Wei Mi	90707394e3	[LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale. When the formulae search space is huge, LSR uses a series of heuristic to keep pruning the search space until the number of possible solutions are within certain limit. The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs, which picks the register which is used by the most LSRUses and deletes the other formulae which don't use the register. This is a effective way to prune the search space, but quite often not a good way to keep the best solution. We saw cases before that the heuristic pruned the best formula candidate out of search space. To relieve the problem, we introduce a new heuristic called NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is in order to reduce the search space while keeping the best formula, we want to keep as many formulae with different Scale and ScaledReg as possible. That is because the central idea of LSR is to choose a group of loop induction variables and use those induction variables to represent LSRUses. An induction variable candidate is often represented by the Scale and ScaledReg in a formula. If we have more formulae with different ScaledReg and Scale to choose, we have better opportunity to find the best solution. That is why we believe pruning search space by only keeping the best formula with the same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use two criteria to choose the best formula with the same Scale and ScaledReg. The first criteria is to select the formula using less non shared registers, and the second criteria is to select the formula with less cost got from RateFormula. The patch implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the last resort. Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help on the two improved internal cases we saw. Differential Revision: https://reviews.llvm.org/D34583 llvm-svn: 307269	2017-07-06 15:52:14 +00:00
Anna Thomas	ada4ddc0bc	[LoopDeletion] NFC: Add loop being analyzed debug statement llvm-svn: 307096	2017-07-04 17:00:03 +00:00
Anna Thomas	90f69abc8b	[LoopDeletion] NFC: Add debug statements to the optimization We have a DEBUG option for loop deletion, but no related debug messages. Added some debug messages to state why loop deletion failed. llvm-svn: 307078	2017-07-04 14:05:19 +00:00
Florian Hahn	4eeff394d3	[LoopInterchange] Add more debug messages to currentLimitations(). Summary: This makes it easier to find out which limitation prevented this pass from doing its work. Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34940 llvm-svn: 307035	2017-07-03 15:32:00 +00:00
Benjamin Kramer	fb620493e1	Revert "[GVN] Recommit the patch "Add phi-translate support in scalarpre"." This reverts commit r306313. This breaks selfhost at -O3 and PR33652. Let me know if you need additional information on reproducing the issue. llvm-svn: 307021	2017-07-03 12:23:10 +00:00
Hiroshi Inoue	bb703e8960	fix trivial typos; NFC suport -> support llvm-svn: 306968	2017-07-02 03:24:54 +00:00
Hiroshi Inoue	ef1c2ba22a	fix trivial typos, NFC llvm-svn: 306952	2017-07-01 07:12:15 +00:00
Leo Li	20fbad9307	[ConstantHoisting] Avoid hoisting constants in GEPs that index into a struct type. Summary: Indices for GEPs that index into a struct type should always be constants. This added more checks in `collectConstantCandidates:` which make sure constants for GEP pointer type are not hoisted. This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538 Reviewers: ributzka, rnk Reviewed By: ributzka Subscribers: efriedma, llvm-commits, srhines, javed.absar, pirama Differential Revision: https://reviews.llvm.org/D34576 llvm-svn: 306704	2017-06-29 17:03:34 +00:00
Daniel Berlin	b779db7ebc	NewGVN: Remove useless test in addPhiOfOps. llvm-svn: 306702	2017-06-29 17:01:10 +00:00
Geoff Berry	b0573547f6	[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D34532 llvm-svn: 306564	2017-06-28 17:01:15 +00:00
Geoff Berry	66d9bdbca8	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554	2017-06-28 15:53:17 +00:00
Max Kazantsev	6c466a376e	[IRCE][NFC] Better get SCEV for 1 in calculateSubRanges A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive invocations, and we don't create a ConstantInt if 'true' branch is taken. Differential Revision: https://reviews.llvm.org/D34672 llvm-svn: 306503	2017-06-28 04:57:45 +00:00
Yaxun Liu	7c44f340de	[SROA] Fix APInt size when alloca address space is not 0 SROA assumes alloca address space is 0, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34104 llvm-svn: 306440	2017-06-27 18:26:06 +00:00
Chandler Carruth	3f81d8024c	[SROA] Fix PR32902 by more carefully propagating !nonnull metadata. This is based heavily on the work done ni D34285. I mostly wanted to do test cleanup for the author to save them some time, but I had a really hard time understanding why it was so hard to write better test cases for these issues. The problem is that because SROA does a second rewrite of the loads and because we don't propagate !nonnull for non-pointer loads, we first introduced invalid !nonnull metadata and then stripped it back off just in time to avoid most ways of this PR manifesting. Moving to the more careful utility only fixes this by changing the predicate to look at the new load's type rather than the target type. However, that does fix the bug, and the utility is much nicer including adding range metadata to model the nonnull property after a conversion to an integer. However, we have bigger problems because we don't actually propagate range metadata, and the utility to do this extracted from instcombine isn't really in good shape to do this currently. It only handles the case of copying range metadata from an integer load to a pointer load. It doesn't even handle the trivial cases of propagating from one integer load to another when they are the same width! This utility will need to be beefed up prior to using in this location to get the metadata to fully survive. And even then, we need to go and teach things to turn the range metadata into an assume the way we do with nonnull so that when we promote an integer we don't lose the information. All of this will require a new test case that looks kind-of like `preserve-nonnull.ll` does here but focuses on range metadata. It will also likely require more testing because it needs to correctly handle changes to the integer width, especially as SROA actively tries to change the integer width! Last but not least, I'm a little worried about hooking the range metadata up here because the instcombine logic for converting from a range metadata to a nonnull metadata node seems broken in the face of non-zero address spaces where null is not mapped to the integer `0`. So that probably needs to get fixed with test cases both in SROA and in instcombine to cover it. But this does extract the core PR fix from D34285 of preventing the !nonnull metadata from being propagated in a broken state just long enough to feed into promotion and crash value tracking. On D34285 there is some discussion of zero-extend handling because it isn't necessary. First, the new load size covers all of the non-undef (ie, possibly initialized) bits. This may even extend past the original alloca if loading those bits could produce valid data. The only way its valid for us to zero-extend an integer load in SROA is if the original code had a zero extend or those bits were undef. And we get to assume things like undef never satifies nonnull, so non undef bits can participate here. No need to special case the zero-extend handling, it just falls out correctly. The original credit goes to Ariel Ben-Yehuda! I'm mostly landing this to save a few rounds of trivial edits fixing style issues and test case formulation. Differental Revision: D34285 llvm-svn: 306379	2017-06-27 08:32:03 +00:00
Mikael Holmen	37b5120a9a	[Reassociate] Make sure EraseInst sets MadeChange Summary: EraseInst didn't report that it made IR changes through MadeChange. It is essential that changes to the IR are reported correctly, since for example ReassociatePass::run() will indicate that all analyses are preserved otherwise. And the CGPassManager determines if the CallGraph is up-to-date based on status from InstructionCombiningPass::runOnFunction(). Reviewers: craig.topper, rnk, davide Reviewed By: rnk, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34616 llvm-svn: 306368	2017-06-27 05:32:13 +00:00
Wei Mi	71f06420e4	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit fixes three bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration can not be regarded as available. The third one is an out-of-bound array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. llvm-svn: 306313	2017-06-26 18:16:10 +00:00
Anna Thomas	e7cb633d29	[LoopDeletion] NFC: Move phi node value setting into prepass Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator, which caused loop deletion tests to hang due to infinite loop. Had reverted it in rL306162. rL306157 commit message: Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306254	2017-06-25 21:13:58 +00:00
Hiroshi Inoue	b300824ee7	fix trivial typos in comment, NFC dereferencable -> dereferenceable llvm-svn: 306210	2017-06-24 15:43:33 +00:00
Anna Thomas	77a2e6b198	Revert "[LoopDeletion] NFC: Move phi node value setting into prepass" This reverts commit r306157. It caused some timeouts in clang tests. Perhaps unreachable loops have far too many phi nodes. Reverting and investigating. llvm-svn: 306162	2017-06-23 21:30:48 +00:00
Anna Thomas	a43b387f27	[LoopDeletion] NFC: Move phi node value setting into prepass Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306157	2017-06-23 20:38:50 +00:00
Craig Topper	68ed55e06a	[CorrelatedValuePropagation] Fix typo in comment sense->since. NFC llvm-svn: 306152	2017-06-23 20:28:40 +00:00
Craig Topper	29cdfe2cd9	[CorrelatedValuePropagation] Remove comment about iterating switch cases in reverse order. This is no longer being done after r298791. NFC llvm-svn: 306151	2017-06-23 20:28:35 +00:00
Craig Topper	2c20c42cb6	[JumpThreading] Teach jump threading how to analyze (and (cmp A, C1), (cmp A, C2)) after InstCombine has turned it into (cmp (add A, C3), C4) Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compare if the compare is fed by a livein of a basic block. This can be used to to prove the condition can't be met for some predecessor and the jump from that predecessor can be moved to the false path of the condition. But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare. This patch adds a new method to LVI to get a ConstantRange on an edge. Then we teach jump threading to detect the add livein feeding a compare and to get the ConstantRange and propagate it. Differential Revision: https://reviews.llvm.org/D33262 llvm-svn: 306085	2017-06-23 05:41:35 +00:00
Craig Topper	7927996140	[JumpThreading] Use some temporary variables to reduce the number of times we call the same methods. NFC A future patch will add even more uses of these variables. llvm-svn: 306084	2017-06-23 05:41:32 +00:00
Eric Christopher	5a7c2f1700	Remove the LoadCombine pass. It was never enabled and is unsupported. Based on discussions with the author on mailing lists. llvm-svn: 306067	2017-06-22 22:58:12 +00:00
Anna Thomas	72c90c87f8	[LoopDeletion] Update exits correctly when multiple duplicate edges from an exiting block Summary: Currently, we incorrectly update exit blocks of loops when there are multiple edges from a single exiting block to the exit block. This can happen when we have switches as the terminator of the exiting blocks. The fix here is to correctly update the phi nodes in the exit block, and remove all incoming values except for one which is from the preheader. Note: Currently, this error can manifest only while deleting non-executed loops. However, it is possible to trigger this error in invariant loops, once we enhance the logic around the exit conditions for the loop check. Reviewers: chandlerc, dberlin, sanjoy, efriedma Reviewed by: efriedma Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34516 llvm-svn: 306048	2017-06-22 20:20:56 +00:00
Sam Clegg	705f798bff	Mark dump() methods as const. NFC Add const qualifier to any dump() method where adding one was trivial. Differential Revision: https://reviews.llvm.org/D34481 llvm-svn: 305963	2017-06-21 22:19:17 +00:00
Craig Topper	34caf5396f	[Reassociate] Use early returns in a couple places to reduce indentation and improve readability. NFC llvm-svn: 305946	2017-06-21 19:39:35 +00:00
Craig Topper	99a2e89920	[Reassociate] Const correct a helper function. NFC llvm-svn: 305945	2017-06-21 19:39:33 +00:00
Craig Topper	cbac691c4b	[Reassociate] Support xor reassociating for splat vectors Summary: This patch adds support for xors of splat vectors. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34354 llvm-svn: 305925	2017-06-21 16:07:09 +00:00
Davide Italiano	0ec715be1f	[NewGVN] Fix a bug that made the store verifier less effective. We weren't actually checking for duplicated stores, as the condition was always actually false. This was found by Coverity, and I have no clue how to trigger this in real-world code (although I tried for a bit). llvm-svn: 305867	2017-06-20 22:57:40 +00:00
Hans Wennborg	ca69fc1cb7	Revert r304824 "Fix PR23384 (part 3 of 3)" This seems to be interacting badly with ASan somehow, causing false reports of heap-buffer overflows: PR33514. > Summary: > The patch makes instruction count the highest priority for > LSR solution for X86 (previously registers had highest priority). > > Reviewers: qcolombet > > Differential Revision: http://reviews.llvm.org/D30562 > > From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 305720	2017-06-19 17:57:15 +00:00
Davide Italiano	daa9c0e403	[NewGVN] Simplify findConditionEquivalence(). NFCI. llvm-svn: 305707	2017-06-19 16:46:15 +00:00
Craig Topper	ef85498e05	[Reassociate] Support some reassociation of vector xors Summary: Currently we don't try to do anything with vector xors. This patch adds support for removing duplicate pairs from a chain of vector xors as its pretty easy to support. We still dont' try to combine the xors with and/ors, but I might try that in a future patch. Reviewers: mcrosier, davide, resistor Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34338 llvm-svn: 305704	2017-06-19 16:23:46 +00:00
Craig Topper	4350734d36	[Reassociate] Make one of the helper methods static because it doesn't use any class variables. NFC llvm-svn: 305703	2017-06-19 16:23:43 +00:00
Anna Thomas	7949f4529a	[JumpThreading][LVI] Invalidate LVI information after blocks are merged Summary: After a single predecessor is merged into a basic block, we need to invalidate the LVI information for the new merged block, when LVI is not provably true for all of instructions in the new block. The test cases added show the correct LVI information using the LVI printer pass. Reviewers: reames, dberlin, davide, sanjoy Reviewed by: dberlin, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34108 llvm-svn: 305699	2017-06-19 15:23:33 +00:00
Xin Tong	b412831d11	[TRE] Improve code motion in TRE, use AA to tell whether a load can be moved before a call that writes to memory. Summary: use AA to tell whether a load can be moved before a call that writes to memory. Reviewers: dberlin, davide, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D34115 llvm-svn: 305698	2017-06-19 15:21:18 +00:00
Daniel Berlin	36b08b2088	NewGVN: Fix PR 33461, caused by slightly overzealous verification. llvm-svn: 305657	2017-06-19 00:24:00 +00:00
Craig Topper	d96177cf72	[Reassociate] Use APInt::isNullValue() instead of comparing with 0. NFC This should compile to slightly better code. llvm-svn: 305651	2017-06-18 18:15:38 +00:00
Sanjoy Das	b70ddd8901	[SROA] Add support for non-integral pointers Summary: C.f. http://llvm.org/docs/LangRef.html#non-integral-pointer-type Reviewers: chandlerc, loladiro Reviewed By: loladiro Subscribers: reames, loladiro, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D32203 llvm-svn: 305639	2017-06-17 20:28:13 +00:00
Xin Tong	025780ba6e	[TRE] Add assertion for folding trivial return block llvm-svn: 305637	2017-06-17 16:55:12 +00:00
Xin Tong	d5b4d0b53a	[TRE] Update comments. NFC llvm-svn: 305636	2017-06-17 16:18:36 +00:00
Wei Mi	c7ba876323	Revert rL305578. There is still some buildbot failure to be fixed. llvm-svn: 305603	2017-06-16 23:14:35 +00:00
Davide Italiano	ec5b0257bf	[SCCP] Simplify the code a bit. NFCI. llvm-svn: 305583	2017-06-16 20:50:31 +00:00
Davide Italiano	0b1190aa8d	[SCCP] Clarify a comment about unhandled instructions. llvm-svn: 305579	2017-06-16 20:27:17 +00:00
Wei Mi	a2493b6ad9	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit fixes two bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration can not be regarded as available. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 305578	2017-06-16 20:21:01 +00:00
Davide Italiano	d95d871af0	[SCCP] Remove redundant instruction visitors. Whenever we don't know what to do with an instruction, we send it to overdefined anyway. llvm-svn: 305575	2017-06-16 19:43:57 +00:00
Daniel Neilson	3faabbbe85	[Atomics] Rename and change prototype for atomic memcpy intrinsic Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558	2017-06-16 14:43:59 +00:00
Craig Topper	4d76da4fc9	[CorrelatedValuePropagation] Remove superfluous semicolon. NFC llvm-svn: 305538	2017-06-16 01:53:20 +00:00
Daniel Berlin	51e878e01d	NewGVN: This is wrong by inspection, it will not cause an issue currently due to other limitations, i believe. This also means i can't make a test for it. llvm-svn: 305415	2017-06-14 21:19:28 +00:00
Davide Italiano	0dc4778067	[EarlyCSE] Make PhiToCheck in removeMSSA() a set. This way we end up not looking at PHI args already removed. MemSSA now goes through the updater so we can prune it to avoid having redundant MemoryPHI arguments, but that doesn't quite work for the general case. Discussed with Daniel Berlin, fixes PR33406. llvm-svn: 305409	2017-06-14 19:29:53 +00:00
Frederich Munch	dceb612eeb	Hide dbgs() stream for when built with -fmodules. Summary: Make DebugCounter::print and dump methods to be const correct. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34214 llvm-svn: 305408	2017-06-14 19:16:22 +00:00
George Burgess IV	f613749382	Fix signed/unsigned comparison warning; NFC llvm-svn: 305262	2017-06-13 01:28:49 +00:00
Anna Thomas	4b027e8f89	[RS4GC] Drop invalid metadata after pointers are relocated Summary: After RS4GC, we should drop metadata that is no longer valid. These metadata is used by optimizations scheduled after RS4GC, and can cause a miscompile. One such metadata is invariant.load which is used by LICM sinking transform. After rewriting statepoints, the address of a load maybe relocated. With invariant.load metadata on a load instruction, LICM sinking assumes the loaded value (from a dererenceable address) to be invariant, and rematerializes the load operand and the load at the exit block. This transforms the IR to have an unrelocated use of the address after a statepoint, which is incorrect. Other metadata we conservatively remove are related to dereferenceability and noalias metadata. This patch drops such metadata on store and load instructions after rewriting statepoints. Reviewers: reames, sanjoy, apilipenko Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33756 llvm-svn: 305234	2017-06-12 21:26:53 +00:00
Andrew Kaylor	647025f9e1	[InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin Differential Revision: https://reviews.llvm.org/D33737 llvm-svn: 305132	2017-06-09 23:18:11 +00:00
Yaxun Liu	6455b0dbf3	[SROA] Fix APInt size when load/store have different address space Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in GEPOperator::accumulateConstantOffset. Basically it does not consider the situation that the pointer operand of load or store may be in a non-zero address space and its size may be different from the size of a pointer in address space 0. This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend. Diffferential Revision: https://reviews.llvm.org/D33298 llvm-svn: 305107	2017-06-09 20:46:29 +00:00
Keno Fischer	5329174cb1	[Sink] Fix predicate in legality check Summary: isSafeToSpeculativelyExecute is the wrong predicate to use here. All that checks for is whether it is safe to hoist a value due to unaligned/un-dereferencable accesses. However, not only are we doing sinking rather than hoisting, our concern is that the location we're loading from may have been modified. Instead forbid sinking any load across a critical edge. Reviewers: majnemer Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D33179 llvm-svn: 305102	2017-06-09 19:31:10 +00:00
Serguei Katkov	38414b57f9	[IndVars] Add an option to be able to disable LFTR This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization. By default option is off so current behavior is not changed. Reviewers: reames, sanjoy, wmi, andreadb, apilipenko Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33979 llvm-svn: 305055	2017-06-09 06:11:59 +00:00
Galina Kistanova	e128958552	Changed a comparison operator for std::stable_sort to implement strict weak ordering. This is a temporarily fix which needs additional work, as it triggers a test3 failure. test3 is commented out till then. llvm-svn: 304993	2017-06-08 17:27:40 +00:00
Nirav Dave	62fb8498d3	InferAddressSpaces: Avoid assertion failure with replacing identical cloned constexpr Have cloneConstantExprWithNewAddressSpaces return nullptr when returning initial ConstantExpr. Reviewers: arsenm Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D33995 llvm-svn: 304975	2017-06-08 13:20:55 +00:00
John Brawn	da4a68a1d2	[BPI] Don't assume that strcmp returning >0 is more likely than <0 The zero heuristic assumes that integers are more likely positive than negative, but this also has the effect of assuming that strcmp return values are more likely positive than negative. Given that for nonzero strcmp return values it's the ordering of arguments that determines the sign of the result there's no reason to assume that's true. Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to decide if it's strcmp-like, and if so only assume that nonzero is more likely than zero i.e. strings are more often different than the same. This causes a slight code generation change in the spec2006 benchmark 403.gcc, but with no noticeable performance impact. The intent of this patch is to allow better optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are also some changes that need to be made to if-conversion. Differential Revision: https://reviews.llvm.org/D33934 llvm-svn: 304970	2017-06-08 09:44:40 +00:00
Xinliang David Li	4f49bee764	Fix builin_expect lowering bug PR33346 Skip cases when expected value is not constant int. llvm-svn: 304933	2017-06-07 18:32:24 +00:00
Evgeny Stupachenko	3b88291581	Fix PR23384 (part 3 of 3) Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304824	2017-06-06 20:04:16 +00:00
Daniel Berlin	eafdd862e5	NewGVN: Fix PR/33187. This is a bug caused by two things: 1. When there is no perfect iteration order, we can't let phi nodes put themselves in terms of things that come later in the iteration order, or we will endlessly cycle (the normal RPO algorithm clears the hashtable to avoid this issue). 2. We are sometimes erasing the wrong expression (causing pessimism) because our equality says loads and stores are the same. We introduce an exact equality function and use it when erasing to make sure we erase only identical expressions, not equivalent ones. llvm-svn: 304807	2017-06-06 17:15:28 +00:00
Anna Thomas	b2a212c070	[Atomics][LoopIdiom] Recognize unordered atomic memcpy Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference for recognizing an unordered atomic memcpy and instead of a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames, anna Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304806	2017-06-06 16:45:25 +00:00
Anna Thomas	7218032019	[IRCE] Canonicalize pre/post loops after the blocks are added into parent loop Summary: We were canonizalizing the pre loop (into loop-simplify form) before the post loop blocks were added into parent loop. This is incorrect when IRCE is done on a subloop. The post-loop blocks are created, but not yet added to the parent loop. So, loop-simplification on the pre-loop incorrectly updates LoopInfo. This patch corrects the ordering so that pre and post loop blocks are added to parent loop (if any), and then the loops are canonicalized to LCSSA and LoopSimplifyForm. Reviewers: reames, sanjoy, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33846 llvm-svn: 304800	2017-06-06 14:54:01 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Evgeny Stupachenko	f2b3b467e5	Fix PR23384 (part 2 of 3) NFC Summary: The patch moves LSR cost comparison to target part. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30561 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304750	2017-06-05 23:37:00 +00:00
Evgeny Stupachenko	4d94e99446	LSR: Calculate instruction cost only if InsnsCost is set to true (NFC) Summary: The patch guard all instruction cost calculations with InsnCosts (-lsr-insns-cost) option. Currently even if the option set to false we calculate and print (in debug mode) instruction costs. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D33914 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304746	2017-06-05 22:44:18 +00:00
Galina Kistanova	55344aba7e	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304637	2017-06-03 05:19:10 +00:00
Philip Reames	b70cecd60a	[Statepoint] Be consistent about using deopt naming [NFCI] We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed. llvm-svn: 304607	2017-06-02 23:03:26 +00:00
Keno Fischer	514a6a54e7	[SROA] Fix crash due to bad bitcast Summary: As shown in the test case, SROA was crashing when trying to split stores (to the alloca) of loads (from anywhere), because it assumed the pointer operand to the loads and stores had to have the same address space. This isn't the case. Make sure to use the correct pointer type for both the load and the store. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D32593 llvm-svn: 304585	2017-06-02 19:04:17 +00:00
Xinliang David Li	621e8dcf1f	[Profile] Enhance expect lowering to handle correlated branches builtin_expect applied on && or \|\| expressions were not handled properly before. With this patch, the problem is fixed. Differential Revision: http://reviews.llvm.org/D33164 llvm-svn: 304517	2017-06-02 02:09:31 +00:00
Philip Reames	ae80045deb	[RS4GC] Comment clarification llvm-svn: 304514	2017-06-02 01:52:06 +00:00
Xinliang David Li	d6cfba2a02	Fix compiler_rt buildbot failure llvm-svn: 304489	2017-06-01 23:05:11 +00:00
Xinliang David Li	ee8d6acb1f	[Profile] Fix builtin_expect lowering bug The lowerer wrongly assumes the ICMP instruction 1) always has a constant operand; 2) the operand has value 0. It also assumes the expected value can only be one, thus other values other than one will be considered 'zero'. This leads to wrong profile annotation when other integer values are used other than 0, 1 in the comparison or in the expect intrinsic. Also missing is handling of equal predicate. This patch fixes all the above problems. Differential Revision: http://reviews.llvm.org/D33757 llvm-svn: 304453	2017-06-01 19:05:55 +00:00
Wei Mi	0bd3f41588	Revert rL304050. It may break sanitizer bootstrap. Revert it for now while investigating. llvm-svn: 304350	2017-05-31 21:29:33 +00:00
Reid Kleckner	5fbdd17714	[IR] Add additional addParamAttr/removeParamAttr to AttributeList API Summary: Fairly straightforward patch to fill in some of the holes in the attributes API with respect to accessing parameter/argument attributes. The patch aims to step further towards encapsulating the idx+FirstArgIndex pattern to access these attributes to within the AttributeList. Patch by Daniel Neilson! Reviewers: rnk, chandlerc, pete, javed.absar, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33355 llvm-svn: 304329	2017-05-31 19:23:09 +00:00
Anna Thomas	777bb90bdc	Revert "[Atomics][LoopIdiom] Recognize unordered atomic memcpy" This reverts commit r304310. It caused build failures in polly and mingw due to undefined reference to llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC. llvm-svn: 304315	2017-05-31 17:20:51 +00:00
Anna Thomas	056c009f1b	[Atomics][LoopIdiom] Recognize unordered atomic memcpy Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference for recognizing an unordered atomic memcpy and instead of a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304310	2017-05-31 16:39:52 +00:00
Daniel Berlin	be3e7ba45e	NewGVN: Fix PR 33185 by checking whether we need to recursively generate a phi of ops, which we don't currently support. llvm-svn: 304272	2017-05-31 01:47:32 +00:00
Daniel Berlin	2aa5dc1589	NewGVN: Compute hash value of expression on demand and use it in inequality testing. llvm-svn: 304195	2017-05-30 06:58:18 +00:00
Daniel Berlin	c8ed40400c	NewGVN: Fix PR33194, memory corruption by putting temporary instructions in tables sometimes. llvm-svn: 304194	2017-05-30 06:42:29 +00:00
Hiroshi Inoue	ac9cd3080d	[trivial] fix a typo in comment, NFC llvm-svn: 304139	2017-05-29 08:37:42 +00:00
Wei Mi	5bbb5aafc1	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit is to fix a bug about ExtractValue and InsertValue ops. For those ops, some varargs inside GVN::Expression are not value numbers but raw index numbers. It is wrong to do phi-translate for raw index numbers, and the fix is to stop doing that. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 304050	2017-05-27 00:54:19 +00:00
Benjamin Kramer	debb3c35e0	Make helper functions static. NFC. llvm-svn: 304029	2017-05-26 20:09:00 +00:00
Wei Mi	3250ae3f7c	Revert rL303923 since it broke the sanitizer bootstrap build bot. llvm-svn: 303969	2017-05-26 05:42:50 +00:00
Wei Mi	fd257fa7bf	[GVN] Add phi-translate support in scalarpre. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 303923	2017-05-25 21:49:02 +00:00
Daniel Berlin	e67c322260	NewGVN: Fix PR 33119, PR 33129, due to regressed undef handling Fix PR33120 and others by eliminating self-cycles a different way. llvm-svn: 303875	2017-05-25 15:44:20 +00:00
James Molloy	dc2d64bc35	[GVNSink] Pacify MSVC Don't convert an unsigned to a pointer for a sentinel, use a size_t instead. llvm-svn: 303855	2017-05-25 13:14:10 +00:00
James Molloy	2a237f19f1	[GVNSink] Don't define operator<< in NDEBUG Without debug macros enabled, the raw_ostream operator<< overload is unused. llvm-svn: 303852	2017-05-25 13:11:18 +00:00
James Molloy	a929063233	[GVNSink] GVNSink pass This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts. This pass attempts to sink instructions into successors, reducing static instruction count and enabling if-conversion. We use a variant of global value numbering to decide what can be sunk. Consider: [ %a1 = add i32 %b, 1 ] [ %c1 = add i32 %d, 1 ] [ %a2 = xor i32 %a1, 1 ] [ %c2 = xor i32 %c1, 1 ] \ / [ %e = phi i32 %a2, %c2 ] [ add i32 %e, 4 ] GVN would number %a1 and %c1 differently because they compute different results - the VN of an instruction is a function of its opcode and the transitive closure of its operands. This is the key property for hoisting and CSE. What we want when sinking however is for a numbering that is a function of the uses of an instruction, which allows us to answer the question "if I replace %a1 with %c1, will it contribute in an equivalent way to all successive instructions?". The (new) PostValueTable class in GVN provides this mapping. This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG. Differential revision: https://reviews.llvm.org/D24805 llvm-svn: 303850	2017-05-25 12:51:11 +00:00
Chandler Carruth	dd2e275a47	[PM/Unswitch] Fix a bug in the domtree update logic for the new unswitch pass. The original logic only considered direct successors of the hoisted domtree nodes, but that isn't really enough. If there are other basic blocks that are completely within the subtree, their successors could just as easily be impacted by the hoisting. The more I think about it, the more I think the correct update here is to hoist every block on the dominance frontier which has an idom in the chain we hoist across. However, this is subtle enough that I'd definitely appreciate some more eyes on it. Sadly, if this is the correct algorithm, it requires computing a (highly localized) dominance frontier. I've done this in the simplest (IE, least code) way I could come up with, but that may be too naive. Suggestions welcome here, dominance update algorithms are not an area I've studied much, so I don't have strong opinions. In good news, with this patch, turning on simple unswitch passes the LLVM test suite for me with asserts enabled. Differential Revision: https://reviews.llvm.org/D32740 llvm-svn: 303843	2017-05-25 06:33:36 +00:00
Chandler Carruth	29c22d2835	[LegacyPM] Make the 'addLoop' method accept a loop to add rather than having it internally allocate the loop. This is a much more flexible API and necessary in the new loop unswitch to reasonably support both new and old PMs in common code. It also just seems like a cleaner separation of concerns. NFC, this should just be a pure refactoring. Differential Revision: https://reviews.llvm.org/D33528 llvm-svn: 303834	2017-05-25 03:01:31 +00:00
Craig Topper	8205a1a9b6	[ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773	2017-05-24 16:53:07 +00:00
Davide Italiano	fd9100e056	[NewGVN] Update additionalUsers when we simplify to a value. Otherwise we don't revisit an instruction that could be simplified, and when we verify, we discover there's something that changed, i.e. what we had wasn't a maximal fixpoint. Fixes PR32836. llvm-svn: 303715	2017-05-24 02:30:24 +00:00
Davide Italiano	c4861adad9	[SCCP] Use the `hasAddressTaken()` version defined in `Function`. Instead of using the SCCP homegrown one. We should eventually make the private SCCP version disappear, but that wont' be today. PR33143 tracks this issue. Add braces for consistency while here. No functional change intended. llvm-svn: 303706	2017-05-23 23:59:23 +00:00

... 3 4 5 6 7 ...

8684 Commits