forked from OSchip/llvm-project
5e2d70b9a3
the DFS stack for leaves in the call graph. As mentioned in my previous commit, this is particularly interesting for graphs which have high fan out but low connectivity resulting in many leaves. For such graphs, this can remove a large % of the DFS stack traffic even though it doesn't make the stack much smaller. It's a bit easier to formulate this for the full algorithm because that one stops completely for each SCC. For example, I was able to directly eliminate the "Recurse" boolean used to continue an outer loop from the inner loop. llvm-svn: 207311 |
||
---|---|---|
.. | ||
IPA | ||
AliasAnalysis.cpp | ||
AliasAnalysisCounter.cpp | ||
AliasAnalysisEvaluator.cpp | ||
AliasDebugger.cpp | ||
AliasSetTracker.cpp | ||
Analysis.cpp | ||
BasicAliasAnalysis.cpp | ||
BlockFrequencyInfo.cpp | ||
BlockFrequencyInfoImpl.cpp | ||
BranchProbabilityInfo.cpp | ||
CFG.cpp | ||
CFGPrinter.cpp | ||
CGSCCPassManager.cpp | ||
CMakeLists.txt | ||
CaptureTracking.cpp | ||
CodeMetrics.cpp | ||
ConstantFolding.cpp | ||
CostModel.cpp | ||
Delinearization.cpp | ||
DependenceAnalysis.cpp | ||
DomPrinter.cpp | ||
DominanceFrontier.cpp | ||
IVUsers.cpp | ||
InstCount.cpp | ||
InstructionSimplify.cpp | ||
Interval.cpp | ||
IntervalPartition.cpp | ||
LLVMBuild.txt | ||
LazyCallGraph.cpp | ||
LazyValueInfo.cpp | ||
LibCallAliasAnalysis.cpp | ||
LibCallSemantics.cpp | ||
Lint.cpp | ||
Loads.cpp | ||
LoopInfo.cpp | ||
LoopPass.cpp | ||
Makefile | ||
MemDepPrinter.cpp | ||
MemoryBuiltins.cpp | ||
MemoryDependenceAnalysis.cpp | ||
ModuleDebugInfoPrinter.cpp | ||
NoAliasAnalysis.cpp | ||
PHITransAddr.cpp | ||
PostDominators.cpp | ||
PtrUseVisitor.cpp | ||
README.txt | ||
RegionInfo.cpp | ||
RegionPass.cpp | ||
RegionPrinter.cpp | ||
ScalarEvolution.cpp | ||
ScalarEvolutionAliasAnalysis.cpp | ||
ScalarEvolutionExpander.cpp | ||
ScalarEvolutionNormalization.cpp | ||
SparsePropagation.cpp | ||
TargetTransformInfo.cpp | ||
Trace.cpp | ||
TypeBasedAliasAnalysis.cpp | ||
ValueTracking.cpp |
README.txt
Analysis Opportunities: //===---------------------------------------------------------------------===// In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the ScalarEvolution expression for %r is this: {1,+,3,+,2}<loop> Outside the loop, this could be evaluated simply as (%n * %n), however ScalarEvolution currently evaluates it as (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n)) In addition to being much more complicated, it involves i65 arithmetic, which is very inefficient when expanded into code. //===---------------------------------------------------------------------===// In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is forming this expression: ((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32))) This could be folded to (-1 * (trunc i64 undef to i32)) //===---------------------------------------------------------------------===//