llvm-project/llvm/lib/Analysis
Kerry McLaughlin ba1e150d03 [SVE] Add support for scalable vectorization of loops with int/fast FP reductions
This patch enables scalable vectorization of loops with integer and fast floating-point reductions, e.g.:

```
unsigned sum = 0;
for (int i = 0; i < n; ++i) {
  sum += a[i];
}
```
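
As a rough illustration of what vectorizing such a reduction means, the sketch below models it in plain C++: each lane keeps its own partial sum, and the lanes are combined once after the loop. The names `kLanes`, `sumScalar` and `sumLanes` are made up for this example, and a fixed lane count stands in for the vectorization factor; this is not the code the vectorizer emits. Because the additions are reassociated, the result is guaranteed to match the scalar loop only for wrapping integer arithmetic or for floating point where reassociation is permitted.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical fixed lane count standing in for the vectorization factor.
constexpr std::size_t kLanes = 4;

// Scalar reference: the loop from the commit message.
unsigned sumScalar(const std::vector<unsigned> &a) {
  unsigned sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += a[i];
  return sum;
}

// Lane-wise model of a vectorized reduction.
unsigned sumLanes(const std::vector<unsigned> &a) {
  std::array<unsigned, kLanes> partial{}; // one zeroed accumulator per lane
  std::size_t i = 0;
  // Main "vector" body: lane k accumulates elements k, k + kLanes, ...
  for (; i + kLanes <= a.size(); i += kLanes)
    for (std::size_t lane = 0; lane < kLanes; ++lane)
      partial[lane] += a[i + lane];
  // Horizontal reduction of the lane accumulators.
  unsigned sum = 0;
  for (unsigned p : partial)
    sum += p;
  // Scalar remainder loop for the leftover elements.
  for (; i < a.size(); ++i)
    sum += a[i];
  return sum;
}
```

Unsigned addition wraps, so `sumLanes` and `sumScalar` agree for every input even when the running sum overflows.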

A new TTI interface, isLegalToVectorizeReduction, has been added to prevent
reductions which are not supported for scalable types from vectorizing.
If the reduction is not supported for a given scalable VF,
computeFeasibleMaxVF will fall back to using fixed-width vectorization.
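
The fallback described above can be pictured with a toy model. Everything in this sketch (`ToyVF`, `pickMaxVF`, the callback parameter) is hypothetical and only mirrors the shape of the decision; it does not reproduce the signatures or logic of the actual TTI hook or of computeFeasibleMaxVF.

```cpp
#include <functional>

// Toy stand-in for a vectorization factor: a minimum lane count plus a flag
// saying whether the real lane count is a runtime multiple of it (scalable).
struct ToyVF {
  unsigned MinLanes;
  bool Scalable;
};

// Hypothetical driver, not LLVM's code: prefer the scalable VF, but fall back
// to the fixed-width VF when the legality callback rejects the reduction.
ToyVF pickMaxVF(ToyVF ScalableVF, ToyVF FixedVF,
                const std::function<bool(ToyVF)> &IsLegalReduction) {
  if (IsLegalReduction(ScalableVF))
    return ScalableVF;
  return FixedVF;
}
```

Passing the legality check in as a callback keeps the sketch free of any claim about which reductions a given target accepts.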

Reviewed By: david-arm, fhahn, dmgreen

Differential Revision: https://reviews.llvm.org/D95245
2021-02-16 13:50:06 +00:00
models/inliner [MLInliner] Simplify TFUTILS_SUPPORTED_TYPES 2020-08-25 14:19:39 -07:00
AliasAnalysis.cpp [AA] Add option for tracing AA queries (NFC) 2021-02-12 21:42:49 +01:00
AliasAnalysisEvaluator.cpp [Analysis] Use range-based for loops (NFC) 2021-02-06 11:17:10 -08:00
AliasAnalysisSummary.cpp
AliasAnalysisSummary.h
AliasSetTracker.cpp [Analysis] Use range-based for loops (NFC) 2021-02-06 11:17:10 -08:00
Analysis.cpp [NPM] Port module-debuginfo pass to the new pass manager 2020-10-19 14:31:17 -07:00
AssumeBundleQueries.cpp [llvm] Use llvm::is_contained (NFC) 2021-02-14 08:36:20 -08:00
AssumptionCache.cpp Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache" 2021-02-11 12:17:38 -06:00
BasicAliasAnalysis.cpp [BasicAA] Merge aliasGEP code paths 2021-02-14 19:35:36 +01:00
BlockFrequencyInfo.cpp
BlockFrequencyInfoImpl.cpp
BranchProbabilityInfo.cpp [Analysis] Use range-based for loops (NFC) 2021-02-06 11:17:10 -08:00
CFG.cpp [Analysis] Use is_contained (NFC) 2020-12-11 21:19:31 -08:00
CFGPrinter.cpp [llvm] Use llvm::all_of (NFC) 2021-01-06 18:27:36 -08:00
CFLAndersAliasAnalysis.cpp [MemLoc] Use hasValue() method (NFC) 2020-11-19 21:53:50 +01:00
CFLGraph.h
CFLSteensAliasAnalysis.cpp
CGSCCPassManager.cpp [llvm] Use *::empty (NFC) 2021-01-16 09:40:55 -08:00
CMakeLists.txt [NFC] Move ImportedFunctionsInliningStatistics to Analysis 2021-01-20 13:18:03 -08:00
CallGraph.cpp [Analysis] Remove spliceFunction (NFC) 2020-12-23 21:57:25 -08:00
CallGraphSCCPass.cpp [NewPM] Support --print-before/after in NPM 2020-12-03 16:52:14 -08:00
CallPrinter.cpp [Analysis] Use range-based for loops (NFC) 2021-02-06 11:17:10 -08:00
CaptureTracking.cpp [CaptureTracking] Add statistics (NFC) 2020-11-07 12:57:00 +01:00
CmpInstAnalysis.cpp
CodeMetrics.cpp Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache" 2021-02-11 12:17:38 -06:00
ConstantFolding.cpp [ConstantFold] Fold fptoi.sat intrinsics 2021-01-10 17:37:27 +01:00
ConstraintSystem.cpp [llvm] Remove redundant string initialization (NFC) 2021-01-12 21:43:46 -08:00
CostModel.cpp [Support] Introduce a new InstructionCost class 2020-12-11 08:12:54 +00:00
DDG.cpp [Analysis] Use llvm::append_range (NFC) 2021-01-22 23:25:01 -08:00
DDGPrinter.cpp [DDG] Data Dependence Graph - DOT printer - recommit 2020-12-16 12:37:36 -05:00
Delinearization.cpp [Analysis] Use range-based for loops (NFC) 2021-02-06 11:17:10 -08:00
DemandedBits.cpp [DemandedBits][BDCE] Add support for min/max intrinsics 2020-09-10 22:13:31 +02:00
DependenceAnalysis.cpp [AA] Split up LocationSize::unknown() 2020-11-26 18:39:55 +01:00
DependenceGraphBuilder.cpp [Analysis] Use llvm::append_range (NFC) 2021-01-22 23:25:01 -08:00
DevelopmentModeInlineAdvisor.cpp [Analysis] Use llvm::append_range (NFC) 2021-01-22 23:25:01 -08:00
DivergenceAnalysis.cpp [NewPM] Introduce (GPU)DivergenceAnalysis in the new pass manager 2021-02-16 10:26:45 +05:30
DomPrinter.cpp
DomTreeUpdater.cpp [Target, Transforms] Use *Set::contains (NFC) 2021-01-08 18:39:54 -08:00
DominanceFrontier.cpp
EHPersonalities.cpp [XCOFF][AIX] Generate LSDA data and compact unwind section on AIX 2020-12-02 18:42:44 +00:00
FunctionPropertiesAnalysis.cpp [llvm] Ensure newlines at the end of files (NFC) 2021-01-10 09:24:57 -08:00
GlobalsModRef.cpp Reapply [BasicAA] Handle recursive queries more efficiently 2021-01-17 10:34:35 +01:00
GuardUtils.cpp
HeatUtils.cpp [CallPrinter] Remove static constructor. 2020-06-17 13:02:58 +02:00
IRSimilarityIdentifier.cpp [IROutliner] Adding instruction strings to IRSimilarityPrinting diagnostics. 2021-02-09 12:11:47 -06:00
IVDescriptors.cpp [LoopVectorizer] Require no-signed-zeros-fp-math=true for fmin/fmax 2021-02-15 13:47:05 +00:00
IVUsers.cpp
ImportedFunctionsInliningStatistics.cpp Reland "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor" 2021-01-20 13:33:43 -08:00
IndirectCallPromotionAnalysis.cpp [Analysis] Remove unused system header includes 2020-11-22 10:32:37 +00:00
InlineAdvisor.cpp [InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks 2021-01-25 15:38:57 -08:00
InlineCost.cpp [AMDGPU][Inliner] Remove amdgpu-inline and add a new TTI inline hook 2021-01-21 20:29:17 -08:00
InlineSizeEstimatorAnalysis.cpp [MLGO] Fix build break as result of new InstructionCost (D91174) 2020-12-11 20:28:39 -08:00
InstCount.cpp [NFC] Port InstCount pass to new pass manager 2020-08-21 12:39:42 +03:00
InstructionPrecedenceTracking.cpp
InstructionSimplify.cpp [CodeGen][SelectionDAG]Add new intrinsic experimental.vector.reverse 2021-02-15 13:39:43 +00:00
Interval.cpp [Analysis/Interval] Remove isLoop (NFC) 2020-12-12 10:09:35 -08:00
IntervalPartition.cpp
LazyBlockFrequencyInfo.cpp
LazyBranchProbabilityInfo.cpp
LazyCallGraph.cpp [llvm] Use llvm::drop_begin (NFC) 2021-01-14 20:30:33 -08:00
LazyValueInfo.cpp [Analysis] Use range-based for loops (NFC) 2021-02-06 11:17:10 -08:00
LegacyDivergenceAnalysis.cpp [NewPM] Introduce (GPU)DivergenceAnalysis in the new pass manager 2021-02-16 10:26:45 +05:30
Lint.cpp [AA] Split up LocationSize::unknown() 2020-11-26 18:39:55 +01:00
Loads.cpp reland [InstCombine] convert assumes to operand bundles 2021-02-13 13:03:11 +01:00
LoopAccessAnalysis.cpp [llvm] Drop unnecessary make_range (NFC) 2021-01-09 09:25:00 -08:00
LoopAnalysisManager.cpp [NFC] Reduce include files dependency. 2020-12-03 18:25:05 +03:00
LoopCacheAnalysis.cpp [Analysis] Use llvm::append_range (NFC) 2021-01-22 23:25:01 -08:00
LoopInfo.cpp [llvm] Use llvm::is_contained (NFC) 2021-02-14 08:36:20 -08:00
LoopNestAnalysis.cpp [Analysis] Use llvm::append_range (NFC) 2021-01-22 23:25:01 -08:00
LoopPass.cpp [Analysis] Use llvm::erase_value (NFC) 2020-12-14 22:40:13 -08:00
LoopUnrollAnalyzer.cpp ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC. 2020-06-17 15:48:23 +01:00
MLInlineAdvisor.cpp Reland "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor" 2021-01-20 13:33:43 -08:00
MemDepPrinter.cpp [Analysis] Remove dead function getInstTypePair (NFC) 2020-12-19 10:57:35 -08:00
MemDerefPrinter.cpp Port -print-memderefs to NPM 2020-11-23 11:56:22 -08:00
MemoryBuiltins.cpp [Analysis] Support AIX vec_malloc routines 2021-01-22 16:03:01 -05:00
MemoryDependenceAnalysis.cpp [llvm] Use pop_back_val (NFC) 2021-01-24 12:18:57 -08:00
MemoryLocation.cpp [MemLoc] Fix debug print for LocationSize 2020-12-20 17:52:48 +01:00
MemorySSA.cpp [MemorySSA] Don't treat lifetime.end as NoAlias 2021-02-04 20:58:28 +01:00
MemorySSAUpdater.cpp [DominatorTree] Add support for mixed pre/post CFG views. 2021-01-06 14:53:09 -08:00
ModuleDebugInfoPrinter.cpp [NPM] Port module-debuginfo pass to the new pass manager 2020-10-19 14:31:17 -07:00
ModuleSummaryAnalysis.cpp [ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags 2021-01-27 10:43:51 -08:00
MustExecute.cpp [MustExecute] Use ListSeparator (NFC) 2021-01-28 22:21:16 -08:00
ObjCARCAliasAnalysis.cpp [AA] Split up LocationSize::unknown() 2020-11-26 18:39:55 +01:00
ObjCARCAnalysisUtils.cpp [NFC] Reduce include files dependency. 2020-12-03 18:25:05 +03:00
ObjCARCInstKind.cpp [ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of 2021-02-12 09:51:57 -08:00
OptimizationRemarkEmitter.cpp [BPI] Improve static heuristics for "cold" paths. 2020-12-23 22:47:36 +07:00
PHITransAddr.cpp
PhiValues.cpp [PhiValues] Use SetVector to avoid non-determinism 2020-10-23 20:14:02 +02:00
PostDominators.cpp
ProfileSummaryInfo.cpp [NFC] Change getEntryForPercentile to be a static function in ProfileSummaryBuilder. 2020-07-09 16:38:19 -07:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp RegionInfo.cpp - remove duplicate includes that already exist in RegionInfo.h. NFC. 2020-07-23 17:50:22 +01:00
RegionPass.cpp [NFC] Clean up always false variables 2020-10-21 10:54:55 -07:00
RegionPrinter.cpp
ReleaseModeModelRunner.cpp static const char *const foo => const char foo[] 2020-12-01 10:33:18 -08:00
ReplayInlineAdvisor.cpp [InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks 2021-01-25 15:38:57 -08:00
ScalarEvolution.cpp Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache" 2021-02-11 12:17:38 -06:00
ScalarEvolutionAliasAnalysis.cpp [AA] Split up LocationSize::unknown() 2020-11-26 18:39:55 +01:00
ScalarEvolutionDivision.cpp [SCEV] Generalize SCEVParameterRewriter to accept SCEV expression as target. 2020-09-18 10:05:02 +01:00
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp [NFC] Reduce include files dependency and AA header cleanup (part 2). 2020-12-17 14:04:48 +03:00
StackLifetime.cpp When dumping results of StackLifetime, it will print the following 2020-09-07 11:43:16 +08:00
StackSafetyAnalysis.cpp [llvm] Drop unnecessary make_range (NFC) 2021-01-09 09:25:00 -08:00
StratifiedSets.h
SyncDependenceAnalysis.cpp [Analysis] Use ListSeparator (NFC) 2021-02-14 08:36:14 -08:00
SyntheticCountsUtils.cpp
TFUtils.cpp [NFC][TFUtils] also include output specs lookup logic in loadOutputSpecs 2020-11-18 21:20:21 -08:00
TargetLibraryInfo.cpp [NFC][Analysis] Change struct VecDesc to use ElementCount 2021-02-12 11:07:58 +00:00
TargetTransformInfo.cpp [SVE] Add support for scalable vectorization of loops with int/fast FP reductions 2021-02-16 13:50:06 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp [NFC] Reduce include files dependency and AA header cleanup (part 2). 2020-12-17 14:04:48 +03:00
TypeMetadataUtils.cpp TypeMetadataUtils.h - reduce Instructions.h include to forward declaration. NFC. 2020-06-05 17:40:33 +01:00
VFABIDemangling.cpp [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00
ValueLattice.cpp
ValueLatticeUtils.cpp [ValueLattice] Simplify canTrackGlobalVariableInterprocedurally (NFC). 2020-07-09 18:33:09 +01:00
ValueTracking.cpp [ValueTracking] add scan limit for assumes 2021-02-15 15:24:20 -05:00
VectorUtils.cpp [Analysis] Change VFABI::mangleTLIVectorName to use ElementCount 2021-02-12 09:38:12 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
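
The equivalence can be checked numerically. The sketch below is not LLVM code and the function names are invented for illustration. It assumes the exit value is the recurrence's value on iteration %n - 1, which is what the algebra of the quoted expression suggests: -2 + (n-2)(n-1) + 3n = n*n.

```cpp
#include <cstdint>

// Value of the add recurrence {1,+,3,+,2} on iteration i (i = 0, 1, 2, ...),
// computed by stepping: the value grows by the step, and the step grows by 2.
std::uint64_t chrecByStepping(std::uint64_t i) {
  std::uint64_t value = 1, step = 3;
  for (std::uint64_t k = 0; k < i; ++k) {
    value += step;
    step += 2;
  }
  return value;
}

// Closed form of the same recurrence: 1 + 3*i + 2*C(i,2) = (i + 1)^2.
std::uint64_t chrecClosedForm(std::uint64_t i) { return (i + 1) * (i + 1); }

// The expanded form quoted above, evaluated in plain 64-bit arithmetic.
// Only meaningful here for n >= 2 small enough that (n-2)*(n-1) fits in
// 64 bits; the real expression widens to i65 to cover the full range.
std::uint64_t exitValueExpanded(std::uint64_t n) {
  return (((n - 2) * (n - 1)) / 2) * 2 + 3 * n - 2;
}
```

For small n, all three functions agree with n * n when the recurrence is evaluated at i = n - 1.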

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
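
The fold is sound because truncation commutes with negation in modular arithmetic, so the first two terms cancel. A small C++ sketch of that identity follows; the helper names are invented, and modelling the undef operand as one fixed value `u` is an assumption of this example.

```cpp
#include <cstdint>

// trunc i64 -> i32 modelled as a narrowing cast (keeps the low 32 bits).
std::uint32_t trunc32(std::uint64_t x) { return static_cast<std::uint32_t>(x); }

// The expression as ScalarEvolution forms it. `u` stands in for the undef
// operand; treating it as a single fixed value is an assumption of this sketch.
std::uint32_t unfolded(std::uint64_t arg5, std::uint64_t u) {
  return trunc32(0 - arg5) + trunc32(arg5) + (0u - trunc32(u));
}

// The folded form: only the negated truncation of `u` remains.
std::uint32_t folded(std::uint64_t u) { return 0u - trunc32(u); }
```

For any `arg5`, `unfolded(arg5, u)` equals `folded(u)`, since the low 32 bits of -x and x always sum to zero modulo 2^32.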

//===---------------------------------------------------------------------===//