llvm-project/llvm/lib/Analysis
Sanjay Patel 0c1c70aef4 [ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693)
By enhancing value tracking, we allow an existing min/max canonicalization to
kick in and improve codegen for several targets that have min/max instructions.

Unfortunately, recognizing min/max in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but I'm hoping we can remove that soon.

Correctness proofs based on Alive:

Name: smaxmin
Pre: C1 < C2
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp slt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp sgt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: sminmax
Pre: C1 > C2
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp sgt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp slt i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

---------------------------------------- 
Optimization: smaxmin 
Done: 1 
Optimization is correct! 
---------------------------------------- 
Optimization: sminmax 
Done: 1 
Optimization is correct!



Name: umaxmin
Pre: C1 u< C2
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ult i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ugt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: uminmax
Pre: C1 u> C2
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ugt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ult i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

---------------------------------------- 
Optimization: umaxmin 
Done: 1 
Optimization is correct! 
---------------------------------------- 
Optimization: uminmax 
Done: 1 
Optimization is correct! 
 

llvm-svn: 292660
2017-01-20 22:18:47 +00:00
..
AliasAnalysis.cpp [AliasAnalysis] Fences do not modify constant memory location 2017-01-20 00:21:33 +00:00
AliasAnalysisEvaluator.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
AliasAnalysisSummary.cpp Update a comment. 2016-08-25 01:29:55 +00:00
AliasAnalysisSummary.h Make some LLVM_CONSTEXPR variables const. NFC. 2016-08-25 01:05:08 +00:00
AliasSetTracker.cpp [AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory. 2016-11-07 14:11:45 +00:00
Analysis.cpp [LCSSA] Perform LCSSA verification only for the current loop nest. 2016-10-28 12:57:20 +00:00
AssumptionCache.cpp [ValueTracking] recognize a 'not' of an assumed condition as false 2017-01-17 18:15:49 +00:00
BasicAliasAnalysis.cpp [PM] Remove a pointless optimization. 2016-12-27 18:04:11 +00:00
BlockFrequencyInfo.cpp Add an interface to scale the frequencies of a set of blocks. 2017-01-19 18:53:16 +00:00
BlockFrequencyInfoImpl.cpp [GraphTraits] Replace all NodeType usage with NodeRef 2016-08-22 21:09:30 +00:00
BranchProbabilityInfo.cpp Retry: [BPI] Use a safer constructor to calculate branch probabilities 2016-12-17 01:02:08 +00:00
CFG.cpp Avoid overly large SmallPtrSet/SmallSet 2016-01-30 01:24:31 +00:00
CFGPrinter.cpp [PM] Port CFGViewer and CFGPrinter to the new Pass Manager 2016-09-15 18:35:27 +00:00
CFLAndersAliasAnalysis.cpp Apply clang-tidy's performance-unnecessary-value-param to LLVM. 2017-01-13 14:39:03 +00:00
CFLGraph.h [CFLAA] Check for pointer types in more places. 2016-07-29 01:23:45 +00:00
CFLSteensAliasAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
CGSCCPassManager.cpp [PM] Teach the CGSCC's CG update utility to more carefully invalidate 2016-12-28 10:34:50 +00:00
CMakeLists.txt [PM] Separate the LoopAnalysisManager from the LoopPassManager and move 2017-01-11 09:43:56 +00:00
CallGraph.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
CallGraphSCCPass.cpp Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed 2017-01-18 21:37:11 +00:00
CallPrinter.cpp [CG] Rename the DOT printing pass to actually reference "DOT". 2016-03-10 11:04:40 +00:00
CaptureTracking.cpp [CaptureTracking] Volatile operations capture their memory location 2016-05-26 17:36:22 +00:00
CodeMetrics.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
ConstantFolding.cpp [InstCombiner] Simplify lib calls to `round{,f}` 2016-12-26 14:29:29 +00:00
CostModel.cpp [X86] updating TTI costs for arithmetic instructions on X86\SLM arch. 2017-01-11 08:23:37 +00:00
Delinearization.cpp [NFC] Header cleanup 2016-04-18 09:17:29 +00:00
DemandedBits.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
DependenceAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
DivergenceAnalysis.cpp DivergenceAnalysis: Fix crash with no return blocks 2016-05-09 16:57:08 +00:00
DomPrinter.cpp Introduce analysis pass to compute PostDominators in the new pass manager. NFC 2016-02-25 17:54:07 +00:00
DominanceFrontier.cpp [PM] Introduce an analysis set used to preserve all analyses over 2017-01-15 06:32:49 +00:00
EHPersonalities.cpp [tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part 2016-11-14 21:41:13 +00:00
GlobalsModRef.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
IVUsers.cpp [PM] Separate the LoopAnalysisManager from the LoopPassManager and move 2017-01-11 09:43:56 +00:00
IndirectCallPromotionAnalysis.cpp Remove another unused variable from r275216 2016-07-12 23:49:17 +00:00
InlineCost.cpp Recommit "[InlineCost] Use TTI to check if GEP is free." #3 2017-01-20 18:51:22 +00:00
InstCount.cpp
InstructionSimplify.cpp Removing potentially error-prone fallthrough. NFC 2017-01-14 07:28:47 +00:00
Interval.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
IntervalPartition.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
IteratedDominanceFrontier.cpp Normalize file docs. NFC. 2016-07-21 20:52:35 +00:00
LLVMBuild.txt Restore "[ThinLTO] Prevent exporting of locals used/defined in module level asm" 2016-11-14 17:12:32 +00:00
LazyBlockFrequencyInfo.cpp [BPI] Add new LazyBPI analysis 2016-07-28 23:31:12 +00:00
LazyBranchProbabilityInfo.cpp [BPI] Add new LazyBPI analysis 2016-07-28 23:31:12 +00:00
LazyCallGraph.cpp [PM] Teach the CGSCC's CG update utility to more carefully invalidate 2016-12-28 10:34:50 +00:00
LazyValueInfo.cpp Make processing @llvm.assume more efficient - Add affected values to the assumption cache 2017-01-11 13:24:24 +00:00
Lint.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
Loads.cpp [Loads] Fix crash in is isDereferenceableAndAlignedPointer() 2016-10-28 15:32:28 +00:00
LoopAccessAnalysis.cpp [PM] Separate the LoopAnalysisManager from the LoopPassManager and move 2017-01-11 09:43:56 +00:00
LoopAnalysisManager.cpp [LoopInfo] Add helper methods to compute two useful orderings of the 2017-01-20 02:41:20 +00:00
LoopInfo.cpp Use getLoopLatch in place of isLoopSimplifyForm 2017-01-15 21:17:52 +00:00
LoopPass.cpp Reverted: Track validity of pass results 2017-01-15 10:23:18 +00:00
LoopUnrollAnalyzer.cpp [LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad 2016-07-23 02:56:49 +00:00
MemDepPrinter.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
MemDerefPrinter.cpp NFC. Move isDereferenceable to Loads.h/cpp 2016-02-24 12:49:04 +00:00
MemoryBuiltins.cpp [Analysis] Ignore `nobuiltin` on `allocsize` function calls. 2016-12-27 06:32:14 +00:00
MemoryDependenceAnalysis.cpp [Devirtualization] MemDep returns non-local !invariant.group dependencies 2017-01-12 11:33:58 +00:00
MemoryLocation.cpp [TLI] Unify LibFunc signature checking. NFCI. 2016-04-27 19:04:35 +00:00
ModuleDebugInfoPrinter.cpp [IR] Remove the DIExpression field from DIGlobalVariable. 2016-12-20 02:09:43 +00:00
ModuleSummaryAnalysis.cpp ThinLTO: add early "dead-stripping" on the Index 2017-01-05 21:34:18 +00:00
ObjCARCAliasAnalysis.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
ObjCARCAnalysisUtils.cpp
ObjCARCInstKind.cpp Create llvm.addressofreturnaddress intrinsic 2016-10-12 22:13:19 +00:00
OptimizationDiagnosticInfo.cpp [PM] Teach the optimization remarks emitter to handle invalidation 2017-01-15 08:20:50 +00:00
OrderedBasicBlock.cpp
PHITransAddr.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
PostDominators.cpp [PM] Introduce an analysis set used to preserve all analyses over 2017-01-15 06:32:49 +00:00
ProfileSummaryInfo.cpp Compute summary before calling extractProfTotalWeight 2017-01-14 00:32:37 +00:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp [PM] Introduce an analysis set used to preserve all analyses over 2017-01-15 06:32:49 +00:00
RegionPass.cpp Reverted: Track validity of pass results 2017-01-15 10:23:18 +00:00
RegionPrinter.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
ScalarEvolution.cpp [SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly. 2017-01-18 23:56:42 +00:00
ScalarEvolutionAliasAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
ScalarEvolutionExpander.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
ScalarEvolutionNormalization.cpp Remove emacs mode markers from .cpp files. NFC 2016-04-24 17:55:41 +00:00
ScopedNoAliasAA.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
SparsePropagation.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
StratifiedSets.h Do a sweep over move ctors and remove those that are identical to the default. 2016-10-20 12:20:28 +00:00
TargetLibraryInfo.cpp [TLI] Appease spurious MSVC warning using llvm_unreachable. NFC. 2017-01-17 19:54:18 +00:00
TargetTransformInfo.cpp [X86] updating TTI costs for arithmetic instructions on X86\SLM arch. 2017-01-11 08:23:37 +00:00
Trace.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
TypeBasedAliasAnalysis.cpp [TBAA] Don't generate invalid TBAA when merging nodes 2016-12-11 20:07:25 +00:00
TypeMetadataUtils.cpp TypeMetadataUtils: Simplify; spotted by Mehdi. 2016-12-21 19:00:47 +00:00
ValueTracking.cpp [ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693) 2017-01-20 22:18:47 +00:00
VectorUtils.cpp IR: Change the gep_type_iterator API to avoid always exposing the "current" type. 2016-12-02 02:24:42 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//