llvm-project

Sanjay Patel 0c1c70aef4 [ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693)

By enhancing value tracking, we allow an existing min/max canonicalization to
kick in and improve codegen for several targets that have min/max instructions.
Unfortunately, recognizing min/max in value tracking may cause us to hit
a hack in InstCombiner::visitICmpInst() more often:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109340.html
...but I'm hoping we can remove that soon.

Correctness proofs based on Alive:

Name: smaxmin
Pre: C1 < C2
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp slt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp slt i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp sgt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: sminmax
Pre: C1 > C2
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp sgt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp sgt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp slt i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

----------------------------------------
Optimization: smaxmin
Done: 1
Optimization is correct!

----------------------------------------
Optimization: sminmax
Done: 1
Optimization is correct!

Name: umaxmin
Pre: C1 u< C2
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ult i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %min
=>
%cmp2 = icmp ult i8 %x, C2
%min = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ugt i8 %min, C1
%r = select i1 %cmp1, i8 %min, i8 C1

Name: uminmax
Pre: C1 u> C2
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp3 = icmp ugt i8 %x, C1
%r = select i1 %cmp3, i8 C1, i8 %max
=>
%cmp2 = icmp ugt i8 %x, C2
%max = select i1 %cmp2, i8 %x, i8 C2
%cmp1 = icmp ult i8 %max, C1
%r = select i1 %cmp1, i8 %max, i8 C1

----------------------------------------
Optimization: umaxmin
Done: 1
Optimization is correct!

----------------------------------------
Optimization: uminmax
Done: 1
Optimization is correct!

llvm-svn: 292660
2017-01-20 22:18:47 +00:00
AliasAnalysis.cpp	[AliasAnalysis] Fences do not modify constant memory location	2017-01-20 00:21:33 +00:00
AliasAnalysisEvaluator.cpp	Consistently use FunctionAnalysisManager	2016-08-09 00:28:15 +00:00
AliasAnalysisSummary.cpp	Update a comment.	2016-08-25 01:29:55 +00:00
AliasAnalysisSummary.h	Make some LLVM_CONSTEXPR variables const. NFC.	2016-08-25 01:05:08 +00:00
AliasSetTracker.cpp	[AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory.	2016-11-07 14:11:45 +00:00
Analysis.cpp	[LCSSA] Perform LCSSA verification only for the current loop nest.	2016-10-28 12:57:20 +00:00
AssumptionCache.cpp	[ValueTracking] recognize a 'not' of an assumed condition as false	2017-01-17 18:15:49 +00:00
BasicAliasAnalysis.cpp	[PM] Remove a pointless optimization.	2016-12-27 18:04:11 +00:00
BlockFrequencyInfo.cpp	Add an interface to scale the frequencies of a set of blocks.	2017-01-19 18:53:16 +00:00
BlockFrequencyInfoImpl.cpp	[GraphTraits] Replace all NodeType usage with NodeRef	2016-08-22 21:09:30 +00:00
BranchProbabilityInfo.cpp	Retry: [BPI] Use a safer constructor to calculate branch probabilities	2016-12-17 01:02:08 +00:00
CFG.cpp	…
CFGPrinter.cpp	[PM] Port CFGViewer and CFGPrinter to the new Pass Manager	2016-09-15 18:35:27 +00:00
CFLAndersAliasAnalysis.cpp	Apply clang-tidy's performance-unnecessary-value-param to LLVM.	2017-01-13 14:39:03 +00:00
CFLGraph.h	[CFLAA] Check for pointer types in more places.	2016-07-29 01:23:45 +00:00
CFLSteensAliasAnalysis.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
CGSCCPassManager.cpp	[PM] Teach the CGSCC's CG update utility to more carefully invalidate	2016-12-28 10:34:50 +00:00
CMakeLists.txt	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move	2017-01-11 09:43:56 +00:00
CallGraph.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
CallGraphSCCPass.cpp	Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed	2017-01-18 21:37:11 +00:00
CallPrinter.cpp	[CG] Rename the DOT printing pass to actually reference "DOT".	2016-03-10 11:04:40 +00:00
CaptureTracking.cpp	[CaptureTracking] Volatile operations capture their memory location	2016-05-26 17:36:22 +00:00
CodeMetrics.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
ConstantFolding.cpp	[InstCombiner] Simplify lib calls to `round{,f}`	2016-12-26 14:29:29 +00:00
CostModel.cpp	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.	2017-01-11 08:23:37 +00:00
Delinearization.cpp	[NFC] Header cleanup	2016-04-18 09:17:29 +00:00
DemandedBits.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
DependenceAnalysis.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
DivergenceAnalysis.cpp	DivergenceAnalysis: Fix crash with no return blocks	2016-05-09 16:57:08 +00:00
DomPrinter.cpp	Introduce analysis pass to compute PostDominators in the new pass manager. NFC	2016-02-25 17:54:07 +00:00
DominanceFrontier.cpp	[PM] Introduce an analysis set used to preserve all analyses over	2017-01-15 06:32:49 +00:00
EHPersonalities.cpp	[tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part	2016-11-14 21:41:13 +00:00
GlobalsModRef.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
IVUsers.cpp	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move	2017-01-11 09:43:56 +00:00
IndirectCallPromotionAnalysis.cpp	Remove another unused variable from r275216	2016-07-12 23:49:17 +00:00
InlineCost.cpp	Recommit "[InlineCost] Use TTI to check if GEP is free." #3	2017-01-20 18:51:22 +00:00
InstCount.cpp	…
InstructionSimplify.cpp	Removing potentially error-prone fallthrough. NFC	2017-01-14 07:28:47 +00:00
Interval.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
IntervalPartition.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
IteratedDominanceFrontier.cpp	Normalize file docs. NFC.	2016-07-21 20:52:35 +00:00
LLVMBuild.txt	Restore "[ThinLTO] Prevent exporting of locals used/defined in module level asm"	2016-11-14 17:12:32 +00:00
LazyBlockFrequencyInfo.cpp	[BPI] Add new LazyBPI analysis	2016-07-28 23:31:12 +00:00
LazyBranchProbabilityInfo.cpp	[BPI] Add new LazyBPI analysis	2016-07-28 23:31:12 +00:00
LazyCallGraph.cpp	[PM] Teach the CGSCC's CG update utility to more carefully invalidate	2016-12-28 10:34:50 +00:00
LazyValueInfo.cpp	Make processing @llvm.assume more efficient - Add affected values to the assumption cache	2017-01-11 13:24:24 +00:00
Lint.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
Loads.cpp	[Loads] Fix crash in is isDereferenceableAndAlignedPointer()	2016-10-28 15:32:28 +00:00
LoopAccessAnalysis.cpp	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move	2017-01-11 09:43:56 +00:00
LoopAnalysisManager.cpp	[LoopInfo] Add helper methods to compute two useful orderings of the	2017-01-20 02:41:20 +00:00
LoopInfo.cpp	Use getLoopLatch in place of isLoopSimplifyForm	2017-01-15 21:17:52 +00:00
LoopPass.cpp	Reverted: Track validity of pass results	2017-01-15 10:23:18 +00:00
LoopUnrollAnalyzer.cpp	[LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad	2016-07-23 02:56:49 +00:00
MemDepPrinter.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
MemDerefPrinter.cpp	NFC. Move isDereferenceable to Loads.h/cpp	2016-02-24 12:49:04 +00:00
MemoryBuiltins.cpp	[Analysis] Ignore `nobuiltin` on `allocsize` function calls.	2016-12-27 06:32:14 +00:00
MemoryDependenceAnalysis.cpp	[Devirtualization] MemDep returns non-local !invariant.group dependencies	2017-01-12 11:33:58 +00:00
MemoryLocation.cpp	[TLI] Unify LibFunc signature checking. NFCI.	2016-04-27 19:04:35 +00:00
ModuleDebugInfoPrinter.cpp	[IR] Remove the DIExpression field from DIGlobalVariable.	2016-12-20 02:09:43 +00:00
ModuleSummaryAnalysis.cpp	ThinLTO: add early "dead-stripping" on the Index	2017-01-05 21:34:18 +00:00
ObjCARCAliasAnalysis.cpp	Consistently use FunctionAnalysisManager	2016-08-09 00:28:15 +00:00
ObjCARCAnalysisUtils.cpp	…
ObjCARCInstKind.cpp	Create llvm.addressofreturnaddress intrinsic	2016-10-12 22:13:19 +00:00
OptimizationDiagnosticInfo.cpp	[PM] Teach the optimization remarks emitter to handle invalidation	2017-01-15 08:20:50 +00:00
OrderedBasicBlock.cpp	…
PHITransAddr.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
PostDominators.cpp	[PM] Introduce an analysis set used to preserve all analyses over	2017-01-15 06:32:49 +00:00
ProfileSummaryInfo.cpp	Compute summary before calling extractProfTotalWeight	2017-01-14 00:32:37 +00:00
PtrUseVisitor.cpp	…
README.txt	…
RegionInfo.cpp	[PM] Introduce an analysis set used to preserve all analyses over	2017-01-15 06:32:49 +00:00
RegionPass.cpp	Reverted: Track validity of pass results	2017-01-15 10:23:18 +00:00
RegionPrinter.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
ScalarEvolution.cpp	[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.	2017-01-18 23:56:42 +00:00
ScalarEvolutionAliasAnalysis.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
ScalarEvolutionExpander.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
ScalarEvolutionNormalization.cpp	Remove emacs mode markers from .cpp files. NFC	2016-04-24 17:55:41 +00:00
ScopedNoAliasAA.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
SparsePropagation.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
StratifiedSets.h	Do a sweep over move ctors and remove those that are identical to the default.	2016-10-20 12:20:28 +00:00
TargetLibraryInfo.cpp	[TLI] Appease spurious MSVC warning using llvm_unreachable. NFC.	2017-01-17 19:54:18 +00:00
TargetTransformInfo.cpp	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.	2017-01-11 08:23:37 +00:00
Trace.cpp	…
TypeBasedAliasAnalysis.cpp	[TBAA] Don't generate invalid TBAA when merging nodes	2016-12-11 20:07:25 +00:00
TypeMetadataUtils.cpp	TypeMetadataUtils: Simplify; spotted by Mehdi.	2016-12-21 19:00:47 +00:00
ValueTracking.cpp	[ValueTracking] recognize variations of 'clamp' to improve codegen (PR31693)	2017-01-20 22:18:47 +00:00
VectorUtils.cpp	IR: Change the gep_type_iterator API to avoid always exposing the "current" type.	2016-12-02 02:24:42 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
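
As a sanity check on the claim above, the following Python sketch evaluates
both forms with 64-bit wraparound. It assumes the usual binomial-coefficient
reading of an add recurrence, where {a,+,b,+,c} at iteration i is
a + b*C(i,1) + c*C(i,2), and that the exit value is taken at iteration
%n - 1, which is consistent with the printed expression. The helper names
are made up for illustration and are not part of LLVM.

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def chrec_at(coeffs, i):
    """Evaluate the add recurrence {c0,+,c1,+,c2,...} at iteration i
    using unbounded integers (no wraparound)."""
    return sum(c * comb(i, k) for k, c in enumerate(coeffs))


def expanded_exit_value(n):
    """Mirror the expression ScalarEvolution prints, term by term,
    for a 64-bit unsigned value n."""
    a = (n - 2) & MASK64                    # (-2 + %n) as i64, then zext to i65
    b = (n - 1) & MASK64                    # (-1 + %n) as i64, then zext to i65
    product = (a * b) & MASK65              # multiply in i65
    half = (product // 2) & MASK64          # /u 2, then trunc to i64
    return (-2 + 2 * half + 3 * n) & MASK64


def simple_exit_value(n):
    """The proposed simpler form, (%n * %n), in i64."""
    return (n * n) & MASK64
```

Under these assumptions, chrec_at([1, 3, 2], n - 1) equals n * n for n >= 1,
and expanded_exit_value(n) agrees with simple_exit_value(n) for every 64-bit
n. The extra bit in i65 is what keeps the low 64 bits of the halved product
exact.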

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
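
The fold is sound because truncation commutes with negation modulo 2^32, so
the first two terms cancel. A small Python sketch of that cancellation
follows. It models undef as an arbitrary caller-chosen value u, an assumption
made only so the expression can be evaluated, and the function names are
hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc32(v):
    """trunc i64 ... to i32: keep the low 32 bits."""
    return v & MASK32


def unfolded(arg5, u):
    """The expression ScalarEvolution currently forms, in i32 arithmetic."""
    neg = (-arg5) & MASK64                  # (-1 * %arg5) as i64
    return (trunc32(neg) + trunc32(arg5) + (-1 * trunc32(u))) & MASK32


def folded(u):
    """The proposed simpler form: (-1 * (trunc i64 undef to i32))."""
    return (-1 * trunc32(u)) & MASK32
```

For any 64-bit arg5 and any stand-in value u, unfolded(arg5, u) and folded(u)
produce the same 32-bit result.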

//===---------------------------------------------------------------------===//