llvm-project/llvm/lib/Analysis
Philip Reames b5681138e4 Allow value forwarding past release fences in GVN
A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-load forwarding across a release fence.  

We choose to be much more conservative about stores.  In theory, nothing prevents us from shifting a store from after a release fence to before it, and then eliminating the preceeding (previously fenced) store.  Doing this without actually moving the second store is likely also legal, but we chose to be conservative at this time.

The LangRef indicates only atomic loads and stores are effected by fences. This patch chooses to be far more conservative then that. 

This is the GVN companion to http://reviews.llvm.org/D11434 which applied the same logic in EarlyCSE and has been baking in tree for a while now.

Differential Revision: http://reviews.llvm.org/D11436

llvm-svn: 264472
2016-03-25 22:40:35 +00:00
..
AliasAnalysis.cpp [PM] Implement the final conclusion as to how the analysis IDs should 2016-03-11 10:22:49 +00:00
AliasAnalysisEvaluator.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
AliasSetTracker.cpp [AliasSetTracker] Do not strip pointer casts when processing MemSetInst 2016-03-14 18:34:29 +00:00
Analysis.cpp [CG] Actually hoist up the generic CallGraphPrinter pass from a weird 2016-03-10 11:08:44 +00:00
AssumptionCache.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
BasicAliasAnalysis.cpp [AA] Make BasicAA just require domtree. 2016-03-11 13:53:18 +00:00
BlockFrequencyInfo.cpp Add getBlockProfileCount method to BlockFrequencyInfo 2016-03-23 18:18:26 +00:00
BlockFrequencyInfoImpl.cpp Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes. 2016-02-02 18:20:45 +00:00
BranchProbabilityInfo.cpp [BPI] Fix two potential divide-by-zero operations that are introduced in r256263. 2015-12-22 23:45:55 +00:00
CFG.cpp Avoid overly large SmallPtrSet/SmallSet 2016-01-30 01:24:31 +00:00
CFGPrinter.cpp Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) 2015-06-23 09:49:53 +00:00
CFLAliasAnalysis.cpp Rename DenseMap::resize() into DenseMap::reserve() (NFC) 2016-03-22 07:20:00 +00:00
CGSCCPassManager.cpp [PM] Implement the final conclusion as to how the analysis IDs should 2016-03-11 10:22:49 +00:00
CMakeLists.txt PM: Implement a basic loop pass manager 2016-02-25 07:23:08 +00:00
CallGraph.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
CallGraphSCCPass.cpp Recommit r256952 "Filtering IR printing for print-after-all/print-before-all" 2016-01-06 22:55:03 +00:00
CallPrinter.cpp [CG] Rename the DOT printing pass to actually reference "DOT". 2016-03-10 11:04:40 +00:00
CaptureTracking.cpp [CaptureTracking] Support atomicrmw and cmpxchg 2016-02-18 19:23:27 +00:00
CodeMetrics.cpp use range-based for loop; NFCI 2016-03-08 20:53:48 +00:00
ConstantFolding.cpp Implement constant folding for bitreverse 2016-03-21 15:00:35 +00:00
CostModel.cpp Implemented cost model for masked gather and scatter operations 2015-12-28 20:10:59 +00:00
Delinearization.cpp SCEV: Allow simple AddRec * Parameter products in delinearization 2015-10-12 08:02:00 +00:00
DemandedBits.cpp [DemandedBits] Revert r249687 due to PR26071 2016-02-03 15:05:06 +00:00
DependenceAnalysis.cpp [SCEV] Add and use SCEVConstant::getAPInt; NFCI 2015-12-17 20:28:46 +00:00
DivergenceAnalysis.cpp Introduce analysis pass to compute PostDominators in the new pass manager. NFC 2016-02-25 17:54:07 +00:00
DomPrinter.cpp Introduce analysis pass to compute PostDominators in the new pass manager. NFC 2016-02-25 17:54:07 +00:00
DominanceFrontier.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
EHPersonalities.cpp Add Rust's personality function to the list of known personality functions 2016-03-15 20:35:45 +00:00
GlobalsModRef.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
IVUsers.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
InlineCost.cpp Revert revisions 262636, 262643, 262679, and 262682. 2016-03-08 00:36:35 +00:00
InstCount.cpp Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) 2015-06-23 09:49:53 +00:00
InstructionSimplify.cpp [InstSimplify] Restore fsub 0.0, (fsub 0.0, X) ==> X optzn 2016-02-29 12:18:25 +00:00
Interval.cpp
IntervalPartition.cpp
IteratedDominanceFrontier.cpp Move IDF Calculation to a separate file, expose an interface to it. 2015-04-21 19:13:02 +00:00
LLVMBuild.txt [PM/AA] Remove the last relics of the separate IPA library from LLVM, 2015-08-18 17:51:53 +00:00
LazyCallGraph.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
LazyValueInfo.cpp [LVI] Fix a bug which prevented use of !range metadata within a query 2016-03-04 22:27:39 +00:00
Lint.cpp [opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer. 2016-01-22 01:51:51 +00:00
Loads.cpp NFC. Move isDereferenceable to Loads.h/cpp 2016-02-24 12:49:04 +00:00
LoopAccessAnalysis.cpp [LAA] Formatting fix in previous change 2016-03-24 05:15:24 +00:00
LoopInfo.cpp IR: Reserve an MDKind for !llvm.loop; NFC 2016-03-25 00:35:38 +00:00
LoopPass.cpp LoopInfo: Simplify ownership of Loop objects 2016-01-08 19:08:53 +00:00
LoopPassManager.cpp [PM] Implement the final conclusion as to how the analysis IDs should 2016-03-11 10:22:49 +00:00
LoopUnrollAnalyzer.cpp [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating. 2016-02-26 02:57:05 +00:00
MemDepPrinter.cpp [PM] Port memdep to the new pass manager. 2016-03-10 00:55:30 +00:00
MemDerefPrinter.cpp NFC. Move isDereferenceable to Loads.h/cpp 2016-02-24 12:49:04 +00:00
MemoryBuiltins.cpp [MemoryBuiltins] Fix an issue with hasNoAliasAttr 2016-02-09 21:54:18 +00:00
MemoryDependenceAnalysis.cpp Allow value forwarding past release fences in GVN 2016-03-25 22:40:35 +00:00
MemoryLocation.cpp [PM/AA] Split the location computation out of getArgLocation so the 2015-06-17 07:12:40 +00:00
ModuleDebugInfoPrinter.cpp Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) 2015-06-23 09:49:53 +00:00
ObjCARCAliasAnalysis.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
ObjCARCAnalysisUtils.cpp [ARC] Pull the ObjC ARC components that really serve the role of 2015-08-20 08:06:03 +00:00
ObjCARCInstKind.cpp Add support for objc_unsafeClaimAutoreleasedReturnValue to the 2016-01-27 19:05:08 +00:00
OrderedBasicBlock.cpp [CaptureTracker] Provide an ordered basic block to PointerMayBeCapturedBefore 2015-07-31 14:31:35 +00:00
PHITransAddr.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
PostDominators.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
PtrUseVisitor.cpp Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> 2014-11-19 07:49:26 +00:00
README.txt
RegionInfo.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
RegionPass.cpp Change range-based for-loops to be -Wrange-loop-analysis clean. 2015-04-15 01:21:15 +00:00
RegionPrinter.cpp [RegionInfo] Add debug-time region viewer functions 2015-08-10 13:21:59 +00:00
ScalarEvolution.cpp [SCEV] Change the SCEV Predicates interfaces for conversion to AddRecExpr to return SCEVAddRecExpr* instead of SCEV* 2016-03-23 15:29:30 +00:00
ScalarEvolutionAliasAnalysis.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
ScalarEvolutionExpander.cpp [IndVars] Fix PR26974: make sure replaceCongruentIVs doesn't break LCSSA 2016-03-21 12:44:29 +00:00
ScalarEvolutionNormalization.cpp Analysis: Remove implicit ilist iterator conversions 2015-10-10 00:53:03 +00:00
ScopedNoAliasAA.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
SparsePropagation.cpp Analysis: Remove implicit ilist iterator conversions 2015-10-10 00:53:03 +00:00
StratifiedSets.h Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) 2015-06-23 09:49:53 +00:00
TargetLibraryInfo.cpp [PM] Implement the final conclusion as to how the analysis IDs should 2016-03-11 10:22:49 +00:00
TargetTransformInfo.cpp [LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead 2016-03-18 00:27:43 +00:00
Trace.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
TypeBasedAliasAnalysis.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
ValueTracking.cpp [ValueTracking] Extract isKnownPositive [NFCI] 2016-03-09 21:31:47 +00:00
VectorUtils.cpp [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. 2016-01-19 17:28:00 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//