//===- DeadStoreElimination.cpp - Fast Dead Store Elimination -------------===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements a trivial dead store elimination that only considers
// basic-block local redundant stores.
//
// FIXME: This should eventually be extended to be a post-dominator tree
// traversal.  Doing so would be pretty trivial.
//
//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Scalar.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/CaptureTracking.h"
#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/MemoryDependenceAnalysis.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/Pass.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/Utils/Local.h"
using namespace llvm;

#define DEBUG_TYPE "dse"

STATISTIC(NumRedundantStores, "Number of redundant stores deleted");
STATISTIC(NumFastStores, "Number of stores deleted");
STATISTIC(NumFastOther , "Number of other instrs removed");

namespace {
  struct DSE : public FunctionPass {
    AliasAnalysis *AA;
    MemoryDependenceResults *MD;
    DominatorTree *DT;
    const TargetLibraryInfo *TLI;

    static char ID; // Pass identification, replacement for typeid
    DSE() : FunctionPass(ID), AA(nullptr), MD(nullptr), DT(nullptr) {
      initializeDSEPass(*PassRegistry::getPassRegistry());
    }

    bool runOnFunction(Function &F) override {
      if (skipFunction(F))
        return false;

      AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
      MD = &getAnalysis<MemoryDependenceWrapperPass>().getMemDep();
      DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
      TLI = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();

      bool Changed = false;
      for (BasicBlock &I : F)
        // Only check non-dead blocks.  Dead blocks may have strange pointer
        // cycles that will confuse alias analysis.
        if (DT->isReachableFromEntry(&I))
          Changed |= runOnBasicBlock(I);

      AA = nullptr; MD = nullptr; DT = nullptr;
      return Changed;
    }

    bool runOnBasicBlock(BasicBlock &BB);
    bool MemoryIsNotModifiedBetween(Instruction *FirstI, Instruction *SecondI);
    bool HandleFree(CallInst *F);
    bool handleEndBlock(BasicBlock &BB);
    void RemoveAccessedObjects(const MemoryLocation &LoadedLoc,
                               SmallSetVector<Value *, 16> &DeadStackObjects,
                               const DataLayout &DL);

    void getAnalysisUsage(AnalysisUsage &AU) const override {
      AU.setPreservesCFG();
      AU.addRequired<DominatorTreeWrapperPass>();
      AU.addRequired<AAResultsWrapperPass>();
      AU.addRequired<MemoryDependenceWrapperPass>();
      AU.addRequired<TargetLibraryInfoWrapperPass>();
      AU.addPreserved<DominatorTreeWrapperPass>();
      AU.addPreserved<GlobalsAAWrapperPass>();
      AU.addPreserved<MemoryDependenceWrapperPass>();
    }
  };
}

char DSE::ID = 0;
INITIALIZE_PASS_BEGIN(DSE, "dse", "Dead Store Elimination", false, false)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(MemoryDependenceWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_END(DSE, "dse", "Dead Store Elimination", false, false)

FunctionPass *llvm::createDeadStoreEliminationPass() { return new DSE(); }

//===----------------------------------------------------------------------===//
// Helper functions
//===----------------------------------------------------------------------===//

/// DeleteDeadInstruction - Delete this instruction.  Before we do, go through
/// and zero out all the operands of this instruction.  If any of them become
/// dead, delete them and the computation tree that feeds them.
///
/// If ValueSet is non-null, remove any deleted instructions from it as well.
///
static void DeleteDeadInstruction(Instruction *I,
                                  MemoryDependenceResults &MD,
                                  const TargetLibraryInfo &TLI,
                                  SmallSetVector<Value*, 16> *ValueSet = nullptr) {
|
|
|
|
SmallVector<Instruction*, 32> NowDeadInsts;
|
|
|
|
|
|
|
|
NowDeadInsts.push_back(I);
|
|
|
|
--NumFastOther;
|
|
|
|
|
|
|
|
// Before we touch this instruction, remove it from memdep!
|
|
|
|
do {
|
|
|
|
Instruction *DeadInst = NowDeadInsts.pop_back_val();
|
|
|
|
++NumFastOther;
|
|
|
|
|
|
|
|
// This instruction is dead, zap it, in stages. Start by removing it from
|
|
|
|
// MemDep, which needs to know the operands and needs it to be in the
|
|
|
|
// function.
|
|
|
|
MD.removeInstruction(DeadInst);
|
|
|
|
|
|
|
|
for (unsigned op = 0, e = DeadInst->getNumOperands(); op != e; ++op) {
|
|
|
|
Value *Op = DeadInst->getOperand(op);
|
|
|
|
DeadInst->setOperand(op, nullptr);
|
|
|
|
|
|
|
|
// If this operand just became dead, add it to the NowDeadInsts list.
|
|
|
|
if (!Op->use_empty()) continue;
|
|
|
|
|
|
|
|
if (Instruction *OpI = dyn_cast<Instruction>(Op))
|
|
|
|
if (isInstructionTriviallyDead(OpI, &TLI))
|
|
|
|
NowDeadInsts.push_back(OpI);
|
|
|
|
}
|
|
|
|
|
|
|
|
DeadInst->eraseFromParent();
|
|
|
|
|
|
|
|
if (ValueSet) ValueSet->remove(DeadInst);
|
|
|
|
} while (!NowDeadInsts.empty());
|
|
|
|
}
|
|
|
|
|
|
|
|
|
2010-11-30 09:37:52 +08:00
|
|
|
/// hasMemoryWrite - Does this instruction write some memory?  This only returns
/// true for things that we can analyze with other helpers below.
static bool hasMemoryWrite(Instruction *I, const TargetLibraryInfo &TLI) {
  if (isa<StoreInst>(I))
    return true;
  if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
    switch (II->getIntrinsicID()) {
    default:
      return false;
    case Intrinsic::memset:
    case Intrinsic::memmove:
    case Intrinsic::memcpy:
    case Intrinsic::init_trampoline:
    case Intrinsic::lifetime_end:
      return true;
    }
  }
  if (auto CS = CallSite(I)) {
    if (Function *F = CS.getCalledFunction()) {
      if (TLI.has(LibFunc::strcpy) &&
          F->getName() == TLI.getName(LibFunc::strcpy)) {
        return true;
      }
      if (TLI.has(LibFunc::strncpy) &&
          F->getName() == TLI.getName(LibFunc::strncpy)) {
        return true;
      }
      if (TLI.has(LibFunc::strcat) &&
          F->getName() == TLI.getName(LibFunc::strcat)) {
        return true;
      }
      if (TLI.has(LibFunc::strncat) &&
          F->getName() == TLI.getName(LibFunc::strncat)) {
        return true;
      }
    }
  }
  return false;
}
/// getLocForWrite - Return a Location stored to by the specified instruction.
/// If isRemovable returns true, this function and getLocForRead completely
/// describe the memory operations for this instruction.
static MemoryLocation getLocForWrite(Instruction *Inst, AliasAnalysis &AA) {
  if (StoreInst *SI = dyn_cast<StoreInst>(Inst))
    return MemoryLocation::get(SI);

  if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(Inst)) {
    // memcpy/memmove/memset.
    MemoryLocation Loc = MemoryLocation::getForDest(MI);
    return Loc;
  }

  IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst);
  if (!II)
    return MemoryLocation();

  switch (II->getIntrinsicID()) {
  default:
    return MemoryLocation(); // Unhandled intrinsic.
  case Intrinsic::init_trampoline:
    // FIXME: We don't know the size of the trampoline, so we can't really
    // handle it here.
    return MemoryLocation(II->getArgOperand(0));
  case Intrinsic::lifetime_end: {
    uint64_t Len = cast<ConstantInt>(II->getArgOperand(0))->getZExtValue();
    return MemoryLocation(II->getArgOperand(1), Len);
  }
  }
}
/// getLocForRead - Return the location read by the specified "hasMemoryWrite"
/// instruction if any.
static MemoryLocation getLocForRead(Instruction *Inst,
                                    const TargetLibraryInfo &TLI) {
  assert(hasMemoryWrite(Inst, TLI) && "Unknown instruction case");

  // The only instructions that both read and write are the mem transfer
  // instructions (memcpy/memmove).
  if (MemTransferInst *MTI = dyn_cast<MemTransferInst>(Inst))
    return MemoryLocation::getForSource(MTI);
  return MemoryLocation();
}
/// isRemovable - If the value of this instruction and the memory it writes to
/// is unused, may we delete this instruction?
static bool isRemovable(Instruction *I) {
  // Don't remove volatile/atomic stores.
  if (StoreInst *SI = dyn_cast<StoreInst>(I))
    return SI->isUnordered();

  if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
    switch (II->getIntrinsicID()) {
    default: llvm_unreachable("doesn't pass 'hasMemoryWrite' predicate");
    case Intrinsic::lifetime_end:
      // Never remove dead lifetime_end's, e.g. because it is followed by a
      // free.
      return false;
    case Intrinsic::init_trampoline:
      // Always safe to remove init_trampoline.
      return true;

    case Intrinsic::memset:
    case Intrinsic::memmove:
    case Intrinsic::memcpy:
      // Don't remove volatile memory intrinsics.
      return !cast<MemIntrinsic>(II)->isVolatile();
    }
  }

  if (auto CS = CallSite(I))
    return CS.getInstruction()->use_empty();

  return false;
}
/// Returns true if the end of this instruction can be safely shortened in
/// length.
static bool isShortenableAtTheEnd(Instruction *I) {
  // Don't shorten stores for now
  if (isa<StoreInst>(I))
    return false;

  if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
    switch (II->getIntrinsicID()) {
    default: return false;
    case Intrinsic::memset:
    case Intrinsic::memcpy:
      // Do shorten memory intrinsics.
      // FIXME: Add memmove if it's also safe to transform.
      return true;
    }
  }

  // Don't shorten libcalls calls for now.

  return false;
}
/// Returns true if the beginning of this instruction can be safely shortened
/// in length.
static bool isShortenableAtTheBeginning(Instruction *I) {
  // FIXME: Handle only memset for now. Supporting memcpy/memmove should be
  // easily done by offsetting the source address.
  IntrinsicInst *II = dyn_cast<IntrinsicInst>(I);
  return II && II->getIntrinsicID() == Intrinsic::memset;
}
/// getStoredPointerOperand - Return the pointer that is being written to.
static Value *getStoredPointerOperand(Instruction *I) {
  if (StoreInst *SI = dyn_cast<StoreInst>(I))
    return SI->getPointerOperand();
  if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(I))
    return MI->getDest();

  if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
    switch (II->getIntrinsicID()) {
    default: llvm_unreachable("Unexpected intrinsic!");
    case Intrinsic::init_trampoline:
      return II->getArgOperand(0);
    }
  }

  CallSite CS(I);
  // All the supported functions so far happen to have dest as their first
  // argument.
  return CS.getArgument(0);
}
static uint64_t getPointerSize(const Value *V, const DataLayout &DL,
                               const TargetLibraryInfo &TLI) {
  uint64_t Size;
  if (getObjectSize(V, Size, DL, &TLI))
    return Size;
  return MemoryLocation::UnknownSize;
}
namespace {
enum OverwriteResult {
  OverwriteBegin,
  OverwriteComplete,
  OverwriteEnd,
  OverwriteUnknown
};
}
/// Return 'OverwriteComplete' if a store to the 'Later' location completely
/// overwrites a store to the 'Earlier' location, 'OverwriteEnd' if the end of
/// the 'Earlier' location is completely overwritten by 'Later',
/// 'OverwriteBegin' if the beginning of the 'Earlier' location is overwritten
/// by 'Later', or 'OverwriteUnknown' if nothing can be determined.
static OverwriteResult isOverwrite(const MemoryLocation &Later,
                                   const MemoryLocation &Earlier,
                                   const DataLayout &DL,
                                   const TargetLibraryInfo &TLI,
                                   int64_t &EarlierOff, int64_t &LaterOff) {
  const Value *P1 = Earlier.Ptr->stripPointerCasts();
  const Value *P2 = Later.Ptr->stripPointerCasts();

  // If the start pointers are the same, we just have to compare sizes to see if
  // the later store was larger than the earlier store.
  if (P1 == P2) {
    // If we don't know the sizes of either access, then we can't do a
    // comparison.
    if (Later.Size == MemoryLocation::UnknownSize ||
        Earlier.Size == MemoryLocation::UnknownSize)
      return OverwriteUnknown;

    // Make sure that the Later size is >= the Earlier size.
    if (Later.Size >= Earlier.Size)
      return OverwriteComplete;
  }

  // Otherwise, we have to have size information, and the later store has to be
  // larger than the earlier one.
  if (Later.Size == MemoryLocation::UnknownSize ||
      Earlier.Size == MemoryLocation::UnknownSize)
    return OverwriteUnknown;

  // Check to see if the later store is to the entire object (either a global,
  // an alloca, or a byval/inalloca argument).  If so, then it clearly
  // overwrites any other store to the same object.
  const Value *UO1 = GetUnderlyingObject(P1, DL),
              *UO2 = GetUnderlyingObject(P2, DL);

  // If we can't resolve the same pointers to the same object, then we can't
  // analyze them at all.
  if (UO1 != UO2)
    return OverwriteUnknown;

  // If the "Later" store is to a recognizable object, get its size.
  uint64_t ObjectSize = getPointerSize(UO2, DL, TLI);
  if (ObjectSize != MemoryLocation::UnknownSize)
    if (ObjectSize == Later.Size && ObjectSize >= Earlier.Size)
      return OverwriteComplete;

  // Okay, we have stores to two completely different pointers.  Try to
  // decompose the pointer into a "base + constant_offset" form.  If the base
  // pointers are equal, then we can reason about the two stores.
  EarlierOff = 0;
  LaterOff = 0;
  const Value *BP1 = GetPointerBaseWithConstantOffset(P1, EarlierOff, DL);
  const Value *BP2 = GetPointerBaseWithConstantOffset(P2, LaterOff, DL);

  // If the base pointers still differ, we have two completely different stores.
  if (BP1 != BP2)
    return OverwriteUnknown;

  // The later store completely overlaps the earlier store if:
  //
  // 1. Both start at the same offset and the later one's size is greater than
  //    or equal to the earlier one's, or
  //
  //      |--earlier--|
  //      |--   later   --|
  //
  // 2. The earlier store has an offset greater than the later offset, but which
  //    still lies completely within the later store.
  //
  //        |--earlier--|
  //    |-----  later  ------|
  //
  // We have to be careful here as *Off is signed while *.Size is unsigned.
  if (EarlierOff >= LaterOff &&
      Later.Size >= Earlier.Size &&
      uint64_t(EarlierOff - LaterOff) + Earlier.Size <= Later.Size)
    return OverwriteComplete;

  // Another interesting case is if the later store overwrites the end of the
  // earlier store.
  //
  //      |--earlier--|
  //                |--   later   --|
  //
  // In this case we may want to trim the size of earlier to avoid generating
  // writes to addresses which will definitely be overwritten later
  if (LaterOff > EarlierOff &&
      LaterOff < int64_t(EarlierOff + Earlier.Size) &&
      int64_t(LaterOff + Later.Size) >= int64_t(EarlierOff + Earlier.Size))
    return OverwriteEnd;

  // Finally, we also need to check if the later store overwrites the beginning
  // of the earlier store.
  //
  //                |--earlier--|
  //      |--   later   --|
  //
  // In this case we may want to move the destination address and trim the size
  // of earlier to avoid generating writes to addresses which will definitely
  // be overwritten later.
  if (LaterOff <= EarlierOff && int64_t(LaterOff + Later.Size) > EarlierOff) {
    assert(int64_t(LaterOff + Later.Size) < int64_t(EarlierOff + Earlier.Size)
           && "Expect to be handled as OverwriteComplete");
    return OverwriteBegin;
  }
  // Otherwise, they don't completely overlap.
  return OverwriteUnknown;
}
/// isPossibleSelfRead - If 'Inst' might be a self read (i.e. a noop copy of a
/// memory region into an identical pointer) then it doesn't actually make its
/// input dead in the traditional sense.  Consider this case:
///
///   memcpy(A <- B)
///   memcpy(A <- A)
///
/// In this case, the second store to A does not make the first store to A dead.
/// The usual situation isn't an explicit A<-A store like this (which can be
/// trivially removed) but a case where two pointers may alias.
///
/// This function detects when it is unsafe to remove a dependent instruction
/// because the DSE inducing instruction may be a self-read.
static bool isPossibleSelfRead(Instruction *Inst,
                               const MemoryLocation &InstStoreLoc,
                               Instruction *DepWrite,
                               const TargetLibraryInfo &TLI,
                               AliasAnalysis &AA) {
  // Self reads can only happen for instructions that read memory.  Get the
  // location read.
  MemoryLocation InstReadLoc = getLocForRead(Inst, TLI);
  if (!InstReadLoc.Ptr) return false;  // Not a reading instruction.

  // If the read and written loc obviously don't alias, it isn't a read.
  if (AA.isNoAlias(InstReadLoc, InstStoreLoc)) return false;

  // Okay, 'Inst' may copy over itself.  However, we can still remove the
  // DepWrite instruction if we can prove that it reads from the same location
  // as Inst.  This handles useful cases like:
  //   memcpy(A <- B)
  //   memcpy(A <- B)
  // Here we don't know if A/B may alias, but we do know that B/B are must
  // aliases, so removing the first memcpy is safe (assuming it writes <= #
  // bytes as the second one).
  MemoryLocation DepReadLoc = getLocForRead(DepWrite, TLI);

  if (DepReadLoc.Ptr && AA.isMustAlias(InstReadLoc.Ptr, DepReadLoc.Ptr))
    return false;

  // If DepWrite doesn't read memory or if we can't prove it is a must alias,
  // then it can't be considered dead.
  return true;
}
//===----------------------------------------------------------------------===//
// DSE Pass
//===----------------------------------------------------------------------===//
bool DSE::runOnBasicBlock(BasicBlock &BB) {
  const DataLayout &DL = BB.getModule()->getDataLayout();
  bool MadeChange = false;

  // Do a top-down walk on the BB.
  for (BasicBlock::iterator BBI = BB.begin(), BBE = BB.end(); BBI != BBE; ) {
    Instruction *Inst = &*BBI++;

    // Handle 'free' calls specially.
    if (CallInst *F = isFreeCall(Inst, TLI)) {
      MadeChange |= HandleFree(F);
      continue;
    }

    // If we find something that writes memory, get its memory dependence.
    if (!hasMemoryWrite(Inst, *TLI))
      continue;

    // If we're storing the same value back to a pointer that we just
    // loaded from, then the store can be removed.
    if (StoreInst *SI = dyn_cast<StoreInst>(Inst)) {

      auto RemoveDeadInstAndUpdateBBI = [&](Instruction *DeadInst) {
        // DeleteDeadInstruction can delete the current instruction.  Save BBI
        // in case we need it.
        WeakVH NextInst(&*BBI);

        DeleteDeadInstruction(DeadInst, *MD, *TLI);

        if (!NextInst) // Next instruction deleted.
          BBI = BB.begin();
        else if (BBI != BB.begin()) // Revisit this instruction if possible.
          --BBI;
        ++NumRedundantStores;
        MadeChange = true;
      };

      if (LoadInst *DepLoad = dyn_cast<LoadInst>(SI->getValueOperand())) {
        if (SI->getPointerOperand() == DepLoad->getPointerOperand() &&
            isRemovable(SI) &&
            MemoryIsNotModifiedBetween(DepLoad, SI)) {

          DEBUG(dbgs() << "DSE: Remove Store Of Load from same pointer:\n  "
                       << "LOAD: " << *DepLoad << "\n  STORE: " << *SI << '\n');

          RemoveDeadInstAndUpdateBBI(SI);
          continue;
        }
      }

      // Remove null stores into the calloc'ed objects
      Constant *StoredConstant = dyn_cast<Constant>(SI->getValueOperand());

      if (StoredConstant && StoredConstant->isNullValue() &&
          isRemovable(SI)) {
        Instruction *UnderlyingPointer = dyn_cast<Instruction>(
            GetUnderlyingObject(SI->getPointerOperand(), DL));

        if (UnderlyingPointer && isCallocLikeFn(UnderlyingPointer, TLI) &&
            MemoryIsNotModifiedBetween(UnderlyingPointer, SI)) {
          DEBUG(dbgs()
                << "DSE: Remove null store to the calloc'ed object:\n  DEAD: "
                << *Inst << "\n  OBJECT: " << *UnderlyingPointer << '\n');

          RemoveDeadInstAndUpdateBBI(SI);
          continue;
        }
      }
    }

    MemDepResult InstDep = MD->getDependency(Inst);

    // Ignore any store where we can't find a local dependence.
    // FIXME: cross-block DSE would be fun. :)
    if (!InstDep.isDef() && !InstDep.isClobber())
      continue;

    // Figure out what location is being stored to.
    MemoryLocation Loc = getLocForWrite(Inst, *AA);

    // If we didn't get a useful location, fail.
    if (!Loc.Ptr)
      continue;

    while (InstDep.isDef() || InstDep.isClobber()) {
      // Get the memory clobbered by the instruction we depend on.  MemDep will
      // skip any instructions that 'Loc' clearly doesn't interact with.  If we
      // end up depending on a may- or must-aliased load, then we can't optimize
      // away the store and we bail out.  However, if we depend on something
      // that overwrites the memory location we *can* potentially optimize it.
      //
      // Find out what memory location the dependent instruction stores.
      Instruction *DepWrite = InstDep.getInst();
      MemoryLocation DepLoc = getLocForWrite(DepWrite, *AA);
      // If we didn't get a useful location, or if it isn't a size, bail out.
      if (!DepLoc.Ptr)
        break;

      // If we find a write that is a) removable (i.e., non-volatile), b) is
      // completely obliterated by the store to 'Loc', and c) which we know that
      // 'Inst' doesn't load from, then we can remove it.
      if (isRemovable(DepWrite) &&
          !isPossibleSelfRead(Inst, Loc, DepWrite, *TLI, *AA)) {
        int64_t InstWriteOffset, DepWriteOffset;
        OverwriteResult OR =
            isOverwrite(Loc, DepLoc, DL, *TLI, DepWriteOffset, InstWriteOffset);
        if (OR == OverwriteComplete) {
          DEBUG(dbgs() << "DSE: Remove Dead Store:\n  DEAD: "
                       << *DepWrite << "\n  KILLER: " << *Inst << '\n');

          // Delete the store and now-dead instructions that feed it.
          DeleteDeadInstruction(DepWrite, *MD, *TLI);
          ++NumFastStores;
          MadeChange = true;

          // DeleteDeadInstruction can delete the current instruction in loop
          // cases, reset BBI.
          BBI = Inst->getIterator();
          if (BBI != BB.begin())
            --BBI;
          break;
        } else if ((OR == OverwriteEnd && isShortenableAtTheEnd(DepWrite)) ||
                   ((OR == OverwriteBegin &&
                     isShortenableAtTheBeginning(DepWrite)))) {
          // TODO: base this on the target vector size so that if the earlier
          // store was too small to get vector writes anyway then its likely
          // a good idea to shorten it
          // Power of 2 vector writes are probably always a bad idea to optimize
          // as any store/memset/memcpy is likely using vector instructions so
          // shortening it to not vector size is likely to be slower
          MemIntrinsic *DepIntrinsic = cast<MemIntrinsic>(DepWrite);
          unsigned DepWriteAlign = DepIntrinsic->getAlignment();
          bool IsOverwriteEnd = (OR == OverwriteEnd);
          if (!IsOverwriteEnd)
            InstWriteOffset = int64_t(InstWriteOffset + Loc.Size);

          if ((llvm::isPowerOf2_64(InstWriteOffset) &&
               DepWriteAlign <= InstWriteOffset) ||
              ((DepWriteAlign != 0) && InstWriteOffset % DepWriteAlign == 0)) {

            DEBUG(dbgs() << "DSE: Remove Dead Store:\n  OW "
                         << (IsOverwriteEnd ? "END" : "BEGIN") << ": "
                         << *DepWrite << "\n  KILLER (offset "
                         << InstWriteOffset << ", " << DepLoc.Size << ")"
                         << *Inst << '\n');

            int64_t NewLength =
                IsOverwriteEnd
                    ? InstWriteOffset - DepWriteOffset
                    : DepLoc.Size - (InstWriteOffset - DepWriteOffset);

            Value *DepWriteLength = DepIntrinsic->getLength();
            Value *TrimmedLength =
                ConstantInt::get(DepWriteLength->getType(), NewLength);
            DepIntrinsic->setLength(TrimmedLength);

            if (!IsOverwriteEnd) {
              int64_t OffsetMoved = (InstWriteOffset - DepWriteOffset);
              Value *Indices[1] = {
                  ConstantInt::get(DepWriteLength->getType(), OffsetMoved)};
              GetElementPtrInst *NewDestGEP = GetElementPtrInst::CreateInBounds(
                  DepIntrinsic->getRawDest(), Indices, "", DepWrite);
              DepIntrinsic->setDest(NewDestGEP);
            }
            MadeChange = true;
          }
        }
      }

      // If this is a may-aliased store that is clobbering the store value, we
      // can keep searching past it for another must-aliased pointer that stores
      // to the same location.  For example, in:
      //   store -> P
      //   store -> Q
      //   store -> P
      // we can remove the first store to P even though we don't know if P and Q
      // alias.
      if (DepWrite == &BB.front()) break;

      // Can't look past this instruction if it might read 'Loc'.
      if (AA->getModRefInfo(DepWrite, Loc) & MRI_Ref)
        break;

      InstDep = MD->getPointerDependencyFrom(Loc, false,
                                             DepWrite->getIterator(), &BB);
    }
  }

  // If this block ends in a return, unwind, or unreachable, all allocas are
  // dead at its end, which means stores to them are also dead.
  if (BB.getTerminator()->getNumSuccessors() == 0)
    MadeChange |= handleEndBlock(BB);

  return MadeChange;
}
/// Returns true if the memory which is accessed by the second instruction is not
|
|
|
|
/// modified between the first and the second instruction.
/// Precondition: Second instruction must be dominated by the first
/// instruction.
bool DSE::MemoryIsNotModifiedBetween(Instruction *FirstI,
                                     Instruction *SecondI) {
  SmallVector<BasicBlock *, 16> WorkList;
  SmallPtrSet<BasicBlock *, 8> Visited;
  BasicBlock::iterator FirstBBI(FirstI);
  ++FirstBBI;
  BasicBlock::iterator SecondBBI(SecondI);
  BasicBlock *FirstBB = FirstI->getParent();
  BasicBlock *SecondBB = SecondI->getParent();
  MemoryLocation MemLoc = MemoryLocation::get(SecondI);

  // Start checking the store-block.
  WorkList.push_back(SecondBB);
  bool isFirstBlock = true;

  // Check all blocks going backward until we reach the load-block.
  while (!WorkList.empty()) {
    BasicBlock *B = WorkList.pop_back_val();

    // Ignore instructions before FirstI if this is the FirstBB.
    BasicBlock::iterator BI = (B == FirstBB ? FirstBBI : B->begin());

    BasicBlock::iterator EI;
    if (isFirstBlock) {
      // Ignore instructions after SecondI if this is the first visit of
      // SecondBB.
      assert(B == SecondBB && "first block is not the store block");
      EI = SecondBBI;
      isFirstBlock = false;
    } else {
      // It's not SecondBB or (in case of a loop) the second visit of SecondBB.
      // In this case we also have to look at instructions after SecondI.
      EI = B->end();
    }
    for (; BI != EI; ++BI) {
      Instruction *I = &*BI;
      if (I->mayWriteToMemory() && I != SecondI) {
        auto Res = AA->getModRefInfo(I, MemLoc);
        if (Res != MRI_NoModRef)
          return false;
      }
    }
    if (B != FirstBB) {
      assert(B != &FirstBB->getParent()->getEntryBlock() &&
             "Should not hit the entry block because SecondI must be "
             "dominated by FirstI");
      for (auto PredI = pred_begin(B), PE = pred_end(B); PredI != PE; ++PredI) {
        if (!Visited.insert(*PredI).second)
          continue;
        WorkList.push_back(*PredI);
      }
    }
  }
  return true;
}

/// Find all blocks that will unconditionally lead to the block BB and append
/// them to Blocks.
static void FindUnconditionalPreds(SmallVectorImpl<BasicBlock *> &Blocks,
                                   BasicBlock *BB, DominatorTree *DT) {
  for (pred_iterator I = pred_begin(BB), E = pred_end(BB); I != E; ++I) {
    BasicBlock *Pred = *I;
    if (Pred == BB) continue;
    TerminatorInst *PredTI = Pred->getTerminator();
    if (PredTI->getNumSuccessors() != 1)
      continue;

    if (DT->isReachableFromEntry(Pred))
      Blocks.push_back(Pred);
  }
}

/// HandleFree - Handle frees of entire structures whose dependency is a store
/// to a field of that structure.
bool DSE::HandleFree(CallInst *F) {
  bool MadeChange = false;

  MemoryLocation Loc = MemoryLocation(F->getOperand(0));
  SmallVector<BasicBlock *, 16> Blocks;
  Blocks.push_back(F->getParent());
  const DataLayout &DL = F->getModule()->getDataLayout();

  while (!Blocks.empty()) {
    BasicBlock *BB = Blocks.pop_back_val();
    Instruction *InstPt = BB->getTerminator();
    if (BB == F->getParent()) InstPt = F;

    MemDepResult Dep =
        MD->getPointerDependencyFrom(Loc, false, InstPt->getIterator(), BB);
    while (Dep.isDef() || Dep.isClobber()) {
      Instruction *Dependency = Dep.getInst();
      if (!hasMemoryWrite(Dependency, *TLI) || !isRemovable(Dependency))
        break;

      Value *DepPointer =
          GetUnderlyingObject(getStoredPointerOperand(Dependency), DL);

      // Check for aliasing.
      if (!AA->isMustAlias(F->getArgOperand(0), DepPointer))
        break;

      auto Next = ++Dependency->getIterator();

      // DCE instructions only used to calculate that store.
      DeleteDeadInstruction(Dependency, *MD, *TLI);
      ++NumFastStores;
      MadeChange = true;

      // Inst's old Dependency is now deleted. Compute the next dependency,
      // which may also be dead, as in
      //   s[0] = 0;
      //   s[1] = 0; // This has just been deleted.
      //   free(s);
      Dep = MD->getPointerDependencyFrom(Loc, false, Next, BB);
    }

    if (Dep.isNonLocal())
      FindUnconditionalPreds(Blocks, BB, DT);
  }

  return MadeChange;
}

/// handleEndBlock - Remove dead stores to stack-allocated locations in the
/// function end block.  Ex:
/// %A = alloca i32
/// ...
/// store i32 1, i32* %A
/// ret void
bool DSE::handleEndBlock(BasicBlock &BB) {
  bool MadeChange = false;

  // Keep track of all of the stack objects that are dead at the end of the
  // function.
  SmallSetVector<Value*, 16> DeadStackObjects;

  // Find all of the alloca'd pointers in the entry block.
  BasicBlock &Entry = BB.getParent()->front();
  for (Instruction &I : Entry) {
    if (isa<AllocaInst>(&I))
      DeadStackObjects.insert(&I);

    // Okay, so these are dead heap objects, but if the pointer never escapes
    // then it's leaked by this function anyways.
    else if (isAllocLikeFn(&I, TLI) && !PointerMayBeCaptured(&I, true, true))
      DeadStackObjects.insert(&I);
  }

  // Treat byval or inalloca arguments the same, stores to them are dead at the
  // end of the function.
  for (Argument &AI : BB.getParent()->args())
    if (AI.hasByValOrInAllocaAttr())
      DeadStackObjects.insert(&AI);

  const DataLayout &DL = BB.getModule()->getDataLayout();

  // Scan the basic block backwards
  for (BasicBlock::iterator BBI = BB.end(); BBI != BB.begin(); ){
    --BBI;

    // If we find a store, check to see if it points into a dead stack value.
    if (hasMemoryWrite(&*BBI, *TLI) && isRemovable(&*BBI)) {
      // See through pointer-to-pointer bitcasts
      SmallVector<Value *, 4> Pointers;
      GetUnderlyingObjects(getStoredPointerOperand(&*BBI), Pointers, DL);

      // Stores to stack values are valid candidates for removal.
      bool AllDead = true;
      for (SmallVectorImpl<Value *>::iterator I = Pointers.begin(),
           E = Pointers.end(); I != E; ++I)
        if (!DeadStackObjects.count(*I)) {
          AllDead = false;
          break;
        }

      if (AllDead) {
        Instruction *Dead = &*BBI++;

        DEBUG(dbgs() << "DSE: Dead Store at End of Block:\n  DEAD: "
                     << *Dead << "\n  Objects: ";
              for (SmallVectorImpl<Value *>::iterator I = Pointers.begin(),
                   E = Pointers.end(); I != E; ++I) {
                dbgs() << **I;
                if (std::next(I) != E)
                  dbgs() << ", ";
              }
              dbgs() << '\n');

        // DCE instructions only used to calculate that store.
        DeleteDeadInstruction(Dead, *MD, *TLI, &DeadStackObjects);
        ++NumFastStores;
        MadeChange = true;
        continue;
      }
    }

    // Remove any dead non-memory-mutating instructions.
    if (isInstructionTriviallyDead(&*BBI, TLI)) {
      Instruction *Inst = &*BBI++;
      DeleteDeadInstruction(Inst, *MD, *TLI, &DeadStackObjects);
      ++NumFastOther;
      MadeChange = true;
      continue;
    }

    if (isa<AllocaInst>(BBI)) {
      // Remove allocas from the list of dead stack objects; there can't be
      // any references before the definition.
      DeadStackObjects.remove(&*BBI);
      continue;
    }

    if (auto CS = CallSite(&*BBI)) {
      // Remove allocation function calls from the list of dead stack objects;
      // there can't be any references before the definition.
      if (isAllocLikeFn(&*BBI, TLI))
        DeadStackObjects.remove(&*BBI);

      // If this call does not access memory, it can't be loading any of our
      // pointers.
      if (AA->doesNotAccessMemory(CS))
        continue;

      // If the call might load from any of our allocas, then any store above
      // the call is live.
      DeadStackObjects.remove_if([&](Value *I) {
        // See if the call site touches the value.
        ModRefInfo A = AA->getModRefInfo(CS, I, getPointerSize(I, DL, *TLI));

        return A == MRI_ModRef || A == MRI_Ref;
      });

      // If all of the allocas were clobbered by the call then we're not going
      // to find anything else to process.
      if (DeadStackObjects.empty())
        break;

      continue;
    }

    MemoryLocation LoadedLoc;

    // If we encounter a use of the pointer, it is no longer considered dead
    if (LoadInst *L = dyn_cast<LoadInst>(BBI)) {
      if (!L->isUnordered()) // Be conservative with atomic/volatile load
        break;
      LoadedLoc = MemoryLocation::get(L);
    } else if (VAArgInst *V = dyn_cast<VAArgInst>(BBI)) {
      LoadedLoc = MemoryLocation::get(V);
    } else if (MemTransferInst *MTI = dyn_cast<MemTransferInst>(BBI)) {
      LoadedLoc = MemoryLocation::getForSource(MTI);
    } else if (!BBI->mayReadFromMemory()) {
      // Instruction doesn't read memory.  Note that stores that weren't
      // removed above will hit this case.
      continue;
    } else {
      // Unknown inst; assume it clobbers everything.
      break;
    }

    // Remove any allocas from the DeadPointer set that are loaded, as this
    // makes any stores above the access live.
    RemoveAccessedObjects(LoadedLoc, DeadStackObjects, DL);

    // If all of the allocas were clobbered by the access then we're not going
    // to find anything else to process.
    if (DeadStackObjects.empty())
      break;
  }

  return MadeChange;
}

/// RemoveAccessedObjects - Check to see if the specified location may alias
/// any of the stack objects in the DeadStackObjects set.  If so, they become
/// live because the location is being loaded.
void DSE::RemoveAccessedObjects(const MemoryLocation &LoadedLoc,
                                SmallSetVector<Value *, 16> &DeadStackObjects,
                                const DataLayout &DL) {
  const Value *UnderlyingPointer = GetUnderlyingObject(LoadedLoc.Ptr, DL);

  // A constant can't be in the dead pointer set.
  if (isa<Constant>(UnderlyingPointer))
    return;

  // If the kill pointer can be easily reduced to an alloca, don't bother doing
  // extraneous AA queries.
  if (isa<AllocaInst>(UnderlyingPointer) || isa<Argument>(UnderlyingPointer)) {
    DeadStackObjects.remove(const_cast<Value*>(UnderlyingPointer));
    return;
  }

  // Remove objects that could alias LoadedLoc.
  DeadStackObjects.remove_if([&](Value *I) {
    // See if the loaded location could alias the stack location.
    MemoryLocation StackLoc(I, getPointerSize(I, DL, *TLI));
    return !AA->isNoAlias(StackLoc, LoadedLoc);
  });
}