Teach the AllocaPromoter which is wrapped around the SSAUpdater

infrastructure to do promotion without a domtree the same smarts about
looking through GEPs, bitcasts, etc., that I just taught mem2reg about.
This way, if SROA chooses to promote an alloca which still has some
noisy instructions this code can cope with them.

I've not used as principled of an approach here for two reasons:
1) This code doesn't really need it as we were already set up to zip
   through the instructions used by the alloca.
2) I view the code here as more of a hack, and hopefully a temporary one.

The SSAUpdater path in SROA is a real sore point for me. It doesn't make
a lot of architectural sense for many reasons:
- We're likely to end up needing the domtree anyways in a subsequent
  pass, so why not compute it earlier and use it.
- In the future we'll likely end up needing the domtree for parts of the
  inliner itself.
- If we need to we could teach the inliner to preserve the domtree. Part
  of the re-work of the pass manager will allow this to be very powerful
  even in large SCCs with many functions.
- Ultimately, computing a domtree has gotten significantly faster since
  the original SSAUpdater-using code went into ScalarRepl. We no longer
  use domfrontiers, and much of domtree is lazily done based on queries
  rather than eagerly.
- At this point keeping the SSAUpdater-based promotion saves a total of
  0.7% on a build of the 'opt' tool for me. That's not a lot of
  performance given the complexity!

So I'm leaving this a bit ugly in the hope that eventually we just
remove all of this nonsense.

I can't even readily test this because this code isn't reachable except
through SROA. When I re-instate the patch that fast-tracks allocas
already suitable for promotion, I'll add a testcase there that failed
before this change. Before that, SROA will fix any test case I give it.

llvm-svn: 187347
This commit is contained in:
Chandler Carruth 2013-07-29 09:06:53 +00:00
parent 03524b774a
commit cd7c8cdfa1
1 changed files with 51 additions and 15 deletions

View File

@ -738,7 +738,8 @@ public:
: LoadAndStorePromoter(Insts, S), AI(AI), DIB(DIB) {}
void run(const SmallVectorImpl<Instruction*> &Insts) {
// Remember which alloca we're promoting (for isInstInList).
// Retain the debug information attached to the alloca for use when
// rewriting loads and stores.
if (MDNode *DebugNode = MDNode::getIfExists(AI.getContext(), &AI)) {
for (Value::use_iterator UI = DebugNode->use_begin(),
UE = DebugNode->use_end();
@ -750,7 +751,9 @@ public:
}
LoadAndStorePromoter::run(Insts);
AI.eraseFromParent();
// While we have the debug information, clear it off of the alloca. The
// caller takes care of deleting the alloca.
while (!DDIs.empty())
DDIs.pop_back_val()->eraseFromParent();
while (!DVIs.empty())
@ -3360,6 +3363,15 @@ void SROA::deleteDeadInstructions(SmallPtrSet<AllocaInst*, 4> &DeletedAllocas) {
}
}
static void enqueueUsersInWorklist(Instruction &I,
SmallVectorImpl<Use *> &UseWorklist,
SmallPtrSet<Use *, 8> &VisitedUses) {
for (Value::use_iterator UI = I.use_begin(), UE = I.use_end(); UI != UE;
++UI)
if (VisitedUses.insert(&UI.getUse()))
UseWorklist.push_back(&UI.getUse());
}
/// \brief Promote the allocas, using the best available technique.
///
/// This attempts to promote whatever allocas have been identified as viable in
@ -3386,34 +3398,58 @@ bool SROA::promoteAllocas(Function &F) {
DIBuilder DIB(*F.getParent());
SmallVector<Instruction*, 64> Insts;
// We need a worklist to walk the uses of each alloca.
SmallVector<Use *, 8> UseWorklist;
SmallPtrSet<Use *, 8> VisitedUses;
SmallVector<Instruction *, 32> DeadInsts;
for (unsigned Idx = 0, Size = PromotableAllocas.size(); Idx != Size; ++Idx) {
AllocaInst *AI = PromotableAllocas[Idx];
for (Value::use_iterator UI = AI->use_begin(), UE = AI->use_end();
UI != UE;) {
Instruction *I = cast<Instruction>(*UI++);
UseWorklist.clear();
VisitedUses.clear();
enqueueUsersInWorklist(*AI, UseWorklist, VisitedUses);
while (!UseWorklist.empty()) {
Use *U = UseWorklist.pop_back_val();
Instruction &I = *cast<Instruction>(U->getUser());
// FIXME: Currently the SSAUpdater infrastructure doesn't reason about
// lifetime intrinsics and so we strip them (and the bitcasts+GEPs
// leading to them) here. Eventually it should use them to optimize the
// scalar values produced.
if (isa<BitCastInst>(I) || isa<GetElementPtrInst>(I)) {
assert(onlyUsedByLifetimeMarkers(I) &&
"Found a bitcast used outside of a lifetime marker.");
while (!I->use_empty())
cast<Instruction>(*I->use_begin())->eraseFromParent();
I->eraseFromParent();
continue;
}
if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(&I)) {
assert(II->getIntrinsicID() == Intrinsic::lifetime_start ||
II->getIntrinsicID() == Intrinsic::lifetime_end);
II->eraseFromParent();
continue;
}
Insts.push_back(I);
// Push the loads and stores we find onto the list. SROA will already
// have validated that all loads and stores are viable candidates for
// promotion.
if (LoadInst *LI = dyn_cast<LoadInst>(&I)) {
assert(LI->getType() == AI->getAllocatedType());
Insts.push_back(LI);
continue;
}
if (StoreInst *SI = dyn_cast<StoreInst>(&I)) {
assert(SI->getValueOperand()->getType() == AI->getAllocatedType());
Insts.push_back(SI);
continue;
}
// For everything else, we know that only no-op bitcasts and GEPs will
// make it this far, just recurse through them and recall them for later
// removal.
DeadInsts.push_back(&I);
enqueueUsersInWorklist(I, UseWorklist, VisitedUses);
}
AllocaPromoter(Insts, SSA, *AI, DIB).run(Insts);
Insts.clear();
while (!DeadInsts.empty())
DeadInsts.pop_back_val()->eraseFromParent();
AI->eraseFromParent();
}
PromotableAllocas.clear();