[AMDGPU] Combine DPP mov even if old reg def is in different BB

Given a DPP mov like this:

  %2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
  ...
  %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec

this patch removes the check that %2 (the "old reg") is defined in the
same BB as the DPP mov instruction. GCNDPPCombine requires the MIR to be
in SSA form, so the definition of %2 dominates its use wherever it
lives, and I don't see why the BB matters.

This lets the optimization work in more real-world cases, for example
when the definition of %2 gets hoisted out of a loop.
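
As a sketch of the cross-BB case (the block layout and the
V_ADD_U32_e32 user are assumptions for illustration, not copied from
the test; %0 and %1 are assumed to be defined earlier in bb.0):

  bb.0:
    %2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
    S_BRANCH %bb.1

  bb.1:
    %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec
    %4:vgpr_32 = V_ADD_U32_e32 %3, %0, implicit $exec

With the check removed, the old_in_diff_bb test expects the mov to be
folded into its user:

  %4:vgpr_32 = V_ADD_U32_dpp %0, %1, %0, 1, 1, 1, 0, implicit $exec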

Differential Revision: https://reviews.llvm.org/D124182
Jay Foad 2022-04-21 12:01:59 +01:00
parent 6f095babc2
commit ba6c8d42d4
2 changed files with 1 addition and 8 deletions

@@ -452,12 +452,6 @@ bool GCNDPPCombine::combineDPPMov(MachineInstr &MovMI) const {
       return false;
     }
-    if (OldOpndValue->getParent()->getParent() != MovMI.getParent()) {
-      LLVM_DEBUG(dbgs() <<
-        " failed: old reg def and mov should be in the same BB\n");
-      return false;
-    }
     if (OldOpndValue->getImm() == 0) {
       if (MaskAllLanes) {
         assert(!BoundCtrlZero); // by check [1]

@@ -434,9 +434,8 @@ body: |
     SI_END_CF %8, implicit-def dead $exec, implicit-def dead $scc, implicit $exec
 ...
-# old reg def is in diff BB - cannot combine
 # GCN-LABEL: name: old_in_diff_bb
-# GCN: %3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec
+# GCN: %4:vgpr_32 = V_ADD_U32_dpp %0, %1, %0, 1, 1, 1, 0, implicit $exec
 name: old_in_diff_bb
 tracksRegLiveness: true