forked from OSchip/llvm-project
Fix subrange liveness checking at rematerialization
This patch fixes an issue where an instruction reading a whole register would be moved during register allocation into a spot where one of the subregisters was dead. The code to check whether an instruction can be rematerialized at a given point or not was already checking for subranges to ensure that subregisters are live, but only when the instruction being moved was using a subregister, this patch changes that so the subranges are checked even when the moved instruction uses the full register. This patch also adds a case to the original test for the subrange checking that trigger the issue described above. The original subrange checking code was introduced in this revision: https://reviews.llvm.org/D115278 And I've encountered this issue on AMDGPUs while working with DPC++: https://github.com/intel/llvm/issues/6209 Essentially the greedy register allocator attempts to move the following instruction: ``` %3961:vreg_64 = V_LSHLREV_B64_e64 3, %3078:vreg_64, implicit $exec ``` From `@3440` into the body of a loop `@16312`, but `%3078` has the following live ranges: ``` %3078 [2224r,2240r:0)[2240r,3488B:1)[16192B,38336B:1) 0@2224r 1@2240r L0000000000000003 [2224r,3440r:0) 0@2224r L000000000000000C [2240r,3488B:0)[16192B,38336B:0) 0@2240r ``` So `@16312e` `%3078.sub1` is alive but `%3078.sub0` is dead, so this instruction being moved there leads to invalid memory accesses as `3078.sub0` ends up being trashed and the result of this instruction is used as part of an address calculation for a load. On the original ticket this issue showed up on gfx906 and gfx90a but not on gfx908, this turned out to be because on gfx908 instead of moving the shift instruction into the loop, its value is spilled into an ACC register, gfx906 doesn't have ACC registers and for gfx90a ACC registers are used like regular vector registers and so aren't used for spilling. With this patch the original application from the DPC++ ticket works properly on gfx906, and the result of the shift instruction is correctly spilled instead of moving the instruction in the loop. Original Author: npmiller Reviewed by: rampitec Submitted by: rampitec Differential Revision: https://reviews.llvm.org/D131884
This commit is contained in:
parent
fe7450b34d
commit
ccfabfbb1f
|
@ -134,9 +134,11 @@ bool LiveRangeEdit::allUsesAvailableAt(const MachineInstr *OrigMI,
|
|||
return false;
|
||||
|
||||
// Check that subrange is live at UseIdx.
|
||||
if (MO.getSubReg()) {
|
||||
if (li.hasSubRanges()) {
|
||||
const TargetRegisterInfo *TRI = MRI.getTargetRegisterInfo();
|
||||
LaneBitmask LM = TRI->getSubRegIndexLaneMask(MO.getSubReg());
|
||||
unsigned SubReg = MO.getSubReg();
|
||||
LaneBitmask LM = SubReg ? TRI->getSubRegIndexLaneMask(SubReg)
|
||||
: MRI.getMaxLaneMaskForVReg(MO.getReg());
|
||||
for (LiveInterval::SubRange &SR : li.subranges()) {
|
||||
if ((SR.LaneMask & LM).none())
|
||||
continue;
|
||||
|
|
|
@ -79,3 +79,28 @@ body: |
|
|||
%6:vreg_64 = V_MOV_B64_PSEUDO %2, implicit $exec
|
||||
S_NOP 0, implicit %1.sub0, implicit %1.sub3
|
||||
...
|
||||
---
|
||||
name: dead_subreg_whole_reg
|
||||
tracksRegLiveness: true
|
||||
body: |
|
||||
bb.0.entry:
|
||||
; GCN-LABEL: name: dead_subreg_whole_reg
|
||||
; GCN: $m0 = IMPLICIT_DEF
|
||||
; GCN-NEXT: renamable $sgpr0_sgpr1 = S_MOV_B64 1, implicit $m0
|
||||
; GCN-NEXT: renamable $sgpr2_sgpr3 = S_MOV_B64 renamable $sgpr0_sgpr1
|
||||
; GCN-NEXT: SI_SPILL_S64_SAVE killed renamable $sgpr2_sgpr3, %stack.0, implicit $exec, implicit $sp_reg :: (store (s64) into %stack.0, align 4, addrspace 5)
|
||||
; GCN-NEXT: renamable $sgpr4_sgpr5 = S_MOV_B64 2, implicit $m0
|
||||
; GCN-NEXT: renamable $sgpr2_sgpr3 = S_MOV_B64 3, implicit $m0
|
||||
; GCN-NEXT: dead %4:vgpr_32 = V_MOV_B32_e32 $sgpr0, implicit $exec, implicit killed $sgpr4_sgpr5, implicit killed $sgpr2_sgpr3
|
||||
; GCN-NEXT: renamable $sgpr2_sgpr3 = SI_SPILL_S64_RESTORE %stack.0, implicit $exec, implicit $sp_reg :: (load (s64) from %stack.0, align 4, addrspace 5)
|
||||
; GCN-NEXT: dead %5:vreg_64 = V_MOV_B64_PSEUDO killed $sgpr2_sgpr3, implicit $exec
|
||||
; GCN-NEXT: S_NOP 0, implicit killed renamable $sgpr0
|
||||
$m0 = IMPLICIT_DEF
|
||||
%0:sreg_64_xexec = S_MOV_B64 1, implicit $m0
|
||||
%1:sreg_64 = S_MOV_B64 %0:sreg_64_xexec
|
||||
%2:sreg_64 = S_MOV_B64 2, implicit $m0
|
||||
%3:sreg_64 = S_MOV_B64 3, implicit $m0
|
||||
%4:vgpr_32 = V_MOV_B32_e32 %0.sub0:sreg_64_xexec, implicit $exec, implicit %2, implicit %3
|
||||
%5:vreg_64 = V_MOV_B64_PSEUDO %1, implicit $exec
|
||||
S_NOP 0, implicit %0.sub0
|
||||
...
|
||||
|
|
Loading…
Reference in New Issue