[MachineOutliner][AArch64] WA for multiple stack fixup cases in MachineOutliner.

In cases where MachineOutliner candidates either are:

  * noreturn
  * have calls with no available LR or free regs
  * Don't use SP

we can end up hitting stack fixup code for the caller and the callee for
a FrameID of MachineOutlinerDefault. This triggers the assert:

  `assert(OF.FrameConstructionID != MachineOutlinerDefault &&
          "Can only fix up stack references once");`

in AArch64InstrInfo.cpp. This assert exists for now because a lot of the
fixup code is not tested to handle fixing up more than once and needs
some better checks and enhancements to avoid potentially generating
illegal code.

I've filed a Bugzilla report to track this until these cases are handled
by the AArch64 MachineOutliner: https://bugs.llvm.org/show_bug.cgi?id=46767

This diff detects cases that will cause these multiple stack fixups and
prune the Candidates from `RepeatedSequenceLocs`.

    Differential Revision: https://reviews.llvm.org/D83923
This commit is contained in:
Puyan Lotfi 2020-08-10 14:45:38 -04:00
parent 4cd8e9b169
commit 7bc03f5553
4 changed files with 266 additions and 0 deletions

View File

@ -6193,6 +6193,60 @@ outliner::OutlinedFunction AArch64InstrInfo::getOutliningCandidateInfo(
FrameID = MachineOutlinerNoLRSave;
} else {
SetCandidateCallInfo(MachineOutlinerDefault, 12);
// Bugzilla ID: 46767
// TODO: Check if fixing up the stack more than once is safe so we can
// outline these.
//
// An outline resulting in a caller that requires stack fixups at the
// callsite to a callee that also requires stack fixups can happen when
// there are no available registers at the candidate callsite for a
// candidate that itself also has calls.
//
// In other words if function_containing_sequence in the following pseudo
// assembly requires that we save LR at the point of the call, but there
// are no available registers: in this case we save using SP and as a
// result the SP offsets requires stack fixups by multiples of 16.
//
// function_containing_sequence:
// ...
// save LR to SP <- Requires stack instr fixups in OUTLINED_FUNCTION_N
// call OUTLINED_FUNCTION_N
// restore LR from SP
// ...
//
// OUTLINED_FUNCTION_N:
// save LR to SP <- Requires stack instr fixups in OUTLINED_FUNCTION_N
// ...
// bl foo
// restore LR from SP
// ret
//
// Because the code to handle more than one stack fixup does not
// currently have the proper checks for legality, these cases will assert
// in the AArch64 MachineOutliner. This is because the code to do this
// needs more hardening, testing, better checks that generated code is
// legal, etc and because it is only verified to handle a single pass of
// stack fixup.
//
// The assert happens in AArch64InstrInfo::buildOutlinedFrame to catch
// these cases until they are known to be handled. Bugzilla 46767 is
// referenced in comments at the assert site.
//
// To avoid asserting (or generating non-legal code on noassert builds)
// we remove all candidates which would need more than one stack fixup by
// pruning the cases where the candidate has calls while also having no
// available LR and having no available general purpose registers to copy
// LR to (ie one extra stack save/restore).
//
if (FlagsSetInAll & MachineOutlinerMBBFlags::HasCalls) {
erase_if(RepeatedSequenceLocs, [this](outliner::Candidate &C) {
return (std::any_of(
C.front(), std::next(C.back()),
[](const MachineInstr &MI) { return MI.isCall(); })) &&
(!C.LRU.available(AArch64::LR) || !findRegisterToSaveLRTo(C));
});
}
}
// If we dropped all of the candidates, bail out here.
@ -6616,6 +6670,9 @@ void AArch64InstrInfo::buildOutlinedFrame(
if (std::any_of(MBB.instr_begin(), MBB.instr_end(), IsNonTailCall)) {
// Fix up the instructions in the range, since we're going to modify the
// stack.
// Bugzilla ID: 46767
// TODO: Check if fixing up twice is safe so we can outline these.
assert(OF.FrameConstructionID != MachineOutlinerDefault &&
"Can only fix up stack references once");
fixupPostOutline(MBB);

View File

@ -0,0 +1,75 @@
# RUN: llc -mtriple=aarch64--- -run-pass=machine-outliner \
# RUN: -verify-machineinstrs %s -o - | FileCheck %s
# CHECK-NOT: OUTLINED_FUNCTION
--- |
define void @f1() #0 { ret void }
define void @f2() #0 { ret void }
define void @f3() #0 { ret void }
define void @f4() #0 { ret void }
attributes #0 = { minsize noredzone "branch-target-enforcement" }
...
---
name: f1
tracksRegLiveness: true
body: |
bb.0:
liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
$x20, $x19 = LDPXi $sp, 11
$x20, $x19 = LDPXi $sp, 12
$x20, $x19 = LDPXi $sp, 13
$x20, $x19 = LDPXi $sp, 14
$x20, $x19 = LDPXi $sp, 18
$x20, $x19 = LDPXi $sp, 19
$x20, $x19 = LDPXi $sp, 20
$x20, $x19 = LDPXi $sp, 21
BLR $x20, implicit $sp
bb.2:
liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
RET undef $lr
...
---
name: f2
tracksRegLiveness: true
body: |
bb.0:
liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
$x20, $x19 = LDPXi $sp, 11
$x20, $x19 = LDPXi $sp, 12
$x20, $x19 = LDPXi $sp, 13
$x20, $x19 = LDPXi $sp, 14
$x20, $x19 = LDPXi $sp, 18
$x20, $x19 = LDPXi $sp, 19
$x20, $x19 = LDPXi $sp, 20
$x20, $x19 = LDPXi $sp, 21
BLR $x20, implicit $sp
bb.2:
liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
RET undef $lr
...
---
name: f3
tracksRegLiveness: true
body: |
bb.0:
liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
$x20, $x19 = LDPXi $sp, 11
$x20, $x19 = LDPXi $sp, 12
$x20, $x19 = LDPXi $sp, 13
$x20, $x19 = LDPXi $sp, 14
$x20, $x19 = LDPXi $sp, 18
$x20, $x19 = LDPXi $sp, 19
$x20, $x19 = LDPXi $sp, 20
$x20, $x19 = LDPXi $sp, 21
BLR $x20, implicit $sp
bb.2:
liveins: $lr, $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x18, $x19, $x20, $x21, $x22, $x23, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $fp
RET undef $lr
...
---
name: f4
tracksRegLiveness: true
body: |
bb.0:
RET undef $lr

View File

@ -0,0 +1,67 @@
# RUN: llc -mtriple=arm64-apple-ios -run-pass=prologepilog -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s
# Noreturn functions conservatively need to save and restore lr.
# When there is no available register, the stack is used at call site.
# If the stack also needs to be set up for a call in the outlined function,
# bail-out this case since we do not handle adjusting the stack twice.
# CHECK: OUTLINED_FUNCTION
--- |
@g = external global i32
define void @stack_1() #0 { ret void }
define void @stack_2() #0 { ret void }
define void @baz() { ret void }
attributes #0 = { noredzone }
...
---
name: stack_1
tracksRegLiveness: true
body: |
bb.0:
liveins: $x4, $x0, $x1, $x2, $x3
$w8 = MOVZWi 259, 0
$x9 = ADRP target-flags(aarch64-page) @g
$x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
STRXui $x9, $x1, 0
STRHHui $w8, $x1, 8
$w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
STRXui $x8, $x3, 0
STPXi $x3, $xzr, $x2, 0
$w8 = MOVZWi 271, 0
STRHHui $w8, $x2, 8
$x8 = ORRXrs $xzr, $x0, 0
$x0 = ORRXrs $xzr, $x1, 0
$x1 = ORRXrs $xzr, $x2, 0
BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
BRK 1
...
---
name: stack_2
tracksRegLiveness: true
body: |
bb.0:
liveins: $x4, $x0, $x1, $x2, $x3
$w8 = MOVZWi 259, 0
$x9 = ADRP target-flags(aarch64-page) @g
$x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
STRXui $x9, $x1, 0
STRHHui $w8, $x1, 8
$w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
STRXui $x8, $x3, 0
STPXi $x3, $xzr, $x2, 0
$w8 = MOVZWi 271, 0
STRHHui $w8, $x2, 8
$x8 = ORRXrs $xzr, $x0, 0
$x0 = ORRXrs $xzr, $x1, 0
$x1 = ORRXrs $xzr, $x2, 0
BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
BRK 1
...
---
name: baz
tracksRegLiveness: true
body: |
bb.0:
liveins: $w0, $lr, $w8
RET undef $lr

View File

@ -0,0 +1,67 @@
# RUN: llc -mtriple=arm64-apple-ios -run-pass=prologepilog -run-pass=machine-outliner -verify-machineinstrs %s -o - | FileCheck %s
# Noreturn functions conservatively need to save and restore lr.
# When there is no available register, the stack is used at call site.
# If the stack also needs to be set up for a call in the outlined function,
# bail-out this case since we do not handle adjusting the stack twice.
# CHECK-NOT: OUTLINED_FUNCTION
--- |
@g = external global i32
define void @stack_1() #0 { ret void }
define void @stack_2() #0 { ret void }
define void @baz() { ret void }
attributes #0 = { noredzone noreturn }
...
---
name: stack_1
tracksRegLiveness: true
body: |
bb.0:
liveins: $x4, $x0, $x1, $x2, $x3
$w8 = MOVZWi 259, 0
$x9 = ADRP target-flags(aarch64-page) @g
$x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
STRXui $x9, $x1, 0
STRHHui $w8, $x1, 8
$w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
STRXui $x8, $x3, 0
STPXi $x3, $xzr, $x2, 0
$w8 = MOVZWi 271, 0
STRHHui $w8, $x2, 8
$x8 = ORRXrs $xzr, $x0, 0
$x0 = ORRXrs $xzr, $x1, 0
$x1 = ORRXrs $xzr, $x2, 0
BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
BRK 1
...
---
name: stack_2
tracksRegLiveness: true
body: |
bb.0:
liveins: $x4, $x0, $x1, $x2, $x3
$w8 = MOVZWi 259, 0
$x9 = ADRP target-flags(aarch64-page) @g
$x9 = ADDXri $x9, target-flags(aarch64-pageoff, aarch64-nc) @g, 0
STRXui $x9, $x1, 0
STRHHui $w8, $x1, 8
$w8 = ORRWrs $wzr, $w4, 0, implicit-def $x8
STRXui $x8, $x3, 0
STPXi $x3, $xzr, $x2, 0
$w8 = MOVZWi 271, 0
STRHHui $w8, $x2, 8
$x8 = ORRXrs $xzr, $x0, 0
$x0 = ORRXrs $xzr, $x1, 0
$x1 = ORRXrs $xzr, $x2, 0
BL @baz, implicit-def dead $lr, implicit $sp, implicit $x8, implicit $x0, implicit $x1, implicit $x3, implicit $x4, implicit-def $sp, implicit-def $x5, implicit-def $x6, implicit-def $x7, implicit-def $x8, implicit-def $x9, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit-def $x13, implicit-def $x14, implicit-def $x15, implicit-def $x18
BRK 1
...
---
name: baz
tracksRegLiveness: true
body: |
bb.0:
liveins: $w0, $lr, $w8
RET undef $lr