[Windows] Disable TrapUnreachable for Win64, add SEH_NoReturn

Users have complained that llvm.trap produces two ud2 instructions on
Win64: one for the trap itself and one for the unreachable that follows
it. This change fixes that.

TrapUnreachable was added and enabled for Win64 in r206684 (April 2014)
to avoid poorly understood issues with the Windows unwinder.

There seem to be two major things in play:
- the unwinder
- C++ EH, __CxxFrameHandler3 & co

The unwinder disassembles forward from the return address to scan for
epilogues. Inserting a ud2 had the effect of stopping the unwinder, and
ensuring that it ran the EH personality function for the current frame.
However, it's not clear what the unwinder does when the return address
happens to be the last address of one function and the first address of
the next function.
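
To make the boundary case concrete, here is a toy model in C++. It is not
the Windows unwinder: `FuncRange` and `ownerOf` are hypothetical names
invented for this note, and the sketch assumes functions are looked up by
half-open address ranges [begin, end) and laid out back to back with no
padding. Under those assumptions, the return address of a call that is the
last instruction of `f` equals the first address of `g`, so a range lookup
attributes it to `g`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: one function occupying the half-open range [begin, end).
struct FuncRange {
  std::string name;
  std::uint64_t begin;
  std::uint64_t end;
};

// Returns the name of the function whose range contains pc, or "" if none.
std::string ownerOf(const std::vector<FuncRange> &table, std::uint64_t pc) {
  for (const FuncRange &f : table) {
    assert(f.begin < f.end && "ranges must be non-empty");
    if (pc >= f.begin && pc < f.end)
      return f.name;
  }
  return "";
}
```

If `f` spans [0x1000, 0x1010) and ends in a call, the return address 0x1010
resolves to the following function. Growing `f` by one byte (for example a
trailing int3) keeps 0x1010 inside `f`.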

The Visual C++ EH personality, __CxxFrameHandler3, needs to figure out
what the current EH state number is. It does this by consulting the
ip2state table, which maps from PC to state number. This seems to go
wrong when the return address is the last PC of the function or catch
funclet.

I'm not sure precisely which system is involved here, but in order to
address these real or hypothetical problems, I believe it is enough to
insert int3 after a call site if it would otherwise be the last
instruction in a function or funclet.  I was able to reproduce some
similar problems locally by arranging for a noreturn call to appear at
the end of a catch block immediately before an unrelated function, and I
confirmed that the problems go away when an extra trailing int3
instruction is added.

MSVC inserts int3 after every noreturn function call, but I believe it's
only necessary to do it if the call would be the last instruction. This
change inserts a pseudo instruction that expands to int3 if it is in the
last basic block of a function or funclet. I did what I could to run the
Microsoft compiler EH tests, and the ones I was able to run showed no
behavior difference before or after this change.
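
The placement rule can be sketched as a small standalone C++ model. This is
an illustration only: `Block` and `needsTrailingInt3` are hypothetical names,
and the vector stands in for the layout order of machine basic blocks. The
actual check is the `!NextMBB || NextMBB->isEHPad()` test in the
X86AsmPrinter hunk of this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine basic block, in layout order.
struct Block {
  bool isEHPad;            // true if this block starts a funclet
  bool hasNoReturnPseudo;  // true if the block ends with a noreturn call
                           // followed by the SEH_NoReturn pseudo
};

// Returns true if the pseudo in blocks[i] should expand to int3, i.e. the
// block is the last one in the function or the last one before a funclet.
bool needsTrailingInt3(const std::vector<Block> &blocks, std::size_t i) {
  assert(i < blocks.size() && "block index out of range");
  if (!blocks[i].hasNoReturnPseudo)
    return false;
  bool lastInFunction = (i + 1 == blocks.size());
  bool lastInFunclet = !lastInFunction && blocks[i + 1].isEHPad;
  return lastInFunction || lastInFunclet;
}
```

With three consecutive noreturn blocks in one function, only the last one
gets an int3, which matches the abort1/abort2/abort3 test added by this
commit.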

Differential Revision: https://reviews.llvm.org/D66980

llvm-svn: 370525
Reid Kleckner 2019-08-30 20:46:39 +00:00
parent 0227208b87
commit 0bb1630685
21 changed files with 145 additions and 38 deletions


@@ -2558,7 +2558,7 @@ bool blockEndIsUnreachable(const MachineBasicBlock &MBB,
MBB.succ_begin(), MBB.succ_end(),
[](const MachineBasicBlock *Succ) { return Succ->isEHPad(); }) &&
std::all_of(MBBI, MBB.end(), [](const MachineInstr &MI) {
-return MI.isMetaInstruction();
+return MI.isMetaInstruction() || MI.getOpcode() == X86::SEH_NoReturn;
});
}


@@ -4129,6 +4129,17 @@ X86TargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
InFlag = Chain.getValue(1);
}
+// Insert a pseudo instruction after noreturn calls that expands to int3 if
+// this would be the last instruction in the funclet. If the return address of
+// a call refers to the last PC of a function, the Windows SEH machinery can
+// get confused about which function or scope the return address belongs to.
+// MSVC inserts int3 after every noreturn function call, but LLVM only places
+// them when it would cause a problem otherwise.
+if (CLI.DoesNotReturn && Subtarget.isTargetWin64()) {
+Chain = DAG.getNode(X86ISD::SEH_NORETURN, dl, NodeTys, Chain, InFlag);
+InFlag = Chain.getValue(1);
+}
// Handle result values, copying them out of physregs into vregs that we
// return.
return LowerCallResult(Chain, InFlag, CallConv, isVarArg, Ins, dl, DAG,
@@ -28711,6 +28722,7 @@ const char *X86TargetLowering::getTargetNodeName(unsigned Opcode) const {
case X86ISD::VASTART_SAVE_XMM_REGS: return "X86ISD::VASTART_SAVE_XMM_REGS";
case X86ISD::VAARG_64: return "X86ISD::VAARG_64";
case X86ISD::WIN_ALLOCA: return "X86ISD::WIN_ALLOCA";
+case X86ISD::SEH_NORETURN: return "X86ISD::SEH_NORETURN";
case X86ISD::MEMBARRIER: return "X86ISD::MEMBARRIER";
case X86ISD::MFENCE: return "X86ISD::MFENCE";
case X86ISD::SEG_ALLOCA: return "X86ISD::SEG_ALLOCA";


@@ -531,6 +531,9 @@ namespace llvm {
// Windows's _chkstk call to do stack probing.
WIN_ALLOCA,
+// Expands to int3 or nothing, depending on basic block layout.
+SEH_NORETURN,
// For allocating variable amounts of stack space when using
// segmented stacks. Check if the current stacklet has enough space, and
// falls back to heap allocation if not.


@@ -239,6 +239,9 @@ let isPseudo = 1, SchedRW = [WriteSystem] in {
"#SEH_EndPrologue", []>;
def SEH_Epilogue : I<0, Pseudo, (outs), (ins),
"#SEH_Epilogue", []>;
+let hasSideEffects = 1 in
+def SEH_NoReturn : I<0, Pseudo, (outs), (ins),
+"#SEH_NoReturn", [(X86SehNoReturn)]>;
}
//===----------------------------------------------------------------------===//


@@ -289,6 +289,9 @@ def X86mul_imm : SDNode<"X86ISD::MUL_IMM", SDTIntBinOp>;
def X86WinAlloca : SDNode<"X86ISD::WIN_ALLOCA", SDT_X86WIN_ALLOCA,
[SDNPHasChain, SDNPOutGlue]>;
+def X86SehNoReturn : SDNode<"X86ISD::SEH_NORETURN", SDTX86Void,
+[SDNPHasChain, SDNPOutGlue]>;
def X86SegAlloca : SDNode<"X86ISD::SEG_ALLOCA", SDT_X86SEG_ALLOCA,
[SDNPHasChain]>;


@@ -1929,6 +1929,20 @@ void X86AsmPrinter::EmitInstruction(const MachineInstr *MI) {
return;
}
+case X86::SEH_NoReturn: {
+// Materialize an int3 if this instruction is in the last basic block in the
+// function. The int3 serves the same purpose as the noop emitted above for
+// SEH_Epilogue, which is to make the Win64 unwinder happy. If the return
+// address of the preceding call appears to precede an epilogue or a new
+// function, then the unwinder may get lost.
+const MachineBasicBlock *MBB = MI->getParent();
+const MachineBasicBlock *NextMBB = MBB->getNextNode();
+if (!NextMBB || NextMBB->isEHPad()) {
+EmitAndCountInstruction(MCInstBuilder(X86::INT3));
+}
+return;
+}
// Lower PSHUFB and VPERMILP normally but add a comment if we can find
// a constant shuffle mask. We won't be able to do this at the MC layer
// because the mask isn't an immediate.


@@ -219,17 +219,9 @@ X86TargetMachine::X86TargetMachine(const Target &T, const Triple &TT,
getEffectiveX86CodeModel(CM, JIT, TT.getArch() == Triple::x86_64),
OL),
TLOF(createTLOF(getTargetTriple())) {
-// Windows stack unwinder gets confused when execution flow "falls through"
-// after a call to 'noreturn' function.
-// To prevent that, we emit a trap for 'unreachable' IR instructions.
-// (which on X86, happens to be the 'ud2' instruction)
// On PS4, the "return address" of a 'noreturn' call must still be within
// the calling function, and TrapUnreachable is an easy way to get that.
-// The check here for 64-bit windows is a bit icky, but as we're unlikely
-// to ever want to mix 32 and 64-bit windows code in a single module
-// this should be fine.
-if ((TT.isOSWindows() && TT.getArch() == Triple::x86_64) || TT.isPS4() ||
-TT.isOSBinFormatMachO()) {
+if (TT.isPS4() || TT.isOSBinFormatMachO()) {
this->Options.TrapUnreachable = true;
this->Options.NoTrapAfterNoreturn = TT.isOSBinFormatMachO();
}


@@ -1,4 +1,3 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: sed -e s/.Cxx:// %s | llc -mtriple=x86_64-pc-windows-msvc | FileCheck %s --check-prefix=CXX
; RUN: sed -e s/.Seh:// %s | llc -mtriple=x86_64-pc-windows-msvc | FileCheck %s --check-prefix=SEH
@@ -69,13 +68,13 @@ catch.body.2:
; SEH-NEXT: .long .Ltmp0@IMGREL+1
; SEH-NEXT: .long .Ltmp1@IMGREL+1
; SEH-NEXT: .long dummy_filter@IMGREL
-; SEH-NEXT: .long .LBB0_2@IMGREL
+; SEH-NEXT: .long .LBB0_5@IMGREL
; SEH-NEXT: .long .Ltmp2@IMGREL+1
; SEH-NEXT: .long .Ltmp3@IMGREL+1
-; SEH-NEXT: .long "?dtor$5@?0?test@4HA"@IMGREL
+; SEH-NEXT: .long "?dtor$2@?0?test@4HA"@IMGREL
; SEH-NEXT: .long 0
; SEH-NEXT: .long .Ltmp2@IMGREL+1
; SEH-NEXT: .long .Ltmp3@IMGREL+1
; SEH-NEXT: .long dummy_filter@IMGREL
-; SEH-NEXT: .long .LBB0_2@IMGREL
+; SEH-NEXT: .long .LBB0_5@IMGREL
; SEH-NEXT: .Llsda_end0:


@@ -5,18 +5,18 @@
; RUN: llc -mtriple=x86_64-scei-ps4 < %s | FileCheck -check-prefix=PS4 %s
; X64_DARWIN: orq
-; X64-DARWIN-NEXT: ud2
+; X64_DARWIN-NEXT: ud2
; X64_LINUX: orq %rax, %rcx
; X64_LINUX-NEXT: jne
; X64_LINUX-NEXT: %bb8.i329
; X64_WINDOWS: orq %rax, %rcx
-; X64_WINDOWS-NEXT: ud2
+; X64_WINDOWS-NEXT: jne
; X64_WINDOWS_GNU: movq .refptr._ZN11xercesc_2_513SchemaSymbols21fgURI_SCHEMAFORSCHEMAE(%rip), %rax
; X64_WINDOWS_GNU: orq .refptr._ZN11xercesc_2_56XMLUni16fgNotationStringE(%rip), %rax
-; X64_WINDOWS_GNU-NEXT: ud2
+; X64_WINDOWS_GNU-NEXT: jne
; PS4: orq %rax, %rcx
; PS4-NEXT: ud2


@@ -7,6 +7,8 @@ declare void @throw()
declare i32 @__CxxFrameHandler3(...)
+declare void @llvm.trap()
define void @test1() personality i32 (...)* @__CxxFrameHandler3 {
entry:
%alloca2 = alloca i8*, align 4
@@ -30,6 +32,7 @@ catch.pad: ; preds = %catch.dispatch
%bc2 = bitcast i8** %alloca2 to i8*
call void @llvm.lifetime.start.p0i8(i64 4, i8* %bc2)
store volatile i8* null, i8** %alloca1
+call void @llvm.trap()
unreachable
; CHECK-LABEL: "?catch$2@?0?test1@4HA"
@@ -67,6 +70,7 @@ catch.pad: ; preds = %catch.dispatch
%bc2 = bitcast i8** %alloca2 to i8*
call void @llvm.lifetime.start.p0i8(i64 4, i8* %bc2)
store volatile i8* null, i8** %alloca1
+call void @llvm.trap()
unreachable
; CHECK-LABEL: "?catch$2@?0?test2@4HA"


@@ -75,7 +75,7 @@ unreachable: ; preds = %entry
; CHECK: popq %rbp
; CHECK: retq
-; CHECK: "?catch$2@?0?global_array@4HA":
+; CHECK: "?catch${{[0-9]+}}@?0?global_array@4HA":
; CHECK: pushq %rbp
; CHECK: movslq {{.*}}, %[[idx:[^ ]*]]
; CHECK: leaq array(%rip), %[[base:[^ ]*]]
@@ -122,7 +122,7 @@ unreachable: ; preds = %entry
; CHECK: popq %rbp
; CHECK: retq
-; CHECK: "?catch$2@?0?access_imported@4HA":
+; CHECK: "?catch${{[0-9]+}}@?0?access_imported@4HA":
; CHECK: pushq %rbp
; CHECK: movq __imp_imported(%rip), %[[base:[^ ]*]]
; CHECK: movl $222, (%[[base]])


@@ -6,6 +6,7 @@ target triple = "x86_64-pc-windows-msvc"
declare i32 @__CxxFrameHandler3(...)
declare void @throw() noreturn uwtable
declare i8* @getval()
+declare void @llvm.trap()
define i8* @reload_out_of_pad(i8* %arg) #0 personality i32 (...)* @__CxxFrameHandler3 {
assertPassed:
@@ -19,6 +20,7 @@ catch:
; This block *must* appear after the catchret to test the bug.
; FIXME: Make this an MIR test so we can control MBB layout.
unreachable:
+call void @llvm.trap()
unreachable
catch.dispatch:
@@ -35,7 +37,7 @@ return:
; CHECK: movq -[[arg_slot]](%rbp), %rax # 8-byte Reload
; CHECK: retq
-; CHECK: "?catch$3@?0?reload_out_of_pad@4HA":
+; CHECK: "?catch${{[0-9]+}}@?0?reload_out_of_pad@4HA":
; CHECK-NOT: Reload
; CHECK: retq
@@ -50,6 +52,7 @@ catch:
catchret from %cp to label %return
unreachable:
+call void @llvm.trap()
unreachable
catch.dispatch:
@@ -65,7 +68,7 @@ return:
; CHECK: movq -[[val_slot:[0-9]+]](%rbp), %rax # 8-byte Reload
; CHECK: retq
-; CHECK: "?catch$3@?0?spill_in_pad@4HA":
+; CHECK: "?catch${{[0-9]+}}@?0?spill_in_pad@4HA":
; CHECK: callq getval
; CHECK: movq %rax, -[[val_slot]](%rbp) # 8-byte Spill
; CHECK: retq


@@ -15,7 +15,7 @@ entry:
; CHECK-LABEL: f:
; WIN32: nop
-; WIN64: ud2
+; WIN64: nop
; LINUX-NOT: nop
; LINUX-NOT: ud2


@@ -9,6 +9,8 @@ target triple = "x86_64-pc-windows-msvc"
@"\01??_7type_info@@6B@" = external constant i8*
@"\01??_R0H@8" = internal global %rtti.TypeDescriptor2 { i8** @"\01??_7type_info@@6B@", i8* null, [3 x i8] c".H\00" }
+declare void @llvm.trap()
define void @test1(i1 %B) personality i32 (...)* @__CxxFrameHandler3 {
entry:
invoke void @g()
@@ -31,6 +33,7 @@ try.cont:
ret void
unreachable:
+call void @llvm.trap()
unreachable
}
@@ -76,6 +79,7 @@ try.cont.5: ; preds = %try.cont
ret i32 0
unreachable: ; preds = %catch, %entry
+call void @llvm.trap()
unreachable
}
@@ -125,11 +129,13 @@ try.cont: ; preds = %entry
br i1 %V, label %exit_one, label %exit_two
exit_one:
-tail call void @exit(i32 0)
+tail call void @g()
+call void @llvm.trap()
unreachable
exit_two:
-tail call void @exit(i32 0)
+tail call void @g()
+call void @llvm.trap()
unreachable
}
@@ -138,7 +144,7 @@ exit_two:
; The entry funclet contains %entry and %try.cont
; CHECK: # %entry
; CHECK: # %try.cont
-; CHECK: callq exit
+; CHECK: callq g
; CHECK-NOT: # exit_one
; CHECK-NOT: # exit_two
; CHECK: ud2
@@ -146,12 +152,12 @@ exit_two:
; The catch(...) funclet contains %catch.2
; CHECK: # %catch.2{{$}}
; CHECK: callq exit
-; CHECK: ud2
+; CHECK-NEXT: int3
; The catch(int) funclet contains %catch
; CHECK: # %catch{{$}}
; CHECK: callq exit
-; CHECK: ud2
+; CHECK-NEXT: int3
declare void @exit(i32) noreturn nounwind
declare void @_CxxThrowException(i8*, %eh.ThrowInfo*)


@@ -0,0 +1,53 @@
+; RUN: llc < %s -mtriple=x86_64-windows-msvc | FileCheck %s
+; Function Attrs: noinline nounwind optnone uwtable
+define dso_local i32 @foo() {
+entry:
+%call = call i32 @cond()
+%tobool = icmp ne i32 %call, 0
+br i1 %tobool, label %if.then, label %if.end
+if.then: ; preds = %entry
+call void @abort1()
+unreachable
+if.end: ; preds = %entry
+%call1 = call i32 @cond()
+%tobool2 = icmp ne i32 %call1, 0
+br i1 %tobool2, label %if.then3, label %if.end4
+if.then3: ; preds = %if.end
+call void @abort2()
+unreachable
+if.end4: ; preds = %if.end
+%call5 = call i32 @cond()
+%tobool6 = icmp ne i32 %call5, 0
+br i1 %tobool6, label %if.then7, label %if.end8
+if.then7: ; preds = %if.end4
+call void @abort3()
+unreachable
+if.end8: ; preds = %if.end4
+ret i32 0
+}
+; CHECK-LABEL: foo:
+; CHECK: callq cond
+; CHECK: callq cond
+; CHECK: callq cond
+; We don't need int3's between these calls to abort, since they won't confuse
+; the unwinder.
+; CHECK: callq abort1
+; CHECK-NEXT: # %if.then3
+; CHECK: callq abort2
+; CHECK-NEXT: # %if.then7
+; CHECK: callq abort3
+; CHECK-NEXT: int3
+declare dso_local i32 @cond()
+declare dso_local void @abort1() noreturn
+declare dso_local void @abort2() noreturn
+declare dso_local void @abort3() noreturn


@@ -31,6 +31,6 @@ define void @g() {
unreachable
}
; CHECK-LABEL: g:
-; CHECK: ud2
+; CHECK: nop
attributes #0 = { nounwind }


@@ -1,13 +1,19 @@
; RUN: llc < %s -mtriple=i686-apple-darwin8 -mcpu=yonah | FileCheck %s -check-prefix=DARWIN
; RUN: llc < %s -mtriple=i686-unknown-linux -mcpu=yonah | FileCheck %s -check-prefix=LINUX
; RUN: llc < %s -mtriple=x86_64-scei-ps4 | FileCheck %s -check-prefix=PS4
+; RUN: llc < %s -mtriple=x86_64-windows-msvc | FileCheck %s -check-prefix=WIN64
; DARWIN-LABEL: test0:
; DARWIN: ud2
; LINUX-LABEL: test0:
; LINUX: ud2
+; FIXME: PS4 probably doesn't want two ud2s.
; PS4-LABEL: test0:
; PS4: ud2
+; PS4: ud2
+; WIN64-LABEL: test0:
+; WIN64: ud2
+; WIN64-NOT: ud2
define i32 @test0() noreturn nounwind {
entry:
tail call void @llvm.trap( )
@@ -20,6 +26,9 @@ entry:
; LINUX: int3
; PS4-LABEL: test1:
; PS4: int $65
+; WIN64-LABEL: test1:
+; WIN64: int3
+; WIN64-NOT: ud2
define i32 @test1() noreturn nounwind {
entry:
tail call void @llvm.debugtrap( )


@@ -1,10 +1,13 @@
-; RUN: llc -o - %s -mtriple=x86_64-windows-msvc | FileCheck %s --check-prefixes=CHECK,TRAP_AFTER_NORETURN
+; RUN: llc -o - %s -mtriple=x86_64-linux-gnu | FileCheck %s --check-prefixes=CHECK,NORMAL
+; RUN: llc -o - %s -mtriple=x86_64-windows-msvc | FileCheck %s --check-prefixes=CHECK,NORMAL
; RUN: llc -o - %s -mtriple=x86_64-scei-ps4 | FileCheck %s --check-prefixes=CHECK,TRAP_AFTER_NORETURN
; RUN: llc -o - %s -mtriple=x86_64-apple-darwin | FileCheck %s --check-prefixes=CHECK,NO_TRAP_AFTER_NORETURN
; CHECK-LABEL: call_exit:
; CHECK: callq {{_?}}exit
; TRAP_AFTER_NORETURN: ud2
; NO_TRAP_AFTER_NORETURN-NOT: ud2
+; NORMAL-NOT: ud2
define i32 @call_exit() noreturn nounwind {
tail call void @exit(i32 0)
unreachable
@@ -14,13 +17,17 @@ define i32 @call_exit() noreturn nounwind {
; CHECK: ud2
; TRAP_AFTER_NORETURN: ud2
; NO_TRAP_AFTER_NORETURN-NOT: ud2
+; NORMAL-NOT: ud2
define i32 @trap() noreturn nounwind {
tail call void @llvm.trap()
unreachable
}
; CHECK-LABEL: unreachable:
-; CHECK: ud2
+; TRAP_AFTER_NORETURN: ud2
+; NO_TRAP_AFTER_NORETURN: ud2
+; NORMAL-NOT: ud2
+; NORMAL: # -- End function
define i32 @unreachable() noreturn nounwind {
unreachable
}


@@ -24,10 +24,9 @@ catch:
; WIN64: nop
; WIN64: addq ${{[0-9]+}}, %rsp
; WIN64: retq
-; Check for 'ud2' after noreturn call
+; Check for 'int3' after noreturn call.
; WIN64: callq _Unwind_Resume
-; WIN64-NEXT: ud2
-; WIN64: .seh_endproc
+; WIN64-NEXT: int3
; Check it still works when blocks are reordered.


@@ -125,11 +125,11 @@ endtryfinally:
; WIN64-LABEL: foo4:
; WIN64: .seh_proc foo4
; WIN64: .seh_handler _d_eh_personality, @unwind, @except
-; NORM: subq $56, %rsp
-; ATOM: leaq -56(%rsp), %rsp
-; WIN64: .seh_stackalloc 56
+; NORM: subq $40, %rsp
+; ATOM: leaq -40(%rsp), %rsp
+; WIN64: .seh_stackalloc 40
; WIN64: .seh_endprologue
-; WIN64: addq $56, %rsp
+; WIN64: addq $40, %rsp
; WIN64: ret
; WIN64: .seh_handlerdata
; WIN64: .seh_endproc


@@ -54,7 +54,7 @@
; ASM: [[p_b2:\.Ltmp[0-9]+]]:
; ASM: #DEBUG_VALUE: p <- $esi
; ASM: callq call_noreturn
-; ASM: ud2
+; ASM: int3
; ASM: .Lfunc_end0:
; ASM: .short {{.*}} # Record length