llvm-project/llvm/test/CodeGen/WebAssembly/lower-em-sjlj.ll


[WebAssembly] Tidy up EH/SjLj options

This CL is small, but the description is long because it sums up the status quo for Emscripten/Wasm EH/SjLj options.

First, this CL adds an option for Wasm SjLj (`-wasm-enable-sjlj`), which handles SjLj using Wasm EH. The implementation will be added in a follow-up CL; this adds the option first so that error checking can be done. This also adds an option for Wasm EH (`-wasm-enable-eh`), which has already been implemented. Previously `-exception-model=wasm` was treated as enabling Wasm EH, but once Wasm SjLj is added it will be possible to use Wasm EH instructions for Wasm SjLj without enabling EH, so going forward `opt` and `llc` will need this option to use Wasm EH. This only affects `opt` and `llc` command lines and does not affect the Emscripten user interface.

Now we have two modes of EH (Emscripten/Wasm) and two modes of SjLj (also Emscripten/Wasm). The options corresponding to each are:
- Emscripten EH: `-enable-emscripten-cxx-exceptions`
- Emscripten SjLj: `-enable-emscripten-sjlj`
- Wasm EH: `-wasm-enable-eh -exception-model=wasm` `-mattr=+exception-handling`
- Wasm SjLj: `-wasm-enable-sjlj -exception-model=wasm` `-mattr=+exception-handling`

Wasm EH/SjLj's options are a little complicated because `-exception-model` and `-mattr` are common LLVM options and not under our control. (`-mattr` can be omitted if it is embedded within the bitcode file.)

We have the following rules for combining options:
- Emscripten EH and Wasm EH cannot be turned on at the same time
- Emscripten SjLj and Wasm SjLj cannot be turned on at the same time
- Wasm SjLj should be used with Wasm EH

Which means we now allow these combinations:
- Emscripten EH + Emscripten SjLj: the current default in `emcc`
- Wasm EH + Emscripten SjLj: This is allowed, but only as an interim step in which we are testing Wasm EH but do not yet have a working implementation of Wasm SjLj. This will error out (D107687) at compile time if `setjmp` is called in a function in which a Wasm exception is used.
- Wasm EH + Wasm SjLj: This will be the default mode later when using Wasm EH. Currently the Wasm SjLj implementation doesn't exist, so it doesn't work.
- Emscripten EH + Wasm SjLj will not work.

This CL moves these error checking routines to `WebAssemblyPassConfig::addIRPasses`. Not sure if this is an ideal place to do this, but I couldn't find a better one. Currently some checking is done within LowerEmscriptenEHSjLj, but these checks only run if LowerEmscriptenEHSjLj runs, so they may not run when Wasm EH is used. This moves them to `addIRPasses` and adds some more checks.

Currently the LowerEmscriptenEHSjLj pass is responsible for Emscripten EH and Emscripten SjLj. Wasm EH transformations are done in multiple places, including WasmEHPrepare, LateEHPrepare, and CFGStackify. But in the follow-up CL, the LowerEmscriptenEHSjLj pass will also be responsible for a part of the Wasm SjLj transformation, because Wasm SjLj will also use several Emscripten library functions, and more than half of the transformation will be shared between Emscripten SjLj and Wasm SjLj.

Currently we have `-enable-emscripten-cxx-exceptions` and `-enable-emscripten-sjlj`, but these only work for `llc`: for `llc` we feed these options to the pass, but when we run the pass using `opt` the pass is created with no options and the defaults are used, which turns both Emscripten EH and Emscripten SjLj on. Now that there is one more SjLj option to care for, the LowerEmscriptenEHSjLj pass needs a finer way to control these options. This CL removes those default parameters and makes the LowerEmscriptenEHSjLj pass read directly from the command line options specified.

So if we only run `opt -wasm-lower-em-ehsjlj`, currently both Emscripten EH and Emscripten SjLj will run, but with this CL neither will run unless we additionally pass `-enable-emscripten-cxx-exceptions` or `-enable-emscripten-sjlj`, or both. This does not affect users; it only affects our `opt` tests because `emcc` will not call either `opt` or `llc`. As a result, our existing Emscripten EH/SjLj tests gained one or both of those options in their `RUN` lines.

Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107685
2021-08-07 10:35:18 +08:00
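The option rules in the commit message above can be sketched as a small standalone check. This is only an illustration in C, not the code in `WebAssemblyPassConfig::addIRPasses`: the struct, its field names, and `check_eh_sjlj_options` are hypothetical, and the third rule is encoded as the one combination the message explicitly rejects (Emscripten EH + Wasm SjLj).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option set. Each field mirrors one flag named in the
   commit message; the struct itself is not part of LLVM. */
struct eh_sjlj_options {
  bool emscripten_eh;   /* -enable-emscripten-cxx-exceptions */
  bool emscripten_sjlj; /* -enable-emscripten-sjlj */
  bool wasm_eh;         /* -wasm-enable-eh */
  bool wasm_sjlj;       /* -wasm-enable-sjlj */
};

/* Returns NULL when the combination is allowed, otherwise a short
   reason string (the wording of the messages is made up here). */
static const char *check_eh_sjlj_options(struct eh_sjlj_options o) {
  if (o.emscripten_eh && o.wasm_eh)
    return "Emscripten EH and Wasm EH cannot both be enabled";
  if (o.emscripten_sjlj && o.wasm_sjlj)
    return "Emscripten SjLj and Wasm SjLj cannot both be enabled";
  if (o.emscripten_eh && o.wasm_sjlj)
    return "Emscripten EH cannot be combined with Wasm SjLj";
  return NULL;
}

/* Example use: the emcc default passes, the rejected mix does not. */
void example_option_checks(void) {
  struct eh_sjlj_options emcc_default = {
      .emscripten_eh = true, .emscripten_sjlj = true};
  struct eh_sjlj_options rejected = {
      .emscripten_eh = true, .wasm_sjlj = true};
  assert(check_eh_sjlj_options(emcc_default) == NULL);
  assert(check_eh_sjlj_options(rejected) != NULL);
}
```

Returning a reason string instead of aborting keeps the sketch easy to test; a real pass configuration would more likely report a fatal error at this point.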
; RUN: opt < %s -wasm-lower-em-ehsjlj -enable-emscripten-sjlj -S | FileCheck %s --check-prefixes=CHECK,NO-TLS -DPTR=i32
; RUN: opt < %s -wasm-lower-em-ehsjlj -enable-emscripten-sjlj -S --mattr=+atomics,+bulk-memory | FileCheck %s --check-prefixes=CHECK,TLS -DPTR=i32
; RUN: opt < %s -wasm-lower-em-ehsjlj -enable-emscripten-sjlj --mtriple=wasm64-unknown-unknown -data-layout="e-m:e-p:64:64-i64:64-n32:64-S128" -S | FileCheck %s --check-prefixes=CHECK -DPTR=i64
target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
target triple = "wasm32-unknown-unknown"
%struct.__jmp_buf_tag = type { [6 x i32], i32, [32 x i32] }
@global_var = global i32 0, align 4
; NO-TLS-DAG: __THREW__ = external global [[PTR]]
; NO-TLS-DAG: __threwValue = external global [[PTR]]
; TLS-DAG: __THREW__ = external thread_local(localexec) global i32
; TLS-DAG: __threwValue = external thread_local(localexec) global i32
@global_longjmp_ptr = global void (%struct.__jmp_buf_tag*, i32)* @longjmp, align 4
; CHECK-DAG: @global_longjmp_ptr = global void (%struct.__jmp_buf_tag*, i32)* bitcast (void ([[PTR]], i32)* @emscripten_longjmp to void (%struct.__jmp_buf_tag*, i32)*)
; Test a simple setjmp - longjmp sequence
define void @setjmp_longjmp() {
; CHECK-LABEL: @setjmp_longjmp
entry:
%buf = alloca [1 x %struct.__jmp_buf_tag], align 16
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
%call = call i32 @setjmp(%struct.__jmp_buf_tag* %arraydecay) #0
%arraydecay1 = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
call void @longjmp(%struct.__jmp_buf_tag* %arraydecay1, i32 1) #1
unreachable
; CHECK: entry:
; CHECK-NEXT: %[[MALLOCCALL:.*]] = tail call i8* @malloc(i32 40)
; CHECK-NEXT: %[[SETJMP_TABLE:.*]] = bitcast i8* %[[MALLOCCALL]] to i32*
; CHECK-NEXT: store i32 0, i32* %[[SETJMP_TABLE]]
; CHECK-NEXT: %[[SETJMP_TABLE_SIZE:.*]] = add i32 4, 0
[WebAssembly] Use entry block only for initializations in EmSjLj

Emscripten SjLj transformation is done in four steps. This will be mostly the same for the soon-to-be-added Wasm SjLj; steps 1, 3, and 4 will be shared and there will be a separate way of doing step 2.
1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB
2. Handle `setjmp` callsites
3. Handle `longjmp` callsites
4. Cleanup and update SSA

We initialize `setjmpTable` and `setjmpTableSize` in the entry BB. But if the entry BB contains a `setjmp` call, some `setjmp` handling transformation will also happen in the entry BB, such as calling `saveSetjmp`. This is fine for Emscripten SjLj but not for Wasm SjLj, because in Wasm SjLj we will add a dispatch BB that contains a `switch` right after the entry BB, from which we jump to one of the post-`setjmp` BBs. And this dispatch BB should precede all `setjmp` calls.

Emscripten SjLj (current):
```
entry:
  %setjmpTable = ...
  %setjmpTableSize = ...
  ...
  call @saveSetjmp(...)
```

Wasm SjLj (follow-up):
```
entry:
  %setjmpTable = ...
  %setjmpTableSize = ...

setjmp.dispatch:
  ...
  ; Jump to the right post-setjmp BB, if we are returning from a
  ; longjmp. If this is the first setjmp call, go to %entry.split.
  switch i32 %no, label %entry.split [
    i32 1, label %post.setjmp1
    i32 2, label %post.setjmp2
    ...
    i32 N, label %post.setjmpN
  ]

entry.split:
  ...
  call @saveSetjmp(...)
```

So in Wasm SjLj we split the entry BB so that the entry block contains only the `setjmpTable` and `setjmpTableSize` initialization, and we insert a `setjmp.dispatch` BB. (This part is not in this CL. This will be a follow-up.) But note that Emscripten SjLj and Wasm SjLj share all steps except step 2. If we split the entry BB only for Wasm SjLj, there will be one more `if`-`else` and the code will be more complicated. So this CL splits the entry BB in Emscripten SjLj too and puts only the initialization there, as follows:

Emscripten SjLj (this CL):
```
entry:
  %setjmpTable = ...
  %setjmpTableSize = ...
  br %entry.split

entry.split:
  ...
  call @saveSetjmp(...)
```

This is done only to share code with Wasm SjLj. It adds an unnecessary branch, but this will be removed in later optimization passes anyway.

This is in effect NFC, meaning the program behavior will not change, but existing ll test files have changed because the entry block was split. The reason I upload this in a separate CL is to make the Wasm SjLj diff tidier, because this changes many existing Emscripten SjLj tests, which can be confusing for the follow-up Wasm SjLj CL.

Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108729
2021-08-25 18:53:22 +08:00
; CHECK-NEXT: br label %entry.split
; CHECK: entry.split
; CHECK-NEXT: %[[BUF:.*]] = alloca [1 x %struct.__jmp_buf_tag]
; CHECK-NEXT: %[[ARRAYDECAY:.*]] = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %[[BUF]], i32 0, i32 0
; CHECK-NEXT: %[[SETJMP_TABLE1:.*]] = call i32* @saveSetjmp(%struct.__jmp_buf_tag* %[[ARRAYDECAY]], i32 1, i32* %[[SETJMP_TABLE]], i32 %[[SETJMP_TABLE_SIZE]])
; CHECK-NEXT: %[[SETJMP_TABLE_SIZE1:.*]] = call i32 @getTempRet0()
; CHECK-NEXT: br label %entry.split.split
; CHECK: entry.split.split:
; CHECK-NEXT: phi i32 [ 0, %entry.split ], [ %[[LONGJMP_RESULT:.*]], %if.end ]
; CHECK-NEXT: %[[ARRAYDECAY1:.*]] = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %[[BUF]], i32 0, i32 0
; CHECK-NEXT: %[[JMPBUF:.*]] = ptrtoint %struct.__jmp_buf_tag* %[[ARRAYDECAY1]] to [[PTR]]
; CHECK-NEXT: store [[PTR]] 0, [[PTR]]* @__THREW__
; CHECK-NEXT: call cc{{.*}} void @__invoke_void_[[PTR]]_i32(void ([[PTR]], i32)* @emscripten_longjmp, [[PTR]] %[[JMPBUF]], i32 1)
; CHECK-NEXT: %[[__THREW__VAL:.*]] = load [[PTR]], [[PTR]]* @__THREW__
; CHECK-NEXT: store [[PTR]] 0, [[PTR]]* @__THREW__
; CHECK-NEXT: %[[CMP0:.*]] = icmp ne [[PTR]] %__THREW__.val, 0
; CHECK-NEXT: %[[THREWVALUE_VAL:.*]] = load i32, i32* @__threwValue
; CHECK-NEXT: %[[CMP1:.*]] = icmp ne i32 %[[THREWVALUE_VAL]], 0
; CHECK-NEXT: %[[CMP:.*]] = and i1 %[[CMP0]], %[[CMP1]]
; CHECK-NEXT: br i1 %[[CMP]], label %if.then1, label %if.else1
; CHECK: entry.split.split.split:
; CHECK-NEXT: unreachable
; CHECK: if.then1:
; CHECK-NEXT: %[[__THREW__VAL_P:.*]] = inttoptr [[PTR]] %[[__THREW__VAL]] to [[PTR]]*
; CHECK-NEXT: %[[__THREW__VAL_P_LOADED:.*]] = load [[PTR]], [[PTR]]* %[[__THREW__VAL_P]]
; CHECK-NEXT: %[[LABEL:.*]] = call i32 @testSetjmp([[PTR]] %[[__THREW__VAL_P_LOADED]], i32* %[[SETJMP_TABLE1]], i32 %[[SETJMP_TABLE_SIZE1]])
; CHECK-NEXT: %[[CMP:.*]] = icmp eq i32 %[[LABEL]], 0
[WebAssembly] Share rethrowing BBs in LowerEmscriptenEHSjLj

There are four kinds of "rethrowing" BBs in this pass:
1. In Emscripten SjLj, after a possibly longjmping function call, we check if the thrown longjmp corresponds to one of the setjmps within the current function. If not, we rethrow the longjmp by calling `emscripten_longjmp`.
2. In Emscripten EH, after a possibly throwing function call, we check if the thrown exception corresponds to the current `catch` clauses. If not, we rethrow the exception by calling `__resumeException`.
3. When both Emscripten EH and SjLj are used, when we check for an exception after a possibly throwing function call, it is possible that we get not an exception but a longjmp. In this case, we shouldn't swallow it; we should rethrow the longjmp by calling `emscripten_longjmp`.
4. When both Emscripten EH and SjLj are used, when we check for a longjmp after a possibly longjmping function call, it is possible that we get not a longjmp but an exception. In this case, we shouldn't swallow it; we should rethrow the exception by calling `__resumeException`.

Case 1 is in Emscripten SjLj, 2 is in Emscripten EH, and 3 and 4 are relevant when both Emscripten EH and SjLj are used. 3 and 4 were first implemented in D106525.

We create BBs for 1, 3, and 4 in this pass. We create those BBs for every throwing/longjmping function call, along with other BBs that contain condition checks. What this CL does is create a single BB within a function for each of cases 1, 3, and 4. These BBs are exiting BBs in the function and thus have no successors, so they are easy to share between calls.

The names of the BBs created are:
Case 1: `call.em.longjmp`
Case 3: `rethrow.exn`
Case 4: `rethrow.longjmp`

For case 2 we don't currently create BBs; we only replace the existing `resume` instruction with `call @__resumeException`. And Clang already creates only a single `resume` BB per function and reuses it, so we don't need to optimize this case.

Not sure what good benchmarks for EH/SjLj are, but this decreases the size of the object file for `grfmt_jpeg.bc` (presumably from opencv), which we got from one of our users, by 8.9%. Even after running `wasm-opt -O4` on it, there is still a 4.8% improvement.

Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D108945
2021-08-28 13:14:49 +08:00
; CHECK-NEXT: br i1 %[[CMP]], label %call.em.longjmp, label %if.end2
; CHECK: if.else1:
; CHECK-NEXT: br label %if.end
; CHECK: if.end:
; CHECK-NEXT: %[[LABEL_PHI:.*]] = phi i32 [ %[[LABEL:.*]], %if.end2 ], [ -1, %if.else1 ]
; CHECK-NEXT: %[[LONGJMP_RESULT]] = call i32 @getTempRet0()
; CHECK-NEXT: switch i32 %[[LABEL_PHI]], label %entry.split.split.split [
; CHECK-NEXT: i32 1, label %entry.split.split
; CHECK-NEXT: ]
; CHECK: call.em.longjmp:
; CHECK-NEXT: %threw.phi = phi [[PTR]] [ %[[__THREW__VAL]], %if.then1 ]
; CHECK-NEXT: %threwvalue.phi = phi i32 [ %[[THREWVALUE_VAL]], %if.then1 ]
; CHECK-NEXT: %{{.*}} = bitcast i32* %[[SETJMP_TABLE1]] to i8*
; CHECK-NEXT: tail call void @free(i8* %{{.*}})
; CHECK-NEXT: call void @emscripten_longjmp([[PTR]] %threw.phi, i32 %threwvalue.phi)
; CHECK-NEXT: unreachable
; CHECK: if.end2:
; CHECK-NEXT: call void @setTempRet0(i32 %[[THREWVALUE_VAL]])
; CHECK-NEXT: br label %if.end
}
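The control flow that the CHECK lines for `@setjmp_longjmp` verify after the invoke wrapper returns can be modeled in plain C. This is a rough sketch for reading the test, not code from the pass: `classify_post_call`, the enum, and the struct are hypothetical names, and `found_label` stands in for the value returned by `testSetjmp` (0 when the `jmp_buf` is not in this function's table, as implied by the `icmp eq ... 0` branch to `call.em.longjmp`).

```c
#include <assert.h>
#include <stdint.h>

/* Where control goes after a call that might longjmp. These names are
   illustrative and do not appear in LLVM. */
enum post_call_action {
  CONTINUE_NORMALLY,   /* if.else1: no longjmp, label is -1 */
  RESUME_AFTER_SETJMP, /* if.end2: longjmp targets a setjmp here */
  RETHROW_LONGJMP      /* call.em.longjmp: not ours, pass it on */
};

struct post_call_result {
  enum post_call_action action;
  int label; /* setjmp call-site label to resume at, otherwise -1 */
};

/* threw and threw_value model the loads of __THREW__ and __threwValue.
   found_label models the testSetjmp result described in the lead-in. */
static struct post_call_result classify_post_call(uintptr_t threw,
                                                  int threw_value,
                                                  int found_label) {
  struct post_call_result r = {CONTINUE_NORMALLY, -1};
  if (threw != 0 && threw_value != 0) {
    if (found_label == 0) {
      r.action = RETHROW_LONGJMP;
    } else {
      r.action = RESUME_AFTER_SETJMP;
      r.label = found_label;
    }
  }
  return r;
}

/* Example use: a longjmp whose buffer was registered as label 1. */
void example_post_call(void) {
  struct post_call_result r = classify_post_call(0x1000, 1, 1);
  assert(r.action == RESUME_AFTER_SETJMP);
  assert(r.label == 1);
}
```

In the IR, the `RESUME_AFTER_SETJMP` case corresponds to the final `switch` jumping back to `entry.split.split`, and the default case of that `switch` continues with the code after the call.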
; Test a case of a function call (which is not longjmp) after a setjmp
define void @setjmp_longjmpable_call() {
; CHECK-LABEL: @setjmp_longjmpable_call
entry:
%buf = alloca [1 x %struct.__jmp_buf_tag], align 16
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
%call = call i32 @setjmp(%struct.__jmp_buf_tag* %arraydecay) #0
call void @foo()
ret void
; CHECK: entry:
; CHECK: %[[SETJMP_TABLE:.*]] = call i32* @saveSetjmp(
; CHECK: entry.split.split:
; CHECK: @__invoke_void(void ()* @foo)
; CHECK: entry.split.split.split:
; CHECK-NEXT: %[[BUF:.*]] = bitcast i32* %[[SETJMP_TABLE]] to i8*
; CHECK-NEXT: tail call void @free(i8* %[[BUF]])
; CHECK-NEXT: ret void
}
; Test a case with multiple longjmpable calls after setjmp. Here we
; specifically check that the 'call.em.longjmp' BB, which rethrows longjmps
; that are not for this function's setjmp by calling emscripten_longjmp, is
; correctly created with multiple predecessors.
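; For orientation, a hedged sketch of C source that could lower to IR of this
; shape. This is an illustrative assumption; the test itself is written
; directly in IR:
;   #include <setjmp.h>
;   void foo(void);
;   void setjmp_multiple_longjmpable_calls(void) {
;     jmp_buf buf;
;     setjmp(buf);
;     foo();  /* may longjmp, so it is followed by a longjmp check */
;     foo();  /* second check; both feed the shared rethrowing BB */
;   }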
define void @setjmp_multiple_longjmpable_calls() {
; CHECK-LABEL: @setjmp_multiple_longjmpable_calls
entry:
%buf = alloca [1 x %struct.__jmp_buf_tag], align 16
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
%call = call i32 @setjmp(%struct.__jmp_buf_tag* %arraydecay) #0
call void @foo()
call void @foo()
ret void
; CHECK: call.em.longjmp:
; CHECK-NEXT: %threw.phi = phi [[PTR]] [ %__THREW__.val, %if.then1 ], [ %__THREW__.val4, %if.then15 ]
; CHECK-NEXT: %threwvalue.phi = phi i32 [ %__threwValue.val, %if.then1 ], [ %__threwValue.val8, %if.then15 ]
; CHECK-NEXT: %{{.*}} = bitcast i32* %[[SETJMP_TABLE1]] to i8*
; CHECK-NEXT: tail call void @free(i8* %{{.*}})
; CHECK-NEXT: call void @emscripten_longjmp([[PTR]] %threw.phi, i32 %threwvalue.phi)
; CHECK-NEXT: unreachable
}
; Test a case where a function has a setjmp call but no other calls that can
; longjmp. We don't need to do any transformation in this case.
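; A hedged C sketch of the situation being tested (illustrative only, not the
; origin of this IR):
;   #include <setjmp.h>
;   #include <stdlib.h>
;   void setjmp_only(void *ptr) {
;     jmp_buf buf;
;     setjmp(buf);
;     free(ptr);  /* free is known not to longjmp */
;   }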
define void @setjmp_only(i8* %ptr) {
; CHECK-LABEL: @setjmp_only
entry:
%buf = alloca [1 x %struct.__jmp_buf_tag], align 16
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
%call = call i32 @setjmp(%struct.__jmp_buf_tag* %arraydecay) #0
; free cannot longjmp
call void @free(i8* %ptr)
ret void
; CHECK-NOT: @malloc
; CHECK-NOT: %setjmpTable
; CHECK-NOT: @saveSetjmp
; CHECK-NOT: @testSetjmp
}
; Test SSA validity
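; A hedged C sketch that roughly corresponds to the IR below (an assumption for
; illustration; 'v' is a hypothetical name for the value loaded before setjmp
; and stored after it, which is the value that needs a phi once the BB is
; split):
;   #include <setjmp.h>
;   extern int global_var;
;   void ssa(int n) {
;     jmp_buf buf;
;     if (n > 5) {
;       int v = global_var;
;       setjmp(buf);
;       global_var = v;
;     }
;     longjmp(buf, 5);
;   }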
define void @ssa(i32 %n) {
; CHECK-LABEL: @ssa
entry:
%buf = alloca [1 x %struct.__jmp_buf_tag], align 16
%cmp = icmp sgt i32 %n, 5
br i1 %cmp, label %if.then, label %if.end
; CHECK: entry:
; CHECK: %[[SETJMP_TABLE0:.*]] = bitcast i8*
; CHECK: %[[SETJMP_TABLE_SIZE0:.*]] = add i32 4, 0
if.then: ; preds = %entry
%0 = load i32, i32* @global_var, align 4
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
%call = call i32 @setjmp(%struct.__jmp_buf_tag* %arraydecay) #0
store i32 %0, i32* @global_var, align 4
br label %if.end
; CHECK: if.then:
; CHECK: %[[VAR0:.*]] = load i32, i32* @global_var, align 4
; CHECK: %[[SETJMP_TABLE1:.*]] = call i32* @saveSetjmp(
; CHECK-NEXT: %[[SETJMP_TABLE_SIZE1:.*]] = call i32 @getTempRet0()
; CHECK: if.then.split:
; CHECK: %[[VAR1:.*]] = phi i32 [ %[[VAR2:.*]], %if.end3 ], [ %[[VAR0]], %if.then ]
; CHECK: %[[SETJMP_TABLE_SIZE2:.*]] = phi i32 [ %[[SETJMP_TABLE_SIZE1]], %if.then ], [ %[[SETJMP_TABLE_SIZE3:.*]], %if.end3 ]
; CHECK: %[[SETJMP_TABLE2:.*]] = phi i32* [ %[[SETJMP_TABLE1]], %if.then ], [ %[[SETJMP_TABLE3:.*]], %if.end3 ]
; CHECK: store i32 %[[VAR1]], i32* @global_var, align 4
if.end: ; preds = %if.then, %entry
%arraydecay1 = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
call void @longjmp(%struct.__jmp_buf_tag* %arraydecay1, i32 5) #1
unreachable
; CHECK: if.end:
; CHECK: %[[VAR2]] = phi i32 [ %[[VAR1]], %if.then.split ], [ undef, %entry.split ]
; CHECK: %[[SETJMP_TABLE_SIZE3]] = phi i32 [ %[[SETJMP_TABLE_SIZE2]], %if.then.split ], [ %[[SETJMP_TABLE_SIZE0]], %entry.split ]
; CHECK: %[[SETJMP_TABLE3]] = phi i32* [ %[[SETJMP_TABLE2]], %if.then.split ], [ %[[SETJMP_TABLE0]], %entry.split ]
}
; Test a case when a function only calls other functions that are neither setjmp nor longjmp
define void @other_func_only() {
; CHECK-LABEL: @other_func_only
entry:
call void @foo()
ret void
; CHECK: call void @foo()
}
; Test inline asm handling
define void @inline_asm() {
; CHECK-LABEL: @inline_asm
entry:
%env = alloca [1 x %struct.__jmp_buf_tag], align 16
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %env, i32 0, i32 0
%call = call i32 @setjmp(%struct.__jmp_buf_tag* %arraydecay) #4
; Inline assembly should not generate __invoke wrappers.
; Doing so would fail as inline assembly cannot be passed as a function pointer.
; CHECK: call void asm sideeffect "", ""()
; CHECK-NOT: __invoke_void
call void asm sideeffect "", ""()
ret void
}
; Test that the allocsize attribute is being transformed properly
declare i8 *@allocator(i32, %struct.__jmp_buf_tag*) #3
define i8 *@allocsize() {
; CHECK-LABEL: @allocsize
entry:
%buf = alloca [1 x %struct.__jmp_buf_tag], align 16
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
%call = call i32 @setjmp(%struct.__jmp_buf_tag* %arraydecay) #0
; CHECK: call cc{{.*}} i8* @"__invoke_i8*_i32_%struct.__jmp_buf_tag*"([[ARGS:.*]]) #[[ALLOCSIZE_ATTR:[0-9]+]]
%alloc = call i8* @allocator(i32 20, %struct.__jmp_buf_tag* %arraydecay) #3
ret i8 *%alloc
}
; Test a case when a function only calls longjmp and not setjmp
@buffer = global [1 x %struct.__jmp_buf_tag] zeroinitializer, align 16
define void @longjmp_only() {
; CHECK-LABEL: @longjmp_only
entry:
; CHECK: call void @emscripten_longjmp
call void @longjmp(%struct.__jmp_buf_tag* getelementptr inbounds ([1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* @buffer, i32 0, i32 0), i32 1) #1
unreachable
}
; Tests if SSA rewrite works when a use and its def are within the same BB.
define void @ssa_rewite_in_same_bb() {
; CHECK-LABEL: @ssa_rewite_in_same_bb
entry:
call void @foo()
br label %for.cond
for.cond: ; preds = %for.inc, %entry
; CHECK: %{{.*}} = phi i32 [ %var[[VARNO:.*]], %for.inc.split ]
%0 = phi i32 [ %var, %for.inc ], [ undef, %entry ]
%var = add i32 0, 0
br label %for.inc
for.inc: ; preds = %for.cond
%call5 = call i32 @setjmp(%struct.__jmp_buf_tag* undef) #0
br label %for.cond
; CHECK: for.inc.split:
; CHECK: %var[[VARNO]] = phi i32 [ undef, %if.end ], [ %var, %for.inc ]
}
; Test cases where the longjmp function pointer is used in ways other than
; direct calls. Each such use of longjmp should be replaced with
; (void(*)(jmp_buf*, int))emscripten_longjmp.
declare void @take_longjmp(void (%struct.__jmp_buf_tag*, i32)* %arg_ptr)
define void @indirect_longjmp() {
; CHECK-LABEL: @indirect_longjmp
entry:
%local_longjmp_ptr = alloca void (%struct.__jmp_buf_tag*, i32)*, align 4
%buf0 = alloca [1 x %struct.__jmp_buf_tag], align 16
%buf1 = alloca [1 x %struct.__jmp_buf_tag], align 16
; Store longjmp in a local variable, load it, and call it
store void (%struct.__jmp_buf_tag*, i32)* @longjmp, void (%struct.__jmp_buf_tag*, i32)** %local_longjmp_ptr, align 4
; CHECK: store void (%struct.__jmp_buf_tag*, i32)* bitcast (void ([[PTR]], i32)* @emscripten_longjmp to void (%struct.__jmp_buf_tag*, i32)*), void (%struct.__jmp_buf_tag*, i32)** %local_longjmp_ptr, align 4
%longjmp_from_local_ptr = load void (%struct.__jmp_buf_tag*, i32)*, void (%struct.__jmp_buf_tag*, i32)** %local_longjmp_ptr, align 4
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf0, i32 0, i32 0
call void %longjmp_from_local_ptr(%struct.__jmp_buf_tag* %arraydecay, i32 0)
; Load longjmp from a global variable and call it
%longjmp_from_global_ptr = load void (%struct.__jmp_buf_tag*, i32)*, void (%struct.__jmp_buf_tag*, i32)** @global_longjmp_ptr, align 4
%arraydecay1 = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf1, i32 0, i32 0
call void %longjmp_from_global_ptr(%struct.__jmp_buf_tag* %arraydecay1, i32 0)
; Pass longjmp as a function argument. This is a call, but longjmp is an
; argument here rather than the callee.
call void @take_longjmp(void (%struct.__jmp_buf_tag*, i32)* @longjmp)
; CHECK: call void @take_longjmp(void (%struct.__jmp_buf_tag*, i32)* bitcast (void ([[PTR]], i32)* @emscripten_longjmp to void (%struct.__jmp_buf_tag*, i32)*))
ret void
}
; Test if _setjmp and _longjmp calls are treated in the same way as setjmp and
; longjmp
define void @_setjmp__longjmp() {
; CHECK-LABEL: @_setjmp__longjmp
; These calls should have been transformed away
; CHECK-NOT: call i32 @_setjmp
; CHECK-NOT: call void @_longjmp
entry:
%buf = alloca [1 x %struct.__jmp_buf_tag], align 16
%arraydecay = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
%call = call i32 @_setjmp(%struct.__jmp_buf_tag* %arraydecay) #0
%arraydecay1 = getelementptr inbounds [1 x %struct.__jmp_buf_tag], [1 x %struct.__jmp_buf_tag]* %buf, i32 0, i32 0
call void @_longjmp(%struct.__jmp_buf_tag* %arraydecay1, i32 1) #1
unreachable
}
; Function Attrs: nounwind
declare void @foo() #2
; Function Attrs: returns_twice
declare i32 @setjmp(%struct.__jmp_buf_tag*) #0
declare i32 @_setjmp(%struct.__jmp_buf_tag*) #0
; Function Attrs: noreturn
declare void @longjmp(%struct.__jmp_buf_tag*, i32) #1
declare void @_longjmp(%struct.__jmp_buf_tag*, i32) #1
declare i32 @__gxx_personality_v0(...)
declare i8* @__cxa_begin_catch(i8*)
declare void @__cxa_end_catch()
declare i8* @malloc(i32)
declare void @free(i8*)
; Declarations of JS glue functions and invoke wrappers
; CHECK-DAG: declare i32 @getTempRet0()
; CHECK-DAG: declare void @setTempRet0(i32)
; CHECK-DAG: declare i32* @saveSetjmp(%struct.__jmp_buf_tag*, i32, i32*, i32)
; CHECK-DAG: declare i32 @testSetjmp([[PTR]], i32*, i32)
; CHECK-DAG: declare void @emscripten_longjmp([[PTR]], i32)
; CHECK-DAG: declare void @__invoke_void(void ()*)
attributes #0 = { returns_twice }
attributes #1 = { noreturn }
attributes #2 = { nounwind }
attributes #3 = { allocsize(0) }
; CHECK-DAG: attributes #{{[0-9]+}} = { nounwind "wasm-import-module"="env" "wasm-import-name"="getTempRet0" }
; CHECK-DAG: attributes #{{[0-9]+}} = { nounwind "wasm-import-module"="env" "wasm-import-name"="setTempRet0" }
; CHECK-DAG: attributes #{{[0-9]+}} = { "wasm-import-module"="env" "wasm-import-name"="__invoke_void" }
; CHECK-DAG: attributes #{{[0-9]+}} = { "wasm-import-module"="env" "wasm-import-name"="saveSetjmp" }
; CHECK-DAG: attributes #{{[0-9]+}} = { "wasm-import-module"="env" "wasm-import-name"="testSetjmp" }
; CHECK-DAG: attributes #{{[0-9]+}} = { noreturn "wasm-import-module"="env" "wasm-import-name"="emscripten_longjmp" }
; CHECK-DAG: attributes #{{[0-9]+}} = { "wasm-import-module"="env" "wasm-import-name"="__invoke_i8*_i32_%struct.__jmp_buf_tag*" }
; CHECK-DAG: attributes #[[ALLOCSIZE_ATTR]] = { allocsize(1) }
!llvm.dbg.cu = !{!2}
!llvm.module.flags = !{!0}
!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = !DIFile(filename: "lower-em-sjlj.c", directory: "test")
!2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1)
!3 = distinct !DISubprogram(name: "setjmp_debug_info", unit:!2, file: !1, line: 1)
!4 = !DILocation(line:2, scope: !3)
!5 = !DILocation(line:3, scope: !3)
!6 = !DILocation(line:4, scope: !3)
!7 = !DILocation(line:5, scope: !3)
!8 = !DILocation(line:6, scope: !3)