When accumulating the GEP offset in BasicAA, we should use the
pointer index size rather than the pointer size.
Differential Revision: https://reviews.llvm.org/D112370
[NFC] This patch fixes URLs containing "master". Old URLs were either broken or
redirecting to the new URL.
Reviewed By: #libc, ldionne, mehdi_amini
Differential Revision: https://reviews.llvm.org/D113186
This is a reworked version of the reverted patch: https://reviews.llvm.org/D112487
Note that
a) it doesn't need the test changes anymore, and
b) I checked at least locally it passes other.test_pthread_lsan_leak
Differential Revision: https://reviews.llvm.org/D113208
To support return (it not being supported well was the ground cause for
https://github.com/WebAssembly/wasi-sdk/issues/200) we also have to have
at least a basic notion of unreachable, which in this case just means to stop
type checking until there is an end_block (an incoming control flow edge).
This is conservative (may miss on some type checking opportunities) but is
simple and an improvement over what we had before.
Differential Revision: https://reviews.llvm.org/D112953
Add i32x4.relaxed_trunc_f32x4_s, i32x4.relaxed_trunc_f32x4_u,
i32x4.relaxed_trunc_f64x2_s_zero, i32x4.relaxed_trunc_f64x2_u_zero.
These are only exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112186
Add relaxed. f32x4.min, f32x4.max, f64x2.min, f64x2.max. These are only
exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112146
Our fallback expansion for CTLZ/CTTZ relies on CTPOP. If CTPOP
isn't legal or custom for a vector type we would scalarize the
CTLZ/CTTZ. This is different than CTPOP itself which would use a
vector expansion.
This patch teaches expandCTLZ/CTTZ to rely on the vector CTPOP
expansion instead of scalarizing. To do this I had to add additional
checks to make sure the operations used by CTPOP expansions are all
supported. Some of the operations were already needed for the CTLZ/CTTZ
expansion.
This is a huge improvement to the RISCV which doesn't have a scalar
ctlz or cttz in the base ISA.
For WebAssembly, I've added Custom lowering to keep the scalarizing
behavior. I've also extended the scalarizing to CTPOP.
Differential Revision: https://reviews.llvm.org/D111919
This change implements new DAG nodes TABLE_GET/TABLE_SET, and lowering
methods for load and stores of reference types from IR arrays. These
global LLVM IR arrays represent tables at the Wasm level.
Differential Revision: https://reviews.llvm.org/D111154
Add i8x16 relaxed_swizzle instructions. These are only
exposed as builtins, and require user opt-in.
Differential Revision: https://reviews.llvm.org/D112022
This makes Wasm EH work with dynamic linking. So far we were only able
to handle destructors, which do not use any tags or LSDA info.
1. This uses `TargetExternalSymbol` for `GCC_except_tableN` symbols,
which points to the address of per-function LSDA info. It is more
convenient to use than `MCSymbol` because it can take additional
target flags.
2. When lowering `wasm_lsda` intrinsic, if PIC is enabled, make the
symbol relative to `__memory_base` and generate the `add` node. If
PIC is disabled, continue to use the absolute address.
3. Make tag symbols (`__cpp_exception` and `__c_longjmp`) undefined in
the backend, because it is hard to make it work with dynamic
linking's loading order. Instead, we make all tag symbols undefined
in the LLVM backend and import it from JS.
4. Add support for undefined tags to the linker.
Companion patches:
- https://github.com/WebAssembly/binaryen/pull/4223
- https://github.com/emscripten-core/emscripten/pull/15266
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D111388
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
Based on the reasoning of D53903, register operands of DBG_VALUE are
invariably treated as RegState::Debug operands. This change enforces
this invariant as part of MachineInstr::addOperand so that all passes
emit this flag consistently.
RegState::Debug is inconsistently set on DBG_VALUE registers throughout
LLVM. This runs the risk of a filtering iterator like
MachineRegisterInfo::reg_nodbg_iterator to process these operands
erroneously when not parsed from MIR sources.
This issue was observed in the development of the llvm-mos fork which
adds a backend that relies on physical register operands much more than
existing targets. Physical RegUnit 0 has the same numeric encoding as
$noreg (indicating an undef for DBG_VALUE). Allowing debug operands into
the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision
of register numbers with different zero semantics). Eventually, this
causes an assert where DBG_VALUE instructions are prohibited from
participating in live register ranges.
Reviewed By: MatzeB, StephenTozer
Differential Revision: https://reviews.llvm.org/D110105
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
The currently implementation of funcrefs is broken since it is putting
the funcref itself on the stack before the call_indirect. Instead what
should be on the stack is the constant 0, which is the index at which
we store the funcref in __funcref_call_table.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D111152
This is a non-functional change to remove the duplicate
WasmAddressSpace enum and refactor reftype predicates by moving them
to the Utilities source file.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D111144
This removes `WasmTagType`. `WasmTagType` contained an attribute and a
signature index:
```
struct WasmTagType {
uint8_t Attribute;
uint32_t SigIndex;
};
```
Currently the attribute field is not used and reserved for future use,
and always 0. And that this class contains `SigIndex` as its property is
a little weird in the place, because the tag type's signature index is
not an inherent property of a tag but rather a reference to another
section that changes after linking. This makes tag handling in the
linker also weird that tag-related methods are taking both `WasmTagType`
and `WasmSignature` even though `WasmTagType` contains a signature
index. This is because the signature index changes in linking so it
doesn't have any info at this point. This instead moves `SigIndex` to
`struct WasmTag` itself, as we did for `struct WasmFunction` in D111104.
In this CL, in lib/MC and lib/Object, this now treats tag types in the
same way as function types. Also in YAML, this removes `struct Tag`,
because now it only contains the tag index. Also tags set `SigIndex` in
`WasmImport` union, as functions do.
I think this makes things simpler and makes tag handling more in line
with function handling. These two shares similar properties in that both
of them have signatures, but they are kind of nominal so having the same
signature doesn't mean they are the same element.
Also a drive-by fix: the reserved 'attirubute' part's encoding changed
from uleb32 to uint8 a while ago. This was fixed in lib/MC and
lib/Object but not in YAML. This doesn't change object files because the
field's value is always 0 and its encoding is the same for the both
encoding.
This is effectively NFC; I didn't mark it as such just because it
changed YAML test results.
Reviewed By: sbc100, tlively
Differential Revision: https://reviews.llvm.org/D111086
Based off a discussion on D110100, we should be avoiding default CostKinds whenever possible.
This initial patch removes them from the 'inner' target implementation callbacks - these should only be used by the main TTI calls, so this should guarantee that we don't cause changes in CostKind by missing it in an inner call. This exposed a few missing arguments in getGEPCost and reduction cost calls that I've cleaned up.
Differential Revision: https://reviews.llvm.org/D110242
This reverts commit b7b4ebbcfa.
Reason: This breaks several code-size tests in Emscripten test suite
because this exports `emscripten_longjmp` for programs that didn't do it
before.
We previously had a limitation that TLS variables could not
be exported (and therefore could also not be imported). This
change removed that limitation.
Differential Revision: https://reviews.llvm.org/D108877
This is a fix on top of D106525's Case 2. In D106525, in
`runEHOnFunction` which handles Emscripten EH, We rethrow `longjmp` only
when the module has any usage of `setjmp` or `longjmp`. But now Wasm
object files are linked using wasm-ld, the module this pass sees is not
the whole program, and even if this module does not contain any
`longjmp`, another file can contain it and can be linked with the
current module. This enables the rethrowing of longjmp whenever
Emscripten SjLj is enabled, regardless of whether it is used in this
module or not.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D109670
Both Wasm & Emscripten SjLj handling has a restriction that `setjmp`
cannot be called indirectly. I thought we have been erroring out on
indirect uses of `setjmp`, but some recent CL disrupted the logic and
we are not erroring out anymore.
We currently
1. Collect functions that contain `setjmp` calls in `SetjmpUsers`. This
only counts direct calls:
8f77dc459e/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L869-L878)
2. Run `runSjLjOnFunction` only on those `SetjmpUsers`. Within
`runSjLjOnFunction`, if we see an indirect use of `setjmp`, we error
out:
8f77dc459e/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1218-L1221)
So if there are only indirect setjmp calls within the module,
`SetjmpUsers` will be empty, and `runSjLjOnFunction` is not even entered
once. And the indirect `setjmp` call will error out at link time. So in
this CL we check for the indirect uses of `setjmp` upfront before we
enter `runSjLjOnFunction`.
Also this currently errors out on `invoke @setjmp`, which can only occur
when using Wasm EH + Wasm SjLj within a function. We recently added Wasm
SjLj support but we don't support using Wasm EH + Wasm SjLj in the same
function yet. We plan to add this support very soon, so I don't think
it's worth creating another test file just for this. (This is an error
test so it needs its own file)
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D109375
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
The change here is basically the same as in D108880: Rather than
looking at bitcasts, look at calls and their function type. We
still need to look through bitcasts to find those calls.
The change in llvm/test/CodeGen/WebAssembly/add-prototypes-conflict.ll
is due to different visitation order. add-prototypes-opaque-ptrs.ll
is a copy of add-prototypes.ll with -force-opaque-pointers.
Differential Revision: https://reviews.llvm.org/D109256
This ISD node/wrapper represents am address which is relative to a base
address and therefore lowers to `i32.const` rather than `global.get`.
Use this wrapper type for TLS-relative addresses, paving the way for the
non-REL wrapper to be used to external TLS address once those are
supported.
Differential Revision: https://reviews.llvm.org/D109179
This add support for SjLj using Wasm exception handling instructions:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md
This does not yet support the mixed use of EH and SjLj within a
function. It will be added in a follow-up CL.
This currently passes all SjLj Emscripten tests for wasm0/1/2/3/s,
except for the below:
- `test_longjmp_standalone`: Uses Node
- `test_dlfcn_longjmp`: Uses NodeRAWFS
- `test_longjmp_throw`: Mixes EH and SjLj
- `test_exceptions_longjmp1`: Mixes EH and SjLj
- `test_exceptions_longjmp2`: Mixes EH and SjLj
- `test_exceptions_longjmp3`: Mixes EH and SjLj
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D108960
With opaque pointers, no actual bitcasts will be present. Instead,
there will be a mismatch between the call FunctionType and the
function ValueType. Change the code to collect CallBases
specifically (rather than general Uses) and compare these types.
RAUW is no longer performed, as there would no longer be any
bitcasts that can be RAUWd.
Differential Revision: https://reviews.llvm.org/D108880
Previously extra wide v4f32 to v4f64 extending loads would be legalized to v2f32
to v2f64 extending loads, which would then be scalarized by legalization. (v2f32
to v2f64 extending loads not produced by legalization were already being emitted
correctly.) Instead, mark v2f32 to v2f64 extending loads as legal and explicitly
lower them using promote_low. This regresses the addressing modes supported for
the extloads not produced by legalization, but that's a fine trade off for now.
Differential Revision: https://reviews.llvm.org/D108496
This is an improvement over D107852. We don't need to enumerate specific
function names; we can just check for `noreturn` attribute. This also
requires us to make sure `__resumeExeption` and `emscripten_longjmp`
have `noreturn` attribute too; one of them is a JS function and the
other calls a JS function so Clang does not have a way to deduce they
don't return.
This is effectively NFC, because I'm not sure if there is an additional
case this case covers; if we add a custom function call that has
`noreturn` attribute, it will be processed within the SjLj handling and
turned into `__invoke` call. So this really applies to some special
functions like `emscripten_longjmp`.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D108955
There are three kinds of "rethrowing" BBs in this pass:
1. In Emscripten SjLj, after a possibly longjmping function call, we
check if the thrown longjmp corresponds to one of setjmps within the
current function. If not, we rethrow the longjmp by calling
`emscripten_longjmp`.
2. In Emscripten EH, after a possibly throwing function call, we check
if the thrown exception corresponds to the current `catch` clauses.
If not, we rethrow the exception by calling `__resumeException`.
3. When both Emscripten EH and SjLj are used, when we check for an
exception after a possibly throwing function call, it is possible
that we get not an exception but a longjmp. In this case, we
shouldn't swallow it; we should rethrow the longjmp by calling
`emscripten_longjmp`.
4. When both Emscripten EH and SjLj are used, when we check for a
longjmp after a possibly longjmping function call, it is possible
that we get not a longjmp but an exception. In this case, we
shouldn't swallot it; we should rethrow the exception by calling
`__resumeException`.
Case 1 is in Emscripten SjLj, 2 is in Emscripten EH, and 3 and 4 are
relevant when both Emscripten EH and SjLj are used. 3 and 4 were first
implemented in D106525.
We create BBs for 1, 3, and 4 in this pass. We create those BBs for
every throwing/longjmping function call, along with other BBs that
contain condition checks. What this CL does is to create a single BB
within a function for each of 1, 3, and 4 cases. These BBs are exiting
BBs in the function and thus don't have successors, so easy to be shared
between calls.
The names of BBs created are:
Case 1: `call.em.longjmp`
Case 3: `rethrow.exn`
Case 4: `rethrow.longjmp`
For the case 2 we don't currently create BBs; we only replace the
existing `resume` instruction with `call @__resumeException`. And Clang
already creates only a single `resume` BB per function and reuses it,
so we don't need to optimize this case.
Not sure what are good benchmarks for EH/SjLj, but this decreases the
size of the object file for `grfmt_jpeg.bc` (presumably from opencv) we
got from one of our users by 8.9%. Even after running `wasm-opt -O4` on
them, there is still 4.8% improvement.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D108945
If the icmp is in a different block, then the register for the icmp
operand may not be initialized, as it nominally does not have
cross-block uses. Add a check that the icmp is in the same block
as the branch, which should be the common case.
This matches what X86 FastISel does:
5b6b090cf2/llvm/lib/Target/X86/X86FastISel.cpp (L1648)
The "not" transform that could have a similar issue is dropped
entirely, because it is currently dead: The incoming value is
a branch or select condition of type i1, but this code requires
an i32 to trigger.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51651.
Differential Revision: https://reviews.llvm.org/D108840
When doing Emscritpen EH, if SjLj is also enabled and used and if the
thrown exception has a possiblity being a longjmp instead of an
exception, we shouldn't swallow it; we should rethrow, or relay it. It
was done in D106525 and the code is here:
8441a8eea8/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L858-L898)
Here is the pseudocode of that part: (copied from comments)
```
if (%__THREW__.val == 0 || %__THREW__.val == 1)
goto %tail
else
goto %longjmp.rethrow
longjmp.rethrow: ;; This is longjmp. Rethrow it
%__threwValue.val = __threwValue
emscripten_longjmp(%__THREW__.val, %__threwValue.val);
tail: ;; Nothing happened or an exception is thrown
... Continue exception handling ...
```
If the current BB (where the `invoke` is created) has successors that
has the current BB as its PHI incoming node, now that has to change to
`tail` in the pseudocode, because `tail` is the latest BB that is
connected with the next BB, but this was missing.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108785
Emscripten SjLj transformation is done in four steps. This will be
mostly the same for the soon-to-be-added Wasm SjLj; the step 1, 3, and 4
will be shared and there will be separate way of doing step 2.
1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB
2. Handle `setjmp` callsites
3. Handle `longjmp` callsites
4. Cleanup and update SSA
We initialize `setjmpTable` and `setjmpTableSize` in the entry BB. But
if the entry BB contains a `setjmp` call, some `setjmp` handling
transformation will also happen in the entry BB, such as calling
`saveSetjmp`.
This is fine for Emscripten SjLj but not for Wasm SjLj, because in Wasm
SjLj we will add a dispatch BB that contains a `switch` right after the
entry BB, from which we jump to one of post-`setjmp` BBs. And this
dispatch BB should precede all `setjmp` calls.
Emscripten SjLj (current):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
...
call @saveSetjmp(...)
```
Wasm SjLj (follow-up):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
setjmp.dispatch:
...
; Jump to the right post-setjmp BB, if we are returning from a
; longjmp. If this is the first setjmp call, go to %entry.split.
switch i32 %no, label %entry.split [
i32 1, label %post.setjmp1
i32 2, label %post.setjmp2
...
i32 N, label %post.setjmpN
]
entry.split:
...
call @saveSetjmp(...)
```
So in Wasm SjLj we split the entry BB to make the entry block only for
`setjmpTable` and `setjmpTableSize` initialization and insert a
`setjmp.dispatch` BB. (This part is not in this CL. This will be a
follow-up.) But note that Emscripten SjLj and Wasm SjLj share all
steps except for the step 2. If we only split the entry BB only for Wasm
SjLj, there will be one more `if`-`else` and the code will be more
complicated.
So this CL splits the entry BB in Emscripten SjLj and put only
initialization stuff there as follows:
Emscripten SjLj (this CL):
```
entry:
%setjmpTable = ...
%setjmpTableSize = ...
br %entry.split
entry.split:
...
call @saveSetjmp(...)
```
This is just done to share code with Wasm SjLj. It adds an unnecessary
branch but this will be removed in later optimization passes anyway.
This is in effect NFC, meaning the program behavior will not change, but
existing ll tests files have changed because the entry block was split.
The reason I upload this in a separate CL is to make the Wasm SjLj diff
tidier, because this changes many existing Emscripten SjLj tests, which
can be confusing for the follow-up Wasm SjLj CL.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108729
Emscripten SjLj and (soon-to-be-added) Wasm SjLj transformation share
many steps:
1. Initialize `setjmpTable` and `setjmpTableSize` in the entry BB
2. Handle `setjmp` callsites
3. Handle `longjmp` callsites
4. Cleanup and update SSA
1, 3, and 4 are identical for Emscripten SjLj and Wasm SjLj. Only the
step 2 is different. This CL extracts the current Emscripten SjLj's
longjmp callsites handling into a function. The reason to make this a
separate CL is, without this, the diff tool cannot compare things well
in the presence of moved code and added code in the followup Wasm SjLj
CL, and it ends up mixing them together, making the diff unreadable.
Also fixes some typos and variable names. So far we've been calling the
buffer argument to `setjmp` and `longjmp` `jmpbuf`, but the name used in
the man page for those functions is `env`, so updated them to be
consistent.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108728
The plan was to use `wasm.catch.exn` intrinsic to catch exceptions and
add `wasm.catch.longjmp` intrinsic, that returns two values (setjmp
buffer and return value), later to catch longjmps. But because we
decided not to use multivalue support at the moment, we are going to use
one intrinsic that returns a single value for both exceptions and
longjmps. And even if it's not for that, I now think the naming of
`wasm.catch.exn` is a little weird, because the intrinsic can still take
a tag immediate, which means it can be used for anything, not only
exceptions, as long as that returns a single value.
This partially reverts D107405.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D108683
We update SSA in two steps in Emscripten SjLj:
1. Rewrite uses of `setjmpTable` and `setjmpTableSize` variables and
place `phi`s where necessary, which are updated where we call
`saveSetjmp`.
2. Do a whole function level SSA update for all variables, because we
split BBs where `setjmp` is called and there are possibly variable
uses that are not dominated by a def.
(See 955b91c19c/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1314-L1324))
We have been using `SSAUpdater` to do this, but `SSAUpdaterBulk` class
was added after this pass was first created, and for the step 2 it looks
like a better alternative with a possible performance benefit. Not sure
the author is aware of it, but `SSAUpdaterBulk` seems to have a
limitation: it cannot handle a use within the same BB as a def but
before it. For example:
```
... = %a + 1
%a = foo();
```
or
```
%a = %a + 1
```
The uses `%a` in RHS should be rewritten with another SSA variable of
`%a`, most likely one generated from a `phi`. But `SSAUpdaterBulk`
thinks all uses of `%a` are below the def of `%a` within the same BB.
(`SSAUpdater` has two different functions of rewriting because of this:
`RewriteUse` and `RewriteUseAfterInsertions`.) This doesn't affect our
usage in the step 2 because that deals with possibly non-dominated uses
by defs after block splitting. But it does in the step 1, which still
uses `SSAUpdater`.
But this CL also simplifies the step 1 by using `make_early_inc_range`,
removing the need to advance the iterator before rewriting a use.
This is NFC; the test changes are just the order of PHI nodes.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D108583
This CL is small, but the description can be a little long because I'm
trying to sum up the status quo for Emscripten/Wasm EH/SjLj options.
First, this CL adds an option for Wasm SjLj (`-wasm-enable-sjlj`), which
handles SjLj using Wasm EH. The implementation for this will be added as
a followup CL, but this adds the option first to do error checking.
This also adds an option for Wasm EH (`-wasm-enable-eh`), which has been
already implemented. Before we used `-exception-model=wasm` as the same
meaning as enabling Wasm EH, but after we add Wasm SjLj, it will be
possible to use Wasm EH instructions for Wasm SjLj while not enabling
EH, so going forward, to use Wasm EH, `opt` and `llc` will need this
option. This only affects `opt` and `llc` command lines and does not
affect Emscripten user interface.
Now we have two modes of EH (Emscripten/Wasm) and also two modes of SjLj
(also Emscripten/Wasm). The options corresponding to each of are:
- Emscripten EH: `-enable-emscripten-cxx-exceptions`
- Emscripten SjLj: `-enable-emscripten-sjlj`
- Wasm EH: `-wasm-enable-eh -exception-model=wasm`
`-mattr=+exception-handling`
- Wasm SjLj: `-wasm-enable-sjlj -exception-model=wasm`
`-mattr=+exception-handling`
The reason Wasm EH/SjLj's options are a little complicated are
`-exception-model` and `-mattr` are common LLVM options ane not under
our control. (`-mattr` can be omitted if it is embedded within the
bitcode file.)
And we have the following rules of the option composition:
- Emscripten EH and Wasm EH cannot be turned on at the same itme
- Emscripten SjLj and Wasm SjLj cannot be turned on at the same time
- Wasm SjLj should be used with Wasm EH
Which means we now allow these combinations:
- Emscripten EH + Emscripten SjLj: the current default in `emcc`
- Wasm EH + Emscripten SjLj:
This is allowed, but only as an interim step in which we are testing
Wasm EH but not yet have a working implementation of Wasm SjLj. This
will error out (D107687) in compile time if `setjmp` is called in a
function in which Wasm exception is used.
- Wasm EH + Wasm SjLj:
This will be the default mode later when using Wasm EH. Currently Wasm
SjLj implementation doesn't exist, so it doesn't work.
- Emscripten EH + Wasm SjLj will not work.
This CL moves these error checking routines to
`WebAssemblyPassConfig::addIRPasses`. Not sure if this is an ideal place
to do this, but I couldn't find elsewhere. Currently some checking is
done within LowerEmscriptenEHSjLj, but these checks only run if
LowerEmscriptenEHSjLj runs so it may not run when Wasm EH is used. This
moves that to `addIRPasses` and adds some more checks.
Currently LowerEmscriptenEHSjLj pass is responsible for Emscripten EH
and Emscripten SjLj. Wasm EH transformations are done in multiple
places, including WasmEHPrepare, LateEHPrepare, and CFGStackify. But in
the followup CL, LowerEmscriptenEHSjLj pass will be also responsible for
a part of Wasm SjLj transformation, because WasmSjLj will also be using
several Emscripten library functions, and we will be sharing more than
half of the transformation to do that between Emscripten SjLj and Wasm
SjLj.
Currently we have `-enable-emscripten-cxx-exceptions` and
`-enable-emscripten-sjlj` but these only work for `llc`, because for
`llc` we feed these options to the pass but when we run the pass using
`opt` the pass will be created with no options and the default options
will be used, which turns both Emscripten EH and Emscripten SjLj on.
Now we have one more SjLj option to care for, LowerEmscriptenEHSjLj pass
needs a finer way to control these options. This CL removes those
default parameters and make LowerEmscriptenEHSjLj pass read directly
from command line options specified. So if we only run
`opt -wasm-lower-em-ehsjlj`, currently both Emscripten EH and Emscripten
SjLj will run, but with this CL, none will run unless we additionally
pass `-enable-emscripten-cxx-exceptions` or `-enable-emscripten-sjlj`,
or both. This does not affect users; this only affects our `opt` tests
because `emcc` will not call either `opt` or `llc`. As a result of this,
our existing Emscripten EH/SjLj tests gained one or both of those
options in their `RUN` lines.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107685
Fixes PR51605 in which a DAG combine and legalization sequence generated
out-of-range constants in BUILD_VECTOR lanes. In the v16i8 case, the constants
were 255, which would be in range if DAG ISel used unsigned constants, but it is
out of range because DAG ISel uses signed constants.
Differential Revision: https://reviews.llvm.org/D108669
Partially reverts 85157c0079, which had removed these builtins and intrinsics
in favor of normal codegen patterns. It turns out that it is possible for the
patterns to be split over multiple basic blocks, however, which means that DAG
ISel is not able to select them to the pmin/pmax instructions. To make sure the
SIMD intrinsics generate the correct instructions in these cases, reintroduce
the clang builtins and corresponding LLVM intrinsics, but also keep the normal
pattern matching as well.
Differential Revision: https://reviews.llvm.org/D108387
The convert_low and promote_low instructions can widen the lower two lanes of a
four-lane vector, but we were previously scalarizing patterns that widened lanes
besides the low two lanes. The commit adds a shuffle to move the widened lanes
into the low lane positions so the convert_low and promote_low instructions can
be used instead of scalarizing.
Depends on D108266.
Differential Revision: https://reviews.llvm.org/D108341
Since the simplest DAG patterns for convert_low and promote_low instructions
involved v2i32, v2f32, v4i64, and v4f64 types, which are not legal in the
WebAssembly backend and would be eliminated by type legalization, we were
previously matching those patterns in a DAG combine before the type legalization
stage. However in cases where the vectors were wider than 128 bits, the patterns
we matched were not created until the type legalization stage when the wide
vectors were split up. Type legalization would continue to eliminate the illegal
types we were matching as well, so the code ended up scalarized.
To make the ISel for these instructions more robust, match the scalarized
patterns rather than the patterns containing illegal types. Add tests with
double-wide vectors to show that this works as intended.
Fixes PR51098.
Depends on D107502.
Differential Revision: https://reviews.llvm.org/D108266
The default legalization of unsupported vector types is to promote the integers
in each lane, which leads to extra sign or zero extending and masking when
moving data into and out of vectors. Switch our preferred type legalization from
the default to vector widening, which keeps the data in the low lanes of the
vector rather than in the low bits of each lane. The unused high lanes can be
ignored.
Half-wide vectors are now loaded from memory into the low 64 bits of the v128
rather than spread out among the lanes. As a result, v128.load64_splat is a much
more common operation, so add new patterns to support it.
Differential Revision: https://reviews.llvm.org/D107502
AttributeList::hasAttribute() is confusing, use clearer methods like
hasParamAttr()/hasRetAttr().
Add hasRetAttr() since it was missing from AttributeList.
For SjLj, we allocate a table to record setjmp buffer info in the entry
of each setjmp-calling function by inserting a `malloc` call, and insert
a `free` call to free the buffer before each `ret` instruction.
But this is not sufficient; we have to free the buffer before we throw.
In SjLj handling, normal functions that can possibly throw or longjmp
are wrapped with an invoke and caught within the function so they don't
end up escaping the function. But three functions throw and escape the
function:
- `__resumeException` (Emscripten library function used for Emscripten
EH)
- `emscripten_longjmp` (Emscripten library function used for Emscripten
SjLj)
- `__cxa_throw` (libc++abi function called when for C++ `throw` keyword)
The first two functions are used to rethrow the current
exception/longjmp when the caught exception/longjmp is not for the
current function. `__cxa_throw` is used for exception, and because we
consider that a function that cannot longjmp, it escapes the function
right away, before which we should free the buffer.
Currently `lsan.test_longjmp3` and `lsan.test_exceptions_longjmp3` fail
in Emscripten; this CL fixes these.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107852
Currently, when Wasm EH is used with Emscripten SjLj, Emscripten SjLj
cannot handle `invoke` instructions - it assumes all `invoke`s have been
lowered away with Emscripten EH. But in Wasm EH they are lowered in
instruction selection, so they are still present in the IR stage. This
happens when
1. Wasm EH and Emscripten SjLj are used together
2. A function that calls `setjmp` uses exceptions, i.e., has `invoke`s
We were already erroring out with an assertion failure in this case, but
this CL makes it error out more properly with a valid error message.
Wasm EH + Wasm SjLj will not have this restrictions. (it will have
another restriction though, e.g., `setjmp` cannot be called within
`catch`. But why would anyone do that..)
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107687
When there is a `setjmp` call in a function, we transform every callsite
of `setjmp` to record its information by calling `saveSetjmp` function,
and we also transform every callsite of a function that can longjmp to
to check if a longjmp occurred and if so jump to the corresponding
post-setjmp BB. Currently we are doing this for every function that
contains a call to `setjmp`, but if there is no other function call
within that function that can longjmp, this transformation of `setjmp`
callsite and all the preparation of `setjmpTable` in the entry of the
function are not necessary.
This checks if a setjmp-calling function has any other calls that can
longjmp, and if not, skips the function for the SjLj transformation.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107530
`catch` instruction can have any number of result values depending on
its tag, but so far we have only needed a single i32 return value for
C++ exception so the instruction was specified that way. But using the
instruction for SjLj handling requires multiple return values.
This makes `catch` instruction's results variadic and moves selection of
`throw` and `catch` instruction from ISelLowering to ISelDAGToDAG.
Moving `catch` to ISelDAGToDAG is necessary because I am not aware of
a good way to do instruction selection for variadic output instructions
in TableGen. This also moves `throw` because 1. `throw` and `catch`
share the same utility function and 2. there is really no reason we
should do that in ISelLowering in the first place. What we do is mostly
the same in both places, and moving them to ISelDAGToDAG allows us to
remove unnecessary mid-level nodes for `throw` and `catch` in
WebAssemblyISD.def and WebAssemblyInstrInfo.td.
This also adds handling for new `catch` instruction to AsmTypeCheck.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D107423
- Rename `wasm.catch` intrinsic to `wasm.catch.exn`, because we are
planning to add a separate `wasm.catch.longjmp` intrinsic which
returns two values.
- Rename several variables
- Remove an unnecessary parameter from `canLongjmp` and `isEmAsmCall`
from LowerEmscriptenEHSjLj pass
- Add `-verify-machineinstrs` in a test for a safety measure
- Add more comments + fix some errors in comments
- Replace `std::vector` with `SmallVector` for cases likely with small
number of elements
- Renamed `EnableEH`/`EnableSjLj` to `EnableEmEH`/`EnableEmSjLj`: We are
soon going to add `EnableWasmSjLj`, so this makes the distincion
clearer
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D107405
Add new pass LowerRefTypesIntPtrConv to generate debugtrap
instruction for an inttoptr and ptrtoint of a reference type instead
of erroring, since calling these instructions on non-integral pointers
has been since allowed (see ac81cb7e6).
Differential Revision: https://reviews.llvm.org/D107102
I'm not sure this is the best way to approach this,
but the situation is rather not very detectable unless we explicitly call it out when refusing to advise to unroll.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D107271
Add new pass LowerRefTypesIntPtrConv to generate trap
instruction for an inttoptr and ptrtoint of a reference type instead
of erroring, since calling these instructions on non-integral pointers
has been since allowed (see ac81cb7e6).
Differential Revision: https://reviews.llvm.org/D107102
This optimizes out the mask when the result of a bitmask is interpreted as an i8
or i16 value. Resolves PR50507.
Differential Revision: https://reviews.llvm.org/D107103
Replace the clang builtins and LLVM intrinsics for the SIMD extmul instructions
with normal codegen patterns.
Differential Revision: https://reviews.llvm.org/D106724
When Emscripten EH mixes with Emscripten SjLj, we are not currently
handling some of them correctly. There are three cases:
1. The current function calls `setjmp` and there is an `invoke` to a
function that can either throw or longjmp. In this case, we have to
check both for exception and longjmp. We are currently handling this
case correctly:
0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L1058-L1090)
When inserting routines for functions that can longjmp, which we do
only for setjmp-calling functions, we check if the function was
previously an `invoke` and handle it correctly.
2. The current function does NOT call `setjmp` and there is an `invoke`
to a function that can either throw or longjmp. Because there is no
`setjmp` call, we haven't been doing any check for functions that can
longjmp. But in that case, for `invoke`, we only check for an
exception and if it is not an exception we reset `__THREW__` to 0,
which can silently swallow the longjmp:
0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L70-L80)
This CL fixes this.
3. The current function calls `setjmp` and there is no `invoke`. Because
it is not an `invoke`, we haven't been doing any check for functions
that can throw, and only insert longjmp-checking routines for
functions that can longjmp. But in that case, if a longjmpable
function throws, we only check for a longjmp so if it is not a
longjmp we reset `__THREW__` to 0, which can silently swallow the
exception:
0c0eb76782/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp (L156-L169)
This CL fixes this.
To do that, this moves around some code, so we register necessary
functions for both EH and SjLj and precompute some data (the set of
functions that contains `setjmp`) before doing actual EH or SjLj
transformation.
This CL makes 2nd and 3rd tests in
https://github.com/emscripten-core/emscripten/pull/14732 work.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D106525
Both `__THREW__` and `__threwValue` are global variables, and we have
been distinguishing the global variable `__THREW__` and the loaded value
`%__THREW__.val` in comments but not doing it for `__threwValue`. Made
the pseudocode comments consistent for both variables.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D106524
Replace the clang builtins and LLVM intrinsics for {f32x4,f64x2}.{pmin,pmax}
with standard codegen patterns. Since wasm_simd128.h uses an integer vector as
the standard single vector type, the IR for the pmin and pmax intrinsic
functions contains bitcasts that would not be there otherwise. Add extra codegen
patterns that can still select the pmin and pmax instructions in the presence of
these bitcasts.
Differential Revision: https://reviews.llvm.org/D106612
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104797
Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal instruction selection patterns. The wasm_simd128.h
intrinsics header was already using portable code for the corresponding
intrinsics, so now it produces the correct instructions.
Differential Revision: https://reviews.llvm.org/D106400
Debug info sections need R_WASM_FUNCTION_OFFSET_I32 relocs (with FK_Data_4 fixup
kinds) to refer to functions (instead of R_WASM_TABLE_INDEX as is used in data
sections). Usually this is done in a convoluted way, with unnamed temp data
symbols which target the start of the function, in which case
WasmObjectWriter::recordRelocation converts it to use the section symbol
instead. However in some cases the function can actually be undefined; in this
case the dwarf generator uses the function symbol (a named undefined function
symbol) instead. In that case the section-symbol transform doesn't work and we
need to generate the correct reloc type a different way. In this change
WebAssemblyWasmObjectWriter::getRelocType takes the fixup section type into
account to choose the correct reloc type.
Fixes PR50408
Differential Revision: https://reviews.llvm.org/D103557
Replace the experimental clang builtins and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50435.
Differential Revision: https://reviews.llvm.org/D106019
Replace the experimental clang builtin and LLVM intrinsics for these
instructions with normal codegen patterns. Resolves PR50433.
Differential Revision: https://reviews.llvm.org/D105950
Replace the clang builtin function and LLVM intrinsic for
f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing
combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern.
Differential Revision: https://reviews.llvm.org/D105755
Replace the clang builtin function and LLVM intrinsic previously used to select
the f64x2.promote_low_f32x4 instruction with custom combines from standard
SelectionDAG nodes. Implement the new combines to share code with the similar
combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232.
Differential Revision: https://reviews.llvm.org/D105675
This to protect against non-sensical instruction sequences being assembled,
which would either cause asserts/crashes further down, or a Wasm module being output that doesn't validate.
Unlike a validator, this type checker is able to give type-errors as part of the parsing process, which makes the assembler much friendlier to be used by humans writing manual input.
Because the MC system is single pass (instructions aren't even stored in MC format, they are directly output) the type checker has to be single pass as well, which means that from now on .globaltype and .functype decls must come before their use. An extra pass is added to Codegen to collect information for this purpose, since AsmPrinter is normally single pass / streaming as well, and would otherwise generate this information on the fly.
A `-no-type-check` flag was added to llvm-mc (and any other tools that take asm input) that surpresses type errors, as a quick escape hatch for tests that were not intended to be type correct.
This is a first version of the type checker that ignores control flow, i.e. it checks that types are correct along the linear path, but not the branch path. This will still catch most errors. Branch checking could be added in the future.
Differential Revision: https://reviews.llvm.org/D104945
Override the `shouldScalarizeBinop` target lowering hook using the same
implementation used in the x86 backend. This causes `extract_vector_elt`s of
vector binary ops to be scalarized if the scalarized version would be supported.
Differential Revision: https://reviews.llvm.org/D105646
WebAssembly's shift instructions implicitly masks the shift count, so optimize
out redundant explicit masks of the shift count. For vector shifts, this
currently only works if the mask is applied before splatting the shift count,
but this should be addressed in a future commit. Resolves PR49655.
Differential Revision: https://reviews.llvm.org/D105600
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Differential Revision: https://reviews.llvm.org/D104797
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D95425
This patch adds TargetStackID::WasmLocal. This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.
For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there. Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.
Differential Revision: https://reviews.llvm.org/D101140
This patch adds TargetStackID::WasmLocal. This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.
For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there. Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.
Differential Revision: https://reviews.llvm.org/D101140
This patch adds TargetStackID::WasmLocal. This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.
For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1. SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there. Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.
Differential Revision: https://reviews.llvm.org/D101140
DwarfDebug unconditionally assumes for all call instructions the 0th
operand is the callee operand, which seems to be true for other targets,
but not for WebAssembly. This adds `TargetInstrInfo::getCallOperand`
method whose default implementation returns `getOperand(0)` and makes
WebAssembly overrides it to use its own utility method to get the callee
operand.
This also fixes an existing bug in `WebAssembly::getCalleeOp`, which was
uncovered by this CL.
Reviewed By: dschuff, djtodoro
Differential Revision: https://reviews.llvm.org/D102978
`WebAssemblyDebugValueManager` does not currently handle
`DBG_VALUE_LIST`, which is a recent addition to LLVM. We tried to
nullify them within the constructor of `WebAssemblyDebugValueManager` in
D102589, but it made the class error-prone to use because it deletes
instructions within the constructor and thus invalidates existing
iterators within the BB, so the user of the class should take special
care not to use invalidated iterators. This actually caused a bug in
ExplicitLocals pass.
Instead of trying to fix ExplicitLocals pass to make the iterator usage
correct, which is possible but error-prone, this adds
NullifyDebugValueLists pass that nullifies all `DBG_VALUE_LIST`
instructions before we run WebAssembly specific passes in the backend.
We can remove this pass after we implement handlers for
`DBG_VALUE_LIST`s in `WebAssemblyDebugValueManager` and elsewhere.
Fixes https://github.com/emscripten-core/emscripten/issues/14255.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D102999
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.
This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.
This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.
Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.
Re-landed again after D102819 fixed PowerPC to correctly zero-extend all
of its atomics as it claimed to do, since the combination of that bug
and this optimisation caused buildbot regressions.
Reviewed By: RKSimon, atanasyan
Differential Revision: https://reviews.llvm.org/D101342
__table_base is know 64-bit, since in LLVM it represents a function pointer offset
__table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there.
New reloc R_WASM_TABLE_INDEX_REL_SLEB64 added
Differential Revision: https://reviews.llvm.org/D101784
We have been handling filters and landingpads incorrectly all along. We
pass clauses' (catches') types to `__cxa_find_matching_catch` in JS glue
code, which returns the thrown pointer and sets the selector using
`setTempRet0()`.
We apparently have been doing the same for filters' (exception specs')
types; we pass them to `__cxa_find_matching_catch` just the same way as
clauses. And `__cxa_find_matching_catch` treats all given types as
clauses. So it is a little surprising; maybe we intended to do something
from the JS side and didn't end up doing?
So anyway, I don't think supporting exception specs in Emscripten EH is
a priority, but this can actually cause incorrect results for normal
catches when functions are inlined and the inlined spec type has a
parent-child relationship with the catch's type.
---
The below is an example of a bug that can happen when inlining and class
hierarchy is mixed. If you are busy you can skip this part:
```
struct A {};
struct B : A {};
void bar() throw (B) { throw B(); }
void foo() {
try {
bar();
} catch (A &) {
fputs ("Expected result\n", stdout);
}
}
```
In the unoptimized code, `bar`'s landingpad will have a filter for `B`
and `foo`'s landingpad will have a clause for `A`. But when `bar` is
inlined into `foo`, `foo`'s landingpad has both a filter for `B` and a
clause for `A`, and it passes the both types to
`__cxa_find_matching_catch`:
```
__cxa_find_matching_catch(typeinfo for B, typeinfo for A)
```
`__cxa_find_matching_catch` thinks both are clauses, and looks at the
first type `B`, which belongs to a filter. And the thrown type is `B`,
so it thinks the first type `B` is caught. But this makes it return an
incorrect selector, because it is supposed to catch the exception using
the second type `A`, which is a parent of `B`. As a result, the `foo` in
the example program above does not print "Expected result" but just
throws the exception to the caller. (This wouldn't have happened if `A`
and `B` are completely disjoint types, such as `float` and `int`)
Fixes https://bugs.llvm.org/show_bug.cgi?id=50357.
Reviewed By: dschuff, kripken
Differential Revision: https://reviews.llvm.org/D102795
WebAssemblyDebugValueManager class currently does not handle
DBG_VALUE_LIST instructions correctly for two reasons, which are
explained in https://bugs.llvm.org/show_bug.cgi?id=50361.
This effectively nullifies DBG_VALUE_LISTs in
WebAssemblyDebugValueManager so that the info will appear as "optimized
out" in debuggers but still be at least correct in the meantime.
Reviewed By: dschuff, jmorse
Differential Revision: https://reviews.llvm.org/D102589
When a stackified variable has an associated `DBG_VALUE` instruction,
DebugFixup pass adds a `DBG_VALUE` instruction after the stackified
value's last use to clear the variable's debug range info. But when the
last use instruction is a terminator, it can cause a verification
failure (when run with `-verify-machineinstrs`) because there are no
instructions allowed after a terminator.
For example:
```
%myvar = ...
DBG_VALUE target-index(wasm-operand-stack), $noreg, !"myvar", ...
BR_IF 0, %myvar, ...
DBG_VALUE $noreg, $noreg, !"myvar", ...
```
In this test, `%myvar` is stackified, so the first `DBG_VALUE`
instruction's first operand has changed to `wasm-operand-stack` to
denote it. And an additional `DBG_VALUE` instruction is added after its
last use, `BR_IF`, to signal variable `myvar` is not in the operand
stack anymore. But because the `DBG_VALUE` instruction is added after
the `BR_IF`, a terminator, it fails MachineVerifier.
`DBG_VALUE` instructions are used in `DbgEntityHistoryCalculator` to
compute value ranges to emit DWARF info, and it turns out the
`DbgEntityHistoryCalculator` terminates ranges at the end of a BB, so we
don't need to emit `DBG_VALUE` after a terminator.
Fixes https://bugs.llvm.org/show_bug.cgi?id=50175.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D102309
In wasm64, the signatures of some library functions and global variables
defined in Emscripten change:
- `emscripten_longjmp`: `(i32, i32) -> ()` -> `(i64, i32) -> ()`
This changes because the first argument is the address of a memory
buffer. This in turn causes more changes below.
- `setThrew`: `(i32, i32) -> ()` -> `(i64, i32) -> ()`
`emscripten_longjmp` calls `setThrew` with the i64 buffer argument as
the first parameter.
- `__THREW__` (global var): `i32` to `i64`
`setThrew`'s first argument is set to this `__THREW__` variable, so it
should change to i64 as well.
- `testSetjmp`: `(i32, i32*, i32) -> (i32)` -> `(i64, i32*, i32) -> (i32)`
In the code transformation done in this pass, the value of `__THREW__`
is passed as the first parameter of `testSetjmp`.
This patch creates some helper functions to easily get types that become
different depending on the wasm32/wasm64, and uses them to change
various function signatures and code transformations. Also updates the
tests with WASM32/WASM64 check lines.
(Untested) Emscripten side patch: https://github.com/emscripten-core/emscripten/pull/14108
Reviewed By: aardappel
Differential Revision: https://reviews.llvm.org/D101985
We explicitly made it error out in D101403, out of a good intention that
the error message will make people less confusing. Turns out, we weren't
failing all cases of wasm EH + SjLj; only a few cases were failing and
our client was able to get around by fixing source code, but now we made
it fail for all cases, even the cases that previously succeeded fail,
which we didn't intend. This reverts that change.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D102364
As we have been missing support for WebAssembly globals on the IR level,
the lowering of WASM_SYMBOL_TYPE_GLOBAL to IR was incomplete. This
commit fleshes out the lowering support, lowering references to and
definitions of addrspace(1) values to correctly typed
WASM_SYMBOL_TYPE_GLOBAL symbols.
Depends on D101608.
Differential Revision: https://reviews.llvm.org/D101913
This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space. Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.
Based on work by Paulo Matos in https://reviews.llvm.org/D95425.
Reviewed By: pmatos
Differential Revision: https://reviews.llvm.org/D101608
This change was originally landed in: 5000a1b4b9
It was reverted in: 061e071d8c
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.
This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.
This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.
Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.
Reviewed By: RKSimon, atanasyan
Differential Revision: https://reviews.llvm.org/D101342
- Removes the mention of fastcomp, which is deprecated.
- Some functions in Emscripten have moved from JS glue code to
compiler-rt/emscripten_setjmp.c and
compiler-rt/emscripten_exception_builtins.c. This fixes comments about
that.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D101812
The WebAssembly SIMD intrinsics in wasm_simd128.h generally try not to require
any particular alignment for memory operations to be maximally flexible. For
builtin memory access functions and their corresponding LLVM IR intrinsics,
there's no way to set the expected alignment, so the best we can do is set the
alignment to 1 in the backend. This change means that the alignment hints in the
emitted code will no longer be incorrect when users use the intrinsics to access
unaligned data.
Differential Revision: https://reviews.llvm.org/D101850
This seems to have broken sanitizers, giving lots of
Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed.
failures across multiple targets (currently X86 and PowerPC). Reverting
until I have a chance to reproduce and debug.
This reverts commit 6e876f9ded.
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.
This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.
This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.
Reviewed By: RKSimon, atanasyan
Differential Revision: https://reviews.llvm.org/D101342
We previously had an ISel pattern for i64x2.abs, but because the ISDNode was not
marked legal for v2i64, the instruction was not being selected.
Differential Revision: https://reviews.llvm.org/D101803
- Error out when both Emscripten EH and wasm EH are used together, i.e.,
both `-enable-emscripten-cxx-exceptions` and `-exception-model=wasm`
are given together. This will not happen if you use Emscripten, but
this can happen when you call `llc` manually with wrong set of
arguments.
- Currently we don't yet support using wasm EH with Emscripten SjLj.
Unlike `-enable-emscripten-cxx-exceptions` which is turned on only
when you use `emcc -s DISABLE_EXCEPTION_CATCHING=0`,
`-enable-emscripten-sjlj` is turned on by Emscripten by default. So we
error out only when it is turned on and `setjmp` or `longjmp` is
actually used.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D101403
Previously we used an i32 constant to store the saturation width, but i32 isn't
legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the
type legalizer.
This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes
it opaque to the type legalizer.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101262
Background:
CFGStackify's [[ 398f253400/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (L1481-L1540) | fixEndsAtEndOfFunction ]] fixes block/loop/try's return
type when the end of function is unreachable and the function return
type is not void. So if a function returns i32 and `block`-`end` wraps the
whole function, i.e., the `block`'s `end` is the last instruction of the
function, the `block`'s return type should be i32 too:
```
block i32
...
end
end_function
```
If there are consecutive `end`s, this signature has to be propagate to
those blocks too, like:
```
block i32
...
block i32
...
end
end
end_function
```
This applies to `try`-`end` too:
```
try i32
...
catch
...
end
end_function
```
In case of `try`, we not only follow consecutive `end`s but also follow
`catch`, because for the type of the whole `try` to be i32, both `try`
and `catch` parts have to be i32:
```
try i32
...
block i32
...
end
catch
...
block i32
...
end
end
end_function
```
---
Previously we only handled consecutive `end`s or `end` before a `catch`.
But now we have `delegate`, which serves like `end` for
`try`-`delegate`. So we have to follow `delegate` too and mark its
corresponding `try` as i32 (the function's return type):
```
try i32
...
catch
...
try i32 ;; Here
...
delegate N
end
end_function
```
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D101036
This adds support for YAML serialization of `Params` and `Results`
fields in `WebAssemblyMachineFunctionInfo`. Types are printed as `MVT`'s
string representation. This is for writing MIR tests easier.
The tests added are testing simple parsing and printing of `params` /
`results` fields under `machineFunctionInfo`.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D101029
This CL
1. Creates Utils/ directory under lib/Target/WebAssembly
2. Moves existing WebAssemblyUtilities.cpp|h into the Utils/ directory
3. Creates Utils/WebAssemblyTypeUtilities.cpp|h and put type
declarataions and type conversion functions scattered in various
places into this single place.
It has been suggested several times that it is not easy to share utility
functions between subdirectories (AsmParser, DIsassembler, MCTargetDesc,
...). Sometimes we ended up [[ https://reviews.llvm.org/D92840#2478863 | duplicating ]] the same function because of
this.
There are already other targets doing this: AArch64, AMDGPU, and ARM
have Utils/ subdirectory under their target directory.
This extracts the utility functions into a single directory Utils/ and
make them sharable among all passes in WebAssembly/ and its
subdirectories. Also I believe gathering all type-related conversion
functionalities into a single place makes it more usable. (Actually I
was working on another CL that uses various type conversion functions
scattered in multiple places, which became the motivation for this CL.)
Reviewed By: dschuff, aardappel
Differential Revision: https://reviews.llvm.org/D100995
This is just a cleanup of the very high level stuff. I'm sure there is
more to update here but I'll leave that to others and/or a followup.
Differential Revision: https://reviews.llvm.org/D100888
af7925b4dd added a custom DAG combine for recognizing fp-to-ints of
extract_subvectors that could be lowered to f64x2.convert_low_i32x4_{s,u}
instructions. This commit extends the combines to recognize equivalent
extract_subvectors of fp-to-ints as well.
Differential Revision: https://reviews.llvm.org/D100790
We previously used splats instead of v128.const to materialize vector constants
because V8 did not support v128.const. Now that V8 supports v128.const, we can
use v128.const instead. Although this increases code size, it should also
increase performance (or at least require fewer engine-side optimizations), so
it is an appropriate change to make.
Differential Revision: https://reviews.llvm.org/D100716
Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead.
This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u},
which are now represented in IR as a saturating truncation to a v2i32 followed by
a concatenation with a zero vector.
Differential Revision: https://reviews.llvm.org/D100596
Removes the builtins and intrinsics used to opt in to using these instructions
and replaces them with normal ISel patterns now that they are no longer
prototypes.
Differential Revision: https://reviews.llvm.org/D100402
Add a custom DAG combine and ISD opcode for detecting patterns like
(uint_to_fp (extract_subvector ...))
before the extract_subvector is expanded to ensure that they will ultimately
lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions
are no longer prototypes and can now be produced via standard IR, this commit
also removes the target intrinsics and builtins that had been used to prototype
the instructions.
Differential Revision: https://reviews.llvm.org/D100425
Now that these instructions are no longer prototypes, we do not need to be
careful about keeping them opt-in and can use the standard LLVM infrastructure
for them. This commit removes the bespoke intrinsics we were using to represent
these operations in favor of the corresponding target-independent intrinsics.
The clang builtins are preserved because there is no standard way to easily
represent these operations in C/C++.
For consistency with the scalar codegen in the Wasm backend, the intrinsic used
to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though
@llvm.roundeven better captures the semantics of the underlying Wasm
instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is
left to a potential future patch.
Differential Revision: https://reviews.llvm.org/D100411
In the final SIMD spec, there is only a single v128.any_true instruction, rather
than one for each lane interpretation because the semantics do not depend on the
lane interpretation.
Differential Revision: https://reviews.llvm.org/D100241