Commit Graph

1654 Commits

Author SHA1 Message Date
Thomas Lively f8c5a4c670 [WebAssembly] Optimize out shift masks
WebAssembly's shift instructions implicitly masks the shift count, so optimize
out redundant explicit masks of the shift count. For vector shifts, this
currently only works if the mask is applied before splatting the shift count,
but this should be addressed in a future commit. Resolves PR49655.

Differential Revision: https://reviews.llvm.org/D105600
2021-07-07 23:14:31 -07:00
Roman Lebedev c2c0d3ea89
Revert "[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR"
This reverts commit 4facbf213c.

```
********************
FAIL: LLVM :: CodeGen/WebAssembly/funcref-call.ll (44466 of 44468)
******************** TEST 'LLVM :: CodeGen/WebAssembly/funcref-call.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /builddirs/llvm-project/build-Clang12/bin/llc < /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll --mtriple=wasm32-unknown-unknown -asm-verbose=false -mattr=+reference-types | /builddirs/llvm-project/build-Clang12/bin/FileCheck /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll
--
Exit Code: 2

Command Output (stderr):
--
llc: /repositories/llvm-project/llvm/include/llvm/Support/LowLevelTypeImpl.h:44: static llvm::LLT llvm::LLT::scalar(unsigned int): Assertion `SizeInBits > 0 && "invalid scalar size"' failed.

```
2021-07-02 11:49:51 +03:00
Paulo Matos 4facbf213c [WebAssembly] Implementation of global.get/set for reftypes in LLVM IR
Reland of 31859f896.

This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Differential Revision: https://reviews.llvm.org/D104797
2021-07-02 09:46:28 +02:00
Martin Storsjö 42f74e8249 [llvm] Rename StringRef _lower() method calls to _insensitive()
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
2021-06-25 00:22:01 +03:00
Heejin Ahn 1d891d44f3 [WebAssembly] Rename event to tag
We recently decided to change 'event' to 'tag', and 'event section' to
'tag section', out of the rationale that the section contains a
generalized tag that references a type, which may be used for something
other than exceptions, and the name 'event' can be confusing in the web
context.

See
- https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
- https://github.com/WebAssembly/exception-handling/pull/161

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D104423
2021-06-17 20:34:19 -07:00
David Spickett 64de8763aa Revert "Implementation of global.get/set for reftypes in LLVM IR"
This reverts commit 31859f896c.

Causing SVE and RISCV-V test failures on bots.
2021-06-10 10:11:17 +00:00
Paulo Matos 31859f896c Implementation of global.get/set for reftypes in LLVM IR
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D95425
2021-06-10 10:07:45 +02:00
Andy Wingo 82f92e35c6 [WebAssembly][CodeGen] IR support for WebAssembly local variables
This patch adds TargetStackID::WasmLocal.  This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1.  SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there.  Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Differential Revision: https://reviews.llvm.org/D101140
2021-06-01 11:31:39 +02:00
Andy Wingo bc1ad6e3c4 Revert "[WebAssembly][CodeGen] IR support for WebAssembly local variables"
This reverts commit bf35f4af51.  There was
an error in a shared-library build.
2021-05-31 10:55:15 +02:00
Andy Wingo bf35f4af51 [WebAssembly][CodeGen] IR support for WebAssembly local variables
This patch adds TargetStackID::WasmLocal.  This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1.  SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there.  Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Differential Revision: https://reviews.llvm.org/D101140
2021-05-31 10:40:38 +02:00
Andy Wingo ca5f07f8c4 Revert "[WebAssembly][CodeGen] IR support for WebAssembly local variables"
This reverts commit 00ecf18979, as it
broke the AMDGPU build.  Will reland later with a fix.
2021-05-28 12:42:12 +02:00
Andy Wingo 00ecf18979 [WebAssembly][CodeGen] IR support for WebAssembly local variables
This patch adds TargetStackID::WasmLocal.  This stack holds locations of
values that are only addressable by name -- not via a pointer to memory.
For the WebAssembly target, these objects are lowered to WebAssembly
local variables, which are managed by the WebAssembly run-time and are
not addressable by linear memory.

For the WebAssembly target IR indicates that an AllocaInst should be put
on TargetStackID::WasmLocal by putting it in the non-integral address
space WASM_ADDRESS_SPACE_WASM_VAR, with value 1.  SROA will mostly lift
these allocations to SSA locals, but any alloca that reaches instruction
selection (usually in non-optimized builds) will be assigned the new
TargetStackID there.  Loads and stores to those values are transformed
to new WebAssemblyISD::LOCAL_GET / WebAssemblyISD::LOCAL_SET nodes,
which then lower to the type-specific LOCAL_GET_I32 etc instructions via
tablegen patterns.

Differential Revision: https://reviews.llvm.org/D101140
2021-05-28 11:07:41 +02:00
Heejin Ahn 5dd86aadf0 [WebAssembly] Add TargetInstrInfo::getCalleeOperand
DwarfDebug unconditionally assumes for all call instructions the 0th
operand is the callee operand, which seems to be true for other targets,
but not for WebAssembly. This adds `TargetInstrInfo::getCallOperand`
method whose default implementation returns `getOperand(0)` and makes
WebAssembly overrides it to use its own utility method to get the callee
operand.

This also fixes an existing bug in `WebAssembly::getCalleeOp`, which was
uncovered by this CL.

Reviewed By: dschuff, djtodoro

Differential Revision: https://reviews.llvm.org/D102978
2021-05-26 11:43:59 -07:00
Heejin Ahn a64ebb8637 [WebAssembly] Add NullifyDebugValueLists pass
`WebAssemblyDebugValueManager` does not currently handle
`DBG_VALUE_LIST`, which is a recent addition to LLVM. We tried to
nullify them within the constructor of `WebAssemblyDebugValueManager` in
D102589, but it made the class error-prone to use because it deletes
instructions within the constructor and thus invalidates existing
iterators within the BB, so the user of the class should take special
care not to use invalidated iterators. This actually caused a bug in
ExplicitLocals pass.

Instead of trying to fix ExplicitLocals pass to make the iterator usage
correct, which is possible but error-prone, this adds
NullifyDebugValueLists pass that nullifies all `DBG_VALUE_LIST`
instructions before we run WebAssembly specific passes in the backend.
We can remove this pass after we implement handlers for
`DBG_VALUE_LIST`s in `WebAssemblyDebugValueManager` and elsewhere.

Fixes https://github.com/emscripten-core/emscripten/issues/14255.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D102999
2021-05-24 11:36:01 -07:00
Jessica Clarke e10958c807 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Re-landed again after D102819 fixed PowerPC to correctly zero-extend all
of its atomics as it claimed to do, since the combination of that bug
and this optimisation caused buildbot regressions.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-20 20:34:23 +01:00
Wouter van Oortmerssen 3a293cbf13 [WebAssembly] Fix PIC/GOT codegen for wasm64
__table_base is know 64-bit, since in LLVM it represents a function pointer offset
__table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there.
New reloc R_WASM_TABLE_INDEX_REL_SLEB64 added

Differential Revision: https://reviews.llvm.org/D101784
2021-05-20 09:59:31 -07:00
Heejin Ahn 412a3381f7 [WebAssembly] Ignore filters in Emscripten EH landingpads
We have been handling filters and landingpads incorrectly all along. We
pass clauses' (catches') types to `__cxa_find_matching_catch` in JS glue
code, which returns the thrown pointer and sets the selector using
`setTempRet0()`.

We apparently have been doing the same for filters' (exception specs')
types; we pass them to `__cxa_find_matching_catch` just the same way as
clauses. And `__cxa_find_matching_catch` treats all given types as
clauses. So it is a little surprising; maybe we intended to do something
from the JS side and didn't end up doing?

So anyway, I don't think supporting exception specs in Emscripten EH is
a priority, but this can actually cause incorrect results for normal
catches when functions are inlined and the inlined spec type has a
parent-child relationship with the catch's type.

---

The below is an example of a bug that can happen when inlining and class
hierarchy is mixed. If you are busy you can skip this part:
```
struct A {};
struct B : A {};

void bar() throw (B) { throw B(); }

void foo() {
  try {
    bar();
  } catch (A &) {
    fputs ("Expected result\n", stdout);
  }
}
```

In the unoptimized code, `bar`'s landingpad will have a filter for `B`
and `foo`'s landingpad will have a clause for `A`. But when `bar` is
inlined into `foo`, `foo`'s landingpad has both a filter for `B` and a
clause for `A`, and it passes the both types to
`__cxa_find_matching_catch`:
```
__cxa_find_matching_catch(typeinfo for B, typeinfo for A)
```
`__cxa_find_matching_catch` thinks both are clauses, and looks at the
first type `B`, which belongs to a filter. And the thrown type is `B`,
so it thinks the first type `B` is caught. But this makes it return an
incorrect selector, because it is supposed to catch the exception using
the second type `A`, which is a parent of `B`. As a result, the `foo` in
the example program above does not print "Expected result" but just
throws the exception to the caller. (This wouldn't have happened if `A`
and `B` are completely disjoint types, such as `float` and `int`)

Fixes https://bugs.llvm.org/show_bug.cgi?id=50357.

Reviewed By: dschuff, kripken

Differential Revision: https://reviews.llvm.org/D102795
2021-05-20 01:28:16 -07:00
Heejin Ahn 6e1c1dac4c [WebAssembly] Nullify DBG_VALUE_LISTs in DebugValueManager
WebAssemblyDebugValueManager class currently does not handle
DBG_VALUE_LIST instructions correctly for two reasons, which are
explained in https://bugs.llvm.org/show_bug.cgi?id=50361.

This effectively nullifies DBG_VALUE_LISTs in
WebAssemblyDebugValueManager so that the info will appear as "optimized
out" in debuggers but still be at least correct in the meantime.

Reviewed By: dschuff, jmorse

Differential Revision: https://reviews.llvm.org/D102589
2021-05-17 13:47:36 -07:00
Heejin Ahn 71fbfb499a [WebAssembly] Omit DBG_VALUE after terminator
When a stackified variable has an associated `DBG_VALUE` instruction,
DebugFixup pass adds a `DBG_VALUE` instruction after the stackified
value's last use to clear the variable's debug range info. But when the
last use instruction is a terminator, it can cause a verification
failure (when run with `-verify-machineinstrs`) because there are no
instructions allowed after a terminator.

For example:
```
%myvar = ...
DBG_VALUE target-index(wasm-operand-stack), $noreg, !"myvar", ...
BR_IF 0, %myvar, ...
DBG_VALUE $noreg, $noreg, !"myvar", ...
```
In this test, `%myvar` is stackified, so the first `DBG_VALUE`
instruction's first operand has changed to `wasm-operand-stack` to
denote it. And an additional `DBG_VALUE` instruction is added after its
last use, `BR_IF`, to signal variable `myvar` is not in the operand
stack anymore. But because the `DBG_VALUE` instruction is added after
the `BR_IF`, a terminator, it fails MachineVerifier.

`DBG_VALUE` instructions are used in `DbgEntityHistoryCalculator` to
compute value ranges to emit DWARF info, and it turns out the
`DbgEntityHistoryCalculator` terminates ranges at the end of a BB, so we
don't need to emit `DBG_VALUE` after a terminator.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50175.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D102309
2021-05-14 03:48:19 -07:00
Heejin Ahn 8e35a18e4a [WebAssembly] Support Emscripten EH/SjLj in Wasm64
In wasm64, the signatures of some library functions and global variables
defined in Emscripten change:
- `emscripten_longjmp`: `(i32, i32) -> ()` -> `(i64, i32) -> ()`
  This changes because the first argument is the address of a memory
  buffer. This in turn causes more changes below.
- `setThrew`: `(i32, i32) -> ()` -> `(i64, i32) -> ()`
  `emscripten_longjmp` calls `setThrew` with the i64 buffer argument as
  the first parameter.
- `__THREW__` (global var): `i32` to `i64`
  `setThrew`'s first argument is set to this `__THREW__` variable, so it
  should change to i64 as well.
- `testSetjmp`: `(i32, i32*, i32) -> (i32)` -> `(i64, i32*, i32) -> (i32)`
  In the code transformation done in this pass, the value of `__THREW__`
  is passed as the first parameter of `testSetjmp`.

This patch creates some helper functions to easily get types that become
different depending on the wasm32/wasm64, and uses them to change
various function signatures and code transformations. Also updates the
tests with WASM32/WASM64 check lines.

(Untested) Emscripten side patch: https://github.com/emscripten-core/emscripten/pull/14108

Reviewed By: aardappel

Differential Revision: https://reviews.llvm.org/D101985
2021-05-14 03:45:09 -07:00
Heejin Ahn ba38b72ec2 [WebAssembly] Allow Wasm EH with Emscripten SjLj
We explicitly made it error out in D101403, out of a good intention that
the error message will make people less confusing. Turns out, we weren't
failing all cases of wasm EH + SjLj; only a few cases were failing and
our client was able to get around by fixing source code, but now we made
it fail for all cases, even the cases that previously succeeded fail,
which we didn't intend. This reverts that change.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D102364
2021-05-12 13:27:04 -07:00
Stefan Pintilie 8d37411e48 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This reverts commit 6c80361b84.
Breaks PowerPC Big Endian buildbots.
2021-05-12 09:46:18 -05:00
Andy Wingo b2f21b145a [CodeGen][WebAssembly] Better lowering for WASM_SYMBOL_TYPE_GLOBAL symbols
As we have been missing support for WebAssembly globals on the IR level,
the lowering of WASM_SYMBOL_TYPE_GLOBAL to IR was incomplete.  This
commit fleshes out the lowering support, lowering references to and
definitions of addrspace(1) values to correctly typed
WASM_SYMBOL_TYPE_GLOBAL symbols.

Depends on D101608.

Differential Revision: https://reviews.llvm.org/D101913
2021-05-11 11:47:40 +02:00
Paulo Matos d7086af214 [WebAssembly] Support for WebAssembly globals in LLVM IR
This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space.  Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.

Based on work by Paulo Matos in https://reviews.llvm.org/D95425.

Reviewed By: pmatos

Differential Revision: https://reviews.llvm.org/D101608
2021-05-11 11:19:29 +02:00
Sam Clegg 3b8d2be527 Reland: "[lld][WebAssembly] Initial support merging string data"
This change was originally landed in: 5000a1b4b9
It was reverted in: 061e071d8c

This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.

Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)

Like the ELF linker merging is only performed at `-O1` and above.

This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)

Differential Revision: https://reviews.llvm.org/D97657
2021-05-10 16:03:38 -07:00
Nico Weber 061e071d8c Revert "[lld][WebAssembly] Initial support merging string data"
This reverts commit 5000a1b4b9.
Breaks tests, see https://reviews.llvm.org/D97657#2749151

Easily repros locally with `ninja check-llvm-mc-webassembly`.
2021-05-10 18:28:28 -04:00
Sam Clegg 5000a1b4b9 [lld][WebAssembly] Initial support merging string data
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.

Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)

Like the ELF linker merging is only performed at `-O1` and above.

This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)

Differential Revision: https://reviews.llvm.org/D97657
2021-05-10 13:15:12 -07:00
Jessica Clarke 6c80361b84 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-06 04:01:20 +01:00
Heejin Ahn 7f06cae1c1 [WebAssembly] Fix JS code mentions in LowerEmscriptenEHSjLj
- Removes the mention of fastcomp, which is deprecated.
- Some functions in Emscripten have moved from JS glue code to
  compiler-rt/emscripten_setjmp.c and
  compiler-rt/emscripten_exception_builtins.c. This fixes comments about
  that.

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D101812
2021-05-05 17:04:14 -07:00
Thomas Lively 89333b35a7 [WebAssembly] Set alignment to 1 for SIMD memory intrinsics
The WebAssembly SIMD intrinsics in wasm_simd128.h generally try not to require
any particular alignment for memory operations to be maximally flexible. For
builtin memory access functions and their corresponding LLVM IR intrinsics,
there's no way to set the expected alignment, so the best we can do is set the
alignment to 1 in the backend. This change means that the alignment hints in the
emitted code will no longer be incorrect when users use the intrinsics to access
unaligned data.

Differential Revision: https://reviews.llvm.org/D101850
2021-05-05 11:59:33 -07:00
Jessica Clarke 897d7bceb9 Revert "[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics"
This seems to have broken sanitizers, giving lots of

  Assertion `NumBits <= MAX_INT_BITS && "bitwidth too large"' failed.

failures across multiple targets (currently X86 and PowerPC). Reverting
until I have a chance to reproduce and debug.

This reverts commit 6e876f9ded.
2021-05-05 17:02:05 +01:00
Jessica Clarke 6e876f9ded [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-05 16:34:45 +01:00
Thomas Lively 14ca2e5e22 [WebAssembly] Mark abs of v2i64 as legal
We previously had an ISel pattern for i64x2.abs, but because the ISDNode was not
marked legal for v2i64, the instruction was not being selected.

Differential Revision: https://reviews.llvm.org/D101803
2021-05-04 13:25:32 -07:00
Paulo Matos cd460c4d11 [WebAssembly] Fixup order of ins variables for table instructions
WebAssembly instruction arguments should have their arguments ordered from
the deepest to the shallowest on the stack.
2021-05-03 13:04:51 -07:00
Heejin Ahn b4a5dd4da9 [WebAssembly] Error when wasm EH is used with Emscripten EH/SjLj
- Error out when both Emscripten EH and wasm EH are used together, i.e.,
  both `-enable-emscripten-cxx-exceptions` and `-exception-model=wasm`
  are given together. This will not happen if you use Emscripten, but
  this can happen when you call `llc` manually with wrong set of
  arguments.
- Currently we don't yet support using wasm EH with Emscripten SjLj.
  Unlike `-enable-emscripten-cxx-exceptions` which is turned on only
  when you use `emcc -s DISABLE_EXCEPTION_CATCHING=0`,
  `-enable-emscripten-sjlj` is turned on by Emscripten by default. So we
  error out only when it is turned on and `setjmp` or `longjmp` is
  actually used.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D101403
2021-04-27 16:07:53 -07:00
Craig Topper 3067520bf4 [SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT
Previously we used an i32 constant to store the saturation width, but i32 isn't
legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the
type legalizer.

This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes
it opaque to the type legalizer.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101262
2021-04-27 14:38:42 -07:00
Heejin Ahn c390621aeb [WebAssembly] Fix fixEndsAtEndOfFunction for delegate
Background:
CFGStackify's [[ 398f253400/llvm/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp (L1481-L1540) | fixEndsAtEndOfFunction ]] fixes block/loop/try's return
type when the end of function is unreachable and the function return
type is not void. So if a function returns i32 and `block`-`end` wraps the
whole function, i.e., the `block`'s `end` is the last instruction of the
function, the `block`'s return type should be i32 too:
```
block i32
  ...
end
end_function
```

If there are consecutive `end`s, this signature has to be propagate to
those blocks too, like:
```
block i32
  ...
  block i32
    ...
  end
end
end_function
```

This applies to `try`-`end` too:
```
try i32
  ...
catch
  ...
end
end_function
```

In case of `try`, we not only follow consecutive `end`s but also follow
`catch`, because for the type of the whole `try` to be i32, both `try`
and `catch` parts have to be i32:
```
try i32
  ...
  block i32
    ...
  end
catch
  ...
  block i32
    ...
  end
end
end_function
```

---

Previously we only handled consecutive `end`s or `end` before a `catch`.
But now we have `delegate`, which serves like `end` for
`try`-`delegate`. So we have to follow `delegate` too and mark its
corresponding `try` as i32 (the function's return type):
```
try i32
  ...
catch
  ...
  try i32    ;; Here
    ...
  delegate N
end
end_function
```

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D101036
2021-04-22 15:32:00 -07:00
Heejin Ahn b3e88ccba7 [WebAssembly] Serialize params/results in MachineFunctionInfo
This adds support for YAML serialization of `Params` and `Results`
fields in `WebAssemblyMachineFunctionInfo`. Types are printed as `MVT`'s
string representation. This is for writing MIR tests easier.

The tests added are testing simple parsing and printing of `params` /
`results` fields under `machineFunctionInfo`.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D101029
2021-04-22 15:31:09 -07:00
Heejin Ahn 0b2bc69ba2 [WebAssembly] Put utility functions in Utils directory (NFC)
This CL
1. Creates Utils/ directory under lib/Target/WebAssembly
2. Moves existing WebAssemblyUtilities.cpp|h into the Utils/ directory
3. Creates Utils/WebAssemblyTypeUtilities.cpp|h and put type
   declarataions and type conversion functions scattered in various
   places into this single place.

It has been suggested several times that it is not easy to share utility
functions between subdirectories (AsmParser, DIsassembler, MCTargetDesc,
...). Sometimes we ended up [[ https://reviews.llvm.org/D92840#2478863 | duplicating ]] the same function because of
this.

There are already other targets doing this: AArch64, AMDGPU, and ARM
have Utils/ subdirectory under their target directory.

This extracts the utility functions into a single directory Utils/ and
make them sharable among all passes in WebAssembly/ and its
subdirectories. Also I believe gathering all type-related conversion
functionalities into a single place makes it more usable. (Actually I
was working on another CL that uses various type conversion functions
scattered in multiple places, which became the motivation for this CL.)

Reviewed By: dschuff, aardappel

Differential Revision: https://reviews.llvm.org/D100995
2021-04-22 15:29:43 -07:00
Sam Clegg 103956170b [WebAssembly] Update README. NFC.
This is just a cleanup of the very high level stuff.  I'm sure there is
more to update here but I'll leave that to others and/or a followup.

Differential Revision: https://reviews.llvm.org/D100888
2021-04-20 16:59:08 -07:00
Sam Clegg d2de2d1724 [WebAssembly] Remove unused known_gcc_test_failures.txt. NFC
Differential Revision: https://reviews.llvm.org/D100887
2021-04-20 14:07:25 -07:00
Thomas Lively 693d767c60 [WebAssembly] More codegen for f64x2.convert_low_i32x4_{s,u}
af7925b4dd added a custom DAG combine for recognizing fp-to-ints of
extract_subvectors that could be lowered to f64x2.convert_low_i32x4_{s,u}
instructions. This commit extends the combines to recognize equivalent
extract_subvectors of fp-to-ints as well.

Differential Revision: https://reviews.llvm.org/D100790
2021-04-20 12:37:13 -07:00
Thomas Lively e657c84fa1 [WebAssembly] Use v128.const instead of splats for constants
We previously used splats instead of v128.const to materialize vector constants
because V8 did not support v128.const. Now that V8 supports v128.const, we can
use v128.const instead. Although this increases code size, it should also
increase performance (or at least require fewer engine-side optimizations), so
it is an appropriate change to make.

Differential Revision: https://reviews.llvm.org/D100716
2021-04-19 12:43:59 -07:00
Thomas Lively 5c729750a6 [WebAssembly] Remove saturating fp-to-int target intrinsics
Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead.
This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u},
which are now represented in IR as a saturating truncation to a v2i32 followed by
a concatenation with a zero vector.

Differential Revision: https://reviews.llvm.org/D100596
2021-04-16 12:11:20 -07:00
Thomas Lively 6a18cc23ef [WebAssembly] Codegen for i64x2.extend_{low,high}_i32x4_{s,u}
Removes the builtins and intrinsics used to opt in to using these instructions
and replaces them with normal ISel patterns now that they are no longer
prototypes.

Differential Revision: https://reviews.llvm.org/D100402
2021-04-14 13:43:09 -07:00
Thomas Lively af7925b4dd [WebAssembly] Codegen for f64x2.convert_low_i32x4_{s,u}
Add a custom DAG combine and ISD opcode for detecting patterns like

  (uint_to_fp (extract_subvector ...))

before the extract_subvector is expanded to ensure that they will ultimately
lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions
are no longer prototypes and can now be produced via standard IR, this commit
also removes the target intrinsics and builtins that had been used to prototype
the instructions.

Differential Revision: https://reviews.llvm.org/D100425
2021-04-14 10:42:45 -07:00
Sander de Smalen 4f42d873c2 [TTI] NFC: Change getArithmeticInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100317
2021-04-14 17:20:36 +01:00
Sander de Smalen 1af35e77f4 [TTI] NFC: Change getVectorInstrCost to return InstructionCost
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100315
2021-04-14 17:20:35 +01:00
Thomas Lively af7ab81ce3 [WebAssembly] Use standard intrinsics for f32x4 and f64x2 ops
Now that these instructions are no longer prototypes, we do not need to be
careful about keeping them opt-in and can use the standard LLVM infrastructure
for them. This commit removes the bespoke intrinsics we were using to represent
these operations in favor of the corresponding target-independent intrinsics.
The clang builtins are preserved because there is no standard way to easily
represent these operations in C/C++.

For consistency with the scalar codegen in the Wasm backend, the intrinsic used
to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though
@llvm.roundeven better captures the semantics of the underlying Wasm
instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is
left to a potential future patch.

Differential Revision: https://reviews.llvm.org/D100411
2021-04-14 09:19:27 -07:00
Thomas Lively ea8dd3ee2e [WebAssembly] Update v128.any_true
In the final SIMD spec, there is only a single v128.any_true instruction, rather
than one for each lane interpretation because the semantics do not depend on the
lane interpretation.

Differential Revision: https://reviews.llvm.org/D100241
2021-04-11 11:13:16 -07:00
Thomas Lively f30c429da6 [WebAssembly] Add shuffles as an option for lowering BUILD_VECTOR
When lowering a BUILD_VECTOR SDNode, we choose among various possible vector
creation instructions in an attempt to minimize the total number of instructions
used. We previously considered using swizzles, consts, and splats, and this
patch adds shuffles as well. A common pattern that now lowers to shuffles is
when two 64-bit vectors are concatenated. Previously, concatenations generally
lowered to sequences of extract_lane and replace_lane instructions when they
could have been a single shuffle.

Differential Revision: https://reviews.llvm.org/D100018
2021-04-09 11:21:49 -07:00
Wouter van Oortmerssen 04e9cd09c8 [WebAssembly] Fix for PIC external symbol ISEL
wasm64 was missing DAG ISEL patterns for external symbol based global.get, but simply adding these analogous to the existing 32-bit versions doesn't work.
This is because we are conflating the 32-bit global index with the pointer represented by the external symbol, which for wasm32 happened to work.
The simplest fix is to pretend we have a 64-bit global index. This sounds incorrect, but is immaterial since once this index is stored as a MachineOperand it becomes 64-bit anyway (and has been all along). As such, the EmitInstrWithCustomInserter based implementation I experimented with become a no-op and no further changes in the C++ code are required.

Differential Revision: https://reviews.llvm.org/D99904
2021-04-08 12:07:38 -07:00
Nikita Popov 665065821e [FastISel] Remove kill tracking
This is a followup to D98145: As far as I know, tracking of kill
flags in FastISel is just a compile-time optimization. However,
I'm not actually seeing any compile-time regression when removing
the tracking. This probably used to be more important in the past,
before FastRA was switched to allocate instructions in reverse
order, which means that it discovers kills as a matter of course.

As such, the kill tracking doesn't really seem to serve a purpose
anymore, and just adds additional complexity and potential for
errors. This patch removes it entirely. The primary changes are
dropping the hasTrivialKill() method and removing the kill
arguments from the emitFast methods. The rest is mechanical fixup.

Differential Revision: https://reviews.llvm.org/D98294
2021-04-03 15:50:13 +02:00
Sam Parker 92e7771483 [WebAssembly] Invert branch condition on xor input
A frequent pattern for floating point conditional branches use an xor
to invert the input for the branch. Instead we can fold away the xor
by swapping the branch target instead.

Differential Revision: https://reviews.llvm.org/D99171
2021-04-01 09:23:28 +01:00
Thomas Lively 45783d0e8a [WebAssembly] Implement i64x2 comparisons
Removes the prototype builtin and intrinsic for i64x2.eq and implements that
instruction as well as the other i64x2 comparison instructions in the final SIMD
spec. Unsigned comparisons were not included in the final spec, so they still
need to be scalarized via a custom lowering.

Differential Revision: https://reviews.llvm.org/D99623
2021-03-31 10:46:17 -07:00
Tomas Matheson a9968c0a33 [NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions
Currently needsStackRealignment returns false if canRealignStack returns false.
This means that the behavior of needsStackRealignment does not correspond to
it's name and description; a function might need stack realignment, but if it
is not possible then this function returns false. Furthermore,
needsStackRealignment is not virtual and therefore some backends have made use
of canRealignStack to indicate whether a function needs stack realignment.

This patch attempts to clarify the situation by separating them and introducing
new names:

 - shouldRealignStack - true if there is any reason the stack should be
   realigned

 - canRealignStack - true if we are still able to realign the stack (e.g. we
   can still reserve/have reserved a frame pointer)

 - hasStackRealignment = shouldRealignStack && canRealignStack (not target
   customisable)

Targets can now override shouldRealignStack to indicate that stack realignment
is required.

This change will make it easier in a future change to handle the case where we
need to realign the stack but can't do so (for example when the register
allocator creates an aligned spill after the frame pointer has been
eliminated).

Differential Revision: https://reviews.llvm.org/D98716

Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
2021-03-30 17:31:39 +01:00
Thomas Lively a1b8b0739a [WebAssembly] Fix i8x16.popcnt opcode
When I updated the SIMD opcodes in f5764a8654, I accidentally missed updating
i8x16.popcnt. This patch fixes the omission.

Differential Revision: https://reviews.llvm.org/D99536
2021-03-29 17:23:15 -07:00
Sander de Smalen 55d18b3cc2 [TTI] Return a TypeSize from getRegisterBitWidth.
This patch changes the interface to take a RegisterKind, to indicate
whether the register bitwidth of a scalar register, fixed-width vector
register, or scalable vector register must be returned.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D98874
2021-03-24 14:45:13 +00:00
Andy Wingo c9801db2eb [WebAssembly][MC] Record limit constraints for table sizes
This commit adds a full WasmTableType to MCSymbolWasm, differing from
the current situation (just an ElemType) in that it additionally records
a WasmLimits.

We add support for specifying the limits in .S files also, via the
following syntax variations:

  .tabletype SYM, ELEMTYPE
  .tabletype SYM, ELEMTYPE, MINSIZE
  .tabletype SYM, ELEMTYPE, MINSIZE, MAXSIZE

Depends on D99186.

Differential Revision: https://reviews.llvm.org/D99191
2021-03-24 09:44:22 +01:00
Thomas Lively f5764a8654 [WebAssembly] Finalize SIMD names and opcodes
Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all
SIMD instructions to match the finalized SIMD spec. Deliberately does not change
the public interface in wasm_simd128.h yet; that will require more care.

Depends on D98466.

Differential Revision: https://reviews.llvm.org/D98676
2021-03-18 11:21:25 -07:00
Thomas Lively 2f2ae08da9 [WebAssembly] Remove experimental SIMD instructions
Removes the instruction definitions, intrinsics, and builtins for qfma/qfms,
signselect, and prefetch instructions, which were not included in the final
WebAssembly SIMD spec.

Depends on D98457.

Differential Revision: https://reviews.llvm.org/D98466
2021-03-18 11:21:24 -07:00
Thomas Lively 8638c897f4 [WebAssembly] Remove unimplemented-simd target feature
Now that the WebAssembly SIMD specification is finalized and engines are
generally up-to-date, there is no need for a separate target feature for gating
SIMD instructions that engines have not implemented. With this change,
v128.const is now enabled by default with the simd128 target feature.

Differential Revision: https://reviews.llvm.org/D98457
2021-03-18 10:23:12 -07:00
Stephen Tozer 1db137b185 [DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR
This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after
finalize-isel), excluding the debug liveness passes and DWARF emission. This
most significantly affects MachineSink, which now needs to consider all used
registers of a debug value when sinking, but for most passes this change is
simply replacing getDebugOperand(0) with an iteration over all debug operands.

Differential Revision: https://reviews.llvm.org/D92578
2021-03-10 17:15:24 +00:00
Yuta Saito aa0c571a5f [WebAssembly] Add new relocation for location relative data
This `R_WASM_MEMORY_ADDR_SELFREL_I32` relocation represents an offset
between its relocating address and the symbol address. It's very similar
to `R_X86_64_PC32` but restricted to be used for only data segments.

```
S + A - P
```

A: Represents the addend used to compute the value of the relocatable
field.
P: Represents the place of the storage unit being relocated.
S: Represents the value of the symbol whose index resides in the
relocation entry.

Proposal: https://github.com/WebAssembly/tool-conventions/issues/162

Differential Revision: https://reviews.llvm.org/D96659
2021-03-08 11:34:10 -08:00
Stephen Tozer f677413071 Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"
Rewrites test to use correct architecture triple; fixes incorrect
reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour
so as to not be dependent on changes elsewhere in the patch stack.

This reverts commit d2000b45d0.
2021-03-05 12:32:05 +00:00
Heejin Ahn 2b957ed4ff [WebAssembly] Fix ExceptionInfo grouping again
This is a case D97677 missed. When taking out remaining BBs that are
reachable from already-taken-out exceptions (because they are not
subexcptions but unwind destinations), I assumed the remaining BBs are
not EH pads, but they can be. For example,
```
try {
  try {
    throw 0;
  } catch (int) { // (a)
  }
} catch (int) {   // (b)
}
try {
  foo();
} catch (int) {   // (c)
}
```
In this code, (b) is the unwind destination of (a) so its exception is
taken out of (a)'s exception, But even though the next try-catch is not
inside the first two-level try-catches, because the first try always
throws, its continuation BB is unreachable and the whole rest of the
function is dominated by EH pad (a), including EH pad (c). So after we
take out of (b)'s exception out of (a)'s, we also need to take out (c)'s
exception out of (a)'s, because (c) is reachable from (b).

This adds one more step before what we did for remaining BBs in D97677;
it traverses EH pads first to take subexceptions out of their incorrect
parent exception. It's the same thing as D97677, but because we can do
this before we add BBs to exceptions' sets, we don't need to fix sets
and only need to fix parent exception pointers.

Other changes are variable name changes (I changed `WE` -> `SrcWE`,
`UnwindWE` -> `DstWE` for clarity), some comment changes, and a drive-by
fix in a bug in a `LLVM_DEBUG` print statement.

Fixes https://github.com/emscripten-core/emscripten/issues/13588.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D97929
2021-03-04 15:05:13 -08:00
Heejin Ahn 561abd83ff [WebAssembly] Disable uses of __clang_call_terminate
Background:

Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses
Itanium-based libraries and ABIs with some modifications.

`__clang_call_terminate` is a wrapper generated in Clang's Itanium C++
ABI implementation. It contains this code, in C-style pseudocode:
```
void __clang_call_terminate(void *exn) {
  __cxa_begin_catch(exn);
  std::terminate();
}
```
So this function is a wrapper to call `__cxa_begin_catch` on the
exception pointer before termination.

In Itanium ABI, this function is called when another exception is thrown
while processing an exception. The pointer for this second, violating
exception is passed as the argument of this `__clang_call_terminate`,
which calls `__cxa_begin_catch` with that pointer and calls
`std::terminate` to terminate the program.

The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch`
says,
```
When the personality routine encounters a termination condition, it
will call __cxa_begin_catch() to mark the exception as handled and then
call terminate(), which shall not return to its caller.
```

In wasm EH's Clang implementation, this function is called from
cleanuppads that terminates the program, which we also call terminate
pads. Cleanuppads normally don't access the thrown exception and the
wasm backend converts them to `catch_all` blocks. But because we need
the exception pointer in this cleanuppad, we generate
`wasm.get.exception` intrinsic (which will eventually be lowered to
`catch` instruction) as we do in the catchpads. But because terminate
pads are cleanup pads and should run even when a foreign exception is
thrown, so what we have been doing is:
1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure
terminate pads are in this simple shape:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
```
2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the
pipeline, we attach a `catch_all` to terminate pads, so they will be in
this form:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
catch_all
call @std::terminate()
unreachable
```
In `catch_all` part, we don't have the exception pointer, so we call
`std::terminate()` directly. The reason we ran HandleEHTerminatePads at
the end of the pipeline, separate from LateEHPrepare, was it was
convenient to assume there was only a single `catch` part per `try`
during CFGSort and CFGStackify.

---

Problem:

While it thinks terminate pads could have been possibly split or calls
to `__clang_call_terminate` could have been duplicated,
`WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate
pads contain no more than calls to `__clang_call_terminate` and
`unreachable` instruction. I assumed that because in LLVM very limited
forms of transformations are done to catchpads and cleanuppads to
maintain the scoping structure. But it turned out to be incorrect;
passes can merge cleanuppads into one, including terminate pads, as long
as the new code has a correct scoping structure. One pass that does this
I observed was `SimplifyCFG`, but there can be more. After this
transformation, a single cleanuppad can contain any number of other
instructions with the call to `__clang_call_terminate` and can span many
BBs. It wouldn't be practical to duplicate all these BBs within the
cleanuppad to generate the equivalent `catch_all` blocks, only with
calls to `__clang_call_terminate` replaced by calls to `std::terminate`.

Unless we do more complicated transformation to split those calls to
`__clang_call_terminate` into a separate cleanuppad, it is tricky to
solve.

---

Solution (?):

This CL just disables the generation and use of `__clang_call_terminate`
and calls `std::terminate()` directly in its place.

The possible downside of this approach can be, because the Itanium ABI
intended to "mark" the violating exception handled, we don't do that
anymore. What `__cxa_begin_catch` actually does is increment the
exception's handler count and decrement the uncaught exception count,
which in my opinion do not matter much given that we are about to
terminate the program anyway. Also it does not affect info like stack
traces that can be possibly shown to developers.

And while we use a variant of Itanium EH ABI, we can make some
deviations if we choose to; we are already different in that in the
current version of the EH spec we don't support two-phase unwinding. We
can possibly consider a more complicated transformation later to
reenable this, but I don't think that has high priority.

Changes in this CL contains:
- In Clang, we don't generate a call to `wasm.get.exception()` intrinsic
  and `__clang_call_terminate` function in terminate pads anymore; we
  simply generate calls to `std::terminate()`, which is the default
  implementation of `CGCXXABI::emitTerminateForUnexpectedException`.
- Remove `WebAssembly::ensureSingleBBTermPads() function and
  `WebAssemblyHandleEHTerminatePads` pass, because terminate pads are
  already `catch_all` now (because they don't need the exception
  pointer) and we don't need these transformations anymore.
- Change tests to use `std::terminate` directly. Also removes tests that
  tested `LateEHPrepare::ensureSingleBBTermPads` and
  `HandleEHTerminatePads` pass.
- Drive-by fix: Add some function attributes to EH intrinsic
  declarations

Fixes https://github.com/emscripten-core/emscripten/issues/13582.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97834
2021-03-04 14:26:35 -08:00
Stephen Tozer d2000b45d0 Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"
This reverts commit d07f106f4a.
2021-03-04 11:59:21 +00:00
gbtozers d07f106f4a [DebugInfo] Add new instruction and DIExpression operator for variadic debug values
This patch adds a new instruction that can represent variadic debug values,
DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set
of basic code changes in MachineInstr and a few adjacent areas, but does not
correctly handle variadic debug values outside of these areas, nor does it
generate them at any point.

The new instruction is similar to the existing DBG_VALUE instruction, with the
following differences: the operands are in a different order, any number of
values may be used in the instruction following the Variable and Expression
operands (these are referred to in code as “debug operands”) and are indexed
from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a
DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the
expression.

The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a
DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the
index given by the argument onto the Expression stack. For example the
sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at
index 0 onto the expression stack.”

Differential Revision: https://reviews.llvm.org/D82363
2021-03-04 11:45:35 +00:00
Andy Wingo 4307069df4 [WebAssembly] Swap operand order of call_indirect in text format
The WebAssembly text and binary formats have different operand orders
for the "type" and "table" fields of call_indirect (and
return_call_indirect).  In LLVM we use the binary order for the MCInstr,
but when we produce or consume the text format we should use the text
order.  For compilation units targetting WebAssembly 1.0 (without the
reference types feature), we omit the table operand entirely.

Differential Revision: https://reviews.llvm.org/D97761
2021-03-03 08:51:21 +01:00
Heejin Ahn 4a58116b7e [WebAssembly] Fix more ExceptionInfo grouping bugs
This fixes two bugs in `WebAssemblyExceptionInfo` grouping, created by
D97247. These two bugs are not easy to split into two different CLs,
because tests that fail for one also tend to fail for the other.

- In D97247, when fixing `ExceptionInfo` grouping by taking out
  the unwind destination' exception from the unwind src's exception, we
  just iterated the BBs in the function order, but this was incorrect;
  this changes it to dominator tree preorder. Please refer to the
  comments in the code for the reason and an example.

- After this subexception-taking-out fix, there still can be remaining
  BBs we have to take out. When Exception B is taken out of Exception A
  (because EHPad B is the unwind destination of EHPad A), there can
  still be BBs within Exception A that are reachable from Exception B,
  which also should be taken out. Please refer to the comments in the
  code for more detailed explanation on why this can happen. To make
  this possible, this splits `WebAssemblyException::addBlock` into two
  parts: adding to a set and adding to a vector. We need to iterate on
  BBs within a `WebAssemblyException` to fix this, so we add BBs to sets
  first. But we add BBs to vectors later after we fix all incorrectness
  because deleting BBs from vectors is expensive. I considered removing
  the vector from `WebAssemblyException`, but it was not easy because
  this class has to maintain a similar interface with `MachineLoop` to
  be wrapped into a single interface `SortRegion`, which is used in
  CFGSort.

Other misc. drive-by fixes:
- Make `WebAssemblyExceptionInfo` do not even run when wasm EH is not
  used or the function doesn't have any EH pads, not to waste time
- Add `LLVM_DEBUG` lines for easy debugging
- Fix `preds` comments in cfg-stackify-eh.ll
- Fix `__cxa_throw`'s signature in cfg-stackify-eh.ll

Fixes https://github.com/emscripten-core/emscripten/issues/13554.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97677
2021-03-02 13:44:09 -08:00
Heejin Ahn dcfec279d6 [WebAssembly] Handle empty cleanuppads when adding catch_all
In `LateEHPrepare::addCatchAlls`, the current code tries to get the
iterator's debug info even when it is `MachineBasicBlock::end()`. This
fixes the bug by adding empty debug info instead in that case.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D97679
2021-03-01 10:07:05 -08:00
Andy Wingo 2632ba6a35 [WebAssembly] call_indirect issues table number relocs
If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via TABLE_NUMBER
relocations against a table symbol.

Also, as before, address-taken functions can also cause the function
table to be created, only with reference-types they additionally cause a
symbol table entry to be emitted.

Differential Revision: https://reviews.llvm.org/D90948
2021-03-01 16:49:00 +01:00
Heejin Ahn aa097ef8d4 [WebAssembly] Fix reverse mapping in WasmEHFuncInfo
D97247 added the reverse mapping from unwind destination to their
source, but it had a critical bug; sources can be multiple, because
multiple BBs can have a single BB as their unwind destination.

This changes `WasmEHFuncInfo::getUnwindSrc` to `getUnwindSrcs` and makes
it return a vector rather than a single BB. It does not return the const
reference to the existing vector but creates a new vector because
`WasmEHFuncInfo` stores not `BasicBlock*` or `MachineBasicBlock*` but
`PointerUnion` of them. Also I hoped to unify those methods for
`BasicBlock` and `MachineBasicBlock` into one using templates to reduce
duplication, but failed because various usages require `BasicBlock*` to
be `const` but it's hard to make it `const` for `MachineBasicBlock`
usages.

Fixes https://github.com/emscripten-core/emscripten/issues/13514.
(More precisely, fixes
https://github.com/emscripten-core/emscripten/issues/13514#issuecomment-784708744)

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97583
2021-02-26 17:12:10 -08:00
Dan Gohman c62dabc3f5 [WebAssembly] Avoid `bit_cast` when printing f32 and f64 immediates
Use `APInt` to convert a 32-bit or 64-bit immediate to an `APFloat` rather than
`bit_cast` to a `float` or `double` to avoid going through host floating-point and
potentially changing the bit pattern of NaNs.

Differential Revision: https://reviews.llvm.org/D97490
2021-02-26 14:19:02 -08:00
Heejin Ahn d8b3dc5a68 [WebAssembly] Fix remapping branch dests in fixCatchUnwindMismatches
This is a case D97178 tried to solve but missed. D97178 could not handle
the case when
multiple consecutive delegates are generated:
- Before:
```
block
  br (a)
  try
  catch
  end_try
end_block
          <- (a)
```

- After
```
block
  br (a)
  try
    ...
    try
      try
      catch
      end_try
            <- (a)
    delegate
  delegate
end_block
          <- (b)
```
(The `br` should point to (b) now)

D97178 assumed `end_block` exists two BBs later than `end_try`, because
it assumed the order as `end_try` BB -> `delegate` BB -> `end_block` BB.
But it turned out there can be multiple `delegate`s in between. This
patch changes the logic so we just search from `end_try` BB until we
find `end_block`.

Fixes https://github.com/emscripten-core/emscripten/issues/13515.
(More precisely, fixes
https://github.com/emscripten-core/emscripten/issues/13515#issuecomment-784711318.)

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97569
2021-02-26 13:38:13 -08:00
Heejin Ahn ea8c6375e3 [WebAssembly] Fix incorrect grouping and sorting of exceptions
This CL is not big but contains changes that span multiple analyses and
passes. This description is very long because it tries to explain basics
on what each pass/analysis does and why we need this change on top of
that. Please feel free to skip parts that are not necessary for your
understanding.

---

`WasmEHFuncInfo` contains the mapping of <EH pad, the EH pad's next
unwind destination>. The value (unwind dest) here is where an exception
should end up when it is not caught by the key (EH pad). We record this
info in WasmEHPrepare to fix catch mismatches, because the CFG itself
does not have this info. A CFG only contains BBs and
predecessor-successor relationship between them, but in `WasmEHFuncInfo`
the unwind destination BB is not necessarily a successor or the key EH
pad BB. Their relationship can be intuitively explained by this C++ code
snippet:
```
try {
  try {
    foo();
  } catch (int) { // EH pad
    ...
  }
} catch (...) {   // unwind destination
}
```
So when `foo()` throws, it goes to `catch (int)` first. But if it is not
caught by it, it ends up in the next unwind destination `catch (...)`.
This unwind destination is what you see in `catchswitch`'s
`unwind label %bb` part.

---

`WebAssemblyExceptionInfo` groups exceptions so that they can be sorted
continuously together in CFGSort, as we do for loops. What this analysis
does is very simple: it creates a single `WebAssemblyException` per EH
pad, and all BBs that are dominated by that EH pad are included in this
exception. We also identify subexception relationship in this way: if
EHPad A domiantes EHPad B, EHPad B's exception is a subexception of
EHPad A's exception.

This simple rule turns out to be incorrect in some cases. In
`WasmEHFuncInfo`, if EHPad A's unwind destination is EHPad B, it means
semantically EHPad B should not be included in EHPad A's exception,
because it does not make sense to rethrow/delegate to an inner scope.
This is what happened in CFGStackify as a result of this:
```
try
  try
  catch
    ...   <- %dest_bb is among here!
  end
delegate %dest_bb
```

So this patch adds a phase in `WebAssemblyExceptionInfo::recalculate` to
make sure excptions' unwind destinations are not subexceptions of
their unwind sources in `WasmEHFuncInfo`.

But this alone does not prevent `dest_bb` in the example above from
being sorted within the inner `catch`'s exception, even if its exception
is not a subexception of that `catch`'s exception anymore, because of
how CFGSort works, which will be explained below.

---

CFGSort places BBs within the same `SortRegion` (loop or exception)
continuously together so they can be demarcated with `loop`-`end_loop`
or `catch`-`end_try` in CFGStackify.

`SortRegion` is a wrapper for one of `MachineLoop` or
`WebAssemblyException`. `SortRegionInfo` already does some complicated
things because there discrepancies between those two data structures.
`WebAssemblyException` is what we control, and it is defined as an EH
pad as its header and BBs dominated by the header as its BBs (with a
newly added exception of unwind destinations explained in the previous
paragraph). But `MachineLoop` is an LLVM data structure and uses the
standard loop detection algorithm. So by the algorithm, BBs that are 1.
dominated by the loop header and 2. have a path back to its header.
Because of the second condition, many BBs that are dominated by the loop
header are not included in the loop. So BBs that contain `return` or
branches to outside of the loop are not technically included in
`MachineLoop`, but they can be sorted together with the loop with no
problem.

Maybe to relax the condition, in CFGSort, when we are in a `SortRegion`
we allow sorting of not only BBs that belong to the current innermost
region but also BBs that are by the current region header.
(This was written this way from the first version written by Dan, when
only loops existed.) But now, we have cases in exceptions when EHPad B
is the unwind destination for EHPad A, even if EHPad B is dominated by
EHPad A it should not be included in EHPad A's exception, and should not
be sorted within EHPad A.

One way to make things work, at least correctly, is change `dominates`
condition to `contains` condition for `SortRegion` when sorting BBs, but
this will change compilation results for existing non-EH code and I
can't be sure it will not degrade performance or code size. I think it
will degrade performance because it will force many BBs dominated by a
loop, which don't have the path back to the header, to be placed after
the loop and it will likely to create more branches and blocks.

So this does a little hacky check when adding BBs to `Preferred` list:
(`Preferred` list is a ready list. CFGSort maintains ready list in two
priority queues: `Preferred` and `Ready`. I'm not very sure why, but it
was written that way from the beginning. BBs are first added to
`Preferred` list and then some of them are pushed to `Ready` list, so
here we only need to guard condition for `Preferred` list.)

When adding a BB to `Preferred` list, we check if that BB is an unwind
destination of another BB. To do this, this adds the reverse mapping,
`UnwindDestToSrc`, and getter methods to `WasmEHFuncInfo`. And if the BB
is an unwind destination, it checks if the current stack of regions
(`Entries`) contains its source BB by traversing the stack backwards. If
we find its unwind source in there, we add the BB to its `Deferred`
list, to make sure that unwind destination BB is added to `Preferred`
list only after that region with the unwind source BB is sorted and
popped from the stack.

---

This does not contain a new test that crashes because of this bug, but
this fix changes the result for one of existing test case. This test
case didn't crash because it fortunately didn't contain `delegate` to
the incorrectly placed unwind destination BB.

Fixes https://github.com/emscripten-core/emscripten/issues/13514.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97247
2021-02-23 14:54:55 -08:00
Andy Wingo 7dc98adbb0 Revert "[WebAssembly] call_indirect issues table number relocs"
This reverts commit 861dbe1a02.  It broke
emscripten -- see https://reviews.llvm.org/D90948#2578843.
2021-02-23 11:48:08 +01:00
Heejin Ahn f47a654a39 [WebAssembly] Remap branch dests after fixCatchUnwindMismatches
Fixing catch unwind mismatches can sometimes invalidate existing branch
destinations. This CL remaps those destinations after placing
try-delegates.

Fixes https://github.com/emscripten-core/emscripten/issues/13515.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D97178
2021-02-22 13:25:58 -08:00
Heejin Ahn 51fb5bf4d6 [WebAssembly] Support WasmEHFuncInfo serialization
This adds support for serialization of `WasmEHFuncInfo`, in the form of
<Source BB Number, Unwind destination BB number>. To make YAML mapping
work, we needed to make a copy of the existing `SrcToUnwindDest` map
within `yaml::WebAssemblyMachineFunctionInfo`.

It was hard to add EH MIR tests for CFGStackify because `WasmEHFuncInfo`
could not be read from test MIR files. This adds the serialization
support for that to make EH MIR tests easier.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D97174
2021-02-22 13:13:51 -08:00
Heejin Ahn a08e609d2e [WebAssembly] Rename methods in WasmEHFuncInfo (NFC)
This renames variable and method names in `WasmEHFuncInfo` class to be
simpler and clearer. For example, unwind destinations are EH pads by
definition so it doesn't necessarily need to be included in every method
name. Also I am planning to add the reverse mapping in a later CL,
something like `UnwindDestToSrc`, so this renaming will make meanings
clearer.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D97173
2021-02-22 12:16:11 -08:00
Andy Wingo 861dbe1a02 [WebAssembly] call_indirect issues table number relocs
If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via `TABLE_NUMBER`
relocations against a table symbol.

Also, as before, address-taken functions can also cause the function
table to be created, only with reference-types they additionally cause a
symbol table entry to be emitted.

We abuse the used-in-reloc flag on symbols to indicate which tables
should end up in the symbol table.  We do this because unfortunately
older wasm-ld will carp if it see a table symbol.

Differential Revision: https://reviews.llvm.org/D90948
2021-02-22 10:13:36 +01:00
Heejin Ahn 6f2999b36a [WebAssembly] Handle multiple EH_LABELs in EH pad
Usually `EH_LABEL`s are placed in
- Before an `invoke` (which becomes calls in the backend)
- After an `invoke`
- At the start of an EH pad

I don't know exactly why, but I noticed there are cases of multiple, not
a single, `EH_LABEL` instructions in the beginning of an EH pad. In that
case `global.set` instruction placed to restore `__stack_pointer` ended
up between two `EH_LABEL` instructions before `CATCH`. It should follow
after the `EH_LABEL`s and `CATCH`. This CL fixes that case.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D96970
2021-02-18 10:18:00 -08:00
Heejin Ahn da01a9db8b [WebAssemblly] Fix EHPadStack update in fixCallUnwindMismatches
Updating `EHPadStack` with respect to `TRY` and `CATCH` instructions
have to be done after checking all other conditions, not before. Because
we did this before checking other conditions, when we encounter `TRY`
and we want to record the current mismatching range, we already have
popped up the entry from `EHPadStack`, which we need to access to record
the range.

The `baz` call in the added test needs try-delegate because the previous
TRY marker placement for `quux` was placed before `baz`, because `baz`'s
return value was stackified in RegStackify. If this wasn't stackified
this try-delegate is not strictly necessary, but at the moment it is not
easy to identify cases like this. I plan to transfer `nounwind`
attributes from the LLVM IR to prevent cases like this. The call in the
test does not have `unwind` attribute in order to test this bug, but in
many cases of this pattern the previous call has `nounwind` attribute.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D96711
2021-02-17 12:14:11 -08:00
Heejin Ahn 7c594bab00 [WebAssembly] Change catch_all's opcode
We decided to change `catch_all`'s opcode from 0x05, which is the same
as `else`, to 0x19, to avoid some complicated handling in the tools.

See: https://github.com/WebAssembly/exception-handling/issues/147

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D96863
2021-02-17 10:16:23 -08:00
Heejin Ahn 35f5f797a6 [WebAssemblly] Fix rethrow's argument computation
Previously we assumed `rethrow`'s argument was always 0, but it turned
out `rethrow` follows the same rule with `br` or `delegate`:
https://github.com/WebAssembly/exception-handling/pull/137
https://github.com/WebAssembly/exception-handling/issues/146#issuecomment-777349038

Currently `rethrow`s generated by our backend always rethrow the
exception caught by the innermost enclosing catch, so this adds a
function to compute that and replaces `rethrow`'s argument with its
computed result.

This also renames `EHPadStack` in `InstPrinter` to `TryStack`, because
in CFGStackify we use `EHPadStack` to mean the range between
`catch`~`end`, while in `InstPrinter` we used it to mean the range
between `try`~`catch`, so choosing different names would look clearer.
Doesn't contain any functional changes in `InstPrinter`.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D96595
2021-02-13 03:43:15 -08:00
Heejin Ahn 2968611fda [WebAssembly] Fix delegate's argument computation
I previously assumed `delegate`'s immediate argument computation
followed a different rule than that of branches, but we agreed to make
it the same
(https://github.com/WebAssembly/exception-handling/issues/146). This
removes the need for a separate `DelegateStack` in both CFGStackify and
InstPrinter.

When computing the immediate argument, we use a different function for
`delegate` computation because in MIR `DELEGATE`'s instruction's
destination is the destination catch BB or delegate BB, and when it is a
catch BB, we need an additional step of getting its corresponding `end`
marker.

Reviewed By: tlively, dschuff

Differential Revision: https://reviews.llvm.org/D96525
2021-02-11 21:57:28 -08:00
Sam Parker 9d81ccc02f [WebAssembly] Enable loop unrolling
Enable partial and runtime unrolling with a threshold of 30, which
was derived from a large number of kernels running on node and
wasmtime for amd64 and aarch64.

Unrolling is enabled by default at -O2 and -O3 and is disabled at
-Oz and -Os. Compiling with -Os is recommended if the wasm binary
size is the most important factor.

Differential Revision: https://reviews.llvm.org/D95125
2021-02-10 08:25:46 +00:00
Simon Pilgrim db5abfbbb4 [WebAssembly] Fix multiclass template parameter types. NFC.
Fixes TableGen parser errors reported by D95874 due to incompatible types being used on multiclass templates.

Differential Revision: https://reviews.llvm.org/D96205
2021-02-08 09:36:56 +00:00
Heejin Ahn 5afdd64a53 [WebAssembly] Update InstPrinter and AsmParser for new EH instructions
This updates InstPrinter and AsmParser for `delegate` and `catch_all`
instructions. Both will reject programs with multiple `catch_all`s per a
single `try`. And InstPrinter uses `EHInstStack` to figure out whether
to print catch label comments: It does not print catch label comments
for second `catch` or `catch_all` in a `try`.

Reviewed By: aardappel

Differential Revision: https://reviews.llvm.org/D94051
2021-02-06 08:54:56 -08:00
Heejin Ahn be0efa1f23 [WebAssembly] Handle EH terminate pads for cleanup
Terminate pads, cleanup pads with `__clang_call_terminate` call, have
`catch` instruction in them because `__clang_call_terminate` takes an
exception pointer. But these terminate pads should be reached also in
case of foreign exception. So this pass attaches an additional
`catch_all` BB after every terminate pad BB, with a call to
`std::terminate`.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D94050
2021-02-06 08:40:30 -08:00
Heejin Ahn 9f770b36cb [WebAssembly] Fix catch unwind mismatches
This fixes unwind destination mismatches caused by 'catch'es, which
occur when a foreign exception is not caught by the nearest `catch` and
the next outer `catch` is not the catch it should unwind to, or the next
unwind destination should be the caller instead. This kind of mismatches
didn't exist in the previous version of the spec, because in the
previous spec `catch` was effectively `catch_all`, catching all
exceptions.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D94049
2021-02-06 07:13:38 -08:00
Heejin Ahn ed41945faa [WebAssembly] Fix call unwind mismatches
This adds `delegate` instruction and use it to fix unwind destination
mismatches created by marker placement in CFGStackify.

There are two kinds of unwind destination mismatches:
- Mismatches caused by throwing instructions (here we call it "call
  unwind mismatches", even though `throw` and `rethrow` can also cause
  mismatches)
- Mismatches caused by `catch`es, in case a foreign exception is not
  caught by the nearest `catch` and the next outer `catch` is not the
  catch it should unwind to. This kind of mismatches didn't exist in the
  previous version of the spec, because in the previous spec `catch` was
  effectively `catch_all`, catching all exceptions.

This implements routines to fix the first kind of unwind mismatches,
which we call "call unwind mismatches". The second mismatch (catch
unwind mismatches) will be fixed in a later CL.

This also reenables all previously disabled tests in cfg-stackify-eh.ll
and updates FileCheck lines to match the new spec. Two tests were
deleted because they specifically tested the way we fixed unwind
mismatches before using `exnref`s and branches, which we don't do
anymore.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D94048
2021-02-06 07:07:04 -08:00
Wouter van Oortmerssen a872ee2f36 [WebAssembly] ensure .functype applies to right label in assembler
We used to require .functype immediately follows the label it sets the type of, but not all Clang output follows this rule.

Now we simply allow it on any symbol, but only assume its a function start for a defined symbol, which is simpler and more general.

Fixes (part of) https://bugs.llvm.org/show_bug.cgi?id=49036

Differential Revision: https://reviews.llvm.org/D96165
2021-02-05 15:36:15 -08:00
Wouter van Oortmerssen 5e5b2cb131 [WebAssembly] Prevent data inside text sections in assembly
This is not supported in Wasm, unless the data was encoded instructions, but that wouldn't work with the assembler's other functionality (enforcing nesting etc.).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=48971

Differential Revision: https://reviews.llvm.org/D95838
2021-02-05 13:48:25 -08:00
Wouter van Oortmerssen e3c0b0fe09 [WebAssembly] locals can now be indirect in DWARF
This for example to indicate that byval args are represented by a pointer to a struct.
Followup to https://reviews.llvm.org/D94140

Differential Revision: https://reviews.llvm.org/D94347
2021-02-05 11:14:42 -08:00
Fangrui Song e5269da979 [ARM][WebAssembly] Fix incorrect MCOperand::createDFPImm after D96091 2021-02-04 20:39:52 -08:00
Craig Topper 11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Dan Gohman 698c6b0a09 [WebAssembly] Support single-floating-point immediate value
As mentioned in TODO comment, casting double to float causes NaNs to change bits.
To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode.

Patch by Yuta Saito.

Differential Revision: https://reviews.llvm.org/D77384
2021-02-04 18:05:06 -08:00
Kazu Hirata 046cfb8565 [llvm] Forward-declare formatted_raw_ostream (NFC)
Various *TargetStreamer.h need formatted_raw_ostream but rely on a
forward declaration of formatted_raw_ostream in MCStreamer.h.  This
patch adds forward declarations right in *TargetStreamer.h.

While we are at it, this patch removes the one in MCStreamer.h, where
it is unnecessary.
2021-01-28 22:21:13 -08:00
Thomas Lively 4b68b64dcc [WebAssembly] Prototype i8x16 to i32x4 widening instructions
As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the
opcodes used in V8:
https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h

Differential Revision: https://reviews.llvm.org/D95557
2021-01-28 10:59:32 -08:00
Wouter van Oortmerssen 275c6af7d7 [WebAssembly] Fix Fast ISEL not lowering 64-bit function pointers
Differential Revision: https://reviews.llvm.org/D95410
2021-01-28 10:05:29 -08:00
Kazu Hirata 8f5da41c4d [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-20 21:35:52 -08:00
Thomas Lively 11802eced5 [WebAssembly] Prototype new f64x2 conversions
As proposed in https://github.com/WebAssembly/simd/pull/383.

Differential Revision: https://reviews.llvm.org/D95012
2021-01-20 11:28:06 -08:00
Sam Clegg 96ef4f307d Revert "[WebAssembly] call_indirect issues table number relocs"
This reverts commit 418df4a6ab.

This change broke emscripten tests, I believe because it started
generating 5-byte a wide table index in the call_indirect instruction.
Neither v8 nor wabt seem to be able to handle that.  The spec
currently says that this is single 0x0 byte and:

"In future versions of WebAssembly, the zero byte occurring in the
encoding of the call_indirectcall_indirect instruction may be used to
index additional tables."

So we need to revisit this change.  For backwards compat I guess
we need to guarantee that __indirect_function_table is always at
address zero.   We could also consider making this a single-byte
relocation with and assert if have more than 127 tables (for now).

Differential Revision: https://reviews.llvm.org/D95005
2021-01-19 15:06:07 -08:00
Andy Wingo 418df4a6ab [WebAssembly] call_indirect issues table number relocs
This patch changes to make call_indirect explicitly refer to the
corresponding function table, residualizing TABLE_NUMBER relocs against
it.

With this change, wasm-ld now sees all references to tables, and can
link multiple tables.

Differential Revision: https://reviews.llvm.org/D90948
2021-01-19 09:32:45 +01:00
Kazu Hirata 23b0ab2acb [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00
Kazu Hirata 2082b10d10 [llvm] Use *::empty (NFC) 2021-01-16 09:40:55 -08:00
Kazu Hirata 2efcbe24a7 [llvm] Use llvm::drop_begin (NFC) 2021-01-14 20:30:33 -08:00
Heejin Ahn c93b955939 [WebAssembly] Remove more unnecessary brs in CFGStackify
After placing markers, we removed some unnecessary branches, but it only
handled the simplest case. This makes more unnecessary branches to be
removed.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94047
2021-01-12 01:18:10 -08:00
Heejin Ahn 1cc5235712 [WebAssembly] Misc. refactoring in CFGStackify (NFC)
Updating `ScopeTops` is something we frequently do in CFGStackify, so
this factors it out as a function. This also makes a few utility
functions templated so that they are not dependent on input vector
types and simplifies function parameters.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D94046
2021-01-12 00:36:27 -08:00
Heejin Ahn 9f8b25769e [WebAssembly] Ensure terminate pads are a single BB
This ensures every single terminate pad is a single BB in the form of:
```
%exn = catch $__cpp_exception
call @__clang_call_terminate(%exn)
unreachable
```

This is a preparation for HandleEHTerminatePads pass, which will be
added in a later CL and will run after CFGStackify. That pass duplicates
terminate pads with a `catch_all` instruction, and duplicating it
becomes simpler if we can ensure every terminate pad is a single BB.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94045
2021-01-11 17:54:28 -08:00
Heejin Ahn 4e4df1e38d [WebAssembly] Remove unreachable EH pads
This removes unreachable EH pads in LateEHPrepare. This is not for
optimization but for preparation for CFGStackify. In CFGStackify, we
determine where to place `try` marker by computing the nearest common
dominator of all predecessors of an EH pad, but when an EH pad does not
have a predecessor, it becomes tricky. We can insert an empty dummy BB
before the EH pad and place the `try` there, but removing unreachable EH
pads is simpler.

This moves an existing exception label test from eh-label.mir to
exception.mir and adds a new test there.

This also adds some comments to existing methods.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94044
2021-01-09 03:42:38 -08:00
Heejin Ahn 0d8dfbb42a [WebAssembly] Update InstPrinter support for EH
- Updates InstPrinter to handle `catch_all`.
- Makes `rethrow` condition an early exit from the function to make the
  rest simpler.
- Unify label and catch counters. They don't need to be counted
  separately and this will help `delegate` instruction later.
- Removes `LastSeenEHInst` field. This was first introduced to handle
  when there are more than one `catch` blocks per `try`, but this was
  not implemented correctly and not being used at the moment anyway.
- Reenables all tests in cfg-stackify-eh.ll that don't deal with unwind
  destination mismatches, which will be handled in a later CL.

Reviewed By: dschuff, tlively, aardappel

Differential Revision: https://reviews.llvm.org/D94043
2021-01-09 02:42:35 -08:00
Heejin Ahn 52e240a072 [WebAssembly] Remove exnref and br_on_exn
This removes `exnref` type and `br_on_exn` instruction. This is
effectively NFC because most uses of these were already removed in the
previous CLs.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94041
2021-01-09 02:02:54 -08:00
Heejin Ahn 9e4eadeb13 [WebAssembly] Update basic EH instructions for the new spec
This implements basic instructions for the new spec.

- Adds new versions of instructions: `catch`, `catch_all`, and `rethrow`
- Adds support for instruction selection for the new instructions
 - `catch` needs a custom routine for the same reason `throw` needs one,
   to encode `__cpp_exception` tag symbol.
- Updates `WebAssembly::isCatch` utility function to include `catch_all`
  and Change code that compares an instruction's opcode with `catch` to
  use that function.
- LateEHPrepare
  - Previously in LateEHPrepare we added `catch` instruction to both
    `catchpad`s (for user catches) and `cleanuppad`s (for destructors).
    In the new version `catch` is generated from `llvm.catch` intrinsic
    in instruction selection phase, so we only need to add `catch_all`
    to the beginning of cleanup pads.
  - `catch` is generated from instruction selection, but we need to
    hoist the `catch` instruction to the beginning of every EH pad,
    because `catch` can be in the middle of the EH pad or even in a
    split BB from it after various code transformations.
  - Removes `addExceptionExtraction` function, which was used to
    generate `br_on_exn` before.
- CFGStackfiy: Deletes `fixUnwindMismatches` function. Running this
  function on the new instruction causes crashes, and the new version
  will be added in a later CL, whose contents will be completely
  different. So deleting the whole function will make the diff easier to
  read.
- Reenables all disabled tests in exception.ll and eh-lsda.ll and a
  single basic test in cfg-stackify-eh.ll.
- Updates existing tests to use the new assembly format. And deletes
  `br_on_exn` instructions from the tests and FileCheck lines.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94040
2021-01-09 01:48:06 -08:00
Heejin Ahn 9724c3cff4 [WebAssembly] Update WasmEHPrepare for the new spec
Clang generates `wasm.get.exception` and `wasm.get.ehselector`
intrinsics, which respectively return a caught exception value (a
pointer to some C++ exception struct) and a selector (an integer value
that tells which C++ `catch` clause the current exception matches, or
does not match any).

WasmEHPrepare is a pass that does some IR-level preparation before
instruction selection. Previously one of things we did in this pass was
to convert `wasm.get.exception` intrinsic calls to
`wasm.extract.exception` intrinsics. Their semantics were the same
except `wasm.extract.exception` did not have a token argument. We
maintained these two separate intrinsics with the same semantics because
instruction selection couldn't handle token arguments. This
`wasm.extract.exception` intrinsic was later converted to
`extract_exception` instruction in instruction selection, which was a
pseudo instruction to implement `br_on_exn`. Because `br_on_exn` pushed
an extracted value onto the value stack after the `end` instruction of a
`block`, but LLVM does not have a way of modeling that kind of behavior,
so this pseudo instruction was used to pull an extracted value out of
thin air, like this:
```
block $l0
  ...
  br_on_exn $cpp_exception $l0
  ...
end
extract_exception ;; pushes values onto the stack
```

In the new spec, we don't need this pseudo instruction anymore because
`catch` itself returns a value and we don't have `br_on_exn` anymore. In
the spec `catch` returns multiple values (like `br_on_exn`), but here we
assume it only returns a single i32, which is sufficient to support C++.

So this renames `wasm.get.exception` intrinsic to `wasm.catch`. Because
this CL does not yet contain instruction selection for `wasm.catch`
intrinsic, all `RUN` lines in exception.ll, eh-lsda.ll, and
cfg-stackify-eh.ll, and a single `RUN` line in wasm-eh.cpp (which is an
end-to-end test from C++ source to assembly) fail. So this CL
temporarily disables those `RUN` lines, and for those test files without
any valid remaining `RUN` lines, adds a dummy `RUN` line to make them
pass. These tests will be reenabled in later CLs.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94039
2021-01-08 23:38:26 -08:00
Heejin Ahn 7be271537e [WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin
`wasm_rethrow_in_catch` intrinsic and builtin are used in order to
rethrow an exception when the exception is caught but there is no
matching clause within the current `catch`. For example,
```
try {
  foo();
} catch (int n) {
  ...
}
```
If the caught exception does not correspond to C++ `int` type, it should
be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch`
because at the time I thought there would be another intrinsic for C++'s
`throw` keyword, which rethrows an exception. It turned out that `throw`
keyword doesn't require wasm's `rethrow` instruction, so we rename
`rethrow_in_catch` to just `rethrow` here.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94038
2021-01-08 06:55:04 -08:00
Kazu Hirata b934160aaa [Target] Use llvm::find_if (NFC) 2021-01-07 20:29:36 -08:00
Wouter van Oortmerssen 5c38ae36c5 [WebAssembly] Fixed byval args missing DWARF DW_AT_LOCATION
A struct in C passed by value did not get debug information. Such values are currently
lowered to a Wasm local even in -O0 (not to an alloca like on other archs), which becomes
a Target Index operand (TI_LOCAL). The DWARF writing code was not emitting locations
in for TI's specifically if the location is a single range (not a list).

In addition, the ExplicitLocals pass which removes the ARGUMENT pseudo instructions did
not update the associated DBG_VALUEs, and couldn't even find these values since the code
assumed such instructions are adjacent, which is not the case here.

Also fixed asm printing of TIs needed by a test.

Differential Revision: https://reviews.llvm.org/D94140
2021-01-07 10:31:38 -08:00
Matt Arsenault c9122ddef5 CodeGen: Refactor regallocator command line and target selection
Make the sequence of passes to select and rewrite instructions to
physical registers be a target callback. This is to prepare to allow
targets to split register allocation into multiple phases.
2021-01-07 13:13:25 -05:00
Thomas Lively 497026c902 [WebAssembly] Prototype prefetch instructions
As proposed in https://github.com/WebAssembly/simd/pull/352 and using the
opcodes used in the V8 prototype:
https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions
are only usable via intrinsics and clang builtins to make them opt-in while they
are being benchmarked.

Differential Revision: https://reviews.llvm.org/D93883
2021-01-05 11:32:03 -08:00
Andy Wingo 9ad83fd6dc [WebAssembly] call_indirect causes indirect function table import
For wasm-ld table linking work to proceed, object files should indicate
if they use an indirect function table.  In the future this will be done
by the usual symbols and relocations mechanism, but until that support
lands in the linker, the presence of an `__indirect_function_table` in
the object file's import section shows that the object file needs an
indirect function table.

Prior to https://reviews.llvm.org/D91637, this condition was met by all
object files residualizing an `__indirect_function_table` import.

Since https://reviews.llvm.org/D91637, the intention has been that only
those object files needing an indirect function table would have the
`__indirect_function_table` import.  However, we missed the case of
object files which use the table via `call_indirect` but which
themselves do not declare any indirect functions.

This changeset makes it so that when we lower a call to `call_indirect`,
that we ensure that a `__indirect_function_table` symbol is present and
that it will be propagated to the linker.

A followup patch will revise this mechanism to make an explicit link
between `call_indirect` and its associated indirect function table; see
https://reviews.llvm.org/D90948.

Differential Revision: https://reviews.llvm.org/D92840
2021-01-05 11:09:24 +01:00
Heejin Ahn c4f12a07a4 [WebAssembly] Remove old SDT_WebAssemblyCalls (NFC)
These are not used anymore.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D94036
2021-01-04 16:31:16 -08:00
Kazu Hirata 0e219b6443 [Target] Construct SmallVector with iterator ranges (NFC) 2021-01-03 09:57:45 -08:00
Thomas Lively 5e09e9979b [WebAssembly] Prototype extending pairwise add instructions
As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes
the new instructions available only via clang builtins and LLVM intrinsics to
make their use opt-in while they are still being evaluated for inclusion in the
SIMD proposal.

Depends on D93771.

Differential Revision: https://reviews.llvm.org/D93775
2020-12-28 14:11:14 -08:00
Thomas Lively 44ee14f993 [WebAssembly][NFC] Finish cleaning up SIMD tablegen
This commit is a follow-on to c2c2e9119e73, using the `Vec` records introduced
in that commit in the rest of the SIMD instruction definitions. Also removes
unnecessary types in output patterns.

Differential Revision: https://reviews.llvm.org/D93771
2020-12-28 13:59:23 -08:00
Thomas Lively efe7f5ede0 [WebAssembly][NFC] Refactor SIMD load/store tablegen defs
Introduce `Vec` records, each bundling all information related to a single SIMD
lane interpretation. This lets TableGen definitions take a single Vec parameter
from which they can extract information rather than taking multiple redundant
parameters. This commit refactors all of the SIMD load and store instruction
definitions to use the new `Vec`s. Subsequent commits will similarly refactor
additional instruction definitions.

Differential Revision: https://reviews.llvm.org/D93660
2020-12-22 20:06:12 -08:00
Thomas Lively a781a706b9 [WebAssembly][SIMD] Rename shuffle, swizzle, and load_splats
These instructions previously used prefixes like v8x16 to signify that they were
agnostic between float and int interpretations. We renamed these instructions to
remove this form of prefix in https://github.com/WebAssembly/simd/issues/297 and
https://github.com/WebAssembly/simd/issues/316 and this commit brings the names
in LLVM up to date.

Differential Revision: https://reviews.llvm.org/D93722
2020-12-22 14:29:06 -08:00
Kazu Hirata 966f1431de [Target] Use llvm::erase_if (NFC) 2020-12-20 17:43:22 -08:00
Derek Schuff 8d396acac3 [WebAssembly] Support COMDAT sections in assembly syntax
This CL changes the asm syntax for section flags, making them more like ELF
(previously "passive" was the only option). Now we also allow "G" to designate
COMDAT group sections. In these sections we set the appropriate comdat flag on
function symbols, and also avoid auto-creating a new section for them.

This also adds asm-based tests for the changes D92691 to go along with
the direct-to-object tests.

Differential Revision: https://reviews.llvm.org/D92952
This is a reland of rG4564553b8d8a with a fix to the lit pipeline in
llvm/test/MC/WebAssembly/comdat.ll
2020-12-10 16:43:59 -08:00
Derek Schuff dd1aa4fdd8 Revert "[WebAssembly] Support COMDAT sections in assembly syntax"
This reverts commit 4564553b8d.
It broke several buildbots.
2020-12-10 15:55:33 -08:00
Derek Schuff 4564553b8d [WebAssembly] Support COMDAT sections in assembly syntax
This CL changes the asm syntax for section flags, making them more like ELF
(previously "passive" was the only option). Now we also allow "G" to designate
COMDAT group sections. In these sections we set the appropriate comdat flag on
function symbols, and also avoid auto-creating a new section for them.

This also adds asm-based tests for the changes D92691 to go along with
the direct-to-object tests.

Differential Revision: https://reviews.llvm.org/D92952
2020-12-10 14:46:24 -08:00
Sam Clegg 1b6d879ec1 [WebAssembly] Fix code generated for atomic operations in PIC mode
The main this this test does is to add the `IsNotPIC` predicate to the
all the atomic instructions pattern that directly refer to
`tglobaladdr`.

This is because in PIC mode we need to generate separate instruction
sequence (either a direct global.get, or __memory_base + offset) for
accessing global addresses.

As part of this change I noticed that many of the `Requires` attributes
added to the instruction in `WebAssemblyInstrAtomics.td` were being
honored.  This is because the wrapped in a `let Predicates =
[HasAtomics]` block and it seems that that outer wrapping overrides any
`Requires` on defs within it.   As a workaround I removed the outer
`let` and added `HasAtomics` to all the inner `Requires`.  I believe
that all the instrucitons that don't have `Requires` explicit bottom out
in `ATOMIC_I` and `ATOMIC_NRI` which have `HasAtomics` so this should
not remove this predicate from any patterns (at least that is the idea).

The alternative to this approach looks like implementing something
like `PredicateControl` in `Mips.td` where we can split the predicates
into groups so they don't clobber each other.

Differential Revision: https://reviews.llvm.org/D92744
2020-12-08 18:41:32 -08:00
Heejin Ahn 60653e24b6 [WebAssembly] Support select and block for reference types
This adds missing `select` instruction support and block return type
support for reference types. Also refactors WebAssemblyInstrRef.td and
rearranges tests in reference-types.s. Tests don't include `exnref`
types, because we currently don't support `exnref` for `ref.null` and
the type will be removed soon anyway.

Reviewed By: tlively, sbc100, wingo

Differential Revision: https://reviews.llvm.org/D92359
2020-12-01 19:16:57 -08:00
snek 803af31e5b [WebAssembly] Support fp reg class in r constraint
Patch by snek

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D90978
2020-11-18 17:05:58 -08:00
Florian Hahn b2f4c5fddc
[AsmWriter] Factor out mnemonic generation to accessible getMnemonic.
This patch factors out the part of printInstruction that gets the
mnemonic string for a given MCInst. This is intended to be used
subsequently for the instruction-mix remarks to display the final
mnemonic (D90040).

Unfortunately making `getMnemonic` available to the AsmPrinter
seems to require making it virtual. Not sure if there's a way around
that with the current layering of the AsmPrinters.

Reviewed By: Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D90039
2020-11-17 09:47:38 +00:00
Sam Clegg a083b28a31 [WebAssembly] Move GlobalTLSAddress handling to WebAssemblyISelLowering. NFC
I'm not why it was added to DAGToDAG oringally but it seems
to make sense alongside the non-TLS version:  LowerGlobalAddress

Differential Revision: https://reviews.llvm.org/D91432
2020-11-13 14:35:51 -08:00
Heejin Ahn 902ea588ea [WebAssembly] Rename atomic.notify and *.atomic.wait
- atomic.notify -> memory.atomic.notify
- i32.atomic.wait -> memory.atomic.wait32
- i64.atomic.wait -> memory.atomic.wait64

See https://github.com/WebAssembly/threads/pull/149.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D91447
2020-11-13 12:04:48 -08:00
Wouter van Oortmerssen 16f02431dc [WebAssembly] Added R_WASM_FUNCTION_OFFSET_I64 for use with DWARF DW_AT_low_pc
Needed for wasm64, see discussion in https://reviews.llvm.org/D91203

Differential Revision: https://reviews.llvm.org/D91395
2020-11-13 09:32:31 -08:00
Sam Clegg a28a466210 [WebAssembly] Add new relocation type for TLS data symbols
These relocations represent offsets from the __tls_base symbol.

Previously we were just using normal MEMORY_ADDR relocations and relying
on the linker to select a segment-offset rather and absolute value in
Symbol::getVirtualAddress().  Using an explicit relocation type allows
allow us to clearly distinguish absolute from relative relocations based
on the relocation information alone.

One place this is useful is being able to reject absolute relocation in
the PIC case, but still accept TLS relocations.

Differential Revision: https://reviews.llvm.org/D91276
2020-11-13 07:59:29 -08:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Julien Jorge 0fca651711 [WebAssembly] Don't fold frame offset for global addresses
When machine instructions are in the form of
```
%0 = CONST_I32 @str
%1 = ADD_I32 %stack.0, %0
%2 = LOAD 0, 0, %1
```

In the `ADD_I32` instruction, it is possible to fold it if `%0` is a
`CONST_I32` from an immediate number. But in this case it is a global
address, so we shouldn't do that. But we haven't checked if the operand
of `ADD` is an immediate so far. This fixes the problem. (The case
applies the same for `ADD_I64` and `CONST_I64` instructions.)

Fixes https://bugs.llvm.org/show_bug.cgi?id=47944.

Patch by Julien Jorge (jjorge@quarkslab.com)

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D90577
2020-11-03 14:56:25 -08:00
Jordan Rupprecht 980bf1d5d1 [NFC] Inline wasm assertion-only variable 2020-11-03 13:06:59 -08:00
Andy Wingo 107c3a12d6 [WebAssembly] Implement ref.null
This patch adds a new "heap type" operand kind to the WebAssembly MC
layer, used by ref.null. Currently the possible values are "extern" and
"func"; when typed function references come, though, this operand may be
a type index.

Note that the "heap type" production is still known as "refedtype" in
the draft proposal; changing its name in the spec is
ongoing (https://github.com/WebAssembly/reference-types/issues/123).

The register form of ref.null is still untested.

Differential Revision: https://reviews.llvm.org/D90608
2020-11-03 10:46:23 -08:00
Thomas Lively a787e09779 [WebAssembly] Prototype i64x2.bitmask
As proposed in https://github.com/WebAssembly/simd/pull/368.

Differential Revision: https://reviews.llvm.org/D90514
2020-10-30 17:23:30 -07:00
Wouter van Oortmerssen 86cd2332ce [WebAssembly] Fixed DWARF DW_AT_low_pc encoded as 64-bit in wasm64
Also added general wasm64 DWARF test
Also added asserts for unsupported reloc combinations that triggered this bug.

Differential Revision: https://reviews.llvm.org/D90503
2020-10-30 16:42:48 -07:00
Thomas Lively 0a512a555a [WebAssembly] Prototype i64x2.eq
As proposed in https://github.com/WebAssembly/simd/pull/381. Since it is still
in the prototyping phase, it is only accessible via a target builtin function
and a target intrinsic.

Depends on D90504.

Differential Revision: https://reviews.llvm.org/D90508
2020-10-30 16:38:15 -07:00
Thomas Lively 1cb0b56607 [WebAssembly] Prototype i64x2.widen_{low,high}_i32x4_{s,u}
As proposed in https://github.com/WebAssembly/simd/pull/290. As usual, these
instructions are available only via builtin functions and intrinsics while they
are in the prototyping stage.

Differential Revision: https://reviews.llvm.org/D90504
2020-10-30 15:44:04 -07:00
Thomas Lively be6f50798e [WebAssembly] Implement SIMD signselect instructions
As proposed in https://github.com/WebAssembly/simd/pull/124, using the opcodes
adopted by V8 in
https://chromium-review.googlesource.com/c/v8/v8/+/2486235/2/src/wasm/wasm-opcodes.h.
Uses new builtin functions and a new target intrinsic exclusively to ensure that
the new instructions are only emitted when a user explicitly opts in to using
them since they are still in the prototyping and evaluation phase.

Differential Revision: https://reviews.llvm.org/D90357
2020-10-29 11:06:20 -07:00
Thomas Lively 31e944556f [WebAssembly] Prototype extending multiplication SIMD instructions
As proposed in https://github.com/WebAssembly/simd/pull/376. This commit
implements new builtin functions and intrinsics for these instructions, but does
not yet add them to wasm_simd128.h because they have not yet been merged to the
proposal. These are the first instructions with opcodes greater than 0xff, so
this commit updates the MC layer and disassembler to handle that correctly.

Differential Revision: https://reviews.llvm.org/D90253
2020-10-28 09:38:59 -07:00
Paulo Matos 69e2797eae [WebAssembly] Implementation of (most) table instructions
Implementation of instructions table.get, table.set, table.grow,
table.size, table.fill, table.copy.

Missing instructions are table.init and elem.drop as they deal with
element sections which are not yet implemented.

Added more tests to tables.s

Differential Revision: https://reviews.llvm.org/D89797
2020-10-23 08:42:54 -07:00
Thomas Lively 1992e30c2d [WebAssembly] Prototype i8x16.popcnt
As proposed at https://github.com/WebAssembly/simd/pull/379. Use a target
builtin and intrinsic rather than normal codegen patterns to make the
instruction opt-in until it is merged to the proposal and stabilized in engines.

Differential Revision: https://reviews.llvm.org/D89446
2020-10-15 21:18:22 +00:00
Thomas Lively 3f738d1f5e Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions"
This reverts commit 7c8385a352 with a typing fix
to an instruction selection pattern.
2020-10-15 19:32:34 +00:00
Thomas Lively 7c8385a352 Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions"
This reverts commit 7c6bfd90ab.
2020-10-15 15:49:36 +00:00
Thomas Lively 7c6bfd90ab [WebAssembly] v128.load{8,16,32,64}_lane instructions
Prototype the newly proposed load_lane instructions, as specified in
https://github.com/WebAssembly/simd/pull/350. Since these instructions are not
available to origin trial users on Chrome stable, make them opt-in by only
selecting them from intrinsics rather than normal ISel patterns. Since we only
need rough prototypes to measure performance right now, this commit does not
implement all the load and store patterns that would be necessary to make full
use of the offset immediate. However, the full suite of offset tests is included
to make it easy to track improvements in the future.

Since these are the first instructions to have a memarg immediate as well as an
additional immediate, the disassembler needed some additional hacks to be able
to parse them correctly. Making that code more principled is left as future
work.

Differential Revision: https://reviews.llvm.org/D89366
2020-10-15 15:33:10 +00:00
Paulo Matos 388fb67b0d [WebAssembly] Added .tabletype to asm and multiple table support in obj files
Adds more testing in basic-assembly.s and a new test tables.s.
Adds support to yaml reading and writing of tables as well.

Differential Revision: https://reviews.llvm.org/D88815
2020-10-13 07:52:23 -07:00
Thomas Lively 72c628e835 Reland "[WebAssembly] Emulate v128.const efficiently""
This reverts commit 432e4e56d3, which reverted 542523a61a. Two issues from
the original commit have been fixed. First, MSVC does not like when std::array
is initialized with only single braces, so this commit switches to using the
more portable double braces. Second, there was a subtle endianness bug that
prevented the original commit from working correctly on big-endian machines,
which has been fixed by switching to using endianness-agnostic bit twiddling
instead of type punning.

Differential Revision: https://reviews.llvm.org/D88773
2020-10-13 04:36:59 +00:00
Thomas Lively d8f58bf53a [WebAssembly] Prototype i16x8.q15mulr_sat_s
This saturating, rounding, Q-format multiplication instruction is proposed in
https://github.com/WebAssembly/simd/pull/365.

Differential Revision: https://reviews.llvm.org/D88968
2020-10-09 21:17:53 +00:00
Mircea Trofin 4cfc4025cc [NFC][MC] MCRegister API typing.
Mostly LiveIntervals, with their effects (users).

Differential Revision: https://reviews.llvm.org/D89018
2020-10-08 15:08:34 -07:00
Heejin Ahn 750b3ddd80 [WebAssembly] Handle indirect uses of longjmp
In LowerEmscriptenEHSjLj, `longjmp` used to be replaced with
`emscripten_longjmp_jmpbuf(jmp_buf*, i32)`, which will eventually be
lowered to `emscripten_longjmp(i32, i32)`. The reason we used two
different names was because they had different signatures in the IR
pass.

D88697 fixed this by only using `emscripten_longjmp(i32, i32)` and
adding a `ptrtoint` cast to its first argument, so
```
longjmp(buf, 0)
```
becomes
```
emscripten_longjmp((i32)buf, 0)
```

But this assumed all uses of `longjmp` was a direct call to it, which
was not the case. This patch handles indirect uses of `longjmp` by
replacing
```
longjmp
```
with
```
(i32(*)(jmp_buf*, i32))emscripten_longjmp
```

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D89032
2020-10-08 11:37:19 -07:00
Simon Pilgrim 0328005515 Fix MSVC "not all control paths return a value" warning. NFCI. 2020-10-07 19:53:39 +01:00
Heejin Ahn 3bba91f64e [WebAssembly] Rename Emscripten EH functions
Renaming for some Emscripten EH functions has so far been done in
wasm-emscripten-finalize tool in Binaryen. But recently we decided to
make a compilation/linking path that does not rely on
wasm-emscripten-finalize for modifications, so here we move that
functionality to LLVM.

Invoke wrappers are generated in LowerEmscriptenEHSjLj pass, but final
wasm types are not available in the IR pass, we need to rename them at
the end of the pipeline.

This patch also removes uses of `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass, replacing that with `emscripten_longjmp`.
`emscripten_longjmp_jmpbuf` is lowered to `emscripten_longjmp`, but
previously we generated calls to `emscripten_longjmp_jmpbuf` in
LowerEmscriptenEHSjLj pass because it takes `jmp_buf*` instead of `i32`.
But we were able use `ptrtoint` to make it use `emscripten_longjmp`
directly here.

Addresses:
https://github.com/WebAssembly/binaryen/issues/3043
https://github.com/WebAssembly/binaryen/issues/3081

Companions:
https://github.com/WebAssembly/binaryen/pull/3191
https://github.com/emscripten-core/emscripten/pull/12399

Reviewed By: dschuff, tlively, sbc100

Differential Revision: https://reviews.llvm.org/D88697
2020-10-07 09:42:49 -07:00
Stella Stamenova 432e4e56d3 Revert "[WebAssembly] Emulate v128.const efficiently"
This reverts commit 542523a61a.
2020-10-02 09:26:21 -07:00
Thomas Lively 542523a61a [WebAssembly] Emulate v128.const efficiently
v128.const was recently implemented in V8, but until it rolls into Chrome
stable, we can't enable it in the WebAssembly backend without breaking origin
trial users. So far we have been lowering build_vectors that would otherwise
have been lowered to v128.const to splats followed by sequences of replace_lane
instructions to initialize each lane individually. That produces large and
inefficient code, so this patch introduces new logic to lower integer vector
constants to a single i64x2.splat where possible, with at most a single
i64x2.replace_lane following it if necessary.

Adapted from a patch authored by @omnisip.

Differential Revision: https://reviews.llvm.org/D88591
2020-10-02 00:28:06 -07:00
Thomas Lively 89fe083c19 [WebAssembly] Check features before making SjLj vars thread-local
1c5a3c4d38 updated the variables inserted by Emscripten SjLj lowering to be
thread-local, depending on the CoalesceFeaturesAndStripAtomics pass to downgrade
them to normal globals if the target features did not support TLS. However, this
had the unintended side effect of preventing all non-TLS-supporting objects from
being linked into modules with shared memory, because stripping TLS marks an
object as thread-unsafe. This patch fixes the problem by only making the SjLj
lowering variables thread-local if the target machine supports TLS so that it
never introduces new usage of TLS that will be stripped. Since SjLj lowering
works on Modules instead of Functions, this required that the
WebAssemblyTargetMachine have its feature string updated to reflect the
coalesced features collected from all the functions so that a
WebAssemblySubtarget can be created without using any particular function.

Differential Revision: https://reviews.llvm.org/D88323
2020-09-25 11:45:16 -07:00
Thomas Lively 1c5a3c4d38 [WebAssembly] Make SjLj lowering globals thread-local
Emscripten's longjump and exception mechanism depends on two global variables,
`__THREW__` and `__threwValue`, which are changed to be defined as thread-local
in https://github.com/emscripten-core/emscripten/pull/12056. This patch updates
the corresponding code in the WebAssembly backend to properly declare these
globals as thread-local as well.

Differential Revision: https://reviews.llvm.org/D88262
2020-09-24 14:56:19 -07:00
Mircea Trofin 6e85c3d5c7 [NFC][Regalloc] accessors for 'reg' and 'weight'
Also renamed the fields to follow style guidelines.

Accessors help with readability - weight mutation, in particular,
is easier to follow this way.

Differential Revision: https://reviews.llvm.org/D87725
2020-09-16 08:28:57 -07:00
Craig Topper c193a689b4 [SelectionDAG] Use Align/MaybeAlign in calls to getLoad/getStore/getExtLoad/getTruncStore.
The versions that take 'unsigned' will be removed in the future.

I tried to use getOriginalAlign instead of getAlign in some
places. getAlign factors in the minimum alignment implied by
the offset in the pointer info. Since we're also passing the
pointer info we can use the original alignment.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D87592
2020-09-14 13:54:50 -07:00
Sam Clegg fa2a8acc71 [WebAssembly] Add assembly syntax for mutable globals
This adds and optional ", immutable" to the end of a `.globaltype`
declaration.  I would have prefered to match the `.wat` syntax
where immutable is the default and `mut` is the signifier for
mutable globals.  Sadly changing the default would break backwards
compat with existing assembly in the wild so I think its best
to stick with this approach.

Differential Revision: https://reviews.llvm.org/D87515
2020-09-11 11:11:02 -07:00
Dominic Chen 4252f3009b
[WebAssembly] Set unreachable as canonical to permit disassembly
Currently, using llvm-objdump to disassemble a function containing
unreachable will trigger an assertion while decoding the opcode, since both
unreachable and debug_unreachable have the same encoding. To avoid this, set
unreachable as the canonical decoding.

Differential Revision: https://reviews.llvm.org/D87431
2020-09-10 15:04:16 -04:00
Heejin Ahn d25c17f317 [WebAssembly] Fix fixEndsAtEndOfFunction for try-catch
When the function return type is non-void and `end` instructions are at
the very end of a function, CFGStackify's `fixEndsAtEndOfFunction`
function fixes the corresponding block/loop/try's type to match the
function's return type. This is applied to consecutive `end` markers at
the end of a function. For example, when the function return type is
`i32`,
```
block i32    ;; return type is fixed to i32
  ...
  loop i32   ;; return type is fixed to i32
    ...
  end_loop
end_block
end_function
```

But try-catch is a little different, because it consists of two parts:
a try part and a catch part, and both parts' return type should satisfy
the function's return type. Which means,
```
try i32      ;; return type is fixed to i32
  ...
  block i32  ;; this should be changed i32 too!
    ...
  end_block
catch
  ...
end_try
end_function
```
As you can see in this example, it is not sufficient to only `end`
instructions at the end of a function; in case of `try`, we should
check instructions before `catch`es, in case their corresponding `try`'s
type has been fixed.

This changes `fixEndsAtEndOfFunction`'s algorithm to use a worklist
that contains a reverse iterator, each of which is a starting point for
a new backward `end` instruction search.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47413.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D87207
2020-09-08 09:27:40 -07:00
Thomas Lively caee15a0ed [WebAssembly] Fix incorrect assumption of simple value types
Fixes PR47375, in which an assertion was triggering because
WebAssemblyTargetLowering::isVectorLoadExtDesirable was improperly
assuming the use of simple value types.

Differential Revision: https://reviews.llvm.org/D87110
2020-09-06 15:42:21 -07:00
Simon Pilgrim 83ca548fcb WebAssemblyUtilities.h - reduce unnecessary includes to forward declarations. NFCI. 2020-09-03 17:43:35 +01:00
Craig Topper aab90384a3 [Attributes] Add a method to check if an Attribute has AttrKind None. Use instead of hasAttribute(Attribute::None)
There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None.

So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline.

This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations.

Differential Revision: https://reviews.llvm.org/D86744
2020-08-28 13:23:45 -07:00
Craig Topper c7a0b2684f [X86][MC][Target] Initial backend support a tune CPU to support -mtune
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.

This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.

One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.

I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.

Differential Revision: https://reviews.llvm.org/D85165
2020-08-14 15:31:50 -07:00
Thomas Lively d53d952810 [WebAssembly] Allow inlining functions with different features
Allow inlining only when the Callee has a subset of the Caller's
features. In principle, we should be able to inline regardless of any
features because WebAssembly supports features at module granularity,
not function granularity, but without this restriction it would be
possible for a module to "forget" about features if all the functions
that used them were inlined.

Requested in PR46812.

Differential Revision: https://reviews.llvm.org/D85494
2020-08-13 13:57:43 -07:00
Jordan Rupprecht 1a67522d3e [NFC] Inline variable only used in debug builds 2020-08-11 19:38:01 -07:00
Thomas Lively 2985c02f79 [WebAssembly][AsmParser] Name missing features in error message
Rather than just saying that some feature is missing, report the exact
features to make the error message more useful and actionable.

Differential Revision: https://reviews.llvm.org/D85795
2020-08-11 17:26:14 -07:00
Thomas Lively 1a69f02397 [WebAssembly][NFC] Replace WASM with standard Wasm
The officially specified abbreviation for WebAssembly is Wasm and the
spec explicitly calls out WASM as being an incorrect spelling. This
patch fixes a few comments and error messages to use the
spec-compliant abbreviation.

Differential Revision: https://reviews.llvm.org/D85764
2020-08-11 12:27:59 -07:00
Wouter van Oortmerssen 582fd474dd [WebAssembly] wasm64: fix memory.init operand types
I had assumed they would all become in i64, but this is not necessary as long as data segments stay 32-bit, see:
https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md

Differential Revision: https://reviews.llvm.org/D85552
2020-08-10 10:15:20 -07:00
Thomas Lively cc612c2908 [WebAssembly] Fix FastISel address calculation bug
Fixes PR47040, in which an assertion was improperly triggered during
FastISel's address computation. The issue was that an `Address` set to
be relative to the FrameIndex with offset zero was incorrectly
considered to have an unset base. When the left hand side of an add
set the Address to be 0 off the FrameIndex, the right side would not
detect that the Address base had already been set and could try to set
the Address to be relative to a register instead, triggering an
assertion.

This patch fixes the issue by explicitly tracking whether an `Address`
has been set rather than interpreting an offset of zero to mean the
`Address` has not been set.

Differential Revision: https://reviews.llvm.org/D85581
2020-08-08 15:23:11 -07:00
Thomas Lively cb32792210 [WebAssembly] Implement prototype v128.load{32,64}_zero instructions
Specified in https://github.com/WebAssembly/simd/pull/237, these
instructions load the first vector lane from memory and zero the other
lanes. Since these instructions are not officially part of the SIMD
proposal, they are only available on an opt-in basis via LLVM
intrinsics and clang builtin functions. If these instructions are
merged to the proposal, this implementation will change so that the
instructions will be generated from normal IR. At that point the
intrinsics and builtin functions would be removed.

This PR also changes the opcodes for the experimental f32x4.qfm{a,s}
instructions because their opcodes conflicted with those of the
v128.load{32,64}_zero instructions. The new opcodes were chosen to
match those used in V8.

Differential Revision: https://reviews.llvm.org/D84820
2020-08-03 13:54:00 -07:00
Fangrui Song 40da58a04b [MC] Default MCAsmBackend::mayNeedRelaxation() to false 2020-08-02 22:13:59 -07:00
Wouter van Oortmerssen ce1eb7af9d [WebAssembly] Fixed 64-bit indices in br_table
LLVM selection dag assumes "switch" indices are pointer sized, which causes problems for our 32-bit br_table. The new function ensures 32-bit operands don't get unnecessarily extended, and 64-bit operands get truncated.

Note that the changes to the existing test test exactly that: the addition of -NEXT in 2 places ensures no extension is inserted (which the test previously ignored) and that the wrap is present (previously omitted in wasm64 mode).

Differential Revision: https://reviews.llvm.org/D84705
2020-07-30 10:52:16 -07:00
Craig Topper 3632f765dc [WebAssembly] Fix GCC 5 build.
Hans' speculative fix in b7292f2db0
didn't work for me. This seems to.
2020-07-30 10:00:28 -07:00
Hans Wennborg b7292f2db0 Speculative GCC 5 build fix
It's complaining about specializing the template in a different namespace.
2020-07-30 16:12:52 +02:00
Heejin Ahn 276f9e8cfa [WebAssembly] Fix getBottom for loops
When it was first created, CFGSort only made sure BBs in each
`MachineLoop` are sorted together. After we added exception support,
CFGSort now also sorts BBs in each `WebAssemblyException`, which
represents a `catch` block, together, and
`Region` class was introduced to be a thin wrapper for both
`MachineLoop` and `WebAssemblyException`.

But how we compute those loops and exceptions is different.
`MachineLoopInfo` is constructed using the standard loop computation
algorithm in LLVM; the definition of loop is "a set of BBs that are
dominated by a loop header and have a path back to the loop header". So
even if some BBs are semantically contained by a loop in the original
program, or in other words dominated by a loop header, if they don't
have a path back to the loop header, they are not considered a part of
the loop. For example, if a BB is dominated by a loop header but
contains `call abort()` or `rethrow`, it wouldn't have a path back to
the header, so it is not included in the loop.

But `WebAssemblyException` is wasm-specific data structure, and its
algorithm is simple: a `WebAssemblyException` consists of an EH pad and
all BBs dominated by the EH pad. So this scenario is possible: (This is
also the situation in the newly added test in cfg-stackify-eh.ll)

```
Loop L: header, A, ehpad, latch
Exception E: ehpad, latch, B
```
(B contains `abort()`, so it does not have a path back to the loop
header, so it is not included in L.)

And it is sorted in this order:
```
header
A
ehpad
latch
B
```

And when CFGStackify places `end_loop` or `end_try` markers, it
previously used `WebAssembly::getBottom()`, which returns the latest BB
in the sorted order, and placed the marker there. So in this case the
marker placements will be like this:
```
loop
  header
  try
    A
  catch
    ehpad
    latch
end_loop         <-- misplaced!
    B
  end_try
```
in which nesting between the loop and the exception is not correct.
`end_loop` marker has to be placed after `B`, and also after `end_try`.

Maybe the fundamental way to solve this problem is to come up with our
own algorithm for computing loop region too, in which we include all BBs
dominated by a loop header in a loop. But this takes a lot more effort.
The only thing we need to fix is actually, `getBottom()`. If we make it
return the right BB, which means in case of a loop, the latest BB of the
loop itself and all exceptions contained in there, we are good.

This renames `Region` and `RegionInfo` to `SortRegion` and
`SortRegionInfo` and extracts them into their own file. And add
`getBottom` to `SortRegionInfo` class, from which it can access
`WebAssemblyExceptionInfo`, so that it can compute a correct bottom
block for loops.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D84724
2020-07-29 10:36:32 -07:00
Thomas Lively 11bb7eef41 [WebAssembly] Remove intrinsics for SIMD widening ops
Instead, pattern match extends of extract_subvectors to generate
widening operations. Since extract_subvector is not a legal node, this
is implemented via a custom combine that recognizes extract_subvector
nodes before they are legalized. The combine produces custom ISD nodes
that are later pattern matched directly, just like the intrinsic was.

Also removes the clang builtins for these operations since the
instructions can now be generated from portable code sequences.

Differential Revision: https://reviews.llvm.org/D84556
2020-07-28 18:25:55 -07:00
Thomas Lively ffd8c23ccb [WebAssembly] Implement truncating vector stores
Rather than expanding truncating stores so that vectors are stored one
lane at a time, lower them to a sequence of instructions using
narrowing operations instead, when possible. Since the narrowing
operations have saturating semantics, but truncating stores require
truncation, mask the stored value to manually truncate it before
narrowing. Also, since narrowing is a binary operation, pass in the
original vector as the unused second argument.

Differential Revision: https://reviews.llvm.org/D84377
2020-07-28 17:46:45 -07:00
Wouter van Oortmerssen cc1b9b680f [WebAssembly] 64-bit (function) pointer fixes.
Accounting for the fact that Wasm function indices are 32-bit, but in wasm64 we want uniform 64-bit pointers.
Includes reloc types for 64-bit table indices.

Differential Revision: https://reviews.llvm.org/D83729
2020-07-16 14:10:22 -07:00
Thomas Lively ecb2e5bcd7 [WebAssembly] Implement v128.select
Although the SIMD spec proposal does not specifically include a
select instruction, the select instruction in MVP WebAssembly is
polymorphic over the selected types, so it is able to work on v128
values when they are enabled. This patch introduces a new variant of
the select instruction for each legal vector type. Additional ISel
patterns are adapted from the SELECT_I32 and SELECT_I64 patterns.

Depends on D83736.

Differential Revision: https://reviews.llvm.org/D83737
2020-07-16 11:37:25 -07:00
Thomas Lively f0f9787646 [WebAssembly] Lower vselect to v128.bitselect
We were previously expanding vselect and matching on the expansion to
generate bitselects, but in some cases the expansion would be further
combined and a bitselect would not get generated. This patch improves
codegen in those cases by legalizing vselect and lowering it to
v128.bitselect. The old pattern that matches the expansion is still
useful for lowering IR that already uses the expansion rather than a
select operation.

Differential Revision: https://reviews.llvm.org/D83734
2020-07-16 11:11:19 -07:00
Thomas Lively b59c6fcaf3 [WebAssembly] Prefer v128.const for constant splats
In BUILD_VECTOR lowering, we used to generally prefer using splats
over v128.const instructions because v128.const has a very large
encoding. However, in d5b7a4e2e8 we switched to preferring consts
because they are expected to be more efficient in engines. This patch
updates the ISel patterns to match this current preference.

Differential Revision: https://reviews.llvm.org/D83581
2020-07-10 18:27:52 -07:00
Thomas Lively 043eaa9a4a [WebAssembly][NFC] Simplify vector shift lowering and add tests
This patch builds on 0d7286a652 by simplifying the code for detecting
splat values and adding new tests demonstrating the lowering of
splatted absolute value shift amounts, which are common in code
generated by Halide. The lowering is very bad right now, but
subsequent patches will improve it considerably. The tests will be
useful for evaluating the improvements in those patches.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D83493
2020-07-10 00:18:59 -07:00
Thomas Lively 0d7286a652 [WebAssembly] Avoid scalarizing vector shifts in more cases
Since WebAssembly's vector shift instructions take a scalar shift
amount rather than a vector shift amount, we have to check in ISel
that the vector shift amount is a splat. Previously, we were checking
explicitly for splat BUILD_VECTOR nodes, but this change uses the
standard utilities for detecting splat values that can handle more
complex splat patterns. Since the C++ ISel lowering is now more
general than the ISel patterns, this change also simplifies shift
lowering by using the C++ lowering for all SIMD shifts rather than
mixing C++ and normal pattern-based lowering.

This change improves ISel for shifts to the point that the
simd-shift-unroll.ll regression test no longer tests the code path it
was originally meant to test. The bug corresponding to that regression
test is no longer reproducible with its original reported reproducer,
so rather than try to fix the regression test, this change just
removes it.

Differential Revision: https://reviews.llvm.org/D83278
2020-07-07 10:45:26 -07:00
Wouter van Oortmerssen 16d83c395a [WebAssembly] Added 64-bit memory.grow/size/copy/fill
This covers both the existing memory functions as well as the new bulk memory proposal.
Added new test files since changes where also required in the inputs.

Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit.

Differential Revision: https://reviews.llvm.org/D82821
2020-07-06 12:49:50 -07:00
Alexander Belyaev 2247f7218a [llvm] Cast to (void) the unused variable. 2020-07-05 12:33:58 +02:00
Thomas Lively 65330f394b [WebAssembly] Do not assume br_table range checks will be gt_u
OSS-Fuzz and the Emscripten test suite uncovered some edge cases in
which the range check instruction seemed to be an (i32.const 0) or
other unexpected instruction, triggering an assertion. Unfortunately
the reproducers are rather complicated, so they don't make good unit
tests. This commit removes the bad assertion and conservatively
optimizes range checks only when the range check instruction is
i32.gt_u.

Differential Revision: https://reviews.llvm.org/D83169
2020-07-04 18:11:24 -07:00
Thomas Lively 8df30d988e [WebAssembly] Do not omit range checks for i64 switches
Summary:
Since the br_table instruction takes an i32, switches over i64s (and
larger integers) must use the i32.wrap_i64 instruction to truncate the
table index. This truncation makes numbers just over 2^32
indistinguishable from small numbers, so it was a miscompilation to
omit the range check preceding these br_tables. This change fixes the
problem by skipping the "fixing" of the br_table when the range check
is an i64 instruction.

Fixes PR46447.

Reviewers: aheejin, dschuff, kripken

Reviewed By: kripken

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83017
2020-07-03 17:15:39 -07:00
Guillaume Chatelet c1cd61e02a [Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemcpy to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82849
2020-06-30 13:12:31 +00:00
Guillaume Chatelet 306d7c6929 [Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemmove to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82850
2020-06-30 12:46:59 +00:00
Guillaume Chatelet 6a6af30d43 [Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemset to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82851
2020-06-30 12:46:26 +00:00
Wouter van Oortmerssen b9a539c010 [WebAssembly] Adding 64-bit versions of __stack_pointer and other globals
We have 6 globals, all of which except for __table_base are 64-bit under wasm64.

Differential Revision: https://reviews.llvm.org/D82130
2020-06-25 15:52:44 -07:00
Guillaume Chatelet 2e7bba693e [Alignment][NFC] Use Align for TargetCallingConv::OrigAlign
This patch replaces D69249.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82307
2020-06-25 13:21:22 +00:00
Matt Arsenault c5d240093b WebAssembly: Don't store MachineFunction in MachineFunctionInfo
Soon it will be disallowed to depend on MachineFunction state in the
constructor. This was only being used to get the MachineRegisterInfo
for an assert, which I'm not sure is necessarily worth it. I would
think any missing defs would be caught by the verifier later instead.
2020-06-24 10:52:58 -04:00
Sam Clegg 79aad89d8d [WebAssembly] Add support for externalref to MC and wasm-ld
This allows code for handling externref values to be processed by the
assembler and linker.

Differential Revision: https://reviews.llvm.org/D81977
2020-06-22 15:57:24 -07:00
Christopher Tetreault 5e2c736395 [SVE] Remove calls to VectorType::getNumElements from WebASM
Summary:
The getNumElements in base VectorType is being deprecated.

See: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html

Reviewers: efriedma, tlively, fpetrogalli, c-rhodes, dschuff

Reviewed By: tlively, dschuff

Subscribers: dschuff, sbc100, tschuett, jgravelle-google, hiraditya, aheejin, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82217
2020-06-22 12:25:08 -07:00
stozer 539381da26 [DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions
Following on from this RFC[0] from a while back, this is the first patch towards
implementing variadic debug values.

This patch specifically adds a set of functions to MachineInstr for performing
operations specific to debug values, and replacing uses of the more general
functions where appropriate. The most prevalent of these is replacing
getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the
operands corresponding to values will no longer be at index 0, but index 2 and
upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been
added for the other operands, along with some helper functions to replace
oft-repeated code and operate on a variable number of value operands.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html<Paste>

Differential Revision: https://reviews.llvm.org/D81852
2020-06-22 16:01:12 +01:00
Eric Christopher cf23852587 [Target] As part of using inclusive language within the llvm project,
migrate away from the use of blacklist and whitelist.

This change affects an internal llvm command line option.
2020-06-20 00:06:39 -07:00
Heejin Ahn 83c26eae23 [WebAssembly] Remove TEEs when dests are unstackified
When created in RegStackify pass, `TEE` has two destinations, where
op0 is stackified and op1 is not. But it is possible that
op0 becomes unstackified in `fixUnwindMismatches` function in
CFGStackify pass when a nested try-catch-end is introduced, violating
the invariant of `TEE`s destinations.

In this case we convert the `TEE` into two `COPY`s, which will
eventually be resolved in ExplicitLocals.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D81851
2020-06-19 14:55:21 -07:00
Ronak Chauhan 5bd33de9c8 [MC] Pass the symbol rather than its name to onSymbolStart()
Summary: This allows targets to also consider the symbol's type and/or address if needed.

Reviewers: scott.linder, jhenderson, MaskRay, aardappel

Reviewed By: scott.linder, MaskRay

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82090
2020-06-19 09:30:12 +05:30
Thomas Lively 49754dcf22 [WebAssembly] Fix bug in FixBrTables and use branch analysis utils
Summary:
This commit fixes a bug in the FixBrTables pass in which an
unconditional branch from the switch header block to the jump table
block was not removed before the blocks were combined. The result was
an invalid CFG in the MachineFunction. This commit also switches from
using bespoke branch analysis and deletion code to using the standard
utilities for the same.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81909
2020-06-17 12:34:45 -07:00
Wouter van Oortmerssen 3b29376e3f [WebAssembly] Adding 64-bit version of R_WASM_MEMORY_ADDR_* relocs
This adds 4 new reloc types.

A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) have been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.

A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are indentical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307

Differential Revision: https://reviews.llvm.org/D81704
2020-06-15 10:07:42 -07:00
Wouter van Oortmerssen d9e0bbd17b [WebAssembly] Adding 64-bit versions of all load & store ops.
Context: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md
This is just a first step, adding the new instruction variants while keeping the existing 32-bit functionality working.
Some of the basic load/store tests have new wasm64 versions that show that the basics of the target are working.
Further features need implementation, but these will be added in followups to keep things reviewable.

Differential Revision: https://reviews.llvm.org/D80769
2020-06-15 08:31:56 -07:00
Ronak Chauhan 480a16d5c8 [MC] Changes to help improve target specific symbol disassembly
Summary:
This commit slightly modifies the MCDisassembler, and llvm-objdump to
allow targets to also decode entire symbols.

WebAssembly uses the onSymbolStart hook it to decode preludes.
WebAssembly partially disassembles the symbol in its target specific
way; and then falls back to the normal flow of llvm-objdump.

AMDGPU needs it to decode kernel descriptors entirely, and move to the
next symbol.

This commit is to split the above task into 2.
- Changes to llvm-objdump and MC-layer without breaking WebAssembly code
  [ this commit ]
- AMDGPU's implementation of onSymbolStart that decodes kernel
  descriptors. [ https://reviews.llvm.org/D80713 ]

Reviewers: scott.linder, t-tye, sunfish, arsenm, jhenderson, MaskRay, aardappel

Reviewed By: scott.linder, jhenderson, aardappel

Subscribers: bcain, dschuff, wdng, tpr, sbc100, jgravelle-google, hiraditya, aheejin, MaskRay, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80512
2020-06-12 15:51:37 -04:00
Thomas Lively c5d012341e [WebAssembly] Make BR_TABLE non-duplicable
Summary:
After their range checks were removed in 7f50c15be5, br_tables
started being duplicated into their predecessors by tail
folding. Unfortunately, when the br_tables were in loops this
transformation introduced bad irreducible control flow which was later
expanded into even more br_tables. This commit abuses the
`isNotDuplicable` property to prevent this irreducible control flow
from being introduced. This change saves a few dozen bytes of code
size and has a negligible affect on performance for most of the large
Emscripten benchmarks, but can improve performance significantly on
microbenchmarks of switches in loops.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81628
2020-06-11 15:11:45 -07:00
Heejin Ahn a5099ad918 [WebAssembly] Fix a warning for an unused variable
`ErasedUncondBr` is used only in an `assert`, so it triggers a warning
on builds without assertions. Fixed.
2020-06-10 10:06:28 -07:00
Thomas Lively b7d369280b [WebAssembly] Implement prototype SIMD rounding instructions
Summary:
As specified in https://github.com/WebAssembly/simd/pull/232. These
instructions are implemented as LLVM intrinsics for now rather than
normal ISel patterns to make these instructions opt-in. Once the
instructions are merged to the spec proposal, the intrinsics will be
replaced with proper ISel patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D81222
2020-06-09 10:14:14 -07:00
Guillaume Chatelet 1778564f91 [Alignment][NFC] Migrate the rest of backends
Summary: This is a followup on D81196

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81278
2020-06-08 07:17:20 +00:00
James Y Knight 1978309db1 MachineBasicBlock::updateTerminator now requires an explicit layout successor.
Previously, it tried to infer the correct destination block from the
successor list, but this is a rather tricky propspect, given the
existence of successors that occur mid-block, such as invoke, and
potentially in the future, callbr/INLINEASM_BR. (INLINEASM_BR, in
particular would be problematic, because its successor blocks are not
distinct from "normal" successors, as EHPads are.)

Instead, require the caller to pass in the expected fallthrough
successor explicitly. In most callers, the correct block is
immediately clear. But, in MachineBlockPlacement, we do need to record
the original ordering, before starting to reorder blocks.

Unfortunately, the goal of decoupling the behavior of end-of-block
jumps from the successor list has not been fully accomplished in this
patch, as there is currently no other way to determine whether a block
is intended to fall-through, or end as unreachable. Further work is
needed there.

Differential Revision: https://reviews.llvm.org/D79605
2020-06-06 22:30:51 -04:00
Thomas Lively a07c08f74f [WebAssembly] Lower llvm.debugtrap properly
Summary:
Unlike normal traps, debug traps are allowed to return and can have
additional instructions in the same basic block. Without explicit
backend support for debug traps, they are lowered in ISel as normal
traps. Since normal traps are lowered in the WebAssembly backend to
the UNREACHABLE instruction, which is a terminator, using debug traps
could lead to invalid MBBs when there are additional instructions
after the trap. This patch fixes the issue by lowering debug traps to
a new version of the UNREACHABLE instruction, DEBUG_UNREACHABLE, that
is not a terminator.

An alternative approach would have been to make UNREACHABLE not a
terminator, but that breaks a large number of tests. In particular, it
would require removing the traps inserted after noreturn calls to
@llvm.wasm.throw because otherwise the terminator throw would be
followed by a non-terminator UNREACHABLE and we would be back to
having invalid MBBs. Overall the approach in this patch seems simpler.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81055
2020-06-04 13:25:10 -07:00
Mikael Holmen 2f671c4225 [WebAssembly] Fix gcc warning [NFC]
gcc 7.4 complained with
../lib/Target/WebAssembly/WebAssemblyFixBrTableDefaults.cpp:125:23: warning: extra ';' [-Wpedantic]
                 false);
                       ^
2020-06-04 10:03:13 +02:00
Thomas Lively 25af2126f9 [WebAssembly] Fix ISel crash in SIGN_EXTEND_INREG lowering
Summary:
The code previously assumed that the index of a vector extract was
constant, but this was not always true. This patch fixes the problem
by bailing out of the lowering if the index is nonconstant and also
replaces `static_cast`s in the lowering function with `cast`s because
the latter contain type-checking asserts that would make similar
issues easier to find and debug.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81025
2020-06-03 15:36:44 -07:00
Thomas Lively 7f50c15be5 Reland "[WebAssembly] Eliminate range checks on br_tables"
This reverts commit 755a895915.
Although I was not able to reproduce any test failures locally,
aheejin was able to reproduce them and found a fix, applied here.
2020-06-03 14:04:59 -07:00
Thomas Lively 755a895915 Revert "[WebAssembly] Eliminate range checks on br_tables"
This reverts commit f99d5f8c32.
The change was causing UBSan and other failures on some bots.
2020-06-03 01:26:53 -07:00
Thomas Lively f99d5f8c32 [WebAssembly] Eliminate range checks on br_tables
Summary:
Jump tables for most targets cannot handle out of range indices by
themselves, so LLVM emits range checks to guard the jump
tables. WebAssembly, on the other hand, implements jump tables using
the br_table instruction, which takes a default branch target as an
operand, making the range checks redundant. This patch introduces a
new MachineFunction pass in the WebAssembly backend to find and
eliminate the redundant range checks.

Reviewers: aheejin, dschuff

Subscribers: mgorny, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80863
2020-06-02 13:14:27 -07:00
Sam Clegg 26c78e3095 [WebAssembly] Update test expectations
simd-2.C now compiles thanks to:
  https://github.com/WebAssembly/wasi-libc/pull/183

Differential Revision: https://reviews.llvm.org/D80930
2020-06-01 08:35:27 -07:00
Heejin Ahn 4cd3f4b31b [WebAssembly] Fix a bug in finding matching EH pad
Summary:
`getMatchingEHPad()` in LateEHPrepare is a function to find the nearest
EH pad that dominates the given instruction. This intends to be
lightweight so it does not use full WebAssemblyException scope analysis
or dominator analysis. It simply does backward BFS to its predecessors
and stops at the first EH pad each search path encounters. All search
should end up at the same EH pad, and if not, it returns null.

But it didn't take into account that when there are inner scopes within
the current scope, some path in BFS can hit an inner EH pad first. For
example, in the given diagram, `Inst` belongs to the outer scope and
`getMathingEHPad()` should return 'EHPad 1', but some search path can go
into the inner scope and end up with 'EHPad 2'. The search will return
null because different paths end up with different EH pads.
```
--- EHPad 1 ---
| - EHPad 2 - |
| |         | |
| ----------- |
|   Inst      |
---------------
```

So far this was OK because we haven't tested a case in which a given
instruction is far from its EH pad. Also, this bug does not happen when
the inner EH scope is a cleanup scope, because a cleanup scope ends with
a `cleanupret` whose successor is an EH pad, so the search encounters
that EH pad first before going into the child scope. But this can happen
when the child scope is a catch scope that ends with `catchret`. So this
patch, when doing backward BFS, does not search predecessors that ends
with `catchret`. Because `catchret`s are replaced with `br`s during this
pass, this records BBs that have `catchret`s in the beginning, before
doing any other transformations.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80571
2020-05-28 19:46:11 -07:00
Heejin Ahn 3fe6ea4641 [WebAssembly] Fix a bug in removing unnecessary branches
Summary:
One of the things `removeUnnecessaryInstrs()` in CFGStackify does is to
remove an unnecessary unconditinal branch before an EH pad. When there
is an unconditional branch right before a catch instruction and it
branches to the end of `end_try` marker, we don't need the branch,
because it there is no exception, the control flow transfers to
that point anyway.
```
bb0:
  try
    ...
    br bb2      <- Not necessary
bb1:
  catch
    ...
bb2:
  end
```

This applies when we have a conditional branch followed by an
unconditional one, in which case we should only remove the unconditional
branch. For example:
```
bb0:
  try
    ...
    br_if someplace_else
    br bb2                 <- Not necessary
bb1:
  catch
    ...
bb2:
  end
```

But `TargetInstrInfo::removeBranch` we used removed all existing
branches when there are multiple ones. This patch fixes it by only
deleting the last (= unconditional) branch manually.

Also fixes some `preds` comments in the test file.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80572
2020-05-28 19:44:27 -07:00
Simon Pilgrim fe0006c882 TargetLowering.h - remove unnecessary TargetMachine.h include. NFC
Replace with forward declaration and move dependency down to source files that actually need it.

Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.
2020-05-23 19:49:38 +01:00
Marek Kurdej 9301e3aaca [Target] Fix typos. NFC 2020-05-22 14:40:43 +02:00
Thomas Lively 8a43d41a40 [WebAssembly] Fix bug in custom shuffle combine
Summary:
The code previously assumed the source of the bitcast in the combined
pattern was a vector type, but this is not always true. This patch
adds a check to avoid an assertion failure in that case.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80164
2020-05-19 12:54:15 -07:00
Thomas Lively 3181273be7 [WebAssembly] Implement i64x2.mul and remove i8x16.mul
Summary:
This reflects changes in the spec proposal made since basic arithmetic
was first implemented.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D80174
2020-05-19 12:50:44 -07:00
Wouter van Oortmerssen 10e2e7de0c [WebAssembly] iterate stack in DebugFixup from the top.
Differential Revision: https://reviews.llvm.org/D80045
2020-05-18 08:33:36 -07:00
Thomas Lively 40af48101b [WebAssembly] Optimize splats of bitcasted vectors
Summary:
This new custom DAG combine fixes a codegen issue with the
wasm_simd128.h intrinsics. Clang lowers the

  return (v128_t)(__f32x4){__a, __a, __a, __a};

body of f32x4_splat to a splat shuffle of a bitcasted vector, as seen
in the new simd-shuffle-bitcast.ll test. The bitcast interfered with
the target-independent DAG combine that combines splat shuffles into
BUILD_VECTOR nodes, so this patch introduces a new custom DAG combine
to hoist the bitcast out of the shuffle, allowing the
target-independent combine to work as intended.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80021
2020-05-15 12:12:20 -07:00
Thomas Lively c702d4bf41 [WebAssembly] Update latest implemented SIMD instructions
Summary:
Move instructions that have recently been implemented in V8 from the
`unimplemented-simd128` target feature to the `simd128` target
feature. The updated instructions match the update at
https://github.com/WebAssembly/simd/pull/223.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79973
2020-05-15 10:53:02 -07:00
Wouter van Oortmerssen 62efd1eca2 [WebAssembly] Fixed debugloc in DebugFixup pass
BuildMI requires this debug loc to be from the same sub program as the variable metadata passed in.

Differential Revision:  https://reviews.llvm.org/D80019
2020-05-15 10:50:14 -07:00
Wouter van Oortmerssen 2b7fe0863a [WebAssembly] Added Debug Fixup pass
This pass changes debug_value instructions referring to stackified registers into TI_OPERAND_STACK with correct stack depth.
2020-05-14 13:14:45 -07:00
Thomas Lively 3d49d1cfa7 [WebAssembly] Implement pseudo-min/max SIMD instructions
Summary:
As proposed in https://github.com/WebAssembly/simd/pull/122. Since
these instructions are not yet merged to the SIMD spec proposal, this
patch makes them entirely opt-in by surfacing them only through LLVM
intrinsics and clang builtins. If these instructions are made
official, these intrinsics and builtins should be replaced with simple
instruction patterns.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D79742
2020-05-12 09:39:01 -07:00
Thomas Lively 8e3e56f2a3 [WebAssembly] Add wasm-specific vector shuffle builtin and intrinsic
Summary:

Although using `__builtin_shufflevector` and the `shufflevector`
instruction works fine, they are not opaque to the optimizer. As a
result, DAGCombine can potentially reduce the number of shuffles and
change the shuffle masks. This is unexpected behavior for users of the
WebAssembly SIMD intrinsics who have crafted their shuffles to
optimize the code generated by engines. This patch solves the problem
by adding a new shuffle intrinsic that is opaque to the optimizers in
line with the decision of the WebAssembly SIMD contributors at
https://github.com/WebAssembly/simd/issues/196#issuecomment-622494748. In
the future we may implement custom DAG combines to properly optimize
shuffles and replace this solution.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D66983
2020-05-11 10:01:55 -07:00
Thomas Lively a1ae9566ea [WebAssembly] Disallow 'shared-mem' rather than 'atomics'
Summary:
The WebAssembly backend automatically lowers atomic operations and TLS
to nonatomic operations and non-TLS data when either are present and
the atomics or bulk-memory features are not present, respectively. The
resulting object is no longer thread-safe, so the linker has to be
told not to allow it to be linked into a module with shared
memory. This was previously done by disallowing the 'atomics' feature,
which prevented any objct with its atomic operations or TLS removed
from being linked with any object containing atomics or TLS, and
therefore preventing it from being linked into a module with shared
memory since shared memory requires atomics.

However, as of https://github.com/WebAssembly/threads/issues/144, the
validation rules are relaxed to allow atomic operations to validate
with unshared memories, which makes it perfectly safe to link an
object with stripped atomics and TLS with another object that still
contains TLS and atomics as long as the resulting module has an
unshared memory. To allow this kind of link, this patch disallows a
pseudo-feature 'shared-mem' rather than 'atomics' to communicate to
the linker that the object is not thread-safe. This means that the
'atomics' feature is available to accurately reflect whether or not an
object has atomics enabled.

As a drive-by tweak, this change also requires that bulk-memory be
enabled in addition to atomics in order to use shared memory. This is
because initializing shared memories requires bulk-memory operations.

Reviewers: aheejin, sbc100

Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79542
2020-05-08 13:52:39 -07:00
Sam Parker 40574fefe9 [NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.

RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html

Differential Revision: https://reviews.llvm.org/D79002
2020-05-05 10:35:54 +01:00
Heejin Ahn 834debfffd [WebAssembly] Fix block marker placing after fixUnwindMismatches
Summary:
This fixes a few things that are connected. It is very hard to provide
an independent test case for each of those fixes, because they are
interconnected and sometimes one masks another. The provided test case
triggers some of those bugs below but not all.

---

1. Background:
`placeBlockMarker` takes a BB, and if the BB is a destination of some
branch, it places `end_block` marker there, and computes the nearest
common dominator of all predecessors (what we call 'header') and places
a `block` marker there.

When we first place markers, we traverse BBs from top to bottom. For
example, when there are 5 BBs A, B, C, D, and E and B, D, and E are
branch destinations, if mark the BB given to `placeBlockMarker` with `*`
and draw a rectangle representing the border of `block` and `end_block`
markers, the process is going to look like
```
                       -------
           -----       |-----|
 ---       |---|       ||---||
 |A|       ||A||       |||A|||
 ---  -->  |---|  -->  ||---||
 *B        | B |       || B ||
  C        | C |       || C ||
  D        -----       |-----|
  E         *D         |  D  |
             E         -------
                         *E
```
which means when we first place markers, we go from inner to outer
scopes. So when we place a `block` marker, if the header already
contains other `block` or `try` marker, it has to belong to an inner
scope, so the existing `block`/`try` markers should go _after_ the new
marker. This was the assumption we had.

But after placing all markers we run `fixUnwindMismatches` function.
There we do some control flow transformation and create some branches,
and we call `placeBlockMarker` again to place `block`/`end_block`
markers for those newly created branches. We can't assume that we are
traversing branch destination BBs from top to bottom now because we are
basically inserting some new markers in the middle of existing markers.

Fix:
In `placeBlockMarker`, we don't have the assumption that the BB given is
in the order of top to bottom, and when placing `block` markers,
calculates whether existing `block` or `try` markers are inner or
outer scopes with respect to the current scope.

---

2. Background:
In `fixUnwindMismatches`, when there is a call whose correct unwind
destination mismatches the current destination after initially placing
`try` markers, we wrap that with a new nested `try`/`catch`/`end` and
jump to the correct handler within the new `catch`. The correct handler
code is split as a separate BB from its original EH pad so it can be
branched to. Here's an example:

- Before
```
mbb:
  call @foo       <- Unwind destination mismatch!
wrong-ehpad:
  catch
  ...
cont:
  end_try
  ...
correct-ehpad:
  catch
  [handler code]
```

- After
```
mbb:
  try                (new)
  call @foo
nested-ehpad:        (new)
  catch              (new)
  local.set n / drop (new)
  br %handleri       (new)
nested-end:          (new)
  end_try            (new)
wrong-ehpad:
  catch
  ...
cont:
  end_try
  ...
correct-ehpad:
  catch
  local.set n / drop (new)
handler:             (new)
  end_try
  [handler code]
```

Note that after this transformation, it is possible there are no calls
to actually unwind to `correct-ehpad` here. `call @foo` now
branches to `handler`, and there can be no other calls to unwind to
`correct-ehpad`. In this case `correct-ehpad` does not have any
predecessors anymore.

This can cause a bug in `placeBlockMarker`, because we may need to place
`end_block` marker in `handler`, and `placeBlockMarker` computes the
nearest common dominator of all predecessors. If one of `handler`'s
predecessor (here `correct-ehpad`) does not have any predecessors, i.e.,
no way of reaching it, we cannot correctly compute the common dominator
of predecessors of `handler`, and end up placing no `block`/`end`
markers. This bug actually sometimes masks the bug 1.

Fix:
When we have an EH pad that does not have any predecessors after this
transformation, deletes all its successors, so that its successors don't
have any dangling predecessors.

---

3. Background:
Actually the `handler` BB in the example shown in bug 2 doesn't need
`end_block` marker, despite it being a new branch destination, because
it already has `end_try` marker which can serve the same purpose. I just
put that example there for an illustration purpose. There is a case we
actually need to place `end_block` marker: when the branch dest is the
appendix BB. The appendix BB is created when there is a call that is
supposed to unwind to the caller ends up unwinding to a wrong EH pad. In
this case we also wrap the call with a nested `try`/`catch`/`end`,
create an 'appendix' BB at the very end of the function, and branch to
that BB, where we rethrow the exception to the caller.

Fix:
When we don't actually need to place block markers, we don't.

---

4. In case we fall through to the continuation BB after the catch block,
after extracting handler code in `fixUnwindMismatches` (refer to bug 2
for an example), we now have to add a branch to it to bypass the
handler.
- Before
```
try
  ...
  (falls through to 'cont')
catch
  handler body
end
              <-- cont
```

- After
```
try
  ...
  br %cont    (new)
catch
end
handler body
              <-- cont
```

The problem is, we haven't been placing a new `end_block` marker in the
`cont` BB in this case. We should, and this fixes it. But it is hard to
provide a test case that triggers this bug, because the current
compilation pipeline from .ll to .s does not generate this kind of code;
we always have a `br` after `invoke`. But code without `br` is still
valid, and we can have that kind of code if we have some pipeline
changes or optimizations later. Even mir test cases cannot trigger this
part for now, because we don't encode auxiliary EH-related data
structures (such as `WasmEHFuncInfo`) in mir now. Those functionalities
can be added later, but I don't think we should block this fix on that.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79324
2020-05-05 02:06:47 -07:00
Sam McCall d10c995b4d std::isspace -> llvm::isSpace (where locale should be ignored)
I've left out some cases where I wasn't totally sure this was right or
whether the include was ok (compiler-rt) or idiomatic (flang).
2020-05-02 15:36:04 +02:00
Thomas Lively e0f52842c8 [WebAssembly] Renumber SIMD opcodes
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79224
2020-05-01 17:20:49 -07:00
David Blaikie f6d5320ebe WebAssemblyExceptionInfo::Exceptions: Use unique_ptr to simplify memory management 2020-04-28 17:33:46 -07:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Simon Pilgrim 5387899bb4 [WebAssembly] Remove unused forward declarations. NFC. 2020-04-23 16:30:45 +01:00
Kazuaki Ishizaki 0312b9f550 [llvm] NFC: Fix trivial typo in rst and td files
Differential Revision: https://reviews.llvm.org/D77469
2020-04-23 14:26:32 +09:00