This makes Wasm EH work with dynamic linking. So far we were only able
to handle destructors, which do not use any tags or LSDA info.
1. This uses `TargetExternalSymbol` for `GCC_except_tableN` symbols,
which points to the address of per-function LSDA info. It is more
convenient to use than `MCSymbol` because it can take additional
target flags.
2. When lowering `wasm_lsda` intrinsic, if PIC is enabled, make the
symbol relative to `__memory_base` and generate the `add` node. If
PIC is disabled, continue to use the absolute address.
3. Make tag symbols (`__cpp_exception` and `__c_longjmp`) undefined in
the backend, because it is hard to make defined tags work with
dynamic linking's loading order. Instead, we make all tag symbols
undefined in the LLVM backend and import them from JS.
4. Add support for undefined tags to the linker.
Companion patches:
- https://github.com/WebAssembly/binaryen/pull/4223
- https://github.com/emscripten-core/emscripten/pull/15266
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D111388
This removes `WasmTagType`. `WasmTagType` contained an attribute and a
signature index:
```
struct WasmTagType {
  uint8_t Attribute;
  uint32_t SigIndex;
};
```
Currently the attribute field is unused, reserved for future use, and
always 0. Having `SigIndex` as a member of this struct is also a little
odd, because a tag type's signature index is not an inherent property of
the tag but a reference into another section, and it changes after
linking. This made tag handling in the linker awkward: tag-related
methods took both a `WasmTagType` and a `WasmSignature` even though
`WasmTagType` already contains a signature index, because the index
changes during linking and carries no usable information at that point.
This change instead moves `SigIndex` to `struct WasmTag` itself, as we
did for `struct WasmFunction` in D111104.
In this CL, in lib/MC and lib/Object, this now treats tag types in the
same way as function types. Also in YAML, this removes `struct Tag`,
because now it only contains the tag index. Also tags set `SigIndex` in
`WasmImport` union, as functions do.
I think this makes things simpler and makes tag handling more in line
with function handling. The two share similar properties in that both
of them have signatures, but they are nominal, so having the same
signature doesn't mean they are the same element.
Also a drive-by fix: the encoding of the reserved 'attribute' field
changed from uleb32 to uint8 a while ago. This was fixed in lib/MC and
lib/Object but not in YAML. This doesn't change object files because the
field's value is always 0 and its encoding is the same in both
encodings.
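The claim that a zero value encodes identically either way can be checked with a small sketch. The helpers below are hypothetical stand-ins written for illustration, not the LLVM encoders:

```cpp
#include <cstdint>
#include <vector>

// Encode an unsigned value as ULEB128: 7 payload bits per byte, with the
// high bit set on every byte except the last one.
std::vector<uint8_t> encodeULEB128(uint64_t value) {
  std::vector<uint8_t> out;
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80; // more bytes follow
    out.push_back(byte);
  } while (value != 0);
  return out;
}

// Encode a value as a single raw byte (the uint8 encoding).
std::vector<uint8_t> encodeUint8(uint8_t value) { return {value}; }
```

For any value below 128 the two encodings produce the same single byte, which is why the always-zero attribute field is unaffected.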
This is effectively NFC; I didn't mark it as such just because it
changed YAML test results.
Reviewed By: sbc100, tlively
Differential Revision: https://reviews.llvm.org/D111086
We previously had a limitation that TLS variables could not
be exported (and therefore could also not be imported). This
change removes that limitation.
Differential Revision: https://reviews.llvm.org/D108877
In the case of weakly defined symbols in shared libraries we now
generate both an import and an export. The dynamic linker can then
choose a winner from among all the shared libraries that define a
given symbol.
Previously any direct usage of a weakly defined symbol would use the
DSO-local definition. For example, even though there would be a single
address for a weakly defined function, each DSO could end up directly
calling its local version.
Fixes: https://github.com/emscripten-core/emscripten/issues/13773
Differential Revision: https://reviews.llvm.org/D108413
This avoids duplication and simplifies the code in several places
without increasing the size of the symbol union (at least not
above the assert'd limit of 120 bytes).
Originally commit: 9b965b37c7
Reverted in: 16aac493e5.
Differential Revision: https://reviews.llvm.org/D106026
__table_base is now 64-bit, since in LLVM it represents a function pointer offset.
__table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there.
A new reloc, R_WASM_TABLE_INDEX_REL_SLEB64, is added.
Differential Revision: https://reviews.llvm.org/D101784
The main motivation for this refactor is to remove the subclass
relationship between InputSegment and the MergeInputSegment and
SyntheticMergedInputSegment classes, so that we can use the merging
classes for debug sections, which are not data segments.
In the process of refactoring I also remove all the virtual functions
from the class hierarchy and try to reuse techniques used in the ELF
linker (see `lld/ELF/InputSections.h`).
Differential Revision: https://reviews.llvm.org/D102546
Don't include the relocation addend when calculating the
virtual address of a symbol. Instead just pass the symbol's
offset and add the addend afterwards.
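The difference between the two orderings can be sketched as follows. `Piece`, `MergeSection`, and both helper functions are hypothetical simplifications for illustration and do not mirror the real lld classes:

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one piece of a mergeable input section.
struct Piece {
  uint64_t inputOff;  // offset of the piece within the input section
  uint64_t size;      // size of the piece in bytes
  uint64_t outputOff; // offset of the piece after merging
};

struct MergeSection {
  uint64_t size;             // total size of the input section
  std::vector<Piece> pieces; // pieces covering [0, size)

  // Translate an input offset to an output offset. Returns nullopt when
  // the offset lies outside the section.
  std::optional<uint64_t> translate(uint64_t off) const {
    if (off >= size)
      return std::nullopt;
    for (const Piece &p : pieces)
      if (off >= p.inputOff && off < p.inputOff + p.size)
        return p.outputOff + (off - p.inputOff);
    return std::nullopt;
  }
};

// Old order: fold the addend into the offset before translating.
std::optional<uint64_t> addrAddendFirst(const MergeSection &s,
                                        uint64_t symOff, int64_t addend) {
  return s.translate(symOff + addend);
}

// New order: translate the symbol's own offset, then add the addend.
std::optional<uint64_t> addrAddendLast(const MergeSection &s,
                                       uint64_t symOff, int64_t addend) {
  std::optional<uint64_t> va = s.translate(symOff);
  if (!va)
    return std::nullopt;
  return *va + addend;
}
```

With an 8-byte section and an addend of 16, the first variant looks up an offset past the end of the section and fails, while the second resolves the symbol and then applies the addend.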
Without this fix we hit the `offset is outside the section`
error in MergeInputSegment::getSegmentPiece.
This fixes a real-world error we were seeing in emscripten.
Differential Revision: https://reviews.llvm.org/D102271
Specifically:
- InputChunk::outputOffset -> outSecOffset
- Symbol::get/setVirtualAddress -> get/setVA
- add InputChunk::getOffset helper that takes an offset
These are mostly in preparation for adding support for
SHF_MERGE/SHF_STRINGS, but it's also good to align with ELF where
possible.
Differential Revision: https://reviews.llvm.org/D97595
This patch adds support to wasm-ld for linking multiple table references
together, in a manner similar to wasm globals. The indirect function
table is synthesized as needed.
To manage the transitional period in which the compiler doesn't yet
produce TABLE_NUMBER relocations and doesn't residualize table symbols,
the linker will detect object files which have table imports or
definitions, but no table symbols. In that case it will synthesize
symbols for the defined and imported tables.
As a change, relocatable objects are now written with table symbols,
which can cause symbol renumbering in some of the tests. If no object
file requires an indirect function table, none will be written to the
file. Note that for legacy ObjFile inputs, this test is conservative: as
we don't have relocs for each use of the indirect function table, we
just assume that any incoming indirect function table should be
propagated to the output.
Differential Revision: https://reviews.llvm.org/D91870
This commit adds table symbol support in a partial way, while still
including some special cases for the __indirect_function_table symbol.
No change in tests.
Differential Revision: https://reviews.llvm.org/D94075
We have two types of relocations that we apply on startup:
1. Relocations that apply to wasm globals
2. Relocations that apply to wasm memory
The first set of relocations use only the `__memory_base` import to
update a set of internal globals. Because wasm globals are thread local
these need to run on each thread. Memory relocations, like static
constructors, must only be run once.
To ensure global relocations run on all threads, and because they only
depend on the immutable `__memory_base` import, we can run them during
the WebAssembly start function, instead of waiting until the
post-instantiation `__wasm_call_ctors`.
Differential Revision: https://reviews.llvm.org/D93066
Without this extra flag we can't distinguish between stub functions and
functions that happen to have address 0 (relative to __table_base).
Adding this flag bit to the base symbol class actually avoids growing the
SymbolUnion struct, which would not be true if we added it to the
FunctionSymbol subclass (due to bitpacking).
The previous approach of setting its table index to zero worked for
normal static relocations but not for `-fPIC` code.
See https://github.com/emscripten-core/emscripten/issues/12819
Differential Revision: https://reviews.llvm.org/D92038
This is a more full featured version of ``--allow-undefined``.
The semantics of the different methods are as follows:
report-all:
   Report all unresolved symbols. This is the default. Normally the
   linker will generate an error message for each reported unresolved
   symbol but the option ``--warn-unresolved-symbols`` can change this
   to a warning.
ignore-all:
   Resolve all undefined symbols to zero. For data and function
   addresses this is trivial. For direct function calls, the linker
   will generate a trapping stub function in place of the undefined
   function.
import-functions:
   Generate WebAssembly imports for any undefined functions. Undefined
   data symbols are resolved to zero as in `ignore-all`. This
   corresponds to the legacy ``--allow-undefined`` flag.
The plan is to follow up with a new mode called `import-dynamic` which
allows statically linked binaries to refer to both data and
function symbols from the embedder.
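The three modes described above can be summarised as a small decision function. This is only a sketch: the enum names, `parsePolicy`, and `handleUndefined` are hypothetical and are not the identifiers used in wasm-ld.

```cpp
#include <optional>
#include <string>

enum class UnresolvedPolicy { ReportAll, IgnoreAll, ImportFunctions };
enum class SymbolKind { Function, Data };
enum class Action { Report, ResolveToZero, TrapStub, Import };

// Map the option's string value to a policy; nullopt for unknown values.
std::optional<UnresolvedPolicy> parsePolicy(const std::string &s) {
  if (s == "report-all")
    return UnresolvedPolicy::ReportAll;
  if (s == "ignore-all")
    return UnresolvedPolicy::IgnoreAll;
  if (s == "import-functions")
    return UnresolvedPolicy::ImportFunctions;
  return std::nullopt;
}

// Decide what to do with one use of an undefined symbol.
Action handleUndefined(UnresolvedPolicy policy, SymbolKind kind,
                       bool isDirectCall) {
  switch (policy) {
  case UnresolvedPolicy::ReportAll:
    return Action::Report; // error, or warning with the warn option
  case UnresolvedPolicy::IgnoreAll:
    // Direct calls need a callable target, so they get a trapping stub;
    // addresses simply resolve to zero.
    if (kind == SymbolKind::Function && isDirectCall)
      return Action::TrapStub;
    return Action::ResolveToZero;
  case UnresolvedPolicy::ImportFunctions:
    return kind == SymbolKind::Function ? Action::Import
                                        : Action::ResolveToZero;
  }
  return Action::Report; // not reached for valid enum values
}
```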
Differential Revision: https://reviews.llvm.org/D79248
This adds support for new-style command support. In this mode, all exports
are considered command entrypoints, and the linker inserts calls to
`__wasm_call_ctors` and `__wasm_call_dtors` for all such entrypoints.
This enables support for:
- Command entrypoints taking arguments other than strings and return values
other than `int`.
- Multicall executables without requiring the use of string-based
command-line arguments.
This new behavior is disabled when the input has an explicit call to
`__wasm_call_ctors`, indicating code not expecting new-style command
support.
This change does mean that wasm-ld no longer supports DCE-ing the
`__wasm_call_ctors` function when there are no calls to it. If there are no
calls to it, and there are ctors present, we assume it's wasm-ld's job to
insert the calls. This seems ok though, because if there are ctors present,
the program is expecting them to be called. This change affects the
init-fini-gc.ll test.
When a weak reference to a lazy symbol occurs we were not correctly
updating the lazy symbol. We need to tag the existing lazy symbol
as weak and, in the case of a function symbol, give it a signature.
Without the signature we can't then create the dummy function which
is needed when a weakly undefined function is called.
We had tests for weakly referenced lazy symbols but we were only
testing the case where the reference was seen before the lazy
symbol.
See: https://github.com/WebAssembly/wasi-libc/pull/214
Differential Revision: https://reviews.llvm.org/D85567
This adds 4 new reloc types.
A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) has been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.
A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are identical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307
Differential Revision: https://reviews.llvm.org/D81704
When there are both strong and weak references to an undefined
symbol, ensure that the strong reference prevails in the output symbol,
generating the correct error.
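The rule can be sketched independently of reference order. `UndefinedSymbol`, `mergeReference`, and `shouldReportUndefined` are hypothetical names used only for this illustration:

```cpp
#include <string>

enum class Binding { Weak, Strong };

// Hypothetical record for a symbol that is referenced but not defined.
struct UndefinedSymbol {
  std::string name;
  Binding binding;
};

// Fold a further reference into the existing symbol. A strong reference
// always wins; a weak reference never downgrades a strong one.
void mergeReference(UndefinedSymbol &existing, Binding newRef) {
  if (newRef == Binding::Strong)
    existing.binding = Binding::Strong;
}

// Only strongly referenced undefined symbols produce an error.
bool shouldReportUndefined(const UndefinedSymbol &sym) {
  return sym.binding == Binding::Strong;
}
```

The result is the same whether the weak or the strong reference is seen first.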
Test case copied from lld/test/ELF/weak-and-strong-undef.s
Differential Revision: https://reviews.llvm.org/D75322
This changes the in-memory representation of wasm symbols such that their
optional ImportName and ImportModule use llvm::Optional.
ImportName is set whenever the WASM_SYMBOL_EXPLICIT_NAME flag is set.
ImportModule (for imports) is currently always set since it defaults to
"env".
In the future we can possibly extend the binary format to distinguish
imports which have explicit module names.
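A minimal sketch of the representation, using `std::optional` as a stand-in for `llvm::Optional`. The `SymbolInfo` struct and `makeImport` helper here are hypothetical simplifications, not the actual LLVM types:

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical, simplified symbol record.
struct SymbolInfo {
  std::string name;
  bool explicitName = false; // stands in for WASM_SYMBOL_EXPLICIT_NAME
  std::optional<std::string> importName;
  std::optional<std::string> importModule;
};

// Build an import symbol. The import name is present only when given
// explicitly; the module falls back to "env" when none is specified.
SymbolInfo makeImport(std::string name,
                      std::optional<std::string> explicitImportName = std::nullopt,
                      std::optional<std::string> moduleName = std::nullopt) {
  SymbolInfo sym;
  sym.name = std::move(name);
  if (explicitImportName) {
    sym.explicitName = true;
    sym.importName = std::move(explicitImportName);
  }
  sym.importModule = moduleName.value_or("env");
  return sym;
}
```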
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74109
Summary:
- `__wasm_init_memory` is now the WebAssembly start function instead
of being called from `__wasm_call_ctors` or called directly by the
runtime.
- Adds a new synthetic data symbol `__wasm_init_memory_flag` that is
atomically incremented from zero to one by the thread responsible
for initializing memory.
- All threads now unconditionally perform data.drop on all passive
segments.
- Removes --passive-segments and --active-segments flags and controls
segment type based on --shared-memory instead. The deleted flags
were only present to ameliorate the upgrade path in Emscripten.
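The role of the flag can be sketched with a compare-and-swap. This is an illustration only: `initMemoryFlag` and `segmentsInitialized` are hypothetical stand-ins, and the sketch does not model how other threads wait for initialization to finish.

```cpp
#include <atomic>

// Stand-in for the synthetic `__wasm_init_memory_flag` data symbol.
std::atomic<int> initMemoryFlag{0};

// Counts how many times segment initialization actually ran.
int segmentsInitialized = 0;

// Returns true if the caller was the thread that initialized memory.
bool initMemory() {
  int expected = 0;
  // Only the thread that moves the flag from 0 to 1 performs the
  // initialization (memory.init for each passive segment).
  bool winner = initMemoryFlag.compare_exchange_strong(expected, 1);
  if (winner)
    ++segmentsInitialized;
  // Every caller, winner or not, would drop the passive segments here
  // (data.drop).
  return winner;
}
```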
Reviewers: sbc100, aheejin
Subscribers: dschuff, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65783
llvm-svn: 370965
This patch implements support for the NO_STRIP flag, which will allow
__attribute__((used)) to be implemented.
This accompanies https://reviews.llvm.org/D62542, which moves to setting the
NO_STRIP flag, and will continue to set EXPORTED for Emscripten targets for
compatibility.
Differential Revision: https://reviews.llvm.org/D66968
llvm-svn: 370416
Summary:
Add immutable WASM global `__tls_align` which stores the alignment
requirements of the TLS segment.
Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang.
The expected usage has now changed to:
    __wasm_init_tls(memalign(__builtin_wasm_tls_align(),
                             __builtin_wasm_tls_size()));
Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton
Reviewed By: tlively
Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65028
llvm-svn: 366624
Summary:
Thread local variables are placed inside a `.tdata` segment. Their symbols are
offsets from the start of the segment. The address of a thread local variable
is computed as `__tls_base` + the offset from the start of the segment.
`.tdata` segment is a passive segment and `memory.init` is used once per thread
to initialize the thread local storage.
`__tls_base` is a wasm global. Since each thread has its own wasm instance,
it is effectively thread local. Currently, `__tls_base` must be initialized
at thread startup, and so cannot be used with dynamic libraries.
`__tls_base` is to be initialized with a new linker-synthesized function,
`__wasm_init_tls`, which takes as an argument a block of memory to use as the
storage for thread locals. It then initializes the block of memory and sets
`__tls_base`. As `__wasm_init_tls` will handle the memory initialization,
the memory does not have to be zeroed.
To help allocating memory for thread-local storage, a new compiler intrinsic
is introduced: `__builtin_wasm_tls_size()`. This intrinsic function returns
the size of the thread-local storage for the current function.
The expected usage is to run something like the following upon thread startup:
    __wasm_init_tls(malloc(__builtin_wasm_tls_size()));
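The addressing scheme can be modelled in ordinary C++. `TlsModel` is a hypothetical stand-in: `tlsBase` plays the role of the `__tls_base` global, `tlsSize` of `__builtin_wasm_tls_size()`, and `initTls` of `__wasm_init_tls`. One instance would exist per thread.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

struct TlsModel {
  std::vector<unsigned char> tdata; // initial image of the .tdata segment
  unsigned char *tlsBase = nullptr; // per-thread base of the TLS block

  // Size of the block the caller must allocate.
  size_t tlsSize() const { return tdata.size(); }

  // Copy the initial image into the block and record it as the base.
  // The block does not need to be zeroed beforehand.
  void initTls(void *block) {
    if (!tdata.empty())
      std::memcpy(block, tdata.data(), tdata.size());
    tlsBase = static_cast<unsigned char *>(block);
  }

  // Address of a thread-local variable: base plus segment offset.
  void *addressOf(size_t offset) const { return tlsBase + offset; }
};
```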
Reviewers: tlively, aheejin, kripken, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D64537
llvm-svn: 366272
This patch does the same thing as r365595 to other subdirectories,
which completes the naming style change for the entire lld directory.
With this, the naming style conversion is complete for lld.
Differential Revision: https://reviews.llvm.org/D64473
llvm-svn: 365730
We should be generating one __start/__stop pair per output segment
not per input segment. The test wasn't catching this because it was
only linking a single object file.
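A sketch of the intended behaviour, assuming a hypothetical `InputSegment` record and `startStopSymbols` helper (neither is the real lld type, and the real linker applies further conditions on segment names):

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical record: which object a segment came from and which
// output segment it is merged into.
struct InputSegment {
  std::string objectFile;
  std::string outputName;
};

// Produce one __start_/__stop_ pair per distinct output segment,
// regardless of how many input segments feed into it.
std::vector<std::string>
startStopSymbols(const std::vector<InputSegment> &inputs) {
  std::set<std::string> outputs;
  for (const InputSegment &seg : inputs)
    outputs.insert(seg.outputName);

  std::vector<std::string> symbols;
  for (const std::string &name : outputs) {
    symbols.push_back("__start_" + name);
    symbols.push_back("__stop_" + name);
  }
  return symbols;
}
```

Linking two objects that both contribute to the same output segment is the case a single-object test cannot exercise.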
Fixes PR41565
Differential Revision: https://reviews.llvm.org/D64148
llvm-svn: 365308
On Windows, the bitfield layout rule places `unsigned Referenced : 1` at
byte offset 40, instead of byte offset 37 on *NIX. The consequence is that
sizeof(SymbolUnion) == 104 on Windows while 96 on *NIX.
To eliminate this difference, change these unsigned bitfields to bool.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64238
llvm-svn: 365296
On 64-bit systems, this decreases sizeof(SymbolUnion) from 112 to 96.
Add a static_assert to avoid accidental increases in future.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D64208
llvm-svn: 365169
Summary:
Adds `--passive-segments` and `--active-segments` flags to control
what kind of segments are emitted. For now the default is always
to emit active segments so this is not a breaking change, but in
the future the default will be changed to passive segments when
shared memory is requested and active segments otherwise. When
passive segments are emitted, corresponding memory.init and
data.drop instructions are emitted in a `__wasm_init_memory`
function that is automatically called at the beginning of
`__wasm_call_ctors`.
Reviewers: sbc100, aheejin, dschuff
Subscribers: azakai, dschuff, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59343
llvm-svn: 365088
Summary:
This is needed for address sanitizer on Emscripten. As everything in
memory starts at the value passed to --global-base, everything before
that can be used as shadow memory.
This symbol is added so that the library for the ASan runtime can know
where the shadow memory ends and real memory begins.
This is split from D63742.
Reviewers: tlively, aheejin, sbc100
Subscribers: sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63833
llvm-svn: 364467
When a function is excluded via comdat we shouldn't add it to the
final list of init functions.
Differential Revision: https://reviews.llvm.org/D62983
llvm-svn: 362769
When function signatures don't match and the undefined function is not
called directly (i.e. only has its address taken) we don't issue a
warning or create a runtime thunk for the undefined function.
Instead in this case we simply use the defined version of the function.
This is possible since checking signatures of dynamic calls happens
at runtime so any invalid usage will still result in a runtime error.
This is needed to allow C++ programs to link without generating
warnings. It's not uncommon in C++ for vtables to be populated by
function address where the signature of the function is not known in the
compilation unit. In this case clang declares the method as void(void)
and relies on the vtable caller casting the data back to the correct
signature.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=40412
Differential Revision: https://reviews.llvm.org/D62153
llvm-svn: 361678