__table_base is know 64-bit, since in LLVM it represents a function pointer offset
__table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there.
New reloc R_WASM_TABLE_INDEX_REL_SLEB64 added
Differential Revision: https://reviews.llvm.org/D101784
The main motivation for this refactor is to remove the subclass
relationship between the InputSegment and MergeInputSegment and
SyntenticMergedInputSegment so that we can use the merging classes for
debug sections which are not data segments.
In the process of refactoring I also remove all the virtual functions
from the class hierarchy and try to reuse techniques used in the ELF
linker (see `lld/ELF/InputSections.h`).
Differential Revision: https://reviews.llvm.org/D102546
Don't include the relocation addend when calculating the
virtual address of a symbol. Instead just pass the symbol's
offset and add the addend afterwards.
Without this fix we hit the `offset is outside the section`
error in MergeInputSegment::getSegmentPiece.
This fixes a real world error we were are seeing in emscripten.
Differential Revision: https://reviews.llvm.org/D102271
This change was originally landed in: 5000a1b4b9
It was reverted in: 061e071d8c
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
Specifically:
- InputChunk::outputOffset -> outSecOffset
- Symbol::get/setVirtualAddress -> get/setVA
- add InputChunk::getOffset helper that takes an offset
These are mostly in preparation for adding support for
SHF_MERGE/SHF_STRINGS but its also good to align with ELF where
possible.
Differential Revision: https://reviews.llvm.org/D97595
MVP object files may import at most one table, and if they do, it must
be assigned table number zero in the output, as the references to that
table are not relocatable. Ensure that this is the case, even if some
inputs define other tables.
Differential Revision: https://reviews.llvm.org/D96001
With dynamic linking we have the current limitation that there can be
only a single active data segment (since we use __memory_base as the
load address and we can't do arithmetic in constant expresions).
This change delays the merging of active segments until a little later
in the linking process which means that the grouping of data by section,
and the magic __start/__end symbols work as expected under dynamic
linking.
Differential Revision: https://reviews.llvm.org/D96453
This commit regroups commonalities among InputGlobal, InputEvent, and
InputTable into the new InputElement. The subclasses are defined
inline in the new InputElement.h. NFC.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D94677
This patch adds support to wasm-ld for linking multiple table references
together, in a manner similar to wasm globals. The indirect function
table is synthesized as needed.
To manage the transitional period in which the compiler doesn't yet
produce TABLE_NUMBER relocations and doesn't residualize table symbols,
the linker will detect object files which have table imports or
definitions, but no table symbols. In that case it will synthesize
symbols for the defined and imported tables.
As a change, relocatable objects are now written with table symbols,
which can cause symbol renumbering in some of the tests. If no object
file requires an indirect function table, none will be written to the
file. Note that for legacy ObjFile inputs, this test is conservative: as
we don't have relocs for each use of the indirecy function table, we
just assume that any incoming indirect function table should be
propagated to the output.
Differential Revision: https://reviews.llvm.org/D91870
This patch adds support to wasm-ld for linking multiple table references
together, in a manner similar to wasm globals. The indirect function
table is synthesized as needed.
To manage the transitional period in which the compiler doesn't yet
produce TABLE_NUMBER relocations and doesn't residualize table symbols,
the linker will detect object files which have table imports or
definitions, but no table symbols. In that case it will synthesize
symbols for the defined and imported tables.
As a change, relocatable objects are now written with table symbols,
which can cause symbol renumbering in some of the tests. If no object
file requires an indirect function table, none will be written to the
file. Note that for legacy ObjFile inputs, this test is conservative: as
we don't have relocs for each use of the indirecy function table, we
just assume that any incoming indirect function table should be
propagated to the output.
Differential Revision: https://reviews.llvm.org/D91870
This commit adds table symbol support in a partial way, while still
including some special cases for the __indirect_function_table symbol.
No change in tests.
Differential Revision: https://reviews.llvm.org/D94075
Live symbols should only cause the files in which they are defined
to become live.
For now this is only tested in emscripten: we're continuing
to work on reducing the test case further for an lld-style
unit test.
Differential Revision: https://reviews.llvm.org/D93472
We have two types of relocations that we apply on startup:
1. Relocations that apply to wasm globals
2. Relocations that apply to wasm memory
The first set of relocations use only the `__memory_base` import to
update a set of internal globals. Because wasm globals are thread local
these need to run on each thread. Memory relocations, like static
constructors, must only be run once.
To ensure global relocations run on all threads and because the only
depend on the immutable `__memory_base` import we can run them during
the WebAssembly start functions, instead of waiting until the
post-instantiation __wasm_call_ctors.
Differential Revision: https://reviews.llvm.org/D93066
This is a more full featured version of ``--allow-undefined``.
The semantics of the different methods are as follows:
report-all:
Report all unresolved symbols. This is the default. Normally the
linker will generate an error message for each reported unresolved
symbol but the option ``--warn-unresolved-symbols`` can change this
to a warning.
ignore-all:
Resolve all undefined symbols to zero. For data and function
addresses this is trivial. For direct function calls, the linker
will generate a trapping stub function in place of the undefined
function.
import-functions:
Generate WebAssembly imports for any undefined functions. Undefined
data symbols are resolved to zero as in `ignore-all`. This
corresponds to the legacy ``--allow-undefined`` flag.
The plan is to followup with a new mode called `import-dynamic` which
allows for statically linked binaries to refer to both data and
functions symbols from the embedder.
Differential Revision: https://reviews.llvm.org/D79248
These relocations represent offsets from the __tls_base symbol.
Previously we were just using normal MEMORY_ADDR relocations and relying
on the linker to select a segment-offset rather and absolute value in
Symbol::getVirtualAddress(). Using an explicit relocation type allows
allow us to clearly distinguish absolute from relative relocations based
on the relocation information alone.
One place this is useful is being able to reject absolute relocation in
the PIC case, but still accept TLS relocations.
Differential Revision: https://reviews.llvm.org/D91276
This allows `__wasilibc_populate_libpreopen` to be GC'd in more cases
where it isn't needed, including when linked from Rust's libstd.
Differential Revision: https://reviews.llvm.org/D85062
This adds support for new-style command support. In this mode, all exports
are considered command entrypoints, and the linker inserts calls to
`__wasm_call_ctors` and `__wasm_call_dtors` for all such entrypoints.
This enables support for:
- Command entrypoints taking arguments other than strings and return values
other than `int`.
- Multicall executables without requiring on the use of string-based
command-line arguments.
This new behavior is disabled when the input has an explicit call to
`__wasm_call_ctors`, indicating code not expecting new-style command
support.
This change does mean that wasm-ld no longer supports DCE-ing the
`__wasm_call_ctors` function when there are no calls to it. If there are no
calls to it, and there are ctors present, we assume it's wasm-ld's job to
insert the calls. This seems ok though, because if there are ctors present,
the program is expecting them to be called. This change affects the
init-fini-gc.ll test.
When a weak reference of a lazy symbol occurs we were not correctly
updating the lazy symbol. We need to tag the existing lazy symbol
as weak and, in the case of a function symbol, give it a signature.
Without the signature we can't then create the dummy function which
is needed when an weakly undefined function is called.
We had tests for weakly referenced lazy symbols but we were only
tests in the case where the reference was seen before the lazy
symbol.
See: https://github.com/WebAssembly/wasi-libc/pull/214
Differential Revision: https://reviews.llvm.org/D85567
This adds 4 new reloc types.
A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) have been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.
A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are indentical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307
Differential Revision: https://reviews.llvm.org/D81704
WebAssembly requires that caller and callee signatures match, so it
can't do the usual trick of passing more arguments to main than it
expects. Instead WebAssembly will mangle "main" with argc/argv
parameters as "__main_argc_argv". This patch teaches lld how to
demangle it.
This patch is part of https://reviews.llvm.org/D70700.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Instead of returning an optional, just return the input string if
demangling fails, as that's what all callers use anyway.
Differential Revision: https://reviews.llvm.org/D68015
llvm-svn: 373077
Summary:
- `__wasm_init_memory` is now the WebAssembly start function instead
of being called from `__wasm_call_ctors` or called directly by the
runtime.
- Adds a new synthetic data symbol `__wasm_init_memory_flag` that is
atomically incremented from zero to one by the thread responsible
for initializing memory.
- All threads now unconditionally perform data.drop on all passive
segments.
- Removes --passive-segments and --active-segments flags and controls
segment type based on --shared-memory instead. The deleted flags
were only present to ameliorate the upgrade path in Emscripten.
Reviewers: sbc100, aheejin
Subscribers: dschuff, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65783
llvm-svn: 370965
This patch implements support for the NO_STRIP flag, which will allow
__attribute__((used)) to be implemented.
This accompanies https://reviews.llvm.org/D62542, which moves to setting the
NO_STRIP flag, and will continue to set EXPORTED for Emscripten targets for
compatibility.
Differential Revision: https://reviews.llvm.org/D66968
llvm-svn: 370416
Summary:
Add immutable WASM global `__tls_align` which stores the alignment
requirements of the TLS segment.
Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang.
The expected usage has now changed to:
__wasm_init_tls(memalign(__builtin_wasm_tls_align(),
__builtin_wasm_tls_size()));
Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton
Reviewed By: tlively
Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65028
llvm-svn: 366624
Summary:
Thread local variables are placed inside a `.tdata` segment. Their symbols are
offsets from the start of the segment. The address of a thread local variable
is computed as `__tls_base` + the offset from the start of the segment.
`.tdata` segment is a passive segment and `memory.init` is used once per thread
to initialize the thread local storage.
`__tls_base` is a wasm global. Since each thread has its own wasm instance,
it is effectively thread local. Currently, `__tls_base` must be initialized
at thread startup, and so cannot be used with dynamic libraries.
`__tls_base` is to be initialized with a new linker-synthesized function,
`__wasm_init_tls`, which takes as an argument a block of memory to use as the
storage for thread locals. It then initializes the block of memory and sets
`__tls_base`. As `__wasm_init_tls` will handle the memory initialization,
the memory does not have to be zeroed.
To help allocating memory for thread-local storage, a new compiler intrinsic
is introduced: `__builtin_wasm_tls_size()`. This instrinsic function returns
the size of the thread-local storage for the current function.
The expected usage is to run something like the following upon thread startup:
__wasm_init_tls(malloc(__builtin_wasm_tls_size()));
Reviewers: tlively, aheejin, kripken, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D64537
llvm-svn: 366272
This patch does the same thing as r365595 to other subdirectories,
which completes the naming style change for the entire lld directory.
With this, the naming style conversion is complete for lld.
Differential Revision: https://reviews.llvm.org/D64473
llvm-svn: 365730
Summary:
Adds `--passive-segments` and `--active-segments` flags to control
what kind of segments are emitted. For now the default is always
to emit active segments so this is not a breaking change, but in
the future the default will be changed to passive segments when
shared memory is requested and active segments otherwise. When
passive segments are emitted, corresponding memory.init and
data.drop instructions are emitted in a `__wasm_init_memory`
function that is automatically called at the beginning of
`__wasm_call_ctors`.
Reviewers: sbc100, aheejin, dschuff
Subscribers: azakai, dschuff, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59343
llvm-svn: 365088
Summary:
This is needed for address sanitizer on Emscripten. As everything in
memory starts at the value passed to --global-base, everything before
that can be used as shadow memory.
This symbol is added so that the library for the ASan runtime can know
where the shadow memory ends and real memory begins.
This is split from D63742.
Reviewers: tlively, aheejin, sbc100
Subscribers: sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63833
llvm-svn: 364467
When a function is excluded via comdat we shouldn't add it to the
final list of init functions.
Differential Revision: https://reviews.llvm.org/D62983
llvm-svn: 362769
Major refactor to better match the structure of the ELF linker.
- Split out relocation processing into scanRelocations
- Split out synthetic sections into their own classes.
Differential Revision: https://reviews.llvm.org/D61811
llvm-svn: 361233
The code we generate for applying data relocations at runtime omitted
the symbols with GOT entries.
Also refactor the code to reduce duplication.
Differential Revision: https://reviews.llvm.org/D61111
llvm-svn: 359207
This change implements lowering of references global symbols in PIC
mode.
This change implements lowering of global references in PIC mode using a
new @GOT reference type. @GOT references can be used with function or
data symbol names combined with the get_global instruction. In this case
the linker will insert the wasm global that stores the address of the
symbol (either in memory for data symbols or in the wasm table for
function symbols).
For now I'm continuing to use the R_WASM_GLOBAL_INDEX_LEB relocation
type for this type of reference which means that this relocation type
can refer to either a global or a function or data symbol. We could
choose to introduce specific relocation types for GOT entries in the
future. See the current dynamic linking proposal:
https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md
Differential Revision: https://reviews.llvm.org/D54647
llvm-svn: 357022
Previously we could emit a warning and generate a potentially invalid
wasm module (due to call sites and functions having conflicting
signatures). Now, rather than create invalid binaries we handle such
cases by creating stub functions containing unreachable, effectively
turning these into runtime errors rather than validation failures.
Differential Revision: https://reviews.llvm.org/D57909
llvm-svn: 354528
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636