Symbols from LTO objects don't contain Wasm signatures, but we need a
signature when we create undefined/stub functions for missing weakly
undefined symbols.
Luckily, after LTO, we know that symbols that are not referenced by a
regular object file must not be needed in the final output so there
is no need to generate undefined/stub function for them.
Differential Revision: https://reviews.llvm.org/D126554
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.
See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html
Differential Revision: https://reviews.llvm.org/D108850
This makes Wasm EH work with dynamic linking. So far we were only able
to handle destructors, which do not use any tags or LSDA info.
1. This uses `TargetExternalSymbol` for `GCC_except_tableN` symbols,
which points to the address of per-function LSDA info. It is more
convenient to use than `MCSymbol` because it can take additional
target flags.
2. When lowering `wasm_lsda` intrinsic, if PIC is enabled, make the
symbol relative to `__memory_base` and generate the `add` node. If
PIC is disabled, continue to use the absolute address.
3. Make tag symbols (`__cpp_exception` and `__c_longjmp`) undefined in
the backend, because it is hard to make it work with dynamic
linking's loading order. Instead, we make all tag symbols undefined
in the LLVM backend and import it from JS.
4. Add support for undefined tags to the linker.
Companion patches:
- https://github.com/WebAssembly/binaryen/pull/4223
- https://github.com/emscripten-core/emscripten/pull/15266
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D111388
This removes `WasmTagType`. `WasmTagType` contained an attribute and a
signature index:
```
struct WasmTagType {
uint8_t Attribute;
uint32_t SigIndex;
};
```
Currently the attribute field is not used and reserved for future use,
and always 0. And that this class contains `SigIndex` as its property is
a little weird in the place, because the tag type's signature index is
not an inherent property of a tag but rather a reference to another
section that changes after linking. This makes tag handling in the
linker also weird that tag-related methods are taking both `WasmTagType`
and `WasmSignature` even though `WasmTagType` contains a signature
index. This is because the signature index changes in linking so it
doesn't have any info at this point. This instead moves `SigIndex` to
`struct WasmTag` itself, as we did for `struct WasmFunction` in D111104.
In this CL, in lib/MC and lib/Object, this now treats tag types in the
same way as function types. Also in YAML, this removes `struct Tag`,
because now it only contains the tag index. Also tags set `SigIndex` in
`WasmImport` union, as functions do.
I think this makes things simpler and makes tag handling more in line
with function handling. These two shares similar properties in that both
of them have signatures, but they are kind of nominal so having the same
signature doesn't mean they are the same element.
Also a drive-by fix: the reserved 'attirubute' part's encoding changed
from uleb32 to uint8 a while ago. This was fixed in lib/MC and
lib/Object but not in YAML. This doesn't change object files because the
field's value is always 0 and its encoding is the same for the both
encoding.
This is effectively NFC; I didn't mark it as such just because it
changed YAML test results.
Reviewed By: sbc100, tlively
Differential Revision: https://reviews.llvm.org/D111086
In PIC mode we import function address via `GOT.mem` imports but for
direct function calls we still import the first class function.
However, if the function is never directly called we can avoid the first
class import completely.
Differential Revision: https://reviews.llvm.org/D108345
The main motivation for this refactor is to remove the subclass
relationship between the InputSegment and MergeInputSegment and
SyntenticMergedInputSegment so that we can use the merging classes for
debug sections which are not data segments.
In the process of refactoring I also remove all the virtual functions
from the class hierarchy and try to reuse techniques used in the ELF
linker (see `lld/ELF/InputSections.h`).
Differential Revision: https://reviews.llvm.org/D102546
Specifically:
- InputChunk::outputOffset -> outSecOffset
- Symbol::get/setVirtualAddress -> get/setVA
- add InputChunk::getOffset helper that takes an offset
These are mostly in preparation for adding support for
SHF_MERGE/SHF_STRINGS but its also good to align with ELF where
possible.
Differential Revision: https://reviews.llvm.org/D97595
This commit regroups commonalities among InputGlobal, InputEvent, and
InputTable into the new InputElement. The subclasses are defined
inline in the new InputElement.h. NFC.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D94677
addOptionalGlobalSymbols should be addOptionalGlobalSymbol.
Also, remove unnecessary additional argument to make the signature match
the sibling function: addOptionalDataSymbol.
Differential Revision: https://reviews.llvm.org/D96305
This patch adds support to wasm-ld for linking multiple table references
together, in a manner similar to wasm globals. The indirect function
table is synthesized as needed.
To manage the transitional period in which the compiler doesn't yet
produce TABLE_NUMBER relocations and doesn't residualize table symbols,
the linker will detect object files which have table imports or
definitions, but no table symbols. In that case it will synthesize
symbols for the defined and imported tables.
As a change, relocatable objects are now written with table symbols,
which can cause symbol renumbering in some of the tests. If no object
file requires an indirect function table, none will be written to the
file. Note that for legacy ObjFile inputs, this test is conservative: as
we don't have relocs for each use of the indirecy function table, we
just assume that any incoming indirect function table should be
propagated to the output.
Differential Revision: https://reviews.llvm.org/D91870
This patch adds support to wasm-ld for linking multiple table references
together, in a manner similar to wasm globals. The indirect function
table is synthesized as needed.
To manage the transitional period in which the compiler doesn't yet
produce TABLE_NUMBER relocations and doesn't residualize table symbols,
the linker will detect object files which have table imports or
definitions, but no table symbols. In that case it will synthesize
symbols for the defined and imported tables.
As a change, relocatable objects are now written with table symbols,
which can cause symbol renumbering in some of the tests. If no object
file requires an indirect function table, none will be written to the
file. Note that for legacy ObjFile inputs, this test is conservative: as
we don't have relocs for each use of the indirecy function table, we
just assume that any incoming indirect function table should be
propagated to the output.
Differential Revision: https://reviews.llvm.org/D91870
This commit adds table symbol support in a partial way, while still
including some special cases for the __indirect_function_table symbol.
No change in tests.
Differential Revision: https://reviews.llvm.org/D94075
Without this extra flag we can't distingish between stub functions and
functions that happen to have address 0 (relative to __table_base).
Adding this flag bit the base symbol class actually avoids growing the
SymbolUnion struct which would not be true if we added it to the
FunctionSymbol subclass (due to bitbacking).
The previous approach of setting it's table index to zero worked for
normal static relocations but not for `-fPIC` code.
See https://github.com/emscripten-core/emscripten/issues/12819
Differential Revision: https://reviews.llvm.org/D92038
This is a more full featured version of ``--allow-undefined``.
The semantics of the different methods are as follows:
report-all:
Report all unresolved symbols. This is the default. Normally the
linker will generate an error message for each reported unresolved
symbol but the option ``--warn-unresolved-symbols`` can change this
to a warning.
ignore-all:
Resolve all undefined symbols to zero. For data and function
addresses this is trivial. For direct function calls, the linker
will generate a trapping stub function in place of the undefined
function.
import-functions:
Generate WebAssembly imports for any undefined functions. Undefined
data symbols are resolved to zero as in `ignore-all`. This
corresponds to the legacy ``--allow-undefined`` flag.
The plan is to followup with a new mode called `import-dynamic` which
allows for statically linked binaries to refer to both data and
functions symbols from the embedder.
Differential Revision: https://reviews.llvm.org/D79248
Previously we limited the use of atomics and TLS to programs
linked with `--shared-memory`.
However, as of https://reviews.llvm.org/D79530 we now allow
programs that use atomic to be linked without `--shared-memory`.
For this to be useful we also want to all TLS usage in such
programs. In this case, since we know we are single threaded
we simply include the TLS data as a regular active segment
and create an immutable `__tls_base` global that point to the
start of this segment.
Fixes: https://github.com/emscripten-core/emscripten/issues/12489
Differential Revision: https://reviews.llvm.org/D91115
When a weak reference of a lazy symbol occurs we were not correctly
updating the lazy symbol. We need to tag the existing lazy symbol
as weak and, in the case of a function symbol, give it a signature.
Without the signature we can't then create the dummy function which
is needed when an weakly undefined function is called.
We had tests for weakly referenced lazy symbols but we were only
tests in the case where the reference was seen before the lazy
symbol.
See: https://github.com/WebAssembly/wasi-libc/pull/214
Differential Revision: https://reviews.llvm.org/D85567
This adds 4 new reloc types.
A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) have been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.
A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are indentical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307
Differential Revision: https://reviews.llvm.org/D81704
This is a followup to https://reviews.llvm.org/D78779.
When signatures mismatch we create set of variant symbols. Some of
the fields in these symbols were not be initialized correct.
Specifically we were seeing isUsedInRegularObj not being set correctly,
leading to the symbol not getting included in the symbol table
and a crash writing relections in --reloctable mode.
There is larger refactor due here, but this is a minimal change the
fixes the bug at hand.
Differential Revision: https://reviews.llvm.org/D79756
These stub new function were not being added to the symbol table
which in turn meant that we were crashing when trying to output
relocations against them.
Differential Revision: https://reviews.llvm.org/D78779
When there are both strong and weak references to an undefined
symbol ensure that the strong reference prevails in the output symbol
generating the correct error.
Test case copied from lld/test/ELF/weak-and-strong-undef.s
Differential Revision: https://reviews.llvm.org/D75322
The changes the in-memory representation of wasm symbols such that their
optional ImportName and ImportModule use llvm::Optional.
ImportName is set whenever WASM_SYMBOL_EXPLICIT_NAME flag is set.
ImportModule (for imports) is currently always set since it defaults to
"env".
In the future we can possibly extent to binary format distingish
import which have explit module names.
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74109
This can happen if lto::LTO::getRuntimeLibcallSymbols doesn't return
an complete/accurate list of libcalls. In this case new bitcode
object can be linked in after LTO.
For example the WebAssembly backend currently calls:
setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");
But `__truncsfhf2` is not part of `getRuntimeLibcallSymbols` so if
this symbol is generated during LTO the link will currently fail.
Without this change the linker crashes because the bitcode symbol
makes it all the way to the output phase.
See: https://bugs.llvm.org/show_bug.cgi?id=44353
Differential Revision: https://reviews.llvm.org/D71632
Undefined symbols in WebAssembly can come with custom `import-module`
and `import-field` attributes. However when reading symbols from
bitcode object files during LTO those curtom attributes are not
available.
Once we compile the LTO object and read in the symbol table from the
object file we have access to these custom attributes. In this case,
when undefined symbols are added and a symbol already exists in the
SymbolTable we can't simple return it, we may need to update the
symbol's attributes.
Fixes: PR43211
Differential Revision: https://reviews.llvm.org/D68959
llvm-svn: 375081
This allows undefined references in input files be resolved by the
optional symbols. Previously we were doing this before input file
reading which means it was working only for command line symbols
references (i.e. -u or --export).
Also use addOptionalDataSymbol for __dso_handle and make all optional
symbols hidden by default.
Differential Revision: https://reviews.llvm.org/D65920
llvm-svn: 368310
This patch does the same thing as r365595 to other subdirectories,
which completes the naming style change for the entire lld directory.
With this, the naming style conversion is complete for lld.
Differential Revision: https://reviews.llvm.org/D64473
llvm-svn: 365730
This puts handling of undefined symbols in a single location. Its
also more in line with the ELF backend which only reports undefined
symbols based on relocations.
One side effect is that we no longer report undefined symbols that are
only referenced in GC'd sections.
This also fixes a crash reported in the emscripten toolchain:
https://github.com/emscripten-core/emscripten/issues/8930.
Differential Revision: https://reviews.llvm.org/D64280
llvm-svn: 365553
We should be generating one __start/__stop pair per output segment
not per input segment. The test wasn't catching this because it was
only linking a single object file.
Fixes PR41565
Differential Revision: https://reviews.llvm.org/D64148
llvm-svn: 365308
When function signatures don't match and the undefined function is not
called directly (i.e. only has its address taken) we don't issue a
warning or create a runtime thunk for the undefined function.
Instead in this case we simply use the defined version of the function.
This is possible since checking signatures of dynamic calls happens
at runtime so any invalid usage will still result in a runtime error.
This is needed to allow C++ programs to link without generating
warnings. Its not uncommon in C++ for vtables to be populated by
function address whee the signature of the function is not known in the
compilation unit. In this case clang declares the method as void(void)
and relies on the vtable caller casting the data back to the correct
signature.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=40412
Differential Revision: https://reviews.llvm.org/D62153
llvm-svn: 361678