This bug was found by accident while trying to expand out testcases
for imported symbols, and is covered by the additional test case.
Differential Revision: https://reviews.llvm.org/D44331
llvm-svn: 327290
This matches the existing ordering that's been there for globals
for a while (__stack_pointer coming first).
Differential Revision: https://reviews.llvm.org/D44333
llvm-svn: 327286
toString(T) is a stringize function for an object of type T. Each type
that has that function defined should know how to stringize itself, and
there should be one string representation of an object. Passing a
"supplemental" argument to toString() breaks that princple. We shouldn't
add a second parameter to that function.
Differential Revision: https://reviews.llvm.org/D44323
llvm-svn: 327182
This error case is described in Linking.md. The operand for call requires
generation of a synthetic stub.
Differential Revision: https://reviews.llvm.org/D44028
llvm-svn: 327151
Previously we created __wasm_call_ctors with null InputFunction, and
added the InputFunction later. Now we create the SyntheticFunction with
null body, and set the body later.
Differential Revision: https://reviews.llvm.org/D44206
llvm-svn: 327149
When a symbol is exported via --export=foo but --allow-undefined
is also specified, the symbol is now allowed to be undefined.
Previously we were special casing such symbols.
This combinations of behavior is exactly what emescripten
requires. Although we are trying hard not to allow emscripten
specific features in lld, this one makes sense.
Enforce this behavior by added this case to test/wasm/undefined.ll.
Differential Revision: https://reviews.llvm.org/D44237
llvm-svn: 326976
This avoids the Writer unnecessarily having a member to retain ownership
of the function body.
Differential Revision: https://reviews.llvm.org/D43933
llvm-svn: 326580
Let X and Y be types. Previously, functions F(X, Y) and G(Y, X) had
the same hash value because their hash values are computed as follows:
hash(F) = hash(X) + hash(Y)
hash(G) = hash(Y) + hash(X)
This patch fixes the issue by using hash_combine.
Differential Revision: https://reviews.llvm.org/D43856
llvm-svn: 326336
This patch simplifies initializeSymbols. Since that function is called
at the tail context of ObjFile::parse, and the function is called only
once from that function, that's effectively just a continuation of
ObjFile::parse. So this patch merge it with ObjFile::parse.
Differential Revision: https://reviews.llvm.org/D43848
llvm-svn: 326296
The problem I want to address now is that chunks have too many data
members for "offsets", and their origins are not well defined.
For example, InputSegment has OutputSegmentOffset, but it's base class
also has OutputOffset. That's very confusing.
Differential Revision: https://reviews.llvm.org/D43726
llvm-svn: 326291
These output section names are ELF-specific. We shouldn't have this rule
for WebAssembly.
Differential Revision: https://reviews.llvm.org/D43712
llvm-svn: 326289
SubSection inherited from SyntheticSection, and SyntheticSection inherits
from OutputSection, so SubSection was an OutputSection. But that's wrong
because SubSection is not actually a WebAssembly output section.
It shares some functionalities with OutputSection, but overall it's very
different.
This patch removes that inheritance.
Differential Revision: https://reviews.llvm.org/D43719
llvm-svn: 326286
FileOutputBuffer automatically removes an existing file, so we don't
need to do that. Actually doing that is discouraged because when the
linker fails to create an output for some reason after instantiating
FileOutputBufffer, FileOutputBuffer removes a temporary file and don't
touch an existing file. That's an desired behavior from the user's
point of view.
(Internally, FileOutputBuffer writes its contents to a temporary file
and then rename it over to an existing file on commit()).
Differential Revision: https://reviews.llvm.org/D43728
llvm-svn: 326281
Looks like these accessor functions are a bit overly defensive, and
due to the amount of code, that part isn't easy to read. We have
code to log translation results in writeTo and writeRelocations, so
I don't think we need to log it again in these functions.
I think the new function is much easier to understand.
Differential Revision: https://reviews.llvm.org/D43713
llvm-svn: 326277
Instead of {Function,Global,Data}Symbol, use Defined{Function,Global,Data}
because undefined symbol should never reach this function.
Differential Revision: https://reviews.llvm.org/D43710
llvm-svn: 326275
I think calling reserve() for each object file is too many and isn't useful.
We can add reserve() later. By default, we shouldn't add reserve() to a lot
of places.
Differential Revision: https://reviews.llvm.org/D43722
llvm-svn: 326273
Previously, one function adds all types of undefined symbols. That
doesn't fit to the wasm's undefined symbol semantics well because
different types of undefined symbols are very different in wasm.
As a result, separate control flows merge in this addUndefined function
and then separate again for each type. That wasn't easy to read.
This patch separates the function into three functions. Now it is pretty
clear what we are doing for each undefined symbol type.
Differential Revision: https://reviews.llvm.org/D43697
llvm-svn: 326271
This means we don't need to write the linking metadata section
at all for executable (non-relocatable) output.
Differential Revision: https://reviews.llvm.org/D42869
llvm-svn: 326268
lld uses an arena allocator to one of allocations
like these can just use make<>.
Differential Revision: https://reviews.llvm.org/D43587
llvm-svn: 325706
Purely a rename in preparation for adding new global symbol type.
We want to use GlobalSymbol to represent real wasm globals and
DataSymbol for pointers to things in linear memory (what ELF would
call STT_OBJECT).
This reduces the size the patch to add the explicit symbol table
which is coming soon!
Differential Revision: https://reviews.llvm.org/D43476
llvm-svn: 325645
The profailing style in lld seem to be to not include such empty lines.
Clang-tidy/clang-format seem to handle this just fine.
Differential Revision: https://reviews.llvm.org/D43528
llvm-svn: 325629
I think if statements that end with return is easier to read than
cascaded if-else-if-else-if statements because it makes clear that
execution of a function terminates there. This patch also adds more
blank lines to separate code blocks to meaningful pieces.
Differential Revision: https://reviews.llvm.org/D43517
llvm-svn: 325625
This header used to be needed here for the `OutRelocation` struct
but that no longer exists.
Differential Revision: https://reviews.llvm.org/D43516
llvm-svn: 325608
Instead include InputFuction and InputSegment directly
in the subclasses that use them (DefinedFunction and
DefinedGlobal).
Differential Revision: https://reviews.llvm.org/D43493
llvm-svn: 325603
This patch removes `OutRelocations` vector from the InputChunk and
directly consume `Relocations` vector instead. This should make the linker
faster because we don't create a temporary data structure, and it matches
the lld's design principle that we don't create temporary data structures
for object files but instead directly consume mmap'ed data whenever possible.
Differential Revision: https://reviews.llvm.org/D43491
llvm-svn: 325549
We already have isa<> for this, and these methods were simply
duplicating those redundantly.
Differential Revision: https://reviews.llvm.org/D43422
llvm-svn: 325418
Summary:
- Makes code more in line with LLVM style (e.g. const char * -> StringRef)
- Do not use formatv since we can construct message just by `+`
- Replace some odd types such as `const StringRef` with more common type
- Do not use default arguments if they are not necessary
Reviewers: sbc100
Subscribers: jfb, aheejin, llvm-commits, sunfish
Differential Revision: https://reviews.llvm.org/D43403
llvm-svn: 325383
This bug effected undefined symbols that were resolved by
existing defined symbols. We were skipping the signature
check in this case.
Differential Revision: https://reviews.llvm.org/D43399
llvm-svn: 325376
This was causing GCC builds with fail with:
Symbols.h:240:3: error: static assertion failed: Symbol types must be
trivially destructible
static_assert(std::is_trivially_destructible<T>(
The reason this is a gcc-only failure is that OptionalStorage has
as specialization for POD types that isn't built under GCC.
Differential Revision: https://reviews.llvm.org/D43317
llvm-svn: 325185
This brings wasm into line with ELF and COFF in terms of
symbol types are represented.
Differential Revision: https://reviews.llvm.org/D43112
llvm-svn: 325150
SetVector guarantees ordering, so with that we can get a deterministic
output for error messages.
Differential Revision: https://reviews.llvm.org/D43254
llvm-svn: 325064
It seems redundant to store this information twice. None of the
locations where this bit is checked care about the distinction.
Differential Revision: https://reviews.llvm.org/D43250
llvm-svn: 325046
These were duplicating (incorrectly) some of the logic for
handling conflicts, but since they are only ever added right
at the start we can assume no existing symbols.
Also rename these methods for clarity.
Differential Revision: https://reviews.llvm.org/D43252
llvm-svn: 325045
This is similar to _end (See https://linux.die.net/man/3/edata for more)
but using our own unique name since our use cases will most likely be
different and we want to keep our options open WRT to memory layout.
This change will allow is to remove the DataSize from the linking
metadata section which is currently being used by emscripten to derive
the end of the data.
Differential Revision: https://reviews.llvm.org/D42867
llvm-svn: 324443
Group all synthetic symbols in the in single struct to match
the ELF linker.
This change is part of a larger change to add more linker
symbols such as `_end` and `_edata`.
Differential Revision: https://reviews.llvm.org/D42866
llvm-svn: 324157
Don't include type signatures that are not referenced by
some relocation.
We don't include this in the -gc-sections settings since
we are always building the type section from scratch,
just like we do the table elements.
In the future we might want to unify the relocation
processing which is currently done once for gc-sections
and then again for building the sympathetic type and
table sections.
Differential Revision: https://reviews.llvm.org/D42747
llvm-svn: 323931
In this initial version we only GC symbols with `hidden` visibility since
other symbols we export to the embedder.
We could potentially modify this the future and only use symbols
explicitly passed via `--export` as GC roots.
This version of the code only does GC of data and code. GC for the
types section is coming soon.
Differential Revision: https://reviews.llvm.org/D42511
llvm-svn: 323842
Summary:
Rather than explicit Function or InputSegment points store a
pointer to the base class (InputChunk) and use llvm dynamic
casts when we need a subtype.
This change is useful for add the upcoming gc-section support
wants to deal with all input chunks.
Subscribers: aheejin, llvm-commits
Differential Revision: https://reviews.llvm.org/D42625
llvm-svn: 323621
Previously, we were ensuring that the "output index" for
InputFunctions was unique across all symbols that referenced
a function body, but allowing the same function body to have
multiple table indexes.
Now, we use the same mechanism for table indexes as we already
do for output indexes, ensuring that each InputFunction is only
placed in the table once.
This makes the LLD output table denser and smaller, but should
not change the behaviour.
Note that we still need the `Symbol::TableIndex` member, to
store the table index for function Symbols that don't have an
InputFunction, i.e. for address-taken imports.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42476
llvm-svn: 323379
Previously llvm was using 0 as the first table index for wasm object
files but now that has switched to 1 we can have the output of lld
do the same and simplify the code.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42096
llvm-svn: 323378
TABLE relocations now store the function that is being refered
to indirectly.
See rL323165.
Also extend the call-indirect.ll a little.
Based on a patch by Nicholas Wilson!
llvm-svn: 323168
This was added to mimic ELF, but maintaining it has cost
and we currently don't have any use for it outside of the
test code.
Differential Revision: https://reviews.llvm.org/D42324
llvm-svn: 323154
Its much easier to export it via setHidden(false), now that
that is a thing.
As a side effect the start function is not longer always exports first
(becuase its being exported just like all the other function).
Differential Revision: https://reviews.llvm.org/D42321
llvm-svn: 323025
This code was needed back when we were not able to write
out the synthetic symbol for main.
Add tests to make sure we can handle this now.
Differential Revision: https://reviews.llvm.org/D42322
llvm-svn: 323020
We need these import since relocations are generated against them.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42305
llvm-svn: 322990
This solves the problem that --emit-relocs needs the stack-pointer
to be exported, in order to write out any relocations that reference
the __stack_pointer symbol by its symbol index.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42237
llvm-svn: 322911
When writing relocatable files we were exporting for all globals
(including file-local syms), but not for functions. Oops. To be
consistent with non-relocatable output, all symbols (file-local
and global) should be exported. Any symbol targetted by further
relocations needs to be exported. The lack of local function
exports was just an omission, I think.
Second bug: Local symbol names can collide, causing an illegal
Wasm file to be generated! Oops again. This only previously affected
producing relocatable output from two files, where each had a global
with the same name. We need to "budge" the symbol names for locals
that are exported on relocatable output.
Third bug: LLD's relocatable output wasn't writing out any symbol
flags! Thus the local globals weren't being marked as local, and
the hidden flag was also stripped...
Added tests to exercise colliding local names with/without
relocatable flag
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42105
llvm-svn: 322908
Simplify generation of "names" section by simply iterating
over the DefinedFunctions array.
This even fixes some bugs, judging by the test changes required.
Some tests are asserting that functions are named multiple times,
other tests are asserting that the "names" section contains the
function's alias rather than its original name
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42076
llvm-svn: 322751
This is an immutable exported global representing
the start of the heap area. It is a page aligned.
Differential Revision: https://reviews.llvm.org/D42030
llvm-svn: 322609
This is used by __cxa_ataxit to determine the currently
executing DLL. Once we fully support DLLs this will need
to be set to some address within the DLL.
The ELF linker added support for this symbol here:
https://reviews.llvm.org/D33856
Differential Revision: https://reviews.llvm.org/D42024
llvm-svn: 322606
See https://bugs.llvm.org/show_bug.cgi?id=35533, and D40844
Things covered:
* Removing duplicate data segments (as determined by COMDATs emitted
by the frontend)
* Removing duplicate globals and functions in COMDATs
* Checking that each time a COMDAT is seen it has the same symbols
as at other times (ie it's a stronger check than simply giving all
the symbols in the COMDAT weak linkage)
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D40845
llvm-svn: 322415
This is useful for emscripten or other tools that want to
selectively exports symbols without necessarily changing the
source code.
Differential Revision: https://reviews.llvm.org/D42003
llvm-svn: 322408
Even though a function can have multiple names in the
linking standards (i.e. due to aliases), there can only
be one name for a given function in the NAME section.
Differential Revision: https://reviews.llvm.org/D41975
llvm-svn: 322383
This allows libraries to supply a list of symbols which are
allowed to be undefined at link time (i.e. result in imports).
This method replaces the existing mechanism (-allow-undefined-file)
used by the clang driver to allow undefined symbols in libc.
For more on motivation for this see:
https://github.com/WebAssembly/tool-conventions/issues/35
In the long run we hope to remove this features and instead
include this information in the object format itself.
Differential Revision: https://reviews.llvm.org/D41922
llvm-svn: 322320
This WasmSymbol types comes directly from the input objects
but we want to be able to represent synthetic symbols too.
This is more in line with how the ELF linker represents symbols.
This change also removes the logic from Symbol::getVirtualAddress
for finding the global address and instead has the InputFile
provide this.
Differential Revision: https://reviews.llvm.org/D41426
llvm-svn: 322145
The code section is now written out one function
at a time rather than all the functions in a given
objects being serialized at once.
This change lays the groundwork for supporting
--gc-sections.
Differential Revision: https://reviews.llvm.org/D41315
llvm-svn: 322138
The addresses of undefined symbols that make it into the final
executable (i.e. weak references to non-existent symbols) should
resolve to zero.
Also, make sure to not include function in the indirect function
table if they are not included in the output.
Differential Revision: https://reviews.llvm.org/D41839
llvm-svn: 322045
Store data relocations with their respective segment.
This allows relocations to be applied as each segment
is written (and therefore in parallel).
Differential Revision: https://reviews.llvm.org/D41410
llvm-svn: 321105