When a symbol is GC'd it can still be references by relocations
in the debug sections, but such symbols are not assigned virtual
addresses.
This change adds a new global data symbol which gets GC'd but
should still appears in the output debug info, albeit with a 0
address.
Fixes 37555
Differential Revision: https://reviews.llvm.org/D47238
llvm-svn: 333047
This change adds the ability for lld to remove LEB padding from
code section. This effectively shrinks the size of the resulting
binary in proportion to the number of code relocations.
Since there will be a performance cost this is currently only active for
-O1 and above. Some toolchains may instead want to perform this
compression as a post linker step (for example running a binary through
binaryen will automatically compress these values).
I imagine we might want to make this the default in the future.
Differential Revision: https://reviews.llvm.org/D46416
llvm-svn: 332783
Since we a no longer using this function for the wasm start
section we don't actually care what its signature is.
Differential Revision: https://reviews.llvm.org/D46594
llvm-svn: 332308
Merging data segments produces smaller code sizes because each segment
has some boilerplate. Therefore, merging data segments is generally the
right approach, especially with wasm where binaries are typically
delivered over the network.
However, when analyzing wasm binaries, it can be helpful to get a
conservative picture of which functions are using which data
segments[0]. Perhaps there is a large data segment that you didn't
expect to be included in the wasm, introduced by some library you're
using, and you'd like to know which library it was. In this scenario,
merging data segments only makes the analysis worse.
Alternatively, perhaps you will remove some dead functions by-hand[1]
that can't be statically proven dead by the compiler or lld, and
removing these functions might make some data garbage collect-able, and
you'd like to run `--gc-sections` again so that this now-unused data can
be collected. If the segments were originally merged, then a single use
of the merged data segment will entrench all of the data.
[0] https://github.com/rustwasm/twiggy
[1] https://github.com/fitzgen/wasm-snip
Patch by Nick Fitzgerald!
Differential Revision: https://reviews.llvm.org/D46417
llvm-svn: 332013
Specifically add support for custom sections that contain
relocations, and for the two new relocation types needed
by DWARF sections.
See: https://reviews.llvm.org/D44184
Patch by Yury Delendik!
Differential Revision: https://reviews.llvm.org/D44184
llvm-svn: 331566
Enables cleaning up confusion between which name variables are mangled
and which are unmangled, and --print-gc-sections then excersises and
tests that.
Differential Revision: https://reviews.llvm.org/D44440
llvm-svn: 330449
Relocation addends can be negative so should be written as
signed LEBs. This bug meant that writing value between 64
and 128 would be incorrectly interpreted as negative by the
object file readers.
Differential Revision: https://reviews.llvm.org/D45825
llvm-svn: 330374
Summary:
The content of custome sections no longer includes the
name itself.
See: https://reviews.llvm.org/D45579
Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D45580
llvm-svn: 329948
Copy user-defined custom sections into the output, concatenating
sections with the same name.
Differential Revision: https://reviews.llvm.org/D45340
llvm-svn: 329717
This enables callback-style programming where the JavaScript environment
can call back into the Wasm environment using a function pointer
received from the module.
Differential Revision: https://reviews.llvm.org/D44427
llvm-svn: 328643
Previously, Config->InitialMemory/MaxMemory were hooked up to some
commandline args but had no effect at all.
Differential Revision: https://reviews.llvm.org/D44393
llvm-svn: 327508
This fixes issues found on the wasm waterfall related to relocations
with addends. Undefined symbols, even those with addends should
always have a provisional value of zero. At least this is what llvm
emits (and I believe this is true for ELF too).
Differential Revision: https://reviews.llvm.org/D44451
llvm-svn: 327468
Previously, Config->Demangle was uninitialised (not hooked up to
commandline handling)
Differential Revision: https://reviews.llvm.org/D44301
llvm-svn: 327390
This bug was found by accident while trying to expand out testcases
for imported symbols, and is covered by the additional test case.
Differential Revision: https://reviews.llvm.org/D44331
llvm-svn: 327290
This matches the existing ordering that's been there for globals
for a while (__stack_pointer coming first).
Differential Revision: https://reviews.llvm.org/D44333
llvm-svn: 327286
This error case is described in Linking.md. The operand for call requires
generation of a synthetic stub.
Differential Revision: https://reviews.llvm.org/D44028
llvm-svn: 327151
When a symbol is exported via --export=foo but --allow-undefined
is also specified, the symbol is now allowed to be undefined.
Previously we were special casing such symbols.
This combinations of behavior is exactly what emescripten
requires. Although we are trying hard not to allow emscripten
specific features in lld, this one makes sense.
Enforce this behavior by added this case to test/wasm/undefined.ll.
Differential Revision: https://reviews.llvm.org/D44237
llvm-svn: 326976
Update LLD test expectations for new symbol ordering introduced by
Differential D43685.
Differential Revision: https://reviews.llvm.org/D43875
llvm-svn: 326335
This means we don't need to write the linking metadata section
at all for executable (non-relocatable) output.
Differential Revision: https://reviews.llvm.org/D42869
llvm-svn: 326268
Purely a rename in preparation for adding new global symbol type.
We want to use GlobalSymbol to represent real wasm globals and
DataSymbol for pointers to things in linear memory (what ELF would
call STT_OBJECT).
This reduces the size the patch to add the explicit symbol table
which is coming soon!
Differential Revision: https://reviews.llvm.org/D43476
llvm-svn: 325645
Invoking lld as ld.lld, ld.ld64, lld-link or wasm-ld is preferred
than invoking lld as lld and pass an -flavor option. We have "lld"
file mostly for historical reasons.
Differential Revision: https://reviews.llvm.org/D43407
llvm-svn: 325405
This bug effected undefined symbols that were resolved by
existing defined symbols. We were skipping the signature
check in this case.
Differential Revision: https://reviews.llvm.org/D43399
llvm-svn: 325376
This is similar to _end (See https://linux.die.net/man/3/edata for more)
but using our own unique name since our use cases will most likely be
different and we want to keep our options open WRT to memory layout.
This change will allow is to remove the DataSize from the linking
metadata section which is currently being used by emscripten to derive
the end of the data.
Differential Revision: https://reviews.llvm.org/D42867
llvm-svn: 324443
Don't include type signatures that are not referenced by
some relocation.
We don't include this in the -gc-sections settings since
we are always building the type section from scratch,
just like we do the table elements.
In the future we might want to unify the relocation
processing which is currently done once for gc-sections
and then again for building the sympathetic type and
table sections.
Differential Revision: https://reviews.llvm.org/D42747
llvm-svn: 323931
In this initial version we only GC symbols with `hidden` visibility since
other symbols we export to the embedder.
We could potentially modify this the future and only use symbols
explicitly passed via `--export` as GC roots.
This version of the code only does GC of data and code. GC for the
types section is coming soon.
Differential Revision: https://reviews.llvm.org/D42511
llvm-svn: 323842
Add a simple start entry point input file and have the tests
reference that rather than duplicating these.
This allows more tests to be pure `.test` files rather than
`.ll`.
Differential Revision: https://reviews.llvm.org/D42662
llvm-svn: 323838
Previously, we were ensuring that the "output index" for
InputFunctions was unique across all symbols that referenced
a function body, but allowing the same function body to have
multiple table indexes.
Now, we use the same mechanism for table indexes as we already
do for output indexes, ensuring that each InputFunction is only
placed in the table once.
This makes the LLD output table denser and smaller, but should
not change the behaviour.
Note that we still need the `Symbol::TableIndex` member, to
store the table index for function Symbols that don't have an
InputFunction, i.e. for address-taken imports.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42476
llvm-svn: 323379
Previously llvm was using 0 as the first table index for wasm object
files but now that has switched to 1 we can have the output of lld
do the same and simplify the code.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42096
llvm-svn: 323378
This is somewhat preferable since (in many cases) it allows llc
to be run directly on the .ll files without having to pass the
`-mtriple` argument.
Differential Revision: https://reviews.llvm.org/D42438
llvm-svn: 323299
There seems to be an bug related to table relocations not being
written correctly in this case. This change is intended simply
to increase the coverage, not fix the issue.
llvm-svn: 323282
TABLE relocations now store the function that is being refered
to indirectly.
See rL323165.
Also extend the call-indirect.ll a little.
Based on a patch by Nicholas Wilson!
llvm-svn: 323168
This was added to mimic ELF, but maintaining it has cost
and we currently don't have any use for it outside of the
test code.
Differential Revision: https://reviews.llvm.org/D42324
llvm-svn: 323154
Its much easier to export it via setHidden(false), now that
that is a thing.
As a side effect the start function is not longer always exports first
(becuase its being exported just like all the other function).
Differential Revision: https://reviews.llvm.org/D42321
llvm-svn: 323025
This code was needed back when we were not able to write
out the synthetic symbol for main.
Add tests to make sure we can handle this now.
Differential Revision: https://reviews.llvm.org/D42322
llvm-svn: 323020
We need these import since relocations are generated against them.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42305
llvm-svn: 322990
This solves the problem that --emit-relocs needs the stack-pointer
to be exported, in order to write out any relocations that reference
the __stack_pointer symbol by its symbol index.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42237
llvm-svn: 322911
When writing relocatable files we were exporting for all globals
(including file-local syms), but not for functions. Oops. To be
consistent with non-relocatable output, all symbols (file-local
and global) should be exported. Any symbol targetted by further
relocations needs to be exported. The lack of local function
exports was just an omission, I think.
Second bug: Local symbol names can collide, causing an illegal
Wasm file to be generated! Oops again. This only previously affected
producing relocatable output from two files, where each had a global
with the same name. We need to "budge" the symbol names for locals
that are exported on relocatable output.
Third bug: LLD's relocatable output wasn't writing out any symbol
flags! Thus the local globals weren't being marked as local, and
the hidden flag was also stripped...
Added tests to exercise colliding local names with/without
relocatable flag
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42105
llvm-svn: 322908
Simplify generation of "names" section by simply iterating
over the DefinedFunctions array.
This even fixes some bugs, judging by the test changes required.
Some tests are asserting that functions are named multiple times,
other tests are asserting that the "names" section contains the
function's alias rather than its original name
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42076
llvm-svn: 322751
This is an immutable exported global representing
the start of the heap area. It is a page aligned.
Differential Revision: https://reviews.llvm.org/D42030
llvm-svn: 322609
See https://bugs.llvm.org/show_bug.cgi?id=35533, and D40844
Things covered:
* Removing duplicate data segments (as determined by COMDATs emitted
by the frontend)
* Removing duplicate globals and functions in COMDATs
* Checking that each time a COMDAT is seen it has the same symbols
as at other times (ie it's a stronger check than simply giving all
the symbols in the COMDAT weak linkage)
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D40845
llvm-svn: 322415
This is useful for emscripten or other tools that want to
selectively exports symbols without necessarily changing the
source code.
Differential Revision: https://reviews.llvm.org/D42003
llvm-svn: 322408
Even though a function can have multiple names in the
linking standards (i.e. due to aliases), there can only
be one name for a given function in the NAME section.
Differential Revision: https://reviews.llvm.org/D41975
llvm-svn: 322383
This allows libraries to supply a list of symbols which are
allowed to be undefined at link time (i.e. result in imports).
This method replaces the existing mechanism (-allow-undefined-file)
used by the clang driver to allow undefined symbols in libc.
For more on motivation for this see:
https://github.com/WebAssembly/tool-conventions/issues/35
In the long run we hope to remove this features and instead
include this information in the object format itself.
Differential Revision: https://reviews.llvm.org/D41922
llvm-svn: 322320
The addresses of undefined symbols that make it into the final
executable (i.e. weak references to non-existent symbols) should
resolve to zero.
Also, make sure to not include function in the indirect function
table if they are not included in the output.
Differential Revision: https://reviews.llvm.org/D41839
llvm-svn: 322045
Summary:
Currently the test only checks behaviour for weak function symbols.
Should be good to merge straight away?
Reviewers: sbc100
Reviewed By: sbc100
Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D41449
llvm-svn: 321308
In this case we are calling a function pointer which
a type that doesn't otherwise exist in the code.
Clearly this code can't would trap if it was ever
called (because there is not such function that
the pointer can resolve to).
But it should valid and compile and link and validation
time.
llvm-svn: 321134
This change add support for init functions in the linking
section, but only in -r/--relocatable mode.
Differential Revision: https://reviews.llvm.org/D41375
llvm-svn: 321088
Also make function bodies unique so they can be distinguished
in the output. This is helpful for adding support for --gc-sections.
Differential Revision: https://reviews.llvm.org/D41093
llvm-svn: 320441
Create the indirect function table based on symbols rather
than just duplicating the input entries. This has the
effect of de-duplicating the table.
This is a followup to the equivalent change made for globals:
https://reviews.llvm.org/D40859
Partially based on a patch by Nicholas Wilson:
https://reviews.llvm.org/D40845
Differential Revision: https://reviews.llvm.org/D40989
llvm-svn: 320428
Add test for weakly defined symbols with the same name
Improve test for call-indirect to include the same call in two
different objects. This lays the ground work to improve the
output via de-duplicating the indirect call table:
https://reviews.llvm.org/D40989
Also make all tests consistently pass -mtriple rather than
declaring in the sources.
Differential Revision: https://reviews.llvm.org/D41024
llvm-svn: 320172
This adds a `--no-entry` argument to wasm LLD used to
suppress the default `_start` entry point.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D40725
llvm-svn: 320167
Adds a new argument to wasm-lld, `--undefined`, with
similar semantics to the ELF linker. It pulls in symbols
from files contained within a `.a` archive, forcing them
to be included even if the translation unit would not
otherwise be pulled in.
Patch by Nicholas Wilson
Differential Revision: https://reviews.llvm.org/D40724
llvm-svn: 320004
This change cleans up the way wasm exports and globals
are generated, particualrly for -r/--relocatable where
globals need to be created and exported in order for
output relocations which reference them.
Remove the need for a per file GlobalIndexOffset and
instead set the output index for each symbol directly.
This simplifies the code in several places.
Differential Revision: https://reviews.llvm.org/D40859
llvm-svn: 320001
This line was mistakenly deleted in rL319813
Add a test for stackpointer relocations.
Differential Revision: https://reviews.llvm.org/D40847
llvm-svn: 319828
This change allows checking of function signatures but
does not yes enable it by default. In this mode, linking
two objects that were compiled with a different signatures
for the same function will produce a link error.
New options for enabling and disabling this feature have been
added: (--check-signatures/--no-check-signatures).
Differential Revision: https://reviews.llvm.org/D40371
llvm-svn: 319396
This linker backend is still a work in progress but is
enough to link simple programs including linking against
library archives.
Differential Revision: https://reviews.llvm.org/D34851
llvm-svn: 318539