Extend the range of calls beyond an architecture's limited branch range by first calling a thunk, which loads the far address into a scratch register (x16 on ARM64) and branches through it.
Other ports (COFF, ELF) use multiple passes with successively-refined guesses regarding the expansion of text-space imposed by thunk-space overhead. This MachO algorithm places thunks during MergedOutputSection::finalize() in a single pass using exact thunk-space overheads. Thunks are kept in a separate vector to avoid the overhead of inserting into the `inputs` vector of `MergedOutputSection`.
FIXME:
* arm64-stubs.s test is broken
* add thunk tests
* Handle thunks to DylibSymbol in MergedOutputSection::finalize()
Differential Revision: https://reviews.llvm.org/D100818
Before this, if an inline function was defined in several input files,
lld would write each copy of the inline function the output. With this
patch, it only writes one copy.
Reduces the size of Chromium Framework from 378MB to 345MB (compared
to 290MB linked with ld64, which also does dead-stripping, which we
don't do yet), and makes linking it faster:
N Min Max Median Avg Stddev
x 10 3.9957051 4.3496981 4.1411121 4.156837 0.10092097
+ 10 3.908154 4.169318 3.9712729 3.9846753 0.075773012
Difference at 95.0% confidence
-0.172162 +/- 0.083847
-4.14165% +/- 2.01709%
(Student's t, pooled s = 0.0892373)
Implementation-wise, when merging two weak symbols, this sets a
"canOmitFromOutput" on the InputSection belonging to the weak symbol not put in
the symbol table. We then don't write InputSections that have this set, as long
as they are not referenced from other symbols. (This happens e.g. for object
files that don't set .subsections_via_symbols or that use .alt_entry.)
Some restrictions:
- not yet done for bitcode inputs
- no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) --
Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs)
(that is, catch block unwind information) and Personality Routines
associated with weak functions still not stripped. This is wasteful,
but harmless.
- However, this does strip weaks from __unwind_info (which is needed for
correctness and not just for size)
- This nopes out on InputSections that are referenced form more than
one symbol (eg from .alt_entry) for now
Things that work based on symbols Just Work:
- map files (change in MapFile.cpp is no-op and not needed; I just
found it a bit more explicit)
- exports
Things that work with inputSections need to explicitly check if
an inputSection is written (e.g. unwind info).
This patch is useful in itself, but it's also likely also a useful foundation
for dead_strip.
I used to have a "canoncialRepresentative" pointer on InputSection instead of
just the bool, which would be handy for ICF too. But I ended up not needing it
for this patch, so I removed that again for now.
Differential Revision: https://reviews.llvm.org/D102076
This allows LLVM's LTO to internalize symbols that are not referenced
directly by regular objects. Naturally, this means we need to track
which symbols are referenced by regular objects. The approach taken here
is similar to LLD-COFF's: like the COFF port, we extend
`SymbolTable::insert()` to set the isVisibleToRegularObj bit. (LLD-ELF
relies on the Symbol constructor and `Symbol::mergeProperties()`, but
the Mach-O port does not have a `mergeProperties()` equivalent.)
From what I can tell, ld64 (which uses libLTO) doesn't do this
optimization at all. I'm not even sure libLTO provides a way to do this.
Not having ld64's behavior as a reference implementation is unfortunate;
instead, I am relying on LLD-ELF/COFF's behavior as references while
erring on the conservative side. In particular, LLD-MachO will only do
this optimization for executables right now.
We also don't attempt it when `-flat_namespace` is used -- otherwise
we'd need scan the symbol table to find matches for every un-namespaced
symbol reference, which is expensive.
internalize.ll is based off the LLD-ELF tests `internalize-basic.ll` and
`internalize-undef.ll`. Looks like @davide added some of LLD-ELF's internalize
tests, so adding him as a reviewer...
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D99105
I noticed two problems with the previous implementation:
* N_ALT_ENTRY symbols weren't being handled correctly -- they should
determine the size of the previous symbol, even though they don't
cause a new section to be created
* The last symbol in a section had its size calculated wrongly;
the first subsection's size was used instead of the last one
I decided to take the opportunity to refactor things as well, mainly to
realize my observation
[here](https://reviews.llvm.org/D98837#inline-931511) that we could
avoid doing a binary search to match symbols with subsections. I think
the resulting code is a bit simpler too.
N Min Max Median Avg Stddev
x 20 4.31 4.43 4.37 4.3775 0.034162922
+ 20 4.32 4.43 4.38 4.3755 0.02799906
No difference proven at 95.0% confidence
Reviewed By: #lld-macho, alexshap
Differential Revision: https://reviews.llvm.org/D99972
`class Symbol` defines a data member `InputFile *file;`
`class Defined` inherits from `Symbol` and also defines a data member `InputFile *file;` for no apparent purpose.
Differential Revision: https://reviews.llvm.org/D99783
This diff addresses FIXME in SyntheticSections.cpp and removes
the dependency of emitEndFunStab on .subsections_via_symbols.
Test plan: make check-lld-macho
Differential revision: https://reviews.llvm.org/D99054
Previously, it was difficult to write code that handled both synthetic
and regular sections generically. We solve this problem by creating a
fake InputSection at the start of every SyntheticSection.
This refactor allows us to handle DSOHandle like a regular Defined
symbol (since Defined symbols must be attached to an InputSection), and
paves the way for supporting `__mh_*header` symbols. Additionally, it
simplifies our binding/rebase code.
I did have to extend Defined a little -- it now has a `linkerInternal`
flag, to indicate that `___dso_handle` should not be in the final symbol
table.
I've also added some additional testing for `___dso_handle`.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D98545
Dynamic lookup symbols are symbols that work like dynamic symbols
in ELF: They're not bound to a dylib like normal Mach-O twolevel lookup
symbols, but they live in a global pool and dyld resolves them against
exported symbols from all loaded dylibs.
This adds support for dynamical lookup symbols to lld/mac. They are
represented as DylibSymbols with file set to nullptr.
This also uses this support to implement the -U flag, which makes
a specific symbol that's undefined at the end of the link a
dynamic lookup symbol.
For -U, it'd be sufficient to just to a pass over remaining undefined symbols
at the end of the link and to replace them with dynamic lookup symbols then.
But I'd like to use this code to implement flat_namespace too, and that will
require real support for resolving dynamic lookup symbols in SymbolTable. So
this patch adds this now already.
While writing tests for this, I noticed that we didn't set N_WEAK_DEF in the
symbol table for DylibSymbols, so this fixes that too.
Differential Revision: https://reviews.llvm.org/D97521
This makes our error messages more informative. But the bigger motivation is for
LTO symbol resolution, which will be in an upcoming diff. The changes in this
one are largely mechanical.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D94316
Private extern symbols are used for things scoped to the linkage unit.
They cause duplicate symbol errors (so they're in the symbol table,
unlike TU-scoped truly local symbols), but they don't make it into the
export trie. They are created e.g. by compiling with
-fvisibility=hidden.
If two weak symbols have differing privateness, the combined symbol is
non-private external. (Example: inline functions and some TUs that
include the header defining it were built with
-fvisibility-inlines-hidden and some weren't).
A weak private external symbol implicitly has its "weak" dropped and
behaves like a regular strong private external symbol: Weak is an export
trie concept, and private symbols are not in the export trie.
If a weak and a strong symbol have different privateness, the strong
symbol wins.
If two common symbols have differing privateness, the larger symbol
wins. If they have the same size, the privateness of the symbol seen
later during the link wins (!) -- this is a bit lame, but it matches
ld64 and this behavior takes 2 lines less to implement than the less
surprising "result is non-private external), so match ld64.
(Example: `int a` in two .c files, both built with -fcommon,
one built with -fvisibility=hidden and one without.)
This also makes `__dyld_private` a true TU-local symbol, matching ld64.
To make this work, make the `const char*` StringRefZ ctor to correctly
set `size` (without this, writing the string table crashed when calling
getName() on the __dyld_private symbol).
Mention in CommonSymbol's comment that common symbols are now disabled
by default in clang.
Mention in -keep_private_externs's HelpText that the flag only has an
effect with `-r` (which we don't implement yet -- so this patch here
doesn't regress any behavior around -r + -keep_private_externs)). ld64
doesn't explicitly document it, but the commit text of
http://reviews.llvm.org/rL216146 does, and ld64's
OutputFile::buildSymbolTable() checks `_options.outputKind() ==
Options::kObjectFile` before calling `_options.keepPrivateExterns()`
(the only reference to that function).
Fixes PR48536.
Differential Revision: https://reviews.llvm.org/D93609
Note that dylibs without *any* refs will still be loaded in the usual
(strong) fashion.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D93435
Weak references need not necessarily be satisfied at runtime (but they must
still be satisfied at link time). So symbol resolution still works as per usual,
but we now pass around a flag -- ultimately emitting it in the bind table -- to
indicate if a given dylib symbol is a weak reference.
ld64's behavior for symbols that have both weak and strong references is
a bit bizarre. For non-function symbols, it will emit a weak import. For
function symbols (those referenced by BRANCH relocs), it will emit a
regular import. I'm not sure what value there is in that behavior, and
since emulating it will make our implementation more complex, I've
decided to treat regular weakrefs like function symbol ones for now.
Fixes PR48511.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D93369
- most importantly, fix a use-after-free when using thin archives,
by putting the archive unique_ptr to the arena allocator. This
ports D65565 to MachO
- correctly demangle symbol namess from archives in diagnostics
- add a test for thin archives -- it finds this UaF, but only when
running it under asan (it also finds the demangling fix)
- make forceLoadArchive() use addFile() with a bool to have the archive
loading code in fewer places. no behavior change; matches COFF port a
bit better
Differential Revision: https://reviews.llvm.org/D92360
They operate like Defined symbols but with no associated InputSection.
Note that `ld64` seems to treat the weak definition flag like a no-op for
absolute symbols, so I have replicated that behavior.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D87909
On Unix, it is traditionally allowed to write variable definitions without
initialization expressions (such as "int foo;") to header files. These are
called tentative definitions.
The compiler creates common symbols when it sees tentative definitions. When
linking the final binary, if there are remaining common symbols after name
resolution is complete, the linker converts them to regular defined symbols in
a `__common` section.
This diff implements most of that functionality, though we do not yet handle
the case where there are both common and non-common definitions of the same
symbol.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D86909
Found such a relocation while testing some real world programs.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D86642
These opcodes tell dyld to coalesce the overridden weak dysyms to this
particular symbol definition.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D86575
Since there is no "weak lazy" lookup, function calls to weak symbols are
always non-lazily bound. We emit both regular non-lazy bindings as well
as weak bindings, in order that the weak bindings may overwrite the
non-lazy bindings if an appropriate symbol is found at runtime. However,
the bound addresses will still be written (non-lazily) into the
LazyPointerSection.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D86573
Previously, we were only emitting regular bindings to weak
dynamic symbols; this diff adds support for the weak bindings too, which
can overwrite the regular bindings at runtime. We also treat weak
defined global symbols similarly -- since they can also be interposed at
runtime, they need to be treated as potentially dynamic symbols.
Note that weak bindings differ from regular bindings in that they do not
specify the dylib to do the lookup in (i.e. weak symbol lookup happens
in a flat namespace.)
Differential Revision: https://reviews.llvm.org/D86572
References to symbols in dylibs work very similarly regardless of
whether the symbol is a TLV. The main difference is that we have a
separate `__thread_ptrs` section that acts as the GOT for these
thread-locals.
We can identify thread-locals in dylibs by a flag in their export trie
entries, and we cross-check it with the relocations that refer to them
to ensure that we are not using a GOT relocation to reference a
thread-local (or vice versa).
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D85081
The C++ ABI requires dylibs to pass a pointer to __cxa_atexit which does
e.g. cleanup of static global variables. The C++ spec says that the pointer
can point to any address in one of the dylib's segments, but in practice
ld64 seems to set it to point to the header, so that's what's implemented
here.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D83603
This diff adds support for weak definitions, though it doesn't handle weak
symbols in dylibs quite correctly -- we need to emit binding opcodes for them
in the weak binding section rather than the lazy binding section.
What *is* covered in this diff:
1. Reading the weak flag from symbol table / export trie, and writing it to the
export trie
2. Refining the symbol table's rules for choosing one symbol definition over
another. Wrote a few dozen test cases to make sure we were matching ld64's
behavior.
We can now link basic C++ programs.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D83532
Summary:
Turns out this case is actually really common -- it happens whenever there's
a reference to an `extern` variable that ends up statically linked.
Depends on D80856.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80857
That's what ld64 uses for 64-bit targets. I figured it's best to make
this change sooner rather than later since a bunch of our tests are
relying on hardcoded addresses that depend on this value.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80177
With this change, basic archive files can be linked together. Input
section discovery has been refactored into a function since archive
files lazily resolve their symbols / the object files containing those
symbols.
Reviewed By: int3, smeenai
Differential Revision: https://reviews.llvm.org/D78342
Summary:
This diff implements lazy symbol binding -- very similar to the PLT
mechanism in ELF.
ELF's .plt section is broken up into two sections in Mach-O:
StubsSection and StubHelperSection. Calls to functions in dylibs will
end up calling into StubsSection, which contains indirect jumps to
addresses stored in the LazyPointerSection (the counterpart to ELF's
.plt.got).
Initially, the LazyPointerSection contains addresses that point into one
of the entry points in the middle of the StubHelperSection. The code in
StubHelperSection will push on the stack an offset into the
LazyBindingSection. The push is followed by a jump to the beginning of
the StubHelperSection (similar to PLT0), which then calls into
dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this
call looks it up in the GOT.
The stub binder will look up the bind opcodes in the LazyBindingSection
at the given offset. The bind opcodes will tell the binder to update the
address in the LazyPointerSection to point to the symbol, so that
subsequent calls don't have to redo the symbol resolution. The binder
will then jump to the resolved symbol.
Depends on D78269.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78270
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).
Differential Revision: https://reviews.llvm.org/D77893
Currently, getVA() returns a virtual address with the assumption that
the ImageBase is zero. As I understand, this is what lld-ELF is doing.
However, under our current design, it seems like an awkward setup --
I'm finding that I have to add and subtract ImageBase in several places
to make things work out.
As such, I think it's simpler to have getVA() return a non-relative VA,
but I'm not sure if I'm missing something. Would love to hear more from
folks familiar with lld-ELF.
Differential Revision: https://reviews.llvm.org/D78168
This diff implements:
* dylib loading (much of which is being restored from @pcc and @ruiu's
original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT
Differential Revision: https://reviews.llvm.org/D76252
Summary:
This is the first commit for the new Mach-O backend, designed to roughly
follow the architecture of the existing ELF and COFF backends, and
building off work that @ruiu and @pcc did in a branch a while back. Note
that this is a very stripped-down commit with the bare minimum of
functionality for ease of review. We'll be following up with more diffs
soon.
Currently, we're able to generate a simple "Hello World!" executable
that runs on OS X Catalina (and possibly on earlier OS X versions; I
haven't tested them). (This executable can be obtained by compiling
`test/MachO/relocations.s`.) We're mocking out a few load commands to
achieve this -- for example, we can't load dynamic libraries, but
Catalina requires binaries to be linked against `dyld`, so we hardcode
the emission of a `LC_LOAD_DYLIB` command. Other mocked out load
commands include LC_SYMTAB and LC_DYSYMTAB.
Differential Revision: https://reviews.llvm.org/D75382