Found such a relocation while testing some real world programs.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D86642
We can have GOT_LOAD relocations that reference `__dso_handle`.
However, our binding opcode encoder doesn't support binding to the DSOHandle
symbol. Instead of adding support for that, I decided it would be cleaner to
implement GOT_LOAD relaxation since `__dso_handle`'s location is always
statically known.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D86641
These opcodes tell dyld to coalesce the overridden weak dysyms to this
particular symbol definition.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D86575
Since there is no "weak lazy" lookup, function calls to weak symbols are
always non-lazily bound. We emit both regular non-lazy bindings as well
as weak bindings, in order that the weak bindings may overwrite the
non-lazy bindings if an appropriate symbol is found at runtime. However,
the bound addresses will still be written (non-lazily) into the
LazyPointerSection.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D86573
Previously, we were only emitting regular bindings to weak
dynamic symbols; this diff adds support for the weak bindings too, which
can overwrite the regular bindings at runtime. We also treat weak
defined global symbols similarly -- since they can also be interposed at
runtime, they need to be treated as potentially dynamic symbols.
Note that weak bindings differ from regular bindings in that they do not
specify the dylib to do the lookup in (i.e. weak symbol lookup happens
in a flat namespace.)
Differential Revision: https://reviews.llvm.org/D86572
Previously, the BindingEntry struct could only store bindings to offsets
within InputSections. Since the GOTSection and TLVPointerSections are
OutputSections, I handled those in a separate code path. However, this
makes it awkward to support weak bindings properly without code
duplication. This diff allows BindingEntries to point directly to
OutputSections, simplifying the upcoming weak binding implementation.
Along the way, I also converted a bunch of functions taking references
to symbols to take pointers instead. Given how much casting we do for
Symbol (especially in the upcoming weak binding diffs), it's cleaner
this way.
Differential Revision: https://reviews.llvm.org/D86571
The re-exports list in a TAPI document can either refer to other inlined
TAPI documents, or to on-disk files (which may themselves be TBD or
regular files.) Similarly, the re-exports of a regular dylib can refer
to a TBD file.
Differential Revision: https://reviews.llvm.org/D85404
Two things needed fixing for that to work:
1. getName() no longer returns null for DylibFiles constructed from TAPIs
2. markSubLibrary() now accepts .tbd as a possible extension
Differential Revision: https://reviews.llvm.org/D86180
Pretty straightforward; just emits LC_RPATH for dyld to consume.
Note that lld itself does not yet support dylib lookup via @rpath.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D85701
It's similar to lld-ELF's `-whole-archive`, but applied to individual
archives instead of to a series of them.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D85550
Do folks care if we don't have a test for this? Creating 16
dylibs to trigger this straightforward code path seems a little tedious
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D85467
Previously, lld would crash while complaining that `Expected<T>
must be checked before access or destruction`.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D85403
DylibFile doesn't store a pointer to its InterfaceFile
parameter, so there's no need to use a shared_ptr.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D85402
References to symbols in dylibs work very similarly regardless of
whether the symbol is a TLV. The main difference is that we have a
separate `__thread_ptrs` section that acts as the GOT for these
thread-locals.
We can identify thread-locals in dylibs by a flag in their export trie
entries, and we cross-check it with the relocations that refer to them
to ensure that we are not using a GOT relocation to reference a
thread-local (or vice versa).
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D85081
This improves the handling of `-platform_version` by addressing the FIXME in the code to process the arguments.
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D81413
Handle command-line option `-sectcreate SEG SECT FILE`, which inputs a binary blob from `FILE` into `SEG,SECT`
Reviewed By: int3
Differential Revision: https://reviews.llvm.org/D85501
Required for e.g. linking iOS apps since they don't have a platform-native
SDK
Reviewed By: #lld-macho, compnerd, smeenai
Differential Revision: https://reviews.llvm.org/D85153
Note: What ELF refers to as "TLS", Mach-O seems to refer to as "TLV", i.e.
thread-local variables.
This diff implements support for TLV relocations that reference defined
symbols. On x86_64, TLV relocations are always used with movq opcodes, so for
defined TLVs, we don't need to create a synthetic section to store the
addresses of the symbols -- we can just convert the `movq` to a `leaq`.
One notable quirk of Mach-O's TLVs is that absolute-address relocations
inside TLV-defining sections behave differently -- their addresses are
no longer absolute, but relative to the start of the target section.
(AFAICT, RIP-relative relocations are not allowed in these sections.)
Reviewed By: #lld-macho, compnerd, smeenai
Differential Revision: https://reviews.llvm.org/D85080
This diff makes the behavior in {D80859} and {D81888} apply to
thread-local ZeroFill sections too. I realized this was necessary whie
trying to implement thread-local variables.
Reviewed By: #lld-macho, compnerd, MaskRay
Differential Revision: https://reviews.llvm.org/D85079
This adds support for the `-syslibroot` option. This is required to
make the library search order actually function. With this, it is now
possible to link a test Darwin x86_64 program with lld on Darwin.
Differential Revision: https://reviews.llvm.org/D82252
Reviewed By: Jez Ng
codesign (or more specifically libstuff) checks that each section in
__LINKEDIT ends where the next one starts -- no gaps are permitted. This
diff achieves it by aligning every section's start and end points to
WordSize.
Remarks: ld64 appears to satisfy the constraint by adding padding bytes
when generating the __LINKEDIT data, e.g. by emitting BIND_OPCODE_DONE
(which is a 0x0 byte) repeatedly. I think the approach this diff takes
is a bit more elegant, but I'm not sure if it's too restrictive. In
particular, it assumes padding always uses the zero byte. But we can
revisit this later.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D84718
Tools like `install_name_tool` and `codesign` may modify the Mach-O
header and increase its size. The linker has to provide padding to make this
possible. This diff does that, plus sets its default value to 32 bytes (which
is what ld64 does).
Unlike ld64, however, we lay out our sections *exactly* `-headerpad` bytes from
the header, whereas ld64 just treats the padding requirement as a lower bound.
ld64 actually starts laying out the non-header sections in the __TEXT segment
from the end of the (page-aligned) segment rather than the front, so its
binaries typically have more than `-headerpad` bytes of actual padding.
We should consider implementing the same alignment behavior.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D84714
The C++ ABI requires dylibs to pass a pointer to __cxa_atexit which does
e.g. cleanup of static global variables. The C++ spec says that the pointer
can point to any address in one of the dylib's segments, but in practice
ld64 seems to set it to point to the header, so that's what's implemented
here.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D83603
The previous approach of adding up the file sizes of the
component sections ignored the fact that the sections did not have to be
contiguous in the file. As such, it was underestimating the true size.
I discovered this issue because `codesign` checks whether `__LINKEDIT`
extends to the end of the file. Since we were underestimating segment
sizes, this check failed.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D84574
Needed for testing Objective-C programs (since e.g. Core
Foundation is a framework)
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D83925
XCode passes in this flag, which we do not yet implement. Skip
over the argument for now so we can at least successfully parse the
linker invocation.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D84485
This diff adds support for weak definitions, though it doesn't handle weak
symbols in dylibs quite correctly -- we need to emit binding opcodes for them
in the weak binding section rather than the lazy binding section.
What *is* covered in this diff:
1. Reading the weak flag from symbol table / export trie, and writing it to the
export trie
2. Refining the symbol table's rules for choosing one symbol definition over
another. Wrote a few dozen test cases to make sure we were matching ld64's
behavior.
We can now link basic C++ programs.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D83532
Previously, we only supported binding dysyms to the GOT. This
diff adds support for binding them to any arbitrary section. C++
programs appear to use this, I believe for vtables and type_info.
This diff also makes our bind opcode encoding a bit smarter -- we now
encode just the differences between bindings, which will make things
more compact.
I was initially concerned about the performance overhead of iterating
over these relocations, but it turns out that the number of such
relocations is small. A quick analysis of my llvm-project build
directory showed that < 1.3% out of ~7M relocations are RELOC_UNSIGNED
bindings to symbols (including both dynamic and static symbols).
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D83103
Summary:
ld64 does this, and references an internal rdar:// number as an explanation. No
idea what that rdar issue is, but in practice, it seems that not putting a BSS
section at the end can cause subsequent sections in the same segment to be
overwritten with zeroes.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81888
Summary:
There were a few issues with the previous setup:
1. The section sorting comparator used a declarative map of section names to
determine the correct order, but it turns out we need to match on more than
just names -- in particular, an upcoming diff will sort based on whether the
S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but
flexible form.
2. We were sorting OutputSections stored in a MapVector, which left the
MapVector in an inconsistent state -- the wrong keys map to the wrong values!
In practice, we weren't doing key lookups (only container iteration) after the
sort, so this was fine, but it was still a dubious state of affairs. This diff
copies the OutputSections to a vector before sorting them.
3. We were adding unneeded OutputSections to OutputSegments and then filtering
them out later, which meant that we had to remember whether an OutputSegment
was in a pre- or post-filtered state. This diff only adds the sections to the
segments if they are needed.
In addition to those major changes, two minor ones worth noting:
1. I renamed all OutputSection variable names to `osec`, to parallel `isec`.
Previously we were using some inconsistent combination of `osec`, `os`, and
`section`.
2. I added a check (and a test) for InputSections with names that clashed with
those of our synthetic OutputSections.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81887
Summary:
llvm-mc emits `__bss` sections with an offset of zero, but we weren't expecting
that in our input, so we were copying non-zero data from the start of the file and
putting it in `__bss`, with obviously undesirable runtime results. (It appears that
the kernel will copy those nonzero bytes as long as the offset is nonzero, regardless
of whether S_ZERO_FILL is set.)
I debated on whether to make a special ZeroFillSection -- separate from a
regular InputSection -- but it seemed like too much work for now. But I'm happy
to refactor if anyone feels strongly about having it as a separate class.
Depends on D80857.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80859
Summary:
Turns out this case is actually really common -- it happens whenever there's
a reference to an `extern` variable that ends up statically linked.
Depends on D80856.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80857