Summary:
There were a few issues with the previous setup:
1. The section sorting comparator used a declarative map of section names to
determine the correct order, but it turns out we need to match on more than
just names -- in particular, an upcoming diff will sort based on whether the
S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but
flexible form.
2. We were sorting OutputSections stored in a MapVector, which left the
MapVector in an inconsistent state -- the wrong keys map to the wrong values!
In practice, we weren't doing key lookups (only container iteration) after the
sort, so this was fine, but it was still a dubious state of affairs. This diff
copies the OutputSections to a vector before sorting them.
3. We were adding unneeded OutputSections to OutputSegments and then filtering
them out later, which meant that we had to remember whether an OutputSegment
was in a pre- or post-filtered state. This diff only adds the sections to the
segments if they are needed.
In addition to those major changes, two minor ones worth noting:
1. I renamed all OutputSection variable names to `osec`, to parallel `isec`.
Previously we were using some inconsistent combination of `osec`, `os`, and
`section`.
2. I added a check (and a test) for InputSections with names that clashed with
those of our synthetic OutputSections.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81887
Summary:
Turns out this case is actually really common -- it happens whenever there's
a reference to an `extern` variable that ends up statically linked.
Depends on D80856.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80857
Summary:
So things work on 32-bit machines. (@vzakhari reported the
breakage starting from D80177).
Reviewers: #lld-macho, vzakhari
Subscribers: llvm-commits, vzakhari
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81982
That's what ld64 uses for 64-bit targets. I figured it's best to make
this change sooner rather than later since a bunch of our tests are
relying on hardcoded addresses that depend on this value.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80177
The order file indicates how input sections should be sorted within each
output section, based on the symbols contained within those sections.
This diff sets the stage for implementing and testing
`.subsections_via_symbols`, where we will break up InputSections by each
symbol and sort them more granularly.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D79668
Summary:
This diff implements lazy symbol binding -- very similar to the PLT
mechanism in ELF.
ELF's .plt section is broken up into two sections in Mach-O:
StubsSection and StubHelperSection. Calls to functions in dylibs will
end up calling into StubsSection, which contains indirect jumps to
addresses stored in the LazyPointerSection (the counterpart to ELF's
.plt.got).
Initially, the LazyPointerSection contains addresses that point into one
of the entry points in the middle of the StubHelperSection. The code in
StubHelperSection will push on the stack an offset into the
LazyBindingSection. The push is followed by a jump to the beginning of
the StubHelperSection (similar to PLT0), which then calls into
dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this
call looks it up in the GOT.
The stub binder will look up the bind opcodes in the LazyBindingSection
at the given offset. The bind opcodes will tell the binder to update the
address in the LazyPointerSection to point to the symbol, so that
subsequent calls don't have to redo the symbol resolution. The binder
will then jump to the resolved symbol.
Depends on D78269.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78270
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).
Differential Revision: https://reviews.llvm.org/D77893
Build the trie by performing a three-way radix quicksort: We start by
sorting the strings by their first characters, then sort the strings
with the same first characters by their second characters, and so on
recursively. Each time the prefixes diverge, we add a node to the trie.
Thanks to @ruiu for the idea.
I used llvm-mc's radix quicksort implementation as a starting point. The
trie offset fixpoint code was taken from
MachONormalizedFileBinaryWriter.cpp.
Differential Revision: https://reviews.llvm.org/D76977
This diff implements basic support for writing a symbol table.
Attributes are loosely supported for extern symbols and not at all for
other types.
Initial version by Kellie Medlin <kelliem@fb.com>
Originally committed in a3d95a50ee and reverted in fbae153ca5 due to
UBSAN erroring over unaligned writes. That has been fixed in the
current diff with the following changes:
```
diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -133,6 +133,9 @@ SymtabSection::SymtabSection(StringTableSection &stringTableSection)
: stringTableSection(stringTableSection) {
segname = segment_names::linkEdit;
name = section_names::symbolTable;
+ // TODO: When we introduce the SyntheticSections superclass, we should make
+ // all synthetic sections aligned to WordSize by default.
+ align = WordSize;
}
size_t SymtabSection::getSize() const {
diff --git a/lld/MachO/Writer.cpp b/lld/MachO/Writer.cpp
--- a/lld/MachO/Writer.cpp
+++ b/lld/MachO/Writer.cpp
@@ -371,6 +371,7 @@ void Writer::assignAddresses(OutputSegment *seg) {
ArrayRef<InputSection *> sections = p.second;
for (InputSection *isec : sections) {
addr = alignTo(addr, isec->align);
+ // We must align the file offsets too to avoid misaligned writes of
+ // structs.
+ fileOff = alignTo(fileOff, isec->align);
isec->addr = addr;
addr += isec->getSize();
fileOff += isec->getFileSize();
@@ -396,6 +397,7 @@ void Writer::writeSections() {
uint64_t fileOff = seg->fileOff;
for (auto § : seg->getSections()) {
for (InputSection *isec : sect.second) {
+ fileOff = alignTo(fileOff, isec->align);
isec->writeTo(buf + fileOff);
fileOff += isec->getFileSize();
}
```
I don't think it's easy to write a test for alignment (that doesn't
involve brittly hard-coding file offsets), so there isn't one... but
UBSAN builds pass now.
Differential Revision: https://reviews.llvm.org/D79050
Summary:
Add logic for emitting the correct set of load commands and segments
when `-dylib` is passed.
I haven't gotten to implementing a real export trie yet, so we can only
emit a single symbol, but it's enough to replace the YAML test files
introduced in D76252.
Differential Revision: https://reviews.llvm.org/D76908
This diff implements basic support for writing a symbol table.
- Attributes are loosely supported for extern symbols and not at all for
other types
Immediate future work will involve implementing section merging.
Initial version by Kellie Medlin <kelliem@fb.com>
Differential Revision: https://reviews.llvm.org/D76742
Previously, the special segments `__PAGEZERO` and `__LINKEDIT` were
implemented as special LoadCommands. This diff implements them using
special sections instead which have an `isHidden()` attribute. We do not
emit section headers for hidden sections, but we use their addresses and
file offsets to determine that of their containing segments. In addition
to allowing us to share more segment-related code, this refactor is also
important for the next step of emitting dylibs:
1) dylibs don't have segments like __PAGEZERO, so we need an easy way of
omitting them w/o messing up segment indices
2) Unlike the kernel, which is happy to run an executable with
out-of-order segments, dyld requires dylibs to have their segment
load commands arranged in increasing address order. The refactor
makes it easier to implement sorting of sections and segments.
Differential Revision: https://reviews.llvm.org/D76839
This diff implements:
* dylib loading (much of which is being restored from @pcc and @ruiu's
original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT
Differential Revision: https://reviews.llvm.org/D76252