374 Commits

Author SHA1 Message Date
Jez Ng 7bbdbacd00 [lld-macho] Use export trie instead of symtab when linking against dylibs
Summary:
This allows us to link against stripped dylibs. Moreover, it's simply
more correct: The symbol table includes symbols that the dylib uses but
doesn't export.

This temporarily regresses our ability to do lazy symbol binding because
dyld_stub_binder isn't in libSystem's export trie. Rather, it is in one
of the sub-libraries libSystem re-exports. (This doesn't affect our
tests since we are mocking out dyld_stub_binder there.) A follow-up diff
will address this by adding support for sub-libraries.

Depends on D79114.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79226
2020-05-09 20:56:22 -07:00
Jez Ng 5d3feefa0d [lld-macho] Dylib symbols should always replace undefined symbols
Summary:
Otherwise we get undefined symbol errors depending on the order of
arguments on the command line.

Depends on D78270.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79114
2020-05-09 20:56:22 -07:00
Jez Ng b3e2fc931d [lld-macho] Support calls to functions in dylibs
Summary:
This diff implements lazy symbol binding -- very similar to the PLT
mechanism in ELF.

ELF's .plt section is broken up into two sections in Mach-O:
StubsSection and StubHelperSection. Calls to functions in dylibs will
end up calling into StubsSection, which contains indirect jumps to
addresses stored in the LazyPointerSection (the counterpart to ELF's
.plt.got).

Initially, the LazyPointerSection contains addresses that point into one
of the entry points in the middle of the StubHelperSection. The code in
StubHelperSection will push on the stack an offset into the
LazyBindingSection. The push is followed by a jump to the beginning of
the StubHelperSection (similar to PLT0), which then calls into
dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this
call looks it up in the GOT.

The stub binder will look up the bind opcodes in the LazyBindingSection
at the given offset. The bind opcodes will tell the binder to update the
address in the LazyPointerSection to point to the symbol, so that
subsequent calls don't have to redo the symbol resolution. The binder
will then jump to the resolved symbol.
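
The control flow above can be illustrated with a toy C++ model. This is only a sketch under loud assumptions: real stubs are machine code, and `LazyBinder`, `bind`, `call`, and the two demo functions are hypothetical names that do not appear in lld. A null entry stands in for "still points into the StubHelperSection".

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

using Fn = int (*)(int);

// Demo functions standing in for symbols exported by a dylib.
int twice(int x) { return 2 * x; }
int negateInt(int x) { return -x; }

// Hypothetical model; none of these names come from lld.
struct LazyBinder {
  // Stand-in for the dylib's exported symbols.
  std::unordered_map<std::string, Fn> exports;
  // Stand-in for the LazyBindingSection: entry i names the symbol to bind.
  std::vector<std::string> lazyBindings;
  // Stand-in for the LazyPointerSection. nullptr models a pointer that
  // still targets the stub helper rather than the resolved symbol.
  std::vector<Fn> lazyPointers;
  // Counts how often the (modelled) stub binder had to run.
  int resolutions = 0;

  // Models dyld_stub_binder: look up the symbol, patch the lazy pointer.
  Fn bind(std::size_t index) {
    ++resolutions;
    Fn target = exports.at(lazyBindings.at(index));
    lazyPointers[index] = target;
    return target;
  }

  // Models a call through a stub: jump via the lazy pointer, binding first
  // if the pointer has not been resolved yet.
  int call(std::size_t index, int arg) {
    Fn target = lazyPointers.at(index);
    if (target == nullptr)
      target = bind(index);
    return target(arg);
  }
};

LazyBinder makeDemoBinder() {
  LazyBinder b;
  b.exports = {{"_twice", twice}, {"_negate", negateInt}};
  b.lazyBindings = {"_twice", "_negate"};
  b.lazyPointers.assign(b.lazyBindings.size(), nullptr);
  return b;
}
```

In this model, the first `call` for a given index goes through `bind`, and later calls for the same index use the patched pointer directly, which is the point of binding lazily.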

Depends on D78269.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78270
2020-05-09 20:56:22 -07:00
Jez Ng db157d2733 [lld-macho] Follow-up to D77893
Summary:
1. Don't have isHidden() depend on isNeeded(). Whether a section is
  hidden is orthogonal to whether it is needed: hidden sections will
  never have a header regardless of whether they have a body. (I know we
  override this method with `return false` for synthetic sections, but
  regardless I think it's confusing to write it this way for non-synthetic
  sections.)

2. Don't call writeTo() on unneeded sections. D78270 assumes that this
  is true when implementing the stub helper section.

3. Filter out the unneeded sections early on to avoid having to deal
   with them in multiple places.

4. Remove assumption in test that the referenced file has no other symbols.
  (We should create separate input files for future tests to avoid such
  issues.)
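
Points 1 and 3 can be sketched roughly as follows. The `Section` struct and both helpers are hypothetical stand-ins rather than lld's classes, and "needed means the section has a body" is an assumption made only for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an output section; not lld's actual class.
struct Section {
  std::string name;
  std::size_t size;
  bool hidden; // hidden sections never get a section header

  // Deliberately independent of isNeeded().
  bool isHidden() const { return hidden; }
  // Assumption for this sketch: a section is needed iff it has a body.
  bool isNeeded() const { return size != 0; }
};

// Drop unneeded sections once, up front, so later passes (header emission,
// writeTo, ...) never have to re-check isNeeded().
std::vector<Section> filterNeeded(const std::vector<Section> &all) {
  std::vector<Section> needed;
  for (const Section &s : all)
    if (s.isNeeded())
      needed.push_back(s);
  return needed;
}

// Headers are emitted only for the remaining sections that are not hidden.
std::size_t countHeaders(const std::vector<Section> &needed) {
  std::size_t n = 0;
  for (const Section &s : needed)
    if (!s.isHidden())
      ++n;
  return n;
}
```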

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79460
2020-05-09 20:56:22 -07:00
Fangrui Song 6939fe6e08 [lld-macho] Support X86_64_RELOC_SIGNED_{1,2,4}
We currently only support extern relocations.
`X86_64_RELOC_SIGNED_{1,2,4}` are like `X86_64_RELOC_SIGNED`, but with the
implicit addend fixed to 1, 2, and 4, respectively.
See the comment in `RecordX86_64Relocation` in `lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp`.
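
For intuition about where the 1, 2, and 4 come from, here is a small sketch of the underlying x86-64 arithmetic. It is not lld's relocation code. It assumes the usual reading of these relocation types: the suffix matches the number of immediate bytes that follow the 4-byte displacement field, as in `movb $0x12, _foo(%rip)` (one trailing byte) or `movl $0x12345678, _foo(%rip)` (four trailing bytes). The function name and parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// RIP-relative displacements are measured from the address of the *next*
// instruction. If the 4-byte displacement field at fieldVA is followed by
// trailingImmBytes bytes of immediate, the next instruction starts at
// fieldVA + 4 + trailingImmBytes.
int32_t ripRelativeDisp(uint64_t targetVA, uint64_t fieldVA,
                        unsigned trailingImmBytes) {
  const int64_t nextInsn =
      static_cast<int64_t>(fieldVA) + 4 + trailingImmBytes;
  const int64_t disp = static_cast<int64_t>(targetVA) - nextInsn;
  assert(disp >= INT32_MIN && disp <= INT32_MAX &&
         "displacement must fit in 32 bits");
  return static_cast<int32_t>(disp);
}
```

With no trailing immediate this reduces to the familiar `target - (field + 4)`; each trailing immediate byte moves the reference point one byte further.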

Reviewed By: int3

Differential Revision: https://reviews.llvm.org/D79311
2020-05-04 15:15:35 -07:00
Kellie Medlin 6cb073133c [lld] Merge Mach-O input sections
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).

Differential Revision: https://reviews.llvm.org/D77893
2020-05-01 16:57:18 -07:00
Jez Ng e82c5e17b5 [lld-macho] Support X86_64_RELOC_BRANCH
Relatively straightforward diff, to set the stage for calling functions
in dylibs.

Differential Revision: https://reviews.llvm.org/D78269
2020-04-29 15:45:01 -07:00
Jez Ng df92377823 [lld-macho] Have Symbol::getVA() return a non-relative virtual address
Currently, getVA() returns a virtual address with the assumption that
the ImageBase is zero. As I understand it, this is what lld-ELF does.
However, under our current design, it seems like an awkward setup --
I'm finding that I have to add and subtract ImageBase in several places
to make things work out.

As such, I think it's simpler to have getVA() return a non-relative VA,
but I'm not sure if I'm missing something. Would love to hear more from
folks familiar with lld-ELF.

Differential Revision: https://reviews.llvm.org/D78168
2020-04-29 15:44:50 -07:00
Jez Ng 918948db4d [lld-macho] Support reading of universal binaries
Differential Revision: https://reviews.llvm.org/D77006
2020-04-29 15:44:44 -07:00
Jez Ng 89285a1a97 [lld-macho] Disable colors in errors when not printing to a pty
This makes for better tests (and is just the right thing to do).
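
A minimal sketch of the idea, assuming a POSIX host where `isatty()` is available. The helper names are hypothetical, and this is not how lld itself wires up its error stream.

```cpp
#include <cstdio>
#include <string>
#include <unistd.h> // POSIX isatty(); assumes a Unix-like host

// True if the file descriptor refers to a terminal.
bool isTerminal(int fd) { return isatty(fd) == 1; }

// Hypothetical formatter: wrap the "error:" prefix in ANSI bold red only
// when colors were requested.
std::string formatError(const std::string &msg, bool useColor) {
  if (useColor)
    return "\033[1;31merror:\033[0m " + msg;
  return "error: " + msg;
}

// Emit colors only when stderr is a terminal, so that output redirected to
// a file or a pipe (as in tests) stays free of escape sequences.
void reportError(const std::string &msg) {
  const std::string line =
      formatError(msg, isTerminal(STDERR_FILENO)) + "\n";
  std::fputs(line.c_str(), stderr);
}
```

Keeping the formatting separate from the terminal check means tests can exercise both variants without needing a real pty.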

Differential Revision: https://reviews.llvm.org/D79069
2020-04-29 15:44:35 -07:00
Jez Ng 9854edd817 [lld-macho] Implement basic export trie
Build the trie by performing a three-way radix quicksort: We start by
sorting the strings by their first characters, then sort the strings
with the same first characters by their second characters, and so on
recursively. Each time the prefixes diverge, we add a node to the trie.
Thanks to @ruiu for the idea.
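
For readers unfamiliar with the sort, here is a standalone sketch of three-way radix (multikey) quicksort over symbol names. It covers only the sorting step; creating a trie node wherever prefixes diverge is left out. The function names are hypothetical and this is not the code from the diff.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Character at position pos, or -1 if the string is shorter than that, so
// that a string sorts before every longer string sharing its prefix.
static int charAt(const std::string &s, std::size_t pos) {
  return pos < s.size() ? static_cast<unsigned char>(s[pos]) : -1;
}

// Sort v[lo, hi), assuming all strings in the range agree on their first
// pos characters.
static void radixQuicksort(std::vector<std::string> &v, std::size_t lo,
                           std::size_t hi, std::size_t pos) {
  if (hi - lo <= 1)
    return;
  const int pivot = charAt(v[lo + (hi - lo) / 2], pos);
  // Three-way partition on the character at pos:
  // [lo, lt) < pivot, [lt, gt) == pivot, [gt, hi) > pivot.
  std::size_t lt = lo, i = lo, gt = hi;
  while (i < gt) {
    const int c = charAt(v[i], pos);
    if (c < pivot)
      std::swap(v[lt++], v[i++]);
    else if (c > pivot)
      std::swap(v[i], v[--gt]);
    else
      ++i;
  }
  radixQuicksort(v, lo, lt, pos);
  // Strings in the middle group share one more character; move on to the
  // next position unless they have all ended (pivot == -1).
  if (pivot != -1)
    radixQuicksort(v, lt, gt, pos + 1);
  radixQuicksort(v, gt, hi, pos);
}

void radixQuicksort(std::vector<std::string> &v) {
  radixQuicksort(v, 0, v.size(), 0);
}
```

The middle partition at each level is exactly a group of strings with a common prefix, which is why the recursion lines up naturally with trie construction.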

I used llvm-mc's radix quicksort implementation as a starting point. The
trie offset fixpoint code was taken from
MachONormalizedFileBinaryWriter.cpp.

Differential Revision: https://reviews.llvm.org/D76977
2020-04-29 15:44:27 -07:00
Jez Ng 62b8f32f76 [lld-macho][reland] Add support for emitting dylibs with a single symbol
This got reverted due to UBSAN errors in a diff lower in the stack,
which is being fixed in https://reviews.llvm.org/D79050. This diff is
otherwise identical to the original https://reviews.llvm.org/D76908
(which was committed in 9598778bd1 and reverted in b52bc2653b).

Differential Revision: https://reviews.llvm.org/D79051
2020-04-28 17:08:32 -07:00
Jez Ng 4f0cccdd7a [lld-macho][reland] Add basic symbol table output
This diff implements basic support for writing a symbol table.

Attributes are loosely supported for extern symbols and not at all for
other types.

Initial version by Kellie Medlin <kelliem@fb.com>

Originally committed in a3d95a50ee and reverted in fbae153ca5 due to
UBSAN erroring over unaligned writes. That has been fixed in the
current diff with the following changes:

```
diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -133,6 +133,9 @@ SymtabSection::SymtabSection(StringTableSection &stringTableSection)
     : stringTableSection(stringTableSection) {
   segname = segment_names::linkEdit;
   name = section_names::symbolTable;
+  // TODO: When we introduce the SyntheticSections superclass, we should make
+  // all synthetic sections aligned to WordSize by default.
+  align = WordSize;
 }

 size_t SymtabSection::getSize() const {
diff --git a/lld/MachO/Writer.cpp b/lld/MachO/Writer.cpp
--- a/lld/MachO/Writer.cpp
+++ b/lld/MachO/Writer.cpp
@@ -371,6 +371,7 @@ void Writer::assignAddresses(OutputSegment *seg) {
     ArrayRef<InputSection *> sections = p.second;
     for (InputSection *isec : sections) {
       addr = alignTo(addr, isec->align);
+      // We must align the file offsets too to avoid misaligned writes of
+      // structs.
+      fileOff = alignTo(fileOff, isec->align);
       isec->addr = addr;
       addr += isec->getSize();
       fileOff += isec->getFileSize();
@@ -396,6 +397,7 @@ void Writer::writeSections() {
     uint64_t fileOff = seg->fileOff;
     for (auto &sect : seg->getSections()) {
       for (InputSection *isec : sect.second) {
+        fileOff = alignTo(fileOff, isec->align);
         isec->writeTo(buf + fileOff);
         fileOff += isec->getFileSize();
       }
```

I don't think it's easy to write a test for alignment (that doesn't
involve brittly hard-coding file offsets), so there isn't one... but
UBSAN builds pass now.
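
To spell out the fix in isolation, here is a simplified sketch of the layout loop. The local `alignTo` is a standalone stand-in for `llvm::alignTo`; the `Section` struct is hypothetical, and the sketch assumes each section's file size equals its in-memory size, which the real code does not.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round value up to the next multiple of align.
uint64_t alignTo(uint64_t value, uint64_t align) {
  assert(align != 0 && "alignment must be non-zero");
  return (value + align - 1) / align * align;
}

// Hypothetical, simplified section record.
struct Section {
  uint64_t size;
  uint64_t align;
  uint64_t addr;    // assigned virtual address
  uint64_t fileOff; // assigned file offset
};

// Assign addresses and file offsets, aligning BOTH. Aligning only the
// address leaves the file offset, and hence the buffer position that
// structs are later written to, potentially misaligned.
void assignAddresses(std::vector<Section> &sections, uint64_t addr,
                     uint64_t fileOff) {
  for (Section &s : sections) {
    addr = alignTo(addr, s.align);
    fileOff = alignTo(fileOff, s.align);
    s.addr = addr;
    s.fileOff = fileOff;
    addr += s.size;
    fileOff += s.size;
  }
}
```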

Differential Revision: https://reviews.llvm.org/D79050
2020-04-28 17:07:06 -07:00
Shoaib Meenai fbae153ca5 Revert "[lld-macho] Add basic symbol table output"
This reverts commit a3d95a50ee.

Reverting due to UBSan failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/40817/steps/check-lld%20ubsan/logs/stdio
2020-04-28 11:34:03 -07:00
Shoaib Meenai b52bc2653b Revert "[lld-macho] Add support for emitting dylibs with a single symbol"
This reverts commit 9598778bd1.

Reverting due to UBSan failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/40817/steps/check-lld%20ubsan/logs/stdio
2020-04-28 11:34:03 -07:00
Shoaib Meenai af40bff32d [MachO] Fix UB in memcpy
UBSan complains about a memcpy with a null pointer, so just skip the
memcpy call if the data is empty.
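
A minimal sketch of the guard, with a hypothetical helper name. Passing a null pointer to `memcpy` is undefined behavior even when the length is zero, so the empty case is handled before the call.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy n bytes from src to dst. Empty inputs may come with a null data
// pointer, so skip memcpy entirely when there is nothing to copy.
void copyBytes(uint8_t *dst, const uint8_t *src, std::size_t n) {
  if (n == 0)
    return;
  std::memcpy(dst, src, n);
}
```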
2020-04-28 11:33:54 -07:00
Jez Ng 9598778bd1 [lld-macho] Add support for emitting dylibs with a single symbol
Summary:
Add logic for emitting the correct set of load commands and segments
when `-dylib` is passed.

I haven't gotten to implementing a real export trie yet, so we can only
emit a single symbol, but it's enough to replace the YAML test files
introduced in D76252.

Differential Revision: https://reviews.llvm.org/D76908
2020-04-27 13:33:46 -07:00
Jez Ng a3d95a50ee [lld-macho] Add basic symbol table output
This diff implements basic support for writing a symbol table.

- Attributes are loosely supported for extern symbols and not at all for
  other types

Immediate future work will involve implementing section merging.

Initial version by Kellie Medlin <kelliem@fb.com>

Differential Revision: https://reviews.llvm.org/D76742
2020-04-27 13:33:15 -07:00
Jez Ng 6f63216c3d [lld-macho] Extend SyntheticSections to cover all segment load commands
Previously, the special segments `__PAGEZERO` and `__LINKEDIT` were
implemented as special LoadCommands. This diff instead implements them using
special sections that have an `isHidden()` attribute. We do not
emit section headers for hidden sections, but we use their addresses and
file offsets to determine those of their containing segments. In addition
to allowing us to share more segment-related code, this refactor is also
important for the next step of emitting dylibs:

1) dylibs don't have segments like `__PAGEZERO`, so we need an easy way of
   omitting them without messing up segment indices
2) Unlike the kernel, which is happy to run an executable with
   out-of-order segments, dyld requires dylibs to have their segment
   load commands arranged in increasing address order. The refactor
   makes it easier to implement sorting of sections and segments.
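
As a rough illustration of point 2, ordering segments by address before emitting their load commands might look like this. The `Segment` struct and `sortByAddress` are hypothetical and much simpler than lld's output segments.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified segment record.
struct Segment {
  std::string name;
  uint64_t addr;
};

// Arrange segments in increasing address order. stable_sort keeps the
// original relative order of any segments that share an address.
void sortByAddress(std::vector<Segment> &segs) {
  std::stable_sort(segs.begin(), segs.end(),
                   [](const Segment &a, const Segment &b) {
                     return a.addr < b.addr;
                   });
}
```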

Differential Revision: https://reviews.llvm.org/D76839
2020-04-27 12:58:12 -07:00
Simon Pilgrim 0847cfa334 [lld][macho] Fix implicit dependency on DenseMap.h include
It was depending on CachedHashString.h providing the include.
2020-04-27 14:05:29 +01:00
Jez Ng 060efd24c7 [lld-macho] Add basic support for linking against dylibs
This diff implements:

* dylib loading (much of which is being restored from @pcc and @ruiu's
  original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
  symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT

Differential Revision: https://reviews.llvm.org/D76252
2020-04-21 13:43:19 -07:00
Fangrui Song 6acd300375 Reland D75382 "[lld] Initial commit for new Mach-O backend"
With a fix for http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636

Also trims some unneeded dependencies.
2020-04-02 12:03:43 -07:00
Oliver Stannard af39151f3c Revert "[lld] Initial commit for new Mach-O backend"
This is causing buildbot failures on 32-bit hosts, for example:
http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636

This reverts commit 03f43b3aca.
2020-04-02 13:23:30 +01:00
Jez Ng 03f43b3aca [lld] Initial commit for new Mach-O backend
Summary:
This is the first commit for the new Mach-O backend, designed to roughly
follow the architecture of the existing ELF and COFF backends, and
building off work that @ruiu and @pcc did in a branch a while back. Note
that this is a very stripped-down commit with the bare minimum of
functionality for ease of review. We'll be following up with more diffs
soon.

Currently, we're able to generate a simple "Hello World!" executable
that runs on macOS Catalina (and possibly on earlier macOS versions; I
haven't tested them). (This executable can be obtained by compiling
`test/MachO/relocations.s`.) We're mocking out a few load commands to
achieve this -- for example, we can't load dynamic libraries, but
Catalina requires binaries to be linked against `dyld`, so we hardcode
the emission of a `LC_LOAD_DYLIB` command. Other mocked out load
commands include `LC_SYMTAB` and `LC_DYSYMTAB`.

Differential Revision: https://reviews.llvm.org/D75382
2020-03-31 11:58:47 -07:00