Commit Graph

546 Commits

Author SHA1 Message Date
Michael Liao 43793b89a0 [lld] Fix shared library build by adding the missing dependency. 2020-06-08 16:12:58 -04:00
Saleem Abdulrasool fcdf7578aa lld: improve the `-arch` handling for MachO
Use the default target triple configured by the user to determine the
default architecture for `ld64.lld`.  Stash the architecture in the
configuration as when linking against TBDs, we will need to filter out
the symbols based upon the architecture.  Treat the Haswell slice as it
is equivalent to `x86_64` but with the extra Haswell extensions (e.g.
AVX2, FMA3, BMI1, etc).  This will make it easier to add new
architectures in the future.

This change also changes the failure mode where an invalid `-arch`
parameter will result in the linker exiting without further processing.
2020-06-08 11:04:19 -07:00
Saleem Abdulrasool e78431354b lld: use modern library search ordering
This merges the static and shared library and behaves as if
`-search_paths_first` was specified which is also the default behaviour
on ld64 (and now lld). Unify the paths, and use `llvm::sys::path` to
deal with the path to be truly agnostic to the host.
2020-06-05 12:12:26 -07:00
Saleem Abdulrasool 116e38fd8b lld: add basic static library search
This is a very basic static library search addition. This is the pre-Xcode4
behaviour of searching all paths for the shared version before searching for
the static version of the library. This behaviour is supposed to be inverted
with `-search_paths_first` being the default. This adds the library search
with the intention of providing the setup to merge the paths into one path
and making it controllable by `OPT_search_paths_first`.
2020-06-03 23:32:05 +00:00
Saleem Abdulrasool 37d93b528c lld: ignore the `-search_paths_first` option on MachO
ld64 provides the `-search_path_firsts` which will search each path in
the library search path order for both `lib[name].dylib`, `lib[name].a`
before moving on (searching all paths for the dylib and then falling
back to the static library if a shared library was not found).

This option has been the default for a long time, but the command line
flag still exists.  Ignore it for compatibility.
2020-06-03 15:36:35 +00:00
Jez Ng d767de44bf [lld-macho] Fix PAGEZERO=4GB errors on Windows by ensuring enum is uint64_t
It appears that MSVC doesn't resize the enum properly to fit the
constants.
2020-06-02 15:24:31 -07:00
Jez Ng 1e1a3f67ee [lld-macho] Ensure reads from nlist_64 structs are aligned when necessary
My test refactoring in D80217 seems to have caused yaml2obj to emit
unaligned nlist_64 structs, causing ASAN'd lld to be unhappy. I don't
think this is an issue with yaml2obj though -- llvm-mc also seems to
emit unaligned nlist_64s. This diff makes lld able to safely do aligned
reads under ASAN builds while hopefully creating no overhead for regular
builds on architectures that support unaligned reads.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D80414
2020-06-02 13:19:38 -07:00
Jez Ng a04c133564 [lld-macho] Set __PAGEZERO size to 4GB
That's what ld64 uses for 64-bit targets. I figured it's best to make
this change sooner rather than later since a bunch of our tests are
relying on hardcoded addresses that depend on this value.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80177
2020-06-02 13:19:38 -07:00
Jez Ng df2a5778c3 [lld-macho] Error on encountering undefined symbols
... instead of silently emitting a reference to the zero address.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80169
2020-06-02 13:19:38 -07:00
Jez Ng 6f6d91867d [lld-macho] Add some relocation validation logic
I considered making a `Target::validate()` method, but I wasn't sure how
I felt about the overhead of doing yet another switch-dispatch on the
relocation type, so I put the validation in `relocateOne` instead...
might be a bit of a micro-optimization, but `relocateOne` does assume
certain things about the relocations it gets, and this error handling
makes that explicit, so it's not a totally unreasonable code
organization.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80049
2020-06-02 13:19:38 -07:00
Jez Ng ce0d8beebc [lld-macho][re-land] Support X86_64_RELOC_UNSIGNED
This reverts commit db8559eee4.
2020-05-19 12:31:55 -07:00
Jez Ng 4eb6f4854e [lld-macho][re-land] Support .subsections_via_symbols
Summary:
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.

The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.

We exercise this functionality in our tests by using order files that
rearrange those symbols.

Depends on D79668.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Reviewed By: smeenai

Subscribers: thakis, llvm-commits, pcc, ruiu

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79926
2020-05-19 12:31:54 -07:00
Jez Ng 70fbbcdd34 Revert "[lld-macho] Support .subsections_via_symbols"
Due to build breakage mentioned in https://reviews.llvm.org/D79926.

This reverts commit e270b2f172.
2020-05-19 08:30:02 -07:00
Jez Ng db8559eee4 Revert "[lld-macho] Support X86_64_RELOC_UNSIGNED"
This reverts commit 1f820e3559.
2020-05-19 08:30:02 -07:00
Jez Ng 1f820e3559 [lld-macho] Support X86_64_RELOC_UNSIGNED
Note that it's only used for non-pc-relative contexts.

Reviewed By: MaskRay, smeenai

Differential Revision: https://reviews.llvm.org/D80048
2020-05-19 07:46:57 -07:00
Jez Ng e270b2f172 [lld-macho] Support .subsections_via_symbols
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.

The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.

We exercise this functionality in our tests by using order files that
rearrange those symbols.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D79926
2020-05-19 07:46:57 -07:00
Jez Ng 55e9eb416e [lld-macho] Support -order_file
The order file indicates how input sections should be sorted within each
output section, based on the symbols contained within those sections.

This diff sets the stage for implementing and testing
`.subsections_via_symbols`, where we will break up InputSections by each
symbol and sort them more granularly.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D79668
2020-05-19 07:46:57 -07:00
Kellie Medlin 2b920ae78c [lld] Add archive file support to Mach-O backend
With this change, basic archive files can be linked together. Input
section discovery has been refactored into a function since archive
files lazily resolve their symbols / the object files containing those
symbols.

Reviewed By: int3, smeenai

Differential Revision: https://reviews.llvm.org/D78342
2020-05-14 12:58:35 -07:00
Nico Weber 759bae956a [lld-macho] Ignore -platform_version and -syslibroot flags.
clang passes these flags; this makes it easier to try `clang -v`
output with `ld -flavor darwinnew`.

Differential Revision: https://reviews.llvm.org/D79797
2020-05-12 19:17:01 -04:00
Jez Ng 87b6fd3e02 [lld-macho] Add support for creating and reading reexported dylibs
This unblocks the linking of real programs, since many core system
functions are only available as sub-libraries of libSystem.

Differential Revision: https://reviews.llvm.org/D79228
2020-05-12 07:52:03 -07:00
Eric Christopher 020022e12e Fix auto -> auto * clang tidy. 2020-05-11 15:50:52 -07:00
Jez Ng 198b0c57df [lld-macho] Support pc-relative section relocations
Summary: So far we've only supported symbol relocations.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79211
2020-05-09 20:56:23 -07:00
Jez Ng 7bbdbacd00 [lld-macho] Use export trie instead of symtab when linking against dylibs
Summary:
This allows us to link against stripped dylibs. Moreover, it's simply
more correct: The symbol table includes symbols that the dylib uses but
doesn't export.

This temporarily regresses our ability to do lazy symbol binding because
dyld_stub_binder isn't in libSystem's export trie. Rather, it is in one
of the sub-libraries libSystem re-exports. (This doesn't affect our
tests since we are mocking out dyld_stub_binder there.) A follow-up diff
will address this by adding support for sub-libraries.

Depends on D79114.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79226
2020-05-09 20:56:22 -07:00
Jez Ng 5d3feefa0d [lld-macho] Dylib symbols should always replace undefined symbols
Summary:
Otherwise we get undefined symbol errors depending on the order of
arguments on the command line.

Depends on D78270.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79114
2020-05-09 20:56:22 -07:00
Jez Ng b3e2fc931d [lld-macho] Support calls to functions in dylibs
Summary:
This diff implements lazy symbol binding -- very similar to the PLT
mechanism in ELF.

ELF's .plt section is broken up into two sections in Mach-O:
StubsSection and StubHelperSection. Calls to functions in dylibs will
end up calling into StubsSection, which contains indirect jumps to
addresses stored in the LazyPointerSection (the counterpart to ELF's
.plt.got).

Initially, the LazyPointerSection contains addresses that point into one
of the entry points in the middle of the StubHelperSection. The code in
StubHelperSection will push on the stack an offset into the
LazyBindingSection. The push is followed by a jump to the beginning of
the StubHelperSection (similar to PLT0), which then calls into
dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this
call looks it up in the GOT.

The stub binder will look up the bind opcodes in the LazyBindingSection
at the given offset. The bind opcodes will tell the binder to update the
address in the LazyPointerSection to point to the symbol, so that
subsequent calls don't have to redo the symbol resolution. The binder
will then jump to the resolved symbol.

Depends on D78269.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78270
2020-05-09 20:56:22 -07:00
Jez Ng db157d2733 [lld-macho] Follow-up to D77893
Summary:
1. Don't have isHidden() depend on isNeeded(). Whether a section is
  hidden is orthogonal from whether it is needed: hidden sections will
  never have a header regardless of whether they have a body. (I know we
  override this method with return false for synthetic sections, but
  regardless I think it's confusing to write it this way for non-synthetic
  sections.)

2. Don't call writeTo() on unneeded sections. D78270 assumes that this
  is true when implementing the stub helper section.

3. Filter out the unneeded sections early on to avoid having to deal
   with them in multiple places.

4. Remove assumption in test that the referenced file has no other symbols.
  (We should create separate input files for future tests to avoid such
  issues.)

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79460
2020-05-09 20:56:22 -07:00
Fangrui Song 6939fe6e08 [lld-macho] Support X86_64_RELOC_SIGNED_{1,2,4}
We currently only support extern relocations.
`X86_64_RELOC_SIGNED_{1,2,4}` are like X86_64_RELOC_SIGNED, but with the
implicit addend fixed to 1, 2, and 4, respectively.
See the comment in `lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp RecordX86_64Relocation`.

Reviewed By: int3

Differential Revision: https://reviews.llvm.org/D79311
2020-05-04 15:15:35 -07:00
Kellie Medlin 6cb073133c [lld] Merge Mach-O input sections
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).

Differential Revision: https://reviews.llvm.org/D77893
2020-05-01 16:57:18 -07:00
Jez Ng e82c5e17b5 [lld-macho] Support X86_64_RELOC_BRANCH
Relatively straightforward diff, to set the stage for calling functions
in dylibs.

Differential Revision: https://reviews.llvm.org/D78269
2020-04-29 15:45:01 -07:00
Jez Ng df92377823 [lld-macho] Have Symbol::getVA() return a non-relative virtual address
Currently, getVA() returns a virtual address with the assumption that
the ImageBase is zero. As I understand, this is what lld-ELF is doing.
However, under our current design, it seems like an awkward setup --
I'm finding that I have to add and subtract ImageBase in several places
to make things work out.

As such, I think it's simpler to have getVA() return a non-relative VA,
but I'm not sure if I'm missing something. Would love to hear more from
folks familiar with lld-ELF.

Differential Revision: https://reviews.llvm.org/D78168
2020-04-29 15:44:50 -07:00
Jez Ng 918948db4d [lld-macho] Support reading of universal binaries
Differential Revision: https://reviews.llvm.org/D77006
2020-04-29 15:44:44 -07:00
Jez Ng 89285a1a97 [lld-macho] Disable colors in errors when not printing to a pty
This makes for better tests (and is just the right thing to do)

Differential Revision: https://reviews.llvm.org/D79069
2020-04-29 15:44:35 -07:00
Jez Ng 9854edd817 [lld-macho] Implement basic export trie
Build the trie by performing a three-way radix quicksort: We start by
sorting the strings by their first characters, then sort the strings
with the same first characters by their second characters, and so on
recursively. Each time the prefixes diverge, we add a node to the trie.
Thanks to @ruiu for the idea.

I used llvm-mc's radix quicksort implementation as a starting point. The
trie offset fixpoint code was taken from
MachONormalizedFileBinaryWriter.cpp.

Differential Revision: https://reviews.llvm.org/D76977
2020-04-29 15:44:27 -07:00
Jez Ng 62b8f32f76 [lld-macho][reland] Add support for emitting dylibs with a single symbol
This got reverted due to UBSAN errors in a diff lower in the stack,
which is being fixed in https://reviews.llvm.org/D79050. This diff is
otherwise identical to the original https://reviews.llvm.org/D76908
(which was committed in 9598778bd1 and reverted in b52bc2653b).

Differential Revision: https://reviews.llvm.org/D79051
2020-04-28 17:08:32 -07:00
Jez Ng 4f0cccdd7a [lld-macho][reland] Add basic symbol table output
This diff implements basic support for writing a symbol table.

Attributes are loosely supported for extern symbols and not at all for
other types.

Initial version by Kellie Medlin <kelliem@fb.com>

Originally committed in a3d95a50ee and reverted in fbae153ca5 due to
UBSAN erroring over unaligned writes. That has been fixed in the
current diff with the following changes:

```
diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -133,6 +133,9 @@ SymtabSection::SymtabSection(StringTableSection &stringTableSection)
     : stringTableSection(stringTableSection) {
   segname = segment_names::linkEdit;
   name = section_names::symbolTable;
+  // TODO: When we introduce the SyntheticSections superclass, we should make
+  // all synthetic sections aligned to WordSize by default.
+  align = WordSize;
 }

 size_t SymtabSection::getSize() const {
diff --git a/lld/MachO/Writer.cpp b/lld/MachO/Writer.cpp
--- a/lld/MachO/Writer.cpp
+++ b/lld/MachO/Writer.cpp
@@ -371,6 +371,7 @@ void Writer::assignAddresses(OutputSegment *seg) {
     ArrayRef<InputSection *> sections = p.second;
     for (InputSection *isec : sections) {
       addr = alignTo(addr, isec->align);
+      // We must align the file offsets too to avoid misaligned writes of
+      // structs.
+      fileOff = alignTo(fileOff, isec->align);
       isec->addr = addr;
       addr += isec->getSize();
       fileOff += isec->getFileSize();
@@ -396,6 +397,7 @@ void Writer::writeSections() {
     uint64_t fileOff = seg->fileOff;
     for (auto &sect : seg->getSections()) {
       for (InputSection *isec : sect.second) {
+        fileOff = alignTo(fileOff, isec->align);
         isec->writeTo(buf + fileOff);
         fileOff += isec->getFileSize();
       }
```

I don't think it's easy to write a test for alignment (that doesn't
involve brittly hard-coding file offsets), so there isn't one... but
UBSAN builds pass now.

Differential Revision: https://reviews.llvm.org/D79050
2020-04-28 17:07:06 -07:00
Shoaib Meenai fbae153ca5 Revert "[lld-macho] Add basic symbol table output"
This reverts commit a3d95a50ee.

Reverting due to UBSan failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/40817/steps/check-lld%20ubsan/logs/stdio
2020-04-28 11:34:03 -07:00
Shoaib Meenai b52bc2653b Revert "[lld-macho] Add support for emitting dylibs with a single symbol"
This reverts commit 9598778bd1.

Reverting due to UBSan failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/40817/steps/check-lld%20ubsan/logs/stdio
2020-04-28 11:34:03 -07:00
Shoaib Meenai af40bff32d [MachO] Fix UB in memcpy
UBSan complains about a memcpy with a null pointer, so just skip the
memcpy call if the data is empty.
2020-04-28 11:33:54 -07:00
Jez Ng 9598778bd1 [lld-macho] Add support for emitting dylibs with a single symbol
Summary:
Add logic for emitting the correct set of load commands and segments
when `-dylib` is passed.

I haven't gotten to implementing a real export trie yet, so we can only
emit a single symbol, but it's enough to replace the YAML test files
introduced in D76252.

Differential Revision: https://reviews.llvm.org/D76908
2020-04-27 13:33:46 -07:00
Jez Ng a3d95a50ee [lld-macho] Add basic symbol table output
This diff implements basic support for writing a symbol table.

- Attributes are loosely supported for extern symbols and not at all for
  other types

Immediate future work will involve implementing section merging.

Initial version by Kellie Medlin <kelliem@fb.com>

Differential Revision: https://reviews.llvm.org/D76742
2020-04-27 13:33:15 -07:00
Jez Ng 6f63216c3d [lld-macho] Extend SyntheticSections to cover all segment load commands
Previously, the special segments `__PAGEZERO` and `__LINKEDIT` were
implemented as special LoadCommands. This diff implements them using
special sections instead which have an `isHidden()` attribute. We do not
emit section headers for hidden sections, but we use their addresses and
file offsets to determine that of their containing segments. In addition
to allowing us to share more segment-related code, this refactor is also
important for the next step of emitting dylibs:

1) dylibs don't have segments like __PAGEZERO, so we need an easy way of
   omitting them w/o messing up segment indices
2) Unlike the kernel, which is happy to run an executable with
   out-of-order segments, dyld requires dylibs to have their segment
   load commands arranged in increasing address order. The refactor
   makes it easier to implement sorting of sections and segments.

Differential Revision: https://reviews.llvm.org/D76839
2020-04-27 12:58:12 -07:00
Simon Pilgrim 0847cfa334 [lld][macho] Fix implicit dependency on DenseMap.h include
It was depending on CachedHashString.h providing the include.
2020-04-27 14:05:29 +01:00
Jez Ng 060efd24c7 [lld-macho] Add basic support for linking against dylibs
This diff implements:

* dylib loading (much of which is being restored from @pcc and @ruiu's
  original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
  symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT

Differential Revision: https://reviews.llvm.org/D76252
2020-04-21 13:43:19 -07:00
Fangrui Song 6acd300375 Reland D75382 "[lld] Initial commit for new Mach-O backend"
With a fix for http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636

Also trims some unneeded dependencies.
2020-04-02 12:03:43 -07:00
Oliver Stannard af39151f3c Revert "[lld] Initial commit for new Mach-O backend"
This is causing buildbot failures on 32-bit hosts, for example:
http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636

This reverts commit 03f43b3aca.
2020-04-02 13:23:30 +01:00
Jez Ng 03f43b3aca [lld] Initial commit for new Mach-O backend
Summary:
This is the first commit for the new Mach-O backend, designed to roughly
follow the architecture of the existing ELF and COFF backends, and
building off work that @ruiu and @pcc did in a branch a while back. Note
that this is a very stripped-down commit with the bare minimum of
functionality for ease of review. We'll be following up with more diffs
soon.

Currently, we're able to generate a simple "Hello World!" executable
that runs on OS X Catalina (and possibly on earlier OS X versions; I
haven't tested them). (This executable can be obtained by compiling
`test/MachO/relocations.s`.) We're mocking out a few load commands to
achieve this -- for example, we can't load dynamic libraries, but
Catalina requires binaries to be linked against `dyld`, so we hardcode
the emission of a `LC_LOAD_DYLIB` command. Other mocked out load
commands include LC_SYMTAB and LC_DYSYMTAB.

Differential Revision: https://reviews.llvm.org/D75382
2020-03-31 11:58:47 -07:00