From what I can tell, it's pretty similar to arm64. The two main differences
are:
1. No 64-bit relocations
2. Stub code writes to 32-bit registers instead of 64-bit
Plus of course the various on-disk structures like `segment_command` are using
the 32-bit instead of the 64-bit variants.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D99822
This could probably have been part of D99633, but I split it up to make
things a bit more reviewable. I also fixed some bugs in the implementation that
were masked through integer underflows when operating in 64-bit mode.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D99823
From what I can tell, it's pretty similar to arm64. The two main differences
are:
1. No 64-bit relocations
2. Stub code writes to 32-bit registers instead of 64-bit
Plus of course the various on-disk structures like `segment_command` are using
the 32-bit instead of the 64-bit variants.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D99822
Benchmarking chromium_framework on a 3.2 GHz 16-Core Intel Xeon W Mac Pro:
N Min Max Median Avg Stddev
x 20 4.33 4.42 4.37 4.37 0.021026299
+ 20 4.12 4.23 4.18 4.175 0.035318103
Difference at 95.0% confidence
-0.195 +/- 0.0186025
-4.46224% +/- 0.425686%
(Student's t, pooled s = 0.0290644)
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D99998
The main challenge was handling the different on-disk structures (e.g.
`mach_header` vs `mach_header_64`). I tried to strike a balance between
sprinkling `target->wordSize == 8` checks everywhere (branchy = slow, and ugly)
and templatizing everything (causes code bloat, also ugly). I think I struck a
decent balance by judicious use of type erasure.
Note that LLD-ELF has a similar architecture, though it seems to use more templating.
Linking chromium_framework takes about the same time before and after this
change:
N Min Max Median Avg Stddev
x 20 4.52 4.67 4.595 4.5945 0.044423204
+ 20 4.5 4.71 4.575 4.582 0.056344803
No difference proven at 95.0% confidence
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D99633
This reuses the approach (and some code) from LLD-ELF.
It's a decent win when linking chromium_framework on a Mac Pro (3.2 GHz 16-Core Intel Xeon W):
N Min Max Median Avg Stddev
x 20 4.58 4.83 4.66 4.6685 0.066591844
+ 20 4.42 4.61 4.5 4.505 0.04751731
Difference at 95.0% confidence
-0.1635 +/- 0.0370242
-3.5022% +/- 0.793064%
(Student's t, pooled s = 0.0578462)
The output binary is 381MB.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D99279
Within `lld/macho/`, only `InputFiles.cpp` and `Symbols.h` require the `macho::` namespace qualifier to disambiguate references to `class Symbol`.
Add braces to outer `for` of a 5-level single-line `if`/`for` nest.
Differential Revision: https://reviews.llvm.org/D99555
Pretty simple code-wise. Also threw in some refactoring:
* Put the functionStartSection under Writer instead of InStruct, since
it doesn't need to be accessed outside of Writer
* Adjusted the test to put all files under the temp dir instead of at
the top-level
* Added some CHECK-LABELs to make it clearer where the function starts
data is
Differential Revision: https://reviews.llvm.org/D99112
I added just enough to allow us to see a top-level breakdown of time taken. This
is the result of loading the time-trace output into `chrome:://tracing`:
ef5e8234f3/tracing.png
Reviewed By: oontvoo
Differential Revision: https://reviews.llvm.org/D99311
Move some functions closer to their uses. Move detailed address-assignment logic out of the otherwise abstract `Writer::run()`. This prepares the ground for a diff to implement branch range extension thunks.
* `SyntheticSections.cpp`
** move `needsBinding()` and `prepareBranchTarget()` into `Writer.cpp`
** move `addNonLazyBindingEntries()` adjacent to its use.
* `Writer.cpp`
** move address-assignment logic from `Writer::run()` into new function `Writer::assignAddresses()`
** move `needsBinding()` and `prepareBranchTarget()` from `SyntheticSections.cpp`
* `Target.h`
** remove orphaned decls of `prepareSymbolRelocation()` and `validateRelocationInfo()` which were moved to other files in earlier diffs.
Differential Revision: https://reviews.llvm.org/D98795
This pleases the codesign
(Otherwise it complains about "function starts data out of place")
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D98648
Previously, SyntheticSections.cpp did not have a top-level `using namespace
llvm::MachO` because it caused a naming conflict: `llvm::MachO::Symbol` would
collide with `lld::macho::Symbol`.
`MachO::Symbol` represents the symbols defined in InterfaceFiles (TBDs). By
moving the inclusion of InterfaceFile.h into our .cpp files, we can avoid this
name collision in other files where we are only dealing with LLD's own symbols.
Along the way, I removed all unnecessary "MachO::" prefixes in our code.
Cons of this approach: If TextAPI/MachO/Symbol.h gets included via some other
header file in the future, we could run into this collision again.
Alternative 1: Have either TextAPI/MachO or BinaryFormat/MachO.h use a different
namespace. Most of the benefit of `using namespace llvm::MachO` comes from being
able to use things in BinaryFormat/MachO.h conveniently; if TextAPI was under a
different (and fully-qualified) namespace like `llvm::tapi` that would solve our
problems. Cons: lots of files across llvm-project will need to be updated, and
folks who own the TextAPI code need to agree to the name change.
Alternative 2: Rename our Symbol to something like `LldSymbol`. I think this is
ugly.
Personally I think alternative #1 is ideal, but I'm not sure the effort to do it is
worthwhile, this diff's halfway solution seems good enough to me. Thoughts?
Reviewed By: #lld-macho, oontvoo, MaskRay
Differential Revision: https://reviews.llvm.org/D98149
Pointer and reference induction variables of range-based for loops are often const, and code authors often lax about qualifying them.
Differential Revision: https://reviews.llvm.org/D98317
Add first bits for emitting LC_FUNCTION_STARTS.
This is a recommit of f344dfeb with the adjusted test
which should address build bots breakages.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D97260
Previously, lld/mac only ad-hoc codesigned executables on arm64.
Matches ld64 behavior. Part of PR49443. Fixes 14 of 17 failures when running
check-llvm with lld as host linker on an M1 MBP.
Differential Revision: https://reviews.llvm.org/D97994
Previously, we were loading re-exports without checking whether
they were compatible with our target. Prior to {D97209}, it meant that
we were defining dylib symbols that were invalid -- usually a silent
failure unless our binary actually used them. D97209 exposed this as an
explicit error.
Along the way, I've extended our TAPI compatibility check to cover the
platform as well, instead of just checking the arch. To this end, I've
replaced MachO::Architecture with MachO::Target in our Config struct.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D97867
This adds support for `-undefined dynamic_lookup`, and for
`-undefined warning` and `-undefined suppress` with `-flat_namespace`.
We just replace undefined symbols with a DynamicLookup when we hit them.
With this, `check-llvm` passes when using ld64.lld.darwinnew as host linker.
Differential Revision: https://reviews.llvm.org/D97642
Also add a few asserts to verify that we are indeed handling an
UNSIGNED relocation as the minued. I haven't made it an actual
user-facing error since I don't think llvm-mc is capable of generating
SUBTRACTOR relocations without an associated UNSIGNED.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D97103
Dynamic lookup symbols are symbols that work like dynamic symbols
in ELF: They're not bound to a dylib like normal Mach-O twolevel lookup
symbols, but they live in a global pool and dyld resolves them against
exported symbols from all loaded dylibs.
This adds support for dynamical lookup symbols to lld/mac. They are
represented as DylibSymbols with file set to nullptr.
This also uses this support to implement the -U flag, which makes
a specific symbol that's undefined at the end of the link a
dynamic lookup symbol.
For -U, it'd be sufficient to just to a pass over remaining undefined symbols
at the end of the link and to replace them with dynamic lookup symbols then.
But I'd like to use this code to implement flat_namespace too, and that will
require real support for resolving dynamic lookup symbols in SymbolTable. So
this patch adds this now already.
While writing tests for this, I noticed that we didn't set N_WEAK_DEF in the
symbol table for DylibSymbols, so this fixes that too.
Differential Revision: https://reviews.llvm.org/D97521
I've adjusted the RelocAttrBits to better fit the semantics of
the relocations. In particular:
1. *_UNSIGNED relocations are no longer marked with the `TLV` bit, even
though they can occur within TLV sections. Instead the `TLV` bit is
reserved for relocations that can reference thread-local symbols, and
*_UNSIGNED relocations have their own `UNSIGNED` bit. The previous
implementation caused TLV and regular UNSIGNED semantics to be
conflated, resulting in rebase opcodes being incorrectly emitted for TLV
relocations.
2. I've added a new `POINTER` bit to denote non-relaxable GOT
relocations. This distinction isn't important on x86 -- the GOT
relocations there are either relaxable or non-relaxable loads -- but
arm64 has `GOT_LOAD_PAGE21` which loads the page that the referent
symbol is in (regardless of whether the symbol ends up in the GOT). This
relocation must reference a GOT symbol (so must have the `GOT` bit set)
but isn't itself relaxable (so must not have the `LOAD` bit). The
`POINTER` bit is used for relocations that *must* reference a GOT
slot.
3. A similar situation occurs for TLV relocations.
4. ld64 supports both a pcrel and an absolute version of
ARM64_RELOC_POINTER_TO_GOT. But the semantics of the absolute version
are pretty weird -- it results in the value of the GOT slot being
written, rather than the address. (That means a reference to a
dynamically-bound slot will result in zeroes being written.) The
programs I've tried linking don't use this form of the relocation, so
I've dropped our partial support for it by removing the relevant
RelocAttrBits.
Reviewed By: alexshap
Differential Revision: https://reviews.llvm.org/D97031
Differential Revision: https://reviews.llvm.org/D95913
Usage: -bundle_loader <executable>
This option specifies the executable that will load the build output file being linked.
When building a bundle, users can use the --bundle_loader to specify an executable
that contains symbols referenced, but not implemented in the bundle.
This is an initial base commit for ARM64 target arch support. I don't represent that it complete or bug-free, but wish to put it out for review now that some basic things like branch target & load/store address relocs are working.
I can add more tests to this base commit, or add them in follow-up commits.
It is not entirely clear whether I use the "ARM64" (Apple) or "AArch64" (non-Apple) naming convention. Guidance is appreciated.
Differential Revision: https://reviews.llvm.org/D88629
Note that there is a triple indirection involved with
personalities and compact unwind:
1. Two bits of each CU encoding are used as an offset into the
personality array.
2. Each entry of the personality array is an offset from the image base.
The resulting address (after adding the image base) should point within the
GOT.
3. The corresponding GOT entry contains the actual pointer to the
personality function.
To further complicate things, when the personality function is in the
object file (as opposed to a dylib), its references in
`__compact_unwind` may refer to it via a section + offset relocation
instead of a symbol relocation. Since our GOT implementation can only
create entries for symbols, we have to create a synthetic symbol at the
given section offset.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D95809
The Mach kernel & codesign on arm64 macOS has strict requirements for alignment and sequence of segments and sections. Dyld probably is just as picky, though kernel & codesign reject malformed Mach-O files before dyld ever has a chance.
I developed this diff by incrementally changing alignments & sequences to match the output of ld64. I stopped when my hello-world test program started working: `codesign --verify` succeded, and `execve(2)` didn't immediately fail with `errno == EBADMACHO` = `"Malformed Mach-O file"`.
Differential Revision: https://reviews.llvm.org/D94935
This makes our error messages more informative. But the bigger motivation is for
LTO symbol resolution, which will be in an upcoming diff. The changes in this
one are largely mechanical.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D94316
Add per-reloc-type attribute bits and migrate code from per-target file into target independent code, driven by reloc attributes.
Many cleanups
Differential Revision: https://reviews.llvm.org/D95121
We were mishandling the case where both `__tbss` and `__thread_data` sections were
present.
TLVP relocations should be encoded as offsets from the start of `__thread_data`,
even if the symbol is actually located in `__thread_bss`. Previously, we were
writing the offset from the start of the containing section, which doesn't
really make sense since there's no way `tlv_get_addr()` can know which section a
given `tlv$init` symbol is in at runtime.
In addition, this patch ensures that we place `__thread_data` immediately before
`__thread_bss`. This is what ld64 does, likely for performance reasons. Zerofill
sections must also be at the end of their segments; we were already doing this,
but now we ensure that `__thread_bss` occurs before `__bss`, so that it's always
possible to have it contiguous with `__thread_data`.
Fixes llvm.org/PR48657.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D94329
Also remove iteration over ArchiveFile symbols in buildInputSectionPriorities --
that was rendered unnecessary after D92539, which included ObjFiles from
ArchiveFiles inside the `inputFiles` vector.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D93569
TREATMENT can be `error`, `warning`, `suppress`, or `dynamic_lookup`
The `dymanic_lookup` remains unimplemented for now.
Differential Revision: https://reviews.llvm.org/D93263
Note that dylibs without *any* refs will still be loaded in the usual
(strong) fashion.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D93435
This was causing a crash as we were attempting to look up the
nonexistent parent OutputSection of the debug sections. We didn't detect
it earlier because there was no test for PIEs with debug info (PIEs
require us to emit rebases for X86_64_RELOC_UNSIGNED).
This diff filters out the debug sections while loading the ObjFiles. In
addition to fixing the above problem, it also lets us avoid doing
redundant work -- we no longer parse / apply relocations / attempt to
emit dyld opcodes for these sections that we don't emit.
Fixes llvm.org/PR48392.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D92904
15% faster for linking Chromium's base_unittests.txt, according to ministat:
```
N Min Max Median Avg Stddev
x 10 0.650213 0.69287586 0.65793395 0.66127126 0.012365407
+ 10 0.54993701 0.59006906 0.55885506 0.56146643 0.013215349
Difference at 95.0% confidence
-0.0998048 +/- 0.0120244
-15.0929% +/- 1.81838%
(Student's t, pooled s = 0.0127974)
```
And matches what we do on the other ports.
Differential Revision: https://reviews.llvm.org/D92736
Also, for .o files, include full path as given on link command line.
Before:
lld: error: undefined symbol [...], referenced from sandbox_logging.o
After:
lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o)
Move archiveName up to InputFile so we can consistently use toString()
to print InputFiles in diags, and pass it to the ObjFile ctor. This
matches the ELF and COFF ports.
Differential Revision: https://reviews.llvm.org/D92437
Symbols of the same type must be laid out contiguously: following ld64's
lead, we choose to emit all local symbols first, then external symbols,
and finally undefined symbols. For each symbol type, the LC_DYSYMTAB
load command will record the range (start index and total number) of
those symbols in the symbol table.
This work was motivated by the fact that LLDB won't search for debug
info if LC_DYSYMTAB says there are no local symbols (since STABS symbols
are all local symbols). With this change, LLDB is now able to display
the source lines at a given breakpoint when debugging our binaries.
Some tests had to be updated due to local symbol names now appearing in
`llvm-objdump`'s output.
Reviewed By: #lld-macho, smeenai, clayborg
Differential Revision: https://reviews.llvm.org/D89285