Commit Graph

546 Commits

Author SHA1 Message Date
Vy Nguyen 6c1ae8f2dc [lld-macho][nfc] Fixed typo in comment
Missed this one from https://reviews.llvm.org/D97007?id=331759#inline-930034

Differential Revision: https://reviews.llvm.org/D98973
2021-03-19 14:19:36 -04:00
Vy Nguyen 66f340051a [lld-macho] Define __mh_*_header synthetic symbols.
Bug: https://bugs.llvm.org/show_bug.cgi?id=49290

    Differential Revision: https://reviews.llvm.org/D97007
2021-03-19 14:14:40 -04:00
caoming.roy ed8bff13dc [lld-macho] implement options -map
Implement command-line options -map

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D98323
2021-03-18 10:39:19 -04:00
Greg McGary 74b888baad [lld-macho][NFC] Minor refactor of Writer::run()
Move some functions closer to their uses. Move detailed address-assignment logic out of the otherwise abstract `Writer::run()`. This prepares the ground for a diff to implement branch range extension thunks.

* `SyntheticSections.cpp`
 ** move `needsBinding()` and `prepareBranchTarget()` into `Writer.cpp`
 ** move `addNonLazyBindingEntries()` adjacent to its use.

* `Writer.cpp`
 ** move address-assignment logic from `Writer::run()` into new function `Writer::assignAddresses()`
 ** move `needsBinding()` and `prepareBranchTarget()` from `SyntheticSections.cpp`

* `Target.h`
** remove orphaned decls of `prepareSymbolRelocation()` and `validateRelocationInfo()` which were moved to other files in earlier diffs.

Differential Revision: https://reviews.llvm.org/D98795
2021-03-17 15:13:43 -07:00
Greg McGary a170533632 [lld-macho][NFC] Drop unnecessary braces around simple if/for bodies
Minor cleanup

Differential Revision: https://reviews.llvm.org/D98758
2021-03-16 22:39:39 -07:00
Greg McGary db1e845a96 [lld-macho] Handle error cases properly for -exported_symbol(s_list)
This fixes defects in D98223 [lld-macho] implement options -(un)exported_symbol(s_list):
* disallow export of hidden symbols
* verify that whitelisted literal names are defined in the symbol table
* reflect export-status overrides in `nlist` attribute of `N_EXT` or `N_PEXT`

Thanks to @thakis for raising these issues

Differential Revision: https://reviews.llvm.org/D98381
2021-03-16 21:20:39 -07:00
Jez Ng 29d4676059 [lld-macho] Place LC_FUNCTION_STARTS data at the right position
This pleases the codesign

(Otherwise it complains about "function starts data out of place")

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D98648
2021-03-15 14:56:31 -04:00
Jez Ng 04eec6f881 [lld-macho][nfc] Move list of section names into InputSection.h
They were previously in SyntheticSections.h, but now there are
a bunch of non-synthetic section names in the list.

Also renamed `__functionStarts` to `__func_starts` for uniformity with
other section names + keeps the name under 16 characters (in case we ever
want to write it out as a real section).

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D98586
2021-03-13 17:41:50 -05:00
Jez Ng 38a6374564 [lld-macho] Only codesign by default on arm64 macOS
instead of doing it on all arm64 platforms.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D98446
2021-03-12 17:26:27 -05:00
Jez Ng d8283d9ddc [lld-macho][nfc] Give every SyntheticSection a fake InputSection
Previously, it was difficult to write code that handled both synthetic
and regular sections generically. We solve this problem by creating a
fake InputSection at the start of every SyntheticSection.

This refactor allows us to handle DSOHandle like a regular Defined
symbol (since Defined symbols must be attached to an InputSection), and
paves the way for supporting `__mh_*header` symbols. Additionally, it
simplifies our binding/rebase code.

I did have to extend Defined a little -- it now has a `linkerInternal`
flag, to indicate that `___dso_handle` should not be in the final symbol
table.

I've also added some additional testing for `___dso_handle`.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D98545
2021-03-12 17:26:27 -05:00
Jez Ng dc8bee9265 [lld-macho] Check address ranges when applying relocations
This diff required fixing `getEmbeddedAddend` to apply sign
extension to 32-bit values. We were previously passing around wrong
64-bit addend values that became "right" after being truncated back to
32-bit.

I've also made `getEmbeddedAddend` return a signed int, which is similar
to what LLD-ELF does for its `getImplicitAddend`.

`reportRangeError`, `checkUInt`, and `checkInt` are counterparts of similar
functions in LLD-ELF.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D98387
2021-03-12 17:26:27 -05:00
Jez Ng 29bbbd06fe [lld-macho] Unbreak build breakage from rG1752f2850685 2021-03-11 13:35:13 -05:00
Jez Ng a723db92d8 [lld-macho][nfc] Refactor subtractor reloc handling
SUBTRACTOR relocations are always paired with UNSIGNED
relocations to indicate a pair of symbols whose address difference we
want. Functionally they are like a single relocation: only one pointer
gets written / relocated. Previously, we would handle these pairs by
skipping over the SUBTRACTOR relocation and writing the pointer when
handling the UNSIGNED reloc. This diff reverses things, so we write
while handling SUBTRACTORs and skip over the UNSIGNED relocs instead.

Being able to distinguish between SUBTRACTOR and UNSIGNED relocs in the
write phase (i.e. inside `relocateOne`) is useful for the upcoming range
check diff: we want to check that SUBTRACTOR relocs write signed values,
but UNSIGNED relocs (naturally) write unsigned values.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D98386
2021-03-11 13:28:13 -05:00
Jez Ng e8a3058303 [lld-macho] Fix handling of X86_64_RELOC_SIGNED_{1,2,4}
The previous implementation miscalculated the addend, resulting
in an underflow. This meant that every SIGNED_N section relocation would
be associated with the last subsection (since the addend would now be a
huge number). We were "lucky" that this mistake was typically cancelled
out -- 64-to-32-bit-truncation meant that the final value was correct,
as long as subsections were not rearranged.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D98385
2021-03-11 13:28:11 -05:00
Jez Ng 5433a79176 [lld-macho][nfc] Create Relocations.{h,cpp} for relocation-specific code
This more closely mirrors the structure of lld-ELF.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D98384
2021-03-11 13:28:09 -05:00
Jez Ng 1752f28506 [lld-macho][nfc] Remove `MachO::` prefix where possible
Previously, SyntheticSections.cpp did not have a top-level `using namespace
llvm::MachO` because it caused a naming conflict: `llvm::MachO::Symbol` would
collide with `lld::macho::Symbol`.

`MachO::Symbol` represents the symbols defined in InterfaceFiles (TBDs). By
moving the inclusion of InterfaceFile.h into our .cpp files, we can avoid this
name collision in other files where we are only dealing with LLD's own symbols.

Along the way, I removed all unnecessary "MachO::" prefixes in our code.

Cons of this approach: If TextAPI/MachO/Symbol.h gets included via some other
header file in the future, we could run into this collision again.

Alternative 1: Have either TextAPI/MachO or BinaryFormat/MachO.h use a different
namespace. Most of the benefit of `using namespace llvm::MachO` comes from being
able to use things in BinaryFormat/MachO.h conveniently; if TextAPI was under a
different (and fully-qualified) namespace like `llvm::tapi` that would solve our
problems. Cons: lots of files across llvm-project will need to be updated, and
folks who own the TextAPI code need to agree to the name change.

Alternative 2: Rename our Symbol to something like `LldSymbol`. I think this is
ugly.

Personally I think alternative #1 is ideal, but I'm not sure the effort to do it is
worthwhile, this diff's halfway solution seems good enough to me. Thoughts?

Reviewed By: #lld-macho, oontvoo, MaskRay

Differential Revision: https://reviews.llvm.org/D98149
2021-03-11 13:28:08 -05:00
Thorsten Schütt 50c1b21851 [lld-macho] minimal TimeTrace support
This is the minimal port from ELF. Any extension should easy from here

Test plan: ninja check-all-macho

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D98419
2021-03-11 15:30:45 +01:00
Greg McGary 98fe9e41f7 [lld-macho][NFC] add const to pointer/reference induction variables of range-based for loops
Pointer and reference induction variables of range-based for loops are often const, and code authors often lax about qualifying them.

Differential Revision: https://reviews.llvm.org/D98317
2021-03-10 12:07:31 -08:00
Nico Weber 6e92f468c8 [lld/mac] warn on -install_name without -dylib
The flag doesn't (and shouldn't) have an effect in that case.
ld64 doesn't warn on this, but it seems like a good thing to do.
If it causes problems in practice for some reason, we can revert it.

Also add a dedicated test for install_name.

Differential Revision: https://reviews.llvm.org/D98259
2021-03-10 09:05:44 -05:00
Nico Weber 1aafaaca67 [lld/mac] Implement support for -mark_dead_strippable_dylib
lld doesn't read MH_DEAD_STRIPPABLE_DYLIB to strip dead dylibs yet,
but now it can produce dylibs with it set.

While here, also switch an existing test that looks only at the main Mach-O
header from --all-headers to --private-header.

Differential Revision: https://reviews.llvm.org/D98262
2021-03-10 08:57:46 -05:00
Greg McGary 714ec86c02 [lld-macho][NFC] drop opt:: when already using llvm::opt
Top-level `using llvm::opt` has been present in `lld/MachO/Driver*.cpp` for some time, so remove lingering `opt::` prefixes.

Differential Revision: https://reviews.llvm.org/D98314
2021-03-09 22:08:32 -08:00
Greg McGary fdc0c21973 [lld-macho][NFC] when reasonable, replace auto keyword with type names
lld policy discourages `auto`. Replace it with a type name whenever reasonable. Retain `auto` to avoid ...
* redundancy, as for decls such as `auto *t = mumble_cast<TYPE *>` or similar that specifies the result type on the RHS
* verbosity, as for iterators
* gratuitous suffering, as for lambdas

Along the way, add `const` when appropriate.

Note: a future diff will ...
* add more `const` qualifiers
* remove `opt::` when we are already `using llvm::opt`

Differential Revision: https://reviews.llvm.org/D98313
2021-03-09 22:08:32 -08:00
Greg McGary 06c4aadeb6 [lld-macho] implement options -(un)exported_symbol(s_list)
Implement command-line options to alter a dylib's exported-symbols list:
* `-exported_symbol*` options override the default export list. The export list is compiled according to the command-line option(s) only.
* `-unexported_symbol*` options hide otherwise public symbols.
* `-*exported_symbol PATTERN` options specify a single literal or glob pattern.
* `-*exported_symbols_list FILE` options specify a file containing a series of lines containing symbol literals or glob patterns. Whitespace and `#`-prefix comments are stripped.

Note: This is a simple implementation of the primary use case. ld64 has much more complexity surrounding interactions with other options, many of which are obscure and undocumented. We will start simple and complexity as necessary.

Differential Revision: https://reviews.llvm.org/D98223
2021-03-09 18:43:39 -08:00
Alexander Shaposhnikov 9afdd3607a [lld][MachO] Add support for LC_FUNCTION_STARTS
Add first bits for emitting LC_FUNCTION_STARTS.

This is a recommit of f344dfeb with the adjusted test
which should address build bots breakages.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D97260
2021-03-08 22:08:36 -08:00
Alexander Shaposhnikov 1b0819e325 Revert "[lld][MachO] Add support for LC_FUNCTION_STARTS"
This reverts commit f344dfebdb.
2021-03-08 21:10:10 -08:00
Alexander Shaposhnikov f344dfebdb [lld][MachO] Add support for LC_FUNCTION_STARTS
Add first bits for emitting LC_FUNCTION_STARTS.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D97260
2021-03-08 20:42:24 -08:00
Vy Nguyen 70c0dbf151 [lld-macho][NFC] Replace config param with a global in hasCompatVersion() helper.
Differential Revision: https://reviews.llvm.org/D98115
2021-03-06 11:32:51 -05:00
Jez Ng 97c91a43dc [lld-macho] Move a bunch of options into the "obsolete" category
These are mostly things that ld64 has itself marked obsolete.
In the case of `-sectorder`, it's suggested in ld64's manpage that it
could be deprecated, so let's skip implementing it for now.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D98066
2021-03-05 18:42:22 -05:00
Vy Nguyen fc5d804ddb [lld-macho] Check platform and version when constructor ObjFile
Differential Revision: https://reviews.llvm.org/D97979
2021-03-05 17:34:38 -05:00
Jez Ng 3c19b4f34d [lld-macho] Skip over symbols in un-parsed debug info sections
clang appears to emit symbols in `__debug_aranges`, at least
for arm64... in the examples I've seen, it doesn't seem like those
symbols are referenced outside of `__DWARF`, so I think they're safe to
ignore. But hopefully @clayborg can confirm.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D98073
2021-03-05 17:24:32 -05:00
Jez Ng fc011b5eb1 [lld-macho] Replace debug-info-related assert with FIXME
We'll need to properly handle object files with multiple source inputs
eventually, but remove the assert for now so we can successfully emit binaries
for testing.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D98067
2021-03-05 17:24:31 -05:00
Nico Weber 7d26916859 [lld/mac] tweak comment based on feedback on D98053 2021-03-05 16:28:38 -05:00
Nico Weber 210cc0738b [mac/lld] Fix scale computation for vector ops in PAGEOFF12 relocations
With this, llvm-tblgen no longer tries and fails to allocate 7953 petabyte
when it runs during the build. Instead, `check-llvm` with lld/mac as host
linker now completes without any failures on an m1 mac.

This vector op handling code matches what happens in:
- ld64's OutputFile::applyFixUps() in OutputFile.cpp for kindStoreARM64PageOff12
- lld.ld64.darwinold's offset12KindFromInstruction() in
  lld/lib/ReaderWriter/MachO/ArchHandler_arm64.cpp for offset12scale16
- RuntimeDyld's decodeAddend() in
  llvm/lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldMachOAArch64.h for
  ARM64_RELOC_PAGEOFF12

Fixes PR49444.

Differential Revision: https://reviews.llvm.org/D98053
2021-03-05 12:24:37 -05:00
Nico Weber 90085d9286 [lld/mac] fix clang-format violation from 0e319bd0be 2021-03-05 12:12:56 -05:00
Nico Weber 0e319bd0be [lld/mac] ad-hoc sign dylibs and bundles on arm64 by default, support -(no_)adhoc_codesign flags
Previously, lld/mac only ad-hoc codesigned executables on arm64.

Matches ld64 behavior. Part of PR49443. Fixes 14 of 17 failures when running
check-llvm with lld as host linker on an M1 MBP.

Differential Revision: https://reviews.llvm.org/D97994
2021-03-05 09:12:34 -05:00
Jez Ng 0d4dadc64c [lld-macho] Include install name in error messages for dylibs from TBDs
Since multiple dylibs can be defined in one TBD, this is
necessary to avoid confusion.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D97905
2021-03-04 14:36:49 -05:00
Jez Ng 55a32812fa [lld-macho] Filter TAPI re-exports by target
Previously, we were loading re-exports without checking whether
they were compatible with our target. Prior to {D97209}, it meant that
we were defining dylib symbols that were invalid -- usually a silent
failure unless our binary actually used them. D97209 exposed this as an
explicit error.

Along the way, I've extended our TAPI compatibility check to cover the
platform as well, instead of just checking the arch. To this end, I've
replaced MachO::Architecture with MachO::Target in our Config struct.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D97867
2021-03-04 14:36:47 -05:00
Jez Ng 8601be809e [lld-macho] Fix & fold reexport-nested-libs test into stub-link.s
The reexport-nested-libs test added in D97438 was a bit wonky.

First, it was linking against libReexportSystem.tbd which targets the
iOS simulator, and which in turn attempted to re-export the iOS
simulator's libSystem. However, due to the way `-syslibroot` works, it
was actually re-exporting the macOS libSystem.

As a result, the test was not actually able to resolve the symbols in
the desired libSystem. I'm guessing that @oontvoo was confused by this
and therefore included those symbols in libReexportSystem.tbd itself.
But this means that the test wasn't actually testing the resolution of
re-exported symbols (though it did at least verify that the re-exported
libraries could be located).

After some consideration, I figured that stub-link.s could be extended
to cover what reexport-nested-libs.s was attempting to do. The test
targets macOS, so we only have one `-syslibroot` and no chance of
confusion.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D97866
2021-03-04 14:36:46 -05:00
Jez Ng 5d9aafc09a [lld-macho] Bind re-exported symbols directly to implicitly-linked umbrellas
Suppose we are linking against libFoo, which re-exports the
implicitly-bound libSystem, which in turn re-exports some
non-explicitly-bound library like `/usr/lib/system/libsystem_c.dylib`.
Then any bindings we have to a symbol in libsystem_c should use
libSystem (and not libFoo) as the umbrella library.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D97865
2021-03-04 14:36:44 -05:00
Jez Ng b63919e180 [lld-macho] Require -arch and -platform_version to always be specified
We previously defaulted to x86_64 and an unknown platform, which was fine when
we only supported one arch and did no platform checks, but that will no longer
be true going ahead. Therefore, we should require those flags to be specified
whenever the linker is invoked.

Note that LLD-ELF and ld64 both infer the arch from their input object files,
but the usefulness of that is questionable since clang will always specify these
flags, and most of the time `lld` will be invoked via clang.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D97799
2021-03-03 15:52:10 -05:00
Jez Ng 1168736c66 [lld-macho][nfc] Parse more options using getLastArg{Value}
The option-iterating loop should be reserved for options whose command-line
order is important. I think LLD-ELF follows a similar design.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D97797
2021-03-03 15:52:06 -05:00
Mikael Holmen 85b67d5fa9 [lld][MachO] Silence "enumeral and non-enumeral type" warning from gcc
gcc complained with

[1110/1140] Building CXX object tools/lld/MachO/CMakeFiles/lldMachO2.dir/SyntheticSections.cpp.o
../../lld/MachO/SyntheticSections.cpp: In function 'int16_t ordinalForDylibSymbol(const lld::macho::DylibSymbol&)':
../../lld/MachO/SyntheticSections.cpp:287:14: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
  286 |   return config->namespaceKind == NamespaceKind::flat || dysym.isDynamicLookup()
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  287 |              ? MachO::BIND_SPECIAL_DYLIB_FLAT_LOOKUP
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  288 |              : dysym.getFile()->ordinal;
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-03-03 10:39:35 +01:00
Greg McGary 4af1522a85 [lld-macho] Rework length check when opening input files
This reverts diff D97610 (commit 0223ab035c) and adds a one-line fix to verify that a `MemoryBufferRef` has sufficient length before reading a 4-byte magic number.

Differential Revision: https://reviews.llvm.org/D97757
2021-03-02 13:00:57 -08:00
Vy Nguyen 9a2e2de15f [lld-macho] Change loadReexport to handle the case where a TAPI re-exports to reference documents nested within other TBD.
Currently, it was delibrately impleneted to not handle this case, but as it has turnt out, we need this feature.
The concrete use case is
       `System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa` reexports
               /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit , which then rexports
                    /System/Library/PrivateFrameworks/UIFoundation.framework/Versions/A/UIFoundation

The current implemention uses a global currentTopLevelTapi, which is not reset until it finishes loading the whole tree.
This is a problem because if the top-level is set to Cocoa, then when we get to UIFoundation, it will try to find UIFoundation in the current top level, which is Cocoa and will not find it.

The right thing should be:
 - When loading a library from a TBD file, re-exports need to be looked up in the auxiliary documents within the same TBD.
 - When loading from an actual dylib, no additional TBD documents need to be examined.
 - In no case does a re-export mentioned in one TBD file need to be looked up in a document in an auxiliary document from a different TBD file

Differential Revision: https://reviews.llvm.org/D97438
2021-03-02 12:14:31 -05:00
Nico Weber 0658fc654c [lld/mac] Implement the missing bits of -undefined
This adds support for `-undefined dynamic_lookup`, and for
`-undefined warning` and `-undefined suppress` with `-flat_namespace`.

We just replace undefined symbols with a DynamicLookup when we hit them.

With this, `check-llvm` passes when using ld64.lld.darwinnew as host linker.

Differential Revision: https://reviews.llvm.org/D97642
2021-03-01 15:30:53 -05:00
Nico Weber 8174f33dc9 [lld/mac] Add support for -flat_namespace
-flat_namespace makes lld emit binaries that use name lookup that's more in
line with other POSIX systems: Instead of looking up symbols as (dylib,name)
pairs by dyld, they're instead looked up just by name.

-flat_namespace has three effects:

1. MH_TWOLEVEL and MH_NNOUNDEFS are no longer set in the Mach-O header
2. All symbols use BIND_SPECIAL_DYLIB_FLAT_LOOKUP as ordinal
3. When a dylib is added to the link, its dependent dylibs are also added,
   so that lld can verify that no undefined symbols remain at the end of
   a link with -flat_namespace. These transitive dylibs are added for symbol
   resolution, but they are not emitted in LC_LOAD_COMMANDs.

-undefined with -flat_namespace still isn't implemented. Before this change,
it was impossible to hit that combination because -flat_namespace caused a
diagnostic. Now that it no longer does, emit a dedicated temporary diagnostic
when both flags are used.

Differential Revision: https://reviews.llvm.org/D97641
2021-03-01 15:25:10 -05:00
Nico Weber ab45289d2e [lld/mac] Make -v print version and search paths in additon to linking, not instead of linking
This matches ld64's behavior.

Differential Revision: https://reviews.llvm.org/D97718
2021-03-01 15:09:46 -05:00
Nico Weber bacacb9d5c [lld/mac] Prefix errors with "ld64.lld" instead of just "lld"
Matches the ELF and COFF ports, which use ld.lld and lld-link, respectively.

While here, also move up `cleanupCallback` to match ELF / COFF.

Differential Revision: https://reviews.llvm.org/D97715
2021-03-01 15:08:19 -05:00
Jez Ng f083f652c3 [lld-macho][nfc] Remove TODO regarding addends
There was initially some concern around the correct handling of pcrel
section relocations with r_length != 2. But it looks like there are no such
relocations in practice -- x86_64's pcrel section relocs all have r_length == 2,
and ARM64 doesn't even have pcrel section relocs. So we can replace the TODO
with an assert.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D97576
2021-03-01 12:30:08 -05:00
Nico Weber 860e862f34 [lld/mac] Simplify encodeDylibOrdinal() a bit
Only one of the two callers used the lastBinding parameter, so
do that work at that one call site. Extract a ordinalForDylibSymbol()
helper to make this tidy.

No behavior change.

Differential Revision: https://reviews.llvm.org/D97597
2021-02-28 09:16:45 -05:00
Greg McGary 0223ab035c [lld-macho] check minimum header length when opening linkable input files
Bifurcate the `readFile()` API into ...
* `readRawFile()` which performs no checks, and
* `readLinkableFile()` which enforces minimum length of 20 bytes, same as ld64

There are no new tests because tweaks to existing tests are sufficient.

Differential Revision: https://reviews.llvm.org/D97610
2021-02-27 14:41:40 -08:00
Greg McGary 6f9dd843db [lld-macho] Implement options -rename_section -rename_segment
Implement command-line options to rename output sections & segments.

Differential Revision: https://reviews.llvm.org/D97600
2021-02-27 11:44:12 -08:00
Jez Ng 82b3da6f6f [lld-macho] Extract embedded addends for arm64 UNSIGNED relocations
On arm64, UNSIGNED relocs are the only ones that use embedded addends
instead of the ADDEND relocation.

Also ensure that the addend works when UNSIGNED is part of a SUBTRACTOR
pair.

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D97105
2021-02-27 12:31:34 -05:00
Jez Ng 541390131e [lld-macho] Don't emit rebase opcodes for subtractor minuend relocs
Also add a few asserts to verify that we are indeed handling an
UNSIGNED relocation as the minued. I haven't made it an actual
user-facing error since I don't think llvm-mc is capable of generating
SUBTRACTOR relocations without an associated UNSIGNED.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D97103
2021-02-27 12:31:34 -05:00
Jez Ng cc5c03e109 [lld-macho] Properly test subtractor relocations & fix their attributes
`llvm-mc` doesn't generate any relocations for subtractions
between local symbols -- they must be global -- so the previous test
wasn't actually testing any relocation logic. I've fixed that and
extended the test to cover r_length=3 relocations as well as both x86_64
and arm64.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D97057
2021-02-27 12:31:34 -05:00
Nico Weber cafb6cd10c [lld/mac] Add some support for dynamic lookup symbols, and implement -U
Dynamic lookup symbols are symbols that work like dynamic symbols
in ELF: They're not bound to a dylib like normal Mach-O twolevel lookup
symbols, but they live in a global pool and dyld resolves them against
exported symbols from all loaded dylibs.

This adds support for dynamical lookup symbols to lld/mac. They are
represented as DylibSymbols with file set to nullptr.

This also uses this support to implement the -U flag, which makes
a specific symbol that's undefined at the end of the link a
dynamic lookup symbol.

For -U, it'd be sufficient to just to a pass over remaining undefined symbols
at the end of the link and to replace them with dynamic lookup symbols then.
But I'd like to use this code to implement flat_namespace too, and that will
require real support for resolving dynamic lookup symbols in SymbolTable. So
this patch adds this now already.

While writing tests for this, I noticed that we didn't set N_WEAK_DEF in the
symbol table for DylibSymbols, so this fixes that too.

Differential Revision: https://reviews.llvm.org/D97521
2021-02-26 16:50:53 -05:00
Jez Ng 84579fc24f [lld-macho] Basic support for linkage and visibility attributes in LTO
When parsing bitcode, convert LTO Symbols to LLD Symbols in order to perform
resolution. The "winning" symbol will then be marked as Prevailing at LTO
compilation time. This is similar to what the other LLD ports do.

This change allows us to handle `linkonce` symbols correctly, and to deal with
duplicate bitcode symbols gracefully. Previously, both scenarios would result in
an assertion failure inside the LTO code, complaining that multiple Prevailing
definitions are not allowed.

While at it, I also added basic logic around visibility. We don't do anything
useful with it yet, but we do check that its value is valid. LLD-ELF appears to
use it only to set FinalDefinitionInLinkageUnit for LTO, which I think is just a
performance optimization.

From my local experimentation, the linker itself doesn't seem to do anything
differently when encountering linkonce / linkonce_odr / weak / weak_odr. So I've
only written a test for one of them. LLD-ELF has more, but they seem to mostly
be testing the intermediate bitcode output of their LTO backend...? I'm far from
an expert here though, so I might very well be missing things.

Reviewed By: #lld-macho, MaskRay, smeenai

Differential Revision: https://reviews.llvm.org/D94342
2021-02-25 13:27:40 -05:00
Greg McGary 151990dd94 [lld-macho] add code signature for native arm64 macOS
Differential Revision: https://reviews.llvm.org/D96164
2021-02-24 17:05:23 -08:00
Jez Ng 4a5e111aea [lld-macho] Better deduplication of personality pointers
{D95809} introduced a mechanism for synthetic symbol creation of personality
pointers. When multiple section relocations referred to the same personality
pointer, it would deduplicate them. However, it neglected to consider that we
could have symbol relocations that also refer to the same personality pointer.
This diff fixes it.

In practice, this mix of relocations arises when there is a statically-linked
personality routine that is referenced from multiple object files. Within the
same object file, it will be referred to via section relocations, but
(obviously) other object files will refer to it via symbol relocations. Failing
to deduplicate these references resulted in us going over the
3-personality-pointer limit when linking some larger applications.

Fixes llvm.org/PR48389.

Reviewed By: #lld-macho, thakis, alexshap

Differential Revision: https://reviews.llvm.org/D97245
2021-02-23 22:02:38 -05:00
Jez Ng 4752cdc9a2 [lld-macho] Check for arch compatibility when loading ObjFiles and TBDs
The silent failures had confused me a few times.

I haven't added a similar check for platform yet as we don't yet have logic to
infer the platform automatically, and so adding that check would require
updating dozens of test files.

Reviewed By: #lld-macho, thakis, alexshap

Differential Revision: https://reviews.llvm.org/D97209
2021-02-23 22:02:38 -05:00
Jez Ng 5e851733c5 [lld-macho] Fix semantics & add tests for ARM64 GOT/TLV relocs
I've adjusted the RelocAttrBits to better fit the semantics of
the relocations. In particular:

1. *_UNSIGNED relocations are no longer marked with the `TLV` bit, even
   though they can occur within TLV sections. Instead the `TLV` bit is
   reserved for relocations that can reference thread-local symbols, and
   *_UNSIGNED relocations have their own `UNSIGNED` bit. The previous
   implementation caused TLV and regular UNSIGNED semantics to be
   conflated, resulting in rebase opcodes being incorrectly emitted for TLV
   relocations.

2. I've added a new `POINTER` bit to denote non-relaxable GOT
   relocations. This distinction isn't important on x86 -- the GOT
   relocations there are either relaxable or non-relaxable loads -- but
   arm64 has `GOT_LOAD_PAGE21` which loads the page that the referent
   symbol is in (regardless of whether the symbol ends up in the GOT). This
   relocation must reference a GOT symbol (so must have the `GOT` bit set)
   but isn't itself relaxable (so must not have the `LOAD` bit). The
   `POINTER` bit is used for relocations that *must* reference a GOT
   slot.

3. A similar situation occurs for TLV relocations.

4. ld64 supports both a pcrel and an absolute version of
   ARM64_RELOC_POINTER_TO_GOT. But the semantics of the absolute version
   are pretty weird -- it results in the value of the GOT slot being
   written, rather than the address. (That means a reference to a
   dynamically-bound slot will result in zeroes being written.) The
   programs I've tried linking don't use this form of the relocation, so
   I've dropped our partial support for it by removing the relevant
   RelocAttrBits.

Reviewed By: alexshap

Differential Revision: https://reviews.llvm.org/D97031
2021-02-23 22:02:38 -05:00
Jez Ng e5d780e049 [lld-macho] Use full input file name in invalid relocation error message
Just something I noticed while debugging arm relocations...

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D97078
2021-02-23 22:02:38 -05:00
Vy Nguyen 5a856f5b44 Reland [lld-macho]Implement bundle_loader
Reland 1a0afcf518
  https://reviews.llvm.org/D95913

New change: fix UB bug caused by copying empty path/name. (since the executable does not have a name)
2021-02-22 14:05:12 -05:00
Jez Ng 8c4638c367 [lld-macho] Clean up comments 2021-02-22 12:13:17 -05:00
Jez Ng 5bfdbdeb40 [lld-macho] Fix cpuSubtype for non-x86_64 archs
dyld on iOS will complain if the LIB64 bit is set.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D96565
2021-02-22 12:13:17 -05:00
Nico Weber 28d9953af9 [lld/mac] reject -undefined warning and -undefined suppress with -twolevel_namespace
See discussion on https://reviews.llvm.org/D93263

-flat_namespace isn't implemented yet, and neither is -undefined dynamic,
so this makes -undefined pretty pointless in lld/MachO for now. But once
we implement -flat_namespace (which we need to do anyways to get check-llvm
to pass with lld as host linker), the code's already there.

Follow-up to https://reviews.llvm.org/D93263#2491865

Differential Revision: https://reviews.llvm.org/D96963
2021-02-20 13:35:22 -05:00
Vitaly Buka c17547df44 Revert "Implement -bundle_loader"
D95913 passes null pointer into memcpy

This reverts commit 1a0afcf518.
2021-02-19 17:40:07 -08:00
Vy Nguyen 1a0afcf518 Implement -bundle_loader
Differential Revision: https://reviews.llvm.org/D95913

Usage: -bundle_loader <executable>
This option specifies the executable that will load the build output file being linked.
When building a bundle, users can use the --bundle_loader  to specify an executable
that contains symbols referenced, but not implemented in the bundle.
2021-02-18 16:11:37 -05:00
Mikael Holmen caff023b77 [lld] Silence compiler warnings by removing always true/false comparisons
type is an uint8_t so
 type >= 0
is always true and
 type < 0
is always false.
2021-02-17 08:16:02 +01:00
Nico Weber e0b8604e5d [lld/mac] Implement -u flag
Since we emit diagnostics for undefineds in Writer::scanRelocations()
and symbols referenced by -u flags aren't referenced by any relocations,
this needs some manual code (similar to the entry point).

Differential Revision: https://reviews.llvm.org/D94371
2021-02-09 08:23:06 -05:00
Greg McGary 87104faac4 [lld-macho] Add ARM64 target arch
This is an initial base commit for ARM64 target arch support. I don't represent that it complete or bug-free, but wish to put it out for review now that some basic things like branch target & load/store address relocs are working.

I can add more tests to this base commit, or add them in follow-up commits.

It is not entirely clear whether I use the "ARM64" (Apple) or "AArch64" (non-Apple) naming convention. Guidance is appreciated.

Differential Revision: https://reviews.llvm.org/D88629
2021-02-08 18:14:07 -07:00
Jez Ng ac9dd247da [lld-macho] Try to make ubsan happy
Summary: We should avoid passing a null pointer to memcpy.
2021-02-08 14:51:36 -05:00
Jez Ng 5112035751 [lld-macho] Emit LSDA info in compact unwind
The LSDA pointers are encoded as offsets from the image base,
and arranged in one big contiguous array. Each second-level page records
the offset within that LSDA array which corresponds to the LSDA for its
first CU entry.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D95810
2021-02-08 13:48:00 -05:00
Jez Ng 525bfa10ec [lld-macho] Emit personalities in compact unwind
Note that there is a triple indirection involved with
personalities and compact unwind:

1. Two bits of each CU encoding are used as an offset into the
   personality array.
2. Each entry of the personality array is an offset from the image base.
   The resulting address (after adding the image base) should point within the
   GOT.
3. The corresponding GOT entry contains the actual pointer to the
   personality function.

To further complicate things, when the personality function is in the
object file (as opposed to a dylib), its references in
`__compact_unwind` may refer to it via a section + offset relocation
instead of a symbol relocation. Since our GOT implementation can only
create entries for symbols, we have to create a synthetic symbol at the
given section offset.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D95809
2021-02-08 13:47:59 -05:00
Greg McGary c3e4f3b231 [lld-macho] Fix alignment & layout to match ld64 and satisfy kernel & codesign
The Mach kernel & codesign on arm64 macOS has strict requirements for alignment and sequence of segments and sections. Dyld probably is just as picky, though kernel & codesign reject malformed Mach-O files before dyld ever has a chance.

I developed this diff by incrementally changing alignments & sequences to match the output of ld64. I stopped when my hello-world test program started working: `codesign --verify` succeded, and `execve(2)` didn't immediately fail with `errno == EBADMACHO` = `"Malformed Mach-O file"`.

Differential Revision: https://reviews.llvm.org/D94935
2021-02-05 17:22:03 -07:00
Jez Ng ea5b75de49 [lld-macho] Try to fix Windows build 2021-02-03 16:05:23 -05:00
Jez Ng 2d2e0000d3 [lld-macho] Rename VERSION CONTROL to VERSION TARGETING in helptext
Per https://reviews.llvm.org/D94938#inline-896740.
2021-02-03 13:43:47 -05:00
Jez Ng 4b2169fb6b [lld-macho] Remove stray ehFrame change
Per https://reviews.llvm.org/D95121#inline-897943.
2021-02-03 13:43:47 -05:00
Jez Ng f843bb82c0 [lld-macho] Force-loading should share code path with regular archive loads
This extends {D92539} to work even when we are loading archive
members via `-force_load`. I uncovered this issue while trying to
force-load archives containing bitcode -- we were segfaulting.

In addition to fixing the `-force_load` case, this diff also addresses
the behavior of `-ObjC` when LTO bitcode is involved -- we need to
force-load those archive members if they contain ObjC categories.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D95265
2021-02-03 13:43:47 -05:00
Jez Ng 163dcd8513 [lld-macho] Associate each Symbol with an InputFile
This makes our error messages more informative. But the bigger motivation is for
LTO symbol resolution, which will be in an upcoming diff. The changes in this
one are largely mechanical.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D94316
2021-02-03 13:43:47 -05:00
Greg McGary 3a9d2f1488 [lld-macho][NFC] refactor relocation handling
Add per-reloc-type attribute bits and migrate code from per-target file into target independent code, driven by reloc attributes.

Many cleanups

Differential Revision: https://reviews.llvm.org/D95121
2021-02-02 10:54:53 -07:00
Greg McGary 0ef25cf558 [lld-macho][NFC] Add new option group for versions
Coalesce all version control options into a group

Differential Revision: https://reviews.llvm.org/D94938
2021-01-29 22:27:41 -07:00
Jez Ng 3dd5ea9dd8 [lld-macho] Link against ObjCARCOpts instead of ObjCARC
Not sure what the difference is, but using the latter appears to cause
issues in standalone builds. See llvm.org/PR48853.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D95359
2021-01-25 19:01:48 -05:00
Sam Clegg 299b0e5ee9 [lld] Consistent help text for `--save-temps`
I noticed that this option was not appearing at all in the `--help`
messages for `wasm-ld` or `ld.lld`.

Add help text and make it consistent across all ports.

Differential Revision: https://reviews.llvm.org/D94925
2021-01-25 10:27:18 -08:00
Jez Ng 041f3ee664 [lld-macho] Ignore -lto_library
Just getting rid of some logspew as I test LLD under existing build
systems.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D95213
2021-01-22 16:48:50 -05:00
Jez Ng 34e8fcf63f [lld-macho] Add dependency on ObjCARC to fix shared build 2021-01-20 20:41:51 -05:00
Jez Ng 697f4e429b [lld-macho] Run ObjCContractPass during LTO
Run the ObjCARCContractPass during LTO. The legacy LTO backend (under
LTO/ThinLTOCodeGenerator.cpp) already does this; this diff just adds that
behavior to the new LTO backend. Without that pass, the objc.clang.arc.use
intrinsic will get passed to the instruction selector, which doesn't know how to
handle it.

In order to test both the new and old pass managers, I've also added support for
the `--[no-]lto-legacy-pass-manager` flags.

P.S. Not sure if the ordering of the pass within the pipeline matters...

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94547
2021-01-20 14:21:32 -05:00
Jez Ng b3e73dc5af [lld-macho][easy] Create group for LLD-specific CLI flags
Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D94545
2021-01-20 14:21:31 -05:00
Kazu Hirata fb98a1be43 Fix the warnings on unused variables (NFC) 2021-01-13 13:32:40 -08:00
Nico Weber 47991a15d1 [lld/mac] llvm style fix: no else after return 2021-01-10 09:35:00 -05:00
Nico Weber 1198478c42 [lld/mac] remove redundant null check
This is already checked two lines up. No behavior change.
2021-01-09 21:18:32 -05:00
Jez Ng e98b441a09 [lld-macho] Remove unnecessary llvm:: namespace prefixes 2021-01-09 12:44:35 -05:00
Jez Ng daaaed6bb8 [lld-macho] Fix TLV data initialization
We were mishandling the case where both `__tbss` and `__thread_data` sections were
present.

TLVP relocations should be encoded as offsets from the start of `__thread_data`,
even if the symbol is actually located in `__thread_bss`. Previously, we were
writing the offset from the start of the containing section, which doesn't
really make sense since there's no way `tlv_get_addr()` can know which section a
given `tlv$init` symbol is in at runtime.

In addition, this patch ensures that we place `__thread_data` immediately before
`__thread_bss`. This is what ld64 does, likely for performance reasons. Zerofill
sections must also be at the end of their segments; we were already doing this,
but now we ensure that `__thread_bss` occurs before `__bss`, so that it's always
possible to have it contiguous with `__thread_data`.

Fixes llvm.org/PR48657.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D94329
2021-01-08 18:48:12 -05:00
Fangrui Song 7916fd71e9 [lld-macho] Fix GCC -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off build 2021-01-06 10:58:46 -08:00
Nico Weber 568824798f fix typo to cycle bots 2021-01-01 22:28:11 -05:00
Thorsten Schütt 4a290a5905 [lld/mac] fix typo 2020-12-31 08:23:29 +01:00
Nico Weber 8886be242d [lld/mac] Add -adhoc_codesign / -no_adhoc_codesign flags
These are new in Xcode 12's ld64. lld never codesigns at the moment, so
-no_adhoc_codesign doesn't even have to warn that it's not implemented.
2020-12-30 20:57:25 -05:00
Jez Ng 9d1140e18e [lld-macho] Simulator & DriverKit executables should always be PIE
We didn't have support for parsing DriverKit in our `-platform`
flag, so add that too. Also remove a bunch of unnecessary namespace
prefixes.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93741
2020-12-23 11:24:12 -05:00
Nico Weber 77fb45e59e [lld/mac] Add --version flag
It's an extension to ld64, but all the other ports have it, and
someone asked for it in PR43721.

While here, change the COFF help text to match the other ports.

Differential Revision: https://reviews.llvm.org/D93491
2020-12-22 22:06:39 -05:00
Nico Weber 57ffbe020a glld/mac] Don't add names of unreferenced symbols to string table
Before this, a hello world program would contain many many unnecessary
entries in its string table.

No behavior change, just makes the string table in the output smaller
and more like ld64's.

Differential Revision: https://reviews.llvm.org/D93711
2020-12-22 15:52:33 -05:00
Nico Weber 13f439a187 [lld/mac] Implement support for private extern symbols
Private extern symbols are used for things scoped to the linkage unit.
They cause duplicate symbol errors (so they're in the symbol table,
unlike TU-scoped truly local symbols), but they don't make it into the
export trie. They are created e.g. by compiling with
-fvisibility=hidden.

If two weak symbols have differing privateness, the combined symbol is
non-private external. (Example: inline functions and some TUs that
include the header defining it were built with
-fvisibility-inlines-hidden and some weren't).

A weak private external symbol implicitly has its "weak" dropped and
behaves like a regular strong private external symbol: Weak is an export
trie concept, and private symbols are not in the export trie.

If a weak and a strong symbol have different privateness, the strong
symbol wins.

If two common symbols have differing privateness, the larger symbol
wins. If they have the same size, the privateness of the symbol seen
later during the link wins (!) -- this is a bit lame, but it matches
ld64 and this behavior takes 2 lines less to implement than the less
surprising "result is non-private external), so match ld64.
(Example: `int a` in two .c files, both built with -fcommon,
one built with -fvisibility=hidden and one without.)

This also makes `__dyld_private` a true TU-local symbol, matching ld64.
To make this work, make the `const char*` StringRefZ ctor to correctly
set `size` (without this, writing the string table crashed when calling
getName() on the __dyld_private symbol).

Mention in CommonSymbol's comment that common symbols are now disabled
by default in clang.

Mention in -keep_private_externs's HelpText that the flag only has an
effect with `-r` (which we don't implement yet -- so this patch here
doesn't regress any behavior around -r + -keep_private_externs)). ld64
doesn't explicitly document it, but the commit text of
http://reviews.llvm.org/rL216146 does, and ld64's
OutputFile::buildSymbolTable() checks `_options.outputKind() ==
Options::kObjectFile` before calling `_options.keepPrivateExterns()`
(the only reference to that function).

Fixes PR48536.

Differential Revision: https://reviews.llvm.org/D93609
2020-12-21 21:23:33 -05:00
Fangrui Song 791fe7ac57 [lld-macho] Fix memcpy ub after D93267 2020-12-20 20:01:20 -08:00
Jez Ng 64e4757200 [lld-macho] Have order files support filtering by archive member paths
Also remove iteration over ArchiveFile symbols in buildInputSectionPriorities --
that was rendered unnecessary after D92539, which included ObjFiles from
ArchiveFiles inside the `inputFiles` vector.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D93569
2020-12-20 13:49:18 -05:00
Jez Ng 5f9896d3b2 [lld-macho] Support Obj-C symbols in order files
Obj-C symbols may have spaces and colons, which our previous order file
parser would be confused by. The order file format has made the very unfortunate
choice of using colons for its delimiters, which means that we have to use
heuristics to determine if a given colon is part of a symbol or not...

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93567
2020-12-20 13:49:18 -05:00
Greg McGary 99930719c6 Handle overflow beyond the 127 common encodings limit
The common encodings table holds only 127 entries. The encodings index for compact entries is 8 bits wide, and indexes 127..255 are stored locally to each second-level page. Prior to this diff, lld would `fatal()` if encodings overflowed the 127 limit.

This diff populates a per-second-level-page encodings table as needed. When the per-page encodings table hits its limit, we must terminate the page. If such early termination would consume fewer entries than a regular (non-compact) encoding page, then we prefer the regular format.

Caveat: one reason the common-encoding table might overflow is because of DWARF debug-info references, which are not yet implemented and will come with a later diff.

Differential Revision: https://reviews.llvm.org/D93267
2020-12-19 14:54:37 -08:00
Greg McGary d4ec3346b1 [lld-macho][nfc] Refactor to accommodate paired relocs
This is a refactor to pave the way for supporting paired-ADDEND for ARM64. The only paired reloc type for X86_64 is SUBTRACTOR. In a later diff, I will add SUBTRACTOR for both X86_64 and ARM64.

* s/`getImplicitAddend`/`getAddend`/ because it handles all forms of addend: implicit, explicit, paired.
* add predicate `bool isPairedReloc()`
* check range of `relInfo.r_symbolnum` is internal, unrelated to user-input, so use `assert()`, not `error()`
* minor cleanups & rearrangements in `InputFile::parseRelocations()`

Differential Revision: https://reviews.llvm.org/D90614
2020-12-17 20:21:41 -08:00
Greg McGary cc1cf6332a [lld-macho] Implement option: -undefined TREATMENT
TREATMENT can be `error`, `warning`, `suppress`, or `dynamic_lookup`
The `dymanic_lookup` remains unimplemented for now.

Differential Revision: https://reviews.llvm.org/D93263
2020-12-17 17:40:50 -08:00
Nico Weber f710bb7063 lld: Replace some lld::outs()s with message()
No behavior change.
2020-12-17 16:19:09 -05:00
Jez Ng 4c8276cdc1 [lld-macho] Use LC_LOAD_WEAK_DYLIB for dylibs with only weakrefs
Note that dylibs without *any* refs will still be loaded in the usual
(strong) fashion.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93435
2020-12-17 08:49:17 -05:00
Jez Ng 811444d7a1 [lld-macho] Add support for weak references
Weak references need not necessarily be satisfied at runtime (but they must
still be satisfied at link time). So symbol resolution still works as per usual,
but we now pass around a flag -- ultimately emitting it in the bind table -- to
indicate if a given dylib symbol is a weak reference.

ld64's behavior for symbols that have both weak and strong references is
a bit bizarre. For non-function symbols, it will emit a weak import. For
function symbols (those referenced by BRANCH relocs), it will emit a
regular import. I'm not sure what value there is in that behavior, and
since emulating it will make our implementation more complex, I've
decided to treat regular weakrefs like function symbol ones for now.

Fixes PR48511.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93369
2020-12-17 08:49:16 -05:00
dfukalov 9ed8e0caab [NFC] Reduce include files dependency and AA header cleanup (part 2).
Continuing work started in https://reviews.llvm.org/D92489:

Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h".

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92852
2020-12-17 14:04:48 +03:00
Nico Weber ec88746a05 [lld/mac] fill in current and compatibility version for LC_LOAD_(WEAK_)DYLIB
Not sure if anything actually depends on this, but it makes `otool -L`
output look nicer.

Differential Revision: https://reviews.llvm.org/D93332
2020-12-15 19:34:59 -05:00
Nico Weber 09edd9df6e [mac/lld] simplify code using PackedVersion instead of VersionTuple
PackedVersion already does the correct range checks.

No behavior change.

Differential Revision: https://reviews.llvm.org/D93338
2020-12-15 19:23:07 -05:00
Jez Ng 3aa8e071dd [lld-macho] Add implicit dylib support for frameworks
{D93000} applied to frameworks. Partial fix for PR48511.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93277
2020-12-15 15:58:26 -05:00
Jez Ng 8a5e068823 [lld-macho] Support -sub_umbrella
From what I can tell, it's essentially identical to
`-sub_library`, but it doesn't match files ending in ".dylib".

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93276
2020-12-15 15:58:26 -05:00
Jez Ng 3184519909 [lld-macho] Don't emit rebase opcodes for relocs in TLV sections
Their addresses are already encoded as section-relative offsets, so
there's no need to rebase them at runtime. {D85080} has some context
on the weirdness of TLV sections.

Fixes llvm.org/PR48491.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D93257
2020-12-15 15:58:26 -05:00
Jez Ng 544148ae70 [lld-macho] -weak_{library,framework} should always take priority
We were not setting forceWeakImport for file paths given by
`-weak_library` if we had already loaded the file. This diff fixes that
by having `loadDylib` return a cached DylibFile instance even if we have
already loaded that file.

We still avoid emitting multiple LC_LOAD_DYLIBs, but we achieve this by
making inputFiles a SetVector instead of relying on the `loadedDylibs`
cache.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D93255
2020-12-15 15:58:26 -05:00
Nico Weber 601f0fb846 [lld/mac] Set ordinal on dynamic undefined symbols in symbol table
This lets `nm -m` print "(from libfoo)" in its output, which is more
accessible than dumping the bind table.

See https://reviews.llvm.org/D57190#2455761 for the somewhat
surprising `AltEntry` that appears in symtab.s.

Differential Revision: https://reviews.llvm.org/D93318
2020-12-15 14:39:36 -05:00
Nico Weber d058b69b1c [lld/mac] implement -compatibility_version, -current_version
Differential Revision: https://reviews.llvm.org/D93237
2020-12-14 18:41:36 -05:00
Jez Ng 76c36c11a9 [lld-macho] Don't load dylibs more than once
Also remove `DylibFile::reexported` since it's unused.

Fixes llvm.org/PR48393.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D93001
2020-12-10 15:57:52 -08:00
Jez Ng 6a348f6158 [lld-macho] Implement `-no_implicit_dylibs`
Dylibs that are "public" -- i.e. top-level system libraries -- are considered
implicitly linked when another library re-exports them. That is, we should load
them & bind directly to their symbols instead of via their re-exporting
umbrella library. This diff implements that behavior by default, as well as an
opt-out flag.

In theory, this is just a performance optimization, but in practice it seems
that it's needed for correctness.

Fixes llvm.org/PR48395.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D93000
2020-12-10 15:57:52 -08:00
Jez Ng 74d799926e [lld-macho] Initialize AsmParsers earlier
We need to initialize AsmParsers before any calls to `addFile`, as
bitcode files may require them. Otherwise we trigger `Assertion T &&
T->hasMCAsmParser()' failed`.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D92913
2020-12-10 15:57:52 -08:00
Jez Ng 29d3b0e471 [lld-macho] Add support for -mcpu, -mattr, -code-model in LTO
`-mcpu` and `-code-model` tests were copied from similar ones in
LLD-ELF.

There doesn't seem to be an equivalent test for `-mattr` in LLD-ELF, so
I've verified our behavior by cribbing a test from
CodeGen/X86/recip-fastmath.ll.

Reviewed By: #lld-macho, compnerd, MaskRay

Differential Revision: https://reviews.llvm.org/D92912
2020-12-10 15:57:51 -08:00
Jez Ng 863f7a745e [lld-macho] Don't attempt to emit rebase opcodes for debug sections
This was causing a crash as we were attempting to look up the
nonexistent parent OutputSection of the debug sections. We didn't detect
it earlier because there was no test for PIEs with debug info (PIEs
require us to emit rebases for X86_64_RELOC_UNSIGNED).

This diff filters out the debug sections while loading the ObjFiles. In
addition to fixing the above problem, it also lets us avoid doing
redundant work -- we no longer parse / apply relocations / attempt to
emit dyld opcodes for these sections that we don't emit.

Fixes llvm.org/PR48392.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D92904
2020-12-10 15:57:51 -08:00
Jez Ng 95831a56d0 [lld-macho] Implement -object_path_lto
This makes it possible for STABS entries to reference the debug info
contained in the LTO-compiled output.

I'm not sure how to test the file mtime within llvm-lit -- GNU and BSD
`stat` take different command-line arguments. I've omitted the check for
now.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92537
2020-12-10 15:57:51 -08:00
Nico Weber 9d6177c2a5 [lld/mac] Use xxhash instead of MD5 for computing the UUID
15% faster for linking Chromium's base_unittests.txt, according to ministat:

```
    N           Min           Max        Median           Avg        Stddev
x  10      0.650213    0.69287586    0.65793395    0.66127126   0.012365407
+  10    0.54993701    0.59006906    0.55885506    0.56146643   0.013215349
Difference at 95.0% confidence
        -0.0998048 +/- 0.0120244
        -15.0929% +/- 1.81838%
        (Student's t, pooled s = 0.0127974)
```

And matches what we do on the other ports.

Differential Revision: https://reviews.llvm.org/D92736
2020-12-09 21:06:17 -05:00
Jez Ng 78976bf3da [lld-macho] Support parsing of bitcode within archives
Also error out if we find anything other than an object or bitcode file
in the archive.

Note that we were previously inserting the symbols and sections of the
unpacked ObjFile into the containing ArchiveFile. This was actually
unnecessary -- we can just insert the ObjectFile (or BitcodeFile) into
the `inputFiles` vector. This is the approach taken by LLD-ELF.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D92539
2020-12-08 10:34:32 -08:00
Jez Ng 7b007ac080 [lld-macho][nfc] Move some methods from InputFile to ObjFile
Additionally:
1. Move the helper functions in InputSection.h below the definition of
   `InputSection`, so the important stuff is on top
2. Remove unnecessary `explicit`

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D92453
2020-12-08 10:34:32 -08:00
Nico Weber feadc3798d [lld/mac] Make X86_64::getImplicitAddend not do heap allocations
Speeds up linking Chromium's base_unittests almost 10%. According to ministat:

    N           Min           Max        Median           Avg        Stddev
x   5    0.72193289    0.73073196    0.72560811    0.72565799  0.0032265649
+   5    0.64069581    0.67173195    0.65876389    0.65796089   0.011349451
Difference at 95.0% confidence
	-0.0676971 +/- 0.0121682
	-9.32906% +/- 1.67685%
	(Student's t, pooled s = 0.00834328)

Differential Revision: https://reviews.llvm.org/D92734
2020-12-07 09:23:51 -05:00
Fangrui Song 4701cb41ed [lld] Delete unused declarations
Notes:

* runMSVCLinker: remnant of r338615
* wasm markSymbol: remnant of r374275
* wasm addDataAddressGlobal: accidentally added by r372779
* MachO Writer::createSymtabContents: accidentally added by D76839
2020-12-06 15:26:37 -08:00
Nico Weber 16b1f6e385 [mac/lld] Add support for the LC_LINKER_OPTION load command in o files
clang puts `-framework CoreFoundation` in this load command for files
that use @available / __builtin_available. Without support for this,
binaries that don't explicitly link to CoreFoundation fail to link.

Differential Revision: https://reviews.llvm.org/D92624
2020-12-04 08:46:53 -05:00
Nico Weber 7cb0a373d1 [mac/lld] Implement -t
Goes well with `-why_load` to get an idea of load order.

Differential Revision: https://reviews.llvm.org/D92583
2020-12-03 16:02:38 -05:00
Nico Weber 3422f3cc6e Reland "[mac/lld] Implement -why_load".
The problem was that `sym` became replaced in the call
to make<ObjFile> and referring to it afer that read memory that now
stored a different kind of symbol (a Defined instead of a LazySymbol).
Since this happens only once per archive, just copy the symbol to the
stack before make<ObjFile> and read the copy instead.

Originally reviewed at https://reviews.llvm.org/D92496
2020-12-03 08:35:12 -05:00
Nico Weber ea0029f55d Revert "[mac/lld] Implement -why_load"
This reverts commit 542d3b609d.
Seems to break check-lld. Reverting while I take a look.
2020-12-02 18:57:46 -05:00
Nico Weber 542d3b609d [mac/lld] Implement -why_load
This is useful for debugging why lld loads .o files it shouldn't load.
It's also useful for users of lld -- I've used ld64's version of this a
few times.

Differential Revision: https://reviews.llvm.org/D92496
2020-12-02 18:33:12 -05:00
Nico Weber ca634393fc [mac/lld] Make --reproduce work with thin archives
See http://reviews.llvm.org/rL268229 and
http://reviews.llvm.org/rL313832 which did the same for the ELF port.

Differential Revision: https://reviews.llvm.org/D92456
2020-12-02 09:48:31 -05:00
Nico Weber b2f00f24a3 [mac/lld] Include archive name in diagnostics
Also, for .o files, include full path as given on link command line.

Before:
    lld: error: undefined symbol [...], referenced from sandbox_logging.o

After:
    lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o)

Move archiveName up to InputFile so we can consistently use toString()
to print InputFiles in diags, and pass it to the ObjFile ctor. This
matches the ELF and COFF ports.

Differential Revision: https://reviews.llvm.org/D92437
2020-12-01 23:00:25 -05:00
Heejin Ahn 6fb88c6cd5 [lld-macho] Add dependency to DebugInfoDWARF
Without this `-DBUILD_SHARED_LIBS=ON` doesn't work.
2020-12-01 19:10:46 -08:00
Nico Weber 126f58e838 fix typos to cycle bots 2020-12-01 20:27:33 -05:00
Nico Weber 07ab597bb0 [lld/mac] Fix issues around thin archives
- most importantly, fix a use-after-free when using thin archives,
  by putting the archive unique_ptr to the arena allocator. This
  ports D65565 to MachO

- correctly demangle symbol namess from archives in diagnostics

- add a test for thin archives -- it finds this UaF, but only when
  running it under asan (it also finds the demangling fix)

- make forceLoadArchive() use addFile() with a bool to have the archive
  loading code in fewer places. no behavior change; matches COFF port a
  bit better

Differential Revision: https://reviews.llvm.org/D92360
2020-12-01 18:48:29 -05:00
Jez Ng c7dbaec396 [lld-macho] Add isCodeSection()
This is the same logic that ld64 uses to determine which sections
contain functions. This was added so that we could determine which
STABS entries should be N_FUN.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92430
2020-12-01 15:05:21 -08:00
Jez Ng 78f6498cdc [lld-macho] Flesh out STABS implementation
This addresses a lot of the comments in {D89257}. Ideally it'd have been
done in the same diff, but the commits in between make that difficult.

This diff implements:
* N_GSYM and N_STSYM, the STABS for global and static symbols
* Has the STABS reflect the section IDs of their referent symbols
* Ensures we don't fail when encountering absolute symbols or files with
  no debug info
* Sorts STABS symbols by file to minimize the number of N_OSO entries

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92366
2020-12-01 15:05:21 -08:00
Jez Ng b768d57b36 [lld-macho] Add archive name and file modtime to STABS output
We should also set the modtime when running LTO. That will be done in a
future diff, together with support for the `-object_path_lto` flag.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D91318
2020-12-01 15:05:21 -08:00
Jez Ng d0c4be42e3 [lld-macho] Emit empty string as first entry of string table
ld64 emits string tables which start with a space and a zero byte. We
match its behavior here since some tools depend on it.

Similar rationale as {D89561}.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D89639
2020-12-01 15:05:20 -08:00
Jez Ng 51629abce0 [lld-macho] Emit local symbols in symtab; record metadata in LC_DYSYMTAB
Symbols of the same type must be laid out contiguously: following ld64's
lead, we choose to emit all local symbols first, then external symbols,
and finally undefined symbols. For each symbol type, the LC_DYSYMTAB
load command will record the range (start index and total number) of
those symbols in the symbol table.

This work was motivated by the fact that LLDB won't search for debug
info if LC_DYSYMTAB says there are no local symbols (since STABS symbols
are all local symbols). With this change, LLDB is now able to display
the source lines at a given breakpoint when debugging our binaries.

Some tests had to be updated due to local symbol names now appearing in
`llvm-objdump`'s output.

Reviewed By: #lld-macho, smeenai, clayborg

Differential Revision: https://reviews.llvm.org/D89285
2020-12-01 15:05:20 -08:00
Jez Ng 3fcb0eeb15 [lld-macho] Emit STABS symbols for debugging, and drop debug sections
Debug sections contain a large amount of data. In order not to bloat the size
of the final binary, we remove them and instead emit STABS symbols for
`dsymutil` and the debugger to locate their contents in the object files.

With this diff, `dsymutil` is able to locate the debug info. However, we need
a few more features before `lldb` is able to work well with our binaries --
e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols,
emitting `LC_UUID`, and more. Those will be handled in follow-up diffs.

Note also that the STABS we emit differ slightly from what ld64 does. First, we
emit the path to the source file as one `N_SO` symbol instead of two. (`ld64`
emits one `N_SO` for the dirname and one of the basename.) Second, we do not
emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions,
because the `N_FUN` STABS already serve that purpose. @clayborg recommended
these changes based on his knowledge of what the debugging tools look for.

Additionally, this current implementation doesn't accurately reflect the size
of function symbols. It uses the size of their containing sectioins as a proxy,
but that is only accurate if `.subsections_with_symbols` is set, and if there
isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two
options to solve this:

1. We can split up subsections by symbol even if `.subsections_with_symbols`
   is not set, but include constraints to ensure those subsections retain
   their order in the final output. This is `ld64`'s approach.
2. We could just add a `size` field to our `Symbol` class. This seems simpler,
   and I'm more inclined toward it, but I'm not sure if there are use cases
   that it doesn't handle well. As such I'm punting on the decision for now.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D89257
2020-12-01 15:05:20 -08:00
Jez Ng 6b3eecd22a [lld-macho] Extend PIE option handling
* Enable PIE by default if targeting 10.6 or above on x86-64. (The
  manpage says 10.7, but that actually applies only to i386, and in
  general varies based on the target platform. I didn't update the
  manpage because listing all the different behaviors would make for a
  pretty long description.)
* Add support for `-no_pie`
* Remove `HelpHidden` from `-pie`

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D92362
2020-12-01 14:35:51 -08:00
Nico Weber 78c04fe99e [lld/mac] Don't warn on -bundle and -execute flags
They've been implemented since D87856 but since they still were
HelpHidden, the driver still warned claiming they were implemented.
Remove HelpHidden.

Use -fatal_warnings to test that the flags now don't warn. The
test depends on D91894 and D91891 to pass.

Differential Revision: https://reviews.llvm.org/D91971
2020-11-30 16:07:58 -05:00
Nico Weber ebac710009 [lld-macho] Don't warn on non-existent system libraries
Now, new mach-o lld no longer warns if the isysroot has just
usr/lib and System/Library/Frameworks but is missing usr/local/lib
and System/Frameworks.

This matches ld64 and old mach-o lld and fixes a regression from D85992.

It also fixes the only test failure in `check-lld` when running it
on an M1 Mac.

Differential Revision: https://reviews.llvm.org/D91891
2020-11-30 16:07:20 -05:00
Nico Weber c0e4020c92 [lld-macho] Implement -fatal_warnings
Differential Revision: https://reviews.llvm.org/D91894
2020-11-30 09:29:21 -05:00
Nico Weber 83e60f5a55 [lld/mac] Add --reproduce option
This adds support for ld.lld's --reproduce / lld-link's /reproduce:
flag to the MachO port. This flag can be added to a link command
to make the link write a tar file containing all inputs to the link
and a response file containing the link command. This can be used
to reproduce the link on another machine, which is useful for sharing
bug report inputs or performance test loads.

Since the linker is usually called through the clang driver and
adding linker flags can be a bit cumbersome, setting the env var
`LLD_REPRODUCE=foo.tar` triggers the feature as well.

The file response.txt in the archive can be used with
`ld64.lld.darwinnew $(cat response.txt)` as long as the contents are
smaller than the command-line limit, or with `ld64.lld.darwinnew
@response.txt` once D92149 is in.

The support in this patch is sufficient to create a tar file for
Chromium's base_unittests that can link after unpacking on a different
machine.

Differential Revision: https://reviews.llvm.org/D92274
2020-11-30 08:40:21 -05:00
Nico Weber d20abb1ec3 [mac/lld] Add support for response files
ld64 learned about them in Xcode 12, so we should too.

Differential Revision: https://reviews.llvm.org/D92149
2020-11-30 08:23:58 -05:00
Nico Weber 11b7625833 [lld/mac] Implement basic typo correction for flags
Also use "unknown flag 'flag'" instead of "unknown flag: flag" for
consistency with the other ports.

Differential Revision: https://reviews.llvm.org/D91970
2020-11-24 11:33:39 -05:00
Nico Weber e16c0a9a68 clang+lld: Improve clang+ld.darwinnew.lld interaction, pass -demangle
This patch:
- adds an ld64.lld.darwinnew symlink for lld, to go with f2710d4b57,
  so that `clang -fuse-ld=lld.darwinnew` can be used to test new
  Mach-O lld while it's in bring-up. (The expectation is that we'll
  remove this again once new Mach-O lld is the defauld and only Mach-O
  lld.)
- lets the clang driver know if the linker is lld (currently
  only triggered if `-fuse-ld=lld` or `-fuse-ld=lld.darwinnew` is
  passed). Currently only used for the next point, but could be used
  to implement other features that need close coordination between
  compiler and linker, e.g. having a diag for calling `clang++` instead
  of `clang` when link errors are caused by a missing C++ stdlib.
- lets the clang driver pass `-demangle` to Mach-O lld (both old and
  new), in addition to ld64
- implements -demangle for new Mach-O lld
- changes demangleItanium() to accept _Z, __Z, ___Z, ____Z prefixes
  (and updates one test added in D68014). Mach-O has an extra
  underscore for symbols, and the three (or, on Mach-O, four)
  underscores are used for block names.

Differential Revision: https://reviews.llvm.org/D91884
2020-11-24 08:51:58 -05:00
Gabriel Hjort Åkerlund 2d1f471e45 [Mach0] Fix unused-variable warnings
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D91519
2020-11-19 10:51:15 +01:00
Nico Weber c519bc7e16 lld/MachO: Move MachOOptTable to DriverUtils.cpp, remove DriverUtils.h
This makes lld/MachO look more like lld/COFF and lld/ELF, as discussed
in D91640.
2020-11-18 12:33:15 -05:00
Nico Weber baa2aa28f5 lld: Add --color-diagnostic to MachO port, harmonize others
This adds `--[no-]color-diagnostics[=auto,never,always]` to
the MachO port and harmonizes the flag in the other ports:
- Consistently use MetaVarName
- Consistently document the non-eq version as alias of the eq version
- Use B<> in the ports that have it (no-op, shorter)
- Fix oversight in COFF port that made the --no flag have the wrong
  prefix

Differential Revision: https://reviews.llvm.org/D91640
2020-11-17 12:58:30 -05:00
Jez Ng 21f831134c [lld-macho] Add very basic support for LTO
Just enough to consume some bitcode files and link them. There's more
to be done around the symbol resolution API and the LTO config, but I don't yet
understand what all the various LTO settings do...

Reviewed By: #lld-macho, compnerd, smeenai, MaskRay

Differential Revision: https://reviews.llvm.org/D90663
2020-11-10 12:19:28 -08:00
Jez Ng 6cf244327b [lld-macho][easy] Fix segment max protection
We should have maxprot == initprot for all non-i386 architectures, which
is what ld64 does.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D89420
2020-11-10 12:19:28 -08:00
Jez Ng b86908171e [lld-macho] Implement LC_UUID
Apple devtools use this to locate the dSYM files for a given
binary.

The UUID is computed based on an MD5 hash of the binary's contents. In order to
hash the contents, we must first write them, but LC_UUID itself must be part of
the written contents in order for all the offsets to be calculated correctly.
We resolve this circular paradox by first writing an LC_UUID with an all-zero
UUID, then updating the UUID with its real value later.

I'm not sure there's a good way to test that the value of the UUID is
"as expected", so I've just checked that it's present.

Reviewed By: #lld-macho, compnerd, smeenai

Differential Revision: https://reviews.llvm.org/D89418
2020-11-10 12:19:28 -08:00
Jez Ng 2e8e1bdb89 [lld-macho] Support linking against stub dylibs
Stub dylibs differ from "real" dylibs in that they lack any content in
their sections. What they do have are export tries and symbol tables,
which means we can still link against them. I am unclear how to
properly create these stub dylibs; XCode 11.3's `lipo` is able to create
stub dylibs, but those lack LC_ID_DYLIB load commands and are considered
invalid by most tooling. Newer versions of `lipo` aren't able to create
stub dylibs at all.  However, recent SDKs in XCode still come with valid
stub dylibs, so it still seems worthwhile to support them. The YAML in
this diff's test was generated by taking a non-stub dylib and editing
the appropriate fields.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D89012
2020-11-10 12:19:27 -08:00
Fangrui Song 20e9c36c01 Internalize functions from various tools. NFC
And internalize some classes if I noticed them:)
2020-09-26 15:57:13 -07:00
Jez Ng 2c2a749448 [lld-macho] Ignore a few more undocumented flags
Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D88268
2020-09-25 11:28:37 -07:00
Jez Ng 62a3f0c984 [lld-macho] Support absolute symbols
They operate like Defined symbols but with no associated InputSection.

Note that `ld64` seems to treat the weak definition flag like a no-op for
absolute symbols, so I have replicated that behavior.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D87909
2020-09-25 11:28:35 -07:00
Jez Ng c7c9776f77 [lld-macho] Allow the entry symbol to be dynamically bound
Apparently this is used in real programs. I've handled this by reusing
the logic we already have for branch (function call) relocations.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D87852
2020-09-25 11:28:33 -07:00
Jez Ng f23f512691 [lld-macho] Support -bundle
Not 100% sure but it appears that bundles are almost identical to
dylibs, aside from the fact that they do not contain `LC_ID_DYLIB`. ld64's code
seems to treat bundles and dylibs identically in most places.

Supporting bundles allows us to run e.g. XCTests, as all test suites are
compiled into bundles which get dynamically loaded by the `xctest` test runner.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D87856
2020-09-25 11:28:32 -07:00
Jez Ng e4e673e75a [lld-macho] Implement support for PIC
* Implement rebase opcodes. Rebase opcodes tell dyld where absolute
  addresses have been encoded in the binary. If the binary is not loaded
  at its preferred address, dyld has to rebase these addresses by adding
  an offset to them.
* Support `-pie` and use it to test rebase opcodes.

This is necessary for absolute address references in dylibs, bundles etc
to work.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D87199
2020-09-25 11:28:31 -07:00
Jez Ng 5213576fa2 [lld-macho][re-land] Implement and test resolution of common symbols
Earlier build break fixed in c32e69b2ce.

This reverts commit c367f93e85.
2020-09-24 15:00:56 -07:00
Jez Ng c32e69b2ce [lld-macho][re-land] Initial support for common symbols
Fix earlier build break via a static_cast.

This reverts commit 8112d494d3.

Differential Revision: https://reviews.llvm.org/D86909
2020-09-24 15:00:20 -07:00
Alexandre Ganea f2efb5742c [LLD][COFF] Cover usage of LLD-as-a-library in tests
In lit tests, we run each LLD invocation twice (LLD_IN_TEST=2), without shutting down the process in-between. This ensures a full cleanup is properly done between runs.
Only active for the COFF driver for now. Other drivers still use LLD_IN_TEST=1 which executes just one iteration with full cleanup, like before.
When the environment variable LLD_IN_TEST is unset, a shortcut is taken, only one iteration is executed, no cleanup for faster exit, like before.
A public API, lld::safeLldMain(), is also available when using LLD as a library.

Differential Revision: https://reviews.llvm.org/D70378
2020-09-24 15:07:50 -04:00
Muhammad Omair Javaid 8112d494d3 Revert "[lld-macho] Initial support for common symbols"
This reverts commit 63ace77962.

Breaks LLDB Arm build:
http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4409
2020-09-24 12:26:40 +05:00
Muhammad Omair Javaid c367f93e85 Revert "[lld-macho] Implement and test resolution of common symbols"
This reverts commit cd7cb0c303.
Break lldb Arm build:
http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4409
2020-09-24 12:25:47 +05:00
Jez Ng 9c70281497 [lld-macho][NFC] Make `!= nullptr` implicit 2020-09-23 20:09:49 -07:00
Jez Ng ca8752a793 [lld-macho][NFC] Refactor syslibroot / library path lookup
* Move computation of systemLibraryRoots into a separate function, so we
  can add more functionality to it without things becoming unwieldy
* Have `getSearchPaths` and related functions return by value instead of
  by output parameter. NRVO should ensure that performance is unaffected.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D87959
2020-09-23 19:26:41 -07:00
Jez Ng 98f03908d0 [lld-macho] Support -weak_lx, -weak_library, -weak_framework
They cause their corresponding libraries / frameworks to be loaded via
`LC_LOAD_WEAK_DYLIB` instead of `LC_LOAD_DYLIB`.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D87929
2020-09-23 19:26:41 -07:00
Jez Ng 79412d6ca7 [lld-macho] Ignore `-mllvm` and its argument
Test Plan:

Reviewed By: #lld-macho, compnerd, MaskRay

Differential Revision: https://reviews.llvm.org/D87803
2020-09-23 19:26:40 -07:00
Jez Ng 5d26bd3b75 [lld-macho] Emit indirect symbol table
Makes it a little easier to read objdump's disassembly.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D87178
2020-09-23 19:26:40 -07:00
Jez Ng cd7cb0c303 [lld-macho] Implement and test resolution of common symbols
Handle the case where there are both common and non-common definitions
of the same symbol. Add a bunch of tests to ensure compatibility with ld64.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D86910
2020-09-23 19:26:40 -07:00
Jez Ng 63ace77962 [lld-macho] Initial support for common symbols
On Unix, it is traditionally allowed to write variable definitions without
initialization expressions (such as "int foo;") to header files. These are
called tentative definitions.

The compiler creates common symbols when it sees tentative definitions. When
linking the final binary, if there are remaining common symbols after name
resolution is complete, the linker converts them to regular defined symbols in
a `__common` section.

This diff implements most of that functionality, though we do not yet handle
the case where there are both common and non-common definitions of the same
symbol.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D86909
2020-09-23 19:26:40 -07:00
Greg McGary 8f2c31f22b [lld-macho] handle options -search_paths_first, -search_dylibs_first
Differential Revision: https://reviews.llvm.org/D88054
2020-09-23 14:56:33 -07:00
Greg McGary fa5f945212 [lld-macho] cleanup unimplemented-option warnings
Remove all spurious `HelpHidden` flags from  `lld/MachO/Options.td`. Add test for `HelpHidden` to `warnIfUnimplementedOption()` so that the empty `// handled elsewhere` case is unnecessary.

Reviewed By: #lld-macho, int3, smeenai

Differential Revision: https://reviews.llvm.org/D88160
2020-09-23 14:38:23 -07:00
Greg McGary ab903560a4 [lld-maco] fix build breakage 2020-09-22 20:42:23 -07:00
Greg McGary 1a3ef0417c [lld-macho] In the context of relocs, s/target/referent/ for sections & symbols
The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P

Along the way, make a few neighboring variable names more descriptive.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D87584
2020-09-22 20:31:01 -07:00
Greg McGary 145ce86dba [lld-macho] handle option -headerpad_max_install_names
Differential Revision: https://reviews.llvm.org/D88064
2020-09-22 17:24:19 -07:00
Greg McGary 7afbf3192d [lld-macho] minimally handle option -dynamic
Stifle the warning for unimplemented option `-dyamic`, since it is already the default. Add `Config::staticLink` and skeletal support for altering the flag, but otherwise leave the option `-static` as hidden and its warning in place.

Differential Revision: https://reviews.llvm.org/D88045
2020-09-22 08:03:44 -07:00
Jez Ng abd70fb398 [lld-macho] Export trie addresses should be relative to the image base
We didn't notice this earlier this we were only testing the export trie
encoded in a dylib, whose image base starts at zero. But a regular
executable contains `__PAGEZERO`, which means it has a non-zero image
base. This bug was discovered after attempting to run some programs that
performed `dlopen` on an executable.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D87780
2020-09-20 20:43:15 -07:00
Jez Ng 0a7e56f74c [lld-macho] Mark weak symbols in symbol table
Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86908
2020-09-20 20:43:14 -07:00
Greg McGary cba45514fb align __TEXT,__unwind_info to 8 byte boundary 2020-09-19 12:43:30 -07:00
Greg McGary 2124ca1d5c [lld-macho] create __TEXT,__unwind_info from __LD,__compact_unwind
Digest the input `__LD,__compact_unwind` and produce the output `__TEXT,__unwind_info`. This is the initial commit with the major functionality.

Successor commits will add handling for ...
* `__TEXT,__eh_frame`
* personalities & LSDA
* `-r` pass-through

Differential Revision: https://reviews.llvm.org/D86805
2020-09-18 22:01:03 -07:00
Jez Ng ae8fa1d8a6 [lld-macho][NFC] Define isHidden() in LinkEditSection
Since it's always true

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86749
2020-08-27 17:44:18 -07:00
Jez Ng ccbacdded4 [lld-macho] Weak locals should be relaxed too
Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86746
2020-08-27 17:44:17 -07:00
Jez Ng 0407197711 [lld-macho] Support GOT relocations to __dso_handle
Found such a relocation while testing some real world programs.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86642
2020-08-27 17:44:17 -07:00
Jez Ng 7083363c05 [lld-macho] Implement GOT_LOAD relaxation
We can have GOT_LOAD relocations that reference `__dso_handle`.
However, our binding opcode encoder doesn't support binding to the DSOHandle
symbol. Instead of adding support for that, I decided it would be cleaner to
implement GOT_LOAD relaxation since `__dso_handle`'s location is always
statically known.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86641
2020-08-27 17:44:17 -07:00
Jez Ng 2a38dba7dd [lld-macho] Emit binding opcodes for defined symbols that override weak dysyms
These opcodes tell dyld to coalesce the overridden weak dysyms to this
particular symbol definition.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86575
2020-08-27 17:44:16 -07:00
Jez Ng 3da2130e45 [lld-macho] Emit the right header flags for weak bindings/symbols
Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86574
2020-08-27 17:44:16 -07:00
Jez Ng e263287c79 [lld-macho] Implement weak binding for branch relocations
Since there is no "weak lazy" lookup, function calls to weak symbols are
always non-lazily bound. We emit both regular non-lazy bindings as well
as weak bindings, in order that the weak bindings may overwrite the
non-lazy bindings if an appropriate symbol is found at runtime. However,
the bound addresses will still be written (non-lazily) into the
LazyPointerSection.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D86573
2020-08-27 17:44:15 -07:00
Jez Ng 62b39b3a0c [lld-macho] Implement -all_load
Differential Revision: https://reviews.llvm.org/D86640
2020-08-26 19:21:17 -07:00
Jez Ng cbe27316ef [lld-macho] Implement weak bindings for GOT/TLV
Previously, we were only emitting regular bindings to weak
dynamic symbols; this diff adds support for the weak bindings too, which
can overwrite the regular bindings at runtime. We also treat weak
defined global symbols similarly -- since they can also be interposed at
runtime, they need to be treated as potentially dynamic symbols.

Note that weak bindings differ from regular bindings in that they do not
specify the dylib to do the lookup in (i.e. weak symbol lookup happens
in a flat namespace.)

Differential Revision: https://reviews.llvm.org/D86572
2020-08-26 19:21:09 -07:00
Jez Ng b84d72d893 [lld-macho][NFC] Handle GOT bindings and regular bindings more uniformly
Previously, the BindingEntry struct could only store bindings to offsets
within InputSections. Since the GOTSection and TLVPointerSections are
OutputSections, I handled those in a separate code path. However, this
makes it awkward to support weak bindings properly without code
duplication. This diff allows BindingEntries to point directly to
OutputSections, simplifying the upcoming weak binding implementation.

Along the way, I also converted a bunch of functions taking references
to symbols to take pointers instead. Given how much casting we do for
Symbol (especially in the upcoming weak binding diffs), it's cleaner
this way.

Differential Revision: https://reviews.llvm.org/D86571
2020-08-26 19:21:04 -07:00
Jez Ng cf918c809b [lld-macho] Implement -ObjC
It's roughly like -force_load with some filtering.

Differential Revision: https://reviews.llvm.org/D86181
2020-08-26 19:20:55 -07:00
Jez Ng 7394460d87 [lld-macho] Handle TAPI and regular re-exports uniformly
The re-exports list in a TAPI document can either refer to other inlined
TAPI documents, or to on-disk files (which may themselves be TBD or
regular files.) Similarly, the re-exports of a regular dylib can refer
to a TBD file.

Differential Revision: https://reviews.llvm.org/D85404
2020-08-26 19:20:48 -07:00
Jez Ng 6336c042f6 [lld-macho] Make it possible to re-export .tbd files
Two things needed fixing for that to work:

1. getName() no longer returns null for DylibFiles constructed from TAPIs
2. markSubLibrary() now accepts .tbd as a possible extension

Differential Revision: https://reviews.llvm.org/D86180
2020-08-26 19:20:42 -07:00
Jez Ng 3e7a86e366 [lld-macho] Fall back to raw path if we don't find anything under syslibroot
This matches ld64's behavior

Differential Revision: https://reviews.llvm.org/D85992
2020-08-26 19:20:35 -07:00
Greg McGary 537f5483fe [lld-macho] Emit load command LC_BUILD_VERSION
Reviewed By: int3

Differential Revision: https://reviews.llvm.org/D85786
2020-08-14 12:36:43 -07:00
Jez Ng e48d1262b8 [lld-macho] Support -rpath
Pretty straightforward; just emits LC_RPATH for dyld to consume.

Note that lld itself does not yet support dylib lookup via @rpath.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D85701
2020-08-12 19:50:28 -07:00
Jez Ng 437e6bd286 [lld-macho] Implement -force_load
It's similar to lld-ELF's `-whole-archive`, but applied to individual
archives instead of to a series of them.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D85550
2020-08-12 19:50:27 -07:00
Jez Ng 180ad756ec [lld-macho] Support larger dylib symbol ordinals in bindings
Do folks care if we don't have a test for this? Creating 16
dylibs to trigger this straightforward code path seems a little tedious

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D85467
2020-08-12 19:50:25 -07:00
Jez Ng c3eb1e2754 [lld-macho] Add error handling for malformed TBD files
Previously, lld would crash while complaining that `Expected<T>
must be checked before access or destruction`.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D85403
2020-08-12 19:50:13 -07:00
Jez Ng 7e6d675499 [lld-macho] Avoid unnecessary shared_ptr in DylibFile ctor
DylibFile doesn't store a pointer to its InterfaceFile
parameter, so there's no need to use a shared_ptr.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D85402
2020-08-12 19:50:12 -07:00
Jez Ng a499898e86 [lld-macho] Generate ObjC symbols from .tbd files
I followed similar logic in TapiFile.cpp.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D85255
2020-08-12 19:50:10 -07:00
Jez Ng 3c9100fb78 [lld-macho] Support dynamic linking of thread-locals
References to symbols in dylibs work very similarly regardless of
whether the symbol is a TLV. The main difference is that we have a
separate `__thread_ptrs` section that acts as the GOT for these
thread-locals.

We can identify thread-locals in dylibs by a flag in their export trie
entries, and we cross-check it with the relocations that refer to them
to ensure that we are not using a GOT relocation to reference a
thread-local (or vice versa).

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D85081
2020-08-12 19:50:09 -07:00
Pavel Labath c3817728e7 [lld] s/dyn_cast/isa
Fixes some unused variable warnings with gcc.
2020-08-11 15:22:44 +02:00
Greg McGary 49fb1c2e90 [lld-macho] improve handling of -platform_version
This improves the handling of `-platform_version` by addressing the FIXME in the code to process the arguments.

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D81413
2020-08-10 18:47:16 -07:00
Greg McGary a379f2c251 [lld-macho] Handle command-line option -sectcreate SEG SECT FILE
Handle command-line option `-sectcreate SEG SECT FILE`, which inputs a binary blob from `FILE` into `SEG,SECT`

Reviewed By: int3

Differential Revision: https://reviews.llvm.org/D85501
2020-08-10 18:47:13 -07:00
Jez Ng 25367dfefb [lld-macho] Add .tbd support for frameworks
Required for e.g. linking iOS apps since they don't have a platform-native
SDK

Reviewed By: #lld-macho, compnerd, smeenai

Differential Revision: https://reviews.llvm.org/D85153
2020-08-07 11:04:54 -07:00
Jez Ng ca85e37338 [lld-macho] Support static linking of thread-locals
Note: What ELF refers to as "TLS", Mach-O seems to refer to as "TLV", i.e.
thread-local variables.

This diff implements support for TLV relocations that reference defined
symbols. On x86_64, TLV relocations are always used with movq opcodes, so for
defined TLVs, we don't need to create a synthetic section to store the
addresses of the symbols -- we can just convert the `movq` to a `leaq`.

One notable quirk of Mach-O's TLVs is that absolute-address relocations
inside TLV-defining sections behave differently -- their addresses are
no longer absolute, but relative to the start of the target section.
(AFAICT, RIP-relative relocations are not allowed in these sections.)

Reviewed By: #lld-macho, compnerd, smeenai

Differential Revision: https://reviews.llvm.org/D85080
2020-08-07 11:04:52 -07:00
Jez Ng 4e43f18048 [lld-macho] Ensure .tbss sections are also considered as ZeroFilled
This diff makes the behavior in {D80859} and {D81888} apply to
thread-local ZeroFill sections too. I realized this was necessary whie
trying to implement thread-local variables.

Reviewed By: #lld-macho, compnerd, MaskRay

Differential Revision: https://reviews.llvm.org/D85079
2020-08-07 11:04:41 -07:00
Saleem Abdulrasool 0ccda7c232 MachO: support `-syslibroot`
This adds support for the `-syslibroot` option.  This is required to
make the library search order actually function.  With this, it is now
possible to link a test Darwin x86_64 program with lld on Darwin.

Differential Revision: https://reviews.llvm.org/D82252
Reviewed By: Jez Ng
2020-08-05 08:41:24 -07:00
Jez Ng c89e46e767 [lld-macho] Add comment for literal argument 2020-07-30 14:38:58 -07:00
Jez Ng 98210796e1 [lld-macho] Make __LINKEDIT sections contiguous
codesign (or more specifically libstuff) checks that each section in
__LINKEDIT ends where the next one starts -- no gaps are permitted. This
diff achieves it by aligning every section's start and end points to
WordSize.

Remarks: ld64 appears to satisfy the constraint by adding padding bytes
when generating the __LINKEDIT data, e.g. by emitting BIND_OPCODE_DONE
(which is a 0x0 byte) repeatedly. I think the approach this diff takes
is a bit more elegant, but I'm not sure if it's too restrictive. In
particular, it assumes padding always uses the zero byte. But we can
revisit this later.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D84718
2020-07-30 14:30:07 -07:00
Jez Ng 22e6648a18 [lld-macho] Implement -headerpad
Tools like `install_name_tool` and `codesign` may modify the Mach-O
header and increase its size. The linker has to provide padding to make this
possible. This diff does that, plus sets its default value to 32 bytes (which
is what ld64 does).

Unlike ld64, however, we lay out our sections *exactly* `-headerpad` bytes from
the header, whereas ld64 just treats the padding requirement as a lower bound.
ld64 actually starts laying out the non-header sections in the __TEXT segment
from the end of the (page-aligned) segment rather than the front, so its
binaries typically have more than `-headerpad` bytes of actual padding.
We should consider implementing the same alignment behavior.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D84714
2020-07-30 14:29:31 -07:00
Jez Ng 3587de2281 [lld-macho] Support __dso_handle for C++
The C++ ABI requires dylibs to pass a pointer to __cxa_atexit which does
e.g. cleanup of static global variables. The C++ spec says that the pointer
can point to any address in one of the dylib's segments, but in practice
ld64 seems to set it to point to the header, so that's what's implemented
here.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D83603
2020-07-30 14:28:41 -07:00
Jez Ng d32e32500f [lld-macho] Fix segment filesize calculation
The previous approach of adding up the file sizes of the
component sections ignored the fact that the sections did not have to be
contiguous in the file. As such, it was underestimating the true size.

I discovered this issue because `codesign` checks whether `__LINKEDIT`
extends to the end of the file. Since we were underestimating segment
sizes, this check failed.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D84574
2020-07-28 10:02:19 -07:00
Jez Ng 4853a86022 [lld-macho] Support -filelist
XCode passes files in using this flag

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D84486
2020-07-28 10:02:19 -07:00
Jez Ng 9282d04e04 [lld-macho] Support lookup of dylibs in frameworks
Needed for testing Objective-C programs (since e.g. Core
Foundation is a framework)

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D83925
2020-07-26 12:46:46 -07:00
Jez Ng 06a0dd2467 [lld-macho] Ignore -dependency_info and its argument
XCode passes in this flag, which we do not yet implement. Skip
over the argument for now so we can at least successfully parse the
linker invocation.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D84485
2020-07-24 15:55:27 -07:00
Jez Ng 31d5885842 [lld-macho] Partial support for weak definitions
This diff adds support for weak definitions, though it doesn't handle weak
symbols in dylibs quite correctly -- we need to emit binding opcodes for them
in the weak binding section rather than the lazy binding section.

What *is* covered in this diff:

1. Reading the weak flag from symbol table / export trie, and writing it to the
   export trie
2. Refining the symbol table's rules for choosing one symbol definition over
   another. Wrote a few dozen test cases to make sure we were matching ld64's
   behavior.

We can now link basic C++ programs.

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D83532
2020-07-24 15:55:25 -07:00
Benjamin Kramer 9a0689e072 Make helpers static. NFC. 2020-07-17 13:49:11 +02:00
Jez Ng 53eb7fda51 [lld-macho] Support binding dysyms to any section
Previously, we only supported binding dysyms to the GOT. This
diff adds support for binding them to any arbitrary section. C++
programs appear to use this, I believe for vtables and type_info.

This diff also makes our bind opcode encoding a bit smarter -- we now
encode just the differences between bindings, which will make things
more compact.

I was initially concerned about the performance overhead of iterating
over these relocations, but it turns out that the number of such
relocations is small. A quick analysis of my llvm-project build
directory showed that < 1.3% out of ~7M relocations are RELOC_UNSIGNED
bindings to symbols (including both dynamic and static symbols).

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D83103
2020-07-02 21:21:01 -07:00
Nico Weber 7be1661fc6 lld/MachO: Remove a useless temporary 2020-07-02 00:05:43 -04:00
Jez Ng 7996a1ef70 [lld-macho] Make sure ZeroFill sections are at the end of their segments
Summary:
ld64 does this, and references an internal rdar:// number as an explanation. No
idea what that rdar issue is, but in practice, it seems that not putting a BSS
section at the end can cause subsequent sections in the same segment to be
overwritten with zeroes.

Reviewers: #lld-macho

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81888
2020-07-01 19:39:29 -07:00
Fangrui Song 40bc99538c [lld-macho] Remove using namespace llvm::MachO
llvm/include/llvm/TextAPI/MachO/ inappropriately uses the llvm::MachO namespace (this is for BinaryFormat and Object) and causes conflicts in some MSVC builds::

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/65324/steps/stage%201%20build/logs/stdio

Removing `using namespace llvm::MachO` should decrease name collisions.
2020-06-24 13:46:13 -07:00
Fangrui Song 18db086dca [lld-macho] Use namespace qualifiers (macho::) instead of `namespace lld { namespace macho {`
See https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions
Similar to D79982.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D82432
2020-06-24 12:34:06 -07:00
Jez Ng 3646ee503d [lld-macho] Refactor segment/section creation, sorting, and merging
Summary:
There were a few issues with the previous setup:

1. The section sorting comparator used a declarative map of section names to
  determine the correct order, but it turns out we need to match on more than
  just names -- in particular, an upcoming diff will sort based on whether the
  S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but
  flexible form.

2. We were sorting OutputSections stored in a MapVector, which left the
  MapVector in an inconsistent state -- the wrong keys map to the wrong values!
  In practice, we weren't doing key lookups (only container iteration) after the
  sort, so this was fine, but it was still a dubious state of affairs. This diff
  copies the OutputSections to a vector before sorting them.

3. We were adding unneeded OutputSections to OutputSegments and then filtering
  them out later, which meant that we had to remember whether an OutputSegment
  was in a pre- or post-filtered state. This diff only adds the sections to the
  segments if they are needed.

In addition to those major changes, two minor ones worth noting:

1. I renamed all OutputSection variable names to `osec`, to parallel `isec`.
  Previously we were using some inconsistent combination of `osec`, `os`, and
  `section`.

2. I added a check (and a test) for InputSections with names that clashed with
  those of our synthetic OutputSections.

Reviewers: #lld-macho

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81887
2020-06-21 17:13:59 -07:00
Nico Weber 9bcd59fdef fix a typo to cycle bots 2020-06-18 11:31:47 -04:00
Greg McGary d50f44a2f7 [lld-macho] Handle framework search path, alongside library search path
Summary:
Add front-end support for `lld::macho::Configuration::frameworkSearchPath`.

Depends on D80582.

Reviewers: ruiu, pcc, MaskRay, smeenai, int3, Ktwu, alexshap, christylee

Reviewed By: int3

Subscribers: ormris, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80677
2020-06-17 20:41:28 -07:00
Jez Ng 525c7d8cda [lld-macho] Handle alignment correctly when merging InputSections
Summary:
Previously, we weren't updating isecAddr when aligning InputSections,
resulting in truncated sections under the right conditions.

Reviewers: #lld-macho, compnerd

Reviewed By: #lld-macho, compnerd

Subscribers: smeenai, compnerd, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81298
2020-06-17 20:41:28 -07:00
Jez Ng 74871cdad7 [lld-macho] Ensure __bss sections we output have file offset of zero
Summary:
llvm-mc emits `__bss` sections with an offset of zero, but we weren't expecting
that in our input, so we were copying non-zero data from the start of the file and
putting it in `__bss`, with obviously undesirable runtime results. (It appears that
the kernel will copy those nonzero bytes as long as the offset is nonzero, regardless
of whether S_ZERO_FILL is set.)

I debated on whether to make a special ZeroFillSection -- separate from a
regular InputSection -- but it seemed like too much work for now. But I'm happy
to refactor if anyone feels strongly about having it as a separate class.

Depends on D80857.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Reviewed By: smeenai

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80859
2020-06-17 20:41:28 -07:00
Jez Ng a12e7d406d [lld-macho] Handle GOT relocations of non-dylib symbols
Summary:
Turns out this case is actually really common -- it happens whenever there's
a reference to an `extern` variable that ends up statically linked.

Depends on D80856.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Reviewed By: smeenai

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80857
2020-06-17 20:41:28 -07:00
Jez Ng c3d98ea89f [lld-macho] Support X86_64_RELOC_GOT
Summary:
As far as I can tell, it's identical to _GOT_LOAD. llvm-mc has the following
comment explaining why _GOT exists:

```
// x86_64 distinguishes movq foo@GOTPCREL so that the linker can
// rewrite the movq to an leaq at link time if the symbol ends up in
// the same linkage unit.
```

Depends on D80855.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Reviewed By: MaskRay, smeenai

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80856
2020-06-17 20:41:28 -07:00
Jez Ng fcde378dcb [lld-macho] Support non-pcrel section relocs
Summary: Depends on D80854.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80855
2020-06-17 20:41:28 -07:00
Jez Ng 2f4cfa3c7a [lld-macho] Avoid explicit -arch in tests by defaulting to x86-64
Summary:
As mentioned in https://reviews.llvm.org/D81326#2093931, I'm not sure it
makes sense to use the default target triple to determine -arch.
Long-term we should probably detect it from the input object files, but
in the meantime it would be nice not to have to add it to all our tests
by using a convenient default.

Reviewers: #lld-macho

Subscribers: arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81983
2020-06-17 20:41:27 -07:00
Jez Ng a2d096df26 [lld-macho] Use uint64_t for getSize() instead of size_t
Summary:
So things work on 32-bit machines. (@vzakhari reported the
breakage starting from D80177).

Reviewers: #lld-macho, vzakhari

Subscribers: llvm-commits, vzakhari

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81982
2020-06-16 18:42:45 -07:00
Saleem Abdulrasool 73312976ad lld: remove old test support path
This removes the stub library that lld injected to satisfy the
dependency on the libSystem.  Now with TBD support, we can provide the
stub library to permit the tests to function properly as they would on a
real system.

Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D81418
2020-06-16 15:57:58 -07:00
Greg McGary 7df80e3f23 [lld-macho] Specify the complete set of command-line options for ld64
This is a complete Options.td compiled from ld(1) dated 2018-03-07 and
cross checked with ld64 source code version 512.4 dated 2018-03-18.

This is the first in a series of diffs for argument handling. Follow-ups
will include switch cases for all the new instances of `OPT_foo`, and
parsing/validation of arguments attached to options, e.g., more code
akin to `OPT_platform_version` and associated `parsePlatformVersion()`.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80582
2020-06-15 12:50:20 -07:00
Shoaib Meenai 72e096fd1e [MachO] Fix typo in comment
The case the calculation works for is when r_length = 2.
2020-06-15 12:26:55 -07:00
Kirill Bobyrev 9baba7cf66
Revert "[lld-macho] No need to explicitly specify -arch in tests"
This reverts commit 51c5baacf3 and also
337fb8c767 - "[lld-macho] Set REQUIRES:
x86 on more tests".

These patches cause test crashes:

http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/10054
2020-06-15 12:27:30 +02:00
Jez Ng 53c796b948 [lld-macho] Properly handle & validate relocation r_length
Summary:
We should be reading / writing our addends / relocated addresses based on
r_length, and not just based on the type of the relocation. But since only
some r_length values are valid for a given reloc type, I've also added some
validation.

ld64 has code to allow for r_length = 0 in X86_64_RELOC_BRANCH relocs, but I'm
not sure how to create such a relocation...

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80854
2020-06-14 16:35:23 -07:00
Jez Ng 51c5baacf3 [lld-macho] No need to explicitly specify -arch in tests
Summary: After {D81326} landed, some tests started failing if they did
not have `-arch` specified. I think one of the reasons happened was due
to the fact that we were taking a reference to a temporary value that
was freed too early. Fixing that got the error to go away on my local
Linux machine.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D81802
2020-06-14 16:35:21 -07:00
Saleem Abdulrasool 6fe27b5fed lld: initial pass at supporting TBD
Add support to lld to use Text Based API stubs for linking.  This is
support is incomplete not filtering out platforms.  It also does not
account for architecture specific API handling and potentially does not
correctly handle trees of re-exports with inlined libraries being
treated as direct children of the top level library.
2020-06-08 18:15:40 -07:00
Michael Liao 43793b89a0 [lld] Fix shared library build by adding the missing dependency. 2020-06-08 16:12:58 -04:00
Saleem Abdulrasool fcdf7578aa lld: improve the `-arch` handling for MachO
Use the default target triple configured by the user to determine the
default architecture for `ld64.lld`.  Stash the architecture in the
configuration as when linking against TBDs, we will need to filter out
the symbols based upon the architecture.  Treat the Haswell slice as it
is equivalent to `x86_64` but with the extra Haswell extensions (e.g.
AVX2, FMA3, BMI1, etc).  This will make it easier to add new
architectures in the future.

This change also changes the failure mode where an invalid `-arch`
parameter will result in the linker exiting without further processing.
2020-06-08 11:04:19 -07:00
Saleem Abdulrasool e78431354b lld: use modern library search ordering
This merges the static and shared library and behaves as if
`-search_paths_first` was specified which is also the default behaviour
on ld64 (and now lld). Unify the paths, and use `llvm::sys::path` to
deal with the path to be truly agnostic to the host.
2020-06-05 12:12:26 -07:00
Saleem Abdulrasool 116e38fd8b lld: add basic static library search
This is a very basic static library search addition. This is the pre-Xcode4
behaviour of searching all paths for the shared version before searching for
the static version of the library. This behaviour is supposed to be inverted
with `-search_paths_first` being the default. This adds the library search
with the intention of providing the setup to merge the paths into one path
and making it controllable by `OPT_search_paths_first`.
2020-06-03 23:32:05 +00:00
Saleem Abdulrasool 37d93b528c lld: ignore the `-search_paths_first` option on MachO
ld64 provides the `-search_path_firsts` which will search each path in
the library search path order for both `lib[name].dylib`, `lib[name].a`
before moving on (searching all paths for the dylib and then falling
back to the static library if a shared library was not found).

This option has been the default for a long time, but the command line
flag still exists.  Ignore it for compatibility.
2020-06-03 15:36:35 +00:00
Jez Ng d767de44bf [lld-macho] Fix PAGEZERO=4GB errors on Windows by ensuring enum is uint64_t
It appears that MSVC doesn't resize the enum properly to fit the
constants.
2020-06-02 15:24:31 -07:00
Jez Ng 1e1a3f67ee [lld-macho] Ensure reads from nlist_64 structs are aligned when necessary
My test refactoring in D80217 seems to have caused yaml2obj to emit
unaligned nlist_64 structs, causing ASAN'd lld to be unhappy. I don't
think this is an issue with yaml2obj though -- llvm-mc also seems to
emit unaligned nlist_64s. This diff makes lld able to safely do aligned
reads under ASAN builds while hopefully creating no overhead for regular
builds on architectures that support unaligned reads.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D80414
2020-06-02 13:19:38 -07:00
Jez Ng a04c133564 [lld-macho] Set __PAGEZERO size to 4GB
That's what ld64 uses for 64-bit targets. I figured it's best to make
this change sooner rather than later since a bunch of our tests are
relying on hardcoded addresses that depend on this value.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80177
2020-06-02 13:19:38 -07:00
Jez Ng df2a5778c3 [lld-macho] Error on encountering undefined symbols
... instead of silently emitting a reference to the zero address.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80169
2020-06-02 13:19:38 -07:00
Jez Ng 6f6d91867d [lld-macho] Add some relocation validation logic
I considered making a `Target::validate()` method, but I wasn't sure how
I felt about the overhead of doing yet another switch-dispatch on the
relocation type, so I put the validation in `relocateOne` instead...
might be a bit of a micro-optimization, but `relocateOne` does assume
certain things about the relocations it gets, and this error handling
makes that explicit, so it's not a totally unreasonable code
organization.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D80049
2020-06-02 13:19:38 -07:00
Jez Ng ce0d8beebc [lld-macho][re-land] Support X86_64_RELOC_UNSIGNED
This reverts commit db8559eee4.
2020-05-19 12:31:55 -07:00
Jez Ng 4eb6f4854e [lld-macho][re-land] Support .subsections_via_symbols
Summary:
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.

The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.

We exercise this functionality in our tests by using order files that
rearrange those symbols.

Depends on D79668.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Reviewed By: smeenai

Subscribers: thakis, llvm-commits, pcc, ruiu

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79926
2020-05-19 12:31:54 -07:00
Jez Ng 70fbbcdd34 Revert "[lld-macho] Support .subsections_via_symbols"
Due to build breakage mentioned in https://reviews.llvm.org/D79926.

This reverts commit e270b2f172.
2020-05-19 08:30:02 -07:00
Jez Ng db8559eee4 Revert "[lld-macho] Support X86_64_RELOC_UNSIGNED"
This reverts commit 1f820e3559.
2020-05-19 08:30:02 -07:00
Jez Ng 1f820e3559 [lld-macho] Support X86_64_RELOC_UNSIGNED
Note that it's only used for non-pc-relative contexts.

Reviewed By: MaskRay, smeenai

Differential Revision: https://reviews.llvm.org/D80048
2020-05-19 07:46:57 -07:00
Jez Ng e270b2f172 [lld-macho] Support .subsections_via_symbols
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.

The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.

We exercise this functionality in our tests by using order files that
rearrange those symbols.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D79926
2020-05-19 07:46:57 -07:00
Jez Ng 55e9eb416e [lld-macho] Support -order_file
The order file indicates how input sections should be sorted within each
output section, based on the symbols contained within those sections.

This diff sets the stage for implementing and testing
`.subsections_via_symbols`, where we will break up InputSections by each
symbol and sort them more granularly.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D79668
2020-05-19 07:46:57 -07:00
Kellie Medlin 2b920ae78c [lld] Add archive file support to Mach-O backend
With this change, basic archive files can be linked together. Input
section discovery has been refactored into a function since archive
files lazily resolve their symbols / the object files containing those
symbols.

Reviewed By: int3, smeenai

Differential Revision: https://reviews.llvm.org/D78342
2020-05-14 12:58:35 -07:00
Nico Weber 759bae956a [lld-macho] Ignore -platform_version and -syslibroot flags.
clang passes these flags; this makes it easier to try `clang -v`
output with `ld -flavor darwinnew`.

Differential Revision: https://reviews.llvm.org/D79797
2020-05-12 19:17:01 -04:00
Jez Ng 87b6fd3e02 [lld-macho] Add support for creating and reading reexported dylibs
This unblocks the linking of real programs, since many core system
functions are only available as sub-libraries of libSystem.

Differential Revision: https://reviews.llvm.org/D79228
2020-05-12 07:52:03 -07:00
Eric Christopher 020022e12e Fix auto -> auto * clang tidy. 2020-05-11 15:50:52 -07:00
Jez Ng 198b0c57df [lld-macho] Support pc-relative section relocations
Summary: So far we've only supported symbol relocations.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79211
2020-05-09 20:56:23 -07:00
Jez Ng 7bbdbacd00 [lld-macho] Use export trie instead of symtab when linking against dylibs
Summary:
This allows us to link against stripped dylibs. Moreover, it's simply
more correct: The symbol table includes symbols that the dylib uses but
doesn't export.

This temporarily regresses our ability to do lazy symbol binding because
dyld_stub_binder isn't in libSystem's export trie. Rather, it is in one
of the sub-libraries libSystem re-exports. (This doesn't affect our
tests since we are mocking out dyld_stub_binder there.) A follow-up diff
will address this by adding support for sub-libraries.

Depends on D79114.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79226
2020-05-09 20:56:22 -07:00
Jez Ng 5d3feefa0d [lld-macho] Dylib symbols should always replace undefined symbols
Summary:
Otherwise we get undefined symbol errors depending on the order of
arguments on the command line.

Depends on D78270.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79114
2020-05-09 20:56:22 -07:00
Jez Ng b3e2fc931d [lld-macho] Support calls to functions in dylibs
Summary:
This diff implements lazy symbol binding -- very similar to the PLT
mechanism in ELF.

ELF's .plt section is broken up into two sections in Mach-O:
StubsSection and StubHelperSection. Calls to functions in dylibs will
end up calling into StubsSection, which contains indirect jumps to
addresses stored in the LazyPointerSection (the counterpart to ELF's
.plt.got).

Initially, the LazyPointerSection contains addresses that point into one
of the entry points in the middle of the StubHelperSection. The code in
StubHelperSection will push on the stack an offset into the
LazyBindingSection. The push is followed by a jump to the beginning of
the StubHelperSection (similar to PLT0), which then calls into
dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this
call looks it up in the GOT.

The stub binder will look up the bind opcodes in the LazyBindingSection
at the given offset. The bind opcodes will tell the binder to update the
address in the LazyPointerSection to point to the symbol, so that
subsequent calls don't have to redo the symbol resolution. The binder
will then jump to the resolved symbol.

Depends on D78269.

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78270
2020-05-09 20:56:22 -07:00
Jez Ng db157d2733 [lld-macho] Follow-up to D77893
Summary:
1. Don't have isHidden() depend on isNeeded(). Whether a section is
  hidden is orthogonal from whether it is needed: hidden sections will
  never have a header regardless of whether they have a body. (I know we
  override this method with return false for synthetic sections, but
  regardless I think it's confusing to write it this way for non-synthetic
  sections.)

2. Don't call writeTo() on unneeded sections. D78270 assumes that this
  is true when implementing the stub helper section.

3. Filter out the unneeded sections early on to avoid having to deal
   with them in multiple places.

4. Remove assumption in test that the referenced file has no other symbols.
  (We should create separate input files for future tests to avoid such
  issues.)

Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79460
2020-05-09 20:56:22 -07:00
Fangrui Song 6939fe6e08 [lld-macho] Support X86_64_RELOC_SIGNED_{1,2,4}
We currently only support extern relocations.
`X86_64_RELOC_SIGNED_{1,2,4}` are like X86_64_RELOC_SIGNED, but with the
implicit addend fixed to 1, 2, and 4, respectively.
See the comment in `lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp RecordX86_64Relocation`.

Reviewed By: int3

Differential Revision: https://reviews.llvm.org/D79311
2020-05-04 15:15:35 -07:00
Kellie Medlin 6cb073133c [lld] Merge Mach-O input sections
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).

Differential Revision: https://reviews.llvm.org/D77893
2020-05-01 16:57:18 -07:00
Jez Ng e82c5e17b5 [lld-macho] Support X86_64_RELOC_BRANCH
Relatively straightforward diff, to set the stage for calling functions
in dylibs.

Differential Revision: https://reviews.llvm.org/D78269
2020-04-29 15:45:01 -07:00
Jez Ng df92377823 [lld-macho] Have Symbol::getVA() return a non-relative virtual address
Currently, getVA() returns a virtual address with the assumption that
the ImageBase is zero. As I understand, this is what lld-ELF is doing.
However, under our current design, it seems like an awkward setup --
I'm finding that I have to add and subtract ImageBase in several places
to make things work out.

As such, I think it's simpler to have getVA() return a non-relative VA,
but I'm not sure if I'm missing something. Would love to hear more from
folks familiar with lld-ELF.

Differential Revision: https://reviews.llvm.org/D78168
2020-04-29 15:44:50 -07:00
Jez Ng 918948db4d [lld-macho] Support reading of universal binaries
Differential Revision: https://reviews.llvm.org/D77006
2020-04-29 15:44:44 -07:00
Jez Ng 89285a1a97 [lld-macho] Disable colors in errors when not printing to a pty
This makes for better tests (and is just the right thing to do)

Differential Revision: https://reviews.llvm.org/D79069
2020-04-29 15:44:35 -07:00
Jez Ng 9854edd817 [lld-macho] Implement basic export trie
Build the trie by performing a three-way radix quicksort: We start by
sorting the strings by their first characters, then sort the strings
with the same first characters by their second characters, and so on
recursively. Each time the prefixes diverge, we add a node to the trie.
Thanks to @ruiu for the idea.

I used llvm-mc's radix quicksort implementation as a starting point. The
trie offset fixpoint code was taken from
MachONormalizedFileBinaryWriter.cpp.

Differential Revision: https://reviews.llvm.org/D76977
2020-04-29 15:44:27 -07:00
Jez Ng 62b8f32f76 [lld-macho][reland] Add support for emitting dylibs with a single symbol
This got reverted due to UBSAN errors in a diff lower in the stack,
which is being fixed in https://reviews.llvm.org/D79050. This diff is
otherwise identical to the original https://reviews.llvm.org/D76908
(which was committed in 9598778bd1 and reverted in b52bc2653b).

Differential Revision: https://reviews.llvm.org/D79051
2020-04-28 17:08:32 -07:00
Jez Ng 4f0cccdd7a [lld-macho][reland] Add basic symbol table output
This diff implements basic support for writing a symbol table.

Attributes are loosely supported for extern symbols and not at all for
other types.

Initial version by Kellie Medlin <kelliem@fb.com>

Originally committed in a3d95a50ee and reverted in fbae153ca5 due to
UBSAN erroring over unaligned writes. That has been fixed in the
current diff with the following changes:

```
diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -133,6 +133,9 @@ SymtabSection::SymtabSection(StringTableSection &stringTableSection)
     : stringTableSection(stringTableSection) {
   segname = segment_names::linkEdit;
   name = section_names::symbolTable;
+  // TODO: When we introduce the SyntheticSections superclass, we should make
+  // all synthetic sections aligned to WordSize by default.
+  align = WordSize;
 }

 size_t SymtabSection::getSize() const {
diff --git a/lld/MachO/Writer.cpp b/lld/MachO/Writer.cpp
--- a/lld/MachO/Writer.cpp
+++ b/lld/MachO/Writer.cpp
@@ -371,6 +371,7 @@ void Writer::assignAddresses(OutputSegment *seg) {
     ArrayRef<InputSection *> sections = p.second;
     for (InputSection *isec : sections) {
       addr = alignTo(addr, isec->align);
+      // We must align the file offsets too to avoid misaligned writes of
+      // structs.
+      fileOff = alignTo(fileOff, isec->align);
       isec->addr = addr;
       addr += isec->getSize();
       fileOff += isec->getFileSize();
@@ -396,6 +397,7 @@ void Writer::writeSections() {
     uint64_t fileOff = seg->fileOff;
     for (auto &sect : seg->getSections()) {
       for (InputSection *isec : sect.second) {
+        fileOff = alignTo(fileOff, isec->align);
         isec->writeTo(buf + fileOff);
         fileOff += isec->getFileSize();
       }
```

I don't think it's easy to write a test for alignment (that doesn't
involve brittly hard-coding file offsets), so there isn't one... but
UBSAN builds pass now.

Differential Revision: https://reviews.llvm.org/D79050
2020-04-28 17:07:06 -07:00
Shoaib Meenai fbae153ca5 Revert "[lld-macho] Add basic symbol table output"
This reverts commit a3d95a50ee.

Reverting due to UBSan failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/40817/steps/check-lld%20ubsan/logs/stdio
2020-04-28 11:34:03 -07:00
Shoaib Meenai b52bc2653b Revert "[lld-macho] Add support for emitting dylibs with a single symbol"
This reverts commit 9598778bd1.

Reverting due to UBSan failures:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/40817/steps/check-lld%20ubsan/logs/stdio
2020-04-28 11:34:03 -07:00
Shoaib Meenai af40bff32d [MachO] Fix UB in memcpy
UBSan complains about a memcpy with a null pointer, so just skip the
memcpy call if the data is empty.
2020-04-28 11:33:54 -07:00
Jez Ng 9598778bd1 [lld-macho] Add support for emitting dylibs with a single symbol
Summary:
Add logic for emitting the correct set of load commands and segments
when `-dylib` is passed.

I haven't gotten to implementing a real export trie yet, so we can only
emit a single symbol, but it's enough to replace the YAML test files
introduced in D76252.

Differential Revision: https://reviews.llvm.org/D76908
2020-04-27 13:33:46 -07:00
Jez Ng a3d95a50ee [lld-macho] Add basic symbol table output
This diff implements basic support for writing a symbol table.

- Attributes are loosely supported for extern symbols and not at all for
  other types

Immediate future work will involve implementing section merging.

Initial version by Kellie Medlin <kelliem@fb.com>

Differential Revision: https://reviews.llvm.org/D76742
2020-04-27 13:33:15 -07:00
Jez Ng 6f63216c3d [lld-macho] Extend SyntheticSections to cover all segment load commands
Previously, the special segments `__PAGEZERO` and `__LINKEDIT` were
implemented as special LoadCommands. This diff implements them using
special sections instead which have an `isHidden()` attribute. We do not
emit section headers for hidden sections, but we use their addresses and
file offsets to determine that of their containing segments. In addition
to allowing us to share more segment-related code, this refactor is also
important for the next step of emitting dylibs:

1) dylibs don't have segments like __PAGEZERO, so we need an easy way of
   omitting them w/o messing up segment indices
2) Unlike the kernel, which is happy to run an executable with
   out-of-order segments, dyld requires dylibs to have their segment
   load commands arranged in increasing address order. The refactor
   makes it easier to implement sorting of sections and segments.

Differential Revision: https://reviews.llvm.org/D76839
2020-04-27 12:58:12 -07:00
Simon Pilgrim 0847cfa334 [lld][macho] Fix implicit dependency on DenseMap.h include
It was depending on CachedHashString.h providing the include.
2020-04-27 14:05:29 +01:00
Jez Ng 060efd24c7 [lld-macho] Add basic support for linking against dylibs
This diff implements:

* dylib loading (much of which is being restored from @pcc and @ruiu's
  original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
  symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT

Differential Revision: https://reviews.llvm.org/D76252
2020-04-21 13:43:19 -07:00
Fangrui Song 6acd300375 Reland D75382 "[lld] Initial commit for new Mach-O backend"
With a fix for http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636

Also trims some unneeded dependencies.
2020-04-02 12:03:43 -07:00
Oliver Stannard af39151f3c Revert "[lld] Initial commit for new Mach-O backend"
This is causing buildbot failures on 32-bit hosts, for example:
http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636

This reverts commit 03f43b3aca.
2020-04-02 13:23:30 +01:00
Jez Ng 03f43b3aca [lld] Initial commit for new Mach-O backend
Summary:
This is the first commit for the new Mach-O backend, designed to roughly
follow the architecture of the existing ELF and COFF backends, and
building off work that @ruiu and @pcc did in a branch a while back. Note
that this is a very stripped-down commit with the bare minimum of
functionality for ease of review. We'll be following up with more diffs
soon.

Currently, we're able to generate a simple "Hello World!" executable
that runs on OS X Catalina (and possibly on earlier OS X versions; I
haven't tested them). (This executable can be obtained by compiling
`test/MachO/relocations.s`.) We're mocking out a few load commands to
achieve this -- for example, we can't load dynamic libraries, but
Catalina requires binaries to be linked against `dyld`, so we hardcode
the emission of a `LC_LOAD_DYLIB` command. Other mocked out load
commands include LC_SYMTAB and LC_DYSYMTAB.

Differential Revision: https://reviews.llvm.org/D75382
2020-03-31 11:58:47 -07:00