Matches the ELF and COFF ports, which use ld.lld and lld-link, respectively.
While here, also move up `cleanupCallback` to match ELF / COFF.
Differential Revision: https://reviews.llvm.org/D97715
The new Darwin backend for LLD is now able to link reasonably large
real-world programs on x86_64. For instance, we have achieved
self-hosting for the X86_64 target, where all LLD tests pass when
building lld with itself on macOS. As such, we would like to make it the
default back-end.
The new port is now named `ld64.lld`, and the old port remains
accessible as `ld64.lld.darwinold`
This [annoucement email][1] has some context. (But note that, unlike
what the email says, we are no longer doing this as part of the LLVM 12
branch cut -- instead we will go into LLVM 13.)
Numerous mechanical test changes were required to make this change; in
the interest of creating something that's reviewable on Phabricator,
I've split out the boring changes into a separate diff (D95905). I plan to
merge its contents with those in this diff before landing.
(@gkm made the original draft of this diff, and he has agreed to let me
take over.)
[1]: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147665.html
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D95204
There was initially some concern around the correct handling of pcrel
section relocations with r_length != 2. But it looks like there are no such
relocations in practice -- x86_64's pcrel section relocs all have r_length == 2,
and ARM64 doesn't even have pcrel section relocs. So we can replace the TODO
with an assert.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D97576
If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via TABLE_NUMBER
relocations against a table symbol.
Also, as before, address-taken functions can also cause the function
table to be created, only with reference-types they additionally cause a
symbol table entry to be emitted.
Differential Revision: https://reviews.llvm.org/D90948
Also a couple of minor cleanups in merge-string.s:
- fix inconsistent use of tabs
- use `.p2align` rather than `.align` since `.p2align` works the
same on all platforms (the meaning of align seems to differ
between platforms according to `AlignmentIsInBytes`.
I noticed these potential cleanups while porting SHF_STRINGS support to
wasm-ld.
Differential Revision: https://reviews.llvm.org/D97647
Only one of the two callers used the lastBinding parameter, so
do that work at that one call site. Extract a ordinalForDylibSymbol()
helper to make this tidy.
No behavior change.
Differential Revision: https://reviews.llvm.org/D97597
Bifurcate the `readFile()` API into ...
* `readRawFile()` which performs no checks, and
* `readLinkableFile()` which enforces minimum length of 20 bytes, same as ld64
There are no new tests because tweaks to existing tests are sufficient.
Differential Revision: https://reviews.llvm.org/D97610
On arm64, UNSIGNED relocs are the only ones that use embedded addends
instead of the ADDEND relocation.
Also ensure that the addend works when UNSIGNED is part of a SUBTRACTOR
pair.
Reviewed By: #lld-macho, alexshap
Differential Revision: https://reviews.llvm.org/D97105
Also add a few asserts to verify that we are indeed handling an
UNSIGNED relocation as the minued. I haven't made it an actual
user-facing error since I don't think llvm-mc is capable of generating
SUBTRACTOR relocations without an associated UNSIGNED.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D97103
`llvm-mc` doesn't generate any relocations for subtractions
between local symbols -- they must be global -- so the previous test
wasn't actually testing any relocation logic. I've fixed that and
extended the test to cover r_length=3 relocations as well as both x86_64
and arm64.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D97057
Specifically:
- InputChunk::outputOffset -> outSecOffset
- Symbol::get/setVirtualAddress -> get/setVA
- add InputChunk::getOffset helper that takes an offset
These are mostly in preparation for adding support for
SHF_MERGE/SHF_STRINGS but its also good to align with ELF where
possible.
Differential Revision: https://reviews.llvm.org/D97595
Dynamic lookup symbols are symbols that work like dynamic symbols
in ELF: They're not bound to a dylib like normal Mach-O twolevel lookup
symbols, but they live in a global pool and dyld resolves them against
exported symbols from all loaded dylibs.
This adds support for dynamical lookup symbols to lld/mac. They are
represented as DylibSymbols with file set to nullptr.
This also uses this support to implement the -U flag, which makes
a specific symbol that's undefined at the end of the link a
dynamic lookup symbol.
For -U, it'd be sufficient to just to a pass over remaining undefined symbols
at the end of the link and to replace them with dynamic lookup symbols then.
But I'd like to use this code to implement flat_namespace too, and that will
require real support for resolving dynamic lookup symbols in SymbolTable. So
this patch adds this now already.
While writing tests for this, I noticed that we didn't set N_WEAK_DEF in the
symbol table for DylibSymbols, so this fixes that too.
Differential Revision: https://reviews.llvm.org/D97521
For one metadata section usage, each text section references a metadata section.
The metadata sections have a C identifier name to allow the runtime to collect them via `__start_/__stop_` symbols.
Since `__start_`/`__stop_` references are always present from live sections, the
C identifier name sections appear like GC roots, which means they cannot be
discarded by `ld --gc-sections`.
To make such sections GCable, either SHF_LINK_ORDER or a section group is needed.
SHF_LINK_ORDER is not suitable for the references can be inlined into other functions
(See D97430:
Function A (in the section .text.A) references its `__sancov_guard` section.
Function B inlines A (so now .text.B references `__sancov_guard` - this is invalid with the semantics of SHF_LINK_ORDER).
In the linking stage,
if `.text.A` gets discarded, and `__sancov_guard` is retained via the reference from `.text.B`,
the output will be invalid because `__sancov_guard` references the discarded `.text.A`.
LLD errors "sh_link points to discarded section".
)
A section group have size overhead, and is cumbersome when there is just one metadata section.
Add `-z start-stop-gc` to drop the "__start_/__stop_ references retain
non-SHF_LINK_ORDER non-SHF_GROUP C identifier name sections" rule.
We reserve the rights to switch the default in the future.
Reviewed By: phosek, jrtc27
Differential Revision: https://reviews.llvm.org/D96914
When parsing bitcode, convert LTO Symbols to LLD Symbols in order to perform
resolution. The "winning" symbol will then be marked as Prevailing at LTO
compilation time. This is similar to what the other LLD ports do.
This change allows us to handle `linkonce` symbols correctly, and to deal with
duplicate bitcode symbols gracefully. Previously, both scenarios would result in
an assertion failure inside the LTO code, complaining that multiple Prevailing
definitions are not allowed.
While at it, I also added basic logic around visibility. We don't do anything
useful with it yet, but we do check that its value is valid. LLD-ELF appears to
use it only to set FinalDefinitionInLinkageUnit for LTO, which I think is just a
performance optimization.
From my local experimentation, the linker itself doesn't seem to do anything
differently when encountering linkonce / linkonce_odr / weak / weak_odr. So I've
only written a test for one of them. LLD-ELF has more, but they seem to mostly
be testing the intermediate bitcode output of their LTO backend...? I'm far from
an expert here though, so I might very well be missing things.
Reviewed By: #lld-macho, MaskRay, smeenai
Differential Revision: https://reviews.llvm.org/D94342
Remove a stray -lib argument in guardcf-lto.ll; llvm-lib doesn't
support generating import libs from a def file unlike lib.exe.
Previously this worked because the -lib argument was ignored
(printing only a warning).
Differential Revision: https://reviews.llvm.org/D96699
{D95809} introduced a mechanism for synthetic symbol creation of personality
pointers. When multiple section relocations referred to the same personality
pointer, it would deduplicate them. However, it neglected to consider that we
could have symbol relocations that also refer to the same personality pointer.
This diff fixes it.
In practice, this mix of relocations arises when there is a statically-linked
personality routine that is referenced from multiple object files. Within the
same object file, it will be referred to via section relocations, but
(obviously) other object files will refer to it via symbol relocations. Failing
to deduplicate these references resulted in us going over the
3-personality-pointer limit when linking some larger applications.
Fixes llvm.org/PR48389.
Reviewed By: #lld-macho, thakis, alexshap
Differential Revision: https://reviews.llvm.org/D97245
The silent failures had confused me a few times.
I haven't added a similar check for platform yet as we don't yet have logic to
infer the platform automatically, and so adding that check would require
updating dozens of test files.
Reviewed By: #lld-macho, thakis, alexshap
Differential Revision: https://reviews.llvm.org/D97209
I've adjusted the RelocAttrBits to better fit the semantics of
the relocations. In particular:
1. *_UNSIGNED relocations are no longer marked with the `TLV` bit, even
though they can occur within TLV sections. Instead the `TLV` bit is
reserved for relocations that can reference thread-local symbols, and
*_UNSIGNED relocations have their own `UNSIGNED` bit. The previous
implementation caused TLV and regular UNSIGNED semantics to be
conflated, resulting in rebase opcodes being incorrectly emitted for TLV
relocations.
2. I've added a new `POINTER` bit to denote non-relaxable GOT
relocations. This distinction isn't important on x86 -- the GOT
relocations there are either relaxable or non-relaxable loads -- but
arm64 has `GOT_LOAD_PAGE21` which loads the page that the referent
symbol is in (regardless of whether the symbol ends up in the GOT). This
relocation must reference a GOT symbol (so must have the `GOT` bit set)
but isn't itself relaxable (so must not have the `LOAD` bit). The
`POINTER` bit is used for relocations that *must* reference a GOT
slot.
3. A similar situation occurs for TLV relocations.
4. ld64 supports both a pcrel and an absolute version of
ARM64_RELOC_POINTER_TO_GOT. But the semantics of the absolute version
are pretty weird -- it results in the value of the GOT slot being
written, rather than the address. (That means a reference to a
dynamically-bound slot will result in zeroes being written.) The
programs I've tried linking don't use this form of the relocation, so
I've dropped our partial support for it by removing the relevant
RelocAttrBits.
Reviewed By: alexshap
Differential Revision: https://reviews.llvm.org/D97031
/reproduce: now works correctly with:
- /call-graph-ordering-file:
- /def:
- /natvis:
- /order:
- /pdbstream:
I went through all instances of MemoryBuffer::getFile() and made sure
everything that didn't already do so called takeBuffer().
For natvis, that wasn't possible since DebugInfo/PDB wants to take
owernship of the natvis buffer. For that case, I'm manually adding the
tar file entry.
/natvis: and /pdbstream: is slightly awkward, since createResponseFile()
always adds these flags to the response file but createPDB() (which
ultimately adds the files referenced by the flags) is only called if
/debug is also passed. So when using /natvis: without /debug with
/reproduce:, lld won't warn, but when linking using the response
file from the archive, it won't find the natvis file since it's not
in the tar. This isn't a new issue though, and after this patch things
at least work with using /natvis: _with_ debug with /reproduce:.
(Same for /pdbstream:)
Differential Revison: https://reviews.llvm.org/D97212
If the reference-types feature is enabled, call_indirect will explicitly
reference its corresponding function table via `TABLE_NUMBER`
relocations against a table symbol.
Also, as before, address-taken functions can also cause the function
table to be created, only with reference-types they additionally cause a
symbol table entry to be emitted.
We abuse the used-in-reloc flag on symbols to indicate which tables
should end up in the symbol table. We do this because unfortunately
older wasm-ld will carp if it see a table symbol.
Differential Revision: https://reviews.llvm.org/D90948
The special root semantics for identifier-named sections is meant
specifically for the metadata sections. In the context of group
semantics, where group members are always retained or discarded as a
unit, it's natural not to have this semantics apply to a section in a
group, otherwise we would never discard the group defeating the purpose
of using the group in the first place.
This change modifies the GC behavior so that __start_/__stop_ references
don't retain C identifier named sections in section groups which allows
for these groups to be collected. This matches the behavior of BFD ld.
The only kind of existing case that might break is interdependent
metadata sections that are all in a group together, but that group
doesn't contain any other sections referenced by anything except
implicit inclusion in a `__start_` and/or `__stop_`-referenced
identifier-named section, but such cases should be unlikely.
Differential Revision: https://reviews.llvm.org/D96753
See discussion on https://reviews.llvm.org/D93263
-flat_namespace isn't implemented yet, and neither is -undefined dynamic,
so this makes -undefined pretty pointless in lld/MachO for now. But once
we implement -flat_namespace (which we need to do anyways to get check-llvm
to pass with lld as host linker), the code's already there.
Follow-up to https://reviews.llvm.org/D93263#2491865
Differential Revision: https://reviews.llvm.org/D96963
Differential Revision: https://reviews.llvm.org/D95913
Usage: -bundle_loader <executable>
This option specifies the executable that will load the build output file being linked.
When building a bundle, users can use the --bundle_loader to specify an executable
that contains symbols referenced, but not implemented in the bundle.
For relocatable output that needs the indirect function table, identify
the well-known function table. This allows us to properly fix the
limits on the imported table, and in a followup will allow the element
section to reference the indirect function table even if it's not
assigned to table number 0. Adapt tests for import reordering.
Differential Revision: https://reviews.llvm.org/D96770
Before, --importTable forced the creation of an indirect function table,
whether it was needed or not. Now it only imports a table if needed.
Differential Revision: https://reviews.llvm.org/D96872
ST_Data is used to model BFD `BFD_OBJECT`.
A STT_TLS symbol does not have the `BFD_OBJECT` flag in BFD.
This makes sense because a STT_TLS symbol is like in a different address space,
normal data/object properties do not apply on them.
With this change, a STT_TLS symbol will not be displayed as 'O'.
This new behavior matches objdump.
Differential Revision: https://reviews.llvm.org/D96735
Adds a lld test for a case that the handling added for dynamically
exported symbols in 1487747e99 already
fixes. Because isExportDynamic returns true when the symbol is
SharedKind with default visibility, it will treat as dynamically
exported and block devirtualization when the definition of a vtable
comes from a shared library. This is desireable as it is dangerous to
devirtualize in that case, since there could be hidden overrides in the
shared library. Typically that happens when the shared library header
contains available externally definitions, which applications can
override. An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.
The regular LTO case in the new test already worked, but there are
2 fixes in this patch needed for the index-only case and the hybrid
LTO case. For the index-only case, WPD should not simply ignore
available externally vtables. A follow on fix will be made to clang to
emit type metadata for those vtables, which the new test is modeling.
For the hybrid case, we need to ensure when the module is split that any
llvm.*used globals are cloned to the regular LTO split module so
available externally vtable definitions are not prematurely deleted.
Another follow on fix will add the equivalent gold test, which requires
a small fix to the plugin to treat symbols in dynamic libraries the same
way lld already is.
Differential Revision: https://reviews.llvm.org/D96721
This change introduces support for zero flag ELF section groups to lld.
lld already supports COMDAT sections, which in ELF are a special type of
ELF section groups. These are generally useful to enable linker GC where
you want a group of sections to always travel together, that is to be
either retained or discarded as a whole, but without the COMDAT
semantics. Other ELF linkers already support zero flag ELF section
groups and this change helps us reach feature parity.
Differential Revision: https://reviews.llvm.org/D96636
As we don't sort local symbols, don't sort non-local symbols. This makes
non-local symbols appear in their register order, which matches GNU as. The
register order is nice in that you can write tests with interleaved CHECK
prefixes, e.g.
```
// CHECK: something about foo
.globl foo
foo:
// CHECK: something about bar
.globl bar
bar:
```
With the lexicographical order, the user needs to place lexicographical smallest
symbol first or keep CHECK prefixes in one place.