They were previously in SyntheticSections.h, but now there are
a bunch of non-synthetic section names in the list.
Also renamed `__functionStarts` to `__func_starts` for uniformity with
other section names + keeps the name under 16 characters (in case we ever
want to write it out as a real section).
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D98586
This is an initial base commit for ARM64 target arch support. I don't represent that it complete or bug-free, but wish to put it out for review now that some basic things like branch target & load/store address relocs are working.
I can add more tests to this base commit, or add them in follow-up commits.
It is not entirely clear whether I use the "ARM64" (Apple) or "AArch64" (non-Apple) naming convention. Guidance is appreciated.
Differential Revision: https://reviews.llvm.org/D88629
Add per-reloc-type attribute bits and migrate code from per-target file into target independent code, driven by reloc attributes.
Many cleanups
Differential Revision: https://reviews.llvm.org/D95121
We were mishandling the case where both `__tbss` and `__thread_data` sections were
present.
TLVP relocations should be encoded as offsets from the start of `__thread_data`,
even if the symbol is actually located in `__thread_bss`. Previously, we were
writing the offset from the start of the containing section, which doesn't
really make sense since there's no way `tlv_get_addr()` can know which section a
given `tlv$init` symbol is in at runtime.
In addition, this patch ensures that we place `__thread_data` immediately before
`__thread_bss`. This is what ld64 does, likely for performance reasons. Zerofill
sections must also be at the end of their segments; we were already doing this,
but now we ensure that `__thread_bss` occurs before `__bss`, so that it's always
possible to have it contiguous with `__thread_data`.
Fixes llvm.org/PR48657.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D94329
Additionally:
1. Move the helper functions in InputSection.h below the definition of
`InputSection`, so the important stuff is on top
2. Remove unnecessary `explicit`
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D92453
This is the same logic that ld64 uses to determine which sections
contain functions. This was added so that we could determine which
STABS entries should be N_FUN.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D92430
Debug sections contain a large amount of data. In order not to bloat the size
of the final binary, we remove them and instead emit STABS symbols for
`dsymutil` and the debugger to locate their contents in the object files.
With this diff, `dsymutil` is able to locate the debug info. However, we need
a few more features before `lldb` is able to work well with our binaries --
e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols,
emitting `LC_UUID`, and more. Those will be handled in follow-up diffs.
Note also that the STABS we emit differ slightly from what ld64 does. First, we
emit the path to the source file as one `N_SO` symbol instead of two. (`ld64`
emits one `N_SO` for the dirname and one of the basename.) Second, we do not
emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions,
because the `N_FUN` STABS already serve that purpose. @clayborg recommended
these changes based on his knowledge of what the debugging tools look for.
Additionally, this current implementation doesn't accurately reflect the size
of function symbols. It uses the size of their containing sectioins as a proxy,
but that is only accurate if `.subsections_with_symbols` is set, and if there
isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two
options to solve this:
1. We can split up subsections by symbol even if `.subsections_with_symbols`
is not set, but include constraints to ensure those subsections retain
their order in the final output. This is `ld64`'s approach.
2. We could just add a `size` field to our `Symbol` class. This seems simpler,
and I'm more inclined toward it, but I'm not sure if there are use cases
that it doesn't handle well. As such I'm punting on the decision for now.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D89257
The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P
Along the way, make a few neighboring variable names more descriptive.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D87584
References to symbols in dylibs work very similarly regardless of
whether the symbol is a TLV. The main difference is that we have a
separate `__thread_ptrs` section that acts as the GOT for these
thread-locals.
We can identify thread-locals in dylibs by a flag in their export trie
entries, and we cross-check it with the relocations that refer to them
to ensure that we are not using a GOT relocation to reference a
thread-local (or vice versa).
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D85081
Note: What ELF refers to as "TLS", Mach-O seems to refer to as "TLV", i.e.
thread-local variables.
This diff implements support for TLV relocations that reference defined
symbols. On x86_64, TLV relocations are always used with movq opcodes, so for
defined TLVs, we don't need to create a synthetic section to store the
addresses of the symbols -- we can just convert the `movq` to a `leaq`.
One notable quirk of Mach-O's TLVs is that absolute-address relocations
inside TLV-defining sections behave differently -- their addresses are
no longer absolute, but relative to the start of the target section.
(AFAICT, RIP-relative relocations are not allowed in these sections.)
Reviewed By: #lld-macho, compnerd, smeenai
Differential Revision: https://reviews.llvm.org/D85080
This diff makes the behavior in {D80859} and {D81888} apply to
thread-local ZeroFill sections too. I realized this was necessary whie
trying to implement thread-local variables.
Reviewed By: #lld-macho, compnerd, MaskRay
Differential Revision: https://reviews.llvm.org/D85079
Summary:
llvm-mc emits `__bss` sections with an offset of zero, but we weren't expecting
that in our input, so we were copying non-zero data from the start of the file and
putting it in `__bss`, with obviously undesirable runtime results. (It appears that
the kernel will copy those nonzero bytes as long as the offset is nonzero, regardless
of whether S_ZERO_FILL is set.)
I debated on whether to make a special ZeroFillSection -- separate from a
regular InputSection -- but it seemed like too much work for now. But I'm happy
to refactor if anyone feels strongly about having it as a separate class.
Depends on D80857.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80859
Summary:
So things work on 32-bit machines. (@vzakhari reported the
breakage starting from D80177).
Reviewers: #lld-macho, vzakhari
Subscribers: llvm-commits, vzakhari
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81982
Summary:
We should be reading / writing our addends / relocated addresses based on
r_length, and not just based on the type of the relocation. But since only
some r_length values are valid for a given reloc type, I've also added some
validation.
ld64 has code to allow for r_length = 0 in X86_64_RELOC_BRANCH relocs, but I'm
not sure how to create such a relocation...
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80854
Summary:
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.
The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.
We exercise this functionality in our tests by using order files that
rearrange those symbols.
Depends on D79668.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: thakis, llvm-commits, pcc, ruiu
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79926
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.
The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.
We exercise this functionality in our tests by using order files that
rearrange those symbols.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D79926
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).
Differential Revision: https://reviews.llvm.org/D77893
Previously, the special segments `__PAGEZERO` and `__LINKEDIT` were
implemented as special LoadCommands. This diff implements them using
special sections instead which have an `isHidden()` attribute. We do not
emit section headers for hidden sections, but we use their addresses and
file offsets to determine that of their containing segments. In addition
to allowing us to share more segment-related code, this refactor is also
important for the next step of emitting dylibs:
1) dylibs don't have segments like __PAGEZERO, so we need an easy way of
omitting them w/o messing up segment indices
2) Unlike the kernel, which is happy to run an executable with
out-of-order segments, dyld requires dylibs to have their segment
load commands arranged in increasing address order. The refactor
makes it easier to implement sorting of sections and segments.
Differential Revision: https://reviews.llvm.org/D76839
This diff implements:
* dylib loading (much of which is being restored from @pcc and @ruiu's
original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT
Differential Revision: https://reviews.llvm.org/D76252
Summary:
This is the first commit for the new Mach-O backend, designed to roughly
follow the architecture of the existing ELF and COFF backends, and
building off work that @ruiu and @pcc did in a branch a while back. Note
that this is a very stripped-down commit with the bare minimum of
functionality for ease of review. We'll be following up with more diffs
soon.
Currently, we're able to generate a simple "Hello World!" executable
that runs on OS X Catalina (and possibly on earlier OS X versions; I
haven't tested them). (This executable can be obtained by compiling
`test/MachO/relocations.s`.) We're mocking out a few load commands to
achieve this -- for example, we can't load dynamic libraries, but
Catalina requires binaries to be linked against `dyld`, so we hardcode
the emission of a `LC_LOAD_DYLIB` command. Other mocked out load
commands include LC_SYMTAB and LC_DYSYMTAB.
Differential Revision: https://reviews.llvm.org/D75382