llvm-project

Commit Graph

Author	SHA1	Message	Date
Jez Ng	3fcb0eeb15	[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one for the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sections as a proxy, but that is only accurate if `.subsections_via_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_via_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257	2020-12-01 15:05:20 -08:00
Greg McGary	2124ca1d5c	[lld-macho] create __TEXT,__unwind_info from __LD,__compact_unwind Digest the input `__LD,__compact_unwind` and produce the output `__TEXT,__unwind_info`. This is the initial commit with the major functionality. Successor commits will add handling for ... * `__TEXT,__eh_frame` * personalities & LSDA * `-r` pass-through Differential Revision: https://reviews.llvm.org/D86805	2020-09-18 22:01:03 -07:00
Jez Ng	3646ee503d	[lld-macho] Refactor segment/section creation, sorting, and merging Summary: There were a few issues with the previous setup: 1. The section sorting comparator used a declarative map of section names to determine the correct order, but it turns out we need to match on more than just names -- in particular, an upcoming diff will sort based on whether the S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but flexible form. 2. We were sorting OutputSections stored in a MapVector, which left the MapVector in an inconsistent state -- the wrong keys map to the wrong values! In practice, we weren't doing key lookups (only container iteration) after the sort, so this was fine, but it was still a dubious state of affairs. This diff copies the OutputSections to a vector before sorting them. 3. We were adding unneeded OutputSections to OutputSegments and then filtering them out later, which meant that we had to remember whether an OutputSegment was in a pre- or post-filtered state. This diff only adds the sections to the segments if they are needed. In addition to those major changes, two minor ones worth noting: 1. I renamed all OutputSection variable names to `osec`, to parallel `isec`. Previously we were using some inconsistent combination of `osec`, `os`, and `section`. 2. I added a check (and a test) for InputSections with names that clashed with those of our synthetic OutputSections. Reviewers: #lld-macho Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81887	2020-06-21 17:13:59 -07:00
Jez Ng	55e9eb416e	[lld-macho] Support -order_file The order file indicates how input sections should be sorted within each output section, based on the symbols contained within those sections. This diff sets the stage for implementing and testing `.subsections_via_symbols`, where we will break up InputSections by each symbol and sort them more granularly. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D79668	2020-05-19 07:46:57 -07:00
Jez Ng	b3e2fc931d	[lld-macho] Support calls to functions in dylibs Summary: This diff implements lazy symbol binding -- very similar to the PLT mechanism in ELF. ELF's .plt section is broken up into two sections in Mach-O: StubsSection and StubHelperSection. Calls to functions in dylibs will end up calling into StubsSection, which contains indirect jumps to addresses stored in the LazyPointerSection (the counterpart to ELF's .plt.got). Initially, the LazyPointerSection contains addresses that point into one of the entry points in the middle of the StubHelperSection. The code in StubHelperSection will push on the stack an offset into the LazyBindingSection. The push is followed by a jump to the beginning of the StubHelperSection (similar to PLT0), which then calls into dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this call looks it up in the GOT. The stub binder will look up the bind opcodes in the LazyBindingSection at the given offset. The bind opcodes will tell the binder to update the address in the LazyPointerSection to point to the symbol, so that subsequent calls don't have to redo the symbol resolution. The binder will then jump to the resolved symbol. Depends on D78269. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78270	2020-05-09 20:56:22 -07:00
Jez Ng	db157d2733	[lld-macho] Follow-up to D77893 Summary: 1. Don't have isHidden() depend on isNeeded(). Whether a section is hidden is orthogonal from whether it is needed: hidden sections will never have a header regardless of whether they have a body. (I know we override this method with return false for synthetic sections, but regardless I think it's confusing to write it this way for non-synthetic sections.) 2. Don't call writeTo() on unneeded sections. D78270 assumes that this is true when implementing the stub helper section. 3. Filter out the unneeded sections early on to avoid having to deal with them in multiple places. 4. Remove assumption in test that the referenced file has no other symbols. (We should create separate input files for future tests to avoid such issues.) Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79460	2020-05-09 20:56:22 -07:00
Kellie Medlin	6cb073133c	[lld] Merge Mach-O input sections Summary: Similar to other formats, input sections in the MachO implementation are now grouped under output sections. This is primarily a refactor, although there's some new logic (like resolving the output section's flags based on its inputs). Differential Revision: https://reviews.llvm.org/D77893	2020-05-01 16:57:18 -07:00
Jez Ng	6f63216c3d	[lld-macho] Extend SyntheticSections to cover all segment load commands Previously, the special segments `__PAGEZERO` and `__LINKEDIT` were implemented as special LoadCommands. This diff implements them using special sections instead which have an `isHidden()` attribute. We do not emit section headers for hidden sections, but we use their addresses and file offsets to determine those of their containing segments. In addition to allowing us to share more segment-related code, this refactor is also important for the next step of emitting dylibs: 1) dylibs don't have segments like __PAGEZERO, so we need an easy way of omitting them without messing up segment indices 2) Unlike the kernel, which is happy to run an executable with out-of-order segments, dyld requires dylibs to have their segment load commands arranged in increasing address order. The refactor makes it easier to implement sorting of sections and segments. Differential Revision: https://reviews.llvm.org/D76839	2020-04-27 12:58:12 -07:00
Jez Ng	060efd24c7	[lld-macho] Add basic support for linking against dylibs This diff implements: * dylib loading (much of which is being restored from @pcc and @ruiu's original work) * The GOT_LOAD relocation, which allows us to load non-lazy dylib symbols * Basic bind opcode emission, which tells `dyld` how to populate the GOT Differential Revision: https://reviews.llvm.org/D76252	2020-04-21 13:43:19 -07:00
Fangrui Song	6acd300375	Reland D75382 "[lld] Initial commit for new Mach-O backend" With a fix for http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636 Also trims some unneeded dependencies.	2020-04-02 12:03:43 -07:00
Oliver Stannard	af39151f3c	Revert "[lld] Initial commit for new Mach-O backend" This is causing buildbot failures on 32-bit hosts, for example: http://lab.llvm.org:8011/builders/clang-cmake-armv8-lld/builds/3636 This reverts commit `03f43b3aca`.	2020-04-02 13:23:30 +01:00
Jez Ng	03f43b3aca	[lld] Initial commit for new Mach-O backend Summary: This is the first commit for the new Mach-O backend, designed to roughly follow the architecture of the existing ELF and COFF backends, and building off work that @ruiu and @pcc did in a branch a while back. Note that this is a very stripped-down commit with the bare minimum of functionality for ease of review. We'll be following up with more diffs soon. Currently, we're able to generate a simple "Hello World!" executable that runs on OS X Catalina (and possibly on earlier OS X versions; I haven't tested them). (This executable can be obtained by compiling `test/MachO/relocations.s`.) We're mocking out a few load commands to achieve this -- for example, we can't load dynamic libraries, but Catalina requires binaries to be linked against `dyld`, so we hardcode the emission of a `LC_LOAD_DYLIB` command. Other mocked out load commands include LC_SYMTAB and LC_DYSYMTAB. Differential Revision: https://reviews.llvm.org/D75382	2020-03-31 11:58:47 -07:00

12 Commits
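
The lazy binding flow described in commit b3e2fc931d (stub, lazy pointer, stub helper, binder) may be easier to follow as a toy model. The sketch below is a plain Python simulation, not lld code: the class `LazyBinder`, its methods, and the dictionaries standing in for the LazyPointerSection, the LazyBindingSection, and the dylib are all hypothetical names chosen for illustration. It only shows the control flow, namely that the first call goes through the binder and later calls jump straight to the resolved symbol.

```python
# Toy simulation of lazy symbol binding. All names here are hypothetical
# and do not correspond to lld or dyld APIs.

class LazyBinder:
    def __init__(self, dylib_symbols, lazy_bind_info):
        # name -> callable; stands in for symbols exported by a loaded dylib.
        self.dylib_symbols = dylib_symbols
        # offset -> symbol name; stands in for the LazyBindingSection.
        self.lazy_bind_info = lazy_bind_info
        # Counts how many times the binder performed symbol resolution.
        self.resolutions = 0
        # Stands in for the LazyPointerSection: each slot initially points
        # at its own entry in the stub helper.
        self.lazy_pointers = {
            offset: self._make_stub_helper(offset) for offset in lazy_bind_info
        }

    def _make_stub_helper(self, offset):
        # Stub helper entry: passes the lazy binding offset to the binder.
        def helper(*args):
            return self._stub_binder(offset, *args)
        return helper

    def _stub_binder(self, offset, *args):
        # Look up the binding info, resolve the symbol, patch the lazy
        # pointer, then continue to the resolved symbol.
        name = self.lazy_bind_info[offset]
        target = self.dylib_symbols[name]
        self.lazy_pointers[offset] = target
        self.resolutions += 1
        return target(*args)

    def call_stub(self, offset, *args):
        # Stub: an indirect jump through the lazy pointer.
        return self.lazy_pointers[offset](*args)


binder = LazyBinder({"strlen": lambda s: len(s)}, {0: "strlen"})
first = binder.call_stub(0, "hello")   # goes through the binder
second = binder.call_stub(0, "hi")     # jumps directly to the resolved symbol
```

After the first call, `binder.lazy_pointers[0]` holds the resolved function itself, so `binder.resolutions` stays at 1 no matter how many further calls are made.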