llvm-project

Commit Graph

Author	SHA1	Message	Date
Jez Ng	7b007ac080	[lld-macho][nfc] Move some methods from InputFile to ObjFile Additionally: 1. Move the helper functions in InputSection.h below the definition of `InputSection`, so the important stuff is on top 2. Remove unnecessary `explicit` Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D92453	2020-12-08 10:34:32 -08:00
Nico Weber	16b1f6e385	[mac/lld] Add support for the LC_LINKER_OPTION load command in o files clang puts `-framework CoreFoundation` in this load command for files that use @available / __builtin_available. Without support for this, binaries that don't explicitly link to CoreFoundation fail to link. Differential Revision: https://reviews.llvm.org/D92624	2020-12-04 08:46:53 -05:00
Nico Weber	7cb0a373d1	[mac/lld] Implement -t Goes well with `-why_load` to get an idea of load order. Differential Revision: https://reviews.llvm.org/D92583	2020-12-03 16:02:38 -05:00
Nico Weber	3422f3cc6e	Reland "[mac/lld] Implement -why_load". The problem was that `sym` became replaced in the call to make<ObjFile> and referring to it after that read memory that now stored a different kind of symbol (a Defined instead of a LazySymbol). Since this happens only once per archive, just copy the symbol to the stack before make<ObjFile> and read the copy instead. Originally reviewed at https://reviews.llvm.org/D92496	2020-12-03 08:35:12 -05:00
Nico Weber	ea0029f55d	Revert "[mac/lld] Implement -why_load" This reverts commit `542d3b609d`. Seems to break check-lld. Reverting while I take a look.	2020-12-02 18:57:46 -05:00
Nico Weber	542d3b609d	[mac/lld] Implement -why_load This is useful for debugging why lld loads .o files it shouldn't load. It's also useful for users of lld -- I've used ld64's version of this a few times. Differential Revision: https://reviews.llvm.org/D92496	2020-12-02 18:33:12 -05:00
Nico Weber	ca634393fc	[mac/lld] Make --reproduce work with thin archives See http://reviews.llvm.org/rL268229 and http://reviews.llvm.org/rL313832 which did the same for the ELF port. Differential Revision: https://reviews.llvm.org/D92456	2020-12-02 09:48:31 -05:00
Nico Weber	b2f00f24a3	[mac/lld] Include archive name in diagnostics Also, for .o files, include full path as given on link command line. Before: lld: error: undefined symbol [...], referenced from sandbox_logging.o After: lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o) Move archiveName up to InputFile so we can consistently use toString() to print InputFiles in diags, and pass it to the ObjFile ctor. This matches the ELF and COFF ports. Differential Revision: https://reviews.llvm.org/D92437	2020-12-01 23:00:25 -05:00
Nico Weber	07ab597bb0	[lld/mac] Fix issues around thin archives - most importantly, fix a use-after-free when using thin archives, by putting the archive unique_ptr to the arena allocator. This ports D65565 to MachO - correctly demangle symbol names from archives in diagnostics - add a test for thin archives -- it finds this UaF, but only when running it under asan (it also finds the demangling fix) - make forceLoadArchive() use addFile() with a bool to have the archive loading code in fewer places. no behavior change; matches COFF port a bit better Differential Revision: https://reviews.llvm.org/D92360	2020-12-01 18:48:29 -05:00
Jez Ng	78f6498cdc	[lld-macho] Flesh out STABS implementation This addresses a lot of the comments in {D89257}. Ideally it'd have been done in the same diff, but the commits in between make that difficult. This diff implements: * N_GSYM and N_STSYM, the STABS for global and static symbols * Has the STABS reflect the section IDs of their referent symbols * Ensures we don't fail when encountering absolute symbols or files with no debug info * Sorts STABS symbols by file to minimize the number of N_OSO entries Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D92366	2020-12-01 15:05:21 -08:00
Jez Ng	b768d57b36	[lld-macho] Add archive name and file modtime to STABS output We should also set the modtime when running LTO. That will be done in a future diff, together with support for the `-object_path_lto` flag. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D91318	2020-12-01 15:05:21 -08:00
Jez Ng	3fcb0eeb15	[lld-macho] Emit STABS symbols for debugging, and drop debug sections Debug sections contain a large amount of data. In order not to bloat the size of the final binary, we remove them and instead emit STABS symbols for `dsymutil` and the debugger to locate their contents in the object files. With this diff, `dsymutil` is able to locate the debug info. However, we need a few more features before `lldb` is able to work well with our binaries -- e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols, emitting `LC_UUID`, and more. Those will be handled in follow-up diffs. Note also that the STABS we emit differ slightly from what ld64 does. First, we emit the path to the source file as one `N_SO` symbol instead of two. (`ld64` emits one `N_SO` for the dirname and one for the basename.) Second, we do not emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions, because the `N_FUN` STABS already serve that purpose. @clayborg recommended these changes based on his knowledge of what the debugging tools look for. Additionally, this current implementation doesn't accurately reflect the size of function symbols. It uses the size of their containing sections as a proxy, but that is only accurate if `.subsections_via_symbols` is set, and if there isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two options to solve this: 1. We can split up subsections by symbol even if `.subsections_via_symbols` is not set, but include constraints to ensure those subsections retain their order in the final output. This is `ld64`'s approach. 2. We could just add a `size` field to our `Symbol` class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D89257	2020-12-01 15:05:20 -08:00
Nico Weber	83e60f5a55	[lld/mac] Add --reproduce option This adds support for ld.lld's --reproduce / lld-link's /reproduce: flag to the MachO port. This flag can be added to a link command to make the link write a tar file containing all inputs to the link and a response file containing the link command. This can be used to reproduce the link on another machine, which is useful for sharing bug report inputs or performance test loads. Since the linker is usually called through the clang driver and adding linker flags can be a bit cumbersome, setting the env var `LLD_REPRODUCE=foo.tar` triggers the feature as well. The file response.txt in the archive can be used with `ld64.lld.darwinnew $(cat response.txt)` as long as the contents are smaller than the command-line limit, or with `ld64.lld.darwinnew @response.txt` once D92149 is in. The support in this patch is sufficient to create a tar file for Chromium's base_unittests that can link after unpacking on a different machine. Differential Revision: https://reviews.llvm.org/D92274	2020-11-30 08:40:21 -05:00
Nico Weber	c519bc7e16	lld/MachO: Move MachOOptTable to DriverUtils.cpp, remove DriverUtils.h This makes lld/MachO look more like lld/COFF and lld/ELF, as discussed in D91640.	2020-11-18 12:33:15 -05:00
Jez Ng	21f831134c	[lld-macho] Add very basic support for LTO Just enough to consume some bitcode files and link them. There's more to be done around the symbol resolution API and the LTO config, but I don't yet understand what all the various LTO settings do... Reviewed By: #lld-macho, compnerd, smeenai, MaskRay Differential Revision: https://reviews.llvm.org/D90663	2020-11-10 12:19:28 -08:00
Jez Ng	62a3f0c984	[lld-macho] Support absolute symbols They operate like Defined symbols but with no associated InputSection. Note that `ld64` seems to treat the weak definition flag like a no-op for absolute symbols, so I have replicated that behavior. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D87909	2020-09-25 11:28:35 -07:00
Jez Ng	c32e69b2ce	[lld-macho][re-land] Initial support for common symbols Fix earlier build break via a static_cast. This reverts commit `8112d494d3`. Differential Revision: https://reviews.llvm.org/D86909	2020-09-24 15:00:20 -07:00
Muhammad Omair Javaid	8112d494d3	Revert "[lld-macho] Initial support for common symbols" This reverts commit `63ace77962`. Breaks LLDB Arm build: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4409	2020-09-24 12:26:40 +05:00
Jez Ng	9c70281497	[lld-macho][NFC] Make `!= nullptr` implicit	2020-09-23 20:09:49 -07:00
Jez Ng	63ace77962	[lld-macho] Initial support for common symbols On Unix, it is traditionally allowed to write variable definitions without initialization expressions (such as "int foo;") to header files. These are called tentative definitions. The compiler creates common symbols when it sees tentative definitions. When linking the final binary, if there are remaining common symbols after name resolution is complete, the linker converts them to regular defined symbols in a `__common` section. This diff implements most of that functionality, though we do not yet handle the case where there are both common and non-common definitions of the same symbol. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D86909	2020-09-23 19:26:40 -07:00
Greg McGary	1a3ef0417c	[lld-macho] In the context of relocs, s/target/referent/ for sections & symbols The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P Along the way, make a few neighboring variable names more descriptive. Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D87584	2020-09-22 20:31:01 -07:00
Jez Ng	cbe27316ef	[lld-macho] Implement weak bindings for GOT/TLV Previously, we were only emitting regular bindings to weak dynamic symbols; this diff adds support for the weak bindings too, which can overwrite the regular bindings at runtime. We also treat weak defined global symbols similarly -- since they can also be interposed at runtime, they need to be treated as potentially dynamic symbols. Note that weak bindings differ from regular bindings in that they do not specify the dylib to do the lookup in (i.e. weak symbol lookup happens in a flat namespace.) Differential Revision: https://reviews.llvm.org/D86572	2020-08-26 19:21:09 -07:00
Jez Ng	cf918c809b	[lld-macho] Implement -ObjC It's roughly like -force_load with some filtering. Differential Revision: https://reviews.llvm.org/D86181	2020-08-26 19:20:55 -07:00
Jez Ng	7394460d87	[lld-macho] Handle TAPI and regular re-exports uniformly The re-exports list in a TAPI document can either refer to other inlined TAPI documents, or to on-disk files (which may themselves be TBD or regular files.) Similarly, the re-exports of a regular dylib can refer to a TBD file. Differential Revision: https://reviews.llvm.org/D85404	2020-08-26 19:20:48 -07:00
Jez Ng	6336c042f6	[lld-macho] Make it possible to re-export .tbd files Two things needed fixing for that to work: 1. getName() no longer returns null for DylibFiles constructed from TAPIs 2. markSubLibrary() now accepts .tbd as a possible extension Differential Revision: https://reviews.llvm.org/D86180	2020-08-26 19:20:42 -07:00
Jez Ng	7e6d675499	[lld-macho] Avoid unnecessary shared_ptr in DylibFile ctor DylibFile doesn't store a pointer to its InterfaceFile parameter, so there's no need to use a shared_ptr. Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D85402	2020-08-12 19:50:12 -07:00
Jez Ng	a499898e86	[lld-macho] Generate ObjC symbols from .tbd files I followed similar logic in TapiFile.cpp. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D85255	2020-08-12 19:50:10 -07:00
Jez Ng	3c9100fb78	[lld-macho] Support dynamic linking of thread-locals References to symbols in dylibs work very similarly regardless of whether the symbol is a TLV. The main difference is that we have a separate `__thread_ptrs` section that acts as the GOT for these thread-locals. We can identify thread-locals in dylibs by a flag in their export trie entries, and we cross-check it with the relocations that refer to them to ensure that we are not using a GOT relocation to reference a thread-local (or vice versa). Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D85081	2020-08-12 19:50:09 -07:00
Greg McGary	a379f2c251	[lld-macho] Handle command-line option -sectcreate SEG SECT FILE Handle command-line option `-sectcreate SEG SECT FILE`, which inputs a binary blob from `FILE` into `SEG,SECT` Reviewed By: int3 Differential Revision: https://reviews.llvm.org/D85501	2020-08-10 18:47:13 -07:00
Jez Ng	31d5885842	[lld-macho] Partial support for weak definitions This diff adds support for weak definitions, though it doesn't handle weak symbols in dylibs quite correctly -- we need to emit binding opcodes for them in the weak binding section rather than the lazy binding section. What is covered in this diff: 1. Reading the weak flag from symbol table / export trie, and writing it to the export trie 2. Refining the symbol table's rules for choosing one symbol definition over another. Wrote a few dozen test cases to make sure we were matching ld64's behavior. We can now link basic C++ programs. Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D83532	2020-07-24 15:55:25 -07:00
Jez Ng	74871cdad7	[lld-macho] Ensure __bss sections we output have file offset of zero Summary: llvm-mc emits `__bss` sections with an offset of zero, but we weren't expecting that in our input, so we were copying non-zero data from the start of the file and putting it in `__bss`, with obviously undesirable runtime results. (It appears that the kernel will copy those nonzero bytes as long as the offset is nonzero, regardless of whether S_ZERO_FILL is set.) I debated on whether to make a special ZeroFillSection -- separate from a regular InputSection -- but it seemed like too much work for now. But I'm happy to refactor if anyone feels strongly about having it as a separate class. Depends on D80857. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Reviewed By: smeenai Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80859	2020-06-17 20:41:28 -07:00
Jez Ng	fcde378dcb	[lld-macho] Support non-pcrel section relocs Summary: Depends on D80854. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80855	2020-06-17 20:41:28 -07:00
Saleem Abdulrasool	73312976ad	lld: remove old test support path This removes the stub library that lld injected to satisfy the dependency on the libSystem. Now with TBD support, we can provide the stub library to permit the tests to function properly as they would on a real system. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D81418	2020-06-16 15:57:58 -07:00
Jez Ng	53c796b948	[lld-macho] Properly handle & validate relocation r_length Summary: We should be reading / writing our addends / relocated addresses based on r_length, and not just based on the type of the relocation. But since only some r_length values are valid for a given reloc type, I've also added some validation. ld64 has code to allow for r_length = 0 in X86_64_RELOC_BRANCH relocs, but I'm not sure how to create such a relocation... Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D80854	2020-06-14 16:35:23 -07:00
Saleem Abdulrasool	6fe27b5fed	lld: initial pass at supporting TBD Add support to lld to use Text Based API stubs for linking. This support is incomplete: it does not filter out platforms. It also does not account for architecture specific API handling and potentially does not correctly handle trees of re-exports with inlined libraries being treated as direct children of the top level library.	2020-06-08 18:15:40 -07:00
Jez Ng	1e1a3f67ee	[lld-macho] Ensure reads from nlist_64 structs are aligned when necessary My test refactoring in D80217 seems to have caused yaml2obj to emit unaligned nlist_64 structs, causing ASAN'd lld to be unhappy. I don't think this is an issue with yaml2obj though -- llvm-mc also seems to emit unaligned nlist_64s. This diff makes lld able to safely do aligned reads under ASAN builds while hopefully creating no overhead for regular builds on architectures that support unaligned reads. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D80414	2020-06-02 13:19:38 -07:00
Jez Ng	6f6d91867d	[lld-macho] Add some relocation validation logic I considered making a `Target::validate()` method, but I wasn't sure how I felt about the overhead of doing yet another switch-dispatch on the relocation type, so I put the validation in `relocateOne` instead... might be a bit of a micro-optimization, but `relocateOne` does assume certain things about the relocations it gets, and this error handling makes that explicit, so it's not a totally unreasonable code organization. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D80049	2020-06-02 13:19:38 -07:00
Jez Ng	ce0d8beebc	[lld-macho][re-land] Support X86_64_RELOC_UNSIGNED This reverts commit `db8559eee4`.	2020-05-19 12:31:55 -07:00
Jez Ng	4eb6f4854e	[lld-macho][re-land] Support .subsections_via_symbols Summary: This diff restores and builds upon @pcc and @ruiu's initial work on subsections. The .subsections_via_symbols directive indicates we can split each section along symbol boundaries, unless those symbols have been marked with `.alt_entry`. We exercise this functionality in our tests by using order files that rearrange those symbols. Depends on D79668. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Reviewed By: smeenai Subscribers: thakis, llvm-commits, pcc, ruiu Tags: #llvm Differential Revision: https://reviews.llvm.org/D79926	2020-05-19 12:31:54 -07:00
Jez Ng	70fbbcdd34	Revert "[lld-macho] Support .subsections_via_symbols" Due to build breakage mentioned in https://reviews.llvm.org/D79926. This reverts commit `e270b2f172`.	2020-05-19 08:30:02 -07:00
Jez Ng	db8559eee4	Revert "[lld-macho] Support X86_64_RELOC_UNSIGNED" This reverts commit `1f820e3559`.	2020-05-19 08:30:02 -07:00
Jez Ng	1f820e3559	[lld-macho] Support X86_64_RELOC_UNSIGNED Note that it's only used for non-pc-relative contexts. Reviewed By: MaskRay, smeenai Differential Revision: https://reviews.llvm.org/D80048	2020-05-19 07:46:57 -07:00
Jez Ng	e270b2f172	[lld-macho] Support .subsections_via_symbols This diff restores and builds upon @pcc and @ruiu's initial work on subsections. The .subsections_via_symbols directive indicates we can split each section along symbol boundaries, unless those symbols have been marked with `.alt_entry`. We exercise this functionality in our tests by using order files that rearrange those symbols. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D79926	2020-05-19 07:46:57 -07:00
Jez Ng	55e9eb416e	[lld-macho] Support -order_file The order file indicates how input sections should be sorted within each output section, based on the symbols contained within those sections. This diff sets the stage for implementing and testing `.subsections_via_symbols`, where we will break up InputSections by each symbol and sort them more granularly. Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D79668	2020-05-19 07:46:57 -07:00
Kellie Medlin	2b920ae78c	[lld] Add archive file support to Mach-O backend With this change, basic archive files can be linked together. Input section discovery has been refactored into a function since archive files lazily resolve their symbols / the object files containing those symbols. Reviewed By: int3, smeenai Differential Revision: https://reviews.llvm.org/D78342	2020-05-14 12:58:35 -07:00
Jez Ng	87b6fd3e02	[lld-macho] Add support for creating and reading reexported dylibs This unblocks the linking of real programs, since many core system functions are only available as sub-libraries of libSystem. Differential Revision: https://reviews.llvm.org/D79228	2020-05-12 07:52:03 -07:00
Jez Ng	198b0c57df	[lld-macho] Support pc-relative section relocations Summary: So far we've only supported symbol relocations. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79211	2020-05-09 20:56:23 -07:00
Jez Ng	7bbdbacd00	[lld-macho] Use export trie instead of symtab when linking against dylibs Summary: This allows us to link against stripped dylibs. Moreover, it's simply more correct: The symbol table includes symbols that the dylib uses but doesn't export. This temporarily regresses our ability to do lazy symbol binding because dyld_stub_binder isn't in libSystem's export trie. Rather, it is in one of the sub-libraries libSystem re-exports. (This doesn't affect our tests since we are mocking out dyld_stub_binder there.) A follow-up diff will address this by adding support for sub-libraries. Depends on D79114. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79226	2020-05-09 20:56:22 -07:00
Jez Ng	b3e2fc931d	[lld-macho] Support calls to functions in dylibs Summary: This diff implements lazy symbol binding -- very similar to the PLT mechanism in ELF. ELF's .plt section is broken up into two sections in Mach-O: StubsSection and StubHelperSection. Calls to functions in dylibs will end up calling into StubsSection, which contains indirect jumps to addresses stored in the LazyPointerSection (the counterpart to ELF's .plt.got). Initially, the LazyPointerSection contains addresses that point into one of the entry points in the middle of the StubHelperSection. The code in StubHelperSection will push on the stack an offset into the LazyBindingSection. The push is followed by a jump to the beginning of the StubHelperSection (similar to PLT0), which then calls into dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this call looks it up in the GOT. The stub binder will look up the bind opcodes in the LazyBindingSection at the given offset. The bind opcodes will tell the binder to update the address in the LazyPointerSection to point to the symbol, so that subsequent calls don't have to redo the symbol resolution. The binder will then jump to the resolved symbol. Depends on D78269. Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78270	2020-05-09 20:56:22 -07:00
Kellie Medlin	6cb073133c	[lld] Merge Mach-O input sections Summary: Similar to other formats, input sections in the MachO implementation are now grouped under output sections. This is primarily a refactor, although there's some new logic (like resolving the output section's flags based on its inputs). Differential Revision: https://reviews.llvm.org/D77893	2020-05-01 16:57:18 -07:00


55 Commits