llvm-project

Commit Graph

Author	SHA1	Message	Date
Keith Smiley	0bc100986c	[lld-macho] Add support for -alias This creates a symbol alias similar to --defsym in the elf linker. This is used by swiftpm for all executables, so it's useful to support. This doesn't implement -alias_list but that could be done pretty easily as needed. Differential Revision: https://reviews.llvm.org/D129938	2022-07-19 13:55:56 -07:00
Kaining Zhong	6c641d0de6	[lld-macho] Handle user-provided dtrace symbols to avoid linking failure This fixes https://github.com/llvm/llvm-project/issues/56238. ld64.lld currently does not generate a __dof section in Mach-O output, and the -no_dtrace_dof option is on by default. However, when there are user-defined dtrace symbols, ld64.lld treats them as undefined symbols, which causes the link to fail because lld cannot find their definitions. This patch allows ld64.lld to rewrite the instructions that call dtrace symbols into instructions such as nop, as ld64 does; therefore, when it encounters user-provided dtrace probes, the link can still succeed. I'm not sure whether support for dtrace is expected in lld, so for now I didn't add code to make lld emit a __dof section like ld64, and only made it possible to link when dtrace symbols are provided. If this feature is needed, I can add that part in Dtrace.cpp & Dtrace.h. Reviewed By: int3, #lld-macho Differential Revision: https://reviews.llvm.org/D129062	2022-07-11 15:32:26 -04:00
Daniel Bertalan	ed39fd515a	[lld-macho] Use source information in duplicate symbol errors Similarly to how undefined symbol diagnostics were changed in D128184, we now show where in the source file duplicate symbols are defined at: ld64.lld: error: duplicate symbol: _foo >> defined in bar.c:42 >> /path/to/bar.o >> defined in baz.c:1 >> /path/to/libbaz.a(baz.o) For objects that don't contain DWARF data, the format is unchanged. A slight difference to undefined symbol diagnostics is that we don't print the name of the symbol on the third line, as it's already contained on the first line. Differential Revision: https://reviews.llvm.org/D128425	2022-06-23 11:07:15 -04:00
Daniel Bertalan	5792797c5b	Reland "[lld-macho] Show source information for undefined references" The error used to look like this: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o:(symbol _baz+0x4) If DWARF line information is available, we now show where in the source the references are coming from: ld64.lld: error: undefined symbol: _foo >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42) >>> /path/to/bar.o:(symbol _baz+0x4) The reland is identical to the first time this landed. The fix was in D128294. This reverts commit `0cc7ad4175`. Differential Revision: https://reviews.llvm.org/D128184	2022-06-21 18:50:06 -04:00
Nico Weber	0cc7ad4175	Revert "[lld-macho] Show source information for undefined references" This reverts commit `cd7624f153`. See https://reviews.llvm.org/D128184#3597534	2022-06-20 19:15:57 -04:00
Daniel Bertalan	cd7624f153	[lld-macho] Show source information for undefined references The error used to look like this: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o:(symbol _baz+0x4) If DWARF line information is available, we now show where in the source the references are coming from: ld64.lld: error: undefined symbol: _foo >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42) >>> /path/to/bar.o:(symbol _baz+0x4) Differential Revision: https://reviews.llvm.org/D128184	2022-06-20 18:49:42 -04:00
Daniel Bertalan	0eec7e2a89	Reland "[lld-macho] Group undefined symbol diagnostics by symbol". This reverts commit `36e7c9a450`. This relands `d61341768c` with the fix described in https://reviews.llvm.org/D127753#3587390	2022-06-15 19:22:39 -04:00
Stella Stamenova	36e7c9a450	Revert "[lld-macho] Group undefined symbol diagnostics by symbol" This reverts commit `d61341768c`. This change broke multiple lld tests, including some sanitizer builds: https://lab.llvm.org/buildbot/#/builders/5/builds/24787/steps/19/logs/stdio	2022-06-15 15:42:26 -07:00
Daniel Bertalan	d61341768c	[lld-macho] Group undefined symbol diagnostics by symbol ld64.lld used to print the "undefined symbol" line for each reference to an undefined symbol previously: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o:(symbol _baz+0x0) ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o:(symbol _quux+0x1) Now they are deduplicated: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o:(symbol _baz+0x0) >>> referenced by /path/to/bar.o:(symbol _quux+0x1) As with the other lld ports, only the first 3 references are printed. Differential Revision: https://reviews.llvm.org/D127753	2022-06-14 16:38:11 -04:00
Daniel Bertalan	f2e92cf60e	[lld-macho] Print the name of functions containing undefined references The error used to look like this: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o Now it displays the name of the function that contains the undefined reference as well: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o:(symbol _baz+0x4) Differential Revision: https://reviews.llvm.org/D127696	2022-06-14 09:41:28 -04:00
Vy Nguyen	66bd14697b	[lld-macho] Demangle symbol names in duplicate-symbol error when -demangle is specified Differential Revision: https://reviews.llvm.org/D127110	2022-06-06 15:12:26 -04:00
Jez Ng	1cff723ff5	[lld-macho][nfc] Use includeInSymtab for all symtab-skipping logic {D123302} got me looking deeper at `includeInSymtab`. I thought it was a little odd that there were excluded (live) symbols for which `includeInSymtab` was false; we shouldn't have so many different ways to exclude a symbol. As such, this diff makes the `L`-prefixed-symbol exclusion code use `includeInSymtab` too. (Note that as part of our support for `__eh_frame`, we will also be excluding all `__eh_frame` symbols from the symtab in a future diff.) Another thing I noticed is that the `emitStabs` code never has to deal with excluded symbols because `SymtabSection::finalize()` already filters them out. As such, I've updated the comments and asserts from {D123302} to reflect this. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D123433	2022-04-11 15:45:46 -04:00
Nico Weber	2cb3d28b17	[lld/mac] Add some comments and asserts I was wondering if SymtabSection::emitStabs() should check defined->includeInSymtab. Add asserts and comments explaining why that's not necessary. No behavior change. Differential Revision: https://reviews.llvm.org/D123302	2022-04-07 15:43:28 -04:00
Jez Ng	ceff23c6e3	[lld-macho] -flat_namespace for dylibs should make all externs interposable All references to interposable symbols can be redirected at runtime to point to a different symbol definition (with the same name). For example, if both dylib A and B define symbol _foo, and we load A before B at runtime, then all references to _foo within dylib B will point to the definition in dylib A. ld64 makes all extern symbols interposable when linking with `-flat_namespace`. TODO 1: Support `-interposable` and `-interposable_list`, which should just be a matter of parsing those CLI flags and setting the `Defined::interposable` bit. TODO 2: Set Reloc::FinalDefinitionInLinkageUnit correctly with this info (we are currently not setting it at all, so we're erring on the conservative side, but we should help the LTO backend generate more optimal code.) Reviewed By: modimo, MaskRay Differential Revision: https://reviews.llvm.org/D119294	2022-03-14 22:18:32 -04:00
Jez Ng	2b78ef06c2	[lld-macho][nfc] Eliminate InputSection::Shared Earlier in LLD's evolution, I tried to create the illusion that subsections were indistinguishable from "top-level" sections. Thus, even though the subsections shared many common field values, I hid those common values away in a private Shared struct (see D105305). More recently, however, @gkm added a public `Section` struct in D113241 that served as an explicit way to store values that are common to an entire set of subsections (aka InputSections). Now that we have another "common value" struct, `Shared` has been rendered redundant. All its fields can be moved into `Section` instead, and the pointer to `Shared` can be replaced with a pointer to `Section`. This `Section` pointer also has the advantage of letting us inspect other subsections easily, simplifying the implementation of {D118798}. P.S. I do think that having both `Section` and `InputSection` makes for a slightly confusing naming scheme. I considered renaming `InputSection` to `Subsection`, but that would break the symmetry with `OutputSection`. It would also make us deviate from LLD-ELF's naming scheme. This change is perf-neutral on my 3.2 GHz 16-Core Intel Xeon W machine: base diff difference (95% CI) sys_time 1.258 ± 0.031 1.248 ± 0.023 [ -1.6% .. +0.1%] user_time 3.659 ± 0.047 3.658 ± 0.041 [ -0.5% .. +0.4%] wall_time 4.640 ± 0.085 4.625 ± 0.063 [ -1.0% .. +0.3%] samples 49 61 There's also no stat sig change in RSS (as measured by `time -l`): base diff difference (95% CI) time 998038627.097 ± 13567305.958 1003327715.556 ± 15210451.236 [ -0.2% .. +1.2%] samples 31 36 Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D118797	2022-02-03 19:55:42 -05:00
Fangrui Song	0aae2bf373	[lld-macho] Add --start-lib --end-lib In ld.lld, when an ObjFile/BitcodeFile is read in --start-lib state, the file is given archive semantics. --end-lib closes the previous --start-lib. A build system can use this feature as an alternative to archives. This patch ports the feature to lld-macho. --start-lib and --end-lib are positional, unlike usual ld64 options. I think the slight drawback does not matter as (a) reusing option names make build systems convenient (b) `--start-lib a.o b.o --end-lib` conveys more information than an alternative design: `-objlib a.o -objlib b.o` because --start-lib makes it clear which objects are in the same conceptual archive. This provides flexibility (c) `-objlib`/`-filelist` interaction may be weird. Close https://github.com/llvm/llvm-project/issues/52931 Reviewed By: #lld-macho, Jez Ng, oontvoo Differential Revision: https://reviews.llvm.org/D116913	2022-01-19 10:14:49 -08:00
Fangrui Song	97a5dccb7d	[lld-macho] Rename LazySymbol to LazyArchive. NFC D116913 will add LazyObject. Rename LazySymbol to LazyArchive to avoid confusion and mirror ELF. Reviewed By: #lld-macho, Jez Ng Differential Revision: https://reviews.llvm.org/D116914	2022-01-11 16:49:06 -08:00
Fangrui Song	477bc36d3b	[lld-macho] Change some global pointers to unique_ptr Similar to D116143. My x86-64 `lld` is ~8KiB smaller. Reviewed By: keith Differential Revision: https://reviews.llvm.org/D116902	2022-01-10 19:39:14 -08:00
Jez Ng	1b44364714	[lld-macho] Unreferenced weak dylib symbols shouldn't fetch archive symbols We were fetching archive symbols too eagerly, bloating binary size as well as just screwing up binaries that expected to look up certain symbols only at runtime. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D115092	2021-12-05 15:11:44 -05:00
Vy Nguyen	9b29dae3ca	[lld-macho] Allow exporting weak_def_can_be_hidden (AKA "autohide") symbols Autohide symbols behave similarly to private_extern symbols. However, ld64 allows exporting autohide symbols, whereas LLD currently does not. This patch allows LLD to export them. Differential Revision: https://reviews.llvm.org/D113167	2021-11-12 21:57:30 -05:00
Vy Nguyen	2e1be96df6	Reland "[lld-macho] Fix assertion failure in registerCompactUnwind" PR/52372 Differential Revision: https://reviews.llvm.org/D112977 New changes: - use llvm-otool instead of `otool`, which doesn't exist on non-OSX platforms - add llvm-otool to the set of tools used by the tests so that the bot will use <build_dir>/bin/llvm-otool instead of the unqualified `llvm-otool` (which may not exist) - update tests since the latest (TOT) llvm-otool prints a space between two bytes and the old one doesn't.	2021-11-09 11:52:46 -05:00
Vy Nguyen	eb4a517816	Revert "[lld-macho] Fix assertion failure in registerCompactUnwind" broke windows build - reverting to investigate This reverts commit `b2d9258474`.	2021-11-09 10:31:47 -05:00
Vy Nguyen	b2d9258474	[lld-macho] Fix assertion failure in registerCompactUnwind PR/52372 Differential Revision: https://reviews.llvm.org/D112977	2021-11-09 10:08:17 -05:00
Jez Ng	002eda7056	[lld-macho] Associate compact unwind entries with function symbols Compact unwind entries (CUEs) contain pointers to their respective function symbols. However, during the link process, it's far more useful to have pointers from the function symbol to the CUE than vice versa. This diff adds that pointer in the form of `Defined::compactUnwind`. In particular, when doing dead-stripping, we want to mark CUEs live when their function symbol is live; and when doing ICF, we want to dedup sections iff the symbols in that section have identical CUEs. In both cases, we want to be able to locate the symbols within a given section, as well as locate the CUEs belonging to those symbols. So this diff also adds `InputSection::symbols`. The ultimate goal of this refactor is to have ICF support dedup'ing functions with unwind info, but that will be handled in subsequent diffs. This diff focuses on simplifying `-dead_strip` -- `findFunctionsWithUnwindInfo` is no longer necessary, and `Defined::isLive()` is now a lot simpler. Moreover, UnwindInfoSection no longer has to check for dead CUEs -- we simply avoid adding them in the first place. Additionally, we now support stripping of dead LSDAs, which follows quite naturally since `markLive()` can now reach them via the CUEs. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D109944	2021-10-26 16:04:15 -04:00
Nico Weber	80caa1eb4a	[lld/mac] Add support for segment$start$ and segment$end$ symbols These symbols are somewhat interesting in that they create non-existing segments, which as far as I know is the only way to create segments that don't contain any sections. Final part of PR50760. Like D106629, but for segments instead of sections. I'm not aware of anything that needs this in practice. Differential Revision: https://reviews.llvm.org/D106767	2021-07-25 18:25:13 -04:00
Nico Weber	04e8d0b62d	[lld/mac] Implement support for section$start and section$end symbols With this, libclang_rt.profile_osx.a can be linked, that is coverage and PGO-instrumented builds should now work with lld. section$start and section$end symbols can create non-existing sections. They're also undefined symbols that are only magic if there isn't a regular symbol with their name, which means they need to be handled in treatUndefined() instead of just looping over all existing sections and adding start and end symbols like the ELF port does. To represent the actual symbols, this uses absolute symbols that get their value updated once an output section is laid out. segment$start and segment$end are still missing for now, but they produce a nicer error message after this patch. Main part of PR50760. Differential Revision: https://reviews.llvm.org/D106629	2021-07-23 16:01:09 -04:00
Nico Weber	2d6fb62ef2	[lld/mac] Handle symbols from -U in treatUndefinedSymbol() In ld64, `-U section$start$FOO$bar` handles `section$start$FOO$bar` as a regular `section$start` symbol, that is section$start processing happens before -U processing. Likely, nobody uses that in practice so it doesn't seem very important to be compatible with this, but it also moves the -U handling code next to the `-undefined dynamic_lookup` handling code, which is nice because they do the same thing. And, in fact, this did identify a bug in a corner case in the intersection of `-undefined dynamic_lookup` and dead-stripping (fix for that in D106565). Vaguely related to PR50760. No interesting behavior change. Differential Revision: https://reviews.llvm.org/D106566	2021-07-22 19:43:57 -04:00
Nico Weber	64be5b7d87	[lld/mac] Implement -arch_multiple This is the other flag clang passes when calling clang with two -arch flags (which means with this, `clang -arch x86_64 -arch arm64 -fuse-ld=lld ...` now no longer prints any warnings \o/). Since clang calls the linker several times in that setup, it's not clear to the user from which invocation the errors are. The flag's help text is Specifies that the linker should augment error and warning messages with the architecture name. In ld64, the only effect of the flag is that undefined symbols are prefaced with Undefined symbols for architecture x86_64: instead of the usual "Undefined symbols:". So for now, let's add this only to undefined symbol errors too. That's probably the most common linker diagnostic. Another idea would be to prefix errors and warnings with "ld64.lld(x86_64):" instead of the usual "ld64.lld:", but I'm not sure if people would misunderstand that as a comment about the arch of ld itself. But open to suggestions on what effect this flag should have :) And we don't have to get it perfect now, we can iterate on it. Differential Revision: https://reviews.llvm.org/D105450	2021-07-06 00:25:18 -04:00
Jez Ng	f6b6e72143	[lld-macho] Factor out common InputSection members We have been creating many ConcatInputSections with identical values due to .subsections_via_symbols. This diff factors out the identical values into a Shared struct, to reduce memory consumption and make copying cheaper. I also changed `callSiteCount` from a uint32_t to a 31-bit field to save an extra word. All in all, this takes InputSection from 120 to 72 bytes (and ConcatInputSection from 160 to 112 bytes), i.e. 30% size reduction in ConcatInputSection. Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W: N Min Max Median Avg Stddev x 20 4.14 4.24 4.18 4.183 0.027548999 + 20 4.04 4.11 4.075 4.0775 0.018027756 Difference at 95.0% confidence -0.1055 +/- 0.0149005 -2.52211% +/- 0.356215% (Student's t, pooled s = 0.0232803) Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D105305	2021-07-01 21:22:39 -04:00
Jez Ng	7f2ba39b16	[lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection These fields currently live in the parent InputSection class, but they should be specific to ConcatInputSection, since the other InputSection classes (that contain literals) aren't atomically live or dead -- rather their component string/int literals should have individual liveness states. (An upcoming diff will add liveness bits for StringPieces and fixed-sized literals.) I also factored out some asserts for isCoalescedWeak() in MarkLive.cpp. We now avoid putting coalesced sections in the `inputSections` vector, so we don't have to check/assert against it everywhere. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D103977	2021-06-11 19:50:08 -04:00
Nico Weber	a5645513db	[lld/mac] Implement -dead_strip Also adds support for live_support sections, no_dead_strip sections, .no_dead_strip symbols. Chromium Framework 345MB unstripped -> 250MB stripped (vs 290MB unstripped -> 236M stripped with ld64). Doing dead stripping is a bit faster than not, because so much less data needs to be processed: % ministat lld_* x lld_nostrip.txt + lld_strip.txt N Min Max Median Avg Stddev x 10 3.929414 4.07692 4.0269079 4.0089678 0.044214794 + 10 3.8129408 3.9025559 3.8670411 3.8642573 0.024779651 Difference at 95.0% confidence -0.144711 +/- 0.0336749 -3.60967% +/- 0.839989% (Student's t, pooled s = 0.0358398) This interacts with many parts of the linker. I tried to add test coverage for all added `isLive()` checks, so that some test will fail if any of them is removed. I checked that the test expectations for the most part match ld64's behavior (except for live-support-iterations.s, see the comment in the test). Interacts with: - debug info - export tries - import opcodes - flags like -exported_symbol(s_list) - -U / dynamic_lookup - mod_init_funcs, mod_term_funcs - weak symbol handling - unwind info - stubs - map files - -sectcreate - undefined, dylib, common, defined (both absolute and normal) symbols It's possible it interacts with more features I didn't think of, of course. I also did some manual testing: - check-llvm check-clang check-lld work with lld with this patch as host linker and -dead_strip enabled - Chromium still starts - Chromium's base_unittests still pass, including unwind tests Implementation-wise, this is InputSection-based, so it'll work for object files with .subsections_via_symbols (which includes all object files generated by clang). I first based this on the COFF implementation, but later realized that things are more similar to ELF. I think it'd be good to refactor MarkLive.cpp to look more like the ELF part at some point, but I'd like to get a working state checked in first. Mechanical parts: - Rename canOmitFromOutput to wasCoalesced (no behavior change) since it really is for weak coalesced symbols - Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP (`.no_dead_strip` in asm) Fixes PR49276. Differential Revision: https://reviews.llvm.org/D103324	2021-06-02 11:09:26 -04:00
Nico Weber	2c1903412b	[lld/mac] Implement removal of unused dylibs This omits load commands for unreferenced dylibs if: - the dylib was loaded implicitly, - it is marked MH_DEAD_STRIPPABLE_DYLIB - or -dead_strip_dylibs is passed This matches ld64. Currently, the "is dylib referenced" state is computed before dead code stripping and is not updated after dead code stripping. This too matches ld64. We should do better here. With this, clang-format linked with lld (like with ld64) no longer has libobjc.A.dylib in `otool -L` output. (It was implicitly loaded as a reexport of CoreFoundation.framework, but it's not needed.) Differential Revision: https://reviews.llvm.org/D103430	2021-06-01 16:06:30 -04:00
Nico Weber	4a12248ee2	[lld/mac] Honor REFERENCED_DYNAMICALLY, set it on __mh_execute_header Has the effect that `__mh_execute_header` stays in the symbol table of outputs even after running `strip` on the output. I don't know if that's important for anything -- my motivation for the patch is just to make the output more similar to ld64. (Corresponds to symbolTableInAndNeverStrip in ld64.) Differential Revision: https://reviews.llvm.org/D102619	2021-05-17 14:22:12 -04:00
Jez Ng	2516b0b526	[lld-macho] Treat undefined symbols uniformly In particular, we should apply the `-undefined` behavior to all such symbols, including those that are specified via the command line (i.e. `-e`, `-u`, and `-exported_symbol`). ld64 supports this too. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D102143	2021-05-10 15:45:54 -04:00
Nico Weber	d5a70db193	[lld/mac] Write every weak symbol only once in the output Before this, if an inline function was defined in several input files, lld would write each copy of the inline function to the output. With this patch, it only writes one copy. Reduces the size of Chromium Framework from 378MB to 345MB (compared to 290MB linked with ld64, which also does dead-stripping, which we don't do yet), and makes linking it faster: N Min Max Median Avg Stddev x 10 3.9957051 4.3496981 4.1411121 4.156837 0.10092097 + 10 3.908154 4.169318 3.9712729 3.9846753 0.075773012 Difference at 95.0% confidence -0.172162 +/- 0.083847 -4.14165% +/- 2.01709% (Student's t, pooled s = 0.0892373) Implementation-wise, when merging two weak symbols, this sets a "canOmitFromOutput" on the InputSection belonging to the weak symbol not put in the symbol table. We then don't write InputSections that have this set, as long as they are not referenced from other symbols. (This happens e.g. for object files that don't set .subsections_via_symbols or that use .alt_entry.) Some restrictions: - not yet done for bitcode inputs - no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) -- Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs) (that is, catch block unwind information) and Personality Routines associated with weak functions still not stripped. This is wasteful, but harmless. - However, this does strip weaks from __unwind_info (which is needed for correctness and not just for size) - This nopes out on InputSections that are referenced from more than one symbol (eg from .alt_entry) for now Things that work based on symbols Just Work: - map files (change in MapFile.cpp is no-op and not needed; I just found it a bit more explicit) - exports Things that work with inputSections need to explicitly check if an inputSection is written (e.g. unwind info). This patch is useful in itself, but it's also likely a useful foundation for dead_strip. I used to have a "canonicalRepresentative" pointer on InputSection instead of just the bool, which would be handy for ICF too. But I ended up not needing it for this patch, so I removed that again for now. Differential Revision: https://reviews.llvm.org/D102076	2021-05-07 17:11:40 -04:00
Jez Ng	05c5363b39	[lld-macho] Parse & emit the N_ARM_THUMB_DEF symbol flag Eventually we'll use this flag to properly handle bl/blx opcodes. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D101558	2021-04-30 16:17:26 -04:00
Jez Ng	eb5b7d4497	[lld-macho] LTO: Unset VisibleToRegularObj where possible This allows LLVM's LTO to internalize symbols that are not referenced directly by regular objects. Naturally, this means we need to track which symbols are referenced by regular objects. The approach taken here is similar to LLD-COFF's: like the COFF port, we extend `SymbolTable::insert()` to set the isVisibleToRegularObj bit. (LLD-ELF relies on the Symbol constructor and `Symbol::mergeProperties()`, but the Mach-O port does not have a `mergeProperties()` equivalent.) From what I can tell, ld64 (which uses libLTO) doesn't do this optimization at all. I'm not even sure libLTO provides a way to do this. Not having ld64's behavior as a reference implementation is unfortunate; instead, I am relying on LLD-ELF/COFF's behavior as references while erring on the conservative side. In particular, LLD-MachO will only do this optimization for executables right now. We also don't attempt it when `-flat_namespace` is used -- otherwise we'd need to scan the symbol table to find matches for every un-namespaced symbol reference, which is expensive. internalize.ll is based off the LLD-ELF tests `internalize-basic.ll` and `internalize-undef.ll`. Looks like @davide added some of LLD-ELF's internalize tests, so adding him as a reviewer... Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D99105	2021-04-15 21:16:33 -04:00
Jez Ng	2461804b48	[lld-macho] Symbol::value should always be uint64_t D98837 migrated a bunch of `value`s to uint64_t, but missed these.	2021-04-06 17:54:11 -04:00
Jez Ng	ceec610754	[lld-macho] Fix & refactor symbol size calculations I noticed two problems with the previous implementation: * N_ALT_ENTRY symbols weren't being handled correctly -- they should determine the size of the previous symbol, even though they don't cause a new section to be created * The last symbol in a section had its size calculated wrongly; the first subsection's size was used instead of the last one I decided to take the opportunity to refactor things as well, mainly to realize my observation [here](https://reviews.llvm.org/D98837#inline-931511) that we could avoid doing a binary search to match symbols with subsections. I think the resulting code is a bit simpler too. N Min Max Median Avg Stddev x 20 4.31 4.43 4.37 4.3775 0.034162922 + 20 4.32 4.43 4.38 4.3755 0.02799906 No difference proven at 95.0% confidence Reviewed By: #lld-macho, alexshap Differential Revision: https://reviews.llvm.org/D99972	2021-04-06 15:10:01 -04:00
Alexander Shaposhnikov	f6ad045366	[lld][MachO] Make emitEndFunStab independent from .subsections_via_symbols This diff addresses FIXME in SyntheticSections.cpp and removes the dependency of emitEndFunStab on .subsections_via_symbols. Test plan: make check-lld-macho Differential revision: https://reviews.llvm.org/D99054	2021-04-01 17:48:09 -07:00
Vy Nguyen	66f340051a	[lld-macho] Define __mh_*_header synthetic symbols. Bug: https://bugs.llvm.org/show_bug.cgi?id=49290 Differential Revision: https://reviews.llvm.org/D97007	2021-03-19 14:14:40 -04:00
Greg McGary	a170533632	[lld-macho][NFC] Drop unnecessary braces around simple if/for bodies Minor cleanup Differential Revision: https://reviews.llvm.org/D98758	2021-03-16 22:39:39 -07:00
Greg McGary	db1e845a96	[lld-macho] Handle error cases properly for -exported_symbol(s_list) This fixes defects in D98223 [lld-macho] implement options -(un)exported_symbol(s_list): * disallow export of hidden symbols * verify that whitelisted literal names are defined in the symbol table * reflect export-status overrides in `nlist` attribute of `N_EXT` or `N_PEXT` Thanks to @thakis for raising these issues Differential Revision: https://reviews.llvm.org/D98381	2021-03-16 21:20:39 -07:00
Jez Ng	d8283d9ddc	[lld-macho][nfc] Give every SyntheticSection a fake InputSection Previously, it was difficult to write code that handled both synthetic and regular sections generically. We solve this problem by creating a fake InputSection at the start of every SyntheticSection. This refactor allows us to handle DSOHandle like a regular Defined symbol (since Defined symbols must be attached to an InputSection), and paves the way for supporting `__mh_*header` symbols. Additionally, it simplifies our binding/rebase code. I did have to extend Defined a little -- it now has a `linkerInternal` flag, to indicate that `___dso_handle` should not be in the final symbol table. I've also added some additional testing for `___dso_handle`. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D98545	2021-03-12 17:26:27 -05:00
Greg McGary	fdc0c21973	[lld-macho][NFC] when reasonable, replace auto keyword with type names lld policy discourages `auto`. Replace it with a type name whenever reasonable. Retain `auto` to avoid ... * redundancy, as for decls such as `auto t = mumble_cast<TYPE >` or similar that specifies the result type on the RHS * verbosity, as for iterators * gratuitous suffering, as for lambdas Along the way, add `const` when appropriate. Note: a future diff will ... * add more `const` qualifiers * remove `opt::` when we are already `using llvm::opt` Differential Revision: https://reviews.llvm.org/D98313	2021-03-09 22:08:32 -08:00
Nico Weber	0658fc654c	[lld/mac] Implement the missing bits of -undefined This adds support for `-undefined dynamic_lookup`, and for `-undefined warning` and `-undefined suppress` with `-flat_namespace`. We just replace undefined symbols with a DynamicLookup when we hit them. With this, `check-llvm` passes when using ld64.lld.darwinnew as host linker. Differential Revision: https://reviews.llvm.org/D97642	2021-03-01 15:30:53 -05:00
Nico Weber	8174f33dc9	[lld/mac] Add support for -flat_namespace -flat_namespace makes lld emit binaries that use name lookup that's more in line with other POSIX systems: Instead of looking up symbols as (dylib,name) pairs by dyld, they're instead looked up just by name. -flat_namespace has three effects: 1. MH_TWOLEVEL and MH_NOUNDEFS are no longer set in the Mach-O header 2. All symbols use BIND_SPECIAL_DYLIB_FLAT_LOOKUP as ordinal 3. When a dylib is added to the link, its dependent dylibs are also added, so that lld can verify that no undefined symbols remain at the end of a link with -flat_namespace. These transitive dylibs are added for symbol resolution, but they are not emitted in LC_LOAD_COMMANDs. -undefined with -flat_namespace still isn't implemented. Before this change, it was impossible to hit that combination because -flat_namespace caused a diagnostic. Now that it no longer does, emit a dedicated temporary diagnostic when both flags are used. Differential Revision: https://reviews.llvm.org/D97641	2021-03-01 15:25:10 -05:00
Nico Weber	cafb6cd10c	[lld/mac] Add some support for dynamic lookup symbols, and implement -U Dynamic lookup symbols are symbols that work like dynamic symbols in ELF: They're not bound to a dylib like normal Mach-O twolevel lookup symbols, but they live in a global pool and dyld resolves them against exported symbols from all loaded dylibs. This adds support for dynamic lookup symbols to lld/mac. They are represented as DylibSymbols with file set to nullptr. This also uses this support to implement the -U flag, which makes a specific symbol that's undefined at the end of the link a dynamic lookup symbol. For -U, it'd be sufficient to just do a pass over remaining undefined symbols at the end of the link and to replace them with dynamic lookup symbols then. But I'd like to use this code to implement flat_namespace too, and that will require real support for resolving dynamic lookup symbols in SymbolTable. So this patch adds this now already. While writing tests for this, I noticed that we didn't set N_WEAK_DEF in the symbol table for DylibSymbols, so this fixes that too. Differential Revision: https://reviews.llvm.org/D97521	2021-02-26 16:50:53 -05:00
Jez Ng	163dcd8513	[lld-macho] Associate each Symbol with an InputFile This makes our error messages more informative. But the bigger motivation is for LTO symbol resolution, which will be in an upcoming diff. The changes in this one are largely mechanical. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D94316	2021-02-03 13:43:47 -05:00
Jez Ng	e98b441a09	[lld-macho] Remove unnecessary llvm:: namespace prefixes	2021-01-09 12:44:35 -05:00


72 Commits