llvm-project

Commit Graph

Author	SHA1	Message	Date
Jez Ng	2179930868	[lld-macho] Fix unwind info personality size This was missed by {D107035}. This fix addresses the following warning: loop variable 'personality' has type 'const uint32_t &' (aka 'const unsigned int &') but is initialized with type 'const unsigned long long' resulting in a copy [-Wrange-loop-analysis] In addition to fixing the size, I also removed the const reference, since there's no performance benefit to avoiding copies of integer-sized values.	2021-08-26 18:52:06 -04:00
Vy Nguyen	0bd14711ac	[lld-macho] Change personalities entry type to Ptr to avoid overflowing uint32 PR51262 Differential Revision: https://reviews.llvm.org/D107035	2021-07-29 14:26:07 -04:00
Jez Ng	428a7c1b38	[lld-macho] Have ICF operate on all sections at once ICF previously operated only within a given OutputSection. We would merge all CFStrings first, then merge all regular code sections in a second phase. This worked fine since CFStrings would never reference regular `__text` sections. However, I would like to expand ICF to merge functions that reference unwind info. Unwind info references the LSDA section, which can in turn reference the `__text` section, so we cannot perform ICF in phases. In order to have ICF operate on InputSections spanning multiple OutputSections, we need a way to distinguish InputSections that are destined for different OutputSections, so that we don't fold across section boundaries. We achieve this by creating OutputSections early, and setting `InputSection::parent` to point to them. This is what LLD-ELF does. (This change should also make it easier to implement the `section$start$` symbols.) This diff also folds InputSections w/o checking their flags, which I think is the right behavior -- if they are destined for the same OutputSection, they will have the same flags in the output (even if their input flags differ). I.e. the `parent` pointer check subsumes the `flags` check. In practice this has nearly no effect (ICF did not become any more effective on chromium_framework). I've also updated ICF.cpp's block comment to better reflect its current status. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D105641	2021-07-17 13:42:51 -04:00
Jez Ng	28a2102ee3	[lld-macho][nfc] Remove unnecessary llvm:: namespace prefixes	2021-07-11 18:36:53 -04:00
Vy Nguyen	3822e3d5b0	[lld-macho] Fix bug in handling unwind info from ld -r Two changes: - Drop assertions that all symbols are in GOT - Set allEntriesAreOmitted correctly Related bug: 50812 Differential Revision: https://reviews.llvm.org/D105364	2021-07-09 22:44:51 -04:00
Nico Weber	8a7b5ebf4d	[lld/mac] Don't crash when dead-stripping removes all unwind info If the input has compact unwind info but all of it is removed after dead stripping, we would crash. Now we don't write any __unwind_info section at all, like ld64. This is a bit awkward to implement because we only know the final state of unwind info after UnwindInfoSectionImpl<Ptr>::finalize(), which is called after sections are added. So add a small amount of bookkeeping to relocateCompactUnwind() instead (which runs earlier) so that we can predict what finalize() will do before it runs. Fixes PR51010. Differential Revision: https://reviews.llvm.org/D105557	2021-07-07 13:05:40 -04:00
Nico Weber	d7e65757ed	[lld/mac] Tweak reserve() argument in unwind code addEntriesForFunctionsWithoutUnwindInfo() can add entries to cuVector, so cuCount can be stale. Use cuVector.size() instead. No behavior change.	2021-07-07 11:44:22 -04:00
Nico Weber	9e24979d73	[lld/mac] Fix function offset on 1st-level unwind table sentinel Two bugs: 1. This tries to take the address of the last symbol plus the length of the last symbol. However, the sorted vector is cuPtrVector, not cuVector. Also, cuPtrVector has tombstone values removed and cuVector doesn't. If there was a stripped value at the end, the "last" element's value was UINT64_MAX, which meant the sentinel value was one less than the length of that "last" dead symbol. 2. We have to subtract in.header->addr. For 64-bit binaries that's (1 << 32) and functionAddress is 32-bit so this is a no-op, but for 32-bit binaries the sentinel's value was too large. I believe this has no effect in practice since the first-level binary search code in libunwind (in UnwindCursor.hpp) does: uint32_t low = 0; uint32_t high = sectionHeader.indexCount(); uint32_t last = high - 1; while (low < high) { uint32_t mid = (low + high) / 2; if ((mid == last) || (topIndex.functionOffset(mid + 1) > targetFunctionOffset)) { low = mid; break; } else { low = mid + 1; } } So the address of the last entry in the first-level table isn't really checked -- except for the very end, but the check against `last` means we just run the loop once more than necessary. But it makes `unwinddump` output look less confusing, and it's what it looks like was the intention here. (No test since I can't think of a way to make FileCheck check that one number is larger than another.) Differential Revision: https://reviews.llvm.org/D105404	2021-07-04 18:06:20 -04:00
Nico Weber	d2d6da3011	[lld/mac] Don't crash on 32-bit output binaries when dead-stripping Fixes PR50974. Differential Revision: https://reviews.llvm.org/D105399	2021-07-04 18:03:31 -04:00
Jez Ng	f6b6e72143	[lld-macho] Factor out common InputSection members We have been creating many ConcatInputSections with identical values due to .subsections_via_symbols. This diff factors out the identical values into a Shared struct, to reduce memory consumption and make copying cheaper. I also changed `callSiteCount` from a uint32_t to a 31-bit field to save an extra word. All in all, this takes InputSection from 120 to 72 bytes (and ConcatInputSection from 160 to 112 bytes), i.e. 30% size reduction in ConcatInputSection. Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W: N Min Max Median Avg Stddev x 20 4.14 4.24 4.18 4.183 0.027548999 + 20 4.04 4.11 4.075 4.0775 0.018027756 Difference at 95.0% confidence -0.1055 +/- 0.0149005 -2.52211% +/- 0.356215% (Student's t, pooled s = 0.0232803) Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D105305	2021-07-01 21:22:39 -04:00
Jez Ng	3a11528d97	[lld-macho] Move ICF earlier to avoid emitting redundant binds This is a pretty big refactoring diff, so here are the motivations: Previously, ICF ran after scanRelocations(), where we emit bind/rebase opcodes etc. So we had a bunch of redundant leftovers after ICF. Having ICF run before Writer seems like a better design, and is what LLD-ELF does, so this diff refactors it accordingly. However, ICF had two dependencies on things occurring in Writer: 1) it needs literals to be deduplicated beforehand and 2) it needs to know which functions have unwind info, which was being handled by `UnwindInfoSection::prepareRelocations()`. In order to do literal deduplication earlier, we need to add literal input sections to their corresponding output sections. So instead of putting all input sections into the big `inputSections` vector, and then filtering them by type later on, I've changed things so that literal sections get added directly to their output sections during the 'gather' phase. Likewise for compact unwind sections -- they get added directly to the UnwindInfoSection now. This latter change is not strictly necessary, but makes it easier for ICF to determine which functions have unwind info. Adding literal sections directly to their output sections means that we can no longer determine `inputOrder` from iterating over `inputSections`. Instead, we store that order explicitly on InputSection. Bloating the size of InputSection for this purpose would be unfortunate -- but LLD-ELF has already solved this problem: it reuses `outSecOff` to store this order value. One downside of this refactor is that we now make an additional pass over the unwind info relocations to figure out which functions have unwind info, since we want to know that before `processRelocations()`. I've made sure to run that extra loop only if ICF is enabled, so there should be no overhead in non-optimizing runs of the linker. 
The upside of all this is that the `inputSections` vector now contains only ConcatInputSections that are destined for ConcatOutputSections, so we can clean up a bunch of code that just existed to filter out other elements from that vector. I will test for the lack of redundant binds/rebases in the upcoming cfstring deduplication diff. While binds/rebases can also happen in the regular `.text` section, they're more common in `.data` sections, so it seems more natural to test it that way. This change is perf-neutral when linking chromium_framework. Reviewed By: oontvoo Differential Revision: https://reviews.llvm.org/D105044	2021-07-01 21:22:38 -04:00
Jez Ng	bf457919f2	[lld-macho][nfc] Remove unnecessary dyn_cast and simplify code	2021-06-28 14:50:44 -04:00
Nico Weber	0f24ffcdfa	[lld/mac] Don't fold UNWIND_X86_64_MODE_STACK_IND unwind entries libunwind uses unwind info to find the function address belonging to the current instruction pointer. libunwind/src/CompactUnwinder.hpp's step functions read functionStart for UNWIND_X86_64_MODE_STACK_IND (and for nothing else), so these encodings need a dedicated entry per function, so that the runtime can get the stacksize off the `subq` instruction in the function's prologue. This matches ld64. (CompactUnwinder.hpp from https://opensource.apple.com/source/libunwind/ also reads functionStart in a few more cases if `SUPPORT_OLD_BINARIES` is set, but it defaults to 0, and ld64 seems to not worry about these additional cases.) Related upstream bug: https://crbug.com/1220175 Differential Revision: https://reviews.llvm.org/D104978	2021-06-27 06:49:32 -04:00
Jez Ng	8aa17d1eae	[lld-macho] Move ICF members from InputSection to ConcatInputSection `icfEqClass` only makes sense on ConcatInputSections since (in contrast to literal sections) they are deduplicated as an atomic unit. Similarly, `hasPersonality` and `replacement` don't make sense on literal sections. This mirrors LLD-ELF, which stores `icfEqClass` only on non-mergeable sections. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D104670	2021-06-24 22:23:12 -04:00
Nico Weber	ef75358080	[lld/mac] Delete incorrect FIXME """Bitcode symbols only exist before LTO runs, and only serve the purpose of resolving visibility so LTO can better optimize. Running LTO creates ObjFiles from BitcodeFiles, and those ObjFiles contain regular Defined symbols (with isec set and all) that will replace the bitcode symbols. So things should (hopefully) work as-is :)""" -- https://reviews.llvm.org/rGdbbc8d8333f29cf4ad6f4793da1adf71bbfdac69#inline-6081	2021-06-23 16:25:34 -04:00
Nico Weber	dbbc8d8333	[lld/mac] Don't crash on absolute symbols in unwind info generation Fixes a regression from `d6565a2dbc` and PR50820.	2021-06-23 14:25:34 -04:00
Nico Weber	d6565a2dbc	[lld/mac] Add explicit "no unwind info" entries for functions without unwind info Fixes PR50529. With this, lld-linked Chromium base_unittests passes on arm macs. Surprisingly, no measurable impact on link time. Differential Revision: https://reviews.llvm.org/D104681	2021-06-22 06:12:42 -04:00
Greg McGary	f27e4548fc	[lld-macho] Implement ICF ICF = Identical C(ode|OMDAT) Folding This is the LLD ELF/COFF algorithm, adapted for MachO. So far, only `-icf all` is supported. In order to support `-icf safe`, we will need to port address-significance tables (`.addrsig` directives) to MachO, which will come in later diffs. `check-{llvm,clang,lld}` have 0 regressions for `lld -icf all` vs. baseline ld64. We only run ICF on `__TEXT,__text` for reasons explained in the block comment in `ConcatOutputSection.cpp`. Here is the perf impact for linking `chromium_framework` on a Mac Pro (16-core Xeon W) for the non-ICF case vs. pre-ICF: ``` N Min Max Median Avg Stddev x 20 4.27 4.44 4.34 4.349 0.043029977 + 20 4.37 4.46 4.405 4.4115 0.025188761 Difference at 95.0% confidence 0.0625 +/- 0.0225658 1.43711% +/- 0.518873% (Student's t, pooled s = 0.0352566) ``` Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D103292	2021-06-17 10:07:44 -07:00
Jez Ng	b8bbb9723a	[lld-macho][nfc] Put back shouldOmitFromOutput() asserts I removed them in rG5de7467e982 but @thakis pointed out that they were useful to keep, so here they are again. I've also converted the `!isCoalescedWeak()` asserts into `!shouldOmitFromOutput()` asserts, since the latter check subsumes the former. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D104169	2021-06-16 15:23:04 -04:00
Jez Ng	b2a0739012	[lld-macho][nfc] Remove InputSection::outSecFileOff `outSecFileOff` and the associated `getFileOffset()` accessors were unnecessary. For all the cases we care about, `outSecFileOff` is the same as `outSecOff`. The only time they deviate is if there are zerofill sections within a given segment. But since zerofill sections are always at the end of a segment, the only sections where the two values deviate are zerofill sections themselves. And we never actually query the outSecFileOff of zerofill sections. As for `getFileOffset()`, the only place it was being used was to calculate the offset of the entry symbol. However, we can compute that value by just taking the difference between the address of the entry symbol and the address of the Mach-O header. In fact, this appears to be what ld64 itself does. This difference is the same as the file offset as long as there are no intervening zerofill sections, but since `__text` is the first section in `__TEXT`, this never happens, so our previous use of `getFileOffset()` was not wrong -- just inefficient. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D104177	2021-06-13 19:51:30 -04:00
Nico Weber	7d4c8a2b8f	[lld/mac] clarify comment This is a "we should do X in the future" fixme, not an "X might go wrong" fixme.	2021-06-13 13:30:07 -04:00
Jez Ng	5de7467e98	[lld-macho] Fix debug build D103977 broke a bunch of stuff as I had only tested the release build which eliminated asserts. I've retained the asserts where possible, but I also removed a bunch instead of adding a whole lot of verbose ConcatInputSection casts.	2021-06-11 20:21:27 -04:00
Jez Ng	7f2ba39b16	[lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection These fields currently live in the parent InputSection class, but they should be specific to ConcatInputSection, since the other InputSection classes (that contain literals) aren't atomically live or dead -- rather their component string/int literals should have individual liveness states. (An upcoming diff will add liveness bits for StringPieces and fixed-sized literals.) I also factored out some asserts for isCoalescedWeak() in MarkLive.cpp. We now avoid putting coalesced sections in the `inputSections` vector, so we don't have to check/assert against it everywhere. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D103977	2021-06-11 19:50:08 -04:00
Jez Ng	04259cde15	[lld-macho] Implement cstring deduplication Our implementation draws heavily from LLD-ELF's, which in turn delegates its string deduplication to llvm-mc's StringTableBuilder. The messiness of this diff is largely due to the fact that we've previously assumed that all InputSections get concatenated together to form the output. This is no longer true with CStringInputSections, which split their contents into StringPieces. StringPieces are much more lightweight than InputSections, which is important as we create a lot of them. They may also overlap in the output, which makes it possible for strings to be tail-merged. In fact, the initial version of this diff implemented tail merging, but I've dropped it for reasons I'll explain later. Alignment Issues Mergeable cstring literals are found under the `__TEXT,__cstring` section. In contrast to ELF, which puts strings that need different alignments into different sections, clang's Mach-O backend puts them all in one section. Strings that need to be aligned have the `.p2align` directive emitted before them, which simply translates into zero padding in the object file. I think ld64 extracts the desired per-string alignment from this data by preserving each string's offset from the last section-aligned address. I'm not entirely certain since it doesn't seem consistent about doing this; but perhaps this can be chalked up to cases where ld64 has to deduplicate strings with different offset/alignment combos -- it seems to pick one of their alignments to preserve. This doesn't seem correct in general; we can in fact induce ld64 to produce a crashing binary just by linking in an additional object file that only contains cstrings and no code. See PR50563 for details. 
Moreover, this scheme seems rather inefficient: since unaligned and aligned strings are all put in the same section, which has a single alignment value, it doesn't seem possible to tell whether a given string doesn't have any alignment requirements. Preserving offset+alignments for strings that don't need it is wasteful. In practice, the crashes seen so far seem to stem from x86_64 SIMD operations on cstrings. X86_64 requires SIMD accesses to be 16-byte-aligned. So for now, I'm thinking of just aligning all strings to 16 bytes on x86_64. This is indeed wasteful, but implementation-wise it's simpler than preserving per-string alignment+offsets. It also avoids the aforementioned crash after deduplication of differently-aligned strings. Finally, the overhead is not huge: using 16-byte alignment (vs no alignment) is only a 0.5% size overhead when linking chromium_framework. With these alignment requirements, it doesn't make sense to attempt tail merging -- most strings will not be eligible since their overlaps aren't likely to start at a 16-byte boundary. Tail-merging (with alignment) for chromium_framework only improves size by 0.3%. It's worth noting that LLD-ELF only does tail merging at `-O2`. By default (at `-O1`), it just deduplicates w/o tail merging. @thakis has also mentioned that they saw it regress compressed size in some cases and therefore turned it off. `ld64` does not seem to do tail merging at all. Performance Numbers CString deduplication reduces chromium_framework from 250MB to 242MB, or about a 3.2% reduction. Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W: N Min Max Median Avg Stddev x 20 3.91 4.03 3.935 3.95 0.034641016 + 20 3.99 4.14 4.015 4.0365 0.0492336 Difference at 95.0% confidence 0.0865 +/- 0.027245 2.18987% +/- 0.689746% (Student's t, pooled s = 0.0425673) As expected, cstring merging incurs some non-trivial overhead. When passing `--no-literal-merge`, it seems that performance is the same, i.e. 
the refactoring in this diff didn't cost us. N Min Max Median Avg Stddev x 20 3.91 4.03 3.935 3.95 0.034641016 + 20 3.89 4.02 3.935 3.9435 0.043197831 No difference proven at 95.0% confidence Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D102964	2021-06-07 23:48:35 -04:00
Nico Weber	a5645513db	[lld/mac] Implement -dead_strip Also adds support for live_support sections, no_dead_strip sections, .no_dead_strip symbols. Chromium Framework 345MB unstripped -> 250MB stripped (vs 290MB unstripped -> 236M stripped with ld64). Doing dead stripping is a bit faster than not, because so much less data needs to be processed: % ministat lld_* x lld_nostrip.txt + lld_strip.txt N Min Max Median Avg Stddev x 10 3.929414 4.07692 4.0269079 4.0089678 0.044214794 + 10 3.8129408 3.9025559 3.8670411 3.8642573 0.024779651 Difference at 95.0% confidence -0.144711 +/- 0.0336749 -3.60967% +/- 0.839989% (Student's t, pooled s = 0.0358398) This interacts with many parts of the linker. I tried to add test coverage for all added `isLive()` checks, so that some test will fail if any of them is removed. I checked that the test expectations for the most part match ld64's behavior (except for live-support-iterations.s, see the comment in the test). Interacts with: - debug info - export tries - import opcodes - flags like -exported_symbol(s_list) - -U / dynamic_lookup - mod_init_funcs, mod_term_funcs - weak symbol handling - unwind info - stubs - map files - -sectcreate - undefined, dylib, common, defined (both absolute and normal) symbols It's possible it interacts with more features I didn't think of, of course. I also did some manual testing: - check-llvm check-clang check-lld work with lld with this patch as host linker and -dead_strip enabled - Chromium still starts - Chromium's base_unittests still pass, including unwind tests Implementation-wise, this is InputSection-based, so it'll work for object files with .subsections_via_symbols (which includes all object files generated by clang). I first based this on the COFF implementation, but later realized that things are more similar to ELF. I think it'd be good to refactor MarkLive.cpp to look more like the ELF part at some point, but I'd like to get a working state checked in first. 
Mechanical parts: - Rename canOmitFromOutput to wasCoalesced (no behavior change) since it really is for weak coalesced symbols - Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP (`.no_dead_strip` in asm) Fixes PR49276. Differential Revision: https://reviews.llvm.org/D103324	2021-06-02 11:09:26 -04:00
Jez Ng	33706191d8	[lld-macho][nfc] Rename MergedOutputSection to ConcatOutputSection The ELF format has the concept of merge sections (marked by SHF_MERGE), which contain data that can be safely deduplicated. The Mach-O equivalents are called literal sections (marked by S_CSTRING_LITERALS or S_{4,8,16}BYTE_LITERALS). While the Mach-O format doesn't use the word 'merge', to avoid confusion, I've renamed our MergedOutputSection to ConcatOutputSection. I believe it's a more descriptive name too. This renaming sets the stage for {D102964}. Reviewed By: #lld-macho, alexshap Differential Revision: https://reviews.llvm.org/D102971	2021-05-25 14:58:29 -04:00
Jez Ng	9cc0d893f7	[lld-macho][nfc] clang-format everything	2021-05-25 14:58:29 -04:00
Nico Weber	4a12248ee2	[lld/mac] Honor REFERENCED_DYNAMICALLY, set it on __mh_execute_header Has the effect that `__mh_execute_header` stays in the symbol table of outputs even after running `strip` on the output. I don't know if that's important for anything -- my motivation for the patch is just to make the output more similar to ld64. (Corresponds to symbolTableInAndNeverStrip in ld64.) Differential Revision: https://reviews.llvm.org/D102619	2021-05-17 14:22:12 -04:00
Nico Weber	7b6dd265ce	[lld/mac] Copy some of the commit message of `d5a70db193` into a comment	2021-05-08 13:03:17 -04:00
Nico Weber	d5a70db193	[lld/mac] Write every weak symbol only once in the output Before this, if an inline function was defined in several input files, lld would write each copy of the inline function to the output. With this patch, it only writes one copy. Reduces the size of Chromium Framework from 378MB to 345MB (compared to 290MB linked with ld64, which also does dead-stripping, which we don't do yet), and makes linking it faster: N Min Max Median Avg Stddev x 10 3.9957051 4.3496981 4.1411121 4.156837 0.10092097 + 10 3.908154 4.169318 3.9712729 3.9846753 0.075773012 Difference at 95.0% confidence -0.172162 +/- 0.083847 -4.14165% +/- 2.01709% (Student's t, pooled s = 0.0892373) Implementation-wise, when merging two weak symbols, this sets a "canOmitFromOutput" on the InputSection belonging to the weak symbol not put in the symbol table. We then don't write InputSections that have this set, as long as they are not referenced from other symbols. (This happens e.g. for object files that don't set .subsections_via_symbols or that use .alt_entry.) Some restrictions: - not yet done for bitcode inputs - no "comdat" handling (`kindNoneGroupSubordinate*` in ld64) -- Frame Descriptor Entries (FDEs), Language Specific Data Areas (LSDAs) (that is, catch block unwind information) and Personality Routines associated with weak functions still not stripped. This is wasteful, but harmless. - However, this does strip weaks from __unwind_info (which is needed for correctness and not just for size) - This nopes out on InputSections that are referenced from more than one symbol (eg from .alt_entry) for now Things that work based on symbols Just Work: - map files (change in MapFile.cpp is no-op and not needed; I just found it a bit more explicit) - exports Things that work with inputSections need to explicitly check if an inputSection is written (e.g. unwind info). This patch is useful in itself, but it's also likely a useful foundation for dead_strip. 
I used to have a "canonicalRepresentative" pointer on InputSection instead of just the bool, which would be handy for ICF too. But I ended up not needing it for this patch, so I removed that again for now. Differential Revision: https://reviews.llvm.org/D102076	2021-05-07 17:11:40 -04:00
Jez Ng	05c5363b39	[lld-macho] Parse & emit the N_ARM_THUMB_DEF symbol flag Eventually we'll use this flag to properly handle bl/blx opcodes. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D101558	2021-04-30 16:17:26 -04:00
Jez Ng	7ca133c360	[lld-macho] std::sort -> llvm::sort	2021-04-27 18:02:59 -04:00
Nico Weber	c1b2a7bfbf	[lld/mac] make a few "named parameter comments" more consistent Most of LLVM and almost all of lld/MachO uses `/foo=/bar` style. No behavior change.	2021-04-22 10:48:03 -04:00
Jez Ng	1460942c15	[lld-macho] Add 32-bit compact unwind support This could probably have been part of D99633, but I split it up to make things a bit more reviewable. I also fixed some bugs in the implementation that were masked through integer underflows when operating in 64-bit mode. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D99823	2021-04-15 21:16:33 -04:00
Jez Ng	8ca366935b	Revert "[lld-macho] Add support for arm64_32" and other stacked diffs This reverts commits: * `8914902b01` * `35a745d814` * `682d1dfe09`	2021-04-13 12:40:58 -04:00
Jez Ng	35a745d814	[lld-macho] Add 32-bit compact unwind support This could probably have been part of D99633, but I split it up to make things a bit more reviewable. I also fixed some bugs in the implementation that were masked through integer underflows when operating in 64-bit mode. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D99823	2021-04-13 10:43:28 -04:00
Jez Ng	817d98d841	[lld-macho][nfc] Refactor in preparation for 32-bit support The main challenge was handling the different on-disk structures (e.g. `mach_header` vs `mach_header_64`). I tried to strike a balance between sprinkling `target->wordSize == 8` checks everywhere (branchy = slow, and ugly) and templatizing everything (causes code bloat, also ugly). I think I struck a decent balance by judicious use of type erasure. Note that LLD-ELF has a similar architecture, though it seems to use more templating. Linking chromium_framework takes about the same time before and after this change: N Min Max Median Avg Stddev x 20 4.52 4.67 4.595 4.5945 0.044423204 + 20 4.5 4.71 4.575 4.582 0.056344803 No difference proven at 95.0% confidence Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D99633	2021-04-02 18:46:39 -04:00
Alexander Shaposhnikov	f6ad045366	[lld][MachO] Make emitEndFunStab independent from .subsections_via_symbols This diff addresses FIXME in SyntheticSections.cpp and removes the dependency of emitEndFunStab on .subsections_via_symbols. Test plan: make check-lld-macho Differential revision: https://reviews.llvm.org/D99054	2021-04-01 17:48:09 -07:00
Greg McGary	427d359721	[lld-macho][NFC] Drop unnecessary macho:: namespace prefix on unambiguous references to Symbol Within `lld/macho/`, only `InputFiles.cpp` and `Symbols.h` require the `macho::` namespace qualifier to disambiguate references to `class Symbol`. Add braces to outer `for` of a 5-level single-line `if`/`for` nest. Differential Revision: https://reviews.llvm.org/D99555	2021-03-30 14:58:35 -07:00
Greg McGary	98fe9e41f7	[lld-macho][NFC] add const to pointer/reference induction variables of range-based for loops Pointer and reference induction variables of range-based for loops are often const, and code authors are often lax about qualifying them. Differential Revision: https://reviews.llvm.org/D98317	2021-03-10 12:07:31 -08:00
Greg McGary	fdc0c21973	[lld-macho][NFC] when reasonable, replace auto keyword with type names lld policy discourages `auto`. Replace it with a type name whenever reasonable. Retain `auto` to avoid ... * redundancy, as for decls such as `auto t = mumble_cast<TYPE >` or similar that specifies the result type on the RHS * verbosity, as for iterators * gratuitous suffering, as for lambdas Along the way, add `const` when appropriate. Note: a future diff will ... * add more `const` qualifiers * remove `opt::` when we are already `using llvm::opt` Differential Revision: https://reviews.llvm.org/D98313	2021-03-09 22:08:32 -08:00
Nico Weber	0658fc654c	[lld/mac] Implement the missing bits of -undefined This adds support for `-undefined dynamic_lookup`, and for `-undefined warning` and `-undefined suppress` with `-flat_namespace`. We just replace undefined symbols with a DynamicLookup when we hit them. With this, `check-llvm` passes when using ld64.lld.darwinnew as host linker. Differential Revision: https://reviews.llvm.org/D97642	2021-03-01 15:30:53 -05:00
Jez Ng	4a5e111aea	[lld-macho] Better deduplication of personality pointers {D95809} introduced a mechanism for synthetic symbol creation of personality pointers. When multiple section relocations referred to the same personality pointer, it would deduplicate them. However, it neglected to consider that we could have symbol relocations that also refer to the same personality pointer. This diff fixes it. In practice, this mix of relocations arises when there is a statically-linked personality routine that is referenced from multiple object files. Within the same object file, it will be referred to via section relocations, but (obviously) other object files will refer to it via symbol relocations. Failing to deduplicate these references resulted in us going over the 3-personality-pointer limit when linking some larger applications. Fixes llvm.org/PR48389. Reviewed By: #lld-macho, thakis, alexshap Differential Revision: https://reviews.llvm.org/D97245	2021-02-23 22:02:38 -05:00
Jez Ng	ac9dd247da	[lld-macho] Try to make ubsan happy Summary: We should avoid passing a null pointer to memcpy.	2021-02-08 14:51:36 -05:00
Jez Ng	5112035751	[lld-macho] Emit LSDA info in compact unwind The LSDA pointers are encoded as offsets from the image base, and arranged in one big contiguous array. Each second-level page records the offset within that LSDA array which corresponds to the LSDA for its first CU entry. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D95810	2021-02-08 13:48:00 -05:00
Jez Ng	525bfa10ec	[lld-macho] Emit personalities in compact unwind Note that there is a triple indirection involved with personalities and compact unwind: 1. Two bits of each CU encoding are used as an offset into the personality array. 2. Each entry of the personality array is an offset from the image base. The resulting address (after adding the image base) should point within the GOT. 3. The corresponding GOT entry contains the actual pointer to the personality function. To further complicate things, when the personality function is in the object file (as opposed to a dylib), its references in `__compact_unwind` may refer to it via a section + offset relocation instead of a symbol relocation. Since our GOT implementation can only create entries for symbols, we have to create a synthetic symbol at the given section offset. Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D95809	2021-02-08 13:47:59 -05:00
Greg McGary	c3e4f3b231	[lld-macho] Fix alignment & layout to match ld64 and satisfy kernel & codesign The Mach kernel & codesign on arm64 macOS have strict requirements for alignment and sequence of segments and sections. Dyld probably is just as picky, though kernel & codesign reject malformed Mach-O files before dyld ever has a chance. I developed this diff by incrementally changing alignments & sequences to match the output of ld64. I stopped when my hello-world test program started working: `codesign --verify` succeeded, and `execve(2)` didn't immediately fail with `errno == EBADMACHO` = `"Malformed Mach-O file"`. Differential Revision: https://reviews.llvm.org/D94935	2021-02-05 17:22:03 -07:00
Nico Weber	568824798f	fix typo to cycle bots	2021-01-01 22:28:11 -05:00
Fangrui Song	791fe7ac57	[lld-macho] Fix memcpy ub after D93267	2020-12-20 20:01:20 -08:00
Greg McGary	99930719c6	Handle overflow beyond the 127 common encodings limit The common encodings table holds only 127 entries. The encodings index for compact entries is 8 bits wide, and indexes 127..255 are stored locally to each second-level page. Prior to this diff, lld would `fatal()` if encodings overflowed the 127 limit. This diff populates a per-second-level-page encodings table as needed. When the per-page encodings table hits its limit, we must terminate the page. If such early termination would consume fewer entries than a regular (non-compact) encoding page, then we prefer the regular format. Caveat: one reason the common-encoding table might overflow is because of DWARF debug-info references, which are not yet implemented and will come with a later diff. Differential Revision: https://reviews.llvm.org/D93267	2020-12-19 14:54:37 -08:00
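
The index split described in the last commit above (99930719c6) can be illustrated with a small C++ sketch. This is not lld's code: `PageEncodings`, `encodingIndex`, and the constants are hypothetical names, and the only facts taken from the commit message are that the common table holds 127 entries, that the index is 8 bits wide, and that indexes 127..255 are local to each second-level page. When the function returns no index, the page-local table is full and the caller would have to terminate the page.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Limits quoted in the commit message; the names are hypothetical.
constexpr std::size_t kFirstLocalIndex = 127;    // common table uses 0..126
constexpr std::size_t kEncodingIndexLimit = 256; // index is 8 bits wide

// Hypothetical per-second-level-page state: encodings that did not fit
// into the common table, in the order they were first seen on this page.
struct PageEncodings {
  std::vector<uint32_t> local;
};

// Returns the 8-bit index for `encoding`, or std::nullopt when the
// page-local table is full and the current page must be terminated.
std::optional<uint8_t>
encodingIndex(const std::unordered_map<uint32_t, uint8_t> &common,
              PageEncodings &page, uint32_t encoding) {
  // 1. Prefer the shared common-encodings table.
  auto it = common.find(encoding);
  if (it != common.end())
    return it->second;

  // 2. Reuse an index already assigned on this page.
  for (std::size_t i = 0; i < page.local.size(); ++i)
    if (page.local[i] == encoding)
      return static_cast<uint8_t>(kFirstLocalIndex + i);

  // 3. Allocate a new page-local slot if an 8-bit index is still free.
  if (kFirstLocalIndex + page.local.size() >= kEncodingIndexLimit)
    return std::nullopt;
  page.local.push_back(encoding);
  return static_cast<uint8_t>(kFirstLocalIndex + page.local.size() - 1);
}
```

Under these assumptions a page can hold at most 129 local encodings. The sketch leaves out the choice between compact and regular page formats that the commit message also mentions.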


53 Commits