733 Commits

Author SHA1 Message Date
Nico Weber 895a72111b [lld/mac] Support writing zippered dylibs and bundles
With -platform_version flags for two distinct platforms,
this writes a LC_BUILD_VERSION header for each.
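
As a rough illustration of "one LC_BUILD_VERSION per platform", here is a minimal sketch that is not taken from the patch. The constant values and field layout follow `<mach-o/loader.h>`; `PlatformInfo`, `encodeVersion` and `makeBuildVersionCommands` are hypothetical names invented for this example, and tool entries are omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Values as defined in <mach-o/loader.h>; the k-prefixed names are local to this sketch.
constexpr uint32_t kLcBuildVersion = 0x32;
constexpr uint32_t kPlatformMacOS = 1;
constexpr uint32_t kPlatformMacCatalyst = 6;

// Same field layout as build_version_command, with no trailing tool entries.
struct BuildVersionCommand {
  uint32_t cmd;
  uint32_t cmdsize;
  uint32_t platform;
  uint32_t minos;
  uint32_t sdk;
  uint32_t ntools;
};

// Hypothetical parsed form of one -platform_version flag.
struct PlatformInfo {
  uint32_t platform;
  uint32_t minVersion;
  uint32_t sdkVersion;
};

// Mach-O packs X.Y.Z into 32 bits as xxxx.yy.zz (16/8/8 bits).
inline uint32_t encodeVersion(uint32_t major, uint32_t minor, uint32_t patch) {
  assert(major <= 0xFFFF && minor <= 0xFF && patch <= 0xFF);
  return (major << 16) | (minor << 8) | patch;
}

// Emits one command per platform, in the order the flags were given.
inline std::vector<BuildVersionCommand>
makeBuildVersionCommands(const std::vector<PlatformInfo> &platforms) {
  std::vector<BuildVersionCommand> cmds;
  for (const PlatformInfo &p : platforms)
    cmds.push_back({kLcBuildVersion,
                    static_cast<uint32_t>(sizeof(BuildVersionCommand)),
                    p.platform, p.minVersion, p.sdkVersion, 0});
  return cmds;
}
```

Passing a macOS entry followed by a Mac Catalyst entry yields two commands, which is the shape of a zippered output.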

The motivation is that this is needed for self-hosting with lld as linker
after D124059.

To create a zippered dylib at the clang driver level, pass

    -target arm64-apple-macos -darwin-target-variant arm64-apple-ios-macabi

to clang.

(In Xcode's clang, `-darwin-target-variant` is spelled just `-target-variant`.)

(If you pass `-target arm64-apple-ios-macabi -target-variant arm64-apple-macos`
instead, ld64 crashes!)

This results in two -platform_version flags being passed to the linker.

ld64 also verifies that the iOS SDK version is at least 13.1, which we don't do
yet. ld64 performs similar checks for other platforms that we also lack, so we
need to add this at some point, but not in this patch.

Only dylib and bundle outputs can be zippered.

I verified that a Catalyst app linked against a dylib created with

    clang -shared foo.cc -o libfoo.dylib \
          -target arm64-apple-macos \
          -target-variant arm64-apple-ios-macabi \
          -Wl,-install_name,@rpath/libfoo.dylib \
          -fuse-ld=$PWD/out/gn/bin/ld64.lld

runs successfully. (The app calls a function `f()` in libfoo.dylib
that returns a const char* "foo", and NSLog(@"%s")s it.)

ld64 is a bit more permissive when writing zippered outputs,
see references to "unzippered twins". That's not implemented yet.
(If anybody wants to implement that, D124275 is a good start.)

Differential Revision: https://reviews.llvm.org/D124887
2022-05-04 19:23:35 -04:00
Alex Borcan e29dc0c6fd [lld] Implement safe icf for MachO
This change implements --icf=safe for MachO based on the addrsig section implemented in D123751.

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D123752
2022-05-03 21:01:03 -04:00
Nico Weber 010acc52a8 [lld/mac] Revert libcompiler_rt.dylib version check change
This reverts D117925 since it's no longer needed after D124336.

Differential Revision: https://reviews.llvm.org/D124354
2022-04-25 06:55:49 -04:00
Nico Weber 3254f46884 [lld/mac] For catalyst outputs, tolerate implicitly linking against mac-only tbd files
Before this,

  clang empty.cc -target x86_64-apple-ios13.1-macabi \
      -framework CoreServices -fuse-ld=lld

would error out with

    ld64.lld: error: path/to/MacOSX.sdk/System/Library/Frameworks/
         CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/
         Versions/A/CarbonCore.tbd(
             /System/Library/Frameworks/
             CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/
             Versions/A/CarbonCore) is incompatible with x86_64 (macCatalyst)

Now it works, like with ld64.

Differential Revision: https://reviews.llvm.org/D124336
2022-04-23 21:43:46 -04:00
Jez Ng 013efeec34 [lld-macho] Remove stray debug printf
Accidentally committed as part of b440c25742.
2022-04-22 22:17:24 -04:00
Vincent Lee 9f2272ff51 [lld-macho] Allow dead_strip to work with exported private extern symbols
Our assert is too strict when running `-dead_strip` with exported symbols.
ld64 treats exported private extern symbols as liveness roots, so loosen
the assert to match ld64's behavior.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D124143
2022-04-22 18:45:27 -07:00
Jez Ng c242e10c74 [lld-macho] Fix ICF crash when comparing symbol relocs
Previously, when encountering a symbol reloc located in a literal section, we
would look up the contents of the literal at the `symbol value + addend` offset
within the literal section. However, it seems that this offset is not guaranteed
to be valid. Instead, we should use just the symbol value to retrieve the
literal's contents, and compare the addend values separately. ld64 seems to do
this.
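
The comparison described above can be sketched as follows. This is a simplified model, not LLD's code: `LiteralSection`, `SymbolReloc`, `literalAt` and `sameLiteralReference` are hypothetical names, and a literal section is modeled as a map from each literal's start offset to its contents.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model: start offset of each literal -> literal contents.
using LiteralSection = std::map<uint64_t, std::string>;

struct SymbolReloc {
  uint64_t symbolValue; // offset of the referenced symbol within its section
  int64_t addend;
};

// Looks up the literal by symbol value only; `value + addend` may not be a
// valid literal start, so it is never used as a lookup key here.
inline const std::string &literalAt(const LiteralSection &sec,
                                    uint64_t symbolValue) {
  auto it = sec.find(symbolValue);
  assert(it != sec.end() && "symbol value should mark the start of a literal");
  return it->second;
}

// Two relocs match when they name equal literals and carry equal addends.
inline bool sameLiteralReference(const LiteralSection &secA,
                                 const SymbolReloc &a,
                                 const LiteralSection &secB,
                                 const SymbolReloc &b) {
  return literalAt(secA, a.symbolValue) == literalAt(secB, b.symbolValue) &&
         a.addend == b.addend;
}
```

Keeping the addend out of the lookup means an out-of-range `value + addend` can no longer index past a literal's start.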

Reviewed By: #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D124223
2022-04-22 15:36:53 -04:00
Jez Ng e6382d23fc [lld-macho][nfc] Simplify unwind section lookup
Previously, we stored a pointer from the ObjFile to its compact unwind
section in order to avoid iterating over the file's sections a second
time. However, given the small number of sections (not subsections) per
file, this caching was really quite unnecessary. We will soon do lookups
for more sections (such as the `__eh_frame` section), so let's simplify
the code first.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D123434
2022-04-22 15:36:53 -04:00
Keith Smiley 2d8cf26d08 [lld-macho] Fix crash on invalid framework tbd
Previously, this would crash because `file` is null when the tbd file is
invalid.

Differential Revision: https://reviews.llvm.org/D124271
2022-04-22 10:26:48 -07:00
Nico Weber 889847922d [lld/mac] Warn that writing zippered outputs isn't implemented
A "zippered" dylib contains several LC_BUILD_VERSION load commands, usually
one for "normal" macOS and one for macCatalyst.

These are usually created by passing something like

   -shared -target arm64-apple-macos -darwin-target-variant arm64-apple-ios13.1-macabi

to clang, which turns it into

    -platform_version macos 12.0.0 12.3 -platform_version "mac catalyst" 14.0.0 15.4

for the linker.

ld64.lld can read these files fine, but it can't write them.  Before this
change, it would just silently use the last -platform_version flag and ignore
the rest.

Instead, this change adds a warning that writing zippered dylibs isn't
implemented yet.

Sadly, parts of ld64.lld's test suite relied on the previous
"silently use last flag" semantics: `%lld` always expanded
to `ld64.lld -platform_version macos 10.15 11.0` and tests that wanted a
different value passed a 2nd `-platform_version` flag later on. But this now
produces a warning if the platform passed to `-platform_version` is not `macos`.

There weren't very many cases of this, so move these to use `%no-arg-lld` and
manually pass `-arch`.

Differential Revision: https://reviews.llvm.org/D124106
2022-04-21 12:05:56 -04:00
Jez Ng 2a6669060f [lld-macho][nfc] De-templatize UnwindInfoSection
Follow-on to {D123276}. Now that we work with an internal
representation of compact unwind entries, we no longer need to template
our UnwindInfoSectionImpl code based on the pointer size of the target
architecture.

I've still kept the split between `UnwindInfoSectionImpl` and
`UnwindInfoSection`. I'd introduced that split in order to do type
erasure, but I think it's still useful to have in order to keep
`UnwindInfoSection`'s definition in the header file clean.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D123277
2022-04-13 16:19:22 -04:00
Jez Ng 1cff723ff5 [lld-macho][nfc] Use includeInSymtab for all symtab-skipping logic
{D123302} got me looking deeper at `includeInSymtab`. I thought it was a
little odd that there were excluded (live) symbols for which
`includeInSymtab` was false; we shouldn't have so many different ways to
exclude a symbol. As such, this diff makes the `L`-prefixed-symbol
exclusion code use `includeInSymtab` too. (Note that as part of our
support for `__eh_frame`, we will also be excluding all `__eh_frame`
symbols from the symtab in a future diff.)

Another thing I noticed is that the `emitStabs` code never has to deal
with excluded symbols because `SymtabSection::finalize()` already
filters them out. As such, I've updated the comments and asserts from
{D123302} to reflect this.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D123433
2022-04-11 15:45:46 -04:00
Jez Ng 82dcf30636 [lld-macho] Use fewer indirections in UnwindInfo implementation
The previous implementation of UnwindInfoSection materialized
all the compact unwind entries & applied their relocations, then parsed
the resulting data to generate the final unwind info. This design had
some unfortunate consequences: since relocations can only be applied
after their referents have had addresses assigned, operations that need
to happen before address assignment must contort themselves. (See
{D113582} and observe how this diff greatly simplifies it.)

Moreover, it made synthesizing new compact unwind entries awkward.
Handling PR50956 will require us to do this synthesis, and is the main
motivation behind this diff.

Previously, instead of generating a new CompactUnwindEntry directly, we
would have had to generate a ConcatInputSection with a number of
`Reloc`s that would then get "flattened" into a CompactUnwindEntry.

This diff introduces an internal representation of `CompactUnwindEntry`
(the former `CompactUnwindEntry` has been renamed to
`CompactUnwindLayout`). The new CompactUnwindEntry stores references to
its personality symbol and LSDA section directly, without the use of
`Reloc` structs.

In addition to being easier to work with, this diff also allows us to
handle unwind info whose personality symbols are located in sections
placed after the `__unwind_info`.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D123276
2022-04-08 23:49:07 -04:00
Jorge Gorbe Moya 627f55b3ae Fix format specifier. NFCI.
Using a portable format specifier avoids a "format specifies type
'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned
long') [-Werror,-Wformat]" error depending on the exact definition of
`uint64_t`.
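
For illustration, one portable way to print a `uint64_t` is the `PRIu64` macro from `<cinttypes>`. The commit message does not say which specifier the fix actually uses, so treat this as a sketch of the general technique; `formatCount` is a hypothetical helper.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a uint64_t without assuming whether it is `unsigned long` or
// `unsigned long long` on the current platform.
inline std::string formatCount(uint64_t value) {
  char buf[32]; // large enough for the 20 digits of UINT64_MAX plus the NUL
  int n = std::snprintf(buf, sizeof(buf), "%" PRIu64, value);
  assert(n > 0 && static_cast<size_t>(n) < sizeof(buf));
  (void)n; // silence unused-variable warnings when asserts are compiled out
  return std::string(buf);
}
```

By contrast, a hard-coded `%llu` only matches on platforms where `uint64_t` is `unsigned long long`.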
2022-04-07 15:26:49 -07:00
Jez Ng b440c25742 [lld-macho][nfc] Give non-text ConcatOutputSections order-independent finalization
This diff is motivated by my work to add proper DWARF unwind support. As
detailed in PR50956, functions that need DWARF unwind need to have
compact unwind entries synthesized for them. These CU entries encode an
offset within `__eh_frame` that points to the corresponding DWARF FDE.

In order to encode this offset during
`UnwindInfoSectionImpl::finalize()`, we need to first assign values to
`InputSection::outSecOff` for each `__eh_frame` subsection. But
`__eh_frame` is ordered after `__unwind_info` (according to ld64 at
least), which puts us in a bit of a bind: `outSecOff` gets assigned
during finalization, but `__eh_frame` is being finalized after
`__unwind_info`.

But it occurred to me that there's no real need for most
ConcatOutputSections to be finalized sequentially. It's only necessary
for text-containing ConcatOutputSections that may contain branch relocs
which may need thunks. ConcatOutputSections containing other types of
data can be finalized in any order.

This diff moves the finalization logic for non-text sections into a
separate `finalizeContents()` method. This method is called before
section address assignment & unwind info finalization takes place. In
theory we could call these `finalizeContents()` methods in parallel, but
in practice it seems to be faster to do it all on the main thread.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D123279
2022-04-07 18:13:27 -04:00
Nico Weber 2cb3d28b17 [lld/mac] Add some comments and asserts
I was wondering if SymtabSection::emitStabs() should check
defined->includeInSymtab. Add asserts and comments explaining why that's not
necessary.

No behavior change.

Differential Revision: https://reviews.llvm.org/D123302
2022-04-07 15:43:28 -04:00
Jez Ng f004ecf6ec [lld-macho][nfc] Remove indirection when looking up common section members
{D118797} means that we can now check the name/segname of a given
section directly, instead of having to look those properties up on one
of its subsections. This allows us to simplify our code.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D123275
2022-04-07 14:28:52 -04:00
Jez Ng da6b6b3c82 [lld-macho][nfc] Factor out findSymbolAtOffset
Our compact unwind handling code currently has some logic to locate a
symbol at a given offset in an InputSection. The EH frame code will need
to do something similar, so let's factor out the code.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D123301
2022-04-07 09:13:39 -04:00
Nico Weber 8c1ea1ab81 [lld/mac] Don't emit stabs entries for functions folded during ICF
This matches ld64, and makes dsymutil work better with lld's output.
Fixes PR54783, see there for details.

Reduces time needed to run dsymutil on Chromium Framework from 8m30s
(which is already down from 26 min with D123218) to 6m30s and removes
many lines of "could not find object file symbol for symbol" from dsymutil output
(previously: several MB of those messages, now dsymutil is completely silent).

Differential Revision: https://reviews.llvm.org/D123252
2022-04-07 08:09:32 -04:00
Simon Pilgrim 156b94c2d3 Fix "result of 32-bit shift implicitly converted to 64 bits" MSVC warning. NFC. 2022-04-07 11:25:09 +01:00
Nikita Popov b8f50abd04 [lld] Remove support for legacy pass manager
This removes options for performing LTO with the legacy pass
manager in LLD. Options that explicitly enable the new pass manager
are retained as no-ops.

Differential Revision: https://reviews.llvm.org/D123219
2022-04-07 10:17:31 +02:00
Jez Ng e4b286211c [lld-macho][nfc] Rearrange order of statements to clarify data dependencies 2022-04-07 00:00:41 -04:00
Nikita Popov ed4e6e0398 [cmake] Remove LLVM_ENABLE_NEW_PASS_MANAGER cmake option
Or rather, error out if it is set to something other than ON. This
removes the ability to enable the legacy pass manager by default,
but does not remove the ability to explicitly enable it through
various flags like -flegacy-pass-manager or -enable-new-pm=0.

I checked, and our test suite definitely doesn't pass with
LLVM_ENABLE_NEW_PASS_MANAGER=OFF anymore.

Differential Revision: https://reviews.llvm.org/D123126
2022-04-06 09:52:21 +02:00
Argyrios Kyrtzidis 330268ba34 [Support/Hash functions] Change the `final()` and `result()` of the hashing functions to return an array of bytes
Returning `std::array<uint8_t, N>` instead of a `StringRef` makes the hashing functions more ergonomic to use:

* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`

As part of this patch also:

* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.
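
To illustrate the API shape only, here is a toy hasher whose `final()` returns a fixed-size byte array by value. It is not one of LLVM's hasher classes: `ToyHasher` is a hypothetical name, and the algorithm is 64-bit FNV-1a, chosen just to keep the sketch short.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy 64-bit FNV-1a hasher; only the return type of final() is the point.
class ToyHasher {
public:
  void update(const uint8_t *data, size_t size) {
    assert(data != nullptr || size == 0);
    for (size_t i = 0; i < size; ++i) {
      state ^= data[i];
      state *= 0x100000001b3ULL; // FNV-1a 64-bit prime
    }
  }

  // Returns the digest by value as big-endian bytes. No member buffer is
  // needed just to back a string view of the result.
  std::array<uint8_t, 8> final() const {
    std::array<uint8_t, 8> out;
    for (size_t i = 0; i < out.size(); ++i)
      out[i] = static_cast<uint8_t>(state >> (56 - 8 * i));
    return out;
  }

private:
  uint64_t state = 0xcbf29ce484222325ULL; // FNV-1a 64-bit offset basis
};
```

Callers can index, compare or copy the returned array directly, with no string manipulation.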

Differential Revision: https://reviews.llvm.org/D123100
2022-04-05 21:38:06 -07:00
Nico Weber 663a7fa712 [lld/mac] Tweak a few comments
Addresses review feedback I had missed on https://reviews.llvm.org/D122624

No behavior change.

Differential Revision: https://reviews.llvm.org/D122904
2022-04-01 19:32:07 -04:00
Leonard Grey a9e325116c Add output filename to UUID hash
Differential Revision: https://reviews.llvm.org/D122843
2022-03-31 18:50:05 -04:00
Roger Kim 34b9729561 [lld-macho][NFC] Encapsulate symbol priority implementation.
Just some code clean up.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D122752
2022-03-31 13:47:38 -04:00
Nico Weber 10cda6e36c [lld/mac] Give range extension thunks for local symbols local visibility
When two local symbols (think: file-scope static functions, or functions in
unnamed namespaces) with the same name in two different translation units
both needed thunks, ld64.lld previously created external thunks for both
of them. These thunks ended up with the same name, leading to a duplicate
symbol error for the thunk symbols.

Instead, give thunks for local symbols local visibility.

(Hitting this requires a jump to a local symbol from over 128 MiB away.
It's unlikely that a single .o file is 128 MiB large, but with ICF
you can end up with a situation where the local symbol is ICF'd with
a symbol in a separate translation unit. And that can introduce a
large enough jump to require a thunk.)
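
For context, the 128 MiB figure matches the reach of an arm64 `b`/`bl` instruction, which encodes a signed 26-bit word offset. The following is a minimal sketch of the range check under that assumption; `needsThunk` and `kBranchRange` are hypothetical names, not LLD's.

```cpp
#include <cassert>
#include <cstdint>

// Assumed arm64 branch reach: signed 26-bit immediate scaled by 4,
// i.e. [-2^27, 2^27 - 4] bytes relative to the branch instruction.
constexpr int64_t kBranchRange = 128 * 1024 * 1024;

// Returns true when a direct branch cannot reach the target and a range
// extension thunk would be required.
inline bool needsThunk(uint64_t branchAddr, uint64_t targetAddr) {
  // arm64 instructions are 4-byte aligned.
  assert(branchAddr % 4 == 0 && targetAddr % 4 == 0);
  int64_t delta = static_cast<int64_t>(targetAddr - branchAddr);
  return delta < -kBranchRange || delta >= kBranchRange;
}
```

Once ICF folds a local function into a copy that lives far away in the output, this check can start returning true for call sites that were originally only a few bytes from their target.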

Fixes PR54599.

Differential Revision: https://reviews.llvm.org/D122624
2022-03-30 16:45:05 -04:00
Roger Kim f858fba631 [lld][Macho][NFC] Encapsulate priorities map in a priority class
`config->priorities` has been used to hold the intermediate state during the construction of the order in which sections should be laid out. This is not a good place to hold this state since the intermediate state is not a "configuration" for LLD. It should be encapsulated in a class for building a mapping from section to priority (which I created in this diff as the `PriorityBuilder` class).

The same thing is being done for `config->callGraphProfile`.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D122156
2022-03-23 13:57:26 -04:00
Jez Ng c9c2363048 [lld-macho][nfc] Don't mix file sizes with addresses
Update DataInCode's calculation of `endAddr` to use `getSize()` instead
of `getFileSize()` -- while in practice they're the same for
non-zerofill sections (which code sections are), we still should treat
address sizes / offsets as distinct from file sizes / offsets.
2022-03-22 17:52:53 -04:00
Jez Ng a993d607de [lld-macho][nfc] Add comment explaining why a cast<> is safe 2022-03-21 07:23:09 -04:00
Jez Ng 1c0234dfcc [lld-macho][nfc] Have findContainingSubsection take a Section
... instead of an instance of `Subsections`.

This simplifies the code slightly since all its callsites have a Section
instance anyway.
2022-03-21 07:23:09 -04:00
Jez Ng 8ce3750ff6 [lld-macho] Set FinalDefinitionInLinkageUnit on most LTO externs
Since Mach-O has a two-level namespace (unlike ELF), we can usually set
this property to true.

(I believe this setting is only available in the new LTO backend, so I
can't really use ld64 / libLTO's behavior as a reference here... I'm
just doing what I think is correct.)

See {D119294} for the work done to calculate the `interposable` used in
this diff.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D119506
2022-03-15 20:25:06 -04:00
Jez Ng ceff23c6e3 [lld-macho] -flat_namespace for dylibs should make all externs interposable
All references to interposable symbols can be redirected at runtime to
point to a different symbol definition (with the same name). For
example, if both dylib A and B define symbol _foo, and we load A before
B at runtime, then all references to _foo within dylib B will point to
the definition in dylib A.

ld64 makes all extern symbols interposable when linking with
`-flat_namespace`.

TODO 1: Support `-interposable` and `-interposable_list`, which should
just be a matter of parsing those CLI flags and setting the
`Defined::interposable` bit.

TODO 2: Set Reloc::FinalDefinitionInLinkageUnit correctly with this info
(we are currently not setting it at all, so we're erring on the
conservative side, but we should help the LTO backend generate more
optimal code.)

Reviewed By: modimo, MaskRay

Differential Revision: https://reviews.llvm.org/D119294
2022-03-14 22:18:32 -04:00
Jez Ng 7f3ddf8443 [lld-macho][nfc] Allow Defined symbols to be placed in binding sections
Previously, we only allowed this for DylibSymbols. However, in order to
properly support `-flat_namespace` as well as `-interposable`, we need
to allow this for Defined symbols too. Therefore we hoist the
`lazyBindOffset` and the `stubsHelperIndex` into the parent Symbol
class.

The actual change to support interposition under `-flat_namespace` is in
{D119294}; the NFC changes here have been split out for easier review.

Perf regression isn't stat sig on my 3.2 GHz 16-Core Intel Xeon W linking
chromium_framework:

             base           diff           difference (95% CI)
  sys_time   1.227 ± 0.021  1.234 ± 0.031  [  -0.3% ..   +1.5%]
  user_time  3.665 ± 0.036  3.674 ± 0.035  [  -0.2% ..   +0.7%]
  wall_time  4.596 ± 0.055  4.609 ± 0.064  [  -0.3% ..   +0.9%]
  samples    34             47

Max RSS regression is barely stat sig:

           base                           diff                           difference (95% CI)
  time     1003664356.324 ± 15404053.912  1010380403.613 ± 10578309.455  [  +0.0% ..   +1.3%]
  samples  37                             31

Reviewed By: modimo

Differential Revision: https://reviews.llvm.org/D121351
2022-03-14 22:18:32 -04:00
Vy Nguyen 0d5e27623a Reland "[lld-macho] Avoid using bump-alloc in TrieBuider""
This reverts commit ee7a286cd3.
2022-03-14 19:33:13 -04:00
Sterling Augustine ee7a286cd3 Revert "[lld-macho] Avoid using bump-alloc in TrieBuider"
This reverts commit e049a87f04.

That commit breaks the build with errors of the form:

/usr/local/google/home/saugustine/llvm/llvm-project/lld/MachO/ExportTrie.cpp:148:11: error: definition of implicitly declared destructor
TrieNode::~TrieNode() {
2022-03-14 15:23:04 -07:00
Vy Nguyen e049a87f04 [lld-macho] Avoid using bump-alloc in TrieBuider
The code can be called from multiple threads, and the allocator is not thread-safe.

Fixes PR54378.

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D121638
2022-03-14 17:22:53 -04:00
Jez Ng 9b7b21d2f7 [lld-macho] Don't allocate memory in parallelForEach
... since BumpPtrAllocator isn't thread-safe.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D121458
2022-03-11 13:32:24 -05:00
Jez Ng fc968bcba4 [lld-macho][nfc] Fix formatting in ld64-vs-lld.rst 2022-03-10 18:33:18 -05:00
Jez Ng 4308f031cd [lld-macho] Align cstrings less conservatively
Previously, we aligned every cstring to 16 bytes as a temporary hack to
deal with https://github.com/llvm/llvm-project/issues/50135. However, it
was highly wasteful in terms of binary size.

To recap, in contrast to ELF, which puts strings that need different
alignments into different sections, `clang`'s Mach-O backend puts them
all in one section.  Strings that need to be aligned have the .p2align
directive emitted before them, which simply translates into zero padding
in the object file. In other words, we have to infer the alignment of
the cstrings from their addresses.

We differ slightly from ld64 in how we've chosen to align these
cstrings. Both LLD and ld64 preserve the number of trailing zeros in
each cstring's address in the input object files. When deduplicating
identical cstrings, both linkers pick the cstring whose address has more
trailing zeros, and preserve the alignment of that address in the final
binary. However, ld64 goes a step further and also preserves the offset
of the cstring from the last section-aligned address.  I.e. if a cstring
is at offset 18 in the input, with a section alignment of 16, then both
LLD and ld64 will ensure the final address is 2-byte aligned (since
`18 == 16 + 2`). But ld64 will also ensure that the final address is of
the form 16 * k + 2 for some k (which implies 2-byte alignment).
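
The arithmetic above can be sketched as follows. This is an illustration, not LLD's implementation: all function names are hypothetical, alignments are expressed as log2 values, and an offset of zero is assumed to inherit the section's alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Counts trailing zero bits of a nonzero value.
inline uint32_t trailingZeros(uint64_t value) {
  assert(value != 0);
  uint32_t n = 0;
  while ((value & 1) == 0) {
    value >>= 1;
    ++n;
  }
  return n;
}

// Alignment (log2) inferred for a cstring at `offset` within an input section
// aligned to 2^sectionAlignLog2. This is the part both linkers preserve.
inline uint32_t inferredAlignLog2(uint64_t offset, uint32_t sectionAlignLog2) {
  if (offset == 0)
    return sectionAlignLog2;
  return std::min(trailingZeros(offset), sectionAlignLog2);
}

// The extra constraint described for ld64: the offset from the last
// section-aligned address, which ld64 also preserves in the output.
inline uint64_t offsetFromSectionAlignment(uint64_t offset,
                                           uint32_t sectionAlignLog2) {
  assert(sectionAlignLog2 < 64);
  return offset & ((uint64_t(1) << sectionAlignLog2) - 1);
}

// When deduplicating identical cstrings, keep the stricter alignment.
inline uint32_t dedupedAlignLog2(uint32_t a, uint32_t b) {
  return std::max(a, b);
}
```

For the example in the text (offset 18, section alignment 16), `inferredAlignLog2(18, 4)` gives 1, i.e. 2-byte alignment, and `offsetFromSectionAlignment(18, 4)` gives 2, matching the `16 * k + 2` form.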

Note that ld64's heuristic means that a dedup'ed cstring's final address is
dependent on the order of the input object files. E.g. if in addition to the
cstring at offset 18 above, we have a duplicate one in another file with a
`.cstring` section alignment of 2 and an offset of zero, then ld64 will pick
the cstring from the object file earlier on the command line (since both have
the same number of trailing zeros in their address). So the final cstring may
either be at some address `16 * k + 2` or at some address `2 * k`.

I've opted not to follow this behavior primarily for implementation
simplicity, and secondarily to save a few more bytes. It's not clear to me
that preserving the section alignment + offset is ever necessary, and there
are many cases that are clearly redundant. In particular, if an x86_64 object
file contains some strings that are accessed via SIMD instructions, then the
.cstring section in the object file will be 16-byte-aligned (since SIMD
requires its operand addresses to be 16-byte aligned). However, there will
typically also be other cstrings in the same file that aren't used via SIMD
and don't need this alignment. They will be emitted at some arbitrary address
`A`, but ld64 will treat them as being 16-byte aligned with an offset of
`A % 16`.

I have verified that the two repros in https://github.com/llvm/llvm-project/issues/50135
work well with the new alignment behavior.

Fixes https://github.com/llvm/llvm-project/issues/54036.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D121342
2022-03-10 15:18:15 -05:00
Jez Ng ce2ae38124 [lld-macho] Deduplicate the `__objc_classrefs` section contents
ld64 breaks down `__objc_classrefs` on a per-word level and deduplicates
them. This greatly reduces the number of bind entries emitted (and
therefore the amount of work `dyld` has to do at runtime). For
chromium_framework, this change to LLD cuts the number of (non-lazy)
binds from 912 to 190, getting us to parity with ld64 in this aspect.
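
As a simplified model of per-word deduplication, the sketch below treats each input word as a reference to a named class symbol and emits one output word per distinct target. `DedupedClassRefs` and `dedupClassRefs` are hypothetical names; the real section holds relocated pointers rather than strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct DedupedClassRefs {
  std::vector<std::string> uniqueTargets; // one output word (and bind) each
  std::vector<size_t> inputToOutput;      // input word index -> output index
};

// Collapses references to the same target into a single output word while
// remembering where each input word ended up.
inline DedupedClassRefs dedupClassRefs(const std::vector<std::string> &targets) {
  DedupedClassRefs result;
  std::map<std::string, size_t> indexOf;
  for (const std::string &target : targets) {
    auto inserted = indexOf.insert({target, result.uniqueTargets.size()});
    if (inserted.second)
      result.uniqueTargets.push_back(target);
    result.inputToOutput.push_back(inserted.first->second);
  }
  assert(result.inputToOutput.size() == targets.size());
  return result;
}
```

The number of binds then scales with the number of distinct classes referenced rather than the number of references.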

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D121053
2022-03-08 08:34:04 -05:00
Jez Ng 8ec1033933 [lld-macho] Deduplicate CFStrings during ICF
`__cfstring` has embedded addends that foil ICF's hashing / equality
checks. (We can ignore embedded addends when doing ICF because the same
information gets recorded in our Reloc structs.) Therefore, in order to
properly dedup CFStrings, we create a mutable copy of the CFString and
zero out the embedded addends before performing any hashing / equality
checks.

(We did in fact have a partial implementation of CFString deduplication
already. However, it only worked when the cstrings they point to are at
identical offsets in their object files.)

I anticipate this approach can be extended to other similar
statically-allocated struct sections in the future.

In addition, we previously treated all references with differing addends
as unequal. This is not true when the references are to literals:
different addends may point to the same literal in the output binary. In
particular, `__cfstring` has such references to `__cstring`. I've
adjusted ICF's `equalsConstant` logic accordingly, and I've added a few
more tests to make sure the addend-comparison code path is adequately
covered.
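
The "mutable copy with zeroed addends" idea can be sketched like this. It is an illustration under simplifying assumptions: `RelocSite` and `withAddendsZeroed` are hypothetical names, and the real code works on LLD's own section and `Reloc` types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of where a relocation's embedded addend lives.
struct RelocSite {
  size_t offset; // byte offset of the relocated field within the section data
  size_t width;  // size of the field in bytes
};

// Returns a copy of `data` with every relocated field zeroed, so that hashing
// and equality checks ignore addends already recorded in the reloc structs.
inline std::vector<uint8_t>
withAddendsZeroed(const std::vector<uint8_t> &data,
                  const std::vector<RelocSite> &relocs) {
  std::vector<uint8_t> copy = data;
  for (const RelocSite &r : relocs) {
    assert(r.offset + r.width <= copy.size());
    auto first = copy.begin() + static_cast<std::ptrdiff_t>(r.offset);
    std::fill(first, first + static_cast<std::ptrdiff_t>(r.width), uint8_t(0));
  }
  return copy;
}
```

Two structs that differ only in their embedded addends compare equal after this normalization, leaving the addends to be compared through the relocations themselves.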

Fixes https://github.com/llvm/llvm-project/issues/51281.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D120137
2022-03-08 08:34:03 -05:00
Jez Ng 0405920c5f Re-land [lld-macho][nfc] Don't use `stubsHelperIndex` in ICF hash
Previous attempt was commit 112135e774 and
reverted in d86d431814.
2022-03-07 16:58:00 -05:00
Nico Weber d86d431814 Revert "[lld-macho][nfc] Don't use `stubsHelperIndex` in ICF hash"
This reverts commit 112135e774.
Breaks lld/test/MachO/{icf.s,cfstring-dedup.s,invalid/cfstring.s}
2022-03-07 13:50:38 -05:00
Jez Ng ad1c32e9b3 [lld-macho][nfc] Reduce size of icfEqClass hash
... from a `uint64_t` to a `uint32_t`. (LLD-ELF uses a `uint32_t` too.)

About a 1.7% reduction in peak RSS when linking chromium_framework on my
3.2 GHz 16-Core Intel Xeon W Mac Pro, and no stat sig change in wall
time.

           </Users/jezng/test2.sh ["before"]>  </Users/jezng/test2.sh ["after"]>  difference (95% CI)
  RSS      1003036672.000 ± 9891065.259        985539505.231 ± 10272748.749       [  -2.3% ..   -1.2%]
  samples  27                                  26

             base           diff           difference (95% CI)
  sys_time   1.277 ± 0.023  1.277 ± 0.024  [  -0.9% ..   +0.9%]
  user_time  6.682 ± 0.046  6.598 ± 0.043  [  -1.6% ..   -0.9%]
  wall_time  5.904 ± 0.062  5.895 ± 0.063  [  -0.7% ..   +0.4%]
  samples    46             28

No appreciable change (~0.01%) in number of `equals` comparisons either:

Before:

  ld64.lld: ICF needed 8 iterations
  ld64.lld: equalsConstant() called 701643 times
  ld64.lld: equalsVariable() called 3438526 times

After:

  ld64.lld: ICF needed 8 iterations
  ld64.lld: equalsConstant() called 701729 times
  ld64.lld: equalsVariable() called 3438526 times

Reviewed By: #lld-macho, MaskRay, thakis

Differential Revision: https://reviews.llvm.org/D121052
2022-03-07 12:36:28 -05:00
Jez Ng 112135e774 [lld-macho][nfc] Don't use `stubsHelperIndex` in ICF hash
The existing hashing of stubsHelperIndex has mostly been a no-op* for
some time now (ever since we made ICF run before dylib symbols get their
stubs indices assigned). I guess we could consider hashing the name +
filename of the DylibSymbol instead, but I'm not sure the overhead's
worth it... moreover, LLD/ELF only hashes their Defined symbols as well.

*: Technically it does change the hash value since stubsHelperIndex is
initialized to `UINT32_MAX` by default. But since all stubsHelperIndex
values are the same when ICF runs, they don't add any useful
information to the hash.
2022-03-07 12:36:28 -05:00
Jez Ng 7028799ca3 [lld-macho][nfc] Rename isec -> referentIsec to avoid shadowing
I found the shadowing a bit confusing.
2022-03-07 12:36:28 -05:00
Jez Ng 64cc719766 [lld-macho][nfc] Track # of ICF calls to `equals*` methods
This is debug code that is disabled by default. It'll provide an easy way
to figure out the impact (if any) of tweaking ICF's hashing algorithm
(since a poor quality hash will result in many more `equals*` calls).

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D121051
2022-03-07 12:36:27 -05:00
Jez Ng 53e7eef43f [lld-macho][nfc] Use llvm::function_ref instead of std::function 2022-03-07 12:36:27 -05:00