Commit Graph

15 Commits

Author SHA1 Message Date
Vy Nguyen c0ec1036d6 [lld-macho][nfc] Run clang-format on lld/MachO/*.{h,cpp}
- fixed inconsistent indents and spaces
- prevent extraneous formatting changes in other patches

Differential Revision: https://reviews.llvm.org/D126262
2022-05-24 08:36:20 +07:00
Vincent Lee 9f2272ff51 [lld-macho] Allow dead_strip to work with exported private extern symbols
It seems like we are overly asserting when running `-dead_strip` with
exported symbols. ld64 treats exported private extern symbols as a liveness
root. Loosen the assert to match ld64's behavior.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D124143
2022-04-22 18:45:27 -07:00
Jez Ng a993d607de [lld-macho][nfc] Add comment explaining why a cast<> is safe 2022-03-21 07:23:09 -04:00
Reid Kleckner da11f17e90 [lld/MachO] Fix +asserts build after recent change 2022-02-24 13:12:48 -08:00
Jez Ng 850592ec14 [lld-macho] Implement -why_live (without perf overhead)
This was based off @thakis' draft in {D103517}. I employed templates to ensure
the support for `-why_live` wouldn't slow down the regular non-why-live code
path.

No stat sig perf difference on my 3.2 GHz 16-Core Intel Xeon W:

             base           diff           difference (95% CI)
  sys_time   1.195 ± 0.015  1.199 ± 0.022  [  -0.4% ..   +1.0%]
  user_time  3.716 ± 0.022  3.701 ± 0.025  [  -0.7% ..   -0.1%]
  wall_time  4.606 ± 0.034  4.597 ± 0.046  [  -0.6% ..   +0.2%]
  samples    44             37

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D120377
2022-02-24 15:49:36 -05:00
Jez Ng e42ad84ba0 [lld-macho][nfc] Refactor MarkLive
This mirrors the code structure in `lld/ELF`. It also paves the way for
an upcoming diff where I templatize things.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D120376
2022-02-23 08:58:26 -05:00
Greg McGary 9cc489a4b2 [lld-macho][nfc] Factor-out NFC changes from main __eh_frame diff
In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here.

Differential Revision: https://reviews.llvm.org/D114017
2021-11-17 15:16:44 -07:00
Fangrui Song 005456e5fc [lld-macho] Fix an assertion failure when -u specifies an undefined section$start symbol
This matches ld64. Also improve the test for `-dead_strip`.

Reviewed By: #lld-macho, Jez Ng

Differential Revision: https://reviews.llvm.org/D113147
2021-11-04 21:28:33 -07:00
Jez Ng 002eda7056 [lld-macho] Associate compact unwind entries with function symbols
Compact unwind entries (CUEs) contain pointers to their respective
function symbols. However, during the link process, it's far more useful
to have pointers from the function symbol to the CUE than vice versa.
This diff adds that pointer in the form of `Defined::compactUnwind`.

In particular, when doing dead-stripping, we want to mark CUEs live when
their function symbol is live; and when doing ICF, we want to dedup
sections iff the symbols in that section have identical CUEs. In both
cases, we want to be able to locate the symbols within a given section,
as well as locate the CUEs belonging to those symbols. So this diff also
adds `InputSection::symbols`.

The ultimate goal of this refactor is to have ICF support dedup'ing
functions with unwind info, but that will be handled in subsequent
diffs. This diff focuses on simplifying `-dead_strip` --
`findFunctionsWithUnwindInfo` is no longer necessary, and
`Defined::isLive()` is now a lot simpler. Moreover, UnwindInfoSection no
longer has to check for dead CUEs -- we simply avoid adding them in the
first place.

Additionally, we now support stripping of dead LSDAs, which follows
quite naturally since `markLive()` can now reach them via the CUEs.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D109944
2021-10-26 16:04:15 -04:00
Nico Weber 3eb2fc4b50 [lld/mac] Partially implement -export_dynamic
This implements the part of -export_dynamic that adds external
symbols as dead strip roots even for executables.

It does not yet implement the effect -export_dynamic has for LTO.
I tried just replacing `config->outputType != MH_EXECUTE` with
`(config->outputType != MH_EXECUTE || config->exportDynamic)` in
LTO.cpp, but then local symbols make it into the symbol table too,
which is too much (and also doesn't match ld64). So punt on this
for now until I understand it better.
(D91583 may or may not be related too).

Differential Revision: https://reviews.llvm.org/D105482
2021-07-06 11:22:18 -04:00
Jez Ng f6b6e72143 [lld-macho] Factor out common InputSection members
We have been creating many ConcatInputSections with identical values due
to .subsections_via_symbols. This diff factors out the identical values
into a Shared struct, to reduce memory consumption and make copying
cheaper.

I also changed `callSiteCount` from a uint32_t to a 31-bit field to save an
extra word.

All in all, this takes InputSection from 120 to 72 bytes (and
ConcatInputSection from 160 to 112 bytes), i.e. 30% size reduction in
ConcatInputSection.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.14          4.24          4.18         4.183   0.027548999
  +  20          4.04          4.11         4.075        4.0775   0.018027756
  Difference at 95.0% confidence
          -0.1055 +/- 0.0149005
          -2.52211% +/- 0.356215%
          (Student's t, pooled s = 0.0232803)

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105305
2021-07-01 21:22:39 -04:00
Jez Ng 3a11528d97 [lld-macho] Move ICF earlier to avoid emitting redundant binds
This is a pretty big refactoring diff, so here are the motivations:

Previously, ICF ran after scanRelocations(), where we emitting
bind/rebase opcodes etc. So we had a bunch of redundant leftovers after
ICF. Having ICF run before Writer seems like a better design, and is
what LLD-ELF does, so this diff refactors it accordingly.

However, ICF had two dependencies on things occurring in Writer: 1) it
needs literals to be deduplicated beforehand and 2) it needs to know
which functions have unwind info, which was being handled by
`UnwindInfoSection::prepareRelocations()`.

In order to do literal deduplication earlier, we need to add literal
input sections to their corresponding output sections. So instead of
putting all input sections into the big `inputSections` vector, and then
filtering them by type later on, I've changed things so that literal
sections get added directly to their output sections during the 'gather'
phase. Likewise for compact unwind sections -- they get added directly
to the UnwindInfoSection now. This latter change is not strictly
necessary, but makes it easier for ICF to determine which functions have
unwind info.

Adding literal sections directly to their output sections means that we
can no longer determine `inputOrder` from iterating over
`inputSections`. Instead, we store that order explicitly on
InputSection. Bloating the size of InputSection for this purpose would
be unfortunate -- but LLD-ELF has already solved this problem: it reuses
`outSecOff` to store this order value.

One downside of this refactor is that we now make an additional pass
over the unwind info relocations to figure out which functions have
unwind info, since want to know that before `processRelocations()`. I've
made sure to run that extra loop only if ICF is enabled, so there should
be no overhead in non-optimizing runs of the linker.

The upside of all this is that the `inputSections` vector now contains
only ConcatInputSections that are destined for ConcatOutputSections, so
we can clean up a bunch of code that just existed to filter out other
elements from that vector.

I will test for the lack of redundant binds/rebases in the upcoming
cfstring deduplication diff. While binds/rebases can also happen in the
regular `.text` section, they're more common in `.data` sections, so it
seems more natural to test it that way.

This change is perf-neutral when linking chromium_framework.

Reviewed By: oontvoo

Differential Revision: https://reviews.llvm.org/D105044
2021-07-01 21:22:38 -04:00
Jez Ng 464d3dc3d1 [lld-macho] Have dead-stripping work with literal sections
Literal sections are not atomically live or dead. Rather,
liveness is tracked for each individual literal they contain. CStrings
have their liveness tracked via a `live` bit in StringPiece, and
fixed-width literals have theirs tracked via a BitVector.

The live-marking code now needs to track the offset within each section
that is to be marked live, in order to identify the literal at that
particular offset.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W
with both `-dead_strip` and `--deduplicate-literals`, with and without this diff
applied:

```
    N           Min           Max        Median           Avg        Stddev
x  20          4.32          4.44         4.375         4.372    0.03105174
+  20           4.3          4.39          4.36        4.3595   0.023277502
No difference proven at 95.0% confidence
```
This gives us size savings of about 0.4%.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103979
2021-06-11 19:50:09 -04:00
Jez Ng 7f2ba39b16 [lld-macho][nfc] Move liveness-tracking fields into ConcatInputSection
These fields currently live in the parent InputSection class,
but they should be specific to ConcatInputSection, since the other
InputSection classes (that contain literals) aren't atomically live or
dead -- rather their component string/int literals should have
individual liveness states. (An upcoming diff will add liveness bits for
StringPieces and fixed-sized literals.)

I also factored out some asserts for isCoalescedWeak() in MarkLive.cpp.
We now avoid putting coalesced sections in the `inputSections` vector,
so we don't have to check/assert against it everywhere.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103977
2021-06-11 19:50:08 -04:00
Nico Weber a5645513db [lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)

This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
  as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests

Implemenation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.

Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
  since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
  (`.no_dead_strip` in asm)

Fixes PR49276.

Differential Revision: https://reviews.llvm.org/D103324
2021-06-02 11:09:26 -04:00