6946 Commits

Author SHA1 Message Date
Jez Ng 718c32175b [lld-macho] Only emit one BIND_OPCODE_SET_SYMBOL per symbol
Size-wise, BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM is the most
expensive opcode, since it comes with an associated symbol string. We
were previously emitting it once per binding, instead of once per
symbol. This diff groups all bindings for a given symbol together and
ensures we only emit one such opcode per symbol. This matches ld64's
behavior.
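
The grouping described above can be sketched roughly as follows. This is a simplified illustration and not lld's actual code: `Binding`, `groupBySymbol`, `costPerBinding`, and `costPerSymbol` are hypothetical names, and the size model assumes the opcode is one byte followed by the NUL-terminated symbol name.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified binding record: the symbol to bind and the
// address of the pointer slot to fill in.
struct Binding {
  std::string symbol;
  uint64_t address;
};

using SymbolGroup = std::pair<std::string, std::vector<uint64_t>>;

// Collect all bindings for a symbol into one group, keeping symbols in
// first-seen order.
std::vector<SymbolGroup> groupBySymbol(const std::vector<Binding> &bindings) {
  std::vector<SymbolGroup> groups;
  std::unordered_map<std::string, size_t> indexOf;
  for (const Binding &b : bindings) {
    auto result = indexOf.emplace(b.symbol, groups.size());
    if (result.second) // first time we see this symbol
      groups.emplace_back(b.symbol, std::vector<uint64_t>());
    groups[result.first->second].second.push_back(b.address);
  }
  return groups;
}

// Assumed size of one set-symbol opcode: opcode byte + name + NUL.
size_t setSymbolSize(const std::string &symbol) {
  return 1 + symbol.size() + 1;
}

// Bytes spent on set-symbol opcodes when one is emitted per binding.
size_t costPerBinding(const std::vector<Binding> &bindings) {
  size_t total = 0;
  for (const Binding &b : bindings)
    total += setSymbolSize(b.symbol);
  return total;
}

// Bytes spent on set-symbol opcodes when one is emitted per symbol.
size_t costPerSymbol(const std::vector<Binding> &bindings) {
  size_t total = 0;
  for (const SymbolGroup &g : groupBySymbol(bindings))
    total += setSymbolSize(g.first);
  return total;
}
```

The saving grows with the number of bindings that share a symbol, which is consistent with the note that programs with more dynamic bindings benefit more.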

While this is a relatively small win on chromium_framework (-72KiB), for
programs that have more dynamic bindings, the difference can be quite
large.

This change is perf-neutral when linking chromium_framework.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105075
2021-07-05 20:00:19 -04:00
Jez Ng 4aaf878750 [lld-macho][nfc] Add REQUIRES: x86 to test
I didn't realize that llvm-objdump's features were arch-specific.

This should fix the non-x86 buildbots.
2021-07-05 03:40:54 -04:00
Jez Ng bcaf57cae8 [lld-macho] Parse relocations quickly by assuming sorted order
clang and gcc both seem to emit relocations in reverse order of
address. That means we can match relocations to their containing
subsections in `O(relocs + subsections)` rather than the `O(relocs *
log(subsections))` that our previous binary search implementation
required.

Unfortunately, `ld -r` can still emit unsorted relocations, so we have a
fallback code path for that (less common) case.
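
A minimal sketch of the idea, under assumptions: subsections are represented only by their ascending start offsets (the first being 0), relocations only by their addresses, and `assignRelocs` is a hypothetical name. The fallback here uses a per-relocation binary search, which may differ from what the real fallback path does.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// For each relocation, return the index of the subsection that contains it.
// `subsecStarts` must be ascending and start at offset 0.
std::vector<size_t> assignRelocs(const std::vector<uint32_t> &subsecStarts,
                                 const std::vector<uint32_t> &relocAddrs) {
  std::vector<size_t> owner(relocAddrs.size(), 0);
  if (subsecStarts.empty())
    return owner;

  bool descending = std::is_sorted(relocAddrs.begin(), relocAddrs.end(),
                                   std::greater<uint32_t>());
  if (descending) {
    // Fast path: walk both lists from the highest address downwards.
    // Each index only ever decreases, so this is O(relocs + subsections).
    size_t sub = subsecStarts.size() - 1;
    for (size_t i = 0; i < relocAddrs.size(); ++i) {
      while (sub > 0 && subsecStarts[sub] > relocAddrs[i])
        --sub;
      owner[i] = sub;
    }
  } else {
    // Fallback for unsorted input: binary search per relocation.
    for (size_t i = 0; i < relocAddrs.size(); ++i) {
      auto it = std::upper_bound(subsecStarts.begin(), subsecStarts.end(),
                                 relocAddrs[i]);
      owner[i] = static_cast<size_t>(it - subsecStarts.begin()) - 1;
    }
  }
  return owner;
}
```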

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.04          4.11         4.075        4.0775   0.018027756
  +  20          3.95          4.02          3.98         3.985   0.020900768
  Difference at 95.0% confidence
          -0.0925 +/- 0.0124919
          -2.26855% +/- 0.306361%
          (Student's t, pooled s = 0.0195172)

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105410
2021-07-05 01:13:44 -04:00
Nico Weber 9e24979d73 [lld/mac] Fix function offset on 1st-level unwind table sentinel
Two bugs:
1. This tries to take the address of the last symbol plus the length
   of the last symbol. However, the sorted vector is cuPtrVector,
   not cuVector. Also, cuPtrVector has tombstone values removed
   and cuVector doesn't. If there was a stripped value at the end,
   the "last" element's value was UINT64_MAX, which meant the
   sentinel value was one less than the length of that "last"
   dead symbol.

2. We have to subtract in.header->addr. For 64-bit binaries that's
   (1 << 32) and functionAddress is 32-bit so this is a no-op, but
   for 32-bit binaries the sentinel's value was too large.

I believe this has no effect in practice since the first-level
binary search code in libunwind (in UnwindCursor.hpp) does:

    uint32_t low = 0;
    uint32_t high = sectionHeader.indexCount();
    uint32_t last = high - 1;
    while (low < high) {
      uint32_t mid = (low + high) / 2;
        if ((mid == last) ||
            (topIndex.functionOffset(mid + 1) > targetFunctionOffset)) {
          low = mid;
          break;
        } else {
        low = mid + 1;
      }

So the address of the last entry in the first-level table isn't really
checked -- except for the very end, but the check against `last` means
we just run the loop once more than necessary. But it makes `unwinddump` output
look less confusing, and it looks like that was the intention here.

(No test since I can't think of a way to make FileCheck check that one
number is larger than another.)

Differential Revision: https://reviews.llvm.org/D105404
2021-07-04 18:06:20 -04:00
Nico Weber d2d6da3011 [lld/mac] Don't crash on 32-bit output binaries when dead-stripping
Fixes PR50974.

Differential Revision: https://reviews.llvm.org/D105399
2021-07-04 18:03:31 -04:00
David Blaikie bf7f846b68 Fix test so it doesn't try to write to the test directory, only to %t 2021-07-02 14:59:50 -07:00
Vy Nguyen c7c5a1c9ae [lld-macho] Ignore debug symbols while preparing relocations.
Details: see https://bugs.llvm.org/show_bug.cgi?id=50812

Differential Revision: https://reviews.llvm.org/D105210
2021-07-02 13:51:46 -04:00
Martin Storsjö ce211c505b [LLD] [COFF] Fix up missing stdcall decorations in MinGW mode
If linking directly against a DLL without an import library, the
DLL export symbols might not contain stdcall decorations.

If we have an undefined symbol with decoration, and we happen to have
a matching undecorated symbol (which either is lazy and can be loaded,
or already defined), then alias it against that instead.

This matches what's done in reverse, when we have a def file
declaring to export a symbol without decoration, but we only have
a defined decorated symbol. In that case we do a fuzzy match
(SymbolTable::findMangle). This case is more straightforward; if we
have a decorated undefined symbol, just strip the decoration and look
for the corresponding undecorated symbol name.
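
As a rough illustration of "strip the decoration", the sketch below removes a trailing `@<digits>` suffix from a symbol name. `stripStdcallSuffix` is a hypothetical helper, not the function used in the patch, and how the leading underscore or other calling-convention prefixes are treated is not stated here, so the sketch leaves them alone.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Remove a trailing "@<digits>" suffix (e.g. "_func@8" -> "_func").
// Names without such a suffix are returned unchanged.
std::string stripStdcallSuffix(const std::string &name) {
  size_t at = name.rfind('@');
  if (at == std::string::npos || at == 0 || at + 1 == name.size())
    return name;
  for (size_t i = at + 1; i < name.size(); ++i)
    if (!std::isdigit(static_cast<unsigned char>(name[i])))
      return name;
  return name.substr(0, at);
}
```

The stripped name would then be looked up in the symbol table, and the decorated undefined symbol aliased to the match if one exists.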

Add warnings and options for either silencing the warning or disabling
the whole feature, corresponding to how ld.bfd does it.

(This feature works for any symbol decoration mismatch, not only when
linking against a DLL directly; ld.bfd also tolerates it anywhere,
and also fixes up mismatches in the other direction, like
SymbolTable::findMangle, for any symbol, not only exports. But in
practice, at least for lld, it would primarily end up used for linking
against DLLs.)

Differential Revision: https://reviews.llvm.org/D104532
2021-07-02 09:49:14 +03:00
Martin Storsjö c09e5e50b1 [LLD] [MinGW] Allow linking to DLLs directly
As the COFF linker is capable of linking directly against a DLL now
(after D104530, as long as it is running in mingw mode), don't error
out here but successfully load libraries specified with "-l" from DLLs
if that's what ld.bfd would have matched.

Differential Revision: https://reviews.llvm.org/D104531
2021-07-02 09:49:13 +03:00
Martin Storsjö a9ff1ce1b9 [LLD] [COFF] Support linking directly against DLLs in MinGW mode
GNU ld.bfd supports linking directly against DLLs without using an
import library, and some projects have picked up on this habit.
(There's no one single unsurmountable issue with using import
libraries, but this is a regularly surfacing missing feature.)

As long as one is linking by name (instead of by ordinal), the DLL
export table contains most of the information needed. (One can
inspect what section a symbol points at, to see if it's a function
or data symbol. The practical implementation of this loops over all
sections for each symbol, but as long as they're not very many, that
should hopefully be tolerable performance-wise.)

One exception where the information in the DLL isn't entirely enough
is on i386 with stdcall functions; depending on how they're done,
the exported function name can be a plain undecorated name, while
the import library would contain the full decorated symbol name. This
issue is addressed separately in a different patch.

This is implemented mimicking the structure of a regular import library,
with one InputFile corresponding to the static archive that just adds
lazy symbols, which then are fetched when they are needed. When such
a symbol is fetched, we synthesize a coff_import_header structure
in memory and create a regular ImportFile out of it.

The implementation could be even smaller by just creating ImportFiles
for every symbol available immediately, but that would have the
drawback of actually ending up importing all symbols unless running
with GC enabled (and mingw mode defaults to having it disabled for
historical reasons).

Differential Revision: https://reviews.llvm.org/D104530
2021-07-02 09:49:13 +03:00
Jez Ng ac2dd06b91 [lld-macho] Deduplicate CFStrings
`__cfstring` is a special literal section, so instead of breaking it up
at symbol boundaries, we break it up at fixed-width boundaries (since
each literal is the same size). Symbols can only occur at one of those
boundaries, so this is strictly more powerful than
`.subsections_via_symbols`.

With that in place, we then run the section through ICF.

This change is about perf-neutral when linking chromium_framework.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D105045
2021-07-01 21:22:38 -04:00
Jez Ng b41b4148e7 [lld-macho] Only enable `__DATA_CONST` for newer platforms
Matches ld64.

Reviewed By: #lld-macho, alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D105080
2021-06-30 18:55:48 -04:00
Jez Ng 0d6d35e63b [lld-macho] -section_rename should work on synthetic sections too
Previously, we only applied the renames to
ConcatOutputSections.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D105079
2021-06-30 18:55:48 -04:00
Fangrui Song 03051f7ac8 [ELF] Preserve section order within an INSERT AFTER command
For
```
SECTIONS {
  text.0 : {}
  text.1 : {}
  text.2 : {}
} INSERT AFTER .data;
```

the current order is `.data text.2 text.1 text.0`. It makes more sense to
preserve the specified order and thus improve compatibility with GNU ld.

For
```
SECTIONS { text.0 : {} } INSERT AFTER .data;
SECTIONS { text.3 : {} } INSERT AFTER .data;
```

GNU ld somehow collects sections with `INSERT AFTER .data` together (IMO
inconsistent) but I think it makes more sense to execute the commands in order
and get `.data text.3 text.0` instead.
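
The resulting orders can be modelled with a small sketch. It treats the layout as a plain list of section names and applies each `INSERT AFTER` command in turn; `insertAfter` is a hypothetical helper for illustration and not the linker's data structure.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Insert `sections` right after `anchor`, keeping their specified order.
// Does nothing if the anchor is not present.
void insertAfter(std::vector<std::string> &layout, const std::string &anchor,
                 const std::vector<std::string> &sections) {
  auto it = std::find(layout.begin(), layout.end(), anchor);
  if (it == layout.end())
    return;
  layout.insert(it + 1, sections.begin(), sections.end());
}
```

Applying one command with `text.0`, `text.1`, `text.2` keeps that order after `.data`. Applying two separate commands for `text.0` and then `text.3` places `text.3` directly after `.data`, ahead of `text.0`, because each command is executed in order against the current layout.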

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D105158
2021-06-30 11:35:50 -07:00
Fangrui Song 7b06bfc49e [ELF] -pie: produce dynamic relocations for absolute relocations referencing undef weak
See the comment for my understanding of -no-pie and -shared expectation.
-no-pie has freedom on choices. We choose dynamic relocations to be consistent
with the handling of GOT-generating relocations.

Note: GNU ld has arch-varying behaviors and its x86 -pie has a very
complex rule:
if there is at least one GOT-generating or PLT-generating relocation and
-z dynamic-undefined-weak (enabled by default) is in effect, generate a
dynamic relocation.

We don't emulate its rule.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D105164
2021-06-30 09:43:28 -07:00
Peter Smith fc1cb3104b [LLD][ELF][ARM] Tidy up test to hook up missing filecheck patterns [NFC]
A couple of filecheck patterns had not been hooked up with
the patterns suffering from some drift. As this test is old
and llvm-objdump has improved a lot, take this opportunity to
hide the instruction encoding. I've also taken out a lot of
the explanatory comments that llvm-objdump improvements make
redundant, as these comments often don't get updated when addresses
change.

Differential Revision: https://reviews.llvm.org/D104907
2021-06-30 14:16:40 +01:00
Peter Smith dd4d3f7406 [LLD][ELF][ARM] Fix case of patched unrelocated BLX
There are a couple of problems with the code to patch
unrelocated BLX instructions:
1. The calculation of the PC needs to take into account
   the alignment of the instruction. The Thumb BLX
   uses alignDown(PC, 4) for the source address.
2. The calculation of the PC bias is hard-coded to 4
   which works for Thumb, but when there is a BLX the
   branch will be in Arm state so it needs an 8 byte
   PC bias.
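
The two rules can be written down as a small sketch. It shows only the architectural address arithmetic, not lld's patching code; `thumbBlxTarget` and `armBlxTarget` are hypothetical names, and `imm` stands for the already-decoded signed branch offset.

```cpp
#include <cstdint>

constexpr uint64_t alignDown(uint64_t value, uint64_t align) {
  return value & ~(align - 1);
}

// BLX executed in Thumb state: PC reads as the instruction address + 4,
// and the source address is that PC aligned down to 4 bytes.
constexpr uint64_t thumbBlxTarget(uint64_t insnAddr, int64_t imm) {
  return alignDown(insnAddr + 4, 4) + static_cast<uint64_t>(imm);
}

// BLX executed in Arm state: PC reads as the instruction address + 8.
constexpr uint64_t armBlxTarget(uint64_t insnAddr, int64_t imm) {
  return insnAddr + 8 + static_cast<uint64_t>(imm);
}
```

With a 4-byte bias and no alignment, a Thumb BLX at an address that is 2 modulo 4 would resolve to a target that is off by 2, and an Arm-state BLX would be off by 4.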

No assembler generates an unrelocated BLX instruction
so these problems do not affect real world programs.
However we should still fix them.

Differential Revision: https://reviews.llvm.org/D104905
2021-06-30 14:07:35 +01:00
Igor Kudrin 657e067bb5 [ARMInstPrinter] Print the target address of a branch instruction
This follows other patches that changed printing immediate values of
branch instructions to target addresses, see D76580 (x86), D76591 (PPC),
D77853 (AArch64).

As observing immediate values might sometimes be useful, they are
printed as comments for branch instructions.

// llvm-objdump -d output (before)
000200b4 <_start>:
   200b4: ff ff ff fa   blx     #-4 <thumb>
000200b8 <thumb>:
   200b8: ff f7 fc ef   blx     #-8 <_start>

// llvm-objdump -d output (after)
000200b4 <_start>:
   200b4: ff ff ff fa   blx     0x200b8 <thumb>         @ imm = #-4
000200b8 <thumb>:
   200b8: ff f7 fc ef   blx     0x200b4 <_start>        @ imm = #-8

// GNU objdump -d.
000200b4 <_start>:
   200b4:       faffffff        blx     200b8 <thumb>
000200b8 <thumb>:
   200b8:       f7ff effc       blx     200b4 <_start>

Differential Revision: https://reviews.llvm.org/D104701
2021-06-30 16:35:28 +07:00
Nico Weber aed0a08c69 [lld/mac] Make symbol table order deterministic
SymtabSection::emitStabs() writes the symbol table in the order
of externalSymbols, which has the order of symtab->getSymbols(),
which is just the order symbols are added to the symbol table.

In practice, symbols in the symbol files of input .o files are
sorted, but since that's not guaranteed we sort them in
ObjFile::parseSymbols(). To make sure several symbols with the same
address keep the order they're in the input file, we have to use
stable_sort().
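
A minimal sketch of why the stable variant matters, using a hypothetical `Sym` record rather than lld's symbol classes: when several symbols share an address, `std::stable_sort` guarantees they keep their input order, whereas `std::sort` gives no such guarantee.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical symbol record for illustration.
struct Sym {
  std::string name;
  uint64_t value;
};

// Sort by address only; symbols with equal addresses keep input order.
void sortByAddress(std::vector<Sym> &syms) {
  std::stable_sort(
      syms.begin(), syms.end(),
      [](const Sym &a, const Sym &b) { return a.value < b.value; });
}
```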

In practice, std::sort() on already-sorted inputs won't change the order
of just adjacent elements, and while in theory std::sort() could use a
random pivot, in practice the code should be deterministic as it was
previously too.

But now lld/test/MachO/stabs.s passes with LLVM_ENABLE_EXPENSIVE_CHECKS=ON
(the last test that was failing with that set).

Fixes a regression from D99972.

While here, remove an empty section in stabs.s and move
.subsections_via_symbols to the end where it usually is (no behavior
change from this part).

Differential Revision: https://reviews.llvm.org/D105071
2021-06-29 09:29:49 -04:00
Leonard Grey a8a6e5b094 [lld-macho] Preserve alignment for non-deduplicated cstrings
Fixes PR50637.

Downstream bug: https://crbug.com/1218958

Currently, we split __cstring along symbol boundaries with .subsections_via_symbols
when not deduplicating, and along null bytes when deduplicating. This change splits
along null bytes unconditionally, and preserves original alignment in the non-
deduplicated case.
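
A rough sketch of both halves, under assumptions: the section contents are modelled as a `std::string`, `Piece`, `splitAtNul`, and `pieceAlignment` are hypothetical names, and a piece's alignment is taken to be the largest power of two that divides both its original offset and the section alignment. The patch itself may derive alignment differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One string literal inside the section, including its terminating NUL.
struct Piece {
  size_t offset;
  size_t size;
};

// Split section data into pieces that each end at a NUL byte.
// A trailing piece without a NUL is kept as-is.
std::vector<Piece> splitAtNul(const std::string &data) {
  std::vector<Piece> pieces;
  size_t off = 0;
  while (off < data.size()) {
    size_t end = data.find('\0', off);
    size_t next = (end == std::string::npos) ? data.size() : end + 1;
    pieces.push_back({off, next - off});
    off = next;
  }
  return pieces;
}

// Largest power of two dividing both `offset` and `sectionAlign`.
// `sectionAlign` must be a nonzero power of two.
size_t pieceAlignment(size_t offset, size_t sectionAlign) {
  size_t v = offset | sectionAlign;
  return v & (~v + 1); // isolate the lowest set bit
}
```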

Removing subsections-section-relocs.s because with this change, __cstring
is never reordered based on the order file.

Differential Revision: https://reviews.llvm.org/D104919
2021-06-28 22:26:43 -04:00
Nico Weber f1969b74a7 [lld/mac] Fix nondeterminism in output section ordering
The two different thread_local_regular sections (__thread_data and
more_thread_data) had nondeterministic ordering for two reasons:

1. https://reviews.llvm.org/D102972 changed concatOutputSections
   from MapVector to DenseMap, so when we iterate it to make
   output segments, we would add the two sections to the __DATA
   output segment in nondeterministic order.

2. The same change also moved the two stable_sort()s for segments
   and sections to sort(). Since sections with assigned priority
   (such as TLV data) have the same priority for all sections,
   this is incorrect -- we must use stable_sort() so that the
   initial (input-order-based) order remains.

As a side effect, we now (deterministically) put the __common
section in front of __bss (while previously we happened to
put it after it). (__common and __bss are both zerofill so
both have order INT_MAX, but common symbols are added to
inputSections before normal sections are collected.)

Makes lld/test/MachO/tlv.s and lld/test/MachO/tlv-dylib.s pass with
LLVM_ENABLE_EXPENSIVE_CHECKS=ON.

Differential Revision: https://reviews.llvm.org/D105054
2021-06-28 18:41:33 -04:00
Jez Ng 74d5f30d83 [lld-macho][nfc] Add absolute-vs-non-absolute symbol test for ICF
Make sure we don't wrongly fold two sections that refer to
symbols with the same value if they are not both absolute /
non-absolute.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D104876
2021-06-28 14:49:40 -04:00
Jez Ng 557e1fa02f [lld-macho] Extend ICF to literal sections
Literal sections can be deduplicated before running ICF. That makes it
easy to compare them during ICF: we can tell if two literals are
constant-equal by comparing their offsets in their OutputSection.

LLD-ELF takes a similar approach.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D104671
2021-06-28 14:49:39 -04:00
David Spickett 6942076096 [lld][MachO] Temporarily require 64 bit build for dead-strip.s
This test has always failed on 32 bit armv8 bots:
https://lab.llvm.org/buildbot/#/builders/178/builds/42

Due to the output order of some symbols changing.
I don't think this is an Arm specific issue so disabling
on 32 bit while it's investigated.
2021-06-28 09:37:45 +00:00
Igor Kudrin d25e572421 [llvm-objdump] Print memory operand addresses as regular comments
The patch reuses the common code to print memory operand addresses as
instruction comments. This helps to align the comments and enables using
target-specific comment markers when `evaluateMemoryOperandAddress()` is
implemented for them.

Differential Revision: https://reviews.llvm.org/D104861
2021-06-28 14:25:22 +07:00
Igor Kudrin e7fffa6f03 [llvm-objdump] Prefix memory operand addresses with '0x'
This helps to avoid ambiguity when the address contains only digits 0..9.

Differential Revision: https://reviews.llvm.org/D104909
2021-06-28 14:25:21 +07:00
Nico Weber 0f24ffcdfa [lld/mac] Don't fold UNWIND_X86_64_MODE_STACK_IND unwind entries
libunwind uses unwind info to find the function address belonging
to the current instruction pointer. libunwind/src/CompactUnwinder.hpp's
step functions read functionStart for UNWIND_X86_64_MODE_STACK_IND
(and for nothing else), so these encodings need a dedicated entry
per function, so that the runtime can get the stacksize off the
`subq` instruction in the function's prologue.

This matches ld64.

(CompactUnwinder.hpp from https://opensource.apple.com/source/libunwind/
also reads functionStart in a few more cases if `SUPPORT_OLD_BINARIES` is set,
but it defaults to 0, and ld64 seems to not worry about these additional
cases.)

Related upstream bug: https://crbug.com/1220175

Differential Revision: https://reviews.llvm.org/D104978
2021-06-27 06:49:32 -04:00
Jan Kratochvil a7afaf9019 Fix lld testsuite after llvm-dwarfdump now errors on invalid DWARF
D104271 broke buildbots for lld/test/ELF/non-abs-reloc.s .
2021-06-27 12:26:11 +02:00
Fangrui Song 2508733e1b [ELF] --sysroot: change sysrooted script to not fall back for an absolute path
Modify the D13209 logic: for a script inside the sysroot, if an absolute path
does not exist, report an error instead of falling back to the path without the
sysroot prefix.

This matches GNU ld, which makes sense to me: we don't want to find an arbitrary
file in the host.
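
The changed rule can be sketched as below. This is a simplified model and not the linker's lookup code: `resolveScriptPath` is a hypothetical function, the file-existence check is injected as a callback so the logic stays self-contained, and an empty result stands for "report an error". The real search also involves other steps that are omitted here.

```cpp
#include <functional>
#include <string>

// Resolve `path` as referenced from a linker script.
// Returns the path to open, or an empty string if nothing was found.
std::string
resolveScriptPath(const std::string &path, const std::string &sysroot,
                  bool scriptInSysroot,
                  const std::function<bool(const std::string &)> &exists) {
  bool absolute = !path.empty() && path[0] == '/';
  if (scriptInSysroot && absolute) {
    // Only the sysroot-prefixed location is considered. There is no
    // fallback to the same path on the host.
    std::string candidate = sysroot + path;
    return exists(candidate) ? candidate : std::string();
  }
  return exists(path) ? path : std::string();
}
```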

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D104894
2021-06-25 12:52:39 -07:00
Martin Storsjö d07f43641f [LLD] [COFF] Fix handling of LTO comdats with nontrivial selection types after 728cc0075e
Commit 728cc0075e made comdat symbols
from LTO objects be treated as any regular comdat symbol. This works
great for symbols that actually are IMAGE_COMDAT_SELECT_ANY, but
if the symbols have a less trivial selection type that requires comparing
either the section chunk size or contents, we can't check that before
actually doing the LTO compilation.

Therefore bring back one aspect of handling from before; that comdat
resolution with a leader from an LTO symbol is essentially skipped,
like it was before 728cc0075e.

Differential Revision: https://reviews.llvm.org/D104605
2021-06-25 09:39:56 +03:00
Greg McGary 8a8558ae27 [lld-macho] add tests for ICF, plus cleanups
Add tests for pending TODOs, plus some global cleanups:
* No fold: func has personality/LSDA
* Fold: reference to absolute symbol with different name but identical value
* No fold: reloc references to absolute symbols with different values
* No fold: N_ALT_ENTRY symbols

Differential Revision: https://reviews.llvm.org/D104721
2021-06-23 20:44:25 -07:00
Nico Weber dbbc8d8333 [lld/mac] Don't crash on absolute symbols in unwind info generation
Fixes a regression from d6565a2dbc and PR50820.
2021-06-23 14:25:34 -04:00
Martin Storsjö f1a18fb699 [LLD] [MinGW] Silence the printouts in one test. NFC.
This particular linker invocation is only run to check that we accept
options, but we don't inspect the generated command line. As all other
commands in the file have their output piped to FileCheck, the lit test
doesn't print any other output; therefore silence this one for consistency
as well.
2021-06-23 10:44:01 +03:00
Martin Storsjö fdf54f5c50 [LLD] [MinGW] Print the lld-link command to stderr
This is consistent with how clang prints its internal commands with
-### and -v.

When linking with -verbose, we get log messages from the actual
linking written to stderr. By printing the command to the same stream,
we make sure they appear in a sensible chronological order.

Differential Revision: https://reviews.llvm.org/D104527
2021-06-23 10:21:42 +03:00
Reid Kleckner 5bcbc7ee52 Add regression test for maybeMangle issue
This was crbug.com/1222724, which caused D104529 to be reverted. The
new test fails when D104529 is reapplied locally.
2021-06-22 12:55:25 -07:00
Nico Weber d6565a2dbc [lld/mac] Add explicit "no unwind info" entries for functions without unwind info
Fixes PR50529. With this, lld-linked Chromium base_unittests passes on arm macs.

Surprisingly, no measurable impact on link time.

Differential Revision: https://reviews.llvm.org/D104681
2021-06-22 06:12:42 -04:00
Nico Weber e6cb55d5ce [lld/mac] Test zerofill sections after __thread_bss
Real zerofill sections go after __thread_bss, since zerofill sections
must all be at the end of their segment and __thread_bss must be right
after __thread_data.

Works fine already, but wasn't tested as far as I can tell.

Also tweak comment about zerofill sections a bit.

No behavior change.

Differential Revision: https://reviews.llvm.org/D104609
2021-06-20 20:44:29 -04:00
Fangrui Song 89e66a3ab3 [ELF] Delete --no-cref which does not exist in GNU ld
Also delete the single dash form which does not appear to be used.
2021-06-20 14:28:56 -07:00
Fangrui Song cd6b1b2b86 [ELF][test] Add missing tests for --no-export-dynamic & --no-warn-backrefs 2021-06-20 14:20:14 -07:00
Martin Storsjö 1c8bb625b7 [LLD] [MinGW] Print errors/warnings in lld-link with a "ld.lld" prefix
Pass the original argv[0] to the coff linker, as the coff linker uses
the basename of argv[0] as the log prefix.

This causes error messages to be printed with a "ld.lld:" prefix
instead of "lld-link:". The current "lld-link:" prefix can be confusing
to users, as they're invoking the MinGW linker (and might not even have
a lld-link executable).

Keep the first argument as lld-link when printing the command line, to
make it an actually reproducible standalone command.

Differential Revision: https://reviews.llvm.org/D104526
2021-06-19 22:32:37 +03:00
Nico Weber c931e12b1d [lld/mac] Make sure __thread_ptrs is in front of __thread_bss
The exact location doesn't matter, but it should be in front
of __thread_bss. We put it right in front of __thread_data
which is where ld64 seems to put it as well.

Fixes PR50769.

(As mentioned on the bug, there is probably a more structural
fix too, see comment 5. If we don't address this, it's likely
we'll run into this again with other synthetic sections. But
for now, let's fix the immediate breakage.)

Differential Revision: https://reviews.llvm.org/D104596
2021-06-19 12:56:43 -04:00
Nico Weber 17271ece0d [lld/mac] Give __DATA,__thread_ptrs type S_THREAD_LOCAL_VARIABLE_POINTERS
...instead of S_NON_LAZY_SYMBOL_POINTERS. This matches ld64.

Part of PR50769.

While here, also remove an old TODO that was done in D87178.

Differential Revision: https://reviews.llvm.org/D104594
2021-06-19 12:56:42 -04:00
Jez Ng 4507f64165 [re-land][lld-macho] Avoid force-loading the same archive twice
This reverts commit c9b241efd6, which was
a backout diff to fix the buildbots.

The real culprit of the crash is
1d31fb8d12,
which is being reverted.

Differential Revision: https://reviews.llvm.org/D104353
2021-06-18 22:43:50 -04:00
Nico Weber c9b241efd6 Revert "[lld-macho] Avoid force-loading the same archive twice"
This reverts commit 24706cd73c.
Test seems to fail flakily. See comments on https://reviews.llvm.org/D104353
for a hypothesis for why.
2021-06-18 20:25:27 -04:00
Jez Ng 4c49f9ceaf [lld-macho] Handle non-extern symbols marked as private extern
Previously, we asserted that such a case was invalid, but in fact
`ld -r` can emit such symbols if the input contained a (true) private
extern, or if it contained a symbol starting with "L".

Non-extern symbols marked as private extern are essentially equivalent
to regular TU-scoped symbols, so no new functionality is needed.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104502
2021-06-18 16:36:14 -04:00
Nico Weber f7366890c2 [lld/mac] Support -data_in_code_info, -function_starts flags
These are on by default, but there's also an explicit flag for them.

Differential Revision: https://reviews.llvm.org/D104543
2021-06-18 13:01:42 -04:00
Greg McGary 8120c9e379 Rename option -icf MODE to --icf=MODE
The `icf` command-line option is not present in ld64, so it should use the LLD option syntax, which begins with double dashes and separates primary option from any suboption with the equal sign.

Differential Revision: https://reviews.llvm.org/D104548
2021-06-18 09:52:15 -07:00
Heejin Ahn 1d891d44f3 [WebAssembly] Rename event to tag
We recently decided to change 'event' to 'tag', and 'event section' to
'tag section', out of the rationale that the section contains a
generalized tag that references a type, which may be used for something
other than exceptions, and the name 'event' can be confusing in the web
context.

See
- https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
- https://github.com/WebAssembly/exception-handling/pull/161

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D104423
2021-06-17 20:34:19 -07:00
Sam Clegg d01e673a9f [lld][WebAssembly] Fix crash calling weakly undefined function in PIC code
Differential Revision: https://reviews.llvm.org/D104495
2021-06-17 16:49:02 -07:00
Sam Clegg 758633f922 [lld][WebAssembly] Add new `--import-undefined` option
This change revisits https://reviews.llvm.org/D79248 which originally
added support for the --unresolved-symbols flag.

At the time I thought it would make sense to add a third option to this
flag called `import-functions` but it turns out (as was suspected by one of
the reviewers IIRC) that this option can be orthogonal.

Instead I've added a new option called `--import-undefined` that only
operates on symbols that can be imported (for example, function symbols
can always be imported, as opposed to data symbols, which can only be
imported when compiling with PIC).

This option gives us the full expressivity that emscripten needs to be
able to allow reporting of undefined data symbols as well as the option to
disable that.

This change does remove the `--unresolved-symbols=import-functions`
option, which has been in the codebase now for about a year but I would
be extremely surprised if anyone was using it.

Differential Revision: https://reviews.llvm.org/D103290
2021-06-17 11:44:21 -07:00
Vy Nguyen 366df11a35 [lld-macho] Rework mergeFlag to behave closer to what ld64 does.
Details:
I've been getting a few weird errors similar to the following from our internal tests:

```
ld64.lld.darwinnew: error: Cannot merge section __eh_frame (type=0x0) into __eh_frame (type=0xB): inconsistent types
ld64.lld.darwinnew: error: Cannot merge section __eh_frame (flags=0x0) into __eh_frame (flags=0x6800000B): strict flags differ
ld64.lld.darwinnew: error: Cannot merge section __eh_frame (type=0x0) into __eh_frame (type=0xB): inconsistent types
ld64.lld.darwinnew: error: Cannot merge section __eh_frame (flags=0x0) into __eh_frame (flags=0x6800000B): strict flags differ
```

Differential Revision: https://reviews.llvm.org/D103971
2021-06-17 14:22:58 -04:00
Greg McGary f27e4548fc [lld-macho] Implement ICF
ICF = Identical C(ode|OMDAT) Folding

This is the LLD ELF/COFF algorithm, adapted for MachO. So far, only `-icf all` is supported. In order to support `-icf safe`, we will need to port address-significance tables (`.addrsig` directives) to MachO, which will come in later diffs.

`check-{llvm,clang,lld}` have 0 regressions for `lld -icf all` vs. baseline ld64.

We only run ICF on `__TEXT,__text` for reasons explained in the block comment in `ConcatOutputSection.cpp`.

Here is the perf impact for linking `chromium_framework` on a Mac Pro (16-core Xeon W) for the non-ICF case vs. pre-ICF:
```
    N           Min           Max        Median           Avg        Stddev
x  20          4.27          4.44          4.34         4.349   0.043029977
+  20          4.37          4.46         4.405        4.4115   0.025188761
Difference at 95.0% confidence
        0.0625 +/- 0.0225658
        1.43711% +/- 0.518873%
        (Student's t, pooled s = 0.0352566)
```

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D103292
2021-06-17 10:07:44 -07:00
Jez Ng 24706cd73c [lld-macho] Avoid force-loading the same archive twice
We need to dedup archive loads (similar to what we do for dylib
loads).

I noticed this issue after building some Swift stuff that used
`-force_load_swift_libs`, as it caused some Swift archives to be loaded
many times.

Reviewed By: #lld-macho, thakis, MaskRay

Differential Revision: https://reviews.llvm.org/D104353
2021-06-17 11:13:54 -04:00
Igor Kudrin 5355b8c631 [ELF] Restore arm-branch.s test
After D77330, the comments are inconsistent with the disassembled code.
As the value of `far` has been changed, a thunk to reach it is now
generated, and target addresses of branch instructions are different
from what was initially expected.

The patch fixes that and makes the test closer to what it was originally.

Differential Revision: https://reviews.llvm.org/D104286
2021-06-17 17:08:13 +07:00
Jez Ng 560636e549 [lld-macho] Put DATA_IN_CODE immediately after FUNCTION_STARTS
codesign checks for this.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104354
2021-06-16 15:23:07 -04:00
Jez Ng eeac6b2bec [lld-macho] Handle multiple LC_LINKER_OPTIONs
We previously only parsed the first one.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104352
2021-06-16 15:23:06 -04:00
Jez Ng d52d1b93c3 [lld-macho] Downgrade version mismatch to warning
It's a warning in ld64. While having LLD be stricter would be nice, it
makes it harder for it to be a drop-in replacement into existing builds.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D104333
2021-06-16 11:06:26 -04:00
Nico Weber b579938d40 [lld/mac] Add support for -no_data_in_code_info flag
Differential Revision: https://reviews.llvm.org/D104345
2021-06-16 06:40:42 -04:00
Konstantin Schwarz 5d621ed85d [ELF] Consider that NOLOAD sections should be placed in a PT_LOAD segment
During PHDR creation, the case where an output section does not require a
PT_LOAD header but still occupies memory in the current VMA region was not handled.

If such an output section interleaves two output sections that have the same
VMA and LMA regions set, we would previously re-use the existing PT_LOAD header
for the second output section.
However, since the memory region is not contiguous, we need to start a new PT_LOAD
segment.

This fixes https://bugs.llvm.org/show_bug.cgi?id=50558

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D103815
2021-06-16 12:36:45 +02:00
Alexander Shaposhnikov 928394d109 [lld][MachO] Add support for LC_DATA_IN_CODE
Add first bits for emitting LC_DATA_IN_CODE.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D103006
2021-06-14 19:21:59 -07:00
Fangrui Song 899fdf548e [ELF] Add OVERWRITE_SECTIONS command
This implements https://sourceware.org/bugzilla/show_bug.cgi?id=26404

An `OVERWRITE_SECTIONS` command is a `SECTIONS` variant which contains several
output section descriptions. The output sections do not have to specify an order.
Similar to `INSERT [BEFORE|AFTER]`, `LinkerScript::hasSectionsCommand` is not
set, so the built-in rules (see `docs/ELF/linker_script.rst`) still apply.
`OVERWRITE_SECTIONS` can be more convenient than `INSERT` because it does not
need an anchor section.

The initial syntax is intentionally narrow to facilitate backward compatible
extensions in the future. Symbol assignments cannot be used.

This feature is versatile. To list a few usages:

* Use `section : { KEEP(...) }` to retain input sections under GC
* Define encapsulation symbols (start/end) for an output section
* Use `section : ALIGN(...) : { ... }` to overalign an output section (similar to ld64 `-sectalign`)

When an output section is specified by both `OVERWRITE_SECTIONS` and
`INSERT`, `INSERT` is processed after overwrite sections. To make this work,
this patch changes `InsertCommand` to use name based matching instead of pointer
based matching. (This may cause a difference when `INSERT` moves one output
section more than once. Such duplicate commands should not be used in practice
(seems that in GNU ld the output sections may just disappear).)

A linker script can be used without -T/--script. The traditional `SECTIONS`
commands are concatenated, so a wrong rule can be more noticeable from the
section order. This feature if misused can be less noticeable, just like
`INSERT`.

Differential Revision: https://reviews.llvm.org/D103303
2021-06-13 12:41:11 -07:00
Alexander Shaposhnikov b9095f5e1a [lld][MachO] Fix function starts section
Sort the addresses stored in FunctionStarts section.
Previously we were encoding potentially large numbers (due to unsigned overflow).
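
A minimal sketch of why sorting matters, assuming the section stores each
address as a ULEB128-encoded delta from the previous one. The helper names
(`appendUleb128`, `encodeFunctionStarts`) are hypothetical and not LLD's own.
With unsorted input, `addr - prev` would wrap around as an unsigned value and
encode as a very large number:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Append `value` to `out` as ULEB128 (7 bits per byte, low bits first).
static void appendUleb128(uint64_t value, std::vector<uint8_t> &out) {
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80; // more bytes follow
    out.push_back(byte);
  } while (value != 0);
}

// Hypothetical helper: encode function addresses as ULEB128 deltas.
// Sorting first keeps every delta small and non-negative.
std::vector<uint8_t> encodeFunctionStarts(std::vector<uint64_t> addrs,
                                          uint64_t base) {
  std::sort(addrs.begin(), addrs.end());
  std::vector<uint8_t> out;
  uint64_t prev = base;
  for (uint64_t addr : addrs) {
    assert(addr >= prev && "addresses must not precede the base");
    appendUleb128(addr - prev, out);
    prev = addr;
  }
  return out;
}
```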

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D103662
2021-06-11 17:47:28 -07:00
Jez Ng 464d3dc3d1 [lld-macho] Have dead-stripping work with literal sections
Literal sections are not atomically live or dead. Rather,
liveness is tracked for each individual literal they contain. CStrings
have their liveness tracked via a `live` bit in StringPiece, and
fixed-width literals have theirs tracked via a BitVector.

The live-marking code now needs to track the offset within each section
that is to be marked live, in order to identify the literal at that
particular offset.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W
with both `-dead_strip` and `--deduplicate-literals`, with and without this diff
applied:

```
    N           Min           Max        Median           Avg        Stddev
x  20          4.32          4.44         4.375         4.372    0.03105174
+  20           4.3          4.39          4.36        4.3595   0.023277502
No difference proven at 95.0% confidence
```
This gives us size savings of about 0.4%.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103979
2021-06-11 19:50:09 -04:00
Jez Ng 5d88f2dd94 [lld-macho] Deduplicate fixed-width literals
Conceptually, the implementation is pretty straightforward: we put each
literal value into a hashtable, and then write out the keys of that
hashtable at the end.

In contrast with ELF, the Mach-O format does not support variable-length
literals that aren't strings. Its literals are either 4, 8, or 16 bytes
in length. LLD-ELF dedups its literals via sorting + uniq'ing, but since
we don't need to worry about overly-long values, we should be able to do
a faster job by just hashing.
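
As a rough illustration of the hashing approach for 8-byte literals (the
class and method names below are hypothetical, and this is a serial sketch
rather than the actual implementation), each literal is looked up in a
hashtable and appended to the output only on first sight:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of hashing-based dedup for 8-byte literals.
class Literal8Pool {
public:
  // Returns the output index of `value`, inserting it on first sight.
  size_t intern(uint64_t value) {
    auto it = indexOf.find(value);
    if (it != indexOf.end())
      return it->second;
    size_t index = values.size();
    indexOf[value] = index;
    values.push_back(value);
    // Invariant: one table entry per unique literal.
    assert(values.size() == indexOf.size());
    return index;
  }

  // Unique literals in first-seen order, ready to be written out.
  const std::vector<uint64_t> &contents() const { return values; }

private:
  std::unordered_map<uint64_t, size_t> indexOf;
  std::vector<uint64_t> values;
};
```

Because the key has a fixed width, no sorting pass is needed. References to
a literal can be rewritten to the index returned by `intern`.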

That said, the implementation right now is far from optimal, because we
add to those hashtables serially. To parallelize this, we'll need a
basic concurrent hashtable (only needs to support concurrent writes w/o
interleaved reads), which shouldn't be too hard to implement, but I'd like
to punt on it for now.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.27          4.39         4.315        4.3225   0.033225703
  +  20          4.36          4.82          4.44        4.4845    0.13152846
  Difference at 95.0% confidence
          0.162 +/- 0.0613971
          3.74783% +/- 1.42041%
          (Student's t, pooled s = 0.0959262)

This corresponds to binary size savings of 2MB out of 335MB, or 0.6%.
It's not a great tradeoff as-is, but as mentioned our implementation can
be significantly optimized, and literal dedup will unlock more
opportunities for ICF to identify identical structures that reference
the same literals.

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D103113
2021-06-11 19:50:08 -04:00
Nico Weber 54418c5a35 [lld/mac] Make binaries written by lld strippable
Be less clever when writing the indirect symbols in LC_DYSYMTAB:
lld used to make __stubs and __la_symbol_ptr point at the
same bytes in the indirect symbol table in the __LINKEDIT segment.
That confused strip, so write the same bytes twice and make
__stubs and __la_symbol_ptr point at one copy each, so that they
don't share data. This unconfuses strip, and seems to be what ld64
does too, so hopefully tools are generally more used to this.

This makes the output binaries a bit larger, but not much: 4 bytes
for roughly each called function from a dylib and each weak function.
Chromium Framework grows by 6536 bytes, clang-format by a few hundred.

With this, `strip -x Chromium\ Framework` works (244 MB before stripping
to 171 MB after stripping, compared to 236 MB=>164 MB with ld64). Running
strip without `-x` produces the same error message now for lld-linked
Chromium Framework as for when using ld64 as a linker.

`strip clang-format` also works now but didn't previously.

Fixes PR50657.

Differential Revision: https://reviews.llvm.org/D104081
2021-06-11 00:18:03 -04:00
Fangrui Song c03b6305d8 [ELF][RISCV] Resolve branch relocations referencing undefined weak to current location if not using PLT
In a -no-pie link we optimize R_PLT_PC to R_PC. Currently we resolve a branch
relocation to the link-time zero address. However, such a choice tends to cause
relocation overflows on RISC architectures.

* aarch64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the next instruction
* mips: GNU ld: branch to the start of the text segment (?); ld.lld: branch to zero
* ppc32: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* ppc64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* riscv: GNU ld: branch to the absolute zero address (with instruction rewriting)
* i386/x86_64: GNU ld/ld.lld: branch to the link-time zero address

I think that resolving to the same location is a good choice. The instruction,
if triggered, is clearly an undefined behavior. Resolving to the same location
can cause an infinite loop (making the user aware of the issue) while ensuring
no overflow.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D103001
2021-06-10 13:25:16 -07:00
Nico Weber e87c095af3 [lld/mac] Print dylib search details with --print-dylib-search or RC_TRACE_DYLIB_SEARCHING
For debugging dylib loading, it's useful to have some insight into what
the linker is doing.

ld64 has the undocumented RC_TRACE_DYLIB_SEARCHING env var
for printing dylib search candidates.

This adds a flag --print-dylib-search to make lld print the same information.
It's useful for users, but also for writing tests. The output is formatted
slightly differently than ld64, but we still support RC_TRACE_DYLIB_SEARCHING
to offer at least a compatible way to trigger this.

ld64 has both `-print_statistics` and `-trace_symbol_output` to enable
diagnostics output. I went with "print" since that seems like a more
straightforward name.

Differential Revision: https://reviews.llvm.org/D103985
2021-06-09 22:08:20 -04:00
Nico Weber bbe6f51b72 [lld/mac] Make framework symlinks in tests more realistic
In a framework Foo.framework, Foo.framework/Foo is usually a relative
symbolic link to Foo.framework/Versions/Current/Foo,
and Foo.framework/Versions/Current is usually a relative symbolic
link to A.

Our tests used absolute symbolic links. Now they use relative symbolic links.

No behavior change, just makes the tests more representative of the real world.

(implicit-dylib.s omits the "Current" folder too, but I'm not changing that
here.)

Differential Revision: https://reviews.llvm.org/D103998
2021-06-09 20:39:39 -04:00
Nico Weber 0e399eb527 [lld/mac] When handling @loader_path, use realpath() of symlinks
This is important for Frameworks, which are usually symlinks.

ld64 gets this right for @rpath that's replaced with @loader_path, but not for
bare @loader_path -- ld64's code calls realpath() in that case too, but ignores
the result.

ld64 somehow manages to find libbar1.dylib in the test without the
explicit `-rpath` in Foo1. I don't understand why or how. But this
change is a step forward and fixes an immediate problem I'm having,
so let's start with this :)

Differential Revision: https://reviews.llvm.org/D103990
2021-06-09 20:36:07 -04:00
Fangrui Song 928a197d26 [ELF] Add a GRP_COMDAT test with a local signature symbol
See https://groups.google.com/g/generic-abi/c/2X6mR-s2zoc

Test that a local signature symbol does not suppress COMDAT deduplication.
2021-06-08 09:22:30 -07:00
Jez Ng 447dfbe005 [lld-macho] Implement -force_load_swift_libs
It causes libraries whose names start with "swift" to be force-loaded.
Note that unlike the more general `-force_load`, this flag only applies
to libraries specified via LC_LINKER_OPTIONS, and not those passed on
the command-line. This is what ld64 does.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103709
2021-06-07 23:48:35 -04:00
Jez Ng 04259cde15 [lld-macho] Implement cstring deduplication
Our implementation draws heavily from LLD-ELF's, which in turn delegates
its string deduplication to llvm-mc's StringTableBuilder. The messiness of
this diff is largely due to the fact that we've previously assumed that
all InputSections get concatenated together to form the output. This is
no longer true with CStringInputSections, which split their contents into
StringPieces. StringPieces are much more lightweight than InputSections,
which is important as we create a lot of them. They may also overlap in
the output, which makes it possible for strings to be tail-merged. In
fact, the initial version of this diff implemented tail merging, but
I've dropped it for reasons I'll explain later.

**Alignment Issues**

Mergeable cstring literals are found under the `__TEXT,__cstring`
section. In contrast to ELF, which puts strings that need different
alignments into different sections, clang's Mach-O backend puts them all
in one section. Strings that need to be aligned have the `.p2align`
directive emitted before them, which simply translates into zero padding
in the object file.

I *think* ld64 extracts the desired per-string alignment from this data
by preserving each string's offset from the last section-aligned
address. I'm not entirely certain since it doesn't seem consistent about
doing this; but perhaps this can be chalked up to cases where ld64 has
to deduplicate strings with different offset/alignment combos -- it
seems to pick one of their alignments to preserve. This doesn't seem
correct in general; we can in fact induce ld64 to produce a crashing
binary just by linking in an additional object file that only contains
cstrings and no code. See PR50563 for details.

Moreover, this scheme seems rather inefficient: since unaligned and
aligned strings are all put in the same section, which has a single
alignment value, it doesn't seem possible to tell whether a given string
doesn't have any alignment requirements. Preserving offset+alignments
for strings that don't need it is wasteful.

In practice, the crashes seen so far seem to stem from x86_64 SIMD
operations on cstrings. X86_64 requires SIMD accesses to be
16-byte-aligned. So for now, I'm thinking of just aligning all strings
to 16 bytes on x86_64. This is indeed wasteful, but implementation-wise
it's simpler than preserving per-string alignment+offsets. It also
avoids the aforementioned crash after deduplication of
differently-aligned strings. Finally, the overhead is not huge: using
16-byte alignment (vs no alignment) is only a 0.5% size overhead when
linking chromium_framework.
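
To make the "align everything to 16 bytes" idea concrete, here is a small
sketch. `alignTo` and `layoutCStrings` are hypothetical names and not LLD's
own. It assigns each distinct string an output offset rounded up to the
requested alignment, and duplicates reuse the first copy:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Round `value` up to the next multiple of `align` (a power of two).
uint64_t alignTo(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

// Hypothetical helper: assign each distinct string an aligned output offset.
std::unordered_map<std::string, uint64_t>
layoutCStrings(const std::vector<std::string> &strings, uint64_t align) {
  std::unordered_map<std::string, uint64_t> offsets;
  uint64_t next = 0;
  for (const std::string &s : strings) {
    if (offsets.count(s))
      continue; // duplicate: reuse the existing copy
    uint64_t off = alignTo(next, align);
    offsets[s] = off;
    next = off + s.size() + 1; // +1 for the NUL terminator
  }
  return offsets;
}
```

The padding between strings is where the size overhead comes from. Since
every string starts on an aligned boundary, a suffix of one string rarely
lines up with the start of another.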

With these alignment requirements, it doesn't make sense to attempt tail
merging -- most strings will not be eligible since their overlaps aren't
likely to start at a 16-byte boundary. Tail-merging (with alignment) for
chromium_framework only improves size by 0.3%.

It's worth noting that LLD-ELF only does tail merging at `-O2`. By
default (at `-O1`), it just deduplicates w/o tail merging. @thakis has
also mentioned that they saw it regress compressed size in some cases
and therefore turned it off. `ld64` does not seem to do tail merging at
all.

**Performance Numbers**

CString deduplication reduces chromium_framework from 250MB to 242MB, or
about a 3.2% reduction.

Numbers for linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          3.91          4.03         3.935          3.95   0.034641016
  +  20          3.99          4.14         4.015        4.0365     0.0492336
  Difference at 95.0% confidence
          0.0865 +/- 0.027245
          2.18987% +/- 0.689746%
          (Student's t, pooled s = 0.0425673)

As expected, cstring merging incurs some non-trivial overhead.

When passing `--no-literal-merge`, it seems that performance is the
same, i.e. the refactoring in this diff didn't cost us.

      N           Min           Max        Median           Avg        Stddev
  x  20          3.91          4.03         3.935          3.95   0.034641016
  +  20          3.89          4.02         3.935        3.9435   0.043197831
  No difference proven at 95.0% confidence

Reviewed By: #lld-macho, gkm

Differential Revision: https://reviews.llvm.org/D102964
2021-06-07 23:48:35 -04:00
Nico Weber 17c43c4045 [lld/mac] Add reexports after reexporter to inputFiles
When a library "host"'s reexports change their installName with
`$ld$os10.11$install_name$host`, we used to write a load command for "host" but
write the version numbers of the reexport instead of "host". This fixes that.

I first thought that the rule is to take the version numbers from the library
that originally had that install name (implemented in D103819), but that's not
what ld64 seems to be doing: It takes the version number from the first dylib
with that install name it loads, and it loads the reexporting library before
the reexports. We already did most of that, we just added reexports before the
reexporter. After this change, we add the reexporter before the reexports.

Addresses https://bugs.llvm.org/show_bug.cgi?id=49800#c11 part 1.

(ld64 seems to add reexports after processing _all_ files on the command line,
while we add them right after the reexporter. For the common case of reexport +
$ld$ symbol changing back to the exporter name, this doesn't make a difference,
but you can construct a case where it does. I expect this to not make a
difference in practice though.)

Differential Revision: https://reviews.llvm.org/D103821
2021-06-07 17:04:03 -04:00
Nico Weber 422544414b [lld/mac] Add a test for -reexport_library + -dead_strip_dylibs
Our behavior here already matched ld64, now we have a test for it.

(ld64 even strips the library here if you also pass -needed_library bar.dylib.
That seems wrong to me, and lld honors needed_library in that case.)

Differential Revision: https://reviews.llvm.org/D103812
2021-06-07 13:44:58 -04:00
Nico Weber c5ffe97988 [lld/mac] Implement support for searching dylibs with @rpath/ in install name
Also adjust a few comments, and move the DylibFile comment talking about
umbrella next to the parameter again.

Differential Revision: https://reviews.llvm.org/D103783
2021-06-07 06:22:52 -04:00
Nico Weber 52489021cf [lld/mac] Implement support for searching dylibs with @loader_path/ in install name
Differential Revision: https://reviews.llvm.org/D103779
2021-06-06 20:19:50 -04:00
Nico Weber a48bd587f7 [lld/mac] Implement support for searching dylibs with @executable_path/ in install name
Differential Revision: https://reviews.llvm.org/D103775
2021-06-06 20:01:50 -04:00
Alexander Shaposhnikov 5e49ee8794 [lld][MachO] Add support for $ld$install_name symbols
This diff adds support for $ld$install_name symbols.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D103746
2021-06-05 12:58:59 -07:00
Alexander Shaposhnikov cf29a92b90 [lld][MachO] Fix typo in special-symbol-ld-previous.s
Fix typo in the test special-symbol-ld-previous.s. NFC.
2021-06-05 01:27:42 -07:00
Alexander Shaposhnikov 1309c181a8 [lld][MachO] Add first bits to support special symbols
This diff adds first bits to support special symbols $ld$previous* in LLD.
$ld$* symbols modify properties/behavior of the library
(e.g. its install name, compatibility version or hide/add symbols)
for specific target versions.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D103505
2021-06-04 23:32:26 -07:00
Nico Weber 1aae55ddea [lld/mac] Add test coverage for --reproduce + -flat_namespace
Works fine already, now it has a test too.

Differential Revision: https://reviews.llvm.org/D103643
2021-06-03 21:00:35 -04:00
Jez Ng 6881f29a36 [lld-macho] Parse re-exports of nested TAPI documents
D103423 neglected to call `parseReexports()` for nested TBD
documents, leading to symbol resolution failures when trying to look up
a symbol nested more than one level deep in a TBD file. This fixes the
regression and adds a test.

It also appears that `umbrella` wasn't being set properly when calling
`parseLoadCommands` -- it's supposed to resolve to `this` if `nullptr`
is passed. I didn't write a failing test case for this but I've made
`umbrella` a member so the previous behavior should be preserved.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D103586
2021-06-03 12:02:30 -04:00
Martin Storsjö 728cc0075e [LLD] [COFF] Fix autoexport from LTO objects with comdat symbols
Make sure that comdat symbols also have a non-null dummy
SectionChunk associated.

This requires moving around an existing FIXME regarding comdats in
LTO.

Differential Revision: https://reviews.llvm.org/D103012
2021-06-03 15:14:49 +03:00
Nico Weber 5ecfdb5123 [lld/mac] try to fix tests after a5645513db
My linux system doesn't like the `grep` for some reason,
but FileCheck seems to work.
2021-06-02 11:33:11 -04:00
Nico Weber a5645513db [lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.

Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236MB stripped with ld64).

Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:

    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)

This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols

It's possible it interacts with more features I didn't think of,
of course.

I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
  as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests

Implementation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.

Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
  since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
  (`.no_dead_strip` in asm)

Fixes PR49276.

Differential Revision: https://reviews.llvm.org/D103324
2021-06-02 11:09:26 -04:00
Nico Weber 66a1ecd2cf [lld/mac] Implement -needed_framework, -needed_library, -needed-l
These allow overriding dead_strip_dylibs.

Differential Revision: https://reviews.llvm.org/D103499
2021-06-02 11:06:42 -04:00
Nico Weber e14fd7d879 [lld/mac] Don't strip explicit dylib also mentioned in LC_LINKER_OPTION
Noticed by Jez in D103499.

Differential Revision: https://reviews.llvm.org/D103521
2021-06-02 10:59:56 -04:00
Nico Weber 78ce89bb1e [lld/mac] Implement -reexport_framework, -reexport_library, -reexport-l
These are slightly easier-to-use versions of -sub_library and -sub_umbrella.

Differential Revision: https://reviews.llvm.org/D103497
2021-06-02 06:37:34 -04:00
Nico Weber 222a88a243 [lld/mac] Make -t work correctly with -flat_namespace
We used to not print dylibs referenced by other dylibs in `-t` mode. This
affected reexports, and with `-flat_namespace` also just dylibs loaded by
dylibs. Now we print them.

Fixes PR49514.

Differential Revision: https://reviews.llvm.org/D103428
2021-06-01 19:23:39 -04:00
Nico Weber aeae3e0ba9 [lld/mac] Emit only one LC_LOAD_DYLIB per dylib
In some cases, we end up with several distinct DylibFiles that
have the same install name. Only emit a single LC_LOAD_DYLIB in
those cases.

This happens in 3 cases I know of:

1. Some tbd files are symlinks. libpthread.tbd is a symlink against
   libSystem.tbd for example, so `-lSystem -lpthread` loads
   libSystem.tbd twice. We could (and maybe should) cache loaded
   dylibs by realpath() to catch this.

2. Some tbd files are copies of each other. For example,
   CFNetwork.framework/CFNetwork.tbd and
   CFNetwork.framework/Versions/A/CFNetwork.tbd are two distinct
   copies of the same file. The former is found by
   `-framework CFNetwork` and the latter by the reexport in
   CoreServices.tbd. We could conceivably catch this by
   making `-framework` search look in `Versions/Current` instead
   of in the root, and/or by using a content hash to cache
   tbd files, but that's starting to sound complicated.

3. Magic $ld$ symbol processing can change the install name of
   a dylib based on the target platform_version. Here, two
   truly distinct dylibs can have the same install name.

So we need this code to deal with (3) anyways. Might as well use
it for 1 and 2, at least for now :)
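
The deduplication described above can be sketched as follows. The `Dylib`
struct and `uniqueLoadCommands` are hypothetical stand-ins, not LLD's actual
types. The first dylib seen with a given install name wins, and input order
is preserved:

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a loaded dylib; only the install name matters here.
struct Dylib {
  std::string installName;
};

// Returns the install names that would each get one load command,
// keeping the first occurrence of every name and preserving input order.
std::vector<std::string> uniqueLoadCommands(const std::vector<Dylib> &dylibs) {
  std::unordered_set<std::string> seen;
  std::vector<std::string> out;
  for (const Dylib &d : dylibs)
    if (seen.insert(d.installName).second)
      out.push_back(d.installName);
  assert(out.size() == seen.size());
  return out;
}
```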

With this (and D103430), clang-format links in the same dylibs
when linked with lld and ld64.

Differential Revision: https://reviews.llvm.org/D103488
2021-06-01 18:15:35 -04:00
Sam Clegg c1a59fa550 [lld][WebAssembly] Fix for string merging of -dwarf-5 sections
We were mistakenly treating `.debug_str_offsets` as a string-mergeable
section when it is not (it contains integers not strings).  This is an
indication that we really should find a way to store flags for custom
sections.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=48828
Fixes: https://bugs.chromium.org/p/chromium/issues/detail?id=1172217

Differential Revision: https://reviews.llvm.org/D103486
2021-06-01 14:33:56 -07:00
Nico Weber 2c1903412b [lld/mac] Implement removal of unused dylibs
This omits load commands for unreferenced dylibs if:
- the dylib was loaded implicitly,
- it is marked MH_DEAD_STRIPPABLE_DYLIB
- or -dead_strip_dylibs is passed

This matches ld64.

Currently, the "is dylib referenced" state is computed before dead code
stripping and is not updated after dead code stripping. This too matches ld64.
We should do better here.

With this, clang-format linked with lld (like with ld64) no longer has
libobjc.A.dylib in `otool -L` output. (It was implicitly loaded as a reexport
of CoreFoundation.framework, but it's not needed.)

Differential Revision: https://reviews.llvm.org/D103430
2021-06-01 16:06:30 -04:00
Nico Weber 0b39f055d8 [lld/mac] Don't write mtimes to N_OSO entries if ZERO_AR_DATE is set.
This is important for build determinism. This matches ld64.

Differential Revision: https://reviews.llvm.org/D103446
2021-06-01 15:29:38 -04:00
Nico Weber c4053cd14e [lld/mac] Don't crash on -order_file with assembly inputs on arm64
.s files with `-g` generate __debug_aranges on darwin/arm64 for some
reason, and those lead to `nullptr` symbols. Don't crash on that.

Fixes PR50517.

Differential Revision: https://reviews.llvm.org/D103350
2021-05-28 21:00:46 -04:00
Fangrui Song 2644399ce7 [lld-macho][test] Simplify --allow-empty with count 0
2021-05-28 15:15:59 -07:00
Reid Kleckner 109aac9212 [PDB] Enable parallel ghash type merging by default
Ghashing is probably going to be faster in most cases, even without
precomputed ghashes in object files.

Here is my table of results linking clang.pdb:

-------------------------------
| threads | GHASH   | NOGHASH |
-------------------------------
|  j1     | 51.031s | 25.141s |
|  j2     | 31.079s | 22.109s |
|  j4     | 18.609s | 23.156s |
|  j8     | 11.938s | 21.984s |
| j28     |  8.375s | 18.391s |
-------------------------------

This shows that ghashing is faster if at least four cores are available.
This may make the linker slower if most cores are busy in the middle of
a build, but in that case, the linker probably isn't on the critical
path of the build. Incremental build performance is arguably more
important than highly contended batch build link performance.

The -time output indicates that ghash computation is the dominant
factor:

    Input File Reading:             924 ms (  1.8%)
    GC:                             689 ms (  1.3%)
    ICF:                            527 ms (  1.0%)
    Code Layout:                    414 ms (  0.8%)
    Commit Output File:              24 ms (  0.0%)
    PDB Emission (Cumulative):    49938 ms ( 94.8%)
      Add Objects:                46783 ms ( 88.8%)
        Global Type Hashing:      38983 ms ( 74.0%)
        GHash Type Merging:        5640 ms ( 10.7%)
        Symbol Merging:            2154 ms (  4.1%)
      Publics Stream Layout:        188 ms (  0.4%)
      TPI Stream Layout:             18 ms (  0.0%)
      Commit to Disk:              2818 ms (  5.4%)
  --------------------------------------------------
  Total Link Time:                52669 ms (100.0%)

We can speed that up with a faster content hash (not SHA1).

Differential Revision: https://reviews.llvm.org/D102888
2021-05-27 14:19:36 -07:00
Jez Ng fcab06bd85 [lld-macho][nfc] Sort OutputSections based on explicit order of command-line inputs
This diff paves the way for {D102964} which adds a new kind of
InputSection.

We previously maintained section ordering implicitly: we created
InputSections as we parsed each file in command-line order, and passed
on this ordering when we created OutputSections and OutputSegments by
iterating over these InputSections. The implicitness of the ordering
made it difficult to refactor the code to e.g. handle a new type of
InputSection. As such, I've codified the ordering explicitly via
`inputOrder` fields. This also allows us to use `sort` instead of
`stable_sort`.
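
A small sketch of why an explicit key removes the need for `stable_sort`,
assuming every section carries a unique `inputOrder`. The `Section` struct
below is a hypothetical stand-in. With unique keys, no two elements compare
equal, so an unstable sort still yields a single deterministic order:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an output section; `inputOrder` mirrors the
// field named in the commit message, everything else is illustrative.
struct Section {
  std::string name;
  int inputOrder;
};

void sortByInputOrder(std::vector<Section> &sections) {
  std::sort(sections.begin(), sections.end(),
            [](const Section &a, const Section &b) {
              return a.inputOrder < b.inputOrder;
            });
  // Unique keys are what make an unstable sort deterministic here.
  assert(std::adjacent_find(sections.begin(), sections.end(),
                            [](const Section &a, const Section &b) {
                              return a.inputOrder == b.inputOrder;
                            }) == sections.end());
}
```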

Benchmarking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W:

      N           Min           Max        Median           Avg        Stddev
  x  20          4.23          4.35          4.27         4.274   0.030157481
  +  20          4.24          4.38          4.27        4.2815   0.033759989
  No difference proven at 95.0% confidence

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D102972
2021-05-25 14:58:29 -04:00
Nathan Lanza 2f65166056 [lld:elf] Weaken the requirement for a computed binding to be STB_LOCAL
Given the following scenario:

```
// Cat.cpp
struct Animal { virtual void makeNoise() const = 0; };
struct Cat : Animal { void makeNoise() const override; };

extern "C" int puts(char const *);
void Cat::makeNoise() const { puts("Meow"); }
void doThingWithCat(Animal *a) { static_cast<Cat *>(a)->makeNoise(); }

// CatUser.cpp
struct Animal { virtual void makeNoise() const = 0; };
struct Cat : Animal { void makeNoise() const override; };

void doThingWithCat(Animal *a);

void useDoThingWithCat() {
  Cat *d = new Cat;
  doThingWithCat(d);
}

// cat.ver
{
  global: _Z17useDoThingWithCatv;
  local: *;
};

$ clang++ Cat.cpp CatUser.cpp -fpic -flto=thin -fwhole-program-vtables
-shared -O3 -fuse-ld=lld -Wl,--lto-whole-program-visibility
-Wl,--version-script,cat.ver
```

We cannot devirtualize `Cat::makeNoise`. The issue is complex:

Due to `-fsplit-lto-unit` and usage of type metadata, we place the Cat
vtable declaration into module 0 and the Cat vtable definition with type
metadata into module 1, causing duplicate entries (Undefined followed by
Defined) in the `lto::InputFile::symbols()` output.
In `BitcodeFile::parse`, after processing the `Undefined` then the
`Defined`, the final state is `Defined`.
In `BitcodeCompiler::add`, for the first symbol, `computeBinding`
returns `STB_LOCAL`, then we reset it to `Undefined` because it is
prevailing (`versionId` is `preserved`). For the second symbol, because
the state is now `Undefined`, `computeBinding` returns `STB_GLOBAL`,
causing `ExportDynamic` to be true and suppressing devirtualization.

In D77280, the `computeBinding` change used a stricter `isDefined()`
condition to make weak `Lazy` symbols work.
This patch relaxes the condition to weaker `!isLazy()` to keep it
working while making the devirtualization work as well.

Differential Revision: https://reviews.llvm.org/D98686
2021-05-24 23:32:21 -04:00
serge-sans-paille 4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee0.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71, no attribute is equivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00