Commit Graph

7491 Commits

Author SHA1 Message Date
Daniel Bertalan 573c7e6b3c [lld-macho] Handle LOH_ARM64_ADRP_LDR linker optimization hints
This linker optimization hint transforms a pair of adrp+ldr (immediate)
instructions into an ldr (literal) load from a PC-relative address if
it is 4-byte aligned and within +/- 1 MiB, as ldr can encode a signed
19-bit offset that gets multiplied by 4.

In the wild, only a small number of these hints are applicable because
not many loads end up close enough to the data segment. However, the
added helper functions will be useful in implementing the rest of the
LOH types.

Differential Revision: https://reviews.llvm.org/D128942
2022-07-01 09:44:24 +02:00
Daniel Bertalan a3f67f0920 [lld-macho] Initial support for Linker Optimization Hints
Linker optimization hints mark a sequence of instructions used for
synthesizing an address, like ADRP+ADD. If the referenced symbol ends up
close enough, it can be replaced by a faster sequence of instructions
like ADR+NOP.

This commit adds support for 2 of the 7 defined ARM64 optimization
hints:
- LOH_ARM64_ADRP_ADD, which transforms a pair of ADRP+ADD into ADR+NOP
  if the referenced address is within +/- 1 MiB
- LOH_ARM64_ADRP_ADRP, which transforms two ADRP instructions into
  ADR+NOP if they reference the same page

These two kinds already cover more than 50% of all LOHs in
chromium_framework.

Differential Review: https://reviews.llvm.org/D128093
2022-06-30 06:28:42 +02:00
Daniel Bertalan 8d29f0fdb9 [lld-macho] Emit REBASE_OPCODE_ADD_ADDR_IMM_SCALED if possible
An ADD_ADDR rebase opcode's argument can be encoded as an immediate if
the offset is less than 15 * word size. This change reduces the size of
chromium_framework by 100+ KiB.

Differential Revision: https://reviews.llvm.org/D128798
2022-06-29 22:28:39 +02:00
Sam Clegg 53217ecb88 [lld][WebAssembly] Don't apply data relocations at static constructor time
Instead, export `__wasm_apply_data_relocs` and `__wasm_call_ctors`
separately.

This is required since user code in a shared library (such as static
constructors) should not be run until relocations have been applied to
all loaded libraries.

See: https://github.com/emscripten-core/emscripten/issues/17295

Differential Revision: https://reviews.llvm.org/D128515
2022-06-27 15:50:02 -07:00
Fangrui Song 0688b00fc3 [ELF] Remove deprecated -dc
-dc is deprecated in release/14.x. Remove it for 15.0.
The only usage I know was FreeBSD crungen which was removed by https://reviews.freebsd.org/D34215

glibc just dropped -Wl,-d today. Keep -d for now.
2022-06-26 17:26:44 -07:00
Fangrui Song b95cca03cd [ELF] Improve compound assignment tests
Also use strchr instead of is_contained.
2022-06-25 22:30:52 -07:00
Fangrui Song 0a0effdd5b [ELF] Support -= *= /= <<= >>= &= |= in symbol assignments 2022-06-25 22:22:59 -07:00
Fangrui Song 77295c5486 [ELF] Allow ? without adjacent space
GNU ld allows 1 ? 2?3:4 : 5?6 :7
2022-06-25 21:16:59 -07:00
Fangrui Song e3f3d2abf0 [ELF][test] Improve expression test 2022-06-25 21:11:32 -07:00
Fangrui Song 21bf6bb3d3 [ELF] Fix assertion failure when PROVIDE/HIDDEN/PROVIDE_HIDDEN does not have = 2022-06-25 20:26:47 -07:00
Fangrui Song fe0de25b21 [ELF] Allow an expression to follow = in a symbol assignment
GNU ld doesn't require whitespace before =. Match it.
2022-06-25 20:25:34 -07:00
Fangrui Song b0d6dd3905 [ELF] Fix precedence of ? when there are 2 or more operators on the left hand side
For 1 != 1 <= 1 ? 1 : 2, the current code incorrectly considers that ?
has a higher precedence than != (minPrec).

Also, add a test for right associativity.
2022-06-25 13:48:52 -07:00
Fangrui Song d479b2e4db [ELF] Fix precedence of == and != in expressions
In GNU ld, the == and != operators have lower precedence than < > <= >=.
This behavior matches C.
2022-06-25 13:47:32 -07:00
Fangrui Song 4cb05dc3cb [ELF] Support quoted name in the TARGET command 2022-06-25 12:31:20 -07:00
Fangrui Song 363b29567e [ELF] Support quoted symbol in the ENTRY command
This matches GNU ld and matches other places we unquote the symbol name.

Fixes #56208
2022-06-25 12:19:45 -07:00
Fangrui Song c5578fca16 [ELF][test] Improve linkerscript/entry.s 2022-06-25 12:14:47 -07:00
Peter Collingbourne b064bc18c3 ELF: Do not relax ADRP/LDR -> ADRP/ADD for absolute symbols in PIC.
GOT references to absolute symbols can't be relaxed to use ADRP/ADD in
position-independent code because these instructions produce a relative
address.

Differential Revision: https://reviews.llvm.org/D128492
2022-06-24 08:47:23 -07:00
Nico Weber a2c1f7c90d [lld, ELF and mac] Add --time-trace=<file>, remove --time-trace-file=<file>
`--time-trace=foo` has the same behavior as `--time-trace --time-trace-file=<file>`
had previously.

Also, for mac, make --time-trace-granularity *not* imply --time-trace, to match
behavior of the ELF port.

Differential Revision: https://reviews.llvm.org/D128451
2022-06-23 15:46:22 -04:00
Jin Xin Ng 22f1273357
[ThinLTO][ELF] Add --thinlto-emit-index-files option
Allows ThinLTO indices to be written to disk on-the-fly/as-part-of “normal” linker execution. Previously ThinLTO indices could be written via --thinlto-index-only but that would cause the linker to exit early. For MLGO specifically, this enables saving the ThinLTO index files without having to restart the linker to collect data only available at later stages (i.e. output of --save-temps) of the linker's execution.

Note, this option does not currently work with:
--thinlto-object-suffix-replace, as this is intended to be used to consume minimized IR bitcode files while --thinlto-emit-index-files is intended to be run together with InProcessThinLTO (which cannot parse minimized IR).
--thinlto-prefix-replace  support is left unimplemented but can be implemented if needed

Differential Revision: https://reviews.llvm.org/D127777
2022-06-23 12:35:42 -07:00
Daniel Bertalan ed39fd515a [lld-macho] Use source information in duplicate symbol errors
Similarly to how undefined symbol diagnostics were changed in D128184,
we now show where in the source file duplicate symbols are defined at:

  ld64.lld: error: duplicate symbol: _foo
  >> defined in bar.c:42
  >>            /path/to/bar.o
  >> defined in baz.c:1
  >>            /path/to/libbaz.a(baz.o)

For objects that don't contain DWARF data, the format is unchanged.

A slight difference to undefined symbol diagnostics is that we don't
print the name of the symbol on the third line, as it's already
contained on the first line.

Differential Revision: https://reviews.llvm.org/D128425
2022-06-23 11:07:15 -04:00
Fangrui Song 4512dda6af [ELF][test] Clean up thinlto* 2022-06-22 16:19:17 -07:00
Daniel Bertalan 5792797c5b Reland "[lld-macho] Show source information for undefined references"
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

If DWARF line information is available, we now show where in the source
the references are coming from:

  ld64.lld: error: unreferenced symbol: _foo
  >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42)
  >>>                /path/to/bar.o:(symbol _baz+0x4)

The reland is identical to the first time this landed. The fix was in D128294.
This reverts commit 0cc7ad4175.

Differential Revision: https://reviews.llvm.org/D128184
2022-06-21 18:50:06 -04:00
Daniel Bertalan 77b6efbd82 [ADT] [lld-macho] Check for end iterator deref in filter_iterator_base
If ld64.lld was supplied an object file that had a `__debug_abbrev` or
`__debug_str` section, but didn't have any compile unit DIEs in
`__debug_info`, it would dereference an iterator pointing to the empty
array of DIEs. This underlying issue started causing segmentation faults
when parsing for `__debug_info` was addded in D128184. That commit was
reverted, and this one fixes the invalid dereference to allow relanding
it.

This commit adds an assertion to `filter_iterator_base`'s dereference
operators to catch bugs like this one.

Ran check-llvm, check-clang and check-lld.

Differential Revision: https://reviews.llvm.org/D128294
2022-06-21 15:47:45 -04:00
Nico Weber 0cc7ad4175 Revert "[lld-macho] Show source information for undefined references"
This reverts commit cd7624f153.
See https://reviews.llvm.org/D128184#3597534
2022-06-20 19:15:57 -04:00
Daniel Bertalan cd7624f153 [lld-macho] Show source information for undefined references
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

If DWARF line information is available, we now show where in the source
the references are coming from:

  ld64.lld: error: unreferenced symbol: _foo
  >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42)
  >>>                /path/to/bar.o:(symbol _baz+0x4)

Differential Revision: https://reviews.llvm.org/D128184
2022-06-20 18:49:42 -04:00
Jez Ng 8eeede973c [lld-macho][nfc] Tests for -force_load + regular archive load combinations
I realized we'd forgotten to cover this case (though our existing
behavior is indeed correct / matches ld64's).

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D128025
2022-06-16 23:50:07 -04:00
Daniel Bertalan 0eec7e2a89 Reland "[lld-macho] Group undefined symbol diagnostics by symbol".
This reverts commit 36e7c9a450.

This relands d61341768c with the fix described in
https://reviews.llvm.org/D127753#3587390
2022-06-15 19:22:39 -04:00
Stella Stamenova 36e7c9a450 Revert "[lld-macho] Group undefined symbol diagnostics by symbol"
This reverts commit d61341768c.

This change broke multiple lld tests, including some sanitizer builds: https://lab.llvm.org/buildbot/#/builders/5/builds/24787/steps/19/logs/stdio
2022-06-15 15:42:26 -07:00
Keith Smiley 272bf0fc41
[lld-macho] Add support for exporting no symbols
As an optimization for ld64 sometimes it can be useful to not export any
symbols for top level binaries that don't need any exports, to do this
you can pass `-exported_symbols_list /dev/null`, or new with Xcode 14
(ld64 816) there is a `-no_exported_symbols` flag for the same behavior.
This reproduces this behavior where previously an empty exported symbols
list file would have been ignored.

Differential Revision: https://reviews.llvm.org/D127562
2022-06-15 15:07:27 -07:00
Pengxuan Zheng 9db61c3fe1 [LLD][COFF] Convert file name to lowercase when inserting it into visitedLibs
It seems to be a bug in `LinkerDriver::findFile`, the file name is not converted
to lowercase when being inserted into `visitedLibs`. This is the only exception
in the file and all other places always convert file names to lowercase when
inserting them into `visitedLibs` (or `visitedFiles`).

Reviewed By: thieta, hans

Differential Revision: https://reviews.llvm.org/D127709
2022-06-15 09:39:35 -07:00
Martin Storsjö aefa11166f [LLD] [MinGW] Implement --disable-reloc-section, mapped to /fixed
Since binutils 2.36, GNU ld defaults to emitting base relocations,
and that version added the new option --disable-reloc-section to
disable it.

Differential Revision: https://reviews.llvm.org/D127478
2022-06-15 16:51:20 +03:00
Daniel Bertalan d61341768c [lld-macho] Group undefined symbol diagnostics by symbol
ld64.lld used to print the "undefined symbol" line for each reference to
an undefined symbol previously:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x0)

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _quux+0x1)

Now they are deduplicated:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x0)
  >>> referenced by /path/to/bar.o:(symbol _quux+0x1)

As with the other lld ports, only the first 3 references are printed.

Differential Revision: https://reviews.llvm.org/D127753
2022-06-14 16:38:11 -04:00
Daniel Bertalan f2e92cf60e [lld-macho] Print the name of functions containing undefined references
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o

Now it displays the name of the function that contains the undefined
reference as well:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

Differential Revision: https://reviews.llvm.org/D127696
2022-06-14 09:41:28 -04:00
Daniel Bertalan 5f627cc225 [lld-macho] Fix symbol name returned from InputSection::getLocation
This commit fixes the issue that getLocation always printed the name of
the first symbol in the section.

For clarity, upper_bound is used instead of a linear search for finding
the closest symbol name. Note that this change does not affect
performance: this function is only called when printing errors and
`symbols` typically contains a single symbol because of
.subsections_via_symbols.

Differential Revision: https://reviews.llvm.org/D127670
2022-06-13 15:49:27 -04:00
Jez Ng 224094eb44 [lld-macho] Require aarch64 for eh-frame.s test
Should fix the test failure introduced by D124561.
2022-06-13 14:05:07 -04:00
Jez Ng b422dac240 [lld-macho][reland] Support EH frames under arm64
This reverts commit 10641a42e2.

Differential Revision: https://reviews.llvm.org/D124561
2022-06-13 07:45:27 -04:00
Jez Ng e183bf8e15 [lld-macho][reland] Initial support for EH Frames
This reverts commit 942f4e3a7c.

The additional change required to avoid the assertion errors seen
previously is:

  --- a/lld/MachO/ICF.cpp
  +++ b/lld/MachO/ICF.cpp
  @@ -443,7 +443,9 @@ void macho::foldIdenticalSections() {
                                 /*relocVA=*/0);
           isec->data = copy;
         }
  -    } else {
  +    } else if (!isEhFrameSection(isec)) {
  +      // EH frames are gathered as hashables from unwindEntry above; give a
  +      // unique ID to everything else.
         isec->icfEqClass[0] = ++icfUniqueID;
       }
     }

Differential Revision: https://reviews.llvm.org/D123435
2022-06-13 07:45:16 -04:00
Jez Ng d378268ead [lld-macho] Make `--icf=safe` work with LTO
Just matter of enabling the config option.

(Also changed the platform of the input test file to macOS, since that's
the default that we specify in the `%lld` substitution. The conflict was
causing errors when linking with LTO.)

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D127600
2022-06-12 17:26:08 -04:00
Keith Smiley 7d57c69826
[lld-macho] Add support for -w
This flag suppresses warnings produced by the linker. In ld64 this has
an interesting interaction with -fatal_warnings, it silences the
warnings but the link still fails. Instead of doing that here we still
print the warning and eagerly fail the link in case both are passed,
this seems more reasonable so users can understand why the link fails.

Differential Revision: https://reviews.llvm.org/D127564
2022-06-11 17:38:50 -07:00
Sam Clegg 457f38a7b0 [lld][WebAssembly] Revert moving of data relocations to start function
Back in https://reviews.llvm.org/D117412 we moved the application of
data reloctions to the wasm start function.

However, because the dynamic linker doesn't know the final addresses
at module instantiation time, this proved to be too early and the
relocations could be applied with the wrong values.

Fixes: https://github.com/emscripten-core/emscripten/issues/17150

Differential Revision: https://reviews.llvm.org/D127333
2022-06-09 17:49:35 -07:00
Douglas Yung 942f4e3a7c Revert "[lld-macho] Initial support for EH Frames"
This reverts commit 826be330af.

This was causing a test failure on build bots:
  - https://lab.llvm.org/buildbot/#/builders/36/builds/21770
  - https://lab.llvm.org/buildbot/#/builders/58/builds/23913
2022-06-09 05:25:43 -07:00
Douglas Yung 10641a42e2 Revert "[lld-macho] Support EH frames under arm64"
This reverts commit 977d62c33e.

This change was causing crashes in 2 tests on the buildbots:
  - https://lab.llvm.org/buildbot/#/builders/58/builds/23914
  - https://lab.llvm.org/buildbot/#/builders/36/builds/21771
2022-06-09 05:24:28 -07:00
Jez Ng 977d62c33e [lld-macho] Support EH frames under arm64
For arm64, llvm-mc emits relocations for the target function
address like so:

  ltmp:
    <CIE start>
    ...
    <CIE end>
    ... multiple FDEs ...
    <FDE start>
    <target function address - (ltmp + pcrel offset)>
    ...

If any of the FDEs in `multiple FDEs` get dead-stripped, then `FDE start`
will move to an earlier address, and `ltmp + pcrel offset` will no longer
reflect an accurate pcrel value. To avoid this problem, we "canonicalize"
our relocation by adding an `EH_Frame` symbol at `FDE start`, and updating
the reloc to be `target function address - (EH_Frame + new pcrel offset)`.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D124561
2022-06-08 23:41:29 -04:00
Jez Ng 826be330af [lld-macho] Initial support for EH Frames
== Background ==

`llvm-mc` generates unwind info in both compact unwind and DWARF
formats. LLD already handles the compact unwind format; this diff gets
us close to handling the DWARF format properly.

== Caveats ==

It's not quite done yet, but I figure it's worth getting this reviewed
and landed first as it's shaping up to be a fairly large code change.

**Known limitations of the current code:**

* Only works for x86_64, for which `llvm-mc` emits "abs-ified"
  relocations as described in 618def651b.
  `llvm-mc` emits regular relocations for ARM EH frames, which we do not
  yet handle correctly.

Since the feature is not ready for real use yet, I've gated it behind a
flag that only gets toggled on during test suite runs. With most of the
new code disabled, we see just a hint of perf regression, so I don't
think it'd be remiss to land this as-is:

             base           diff           difference (95% CI)
  sys_time   1.926 ± 0.168  1.979 ± 0.117  [  -1.2% ..   +6.6%]
  user_time  3.590 ± 0.033  3.606 ± 0.028  [  +0.0% ..   +0.9%]
  wall_time  7.104 ± 0.184  7.179 ± 0.151  [  -0.2% ..   +2.3%]
  samples    30             31

== Design ==

Like compact unwind entries, EH frames are also represented as regular
ConcatInputSections that get pointed to via `Defined::unwindEntry`. This
allows them to be handled generically by e.g. the MarkLive and ICF
code. (But note that unlike compact unwind subsections, EH frame
subsections do end up in the final binary.)

In order to make EH frames "look like" a regular ConcatInputSection,
some processing is required. First, we need to split the `__eh_frame`
section along EH frame boundaries rather than along symbol boundaries.
We do this by decoding the length field of each EH frame. Second, the
abs-ified relocations need to be turned into regular Relocs.

== Next Steps ==

In order to support EH frames on ARM targets, we will either have to
teach LLD how to handle EH frames with explicit relocs, or we can try to
make `llvm-mc` emit abs-ified relocs for ARM as well. I'm hoping to do
the latter as I think it will make the LLD implementation both simpler
and faster to execute.

== Misc ==

The `obj-file-with-stabs.s` test had to be updated as the previous
version would trip assertion errors in the code. It appears that in our
attempt to produce a minimal YAML test input, we created a file with
invalid EH frame data. I've fixed this by re-generating the YAML and not
doing any hand-pruning of it.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D123435
2022-06-08 23:40:52 -04:00
Michael Eisel 44978a234b [lld/mac] Write output sections in parallel
This reduces linking time by ~8% for my project (1.19s -> 0.53s for
writeSections()). writeTo is const, which bodes well for it being
parallelizable, and I've looked through the different overridden versions and
can't see any race conditions. It produces the same byte-for-byte output for my
project.

Differential Revision: https://reviews.llvm.org/D126800
2022-06-08 20:11:50 -04:00
Florian Mayer f6b1bfb7d5 [ELF] Support 'G' in .eh_frame
Reviewed By: MaskRay, eugenis

Differential Revision: https://reviews.llvm.org/D127148
2022-06-08 14:28:58 -07:00
Florian Mayer 6fb4fe7285 Revert "[ELF] Support 'G' in .eh_frame"
This reverts commit 40f34fe4a8.
2022-06-08 13:52:38 -07:00
Florian Mayer 40f34fe4a8 [ELF] Support 'G' in .eh_frame
Reviewed By: MaskRay, eugenis

Differential Revision: https://reviews.llvm.org/D127148
2022-06-08 13:40:20 -07:00
Vy Nguyen 66bd14697b [lld-macho] Demangle symbol names in duplicate-symbol error when -demangle is specified
Differential Revision: https://reviews.llvm.org/D127110
2022-06-06 15:12:26 -04:00
Fangrui Song 025b309631 Revert D126950 "[lld][WebAssembly] Retain data segments referenced via __start/__stop"
This reverts commit dcf3368e33.

It breaks -DLLVM_ENABLE_ASSERTIONS=on builds. In addition, the description is
incorrect about ld.lld behavior. For wasm, there should be justification to add
the new mode.
2022-06-03 22:18:06 -07:00