7499 Commits

Author SHA1 Message Date
Fangrui Song 75e551e5d8 [ELF] Relax R_RISCV_CALL and R_RISCV_CALL_PLT
A pair of auipc+jalr relocated by R_RISCV_CALL or R_RISCV_CALL_PLT can be
converted to c.j, c.jal, or jal.

* c.j: RVC and displacement is representable as an int12
* c.jal: RV32C and displacement is representable as an int12
* jal: displacement is representable as an int21

Use the D127581 relaxation framework to implement the relaxation. If the
condition for a shorter sequence is satisfied, we record the new relocation type
in `relocTypes` and save the new instruction into `writes`. Finally, let
`riscvFinalizeRelax` rewrite the instruction by setting `skip`.
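
The selection rule listed above can be sketched as follows. This is a minimal illustration of the range checks only, not lld's code: `fits_signed`, `pick_call_encoding` and the `links_ra` flag are hypothetical names. The assumption that c.j is used when the return address is discarded and c.jal when it is written to ra comes from the RISC-V ISA, not from this commit message.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value is representable as a signed integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def pick_call_encoding(displacement: int, rvc: bool, rv32: bool, links_ra: bool) -> str:
    """Pick the shortest sequence for an auipc+jalr pair (hypothetical helper).

    displacement: target address minus the address of the auipc
    rvc:          the compressed extension is available
    rv32:         the target is RV32 (c.jal only exists in RV32C)
    links_ra:     the call writes its return address to ra (assumption, see lead-in)
    """
    if rvc and fits_signed(displacement, 12):
        if not links_ra:
            return "c.j"
        if rv32:
            return "c.jal"
    if fits_signed(displacement, 21):
        return "jal"
    # No shorter sequence applies; keep the original pair.
    return "auipc+jalr"
```

For example, a nearby call that links through ra on RV64 cannot use c.jal and falls back to jal, while a displacement of 1 MiB or more stays as auipc+jalr.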

Differential Revision: https://reviews.llvm.org/D127611
2022-07-07 10:18:45 -07:00
Fangrui Song 6611d58f5b [ELF] Relax R_RISCV_ALIGN
Alternative to D125036. Implement R_RISCV_ALIGN relaxation so that we can handle
-mrelax object files (i.e. -mno-relax is no longer needed) and create a
framework for future relaxation.

`relaxAux` is placed in a union with InputSectionBase::jumpInstrMod, storing
auxiliary information for relaxation. In the first pass, `relaxAux` is allocated.
The main data structure is `relocDeltas`: when referencing `relocations[i]`, the
actual offset is `r_offset - (i ? relocDeltas[i-1] : 0)`.
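
The offset formula can be illustrated with a short sketch. It assumes that `relocDeltas[i]` holds the cumulative number of bytes removed up to and including relocation `i`, which is one reading of the formula above; `adjusted_offsets` is a hypothetical name.

```python
def adjusted_offsets(r_offsets, reloc_deltas):
    """Apply r_offset - (i ? relocDeltas[i-1] : 0) to every relocation.

    r_offsets:    original r_offset of each relocation, in section order
    reloc_deltas: cumulative bytes removed after processing each relocation
                  (assumed meaning, see the lead-in)
    """
    return [
        offset - (reloc_deltas[i - 1] if i else 0)
        for i, offset in enumerate(r_offsets)
    ]
```

With original offsets 0, 8, 16 and deltas 2, 2, 6, the first relocation stays at 0 and the following two move to 6 and 14, because 2 bytes were removed before each of them.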

`relaxOnce` performs one relaxation pass. It computes `relocDeltas` for all text
sections. Then it adjusts st_value/st_size for symbols relative to each section
based on `SymbolAnchor`. `bytesDropped` is set so that `assignAddresses` knows
that the size has changed.

Run `relaxOnce` in the `finalizeAddressDependentContent` loop to wait for
convergence of text sections and other address dependent sections (e.g.
SHT_RELR). Note: extracting `relaxOnce` into a separate loop works for many cases
but has issues in some linker script edge cases.

After convergence, compute section contents: shrink the NOP sequence of each
R_RISCV_ALIGN as appropriate. Instead of deleting bytes, we run a sequence of
memcpy on the content delimited by relocation locations. For R_RISCV_ALIGN let
the next memcpy skip the desired number of bytes. Section content computation is
parallelizable, but let's ensure the implementation is mature before
optimizations. Technically we can save a copy if we interleave some code with
`OutputSection::writeTo`, but let's not pollute the generic code (we don't have
templated relocation resolving, so using conditions can impose overhead to
non-RISCV.)
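
The copy-and-skip idea can be sketched as below. This is an illustration under simplifying assumptions, not the lld implementation: `compact` is a hypothetical name, and each entry in `skips` stands for a relocation location together with the number of bytes to drop there.

```python
def compact(content: bytes, skips) -> bytes:
    """Build the shrunk section by copying the pieces between relocations.

    skips: list of (offset, count) pairs sorted by offset; `count` bytes
           starting at `offset` are skipped instead of being deleted in place.
    """
    out = bytearray()
    pos = 0
    for offset, count in skips:
        out += content[pos:offset]  # one "memcpy" per delimited piece
        pos = offset + count        # the next copy starts after the skipped bytes
    out += content[pos:]            # tail after the last relocation
    return bytes(out)
```

For instance, `compact(b"AAAAnnnnBBBB", [(4, 2)])` keeps the four leading bytes, drops two of the four padding bytes, and copies the rest unchanged.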

Tested:
The Linux kernel built with `make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- LLVM=1 defconfig all` using -mrelax is bootable.
A FreeBSD RISCV64 system built using -mrelax is bootable.
bash/curl/firefox/libevent/vim/tmux built using -mrelax work.

Differential Revision: https://reviews.llvm.org/D127581
2022-07-07 10:16:09 -07:00
Tim Northover 0f4339a835 lld test fix: don't check the precise hex emitted as a comment.
It can vary depending on the platform, so as with the NO-FMA test just check
for "0x".
2022-07-07 13:25:24 +01:00
Tim Northover fe62019387 lld: fix test after x86 instruction comments now end in newline 2022-07-07 13:01:32 +01:00
Jin Xin Ng 65001f5777 [LTO][ELF] Add selective --save-temps= option
Allows specific “temps” to be saved, instead of the current all-or-nothing nature of --save-temps. Multiple of these “temps” can be saved by specifying the argument multiple times.

Differential Revision: https://reviews.llvm.org/D127778
2022-07-06 10:06:18 -07:00
Ben Dunbobbin c35a6454b1 [BUILD] Add missed CMakeLists.txt change from dfb77f2
See: https://reviews.llvm.org/D128195
2022-07-05 16:04:58 +01:00
Ben Dunbobbin dfb77f2e99 [LLD][ELF] Add FORCE_LLD_DIAGNOSTICS_CRASH to force LLD to crash
Add FORCE_LLD_DIAGNOSTICS_CRASH inspired by the existing
FORCE_CLANG_DIAGNOSTICS_CRASH.

This is particularly useful for people customizing LLD as they may
want to modify the crash reporting behavior.

Differential Revision: https://reviews.llvm.org/D128195
2022-07-05 09:43:09 +01:00
Daniel Bertalan 2028fe6fbc [lld-macho] Handle LOH_ARM64_ADRP_LDR_GOT optimization hints
This hint instructs the linker to perform the AdrpLdr or AdrpAdd
transformation depending on whether the GOT load has been relaxed to
load a local symbol's address.

Differential Revision: https://reviews.llvm.org/D129059
2022-07-05 07:33:13 +02:00
Daniel Bertalan 573c7e6b3c [lld-macho] Handle LOH_ARM64_ADRP_LDR linker optimization hints
This linker optimization hint transforms a pair of adrp+ldr (immediate)
instructions into an ldr (literal) load from a PC-relative address if
it is 4-byte aligned and within +/- 1 MiB, as ldr can encode a signed
19-bit offset that gets multiplied by 4.
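
The applicability check described above can be sketched like this. It is a minimal illustration, assuming the instruction address itself is 4-byte aligned so that checking the displacement is equivalent to checking the target; `ldr_literal_imm19` is a hypothetical name.

```python
def ldr_literal_imm19(pc: int, target: int):
    """Return the signed 19-bit immediate for ldr (literal), or None.

    The immediate is multiplied by 4 by the instruction, so the byte
    displacement must be a multiple of 4 and lie within +/- 1 MiB.
    """
    delta = target - pc
    if delta % 4 != 0:
        return None
    if not -(1 << 20) <= delta < (1 << 20):
        return None
    return delta >> 2
```

A displacement of exactly 1 MiB is rejected, while -1 MiB is accepted, which mirrors the asymmetric range of a signed 19-bit field.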

In the wild, only a small number of these hints are applicable because
not many loads end up close enough to the data segment. However, the
added helper functions will be useful in implementing the rest of the
LOH types.

Differential Revision: https://reviews.llvm.org/D128942
2022-07-01 09:44:24 +02:00
Daniel Bertalan a3f67f0920 [lld-macho] Initial support for Linker Optimization Hints
Linker optimization hints mark a sequence of instructions used for
synthesizing an address, like ADRP+ADD. If the referenced symbol ends up
close enough, it can be replaced by a faster sequence of instructions
like ADR+NOP.

This commit adds support for 2 of the 7 defined ARM64 optimization
hints:
- LOH_ARM64_ADRP_ADD, which transforms a pair of ADRP+ADD into ADR+NOP
  if the referenced address is within +/- 1 MiB
- LOH_ARM64_ADRP_ADRP, which transforms two ADRP instructions into
  ADR+NOP if they reference the same page
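
The two conditions in this list can be written as small predicates. This is a sketch with hypothetical names (`adr_reachable`, `same_page`); it relies on ADR taking a signed 21-bit byte offset and ADRP addressing 4 KiB pages.

```python
PAGE_SHIFT = 12  # ADRP works on 4 KiB pages


def adr_reachable(pc: int, target: int) -> bool:
    """True if `target` is within +/- 1 MiB of `pc` (ADR's signed 21-bit range)."""
    return -(1 << 20) <= target - pc < (1 << 20)


def same_page(addr_a: int, addr_b: int) -> bool:
    """True if both addresses fall into the same 4 KiB page."""
    return (addr_a >> PAGE_SHIFT) == (addr_b >> PAGE_SHIFT)
```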

These two kinds already cover more than 50% of all LOHs in
chromium_framework.

Differential Revision: https://reviews.llvm.org/D128093
2022-06-30 06:28:42 +02:00
Daniel Bertalan 8d29f0fdb9 [lld-macho] Emit REBASE_OPCODE_ADD_ADDR_IMM_SCALED if possible
An ADD_ADDR rebase opcode's argument can be encoded as an immediate if
the offset is less than 15 * word size. This change reduces the size of
chromium_framework by 100+ KiB.
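
The condition can be sketched as follows. This only restates the bound given in the message; `scaled_immediate` is a hypothetical name, and the requirement that the offset be a multiple of the word size is an assumption that follows from the immediate being scaled.

```python
def scaled_immediate(offset: int, word_size: int = 8):
    """Return the immediate for an ADD_ADDR_IMM_SCALED opcode, or None.

    The opcode advances the address by immediate * word_size, so only
    positive multiples of word_size below 15 * word_size qualify here.
    """
    if offset <= 0 or offset % word_size != 0:
        return None
    if offset >= 15 * word_size:
        return None
    return offset // word_size
```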

Differential Revision: https://reviews.llvm.org/D128798
2022-06-29 22:28:39 +02:00
Sam Clegg 53217ecb88 [lld][WebAssembly] Don't apply data relocations at static constructor time
Instead, export `__wasm_apply_data_relocs` and `__wasm_call_ctors`
separately.

This is required since user code in a shared library (such as static
constructors) should not be run until relocations have been applied to
all loaded libraries.
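
The ordering requirement can be illustrated with a small loader sketch. The `Module` class and `load_libraries` are hypothetical Python stand-ins for a dynamic loader calling the two exports; they are not part of lld or emscripten.

```python
class Module:
    """Stand-in for a loaded shared library with the two separate exports."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def apply_data_relocs(self):  # models __wasm_apply_data_relocs
        self.log.append((self.name, "relocs"))

    def call_ctors(self):  # models __wasm_call_ctors
        self.log.append((self.name, "ctors"))


def load_libraries(modules):
    """Apply relocations to every library before running any user code."""
    for module in modules:
        module.apply_data_relocs()
    for module in modules:
        module.call_ctors()
```

Keeping the two phases in separate loops is what the separate exports make possible: no static constructor runs while another library still has unrelocated data.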

See: https://github.com/emscripten-core/emscripten/issues/17295

Differential Revision: https://reviews.llvm.org/D128515
2022-06-27 15:50:02 -07:00
Fangrui Song 0688b00fc3 [ELF] Remove deprecated -dc
-dc is deprecated in release/14.x. Remove it for 15.0.
The only usage I know was FreeBSD crunchgen, which was removed by https://reviews.freebsd.org/D34215

glibc just dropped -Wl,-d today. Keep -d for now.
2022-06-26 17:26:44 -07:00
Fangrui Song b95cca03cd [ELF] Improve compound assignment tests
Also use strchr instead of is_contained.
2022-06-25 22:30:52 -07:00
Fangrui Song 0a0effdd5b [ELF] Support -= *= /= <<= >>= &= |= in symbol assignments 2022-06-25 22:22:59 -07:00
Fangrui Song 77295c5486 [ELF] Allow ? without adjacent space
GNU ld allows 1 ? 2?3:4 : 5?6 :7
2022-06-25 21:16:59 -07:00
Fangrui Song e3f3d2abf0 [ELF][test] Improve expression test 2022-06-25 21:11:32 -07:00
Fangrui Song 21bf6bb3d3 [ELF] Fix assertion failure when PROVIDE/HIDDEN/PROVIDE_HIDDEN does not have = 2022-06-25 20:26:47 -07:00
Fangrui Song fe0de25b21 [ELF] Allow an expression to follow = in a symbol assignment
GNU ld doesn't require whitespace before =. Match it.
2022-06-25 20:25:34 -07:00
Fangrui Song b0d6dd3905 [ELF] Fix precedence of ? when there are 2 or more operators on the left hand side
For 1 != 1 <= 1 ? 1 : 2, the current code incorrectly considers that ?
has a higher precedence than != (minPrec).

Also, add a test for right associativity.
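
The precedence rules touched by these parser fixes can be checked with a toy evaluator. This is a minimal precedence-climbing sketch, not lld's ScriptParser: it supports only integers, parentheses, a few binary operators and `?:`, with == and != binding less tightly than < > <= >= (as in C) and `?:` being right associative. All names are hypothetical.

```python
import re

# Binary operators only; a larger number binds tighter (C ordering).
PRECEDENCE = {"*": 5, "+": 4, "-": 4,
              "<": 3, ">": 3, "<=": 3, ">=": 3,
              "==": 2, "!=": 2}
APPLY = {
    "*": lambda a, b: a * b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "<": lambda a, b: int(a < b),
    ">": lambda a, b: int(a > b),
    "<=": lambda a, b: int(a <= b),
    ">=": lambda a, b: int(a >= b),
    "==": lambda a, b: int(a == b),
    "!=": lambda a, b: int(a != b),
}
TOKEN = re.compile(r"\s*(\d+|<=|>=|==|!=|[-+*<>?:()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        if pos >= len(tokens) or (expected is not None and tokens[pos] != expected):
            raise ValueError(f"expected {expected or 'a token'}")
        pos += 1
        return tokens[pos - 1]

    def primary():
        token = take()
        if token == "(":
            value = ternary()
            take(")")
            return value
        return int(token)

    def binary(min_prec):
        lhs = primary()
        # Consume operators that bind at least as tightly as min_prec.
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = take()
            rhs = binary(PRECEDENCE[op] + 1)  # left associative
            lhs = APPLY[op](lhs, rhs)
        return lhs

    def ternary():
        # '?' binds less tightly than every binary operator.
        cond = binary(1)
        if peek() != "?":
            return cond
        take("?")
        if_true = ternary()
        take(":")
        if_false = ternary()  # recursing here makes ?: right associative
        return if_true if cond else if_false

    result = ternary()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return result
```

With this ordering, `1 != 1 <= 1 ? 1 : 2` groups as `(1 != (1 <= 1)) ? 1 : 2` and yields 2, and the tokenizer accepts `?` without adjacent spaces.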
2022-06-25 13:48:52 -07:00
Fangrui Song d479b2e4db [ELF] Fix precedence of == and != in expressions
In GNU ld, the == and != operators have lower precedence than < > <= >=.
This behavior matches C.
2022-06-25 13:47:32 -07:00
Fangrui Song 4cb05dc3cb [ELF] Support quoted name in the TARGET command 2022-06-25 12:31:20 -07:00
Fangrui Song 363b29567e [ELF] Support quoted symbol in the ENTRY command
This matches GNU ld and matches other places we unquote the symbol name.

Fixes #56208
2022-06-25 12:19:45 -07:00
Fangrui Song c5578fca16 [ELF][test] Improve linkerscript/entry.s 2022-06-25 12:14:47 -07:00
Peter Collingbourne b064bc18c3 ELF: Do not relax ADRP/LDR -> ADRP/ADD for absolute symbols in PIC.
GOT references to absolute symbols can't be relaxed to use ADRP/ADD in
position-independent code because these instructions produce a relative
address.

Differential Revision: https://reviews.llvm.org/D128492
2022-06-24 08:47:23 -07:00
Nico Weber a2c1f7c90d [lld, ELF and mac] Add --time-trace=<file>, remove --time-trace-file=<file>
`--time-trace=<file>` has the same behavior as `--time-trace --time-trace-file=<file>`
had previously.

Also, for mac, make --time-trace-granularity *not* imply --time-trace, to match
behavior of the ELF port.

Differential Revision: https://reviews.llvm.org/D128451
2022-06-23 15:46:22 -04:00
Jin Xin Ng 22f1273357 [ThinLTO][ELF] Add --thinlto-emit-index-files option
Allows ThinLTO indices to be written to disk on-the-fly/as-part-of “normal” linker execution. Previously ThinLTO indices could be written via --thinlto-index-only but that would cause the linker to exit early. For MLGO specifically, this enables saving the ThinLTO index files without having to restart the linker to collect data only available at later stages (i.e. output of --save-temps) of the linker's execution.

Note, this option does not currently work with:
- --thinlto-object-suffix-replace, as this is intended to be used to consume minimized IR bitcode files while --thinlto-emit-index-files is intended to be run together with InProcessThinLTO (which cannot parse minimized IR).
- --thinlto-prefix-replace: support is left unimplemented but can be implemented if needed.

Differential Revision: https://reviews.llvm.org/D127777
2022-06-23 12:35:42 -07:00
Daniel Bertalan ed39fd515a [lld-macho] Use source information in duplicate symbol errors
Similarly to how undefined symbol diagnostics were changed in D128184,
we now show where in the source file duplicate symbols are defined:

  ld64.lld: error: duplicate symbol: _foo
  >>> defined in bar.c:42
  >>>            /path/to/bar.o
  >>> defined in baz.c:1
  >>>            /path/to/libbaz.a(baz.o)

For objects that don't contain DWARF data, the format is unchanged.

A slight difference to undefined symbol diagnostics is that we don't
print the name of the symbol on the third line, as it's already
contained on the first line.

Differential Revision: https://reviews.llvm.org/D128425
2022-06-23 11:07:15 -04:00
Fangrui Song 4512dda6af [ELF][test] Clean up thinlto* 2022-06-22 16:19:17 -07:00
Daniel Bertalan 5792797c5b Reland "[lld-macho] Show source information for undefined references"
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

If DWARF line information is available, we now show where in the source
the references are coming from:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42)
  >>>                /path/to/bar.o:(symbol _baz+0x4)

The reland is identical to the first time this landed. The fix was in D128294.
This reverts commit 0cc7ad4175.

Differential Revision: https://reviews.llvm.org/D128184
2022-06-21 18:50:06 -04:00
Daniel Bertalan 77b6efbd82 [ADT] [lld-macho] Check for end iterator deref in filter_iterator_base
If ld64.lld was supplied an object file that had a `__debug_abbrev` or
`__debug_str` section, but didn't have any compile unit DIEs in
`__debug_info`, it would dereference an iterator pointing to the empty
array of DIEs. This underlying issue started causing segmentation faults
when parsing for `__debug_info` was added in D128184. That commit was
reverted, and this one fixes the invalid dereference to allow relanding
it.

This commit adds an assertion to `filter_iterator_base`'s dereference
operators to catch bugs like this one.

Ran check-llvm, check-clang and check-lld.

Differential Revision: https://reviews.llvm.org/D128294
2022-06-21 15:47:45 -04:00
Nico Weber 0cc7ad4175 Revert "[lld-macho] Show source information for undefined references"
This reverts commit cd7624f153.
See https://reviews.llvm.org/D128184#3597534
2022-06-20 19:15:57 -04:00
Daniel Bertalan cd7624f153 [lld-macho] Show source information for undefined references
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

If DWARF line information is available, we now show where in the source
the references are coming from:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42)
  >>>                /path/to/bar.o:(symbol _baz+0x4)

Differential Revision: https://reviews.llvm.org/D128184
2022-06-20 18:49:42 -04:00
Jez Ng 8eeede973c [lld-macho][nfc] Tests for -force_load + regular archive load combinations
I realized we'd forgotten to cover this case (though our existing
behavior is indeed correct / matches ld64's).

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D128025
2022-06-16 23:50:07 -04:00
Daniel Bertalan 0eec7e2a89 Reland "[lld-macho] Group undefined symbol diagnostics by symbol".
This reverts commit 36e7c9a450.

This relands d61341768c with the fix described in
https://reviews.llvm.org/D127753#3587390
2022-06-15 19:22:39 -04:00
Stella Stamenova 36e7c9a450 Revert "[lld-macho] Group undefined symbol diagnostics by symbol"
This reverts commit d61341768c.

This change broke multiple lld tests, including some sanitizer builds: https://lab.llvm.org/buildbot/#/builders/5/builds/24787/steps/19/logs/stdio
2022-06-15 15:42:26 -07:00
Keith Smiley 272bf0fc41 [lld-macho] Add support for exporting no symbols
As an optimization for ld64, it can sometimes be useful to not export any
symbols from top-level binaries that don't need any exports. To do this,
you can pass `-exported_symbols_list /dev/null` or, new with Xcode 14
(ld64 816), the `-no_exported_symbols` flag for the same behavior.
This change reproduces that behavior; previously, an empty exported symbols
list file would have been ignored.

Differential Revision: https://reviews.llvm.org/D127562
2022-06-15 15:07:27 -07:00
Pengxuan Zheng 9db61c3fe1 [LLD][COFF] Convert file name to lowercase when inserting it into visitedLibs
There seems to be a bug in `LinkerDriver::findFile`: the file name is not converted
to lowercase when being inserted into `visitedLibs`. This is the only exception
in the file; all other places convert file names to lowercase when
inserting them into `visitedLibs` (or `visitedFiles`).
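
Why the missing conversion matters can be shown with a small sketch. It models `visitedLibs` as a plain set and uses a hypothetical helper `first_visit`; it is not the COFF driver's code.

```python
def first_visit(filename: str, visited: set) -> bool:
    """Record a file in the visited set; return False if it was seen before.

    The key is always lowercased, so names that differ only in case
    are treated as the same library.
    """
    key = filename.lower()
    if key in visited:
        return False
    visited.add(key)
    return True
```

If one insertion path skipped the `lower()` call, `Kernel32.LIB` and `kernel32.lib` would be stored under different keys and the second lookup would not detect the duplicate.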

Reviewed By: thieta, hans

Differential Revision: https://reviews.llvm.org/D127709
2022-06-15 09:39:35 -07:00
Martin Storsjö aefa11166f [LLD] [MinGW] Implement --disable-reloc-section, mapped to /fixed
Since binutils 2.36, GNU ld defaults to emitting base relocations,
and that version added the new option --disable-reloc-section to
disable it.

Differential Revision: https://reviews.llvm.org/D127478
2022-06-15 16:51:20 +03:00
Daniel Bertalan d61341768c [lld-macho] Group undefined symbol diagnostics by symbol
ld64.lld used to print the "undefined symbol" line for each reference to
an undefined symbol:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x0)

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _quux+0x1)

Now they are deduplicated:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x0)
  >>> referenced by /path/to/bar.o:(symbol _quux+0x1)

As with the other lld ports, only the first 3 references are printed.
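
The grouping can be sketched as follows. This is an illustration only: `group_undefined` is a hypothetical name, the message strings are copied from the examples above, and anything lld prints after the third reference is deliberately left out.

```python
def group_undefined(references, max_refs=3):
    """Group (symbol, location) pairs by symbol, keeping first-seen order.

    Returns the diagnostic lines, with at most `max_refs` references
    listed per symbol.
    """
    grouped = {}
    for symbol, location in references:
        grouped.setdefault(symbol, []).append(location)

    lines = []
    for symbol, locations in grouped.items():
        lines.append(f"ld64.lld: error: undefined symbol: {symbol}")
        for location in locations[:max_refs]:
            lines.append(f">>> referenced by {location}")
    return lines
```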

Differential Revision: https://reviews.llvm.org/D127753
2022-06-14 16:38:11 -04:00
Daniel Bertalan f2e92cf60e [lld-macho] Print the name of functions containing undefined references
The error used to look like this:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o

Now it displays the name of the function that contains the undefined
reference as well:

  ld64.lld: error: undefined symbol: _foo
  >>> referenced by /path/to/bar.o:(symbol _baz+0x4)

Differential Revision: https://reviews.llvm.org/D127696
2022-06-14 09:41:28 -04:00
Daniel Bertalan 5f627cc225 [lld-macho] Fix symbol name returned from InputSection::getLocation
This commit fixes the issue that getLocation always printed the name of
the first symbol in the section.

For clarity, upper_bound is used instead of a linear search for finding
the closest symbol name. Note that this change does not affect
performance: this function is only called when printing errors and
`symbols` typically contains a single symbol because of
.subsections_via_symbols.
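
The lookup can be sketched with Python's `bisect_right`, which plays the role of `upper_bound` here. The data layout (a list of `(offset, name)` pairs sorted by offset) and the name `closest_symbol` are assumptions for illustration.

```python
import bisect


def closest_symbol(symbols, offset):
    """Find the last symbol that starts at or before `offset`.

    symbols: list of (start_offset, name) pairs sorted by start_offset.
    Returns (name, distance_from_symbol_start), or None if no symbol
    precedes the offset.
    """
    starts = [start for start, _ in symbols]
    # Index of the first symbol that starts strictly after `offset`.
    index = bisect.bisect_right(starts, offset)
    if index == 0:
        return None
    start, name = symbols[index - 1]
    return name, offset - start
```

Stepping back one entry from the upper bound gives the symbol that contains the offset, instead of always reporting the first symbol in the section.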

Differential Revision: https://reviews.llvm.org/D127670
2022-06-13 15:49:27 -04:00
Jez Ng 224094eb44 [lld-macho] Require aarch64 for eh-frame.s test
Should fix the test failure introduced by D124561.
2022-06-13 14:05:07 -04:00
Jez Ng b422dac240 [lld-macho][reland] Support EH frames under arm64
This reverts commit 10641a42e2.

Differential Revision: https://reviews.llvm.org/D124561
2022-06-13 07:45:27 -04:00
Jez Ng e183bf8e15 [lld-macho][reland] Initial support for EH Frames
This reverts commit 942f4e3a7c.

The additional change required to avoid the assertion errors seen
previously is:

  --- a/lld/MachO/ICF.cpp
  +++ b/lld/MachO/ICF.cpp
  @@ -443,7 +443,9 @@ void macho::foldIdenticalSections() {
                                 /*relocVA=*/0);
           isec->data = copy;
         }
  -    } else {
  +    } else if (!isEhFrameSection(isec)) {
  +      // EH frames are gathered as hashables from unwindEntry above; give a
  +      // unique ID to everything else.
         isec->icfEqClass[0] = ++icfUniqueID;
       }
     }

Differential Revision: https://reviews.llvm.org/D123435
2022-06-13 07:45:16 -04:00
Jez Ng d378268ead [lld-macho] Make `--icf=safe` work with LTO
Just a matter of enabling the config option.

(Also changed the platform of the input test file to macOS, since that's
the default that we specify in the `%lld` substitution. The conflict was
causing errors when linking with LTO.)

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D127600
2022-06-12 17:26:08 -04:00
Keith Smiley 7d57c69826 [lld-macho] Add support for -w
This flag suppresses warnings produced by the linker. In ld64 this has
an interesting interaction with -fatal_warnings: it silences the
warnings but the link still fails. Instead of doing that here, we still
print the warning and eagerly fail the link when both are passed, which
seems more reasonable because users can see why the link fails.

Differential Revision: https://reviews.llvm.org/D127564
2022-06-11 17:38:50 -07:00
Sam Clegg 457f38a7b0 [lld][WebAssembly] Revert moving of data relocations to start function
Back in https://reviews.llvm.org/D117412 we moved the application of
data relocations to the wasm start function.

However, because the dynamic linker doesn't know the final addresses
at module instantiation time, this proved to be too early and the
relocations could be applied with the wrong values.

Fixes: https://github.com/emscripten-core/emscripten/issues/17150

Differential Revision: https://reviews.llvm.org/D127333
2022-06-09 17:49:35 -07:00
Douglas Yung 942f4e3a7c Revert "[lld-macho] Initial support for EH Frames"
This reverts commit 826be330af.

This was causing a test failure on build bots:
  - https://lab.llvm.org/buildbot/#/builders/36/builds/21770
  - https://lab.llvm.org/buildbot/#/builders/58/builds/23913
2022-06-09 05:25:43 -07:00
Douglas Yung 10641a42e2 Revert "[lld-macho] Support EH frames under arm64"
This reverts commit 977d62c33e.

This change was causing crashes in 2 tests on the buildbots:
  - https://lab.llvm.org/buildbot/#/builders/58/builds/23914
  - https://lab.llvm.org/buildbot/#/builders/36/builds/21771
2022-06-09 05:24:28 -07:00