Commit Graph

6403 Commits

Author SHA1 Message Date
Fangrui Song 74bece8dde [WPD][ELF] Allow whole program devirtualization for version script localized symbols
A `local:` version node in a version script can change the effective symbol binding
to STB_LOCAL. The linker needs to communicate the fact to enable WPD
(otherwise LTO does not know that the `!vcall_visibility` metadata has
effectively changed from VCallVisibilityPublic to VCallVisibilityLinkageUnit).

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D98220
2021-03-09 22:33:47 -08:00
Albion Fung 36192790d8 [PowerPC][PC Rel] Implement option to omit Power10 instructions from stubs
Implemented the option to omit Power10 instructions from save stubs via the
option --no-power10-stubs or --power10-stubs=no on lld. --power10-stubs= will
override the other option. --power10-stubs=auto also exists to use the default
behaviour (ie allow Power10 instructions in stubs).

Differential Revision: https://reviews.llvm.org/D94627
2021-03-04 13:27:46 -05:00
Peter Smith e35929e026 [LLD][ELF][ARM] Refactor inBranchRange to use addend for PC Bias
In AArch32 ARM, the PC reads two instructions ahead of the currently
executiing instruction. This evaluates to 8 in ARM state and 4 in
Thumb state. Branch instructions on AArch32 compensate for this by
subtracting the PC bias from the addend. For a branch to symbol this
will result in an addend of -8 in ARM state and -4 in Thumb state.

The existing ARM Target::inBranchRange function accounted for this
implict addend within the function meaning that if the addend were
to be taken into account by the caller then it would be double
counted. This complicates the interface for all Targets as callers
wanting to account for addends had to account for the ARM PC-bias.

In certain situations such as:
https://github.com/ClangBuiltLinux/linux/issues/1305
the PC-bias compensation code didn't match up. In particular
normalizeExistingThunk() didn't put the PC-bias back in as Arm
thunks did not store the addend.

The simplest fix for the problem is to add the PC bias in
normalizeExistingThunk when restoring the addend. However I think
it is worth refactoring the Arm inBranchRange implementation so
that fewer calls to getPCBias are needed for other Targets. I
wasn't able to remove getPCBias completely but hopefully the
Relocations.cpp code is simpler now.

In principle a test could be written to replicate the linux kernel
build failure but I wasn't able to reproduce with a small example
that I could build up from scratch.

Fixes https://github.com/ClangBuiltLinux/linux/issues/1305

Differential Revision: https://reviews.llvm.org/D97550
2021-03-02 11:02:33 +00:00
Sam Clegg d49270b087 [lld][ELF] Removing redundant cast. NFC.
Also a couple of minor cleanups in merge-string.s:
- fix inconsistent use of tabs
- use `.p2align` rather than `.align` since `.p2align` works the
  same on all platforms (the meaning of align seems to differ
  between platforms according to `AlignmentIsInBytes`.

I noticed these potential cleanups while porting SHF_STRINGS support to
wasm-ld.

Differential Revision: https://reviews.llvm.org/D97647
2021-02-28 16:53:41 -08:00
Fangrui Song 4bbcd63eea [ELF] Add -z start-stop-gc to let __start_/__stop_ not retain C identifier name sections
For one metadata section usage, each text section references a metadata section.
The metadata sections have a C identifier name to allow the runtime to collect them via `__start_/__stop_` symbols.

Since `__start_`/`__stop_` references are always present from live sections, the
C identifier name sections appear like GC roots, which means they cannot be
discarded by `ld --gc-sections`.

To make such sections GCable, either SHF_LINK_ORDER or a section group is needed.

SHF_LINK_ORDER is not suitable for the references can be inlined into other functions
(See D97430:
Function A (in the section .text.A) references its `__sancov_guard` section.
Function B inlines A (so now .text.B references `__sancov_guard` - this is invalid with the semantics of SHF_LINK_ORDER).

In the linking stage,
if `.text.A` gets discarded, and `__sancov_guard` is retained via the reference from `.text.B`,
the output will be invalid because `__sancov_guard` references the discarded `.text.A`.
LLD errors "sh_link points to discarded section".
)

A section group have size overhead, and is cumbersome when there is just one metadata section.

Add `-z start-stop-gc` to drop the "__start_/__stop_ references retain
non-SHF_LINK_ORDER non-SHF_GROUP C identifier name sections" rule.
We reserve the rights to switch the default in the future.

Reviewed By: phosek, jrtc27

Differential Revision: https://reviews.llvm.org/D96914
2021-02-25 15:46:37 -08:00
Petr Hosek 1a3f3a3fa1 [lld][ELF] __start_/__stop_ refs don't retain C-ident named group sections
The special root semantics for identifier-named sections is meant
specifically for the metadata sections. In the context of group
semantics, where group members are always retained or discarded as a
unit, it's natural not to have this semantics apply to a section in a
group, otherwise we would never discard the group defeating the purpose
of using the group in the first place.

This change modifies the GC behavior so that __start_/__stop_ references
don't retain C identifier named sections in section groups which allows
for these groups to be collected. This matches the behavior of BFD ld.

The only kind of existing case that might break is interdependent
metadata sections that are all in a group together, but that group
doesn't contain any other sections referenced by anything except
implicit inclusion in a `__start_` and/or `__stop_`-referenced
identifier-named section, but such cases should be unlikely.

Differential Revision: https://reviews.llvm.org/D96753
2021-02-20 22:22:05 -08:00
Nico Weber cb4df6eb8d fix comment typos to cycle bots 2021-02-18 14:25:21 -05:00
Nico Weber 279c5dc2f3 fix comment typo to cycle bots 2021-02-17 15:29:39 -05:00
Nico Weber 872efb0b31 fix comment typo to cycle bots 2021-02-17 11:53:42 -05:00
Petr Hosek bfa4235e6e [lld][ELF] Support for zero flag section groups
This change introduces support for zero flag ELF section groups to lld.
lld already supports COMDAT sections, which in ELF are a special type of
ELF section groups. These are generally useful to enable linker GC where
you want a group of sections to always travel together, that is to be
either retained or discarded as a whole, but without the COMDAT
semantics. Other ELF linkers already support zero flag ELF section
groups and this change helps us reach feature parity.

Differential Revision: https://reviews.llvm.org/D96636
2021-02-16 14:33:09 -08:00
Fangrui Song 0557b1bdec [ELF] Resolve defined symbols before undefined symbols
When parsing an object file, LLD interleaves undefined symbol resolution (which
may recursively fetch other lazy objects) with defined symbol resolution.

This may lead to surprising results, e.g. if an object file defines currently
undefined symbols and references another lazy symbol, we may interleave defined
symbols with the lazy fetch, potentially leading to the defined symbols
resolving to different files.

As an example, if both `a.a(a.o)` and `a.a(b.o)` define `foo` (not in COMDAT
group, or in different COMDAT groups) and `__profd_foo` (in COMDAT group
`__profd_foo`).  LLD may resolve `foo` to `a.a(a.o)` and `__profd_foo` to
`b.a(b.o)`, i.e. different files.

```
parse ArchiveFile a.a
  entry fetches a.a(a.o)
  parse ObjectFile a.o
    define entry
    define foo
    reference b
    b fetches a.a(b.o)
    parse ObjectFile b.o
      define prevailing __profd_foo
    define (ignored) non-prevailing __profd_foo
```

Assuming a set of interconnected symbols are defined all or none in several lazy
objects. Arguably making them resolve to the same file is preferable than making
them resolve to different files (some are lazy objects).

The main argument favoring the new behavior is the stability. The relative order
between a defined symbol and an undefined symbol does not change the symbol
resolution behavior.  Only the relative order between two undefined symbols can
affect fetching behaviors.

---

The real world case is reduced from a Fuchsia PGO usage: `a.a(a.o)` has a
constructor within COMDAT group C5 while `a.a(b.o)` has a constructor within
COMDAT group C2. Because they use different group signatures, they are not
de-duplicated. It is not entirely whether Clang behavior is entirely conforming.

LLD selects the PGO counter section (`__profd_*`) from `a.a(b.o)` and the
constructor section from `a.a(a.o)`. The `__profd_*` is a SHF_LINK_ORDER section
linking to its own non-prevailing constructor section, so LLD errors
`sh_link points to discarded section`. This patch fixes the error.

Differential Revision: https://reviews.llvm.org/D95985
2021-02-11 09:41:46 -08:00
Fangrui Song d82679d805 [ELF] Drop Android specific workaround -m aarch64_elf64_le_vec
`extern const bfd_target aarch64_elf64_le_vec;` is a variable in BFD.
It was somehow misused as an emulation by Android.

```
% aarch64-linux-gnu-ld -m aarch64_elf64_le_vec a.o
aarch64-linux-gnu-ld: unrecognised emulation mode: aarch64_elf64_le_vec
Supported emulations: aarch64linux aarch64elf aarch64elf32 aarch64elf32b aarch64elfb armelf armelfb aarch64linuxb aarch64linux32 aarch64linux32b armelfb_linux_eabi armelf_linux_eabi
```

Acked by Stephen Hines, who removed the flag from Android a while back.
2021-02-09 00:43:10 -08:00
Hongtao Yu 5b8db127a3 [ELF] Rewriting the path of sample profile file for --reproduce response.txt
Rewritting the path of the sample profile file in response.txt to be relative to the repro tar.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D96193
2021-02-09 00:00:16 -08:00
Fangrui Song eea34aae2e [ELF] Inspect -EL & -EB for OUTPUT_FORMAT(default, big, little)
Choose big if -EB is specified, little if -EL is specified, or default if neither is specified.
The new behavior matches GNU ld.

Fixes: https://github.com/ClangBuiltLinux/linux/issues/1025

Differential Revision: https://reviews.llvm.org/D96214
2021-02-08 10:34:57 -08:00
Fangrui Song 7605a9a009 [ELF] Support aarch64_be
This patch adds

* Big-endian values for `R_AARCH64_{ABS,PREL}{16,32,64}` and `R_AARCH64_PLT32`
* aarch64elfb & aarch64linuxb BFD emulations
* elf64-bigaarch64 output format (bfdname)

Link: https://github.com/ClangBuiltLinux/linux/issues/1288

Differential Revision: https://reviews.llvm.org/D96188
2021-02-08 08:55:29 -08:00
Fangrui Song 6a1235211d [ELF] --gc-sections: collect unused SHF_LINK_ORDER .gcc_except_table
A SHF_LINK_ORDER .gcc_except_table is similar to a .gcc_except_table in
a section group. The associated text section is responsible for retaining it.

LLD still does not support GC of non-group non-SHF_LINK_ORDER .gcc_except_table -
but that is not necessary because we can teach the compiler to set SHF_LINK_ORDER.
2021-02-05 21:35:27 -08:00
Fangrui Song 5f4d7b2f0a [ELF] Improve --icf=safe diagnostic
The current diagnostic has confused users. The new wording is adapted from one suggested by Ian Lance Taylor.

Differential Revision: https://reviews.llvm.org/D95917
2021-02-05 09:37:37 -08:00
Fangrui Song ed399d508f [ELF] Make SHF_GNU_RETAIN sections GC roots
binutils 2.36 introduced the new section flag SHF_GNU_RETAIN (for ELFOSABI_GNU &
ELFOSABI_FREEBSD) to mark a sections as a GC root. Several LLVM side toolchain
folks (including me) were involved in the design process of SHF_GNU_RETAIN and
were happy with this proposal.

Currently GNU ld only respects SHF_GNU_RETAIN semantics for ELFOSABI_GNU &
ELFOSABI_FREEBSD object files
(https://sourceware.org/bugzilla/show_bug.cgi?id=27282).  GNU ld sets EI_OSABI
to ELFOSABI_GNU for relocatable output
(https://sourceware.org/bugzilla/show_bug.cgi?id=27091). In practice the single
value EI_OSABI is neither a good indicator for object file compatibility, nor a
useful mechanism marking used ELF extensions.

For input, we respect SHF_GNU_RETAIN semantics even for ELFOSABI_NONE object
files. This is compatible with how LLD and GNU ld handle (mildly useful) STT_GNU_IFUNC
/ (emitted by GCC, considered misfeature by some folks) STB_GNU_UNIQUE input.
(As of LLVM 12.0.0, the integrated assembler does not set ELFOSABI_GNU for
STT_GNU_IFUNC/STB_GNU_UNIQUE).
Arguably STT_GNU_IFUNC/STB_GNU_UNIQUE probably need indicators in object files
but SHF_GNU_RETAIN is more likely accepted by more OSABI platforms.

For output, we take a step further than GNU ld: we don't promote ELFOSABI_NONE
to ELFOSABI_GNU for all output.

Differential Revision: https://reviews.llvm.org/D95749
2021-02-04 09:23:01 -08:00
Fangrui Song b3165a70ae [ELF] Allow R_386_GOTOFF from .debug_info
In GCC emitted .debug_info sections, R_386_GOTOFF may be used to
relocate DW_AT_GNU_call_site_value values
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98946).

R_386_GOTOFF (`S + A - GOT`) is one of the `isStaticLinkTimeConstant` relocation
type which is not PC-relative, so it can be used from non-SHF_ALLOC sections. We
current allow new relocation types as needs come. The diagnostic has caught some
bugs in the past.

Differential Revision: https://reviews.llvm.org/D95994
2021-02-04 09:17:47 -08:00
Fangrui Song 57bfa2ddb6 [ELF] Delete unused --warn-ifunc-textrel
The option catches incompatibility between `R_*_IRELATIVE` and DT_TEXTREL/DF_TEXTREL
before glibc 2.29. Newer glibc versions are more common nowadays and I don't
think this option has ever been used. Diagnosing this problem is also
straightforward by reading the stack trace.
2021-02-02 09:47:06 -08:00
Teresa Johnson 1487747e99 [LTO] Prevent devirtualization for symbols dynamically exported
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.

Differential Revision: https://reviews.llvm.org/D91583
2021-01-27 15:54:13 -08:00
Adhemerval Zanella 988cc0a083 [LLD][ELF][AArch64] Add support for R_AARCH64_LD64_GOTPAGE_LO15 relocation
It is not used by LLVM, but GCC might generates it when compiling
with -fpie, as indicated by PR#40357 [1].

[1] https://bugs.llvm.org/show_bug.cgi?id=40357
2021-01-26 12:01:38 +00:00
Sam Clegg 299b0e5ee9 [lld] Consistent help text for `--save-temps`
I noticed that this option was not appearing at all in the `--help`
messages for `wasm-ld` or `ld.lld`.

Add help text and make it consistent across all ports.

Differential Revision: https://reviews.llvm.org/D94925
2021-01-25 10:27:18 -08:00
Fangrui Song eda973bbc7 [ELF][test] Add a test about --exclude-libs applying to version symbols
D94280 also fixed PR48702.
2021-01-22 18:46:56 -08:00
Hongtao Yu 8aa3ee241d [CSSPGO] LTO option for pseudo probe
Adding a lld option to support emitting pseudo probe metadata in LTO mode.

Reviewed By: MaskRay, wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95056
2021-01-22 11:07:10 -08:00
Fangrui Song d24b94f070 [ELF] --wrap: retain __wrap_foo if foo is defined in an object/bitcode file
If foo is referenced in any object file, bitcode file or shared object,
`__wrap_foo` should be retained as the redirection target of sym
(f96ff3c0f8).

If the object file defining foo has foo references, we cannot easily distinguish
the case from cases where foo is not referenced (we haven't scanned
relocations). Retain `__wrap_foo` because we choose to wrap sym references
regardless of whether sym is defined to keep non-LTO/LTO/relocatable links' behaviors similar
https://sourceware.org/bugzilla/show_bug.cgi?id=26358 .

If foo is defined in a shared object, `__wrap_foo` can still be omitted
(`wrap-dynamic-undef.s`).

Reviewed By: andrewng

Differential Revision: https://reviews.llvm.org/D95152
2021-01-22 09:20:29 -08:00
Bob Haarman 8e0b179315 [ELF] report section sizes when output file too large
Fixes PR48523. When the linker errors with "output file too large",
one question that comes to mind is how the section sizes differ from
what they were previously. Unfortunately, this information is lost
when the linker exits without writing the output file. This change
makes it so that the error message includes the sizes of the largest
sections.

Reviewed By: MaskRay, grimar, jhenderson

Differential Revision: https://reviews.llvm.org/D94560
2021-01-21 19:47:03 +00:00
Fangrui Song f96ff3c0f8 [ELF] --wrap: Produce a dynamic symbol for undefined __wrap_
```
// a.s
jmp fcntl
// b.s
.globl fcntl
fcntl:
  ret
```

`ld.lld -shared --wrap=fcntl a.o b.o` has an `R_X86_64_JUMP_SLOT` referencing
the index 0 undefined symbol, which will cause a glibc `symbol lookup error` at
runtime. This is because `__wrap_fcntl` is not in .dynsym

We use an approximation `!wrap->isUndefined()`, which doesn't set
`isUsedInRegularObj` of `__wrap_fcntl` when `fcntl` is referenced and
`__wrap_fcntl` is undefined.

Fix this by using `sym->referenced`.
2021-01-19 21:23:57 -08:00
Fangrui Song 5fcb412ed0 [ELF] Support R_PPC64_ADDR16_HIGH
R_PPC64_ADDR16_HI represents bits 16-31 of a 32-bit value
R_PPC64_ADDR16_HIGH represents bits 16-31 of a 64-bit value.

In the Linux kernel, `LOAD_REG_IMMEDIATE_SYM` defined in `arch/powerpc/include/asm/ppc_asm.h`
uses @l, @high, @higher, @highest to load the 64-bit value of a symbol.

Fixes https://github.com/ClangBuiltLinux/linux/issues/1260
2021-01-19 11:42:53 -08:00
Fangrui Song e12e0d66c0 [ELF] Error for out-of-range R_PPC64_ADDR16_HA, R_PPC64_ADDR16_HI and their friends
There are no tests for REL16_* and TPREL16_*.
2021-01-19 11:42:52 -08:00
Adhemerval Zanella 2f92386e72 [LLD][ELF][AArch64] Set _GLOBAL_OFFSET_TABLE_ at the start of .got
The commit 18aa0be36e changed the default GotBaseSymInGotPlt to true
for AArch64.  This is different than binutils, where
_GLOBAL_OFFSET_TABLE_ points at the start or .got.

It seems to not intefere with current relocations used by LLVM.  However
as indicated by PR#40357 [1] gcc generates R_AARCH64_LD64_GOTPAGE_LO15
for -pie (in fact it also generated the relocation for -fpic).

This change is requires to correctly handle R_AARCH64_LD64_GOTPAGE_LO15
by lld from objects generated by gcc.

[1] https://bugs.llvm.org/show_bug.cgi?id=40357
2021-01-18 14:51:14 -03:00
Fangrui Song 3809f4ebab [ELF] Support R_PPC_ADDR24 (ba foo; bla foo) 2021-01-17 00:02:13 -08:00
Bob Haarman 6166b91e83 [ELF][NFCI] small cleanup to OutputSections.h
OutputSections.h used to close the lld::elf namespace only to
immediately open it again. This change merges both parts into
one.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D94538
2021-01-12 23:09:16 +00:00
Fangrui Song 93ad0edf67 [ELF] Drop .rel[a].debug_gnu_pub{names,types} for --gdb-index --emit-relocs
Fixes PR48693: --emit-relocs keeps relocation sections. --gdb-index drops
.debug_gnu_pubnames and .debug_gnu_pubtypes but not their relocation sections.
This can cause a null pointer dereference in `getOutputSectionName`.

Also delete debug-gnu-pubnames.s which is covered by gdb-index.s

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D94354
2021-01-12 00:07:28 -08:00
Fangrui Song ac2224c022 [ELF] --exclude-libs: localize defined libcall symbols referenced by lto.tmp
Fixes PR48681: after LTO, lto.tmp may reference a libcall symbol not in an IR
symbol table of any bitcode file. If such a symbol is defined in an archive
matched by a --exclude-libs, we don't correctly localize the symbol.

Add another `excludeLibs` after `compileBitcodeFiles` to localize such libcall
symbols. Unfortunately we have keep the existing one for D43126.

Using VER_NDX_LOCAL is an implementation detail of `--exclude-libs`, it does not
necessarily tie to the "localize" behavior.  `local:` patterns in a version
script can be omitted.
The `symbol ... has undefined version ...` error should not be exempted.
Ideally we should error as GNU ld does. https://issuetracker.google.com/issues/73020933

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D94280
2021-01-11 09:33:22 -08:00
Peter Collingbourne aed84542d5 ELF: Teach the linker about the 'B' augmentation string character.
This character indicates that when return pointer authentication is
being used, the function signs the return address using the B key.

Differential Revision: https://reviews.llvm.org/D93954
2021-01-05 19:51:11 -08:00
Brandon Bergren 275eb8289c [PowerPC] Support powerpcle target in LLD [4/5]
Add support for linking powerpcle code in LLD.

Rewrite lld/test/ELF/emulation-ppc.s to use a shared check block and add powerpcle tests.

Update tests.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93917
2021-01-02 12:18:05 -06:00
Fangrui Song b0d6bebe90 [ELF] Drop '>>> defined in ' for locations of linker synthesized symbols
Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D93925
2020-12-30 09:16:26 -08:00
Georgii Rymar ed146d6291 [LLD][ELF] - Use LLVM_ELF_IMPORT_TYPES_ELFT instead of multiple types definitions. NFCI.
We can reduce the number of "using" declarations.
`LLVM_ELF_IMPORT_TYPES_ELFT` was extended in D93801.

Differential revision: https://reviews.llvm.org/D93856
2020-12-29 10:50:07 +03:00
Fangrui Song fb3c1b3de5 [ELF] Reject local-exec TLS relocations for -shared
For x86-64, D33100 added a diagnostic for local-exec TLS relocations referencing a preemptible symbol.

This patch generalizes it to non-preemptible symbols (see `-Bsymbolic` in `tls.s`)
on all targets.

Local-exec TLS relocations resolve to offsets relative to a fixed point within
the static TLS block, which are only meaningful for the executable.

With this change, `clang -fpic -shared -fuse-ld=bfd a.c` on the following example will be flagged for AArch64/ARM/i386/x86-64/RISC-V

```
static __attribute__((tls_model("local-exec"))) __thread long TlsVar = 42;
long bump() { return ++TlsVar; }
```

Note, in GNU ld, at least arm, riscv and x86's ports have the similar
diagnostics, but aarch64 and ppc64 do not error.

Differential Revision: https://reviews.llvm.org/D93331
2020-12-21 08:47:04 -08:00
Fangrui Song e25afcfa51 [ELF][PPC64] Detect missing R_PPC64_TLSGD/R_PPC64_TLSLD and disable TLS relaxation
Alternative to D91611.

The TLS General Dynamic/Local Dynamic code sequences need to mark
`__tls_get_addr` with R_PPC64_TLSGD or R_PPC64_TLSLD, e.g.

```
addis r3, r2, x@got@tlsgd@ha # R_PPC64_GOT_TLSGD16_HA
addi r3, r3, x@got@tlsgd@l   # R_PPC64_GOT_TLSGD16_LO
bl __tls_get_addr(x@tlsgd)   # R_PPC64_TLSGD followed by R_PPC64_REL24
nop
```

However, there are two deviations form the above:

1. direct call to `__tls_get_addr`. This is essential to implement ld.so in glibc/musl/FreeBSD.

```
bl __tls_get_addr
nop
```

This is only used in a -shared link, and thus not subject to the GD/LD to IE/LE
relaxation issue below.

2. Missing R_PPC64_TLSGD/R_PPC64_TLSGD for compiler generated TLS references

According to Stefan Pintille, "In the early days of the transition from the
ELFv1 ABI that is used for big endian PowerPC Linux distributions to the ELFv2
ABI that is used for little endian PowerPC Linux distributions, there was some
ambiguity in the specification of the relocations for TLS. The GNU linker has
implemented support for correct handling of calls to __tls_get_addr with a
missing relocation.  Unfortunately, we didn't notice that the IBM XL compiler
did not handle TLS according to the updated ABI until we tried linking XL
compiled libraries with LLD."

In short, LLD needs to work around the old IBM XL compiler issue.
Otherwise, if the object file is linked in -no-pie or -pie mode,
the result will be incorrect because the 4 instructions are partially
rewritten (the latter 2 are not changed).

Work around the compiler bug by disable General Dynamic/Local Dynamic to
Initial Exec/Local Exec relaxation. Note, we also disable Initial Exec
to Local Exec relaxation for implementation simplicity, though technically it can be kept.

ppc64-tls-missing-gdld.s demonstrates the updated behavior.

Reviewed By: #powerpc, stefanp, grimar

Differential Revision: https://reviews.llvm.org/D92959
2020-12-21 08:45:41 -08:00
Fangrui Song 22c1bd57bf [ELF] Rename R_TLS to R_TPREL and R_NEG_TLS to R_TPREL_NEG. NFC
The scope of R_TLS (TP offset relocation types (TPREL/TPOFF) used for the
local-exec TLS model) is actually narrower than its name may imply. R_TLS_NEG
is only used by Solaris R_386_TLS_LE_32.

Rename them so that they will be less confusing.

Reviewed By: grimar, psmith, rprichard

Differential Revision: https://reviews.llvm.org/D93467
2020-12-18 08:24:42 -08:00
Reshabh Sharma fdd6ed8e93 [LLD] Rename lld port driver entry function to a consistent name
Libraries linked to the lld elf library exposes a function named main.
When debugging code linked to such libraries and intending to set a
breakpoint at main, the debugger also sets breakpoint at the main
function at lld elf driver. The possible choice was to rename it to
link but that would again clash with lld::*::link. This patch tries
to consistently rename them to linkerMain.

Differential Revision: https://reviews.llvm.org/D91418
2020-12-18 12:18:37 +05:30
Adhemerval Zanella 978eb3b87b [lld] [ELF] AArch64: Handle DT_AARCH64_VARIANT_PCS
As indicated by AArch64 ELF specification, symbols with st_other
marked with STO_AARCH64_VARIANT_PCS indicates it may follow a variant
procedure call standard with different register usage convention
(for instance SVE calls).

Static linkers must preserve the marking and propagate it to the dynamic
symbol table if any reference or definition of the symbol is marked with
STO_AARCH64_VARIANT_PCS, and add a DT_AARCH64_VARIANT_PCS dynamic tag if
there are R_<CLS>_JUMP_SLOT relocations that reference that symbols.

It implements https://bugs.llvm.org/show_bug.cgi?id=48368.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93045
2020-12-17 11:09:55 -03:00
Fangrui Song 16cb7910f5 [ELF] --emit-relocs: fix a crash if .rela.dyn is an empty output section
Fix PR48357: If .rela.dyn appears as an output section description, its type may
be SHT_RELA (due to the empty synthetic .rela.plt) while there is no input
section. The empty .rela.dyn may be retained due to a reference in a linker
script. Don't crash.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D93367
2020-12-16 08:59:38 -08:00
Fangrui Song c8da71b53f [ELF] Error for out-of-range R_X86_64_[REX_]GOTPCRELX
Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D93259
2020-12-15 09:20:07 -08:00
LemonBoy 92c6141ce6 lld/ELF: Parse MSP430 BFD/emulation names
Follow the naming set by TI's own GCC-based toolchain.
Also, force the `osabi` field to `ELFOSABI_STANDALONE`, this matches GNU LD's output (the patching is done in `elf32_msp430_post_process_headers`).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92931
2020-12-14 09:38:12 -08:00
Fangrui Song 7d38861ce3 [ELF] Rename --[no-]lto-new-pass-manager to --[no-]lto-legacy-pass-manager
Normally we should not delete options. However, the Clang driver passes
`-plugin-opt={new,legacy}-pass-manager` instead of
`--[no-]lto-legacy-pass-manager` (`-plugin-opt=new-pass-manager` has been used
since 7.0), and it is unlikely anyone will use the `--lto-*` style options directly.

So let's rename them to be consistent with the Clang driver option names.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D92988
2020-12-09 17:53:37 -08:00
Fangrui Song 7adcacda06 Rename -plugin-opt=no-new-pass-manager to -plugin-opt=legacy-pass-manager 2020-12-09 16:43:30 -08:00
Fangrui Song 68ff3b3376 [LLD][gold] Add -plugin-opt=no-new-pass-manager
-DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=on configured LLD and LLVMgold.so
will use the new pass manager by default. Add an option to
use the legacy pass manager. This will also be used by the Clang driver
when -fno-new-pass-manager (D92915) / -fno-experimental-new-pass-manager is set.

Reviewed By: aeubanks, tejohnson

Differential Revision: https://reviews.llvm.org/D92916
2020-12-09 13:31:03 -08:00
Fangrui Song baef18dffb [ELF] Reorganize "is only supported on" tests and fix some diagnostics 2020-12-09 12:14:00 -08:00
Arthur Eubanks fa602d74f6 [ELF][LTO][NPM] Use NPM with ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92885
2020-12-08 15:12:57 -08:00
Sean Fertile 8f91f38148 [LLD] Search archives for symbol defs to override COMMON symbols.
This patch changes the archive handling to enable the semantics needed
for legacy FORTRAN common blocks and block data. When we have a COMMON
definition of a symbol and are including an archive, LLD will now
search the members for global/weak defintions to override the COMMON
symbol. The previous LLD behavior (where a member would only be included
if it satisifed some other needed symbol definition) can be re-enabled with the
option '-no-fortran-common'.

Differential Revision: https://reviews.llvm.org/D86142
2020-12-07 10:09:19 -05:00
Georgii Rymar 3f5dc57fd1 [LLD][ELF] - Don't keep empty output sections which have explicit program headers.
This reverts a side effect introduced in the code cleanup patch D43571:
LLD started to emit empty output sections that are explicitly assigned to a segment.

This patch fixes the issue by removing the !sec.phdrs.empty() special case from
isDiscardable. As compensation, we add an early phdrs propagation step (see the inline comment).
This is similar to one that we do in adjustSectionsAfterSorting.

Differential revision: https://reviews.llvm.org/D92301
2020-12-02 11:19:21 +03:00
Nico Weber b2f00f24a3 [mac/lld] Include archive name in diagnostics
Also, for .o files, include full path as given on link command line.

Before:
    lld: error: undefined symbol [...], referenced from sandbox_logging.o

After:
    lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o)

Move archiveName up to InputFile so we can consistently use toString()
to print InputFiles in diags, and pass it to the ObjFile ctor. This
matches the ELF and COFF ports.

Differential Revision: https://reviews.llvm.org/D92437
2020-12-01 23:00:25 -05:00
Arthur Eubanks 99d82412f8 [LLD][ELF][NewPM] Add option to force legacy PM
In preparation for the NPM switch.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92417
2020-12-01 13:41:17 -08:00
Fangrui Song 843c2b2303 [ELF] Error for undefined foo@v1
If an object file has an undefined foo@v1, we emit a dynamic symbol foo.
This is incorrect if at runtime a shared object provides the non-default version foo@v1
(the undefined foo may bind to foo@@v2, for example).

GNU ld issues an error for this case, even if foo@v1 is undefined weak
(https://sourceware.org/bugzilla/show_bug.cgi?id=3351). This behavior makes
sense because to represent an undefined foo@v1, we have to construct a Verneed
entry. However, without knowing the defining filename, we cannot construct a
Verneed entry (Verneed::vn_file is unavailable).

This patch implements the error.

Depends on D92258

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D92260
2020-12-01 08:59:54 -08:00
Fangrui Song 941e9336d0 [ELF] Make foo@@v1 resolve undefined foo@v1
The symbol resolution rules for versioned symbols are:

* foo@@v1 (default version) resolves both undefined foo and foo@v1
* foo@v1 (non-default version) resolves undefined foo@v1

Note, foo@@v1 must be defined (the assembler errors if attempting to
create an undefined foo@@v1).

For defined foo@@v1 in a shared object, we call `SymbolTable::addSymbol` twice,
one for foo and the other for foo@v1. We don't do the same for object files, so
foo@@v1 defined in one object file incorrectly does not resolve a foo@v1
reference in another object file.

This patch fixes the issue by reusing the --wrap code to redirect symbols in
object files. This has to be done after processing input files because
foo and foo@v1 are two separate symbols if we haven't seen foo@@v1.

Add a helper `Symbol::getVersionSuffix` to retrieve the optional trailing
`@...` or `@@...` from the possibly truncated symbol name.

Depends on D92258

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D92259
2020-12-01 08:54:01 -08:00
Nico Weber 4431c212a0 lld/ELF: Make three rarely-used flags work with --reproduce
All three use readFile() for their argument so their argument file is
already copied to the tar, but we weren't rewriting the argument to
point to the path used in the tar file.

No test because the change is trivial (several other flags in
createResponseFile() also aren't tested, likely for the same reason.)

Differential Revision: https://reviews.llvm.org/D92356
2020-12-01 09:20:29 -05:00
Wei Wang 3acda91742 [Remarks][1/2] Expand remarks hotness threshold option support in more tools
This is the #1 of 2 changes that make remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
hotness threshold from profile summary with special value 'auto'.

This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept remarks hotness threshold. Update all the tools that use it with remarks
hotness threshold options:

* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='

Differential Revision: https://reviews.llvm.org/D85809
2020-11-30 21:55:49 -08:00
Fangrui Song 589e10f858 [ELF] Don't relax R_X86_64_GOTPCRELX if addend != -4
clang may produce `movl x@GOTPCREL+4(%rip), %eax` when loading the high 32 bits
of the address of a global variable in -fpic/-fpie mode.

If assembled by GNU as, the fixup emits an R_X86_64_GOTPCRELX with an
addend != -4. The instruction loads from the GOT entry with an offset
and thus it is incorrect to relax the instruction.

If assembled by the integrated assembler, we emit R_X86_64_GOTPCREL for
relocations that definitely cannot be relaxed (D92114), so this patch is not
needed.

This patch disables the relaxation, which is compatible with the implementation in GNU ld
("Add R_X86_64_[REX_]GOTPCRELX support to gas and ld").

Reviewed By: grimar, jhenderson

Differential Revision: https://reviews.llvm.org/D91993
2020-11-30 08:30:19 -08:00
Nico Weber 83e60f5a55 [lld/mac] Add --reproduce option
This adds support for ld.lld's --reproduce / lld-link's /reproduce:
flag to the MachO port. This flag can be added to a link command
to make the link write a tar file containing all inputs to the link
and a response file containing the link command. This can be used
to reproduce the link on another machine, which is useful for sharing
bug report inputs or performance test loads.

Since the linker is usually called through the clang driver and
adding linker flags can be a bit cumbersome, setting the env var
`LLD_REPRODUCE=foo.tar` triggers the feature as well.

The file response.txt in the archive can be used with
`ld64.lld.darwinnew $(cat response.txt)` as long as the contents are
smaller than the command-line limit, or with `ld64.lld.darwinnew
@response.txt` once D92149 is in.

The support in this patch is sufficient to create a tar file for
Chromium's base_unittests that can link after unpacking on a different
machine.

Differential Revision: https://reviews.llvm.org/D92274
2020-11-30 08:40:21 -05:00
Fangrui Song dfcf1acf13 [ELF] Improve 2 SmallVector<*, N> usage
For --gc-sections, SmallVector<InputSection *, 256> -> SmallVector<InputSection *, 0> because the code bloat (1296 bytes) is not worthwhile (the saved reallocation is negligible).
For OutputSection::compressedData, N=1 is useless (for a compressed .debug_*, the size is always larger than 1).
2020-11-29 14:01:32 -08:00
Fangrui Song 048b16f7fb [ELF] Check --orphan-handling=place (default value) early
The function took 1% (161MiB clang) to 1.7% (an 4.9GiB executable) time.
2020-11-29 12:36:27 -08:00
Nico Weber a0994cbe27 lld-link: Let LLD_REPRODUCE control /reproduce:, like in ld.lld
Also sync help texts for the option between elf and coff ports.

Decisions:
- Do this even if /lldignoreenv is passed. /reproduce: does not affect
  the main output, and this makes the env var more convenient to use.
  (On the other hand, it's now possible to set this env var and forget
  about it, and all future builds in the same shell will be much slower.
  That's true for ld.lld, but posix shells have an easy way to set an
  env var for a single command; in cmd.exe this is not possible without
  contortions. Then again, lld-link runs in posix shells too.)

Original patch rebased across D68378 and D68381.

Differential Revision: https://reviews.llvm.org/D67707
2020-11-27 13:33:55 -05:00
Fangrui Song 50564ca075 [ELF] Rename adjustRelaxExpr to adjustTlsExpr and delete the unused `data` parameter. NFC
Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D91995
2020-11-25 09:00:55 -08:00
Fangrui Song 572d18397c [ELF] Add TargetInfo::adjustGotPcExpr for `R_GOT_PC` relaxations. NFC
With this change, `TargetInfo::adjustRelaxExpr` is only related to TLS
relaxations and a subsequent clean-up can delete the `data` parameter.

Differential Revision: https://reviews.llvm.org/D92079
2020-11-25 08:43:26 -08:00
Teresa Johnson 07f234be1c [lld] Add --no-lto-whole-program-visibility
Enables overriding earlier --lto-whole-program-visibility.

Variant of D91583 while discussing alternate ways to identify and
handle the --export-dynamic case.

Differential Revision: https://reviews.llvm.org/D92060
2020-11-24 16:46:08 -08:00
Nico Weber 11b7625833 [lld/mac] Implement basic typo correction for flags
Also use "unknown flag 'flag'" instead of "unknown flag: flag" for
consistency with the other ports.

Differential Revision: https://reviews.llvm.org/D91970
2020-11-24 11:33:39 -05:00
Georgii Rymar 9a99d23a1b [lib/Object] - Generalize the RelocationResolver API.
This allows to reuse the RelocationResolver from the code
that doesn't want to deal with `RelocationRef` class.

I am going to use it in llvm-readobj. See the description
of D91530 for more details.

Differential revision: https://reviews.llvm.org/D91533
2020-11-20 10:32:49 +03:00
Fangrui Song 55d310adc0 [ELF] Fix interaction between --unresolved-symbols= and --[no-]allow-shlib-undefined
As mentioned in https://reviews.llvm.org/D67479#1667256 ,

* `--[no-]allow-shlib-undefined` control the diagnostic for an unresolved symbol in a shared object
* `-z defs/-z undefs` control the diagnostic for an unresolved symbol in a regular object file
* `--unresolved-symbols=` controls both bits.

In addition, make --warn-unresolved-symbols affect --no-allow-shlib-undefined.

This patch makes the behavior match GNU ld.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D91510
2020-11-17 12:20:57 -08:00
Nico Weber baa2aa28f5 lld: Add --color-diagnostic to MachO port, harmonize others
This adds `--[no-]color-diagnostics[=auto,never,always]` to
the MachO port and harmonizes the flag in the other ports:
- Consistently use MetaVarName
- Consistently document the non-eq version as alias of the eq version
- Use B<> in the ports that have it (no-op, shorter)
- Fix oversight in COFF port that made the --no flag have the wrong
  prefix

Differential Revision: https://reviews.llvm.org/D91640
2020-11-17 12:58:30 -05:00
Fangrui Song 3f90918886 [ELF] --gc-sections: collect unused .gcc_except_table in section groups and associated text sections
`try ... catch` in an inline function produces `.gcc_except_table.*` in a COMDAT
group with GCC or newer Clang (since D83655). For --gc-sections, currently we
scan `.eh_frame` pieces and mark liveness of such a `.gcc_except_table.*` and
then the associated `.text.*` (if a member in a section group is retained, the
others should be retained as well).

Essentially all `.text.*` and `.gcc_except_table.*` compiled from inline
functions with `try ... catch` cannot be discarded by the imprecise
--gc-sections.  Compared with the state before D83655, the output
`.gcc_except_table` is smaller (non-prevailing copies in COMDAT groups can now
be discarded) but `.text` may be larger, i.e. size regression.

This patch teaches the .eh_frame piece scanning code to not mark
`.gcc_except_table` in a section group, thus allow unused `.text.*` and
`.gcc_except_table.*` in a section group to be discarded.

Note, non-group `.gcc_except_table` can still not be discarded. That is the status quo.

Reviewed By: grimar, echristo

Differential Revision: https://reviews.llvm.org/D91579
2020-11-17 09:11:20 -08:00
Fangrui Song 8df4e60945 [ELF] Don't consider SHF_ALLOC ".debug*" sections debug sections
Fixes PR48071

* The Rust compiler produces SHF_ALLOC `.debug_gdb_scripts` (which normally does not have the flag)
* `.debug_gdb_scripts` sections are removed from `inputSections` due to --strip-debug/--strip-all
* When processing --gc-sections, pieces of a SHF_MERGE section can be marked live separately

`=>` segfault when marking liveness of a `.debug_gdb_scripts` which is not split into pieces (because it is not in `inputSections`)

This patch circumvents the problem by not treating SHF_ALLOC ".debug*" as debug sections (to prevent --strip-debug's stripping)
(which is still useful on its own).

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D91291
2020-11-12 09:59:43 -08:00
Fangrui Song 40a42f9f3f [ELF] Make SORT_INIT_PRIORITY support .ctors.N
Input sections `.ctors/.ctors.N` may go to either the output section `.init_array` or the output section `.ctors`:

* output `.ctors`: currently we sort them by name. This patch changes to sort by priority from high to low. If N in `.ctors.N` is in the form of %05u, there is no semantic difference. Actually GCC and Clang do use %05u. (In the test `ctors_dtors_priority.s` and Gold's test `gold/testsuite/script_test_14.s`, we can see %03u, but they are not really produced by compilers.)
* output `.init_array`: users can provide an input section description `SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)` to mix `.init_array.*` and `.ctors.*`. This can make .init_array.N and .ctors.(65535-N) interchangeable.

With this change, users can mix `.ctors.N` and `.init_array.N` in `.init_array` (PR44698 and PR48096) with linker scripts. As an example:

```
SECTIONS {
  .init_array : {
    *(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*))
    *(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors)
  }
} INSERT AFTER .fini_array;
SECTIONS {
  .fini_array : {
    *(SORT_BY_INIT_PRIORITY(.fini_array.* .dtors.*))
    *(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors)
  }
} INSERT BEFORE .init_array;
```

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D91187
2020-11-12 08:56:12 -08:00
Fangrui Song 73d01a80ce [ELF] Sort by input order within an input section description
According to
https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html#Input-Section-Basics
for `*(.a .b)`, the order should match the input order:

* for `ld 1.o 2.o`, sections from 1.o precede sections from 2.o
* within a file, `.a` and `.b` appear in the section header table order

This patch implements the behavior. The interaction with `SORT*` and --sort-section is:

Matched sections are ordered by radix sort with the keys being `(SORT*, --sort-section, input order)`,
where `SORT*` (if present) is most significant.

> Note, multiple `SORT*` within an input section description has undocumented and
> confusing behaviors in GNU ld:
> https://sourceware.org/pipermail/binutils/2020-November/114083.html
> Therefore multiple `SORT*` is not the focus for this patch but
> this patch still strives to have an explainable behavior.

As an example, we partition `SORT(a.*) b.* c.* SORT(d.*)`, into
`SORT(a.*) | b.* c.* | SORT(d.*)` and perform sorting within groups. Sections
matched by patterns between two `SORT*` are sorted by input order.  If
--sort-alignment is given, they are sorted by --sort-alignment, breaking tie by
input order.

This patch also allows a section to be matched by multiple patterns, previously
duplicated sections could occupy more space in the output and had erroneous zero bytes.

The patch is in preparation for support for
`*(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)) *(.init_array .ctors)`,
which will allow LLD to mix .ctors*/.init_array* like GNU ld (gold's --ctors-in-init-array)
PR44698 and PR48096

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D91127
2020-11-12 08:53:11 -08:00
Fangrui Song 2a9aed0e8b [ELF] Support multiple SORT in an input section description
The second `SORT` in `*(SORT(...) SORT(...))` is incorrectly parsed as a file pattern.
Fix the bug by stopping at `SORT*` in `readInputSectionsList`.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D91180
2020-11-12 08:46:53 -08:00
James Henderson 439341b9bf [lld][ELF] Add additional time trace categories
I noticed when running a large link with the --time-trace option that
there were several areas which were missing any specific time trace
categories (aside from the generic link/ExecuteLinker categories). This
patch adds new categories to fill most of the "gaps", or to provide more
detail than was previously provided.

Reviewed by: MaskRay, grimar, russell.gallop

Differential Revision: https://reviews.llvm.org/D90686
2020-11-10 10:28:46 +00:00
Fangrui Song b22317705d [ELF] Special case static_assert for _WIN32
I don't have a Windows machine. Hope someone can test why its InputSection is
still larger.
2020-11-09 10:08:44 -08:00
Fangrui Song 2eccde4a2b [ELF] Make InputSection smaller
On LP64/Windows platforms, this decreases sizeof(InputSection) from 208 (larger
on Windows) to 184.

For a large executable (7.6GiB, inputSections.size()=5105122,
make<InputSection> called 4835760 times), this decreases cgroup
memory.max_usage_in_bytes by 0.6%

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D91018
2020-11-09 09:55:09 -08:00
serge-sans-paille 1e70ec10eb [lld] Provide a hook to customize undefined symbols error handling
This is a follow up to https://reviews.llvm.org/D87758, implementing the missing
symbol part, as done by binutils.

Differential Revision: https://reviews.llvm.org/D89687
2020-11-09 13:28:48 +01:00
Fangrui Song 3ba3342232 [ELF] --warn-backrefs-exclude: use toString to match the documentation
The pattern should patch `a.a(a.o)` instead of `a.a`
2020-11-07 20:19:21 -08:00
serge-sans-paille cfc32267e2 Provide a hook to customize missing library error handling
Make it possible for lld users to provide a custom script that would help to
find missing libraries. A possible scenario could be:

    % clang /tmp/a.c -fuse-ld=lld -loauth -Wl,--error-handling-script=/tmp/addLibrary.py
    unable to find library -loauth
    looking for relevant packages to provides that library

        liboauth-0.9.7-4.el7.i686
        liboauth-devel-0.9.7-4.el7.i686
        liboauth-0.9.7-4.el7.x86_64
        liboauth-devel-0.9.7-4.el7.x86_64
        pix-1.6.1-3.el7.x86_64

Where addLibrary would be called with the missing library name as first argument
(in that case addLibrary.py oauth)

Differential Revision: https://reviews.llvm.org/D87758
2020-11-03 11:01:29 +01:00
Fangrui Song 2fc704a0a5 [ELF] --emit-relocs: fix st_value of STT_SECTION in the presence of a gap before the first input section
In the presence of a gap, the st_value field of a STT_SECTION symbol is the
address of the first input section (incorrect if there is a gap). Set it to the
output section address instead.

In -r mode, this bug can cause an incorrect non-zero st_value of a STT_SECTION
symbol (while output sections have zero addresses, input sections may have
non-zero outSecOff).  The non-zero st_value can cause the final link to have
incorrect relocation computation (both GNU ld and LLD add st_value of the
STT_SECTION symbol to the output section address).

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D90520
2020-11-02 08:37:15 -08:00
Fangrui Song ae73091f30 [ELF] -r: don't crash when a non-SHF_LINK_ORDER orphan is added before a SHF_LINK_ORDER orphan
Fixes https://github.com/ClangBuiltLinux/linux/issues/1186

If a non-SHF_LINK_ORDER orphan is added first, `firstIsec->flags & SHF_LINK_ORDER`
will be zero and we currently assert when calling `getLinkOrderDep`.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D90200
2020-10-28 08:56:42 -07:00
Fangrui Song 398b81067c [ELF] Don't crash on R_X86_64_GOTPCRELX for test/binop instructions
While MC did not produce R_X86_64_GOTPCRELX for test/binop instructions
(movl/adcl/addl/andl/...) before the previous commit, this code path has been
exercised by -fno-integrated-as for GNU as since 2016: -no-pie relaxing
may incorrectly access loc[-3] and produce a corrupted instruction.

Simply handle test/binop R_X86_64_GOTPCRELX like R_X86_64_GOTPCREL.
2020-10-24 15:14:17 -07:00
Fangrui Song 9267caebfa [ELF] Don't error on R_PPC64_REL24/R_PPC64_REL24_NOTOC referencing __tls_get_addr for missing R_PPC64_TLSGD/R_PPC64_TLSLD
This partially reverts D85994.

In glibc, elf/dl-sym.c calls the raw `__tls_get_addr` by specifying the
tls_index parameter. Such a call does not have a pairing R_PPC64_TLSGD/R_PPC64_TLSLD.
This is legitimate. Since we cannot distinguish the benign case from cases due
to toolchain issues, we have to be permissive.

Acked by Stefan Pintilie
2020-10-23 10:38:07 -07:00
Stefan Pintilie c6561ccfd9 [PowerPC][LLD] Support for PC Relative TLS for Local Dynamic
Add support to LLD for PC Relative Thread Local Storage for Local Dynamic.
This patch adds support for two relocations: R_PPC64_GOT_TLSLD_PCREL34 and
R_PPC64_DTPREL34.

The Local Dynamic code is:
```
pla r3, x@got@tlsld@pcrel        R_PPC64_GOT_TLSLD_PCREL34
bl __tls_get_addr@notoc(x@tlsld) R_PPC64_TLSLD
                                 R_PPC64_REL24_NOTOC
...
paddi r9, r3, x@dtprel           R_PPC64_DTPREL34
```

After relaxation to Local Exec:
```
paddi r3, r13, 0x1000
nop
...
paddi r9, r3, x@dtprel          R_PPC64_DTPREL34
```

Reviewed By: NeHuang, sfertile

Differential Revision: https://reviews.llvm.org/D87504
2020-10-23 08:23:56 -05:00
Fangrui Song ce3c5dae06 [ELF] --warn-backrefs: save the referenced InputFile *
For a diagnostic `A refers to B` where B refers to a bitcode file, if the
symbol gets optimized out, the user may see `A refers to <internal>`; if the
symbol is retained, the user may see `A refers to lto.tmp`.

Save the reference InputFile * in the DenseMap so that the original filename is
available in reportBackrefs().
2020-10-22 15:27:19 -07:00
Fangrui Song a8f9f08018 [ELF] Set SHF_INFO_LINK for .rel[a].plt and .rel[a].dyn
The ELF spec says

> If the sh_flags field for this section header includes the attribute SHF_INFO_LINK, then this member represents a section header table index.

Set SHF_INFO_LINK so that binary manipulation tools know that sh_info is
a section header table index instead of (the number of local symbols in the case of SHT_SYMTAB/SHT_DYNSYM).
We have already added SHF_INFO_LINK for --emit-relocs retained SHT_REL[A].

For example, we can teach llvm-objcopy to preserve the section index of the sh_info referenced section if
SHF_INFO_LINK is set. (GNU objcopy recognizes .rel[a].plt and updates
sh_info even if SHF_INFO_LINK is not set).

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D89828
2020-10-22 09:48:19 -07:00
Fangrui Song b6e4aae2cc [ELF] --gc-sections: retain dependent sections of non-SHF_ALLOC sections
Fix http://lists.llvm.org/pipermail/llvm-dev/2020-October/145908.html

Currently non-SHF_ALLOC SHT_REL[A] (due to --emit-relocs) and SHF_LINK_ORDER are not
marked live.

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D89841
2020-10-21 10:11:26 -07:00
Fangrui Song 38b632c16e [ELF] --gdb-index: support --icf={safe,all}
The combination has not been tested before. In the case of ICF,
`e.section->getVA(0)` equals the start address of the output section.

This can cause incorrect overlapping with the actual function at the
start of the output section and potentially trigger a GDB internal error
in `dw2_find_pc_sect_compunit_symtab` (presumably because:
if a short address range incorrectly starts at the start address of the
output section, GDB may pick it instead of the correct longer address
range. When mapping an address within the long address range but
out of the scope of the short address range, the routine may find
nothing - while the code asserts that it can find something).

Note that in the case of ICF there may be duplicate address range entries,
but GDB appears to be fine with them.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D89751
2020-10-20 09:35:32 -07:00
Andrew Ng 88ce27c39c [LLD][ELF] Improve ICF for relocations to ineligible sections via "aliases"
ICF was not able to merge equivalent sections because of relocations to
sections ineligible for ICF that use alternative symbols, e.g. symbol
aliases or section relative relocations.

Merging in this scenario has been enabled by giving the sections that
are ineligible for ICF a unique ID, i.e. an equivalence class of their
own. This approach also provides another benefit as it improves the
hashing that is used to perform the initial equivalance grouping for
ICF. This is because the ICF ineligible sections can now contribute a
unique value towards the hashes instead of the same value of zero. This
has been seen to reduce link time with ICF by ~68% for objects compiled
with -fprofile-instr-generate.

In order to facilitate this use of a unique ID, the existing
inconsistent approach to the setting of the InputSection eqClass in ICF
has been changed so that there is a clear distinction between the
eqClass values of ICF eligible sections and those of the ineligible
sections that have a unique ID. This inconsistency could have caused
incorrect equivalence class equality in the past, although it appears
that no issues were encountered in actual use.

Differential Revision: https://reviews.llvm.org/D88830
2020-10-15 12:43:14 +01:00
Konstantin Zhuravlyov f218652a36 LLD/AMDGPU: Infer os abi based on input llvm bitcode
Differential Revision: https://reviews.llvm.org/D89042
2020-10-13 12:20:28 -04:00
Christian Iversen a9cefc3dee [ELF] Fix broken bitstream linking with lld when e_machine > 255
In ELF/InputFiles.cpp, getBitcodeMachineKind() is limited to uint8_t return
type. This works as long as EM_xxx is < 256, which is true for common
architectures, but not for some newly assigned or unofficial EM_* values.

The corresponding ELF field (e_machine) can hold uint16_t.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D89185
2020-10-11 14:19:25 -07:00
Martin Storsjö 1dbfd87319 [LLD] [ELF] Fix the help listing for the wrap option. NFC.
This option just takes a single symbol name per invocation of the
option.

Differential Revision: https://reviews.llvm.org/D89007
2020-10-09 15:32:00 +03:00
Fangrui Song db1988f038 [ELF] Don't change binding to STB_WEAK for an undefined specified by -u
Similar to D66992.
In GNU ld, a -u specified symbol is a STB_DEFAULT undefined.
It cannot be changed to STB_WEAK by a later STB_WEAK undefined in a regular object file.

The behavior is consistent with our model because -u means "we need to fetch a lazy definition".
It should not be altered just because there is also a STB_WEAK undefined.

Note, our -u semantics are still different from GNU ld (https://github.com/ClangBuiltLinux/linux/issues/515):
we don't force the specified symbol to appear in .symtab This is a deliberate decision.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D88945
2020-10-08 08:31:34 -07:00
Martin Storsjö 9b2b32743d [LLD] [ELF] Fix up a comment regarding the --wrap option. NFC.
Add missing leading underscores to the __wrap_<symbol> and
__real_<symbol> names.

Differential Revision: https://reviews.llvm.org/D89008
2020-10-08 09:33:23 +03:00
Fangrui Song 88f2fe5cad Raland D87318 [LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.

The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel            R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd)     R_PPC64_TLSGD
                                     R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for  R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.

Reviewed By: NeHuang, sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D87318
2020-10-01 12:36:33 -07:00
Stefan Pintilie 5f3e565f59 Revert "[LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic"
This reverts commit 79122868f9.
2020-10-01 13:28:35 -05:00
Stefan Pintilie 79122868f9 [LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.

The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel            R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd)     R_PPC64_TLSGD
                                     R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for  R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.

Reviewed By: NeHuang, sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D87318
2020-10-01 13:00:37 -05:00
Fangrui Song 4e9277eda1 [ELF] --wrap: don't unnecessarily expose __real_
The routing rules are:

sym -> __wrap_sym
__real_sym -> sym

__wrap_sym and sym are routing targets, so they need to be exposed to the symbol
table. __real_sym is not and can be eliminated if not used by regular object.
2020-09-30 20:09:25 -07:00
Fangrui Song 259bb61c11 [ELF] Fix multiple -mllvm after D70378
Fixes https://reviews.llvm.org/D70378#2299569 Multiple -mllvm is intended to be supported.

We don't have a proper test for `-plugin-opt=-`. This patch adds the test as well.

Differential Revision: https://reviews.llvm.org/D88461
2020-09-29 10:26:58 -07:00
Stefan Pintilie 8c53282d64 [PowerPC][NFC] Merged two switch entries.
Two switch entries did exactly the same thing. This patch merges them.
2020-09-25 09:49:13 -05:00
Stefan Pintilie d224175230 [PowerPC][LLD] Extend R2 save stub to support offsets of more than 26 bits
The R2 save stub will now support offsets up to 64 bits.

There are three cases that will be used.
1) The offset fits in 26 bits.
```
b <26 bit offset>
```
2) The offset does not fit in 26 bits but fits in 34 bits.
```
paddi r12, 0, <34 bit offset>, 1
mtctr r12
bctr
```
3) The offset does not fit in 34 bits. Since this is an R2 save stub we can use
the TOC in R2. We are not loading the offset but the actual address we want to
branch to.
```
addis r12, r2, <address in TOC lo>
ld r12 <address in TOC hi>(r12)
mtctr r12
bctr
```

In case 1) the stub is only 8 bytes while in cases 2) and 3) the stub will be
20 bytes.

Reviewed By: MaskRay, sfertile, NeHuang

Differential Revision: https://reviews.llvm.org/D87916
2020-09-25 06:39:14 -05:00
Fangrui Song 1ca6bd261e [lld] Clean up in lld::{coff,elf}::link after D70378
Library users should not need to call errorHandler().reset() explicitly.

google/iree calls lld:🧝:link and without the patch some global
variables are not cleaned up in the next invocation.
2020-09-24 18:02:45 -07:00
Snehasish Kumar 070555c6c0 [lld] Make -z keep-text-section-prefix recognize .text.split. as a prefix.
".text.split." holds symbols which are split out from functions in
other input sections. For example, with -fsplit-machine-functions,
placing the cold parts in .text.split instead of .text.unlikely mitigates
against poor profile inaccuracy. Techniques such as hugepage remapping can
make conservative decisions at the section granularity.

Differential Revision: https://reviews.llvm.org/D87840
2020-09-24 15:02:48 -07:00
Alexandre Ganea f2efb5742c [LLD][COFF] Cover usage of LLD-as-a-library in tests
In lit tests, we run each LLD invocation twice (LLD_IN_TEST=2), without shutting down the process in-between. This ensures a full cleanup is properly done between runs.
Only active for the COFF driver for now. Other drivers still use LLD_IN_TEST=1 which executes just one iteration with full cleanup, like before.
When the environment variable LLD_IN_TEST is unset, a shortcut is taken, only one iteration is executed, no cleanup for faster exit, like before.
A public API, lld::safeLldMain(), is also available when using LLD as a library.

Differential Revision: https://reviews.llvm.org/D70378
2020-09-24 15:07:50 -04:00
Victor Huang 967e29ff8c [LLD][PowerPC][test] Update thunk range error report for PPC64PCRelLongBranchThunk
Update the thunk range error report for PPC64PCRelLongBranchThunk and add a range
error test case for PPC64R12SetupStub.

Differential Revision: https://reviews.llvm.org/D87381
2020-09-22 07:37:54 -05:00
Stefan Pintilie c0071862bb [PowerPC] Add support for R_PPC64_GOT_TPREL_PCREL34 used in TLS Initial Exec
Add Thread Local Storage Initial Exec support to LLD.

This patch adds the computation for the relocations as well as the relaxation from Initial Exec to Local Exec.

Initial Exec:
```
pld r9, x@got@tprel@pcrel
add r9, r9, x@tls@pcrel
```
or
```
pld r9, x@got@tprel@pcrel
lbzx r10, r9, x@tls@pcrel
```
Note that @tls@pcrel is actually encoded as R_PPC64_TLS with a one byte displacement.

For the above examples relaxing Intitial Exec to Local Exec:
```
paddi r9, r9, x@tprel
nop
```
or
```
paddi r9, r13, x@tprel
lbz r10, 0(r9)
```

Reviewed By: nemanjai, MaskRay, #powerpc

Differential Revision: https://reviews.llvm.org/D86893
2020-09-22 05:48:43 -05:00
Jianzhou Zhao 11201315d5 Flush bitcode incrementally for LTO output
Bitcode writer does not flush buffer until the end by default. This is
fine to small bitcode files. When -flto,--plugin-opt=emit-llvm,-gmlt are
used, the final bitcode file is large, for example, >8G. Keeping all
data in memory consumes a lot of memory.

This change allows bitcode writer flush data to disk early when buffered
data size is above some threshold. This is only enabled when lld emits
LLVM bitcode.

One issue to address is backpatching bitcode: subblock length, function
body indexes, meta data indexes need to backfill. If buffer can be
flushed partially, we introduced raw_fd_stream that supports
read/seek/write, and enables backpatching bitcode flushed in disk.

Reviewed-by: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D86905
2020-09-17 03:32:31 +00:00
Fangrui Song 15f0ad2fa2 [ELF] Bump the limit of thunk creation passes from 10 to 15
I have noticed that a 374MiB powerpc64le 'ld.lld' requires 11 passes to link.
There is a ThunkSection (whose parent OutputSection is ".text" of 169MiB) with 12867 thunks.
2020-09-16 14:05:22 -07:00
Andrew Ng 77152a6b7a [LLD][ELF] Optimize linker script filename glob pattern matching NFC
Optimize the filename glob pattern matching in
LinkerScript::computeInputSections() and LinkerScript::shouldKeep().

Add InputFile::getNameForScript() which gets and if required caches the
Inputfile's name used for linker script matching. This avoids the
overhead of name creation that was in getFilename() in LinkerScript.cpp.

Add InputSectionDescription::matchesFile() and
SectionPattern::excludesFile() which perform the glob pattern matching
for an InputFile and make use of a cache of the previous result. As both
computeInputSections() and shouldKeep() process sections in order and
the sections of the same InputFile are contiguous, these single entry
caches can significantly speed up performance for more complex glob
patterns.

These changes have been seen to reduce link time with --gc-sections by
up to ~40% with linker scripts that contain KEEP filename glob patterns
such as "*crtbegin*.o".

Differential Revision: https://reviews.llvm.org/D87469
2020-09-16 10:26:11 +01:00
Stefan Pintilie 65f6810d3a [LLD][PowerPC] Add support for R_PPC64_TPREL34 used in TLS Local Exec
Add Thread Local Storage Local Exec support to LLD. This is to support PC Relative addressing of Local Exec.
The patch teaches LLD to handle:
```
paddi r9, r13, x1@tprel
```
The relocation is:
```
R_PPC_TPREL34
```

Reviewed By: NeHuang, MaskRay

Differential Revision: https://reviews.llvm.org/D86608
2020-09-15 09:06:19 -05:00
Georgii Rymar 4845531fa8 [lib/Object] - Refine interface of ELFFile<ELFT>. NFCI.
`ELFFile<ELFT>` has many methods that take pointers,
though they assume that arguments are never null and
hence could take references instead.

This patch performs such clean-up.

Differential revision: https://reviews.llvm.org/D87385
2020-09-15 11:38:31 +03:00
Fangrui Song 94921e9f8a [ELF] Define a reportRangeError() overload for thunks and tidy up recent PPC64 thunk range errors
Prefer `errorOrWarn` to `fatal` for recoverable errors and graceful degradation
when --noinhibit-exec is specified.

Mention the destination symbol, otherwise the diagnostic is not really actionable.
Two errors are not tested but the patch does not intend to add the coverage.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D87486
2020-09-14 09:55:59 -07:00
Fangrui Song 560188ddcc [ELF][PowerPC] Define NOP as 0x60000000 to tidy up code. NFC
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D87483
2020-09-11 09:20:24 -07:00
Fangrui Song 485f3f35cc [ELF] Make two PPC64.cpp variables constexpr. NFC
Why are they mutable? :)
2020-09-10 14:31:10 -07:00
Andrew Ng 863aa0a37b [LLD][ELF] Fix performance of MarkLive::scanEhFrameSection
MarkLive::scanEhFrameSection is used to retain personality/LSDA
functions when --gc-sections is enabled.

Improve its performance by only iterating over the .eh_frame relocations
that need to be resolved for an EhSectionPiece. This optimization makes
the same assumption as elsewhere in LLD that the .eh_frame relocations
are sorted by r_offset.

This appears to be a performance regression introduced in commit
e6c24299d2 (https://reviews.llvm.org/D59800).

This change has been seen to reduce link time by up to ~50%.

Differential Revision: https://reviews.llvm.org/D87245
2020-09-08 19:32:34 +01:00
Fangrui Song e59d9df774 [ELF] --symbol-ordering-file: optimize a loop 2020-09-07 21:47:30 -07:00
Jessica Clarke bef38e86b4 [ELF] Handle SHT_RISCV_ATTRIBUTES similarly to SHT_ARM_ATTRIBUTES
Currently we treat SHT_RISCV_ATTRIBUTES like a normal section and
concatenate all such input sections, yielding invalid output unless only
a single attributes section is present in the input. Instead, pick the
first as with SHT_ARM_ATTRIBUTES. We do not currently need to condition
our behaviour on the contents, unlike Arm. In future, we should both do
stricter validation of the input and merge all sections together to
ensure we have, for example, the full arch string requirement, but this
rudimentary implementation is good enough for most common cases.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D86309
2020-09-05 18:36:23 +01:00
Victor Huang bfc7636612 [LLD][PowerPC] Add a pc-rel based long branch thunk
In this patch, a pc-rel based long branch thunk is added for the local
call protocol that caller and callee does not use TOC.

Reviewed By: sfertile, nemanjai

Differential Revision: https://reviews.llvm.org/D86706
2020-08-28 10:40:48 -05:00
Fangrui Song 25863cc512 [ELF] .note.gnu.property: error for invalid pr_datasize
A n_type==NT_GNU_PROPERTY_TYPE_0 note encodes a program property.
If pr_datasize is invalid, LLD may crash
(https://github.com/ClangBuiltLinux/linux/issues/1141)

This patch adds some error checking, supports big-endian, and add some tests
for invalid n_descsz.

Differential Revision: https://reviews.llvm.org/D86422
2020-08-25 08:05:39 -07:00
Pavel Labath 3d1b0000f9 [lld] s/dyn_cast/isa in InputSection.cpp
Avoids a -Wunused-variable with gcc.
2020-08-24 11:45:30 +02:00
Stefan Pintilie 02e02f5398 [LLD][PowerPC] Add check in LLD to produce an error for missing TLSGD/TLSLD
The function `__tls_get_addr` is used to get the address of an object that is Thread Local Storage.
It needs to have two relocations on it.
One relocation is for the function call itself and it is either R_PPC64_REL24 or R_PPC64_REL24_NOTOC.
The other is R_PPC64_TLSGD or R_PPC64_TLSLD for the symbol that is having its address computed.

In the early days of the transition from the ELFv1 ABI that is used for big endian PowerPC Linux distributions to the ELFv2 ABI that is used for little endian PowerPC Linux distributions, there was some ambiguity in the specification of the relocations for TLS. The GNU linker has implemented support for correct handling of calls to __tls_get_addr with a missing relocation. Unfortunately, we didn't notice that the IBM XL compiler did not handle TLS according to the updated ABI until we tried linking XL compiled libraries with LLD. As a result, there is a lot of code out there in various libraries compiled with XL that have this problem.

This patch adds a new error check in LLD that makes sure calls to `__tls_get_addr` are not missing the TLSGD/TLSLD relocation.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D85994
2020-08-21 12:56:12 -05:00
Fangrui Song 9670029b6b [ELF] Keep st_type for symbol assignment
PR46970: for `alias = aliasee`, the alias can be used in relocation processing
and on ARM st_type does affect Thumb interworking. It is thus desirable for the
alias to get the same st_type.

Note that the st_size field should not be inherited because some tools use
st_size=0 as a heuristic to detect aliases. Retaining st_size can thwart such
heuristics and cause aliases to be preferred over the original symbols.

Differential Revision: https://reviews.llvm.org/D86263
2020-08-20 16:05:27 -07:00
Fangrui Song ec29538af2 [ELF] Assign file offsets of non-SHF_ALLOC after SHF_ALLOC and set sh_addr=0 to non-SHF_ALLOC
* GNU ld places non-SHF_ALLOC sections after SHF_ALLOC sections. This has the
  advantage that the file offsets of a non-SHF_ALLOC cannot be contained in
  a PT_LOAD. This patch matches the behavior.
* For non-SHF_ALLOC non-orphan sections, GNU ld may assign non-zero sh_addr and
  treat them similar to SHT_NOBITS (not advance location counter). This
  is an alternative approach to what we have done in D85100.
  By placing non-SHF_ALLOC sections at the end, we can drop special
  cases in createSection and findOrphanPos added by D85100.

  Different from GNU ld, we set sh_addr to 0 for non-SHF_ALLOC sections. 0
  arguably is better because non-SHF_ALLOC sections don't appear in the memory
  image.

ELF spec says:

> sh_addr - If the section will appear in the memory image of a process, this
> member gives the address at which the section's first byte should
> reside. Otherwise, the member contains 0.

D85100 appeared to take a detour. If we take a combined view on D85100 and this
patch, the overall complexity slightly increases (one more 3-line loop) and
compatibility with GNU ld improves.

The behavior we don't want to match is the special treatment of .symtab
.shstrtab .strtab: they can be matched in LLD but not in GNU ld.

Reviewed By: jhenderson, psmith

Differential Revision: https://reviews.llvm.org/D85867
2020-08-18 09:03:01 -07:00
Fangrui Song e8a11c0558 [ELF] Allow mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER sections and sort within InputSectionDescription
LLD currently does not allow non-contiguous SHF_LINK_ORDER components in an
output section. This makes it infeasible to add SHF_LINK_ORDER to an existing
metadata section if backward compatibility with older object files are
concerned.

We did not allow mixed components (like GNU ld) and D77007 relaxed to allow
non-contiguous SHF_LINK_ORDER components. This patch allows arbitrary mix, with
sorting performed within an InputSectionDescription. For example,
`.rodata : {*(.rodata.foo) *(.rodata.bar)}`, has two InputSectionDescription's.
If there is at least one SHF_LINK_ORDER and at least one non-SHF_LINK_ORDER in
.rodata.foo, they are ordered within `*(.rodata.foo)`: we arbitrarily place
SHF_LINK_ORDER components before non-SHF_LINK_ORDER components (like Solaris ld).

`*(.rodata.bar)` is ordered similarly, but the two InputSectionDescription's
don't interact.  It can be argued that this is more reasonable than the previous
behavior where written order was not respected.

It would be nice if the two different semantics (ordering requirement & garbage
collection) were not overloaded on one section flag, however, it is probably
difficult to obtain a generic flag at this point
(https://groups.google.com/forum/#!topic/generic-abi/hgx_m1aXqUo
"SHF_LINK_ORDER's original semantics make upgrade difficult").

(Actually, without the GC semantics, SHF_LINK_ORDER would still have the
sh_link!=0 & sh_link=0 issue. It is just that people find the GC semantics more
useful and tend to use the feature more often.)

GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=16833

Differential Revision: https://reviews.llvm.org/D84001
2020-08-17 11:29:05 -07:00
Fangrui Song 661c089a40 [ELF] Enforce two-dash form for some LLD specific options and the newer --[no-]pcrel-optimize
Since -[no-]toc-optimize has not ever been used, we can enforce the two-dash form as well.
2020-08-17 10:00:31 -07:00
Nemanja Ivanovic cddb0dbcef [LLD][PowerPC] Implement GOT to PC-Rel relaxation
This patch implements the handling for the R_PPC64_PCREL_OPT relocation as well
as the GOT relocation for the associated R_PPC64_GOT_PCREL34 relocation.

On Power10 targets with PC-Relative addressing, the linker can relax
GOT-relative accesses to PC-Relative under some conditions. Since the sequence
consists of a prefixed load, followed by a non-prefixed access (load or store),
the linker needs to replace the first instruction (as the replacement
instruction will be prefixed). The compiler communicates to the linker that
this optimization is safe by placing the two aforementioned relocations on the
GOT load (of the address).
The linker then does two things:

- Convert the load from the got into a PC-Relative add to compute the address
  relative to the PC
- Find the instruction referred to by the second relocation (R_PPC64_PCREL_OPT)
  and replace the first with the PC-Relative version of it

It is important to synchronize the mapping from legacy memory instructions to
their PC-Relative form. Hence, this patch adds a file to be included by both
the compiler and the linker so they're always in agreement.

Differential revision: https://reviews.llvm.org/D84360
2020-08-17 09:36:09 -05:00
Victor Huang 7b391245d8 [PowerPC] Fix thunk alignment issue when using pc-rel instruction
Thunk alignment is added in thie patch when using pc-rel instructions
to avoid crossing the 64 byte boundary.

Patched by: nemanjai, NeHuang
Reviewed By: sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D85973
2020-08-17 09:09:36 -05:00
Georgii Rymar c135a68d42 [LLD][ELF] - Do not produce an invalid dynamic relocation order with --shuffle-sections.
Normally (when not on android with android relocation packing enabled),
we put IRelative relocations to ".rel[a].dyn", after other relocations,
to ensure that IRelatives are processed last by the dynamic loader.

To achieve that we add the `in.relaIplt` after the `part.relaDyn`:
https://github.com/llvm/llvm-project/blob/master/lld/ELF/Writer.cpp#L540

The problem is that `--shuffle-sections` might break the sections order.
This patch fixes it.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47056.

Differential revision: https://reviews.llvm.org/D85651
2020-08-17 14:46:52 +03:00
Fangrui Song b358daddea [ELF] Re-initialize InputFile::isInGroup so that elf::link can be called more than once 2020-08-14 15:38:41 -07:00
Fangrui Song fb141292f4 [ELF] --gdb-index: skip SHF_GROUP .debug_info
-gdwarf-5 -fdebug-types-section may produce multiple .debug_info sections.  All
except one are type units (.debug_types before DWARF v5). When constructing
.gdb_index, we should ignore these type units. We use a simple heuristic: the
compile unit does not have the SHF_GROUP flag. (This needs to be revisited if
people place compile unit .debug_info in COMDAT groups.)

This issue manifests as a data race: because an object file may have multiple
.debug_info sections, we may concurrently construct `LLDDwarfObj` for the same
file in multiple threads. The threads may access `InputSectionBase::data()`
concurrently on the same input section. `InputSectionBase::data()` does a lazy
uncompress() and rewrites the member variable `rawData`. A thread running zlib
`inflate()` (transitively called by uncompress()) on a buffer with `rawData`
tampered by another thread may fail with `uncompress failed: zlib error: Z_DATA_ERROR`.

Even if no data race occurred in an optimistic run, if there are N .debug_info,
one CU entry and its address ranges will be replicated N times. The result
.gdb_index can be much larger than a correct one.

The new test gdb-index-dwarf5-type-unit.s actually has two compile units. This
cannot be produced with regular approaches (it can be produced with -r
--unique). This is used to demonstrate that the .gdb_index construction code
only considers the last non-SHF_GROUP .debug_info

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D85579
2020-08-13 09:11:01 -07:00
Fangrui Song 88498f44df [ELF] -r: allow SHT_X86_64_UNWIND to be merged into SHT_PROGBITS
* For .cfi_*, GCC/GNU as emits SHT_PROGBITS type .eh_frame sections.
* Since rL252300, clang emits SHT_X86_64_UNWIND type .eh_frame sections
  (originated from Solaris, documented in the x86-64 psABI).
* Some assembly use `.section .eh_frame,"a",@unwind` to generate
  SHT_X86_64_UNWIND .eh_frame sections.

In a non-relocatable link, input .eh_frame are combined and there is
only one SyntheticSection .eh_frame in the output section, so the
"section type mismatch" diagnostic does not fire.

In a relocatable link, there is no SyntheticSection .eh_frame. .eh_frame of
mixed types can trigger the diagnostic. This patch fixes it by adding another
special case 0x70000001 (= SHT_X86_64_UNWIND) to canMergeToProgbits().

    ld.lld -r gcc.o clang.o => error: section type mismatch for .eh_frame

There was a discussion "RFC: Usefulness of SHT_X86_64_UNWIND" on the x86-64-abi
mailing list. Folks are not wild about making the psABI value 0x70000001 into
gABI, but a few think defining 0x70000001 for .eh_frame may be a good idea for a
new architecture.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D85785
2020-08-13 08:14:45 -07:00
Fangrui Song e973c1375e [ELF] Move the outSecOff addend from relocAlloc/relocNonAlloc/... to InputSectionBase::relocate
For an InputSection, the `buf` argument of `InputSectionBase::relocate` points
to the content of the containing OutputSection, instead of the content of the
InputSection itself, so `outSecOff` needs to be added in its callees.  This is
counter-intuitive and leads to many `- outSecOff` and `+ outSecOff`.

This patch makes `InputSection::writeTo` call `InputSectionBase::relocate` with
`outSecOff` added. relocAlloc/relocNonAlloc/relocateNonAllocForRelocatable can
thus be simplified now.

Updated test:

* non-abs-reloc.s: A minor offset bug is fixed for a diagnostic in `relocateNonAlloc`

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D85618
2020-08-11 08:06:38 -07:00
Fangrui Song 0334578edc [ELF] --wrap: don't leave the original symbol as SHN_UNDEF in .symtab or .dynsym 2020-08-08 18:18:20 -07:00
Fangrui Song 99cd56906a [ELF] --wrap: set isUsedInRegularObj of __wrap_ if it is defined or shared
Fixes PR47017 (a regression when fixing PR46169): if __wrap_ is shared,
it is not exported.
2020-08-08 09:24:31 -07:00
Fangrui Song d30d461938 [ELF] Support .cfi_signal_frame
glibc/sysdeps/unix/sysv/linux/x86_64/sigaction.c libc.a(sigaction.o) has a CIE
with the augmentation string "zRS". Support 'S' to allow --icf={safe,all}.
2020-08-07 22:08:44 -07:00
Fangrui Song 164a02d0fa [ELF]: --icf: don't fold sections referencing sections with LCDA after D84610 2020-08-07 13:42:25 -07:00
Victor Huang 6c64f05b90 [PowerPC] Add compatibility check for PPC PLT stubs
Compatibility checks for PPC64PltCallStub and PPC64PCRelPLTStub are
added in this patch to prevent the usage of incompatible thunk/stub.

Reviewed By: sfertile, nemanjai, stefanp

Differential Revision: https://reviews.llvm.org/D85459
2020-08-07 13:45:18 +00:00
Fangrui Song 004be4037e [ELF] Change tombstone values to (.debug_ranges/.debug_loc) 1 and (other .debug_*) 0
tl;dr See D81784 for the 'tombstone value' concept. This patch changes our behavior to be almost the same as GNU ld (except that we also use 1 for .debug_loc):

* .debug_ranges & .debug_loc: 1 (LLD<11: 0+addend; GNU ld uses 1 for .debug_ranges)
* .debug_*: 0 (LLD<11: 0+addend; GNU ld uses 0; future LLD: -1)

We make the tweaks because:

1) The new tombstone is novel and needs more time to be adopted by consumers before it's the default.
2) The old (gold) strategy had problems with zero-length functions - so rather than going back that, we're going to the GNU ld strategy which doesn't have that problem.
3) One slight tweak to (2) is to apply the .debug_ranges workaround to .debug_loc for the same reasons it applies to debug_ranges - to avoid terminating lists early.

-----

http://lists.llvm.org/pipermail/llvm-dev/2020-July/143482.html

The tombstone value -1 in .debug_line caused problems to lldb (fixed by D83957;
will be included in 11.0.0) and breakpad (fixed by
https://crrev.com/c/2321300). It may potentially affects other DWARF consumers.

For .debug_ranges & .debug_loc: 1, an argument preferring 1 (GNU ld for .debug_ranges) over -2 is that:
```
{-1, -2}    <<< base address selection entry
{0, length} <<< address range
```
may create a situation where low_pc is greater than high_pc. So we use
1, the GNU ld behavior for .debug_ranges

For other .debug_* sections, there haven't been many reports. One issue is that
bloaty (src/dwarf.cc) can incorrectly count address ranges in .debug_ranges . To
reduce similar disruption, this patch changes the tombstone values to be similar to GNU ld.

This does mean another behavior change to the default trunk behavior. Sorry
about it. The default trunk behavior will be similar to release/11.x while we work on a transition plan for LLD users.

Reviewed By: dblaikie, echristo

Differential Revision: https://reviews.llvm.org/D84825
2020-08-06 15:30:08 -07:00
Fangrui Song a6db64ef4a [ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD
GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD
(PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to
SHF_ALLOC NOBITS sections. The location counter is not advanced).

This patch tries to fix PR37607 (remove a special case in
`Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot
reset dot to 0 for a middle non-SHF_ALLOC output section. This results in
removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC
non-orphan sections can have non-zero addresses like in GNU ld.

The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan
only. This results in a special case in createSection and findOrphanPos, respectively.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D85100
2020-08-06 08:27:15 -07:00
Muhammad Omair Javaid d9e191cb17 Revert "[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD"
This reverts commit 030ddc0a0b.

This breaks http://lab.llvm.org:8011/builders/lldb-arm-ubuntu
and http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu

Differential Revision: https://reviews.llvm.org/D85100
2020-08-06 16:30:05 +05:00
Fangrui Song 279e4cf782 [ELF] Fix type of ciesWithLSDA after D84610 2020-08-05 16:33:54 -07:00
Fangrui Song b216c80cc2 [ELF] Allow SHF_LINK_ORDER sections to have sh_link=0
Part of https://bugs.llvm.org/show_bug.cgi?id=41734

The semantics of SHF_LINK_ORDER have been extended to represent metadata
sections associated with some other sections (usually text).

The associated text section may be discarded (e.g. LTO) and we want the
metadata section to have sh_link=0 (D72899, D76802).

Normally the metadata section is only referenced by the associated text
section. sh_link=0 means the associated text section is discarded, and
the metadata section will be garbage collected. If there is another
section (.gc_root) referencing the metadata section, the metadata
section will be retained. It's the .gc_root consumer's job to validate
the metadata sections.

  # This creates a SHF_LINK_ORDER .meta with sh_link=0
  .section .meta,"awo",@progbits,0
  1:
  .section .meta,"awo",@progbits,foo
  2:

  .section .gc_root,"a",@progbits
  .quad 1b
  .quad 2b

Reviewed By: pcc, jhenderson

Differential Revision: https://reviews.llvm.org/D72904
2020-08-05 16:17:42 -07:00
Fangrui Song 030ddc0a0b [ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD
GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD
(PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to
SHF_ALLOC NOBITS sections. The location counter is not advanced).

This patch tries to fix PR37607 (remove a special case in
`Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot
reset dot to 0 for a middle non-SHF_ALLOC output section. This results in
removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC
non-orphan sections can have non-zero addresses like in GNU ld.

The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan
only. This results in a special case in createSection and findOrphanPos, respectively.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D85100
2020-08-05 09:30:23 -07:00
Fangrui Song 21b4f8060a [ELF] --icf: don't fold text sections with LSDA
Fix PR36272 and PR46835

A .eh_frame FDE references a text section and (optionally) a LSDA (in
.gcc_except_table).  Even if two text sections have identical content and
relocations (e.g. a() and b()), we cannot fold them if their LSDA are different.

```
void foo();
void a() {
  try { foo(); } catch (int) { }
}
void b() {
  try { foo(); } catch (float) { }
}
```

Scan .eh_frame pieces with LSDA and disallow referenced text sections to be
folded. If two .gcc_except_table have identical semantics (usually identical
content with PC-relative encoding), we will lose folding opportunity.
For ClickHouse (an exception-heavy application), this can reduce --icf=all efficiency
from 9% to 5%. There may be some percentage we can reclaim without affecting
correctness, if we analyze .eh_frame and .gcc_except_table sections.

gold 2.24 implemented a more complex fix (resolution to
https://sourceware.org/bugzilla/show_bug.cgi?id=21066) which combines the
checksum of .eh_frame CIE/FDE pieces.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D84610
2020-08-05 09:16:28 -07:00
Fangrui Song acb66b9111 [ELF] --oformat=binary: use LMA to compute file offsets
--oformat=binary is rare (used in a few places in FreeBSD, see `stand/i386/mbr/Makefile` `LDFLAGS_BIN`)
The result should be identical to a normal output transformed by `objcopy -O binary`.

The current implementation ignores addresses and lays out sections by
respecting output section alignments. It can fail when an output section
address is specified, e.g. `.rodata ALIGN(16) :` (PR33651).

Fix PR33651 by respecting LMA. The code is similar to
`tools/llvm-objcop/ELF/Object.cpp` BinaryWriter::finalize after D71035 and D79229.
Unforunately for an output section without PT_LOAD, we assume its LMA is equal
to its VMA. So the result is still incorrect when an output section LMA
(`AT(...)`) is specified

Also drop `alignTo(off, config->wordsize)`. GNU ld does not round up the file size.

Differential Revision: https://reviews.llvm.org/D85086
2020-08-05 09:10:01 -07:00
Petr Hosek 81eeabbd97 [ELF] Add --dependency-file option
Clang and GCC have a feature (-MD flag) to create a dependency file
in a format that build systems such as Make or Ninja can read, which
specifies all the additional inputs such .h files.

This change introduces the same functionality to lld bringing it to
feature parity with ld and gold which gained this feature recently.
See https://sourceware.org/bugzilla/show_bug.cgi?id=22843 for more
details and discussion.

The implementation corresponds to -MD -MP compiler flag where the
generated dependency file also includes phony targets which works
around the errors where the dependency is removed. This matches the
format used by ld and gold.

Fixes PR42806

Differential Revision: https://reviews.llvm.org/D82437
2020-08-03 16:59:13 -07:00