Commit Graph

6365 Commits

Author SHA1 Message Date
Fangrui Song 49279ca160 [ELF] Improve --export-dynamic-symbol performance by checking whether wildcard is really used
A hasWildcard pattern iterates over symVector, which can be slow when there
are many --export-dynamic-symbol. In optimistic cases, most patterns don't use
a wildcard character. hasWildcard: false can avoid a symbol table iteration.

While here, add two tests using `[` and `?`, respectively.
2020-06-17 17:12:10 -07:00
Hongtao Yu 2638aafe12 [LLD][ThinLTO] Add --thinlto-single-module to allow compiling partial modules.
This change introduces an LLD switch --thinlto-single-module to allow compiling only a part of the input modules. This is specifically enables:

  1. Fast investigating/debugging modules of interest without spending time on compiling unrelated modules.
  2. Compiler debug dump with -mllvm -debug-only= for specific modules.

It will be useful for large applications which has 1K+ input modules for thinLTO.

The switch can be combined with `--lto-obj-path=` or `--lto-emit-asm` to obtain intermediate object files or assembly files. So far the module name matching is implemented as a fuzzy name lookup where the modules with name containing the switch value are compiled.

E.g,
Command:
     ld.lld main.o thin.a --thinlto-single-module=thin.a --lto-obj-path=single.o
log:
     [ThinLTO] Selecting thin.a(thin1.o at 168) to compile
     [ThinLTO] Selecting thin.a(thin2.o at 228) to compile
Command:
     ld.lld main.o thin.a --thinlto-single-module=thin1.o --lto-obj-path=single.o
log:
     [ThinLTO] Selecting thin.a(thin1.o at 168) to compile

Differential Revision: https://reviews.llvm.org/D80406
2020-06-10 15:32:30 -07:00
Fangrui Song b114e134bd [ELF] Fix --thinlto-index-only regression after D79300
After D79300, we don't rewrite InputFile::mb to an empty buffer.
In thinLTOCreateEmptyIndexFiles(), we should check LazyObjFile::fetched
as well as checking whether mb is a bitcode, otherwise we would overwrite (path + .thinlto.bc) with an empty index.
2020-06-09 23:10:30 -07:00
Fangrui Song ba890da287 [ELF] Demote lazy symbols relative to a discarded section to Undefined
Fixes PR45594.

In `ObjFile<ELFT>::initializeSymbols()`, for a defined symbol relative to
a discarded section (due to section group rules), it may have been
inserted as a lazy symbol. We need to demote it to an Undefined to
enable the `discarded section` error happened in a later pass.

Add `LazyObjFile::fetched` (if true) and `ArchiveFile::parsed` (if
false) to represent that there is an ongoing lazy symbol fetch and we
should replace the current lazy symbol with an Undefined, instead of
calling `Symbol::resolve` (`Symbol::resolve` should be called if the lazy
symbol was added by an unrelated archive/lazy object).

As a side result, one small issue in start-lib-comdat.s is now fixed.
The hack motivating D51892 will be unsupported: if
`.gnu.linkonce.t.__i686.get_pc_thunk.bx` in an archive is referenced
by another section, this will likely be errored unless the function is
also defined in a regular object file.
(Bringing back rL330869 would error `undefined symbol` instead of the
more relevant `discarded section`.)

Note, glibc i386's crti.o still works (PR31215), because
`.gnu.linkonce.t.__x86.get_pc_thunk.bx` is in crti.o (one of the first
regular object files in a linker command line).

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D79300
2020-06-09 11:27:34 -07:00
Fangrui Song ac6abc99e2 [ELF] Don't cause assertion failure if --dynamic-list or --version-script takes an empty file
Fixes PR46184
Report line 1 of the last memory buffer.
2020-06-05 15:59:54 -07:00
Fangrui Song 7bee6e30fe [ELF] Handle -u before input files
If both a.a and b.so define foo

```
ld.bfd -u foo a.a b.so  # foo is defined
ld.bfd a.a b.so -u foo  # foo is defined
ld.bfd -u foo b.so a.a  # foo is undefined (provided at runtime by b.so)
ld.bfd b.so a.a -u foo  # foo is undefined (provided at runtime by b.so)
```

In all cases we make foo undefined in the output.  I tend to think the
GNU ld behavior makes more sense.

* In their model, they have to treat -u as a fake object file with an
  undefined symbol before all input files, otherwise the first archive would not be fetched.
* Following their behavior allows us to drop a --warn-backrefs special case.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D81052
2020-06-05 08:44:38 -07:00
Fangrui Song 3eb4bf13ba [ELF] Append " [--no-allow-shlib-undefined]" to the corresponding diagnostics
--no-allow-shlib-undefined (enabled by default when linking an
executable) rejects unresolved references in shared objects.

Users may be confused by the common diagnostics of unresolved symbols in
object files (LLD: "undefined symbol: foo"; GNU ld/gold: "undefined reference to")

Learn from GCC/clang " [-Wfoo]": append the option name to the
diagnostics. Users can find relevant information by searching
"--no-allow-shlib-undefined".  It should also be obvious to them that
the positive form --allow-shlib-undefined can suppress the error.

Also downgrade the error to a warning if --noinhibit-exec is used (compatible
with GNU ld and gold).

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D81028
2020-06-03 07:59:37 -07:00
Sriraman Tallam e0bca46b08 Options for Basic Block Sections, enabled in D68063 and D73674.
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.

+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.

Differential Revision: https://reviews.llvm.org/D68049
2020-06-02 00:23:32 -07:00
Fangrui Song a6ae333a0c [ELF] --wrap: don't error `undefined reference to __real_foo` (--no-allow-shlib-undefined) if foo is a wrapped definition
This is a regression after D51283.

Also, export `foo` if `__real_foo` is referenced by a shared object.
2020-06-01 23:00:51 -07:00
Fangrui Song 751f18e7d4 [ELF] Refine --export-dynamic-symbol semantics to be compatible GNU ld 2.35
GNU ld from binutils 2.35 onwards will likely support
--export-dynamic-symbol but with different semantics.
https://sourceware.org/pipermail/binutils/2020-May/111302.html

Differences:

1. -export-dynamic-symbol is not supported
2. --export-dynamic-symbol takes a glob argument
3. --export-dynamic-symbol can suppress binding the references to the definition within the shared object if (-Bsymbolic or -Bsymbolic-functions)
4. --export-dynamic-symbol does not imply -u

I don't think the first three points can affect any user.
For the fourth point, Not implying -u can lead to some archive members unfetched.
Add -u foo to restore the previous behavior.

Exact semantics:

* -no-pie or -pie: matched non-local defined symbols will be added to the dynamic symbol table.
* -shared: matched non-local STV_DEFAULT symbols will not be bound to definitions within the shared object
  even if they would otherwise be due to -Bsymbolic, -Bsymbolic-functions, or --dynamic-list.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D80487
2020-06-01 11:30:03 -07:00
Fangrui Song ee9a251caf [ELF] Set DF_1_PIE for -pie
DF_1_PIE originated from Solaris (https://docs.oracle.com/cd/E36784_01/html/E36857/chapter6-42444.html ).
GNU ld since
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=5fe2850dd96483f176858fd75c098313d5b20bc2
sets the flag on non-Solaris platforms.

It can help distinguish PIE from ET_DYN.
eu-classify from elfutils uses this to recognize PIE (https://sourceware.org/git/?p=elfutils.git;a=commit;h=3f489b5c7c78df6d52f8982f79c36e9a220e8951 )

glibc uses this flag to reject dlopen'ing a PIE (https://sourceware.org/bugzilla/show_bug.cgi?id=24323 )

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D80872
2020-06-01 10:19:41 -07:00
Fangrui Song 881c5eef98 [ELF] Add -z rel and -z rela
LLD supports both REL and RELA for static relocations, but emits either
of REL and RELA for dynamic relocations. The relocation entry format is
specified by each psABI.

musl ld.so supports both REL and RELA. For such ld.so implementations,
REL (.rel.dyn .rel.plt) has size benefits even if the psABI chooses RELA:
sizeof(Elf64_Rel)=16 < sizeof(Elf64_Rela)=24.

* COPY, GLOB_DAT and J[U]MP_SLOT always have 0 addend. A ld.so
  implementation does not need to read the implicit addend.
  REL is strictly better.
* A RELATIVE has a non-zero addend. Such relocations can be packed
  compactly with the RELR relocation entry format, which is out of scope
  of this patch.
* For other dynamic relocation types (e.g. symbolic relocation R_X86_64_64),
  a ld.so implementation needs to read the implicit addend. REL may have
  minor performance impact, because reading implicit addends forces
  random access reads instead of being able to blast out a bunch of
  writes while chasing the relocation array.

This patch adds -z rel and -z rela to change the relocation entry format
for dynamic relocations. I have tested that a -z rel produced x86-64
executable works with musl ld.so

-z rela may be useful for debugging purposes on processors whose psABIs
specify REL as the canonical format: addends can be easily read by a tool.

Reviewed By: grimar, mcgrathr

Differential Revision: https://reviews.llvm.org/D80496
2020-05-29 14:22:03 -07:00
Rui Ueyama 54d2896852 [ELF] --wrap: Drop __real_ symbol from the symbol table
In D34993, we discussed and concluded that we should drop `__real_
symbol from the symbol table, but I did the opposite in D50569.
This patch is to drop `__real_` symbol.

MaskRay's note: omitting `__real_` is important if it is undefined:
otherwise a subsequent link may error due to the undefined `__real_` in .dynsym

Differential Revision: https://reviews.llvm.org/D51283
2020-05-27 16:58:00 -07:00
Fangrui Song b8a3c618d6 [ELF] Allow misaligned SHT_GNU_verneed
Bazel created interface shared objects (.ifso) may be misaligned.  We use
llvm::support::detail::packed_endian_specific_integral under the hood
which allows reading of misaligned values, so there is not a need to
diagnose (in LLD we don't intend to support sophisticated parsing for
SHT_GNU_*).
2020-05-26 11:18:19 -07:00
Fangrui Song bae7cf6746 [ELF][PPC64] Synthesize _savegpr[01]_{14..31} and _restgpr[01]_{14..31}
In the 64-bit ELF V2 API Specification: Power Architecture, 2.3.3.1. GPR
Save and Restore Functions defines some special functions which may be
referenced by GCC produced assembly (LLVM does not reference them).

With GCC -Os, when the number of call-saved registers exceeds a certain
threshold, GCC generates `_savegpr0_* _restgpr0_*` calls and expects the
linker to define them. See
https://sourceware.org/pipermail/binutils/2002-February/017444.html and
https://sourceware.org/pipermail/binutils/2004-August/036765.html . This
is weird because libgcc.a would be the natural place. However, the linker
generation approach has the advantage that the linker can generate
multiple copies to avoid long branch thunks. We don't consider the
advantage significant enough to complicate our trunk implementation, so
we take a simple approach.

* Check whether `_savegpr0_{14..31}` are used
* If yes, define needed symbols and add an InputSection with the code sequence.

`_savegpr1_*` `_restgpr0_*` and `_restgpr1_*` are similar.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D79977
2020-05-26 09:35:41 -07:00
Fangrui Song e32f04cdc9 [ELF] Parse SHT_GNU_verneed and respect versioned undefined symbols in shared objects
An undefined symbol in a shared object can be versioned, like `f@v1`.
We currently insert `f` as an Undefined into the symbol table, but we
should insert `f@v1` instead.

The string `v1` is inferred from SHT_GNU_versym and SHT_GNU_verneed.
This patch implements the functionality.

Failing to do this can cause two issues:

* If a versioned symbol referenced by a shared object is defined in the
  executable, we will fail to export it.
* If a versioned symbol referenced by a shared object in another object
  file, --no-allow-shlib-undefined may spuriously report an
  "undefined reference to " error. See https://bugs.llvm.org/show_bug.cgi?id=44842
  (Linking -lfftw3 -lm on Arch Linux can cause
  `undefined reference to __log_finite`)

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D80059
2020-05-23 09:55:48 -07:00
Fangrui Song 6467649974 [ELF] Make --trace-symbol track preempted shared definitions
Note, we still name a preempted SharedSymbol "shared definition",
instead of "reference" as printed by GNU ld. This difference should not matter.

```
// GNU ld
ld.bfd: t: definition of f@v1
ld.bfd: t.so: reference to f@v1
```

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D80143
2020-05-19 08:56:35 -07:00
Hongtao Yu 90af55d8a9 [LLD][ELF] Use offset in thin archives to disambiguate thinLTO members
This is fixing a thinLTO module collision issue for thin archives. The problem is that we always use a zero offset to name members in a thin archive and that causes the following build error:

    ld.lld: error: Expected at most one ThinLTO module per bitcode file

which happens to a thin archive that has two members with the same object file name (whose paths will be ignored by thinLTO driver)

The fix here is to use real member offset instead as is done for non-thin archives.

Differential Revision: https://reviews.llvm.org/D79880
2020-05-15 12:02:08 -07:00
Fangrui Song e36223c85c [ELF] Enforce two dashes for Flag options not supported by GNU ld (i.e. no compatibility burden)
Announced on https://lists.llvm.org/pipermail/llvm-dev/2020-May/141416.html

Similar to D79371, but for `multiclass B` (convenience helper for defining --foo and --no-foo)

Some changed options are also used by gold, but I haven't seen their
one-dash use cases outside of lld's testsuite.
2020-05-15 11:07:25 -07:00
Fangrui Song 07837b8f49 [ELF] Use namespace qualifiers (lld:: or elf::) instead of `namespace lld { namespace elf {`
Similar to D74882. This reverts much code from commit
bd8cfe65f5 (D68323) and fixes some
problems before D68323.

Sorry for the churn but D68323 was a mistake. Namespace qualifiers avoid
bugs where the definition does not match the declaration from the
header. See
https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions (D74515)

Differential Revision: https://reviews.llvm.org/D79982
2020-05-15 08:49:53 -07:00
Peter Smith 0ae7990b60 [ELF][ARM] Support /DISCARD/ of subset of .ARM.exidx sections
Both the .ARM.exidx and .eh_frame sections have a custom SyntheticSection
that acts as a container for the InputSections. The InputSections are added
to the SyntheticSection prior to /DISCARD/ which limits the affect a
/DISCARD/ can have to the whole SyntheticSection. In the majority of cases
this is sufficient as it is not common to discard subsets of the
InputSections. The Linux kernel has one of these scripts which has something
like:
/DISCARD/ : { *(.ARM.exidx.exit.text) *(.ARM.extab.exit.text) ... }
The .ARM.exidx.exit.text are not discarded because the InputSection has been
transferred to the Synthetic Section. The *(.ARM.extab.exit.text) sections
have not so they are discarded. When we come to write out the .ARM.exidx
sections the dangling references from .ARM.exidx.exit.text to
.ARM.extab.exit.text currently cause relocation out of range errors, but
could as easily cause a fatal error message if we check for dangling
references at relocation time.

This patch attempts to respect the /DISCARD/ command by running it on the
.ARM.exidx InputSections stored in the SyntheticSection.

The .eh_frame is in theory affected by this problem, but I don't think that
there is a dangling reference problem that can happen with these sections.

Fixes remaining part of pr44824

Differential Revision: https://reviews.llvm.org/D79687
2020-05-11 14:27:13 +01:00
Wei Mi 538208f6c0 [lld] Add a new output section ".text.unknown" for funtions with unknown hotness
For sampleFDO, because the optimized build uses profile generated from previous
release, often we couldn't tell a function without profile was truely cold or
just newly created so we had to treat them conservatively and put them in .text
section instead of .text.unlikely. The result was when we persue the best
performance by locking .text.hot and .text in memory, we wasted a lot of memory
to keep cold functions inside. This problem has been largely solved for regular
sampleFDO using profile-symbol-list (https://reviews.llvm.org/D66374), but for
the case when we use partial profile, we still waste a lot of memory because
of it.

In https://reviews.llvm.org/D62540, we propose to save functions with unknown
hotness information in a special section called ".text.unknown", so that
compiler will treat those functions as luck-warm, but runtime can choose not
to mlock the special section in memory or use other strategy to save memory.
That will solve most of the memory problem even if we use a partial profile.

The patch adds the support in lld for the special section.For sampleFDO,
because the optimized build uses profile generated from previous release,
often we couldn't tell a function without profile was truely cold or just
newly created so we had to treat them conservatively and put them in .text
section instead of .text.unlikely. The result was when we persue the best
performance by locking .text.hot and .text in memory, we wasted a lot of
memory to keep cold functions inside. This problem has been largely solved
for regular sampleFDO using profile-symbol-list
(https://reviews.llvm.org/D66374), but for the case when we use partial
profile, we still waste a lot of memory because of it.

In https://reviews.llvm.org/D62540, we propose to save functions with unknown
hotness information in a special section called ".text.unknown", so that
compiler will treat those functions as luck-warm, but runtime can choose not
to mlock the special section in memory or use other strategy to save memory.
That will solve most of the memory problem even if we use a partial profile.

The patch adds the support in lld for the special section.

Differential Revision: https://reviews.llvm.org/D79590
2020-05-08 11:14:48 -07:00
Fangrui Song e20a215992 [ELF] Add convenience TableGen classes to enforce two dashes for options not supported by GNU ld
Announced on https://lists.llvm.org/pipermail/llvm-dev/2020-May/141416.html

For many options, we have to support either one or two dash to be
compatible with GNU ld. For newer and lld specific options, we can enforce strict double dashes.

Affected options:

* --thinlto-*
* --lto-*
* --shuffle-sections=

This patch does not change `-plugin-opt=*` because clang driver passes
`-plugin-opt=*` and I don't intend to cause churn.

In 2000, GNU ld tried something similar with --omagic
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4897a3288f37d5f69e8acd256a6e83e607fe8d8

Reviewed By: tejohnson, psmith

Differential Revision: https://reviews.llvm.org/D79371
2020-05-08 07:37:06 -07:00
Reid Kleckner 932f0276ea [Support] Move LLD's parallel algorithm wrappers to support
Essentially takes the lld/Common/Threads.h wrappers and moves them to
the llvm/Support/Paralle.h algorithm header.

The changes are:
- Remove policy parameter, since all clients use `par`.
- Rename the methods to `parallelSort` etc to match LLVM style, since
  they are no longer C++17 pstl compatible.
- Move algorithms from llvm::parallel:: to llvm::, since they have
  "parallel" in the name and are no longer overloads of the regular
  algorithms.
- Add range overloads
- Use the sequential algorithm directly when 1 thread is requested
  (skips task grouping)
- Fix the index type of parallelForEachN to size_t. Nobody in LLVM was
  using any other parameter, and it made overload resolution hard for
  for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t.

Remove Threads.h and update LLD for that.

This is a prerequisite for parallel public symbol processing in the PDB
library, which is in LLVM.

Reviewed By: MaskRay, aganea

Differential Revision: https://reviews.llvm.org/D79390
2020-05-05 15:21:05 -07:00
Sid Manning 0e6536fd97 [Hexagon] Add R_HEX_GD_PLT_B22/32_PCREL relocations
Extended versions of GD_PLT_B22_PCREL. These surface when -mlong-calls
is used.

Differential Revision: https://reviews.llvm.org/D79191
2020-05-05 11:47:51 -05:00
Peter Smith 48aebfc908 [ELF][ARM] Do not create .ARM.exidx sections for out of range inputs
A linker will create .ARM.exidx sections for InputSections that don't
have them. This can cause a relocation out of range error If the
InputSection happens to be extremely far away from the other sections.
This is often the case for the vector table on older ARM CPUs as the only
two places that the table can be placed is 0 or 0xffff0000. We fix this
by removing InputSections that need a linker generated .ARM.exidx
section if that would cause an error.

Differential Revision: https://reviews.llvm.org/D79289
2020-05-05 09:59:45 +01:00
Zakk Chen ad5fad0ac5 [LTO] Suppress emission of empty combined module by default
Summary:
That unless the user requested an output object (--lto-obj-path), the an
unused empty combined module is not emitted.

This changed is helpful for some target (ex. RISCV-V) which encoded the
ABI info in IR module flags (target-abi). Empty unused module has no ABI
info so the linker would get the linking error during merging
incompatible ABIs.

Reviewers: tejohnson, espindola, MaskRay

Subscribers: emaste, inglorion, arichardson, hiraditya, simoncook, MaskRay, steven_wu, dexonsmith, PkmX, dang, lenary, s.egerton, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78988
2020-05-04 18:31:09 -07:00
Fangrui Song c49f83b6e9 [ELF] Don't advance sh_offset for an empty section whose PT_LOAD is removed (due to p_memsz=0)
removeEmptyPTLoad() removes empty (p_memsz=0) PT_LOAD segments.  In
assignFileOffsets(), setFileOffset() unnecessarily advances file offsets
for containing empty sections.

This is exposed by arm Linux kernel's multi_v5_defconfig
(see https://bugs.llvm.org/show_bug.cgi?id=45632)

```
ld.lld (max-page-size=65536):
  [34] .init.data        PROGBITS        c0c24000 c34000 0128ac 00  WA  0   0 4096
  [35] .text_itcm        PROGBITS        fffe0000 c50000 000000 00  WA  0   0  1
  [36] .data_dtcm        PROGBITS        fffe8000 c58000 000000 00  WA  0   0  1
  [37] .data             PROGBITS        c0c38000 c58000 0647a0 00  WA  0   0 32

arm-linux-gnueabi-ld (max-page-size=65536):
  [23] .init.data        PROGBITS        c0c12000 c22000 0128ac 00  WA  0   0 4096
  [24] .text_itcm        PROGBITS        fffe0000 ca2558 000000 00   W  0   0  1
  [25] .data_dtcm        PROGBITS        fffe8000 ca2558 000000 00   W  0   0  1
  [26] .data             PROGBITS        c0c26000 c36000 0647a0 00  WA  0   0 32
```

This patch clears OutputSection::ptLoad if ptLoad is removed by
removeEmptyPTLoad(). Conceptually this removes "dangling" references.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D79254
2020-05-04 08:07:34 -07:00
Peter Smith 3834385f27 [ELF] Move SHF_LINK_ORDER till OutputSection addresses are known
Sections with the SHF_LINK_ORDER flag must be ordered in the same relative
order as the Sections they have a link to. When using a linker script an
arbitrary expression may be used for the virtual address of the
OutputSection. In some cases the virtual address does not monotonically
increase as the OutputSection index increases, so if we base the ordering
of the SHF_LINK_ORDER sections on the index then we can get the order
wrong. We fix this by moving SHF_LINK_ORDER resolution till after we have
created OutputSection virtual addresses.

Differential Revision: https://reviews.llvm.org/D79286
2020-05-04 14:25:25 +01:00
Fangrui Song b257d3c8a8 [ELF][PPC64] Suppress toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen
The current implementation assumes that R_PPC64_TOC16_HA is always followed
by R_PPC64_TOC16_LO_DS. This can break with R_PPC64_TOC16_LO:

  // Load the address of the TOC entry, instead of the value stored at that address
  addis 3, 2, .LC0@tloc@ha  # R_PPC64_TOC16_HA
  addi  3, 3, .LC0@tloc@l   # R_PPC64_TOC16_LO
  blr

which is used by boringssl's util/fipstools/delocate/delocate.go
https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation.
In short, this tool converts an assembly file to avoid any potential relocations.
The distance to an input .toc is not a constant after linking, so it cannot use an `addis;ld` pair.
Instead, it jumps to a stub which loads the TOC entry address with `addis;addi`.

This patch checks the presence of R_PPC64_TOC16_LO and suppresses
toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen.
This approach is conservative and loses some relaxation opportunities but is easy to implement.

  addis 3, 2, .LC0@toc@ha  # no relaxation
  addi  3, 3, .LC0@toc@l   # no relaxation
  li    9, 0
  addis 4, 2, .LC0@toc@ha  # can relax but suppressed
  ld    4, .LC0@toc@l(4)   # can relax but suppressed

Also note that interleaved R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS is
possible and this patch accounts for that.

  addis 3, 2, .LC1@toc@ha  # can relax
  addis 4, 2, .LC2@toc@ha  # can relax
  ld    3, .LC1@toc@l(3)   # can relax
  ld    4, .LC2@toc@l(4)   # can relax

Reviewed By: #powerpc, sfertile

Differential Revision: https://reviews.llvm.org/D78431
2020-04-30 09:16:51 -07:00
Fangrui Song b912b887d8 [ELF] Add --print-archive-stats=
gold has an option --print-symbol-counts= which prints:

  // For each archive
  archive $archive $members $fetched_members
  // For each object file
  symbols $object $defined_symbols $used_defined_symbols

In most cases, `$defined_symbols = $used_defined_symbols` unless weak
symbols are present. Strangely `$used_defined_symbols` includes symbols defined relative to --gc-sections discarded sections.
The `symbols` lines do not appear to be useful.

`archive` lines are useful: `$fetched_members=0` lines correspond to
unused archives. The information can be used to trim dependencies.

This patch implements --print-archive-stats= which prints the number of
members and the number of fetched members for each archive.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D78983
2020-04-29 18:04:37 -07:00
Fangrui Song e96d7b5e9e [ELF] Add --rosegment to complement --no-rosegment
This option can cancel --no-rosegment and it just seems right to have
a corresponding positive option for a --no-* negative option.

Anecdotally, gold had --rosegment but did not have --no-rosegment.
I added --no-rosegment (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=9a6c68caa9543e09b064b7ac7c2b658f277bc19c) for binutils>=2.35
2020-04-29 18:00:00 -07:00
Fangrui Song 1ccde53342 [ELF] --gdb-index: support .debug_loclists
--gdb-index currently crashes when reading a translation unit with
DWARF v5 .debug_loclists . Call stack:

```
SyntheticSections.cpp GdbIndexSection::create
SyntheticSections.cpp readAddressAreas
DWARFUnit.cpp DWARFUnit::tryExtractDIEsIfNeeded
DWARFListTable.cpp DWARFListTableHeader::extract
...
DWARFDataExtractor.cpp DWARFDataExtractor::getRelocatedValue
lld/ELF/DWARF.cpp LLDDwarfObj<ELFT>::find (sec.sec is nullptr)
...

```

This patch adds support for .debug_loclists to make `DWARFUnit::tryExtractDIEsIfNeeded` happy.
Building --gdb-index does not need .debug_loclists

Reviewed By: dblaikie, grimar

Differential Revision: https://reviews.llvm.org/D79061
2020-04-29 15:04:13 -07:00
Sean Fertile f9106e85c4 Revert "[ELF][PPC64] Don't perform toc-indirect to toc-relative relax... "
This reverts commit 03ffe58605.

Full tile of reverted commit is:
[ELF][PPC64] Don't perform toc-indirect to toc-relative relaxation for
R_PPC64_TOC16_HA not followed by R_PPC64_TOC16_LO_DS

Breaks the multistage lld PowerPC buildbot.
2020-04-29 10:30:35 -04:00
Fangrui Song 03ffe58605 [ELF][PPC64] Don't perform toc-indirect to toc-relative relaxation for R_PPC64_TOC16_HA not followed by R_PPC64_TOC16_LO_DS
The current implementation assumes that R_PPC64_TOC16_HA is always followed
by R_PPC64_TOC16_LO_DS. This can break with:

// Load the address of the TOC entry, instead of the value stored at that address
  addis 3, 2, .LC0@tloc@ha  # R_PPC64_TOC16_HA
  addi  3, 3, .LC0@tloc@l   # R_PPC64_TOC16_LO
  blr

which is used by boringssl's util/fipstools/delocate/delocate.go
https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation.
In short, this tool converts an assembly file to avoid any potential relocations.
The distance to an input .toc is not a constant after linking, so the assembly cannot use an `addis;ld` pair.
Instead, delocate changes the code to jump to a stub (`addis;addi`) which loads the TOC entry address.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D78431
2020-04-28 12:13:27 -07:00
Fangrui Song d9786b566b [ELF] Clear lazyObjFiles in lld:🧝:link after D46034 2020-04-28 09:54:20 -07:00
Igor Kudrin 9f65f5acca [LLD][ELF] Eliminate symbols of merged .ARM.exidx sections.
GNU tools generate mapping symbols "$d" for .ARM.exidx sections. The
symbols are added to the symbol table much earlier than the merging
takes place, and after that, they become dangling. Before the patch,
LLD output those symbols as SHN_ABS with the value of 0. The patch
removes such symbols from the symbol table.

Differential Revision: https://reviews.llvm.org/D78820
2020-04-28 18:58:40 +07:00
Hongtao Yu 964ef8eecc [lld] Support --lto-emit-asm and --plugin-opt=emit-asm
Summary: The switch --plugin-opt=emit-asm can be used with the gold linker to dump the final assembly code generated by LTO in a user-friendly way. Unfortunately it doesn't work with lld. I'm hooking it up with lld. With that switch, lld emits assembly code into the output file (specified by -o) and if there are multiple input files, each of their assembly code will be emitted into a separate file named by suffixing the output file name with a unique number, respectively. The linking then stops after generating those assembly files.

Reviewers: espindola, wenlei, tejohnson, MaskRay, grimar

Reviewed By: tejohnson, MaskRay, grimar

Subscribers: pcc, emaste, inglorion, arichardson, hiraditya, MaskRay, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77231
2020-04-27 11:00:46 -07:00
Igor Kudrin 66e4eb9c1b [LLD][ELF] Implement --discard-* for cases when -r or --emit-relocs are used.
When discarding local symbols with --discard-all or --discard-locals,
the ones which are used in relocations should be preserved. LLD used
the simplest approach and just ignored those switches when -r or
--emit-relocs was specified.

The patch implements handling the --discard-* switches for the cases
when relocations are kept by identifying used local symbols and allowing
removing only unused ones. This makes the behavior of LLD compatible
with GNU linkers.

Differential Revision: https://reviews.llvm.org/D77807
2020-04-25 18:59:41 +07:00
Peter Smith 3b1622d63a [LLD][ELF][ARM] recommit Fix ARM Exidx order for non monotonic section order
Fixed error detected by msan. The size field of the .ARM.exidx synthetic
section needs to be initialized to at least estimation level before
calling assignAddresses as that will use the size field.

This was previously reverted in 1ca16fc4f5.

Differential Revision: https://reviews.llvm.org/D78422
2020-04-24 13:47:28 +01:00
Peter Smith 1ca16fc4f5 Revert "[LLD][ELF][ARM] Fix ARM Exidx order for non monotonic section order"
This reverts commit f969c2aa65.

There are some msan buildbot failures sanitzer-x86_64-linux-fast that
I need to investigate.

Differential Revision: https://reviews.llvm.org/D78422
2020-04-23 16:58:50 +01:00
Peter Smith f969c2aa65 [LLD][ELF][ARM] Fix ARM Exidx order for non monotonic section order
The contents of the .ARM.exidx section must be ordered by SHF_LINK_ORDER
rules. We don't need to know the precise address for this order, but we
do need to know the relative order of sections. We have been using the
sectionIndex for this purpose, this works when the OutputSection order
has a monotonically increasing virtual address, but it is possible to
write a linker script with non-monotonically increasing virtual address.
For these cases we need to evaluate the base address of the OutputSection
so that we can order the .ARM.exidx sections properly.

This change moves the finalisation of .ARM.exidx till after the first
call to AssignAddresses. This permits us to sort on virtual address which
is linker script safe. It also permits a fix for part of pr44824 where
we generate .ARM.exidx section for the vector table when that table is so
far away it is out of range of the .ARM.exidx section. This fix will come
in a follow up patch.

Differential Revision: https://reviews.llvm.org/D78422
2020-04-23 15:46:44 +01:00
Fangrui Song c384ca3c6a [ELF] For relative paths in INPUT() and GROUP(), search the directory of the current linker script before searching other paths
For a relative path in INPUT() or GROUP(), this patch changes the search order by adding the directory of the current linker script.
The new search order (consistent with GNU ld >= 2.35 regarding the new test `test/ELF/input-relative.s`):

1. the directory of the current linker script (GNU ld from Binutils 2.35 onwards; https://sourceware.org/bugzilla/show_bug.cgi?id=25806)
2. the current working directory
3. library paths (-L)

This behavior makes it convenient to replace a .so or .a with a linker script with additional input. For example, glibc

```
% cat /usr/lib/x86_64-linux-gnu/libm.a
/* GNU ld script
*/
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /usr/lib/x86_64-linux-gnu/libm-2.29.a /usr/lib/x86_64-linux-gnu/libmvec.a )
```

could be simplified as `GROUP(libm-2.29.a libmvec.a)`.

Another example is to make libc++.a a linker script:
```
INPUT(libc++.a.1 libc++abi.a)
```

Note, -l is not affected.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D77779
2020-04-22 12:34:20 -07:00
Fangrui Song 01d2a01e79 [ELF] Fix a null pointer dereference when relocating a Local-Exec TLS relocation for a lazy symbol
If there is no SHF_TLS section, there will be no PT_TLS and Out::tlsPhdr may be a nullptr.
If the symbol referenced by an R_TLS is lazy, we should treat the symbol as undefined.

Also reorganize tls-in-archive.s and tls-weak-undef.s . They do not test what they intended to test.
2020-04-21 15:39:31 -07:00
Fangrui Song 497c76e96d [ELF] Keep local symbols when both --emit-relocs and --discard-all are specified
This fixes a bug as exposed by D77807.

Add tests for {--emit-relocs,-r} x {--discard-locals,--discard-all}. They add coverage for previously undertested cases:

* STT_SECTION associated to GCed sections (`gc`)
* STT_SECTION associated to retained sections (`text`)
* STT_SECTION associated to non-SHF_ALLOC sections (`.comment`)
* STB_LOCAL in GCed sections (`unused_gc`)

Reviewed By: grimar, ikudrin

Differential Revision: https://reviews.llvm.org/D78389
2020-04-21 08:28:12 -07:00
Fangrui Song 58207d6fe1 [ELF] Fix "TLS attribute mismatch" false positives for STT_NOTYPE undefined symbols
D13550 added the diagnostic to address/work around a crash.
The rule was refined by D19836 (test/ELF/tls-archive.s) to exclude Lazy symbols.

https://bugs.llvm.org/show_bug.cgi?id=45598 reported another case where the current logic has a false positive:

Bitcode does not record undefined module-level inline assembly symbols
(`IRSymtab.cpp:Builder::addSymbol`). Such an undefined symbol does not
have the FB_tls bit and lld will not consider it STT_TLS. When the symbol is
later replaced by a STT_TLS Defined, lld will error "TLS attribute mismatch".

This patch fixes this false positive by allowing a STT_NOTYPE undefined
symbol to be replaced by a STT_TLS.

Considered alternative:

Moving the diagnostics to scanRelocs() can improve the diagnostics (PR36049)
but that requires a fair amount of refactoring. We will need more
RelExpr members. It requires more thoughts whether it is worthwhile.

See `test/ELF/tls-mismatch.s` for behavior differences. We will fail to
diagnose a likely runtime bug (STT_NOTYPE non-TLS relocation referencing
a TLS definition). This is probably acceptable because compiler
generated code sets symbol types properly.

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D78438
2020-04-21 07:56:35 -07:00
Fangrui Song 232578804a [ELF] Add --warn-backrefs-exclude=<glob>
D77522 changed --warn-backrefs to not warn for linking sandwich
problems (-ldef1 -lref -ldef2). This removed lots of false positives.

However, glibc still has some problems. libc.a defines some symbols
which are normally in libm.a and libpthread.a, e.g. __isnanl/raise.

For a linking order `-lm -lpthread -lc`, I have seen:

```
// different resolutions: GNU ld/gold select libc.a(s_isnan.o) as the definition
backward reference detected: __isnanl in libc.a(printf_fp.o) refers to libm.a(m_isnanl.o)

// different resolutions: GNU ld/gold select libc.a(raise.o) as the definition
backward reference detected: raise in libc.a(abort.o) refers to libpthread.a(pt-raise.o)
```

To facilitate deployment of --warn-backrefs, add --warn-backrefs-exclude= so that
certain known issues (which may be impractical to fix) can be whitelisted.

Deliberate choices:

* Not a comma-separated list (`--warn-backrefs-exclude=liba.a,libb.a`).
  -Wl, splits the argument at commas, so we cannot use commas.
  --export-dynamic-symbol is similar.
* Not in the style of `--warn-backrefs='*' --warn-backrefs=-liba.a`.
  We just need exclusion, not inclusion. For easier build system
  integration, we should avoid order dependency. With the current
  scheme, we enable --warn-backrefs, and indivial libraries can add
  --warn-backrefs-exclude=<glob> to their LDFLAGS.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D77512
2020-04-20 07:52:15 -07:00
Tobias Hieta 87383e408d [ELF][ARM] Increase default max-page-size from 4096 to 6536
See http://lists.llvm.org/pipermail/llvm-dev/2020-April/140549.html

For the record, GNU ld changed to 64k max page size in 2014
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=7572ca8989ead4c3425a1500bc241eaaeffa2c89
"[RFC] ld/ARM: Increase maximum page size to 64kB"

Android driver forced 4k page size in AArch64 (D55029) and ARM (D77746).

A binary linked with max-page-size=4096 does not run on a system with a
higher page size configured. There are some systems out there that do
this and it leads to the binary getting `Killed!` by the kernel.

In the non-linker-script cases, when linked with -z noseparate-code
(default), the max-page-size increase should not cause any size
difference. There may be some VMA usage differences, though.

Reviewed By: psmith, MaskRay

Differential Revision: https://reviews.llvm.org/D77330
2020-04-18 08:19:45 -07:00
LemonBoy aff950e95d [ELF] Support a few more SPARCv9 relocations
Implemented a bunch of relocations found in binaries with medium/large code model and the Local-Exec TLS model. The binaries link and run fine in Qemu.
In addition, the emulation `elf64_sparc` is now recognized.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D77672
2020-04-17 08:12:15 -07:00
Fangrui Song cd5d5ce235 [ELF] Refactor the way we handle -plugin-opt= (GCC collect2 or clang LTO related options)
GCC collect2 passes several options to the linker even if LTO is not used
(note, lld does not support GCC LTO). The lto-wrapper may be a relative
path (especially during development, when gcc is in a build directory), e.g.

  -plugin-opt=relative/path/to/lto-wrapper

We need to ignore such options, which are currently interpreted by
cl::ParseCommandLineOptions() and will fail with `error: --plugin-opt: ld.lld: Unknown command line argument 'relative/path/to/lto-wrapper'`
because the path is apparently not an option registered by an `llvm:🆑:opt`.

See lto-plugin-ignore.s for how we interpret various -plugin-opt= options now.

Reviewed By: grimar, tejohnson

Differential Revision: https://reviews.llvm.org/D78158
2020-04-15 08:00:50 -07:00
Brian Cain f3da6b7ab5 Add duplex to R_HEX_GOT_16_X
Building 'espresso' from llvm-test-suite revealed missing support
for duplex instructions with R_HEX_GOT_16_X.
2020-04-13 19:32:44 -05:00
Fangrui Song a27a7b98cd [ELF] --warn-backrefs: don't warn if -u/--export-dynamic-symbol
Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D77630
2020-04-08 09:33:22 -07:00
Peter Smith 28b172e341 [LLD][ELF][ARM] Implement ARM pc-relative relocations for ADR and LDR
The R_ARM_ALU_PC_G0 and R_ARM_LDR_PC_G0 relocations are used by the
ADR and LDR pseudo instructions, and are the basis of the group
relocations that can load an arbitrary constant via a series of add, sub
and ldr instructions.

The relocations need to be obtained via the .reloc directive.

R_ARM_ALU_PC_G0 is much more complicated as the add/sub instruction uses
a modified immediate encoding of an 8-bit immediate rotated right by an
even 4-bit field. This means that the range of representable immediates
is sparse. We extract the encoding and decoding functions for the modified
immediate from llvm/lib/Target/ARM/MCTargetDesc/ARMAddressingModes.h as
this header file is not accessible from LLD. Duplication of code isn't
ideal, but as these are well-defined mathematical functions they are
unlikely to change.

Differential Revision: https://reviews.llvm.org/D75349
2020-04-08 12:43:44 +01:00
Fangrui Song 03c825c224 [ELF] --warn-backrefs: don't warn for linking sandwich problems
This is an alternative design to D77512.

D45195 added --warn-backrefs to detect

* A. certain input orders which GNU ld either errors ("undefined reference")
  or has different resolution semantics
* B. (byproduct) some latent multiple definition problems (-ldef1 -lref -ldef2) which I
  call "linking sandwich problems". def2 may or may not be the same as def1.

When an archive appears more than once (-ldef -lref -ldef), lld and GNU
ld may have the same resolution but --warn-backrefs may warn. This is
not uncommon. For example, currently lld itself has such a problem:

```
liblldCommon.a liblldCOFF.a ... liblldCommon.a
  _ZN3lld10DWARFCache13getDILineInfoEmm in liblldCOFF.a refers to liblldCommon.a(DWARF.cpp.o)
libLLVMSupport.a also appears twice and has a similar warning
```

glibc has such problems. It is somewhat destined because of its separate
libc/libpthread/... and arbitrary grouping. The situation is getting
improved over time but I have seen:
```
-lc __isnanl references -lm
-lc _IO_funlockfile references -lpthread
```

There are also various issues in interaction with other runtime
libraries such as libgcc_eh and libunwind:
```
-lc __gcc_personality_v0 references -lgcc_eh
-lpthread __gcc_personality_v0 references -lgcc_eh
-lpthread _Unwind_GetCFA references -lunwind
```

These problems are actually benign. We want --warn-backrefs to focus on
its main task A and defer task B (which is also useful) to a more
specific future feature (see gold --detect-odr-violations and
https://bugs.llvm.org/show_bug.cgi?id=43110).

Instead of warning immediately, we store the message and only report it
if no subsequent lazy definition exists.

The use of the static variable `backrefDiags` is similar to `undefs` in
Relocations.cpp

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D77522
2020-04-07 10:25:23 -07:00
Fangrui Song 4e907e93fb [ELF] -M/-Map: fix VMA/LMA/Size columns of symbol assignments when address/size>=2**32
SymbolAssignment::addr stores the location counter. The type should be
uint64_t instead of unsigned. The upper half of the address space is
commonly used by operating system kernels.

Similarly, SymbolAssignment::size should be an uint64_t. A kernel linker
script can move the location counter from 0 to the upper half of the
address space.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D77445
2020-04-07 10:15:15 -07:00
Sriraman Tallam 94317878d8 LLD Support for Basic Block Sections
This is part of the Propeller framework to do post link code layout
optimizations. Please see the RFC here:
https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the
detailed RFC doc here:
https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf

This patch adds lld support for basic block sections and performs relaxations
after the basic blocks have been reordered.

After the linker has reordered the basic block sections according to the
desired sequence, it runs a relaxation pass to optimize jump instructions.
Currently, the compiler emits the long form of all jump instructions. AMD64 ISA
supports variants of jump instructions with one byte offset or a four byte
offset. The compiler generates jump instructions with R_X86_64 32-bit PC
relative relocations. We would like to use a new relocation type for these jump
instructions as it makes it easy and accurate while relaxing these instructions.

The relaxation pass does two things:

First, it deletes all explicit fall-through direct jump instructions between
adjacent basic blocks. This is done by discarding the tail of the basic block
section.

Second, If there are consecutive jump instructions, it checks if the first
conditional jump can be inverted to convert the second into a fall through and
delete the second.

The jump instructions are relaxed by using jump instruction mods, something
like relocations. These are used to modify the opcode of the jump instruction.
Jump instruction mods contain three values, instruction offset, jump type and
size. While writing this jump instruction out to the final binary, the linker
uses the jump instruction mod to determine the opcode and the size of the
modified jump instruction. These mods are required because the input object
files are memory-mapped without write permissions and directly modifying the
object files requires copying these sections. Copying a large number of basic
block sections significantly bloats memory.

Differential Revision: https://reviews.llvm.org/D68065
2020-04-07 06:55:57 -07:00
Fangrui Song c1c679e2d2 [ELF] Make --version-script/--dynamic-list work for lazy symbols fetched by LTO libcalls
Fixes https://bugs.llvm.org/show_bug.cgi?id=45391

The LTO code generator happens after version script scanning and may
create references which will fetch some lazy symbols.

Currently a version script does not assign VER_NDX_LOCAL to lazy symbols
and such symbols will be made global after they are fetched.

Change findByVersion and findAllByVersion to work on lazy symbols.
For unfetched lazy symbols, we should keep them non-local (D35263).
Check isDefined() in computeBinding() as a compensation.

This patch fixes a companion bug that --dynamic-list does not export
libcall fetched symbols.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D77280
2020-04-06 09:47:06 -07:00
Fangrui Song 9195b01911 [ELF][PPC64] Enable R_PPC64_REL14 trunks
The thunk implementation is available but an assertion disallows it.
Linux kernel has such a use case: in arch/powerpc/kernel/exceptions-64s.S:handle_page_fault,
beq+ ret_from_except_lite may get out of range.

Link: https://github.com/ClangBuiltLinux/linux/issues/951

Differential Revision: https://reviews.llvm.org/D76904
2020-04-04 10:59:17 -07:00
Fangrui Song 56decd982d [ELF] Allow invalid sh_size%sh_entsize!=0 for non-SHF_MERGE sections
Fixes https://bugs.llvm.org/show_bug.cgi?id=45370
Fixes https://github.com/Clozure/ccl/issues/273

.stab holds a table of 12-byte entries. GNU as before 2.35 incorrectly
sets sh_entsize(.stab) to 20 on 64-bit architectures:
https://sourceware.org/bugzilla/show_bug.cgi?id=25768

We should not emit the confusing error:
"SHF_MERGE section size (...) must be a multiple of sh_entsize (20)

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D77368
2020-04-03 08:48:30 -07:00
Sid Manning c484b3e334 [Hexagon] Fix issue with non-preemptible STT_TLS symbols
A PC-relative relocation referencing a non-preemptible absolute symbol
(due to STT_TLS) is not representable in -pie/-shared mode.

Differential Revision: https://reviews.llvm.org/D77021
2020-04-03 08:55:23 -05:00
Fangrui Song 42bb5cc502 [ELF] Change some "Alias for " help messages to use double dashed options
The aliased options in the --help output use double dashes. It is
inconsistent to have single-dashed messages. Additionally, -l and -t are
common short options and single-dashed forms prefixed with them can
cause confusion.
2020-04-02 09:27:56 -07:00
Igor Kudrin b0b1f451ae [LLD][ELF] Follow the common pattern in a message about an undefined vtable symbol.
In most cases, LLD prints its multiline diagnostic messages starting
additional lines with ">>> ". That greatly helps external tools to parse
the output, simplifying combining several lines of the log back into one
message. The patch fixes the only message I found that does not follow
the common pattern.

Differential Revision: https://reviews.llvm.org/D77132
2020-04-02 11:39:03 +07:00
Kazuaki Ishizaki 7c5fcb3591 [lld] NFC: fix trivial typos in comments
Differential Revision: https://reviews.llvm.org/D72339
2020-04-02 01:21:36 +09:00
Fangrui Song bb4a36ea28 [ELF] Propagate LMA offset to sections with neither AT() nor AT>
Fixes https://bugs.llvm.org/show_bug.cgi?id=45313
Also fixes linkerscript/{at4.s,overlay.test} LMA address issues exposed by
011b785505.
Related: D74297

This patch improves emulation of GNU ld's heuristics on the difference
between the LMA and the VMA:
https://sourceware.org/binutils/docs/ld/Output-Section-LMA.html#Output-Section-LMA

New test linkerscript/lma-offset.s (based on at4.s) demonstrates some behaviors.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D76995
2020-04-01 08:19:06 -07:00
Fangrui Song f2036a15d3 [ELF] Print symbols with non-default versions for better "undefined symbol" diagnostics
When reporting an "undefined symbol" diagnostic:

* We don't print @ for the reference.
* We don't print @ or @@ for the definition. https://bugs.llvm.org/show_bug.cgi?id=45318

This can lead to confusing diagnostics:

```
// foo may be foo@v2
ld.lld: error: undefined symbol: foo
>>> referenced by t1.o:(.text+0x1)
// foo may be foo@v1 or foo@@v1
>>> did you mean: foo
>>> defined in: t.so
```

There are 2 ways a symbol in symtab may get truncated:

* A @@ definition may be truncated *early* by SymbolTable::insert().
  The name ends with a '\0'.
* A @ definition/reference may be truncated *later* by Symbol::parseSymbolVersion().
  The name ends with a '@'.

This patch detects the second case and improves the diagnostics. The first case is
not improved but the second case is sufficient to make diagnostics not confusing.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D76999
2020-04-01 08:04:36 -07:00
Fangrui Song eb4663d8c6 [lld][COFF][ELF][WebAssembly] Replace --[no-]threads /threads[:no] with --threads={1,2,...} /threads:{1,2,...}
--no-threads is a name copied from gold.
gold has --no-thread, --thread-count and several other --thread-count-*.

There are needs to customize the number of threads (running several lld
processes concurrently or customizing the number of LTO threads).
Having a single --threads=N is a straightforward replacement of gold's
--no-threads + --thread-count.

--no-threads is used rarely. So just delete --no-threads instead of
keeping it for compatibility for a while.

If --threads= is specified (ELF,wasm; COFF /threads: is similar),
--thinlto-jobs= defaults to --threads=,
otherwise all available hardware threads are used.

There is currently no way to override a --threads={1,2,...}. It is still
a debate whether we should use --threads=all.

Reviewed By: rnk, aganea

Differential Revision: https://reviews.llvm.org/D76885
2020-03-31 08:46:12 -07:00
Peter Smith 2539b4ae47 [LLD][ELF] Allow empty (.init|.preinit|.fini)_array to be RELRO
The default GNU linker script uses the following idiom for the array
sections. I'll use .init_array here, but this also applies to
.preinit_array and .fini_array sections.

  .init_array    :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(.init_array))
    PROVIDE_HIDDEN (__init_array_end = .);
  }

The C-library will take references to the _start and _end symbols to
process the array. This will make LLD keep the OutputSection even if there
are no .init_array sections. As the current check for RELRO uses the
section type for .init_array the above example with no .init_array
InputSections fails the checks as there are no .init_array sections to give
the OutputSection a type of SHT_INIT_ARRAY. This often leads to a
non-contiguous RELRO error message.

The simple fix is to a textual section match as well as a section type
match.

Differential Revision: https://reviews.llvm.org/D76915
2020-03-31 12:53:12 +01:00
Kai Wang 581ba35291 [RISCV] ELF attribute section for RISC-V.
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].

Differential Revision: https://reviews.llvm.org/D74023
2020-03-31 16:16:19 +08:00
Nico Weber 20eb719f99 lld: Reduce number of references to undefined printed from 10 to 3.
As of a while ago, lld groups all undefined references to a single
symbol in a single diagnostic. Back then, I made it so that we
print up to 10 references to each undefined symbol.

Having used this for a while, I never wished there were more
references, but I sometimes found that this can print a lot of
output. lld prints up to 10 diagnostics by default, and if
each has 10 references (which I've seen in practice), and each
undefined symbol produces 2 (possibly very long) lines of output,
that's over 200 lines of error output.

Let's try it with just 3 references for a while and see how
that feels in practice.

Differential Revision: https://reviews.llvm.org/D77017
2020-03-30 14:31:32 -04:00
Fangrui Song 673e81eee4 [ELF] Allow SHF_LINK_ORDER and non-SHF_LINK_ORDER to be mixed
Currently, `error: incompatible section flags for .rodata` is reported
when we mix SHF_LINK_ORDER and non-SHF_LINK_ORDER sections in an output section.

This is overconstrained. This patch allows mixed flags with the
requirement that SHF_LINK_ORDER sections must be contiguous. Mixing
flags is used by Linux aarch64 (https://github.com/ClangBuiltLinux/linux/issues/953)

  .init.data : { ... KEEP(*(__patchable_function_entries)) ... }

When the integrated assembler is enabled, clang's -fpatchable-function-entry=N[,M]
implementation sets the SHF_LINK_ORDER flag (D72215) to fix a number of
garbage collection issues.

Strictly speaking, the ELF specification does not require contiguous
SHF_LINK_ORDER sections but for many current uses of SHF_LINK_ORDER like
.ARM.exidx/__patchable_function_entries there has been a requirement for
the sections to be contiguous on top of the requirements of the ELF
specification.

This patch also imposes one restriction: SHF_LINK_ORDER sections cannot
be separated by a symbol assignment or a BYTE command. Not allowing BYTE
is a natural extension that a non-SHF_LINK_ORDER cannot be a separator.
Symbol assignments can delimiter the contents of SHF_LINK_ORDER
sections.  Allowing SHF_LINK_ORDER sections across symbol assignments
(especially __start_/__stop_) can make things hard to explain. The
restriction should not be a problem for practical use cases.

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D77007
2020-03-30 10:03:55 -07:00
Alexandre Ganea 42dc667db2 [LLD][ELF] Put back rounding which was lost in 8404aeb56a 2020-03-29 21:52:01 -04:00
Matt Schulte fdc41aa22c [lld][ELF] Mark empty NOLOAD output sections SHT_NOBITS instead of SHT_PROGBITS
This fixes PR# 45336.
Output sections described in a linker script as NOLOAD with no input sections would be marked as SHT_PROGBITS.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D76981
2020-03-28 10:07:58 -07:00
Alexandre Ganea 09158252f7 [ThinLTO] Allow usage of all hardware threads in the system
Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled.

One can now say in LLD:
/opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask.

When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows).

Differential Revision: https://reviews.llvm.org/D75153
2020-03-27 10:20:58 -04:00
James Henderson 3ff3c6986b [lld][ELF] Fix error message
The error previously talked about a "section header" but was actually
referring to a program header.

Reviewed by: grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D76846
2020-03-26 15:30:24 +00:00
Fangrui Song 9e33c09647 [ELF] Keep orphan section names (.rodata.foo .text.foo) unchanged if !hasSectionsCommand
This behavior matches GNU ld and seems reasonable.

```
// If a SECTIONS command is not specified
.text.* -> .text
.rodata.* -> .rodata
.init_array.* -> .init_array
```

A proposed Linux feature CONFIG_FG_KASLR may depend on the GNU ld behavior.

Reword a comment about -z keep-text-section-prefix and a comment about
CommonSection (deleted by rL286234).

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D75225
2020-03-23 10:30:06 -07:00
Fangrui Song 011b785505 [ELF] Create readonly PT_LOAD in the presence of a SECTIONS command
This essentially drops the change by r288021 (discussed with Georgii Rymar
and Peter Smith and noted down in the release note of lld 10).

GNU ld>=2.31 enables -z separate-code by default for Linux x86. By
default (in the absence of a PHDRS command) a readonly PT_LOAD is
created, which is different from its traditional behavior.

Not emulating GNU ld's traditional behavior is good for us because it
improves code consistency (we create a readonly PT_LOAD in the absence
of a SECTIONS command).

Users can add --no-rosegment to restore the previous behavior (combined
readonly and read-executable sections in a single RX PT_LOAD).
2020-03-19 19:11:11 -07:00
Georgii Rymar bb7d2b1780 [LLD][ELF] - Disambiguate "=fillexp" with a primary expression to allow =0x90 /DISCARD/
Fixes https://bugs.llvm.org/show_bug.cgi?id=44903

It is about the following case:

```
SECTIONS {
  .foo : { *(.foo) } =0x90909090
  /DISCARD/ : { *(.bar) }
}
```

Here while parsing the fill expression we treated the
"/" of "/DISCARD/" as operator.

With this change, suggested by Fangrui Song, we do
not allow expressions with operators (e.g. "0x1100 + 0x22")
that are not wrapped into round brackets. It should not
be an issue for users, but helps to resolve parsing ambiguity.

Differential revision: https://reviews.llvm.org/D74687
2020-03-19 12:49:25 +03:00
Sid Manning 5a5a075c5b [LLD][ELF][Hexagon] Support GDPLT transforms
Hexagon ABI specifies that call x@gdplt is transformed to call __tls_get_addr.

Example:
     call x@gdplt
is changed to
     call __tls_get_addr

When x is an external tls variable.

Differential Revision: https://reviews.llvm.org/D74443
2020-03-13 11:02:11 -05:00
Shoaib Meenai 2822852ffc [ELF] Correct error message when OUTPUT_FORMAT is used
Any OUTPUT_FORMAT in a linker script overrides the emulation passed on
the command line, so record the passed bfdname and use that in the error
message about incompatible input files.

This prevents confusing error messages. For example, if you explicitly
pass `-m elf_x86_64` to LLD but accidentally include a linker script
which sets `OUTPUT_FORMAT(elf32-i386)`, LLD would previously complain
about your input files being compatible with elf_x86_64, which isn't the
actual issue, and is confusing because the input files are in fact
x86-64 ELF files.

Interestingly enough, this also prevents a segfault! When we don't pass
`-m` and we have an object file which is incompatible with the
`OUTPUT_FORMAT` set by a linker script, the object file is checked for
compatibility before it's added to the objectFiles vector.
config->emulation, objectFiles, and sharedFiles will all be empty, so
we'll attempt to access bitcodeFiles[0], but bitcodeFiles is also empty,
so we'll segfault. This commit prevents the segfault by adding
OUTPUT_FORMAT as a possible source of machine configuration, and it also
adds an llvm_unreachable to diagnose similar issues in the future.

Differential Revision: https://reviews.llvm.org/D76109
2020-03-12 22:54:53 -07:00
Fangrui Song 0bb362c164 [ELF] --gdb-index: fix memory usage regression after D74773
On an internal target,

* Before D74773: time -f '%M' => 18275680
* After D74773:  time -f '%M' => 22088964

This patch restores to the status before D74773.
2020-03-12 16:55:30 -07:00
Fangrui Song eb4b5a36a6 [ELF] Move --print-map(-M)/--cref before checkSections() and openFile()
-M output can be useful when diagnosing an "error: output file too large" problem (emitted in openFile()).

I just ran into such a situation where I had to debug an erronerous
Linux kernel linker script. It tried to create a file larger than
INT64_MAX bytes.

This patch could have helped https://bugs.llvm.org/show_bug.cgi?id=44715 as well.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D75966
2020-03-12 08:00:18 -07:00
Reid Kleckner 213aea4c58 Remove unused Endian.h includes, NFC
Mainly avoids including Host.h everywhere:

$ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) \
    | grep '^[-+] ' | sort | uniq -c | sort -nr
   3141 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Host.h
2020-03-11 15:45:34 -07:00
Fangrui Song fbf41b5267 [ELF] Simplify sh_addr computation and warn if sh_addr is not a multiple of sh_addralign
See `docs/ELF/linker_script.rst` for the new computation for sh_addr and sh_addralign.
`ALIGN(section_align)` now means: "increase alignment to section_align"
(like yet another input section requirement).

The "start of section .foo changes from 0x11 to 0x20" warning no longer
makes sense. Change it to warn if sh_addr%sh_addralign!=0.

To decrease the alignment from the default max_input_align,
use `.output ALIGN(8) : {}` instead of `.output : ALIGN(8) {}`
See linkerscript/section-address-align.test as an example.

When both an output section address and ALIGN are set (can be seen as an
"undefined behavior" https://sourceware.org/ml/binutils/2020-03/msg00115.html),
lld may align more than GNU ld, but it makes a linker script working
with GNU ld hard to break with lld.

This patch can be considered as restoring part of the behavior before D74736.

Differential Revision: https://reviews.llvm.org/D75724
2020-03-11 09:35:42 -07:00
David Bozier 6e2804ce6b [LLD] Add support for --unique option
Summary:
Places orphan sections into a unique output section. This prevents the merging of orphan sections of the same name.
Matches behaviour of GNU ld --unique. --unique=pattern is not implemented.

Motivated user case shown in the test has 2 local symbols as they would appear if C++ source has been compiled with -ffunction-sections. The merging of these sections in the case of a partial link (-r) may limit the effectiveness of -gc-sections of a subsequent link.

Reviewers: espindola, jhenderson, bd1976llvm, edd, andrewng, JonChesterfield, MaskRay, grimar, ruiu, psmith

Reviewed By: MaskRay, grimar

Subscribers: emaste, arichardson, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75536
2020-03-10 12:20:21 +00:00
Fangrui Song 92b5b980d2 [ELF] Postpone evaluation of ORIGIN/LENGTH in a MEMORY command
```
createFiles(args)
 readDefsym
 readerLinkerScript(*mb)
  ...
   readMemory
    readMemoryAssignment("ORIGIN", "org", "o") // eagerly evaluated
target = getTarget();
link(args)
 writeResult<ELFT>()
  ...
   finalizeSections()
    script->processSymbolAssignments()
     addSymbol(cmd) // with this patch, evaluated here
```

readMemoryAssignment eagerly evaluates ORIGIN/LENGTH and returns an uint64_t.
This patch postpones the evaluation to make

* --defsym and symbol assignments
* `CONSTANT(COMMONPAGESIZE)` (requires a non-null `lld:🧝:target`)

work. If the expression somehow requires interaction with memory
regions, the circular dependency may cause the expression to evaluate to
a strange value. See the new test added to memory-err.s

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D75763
2020-03-09 08:31:41 -07:00
Andrew Monshizadeh 3669f0ed4f Refactor TimeProfiler write methods (NFC)
Added a write method for TimeTrace that takes two strings representing
file names. The first is any file name that may have been provided by the
user via `time-trace-file` flag, and the second is a fallback that should
be configured by the caller. This method makes it cleaner to write the
trace output because there is no longer a need to check file names at the
caller and simplifies future TimeTrace usages.

Reviewed By: modocache

Differential Revision: https://reviews.llvm.org/D74514
2020-03-06 14:34:56 -08:00
Alexey Lapshin dcf6494abe LLD already has a mechanism for caching creation of DWARCContext:
llvm::call_once(initDwarfLine, [this]() { initializeDwarf(); });

Though it is not used in all places.

I need that patch for implementing "Remove obsolete debug info" feature
(D74169). But this caching mechanism is useful by itself, and I think it
would be good to use it without connection to "Remove obsolete debug info"
feature. So this patch changes inplace creation of DWARFContext with
its cached version.

Depends on D74308

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D74773
2020-03-06 21:17:07 +03:00
Fangrui Song 791efb148f [ARM] Rewrite ARMAttributeParser
* Delete boilerplate
* Change functions to return `Error`
* Test parsing errors
* Update callers of ARMAttributeParser::parse() to check the `Error` return value.

Since this patch touches nearly everything in the file, I apply
http://llvm.org/docs/Proposals/VariableNames.html and change variable
names to lower case.

Reviewed By: compnerd

Differential Revision: https://reviews.llvm.org/D75015
2020-03-05 10:57:27 -08:00
Alexey Lapshin a130be6ac5 [LLD][NFC] Remove getOffsetInFile() workaround.
Summary:
LLD has workaround for the times when SectionIndex was not passed properly:

LT->getFileLineInfoForAddress(
      S->getOffsetInFile() + Offset, nullptr,
      DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath, Info));

S->getOffsetInFile() was added to differentiate offsets between
various sections. Now SectionIndex is properly specified.
Thus it is not necessary to use getOffsetInFile() workaround.
See https://reviews.llvm.org/D58194, https://reviews.llvm.org/D58357.

This patch removes getOffsetInFile() workaround.

Reviewers: ruiu, grimar, MaskRay, espindola

Reviewed By: grimar, MaskRay

Subscribers: emaste, arichardson, llvm-commits

Tags: #llvm, #lld

Differential Revision: https://reviews.llvm.org/D75636
2020-03-05 15:52:46 +03:00
Sam Clegg 928e9e1723 [lld][WebAssembly] Add support for --rsp-quoting
This also changes to default style to match the host.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D75577
2020-03-04 11:41:33 -08:00
evgeny 497c110e87 [lld][ELF][COFF] Fix archived bitcode files naming
Differential revision: https://reviews.llvm.org/D75422
2020-03-04 12:46:31 +03:00
Fangrui Song 315f8a55f5 [ELF][PPC32] Don't report "relocation refers to a discarded section" for .got2
Similar to D63182 [ELF][PPC64] Don't report "relocation refers to a discarded section" for .toc

Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D75419
2020-03-01 19:54:40 -08:00
Fangrui Song 00925aadb3 [ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order
Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D75394
2020-02-28 22:23:14 -08:00
Fangrui Song 718cbd394a [ELF] Delete two unneeded `referenced = true` after D65584 2020-02-28 21:59:08 -08:00
Alexey Lapshin 0a2d415bd0 [LLD] Report errors occurred while parsing debug info as warnings.
Summary:
Extracted from D74773. Currently, errors happened while parsing
debug info are reported as errors. DebugInfoDWARF library treats such
errors as "Recoverable errors". This patch makes debug info errors
to be reported as warnings, to support DebugInfoDWARF approach.

Reviewers: ruiu, grimar, MaskRay, jhenderson, espindola

Reviewed By: MaskRay, jhenderson

Subscribers: emaste, aprantl, arichardson, arphaman, llvm-commits

Tags: #llvm, #debug-info, #lld

Differential Revision: https://reviews.llvm.org/D75234
2020-02-29 00:03:18 +03:00
Peter Smith 6b035b607f [LLD][ELF][ARM] Implement Thumb pc-relative relocations for adr and ldr
MC will now output the R_ARM_THM_PC8, R_ARM_THM_PC12 and
R_ARM_THM_PREL_11_0 relocations. These are short-ranged relocations that
are used to implement the adr rd, literal and ldr rd, literal pseudo
instructions.

The instructions use a new RelExpr called R_ARM_PCA in order to calculate
the required S + A - Pa expression, where Pa is AlignDown(P, 4) as the
instructions add their immediate to AlignDown(PC, 4). We also do not want
these relocations to generate or resolve against a PLT entry as the range
of these relocations is so short they would never reach.

The R_ARM_THM_PC8 has a special encoding convention for the relocation
addend, the immediate field is unsigned, yet the addend must be -4 to
account for the Thumb PC bias. The ABI (not the architecture) uses the
convention that the 8-byte immediate of 0xff represents -4.

Differential Revision: https://reviews.llvm.org/D75042
2020-02-28 11:29:29 +00:00
Fangrui Song 37c7f0d945 [ELF] --orphan-handling=: don't warn/error for input SHT_REL[A] retained by --emit-relocs
They are purposefully skipped by input section descriptions (rL295324).
Similarly, --orphan-handling= should not warn/error for them.
This behavior matches GNU ld.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D75151
2020-02-26 10:32:54 -08:00
Fangrui Song 423194098b [ELF] --orphan-handling=: don't warn/error for unused synthesized sections
This makes --orphan-handling= less noisy.
This change also improves our compatibility with GNU ld.

GNU ld special cases .symtab, .strtab and .shstrtab . We need output section
descriptions for .symtab, .strtab and .shstrtab to suppress:

  <internal>:(.symtab) is being placed in '.symtab'
  <internal>:(.shstrtab) is being placed in '.shstrtab'
  <internal>:(.strtab) is being placed in '.strtab'

With --strip-all, .symtab and .strtab can be omitted (note, --strip-all is not compatible with --emit-relocs).

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D75149
2020-02-26 08:56:12 -08:00
Fangrui Song 93331a17e8 [ELF] Support archive:file syntax in input section descriptions
Fixes https://bugs.llvm.org/show_bug.cgi?id=44450

https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html#Input-Section-Basics
The following two rules are not implemented.

* `archive:` matches every file in the archive.
* `:file` matches a file not in an archive.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D75100
2020-02-25 07:57:43 -08:00
Rafael Ávila de Espíndola 7b44f0428a Add a llvm::shuffle and use it in lld
With this --shuffle-sections=seed produces the same result in every
host.

Reviewed By: grimar, MaskRay

Differential Revision: https://reviews.llvm.org/D74971
2020-02-22 10:05:29 -08:00
Fangrui Song 73d8d83a6d [ARM] Change ARMAttributeParser::Parse to use support::endianness and simplify 2020-02-21 11:05:33 -08:00
Fangrui Song dbd7281aa7 [ELF] Shuffle .init_array/.fini_array with --shuffle-sections=
Useful for detecting static initialization order fiasco.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D74887
2020-02-21 08:16:07 -08:00
Fangrui Song de0dda54d3 [ELF] Warn changed output section address
When the output section address (addrExpr) is specified, GNU ld warns if
sh_addr is different. This patch implements the warning.

Note, LinkerScript::assignAddresses can be called more than once. We
need to record the changed section addresses, and only report the
warnings after the addresses are finalized.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D74741
2020-02-21 08:13:29 -08:00
Fangrui Song 6ed8e20143 [ELF] Ignore the maximum of input section alignments for two cases
Follow-up for D74286.

Notations:

* alignExpr: the computed ALIGN value
* max_input_align: the maximum of input section alignments

This patch changes the following two cases to match GNU ld:

* When ALIGN is present, GNU ld sets output sh_addr to alignExpr, while lld use max(alignExpr, max_input_align)
* When addrExpr is specified but alignExpr is not, GNU ld sets output sh_addr to addrExpr, while lld uses `advance(0, max_input_align)`

Note, sh_addralign is still set to max(alignExpr, max_input_align).

lma-align.test is enhanced a bit to check we don't overalign sh_addr.

fixSectionAlignments() sets addrExpr but not alignExpr for the `!hasSectionsCommand` case.
This patch sets alignExpr as well so that max_input_align will be respected.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D74736
2020-02-21 08:12:00 -08:00
Rafael Ávila de Espíndola d48d339156 [lld][ELF] Add --shuffle-sections=seed to shuffle input sections
Summary:
This option causes lld to shuffle sections by assigning different
priorities in each run.

The use case for this is to introduce randomization in benchmarks. The
idea is inspired by the paper "Producing Wrong Data Without Doing
Anything Obviously Wrong!"
(https://www.inf.usi.ch/faculty/hauswirth/publications/asplos09.pdf). Unlike
the paper, we shuffle individual sections, not just input files.

Doing this in lld is particularly convenient as the --reproduce option
makes it easy to collect all the necessary bits for relinking the
program being benchmarked. Once that it is done, all that is needed is
to add --shuffle-sections=0 to the response file and relink before each
run of the benchmark.

Differential Revision: https://reviews.llvm.org/D74791
2020-02-19 13:44:12 -08:00
Tamas Petz 6e326882da [LLD][ELF][ARM] Fix support for SBREL type relocations
With this patch lld recognizes ARM SBREL relocations.
R_ARM*_MOVW_BREL relocations are not tested because they are not used.

Patch by Tamas Petz

Differential Revision: https://reviews.llvm.org/D74604
2020-02-19 10:07:46 +00:00
Daniel Kiss b6162622c0 [LLD][ELF][AArch64] Change the semantics of -z pac-plt.
Summary:
Generate PAC protected plt only when "-z pac-plt" is passed to the
linker. GNU toolchain generates when it is explicitly requested[1].
When pac-plt is requested then set the GNU_PROPERTY_AARCH64_FEATURE_1_PAC
note even when not all function compiled with PAC but issue a warning.
Harmonizing the warning style for BTI/PAC/IBT.
Generate BTI protected PLT if case of "-z force-bti".

[1] https://www.sourceware.org/ml/binutils/2019-03/msg00021.html

Reviewers: peter.smith, espindola, MaskRay, grimar

Reviewed By: peter.smith, MaskRay

Subscribers: tatyana-krasnukha, emaste, arichardson, kristof.beyls, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74537
2020-02-18 09:56:57 +01:00
Alexandre Ganea 8404aeb56a [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
Fangrui Song 105a270028 [ELF][AArch64] Rename pacPlt to zPacPlt and forceBti to zForceIbt after D71327. NFC
We use config->z* for -z options.
2020-02-13 21:02:54 -08:00
Fangrui Song 6c73246179 [ELF] Fix a null pointer dereference when --emit-relocs and --strip-debug are used together
Fixes https://bugs.llvm.org//show_bug.cgi?id=44878

When --strip-debug is specified, .debug* are removed from inputSections
while .rel[a].debug* (incorrectly) remain.

LinkerScript::addOrphanSections() requires the output section of a relocated
InputSectionBase to be created first.

.debug* are not in inputSections ->
output sections .debug* are not created ->
getOutputSectionName(.rel[a].debug*) dereferences a null pointer.

Fix the null pointer dereference by deleting .rel[a].debug* from inputSections as well.

Reviewed By: grimar, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D74510
2020-02-13 08:56:38 -08:00
Peter Smith 29c1361557 [LLD][ELF][ARM] Do not substitute BL/BLX for non STT_FUNC symbols.
Recommit of 0b4a047bfb
(reverted in c29003813a) to incorporate
subsequent fix and add a warning when LLD's interworking behavior has
changed.

D73474 disabled the generation of interworking thunks for branch
relocations to non STT_FUNC symbols. This patch handles the case of BL and
BLX instructions to non STT_FUNC symbols. LLD would normally look at the
state of the caller and the callee and write a BL if the states are the
same and a BLX if the states are different.

This patch disables BL/BLX substitution when the destination symbol does
not have type STT_FUNC. This brings our behavior in line with GNU ld which
may prevent difficult to diagnose runtime errors when switching to lld.

This change does change how LLD handles interworking of symbols that do not
have type STT_FUNC from previous versions including the 10.0 release. This
brings LLD in line with ld.bfd but there may be programs that have not been
linked with ld.bfd that depend on LLD's previous behavior. We emit a warning
when the behavior changes.

A summary of the difference between 10.0 and 11.0 is that for symbols
that do not have a type of STT_FUNC LLD will not change a BL to a BLX or
vice versa. The table below enumerates the changes
| relocation     | STT_FUNC | bit(0) | in  | 10.0- out | 11.0+ out |
| R_ARM_CALL     | no       | 1      | BL  | BLX       | BL        |
| R_ARM_CALL     | no       | 0      | BLX | BL        | BLX       |
| R_ARM_THM_CALL | no       | 1      | BLX | BL        | BLX       |
| R_ARM_THM_CALL | no       | 0      | BL  | BLX       | BL        |

Differential Revision: https://reviews.llvm.org/D73542
2020-02-13 09:40:21 +00:00
Fangrui Song 7c426fb1a6 [ELF] Support INSERT [AFTER|BEFORE] for orphan sections
D43468+D44380 added INSERT [AFTER|BEFORE] for non-orphan sections. This patch
makes INSERT work for orphan sections as well.

`SECTIONS {...} INSERT [AFTER|BEFORE] .foo` does not set `hasSectionCommands`, so the result
will be similar to a regular link without a linker script. The differences when `hasSectionCommands` is set include:

* image base is different
* -z noseparate-code/-z noseparate-loadable-segments are unavailable
* some special symbols such as `_end _etext _edata` are not defined

The behavior is similar to GNU ld:
INSERT is not considered an external linker script.

This feature makes the section layout more flexible. It can be used to:

* Place .nv_fatbin before other readonly SHT_PROGBITS sections to mitigate relocation overflows.
* Disturb the layout to expose address sensitive application bugs.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D74375
2020-02-12 08:21:52 -08:00
Fangrui Song b498d99338 [ELF] Start a new PT_LOAD if LMA region is different
GNU ld has a counterintuitive lang_propagate_lma_regions rule.

```
// .foo's LMA region is propagated to .bar because their VMA region is the same,
// and .bar does not have an explicit output section address (addr_tree).
.foo : { *(.foo) } >RAM AT> FLASH
.bar : { *(.bar) } >RAM

// An explicit output section address disables propagation.
.foo : { *(.foo) } >RAM AT> FLASH
.bar . : { *(.bar) } >RAM
```

In both cases, lld thinks .foo's LMA region is propagated and
places .bar in the same PT_LOAD, so lld diverges from GNU ld w.r.t. the
second case (lma-align.test).

This patch changes Writer<ELFT>::createPhdrs to disable propagation
(start a new PT_LOAD). A user of the first case can make linker scripts
portable by explicitly specifying `AT>`. By contrast, there was no
workaround for the old behavior.

This change uncovers another LMA related bug in assignOffsets() where
`ctx->lmaOffset = 0;` was omitted. It caused a spurious "load address
range overlaps" error for at2.test

The new PT_LOAD rule is complex. For convenience, I listed the origins of some subexpressions:

* rL323449: `sec->memRegion == load->firstSec->memRegion`; linkerscript/at3.test
* D43284: `load->lastSec == Out::programHeaders` (don't start a new PT_LOAD after program headers); linkerscript/at4.test
* D58892: `sec != relroEnd` (start a new PT_LOAD after PT_GNU_RELRO)

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D74297
2020-02-12 08:20:14 -08:00
Fangrui Song e21b9ca751 [ELF] Respect output section alignment for AT> (non-null lmaRegion)
When lmaRegion is non-null, respect `sec->alignment`
This rule is analogous to `switchTo(sec)` which advances sh_addr (VMA).

This fixes the p_paddr misalignment issue as reported by
https://android-review.googlesource.com/c/trusty/external/trusted-firmware-a/+/1230058

Note, `sec->alignment` is the maximum of ALIGN and input section alignments. We may overalign LMA than GNU ld.

linkerscript/align-lma.s has a FIXME that demonstrates another bug:
`.bss ... >RAM` should be placed in a different PT_LOAD (GNU ld
behavior) because its lmaRegion (nullptr) is different from the previous
section's lmaRegion (ROM).

Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D74286
2020-02-12 08:19:42 -08:00
Fangrui Song 9f854c0489 [ELF][RISCV] Add R_RISCV_IRELATIVE
https://github.com/riscv/riscv-elf-psabi-doc/pull/131 assigned 58 to R_RISCV_IRELATIVE.

Differential Revision: https://reviews.llvm.org/D74022
2020-02-10 20:22:39 -08:00
Fangrui Song 5f38040359 [ELF] Simplify parsing of version dependency. NFC 2020-02-08 14:10:29 -08:00
Nico Weber c29003813a Revert "[LLD][ELF][ARM] Do not substitute BL/BLX for non STT_FUNC symbols."
There are still problems after the fix in
"[ELF][ARM] Fix regression of BL->BLX substitution after D73542"
so let's revert to get trunk back to green while we investigate.
See https://reviews.llvm.org/D73542

This reverts commit 5461fa2b1f.
This reverts commit 0b4a047bfb.
2020-02-07 08:55:52 -05:00
Russell Gallop e7cb374433 [LLD][ELF] Add time-trace to ELF LLD
This adds some of LLD specific scopes and picks up optimisation scopes
via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in
77e6bb3c.

Differential Revision: https://reviews.llvm.org/D71060
2020-02-06 12:14:13 +00:00
Fangrui Song 5461fa2b1f [ELF][ARM] Fix regression of BL->BLX substitution after D73542
D73542 made a typo (`rel.type == R_PLT_PC`; should be `rel.expr`) and introduced a regression:
BL->BLX substitution was disabled when the target symbol is preemptible
(expr is R_PLT_PC).

The two added bl instructions in arm-thumb-interwork-shared.s check that
we patch BL to BLX.

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1047531
2020-02-05 14:09:14 -08:00
Fangrui Song da1973a241 [ELF][Mips] Drop an unneeded config->relocatable check 2020-01-31 21:00:28 -08:00
Jonas Devlieghere 3e24242a7d [lld] Replace SmallStr.str().str() with std::string conversion operator.
Use the std::string conversion operator introduced in
d7049213d0.
2020-01-29 21:30:21 -08:00
Fangrui Song 4a4ce14eb2 [ELF] Mention symbol name in reportRangeError()
For an out-of-range relocation referencing a non-local symbol, report the symbol name and the object file that defines the symbol. As an example:
```
t.o:(function func: .text.func+0x3): relocation R_X86_64_32S out of range: -281474974609120 is not in [-2147483648, 2147483647]
```
=>
```
t.o:(function func: .text.func+0x3): relocation R_X86_64_32S out of range: -281474974609120 is not in [-2147483648, 2147483647]; references func
>>> defined in t1.o
```

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D73518
2020-01-29 09:38:25 -08:00
Peter Smith 0b4a047bfb [LLD][ELF][ARM] Do not substitute BL/BLX for non STT_FUNC symbols.
D73474 disabled the generation of interworking thunks for branch
relocations to non STT_FUNC symbols. This patch handles the case of BL and
BLX instructions to non STT_FUNC symbols. LLD would normally look at the
state of the caller and the callee and write a BL if the states are the
same and a BLX if the states are different.

This patch disables BL/BLX substitution when the destination symbol does
not have type STT_FUNC. This brings our behavior in line with GNU ld which
may prevent difficult to diagnose runtime errors when switching to lld.

Differential Revision: https://reviews.llvm.org/D73542
2020-01-29 11:42:25 +00:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Fangrui Song e11b709b19 [ELF][PPC32] Support --emit-relocs link of R_PPC_PLTREL24
Similar to R_MIPS_GPREL16 and R_MIPS_GPREL32 (D45972).

If the addend of an R_PPC_PLTREL24 is >= 0x8000, it indicates that r30
is relative to the input section .got2.

```
addis 30, 30, .got2+0x8000-.L1$pb@ha
addi 30, 30, .got2+0x8000-.L1$pb@l
...
bl foo+0x8000@PLT
```

After linking, the relocation will be relative to the output section .got2.
To compensate for the shift `address(input section .got2) - address(output section .got2) = ppc32Got2OutSecOff`, adjust by `ppc32Got2OutSecOff`:

```
addis 30, 30, .got2+0x8000-.L1+ppc32Got2OutSecOff$pb@ha
addi 30, 30, .got2+0x8000-.L1+ppc32Got2OutSecOff$pb@ha$pb@l
...
bl foo+0x8000+ppc32Got2OutSecOff@PLT
```

This rule applys to a relocatable link or a non-relocatable link with --emit-relocs.

Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D73532
2020-01-28 11:04:04 -08:00
Peter Smith 4f38ab250f [LLD][ELF][ARM] Do not insert interworking thunks for non STT_FUNC symbols
ELF for the ARM architecture requires linkers to provide
interworking for symbols that are of type STT_FUNC. Interworking for
other symbols must be encoded directly in the object file. LLD was always
providing interworking, regardless of the symbol type, this breaks some
programs that have branches from Thumb state targeting STT_NOTYPE symbols
that have bit 0 clear, but they are in fact internal labels in a Thumb
function. LLD treats these symbols as ARM and inserts a transition to Arm.

This fixes the problem for in range branches, R_ARM_JUMP24,
R_ARM_THM_JUMP24 and R_ARM_THM_JUMP19. This is expected to be the vast
majority of problem cases as branching to an internal label close to the
function.

There is at least one follow up patch required.
- R_ARM_CALL and R_ARM_THM_CALL may do interworking via BL/BLX
  substitution.

In theory range-extension thunks can be altered to not change state when
the symbol type is not STT_FUNC. I will need to check with ld.bfd to see if
this is the case in practice.

Fixes (part of) https://github.com/ClangBuiltLinux/linux/issues/773

Differential Revision: https://reviews.llvm.org/D73474
2020-01-28 11:54:18 +00:00
Peter Smith 3238b03c19 [LLD][ELF][ARM] clang-format function signature [NFC]
ARM::needsThunk had gone over 80 characters, run clang-format over it to
prevent it wrapping.
2020-01-28 11:54:18 +00:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Fangrui Song 70389be7a0 [ELF][PPC32] Support range extension thunks with addends
* Generalize the code added in D70637 and D70937. We should eventually remove the EM_MIPS special case.
* Handle R_PPC_LOCAL24PC the same way as R_PPC_REL24.

Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D73424
2020-01-25 22:32:42 -08:00
Fangrui Song 837e8a9c0c [ELF][PPC32] Support canonical PLT
-fno-pie produces a pair of non-GOT-non-PLT relocations R_PPC_ADDR16_{HA,LO} (R_ABS) referencing external
functions.

```
lis 3, func@ha
la 3, func@l(3)
```

In a -no-pie/-pie link, if func is not defined in the executable, a canonical PLT entry (st_value>0, st_shndx=0) will be needed.
References to func in shared objects will be resolved to this address.
-fno-pie -pie should fail with "can't create dynamic relocation ... against ...", so we just need to think about -no-pie.

On x86, the PLT entry passes the JMP_SLOT offset to the rtld PLT resolver.
On x86-64: the PLT entry passes the JUMP_SLOT index to the rtld PLT resolver.
On ARM/AArch64: the PLT entry passes &.got.plt[n]. The PLT header passes &.got.plt[fixed-index]. The rtld PLT resolver can compute the JUMP_SLOT index from the two addresses.

For these targets, the canonical PLT entry can just reuse the regular PLT entry (in PltSection).

On PPC32: PltSection (.glink) consists of `b PLTresolve` instructions and `PLTresolve`. The rtld PLT resolver depends on r11 having been set up to the .plt (GotPltSection) entry.
On PPC64 ELFv2: PltSection (.glink) consists of `__glink_PLTresolve` and `bl __glink_PLTresolve`. The rtld PLT resolver depends on r12 having been set up to the .plt (GotPltSection) entry.

We cannot reuse a `b PLTresolve`/`bl __glink_PLTresolve` in PltSection as a canonical PLT entry. PPC64 ELFv2 avoids the problem by using TOC for any external reference, even in non-pic code, so the canonical PLT entry scenario should not happen in the first place.
For PPC32, we have to create a PLT call stub as the canonical PLT entry. The code sequence sets up r11.

Reviewed By: Bdragon28

Differential Revision: https://reviews.llvm.org/D73399
2020-01-25 17:56:37 -08:00
Fangrui Song deb5819d62 [ELF] Rename relocateOne() to relocate() and pass `Relocation` to it
Symbol information can be used to improve out-of-range/misalignment diagnostics.
It also helps R_ARM_CALL/R_ARM_THM_CALL which has different behaviors with different symbol types.

There are many (67) relocateOne() call sites used in thunks, {Arm,AArch64}errata, PLT, etc.
Rename them to `relocateNoSym()` to be clearer that there is no symbol information.

Reviewed By: grimar, peter.smith

Differential Revision: https://reviews.llvm.org/D73254
2020-01-25 12:00:18 -08:00
Fangrui Song f1dab29908 [ELF][PowerPC] Support R_PPC_COPY and R_PPC64_COPY
Reviewed By: Bdragon28, jhenderson, grimar, sfertile

Differential Revision: https://reviews.llvm.org/D73255
2020-01-24 09:06:20 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Teresa Johnson 59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Fangrui Song 0fbf28f7aa [ELF] --no-dynamic-linker: don't emit undefined weak symbols to .dynsym
I felt really sad to push this commit for my selfish purpose to make
glibc -static-pie build with lld. Some code constructs in glibc require
R_X86_64_GOTPCREL/R_X86_64_REX_GOTPCRELX referencing undefined weak to
be resolved to a GOT entry not relocated by R_X86_64_GLOB_DAT (GNU ld
behavior), e.g.

csu/libc-start.c
  if (__pthread_initialize_minimal != NULL)
    __pthread_initialize_minimal ();

elf/dl-object.c
  void
  _dl_add_to_namespace_list (struct link_map *new, Lmid_t nsid)
  {
    /* We modify the list of loaded objects.  */
    __rtld_lock_lock_recursive (GL(dl_load_write_lock));

Emitting a GLOB_DAT will make the address equal &__ehdr_start (true
value) and cause elf/ldconfig to segfault. glibc really should move away
from weak references, which do not have defined semantics.

Temporarily special case --no-dynamic-linker.
2020-01-23 12:25:15 -08:00
Fangrui Song 1e57038bf2 [ELF] Pass `Relocation` to relaxGot and relaxTls{GdToIe,GdToLe,LdToLe,IeToLe}
These functions call relocateOne(). This patch is a prerequisite for
making relocateOne() aware of `Symbol` (D73254).

Reviewed By: grimar, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D73250
2020-01-23 10:39:25 -08:00
Thomas Preud'homme c42fe24754 [lld/ELF] PR44498: Support input filename in double quote
Summary:
Linker scripts allow filenames to be put in double quotes to prevent
characters in filenames that are part of the linker script syntax from
having their special meaning. Case in point the * wildcard character.

Availability of double quoting filenames also allows to fix a failure in
ELF/linkerscript/filename-spec.s when the path contain a @ which the
lexer consider as a special characters and thus break up a filename
containing it. This may happens under Jenkins which createspath such as
pipeline@2.

To avoid the need for escaping GlobPattern metacharacters in filename
in double quotes, GlobPattern::create is augmented with a new parameter
to request literal matching instead of relying on the presence of a
wildcard character in the pattern.

Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap

Reviewed By: MaskRay

Subscribers: peter.smith, grimar, ruiu, emaste, arichardson, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72517
2020-01-22 12:03:10 +00:00
Peter Smith e727f39ec0 [LLD][ELF][ARM] Don't apply --fix-cortex-a8 to relocatable links.
The --fix-cortex-a8 is sensitive to alignment and the precise destination
of branch instructions. These are not knowable at relocatable link time. We
follow GNU ld and the --fix-cortex-a53-843419 (D72968) by not patching the
code when there is a relocatable link.

Differential Revision: https://reviews.llvm.org/D73100
2020-01-22 11:03:40 +00:00
Sid Manning 6b9a5e6f05 [lld][Hexagon] Add General Dynamic relocations (GD)
Differential revision: https://reviews.llvm.org/D72522
2020-01-21 14:10:03 -06:00
Andrew Ng 4e8116f469 [ELF] Refactor uses of getInputSections to improve efficiency NFC
Add new method getFirstInputSection and use instead of getInputSections
where appropriate to avoid creation of an unneeded vector of input
sections.

Differential Revision: https://reviews.llvm.org/D73047
2020-01-21 12:27:52 +00:00
Peter Smith dbd0ad3366 [LLD][ELF] Add support for INPUT_SECTION_FLAGS
The INPUT_SECTION_FLAGS linker script command is used to constrain the
section pattern matching to sections that match certain combinations of
flags.

There are two ways to express the constraint.
withFlags: Section must have these flags.
withoutFlags: Section must not have these flags.

The syntax of the command is:
INPUT_SECTION_FLAGS '(' sect_flag_list ')'
sect_flag_list: NAME
| sect_flag_list '&' NAME

Where NAME matches a section flag name such as SHF_EXECINSTR, or the
integer value of a section flag. If the first character of NAME is ! then
it means must not contain flag.

We do not support the rare case of { INPUT_SECTION_FLAGS(flags) filespec }
where filespec has no input section description like (.text).

As an example from the ld man page:
SECTIONS {
  .text : { INPUT_SECTION_FLAGS (SHF_MERGE & SHF_STRINGS) *(.text) }
  .text2 :  { INPUT_SECTION_FLAGS (!SHF_WRITE) *(.text) }
}
.text will match sections called .text that have both the SHF_MERGE and
SHF_STRINGS flag.
.text2 will match sections called .text that don't have the SHF_WRITE flag.

The flag names accepted are the generic to all targets and SHF_ARM_PURECODE
as it is very useful to filter all the pure code sections into a single
program header that can be marked execute never.

fixes PR44265

Differential Revision: https://reviews.llvm.org/D72756
2020-01-21 10:05:26 +00:00
James Clarke d1da63664f [lld][RISCV] Print error when encountering R_RISCV_ALIGN
Summary:
Unlike R_RISCV_RELAX, which is a linker hint, R_RISCV_ALIGN requires the
support of the linker even when ignoring all R_RISCV_RELAX relocations.
This is because the compiler emits as many NOPs as may be required for
the requested alignment, more than may be required pre-relaxation, to
allow for the target becoming more unaligned after relaxing earlier
sequences. This means that the target is often not initially aligned in
the object files, and so the R_RISCV_ALIGN relocations cannot just be
ignored. Since we do not support linker relaxation, we must turn these
into errors.

Reviewers: ruiu, MaskRay, espindola

Reviewed By: MaskRay

Subscribers: grimar, Jim, emaste, arichardson, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71820
2020-01-21 02:49:45 +00:00
Eli Friedman c81fe34718 [lld][ELF] Don't apply --fix-cortex-a53-843419 to relocatable links.
The code doesn't apply the fix correctly to relocatable links. I could
try to fix the code that applies the fix, but it's pointless: we don't
actually know what the offset will be in the final executable. So just
ignore the flag for relocatable links.

Issue discovered building Android.

Differential Revision: https://reviews.llvm.org/D72968
2020-01-20 15:27:41 -08:00
Fangrui Song 6ab89c3c5d [ELF] Allow R_PLT_PC (R_PC) to a hidden undefined weak symbol
This essentially reverts b841e119d7.

Such code construct can be used in the following way:

  // glibc/stdlib/exit.c
  // clang -fuse-ld=lld => succeeded
  // clang -fuse-ld=lld -fpie -pie => relocation R_PLT_PC cannot refer to absolute symbol
  __attribute__((weak, visibility("hidden"))) extern void __call_tls_dtors();
  void __run_exit_handlers() {
    if (__call_tls_dtors)
        __call_tls_dtors();
  }

Since we allow R_PLT_PC in -no-pie mode, it makes sense to allow it in
-pie mode as well.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D72943
2020-01-17 13:06:42 -08:00
Peter Smith 01ad4c8384 [LLD][ELF][ARM][AArch64] Only round up ThunkSection Size when large OS.
In D71281 a fix was put in to round up the size of a ThunkSection to the
nearest 4KiB when performing errata patching. This fixed a problem with a
very large instrumented program that had thunks and patches mutually
trigger each other. Unfortunately it triggers an assertion failure in an
AArch64 allyesconfig build of the kernel. There is a specific assertion
preventing an InputSectionDescription being larger than 4KiB. This will
always trigger if there is at least one Thunk needed in that
InputSectionDescription, which is possible for an allyesconfig build.

Abstractly the problem case is:
.text : {
          *(.text) ;
          ...
          . = ALIGN(SZ_4K);
          __idmap_text_start = .;
          *(.idmap.text)
          __idmap_text_end = .;
          ...
        }
The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB.
Note that there is more than one InputSectionDescription in the
OutputSection so we can't just restrict the fix to OutputSections smaller
than 4 KiB.

The fix presented here limits the D71281 to InputSectionDescriptions that
meet the following conditions:
1.) The OutputSection is bigger than the thunkSectionSpacing so adding
thunks will affect the addresses of following code.
2.) The InputSectionDescription is larger than 4 KiB. This will prevent
any assertion failures that an InputSectionDescription is < 4 KiB
in size.

We do this at ThunkSection creation time as at this point we know that
the addresses are stable and up to date prior to adding the thunks as
assignAddresses() will have been called immediately prior to thunk
generation.

The fix reverts the two tests affected by D71281 to their original state
as they no longer need the 4KiB size roundup. I've added simpler tests to
check for D71281 when the OutputSection size is larger than the ThunkSection
spacing.

Fixes https://github.com/ClangBuiltLinux/linux/issues/812

Differential Revision: https://reviews.llvm.org/D72344
2020-01-17 10:47:21 +00:00
Fangrui Song 2d7a8cf904 [ELF] -r: don't create .interp
`{clang,gcc} -nostdlib -r a.c` passes --dynamic-linker to the linker,
and the expected behavior is to ignore it.

If .interp is kept in the relocatable object file, a final link will get
PT_INTERP even if --dynamic-linker is not specified. glibc ld.so expects
to see PT_DYNAMIC and the executable will likely fail to run.

Ignore --dynamic-linker in -r mode as well as -shared.
2020-01-16 12:14:32 -08:00
Fangrui Song 870094decf [ELF] Decrease alignment of ThunkSection on 64-bit targets from 8 to 4
ThunkSection contains 4-byte instructions on all targets that use
thunks. Thunks should not be used in any performance sensitive places,
and locality/cache line/instruction fetching arguments should not apply.

We use 16 bytes as preferred function alignments for modern PowerPC cores.
In any case, 8 is not optimal.

Differential Revision: https://reviews.llvm.org/D72819
2020-01-16 10:36:33 -08:00
Andrew Ng d36b2649e5 [ELF] Optimization to LinkerScript::computeInputSections NFC
Moved the section name check ahead of any filename matching or
exclusion. Firstly, this reduces the need to retrieve the filename and
secondly, reduces the amount of potentially expensive filename pattern
matching if such rules are present in the linker script.

The impact of this change is particularly significant when linking
objects built with -ffunction-sections and -fstack-size-section, using a
linker script that includes non-trivial filename patterns. In a number
of such cases, the link time is halved.

Differential Revision: https://reviews.llvm.org/D72775
2020-01-16 13:56:02 +00:00
Alex Richardson 441410be47 [ELF] Avoid false-positive assert in getErrPlace()
This assertion was added as part of D70659 but did not account for .bss
input sections. I noticed that this assert was incorrectly triggering
while building FreeBSD for MIPS64. Fixed by relaxing the assert to also
account for SHT_NOBITS input sections and adjust the test
mips-jalr-non-function.s to link a file with a .bss section first.

Reviewed By: MaskRay, grimar
Differential Revision: https://reviews.llvm.org/D72567
2020-01-15 14:32:25 +00:00
Fangrui Song bec1b55c64 [ELF] Delete the RelExpr member R_HINT. NFC
R_HINT is ignored like R_NONE. There are no strong reasons to keep
R_HINT. The largest RelExpr member R_RISCV_PC_INDIRECT is 60 now.

Differential Revision: https://reviews.llvm.org/D71822
2020-01-14 10:56:53 -08:00
Fangrui Song 40c5bd4212 [ELF] --exclude-libs: don't assign VER_NDX_LOCAL to undefined symbols
Suggested by Peter Collingbourne.

Non-VER_NDX_GLOBAL versions should not be assigned to defined symbols. --exclude-libs violates this and can cause a spurious error "cannot refer to absolute symbol" after D71795.

excludeLibs incorrectly assigns VER_NDX_LOCAL to an undefined weak symbol =>
isPreemptible is false =>
R_PLT_PC is optimized to R_PC =>
in isStaticLinkTimeConstant, an error is emitted.

Reviewed By: pcc, grimar

Differential Revision: https://reviews.llvm.org/D72681
2020-01-14 10:12:28 -08:00
Fangrui Song d9819f3662 [ELF] Delete unintended --force-bti 2020-01-13 23:57:00 -08:00
Fangrui Song 7cd429f27d [ELF] Add -z force-ibt and -z shstk for Intel Control-flow Enforcement Technology
This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang.

It adds Intel CET (Control-flow Enforcement Technology) support to lld.
The implementation follows the draft version of psABI which you can
download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI.

CET introduces a new restriction on indirect jump instructions so that
you can limit the places to which you can jump to using indirect jumps.

In order to use the feature, you need to compile source files with
-fcf-protection=full.

* IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt.
* SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified.

IBT-enabled executables/shared objects have two PLT sections, ".plt" and
".plt.sec".  For the details as to why we have two sections, please read
the comments.

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D59780
2020-01-13 23:39:28 -08:00
Fangrui Song 2d077d6dfa [ELF] Make TargetInfo::writeIgotPlt a no-op
RELA targets don't read initial .got.plt entries.
REL targets (ARM, x86-32) write the address of the IFUNC resolver to the
entry (`write32le(buf, s.getVA())`).

The default writeIgotPlt() is not meaningful. Make it a no-op. AArch64
and x86-64 will have 0 as initial .got.plt entries associated with
IFUNC.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D72474
2020-01-10 09:59:22 -08:00
Wei Mi 21a4710c67 [ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP
down to pass builder in ltobackend.

Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.

Differential Revision: https://reviews.llvm.org/D72386
2020-01-09 21:13:11 -08:00
Fangrui Song 375371cc8b [ELF] Fix includeInDynsym() when an undefined weak is merged with a lazy definition
An undefined weak does not fetch the lazy definition. A lazy weak symbol
should be considered undefined, and thus preemptible if .dynsym exists.

D71795 is not quite an NFC. It errors on an R_X86_64_PLT32 referencing
an undefined weak symbol. isPreemptible is false (incorrect) => R_PLT_PC
is optimized to R_PC => in isStaticLinkTimeConstant, an error is emitted
when an R_PC is applied on an undefined weak (considered absolute).
2020-01-09 16:24:02 -08:00
Alex Richardson 1444e6e2e6 Re-apply "[ELF] Allow getErrPlace() to work before Out::bufferStart is set"
This time with a fix for the UBSAN failure.

Differential Revision: https://reviews.llvm.org/D70659
2020-01-09 20:26:31 +00:00
Sid Manning 0fa8f701cc [ELF][Hexagon] Add support for IE relocations
Differential Revision: https://reviews.llvm.org/D71143
2020-01-09 09:45:24 -06:00
Fangrui Song b841e119d7 [ELF] Delete an unused special rule from isStaticLinkTimeConstant. NFC
Weak undefined symbols are preemptible after D71794.

  if (sym.isPreemptible)
    return false;
  if (!config->isPic)
    return true;
  // isPic means includeInDynsym is true after D71794.

  ...

  // We can delete this if because it can never be true.
  if (sym.isUndefWeak)
    return true;

Differential Revision: https://reviews.llvm.org/D71795
2020-01-08 09:41:59 -08:00
Fangrui Song 96e2376d02 [ELF] Don't special case weak symbols for pie with no shared objects
D59275 added the following clause to Symbol::includeInDynsym()

  if (isUndefWeak() && Config->Pie && SharedFiles.empty())
    return false;

D59549 explored the possibility to generalize it for -no-pie.

GNU ld's rules are architecture dependent and partly controlled by -z
{,no-}dynamic-undefined-weak. Our attempts to mimic its rules are
actually half-baked and don't provide perceivable benefits (it can save
a few more weak undefined symbols in .dynsym in a -static-pie
executable). Let's just delete the rule for simplicity. We will expect
cosmetic inconsistencies with ld.bfd in certain -static-pie scenarios.

This permits a simplification in D71795.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D71794
2020-01-08 09:38:49 -08:00
Peter Smith 051c4d5b7b [LLD][ELF][AArch64] Do not use thunk for undefined weak symbol.
In AArch64 a branch to an undefined weak symbol that does not have a PLT
entry should resolve to the next instruction. The thunk generation code
can prevent this from happening as a range extension thunk can be generated
if the branch is sufficiently far away from 0, the value of an undefined
weak symbol.

The fix is taken from the Arm implementation of needsThunk(), we prevent a
thunk from being generated to an undefined weak symbol.

fixes pr44451

Differential Revision: https://reviews.llvm.org/D72267
2020-01-07 09:57:51 +00:00
Kazuaki Ishizaki 7ae3d33546 [lld] Fix trivial typos in comments
Reviewed By: ruiu, MaskRay

Differential Revision: https://reviews.llvm.org/D72196
2020-01-06 10:25:48 -08:00
Fangrui Song 085898d469 [ELF] Drop const qualifier to fix -Wrange-loop-analysis. NFC
```
lld/ELF/Relocations.cpp:1622:56: warning: loop variable 'ts' of type 'const std::pair<ThunkSection *, uint32_t>' (aka 'const pair<lld:🧝:ThunkSection *, unsigned int>') creates a copy from type 'const std::pair<ThunkSection *, uint32_t>' [-Wrange-loop-analysis]
        for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections)
```

Drop const qualifier to fix -Wrange-loop-analysis.
We can make -Wrange-loop-analysis warnings (DiagnoseForRangeConstVariableCopies) on `const A` more
permissive on more types (e.g. POD -> trivially copyable), unfortunately it will not make std::pair
good, because `constexpr pair& operator=(const pair& p);` is unfortunately user-defined.

Reviewed By: Mordante

Differential Revision: https://reviews.llvm.org/D72211
2020-01-04 12:24:39 -08:00
Sid Manning 81ffe89735 Add TPREL relocation support to Hexagon
Differential Revision: https://reviews.llvm.org/D71069
2020-01-02 11:18:26 -06:00
Fangrui Song 681b1be774 [lld] Fix -Wrange-loop-analysis warnings
One instance looks like a false positive:

lld/ELF/Relocations.cpp:1622:14: note: use reference type 'const std::pair<ThunkSection *, uint32_t> &' (aka 'cons
t pair<lld:🧝:ThunkSection *, unsigned int> &') to prevent copying
        for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections)

It is not changed in this commit.
2020-01-01 15:41:20 -08:00
Fangrui Song e3e13db714 [ELF][RISCV] Improve error message for unknown relocations
Like rLLD354040.
2019-12-31 16:09:55 -08:00
Fangrui Song bb87364f26 [ELF][PPC64] Improve "call lacks nop" diagnostic and make it compatible with GCC<5.5 and GCC<6.4
GCC before r245813 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79439)
did not emit nop after b/bl. This can happen with recursive calls.
r245813 was back ported to GCC 5.5 and GCC 6.4.

This is common, for example, libstdc++.a(locale.o) shipped with GCC 4.9
and many objects in netlib lapack can cause lld to error.  gold allows
such calls to the same section. Our __plt_foo symbol's `section` field
is used for ThunkSection, so we can't implement a similar loosen rule
easily. But we can make use of its `file` field which is currently NULL.

Differential Revision: https://reviews.llvm.org/D71639
2019-12-29 23:05:11 -08:00
Fangrui Song fb2944bd7f [ELF][PPC32] Implement IPLT code sequence for non-preemptible IFUNC
Similar to D71509 (EM_PPC64), on EM_PPC, the IPLT code sequence should
be similar to a PLT call stub. Unlike EM_PPC64, EM_PPC -msecure-plt has
small/large PIC model differences.

* -fpic/-fpie: R_PPC_PLTREL24 r_addend=0.  The call stub loads an address relative to `_GLOBAL_OFFSET_TABLE_`.
* -fPIC/-fPIE: R_PPC_PLTREL24 r_addend=0x8000. (A partial linked object
  file may have an addend larger than 0x8000.) The call stub loads an address relative to .got2+0x8000.

Just assume large PIC model for now. This patch makes:

  // clang -fuse-ld=lld -msecure-plt -fno-pie -no-pie a.c
  // clang -fuse-ld=lld -msecure-plt -fPIE -pie a.c
  #include <stdio.h>
  static void impl(void) { puts("meow"); }
  void thefunc(void) __attribute__((ifunc("resolver")));
  void *resolver(void) { return &impl; }
  int main(void) {
    thefunc();
    void (*theptr)(void) = &thefunc;
    theptr();
  }

work on Linux glibc. -fpie will crash because the compiler and the
linker do not agree on the value which r30 stores (_GLOBAL_OFFSET_TABLE_
vs .got2+0x8000).

Differential Revision: https://reviews.llvm.org/D71621
2019-12-29 22:42:53 -08:00
Fangrui Song 45acc35ac2 [ELF][PPC64] Implement IPLT code sequence for non-preemptible IFUNC
Non-preemptible IFUNC are placed in in.iplt (.glink on EM_PPC64).  If
there is a non-GOT non-PLT relocation, for pointer equality, we change
the type of the symbol from STT_IFUNC and STT_FUNC and bind it to the
.glink entry.

On EM_386, EM_X86_64, EM_ARM, and EM_AARCH64, the PLT code sequence
loads the address from its associated .got.plt slot. An IPLT also has an
associated .got.plt slot and can use the same code sequence.

On EM_PPC64, the PLT code sequence is actually a bl instruction in
.glink .  It jumps to `__glink_PLTresolve` (the PLT header). and
`__glink_PLTresolve` computes the .plt slot (relocated by
R_PPC64_JUMP_SLOT).

An IPLT does not have an associated R_PPC64_JUMP_SLOT, so we cannot use
`bl` in .iplt . Instead, create a call stub which has a similar code
sequence as PPC64PltCallStub. We don't save the TOC pointer, so such
scenarios will not work: a function pointer to a non-preemptible ifunc,
which resolves to a function defined in another DSO. This is the
restriction described by https://sourceware.org/glibc/wiki/GNU_IFUNC
(though on many architectures it works in practice):

  Requirement (a): Resolver must be defined in the same translation unit as the implementations.

If an ifunc is taken address but not called, technically we don't need
an entry for it, but we currently do that.

This patch makes

  // clang -fuse-ld=lld -fno-pie -no-pie a.c
  // clang -fuse-ld=lld -fPIE -pie a.c
  #include <stdio.h>
  static void impl(void) { puts("meow"); }
  void thefunc(void) __attribute__((ifunc("resolver")));
  void *resolver(void) { return &impl; }
  int main(void) {
    thefunc();
    void (*theptr)(void) = &thefunc;
    theptr();
  }

work on Linux glibc and FreeBSD. Calling a function pointer pointing to
a Non-preemptible IFUNC never worked before.

Differential Revision: https://reviews.llvm.org/D71509
2019-12-29 22:40:03 -08:00
Fangrui Song dce7a362be [ELF] Improve the condition to create .interp
This restores commit 1417558e4a and its follow-up, reverted by commit c3dbd782f1.

After this commit:

clang -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created
clang -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created

clang -fuse-ld=gold -no-pie -nostdlib a.c => .interp not created
clang -fuse-ld=gold -pie -fPIE -nostdlib a.c => .interp created

clang -fuse-ld=lld -no-pie -nostdlib a.c => .interp created
clang -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp created
2019-12-27 15:34:25 -08:00
Reid Kleckner c3dbd782f1 Revert "[ELF] Improve the condition to create .interp"
This reverts commit 1417558e4a.
Also reverts commit 019a92bb28.

This causes check-sanitizer to fail. The "-Nolib" variant of the test
crashes on startup in the loader.
2019-12-27 13:05:41 -08:00
Fangrui Song 1417558e4a [ELF] Improve the condition to create .interp
Similar to rL362355, but with the `!config->shared` guard.

(1) {gcc,clang} -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created
(2) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp not created
(3) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c a.so => .interp created

The inconsistency of (2) is due to the condition `!Config->SharedFiles.empty()`.
To make lld behave more like ld.bfd, we could change the condition to:

    config->hasDynSymTab && !config->dynamicLinker.empty() && script->needsInterpSection();

However, that would bring another inconsistency as can be observed with:

(4) {gcc,clang} -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created
2019-12-26 13:26:43 -08:00
Fangrui Song 1edd965130 [ELF] Support input section description .gnu.version* in /DISCARD/
Linux powerpc discards `*(.gnu.version*)` (arch/powerpc/kernel/vmlinux.lds.S)
to suppress --orphan-handling=warn warnings in the -pie output `.tmp_vmlinux1`

The support is simple. Just add isLive() to:

1) Fix an assertion in SectionBase::getPartition() called by VersionTableSection::isNeeded().
2) Suppress DT_VERSYM, DT_VERDEF, DT_VERNEED and DT_VERNEEDNUM, if the relevant section is discarded.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D71819
2019-12-26 09:54:22 -08:00
Fangrui Song 261b7b4a6b [ELF] Don't suggest an alternative spelling for a symbol in a discarded section
For undef-not-suggest.test, we currently make redundant alternative
spelling suggestions:

```
ld.lld: error: relocation refers to a discarded section: .text.foo
>>> defined in a.o
>>> section group signature: foo
>>> prevailing definition is in a.o
>>> referenced by a.o:(.rodata+0x0)
>>> did you mean:
>>> defined in: a.o

ld.lld: error: relocation refers to a symbol in a discarded section: foo
>>> defined in a.o
>>> section group signature: foo
>>> prevailing definition is in a.o
>>> referenced by a.o:(.rodata+0x8)
>>> did you mean: for
>>> defined in: a.o
```

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71735
2019-12-23 09:10:29 -08:00
Fangrui Song 2539cd22e9 [ELF] Delete a redundant R_HINT check from isStaticLinkTimeConstant(). NFC
scanReloc() returns when it sees an R_HINT.
2019-12-22 16:58:22 -08:00
John Baldwin 189b7393d5 [lld][RISCV] Use an e_flags of 0 if there are only binary input files.
Summary:
If none of the input files are ELF object files (for example, when
generating an object file from a single binary input file via
"-b binary"), use a fallback value for the ELF header flags instead
of crashing with an assertion failure.

Reviewers: MaskRay, ruiu, espindola

Reviewed By: MaskRay, ruiu

Subscribers: kevans, grimar, emaste, arichardson, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits, jrtc27

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71101
2019-12-21 17:59:37 +00:00
Fangrui Song 37b2808059 [ELF] writePlt, writeIplt: replace parameters gotPltEntryAddr and index with `const Symbol &`. NFC
PPC::writeIplt (IPLT code sequence, D71621) needs to access `Symbol`.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71631
2019-12-18 00:14:03 -08:00
Fangrui Song 07522e4e23 [ELF] Fix a comment. NFC 2019-12-17 17:17:33 -08:00
Fangrui Song 345f59667d [ELF] Rename .plt to .iplt and decrease EM_PPC{,64} alignment of .glink to 4
GNU ld creates the synthetic section .iplt, and has a built-in linker
script that assigns .iplt to the output section .plt . There is no
output section named .iplt .

Making .iplt an output section actually has a benefit that makes the
tricky toolchain feature stand out. Symbolizers don't have to deal with
mixed PLT entries (e.g. llvm-objdump -d incorrectly annotates such jump
targets).

On EM_PPC{,64}, .glink contains a PLT resolver and a series of jump
instructions. The 4-byte entry size makes it unnecessary to have an
alignment of 16.

Mark ppc32-gnu-ifunc.s and ppc32-gnu-ifunc-nonpreemptable.s as `XFAIL: *`.
They test IPLT on EM_PPC, which never works.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D71520
2019-12-17 00:15:59 -08:00
Fangrui Song 891a8655ab [ELF] Add IpltSection
PltSection is used by both PLT and IPLT. The PLT section may have a
header while the IPLT section does not. Split off IpltSection from
PltSection to be clearer.

Unlike other targets, PPC64 cannot use the same code sequence for PLT
and IPLT. This helps make a future PPC64 patch (D71509) more isolated.

On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently
we are inconsistent whether the PLT header is conceptually attached to
in.plt or in.iplt .  Consistently attach the header to in.plt can make
the -z retpolineplt logic simpler. It also makes `jmp` point to an
aesthetically better place for non-retpolineplt cases.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71519
2019-12-17 00:06:04 -08:00
Fangrui Song ee912fe6a1 [ELF] Delete unused declaration addIRelativeRelocs after D65995. NFC 2019-12-16 11:19:22 -08:00
Fangrui Song 90d195d026 [ELF] Delete relOff from TargetInfo::writePLT
This change only affects EM_386. relOff can be computed from `index`
easily, so it is unnecessarily passed as a parameter.

Both in.plt and in.iplt entries are written by writePLT. For in.iplt,
the instruction `push reloc_offset` will change because `index` is now
different. Fortunately, this does not matter because `push; jmp` is only
used by PLT. IPLT does not need the code sequence.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71518
2019-12-16 11:10:02 -08:00
Fangrui Song 98afa2c1f1 [ELF] De-template PltSection::addEntry. NFC 2019-12-16 11:03:20 -08:00
Fangrui Song f036f1cc85 [ELF] Delete redundant isLive() check. NFC 2019-12-15 21:59:55 -08:00
Vlad Tsyrklevich 17063abd1e Revert "[ELF] Allow getErrPlace() to work before Out::bufferStart is set"
This reverts commit 2bbd32f5e8, it was
causing UBSan failures like the following:
lld/ELF/Target.cpp:103:41: runtime error: applying non-zero offset 24 to null pointer
2019-12-13 09:43:51 -08:00
Fangrui Song 69d10d282e [ELF] Update st_size when merging a common symbol with a shared symbol
When a common symbol is merged with a shared symbol, increase st_size if
the shared symbol has a larger st_size. At runtime, the executable's
symbol overrides the shared symbol.  The shared symbol may be created
from common symbols in a previous link.  This rule makes sure we pick
the largest size among all common symbols.

This behavior matches GNU ld. See
https://sourceware.org/bugzilla/show_bug.cgi?id=25236 for discussions.

A shared symbol does not hold alignment constraints. Ignore the
alignment update.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D71161
2019-12-13 09:23:36 -08:00
Alex Richardson 2bbd32f5e8 [ELF] Allow getErrPlace() to work before Out::bufferStart is set
Summary:
So far it seems like the only test affected by this change is the one I
recently added for R_MIPS_JALR relocations since the other test cases that
use this function early (unknown-relocation-*) do not have a valid input
section for the relocation offset.

Reviewers: ruiu, grimar, MaskRay, espindola

Reviewed By: ruiu, MaskRay

Subscribers: emaste, sdardis, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70659
2019-12-13 12:19:55 +00:00
Rui Ueyama 69da7e29de Revert an accidental commit af5ca40b47 2019-12-13 15:17:40 +09:00
Rui Ueyama af5ca40b47 temporary 2019-12-13 14:35:03 +09:00
Fangrui Song ba8149e27d [ELF] Add a comment to handleSectionGroup(). NFC
Apply suggestion in https://reviews.llvm.org/D71157#1780834

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71388
2019-12-12 09:23:59 -08:00
Fangrui Song 5a3a9e9927 [ELF][AArch64] Rename --force-bti to -z force-bti and --pac-plt to -z pac-plt
Summary:
The original design used --foo but the upstream complained that ELF only
options should be -z foo. See https://sourceware.org/ml/binutils/2019-04/msg00151.html
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=8bf6d176b0a442a8091d338d4af971591d19922c
made the rename.

Our --force-bti and --pac-plt implement the same functionality, so it
seems wise to be consistent with GNU ld.

Reviewed By: peter.smith

Subscribers: emaste, arichardson, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71327
2019-12-11 09:26:32 -08:00
Peter Smith 86d24193a9 [LLD][ELF][AArch64][ARM] When errata patching, round thunk size to 4KiB.
On some edge cases such as Chromium compiled with full instrumentation we
have a .text section over twice the size of the maximum branch range and
the instrumented code generation containing many examples of the erratum
sequence. The combination of Thunks and many erratum sequences causes
finalizeAddressDependentContent() to not converge. We end up with:
start
- Thunk Creation (disturbs addresses after thunks, creating more patches)
- Patch Creation (disturbs addresses after patches, creating more thunks)
- goto start

In most images with few thunks and patches the mutual disturbance does not
cause convergence problems. As the .text size and number of patches go up
the risk increases.

A way to prevent the thunk creation from interfering with patch creation is
to round up the size of the thunks to a 4KiB boundary when the
erratum patch is enabled. As the erratum sequence only triggers when an
instruction sequence starts at 0xff8 or 0xffc modulo (4 KiB) by making the
thunks not affect addresses modulo (4 KiB) we prevent thunks from
interfering with the patch.

The patches themselves could be aggregated in the same way that Thunks are
within ThunkSections and we could round up the size in the same way. This
would reduce the number of patches created in a .text section size >
128 MiB but would not likely help convergence problems.

Differential Revision: https://reviews.llvm.org/D71281

fixes (remaining part of) pr44071, other part in D71242
2019-12-11 14:09:15 +00:00
Peter Smith 247b2ce11c [LLD][ELF][AArch64][ARM] Add missing classof to patch sections.
The code to insert patch section merges them with a comparison function that
uses logic of the form:
return (isa<PatchSection>(a) && !isa<PatchSection>(b));
If the PatchSections don't implement classof this check fails if b is also
a SyntheticSection. This can result in the patches being out of range if
the SyntheticSection is big, for example a ThunkSection with lots of thunks.

Differential Revision: https://reviews.llvm.org/D71242

fixes (part of) pr44071
2019-12-11 14:09:15 +00:00
Fangrui Song 6e513a5382 [ELF] Move a computeIsPreemptible() pass into ICF. NFC
Address post-commit review for D71163.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D71326
2019-12-10 22:21:05 -08:00
Fangrui Song cd0ab2428f [ELF] --icf: do not fold preemptible symbols
Fixes PR44124.

A preemptible symbol may refer to a different definition at runtime.
When comparing a pair of relocations, if they refer to different
symbols, and either symbol is preemptible, the two containing sections
should be considered different.

gold has a similar rule https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=ce97fa81e0c46d216b80b143ad8c02fff6906fef

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D71163
2019-12-10 09:06:08 -08:00
Fangrui Song 60ce444eaa [ELF] Refine section group --gc-sections rules to not discard .debug_types
clang/gcc -fdebug-type-sections places .debug_types and
.rela.debug_types in a section group, with a signature symbol which
represents the type signature. The section group is for deduplication
purposes.

After D70146, we will discard such section groups. Refine the rule so
that we will retain the group if no member has the SHF_ALLOC flag.

GNU ld has a similar rule to retain the group if all members have the
SEC_DEBUGGING flag. We try to be more general for future-proof purposes:
if other non-SHF_ALLOC sections have deduplication needs, they may be
placed in a section group. Don't discard them.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D71157
2019-12-10 09:00:58 -08:00
Fangrui Song c8f0d3e130 [ELF][PPC64] Support long branch thunks with addends
Fixes PPC64 part of PR40438

  // clang -target ppc64le -c a.cc
  // .text.unlikely may be placed in a separate output section (via -z keep-text-section-prefix)
  // The distance between bar in .text.unlikely and foo in .text may be larger than 32MiB.
  static void foo() {}
  __attribute__((section(".text.unlikely"))) static int bar() { foo(); return 0; }
  __attribute__((used)) static int dummy = bar();

This patch makes such thunks with addends work for PPC64.

AArch64: .text -> `__AArch64ADRPThunk_ (adrp x16, ...; add x16, x16, ...; br x16)` -> target
PPC64: .text -> `__long_branch_ (addis 12, 2, ...; ld 12, ...(12); mtctr 12; bctr)` -> target

AArch64 can leverage ADRP to jump to the target directly, but PPC64
needs to load an address from .branch_lt . Before Power ISA v3.0, the
PC-relative ADDPCIS was not available. .branch_lt was invented to work
around the limitation.

Symbol::ppc64BranchltIndex is replaced by
PPC64LongBranchTargetSection::entry_index which take addends into
consideration.

The tests are rewritten: ppc64-long-branch.s tests -no-pie and
ppc64-long-branch-pi.s tests -pie and -shared.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D70937
2019-12-05 10:17:45 -08:00
Fangrui Song 944f109ad7 [ELF][PPC64] Don't copy ppc64BranchltIndex in replaceWithDefined
replaceWithDefined is used by canonical PLT and copy relocations, which
imply that the symbol is preemptable. ppc64BranchltIndex is only used by
non-preemptable cases, and it can only be the default value in
replaceWithDefined.
2019-12-05 09:33:30 -08:00
Peter Smith 784f57584f [LLD][ELF][AArch64] .note.gnu.property sections should have alignment 8
The .note.gnu.property SHT_NOTE sections on AArch64 (a 64-bit target)
should have alignment 8 to more closely match the binutils implementation
where alignment is 4-bytes on 32-bit machines and 8-bytes on 64-bit
machines.

Previously LLD was using 4 for both 32-bit and 64-bit machines.

Differential Revision: https://reviews.llvm.org/D70962
2019-12-05 10:11:31 +00:00
Peter Smith 4d6c4cb426 [LLD][ELF] Add support for PT_GNU_PROPERTY
The PT_GNU_PROPERTY program header describes the location of the
.note.gnu.property SHT_NOTES section. The linux kernel uses this program
header to find the .note.gnu.property section rather than parsing.
Executables that have properties that the kernel needs to act on that don't
have the PT_GNU_PROPERTY program header will not boot.

Differential Revision: https://reviews.llvm.org/D70961
2019-12-05 09:54:58 +00:00
Fangrui Song bf535ac4a2 [ELF][AArch64] Support R_AARCH64_{CALL26,JUMP26} range extension thunks with addends
Fixes AArch64 part of PR40438

The current range extension thunk framework does not handle a relocation
relative to a STT_SECTION symbol with a non-zero addend, which may be
used by jumps/calls to local functions on some RELA targets (AArch64,
powerpc ELFv1, powerpc64 ELFv2, etc).  See PR40438 and the following
code for examples:

  // clang -target $target a.cc
  // .text.cold may be placed in a separate output section.
  // The distance between bar in .text.cold and foo in .text may be larger than 128MiB.
  static void foo() {}
  __attribute__((section(".text.cold"))) static int bar() { foo(); return
  0; }
  __attribute__((used)) static int dummy = bar();

This patch makes such thunks with addends work for AArch64. The target
independent part can be reused by PPC in the future.

On REL targets (ARM, MIPS), jumps/calls are not represented as
STT_SECTION + non-zero addend (see
MCELFObjectTargetWriter::needsRelocateWithSymbol), so they don't need
this feature, but we need to make sure this patch does not affect them.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D70637
2019-12-02 10:07:24 -08:00
Fangrui Song 3d9b1128d6 [ELF][ARM] Add getPCBias()
ThunkCreator::getThunk and ThunkCreator::normalizeExistingThunk
currently assume that the implicit addends are -8 for ARM and -4 for
Thumb. In D70637, ThunkCreator::getThunk will need to take care of the
relocation addend explicitly.

Add the utility function getPCBias() as a prerequisite so that the getThunk change in D70637
can be more general.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D70690
2019-11-27 09:09:46 -08:00
Fangrui Song 54a366f515 [ELF] Add a corrector for case mismatch problems
Reviewed By: grimar, peter.smith

Differential Revision: https://reviews.llvm.org/D70506
2019-11-26 09:11:56 -08:00
Fangrui Song a2fc964417 [ELF] Replace SymbolTable::forEachSymbol with iterator_range symbols()
D62381 introduced forEachSymbol(). It seems that many call sites cannot
be parallelized because the body shared some states. Replace
forEachSymbol with iterator_range<filter_iterator<...>> symbols() to
simplify code and improve debuggability (std::function calls take some
frames).

It also allows us to use early return to simplify code added in D69650.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D70505
2019-11-26 09:09:32 -08:00
Georgii Rymar 19edd675c6 [LLD][ELF] - Make compression level be dependent on -On.
Currently LLD always use zlib compression level 6.
This patch changes it to use 1 for -O0, -O1 and 6 for -O2.

It fixes https://bugs.llvm.org/show_bug.cgi?id=44089.

There was also a thread in llvm-dev on this topic:
https://lists.llvm.org/pipermail/llvm-dev/2018-August/125020.html

Here is a table with results of building clang mentioned there:

```
Level   Time            Size
0       0m17.128s       2045081496   Z_NO_COMPRESSION
1       0m31.471s       922618584    Z_BEST_SPEED
2       0m32.659s       903642376
3       0m36.749s       890805856
4       0m41.532s       876697184
5       0m48.383s       862778576
6       1m3.176s        855251640    Z_DEFAULT_COMPRESSION
7       1m15.335s       853755920
8       2m0.561s        852497560
9       2m33.972s       852397408    Z_BEST_COMPRESSION
```

It shows that it is probably not reasonable to use values greater than 6.

Differential revision: https://reviews.llvm.org/D70658
2019-11-26 11:50:22 +03:00
Fangrui Song a71c1e2a57 [ELF] Support input section description .rel[a].dyn in /DISCARD/
Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D70695
2019-11-25 21:49:46 -08:00
Fangrui Song f0558f582a [ELF] Delete unused Configuration::zExecstack after D56554 2019-11-25 14:44:09 -08:00
Fangrui Song 4dc2fb123d [ELF] Error if -Ttext-segment is specified
In GNU ld, -Ttext sets the address of the .text section and -Ttext-segment sets the address of the text segment (RX).

gold only supports the -Ttext-segment semantic and treats -Ttext as an alias for -Ttext-segment.

lld only supports the -Ttext semantic and treats -Ttext-segment as an
alias for -Ttext.  The text segment will be assigned to an address less
than the specified -Ttext-segment value.

This patch drops the -Ttext-segment alias.

The text segment is traditionally the first segment. Users who specify
-Ttext-segment may actually want to specify --image-base, the lld way to
express this. Unfortunately currently this is supported by GNU ld's
COFF port but not by its ELF port. gold does not support this option.
With -z separate-code, the behavior of GNU ld -Ttext-segment is weird (see https://sourceware.org/bugzilla/show_bug.cgi?id=25207)

rL289827 introduced the alias for linking qemu's non-pie user mode
binaries. As explained previously, this actually assigns the text
segment to an address less than 0x60000000. I feel that a better fix is
on the qemu side:
https://lists.nongnu.org/archive/html/qemu-devel/2019-11/msg02480.html

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D70468
2019-11-21 09:41:55 -08:00
James Y Knight d3fec7fb45 LLD: Don't use the stderrOS stream in link before it's reassigned.
Remove the lld::enableColors function, as it just obscures which
stream it's affecting, and replace with explicit calls to the stream's
enable_colors.

Also, assign the stderrOS and stdoutOS globals first in link function,
just to ensure nothing might use them.

(Either change individually fixes the issue of using the old
stream, but both together seems best.)

Follow-up to b11386f9be.

Differential Revision: https://reviews.llvm.org/D70492
2019-11-21 10:55:03 -05:00
Alex Richardson 5bab291b7b Ignore R_MIPS_JALR relocations against non-function symbols
Summary:
Current versions of clang would erroneously emit this relocation not only
against functions (loaded from the GOT) but also against data symbols
(e.g. a table of function pointers). LLD was then changing this into a
branch-and-link instruction, causing the program to jump to the data
symbol at run time. I discovered this problem when attempting to boot
MIPS64 FreeBSD after updating the to the latest upstream master.

Reviewers: atanasyan, jrtc27, espindola

Reviewed By: atanasyan

Subscribers: emaste, sdardis, krytarowski, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70406
2019-11-20 13:23:26 +00:00
Fangrui Song ce5de93e83 [ELF] Disallow out-of-range section group indices after D70146
Exposed by invalid/sht-group-wrong-section.test
http://45.33.8.238/win/2613/step_9.txt
2019-11-19 09:49:45 -08:00
Fangrui Song 6b0eb5a672 [ELF] Improve --gc-sections compatibility with GNU ld regarding section groups
Based on D70020 by serge-sans-paille.

The ELF spec says:

> Furthermore, there may be internal references among these sections that would not make sense if one of the sections were removed or replaced by a duplicate from another object. Therefore, such groups must be included or omitted from the linked object as a unit. A section cannot be a member of more than one group.

GNU ld has 2 behaviors that we don't have:

- Group members (nextInSectionGroup != nullptr) are subject to garbage collection.
  This includes non-SHF_ALLOC SHT_NOTE sections.
  In particular, discarding non-SHF_ALLOC SHT_NOTE sections is an expected behavior by the Annobin
  project. See
  https://developers.redhat.com/blog/2018/02/20/annobin-storing-information-binaries/
  for more information.
- Groups members are retained or discarded as a unit.
  Members may have internal references that are not expressed as
  SHF_LINK_ORDER, relocations, etc. It seems that we should be more conservative here:
  if a section is marked live, mark all the other member within the
  group.

Both behaviors are reasonable. This patch implements them.

A new field InputSectionBase::nextInSectionGroup tracks the next member
within a group. on ELF64, this increases sizeof(InputSectionBase) froms
144 to 152.

InputSectionBase::dependentSections tracks section dependencies, which
is used by both --gc-sections and /DISCARD/. We can't overload it for
the "next member" semantic, because we should allow /DISCARD/ to discard
sections independent of --gc-sections (GNU ld behavior). This behavior
may be reasonably used by `/DISCARD/ : { *(.ARM.exidx*) }` or `/DISCARD/
: { *(.note*) }` (new test `linkerscript/discard-group.s`).

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D70146
2019-11-19 08:54:06 -08:00
Rui Ueyama b11386f9be Make it possible to redirect not only errs() but also outs()
This change is for those who use lld as a library. Context:
https://reviews.llvm.org/D70287

This patch adds a new parmeter to lld::*::link() so that we can pass
an raw_ostream object representing stdout. Previously, lld::*::link()
took only an stderr object.

Justification for making stdoutOS and stderrOS mandatory: I wanted to
make link() functions to take stdout and stderr in that order.
However, if we change the function signature from

  bool link(ArrayRef<const char *> args, bool canExitEarly,
            raw_ostream &stderrOS = llvm::errs());

to

  bool link(ArrayRef<const char *> args, bool canExitEarly,
            raw_ostream &stdoutOS = llvm::outs(),
            raw_ostream &stderrOS = llvm::errs());

, then the meaning of existing code that passes stderrOS silently
changes (stderrOS would be interpreted as stdoutOS). So, I chose to
make existing code not to compile, so that developers can fix their
code.

Differential Revision: https://reviews.llvm.org/D70292
2019-11-18 11:18:06 +09:00
Ayke van Laethem 57776f71fa
[ELF] Fix lld build on Windows/MinGW
The patch in https://reviews.llvm.org/D64077 causes a build failure
because both the Defined and SharedSymbol classes are bigger than 80
bytes on MinGW 8.

This patch fixes this build failure by changing the type of the
bitfields. It is a similar change to the bitfield changes in
https://reviews.llvm.org/D64238, but instead of changing to bool I
decided to use uint8_t because one of the bitfields takes up two bits
instead of one.

Note: the patch is slightly different from the one reviewed in
Phabricator, but it is a trivial change to align it with LLVM master
instead of LLVM 9. Also, it passes all lld tests.

Differential Revision: https://reviews.llvm.org/D70266
2019-11-16 13:28:53 +01:00
Reid Kleckner adfad4d7c8 Forward declare the DWARFCache to avoid including LLVM DWARF details
LLD's DWARF.h header leaks a lot of LLVM DWARF includes that LLD doesn't
need. For Chunks.cpp, I see a compile time decrease of 3.1s to 2.7s.
2019-11-14 14:17:49 -08:00
Fangrui Song 5b47efa20e [ELF] Fix stack-use-after-scope after D69592 and 69650 2019-11-08 11:21:32 -08:00
Fangrui Song 59d3fbc227 [ELF] Suggest extern "C" when the definition is mangled while an undefined reference is not
The definition may be mangled while an undefined reference is not.
This may come up when (1) the reference is from a C file or (2) the definition
misses an extern "C".

(2) is more common. Suggest an arbitrary mangled name that matches the
undefined reference, if such a definition exists.

  ld.lld: error: undefined symbol: foo
  >>> referenced by a.o:(.text+0x1)
  >>> did you mean to declare foo(int) as extern "C"?
  >>> defined in: a1.o

Reviewed By: dblaikie, ruiu

Differential Revision: https://reviews.llvm.org/D69650
2019-11-08 09:46:45 -08:00
Fangrui Song 70e62a4fa6 [ELF] Suggest extern "C" when an undefined reference is mangled while the definition is not
When missing an extern "C" declaration, an undefined reference may be
mangled while the definition is not. Suggest the missing
extern "C" and the base name.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D69592
2019-11-08 09:42:50 -08:00
Rui Ueyama f95273f75a Keep symbols passed by -init and -fini
Previously, symbols passed by -init and -fini look as if they are
not referenced by anyone, and the LTO might eliminate them.
This patch fixes the issue.

Fixes a bug reported in https://bugs.llvm.org/show_bug.cgi?id=43927

Differential Revision: https://reviews.llvm.org/D69985
2019-11-08 19:08:15 +09:00
Peter Collingbourne 2c6fae179e ELF: Discard .ARM.exidx sections for empty functions instead of misordering them.
The logic added in r372781 caused ARMExidxSyntheticSection::addSection()
to return false for exidx sections without a link order dep that passed
isValidExidxSectionDep(). This included exidx sections for empty functions. As
a result, such exidx sections would end up treated like ordinary sections and
would end up being laid out before the ARMExidxSyntheticSection, most likely in
the wrong order relative to the exidx entries in the ARMExidxSyntheticSection,
breaking the orderedness invariant relied upon by unwinders. Fix this by
simply discarding such sections.

Differential Revision: https://reviews.llvm.org/D69744
2019-11-04 09:11:14 -08:00
Nico Weber 07255f81fa comment typo fix to cycle bots 2019-10-31 07:54:16 -04:00
Nico Weber 4138fc9567 comment typo fix to cycle bots 2019-10-30 22:17:52 -04:00
Nick Terrell 6814232429 [LLD][ELF] Support --[no-]mmap-output-file with F_no_mmap
Summary:
Add a flag `F_no_mmap` to `FileOutputBuffer` to support
`--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores
this flag for compatibility with GNU ld and gold.

We need this flag to speed up link time for large binaries in certain
scenarios. When we link some of our larger binaries we find that LLD
takes 50+ GB of memory, which causes memory pressure. The memory
pressure causes the VM to flush dirty pages of the output file to disk.
This is normally okay, since we should be flushing cold pages. However,
when using BtrFS with compression we need to write 128KB at a time when
we flush a page. If any page in that 128KB block is written again, then
it must be flushed a second time, and so on. Since LLD doesn't write
sequentially this causes write amplification. The same 128KB block will
end up being flushed multiple times, causing the linker to many times
more IO than necessary. We've observed 3-5x faster builds with
-no-mmap-output-file when we hit this scenario.

The bad scenario only applies to compressed filesystems, which group
together multiple pages into a single compressed block. I've tested
BtrFS, but the problem will be present for any compressed filesystem
on Linux, since it is caused by the VM.

Silently ignoring --no-mmap-output-file caused a silent regression when
we switched from gold to lld. We pass --no-mmap-output-file to fix this
edge case, but since lld silently ignored the flag we didn't realize it
wasn't being respected.

Benchmark building a 9 GB binary that exposes this edge case. I linked 3
times with --mmap-output-file and 3 times with --no-mmap-output-file and
took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM,
BtrFS mounted with -compress-force=zstd, and an 80% full disk.

| Mode    | Time  |
|---------|-------|
| mmap    | 894 s |
| no mmap | 126 s |

When compression is disabled, BtrFS performs just as well with and
without mmap on this benchmark.

I was unable to reproduce the regression with any binaries in
lld-speed-test.

Reviewed By: ruiu, MaskRay

Differential Revision: https://reviews.llvm.org/D69294
2019-10-29 15:49:08 -07:00
Fangrui Song 94bfa6deb0 [ELF] Delete redundant comment after D56554. NFC 2019-10-29 10:00:48 -07:00
Michał Górny 2a0fcae3d4 [lld] [ELF] Add '-z nognustack' opt to suppress emitting PT_GNU_STACK
Add a new '-z nognustack' option that suppresses emitting PT_GNU_STACK
segment.  This segment is not supported at all on NetBSD (stack is
always non-executable), and the option is meant to be used to disable
emitting it.

Differential Revision: https://reviews.llvm.org/D56554
2019-10-29 17:54:23 +01:00
Nico Weber 5976a3f5aa Fix a few typos in lld/ELF to cycle bots 2019-10-28 21:41:47 -04:00
Sterling Augustine 118ceea5c3 Crt files are special cased by name when dealing with ctor and dtor
sections, but the current code misses certain variants. In particular, those
named when clang takes the code path in
clang/lib/Driver/ToolChain.cpp:416, where crtfiles are named:

clang_rt.<component>-<arch>-<env>.<suffix>

Previously, the code only handled:
clang_rt.<component>.<suffix>
<component>.<suffix>

This revision fixes that.
2019-10-25 11:04:56 -07:00
Fangrui Song 56d81104f1 [ELF] -r: fix crash when processing a SHT_REL[A] that relocates a SHF_MERGE after D67504/r372734
Fix PR43767

In -r mode, when processing a SHT_REL[A] that relocates a SHF_MERGE, sec->getRelocatedSection() is a
MergeInputSection and its parent is an OutputSection but is asserted to
be a SyntheticSection (MergeSyntheticSection) in LinkerScript.cpp:addInputSec().
 ##
The code path is not exercised in non -r mode because the relocated
section changed from MergeInputSection to InputSection.

Reorder the code to make the non -r logic apply to -r as well, thus fix
the crash.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D69364
2019-10-24 11:35:29 -07:00
Martin Storsjo 908b780952 [LLD] Move duplicated dwarf parsing code to the Common library. NFC.
Differential Revision: https://reviews.llvm.org/D69197

llvm-svn: 375390
2019-10-21 08:01:52 +00:00
Sid Manning ab50256544 [lld] Check for branch range overflows.
Differential Revision: https://reviews.llvm.org/D68875

llvm-svn: 374891
2019-10-15 14:12:54 +00:00
Russell Gallop 6d6ec1b869 [LLD][ELF] Fix stale comments about doing ICF
Differential Revision: https://reviews.llvm.org/D68396

llvm-svn: 374362
2019-10-10 14:50:02 +00:00
Rui Ueyama 9adea6e4fa Make nullptr check more robust
The only condition that isecLoc becomes null is

  Out::bufferStart == nullptr,
  isec->getParent()->offset == 0, and
  isec->outSecOff == 0.

We can check the first condition only once.

llvm-svn: 374332
2019-10-10 12:41:08 +00:00
Roman Lebedev 1508fbad79 [lld] getErrPlace(): don't perform arithmetics on maybe-null pointer
isecLoc there can be null, but at the same time isec->getSize() may
be non-null. It is UB to offset a nullptr.The most straight-forward fix
here appears to perform casts+normal integral arithmetics.

FAIL: lld :: ELF/invalid/invalid-relocation-aarch64.test (1158 of 2217)
******************** TEST 'lld :: ELF/invalid/invalid-relocation-aarch64.test' FAILED ********************
Script:
--
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/yaml2obj /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/test/ELF/invalid/invalid-relocation-aarch64.test -o /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-aarch64.test.tmp.o
: 'RUN: at line 3';   not /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld.lld /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-aarch64.test.tmp.o -o /dev/null 2>&1 | /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/test/ELF/invalid/invalid-relocation-aarch64.test
--
Exit Code: 1

Command Output (stderr):
--
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/test/ELF/invalid/invalid-relocation-aarch64.test:4:10: error: CHECK: expected string not found in input
# CHECK: error: unknown relocation (1024) against symbol foo
         ^
<stdin>:1:1: note: scanning from here
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/ELF/Target.cpp💯41: runtime error: applying non-zero offset 24 to null pointer
^
<stdin>:1:118: note: possible intended match here
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/ELF/Target.cpp💯41: runtime error: applying non-zero offset 24 to null pointer
                                                                                                                     ^

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.
FAIL: lld :: ELF/invalid/invalid-relocation-x64.test (1270 of 2217)
******************** TEST 'lld :: ELF/invalid/invalid-relocation-x64.test' FAILED ********************
Script:
--
: 'RUN: at line 2';   /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/yaml2obj /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/test/ELF/invalid/invalid-relocation-x64.test -o /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-x64.test.tmp1.o
: 'RUN: at line 3';   echo ".global foo; foo:" > /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-x64.test.tmp2.s
: 'RUN: at line 4';   /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/llvm-mc /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-x64.test.tmp2.s -o /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-x64.test.tmp2.o -filetype=obj -triple x86_64-pc-linux
: 'RUN: at line 5';   not /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld.lld /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-x64.test.tmp1.o /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/tools/lld/test/ELF/invalid/Output/invalid-relocation-x64.test.tmp2.o -o /dev/null 2>&1 | /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/test/ELF/invalid/invalid-relocation-x64.test
--
Exit Code: 1

Command Output (stderr):
--
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/test/ELF/invalid/invalid-relocation-x64.test:6:10: error: CHECK: expected string not found in input
# CHECK: error: unknown relocation (152) against symbol foo
         ^
<stdin>:1:1: note: scanning from here
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/ELF/Target.cpp💯41: runtime error: applying non-zero offset 24 to null pointer
^
<stdin>:1:118: note: possible intended match here
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/lld/ELF/Target.cpp💯41: runtime error: applying non-zero offset 24 to null pointer
                                                                                                                     ^

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 20.73s
********************
Failing Tests (2):
    lld :: ELF/invalid/invalid-relocation-aarch64.test
    lld :: ELF/invalid/invalid-relocation-x64.test

llvm-svn: 374329
2019-10-10 12:22:55 +00:00
Rui Ueyama d7ead5b58d Improve error message for bad SHF_MERGE sections
This patch adds a section name to error messages.

Differential Revision: https://reviews.llvm.org/D68758

llvm-svn: 374290
2019-10-10 08:32:12 +00:00
Sid Manning aca5d395d5 [lld][Hexagon] Support PLT relocation R_HEX_B15_PCREL_X/R_HEX_B9_PCREL_X
These are sometimes generated by tail call optimizations.

Differential Revision: https://reviews.llvm.org/D66542

llvm-svn: 374052
2019-10-08 14:23:49 +00:00
Rui Ueyama 5493366729 Report error if -export-dynamic is used with -r
The combination of the two flags doesn't make sense. And other linkers
seem to just ignore --export-dynamic if --relocatable is given, but
we probably should report it as an error to let users know that is
an invalid combination.

Fixes https://bugs.llvm.org/show_bug.cgi?id=43552

Differential Revision: https://reviews.llvm.org/D68441

llvm-svn: 374022
2019-10-08 08:03:40 +00:00
Fangrui Song 24ec80425a [ELF][MIPS] De-template writeValue. NFC
Depends on D68561.

llvm-svn: 373886
2019-10-07 08:52:07 +00:00
Fangrui Song bd8cfe65f5 [ELF] Wrap things in `namespace lld { namespace elf {`, NFC
This makes it clear `ELF/**/*.cpp` files define things in the `lld::elf`
namespace and simplifies `elf::foo` to `foo`.

Reviewed By: atanasyan, grimar, ruiu

Differential Revision: https://reviews.llvm.org/D68323

llvm-svn: 373885
2019-10-07 08:31:18 +00:00
Fangrui Song 5761e3cef4 [ELF][MIPS] Use lld:🧝:{read,write}* instead of llvm::support::endian::{read,write}*
This allows us to delete `using namespace llvm::support::endian` and
simplify D68323. This change adds runtime config->endianness check but
the overhead should be negligible.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D68561

llvm-svn: 373884
2019-10-07 08:30:46 +00:00
Fangrui Song 7588cf09da [ELF] Use union-find set and doubly linked list in Call-Chain Clustering (C³) heuristic
Before, SecToClusters[*] was used to track the belonged cluster.
During a merge (From -> Into), every element of From has to be updated.
Use a union-find set to speed up this use case.

Also, replace `std::vector<int> Sections;` with a doubly-linked
pointers: int Next, Prev;

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D46228

llvm-svn: 373708
2019-10-04 07:56:54 +00:00
Peter Collingbourne 0bb825d208 ELF: Add .interp synthetic sections first in createSyntheticSections().
Our .interp section is not a SyntheticSection. As a result, it terminates the
loop in removeUnusedSyntheticSections(). This has at least two consequences:

- The synthetic .bss and .bss.rel.ro sections are always present in
  dynamically linked executables, even when they are not needed.
- The synthetic .ARM.exidx (and possibly other) sections are always present
  in partitions other than the last one, even when not needed.
  .ARM.exidx in particular is problematic because it assumes that its
  list of code sections is non-empty in getLinkOrderDep(), which can
  lead to a crash if the partition does not have any code sections.

Fix these problems by moving the creation of the .interp sections to the
top of createSyntheticSections(). While here, make the code a little less
error-prone by changing the add() lambdas to take a SyntheticSection instead
of an InputSectionBase.

Differential Revision: https://reviews.llvm.org/D68256

llvm-svn: 373347
2019-10-01 16:10:13 +00:00
Peter Collingbourne 97e251e05a ELF: Don't merge SHF_LINK_ORDER sections for different output sections in relocatable links.
Merging SHF_LINK_ORDER sections can affect semantics if the sh_link
fields point to different sections.

Specifically, for SHF_LINK_ORDER sections, the sh_link field acts as a reverse
dependency from the linked section, causing the SHF_LINK_ORDER section to
be included if the linked section is included. Merging sections with different
sh_link fields will cause the entire contents of the SHF_LINK_ORDER section
to be associated with a single (arbitrarily chosen) output section, whereas the
correct semantics are for the individual pieces of the SHF_LINK_ORDER section
to be associated with their linked output sections. As a result we can end up
incorrectly dropping SHF_LINK_ORDER section contents or including the wrong
section contents, depending on which linked sections were chosen.

Differential Revision: https://reviews.llvm.org/D68094

llvm-svn: 373255
2019-09-30 20:23:00 +00:00
Martin Storsjo 5ebab1f8f9 [LLD] Simplify the demangleItanium function. NFC.
Instead of returning an optional, just return the input string if
demangling fails, as that's what all callers use anyway.

Differential Revision: https://reviews.llvm.org/D68015

llvm-svn: 373077
2019-09-27 12:24:18 +00:00
Fangrui Song f1e1451946 [ELF] Set SectionBase::partition in processSectionCommands
Fixes PR43461 (regression caused by D67504)

The partition field of a SECTIONS-specified section is not set after
D67504. The 0 value affects findSection() which checks if the partition
field is 1.

So `Out::initArray = findSection(".init_array")` is null, and
DT_INIT_ARRAYSZ is not set.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D68087

llvm-svn: 372996
2019-09-26 17:10:09 +00:00
Simon Atanasyan fba48fcf44 [mips] Relax jalr/jr instructions using R_MIPS_JALR relocation
The R_MIPS_JALR relocation denotes jalr/jr instructions in position
independent code. Both these instructions take a target's address from
the $25 register. If offset to the target symbol fits into the 18-bits,
it's more efficient to replace jalr/jr by bal/b instructions.

Differential Revision: https://reviews.llvm.org/D68057

llvm-svn: 372951
2019-09-26 09:13:20 +00:00
Fangrui Song 0264950697 [ELF] Add -z separate-loadable-segments to complement separate-code and noseparate-code
D64906 allows PT_LOAD to have overlapping p_offset ranges. In the
default R RX RW RW layout + -z noseparate-code case, we do not tail pad
segments when transiting to another segment. This can save at most
3*maxPageSize bytes.

a) Before D64906, we tail pad R, RX and the first RW.
b) With -z separate-code, we tail pad R and RX, but not the first RW (RELRO).

In some cases, b) saves one file page. In some cases, b) wastes one
virtual memory page. The waste is a concern on Fuchsia. Because it uses
compressed binaries, it doesn't benefit from the saved file page.

This patch adds -z separate-loadable-segments to restore the behavior before
D64906. It can affect section addresses and can thus be used as a
debugging mechanism (see PR43214 and ld.so partition bug in
crbug.com/998712).

Reviewed By: jakehehrlich, ruiu

Differential Revision: https://reviews.llvm.org/D67481

llvm-svn: 372807
2019-09-25 03:39:31 +00:00
Bob Haarman 9f0f36e022 [ELF] accept thinlto options without --plugin-opt= prefix
Summary:
When support for ThinLTO was first added to lld, the options that
control it were prefixed with --plugin-opt= for compatibility with
an existing implementation as a linker plugin. This change enables
shorter versions of the options to be used, as follows:

  New                              Existing
  -thinlto-emit-imports-files      --plugin-opt=thinlto-emit-imports-files
  -thinlto-index-only              --plugin-opt=thinlto-index-only
  -thinlto-index-only=             --plugin-opt=thinlto-index-only=
  -thinlto-object-suffix-replace=  --plugin-opt=thinlto-object-suffix-replace=
  -thinlto-prefix-replace=         --plugin-opt=thinlto-prefix-replace=
  -lto-obj-path=                   --plugin-opt=obj-path=

The options with the --plugin-opt= prefix have been retained as aliases
for the shorter variants so that they continue to be accepted.

Reviewers: tejohnson, ruiu, espindola

Reviewed By: ruiu

Subscribers: emaste, arichardson, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67782

llvm-svn: 372798
2019-09-25 01:19:48 +00:00
Peter Smith 06b3e3421a [ELF][ARM] Fix crash when discarding InputSections that have .ARM.exidx
When /DISCARD/ is used on an input section, that input section may have
a .ARM.exidx metadata section that depends on it. As the discard handling
comes after the .ARM.exidx synthetic section is created we need to make
sure that we account for the case where the .ARM.exidx output section
should be removed because there are no more live input sections.

Differential Revision: https://reviews.llvm.org/D67848

llvm-svn: 372781
2019-09-24 21:44:14 +00:00
George Rimar 355764e388 [LLD][ELF][MIPS] - Inline the short helper function. NFC.
It was requested in a post-commit comment for r372570.

llvm-svn: 372747
2019-09-24 12:53:53 +00:00
Fangrui Song e447d5afd3 [ELF] Delete SectionBase::assigned
D67504 removed uses of `assigned` from OutputSection::addSection, which
makes `assigned` purely used in processSectionCommands() and its
callees. By replacing its references with `parent`, we can remove
`assigned`.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D67531

llvm-svn: 372735
2019-09-24 11:48:46 +00:00
Fangrui Song e47bbd28f8 [ELF] Make MergeInputSection merging aware of output sections
Fixes PR38748

mergeSections() calls getOutputSectionName() to get output section
names. Two MergeInputSections may be merged even if they are made
different by SECTIONS commands.

This patch moves mergeSections() after processSectionCommands() and
addOrphanSections() to fix the issue. The new pass is renamed to
OutputSection::finalizeInputSections().

processSectionCommands() and addorphanSections() are changed to add
sections to InputSectionDescription::sectionBases.

finalizeInputSections() merges MergeInputSections and migrates
`sectionBases` to `sections`.

For the -r case, we drop an optimization that tries keeping sh_entsize
non-zero. This is for the simplicity of addOrphanSections(). The
updated merge-entsize2.s reflects the change.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D67504

llvm-svn: 372734
2019-09-24 11:48:31 +00:00
Simon Atanasyan 4750d79ac6 [mips] Support elf32btsmipn32_fbsd / elf32ltsmipn32_fbsd emulations
Patch by Kyle Evans.

llvm-svn: 372651
2019-09-23 20:32:43 +00:00
George Rimar c60913f162 [LLD][ELF] - Simplify getFlagsFromEmulation(). NFCI.
A straightforward simplification.

llvm-svn: 372570
2019-09-23 09:55:10 +00:00
Simon Atanasyan e03007cb4e [mips] Deduce MIPS specific ELF header flags from `emulation`
In case of linking binary blobs which do not have any ELF headers, we can
deduce MIPS ABI  ELF header flags from an `emulation` option.

Patch by Kyle Evans.

llvm-svn: 372513
2019-09-22 16:26:39 +00:00
Fangrui Song 2672051495 [ELF] Error if the linked-to section of a SHF_LINK_ORDER section is discarded
Summary:
If st_link(A)=B, and A has the SHF_LINK_ORDER flag, we may dereference
a null pointer if B is garbage collected (PR43147):

1. In Wrter.cpp:compareByFilePosition, `aOut->sectionIndex` or `bOut->sectionIndex`
2. In OutputSections::finalize, `d->getParent()->sectionIndex`

Simply error and bail out to avoid null pointer dereferences. ld.bfd has
a similar error:

    sh_link of section `.bar' points to discarded section `.foo0' of `a.o'

ld.bfd is more permissive in that it just checks whether the linked-to
section of the first input section is discarded. This is likely because
it sets sh_link of the output section according to the first input
section.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D67761

llvm-svn: 372400
2019-09-20 15:03:21 +00:00
Peter Smith 43d32cdd87 [ELF][AARCH64] Refactor AArchErrataFix to match changes in ARMErrataFix NFC.
D67284 introduced ARMErrataFix.cpp which was derived from
AArch64ErrataFix.cpp. There were some useful refactoring changes made to
ARMErrataFix.cpp made as part of the review. This change applies the
relevant changes back to AArch64ErrataFix.cpp.

Main changes are:
- Old style variable names in comments like IS, are now new style isec.
- Simplify init() collection of mappingSymbols to always start with a code
mapping symbol.
- Simplify logic in mergeCmp().
- Fix one 80 column overflow caused by IS -> isec transformation.

Differential Revision: https://reviews.llvm.org/D67622

llvm-svn: 372094
2019-09-17 09:49:30 +00:00
Fangrui Song 4816e516e5 [ELF][Hexagon] Allow PT_LOAD to have overlapping p_offset ranges on EM_HEXAGON
Port the D64906 technique to EM_HEXAGON. This concludes the patch series.

Differential Revision: https://reviews.llvm.org/D67605

llvm-svn: 372059
2019-09-17 02:45:38 +00:00
Steven Wu dd63b9f570 [lld] Update lld driver to use new LTO APIs to handle libcall symbols
NFC. Remove duplicated code in ELF/COFF driver and libLTO legacy
interfaces.

llvm-svn: 372022
2019-09-16 18:49:57 +00:00
Peter Smith 1d74940b31 [ELF][ARM] Fix -Werror buildbots NFC.
Provide a missing initializer to get rid of warning provoking buildbot
failures.

error: missing field 'rel' initializer
[-Werror,-Wmissing-field-initializers]

llvm-svn: 371970
2019-09-16 10:07:53 +00:00
Peter Smith ea99ce5e9b [ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417
The --fix-cortex-a8 option implements a linker workaround for the
coretex-a8 erratum 657417. A summary of the erratum conditions is:
- A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two
4KiB regions.
- The destination of the branch is to the first 4KiB region.
- The instruction before the branch is a 32-bit Thumb-2 non-branch
instruction.

The linker fix is to redirect the branch to a patch not in the first
4KiB region. The patch forwards the branch on to its target.

The cortex-a8, is an old CPU, with the first implementation of this
workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in
early Android Phones and there are some critical applications that still
need to run on a cortex-a8 that have the erratum. The patch is applied
roughly 10 times on LLD and 20 on Clang when they are built with
--fix-cortex-a8 on an Arm system.

The formal erratum description is avaliable in the ARM Core Cortex-A8
(AT400/AT401) Errata Notice document. This is available from Arm on
request but it seems to be findable via a web search.

Differential Revision: https://reviews.llvm.org/D67284

llvm-svn: 371965
2019-09-16 09:38:38 +00:00
Fangrui Song d4306e90cb [ELF][X86] Allow PT_LOAD to have overlapping p_offset ranges on EM_X86_64
Port the D64906 technique to EM_X86_64.

Differential Revision: https://reviews.llvm.org/D67482

llvm-svn: 371958
2019-09-16 07:05:34 +00:00
Fangrui Song 06bb7dfbd4 [ELF] Map the ELF header at imageBase
If there is no readonly section, we map:

* The ELF header at imageBase+maxPageSize
* Program headers at imageBase+maxPageSize+sizeof(Ehdr)
* The first section .text at imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)

Due to the interaction between Writer<ELFT>::fixSectionAlignments and
LinkerScript::allocateHeaders,
`alignDown(p_vaddr(R PT_LOAD)) = alignDown(p_vaddr(RX PT_LOAD))`.
The RX PT_LOAD will override the R PT_LOAD at runtime, which is not ideal:

```
// PHDR at 0x401034, should be 0x400034
  PHDR           0x000034 0x00401034 0x00401034 0x000a0 0x000a0 R   0x4
// R PT_LOAD contains just Ehdr and program headers.
// At 0x401000, should be 0x400000
  LOAD           0x000000 0x00401000 0x00401000 0x000d4 0x000d4 R   0x1000
  LOAD           0x0000d4 0x004010d4 0x004010d4 0x00001 0x00001 R E 0x1000
```

* createPhdrs allocates the headers to the R PT_LOAD.
* fixSectionAlignments assigns `imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)` (formula: `alignTo(dot, maxPageSize) + dot % config->maxPageSize`) to addrExpr of .text
* allocateHeaders computes the minimum address among SHF_ALLOC sections, i.e. addr(.text)
* allocateHeaders sets address of ELF header to `addr(.text)-sizeof(Ehdr)-sizeof(program headers) = imageBase+maxPageSize`

The main observation is that when the SECTIONS command is not used, we
don't have to call allocateHeaders. This requires an assumption that
the presence of PT_PHDR and addresses of headers can be decided
regardless of address information.

This may seem natural because dot is not manipulated by a linker script.
The other thing is that we have to drop the special rule for -T<section>
in `getInitialDot`. If -Ttext is smaller than the image base, the headers
will not be allocated with the old behavior (allocateHeaders is called)
but always allocated with the new behavior.

The behavior change is not a problem. Whether and where headers are
allocated can vary among linkers, or ld.bfd across different versions
(--enable-separate-code or not). It is thus advised to use a linker
script with the PHDRS command to have a consistent behavior across
linkers. If PT_PHDR is needed, an explicit --image-base can be a simpler
alternative.

Differential Revision: https://reviews.llvm.org/D67325

llvm-svn: 371957
2019-09-16 07:04:16 +00:00
Fangrui Song 51ead00bf8 [ELF] Delete a redundant assignment to SectionBase::assigned. NFC
LinkerScript::discard marks a section dead. It is unnecessary to set the
`assigned` bit.

llvm-svn: 371804
2019-09-13 02:18:04 +00:00
Fangrui Song 2ad25a4aee [ELF] ICF: change a dyn_cast<InputSection> to cast
ICF is performed after EhInputSections and MergeInputSections were
eliminated from inputSections. Every element of inputSections is an
InputSection.

llvm-svn: 371744
2019-09-12 16:46:19 +00:00
Fangrui Song 786ce3fbd6 [ELF] Fix a common-page-size typo
llvm-svn: 371716
2019-09-12 08:59:17 +00:00
Fangrui Song 60ff4dd9cd [ELF] Support -z undefs
-z undefs is the inverse of -z defs. It allows unresolved references
from object files. This can be used to cancel --no-undefined or -z defs.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D67479

llvm-svn: 371715
2019-09-12 08:55:17 +00:00
Simon Atanasyan 6c6f5a9984 [mips] Allow PT_LOAD to have overlapping p_offset ranges on EM_MIPS
Port the D64906 <https://reviews.llvm.org/D64906> technique to MIPS.

Fix PR33131

llvm-svn: 371554
2019-09-10 20:19:59 +00:00
Fangrui Song e8c0d93360 [ELF] nmagic or omagic: don't allocate PT_PHDR or PF_R PT_LOAD for the !hasPhdrsCommands case
```
part.phdrs = script->hasPhdrsCommands() ? script->createPhdrs() : createPhdrs(part);
```

createPhdrs() allocates a PT_PHDR and a PF_R PT_LOAD, which will be
deleted later in LinkerScript::allocateHeaders, but leave a gap between
the program headers and the first section. Don't allocate the segments
to avoid the gap. PT_INTERP is likely not needed as well.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D67324

llvm-svn: 371398
2019-09-09 13:08:51 +00:00
Fangrui Song 298c7a09de [ELF][AArch64] Apply some NFC cleanups to AArch64ErrataFix.cpp
Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D67310

llvm-svn: 371389
2019-09-09 11:22:27 +00:00
Fangrui Song 2682bc3c9d [ELF] Replace error() with errorOrWarn() for the ASSERT command
Summary:
ld.bfd produces an output with --noinhibit-exec when an ASSERT fails.
Use errorOrWarn() so that we can produce an output as well.

An interesting case is that symbol assignments may execute multiple
times, so we probably want to suppress errors for non-final runs.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D67285

llvm-svn: 371225
2019-09-06 16:30:22 +00:00
Fangrui Song 8d30c1dcec Reland D66717 [ELF] Do not ICF two sections with different output sections (by SECTIONS commands)
Recommit r370635 (reverted by r371202), with one change: move addOrphanSections() before ICF.

Before, orphan sections in two different partitions may be folded and
moved to the main partition.

Now, InputSection->OutputSection assignment for orphans happens before
ICF. ICF does not fold input sections with different output sections.

With the PR43241 reproduce,
`llvm-objcopy --extract-partition libvr.so libchrome__combined.so libvr.so` => no error

Updated description:

Fixes PR39418. Complements D47241 (the non-linker-script case).

processSectionCommands() assigns input sections to output sections.
ICF is called before it, so .text.foo and .text.bar may be folded even if
their output sections are made different by SECTIONS commands.

```
markLive<ELFT>()
doIcf<ELFT>()                      // During ICF, we don't know the output sections
writeResult()
  combineEhSections<ELFT>()
  script->processSectionCommands() // InputSection -> OutputSection assignment
```

This patch splits processSectionCommands() into processSectionCommands()
and processSymbolAssignments(), and moves
processSectionCommands()/addOrphanSections() before ICF:

```
markLive<ELFT>()
combineEhSections<ELFT>()
script->processSectionCommands()
script->addOrphanSections();
doIcf<ELFT>()                      // should remove folded input sections
writeResult()
  script->processSymbolAssignments()
```

An alternative approach is to unfold a section `sec` in
processSectionCommands() when we find `sec` and `sec->repl` belong to
different output sections. I feel this patch is superior because this
can fold more sections and the decouple of
SectionCommand/SymbolAssignment gives flexibility:

* An ExprValue can't be evaluated before its section is assigned to an
  output section -> we can delete getOutputSectionVA and simplify
  another place where we had to check if the output section is null.
  Moreover, a case in linkerscript/early-assign-symbol.s can be handled
  now.
* processSectionCommands/processSymbolAssignments can be freely moved
  around.

llvm-svn: 371216
2019-09-06 15:57:44 +00:00
Fangrui Song 5d9f419a2e Revert "Revert r370635, it caused PR43241."
This reverts commit 50d2dca22b3b05d0ee4883b0cbf93d7d15f241fc.

llvm-svn: 371215
2019-09-06 15:57:24 +00:00
Nico Weber 8455294f2a Revert r370635, it caused PR43241.
llvm-svn: 371202
2019-09-06 13:23:42 +00:00
Fangrui Song 6dc2bd70bb [ELF] Initialize PhdrEntry::p_align to maxPageSize for PT_LOAD
```
Writer<ELFT>::run
  assignFileOffsets
    setFileOffset
      computeFileOffset
        os->ptLoad->p_align may be smaller than config->maxPageSize
  setPhdrs
    p_align = max(p_align, config->maxPageSize)
```

If we move the config->maxPageSize logic to the constructor of
PhdrEntry, computeFileOffset can be simplified.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D67211

llvm-svn: 371085
2019-09-05 16:32:31 +00:00
Rui Ueyama e99dc4ba57 Align output segments correctly
Previously, segments were aligned according to their first section's
alignment requirements. That was not correct, but segments are also
aligned to a page boundary, and a page boundary is usually much larger
than a section alignment requirement, so no one noticed this bug before.

Now, lld has --nmagic option which sets maxPageSize to 1 to effectively
disable page alignment, which reveals the issue.

Fixes https://bugs.llvm.org/show_bug.cgi?id=43212

Differential Revision: https://reviews.llvm.org/D67152

llvm-svn: 371013
2019-09-05 05:30:24 +00:00
Fangrui Song 7afffb54ea [ELF] Don't shrink RelrSection
Fixes PR43214.

The size of SHT_RELR may oscillate between 2 numbers (see D53003 for a
similar --pack-dyn-relocs=android issue). This can happen if the shrink
of SHT_RELR causes it to take more words to encode relocation offsets
(this can happen with thunks or segments with overlapping p_offset
ranges), and the expansion of SHT_RELR causes it to take fewer words to
encode relocation offsets.

To avoid the issue, add padding 1s to the end of the relocation section
if its size would decrease. Trailing 1s do not decode to more relocations.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D67164

llvm-svn: 370923
2019-09-04 16:27:35 +00:00
Fangrui Song 520bdf79b5 [ELF] Fix spell corrector: don't call elf::InputFile::getSymbols() on shared objects
Exposed by pr34872.s

llvm-svn: 370875
2019-09-04 11:02:58 +00:00
Fangrui Song b4745fad24 [ELF] Add a spell corrector for "undefined symbol" diagnostics
Non-undefined symbols with Levenshtein distance 1 or a transposition are
suggestion candidates. This is probably good enough and it can suggest
some missing/superfluous qualifiers: const, restrict, volatile, & and &&
ref-qualifier, e.g.

   error: undefined symbol: foo(int*)
   >>> referenced by b.o:(.text+0x1)
  +>>> did you mean: foo(int const*)
  +>>> defined in: a.o

   error: undefined symbol: foo(int*&)
   >>> referenced by b.o:(.text+0x1)
  +>>> did you mean: foo(int*)
  +>>> defined in: b.o

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D67039

llvm-svn: 370853
2019-09-04 09:04:26 +00:00
Fangrui Song d8bc6a48ea [ELF] Do not ICF two sections with different output sections (by SECTIONS commands)
Fixes PR39418. Complements D47241 (the non-linker-script case).

processSectionCommands() assigns input sections to output sections.
ICF is called before it, so .text.foo and .text.bar may be folded even if
their output sections are made different by SECTIONS commands.

```
markLive<ELFT>()
doIcf<ELFT>()                      // During ICF, we don't know the output sections
writeResult()
  combineEhSections<ELFT>()
  script->processSectionCommands() // InputSection -> OutputSection assignment
```

This patch splits processSectionCommands() into processSectionCommands() and
processSymbolAssignments(), and moves processSectionCommands() before ICF:

```
markLive<ELFT>()
combineEhSections<ELFT>()
script->processSectionCommands()
doIcf<ELFT>()                      // should remove folded input sections
writeResult()
  script->processSymbolAssignments()
```

An alternative approach is to unfold a section `sec` in
processSectionCommands() when we find `sec` and `sec->repl` belong to
different output sections. I feel this patch is superior because this
can fold more sections and the decouple of
SectionCommand/SymbolAssignment gives flexibility:

* An ExprValue can't be evaluated before its section is assigned to an
  output section -> we can delete getOutputSectionVA and simplify
  another place where we had to check if the output section is null.
  Moreover, a case in linkerscript/early-assign-symbol.s can be handled
  now.
* processSectionCommands/processSymbolAssignments can be freely moved
  around.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66717

llvm-svn: 370635
2019-09-02 10:33:58 +00:00
Fangrui Song 4514ac7cfb [ELF] Align SHT_LLVM_PART_EHDR to a maximum page size boundary
Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=998712

SHT_LLVM_PART_EHDR marks the start of a partition. The partition
sections will be extracted to a separate file. Align to the next maximum
page size boundary so that we can find the ELF header at the start. We
cannot benefit from overlapping p_offset ranges with the previous
segment anyway.

It seems we lack some llvm-objcopy --extract-main-partition and
--extract-partition sanity checks. It may place EHDR at the start
even if p_offset if non zero. Anyway, the lld change is justified for
the reasons above.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D67032

llvm-svn: 370629
2019-09-02 08:49:50 +00:00
Fangrui Song 688183ec54 [ELF] Set `referenced` bit of Undefined created by BitcodeFile
D64136 and D65584, while fixing STB_WEAK issues and improving our
compatibility with ld.bfd, can cause another STB_WEAK problem related to
LTO:

If %tundef.o has an undefined reference on f,
and %tweakundef.o has a weak undefined reference on f,
%tdef.o has a definition of f

```
ld.lld %tundef.o %tweakundef.o --start-lib %tdef.o --end-lib
```

1) `%tundef.o` doesn't set the `referenced` bit.
2) `%weakundef.o` changes the binding from STB_GLOBAL to STB_WEAK
3) `%tdef.o` is not fetched because the binding is weak.

Step (1) is incorrect. This patch sets the `referenced` bit of Undefined
created by bitcode files.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66992

llvm-svn: 370437
2019-08-30 07:10:30 +00:00
Fangrui Song 523f999acf [ELF][RISCV] Allow PT_LOAD to have overlapping p_offset ranges on EM_RISCV
Port the D64906 technique to RISC-V. It deletes 3 alignments at
PT_LOAD boundaries for the default case: the size of a RISC-V binary
decreases by at most 12kb.

llvm-svn: 370192
2019-08-28 12:06:06 +00:00
Fangrui Song 54a6f6839b [ELF][AMDGPU][SPARC] Allow PT_LOAD to have overlapping p_offset ranges on EM_AMDGPU and EM_SPARCV9
llvm-svn: 370180
2019-08-28 09:45:06 +00:00
Fangrui Song 8fbe81fb29 [ELF][RISCV] Assign st_shndx of __global_pointer$ to 1 if .sdata does not exist
This essentially reverts the code change of D63132 and switches to a simpler approach.

In an executable/shared object, st_shndx of a symbol can be:

1) SHN_UNDEF: undefined symbol (or canonical PLT)
2) SHN_ABS: absolute symbol
3) any other value (usually a regular section index) represents a relative symbol.
  The actual value does not matter.

Many ld.so (musl, all archs except MIPS of FreeBSD rtld-elf) even treat 2) and 3)
the same. If .sdata does not exist, it does not matter what value/section
__global_pointer$ has, as long as it is relative (otherwise there will be a pedantic
lld error. See D63132). Just set the st_shndx arbitrarily to 1.

Dummy st_shndx=1 may be used by __rela_iplt_start, linker-script-defined symbols outside a section, __dso_handle, etc.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66798

llvm-svn: 370172
2019-08-28 09:01:03 +00:00
Fangrui Song 024bf27ddf [ELF][ARM] Allow PT_LOAD to have overlapping p_offset ranges on EM_ARM
Port the D64906 technique to ARM. It deletes 3 alignments at
PT_LOAD boundaries for the default case: the size of an arm binary
decreases by at most 12kb.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D66749

llvm-svn: 370049
2019-08-27 11:52:36 +00:00
Fangrui Song 1681ceb2c4 [ELF] EhFrameSection: postpone FDE liveness check to finalizeSections
EhFrameSection::addSection checks liveness of FDE early. This makes it
infeasible to move combineEhSections() before ICF.

Postpone the check to EhFrameSection::finalizeContents(). This is what
ARMExidxSyntheticSection does and it will make a subsequent patch D66717
simpler.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66727

llvm-svn: 369890
2019-08-26 10:32:12 +00:00
Fangrui Song debcac9fef [ELF] Make LinkerScript::assignAddresses iterative
PR42990. For `SECTIONS { b = a; . = 0xff00 + (a >> 8); a = .; }`,
we currently set st_value(a)=0xff00 while st_value(b)=0xffff.

The following call tree demonstrates the problem:

```
link<ELF64LE>(Args);
  Script->declareSymbols(); // insert a and b as absolute Defined
  Writer<ELFT>().run();
    Script->processSectionCommands();
      addSymbol(cmd);       // a and b are re-inserted. LinkerScript::getSymbolValue
                            // is lazily called by subsequent evaluation
    finalizeSections();
      forEachRelSec(scanRelocations<ELFT>);
        processRelocAux     // another problem PR42506, not affected by this patch
      finalizeAddressDependentContent(); // loop executed once
        script->assignAddresses(); // a = 0, b = 0xff00
    script->assignAddresses(); // a = 0xff00, _end = 0xffff
```

We need another assignAddresses() to finalize the value of `a`.

This patch

1) modifies assignAddress() to track the original section/value of each
  symbol and return a symbol whose section/value has changed.
2) moves the post-finalizeSections assignAddress() inside the loop
  of finalizeAddressDependentContent() and makes it iterative.
  Symbol assignment may not converge so we make a few attempts before
  bailing out.

Note, assignAddresses() must be called at least twice. The penultimate
call finalized section addresses while the last finalized symbol values.
It is somewhat obscure and there was no comment.
linkerscript/addr-zero.test tests this.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66279

llvm-svn: 369889
2019-08-26 10:23:31 +00:00
Fangrui Song 8e5184af71 [ELF] Error if --strip-all and --emit-relocs are used together
--strip-all suppresses the creation of in.symtab
This can cause a null pointer dereference in OutputSection::finalize()

  // --emit-relocs => copyRelocs is true
  if (!config->copyRelocs || (type != SHT_RELA && type != SHT_REL))
    return;
  ...
  link = in.symTab->getParent()->sectionIndex; // in.symTab is null

Let's just disallow the combination. In some cases the combination can
cause GNU linkers to fail:

* ld.bfd: final link failed: invalid operation
* gold: internal error in set_no_output_symtab_entry, at ../../gold/object.h:1814

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66704

llvm-svn: 369878
2019-08-26 06:23:53 +00:00
Fangrui Song 76f005535a [ELF] Delete a redundant dyn_cast<InputSection>. NFC
llvm-svn: 369868
2019-08-25 14:41:18 +00:00
Fangrui Song 6d5a8c92bf [ELF] Simplify with less_second. NFC
llvm-svn: 369844
2019-08-24 08:40:20 +00:00
Fangrui Song 62083ec157 [ELF] Make member function Writer<ELFT>::removeEmptyPTLoad non-member. NFC
llvm-svn: 369838
2019-08-24 06:31:34 +00:00
Fangrui Song af47d0021c [ELF] Align the first section of a PT_LOAD even if its type is SHT_NOBITS
Reported at https://reviews.llvm.org/D64930#1642223

If the only section of a PT_LOAD is a SHT_NOBITS section (e.g. .bss), we
may not align its sh_offset. p_offset of the PT_LOAD will be set to
sh_offset, and we will get p_offset!=p_vaddr (mod p_align).  If such
executable is mapped by the Linux kernel, it will segfault.

After D64906, this may happen the non-linker script case.

The linker script case has had this issue for a long time.
This was fixed by rL321657 (but the test linkerscript/nobits-offset.s
failed to test a SHT_NOBITS section), but broken by rL345154.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D66658

llvm-svn: 369828
2019-08-24 00:41:15 +00:00
Peter Smith 7d6aa7eb7f [ELF] Mention contents of reproduce archive and add help description.
Building on D60557 mention the name of the linker generated contents of
the reproduce archive, response.txt and version.txt.

Also write a shorter description in the ld.lld --help that is closer to
the documentation.

Differential Revision: https://reviews.llvm.org/D66641

llvm-svn: 369762
2019-08-23 14:41:25 +00:00
Benjamin Kramer b3a991df3c Fight a bit against global initializers. NFC.
llvm-svn: 369695
2019-08-22 19:43:27 +00:00
Fangrui Song 2d337fdc95 Reland D65242 "[ELF] More dynamic relocation packing""
This fixed a bug in r369488. When config->isRela is false, i->r_addend
is not initialized (see encodeDynamicReloc). So we should check
config->isRela before accessing r_addend:

- if (j - i < 3 || i->r_addend)
+ if (j - i < 3 || (config->isRela && i->r_addend != 0))

Original description:

Currently, with Android dynamic relocation packing, only relative
relocations are grouped together. This patch implements similar
packing for non-relative relocations.

The implementation groups non-relative relocations with the same
r_info and r_addend, if using RELA. By requiring a minimum group
size of 3, this achieves smaller relocation sections. Building Android
for an ARM32 device, I see the total size of /system/lib decrease by
392 KB.

Grouping by r_info also allows the runtime dynamic linker to implement
an 1-entry cache to reduce the number of symbol lookup required. With
such 1-entry cache implemented on Android, I'm seeing 10% to 20%
reduction in total time spent in runtime linker for several executables
that I tested.

As a simple correctness check, I've also built x86_64 Android and booted
successfully.

Differential Revision: https://reviews.llvm.org/D65242
Patch by Vic Yang

llvm-svn: 369507
2019-08-21 09:21:37 +00:00
Fangrui Song b2895a8cdc Revert D65242 "[ELF] More dynamic relocation packing"
This reverts r369488 and r369489. The change broke build bots:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/14511
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/34407

llvm-svn: 369497
2019-08-21 06:50:08 +00:00
Fangrui Song 35f9a84a15 [ELF] More dynamic relocation packing
Currently, with Android dynamic relocation packing, only relative
relocations are grouped together. This patch implements similar
packing for non-relative relocations.

The implementation groups non-relative relocations with the same
r_info and r_addend, if using RELA. By requiring a minimum group
size of 3, this achieves smaller relocation sections. Building Android
for an ARM32 device, I see the total size of /system/lib decrease by
392 KB.

Grouping by r_info also allows the runtime dynamic linker to implement
an 1-entry cache to reduce the number of symbol lookup required. With
such 1-entry cache implemented on Android, I'm seeing 10% to 20%
reduction in total time spent in runtime linker for several executables
that I tested.

As a simple correctness check, I've also built x86_64 Android and booted
successfully.

Differential Revision: https://reviews.llvm.org/D66491
Patch by Vic Yang!

llvm-svn: 369488
2019-08-21 03:02:08 +00:00
Fangrui Song 12d83b4270 [ELF][PPC] Allow PT_LOAD to have overlapping p_offset ranges on EM_PPC
Ported the D64906 technique to EM_PPC.

Delete ppc-rela.s that is covered by ppc32-abs-pic.s

llvm-svn: 369351
2019-08-20 09:20:05 +00:00
Fangrui Song 9c371309f3 [ELF][X86] Allow PT_LOAD to have overlapping p_offset ranges on EM_386
Ported the D64906 technique to EM_386.

If `sh_addralign(.tdata) < sh_addralign(.tbss)`,
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`.

ld.so that are known to have problems if p_vaddr%p_align!=0:

* FreeBSD 13.0-CURRENT rtld-elf
* glibc https://sourceware.org/bugzilla/show_bug.cgi?id=24606

New test i386-tls-vaddr-align.s checks our workaround makes p_vaddr%p_align = 0.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D65865

llvm-svn: 369347
2019-08-20 08:43:47 +00:00
Fangrui Song f66b767abe [ELF][AArch64] Allow PT_LOAD to have overlapping p_offset ranges
Ported the D64906 technique to AArch64. It deletes 3 alignments at
PT_LOAD boundaries for the default case: the size of an aarch64 binary
decreases by at most 192kb.

If `sh_addralign(.tdata) < sh_addralign(.tbss)`,
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`.

ld.so that are known to have problems if p_vaddr%p_align!=0:

* musl<=1.1.22
* FreeBSD 13.0-CURRENT (and before) rtld-elf arm64

New test aarch64-tls-vaddr-align.s checks that our workaround makes p_vaddr%p_align = 0.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D64930

llvm-svn: 369344
2019-08-20 08:34:56 +00:00