Commit Graph

5689 Commits

Author SHA1 Message Date
Fangrui Song e98baf8631 [ELF] Delete GotEntrySize and GotPltEntrySize
GotEntrySize and GotPltEntrySize were added in D22288. Later, with
the introduction of wordsize() (then Config->Wordsize), they become
redundant, because there is no target that sets GotEntrySize or
GotPltEntrySize to a number different from Config->Wordsize.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D62727

llvm-svn: 362220
2019-05-31 10:35:45 +00:00
Fangrui Song 3f29cfd915 [ELF] Replace a dead test in getSymVA() with assert()
Symbols relative to discarded comdat sections are Undefined instead of
Defined now (after D59649 and D61583). The `== &InputSection::Discarded`
test becomes dead. I cannot find a test related to this behavior.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62725

llvm-svn: 362218
2019-05-31 10:12:22 +00:00
Fangrui Song 0317e46a63 [ELF] Delete dead SHT_NOBITS->SHT_PROGBITS code after r358981
After D60131/r358981, we no longer create SHT_NOBITS sections that may
contain ByteCommand (BYTE, SHORT, LONG, QUAD).

llvm-svn: 362108
2019-05-30 15:52:11 +00:00
Fangrui Song bdaa39ea6c [ELF] De-template addUndefined() and addWrappedSymbols(). NFC
llvm-svn: 362099
2019-05-30 14:50:10 +00:00
Fangrui Song 0526c0cd8e [ELF] Implement Local Dynamic style TLSDESC for x86-64
For the Local Dynamic case of TLSDESC, _TLS_MODULE_BASE_ is defined as a
special TLS symbol that makes:

1) Without relaxation: it produces a dynamic TLSDESC relocation that
computes 0. Adding @dtpoff to access a TLS symbol.
2) With LD->LE relaxation: _TLS_MODULE_BASE_@tpoff = 0 (lowest address in
the TLS block). Adding @tpoff to access a TLS symbol.

For 1), this saves dynamic relocations and GOT slots as otherwise
(General Dynamic) we would create an R_X86_64_TLSDESC and reserve two
GOT slots for each symbol.

Add ElfSym::TlsModuleBase and change the signature of getTlsTpOffset()
to special case _TLS_MODULE_BASE_.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62577

llvm-svn: 362078
2019-05-30 10:00:20 +00:00
Peter Collingbourne 87575f6501 ELF: Don't reuse a thunk in a different loadable partition.
There's no guarantee that the other partition will be loaded, so it
can't be reused.

Differential Revision: https://reviews.llvm.org/D62365

llvm-svn: 361926
2019-05-29 04:06:01 +00:00
Peter Collingbourne ba2816be82 ELF: Add basic partition data structures and behaviours.
This change causes us to read partition specifications from partition
specification sections and split output sections into partitions according
to their reachability from partition entry points.

This is only the first step towards a full implementation of partitions. Later
changes will add additional synthetic sections to each partition so that
they can be loaded independently.

Differential Revision: https://reviews.llvm.org/D60353

llvm-svn: 361925
2019-05-29 03:55:20 +00:00
Fangrui Song 719322411c [ELF] Implement General Dynamic style TLSDESC for x86-64
This handles two initial relocation types R_X86_64_GOTPC32_TLSDESC and
R_X86_64_TLSDESC_CALL, as well as the GD->LE and GD->IE relaxations.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62513

llvm-svn: 361911
2019-05-29 02:03:56 +00:00
Fangrui Song 5d3b3188f7 Reland D61583 [ELF] Error on relocations to STT_SECTION symbols if the sections were discarded
This is implemented by creating Undefined (instead of Defined) for such
local STT_SECTION symbols. It allows us to catch errors when there are
relocations to such discarded sections (e.g. in PR41693, ld.bfd and gold
error but we don't). Updated comdat-discarded-error.s checks we emit
friendly error message.

For relocatable-eh-frame.s, ld.lld -r a.o a.o will now error
"STT_SECTION symbol should be defined" because the section .eh_frame
refers to is now an Undefined instead of a Defined.
So I have to change `error()` to `warn()` to retain the output.

rLLD361144 inadvertently enabled the error for --gdb-index
(in LLDDwarfObj<ELFT>::findAux()).

Relocations from .debug_info (not in comdat) to .text.* (in comdat) for
DW_AT_low_pc are common. If an .text.* was discarded, rLLD361144 would error,
which was unexpected. (Note, if we don't error as this patch does,
InputSection::relocateNonAlloc() will resolve such relocations).

llvm-svn: 361830
2019-05-28 14:34:28 +00:00
Haojian Wu 241dcb386e Revert [ELF] Error on relocations to STT_SECTION symbols if the sections were discarded
This reverts r361792 (git commit cfca5095df), the
revision causes link errors internally, will share more details with the
author.

llvm-svn: 361806
2019-05-28 11:21:59 +00:00
Fangrui Song 173a68f1fb [ELF] Replace two addSymbol() call sites with Symbol::resolve(). NFC
If we have a handle of the symbol, insert() called by addSymbol() is
redundant. Just call resolve().

llvm-svn: 361802
2019-05-28 10:12:06 +00:00
Fangrui Song cfca5095df [ELF] Error on relocations to STT_SECTION symbols if the sections were discarded
This is implemented by creating Undefined (instead of Defined) for such
local STT_SECTION symbols. It allows us to catch errors when there are
relocations to such discarded sections (e.g. in PR41693, ld.bfd and gold
error but we don't). Updated comdat-discarded-error.s checks we emit
friendly error message.

For relocatable-eh-frame.s, ld.lld -r a.o a.o will now error
"STT_SECTION symbol should be defined" because the section .eh_frame
refers to is now an Undefined instead of a Defined.
So I have to change `error()` to `warn()` to retain the output.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D61583

llvm-svn: 361792
2019-05-28 06:34:52 +00:00
Rui Ueyama d8f8abbd4a Use SymbolTable::insert() to implement --trace.
Differential Revision: https://reviews.llvm.org/D62381

llvm-svn: 361791
2019-05-28 06:33:06 +00:00
Rui Ueyama 92069605bf Merge ELFFileBase::{initSymtab,parseHeader} as ELFFileBase:init. NFC.
This patch simplifies ELFFile instance initialization by merging
two similar functions into a single function and call it from the
ctor.

llvm-svn: 361789
2019-05-28 05:17:21 +00:00
Rui Ueyama 76737f4d19 Remove elf::createSharedFile and move its code to SharedFile's ctor. NFC.
llvm-svn: 361747
2019-05-27 07:26:13 +00:00
Sam Clegg a5ca34e6b3 [WebAssebmly] Add support for --wrap
The code for implementing this features is taken almost verbatim
from the ELF backend.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=41681

Differential Revision: https://reviews.llvm.org/D62380

llvm-svn: 361639
2019-05-24 14:14:25 +00:00
Sam Clegg 7991b68284 [lld] Trace all references with lld --trace-symbol
Previously undefined symbol references were only traced if they were
seen before that definition.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41878

Differential Revision: https://reviews.llvm.org/D61929

llvm-svn: 361636
2019-05-24 13:29:17 +00:00
Fangrui Song 7d4a67852d [ELF] Fix a doc typo. NFC
llvm-svn: 361617
2019-05-24 09:53:25 +00:00
Fangrui Song 7f1ff68a16 [ELF] Deleted unused forward declarations. NFC
llvm-svn: 361614
2019-05-24 09:25:47 +00:00
Peter Collingbourne ca6a8ae0bf ELF: Remove a comparison against In.EhFrame. NFCI.
This won't work once we have multiple .eh_frame sections.

Differential Revision: https://reviews.llvm.org/D62280

llvm-svn: 361556
2019-05-23 21:30:30 +00:00
Rui Ueyama f5d9d23905 Simplify InputFile::fetch().
We don't have to return a value from the function. Instead, we can
directly call parseFile from the functions.

llvm-svn: 361478
2019-05-23 10:15:12 +00:00
Rui Ueyama 821a1ac050 Remove LazyObjFile::AddedToLink.
Instead we can just clear a MemoryBuffer so that we cannot get the
same buffer more than once.

llvm-svn: 361477
2019-05-23 10:08:56 +00:00
Rui Ueyama 7f7d2b2e62 Move code for symbol resolution from SymbolTable.cpp to Symbols.cpp.
My recent commits separated symbol resolution from the symbol table,
so the functions to resolve symbols are now in a somewhat wrong file.
This patch moves it to Symbols.cpp.

The functions are now member functions of the symbol.

This is code move change. I modified function names so that they are
appropriate as member functions, though. No functionality change
intended.

Differential Revision: https://reviews.llvm.org/D62290

llvm-svn: 361474
2019-05-23 09:58:08 +00:00
Rui Ueyama 4254840313 Speed up --start-lib and --end-lib.
--{start,end}-lib give files grouped by the options the archive file
semantics. That is, each object file between them acts as if it were
in an archive file whose sole member is the file.

Therefore, files between --{start,end}-lib are linked to the final
output only if they are needed to resolve some undefined symbols.

Previously, the feature was implemented this way:

 1. We read a symbol table and insert defined symbols to the symbol
    table as lazy symbols.

 2. If an undefind symbol is resolved to a lazy symbol, that lazy
    symbol instantiate ObjFile class for that symbol, which re-insert
    all defined symbols to the symbol table.

So, if an ObjFile is instantiated, defined symbols are inserted to the
symbol table twice. Since inserting long symbol names is not cheap,
there's a room to optimize here.

This patch optimzies it. Now, LazyObjFile remembers symbol handles and
passed them over to a new ObjFile instance, so that the ObjFile
doesn't insert the same strings.

Here is a quick benchmark to link clang. "Original" is the original
lld with unmodified command line options. For "Case 1" and "Case 2", I
extracted all files from archive files and replace .a's in a command
line with .o's wrapped with --{start,end}-lib. I used the original lld
for Case 1" and use this patch for Case 2.

  Original: 5.892
    Case 1: 6.001 (+1.8%)
    Case 2: 5.701 (-3.2%)

So, interestingly, --{start,end}-lib are now faster than the regular
linking scheme with archive files. That's perhaps not too surprising,
though, because for regular archive files, we look up the symbol table
with the same string twice.

Differential Revision: https://reviews.llvm.org/D62188

llvm-svn: 361473
2019-05-23 09:53:30 +00:00
George Rimar 77b4f0abb8 [LLD][ELF] - Improve diagnostic about unrecognized relocations.
This is a minor improvement inspired by https://bugs.llvm.org/show_bug.cgi?id=38303.

A person reported that he observed message complaining about unsupported R_ARM_V4BX:
error: can't create dynamic relocation R_ARM_V4BX against local symbol in readonly segment; recompile object files with -fPIC

But with -z notext he only saw a relocation number, what is not convenient:
error: ../../gfx/cairo/libpixman/src/pixman-arm-neon-asm-bilinear.o:(.text+0x4F0): unrecognized reloc 40

Also, in the error messages we use relocation but not reloc.

With this patch we start to print one of the following messages:
error: file.o: unrecognized relocation Unknown(999)
error: file.o: unrecognized relocation R_X_KNOWN_BY_LLVM_BUT_UNSUPPORTED_BY_LLD_NAME

There is no way to write a test for that I believe.

Differential revision: https://reviews.llvm.org/D62237

llvm-svn: 361472
2019-05-23 09:50:18 +00:00
Rui Ueyama 0baaf45be7 Move SymbolTable::addCombinedLTOObject() to LinkerDriver.
Also renames it LinkerDriver::compileBitcodeFiles.

The function doesn't logically belong to SymbolTable. We added this
function to the symbol table because symbol table used to be a
container of input files. This is no longer the case.

Differential Revision: https://reviews.llvm.org/D62291

llvm-svn: 361469
2019-05-23 09:26:27 +00:00
Rui Ueyama ecf6eb515f Copy symbol length when we replace a symbol.
Symbol's NameSize is computed lazily. Currently, when we replace a symbol,
a cached length value can be discarded. This patch propagates that value.

Differential Revision: https://reviews.llvm.org/D62234

llvm-svn: 361364
2019-05-22 09:19:30 +00:00
Fangrui Song b72b091389 [ELF] Improve error message for relocations to symbols defined in discarded sections
Rather than report "undefined symbol: ", give more informative message
about the object file that defines the discarded section.

In particular, PR41133, if the section is a discarded COMDAT, print the
section group signature and the object file with the prevailing
definition. This is useful to track down some ODR issues.

We need to
* add `uint32_t DiscardedSecIdx` to Undefined for this feature.
* make ComdatGroups public and change its type to DenseMap<CachedHashStringRef, const InputFile *>

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D59649

llvm-svn: 361359
2019-05-22 09:06:42 +00:00
Rui Ueyama 33e74d9f62 Simplify the logic to instantiate Symbols. Should be NFC.
llvm-svn: 361350
2019-05-22 04:56:25 +00:00
Fangrui Song 5ea0d06e81 [ELF] Deleted unused ComdatGroups member variable left by D61854
llvm-svn: 361266
2019-05-21 14:40:38 +00:00
Fangrui Song ecf4c9e13c [ELF] Don't advance position in a memory region when assigning to the Dot
For memory5.test, ld.bfd appears to ignore `. += 0x2000;`, so the test was testing
a wrong behavior. After deleting the code added in rLLD336335, we match ld.bfd and thus fix PR41357.

PR37836 (memory4.test) seems to have been fixed by another change.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62177

llvm-svn: 361228
2019-05-21 08:21:44 +00:00
Fangrui Song 547e3e930c [ELF] Error on relocations to local undefined symbols
For a reference to a local symbol, ld.bfd and gold error if the symbol
is defined in a discarded section but accept it if the symbol is
undefined. This inconsistent behavior seems unnecessary for us (it
probably makes sense for them as they differentiate local/global
symbols, the error would mean more code).

Catch such errors. Symbol index 0 may be used by marker relocations,
e.g. R_*_NONE R_ARM_V4BX. Don't error on them.

The difference from D61563 (which caused msan failure) is we don't call
Sym.computeBinding() on local symbols - VersionId is uninitialized.

llvm-svn: 361213
2019-05-21 02:38:11 +00:00
Tiancong Wang a5d8d01d6f [ELF][Driver] Fix precedence of symbol ordering file and CGProfile
This patch is a fix for https://bugs.llvm.org/show_bug.cgi?id=41804.
We try to solve the precedence of user-specified symbol ordering file and C3 ordering provided as call graph. It deals with two case:
(1) When both --symbol-ordering-file=<file> and --call-graph-order-file=<file> are present, whichever flag comes later will take precedence.
(2) When only --symbol-ordering-file=<file> is present, it takes precedence over implicit call graph (CGProfile) generated by CGProfilePass enabled in new pass manager.

llvm-svn: 361190
2019-05-20 19:13:34 +00:00
Tiancong Wang af4219adf5 Test commit, add an empty line.
llvm-svn: 361186
2019-05-20 18:46:25 +00:00
Fangrui Song 055906e1e5 [ELF] -z combreloc: sort dynamic relocations by (!is_relative,symbol_index,r_offset)
We currently sort dynamic relocations by (!is_relative,symbol_index).
Add r_offset as the third key. This makes `readelf -r` debugging easier
(relocations to the same symbol are ordered by r_offset).

Refactor the test combreloc.s (renamed from combrelocs.s) to check
R_X86_64_RELATIVE, and delete --expand-relocs.

The difference from the reverted D61477 is that we keep !is_relative as
the first key. In local dynamic TLS model, DTPMOD (e.g.
R_ARM_TLS_DTPMOD32 R_X86_64_DTPMOD and R_PPC{,64}_DTPMOD) may use 0 as
the symbol index.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D62141

llvm-svn: 361164
2019-05-20 15:25:01 +00:00
Dmitri Gribenko 0429fddc9d Revert "[ELF] Error on relocations to local undefined symbols"
This reverts commit r361144.  It causes a use-of-uninitialized-value in
maybeReportUndefined at llvm/tools/lld/ELF/Relocations.cpp:682, as
detected by MemorySanitizer when local-undefined-symbol.s test is run.

llvm-svn: 361162
2019-05-20 15:04:08 +00:00
Dmitri Gribenko a2fbe2bcda Revert "[ELF] -z combreloc: sort dynamic relocations by (symbol_index,r_offset)"
This reverts commit r361125.  This linker change breaks shared libraries
in some subtle way on x86_64.  (Specifically, gold segfaults when
loading the LLVMgold.so plugin linked with lldb with this patch.)

llvm-svn: 361150
2019-05-20 13:05:55 +00:00
Fangrui Song 2109572464 [ELF] Fix getRelocTargetVA formulae of R_TLS and R_NEG_TLS
For R_TLS:
1) Delete Sym.isTls() . The assembler ensures the symbol is STT_TLS.
   If not (the input is broken), we would crash (dereferencing null Out::TlsPhdr).
2) Change Sym.isUndefWeak() to Sym.isUndefined(), otherwise with --noinhibit-exec
   we would still evaluate the symbol and crash.
3) Return A if the symbol is undefined. This is PR40570.
   The case is probably unrealistic but returning A matches R_ABS and the
   behavior of several dynamic loaders.

R_NEG_TLS is obsoleted Sun TLS we don't fully support, but
R_RELAX_TLS_GD_TO_LE_NEG is still used by GD->LE relaxation (subl $var@tpoff,%eax).

They should add the addend. Unfortunately I can't test it as compilers don't seem to generate non-zero implicit addends.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62098

llvm-svn: 361146
2019-05-20 11:47:31 +00:00
Fangrui Song 7c7425483a [ELF] Error on relocations to local undefined symbols
For a reference to a local symbol, ld.bfd and gold error if the symbol
is defined in a discarded section but accept it if the symbol is
undefined. This inconsistent behavior seems unnecessary for us (it
probably makes sense for them as they differentiate local/global
symbols, the error would mean more code).

Weaken the condition to getSymbol(Config->IsMips64EL) == 0 to catch such
errors. The symbol index can be 0 (e.g. R_*_NONE R_ARM_V4BX) and we shouldn't error on them.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D61563

llvm-svn: 361144
2019-05-20 11:25:55 +00:00
Fangrui Song 9f1a6de631 [ELF] -z combreloc: sort dynamic relocations by (symbol_index,r_offset)
Fixes PR41692.

We currently sort dynamic relocations by (!is_relative,symbol_index).
Change it to (symbol_index,r_offset). We still place relative
relocations first because R_*_RELATIVE are the only dynamic relocations
with 0 symbol index (except on MIPS, which doesn't use DT_REL[A]COUNT
anyway).

This makes `readelf -r` debugging easier (relocations to the same symbol
are ordered by r_offset).

Refactor the test combreloc.s (renamed from combrelocs.s) to check
R_X86_64_RELATIVE, and delete --expand-relocs.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D61477

llvm-svn: 361125
2019-05-20 07:22:55 +00:00
Rui Ueyama faf541e1e1 Make replaceSymbol a member function of Symbol.
This is a mechanical rewrite of replaceSymbol(A, B) to A->replace(B).
I also added a comment to Symbol::replace().

Technically this change is not necessary, but this change makes code a
bit more concise.

Differential Revision: https://reviews.llvm.org/D62117

llvm-svn: 361123
2019-05-20 03:36:33 +00:00
Fangrui Song a6720e7407 [ELF] Copy IsPreemptible in replaceSymbol()
Otherwise, we may set IsPreemptible (e.g. --dynamic-list) then clear it
(in replaceCommonSymbols()).

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62107

llvm-svn: 361122
2019-05-20 02:07:03 +00:00
Michael Liao 63621832da Suppress false-positive GCC -Wreturn-type warning.
llvm-svn: 361094
2019-05-18 06:35:47 +00:00
Fangrui Song ed2ad77ccb [ARM][AArch64] Revert Android Bionic PT_TLS overaligning hack
This reverts D53906.

D53906 increased p_align of PT_TLS on ARM/AArch64 to 32/64 to make the
static TLS layout compatible with Android Bionic's ELF TLS. However,
this may cause glibc ARM/AArch64 programs to crash (see PR41527).

The faulty PT_TLS in the executable satisfies p_vaddr%p_align != 0. The
remainder is normally 0 but may be non-zero with the hack in place. The
problem is that we increase PT_TLS's p_align after OutputSections'
addresses are fixed (assignAddress()). It is possible that
p_vaddr%old_p_align = 0 while p_vaddr%new_p_align != 0.

For a thread local variable defined in the executable, lld computed TLS
offset (local exec) is different from glibc computed TLS offset from
another module (initial exec/generic dynamic). Note: PR41527 said the
bug affects initial exec but actually generic dynamic is affected as
well.

(glibc is correct in that it compute offsets that satisfy
`offset%p_align == p_vaddr%p_align`, which is a basic ELF requirement.
This hack appears to work on FreeBSD rtld, musl<=1.1.22, and Bionic, but
that is just because they (and lld) incorrectly compute offsets that
satisfy `offset%p_align = 0` instead.)

Android developers are fine to revert this patch, carry this patch in
their tree before figuring out a long-term solution (e.g. a dummy .tdata
with sh_addralign=64 sh_size={0,1} in crtbegin*.o files. The overhead is
now insignificant after D62059).

Reviewed By: rprichard, srhines

Differential Revision: https://reviews.llvm.org/D62055

llvm-svn: 361090
2019-05-18 03:16:00 +00:00
Fangrui Song 898896836d [ELF][X86] Fix R_RELAX_TLS_GD_TO_LE_NEG and R_NEG_TLS after D62059
After D62059, we don't align p_memsz of PT_TLS to p_align. The
getRelocTargetVA formula should align it instead.

It becomes clear that R_NEG_TLS and R_TLS are opposite from each other.

In i386-tls-le-align.s, I put ret after call ___tls_get_addr@plt as
otherwise ld.bfd would reject the relaxation:
TLS transition from R_386_TLS_GD to R_386_TLS_LE_32 against `a' at 0x3 in section `.text' failed

llvm-svn: 361088
2019-05-18 01:58:40 +00:00
Fangrui Song 348731aeed [ELF] Fix TP offset of TLS Variant I after D62059
As Ryan Prichard pointed out, after D62059, the TP offset is incorrect.

Add x86-64-tls-le-align.s to check this.  Better formulae for both
variants should take p_vaddr%p_align into account (offset%p_align =
p_vaddr%p_align is a basic ELF requirement), but I can't find a way to
test the behavior.

llvm-svn: 361084
2019-05-18 00:43:10 +00:00
Fangrui Song f3a3b93f54 [ELF] -r: fix R_*_NONE to section symbols on Elf*_Rel targets
On Elf*_Rel targets, for a relocation to a section symbol, an R_ABS is
added which will be used by relocateOne() to compute the implicit
addend.

Addends of R_*_NONE should be ignored, so don't emit an R_ABS.

This fixes crashes on X86 and ARM because their relocateOne() do not
handle R_*_NONE.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D62052

llvm-svn: 361036
2019-05-17 14:11:03 +00:00
Fangrui Song f3dccc64af [ELF] Don't align PT_TLS's p_memsz
The code was added in r252352, probably to address some layout issues.
Actually PT_TLS's p_memsz doesn't need to be aligned on either variant.
ld.bfd doesn't do that.

In case of larger alignment (e.g. 64 for Android Bionic on AArch64, see
D62055), this may make the overhead smaller.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62059

llvm-svn: 361029
2019-05-17 12:48:53 +00:00
Ben Dunbobbin 1d16515fb4 [ELF] Implement Dependent Libraries Feature
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.

Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.

The design goals were to provide:

- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
  environments (MSVC in particular).

Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.

In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:

1. There are no competing defined symbols in a given set of libraries, or
   if they exist, the program owner doesn't care which is linked to their
   program.
2. There may be circular dependencies between libraries.

The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:

.section ".deplibs","MS",@llvm_dependent_libraries,1
         .asciz "foo"

For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.

LLD processes the dependent library specifiers in the following way:

1. Dependent libraries which are found from the specifiers in .deplibs sections
   of relocatable object files are added when the linker decides to include that
   file (which could itself be in a library) in the link. Dependent libraries
   behave as if they were appended to the command line after all other options. As
   a consequence the set of dependent libraries are searched last to resolve
   symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
   to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
   strings in a .deplibs section by; first, handling the string as if it was
   specified on the command line; second, by looking for the string in each of the
   library search paths in turn; third, by looking for a lib<string>.a or
   lib<string>.so (depending on the current mode of the linker) in each of the
   library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
   dependent libraries.

Rationale for the above points:

1. Adding the dependent libraries last makes the process simple to understand
   from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
   failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
   will affect all of the dependent libraries. There is a potential problem of
   surprise for developers, who might not realize that these options would apply
   to these "invisible" input files; however, despite the potential for surprise,
   this is easy for developers to reason about and gives developers the control
   that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
   find input files. The different search methods are tried by the linker in most
   obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
   ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
   is not necessary: if finer control is required developers can fall back to using
   the command line directly.

RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.

Differential Revision: https://reviews.llvm.org/D60274

llvm-svn: 360984
2019-05-17 03:44:15 +00:00
Rui Ueyama bbf154cf9c Move symbol resolution code out of SymbolTable class.
This is the last patch of the series of patches to make it possible to
resolve symbols without asking SymbolTable to do so.

The main point of this patch is the introduction of
`elf::resolveSymbol(Symbol *Old, Symbol *New)`. That function resolves
or merges given symbols by examining symbol types and call
replaceSymbol (which memcpy's New to Old) if necessary.

With the new function, we have now separated symbol resolution from
symbol lookup. If you already have a Symbol pointer, you can directly
resolve the symbol without asking SymbolTable to do that.

Now that the nice abstraction become available, I can start working on
performance improvement of the linker. As a starter, I'm thinking of
making --{start,end}-lib faster.

--{start,end}-lib is currently unnecessarily slow because it looks up
the symbol table twice for each symbol.

 - The first hash table lookup/insertion occurs when we instantiate a
   LazyObject file to insert LazyObject symbols.

 - The second hash table lookup/insertion occurs when we create an
   ObjFile from LazyObject file. That overwrites LazyObject symbols
   with Defined symbols.

I think it is not too hard to see how we can now eliminate the second
hash table lookup. We can keep LazyObject symbols in Step 1, and then
call elf::resolveSymbol() to do Step 2.

Differential Revision: https://reviews.llvm.org/D61898

llvm-svn: 360975
2019-05-17 01:55:20 +00:00