Commit Graph

336 Commits

Author SHA1 Message Date
Fangrui Song f347541fbc [ELF] resolveUndefined: ignore undefined symbols in SharedFile for Undefined and SharedSymbol
If %t1.o has a weak reference on foo, and %t2.so has a non-weak
reference on foo: `ld.lld %t1.o %t2.so -o %t`

We incorrectly set the binding of the undefined foo to STB_GLOBAL.
Fix this by ignoring undefined symbols in a SharedFile for Undefined and
SharedSymbol.

This fixes the binding of pthread_once when the program links against
both librt.so and libpthread.so

```
a.o: STB_WEAK reference to pthread_once
librt.so: STB_GLOBAL reference to pthread_once    # should be ignored
libstdc++.so: STB_WEAK reference to pthread_once  # should be ignored
libgcc_s.so.1: STB_WEAK reference to pthread_once # should be ignored
```

The STB_GLOBAL pthread_once issue (not fixed by D63974) can cause a link error when the result
DSO is used to link another DSO with -z defs if -lpthread is not specified. (libstdc++.so.6 not having a dependency on libpthread.so is a really nasty hack...)

We happened to create a weak undef before D63974 because libgcc_s.so.1
was linked the last and it changed the binding again to weak.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D64136

llvm-svn: 365129
2019-07-04 10:38:04 +00:00
Rui Ueyama 11ae59f0ce Avoid identifiers that are different only in case. NFC.
Some variables in lld have the same name as functions ignoring case.
This patch gives them different names, so that my next patch is easier
to read.

llvm-svn: 365003
2019-07-03 06:11:50 +00:00
Fangrui Song 1c70d136fb [ELF] Only allow the binding of SharedSymbol to change for the first undef ref
Fixes PR42442

t.o has a STB_GLOBAL undef ref to f
t2.so has a STB_WEAK undef ref to f
t1.so defines f

ld.lld t.o t1.so t2.so currently sets the binding of `f` to STB_WEAK.
This is not correct because there exists a STB_GLOBAL undef ref from a
regular object. The problem is that resolveUndefined() doesn't check
if the undef ref is seen for the first time:

    if (isShared() || isLazy() || (isUndefined() && Other.Binding != STB_WEAK))
      Binding = Other.Binding;

The isShared() condition should be `isShared() && !Referenced`
where Referenced is set to true after an undef ref is seen.

In practice, when linking a pthread program with glibc:

    // a.o
    #include <pthread.h>
    pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
    int main() { pthread_mutex_unlock(&mu); }

{clang,gcc} -fuse-ld=lld a.o -lpthread # libpthread.so is linked before libgcc_s.so.1

The weak undef pthread_mutex_unlock in libgcc_s.so.1 makes the result
weak, which diverges from GNU linkers where STB_DEFAULT is used:

    23: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND pthread_mutex_lock

(Note, if -pthread is used instead, libpthread.so will be linked **after**
libgcc_s.so.1 . lld sets the binding to the expected STB_GLOBAL)

Similar linking sequences (ld.lld t.o t1.so t2.so) appear to be used by
Go, which cause a build error https://github.com/golang/go/issues/31912.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D63974

llvm-svn: 364913
2019-07-02 11:37:21 +00:00
Fangrui Song ba51fd5664 Reland D61583 [ELF] Error on relocations to STT_SECTION symbols if the sections were discarded
This restores r361830 "[ELF] Error on relocations to STT_SECTION symbols if the sections were discarded"
and dependent commits (r362218, r362497) which were reverted by r364321, with a fix of a --gdb-index issue.

.rela.debug_ranges contains relocations of range list entries:

    // start address of a range list entry
    // old: 0; after r361830: 0
    00000000000033a0 R_X86_64_64 .text._ZN2v88internal7Isolate7factoryEv + 0
    // end address of a range list entry
    // old: 0xe; after r361830: 0
    00000000000033a8 R_X86_64_64 .text._ZN2v88internal7Isolate7factoryEv + e

If both start and end addresses of a range list entry resolve to 0,
DWARFDebugRangeList::isEndOfListEntry() will return true, then the
.debug_range decoding loop will terminate prematurely:

    while (true) {
      decode StartAddress
      decode EndAddress
      if (Entry.isEndOfListEntry()) // prematurely
        break;
      Entries.push_back(Entry);
    }

In lld/ELF/SyntheticSections.cpp, readAddressAreas() will read
incomplete address ranges and the resulting .gdb_index will be
incomplete. For files that gdb hasn't loaded their debug info, gdb uses
.gdb_index to map addresses to CUs. The absent entries make gdb fail to
symbolize some addresses.

To address this issue, we simply allow relocations to undefined symbols
in DWARF.cpp:findAux() and let RelocationResolver resolve them.

This patch should fix:

[1] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190603/659848.html
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=978067

llvm-svn: 364391
2019-06-26 08:09:08 +00:00
Hans Wennborg 36c23cad15 Revert r362743 "Revert "Revert "Reland D61583 [ELF] Error on relocations to STT_SECTION symbols if the sections were discarded"""
(In effect, reverting "[ELF] Error on relocations to STT_SECTION symbols if the sections were discarded".)

It caused debug info problems in LibreOffice [1] and Chromium/V8 [2].
Reverting until those can be fixed.

It also reverts r362497 "STT_SECTION symbol should be defined" on .eh_frame, .debug*, .zdebug* and .gcc_except_table"
which was landed as a follow-up to the above.

> With -r or --emit-relocs, we warn `STT_SECTION symbol should be defined`
> on relocations to discarded section symbol. This was added as an error
> in rLLD319404, but was not so effective before D61583 (it turned the
> error to a warning).
>
> Relocations from .eh_frame .debug* .zdebug* .gcc_except_table to
> discarded .text are very common and somewhat expected. Don't warn/error
> on them. As a reference, ld.bfd has a similar logic in
> _bfd_elf_default_action_discarded() to allow these cases.
>
> Delete invalid-undef-section-symbol.test because what it intended to
> check is now covered by the updated comdat-discarded-reloc.s
>
> Delete relocatable-eh-frame.s because we allow relocations from
> .eh_frame as a special case now.

And finally it reverts r362218 "[ELF] Replace a dead test in getSymVA() with assert()"
as that also depended on the main change reverted here.

> Symbols relative to discarded comdat sections are Undefined instead of
> Defined now (after D59649 and D61583). The `== &InputSection::Discarded`
> test becomes dead. I cannot find a test related to this behavior.

 [1] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190603/659848.html
 [2] https://bugs.chromium.org/p/chromium/issues/detail?id=978067

llvm-svn: 364321
2019-06-25 14:58:46 +00:00
Rui Ueyama e63ae7fee4 Fix an issue that common symbols are not internalized under some condition.
r360841 introduced CommonSymbol class. An unintended behavioral change
introduced by that change was that common symbols are not internalized
by LTO under some condition. This patch fixes that issue.

The issue occurred under the following condition:

  1. There exists a common symbol
  2. At least one DSO is given to lld or -pie is used

If the above conditions are met, Symbol::includeInDynsym() returned a
wrong value for a common symbol.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41978

Differential Revision: https://reviews.llvm.org/D63752

llvm-svn: 364273
2019-06-25 06:58:07 +00:00
Fangrui Song 5b4285d82d [ELF][RISCV] Create dummy .sdata for __global_pointer$ if .sdata does not exist
If .sdata is absent, linker synthesized __global_pointer$ gets a section index of SHN_ABS.
(ld.bfd has a similar issue: binutils PR24678)

Scrt1.o may use `lla gp, __global_pointer$` to reference the symbol PC
relatively. In -pie/-shared mode, lld complains if a PC relative
relocation references an absolute symbol (SHN_ABS) but ld.bfd doesn't:

    ld.lld: error: relocation R_RISCV_PCREL_HI20 cannot refer to lute symbol: __global_pointer$

Let the reference of __global_pointer$ to force creation of .sdata to
fix the problem. This is similar to _GLOBAL_OFFSET_TABLE_, which forces
creation of .got or .got.plt .

Also, change the visibility from STV_HIDDEN to STV_DEFAULT and don't
define the symbol for -shared. This matches ld.bfd, though I don't
understand why it uses STV_DEFAULT.

Reviewed By: ruiu, jrtc27

Differential Revision: https://reviews.llvm.org/D63132

llvm-svn: 363351
2019-06-14 02:14:53 +00:00
Fangrui Song e98baf8631 [ELF] Delete GotEntrySize and GotPltEntrySize
GotEntrySize and GotPltEntrySize were added in D22288. Later, with
the introduction of wordsize() (then Config->Wordsize), they become
redundant, because there is no target that sets GotEntrySize or
GotPltEntrySize to a number different from Config->Wordsize.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D62727

llvm-svn: 362220
2019-05-31 10:35:45 +00:00
Fangrui Song 3f29cfd915 [ELF] Replace a dead test in getSymVA() with assert()
Symbols relative to discarded comdat sections are Undefined instead of
Defined now (after D59649 and D61583). The `== &InputSection::Discarded`
test becomes dead. I cannot find a test related to this behavior.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62725

llvm-svn: 362218
2019-05-31 10:12:22 +00:00
Fangrui Song 0526c0cd8e [ELF] Implement Local Dynamic style TLSDESC for x86-64
For the Local Dynamic case of TLSDESC, _TLS_MODULE_BASE_ is defined as a
special TLS symbol that makes:

1) Without relaxation: it produces a dynamic TLSDESC relocation that
computes 0. Adding @dtpoff to access a TLS symbol.
2) With LD->LE relaxation: _TLS_MODULE_BASE_@tpoff = 0 (lowest address in
the TLS block). Adding @tpoff to access a TLS symbol.

For 1), this saves dynamic relocations and GOT slots as otherwise
(General Dynamic) we would create an R_X86_64_TLSDESC and reserve two
GOT slots for each symbol.

Add ElfSym::TlsModuleBase and change the signature of getTlsTpOffset()
to special case _TLS_MODULE_BASE_.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D62577

llvm-svn: 362078
2019-05-30 10:00:20 +00:00
Peter Collingbourne ba2816be82 ELF: Add basic partition data structures and behaviours.
This change causes us to read partition specifications from partition
specification sections and split output sections into partitions according
to their reachability from partition entry points.

This is only the first step towards a full implementation of partitions. Later
changes will add additional synthetic sections to each partition so that
they can be loaded independently.

Differential Revision: https://reviews.llvm.org/D60353

llvm-svn: 361925
2019-05-29 03:55:20 +00:00
Sam Clegg 7991b68284 [lld] Trace all references with lld --trace-symbol
Previously undefined symbol references were only traced if they were
seen before that definition.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41878

Differential Revision: https://reviews.llvm.org/D61929

llvm-svn: 361636
2019-05-24 13:29:17 +00:00
Rui Ueyama f5d9d23905 Simplify InputFile::fetch().
We don't have to return a value from the function. Instead, we can
directly call parseFile from the functions.

llvm-svn: 361478
2019-05-23 10:15:12 +00:00
Rui Ueyama 7f7d2b2e62 Move code for symbol resolution from SymbolTable.cpp to Symbols.cpp.
My recent commits separated symbol resolution from the symbol table,
so the functions to resolve symbols are now in a somewhat wrong file.
This patch moves it to Symbols.cpp.

The functions are now member functions of the symbol.

This is code move change. I modified function names so that they are
appropriate as member functions, though. No functionality change
intended.

Differential Revision: https://reviews.llvm.org/D62290

llvm-svn: 361474
2019-05-23 09:58:08 +00:00
Rui Ueyama 5c073a94f9 Introduce CommonSymbol.
Previously, we handled common symbols as a kind of Defined symbol,
but what we were doing for common symbols is pretty different from
regular defined symbols.

Common symbol and defined symbol are probably as different as shared
symbol and defined symbols are different.

This patch introduces CommonSymbol to represent common symbols.
After symbols are resolved, they are converted to Defined symbols
residing in a .bss section.

Differential Revision: https://reviews.llvm.org/D61895

llvm-svn: 360841
2019-05-16 03:29:03 +00:00
Rui Ueyama 7d4761928e Simplify SymbolTable::add{Defined,Undefined,...} functions.
SymbolTable's add-family functions have lots of parameters because
when they have to create a new symbol, they forward given arguments
to Symbol's constructors. Therefore, the functions take at least as
many arguments as their corresponding constructors.

This patch simplifies the add-family functions. Now, the functions
take a symbol instead of arguments to construct a symbol. If there's
no existing symbol, a given symbol is memcpy'ed to the symbol table.
Otherwise, the functions attempt to merge the existing and a given
new symbol.

I also eliminated `CanOmitFromDynSym` parameter, so that the functions
take really one argument.

Symbol classes are trivially constructible, so looks like constructing
them to pass to add-family functions is as cheap as passing a lot of
arguments to the functions. A quick benchmark showed that this patch
seems performance-neutral.

This is a preparation for
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html

Differential Revision: https://reviews.llvm.org/D61855

llvm-svn: 360838
2019-05-16 02:14:00 +00:00
Siva Chandra 1915e2be93 [ELF] Emit weak-undef symbols in .dynsym of a PIE binary only if linked against shared libs.
Reviewers: espindola

Subscribers: emaste, arichardson, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59275

llvm-svn: 356374
2019-03-18 15:32:57 +00:00
Simon Atanasyan fae2a509fa [MIPS] Handle cross-mode (regular <-> microMIPS) jumps
The patch solves two tasks:

1. MIPS ABI allows to mix regular and microMIPS code and perform
cross-mode jumps. Linker needs to detect such cases and replace
jump/branch instructions by their cross-mode equivalents.

2. Other tools like dunamic linkers need to recognize cases when dynamic
table entries, e_entry field of an ELF header etc point to microMIPS
symbol. Linker should provide such information.

The first task is implemented in the `MIPS<ELFT>::relocateOne()` method.
New routine `fixupCrossModeJump` detects ISA mode change, checks and
replaces an instruction.

The main problem is how to recognize that relocation target is microMIPS
symbol. For absolute and section symbols compiler or assembler set the
less-significant bit of the symbol's value or sum of the symbol's value
and addend. And this bit signals to linker about microMIPS code. For
global symbols compiler cannot do the same trick because other tools like,
for example, disassembler wants to know an actual position of the symbol.
So compiler sets STO_MIPS_MICROMIPS flag in the `st_other` field.

In `MIPS<ELFT>::relocateOne()` method we have a symbol's value only and
cannot access any symbol's attributes. To pass type of the symbol
(regular/microMIPS) to that routine as well as other places where we
write a symbol value as-is (.dynamic section, `Elf_Ehdr::e_entry` field
etc) we set when necessary a less-significant bit in the `getSymVA`
function.

Differential revision: https://reviews.llvm.org/D40147

llvm-svn: 354311
2019-02-19 10:36:58 +00:00
Peter Collingbourne 8331f61a51 ELF: Allow GOT relocs pointing to non-preemptable ifunc to resolve to an IRELATIVE where possible.
Non-GOT non-PLT relocations to non-preemptible ifuncs result in the
creation of a canonical PLT, which now takes the identity of the IFUNC
in the symbol table. This (a) ensures address consistency inside and
outside the module, and (b) fixes a bug where some of these relocations
end up pointing to the resolver.

Fixes (at least) PR40474 and PR40501.

Differential Revision: https://reviews.llvm.org/D57371

llvm-svn: 353981
2019-02-13 21:49:55 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Rui Ueyama 6f9d49cdde Do not emit a corrupt symbol table entry for .rela_iplt_{start,end}.
If .rela.iplt does not exist, we used to emit a corrupt symbol table
that contains two symbols, .rela_iplt_{start,end}, pointing to a
nonexisting section.

This patch fixes the issue by setting section index 0 to the symbols
if .rel.iplt section does not exist.

Differential Revision: https://reviews.llvm.org/D56623

llvm-svn: 351218
2019-01-15 18:30:23 +00:00
Rui Ueyama 63d397ea6e Simplify Symbol::getPltVA.
This patch also makes getPltEntryOffset a non-member function because
it doesn't depend on any private members of the TargetInfo class.

I tried a few different ideas, and it seems this change fits in best to me.

Differential Revision: https://reviews.llvm.org/D54981

llvm-svn: 347781
2018-11-28 17:42:59 +00:00
Fangrui Song f5badf4905 [ELF] Write IPLT header in -static -z retpolineplt mode
Summary:
This fixes PR39711: -static -z retpolineplt does not produce retpoline PLT header.
-z now is not relevant.

Statically linked executable does not have PLT, but may have IPLT with no header. When -z retpolineplt is specified, however, the repoline PLT header should still be emitted.

I've checked that this fixes the FreeBSD reproduce in PR39711 and a Linux program statically linked against glibc. The programm print "Hi" rather than SIGILL/SIGSEGV.

getPltEntryOffset may look dirty after this patch, but it can be cleaned up later.

Another possible improvement is that when there are non-preemptible IFUNC symbols (rare case, e.g. -Bsymbolic), both In.Plt and In.Iplt can be non-empty and we'll emit the retpoline PLT header twice.

Reviewers: espindola, emaste, chandlerc, ruiu

Reviewed By: emaste

Subscribers: emaste, arichardson, krytarowski, llvm-commits

Differential Revision: https://reviews.llvm.org/D54782

llvm-svn: 347404
2018-11-21 18:10:00 +00:00
Sean Fertile 614dc11ca8 [PPC64] Long branch thunks.
On PowerPC64, when a function call offset is too large to encode in a call
instruction the address is stored in a table in the data segment. A thunk is
used to load the branch target address from the table relative to the
TOC-pointer and indirectly branch to the callee. When linking position-dependent
code the addresses are stored directly in the table, for position-independent
code the table is allocated and filled in at load time by the dynamic linker.

For position-independent code the branch targets could have gone in the .got.plt
but using the .branch_lt section for both position dependent and position
independent binaries keeps it consitent and helps keep this PPC64 specific logic
seperated from the target-independent code handling the .got.plt.

Differential Revision: https://reviews.llvm.org/D53408

llvm-svn: 346877
2018-11-14 17:56:43 +00:00
Rui Ueyama 4c06a6cc90 Rename warnUnorderableSymbol maybeWarnUnorderableSymbol because the function doesn't always emit a warning.
llvm-svn: 345393
2018-10-26 15:07:12 +00:00
Rui Ueyama 660e8721a9 Remove a global variable that is set but not used.
llvm-svn: 345080
2018-10-23 21:00:28 +00:00
Rui Ueyama f3fad55787 Remove `Type` parameter from SymbolTable::insert(). NFC.
`Type` parameter was used only to check for TLS attribute mismatch,
but we can do that when we actually replace symbols, so we don't need
to type as an argument. This change should simplify the interface of
the symbol table a bit.

llvm-svn: 344394
2018-10-12 18:29:18 +00:00
Fangrui Song 11ca54f49c [ELF] Don't warn on undefined symbols if UnresolvedPolicy::Ignore is used
Summary:
Add a condition UnresolvedPolicy::Ignore to elf::warnUnorderedSymbol to suppress Sym->isUndefined() warnings from both

1) --symbol-ordering-file=
2) .llvm.call-graph-profile

If --unresolved-symbols=ignore-all is used,

  no "undefined symbol" error/warning is emitted. It makes sense to not warn unorderable symbols.

Otherwise,

  If an executable is linked, the default policy UnresolvedPolicy::ErrorOrWarn will issue a "undefined symbol" error. The unorderable symbol warning is redundant.

  If a shared object is linked, it is possible that only part of object files are used and some symbols are left undefined. The warning is not very necessary.
    In particular for .llvm.call-graph-profile, when linking a shared object, a call graph profile may contain undefined symbols. This case generated a warning before but it will be suppressed by this patch.

Reviewers: ruiu, davidxl, espindola

Reviewed By: ruiu

Subscribers: grimar, emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D53044

llvm-svn: 344195
2018-10-10 22:48:57 +00:00
George Rimar f79a8ef2ae [ELF] - Do not forget to include to .dymsym symbols that were converted to Defined.
This is the fix for
"Bug 39104 - LLD links incorrect ELF executable if version script contains "local: *;"
(https://bugs.llvm.org/show_bug.cgi?id=39104).

The issue happens when we have non-PIC program call to function in a shared library.
(for example, the PR above has R_X86_64_PC32 relocation against __libc_start_main)

LLD converts symbol to Defined in that case with the use of replaceWithDefined()

The issue is that after above we create a broken relocation because do not
include the symbol into .dynsym.

That happens when the version script is used because we treat the symbol as
STB_LOCAL if the following condition match:
VersionId == VER_NDX_LOCAL && isDefined() and do not include it to
.dynsym because of that. Patch fixes the issue.

Differential revision: https://reviews.llvm.org/D52724

llvm-svn: 343668
2018-10-03 09:33:00 +00:00
Rui Ueyama 4e247522ac Reset input section pointers to null on each linker invocation.
Previously, if you invoke lld's `main` more than once in the same process,
the second invocation could fail or produce a wrong result due to a stale
pointer values of the previous run.

Differential Revision: https://reviews.llvm.org/D52506

llvm-svn: 343009
2018-09-25 19:26:58 +00:00
Ryan Prichard 1c33d14bcd [ELF] Set Out::TlsPhdr earlier for encoding packed reloc tables
Summary:
For --pack-dyn-relocs=android, finalizeSections calls
LinkerScript::assignAddresses and
AndroidPackedRelocationSection::updateAllocSize in a loop,
where assignAddresses lays out the ELF image, then updateAllocSize
determines the size of the Android packed relocation table by encoding it.
Encoding the table requires knowing the values of relocation addends.

To get the addend of a TLS relocation, updateAllocSize can call getSymVA
on a TLS symbol before setPhdrs has initialized Out::TlsPhdr, producing an
error:

    <file> has an STT_TLS symbol but doesn't have an SHF_TLS section

Fix the problem by initializing Out::TlsPhdr immediately after the program
headers are created. The segment's p_vaddr field isn't initialized until
setPhdrs, so use FirstSec->Addr, which is what setPhdrs would use.
FirstSec will typically refer to the .tdata or .tbss output section, whose
(tentative) address was computed by assignAddresses.

Android currently avoids this problem because it uses emutls and doesn't
support ELF TLS. This problem doesn't apply to --pack-dyn-relocs=relr
because SHR_RELR only handles relative relocations without explicit addends
or info.

Fixes https://bugs.llvm.org/show_bug.cgi?id=37841.

Reviewers: ruiu, pcc, chh, javed.absar, espindola

Subscribers: emaste, arichardson, llvm-commits, srhines

Differential Revision: https://reviews.llvm.org/D51671

llvm-svn: 342432
2018-09-18 00:24:48 +00:00
Chih-Hung Hsieh 73e04847bf [ELF] Revert "Also demote lazy symbols."
This reverts commit https://reviews.llvm.org/rL330869
for a regression to link Android dex2oatds.

Differential Revision: https://reviews.llvm.org/D51892

llvm-svn: 342007
2018-09-11 23:00:36 +00:00
Rui Ueyama 5cd9c6bcd8 Support RISC-V
Patch by PkmX.

This patch makes lld recognize RISC-V target and implements basic
relocation for RV32/RV64 (and RVC). This should be necessary for static
linking ELF applications.

The ABI documentation for RISC-V can be found at:
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md.
Note that the documentation is far from complete so we had to figure out
some details from bfd.

The patch should be pretty straightforward. Some highlights:

 - A new relocation Expr R_RISCV_PC_INDIRECT is added. This is needed as
   the low part of a PC-relative relocation is linked to the corresponding
   high part (auipc), see:
   https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses

 - LLVM's MC support for RISC-V is very incomplete (we are working on
   this), so tests are given in objectyaml format with the original
   assembly included in the comments. Once we have complete support for
   RISC-V in MC, we can switch to llvm-as/llvm-objdump.

 - We don't support linker relaxation for now as it requires greater
   changes to lld that is beyond the scope of this patch. Once this is
   accepted we can start to work on adding relaxation to lld.

Differential Revision: https://reviews.llvm.org/D39322

llvm-svn: 339364
2018-08-09 17:59:56 +00:00
Peter Collingbourne 98930115ea ELF: Only add libcall symbols to the link if defined in bitcode.
Adding all libcall symbols to the link can have undesired consequences.
For example, the libgcc implementation of __sync_val_compare_and_swap_8
on 32-bit ARM pulls in an .init_array entry that aborts the program if
the Linux kernel does not support 64-bit atomics, which would prevent
the program from running even if it does not use 64-bit atomics.

This change makes it so that we only add libcall symbols to the
link before LTO if we have to, i.e. if the symbol's definition is in
bitcode. Any other required libcall symbols will be added to the link
after LTO when we add the LTO object file to the link.

Differential Revision: https://reviews.llvm.org/D50475

llvm-svn: 339301
2018-08-08 23:48:12 +00:00
George Rimar 904ed692a1 [ELF] - Simplify Symbol::getSize(). NFC.
There are following symbols currently available:
DefinedKind, SharedKind, UndefinedKind, LazyArchiveKind, LazyObjectKind.

Our code calls getSize() only for first two and there
seems to be no reason to return 0 for the rest.

llvm-svn: 337265
2018-07-17 11:35:28 +00:00
Peter Smith 796fb999b3 [ELF] Do not error for missing version when symbol has local version.
If a symbol with an undefined version in a DSO is not going to be
exported into the dynamic symbol table then do not give an error message
for the missing version. This can happen with the --exclude-libs option
which implicitly gives all symbols in a static library the local version.
This matches the behavior of ld.gold and is exploited by the Bionic
dynamic linker on Arm.

Differential Revision: https://reviews.llvm.org/D43126

llvm-svn: 332224
2018-05-14 10:13:56 +00:00
Rafael Espindola ab0cce5f1f Replace SharedSymbols with Defined when creating copy relocations.
This is slightly simpler to read IMHO. Now if a symbol has a position
in the file, it is Defined.

The main motivation is that with this a SharedSymbol doesn't need a
section, which reduces the size of SymbolUnion.

With this the peak allocation when linking chromium goes from 568.1 to
564.2 MB.

llvm-svn: 330966
2018-04-26 17:58:58 +00:00
Rafael Espindola f4a9d56a9a Delete GotPltIndex.
It was always an offset of PltIndex.

This doesn't reduce the size of the structures, but makes it easier to
do so in a followup patch.

llvm-svn: 330953
2018-04-26 16:09:30 +00:00
Rui Ueyama b774c3c0e5 Simplify. NFC.
llvm-svn: 330892
2018-04-26 01:38:29 +00:00
Rafael Espindola 1eeb26293d Pack symbols a bit more.
Before this patch:

Symbol 56
Defined 80
Undefined 56
SharedSymbol 88
LazyArchive 72
LazyObject 56

With this patch

Symbol 48
Defined 72
Undefined 48
SharedSymbol 80
LazyArchive 64
LazyObject 48

The result is that peak allocation when linking chromium (according to
heaptrack) goes from 578 to 568 MB.

llvm-svn: 330874
2018-04-25 21:44:37 +00:00
Rafael Espindola 047f857642 Also demote lazy symbols.
This is not a big simplification right now, but the special cases for
lazy symbols have been a common source of bugs in the past.

llvm-svn: 330869
2018-04-25 20:46:08 +00:00
Rafael Espindola f4d6e8caea Simplify Repl handling.
Now that we don't ICF synthetic sections, we can go back to the old
logic on whose responsibility it is to check Repl.

The idea is that Sec->something() will not check Repl. It is the
responsibility of the caller to find the correct Sec.

llvm-svn: 330346
2018-04-19 17:26:50 +00:00
Rafael Espindola aded409325 Simplify getOffset for synthetic sections.
We had a single symbol using -1 with a synthetic section. It is
simpler to just update its value.

This is not a big will by itself, but will allow having a simple
getOffset for InputSeciton.

llvm-svn: 330340
2018-04-19 16:54:30 +00:00
Michael J. Spencer b842725c1d [ELF] Add profile guided section layout
This adds profile guided layout using the Call-Chain Clustering (C³) heuristic
from https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf .

RFC: [llvm-dev] [RFC] Profile guided section layout
     http://lists.llvm.org/pipermail/llvm-dev/2017-June/114178.html

Pass `--call-graph-ordering-file <file>` to read a call graph profile where each
line has the format:

    <from symbol> <to symbol> <call count>

Differential Revision: https://reviews.llvm.org/D36351

llvm-svn: 330234
2018-04-17 23:30:05 +00:00
George Rimar 1ef746ba21 [ELF] - Eliminate Lazy class.
Patch removes Lazy class which
is just an excessive layer.

Differential revision: https://reviews.llvm.org/D45083

llvm-svn: 329086
2018-04-03 17:16:52 +00:00
Rui Ueyama 24a47201fd Merge LazyArchive::fetch() and ArchiveFile::getMember(). NFC.
They are to pull out an object file for a symbol, but for a historical
reason the code is written in two separate functions. This patch
merges them.

llvm-svn: 329039
2018-04-03 02:06:57 +00:00
Rui Ueyama 7d6131a898 Inline two trivial functions that are called only once. NFC.
llvm-svn: 329034
2018-04-02 23:58:50 +00:00
Rafael Espindola 4f058a2c6b Add a SectionBase::getVA helper. NFC.
There were a few too many places duplicating this.

llvm-svn: 328402
2018-03-24 00:35:11 +00:00
Rafael Espindola 4bb482eeac Move a Repl access.
Since SectionBase::getOutputSection handles ICF replaces and
SectionBase::getOffset was handling it in some cases, it is more
consistent to have getOffset always handle it.

llvm-svn: 328391
2018-03-23 23:55:49 +00:00
Rafael Espindola 4e82a9e2fa Drop redundant ->Repl.
SectionBase::getOutputSection handles replacement sections, so this
code doesn't have to.

llvm-svn: 328390
2018-03-23 23:53:01 +00:00