Commit Graph

136 Commits

Author SHA1 Message Date
Fangrui Song 977a1a523c [ELF] Symbol::replace: use the old nameData/nameSize. NFC
Currently `this->getName() == newSym.getName()`.
By keeping the old nameData/nameSize, newSym's nameData/nameSize will be
ignored. The call sites can avoid calling getName().

printTraceSymbol needs to take the symbol name since `other`'s name is empty.
2022-02-05 16:34:02 -08:00
Fangrui Song 53fc5d9b9a [ELF] Support R_PPC_NONE/R_PPC64_NONE in getImplicitAddend
Similar to f457863ae3
2022-02-04 15:13:37 -08:00
Alexandre Ganea 83d59e05b2 Re-land [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

The previous land f860fe3622 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac.

Differential Revision: https://reviews.llvm.org/D108850
2022-01-20 14:53:26 -05:00
Alexandre Ganea e6b153947d Revert [LLD] Remove global state in lldCommon
It seems to be causing issues on https://lab.llvm.org/buildbot/#/builders/123/builds/8383
2022-01-16 11:03:06 -05:00
Alexandre Ganea f860fe3622 [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

Differential Revision: https://reviews.llvm.org/D108850
2022-01-16 08:57:57 -05:00
Fangrui Song 5d3bd7f360 [ELF] Move gotIndex/pltIndex/globalDynIndex to SymbolAux
to decrease sizeof(SymbolUnion) by 8 on ELF64 platforms.

Symbols needing such information are typically 1% or fewer (5134 out of 560520
when linking clang, 19898 out of 5550705 when linking chrome). Storing them
elsewhere can decrease memory usage and symbol initialization time.
There is a ~0.8% saving on max RSS when linking a large program.

Future direction:

* Move some of dynsymIndex/verdefIndex/versionId to SymbolAux
* Support mixed TLSDESC and TLS GD without increasing sizeof(SymbolUnion)

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D116281
2022-01-09 13:43:27 -08:00
Fangrui Song d71bb6a409 [ELF] Inline isPPC64SmallCodeModelTocReloc which is only called once. NFC 2021-11-09 20:41:05 -08:00
Fangrui Song ecc93ed2d7 [ELF] Replace InputBaseSection::{areRelocsRela,firstRelocation,numRelocation} with relSecIdx
For `InputSection` `.foo`, its `InputBaseSection::{areRelocsRela,firstRelocation,numRelocation}` basically
encode the information of `.rel[a].foo`. However, one uint32_t (the relocation section index)
suffices. See the implementation of `relsOrRelas`.

This change decreases sizeof(InputSection) from 184 to 176 on 64-bit Linux.

The maximum resident set size linking a large application (1.2G output) decreases by 0.39%.

Differential Revision: https://reviews.llvm.org/D112513
2021-10-27 09:51:07 -07:00
Fangrui Song d23fd8ae89 [ELF] Replace noneRel = R_*_NONE with static constexpr. NFC
All architectures define R_*_NONE to 0.
2021-09-25 15:16:44 -07:00
Fangrui Song 40cd4db442 [ELF] Default gotBaseSymInGotPlt to false (NFC for most architectures)
Most architectures use .got instead of .got.plt, so switching the default can
minimize customization.

This fixes an issue for SPARC V9 which uses .got .
AVR, AMDGPU, and MSP430 don't seem to use _GLOBAL_OFFSET_TABLE_.
2021-09-25 15:06:09 -07:00
Stefan Pintilie f21704e080 [LLD][PowerPC] Fix bug in PC-Relative initial exec
There is a bug when initial exec is relaxed to local exec.
In the following situation:

InitExec.c
```
extern __thread unsigned TGlobal;
unsigned getConst(unsigned*);
unsigned addVal(unsigned, unsigned*);

unsigned GetAddrT() {
  return addVal(getConst(&TGlobal), &TGlobal);
}
```

Def.c
```
__thread unsigned TGlobal;

unsigned getConst(unsigned* A) {
  return *A + 3;
}

unsigned addVal(unsigned A, unsigned* B) {
  return A + *B;
}
```

The problem is in InitExec.c but Def.c is required if you want to link the example and see the problem.
To compile everything:
```
clang -O3 -mcpu=pwr10 -c InitExec.c
clang -O3 -mcpu=pwr10 -c Def.c
ld.lld InitExec.o Def.o -o IeToLe
```

If you objdump the problem object file:
```
$ llvm-objdump -dr --mcpu=pwr10 InitExec.o
```
you will get the following assembly:
```
0000000000000000 <GetAddrT>:
       0: a6 02 08 7c  	mflr 0
       4: f0 ff c1 fb  	std 30, -16(1)
       8: 10 00 01 f8  	std 0, 16(1)
       c: d1 ff 21 f8  	stdu 1, -48(1)
      10: 00 00 10 04 00 00 60 e4      	pld 3, 0(0), 1
		0000000000000010:  R_PPC64_GOT_TPREL_PCREL34	TGlobal
      18: 14 6a c3 7f  	add 30, 3, 13
		0000000000000019:  R_PPC64_TLS	TGlobal
      1c: 78 f3 c3 7f  	mr	3, 30
      20: 01 00 00 48  	bl 0x20
		0000000000000020:  R_PPC64_REL24_NOTOC	getConst
      24: 78 f3 c4 7f  	mr	4, 30
      28: 30 00 21 38  	addi 1, 1, 48
      2c: 10 00 01 e8  	ld 0, 16(1)
      30: f0 ff c1 eb  	ld 30, -16(1)
      34: a6 03 08 7c  	mtlr 0
      38: 00 00 00 48  	b 0x38
		0000000000000038:  R_PPC64_REL24_NOTOC	addVal
```
The lines of interest are:
```
      10: 00 00 10 04 00 00 60 e4      	pld 3, 0(0), 1
		0000000000000010:  R_PPC64_GOT_TPREL_PCREL34	TGlobal
      18: 14 6a c3 7f  	add 30, 3, 13
		0000000000000019:  R_PPC64_TLS	TGlobal
      1c: 78 f3 c3 7f  	mr	3, 30
```
Which once linked gets turned into:
```
10010210: ff ff 03 06 00 90 6d 38      	paddi 3, 13, -28672, 0
10010218: 00 00 00 60  	nop
1001021c: 78 f3 c3 7f  	mr	3, 30
```
The problem is that register 30 is never set after the optimization.

Therefore it is not correct to relax the above instructions by replacing
the add instruction with a nop.
Instead the add instruction should be replaced with a copy (mr) instruction.
If the add uses the same resgiter as input and as ouput then it is safe to
continue to replace the add with a nop.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D95262
2021-03-22 13:15:44 -05:00
Fangrui Song 5fcb412ed0 [ELF] Support R_PPC64_ADDR16_HIGH
R_PPC64_ADDR16_HI represents bits 16-31 of a 32-bit value
R_PPC64_ADDR16_HIGH represents bits 16-31 of a 64-bit value.

In the Linux kernel, `LOAD_REG_IMMEDIATE_SYM` defined in `arch/powerpc/include/asm/ppc_asm.h`
uses @l, @high, @higher, @highest to load the 64-bit value of a symbol.

Fixes https://github.com/ClangBuiltLinux/linux/issues/1260
2021-01-19 11:42:53 -08:00
Fangrui Song e12e0d66c0 [ELF] Error for out-of-range R_PPC64_ADDR16_HA, R_PPC64_ADDR16_HI and their friends
There are no tests for REL16_* and TPREL16_*.
2021-01-19 11:42:52 -08:00
Fangrui Song 22c1bd57bf [ELF] Rename R_TLS to R_TPREL and R_NEG_TLS to R_TPREL_NEG. NFC
The scope of R_TLS (TP offset relocation types (TPREL/TPOFF) used for the
local-exec TLS model) is actually narrower than its name may imply. R_TLS_NEG
is only used by Solaris R_386_TLS_LE_32.

Rename them so that they will be less confusing.

Reviewed By: grimar, psmith, rprichard

Differential Revision: https://reviews.llvm.org/D93467
2020-12-18 08:24:42 -08:00
Fangrui Song 50564ca075 [ELF] Rename adjustRelaxExpr to adjustTlsExpr and delete the unused `data` parameter. NFC
Reviewed By: psmith

Differential Revision: https://reviews.llvm.org/D91995
2020-11-25 09:00:55 -08:00
Fangrui Song 572d18397c [ELF] Add TargetInfo::adjustGotPcExpr for `R_GOT_PC` relaxations. NFC
With this change, `TargetInfo::adjustRelaxExpr` is only related to TLS
relaxations and a subsequent clean-up can delete the `data` parameter.

Differential Revision: https://reviews.llvm.org/D92079
2020-11-25 08:43:26 -08:00
Stefan Pintilie c6561ccfd9 [PowerPC][LLD] Support for PC Relative TLS for Local Dynamic
Add support to LLD for PC Relative Thread Local Storage for Local Dynamic.
This patch adds support for two relocations: R_PPC64_GOT_TLSLD_PCREL34 and
R_PPC64_DTPREL34.

The Local Dynamic code is:
```
pla r3, x@got@tlsld@pcrel        R_PPC64_GOT_TLSLD_PCREL34
bl __tls_get_addr@notoc(x@tlsld) R_PPC64_TLSLD
                                 R_PPC64_REL24_NOTOC
...
paddi r9, r3, x@dtprel           R_PPC64_DTPREL34
```

After relaxation to Local Exec:
```
paddi r3, r13, 0x1000
nop
...
paddi r9, r3, x@dtprel          R_PPC64_DTPREL34
```

Reviewed By: NeHuang, sfertile

Differential Revision: https://reviews.llvm.org/D87504
2020-10-23 08:23:56 -05:00
Fangrui Song 88f2fe5cad Raland D87318 [LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.

The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel            R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd)     R_PPC64_TLSGD
                                     R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for  R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.

Reviewed By: NeHuang, sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D87318
2020-10-01 12:36:33 -07:00
Stefan Pintilie 5f3e565f59 Revert "[LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic"
This reverts commit 79122868f9.
2020-10-01 13:28:35 -05:00
Stefan Pintilie 79122868f9 [LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.

The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel            R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd)     R_PPC64_TLSGD
                                     R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for  R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.

Reviewed By: NeHuang, sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D87318
2020-10-01 13:00:37 -05:00
Stefan Pintilie 8c53282d64 [PowerPC][NFC] Merged two switch entries.
Two switch entries did exactly the same thing. This patch merges them.
2020-09-25 09:49:13 -05:00
Stefan Pintilie c0071862bb [PowerPC] Add support for R_PPC64_GOT_TPREL_PCREL34 used in TLS Initial Exec
Add Thread Local Storage Initial Exec support to LLD.

This patch adds the computation for the relocations as well as the relaxation from Initial Exec to Local Exec.

Initial Exec:
```
pld r9, x@got@tprel@pcrel
add r9, r9, x@tls@pcrel
```
or
```
pld r9, x@got@tprel@pcrel
lbzx r10, r9, x@tls@pcrel
```
Note that @tls@pcrel is actually encoded as R_PPC64_TLS with a one byte displacement.

For the above examples relaxing Intitial Exec to Local Exec:
```
paddi r9, r9, x@tprel
nop
```
or
```
paddi r9, r13, x@tprel
lbz r10, 0(r9)
```

Reviewed By: nemanjai, MaskRay, #powerpc

Differential Revision: https://reviews.llvm.org/D86893
2020-09-22 05:48:43 -05:00
Stefan Pintilie 65f6810d3a [LLD][PowerPC] Add support for R_PPC64_TPREL34 used in TLS Local Exec
Add Thread Local Storage Local Exec support to LLD. This is to support PC Relative addressing of Local Exec.
The patch teaches LLD to handle:
```
paddi r9, r13, x1@tprel
```
The relocation is:
```
R_PPC_TPREL34
```

Reviewed By: NeHuang, MaskRay

Differential Revision: https://reviews.llvm.org/D86608
2020-09-15 09:06:19 -05:00
Georgii Rymar 4845531fa8 [lib/Object] - Refine interface of ELFFile<ELFT>. NFCI.
`ELFFile<ELFT>` has many methods that take pointers,
though they assume that arguments are never null and
hence could take references instead.

This patch performs such clean-up.

Differential revision: https://reviews.llvm.org/D87385
2020-09-15 11:38:31 +03:00
Fangrui Song 560188ddcc [ELF][PowerPC] Define NOP as 0x60000000 to tidy up code. NFC
Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D87483
2020-09-11 09:20:24 -07:00
Fangrui Song 485f3f35cc [ELF] Make two PPC64.cpp variables constexpr. NFC
Why are they mutable? :)
2020-09-10 14:31:10 -07:00
Nemanja Ivanovic cddb0dbcef [LLD][PowerPC] Implement GOT to PC-Rel relaxation
This patch implements the handling for the R_PPC64_PCREL_OPT relocation as well
as the GOT relocation for the associated R_PPC64_GOT_PCREL34 relocation.

On Power10 targets with PC-Relative addressing, the linker can relax
GOT-relative accesses to PC-Relative under some conditions. Since the sequence
consists of a prefixed load, followed by a non-prefixed access (load or store),
the linker needs to replace the first instruction (as the replacement
instruction will be prefixed). The compiler communicates to the linker that
this optimization is safe by placing the two aforementioned relocations on the
GOT load (of the address).
The linker then does two things:

- Convert the load from the got into a PC-Relative add to compute the address
  relative to the PC
- Find the instruction referred to by the second relocation (R_PPC64_PCREL_OPT)
  and replace the first with the PC-Relative version of it

It is important to synchronize the mapping from legacy memory instructions to
their PC-Relative form. Hence, this patch adds a file to be included by both
the compiler and the linker so they're always in agreement.

Differential revision: https://reviews.llvm.org/D84360
2020-08-17 09:36:09 -05:00
Victor Huang 8dbea4785c [PowerPC] Support for R_PPC64_REL24_NOTOC calls where the caller has no TOC and the callee is not DSO local
This patch supports the situation where caller does not have a valid TOC and
calls using the R_PPC64_REL24_NOTOC relocation and the callee is not DSO local.
In this case the call cannot be made directly since the callee may or may not
require a valid TOC pointer. As a result this situation require a PC-relative
plt stub to set up r12.

Reviewed By: sfertile, MaskRay, stefanp

Differential Revision: https://reviews.llvm.org/D83669
2020-07-29 19:49:28 +00:00
Victor Huang 91cce1a2bc [PowerPC] Implement R_PPC64_REL24_NOTOC local calls, callee requires a TOC
The PC Relative code now allows for calls that are marked with the relocation
R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC
pointer in R2 and does not require R2 to be restored after the call.

This patch is added to support local calls to callees that require a TOC

Reviewed By: sfertile, MaskRay, nemanjai, stefanp

Differential Revision: https://reviews.llvm.org/D83504
2020-07-20 17:46:49 +00:00
Victor Huang 118366dcb6 [PowerPC] Implement R_PPC64_REL24_NOTOC calls, callee also has no TOC
The PC Relative code allows for calls that are marked with the relocation
R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC
pointer in R2 and does not require R2 to be restored after the call.

This patch is added to support local calls to callees tha also do not have a TOC.

Reviewed By: sfertile, MaskRay, stefanp

Differential Revision: https://reviews.llvm.org/D82816
2020-07-10 07:23:32 -05:00
Stefan Pintilie beb52b12cb [PowerPC] Support PCRelative Callees for R_PPC64_REL24 Relocation
The R_PPC64_REL24 is used in function calls when the caller requires a
valid TOC pointer. If the callee shares the same TOC or does not clobber
the TOC pointer then a direct call can be made. If the callee does not
share the TOC a thunk must be added to save the TOC pointer for the caller.

Up until PC Relative was introduced all local calls on medium and large code
models were assumed to share a TOC. This is no longer the case because
if the caller requires a TOC and the callee is PC Relative then the callee
can clobber the TOC even if it is in the same DSO.

This patch is to add support for a TOC caller calling a PC Relative callee that
clobbers the TOC.

Reviewed By: sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D82950
2020-07-09 09:50:19 -05:00
Stefan Pintilie 8131ef5d63 [LLD][PowerPC] Add support for R_PPC64_GOT_PCREL34
Add support for the 34bit relocation R_PPC64_GOT_PCREL34 for
PC Relative in LLD.

Reviewers: sfertile, MaskRay

Differential Revision: https://reviews.llvm.org/D81948
2020-06-24 07:40:35 -05:00
Stefan Pintilie 3a55a2a97f [LLD][PowerPC] Add support for R_PPC64_PCREL34
Add support for the 34bit relocation R_PPC64_PCREL34 for PC Relative in LLD.
2020-06-23 14:59:19 -05:00
Fangrui Song bae7cf6746 [ELF][PPC64] Synthesize _savegpr[01]_{14..31} and _restgpr[01]_{14..31}
In the 64-bit ELF V2 API Specification: Power Architecture, 2.3.3.1. GPR
Save and Restore Functions defines some special functions which may be
referenced by GCC produced assembly (LLVM does not reference them).

With GCC -Os, when the number of call-saved registers exceeds a certain
threshold, GCC generates `_savegpr0_* _restgpr0_*` calls and expects the
linker to define them. See
https://sourceware.org/pipermail/binutils/2002-February/017444.html and
https://sourceware.org/pipermail/binutils/2004-August/036765.html . This
is weird because libgcc.a would be the natural place. However, the linker
generation approach has the advantage that the linker can generate
multiple copies to avoid long branch thunks. We don't consider the
advantage significant enough to complicate our trunk implementation, so
we take a simple approach.

* Check whether `_savegpr0_{14..31}` are used
* If yes, define needed symbols and add an InputSection with the code sequence.

`_savegpr1_*` `_restgpr0_*` and `_restgpr1_*` are similar.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D79977
2020-05-26 09:35:41 -07:00
Fangrui Song 07837b8f49 [ELF] Use namespace qualifiers (lld:: or elf::) instead of `namespace lld { namespace elf {`
Similar to D74882. This reverts much code from commit
bd8cfe65f5 (D68323) and fixes some
problems before D68323.

Sorry for the churn but D68323 was a mistake. Namespace qualifiers avoid
bugs where the definition does not match the declaration from the
header. See
https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions (D74515)

Differential Revision: https://reviews.llvm.org/D79982
2020-05-15 08:49:53 -07:00
Kazuaki Ishizaki 7c5fcb3591 [lld] NFC: fix trivial typos in comments
Differential Revision: https://reviews.llvm.org/D72339
2020-04-02 01:21:36 +09:00
Fangrui Song deb5819d62 [ELF] Rename relocateOne() to relocate() and pass `Relocation` to it
Symbol information can be used to improve out-of-range/misalignment diagnostics.
It also helps R_ARM_CALL/R_ARM_THM_CALL which has different behaviors with different symbol types.

There are many (67) relocateOne() call sites used in thunks, {Arm,AArch64}errata, PLT, etc.
Rename them to `relocateNoSym()` to be clearer that there is no symbol information.

Reviewed By: grimar, peter.smith

Differential Revision: https://reviews.llvm.org/D73254
2020-01-25 12:00:18 -08:00
Fangrui Song f1dab29908 [ELF][PowerPC] Support R_PPC_COPY and R_PPC64_COPY
Reviewed By: Bdragon28, jhenderson, grimar, sfertile

Differential Revision: https://reviews.llvm.org/D73255
2020-01-24 09:06:20 -08:00
Fangrui Song 1e57038bf2 [ELF] Pass `Relocation` to relaxGot and relaxTls{GdToIe,GdToLe,LdToLe,IeToLe}
These functions call relocateOne(). This patch is a prerequisite for
making relocateOne() aware of `Symbol` (D73254).

Reviewed By: grimar, nickdesaulniers

Differential Revision: https://reviews.llvm.org/D73250
2020-01-23 10:39:25 -08:00
Fangrui Song 45acc35ac2 [ELF][PPC64] Implement IPLT code sequence for non-preemptible IFUNC
Non-preemptible IFUNC are placed in in.iplt (.glink on EM_PPC64).  If
there is a non-GOT non-PLT relocation, for pointer equality, we change
the type of the symbol from STT_IFUNC and STT_FUNC and bind it to the
.glink entry.

On EM_386, EM_X86_64, EM_ARM, and EM_AARCH64, the PLT code sequence
loads the address from its associated .got.plt slot. An IPLT also has an
associated .got.plt slot and can use the same code sequence.

On EM_PPC64, the PLT code sequence is actually a bl instruction in
.glink .  It jumps to `__glink_PLTresolve` (the PLT header). and
`__glink_PLTresolve` computes the .plt slot (relocated by
R_PPC64_JUMP_SLOT).

An IPLT does not have an associated R_PPC64_JUMP_SLOT, so we cannot use
`bl` in .iplt . Instead, create a call stub which has a similar code
sequence as PPC64PltCallStub. We don't save the TOC pointer, so such
scenarios will not work: a function pointer to a non-preemptible ifunc,
which resolves to a function defined in another DSO. This is the
restriction described by https://sourceware.org/glibc/wiki/GNU_IFUNC
(though on many architectures it works in practice):

  Requirement (a): Resolver must be defined in the same translation unit as the implementations.

If an ifunc is taken address but not called, technically we don't need
an entry for it, but we currently do that.

This patch makes

  // clang -fuse-ld=lld -fno-pie -no-pie a.c
  // clang -fuse-ld=lld -fPIE -pie a.c
  #include <stdio.h>
  static void impl(void) { puts("meow"); }
  void thefunc(void) __attribute__((ifunc("resolver")));
  void *resolver(void) { return &impl; }
  int main(void) {
    thefunc();
    void (*theptr)(void) = &thefunc;
    theptr();
  }

work on Linux glibc and FreeBSD. Calling a function pointer pointing to
a Non-preemptible IFUNC never worked before.

Differential Revision: https://reviews.llvm.org/D71509
2019-12-29 22:40:03 -08:00
Fangrui Song 37b2808059 [ELF] writePlt, writeIplt: replace parameters gotPltEntryAddr and index with `const Symbol &`. NFC
PPC::writeIplt (IPLT code sequence, D71621) needs to access `Symbol`.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71631
2019-12-18 00:14:03 -08:00
Fangrui Song 891a8655ab [ELF] Add IpltSection
PltSection is used by both PLT and IPLT. The PLT section may have a
header while the IPLT section does not. Split off IpltSection from
PltSection to be clearer.

Unlike other targets, PPC64 cannot use the same code sequence for PLT
and IPLT. This helps make a future PPC64 patch (D71509) more isolated.

On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently
we are inconsistent whether the PLT header is conceptually attached to
in.plt or in.iplt .  Consistently attach the header to in.plt can make
the -z retpolineplt logic simpler. It also makes `jmp` point to an
aesthetically better place for non-retpolineplt cases.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71519
2019-12-17 00:06:04 -08:00
Fangrui Song 90d195d026 [ELF] Delete relOff from TargetInfo::writePLT
This change only affects EM_386. relOff can be computed from `index`
easily, so it is unnecessarily passed as a parameter.

Both in.plt and in.iplt entries are written by writePLT. For in.iplt,
the instruction `push reloc_offset` will change because `index` is now
different. Fortunately, this does not matter because `push; jmp` is only
used by PLT. IPLT does not need the code sequence.

Reviewed By: grimar, ruiu

Differential Revision: https://reviews.llvm.org/D71518
2019-12-16 11:10:02 -08:00
Fangrui Song c8f0d3e130 [ELF][PPC64] Support long branch thunks with addends
Fixes PPC64 part of PR40438

  // clang -target ppc64le -c a.cc
  // .text.unlikely may be placed in a separate output section (via -z keep-text-section-prefix)
  // The distance between bar in .text.unlikely and foo in .text may be larger than 32MiB.
  static void foo() {}
  __attribute__((section(".text.unlikely"))) static int bar() { foo(); return 0; }
  __attribute__((used)) static int dummy = bar();

This patch makes such thunks with addends work for PPC64.

AArch64: .text -> `__AArch64ADRPThunk_ (adrp x16, ...; add x16, x16, ...; br x16)` -> target
PPC64: .text -> `__long_branch_ (addis 12, 2, ...; ld 12, ...(12); mtctr 12; bctr)` -> target

AArch64 can leverage ADRP to jump to the target directly, but PPC64
needs to load an address from .branch_lt . Before Power ISA v3.0, the
PC-relative ADDPCIS was not available. .branch_lt was invented to work
around the limitation.

Symbol::ppc64BranchltIndex is replaced by
PPC64LongBranchTargetSection::entry_index which take addends into
consideration.

The tests are rewritten: ppc64-long-branch.s tests -no-pie and
ppc64-long-branch-pi.s tests -pie and -shared.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D70937
2019-12-05 10:17:45 -08:00
Fangrui Song bf535ac4a2 [ELF][AArch64] Support R_AARCH64_{CALL26,JUMP26} range extension thunks with addends
Fixes AArch64 part of PR40438

The current range extension thunk framework does not handle a relocation
relative to a STT_SECTION symbol with a non-zero addend, which may be
used by jumps/calls to local functions on some RELA targets (AArch64,
powerpc ELFv1, powerpc64 ELFv2, etc).  See PR40438 and the following
code for examples:

  // clang -target $target a.cc
  // .text.cold may be placed in a separate output section.
  // The distance between bar in .text.cold and foo in .text may be larger than 128MiB.
  static void foo() {}
  __attribute__((section(".text.cold"))) static int bar() { foo(); return
  0; }
  __attribute__((used)) static int dummy = bar();

This patch makes such thunks with addends work for AArch64. The target
independent part can be reused by PPC in the future.

On REL targets (ARM, MIPS), jumps/calls are not represented as
STT_SECTION + non-zero addend (see
MCELFObjectTargetWriter::needsRelocateWithSymbol), so they don't need
this feature, but we need to make sure this patch does not affect them.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D70637
2019-12-02 10:07:24 -08:00
Nico Weber 5976a3f5aa Fix a few typos in lld/ELF to cycle bots 2019-10-28 21:41:47 -04:00
Fangrui Song bd8cfe65f5 [ELF] Wrap things in `namespace lld { namespace elf {`, NFC
This makes it clear `ELF/**/*.cpp` files define things in the `lld::elf`
namespace and simplifies `elf::foo` to `foo`.

Reviewed By: atanasyan, grimar, ruiu

Differential Revision: https://reviews.llvm.org/D68323

llvm-svn: 373885
2019-10-07 08:31:18 +00:00
Fangrui Song d5d79dfd56 [ELF][PPC] Fix getRelExpr for R_PPC64_REL16_HI
Fixes https://github.com/ClangBuiltLinux/linux/issues/640

R_PPC64_REL16_HI was incorrectly computed as an R_ABS relocation.
rLLD368964 made it a linker failure. Change it to use R_PC to fix the
failures.

Add ppc64-reloc-rel.s for these R_PPC64_REL* tests.

llvm-svn: 369184
2019-08-17 06:28:03 +00:00
Fangrui Song 1542ff5282 [ELF][PPC] Improve error message for unknown relocations
Like rLLD354040.

Previously, for unrecognized relocation types, in -no-pie mode:

  foo.o: unrecognized reloc 256

In -pie/-shared mode:

  error: can't create dynamic relocation R_PPC_xxx against symbol: yyy in readonly segment

llvm-svn: 368964
2019-08-15 05:22:23 +00:00
Fangrui Song dc06b0bc9a [ELF] Don't special case symbolic relocations with 0 addend to ifunc in writable locations
Currently the following 3 relocation types do not trigger the creation
of a canonical PLT (which changes STT_GNU_IFUNC to STT_FUNC and
redirects all references):

1) GOT-generating (`needsGot`)
2) PLT-generating (`needsPlt`)
3) R_ABS with 0 addend in a writable location. This is used for
  for ifunc function pointers in writable sections such as .data and .toc.

This patch deletes case 3) to simplify the R_*_IRELATIVE generating
logic added in D57371. Other advantages:

* It is guaranteed no more than 1 R_*_IRELATIVE is created for an ifunc.
* PPC64: no need to special case ifunc in toc-indirect to toc-relative relaxation. See D65755

The deleted elf::addIRelativeRelocs demonstrates that one-pass scan
through relocations makes several optimizations difficult. This is
something we can think about in the future.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D65995

llvm-svn: 368661
2019-08-13 09:43:40 +00:00