With this patch lld recognizes ARM SBREL relocations.
R_ARM*_MOVW_BREL relocations are not tested because they are not used.
Patch by Tamas Petz
Differential Revision: https://reviews.llvm.org/D74604
Summary:
Generate PAC protected plt only when "-z pac-plt" is passed to the
linker. GNU toolchain generates when it is explicitly requested[1].
When pac-plt is requested then set the GNU_PROPERTY_AARCH64_FEATURE_1_PAC
note even when not all function compiled with PAC but issue a warning.
Harmonizing the warning style for BTI/PAC/IBT.
Generate BTI protected PLT if case of "-z force-bti".
[1] https://www.sourceware.org/ml/binutils/2019-03/msg00021.html
Reviewers: peter.smith, espindola, MaskRay, grimar
Reviewed By: peter.smith, MaskRay
Subscribers: tatyana-krasnukha, emaste, arichardson, kristof.beyls, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74537
Recommit of 0b4a047bfb
(reverted in c29003813a) to incorporate
subsequent fix and add a warning when LLD's interworking behavior has
changed.
D73474 disabled the generation of interworking thunks for branch
relocations to non STT_FUNC symbols. This patch handles the case of BL and
BLX instructions to non STT_FUNC symbols. LLD would normally look at the
state of the caller and the callee and write a BL if the states are the
same and a BLX if the states are different.
This patch disables BL/BLX substitution when the destination symbol does
not have type STT_FUNC. This brings our behavior in line with GNU ld which
may prevent difficult to diagnose runtime errors when switching to lld.
This change does change how LLD handles interworking of symbols that do not
have type STT_FUNC from previous versions including the 10.0 release. This
brings LLD in line with ld.bfd but there may be programs that have not been
linked with ld.bfd that depend on LLD's previous behavior. We emit a warning
when the behavior changes.
A summary of the difference between 10.0 and 11.0 is that for symbols
that do not have a type of STT_FUNC LLD will not change a BL to a BLX or
vice versa. The table below enumerates the changes
| relocation | STT_FUNC | bit(0) | in | 10.0- out | 11.0+ out |
| R_ARM_CALL | no | 1 | BL | BLX | BL |
| R_ARM_CALL | no | 0 | BLX | BL | BLX |
| R_ARM_THM_CALL | no | 1 | BLX | BL | BLX |
| R_ARM_THM_CALL | no | 0 | BL | BLX | BL |
Differential Revision: https://reviews.llvm.org/D73542
There are still problems after the fix in
"[ELF][ARM] Fix regression of BL->BLX substitution after D73542"
so let's revert to get trunk back to green while we investigate.
See https://reviews.llvm.org/D73542
This reverts commit 5461fa2b1f.
This reverts commit 0b4a047bfb.
D73542 made a typo (`rel.type == R_PLT_PC`; should be `rel.expr`) and introduced a regression:
BL->BLX substitution was disabled when the target symbol is preemptible
(expr is R_PLT_PC).
The two added bl instructions in arm-thumb-interwork-shared.s check that
we patch BL to BLX.
Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1047531
D73474 disabled the generation of interworking thunks for branch
relocations to non STT_FUNC symbols. This patch handles the case of BL and
BLX instructions to non STT_FUNC symbols. LLD would normally look at the
state of the caller and the callee and write a BL if the states are the
same and a BLX if the states are different.
This patch disables BL/BLX substitution when the destination symbol does
not have type STT_FUNC. This brings our behavior in line with GNU ld which
may prevent difficult to diagnose runtime errors when switching to lld.
Differential Revision: https://reviews.llvm.org/D73542
ELF for the ARM architecture requires linkers to provide
interworking for symbols that are of type STT_FUNC. Interworking for
other symbols must be encoded directly in the object file. LLD was always
providing interworking, regardless of the symbol type, this breaks some
programs that have branches from Thumb state targeting STT_NOTYPE symbols
that have bit 0 clear, but they are in fact internal labels in a Thumb
function. LLD treats these symbols as ARM and inserts a transition to Arm.
This fixes the problem for in range branches, R_ARM_JUMP24,
R_ARM_THM_JUMP24 and R_ARM_THM_JUMP19. This is expected to be the vast
majority of problem cases as branching to an internal label close to the
function.
There is at least one follow up patch required.
- R_ARM_CALL and R_ARM_THM_CALL may do interworking via BL/BLX
substitution.
In theory range-extension thunks can be altered to not change state when
the symbol type is not STT_FUNC. I will need to check with ld.bfd to see if
this is the case in practice.
Fixes (part of) https://github.com/ClangBuiltLinux/linux/issues/773
Differential Revision: https://reviews.llvm.org/D73474
* Generalize the code added in D70637 and D70937. We should eventually remove the EM_MIPS special case.
* Handle R_PPC_LOCAL24PC the same way as R_PPC_REL24.
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D73424
-fno-pie produces a pair of non-GOT-non-PLT relocations R_PPC_ADDR16_{HA,LO} (R_ABS) referencing external
functions.
```
lis 3, func@ha
la 3, func@l(3)
```
In a -no-pie/-pie link, if func is not defined in the executable, a canonical PLT entry (st_value>0, st_shndx=0) will be needed.
References to func in shared objects will be resolved to this address.
-fno-pie -pie should fail with "can't create dynamic relocation ... against ...", so we just need to think about -no-pie.
On x86, the PLT entry passes the JMP_SLOT offset to the rtld PLT resolver.
On x86-64: the PLT entry passes the JUMP_SLOT index to the rtld PLT resolver.
On ARM/AArch64: the PLT entry passes &.got.plt[n]. The PLT header passes &.got.plt[fixed-index]. The rtld PLT resolver can compute the JUMP_SLOT index from the two addresses.
For these targets, the canonical PLT entry can just reuse the regular PLT entry (in PltSection).
On PPC32: PltSection (.glink) consists of `b PLTresolve` instructions and `PLTresolve`. The rtld PLT resolver depends on r11 having been set up to the .plt (GotPltSection) entry.
On PPC64 ELFv2: PltSection (.glink) consists of `__glink_PLTresolve` and `bl __glink_PLTresolve`. The rtld PLT resolver depends on r12 having been set up to the .plt (GotPltSection) entry.
We cannot reuse a `b PLTresolve`/`bl __glink_PLTresolve` in PltSection as a canonical PLT entry. PPC64 ELFv2 avoids the problem by using TOC for any external reference, even in non-pic code, so the canonical PLT entry scenario should not happen in the first place.
For PPC32, we have to create a PLT call stub as the canonical PLT entry. The code sequence sets up r11.
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D73399
Symbol information can be used to improve out-of-range/misalignment diagnostics.
It also helps R_ARM_CALL/R_ARM_THM_CALL which has different behaviors with different symbol types.
There are many (67) relocateOne() call sites used in thunks, {Arm,AArch64}errata, PLT, etc.
Rename them to `relocateNoSym()` to be clearer that there is no symbol information.
Reviewed By: grimar, peter.smith
Differential Revision: https://reviews.llvm.org/D73254
These functions call relocateOne(). This patch is a prerequisite for
making relocateOne() aware of `Symbol` (D73254).
Reviewed By: grimar, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D73250
Summary:
Unlike R_RISCV_RELAX, which is a linker hint, R_RISCV_ALIGN requires the
support of the linker even when ignoring all R_RISCV_RELAX relocations.
This is because the compiler emits as many NOPs as may be required for
the requested alignment, more than may be required pre-relaxation, to
allow for the target becoming more unaligned after relaxing earlier
sequences. This means that the target is often not initially aligned in
the object files, and so the R_RISCV_ALIGN relocations cannot just be
ignored. Since we do not support linker relaxation, we must turn these
into errors.
Reviewers: ruiu, MaskRay, espindola
Reviewed By: MaskRay
Subscribers: grimar, Jim, emaste, arichardson, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71820
R_HINT is ignored like R_NONE. There are no strong reasons to keep
R_HINT. The largest RelExpr member R_RISCV_PC_INDIRECT is 60 now.
Differential Revision: https://reviews.llvm.org/D71822
This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang.
It adds Intel CET (Control-flow Enforcement Technology) support to lld.
The implementation follows the draft version of psABI which you can
download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI.
CET introduces a new restriction on indirect jump instructions so that
you can limit the places to which you can jump to using indirect jumps.
In order to use the feature, you need to compile source files with
-fcf-protection=full.
* IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt.
* SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified.
IBT-enabled executables/shared objects have two PLT sections, ".plt" and
".plt.sec". For the details as to why we have two sections, please read
the comments.
Reviewed By: xiangzhangllvm
Differential Revision: https://reviews.llvm.org/D59780
In AArch64 a branch to an undefined weak symbol that does not have a PLT
entry should resolve to the next instruction. The thunk generation code
can prevent this from happening as a range extension thunk can be generated
if the branch is sufficiently far away from 0, the value of an undefined
weak symbol.
The fix is taken from the Arm implementation of needsThunk(), we prevent a
thunk from being generated to an undefined weak symbol.
fixes pr44451
Differential Revision: https://reviews.llvm.org/D72267
Similar to D71509 (EM_PPC64), on EM_PPC, the IPLT code sequence should
be similar to a PLT call stub. Unlike EM_PPC64, EM_PPC -msecure-plt has
small/large PIC model differences.
* -fpic/-fpie: R_PPC_PLTREL24 r_addend=0. The call stub loads an address relative to `_GLOBAL_OFFSET_TABLE_`.
* -fPIC/-fPIE: R_PPC_PLTREL24 r_addend=0x8000. (A partial linked object
file may have an addend larger than 0x8000.) The call stub loads an address relative to .got2+0x8000.
Just assume large PIC model for now. This patch makes:
// clang -fuse-ld=lld -msecure-plt -fno-pie -no-pie a.c
// clang -fuse-ld=lld -msecure-plt -fPIE -pie a.c
#include <stdio.h>
static void impl(void) { puts("meow"); }
void thefunc(void) __attribute__((ifunc("resolver")));
void *resolver(void) { return &impl; }
int main(void) {
thefunc();
void (*theptr)(void) = &thefunc;
theptr();
}
work on Linux glibc. -fpie will crash because the compiler and the
linker do not agree on the value which r30 stores (_GLOBAL_OFFSET_TABLE_
vs .got2+0x8000).
Differential Revision: https://reviews.llvm.org/D71621
Non-preemptible IFUNC are placed in in.iplt (.glink on EM_PPC64). If
there is a non-GOT non-PLT relocation, for pointer equality, we change
the type of the symbol from STT_IFUNC and STT_FUNC and bind it to the
.glink entry.
On EM_386, EM_X86_64, EM_ARM, and EM_AARCH64, the PLT code sequence
loads the address from its associated .got.plt slot. An IPLT also has an
associated .got.plt slot and can use the same code sequence.
On EM_PPC64, the PLT code sequence is actually a bl instruction in
.glink . It jumps to `__glink_PLTresolve` (the PLT header). and
`__glink_PLTresolve` computes the .plt slot (relocated by
R_PPC64_JUMP_SLOT).
An IPLT does not have an associated R_PPC64_JUMP_SLOT, so we cannot use
`bl` in .iplt . Instead, create a call stub which has a similar code
sequence as PPC64PltCallStub. We don't save the TOC pointer, so such
scenarios will not work: a function pointer to a non-preemptible ifunc,
which resolves to a function defined in another DSO. This is the
restriction described by https://sourceware.org/glibc/wiki/GNU_IFUNC
(though on many architectures it works in practice):
Requirement (a): Resolver must be defined in the same translation unit as the implementations.
If an ifunc is taken address but not called, technically we don't need
an entry for it, but we currently do that.
This patch makes
// clang -fuse-ld=lld -fno-pie -no-pie a.c
// clang -fuse-ld=lld -fPIE -pie a.c
#include <stdio.h>
static void impl(void) { puts("meow"); }
void thefunc(void) __attribute__((ifunc("resolver")));
void *resolver(void) { return &impl; }
int main(void) {
thefunc();
void (*theptr)(void) = &thefunc;
theptr();
}
work on Linux glibc and FreeBSD. Calling a function pointer pointing to
a Non-preemptible IFUNC never worked before.
Differential Revision: https://reviews.llvm.org/D71509
Summary:
If none of the input files are ELF object files (for example, when
generating an object file from a single binary input file via
"-b binary"), use a fallback value for the ELF header flags instead
of crashing with an assertion failure.
Reviewers: MaskRay, ruiu, espindola
Reviewed By: MaskRay, ruiu
Subscribers: kevans, grimar, emaste, arichardson, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits, jrtc27
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71101
PltSection is used by both PLT and IPLT. The PLT section may have a
header while the IPLT section does not. Split off IpltSection from
PltSection to be clearer.
Unlike other targets, PPC64 cannot use the same code sequence for PLT
and IPLT. This helps make a future PPC64 patch (D71509) more isolated.
On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently
we are inconsistent whether the PLT header is conceptually attached to
in.plt or in.iplt . Consistently attach the header to in.plt can make
the -z retpolineplt logic simpler. It also makes `jmp` point to an
aesthetically better place for non-retpolineplt cases.
Reviewed By: grimar, ruiu
Differential Revision: https://reviews.llvm.org/D71519
This change only affects EM_386. relOff can be computed from `index`
easily, so it is unnecessarily passed as a parameter.
Both in.plt and in.iplt entries are written by writePLT. For in.iplt,
the instruction `push reloc_offset` will change because `index` is now
different. Fortunately, this does not matter because `push; jmp` is only
used by PLT. IPLT does not need the code sequence.
Reviewed By: grimar, ruiu
Differential Revision: https://reviews.llvm.org/D71518
Fixes PPC64 part of PR40438
// clang -target ppc64le -c a.cc
// .text.unlikely may be placed in a separate output section (via -z keep-text-section-prefix)
// The distance between bar in .text.unlikely and foo in .text may be larger than 32MiB.
static void foo() {}
__attribute__((section(".text.unlikely"))) static int bar() { foo(); return 0; }
__attribute__((used)) static int dummy = bar();
This patch makes such thunks with addends work for PPC64.
AArch64: .text -> `__AArch64ADRPThunk_ (adrp x16, ...; add x16, x16, ...; br x16)` -> target
PPC64: .text -> `__long_branch_ (addis 12, 2, ...; ld 12, ...(12); mtctr 12; bctr)` -> target
AArch64 can leverage ADRP to jump to the target directly, but PPC64
needs to load an address from .branch_lt . Before Power ISA v3.0, the
PC-relative ADDPCIS was not available. .branch_lt was invented to work
around the limitation.
Symbol::ppc64BranchltIndex is replaced by
PPC64LongBranchTargetSection::entry_index which take addends into
consideration.
The tests are rewritten: ppc64-long-branch.s tests -no-pie and
ppc64-long-branch-pi.s tests -pie and -shared.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D70937
Fixes AArch64 part of PR40438
The current range extension thunk framework does not handle a relocation
relative to a STT_SECTION symbol with a non-zero addend, which may be
used by jumps/calls to local functions on some RELA targets (AArch64,
powerpc ELFv1, powerpc64 ELFv2, etc). See PR40438 and the following
code for examples:
// clang -target $target a.cc
// .text.cold may be placed in a separate output section.
// The distance between bar in .text.cold and foo in .text may be larger than 128MiB.
static void foo() {}
__attribute__((section(".text.cold"))) static int bar() { foo(); return
0; }
__attribute__((used)) static int dummy = bar();
This patch makes such thunks with addends work for AArch64. The target
independent part can be reused by PPC in the future.
On REL targets (ARM, MIPS), jumps/calls are not represented as
STT_SECTION + non-zero addend (see
MCELFObjectTargetWriter::needsRelocateWithSymbol), so they don't need
this feature, but we need to make sure this patch does not affect them.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D70637
Summary:
Current versions of clang would erroneously emit this relocation not only
against functions (loaded from the GOT) but also against data symbols
(e.g. a table of function pointers). LLD was then changing this into a
branch-and-link instruction, causing the program to jump to the data
symbol at run time. I discovered this problem when attempting to boot
MIPS64 FreeBSD after updating the to the latest upstream master.
Reviewers: atanasyan, jrtc27, espindola
Reviewed By: atanasyan
Subscribers: emaste, sdardis, krytarowski, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70406
This makes it clear `ELF/**/*.cpp` files define things in the `lld::elf`
namespace and simplifies `elf::foo` to `foo`.
Reviewed By: atanasyan, grimar, ruiu
Differential Revision: https://reviews.llvm.org/D68323
llvm-svn: 373885
This allows us to delete `using namespace llvm::support::endian` and
simplify D68323. This change adds runtime config->endianness check but
the overhead should be negligible.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D68561
llvm-svn: 373884
The R_MIPS_JALR relocation denotes jalr/jr instructions in position
independent code. Both these instructions take a target's address from
the $25 register. If offset to the target symbol fits into the 18-bits,
it's more efficient to replace jalr/jr by bal/b instructions.
Differential Revision: https://reviews.llvm.org/D68057
llvm-svn: 372951
In case of linking binary blobs which do not have any ELF headers, we can
deduce MIPS ABI ELF header flags from an `emulation` option.
Patch by Kyle Evans.
llvm-svn: 372513
Like rLLD354040
Previously, for unknown relocation types, in -no-pie/-pie mode, we got something like:
foo.o: unrecognized relocation ...
In -shared mode:
error: can't create dynamic relocation ... against symbol: yyy in readonly segment
Delete the default case from Hexagon::getRelExpr and add the error there. We will get consistent error message like `error: unknown relocation (1024) against symbol foo`
Reviewed By: sidneym
Differential Revision: https://reviews.llvm.org/D66275
llvm-svn: 369260
Fixes https://github.com/ClangBuiltLinux/linux/issues/640
R_PPC64_REL16_HI was incorrectly computed as an R_ABS relocation.
rLLD368964 made it a linker failure. Change it to use R_PC to fix the
failures.
Add ppc64-reloc-rel.s for these R_PPC64_REL* tests.
llvm-svn: 369184
R_GOTPLT is relative to .got.plt since D59594. Since R_HEXAGON_GOT
relocations always have 0 r_addend, they can use R_GOTPLT instead.
Reviewed By: sidneym
Differential Revision: https://reviews.llvm.org/D66274
llvm-svn: 369128
Like rLLD354040.
Previously, for unrecognized relocation types, in -no-pie/-pie mode, we got something like:
foo.o: unrecognized relocation ...
In -shared mode:
error: can't create dynamic relocation ... against symbol: yyy in readonly segment
Delete the default case from AArch64::getRelExpr and add the error there.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D66277
llvm-svn: 368983
Currently the following 3 relocation types do not trigger the creation
of a canonical PLT (which changes STT_GNU_IFUNC to STT_FUNC and
redirects all references):
1) GOT-generating (`needsGot`)
2) PLT-generating (`needsPlt`)
3) R_ABS with 0 addend in a writable location. This is used for
for ifunc function pointers in writable sections such as .data and .toc.
This patch deletes case 3) to simplify the R_*_IRELATIVE generating
logic added in D57371. Other advantages:
* It is guaranteed no more than 1 R_*_IRELATIVE is created for an ifunc.
* PPC64: no need to special case ifunc in toc-indirect to toc-relative relaxation. See D65755
The deleted elf::addIRelativeRelocs demonstrates that one-pass scan
through relocations makes several optimizations difficult. This is
something we can think about in the future.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D65995
llvm-svn: 368661