This shaves another word off SectionBase and makes it possible to clone a
section using the implicit copy constructor.
This basically reverts r311056, which removed the mutex in order to
make the code easier to understand. On balance I think it's probably more
straightforward to have a mutex here than to have an unusual copy constructor
in SectionBase.
Differential Revision: https://reviews.llvm.org/D59269
llvm-svn: 355966
This patch removes the precompiled binary from inputs,
replacing it with a YAML. And teaches LLD to report a
section name in case of such error.
Differential revision: https://reviews.llvm.org/D58670
llvm-svn: 354959
Previously, we showed the following message for an unknown relocation:
foo.o: unrecognized reloc 256
This patch improves it so that the error message includes a symbol name:
foo.o: unknown relocation (256) against symbol bar
llvm-svn: 354040
Non-GOT non-PLT relocations to non-preemptible ifuncs result in the
creation of a canonical PLT, which now takes the identity of the IFUNC
in the symbol table. This (a) ensures address consistency inside and
outside the module, and (b) fixes a bug where some of these relocations
end up pointing to the resolver.
Fixes (at least) PR40474 and PR40501.
Differential Revision: https://reviews.llvm.org/D57371
llvm-svn: 353981
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The section and offset can be very helpful in diagnosing certian errors.
For example on a relocation overflow or misalignment diagnostic:
test.c:(function foo): relocation R_PPC64_ADDR16_DS out of range: ...
The function foo can have many R_PPC64_ADDR16_DS relocations. Adding the offset
and section will identify exactly which relocation is causing the failure.
Differential Revision: https://reviews.llvm.org/D56453
llvm-svn: 350828
ARM and AArch64 use TLS variant 1, where the first two words after the
thread pointer are reserved for the TCB, followed by the executable's TLS
segment. Both the thread pointer and the TLS segment are aligned to at
least the TLS segment's alignment.
Android/Bionic historically has not supported ELF TLS, and it has
allocated memory after the thread pointer for several Bionic TLS slots
(currently 9 but soon only 8). At least one of these allocations
(TLS_SLOT_STACK_GUARD == 5) is widespread throughout Android/AArch64
binaries and can't be changed.
To reconcile this disagreement about TLS memory layout, set the minimum
alignment for executable TLS segments to 8 words on ARM/AArch64, which
reserves at least 8 words of memory after the TP (2 for the ABI-specified
TCB and 6 for alignment padding). For simplicity, and because lld doesn't
know when it's targeting Android, increase the alignment regardless of
operating system.
Differential Revision: https://reviews.llvm.org/D53906
llvm-svn: 350681
In the ABI for the 64-bit Arm architecture the section on weak references
states:
During linking, the symbol value of an undefined weak reference is:
- Zero if the relocation type is absolute
- The address of the place if the relocation type is pc-relative.
The relocations associated with an ADRP are relative so we should resolve
the undefined weak reference to the place instead of 0. This matches GNU
ld.bfd behaviour.
fixes pr34928
Differential Revision: https://reviews.llvm.org/D55599
llvm-svn: 349024
Previously, we have a hash table containing strings and their offsets
to manage mergeable strings. Technically we can live without that, because
we can do binary search on a vector of mergeable strings to find a mergeable
strings.
We did have both the hash table and the binary search because we thought
that that is faster.
We recently observed that lld tend to consume more memory than gold when
building an output with debug info. A few percent of memory is consumed by
the hash table. So, we needed to reevaluate whether or not having the extra
hash table is a good CPU/memory tradeoff. I run a few benchmarks with and
without the hash table.
I got a mixed result for the benchmark. We observed a regression for some
programs by removing the hash table (that's what we expected), but we also
observed that performance imrpovements for some programs. This is perhaps
due to reduced memory usage.
Differential Revision: https://reviews.llvm.org/D55234
llvm-svn: 348401
This is https://bugs.llvm.org/show_bug.cgi?id=38074.
The issue is that when calling a function, LLD generates a
.got entry that points to the IFUNC resolver function when
instead, it should use the PLT entries properly for
handling the IFUNC.
So we should create a got entry that points to PLT entry,
which itself loads the value from
.got.plt, relocated with R_*_IRELATIVE to make things work.
Patch do that.
Differential revision: https://reviews.llvm.org/D54314
llvm-svn: 347650
The R_AARCH64_ADR_PREL_PG_HI21 relocation type is given the R_PAGE_PC
RelExpr. This can be transformed to R_PLT_PAGE_PC via toPlt().
Unfortunately the resolution is identical to R_PAGE_PC so instead of
getting the address of the PLT entry we get the address of the symbol
which may not be correct in the case of static ifuncs. The fix is to
handle the cases separately and use getPltVA() + A with R_PLT_PAGE_PC.
Differential Revision: https://reviews.llvm.org/D54474
llvm-svn: 346863
This is https://bugs.llvm.org/show_bug.cgi?id=39493.
We crashed previously because did not handle /DISCARD/ properly
when -r was used. I think it is uncommon to use scripts with -r, though I see
nothing wrong to handle the /DISCARD/ so that we will not crash at least.
Differential revision: https://reviews.llvm.org/D53864
llvm-svn: 345819
Summary:
There are really three different kinds of TLS layouts:
* A fixed TLS-to-TP offset. On architectures like PowerPC, MIPS, and
RISC-V, the thread pointer points to a fixed offset from the start
of the executable's TLS segment. The offset is 0x7000 for PowerPC
and MIPS, which allows a signed 16-bit offset to reach 0x1000 of
per-thread implementation data and 0xf000 of the application's TLS
segment. The size and layout of the TCB isn't relevant to the static
linker and might not be known.
* A fixed TCB size. This is the format documented as "variant 1" in
Ulrich Drepper's TLS spec. The thread pointer points to a 2-word TCB
followed by the executable's TLS segment. The first word is always
the DTV pointer. Used on ARM. The thread pointer must be aligned to
the TLS segment's alignment, possibly creating alignment padding.
* Variant 2. This format predates variant 1 and is also documented in
Drepper's TLS spec. It allocates the executable's TLS segment before
the thread pointer, apparently for backwards-compatibility. It's
used on x86 and SPARC.
Factor out an lld:🧝:getTlsTpOffset() function for use in a
follow-up patch for Android. The TcbSize/TlsTpOffset fields are only used
in getTlsTpOffset, so replace them with a switch on Config->EMachine.
Reviewers: espindola, ruiu, PkmX, jrtc27
Reviewed By: ruiu, PkmX, jrtc27
Subscribers: jyknight, emaste, sdardis, nemanjai, javed.absar, arichardson, kristof.beyls, kbarton, fedor.sergeev, atanasyan, PkmX, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D53905
llvm-svn: 345775
Recommitting https://reviews.llvm.org/rL344544 after fixing undefined behavior
from left-shifting a negative value. Original commit message:
This support is slightly different then the X86_64 implementation in that calls
to __morestack don't need to get rewritten to calls to __moresatck_non_split
when a split-stack caller calls a non-split-stack callee. Instead the size of
the stack frame requested by the caller is adjusted prior to the call to
__morestack. The size the stack-frame will be adjusted by is tune-able through a
new --split-stack-adjust-size option.
llvm-svn: 344622
This reverts commit https://reviews.llvm.org/rL344544, which causes failures on
a undefined behaviour sanitizer bot -->
lld/ELF/Arch/PPC64.cpp:849:35: runtime error: left shift of negative value -1
llvm-svn: 344551
This support is slightly different then the X86_64 implementation in that calls
to __morestack don't need to get rewritten to calls to __moresatck_non_split
when a split-stack caller calls a non-split-stack callee. Instead the size of
the stack frame requested by the caller is adjusted prior to the call to
__morestack. The size the stack-frame will be adjusted by is tune-able through a
new --split-stack-adjust-size option.
Differential Revision: https://reviews.llvm.org/D52099
llvm-svn: 344544
Previously, we cast a pointer to Elf{32,64}_Chdr like this
auto *Hdr = reinterpret_cast<const ELF64_Chdr>(Ptr);
and read from its members like this
read32(&Hdr->ch_size);
I was thinking that this does not violate alignment requirement,
since &Hdr->ch_size doesn't really access memory, but seems like
it is a violation in terms of C++ spec (?)
In this patch, I use a different struct that allows unaligned access.
llvm-svn: 344083
Previously, we uncompress all compressed sections before doing anything.
That works, and that is conceptually simple, but that could results in
a waste of CPU time and memory if uncompressed sections are then
discarded or just copied to the output buffer.
In particular, if .debug_gnu_pub{names,types} are compressed and if no
-gdb-index option is given, we wasted CPU and memory because we
uncompress them into newly allocated bufers and then memcpy the buffers
to the output buffer. That temporary buffer was redundant.
This patch changes how to uncompress sections. Now, compressed sections
are uncompressed lazily. To do that, `Data` member of `InputSectionBase`
is now hidden from outside, and `data()` accessor automatically expands
an compressed buffer if necessary.
If no one calls `data()`, then `writeTo()` directly uncompresses
compressed data into the output buffer. That eliminates the redundant
memory allocation and redundant memcpy.
This patch significantly reduces memory consumption (20 GiB max RSS to
15 Gib) for an executable whose .debug_gnu_pub{names,types} are in total
5 GiB in an uncompressed form.
Differential Revision: https://reviews.llvm.org/D52917
llvm-svn: 343979
The GOT is referenced through the symbol _GLOBAL_OFFSET_TABLE_ .
The relocation added calculates the offset into the global offset table for
the entry of a symbol. In order to get the correct TargetVA I needed to
create an new relocation expression, HEXAGON_GOT. It does
Sym.getGotVA() - In.GotPlt->getVA().
Differential Revision: https://reviews.llvm.org/D52744
llvm-svn: 343784
Summary: The convenience wrapper in STLExtras is available since rL342102.
Reviewers: ruiu, espindola
Subscribers: emaste, arichardson, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D52569
llvm-svn: 343146
Previously, if you invoke lld's `main` more than once in the same process,
the second invocation could fail or produce a wrong result due to a stale
pointer values of the previous run.
Differential Revision: https://reviews.llvm.org/D52506
llvm-svn: 343009
The PPC64 elf V2 abi defines 2 entry points for a function. There are a few
places we need to calculate the offset from the global entry to the local entry
and how this is done is not straight forward. This patch adds a helper function
mostly for documentation purposes, explaining how the 2 entry points differ and
why we choose one over the other, as well as documenting how the offsets are
encoded into a functions st_other field.
Differential Revision: https://reviews.llvm.org/D52231
llvm-svn: 342603
section will not have an input file. Don't crash under those circumstances.
Neither clang nor llvm-mc generates R_X86_64_PC32 relocations due to
https://reviews.llvm.org/D43383, which makes it hard to write a test case.
However, gcc does generate such relocations. I want to get a fix in now,
but will figure out a way to actually exercise this code path as soon
as I can.
llvm-svn: 341408
This patch moves the checking for too large offsets into merge sections
earlier.
Without this change the large offset generated in the added test-case
will cause an assert (as it happens to be a value reserved as a
"tombstone" in the DenseMap implementation) when OffsetMap is queried in
getSectionPiece().
To simplify the code and avoid future mistakes I have refactored so that
there is only one function that looks up offsets in the OffsetMap.
Differential Revision: https://reviews.llvm.org/D51180
llvm-svn: 341206
Patch by PkmX.
This patch makes lld recognize RISC-V target and implements basic
relocation for RV32/RV64 (and RVC). This should be necessary for static
linking ELF applications.
The ABI documentation for RISC-V can be found at:
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md.
Note that the documentation is far from complete so we had to figure out
some details from bfd.
The patch should be pretty straightforward. Some highlights:
- A new relocation Expr R_RISCV_PC_INDIRECT is added. This is needed as
the low part of a PC-relative relocation is linked to the corresponding
high part (auipc), see:
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses
- LLVM's MC support for RISC-V is very incomplete (we are working on
this), so tests are given in objectyaml format with the original
assembly included in the comments. Once we have complete support for
RISC-V in MC, we can switch to llvm-as/llvm-objdump.
- We don't support linker relaxation for now as it requires greater
changes to lld that is beyond the scope of this patch. Once this is
accepted we can start to work on adding relaxation to lld.
Differential Revision: https://reviews.llvm.org/D39322
llvm-svn: 339364
In according to the comment, undefined symbol should never reach there.
So, should be able to remove the check. I am assuming this is NFC.
llvm-svn: 338723
It does not seem that this code is alive.
I seems was needed previously but we fixed it.
If it is still needed, it needs new tests,
but for now I do not know how to trigger it,
and so I removed it.
llvm-svn: 338713
This patch adds the target call back relaxTlsLdToLe to support TLS relaxation
from local dynamic to local exec model.
Differential Revision: https://reviews.llvm.org/D48293
llvm-svn: 336559
The local dynamic TLS access on PPC64 ELF v2 ABI uses R_PPC64_GOT_DTPREL16*
relocations when a TLS variables falls outside 2 GB of the thread storage
block. This patch adds support for these relocations by adding a new RelExpr
called R_TLSLD_GOT_OFF which emits a got entry for the TLS variable relative
to the dynamic thread pointer using the relocation R_PPC64_DTPREL64. It then
evaluates the R_PPC64_GOT_DTPREL16* relocations as the got offset for the
R_PPC64_DTPREL64 got entries.
Differential Revision: https://reviews.llvm.org/D48484
llvm-svn: 335732
Patch adds support for relaxing the general-dynamic tls sequence to
initial-exec.
the relaxation performs the following transformation:
addis r3, r2, x@got@tlsgd@ha --> addis r3, r2, x@got@tprel@ha
addi r3, r3, x@got@tlsgd@l --> ld r3, x@got@tprel@l(r3)
bl __tls_get_addr(x@tlsgd) --> nop
nop --> add r3, r3, r13
and instead of emitting a DTPMOD64/DTPREL64 pair for x, we emit a single
R_PPC64_TPREL64.
Differential Revision: https://reviews.llvm.org/D48090
llvm-svn: 335651
Almost all entries inside MIPS GOT are referenced by signed 16-bit
index. Zero entry lies approximately in the middle of the GOT. So the
total number of GOT entries cannot exceed ~16384 for 32-bit architecture
and ~8192 for 64-bit architecture. This limitation makes impossible to
link rather large application like for example LLVM+Clang. There are two
workaround for this problem. The first one is using the -mxgot
compiler's flag. It enables using a 32-bit index to access GOT entries.
But each access requires two assembly instructions two load GOT entry
index to a register. Another workaround is multi-GOT. This patch
implements it.
Here is a brief description of multi-GOT for detailed one see the
following link https://dmz-portal.mips.com/wiki/MIPS_Multi_GOT.
If the sum of local, global and tls entries is less than 64K only single
got is enough. Otherwise, multi-got is created. Series of primary and
multiple secondary GOTs have the following layout:
```
- Primary GOT
Header
Local entries
Global entries
Relocation only entries
TLS entries
- Secondary GOT
Local entries
Global entries
TLS entries
...
```
All GOT entries required by relocations from a single input file
entirely belong to either primary or one of secondary GOTs. To reference
GOT entries each GOT has its own _gp value points to the "middle" of the
GOT. In the code this value loaded to the register which is used for GOT
access.
MIPS 32 function's prologue:
```
lui v0,0x0
0: R_MIPS_HI16 _gp_disp
addiu v0,v0,0
4: R_MIPS_LO16 _gp_disp
```
MIPS 64 function's prologue:
```
lui at,0x0
14: R_MIPS_GPREL16 main
```
Dynamic linker does not know anything about secondary GOTs and cannot
use a regular MIPS mechanism for GOT entries initialization. So we have
to use an approach accepted by other architectures and create dynamic
relocations R_MIPS_REL32 to initialize global entries (and local in case
of PIC code) in secondary GOTs. But ironically MIPS dynamic linker
requires GOT entries and correspondingly ordered dynamic symbol table
entries to deal with dynamic relocations. To handle this problem
relocation-only section in the primary GOT contains entries for all
symbols referenced in global parts of secondary GOTs. Although the sum
of local and normal global entries of the primary got should be less
than 64K, the size of the primary got (including relocation-only entries
can be greater than 64K, because parts of the primary got that overflow
the 64K limit are used only by the dynamic linker at dynamic link-time
and not by 16-bit gp-relative addressing at run-time.
The patch affects common LLD code in the following places:
- Added new hidden -mips-got-size flag. This flag required to set low
maximum size of a single GOT to be able to test the implementation using
small test cases.
- Added InputFile argument to the getRelocTargetVA function. The same
symbol referenced by GOT relocation from different input file might be
allocated in different GOT. So result of relocation depends on the file.
- Added new ctor to the DynamicReloc class. This constructor records
settings of dynamic relocation which used to adjust address of 64kb page
lies inside a specific output section.
With the patch LLD is able to link all LLVM+Clang+LLD applications and
libraries for MIPS 32/64 targets.
Differential revision: https://reviews.llvm.org/D31528
llvm-svn: 334390