The local dynamic TLS access on PPC64 ELF v2 ABI uses R_PPC64_GOT_DTPREL16*
relocations when a TLS variables falls outside 2 GB of the thread storage
block. This patch adds support for these relocations by adding a new RelExpr
called R_TLSLD_GOT_OFF which emits a got entry for the TLS variable relative
to the dynamic thread pointer using the relocation R_PPC64_DTPREL64. It then
evaluates the R_PPC64_GOT_DTPREL16* relocations as the got offset for the
R_PPC64_DTPREL64 got entries.
Differential Revision: https://reviews.llvm.org/D48484
llvm-svn: 335732
Patch adds support for relaxing the general-dynamic tls sequence to
initial-exec.
the relaxation performs the following transformation:
addis r3, r2, x@got@tlsgd@ha --> addis r3, r2, x@got@tprel@ha
addi r3, r3, x@got@tlsgd@l --> ld r3, x@got@tprel@l(r3)
bl __tls_get_addr(x@tlsgd) --> nop
nop --> add r3, r3, r13
and instead of emitting a DTPMOD64/DTPREL64 pair for x, we emit a single
R_PPC64_TPREL64.
Differential Revision: https://reviews.llvm.org/D48090
llvm-svn: 335651
Almost all entries inside MIPS GOT are referenced by signed 16-bit
index. Zero entry lies approximately in the middle of the GOT. So the
total number of GOT entries cannot exceed ~16384 for 32-bit architecture
and ~8192 for 64-bit architecture. This limitation makes impossible to
link rather large application like for example LLVM+Clang. There are two
workaround for this problem. The first one is using the -mxgot
compiler's flag. It enables using a 32-bit index to access GOT entries.
But each access requires two assembly instructions two load GOT entry
index to a register. Another workaround is multi-GOT. This patch
implements it.
Here is a brief description of multi-GOT for detailed one see the
following link https://dmz-portal.mips.com/wiki/MIPS_Multi_GOT.
If the sum of local, global and tls entries is less than 64K only single
got is enough. Otherwise, multi-got is created. Series of primary and
multiple secondary GOTs have the following layout:
```
- Primary GOT
Header
Local entries
Global entries
Relocation only entries
TLS entries
- Secondary GOT
Local entries
Global entries
TLS entries
...
```
All GOT entries required by relocations from a single input file
entirely belong to either primary or one of secondary GOTs. To reference
GOT entries each GOT has its own _gp value points to the "middle" of the
GOT. In the code this value loaded to the register which is used for GOT
access.
MIPS 32 function's prologue:
```
lui v0,0x0
0: R_MIPS_HI16 _gp_disp
addiu v0,v0,0
4: R_MIPS_LO16 _gp_disp
```
MIPS 64 function's prologue:
```
lui at,0x0
14: R_MIPS_GPREL16 main
```
Dynamic linker does not know anything about secondary GOTs and cannot
use a regular MIPS mechanism for GOT entries initialization. So we have
to use an approach accepted by other architectures and create dynamic
relocations R_MIPS_REL32 to initialize global entries (and local in case
of PIC code) in secondary GOTs. But ironically MIPS dynamic linker
requires GOT entries and correspondingly ordered dynamic symbol table
entries to deal with dynamic relocations. To handle this problem
relocation-only section in the primary GOT contains entries for all
symbols referenced in global parts of secondary GOTs. Although the sum
of local and normal global entries of the primary got should be less
than 64K, the size of the primary got (including relocation-only entries
can be greater than 64K, because parts of the primary got that overflow
the 64K limit are used only by the dynamic linker at dynamic link-time
and not by 16-bit gp-relative addressing at run-time.
The patch affects common LLD code in the following places:
- Added new hidden -mips-got-size flag. This flag required to set low
maximum size of a single GOT to be able to test the implementation using
small test cases.
- Added InputFile argument to the getRelocTargetVA function. The same
symbol referenced by GOT relocation from different input file might be
allocated in different GOT. So result of relocation depends on the file.
- Added new ctor to the DynamicReloc class. This constructor records
settings of dynamic relocation which used to adjust address of 64kb page
lies inside a specific output section.
With the patch LLD is able to link all LLVM+Clang+LLD applications and
libraries for MIPS 32/64 targets.
Differential revision: https://reviews.llvm.org/D31528
llvm-svn: 334390
Add support for the R_PPC64_GOT_TLSLD16 relocations used to build the address of
the tls_index struct used in local-dynamic tls.
Differential Revision: https://reviews.llvm.org/D47538
llvm-svn: 333681
getRelocTargetVA for R_TLSGD and R_TLSLD RelExprs calculate an offset from the
end of the got, so adjust the names to reflect this.
Differential Revision: https://reviews.llvm.org/D47379
llvm-svn: 333674
Adds handling of all the relocation types for general-dynamic thread local
storage.
Differential Revision: https://reviews.llvm.org/D47325
llvm-svn: 333420
Note that this doesn't do the right thing in the case where there is
a linker script. We probably need to move output section assignment
before ICF to get the correct behaviour here.
Differential Revision: https://reviews.llvm.org/D47241
llvm-svn: 333052
Some MIPS relocations depend on "gp" value. By default, this value has
0x7ff0 offset from a .got section. But relocatable files produced by a
compiler or a linker might redefine this default value and we have to
use it for a calculation of the relocation result. When we generate EXE
or DSO it's trivial. Generating a relocatable output is more difficult
case because the linker does calculate relocations in this case and
cannot store individual "gp" values used by each input object file.
As a workaround we add the "gp" value to the relocation addend.
This fixes https://llvm.org/pr31149
Differential revision: https://reviews.llvm.org/D45972
llvm-svn: 331772
On PowerPC calls to functions through the plt must be done through a call stub
that is responsible for:
1) Saving the toc pointer to the stack.
2) Loading the target functions address from the plt into both r12 and the
count register.
3) Indirectly branching to the target function.
Previously we have been emitting these call stubs to the .plt section, however
the .plt section should be reserved for the lazy symbol resolution stubs. This
patch moves the call stubs to the text section by moving the implementation from
writePlt to the thunk framework.
Differential Revision: https://reviews.llvm.org/D46204
llvm-svn: 331607
The current support for V1 ABI in LLD is incomplete.
This patch removes V1 ABI support and changes the default behavior to V2 ABI,
issuing an error when using the V1 ABI. It also updates the testcases to V2
and removes any V1 specific tests.
Differential Revision: https://reviews.llvm.org/D46316
llvm-svn: 331529
Now that getSectionPiece is fast (uses a hash) it is probably OK to
split merge sections early.
The reason I want to do this is to split eh_frame sections in the same
place.
This does mean that we have to decompress early. Given that the only
compressed sections are debug info, I don't think we are missing much.
It is a small improvement: 0.5% on the geometric mean.
llvm-svn: 331058
PPC64 V2 ABI describes two entry points to a function. The global entry point
sets up the TOC base pointer. When calling a local function, the call should
branch to the local entry point rather than the global entry point.
Section 3.4.1 describes using the 3 most significant bits of the st_other
field to find out how many instructions there are between the local and global
entry point. This patch adds the correct offset required to branch to the local
entry point of a function.
Differential Revision: https://reviews.llvm.org/D45729
llvm-svn: 331046
The PPC64 V2 ABI restores the toc base by loading from an offset of 24 from r1.
This patch fixes the offset and updates the testcases from V1 to V2. It also
issues an error when a nop is missing after a call to an external function.
Differential Revision: https://reviews.llvm.org/D45892
llvm-svn: 330600
Now that we don't ICF synthetic sections, we can go back to the old
logic on whose responsibility it is to check Repl.
The idea is that Sec->something() will not check Repl. It is the
responsibility of the caller to find the correct Sec.
llvm-svn: 330346
We had a single symbol using -1 with a synthetic section. It is
simpler to just update its value.
This is not a big will by itself, but will allow having a simple
getOffset for InputSeciton.
llvm-svn: 330340
Using getOffset is here was a bit of an overkill. This is being
written and has relocations. This implies it is a .eh_frame or regular
section.
llvm-svn: 330307
This is similar to r329219, but for the entire section. Like r329219 I
don't expect this to have any real impact, it is just more consistent
and simpler.
llvm-svn: 329367
We were ignoring the addend if the piece was dead. I don't expect this
to make a difference in any real world situations, but it is simpler
anyway.
llvm-svn: 329219
In the lld perf builder r328686 had a negative impact in
stalled-cycles-frontend. Somehow that stat is not showing on my
machine, but the attached patch shows an improvement on cache-misses,
which is probably a reasonable proxy.
My working theory is that given a large input the pieces vector is out
of cache by the time initOffsetMap runs.
Both finalizeContents implementation have a convenient location for
initializing the OffsetMap, so this seems the best solution.
llvm-svn: 329117
OffsetMap maps to a SectionPiece index, but we were not taking
advantage of that in getSectionPiece.
With this patch both getOffset and getSectionPiece use OffsetMap and
the binary search is moved to findSectionPiece.
llvm-svn: 329044
Since SectionBase::getOutputSection handles ICF replaces and
SectionBase::getOffset was handling it in some cases, it is more
consistent to have getOffset always handle it.
llvm-svn: 328391
When looking for the output section and the output offset the
expectation was that the caller had looked at Repl. That works fine
for InputSections, but in the case of MergeInputSections the caller
doesn't have the section that is actually replaced.
The original testcase was failing because getOutputSection was
returning null. The slightly extended testcase also checks that
getOffset also checks Repl.
I will send a refactoring separetelly.
llvm-svn: 328332
Our code assumes all input sections in an output SHF_LINK_ORDER
section has SHF_LINK_ORDER flag. We do not check that and that can cause a crash.
That happens because we call
std::stable_sort(Sections.begin(), Sections.end(), compareByFilePosition);,
where compareByFilePosition predicate does not expect to see
null when calls getLinkOrderDep.
The same might happen when sections refer to non-regular sections.
Test cases demonstrate the issues, patch fixes them.
Differential revision: https://reviews.llvm.org/D44193
llvm-svn: 327006
Previously we would crash because did not mark .rel[a] sections
as dead and they tried to access parent which was not live
after ICF and therefore was null.
Differential revision: https://reviews.llvm.org/D43241
llvm-svn: 325877
Summary:
This follows up on r321889 where writing of Elf_Rel addends was partially
moved to RelocationBaseSection. This patch ensures that the addends are
always written to the output section when a input section uses RELA but the
output is REL.
Differential Revision: https://reviews.llvm.org/D42843
llvm-svn: 325328
Even though it doesn't make sense, there seems to be multiple programs
in the wild that create PC-relative relocations in non-ALLOC sections.
I believe this is caused by the negligence of GNU linkers to not report
any errors for such relocations.
Currently, lld emits warnings against such relocations and exits.
So, you cannot link any program that contains wrong relocations until
you fix an issue in a program that generates wrong ELF files. It's often
impractical to fix a program because it's not always easy.
This patch relaxes the error checking and emit a warning instead.
Differential Revision: https://reviews.llvm.org/D43351
llvm-svn: 325307
In order to identify a compressed section, we check if a section name
starts with ".zdebug" or the section has SHF_COMPRESSED flag. We already
use the knowledge in this function. So hiding that check in
isCompressedELFSection doesn't make sense.
llvm-svn: 324951
When decompressing a compressed debug section, we drop SHF_COMPRESSED
flag but we didn't drop "z" in ".zdebug" section name. This patch does
that for consistency.
This change also fixes the issue that .zdebug_gnu_pubnames are not
dropped when we are creating a .gdb_index section.
llvm-svn: 324949
Initially LLD generates Elf_Rel relocations for O32 ABI and Elf_Rela
relocations for N32 / N64 ABIs. In other words, format of input and
output relocations was always the same. Now LLD generates all output
relocations using Elf_Rel format only. It conforms to ABIs requirement.
The patch suggested by Alexander Richardson.
llvm-svn: 324064
We normally avoid "switch (Config->EKind)", but in this case I think
it is worth it.
It is only executed when there is an error and it allows detemplating
a lot of code.
llvm-svn: 321404
It is currently in InputSectionBase. Only InputSections are used in
ICF, so Repl should be move to InputSection to clear the class
hierarchy or, like this patch does, to SectionBase for convenience.
The convenience of having it on the base class is that we can just
access the replacement without having to first check if it is an
InputSection. It is a bit less code and a bit faster as some of this
code is very hot.
I got up to 1.77% improvement in clang-gdb-index and no regressions
according to lnt.
llvm-svn: 320654
Having a SectionBase method check Repl is inconsistent with how we
handle other section information.
For example, if a section is replaced, Sec->Live is false and it is
natural for Sec->getOutputSection() to be null.
It is the symbol that is moved to the replacement section.
llvm-svn: 320599
Since MarkLive.cpp is the place where we set Live flags for
other sections, it looks correct to do that there.
Benefit is that we stop spreading GC logic outsize of MarkLive.cpp.
Differential revision: https://reviews.llvm.org/D40454
llvm-svn: 319435
Previously our relocations we rewrote were broken for that case.
We emited incorrect addend and broken relocation info field
because did not produce section symbol for mergeable synthetic sections.
Differential revision: https://reviews.llvm.org/D40070
llvm-svn: 318394
Now that DefinedRegular is the only remaining derived class of
Defined, we can merge the two classes.
Differential Revision: https://reviews.llvm.org/D39667
llvm-svn: 317448
Now that we have only SymbolBody as the symbol class. So, "SymbolBody"
is a bit strange name now. This is a mechanical change generated by
perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)
nd clang-format-diff.
Differential Revision: https://reviews.llvm.org/D39459
llvm-svn: 317370
This is PR34826.
Currently LLD is unable to report line number when reporting
duplicate declaration of some variable.
That happens because for extracting line information we always use
.debug_line section content which describes mapping from machine
instructions to source file locations, what does not help for
variables as does not describe them.
In this patch I am taking the approproate information about
variables locations from the .debug_info section.
Differential revision: https://reviews.llvm.org/D38721
llvm-svn: 317080
This is for PR34852.
GCC 8.0 or earlier have a bug that it emits R_386_GOTPC relocations
against _GLOBAL_OFFSET_TABLE for .debug_info. The bug seems to have
been fixed in 2017: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82630,
but we do not want LLD to report errors for such inputs.
In this patch we ignore such relocations.
Differential revision: https://reviews.llvm.org/D38625
llvm-svn: 316761
Summary:
The COFF linker and the ELF linker have long had similar but separate
Error.h and Error.cpp files to implement error handling. This change
introduces new error handling code in Common/ErrorHandler.h, changes the
COFF and ELF linkers to use it, and removes the old, separate
implementations.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39259
llvm-svn: 316624
We used to have a map from section piece offsets to section pieces
as a cache for binary search. But I found that the map took quite a
large amount of memory and didn't make linking faster. So, in this
patch, I removed the map.
This patch saves 566 MiB of RAM (2.019 GiB -> 1.453 GiB) when linking
clang with debug info, and the link time is 4% faster in that test case.
Thanks for Sean Silva for pointing this out.
llvm-svn: 316305
By assuming that mergeable input sections are smaller than 4 GiB,
lld's memory usage when linking clang with debug info drops from
2.788 GiB to 2.019 GiB (measured by valgrind, and that does not include
memory space for mmap'ed files). I think that's a reasonable assumption
given such a large RAM savings, so this patch.
According to valgrind, gold needs 3.54 GiB of RAM to do the same thing.
NB: This patch does not introduce a limitation on the size of
output sections. You can still create sections larger than 4 GiB.
llvm-svn: 316280
A section was passed to getRelExpr just to create an error message.
But if there's an invalid relocation, we would eventually report it
in relocateOne. So we don't have to pass a section to getRelExpr.
llvm-svn: 315552
We were using uint32_t as the type of relocation kind. It has a
readability issue because what Type really means in `uint32_t Type`
is not obvious. It could be a section type, a symbol type or a
relocation type.
Since we do not do any arithemetic operations on relocation types
(e.g. adding one to R_X86_64_PC32 doesn't make sense), it would be
more natural if they are represented as enums. Unfortunately, that
is not doable because relocation type definitions are spread into
multiple header files.
So I decided to use typedef. This still should be better than the
plain uint32_t because the intended type is now obvious.
llvm-svn: 315525
The condition whether a section is alive or not by default
is becoming increasingly complex, so the decision of garbage
collection is spreading over InputSection.h and MarkLive.cpp,
which is not a good state.
This moves the code to MarkLive.cpp, to keep the file the central
place to make decisions about garbage collection.
llvm-svn: 315384
When reporting a symbol conflict, LLD parses the debug info to report
source location information. Sections have not been decompressed at this
point, so if an object file contains zlib compressed debug info, LLD
ends up passing this compressed debug info to the DWARF parser, which
causes debug info parsing failures and can trigger assertions in the
parser (as the test case demonstrates).
Decompress debug sections when constructing the LLDDwarfObj to avoid
this issue. This doesn't handle GNU-style compressed debug info sections
(.zdebug_*), which at present are simply ignored by LLDDwarfObj; those
can be done in a follow-up.
Differential Revision: https://reviews.llvm.org/D38491
llvm-svn: 314866
The result of hash_value(StringRef) depends on sizeof(size_t).
That causes lld to create different mergeable table contents on
32-bit machines.
This patch is to use xxHash64 so that we get the same hash values
on 32-bit machines.
llvm-svn: 314603
This is "Bug 34688 - lld much slower than bfd when linking the linux kernel"
Inside copyRelocations() we have O(N*M) algorithm, where N - amount of
relocations and M - amount of symbols in symbol table. It isincredibly slow
for linking linux kernel.
Patch creates local search tables to speedup.
With this fix link time goes for me from 12.95s to 0.55s what is almost 23x
faster. (used release LLD).
Differential revision: https://reviews.llvm.org/D38129
llvm-svn: 314282
EhSectionPiece used to have a pointer to a section, but that pointer was
mostly redundant because we almost always know what the section is without
using that pointer. This patch removes the pointer from the struct.
This patch also use uint32_t/int32_t instead of size_t to represent
offsets that are hardly be larger than 4 GiB. At the moment, I think it is
OK even if we cannot handle .eh_frame sections larger than 4 GiB.
Differential Revision: https://reviews.llvm.org/D38012
llvm-svn: 313697
We crashed when --emit-relocs was used
and relocated section was collected by GC.
Differential revision: https://reviews.llvm.org/D37561
llvm-svn: 313620
This patch removes lot of static Instances arrays from different input file
classes and introduces global arrays for access instead. Similar to arrays we
have for InputSections/OutputSectionCommands.
It allows to iterate over input files in a non-templated code.
Differential revision: https://reviews.llvm.org/D35987
llvm-svn: 313619