Previously, we uncompress all compressed sections before doing anything.
That works, and that is conceptually simple, but that could results in
a waste of CPU time and memory if uncompressed sections are then
discarded or just copied to the output buffer.
In particular, if .debug_gnu_pub{names,types} are compressed and if no
-gdb-index option is given, we wasted CPU and memory because we
uncompress them into newly allocated bufers and then memcpy the buffers
to the output buffer. That temporary buffer was redundant.
This patch changes how to uncompress sections. Now, compressed sections
are uncompressed lazily. To do that, `Data` member of `InputSectionBase`
is now hidden from outside, and `data()` accessor automatically expands
an compressed buffer if necessary.
If no one calls `data()`, then `writeTo()` directly uncompresses
compressed data into the output buffer. That eliminates the redundant
memory allocation and redundant memcpy.
This patch significantly reduces memory consumption (20 GiB max RSS to
15 Gib) for an executable whose .debug_gnu_pub{names,types} are in total
5 GiB in an uncompressed form.
Differential Revision: https://reviews.llvm.org/D52917
llvm-svn: 343979
When we write a struct to a mmap'ed buffer, we usually use
write16/32/64, but we didn't for VersionDefinitionSection, so
we needed to template that class.
llvm-svn: 343024
Previously, if you invoke lld's `main` more than once in the same process,
the second invocation could fail or produce a wrong result due to a stale
pointer values of the previous run.
Differential Revision: https://reviews.llvm.org/D52506
llvm-svn: 343009
These files used to contain classes and functions for .gdb_index,
but they are moved to SyntheticSections.{cpp,h}, so the name is now
irrelevant.
llvm-svn: 342299
With this patch, lld creates a .note.GNU_stack and adds that to an
output file if it is creating a re-linkable object file (i.e. if -r
is given). If we don't do this, and if you use GNU linkers as a final
linker, they create an executable whose stack area is executable,
which is considered pretty bad these days.
Differential Revision: https://reviews.llvm.org/D51400
llvm-svn: 340902
It turns out that postThunkContents() is only used for
sorting symbols in .symtab.
Though we can instead move the logic to SymbolTableBaseSection::finalizeContents(),
postpone calling it and then get rid of postThunkContents completely.
Differential revision: https://reviews.llvm.org/D49547
llvm-svn: 339413
This patch merges createGdbIndex function and GdbIndexSection's
constructor into a single static member function of the class.
This patch also change how we keep CU vectors. Previously, CuVector
and GdbSymbols were parallel arrays, but there's no reason to choose that
design. Now, CuVector is a member of GdbSymbol class.
A lot of members are removed from GdbIndexSection. Previously, it has
members that need to be kept in sync over several phases. I belive the new
design is less error-prone, and the new code is much easier to read
than before.
llvm-svn: 336743
.gdb_index sections can be very large. When you are compiling
multi-gibibyte executables, they can be larger than 1 GiB. The previous
implementation of .gdb_index seems to consume too much memory.
This patch reduces memory consumption by eliminating temporary objects.
In one experiment, memory consumption of GdbIndexSection class is
reduced from 962 MiB to 228 MiB when creating a .gdb_index of 1350 GiB.
Differential Revision: https://reviews.llvm.org/D49094
llvm-svn: 336672
I believe the only way to test this functionality is to create extremely
large object files and attempt to create a .gdb_index that is greater
than 4 GiB. But I think that's too much for most environments and buildbots,
so I'm commiting this without a test that actually triggers the new
error condition.
llvm-svn: 336631
Patch by Rahul Chaudhry!
This change adds experimental support for SHT_RELR sections, proposed
here: https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
Pass '--pack-dyn-relocs=relr' to enable generation of SHT_RELR section
and DT_RELR, DT_RELRSZ, and DT_RELRENT dynamic tags.
Definitions for the new ELF section type and dynamic array tags, as well
as the encoding used in the new section are all under discussion and are
subject to change. Use with caution!
Pass '--use-android-relr-tags' with '--pack-dyn-relocs=relr' to use
SHT_ANDROID_RELR section type instead of SHT_RELR, as well as
DT_ANDROID_RELR* dynamic tags instead of DT_RELR*. The generated
section contents are identical.
'--pack-dyn-relocs=android+relr --use-android-relr-tags' enables both
'--pack-dyn-relocs=android' and '--pack-dyn-relocs=relr': lld will
encode the relative relocations in a SHT_ANDROID_RELR section, and pack
the rest of the dynamic relocations in a SHT_ANDROID_REL(A) section.
Differential Revision: https://reviews.llvm.org/D48247
llvm-svn: 336594
Almost all entries inside MIPS GOT are referenced by signed 16-bit
index. Zero entry lies approximately in the middle of the GOT. So the
total number of GOT entries cannot exceed ~16384 for 32-bit architecture
and ~8192 for 64-bit architecture. This limitation makes impossible to
link rather large application like for example LLVM+Clang. There are two
workaround for this problem. The first one is using the -mxgot
compiler's flag. It enables using a 32-bit index to access GOT entries.
But each access requires two assembly instructions two load GOT entry
index to a register. Another workaround is multi-GOT. This patch
implements it.
Here is a brief description of multi-GOT for detailed one see the
following link https://dmz-portal.mips.com/wiki/MIPS_Multi_GOT.
If the sum of local, global and tls entries is less than 64K only single
got is enough. Otherwise, multi-got is created. Series of primary and
multiple secondary GOTs have the following layout:
```
- Primary GOT
Header
Local entries
Global entries
Relocation only entries
TLS entries
- Secondary GOT
Local entries
Global entries
TLS entries
...
```
All GOT entries required by relocations from a single input file
entirely belong to either primary or one of secondary GOTs. To reference
GOT entries each GOT has its own _gp value points to the "middle" of the
GOT. In the code this value loaded to the register which is used for GOT
access.
MIPS 32 function's prologue:
```
lui v0,0x0
0: R_MIPS_HI16 _gp_disp
addiu v0,v0,0
4: R_MIPS_LO16 _gp_disp
```
MIPS 64 function's prologue:
```
lui at,0x0
14: R_MIPS_GPREL16 main
```
Dynamic linker does not know anything about secondary GOTs and cannot
use a regular MIPS mechanism for GOT entries initialization. So we have
to use an approach accepted by other architectures and create dynamic
relocations R_MIPS_REL32 to initialize global entries (and local in case
of PIC code) in secondary GOTs. But ironically MIPS dynamic linker
requires GOT entries and correspondingly ordered dynamic symbol table
entries to deal with dynamic relocations. To handle this problem
relocation-only section in the primary GOT contains entries for all
symbols referenced in global parts of secondary GOTs. Although the sum
of local and normal global entries of the primary got should be less
than 64K, the size of the primary got (including relocation-only entries
can be greater than 64K, because parts of the primary got that overflow
the 64K limit are used only by the dynamic linker at dynamic link-time
and not by 16-bit gp-relative addressing at run-time.
The patch affects common LLD code in the following places:
- Added new hidden -mips-got-size flag. This flag required to set low
maximum size of a single GOT to be able to test the implementation using
small test cases.
- Added InputFile argument to the getRelocTargetVA function. The same
symbol referenced by GOT relocation from different input file might be
allocated in different GOT. So result of relocation depends on the file.
- Added new ctor to the DynamicReloc class. This constructor records
settings of dynamic relocation which used to adjust address of 64kb page
lies inside a specific output section.
With the patch LLD is able to link all LLVM+Clang+LLD applications and
libraries for MIPS 32/64 targets.
Differential revision: https://reviews.llvm.org/D31528
llvm-svn: 334390
These maps are small, but we are creating an destroying one for each
input .eh_frame.
This patch reduces the total memory allocation from 765.54MB to
749.19MB. The peak is still the same: 563.7MB.
llvm-svn: 331075
Now that getSectionPiece is fast (uses a hash) it is probably OK to
split merge sections early.
The reason I want to do this is to split eh_frame sections in the same
place.
This does mean that we have to decompress early. Given that the only
compressed sections are debug info, I don't think we are missing much.
It is a small improvement: 0.5% on the geometric mean.
llvm-svn: 331058
This is slightly simpler to read IMHO. Now if a symbol has a position
in the file, it is Defined.
The main motivation is that with this a SharedSymbol doesn't need a
section, which reduces the size of SymbolUnion.
With this the peak allocation when linking chromium goes from 568.1 to
564.2 MB.
llvm-svn: 330966
MIPS ABI requires creation of the MIPS_RLD_MAP dynamic tag for non-PIE
executables only and MIPS_RLD_MAP_REL tag for both PIE and non-PIE
executables. The patch skips definition of the MIPS_RLD_MAP for PIE
files and defines MIPS_RLD_MAP_REL.
The MIPS_RLD_MAP_REL tag stores the offset to the .rld_map section
relative to the address of the tag itself.
Differential Revision: https://reviews.llvm.org/D43347
llvm-svn: 329996
Also make certain Thunk methods non-const as this will be required for
an upcoming change.
Differential Revision: https://reviews.llvm.org/D44961
llvm-svn: 328732
Choosing a Shift2 value based on wordsize is cargo-culted from gold.
Assuming that djb hash is a good hash function, choosing bits [4,9]
shouldn't be any worse or better than choosing bits [5,10]. We shouldn't
have copied that behavior that we can't justify in the first place.
Differential Revision: https://reviews.llvm.org/D44547
llvm-svn: 327921
This is the same as 327248 except Arm defining _GLOBAL_OFFSET_TABLE_ to
be the base of the .got section as some existing code is relying upon it.
For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at
the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] =
reserved value that is by convention the address of the dynamic section.
Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end
of the .got section with the intention that the .got.plt section would
follow the .got. However this does not always hold with the current
default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent
with the reserved first entry of the .got.plt.
X86, X86_64 and AArch64 will use the .got.plt. Arm, Mips and Power use .got
Fixes PR36555
Differential Revision: https://reviews.llvm.org/D44259
llvm-svn: 327823
This change broke ARM code that expects to be able to add
_GLOBAL_OFFSET_TABLE_ to the result of an R_ARM_REL32.
I will provide a reproducer on llvm-commits.
llvm-svn: 327688
the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] =
reserved value that is by convention the address of the dynamic section.
Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end
of the .got section with the intention that the .got.plt section would
follow the .got. However this does not always hold with the current
default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent
with the reserved first entry of the .got.plt.
X86, X86_64, Arm and AArch64 will use the .got.plt. Mips and Power use .got
Fixes PR36555
Differential Revision: https://reviews.llvm.org/D44259
llvm-svn: 327248
Summary:
Before the name of the function sounded like it was just a getter for the
private class member Addend. However, it actually calculates the final
value for the r_addend field in Elf_Rela that is used when writing the
.rela.dyn section. I also added a comment to the UseSymVA member to
explain how it interacts with computeAddend().
Differential Revision: https://reviews.llvm.org/D43161
llvm-svn: 325485
Now that we have R_ADDEND, UseSymVA was redundant. We only want to
write the symbol virtual address when using an expression other than
R_ADDEND.
llvm-svn: 325360
Summary:
This follows up on r321889 where writing of Elf_Rel addends was partially
moved to RelocationBaseSection. This patch ensures that the addends are
always written to the output section when a input section uses RELA but the
output is REL.
Differential Revision: https://reviews.llvm.org/D42843
llvm-svn: 325328
Previously we checked (HeaderSize == 0) to find out if
PltSection section is IPLT or PLT. Some targets does not set
HeaderSize though. For example PPC64 has no lazy binding implemented
and does not set PltHeaderSize constant.
Because of that using of both IPLT and PLT relocations worked
incorrectly there (testcase is provided).
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D41613
llvm-svn: 322362
We add dynamic section entries both in the ctor of the class and
DynamicSection::finalizeContents(). Some entries need to be added early
in the ctor because they add strings to .dynstr. Other entries were
intended to be added in finalizeContents(). However, some entries are
added in the ctor even though they don't add strings. This patch
fix the issue.
llvm-svn: 320851