This patch implements a new design for the symbol table that stores
SymbolBodies within a memory region of the Symbol object. Symbols are mutated
by constructing SymbolBodies in place over existing SymbolBodies, rather
than by mutating pointers. As mentioned in the initial proposal [1], this
memory layout helps reduce the cache miss rate by improving memory locality.
Performance numbers:
old(s) new(s)
Without debug info:
chrome 7.178 6.432 (-11.5%)
LLVMgold.so 0.505 0.502 (-0.5%)
clang 0.954 0.827 (-15.4%)
llvm-as 0.052 0.045 (-15.5%)
With debug info:
scylla 5.695 5.613 (-1.5%)
clang 14.396 14.143 (-1.8%)
Performance counter results show that the fewer required indirections is
indeed the cause of the improved performance. For example, when linking
chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and
instructions per cycle increases from 0.78 to 0.83. We are also executing
many fewer instructions (15,516,401,933 down to 15,002,434,310), probably
because we spend less time allocating SymbolBodies.
The new mechanism by which symbols are added to the symbol table is by calling
add* functions on the SymbolTable.
In this patch, I handle local symbols by storing them inside "unparented"
SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating
these SymbolBodies, we can probably do that separately.
I also removed a few members from the SymbolBody class that were only being
used to pass information from the input file to the symbol table.
This patch implements the new design for the ELF linker only. I intend to
prepare a similar patch for the COFF linker.
[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html
Differential Revision: http://reviews.llvm.org/D19752
llvm-svn: 268178
We were only doing it for .so and --export-dynamic, but those are not
the only ways a symbol ends up in the dynamic symbol table.
Problem diagnostic and earlier patch version by Peter Collingbourne.
llvm-svn: 267568
The fix is to handle local symbols referring to SHF_MERGE sections.
Original message:
GC entries of SHF_MERGE sections.
It is a fairly direct extension of the gc algorithm. For merge sections
instead of remembering just a live bit, we remember which offsets
were used.
This reduces the .rodata sections in chromium from 9648861 to 9477472
bytes.
llvm-svn: 267233
These are properties of a symbol name, rather than a particular instance
of a symbol in an object file. We can simplify the code by collecting these
properties in Symbol.
The MustBeInDynSym flag has been renamed ExportDynamic, as its semantics
have been changed to be the same as those of --dynamic-list and
--export-dynamic-symbol, which do not cause hidden symbols to be exported.
Differential Revision: http://reviews.llvm.org/D19400
llvm-svn: 267183
It is a fairly direct extension of the gc algorithm. For merge sections
instead of remembering just a live bit, we remember which offsets were
used.
This reduces the .rodata sections in chromium from 9648861 to 9477472
bytes.
llvm-svn: 267164
It turns out that this will read data from the section to properly
handle Elf_Rel implicit addends.
Sorry for the noise.
Original messages:
Try to fix Windows lld build.
Move getRelocTarget to ObjectFile.
It doesn't use anything from the InputSection.
llvm-svn: 267163
llvm\tools\lld\ELF\MarkLive.cpp(49): error C2872: 'ObjectFile': ambiguous symbol
llvm\tools\lld\elf\InputFiles.h(100): note: could be 'lld:🧝:ObjectFile'
llvm\include\llvm/Object/IRObjectFile.h(26): note: or 'llvm::object::ObjectFile'
llvm\tools\lld\ELF\MarkLive.cpp(133): note: see reference to function template instantiation
'void forEachSuccessor<ELFT>(lld:🧝:InputSection<ELFT> *,
std::function<void (lld:🧝:InputSectionBase<ELFT> *)>)'
being compiled with
[ ELFT=llvm::object::ELF32LE ]
llvm\tools\lld\ELF\MarkLive.cpp(136): note: see reference to function template instantiation
'void lld:🧝:markLive<llvm::object::ELF32LE>(lld:🧝:SymbolTable<llvm::object::ELF32LE> *)
being compiled
llvm-svn: 267161
Originally, linker scripts were basically an alternative way to specify
options to the command line options. But as we add more features to hanlde
symbols and sections, many member functions needed to be templated.
Now most the members are templated. It is probably time to template the
entire class.
Previously, LinkerScript is an executor of the linker script as well as
a storage of linker script configurations. This is not suitable to template
the class because when we are reading linker script files, we don't know
the ELF type yet, so we can't instantiate ELF-templated classes.
In this patch, I defined a new class, ScriptConfiguration, to store
linker script configurations. ScriptParser writes parse results to it,
and LinkerScript uses them.
Differential Revision: http://reviews.llvm.org/D19302
llvm-svn: 266908
We never need to iterate over the K,V pairs, so we can avoid copying the
key as MapVector does.
This is a small speedup on most benchmarks.
llvm-svn: 266364
The DenseMap doesn't store hash results. This means that when it is
resized it has to recompute them.
This patch is a small hack that wraps the StringRef in a struct that
remembers the hash value. That way we can be sure it is only hashed
once.
llvm-svn: 266357
Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.
Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.
There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.
The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.
As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.
In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.
llvm-svn: 265293
"Discarded" section is a marker for discarded sections, and we do not
use the instance except for checking its identity. In that sense, it
is just another type of a "null" pointer for InputSectionBase. So,
it doesn't have to be a real instance of InputSectionBase class.
In this patch, we no longer instantiate Discarded section but instead
use -1 as a pointer value. This eliminates a global variable which
needed initialization at startup.
llvm-svn: 261761
When link-time garbage collection is in use (-gc-sections), it is
often useful to mark sections that should not be eliminated.
This is accomplished by surrounding an input section's wildcard
entry with KEEP(). Patch implements that command.
Differential revision: http://reviews.llvm.org/D17242
llvm-svn: 261616
There are 3 symbol types that a .bc can provide during lto: defined,
undefined, common.
Defined and undefined symbols have already been refactored. I was
working on common and noticed that absolute symbols would become an
oddity: They would be the only symbol type present in a .o but not in
a.bc.
Looking a bit more, other than the special section number they were only
used for special rules for computing values. In that way they are
similar to TLS, and we don't have a DefinedTLS.
This patch deletes it. With it we have a reasonable rule of the thumb
for having a symbol kind: It exists if it has special resolution
semantics.
llvm-svn: 256383
In FreeBSD, rtld expects .ctors containing -1 (0xffffffff), and a
.ctors section containing the correct bits is provided to the linker as
input (/usr/lib/crtbegin.o).
Contents of section .ctors:
0000 ffffffff ffffffff ........
This section is not stripped even if not referenced or empty, also in
gold or ld.bfd. It would be nice to strip it when not needed but
since existing object files rely on that we can't do better to keep it
around.
Differential Revision: http://reviews.llvm.org/D15767
llvm-svn: 256373
.eh_frame sections need to be preserved if they refer to live sections.
So the liveness relation is reverse for eh_frame sections. For now,
we simply preserve all .eh_frame sections. Thanks Rafael for pointing
this out. .jcr are kept for the same reason.
llvm-svn: 251068
Section garbage collection is a feature to remove unused sections
from outputs. Unused sections are sections that cannot be reachable
from known GC-root symbols or sections. Naturally the feature is
implemented as a mark-sweep garbage collector.
In this patch, I added Live bit to InputSectionBase. If and only
if Live bit is on, the section will be written to the output.
Starting from GC-root symbols or sections, a new function, markLive(),
visits all reachable sections and sets their Live bits. Writer then
ignores sections whose Live bit is off, so that such sections are
excluded from the output.
This change has small negative impact on performance if you use
the feature because making sections means more work. The time to
link Clang changes from 0.356s to 0.386s, or +8%.
It reduces Clang size from 57,764,984 bytes to 55,296,600 bytes.
That is 4.3% reduction.
http://reviews.llvm.org/D13950
llvm-svn: 251043