We will need to do something like this to support range extension
thunks since that process is iterative.
Doing this also has the advantage that when doing the regular
relocation scan the offset in the output section is known and we can
just store that. This reduces the number of times we have to run
getOffset and I think will allow a more specialized .eh_frame
representation.
By itself this is already a performance win.
firefox
master 7.295045737
patch 7.209466989 0.98826892235
chromium
master 4.531254468
patch 4.509221804 0.995137623774
chromium fast
master 1.836928973
patch 1.823805241 0.992855612714
the gold plugin
master 0.379768791
patch 0.380043405 1.00072310839
clang
master 0.642698284
patch 0.642215663 0.999249070657
llvm-as
master 0.036665467
patch 0.036456225 0.994293213284
the gold plugin fsds
master 0.40395817
patch 0.404384555 1.0010555177
clang fsds
master 0.722045545
patch 0.720946135 0.998477367518
llvm-as fsds
master 0.03292646
patch 0.032759965 0.994943428477
scylla
master 3.427376378
patch 3.368316181 0.98276810292
llvm-svn: 276146
This patch simplifies output section management by making
Factory class have ownership of sections that creates.
Differential Revision: https://reviews.llvm.org/D22575
llvm-svn: 276141
In the last patch for --trace-symbol, I introduced a new symbol type
PlaceholderKind and store it to SymVector storage. It made all code
that iterates over SymVector to recognize and skip PlaceholderKind
symbols. I found that that's annoying.
In this patch, I removed PlaceholderKind and stop storing them to SymVector.
Now the information whether a symbol is being watched by --trace-symbol
is stored to the Symtab hash table.
llvm-svn: 275747
--trace-symbol is a command line option to watch a symbol.
Previosly, we looked up a hash table for a new symbol if the
option is given. Any code that looks up a hash table for each
symbol is expensive because the linker handles a lot of symbols.
In our design, we look up a hash table strictly only once
for a symbol, so --trace-symbol was an exception.
This patch improves efficiency of the option by merging the
hash table into the symbol table.
Instead of looking up a separate hash table with a string,
this patch sets `Traced` flag to symbols specified by --trace-symbol.
So, if you insert a symbol and get a symbol with `Traced` flag on,
you know that you need to print out a log message for the symbol.
This is nearly zero cost.
llvm-svn: 275716
Previously, each subclass of SymbolBody had a pointer to a source
file from which it was created. So, there was no single way to get
a source file for a symbol. We had getSourceFile<ELFT>(), but the
function was a bit inconvenient as it's a template.
This patch makes SymbolBody have a pointer to a source file.
If a symbol is not created from a file, the pointer has a nullptr.
llvm-svn: 275701
The identifier `Version` was used too often in the code to handle
symbol versions. The struct that contains version definitions is
named `Version`. Local variables for version ID are named `Version`.
Local varaible for version string are named `Version`.
This patch give them different names.
llvm-svn: 275673
Previously, Verdefs and Verdauxs are separated in the section.
The new layout is easier to write as we do not have to maintain
two pointers and can avoid passing a reference to a pointer.
llvm-svn: 275665
ELF spec says that alignment of 0 is equivalent to 1.
Previously, we arbitrary set to 0 or 1, but always setting to 1
makes our program simpler.
llvm-svn: 275374
Patch by H.J Lu.
For x86-64 psABI, the entry size of .got and .got.plt sections is 8
bytes for both LP64 and ILP32. Add GotEntrySize and GotPltEntrySize
to ELF target instead of using size of ELFT::uint. Now we can generate
a simple working x32 executable.
Differential Revision: http://reviews.llvm.org/D22288
llvm-svn: 275301
Config members are named after corresponding command line options.
This patch renames VAStart ImageBase so that they are in line with
--image-base.
Differential Revision: http://reviews.llvm.org/D22277
llvm-svn: 275298
Since linkerscript should create sections by itself
(if SECTIONS command is present),
then we might want to reuse the OutputSectionFactory (D19976 already do that now),
so this patch moves it out from writer cpp file for that purpose.
Differential revision: http://reviews.llvm.org/D19977
llvm-svn: 275161
The build otherwise fails with:
[ 39%] Building CXX object
tools/lld/ELF/CMakeFiles/lldELF.dir/OutputSections.cpp.o
/export/gnu/import/git/llvm/tools/lld/ELF/OutputSections.cpp: In
member function ‘void
lld:🧝:GnuHashTableSection<ELFT>::addSymbols(std::vector<std::pair<lld:🧝:SymbolBody*,
long unsigned int> >&)’:
/export/gnu/import/git/llvm/tools/lld/ELF/OutputSections.cpp:585:8:
error: inconsistent deduction for ‘auto’: ‘auto’ and then
‘__gnu_cxx::__normal_iterator<std::pair<lld:🧝:SymbolBody*, long
unsigned int>*, std::vector<std::pair<lld:🧝:SymbolBody*, long
unsigned int> > >’
Reported by: H. J. Liu
llvm-svn: 274518
This is PR28358
According to
https://www.akkadia.org/drepper/dsohowto.pdf
"The fourth point, the VERS 1.0 version being referred to in the VERS 2.0 definition, is not really important in symbol versioning. It marks the predecessor relationship of the two versions and it is done to maintain the similar- ities with Solaris’ internal versioning. It does not cause any problem it might in fact be useful to a human reader so predecessors should always be mentioned."
Patch partially reverts 273423 "[ELF] - Implemented version script hierarchies.",
version references are just ignored now.
Differential revision: http://reviews.llvm.org/D21888
llvm-svn: 274345
The patch adds one more partition to the MIPS GOT. This time it is for
TLS related GOT entries. Such entries are located after 'local' and 'global'
ones. We cannot get a final offset for these entries at the time of
creation because we do not know size of 'local' and 'global' partitions.
So we have to adjust the offset later using `getMipsTlsOffset()` method.
All MIPS TLS relocations which need GOT entries operates MIPS style GOT
offset - 'offset from the GOT's beginning' - MipsGPOffset constant. That
is why I add new types of relocation expressions.
One more difference from othe ABIs is that the MIPS ABI does not support
any TLS relocation relaxations. I decided to make a separate function
`handleMipsTlsRelocation` and put MIPS TLS relocation handling code
there. It is similar to `handleTlsRelocation` routine and duplicates its
code. But it allows to make the code cleaner and prevent pollution of
the `handleTlsRelocation` by MIPS 'if' statements.
Differential Revision: http://reviews.llvm.org/D21606
llvm-svn: 273569
Peter Smith found while trying to support thunk creation for ARM that
LLD sometimes creates broken thunks for MIPS. The cause of the bug is
that we assign file offsets to input sections too early. We need to
create all sections and then assign section offsets because appending
thunks changes file offsets for all following sections.
This patch separates the pass to assign file offsets from thunk
creation pass. This effectively reverts r265673.
Differential Revision: http://reviews.llvm.org/D21598
llvm-svn: 273532
Patch implements hierarchies for version scripts.
This allows to handle script files with dependencies, like next one has:
LIBSAMPLE_1.0{
global:
a;
};
LIBSAMPLE_2.0
{
global:
b;
}LIBSAMPLE_1.0;
Differential revision: http://reviews.llvm.org/D21556
llvm-svn: 273423
With fix:
-soname flag was not set in testcase. Hash calculated for base def was different on local
and bot machines because filename fos used for calculating.
Initial commit message:
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).
This patch allows programs that using simple scripts to link and run.
Differential revision: http://reviews.llvm.org/D21018
llvm-svn: 273152
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).
This patch allows programs that using simple scripts to link and run.
Differential revision: http://reviews.llvm.org/D21018
llvm-svn: 273143
There are two motivations for this patch. The first one is a preparation
for support MIPS TLS relocations. It might sound like a joke but for GOT
entries related to TLS relocations MIPS ABI uses almost regular approach
with creation of dynamic relocations for each GOT enty etc. But we need
to separate these 'regular' TLS related entries from MIPS specific local
and global parts of GOT. ABI declare simple solution - all TLS related
entries allocated at the end of GOT after local/global parts. The second
motivation it to support GOT relocations for non-preemptible symbols
with addends. If we have more than one GOT relocations against symbol S
with different addends we need to create GOT entries for each unique
Symbol/Addend pairs.
So we store all MIPS GOT entries in separate containers. For non-preemptible
symbols we have to maintain two data structures. The first one is MipsLocal
vector. Each entry corresponds to the GOT entry from the 'local' part
of the GOT contains the symbol's address plus addend. The second one
is MipsLocalMap. It is a map from Symbol/Addend pair to the GOT index.
Differential Revision: http://reviews.llvm.org/D21297
llvm-svn: 273127
I think it is me who named these variables, but I always find that
they are slightly confusing because align is a verb.
Adding four letters is worth it.
llvm-svn: 272984
PltZero (or PLT[0]) was an appropriate name for the little code
we have at beginning of the PLT section when we only supported x86
since the code for x86 just fits in the first PLT slot.
It's not the case anymore. The code for ARM64 occupies first two
slots, so PltZero spans PLT[0] and PLT[1], for example.
This patch renames it to avoid confusion.
llvm-svn: 272913
For ARM and MIPS, we don't need to call this function.
This patch passes a symbol instead of a PLT entry address
so that the target handler can call it if necessary.
llvm-svn: 272910
If the symbol is local we don't need to create a R_X86_64_DTPOFF64, we
can just write the correct value in the got.
Should fix pr28018.
llvm-svn: 272205
MergedInputSection::getOffset is the busiest function in LLD if string
merging is enabled and input files have lots of mergeable sections.
It is usually the case when creating executable with debug info,
so it is pretty common.
The reason why it is slow is because it has to do faily complex
computations. For non-mergeable sections, section contents are
contiguous in output, so in order to compute an output offset,
we only have to add the output section's base address to an input
offset. But for mergeable strings, section contents are split for
merging, so they are not contigous. We've got to do some lookups.
We used to do binary search on the list of section pieces.
It is slow because I think it's hostile to branch prediction.
This patch replaces it with hash table lookup. Seems it's working
pretty well. Below is "perf stat -r10" output when linking clang
with debug info. In this case this patch speeds up about 4%.
Before:
6584.153205 task-clock (msec) # 1.001 CPUs utilized ( +- 0.09% )
238 context-switches # 0.036 K/sec ( +- 6.59% )
0 cpu-migrations # 0.000 K/sec ( +- 50.92% )
1,067,675 page-faults # 0.162 M/sec ( +- 0.15% )
18,369,931,470 cycles # 2.790 GHz ( +- 0.09% )
9,640,680,143 stalled-cycles-frontend # 52.48% frontend cycles idle ( +- 0.18% )
<not supported> stalled-cycles-backend
21,206,747,787 instructions # 1.15 insns per cycle
# 0.45 stalled cycles per insn ( +- 0.04% )
3,817,398,032 branches # 579.786 M/sec ( +- 0.04% )
132,787,249 branch-misses # 3.48% of all branches ( +- 0.02% )
6.579106511 seconds time elapsed ( +- 0.09% )
After:
6312.317533 task-clock (msec) # 1.001 CPUs utilized ( +- 0.19% )
221 context-switches # 0.035 K/sec ( +- 4.11% )
1 cpu-migrations # 0.000 K/sec ( +- 45.21% )
1,280,775 page-faults # 0.203 M/sec ( +- 0.37% )
17,611,539,150 cycles # 2.790 GHz ( +- 0.19% )
10,285,148,569 stalled-cycles-frontend # 58.40% frontend cycles idle ( +- 0.30% )
<not supported> stalled-cycles-backend
18,794,779,900 instructions # 1.07 insns per cycle
# 0.55 stalled cycles per insn ( +- 0.03% )
3,287,450,865 branches # 520.799 M/sec ( +- 0.03% )
72,259,605 branch-misses # 2.20% of all branches ( +- 0.01% )
6.307411828 seconds time elapsed ( +- 0.19% )
Differential Revision: http://reviews.llvm.org/D20645
llvm-svn: 270999
MIPS .reginfo and .MIPS.options sections are consumed by the linker, and
the linker produces a single output section. But it is possible that
input files contain section symbol points to the corresponding input
section. In case of generation a relocatable output we need to write
such symbols to the output file.
Fixes bug 27878.
Differential Revision: http://reviews.llvm.org/D20688
llvm-svn: 270910
This patch makes SectionPiece class 8 bytes smaller on platforms
on which pointer size is 8 bytes. Sean suggested in a post commit
review for r270340 that this could make a differentce, and it
actually is. Time to link clang (with debug info) improved from
6.725 seconds to 6.589 seconds or by about 2%.
Differential Revision: http://reviews.llvm.org/D20613
llvm-svn: 270717
This patch addresses a post-commit review for r270325. r270325
introduced getReloc function that searches a relocation for a
given range. It always started searching from beginning of relocation
vector, so it was slower than before. Previously, we used to use
the fact that the relocations are sorted. This patch restore it.
llvm-svn: 270572