New lld's files are spread under lib subdirectory, and it isn't easy
to find which files are actually maintained. This patch moves maintained
files to Common subdirectory.
Differential Revision: https://reviews.llvm.org/D37645
llvm-svn: 314719
Convert all common symbols to regular symbols after scan.
This means that the downstream code does not need to handle common symbols as a special case.
Differential Revision: https://reviews.llvm.org/D38137
llvm-svn: 314495
In order to keep track of symbol renaming, we used to have
Config->SymbolRenaming, and whether a symbol is in the map or not
affects its symbol attribute (i.e. "LinkerRedefined" bit).
This patch adds "CanInline" bit to Symbol to aggregate symbol
information in one place and removes the member from Config since
no one except SymbolTable now uses the table.
llvm-svn: 314088
to separate commons based on file name patterns. The following linker script
construct does not work because commons are allocated before section placement
is done and the only synthesized BssSection that holds all commons has no file
associated with it:
SECTIONS { .common_0 : { *file0.o(COMMON) }}
This patch changes the allocation of commons to create a section per common
symbol and let the section logic do the layout.
Differential revision: https://reviews.llvm.org/D37489
llvm-svn: 312796
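A minimal sketch of the idea in the commit above, assuming hypothetical CommonSymbol and CommonSection types (these are not lld's real classes): each common symbol gets its own section that remembers the defining file, so a file pattern such as *file0.o(COMMON) has something to match against. The suffix check is a simplified stand-in for real glob matching.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not lld's actual data structures.
struct CommonSymbol {
  std::string Name;
  std::string File; // object file that defined the common symbol
  uint64_t Size;
  uint32_t Alignment;
};

struct CommonSection {
  std::string Owner; // the single common symbol this section holds
  std::string File;  // inherited from the symbol so file patterns can match
  uint64_t Size;
  uint32_t Alignment;
};

// Create one zero-initialized section per common symbol instead of one
// shared section that has no file associated with it.
inline std::vector<CommonSection>
createCommonSections(const std::vector<CommonSymbol> &Syms) {
  std::vector<CommonSection> Out;
  Out.reserve(Syms.size());
  for (const CommonSymbol &S : Syms) {
    assert(S.Alignment != 0 && "alignment must be non-zero");
    Out.push_back({S.Name, S.File, S.Size, S.Alignment});
  }
  return Out;
}

// Simplified stand-in for a "*suffix" file pattern such as *file0.o.
inline bool fileEndsWith(const CommonSection &Sec, const std::string &Suffix) {
  return Sec.File.size() >= Suffix.size() &&
         Sec.File.compare(Sec.File.size() - Suffix.size(), Suffix.size(),
                          Suffix) == 0;
}
```

With per-symbol sections, the regular section placement logic can decide the layout, which is what the commit relies on.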
Liveness is usually a notion of input sections, but this patch adds
"liveness" bit to common symbols because they don't belong to any
input section.
This patch is based on https://reviews.llvm.org/D36520
Differential Revision: https://reviews.llvm.org/D36546
llvm-svn: 310617
This is probably a small optimization, but the main motivation is
having a way of fixing pr34053 that doesn't require a hash lookup in
isPreemptible.
llvm-svn: 310602
With this Symbol has the same size as before, but DefinedRegular goes
from 72 to 64 bytes.
I also find this a bit easier to read. There are fewer places
initializing File for example.
This has a small but measurable speed improvement on all tests (1%
max).
llvm-svn: 310142
When a version script was used, the binding of undefined weak symbols was
sometimes calculated as STB_LOCAL, making them non-preemptible, which
broke the relocation handling logic for them.
Fixes PR33738.
Differential revision: https://reviews.llvm.org/D35263
llvm-svn: 307767
It was initially implemented in D19517 but then broken.
This patch fixes PR33707; the test case is based on the PR's case.
Differential revision: https://reviews.llvm.org/D35119
llvm-svn: 307652
We could have added this function to either Symbol or SymbolBody. I added it
to Symbol at first. But I noticed that if I had added it to SymbolBody,
we could have removed SymbolBody::setName(). So I'll do that in this patch.
llvm-svn: 306590
Most "reserved" symbols are in ElfSym and it looks like there's no
reason to not do the same thing for _GLOBAL_OFFSET_TABLE_. This should
help https://reviews.llvm.org/D34618 too.
llvm-svn: 306292
Previously, when symbol A is renamed B, both A and B end up having
the same name. This is because name is a symbol's attribute, and
we memcpy symbols for symbol renaming.
This patch saves the original symbol name and restores it after memcpy
to keep the original name.
This patch shouldn't change program's meaning, but names in symbol
tables make more sense than before.
llvm-svn: 306036
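A minimal sketch of the save-and-restore step described in the commit above, assuming a hypothetical trivially copyable Sym struct (not lld's real Symbol) and a hypothetical helper named renameTo:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical, trivially copyable stand-in for a linker symbol.
struct Sym {
  const char *Name;    // the symbol's own name
  unsigned long Value; // address or offset the symbol resolves to
  bool IsDefined;
};

// Make Dst take over all attributes of Src while keeping Dst's own name.
inline void renameTo(Sym &Dst, const Sym &Src) {
  assert(&Dst != &Src && "memcpy requires non-overlapping objects");
  const char *Saved = Dst.Name;         // save before the copy clobbers it
  std::memcpy(&Dst, &Src, sizeof(Sym)); // copies every field, name included
  Dst.Name = Saved;                     // restore the original name
}
```

Without the last assignment, both symbols would report the same name after the copy, which is the behavior the commit describes as the previous state.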
GNU linkers define the __bss_start symbol.
This patch teaches LLD to do that. This is PR32051.
Below is part of the standard ld.bfd script:
.data1 : { *(.data1) }
_edata = .; PROVIDE (edata = .);
. = .;
__bss_start = .;
.bss :
{
Currently LLD can emit up to 3 .bss* sections, as one of the test cases shows.
The implementation inserts this symbol before the first .bss* output section.
Differential revision: https://reviews.llvm.org/D30419
llvm-svn: 299528
The problem was fixed; details are on the review page.
Original commit message:
This removes the CopyRelSection class completely, making
Bss/BssRelRo just regular synthetic sections.
This is split from D30541 and polished.
The difference from D30541 is that all the logic for converting
SharedSymbol to DefinedRegular was removed for now and
will probably be posted as a separate patch.
Differential revision: https://reviews.llvm.org/D30892
llvm-svn: 298062
This removes the CopyRelSection class completely, making
Bss/BssRelRo just regular synthetic sections.
This is split from D30541 and polished.
The difference from D30541 is that all the logic for converting
SharedSymbol to DefinedRegular was removed for now and
will probably be posted as a separate patch.
Differential revision: https://reviews.llvm.org/D30892
llvm-svn: 297814
With this we have a single section hierarchy. It is a bit less code,
but the main advantage will be in a future patch being able to handle
foo = symbol_in_obj;
in a linker script. Currently that fails since we try to find the
output section of symbol_in_obj. With this we should be able to just
return an InputSection from the expression.
llvm-svn: 297313
Compared with D30458, this makes Bss/BssRelRo pure
synthetic sections.
This removes the CopyRelSection class completely, making
Bss/BssRelRo just regular synthetic sections.
SharedSymbols involved in creating copy relocations are
converted to DefinedRegular, which also simplifies things.
Differential revision: https://reviews.llvm.org/D30541
llvm-svn: 297008
That function doesn't use any member of SymbolTableSection, so I
couldn't see a reason to make it a member of that class. The function
takes a SymbolBody, so it is more natural to make it a member of
SymbolBody.
llvm-svn: 296433
With the current design an InputSection is basically anything that
goes directly in an OutputSection. That includes plain input sections
but also synthetic sections, so this should probably not be a
template.
llvm-svn: 295993
This patch removes NeedsCopyOrPltAddr and instead adds two variables,
NeedsCopy and NeedsPltAddr. This uses one more bit in Symbol class,
but the actual size doesn't increase because we had unused bits.
This should improve code readability.
llvm-svn: 295287
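A small sketch of why splitting one flag into two need not grow the struct, assuming a hypothetical layout with spare bits in the same storage unit (the other fields here are invented for illustration and do not mirror lld's Symbol):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout before the change: one combined flag.
struct SymbolBefore {
  uint8_t Binding;
  uint8_t Visibility : 2;
  uint8_t NeedsCopyOrPltAddr : 1; // 5 bits of this byte are unused
};

// Hypothetical layout after the change: two separate flags.
struct SymbolAfter {
  uint8_t Binding;
  uint8_t Visibility : 2;
  uint8_t NeedsCopy : 1;
  uint8_t NeedsPltAddr : 1; // takes one of the previously unused bits
};

// The extra bit fits in the existing storage unit, so the size is unchanged.
static_assert(sizeof(SymbolBefore) == sizeof(SymbolAfter),
              "splitting the flag must not grow the struct");
```

The two flags can now be set and tested independently, which is the readability gain the commit mentions.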
In the target dependent code we already always return an int64_t. In
the target independent code we carefully use uintX_t, which has the
same result given two's complement rules.
This just simplifies the code to use int64_t everywhere.
llvm-svn: 295263
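A short illustration of the two's complement point above, using hypothetical helper names: adding a negative addend in wrapping 32-bit unsigned arithmetic gives the same low 32 bits as doing the addition in int64_t and truncating at the end.

```cpp
#include <cassert>
#include <cstdint>

// Addition in 32-bit unsigned arithmetic; the result wraps modulo 2^32.
inline uint32_t addUnsigned32(uint32_t Base, uint32_t Addend) {
  return Base + Addend;
}

// The same computation carried out in int64_t and truncated when stored.
inline uint32_t addSigned64(int64_t Base, int64_t Addend) {
  return static_cast<uint32_t>(Base + Addend);
}
```

For example, a base of 0x1000 with an addend of -8 yields 0xFF8 either way, so using int64_t everywhere does not change the bits that end up in a 32-bit field.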
When we need a copy relocation we create a synthetic SHT_NOBITS
section that contains the right amount of ZI and assign it to either
.bss or .rel.ro.bss as appropriate. This allows the dynamic relocation
to be placed on the InputSection, removing the last case where a
dynamic relocation is stored as an offset from the OutputSection. This
has the side effect that we can run assignOffsets() after scanRelocs()
without losing the additional ZI needed for the copy relocations.
Differential Revision: https://reviews.llvm.org/D29637
llvm-svn: 294577
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
first caller to the Thunk.
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.
This is a recommit of r293283 with a fixed comparison predicate as
std::merge requires a strict weak ordering.
Differential revision: https://reviews.llvm.org/D29327
llvm-svn: 293757
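A generic illustration of the strict weak ordering requirement mentioned above. This is not the predicate from the patch; the Placed struct and the function names are assumptions made for the example. The comparator uses `<`, so it returns false for equal keys, which std::merge needs.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical item placed in an output section at some offset.
struct Placed {
  unsigned Offset;
  bool IsThunk;
};

// Strict weak ordering: irreflexive, so equal offsets compare as "not less".
// Writing `<=` here would break the requirement and make std::merge undefined.
inline bool byOffset(const Placed &A, const Placed &B) {
  return A.Offset < B.Offset;
}

// Merge two offset-sorted lists; on ties, elements of Sections come first.
inline std::vector<Placed> mergeThunks(const std::vector<Placed> &Sections,
                                       const std::vector<Placed> &Thunks) {
  assert(std::is_sorted(Sections.begin(), Sections.end(), byOffset));
  assert(std::is_sorted(Thunks.begin(), Thunks.end(), byOffset));
  std::vector<Placed> Out;
  Out.reserve(Sections.size() + Thunks.size());
  std::merge(Sections.begin(), Sections.end(), Thunks.begin(), Thunks.end(),
             std::back_inserter(Out), byOffset);
  return Out;
}
```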
The symbols _end, end, _etext, etext, _edata, edata and __ehdr_start
refer to positions in the file and are therefore not absolute. Making
them absolute was an unfortunate cargo cult of what bfd was doing.
Changing the symbols allows for pc relocations to them to be resolved,
which should fix the wine build.
llvm-svn: 293385
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
first caller to the Thunk.
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.
Differential Revision: https://reviews.llvm.org/D29129
llvm-svn: 293283
When reserving copy relocation space for a shared symbol, scan the DSO's
program headers to see if the symbol is in a read-only segment. If so,
reserve space for that symbol in a new synthetic section named .bss.rel.ro
which will be covered by the relro program header.
This fixes the security issue disclosed on the binutils mailing list at:
https://sourceware.org/ml/libc-alpha/2016-12/msg00914.html
Differential Revision: https://reviews.llvm.org/D28272
llvm-svn: 291524
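A rough sketch of the program header scan described above, assuming a simplified Phdr struct and a hypothetical function name. The constants use the standard ELF values (PT_LOAD is 1, PF_W is 2); only PT_LOAD segments are considered here, which may be narrower than what the patch does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Standard ELF values, renamed to avoid clashing with system headers.
constexpr uint32_t kPtLoad = 1; // PT_LOAD
constexpr uint32_t kPfW = 2;    // PF_W

// Simplified program header; field names are illustrative.
struct Phdr {
  uint32_t Type;
  uint32_t Flags;
  uint64_t VAddr;
  uint64_t MemSz;
};

// Returns true if SymVA falls inside a loadable segment without write
// permission, i.e. the symbol's data is read-only in the DSO.
inline bool isInReadOnlySegment(const std::vector<Phdr> &Phdrs,
                                uint64_t SymVA) {
  for (const Phdr &P : Phdrs) {
    if (P.Type != kPtLoad || (P.Flags & kPfW))
      continue;
    if (SymVA >= P.VAddr && SymVA < P.VAddr + P.MemSz)
      return true;
  }
  return false;
}
```

A symbol for which this returns true would get its copy relocation space in .bss.rel.ro rather than .bss.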
In a shared library an undefined symbol is implicitly imported. If the
symbol is called as a function a PLT entry is generated for it. When the
caller is a Thumb b.w a thunk to the PLT entry is needed as all PLT
entries are in ARM state.
This change allows undefined symbols to have thunks in the same way that
shared symbols may have thunks.
llvm-svn: 290951
DefinedSynthetic is not created for a real ELF object, so it doesn't
have to be a template function. It has a virtual st_value, which is
either 32 bit or 64 bit, but we can simply use 64 bit.
llvm-svn: 290241
We first decide that the symbol is global, then that it should have
version foo. Since it was already not the default version, we were
producing a bogus warning.
llvm-svn: 289284
This change introduces new synthetic sections IpltSection, IgotPltSection
that represent the ifunc entries that would previously have been put in
the PltSection and the GotPltSection. The separation makes sure that
the R_*_IRELATIVE relocations are placed after the non R_*_IRELATIVE
relocations, which permits ifunc resolvers to know that the .got.plt
slots will be initialized prior to the resolver being called.
A secondary benefit is that for ARM we can move the IgotPltSection and its
dynamic relocations to the .got and .rel.dyn, as the ARM glibc expects all
the R_*_IRELATIVE relocations to be in .rel.dyn.
Differential revision: https://reviews.llvm.org/D27406
llvm-svn: 289045
These MIPS specific symbols should be global because in general they can
have an arbitrary value. By default this value is a fixed offset from the .got
section.
This patch adds more checks to the mips-gp-local.s test case but marks
it as XFAIL because LLD does not allow a linker script to redefine the value
of an absolute symbol. This should be fixed by D27276.
Differential revision: https://reviews.llvm.org/D27524
llvm-svn: 289025
StringRefZ is a class to represent a null-terminated string. String
length is computed lazily, so it's more efficient than StringRef to
represent strings in string table.
The motivation of defining this new class is to merge functions
that only differ in string types; we have many constructors that take
`const char *` or `StringRef`. With StringRefZ, we can merge them.
Differential Revision: https://reviews.llvm.org/D27037
llvm-svn: 288172
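A minimal sketch of a lazily measured C string in the spirit of the commit above. The class name LazyCStr and its members are assumptions for illustration; this is not lld's actual StringRefZ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: wraps a null-terminated string and computes its
// length only on first request.
class LazyCStr {
public:
  // Length unknown: strlen is deferred until size() is called.
  explicit LazyCStr(const char *S) : Data(S) { assert(S != nullptr); }
  // Length already known (e.g. converted from a sized string).
  LazyCStr(const char *S, std::size_t N) : Data(S), Size(N) {}

  const char *data() const { return Data; }

  std::size_t size() const {
    if (Size == Unknown)
      Size = std::strlen(Data); // computed once, then cached
    return Size;
  }

  bool lengthComputed() const { return Size != Unknown; }

private:
  static constexpr std::size_t Unknown = static_cast<std::size_t>(-1);
  const char *Data;
  mutable std::size_t Size = Unknown;
};
```

Symbols whose names are never inspected never pay for strlen, and one constructor pair can serve callers holding either a `const char *` or a sized string.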
The offset between the beginning of the .got section and the _gp symbol is
used in MIPS GOT relocation calculations. Usually the expression looks like
VA + Offset - GP, where VA is the .got section address, Offset is the offset
of the GOT entry, and GP is the offset between .got and _gp. There are also two
"magic" symbols, _gp_disp and __gnu_local_gp, which hold the offset mentioned
above. These symbols might be referenced by MIPS relocations.
Currently the linker always defines the _gp symbol and uses a hardcoded value
to initialize it, so the offset between .got and _gp is 0x7ff0. The _gp_disp
and __gnu_local_gp symbols are defined if required and initialized to 0x7ff0.
That is not correct, because the _gp symbol might be defined by a linker
script and hold an arbitrary value. In that case we need to use this value
in relocation calculations and initialize _gp_disp and __gnu_local_gp
properly.
The patch fixes the problem and completes the fix for bug #30311.
https://llvm.org/bugs/show_bug.cgi?id=30311
Differential revision: https://reviews.llvm.org/D27036
llvm-svn: 287832
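A worked sketch of the calculation discussed above, under the assumption that the final term is the value of _gp: by default .got + 0x7ff0, otherwise whatever a linker script assigned. The function names and the HasScriptGp parameter are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Default distance between the start of .got and _gp, as stated above.
constexpr uint64_t DefaultGpOffset = 0x7ff0;

// Value of _gp: taken from the linker script if it defines one
// (assumption for this sketch), otherwise .got + 0x7ff0.
inline uint64_t gpValue(uint64_t GotVA, bool HasScriptGp, uint64_t ScriptGp) {
  return HasScriptGp ? ScriptGp : GotVA + DefaultGpOffset;
}

// GP-relative displacement of a GOT entry: VA + Offset - GP.
inline int64_t gotEntryGpRel(uint64_t GotVA, uint64_t EntryOffset,
                             uint64_t Gp) {
  return static_cast<int64_t>(GotVA + EntryOffset - Gp);
}
```

With .got at 0x20000 and an entry at offset 8, the default _gp gives 0x20008 - 0x27ff0 = -0x7fe8. If a script sets _gp to 0x21000, the same entry is at -0xff8, which is why a hardcoded 0x7ff0 gives wrong results in that case.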
There are two ways to set symbol versions. One way is to use symbol
definition file, and the other is to embed version names to symbol
names. In the latter way, symbol name is in the form of `foo@version1`
where `foo` is a real name and `version1` is a version.
We were parsing symbol names in insert(). That seems unnecessarily
too early. We can do it later after we resolve all symbols. Doing it
lazily is a good thing because it makes code easier to read
(because now we have a separate pass to parse symbol names). Also
it could slightly improve performance because if two identical symbols
have versions, we now parse them only once.
llvm-svn: 287741
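A minimal sketch of the name parsing step described above, with a hypothetical function name. It splits at the first '@' only and deliberately ignores other forms such as a doubled '@'.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Split "foo@version1" into {"foo", "version1"}.
// A name without '@' is returned unchanged with an empty version.
inline std::pair<std::string, std::string>
splitVersion(const std::string &Name) {
  std::string::size_type Pos = Name.find('@');
  if (Pos == std::string::npos)
    return {Name, ""};
  return {Name.substr(0, Pos), Name.substr(Pos + 1)};
}
```

Running this in a separate pass after symbol resolution, rather than on every insert, is the restructuring the commit describes.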
Previously, we stored offsets in string tables to symbols, so
you needed to pass a string table to get a symbol name. This patch
stores const char pointers instead to eliminate the need to pass
a string table.
llvm-svn: 287737
This patch adds a filename to that error message.
I hit the following error when debugging one of the FreeBSD ports:
error: relocation R_X86_64_PLT32 cannot refer to absolute symbol __tls_get_addr
The error message was poor; this patch improves it to show the locations
of the symbol's declaration and use.
Differential revision: https://reviews.llvm.org/D26508
llvm-svn: 286940
This patch allows passing a symbol ordering file to the linker.
LLD will map symbols to sections and sort sections
in the output according to the symbol ordering file.
That can help to reduce the startup time and/or
the number of page faults during startup.
Also, an interesting benchmark result was produced by Rafael Espíndola.
After applying the symbols file for clang he timed compiling
X86MCTargetDesc.ii to an object file.
The page faults went from just
56,988 to 56,946 since most faults are not in the binary.
Running time went from 4.403053515 to 4.178112244.
The speedup seems to be because of better cache
locality.
Differential revision: https://reviews.llvm.org/D26130
llvm-svn: 286440
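A simplified sketch of ordering sections by a symbol ordering file, assuming each section defines a single symbol. The Section struct and the function name are hypothetical. Sections whose symbol is not listed keep their relative order after the listed ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical section that defines exactly one symbol.
struct Section {
  std::string Name;
  std::string DefinesSymbol;
};

// Reorder Sections so that those defining symbols listed in Order come
// first, in the listed order; everything else follows, order preserved.
inline void sortByOrderFile(std::vector<Section> &Sections,
                            const std::vector<std::string> &Order) {
  std::map<std::string, std::size_t> Priority;
  for (std::size_t I = 0; I < Order.size(); ++I)
    Priority.emplace(Order[I], I); // the first occurrence of a name wins

  auto Rank = [&](const Section &S) -> std::size_t {
    auto It = Priority.find(S.DefinesSymbol);
    return It == Priority.end() ? Order.size() : It->second;
  };
  std::stable_sort(Sections.begin(), Sections.end(),
                   [&](const Section &A, const Section &B) {
                     return Rank(A) < Rank(B);
                   });
}
```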
The disadvantage is that we use uint64_t instead of uint32_t for some
values in 32 bit files. The advantage is substantially simpler code,
faster builds and less code duplication.
llvm-svn: 286414
Previously, we had a lot of BumpPtrAllocators, but all these
allocators virtually have the same lifetime because they are
not freed until the linker finishes its job. This patch aggregates
them into a single allocator.
Differential revision: https://reviews.llvm.org/D26042
llvm-svn: 285452
We used to have one allocator per file, which reduces the advantage of
using an allocator in the first place.
This is a small speedup in most cases. The largest speedup was
1.014X in chromium no-gc. The largest slowdown was scylla at 1.003X.
llvm-svn: 285205
Some MIPS relocations used to access GOT entries can only handle a
16-bit index. Others, like R_MIPS_CALL_HI16/LO16, can handle
32-bit indexes. 16-bit relocations are generated by default. The 32-bit
relocations are generated when the -mxgot flag is passed to the compiler.
Usually these relocations are not mixed in the same code, but files like crt*.o
contain 16-bit relocations, so even if all "user" code is compiled with
the -mxgot flag, a few 16-bit relocations might reach the linking phase.
Currently LLD does not differentiate local GOT entries accessed via 16-bit
and 32-bit indexes. That might lead to relocation overflow if 16-bit
entries are allocated too far from the beginning of the GOT.
The patch introduces a new "part" of the MIPS GOT dedicated to the local GOT
entries accessed by 32-bit relocations. That allows local GOT entries
accessed via a 16-bit index to be placed first, avoiding relocation overflow.
Differential revision: https://reviews.llvm.org/D25833
llvm-svn: 284809
In case of linking PIC and non-PIC code together and generation of a
relocatable object, all PIC symbols should have STO_MIPS_PIC flag in the
symbol table of the output file.
llvm-svn: 282714
Previously, all input files were owned by the symbol table.
Files were created at various places, such as the Driver, the lazy
symbols, or the bitcode compiler, and the ownership of new files
was transferred to the symbol table using std::unique_ptr.
All input files were then freed when the symbol table was freed,
which was on program exit.
I think we don't have to transfer ownership just to free all
instances at once on exit.
In this patch, all instances are automatically collected to a
vector and freed on exit. In this way, we no longer have to
use std::unique_ptr.
Differential Revision: https://reviews.llvm.org/D24493
llvm-svn: 281425
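A minimal sketch of the "collect instances and free them on exit" idea, assuming hypothetical helpers named instances and make and a hypothetical InputFile type. The sketch keeps a per-type vector internally; callers only ever see raw pointers and never transfer ownership.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Per-type registry; its destructor runs at program exit and frees
// every object created through make<T>().
template <class T> std::vector<std::unique_ptr<T>> &instances() {
  static std::vector<std::unique_ptr<T>> V;
  return V;
}

// Create an object that lives until exit and return a plain pointer to it.
template <class T, class... Args> T *make(Args &&...A) {
  instances<T>().push_back(
      std::unique_ptr<T>(new T(std::forward<Args>(A)...)));
  return instances<T>().back().get();
}

// Hypothetical input file type used only for this example.
struct InputFile {
  std::string Path;
  explicit InputFile(std::string P) : Path(std::move(P)) {}
};
```

A call such as `make<InputFile>("a.o")` can then be made from the driver, from lazy symbol handling, or from anywhere else, without deciding who owns the result.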
r275711 for "speeding up symbol version handling" was committed
by misunderstanding; the benchmark number was measured with
a debug build. The number with a release build didn't actually change.
This patch removes false optimizations added in that patch.
llvm-svn: 276267
In the last patch for --trace-symbol, I introduced a new symbol type,
PlaceholderKind, and stored it in SymVector. That forced all code
that iterates over SymVector to recognize and skip PlaceholderKind
symbols, which I found annoying.
In this patch, I removed PlaceholderKind and stopped storing such symbols
in SymVector. Now the information about whether a symbol is being watched by
--trace-symbol is stored in the Symtab hash table.
llvm-svn: 275747
--trace-symbol is a command line option to watch a symbol.
Previously, we looked up a hash table for each new symbol if the
option was given. Any code that looks up a hash table for each
symbol is expensive because the linker handles a lot of symbols.
In our design, we look up a hash table strictly only once
for a symbol, so --trace-symbol was an exception.
This patch improves efficiency of the option by merging the
hash table into the symbol table.
Instead of looking up a separate hash table with a string,
this patch sets `Traced` flag to symbols specified by --trace-symbol.
So, if you insert a symbol and get a symbol with `Traced` flag on,
you know that you need to print out a log message for the symbol.
This is nearly zero cost.
llvm-svn: 275716
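A small sketch of the Traced flag approach described above, with hypothetical Symbol and SymbolTable types. Traced names are pre-registered in the same table that insert() uses, so tracing adds no extra lookup per symbol.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical symbol record.
struct Symbol {
  bool Traced = false;
  bool Defined = false;
};

class SymbolTable {
public:
  // Called once per --trace-symbol argument, before input files are read.
  void trace(const std::string &Name) { Table[Name].Traced = true; }

  // The single hash lookup per symbol. If the returned symbol has
  // Traced set, the caller knows it should log a message.
  Symbol &insert(const std::string &Name) { return Table[Name]; }

private:
  std::unordered_map<std::string, Symbol> Table;
};
```

After `trace("foo")`, a later `insert("foo")` returns a symbol with Traced set, while any other name comes back with the flag clear.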
Versions can be assigned to symbols in two different ways.
One is the usual version scripts, and the other is special
symbol suffix '@'. If a symbol contains '@', the string after
that is considered to specify a version name.
Previously, we looked for '@' in all symbols.
Anything that works on every symbol can be expensive because
the linker has to handle a lot of symbols. The search for '@'
was not an exception.
In this patch, I made two optimizations.
The first optimization is to handle '@' only when at least one
version is defined. If no versions are defined, no versions can
be assigned to any symbols, so it's a waste of time to search for '@'.
The second optimization is to scan only suffixes of symbol names
instead of entire symbol names. Symbol names can be very long, but
symbol versions are usually short, so scanning entire symbol names
is a waste of time, too.
There are some error cases that we are no longer able to detect
with this patch. I don't think it's a major drawback because they
are minor errors. Speed is more important.
This change improves LLD with debug info self-link time from
6.6993 seconds to 6.3426 seconds (or -5.3%).
Differential Revision: https://reviews.llvm.org/D22433
llvm-svn: 275711
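A rough sketch of the two optimizations described above. The function name and the MaxSuffix parameter are assumptions; the commit message does not say how the suffix length is chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the index of '@' if one occurs within the last MaxSuffix
// characters of Name, or npos otherwise.
inline std::size_t findVersionSep(const std::string &Name, bool HaveVersions,
                                  std::size_t MaxSuffix) {
  // Optimization 1: with no versions defined, nothing can be assigned.
  if (!HaveVersions)
    return std::string::npos;

  // Optimization 2: scan backwards over the suffix only.
  std::size_t Stop = Name.size() > MaxSuffix ? Name.size() - MaxSuffix : 0;
  for (std::size_t I = Name.size(); I > Stop; --I)
    if (Name[I - 1] == '@')
      return I - 1;
  return std::string::npos;
}
```

A '@' that sits before the scanned suffix goes unnoticed, which corresponds to the undetected error cases the commit accepts as a trade-off.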