We used to create a vector contantaining all version definitions
with wildcards because doing that was efficient. All patterns were
compiled to a regexp and matched against symbol names. Because
a regexp can be converted to a DFA, matching against union of patterns
is as cheap as matching against one patter.
We are no longer converting them to regexp. Our own glob pattern
handler doesn't do such optimization. Therefore, creating a vector
no longer makes sense.
llvm-svn: 287196
This change separates all versioned locals to be a separate list in config,
that was suggested by Rafael and simplifies the logic a bit.
Differential revision: https://reviews.llvm.org/D26754
llvm-svn: 287132
The code to handle symbol versions is getting tricky and hard to
understand, so it is probably time to simplify it. This patch does
the following.
- Add `DemangledSyms` variable to SymbolTable so that we don't
need to pass it around to findDemangled.
- Define `initDemangledSyms` to initialize the variable lazily.
- hasExternCpp is removed because we no longer have to initialize
the map eagerly.
- scanScriptVersion is split.
- Comments are updated.
llvm-svn: 287002
Patch adds a filename to that error message.
I faced next error when debugged one of FreeBSD port:
error: relocation R_X86_64_PLT32 cannot refer to absolute symbol __tls_get_addr
error message was poor and this patch improves it to show the locations
of symbol declaration and using.
Differential revision: https://reviews.llvm.org/D26508
llvm-svn: 286940
Previously we did not support anything except "local: *", patch changes that.
Actually GNU rules of proccessing wildcards are more complex than that (http://www.airs.com/blog/archives/300):
There are 2 iteration for wildcards, at first iteration "*" wildcards are ignored and handled at second iteration.
Since we previously decided not to implement such complex rules,
I suggest solution that is implemented in this patch. So for "local: *" case nothing changes,
but if we have wildcarded locals,
they are processed before wildcarded globals.
This should fix several FreeBSD ports, one of them is jpeg-turbo-1.5.1 and
currently blocks about 5k of ports.
Differential revision: https://reviews.llvm.org/D26395
llvm-svn: 286713
The change in D26502 splits ReaderWriter.h, which contains the APIs
into both the BitReader and BitWriter libraries, into BitcodeReader.h
and BitcodeWriter.h.
Change lld uses to the appropriate split header, removing it
completely in one case where it wasn't needed.
llvm-svn: 286568
The disadvantage is that we use uint64_t instad of uint32_t for some
value in 32 bit files. The advantage is a substantially simpler code,
faster builds and less code duplication.
llvm-svn: 286414
In short the patch introduces support for linking object file conform
MIPS N32 ABI [1]. This ABI is similar to N64 ABI but uses 32-bit
pointer size.
The most non-trivial requirement of this ABI is one more relocation
packing format. N64 ABI puts multiple relocation type into the single
relocation record. The N32 ABI uses series of successive relocations
with the same offset for this purpose. In this patch, new function
`mergeMipsN32RelTypes` handle this case and "convert" N32 relocation to
the N64 relocation so the rest of the code keep unchanged.
For now, linker does not support series of relocations applied to sections
without SHF_ALLOC bit. Probably later I will add the support or insert
some sort of assert into the `relocateNonAlloc` routine to catch this
case.
[1] ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/MIPS-N32-ABI-Handbook.pdf
Differential revision: https://reviews.llvm.org/D26298
llvm-svn: 286052
Previously, it didn't support the character class, so we couldn't
eliminate the use fo llvm::Regex. Now that it is supported, we
can remove compileGlobPattern, which converts a glob pattern to
a regex.
This patch contains optimization for exact/prefix/suffix matches.
Differential Revision: https://reviews.llvm.org/D26284
llvm-svn: 285949
When we have SHT_GNU_versym section, it is should be associated with symbol table
section. Usually (and in out implementation) it is .dynsym.
In case when .dynsym is absent (due to broken object for example),
lld crashes in parseVerdefs() when accesses null pointer:
Versym = reinterpret_cast<const Elf_Versym *>(this->ELFObj.base() +
VersymSec->sh_offset) +
this->Symtab->sh_info;
DIfferential revision: https://reviews.llvm.org/D25553
llvm-svn: 285796
Previously, we have a lot of BumpPtrAllocators, but all these
allocators virtually have the same lifetime because they are
not freed until the linker finishes its job. This patch aggregates
them into a single allocator.
Differential revision: https://reviews.llvm.org/D26042
llvm-svn: 285452
We used to have one allocator per file, which reduces the advantage of
using an allocator in the first place.
This is a small speed up is most cases. The largest speedup was in
1.014X in chromium no-gc. The largest slowdown was scylla at 1.003X.
llvm-svn: 285205
Summary:
This uses one less word on 64-bit platforms, so should be a strict
improvement. This change also lets us get rid of llvm::CachedHash.
Reviewers: rafael, timshen
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25646
llvm-svn: 284502
Previouly bot was failing:
http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/413/steps/test-stage1-compiler/logs/stdio
Fixed possible segfault, so commit should bix the buildbot.
Initial commit message:
This is PR30312. Info from bug page:
Both of these symbols demangle to abc::abc():
_ZN3abcC1Ev
_ZN3abcC2Ev
(These would be abc's complete object constructor and base object constructor, respectively.)
however with "abc::abc()" in the version script only one of the two receives the symbol version.
Patch fixes that.
It uses testcase created by Ed Maste (D24306).
Differential revision: https://reviews.llvm.org/D24336
llvm-svn: 281605
Previously, all input files were owned by the symbol table.
Files were created at various places, such as the Driver, the lazy
symbols, or the bitcode compiler, and the ownership of new files
was transferred to the symbol table using std::unique_ptr.
All input files were then free'd when the symbol table is freed
which is on program exit.
I think we don't have to transfer ownership just to free all
instance at once on exit.
In this patch, all instances are automatically collected to a
vector and freed on exit. In this way, we no longer have to
use std::unique_ptr.
Differential Revision: https://reviews.llvm.org/D24493
llvm-svn: 281425
Something broked BBots:
281318 failed on step 9:
http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/413
r281317 built step 9 green:
http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/415
Initial revision commits were:
This is PR30312. Info from bug page:
Both of these symbols demangle to abc::abc():
_ZN3abcC1Ev
_ZN3abcC2Ev
(These would be abc's complete object constructor and base object constructor, respectively.)
however with "abc::abc()" in the version script only one of the two receives the symbol version.
Patch fixes that.
It uses testcase created by Ed Maste (D24306).
Differential revision: https://reviews.llvm.org/D24336
llvm-svn: 281411
This is PR30312. Info from bug page:
Both of these symbols demangle to abc::abc():
_ZN3abcC1Ev
_ZN3abcC2Ev
(These would be abc's complete object constructor and base object constructor, respectively.)
however with "abc::abc()" in the version script only one of the two receives the symbol version.
Patch fixes that.
It uses testcase created by Ed Maste (D24306).
Differential revision: https://reviews.llvm.org/D24336
llvm-svn: 281318
Implemented by building an ELF file in memory.
elf, default, and binary match gold behavior.
Differential Revision: https://reviews.llvm.org/D24060
llvm-svn: 281108
Fixed code that was not checked before on windows for me, because of testcases that are
disabled on that platform atm.
Inital commit message:
"[ELF] - Versionscript: do not treat non-wildcarded names as wildcards."
Previously we incorrectly handled cases when symbol name in extern c++ tag
was enclosed in quotes. Next case was treated as wildcard:
GLIBCXX_3.4 {
extern "C++" {
"aaa*"
}
But it should have not. Quotes around aaa here means that we should have do exact
name matching.
That is PR30268 which has name with pointer is interpreted as wildcard by lld:
extern "C++" {
"operator delete[](void*)";
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D24229
llvm-svn: 281049
Fixed code that was not checked by testcases that are disabled on windows.
Inital commit message:
"[ELF] - Versionscript: do not treat non-wildcarded names as wildcards."
Previously we incorrectly handled cases when symbol name in extern c++ tag
was enclosed in quotes. Next case was treated as wildcard:
GLIBCXX_3.4 {
extern "C++" {
"aaa*"
}
But it should have not. Quotes around aaa here means that we should have do exact
name matching.
That is PR30268 which has name with pointer is interpreted as wildcard by lld:
extern "C++" {
"operator delete[](void*)";
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D24229
llvm-svn: 281045
Previously we incorrectly handled cases when symbol name in extern c++ tag
was enclosed in quotes. Next case was treated as wildcard:
GLIBCXX_3.4 {
extern "C++" {
"aaa*"
}
But it should have not. Quotes around aaa here means that we should have do exact
name matching.
That is PR30268 which has name with pointer is interpreted as wildcard by lld:
extern "C++" {
"operator delete[](void*)";
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D24229
llvm-svn: 281038
Use std::regex instead of hand written matcher.
Patch based on code and ideas of Rui Ueyama.
Differential revision: https://reviews.llvm.org/D23829
llvm-svn: 280544
Before this lld was always creating common symbols itself. It worked,
but prevented them from being internalized when possible.
Now it preserves common symbols is the bitcode and they are internalized.
Fixes pr30184.
llvm-svn: 280242
Previously for extern keyword only names in quotes (exact match) was supported.
Patch adds support for wildcards, so next scripts can be handled properly:
LIBSAMPLE_1.0 {
global:
extern "C++" {
foo*;
};
};
Differential revision: https://reviews.llvm.org/D23794
llvm-svn: 280067
This patch is opposite to D19024, which made this symbols to be hidden by default.
Unfortunately FreeBSD loader wants to see
start_set_modmetadata_set/stop_set_modmetadata_set in the dynamic symbol table.
They were not placed there because had hidden visibility.
Patch makes them to have default visibility again.
Differential revision: https://reviews.llvm.org/D23552
llvm-svn: 279262
r275711 for "speedng up symbol version handling" was committed
by misunderstanding; the benchmark number was measured with
a debug build. The number with a release build didn't actually change.
This patch removes false optimizations added in that patch.
llvm-svn: 276267
Under MSVS 2015 I observed integral constant overflow warning when aggregate initialization was used
to init the bit field. Patch fixes that.
llvm-svn: 276118
In the last patch for --trace-symbol, I introduced a new symbol type
PlaceholderKind and store it to SymVector storage. It made all code
that iterates over SymVector to recognize and skip PlaceholderKind
symbols. I found that that's annoying.
In this patch, I removed PlaceholderKind and stop storing them to SymVector.
Now the information whether a symbol is being watched by --trace-symbol
is stored to the Symtab hash table.
llvm-svn: 275747
SymVector contains all symbols, so we can iterate either Symtab or SymVector
to visit all symbols. Iterating over SymVector makes the next change for
--trace-symbol possible.
llvm-svn: 275746
--trace-symbol is a command line option to watch a symbol.
Previosly, we looked up a hash table for a new symbol if the
option is given. Any code that looks up a hash table for each
symbol is expensive because the linker handles a lot of symbols.
In our design, we look up a hash table strictly only once
for a symbol, so --trace-symbol was an exception.
This patch improves efficiency of the option by merging the
hash table into the symbol table.
Instead of looking up a separate hash table with a string,
this patch sets `Traced` flag to symbols specified by --trace-symbol.
So, if you insert a symbol and get a symbol with `Traced` flag on,
you know that you need to print out a log message for the symbol.
This is nearly zero cost.
llvm-svn: 275716
Versions can be assigned to symbols in two different ways.
One is the usual version scripts, and the other is special
symbol suffix '@'. If a symbol contains '@', the string after
that is considered to specify a version name.
Previously, we look for '@' for all symbols.
Anything that works on every symbol can be expensive because
the linker has to handle a lot of symbols. The search for '@'
was not an exception.
In this patch, I made two optimizations.
The first optimization is to handle '@' only when at least one
version is defined. If no versions are defined, no versions can
be assigned to any symbols, so it's waste of time to search for '@'.
The second optimization is to scan only suffixes of symbol names
instead of entire symbol names. Symbol names can be very long, but
symbol versions are usually short, so scanning entire symbol names
is waste of time, too.
There are some error cases which we no longer be able to detect
with this patch. I don't think it's a major drawback because they
are minor errors. Speed is more important.
This change improves LLD with debug info self-link time from
6.6993 seconds to 6.3426 seconds (or -5.3%).
Differential Revision: https://reviews.llvm.org/D22433
llvm-svn: 275711
Previously, each subclass of SymbolBody had a pointer to a source
file from which it was created. So, there was no single way to get
a source file for a symbol. We had getSourceFile<ELFT>(), but the
function was a bit inconvenient as it's a template.
This patch makes SymbolBody have a pointer to a source file.
If a symbol is not created from a file, the pointer has a nullptr.
llvm-svn: 275701
The identifier `Version` was used too often in the code to handle
symbol versions. The struct that contains version definitions is
named `Version`. Local variables for version ID are named `Version`.
Local varaible for version string are named `Version`.
This patch give them different names.
llvm-svn: 275673
Patch implements 'extern' version script tag.
Currently only values in quotes(") are supported.
Matching of externs is performed in the same pass as exact match of globals.
Differential revision: http://reviews.llvm.org/D21930
llvm-svn: 275257
That helps to avoid expressions like I + 2 in code
that assigns version number to symbols.
Change was suggested by Rui Ueyama.
Differential revision: http://reviews.llvm.org/D22086
llvm-svn: 275159
When building executable usually version script is absent.
Before this patch error was shown in the case when
symbol name contained version and there was no script to match it.
Instead of error out patch allows
to create new version declaration in this case and use it.
gnu linkers do the same.
That is PR28359.
Differential revision: http://reviews.llvm.org/D21890
llvm-svn: 274828
Symbols.cpp contains functions to handle ELF symbols.
demangle() function is essentially a function to work on a
string rather than on an ELF symbol. So Strings.cpp is a
better place to put that function.
This change also make demangle to demangle symbols unconditionally.
Previously, it demangled symbols only when Config->Demangle is true.
llvm-svn: 274804
Previously we had incorrect logic here. Imagine we would have the next script:
LIBSAMPLE_1.0
{
global:
a_2;
local:
*;
};
LIBSAMPLE_2.0
{
global:
a*;
};
According to previous logic it would assign version 1 to a_2 and then
would try to reassign it to version 2 because of applying wildcard a*.
And show a warning about that.
Generally Ian Lance Tailor wrote about next rules that should be applied:
(http://www.airs.com/blog/archives/300)
Here are the current rules for gold:
"If there is an exact match for the mangled name, we use it. If there is more than one exact match, we give a warning, and we use the first tag in the script which matches. If a symbol has an exact match as both global and local for the same version tag, we give an error.
Otherwise, we look for an extern C++ or an extern Java exact match. If we find an exact match, we use it. If there is more than one exact match, we give a warning, and we use the first tag in the script which matches. If a symbol has an exact match as both global and local for the same version tag, we give an error.
Otherwise, we look through the wildcard patterns, ignoring “*” patterns. We look through the version tags in reverse order. For each version tag, we look through the global patterns and then the local patterns. We use the first match we find (i.e., the last matching version tag in the file).
Otherwise, we use the “*” pattern if there is one. We give a warning if there are multiple “*” patterns."
Patch makes wildcard matching to be in revered order and to follow after the regular naming matching.
Differential revision: http://reviews.llvm.org/D21894
llvm-svn: 274739
Previously BC files were not checked for the same platform etc,
That lead to confusing error "Invalid section header entry size (e_shentsize) in ELF header" when
mixing files for different architectures.
Patch fixes PR28324.
Differential revision: http://reviews.llvm.org/D21832
llvm-svn: 274113
We allowed the function to return a vector that contains nullptrs
which is weird. This change makes the function to return only
defined symbols.
Differential Revision: http://reviews.llvm.org/D21828
llvm-svn: 274099
Example:
VERSION_1.0 {
global: foo*;
local: *; }
now correctly matches all the symbols which name starts with
`foo`.
Differential Revision: http://reviews.llvm.org/D21732
llvm-svn: 274091
Previously, we initialized Config->EKind and Config->EMachine when
we instantiate ELF objects. That was not an ideal location to do that
because the logic was buried too deep inside a concrete logic.
This patch moves the code to the driver so that the initialization
becomes explicit.
Differential Revision: http://reviews.llvm.org/D21784
llvm-svn: 274089
t is possible to create new version of symbol instead of depricated one
using combination of version script and asm commands. For example:
__asm__(".symver b_1,b@LIBSAMPLE_1.0");
int b_1() { return 10; }
__asm__(".symver b_2,b@@LIBSAMPLE_2.0");
int b_2() { return 20; }
This code makes b_2() to be default implementation for b().
b_1() is used for compatibility with binaries compiled against
library of older version LIBSAMPLE_1.0.
This patch implements support for above functionality in lld.
Differential revision: http://reviews.llvm.org/D21681
llvm-svn: 274002
Option checks for cases where a version script explicitly lists
a symbol, but the symbol is not defined and errors out such
cases if any.
Differential revision: http://reviews.llvm.org/D21745
llvm-svn: 273998
Patch by Shridhar Joshi.
This option provides names of all the link time modules which define and
reference symbols requested by user. This helps to speed up application
development by detecting references causing undefined symbols.
It also helps in detecting symbols being resolved to wrong (unintended)
definitions in case of applications containing multiple definitions for
same symbols with different types, bindings.
Implements PR28226.
llvm-svn: 273536
For next version script:
VER1{
global:
a;
};
VER2{
global:
a;
};
gold would produce warning like:
"warning: using 'VER1' as version for 'a' which is also named in version 'VER2' in script."
Documentation also says we do not want this duplications (https://people.freebsd.org/~deischen/symver/library_versioning.txt):
"Note that you do not want to duplicate symbols in the map file. The .symver directives are all that is required to add compatibility
symbols into old versions."
This patch restricts such mixing and makes lld to produce error in this case.
Differential revision: http://reviews.llvm.org/D21555
llvm-svn: 273396
With fix:
-soname flag was not set in testcase. Hash calculated for base def was different on local
and bot machines because filename fos used for calculating.
Initial commit message:
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).
This patch allows programs that using simple scripts to link and run.
Differential revision: http://reviews.llvm.org/D21018
llvm-svn: 273152
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).
This patch allows programs that using simple scripts to link and run.
Differential revision: http://reviews.llvm.org/D21018
llvm-svn: 273143
Doing that in an anonymous version is a bit silly, but this opens the
way for supporting it in general.
Since we don't support actual versions, for now we just disable the
version script if we detect that it is missing a local.
llvm-svn: 273000
This should never happen with correct programs, but it is trivial
write a testcase where lld would crash or report duplicated
symbols. We now behave like when an archive is used and include the
file only once.
llvm-svn: 272724
We can now use this to decide whether to emit a verneed during the final
pass over the symbols. We were previously wrongly creating a verneed entry
in the case where all references to a DSO's symbols were weak.
In a future change we may also want to use the used bit to control whether
shared symbols are preemptible and appear in the dynsym. This seems a little
tricky to do at the moment because isNeeded() is templated.
The only other functional change here is that we emit a DT_NEEDED for DSOs
whose symbols are all preempted by objects that appear later in the link. But
that doesn't seem too important to me.
Differential Revision: http://reviews.llvm.org/D21171
llvm-svn: 272282
Introduce a special symbol type to indicate that we have not yet seen a type
for the symbol, so we should not report TLS mismatches for that symbol.
Differential Revision: http://reviews.llvm.org/D19836
llvm-svn: 268411
Weak undefined symbols resolve to the image base. This is a little strange,
but it allows us to link function calls to such symbols. Normally such a
call will be guarded with a comparison, which will load a zero from the GOT.
There's one example of such a function call in crti.o in Linux's CRT.
As part of this change, I also needed to make the synthetic start and end
symbols image base relative in the case where their sections were empty,
so that PC-relative references to those symbols would continue to work.
Differential Revision: http://reviews.llvm.org/D19844
llvm-svn: 268350
This patch increases the size of Undefined by the size of a pointer,
but it wouldn't actually increase the size of memory that LLD uses
because we are not allocating the exact size but the size of the
largest SymbolBody.
llvm-svn: 268310
This patch implements a new design for the symbol table that stores
SymbolBodies within a memory region of the Symbol object. Symbols are mutated
by constructing SymbolBodies in place over existing SymbolBodies, rather
than by mutating pointers. As mentioned in the initial proposal [1], this
memory layout helps reduce the cache miss rate by improving memory locality.
Performance numbers:
old(s) new(s)
Without debug info:
chrome 7.178 6.432 (-11.5%)
LLVMgold.so 0.505 0.502 (-0.5%)
clang 0.954 0.827 (-15.4%)
llvm-as 0.052 0.045 (-15.5%)
With debug info:
scylla 5.695 5.613 (-1.5%)
clang 14.396 14.143 (-1.8%)
Performance counter results show that the fewer required indirections is
indeed the cause of the improved performance. For example, when linking
chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and
instructions per cycle increases from 0.78 to 0.83. We are also executing
many fewer instructions (15,516,401,933 down to 15,002,434,310), probably
because we spend less time allocating SymbolBodies.
The new mechanism by which symbols are added to the symbol table is by calling
add* functions on the SymbolTable.
In this patch, I handle local symbols by storing them inside "unparented"
SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating
these SymbolBodies, we can probably do that separately.
I also removed a few members from the SymbolBody class that were only being
used to pass information from the input file to the symbol table.
This patch implements the new design for the ELF linker only. I intend to
prepare a similar patch for the COFF linker.
[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html
Differential Revision: http://reviews.llvm.org/D19752
llvm-svn: 268178
There seems to be no reason to keep st_size of undefined symbols.
This patch removes the member for it. This patch will change outputs
in cases that undefined symbols are copied to output, but I think
this is unimportant.
Differential Revision: http://reviews.llvm.org/D19574
llvm-svn: 267826
The semantics of the -u flag are to load the lazy symbol named by the flag. We
were previously relying on this behavior falling out of symbol resolution
against a synthetic undefined symbol, but that didn't quite give us the
correct behavior, so we needed a flag to mark symbols created with -u so
we could treat them specially in the writer. However, it's simpler and less
error prone to implement the required behavior directly and remove the flag.
This fixes an issue where symbols loaded with -u would receive hidden
visibility even when the definition in an object file had wider visibility.
Differential Revision: http://reviews.llvm.org/D19560
llvm-svn: 267639
This remove a fixme, cleans up the weak undef interaction with archives and
lets us keep weak undefs still weak if they resolve to shared.
llvm-svn: 267555
This patch only implements support for version scripts of the form:
{ [ global: symbol1; symbol2; [...]; symbolN; ] local: *; };
No wildcards are supported, other than for the local entry. Symbol versioning
is also not supported.
It works by introducing a new Symbol flag which tracks whether a symbol
appears in the global section of a version script.
This patch also simplifies the logic in SymbolBody::isPreemptible(), and
teaches it to handle the case where symbols with default visibility in DSOs
do not appear in the dynamic symbol table because of a version script.
Fixes PR27482.
Differential Revision: http://reviews.llvm.org/D19430
llvm-svn: 267208
These are properties of a symbol name, rather than a particular instance
of a symbol in an object file. We can simplify the code by collecting these
properties in Symbol.
The MustBeInDynSym flag has been renamed ExportDynamic, as its semantics
have been changed to be the same as those of --dynamic-list and
--export-dynamic-symbol, which do not cause hidden symbols to be exported.
Differential Revision: http://reviews.llvm.org/D19400
llvm-svn: 267183
Parallelism level can be chosen using the new --lto-jobs=K option
where K is the number of threads used for CodeGen. It currently
defaults to 1.
llvm-svn: 266484
We never need to iterate over the K,V pairs, so we can avoid copying the
key as MapVector does.
This is a small speedup on most benchmarks.
llvm-svn: 266364
This patch implements the --dynamic-list option, which adds a list of
global symbol that either should not be bounded by default definition
when creating shared libraries, or add in dynamic symbol table in the
case of creating executables.
The patch modifies the ScriptParserBase class to use a list of Token
instead of StringRef, which contains information if the token is a
quoted or unquoted strings. It is used to use a faster search for
exact match symbol name.
The input file follow a similar format of linker script with some
simplifications (it does not have scope or node names). It leads
to a simplified parser define in DynamicList.{cpp,h}.
Different from ld/gold neither glob pattern nor mangled names
(extern 'C++') are currently supported.
llvm-svn: 266227
Previously each archive file was reported no matter were it's member used or not,
like:
lib/libLLVMSupport.a
Now lld prints line for each used internal file, like:
lib/libLLVMSupport.a(lib/Support/CMakeFiles/LLVMSupport.dir/StringSaver.cpp.o)
lib/libLLVMSupport.a(lib/Support/CMakeFiles/LLVMSupport.dir/Host.cpp.o)
lib/libLLVMSupport.a(lib/Support/CMakeFiles/LLVMSupport.dir/ConvertUTF.c.o)
That should be consistent with what gold do.
This fixes PR27243.
Differential revision: http://reviews.llvm.org/D19011
llvm-svn: 266220
This simplifies the code by allowing us to remove the visibility argument
to functions that create synthetic symbols.
The only functional change is that the visibility of the MIPS "_gp" symbol
is now hidden. Because this symbol is defined in every executable or DSO, it
would be difficult to observe a visibility change here.
Differential Revision: http://reviews.llvm.org/D19033
llvm-svn: 266208
Now MustBeInDynSym is only true if the symbol really must be in the
dynamic symbol table.
IsUsedInRegularObj is only true if the symbol is used in a .o or -u. Not
a .so or a .bc.
A benefit is that this is now done almost entirilly during symbol
resolution. The only exception is copy relocations because of aliases.
This includes a small fix in that protected symbols in .so don't force
executable symbols to be exported.
This also opens the way for implementing internalize for -shared.
llvm-svn: 265826
start-lib and end-lib are options to link object files in the same
semantics as archive files. If an object is in start-lib and end-lib,
the object is linked only when the file is needed to resolve
undefined symbols. That means, if an object is in start-lib and end-lib,
it behaves as if it were in an archive file.
In this patch, I introduced a new notion, LazyObjectFile. That is
analogous to Archive file type, but that works for a single object
file instead of for an archive file.
http://reviews.llvm.org/D18814
llvm-svn: 265710
We have to differentiate undefined symbols from bitcode and undefined
symbols from other sources.
Undefined symbols from bitcode should not inhibit the symbol being
internalized. Undefined symbols from other sources should.
llvm-svn: 265536
Make sure to copy the MustBeInDynSym field when replacing shared symbols with
bitcode symbols, and when replacing bitcode symbols with regular symbols
in addCombinedLtoObject. Fixes interposition of DSO symbols with bitcode
symbols in the main executable.
Differential Revision: http://reviews.llvm.org/D18780
llvm-svn: 265371
Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.
Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.
There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.
The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.
As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.
In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.
llvm-svn: 265293
If a symbol is defined in an archive, when we replace its body
the isUsedInRegularObj wasn't set correctly. Internalize makes
its decision based on that bit so we ended up internalizing
symbols that we shouldn't (because they're referenced).
This should fix. Thanks to Peter and Rafael for discussion
and help diagnosing the issue!
Found during LTO of unittests.
llvm-svn: 265208
Some optimizations, e.g. SimplifyLibCalls, can replace functions with
others as part of the lowering, e.g. printf => puts.
The new symbols don't have the isUsedInRegularObj flag set so they
don't get included in the final symbol table (and dynamic symbol
table), and the dynamic linker gets confused. Include them as a fix.
Differential Revision: http://reviews.llvm.org/D18357
llvm-svn: 264688
The code for LTO has been growing, so now is probably a good time to
move it to its own file. SymbolTable.cpp is for symbol table, and
because compiling bitcode files are semantically not a part of
symbol table, this is I think a good thing to do.
http://reviews.llvm.org/D18370
llvm-svn: 264091
-pie
--pic-executable
Create a position independent executable. This is currently only
supported on ELF platforms. Position independent executables are
similar to shared libraries in that they are relocated by the
dynamic linker to the virtual address the OS chooses for them
(which can vary between invocations). Like normal dynamically
linked executables they can be executed and symbols defined in the
executable cannot be overridden by shared libraries.
Differential revision: http://reviews.llvm.org/D18183
llvm-svn: 263693
which was reverted because included
unrelative changes by mistake.
Original commit message:
[ELF] - Change all messages to lowercase to be consistent.
That is directly opposite to http://reviews.llvm.org/D18045,
which was reverted.
This patch changes all messages to start from lowercase letter if
they were not before.
That is done to be consistent with clang.
Differential revision: http://reviews.llvm.org/D18085
llvm-svn: 263337
That is directly opposite to http://reviews.llvm.org/D18045,
which was reverted.
This patch changes all messages to start from lowercase letter if
they were not before.
That is done to be consistent with clang.
Differential revision: http://reviews.llvm.org/D18085
llvm-svn: 263252
In lld we usually avoid hash lookups. In addition to that, IR names are
not fully mangled, so it is best to avoid using them whenever possible.
llvm-svn: 263248
It was discussed to make all messages be
lowercase to be consistent with clang.
(also reverts the r263128 which fixed
build bot fail after r263125)
Original commit message:
[ELF] - Consistent spelling for error/warning messages
Previously error and warnings were not consistent in lld.
Some of them started from lowercase letter, others from
uppercase. Also there was one or two which had a dot at the end.
This patch changes all messages to start from uppercase letter if
they were not before.
Differential revision: http://reviews.llvm.org/D18045
llvm-svn: 263240
Summary:
More generally, appending linkage is a special case that we don't want
to create a SymbolBody for.
Reviewers: rafael, ruiu
Subscribers: Bigcheese, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D18012
llvm-svn: 263179
Previously error and warnings were not consistent in lld.
Some of them started from lowercase letter, others from
uppercase. Also there was one or two which had a dot at the end.
This patch changes all messages to start from uppercase letter if
they were not before.
Differential revision: http://reviews.llvm.org/D18045
llvm-svn: 263125
Summary:
This implements another part of -save-temps.
After this, the only remaining part is dumping the optimized bitcode. But
currently LLD's LTO doesn't have a non-intrusive place to put this.
Eventually we probably will and it will make sense to add it then.
Reviewers: ruiu, rafael
Subscribers: joker.eph, Bigcheese, llvm-commits
Differential Revision: http://reviews.llvm.org/D18009
llvm-svn: 263070
Summary:
This is useful for debugging issues with LTO.
The option follows the analogous option in ld64 and the gold plugin (per
Rafael's suggestion).
For starters, this only dumps the combined bitcode file.
In a future patch I will add dumping for the .o file.
The naming of the output follows ld64's convention which is slightly more
consistent IMO (consistent `.lto.<extension>` for all the files).
Reviewers: rafael, ruiu
Subscribers: joker.eph, Bigcheese, llvm-commits
Differential Revision: http://reviews.llvm.org/D18006
llvm-svn: 263055
Summary:
At the very least we hit
Assertion failed: (((Flags & RF_HaveUnmaterializedMetadata) || Node->isResolved()) && "Unexpected unresolved node"), function MapMetadataImpl, file /Users/Sean/pg/llvm/lib/Transforms/Utils/ValueMapper.cpp, line 375.
on the included test case.
We currently do things like parse the module twice to keep the
implementation minimal. I think it makes sense to add start with eager
loading for similar reasons.
Reviewers: rafael
Subscribers: ruiu, Bigcheese, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D17982
llvm-svn: 263045
Get rid of few accessors in that class, and replace
them with direct fields access.
Differential revision: http://reviews.llvm.org/D17879
llvm-svn: 262796
This has a few advantages:
* If lld selected a non bitcode symbol, be the bitcode GV is not merged
* lib/Linker is not redoing symbol resolution.
llvm-svn: 262773
SymbolBody constructor and friends take isFunc and isTLS boolean arguments.
ELF symbols have already a type so than be easily passed as argument.
If we want to support another type, this scheme is not good enough, that is,
the current code logic would require passing another `bool isObject` around.
Up to two argument, this stretching exercise was a little bit goofy but
still acceptable, but with more types to support, is just too much, IMHO.
Change the code so that the type is passed instead.
Differential Revision: http://reviews.llvm.org/D17871
llvm-svn: 262684
A weak undefined should not fetch archive members, so we have to keep
the Lazy symbol.
That means the lazy symbol has to encode information about the original
weak undef.
Fixes pr25762.
llvm-svn: 261591
The variable was marking various cases where a symbol must be included
in the dynamic symbol table. Being used by a dynamic relocation was only
one of them.
llvm-svn: 259889
If object files are drawn from archive files, the error message should
be something like "conflict symbols in foo.a(bar.o) and baz.o" instead
of "conflict symbols in bar.o and baz.o". This patch implements that.
llvm-svn: 259475
In many situations, we don't want to exit at the first error even in the
process model. For example, it is better to report all undefined symbols
rather than reporting the first one that the linker picked up randomly.
In order to handle such errors, we don't need to wrap everything with
ErrorOr (thanks for David Blaikie for pointing this out!) Instead, we
can set a flag to record the fact that we found an error and keep it
going until it reaches a reasonable checkpoint.
This idea should be applicable to other places. For example, we can
ignore broken relocations and check for errors after visiting all relocs.
In this patch, I rename error to fatal, and introduce another version of
error which doesn't call exit. That function instead sets HasError to true.
Once HasError becomes true, it stays true, so that we know that there
was an error if it is true.
I think introducing a non-noreturn error reporting function is by itself
a good idea, and it looks to me that this also provides a gradual path
towards lld-as-a-library (or at least embed-lld-to-your-program) without
sacrificing code readability with lots of ErrorOr's.
http://reviews.llvm.org/D16641
llvm-svn: 259069
On MIPS O32 ABI, _gp_disp is a magic symbol designates offset between
start of function and gp pointer into GOT. To make seal with such symbol
we add new method addIgnoredStrong(). It adds ignored symbol with global
binding to prevent the symbol substitution. The addIgnored call is not
enough here because this call adds a weak symbol which might be
substituted by symbol from shared library.
Differential Revision: http://reviews.llvm.org/D16084
llvm-svn: 257449
We used to have code in SymbolTable constructor to add entry symbols, etc.
That code has been moved to Driver. We can remove the constructor.
llvm-svn: 257214
For historical reasons, some add* functions for SymbolTable returns a
pointer to a SymbolBody, while some are not. This patch is to make them
consistently return a pointer to a newly added symbol.
llvm-svn: 257211
In this patch, all symbols are resolved normally and then wrap options
are applied. Renaming is implemented by mutating `Body` pointers of
Symbols. (As a result, Symtab.find(SymbolName)->getName() may return
a string that's different from SymbolName, but that is by design.
I designed the symbol and the symbol table to allow this kind of
operations.)
http://reviews.llvm.org/D15896
llvm-svn: 257075
Unlike ObjectFile or ArchiveFile, SharedFile had two parse functions,
parseSoName() and parse(). parse must have been called after parseSoName,
but that requirement was not obvious from their names. (So it looked
like you could call parse() on a shared object file right away.)
This patch rename parseRest. It is now obvious that there's no single
parse function for the shared object file.
llvm-svn: 256898
There are 3 symbol types that a .bc can provide during lto: defined,
undefined, common.
Defined and undefined symbols have already been refactored. I was
working on common and noticed that absolute symbols would become an
oddity: They would be the only symbol type present in a .o but not in
a.bc.
Looking a bit more, other than the special section number they were only
used for special rules for computing values. In that way they are
similar to TLS, and we don't have a DefinedTLS.
This patch deletes it. With it we have a reasonable rule of the thumb
for having a symbol kind: It exists if it has special resolution
semantics.
llvm-svn: 256383
I am working on adding LTO support to the new ELF lld.
In order to do that, it will be necessary to represent defined and
undefined symbols that are not from ELF files. One way to do it is to
change the symbol hierarchy to look like
Defined : SymbolBody
Undefined : SymbolBody
DefinedElf<ELFT> : Defined
UndefinedElf<ELFT> : Undefined
Another option would be to use bogus Elf_Sym, but I think that is
getting a bit too hackish.
This patch does the Undefined/UndefinedElf. Split. The next one
will do the Defined/DefinedElf split.
llvm-svn: 256289
The function was used only in Writer.cpp and did not depend on SymbolTable.
There is no reason to have that function in SymbolTable.cpp.
llvm-svn: 255850
addELFFile was called only from addFile, and what it did was actually
just adding a file to the symbol table. There seems to be no reason
to separate the two.
llvm-svn: 255839
The `_gp_disp` is a magic symbol designates offset between start of
function and gp pointer into GOT. Only `R_MIPS_HI16` and `R_MIPS_LO16`
relocations are permitted with `_gp_disp`. The patch adds the `_gp_disp`
as an ignored symbol and adjusts symbol value before call the `relocateOne`
for `R_MIPS_HI16/LO16` relocations.
Differential Revision: http://reviews.llvm.org/D15480
llvm-svn: 255768
This patch implements R_MIPS_GOT16 relocation for global symbols in order to
generate some entries in GOT. Only reserved and global entries are supported
for now. For the detailed description about GOT in MIPS, see "Global Offset
Table" in Chapter 5 in the followin document:
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf
In addition, the platform specific symbol "_gp" is added, see "Global Data
Symbols" in Chapter 6 in the aforementioned document.
Differential revision: http://reviews.llvm.org/D14211
llvm-svn: 252275
This patch is to use ELFT instead of Is64Bits to template OutputSection
and its subclasses. This increases code size slightly because it creates
two identical functions for some classes, but that's only 20 KB out of
33 MB, so it's negligible.
This is as per discussion with Rafael. He's not fan of the idea but OK
with this. We'll revisit later to this topic.
llvm-svn: 250466
If a section name is valid as a C identifier (which is rare because of
the leading '.'), linkers are expected to define __start_<secname> and
__stop_<secname> symbols. They are at beginning and end of the section,
respectively. This is not requested by the ELF standard, but GNU ld and
gold provide this feature.
llvm-svn: 250432
"finalize" does not give a hint about what that function is actually
going to do. This patch make it more specific by renaming scanShlibUndefined.
Also add a comment that we basically ignore undefined symbols in DSOs except
this function.
llvm-svn: 250191
BSD's DSO files have undefined symbol "__progname" which is defined
in crt1.o. On that system, both user programs and system shared
libraries depend on each other.
In general, we need to put symbols defined by user programs which are
referenced by shared libraries to user program's .dynsym.
http://reviews.llvm.org/D13637
llvm-svn: 250176
SymbolTable was not a right place for initialization. We had to do that
because Driver didn't know what type of ELF objects are being handled.
We taught Driver that, so we can now move this code to Driver.
llvm-svn: 249904
SymbolTable was not a template class. Instead we had switch-case-based
type dispatch to call desired functions. We had to do that because
SymbolTable was created before we know what ELF type objects had been
passed.
Every time I tried to add a new function to the symbol table, I had to
define a dispatcher which consist of a single switch statement.
It also brought an restriction what the driver can do. For example,
we cannot add undefined symbols before any files are added to the symbol
table. That's because no symbols can be added until the symbol table
knows the ELF type, but when it knows about that, it's too late.
In this patch, the driver makes a decision on what ELF type objects
are being handled. Then the driver creates a SymbolTable object for
an appropriate ELF type.
http://reviews.llvm.org/D13544
llvm-svn: 249902
Parse and apply emulation given with -m option.
Check input files to match ELF type and machine architecture provided with -m.
Differential Revision: http://reviews.llvm.org/D13055
llvm-svn: 249529
We were still fetching them when the archive was seen first.
We should experiment with just letting lazy symbols get to compare, it
might be cleaner for ELF.
llvm-svn: 249417
This is a case that requires --start-group --end-group with regular ELF
linkers. Fortunately it is still possible to handle it with lazy symbols without
taking a second look at archives.
Thanks to Michael Spencer for the bug report.
llvm-svn: 249406
Add symbol specified with -u as undefined which may cause additional
object files from archives to be linked into the resulting binary.
Differential Revision: http://reviews.llvm.org/D13345
llvm-svn: 249295
Summary:
If --whole-archive is used, all symbols from the following archives are added to the output. --no-whole-archive restores default behavior. These switches can be used multiple times.
NB. We have to keep an ArchiveFile instance within SymbolTable even if --whole-archive mode is active since it can be a thin archive which contains just names of external files. In that case actual memory buffers for the archive members will be stored within the File member of ArchiveFile class.
Reviewers: rafael, ruiu
Subscribers: grimar, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D13286
llvm-svn: 249045
Besides a trivial MIPS support the patch introduces new TargetInfo class
member getDefaultEntry() to override default name of the entry symbol.
MIPS uses __start for that.
Differential Revision: http://reviews.llvm.org/D13227
llvm-svn: 248779