Commit Graph

227 Commits

Author SHA1 Message Date
Eugene Leviant 3e6b027705 [ELF] Allows setting section for common symbols in linker script
llvm-svn: 277023
2016-07-28 19:24:13 +00:00
Rui Ueyama dace838138 Simplify symbol version handling.
r275711 for "speedng up symbol version handling" was committed
by misunderstanding; the benchmark number was measured with
a debug build. The number with a release build didn't actually change.
This patch removes false optimizations added in that patch.

llvm-svn: 276267
2016-07-21 13:13:21 +00:00
George Rimar b084125dbf [ELF] - Fixed integral constant overflow warning under MSVS 2015. NFC.
Under MSVS 2015 I observed integral constant overflow warning when aggregate initialization was used
to init the bit field. Patch fixes that.

llvm-svn: 276118
2016-07-20 14:26:48 +00:00
Rui Ueyama e33579072d Remove SymbolBody::PlaceholderKind.
In the last patch for --trace-symbol, I introduced a new symbol type
PlaceholderKind and store it to SymVector storage. It made all code
that iterates over SymVector to recognize and skip PlaceholderKind
symbols. I found that that's annoying.

In this patch, I removed PlaceholderKind and stop storing them to SymVector.
Now the information whether a symbol is being watched by --trace-symbol
is stored to the Symtab hash table.

llvm-svn: 275747
2016-07-18 01:35:00 +00:00
Rui Ueyama d6328526ba Iterate over SymVector instead of Symtab hash table.
SymVector contains all symbols, so we can iterate either Symtab or SymVector
to visit all symbols. Iterating over SymVector makes the next change for
--trace-symbol possible.

llvm-svn: 275746
2016-07-18 01:34:57 +00:00
Rui Ueyama 69c778c084 Implement almost-zero-cost --trace-symbol.
--trace-symbol is a command line option to watch a symbol.
Previosly, we looked up a hash table for a new symbol if the
option is given. Any code that looks up a hash table for each
symbol is expensive because the linker handles a lot of symbols.
In our design, we look up a hash table strictly only once
for a symbol, so --trace-symbol was an exception.

This patch improves efficiency of the option by merging the
hash table into the symbol table.

Instead of looking up a separate hash table with a string,
this patch sets `Traced` flag to symbols specified by --trace-symbol.
So, if you insert a symbol and get a symbol with `Traced` flag on,
you know that you need to print out a log message for the symbol.
This is nearly zero cost.

llvm-svn: 275716
2016-07-17 17:50:09 +00:00
Rui Ueyama 2a7c1c1507 Print out file names for common symbols for --trace-symbol.
Previously, there was no way to get a file name for a DefinedCommon
symbol. This patch adds it.

llvm-svn: 275712
2016-07-17 17:36:22 +00:00
Rui Ueyama 663b8c2769 Handle versioned symbols efficiently.
Versions can be assigned to symbols in two different ways.
One is the usual version scripts, and the other is special
symbol suffix '@'. If a symbol contains '@', the string after
that is considered to specify a version name.

Previously, we look for '@' for all symbols.

Anything that works on every symbol can be expensive because
the linker has to handle a lot of symbols. The search for '@'
was not an exception.

In this patch, I made two optimizations.

The first optimization is to handle '@' only when at least one
version is defined. If no versions are defined, no versions can
be assigned to any symbols, so it's waste of time to search for '@'.

The second optimization is to scan only suffixes of symbol names
instead of entire symbol names. Symbol names can be very long, but
symbol versions are usually short, so scanning entire symbol names
is waste of time, too.

There are some error cases which we no longer be able to detect
with this patch. I don't think it's a major drawback because they
are minor errors. Speed is more important.

This change improves LLD with debug info self-link time from
6.6993 seconds to 6.3426 seconds (or -5.3%).

Differential Revision: https://reviews.llvm.org/D22433

llvm-svn: 275711
2016-07-17 17:23:17 +00:00
Rui Ueyama 434b56179e Add a pointer to a source file to SymbolBody.
Previously, each subclass of SymbolBody had a pointer to a source
file from which it was created. So, there was no single way to get
a source file for a symbol. We had getSourceFile<ELFT>(), but the
function was a bit inconvenient as it's a template.

This patch makes SymbolBody have a pointer to a source file.
If a symbol is not created from a file, the pointer has a nullptr.

llvm-svn: 275701
2016-07-17 03:11:46 +00:00
Rui Ueyama 818bb2f8dc Remove redundant namespace specifiers.
llvm-svn: 275694
2016-07-16 18:55:47 +00:00
Rui Ueyama 962b277d2d Resurrect code that was lost in conflicting commits.
llvm-svn: 275693
2016-07-16 18:45:25 +00:00
George Rimar 50dcece2a0 Recommit r275257 "[ELF] - Implement extern "c++" version script tag"
BSD toolchain contains a bug:
https://sourceforge.net/p/elftoolchain/tickets/491/

In short demangler works differently, fix was to update the testcase.
It should fix the FreeBSD bot failture:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/19432/steps/test_lld/logs/stdio

Original commit message was:
[ELF] - Implement extern "c++" version script tag

Patch implements 'extern' version script tag.
Currently only values in quotes(") are supported.

Matching of externs is performed in the same pass as exact match of globals.

Differential revision: http://reviews.llvm.org/D21930

llvm-svn: 275682
2016-07-16 12:26:39 +00:00
Rui Ueyama af469d47e9 Rename SymbolVersions VersionDefinitions.
SymbolVersions sounds like it had versions for a symbol, so rename it.

llvm-svn: 275674
2016-07-16 04:09:27 +00:00
Rui Ueyama bc94dd9b28 Rename Version VersionDefinition.
The identifier `Version` was used too often in the code to handle
symbol versions. The struct that contains version definitions is
named `Version`. Local variables for version ID are named `Version`.
Local varaible for version string are named `Version`.

This patch give them different names.

llvm-svn: 275673
2016-07-16 04:02:00 +00:00
Rui Ueyama c9b4c073b2 Simplify. NFC.
llvm-svn: 275670
2016-07-16 03:12:16 +00:00
Rui Ueyama 2506866ff6 Simplify default symbol version management. NFC.
llvm-svn: 275669
2016-07-16 03:08:26 +00:00
George Rimar dd64bb38bd Reverted r275257 "[ELF] - Implement extern "c++" version script tag"
It broke build bots:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/8204
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/19432

llvm-svn: 275258
2016-07-13 08:19:04 +00:00
George Rimar e05103ea11 [ELF] - Implement extern "c++" version script tag
Patch implements 'extern' version script tag.
Currently only values in quotes(") are supported.

Matching of externs is performed in the same pass as exact match of globals.

Differential revision: http://reviews.llvm.org/D21930

llvm-svn: 275257
2016-07-13 07:46:00 +00:00
George Rimar 7899d48dff [ELF] - Add Id field to Version struct.
That helps to avoid expressions like I + 2 in code
that assigns version number to symbols.

Change was suggested by Rui Ueyama.

Differential revision: http://reviews.llvm.org/D22086

llvm-svn: 275159
2016-07-12 07:44:40 +00:00
George Rimar c61bcd80af [ELF] - Do not error out when version declaration not found when building executable.
When building executable usually version script is absent.
Before this patch error was shown in the case when
symbol name contained version and there was no script to match it.
  Instead of error out patch allows
to create new version declaration in this case and use it.
gnu linkers do the same.

That is PR28359.

Differential revision: http://reviews.llvm.org/D21890

llvm-svn: 274828
2016-07-08 06:47:28 +00:00
Rui Ueyama f4d9338dfb Move demangle() from Symbols.cpp to Strings.cpp.
Symbols.cpp contains functions to handle ELF symbols.
demangle() function is essentially a function to work on a
string rather than on an ELF symbol. So Strings.cpp is a
better place to put that function.

This change also make demangle to demangle symbols unconditionally.
Previously, it demangled symbols only when Config->Demangle is true.

llvm-svn: 274804
2016-07-07 23:04:15 +00:00
George Rimar f73a25812f [ELF] - Fixed incorrect logic of version assignments when mixing wildcards with values matching.
Previously we had incorrect logic here. Imagine we would have the next script:

LIBSAMPLE_1.0
{
  global:
   a_2;
 local:
  *;
};

LIBSAMPLE_2.0
{
  global:
   a*;
};
According to previous logic it would assign version 1 to a_2 and then
would try to reassign it to version 2 because of applying wildcard a*.
And show a warning about that.

Generally Ian Lance Tailor wrote about next rules that should be applied:
(http://www.airs.com/blog/archives/300)

Here are the current rules for gold:

"If there is an exact match for the mangled name, we use it. If there is more than one exact match, we give a warning, and we use the first tag in the script which matches. If a symbol has an exact match as both global and local for the same version tag, we give an error.
Otherwise, we look for an extern C++ or an extern Java exact match. If we find an exact match, we use it. If there is more than one exact match, we give a warning, and we use the first tag in the script which matches. If a symbol has an exact match as both global and local for the same version tag, we give an error.
Otherwise, we look through the wildcard patterns, ignoring “*” patterns. We look through the version tags in reverse order. For each version tag, we look through the global patterns and then the local patterns. We use the first match we find (i.e., the last matching version tag in the file).
Otherwise, we use the “*” pattern if there is one. We give a warning if there are multiple “*” patterns."

Patch makes wildcard matching to be in revered order and to follow after the regular naming matching.

Differential revision: http://reviews.llvm.org/D21894

llvm-svn: 274739
2016-07-07 07:45:27 +00:00
George Rimar dbbf60e590 [ELF] - Check the input bitcode files for compatibility.
Previously BC files were not checked for the same platform etc,
That lead to confusing error "Invalid section header entry size (e_shentsize) in ELF header" when
mixing files for different architectures.

Patch fixes PR28324.

Differential revision: http://reviews.llvm.org/D21832

llvm-svn: 274113
2016-06-29 09:46:00 +00:00
George Rimar 9fc1d4ed75 [ELF] - Updated comments. NFC.
As was suggested by Rafael Espíndola.

llvm-svn: 274111
2016-06-29 08:36:36 +00:00
Rui Ueyama 93c9af425e Create Strings.cpp and move string manipulation functions to that file.
llvm-svn: 274109
2016-06-29 08:01:32 +00:00
Rui Ueyama 722830a51b Rename matchStr -> globMatch.
llvm-svn: 274103
2016-06-29 05:32:09 +00:00
Rui Ueyama 48e4251e1d Make SymbolTable::findAll to return only defined symbols.
We allowed the function to return a vector that contains nullptrs
which is weird. This change makes the function to return only
defined symbols.

Differential Revision: http://reviews.llvm.org/D21828

llvm-svn: 274099
2016-06-29 04:47:39 +00:00
Davide Italiano 8e1131dc46 [ELF] Support for wildcard in version scripts.
Example:

VERSION_1.0 {
  global: foo*;
  local: *; }

now correctly matches all the symbols which name starts with
`foo`.

Differential Revision:  http://reviews.llvm.org/D21732

llvm-svn: 274091
2016-06-29 02:46:51 +00:00
Rui Ueyama 5e64d3fb94 Refactor ELF type inference functions.
Previously, we initialized Config->EKind and Config->EMachine when
we instantiate ELF objects. That was not an ideal location to do that
because the logic was buried too deep inside a concrete logic.

This patch moves the code to the driver so that the initialization
becomes explicit.

Differential Revision: http://reviews.llvm.org/D21784

llvm-svn: 274089
2016-06-29 01:30:50 +00:00
George Rimar 4365158689 [ELF] - Implemented support of default/non-default symbols versions
t is possible to create new version of symbol instead of depricated one
using combination of version script and asm commands. For example:

__asm__(".symver b_1,b@LIBSAMPLE_1.0");
int b_1() { return 10; }
__asm__(".symver b_2,b@@LIBSAMPLE_2.0");
int b_2() { return 20; }

This code makes b_2() to be default implementation for b().
b_1() is used for compatibility with binaries compiled against
library of older version LIBSAMPLE_1.0.

This patch implements support for above functionality in lld.

Differential revision: http://reviews.llvm.org/D21681

llvm-svn: 274002
2016-06-28 08:21:10 +00:00
George Rimar 36b2c0a683 [ELF] - Implemented --no-undefined-version flag
Option checks for cases where a version script explicitly lists
a symbol, but the symbol is not defined and errors out such
cases if any.

Differential revision: http://reviews.llvm.org/D21745

llvm-svn: 273998
2016-06-28 08:07:26 +00:00
Davide Italiano 4fdc648592 [ELF] Warn for duplicate symbols in version scripts instead of erroring out.
Emitting an error in this case breaks real-world application (e.g. libreoffice).
See http://reviews.llvm.org/D21555 for context.

Differential Revision:  http://reviews.llvm.org/D21781

llvm-svn: 273989
2016-06-28 03:40:49 +00:00
Rui Ueyama d60dae8a6a Implement --trace-symbol=symbol option.
Patch by Shridhar Joshi.

This option provides names of all the link time modules which define and
reference symbols requested by user. This helps to speed up application
development by detecting references causing undefined symbols.
It also helps in detecting symbols being resolved to wrong (unintended)
definitions in case of applications containing multiple definitions for
same symbols with different types, bindings.

Implements PR28226.

llvm-svn: 273536
2016-06-23 07:00:17 +00:00
George Rimar 50b80359c0 [ELF] - Do not allow to mix global symbols versions.
For next version script:
VER1{
  global:
  a;
};

VER2{
  global:
  a;
};
gold would produce warning like:
"warning: using 'VER1' as version for 'a' which is also named in version 'VER2' in script."

Documentation also says we do not want this duplications (https://people.freebsd.org/~deischen/symver/library_versioning.txt):
"Note that you do not want to duplicate symbols in the map file. The .symver directives are all that is required to add compatibility
symbols into old versions."

This patch restricts such mixing and makes lld to produce error in this case.

Differential revision: http://reviews.llvm.org/D21555

llvm-svn: 273396
2016-06-22 09:10:38 +00:00
George Rimar d3566309eb [ELF] - Recommit r273143("[ELF] - Basic versioned symbols support implemented.")
With fix:
-soname flag was not set in testcase. Hash calculated for base def was different on local
and bot machines because filename fos used for calculating.

Initial commit message:
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).

This patch allows programs that using simple scripts to link and run.

Differential revision: http://reviews.llvm.org/D21018

llvm-svn: 273152
2016-06-20 11:55:12 +00:00
George Rimar d03f97211a Revert r273143 "[ELF] - Basic versioned symbols support implemented."
It broke buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast

llvm-svn: 273146
2016-06-20 10:29:53 +00:00
George Rimar c31fee2212 [ELF] - Basic versioned symbols support implemented.
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).

This patch allows programs that using simple scripts to link and run.

Differential revision: http://reviews.llvm.org/D21018

llvm-svn: 273143
2016-06-20 10:16:33 +00:00
Rafael Espindola f70fb04e4f Make local: optional.
Doing that in an anonymous version is a bit silly, but this opens the
way for supporting it in general.

Since we don't support actual versions, for now we just disable the
version script if we detect that it is missing a local.

llvm-svn: 273000
2016-06-17 13:38:09 +00:00
Rafael Espindola cc70da39ff Internalize symbols in comdats.
We were dropping the CanOmitFromDynSym bit when creating undefined
symbols because of comdat.

llvm-svn: 272812
2016-06-15 17:56:10 +00:00
Rafael Espindola 65c65ce897 Don't include --start-lib/--end-lib files twice.
This should never happen with correct programs, but it is trivial
write a testcase where lld would crash or report duplicated
symbols. We now behave like when an archive is used and include the
file only once.

llvm-svn: 272724
2016-06-14 21:56:36 +00:00
Rafael Espindola 07543a8c2d Use a reference instead of a pointer. NFC.
llvm-svn: 272719
2016-06-14 21:40:23 +00:00
Peter Collingbourne ca8c994818 ELF: Compute used bit for --as-needed during symbol resolution.
We can now use this to decide whether to emit a verneed during the final
pass over the symbols. We were previously wrongly creating a verneed entry
in the case where all references to a DSO's symbols were weak.

In a future change we may also want to use the used bit to control whether
shared symbols are preemptible and appear in the dynsym. This seems a little
tricky to do at the moment because isNeeded() is templated.

The only other functional change here is that we emit a DT_NEEDED for DSOs
whose symbols are all preempted by objects that appear later in the link. But
that doesn't seem too important to me.

Differential Revision: http://reviews.llvm.org/D21171

llvm-svn: 272282
2016-06-09 18:01:35 +00:00
Rafael Espindola 78db5a9dca Print member name in undefined symbol error.
llvm-svn: 268976
2016-05-09 21:40:06 +00:00
Peter Collingbourne f3a2b0e8f7 ELF: Fix regression in TLS attribute mismatch logic.
Introduce a special symbol type to indicate that we have not yet seen a type
for the symbol, so we should not report TLS mismatches for that symbol.

Differential Revision: http://reviews.llvm.org/D19836

llvm-svn: 268411
2016-05-03 18:03:47 +00:00
Peter Collingbourne c357278a38 ELF: Remove the function SymbolTable<ELFT>::findFile.
We already have the function SymbolBody::getSourceFile which does the same thing.

llvm-svn: 268353
2016-05-03 01:48:25 +00:00
Peter Collingbourne 6a4225962d ELF: Forbid all relative relocations to absolute symbols in PIC, except for weak undefined.
Weak undefined symbols resolve to the image base. This is a little strange,
but it allows us to link function calls to such symbols. Normally such a
call will be guarded with a comparison, which will load a zero from the GOT.

There's one example of such a function call in crti.o in Linux's CRT.

As part of this change, I also needed to make the synthetic start and end
symbols image base relative in the case where their sections were empty,
so that PC-relative references to those symbols would continue to work.

Differential Revision: http://reviews.llvm.org/D19844

llvm-svn: 268350
2016-05-03 01:21:08 +00:00
Rui Ueyama 6d0cd2b62b Teach Undefined symbols from which file they are created from.
This patch increases the size of Undefined by the size of a pointer,
but it wouldn't actually increase the size of memory that LLD uses
because we are not allocating the exact size but the size of the
largest SymbolBody.

llvm-svn: 268310
2016-05-02 21:30:42 +00:00
Peter Collingbourne 4f9527065c ELF: New symbol table design.
This patch implements a new design for the symbol table that stores
SymbolBodies within a memory region of the Symbol object. Symbols are mutated
by constructing SymbolBodies in place over existing SymbolBodies, rather
than by mutating pointers. As mentioned in the initial proposal [1], this
memory layout helps reduce the cache miss rate by improving memory locality.

Performance numbers:

           old(s) new(s)
Without debug info:
chrome      7.178  6.432 (-11.5%)
LLVMgold.so 0.505  0.502 (-0.5%)
clang       0.954  0.827 (-15.4%)
llvm-as     0.052  0.045 (-15.5%)
With debug info:
scylla      5.695  5.613 (-1.5%)
clang      14.396 14.143 (-1.8%)

Performance counter results show that the fewer required indirections is
indeed the cause of the improved performance. For example, when linking
chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and
instructions per cycle increases from 0.78 to 0.83. We are also executing
many fewer instructions (15,516,401,933 down to 15,002,434,310), probably
because we spend less time allocating SymbolBodies.

The new mechanism by which symbols are added to the symbol table is by calling
add* functions on the SymbolTable.

In this patch, I handle local symbols by storing them inside "unparented"
SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating
these SymbolBodies, we can probably do that separately.

I also removed a few members from the SymbolBody class that were only being
used to pass information from the input file to the symbol table.

This patch implements the new design for the ELF linker only. I intend to
prepare a similar patch for the COFF linker.

[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html

Differential Revision: http://reviews.llvm.org/D19752

llvm-svn: 268178
2016-05-01 04:55:03 +00:00
George Rimar 9fc5908674 [ELF] Fixed warning. NFC.
SymbolTable.cpp:298:36: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
     Sym->Binding = New->isShared() ? STB_GLOBAL : New->Binding;
                                    ^

llvm-svn: 268040
2016-04-29 13:32:30 +00:00
Rui Ueyama 62ee16faa8 Remove Size from Undefined symbol.
There seems to be no reason to keep st_size of undefined symbols.
This patch removes the member for it. This patch will change outputs
in cases that undefined symbols are copied to output, but I think
this is unimportant.

Differential Revision: http://reviews.llvm.org/D19574

llvm-svn: 267826
2016-04-28 00:26:54 +00:00