Commit Graph

173 Commits

Author SHA1 Message Date
Rui Ueyama a13efc2a73 Introduce StringRefZ class to represent null-terminated strings.
StringRefZ is a class to represent a null-terminated string. String
length is computed lazily, so it's more efficient than StringRef to
represent strings in string table.

The motivation of defining this new class is to merge functions
that only differ in string types; we have many constructors that takes
`const char *` or `StringRef`. With StringRefZ, we can merge them.

Differential Revision: https://reviews.llvm.org/D27037

llvm-svn: 288172
2016-11-29 18:05:04 +00:00
Rui Ueyama a3ac17372b Define toString(const SymbolBody &) and remove maybeDemangle instead.
Differential Revision: https://reviews.llvm.org/D27065

llvm-svn: 287899
2016-11-24 20:24:18 +00:00
Rui Ueyama 3fc0f7e54f Define toString() as a generic function to get a string for error message.
We have different functions to stringize objects to construct
error messages. For InputFile, we have getFilename, and for
InputSection, we have getName. You had to memorize them.

I think this is the case where the function overloading comes in handy.

This patch defines toString() functions that are overloaded for all these
types, so that you just call it in error().

Differential Revision: https://reviews.llvm.org/D27030

llvm-svn: 287787
2016-11-23 18:07:33 +00:00
Rui Ueyama 35fa6c58ad Parse symbol versions in scanVersionScript() instead of insert().
There are two ways to set symbol versions. One way is to use symbol
definition file, and the other is to embed version names to symbol
names. In the latter way, symbol name is in the form of `foo@version1`
where `foo` is a real name and `version1` is a version.

We were parsing symbol names in insert(). That seems unnecessarily
too early. We can do it later after we resolve all symbols. Doing it
lazily is a good thing because it makes code easier to read
(because now we have a separate pass to parse symbol names). Also
it could slightly improve performance because if two identical symbols
have versions, we now parse them only once.

llvm-svn: 287741
2016-11-23 05:48:40 +00:00
Rui Ueyama c72ba3a4d7 Allow calling getName() on local symbols.
Previously, we stored offsets in string tables to symbols, so
you needed to pass a string table to get a symbol name. This patch
stores const char pointers instead to eliminate the need to pass
a string table.

llvm-svn: 287737
2016-11-23 04:57:25 +00:00
Eugene Leviant ff23d3e741 [ELF] Convert PltSection to input section
Differential revision: https://reviews.llvm.org/D26842

llvm-svn: 287346
2016-11-18 14:35:03 +00:00
Eugene Leviant afaa934304 [ELF] Add Section() to expression object
This allows making symbols containing ADDR(section) synthetic,
and defining synthetic symbols outside SECTIONS block.

Differential revision: https://reviews.llvm.org/D25441

llvm-svn: 287090
2016-11-16 09:49:39 +00:00
Eugene Leviant ad4439e802 [ELF] Convert .got section to input section
Differential revision: https://reviews.llvm.org/D26498

llvm-svn: 286580
2016-11-11 11:33:32 +00:00
Eugene Leviant 41ca327b5e [ELF] Convert .got.plt section to input section
Differential revision: https://reviews.llvm.org/D26349

llvm-svn: 286443
2016-11-10 09:48:29 +00:00
George Rimar 1a33c0f242 [ELF] - Implemented --symbol-ordering-file option.
Patch allows to pass a symbols file to linker.
LLD will map symbols to sections and sort sections
in output according to symbol ordering file.

That can help to reduce the startup time and/or
amount of pagefaults during startup.

Also, interesting benchmark result was produced by Rafael Espíndola. 
After applying the symbols file for clang he timed compiling 
X86MCTargetDesc.ii to an object file.  

The page faults went from just
56,988 to 56,946 since most faults are not in the binary.
Running time went from 4.403053515 to 4.178112244. 
The speedup seems to be because of better cache
locality.

Differential revision: https://reviews.llvm.org/D26130

llvm-svn: 286440
2016-11-10 09:05:20 +00:00
Rafael Espindola e08e78df6d Make OutputSectionBase a class instead of class template.
The disadvantage is that we use uint64_t instad of uint32_t for some
value in 32 bit files. The advantage is a substantially simpler code,
faster builds and less code duplication.

llvm-svn: 286414
2016-11-09 23:23:45 +00:00
Rafael Espindola 04a2e348bb Split Header into individual fields.
This is similar to what was done for InputSection.

With this the various fields are stored in host order and only
converted to target order when writing.

llvm-svn: 286327
2016-11-09 01:42:41 +00:00
Rui Ueyama e8a6102fa9 Rewrite CommonInputSection as a synthetic input section.
A CommonInputSection is a section containing all common symbols.
That was an input section but was abstracted in a different way
than the synthetic input sections because it was written before
the synthetic input section was invented.

This patch rewrites CommonInputSection as a synthetic input section
so that it behaves better with other sections.

llvm-svn: 286053
2016-11-05 23:05:47 +00:00
Rui Ueyama 55518e7dd8 Consolidate BumpPtrAllocators.
Previously, we have a lot of BumpPtrAllocators, but all these
allocators virtually have the same lifetime because they are
not freed until the linker finishes its job. This patch aggregates
them into a single allocator.

Differential revision: https://reviews.llvm.org/D26042

llvm-svn: 285452
2016-10-28 20:57:25 +00:00
Rafael Espindola 5da1d88492 Reduce the number of allocators.
We used to have one allocator per file, which reduces the advantage of
using an allocator in the first place.

This is a small speed up is most cases. The largest speedup was in
1.014X in chromium no-gc. The largest slowdown was scylla at 1.003X.

llvm-svn: 285205
2016-10-26 15:34:24 +00:00
Rafael Espindola 0e090522c8 Read section headers upfront.
Instead of storing a pointer, store the members we need.

The reason for doing this is that it makes it far easier to create
synthetic sections. It also avoids reading data from files multiple
times., which might help with cross endian linking and host
architectures with slow unaligned access.

There are obvious compacting opportunities, but this already has mixed
results even on native x86_64 linking.

There is also the possibility of better refactoring the code for
handling common symbols, but this already shows that a custom class is
not necessary.

llvm-svn: 285148
2016-10-26 00:54:03 +00:00
Simon Atanasyan bed04bf1df [ELF][MIPS] Put local GOT entries accessed via a 16-bit index first
Some MIPS relocations used to access GOT entries are able to manipulate
16-bit index. The other ones like R_MIPS_CALL_HI16/LO16 can handle
32-bit indexes. 16-bit relocations are generated by default. The 32-bit
relocations are generated by -mxgot flag passed to compiler. Usually
these relocation are not mixed in the same code but files like crt*.o
contain 16-bit relocations so even if all "user's" code compiled with
-mxgot flag a few 16-bit relocations might come to the linking phase.

Now LLD does not differentiate local GOT entries accessed via a 16-bit
and 32-bit indexes. That might lead to relocation's overflow if 16-bit
entries are allocated to far from the beginning of the GOT.

The patch introduces new "part" of MIPS GOT dedicated to the local GOT
entries accessed by 32-bit relocations. That allows to put local GOT
entries accessed via a 16-bit index first and escape relocation's overflow.

Differential revision: https://reviews.llvm.org/D25833

llvm-svn: 284809
2016-10-21 07:22:30 +00:00
Davide Italiano bcdd6c60a0 [ThinLTO] Avoid archive member collisions.
This fixes PR30665.

Differential Revision:  https://reviews.llvm.org/D25495

llvm-svn: 284034
2016-10-12 19:35:54 +00:00
Eugene Leviant cc1ba8c7d0 Alternative fix for reloc tareting discarded section
r283984 introduced a problem of too many warning messages being shown
when -ffunction-sections and -fdata-sections were used in conjunction 
with --gc-sections linker flag and debugging information present. This
happens because lot of relocations from .debug_line section may become
invalid in such case. The newer fix doesn't show any warning message but
zeroes OutSec pointer in createInputSectionList() to avoid crash, when
relocations are written

llvm-svn: 284010
2016-10-12 12:31:34 +00:00
Eugene Leviant c958d8d621 Don't crash if reloc targets discarded section
Differential revision: https://reviews.llvm.org/D25433

llvm-svn: 283984
2016-10-12 08:19:30 +00:00
George Rimar 6a3b154aab [ELF] - Do not crash if symbol type set to TLS when there is no tls sections.
id_000021,sig_11,src_000002,op_flip1,pos_92 from PR30540

does not have TLS sections, but type
of one of the symbol is broken and set to STT_TLS,
what resulted in a crash. Patch fixes crash.

DIfferential revision: https://reviews.llvm.org/D25083

llvm-svn: 283198
2016-10-04 08:52:51 +00:00
Simon Atanasyan f967f090b8 [ELF][MIPS] Setup STO_MIPS_PIC flag for PIC symbols when generate a relocatable object
In case of linking PIC and non-PIC code together and generation of a
relocatable object, all PIC symbols should have STO_MIPS_PIC flag in the
symbol table of the ouput file.

llvm-svn: 282714
2016-09-29 12:58:36 +00:00
Simon Atanasyan d10a5ea19e [ELF] Do not adjust TLS symbol value when produce relocatable object
When the linker generates a relocatable object there is no TLS program
header and we should not adjust TLS symbols value.

llvm-svn: 281494
2016-09-14 16:26:19 +00:00
Rui Ueyama 38dbd3eea9 Simplify InputFile ownership management.
Previously, all input files were owned by the symbol table.
Files were created at various places, such as the Driver, the lazy
symbols, or the bitcode compiler, and the ownership of new files
was transferred to the symbol table using std::unique_ptr.
All input files were then free'd when the symbol table is freed
which is on program exit.

I think we don't have to transfer ownership just to free all
instance at once on exit.

In this patch, all instances are automatically collected to a
vector and freed on exit. In this way, we no longer have to
use std::unique_ptr.

Differential Revision: https://reviews.llvm.org/D24493

llvm-svn: 281425
2016-09-14 00:05:51 +00:00
Rafael Espindola e7553e4eac Delete unnecessary template.
llvm-svn: 280237
2016-08-31 13:28:33 +00:00
Rafael Espindola a6c9744a6c Delete DefinedBitcode.
Given that we almost always want to handle it as DefinedRegular, just
use DefinedRegular.

llvm-svn: 280226
2016-08-31 12:30:34 +00:00
Eugene Leviant ceabe80e97 [ELF] Symbol assignment within output section description
llvm-svn: 278322
2016-08-11 07:56:43 +00:00
Rui Ueyama 0778490428 Remove DefinedCommon::Section.
Since CommonInputSection is a singleton class, we don't need
to store pointers to all DefinedCommon symbols.

llvm-svn: 277410
2016-08-02 01:35:13 +00:00
Eugene Leviant 3e6b027705 [ELF] Allows setting section for common symbols in linker script
llvm-svn: 277023
2016-07-28 19:24:13 +00:00
Rui Ueyama dace838138 Simplify symbol version handling.
r275711 for "speedng up symbol version handling" was committed
by misunderstanding; the benchmark number was measured with
a debug build. The number with a release build didn't actually change.
This patch removes false optimizations added in that patch.

llvm-svn: 276267
2016-07-21 13:13:21 +00:00
Rui Ueyama e33579072d Remove SymbolBody::PlaceholderKind.
In the last patch for --trace-symbol, I introduced a new symbol type
PlaceholderKind and store it to SymVector storage. It made all code
that iterates over SymVector to recognize and skip PlaceholderKind
symbols. I found that that's annoying.

In this patch, I removed PlaceholderKind and stop storing them to SymVector.
Now the information whether a symbol is being watched by --trace-symbol
is stored to the Symtab hash table.

llvm-svn: 275747
2016-07-18 01:35:00 +00:00
Rui Ueyama 69c778c084 Implement almost-zero-cost --trace-symbol.
--trace-symbol is a command line option to watch a symbol.
Previosly, we looked up a hash table for a new symbol if the
option is given. Any code that looks up a hash table for each
symbol is expensive because the linker handles a lot of symbols.
In our design, we look up a hash table strictly only once
for a symbol, so --trace-symbol was an exception.

This patch improves efficiency of the option by merging the
hash table into the symbol table.

Instead of looking up a separate hash table with a string,
this patch sets `Traced` flag to symbols specified by --trace-symbol.
So, if you insert a symbol and get a symbol with `Traced` flag on,
you know that you need to print out a log message for the symbol.
This is nearly zero cost.

llvm-svn: 275716
2016-07-17 17:50:09 +00:00
Rui Ueyama 2a7c1c1507 Print out file names for common symbols for --trace-symbol.
Previously, there was no way to get a file name for a DefinedCommon
symbol. This patch adds it.

llvm-svn: 275712
2016-07-17 17:36:22 +00:00
Rui Ueyama 663b8c2769 Handle versioned symbols efficiently.
Versions can be assigned to symbols in two different ways.
One is the usual version scripts, and the other is special
symbol suffix '@'. If a symbol contains '@', the string after
that is considered to specify a version name.

Previously, we look for '@' for all symbols.

Anything that works on every symbol can be expensive because
the linker has to handle a lot of symbols. The search for '@'
was not an exception.

In this patch, I made two optimizations.

The first optimization is to handle '@' only when at least one
version is defined. If no versions are defined, no versions can
be assigned to any symbols, so it's waste of time to search for '@'.

The second optimization is to scan only suffixes of symbol names
instead of entire symbol names. Symbol names can be very long, but
symbol versions are usually short, so scanning entire symbol names
is waste of time, too.

There are some error cases which we no longer be able to detect
with this patch. I don't think it's a major drawback because they
are minor errors. Speed is more important.

This change improves LLD with debug info self-link time from
6.6993 seconds to 6.3426 seconds (or -5.3%).

Differential Revision: https://reviews.llvm.org/D22433

llvm-svn: 275711
2016-07-17 17:23:17 +00:00
Rui Ueyama 434b56179e Add a pointer to a source file to SymbolBody.
Previously, each subclass of SymbolBody had a pointer to a source
file from which it was created. So, there was no single way to get
a source file for a symbol. We had getSourceFile<ELFT>(), but the
function was a bit inconvenient as it's a template.

This patch makes SymbolBody have a pointer to a source file.
If a symbol is not created from a file, the pointer has a nullptr.

llvm-svn: 275701
2016-07-17 03:11:46 +00:00
Rui Ueyama 803b120ba1 Add GotEntrySize/GotPltEntrySize to ELF target.
Patch by H.J Lu.

For x86-64 psABI, the entry size of .got and .got.plt sections is 8
bytes for both LP64 and ILP32.  Add GotEntrySize and GotPltEntrySize
to ELF target instead of using size of ELFT::uint.  Now we can generate
a simple working x32 executable.

Differential Revision: http://reviews.llvm.org/D22288

llvm-svn: 275301
2016-07-13 18:55:14 +00:00
Peter Smith fb05cd997c Recommit R274836 Add Thunk support framework for ARM and Mips
The TinyPtrVector of const Thunk<ELFT>* in InputSections.h can cause 
build failures on certain compiler/library combinations when Thunk<ELFT> 
is not a complete type or is an abstract class. Fixed by making Thunk<ELFT>
non Abstract.

type or is an abstract class 

llvm-svn: 274863
2016-07-08 16:10:27 +00:00
Peter Smith eeb827447e Revert R274836 Add Thunk support framework for ARM and Mips
This seems to be causing a buildbot failure on lld-x86_64-freebsd. Will
reproduce locally and fix. 

llvm-svn: 274841
2016-07-08 12:25:50 +00:00
Peter Smith de01b98a26 Add Thunk support framework for ARM and Mips
Generalise the Mips LA25 Thunk code and implement ARM and Thumb
    interworking Thunks.
    
    - Introduce a new module Thunks.cpp to store the Target Specific Thunk
      implementations.
    - DefinedRegular and Shared have a ThunkData field to record Thunk.
    - A Target can have more than one type of Thunk.
    - Support PC-relative calls to Thunks.
    - Support Thunks to PLT entries.
    - Existing Mips LA25 Thunk code integrated.
    - Support for ARMv7A interworking Thunks.
    
    Limitations:
    - Only one Thunk per SymbolBody, this is sufficient for all currently
      implemented Thunks.
    - ARM thunks assume presence of V6T2 MOVT and MOVW instructions.

    Differential revision: http://reviews.llvm.org/D21891

llvm-svn: 274836
2016-07-08 11:13:40 +00:00
Rui Ueyama f4d9338dfb Move demangle() from Symbols.cpp to Strings.cpp.
Symbols.cpp contains functions to handle ELF symbols.
demangle() function is essentially a function to work on a
string rather than on an ELF symbol. So Strings.cpp is a
better place to put that function.

This change also make demangle to demangle symbols unconditionally.
Previously, it demangled symbols only when Config->Demangle is true.

llvm-svn: 274804
2016-07-07 23:04:15 +00:00
Rafael Espindola 580d7a1b1e -Bsymbolic should not make symbols more preemptable.
But it was doing that for protected undefined symbols.

llvm-svn: 274803
2016-07-07 22:50:54 +00:00
George Rimar 4365158689 [ELF] - Implemented support of default/non-default symbols versions
t is possible to create new version of symbol instead of depricated one
using combination of version script and asm commands. For example:

__asm__(".symver b_1,b@LIBSAMPLE_1.0");
int b_1() { return 10; }
__asm__(".symver b_2,b@@LIBSAMPLE_2.0");
int b_2() { return 20; }

This code makes b_2() to be default implementation for b().
b_1() is used for compatibility with binaries compiled against
library of older version LIBSAMPLE_1.0.

This patch implements support for above functionality in lld.

Differential revision: http://reviews.llvm.org/D21681

llvm-svn: 274002
2016-06-28 08:21:10 +00:00
George Rimar d3566309eb [ELF] - Recommit r273143("[ELF] - Basic versioned symbols support implemented.")
With fix:
-soname flag was not set in testcase. Hash calculated for base def was different on local
and bot machines because filename fos used for calculating.

Initial commit message:
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).

This patch allows programs that using simple scripts to link and run.

Differential revision: http://reviews.llvm.org/D21018

llvm-svn: 273152
2016-06-20 11:55:12 +00:00
George Rimar d03f97211a Revert r273143 "[ELF] - Basic versioned symbols support implemented."
It broke buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast

llvm-svn: 273146
2016-06-20 10:29:53 +00:00
George Rimar c31fee2212 [ELF] - Basic versioned symbols support implemented.
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).

This patch allows programs that using simple scripts to link and run.

Differential revision: http://reviews.llvm.org/D21018

llvm-svn: 273143
2016-06-20 10:16:33 +00:00
Simon Atanasyan 4132511cdc [ELF][MIPS] Support GOT entries for non-preemptible symbols with different addends
There are two motivations for this patch. The first one is a preparation
for support MIPS TLS relocations. It might sound like a joke but for GOT
entries related to TLS relocations MIPS ABI uses almost regular approach
with creation of dynamic relocations for each GOT enty etc. But we need
to separate these 'regular' TLS related entries from MIPS specific local
and global parts of GOT. ABI declare simple solution - all TLS related
entries allocated at the end of GOT after local/global parts. The second
motivation it to support GOT relocations for non-preemptible symbols
with addends. If we have more than one GOT relocations against symbol S
with different addends we need to create GOT entries for each unique
Symbol/Addend pairs.

So we store all MIPS GOT entries in separate containers. For non-preemptible
symbols we have to maintain two data structures. The first one is MipsLocal
vector. Each entry corresponds to the GOT entry from the 'local' part
of the GOT contains the symbol's address plus addend. The second one
is MipsLocalMap. It is a map from Symbol/Addend pair to the GOT index.

Differential Revision: http://reviews.llvm.org/D21297

llvm-svn: 273127
2016-06-19 21:39:37 +00:00
Rui Ueyama 4a90f57ef2 Rename PltZero -> PltHeader.
PltZero (or PLT[0]) was an appropriate name for the little code
we have at beginning of the PLT section when we only supported x86
since the code for x86 just fits in the first PLT slot.

It's not the case anymore. The code for ARM64 occupies first two
slots, so PltZero spans PLT[0] and PLT[1], for example.
This patch renames it to avoid confusion.

llvm-svn: 272913
2016-06-16 16:28:50 +00:00
Rafael Espindola 65c65ce897 Don't include --start-lib/--end-lib files twice.
This should never happen with correct programs, but it is trivial
write a testcase where lld would crash or report duplicated
symbols. We now behave like when an archive is used and include the
file only once.

llvm-svn: 272724
2016-06-14 21:56:36 +00:00
Rafael Espindola 07543a8c2d Use a reference instead of a pointer. NFC.
llvm-svn: 272719
2016-06-14 21:40:23 +00:00
Rui Ueyama 70595aae64 Inline SymbolBody::init. NFC.
I think this function was too short to be an independent function.

llvm-svn: 270534
2016-05-24 04:51:49 +00:00