Commit Graph

368 Commits

Author SHA1 Message Date
George Rimar 69268a8ab3 [ELF] - Detemplate SymbolBody::getOutputSection(). NFC.
llvm-svn: 297943
2017-03-16 11:06:13 +00:00
George Rimar 47bed8c9f7 Revert r297813 "[ELF] - Make Bss and BssRelRo sections to be synthetic (#3)."
I suppose it is the reason of BB fail:
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/921

https://bugs.llvm.org/show_bug.cgi?id=32167

llvm-svn: 297933
2017-03-16 08:44:53 +00:00
George Rimar 22e3e7d165 [ELF] - Make Bss and BssRelRo sections to be synthetic (#3).
That removes CopyRelSection class completely, making
Bss/BssRelRo to be just regular synthetics.

This is splitted from D30541 and polished.
Difference from D30541 that all logic of SharedSymbol
converting to DefinedRegular was removed for now and
probably will be posted as separate patch.

Differential revision: https://reviews.llvm.org/D30892

llvm-svn: 297814
2017-03-15 09:21:29 +00:00
Rafael Espindola 5616adf655 Remove DefinedSynthetic.
With this we have a single section hierarchy. It is a bit less code,
but the main advantage will be in a future patch being able to handle

foo = symbol_in_obj;

in a linker script. Currently that fails since we try to find the
output section of symbol_in_obj.  With this we should be able to just
return an InputSection from the expression.

llvm-svn: 297313
2017-03-08 22:36:28 +00:00
Rafael Espindola fcd208fdb3 Use uint32_t for alignment in more places, NFC.
llvm-svn: 297305
2017-03-08 19:35:29 +00:00
Rui Ueyama 007c002cb6 Revert r297008: [ELF] - Make Bss and BssRelRo sections to be synthetic (#2).
This reverts commit r297008 because it's reported that that
change broke AArch64 bots.

llvm-svn: 297297
2017-03-08 17:24:24 +00:00
Rafael Espindola 5e434b3f11 Remove unnecessary template.
llvm-svn: 297293
2017-03-08 16:08:36 +00:00
Rafael Espindola e1294091d3 Remove unnecessary template. NFC.
llvm-svn: 297292
2017-03-08 16:03:41 +00:00
Rafael Espindola c86b2cddc8 Convert a few more uses of uintX_t to uint64_t.
llvm-svn: 297286
2017-03-08 15:34:04 +00:00
Rafael Espindola 9371bab55a Convert a few uses of uintX_t to uint64_t.
llvm-svn: 297282
2017-03-08 15:21:32 +00:00
Evgeniy Stepanov 5ac5542544 Fix -Werror build error.
tools/lld/ELF/Symbols.cpp:215:13: error: unused variable 'S'
[-Werror,-Wunused-variable]
  if (auto *S = dyn_cast<SharedSymbol>(this)

llvm-svn: 297063
2017-03-06 20:27:38 +00:00
George Rimar c215a2ac40 [ELF] - Make Bss and BssRelRo sections to be synthetic (#2).
In compare with D30458, this makes Bss/BssRelRo to be pure
synthetic sections.

That removes CopyRelSection class completely, making
Bss/BssRelRo to be just regular synthetics.

SharedSymbols involved in creating copy relocations are
converted to DefinedRegular, what also simplifies things.

Differential revision: https://reviews.llvm.org/D30541

llvm-svn: 297008
2017-03-06 14:37:45 +00:00
Sean Silva d4e606273b Improve comment.
Thanks to Rafael for helping to dig down into what we're really trying
to communicate here.

llvm-svn: 296580
2017-03-01 04:44:04 +00:00
Rui Ueyama 80474a26b9 De-template DefinedRegular.
Differential Revision: https://reviews.llvm.org/D30348

llvm-svn: 296508
2017-02-28 19:29:55 +00:00
Sean Silva 6ab3926ee6 Comment on the typical/simple case of VA calculation
There are many special cases and a layer of abstraction or two in the
way, but the VA calculation in the typical case is actually very simple
and probably makes perfect sense even to somebody new to linkers.

Also, this line brings together many components and is a good place to
start understanding the linker (or improve one's existing
understanding).

llvm-svn: 296451
2017-02-28 09:01:58 +00:00
Sean Silva a9ba450c52 Add a comment.
Naively it seemed at first like getVA had the responsibility of adding
the addend, and getSymVA had the responsibility of getting the symbol
VA.
So it was not obvious to me at first why getVA passes Addend to
getSymVA. In fact, it passes it as a mutable reference.

It turns out that it only matters for SHF_MERGE sections, and in
particular only for STT_SECTION symbols that are used as a hack for
reducing the number of local symbols (e.g. to avoid a local symbol for
each string in the string table).

llvm-svn: 296448
2017-02-28 08:32:56 +00:00
Rui Ueyama 968db48cee Move SymbolTableSection::getOutputSection to SymbolBody::getOutputSection.
That function doesn't use any member of SymbolTableSection, so I
couldn't see a reason to make it a member of that class. The function
takes a SymbolBody, so it is more natural to make it a member of
SymbolBody.

llvm-svn: 296433
2017-02-28 04:02:42 +00:00
Rui Ueyama 9d1bacb1b4 Remove useless template so that Out<ELFT> becomes just Out.
llvm-svn: 296307
2017-02-27 02:31:26 +00:00
Rui Ueyama 4076fa1e21 De-template SharedSymbol.
Differential Revision: https://reviews.llvm.org/D30351

llvm-svn: 296303
2017-02-26 23:35:34 +00:00
Rafael Espindola 24e6f363c5 Merge OutputSectionBase and OutputSection. NFC.
Now that all special sections are SyntheticSections, we only need one
OutputSection class.

llvm-svn: 296127
2017-02-24 15:07:30 +00:00
Rafael Espindola 774ea7d0a9 Make InputSection a class. NFC.
With the current design an InputSection is basically anything that
goes directly in a OutputSection. That includes plain input section
but also synthetic sections, so this should probably not be a
template.

llvm-svn: 295993
2017-02-23 16:49:07 +00:00
Rafael Espindola b4c9b81aad Convert InputSectionBase to a class.
Removing this template is not a big win by itself, but opens the way
for removing more templates.

llvm-svn: 295923
2017-02-23 02:28:28 +00:00
Rui Ueyama e6e206d4b4 Do not use errs() or outs() directly. Instead use message(), log() or error()
LLD is a multi-threaded program. errs() or outs() are not guaranteed
to be thread-safe (they are actually not).

LLD's message(), log() or error() are thread-safe. We should use them.

llvm-svn: 295787
2017-02-21 23:22:56 +00:00
Rui Ueyama f829e8c97d Removes a trivial accessor.
llvm-svn: 295288
2017-02-16 06:12:41 +00:00
Rui Ueyama 924b361d01 Do not overload a one-bit variable, NeedsCopyOrPltAddr.
This patch removes NeedsCopyOrPltAddr and instead add two variables,
NeedsCopy and NeedsPltAddr. This uses one more bit in Symbol class,
but the actual size doesn't increase because we had unused bits.
This should improve code readability.

llvm-svn: 295287
2017-02-16 06:12:22 +00:00
Rafael Espindola 7386ceac74 Addends should always be signed.
In the target dependent code we already always return a int64_t. In
the target independent code we carefully use uintX_t, which has the
same result given 2 complement rules.

This just simplifies the code to use int64_t everywhere.

llvm-svn: 295263
2017-02-16 00:12:34 +00:00
Peter Smith ebfe994142 [ELF] Use synthetic section to hold copy relocation
When we need a copy relocation we create a synthetic SHT_NOBITS
section that contains the right amount of ZI and assign it to either
.bss or .rel.ro.bss as appropriate. This allows the dynamic relocation
to be placed on the InputSection, removing the last case where a
dynamic relocation is stored as an offset from the OutputSection. This
has the side effect that we can run assignOffsets() after scanRelocs()
without losing the additional ZI needed for the copy relocations.

Differential Revision: https://reviews.llvm.org/D29637

llvm-svn: 294577
2017-02-09 10:27:57 +00:00
Rafael Espindola 9e9754b520 Replace MergeOutputSection with a synthetic section.
With a synthetic merge section we can have, for example, a single
.rodata section with stings, fixed sized constants and non merge
constants.

I can be simplified further by not setting Entsize, but that is
probably better done is a followup patch.

This should allow some cleanup in the linker script code now that
every output section command maps to just one output section.

llvm-svn: 294005
2017-02-03 13:06:18 +00:00
Peter Smith 3a52eb0054 [ELF] Use SyntheticSections for Thunks
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
  need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
    
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
  the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
  first caller to the Thunk.
    
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.

This is a recommit of r293283 with a fixed comparison predicate as
std::merge requires a strict weak ordering.

Differential revision: https://reviews.llvm.org/D29327

llvm-svn: 293757
2017-02-01 10:26:03 +00:00
Rui Ueyama f20ee9f11a Revert "[ELF][ARM] Use SyntheticSections for Thunks"
This reverts commit r293283 because it broke MSVC build.

llvm-svn: 293352
2017-01-28 00:48:06 +00:00
Peter Smith 5191c6f945 [ELF][ARM] Use SyntheticSections for Thunks
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
  need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
    
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
  the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
  first caller to the Thunk.
    
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.

Differential Revision:  https://reviews.llvm.org/D29129

llvm-svn: 293283
2017-01-27 13:10:16 +00:00
Rui Ueyama b2a23cf3c0 Do not allocate space for common symbols with -r
Currently ld.lld -r allocates space for common symbols, whereas ld.bfd
-r doesn't.  As a result the OpenBSD makefile bits for creating libraries
fail as they use ld -X -r to strip local symbols, which results in
duplicate symbol errors because space for the common symbols has been
allocated.

The diff also implements the --define-commons option such that allocation
of commons can be forced even if -r is used.

Patch by Mark Kettenis.

llvm-svn: 292878
2017-01-24 03:41:20 +00:00
Rafael Espindola 1d6d1b44cc Add a isInCurrentDSO helper. NFC.
llvm-svn: 292228
2017-01-17 16:08:06 +00:00
Rafael Espindola 41a93a3edf Give priority to linker scripts over preemption.
LLD exports symbols that are also present in used shared libraries to
make sure they are preempted at runtime. That is a reasonable default,
but we must allow for it to be overwritten with linker script. If we
don't, libraries that expect to be able to hide a c++ delete operator
will fail.

This should fix the firebird build.

llvm-svn: 292146
2017-01-16 17:35:23 +00:00
Rafael Espindola b7e2ee2aba Give local binding to VER_NDX_LOCAL symbols.
We were already dropping them from the dynamic symbol table, but the
regular symbol table was still listing them as globals.

llvm-svn: 291573
2017-01-10 17:08:13 +00:00
Peter Collingbourne feb6629d6d ELF: Reserve space for copy relocations of read-only symbols in relro.
When reserving copy relocation space for a shared symbol, scan the DSO's
program headers to see if the symbol is in a read-only segment. If so,
reserve space for that symbol in a new synthetic section named .bss.rel.ro
which will be covered by the relro program header.

This fixes the security issue disclosed on the binutils mailing list at:
https://sourceware.org/ml/libc-alpha/2016-12/msg00914.html

Differential Revision: https://reviews.llvm.org/D28272

llvm-svn: 291524
2017-01-10 01:21:50 +00:00
Rafael Espindola 2756e04fac Handle versioned undefined symbols.
In order to keep symbol lookup a simple name lookup this patch adds
versioned symbols with an explicit @ to the symbol table.

llvm-svn: 291293
2017-01-06 22:30:35 +00:00
Rui Ueyama ce039266c1 Merge elf::toString and coff::toString.
The two overloaded functions hid each other. This patch merges them.

llvm-svn: 291222
2017-01-06 10:04:08 +00:00
Peter Smith 97c6d78f3e [ELF] Add support for thunks to undefined non-weak symbols
In a shared library an undefined symbol is implicitly imported. If the
symbol is called as a function a PLT entry is generated for it. When the
caller is a Thumb b.w a thunk to the PLT entry is needed as all PLT
entries are in ARM state.
    
This change allows undefined symbols to have thunks in the same way that
shared symbols may have thunks.

llvm-svn: 290951
2017-01-04 09:45:45 +00:00
Rui Ueyama 4f2f50dc64 De-template DefinedSynthetic.
DefinedSynthetic is not created for a real ELF object, so it doesn't
have to be a template function. It has a virtual st_value, which is
either 32 bit or 64 bit, but we can simply use 64 bit.

llvm-svn: 290241
2016-12-21 08:40:09 +00:00
Rafael Espindola 17cb7c0a2a Detemplate PhdrEntry. NFC.
llvm-svn: 290115
2016-12-19 17:01:01 +00:00
Sean Silva 902ae3cb3f Rename this variable.
`SC` didn't make much sense. We don't seem to have a clear convention,
but `IS` sounds good here because it emphasizes that it is an input
section (this is one place in the code where we are dealing with both
input sections and output sections at the same time so that extra
emphasis makes it a bit clearer).

llvm-svn: 289748
2016-12-15 00:57:53 +00:00
Peter Smith baffdb8bc2 [ELF] ifunc implementation using synthetic sections
This change introduces new synthetic sections IpltSection, IgotPltSection
that represent the ifunc entries that would previously have been put in
the PltSection and the GotPltSection. The separation makes sure that
the R_*_IRELATIVE relocations are placed after the non R_*_IRELATIVE
relocations, which permits ifunc resolvers to know that the .got.plt
slots will be initialized prior to the resolver being called.

A secondary benefit is that for ARM we can move the IgotPltSection and its
dynamic relocations to the .got and .rel.dyn as the ARM glibc expects all
the R_*_IRELATIVE relocations to be in the .rel.dyn

Differential revision: https://reviews.llvm.org/D27406

llvm-svn: 289045
2016-12-08 12:58:55 +00:00
Rui Ueyama 4c5b8cea02 Make demangle() return None instead of "" if a given string is not a mangled symbol.
llvm-svn: 288993
2016-12-07 23:17:05 +00:00
Rui Ueyama 44da9decb5 Include object file name to an error message.
llvm-svn: 288686
2016-12-05 18:40:14 +00:00
Rui Ueyama a13efc2a73 Introduce StringRefZ class to represent null-terminated strings.
StringRefZ is a class to represent a null-terminated string. String
length is computed lazily, so it's more efficient than StringRef to
represent strings in string table.

The motivation of defining this new class is to merge functions
that only differ in string types; we have many constructors that takes
`const char *` or `StringRef`. With StringRefZ, we can merge them.

Differential Revision: https://reviews.llvm.org/D27037

llvm-svn: 288172
2016-11-29 18:05:04 +00:00
Rui Ueyama a3ac17372b Define toString(const SymbolBody &) and remove maybeDemangle instead.
Differential Revision: https://reviews.llvm.org/D27065

llvm-svn: 287899
2016-11-24 20:24:18 +00:00
Rui Ueyama 3fc0f7e54f Define toString() as a generic function to get a string for error message.
We have different functions to stringize objects to construct
error messages. For InputFile, we have getFilename, and for
InputSection, we have getName. You had to memorize them.

I think this is the case where the function overloading comes in handy.

This patch defines toString() functions that are overloaded for all these
types, so that you just call it in error().

Differential Revision: https://reviews.llvm.org/D27030

llvm-svn: 287787
2016-11-23 18:07:33 +00:00
Rui Ueyama 35fa6c58ad Parse symbol versions in scanVersionScript() instead of insert().
There are two ways to set symbol versions. One way is to use symbol
definition file, and the other is to embed version names to symbol
names. In the latter way, symbol name is in the form of `foo@version1`
where `foo` is a real name and `version1` is a version.

We were parsing symbol names in insert(). That seems unnecessarily
too early. We can do it later after we resolve all symbols. Doing it
lazily is a good thing because it makes code easier to read
(because now we have a separate pass to parse symbol names). Also
it could slightly improve performance because if two identical symbols
have versions, we now parse them only once.

llvm-svn: 287741
2016-11-23 05:48:40 +00:00
Rui Ueyama c72ba3a4d7 Allow calling getName() on local symbols.
Previously, we stored offsets in string tables to symbols, so
you needed to pass a string table to get a symbol name. This patch
stores const char pointers instead to eliminate the need to pass
a string table.

llvm-svn: 287737
2016-11-23 04:57:25 +00:00
Eugene Leviant ff23d3e741 [ELF] Convert PltSection to input section
Differential revision: https://reviews.llvm.org/D26842

llvm-svn: 287346
2016-11-18 14:35:03 +00:00
Eugene Leviant afaa934304 [ELF] Add Section() to expression object
This allows making symbols containing ADDR(section) synthetic,
and defining synthetic symbols outside SECTIONS block.

Differential revision: https://reviews.llvm.org/D25441

llvm-svn: 287090
2016-11-16 09:49:39 +00:00
Eugene Leviant ad4439e802 [ELF] Convert .got section to input section
Differential revision: https://reviews.llvm.org/D26498

llvm-svn: 286580
2016-11-11 11:33:32 +00:00
Eugene Leviant 41ca327b5e [ELF] Convert .got.plt section to input section
Differential revision: https://reviews.llvm.org/D26349

llvm-svn: 286443
2016-11-10 09:48:29 +00:00
George Rimar 1a33c0f242 [ELF] - Implemented --symbol-ordering-file option.
Patch allows to pass a symbols file to linker.
LLD will map symbols to sections and sort sections
in output according to symbol ordering file.

That can help to reduce the startup time and/or
amount of pagefaults during startup.

Also, interesting benchmark result was produced by Rafael Espíndola. 
After applying the symbols file for clang he timed compiling 
X86MCTargetDesc.ii to an object file.  

The page faults went from just
56,988 to 56,946 since most faults are not in the binary.
Running time went from 4.403053515 to 4.178112244. 
The speedup seems to be because of better cache
locality.

Differential revision: https://reviews.llvm.org/D26130

llvm-svn: 286440
2016-11-10 09:05:20 +00:00
Rafael Espindola e08e78df6d Make OutputSectionBase a class instead of class template.
The disadvantage is that we use uint64_t instad of uint32_t for some
value in 32 bit files. The advantage is a substantially simpler code,
faster builds and less code duplication.

llvm-svn: 286414
2016-11-09 23:23:45 +00:00
Rafael Espindola 04a2e348bb Split Header into individual fields.
This is similar to what was done for InputSection.

With this the various fields are stored in host order and only
converted to target order when writing.

llvm-svn: 286327
2016-11-09 01:42:41 +00:00
Rui Ueyama e8a6102fa9 Rewrite CommonInputSection as a synthetic input section.
A CommonInputSection is a section containing all common symbols.
That was an input section but was abstracted in a different way
than the synthetic input sections because it was written before
the synthetic input section was invented.

This patch rewrites CommonInputSection as a synthetic input section
so that it behaves better with other sections.

llvm-svn: 286053
2016-11-05 23:05:47 +00:00
Rui Ueyama 55518e7dd8 Consolidate BumpPtrAllocators.
Previously, we have a lot of BumpPtrAllocators, but all these
allocators virtually have the same lifetime because they are
not freed until the linker finishes its job. This patch aggregates
them into a single allocator.

Differential revision: https://reviews.llvm.org/D26042

llvm-svn: 285452
2016-10-28 20:57:25 +00:00
Rafael Espindola 5da1d88492 Reduce the number of allocators.
We used to have one allocator per file, which reduces the advantage of
using an allocator in the first place.

This is a small speed up is most cases. The largest speedup was in
1.014X in chromium no-gc. The largest slowdown was scylla at 1.003X.

llvm-svn: 285205
2016-10-26 15:34:24 +00:00
Rafael Espindola 0e090522c8 Read section headers upfront.
Instead of storing a pointer, store the members we need.

The reason for doing this is that it makes it far easier to create
synthetic sections. It also avoids reading data from files multiple
times., which might help with cross endian linking and host
architectures with slow unaligned access.

There are obvious compacting opportunities, but this already has mixed
results even on native x86_64 linking.

There is also the possibility of better refactoring the code for
handling common symbols, but this already shows that a custom class is
not necessary.

llvm-svn: 285148
2016-10-26 00:54:03 +00:00
Simon Atanasyan bed04bf1df [ELF][MIPS] Put local GOT entries accessed via a 16-bit index first
Some MIPS relocations used to access GOT entries are able to manipulate
16-bit index. The other ones like R_MIPS_CALL_HI16/LO16 can handle
32-bit indexes. 16-bit relocations are generated by default. The 32-bit
relocations are generated by -mxgot flag passed to compiler. Usually
these relocation are not mixed in the same code but files like crt*.o
contain 16-bit relocations so even if all "user's" code compiled with
-mxgot flag a few 16-bit relocations might come to the linking phase.

Now LLD does not differentiate local GOT entries accessed via a 16-bit
and 32-bit indexes. That might lead to relocation's overflow if 16-bit
entries are allocated to far from the beginning of the GOT.

The patch introduces new "part" of MIPS GOT dedicated to the local GOT
entries accessed by 32-bit relocations. That allows to put local GOT
entries accessed via a 16-bit index first and escape relocation's overflow.

Differential revision: https://reviews.llvm.org/D25833

llvm-svn: 284809
2016-10-21 07:22:30 +00:00
Davide Italiano bcdd6c60a0 [ThinLTO] Avoid archive member collisions.
This fixes PR30665.

Differential Revision:  https://reviews.llvm.org/D25495

llvm-svn: 284034
2016-10-12 19:35:54 +00:00
Eugene Leviant cc1ba8c7d0 Alternative fix for reloc tareting discarded section
r283984 introduced a problem of too many warning messages being shown
when -ffunction-sections and -fdata-sections were used in conjunction 
with --gc-sections linker flag and debugging information present. This
happens because lot of relocations from .debug_line section may become
invalid in such case. The newer fix doesn't show any warning message but
zeroes OutSec pointer in createInputSectionList() to avoid crash, when
relocations are written

llvm-svn: 284010
2016-10-12 12:31:34 +00:00
Eugene Leviant c958d8d621 Don't crash if reloc targets discarded section
Differential revision: https://reviews.llvm.org/D25433

llvm-svn: 283984
2016-10-12 08:19:30 +00:00
George Rimar 6a3b154aab [ELF] - Do not crash if symbol type set to TLS when there is no tls sections.
id_000021,sig_11,src_000002,op_flip1,pos_92 from PR30540

does not have TLS sections, but type
of one of the symbol is broken and set to STT_TLS,
what resulted in a crash. Patch fixes crash.

DIfferential revision: https://reviews.llvm.org/D25083

llvm-svn: 283198
2016-10-04 08:52:51 +00:00
Simon Atanasyan f967f090b8 [ELF][MIPS] Setup STO_MIPS_PIC flag for PIC symbols when generate a relocatable object
In case of linking PIC and non-PIC code together and generation of a
relocatable object, all PIC symbols should have STO_MIPS_PIC flag in the
symbol table of the ouput file.

llvm-svn: 282714
2016-09-29 12:58:36 +00:00
Simon Atanasyan d10a5ea19e [ELF] Do not adjust TLS symbol value when produce relocatable object
When the linker generates a relocatable object there is no TLS program
header and we should not adjust TLS symbols value.

llvm-svn: 281494
2016-09-14 16:26:19 +00:00
Rui Ueyama 38dbd3eea9 Simplify InputFile ownership management.
Previously, all input files were owned by the symbol table.
Files were created at various places, such as the Driver, the lazy
symbols, or the bitcode compiler, and the ownership of new files
was transferred to the symbol table using std::unique_ptr.
All input files were then free'd when the symbol table is freed
which is on program exit.

I think we don't have to transfer ownership just to free all
instance at once on exit.

In this patch, all instances are automatically collected to a
vector and freed on exit. In this way, we no longer have to
use std::unique_ptr.

Differential Revision: https://reviews.llvm.org/D24493

llvm-svn: 281425
2016-09-14 00:05:51 +00:00
Rafael Espindola e7553e4eac Delete unnecessary template.
llvm-svn: 280237
2016-08-31 13:28:33 +00:00
Rafael Espindola a6c9744a6c Delete DefinedBitcode.
Given that we almost always want to handle it as DefinedRegular, just
use DefinedRegular.

llvm-svn: 280226
2016-08-31 12:30:34 +00:00
Eugene Leviant ceabe80e97 [ELF] Symbol assignment within output section description
llvm-svn: 278322
2016-08-11 07:56:43 +00:00
Rui Ueyama 0778490428 Remove DefinedCommon::Section.
Since CommonInputSection is a singleton class, we don't need
to store pointers to all DefinedCommon symbols.

llvm-svn: 277410
2016-08-02 01:35:13 +00:00
Eugene Leviant 3e6b027705 [ELF] Allows setting section for common symbols in linker script
llvm-svn: 277023
2016-07-28 19:24:13 +00:00
Rui Ueyama dace838138 Simplify symbol version handling.
r275711 for "speedng up symbol version handling" was committed
by misunderstanding; the benchmark number was measured with
a debug build. The number with a release build didn't actually change.
This patch removes false optimizations added in that patch.

llvm-svn: 276267
2016-07-21 13:13:21 +00:00
Rui Ueyama e33579072d Remove SymbolBody::PlaceholderKind.
In the last patch for --trace-symbol, I introduced a new symbol type
PlaceholderKind and store it to SymVector storage. It made all code
that iterates over SymVector to recognize and skip PlaceholderKind
symbols. I found that that's annoying.

In this patch, I removed PlaceholderKind and stop storing them to SymVector.
Now the information whether a symbol is being watched by --trace-symbol
is stored to the Symtab hash table.

llvm-svn: 275747
2016-07-18 01:35:00 +00:00
Rui Ueyama 69c778c084 Implement almost-zero-cost --trace-symbol.
--trace-symbol is a command line option to watch a symbol.
Previosly, we looked up a hash table for a new symbol if the
option is given. Any code that looks up a hash table for each
symbol is expensive because the linker handles a lot of symbols.
In our design, we look up a hash table strictly only once
for a symbol, so --trace-symbol was an exception.

This patch improves efficiency of the option by merging the
hash table into the symbol table.

Instead of looking up a separate hash table with a string,
this patch sets `Traced` flag to symbols specified by --trace-symbol.
So, if you insert a symbol and get a symbol with `Traced` flag on,
you know that you need to print out a log message for the symbol.
This is nearly zero cost.

llvm-svn: 275716
2016-07-17 17:50:09 +00:00
Rui Ueyama 2a7c1c1507 Print out file names for common symbols for --trace-symbol.
Previously, there was no way to get a file name for a DefinedCommon
symbol. This patch adds it.

llvm-svn: 275712
2016-07-17 17:36:22 +00:00
Rui Ueyama 663b8c2769 Handle versioned symbols efficiently.
Versions can be assigned to symbols in two different ways.
One is the usual version scripts, and the other is special
symbol suffix '@'. If a symbol contains '@', the string after
that is considered to specify a version name.

Previously, we look for '@' for all symbols.

Anything that works on every symbol can be expensive because
the linker has to handle a lot of symbols. The search for '@'
was not an exception.

In this patch, I made two optimizations.

The first optimization is to handle '@' only when at least one
version is defined. If no versions are defined, no versions can
be assigned to any symbols, so it's waste of time to search for '@'.

The second optimization is to scan only suffixes of symbol names
instead of entire symbol names. Symbol names can be very long, but
symbol versions are usually short, so scanning entire symbol names
is waste of time, too.

There are some error cases which we no longer be able to detect
with this patch. I don't think it's a major drawback because they
are minor errors. Speed is more important.

This change improves LLD with debug info self-link time from
6.6993 seconds to 6.3426 seconds (or -5.3%).

Differential Revision: https://reviews.llvm.org/D22433

llvm-svn: 275711
2016-07-17 17:23:17 +00:00
Rui Ueyama 434b56179e Add a pointer to a source file to SymbolBody.
Previously, each subclass of SymbolBody had a pointer to a source
file from which it was created. So, there was no single way to get
a source file for a symbol. We had getSourceFile<ELFT>(), but the
function was a bit inconvenient as it's a template.

This patch makes SymbolBody have a pointer to a source file.
If a symbol is not created from a file, the pointer has a nullptr.

llvm-svn: 275701
2016-07-17 03:11:46 +00:00
Rui Ueyama 803b120ba1 Add GotEntrySize/GotPltEntrySize to ELF target.
Patch by H.J Lu.

For x86-64 psABI, the entry size of .got and .got.plt sections is 8
bytes for both LP64 and ILP32.  Add GotEntrySize and GotPltEntrySize
to ELF target instead of using size of ELFT::uint.  Now we can generate
a simple working x32 executable.

Differential Revision: http://reviews.llvm.org/D22288

llvm-svn: 275301
2016-07-13 18:55:14 +00:00
Peter Smith fb05cd997c Recommit R274836 Add Thunk support framework for ARM and Mips
The TinyPtrVector of const Thunk<ELFT>* in InputSections.h can cause 
build failures on certain compiler/library combinations when Thunk<ELFT> 
is not a complete type or is an abstract class. Fixed by making Thunk<ELFT>
non Abstract.

type or is an abstract class 

llvm-svn: 274863
2016-07-08 16:10:27 +00:00
Peter Smith eeb827447e Revert R274836 Add Thunk support framework for ARM and Mips
This seems to be causing a buildbot failure on lld-x86_64-freebsd. Will
reproduce locally and fix. 

llvm-svn: 274841
2016-07-08 12:25:50 +00:00
Peter Smith de01b98a26 Add Thunk support framework for ARM and Mips
Generalise the Mips LA25 Thunk code and implement ARM and Thumb
    interworking Thunks.
    
    - Introduce a new module Thunks.cpp to store the Target Specific Thunk
      implementations.
    - DefinedRegular and Shared have a ThunkData field to record Thunk.
    - A Target can have more than one type of Thunk.
    - Support PC-relative calls to Thunks.
    - Support Thunks to PLT entries.
    - Existing Mips LA25 Thunk code integrated.
    - Support for ARMv7A interworking Thunks.
    
    Limitations:
    - Only one Thunk per SymbolBody, this is sufficient for all currently
      implemented Thunks.
    - ARM thunks assume presence of V6T2 MOVT and MOVW instructions.

    Differential revision: http://reviews.llvm.org/D21891

llvm-svn: 274836
2016-07-08 11:13:40 +00:00
Rui Ueyama f4d9338dfb Move demangle() from Symbols.cpp to Strings.cpp.
Symbols.cpp contains functions to handle ELF symbols.
demangle() function is essentially a function to work on a
string rather than on an ELF symbol. So Strings.cpp is a
better place to put that function.

This change also make demangle to demangle symbols unconditionally.
Previously, it demangled symbols only when Config->Demangle is true.

llvm-svn: 274804
2016-07-07 23:04:15 +00:00
Rafael Espindola 580d7a1b1e -Bsymbolic should not make symbols more preemptable.
But it was doing that for protected undefined symbols.

llvm-svn: 274803
2016-07-07 22:50:54 +00:00
George Rimar 4365158689 [ELF] - Implemented support of default/non-default symbols versions
t is possible to create new version of symbol instead of depricated one
using combination of version script and asm commands. For example:

__asm__(".symver b_1,b@LIBSAMPLE_1.0");
int b_1() { return 10; }
__asm__(".symver b_2,b@@LIBSAMPLE_2.0");
int b_2() { return 20; }

This code makes b_2() to be default implementation for b().
b_1() is used for compatibility with binaries compiled against
library of older version LIBSAMPLE_1.0.

This patch implements support for above functionality in lld.

Differential revision: http://reviews.llvm.org/D21681

llvm-svn: 274002
2016-06-28 08:21:10 +00:00
George Rimar d3566309eb [ELF] - Recommit r273143("[ELF] - Basic versioned symbols support implemented.")
With fix:
-soname flag was not set in testcase. Hash calculated for base def was different on local
and bot machines because filename fos used for calculating.

Initial commit message:
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).

This patch allows programs that using simple scripts to link and run.

Differential revision: http://reviews.llvm.org/D21018

llvm-svn: 273152
2016-06-20 11:55:12 +00:00
George Rimar d03f97211a Revert r273143 "[ELF] - Basic versioned symbols support implemented."
It broke buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast

llvm-svn: 273146
2016-06-20 10:29:53 +00:00
George Rimar c31fee2212 [ELF] - Basic versioned symbols support implemented.
Patch implements basic support of versioned symbols.
There is no wildcards patterns matching except local: *;
There is no support for hierarchies.
There is no support for symbols overrides (@ vs @@ not handled).

This patch allows programs that using simple scripts to link and run.

Differential revision: http://reviews.llvm.org/D21018

llvm-svn: 273143
2016-06-20 10:16:33 +00:00
Simon Atanasyan 4132511cdc [ELF][MIPS] Support GOT entries for non-preemptible symbols with different addends
There are two motivations for this patch. The first one is a preparation
for support MIPS TLS relocations. It might sound like a joke but for GOT
entries related to TLS relocations MIPS ABI uses almost regular approach
with creation of dynamic relocations for each GOT enty etc. But we need
to separate these 'regular' TLS related entries from MIPS specific local
and global parts of GOT. ABI declare simple solution - all TLS related
entries allocated at the end of GOT after local/global parts. The second
motivation it to support GOT relocations for non-preemptible symbols
with addends. If we have more than one GOT relocations against symbol S
with different addends we need to create GOT entries for each unique
Symbol/Addend pairs.

So we store all MIPS GOT entries in separate containers. For non-preemptible
symbols we have to maintain two data structures. The first one is MipsLocal
vector. Each entry corresponds to the GOT entry from the 'local' part
of the GOT contains the symbol's address plus addend. The second one
is MipsLocalMap. It is a map from Symbol/Addend pair to the GOT index.

Differential Revision: http://reviews.llvm.org/D21297

llvm-svn: 273127
2016-06-19 21:39:37 +00:00
Rui Ueyama 4a90f57ef2 Rename PltZero -> PltHeader.
PltZero (or PLT[0]) was an appropriate name for the little code
we have at beginning of the PLT section when we only supported x86
since the code for x86 just fits in the first PLT slot.

It's not the case anymore. The code for ARM64 occupies first two
slots, so PltZero spans PLT[0] and PLT[1], for example.
This patch renames it to avoid confusion.

llvm-svn: 272913
2016-06-16 16:28:50 +00:00
Rafael Espindola 65c65ce897 Don't include --start-lib/--end-lib files twice.
This should never happen with correct programs, but it is trivial
write a testcase where lld would crash or report duplicated
symbols. We now behave like when an archive is used and include the
file only once.

llvm-svn: 272724
2016-06-14 21:56:36 +00:00
Rafael Espindola 07543a8c2d Use a reference instead of a pointer. NFC.
llvm-svn: 272719
2016-06-14 21:40:23 +00:00
Rui Ueyama 70595aae64 Inline SymbolBody::init. NFC.
I think this function was too short to be an independent function.

llvm-svn: 270534
2016-05-24 04:51:49 +00:00
Rafael Espindola 66434562e7 Fix copy relocations in pie.
We were creating the copy relocations just fine, but then thinking that
the .bss position could be preempted and creating a dynamic relocation
to it, which would crash at runtime since that memory is read only.

llvm-svn: 268668
2016-05-05 19:41:49 +00:00
Peter Collingbourne 6a4225962d ELF: Forbid all relative relocations to absolute symbols in PIC, except for weak undefined.
Weak undefined symbols resolve to the image base. This is a little strange,
but it allows us to link function calls to such symbols. Normally such a
call will be guarded with a comparison, which will load a zero from the GOT.

There's one example of such a function call in crti.o in Linux's CRT.

As part of this change, I also needed to make the synthetic start and end
symbols image base relative in the case where their sections were empty,
so that PC-relative references to those symbols would continue to work.

Differential Revision: http://reviews.llvm.org/D19844

llvm-svn: 268350
2016-05-03 01:21:08 +00:00
Rui Ueyama 6d0cd2b62b Teach Undefined symbols from which file they are created from.
This patch increases the size of Undefined by the size of a pointer,
but it wouldn't actually increase the size of memory that LLD uses
because we are not allocating the exact size but the size of the
largest SymbolBody.

llvm-svn: 268310
2016-05-02 21:30:42 +00:00
Peter Collingbourne 4f9527065c ELF: New symbol table design.
This patch implements a new design for the symbol table that stores
SymbolBodies within a memory region of the Symbol object. Symbols are mutated
by constructing SymbolBodies in place over existing SymbolBodies, rather
than by mutating pointers. As mentioned in the initial proposal [1], this
memory layout helps reduce the cache miss rate by improving memory locality.

Performance numbers:

           old(s) new(s)
Without debug info:
chrome      7.178  6.432 (-11.5%)
LLVMgold.so 0.505  0.502 (-0.5%)
clang       0.954  0.827 (-15.4%)
llvm-as     0.052  0.045 (-15.5%)
With debug info:
scylla      5.695  5.613 (-1.5%)
clang      14.396 14.143 (-1.8%)

Performance counter results show that the fewer required indirections is
indeed the cause of the improved performance. For example, when linking
chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and
instructions per cycle increases from 0.78 to 0.83. We are also executing
many fewer instructions (15,516,401,933 down to 15,002,434,310), probably
because we spend less time allocating SymbolBodies.

The new mechanism by which symbols are added to the symbol table is by calling
add* functions on the SymbolTable.

In this patch, I handle local symbols by storing them inside "unparented"
SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating
these SymbolBodies, we can probably do that separately.

I also removed a few members from the SymbolBody class that were only being
used to pass information from the input file to the symbol table.

This patch implements the new design for the ELF linker only. I intend to
prepare a similar patch for the COFF linker.

[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html

Differential Revision: http://reviews.llvm.org/D19752

llvm-svn: 268178
2016-05-01 04:55:03 +00:00
Rui Ueyama 62ee16faa8 Remove Size from Undefined symbol.
There seems to be no reason to keep st_size of undefined symbols.
This patch removes the member for it. This patch will change outputs
in cases that undefined symbols are copied to output, but I think
this is unimportant.

Differential Revision: http://reviews.llvm.org/D19574

llvm-svn: 267826
2016-04-28 00:26:54 +00:00
Peter Collingbourne 60976ed7c0 ELF: Merge UndefinedBitcode and UndefinedElf. NFC.
Differential Revision: http://reviews.llvm.org/D19566

llvm-svn: 267640
2016-04-27 00:05:06 +00:00
Peter Collingbourne 892d498017 ELF: Re-implement -u directly and remove CanKeepUndefined flag.
The semantics of the -u flag are to load the lazy symbol named by the flag. We
were previously relying on this behavior falling out of symbol resolution
against a synthetic undefined symbol, but that didn't quite give us the
correct behavior, so we needed a flag to mark symbols created with -u so
we could treat them specially in the writer. However, it's simpler and less
error prone to implement the required behavior directly and remove the flag.

This fixes an issue where symbols loaded with -u would receive hidden
visibility even when the definition in an object file had wider visibility.

Differential Revision: http://reviews.llvm.org/D19560

llvm-svn: 267639
2016-04-27 00:05:03 +00:00
Peter Collingbourne dbe4187d11 ELF: Simplify preemption logic. Do not include weak undefined symbols in non-DSOs.
Add a test for -Bsymbolic + undefined symbols.

llvm-svn: 267323
2016-04-24 04:29:59 +00:00
Peter Collingbourne d869a040ee ELF: Always include undefined DSO symbols in the symbol table.
Fixes check-llvm when bootstrapping.

Also remove mostly dead and most likely incorrect logic regarding preemption
of weak undefined symbols.

llvm-svn: 267314
2016-04-24 02:31:02 +00:00
Peter Collingbourne 66ac1d6152 ELF: Implement basic support for --version-script.
This patch only implements support for version scripts of the form:
  { [ global: symbol1; symbol2; [...]; symbolN; ] local: *; };
No wildcards are supported, other than for the local entry. Symbol versioning
is also not supported.

It works by introducing a new Symbol flag which tracks whether a symbol
appears in the global section of a version script.

This patch also simplifies the logic in SymbolBody::isPreemptible(), and
teaches it to handle the case where symbols with default visibility in DSOs
do not appear in the dynamic symbol table because of a version script.

Fixes PR27482.

Differential Revision: http://reviews.llvm.org/D19430

llvm-svn: 267208
2016-04-22 20:21:26 +00:00
Rui Ueyama 8bf71066c5 Inline SymbolTable::compareCommons and add comments. NFC.
llvm-svn: 267195
2016-04-22 19:34:59 +00:00
Peter Collingbourne dadcc17ead ELF: Move Visibility, IsUsedInRegularObj and MustBeInDynSym flags to Symbol.
These are properties of a symbol name, rather than a particular instance
of a symbol in an object file. We can simplify the code by collecting these
properties in Symbol.

The MustBeInDynSym flag has been renamed ExportDynamic, as its semantics
have been changed to be the same as those of --dynamic-list and
--export-dynamic-symbol, which do not cause hidden symbols to be exported.

Differential Revision: http://reviews.llvm.org/D19400

llvm-svn: 267183
2016-04-22 18:42:48 +00:00
Rafael Espindola 4d480ed545 Internalize linkonce_odr more often.
Since there is a copy in every translation unit that uses them, they can
be omitted from the symbol table if the address is not significant.

This still doesn't catch as many cases as the gold plugin. The
difference is that we check canBeOmittedFromSymbolTable in each file and
use lazy loading which limits what it can do. Gold checks it in the merged file.

I think the correct way of getting the same results as gold is just to
cache in the IR the result of canBeOmittedFromSymbolTable.

llvm-svn: 267063
2016-04-21 21:44:25 +00:00
Rafael Espindola ae605c1b0c Start adding support for internalizing shared libraries.
llvm-svn: 267045
2016-04-21 20:35:25 +00:00
Rafael Espindola 3666025880 Two small related fixes.
* A hidden undefined is not preemptable.
* It is always zero, so we don't need a dynamic reloc for it.

llvm-svn: 266424
2016-04-15 11:57:07 +00:00
Rafael Espindola f9d3dcf0a8 Don't set MustBeInDynSym for hidden symbols.
llvm-svn: 266230
2016-04-13 19:03:34 +00:00
Peter Collingbourne f6e9b4ec24 ELF: Use hidden visibility for all DefinedSynthetic symbols.
This simplifies the code by allowing us to remove the visibility argument
to functions that create synthetic symbols.

The only functional change is that the visibility of the MIPS "_gp" symbol
is now hidden. Because this symbol is defined in every executable or DSO, it
would be difficult to observe a visibility change here.

Differential Revision: http://reviews.llvm.org/D19033

llvm-svn: 266208
2016-04-13 16:57:28 +00:00
Rafael Espindola 8caf33c483 Cleanup the handling of MustBeInDynSym and IsUsedInRegularObj.
Now MustBeInDynSym is only true if the symbol really must be in the
dynamic symbol table.

IsUsedInRegularObj is only true if the symbol is used in a .o or -u. Not
a .so or a .bc.

A benefit is that this is now done almost entirilly during symbol
resolution. The only exception is copy relocations because of aliases.

This includes a small fix in that protected symbols in .so don't force
executable symbols to be exported.

This also opens the way for implementing internalize for -shared.

llvm-svn: 265826
2016-04-08 18:39:03 +00:00
Rafael Espindola a15fb15b05 Don't lower the visibility because of shared symbols.
If a shared library has a protected symbol 'foo', that doesn't imply
that the symbol 'foo' in the output should be protected or not.

llvm-svn: 265794
2016-04-08 16:11:42 +00:00
Rui Ueyama f8baa66056 ELF: Implement --start-lib and --end-lib
start-lib and end-lib are options to link object files in the same
semantics as archive files. If an object is in start-lib and end-lib,
the object is linked only when the file is needed to resolve
undefined symbols. That means, if an object is in start-lib and end-lib,
it behaves as if it were in an archive file.

In this patch, I introduced a new notion, LazyObjectFile. That is
analogous to Archive file type, but that works for a single object
file instead of for an archive file.

http://reviews.llvm.org/D18814

llvm-svn: 265710
2016-04-07 19:24:51 +00:00
Rafael Espindola 74031ba1e9 Simplify dynamic relocation creation.
The position of a relocation can always be expressed as an offset in an
output section.

llvm-svn: 265682
2016-04-07 15:20:56 +00:00
Rafael Espindola 5e34568f79 Use a bit in SymbolBody to store CanKeepUndefined.
UndefinedElf for 64 bits goes from 72 to 64 bytes.

llvm-svn: 265543
2016-04-06 14:31:03 +00:00
Rafael Espindola f47657301b Change the type hierarchy for undefined symbols.
We have to differentiate undefined symbols from bitcode and undefined
symbols from other sources.

Undefined symbols from bitcode should not inhibit the symbol being
internalized. Undefined symbols from other sources should.

llvm-svn: 265536
2016-04-06 13:22:41 +00:00
Rafael Espindola 242ffa8da1 Fix use of uninitialized.
The names of undefined locals are not used, so I don't think it is
possible to actually test this.

llvm-svn: 265534
2016-04-06 12:19:25 +00:00
Rafael Espindola f9b79a479e Rename a few Visibility arguments to StOther.
llvm-svn: 265533
2016-04-06 12:14:31 +00:00
Rafael Espindola d9a1717efc Remove redundant argument. NFC.
llvm-svn: 265386
2016-04-05 11:47:46 +00:00
Peter Collingbourne d0856a6bb2 ELF: Make SymbolBody::compare a non-template function.
Differential Revision: http://reviews.llvm.org/D18781

llvm-svn: 265372
2016-04-05 00:47:58 +00:00
Peter Collingbourne e8afa4971c ELF: Preserve MustBeInDynSym for bitcode symbols.
Make sure to copy the MustBeInDynSym field when replacing shared symbols with
bitcode symbols, and when replacing bitcode symbols with regular symbols
in addCombinedLtoObject. Fixes interposition of DSO symbols with bitcode
symbols in the main executable.

Differential Revision: http://reviews.llvm.org/D18780

llvm-svn: 265371
2016-04-05 00:47:55 +00:00
Rui Ueyama b5792b231b Rename Other -> StOther.
"Other" as a name is too generic, so name it StOther.

llvm-svn: 265332
2016-04-04 19:09:08 +00:00
Rafael Espindola ccfe3cb3d6 Don't store an Elf_Sym for most symbols.
Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.

Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.

There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.

The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.

As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.

In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.

llvm-svn: 265293
2016-04-04 14:04:16 +00:00
Rui Ueyama bfc1d9d976 Remove DefinedElf class.
DefinedElf was a superclass of DefinedRegular and SharedSymbol classes
and represented the notion of defined symbols created for ELF symbols.

It turned out that we didn't use that class often. We had only two
occurrences of dyn_cast'ing to DefinedElf, and both were easily
rewritten without it.

The class was also a bit confusing. The concept of "created for ELF
symbol" is orthogonal to defined/undefined types. However, we had
two distinct classes, DefinedElf and UndefinedElf.

This patch simply removes the class. Now the class hierarchy is one
level shallower.

llvm-svn: 265234
2016-04-02 18:06:18 +00:00
Simon Atanasyan 13f6da1d2c [ELF] Implement infrastructure for thunk code creation
Some targets might require creation of thunks. For example, MIPS targets
require stubs to call PIC code from non-PIC one. The patch implements
infrastructure for thunk code creation and provides support for MIPS
LA25 stubs. Any MIPS PIC code function is invoked with its address
in register $t9. So if we have a branch instruction from non-PIC code
to the PIC one we cannot make the jump directly and need to create a small
stub to save the target function address.
See page 3-38 ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf

- In relocation scanning phase we ask target about thunk creation necessity
by calling `TagetInfo::needsThunk` method. The `InputSection` class
maintains list of Symbols requires thunk creation.

- Reassigning offsets performed for each input sections after relocation
scanning complete because position of each section might change due
thunk creation.

- The patch introduces new dedicated value for DefinedSynthetic symbols
DefinedSynthetic::SectionEnd. Synthetic symbol with that value always
points to the end of the corresponding output section. That allows to
escape updating synthetic symbols if output sections sizes changes after
relocation scanning due thunk creation.

- In the `InputSection::writeTo` method we write thunks after corresponding
input section. Each thunk is written by calling `TargetInfo::writeThunk` method.

- The patch supports the only type of thunk code for each target. For now,
it is enough.

Differential Revision: http://reviews.llvm.org/D17934

llvm-svn: 265059
2016-03-31 21:26:23 +00:00
Davide Italiano f6523aecd7 Revert r264961. I didn't have asserts enable when testing.
llvm-svn: 264692
2016-03-29 02:20:10 +00:00
Davide Italiano a50e0b97f1 [LTO] Include bitcode symbol name in unreachable messages.
llvm-svn: 264691
2016-03-29 01:40:07 +00:00
Rafael Espindola 5432287bad Make needsPlt a plain function instead of a template.
llvm-svn: 264267
2016-03-24 12:55:27 +00:00
Davide Italiano 901de03fe2 [ELF] Simplify code a bit. No functional change.
llvm-svn: 263999
2016-03-21 22:44:24 +00:00
Rafael Espindola 8381c565c3 Make evaluation order explicit.
llvm-svn: 263762
2016-03-17 23:36:19 +00:00
Rui Ueyama 9328b2cdde Use ELFT instead of ELFFile<ELFT>.
llvm-svn: 263510
2016-03-14 23:16:09 +00:00
George Rimar 343580097d [ELF] implement --warn-common/--no-warn-common
-warn-common
Warn when a common symbol is combined with another common symbol
or with a symbol definition.  Unix linkers allow  this  somewhat
sloppy  practice, but linkers on some other operating systems do
not.  This option allows you to  find  potential  problems  from
combining global symbols.

Differential revision: http://reviews.llvm.org/D17998

llvm-svn: 263413
2016-03-14 09:19:30 +00:00
Rui Ueyama 2df72898c8 Remove `else` after `return`.
llvm-svn: 263392
2016-03-13 20:54:38 +00:00
Rui Ueyama c4466605d8 ELF: Redefine canBeDefined as a member function of SymbolBody.
We want to make SymbolBody the central place to query symbol information.
This patch also renames canBePreempted to isPreemptible because I feel that
the latter is slightly better (the former is three words and the latter
is two words.)

llvm-svn: 263386
2016-03-13 19:48:18 +00:00
Rui Ueyama 7ede54310a Redefine isGnuIfunc as a member function of SymbolBody.
llvm-svn: 263365
2016-03-13 04:40:14 +00:00
George Rimar 777f96304e Recommit of r263252, [ELF] - Change all messages to lowercase to be consistent.
which was reverted because included
unrelative changes by mistake.

Original commit message:

[ELF] - Change all messages to lowercase to be consistent.

That is directly opposite to http://reviews.llvm.org/D18045,
which was reverted.

This patch changes all messages to start from lowercase letter if
they were not before.

That is done to be consistent with clang.

Differential revision: http://reviews.llvm.org/D18085

llvm-svn: 263337
2016-03-12 08:31:34 +00:00
Rui Ueyama f714955402 Revert r263252: "[ELF] - Change all messages to lowercase to be consistent."
This reverts commit r263252 because the change contained unrelated changes.

llvm-svn: 263272
2016-03-11 18:46:51 +00:00
George Rimar 96bcdae1a5 [ELF] - Change all messages to lowercase to be consistent.
That is directly opposite to http://reviews.llvm.org/D18045,
which was reverted.

This patch changes all messages to start from lowercase letter if
they were not before.

That is done to be consistent with clang.

Differential revision: http://reviews.llvm.org/D18085

llvm-svn: 263252
2016-03-11 16:40:55 +00:00
Rafael Espindola 1f5b70f64f Represent local symbols with DefinedRegular.
llvm-svn: 263237
2016-03-11 14:21:37 +00:00
Rafael Espindola 87d9f10733 Compute value of local symbol with getVA.
llvm-svn: 263225
2016-03-11 12:19:05 +00:00
Rafael Espindola 67d72c02bc Create a SymbolBody for locals.
pr26878 shows a case where locals have to be in the got.

llvm-svn: 263222
2016-03-11 12:06:30 +00:00
George Rimar 56e0d53e92 [ELF] - Move initSymbols() to Driver.cpp. NFC.
That is followup for http://reviews.llvm.org/D18047
patch. initSymbols() moved to Driver.cpp and made static.

llvm-svn: 263214
2016-03-11 10:07:18 +00:00
Rui Ueyama 17d6983a4e Rename MaxAlignment -> Alignment.
We can argue about a maximum alignment of a group of symbols,
but for each symbol, there is only one alignment.
So it is a bit weird that each symbol has a "maximum alignment".

llvm-svn: 263151
2016-03-10 18:58:53 +00:00
George Rimar 3498c7fbd0 [ELF] - Refactor of SymbolBody::compare()
That makes it a bit shorter.

Differential revision: http://reviews.llvm.org/D18004

llvm-svn: 263144
2016-03-10 18:49:24 +00:00
George Rimar 5a3dcf4edf [ELF] - Do not call doInitSymbols for all ELFTs
It looks a bit wierd that we have to initialize symbols for all ELFT 
types when we use only one ELFT for link. We can only init those
that we need. Patch fixes it.

Differential revision: http://reviews.llvm.org/D18047

llvm-svn: 263133
2016-03-10 17:38:49 +00:00
Rafael Espindola e090fb2891 ELF: Remove non-standard ELF features from AMDGPU target.
Patch by Tom Stellard!

llvm-svn: 263063
2016-03-09 21:37:22 +00:00
Rafael Espindola 4f29c1a337 lto: Record visibility in defined symbols.
llvm-svn: 262835
2016-03-07 17:14:36 +00:00
George Rimar 2f0fab53e4 [ELF] - Simplify a SymbolBody class interface a bit.
Get rid of few accessors in that class, and replace
them with direct fields access.

Differential revision: http://reviews.llvm.org/D17879

llvm-svn: 262796
2016-03-06 06:26:18 +00:00
Davide Italiano 255730cdc5 [ELF] Generalize symbol type handling.
SymbolBody constructor and friends take isFunc and isTLS boolean arguments.
ELF symbols have already a type so than be easily passed as argument.
If we want to support another type, this scheme is not good enough, that is,
the current code logic would require passing another `bool isObject` around.
Up to two argument, this stretching exercise was a little bit goofy but
still acceptable, but with more types to support, is just too much, IMHO.

Change the code so that the type is passed instead.

Differential Revision:   http://reviews.llvm.org/D17871

llvm-svn: 262684
2016-03-04 01:55:28 +00:00
George Rimar aa4dc20f09 [ELF] - Create _DYNAMIC symbol for dynamic output
lld needs to provide _DYNAMIC symbol when creating a shared library
both bfd and gold do that.

This should fix the https://llvm.org/bugs/show_bug.cgi?id=26732

Differential revision: http://reviews.llvm.org/D17607

llvm-svn: 262348
2016-03-01 16:23:13 +00:00
Rafael Espindola e0df00b91f Rename elf2 to elf.
llvm-svn: 262159
2016-02-28 00:25:54 +00:00
Rui Ueyama 72acaa1d17 Add comment on AMDGPU that the difference has no obvious reason.
llvm-svn: 262026
2016-02-26 15:39:26 +00:00
George Rimar 9e8593949d Description of symbols is avalable here:
https://docs.oracle.com/cd/E53394_01/html/E54766/u-etext-3c.html

It is said that:
_etext - The address of _etext is the first 
location after the last read-only loadable segment.

_edata - The address of _edata is the first 
location after the last read-write loadable segment.

_end - If the address of _edata is greater than the address 
of _etext, the address of _end is same as the address of _edata.

In real life _end and _edata has different values for that case.
Both gold/bfd set _edata to the end of the last non SHT_NOBITS section.
This patch do the same for consistency.

It should fix the https://llvm.org/bugs/show_bug.cgi?id=26729.

Differential revision: http://reviews.llvm.org/D17601

llvm-svn: 262019
2016-02-26 14:36:36 +00:00
Rui Ueyama 0b28952993 ELF: Implement ICF.
This patch implements the same algorithm as LLD/COFF's ICF. I'm
not going to repeat the same description about how it works, so you
want to read the comment in ICF.cpp in this patch if you want to know
the details. This algorithm should be more powerful than the ICF
algorithm implemented in GNU gold. It can even merge mutually-recursive
functions (which is harder than one might think).

ICF is a fairly effective size optimization. Here are some examples.

 LLD:   37.14 MB -> 35.80 MB (-3.6%)
 Clang: 59.41 MB -> 57.80 MB (-2.7%)

The lacking feature is "safe" version of ICF. This merges all
identical sections. That is not compatible with a C/C++ language
requirement that two distinct functions must have distinct addresses.

But as long as your program do not rely on the pointer equality
(which is in many cases true), your program should work with the
feature. LLD works fine for example.

GNU gold implements so-called "safe ICF" that identifies functions
that are safe to merge by heuristics -- for example, gold thinks
that constructors are safe to merge because there is no way to
take an address of a constructor in C++. We have a different idea
which David Majnemer suggested that we add NOPs at beginning of
merged functions so that two or more pointers can have distinct
values. We can do whichever we want, but this patch does not
include neither.

http://reviews.llvm.org/D17529

llvm-svn: 261912
2016-02-25 18:43:51 +00:00
Rafael Espindola 148445ef98 Add support for weak symbols in LTO.
llvm-svn: 261881
2016-02-25 16:25:41 +00:00
Rui Ueyama c89bff2ce1 Handle bitcode files in archive files with --whole-archive.
This patch moves BitcodeFile instantiation into createObjectFile.
Previously, we handle bitcode files outside that function and did
not handle for --whole-archive.

http://reviews.llvm.org/D17527

llvm-svn: 261663
2016-02-23 18:17:11 +00:00
Rafael Espindola 6feecec0b0 Handle undef symbols in LTO.
This also handles bc files is archives.

llvm-svn: 261374
2016-02-19 22:50:16 +00:00
Rafael Espindola 9f77ef0c08 Add initial LTO support.
llvm-svn: 260726
2016-02-12 20:54:57 +00:00
Rafael Espindola a0a65f973a Use the plt entry as the address of some symbols.
This is the function equivalent of a copy relocation.

Since functions are expected to change sizes, we cannot use copy
relocations. In situations where one would be needed, what is done
instead is:
* Create a plt entry
* Output an undefined symbol whose addr is the plt entry.

The dynamic linker makes sure any shared library uses the plt entry as
the function address.

llvm-svn: 260224
2016-02-09 15:11:01 +00:00
Rafael Espindola abebed982a Rename IsUsedInDynamicReloc to MustBeInDynSym.
The variable was marking various cases where a symbol must be included
in the dynamic symbol table. Being used by a dynamic relocation was only
one of them.

llvm-svn: 259889
2016-02-05 15:27:15 +00:00
Rui Ueyama 512c61df1c Define SymbolBody::getSize instead of getSymSize(SymbolBody&). NFC.
llvm-svn: 259613
2016-02-03 00:12:24 +00:00
George Rimar 5c36e5938d [ELF] Implemented -Bsymbolic-functions command line option
-Bsymbolic-functions: 
When creating a shared library, bind references to global 
function symbols to the definition within the shared library, if any.

This patch also fixed behavior of already existent -Bsymbolic:
previously PLT entries were created even if -Bsymbolic was specified.

Differential revision: http://reviews.llvm.org/D16411

llvm-svn: 259481
2016-02-02 09:28:53 +00:00
Rui Ueyama 71c066d8cf ELF: Include archive names in error messages.
If object files are drawn from archive files, the error message should
be something like "conflict symbols in foo.a(bar.o) and baz.o" instead
of "conflict symbols in bar.o and baz.o". This patch implements that.

llvm-svn: 259475
2016-02-02 08:22:41 +00:00
Rui Ueyama b5a6970ace ELF: Teach SymbolBody about how to get its addresses.
Previously, the methods to get symbol addresses were somewhat scattered
in many places. You can use getEntryAddr returns the address of the symbol,
but if you want to get the GOT address for the symbol, you needed to call
Out<ELFT>::Got->getEntryAddr(Sym). This change adds new functions, getVA,
getGotVA, getGotPltVA, and getPltVA to SymbolBody, so that you can use
SymbolBody as the central place to ask about symbols.

http://reviews.llvm.org/D16710

llvm-svn: 259404
2016-02-01 21:00:35 +00:00
George Rimar 02ca17906d [ELF] - Symbols from object files that override symbols in DSO are added to .dynsym table.
Main executable did not export symbols that exist both in the main executable and in DSOs before this patch.
Symbols from object files that override symbols in DSO should be added to .dynsym table.

Differential revision: http://reviews.llvm.org/D16405

llvm-svn: 258672
2016-01-25 08:44:38 +00:00
Rafael Espindola 65e80b963a Rename IgnoredWeak to Ignored.
Thanks to Rui for the suggestion.

llvm-svn: 258189
2016-01-19 21:19:52 +00:00
Rafael Espindola 3a6a0a0109 Delete addIgnoredStrong.
It is not needed now that we resolve symbols is shared libraries
correctly.

llvm-svn: 258104
2016-01-19 00:05:54 +00:00
Rafael Espindola 0bc0c02387 Prefer symbols from .o over .so.
This matches the behavior of gold and bfd ld.

llvm-svn: 258102
2016-01-18 23:54:05 +00:00
Rui Ueyama df1545163f Update comment for __cxa_demangle again to make it precise.
Thanks for Ed Maste for the suggestion.

llvm-svn: 257685
2016-01-13 22:09:09 +00:00
Rui Ueyama e16d1b0a9c Update comment for __cxa_demangle.
llvm-svn: 257678
2016-01-13 21:39:40 +00:00
Rui Ueyama 5fa978b4c0 Attempt to make FreeBSD buildbot green.
It seems that __cxa_demangle function on the buildbot tried to demangle
a variable "tlsvar" as a C++ symbol. Do not call that function unless
it does not start with "_Z" which is the prefix of mangled names.

llvm-svn: 257661
2016-01-13 19:40:13 +00:00
Rui Ueyama a4a628fb51 Demangle symbols when including them in error messages.
llvm-svn: 257647
2016-01-13 18:55:39 +00:00
Rui Ueyama 09eb0b3b3f Rename IgnoredUndef -> Ignored since it is not an undefined symbol.
Also rename Ignored -> IgnoredWeak and IgnoredStrong -> Ignored,
since strong symbol is a norm.

llvm-svn: 257507
2016-01-12 19:24:55 +00:00
Simon Atanasyan 188558e5eb [ELF][MIPS] Prevent substitution of _gp_disp symbol
On MIPS O32 ABI, _gp_disp is a magic symbol designates offset between
start of function and gp pointer into GOT. To make seal with such symbol
we add new method addIgnoredStrong(). It adds ignored symbol with global
binding to prevent the symbol substitution. The addIgnored call is not
enough here because this call adds a weak symbol which might be
substituted by symbol from shared library.

Differential Revision: http://reviews.llvm.org/D16084

llvm-svn: 257449
2016-01-12 06:23:57 +00:00
Rui Ueyama 83cd6e00e9 Remove unnecessary `lld::`.
llvm-svn: 256970
2016-01-06 20:11:55 +00:00
Rui Ueyama 533c03078b Do not use templates to instantiate {Object,Shared}Files.
createELFFile looked complex because of its use of template,
so I want to keep it private within this file.

llvm-svn: 256880
2016-01-06 00:09:43 +00:00
Rui Ueyama a246e094bc Factor out static members from DefinedRegular.
This patch moves statically-allocated Elf_Sym objects out
of DefinedRegular class, so that the class definition becomes
smaller.

llvm-svn: 256408
2015-12-25 06:12:18 +00:00
Rafael Espindola 1119191c4f Make it possible to create common symbols from bitcode.
Since the only missing bit was the size, I just replaced the Elf_Sym
with the size.

llvm-svn: 256384
2015-12-24 16:23:37 +00:00
Rafael Espindola 02ce26a1b4 Delete DefinedAbsolute.
There are 3 symbol types that a .bc can provide during lto: defined,
undefined, common.

Defined and undefined symbols have already been refactored. I was
working on common and noticed that absolute symbols would become an
oddity: They would be the only symbol type present in a .o but not in
a.bc.

Looking a bit more, other than the special section number they were only
used for special rules for computing values. In that way they are
similar to TLS, and we don't have a DefinedTLS.

This patch deletes it. With it we have a reasonable rule of the thumb
for having a symbol kind: It exists if it has special resolution
semantics.

llvm-svn: 256383
2015-12-24 14:22:24 +00:00
Rafael Espindola 4d4b06a0f8 Split Defined and DefinedElf.
This is similar to what was done for Undefined and opens the way for
having a symbol defined in bitcode.

llvm-svn: 256354
2015-12-24 00:47:42 +00:00
Rafael Espindola 5d7593bc59 Split Undefined and UndefinedElf.
I am working on adding LTO support to the new ELF lld.

In order to do that, it will be necessary to represent defined and
undefined symbols that are not from ELF files. One way to do it is to
change the symbol hierarchy to look like

Defined : SymbolBody
Undefined : SymbolBody

DefinedElf<ELFT> : Defined
UndefinedElf<ELFT> : Undefined

Another option would be to use bogus Elf_Sym, but I think that is
getting a bit too hackish.

This patch does the Undefined/UndefinedElf. Split. The next one
will do the Defined/DefinedElf split.

llvm-svn: 256289
2015-12-22 23:00:50 +00:00
Rui Ueyama 6be68525d0 Typedef uintX_t at beginning of a function just like others.
llvm-svn: 255853
2015-12-16 23:49:19 +00:00
Igor Kudrin b044af50f2 [ELF] Define symbols "_end" and "end" if referenced.
These symbols are expected to point to the end of the data segment.

Implements http://llvm.org/pr25528.

Differential Revision: http://reviews.llvm.org/D14833

llvm-svn: 253637
2015-11-20 02:32:35 +00:00
Rui Ueyama 86696f38ea ELF2: Avoid bitwise-OR hack. NFC.
The previous code is too clever that that needs a bit more
brain power than this new code.

llvm-svn: 250934
2015-10-21 19:41:03 +00:00
Rui Ueyama 8f2c4da65a ELF2: Rename getMostConstrainingVisibility -> getVisibility. NFC.
The previous name was too long.

llvm-svn: 250920
2015-10-21 18:13:47 +00:00
Rui Ueyama aca48ffb41 ELF2: Inititalize other symbols only once as well.
llvm-svn: 249645
2015-10-08 00:44:28 +00:00
Rui Ueyama 833ce281db ELF2: Make member variable names shorter.
I'm going to use them in other patches, and the names feel too long
despite their narrow scope.

llvm-svn: 249642
2015-10-08 00:29:00 +00:00
Rui Ueyama 9ea49c7948 ELF2: Initialize SyntheticOptional only once.
llvm-svn: 249636
2015-10-07 23:46:11 +00:00
Rafael Espindola 8e5560d894 Don't complain about symbols showing up in multible shared libraries.
llvm-svn: 248381
2015-09-23 14:23:59 +00:00
Rafael Espindola 9d06ab6ded Rename Chunks.(h|cpp) to InputSection.(h|cpp). NFC.
llvm-svn: 248226
2015-09-22 00:01:39 +00:00
Rui Ueyama 6666f6ad73 ELF2: Reduce nesting by returning early. NFC.
llvm-svn: 247168
2015-09-09 17:55:09 +00:00
Rui Ueyama 7da94a58a0 ELF2: Return early. NFC.
llvm-svn: 247165
2015-09-09 17:40:51 +00:00
Rafael Espindola 18173d420e Start adding support for symbols in shared libraries.
llvm-svn: 247019
2015-09-08 15:50:05 +00:00
Michael J. Spencer 1b348a68e5 [elf2] Add basic archive file support.
llvm-svn: 246886
2015-09-04 22:28:10 +00:00
Rafael Espindola 78471f0ec1 Merge visibility from all symbols with the same name.
The ELF spec says:

... if any reference to or definition of a name is a symbol with a
non-default visibility attribute, the visibility attribute must be
propagated to the resolving symbol in the linked object. If different
visibility attributes are specified for distinct references to or
definitions of a symbol, the most constraining visibility attribute
must be propagated to the resolving symbol in the linked object. The
attributes, ordered from least to most constraining, are:
STV_PROTECTED, STV_HIDDEN and STV_INTERNAL.

llvm-svn: 246603
2015-09-01 23:12:52 +00:00
Rafael Espindola f31f9617ca Remember the maximum alignment used to refer to a common symbol.
llvm-svn: 246517
2015-09-01 01:19:12 +00:00
Rafael Espindola daa92a6193 Keep the largest common symbol.
This requires templating some functions over ELFT, but that opens other cleanup
opportunities for future patches.

llvm-svn: 246405
2015-08-31 01:16:19 +00:00
Rafael Espindola 30e1797b38 Turn resolution.s into an exhaustive testcase.
Now that we print a symbol table and all symbol kinds are at least declared,
we can test all combinations that don't produce an error.

This also includes a few fixes to keep the test passing:

* Keep the strong symbol in a weak X strong pair
* Handle common symbols.

The common X common case will be finished in a followup patch.

llvm-svn: 246401
2015-08-30 23:17:30 +00:00