The condition whether a section is alive or not by default
is becoming increasingly complex, so the decision of garbage
collection is spreading over InputSection.h and MarkLive.cpp,
which is not a good state.
This moves the code to MarkLive.cpp, to keep the file the central
place to make decisions about garbage collection.
llvm-svn: 315384
Summary:
We were crashing when linking telnetd in FreeBSD because lld was emitting
corrupted output files for --norosegment. In this file the version index of some symbols
was set to 9 but lld only found 8 version definitions.
I am not sure how to create a minimal .so file that also exposes this behaviour so I just added the one that initially caused the error to Inputs/
This partially addresses https://bugs.llvm.org/show_bug.cgi?id=34705
Reviewers: ruiu, rafael, pcc, grimar
Reviewed By: ruiu
Subscribers: emaste, krytarowski
Tags: #lld
Differential Revision: https://reviews.llvm.org/D38397
llvm-svn: 315036
Summary:
We were crashing when linking telnetd in FreeBSD because lld was emitting
corrupted output files for --norosegment. In this file the version index of some symbols
was set to 9 but lld only found 8 version definitions.
I am not sure how to create a minimal .so file that also exposes this behaviour so I just added the one that initially caused the error to Inputs/
This partially addresses https://bugs.llvm.org/show_bug.cgi?id=34705
Reviewers: ruiu, rafael, pcc, grimar
Reviewed By: ruiu
Subscribers: emaste, krytarowski
Tags: #lld
Differential Revision: https://reviews.llvm.org/D38397
llvm-svn: 315035
This patch removes lot of static Instances arrays from different input file
classes and introduces global arrays for access instead. Similar to arrays we
have for InputSections/OutputSectionCommands.
It allows to iterate over input files in a non-templated code.
Differential revision: https://reviews.llvm.org/D35987
llvm-svn: 313619
If using --format=binary with an input file name that has one or more non-ascii
characters in, LLD has undefined behaviour (it crashes on my Windows Debug build)
when calling isalnum with these non-ascii characters. Instead, of calling
std::isalnum, this patch uses an internal version that ignores the locale and
checks a specific subset of characters.
Reviewers: ruiu
Differential Revision: https://reviews.llvm.org/D37331
llvm-svn: 312705
With this Symbol has the same size as before, but DefinedRegular goes
from 72 to 64 bytes.
I also find this a bit easier to read. There are fewer places
initializing File for example.
This has a small but measurable speed improvement on all tests (1%
max).
llvm-svn: 310142
Reviewing another change I noticed that we use "getSymbols" to mean
different things in different files. Depending on the file it can
return
ArrayRef<StringRef>
ArrayRef<SymbolBody*>
ArrayRef<Symbol*>
ArrayRef<Elf_Sym>
With this change it always returns an ArrayRef<SymbolBody*>. The other
functions are renamed getELFsyms() and getSymbolNames().
Note that we cannot return ArrayRef<Symbol*> instead of
ArreyRef<SymbolBody*> because local symbols have a SymbolBody but not
a Symbol.
llvm-svn: 309840
Summary:
If the linker is invoked with `--chroot /foo` and `/bar/baz.o`, it
tries to read the file from `/foo/bar/baz.o`. This feature is useful
when you are dealing with files created by the --reproduce option.
Reviewers: grimar
Subscribers: llvm-commits, emaste
Differential Revision: https://reviews.llvm.org/D35517
llvm-svn: 308646
With that in place we can use lld's own infrastructure for the low
level detail of dwarf parsing.
With this we don't decompress sections twice, we don't scan all
realocations and even with this simplistic implementation linking
clang with gdb index goes from 34.09 seconds to 20.80 seconds.
llvm-svn: 308544
The --exclude-libs option is not a popular option, but at least some
programs in Android depend on it, so it's worth to support it.
Differential Revision: https://reviews.llvm.org/D34422
llvm-svn: 305920
The ELF standard defines that the SHT_GROUP section as follows:
- its sh_link has the symbol index, and
- the symbol name is used to uniquify section groups.
Object files created by GNU gold does not seem to comply with the
standard. They have this additional rule:
- if the symbol has no name and a STT_SECTION symbol, a section
name is used instead of a symbol name.
If we don't do anything for this, the linker fails with a mysterious
error message if input files are generated by gas. It is unfortunate
but I think we need to support it.
Differential Revision: https://reviews.llvm.org/D34064
llvm-svn: 305218
This is PR33052, "Bug 33052 - -r eats comdats ".
To fix it I stop removing group section from out when -r is given
and fixing SHT_GROUP content when writing it just like we do some
other fixup, e.g. for Rel[a]. (it needs fix for section indices that
are in group).
Differential revision: https://reviews.llvm.org/D33485
llvm-svn: 304140
A variable `ComdatGroup` is not supposed to contain a large number of
items. Even when linking clang, it ends up having only 300K strings.
It doesn't make sense to use CachedHashStringRef for this hash table.
This patch has neutral or slightly positive impact on performance while
reducing code complexity.
llvm-svn: 303787
It seems virtually everyone who tries to do LTO build with Clang and
LLD was hit by a mistake to forget using llvm-ar command to create
archive files. I wasn't an exception. Since this is an annoying common
issue, it is probably better to handle that gracefully rather than
reporting an error and tell the user to redo build with different
configuration.
Differential Revision: https://reviews.llvm.org/D32721
llvm-svn: 302083
This patch is to ignore .debug_gnu_pub{names,types} sections if the
-gdb-index option was given.
Differential Revision: https://reviews.llvm.org/D32662
llvm-svn: 301710
This patch is to reduce amount of template uses. The new code is less
exciting and boring than before, but I think it is easier to read.
Differential Revision: https://reviews.llvm.org/D32467
llvm-svn: 301488
We can just use the existing SoName member variable. It now initially
contains what was in DefaultSoName and is modified if the .so has an
actual soname.
llvm-svn: 301259
Start using it in LLD to avoid needing to read bitcode again just to get the
target triple, and in llvm-lto2 to avoid printing symbol table information
that is inappropriate for the target.
Differential Revision: https://reviews.llvm.org/D32038
llvm-svn: 300300
Fixes PR32572.
When
(a) a library has no soname
and (b) library is given on the command line with path (and not through -L/-l flags)
DT_NEEDED entry for such library keeps the path as given.
This behavior is consistent with gold and bfd, and is used in compiler-rt test suite.
This is a second attempt after r300007 got reverted. This time relro-omagic test is
changed in a way to avoid hardcoding the path to the test directory in the objdump'd
binary.
llvm-svn: 300011
Fixes PR32572.
When
(a) a library has no soname
and (b) library is given on the command line with path (and not through -L/-l flags)
DT_NEEDED entry for such library keeps the path as given.
This behavior is consistent with gold and bfd, and is used in compiler-rt test suite.
llvm-svn: 300007
LogName member was added to construct input file names for logging
only once. This patch does this in a different way. Now toString
caches its results.
Differential Revision: https://reviews.llvm.org/D31546
llvm-svn: 299375
Introduce symbol table data structures that can be potentially written to
disk, have the LTO library build those data structures using temporarily
constructed modules and redirect the LTO library implementation to go through
those data structures. This allows us to remove the LLVMContext and Modules
owned by InputFile.
With this change I measured a peak memory consumption decrease from 5.4GB to
2.8GB in a no-op incremental ThinLTO link of Chromium on Linux. The impact on
memory consumption is larger in COFF linkers where we are currently forced
to materialize all metadata in order to read linker options. Peak memory
consumption linking a large piece of Chromium for Windows with full LTO and
debug info decreases from >64GB (OOM) to 15GB.
Part of PR27551.
Differential Revision: https://reviews.llvm.org/D31364
llvm-svn: 299168
Previously, undefined symbol errors are one line like this
and wasn't easy to read.
/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/Writer.cpp:207: undefined symbol 'lld:🧝:EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld:🧝:InputSectionBase*)'
This patch make it more structured like this.
bin/ld.lld: error: undefined symbol: lld:🧝:EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true>
>>> Referenced by Writer.cpp:207 (/ssd/llvm-project/lld/ELF/Writer.cpp:207)
>>> Writer.cpp.o in archive lib/liblldELF.a
Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2017-March/111459.html
Differential Revision: https://reviews.llvm.org/D31481
llvm-svn: 299097
We had a few Config member functions that returns configuration values.
For example, we had is64() which returns true if the target is 64-bit.
The return values of these functions are constant and never change.
This patch is to compute them only once to make it clear that they'll
never change.
llvm-svn: 298168
Summary:
When we perform LTO builds with a version of ar that does not
understand LLVM bitcode objects, we end up with undefined references,
because our archive files do not list the bitcode symbols in their
indices. The error messages do not make it clear what the real problem
is. This change adds a note that points out the likely problem and
solution. It is similar in spirit to r282633, but aims to avoid false
positives by only triggering when we see both undefined references and
archives without symbols in their indices.
Fixes PR32281.
Reviewers: davide, ruiu, tejohnson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31011
llvm-svn: 298124
We really need to find a way to get this info from a single point of
truth in the LLVM backend, but it seems that the EM_* constants are
buried deep inside the constructors of the MCAsmBackend's.
For now, just fill in entries as we run into cases. AFAIK these mappings
are largely immutable, so we get a 75% discount on the technical debt
(code is duplicated, but little chance of divergence).
llvm-svn: 296429
With the current design an InputSection is basically anything that
goes directly in a OutputSection. That includes plain input section
but also synthetic sections, so this should probably not be a
template.
llvm-svn: 295993
We have InputSection, MergeInputSection and EhInputSection, so
isa<MergeInputSection> is equivalent to !isa<InputSection> && !isa<
EhInputSection>.
llvm-svn: 295937
LLD is a multi-threaded program. errs() or outs() are not guaranteed
to be thread-safe (they are actually not).
LLD's message(), log() or error() are thread-safe. We should use them.
llvm-svn: 295787
Previously LLD crashed on on provided testcases because "/DISCARD/" was
not supported. Patch implements that.
After this I think there is no known issues with --emit-relocs implementation
required for linux kernel linking.
Differential revision: https://reviews.llvm.org/D29273
llvm-svn: 295488
I splitted it from D29273.
Since we plan to make relocatable sections as dependent for target ones for
--emit-relocs implementation, this change is required to support .eh_frame case.
EhInputSection inherets from InputSectionBase and not from InputSection.
So for case when it has relocation section, it should be able to access DependentSections
vector.
This case is real for Linux kernel.
Differential revision: https://reviews.llvm.org/D30084
llvm-svn: 295483
That fixes a case when section has more than one metadata
section. Previously GC would collect one of such sections
because we had implementation that stored only last one as
dependent.
Differential revision: https://reviews.llvm.org/D29981
llvm-svn: 295298
If we had SHT_GROUP sections, then when -r was used we might crash.
This is PR31952.
Issue happened because we emited relocation section though its target was discared
because was a member of duplicated group. When we tried to get VA of target,
segfault happened.
Core cause is the bug that GNU as 2.27 (and probably later versions) has.
In compare with llvm-mc, it does not include relocation sections into the group,
like shown in testcase. This patch covers that case.
Differential revision: https://reviews.llvm.org/D29929
llvm-svn: 295067
with temporarily file name fix in testcase.
Original commit message:
-q, --emit-relocs - Generate relocations in output
Simplest implementation:
* no GC case,
* no "/DISCARD/" linkerscript command support.
This patch is extracted from D28612 / D29636,
Relative to PR31579.
Differential revision: https://reviews.llvm.org/D29663
llvm-svn: 294469
-q, --emit-relocs - Generate relocations in output
Simplest implementation:
* no GC case,
* no "/DISCARD/" linkerscript command support.
This patch is extracted from D28612 / D29636,
Relative to PR31579.
Differential revision: https://reviews.llvm.org/D29663
llvm-svn: 294464
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
first caller to the Thunk.
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.
This is a recommit of r293283 with a fixed comparison predicate as
std::merge requires a strict weak ordering.
Differential revision: https://reviews.llvm.org/D29327
llvm-svn: 293757
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
first caller to the Thunk.
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.
Differential Revision: https://reviews.llvm.org/D29129
llvm-svn: 293283
The linkonce feature is a sort of proto-comdat. As far as I am aware no
compiler produces linkonce sections anymore, but some glibc i386 object
files contain definitions of symbol "__x86.get_pc_thunk.bx" in linkonce
sections. Drop those sections to avoid duplicate symbol errors.
This is glibc PR20543, we should remove this hack once that has been
fixed for a while.
Fixes PR31215.
Differential Revision: https://reviews.llvm.org/D28430
llvm-svn: 291474
Previously, files added using INCLUDE directive weren't added
to reproduce archives. In this patch, I defined a function to
open a file and use that from Driver and LinkerScript.
llvm-svn: 291413
Previously, when we printed out a path obtained from DWARF debug info,
we replaced \ with / on Windows. But that doesn't make sense. We should
respect the system's path separator.
llvm-svn: 291219
This is how we use TarWriter in LLD. Now LLD does not append
a file extension, so you need to pass `--reproduce foo.tar`
instead of `--reproduce foo`.
Differential Revision: https://reviews.llvm.org/D28103
llvm-svn: 291210
In a shared library an undefined symbol is implicitly imported. If the
symbol is called as a function a PLT entry is generated for it. When the
caller is a Thumb b.w a thunk to the PLT entry is needed as all PLT
entries are in ARM state.
This change allows undefined symbols to have thunks in the same way that
shared symbols may have thunks.
llvm-svn: 290951
I thought for a while about how to remove it, but it looks like we
can just copy the file for now. Of course I'm not happy about that,
but it's just less than 50 lines of code, and we already have
duplicate code in Error.h and some other places. I want to solve
them all at once later.
Differential Revision: https://reviews.llvm.org/D27819
llvm-svn: 290062
The eglibc library, as used by Ubuntu 14.04 requires the presence of an
SHT_ARM_ATTRIBUTES section in for the purposes of checking hard/soft float
compatibility when dlopen() is used. Unfortunately when the section is not
present dlopen() fails with a generic could not find file message.
This change makes lld keep the first .ARM.attributes section that it
encounters and propagates it to the output. This is not a complete
SHT_ARM_ATTRIBUTES implementation, that would involve reading the contents
of the section and joining each individual attribute. It should suffice
for a homogenous build all libraries and executables on the same system
with a compatible set of command line options.
Differential revision: https://reviews.llvm.org/D27718
llvm-svn: 289642
Currently LLD prints basename of source file name in error messages,
for example:
$ mkdir foo
$ echo 'void _start(void) { foobar(); }' > foo/bar.c
$ gcc -g -c foo/bar.c
$ bin/ld.lld -o out bar.o
bin/ld.lld: error: bar.c:1: undefined symbol 'foobar'
$
This should say:
bin/ld.lld: error: foo/bar.c:1: undefined symbol 'foobar'
This is PR31299
Differential revision: https://reviews.llvm.org/D27506
llvm-svn: 288966
StringRefZ is a class to represent a null-terminated string. String
length is computed lazily, so it's more efficient than StringRef to
represent strings in string table.
The motivation of defining this new class is to merge functions
that only differ in string types; we have many constructors that takes
`const char *` or `StringRef`. With StringRefZ, we can merge them.
Differential Revision: https://reviews.llvm.org/D27037
llvm-svn: 288172
We have different functions to stringize objects to construct
error messages. For InputFile, we have getFilename, and for
InputSection, we have getName. You had to memorize them.
I think this is the case where the function overloading comes in handy.
This patch defines toString() functions that are overloaded for all these
types, so that you just call it in error().
Differential Revision: https://reviews.llvm.org/D27030
llvm-svn: 287787
Previously, we stored offsets in string tables to symbols, so
you needed to pass a string table to get a symbol name. This patch
stores const char pointers instead to eliminate the need to pass
a string table.
llvm-svn: 287737
LLD's error messages contain line numbers, function names or section names.
Currently they are formatter as follows.
foo.c (32): symbol 'foo' not found
foo.c (function bar): symbol 'foo' not found
foo.c (.text+0x1234): symbol 'foo' not found
This patch changes them so that they are consistent with Clang's output.
foo.c:32: symbol 'foo' not found
foo.c:(function bar): symbol 'foo' not found
foo.c:(.text+0x1234): symbol 'foo' not found
Differential Revision: https://reviews.llvm.org/D26901
llvm-svn: 287537
Patch adds a filename to that error message.
I faced next error when debugged one of FreeBSD port:
error: relocation R_X86_64_PLT32 cannot refer to absolute symbol __tls_get_addr
error message was poor and this patch improves it to show the locations
of symbol declaration and using.
Differential revision: https://reviews.llvm.org/D26508
llvm-svn: 286940
Found this when tried to link lang/ccl FreeBSD port.
Issue is very close to D23201.
This is the reason of lang/ccl port link fail.
GNU assembler 2.17.50 [FreeBSD] 2007-07-03 could generate broken objects,
where notype symbols are associated with symtab:
...
[ 9] .symtab SYMTAB 0000000000000000 00003c78
0000000000006858 0000000000000018 10 803 8
...
192: 000000000000000d 0 NOTYPE LOCAL DEFAULT 9 _cons_org
Patch allows to handle such objects.
Differential revision: https://reviews.llvm.org/D26613
llvm-svn: 286939
The functions getBitcodeTargetTriple(), isBitcodeContainingObjCCategory(),
getBitcodeProducerString() and hasGlobalValueSummary() now return errors
via their return value rather than via the diagnostic handler.
To make this work, re-implement these functions using non-member functions
so that they can be used without the LLVMContext required by BitcodeReader.
Differential Revision: https://reviews.llvm.org/D26532
llvm-svn: 286623
The change in D26502 splits ReaderWriter.h, which contains the APIs
into both the BitReader and BitWriter libraries, into BitcodeReader.h
and BitcodeWriter.h.
Change lld uses to the appropriate split header, removing it
completely in one case where it wasn't needed.
llvm-svn: 286568
Relocations are the last thing that we wore storing a raw section
pointer to and parsing on demand.
With this patch we parse it only once and store a pointer to the
actual data.
The patch also changes where we store it. It is now in
InputSectionBase. Not all sections have relocations, but most do and
this simplifies the logic. It also means that we now only support one
relocation section per section. Given that that constraint is
maintained even with -r with gold bfd and lld, I think it is OK.
llvm-svn: 286459
Previously, we have both input and output section for .MIPS.abiflags.
Now we have only one class for .MIPS.abiflags, which is MipsAbiFlagsSection.
This class is a synthetic input section.
.MIPS.abiflags sections are handled as regular sections until
the control reaches Writer. Writer then aggregates all sections
whose type is SHT_MIPS_ABIFLAGS to create a single synthesized
input section. The synthesized section is then processed normally
as if it came from an input file.
llvm-svn: 286398
Previously, we have both input and output sections for .reginfo and
.MIPS.options. Now for each such sections we have one synthetic input
sections: MipsReginfoSection and MipsOptionsSection respectively.
Both sections are handled as regular sections until the control reaches
Writer. Writer then aggregates all sections whose type is SHT_MIPS_REGINFO
or SHT_MIPS_OPTIONS to create a single synthesized input section. In that
moment Writer also save GP0 value to the MipsGp0 field of the corresponding
ObjectFile. This value required for R_MIPS_GPREL16 and R_MIPS_GPREL32
relocations calculation.
Differential revision: https://reviews.llvm.org/D26444
llvm-svn: 286397
Instead of remembering a raw Elf_Shdr, store the symbol table proper
and the index of the first non local.
This moves error handling upfront and simplifies it.
llvm-svn: 285933
DIHelper is a class having only one member, and ObjectFile has
a unique pointer to a DIHelper. So we can directly have ObjectFile
have the member.
Differential Revision: https://reviews.llvm.org/D26223
llvm-svn: 285850
When we have SHT_GNU_versym section, it is should be associated with symbol table
section. Usually (and in out implementation) it is .dynsym.
In case when .dynsym is absent (due to broken object for example),
lld crashes in parseVerdefs() when accesses null pointer:
Versym = reinterpret_cast<const Elf_Versym *>(this->ELFObj.base() +
VersymSec->sh_offset) +
this->Symtab->sh_info;
DIfferential revision: https://reviews.llvm.org/D25553
llvm-svn: 285796