I'm proposing a new command line flag, --warn-backrefs in this patch.
The flag and the feature proposed below don't exist in GNU linkers
nor the current lld.
--warn-backrefs is an option to detect reverse or cyclic dependencies
between static archives, and it can be used to keep your program
compatible with GNU linkers after you switch to lld. I'll explain the
feature and why you may find it useful below.
lld's symbol resolution semantics is more relaxed than traditional
Unix linkers. Therefore,
ld.lld foo.a bar.o
succeeds even if bar.o contains an undefined symbol that have to be
resolved by some object file in foo.a. Traditional Unix linkers
don't allow this kind of backward reference, as they visit each
file only once from left to right in the command line while
resolving all undefined symbol at the moment of visiting.
In the above case, since there's no undefined symbol when a linker
visits foo.a, no files are pulled out from foo.a, and because the
linker forgets about foo.a after visiting, it can't resolve
undefined symbols that could have been resolved otherwise.
That lld accepts more relaxed form means (besides it makes more
sense) that you can accidentally write a command line or a build
file that works only with lld, even if you have a plan to
distribute it to wider users who may be using GNU linkers. With
--check-library-dependency, you can detect a library order that
doesn't work with other Unix linkers.
The option is also useful to detect cyclic dependencies between
static archives. Again, lld accepts
ld.lld foo.a bar.a
even if foo.a and bar.a depend on each other. With --warn-backrefs
it is handled as an error.
Here is how the option works. We assign a group ID to each file. A
file with a smaller group ID can pull out object files from an
archive file with an equal or greater group ID. Otherwise, it is a
reverse dependency and an error.
A file outside --{start,end}-group gets a fresh ID when
instantiated. All files within the same --{start,end}-group get the
same group ID. E.g.
ld.lld A B --start-group C D --end-group E
A and B form group 0, C, D and their member object files form group
1, and E forms group 2. I think that you can see how this group
assignment rule simulates the traditional linker's semantics.
Differential Revision: https://reviews.llvm.org/D45195
llvm-svn: 329636
Some system libraries have a lot of versioned symbols. When linking
scylla this brings the number of malloc calls from 49154 to 37944.
llvm-svn: 329453
They are to pull out an object file for a symbol, but for a historical
reason the code is written in two separate functions. This patch
merges them.
llvm-svn: 329039
I tried a few different designs to find a way to implement it without
too much hassle and settled down with this. Unlike before, object files
given as arguments for --just-symbols are handled as object files, with
an exception that their section tables are handled as if they were all
null.
Differential Revision: https://reviews.llvm.org/D42025
llvm-svn: 328852
NonLocal is technically more accurate, but we already use the term
"Global" to specify the non-local part of the symbol table, and
Local <-> Global is easier to digest.
llvm-svn: 328740
This fixes pr36623.
The problem is that we have to parse versions out of names before LTO
so that LTO can use that information.
When we get the LTO produced .o files, we replace the previous symbols
with the LTO produced ones, but they still have @ in their names.
We could just trim the name directly, but calling parseSymbolVersion
to do it is simpler.
llvm-svn: 328738
Some tools (dwarfdump for example) get confused by the current -O0 -r
output since it has multiple copies of .debug_str.
We cannot just merge sections with the same name as they can have
different sh_entsize.
We could have duplicated logic for merging sections based on name and
sh_entsize, but it seems better to just use the existing logic by
enabling optimizations.
llvm-svn: 328640
SharedFile::parseRest function grew organically and got a bit hard to
understand. This patch refactor it. This patch also adds comments.
Differential Revision: https://reviews.llvm.org/D44860
llvm-svn: 328579
Previously, we used 0 as an alias for VER_NDX_GLOBAL and had a dummy
entry in SharedFile::Verdefs so that the access to the array is within
its boundary. But that's not straightforwad. We can just stop doing both.
llvm-svn: 328401
Our code assumes all input sections in an output SHF_LINK_ORDER
section has SHF_LINK_ORDER flag. We do not check that and that can cause a crash.
That happens because we call
std::stable_sort(Sections.begin(), Sections.end(), compareByFilePosition);,
where compareByFilePosition predicate does not expect to see
null when calls getLinkOrderDep.
The same might happen when sections refer to non-regular sections.
Test cases demonstrate the issues, patch fixes them.
Differential revision: https://reviews.llvm.org/D44193
llvm-svn: 327006
We don't need to handle an object file having more than one symbol table,
so as soon as we find the first one, we can process it and then return
from the function.
llvm-svn: 326977
addElfSymbols and readJustSymbolsFile still has duplicate code, but
I didn't come up with a good idea to eliminate them. Since this patch
is an improvement, I'm sending this for review.
Differential Revision: https://reviews.llvm.org/D44187
llvm-svn: 326972
LLD uses the debug info and debug line sections to determine the location of
e.g. references to undefined symbols, when producing error messages. In the
event that debug info was present, but debug line parsing failed for some
reason, then a nullptr would end up being dereferenced by the location-lookup
code.
Differential Revision: https://reviews.llvm.org/D44205
Reviewers: grimar
llvm-svn: 326899
It should be possible to resolve undefined symbols in dynamic libraries
using symbols defined in a linker script.
Differential Revision: https://reviews.llvm.org/D43011
llvm-svn: 326176
There seems to be no reason to collect this list of symbols.
Also fix a bug where --exclude-libs would apply to all symbols that
appear in an archive's symbol table, even if the relevant archive
member was not added to the link.
Differential Revision: https://reviews.llvm.org/D43369
llvm-svn: 325380
MIPS BFD linker puts _gp_disp symbol into DSO files and assigns zero
version definition index to it. This value means 'unversioned local
symbol' while _gp_disp is a section global symbol. We have to handle
this bug in the LLD because BFD linker is used for building MIPS
toolchain libraries.
Differential revision: https://reviews.llvm.org/D42486
llvm-svn: 324467
We did not report valid filename for duplicate symbol error when
symbol came from binary input file.
Patch fixes it.
Differential revision: https://reviews.llvm.org/D42635
llvm-svn: 324217
r323476 added support for DW_FORM_line_strp, and incorrectly made that
depend on having a DWARFUnit available. We shouldn't be tracking
.debug_line_str in DWARFUnit after all. After this patch, I can do an
NFC follow up and undo a bunch of the "plumbing" part of r323476.
Differential Revision: https://reviews.llvm.org/D42609
llvm-svn: 323691
We normally avoid "switch (Config->EKind)", but in this case I think
it is worth it.
It is only executed when there is an error and it allows detemplating
a lot of code.
llvm-svn: 321404
I noticed that the continue this patch deletes was not tested. Trying
to add a test I realized that we never put a VER_NDX_LOCAL symbol in
the dynamic symbol table. There doesn't seem to be any reason for a
linker to use VER_NDX_LOCAL for a defined shared symbol.
llvm-svn: 320817
Specifically, libwidevinecdm.so in Chrome has such bad symbol.
It seems the BFD linker handles them as local symbols, so instead
of inserting them to the symbol table, we should skip them too.
Differential Revision: https://reviews.llvm.org/D41257
llvm-svn: 320770
By using an index instead of a pointer for verdef we can put the index
next to the alignment field. This uses the otherwise wasted area and
reduces the shared symbol size.
By itself the performance change of this is in the noise, but I have a
followup patch to remove another 8 bytes that improves performance
when combined with this.
llvm-svn: 320449
This patch is to rename check CHECK and make it a C macro, so that
we can evaluate the second argument lazily.
Differential Revision: https://reviews.llvm.org/D40915
llvm-svn: 319974
This avoids allocating the error message when there is no error that
check requires.
It avoids the code duplication of inlining check.
llvm-svn: 319922
The SHT_ARM_ATTRIBUTES section type is in the processor specific space
adding guard to make sure that it is only recognized when EMachine is
EM_ARM.
llvm-svn: 319304
lld assumes some ARM features that are not available in all Arm
processors. In particular:
- The blx instruction present for interworking.
- The movt/movw instructions are used in Thunks.
- The J1=1 J2=1 encoding of branch immediates to improve Thumb wide
branch range are assumed to be present.
This patch reads the ARM Attributes section to check for the
architecture the object file was compiled with. If none of the objects
have an architecture that supports either of these features a warning
will be given. This is most likely to affect armv6 as used in the first
Raspberry Pi.
Differential Revision: https://reviews.llvm.org/D36823
llvm-svn: 319169
This fixes PR35223.
Here I enabled SHF_MERGE section content merging for -r like
we do for regular linking.
Differential revision: https://reviews.llvm.org/D40026
llvm-svn: 318516
Now that DefinedRegular is the only remaining derived class of
Defined, we can merge the two classes.
Differential Revision: https://reviews.llvm.org/D39667
llvm-svn: 317448
Currently LLD tries to use information about functions and variables location
taking it from debug sections. When --strip-* is given we discard such sections
and that breaks error reporting.
Patch stops discarding such sections and just removes them from InputSections list.
Differential revision: https://reviews.llvm.org/D39550
llvm-svn: 317405
Now that we have only SymbolBody as the symbol class. So, "SymbolBody"
is a bit strange name now. This is a mechanical change generated by
perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)
nd clang-format-diff.
Differential Revision: https://reviews.llvm.org/D39459
llvm-svn: 317370
Currently we do not strip .zdebug_*, what looks wrong.
Also this simplifies the testcase we have for this options.
Differential revision: https://reviews.llvm.org/D39552
llvm-svn: 317306
This is PR34826.
Currently LLD is unable to report line number when reporting
duplicate declaration of some variable.
That happens because for extracting line information we always use
.debug_line section content which describes mapping from machine
instructions to source file locations, what does not help for
variables as does not describe them.
In this patch I am taking the approproate information about
variables locations from the .debug_info section.
Differential revision: https://reviews.llvm.org/D38721
llvm-svn: 317080
SymbolBody and Symbol were separated classes due to a historical reason.
Symbol used to be a pointer to a SymbolBody, and the relationship
between Symbol and SymbolBody was n:1.
r2681780 changed that. Since that patch, SymbolBody and Symbol are
allocated next to each other to improve memory locality, and they have
1:1 relationship now. So, the separation of Symbol and SymbolBody no
longer makes sense.
This patch merges them into one class. In order to avoid updating too
many places, I chose SymbolBody as a unified name. I'll rename it Symbol
in a follow-up patch.
Differential Revision: https://reviews.llvm.org/D39406
llvm-svn: 317006
I don't think we have to aim for precise bug compatibility.
We can return a nullptr if a section is consumed by the linker, and
the rest should naturally work.
llvm-svn: 316817
Summary:
The COFF linker and the ELF linker have long had similar but separate
Error.h and Error.cpp files to implement error handling. This change
introduces new error handling code in Common/ErrorHandler.h, changes the
COFF and ELF linkers to use it, and removes the old, separate
implementations.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39259
llvm-svn: 316624
The condition whether a section is alive or not by default
is becoming increasingly complex, so the decision of garbage
collection is spreading over InputSection.h and MarkLive.cpp,
which is not a good state.
This moves the code to MarkLive.cpp, to keep the file the central
place to make decisions about garbage collection.
llvm-svn: 315384
Summary:
We were crashing when linking telnetd in FreeBSD because lld was emitting
corrupted output files for --norosegment. In this file the version index of some symbols
was set to 9 but lld only found 8 version definitions.
I am not sure how to create a minimal .so file that also exposes this behaviour so I just added the one that initially caused the error to Inputs/
This partially addresses https://bugs.llvm.org/show_bug.cgi?id=34705
Reviewers: ruiu, rafael, pcc, grimar
Reviewed By: ruiu
Subscribers: emaste, krytarowski
Tags: #lld
Differential Revision: https://reviews.llvm.org/D38397
llvm-svn: 315036
Summary:
We were crashing when linking telnetd in FreeBSD because lld was emitting
corrupted output files for --norosegment. In this file the version index of some symbols
was set to 9 but lld only found 8 version definitions.
I am not sure how to create a minimal .so file that also exposes this behaviour so I just added the one that initially caused the error to Inputs/
This partially addresses https://bugs.llvm.org/show_bug.cgi?id=34705
Reviewers: ruiu, rafael, pcc, grimar
Reviewed By: ruiu
Subscribers: emaste, krytarowski
Tags: #lld
Differential Revision: https://reviews.llvm.org/D38397
llvm-svn: 315035
This patch removes lot of static Instances arrays from different input file
classes and introduces global arrays for access instead. Similar to arrays we
have for InputSections/OutputSectionCommands.
It allows to iterate over input files in a non-templated code.
Differential revision: https://reviews.llvm.org/D35987
llvm-svn: 313619
If using --format=binary with an input file name that has one or more non-ascii
characters in, LLD has undefined behaviour (it crashes on my Windows Debug build)
when calling isalnum with these non-ascii characters. Instead, of calling
std::isalnum, this patch uses an internal version that ignores the locale and
checks a specific subset of characters.
Reviewers: ruiu
Differential Revision: https://reviews.llvm.org/D37331
llvm-svn: 312705
With this Symbol has the same size as before, but DefinedRegular goes
from 72 to 64 bytes.
I also find this a bit easier to read. There are fewer places
initializing File for example.
This has a small but measurable speed improvement on all tests (1%
max).
llvm-svn: 310142
Reviewing another change I noticed that we use "getSymbols" to mean
different things in different files. Depending on the file it can
return
ArrayRef<StringRef>
ArrayRef<SymbolBody*>
ArrayRef<Symbol*>
ArrayRef<Elf_Sym>
With this change it always returns an ArrayRef<SymbolBody*>. The other
functions are renamed getELFsyms() and getSymbolNames().
Note that we cannot return ArrayRef<Symbol*> instead of
ArreyRef<SymbolBody*> because local symbols have a SymbolBody but not
a Symbol.
llvm-svn: 309840
Summary:
If the linker is invoked with `--chroot /foo` and `/bar/baz.o`, it
tries to read the file from `/foo/bar/baz.o`. This feature is useful
when you are dealing with files created by the --reproduce option.
Reviewers: grimar
Subscribers: llvm-commits, emaste
Differential Revision: https://reviews.llvm.org/D35517
llvm-svn: 308646
With that in place we can use lld's own infrastructure for the low
level detail of dwarf parsing.
With this we don't decompress sections twice, we don't scan all
realocations and even with this simplistic implementation linking
clang with gdb index goes from 34.09 seconds to 20.80 seconds.
llvm-svn: 308544
The --exclude-libs option is not a popular option, but at least some
programs in Android depend on it, so it's worth to support it.
Differential Revision: https://reviews.llvm.org/D34422
llvm-svn: 305920
The ELF standard defines that the SHT_GROUP section as follows:
- its sh_link has the symbol index, and
- the symbol name is used to uniquify section groups.
Object files created by GNU gold does not seem to comply with the
standard. They have this additional rule:
- if the symbol has no name and a STT_SECTION symbol, a section
name is used instead of a symbol name.
If we don't do anything for this, the linker fails with a mysterious
error message if input files are generated by gas. It is unfortunate
but I think we need to support it.
Differential Revision: https://reviews.llvm.org/D34064
llvm-svn: 305218
This is PR33052, "Bug 33052 - -r eats comdats ".
To fix it I stop removing group section from out when -r is given
and fixing SHT_GROUP content when writing it just like we do some
other fixup, e.g. for Rel[a]. (it needs fix for section indices that
are in group).
Differential revision: https://reviews.llvm.org/D33485
llvm-svn: 304140
A variable `ComdatGroup` is not supposed to contain a large number of
items. Even when linking clang, it ends up having only 300K strings.
It doesn't make sense to use CachedHashStringRef for this hash table.
This patch has neutral or slightly positive impact on performance while
reducing code complexity.
llvm-svn: 303787
It seems virtually everyone who tries to do LTO build with Clang and
LLD was hit by a mistake to forget using llvm-ar command to create
archive files. I wasn't an exception. Since this is an annoying common
issue, it is probably better to handle that gracefully rather than
reporting an error and tell the user to redo build with different
configuration.
Differential Revision: https://reviews.llvm.org/D32721
llvm-svn: 302083
This patch is to ignore .debug_gnu_pub{names,types} sections if the
-gdb-index option was given.
Differential Revision: https://reviews.llvm.org/D32662
llvm-svn: 301710
This patch is to reduce amount of template uses. The new code is less
exciting and boring than before, but I think it is easier to read.
Differential Revision: https://reviews.llvm.org/D32467
llvm-svn: 301488
We can just use the existing SoName member variable. It now initially
contains what was in DefaultSoName and is modified if the .so has an
actual soname.
llvm-svn: 301259
Start using it in LLD to avoid needing to read bitcode again just to get the
target triple, and in llvm-lto2 to avoid printing symbol table information
that is inappropriate for the target.
Differential Revision: https://reviews.llvm.org/D32038
llvm-svn: 300300
Fixes PR32572.
When
(a) a library has no soname
and (b) library is given on the command line with path (and not through -L/-l flags)
DT_NEEDED entry for such library keeps the path as given.
This behavior is consistent with gold and bfd, and is used in compiler-rt test suite.
This is a second attempt after r300007 got reverted. This time relro-omagic test is
changed in a way to avoid hardcoding the path to the test directory in the objdump'd
binary.
llvm-svn: 300011
Fixes PR32572.
When
(a) a library has no soname
and (b) library is given on the command line with path (and not through -L/-l flags)
DT_NEEDED entry for such library keeps the path as given.
This behavior is consistent with gold and bfd, and is used in compiler-rt test suite.
llvm-svn: 300007
LogName member was added to construct input file names for logging
only once. This patch does this in a different way. Now toString
caches its results.
Differential Revision: https://reviews.llvm.org/D31546
llvm-svn: 299375
Introduce symbol table data structures that can be potentially written to
disk, have the LTO library build those data structures using temporarily
constructed modules and redirect the LTO library implementation to go through
those data structures. This allows us to remove the LLVMContext and Modules
owned by InputFile.
With this change I measured a peak memory consumption decrease from 5.4GB to
2.8GB in a no-op incremental ThinLTO link of Chromium on Linux. The impact on
memory consumption is larger in COFF linkers where we are currently forced
to materialize all metadata in order to read linker options. Peak memory
consumption linking a large piece of Chromium for Windows with full LTO and
debug info decreases from >64GB (OOM) to 15GB.
Part of PR27551.
Differential Revision: https://reviews.llvm.org/D31364
llvm-svn: 299168
Previously, undefined symbol errors are one line like this
and wasn't easy to read.
/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/Writer.cpp:207: undefined symbol 'lld:🧝:EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld:🧝:InputSectionBase*)'
This patch make it more structured like this.
bin/ld.lld: error: undefined symbol: lld:🧝:EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true>
>>> Referenced by Writer.cpp:207 (/ssd/llvm-project/lld/ELF/Writer.cpp:207)
>>> Writer.cpp.o in archive lib/liblldELF.a
Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2017-March/111459.html
Differential Revision: https://reviews.llvm.org/D31481
llvm-svn: 299097
We had a few Config member functions that returns configuration values.
For example, we had is64() which returns true if the target is 64-bit.
The return values of these functions are constant and never change.
This patch is to compute them only once to make it clear that they'll
never change.
llvm-svn: 298168
Summary:
When we perform LTO builds with a version of ar that does not
understand LLVM bitcode objects, we end up with undefined references,
because our archive files do not list the bitcode symbols in their
indices. The error messages do not make it clear what the real problem
is. This change adds a note that points out the likely problem and
solution. It is similar in spirit to r282633, but aims to avoid false
positives by only triggering when we see both undefined references and
archives without symbols in their indices.
Fixes PR32281.
Reviewers: davide, ruiu, tejohnson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31011
llvm-svn: 298124
We really need to find a way to get this info from a single point of
truth in the LLVM backend, but it seems that the EM_* constants are
buried deep inside the constructors of the MCAsmBackend's.
For now, just fill in entries as we run into cases. AFAIK these mappings
are largely immutable, so we get a 75% discount on the technical debt
(code is duplicated, but little chance of divergence).
llvm-svn: 296429