This was clearly wrong (thanks Rui for spotting), and I honestly would
like to get this tested so such mistakes won't repeat. Unfortunately, I
wasn't (easily) able to craft a test that exposes the bad behavior.
Ideally, we would like to get tests of this kind for all relocations, but
at the time of writing, this is not true. So, for now just fix this bug
and try to re-evaluate a way to test this in the future.
llvm-svn: 249359
The entries are added if there are "_init" or "_fini" entries in
the symbol table respectively. According to the behavior of ld,
entries are inserted even for undefined symbols.
Symbol names can be overridden by using -init and -fini command
line switches. If used, these switches neither add new symbol table
entries nor require those symbols to be resolved.
Differential Revision: http://reviews.llvm.org/D13385
llvm-svn: 249297
Add symbol specified with -u as undefined which may cause additional
object files from archives to be linked into the resulting binary.
Differential Revision: http://reviews.llvm.org/D13345
llvm-svn: 249295
Using the "raw" Elf64_Dyn or Elf32_Dyn structures in
DynamicSection<ELFT>::writeTo does not correctly handle mixed-Endian
situations. Instead, use the corresponding llvm::object::* structures which
have Endian-converting members (like the rest of the code).
This fixes all currently-failing elf2 tests when running on big-Endian
PPC64/Linux (I've added a big-Endian test case which should fail on
little-Endian machines in the same way that test/elf2/shared.s failed on
big-Endian machines prior to this change).
llvm-svn: 249150
Found while testing a FreeBSD base system build with lld. Ignored for
now while we continue to identify missing options and functionality.
llvm-svn: 249144
Sort by:
ALLOC
ALLOC && NOBITS
ALLOC & EXEC
ALLOC & EXEC && NOBITS
ALLOC & WRITE
ALLOC & WRITE && NOBITS
<nothing> (ignoring NOBITS)
The dynamic section is finalized early because it adds strings to the dynamic string table, which comes before the dynamic table.
llvm-svn: 249071
Summary:
If --whole-archive is used, all symbols from the following archives are added to the output. --no-whole-archive restores default behavior. These switches can be used multiple times.
NB. We have to keep an ArchiveFile instance within SymbolTable even if --whole-archive mode is active since it can be a thin archive which contains just names of external files. In that case actual memory buffers for the archive members will be stored within the File member of ArchiveFile class.
Reviewers: rafael, ruiu
Subscribers: grimar, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D13286
llvm-svn: 249045
Summary:
These switches affect library searching for '-l' which follow them. Synonym forms are also supported:
* -dy and -call_shared for -Bdynamic switch
* -dn, -non_shared and -static for -Bstatic switch
Reviewers: rafael, ruiu
Subscribers: emaste, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D13238
llvm-svn: 249029
If a shared library has a DT_SONAME entry, that is what should be included
in the DT_NEEDED of a program using it.
We don't implement -soname yet, so check in a .so for now.
llvm-svn: 249025
Opening a file and dispatching to readLinkerScript() or createFile()
is a common operation. We want to use that at least from Driver and
from LinkerScript. In COFF, we had the same problem. This patch
resolves the problem in the same way as we did for COFF.
Now, if you have a path that you want to open, just call
Driver->addFile(StringRef). That function opens the file and handles
that as if that were given by command line. This function is the
only place we call identify_magic().
llvm-svn: 249023
The new name starts with a verb, and it does not imply that it errors
out and exit (it acutally can just emit a warning depending on settings.)
llvm-svn: 249016
This linker script parser and evaluator is powerful enough to read
Linux's libc.so, which is (despite its name) a linker script that
contains OUTPUT_FORMAT, GROUP and AS_NEEDED directives.
The parser implemented in this patch is a recursive-descendent one.
It does *not* construct an AST but consumes directives in place and
sets the results to Symtab object, like what Driver is doing.
This should be very fast since less objects are allocated, and
this is also more readable.
http://reviews.llvm.org/D13232
llvm-svn: 248918
Sort by:
ALLOC
ALLOC && NOBITS
ALLOC & EXEC
ALLOC & EXEC && NOBITS
ALLOC & WRITE
ALLOC & WRITE && NOBITS
<nothing> (ignoring NOBITS)
The dynamic section is finalized early because it adds strings to the dynamic string table, which comes before the dynamic table.
llvm-svn: 248845
The code in driver is about to change so that the invalid files would no
longer be seen as ELF.
This makes sure that the error path will remain tested.
llvm-svn: 248820
Besides a trivial MIPS support the patch introduces new TargetInfo class
member getDefaultEntry() to override default name of the entry symbol.
MIPS uses __start for that.
Differential Revision: http://reviews.llvm.org/D13227
llvm-svn: 248779
This is a basic initial implementation of the -flat_namespace and
-undefined options for LLD-darwin. It ignores several subtlties,
but the result is close enough that we can now link LLVM (but not
clang) on Darwin and pass all regression tests.
llvm-svn: 248732
There's actually a room to improve this patch. Instead of not merging
sections that have different alignements, we can choose the section that
has the largest alignment requirement among all sections that are otherwise
considered the same. Then all section alignments are satisfied, so we can
merge them.
I don't know if that improvement could make any difference for real-world
input, so I'll leave it alone. Would be interesting to revisit later.
llvm-svn: 248581
Since FreeBSD 4.1, the kernel expects binaries to be marked with
ELFOSABI_FREEBSD in the ELF header to exec() them. LLD unconditionally
sets OSABI to ELF_OSABINONE, and everything linked with it won't run
on FreeBSD (unless explicitly rebranded).
Example:
% ./aarch64-hello
ELF binary type "0" not known.
zsh: exec format error: ./aarch64-hello
FreeBSD could be modified to accept ELF_OSABINONE, but that would break all
existing binaries, so the kernel needs to support both ABINONE and ABIFREEBSD.
I plan to push this change in FreeBSD one day, which, unfortunately, is
not today. This code patches lld so it sets the header field correctly.
For completeness, the rationale of this change is explained in the FreeBSD
commit message, and it's apparently to pleasee binutils maintainers at the time.
https://svnweb.freebsd.org/base?view=revision&revision=59342
Differential Revision: http://reviews.llvm.org/D13140
llvm-svn: 248554
This was prematurely committed (and I take the blame for that).
Ideally, we want to support only ignored options that are really used
by somebody. Some of the options listed are not even supported by gold
(but only by ld.bfd), which says a lot about their "real-world" usefulness.
llvm-svn: 248503
Unfortunately the i386 and x86_64 relocation have the same numerical value
and it is a probably a bit much to add got support for another architecture
just to test this.
llvm-svn: 248326
This is just enough to get PLT working on 32 bit x86.
The idea behind using a virtual interface is that it should be easy to
convert any of the functions to template parameters if any turns out to be
performance critical.
llvm-svn: 248308
This is an LLD extension to MSVC link.exe command line. MSVC linker
does not write symbol tables for executables. We do unless no /debug
option is given.
There's a situation that we want to enable debug info but don't want
to emit the symbol table. One example is when we are comparing output
file size. With this patch, you can tell the linker to not create
a symbol table by just specifying /nosymtab.
llvm-svn: 248225
std::distance(C->Relocs.end(), C->Relocs.begin()) is the same as NumRelocs
which is already added to the hash value. What we are missing here is the
section size.
llvm-svn: 248202
This is more consistent with OutputSection. This is also part of removing
the "Chunk" term from the ELF linker, since we just have input/output sections
and program headers.
llvm-svn: 248183
This patch fixes a regression introduced by r247964. Relocations that
are referring the same symbol should be considered equal, but they
were not if they were pointing to non-section chunks.
llvm-svn: 248132
Previously, InputFile::parse() was run in batch. We construct a list
of all input files and call parse() on each file using parallel_for_each.
That means we cannot start parsing files until we get a complete list
of input files, although InputFile::parse() is safe to call from anywhere.
This patch makes it asynchronous. As soon as we add a file to the symbol
table, we now start parsing the file using std::async().
This change shortens self-hosting time (650 ms) by 28 ms. It's about 4%
improvement.
llvm-svn: 248109
I made the field an atomic pointer in hope that we would be able to
parallelize the symbol resolver soon, but that's not going to happen
soon. This patch reverts that change for the sake of readability.
llvm-svn: 248104
InputFile::parse() can be called in parallel with other calls of
the same function. By doing that, time to self-link improves from
741 ms to 654 ms or 12% faster.
This is probably the last low hanging fruit in terms of parallelism.
Input file parsing and symbol table insertion takes 450 ms in total.
If we want to optimize further, we probably have to parallelize
symbol table insertion using concurrent hashmap or something.
That's doable, but that's not easy, especially if you want to keep
the exact same semantics and linking order. I'm not going to do that
at least soon.
Anyway, compared to r248019 (the change before the first attempt for
parallelism), we achieved 36% performance improvement from 1022 ms
to 654 ms. MSVC linker takes 3.3 seconds to link the same program.
MSVC's ICF feature is very slow for some reason, but even if we
disable the feature, it still takes about 1.2 seconds.
Our number is probably good enough.
llvm-svn: 248078
Self-hosting took 801 ms on my machine. Of which this function took
69 ms. Now it takes 37 ms. That is about 4% overall performance
improvement.
llvm-svn: 248052
The LLD's ICF algorithm is highly parallelizable. This patch does that
using parallel_for_each.
ICF accounted for about one third of total execution time. Previously,
it took 324 ms when self-hosting. Now it takes only 62 ms.
Of course your mileage may vary. My machine is a beefy 24-core Xeon machine,
so you may not see this much speedup. But this optimization should be
effective even for 2-core machine, since I saw speedup (324 ms -> 189 ms)
when setting parallelism parameter to 2.
llvm-svn: 248038
Previously, ICF created a vector for each SectionChunk. The vector
contained pointers to successors, which are namely associative sections
and COMDAT relocation targets. The reason I created vectors is because
I thought that that would make section comparison faster.
It did make the comparison faster. When self-linking, for example, it
saved about 10 ms on each iteration. The time we spent on constructing
the vectors was 124 ms. If we iterate more than 12 times, return from
the investment exceeds the initial cost.
In reality, it usually needs 5 iterations. So we shouldn't construct
the vectors.
llvm-svn: 247963
equalsConstants() is the heaviest function in ICF, and that consumes
more than half of total ICF execution time. Of which, section content
comparison accounts for roughly one third.
Previously, we compared section contents at the beginning of the
function after comparing their checksums. The comparison is very
likely to succeed because when the control reaches that comparison,
their checksums are always equal. And because checksums are 64-bit
CRC, they are unlikely to collide.
We compared relocations and associative sections after that.
If they are different, the time we spent on byte-by-byte comparison
of section contents were wasted.
This patch moves the comparison at the end of function. If the
comparison fails, the time we spent on relocation comparison are
wasted, but as I wrote it's very unlikely to happen.
LLD took 1198 ms to link itself to produce a 27.11 MB executable.
Of which, ICF accounted for 536 ms. This patch cuts it by 90 ms,
which is 17% speedup of ICF and 7.5% speedup overall. All numbers
are median of ten runs.
llvm-svn: 247961
The offset of the .rela.dyn section isn't the same between hosts because a path comes before it. This test doesn't care what the offset is.
llvm-svn: 247946
We used to sort the symbols at the very end, but we need to know the order
earlier so that we can create reference to them in the dynamic relocations.
Thanks to Igor Kudrin for pointing out the problem.
llvm-svn: 247911
Basically the concept of "liveness" is for sections (or chunks in LLD
terminology) and not for symbols. Symbols are always available or live,
or otherwise it indicates a link failure.
Previously, we had isLive() and markLive() methods for DefinedSymbol.
They are confusing methods. What they actually did is to act as a proxy
to backing section chunks. We can simplify eliminate these methods
and call section chunk's methods directly.
llvm-svn: 247869
Only live symbols are written to the symbol table. Because isLive()
returned false if dead-stripping was disabled entirely, only
non-COMDAT sections were written to the symbol table. This patch fixes
the issue.
llvm-svn: 247856
Symbol table is now populated correctly, but some fields are missing,
they'll be added in the future. This patch also adds --discard-all
flag, which was the default behavior until now.
Differential Revision: http://reviews.llvm.org/D12874
llvm-svn: 247849
This patch defines ICF class and defines ICF-related functions as
members of the class. By doing this we can move code that are
related only to ICF from SectionChunk to the newly-defined class.
This also eliminates a global variable "NextID".
llvm-svn: 247802
This is a patch to make LLD to be on par with MSVC in terms of ICF
effectiveness. MSVC produces a 27.14MB executable when linking LLD.
LLD previously produced a 27.61MB when self-linking. Now the size
is reduced to 27.11MB. Note that without ICF the size is 29.63MB.
In r247387, I implemented an algorithm that handles section graphs
as cyclic graphs and merge them using SCC. The algorithm did not
always work as intended as I demonstrated in r247721. The new
algortihm implemented in this patch is different from the previous
one. If you are interested the details, you want to read the file
comment of ICF.cpp.
llvm-svn: 247770