Change the way we calculate the build id to use MD5 to give reproducible build
ids. Previously we would generate random bytes for the build id GUID.
llvm-svn: 281079
Previously, we created temporary files using llvm::sys::fs::createTemporaryFile
and removed them using llvm::FileRemover. This is error-prone as it is easy to
forget creating FileRemover instances after creating temporary files.
There is actually a temporary file leak bug.
This patch introduces a new class, TemporaryFile, to manage temporary files
in the RAII style.
Differential Revision: https://reviews.llvm.org/D24176
llvm-svn: 280510
Summary:
UBSan complains like the following:
tools/lld/COFF/Writer.cpp:97:15: runtime error: null pointer passed as argument 2, which is declared to never be null
The reason is that the vector could be empty.
Reviewers: rsmith
Subscribers: Eugene.Zelenko, kcc
Differential Revision: https://reviews.llvm.org/D24050
llvm-svn: 280259
The IMAGE_FILE_HEADER structure contains a (RVA, size) to an array of
COFF_DEBUG_DIRECTORY records. Each one of these records contains an RVA to a OMF
Debug Directory. These OMF debug directories are derived into newer types such
as PDB70, PDB20, etc. This constructs a PDB70 structure which will allow us to
associate a GUID with a build to actually tie debug information.
llvm-svn: 280012
Reorder the table setup to mirror the indices corresponding to them. This means
that the table values are filled out as per the enumeration ordering. Doing so
makes it easier to identify a particular table. NFC.
llvm-svn: 278199
Add the support infrastructure for the /debugtype option which takes a comma
delimited list of debug info to generate. The defaults are based on other
options potentially (/driver or /profile). This sets up the infrastructure to
allow us to emit RSDS records to get "build id" equivalents on COFF (similar to
binutils).
llvm-svn: 278056
Don't blindly OR in the new value, but clear the existing one, since it can be
nonzero. Read out the existing value before, and add into the desired offset.
(The add is done outside of the applyMOV, to handle potential overflow between
the two.)
Patch by Martin Storsjö!
llvm-svn: 277846
The opcode for the bl branches can initially be F000 F800, i.e.
the J1 and J2 bits are already set. Therefore mask these bits out
before or'ing in the new bits.
Patch by Martin Storsjö!
llvm-svn: 277836
This flag is implemented similarly to --reproduce in the ELF linker.
This patch implements /linkrepro by moving the cpio writer and associated
utility functions to lldCore, and using that implementation in both linkers.
One COFF-specific detail is that we store the object file from which the
resource files were created in our reproducer, rather than the resource
files themselves. This allows the reproducer to be used on non-Windows
systems for example.
Differential Revision: https://reviews.llvm.org/D22418
llvm-svn: 276719
lld currently relies on lib.exe in order to generate an empty import library.
The "empty" import library consists of 5 members:
- first linker member
- second linker member
- Import Descriptor
- NULL Import Descriptor
- NULl Thunk
The first two entries (first and second linker members) are string tables which
are never updated. Therefore, they may as well as not be present. A subsequent
change to add that is probably warranted. However, this does not prevent the
use of the linker.
The Import Descriptor is the content which is most important. It provides an
Import Name Table entry for the library (as specified by the LIBRARY directive
in the DEF file). Additionally, it contains undefined references to the NULL
Import Descriptor and the library NULL Thunk Data. This ensures that the linker
will pull in the subsequent objects from the import library for the link. The
Import Descriptor has a single symbol (__IMPORT_DESCRIPTOR_<Library>) which
contains 3 relocations, one to the INT (Import Name Table) entry, one to the ILT
(Import Lookup Table) entry, and one to the IAT (Import Address Table) entry.
The NULL Import Descriptor is the last import descriptor and terminates the
import descriptor array. It contains a single symbol
(__NULL_IMPORT_DESCRIPTOR).
The NULL Thunk contains a single symbol (\x7f<Library>_NULL_THUNK_DATA) and
provides the terminator for the ILT and IAT.
These files are currently constructed manually following the example of the
Short Import Library format. This is arguably less than ideal, and it may be
possible to use MCAssembler and feed it the fragments to construct the object.
The major difference between the LIB (LINK) generated objects and the ones
generated here is that they are all one section shorter (.debug$S) as they do
not contain the debug information and one symbol shorter (@comp.id) as they do
not contain the RICH signature.
Move the logic related to the librarian into a new source file (Librarian.cpp).
llvm-svn: 275242
Manifest file is a separate or embedded XML file having metadata
of an executable. As it is XML, it can contain various types of
information. Probably the most popular one is to request escalated
priviledges.
Usually the linker creates an XML file and embed that file into
an executable. However, there's a way to supply an XML file from
command line. /manifestniput is it.
Apparently it is over-designed here, but if you supply two or more
manifest files, then the linker needs to merge the files into a
single XML file. A good news is that we don't need to do that ourselves.
MT.exe command can do that, so we call the command from the linker
in this patch.
llvm-svn: 266704
With the llvm change in r265606 this is the matching needed change to the lld
code now that createBinary() is returning Expected<...> .
llvm-svn: 265607
This flag disables link.exe's crash handler so that normal windows error
reporting and crash dumping occurs. For now it is reasonable for LLD to
ignore the flag.
Chromium is currently using this flag to collect minidumps of link.exe
crashing, and it breaks the LLD build.
llvm-svn: 264439
Some declarations of memcpy (like glibc's for example) are attributed
with notnull which makes it UB for NULL to get passed in, even if the
memcpy count is zero.
To account for this, guard the memcpy with an appropriate precondition.
This should fix the last UBSan bug, exposed by the test suite, in the
COFF linker.
llvm-svn: 263919
LLD type-punned an integral type and a pointer type using a pointer
field. This is problematic because the pointer type has alignment
greater than some of the integral values.
This would be less problematic if a union was used but it turns out the
integral values are only present for a short, transient, amount of time.
Let's remove this undefined behavior by skipping the punning altogether
by storing the state in a separate memory location: a vector which
informs us which symbols to process for weak externs.
llvm-svn: 263918
This fixes a test which exposed an ASan issue.
We assumed that a symbol's section number had a corresponding section
without performing validation.
llvm-svn: 263558
The load configuration directory is a structure whose size varies as the
OS gains additional functionality. To account for this, the structure's
layout begins with a size field; this allows loaders to know which
fields are available.
However, LLD hard-coded the sizes (112 bytes for 64-bit and 64 for
32-bit). This means that we might not inform the loader of all the
pertinent fields or we might claim that there are more fields than are
actually present.
To correctly account for this, the size field must be loaded from the
_load_config_used symbol.
N.B. The COFF spec is either wrong or out of date, the load
configuration directory is not correctly documented in the
specification: it omits the size field.
llvm-svn: 263543
The TLS directory has a different layout depending on the bitness of the
machine the image will run on. LLD would always use the 64-bit TLS
directory for the data directory entry instead of an appropriately sized
TLS directory.
llvm-svn: 263539
Now that DarwinLdDriver is the only derived class of Driver.
This patch merges them and actually removed the class because
they can now just be non-member functions. This change simplifies
a common header, Driver.h.
http://reviews.llvm.org/D17788
llvm-svn: 262502
DLL export tables usually contain dllexport'ed symbol RVAs so that
applications which use the DLLs can find symbols from the DLLs.
However, there's a minor feature to "forward" DLL symbols to other
DLLs.
If you set an RVA to a string whose form is "<dllname>.<symbolname>"
(e.g. "KERNEL32.ExitProcess") instead of symbol RVA to the export
table, the loader interprets that as a forwarder symbol, and resolve
that symbol from the specified DLL.
This patch implements that feature.
llvm-svn: 257243
In a UI such as XCode, LLVM source files are in 'libraries' while clang
files are in 'clang libraries'.
This change moves the lld source to 'lld libraries' to make code browsing easier.
It should be NFC as the build itself is still the same, just the structure in a
UI differs.
llvm-svn: 257001
MSVC linker considers PDB files created with this patch valid.
So you don't have to remove PDB files created by lld before
running MSVC linker.
This patch has no test since llvm-pdbdump dislikes PDB files
with no metadata streams.
llvm-svn: 255039
Before this patch, we created an empty PDB file if /debug option is
specified. For MSVC linker, such PDB file is completely broken, and
linker exits without doing anything as soon as it finds an empty PDB
file.
A PDB file created in this patch has the correct file signature.
MSVC linker still thinks that the file is broken, but it then removes
and replaces with its output.
This is an initial patch to support PDB in LLD. We aim to support
PDB in order to make it 100% compatible with MSVC linker. PDB support
is the last missing piece.
llvm-svn: 254796
If a section symbol is not external, that COMDAT section should never
be merge with other sections in other compilation unit. Previously,
we didn't take visibility into account.
Note that COMDAT sections with non-external visibility makes sense
because they can be removed by dead-stripping.
Fixes https://llvm.org/bugs/show_bug.cgi?id=25686
llvm-svn: 254578
There was a threading issue in the ICF code for COFF. That seems like
a venign bug in the sense that it doesn't produce an incorrect output,
but it oftentimes misses reducible sections. As a result, mergeable
sections could remain in outputs, which makes the output nondeterministic.
Basically the algorithm we are using for ICF is this: We group sections
so that identical sections will eventually be in the same group. Initially,
all sections are in one group. We split the group by relocation targets
until we get a convergence (if relocation targets are in different gruops,
the sections are different). Once a group is split, they will never be
merged.
Each section has a group ID. That variable itself is atomic, so there's
no threading issue at the level that we can use thread sanitizer.
The point is, when we split a group, we re-assign new group IDs to group
of sections. That are multiple separate writes to atomic varaibles.
Thus, splitting a group is not an atomic operation, and there's a small
chance that the other thread observes inconsistent group IDs.
Over-splitting is always "safe", so it will never create incorrect output.
I suspect that the nondeterminism stems from that point. However, I
cannot prove or fix that at this moment, so I'm going to avoid using
threads here.
llvm-svn: 251300
There's actually a room to improve this patch. Instead of not merging
sections that have different alignements, we can choose the section that
has the largest alignment requirement among all sections that are otherwise
considered the same. Then all section alignments are satisfied, so we can
merge them.
I don't know if that improvement could make any difference for real-world
input, so I'll leave it alone. Would be interesting to revisit later.
llvm-svn: 248581
This is an LLD extension to MSVC link.exe command line. MSVC linker
does not write symbol tables for executables. We do unless no /debug
option is given.
There's a situation that we want to enable debug info but don't want
to emit the symbol table. One example is when we are comparing output
file size. With this patch, you can tell the linker to not create
a symbol table by just specifying /nosymtab.
llvm-svn: 248225
std::distance(C->Relocs.end(), C->Relocs.begin()) is the same as NumRelocs
which is already added to the hash value. What we are missing here is the
section size.
llvm-svn: 248202
This patch fixes a regression introduced by r247964. Relocations that
are referring the same symbol should be considered equal, but they
were not if they were pointing to non-section chunks.
llvm-svn: 248132
Previously, InputFile::parse() was run in batch. We construct a list
of all input files and call parse() on each file using parallel_for_each.
That means we cannot start parsing files until we get a complete list
of input files, although InputFile::parse() is safe to call from anywhere.
This patch makes it asynchronous. As soon as we add a file to the symbol
table, we now start parsing the file using std::async().
This change shortens self-hosting time (650 ms) by 28 ms. It's about 4%
improvement.
llvm-svn: 248109
I made the field an atomic pointer in hope that we would be able to
parallelize the symbol resolver soon, but that's not going to happen
soon. This patch reverts that change for the sake of readability.
llvm-svn: 248104
InputFile::parse() can be called in parallel with other calls of
the same function. By doing that, time to self-link improves from
741 ms to 654 ms or 12% faster.
This is probably the last low hanging fruit in terms of parallelism.
Input file parsing and symbol table insertion takes 450 ms in total.
If we want to optimize further, we probably have to parallelize
symbol table insertion using concurrent hashmap or something.
That's doable, but that's not easy, especially if you want to keep
the exact same semantics and linking order. I'm not going to do that
at least soon.
Anyway, compared to r248019 (the change before the first attempt for
parallelism), we achieved 36% performance improvement from 1022 ms
to 654 ms. MSVC linker takes 3.3 seconds to link the same program.
MSVC's ICF feature is very slow for some reason, but even if we
disable the feature, it still takes about 1.2 seconds.
Our number is probably good enough.
llvm-svn: 248078
Self-hosting took 801 ms on my machine. Of which this function took
69 ms. Now it takes 37 ms. That is about 4% overall performance
improvement.
llvm-svn: 248052
The LLD's ICF algorithm is highly parallelizable. This patch does that
using parallel_for_each.
ICF accounted for about one third of total execution time. Previously,
it took 324 ms when self-hosting. Now it takes only 62 ms.
Of course your mileage may vary. My machine is a beefy 24-core Xeon machine,
so you may not see this much speedup. But this optimization should be
effective even for 2-core machine, since I saw speedup (324 ms -> 189 ms)
when setting parallelism parameter to 2.
llvm-svn: 248038
Previously, ICF created a vector for each SectionChunk. The vector
contained pointers to successors, which are namely associative sections
and COMDAT relocation targets. The reason I created vectors is because
I thought that that would make section comparison faster.
It did make the comparison faster. When self-linking, for example, it
saved about 10 ms on each iteration. The time we spent on constructing
the vectors was 124 ms. If we iterate more than 12 times, return from
the investment exceeds the initial cost.
In reality, it usually needs 5 iterations. So we shouldn't construct
the vectors.
llvm-svn: 247963
equalsConstants() is the heaviest function in ICF, and that consumes
more than half of total ICF execution time. Of which, section content
comparison accounts for roughly one third.
Previously, we compared section contents at the beginning of the
function after comparing their checksums. The comparison is very
likely to succeed because when the control reaches that comparison,
their checksums are always equal. And because checksums are 64-bit
CRC, they are unlikely to collide.
We compared relocations and associative sections after that.
If they are different, the time we spent on byte-by-byte comparison
of section contents were wasted.
This patch moves the comparison at the end of function. If the
comparison fails, the time we spent on relocation comparison are
wasted, but as I wrote it's very unlikely to happen.
LLD took 1198 ms to link itself to produce a 27.11 MB executable.
Of which, ICF accounted for 536 ms. This patch cuts it by 90 ms,
which is 17% speedup of ICF and 7.5% speedup overall. All numbers
are median of ten runs.
llvm-svn: 247961
Basically the concept of "liveness" is for sections (or chunks in LLD
terminology) and not for symbols. Symbols are always available or live,
or otherwise it indicates a link failure.
Previously, we had isLive() and markLive() methods for DefinedSymbol.
They are confusing methods. What they actually did is to act as a proxy
to backing section chunks. We can simplify eliminate these methods
and call section chunk's methods directly.
llvm-svn: 247869
Only live symbols are written to the symbol table. Because isLive()
returned false if dead-stripping was disabled entirely, only
non-COMDAT sections were written to the symbol table. This patch fixes
the issue.
llvm-svn: 247856
This patch defines ICF class and defines ICF-related functions as
members of the class. By doing this we can move code that are
related only to ICF from SectionChunk to the newly-defined class.
This also eliminates a global variable "NextID".
llvm-svn: 247802
This is a patch to make LLD to be on par with MSVC in terms of ICF
effectiveness. MSVC produces a 27.14MB executable when linking LLD.
LLD previously produced a 27.61MB when self-linking. Now the size
is reduced to 27.11MB. Note that without ICF the size is 29.63MB.
In r247387, I implemented an algorithm that handles section graphs
as cyclic graphs and merge them using SCC. The algorithm did not
always work as intended as I demonstrated in r247721. The new
algortihm implemented in this patch is different from the previous
one. If you are interested the details, you want to read the file
comment of ICF.cpp.
llvm-svn: 247770
Previously, LLD's ICF couldn't merge cyclic graphs. That was unfortunate
because, in COFF, cyclic graphs are not exceptional at all. That is
pretty common.
In this patch, sections are grouped by Tarjan's strongly connected
component algorithm to get acyclic graphs. And then we try to merge
SCCs whose outdegree is zero, and remove them from the graph. This
makes other SCCs to have outdegree zero, so we can repeat the
process until all SCCs are removed. When comparing two SCCs, we handle
cycles properly.
This algorithm works better than previous one. Previously, self-linking
produced a 29.0MB executable. It now produces a 27.7MB. There's still some
gap compared to MSVC linker which produces a 27.1MB executable for the
same input. So the gap is narrowed, but still LLD is not on par with MSVC.
I'll investigate that later.
llvm-svn: 247387
Identical COMDAT Folding is a feature to merge COMDAT sections
by contents. Two sections are considered the same if their contents,
relocations, attributes, etc, are all the same.
An interesting fact is that MSVC linker takes "iterations" parameter
for ICF because the algorithm they are using is iterative. Merging
two sections could make more sections to be mergeable because
different relocations could now point to the same section. ICF is
repeated until we get a convergence (until no section can be merged).
This algorithm is not fast. Usually it needs three iterations until a
convergence is obtained.
In the new algorithm implemented in this patch, we consider sections
and relocations as a directed acyclic graph, and we try to merge
sections whose outdegree is zero. Sections with outdegree zero are then
removed from the graph, which makes other sections to have outdegree
zero. We repeat that until all sections are processed. In this
algorithm, we don't iterate over the same sections many times.
There's an apparent issue in the algorithm -- the section graph is
not guaranteed to be acyclic. It's actually pretty often cyclic.
So this algorithm cannot eliminate all possible duplicates.
That's OK for now because the previous algorithm was not able to
eliminate cycles too. I'll address the issue in a follow-up patch.
llvm-svn: 246878
Previously, we calculated our own hash values for section contents.
Of coruse that's slow because we had to access all bytes in sections.
Fortunately, COFF objects usually contain hash values for COMDAT
sections. We can use that to speed up Identical COMDAT Folding.
llvm-svn: 246869
The option is added in MSVC 2015, and there's no documentation about
what the option is. This patch is to ignore the option for now, so that
at least LLD is usable with MSVC 2015.
llvm-svn: 246780
This patch fixes a subtle incompatibility with MSVC linker.
MSVC linker preserves the original spelling of a DLL in the
import descriptor table. LLD previously converted all
characters to lowercase. Usually this difference is benign,
but if a program explicitly checks for DLL file names, the
program could fail.
llvm-svn: 246620
In r246424, I made a change that disables non-DLL to export
symbols. It turned out that the change was not correct. Both
DLLs and executables are able to export symbols (although the
latter is relatively rare). This change restores the feature.
llvm-svn: 246537
I have totally no idea why, but MSVC linker is sensitive about
file names of archive members. If we do not make import library
file names to the same as the DLL name, MSVC link *crashes*
when it is processing the library file. This patch is to set
the same name.
llvm-svn: 246535
The rules for dllexported symbols are overly complicated due to
x86 name decoration, fuzzy symbol resolution, and the fact that
one symbol can be resolved by so many different names. The rules
are probably intended to be "intuitive", so that users don't have
to understand the name mangling schemes, but it seems that it can
lead to unintended symbol exports.
To make it clear what I'm trying to do with this patch, let me
write how the export rules are subtle and complicated.
- x86 name decoration: If machine type is i386 and export name
is given by a command line option, like /export:foo, the
real symbol name the linker has to search for is _foo because
all symbols are decorated with "_" prefixes. This doesn't happen
on non-x86 machines. This automatic name decoration happens only
when the name is not C++ mangled.
However, the symbol name exported from DLLs are ones without "_"
on all platforms.
Moreover, if the option is given via .drectve section, no
symbol decoration is done (the reason being that the .drectve
section is created by a compiler and the compiler should always
know the exact name of the symbol, I guess).
- Fuzzy symbol resolution: In addition to x86 name decoration,
the linker has to look for cdecl or C++ mangled symbols
for a given /export. For example, it searches for not only
_foo but also _foo@<number> or ??foo@... for /export:foo.
Previous implementation didn't get it right. I'm trying to make
it as compatible with MSVC linker as possible with this patch
however the rules are. The new code looks a bit messy to me, but
I don't think it can be simpler due to the ad-hoc-ness of the rules.
llvm-svn: 246424
This is exposed via a new flag /opt:lldltojobs=N, where N is the number of
code generation threads.
Differential Revision: http://reviews.llvm.org/D12309
llvm-svn: 246342
lib.exe has a feature to create import library files (which contain
short import files) from module-definition files. Previously, we were
using that feature, but it turned out that the feature is not complete
for us.
There seems no way to specify "Import Types" in module-definition file.
lib.exe always adds "_" to given symbols and specify IMPORT_NAME_UNDECORATE.
We need more fine-grainded control on that value.
This patch teaches LLD to create short import files itself.
We are still using lib.exe, but the use of the tool is limited to create
empty import library files. We then create short import files and add them
to the empty files as new members.
This patch does not intend to change the functionality. LLD produces
the same import libraries as before. I'll make another change to create
different import libraries in a follow-up patch.
llvm-svn: 246292
__NULL_IMPORT_DESCRIPTOR is a symbol used by MSVC liner to construct
the import descriptor table. We do not use the symbol. Previously,
we had code to skip that symbol. That code does not actually do
anything meaningful because no one is referencing the symbol, the
symbol would naturally be ignored. This patch stops recognizing
the symbol.
llvm-svn: 245280
Previously, weak external symbols could reference only symbols that
appeared before them. Although that covers almost all use cases
of weak externals, there are object files out there which contains
weak externals that have forward references.
This patch supports such weak externals.
llvm-svn: 245258
There are some DLLs whose initializers depends on other DLLs'
initializers. The initialization order matters for them.
MSVC linker uses the order of the libraries from the command line.
LLD used ASCII-betical order. So they were incompatible.
This patch makes LLD compatible with MSVC.
llvm-svn: 245201
This is more convenient than the offset from the start of the file as we
don't have to worry about it changing when we move the output section.
This is a port of r245008 from ELF.
llvm-svn: 245018
Sections must start at page boundaries in memory, but they
can be aligned to sector boundaries (512-bytes) on disk.
We aligned them to 4096-byte boundaries even on disk, so we
wasted disk space a bit.
llvm-svn: 244691
MSVC 2015's load configuration object (__load_config_used) contains
references to these symbols. I don't fully understand how it works,
but looks like these symbols are linker-defined ones. So I define them
here in the Driver. With this patch, LLD can self-host with MSVC 2015.
This patch is to link MSVC 2015-produced object files. It does not
implement Control Flow Protection. If I understand correctly, the
linker has to create a bitmap of function entry point addresses for
the CFG runtime. We don't do that yet. Produced executables will not
be protected by CFG.
llvm-svn: 244425
SymbolTable::find(mangle(X)) is equivalent to SymbolTable::findUnderscore(X)
except that the latter is slightly efficient as that doesn't allocate a new
string.
llvm-svn: 244377
This has a few advantages
* Less C++ code (about 300 lines less).
* Less machine code (about 14 KB of text on a linux x86_64 build).
* It is more debugger friendly. Just set a breakpoint on the exit function and
you get the complete lld stack trace of when the error was found.
* It is a more robust API. The errors are handled early and we don't get a
std::error_code hot potato being passed around.
* In most cases the error function in a better position to print diagnostics
(it has more context).
llvm-svn: 244215
Various parameters are passed implicitly using Config global variable
already. Output file path is no different from others, so there was no
special reason to handle that differnetly.
This patch changes the signature of writeResult(SymbolTable *, StringRef)
to writeResult(SymbolTable *).
llvm-svn: 244180
We were printing an error but exiting with 0.
Not sure how to test this. We could add a no-winlib feature,
but that is probably not worth it.
llvm-svn: 244109
I don't remember why I thought that only functions are subject
of garbage collection, but the comment here said so, which is
not correct. Moreover, the code just below the comment does not
do what the comment says -- it handles non-COMDAT, non-function
sections as GC root. As a result, it just handles non-COMDAT
sections as GC root.
This patch cleans that up by removing SectionChunk::isRoot and
use isCOMDAT instead.
llvm-svn: 243700
We want to convince the NT loader not to map these sections into memory.
A good first step is to move them to the end of the executable.
Differential Revision: http://reviews.llvm.org/D11655
llvm-svn: 243680
We create a module-definition file and give that to lib.exe to
create an import library file. A module-definition has to be
syntactically and semantically correct, of course.
There was a case that we created a module-definition file that
lib.exe would complain for duplicate entries. If a user gives
an unmangled and mangled name for the same symbol, we would end
up having two duplicate lines for the mangled name in a module-
definition file.
This patch fixes that issue by uniquefying entries by mangled
symbol name.
llvm-svn: 243587
Windows ARM is the thumb ARM environment, and pointers to thumb code
needs to have its LSB set. When we apply relocations, we need to
adjust the LSB if it points to an executable section.
llvm-svn: 243560
SECREL should sets the 32-bit offset of the target from the beginning
of *target's* output section. Previously, the offset from the beginning
of source's output section was used instead.
SECTION means the target section's index, and not the source section's
index. This patch fixes that issue too.
llvm-svn: 243535
I don't fully understand the rationale behind the name mangling
scheme used for the DLL export table and the import library.
Why only leading "_" is dropped for the import library while
both "_" and "@" are dropped from DLL symbol table? But this seems
to be what MSVC linker does.
llvm-svn: 243490
The linker is now able to link not only LLVM/Clang/LLD for x86 but
even larger programs. I confirmed that it successsfully linked Chrome
for x86. Because the browser is a pretty large program, I think I can
say that the linker is now mostly feature complete. (I'm pretty sure
that there are hidden bugs somewhere, but they shouldn't be significant.)
llvm-svn: 243377
Previously, we ignore /merge option if /debug is specified
because I thought that was MSVC linker did. This was wrong.
/merge shouldn't be ignored even in debug mode.
llvm-svn: 243375
Leaving them in an executable is basically harmless but wastes disk space.
Because no one is using non-DWARF debug info linked by LLD, we can just
remove them.
llvm-svn: 243364
On x64 and x86, we use only one base relocation type, so we handled
base relocations just as a list of RVAs. That doesn't work well for
ARM becuase we have to handle two types of base relocations on ARM.
This patch changes the type of base relocation from uint32_t to
{reltype, uint32_t} to make it easy to port this code to ARM.
llvm-svn: 243197
In many places we assumed that is64() means AMD64 and i386 otherwise.
This assumption is not sound because Windows also supports ARM.
The linker doesn't support ARM yet, but this is a first step.
llvm-svn: 243188
An object file compatible with Safe SEH contains a .sxdata section.
The section contains a list of symbol table indices, each of which
is an exception handler function. A safe SEH-enabled executable
contains a list of exception handler RVAs. So, what the linker has
to do to support Safe SEH is basically to read the .sxdata section,
interpret the contents as a list of symbol indices, unique-fy and
sort their RVAs, and then emit that list to .rdata. This patch
implements that feature.
llvm-svn: 243182
__ImageBase is a special symbol whose value is the image base address.
Previously, we handled __ImageBase symbol as an absolute symbol.
Absolute symbols point to specific locations in memory and the locations
never change even if an image is base-relocated. That means that we
don't have base relocation entries for absolute symbols.
This is not a case for __ImageBase. If an image is base-relocated, its
base address changes, and __ImageBase needs to be shifted as well.
So we have to have base relocations for __ImageBase. That means that
__ImageBase is not really an absolute symbol but a different kind of
symbol.
In this patch, I introduced a new type of symbol -- DefinedRelative.
DefinedRelative is similar to DefinedAbsolute, but it has not a VA but RVA
and is a subject of base relocation. Currently only __ImageBase is of
the new symbol type.
llvm-svn: 243176
Load Configuration field points to a structure containing information
for SEH. That data strucutre is not created by the linker but provided
by an external file. What we have to do is just to set __load_config_used
address to the header.
llvm-svn: 242427
If a symbol is exported as /export:foo, and foo is resolved as a
mangled name (_foo@<number> or ?foo@@Y...), that mangled name should
be written to the export table. Previously, we wrote the original
name to the export table.
llvm-svn: 242342
Because thunks for dllimported symbols contain absolute addresses on x86,
they need to be relocated at load-time. This bug was a cause of crashes
in DLL initialization routines.
llvm-svn: 242259
I am adding support for thin archives. On those, getting the buffer
involves reading another file.
Since we only need an id in here, use the member offset in the archive.
llvm-svn: 242205
Entry name selection rule is already complicated on x64, but it's more
complicated on x86 because of the underscore name mangling scheme.
If one of _main, _main@<number> (a C function) or ?main@@... (a C++ function)
is defined, entry name is _mainCRTStartup. If _wmain, _wmain@<number or
?wmain@@... is defined, entry name is _wmainCRTStartup. And so on.
llvm-svn: 242110
If /delayload option is given, we have to resolve __delayLoadHelper2
since the function is the dynamic loader to delay-load DLLs.
The function name is mangled in x86 as ___delayLoadHelper2@8.
llvm-svn: 242078
clang-cl doesn't compile std::atomic_flag correctly (PR24101). Since the COFF
linker doesn't use threads yet, just revert r241420 and r241481 for now to
work around this clang-cl bug.
llvm-svn: 242006
Symbol foo is mangled as _foo in C and ?foo@@... in C++ on x86.
findMangle has to remove prefix underscore before mangle a given name
as a C++ symbol.
llvm-svn: 241874
Symbol names are usually mangled by appending "_" prefix on x86.
But the mangled name is not used in DLL export table. The export
table contains unmangled names.
llvm-svn: 241872
With this patch, LLD is now able to self-link an .exe file for x86
that runs correctly, although I don't think some headers (particularly
SEH) are not correct. DLL support is coming soon.
llvm-svn: 241857
Previously, we infer machine type at the very end of linking after
all symbols are resolved. That's actually too late because machine
type affects how we mangle symbols (whether or not we need to
add "_").
For example, /entry:foo adds "_foo" to the symbol table if x86 but
"foo" if x64.
This patch moves the code to infer machine type, so that machine
type is inferred based on input files given via the command line
(but not based on .directives files).
llvm-svn: 241843
Symbols exported by DLLs are listed in import library files.
Exported names may be mangled by "Import Name Type" field as
described in PE/COFF spec 7.3. This patch implements that
mangling scheme.
llvm-svn: 241719
Providing a symbol table in the executable is quite useful when
debugging a fully-linked executable without having to reconstruct one
from DWARF.
Differential Revision: http://reviews.llvm.org/D11023
llvm-svn: 241689