Only live symbols are written to the symbol table. Because isLive()
returned false if dead-stripping was disabled entirely, only
non-COMDAT sections were written to the symbol table. This patch fixes
the issue.
llvm-svn: 247856
This is a patch to make LLD to be on par with MSVC in terms of ICF
effectiveness. MSVC produces a 27.14MB executable when linking LLD.
LLD previously produced a 27.61MB when self-linking. Now the size
is reduced to 27.11MB. Note that without ICF the size is 29.63MB.
In r247387, I implemented an algorithm that handles section graphs
as cyclic graphs and merge them using SCC. The algorithm did not
always work as intended as I demonstrated in r247721. The new
algortihm implemented in this patch is different from the previous
one. If you are interested the details, you want to read the file
comment of ICF.cpp.
llvm-svn: 247770
In this test, we have two functions, foo and bar. MSVC linker can
choose one and discard the other using ICF. LLD cannot. I add this
test as a TODO.
foo and bar are conceptually equivalent to the following:
void foo() { foo(); }
void bar() { foo(); }
foo and bar are effectively the same function. If foo and bar are
compiled to the same instructions, both their contents (foo and bar)
and relocation targets (foo) become the same, so from the ICF point
of view, they are reducible. But their graphs are not isomorphic!
LLD's ICF algorithm cannot handle this case yet.
llvm-svn: 247721
Previously, LLD's ICF couldn't merge cyclic graphs. That was unfortunate
because, in COFF, cyclic graphs are not exceptional at all. That is
pretty common.
In this patch, sections are grouped by Tarjan's strongly connected
component algorithm to get acyclic graphs. And then we try to merge
SCCs whose outdegree is zero, and remove them from the graph. This
makes other SCCs to have outdegree zero, so we can repeat the
process until all SCCs are removed. When comparing two SCCs, we handle
cycles properly.
This algorithm works better than previous one. Previously, self-linking
produced a 29.0MB executable. It now produces a 27.7MB. There's still some
gap compared to MSVC linker which produces a 27.1MB executable for the
same input. So the gap is narrowed, but still LLD is not on par with MSVC.
I'll investigate that later.
llvm-svn: 247387
I don't understand why the previous code is pretty flaky and
the new code is at least less flaky, but the original test
occasionally failed on the second run of lib.exe.
My guess was that lib.exe was failing because the output of
the echo command executed immediately before lib.exe was not
flushed to a file, but as far as I can say, the file
descriptor is properly closed in TestRunner.py, so this's
probably not correct. Other theory is that, on Windows, file
output is not guaranteed to be visible to other processes even
if a process flushes file descriptors, but I'd think that's
unlikely. So honestly I don't know the cause yet.
llvm-svn: 246621
This patch fixes a subtle incompatibility with MSVC linker.
MSVC linker preserves the original spelling of a DLL in the
import descriptor table. LLD previously converted all
characters to lowercase. Usually this difference is benign,
but if a program explicitly checks for DLL file names, the
program could fail.
llvm-svn: 246620
In r246424, I made a change that disables non-DLL to export
symbols. It turned out that the change was not correct. Both
DLLs and executables are able to export symbols (although the
latter is relatively rare). This change restores the feature.
llvm-svn: 246537
I have totally no idea why, but MSVC linker is sensitive about
file names of archive members. If we do not make import library
file names to the same as the DLL name, MSVC link *crashes*
when it is processing the library file. This patch is to set
the same name.
llvm-svn: 246535
The rules for dllexported symbols are overly complicated due to
x86 name decoration, fuzzy symbol resolution, and the fact that
one symbol can be resolved by so many different names. The rules
are probably intended to be "intuitive", so that users don't have
to understand the name mangling schemes, but it seems that it can
lead to unintended symbol exports.
To make it clear what I'm trying to do with this patch, let me
write how the export rules are subtle and complicated.
- x86 name decoration: If machine type is i386 and export name
is given by a command line option, like /export:foo, the
real symbol name the linker has to search for is _foo because
all symbols are decorated with "_" prefixes. This doesn't happen
on non-x86 machines. This automatic name decoration happens only
when the name is not C++ mangled.
However, the symbol name exported from DLLs are ones without "_"
on all platforms.
Moreover, if the option is given via .drectve section, no
symbol decoration is done (the reason being that the .drectve
section is created by a compiler and the compiler should always
know the exact name of the symbol, I guess).
- Fuzzy symbol resolution: In addition to x86 name decoration,
the linker has to look for cdecl or C++ mangled symbols
for a given /export. For example, it searches for not only
_foo but also _foo@<number> or ??foo@... for /export:foo.
Previous implementation didn't get it right. I'm trying to make
it as compatible with MSVC linker as possible with this patch
however the rules are. The new code looks a bit messy to me, but
I don't think it can be simpler due to the ad-hoc-ness of the rules.
llvm-svn: 246424
This is exposed via a new flag /opt:lldltojobs=N, where N is the number of
code generation threads.
Differential Revision: http://reviews.llvm.org/D12309
llvm-svn: 246342
ICF is a feature to merge sections not by name (which is the regular
COMDAT merging) but by contents. If two or more sections have the
identical contents and relocations, ICF merges them to save space.
Accessors or templated functions tend to have the same contents, and
ICF can hold them.
If we consider sections as vertices and relocations as edges, the
problem is to find as many isomorphic graphs as possile from input
graphs. MSVC linker is smart enough to identify isomorphic graphs
even if they contain circles (GNU gold cannot handle circles
according to http://research.google.com/pubs/pub36912.html, so this
is impressive).
Circular references are not uncommon in COFF object files.
One example is .pdata. .pdata sections contain exception handler info
for functions, so they naturally have relocations for the functions.
The functions in turn have references to the .pdata sections so that
the functions and their .pdata are linked together. As a result, they
form circles.
This is a test case for circular graphs. LLD is not able to handle
this test case yet. I'll add code soon.
llvm-svn: 245827
The old test files were just compiler outputs, so it was hard to
debug if something goes wrong. The new test file is carefully
hand-crafted to trigger ICF to avoid that.
llvm-svn: 245826
Previously, weak external symbols could reference only symbols that
appeared before them. Although that covers almost all use cases
of weak externals, there are object files out there which contains
weak externals that have forward references.
This patch supports such weak externals.
llvm-svn: 245258
There are some DLLs whose initializers depends on other DLLs'
initializers. The initialization order matters for them.
MSVC linker uses the order of the libraries from the command line.
LLD used ASCII-betical order. So they were incompatible.
This patch makes LLD compatible with MSVC.
llvm-svn: 245201
Sections must start at page boundaries in memory, but they
can be aligned to sector boundaries (512-bytes) on disk.
We aligned them to 4096-byte boundaries even on disk, so we
wasted disk space a bit.
llvm-svn: 244691
We were printing an error but exiting with 0.
Not sure how to test this. We could add a no-winlib feature,
but that is probably not worth it.
llvm-svn: 244109
Right now PE image section addresses are RVAs and symbol addresses are
VAs. We should probably fix this by changing section addresses to match
symbol addresses. Fixing this might take a few hours, so temporarily
disable the objdump part of this test.
llvm-svn: 243758
We want to convince the NT loader not to map these sections into memory.
A good first step is to move them to the end of the executable.
Differential Revision: http://reviews.llvm.org/D11655
llvm-svn: 243680
Windows ARM is the thumb ARM environment, and pointers to thumb code
needs to have its LSB set. When we apply relocations, we need to
adjust the LSB if it points to an executable section.
llvm-svn: 243560
SECREL should sets the 32-bit offset of the target from the beginning
of *target's* output section. Previously, the offset from the beginning
of source's output section was used instead.
SECTION means the target section's index, and not the source section's
index. This patch fixes that issue too.
llvm-svn: 243535
I don't fully understand the rationale behind the name mangling
scheme used for the DLL export table and the import library.
Why only leading "_" is dropped for the import library while
both "_" and "@" are dropped from DLL symbol table? But this seems
to be what MSVC linker does.
llvm-svn: 243490
Previously, we ignore /merge option if /debug is specified
because I thought that was MSVC linker did. This was wrong.
/merge shouldn't be ignored even in debug mode.
llvm-svn: 243375