Commit Graph

81 Commits

Author SHA1 Message Date
Rui Ueyama f5313b3498 COFF: Allow mangled symbols as arguments for /export.
Usually dllexported symbols are defined with 'extern "C"',
so identifying them is easy. We can just do hash table lookup
to look up exported symbols.

However, C++ non-member functions are also allowed to be exported,
and they can be specified with unmangled name. So, if /export:foo
is given, we need to look up not only "foo" but also its all
mangled names. In MSVC mangling scheme, that means that we need to
look up any symbol which starts with "?foo@@Y".

In this patch, we scan the entire symbol table to search for
a mangled symbol. The symbol table is a DenseMap, and that doesn't
support table lookup by string prefix. This is of course very
inefficient. But that should be probably OK because the user
should always add 'extern "C"' to dllexported symbols.

llvm-svn: 240919
2015-06-28 22:16:41 +00:00
Rui Ueyama 091b1a6585 COFF: Fix flaky test.
This test was flaky because stdout and stderr can be mixed.

llvm-svn: 240918
2015-06-28 22:06:53 +00:00
Rui Ueyama a8b60458ea COFF: Add /noentry flag.
This option is sometimes used to create a resource-only DLL that
doesn't need any initialization.

llvm-svn: 240915
2015-06-28 19:56:30 +00:00
Rui Ueyama fe6d11c0db Fix broken test.
llvm-svn: 240914
2015-06-28 19:38:10 +00:00
Rui Ueyama 95925fd1ab COFF: Support /force flag.
This option is to ignore remaining undefined symbols and force
the linker to create an output file anyways.

The existing code assumes that there's no undefined symbol after
reportRemainingUndefines(). That assumption is legitimate.
I also don't want to mess up the existing code for this minor feature.
In order to keep it as is, remaining undefined symbols are replaced
with dummy defined symbols.

llvm-svn: 240913
2015-06-28 19:35:15 +00:00
Rui Ueyama 06cf3df2d5 COFF: Handle LINK environment variable.
If LINK is defined and not empty, it's supposed to contain
command line options.

llvm-svn: 240900
2015-06-28 02:35:31 +00:00
Rui Ueyama 871847e32d COFF: Fix ICF correctness bug.
When comparing two COMDAT sections, we need to take section values
and associative sections into account. This patch fixes that bug.
It fixes a crash bug of llvm-tblgen when linked with /opt:lldicf.

One thing I don't understand yet is that this logic seems to be
too strict. MSVC linker is able to create more compact executables
(which of course work correctly). With this ICF algorithm, LLD is
able to make executable smaller, but the outputs are larger than
MSVC's. There must be something I'm missing here.

llvm-svn: 240897
2015-06-28 01:30:54 +00:00
Rui Ueyama 810551a694 COFF: Add base relocation for delay-import table.
Because the address table of the delay-import table contains
absolute address, it needs to be added to the base relocation
table.

llvm-svn: 240844
2015-06-26 22:05:32 +00:00
Rui Ueyama 382dc96e29 COFF: Fix delay-import tables.
There were a few issues with the previous delay-import tables.

 - "Attribute" field should have been 1 instead of 0.
   (I don't know the meaning of this field, though.)
 - LEA and CALL operands had wrong addresses.
 - Address tables are in .didat (which is read-only).
   They should have been in .data.

llvm-svn: 240837
2015-06-26 21:40:15 +00:00
Peter Collingbourne be54955bba COFF: Implement /lldmap flag.
This flag can be used to produce a map file, which is essentially a list
of objects linked into the final output file together with the RVAs of
their symbols. Because our format differs from MSVC's we expose it as a
separate flag.

Differential Revision: http://reviews.llvm.org/D10773

llvm-svn: 240812
2015-06-26 18:58:24 +00:00
Rui Ueyama 7383562bc9 COFF: Align DLL import thunks on 16-byte boundaries.
llvm-svn: 240806
2015-06-26 18:28:56 +00:00
Rui Ueyama 923cb64801 COFF: Add a test for r240719.
llvm-svn: 240758
2015-06-26 03:50:27 +00:00
Rui Ueyama 32f8e1cb4e COFF: Change symbol resolution order for entry and /include.
We were resolving entry symbols and /include'd symbols after all other
symbols are resolved. But looks like it's too late. I found that it
causes some program to fail to link.

Let's say we have an object file A which defines symbols X and Y in an
archive. We also have another file B after A which defines X, Y and
_DLLMainCRTStartup in another archive. They conflict each other, so
either A or B can be linked.

If we have _DLLMainCRTStartup as an undefined symbol, file B is always
chosen. If not, there's a chance that A is chosen. If the linker
find it needs _DllMainCRTStartup after that, it's too late.

This patch adds undefined symbols to the symbol table as soon as
possible to fix the issue.

llvm-svn: 240757
2015-06-26 03:44:00 +00:00
Rui Ueyama ccde19d77e COFF: Fix local absolute symbols.
Absolute symbols were always handled as external symbols, so if two
or more object files define the same absolute symbol, they would
conflict even if the symbol is private to each file.
This patch fixes that bug.

llvm-svn: 240756
2015-06-26 03:09:23 +00:00
Rui Ueyama a29948873f COFF: Don't read non-x64 object files.
Currently the new LLD supports only x86-64.

llvm-svn: 240749
2015-06-26 00:42:21 +00:00
Rui Ueyama f799edef28 COFF: Rename /opt:icf -> /opt:lldicf.
ICF implemented in LLD is so experimental that we don't want to
enable that even if /opt:icf option is passed. I'll rename it back
once the feature is complete.

llvm-svn: 240721
2015-06-25 23:26:58 +00:00
Rui Ueyama 5817ebb0c8 COFF: Fix lexer for the module-definition file.
Previously it would hang if there's a stray punctuation (e.g. ?).

llvm-svn: 240697
2015-06-25 21:06:00 +00:00
Rui Ueyama 88e0f9206b COFF: Fix a bug of __imp_ symbol.
The change I made in r240620 was not correct. If a symbol foo is
defined, and if you use __imp_foo, __imp_foo symbol is automatically
defined as a pointer (not just an alias) to foo.

Now that we need to create a chunk for automatically-created symbols.
I defined LocalImportChunk class for them.

llvm-svn: 240622
2015-06-25 03:31:47 +00:00
Rui Ueyama d766653534 COFF: Handle undefined symbols starting with __imp_ in a special way.
MSVC linker is able to link an object file created from the following code.
Note that __imp_hello is not defined anywhere.

  void hello() { printf("Hello\n"); }
  extern void (*__imp_hello)();
  int main() { __imp_hello(); }

Function symbols exported from DLLs are automatically mangled by appending
__imp_ prefix, so they have two names (original one and with the prefix).
This "feature" seems to simulate that behavior even for non-DLL symbols.

This is in my opnion very odd feature. Even MSVC linker warns if you use this.
I'm adding that anyway for the sake of compatibiltiy.

llvm-svn: 240620
2015-06-25 02:21:44 +00:00
Rui Ueyama ddf71fc370 COFF: Initial implementation of Identical COMDAT Folding.
Identical COMDAT Folding (ICF) is an optimization to reduce binary
size by merging COMDAT sections that contain the same metadata,
actual data and relocations. MSVC link.exe and many other linkers
have this feature. LLD achieves on per with MSVC in terms produced
binary size with this patch.

This technique is pretty effective. For example, LLD's size is
reduced from 64MB to 54MB by enaling this optimization.

The algorithm implemented in this patch is extremely inefficient.
It puts all COMDAT sections into a set to identify duplicates.
Time to self-link with/without ICF are 3.3 and 320 seconds,
respectively. So this option roughly makes LLD 100x slower.
But it's okay as I wanted to achieve correctness first.
LLD is still able to link itself with this optimization.
I'm going to make it more efficient in followup patches.

Note that this optimization is *not* entirely safe. C/C++ require
different functions have different addresses. If your program
relies on that property, your program wouldn't work with ICF.
However, it's not going to be an issue on Windows because MSVC
link.exe turns ICF on by default. As long as your program works
with default settings (or not passing /opt:noicf), your program
would work with LLD too.

llvm-svn: 240519
2015-06-24 04:36:52 +00:00
Peter Collingbourne c7b685d997 COFF: Ignore debug symbols.
Differential Revision: http://reviews.llvm.org/D10675

llvm-svn: 240487
2015-06-24 00:05:50 +00:00
Rui Ueyama 0d2e999050 COFF: Make link order compatible with MSVC link.exe.
Previously, we added files in directive sections to the symbol
table as we read the sections, so the link order was depth-first.
That's not compatible with MSVC link.exe nor the old LLD.

This patch is to queue files so that new files are added to the
end of the queue and processed last. Now addFile() doesn't parse
files nor resolve symbols. You need to call run() to process
queued files.

llvm-svn: 240483
2015-06-23 23:56:39 +00:00
Rui Ueyama a77336bd5d COFF: Support delay-load import tables.
DLLs are usually resolved at process startup, but you can
delay-load them by passing /delayload option to the linker.

If a /delayload is specified, the linker has to create data
which is similar to regular import table.
One notable difference is that the pointers in a delay-load
import table are originally pointing to thunks that resolves
themselves. Each thunk loads a DLL, resolve its name, and then
overwrites the pointer with the result so that subsequent
function calls directly call a desired function. The linker
has to emit thunks.

llvm-svn: 240250
2015-06-21 22:31:52 +00:00
Rui Ueyama 4d769c3a57 COFF: Support exception table.
.pdata section contains a list of triplets of function start address,
function end address and its unwind information. Linkers have to
sort section contents by function start address and set the section
address to the file header (so that runtime is able to find it and
do binary search.)

This change seems to resolve all but one remaining test failures in
check{,-clang,-lld} when building the entire stuff with clang-cl and
lld-link.

llvm-svn: 240231
2015-06-21 04:00:54 +00:00
Rui Ueyama 5e31d0b2e9 COFF: Fix common symbol alignment.
llvm-svn: 240217
2015-06-20 07:25:45 +00:00
Rui Ueyama efb7e1aa29 COFF: Fix a common symbol bug.
This is a case that one mistake caused a very mysterious bug.
I made a mistake to calculate addresses of common symbols, so
each common symbol pointed not to the beginning of its location
but to the end of its location. (Ouch!)

Common symbols are aligned on 16 byte boundaries. If a common
symbol is small enough to fit between the end of its real
location and whatever comes next, this bug didn't cause any harm.

However, if a common symbol is larger than that, its memory
naturally overlapped with other symbols. That means some
uninitialized variables accidentally shared memory. Because
totally unrelated memory writes mutated other varaibles, it was
hard to debug.

It's surprising that LLD was able to link itself and all LLD
tests except gunit tests passed with this nasty bug.

With this fix, the new COFF linker is able to pass all tests
for LLVM, Clang and LLD if I use MSVC cl.exe as a compiler.
Only three tests are failing when used with clang-cl.

llvm-svn: 240216
2015-06-20 07:21:57 +00:00
Rui Ueyama f00df0af2d COFF: Fix precedence between LIB and /libpath.
/libpath should take precedence over LIB.
Previously, LIB took precedence over /libpath.

llvm-svn: 240182
2015-06-19 22:39:48 +00:00
Rui Ueyama 165b254e06 COFF: Add search paths in the correct order.
Previously, we added search paths in reverse order.

llvm-svn: 240180
2015-06-19 21:44:32 +00:00
Rui Ueyama 573bf7de9c COFF: Continue reading object files until converge.
In this linker model, adding an undefined symbol may trigger chain
reactions. It may trigger a Lazy symbol to read a new file.
A new file may contain a directive section, which may contain various
command line options.

Previously, we didn't handle chain reactions well. We visited /include'd
symbols only once, so newly-added /include symbols were ignored.
This patch fixes that bug.

Now, the symbol table is versioned; every time the symbol table is
updated, the version number is incremented. We repeat adding undefined
symbols until the version number does not change. It is guaranteed to
converge -- the number of undefined symbol in the system is finite,
and adding the same undefined symbol more than once is basically no-op.

llvm-svn: 240177
2015-06-19 21:12:48 +00:00
Rui Ueyama 4d2834bd7b COFF: Don't add new undefined symbols for /alternatename.
Alternatename option is in the form of /alternatename:<from>=<to>.
It's effect is to resolve <from> as <to> if <from> is still undefined
at end of name resolution.

If <from> is not undefined but completely a new symbol, alternatename
shouldn't do anything. Previously, it introduced a new undefined
symbol for <from>, which resulted in undefined symbol error.

llvm-svn: 240161
2015-06-19 19:23:43 +00:00
Rui Ueyama 08d5e1875f COFF: Handle /include in .drectve.
We don't want to insert a new symbol to the symbol table while reading
a .drectve section because it's going to be too complicated.
That we are reading a directive section means that we are currently
reading some object file. Adding a new undefined symbol to the symbol
table can trigger a library file to read a new file, so it would make
the call stack too deep.

In this patch, I add new symbol names to a list to resolve them later.

llvm-svn: 240076
2015-06-18 23:20:11 +00:00
Rui Ueyama e8d56b5258 COFF: Allow identical alternatename options.
Alternatename option is in the form of /alternatename:<from>=<to>.
It is an error if there are two options having the same <from> but
different <to>. It is *not* an error if both are the same.

llvm-svn: 240075
2015-06-18 23:04:26 +00:00
Rui Ueyama 562daa8148 COFF: Unknown options in .drectve section is an error.
We skip unknown options in the command line with a warning message
being printed out, but we shouldn't do that for .drectve section.
The section is not visible to the user. We should handle unknown
options as an error.

llvm-svn: 240067
2015-06-18 21:50:38 +00:00
Rui Ueyama b95188cb2c COFF: Add /implib option.
llvm-svn: 240045
2015-06-18 20:27:09 +00:00
Rui Ueyama 2edb35a264 COFF: Handle /alternatename in .drectve section.
llvm-svn: 240037
2015-06-18 19:09:30 +00:00
Peter Collingbourne 8b2492f2a0 COFF: Implement DLL symbol exports for bitcode files.
Differential Revision: http://reviews.llvm.org/D10530

llvm-svn: 239994
2015-06-18 05:22:15 +00:00
Rui Ueyama ae36985af7 COFF: Fix entry point inference bug.
Previously, LLD couldn't find a default entry point if it's
defined by a library.

llvm-svn: 239982
2015-06-18 00:40:33 +00:00
Rui Ueyama 24c5fd0419 COFF: Support /manifest{,uac,dependency,file} options.
The linker has to create an XML file for each executable.
This patch supports that feature.

You can optionally embed an XML file to an executable as .rsrc
section. If you choose to do that (by passing /manifest:embed
option), the linker has to create a textual resource file
containing an XML file, compile that using rc.exe to a binary
resource file, conver that resource file to a COFF file using
cvtres.exe, and then link that COFF file. This patch implements
that feature too.

llvm-svn: 239978
2015-06-18 00:12:42 +00:00
Rui Ueyama 151d862d97 COFF: Create import library files.
On Windows, we have to create a .lib file for each .dll.
When linking against DLLs, the linker doesn't use the DLL files,
but instead read a list of dllexported symbols from corresponding
lib files.

A library file containing descriptors of a DLL is called an
import library file.

lib.exe has a feature to create an import library file from a
module-definition file. In this patch, we create a module-definition
file and pass that to lib.exe.

We eventually want to create an import library file by ourselves
to eliminate dependency to lib.exe. For now, we just use the MSVC
tool.

llvm-svn: 239937
2015-06-17 20:40:43 +00:00
Rui Ueyama 39026589b9 COFF: Fix a test which was failing with debug build.
llvm-svn: 239931
2015-06-17 19:28:01 +00:00
Rui Ueyama 1f373704e3 COFF: Support module-definition files.
Module-definition files (.def files) are yet another way to
specify parameters to the linker. You can write a list of dllexported
symbols in module-definition files instead of using /export command
line option. It also supports a few more directives.

The parser code is taken from lib/Driver/WinLinkModuleDef.cpp
with the following modifications.

 - variable names are updated to comply with the LLVM coding style.
 - Instead of returning parsing results as "directive" objects,
   it updates Config object directly.

llvm-svn: 239929
2015-06-17 19:19:25 +00:00
Rui Ueyama 97dff9ee3a COFF: Support creating DLLs.
DLL files are in the same format as executables but they have export tables.
The format of the export table is described in PE/COFF spec section 5.3.

A new class, EdataContents, takes care of creating chunks for export tables.
What we need to do is to parse command line flags for dllexports, and then
instantiate the class to create chunks. For the writer, export table chunks
are opaque data -- it just add chunks to .edata section.

llvm-svn: 239869
2015-06-17 00:16:33 +00:00
Rui Ueyama 1e974577d4 COFF: Fix tests.
I was accidentally testing not -flavor link2 but -flavor link.

llvm-svn: 239868
2015-06-16 23:51:58 +00:00
Rui Ueyama 6592ff8c93 COFF: Add miscellaneous boolean flags.
llvm-svn: 239864
2015-06-16 23:13:00 +00:00
Rui Ueyama bc2cc7d0b8 COFF: Fix .reloc section attributes.
llvm-svn: 239738
2015-06-15 18:03:47 +00:00
Rui Ueyama 59e9578f20 COFF: Fix resource table size.
The size field shouldn't include trailing padding.

llvm-svn: 239712
2015-06-15 01:35:56 +00:00
Rui Ueyama 588e832d0a COFF: Support base relocations.
PE/COFF executables/DLLs usually contain data which is called
base relocations. Base relocations are a list of addresses that
need to be fixed by the loader if load-time relocation is needed.

Base relocations are in .reloc section.

We emit one base relocation entry for each IMAGE_REL_AMD64_ADDR64
relocation.

In order to save disk space, base relocations are grouped by page.
Each group is called a block. A block starts with a 32-bit page
address followed by 16-bit offsets in the page. That is more
efficient representation of addresses than just an array of 32-bit
addresses.

llvm-svn: 239710
2015-06-15 01:23:58 +00:00
Rui Ueyama 2bf6a12238 COFF: Support Windows resource files.
Resource files are data files containing i18n messages, icon images, etc.
MSVC has a tool to convert a resource file to a regular COFF file so that
you can just link that file to embed resources to an executable.

However, you can directly pass resource files to the linker. If you do that,
the linker invokes the tool automatically. This patch implements that feature.

llvm-svn: 239704
2015-06-14 21:50:50 +00:00
Peter Collingbourne 1b6fd1f5fd COFF: Symbol resolution for common and comdat symbols defined in bitcode.
In the case where either a bitcode file and a regular file or two bitcode
files export a common or comdat symbol with the same name, the linker needs
to pick one of them following COFF semantics. This patch implements a design
for resolving such symbols that pushes most of the work onto either LLD's
regular mechanism for resolving common or comdat symbols or the IR linker's
mechanism for doing the same.

We modify SymbolBody::compare to always prefer non-bitcode symbols, so that
during the initial phase of symbol resolution, the symbol table always contains
a regular symbol in any case where we need to choose between a regular and
a bitcode symbol. In SymbolTable::addCombinedLTOObject, we force export
any bitcode symbols that were initially pre-empted by a regular symbol,
and later use SymbolBody::compare to choose between the regular symbol in
the symbol table and the regular symbol from the combined LTO object file.

This design seems to be sound, so long as the resolution mechanism is defined
to be commutative and associative modulo arbitrary choices between symbols
(which seems to be the case for COFF).

Differential Revision: http://reviews.llvm.org/D10329

llvm-svn: 239563
2015-06-11 21:49:54 +00:00
Peter Collingbourne 73b75e3d0c COFF: Handle references from LTO object to lazy symbols correctly.
The code generator may create references to runtime library symbols such as
__chkstk which were not visible via LTOModule. Handle these cases by loading
the object file from the library, but abort if we end up having loaded any
bitcode objects.

Because loading the object file may have introduced new undefined references,
call reportRemainingUndefines again to detect and report them.

Differential Revision: http://reviews.llvm.org/D10332

llvm-svn: 239386
2015-06-09 04:29:54 +00:00