Commit Graph

59 Commits

Author SHA1 Message Date
Reid Kleckner 502d4ce2e4 [COFF] Improve synthetic symbol handling
Summary:
The main change is that we can have SECREL and SECTION relocations
against ___safe_se_handler_table, which is important for handling the
debug info in the MSVCRT.

Previously we were using DefinedRelative for __safe_se_handler_table and
__ImageBase, and after we implement CFGuard, we plan to extend it to
handle __guard_fids_table, __guard_longjmp_table, and more.  However,
DefinedRelative is really only suitable for implementing __ImageBase,
because it lacks a Chunk, which you need in order to figure out the
output section index and output section offset when resolving SECREl and
SECTION relocations.

This change renames DefinedRelative to DefinedSynthetic and gives it a
Chunk. One wart is that __ImageBase doesn't have a chunk. It points to
the PE header, effectively. We could split DefinedRelative and
DefinedSynthetic if we think that's cleaner and creates fewer special
cases.

I also added safeseh.s, which checks that we don't emit a safe seh table
entries pointing to garbage collected handlers and that we don't emit a
table at all when there are no handlers.

Reviewers: ruiu

Reviewed By: ruiu

Subscribers: inglorion, pcc, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D34577

llvm-svn: 306293
2017-06-26 15:39:52 +00:00
Rui Ueyama 01f93335a0 Use make<> everywhere in COFF to make it consistent with ELF.
We've been using make<> to allocate new objects in ELF. We have
the same function in COFF, but we didn't use it widely due to
negligence. This patch uses the function in COFF to close the gap
between ELF and COFF.

llvm-svn: 303357
2017-05-18 17:03:49 +00:00
Peter Collingbourne dc07769c90 COFF: Remove some undefined declarations. NFC.
llvm-svn: 300910
2017-04-20 22:23:36 +00:00
Rui Ueyama 1e0b158dbd Add an option to use the MSVC linker to link LTO-generated object files.
This patch defines a new command line option, /MSVCLTO, to LLD.
If that option is given, LLD invokes link.exe to link LTO-generated
object files. This is hacky but useful because link.exe can create
PDB files.

Differential Revision: https://reviews.llvm.org/D29526

llvm-svn: 294234
2017-02-06 20:47:55 +00:00
Bob Haarman cde5e5b600 refactor COFF linker to use new LTO API
Summary: The COFF linker previously implemented link-time optimization using an API which has now been marked as legacy. This change refactors the COFF linker to use the new LTO API, which is also used by the ELF linker.

Reviewers: pcc, ruiu

Reviewed By: pcc

Subscribers: mgorny, mehdi_amini

Differential Revision: https://reviews.llvm.org/D29059

llvm-svn: 293967
2017-02-02 23:58:14 +00:00
Peter Collingbourne 6f24fdb6a0 COFF: Change the /lldmap output format to be more like the ELF linker.
Differential Revision: https://reviews.llvm.org/D28717

llvm-svn: 291990
2017-01-14 03:14:46 +00:00
Peter Collingbourne 6ee0b4e9f5 COFF: Open and map input files asynchronously on Windows.
Profiling revealed that the majority of lld's execution time on Windows was
spent opening and mapping input files. We can reduce this cost significantly
by performing these operations asynchronously.

This change introduces a queue for all operations on input file data. When
we discover that we need to load a file (for example, when we find a lazy
archive for an undefined symbol, or when we read a linker directive to
load a file from disk), the file operation is launched using a future and
the symbol resolution operation is enqueued.  This implies another change
to symbol resolution semantics, but it seems to be harmless ("ninja All"
in Chromium still succeeds).

To measure the perf impact of this change I linked Chromium's chrome_child.dll
with both thin and fat archives.

Thin archives:

Before (median of 5 runs): 19.50s
After: 10.93s

Fat archives:

Before: 12.00s
After: 9.90s

On Linux I found that doing this asynchronously had a negative effect on
performance, probably because the cost of mapping a file is small enough that
it becomes outweighed by the cost of managing the futures. So on non-Windows
platforms I use the deferred execution strategy.

Differential Revision: https://reviews.llvm.org/D27768

llvm-svn: 289760
2016-12-15 04:02:23 +00:00
Peter Collingbourne 3f530938f6 COFF: Use CachedHashStringRef in the symbol table.
This resulted in about a 1% perf improvement linking chrome_child.dll.

Differential Revision: https://reviews.llvm.org/D27666

llvm-svn: 289410
2016-12-11 22:15:30 +00:00
Peter Collingbourne 8b65e51b73 COFF: Load inputs immediately instead of adding them to a queue.
This patch replaces the symbol table's object and archive queues, as well as
the convergent loop in the linker driver, with a design more similar to the
ELF linker where symbol resolution directly causes input files to be added to
the link, including input files arising from linker directives. Effectively
this removes the last vestiges of the old parallel input file loader.

Differential Revision: https://reviews.llvm.org/D27660

llvm-svn: 289409
2016-12-11 22:15:25 +00:00
Peter Collingbourne 79a5e6b1b7 COFF: New symbol table design.
This ports the ELF linker's symbol table design, introduced in r268178,
to the COFF linker.

Differential Revision: http://reviews.llvm.org/D21166

llvm-svn: 289280
2016-12-09 21:55:24 +00:00
Rui Ueyama 7a0a076e40 COFF: Use make() in SymbolTable and Writer.
llvm-svn: 289170
2016-12-09 02:13:12 +00:00
Rui Ueyama bf4ad1d63d COFF: Use make() to create a new file object in createFile.
llvm-svn: 289097
2016-12-08 20:20:22 +00:00
Rui Ueyama aa4f4450af Revert r289084: Start using make() in COFF.
This reverts commit r289084 to appease buildbots.

llvm-svn: 289086
2016-12-08 18:49:04 +00:00
Rui Ueyama 843233b920 Start using make() in COFF.
We don't want ELF and COFF to diverge too much.

llvm-svn: 289085
2016-12-08 18:31:18 +00:00
Rui Ueyama be939b3ff6 Do plumbing work for CodeView debug info.
Previously, we discarded .debug$ sections. This patch adds them to
files so that PDB.cpp can access them.

This patch also adds a debug option, /dumppdb, to dump debug info
fed to createPDB so that we can verify that valid data has been passed.

llvm-svn: 287555
2016-11-21 17:22:35 +00:00
Davide Italiano eeae124faf [COFF] SmallVector<char, 0> -> SmallString<0>.
This way we're consistent between ELF and COFF.

llvm-svn: 265885
2016-04-09 23:00:31 +00:00
Rui Ueyama 997b357ac1 COFF: Run InputFile::parse() in background using std::async().
Previously, InputFile::parse() was run in batch. We construct a list
of all input files and call parse() on each file using parallel_for_each.
That means we cannot start parsing files until we get a complete list
of input files, although InputFile::parse() is safe to call from anywhere.

This patch makes it asynchronous. As soon as we add a file to the symbol
table, we now start parsing the file using std::async().

This change shortens self-hosting time (650 ms) by 28 ms. It's about 4%
improvement.

llvm-svn: 248109
2015-09-20 03:11:16 +00:00
Peter Collingbourne df5783b7a5 COFF: Implement parallel LTO code generation.
This is exposed via a new flag /opt:lldltojobs=N, where N is the number of
code generation threads.

Differential Revision: http://reviews.llvm.org/D12309

llvm-svn: 246342
2015-08-28 22:16:09 +00:00
Rafael Espindola beee25e484 Make these headers as being c++.
llvm-svn: 245050
2015-08-14 14:12:54 +00:00
Rafael Espindola b835ae8e4a Port the error functions from ELF to COFF.
This has a few advantages

* Less C++ code (about 300 lines less).
* Less machine code (about 14 KB of text on a linux x86_64 build).
* It is more debugger friendly. Just set a breakpoint on the exit function and
  you get the complete lld stack trace of when the error was found.
* It is a more robust API. The errors are handled early and we don't get a
  std::error_code hot potato being passed around.
* In most cases the error function in a better position to print diagnostics
  (it has more context).

llvm-svn: 244215
2015-08-06 14:58:50 +00:00
Rui Ueyama 506f6d1ae1 COFF: _tls_used is __tls_used on x86.
llvm-svn: 243495
2015-07-28 22:56:02 +00:00
Rui Ueyama cd3f99b6c5 COFF: Implement Safe SEH support for x86.
An object file compatible with Safe SEH contains a .sxdata section.
The section contains a list of symbol table indices, each of which
is an exception handler function. A safe SEH-enabled executable
contains a list of exception handler RVAs. So, what the linker has
to do to support Safe SEH is basically to read the .sxdata section,
interpret the contents as a list of symbol indices, unique-fy and
sort their RVAs, and then emit that list to .rdata. This patch
implements that feature.

llvm-svn: 243182
2015-07-24 23:51:14 +00:00
Rui Ueyama 3cb895c930 COFF: Fix __ImageBase symbol relocation.
__ImageBase is a special symbol whose value is the image base address.
Previously, we handled __ImageBase symbol as an absolute symbol.

Absolute symbols point to specific locations in memory and the locations
never change even if an image is base-relocated. That means that we
don't have base relocation entries for absolute symbols.

This is not a case for __ImageBase. If an image is base-relocated, its
base address changes, and __ImageBase needs to be shifted as well.
So we have to have base relocations for __ImageBase. That means that
__ImageBase is not really an absolute symbol but a different kind of
symbol.

In this patch, I introduced a new type of symbol -- DefinedRelative.
DefinedRelative is similar to DefinedAbsolute, but it has not a VA but RVA
and is a subject of base relocation. Currently only __ImageBase is of
the new symbol type.

llvm-svn: 243176
2015-07-24 22:58:44 +00:00
Rui Ueyama a50387f1b3 COFF: Fix entry name inference for x86.
Entry name selection rule is already complicated on x64, but it's more
complicated on x86 because of the underscore name mangling scheme.

If one of _main, _main@<number> (a C function) or ?main@@... (a C++ function)
is defined, entry name is _mainCRTStartup. If _wmain, _wmain@<number or
?wmain@@... is defined, entry name is _wmainCRTStartup. And so on.

llvm-svn: 242110
2015-07-14 02:58:13 +00:00
Rui Ueyama ea533cde30 COFF: Infer machine type earlier than before.
Previously, we infer machine type at the very end of linking after
all symbols are resolved. That's actually too late because machine
type affects how we mangle symbols (whether or not we need to
add "_").

For example, /entry:foo adds "_foo" to the symbol table if x86 but
"foo" if x64.

This patch moves the code to infer machine type, so that machine
type is inferred based on input files given via the command line
(but not based on .directives files).

llvm-svn: 241843
2015-07-09 19:54:13 +00:00
Peter Collingbourne 2612a32ce5 COFF: Numerous fixes for interaction between LTO and weak externals.
We were previously hitting assertion failures in the writer in cases where
a regular object file defined a weak external symbol that was defined by
a bitcode file. Because /export and /entry name mangling were implemented
using weak externals, the same problem affected mangled symbol names in
bitcode files.

The underlying cause of the problem was that weak external symbols were
being resolved before doing LTO, so the symbol table may have contained stale
references to bitcode symbols. The fix here is to defer weak external symbol
resolution until after LTO.

Also implement support for weak external symbols in bitcode files
by modelling them as replaceable DefinedBitcode symbols.

Differential Revision: http://reviews.llvm.org/D10940

llvm-svn: 241391
2015-07-04 05:28:41 +00:00
Rui Ueyama 49d6cd35ad COFF: Fix /base option.
Previously, __ImageBase symbol got a different value than the one
specified by /base:<number> because the symbol was created in the
SymbolTable's constructor. When the constructor is called,
no command line options are processed yet, so the symbol was
created always with the initial value. This caused wrong relocations
and thus caused mysterious crashes of some executables linked by LLD.

llvm-svn: 241313
2015-07-03 00:02:19 +00:00
Rui Ueyama 6be9099140 COFF: Define SymbolTable::insert to simplify. NFC.
llvm-svn: 241311
2015-07-02 22:52:33 +00:00
Rui Ueyama 458d74421b COFF: Merge SymbolTable::find{,Symbol}. NFC
llvm-svn: 241238
2015-07-02 03:59:04 +00:00
Rui Ueyama 85225b0a36 COFF: Infer entry point as early as possible, but not too early.
On Windows, we have four different main functions, {w,}{main,WinMain}.
The linker has to choose a corresponding entry point function among
{w,}{main,WinMain}CRTStartup. These entry point functions are defined
in the standard library. The linker resolves one of them by looking at
which main function is defined and adding a corresponding undefined
symbol to the symbol table.

Object files containing entry point functions conflicts each other.
For example, we cannot resolve both mainCRTStartup and WinMainCRTStartup
because other symbols defined in the files conflict.

Previously, we inferred CRT function name at the very end of name
resolution. I found that that is sometimes too late. If the linker
already linked one of these four archive member objects, it's too late
to change the decision.

The right thing to do here is to infer entry point name after adding
all symbols from command line files and before adding any other files
(which are specified by directive sections). This patch does that.

llvm-svn: 241236
2015-07-02 03:15:15 +00:00
Rui Ueyama 3d4c69c04d COFF: Resolve AlternateNames using weak aliases.
Previously, we use SymbolTable::rename to resolve AlternateName symbols.
This patch is to merge that mechanism with weak aliases, so that we
remove that function.

llvm-svn: 241230
2015-07-02 02:38:59 +00:00
Rui Ueyama 6bf638e688 COFF: Simplify and rename findMangle. NFC.
Occasionally we have to resolve an undefined symbol to its
mangled symbol. Previously, we did that on calling side of
findMangle by explicitly updating SymbolBody.
In this patch, mangled symbols are handled as weak aliases
for undefined symbols.

llvm-svn: 241213
2015-07-02 00:04:14 +00:00
Rui Ueyama 4b6698917d COFF: Simplify SymbolTable::findLazy. NFC.
llvm-svn: 241128
2015-06-30 23:46:52 +00:00
Rui Ueyama 8d3010a1a6 COFF: Change the order of adding symbols to the symbol table.
Previously, the order of adding symbols to the symbol table was simple.
We have a list of all input files. We read each file from beginning of
the list and add all symbols in it to the symbol table.

This patch changes that order. Now all archive files are added to the
symbol table first, and then all the other object files are added.
This shouldn't change the behavior in single-threading, and make room
to parallelize in multi-threading.

In the first step, only lazy symbols are added to the symbol table
because archives contain only Lazy symbols. Member object files
found to be necessary are queued. In the second step, defined and
undefined symbols are added from object files. Adding an undefined
symbol to the symbol table may cause more member files to be added
to the queue. We simply continue reading all object files until the
queue is empty.

Finally, new archive or object files may be added to the queues by
object files' directive sections (which contain new command line
options).

The above process is repeated until we get no new files.

Symbols defined both in object files and in archives can make results
undeterministic. If an archive is read before an object, a new member
file gets linked, while in the other way, no new file would be added.
That is the most popular cause of an undeterministic result or linking
failure as I observed. Separating phases of adding lazy symbols and
undefined symbols makes that deterministic. Adding symbols in each
phase should be parallelizable.

llvm-svn: 241107
2015-06-30 19:35:21 +00:00
Chandler Carruth be6e80b012 [opt] Hoist the call throuh SymbolBody::getReplacement out of the inline
method to get a SymbolBody and into the callers, and kill now dead
includes.

This removes the need to have the SymbolBody definition when we're
defining the inline method and makes it a better inline method. That was
the only reason for a lot of header includes here. Removing these and
using forward declarations actually uncovers a bunch of cross-header
dependencies that I've fixed while I'm here, and will allow me to
introduce some *important* inline code into Chunks.h that requires the
definition of ObjectFile.

No functionality changed at this point.

Differential Revision: http://reviews.llvm.org/D10789

llvm-svn: 240982
2015-06-29 18:50:11 +00:00
Rui Ueyama 45044f47d3 COFF: Fix logic to find default entry name or subsystem.
The previous logic to find default entry name or subsystem does not
seem correct (i.e. was not compatible with MSVC linker). Previously,
default entry name was inferred from CRT functions and user-defined
entry functions. Subsystem was inferred from CRT functions.

Default entry name and subsystem are now inferred based on the
following table. Note that we no longer use CRT functions to infer
them.

               Entry name           Subsystem
  main         mainCRTStartup       console
  wmain        wmainCRTStartup      console
  WinMain      WinMainCRTStartup    windows
  wWinMain     wWinMainCRTStartup   windows

llvm-svn: 240922
2015-06-29 01:03:53 +00:00
Rui Ueyama f5313b3498 COFF: Allow mangled symbols as arguments for /export.
Usually dllexported symbols are defined with 'extern "C"',
so identifying them is easy. We can just do hash table lookup
to look up exported symbols.

However, C++ non-member functions are also allowed to be exported,
and they can be specified with unmangled name. So, if /export:foo
is given, we need to look up not only "foo" but also its all
mangled names. In MSVC mangling scheme, that means that we need to
look up any symbol which starts with "?foo@@Y".

In this patch, we scan the entire symbol table to search for
a mangled symbol. The symbol table is a DenseMap, and that doesn't
support table lookup by string prefix. This is of course very
inefficient. But that should be probably OK because the user
should always add 'extern "C"' to dllexported symbols.

llvm-svn: 240919
2015-06-28 22:16:41 +00:00
Chandler Carruth 2eb15fff94 Switch the new COFF linker's symbol table to use a DenseMap of
StringRefs. This uses the LLVM hashing rather than the standard library
and a closed addressed hash table rather than chaining.

This improves the Windows self-link of LLD by 4.4% (averaged over 10
runs, with well under 1% of variance on each).

There is still some room to improve here. Two things I clearly see in
the profile:

1) This is one of the biggest stress tests for the LLVM hashing code. It
   actually consumes something like 3-4% of the link time after the
   change.
2) The way that StringRef keys are handled in the DenseMap interface is
   pretty suboptimal. We pay the price of checking for empty and
   tombstone keys when we could only possibly be looking for a normal
   key. But fixing this requires invasive API changes.

So there is still some headroom here.

Differential Revision: http://reviews.llvm.org/D10684

llvm-svn: 240871
2015-06-27 02:05:40 +00:00
Rui Ueyama 81ba1353ce COFF: Remove dead code.
llvm-svn: 240846
2015-06-26 22:14:41 +00:00
Peter Collingbourne be54955bba COFF: Implement /lldmap flag.
This flag can be used to produce a map file, which is essentially a list
of objects linked into the final output file together with the RVAs of
their symbols. Because our format differs from MSVC's we expose it as a
separate flag.

Differential Revision: http://reviews.llvm.org/D10773

llvm-svn: 240812
2015-06-26 18:58:24 +00:00
Rui Ueyama 88e0f9206b COFF: Fix a bug of __imp_ symbol.
The change I made in r240620 was not correct. If a symbol foo is
defined, and if you use __imp_foo, __imp_foo symbol is automatically
defined as a pointer (not just an alias) to foo.

Now that we need to create a chunk for automatically-created symbols.
I defined LocalImportChunk class for them.

llvm-svn: 240622
2015-06-25 03:31:47 +00:00
Rui Ueyama 0d2e999050 COFF: Make link order compatible with MSVC link.exe.
Previously, we added files in directive sections to the symbol
table as we read the sections, so the link order was depth-first.
That's not compatible with MSVC link.exe nor the old LLD.

This patch is to queue files so that new files are added to the
end of the queue and processed last. Now addFile() doesn't parse
files nor resolve symbols. You need to call run() to process
queued files.

llvm-svn: 240483
2015-06-23 23:56:39 +00:00
Rui Ueyama e3a335076a COFF: Combine add{Object,Archive,Bitcode,Import} functions. NFC.
llvm-svn: 240229
2015-06-20 23:10:05 +00:00
Rui Ueyama 573bf7de9c COFF: Continue reading object files until converge.
In this linker model, adding an undefined symbol may trigger chain
reactions. It may trigger a Lazy symbol to read a new file.
A new file may contain a directive section, which may contain various
command line options.

Previously, we didn't handle chain reactions well. We visited /include'd
symbols only once, so newly-added /include symbols were ignored.
This patch fixes that bug.

Now, the symbol table is versioned; every time the symbol table is
updated, the version number is incremented. We repeat adding undefined
symbols until the version number does not change. It is guaranteed to
converge -- the number of undefined symbol in the system is finite,
and adding the same undefined symbol more than once is basically no-op.

llvm-svn: 240177
2015-06-19 21:12:48 +00:00
Rui Ueyama 23ed96d95f COFF: Rename a function. NFC.
llvm-svn: 240031
2015-06-18 17:29:50 +00:00
Rui Ueyama ae36985af7 COFF: Fix entry point inference bug.
Previously, LLD couldn't find a default entry point if it's
defined by a library.

llvm-svn: 239982
2015-06-18 00:40:33 +00:00
Rafael Espindola 9bd82e9952 Update for llvm api change.
llvm-svn: 239671
2015-06-13 12:50:13 +00:00
Davide Italiano d106ab263a [COFF] Spell the namespace correctly.
llvm-svn: 239641
2015-06-12 21:37:55 +00:00
Rui Ueyama efba7812cc COFF: Split SymbolTable::addCombinedLTOObject. NFC.
llvm-svn: 239418
2015-06-09 17:52:17 +00:00
Rui Ueyama eeae5ddbe2 COFF: Add more log messages.
llvm-svn: 239289
2015-06-08 06:00:10 +00:00