Commit Graph

74 Commits

Author SHA1 Message Date
Martin Storsjo d2752aa9ec [COFF] Add support for aligncomm directives
These are emitted for comm symbols in object files, when targeting
a GNU environment.

Alternatively, just ignore them since we already align CommonChunk
to the natural size of the content (up to 32 bytes). That would only
trade away the possibility to overalign small symbols, which doesn't
sound like something that might not need to be handled?

Differential Revision: https://reviews.llvm.org/D36304

llvm-svn: 310871
2017-08-14 19:07:27 +00:00
Reid Kleckner eacdf04fdd [PDB] Write public symbol records and the publics hash table
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.

Reviewers: zturner, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35871

llvm-svn: 309303
2017-07-27 18:25:59 +00:00
Rui Ueyama e1b48e099c Rename ObjectFile ObjFile for COFF as well.
llvm-svn: 309228
2017-07-26 23:05:24 +00:00
Reid Kleckner a1001b8f38 [COFF] Allow debug info to relocate against discarded symbols
Summary:
In order to do this without switching on the symbol kind multiple times,
I created Defined::getChunkAndOffset and use that instead of
SymbolBody::getRVA in the inner relocation loop.

Now we get the symbol's chunk before switching over relocation types, so
we can test if it has been discarded outside the inner relocation type
switch. This also simplifies application of section relative
relocations. Previously we would switch on symbol kind to compute the
RVA, then the relocation type, and then the symbol kind again to get the
output section so we could subtract that from the symbol RVA. Now we
*always* have an OutputSection, so applying SECREL and SECTION
relocations isn't as much of a special case.

I'm still not quite happy with the cleanliness of this code. I'm not
sure what offsets and bases we should be using during the relocation
processing loop: VA, RVA, or OutputSectionOffset.

Reviewers: ruiu, pcc

Reviewed By: ruiu

Subscribers: majnemer, inglorion, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D34650

llvm-svn: 306566
2017-06-28 17:06:35 +00:00
Reid Kleckner eb8c0f9d51 [COFF] Fix SECREL and SECTION relocations against common symbols
Summary:
They do the obvious thing: provide the section index of .bss and the
offset of the symbol in .bss.

Reviewers: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34628

llvm-svn: 306304
2017-06-26 16:45:36 +00:00
Reid Kleckner 502d4ce2e4 [COFF] Improve synthetic symbol handling
Summary:
The main change is that we can have SECREL and SECTION relocations
against ___safe_se_handler_table, which is important for handling the
debug info in the MSVCRT.

Previously we were using DefinedRelative for __safe_se_handler_table and
__ImageBase, and after we implement CFGuard, we plan to extend it to
handle __guard_fids_table, __guard_longjmp_table, and more.  However,
DefinedRelative is really only suitable for implementing __ImageBase,
because it lacks a Chunk, which you need in order to figure out the
output section index and output section offset when resolving SECREl and
SECTION relocations.

This change renames DefinedRelative to DefinedSynthetic and gives it a
Chunk. One wart is that __ImageBase doesn't have a chunk. It points to
the PE header, effectively. We could split DefinedRelative and
DefinedSynthetic if we think that's cleaner and creates fewer special
cases.

I also added safeseh.s, which checks that we don't emit a safe seh table
entries pointing to garbage collected handlers and that we don't emit a
table at all when there are no handlers.

Reviewers: ruiu

Reviewed By: ruiu

Subscribers: inglorion, pcc, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D34577

llvm-svn: 306293
2017-06-26 15:39:52 +00:00
Reid Kleckner 8456411e3b [COFF] Fix SECTION and SECREL relocation handling for absolute symbols
Summary:
For SECTION relocations against absolute symbols, MSVC emits the largest
output section index plus one. I've implemented that by threading a
global variable through DefinedAbsolute that is filled in by the Writer.
A more library-oriented approach would be to thread the Writer through
Chunk::writeTo and SectionChunk::applyRel*, but Rui seems to prefer
doing it this way.

MSVC rejects SECREL relocations against absolute symbols, but only when
the relocation is in a real output section. When the relocation is in a
CodeView debug info section destined for the PDB, it seems that this
relocation error is suppressed, and absolute symbols become zeros in the
object file. This is easily implemented by checking the input section
from which we're applying relocations.

This should fix errors about __safe_se_handler_table and
__guard_fids_table when linking the CRT and generating a PDB.

Reviewers: ruiu

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D34541

llvm-svn: 306071
2017-06-22 23:33:04 +00:00
Rui Ueyama 9aa82f76ac Garbage collect dllimported symbols.
This is a different implementation than r303225 (which was reverted
in r303270, re-submitted in r303304 and then re-reverted in r303527).

In the previous patch, I tried to add Live bit to each dllimported
symbol. It turned out that it didn't work with "oldnames.lib" which
contains a lot of weak aliases to dllimported symbols.

The way we handle weak aliases is to check if undefined symbols
can be resolved using weak aliases, and if so, memcpy the Defined
symbols to weak Undefined symbols, so that any references to weak
aliases automatically see defined symbols instead of undefined ones.

This memcpy happens before MarkLive kicks in.

That means we may have multiple copies of dllimported symbols. So
turning on one instance's Live bit is not enough.

This patch moves the Live bit to dllimport file. Since multiple
copies of dllsymbols still point to the same file, we can use it as the
central repository to keep track of liveness.

Differential Revision: https://reviews.llvm.org/D33520

llvm-svn: 303814
2017-05-24 22:30:06 +00:00
Rui Ueyama b6632d9cd1 Revert r303304: Re-submit r303225: Garbage collect dllimported symbols.
This reverts commit r303304 because it looks like the change
introduced a crash bug. At least after that change, LLD with thinlto
crashes when linking Chromium.

llvm-svn: 303527
2017-05-22 06:01:37 +00:00
Rui Ueyama cd41bc8dec Re-submit r303225: Garbage collect dllimported symbols.
This reverts re-submits r303225 which was reverted in r303270 because it
broke the sanitizer-windows bot.

The reason of the failure is that we were writing dead symbols to the
symbol table. I fixed the issue.

llvm-svn: 303304
2017-05-17 21:36:08 +00:00
Hans Wennborg e67c5f6b52 Revert r303225 "Garbage collect dllimported symbols."
and follow-up r303226 "Fix Windows buildbots."

This broke the sanitizer-windows buildbot.

> Previously, the garbage collector (enabled by default or by explicitly
> passing /opt:ref) did not kill dllimported symbols. As a result,
> dllimported symbols could be added to resulting executables' dllimport
> list even if no one was actually using them.
>
> This patch implements dllexported symbol garbage collection. Just like
> COMDAT sections, dllimported symbols now have Live bits to manage their
> liveness, and MarkLive marks reachable dllimported symbols.
>
> Fixes https://bugs.llvm.org/show_bug.cgi?id=32950
>
> Reviewers: pcc
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D33264

llvm-svn: 303270
2017-05-17 16:22:03 +00:00
Rui Ueyama 02df7a6cf1 Garbage collect dllimported symbols.
Summary:
Previously, the garbage collector (enabled by default or by explicitly
passing /opt:ref) did not kill dllimported symbols. As a result,
dllimported symbols could be added to resulting executables' dllimport
list even if no one was actually using them.

This patch implements dllexported symbol garbage collection. Just like
COMDAT sections, dllimported symbols now have Live bits to manage their
liveness, and MarkLive marks reachable dllimported symbols.

Fixes https://bugs.llvm.org/show_bug.cgi?id=32950

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33264

llvm-svn: 303225
2017-05-17 00:35:50 +00:00
Bob Haarman cd7197fec3 fix nullptr dereference in COFF/Symbol.h
llvm-svn: 294064
2017-02-03 23:05:17 +00:00
Bob Haarman cde5e5b600 refactor COFF linker to use new LTO API
Summary: The COFF linker previously implemented link-time optimization using an API which has now been marked as legacy. This change refactors the COFF linker to use the new LTO API, which is also used by the ELF linker.

Reviewers: pcc, ruiu

Reviewed By: pcc

Subscribers: mgorny, mehdi_amini

Differential Revision: https://reviews.llvm.org/D29059

llvm-svn: 293967
2017-02-02 23:58:14 +00:00
Rui Ueyama ce039266c1 Merge elf::toString and coff::toString.
The two overloaded functions hid each other. This patch merges them.

llvm-svn: 291222
2017-01-06 10:04:08 +00:00
Rui Ueyama 9381eb1045 Remove lld/Support/Memory.h.
I thought for a while about how to remove it, but it looks like we
can just copy the file for now. Of course I'm not happy about that,
but it's just less than 50 lines of code, and we already have
duplicate code in Error.h and some other places. I want to solve
them all at once later.

Differential Revision: https://reviews.llvm.org/D27819

llvm-svn: 290062
2016-12-18 14:06:06 +00:00
Peter Collingbourne 6ee0b4e9f5 COFF: Open and map input files asynchronously on Windows.
Profiling revealed that the majority of lld's execution time on Windows was
spent opening and mapping input files. We can reduce this cost significantly
by performing these operations asynchronously.

This change introduces a queue for all operations on input file data. When
we discover that we need to load a file (for example, when we find a lazy
archive for an undefined symbol, or when we read a linker directive to
load a file from disk), the file operation is launched using a future and
the symbol resolution operation is enqueued.  This implies another change
to symbol resolution semantics, but it seems to be harmless ("ninja All"
in Chromium still succeeds).

To measure the perf impact of this change I linked Chromium's chrome_child.dll
with both thin and fat archives.

Thin archives:

Before (median of 5 runs): 19.50s
After: 10.93s

Fat archives:

Before: 12.00s
After: 9.90s

On Linux I found that doing this asynchronously had a negative effect on
performance, probably because the cost of mapping a file is small enough that
it becomes outweighed by the cost of managing the futures. So on non-Windows
platforms I use the deferred execution strategy.

Differential Revision: https://reviews.llvm.org/D27768

llvm-svn: 289760
2016-12-15 04:02:23 +00:00
Peter Collingbourne e50f4854c2 COFF: Fix memory leaks reported by lsan.
llvm-svn: 289451
2016-12-12 18:42:09 +00:00
Peter Collingbourne 99111287fc COFF: Use a bit in SymbolBody to track which symbols are written to the symbol table.
Using a set here caused us to take about 1 second longer to write the symbol
table when linking chrome_child.dll. With this I consistently get better
performance on Windows with the new symbol table.

Before r289280 and with r289183 reverted (median of 5 runs): 17.65s
After this change: 17.33s

On Linux things look even better:

Before: 10.700480444s
After: 5.735681610s

Differential Revision: https://reviews.llvm.org/D27648

llvm-svn: 289408
2016-12-11 22:15:20 +00:00
Peter Collingbourne 79a5e6b1b7 COFF: New symbol table design.
This ports the ELF linker's symbol table design, introduced in r268178,
to the COFF linker.

Differential Revision: http://reviews.llvm.org/D21166

llvm-svn: 289280
2016-12-09 21:55:24 +00:00
Rui Ueyama bf4ad1d63d COFF: Use make() to create a new file object in createFile.
llvm-svn: 289097
2016-12-08 20:20:22 +00:00
Rui Ueyama aa4f4450af Revert r289084: Start using make() in COFF.
This reverts commit r289084 to appease buildbots.

llvm-svn: 289086
2016-12-08 18:49:04 +00:00
Rui Ueyama 843233b920 Start using make() in COFF.
We don't want ELF and COFF to diverge too much.

llvm-svn: 289085
2016-12-08 18:31:18 +00:00
Rui Ueyama a45d45e231 COFF: Define overloaded toString functions.
Previously, we had different way to stringize SymbolBody and InputFile
to construct error messages. This patch defines overloaded function
toString() so that we don't need to memorize all these different
function names.

With that change, it is now easy to include demangled names in error
messages. Now, if there is a symbol name conflict, we'll print out
both mangled and demangled names.

llvm-svn: 288992
2016-12-07 23:17:02 +00:00
David Majnemer 1706719972 [COFF] Remove an unused function, getFileOff
The function was not used and was not functional: all paths would lead
to report_fatal_error or endless stack recursion.

llvm-svn: 263542
2016-03-15 09:48:18 +00:00
Rui Ueyama de88072a00 COFF: Rename Ptr -> Repl.
This pointer points to a replacement for this chunk. Ptr was not a good name.

llvm-svn: 248579
2015-09-25 16:20:24 +00:00
Rui Ueyama 3cfd2bff1e Remove dead code.
llvm-svn: 248105
2015-09-20 01:19:36 +00:00
Rui Ueyama 1cce300843 COFF: Change Symbol::Body type from atomic pointer to regular pointer.
I made the field an atomic pointer in hope that we would be able to
parallelize the symbol resolver soon, but that's not going to happen
soon. This patch reverts that change for the sake of readability.

llvm-svn: 248104
2015-09-20 00:00:05 +00:00
Rui Ueyama 63dd8766ab COFF: Remove DefinedSymbol::isLive() and markLive(). NFC.
Basically the concept of "liveness" is for sections (or chunks in LLD
terminology) and not for symbols. Symbols are always available or live,
or otherwise it indicates a link failure.

Previously, we had isLive() and markLive() methods for DefinedSymbol.
They are confusing methods. What they actually did is to act as a proxy
to backing section chunks. We can simplify eliminate these methods
and call section chunk's methods directly.

llvm-svn: 247869
2015-09-16 23:55:52 +00:00
Rafael Espindola beee25e484 Make these headers as being c++.
llvm-svn: 245050
2015-08-14 14:12:54 +00:00
Rafael Espindola 5c546a1437 COFF: In chunks, store the offset from the start of the output section. NFC.
This is more convenient than the offset from the start of the file as we
don't have to worry about it changing when we move the output section.

This is a port of r245008 from ELF.

llvm-svn: 245018
2015-08-14 03:30:59 +00:00
Rui Ueyama 234afc4a0e Remove unused `using`.
llvm-svn: 244422
2015-08-09 20:38:58 +00:00
Rafael Espindola b835ae8e4a Port the error functions from ELF to COFF.
This has a few advantages

* Less C++ code (about 300 lines less).
* Less machine code (about 14 KB of text on a linux x86_64 build).
* It is more debugger friendly. Just set a breakpoint on the exit function and
  you get the complete lld stack trace of when the error was found.
* It is a more robust API. The errors are handled early and we don't get a
  std::error_code hot potato being passed around.
* In most cases the error function in a better position to print diagnostics
  (it has more context).

llvm-svn: 244215
2015-08-06 14:58:50 +00:00
Rui Ueyama 8bc43a142b COFF: ARM: Fix relocations to thumb code.
Windows ARM is the thumb ARM environment, and pointers to thumb code
needs to have its LSB set. When we apply relocations, we need to
adjust the LSB if it points to an executable section.

llvm-svn: 243560
2015-07-29 19:25:00 +00:00
Rui Ueyama eb26e1d03c COFF: Fix SECREL and SECTION relocations.
SECREL should sets the 32-bit offset of the target from the beginning
of *target's* output section. Previously, the offset from the beginning
of source's output section was used instead.

SECTION means the target section's index, and not the source section's
index. This patch fixes that issue too.

llvm-svn: 243535
2015-07-29 16:30:45 +00:00
Rui Ueyama 5e706b3ee3 COFF: Use short identifiers. NFC.
llvm-svn: 243229
2015-07-25 21:54:50 +00:00
Rui Ueyama 28df04211c COFF: Split ImportThunkChunk into x86 and x64. NFC.
This change should make it easy to port this code to ARM.

llvm-svn: 243195
2015-07-25 01:16:06 +00:00
Rui Ueyama cd3f99b6c5 COFF: Implement Safe SEH support for x86.
An object file compatible with Safe SEH contains a .sxdata section.
The section contains a list of symbol table indices, each of which
is an exception handler function. A safe SEH-enabled executable
contains a list of exception handler RVAs. So, what the linker has
to do to support Safe SEH is basically to read the .sxdata section,
interpret the contents as a list of symbol indices, unique-fy and
sort their RVAs, and then emit that list to .rdata. This patch
implements that feature.

llvm-svn: 243182
2015-07-24 23:51:14 +00:00
Rui Ueyama 3cb895c930 COFF: Fix __ImageBase symbol relocation.
__ImageBase is a special symbol whose value is the image base address.
Previously, we handled __ImageBase symbol as an absolute symbol.

Absolute symbols point to specific locations in memory and the locations
never change even if an image is base-relocated. That means that we
don't have base relocation entries for absolute symbols.

This is not a case for __ImageBase. If an image is base-relocated, its
base address changes, and __ImageBase needs to be shifted as well.
So we have to have base relocations for __ImageBase. That means that
__ImageBase is not really an absolute symbol but a different kind of
symbol.

In this patch, I introduced a new type of symbol -- DefinedRelative.
DefinedRelative is similar to DefinedAbsolute, but it has not a VA but RVA
and is a subject of base relocation. Currently only __ImageBase is of
the new symbol type.

llvm-svn: 243176
2015-07-24 22:58:44 +00:00
Rui Ueyama cb71c72ccc COFF: Inline Defined::getRVA because it's very hot.
llvm-svn: 242075
2015-07-13 22:01:27 +00:00
David Majnemer 3a62d3d456 COFF: Fill in the type and storage class in the symbol table
We can use the type and storage class from the symbol's original object
file to fill in the linked executable's symbol table.

llvm-svn: 241828
2015-07-09 17:43:50 +00:00
Rui Ueyama 183f53fd22 COFF: Support isa<> for Symbol::Body, whose type is std::atomic<SymbolBody *>.
llvm-svn: 241477
2015-07-06 17:45:22 +00:00
Rui Ueyama c80c03da6c COFF: Use atomic pointers in preparation for parallelizing.
In the new design, mutation of Symbol pointers is the name resolution
operation. This patch makes them atomic pointers so that they can
be mutated by multiple threads safely. I'm going to use atomic
compare-exchange on these pointers.

dyn_cast<> doesn't recognize atomic pointers as pointers,
so we need to call load(). This is unfortunate, but in other places
automatic type conversion works fine.

llvm-svn: 241416
2015-07-05 21:54:42 +00:00
Peter Collingbourne 2612a32ce5 COFF: Numerous fixes for interaction between LTO and weak externals.
We were previously hitting assertion failures in the writer in cases where
a regular object file defined a weak external symbol that was defined by
a bitcode file. Because /export and /entry name mangling were implemented
using weak externals, the same problem affected mangled symbol names in
bitcode files.

The underlying cause of the problem was that weak external symbols were
being resolved before doing LTO, so the symbol table may have contained stale
references to bitcode symbols. The fix here is to defer weak external symbol
resolution until after LTO.

Also implement support for weak external symbols in bitcode files
by modelling them as replaceable DefinedBitcode symbols.

Differential Revision: http://reviews.llvm.org/D10940

llvm-svn: 241391
2015-07-04 05:28:41 +00:00
Peter Collingbourne da2f094bbb COFF: Fix the case where an object defines a weak external and its alias.
This worked before, but only by accident, and only with assertions disabled.
We ended up storing a DefinedRegular symbol in the WeakAlias field,
and never using it as an Undefined.

Differential Revision: http://reviews.llvm.org/D10934

llvm-svn: 241376
2015-07-03 22:03:36 +00:00
Rui Ueyama 65813edfe2 COFF: Make symbols satisfy weak ordering.
Previously, SymbolBody::compare(A, B) didn't satisfy weak ordering.
There was a case that A < B and B < A could have been true.
This is because we just pick LHS if A and B are consisdered equivalent.

This patch is to make symbols being weakly ordered. If A and B are
not tie, one of A < B && B > A or A > B && B < A is true.
This is not an improtant property for a single-threaded environment
because everything is deterministic anyways. However, in a multi-
threaded environment, this property becomes important.

If a symbol is defined or lazy, ties are resolved by its file index.
For simple types that we don't really care about their identities,
symbols are compared by their addresses.

llvm-svn: 241294
2015-07-02 20:33:48 +00:00
Rui Ueyama 3d4c69c04d COFF: Resolve AlternateNames using weak aliases.
Previously, we use SymbolTable::rename to resolve AlternateName symbols.
This patch is to merge that mechanism with weak aliases, so that we
remove that function.

llvm-svn: 241230
2015-07-02 02:38:59 +00:00
Rui Ueyama 0744e87fad COFF: Rename getReplacement -> repl.
The previous name was too long to my taste.

llvm-svn: 241215
2015-07-02 00:21:11 +00:00
Rui Ueyama 4897596728 COFF: Chagne weak alias' type from SymbolBody** to SymbolBody*. NFC.
llvm-svn: 241198
2015-07-01 22:32:23 +00:00
Rui Ueyama 8d3010a1a6 COFF: Change the order of adding symbols to the symbol table.
Previously, the order of adding symbols to the symbol table was simple.
We have a list of all input files. We read each file from beginning of
the list and add all symbols in it to the symbol table.

This patch changes that order. Now all archive files are added to the
symbol table first, and then all the other object files are added.
This shouldn't change the behavior in single-threading, and make room
to parallelize in multi-threading.

In the first step, only lazy symbols are added to the symbol table
because archives contain only Lazy symbols. Member object files
found to be necessary are queued. In the second step, defined and
undefined symbols are added from object files. Adding an undefined
symbol to the symbol table may cause more member files to be added
to the queue. We simply continue reading all object files until the
queue is empty.

Finally, new archive or object files may be added to the queues by
object files' directive sections (which contain new command line
options).

The above process is repeated until we get no new files.

Symbols defined both in object files and in archives can make results
undeterministic. If an archive is read before an object, a new member
file gets linked, while in the other way, no new file would be added.
That is the most popular cause of an undeterministic result or linking
failure as I observed. Separating phases of adding lazy symbols and
undefined symbols makes that deterministic. Adding symbols in each
phase should be parallelizable.

llvm-svn: 241107
2015-06-30 19:35:21 +00:00