Commit Graph

242 Commits

Author SHA1 Message Date
Rui Ueyama b0c001c055 COFF: Remove dead code.
llvm-svn: 240682
2015-06-25 20:12:15 +00:00
Rui Ueyama fc510f4cf8 COFF: Devirtualize mark(), markLive() and isCOMDAT().
Only SectionChunk can be dead-stripped. Previously,
all types of chunks implemented these functions,
but their functions were blank.

Likewise, only DefinedRegular and DefinedCOMDAT symbols
can be dead-stripped. markLive() function was implemented
for other symbol types, but they were blank.

I started thinking that the change I made in r240319 was
a mistake. I separated DefinedCOMDAT from DefinedRegular
because I thought that would make the code cleaner, but now
we want to handle them as the same type here. Maybe we
should roll it back.

This change should improve readability a bit as this removes
some dubious uses of reinterpret_cast. Previously, we
assumed that all COMDAT chunks are actually SectionChunks,
which was not very obvious.

llvm-svn: 240675
2015-06-25 19:10:58 +00:00
Rui Ueyama 88e0f9206b COFF: Fix a bug of __imp_ symbol.
The change I made in r240620 was not correct. If a symbol foo is
defined, and if you use __imp_foo, __imp_foo symbol is automatically
defined as a pointer (not just an alias) to foo.

Now that we need to create a chunk for automatically-created symbols.
I defined LocalImportChunk class for them.

llvm-svn: 240622
2015-06-25 03:31:47 +00:00
Rui Ueyama 49560c7a10 COFF: Move code for ICF from Writer.cpp to ICF.cpp.
llvm-svn: 240590
2015-06-24 20:40:03 +00:00
Rui Ueyama ddf71fc370 COFF: Initial implementation of Identical COMDAT Folding.
Identical COMDAT Folding (ICF) is an optimization to reduce binary
size by merging COMDAT sections that contain the same metadata,
actual data and relocations. MSVC link.exe and many other linkers
have this feature. LLD achieves on per with MSVC in terms produced
binary size with this patch.

This technique is pretty effective. For example, LLD's size is
reduced from 64MB to 54MB by enaling this optimization.

The algorithm implemented in this patch is extremely inefficient.
It puts all COMDAT sections into a set to identify duplicates.
Time to self-link with/without ICF are 3.3 and 320 seconds,
respectively. So this option roughly makes LLD 100x slower.
But it's okay as I wanted to achieve correctness first.
LLD is still able to link itself with this optimization.
I'm going to make it more efficient in followup patches.

Note that this optimization is *not* entirely safe. C/C++ require
different functions have different addresses. If your program
relies on that property, your program wouldn't work with ICF.
However, it's not going to be an issue on Windows because MSVC
link.exe turns ICF on by default. As long as your program works
with default settings (or not passing /opt:noicf), your program
would work with LLD too.

llvm-svn: 240519
2015-06-24 04:36:52 +00:00
Rui Ueyama 0d2e999050 COFF: Make link order compatible with MSVC link.exe.
Previously, we added files in directive sections to the symbol
table as we read the sections, so the link order was depth-first.
That's not compatible with MSVC link.exe nor the old LLD.

This patch is to queue files so that new files are added to the
end of the queue and processed last. Now addFile() doesn't parse
files nor resolve symbols. You need to call run() to process
queued files.

llvm-svn: 240483
2015-06-23 23:56:39 +00:00
Rui Ueyama a77336bd5d COFF: Support delay-load import tables.
DLLs are usually resolved at process startup, but you can
delay-load them by passing /delayload option to the linker.

If a /delayload is specified, the linker has to create data
which is similar to regular import table.
One notable difference is that the pointers in a delay-load
import table are originally pointing to thunks that resolves
themselves. Each thunk loads a DLL, resolve its name, and then
overwrites the pointer with the result so that subsequent
function calls directly call a desired function. The linker
has to emit thunks.

llvm-svn: 240250
2015-06-21 22:31:52 +00:00
Rui Ueyama 1a109285c2 COFF: Use short varaible name. NFC.
llvm-svn: 240232
2015-06-21 04:10:54 +00:00
Rui Ueyama 4d769c3a57 COFF: Support exception table.
.pdata section contains a list of triplets of function start address,
function end address and its unwind information. Linkers have to
sort section contents by function start address and set the section
address to the file header (so that runtime is able to find it and
do binary search.)

This change seems to resolve all but one remaining test failures in
check{,-clang,-lld} when building the entire stuff with clang-cl and
lld-link.

llvm-svn: 240231
2015-06-21 04:00:54 +00:00
Rui Ueyama 97dff9ee3a COFF: Support creating DLLs.
DLL files are in the same format as executables but they have export tables.
The format of the export table is described in PE/COFF spec section 5.3.

A new class, EdataContents, takes care of creating chunks for export tables.
What we need to do is to parse command line flags for dllexports, and then
instantiate the class to create chunks. For the writer, export table chunks
are opaque data -- it just add chunks to .edata section.

llvm-svn: 239869
2015-06-17 00:16:33 +00:00
Rui Ueyama 6592ff8c93 COFF: Add miscellaneous boolean flags.
llvm-svn: 239864
2015-06-16 23:13:00 +00:00
Rui Ueyama f3770d3edb COFF: Use ulittle32_t::operator|=. NFC.
llvm-svn: 239717
2015-06-15 03:03:23 +00:00
Rui Ueyama 59e9578f20 COFF: Fix resource table size.
The size field shouldn't include trailing padding.

llvm-svn: 239712
2015-06-15 01:35:56 +00:00
Rui Ueyama 588e832d0a COFF: Support base relocations.
PE/COFF executables/DLLs usually contain data which is called
base relocations. Base relocations are a list of addresses that
need to be fixed by the loader if load-time relocation is needed.

Base relocations are in .reloc section.

We emit one base relocation entry for each IMAGE_REL_AMD64_ADDR64
relocation.

In order to save disk space, base relocations are grouped by page.
Each group is called a block. A block starts with a 32-bit page
address followed by 16-bit offsets in the page. That is more
efficient representation of addresses than just an array of 32-bit
addresses.

llvm-svn: 239710
2015-06-15 01:23:58 +00:00
Rui Ueyama 9a03362a08 COFF: Change const name. NFC.
llvm-svn: 239707
2015-06-14 22:21:29 +00:00
Rui Ueyama 669236fef3 COFF: Set Chunk to OutputSection backreference in addChunk().
When we add a chunk to an OutputSection, we always want to create
a backreference from an OutputSection to a Chunk. To make sure
we always do, do that in addChunk(). NFC.

llvm-svn: 239706
2015-06-14 22:16:47 +00:00
Rui Ueyama 2bf6a12238 COFF: Support Windows resource files.
Resource files are data files containing i18n messages, icon images, etc.
MSVC has a tool to convert a resource file to a regular COFF file so that
you can just link that file to embed resources to an executable.

However, you can directly pass resource files to the linker. If you do that,
the linker invokes the tool automatically. This patch implements that feature.

llvm-svn: 239704
2015-06-14 21:50:50 +00:00
Rui Ueyama f533d3e09d COFF: Avoid callign stable_sort.
MSVC profiler reported that this stable_sort takes 7% time
when self-linking. As a result, createSection was taking 10% time.
Now createSection takes 3%. This small change actually makes
the linker a bit but perceptibly faster.

llvm-svn: 239292
2015-06-08 08:26:28 +00:00
Rui Ueyama e2cbfeae5c COFF: Add /opt:noref option.
This option disables dead-stripping.

llvm-svn: 239243
2015-06-07 03:17:42 +00:00
Rui Ueyama 4a9fbbca9f COFF: Add comments and move main function to the top. NFC.
llvm-svn: 239237
2015-06-06 23:32:08 +00:00
Rui Ueyama cc608e4f35 COFF: Rename writeHeader -> writeHeaderTo.
Chunk has writeTo function which takes uint8_t *Buf.
writeHeaderTo feels more consistent with that because this member
function also takes uint8_t *Buf.

llvm-svn: 239236
2015-06-06 23:19:38 +00:00
Rui Ueyama 929d8c52b1 COFF: Inline a constant that is used only once.
llvm-svn: 239235
2015-06-06 23:19:36 +00:00
Rui Ueyama 55168c9f70 COFF: Add .didat section.
llvm-svn: 239233
2015-06-06 23:07:01 +00:00
Rui Ueyama 458df98869 COFF: Update comments.
llvm-svn: 239232
2015-06-06 22:56:55 +00:00
Rui Ueyama c6ea057d7f COFF: Move .idata constructor from Writer to Chunk.
Previously, half of the constructor for .idata contents was in Chunks.cpp
and the rest was in Writer.cpp. This patch moves the latter to Chunks.cpp.
Now IdataContents class manages everything for .idata section.

llvm-svn: 239230
2015-06-06 22:46:15 +00:00
Rui Ueyama 743afa0736 COFF: Merge Chunk::applyRelocations with Chunk::writeTo.
In this design, Chunk is the only thing that knows how to write
its contents to output file as well as how to apply relocations
there. The writer shouldn't know about the details.

llvm-svn: 239216
2015-06-06 04:07:39 +00:00
Rui Ueyama eb262ce4b6 COFF: /include'd symbols must be preserved.
Not only entry point symbol but also symbols specified by /include
option must be preserved, as they will never be dead-stripped.

http://reviews.llvm.org/D10220

llvm-svn: 239005
2015-06-04 02:12:16 +00:00
Rui Ueyama bda72a4af4 COFF: Change OutputSections' type from vector<unique_ptr<T>> to vector<T*>.
This is mainly for readability. OutputSection objects are still owned
by the writer using SpecificBumpPtrAllocator.

llvm-svn: 238936
2015-06-03 16:44:00 +00:00
Rui Ueyama c2abdd9152 COFF: Use Chunk instead of its derived classes.
I'm adding ordinal-only (nameless) imports to the import table.
The chunk for that type is going to be different from LookupChunk.
Without this change, we cannot add objects of the new type to the
vectors.

llvm-svn: 238779
2015-06-01 21:05:24 +00:00
Rui Ueyama 68216c680d Fix comments.
llvm-svn: 238718
2015-06-01 03:55:02 +00:00
Rui Ueyama 8fd9fb9857 COFF: Define an error category for the linker.
Instead of returning non-categorized errors, return categorized errors.
All uses of make_dynamic_error_code are removed.

Because we don't have error reporting mechanism, I just chose to print out
error messages to stderr, and then return an error object. Not sure if
that's the right thing to do, but at least it seems practical.

http://reviews.llvm.org/D10129

llvm-svn: 238714
2015-06-01 02:58:15 +00:00
Rui Ueyama e00d651071 Use initializer instead of memset to zero out.
llvm-svn: 238662
2015-05-30 19:28:58 +00:00
Rui Ueyama bfb4aa1791 COFF: Support long section name.
Section names were truncated to 8 bytes because the section table's
name field is 8 byte long. This patch creates the string table to
store long names.

llvm-svn: 238661
2015-05-30 19:09:50 +00:00
Peter Collingbourne 246ccc5f51 COFF: Move machine type auto-detection to SymbolTable.
The new mechanism is less code, and fixes the case where all inputs
are archives.

Differential Revision: http://reviews.llvm.org/D10136

llvm-svn: 238618
2015-05-29 21:47:36 +00:00
Rui Ueyama 15cc47ee81 COFF: Add /subsystem option.
llvm-svn: 238571
2015-05-29 16:34:31 +00:00
Rui Ueyama b9dcdb5fc9 COFF: Add /version option.
llvm-svn: 238570
2015-05-29 16:28:29 +00:00
Rui Ueyama c377e9aefe COFF: Add /heap option.
llvm-svn: 238569
2015-05-29 16:23:40 +00:00
Rui Ueyama b41b7e5a69 Add /stack option.
llvm-svn: 238568
2015-05-29 16:21:11 +00:00
Rui Ueyama 3d3e6fba6e COFF: Add /machine option.
llvm-svn: 238564
2015-05-29 16:06:00 +00:00
Rui Ueyama 322b2c413d COFF: Return an error_code directly.
llvm-svn: 238486
2015-05-28 20:39:29 +00:00
Rui Ueyama d6fefba447 COFF: Teach Chunk to write to a mmap'ed output file.
Previously Writer directly handles writes to a file.
Chunks needed to give Writer a continuous chunk of memory.
That was inefficent if you construct data in chunks because
it would require two memory copies (one to construct a chunk
and the other is to write that to a file).

This patch teaches chunk to write directly to a file.
From readability point of view, this is also good because
you no longer have to call hasData() before calling getData().

llvm-svn: 238464
2015-05-28 19:45:43 +00:00
Rui Ueyama 411c636081 COFF: Add a new PE/COFF port.
This is an initial patch for a section-based COFF linker.

The patch has 2300 lines of code including comments and blank lines.
Before diving into details, you want to start from reading README
because it should give you an overview of the design.

All important things are written in the README file, so I write
summary here.

- The linker is already able to self-link on Windows.

- It's significantly faster than the existing implementation.
  The existing one takes 5 seconds to link LLD on my machine,
  while the new one only takes 1.2 seconds, even though the new
  one is not multi-threaded yet. (And a proof-of-concept multi-
  threaded version was able to link it in 0.5 seconds.)

- It uses much less memory (250MB vs. 2GB virtual memory space
  to self-host).

- IMHO the new code is much simpler and easier to read than
  the existing PE/COFF port.

http://reviews.llvm.org/D10036

llvm-svn: 238458
2015-05-28 19:09:30 +00:00