Commit Graph

284 Commits

Author SHA1 Message Date
Martin Storsjo eac1b05f1d [COFF] Support MinGW automatic dllimport of data
Normally, in order to reference exported data symbols from a different
DLL, the declarations need to have the dllimport attribute, in order to
use the __imp_<var> symbol (which contains an address to the actual
variable) instead of the variable itself directly. This isn't an issue
in the same way for functions, since any reference to the function without
the dllimport attribute will end up as a reference to a thunk which loads
the actual target function from the import address table (IAT).

GNU ld, in MinGW environments, supports automatically importing data
symbols from DLLs, even if the references didn't have the appropriate
dllimport attribute. Since the PE/COFF format doesn't support the kind
of relocations that this would require, the MinGW's CRT startup code
has an custom framework of their own for manually fixing the missing
relocations once module is loaded and the target addresses in the IAT
are known.

For this to work, the linker (originall in GNU ld) creates a list of
remaining references needing fixup, which the runtime processes on
startup before handing over control to user code.

While this feature is rather controversial, it's one of the main features
allowing unix style libraries to be used on windows without any extra
porting effort.

Some sort of automatic fixing of data imports is also necessary for the
itanium C++ ABI on windows (as clang implements it right now) for importing
vtable pointers in certain cases, see D43184 for some discussion on that.

The runtime pseudo relocation handler supports 8/16/32/64 bit addresses,
either PC relative references (like IMAGE_REL_*_REL32*) or absolute
references (IMAGE_REL_AMD64_ADDR32, IMAGE_REL_AMD64_ADDR32,
IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a
relocation against the corresponding IAT slot. For the absolute references,
a normal base relocation is created, to update the embedded address
in case the image is loaded at a different address.

The list of runtime pseudo relocations contains the RVA of the
imported symbol (the IAT slot), the RVA of the location the relocation
should be applied to, and a size of the memory location. When the
relocations are fixed at runtime, the difference between the actual
IAT slot value and the IAT slot address is added to the reference,
doing the right thing for both absolute and relative references.

With this patch alone, things work fine for i386 binaries, and mostly
for x86_64 binaries, with feature parity with GNU ld. Despite this,
there are a few gotchas:
- References to data from within code works fine on both x86 architectures,
  since their relocations consist of plain 32 or 64 bit absolute/relative
  references. On ARM and AArch64, references to data doesn't consist of
  a plain 32 or 64 bit embedded address or offset in the code. On ARMNT,
  it's usually a MOVW+MOVT instruction pair represented by a
  IMAGE_REL_ARM_MOV32T relocation, each instruction containing 16 bit of
  the target address), on AArch64, it's usually an ADRP+ADD/LDR/STR
  instruction pair with an even more complex encoding, storing a PC
  relative address (with a range of +/- 4 GB). This could theoretically
  be remedied by extending the runtime pseudo relocation handler with new
  relocation types, to support these instruction encodings. This isn't an
  issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64.
- For x86_64, if references in code are encoded as 32 bit PC relative
  offsets, the runtime relocation will fail if the target turns out to be
  out of range for a 32 bit offset.
- Fixing up the relocations at runtime requires making sections writable
  if necessary, with the VirtualProtect function. In Windows Store/UWP apps,
  this function is forbidden.

These limitations are addressed by a few later patches in lld and
llvm.

Differential Revision: https://reviews.llvm.org/D50917

llvm-svn: 340726
2018-08-27 08:43:31 +00:00
Hans Wennborg bdd8493f2b [COFF] Make the relocation scanning for CFG more discriminating
link.exe ignores REL32 relocations on 32-bit x86, as well as relocations
against non-function symbols such as labels. This makes lld do the same.

Differential Revision: https://reviews.llvm.org/D50430

llvm-svn: 339345
2018-08-09 13:43:22 +00:00
Martin Storsjo 98ff9f845d [COFF] Sort .reloc before all other discardable sections
If a binary is stripped, which can remove discardable sections (except
for the .reloc section, which also is marked as discardable as it isn't
loaded at runtime, only read by the loader), the .reloc section should
be first of them, in order not to create gaps in the image.

Previously, binaries with relocations were broken if they were stripped
by GNU binutils strip. Trying to execute such binaries produces an error
about "xx is not a valid win32 application".

This fixes GNU binutils bug 23348.

Prior to SVN r329370 (which didn't intend to have functional changes),
the code for moving discardable sections to the end didn't clearly
express how other discardable sections should be ordered compared to
.reloc, but the change retained the exact same end result as before.

After SVN r329370, the code (and comments) more clearly indicate that
it tries to make the .reloc section the absolutely last one; this patch
changes that.

This matches how GNU binutils ld sorts .reloc compared to dwarf debug
info sections.

Differential Revision: https://reviews.llvm.org/D49351

Signed-off-by: Martin Storsjö <martin@martin.st>
llvm-svn: 337598
2018-07-20 18:43:35 +00:00
Martin Storsjo a55fc71614 [COFF] Write the debug directory and build id to a separate section for MinGW
For dwarf debug info, an executable normally either contains the debug
info, or it is stripped out. To reduce the storage needed (slightly)
for the debug info kept separately from the released, stripped binaries,
one can choose to only copy the debug data from the original executable
(essentially the reverse of the strip operation), producing a file with
only debug info.

When copying the debug data from an executable with GNU objcopy,
the build id and debug directory need to reside in a separate section,
as this will be kept while the rest of the .rdata section is removed.

Differential Revision: https://reviews.llvm.org/D49352

llvm-svn: 337526
2018-07-20 05:44:34 +00:00
Martin Storsjo c35e4bf7eb [COFF] Don't produce base relocs for discardable sections
Dwarf debug info contains some data that contains absolute addresses.
Since these sections are discardable and aren't loaded at runtime,
there's no point in adding base relocations for them.

This makes sure that after stripping out dwarf debug info, there are no
base relocations that point to nonexistent sections.

Differential Revision: https://reviews.llvm.org/D49350

llvm-svn: 337438
2018-07-19 04:25:22 +00:00
Zachary Turner e2ce2a5c86 [coff] remove_dots from /PDBPATH but not /PDBALTPATH.
This more closely matches the behavior of link.exe, and also
simplifies the code slightly.

llvm-svn: 336882
2018-07-12 03:22:39 +00:00
Zachary Turner bf9abccacd [coff] Remove dots in path pointing to PDB file.
Some Microsoft tools (e.g. new versions of WPA) fail when the
COFF Debug Directory contains a path to the PDB that contains
dots, such as D:\foo\./bar.pdb.  Remove dots before writing this
path.

This fixes pr38126.

llvm-svn: 336873
2018-07-12 00:44:15 +00:00
Martin Storsjo 474be005db [COFF] Store import symbol pointers as pointers to the base class
Future symbol insertions can potentially change the type of these
symbols - keep pointers to the base class to reflect this, and
use dynamic casts to inspect them before using as the subclass
type.

This fixes crashes that were possible before, by touching these
symbols that now are populated as e.g. a DefinedRegular, via
the old pointers with DefinedImportThunk type.

Differential Revision: https://reviews.llvm.org/D48953

llvm-svn: 336652
2018-07-10 10:40:11 +00:00
Martin Storsjo 3a7905b2aa [COFF] Add an LLD specific option -debug:symbtab
With this set, we retain the symbol table, but skip the actual debug
information.

This is meant to be used by the MinGW frontend.

Differential Revision: https://reviews.llvm.org/D48745

llvm-svn: 335946
2018-06-29 06:08:25 +00:00
Bob Haarman c103156c60 lld-link: align sections to 16 bytes if referenced from the gfids table
Summary:
Control flow guard works best when targets it checks are 16-byte aligned.
Microsoft's link.exe helps ensure this by aligning code from sections
that are referenced from the gfids table to 16 bytes when linking with
-guard:cf, even if the original section specifies a smaller alignment.
This change implements that behavior in lld-link.

See https://crbug.com/857012 for more details.

Reviewers: ruiu, hans, thakis, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48690

llvm-svn: 335864
2018-06-28 15:22:40 +00:00
Shoaib Meenai 02c4344262 [COFF] Fix crash when emitting symbol tables with GC
When running with linker GC (`-opt:ref`), defined imported symbols that
are referenced but then dropped by GC end up with their `Location`
member being nullptr, which means `getChunk()` returns nullptr for them
and attempting to call `getChunk()->getOutputSection()` causes a crash
from the nullptr dereference. Check for `getChunk()` being nullptr and
bail out early to avoid the crash.

Differential Revision: https://reviews.llvm.org/D48092

llvm-svn: 334548
2018-06-12 21:19:33 +00:00
Nico Weber d657c25649 lld-link: Implement /INTEGRITYCHECK flag
/INTEGRITYCHECK has the effect of setting
IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY. Fixes PR31066.
https://reviews.llvm.org/D47472

llvm-svn: 333652
2018-05-31 13:43:02 +00:00
Shoaib Meenai 663518d61a [COFF] Unify output section code. NFC
Peter Collingbourne suggested moving the switch to the top of the
function, so that all the code that cares about the output section for a
symbol is in the same place.

Differential Revision: https://reviews.llvm.org/D47497

llvm-svn: 333472
2018-05-29 22:49:56 +00:00
Shoaib Meenai 4e51833611 [COFF] Simplify symbol table output section computation
Rather than using a loop to compare symbol RVAs to the starting RVAs of
sections to determine which section a symbol belongs to, just get the
output section of a symbol directly via its chunk, and bail if the
symbol doesn't have an output section, which avoids having to hardcode
logic for handling dead symbols, CodeView symbols, etc. This was
suggested by Reid Kleckner; thank you.

This also fixes writing out symbol tables in the presence of RVA table
input sections (e.g. .sxdata and .gfids). Such sections aren't written
to the output file directly, so their RVA is 0, and the loop would thus
fail to find an output section for them, resulting in a segfault. Extend
some existing tests to cover this case.

Fixes PR37584.

Differential Revision: https://reviews.llvm.org/D47391

llvm-svn: 333450
2018-05-29 19:07:47 +00:00
Zachary Turner c8dd6ccc8a [COFF] Add /Brepro and /TIMESTAMP options.
Previously we would always write a hash of the binary into the
PE file, for reproducible builds.  This breaks AppCompat, which
is a feature of Windows that relies on the timestamp in the PE
header being set to a real value (or at the very least, a value
that satisfies certain properties).

To address this, we put the old behavior of writing the hash
behind the /Brepro flag, which mimics MSVC linker behavior.  We
also match MSVC default behavior, which is to write an actual
timestamp to the PE header.  Finally, we add the /TIMESTAMP
option (an lld extension) so that the user can specify the exact
value to be used in case he/she manually constructs a value which
is both reproducible and satisfies AppCompat.

Differential Revision: https://reviews.llvm.org/D46966

llvm-svn: 332613
2018-05-17 15:11:01 +00:00
Peter Collingbourne e28faed768 COFF: Don't create unnecessary thunks.
A thunk is only needed if a relocation points to the undecorated
import name.

Differential Revision: https://reviews.llvm.org/D46673

llvm-svn: 332019
2018-05-10 19:01:28 +00:00
Peter Collingbourne 71c7de5b77 COFF: Preserve section type when processing /section flag.
It turns out that we were dropping this before.

Differential Revision: https://reviews.llvm.org/D45802

llvm-svn: 330481
2018-04-20 21:23:16 +00:00
Peter Collingbourne 381b3d8aa3 COFF: Use (name, output characteristics) as a key when grouping input sections into output sections.
This is what link.exe does and lets us avoid needing to worry about
merging output characteristics while adding input sections to output
sections.

With this change we can't process /merge in the same way as before
because sections with different output characteristics can still
be merged into one another. So this change moves the processing of
/merge to just before we assign addresses. In the case where there
are multiple output sections with the same name, link.exe only merges
the first section with the source name into the first section with
the target name, and we do the same.

At the same time I also implemented transitive merging (which means
that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a).

This isn't quite enough though because link.exe has a special case for
.CRT in 32-bit mode: it processes sections whose output characteristics
are DATA | R | W as though the output characteristics were DATA | R
(so that they get merged into things like constructor lists in the
expected way). Chromium has a few such sections, and it turns out
that those sections were causing the problem that resulted in r318699
(merge .xdata into .rdata) being reverted: because of the previous
permission merging semantics, the .CRT sections were causing the entire
.rdata section to become writable, which caused the SEH runtime to
crash because it apparently requires .xdata to be read-only. This
change also implements the same special case.

This should unblock being able to merge .xdata into .rdata by default,
as well as .bss into .data, both of which will be done in followups.

Differential Revision: https://reviews.llvm.org/D45801

llvm-svn: 330479
2018-04-20 21:10:33 +00:00
Peter Collingbourne be084eca5b COFF: Remove OutputSection::getPermissions() and getCharacteristics().
All callers can just access the header directly.

Differential Revision: https://reviews.llvm.org/D45800

llvm-svn: 330367
2018-04-19 21:48:37 +00:00
Peter Collingbourne fa322abee9 COFF: Rename Chunk::getPermissions to getOutputCharacteristics.
In an upcoming change I will need to make a distinction between section
type (code, data, bss) and permissions. The term that I use for both
of these things is "output characteristics".

Differential Revision: https://reviews.llvm.org/D45799

llvm-svn: 330361
2018-04-19 20:03:24 +00:00
Reid Kleckner 8f1a28f190 [COFF] Mark images with no exception handlers for /safeseh
Summary:
DLLs and executables with no exception handlers need to be marked with
IMAGE_DLL_CHARACTERISTICS_NO_SEH, even if they have a load config.

Discovered here when building Chromium with LLD on Windows:
https://crbug.com/833951

Reviewers: ruiu, mstorsjo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45778

llvm-svn: 330300
2018-04-18 22:37:10 +00:00
Peter Collingbourne 94aa62e48a COFF: Implement /pdbaltpath flag.
I needed to revert r330223 because we were embedding an absolute PDB
path in the .rdata section, which ended up being laid out before the
.idata section and affecting its RVAs. This flag will let us control
the embedded path.

Differential Revision: https://reviews.llvm.org/D45747

llvm-svn: 330232
2018-04-17 23:28:38 +00:00
Peter Collingbourne 4902508934 COFF: Process /merge flag as we create output sections.
With this we can merge builtin sections.

Differential Revision: https://reviews.llvm.org/D45350

llvm-svn: 329471
2018-04-07 00:46:55 +00:00
Peter Collingbourne f2c0f39b91 COFF: Create output sections early. NFCI.
With this, all output sections are created in one place. This will make
it simpler to implement merging of builtin sections.

Differential Revision: https://reviews.llvm.org/D45349

llvm-svn: 329370
2018-04-06 03:25:49 +00:00
Peter Collingbourne 05f0bae318 COFF: Sort non-discardable sections at the same time as other sections. NFC.
This makes the sort order a little clearer.

Differential Revision: https://reviews.llvm.org/D45282

llvm-svn: 329227
2018-04-04 20:30:37 +00:00
Hans Wennborg 9a9fc78744 COFF: Layout sections in the same order as link.exe
One place where this seems to matter is to make sure the .rsrc section comes
after .text. The Win32 UpdateResource() function can change the contents of
.rsrc. It will move the sections that come after, but if .text gets moved, the
entry point header will not get updated and the executable breaks. This was
found by a test in Chromium.

Differential Revision: https://reviews.llvm.org/D45260

llvm-svn: 329221
2018-04-04 19:15:55 +00:00
Shoaib Meenai 290f26fefd [COFF] Clarify comment. NFC
Reid pointed out the string table for supporting long section names is a
BFD extension and the comments should reflect that. Explicitly spell out
link.exe's and binutil's behavior around section names and the rationale
for LLD's behavior.

Differential Revision: https://reviews.llvm.org/D42659

llvm-svn: 327736
2018-03-16 20:20:01 +00:00
Peter Collingbourne f1a11f87a0 COFF: Implement string tail merging.
In COFF, duplicate string literals are merged by placing them in a
comdat whose leader symbol name contains a specific prefix followed
by the hash and partial contents of the string literal. This gives
us an easy way to identify sections containing string literals in
the linker: check for leader symbol names with the given prefix.

Any sections that are identified in this way as containing string
literals may be tail merged. We do so using the StringTableBuilder
class, which is also used to tail merge string literals in the ELF
linker. Tail merging is enabled only if ICF is enabled, as this
provides a signal as to whether the user cares about binary size.

Differential Revision: https://reviews.llvm.org/D44504

llvm-svn: 327668
2018-03-15 21:14:02 +00:00
Peter Collingbourne 435b099115 COFF: Move assignment of section RVAs to assignAddresses(). NFCI.
This makes the design a little more similar to the ELF linker and
should allow for features such as ARM range extension thunks to be
implemented more easily.

Differential Revision: https://reviews.llvm.org/D44501

llvm-svn: 327667
2018-03-15 21:13:46 +00:00
Zachary Turner b575f46b6d Resubmit "Write a hash of the executable into the PE timestamp fields."
This fixes the broken tests that were causing failures.  The tests
before were verifying that the time stamp was 0, but now that we
are actually writing a timestamp, I just removed the match against
the timestamp value.

llvm-svn: 327049
2018-03-08 19:33:47 +00:00
Hans Wennborg aee5881a85 [COFF] Make the DOS stub a real DOS program
It only adds a few bytes and is nice for backward compatibility.

Differential Revision: https://reviews.llvm.org/D44018

llvm-svn: 327001
2018-03-08 14:27:28 +00:00
Zachary Turner 0b4af0434b Revert "Write a hash of the executable into the PE timestamp fields."
This is breaking a couple of tests, so I'm reverting temporarily
until I can get everything resolved properly.

llvm-svn: 326943
2018-03-07 21:22:10 +00:00
Zachary Turner 69f3347b56 Write a hash of the executable into the PE timestamp fields.
Windows tools treats the timestamp fields as sort of a build id,
using it to archive executables on a symbol server, as well as
for matching executables to PDBs.  We were writing 0 for these
fields, which would cause symbol servers to break as they are
indexed in the symbol server based on this value.

Although the field is called timestamp, it can really be any
value that is unique per build, so to support reproducible builds
we use a hash of the executable here.

Differential Revision: https://reviews.llvm.org/D43978

llvm-svn: 326920
2018-03-07 18:13:41 +00:00
Rui Ueyama b3107476a4 Remove an unused accessor and simplify the logic a bit. NFC.
llvm-svn: 325445
2018-02-17 20:41:38 +00:00
Reid Kleckner fd52096259 [LLD] Implement /guard:[no]longjmp
Summary:
This protects calls to longjmp from transferring control to arbitrary
program points. Instead, longjmp calls are limited to the set of
registered setjmp return addresses.

This also implements /guard:nolongjmp to allow users to link in object
files that call setjmp that weren't compiled with /guard:cf. In this
case, the linker will approximate the set of address taken functions,
but it will leave longjmp unprotected.

I used the following program to test, compiling it with different -guard
flags:
  $ cl -c t.c -guard:cf
  $ lld-link t.obj -guard:cf

  #include <setjmp.h>
  #include <stdio.h>
  jmp_buf buf;
  void g() {
    printf("before longjmp\n");
    fflush(stdout);
    longjmp(buf, 1);
  }
  void f() {
    if (setjmp(buf)) {
      printf("setjmp returned non-zero\n");
      return;
    }
    g();
  }
  int main() {
    f();
    printf("hello world\n");
  }

In particular, the program aborts when the code is compiled *without*
-guard:cf and linked with -guard:cf. That indicates that longjmps are
protected.

Reviewers: ruiu, inglorion, amccarth

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43217

llvm-svn: 325047
2018-02-13 20:32:53 +00:00
Reid Kleckner af2f7da74c [COFF] Add minimal support for /guard:cf
Summary:
This patch adds some initial support for Windows control flow guard. At
the end of the day, the linker needs to synthesize a table of RVAs very
similar to the structured exception handler table (/safeseh).

Both /safeseh and /guard:cf take sections of symbol table indices
(.sxdata and .gfids$y) and turn them into RVA tables referenced by the
load config struct in the CRT through special symbols.

Reviewers: ruiu, amccarth

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42592

llvm-svn: 324306
2018-02-06 01:58:26 +00:00
Shoaib Meenai 34a1101b06 [COFF] Update comment to reflect link.exe behavior. NFC
In my experimentation with link.exe from both VS 2015 and 2017, it
always produces images with truncated section names. Update the comment
accordingly.

Differential Revision: https://reviews.llvm.org/D42603

llvm-svn: 323598
2018-01-27 18:17:08 +00:00
Rui Ueyama 57175aa1e9 Add the /order option.
With the /order option, you can give an order file. An order file
contains symbol names, one per line, and the linker places comdat
sections in that given order. The option is used often to optimize
an output binary for (in particular, startup) speed by improving
locality.

Differential Revision: https://reviews.llvm.org/D42598

llvm-svn: 323579
2018-01-27 00:34:46 +00:00
Zachary Turner 727f153b6f [coff] Print detailed timing information with /TIME.
The classes used to print and update time information are in
common, so other linkers could use this as well if desired.

Differential Revision: https://reviews.llvm.org/D41915

llvm-svn: 322736
2018-01-17 19:16:26 +00:00
Rui Ueyama 2c95e798a0 [LLD][COFF] Report error when file will exceed Windows maximum image size (4GB)
Patch by Colden Cullen.

Currently, when a large PE (>4 GiB) is to be produced, a crash occurs
because:

1. Calling setOffset with a number greater than UINT32_MAX causes the
   PointerToRawData to overflow

2. When adding the symbol table to the end of the file, the last section's
   offset was used to calculate file size. Because this had overflowed,
   this number was too low, and the file created would not be large enough.
   This lead to the actual crash I saw, which was a buffer overrun.

This change:

1. Adds comment to setOffset, clarifying that overflow can occur, but it's
   somewhat safe because the error will be handled elsewhere

2. Adds file size check after all output data has been created This matches
   the MS link.exe error, which looks prints as: "LINK : fatal error
   LNK1248: image size (10000EFC9) exceeds maximum allowable size
   (FFFFFFFF)"

3. Changes calculate of the symbol table offset to just use the existing
   FileSize. This should match the previous calculations, but doesn't rely
   on the use of a u32 that can overflow.

4. Removes trivial usage of a magic number that bugged me while I was
   debugging the issue

I'm not sure how to add a test for this outside of adding 4GB of object
files to the repo. If there's an easier way, let me know and I'll be
happy to add a test.

Differential Revision: https://reviews.llvm.org/D42010

llvm-svn: 322605
2018-01-17 01:08:02 +00:00
Martin Storsjo a1e9b6e3d2 [COFF] Set the IMAGE_DLL_CHARACTERISTICS_NO_SEH flag automatically
This seems to match how link.exe sets it.

Differential Revision: https://reviews.llvm.org/D41252

llvm-svn: 320860
2017-12-15 20:53:03 +00:00
Martin Storsjo 9603b8e3f5 [COFF] Sort .pdata for arm64
This works for linking the output from the MSVC compiler.
The pdata entries for arm64 seem to be 8 bytes in the same
(or at least similar) form to arm.

Differential Revision: https://reviews.llvm.org/D41160

llvm-svn: 320676
2017-12-14 08:56:29 +00:00
Rui Ueyama bdc5150984 Always evaluate the second argument for CHECK() lazily.
This patch is to rename check CHECK and make it a C macro, so that
we can evaluate the second argument lazily.

Differential Revision: https://reviews.llvm.org/D40915

llvm-svn: 319974
2017-12-06 22:08:17 +00:00
Peter Collingbourne 24ca79c776 COFF: Simplify construction of safe SEH table. NFCI.
Instead of building intermediate sets of exception handlers for each
object file, just create one for the final output file.

Differential Revision: https://reviews.llvm.org/D40581

llvm-svn: 319244
2017-11-28 22:50:53 +00:00
Rui Ueyama 2017d52b54 Move Memory.{h,cpp} to Common.
Differential Revision: https://reviews.llvm.org/D40571

llvm-svn: 319221
2017-11-28 20:39:17 +00:00
Martin Storsjo f2508f46ca [COFF] Interpret a period as a separator for section suffix just like '$'
This allows grouping all sections like ".ctors.12345" into ".ctors".

For MinGW, the numerical values for such ctors are all zero-padded,
so a lexical sort is good enough.

Differential Revision: https://reviews.llvm.org/D40408

llvm-svn: 319151
2017-11-28 08:08:37 +00:00
Peter Collingbourne f874bd67d8 COFF: Emit a COFF symbol table if /debug:dwarf is specified.
This effectively reverts r318548 and r318635 while keeping the
functionality behind the flag and preserving the bug fix from r318548.

Differential Revision: https://reviews.llvm.org/D40264

llvm-svn: 318721
2017-11-21 01:14:14 +00:00
Peter Collingbourne 5e80bdebd2 COFF: Stop emitting a non-standard COFF symbol table into PEs.
Now that our support for PDB emission is reasonably good, there is
no longer a need to emit a COFF symbol table.

Also fix a bug where we would fail to emit a string table for long
section names if /debug was not specified.

Differential Revision: https://reviews.llvm.org/D40189

llvm-svn: 318548
2017-11-17 19:51:20 +00:00
Martin Storsjo 46304e03ec [COFF] Don't write long section names for sections that will be mapped at runtime
Sections that will be mapped at runtime will only have the short
section name available, since the string table it points into isn't
mapped. Therefore prefer truncating those names over writing a
long name that is unavailable at runtime.

This allows libunwind to find the .eh_frame section at runtime even
if the module was built with debug info enabled.

Differential Revision: https://reviews.llvm.org/D40025

llvm-svn: 318391
2017-11-16 12:06:42 +00:00
Bob Haarman fe059c782f [coff] correctly emit safeseh entries for handlers defined in dlls
Summary:
We previously assumed that all SafeSEH handlers are
DefinedRegular symbols. This is not the case for handlers defined in
DLLs. As a result, we were failing to emit entries in the SafeSEH
table for those handlers. This change fixes that.

Fixes PR35324.

Reviewers: rnk, ruiu

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40102

llvm-svn: 318364
2017-11-16 01:22:01 +00:00
Martin Storsjo 61716878ae [COFF] Always include the size of the string table size field
Even if we don't actually write any string table contents, the
4 byte size for the string table will always be written. Make
sure we accommodate for this in the file size. Since this size
is aligned up, this would seldom be an issue in practice.

Differential Revision: https://reviews.llvm.org/D39891

llvm-svn: 318284
2017-11-15 08:18:25 +00:00
Rafael Espindola 0a7d0230fc Try harder to delete the temporary file.
This changes COFF to use the output buffer that is reset by the error
handler.

llvm-svn: 318062
2017-11-13 18:15:22 +00:00
Rafael Espindola 5f903f3848 Update for llvm change.
llvm-svn: 317657
2017-11-08 01:50:34 +00:00
Bob Haarman 6c301b6eb1 [coff] use relative instead of absolute __safe_se_handler_base when present
Summary:
__safe_se_handler_base should be either absolute 0 (when no SafeSEH
table is present), or relative to the image base (when the table is
present). An earlier change inadvertedly made the symbol absolute in
both cases, leading to the SafeSEH table not being locatble at run
time. This change fixes that and updates the safeseh test to check for
the presence of the relocation.

Reviewers: rnk, ruiu

Reviewed By: ruiu

Subscribers: ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D39765

llvm-svn: 317635
2017-11-07 23:24:10 +00:00
Rui Ueyama f483da0038 Rename replaceBody -> replaceSymbol.
llvm-svn: 317383
2017-11-03 22:48:47 +00:00
Rui Ueyama f52496e1e0 Rename SymbolBody -> Symbol
Now that we have only SymbolBody as the symbol class. So, "SymbolBody"
is a bit strange name now. This is a mechanical change generated by

  perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)

nd clang-format-diff.

Differential Revision: https://reviews.llvm.org/D39459

llvm-svn: 317370
2017-11-03 21:21:47 +00:00
Rui Ueyama 616cd99194 [COFF] Merge Symbol and SymbolBody.
llvm-svn: 317007
2017-10-31 16:10:24 +00:00
Rui Ueyama 5ace35cba5 Fix SizeOfImage in the PE header.
IIUC, SizeOfImage is the distance from the end of the last section to
the image base, rounded up to the page size. So the previous code is
wrong.

Should fix https://bugs.llvm.org/show_bug.cgi?id=34949

(It is nice to know that lld is already being used to create Putty
distribution binaries.)

llvm-svn: 316626
2017-10-25 23:00:40 +00:00
Bob Haarman b8a59c8aa5 [lld] unified COFF and ELF error handling on new Common/ErrorHandler
Summary:
The COFF linker and the ELF linker have long had similar but separate
Error.h and Error.cpp files to implement error handling. This change
introduces new error handling code in Common/ErrorHandler.h, changes the
COFF and ELF linkers to use it, and removes the old, separate
implementations.

Reviewers: ruiu

Reviewed By: ruiu

Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits

Differential Revision: https://reviews.llvm.org/D39259

llvm-svn: 316624
2017-10-25 22:28:38 +00:00
Shoaib Meenai 4aa7f8a30f [COFF] Check for sections larger than 4 GiB
Sections are limited to 4 GiB. Error out early if a section exceeds this
size, rather than overflowing the section size and getting confusing
assertion failures/segfaults later.

Differential Revision: https://reviews.llvm.org/D38005

llvm-svn: 313699
2017-09-19 23:58:05 +00:00
Rui Ueyama eef6b2a5c9 Revert r303378: Set IMAGE_DLL_CHARACTERISTICS_NO_BIND.
r303378 was submitted because r303374 (Merge IAT and ILT) made lld's
output incompatible with the Binding feature. Now that r303374 was
reverted, we do not need to keep this change.

Pointed out by pcc.

llvm-svn: 313414
2017-09-15 22:49:13 +00:00
Rui Ueyama cfc2f80df6 Remove {get,set}Align accessor functions and use Alignment member variable instead.
llvm-svn: 313204
2017-09-13 21:54:55 +00:00
Rui Ueyama cbf969eb20 Remove Symtab aliases.
Various classes have `Symtab` member variables even though we have
lld::coff::Symtab variable because previous attempts to make COFF lld's
internal structure resemble to ELF's was incomplete. This patch finishes
that job by removing member variables.

llvm-svn: 311938
2017-08-28 21:51:07 +00:00
Sam Clegg 7dbd1fd73b Update comments: parallel_for_each -> parallelForEach
Also remove unused include of raw_ostream.h

Differential Revision: https://reviews.llvm.org/D37048

llvm-svn: 311587
2017-08-23 19:03:20 +00:00
Zachary Turner 1bc6cb64b1 Fix warning about unused variable.
I'm explicitly ignoring the warning by casting to void instead of
deleting the local assignment, because it's confusing to see a
function that fails when its return value evaluates to true.
But when you see that it's a std::error_code, it makes more sense.

llvm-svn: 310965
2017-08-15 21:46:51 +00:00
Zachary Turner 024323cb12 [LLD COFF/PDB] Incrementally update the build id.
Previously, our algorithm to compute a build id involved hashing the
executable and storing that as the GUID in the CV Debug Record chunk,
and setting the age to 1.

This breaks down in one very obvious case: a user adds some newlines to
a file, rebuilds, but changes nothing else. This causes new line
information and new file checksums to get written to the PDB, meaning
that the debug info is different, but the generated code would be the
same, so we would write the same build over again with an age of 1.

Anyone using a symbol cache would have a problem now, because the
debugger would open the executable, look at the age and guid, find a
matching PDB in the symbol cache and then load it. It would never copy
the new PDB to the symbol cache.

This patch implements the canonical Windows algorithm for updating
a build id, which is to check the existing executable first, and
re-use an existing GUID while bumping the age if it already
exists.

Differential Revision: https://reviews.llvm.org/D36758

llvm-svn: 310961
2017-08-15 21:31:41 +00:00
Zachary Turner 4f588a93bf Fix build breakage.
llvm-svn: 310112
2017-08-04 20:07:08 +00:00
Zachary Turner f1ca78c253 [lld] Write the absolute PDB path to the debug directory.
This matches the behavior of MSVC's linker.

Differential Revision: https://reviews.llvm.org/D36334

llvm-svn: 310108
2017-08-04 20:02:55 +00:00
Reid Kleckner 175af4bcc7 [PDB] Fix section contributions
Summary:
PDB section contributions are supposed to use output section indices and
offsets, not input section indices and offsets.

This allows the debugger to look up the index of the module that it
should look up in the modules stream for symbol information. With this
change, windbg can now find line tables, but it still cannot print local
variables.

Fixes PR34048

Reviewers: zturner

Subscribers: hiraditya, ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D36285

llvm-svn: 309987
2017-08-03 21:15:09 +00:00
Reid Kleckner 8d2cbf2e9b [PDB] Improve our PDB OMF debug directory entry
In order to get dbghelp to load our pdb, we have to fill in the
PointerToRawData field as well as the AddressOfRawData field. One is the
file offset and the other is the RVA.

llvm-svn: 309900
2017-08-02 23:19:54 +00:00
Reid Kleckner eacdf04fdd [PDB] Write public symbol records and the publics hash table
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.

Reviewers: zturner, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35871

llvm-svn: 309303
2017-07-27 18:25:59 +00:00
Rui Ueyama acd632d338 Add {Obj,Import,Bitcode}File::Instances to COFF input files.
We did the same thing for ELF in r309152, and I want to maintain
COFF and ELF as close as possible.

llvm-svn: 309239
2017-07-27 00:45:26 +00:00
Rui Ueyama e1b48e099c Rename ObjectFile ObjFile for COFF as well.
llvm-svn: 309228
2017-07-26 23:05:24 +00:00
Zachary Turner 6708e0b45e [lld/pdb] Add some basic linker module symbols.
Differential Revision: https://reviews.llvm.org/D35152

llvm-svn: 307590
2017-07-10 21:01:37 +00:00
Sam Clegg c090962255 Remove unused declarations
Differential Revision: https://reviews.llvm.org/D34852

llvm-svn: 306772
2017-06-30 00:34:35 +00:00
Reid Kleckner a1001b8f38 [COFF] Allow debug info to relocate against discarded symbols
Summary:
In order to do this without switching on the symbol kind multiple times,
I created Defined::getChunkAndOffset and use that instead of
SymbolBody::getRVA in the inner relocation loop.

Now we get the symbol's chunk before switching over relocation types, so
we can test if it has been discarded outside the inner relocation type
switch. This also simplifies application of section relative
relocations. Previously we would switch on symbol kind to compute the
RVA, then the relocation type, and then the symbol kind again to get the
output section so we could subtract that from the symbol RVA. Now we
*always* have an OutputSection, so applying SECREL and SECTION
relocations isn't as much of a special case.

I'm still not quite happy with the cleanliness of this code. I'm not
sure what offsets and bases we should be using during the relocation
processing loop: VA, RVA, or OutputSectionOffset.

Reviewers: ruiu, pcc

Reviewed By: ruiu

Subscribers: majnemer, inglorion, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D34650

llvm-svn: 306566
2017-06-28 17:06:35 +00:00
Reid Kleckner eb8c0f9d51 [COFF] Fix SECREL and SECTION relocations against common symbols
Summary:
They do the obvious thing: provide the section index of .bss and the
offset of the symbol in .bss.

Reviewers: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34628

llvm-svn: 306304
2017-06-26 16:45:36 +00:00
Reid Kleckner 502d4ce2e4 [COFF] Improve synthetic symbol handling
Summary:
The main change is that we can have SECREL and SECTION relocations
against ___safe_se_handler_table, which is important for handling the
debug info in the MSVCRT.

Previously we were using DefinedRelative for __safe_se_handler_table and
__ImageBase, and after we implement CFGuard, we plan to extend it to
handle __guard_fids_table, __guard_longjmp_table, and more.  However,
DefinedRelative is really only suitable for implementing __ImageBase,
because it lacks a Chunk, which you need in order to figure out the
output section index and output section offset when resolving SECREl and
SECTION relocations.

This change renames DefinedRelative to DefinedSynthetic and gives it a
Chunk. One wart is that __ImageBase doesn't have a chunk. It points to
the PE header, effectively. We could split DefinedRelative and
DefinedSynthetic if we think that's cleaner and creates fewer special
cases.

I also added safeseh.s, which checks that we don't emit a safe seh table
entries pointing to garbage collected handlers and that we don't emit a
table at all when there are no handlers.

Reviewers: ruiu

Reviewed By: ruiu

Subscribers: inglorion, pcc, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D34577

llvm-svn: 306293
2017-06-26 15:39:52 +00:00
Reid Kleckner 5a7eca5223 Silence -Wunused-variable warning
llvm-svn: 306135
2017-06-23 18:22:29 +00:00
Reid Kleckner 8456411e3b [COFF] Fix SECTION and SECREL relocation handling for absolute symbols
Summary:
For SECTION relocations against absolute symbols, MSVC emits the largest
output section index plus one. I've implemented that by threading a
global variable through DefinedAbsolute that is filled in by the Writer.
A more library-oriented approach would be to thread the Writer through
Chunk::writeTo and SectionChunk::applyRel*, but Rui seems to prefer
doing it this way.

MSVC rejects SECREL relocations against absolute symbols, but only when
the relocation is in a real output section. When the relocation is in a
CodeView debug info section destined for the PDB, it seems that this
relocation error is suppressed, and absolute symbols become zeros in the
object file. This is easily implemented by checking the input section
from which we're applying relocations.

This should fix errors about __safe_se_handler_table and
__guard_fids_table when linking the CRT and generating a PDB.

Reviewers: ruiu

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D34541

llvm-svn: 306071
2017-06-22 23:33:04 +00:00
Rui Ueyama 28ea8c7ad7 [COFF] Set MajorLinkerVersion to 14 instead of 0.
This works around a strange interaction with Authenticode signatures,
in which a signed PE executable with {Major,Minor}LinkerVersion = 0.0
fails to validate on Windows 7 (but is OK on Windows 10). Setting the
linker version to 14.0 (which is what VS2015 outputs) makes it work
again.

Patch by Simon Tatham <simon.tatham@arm.com>.

llvm-svn: 305929
2017-06-21 16:42:08 +00:00
Rui Ueyama f076b97a80 Improve error messages.
llvm-svn: 305868
2017-06-20 23:11:28 +00:00
Reid Kleckner f5bb738f75 [PDB] Don't emit debug info associated with dead chunks
Summary:
Previously we didn't add debug info chunks to the SparseChunks array, so
they didn't participate in section GC. Now we do.

Reviewers: ruiu

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D34356

llvm-svn: 305811
2017-06-20 17:14:09 +00:00
Reid Kleckner 44cdb10964 [PDB] Start emitting source file and line information
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.

This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.

File checksums and string tables, on the other hand, need to be re-done.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34257

llvm-svn: 305713
2017-06-19 17:21:45 +00:00
Rui Ueyama 236e781011 Use MD5::hash(). NFC.
llvm-svn: 303893
2017-05-25 18:17:43 +00:00
Rui Ueyama 69ae29b1d1 Do not allow delay-importing data symbols.
If you pass /delayload:<dllname> to the COFF linker, it creates thunks
so that DLLs are loaded when they are used for the first time instead of
load-time.

This mechanism do not work for data symbols as there's no way to trap
acccesses to data imported from DLLs. (Technically, I think if we do not
initially map dllimport tables in memory, we could actually trap accesses
and delay-load data symbols, but that's not what Windows do.)

This patch is to report an error when you try to delay-load data symbols.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33106

Differential Revision: https://reviews.llvm.org/D33557

llvm-svn: 303890
2017-05-25 18:03:34 +00:00
Rui Ueyama 0e8521c05a Reduce indentation. NFC.
llvm-svn: 303815
2017-05-24 22:36:11 +00:00
Rui Ueyama 9aa82f76ac Garbage collect dllimported symbols.
This is a different implementation than r303225 (which was reverted
in r303270, re-submitted in r303304 and then re-reverted in r303527).

In the previous patch, I tried to add Live bit to each dllimported
symbol. It turned out that it didn't work with "oldnames.lib" which
contains a lot of weak aliases to dllimported symbols.

The way we handle weak aliases is to check if undefined symbols
can be resolved using weak aliases, and if so, memcpy the Defined
symbols to weak Undefined symbols, so that any references to weak
aliases automatically see defined symbols instead of undefined ones.

This memcpy happens before MarkLive kicks in.

That means we may have multiple copies of dllimported symbols. So
turning on one instance's Live bit is not enough.

This patch moves the Live bit to dllimport file. Since multiple
copies of dllsymbols still point to the same file, we can use it as the
central repository to keep track of liveness.

Differential Revision: https://reviews.llvm.org/D33520

llvm-svn: 303814
2017-05-24 22:30:06 +00:00
Rui Ueyama b6632d9cd1 Revert r303304: Re-submit r303225: Garbage collect dllimported symbols.
This reverts commit r303304 because it looks like the change
introduced a crash bug. At least after that change, LLD with thinlto
crashes when linking Chromium.

llvm-svn: 303527
2017-05-22 06:01:37 +00:00
Rui Ueyama a674943211 Set IMAGE_DLL_CHARACTERISTICS_NO_BIND.
Our output is not compatible with the Binding feature, so make it
explicit that.

Differential Revision: https://reviews.llvm.org/D33336

llvm-svn: 303378
2017-05-18 20:26:58 +00:00
Rui Ueyama 01f93335a0 Use make<> everywhere in COFF to make it consistent with ELF.
We've been using make<> to allocate new objects in ELF. We have
the same function in COFF, but we didn't use it widely due to
negligence. This patch uses the function in COFF to close the gap
between ELF and COFF.

llvm-svn: 303357
2017-05-18 17:03:49 +00:00
Zachary Turner 8a7508970a [COFF] Fix interaction between /DEBUG and /PDB
When /DEBUG is not specified, /PDB should be ignored.  When
/DEBUG is specified, a PDB should be output regardless of
whether or not /PDB is specified.  /PDB just overrides the
default name.

This patch implements this behavior, and adds some tests, while
also removing a dead option /DEBUGPDB which was unused in any
code.

Differential Revision: https://reviews.llvm.org/D33302

llvm-svn: 303352
2017-05-18 15:15:10 +00:00
Rui Ueyama cd41bc8dec Re-submit r303225: Garbage collect dllimported symbols.
This reverts re-submits r303225 which was reverted in r303270 because it
broke the sanitizer-windows bot.

The reason of the failure is that we were writing dead symbols to the
symbol table. I fixed the issue.

llvm-svn: 303304
2017-05-17 21:36:08 +00:00
Hans Wennborg e67c5f6b52 Revert r303225 "Garbage collect dllimported symbols."
and follow-up r303226 "Fix Windows buildbots."

This broke the sanitizer-windows buildbot.

> Previously, the garbage collector (enabled by default or by explicitly
> passing /opt:ref) did not kill dllimported symbols. As a result,
> dllimported symbols could be added to resulting executables' dllimport
> list even if no one was actually using them.
>
> This patch implements dllexported symbol garbage collection. Just like
> COMDAT sections, dllimported symbols now have Live bits to manage their
> liveness, and MarkLive marks reachable dllimported symbols.
>
> Fixes https://bugs.llvm.org/show_bug.cgi?id=32950
>
> Reviewers: pcc
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D33264

llvm-svn: 303270
2017-05-17 16:22:03 +00:00
Rui Ueyama 02df7a6cf1 Garbage collect dllimported symbols.
Summary:
Previously, the garbage collector (enabled by default or by explicitly
passing /opt:ref) did not kill dllimported symbols. As a result,
dllimported symbols could be added to resulting executables' dllimport
list even if no one was actually using them.

This patch implements dllexported symbol garbage collection. Just like
COMDAT sections, dllimported symbols now have Live bits to manage their
liveness, and MarkLive marks reachable dllimported symbols.

Fixes https://bugs.llvm.org/show_bug.cgi?id=32950

Reviewers: pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33264

llvm-svn: 303225
2017-05-17 00:35:50 +00:00
Zachary Turner 3a57fbd6db [Support] Move Parallel algorithms from LLD to LLVM.
Differential Revision: https://reviews.llvm.org/D33024

llvm-svn: 302748
2017-05-11 00:03:52 +00:00
Zachary Turner 092c767745 [Core] Make parallel algorithms match C++ Parallelism TS.
Differential Revision: https://reviews.llvm.org/D33016

llvm-svn: 302613
2017-05-10 01:16:22 +00:00
Saleem Abdulrasool 671029daec COFF: support the /appcontainer flag
The /appcontainer flag indicates that the module may only be used inside
an application container (for isolation).  This has been supported by
link.exe since Windows 8.0.  It sets an additional bit in the PE DLL
Characteristics flag to indicate the behavioural change.

llvm-svn: 299728
2017-04-06 23:07:53 +00:00
Zachary Turner 82a0c97b32 Add a function to MD5 a file's contents.
In doing so, clean up the MD5 interface a little.  Most
existing users only care about the lower 8 bytes of an MD5,
but for some users that care about the upper and lower,
there wasn't a good interface.  Furthermore, consumers
of the MD5 checksum were required to handle endianness
details on their own, so it seems reasonable to abstract
this into a nicer interface that just gives you the right
value.

Differential Revision: https://reviews.llvm.org/D31105

llvm-svn: 298322
2017-03-20 23:33:18 +00:00
Saleem Abdulrasool 0acd6dd6ce COFF: prevent nullptr dereference
If `/debugtypes` is used to omit the codeview information, we would not
have constructed the debug info codeview record which is used to tie the
PDB to the binary.  In such a case, rub out the GUID and Age fields.

llvm-svn: 294279
2017-02-07 04:28:02 +00:00