Commit Graph

228 Commits

Author SHA1 Message Date
Zachary Turner bf9abccacd [coff] Remove dots in path pointing to PDB file.
Some Microsoft tools (e.g. new versions of WPA) fail when the
COFF Debug Directory contains a path to the PDB that contains
dots, such as D:\foo\./bar.pdb.  Remove dots before writing this
path.

This fixes pr38126.

llvm-svn: 336873
2018-07-12 00:44:15 +00:00
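The dot removal described in bf9abccacd can be illustrated with a small standalone sketch. The helper name `removeDotComponents` is hypothetical and is not taken from LLD; it assumes that both `\` and `/` act as separators and that only `.` components (not `..`) need to be dropped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not LLD's code): drops "." components from a
// Windows-style path. Both '\' and '/' are accepted as separators, and the
// result is rejoined with '\'. Empty components are kept so that a leading
// "\\server" prefix survives unchanged.
std::string removeDotComponents(const std::string &path) {
  std::vector<std::string> kept;
  std::string current;
  for (char c : path) {
    if (c == '\\' || c == '/') {
      if (current != ".")
        kept.push_back(current);
      current.clear();
    } else {
      current += c;
    }
  }
  if (current != ".")
    kept.push_back(current);

  std::string result;
  for (std::size_t i = 0; i < kept.size(); ++i) {
    if (i != 0)
      result += '\\';
    result += kept[i];
  }
  return result;
}
```

With the path from the commit message, `removeDotComponents("D:\\foo\\./bar.pdb")` yields `D:\foo\bar.pdb`.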
Martin Storsjo 474be005db [COFF] Store import symbol pointers as pointers to the base class
Future symbol insertions can potentially change the type of these
symbols - keep pointers to the base class to reflect this, and
use dynamic casts to inspect them before using as the subclass
type.

This fixes crashes that were possible before, by touching these
symbols that now are populated as e.g. a DefinedRegular, via
the old pointers with DefinedImportThunk type.

Differential Revision: https://reviews.llvm.org/D48953

llvm-svn: 336652
2018-07-10 10:40:11 +00:00
Martin Storsjo 3a7905b2aa [COFF] Add an LLD specific option -debug:symtab
With this set, we retain the symbol table, but skip the actual debug
information.

This is meant to be used by the MinGW frontend.

Differential Revision: https://reviews.llvm.org/D48745

llvm-svn: 335946
2018-06-29 06:08:25 +00:00
Bob Haarman c103156c60 lld-link: align sections to 16 bytes if referenced from the gfids table
Summary:
Control flow guard works best when targets it checks are 16-byte aligned.
Microsoft's link.exe helps ensure this by aligning code from sections
that are referenced from the gfids table to 16 bytes when linking with
-guard:cf, even if the original section specifies a smaller alignment.
This change implements that behavior in lld-link.

See https://crbug.com/857012 for more details.

Reviewers: ruiu, hans, thakis, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48690

llvm-svn: 335864
2018-06-28 15:22:40 +00:00
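The alignment rule from c103156c60 amounts to raising a section's alignment to at least 16 bytes when two conditions hold. The sketch below is only an illustration; the function and parameter names are assumptions, not LLD identifiers.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sketch: a section referenced from the gfids table is aligned
// to at least 16 bytes when linking with -guard:cf. Otherwise the alignment
// requested by the input section is used unchanged.
std::uint32_t effectiveAlignment(std::uint32_t sectionAlignment,
                                 bool referencedFromGfids, bool guardCF) {
  if (guardCF && referencedFromGfids)
    return std::max<std::uint32_t>(sectionAlignment, 16);
  return sectionAlignment;
}
```

A section that already asks for a larger alignment (for example 32) keeps it, since only smaller alignments are raised.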
Shoaib Meenai 02c4344262 [COFF] Fix crash when emitting symbol tables with GC
When running with linker GC (`-opt:ref`), defined imported symbols that
are referenced but then dropped by GC end up with their `Location`
member being nullptr, which means `getChunk()` returns nullptr for them
and attempting to call `getChunk()->getOutputSection()` causes a crash
from the nullptr dereference. Check for `getChunk()` being nullptr and
bail out early to avoid the crash.

Differential Revision: https://reviews.llvm.org/D48092

llvm-svn: 334548
2018-06-12 21:19:33 +00:00
Nico Weber d657c25649 lld-link: Implement /INTEGRITYCHECK flag
/INTEGRITYCHECK has the effect of setting
IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY. Fixes PR31066.
https://reviews.llvm.org/D47472

llvm-svn: 333652
2018-05-31 13:43:02 +00:00
Shoaib Meenai 663518d61a [COFF] Unify output section code. NFC
Peter Collingbourne suggested moving the switch to the top of the
function, so that all the code that cares about the output section for a
symbol is in the same place.

Differential Revision: https://reviews.llvm.org/D47497

llvm-svn: 333472
2018-05-29 22:49:56 +00:00
Shoaib Meenai 4e51833611 [COFF] Simplify symbol table output section computation
Rather than using a loop to compare symbol RVAs to the starting RVAs of
sections to determine which section a symbol belongs to, just get the
output section of a symbol directly via its chunk, and bail if the
symbol doesn't have an output section, which avoids having to hardcode
logic for handling dead symbols, CodeView symbols, etc. This was
suggested by Reid Kleckner; thank you.

This also fixes writing out symbol tables in the presence of RVA table
input sections (e.g. .sxdata and .gfids). Such sections aren't written
to the output file directly, so their RVA is 0, and the loop would thus
fail to find an output section for them, resulting in a segfault. Extend
some existing tests to cover this case.

Fixes PR37584.

Differential Revision: https://reviews.llvm.org/D47391

llvm-svn: 333450
2018-05-29 19:07:47 +00:00
Zachary Turner c8dd6ccc8a [COFF] Add /Brepro and /TIMESTAMP options.
Previously we would always write a hash of the binary into the
PE file, for reproducible builds.  This breaks AppCompat, which
is a feature of Windows that relies on the timestamp in the PE
header being set to a real value (or at the very least, a value
that satisfies certain properties).

To address this, we put the old behavior of writing the hash
behind the /Brepro flag, which mimics MSVC linker behavior.  We
also match MSVC default behavior, which is to write an actual
timestamp to the PE header.  Finally, we add the /TIMESTAMP
option (an lld extension) so that the user can specify the exact
value to be used in case he/she manually constructs a value which
is both reproducible and satisfies AppCompat.

Differential Revision: https://reviews.llvm.org/D46966

llvm-svn: 332613
2018-05-17 15:11:01 +00:00
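The three timestamp behaviors described in c8dd6ccc8a can be summarized in a short sketch. The struct and function names are hypothetical, and the precedence between /Brepro and /TIMESTAMP when both are given is an assumption made here for illustration; the commit message does not specify it.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical configuration, not LLD's actual Config struct.
struct TimestampConfig {
  bool repro = false;                         // /Brepro was given
  std::optional<std::uint32_t> explicitValue; // /TIMESTAMP:<value> was given
};

// Picks the value to store in the PE header timestamp field.
// Assumption: /Brepro takes precedence over /TIMESTAMP in this sketch.
std::uint32_t chooseTimestamp(const TimestampConfig &cfg,
                              std::uint32_t imageHash, std::uint32_t now) {
  if (cfg.repro)
    return imageHash;          // reproducible: hash of the binary
  if (cfg.explicitValue)
    return *cfg.explicitValue; // user-supplied value
  return now;                  // default: a real timestamp
}
```

Passing the current time in as a parameter keeps the function deterministic, which makes the three cases easy to test.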
Peter Collingbourne e28faed768 COFF: Don't create unnecessary thunks.
A thunk is only needed if a relocation points to the undecorated
import name.

Differential Revision: https://reviews.llvm.org/D46673

llvm-svn: 332019
2018-05-10 19:01:28 +00:00
Peter Collingbourne 71c7de5b77 COFF: Preserve section type when processing /section flag.
It turns out that we were dropping this before.

Differential Revision: https://reviews.llvm.org/D45802

llvm-svn: 330481
2018-04-20 21:23:16 +00:00
Peter Collingbourne 381b3d8aa3 COFF: Use (name, output characteristics) as a key when grouping input sections into output sections.
This is what link.exe does and lets us avoid needing to worry about
merging output characteristics while adding input sections to output
sections.

With this change we can't process /merge in the same way as before
because sections with different output characteristics can still
be merged into one another. So this change moves the processing of
/merge to just before we assign addresses. In the case where there
are multiple output sections with the same name, link.exe only merges
the first section with the source name into the first section with
the target name, and we do the same.

At the same time I also implemented transitive merging (which means
that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a).

This isn't quite enough though because link.exe has a special case for
.CRT in 32-bit mode: it processes sections whose output characteristics
are DATA | R | W as though the output characteristics were DATA | R
(so that they get merged into things like constructor lists in the
expected way). Chromium has a few such sections, and it turns out
that those sections were causing the problem that resulted in r318699
(merge .xdata into .rdata) being reverted: because of the previous
permission merging semantics, the .CRT sections were causing the entire
.rdata section to become writable, which caused the SEH runtime to
crash because it apparently requires .xdata to be read-only. This
change also implements the same special case.

This should unblock being able to merge .xdata into .rdata by default,
as well as .bss into .data, both of which will be done in followups.

Differential Revision: https://reviews.llvm.org/D45801

llvm-svn: 330479
2018-04-20 21:10:33 +00:00
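The transitive merging mentioned in 381b3d8aa3 (/merge:.c=.b /merge:.b=.a sends both .c and .b to .a) can be sketched as a walk over a name-to-name map. This covers only the name resolution, not the handling of output characteristics, and `resolveMergeTarget` is a hypothetical name rather than an LLD function.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical sketch: follows /merge:from=to mappings until it reaches a
// section name that is not itself merged into anything else. The walk is
// bounded by the number of mappings so that a cyclic set of flags cannot
// loop forever; in that case the last name visited is returned.
std::string resolveMergeTarget(const std::map<std::string, std::string> &merges,
                               std::string name) {
  for (std::size_t i = 0; i <= merges.size(); ++i) {
    auto it = merges.find(name);
    if (it == merges.end())
      return name;
    name = it->second;
  }
  return name;
}
```

With the mappings `.c -> .b` and `.b -> .a`, both `.c` and `.b` resolve to `.a`, and an unmapped name such as `.text` resolves to itself.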
Peter Collingbourne be084eca5b COFF: Remove OutputSection::getPermissions() and getCharacteristics().
All callers can just access the header directly.

Differential Revision: https://reviews.llvm.org/D45800

llvm-svn: 330367
2018-04-19 21:48:37 +00:00
Peter Collingbourne fa322abee9 COFF: Rename Chunk::getPermissions to getOutputCharacteristics.
In an upcoming change I will need to make a distinction between section
type (code, data, bss) and permissions. The term that I use for both
of these things is "output characteristics".

Differential Revision: https://reviews.llvm.org/D45799

llvm-svn: 330361
2018-04-19 20:03:24 +00:00
Reid Kleckner 8f1a28f190 [COFF] Mark images with no exception handlers for /safeseh
Summary:
DLLs and executables with no exception handlers need to be marked with
IMAGE_DLL_CHARACTERISTICS_NO_SEH, even if they have a load config.

Discovered here when building Chromium with LLD on Windows:
https://crbug.com/833951

Reviewers: ruiu, mstorsjo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45778

llvm-svn: 330300
2018-04-18 22:37:10 +00:00
Peter Collingbourne 94aa62e48a COFF: Implement /pdbaltpath flag.
I needed to revert r330223 because we were embedding an absolute PDB
path in the .rdata section, which ended up being laid out before the
.idata section and affecting its RVAs. This flag will let us control
the embedded path.

Differential Revision: https://reviews.llvm.org/D45747

llvm-svn: 330232
2018-04-17 23:28:38 +00:00
Peter Collingbourne 4902508934 COFF: Process /merge flag as we create output sections.
With this we can merge builtin sections.

Differential Revision: https://reviews.llvm.org/D45350

llvm-svn: 329471
2018-04-07 00:46:55 +00:00
Peter Collingbourne f2c0f39b91 COFF: Create output sections early. NFCI.
With this, all output sections are created in one place. This will make
it simpler to implement merging of builtin sections.

Differential Revision: https://reviews.llvm.org/D45349

llvm-svn: 329370
2018-04-06 03:25:49 +00:00
Peter Collingbourne 05f0bae318 COFF: Sort non-discardable sections at the same time as other sections. NFC.
This makes the sort order a little clearer.

Differential Revision: https://reviews.llvm.org/D45282

llvm-svn: 329227
2018-04-04 20:30:37 +00:00
Hans Wennborg 9a9fc78744 COFF: Layout sections in the same order as link.exe
One place where this seems to matter is to make sure the .rsrc section comes
after .text. The Win32 UpdateResource() function can change the contents of
.rsrc. It will move the sections that come after, but if .text gets moved, the
entry point header will not get updated and the executable breaks. This was
found by a test in Chromium.

Differential Revision: https://reviews.llvm.org/D45260

llvm-svn: 329221
2018-04-04 19:15:55 +00:00
Shoaib Meenai 290f26fefd [COFF] Clarify comment. NFC
Reid pointed out the string table for supporting long section names is a
BFD extension and the comments should reflect that. Explicitly spell out
link.exe's and binutils' behavior around section names and the rationale
for LLD's behavior.

Differential Revision: https://reviews.llvm.org/D42659

llvm-svn: 327736
2018-03-16 20:20:01 +00:00
Peter Collingbourne f1a11f87a0 COFF: Implement string tail merging.
In COFF, duplicate string literals are merged by placing them in a
comdat whose leader symbol name contains a specific prefix followed
by the hash and partial contents of the string literal. This gives
us an easy way to identify sections containing string literals in
the linker: check for leader symbol names with the given prefix.

Any sections that are identified in this way as containing string
literals may be tail merged. We do so using the StringTableBuilder
class, which is also used to tail merge string literals in the ELF
linker. Tail merging is enabled only if ICF is enabled, as this
provides a signal as to whether the user cares about binary size.

Differential Revision: https://reviews.llvm.org/D44504

llvm-svn: 327668
2018-03-15 21:14:02 +00:00
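The idea behind tail merging in f1a11f87a0 is that a string which is a suffix of another can share its bytes, including the terminating NUL. The sketch below is a deliberately simple quadratic version for illustration and is not how StringTableBuilder is implemented; the type and function names are hypothetical, and input strings are assumed to contain no embedded NUL bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: one blob plus the offset of each input string.
struct TailMergedTable {
  std::string data;                 // NUL-terminated strings, tails shared
  std::vector<std::size_t> offsets; // offsets[i] locates input i in data
};

// Processes the longest strings first. Any place where "s\0" already occurs
// in the blob is a valid location for s, so s is appended only when no such
// occurrence exists.
TailMergedTable tailMerge(const std::vector<std::string> &strings) {
  std::vector<std::size_t> order(strings.size());
  for (std::size_t i = 0; i < order.size(); ++i)
    order[i] = i;
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return strings[a].size() > strings[b].size();
                   });

  TailMergedTable table;
  table.offsets.resize(strings.size());
  for (std::size_t idx : order) {
    std::string needle = strings[idx];
    needle.push_back('\0');
    std::size_t pos = table.data.find(needle);
    if (pos == std::string::npos) {
      pos = table.data.size();
      table.data += needle;
    }
    table.offsets[idx] = pos;
  }
  return table;
}
```

For the inputs "world", "hello world" and "ld", only "hello world" is stored, and the other two point into its tail.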
Peter Collingbourne 435b099115 COFF: Move assignment of section RVAs to assignAddresses(). NFCI.
This makes the design a little more similar to the ELF linker and
should allow for features such as ARM range extension thunks to be
implemented more easily.

Differential Revision: https://reviews.llvm.org/D44501

llvm-svn: 327667
2018-03-15 21:13:46 +00:00
Zachary Turner b575f46b6d Resubmit "Write a hash of the executable into the PE timestamp fields."
This fixes the broken tests that were causing failures.  The tests
before were verifying that the time stamp was 0, but now that we
are actually writing a timestamp, I just removed the match against
the timestamp value.

llvm-svn: 327049
2018-03-08 19:33:47 +00:00
Hans Wennborg aee5881a85 [COFF] Make the DOS stub a real DOS program
It only adds a few bytes and is nice for backward compatibility.

Differential Revision: https://reviews.llvm.org/D44018

llvm-svn: 327001
2018-03-08 14:27:28 +00:00
Zachary Turner 0b4af0434b Revert "Write a hash of the executable into the PE timestamp fields."
This is breaking a couple of tests, so I'm reverting temporarily
until I can get everything resolved properly.

llvm-svn: 326943
2018-03-07 21:22:10 +00:00
Zachary Turner 69f3347b56 Write a hash of the executable into the PE timestamp fields.
Windows tools treat the timestamp fields as sort of a build id,
using it to archive executables on a symbol server, as well as
for matching executables to PDBs.  We were writing 0 for these
fields, which would cause symbol servers to break as they are
indexed in the symbol server based on this value.

Although the field is called timestamp, it can really be any
value that is unique per build, so to support reproducible builds
we use a hash of the executable here.

Differential Revision: https://reviews.llvm.org/D43978

llvm-svn: 326920
2018-03-07 18:13:41 +00:00
Rui Ueyama b3107476a4 Remove an unused accessor and simplify the logic a bit. NFC.
llvm-svn: 325445
2018-02-17 20:41:38 +00:00
Reid Kleckner fd52096259 [LLD] Implement /guard:[no]longjmp
Summary:
This protects calls to longjmp from transferring control to arbitrary
program points. Instead, longjmp calls are limited to the set of
registered setjmp return addresses.

This also implements /guard:nolongjmp to allow users to link in object
files that call setjmp that weren't compiled with /guard:cf. In this
case, the linker will approximate the set of address taken functions,
but it will leave longjmp unprotected.

I used the following program to test, compiling it with different -guard
flags:
  $ cl -c t.c -guard:cf
  $ lld-link t.obj -guard:cf

  #include <setjmp.h>
  #include <stdio.h>
  jmp_buf buf;
  void g() {
    printf("before longjmp\n");
    fflush(stdout);
    longjmp(buf, 1);
  }
  void f() {
    if (setjmp(buf)) {
      printf("setjmp returned non-zero\n");
      return;
    }
    g();
  }
  int main() {
    f();
    printf("hello world\n");
  }

In particular, the program aborts when the code is compiled *without*
-guard:cf and linked with -guard:cf. That indicates that longjmps are
protected.

Reviewers: ruiu, inglorion, amccarth

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43217

llvm-svn: 325047
2018-02-13 20:32:53 +00:00
Reid Kleckner af2f7da74c [COFF] Add minimal support for /guard:cf
Summary:
This patch adds some initial support for Windows control flow guard. At
the end of the day, the linker needs to synthesize a table of RVAs very
similar to the structured exception handler table (/safeseh).

Both /safeseh and /guard:cf take sections of symbol table indices
(.sxdata and .gfids$y) and turn them into RVA tables referenced by the
load config struct in the CRT through special symbols.

Reviewers: ruiu, amccarth

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42592

llvm-svn: 324306
2018-02-06 01:58:26 +00:00
Shoaib Meenai 34a1101b06 [COFF] Update comment to reflect link.exe behavior. NFC
In my experimentation with link.exe from both VS 2015 and 2017, it
always produces images with truncated section names. Update the comment
accordingly.

Differential Revision: https://reviews.llvm.org/D42603

llvm-svn: 323598
2018-01-27 18:17:08 +00:00
Rui Ueyama 57175aa1e9 Add the /order option.
With the /order option, you can give an order file. An order file
contains symbol names, one per line, and the linker places comdat
sections in that given order. The option is used often to optimize
an output binary for (in particular, startup) speed by improving
locality.

Differential Revision: https://reviews.llvm.org/D42598

llvm-svn: 323579
2018-01-27 00:34:46 +00:00
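The effect of an order file as described in 57175aa1e9 can be sketched as a stable sort keyed on each symbol's position in the file. Here a comdat section is represented only by its symbol name, and `applyOrder` is a hypothetical name. Placing unlisted sections after the listed ones, in their original relative order, is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: reorders sections (identified by symbol name) so that
// names listed in the order file come first, in the order given.
std::vector<std::string> applyOrder(std::vector<std::string> sections,
                                    const std::vector<std::string> &orderFile) {
  std::unordered_map<std::string, std::size_t> rank;
  for (std::size_t i = 0; i < orderFile.size(); ++i)
    rank.emplace(orderFile[i], i); // the first occurrence of a name wins

  const std::size_t unlisted = orderFile.size();
  auto rankOf = [&](const std::string &name) {
    auto it = rank.find(name);
    return it == rank.end() ? unlisted : it->second;
  };

  // stable_sort keeps the original relative order of unlisted sections.
  std::stable_sort(sections.begin(), sections.end(),
                   [&](const std::string &a, const std::string &b) {
                     return rankOf(a) < rankOf(b);
                   });
  return sections;
}
```

For sections `a, b, c, d` and an order file listing `c` then `a`, the resulting layout is `c, a, b, d`.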
Zachary Turner 727f153b6f [coff] Print detailed timing information with /TIME.
The classes used to print and update time information are in
common, so other linkers could use this as well if desired.

Differential Revision: https://reviews.llvm.org/D41915

llvm-svn: 322736
2018-01-17 19:16:26 +00:00
Rui Ueyama 2c95e798a0 [LLD][COFF] Report error when file will exceed Windows maximum image size (4GB)
Patch by Colden Cullen.

Currently, when a large PE (>4 GiB) is to be produced, a crash occurs
because:

1. Calling setOffset with a number greater than UINT32_MAX causes the
   PointerToRawData to overflow

2. When adding the symbol table to the end of the file, the last section's
   offset was used to calculate file size. Because this had overflowed,
   this number was too low, and the file created would not be large enough.
   This led to the actual crash I saw, which was a buffer overrun.

This change:

1. Adds comment to setOffset, clarifying that overflow can occur, but it's
   somewhat safe because the error will be handled elsewhere

2. Adds file size check after all output data has been created. This matches
   the MS link.exe error, which prints as: "LINK : fatal error
   LNK1248: image size (10000EFC9) exceeds maximum allowable size
   (FFFFFFFF)"

3. Changes calculation of the symbol table offset to just use the existing
   FileSize. This should match the previous calculations, but doesn't rely
   on the use of a u32 that can overflow.

4. Removes trivial usage of a magic number that bugged me while I was
   debugging the issue

I'm not sure how to add a test for this outside of adding 4GB of object
files to the repo. If there's an easier way, let me know and I'll be
happy to add a test.

Differential Revision: https://reviews.llvm.org/D42010

llvm-svn: 322605
2018-01-17 01:08:02 +00:00
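The overflow and the size check from 2c95e798a0 can be shown with the numbers in the link.exe message. This is a standalone sketch; `kMaxImageSize`, `fitsInImage` and `truncateToU32` are hypothetical names, not LLD identifiers.

```cpp
#include <cstdint>
#include <limits>

// Largest file size that a 32-bit offset field can describe (0xFFFFFFFF).
constexpr std::uint64_t kMaxImageSize =
    std::numeric_limits<std::uint32_t>::max();

// Hypothetical check: tracks the size in 64 bits and compares once at the
// end, instead of relying on 32-bit offsets that may already have wrapped.
bool fitsInImage(std::uint64_t fileSize) { return fileSize <= kMaxImageSize; }

// Shows what a 32-bit field such as PointerToRawData would hold after the
// value wraps around.
std::uint32_t truncateToU32(std::uint64_t offset) {
  return static_cast<std::uint32_t>(offset);
}
```

For the size 0x10000EFC9 quoted above, `fitsInImage` returns false, while the truncated 32-bit value is 0xEFC9. A file size derived from a wrapped offset like this would therefore come out far too small.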
Martin Storsjo a1e9b6e3d2 [COFF] Set the IMAGE_DLL_CHARACTERISTICS_NO_SEH flag automatically
This seems to match how link.exe sets it.

Differential Revision: https://reviews.llvm.org/D41252

llvm-svn: 320860
2017-12-15 20:53:03 +00:00
Martin Storsjo 9603b8e3f5 [COFF] Sort .pdata for arm64
This works for linking the output from the MSVC compiler.
The pdata entries for arm64 seem to be 8 bytes in the same
(or at least similar) form to arm.

Differential Revision: https://reviews.llvm.org/D41160

llvm-svn: 320676
2017-12-14 08:56:29 +00:00
Rui Ueyama bdc5150984 Always evaluate the second argument for CHECK() lazily.
This patch renames check to CHECK and makes it a C macro, so that
we can evaluate the second argument lazily.

Differential Revision: https://reviews.llvm.org/D40915

llvm-svn: 319974
2017-12-06 22:08:17 +00:00
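The lazy evaluation that bdc5150984 describes relies on a macro wrapping its second argument in a lambda, so the message is built only on failure. The sketch below uses `std::optional` and an exception as stand-ins for LLD's error types and error reporting. `checkLazy` is a hypothetical helper and the macro body is illustrative, not LLD's definition.

```cpp
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the contained value, or throws with a message
// that is only constructed when the value is missing.
template <typename T>
T checkLazy(std::optional<T> value,
            const std::function<std::string()> &message) {
  if (!value)
    throw std::runtime_error(message());
  return *value;
}

// The macro defers evaluation of `msg` by capturing it in a lambda. A plain
// function taking a std::string would evaluate `msg` on every call.
#define CHECK(expr, msg) checkLazy((expr), [&] { return std::string(msg); })
```

A call such as `CHECK(lookup(name), "cannot find " + name)` (with a hypothetical `lookup` returning `std::optional`) performs the string concatenation only if the lookup fails.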
Peter Collingbourne 24ca79c776 COFF: Simplify construction of safe SEH table. NFCI.
Instead of building intermediate sets of exception handlers for each
object file, just create one for the final output file.

Differential Revision: https://reviews.llvm.org/D40581

llvm-svn: 319244
2017-11-28 22:50:53 +00:00
Rui Ueyama 2017d52b54 Move Memory.{h,cpp} to Common.
Differential Revision: https://reviews.llvm.org/D40571

llvm-svn: 319221
2017-11-28 20:39:17 +00:00
Martin Storsjo f2508f46ca [COFF] Interpret a period as a separator for section suffix just like '$'
This allows grouping all sections like ".ctors.12345" into ".ctors".

For MinGW, the numerical values for such ctors are all zero-padded,
so a lexical sort is good enough.

Differential Revision: https://reviews.llvm.org/D40408

llvm-svn: 319151
2017-11-28 08:08:37 +00:00
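The grouping rule from f2508f46ca can be sketched as cutting a section name at the first `$` or `.` after its first character. The function name `outputSectionName` is an assumption for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: maps an input section name to its output section name
// by dropping everything from the first '$' or '.' separator onward. The
// search starts at index 1 so that the leading '.' of names like ".ctors" is
// not treated as a separator.
std::string outputSectionName(const std::string &name) {
  std::size_t pos = name.find_first_of("$.", 1);
  return name.substr(0, pos); // pos == npos keeps the whole name
}
```

Under this rule `.ctors.12345` and `.text$mn` map to `.ctors` and `.text`, while a name without a suffix such as `.rdata` is returned unchanged.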
Peter Collingbourne f874bd67d8 COFF: Emit a COFF symbol table if /debug:dwarf is specified.
This effectively reverts r318548 and r318635 while keeping the
functionality behind the flag and preserving the bug fix from r318548.

Differential Revision: https://reviews.llvm.org/D40264

llvm-svn: 318721
2017-11-21 01:14:14 +00:00
Peter Collingbourne 5e80bdebd2 COFF: Stop emitting a non-standard COFF symbol table into PEs.
Now that our support for PDB emission is reasonably good, there is
no longer a need to emit a COFF symbol table.

Also fix a bug where we would fail to emit a string table for long
section names if /debug was not specified.

Differential Revision: https://reviews.llvm.org/D40189

llvm-svn: 318548
2017-11-17 19:51:20 +00:00
Martin Storsjo 46304e03ec [COFF] Don't write long section names for sections that will be mapped at runtime
Sections that will be mapped at runtime will only have the short
section name available, since the string table it points into isn't
mapped. Therefore prefer truncating those names over writing a
long name that is unavailable at runtime.

This allows libunwind to find the .eh_frame section at runtime even
if the module was built with debug info enabled.

Differential Revision: https://reviews.llvm.org/D40025

llvm-svn: 318391
2017-11-16 12:06:42 +00:00
Bob Haarman fe059c782f [coff] correctly emit safeseh entries for handlers defined in dlls
Summary:
We previously assumed that all SafeSEH handlers are
DefinedRegular symbols. This is not the case for handlers defined in
DLLs. As a result, we were failing to emit entries in the SafeSEH
table for those handlers. This change fixes that.

Fixes PR35324.

Reviewers: rnk, ruiu

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40102

llvm-svn: 318364
2017-11-16 01:22:01 +00:00
Martin Storsjo 61716878ae [COFF] Always include the size of the string table size field
Even if we don't actually write any string table contents, the
4 byte size for the string table will always be written. Make
sure we accommodate for this in the file size. Since this size
is aligned up, this would seldom be an issue in practice.

Differential Revision: https://reviews.llvm.org/D39891

llvm-svn: 318284
2017-11-15 08:18:25 +00:00
Rafael Espindola 0a7d0230fc Try harder to delete the temporary file.
This changes COFF to use the output buffer that is reset by the error
handler.

llvm-svn: 318062
2017-11-13 18:15:22 +00:00
Rafael Espindola 5f903f3848 Update for llvm change.
llvm-svn: 317657
2017-11-08 01:50:34 +00:00
Bob Haarman 6c301b6eb1 [coff] use relative instead of absolute __safe_se_handler_base when present
Summary:
__safe_se_handler_base should be either absolute 0 (when no SafeSEH
table is present), or relative to the image base (when the table is
present). An earlier change inadvertently made the symbol absolute in
both cases, leading to the SafeSEH table not being locatable at run
time. This change fixes that and updates the safeseh test to check for
the presence of the relocation.

Reviewers: rnk, ruiu

Reviewed By: ruiu

Subscribers: ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D39765

llvm-svn: 317635
2017-11-07 23:24:10 +00:00
Rui Ueyama f483da0038 Rename replaceBody -> replaceSymbol.
llvm-svn: 317383
2017-11-03 22:48:47 +00:00
Rui Ueyama f52496e1e0 Rename SymbolBody -> Symbol
Now that SymbolBody is the only symbol class, "SymbolBody" is a
somewhat strange name. This is a mechanical change generated by

  perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)

and clang-format-diff.

Differential Revision: https://reviews.llvm.org/D39459

llvm-svn: 317370
2017-11-03 21:21:47 +00:00