Commit Graph

668 Commits

Author SHA1 Message Date
Bob Haarman 2ba4d231d1 [COFF] don't mark lazy symbols as used in regular objects
Summary:
r338767 updated the COFF and wasm linker SymbolTable code to be
strutured more like the ELF linker's. That inadvertedly changed the
behavior of the COFF linker so that lazy symbols would be marked as
used in regular objects. This change adds an overload of the insert()
function, similar to the ELF linker, which does not perform that
marking.

Reviewers: ruiu, rnk, hans

Subscribers: aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51720

llvm-svn: 341585
2018-09-06 20:23:56 +00:00
Nico Weber 13b55bbc2f lld-link: Write an empty "repro" debug directory entry if /Brepro is passed
If the coff timestamp is set to a hash, like lld-link does if /Brepro is
passed, the coff spec suggests that a IMAGE_DEBUG_TYPE_REPRO entry is in the
debug directory. This lets lld-link write such a section.
Fixes PR38429, see bug for details.

Differential Revision: https://reviews.llvm.org/D51652

llvm-svn: 341486
2018-09-05 18:02:43 +00:00
Martin Storsjo a47957ab13 [COFF] Allow exporting all symbols from system libraries specfied with -wholearchive:
When building a shared libc++.dll, it pulls in libc++abi.a statically
with the --wholearchive flag. If such a build is done with
--export-all-symbols, it's reasonable to assume that everything
from that library also should be exported with the same rules as normal
local object files, even though we normally avoid autoexporting things
from libc++abi.a in other cases when linking a DLL (user code).

Differential Revision: https://reviews.llvm.org/D51529

llvm-svn: 341403
2018-09-04 20:56:56 +00:00
Martin Storsjo 802fcb4167 [COFF] When doing automatic dll imports, replace whole .refptr.<var> chunks with __imp_<var>
After fixing up the runtime pseudo relocation, the .refptr.<var>
will be a plain pointer with the same value as the IAT entry itself.
To save a little binary size and reduce the number of runtime pseudo
relocations, redirect references to the IAT entry (via the __imp_<var>
symbol) itself and discard the .refptr.<var> chunk (as long as the
same section chunk doesn't contain anything else than the single
pointer).

As there are now cases for both setting the Live variable to true
and false externally, remove the accessors and setters and just make
the variable public instead.

Differential Revision: https://reviews.llvm.org/D51456

llvm-svn: 341175
2018-08-31 07:45:20 +00:00
Martin Storsjo fcd552999f [COFF] Skip exporting artificial symbols when exporting all symbols
Differential Revision: https://reviews.llvm.org/D51457

llvm-svn: 341017
2018-08-30 05:44:41 +00:00
Martin Storsjo e5120a3bd4 [test] Adjust a test to use CHECK-NEXT instead of CHECK-NOT. NFC.
Since the order and placement of the non-wanted elements might not
be obvious, it feels more straightforward to hardcode the whole list
with -NEXT elements (and checking for the end of the output with
CHECK-EMPTY) instead of adding CHECK-NOT lines at the right places
where the unwanted elements would appear if they erroneously
were to included.

llvm-svn: 341016
2018-08-30 05:44:36 +00:00
Martin Storsjo cfbbb707f5 [COFF] Merge the .ctors, .dtors and .CRT sections into .rdata for MinGW
There's no point in keeping them as separate sections.

This differs from GNU ld, which places .ctors and .dtors content in
.text (implemented by a built-in linker script). But since the content
only is pointers, there's no need to have it executable.

GNU ld also leaves .CRT separate as its own standalone section.

MSVC merges .CRT into .rdata similarly, with a directive embedded in
an object file in msvcrt.lib or libcmt.lib.

Differential Revision: https://reviews.llvm.org/D51414

llvm-svn: 340940
2018-08-29 17:24:10 +00:00
Martin Storsjo eac1b05f1d [COFF] Support MinGW automatic dllimport of data
Normally, in order to reference exported data symbols from a different
DLL, the declarations need to have the dllimport attribute, in order to
use the __imp_<var> symbol (which contains an address to the actual
variable) instead of the variable itself directly. This isn't an issue
in the same way for functions, since any reference to the function without
the dllimport attribute will end up as a reference to a thunk which loads
the actual target function from the import address table (IAT).

GNU ld, in MinGW environments, supports automatically importing data
symbols from DLLs, even if the references didn't have the appropriate
dllimport attribute. Since the PE/COFF format doesn't support the kind
of relocations that this would require, the MinGW's CRT startup code
has an custom framework of their own for manually fixing the missing
relocations once module is loaded and the target addresses in the IAT
are known.

For this to work, the linker (originall in GNU ld) creates a list of
remaining references needing fixup, which the runtime processes on
startup before handing over control to user code.

While this feature is rather controversial, it's one of the main features
allowing unix style libraries to be used on windows without any extra
porting effort.

Some sort of automatic fixing of data imports is also necessary for the
itanium C++ ABI on windows (as clang implements it right now) for importing
vtable pointers in certain cases, see D43184 for some discussion on that.

The runtime pseudo relocation handler supports 8/16/32/64 bit addresses,
either PC relative references (like IMAGE_REL_*_REL32*) or absolute
references (IMAGE_REL_AMD64_ADDR32, IMAGE_REL_AMD64_ADDR32,
IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a
relocation against the corresponding IAT slot. For the absolute references,
a normal base relocation is created, to update the embedded address
in case the image is loaded at a different address.

The list of runtime pseudo relocations contains the RVA of the
imported symbol (the IAT slot), the RVA of the location the relocation
should be applied to, and a size of the memory location. When the
relocations are fixed at runtime, the difference between the actual
IAT slot value and the IAT slot address is added to the reference,
doing the right thing for both absolute and relative references.

With this patch alone, things work fine for i386 binaries, and mostly
for x86_64 binaries, with feature parity with GNU ld. Despite this,
there are a few gotchas:
- References to data from within code works fine on both x86 architectures,
  since their relocations consist of plain 32 or 64 bit absolute/relative
  references. On ARM and AArch64, references to data doesn't consist of
  a plain 32 or 64 bit embedded address or offset in the code. On ARMNT,
  it's usually a MOVW+MOVT instruction pair represented by a
  IMAGE_REL_ARM_MOV32T relocation, each instruction containing 16 bit of
  the target address), on AArch64, it's usually an ADRP+ADD/LDR/STR
  instruction pair with an even more complex encoding, storing a PC
  relative address (with a range of +/- 4 GB). This could theoretically
  be remedied by extending the runtime pseudo relocation handler with new
  relocation types, to support these instruction encodings. This isn't an
  issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64.
- For x86_64, if references in code are encoded as 32 bit PC relative
  offsets, the runtime relocation will fail if the target turns out to be
  out of range for a 32 bit offset.
- Fixing up the relocations at runtime requires making sections writable
  if necessary, with the VirtualProtect function. In Windows Store/UWP apps,
  this function is forbidden.

These limitations are addressed by a few later patches in lld and
llvm.

Differential Revision: https://reviews.llvm.org/D50917

llvm-svn: 340726
2018-08-27 08:43:31 +00:00
Martin Storsjo c4b0061c05 [COFF] Check the instructions in ARM MOV32T relocations
For this relocation, which applies to two consecutive instructions,
it's plausible that the second instruction might not actually be
the right one.

Differential Revision: https://reviews.llvm.org/D50998

llvm-svn: 340715
2018-08-27 06:04:36 +00:00
Peter Collingbourne ab038025a5 COFF: Implement safe ICF on rodata using address-significance tables.
Differential Revision: https://reviews.llvm.org/D51050

llvm-svn: 340555
2018-08-23 17:44:42 +00:00
Nico Weber 7830c6f66f lld-link: Separate 'undefined symbol' errors with just one newline, not two.
newline() in ErrorHandler.cpp already tries to insert newlines between messages
that contain embedded newlines, so getSymbolLocations() shouldn't return a
string that ends in a newline -- else we end up with two newlines between error
messages.

Makes lld-link's output look more like ld.lld output.

https://reviews.llvm.org/D51117

llvm-svn: 340482
2018-08-22 23:45:05 +00:00
Nico Weber 613edd1582 Fix two RUN: lines that were unintentionally spelled "RN:".
https://reviews.llvm.org/D51140

llvm-svn: 340481
2018-08-22 23:44:03 +00:00
Nico Weber ebc27c4873 lld-link: Emit warning if one each of {main,wmain} and {WinMain,wWinMain} exist and no /subsystem: flag is passed.
Similar to link.exe's LNK4031.
https://reviews.llvm.org/D51076

llvm-svn: 340420
2018-08-22 16:47:16 +00:00
Reid Kleckner e8299ded5b Update LLD tests for CodeView dumper change in r339907
llvm-svn: 339913
2018-08-16 18:03:06 +00:00
Reid Kleckner bd5d71229d [codeview] Use push_macro to avoid conflicts instead of a prefix
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.

I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.

I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.

Reviewers: zturner, JDevlieghere

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50851

llvm-svn: 339907
2018-08-16 17:34:31 +00:00
Hans Wennborg bdd8493f2b [COFF] Make the relocation scanning for CFG more discriminating
link.exe ignores REL32 relocations on 32-bit x86, as well as relocations
against non-function symbols such as labels. This makes lld do the same.

Differential Revision: https://reviews.llvm.org/D50430

llvm-svn: 339345
2018-08-09 13:43:22 +00:00
Peter Smith 5c57957281 Add missing REQUIRES x86 to tests.
Add REQUIRES to tests that fail when an x86 backend is not present.

Differential Revision: https://reviews.llvm.org/D50440

llvm-svn: 339253
2018-08-08 14:50:33 +00:00
Nico Weber f4f5b7eea3 lld-link: Take /SUBSYSTEM into account for automatic /ENTRY detection.
If /subsystem:windows is passed, link.exe only looks for WinMain and wWinMain,
and if /subsystem:console is passed it only looks for main and wmain. lld-link
used to look for all 4 in both cases. This patch makes lld-link match
link.exe's behavior.

This requires that the subsystem is known by the time findDefaultEntry() gets
called. findDefaultEntry() is called before the main link loop, so that the
loop can mark the entry point as undefined. That means inferSubsystem() has to
be called above the main loop as well. This in turn means /subsystem: from
.drectve sections only has an effect on entry point inference for obj files
passed to lld-link directly (and not in obj files found later in .lib files).
link.exe seems to ignore /subsystem: for obj files from lib files completely
(while in lld it's ignored only for entry point detection but it still
overrides /subsystem: flags passed on the command line for the value that gets
written in the output file).

Also, if the subsytem isn't needed (e.g. when only writing a /def: lib file and
not writing a coff file), link.exe doesn't complain if the subsystem isn't
known, so both subsystem and entry point handling should be below the early
return lld has for that case.

Fixes PR36523.
https://reviews.llvm.org/D50316

llvm-svn: 339165
2018-08-07 19:10:28 +00:00
Martin Storsjo 21858a9b63 [COFF] Treat .xdata/.pdata$<sym> as implicitly associative to <sym> for MinGW
MinGW configurations don't use associative comdats, as GNU ld doesn't
support that. Instead they produce normal comdats named .text$sym,
.xdata$sym and .pdata$sym.

GNU ld doesn't discard any comdats starting with .xdata or .pdata,
even if --gc-sections is used (while it does discard other unreferenced
comdats), regardless of what symbol name is used after the $ separator.

For LLD, treat any such comdat as implicitly associative to the base
symbol. This requires maintaining a map from symbol name to section
number, but that is only maintained when the MinGW flag has been
enabled.

Differential Revision: https://reviews.llvm.org/D49700

llvm-svn: 339058
2018-08-06 21:26:09 +00:00
Martin Storsjo 214d69975c [COFF] Remove a superfluous warning about aligncomm for non-common symbols
It's not an error if a common symbol (uninitialized data, with alignment
specified via the aligncomm directive) is replaced with a regular
one with initialized data (with alignment specified via the section
chunk).

Differential Revision: https://reviews.llvm.org/D50268

llvm-svn: 339049
2018-08-06 19:49:18 +00:00
Nico Weber d48d5f086f lld-link: Fix subsystem inference for non-console apps on 32-bit, and fix entry point inference on 32-bit with /nodefaultlib
LinkerDriver::inferSubsystem() used to do Symtab->findUnderscore("WinMain"),
but WinMain is stdcall in 32-bit and is hence is called _WinMain@16. Instead,
Symtab->findMangle(mangle("WinMain")) needs to be called.

But since LinkerDriver::inferSubsystem() and LinkerDriver::findDefaultEntry()
both need to call this, introduce a common helper function for this and call it
from both places. (Also call it for "main" for consistency, even though
findUnderscore() is enough for main since that's __cdecl on 32-bit).

This also exposed a bug for /nodefaultlib entrypoint inference: The code here
called findMangle(Sym) instead of findMangle(mangle(Sym)), again doing the
wrong thing on 32-bit. Fix that too.

While here, make Driver::mangle() a static free function.

https://reviews.llvm.org/D50184

llvm-svn: 338877
2018-08-03 12:00:12 +00:00
Chris Jackson 1a721eb3a2 [lld] Make tests calling llvm-ar more robust
Some lit tests that call llvm-ar use the 'r' flag. If the target archive
already exists and is in a corrupt state, this can cause the test to fail. We
have added 'rm -f' calls before the llvm-ar calls to increase the
robustness of the tests.

Differential revision: https://reviews.llvm.org/D49184

llvm-svn: 338705
2018-08-02 11:33:54 +00:00
Nico Weber 11f14904d3 lld-link: Remove /msvclto option
This was useful for LTO bringup in lld-link while lld couldn't write PDBs. Now
that it can, this should no longer be needed. Hopefully the flag is obscure
enough and recent enough, that nobody uses it – but if somebody should use it,
they should be able to just stop passing it and things should continue to work.

https://reviews.llvm.org/D50139

llvm-svn: 338615
2018-08-01 19:00:49 +00:00
Martin Storsjo 6c8cbf6db0 [COFF] Handle comdat sections without leader symbols
Discard them unless they have been associated by other means (yet
uimplemented).

According to MS link.exe, such sections are illegal, but MinGW setups
use them in their take on associative comdats.

This avoids leaving references to the bogus SectionChunk* PendingComdat,
which cannot be dereferenced.

This fixes PR38183.

Differential Revision: https://reviews.llvm.org/D49653

llvm-svn: 338064
2018-07-26 20:14:50 +00:00
Rui Ueyama 7e95d9e362 Fix error messages for bad symbols.
Previously, the error messages didn't contain symbol name because we
didn't read a symbol name for these error messages.

Differential Revision: https://reviews.llvm.org/D49762

llvm-svn: 337863
2018-07-24 22:52:11 +00:00
Reid Kleckner 276d7167d0 [PDB] Write the command line after response file expansion
Summary: Fixes PR38085

Reviewers: ruiu, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49566

llvm-svn: 337628
2018-07-20 22:34:20 +00:00
Martin Storsjo 98ff9f845d [COFF] Sort .reloc before all other discardable sections
If a binary is stripped, which can remove discardable sections (except
for the .reloc section, which also is marked as discardable as it isn't
loaded at runtime, only read by the loader), the .reloc section should
be first of them, in order not to create gaps in the image.

Previously, binaries with relocations were broken if they were stripped
by GNU binutils strip. Trying to execute such binaries produces an error
about "xx is not a valid win32 application".

This fixes GNU binutils bug 23348.

Prior to SVN r329370 (which didn't intend to have functional changes),
the code for moving discardable sections to the end didn't clearly
express how other discardable sections should be ordered compared to
.reloc, but the change retained the exact same end result as before.

After SVN r329370, the code (and comments) more clearly indicate that
it tries to make the .reloc section the absolutely last one; this patch
changes that.

This matches how GNU binutils ld sorts .reloc compared to dwarf debug
info sections.

Differential Revision: https://reviews.llvm.org/D49351

Signed-off-by: Martin Storsjö <martin@martin.st>
llvm-svn: 337598
2018-07-20 18:43:35 +00:00
Martin Storsjo a55fc71614 [COFF] Write the debug directory and build id to a separate section for MinGW
For dwarf debug info, an executable normally either contains the debug
info, or it is stripped out. To reduce the storage needed (slightly)
for the debug info kept separately from the released, stripped binaries,
one can choose to only copy the debug data from the original executable
(essentially the reverse of the strip operation), producing a file with
only debug info.

When copying the debug data from an executable with GNU objcopy,
the build id and debug directory need to reside in a separate section,
as this will be kept while the rest of the .rdata section is removed.

Differential Revision: https://reviews.llvm.org/D49352

llvm-svn: 337526
2018-07-20 05:44:34 +00:00
Takuto Ikuta d855928ec3 [PDB] Add PDBSourcePath flag to support absolutize source file path
This patch changes relative path for source files in obj files to
absolute path in PDB when linking with added flag.

I will make obj file generated by clang-cl independent from build
directory for chromium build. But I don't want to confuse visual studio
debugger or require additional configuration. To attain this goal, I
added flag to convert relative source file path in obj to absolute path
when emitting PDB.

By removing absolute path from obj files, we can share build cache
between chromium developers even when they are doing debug build.
That will make build time faster.

More context:
https://bugs.chromium.org/p/chromium/issues/detail?id=712796
https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/5HXSVX-7fPc

llvm-svn: 337439
2018-07-19 04:56:22 +00:00
Martin Storsjo c35e4bf7eb [COFF] Don't produce base relocs for discardable sections
Dwarf debug info contains some data that contains absolute addresses.
Since these sections are discardable and aren't loaded at runtime,
there's no point in adding base relocations for them.

This makes sure that after stripping out dwarf debug info, there are no
base relocations that point to nonexistent sections.

Differential Revision: https://reviews.llvm.org/D49350

llvm-svn: 337438
2018-07-19 04:25:22 +00:00
Rui Ueyama c93530d873 Look for an entry point function if /nodefaultlib is given.
Summary: Fixes https://bugs.llvm.org/show_bug.cgi?id=38018

Reviewers: thakis

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48990

llvm-svn: 337407
2018-07-18 17:48:14 +00:00
Nico Weber 6c414a31ae attempt to get test/COFF/driver.test passing on sanitizer-x86_64-linux-fast; cf r337092
llvm-svn: 337093
2018-07-14 11:47:23 +00:00
Nico Weber c421fe5ef4 lld-link: Add /lib to Options.td so that it appears in lld-link's help output.
https://reviews.llvm.org/D49319

llvm-svn: 337086
2018-07-14 04:07:51 +00:00
Zachary Turner bf9abccacd [coff] Remove dots in path pointing to PDB file.
Some Microsoft tools (e.g. new versions of WPA) fail when the
COFF Debug Directory contains a path to the PDB that contains
dots, such as D:\foo\./bar.pdb.  Remove dots before writing this
path.

This fixes pr38126.

llvm-svn: 336873
2018-07-12 00:44:15 +00:00
Martin Storsjo 474be005db [COFF] Store import symbol pointers as pointers to the base class
Future symbol insertions can potentially change the type of these
symbols - keep pointers to the base class to reflect this, and
use dynamic casts to inspect them before using as the subclass
type.

This fixes crashes that were possible before, by touching these
symbols that now are populated as e.g. a DefinedRegular, via
the old pointers with DefinedImportThunk type.

Differential Revision: https://reviews.llvm.org/D48953

llvm-svn: 336652
2018-07-10 10:40:11 +00:00
Zachary Turner 648bebdc67 [PDB] One more fix for hasing GSI records.
The reference implementation uses a case-insensitive string
comparison for strings of equal length.  This will cause the
string "tEo" to compare less than "VUo".  However we were using
a case sensitive comparison, which would generate the opposite
outcome.  Switch to a case insensitive comparison.  Also, when
one of the strings contains non-ascii characters, fallback to
a straight memcmp.

The only way to really test this is with a DIA test.  Before this
patch, the test will fail (but succeed if link.exe is used instead
of lld-link).  After the patch, it succeeds even with lld-link.

llvm-svn: 336464
2018-07-06 21:01:42 +00:00
Hans Wennborg b09e004cde Relax filechecks in r336405 tests
They were failing in Chromium's packaging builds with:

  C:\b\rr\tmphqfaff\w\src\third_party\llvm\tools\lld\test\COFF\pdb-globals-dia-vfunc-collision2.test:24:8:
  error: expected string not found in input
  CHECK: func [0x00001060+ 0 - 0x0000106c-12 | sizeof= 12] (FPO) virtual int __cdecl A132()
         ^
  <stdin>:8:11: note: scanning from here
   struct S [sizeof = 8] {
            ^
  <stdin>:9:2: note: possible intended match here
   func [0x00001060+ 0 - 0x0000106c-12 | sizeof= 12] (FPO) virtual int __cdecl S::A132()
   ^

Maybe due to different DIA versions.

llvm-svn: 336424
2018-07-06 08:44:08 +00:00
Hans Wennborg bf7caf4232 dos2unix
llvm-svn: 336423
2018-07-06 08:44:04 +00:00
Zachary Turner 457cc34e48 [llvm-pdbutil] Dump more info about globals.
We add an option to dump the entire global / public symbol record
stream.  Previously we would dump globals or publics, but not both.
And when we did dump them, we would always dump them in the order
they were referenced by the corresponding hash streams, not in
the order they were serialized in.  This patch adds a lower level
mode that just dumps the whole stream in serialization order.

Additionally, when dumping global-extras, we now dump the hash
bitmap as well as the record offset instead of dumping all zeros
for the offsets.

llvm-svn: 336407
2018-07-06 02:59:25 +00:00
Zachary Turner 1f200adfa7 [PDB] Sort globals symbols by name in GSI hash buckets.
It seems like the debugger first computes a symbol's bucket,
and then does a binary search of entries in the bucket using the
symbol's name in order to find it.  If the bucket entries are not
in sorted order, this obviously won't work.  After this patch a
couple of simple test cases show that we generate an exactly
identical GSI hash stream, which is very nice.

llvm-svn: 336405
2018-07-06 02:33:58 +00:00
Zachary Turner d3fe59833f Fix test after S_PROCREF change.
Since the names are being hashed correctly now, enumerating them
returns them in a different order.  Update the test to reflect
that.

llvm-svn: 336027
2018-06-29 22:41:16 +00:00
Martin Storsjo 3a7905b2aa [COFF] Add an LLD specific option -debug:symbtab
With this set, we retain the symbol table, but skip the actual debug
information.

This is meant to be used by the MinGW frontend.

Differential Revision: https://reviews.llvm.org/D48745

llvm-svn: 335946
2018-06-29 06:08:25 +00:00
Bob Haarman c103156c60 lld-link: align sections to 16 bytes if referenced from the gfids table
Summary:
Control flow guard works best when targets it checks are 16-byte aligned.
Microsoft's link.exe helps ensure this by aligning code from sections
that are referenced from the gfids table to 16 bytes when linking with
-guard:cf, even if the original section specifies a smaller alignment.
This change implements that behavior in lld-link.

See https://crbug.com/857012 for more details.

Reviewers: ruiu, hans, thakis, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48690

llvm-svn: 335864
2018-06-28 15:22:40 +00:00
Reid Kleckner 3408568392 [COFF] Fix /wholearchive: to do libpath search again
Fixes https://crbug.com/852882

llvm-svn: 334761
2018-06-14 19:56:03 +00:00
Rui Ueyama 4eed6cc433 Fix /WholeArchive bug.
`lld-link foo.lib /wholearchive:foo.lib` should work the same way as
`lld-link /wholearchive:foo.lib foo.lib`. Previously, /wholearchive in
the former case was ignored.

Differential Revision: https://reviews.llvm.org/D47565

llvm-svn: 334552
2018-06-12 21:47:31 +00:00
Shoaib Meenai 02c4344262 [COFF] Fix crash when emitting symbol tables with GC
When running with linker GC (`-opt:ref`), defined imported symbols that
are referenced but then dropped by GC end up with their `Location`
member being nullptr, which means `getChunk()` returns nullptr for them
and attempting to call `getChunk()->getOutputSection()` causes a crash
from the nullptr dereference. Check for `getChunk()` being nullptr and
bail out early to avoid the crash.

Differential Revision: https://reviews.llvm.org/D48092

llvm-svn: 334548
2018-06-12 21:19:33 +00:00
Joel Jones a5752e199c [lld] Add REQUIRES: x86 where needed to tests
If building lld without x86 support, tests that require that support should
be treated as unsupported, not errors.

Tested using:
  1. cmake '-DLLVM_TARGETS_TO_BUILD=AArch64;X86'
     make check-lld
     =>
     Expected Passes    : 1406
     Unsupported Tests  : 287

  2. cmake '-DLLVM_TARGETS_TO_BUILD=AArch64'
     make check-lld
     =>
     Expected Passes    : 410
     Unsupported Tests  : 1283

Patch by Joel Jones

Differential Revision: https://reviews.llvm.org/D47748

llvm-svn: 334095
2018-06-06 13:56:51 +00:00
Nico Weber d657c25649 lld-link: Implement /INTEGRITYCHECK flag
/INTEGRITYCHECK has the effect of setting
IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY. Fixes PR31066.
https://reviews.llvm.org/D47472

llvm-svn: 333652
2018-05-31 13:43:02 +00:00
Shoaib Meenai 4e51833611 [COFF] Simplify symbol table output section computation
Rather than using a loop to compare symbol RVAs to the starting RVAs of
sections to determine which section a symbol belongs to, just get the
output section of a symbol directly via its chunk, and bail if the
symbol doesn't have an output section, which avoids having to hardcode
logic for handling dead symbols, CodeView symbols, etc. This was
suggested by Reid Kleckner; thank you.

This also fixes writing out symbol tables in the presence of RVA table
input sections (e.g. .sxdata and .gfids). Such sections aren't written
to the output file directly, so their RVA is 0, and the loop would thus
fail to find an output section for them, resulting in a segfault. Extend
some existing tests to cover this case.

Fixes PR37584.

Differential Revision: https://reviews.llvm.org/D47391

llvm-svn: 333450
2018-05-29 19:07:47 +00:00
Jonas Devlieghere 1fc7fc4db8 [COFF] Update CV register names.
Update tests to use the new prefix for CodeView registers added in
r333421.

llvm-svn: 333425
2018-05-29 14:58:41 +00:00
Zachary Turner c762666e87 Resubmit [pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
This fixes the remaining failing tests, so resubmitting with no
functional change.

llvm-svn: 332676
2018-05-17 22:55:15 +00:00
Zachary Turner 1de9fce151 Revert "[pdb] Change /DEBUG:GHASH to emit 8 byte hashes."
A few tests haven't been properly updated, so reverting while
I have time to investigate proper fixes.

llvm-svn: 332672
2018-05-17 21:49:25 +00:00
Zachary Turner 3c4c8a0937 [pdb] Change /DEBUG:GHASH to emit 8 byte hashes.
Previously we emitted 20-byte SHA1 hashes.  This is overkill
for identifying debug info records, and has the negative side
effect of making object files bigger and links slower.  By
using only the last 8 bytes of a SHA1, we get smaller object
files and ~10% faster links.

This modifies the format of the .debug$H section by adding a new
value for the hash algorithm field, so that the linker will still
work when its object files have an old format.

Differential Revision: https://reviews.llvm.org/D46855

llvm-svn: 332669
2018-05-17 21:22:48 +00:00
Reid Kleckner f40f85868e [codeview] Include record prefix in global type hashing
The prefix includes type kind, which is important to preserve. Two
different type leafs can easily have the same interior record contents
as another type.

We ran into this issue in PR37492 where a bitfield type record collided
with a const modifier record. Their contents were bitwise identical, but
their kinds were different.

llvm-svn: 332664
2018-05-17 20:47:22 +00:00
Zachary Turner c8dd6ccc8a [COFF] Add /Brepro and /TIMESTAMP options.
Previously we would always write a hash of the binary into the
PE file, for reproducible builds.  This breaks AppCompat, which
is a feature of Windows that relies on the timestamp in the PE
header being set to a real value (or at the very least, a value
that satisfies certain properties).

To address this, we put the old behavior of writing the hash
behind the /Brepro flag, which mimics MSVC linker behavior.  We
also match MSVC default behavior, which is to write an actual
timestamp to the PE header.  Finally, we add the /TIMESTAMP
option (an lld extension) so that the user can specify the exact
value to be used in case he/she manually constructs a value which
is both reproducible and satisfies AppCompat.

Differential Revision: https://reviews.llvm.org/D46966

llvm-svn: 332613
2018-05-17 15:11:01 +00:00
Peter Collingbourne 62f7af712c COFF: Allow ICFing sections with different alignments.
The combined section gets the maximum alignment of all sections.

Differential Revision: https://reviews.llvm.org/D46786

llvm-svn: 332273
2018-05-14 18:36:51 +00:00
Peter Collingbourne 107f55005b COFF: ICF a section and its associated sections as a unit.
This is needed to avoid merging two functions with identical
instructions but different xdata. It also reduces binary size by
deduplicating identical pdata sections.

Fixes PR35337.

Differential Revision: https://reviews.llvm.org/D46672

llvm-svn: 332169
2018-05-12 02:12:40 +00:00
Peter Collingbourne d25dfe9bda COFF: Add a flag for disabling string tail merging.
We discovered (crbug.com/838449#c24) that string tail merging can
negatively affect compressed binary size, so provide a flag to turn
it off for users who care more about compressed size than uncompressed
size.

Differential Revision: https://reviews.llvm.org/D46780

llvm-svn: 332149
2018-05-11 22:21:36 +00:00
Peter Collingbourne b6c5a3045b COFF: Allow ICF on vtable sections.
Differential Revision: https://reviews.llvm.org/D46734

llvm-svn: 332059
2018-05-10 23:31:58 +00:00
Peter Collingbourne e28faed768 COFF: Don't create unnecessary thunks.
A thunk is only needed if a relocation points to the undecorated
import name.

Differential Revision: https://reviews.llvm.org/D46673

llvm-svn: 332019
2018-05-10 19:01:28 +00:00
Martin Storsjo 0ca06f7950 [COFF] Allow specifying export forwarding in a def file
Previously this was only supported when specified on the command line
or in directives.

Differential Revision: https://reviews.llvm.org/D46244

llvm-svn: 331900
2018-05-09 18:19:41 +00:00
Peter Collingbourne e5ad31d376 Object: The default alignment of a section without alignment flags is 16.
Differential Revision: https://reviews.llvm.org/D46420

llvm-svn: 331538
2018-05-04 16:45:57 +00:00
Martin Storsjo cc80776eff [COFF] Implement the remaining ARM64 relocations
Now only IMAGE_REL_ARM64_ABSOLUTE and IMAGE_REL_ARM64_TOKEN
are unhandled.

Also add range checks for the existing BRANCH26 relocation.

Differential Revision: https://reviews.llvm.org/D46354

llvm-svn: 331505
2018-05-04 06:06:27 +00:00
Hans Wennborg 03ca8f4fd0 [COFF] Don't set the tsaware bit on DLLs
It doesn't apply to DLLs, and link.exe doesn't set it.

Differential Revision: https://reviews.llvm.org/D46077

llvm-svn: 330868
2018-04-25 20:32:00 +00:00
Peter Collingbourne 9106b2becd Use /pdbaltpath to avoid a path length dependency.
llvm-svn: 330485
2018-04-20 21:54:55 +00:00
Peter Collingbourne 3d636edc56 COFF: Merge .xdata into .rdata by default.
Differential Revision: https://reviews.llvm.org/D45804

llvm-svn: 330484
2018-04-20 21:32:37 +00:00
Peter Collingbourne 326f419335 COFF: Merge .bss into .data by default.
Differential Revision: https://reviews.llvm.org/D45803

llvm-svn: 330483
2018-04-20 21:30:36 +00:00
Peter Collingbourne 71c7de5b77 COFF: Preserve section type when processing /section flag.
It turns out that we were dropping this before.

Differential Revision: https://reviews.llvm.org/D45802

llvm-svn: 330481
2018-04-20 21:23:16 +00:00
Peter Collingbourne 381b3d8aa3 COFF: Use (name, output characteristics) as a key when grouping input sections into output sections.
This is what link.exe does and lets us avoid needing to worry about
merging output characteristics while adding input sections to output
sections.

With this change we can't process /merge in the same way as before
because sections with different output characteristics can still
be merged into one another. So this change moves the processing of
/merge to just before we assign addresses. In the case where there
are multiple output sections with the same name, link.exe only merges
the first section with the source name into the first section with
the target name, and we do the same.

At the same time I also implemented transitive merging (which means
that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a).

This isn't quite enough though because link.exe has a special case for
.CRT in 32-bit mode: it processes sections whose output characteristics
are DATA | R | W as though the output characteristics were DATA | R
(so that they get merged into things like constructor lists in the
expected way). Chromium has a few such sections, and it turns out
that those sections were causing the problem that resulted in r318699
(merge .xdata into .rdata) being reverted: because of the previous
permission merging semantics, the .CRT sections were causing the entire
.rdata section to become writable, which caused the SEH runtime to
crash because it apparently requires .xdata to be read-only. This
change also implements the same special case.

This should unblock being able to merge .xdata into .rdata by default,
as well as .bss into .data, both of which will be done in followups.

Differential Revision: https://reviews.llvm.org/D45801

llvm-svn: 330479
2018-04-20 21:10:33 +00:00
Zachary Turner 194be871b9 [LLD/PDB] Emit first section contribution for DBI Module Descriptor.
Part of the DBI stream is a list of variable length structures
describing each module that contributes to the final executable.

One member of this structure is a section contribution entry that
describes the first section contribution in the output file for
the given module.

We have been leaving this structure unpopulated until now, so with
this patch it is now filled out correctly.

Differential Revision: https://reviews.llvm.org/D45832

llvm-svn: 330457
2018-04-20 18:00:46 +00:00
Reid Kleckner 8f1a28f190 [COFF] Mark images with no exception handlers for /safeseh
Summary:
DLLs and executables with no exception handlers need to be marked with
IMAGE_DLL_CHARACTERISTICS_NO_SEH, even if they have a load config.

Discovered here when building Chromium with LLD on Windows:
https://crbug.com/833951

Reviewers: ruiu, mstorsjo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45778

llvm-svn: 330300
2018-04-18 22:37:10 +00:00
Peter Collingbourne 3108802f16 COFF: Friendlier undefined symbol errors.
Summary:
This change does three things:
- Try to find the file and line number of an undefined symbol
  reference by reading codeview debug info.
- Try to find the name of the function or global variable with the
  undefined symbol reference by searching the object file's symbol
  table.
- Prints the information in the same style as the ELF linker.

Differential Revision: https://reviews.llvm.org/D45467

llvm-svn: 330235
2018-04-17 23:32:33 +00:00
Peter Collingbourne 66f1c9a858 Reland r330223, "COFF: Merge .idata, .didat and .edata into .rdata by default.", which was reverted in r330228.
In this reland I removed an unnecessary use of /debug in the test
delayimports32.test and used the /pdbaltpath flag in the test
pdb-publics-import.test, both of which avoid embedding absolute PDB
paths in executables which could affect later RVAs.

Original commit message:
> COFF: Merge .idata, .didat and .edata into .rdata by default.
>
> This saves a little space and matches what link.exe does.
>
> Tested using the chromium Windows trybots:
> https://chromium-review.googlesource.com/c/chromium/src/+/1014784

Differential Revision: https://reviews.llvm.org/D45737

llvm-svn: 330233
2018-04-17 23:28:52 +00:00
Peter Collingbourne 94aa62e48a COFF: Implement /pdbaltpath flag.
I needed to revert r330223 because we were embedding an absolute PDB
path in the .rdata section, which ended up being laid out before the
.idata section and affecting its RVAs. This flag will let us control
the embedded path.

Differential Revision: https://reviews.llvm.org/D45747

llvm-svn: 330232
2018-04-17 23:28:38 +00:00
Peter Collingbourne 1254f3e77c Revert r330223, "COFF: Merge .idata, .didat and .edata into .rdata by default."
Seems to have uncovered some sort of non-determinism on the bots.

llvm-svn: 330228
2018-04-17 22:16:39 +00:00
Peter Collingbourne 09e0e2e656 COFF: Merge .idata, .didat and .edata into .rdata by default.
This saves a little space and matches what link.exe does.

Tested using the chromium Windows trybots:
https://chromium-review.googlesource.com/c/chromium/src/+/1014784

Differential Revision: https://reviews.llvm.org/D45737

llvm-svn: 330223
2018-04-17 21:44:31 +00:00
Peter Collingbourne 7c26663f58 llvm-pdbutil: Fix an off-by-one error.
Differential Revision: https://reviews.llvm.org/D45740

llvm-svn: 330222
2018-04-17 21:44:17 +00:00
Zachary Turner bee6c22414 [llvm-pdbutil] Dump first section contribution for each module.
The DBI stream contains a list of module descriptors.  At the
beginning of each descriptor is a structure representing the first
section contribution in the output file for that module.  LLD
currently doesn't fill out this structure at all, but link.exe
does.  So as a precursor to emitting this data in LLD, we first
need a way to dump it so that it can be checked.

This patch adds support for the dumping, and verifies via a test
that LLD emits bogus information.

llvm-svn: 330208
2018-04-17 20:06:43 +00:00
Zachary Turner e3fe669855 Resubmit "Fix some incorrect fields in our generated PDBs."
This fixes the failing tests.  They simply hadn't been updated
to match the new output resulting from this patch.

llvm-svn: 330145
2018-04-16 18:17:13 +00:00
Peter Collingbourne 4902508934 COFF: Process /merge flag as we create output sections.
With this we can merge builtin sections.

Differential Revision: https://reviews.llvm.org/D45350

llvm-svn: 329471
2018-04-07 00:46:55 +00:00
Hans Wennborg 2a6943ca14 Fix the test some more after r329221
llvm-svn: 329224
2018-04-04 19:55:45 +00:00
Hans Wennborg e6cf0a3d9f Fix test after r329221
It seems I accidentally overspecified the section size in my previous
commit, whereas it was previously carefully left out.

llvm-svn: 329222
2018-04-04 19:36:27 +00:00
Hans Wennborg 9a9fc78744 COFF: Layout sections in the same order as link.exe
One place where this seems to matter is to make sure the .rsrc section comes
after .text. The Win32 UpdateResource() function can change the contents of
.rsrc. It will move the sections that come after, but if .text gets moved, the
entry point header will not get updated and the executable breaks. This was
found by a test in Chromium.

Differential Revision: https://reviews.llvm.org/D45260

llvm-svn: 329221
2018-04-04 19:15:55 +00:00
Rui Ueyama 20b3423715 Fix manifestinput-error.test on Windows 10.
Patch by Alexandre Ganea.

Differential Revision: https://reviews.llvm.org/D45232

llvm-svn: 329132
2018-04-03 23:12:28 +00:00
Nico Weber a764379458 [lld-link] Add comment explaining that /FIXED behavior is correct despite contradicting MSDN.
Also add a test for /FIXED.
https://reviews.llvm.org/D45087

llvm-svn: 328879
2018-03-30 17:17:04 +00:00
Nico Weber 0945ad6643 [lld-link] Let /PROFILE imply /OPT:REF /OPT:NOICF /INCREMENTAL:NO /FIXED:NO
/FIXED:NO is always the default, so that part needs no work.

Also test the interaction of /ORDER: with /INCREMENTAL.

https://reviews.llvm.org/D45091

llvm-svn: 328877
2018-03-30 17:14:50 +00:00
Nico Weber 8ee3b06f82 Simplify test more.
llvm-svn: 328863
2018-03-30 13:48:11 +00:00
Nico Weber 4fb8799f74 Simplify test.
As of rL215127, FileCheck has an -allow-empty flag,
so this could be used instead of writing a dummy line.
But it looks like the log is never empty now, so not
even that is needed.

llvm-svn: 328862
2018-03-30 13:44:15 +00:00
Zachary Turner 3203e27473 [MSF] Default to FPM2, and always mark FPM pages allocated.
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit.  LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit.  To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.

Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated.  In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.

In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.

This has the side benefit of simplifying our code.

llvm-svn: 328812
2018-03-29 18:34:15 +00:00
Zachary Turner f228276262 [PDB] Resubmit "Support embedding natvis files in PDBs."
This was reverted several times due to what ultimately turned out
to be incompatibilities in our serialized hash table format.

Several changes went in prior to this to fix those issues since
they were more fundamental and independent of supporting injected
sources, so now that those are fixed this change should hopefully
pass.

llvm-svn: 328363
2018-03-23 19:57:25 +00:00
Zachary Turner a6fb536e5b [PDB] Make our PDBs look more like MS PDBs.
When investigating bugs in PDB generation, the first step is
often to do the same link with link.exe and then compare PDBs.

But comparing PDBs is hard because two completely different byte
sequences can both be correct, so it hampers the investigation when
you also have to spend time figuring out not just which bytes are
different, but also if the difference is meaningful.

This patch fixes a couple of cases related to string table emission,
hash table emission, and the order in which we emit strings that
makes more of our bytes the same as the bytes generated by MS PDBs.

Differential Revision: https://reviews.llvm.org/D44810

llvm-svn: 328348
2018-03-23 18:43:39 +00:00
Zachary Turner fced530650 Revert "Resubmit "Support embedding natvis files in PDBs.""
This is still failing on a different bot this time due to some
issue related to hashing absolute paths.  Reverting until I can
figure it out.

llvm-svn: 328014
2018-03-20 18:37:03 +00:00
Zachary Turner 132d7a134f Resubmit "Support embedding natvis files in PDBs."
The issue causing this to fail in certain configurations
should be fixed.

It was due to the fact that DIA apparently expects there to be
a null string at ID 1 in the string table.  I'm not sure why this
is important but it seems to make a difference, so set it.

llvm-svn: 328002
2018-03-20 17:06:39 +00:00
Zachary Turner a21558897b Revert "Support embedding natvis files in PDBs."
This is causing a test failure on a certain bot, so I'm removing
this temporarily until we can figure out the source of the error.

llvm-svn: 327903
2018-03-19 20:41:59 +00:00
Zachary Turner de53aaf132 Support embedding natvis files in PDBs.
Natvis is a debug language supported by Visual Studio for
specifying custom visualizers.  The /NATVIS option is an
undocumented link.exe flag which will take a .natvis file
and "inject" it into the PDB.  This way, you can ship the
debug visualizers for a program along with the PDB, which
is very useful for postmortem debugging.

This is implemented by adding a new "named stream" to the
PDB with a special name of /src/files/<natvis file name>
and simply copying the contents of the xml into this file.

Additionally, we need to emit a single stream named
/src/headerblock which contains a hash table of embedded
files to records describing them.

This patch adds this functionality, including the /NATVIS
option to lld-link.

Differential Revision: https://reviews.llvm.org/D44328

llvm-svn: 327895
2018-03-19 19:53:51 +00:00
Peter Collingbourne f1a11f87a0 COFF: Implement string tail merging.
In COFF, duplicate string literals are merged by placing them in a
comdat whose leader symbol name contains a specific prefix followed
by the hash and partial contents of the string literal. This gives
us an easy way to identify sections containing string literals in
the linker: check for leader symbol names with the given prefix.

Any sections that are identified in this way as containing string
literals may be tail merged. We do so using the StringTableBuilder
class, which is also used to tail merge string literals in the ELF
linker. Tail merging is enabled only if ICF is enabled, as this
provides a signal as to whether the user cares about binary size.

Differential Revision: https://reviews.llvm.org/D44504

llvm-svn: 327668
2018-03-15 21:14:02 +00:00
Reid Kleckner 19454f1a9d [COFF] Fix LLD COFF tests as a follow-up to r327563
I definitely didn't run the tests before committing :(

Most of these tests failed because the LLD map file output changed,
moving the functions from the main text section to a new per-function
section.

ICF also started to fire in a few cases, leading to new layouts.

llvm-svn: 327571
2018-03-14 20:41:39 +00:00
Martin Storsjo af841d113c [test] Fix a temp filename in a test from SVN r327561. NFC.
An earlier file name accidentally slipped through into the committed
version.

llvm-svn: 327567
2018-03-14 20:31:31 +00:00
Reid Kleckner 8364901f24 [COFF] Enable per-function and data sections in LTO
Summary: This allows post-LTO symbol reordering and ICF.

Reviewers: inglorion

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D44492

llvm-svn: 327563
2018-03-14 20:25:41 +00:00
Martin Storsjo 5351891b55 [COFF] Add support for the GNU ld flag --kill-at
GNU ld has got a number of different flags for adjusting how to
behave around stdcall functions. The --kill-at flag strips the
trailing sdcall suffix from exported functions (which otherwise
is included by default in MinGW setups).

This also strips it from the corresponding import library though.
That makes it hard to link to such an import library from code
that calls the functions - but this matches what GNU ld does with
this flag. Therefore, this flag is probably not sensibly used
together with import libraries, but probably mostly when creating
some sort of plugin, or if creating the import library separately
with dlltool.

Differential Revision: https://reviews.llvm.org/D44292

llvm-svn: 327561
2018-03-14 20:17:16 +00:00