Summary:
r338767 updated the COFF and wasm linker SymbolTable code to be
strutured more like the ELF linker's. That inadvertedly changed the
behavior of the COFF linker so that lazy symbols would be marked as
used in regular objects. This change adds an overload of the insert()
function, similar to the ELF linker, which does not perform that
marking.
Reviewers: ruiu, rnk, hans
Subscribers: aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51720
llvm-svn: 341585
If the coff timestamp is set to a hash, like lld-link does if /Brepro is
passed, the coff spec suggests that a IMAGE_DEBUG_TYPE_REPRO entry is in the
debug directory. This lets lld-link write such a section.
Fixes PR38429, see bug for details.
Differential Revision: https://reviews.llvm.org/D51652
llvm-svn: 341486
When building a shared libc++.dll, it pulls in libc++abi.a statically
with the --wholearchive flag. If such a build is done with
--export-all-symbols, it's reasonable to assume that everything
from that library also should be exported with the same rules as normal
local object files, even though we normally avoid autoexporting things
from libc++abi.a in other cases when linking a DLL (user code).
Differential Revision: https://reviews.llvm.org/D51529
llvm-svn: 341403
After fixing up the runtime pseudo relocation, the .refptr.<var>
will be a plain pointer with the same value as the IAT entry itself.
To save a little binary size and reduce the number of runtime pseudo
relocations, redirect references to the IAT entry (via the __imp_<var>
symbol) itself and discard the .refptr.<var> chunk (as long as the
same section chunk doesn't contain anything else than the single
pointer).
As there are now cases for both setting the Live variable to true
and false externally, remove the accessors and setters and just make
the variable public instead.
Differential Revision: https://reviews.llvm.org/D51456
llvm-svn: 341175
Since the order and placement of the non-wanted elements might not
be obvious, it feels more straightforward to hardcode the whole list
with -NEXT elements (and checking for the end of the output with
CHECK-EMPTY) instead of adding CHECK-NOT lines at the right places
where the unwanted elements would appear if they erroneously
were to included.
llvm-svn: 341016
There's no point in keeping them as separate sections.
This differs from GNU ld, which places .ctors and .dtors content in
.text (implemented by a built-in linker script). But since the content
only is pointers, there's no need to have it executable.
GNU ld also leaves .CRT separate as its own standalone section.
MSVC merges .CRT into .rdata similarly, with a directive embedded in
an object file in msvcrt.lib or libcmt.lib.
Differential Revision: https://reviews.llvm.org/D51414
llvm-svn: 340940
Normally, in order to reference exported data symbols from a different
DLL, the declarations need to have the dllimport attribute, in order to
use the __imp_<var> symbol (which contains an address to the actual
variable) instead of the variable itself directly. This isn't an issue
in the same way for functions, since any reference to the function without
the dllimport attribute will end up as a reference to a thunk which loads
the actual target function from the import address table (IAT).
GNU ld, in MinGW environments, supports automatically importing data
symbols from DLLs, even if the references didn't have the appropriate
dllimport attribute. Since the PE/COFF format doesn't support the kind
of relocations that this would require, the MinGW's CRT startup code
has an custom framework of their own for manually fixing the missing
relocations once module is loaded and the target addresses in the IAT
are known.
For this to work, the linker (originall in GNU ld) creates a list of
remaining references needing fixup, which the runtime processes on
startup before handing over control to user code.
While this feature is rather controversial, it's one of the main features
allowing unix style libraries to be used on windows without any extra
porting effort.
Some sort of automatic fixing of data imports is also necessary for the
itanium C++ ABI on windows (as clang implements it right now) for importing
vtable pointers in certain cases, see D43184 for some discussion on that.
The runtime pseudo relocation handler supports 8/16/32/64 bit addresses,
either PC relative references (like IMAGE_REL_*_REL32*) or absolute
references (IMAGE_REL_AMD64_ADDR32, IMAGE_REL_AMD64_ADDR32,
IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a
relocation against the corresponding IAT slot. For the absolute references,
a normal base relocation is created, to update the embedded address
in case the image is loaded at a different address.
The list of runtime pseudo relocations contains the RVA of the
imported symbol (the IAT slot), the RVA of the location the relocation
should be applied to, and a size of the memory location. When the
relocations are fixed at runtime, the difference between the actual
IAT slot value and the IAT slot address is added to the reference,
doing the right thing for both absolute and relative references.
With this patch alone, things work fine for i386 binaries, and mostly
for x86_64 binaries, with feature parity with GNU ld. Despite this,
there are a few gotchas:
- References to data from within code works fine on both x86 architectures,
since their relocations consist of plain 32 or 64 bit absolute/relative
references. On ARM and AArch64, references to data doesn't consist of
a plain 32 or 64 bit embedded address or offset in the code. On ARMNT,
it's usually a MOVW+MOVT instruction pair represented by a
IMAGE_REL_ARM_MOV32T relocation, each instruction containing 16 bit of
the target address), on AArch64, it's usually an ADRP+ADD/LDR/STR
instruction pair with an even more complex encoding, storing a PC
relative address (with a range of +/- 4 GB). This could theoretically
be remedied by extending the runtime pseudo relocation handler with new
relocation types, to support these instruction encodings. This isn't an
issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64.
- For x86_64, if references in code are encoded as 32 bit PC relative
offsets, the runtime relocation will fail if the target turns out to be
out of range for a 32 bit offset.
- Fixing up the relocations at runtime requires making sections writable
if necessary, with the VirtualProtect function. In Windows Store/UWP apps,
this function is forbidden.
These limitations are addressed by a few later patches in lld and
llvm.
Differential Revision: https://reviews.llvm.org/D50917
llvm-svn: 340726
For this relocation, which applies to two consecutive instructions,
it's plausible that the second instruction might not actually be
the right one.
Differential Revision: https://reviews.llvm.org/D50998
llvm-svn: 340715
newline() in ErrorHandler.cpp already tries to insert newlines between messages
that contain embedded newlines, so getSymbolLocations() shouldn't return a
string that ends in a newline -- else we end up with two newlines between error
messages.
Makes lld-link's output look more like ld.lld output.
https://reviews.llvm.org/D51117
llvm-svn: 340482
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.
I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.
I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.
Reviewers: zturner, JDevlieghere
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D50851
llvm-svn: 339907
link.exe ignores REL32 relocations on 32-bit x86, as well as relocations
against non-function symbols such as labels. This makes lld do the same.
Differential Revision: https://reviews.llvm.org/D50430
llvm-svn: 339345
If /subsystem:windows is passed, link.exe only looks for WinMain and wWinMain,
and if /subsystem:console is passed it only looks for main and wmain. lld-link
used to look for all 4 in both cases. This patch makes lld-link match
link.exe's behavior.
This requires that the subsystem is known by the time findDefaultEntry() gets
called. findDefaultEntry() is called before the main link loop, so that the
loop can mark the entry point as undefined. That means inferSubsystem() has to
be called above the main loop as well. This in turn means /subsystem: from
.drectve sections only has an effect on entry point inference for obj files
passed to lld-link directly (and not in obj files found later in .lib files).
link.exe seems to ignore /subsystem: for obj files from lib files completely
(while in lld it's ignored only for entry point detection but it still
overrides /subsystem: flags passed on the command line for the value that gets
written in the output file).
Also, if the subsytem isn't needed (e.g. when only writing a /def: lib file and
not writing a coff file), link.exe doesn't complain if the subsystem isn't
known, so both subsystem and entry point handling should be below the early
return lld has for that case.
Fixes PR36523.
https://reviews.llvm.org/D50316
llvm-svn: 339165
MinGW configurations don't use associative comdats, as GNU ld doesn't
support that. Instead they produce normal comdats named .text$sym,
.xdata$sym and .pdata$sym.
GNU ld doesn't discard any comdats starting with .xdata or .pdata,
even if --gc-sections is used (while it does discard other unreferenced
comdats), regardless of what symbol name is used after the $ separator.
For LLD, treat any such comdat as implicitly associative to the base
symbol. This requires maintaining a map from symbol name to section
number, but that is only maintained when the MinGW flag has been
enabled.
Differential Revision: https://reviews.llvm.org/D49700
llvm-svn: 339058
It's not an error if a common symbol (uninitialized data, with alignment
specified via the aligncomm directive) is replaced with a regular
one with initialized data (with alignment specified via the section
chunk).
Differential Revision: https://reviews.llvm.org/D50268
llvm-svn: 339049
LinkerDriver::inferSubsystem() used to do Symtab->findUnderscore("WinMain"),
but WinMain is stdcall in 32-bit and is hence is called _WinMain@16. Instead,
Symtab->findMangle(mangle("WinMain")) needs to be called.
But since LinkerDriver::inferSubsystem() and LinkerDriver::findDefaultEntry()
both need to call this, introduce a common helper function for this and call it
from both places. (Also call it for "main" for consistency, even though
findUnderscore() is enough for main since that's __cdecl on 32-bit).
This also exposed a bug for /nodefaultlib entrypoint inference: The code here
called findMangle(Sym) instead of findMangle(mangle(Sym)), again doing the
wrong thing on 32-bit. Fix that too.
While here, make Driver::mangle() a static free function.
https://reviews.llvm.org/D50184
llvm-svn: 338877
Some lit tests that call llvm-ar use the 'r' flag. If the target archive
already exists and is in a corrupt state, this can cause the test to fail. We
have added 'rm -f' calls before the llvm-ar calls to increase the
robustness of the tests.
Differential revision: https://reviews.llvm.org/D49184
llvm-svn: 338705
This was useful for LTO bringup in lld-link while lld couldn't write PDBs. Now
that it can, this should no longer be needed. Hopefully the flag is obscure
enough and recent enough, that nobody uses it – but if somebody should use it,
they should be able to just stop passing it and things should continue to work.
https://reviews.llvm.org/D50139
llvm-svn: 338615
Discard them unless they have been associated by other means (yet
uimplemented).
According to MS link.exe, such sections are illegal, but MinGW setups
use them in their take on associative comdats.
This avoids leaving references to the bogus SectionChunk* PendingComdat,
which cannot be dereferenced.
This fixes PR38183.
Differential Revision: https://reviews.llvm.org/D49653
llvm-svn: 338064
Previously, the error messages didn't contain symbol name because we
didn't read a symbol name for these error messages.
Differential Revision: https://reviews.llvm.org/D49762
llvm-svn: 337863
If a binary is stripped, which can remove discardable sections (except
for the .reloc section, which also is marked as discardable as it isn't
loaded at runtime, only read by the loader), the .reloc section should
be first of them, in order not to create gaps in the image.
Previously, binaries with relocations were broken if they were stripped
by GNU binutils strip. Trying to execute such binaries produces an error
about "xx is not a valid win32 application".
This fixes GNU binutils bug 23348.
Prior to SVN r329370 (which didn't intend to have functional changes),
the code for moving discardable sections to the end didn't clearly
express how other discardable sections should be ordered compared to
.reloc, but the change retained the exact same end result as before.
After SVN r329370, the code (and comments) more clearly indicate that
it tries to make the .reloc section the absolutely last one; this patch
changes that.
This matches how GNU binutils ld sorts .reloc compared to dwarf debug
info sections.
Differential Revision: https://reviews.llvm.org/D49351
Signed-off-by: Martin Storsjö <martin@martin.st>
llvm-svn: 337598
For dwarf debug info, an executable normally either contains the debug
info, or it is stripped out. To reduce the storage needed (slightly)
for the debug info kept separately from the released, stripped binaries,
one can choose to only copy the debug data from the original executable
(essentially the reverse of the strip operation), producing a file with
only debug info.
When copying the debug data from an executable with GNU objcopy,
the build id and debug directory need to reside in a separate section,
as this will be kept while the rest of the .rdata section is removed.
Differential Revision: https://reviews.llvm.org/D49352
llvm-svn: 337526
This patch changes relative path for source files in obj files to
absolute path in PDB when linking with added flag.
I will make obj file generated by clang-cl independent from build
directory for chromium build. But I don't want to confuse visual studio
debugger or require additional configuration. To attain this goal, I
added flag to convert relative source file path in obj to absolute path
when emitting PDB.
By removing absolute path from obj files, we can share build cache
between chromium developers even when they are doing debug build.
That will make build time faster.
More context:
https://bugs.chromium.org/p/chromium/issues/detail?id=712796https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/5HXSVX-7fPc
llvm-svn: 337439
Dwarf debug info contains some data that contains absolute addresses.
Since these sections are discardable and aren't loaded at runtime,
there's no point in adding base relocations for them.
This makes sure that after stripping out dwarf debug info, there are no
base relocations that point to nonexistent sections.
Differential Revision: https://reviews.llvm.org/D49350
llvm-svn: 337438
Some Microsoft tools (e.g. new versions of WPA) fail when the
COFF Debug Directory contains a path to the PDB that contains
dots, such as D:\foo\./bar.pdb. Remove dots before writing this
path.
This fixes pr38126.
llvm-svn: 336873
Future symbol insertions can potentially change the type of these
symbols - keep pointers to the base class to reflect this, and
use dynamic casts to inspect them before using as the subclass
type.
This fixes crashes that were possible before, by touching these
symbols that now are populated as e.g. a DefinedRegular, via
the old pointers with DefinedImportThunk type.
Differential Revision: https://reviews.llvm.org/D48953
llvm-svn: 336652
The reference implementation uses a case-insensitive string
comparison for strings of equal length. This will cause the
string "tEo" to compare less than "VUo". However we were using
a case sensitive comparison, which would generate the opposite
outcome. Switch to a case insensitive comparison. Also, when
one of the strings contains non-ascii characters, fallback to
a straight memcmp.
The only way to really test this is with a DIA test. Before this
patch, the test will fail (but succeed if link.exe is used instead
of lld-link). After the patch, it succeeds even with lld-link.
llvm-svn: 336464
They were failing in Chromium's packaging builds with:
C:\b\rr\tmphqfaff\w\src\third_party\llvm\tools\lld\test\COFF\pdb-globals-dia-vfunc-collision2.test:24:8:
error: expected string not found in input
CHECK: func [0x00001060+ 0 - 0x0000106c-12 | sizeof= 12] (FPO) virtual int __cdecl A132()
^
<stdin>:8:11: note: scanning from here
struct S [sizeof = 8] {
^
<stdin>:9:2: note: possible intended match here
func [0x00001060+ 0 - 0x0000106c-12 | sizeof= 12] (FPO) virtual int __cdecl S::A132()
^
Maybe due to different DIA versions.
llvm-svn: 336424
We add an option to dump the entire global / public symbol record
stream. Previously we would dump globals or publics, but not both.
And when we did dump them, we would always dump them in the order
they were referenced by the corresponding hash streams, not in
the order they were serialized in. This patch adds a lower level
mode that just dumps the whole stream in serialization order.
Additionally, when dumping global-extras, we now dump the hash
bitmap as well as the record offset instead of dumping all zeros
for the offsets.
llvm-svn: 336407
It seems like the debugger first computes a symbol's bucket,
and then does a binary search of entries in the bucket using the
symbol's name in order to find it. If the bucket entries are not
in sorted order, this obviously won't work. After this patch a
couple of simple test cases show that we generate an exactly
identical GSI hash stream, which is very nice.
llvm-svn: 336405
With this set, we retain the symbol table, but skip the actual debug
information.
This is meant to be used by the MinGW frontend.
Differential Revision: https://reviews.llvm.org/D48745
llvm-svn: 335946
Summary:
Control flow guard works best when targets it checks are 16-byte aligned.
Microsoft's link.exe helps ensure this by aligning code from sections
that are referenced from the gfids table to 16 bytes when linking with
-guard:cf, even if the original section specifies a smaller alignment.
This change implements that behavior in lld-link.
See https://crbug.com/857012 for more details.
Reviewers: ruiu, hans, thakis, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48690
llvm-svn: 335864
`lld-link foo.lib /wholearchive:foo.lib` should work the same way as
`lld-link /wholearchive:foo.lib foo.lib`. Previously, /wholearchive in
the former case was ignored.
Differential Revision: https://reviews.llvm.org/D47565
llvm-svn: 334552
When running with linker GC (`-opt:ref`), defined imported symbols that
are referenced but then dropped by GC end up with their `Location`
member being nullptr, which means `getChunk()` returns nullptr for them
and attempting to call `getChunk()->getOutputSection()` causes a crash
from the nullptr dereference. Check for `getChunk()` being nullptr and
bail out early to avoid the crash.
Differential Revision: https://reviews.llvm.org/D48092
llvm-svn: 334548
If building lld without x86 support, tests that require that support should
be treated as unsupported, not errors.
Tested using:
1. cmake '-DLLVM_TARGETS_TO_BUILD=AArch64;X86'
make check-lld
=>
Expected Passes : 1406
Unsupported Tests : 287
2. cmake '-DLLVM_TARGETS_TO_BUILD=AArch64'
make check-lld
=>
Expected Passes : 410
Unsupported Tests : 1283
Patch by Joel Jones
Differential Revision: https://reviews.llvm.org/D47748
llvm-svn: 334095
Rather than using a loop to compare symbol RVAs to the starting RVAs of
sections to determine which section a symbol belongs to, just get the
output section of a symbol directly via its chunk, and bail if the
symbol doesn't have an output section, which avoids having to hardcode
logic for handling dead symbols, CodeView symbols, etc. This was
suggested by Reid Kleckner; thank you.
This also fixes writing out symbol tables in the presence of RVA table
input sections (e.g. .sxdata and .gfids). Such sections aren't written
to the output file directly, so their RVA is 0, and the loop would thus
fail to find an output section for them, resulting in a segfault. Extend
some existing tests to cover this case.
Fixes PR37584.
Differential Revision: https://reviews.llvm.org/D47391
llvm-svn: 333450
Previously we emitted 20-byte SHA1 hashes. This is overkill
for identifying debug info records, and has the negative side
effect of making object files bigger and links slower. By
using only the last 8 bytes of a SHA1, we get smaller object
files and ~10% faster links.
This modifies the format of the .debug$H section by adding a new
value for the hash algorithm field, so that the linker will still
work when its object files have an old format.
Differential Revision: https://reviews.llvm.org/D46855
llvm-svn: 332669
The prefix includes type kind, which is important to preserve. Two
different type leafs can easily have the same interior record contents
as another type.
We ran into this issue in PR37492 where a bitfield type record collided
with a const modifier record. Their contents were bitwise identical, but
their kinds were different.
llvm-svn: 332664
Previously we would always write a hash of the binary into the
PE file, for reproducible builds. This breaks AppCompat, which
is a feature of Windows that relies on the timestamp in the PE
header being set to a real value (or at the very least, a value
that satisfies certain properties).
To address this, we put the old behavior of writing the hash
behind the /Brepro flag, which mimics MSVC linker behavior. We
also match MSVC default behavior, which is to write an actual
timestamp to the PE header. Finally, we add the /TIMESTAMP
option (an lld extension) so that the user can specify the exact
value to be used in case he/she manually constructs a value which
is both reproducible and satisfies AppCompat.
Differential Revision: https://reviews.llvm.org/D46966
llvm-svn: 332613
This is needed to avoid merging two functions with identical
instructions but different xdata. It also reduces binary size by
deduplicating identical pdata sections.
Fixes PR35337.
Differential Revision: https://reviews.llvm.org/D46672
llvm-svn: 332169
We discovered (crbug.com/838449#c24) that string tail merging can
negatively affect compressed binary size, so provide a flag to turn
it off for users who care more about compressed size than uncompressed
size.
Differential Revision: https://reviews.llvm.org/D46780
llvm-svn: 332149
Previously this was only supported when specified on the command line
or in directives.
Differential Revision: https://reviews.llvm.org/D46244
llvm-svn: 331900
Now only IMAGE_REL_ARM64_ABSOLUTE and IMAGE_REL_ARM64_TOKEN
are unhandled.
Also add range checks for the existing BRANCH26 relocation.
Differential Revision: https://reviews.llvm.org/D46354
llvm-svn: 331505
This is what link.exe does and lets us avoid needing to worry about
merging output characteristics while adding input sections to output
sections.
With this change we can't process /merge in the same way as before
because sections with different output characteristics can still
be merged into one another. So this change moves the processing of
/merge to just before we assign addresses. In the case where there
are multiple output sections with the same name, link.exe only merges
the first section with the source name into the first section with
the target name, and we do the same.
At the same time I also implemented transitive merging (which means
that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a).
This isn't quite enough though because link.exe has a special case for
.CRT in 32-bit mode: it processes sections whose output characteristics
are DATA | R | W as though the output characteristics were DATA | R
(so that they get merged into things like constructor lists in the
expected way). Chromium has a few such sections, and it turns out
that those sections were causing the problem that resulted in r318699
(merge .xdata into .rdata) being reverted: because of the previous
permission merging semantics, the .CRT sections were causing the entire
.rdata section to become writable, which caused the SEH runtime to
crash because it apparently requires .xdata to be read-only. This
change also implements the same special case.
This should unblock being able to merge .xdata into .rdata by default,
as well as .bss into .data, both of which will be done in followups.
Differential Revision: https://reviews.llvm.org/D45801
llvm-svn: 330479
Part of the DBI stream is a list of variable length structures
describing each module that contributes to the final executable.
One member of this structure is a section contribution entry that
describes the first section contribution in the output file for
the given module.
We have been leaving this structure unpopulated until now, so with
this patch it is now filled out correctly.
Differential Revision: https://reviews.llvm.org/D45832
llvm-svn: 330457
Summary:
DLLs and executables with no exception handlers need to be marked with
IMAGE_DLL_CHARACTERISTICS_NO_SEH, even if they have a load config.
Discovered here when building Chromium with LLD on Windows:
https://crbug.com/833951
Reviewers: ruiu, mstorsjo
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45778
llvm-svn: 330300
Summary:
This change does three things:
- Try to find the file and line number of an undefined symbol
reference by reading codeview debug info.
- Try to find the name of the function or global variable with the
undefined symbol reference by searching the object file's symbol
table.
- Prints the information in the same style as the ELF linker.
Differential Revision: https://reviews.llvm.org/D45467
llvm-svn: 330235
In this reland I removed an unnecessary use of /debug in the test
delayimports32.test and used the /pdbaltpath flag in the test
pdb-publics-import.test, both of which avoid embedding absolute PDB
paths in executables which could affect later RVAs.
Original commit message:
> COFF: Merge .idata, .didat and .edata into .rdata by default.
>
> This saves a little space and matches what link.exe does.
>
> Tested using the chromium Windows trybots:
> https://chromium-review.googlesource.com/c/chromium/src/+/1014784
Differential Revision: https://reviews.llvm.org/D45737
llvm-svn: 330233
I needed to revert r330223 because we were embedding an absolute PDB
path in the .rdata section, which ended up being laid out before the
.idata section and affecting its RVAs. This flag will let us control
the embedded path.
Differential Revision: https://reviews.llvm.org/D45747
llvm-svn: 330232
The DBI stream contains a list of module descriptors. At the
beginning of each descriptor is a structure representing the first
section contribution in the output file for that module. LLD
currently doesn't fill out this structure at all, but link.exe
does. So as a precursor to emitting this data in LLD, we first
need a way to dump it so that it can be checked.
This patch adds support for the dumping, and verifies via a test
that LLD emits bogus information.
llvm-svn: 330208
One place where this seems to matter is to make sure the .rsrc section comes
after .text. The Win32 UpdateResource() function can change the contents of
.rsrc. It will move the sections that come after, but if .text gets moved, the
entry point header will not get updated and the executable breaks. This was
found by a test in Chromium.
Differential Revision: https://reviews.llvm.org/D45260
llvm-svn: 329221
/FIXED:NO is always the default, so that part needs no work.
Also test the interaction of /ORDER: with /INCREMENTAL.
https://reviews.llvm.org/D45091
llvm-svn: 328877
As of rL215127, FileCheck has an -allow-empty flag,
so this could be used instead of writing a dummy line.
But it looks like the log is never empty now, so not
even that is needed.
llvm-svn: 328862
There are two FPMs in an MSF file, the idea being that for
incremental updates you can write to the alternate one and then
atomically swap them on commit. LLVM defaulted to using FPM1
on the first commit, but this differs from Microsoft's behavior
which is to default to using FPM2 on the first commit. To
eliminate some byte-level file differences, this patch changes
LLVM's default to also be FPM2.
Additionally, LLVM was trying to be "smart" about marking FPM
pages allocated. In addition to marking every page belonging
to the alternate FPM as unallocated, LLVM also marked pages at
the end of the main FPM which were not needed as unallocated.
In order to match the behavior of Microsoft-generated PDBs, we
now always mark every FPM block as allocated, regardless of
whether it is in the main FPM or the alt FPM, and regardless of
whether or not it describes blocks which are actually in the file.
This has the side benefit of simplifying our code.
llvm-svn: 328812
This was reverted several times due to what ultimately turned out
to be incompatibilities in our serialized hash table format.
Several changes went in prior to this to fix those issues since
they were more fundamental and independent of supporting injected
sources, so now that those are fixed this change should hopefully
pass.
llvm-svn: 328363
When investigating bugs in PDB generation, the first step is
often to do the same link with link.exe and then compare PDBs.
But comparing PDBs is hard because two completely different byte
sequences can both be correct, so it hampers the investigation when
you also have to spend time figuring out not just which bytes are
different, but also if the difference is meaningful.
This patch fixes a couple of cases related to string table emission,
hash table emission, and the order in which we emit strings that
makes more of our bytes the same as the bytes generated by MS PDBs.
Differential Revision: https://reviews.llvm.org/D44810
llvm-svn: 328348
This is still failing on a different bot this time due to some
issue related to hashing absolute paths. Reverting until I can
figure it out.
llvm-svn: 328014
The issue causing this to fail in certain configurations
should be fixed.
It was due to the fact that DIA apparently expects there to be
a null string at ID 1 in the string table. I'm not sure why this
is important but it seems to make a difference, so set it.
llvm-svn: 328002
Natvis is a debug language supported by Visual Studio for
specifying custom visualizers. The /NATVIS option is an
undocumented link.exe flag which will take a .natvis file
and "inject" it into the PDB. This way, you can ship the
debug visualizers for a program along with the PDB, which
is very useful for postmortem debugging.
This is implemented by adding a new "named stream" to the
PDB with a special name of /src/files/<natvis file name>
and simply copying the contents of the xml into this file.
Additionally, we need to emit a single stream named
/src/headerblock which contains a hash table of embedded
files to records describing them.
This patch adds this functionality, including the /NATVIS
option to lld-link.
Differential Revision: https://reviews.llvm.org/D44328
llvm-svn: 327895
In COFF, duplicate string literals are merged by placing them in a
comdat whose leader symbol name contains a specific prefix followed
by the hash and partial contents of the string literal. This gives
us an easy way to identify sections containing string literals in
the linker: check for leader symbol names with the given prefix.
Any sections that are identified in this way as containing string
literals may be tail merged. We do so using the StringTableBuilder
class, which is also used to tail merge string literals in the ELF
linker. Tail merging is enabled only if ICF is enabled, as this
provides a signal as to whether the user cares about binary size.
Differential Revision: https://reviews.llvm.org/D44504
llvm-svn: 327668
I definitely didn't run the tests before committing :(
Most of these tests failed because the LLD map file output changed,
moving the functions from the main text section to a new per-function
section.
ICF also started to fire in a few cases, leading to new layouts.
llvm-svn: 327571
GNU ld has got a number of different flags for adjusting how to
behave around stdcall functions. The --kill-at flag strips the
trailing sdcall suffix from exported functions (which otherwise
is included by default in MinGW setups).
This also strips it from the corresponding import library though.
That makes it hard to link to such an import library from code
that calls the functions - but this matches what GNU ld does with
this flag. Therefore, this flag is probably not sensibly used
together with import libraries, but probably mostly when creating
some sort of plugin, or if creating the import library separately
with dlltool.
Differential Revision: https://reviews.llvm.org/D44292
llvm-svn: 327561