Summary:
r338767 updated the COFF and wasm linker SymbolTable code to be
strutured more like the ELF linker's. That inadvertedly changed the
behavior of the COFF linker so that lazy symbols would be marked as
used in regular objects. This change adds an overload of the insert()
function, similar to the ELF linker, which does not perform that
marking.
Reviewers: ruiu, rnk, hans
Subscribers: aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51720
llvm-svn: 341585
If the coff timestamp is set to a hash, like lld-link does if /Brepro is
passed, the coff spec suggests that a IMAGE_DEBUG_TYPE_REPRO entry is in the
debug directory. This lets lld-link write such a section.
Fixes PR38429, see bug for details.
Differential Revision: https://reviews.llvm.org/D51652
llvm-svn: 341486
When building a shared libc++.dll, it pulls in libc++abi.a statically
with the --wholearchive flag. If such a build is done with
--export-all-symbols, it's reasonable to assume that everything
from that library also should be exported with the same rules as normal
local object files, even though we normally avoid autoexporting things
from libc++abi.a in other cases when linking a DLL (user code).
Differential Revision: https://reviews.llvm.org/D51529
llvm-svn: 341403
Following D50807, and heading towards D50664, this intermediary change does the following:
1. Upgrade all custom Error types in llvm/trunk/lib/DebugInfo/ to use the new StringError behavior (D50807).
2. Implement std::is_error_code_enum and make_error_code() for DebugInfo error enumerations.
3. Rename GenericError -> PDBError (the file will be renamed in a subsequent commit)
4. Update custom error messages to follow the same formatting: (\w\s*)+\.
5. Keep generic "file not found" (ENOENT) errors as they are in PDB code. Previously, there used to be a custom enumeration for that purpose.
6. Remove a few extraneous LF in log() implementations. Printing LF is a responsability at a higher level, not at the error level.
Differential Revision: https://reviews.llvm.org/D51499
llvm-svn: 341228
After fixing up the runtime pseudo relocation, the .refptr.<var>
will be a plain pointer with the same value as the IAT entry itself.
To save a little binary size and reduce the number of runtime pseudo
relocations, redirect references to the IAT entry (via the __imp_<var>
symbol) itself and discard the .refptr.<var> chunk (as long as the
same section chunk doesn't contain anything else than the single
pointer).
As there are now cases for both setting the Live variable to true
and false externally, remove the accessors and setters and just make
the variable public instead.
Differential Revision: https://reviews.llvm.org/D51456
llvm-svn: 341175
There's no point in keeping them as separate sections.
This differs from GNU ld, which places .ctors and .dtors content in
.text (implemented by a built-in linker script). But since the content
only is pointers, there's no need to have it executable.
GNU ld also leaves .CRT separate as its own standalone section.
MSVC merges .CRT into .rdata similarly, with a directive embedded in
an object file in msvcrt.lib or libcmt.lib.
Differential Revision: https://reviews.llvm.org/D51414
llvm-svn: 340940
Normally, in order to reference exported data symbols from a different
DLL, the declarations need to have the dllimport attribute, in order to
use the __imp_<var> symbol (which contains an address to the actual
variable) instead of the variable itself directly. This isn't an issue
in the same way for functions, since any reference to the function without
the dllimport attribute will end up as a reference to a thunk which loads
the actual target function from the import address table (IAT).
GNU ld, in MinGW environments, supports automatically importing data
symbols from DLLs, even if the references didn't have the appropriate
dllimport attribute. Since the PE/COFF format doesn't support the kind
of relocations that this would require, the MinGW's CRT startup code
has an custom framework of their own for manually fixing the missing
relocations once module is loaded and the target addresses in the IAT
are known.
For this to work, the linker (originall in GNU ld) creates a list of
remaining references needing fixup, which the runtime processes on
startup before handing over control to user code.
While this feature is rather controversial, it's one of the main features
allowing unix style libraries to be used on windows without any extra
porting effort.
Some sort of automatic fixing of data imports is also necessary for the
itanium C++ ABI on windows (as clang implements it right now) for importing
vtable pointers in certain cases, see D43184 for some discussion on that.
The runtime pseudo relocation handler supports 8/16/32/64 bit addresses,
either PC relative references (like IMAGE_REL_*_REL32*) or absolute
references (IMAGE_REL_AMD64_ADDR32, IMAGE_REL_AMD64_ADDR32,
IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a
relocation against the corresponding IAT slot. For the absolute references,
a normal base relocation is created, to update the embedded address
in case the image is loaded at a different address.
The list of runtime pseudo relocations contains the RVA of the
imported symbol (the IAT slot), the RVA of the location the relocation
should be applied to, and a size of the memory location. When the
relocations are fixed at runtime, the difference between the actual
IAT slot value and the IAT slot address is added to the reference,
doing the right thing for both absolute and relative references.
With this patch alone, things work fine for i386 binaries, and mostly
for x86_64 binaries, with feature parity with GNU ld. Despite this,
there are a few gotchas:
- References to data from within code works fine on both x86 architectures,
since their relocations consist of plain 32 or 64 bit absolute/relative
references. On ARM and AArch64, references to data doesn't consist of
a plain 32 or 64 bit embedded address or offset in the code. On ARMNT,
it's usually a MOVW+MOVT instruction pair represented by a
IMAGE_REL_ARM_MOV32T relocation, each instruction containing 16 bit of
the target address), on AArch64, it's usually an ADRP+ADD/LDR/STR
instruction pair with an even more complex encoding, storing a PC
relative address (with a range of +/- 4 GB). This could theoretically
be remedied by extending the runtime pseudo relocation handler with new
relocation types, to support these instruction encodings. This isn't an
issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64.
- For x86_64, if references in code are encoded as 32 bit PC relative
offsets, the runtime relocation will fail if the target turns out to be
out of range for a 32 bit offset.
- Fixing up the relocations at runtime requires making sections writable
if necessary, with the VirtualProtect function. In Windows Store/UWP apps,
this function is forbidden.
These limitations are addressed by a few later patches in lld and
llvm.
Differential Revision: https://reviews.llvm.org/D50917
llvm-svn: 340726
For this relocation, which applies to two consecutive instructions,
it's plausible that the second instruction might not actually be
the right one.
Differential Revision: https://reviews.llvm.org/D50998
llvm-svn: 340715
This is a minor follow-up to https://reviews.llvm.org/D49189. On Windows, lld
used to print "lld-link.exe: error: ...". Now it just prints "lld-link: error:
...". This matches what link.exe does (it prints "LINK : ...") and makes lld's
output less dependent on the host system.
https://reviews.llvm.org/D51133
llvm-svn: 340487
newline() in ErrorHandler.cpp already tries to insert newlines between messages
that contain embedded newlines, so getSymbolLocations() shouldn't return a
string that ends in a newline -- else we end up with two newlines between error
messages.
Makes lld-link's output look more like ld.lld output.
https://reviews.llvm.org/D51117
llvm-svn: 340482
In most of these cases, it's easy to go on despite the error,
printing as many valuable error messages as possible from one run
as possible.
Differential Revision: https://reviews.llvm.org/D51087
llvm-svn: 340399
link.exe ignores REL32 relocations on 32-bit x86, as well as relocations
against non-function symbols such as labels. This makes lld do the same.
Differential Revision: https://reviews.llvm.org/D50430
llvm-svn: 339345
If /subsystem:windows is passed, link.exe only looks for WinMain and wWinMain,
and if /subsystem:console is passed it only looks for main and wmain. lld-link
used to look for all 4 in both cases. This patch makes lld-link match
link.exe's behavior.
This requires that the subsystem is known by the time findDefaultEntry() gets
called. findDefaultEntry() is called before the main link loop, so that the
loop can mark the entry point as undefined. That means inferSubsystem() has to
be called above the main loop as well. This in turn means /subsystem: from
.drectve sections only has an effect on entry point inference for obj files
passed to lld-link directly (and not in obj files found later in .lib files).
link.exe seems to ignore /subsystem: for obj files from lib files completely
(while in lld it's ignored only for entry point detection but it still
overrides /subsystem: flags passed on the command line for the value that gets
written in the output file).
Also, if the subsytem isn't needed (e.g. when only writing a /def: lib file and
not writing a coff file), link.exe doesn't complain if the subsystem isn't
known, so both subsystem and entry point handling should be below the early
return lld has for that case.
Fixes PR36523.
https://reviews.llvm.org/D50316
llvm-svn: 339165
MinGW configurations don't use associative comdats, as GNU ld doesn't
support that. Instead they produce normal comdats named .text$sym,
.xdata$sym and .pdata$sym.
GNU ld doesn't discard any comdats starting with .xdata or .pdata,
even if --gc-sections is used (while it does discard other unreferenced
comdats), regardless of what symbol name is used after the $ separator.
For LLD, treat any such comdat as implicitly associative to the base
symbol. This requires maintaining a map from symbol name to section
number, but that is only maintained when the MinGW flag has been
enabled.
Differential Revision: https://reviews.llvm.org/D49700
llvm-svn: 339058
It's not an error if a common symbol (uninitialized data, with alignment
specified via the aligncomm directive) is replaced with a regular
one with initialized data (with alignment specified via the section
chunk).
Differential Revision: https://reviews.llvm.org/D50268
llvm-svn: 339049
LinkerDriver::inferSubsystem() used to do Symtab->findUnderscore("WinMain"),
but WinMain is stdcall in 32-bit and is hence is called _WinMain@16. Instead,
Symtab->findMangle(mangle("WinMain")) needs to be called.
But since LinkerDriver::inferSubsystem() and LinkerDriver::findDefaultEntry()
both need to call this, introduce a common helper function for this and call it
from both places. (Also call it for "main" for consistency, even though
findUnderscore() is enough for main since that's __cdecl on 32-bit).
This also exposed a bug for /nodefaultlib entrypoint inference: The code here
called findMangle(Sym) instead of findMangle(mangle(Sym)), again doing the
wrong thing on 32-bit. Fix that too.
While here, make Driver::mangle() a static free function.
https://reviews.llvm.org/D50184
llvm-svn: 338877
This was useful for LTO bringup in lld-link while lld couldn't write PDBs. Now
that it can, this should no longer be needed. Hopefully the flag is obscure
enough and recent enough, that nobody uses it – but if somebody should use it,
they should be able to just stop passing it and things should continue to work.
https://reviews.llvm.org/D50139
llvm-svn: 338615
This patch does the same thing as r338153 for COFF.
Note that this patch affects only the order of log messages.
The output file is already deterministic.
Differential Revision: https://reviews.llvm.org/D50023
llvm-svn: 338406
Discard them unless they have been associated by other means (yet
uimplemented).
According to MS link.exe, such sections are illegal, but MinGW setups
use them in their take on associative comdats.
This avoids leaving references to the bogus SectionChunk* PendingComdat,
which cannot be dereferenced.
This fixes PR38183.
Differential Revision: https://reviews.llvm.org/D49653
llvm-svn: 338064
Patch by Andrew Kelley.
Previously, running lld::coff::link() twice in the same process would
access stale pointers because of these global variables not being reset.
After this patch, lld::coff::link() can be called any number of times,
just like its ELF and MACH-O counterparts.
Differential Revision: https://reviews.llvm.org/D49856
llvm-svn: 338042
Previously, the error messages didn't contain symbol name because we
didn't read a symbol name for these error messages.
Differential Revision: https://reviews.llvm.org/D49762
llvm-svn: 337863
lld currently prepends the absolute path to itself to every diagnostic it
emits. This path can be longer than the diagnostic, and makes the actual error
message hard to read.
There isn't a good reason for printing this path: if you want to know which lld
you're running, pass -v to clang – chances are that if you're unsure of this,
you're not only unsure when it errors out. Some people want an indication that
the diagnostic is from the linker though, so instead print just the basename of
the linker's path.
Before:
```
$ out/bin/clang -target x86_64-unknown-linux -x c++ /dev/null -fuse-ld=lld
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: cannot open crt1.o: No such file or directory
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: cannot open crti.o: No such file or directory
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: cannot open crtbegin.o: No such file or directory
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: unable to find library -lgcc
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: unable to find library -lgcc_s
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: unable to find library -lc
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: unable to find library -lgcc
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: unable to find library -lgcc_s
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: cannot open crtend.o: No such file or directory
/Users/thakis/src/llvm-mono/out/bin/ld.lld: error: cannot open crtn.o: No such file or directory
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
After:
```
$ out/bin/clang -target x86_64-unknown-linux -x c++ /dev/null -fuse-ld=lld
ld.lld: error: cannot open crt1.o: No such file or directory
ld.lld: error: cannot open crti.o: No such file or directory
ld.lld: error: cannot open crtbegin.o: No such file or directory
ld.lld: error: unable to find library -lgcc
ld.lld: error: unable to find library -lgcc_s
ld.lld: error: unable to find library -lc
ld.lld: error: unable to find library -lgcc
ld.lld: error: unable to find library -lgcc_s
ld.lld: error: cannot open crtend.o: No such file or directory
ld.lld: error: cannot open crtn.o: No such file or directory
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
https://reviews.llvm.org/D49189
llvm-svn: 337634
If a binary is stripped, which can remove discardable sections (except
for the .reloc section, which also is marked as discardable as it isn't
loaded at runtime, only read by the loader), the .reloc section should
be first of them, in order not to create gaps in the image.
Previously, binaries with relocations were broken if they were stripped
by GNU binutils strip. Trying to execute such binaries produces an error
about "xx is not a valid win32 application".
This fixes GNU binutils bug 23348.
Prior to SVN r329370 (which didn't intend to have functional changes),
the code for moving discardable sections to the end didn't clearly
express how other discardable sections should be ordered compared to
.reloc, but the change retained the exact same end result as before.
After SVN r329370, the code (and comments) more clearly indicate that
it tries to make the .reloc section the absolutely last one; this patch
changes that.
This matches how GNU binutils ld sorts .reloc compared to dwarf debug
info sections.
Differential Revision: https://reviews.llvm.org/D49351
Signed-off-by: Martin Storsjö <martin@martin.st>
llvm-svn: 337598
For dwarf debug info, an executable normally either contains the debug
info, or it is stripped out. To reduce the storage needed (slightly)
for the debug info kept separately from the released, stripped binaries,
one can choose to only copy the debug data from the original executable
(essentially the reverse of the strip operation), producing a file with
only debug info.
When copying the debug data from an executable with GNU objcopy,
the build id and debug directory need to reside in a separate section,
as this will be kept while the rest of the .rdata section is removed.
Differential Revision: https://reviews.llvm.org/D49352
llvm-svn: 337526
This patch changes relative path for source files in obj files to
absolute path in PDB when linking with added flag.
I will make obj file generated by clang-cl independent from build
directory for chromium build. But I don't want to confuse visual studio
debugger or require additional configuration. To attain this goal, I
added flag to convert relative source file path in obj to absolute path
when emitting PDB.
By removing absolute path from obj files, we can share build cache
between chromium developers even when they are doing debug build.
That will make build time faster.
More context:
https://bugs.chromium.org/p/chromium/issues/detail?id=712796https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/5HXSVX-7fPc
llvm-svn: 337439
Dwarf debug info contains some data that contains absolute addresses.
Since these sections are discardable and aren't loaded at runtime,
there's no point in adding base relocations for them.
This makes sure that after stripping out dwarf debug info, there are no
base relocations that point to nonexistent sections.
Differential Revision: https://reviews.llvm.org/D49350
llvm-svn: 337438
Some Microsoft tools (e.g. new versions of WPA) fail when the
COFF Debug Directory contains a path to the PDB that contains
dots, such as D:\foo\./bar.pdb. Remove dots before writing this
path.
This fixes pr38126.
llvm-svn: 336873
Future symbol insertions can potentially change the type of these
symbols - keep pointers to the base class to reflect this, and
use dynamic casts to inspect them before using as the subclass
type.
This fixes crashes that were possible before, by touching these
symbols that now are populated as e.g. a DefinedRegular, via
the old pointers with DefinedImportThunk type.
Differential Revision: https://reviews.llvm.org/D48953
llvm-svn: 336652
With this set, we retain the symbol table, but skip the actual debug
information.
This is meant to be used by the MinGW frontend.
Differential Revision: https://reviews.llvm.org/D48745
llvm-svn: 335946
Summary:
Control flow guard works best when targets it checks are 16-byte aligned.
Microsoft's link.exe helps ensure this by aligning code from sections
that are referenced from the gfids table to 16 bytes when linking with
-guard:cf, even if the original section specifies a smaller alignment.
This change implements that behavior in lld-link.
See https://crbug.com/857012 for more details.
Reviewers: ruiu, hans, thakis, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48690
llvm-svn: 335864
`lld-link foo.lib /wholearchive:foo.lib` should work the same way as
`lld-link /wholearchive:foo.lib foo.lib`. Previously, /wholearchive in
the former case was ignored.
Differential Revision: https://reviews.llvm.org/D47565
llvm-svn: 334552
When running with linker GC (`-opt:ref`), defined imported symbols that
are referenced but then dropped by GC end up with their `Location`
member being nullptr, which means `getChunk()` returns nullptr for them
and attempting to call `getChunk()->getOutputSection()` causes a crash
from the nullptr dereference. Check for `getChunk()` being nullptr and
bail out early to avoid the crash.
Differential Revision: https://reviews.llvm.org/D48092
llvm-svn: 334548
This simplifies some code which had StringRefs to begin with, and
makes other code more complicated which had const char* to begin
with.
In the end, I think this makes for a more idiomatic and platform
agnostic API. Not all platforms launch process with null terminated
c-string arrays for the environment pointer and argv, but the api
was designed that way because it allowed easy pass-through for
posix-based platforms. There's a little additional overhead now
since on posix based platforms we'll be takign StringRefs which
were constructed from null terminated strings and then copying
them to null terminate them again, but from a readability and
usability standpoint of the API user, I think this API signature
is strictly better.
llvm-svn: 334518