Previously these were dropped. We now understand them sufficiently
well to start emitting them. From the debugger's perspective, this
now enables us to have debug info about typedefs (both global and
function-locally scoped)
Differential Revision: https://reviews.llvm.org/D55228
llvm-svn: 348306
We initialize .text section with 0xcc (INT3 instruction), so we need to
explicitly write data even if it is zero if it can be in a .text section.
If you specify /merge:.rdata=.text, .rdata (which contains .idata) is put
to .text, so we need to do this.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39826
Differential Revision: https://reviews.llvm.org/D55098
llvm-svn: 348000
The number of sections is used in assignAddresses (in
finalizeAddresses) and the space for all sections is permanent from
that point on, even if we later decide we won't write some of them.
The VirtualSize field also gets calculated in assignAddresses, so we
need to manually check whether the section is empty here instead.
Differential Revision: https://reviews.llvm.org/D54495
llvm-svn: 347704
Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.
Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.
With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).
Some stats on symbol sizes for the curious:
PDB size: 507M
sym bytes: 316,508,016
sym count: 18,954,971
sym byte avg: 16.7
As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.
Reviewers: zturner, aganea, ruiu
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D54554
llvm-svn: 347687
GNU ld, which doesn't generate PDBs, can optionally generate a
build id by passing the --build-id option. LLD's MinGW frontend knows
about this option but ignores it, as I had falsely assumed that LLD
already generated build IDs even in those cases.
If debug info is requested and no PDB path is set, generate a
build id signature as a hash of the binary itself. This allows
associating a binary to a minidump, even if debug info isn't
written in PDB form by the linker.
Differential Revision: https://reviews.llvm.org/D54828
llvm-svn: 347645
Summary:
MSVC does this, and we should to.
The .gfids table is a table of RVAs, so it's impossible for a DLL to
indicate that an imported symbol is address taken. Therefore, exports
appear to be listed as address taken by the DLL that exports them.
This fixes an issue that Firefox ran into here:
https://bugzilla.mozilla.org/show_bug.cgi?id=1485016#c12
In Firefox, the export directive came from a .def file, but we need to
do this for any kind of export.
Reviewers: dmajor, hans, amccarth, alex
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54723
llvm-svn: 347623
Summary: They have an additional `ThreadsEnabled` check, which does not matter much.
Reviewers: pcc, ruiu, rnk
Reviewed By: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54812
llvm-svn: 347587
Previously we were taking over 13 minutes to link Firefox's xul.dll
on ARM64; this reduces link time to around 18s on my machine.
The root cause of the problem was that all of the input .pdata sections
had the same unrelocated section data and therefore the same hash,
which made segregation quadratic in the number of .pdata sections. The
reason why we weren't observing this on other architectures was that
ARM has a different .pdata format. On non-ARM the format is (start
address, end address, .xdata), which caused the size of the function
to appear in the unrelocated section data where the end address field
is. However, the ARM format omits the end address field.
Fixes PR39667.
Differential Revision: https://reviews.llvm.org/D54809
llvm-svn: 347429
Don't use a uint32_t*, use a ulittle32_t* to make this correct
on big endian systems.
Patch by James Clarke
Differential Revision: https://reviews.llvm.org/D54421
llvm-svn: 347349
- Make mergeSymbolRecords a method of PDBLinker to reduce the number of
parameters it needs.
- Remove a stale FIXME comment about error handling. We already drop
unknown symbol records, log them, and continue.
- Update a comment about why we're copying the symbol record. We do it
to realign the record. We can already mutate the symbol record memory,
it's memory allocated by relocateDebugChunk.
- Avoid the extra `CVSymbol NewSym` variable. We can mutate Sym in
place, which is best, since we're mutating the underlying record anyway.
llvm-svn: 346817
Summary:
Reuse the "referenced by" note diagnostic code that we already use for
undefined symbols. In my case, it turned this:
lld-link: error: relocation against symbol in discarded section: .text
lld-link: error: relocation against symbol in discarded section: .text
...
Into this:
lld-link: error: relocation against symbol in discarded section: .text
>>> referenced by libANGLE.lib(CompilerGL.obj):(.SCOVP$M)
>>> referenced by libANGLE.lib(CompilerGL.obj):(.SCOVP$M)
...
lld-link: error: relocation against symbol in discarded section: .text
>>> referenced by obj/third_party/angle/libGLESv2/entry_points_egl_ext.obj:(.SCOVP$M)
>>> referenced by obj/third_party/angle/libGLESv2/entry_points_egl_ext.obj:(.SCOVP$M)
...
I think the new output is more useful.
Reviewers: ruiu, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54240
llvm-svn: 346427
This change allows for link-time merging of debugging information from
Microsoft precompiled types OBJs compiled with cl.exe /Z7 /Yc and /Yu.
This fixes llvm.org/PR34278
Differential Revision: https://reviews.llvm.org/D45213
llvm-svn: 346154
Normally one wouldn't run into that case, but it is possible with
a little creative ordering of special libraries.
Differential Revision: https://reviews.llvm.org/D53388
llvm-svn: 344776
This a resubmission of a patch which was previously reverted
due to breaking several lld tests. The issues causing those
failures have been fixed, so the patch is now resubmitted.
---Original Commit Message---
While it doesn't make a *ton* of sense for POSIX paths to be
in PDBs, it's possible to occur in real scenarios involving
cross compilation.
The tools need to be able to handle this, because certain types
of debugging scenarios are possible without a running process
and so don't necessarily require you to be on a Windows system.
These include post-mortem debugging and binary forensics (e.g.
using a debugger to disassemble functions and examine symbols
without running the process).
There's changes in clang, LLD, and lldb in this patch. After
this the cross-platform disassembly and source-list tests pass
on Linux.
Furthermore, the behavior of LLD can now be summarized by a much
simpler rule than before: Unless you specify /pdbsourcepath and
/pdbaltpath, the PDB ends up with paths that are valid within
the context of the machine that the link is performed on.
Differential Revision: https://reviews.llvm.org/D53149
llvm-svn: 344377
This was originally causing some test failures on non-Windows
platforms, which required fixes in the compiler and linker. After
those fixes, however, other tests started failing. Reverting
temporarily until I can address everything.
llvm-svn: 344279
While it doesn't make a *ton* of sense for POSIX paths to be
in PDBs, it's possible to occur in real scenarios involving
cross compilation.
The tools need to be able to handle this, because certain types
of debugging scenarios are possible without a running process
and so don't necessarily require you to be on a Windows system.
These include post-mortem debugging and binary forensics (e.g.
using a debugger to disassemble functions and examine symbols
without running the process).
There's changes in clang, LLD, and lldb in this patch. After
this the cross-platform disassembly and source-list tests pass
on Linux.
Furthermore, the behavior of LLD can now be summarized by a much
simpler rule than before: Unless you specify /pdbsourcepath and
/pdbaltpath, the PDB ends up with paths that are valid within
the context of the machine that the link is performed on.
Differential Revision: https://reviews.llvm.org/D53149
llvm-svn: 344269
When these are accessed with load/store instructions on ARM64,
it becomes strictly necessary to have them properly aligned.
This fixes PR39228.
Differential Revision: https://reviews.llvm.org/D53128
llvm-svn: 344264
This allows using #pragma comment(lib, "foo") in MinGW built code,
if built with -fms-extensions. (This works for system libraries and
static libraries only, as it doesn't try to look for .dll.a. As
ld.bfd doesn't support embedded defaultlib directives, this isn't
in widespread use among mingw users.)
Differential Revision: https://reviews.llvm.org/D53017
llvm-svn: 344124
Summary: Before, OptTable::PrintHelp append "[options] <inputs>" to its parameter `Help`. It is more flexible to change its semantic to `Usage` and let user customize the usage line.
Reviewers: rupprecht, ruiu, espindola
Reviewed By: rupprecht
Subscribers: emaste, sbc100, arichardson, aheejin, llvm-commits
Differential Revision: https://reviews.llvm.org/D53054
llvm-svn: 344099
/pdbsourcepath: was added in https://reviews.llvm.org/D48882 to make it
possible to have relative paths in the debug info that clang-cl writes.
lld-link then makes the paths absolute at link time, which debuggers require.
This way, clang-cl's output is independent of the absolute path of the build
directory, which is useful for cacheability in distcc-like systems.
This patch extends /pdbsourcepath: (if passed) to also be used for:
1. The "cwd" stored in the env block in the pdb is /pdbsourcepath: if present
2. The "exe" stored in the env block in the pdb is made absolute relative
to /pdbsourcepath: instead of the cwd
3. The "pdb" stored in the env block in the pdb is made absolute relative
to /pdbsourcepath: instead of the cwd
4. For making absolute paths to .obj files referenced from the pdb
/pdbsourcepath: is now useful in three scenarios (the first one already working
before this change):
1. When building with full debug info, passing the real build dir to
/pdbsourcepath: allows having clang-cl's output to be independent
of the build directory path. This patch effectively doesn't change
behavior for this use case (assuming the cwd is the build dir).
2. When building without compile-time debug info but linking with /debug,
a fake fixed /pdbsourcepath: can be passed to get symbolized stacks
while making the pdb and exe independent of the current build dir.
For this two work, lld-link needs to be invoked with relative paths for
the lld-link invocation itself (for "exe"), for the pdb output name, the exe
output name (for "pdb"), and the obj input files, and no absolute path
must appear on the link command (for "cmd" in the pdb's env block).
Since no full debug info is present, it doesn't matter that the absolute
path doesn't exist on disk -- we only get symbols in stacks.
3. When building production builds with full debug info that don't have
local changes, and that get source indexed and their pdbs get uploaded
to a symbol server. /pdbsourcepath: again makes the build output independent
of the current directory, and the fixed path passed to /pdbsourcepath: can
be given the source indexing transform so that it gets mapped to a
repository path. This has the same requirements as 2.
This patch also makes it possible to create PDB files containing Windows-style
absolute paths when cross-compiling on a POSIX system.
Differential Revision: https://reviews.llvm.org/D53021
llvm-svn: 344061
ld.bfd doesn't do any inference of subsystem; unless the windows
subsystem is specified, the console subsystem is used.
For the console subsystem, the entry point is called mainCRTStartup,
regardless of whether the the user code entry point is main or wmain.
The same goes for the windows subsystem, where the entry point always
is WinMainCRTStartup, for both WinMain and wWinMain in user code.
One detail that we don't emulate, is that if the inferred entry point
is undefined, ld.bfd silently just sets the entry point to the start
of the image. And if an explicit entry point is set, but it is
undefined, the link still succeeds but the linker warns about the
entry point not being found.
Differential Revision: https://reviews.llvm.org/D52931
llvm-svn: 343879
For certain cases of inline functions written to comdat sections,
GCC 5.x produces a weak symbol in addition, which would end up
undefined in some cases.
This no longer seems to happen with GCC 6.x or newer though.
Differential Revision: https://reviews.llvm.org/D52602
llvm-svn: 343877
(patch by Benoit Rousseau)
This patch fixes a bug where the global variable initializers were sometimes not invoked in the correct order when it involved a C++ template instantiation.
Differential Revision: https://reviews.llvm.org/D52749
llvm-svn: 343847
When GNU tools create a weak alias, they produce a strong symbol
named .weak.<weaksymbol>.<relatedstrongsymbol>.
GNU ld allows many such weak alternatives for the same weak symbol, and
the linker picks the first one encountered.
This can't be reproduced by assembling from .s files, since llvm-mc
produces symbols named .weak.<weaksymbol>.default in these cases.
Differential Revision: https://reviews.llvm.org/D52601
llvm-svn: 343704
Three related changes:
1. link.exe uses the presence of main and wmain to decide if it should call
mainCRTStartup or wmainCRTStartup, even if /nodefaultlib is passed. For
compatibility, remove FindMain logic.
2. Default to the non-wide entrypoint if main is not found. This has two effects:
2a. In normal links, lld-link now prints
lld-link: error: undefined symbol: _main
>>> referenced by f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:78
>>> libcmt.lib(exe_main.obj):("int __cdecl invoke_main(void)" (?invoke_main@@YAHXZ))
>>> referenced by f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283
>>> libcmt.lib(exe_main.obj):("int __cdecl __scrt_common_main_seh(void)" (?__scrt_common_main_seh@@YAHXZ))
instead of
lld-link: error: entry point must be defined
This is arguably a better error message, since it now mentions that _main is
missing. (This matches link.exe's diagnostic in this case.)
2b. With /nodefautlib, we now default to mainCRTStartup if no main() is
present, again matching link.exe. This makes r337407 obsolete.
This means if you have a cc file containing both mainCRTStartup and
wmainCRTStartup and you pass /nodefaultlib /subsystem:console, lld-link will
now call mainCRTStartup, matching link.exe
3. Print a warning if both main and wmain are present, similar to link.exe's
LNK4067.
Differential Revision: https://reviews.llvm.org/D52832
llvm-svn: 343698
When GCC produces a jump table as part of a comdat function, the
jump table itself is produced as plain non-comdat rdata section. When
linked with ld.bfd, all of those rdata sections are kept, with
relocations unchanged in the sections that refer to discarded comdat
sections.
This has been observed with at least GCC 5.x and 7.x.
Differential Revision: https://reviews.llvm.org/D52600
llvm-svn: 343422
This involves adding more generic list of symbol suffixes/prefixes
to ignore for autoexport; adding a few other entries to these lists
as well from the corresponding lists in binutils.
Differential Revision: https://reviews.llvm.org/D52382
llvm-svn: 343070
Don't assume that the IAT chunk will be a DefinedImportData, it can
just as well be a DefinedRegular for gnu import libraries.
Differential Revision: https://reviews.llvm.org/D52381
llvm-svn: 343069
This is a feature that MS link.exe lacks; it currently errors out on
such relocations, just like lld did before.
This allows linking clang.exe for ARM - practically, any image over
16 MB will likely run into the issue.
Differential Revision: https://reviews.llvm.org/D52156
llvm-svn: 342962
Implement final argument precedence if multiple /debug arguments are passed on the command-line to match expected link.exe behavior.
Support /debug:none and emit warning for /debug:fastlink with automatic fallback to /debug:full.
Emit error if last /debug:option is unknown.
Emit warning if last /debugtype:option is unknown.
https://reviews.llvm.org/D50404
llvm-svn: 342894
GNU binutils import libraries aren't the same kind of short import
libraries as link.exe and LLD produce, but are a plain static library
containing .idata section chunks. MSVC link.exe can successfully link
to them.
In order for imports from GNU import libraries to mix properly with the
normal import chunks, the chunks from the existing mechanism needs to
be added into named sections like .idata$2.
These GNU import libraries consist of one header object, a number of
object files, one for each imported function/variable, and one trailer.
Within the import libraries, the object files are ordered alphabetically
in this order. The chunks stemming from these libraries have to be
grouped by what library they originate from and sorted, to make sure
the section chunks for headers and trailers for the lists are ordered
as intended. This is done on all sections named .idata$*, before adding
the synthesized chunks to them.
Differential Revision: https://reviews.llvm.org/D38513
llvm-svn: 342777
The __NULL_IMPORT_DESCRIPTOR symbol has two leading underscores on
architectures other than i386 as well; it is not a mangled symbol name.
llvm-svn: 342448
Previously, lld-link would use a random byte sequence as the PDB GUID. Instead,
use a hash of the PDB file contents.
To not disturb llvm-pdbutil pdb2yaml, the hash generation is an opt-in feature
on InfoStreamBuilder and ldb/COFF/PDB.cpp always sets it.
Since writing the PDB computes this ID which also goes in the exe, the PDB
writing code now must be called before writeBuildId(). writeBuildId() for that
reason is no longer included in the "Code Layout" timer.
Since the PDB GUID is now a function of the PDB contents, the PDB Age is always
set to 1. There was a long comment above loadExistingBuildId (now gone) about
how not changing the GUID and only incrementing the age was important, but
according to the discussion in PR35914 that comment was incorrect.
Differential Revision: https://reviews.llvm.org/D51956
llvm-svn: 342334
For this, add a few toString() calls when printing the "undefined symbol"
diagnostics; toString() already does demangling on Windows hosts.
Also make lld::demangleMSVC() (called by toString(Symbol*)) call LLVM's
microsoftDemangle() instead of UnDecorateSymbolName() so that it works on
non-Windows hosts – this makes both updating tests easier and provides a better
user experience for people doing cross-links.
This doesn't yet do the right thing for symbols starting with __imp_, but that
can be improved in a follow-up.
Differential Revision: https://reviews.llvm.org/D52104
llvm-svn: 342332
MinGW uses these kind of list terminator symbols for traversing
the constructor/destructor lists. These list terminators are
actual pointers entries in the lists, with the values 0 and
(uintptr_t)-1 (instead of just symbols pointing to the start/end
of the list).
(This mechanism exists in both the mingw-w64 crt startup code and
in libgcc; normally the mingw-w64 one is used, but a DLL build of
libgcc uses the libgcc one. Therefore it's not trivial to change
the mechanism without lots of cross-project synchronization and
potentially invalidating some combinations of old/new versions
of them.)
When mingw-w64 has been used with lld so far, the CRT startup object
files have so far provided these symbols, ending up with different,
incompatible builds of the CRT startup object files depending on
whether binutils or lld are going to be used.
In order to avoid the need of different configuration of the CRT startup
object files depending on what linker to be used, provide these symbols
in lld instead. (Mingw-w64 checks at build time whether the linker
provides these symbols or not.) This unifies this particular detail
between the two linkers.
This does disallow the use of the very latest lld with older versions
of mingw-w64 (the configure check for the list was added recently;
earlier it simply checked whether the CRT was built with gcc or clang),
and requires rebuilding the mingw-w64 CRT. But the number of users of
lld+mingw still is low enough that such a change should be tolerable,
and unifies this aspect of the toolchains, easing interoperability
between the toolchains for the future.
The actual test for this feature is added in ctors_dtors_priority.s,
but a number of other tests that checked absolute output addresses
are updated.
Differential Revision: https://reviews.llvm.org/D52053
llvm-svn: 342294