In MinGW configurations (GCC, or clang with a *-windows-gnu target),
the -export directives in the object file contains the undecorated
symbol name, while it is decorated in MSVC configurations. (On the
command line, link.exe takes an undecorated symbol name for the
-export argument though.)
Differential Revision: https://reviews.llvm.org/D37772
llvm-svn: 313174
It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").
Differential Revision: https://reviews.llvm.org/D37589
llvm-svn: 312744
Summary:
Previous would throw warning whenever libxml2 is not installed. Now
only give this warning if merging manifest fails.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37240
llvm-svn: 312604
If a symbol is locally defined and is DLL imported in another
translation unit, and the object with the locally defined version is
loaded prior to the imported version, then the linker will fail to
resolve the definition of the thunk and return the locally defined
symbol. This will then be attempted to be cast to an import thunk,
which will clearly fail.
Only return the thunk if the symbol is inserted or a thunk is created.
Otherwise, report a duplication error.
llvm-svn: 312386
We have llvm-readobj for dumping CodeView from object files, and
llvm-pdbutil has always been more focused on PDB. However,
llvm-pdbutil has a lot of useful options for summarizing debug
information in aggregate and presenting high level statistical
views. Furthermore, it's arguably better as a testing tool since
we don't have to write tests to conform to a state-machine like
structure where you match multiple lines in succession, each
depending on a previous match. llvm-pdbutil dumps much more
concisely, so it's possible to use single-line matches in many
cases where as with readobj tests you have to use multi-line
matches with an implicit state machine.
Because of this, I'm adding object file support to llvm-pdbutil.
In fact, this mirrors the cvdump tool from Microsoft, which also
supports both object files and pdb files. In the future we could
perhaps rename this tool llvm-cvutil.
In the meantime, this allows us to deep dive into object files
the same way we already can with PDB files.
llvm-svn: 312358
This reverts commit r312171 because it is pointed out that that's not a
correct fix (see https://bugs.llvm.org/show_bug.cgi?id=32674#c14) and
also because it broke buildbots.
llvm-svn: 312174
MSVC link.exe supports nested static libraries. That is, an .a file can
contain other .a file as its member. It is reported that MySQL actually
depends on this feature.
Fixes https://bugs.llvm.org/show_bug.cgi?id=32674
llvm-svn: 312171
Summary: Now that the llvm-mt manifest merging libraries are complete, we may use them to merge manifests instead of needing to shell out to mt.exe.
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D36255
llvm-svn: 311424
This adds support for dumping a summary of module symbols
and CodeView debug chunks. This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size. Then
at the end it prints the totals for the entire file.
Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.
llvm-svn: 311338
When creating an import library from lld, the cases with
Name != ExtName shouldn't end up as a weak alias, but as a real
export of the new name, which is what actually is exported from
the DLL.
This restores the behaviour of renamed exports to what it was in
4.0.
Differential Revision: https://reviews.llvm.org/D36634
llvm-svn: 310992
Since SVN r303491 and r304573, LLD used the COFFImportLibrary
functions from LLVM. These only had two names, Name and ExtName,
which wasn't enough to convey all the details of stdcall functions.
Stdcall functions got the wrong symbol name in the import library
itself in r303491, which is why it was reverted in r304561. When
re-landed and fixed in r304573 (after adding a test in r304572),
the symbol name itself in the import library ended up right, but the
name type of the import library entry was wrong.
This had the effect that linking to the import library succeeded
(contrary to in r303491, where linking to such an import library
failed), but at runtime, the symbol wouldn't be found in the DLL
(since the caller linked to the stdcall decorated name).
Differential Revision: https://reviews.llvm.org/D36545
llvm-svn: 310989
Previously, our algorithm to compute a build id involved hashing the
executable and storing that as the GUID in the CV Debug Record chunk,
and setting the age to 1.
This breaks down in one very obvious case: a user adds some newlines to
a file, rebuilds, but changes nothing else. This causes new line
information and new file checksums to get written to the PDB, meaning
that the debug info is different, but the generated code would be the
same, so we would write the same build over again with an age of 1.
Anyone using a symbol cache would have a problem now, because the
debugger would open the executable, look at the age and guid, find a
matching PDB in the symbol cache and then load it. It would never copy
the new PDB to the symbol cache.
This patch implements the canonical Windows algorithm for updating
a build id, which is to check the existing executable first, and
re-use an existing GUID while bumping the age if it already
exists.
Differential Revision: https://reviews.llvm.org/D36758
llvm-svn: 310961
These are emitted for comm symbols in object files, when targeting
a GNU environment.
Alternatively, just ignore them since we already align CommonChunk
to the natural size of the content (up to 32 bytes). That would only
trade away the possibility to overalign small symbols, which doesn't
sound like something that might not need to be handled?
Differential Revision: https://reviews.llvm.org/D36304
llvm-svn: 310871
We don't have the right algorithm for copying S_UDT symbols
from object files to the globals stream, and having it wrong
is worse than not having it at all, since it breaks display
of local variables of UDT types (for example, "dv Foo" fails
in our current implementation, but succeeds if the S_UDT records
are omitted). Omit them until we fix the algorithm.
llvm-svn: 310867
Previously we were writing an empty globals stream. Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this. Regardless,
without it we don't have information about global variables, so
we need to fix it anyway. This patch does that.
With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.
Differential Revision: https://reviews.llvm.org/D36535
llvm-svn: 310743
In the refactor to merge the publics and globals stream, a bug
was introduced that wrote the wrong value for one of the fields
of the PublicsStreamHeader. This caused debugging in WinDbg
to break.
We had no way of dumping any of these fields, so in addition to
fixing the bug I've added dumping support for them along with a
test that verifies the correct value is written.
llvm-svn: 310439
The publics stream and globals stream are very similar. They both
contain a list of hash buckets that refer into a single shared stream,
the symbol record stream. Because of the need for each builder to manage
both an independent hash stream as well as a single shared record
stream, making the two builders be independent entities is not the right
design. This patch merges them into a single class, of which only a
single instance is needed to create all 3 streams. PublicsStreamBuilder
and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder
class, which writes all 3 streams at once.
Note that this patch does not contain any functionality change. So we're
still not yet writing any records to the globals stream. All we're doing
is making it so that when we do start writing records to the globals,
this refactor won't have to be part of that patch.
Differential Revision: https://reviews.llvm.org/D36489
llvm-svn: 310438
The compiler outputs PROC32_ID symbols into the object files
for functions, and these symbols have an embedded type index
which, when copied to the PDB, refer to the IPI stream. However,
the symbols themselves are also converted into regular symbols
(e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular
symbol records refer to the TPI stream. So this patch applies
two fixes to function records.
1. It converts ID symbols to the proper non-ID record type.
2. After remapping the type index from the object file's index
space to the PDB file/IPI stream's index space, it then
remaps that index to the TPI stream's index space by.
Besides functions, during the remapping process we were also
discarding symbol record types which we did not recognize.
In particular, we were discarding S_BPREL32 records, which is
what MSVC uses to describe local variables on the stack. So
this patch fixes that as well by copying them to the PDB.
Differential Revision: https://reviews.llvm.org/D36426
llvm-svn: 310394
Image section headers are stored in the DBI stream, but we
had no way to dump them. This patch adds dumping support,
along with some tests that LLD actually dumps them correctly.
Differential Revision: https://reviews.llvm.org/D36332
llvm-svn: 310107
Summary:
PDB section contributions are supposed to use output section indices and
offsets, not input section indices and offsets.
This allows the debugger to look up the index of the module that it
should look up in the modules stream for symbol information. With this
change, windbg can now find line tables, but it still cannot print local
variables.
Fixes PR34048
Reviewers: zturner
Subscribers: hiraditya, ruiu, llvm-commits
Differential Revision: https://reviews.llvm.org/D36285
llvm-svn: 309987
We don't write any actual symbols to this stream yet, but for
now we just create the stream and hook it up to the appropriate
places and give it a valid header.
Differential Revision: https://reviews.llvm.org/D35290
llvm-svn: 309608
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.
Reviewers: zturner, ruiu
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D35871
llvm-svn: 309303
Also handle overflow correctly in LDR/STR relocations. Even if the
offset range of a 8 byte LDR instruction is 15 bit (even if the immediate
itself is 12 bit) due to a 3 bit shift, only include up to 12 bits of offset
after doing the relocation, by limiting the range of the immediate by the
number of shifted bits.
Differential Revision: https://reviews.llvm.org/D35792
llvm-svn: 309175
The test used /manifestinput: without /manifest:embed, which isn't actually
supported. Just remove this part of the test for now; if it's important to
check this the llvm-readobj part should be extended to check this.
llvm-svn: 309002
Also emit an error if /manifestinput: is used without /manifest:embed.
Increases compatibility with link.exe
https://reviews.llvm.org/D35842
llvm-svn: 308998
The same adjustment is already done for the entry point in
Writer.cpp and for relocations that point to executable code
in Chunks.cpp.
Differential Revision: https://reviews.llvm.org/D35767
llvm-svn: 308953
Also extend the tests for IMAGE_REL_ARM64_PAGEOFFSET_12L to test
all 8/16/32/64 bit GPR and 8/16/32/64/128 SIMD/FP bit ldr/str variants,
and a ldr with an existing offset.
Differential revision: https://reviews.llvm.org/D35647
llvm-svn: 308631
This fixes cases on ARM64 when importing from more than one DLL,
in case the imports from the first DLL ended up unaligned.
When fixing up a IMAGE_REL_ARM64_PAGEOFFSET_12L, which shifts the
offset by the load/store size, check that the shift doesn't discard
any bits. (This would also detect if the import address chunks were
unaligned.)
Differential revision: https://reviews.llvm.org/D35640
llvm-svn: 308585
This test is folded into implib-name. Don't bother with the racy test.
The use of %T results in left-overs from previous tests which write out
the same import library as this test.
llvm-svn: 308409
Improve the link conformance for the import name embedded into the
import library. This requires the associated change to the LLVM portion
for the DEF file parser. The import file generation embeds a different
name based on whether the driver is invoked as "link" or "lib".
Furthermore, the LIBRARY keyword in the DEF file influences the import
name. The behaviour can be summarised according to the following table:
| LIBRARY w/ ext | LIBRARY w/o ext | no LIBRARY
-----+----------------+---------------------+------------------
LINK | {value} | {value}.{.dll/.exe} | {output name}
LIB | {value} | {value}.dll | {output name}.dll
llvm-svn: 308407
Noticed while testing for an out of tree target. There are probably more tests that should be so marked.
I'm not sure who owns these tests so I've added a few names I recognise from the recent history.
With advice from probinson, ruiu, rafael and dramatically improved by davidb. Thank you all!
Differential Revision: https://reviews.llvm.org/D34685
llvm-svn: 308335
DWARF debug sections can also contain relocations against symbols in
discared segments. LLD should accept such relocations.
Differential Revision: https://reviews.llvm.org/D35526
llvm-svn: 308315
Summary:
Object files compiled with /Zi emit type information into a type server
PDB. The .debug$S section will contain a single TypeServer2Record with
the absolute path and GUID of the type server. LLD needs to load the
type server PDB and merge all types and items it finds in it into the
destination PDB.
Depends on D35495
Reviewers: ruiu, inglorion
Subscribers: zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D35504
llvm-svn: 308235
Summary:
We were treating the GUIDs in TypeServer2Record as strings, and the
non-ASCII bytes in the GUID would not round-trip through YAML.
We already had the PDB_UniqueId type portably represent a Windows GUID,
but we need to hoist that up to the DebugInfo/CodeView library so that
we can use it in the TypeServer2Record as well as in PDB parsing code.
Reviewers: inglorion, amccarth
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D35495
llvm-svn: 308234
Summary:
Instead of wiring these through the CVTypeVisitor interface, clients
should inspect the CVTypeArray before visiting it and potentially load
up the type server's TPI stream if they need it.
No tests relied on this functionality because LLD was the only client.
Reviewers: ruiu
Subscribers: mgorny, hiraditya, zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D35394
llvm-svn: 308212
Summary:
This would have caught the invalid object file I used in my test case in
r307726. The OOB was only caught by ASan later, which is slow and
doesn't work on some platforms. LLD should do some basic input
validation itself. This check isn't perfect, so relocations can reach
OOB by up to seven bytes, but it's better than what we had and probably
cheap.
Reviewers: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35371
llvm-svn: 307948
Summary:
This fixes type indices for SDK or CRT static archives. Previously we'd
try to look next to the archive object file path, which would not exist
on the local machine.
Also error out if we can't resolve a type server record. Hypothetically
we can recover from this error by discarding debug info for this object,
but that is not yet implemented.
Reviewers: ruiu, amccarth
Subscribers: aprantl, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D35369
llvm-svn: 307946
Revert "[PDB] Tweak bad type index error handling"
check-lld with asan detects use-after-poison.
This reverts commits r307733 and r307726.
llvm-svn: 307752
Translate invalid type indices to a sentinel value instead of skipping
the record. Skipping records isn't a good recovery method, because we
can skip a scope open or close record, which will confuse the scope
management code.
We currently have lots of invalid type indices on Microsoft-provided
standard libraries, because the LF_TYPESERVER2 records contain absolute
paths that are only valid on their build servers. Our type server
handlers need to look at other things (GUIDs) to find these type server
PDBs.
llvm-svn: 307726
This is enough to link a working hello world executable, with
a call to an imported function, a string constant passed to
the imported function, and loads from a global variable.
Differential Revision: https://reviews.llvm.org/D34964
llvm-svn: 307629
This is part of the continuing effort to increase parity between
LLD and MSVC PDBs. link still doesn't like our PDBs, so the most
obvious thing to check was whether adding an empty publics stream
would get it to do something else. It still fails in the same way
but at least this removes one more variable from the equation.
The next logical step would be to try creating an empty globals
stream.
Differential Revision: https://reviews.llvm.org/D35224
llvm-svn: 307598
This was originally reverted because of two issues.
1) Printing ANSI color escape codes even when outputting to
a file
2) Module name comparisons were failing when comparing a PDB
generated on one machine to a PDB generated on another
machine.
I attempted to fix#2 by adding command line options which let
you specify prefixes to strip from the beginning of embedded
paths, which effectively lets us specify a path to "base" each
PDB from and only compare the parts under the base. But this is
tricky because PDB paths always use Windows path syntax, even
when they are created on non-Windows hosts. A problem still
existed when constructing the prefix to strip, where we were
accidentally using a host-specific path separator instead of
a Windows path separator.
This resubmission fixes the issue on Linux (and I have verified
that the test now passes on Linux).
llvm-svn: 307571
A test was checked in on Friday that worked by checking in an
object file and PDB generated locally by MSVC, and then having
the test run lld-link on the object file and diffing LLD's PDB
against the checked in PDB.
This failed because part of the diffing algorithm involves
determining if two modules are the same, and if so drilling into
the module and diffing individual fields of the module. The
only thing we can use to make this determination though is the
"name" of the module, which is a path to where the module (obj
file) was read from on the machine where it was linked. This
fails for obvious reasons when comparing a PDB generated on one
machine to a PDB on another machine.
The fix employed here is to add two command line options to the
diff subcommand, which allow the user to specify a "binary root
path". The bin root path, if specified, is stripped from the
beginning of any embedded PDB paths. The test is updated to
specify the user's local test output directory for the left
PDB, and is hardcoded to the location where the original PDB
was created for the right PDB. This way all the equivalence
comparisons should succeed.
llvm-svn: 307555
This reverts commit 147f45ff24456aea59575fa4ac16c8fa554df46a.
Revert "Revert "Revert "Revert "Replace trivial use of external rc.exe by writing our own .res file.""""
This reverts commit 61a90a67ed54a1f0dfeab457b65abffa129569e4.
The patches were intially reverted because they were causing a failure
on CrWinClangLLD. Unfortunately, this was done haphazardly and didn't
compile, so the revert was reverted again quickly to fix this. One that
was done, the revert of the revert was itself reverted. This allowed me
to finally fix the actual bug in r307452. This patch re-enables the
code path that had originally been causing the bug, now that it (should)
be fixed.
llvm-svn: 307460
Summary:
The original cvtres.exe sets the high bit when an identifier offset
points to a string. Even though this is not mentioned in the spec, and
in fact does not seem to cause errors with most cases, for some reason
this causes a failure in Chromium where the new resource file is not
verified as a new version. This patch sets this high bit flag, and also
adds a test case to check that the output of our library is always
identical to original cvtres.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D35099
llvm-svn: 307452
1) Don't write a /src/headerblock stream. This appears to be
written conditionally by MSVC, but it's not clear what the
condition is. For now, just remove it since we dont' know
what it is anyway and the particular pdb we've checked in
for the test doesn't have one.
2) Write a valid timestamp for the PDB file signature. This
leads to non-reproducible builds, but it matches the default
behavior of link, so it should be out default as well. If
we need reproducibility, we should add a separate command
line option for it that is off by default.
3) Write an empty FPO stream. MSVC seems to always write an
FPO stream. This change makes the stream directory match
up, although we still need to make the contents of the FPO
stream match.
llvm-svn: 307436
Some platforms require an explicit specialization of std::hash
for PdbRaw_FeaturesSig. Also a test involving case sensitivity
needed to be fixed. For now that particular check just accepts
any path even if they're completely different. Long term we
should output paths in the correct case to match MSVC.
llvm-svn: 307426
Without this we would just append whatever the user
wrote on the command line, so if we're in C:\foo
and we run lld-link bar/baz.obj, we would write
C:\foo\bar/baz.obj in various places in the PDB.
MSVC linker does not do this, so we shouldn't either.
This fixes some differences in the diff test, so we
update the test as well.
Differential Revision: https://reviews.llvm.org/D35092
llvm-svn: 307423
A couple of things were different about our generated PDBs.
1) We were outputting the wrong Version on the PDB Stream.
The version we were setting was newer than what MSVC is setting.
It's not clear what the implications are, but we change LLD
to use PdbImplVC70, as MSVC does.
2) For the optional debug stream indices in the DBI Stream, we
were outputting 0 to mean "the stream is not present". MSVC
outputs uint16_t(-1), which is the "correct" way to specify
that a stream is not present. So we fix that as well.
3) We were setting the PDB Stream signature to 0. This is supposed
to be the result of calling time(nullptr). Although this leads
to non-deterministic builds, a better way to solve that is by
having a command line option explicitly for generating a
reproducible build, and have the default behavior of lld-link
match the default behavior of link.
To test this, I'm making use of the new and improved `pdb diff`
sub command. To make it suitable for writing tests against, I had
to modify the diff subcommand slightly to print less verbose output.
Previously it would always print | <column> | <value1> | <value2> |
which is quite verbose, and the values are fragile. All we really
want to know is "did we produce the same value as link?" So I added
command line options to print a single character representing the
result status (different, identical, equivalent), and another to
hide the value display. Note that just inspecting the diff output
used to write the test, you can see some things that are obviously
wrong. That is just reflective of the fact that this is the state
of affairs today, not that we're asserting that this is "correct".
We can use this as a starting point to discover differences, fix
them, and update the test.
Differential Revision: https://reviews.llvm.org/D35086
llvm-svn: 307422
Summary:
There are a variety of records that open scopes: function scopes, block
scopes, and inlined call site scopes. These symbol records contain
Parent and End fields with the offsets of other symbol records. The End
field contains the offset of the matching S_END or S_INLINESITE_END
record. The Parent field contains the offset of the parent record, or 0
if this is a top-level scope (i.e. a function).
With this change, `llvm-pdbutil pretty -all` no longer crashes on PDBs
produced by LLD. I haven't tried a real debugger yet.
Reviewers: zturner, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34898
llvm-svn: 307278
This reverts commit ae21ee0b6cacbc1efaf4d42502e71da2f0eb45c3.
The initial revert was done in order to prevent ongoing errors on
chromium bots such as CrWinClangLLD. However, this was done haphazardly
and I didn't realize there were test and compilation failures, so this
revert was reverted. Now that those have been fixed, we can revert the
revert of the revert.
llvm-svn: 307227
This reverts commit 5fecbbbe5049665d86834cf69d8f75db4f392308.
The initial revert was done in order to prevent ongoing errors on
chromium bots such as CrWinClangLLD. However, this was done haphazardly
and I didn't realize there were test and compilation failures, so this
revert was reverted. Now that those have been fixed, we can revert the
revert of the revert.
llvm-svn: 307226
This reverts commit 600d52c278e123dd08bee24c1f00932b55add8de.
This patch still seems to break CrWinClangLLD, reverting until I can
find root problem.
llvm-svn: 307189
This patch still seems to break CrWinClangLLD, reverting this once more
until I can discover root problem.
This reverts commit 3dbbc8ce43be50ffde2b1c655c6d3a25796fe78b.
llvm-svn: 307188
A plain empty entry point function that returns 0 seems to produce
a binary that loads and runs fine in wine.
Differential Revision: https://reviews.llvm.org/D34833
llvm-svn: 306963
Summary:
This reverts commit 51931072a7c9a52540baf76fc30ef391d2529a2f.
This revert was originally done because the integrations of the new
WindowsResource library into LLD was causing error in chromium, due to
bugs in how resource sections were handled. These bugs were fixed,
meaning that the features may be reintegrated.
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D34922
llvm-svn: 306941
Type records have a unique type index, but symbol records do
not. Instead, symbol records refer to other symbol records
by referencing their offset in the symbol stream. In a sense
this is the analogue of the TypeIndex, but we are not printing
it in the dumper. Printing it not only gives us more useful
information when manually investigating the contents of a PDB,
but also allows us to write better tests by enabling us to
verify that fields that reference other symbol records do
so correctly.
Differential Revision: https://reviews.llvm.org/D34906
llvm-svn: 306890
Summary:
There have been bugs with the WindowsResource library, such as incorrect
symbols for addresses. Directly checking the .rsrc in the final PE will
help ensure this doesn't happen again.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34900
llvm-svn: 306854
This reverts commit d4c7e9fc63c10dbab0c30186ef8575474a704496.
This is done in order to address the failure of CrWinClangLLD etc. bots.
These throw an error of "side-by-side configuration is incorrect" during
compilation, which sounds suspiciously related to these manifest
changes.
Revert "Switch external cvtres.exe for llvm's own resource library."
This reverts commit 71fe8ef283a9dab9a3f21432c98466cbc23990d1.
llvm-svn: 306618
Summary:
In order to do this without switching on the symbol kind multiple times,
I created Defined::getChunkAndOffset and use that instead of
SymbolBody::getRVA in the inner relocation loop.
Now we get the symbol's chunk before switching over relocation types, so
we can test if it has been discarded outside the inner relocation type
switch. This also simplifies application of section relative
relocations. Previously we would switch on symbol kind to compute the
RVA, then the relocation type, and then the symbol kind again to get the
output section so we could subtract that from the symbol RVA. Now we
*always* have an OutputSection, so applying SECREL and SECTION
relocations isn't as much of a special case.
I'm still not quite happy with the cleanliness of this code. I'm not
sure what offsets and bases we should be using during the relocation
processing loop: VA, RVA, or OutputSectionOffset.
Reviewers: ruiu, pcc
Reviewed By: ruiu
Subscribers: majnemer, inglorion, llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D34650
llvm-svn: 306566
Summary: The testing on the resource section of executables produced by lld has been very lax, and allowed a major bug to go unnoticed when we switched from shelling out to cvtres.exe to using llvm's own library. These additional tests should cover all the major failure points.
Reviewers: zturner, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34664
llvm-svn: 306465
This patch removes the dependency on the external rc.exe tool by writing
a simple .res file using our own library. In this patch I also added an
explicit definition for the .res file magic. Furthermore, I added a
unittest for embeded manifests and fixed a bug exposed by the test.
llvm-svn: 306311
Summary:
They do the obvious thing: provide the section index of .bss and the
offset of the symbol in .bss.
Reviewers: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34628
llvm-svn: 306304
Over time we've started to add inputs and test cases using the .yaml
extension, which seems to be preferred over the .objtxt extension that
we were using initially. One nice thing about using .yaml is that it
triggers existing editor highlighting and formatting support.
Fix two pdb*.yaml test cases that I added that weren't being run as part
of check-lld.
llvm-svn: 306303
Summary:
The main change is that we can have SECREL and SECTION relocations
against ___safe_se_handler_table, which is important for handling the
debug info in the MSVCRT.
Previously we were using DefinedRelative for __safe_se_handler_table and
__ImageBase, and after we implement CFGuard, we plan to extend it to
handle __guard_fids_table, __guard_longjmp_table, and more. However,
DefinedRelative is really only suitable for implementing __ImageBase,
because it lacks a Chunk, which you need in order to figure out the
output section index and output section offset when resolving SECREl and
SECTION relocations.
This change renames DefinedRelative to DefinedSynthetic and gives it a
Chunk. One wart is that __ImageBase doesn't have a chunk. It points to
the PE header, effectively. We could split DefinedRelative and
DefinedSynthetic if we think that's cleaner and creates fewer special
cases.
I also added safeseh.s, which checks that we don't emit a safe seh table
entries pointing to garbage collected handlers and that we don't emit a
table at all when there are no handlers.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: inglorion, pcc, llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D34577
llvm-svn: 306293
Summary:
For SECTION relocations against absolute symbols, MSVC emits the largest
output section index plus one. I've implemented that by threading a
global variable through DefinedAbsolute that is filled in by the Writer.
A more library-oriented approach would be to thread the Writer through
Chunk::writeTo and SectionChunk::applyRel*, but Rui seems to prefer
doing it this way.
MSVC rejects SECREL relocations against absolute symbols, but only when
the relocation is in a real output section. When the relocation is in a
CodeView debug info section destined for the PDB, it seems that this
relocation error is suppressed, and absolute symbols become zeros in the
object file. This is easily implemented by checking the input section
from which we're applying relocations.
This should fix errors about __safe_se_handler_table and
__guard_fids_table when linking the CRT and generating a PDB.
Reviewers: ruiu
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D34541
llvm-svn: 306071
Now you run llvm-pdbutil dump <options>. This is a followup
after having renamed the tool, whereas before raw was obviously
just the style of dumping, whereas now "dump" is the action to
perform with the "util".
llvm-svn: 306055
Summary:
The main complexity in adding symbol records is that we need to
"relocate" all the type indices. Type indices do not have anything like
relocations, an opaque data structure describing where to find existing
type indices for fixups. The linker just has to "know" where the type
references are in the symbol records. I added an overload of
`discoverTypeIndices` that works on symbol records, and it seems to be
able to link the standard library.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34432
llvm-svn: 305933
This works around a strange interaction with Authenticode signatures,
in which a signed PE executable with {Major,Minor}LinkerVersion = 0.0
fails to validate on Windows 7 (but is OK on Windows 10). Setting the
linker version to 14.0 (which is what VS2015 outputs) makes it work
again.
Patch by Simon Tatham <simon.tatham@arm.com>.
llvm-svn: 305929
VC2017 contains these new symbols as undefined symobls. They are used
for /guard:cf. Since we do not support the control flow guard, but we
want to at least ignore these symbols so that we can link against VS2017
libraries.
Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=727193.
llvm-svn: 305876
We forgot to serialize these because llvm-readobj didn't dump them. They
are typically all zeros in an object file. The linker fills them in with
relocations before adding them to the PDB. Now we can properly round
trip these symbols through pdb2yaml -> yaml2pdb.
I made these fields optional with a zero default so that we can elide
them from our test cases.
llvm-svn: 305857
Summary:
Previously we didn't add debug info chunks to the SparseChunks array, so
they didn't participate in section GC. Now we do.
Reviewers: ruiu
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D34356
llvm-svn: 305811
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.
This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.
File checksums and string tables, on the other hand, need to be re-done.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34257
llvm-svn: 305713
In this patch, I flip the switch in DriverUtils from using the external
cvtres.exe tool to using the Windows Resource library in llvm.
I also fixed a bug where .rsrc sections were marked as discardable
memory and therefore were placed in the wrong order in the final PE.
Furthermore, I modified WindowsResource to write the coff directly to a
memory buffer instead of to file, also had it use the machine types
already declared in COFF.h instead creating my own enum.
Finally, I flipped the switch to allow all unit tests that had
previously run only on windows due to a winres dependency to run
cross-platform.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34265
llvm-svn: 305592
Summary:
Adds a "Discarded" bool to SectionChunk to indicate if the section was
discarded by COMDAT deduplication. The Writer still just checks
`isLive()`.
Fixes PR33446
Reviewers: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34288
llvm-svn: 305582
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.
It was broken due to some weird template issues, which have
since been fixed.
llvm-svn: 305517
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.
This is failing due to some strange template problems, so reverting
until it can be straightened out.
llvm-svn: 305505
When link is invoked with `/def:` and no input files, it behaves as if
`lib.exe` was invoked. Emulate this behaviour, generating the import
library from the def file that was passed. Because there is no input to
actually generate the dll, we simply process the def file early and exit
once we have created the import library.
llvm-svn: 305502
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.
One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.
This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.
See the tests for some examples of what the new output looks like.
Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:
The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.
Differential Revision: https://reviews.llvm.org/D34191
llvm-svn: 305495
This was originally reverted because of some non-deterministic
failures on certain buildbots. Luckily ASAN eventually caught
this as a stack-use-after-scope, so the fix is included in
this patch.
llvm-svn: 305393
This is causing failures on linux bots with an invalid stream
read. It doesn't repro in any configuration on Windows, so
reverting until I have a chance to investigate on Linux.
llvm-svn: 305371
This allows us to use yaml2obj and obj2yaml to round-trip CodeView
symbol and type information without having to manually specify the bytes
of the section. This makes for much easier to maintain tests. See the
tests under lld/COFF in this patch for example. Before they just said
SectionData: <blob> whereas now we can use meaningful record
descriptions. Note that it still supports the SectionData yaml field,
which could be useful for initializing a section to invalid bytes for
testing, for example.
Differential Revision: https://reviews.llvm.org/D34127
llvm-svn: 305366
Summary:
Expose the module descriptor index and fill it in for section
contributions.
Reviewers: zturner
Subscribers: llvm-commits, ruiu, hiraditya
Differential Revision: https://reviews.llvm.org/D34126
llvm-svn: 305296
The last fix required the user to manually add the required
feature. This caused an LLD test to fail because I failed to
update LLD. In practice we can hide this logic so it can just
be transparently added when we write the PDB.
llvm-svn: 305236
This is to reflect the evolving nature of the tool as being
useful for more than just dumping PDBs, as it can do many other
things.
Differential Revision: https://reviews.llvm.org/D34062
llvm-svn: 305106
The .def file parser changes I reverted broke this test case, and
exported "__imp__foo" instead of "__imp__foo@8". This was
http://crbug.com/728726.
llvm-svn: 304572
If you pass /delayload:<dllname> to the COFF linker, it creates thunks
so that DLLs are loaded when they are used for the first time instead of
load-time.
This mechanism do not work for data symbols as there's no way to trap
acccesses to data imported from DLLs. (Technically, I think if we do not
initially map dllimport tables in memory, we could actually trap accesses
and delay-load data symbols, but that's not what Windows do.)
This patch is to report an error when you try to delay-load data symbols.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33106
Differential Revision: https://reviews.llvm.org/D33557
llvm-svn: 303890
This is a different implementation than r303225 (which was reverted
in r303270, re-submitted in r303304 and then re-reverted in r303527).
In the previous patch, I tried to add Live bit to each dllimported
symbol. It turned out that it didn't work with "oldnames.lib" which
contains a lot of weak aliases to dllimported symbols.
The way we handle weak aliases is to check if undefined symbols
can be resolved using weak aliases, and if so, memcpy the Defined
symbols to weak Undefined symbols, so that any references to weak
aliases automatically see defined symbols instead of undefined ones.
This memcpy happens before MarkLive kicks in.
That means we may have multiple copies of dllimported symbols. So
turning on one instance's Live bit is not enough.
This patch moves the Live bit to dllimport file. Since multiple
copies of dllsymbols still point to the same file, we can use it as the
central repository to keep track of liveness.
Differential Revision: https://reviews.llvm.org/D33520
llvm-svn: 303814
This reverts commit r303304 because it looks like the change
introduced a crash bug. At least after that change, LLD with thinlto
crashes when linking Chromium.
llvm-svn: 303527
Our output is not compatible with the Binding feature, so make it
explicit that.
Differential Revision: https://reviews.llvm.org/D33336
llvm-svn: 303378
Previously, LLD-produced executables had IAT (Import Address Table) and
ILT (Import Lookup Table) as separate chunks of data, although their
contents are identical. My interpretation of the COFF spec when I wrote
the COFF linker is that they need to be separate tables even though they
are the same.
But Peter found that the Windows loader is fine with executables in
which IAT and ILT are merged. This is a patch to merge IAT and ILT.
I confirmed that an lld-link self-hosted with this patch works fine.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33064
Differential Revision: https://reviews.llvm.org/D33326
llvm-svn: 303374
When /DEBUG is not specified, /PDB should be ignored. When
/DEBUG is specified, a PDB should be output regardless of
whether or not /PDB is specified. /PDB just overrides the
default name.
This patch implements this behavior, and adds some tests, while
also removing a dead option /DEBUGPDB which was unused in any
code.
Differential Revision: https://reviews.llvm.org/D33302
llvm-svn: 303352
This reverts re-submits r303225 which was reverted in r303270 because it
broke the sanitizer-windows bot.
The reason of the failure is that we were writing dead symbols to the
symbol table. I fixed the issue.
llvm-svn: 303304
and follow-up r303226 "Fix Windows buildbots."
This broke the sanitizer-windows buildbot.
> Previously, the garbage collector (enabled by default or by explicitly
> passing /opt:ref) did not kill dllimported symbols. As a result,
> dllimported symbols could be added to resulting executables' dllimport
> list even if no one was actually using them.
>
> This patch implements dllexported symbol garbage collection. Just like
> COMDAT sections, dllimported symbols now have Live bits to manage their
> liveness, and MarkLive marks reachable dllimported symbols.
>
> Fixes https://bugs.llvm.org/show_bug.cgi?id=32950
>
> Reviewers: pcc
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D33264
llvm-svn: 303270
Summary:
Previously, the garbage collector (enabled by default or by explicitly
passing /opt:ref) did not kill dllimported symbols. As a result,
dllimported symbols could be added to resulting executables' dllimport
list even if no one was actually using them.
This patch implements dllexported symbol garbage collection. Just like
COMDAT sections, dllimported symbols now have Live bits to manage their
liveness, and MarkLive marks reachable dllimported symbols.
Fixes https://bugs.llvm.org/show_bug.cgi?id=32950
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33264
llvm-svn: 303225
CONSTANT imports expect both the `_imp_` prefixed and non-prefixed
symbols should be added to the symbol table. This allows for linking
symbols like _NSConcreteGlobalBlock in WinObjC. The previous change
would generate the import library properly by handling the option but
would not consume the generated entry properly.
llvm-svn: 301657
This seems to be the behavior of the MSVC linker. Previously, this
incompatibility caused nasty issues in chromium build a few times.
Differential Revision: https://reviews.llvm.org/D30363
llvm-svn: 301598
Summary: When using /msvclto, lld and MSVC's linker both do their own symbol resolution. This can cause them to select different archive members, which can result in undefined references. This change avoids that situation by extracting archive members that are selected by lld and passing those to link.exe before any archives, so that MSVC's uses those objects for symbol resolution instead of different archive members.
Reviewers: pcc, rnk, ruiu
Reviewed By: pcc
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D32317
llvm-svn: 301045
The CONSTANT export type is marked as obsolete, but link still supports
this. Furthermore, WinObjC uses this for certain exports. Add support
for this export type.
llvm-svn: 301013
Filenames are case-insensitive on Windows, so when we dispatch based
on argv0, we need to handle it case-insensitively.
Fixes https://bugs.llvm.org/show_bug.cgi?id=32637.
llvm-svn: 300087
Summary:
This lets PDB readers lookup type record data by type index in O(log n)
time. It also enables makes `cvdump -t` work on PDBs produced by LLD.
cvdump will not dump a PDB that doesn't have an index-to-offset table.
The table is sorted by type index, and has an entry every 8KB. Looking
up a type record by index is a binary search of this table, followed by
a scan of at most 8KB.
Reviewers: ruiu, zturner, inglorion
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31636
llvm-svn: 299958
The /appcontainer flag indicates that the module may only be used inside
an application container (for isolation). This has been supported by
link.exe since Windows 8.0. It sets an additional bit in the PE DLL
Characteristics flag to indicate the behavioural change.
llvm-svn: 299728
Summary:
This adds support for reporting multiple errors in a single invocation of lld-link. The limit defaults to 20 and can be changed with the /ERRORLIMIT command line parameter, or set to unlimited by passing a value of 0.
This is a new attempt after r295507, which was reverted because opening files raced with exiting early, causing the test to be flaky. This version avoids the race by exiting before calling enqueuePath.
Reviewers: pcc, ruiu
Reviewed By: ruiu
Subscribers: llvm-commits, dblaikie
Differential Revision: https://reviews.llvm.org/D31688
llvm-svn: 299496
Summary: In the ELF linker, we create the buffer identifier for bitcode files by appending the object name to the archive name. This change makes the COFF linker do the same. Without the change, ThinLTO builds can fail with an error message about multiple ThinLTO modules per object file, caused by object files contained in different archives having the same name.
Reviewers: pcc, ruiu
Reviewed By: pcc
Subscribers: mehdi_amini
Differential Revision: https://reviews.llvm.org/D31402
llvm-svn: 298942
Summary: MSVC does this when producing a PDB.
Reviewers: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31316
llvm-svn: 298717
This will be used in the sanitizer test suite, which wants to use DWARF
line tables.
At some point we should reconsider how LLD handles the long section
names required by DWARF debug sections.
llvm-svn: 298544
Summary:
This also delays setting the output filename based on the first input
argument until after processing /def.
Fixes PR32354
Reviewers: ruiu, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31152
llvm-svn: 298327
The MSVC linker doesn't like archive files containing non-native object
files.
When we are doing an LTO build, we may create archive files containing
both LLVM bitcode files and native object files. For example, if a
project contains assembly files and C++ files, we create native object
files for the assembly files and LLVM bitcode files for the C++ files.
With the /msvclto option, LLD passes archive files to the MSVC linker.
Previously, we didn't pass archive files if they contain at least one
bitcode files. That wasn't correct because the native object files that
weren't passed to the MSVC linker may be needed to complete linking.
In this patch, we create new temporary archive files to strip bitcode
files.
Differential Revision: https://reviews.llvm.org/D31053
llvm-svn: 297997
Previously, if you have foo=bar in a definition file, this assertion
could fire because when symbols are read from file they could be mangled.
It seems that due to historical reasons underscore mangling scheme is
really ad-hoc, and I cannot find a clean way to handle this. I had
to just de-mangle symbols to search again.
llvm-svn: 297357
Some archive files created during chromium build contains both BC
and native files. If that's the case, we want to pass the archive
file to link.exe. Otherwise, the MSVC linker would complain that
there's an unresolved symbol in a given set of files.
I cannot explain why link.exe doesn't complain about the presence
of bitcode files in this case, but it seems link.exe doesn't touch BC.
llvm-svn: 297229
If /msvclto is specified, we compile bitcode files and pass it to the
MSVC linker, stripping all bitcode files. We haven't stripped archive
files, because I was thinking that the MSVC linker wouldn't touch files
in archive files. When we pass an object file to link.exe, all symbols
have been resolved already, so link.exe shoulnd't need any of the files
in archives.
It turns out that even though link.exe doesn't need to do that, it
seems to try to read each file in all archives. And if there's a non-
COFF file in an archive, it exists with an error message. So we need
to remove archives from the command line too.
llvm-svn: 297191
Summary: Creates bitcode files suitable for use with ThinLTO, then checks that the linker can build an executable from them.
Reviewers: ruiu, pcc
Reviewed By: pcc
Subscribers: mehdi_amini, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D30277
llvm-svn: 296042
I really do not understand what is going on on some Windows buildbots,
but FileCheck command on some buildbot behaves like long lines were
truncated. I'll try to find a cause of the issue, but let me relax the
test so that they'll succeed on all buildbots.
llvm-svn: 295798
Behavior races on ErrorCount. If the enqueued paths are evaluated
eagerly (in enqueuePath) then the behavior is as the test expects. But
they may not be evaluated until the future is waited on, in run() -
which is after the early return/exit on ErrorCount. (this causes the
test to fail (because in the "/ERRORCOUNT:XYZ" test, no other errors
are printed), at least for me, on linux)
This reverts commit r295507.
llvm-svn: 295590
Summary: This adds support for reporting multiple errors in a single invocation of lld-link. The limit defaults to 20 and can be changed with the /ERRORLIMIT command line parameter, or set to unlimited by passing a value of 0.
Reviewers: pcc, ruiu
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D29691
llvm-svn: 295507
The test is failing on the bot because "/subsystem:console" was
truncated for some reason. I don't know why that is happening on
that machine (it is not reproducible on my Windows machine).
In this patch, I'm trying to tame it by making the output shorter.
llvm-svn: 294502
Summary: This adds an option to save temporary files generated during link-time optimization. This can be useful for debugging.
Reviewers: ruiu, pcc
Reviewed By: ruiu, pcc
Subscribers: mehdi_amini
Differential Revision: https://reviews.llvm.org/D29518
llvm-svn: 294498
If `/debugtypes` is used to omit the codeview information, we would not
have constructed the debug info codeview record which is used to tie the
PDB to the binary. In such a case, rub out the GUID and Age fields.
llvm-svn: 294279
This patch defines a new command line option, /MSVCLTO, to LLD.
If that option is given, LLD invokes link.exe to link LTO-generated
object files. This is hacky but useful because link.exe can create
PDB files.
Differential Revision: https://reviews.llvm.org/D29526
llvm-svn: 294234
Summary: The COFF linker previously implemented link-time optimization using an API which has now been marked as legacy. This change refactors the COFF linker to use the new LTO API, which is also used by the ELF linker.
Reviewers: pcc, ruiu
Reviewed By: pcc
Subscribers: mgorny, mehdi_amini
Differential Revision: https://reviews.llvm.org/D29059
llvm-svn: 293967
Previously, we were printing out something like this for
sections/symbols with alignment 16
0000000000201000 0000000000000182 10 .data
which I think confusing. I think printing it in decimal is better.
Differential Revision: https://reviews.llvm.org/D29258
llvm-svn: 293685
Summary: MSVC allows linker options to be specified in source code. One of these is the /INCLUDE directive, which specifies that a symbol must be added to the symbol table, even if it otherwise wouldn't be. Existing tests cover the case where the linker is given an object file with an /INCLUDE directive, but we also need to cover the case where /INCLUDE is specified in a bitcode file (as would happen when using LTO). This new test covers that case.
Reviewers: pcc, ruiu
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D29096
llvm-svn: 293107
This patch is to merge type info in multiple .debug$T sections.
One mystery that needs to be solved is that it is not clear how
the MSVC linker uses TPI and IPI streams. Both streams contain
type info, and it is not obvious what kind of record should go
which.
dumppdb command in microsoft-pdb repository prints out IPI stream
contents as "IDs" and TPI stream as "TYPES", but looks like the tool
don't really care about which stream type recrods were read from.
For now, in this patch, I emit all type records to TPI stream.
It might just work with other tools. If not, we need to investigate
it more.
llvm-svn: 291739
This broke the following two bots:
lld-x86_64-win7: the test failed because a diff command is not available
on that bot. That's a configuration error of the bot and will be fixed soon.
llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast: "tar xf" failed on that
bot. I suspect that it is due to the maximum path limitation on Windows.
A build directory contains a buildbot name, so it's longer than usual on
that machine. On Windows, many filesystem operations fail if a path is
longer than 255 characters. I'll try to address that in another patch.
llvm-svn: 291527
I think generated tar files are more compatible with old tar commands
because of r291494 and r291340, so I want to enable this test on buildbots.
llvm-svn: 291526
This is how we use TarWriter in LLD. Now LLD does not append
a file extension, so you need to pass `--reproduce foo.tar`
instead of `--reproduce foo`.
Differential Revision: https://reviews.llvm.org/D28103
llvm-svn: 291210
The GUID should match between the RSDS and the PDB. This should repair
the build bots, though we should be ensuring that the GUIDs match.
Unfortunately, different build bots seem to be getting different GUIDs.
llvm-svn: 290981
The PDB GUID, Age, and version are tied together by the RSDS record in
the binary. Pass along the BuildId information into the createPDB to
allow us to tie the binary and the PDB together.
llvm-svn: 290975
Profiling revealed that the majority of lld's execution time on Windows was
spent opening and mapping input files. We can reduce this cost significantly
by performing these operations asynchronously.
This change introduces a queue for all operations on input file data. When
we discover that we need to load a file (for example, when we find a lazy
archive for an undefined symbol, or when we read a linker directive to
load a file from disk), the file operation is launched using a future and
the symbol resolution operation is enqueued. This implies another change
to symbol resolution semantics, but it seems to be harmless ("ninja All"
in Chromium still succeeds).
To measure the perf impact of this change I linked Chromium's chrome_child.dll
with both thin and fat archives.
Thin archives:
Before (median of 5 runs): 19.50s
After: 10.93s
Fat archives:
Before: 12.00s
After: 9.90s
On Linux I found that doing this asynchronously had a negative effect on
performance, probably because the cost of mapping a file is small enough that
it becomes outweighed by the cost of managing the futures. So on non-Windows
platforms I use the deferred execution strategy.
Differential Revision: https://reviews.llvm.org/D27768
llvm-svn: 289760
This patch replaces the symbol table's object and archive queues, as well as
the convergent loop in the linker driver, with a design more similar to the
ELF linker where symbol resolution directly causes input files to be added to
the link, including input files arising from linker directives. Effectively
this removes the last vestiges of the old parallel input file loader.
Differential Revision: https://reviews.llvm.org/D27660
llvm-svn: 289409
This ports the ELF linker's symbol table design, introduced in r268178,
to the COFF linker.
Differential Revision: http://reviews.llvm.org/D21166
llvm-svn: 289280
The former option bases the filename on the output name, e.g. if the
link output is a.exe, the map will be written to a.map. This matches the
behaviour of link.exe's /MAP option and is useful for creating a map
file of each executable when building a large project.
Differential Revision: https://reviews.llvm.org/D27595
llvm-svn: 289271
I don't think the data I add to a TPI stream in this patch is correct,
but at least it can be displayed using llvm-pdbdump. Until I add more
streams to a PDB file, I'm not able to know whether the data will be
accepted by MSVC tools or not.
llvm-svn: 289183
Previously, we had different way to stringize SymbolBody and InputFile
to construct error messages. This patch defines overloaded function
toString() so that we don't need to memorize all these different
function names.
With that change, it is now easy to include demangled names in error
messages. Now, if there is a symbol name conflict, we'll print out
both mangled and demangled names.
llvm-svn: 288992
Associative sections are sections that need to be linked if their associated
sections are linked. Associative sections are used to append auxiliary data
such as debug info.
Previously, we compared all associative sections when comparing two comdat
sections. Because usually assocative sections are not mergeable sections,
we missed a lot of mergeable sections. MSVC linker doesn't seem to check
the identity of associative sections.
This patch makes LLD to ignore associative sections when doing ICF.
llvm-svn: 288483
Previously, we discarded .debug$ sections. This patch adds them to
files so that PDB.cpp can access them.
This patch also adds a debug option, /dumppdb, to dump debug info
fed to createPDB so that we can verify that valid data has been passed.
llvm-svn: 287555
Object files compiled with cl.exe /GL contain intermediate code for LTO.
We can't (and don't want to) interpret such code, but we should print
out a user-friendly error message.
Differential Revision: https://reviews.llvm.org/D26647
llvm-svn: 286921
I don't really understand what is failing on lld-x86_64-darwin13 bot,
but this patch should at least reduces the number of moving parts.
llvm-svn: 286876
Following the lazy reference might bring in an object file that depends
on bitcode files that weren't part of the LTO step.
Differential Revision: https://reviews.llvm.org/D25461
llvm-svn: 283989
I don't really understand why we get a larger .rodata section only
on this bot. I guess it may be picking up a library which contains
a .rodata. I removed the specific values since their values are not
important for this test case.
llvm-svn: 283931
With this, "llvm-pdbdump yaml -ipi-stream" prints out an IPI stream.
Previously it crashed because it can't handle the case where IPI
stream doesn't exist.
llvm-svn: 283392
Handle this in the exact same way as IMAGE_REL_AMD64_SECREL
and IMAGE_REL_I386_SECREL.
Differential revision: https://reviews.llvm.org/D24608
llvm-svn: 282531
test/COFF/rsds.test checks only RSDS directory in a DLL and
didn't check the validity of the PDF file produced.
(Technically the produced PDB is not valid because it is really
a stub, but this test is still good to have.)
llvm-svn: 281678
Although the GUID seems to be stable across test runs now, it seems to be
unstable across hosts. Lets be a bit more lax about the reading of the RSDS
record.
llvm-svn: 281083
Change the way we calculate the build id to use MD5 to give reproducible build
ids. Previously we would generate random bytes for the build id GUID.
llvm-svn: 281079
The IMAGE_FILE_HEADER structure contains a (RVA, size) to an array of
COFF_DEBUG_DIRECTORY records. Each one of these records contains an RVA to a OMF
Debug Directory. These OMF debug directories are derived into newer types such
as PDB70, PDB20, etc. This constructs a PDB70 structure which will allow us to
associate a GUID with a build to actually tie debug information.
llvm-svn: 280012
Don't blindly OR in the new value, but clear the existing one, since it can be
nonzero. Read out the existing value before, and add into the desired offset.
(The add is done outside of the applyMOV, to handle potential overflow between
the two.)
Patch by Martin Storsjö!
llvm-svn: 277846
The opcode for the bl branches can initially be F000 F800, i.e.
the J1 and J2 bits are already set. Therefore mask these bits out
before or'ing in the new bits.
Patch by Martin Storsjö!
llvm-svn: 277836
This flag is implemented similarly to --reproduce in the ELF linker.
This patch implements /linkrepro by moving the cpio writer and associated
utility functions to lldCore, and using that implementation in both linkers.
One COFF-specific detail is that we store the object file from which the
resource files were created in our reproducer, rather than the resource
files themselves. This allows the reproducer to be used on non-Windows
systems for example.
Differential Revision: https://reviews.llvm.org/D22418
llvm-svn: 276719
Manifest file is a separate or embedded XML file having metadata
of an executable. As it is XML, it can contain various types of
information. Probably the most popular one is to request escalated
priviledges.
Usually the linker creates an XML file and embed that file into
an executable. However, there's a way to supply an XML file from
command line. /manifestniput is it.
Apparently it is over-designed here, but if you supply two or more
manifest files, then the linker needs to merge the files into a
single XML file. A good news is that we don't need to do that ourselves.
MT.exe command can do that, so we call the command from the linker
in this patch.
llvm-svn: 266704
Some COFF tests used INT_MIN for the alignment of the directive section.
This is invalid; replace the alignment with something more sensible: 1.
llvm-svn: 263723
This fixes a test which exposed an ASan issue.
We assumed that a symbol's section number had a corresponding section
without performing validation.
llvm-svn: 263558
The load configuration directory is a structure whose size varies as the
OS gains additional functionality. To account for this, the structure's
layout begins with a size field; this allows loaders to know which
fields are available.
However, LLD hard-coded the sizes (112 bytes for 64-bit and 64 for
32-bit). This means that we might not inform the loader of all the
pertinent fields or we might claim that there are more fields than are
actually present.
To correctly account for this, the size field must be loaded from the
_load_config_used symbol.
N.B. The COFF spec is either wrong or out of date, the load
configuration directory is not correctly documented in the
specification: it omits the size field.
llvm-svn: 263543
The TLS directory has a different layout depending on the bitness of the
machine the image will run on. LLD would always use the 64-bit TLS
directory for the data directory entry instead of an appropriately sized
TLS directory.
llvm-svn: 263539
This test is flaky for more than half a year or so on buildbots
and has been causing confusion. Remove it while I'm investing the
cause.
llvm-svn: 261709
DLL export tables usually contain dllexport'ed symbol RVAs so that
applications which use the DLLs can find symbols from the DLLs.
However, there's a minor feature to "forward" DLL symbols to other
DLLs.
If you set an RVA to a string whose form is "<dllname>.<symbolname>"
(e.g. "KERNEL32.ExitProcess") instead of symbol RVA to the export
table, the loader interprets that as a forwarder symbol, and resolve
that symbol from the specified DLL.
This patch implements that feature.
llvm-svn: 257243
If a section symbol is not external, that COMDAT section should never
be merge with other sections in other compilation unit. Previously,
we didn't take visibility into account.
Note that COMDAT sections with non-external visibility makes sense
because they can be removed by dead-stripping.
Fixes https://llvm.org/bugs/show_bug.cgi?id=25686
llvm-svn: 254578
There's actually a room to improve this patch. Instead of not merging
sections that have different alignements, we can choose the section that
has the largest alignment requirement among all sections that are otherwise
considered the same. Then all section alignments are satisfied, so we can
merge them.
I don't know if that improvement could make any difference for real-world
input, so I'll leave it alone. Would be interesting to revisit later.
llvm-svn: 248581
This is an LLD extension to MSVC link.exe command line. MSVC linker
does not write symbol tables for executables. We do unless no /debug
option is given.
There's a situation that we want to enable debug info but don't want
to emit the symbol table. One example is when we are comparing output
file size. With this patch, you can tell the linker to not create
a symbol table by just specifying /nosymtab.
llvm-svn: 248225
This patch fixes a regression introduced by r247964. Relocations that
are referring the same symbol should be considered equal, but they
were not if they were pointing to non-section chunks.
llvm-svn: 248132
Only live symbols are written to the symbol table. Because isLive()
returned false if dead-stripping was disabled entirely, only
non-COMDAT sections were written to the symbol table. This patch fixes
the issue.
llvm-svn: 247856
This is a patch to make LLD to be on par with MSVC in terms of ICF
effectiveness. MSVC produces a 27.14MB executable when linking LLD.
LLD previously produced a 27.61MB when self-linking. Now the size
is reduced to 27.11MB. Note that without ICF the size is 29.63MB.
In r247387, I implemented an algorithm that handles section graphs
as cyclic graphs and merge them using SCC. The algorithm did not
always work as intended as I demonstrated in r247721. The new
algortihm implemented in this patch is different from the previous
one. If you are interested the details, you want to read the file
comment of ICF.cpp.
llvm-svn: 247770
In this test, we have two functions, foo and bar. MSVC linker can
choose one and discard the other using ICF. LLD cannot. I add this
test as a TODO.
foo and bar are conceptually equivalent to the following:
void foo() { foo(); }
void bar() { foo(); }
foo and bar are effectively the same function. If foo and bar are
compiled to the same instructions, both their contents (foo and bar)
and relocation targets (foo) become the same, so from the ICF point
of view, they are reducible. But their graphs are not isomorphic!
LLD's ICF algorithm cannot handle this case yet.
llvm-svn: 247721
Previously, LLD's ICF couldn't merge cyclic graphs. That was unfortunate
because, in COFF, cyclic graphs are not exceptional at all. That is
pretty common.
In this patch, sections are grouped by Tarjan's strongly connected
component algorithm to get acyclic graphs. And then we try to merge
SCCs whose outdegree is zero, and remove them from the graph. This
makes other SCCs to have outdegree zero, so we can repeat the
process until all SCCs are removed. When comparing two SCCs, we handle
cycles properly.
This algorithm works better than previous one. Previously, self-linking
produced a 29.0MB executable. It now produces a 27.7MB. There's still some
gap compared to MSVC linker which produces a 27.1MB executable for the
same input. So the gap is narrowed, but still LLD is not on par with MSVC.
I'll investigate that later.
llvm-svn: 247387
I don't understand why the previous code is pretty flaky and
the new code is at least less flaky, but the original test
occasionally failed on the second run of lib.exe.
My guess was that lib.exe was failing because the output of
the echo command executed immediately before lib.exe was not
flushed to a file, but as far as I can say, the file
descriptor is properly closed in TestRunner.py, so this's
probably not correct. Other theory is that, on Windows, file
output is not guaranteed to be visible to other processes even
if a process flushes file descriptors, but I'd think that's
unlikely. So honestly I don't know the cause yet.
llvm-svn: 246621
This patch fixes a subtle incompatibility with MSVC linker.
MSVC linker preserves the original spelling of a DLL in the
import descriptor table. LLD previously converted all
characters to lowercase. Usually this difference is benign,
but if a program explicitly checks for DLL file names, the
program could fail.
llvm-svn: 246620
In r246424, I made a change that disables non-DLL to export
symbols. It turned out that the change was not correct. Both
DLLs and executables are able to export symbols (although the
latter is relatively rare). This change restores the feature.
llvm-svn: 246537
I have totally no idea why, but MSVC linker is sensitive about
file names of archive members. If we do not make import library
file names to the same as the DLL name, MSVC link *crashes*
when it is processing the library file. This patch is to set
the same name.
llvm-svn: 246535
The rules for dllexported symbols are overly complicated due to
x86 name decoration, fuzzy symbol resolution, and the fact that
one symbol can be resolved by so many different names. The rules
are probably intended to be "intuitive", so that users don't have
to understand the name mangling schemes, but it seems that it can
lead to unintended symbol exports.
To make it clear what I'm trying to do with this patch, let me
write how the export rules are subtle and complicated.
- x86 name decoration: If machine type is i386 and export name
is given by a command line option, like /export:foo, the
real symbol name the linker has to search for is _foo because
all symbols are decorated with "_" prefixes. This doesn't happen
on non-x86 machines. This automatic name decoration happens only
when the name is not C++ mangled.
However, the symbol name exported from DLLs are ones without "_"
on all platforms.
Moreover, if the option is given via .drectve section, no
symbol decoration is done (the reason being that the .drectve
section is created by a compiler and the compiler should always
know the exact name of the symbol, I guess).
- Fuzzy symbol resolution: In addition to x86 name decoration,
the linker has to look for cdecl or C++ mangled symbols
for a given /export. For example, it searches for not only
_foo but also _foo@<number> or ??foo@... for /export:foo.
Previous implementation didn't get it right. I'm trying to make
it as compatible with MSVC linker as possible with this patch
however the rules are. The new code looks a bit messy to me, but
I don't think it can be simpler due to the ad-hoc-ness of the rules.
llvm-svn: 246424
This is exposed via a new flag /opt:lldltojobs=N, where N is the number of
code generation threads.
Differential Revision: http://reviews.llvm.org/D12309
llvm-svn: 246342
ICF is a feature to merge sections not by name (which is the regular
COMDAT merging) but by contents. If two or more sections have the
identical contents and relocations, ICF merges them to save space.
Accessors or templated functions tend to have the same contents, and
ICF can hold them.
If we consider sections as vertices and relocations as edges, the
problem is to find as many isomorphic graphs as possile from input
graphs. MSVC linker is smart enough to identify isomorphic graphs
even if they contain circles (GNU gold cannot handle circles
according to http://research.google.com/pubs/pub36912.html, so this
is impressive).
Circular references are not uncommon in COFF object files.
One example is .pdata. .pdata sections contain exception handler info
for functions, so they naturally have relocations for the functions.
The functions in turn have references to the .pdata sections so that
the functions and their .pdata are linked together. As a result, they
form circles.
This is a test case for circular graphs. LLD is not able to handle
this test case yet. I'll add code soon.
llvm-svn: 245827
The old test files were just compiler outputs, so it was hard to
debug if something goes wrong. The new test file is carefully
hand-crafted to trigger ICF to avoid that.
llvm-svn: 245826
Previously, weak external symbols could reference only symbols that
appeared before them. Although that covers almost all use cases
of weak externals, there are object files out there which contains
weak externals that have forward references.
This patch supports such weak externals.
llvm-svn: 245258
There are some DLLs whose initializers depends on other DLLs'
initializers. The initialization order matters for them.
MSVC linker uses the order of the libraries from the command line.
LLD used ASCII-betical order. So they were incompatible.
This patch makes LLD compatible with MSVC.
llvm-svn: 245201
Sections must start at page boundaries in memory, but they
can be aligned to sector boundaries (512-bytes) on disk.
We aligned them to 4096-byte boundaries even on disk, so we
wasted disk space a bit.
llvm-svn: 244691
We were printing an error but exiting with 0.
Not sure how to test this. We could add a no-winlib feature,
but that is probably not worth it.
llvm-svn: 244109
Right now PE image section addresses are RVAs and symbol addresses are
VAs. We should probably fix this by changing section addresses to match
symbol addresses. Fixing this might take a few hours, so temporarily
disable the objdump part of this test.
llvm-svn: 243758
We want to convince the NT loader not to map these sections into memory.
A good first step is to move them to the end of the executable.
Differential Revision: http://reviews.llvm.org/D11655
llvm-svn: 243680
Windows ARM is the thumb ARM environment, and pointers to thumb code
needs to have its LSB set. When we apply relocations, we need to
adjust the LSB if it points to an executable section.
llvm-svn: 243560
SECREL should sets the 32-bit offset of the target from the beginning
of *target's* output section. Previously, the offset from the beginning
of source's output section was used instead.
SECTION means the target section's index, and not the source section's
index. This patch fixes that issue too.
llvm-svn: 243535
I don't fully understand the rationale behind the name mangling
scheme used for the DLL export table and the import library.
Why only leading "_" is dropped for the import library while
both "_" and "@" are dropped from DLL symbol table? But this seems
to be what MSVC linker does.
llvm-svn: 243490
Previously, we ignore /merge option if /debug is specified
because I thought that was MSVC linker did. This was wrong.
/merge shouldn't be ignored even in debug mode.
llvm-svn: 243375
An object file compatible with Safe SEH contains a .sxdata section.
The section contains a list of symbol table indices, each of which
is an exception handler function. A safe SEH-enabled executable
contains a list of exception handler RVAs. So, what the linker has
to do to support Safe SEH is basically to read the .sxdata section,
interpret the contents as a list of symbol indices, unique-fy and
sort their RVAs, and then emit that list to .rdata. This patch
implements that feature.
llvm-svn: 243182
__ImageBase is a special symbol whose value is the image base address.
Previously, we handled __ImageBase symbol as an absolute symbol.
Absolute symbols point to specific locations in memory and the locations
never change even if an image is base-relocated. That means that we
don't have base relocation entries for absolute symbols.
This is not a case for __ImageBase. If an image is base-relocated, its
base address changes, and __ImageBase needs to be shifted as well.
So we have to have base relocations for __ImageBase. That means that
__ImageBase is not really an absolute symbol but a different kind of
symbol.
In this patch, I introduced a new type of symbol -- DefinedRelative.
DefinedRelative is similar to DefinedAbsolute, but it has not a VA but RVA
and is a subject of base relocation. Currently only __ImageBase is of
the new symbol type.
llvm-svn: 243176
Load Configuration field points to a structure containing information
for SEH. That data strucutre is not created by the linker but provided
by an external file. What we have to do is just to set __load_config_used
address to the header.
llvm-svn: 242427
If a symbol is exported as /export:foo, and foo is resolved as a
mangled name (_foo@<number> or ?foo@@Y...), that mangled name should
be written to the export table. Previously, we wrote the original
name to the export table.
llvm-svn: 242342
Because thunks for dllimported symbols contain absolute addresses on x86,
they need to be relocated at load-time. This bug was a cause of crashes
in DLL initialization routines.
llvm-svn: 242259
Entry name selection rule is already complicated on x64, but it's more
complicated on x86 because of the underscore name mangling scheme.
If one of _main, _main@<number> (a C function) or ?main@@... (a C++ function)
is defined, entry name is _mainCRTStartup. If _wmain, _wmain@<number or
?wmain@@... is defined, entry name is _wmainCRTStartup. And so on.
llvm-svn: 242110
If /delayload option is given, we have to resolve __delayLoadHelper2
since the function is the dynamic loader to delay-load DLLs.
The function name is mangled in x86 as ___delayLoadHelper2@8.
llvm-svn: 242078
Symbol foo is mangled as _foo in C and ?foo@@... in C++ on x86.
findMangle has to remove prefix underscore before mangle a given name
as a C++ symbol.
llvm-svn: 241874
Symbol names are usually mangled by appending "_" prefix on x86.
But the mangled name is not used in DLL export table. The export
table contains unmangled names.
llvm-svn: 241872
With this patch, LLD is now able to self-link an .exe file for x86
that runs correctly, although I don't think some headers (particularly
SEH) are not correct. DLL support is coming soon.
llvm-svn: 241857
Previously, we infer machine type at the very end of linking after
all symbols are resolved. That's actually too late because machine
type affects how we mangle symbols (whether or not we need to
add "_").
For example, /entry:foo adds "_foo" to the symbol table if x86 but
"foo" if x64.
This patch moves the code to infer machine type, so that machine
type is inferred based on input files given via the command line
(but not based on .directives files).
llvm-svn: 241843
Symbols exported by DLLs are listed in import library files.
Exported names may be mangled by "Import Name Type" field as
described in PE/COFF spec 7.3. This patch implements that
mangling scheme.
llvm-svn: 241719
Providing a symbol table in the executable is quite useful when
debugging a fully-linked executable without having to reconstruct one
from DWARF.
Differential Revision: http://reviews.llvm.org/D11023
llvm-svn: 241689
Previously we were unnecessarily loading lazy symbols if they appeared in an
archive multiple times, as can happen with comdat symbols. This change fixes
the bug by only loading symbols from archives at load time if the original
symbol was undefined.
Differential Revision: http://reviews.llvm.org/D10980
llvm-svn: 241538
TLS table header field is supposed to have address and size of TLS table.
The linker doesn't have to understand what TLS table is. TLS table's name
is always "_tls_used", so if there's that symbol, the linker simply sets
that symbol's RVA to the header. The size of the TLS table is always 40 bytes.
llvm-svn: 241426
We were previously hitting assertion failures in the writer in cases where
a regular object file defined a weak external symbol that was defined by
a bitcode file. Because /export and /entry name mangling were implemented
using weak externals, the same problem affected mangled symbol names in
bitcode files.
The underlying cause of the problem was that weak external symbols were
being resolved before doing LTO, so the symbol table may have contained stale
references to bitcode symbols. The fix here is to defer weak external symbol
resolution until after LTO.
Also implement support for weak external symbols in bitcode files
by modelling them as replaceable DefinedBitcode symbols.
Differential Revision: http://reviews.llvm.org/D10940
llvm-svn: 241391
This worked before, but only by accident, and only with assertions disabled.
We ended up storing a DefinedRegular symbol in the WeakAlias field,
and never using it as an Undefined.
Differential Revision: http://reviews.llvm.org/D10934
llvm-svn: 241376
DLLs can export symbols only by ordinal, and DLLs are also able to be
delay-loaded. The combination of the two is valid. I didn't expect
that combination. This patch implements that feature.
With this patch, LLD is now able to link a working executable of Chrome
for 64-bit debug build. The browser seemed to be working fine. Chrome is
good for testing because of its variety and size. It contains various
open-source libraries written by various people. The largest file in
Chrome is chrome.dll whose size is 496MB. LLD can link it in 24 seconds.
MSVC linker takes 48 seconds. So it is exactly 2x faster. (I measured
that with debug info and ICF being turned off.)
With this achievement, I think I can say that the new COFF linker is
now mostly feature complete for x86-64 Windows. I believe there are
still many lingering bugs, though.
llvm-svn: 241318
Previously, __ImageBase symbol got a different value than the one
specified by /base:<number> because the symbol was created in the
SymbolTable's constructor. When the constructor is called,
no command line options are processed yet, so the symbol was
created always with the initial value. This caused wrong relocations
and thus caused mysterious crashes of some executables linked by LLD.
llvm-svn: 241313
Previously, pointers pointed by locally-imported symbols were broken.
It has only 4 bytes although the correct size is 8 byte. This patch
fixes that bug.
llvm-svn: 241295
On Windows, we have four different main functions, {w,}{main,WinMain}.
The linker has to choose a corresponding entry point function among
{w,}{main,WinMain}CRTStartup. These entry point functions are defined
in the standard library. The linker resolves one of them by looking at
which main function is defined and adding a corresponding undefined
symbol to the symbol table.
Object files containing entry point functions conflicts each other.
For example, we cannot resolve both mainCRTStartup and WinMainCRTStartup
because other symbols defined in the files conflict.
Previously, we inferred CRT function name at the very end of name
resolution. I found that that is sometimes too late. If the linker
already linked one of these four archive member objects, it's too late
to change the decision.
The right thing to do here is to infer entry point name after adding
all symbols from command line files and before adding any other files
(which are specified by directive sections). This patch does that.
llvm-svn: 241236
Previously, the order of adding symbols to the symbol table was simple.
We have a list of all input files. We read each file from beginning of
the list and add all symbols in it to the symbol table.
This patch changes that order. Now all archive files are added to the
symbol table first, and then all the other object files are added.
This shouldn't change the behavior in single-threading, and make room
to parallelize in multi-threading.
In the first step, only lazy symbols are added to the symbol table
because archives contain only Lazy symbols. Member object files
found to be necessary are queued. In the second step, defined and
undefined symbols are added from object files. Adding an undefined
symbol to the symbol table may cause more member files to be added
to the queue. We simply continue reading all object files until the
queue is empty.
Finally, new archive or object files may be added to the queues by
object files' directive sections (which contain new command line
options).
The above process is repeated until we get no new files.
Symbols defined both in object files and in archives can make results
undeterministic. If an archive is read before an object, a new member
file gets linked, while in the other way, no new file would be added.
That is the most popular cause of an undeterministic result or linking
failure as I observed. Separating phases of adding lazy symbols and
undefined symbols makes that deterministic. Adding symbols in each
phase should be parallelizable.
llvm-svn: 241107
Compilers recognize "main" function and don't mangle its name.
But if you use a different function as a user-defined entry name,
and if you didn't define that function with extern C, your entry
point function name is mangled. And the linker has to be able to
find that. This is relatively rare but can happen.
llvm-svn: 240953
The previous logic to find default entry name or subsystem does not
seem correct (i.e. was not compatible with MSVC linker). Previously,
default entry name was inferred from CRT functions and user-defined
entry functions. Subsystem was inferred from CRT functions.
Default entry name and subsystem are now inferred based on the
following table. Note that we no longer use CRT functions to infer
them.
Entry name Subsystem
main mainCRTStartup console
wmain wmainCRTStartup console
WinMain WinMainCRTStartup windows
wWinMain wWinMainCRTStartup windows
llvm-svn: 240922
Usually dllexported symbols are defined with 'extern "C"',
so identifying them is easy. We can just do hash table lookup
to look up exported symbols.
However, C++ non-member functions are also allowed to be exported,
and they can be specified with unmangled name. So, if /export:foo
is given, we need to look up not only "foo" but also its all
mangled names. In MSVC mangling scheme, that means that we need to
look up any symbol which starts with "?foo@@Y".
In this patch, we scan the entire symbol table to search for
a mangled symbol. The symbol table is a DenseMap, and that doesn't
support table lookup by string prefix. This is of course very
inefficient. But that should be probably OK because the user
should always add 'extern "C"' to dllexported symbols.
llvm-svn: 240919
This option is to ignore remaining undefined symbols and force
the linker to create an output file anyways.
The existing code assumes that there's no undefined symbol after
reportRemainingUndefines(). That assumption is legitimate.
I also don't want to mess up the existing code for this minor feature.
In order to keep it as is, remaining undefined symbols are replaced
with dummy defined symbols.
llvm-svn: 240913
When comparing two COMDAT sections, we need to take section values
and associative sections into account. This patch fixes that bug.
It fixes a crash bug of llvm-tblgen when linked with /opt:lldicf.
One thing I don't understand yet is that this logic seems to be
too strict. MSVC linker is able to create more compact executables
(which of course work correctly). With this ICF algorithm, LLD is
able to make executable smaller, but the outputs are larger than
MSVC's. There must be something I'm missing here.
llvm-svn: 240897
There were a few issues with the previous delay-import tables.
- "Attribute" field should have been 1 instead of 0.
(I don't know the meaning of this field, though.)
- LEA and CALL operands had wrong addresses.
- Address tables are in .didat (which is read-only).
They should have been in .data.
llvm-svn: 240837
This flag can be used to produce a map file, which is essentially a list
of objects linked into the final output file together with the RVAs of
their symbols. Because our format differs from MSVC's we expose it as a
separate flag.
Differential Revision: http://reviews.llvm.org/D10773
llvm-svn: 240812
We were resolving entry symbols and /include'd symbols after all other
symbols are resolved. But looks like it's too late. I found that it
causes some program to fail to link.
Let's say we have an object file A which defines symbols X and Y in an
archive. We also have another file B after A which defines X, Y and
_DLLMainCRTStartup in another archive. They conflict each other, so
either A or B can be linked.
If we have _DLLMainCRTStartup as an undefined symbol, file B is always
chosen. If not, there's a chance that A is chosen. If the linker
find it needs _DllMainCRTStartup after that, it's too late.
This patch adds undefined symbols to the symbol table as soon as
possible to fix the issue.
llvm-svn: 240757
Absolute symbols were always handled as external symbols, so if two
or more object files define the same absolute symbol, they would
conflict even if the symbol is private to each file.
This patch fixes that bug.
llvm-svn: 240756
ICF implemented in LLD is so experimental that we don't want to
enable that even if /opt:icf option is passed. I'll rename it back
once the feature is complete.
llvm-svn: 240721
The change I made in r240620 was not correct. If a symbol foo is
defined, and if you use __imp_foo, __imp_foo symbol is automatically
defined as a pointer (not just an alias) to foo.
Now that we need to create a chunk for automatically-created symbols.
I defined LocalImportChunk class for them.
llvm-svn: 240622
MSVC linker is able to link an object file created from the following code.
Note that __imp_hello is not defined anywhere.
void hello() { printf("Hello\n"); }
extern void (*__imp_hello)();
int main() { __imp_hello(); }
Function symbols exported from DLLs are automatically mangled by appending
__imp_ prefix, so they have two names (original one and with the prefix).
This "feature" seems to simulate that behavior even for non-DLL symbols.
This is in my opnion very odd feature. Even MSVC linker warns if you use this.
I'm adding that anyway for the sake of compatibiltiy.
llvm-svn: 240620
Identical COMDAT Folding (ICF) is an optimization to reduce binary
size by merging COMDAT sections that contain the same metadata,
actual data and relocations. MSVC link.exe and many other linkers
have this feature. LLD achieves on per with MSVC in terms produced
binary size with this patch.
This technique is pretty effective. For example, LLD's size is
reduced from 64MB to 54MB by enaling this optimization.
The algorithm implemented in this patch is extremely inefficient.
It puts all COMDAT sections into a set to identify duplicates.
Time to self-link with/without ICF are 3.3 and 320 seconds,
respectively. So this option roughly makes LLD 100x slower.
But it's okay as I wanted to achieve correctness first.
LLD is still able to link itself with this optimization.
I'm going to make it more efficient in followup patches.
Note that this optimization is *not* entirely safe. C/C++ require
different functions have different addresses. If your program
relies on that property, your program wouldn't work with ICF.
However, it's not going to be an issue on Windows because MSVC
link.exe turns ICF on by default. As long as your program works
with default settings (or not passing /opt:noicf), your program
would work with LLD too.
llvm-svn: 240519
Previously, we added files in directive sections to the symbol
table as we read the sections, so the link order was depth-first.
That's not compatible with MSVC link.exe nor the old LLD.
This patch is to queue files so that new files are added to the
end of the queue and processed last. Now addFile() doesn't parse
files nor resolve symbols. You need to call run() to process
queued files.
llvm-svn: 240483
DLLs are usually resolved at process startup, but you can
delay-load them by passing /delayload option to the linker.
If a /delayload is specified, the linker has to create data
which is similar to regular import table.
One notable difference is that the pointers in a delay-load
import table are originally pointing to thunks that resolves
themselves. Each thunk loads a DLL, resolve its name, and then
overwrites the pointer with the result so that subsequent
function calls directly call a desired function. The linker
has to emit thunks.
llvm-svn: 240250
.pdata section contains a list of triplets of function start address,
function end address and its unwind information. Linkers have to
sort section contents by function start address and set the section
address to the file header (so that runtime is able to find it and
do binary search.)
This change seems to resolve all but one remaining test failures in
check{,-clang,-lld} when building the entire stuff with clang-cl and
lld-link.
llvm-svn: 240231
This is a case that one mistake caused a very mysterious bug.
I made a mistake to calculate addresses of common symbols, so
each common symbol pointed not to the beginning of its location
but to the end of its location. (Ouch!)
Common symbols are aligned on 16 byte boundaries. If a common
symbol is small enough to fit between the end of its real
location and whatever comes next, this bug didn't cause any harm.
However, if a common symbol is larger than that, its memory
naturally overlapped with other symbols. That means some
uninitialized variables accidentally shared memory. Because
totally unrelated memory writes mutated other varaibles, it was
hard to debug.
It's surprising that LLD was able to link itself and all LLD
tests except gunit tests passed with this nasty bug.
With this fix, the new COFF linker is able to pass all tests
for LLVM, Clang and LLD if I use MSVC cl.exe as a compiler.
Only three tests are failing when used with clang-cl.
llvm-svn: 240216
In this linker model, adding an undefined symbol may trigger chain
reactions. It may trigger a Lazy symbol to read a new file.
A new file may contain a directive section, which may contain various
command line options.
Previously, we didn't handle chain reactions well. We visited /include'd
symbols only once, so newly-added /include symbols were ignored.
This patch fixes that bug.
Now, the symbol table is versioned; every time the symbol table is
updated, the version number is incremented. We repeat adding undefined
symbols until the version number does not change. It is guaranteed to
converge -- the number of undefined symbol in the system is finite,
and adding the same undefined symbol more than once is basically no-op.
llvm-svn: 240177
Alternatename option is in the form of /alternatename:<from>=<to>.
It's effect is to resolve <from> as <to> if <from> is still undefined
at end of name resolution.
If <from> is not undefined but completely a new symbol, alternatename
shouldn't do anything. Previously, it introduced a new undefined
symbol for <from>, which resulted in undefined symbol error.
llvm-svn: 240161
We don't want to insert a new symbol to the symbol table while reading
a .drectve section because it's going to be too complicated.
That we are reading a directive section means that we are currently
reading some object file. Adding a new undefined symbol to the symbol
table can trigger a library file to read a new file, so it would make
the call stack too deep.
In this patch, I add new symbol names to a list to resolve them later.
llvm-svn: 240076
Alternatename option is in the form of /alternatename:<from>=<to>.
It is an error if there are two options having the same <from> but
different <to>. It is *not* an error if both are the same.
llvm-svn: 240075
We skip unknown options in the command line with a warning message
being printed out, but we shouldn't do that for .drectve section.
The section is not visible to the user. We should handle unknown
options as an error.
llvm-svn: 240067
The linker has to create an XML file for each executable.
This patch supports that feature.
You can optionally embed an XML file to an executable as .rsrc
section. If you choose to do that (by passing /manifest:embed
option), the linker has to create a textual resource file
containing an XML file, compile that using rc.exe to a binary
resource file, conver that resource file to a COFF file using
cvtres.exe, and then link that COFF file. This patch implements
that feature too.
llvm-svn: 239978
On Windows, we have to create a .lib file for each .dll.
When linking against DLLs, the linker doesn't use the DLL files,
but instead read a list of dllexported symbols from corresponding
lib files.
A library file containing descriptors of a DLL is called an
import library file.
lib.exe has a feature to create an import library file from a
module-definition file. In this patch, we create a module-definition
file and pass that to lib.exe.
We eventually want to create an import library file by ourselves
to eliminate dependency to lib.exe. For now, we just use the MSVC
tool.
llvm-svn: 239937
Module-definition files (.def files) are yet another way to
specify parameters to the linker. You can write a list of dllexported
symbols in module-definition files instead of using /export command
line option. It also supports a few more directives.
The parser code is taken from lib/Driver/WinLinkModuleDef.cpp
with the following modifications.
- variable names are updated to comply with the LLVM coding style.
- Instead of returning parsing results as "directive" objects,
it updates Config object directly.
llvm-svn: 239929
DLL files are in the same format as executables but they have export tables.
The format of the export table is described in PE/COFF spec section 5.3.
A new class, EdataContents, takes care of creating chunks for export tables.
What we need to do is to parse command line flags for dllexports, and then
instantiate the class to create chunks. For the writer, export table chunks
are opaque data -- it just add chunks to .edata section.
llvm-svn: 239869
PE/COFF executables/DLLs usually contain data which is called
base relocations. Base relocations are a list of addresses that
need to be fixed by the loader if load-time relocation is needed.
Base relocations are in .reloc section.
We emit one base relocation entry for each IMAGE_REL_AMD64_ADDR64
relocation.
In order to save disk space, base relocations are grouped by page.
Each group is called a block. A block starts with a 32-bit page
address followed by 16-bit offsets in the page. That is more
efficient representation of addresses than just an array of 32-bit
addresses.
llvm-svn: 239710
Resource files are data files containing i18n messages, icon images, etc.
MSVC has a tool to convert a resource file to a regular COFF file so that
you can just link that file to embed resources to an executable.
However, you can directly pass resource files to the linker. If you do that,
the linker invokes the tool automatically. This patch implements that feature.
llvm-svn: 239704
In the case where either a bitcode file and a regular file or two bitcode
files export a common or comdat symbol with the same name, the linker needs
to pick one of them following COFF semantics. This patch implements a design
for resolving such symbols that pushes most of the work onto either LLD's
regular mechanism for resolving common or comdat symbols or the IR linker's
mechanism for doing the same.
We modify SymbolBody::compare to always prefer non-bitcode symbols, so that
during the initial phase of symbol resolution, the symbol table always contains
a regular symbol in any case where we need to choose between a regular and
a bitcode symbol. In SymbolTable::addCombinedLTOObject, we force export
any bitcode symbols that were initially pre-empted by a regular symbol,
and later use SymbolBody::compare to choose between the regular symbol in
the symbol table and the regular symbol from the combined LTO object file.
This design seems to be sound, so long as the resolution mechanism is defined
to be commutative and associative modulo arbitrary choices between symbols
(which seems to be the case for COFF).
Differential Revision: http://reviews.llvm.org/D10329
llvm-svn: 239563
The code generator may create references to runtime library symbols such as
__chkstk which were not visible via LTOModule. Handle these cases by loading
the object file from the library, but abort if we end up having loaded any
bitcode objects.
Because loading the object file may have introduced new undefined references,
call reportRemainingUndefines again to detect and report them.
Differential Revision: http://reviews.llvm.org/D10332
llvm-svn: 239386
The LLVM code generator can sometimes synthesize symbols, such as SSE
constants, that are not visible via the LTOModule interface. Allow such
symbols so long as they have definitions.
Differential Revision: http://reviews.llvm.org/D10331
llvm-svn: 239385