Summary:
It currently receives an output parameter and returns
std::error_code. Expected<StringRef> fits for this purpose perfectly.
Differential Revision: https://reviews.llvm.org/D61421
llvm-svn: 359774
Summary:
Archives can contain multiple members with the same name. This would
cause ThinLTO links to fail ("Expected at most one ThinLTO module per
bitcode file"). This change implements the same strategy we use in
the ELF linker: make the offset in the archive part of the module
name so that names are unique.
Reviewers: pcc, mehdi_amini, ruiu
Reviewed By: ruiu
Subscribers: eraman, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60549
llvm-svn: 358440
We introduce a new class hierarchy for debug types merging (in DebugTypes.h). The end-goal is to parallelize the type merging - please see the plan in D59226.
Previously, dependency discovery was done on the fly, much later, during the type merging loop. Unfortunately, parallelizing the type merging requires the dependencies to be merged in first, before any dependent ObjFile, thus this early discovery.
The overall intention for this path is to discover debug information dependencies at a much earlier stage, when processing input files. Currently, two types of dependency are supported: PDB type servers (when compiling with MSVC /Zi) and precompiled headers OBJs (when compiling with MSVC /Yc and /Yu). Once discovered, an explicit link is added into the dependent ObjFile, through the new debug types class hierarchy introduced in DebugTypes.h.
Differential Revision: https://reviews.llvm.org/D59053
llvm-svn: 357383
Generate import modules for each imported DLL, along with its symbol stream.
Also create COFF groups in the * Linker * module, one for each PartialSection (input, unmerged sections)
Currently COFF groups are disabled for MINGW because it significantly increases PDB sizes. We could enable that later with an option.
The overall objective for this change is to support code hot patching tools. Such tools need to know the import libraries used, from the PDB alone.
Differential Revision: https://reviews.llvm.org/D54802
llvm-svn: 357308
Summary:
We translate @llvm.used to COFF by generating /include directives
in the .drectve section. However, in LTO links, this happens after
directives have already been processed, so the new directives do
not take effect. This change marks @llvm.used symbols as GCRoots
so that they are preserved as intended.
Fixes PR40733.
Reviewers: rnk, pcc, ruiu
Reviewed By: ruiu
Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58255
llvm-svn: 354410
Turns out nobody understands what "conflicting comdat type" is supposed to
mean, so just emit a regular "duplicate symbol" error and move the comdat
selection information into /verbose output.
This also fixes a problem where the error output would depend on the order of
.obj files passed. Before this patch:
- If passed `one_only.obj discard.obj`, lld-link would only err "conflicting
comdat type"
- If passed `discard.obj one_only.obj`, lld-link would err "conflicting comdat
type" and then "duplicate symbol"
Now lld-link only errs "duplicate symbol" in both cases.
I considered adding a "Detail" parameter to reportDuplicate() that's printed in
parens at the end of the "duplicate symbol" diag if present, and then put the
comdat selection mismatch details there, but since users don't know what it's
supposed to mean decided against it. I also considered special-casing the
Detail message for one_only/discard mismatches, which in practice means
"function defined as inline in TU 1 but as out-of-line in TU 2", but I wasn't
sure how useful it is so I omitted that too.
Differential Revision: https://reviews.llvm.org/D58180
llvm-svn: 354006
cl.exe and clang-cl.exe put vftables in a 'discard' comdat when building with
RTTI disabled (/GR-) but in a 'largest' comdat when building with RTTI enabled.
To be able to link /GR- code with /GR code, lld-link needs to accept comdats
that have this type of comdat selection conflict.
For example, static libraries in the Visual Studio standard library are built
with /GR, and without this it's impossible to build client code with /GR- and
still link to the standard library.
link.exe also accepts merging 'discard' with 'largest', and it accepts merging
'largest' with any other selection type. lld-link is still a bit stricter since
it only allows merging 'largest' with 'discard' for symmetry.
Differential Revision: https://reviews.llvm.org/D57515
llvm-svn: 352765
LLD used to handle comdats as if the selection field was always set to
IMAGE_COMDAT_SELECT_ANY. This means for obj files produced by `cl /Gy`, LLD
would never report a duplicate symbol error.
This change:
- adds validation for the Selection field (should make no difference in
practice for compiler-generated obj inputs)
- rejects comdats that have different Selection fields in different obj files
(likewise). This is a bit more strict but also more self-consistent thank
link.exe (see comment in code)
- implements handling for all the selection kinds
In practice, compilers only generate comdats with
IMAGE_COMDAT_SELECT_NODUPLICATES (LLD now produces duplicate symbol errors for
these), IMAGE_COMDAT_SELECT_ANY (no behavior change), and
IMAGE_COMDAT_SELECT_LARGEST (for RTTI data; here LLD should no longer create
broken executables when linking some TUs with RTTI enabled and some with it
disabled – but see below).
The implementation of `IMAGE_COMDAT_SELECT_LARGEST` is incomplete: If one
SELECT_LARGEST comdat replaces an earlier one, the comdat symbol is replaced
correctly, but the old section stays loaded and if /opt:ref is disabled (via
/opt:noref or /debug) it's still written to the output. That's not ideal, but
better than the current treatment of just picking any one of those comdats. I
hope to fix this better later.
Fixes most of PR40094.
Differential Revision: https://reviews.llvm.org/D57324
llvm-svn: 352590
References between associated comdats are invalid per COFF spec, but the newest
Windows SDK contains obj files that have these references
(https://bugs.chromium.org/p/chromium/issues/detail?id=925943#c13). So add back
support for them and add tests for them. The old code handled them fine.
This makes lld-link match the behavior of newer link.exe versions as far as I
can tell. (The behavior before this change matched the behavior of older
link.exe versions.)
This mostly reverts r352254.
Differential Revision: https://reviews.llvm.org/D57387
llvm-svn: 352508
Many different sections can have the same name, so include the indices of the
sections mentioned in the diagnostic too.
I'm debugging something I can't repro locally, maybe this will help.
llvm-svn: 352428
I need the comdat selection for PR40094. To keep the patch for that smaller,
I'm adding it here, and as a first application I'm using it to reject
associative comdats referring to earlier associative comdats. Depends on
D56929; together with that all associative comdats referring to other
associative comdats are now rejected.
Differential Revision: https://reviews.llvm.org/D56931
llvm-svn: 352254
Currently, if an associative comdat appears after the comdat it's associated
with it's processed immediately, else it's deferred until the end of the object
file. I found this confusing to think about while working on PR40094, so this
makes it so that associated comdats are always processed at the end of the
object file. This seems to be perf-neutral and simpler.
Now there's a natural place to reject the associated comdats referring to later
associated comdats (associated comdats referring to associated comdats is
invalid per COFF spec) that, so reject those. (A later patch will reject
associated comdats referring to earlier comdats.)
Differential Revision: https://reviews.llvm.org/D56929
llvm-svn: 351917
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Changes a few things I noticed while reading this code.
- fix a few typos in comments
- remove two `auto` uses where the type wasn't clear to me
- add comment saying that two sequential checks for `if (SparseChunks[SectionNumber] == PendingComdat)` are intentional
- name two parameters
No behavior change.
Differential Revision: https://reviews.llvm.org/D56677
llvm-svn: 351101
Normally one wouldn't run into that case, but it is possible with
a little creative ordering of special libraries.
Differential Revision: https://reviews.llvm.org/D53388
llvm-svn: 344776
For certain cases of inline functions written to comdat sections,
GCC 5.x produces a weak symbol in addition, which would end up
undefined in some cases.
This no longer seems to happen with GCC 6.x or newer though.
Differential Revision: https://reviews.llvm.org/D52602
llvm-svn: 343877
When GNU tools create a weak alias, they produce a strong symbol
named .weak.<weaksymbol>.<relatedstrongsymbol>.
GNU ld allows many such weak alternatives for the same weak symbol, and
the linker picks the first one encountered.
This can't be reproduced by assembling from .s files, since llvm-mc
produces symbols named .weak.<weaksymbol>.default in these cases.
Differential Revision: https://reviews.llvm.org/D52601
llvm-svn: 343704
MinGW configurations don't use associative comdats, as GNU ld doesn't
support that. Instead they produce normal comdats named .text$sym,
.xdata$sym and .pdata$sym.
GNU ld doesn't discard any comdats starting with .xdata or .pdata,
even if --gc-sections is used (while it does discard other unreferenced
comdats), regardless of what symbol name is used after the $ separator.
For LLD, treat any such comdat as implicitly associative to the base
symbol. This requires maintaining a map from symbol name to section
number, but that is only maintained when the MinGW flag has been
enabled.
Differential Revision: https://reviews.llvm.org/D49700
llvm-svn: 339058
Discard them unless they have been associated by other means (yet
uimplemented).
According to MS link.exe, such sections are illegal, but MinGW setups
use them in their take on associative comdats.
This avoids leaving references to the bogus SectionChunk* PendingComdat,
which cannot be dereferenced.
This fixes PR38183.
Differential Revision: https://reviews.llvm.org/D49653
llvm-svn: 338064
Previously, the error messages didn't contain symbol name because we
didn't read a symbol name for these error messages.
Differential Revision: https://reviews.llvm.org/D49762
llvm-svn: 337863
Future symbol insertions can potentially change the type of these
symbols - keep pointers to the base class to reflect this, and
use dynamic casts to inspect them before using as the subclass
type.
This fixes crashes that were possible before, by touching these
symbols that now are populated as e.g. a DefinedRegular, via
the old pointers with DefinedImportThunk type.
Differential Revision: https://reviews.llvm.org/D48953
llvm-svn: 336652
We discovered (crbug.com/838449#c24) that string tail merging can
negatively affect compressed binary size, so provide a flag to turn
it off for users who care more about compressed size than uncompressed
size.
Differential Revision: https://reviews.llvm.org/D46780
llvm-svn: 332149
Summary:
When a symbol refers to a special section or a section that doesn't
exist, lld would fatal with "broken object file". This change gives a
different message for each scenario, and includes the name of the
file, name of the symbol, and the section being referred to.
Reviewers: pcc, ruiu
Reviewed By: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46090
llvm-svn: 330883
Summary:
This change does three things:
- Try to find the file and line number of an undefined symbol
reference by reading codeview debug info.
- Try to find the name of the function or global variable with the
undefined symbol reference by searching the object file's symbol
table.
- Prints the information in the same style as the ELF linker.
Differential Revision: https://reviews.llvm.org/D45467
llvm-svn: 330235
In COFF, duplicate string literals are merged by placing them in a
comdat whose leader symbol name contains a specific prefix followed
by the hash and partial contents of the string literal. This gives
us an easy way to identify sections containing string literals in
the linker: check for leader symbol names with the given prefix.
Any sections that are identified in this way as containing string
literals may be tail merged. We do so using the StringTableBuilder
class, which is also used to tail merge string literals in the ELF
linker. Tail merging is enabled only if ICF is enabled, as this
provides a signal as to whether the user cares about binary size.
Differential Revision: https://reviews.llvm.org/D44504
llvm-svn: 327668
Summary:
This protects calls to longjmp from transferring control to arbitrary
program points. Instead, longjmp calls are limited to the set of
registered setjmp return addresses.
This also implements /guard:nolongjmp to allow users to link in object
files that call setjmp that weren't compiled with /guard:cf. In this
case, the linker will approximate the set of address taken functions,
but it will leave longjmp unprotected.
I used the following program to test, compiling it with different -guard
flags:
$ cl -c t.c -guard:cf
$ lld-link t.obj -guard:cf
#include <setjmp.h>
#include <stdio.h>
jmp_buf buf;
void g() {
printf("before longjmp\n");
fflush(stdout);
longjmp(buf, 1);
}
void f() {
if (setjmp(buf)) {
printf("setjmp returned non-zero\n");
return;
}
g();
}
int main() {
f();
printf("hello world\n");
}
In particular, the program aborts when the code is compiled *without*
-guard:cf and linked with -guard:cf. That indicates that longjmps are
protected.
Reviewers: ruiu, inglorion, amccarth
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43217
llvm-svn: 325047
Summary:
This patch adds some initial support for Windows control flow guard. At
the end of the day, the linker needs to synthesize a table of RVAs very
similar to the structured exception handler table (/safeseh).
Both /safeseh and /guard:cf take sections of symbol table indices
(.sxdata and .gfids$y) and turn them into RVA tables referenced by the
load config struct in the CRT through special symbols.
Reviewers: ruiu, amccarth
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42592
llvm-svn: 324306
It's pretty annoying to have LLD lowercase paths in error messages when
cross-compiling from a case-sensitive filesystem, since e.g. if I want
to examine the problematic object file, I have to perform some manual
case correction instead of just being able to copy the path from the
error message.
Differential Revision: https://reviews.llvm.org/D40931
llvm-svn: 319996
This patch is to rename check CHECK and make it a C macro, so that
we can evaluate the second argument lazily.
Differential Revision: https://reviews.llvm.org/D40915
llvm-svn: 319974
Instead of building intermediate sets of exception handlers for each
object file, just create one for the final output file.
Differential Revision: https://reviews.llvm.org/D40581
llvm-svn: 319244
If /debug was not specified, readSection will return a null
pointer for debug sections. If the debug section is associative with
another section, we need to make sure that the section returned from
readSection is not a null pointer before adding it as an associative
section.
Differential Revision: https://reviews.llvm.org/D40533
llvm-svn: 319133
With this change, instead of creating a SectionChunk for each section
in the object file, we only create them when we encounter a prevailing
comdat section.
Also change how symbol resolution occurs between comdat symbols. Now
only the comdat leader participates in comdat resolution, and not any
other external associated symbols. This is more in line with how COFF
semantics are defined, and should allow for a more straightforward
implementation of non-ANY comdat types.
On my machine, this change reduces our runtime linking a release
build of chrome_child.dll with /nopdb from 5.65s to 4.54s (median of
50 runs).
Differential Revision: https://reviews.llvm.org/D40238
llvm-svn: 319090
Now that we have only SymbolBody as the symbol class. So, "SymbolBody"
is a bit strange name now. This is a mechanical change generated by
perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)
nd clang-format-diff.
Differential Revision: https://reviews.llvm.org/D39459
llvm-svn: 317370
Summary:
The COFF linker and the ELF linker have long had similar but separate
Error.h and Error.cpp files to implement error handling. This change
introduces new error handling code in Common/ErrorHandler.h, changes the
COFF and ELF linkers to use it, and removes the old, separate
implementations.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39259
llvm-svn: 316624
Apply the simplification suggestions that Peter Collingbourne made
during the review at D37368. The returned thunk is cast to the
appropriate type in the SymbolTable, and the constant symbol's body is
not needed directly, so avoid the assignment. NFC
llvm-svn: 312391
If a symbol is locally defined and is DLL imported in another
translation unit, and the object with the locally defined version is
loaded prior to the imported version, then the linker will fail to
resolve the definition of the thunk and return the locally defined
symbol. This will then be attempted to be cast to an import thunk,
which will clearly fail.
Only return the thunk if the symbol is inserted or a thunk is created.
Otherwise, report a duplication error.
llvm-svn: 312386
This reverts commit r312171 because it is pointed out that that's not a
correct fix (see https://bugs.llvm.org/show_bug.cgi?id=32674#c14) and
also because it broke buildbots.
llvm-svn: 312174
MSVC link.exe supports nested static libraries. That is, an .a file can
contain other .a file as its member. It is reported that MySQL actually
depends on this feature.
Fixes https://bugs.llvm.org/show_bug.cgi?id=32674
llvm-svn: 312171
A plain empty entry point function that returns 0 seems to produce
a binary that loads and runs fine in wine.
Differential Revision: https://reviews.llvm.org/D34833
llvm-svn: 306963
Summary:
Previously we didn't add debug info chunks to the SparseChunks array, so
they didn't participate in section GC. Now we do.
Reviewers: ruiu
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D34356
llvm-svn: 305811
Summary:
Adds a "Discarded" bool to SectionChunk to indicate if the section was
discarded by COMDAT deduplication. The Writer still just checks
`isLive()`.
Fixes PR33446
Reviewers: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34288
llvm-svn: 305582
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
This reverts commit r303304 because it looks like the change
introduced a crash bug. At least after that change, LLD with thinlto
crashes when linking Chromium.
llvm-svn: 303527
We've been using make<> to allocate new objects in ELF. We have
the same function in COFF, but we didn't use it widely due to
negligence. This patch uses the function in COFF to close the gap
between ELF and COFF.
llvm-svn: 303357
This reverts re-submits r303225 which was reverted in r303270 because it
broke the sanitizer-windows bot.
The reason of the failure is that we were writing dead symbols to the
symbol table. I fixed the issue.
llvm-svn: 303304
and follow-up r303226 "Fix Windows buildbots."
This broke the sanitizer-windows buildbot.
> Previously, the garbage collector (enabled by default or by explicitly
> passing /opt:ref) did not kill dllimported symbols. As a result,
> dllimported symbols could be added to resulting executables' dllimport
> list even if no one was actually using them.
>
> This patch implements dllexported symbol garbage collection. Just like
> COMDAT sections, dllimported symbols now have Live bits to manage their
> liveness, and MarkLive marks reachable dllimported symbols.
>
> Fixes https://bugs.llvm.org/show_bug.cgi?id=32950
>
> Reviewers: pcc
>
> Subscribers: llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D33264
llvm-svn: 303270
Summary:
Previously, the garbage collector (enabled by default or by explicitly
passing /opt:ref) did not kill dllimported symbols. As a result,
dllimported symbols could be added to resulting executables' dllimport
list even if no one was actually using them.
This patch implements dllexported symbol garbage collection. Just like
COMDAT sections, dllimported symbols now have Live bits to manage their
liveness, and MarkLive marks reachable dllimported symbols.
Fixes https://bugs.llvm.org/show_bug.cgi?id=32950
Reviewers: pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33264
llvm-svn: 303225
CONSTANT imports expect both the `_imp_` prefixed and non-prefixed
symbols should be added to the symbol table. This allows for linking
symbols like _NSConcreteGlobalBlock in WinObjC. The previous change
would generate the import library properly by handling the option but
would not consume the generated entry properly.
llvm-svn: 301657
Start using it in LLD to avoid needing to read bitcode again just to get the
target triple, and in llvm-lto2 to avoid printing symbol table information
that is inappropriate for the target.
Differential Revision: https://reviews.llvm.org/D32038
llvm-svn: 300300
Introduce symbol table data structures that can be potentially written to
disk, have the LTO library build those data structures using temporarily
constructed modules and redirect the LTO library implementation to go through
those data structures. This allows us to remove the LLVMContext and Modules
owned by InputFile.
With this change I measured a peak memory consumption decrease from 5.4GB to
2.8GB in a no-op incremental ThinLTO link of Chromium on Linux. The impact on
memory consumption is larger in COFF linkers where we are currently forced
to materialize all metadata in order to read linker options. Peak memory
consumption linking a large piece of Chromium for Windows with full LTO and
debug info decreases from >64GB (OOM) to 15GB.
Part of PR27551.
Differential Revision: https://reviews.llvm.org/D31364
llvm-svn: 299168
Summary: In the ELF linker, we create the buffer identifier for bitcode files by appending the object name to the archive name. This change makes the COFF linker do the same. Without the change, ThinLTO builds can fail with an error message about multiple ThinLTO modules per object file, caused by object files contained in different archives having the same name.
Reviewers: pcc, ruiu
Reviewed By: pcc
Subscribers: mehdi_amini
Differential Revision: https://reviews.llvm.org/D31402
llvm-svn: 298942
Summary: The COFF linker previously implemented link-time optimization using an API which has now been marked as legacy. This change refactors the COFF linker to use the new LTO API, which is also used by the ELF linker.
Reviewers: pcc, ruiu
Reviewed By: pcc
Subscribers: mgorny, mehdi_amini
Differential Revision: https://reviews.llvm.org/D29059
llvm-svn: 293967
I thought for a while about how to remove it, but it looks like we
can just copy the file for now. Of course I'm not happy about that,
but it's just less than 50 lines of code, and we already have
duplicate code in Error.h and some other places. I want to solve
them all at once later.
Differential Revision: https://reviews.llvm.org/D27819
llvm-svn: 290062
Profiling revealed that the majority of lld's execution time on Windows was
spent opening and mapping input files. We can reduce this cost significantly
by performing these operations asynchronously.
This change introduces a queue for all operations on input file data. When
we discover that we need to load a file (for example, when we find a lazy
archive for an undefined symbol, or when we read a linker directive to
load a file from disk), the file operation is launched using a future and
the symbol resolution operation is enqueued. This implies another change
to symbol resolution semantics, but it seems to be harmless ("ninja All"
in Chromium still succeeds).
To measure the perf impact of this change I linked Chromium's chrome_child.dll
with both thin and fat archives.
Thin archives:
Before (median of 5 runs): 19.50s
After: 10.93s
Fat archives:
Before: 12.00s
After: 9.90s
On Linux I found that doing this asynchronously had a negative effect on
performance, probably because the cost of mapping a file is small enough that
it becomes outweighed by the cost of managing the futures. So on non-Windows
platforms I use the deferred execution strategy.
Differential Revision: https://reviews.llvm.org/D27768
llvm-svn: 289760
This ports the ELF linker's symbol table design, introduced in r268178,
to the COFF linker.
Differential Revision: http://reviews.llvm.org/D21166
llvm-svn: 289280
Profiling revealed that we were spending 5% of our time linking
chrome_child.dll just in this call to toString().
Differential Revision: https://reviews.llvm.org/D27628
llvm-svn: 289270
Previously, we had different way to stringize SymbolBody and InputFile
to construct error messages. This patch defines overloaded function
toString() so that we don't need to memorize all these different
function names.
With that change, it is now easy to include demangled names in error
messages. Now, if there is a symbol name conflict, we'll print out
both mangled and demangled names.
llvm-svn: 288992
Previously, we discarded .debug$ sections. This patch adds them to
files so that PDB.cpp can access them.
This patch also adds a debug option, /dumppdb, to dump debug info
fed to createPDB so that we can verify that valid data has been passed.
llvm-svn: 287555
This flag is implemented similarly to --reproduce in the ELF linker.
This patch implements /linkrepro by moving the cpio writer and associated
utility functions to lldCore, and using that implementation in both linkers.
One COFF-specific detail is that we store the object file from which the
resource files were created in our reproducer, rather than the resource
files themselves. This allows the reproducer to be used on non-Windows
systems for example.
Differential Revision: https://reviews.llvm.org/D22418
llvm-svn: 276719