Allow disabling either the full auto import feature, or just
forbidding the cases that require runtime fixups.
As long as all auto imported variables are referenced from separate
.refptr$<name> sections, we can alias them on top of the IAT entries
and don't actually need any runtime fixups via pseudo relocations.
LLVM generates references to variables in .refptr stubs, if it
isn't known that the variable for sure is defined in the same object
module. Runtime pseudo relocs are needed if the addresses of auto
imported variables are used in constant initializers though.
Fixing up runtime pseudo relocations requires the use of
VirtualProtect (which is disallowed in WinStore/UWP apps) or
VirtualProtectFromApp. To allow any risk of ambiguity, allow
rejecting cases that would require this at the linker stage.
This adds support for the --disable-runtime-pseudo-reloc and
--disable-auto-import options in the MinGW driver (matching GNU ld.bfd)
with corresponding lld private options in the COFF driver.
Differential Revision: https://reviews.llvm.org/D78923
Use the unique filenames that are used when /lldsavetemps is passed.
After this change, module names for LTO blobs in PDBs will be unique.
Visual Studio and probably other debuggers expect module names to be
unique.
Revert some changes from 1e0b158db (2017) that are no longer necessary
after removing MSVC LTO support.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D78221
As of a while ago, lld groups all undefined references to a single
symbol in a single diagnostic. Back then, I made it so that we
print up to 10 references to each undefined symbol.
Having used this for a while, I never wished there were more
references, but I sometimes found that this can print a lot of
output. lld prints up to 10 diagnostics by default, and if
each has 10 references (which I've seen in practice), and each
undefined symbol produces 2 (possibly very long) lines of output,
that's over 200 lines of error output.
Let's try it with just 3 references for a while and see how
that feels in practice.
Differential Revision: https://reviews.llvm.org/D77017
Both MS link.exe and GNU ld.bfd handle it this way; one can have
multiple object files defining the same absolute symbols, as long
as it defines it to the same value. But if there are multiple absolute
symbols with differing values, it is treated as an error.
Differential Revision: https://reviews.llvm.org/D71981
Previously this caused crashes in the reportDuplicate method.
A DefinedAbsolute doesn't have any InputFile attached to it, so we
can't report the file for the original symbol.
We could add an InputFile argument to SymbolTable::addAbsolute
only for the sake of error reporting, but even then it'd be assymetrical,
only pointing out the file containing the new conflicting definition,
not the original one.
Differential Revision: https://reviews.llvm.org/D71679
This broke in 51dcb292cc, "[lld-link] diagnose undefined symbols
before LTO when possible" (very soon after the 9.0 branch, so
luckily the 9.0 release is unaffected).
The code for loading objects we believe might be needed for autoimport
(loadMinGWAutomaticImports()) does run before the new
reportUnresolvable() function, but it had a condition to only operate
on symbols from regular object files. This condition came from
resolveRemainingUndefines(), but as loadMinGWAutomaticImports() now
has to operate before the LTO, it has to operate on undefineds from
LTO objects as well.
Differential Revision: https://reviews.llvm.org/D70166
As we now have code that parses the dwarf info for variable locations,
we can use that instead of relying on the higher level Symbolizer library,
reducing the previous two different dwarf codepaths into one.
Differential Revision: https://reviews.llvm.org/D69198
llvm-svn: 375391
This fixes the second part of PR42407.
For files with dwarf debug info, it manually loads and iterates
.debug_info to find the declared location of variables, to allow
reporting them. (This matches the corresponding code in the ELF
linker.)
For functions, it uses the existing getFileLineDwarf which uses
LLVMSymbolizer for translating addresses to file lines.
In object files with codeview debug info, only the source location
of duplicate functions is printed. (And even there, only for the
first input file. The getFileLineCodeView function requires the
object file to be fully loaded and initialized to properly resolve
source locations, but duplicate symbols are reported at a stage when
the second object file isn't fully loaded yet.)
Differential Revision: https://reviews.llvm.org/D68975
llvm-svn: 375218
This makes use of it slightly clearer, and makes it match the
same construct in the lld ELF linker.
Differential Revision: https://reviews.llvm.org/D68935
llvm-svn: 374869
Summary:
This is a re-land of r370487 with a fix for the use-after-free bug
that rev contained.
This implements -start-lib and -end-lib flags for lld-link, analogous
to the similarly named options in ld.lld. Object files after
-start-lib are included in the link only when needed to resolve
undefined symbols. The -end-lib flag goes back to the normal behavior
of always including object files in the link. This mimics the
semantics of static libraries, but without needing to actually create
the archive file.
Reviewers: ruiu, smeenai, MaskRay
Reviewed By: ruiu, MaskRay
Subscribers: akhuang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66848
llvm-svn: 370816
Summary:
This implements -start-lib and -end-lib flags for lld-link, analogous
to the similarly named options in ld.lld. Object files after
-start-lib are included in the link only when needed to resolve
undefined symbols. The -end-lib flag goes back to the normal behavior
of always including object files in the link. This mimics the
semantics of static libraries, but without needing to actually create
the archive file.
Reviewers: ruiu, smeenai, MaskRay
Reviewed By: ruiu, MaskRay
Subscribers: akhuang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66848
llvm-svn: 370487
This avoids a spurious and confusing log message in cases where
both e.g. "alias" and "__imp_alias" exist.
Differential Revision: https://reviews.llvm.org/D65598
llvm-svn: 367673
Summary:
This allows reporting undefined symbols before LTO codegen is
run. Since LTO codegen can take a long time, this improves user
experience by avoiding that time spend if the link is going to
fail with undefined symbols anyway.
Fixes PR32400.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: mehdi_amini, steven_wu, dexonsmith, mstorsjo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62434
llvm-svn: 367136
Also add test coverage for thin archives (which are the only way I could
come up with to test at least some of the diagnostic changes).
Differential Revision: https://reviews.llvm.org/D64927
llvm-svn: 366573
This reverts r365990 (git commit 1a6053ebc6)
The test no longer depends on the Visual C++ libraries. I confirmed that
the crash still reproduces with the new test case if I remove the null
check.
llvm-svn: 366095
This patch does the same thing as r365595 to other subdirectories,
which completes the naming style change for the entire lld directory.
With this, the naming style conversion is complete for lld.
Differential Revision: https://reviews.llvm.org/D64473
llvm-svn: 365730
lld/coff already deduplicated undefined symbols on a TU level: It would
group all references to a symbol from a single TU. This makes it so that
references from all TUs to a single symbol are grouped together.
Since lld/coff almost did what I thought it did already, the patch is
much smaller than the elf version. The only not local change is that
getSymbolLocations() now returns a vector<string> instead of a string,
so that the undefined symbol reporting code can know how many references
to a symbol exist in a given TU.
Fixes PR42260 for lld/coff.
Differential Revision: https://reviews.llvm.org/D63646
llvm-svn: 364285
r363016 let lld-link and llvm-lib share the /machine: parsing code.
This lets llvm-cvtres share it as well.
Making llvm-cvtres depend on llvm-lib seemed a bit strange (it doesn't
need llvm-lib's dependencies on BinaryFormat and BitReader) and I
couldn't find a good place to put this code. Since it's just a few
lines, put it in lib/Object for now.
Differential Revision: https://reviews.llvm.org/D63120
llvm-svn: 363144
And share some code with lld-link.
While here, also add a FIXME about PR42180 and merge r360150 to llvm-lib.
Differential Revision: https://reviews.llvm.org/D63021
llvm-svn: 363016
Summary:
When handling exports from the command line or from .def files, the
linker does a "fuzzy" string lookup to allow finding mangled symbols.
However, when the symbol is re-exported under a new name, the linker has
to transfer the decorations from the exported symbol over to the new
name. This is implemented by taking the mangled symbol that was found in
the object and replacing the original symbol name with the export name.
Before this patch, LLD implemented the fuzzy search by adding an
undefined symbol with the unmangled name, and then during symbol
resolution, checking if similar mangled symbols had been added after the
last round of symbol resolution. If so, LLD makes the original symbol a
weak alias of the mangled symbol. Later, to get the original symbol
name, LLD would look through the weak alias and forward it on to the
import library writer, which copies the symbol decorations. This
approach doesn't work when bar is itself a weak alias, as is the case in
asan. It's especially bad when the aliasee of bar contains the string
"bar", consider "bar_default". In this case, we would end up exporting
the symbol "foo_default" when we should've exported just "foo".
To fix this, don't look through weak aliases to find the mangled name.
Save the mangled name earlier during fuzzy symbol lookup.
Fixes PR42074
Reviewers: mstorsjo, ruiu
Subscribers: thakis, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62984
llvm-svn: 362849
SectionChunk is one of the most frequently allocated data structures in
LLD, since there are about four per function when optimizations and
debug info are enabled (.text, .pdata, .xdata, .debug$S).
A PE COFF file cannot be larger than 2GB, so there is an inherent limit
on the length of the section name and the number of relocations.
Decompose the ArrayRef and StringRef into pointer and size, and put them
back together in the accessors for section name and relocation list.
I plan to gather complete performance numbers later by padding
SectionChunk with dead data and measuring performance after all the size
optimizations are done.
llvm-svn: 359923
When mismatched #pragma detect_mismatch declarations occur, now print the conflicting OBJs.
lld-link: error: /failifmismatch: mismatch detected for 'TEST':
>>> test.obj has value 1
>>> test2.obj has value 2
Fixes PR38579
Differential Revision: https://reviews.llvm.org/D58910
llvm-svn: 355543
LLD used to handle comdats as if the selection field was always set to
IMAGE_COMDAT_SELECT_ANY. This means for obj files produced by `cl /Gy`, LLD
would never report a duplicate symbol error.
This change:
- adds validation for the Selection field (should make no difference in
practice for compiler-generated obj inputs)
- rejects comdats that have different Selection fields in different obj files
(likewise). This is a bit more strict but also more self-consistent thank
link.exe (see comment in code)
- implements handling for all the selection kinds
In practice, compilers only generate comdats with
IMAGE_COMDAT_SELECT_NODUPLICATES (LLD now produces duplicate symbol errors for
these), IMAGE_COMDAT_SELECT_ANY (no behavior change), and
IMAGE_COMDAT_SELECT_LARGEST (for RTTI data; here LLD should no longer create
broken executables when linking some TUs with RTTI enabled and some with it
disabled – but see below).
The implementation of `IMAGE_COMDAT_SELECT_LARGEST` is incomplete: If one
SELECT_LARGEST comdat replaces an earlier one, the comdat symbol is replaced
correctly, but the old section stays loaded and if /opt:ref is disabled (via
/opt:noref or /debug) it's still written to the output. That's not ideal, but
better than the current treatment of just picking any one of those comdats. I
hope to fix this better later.
Fixes most of PR40094.
Differential Revision: https://reviews.llvm.org/D57324
llvm-svn: 352590
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Summary:
Reuse the "referenced by" note diagnostic code that we already use for
undefined symbols. In my case, it turned this:
lld-link: error: relocation against symbol in discarded section: .text
lld-link: error: relocation against symbol in discarded section: .text
...
Into this:
lld-link: error: relocation against symbol in discarded section: .text
>>> referenced by libANGLE.lib(CompilerGL.obj):(.SCOVP$M)
>>> referenced by libANGLE.lib(CompilerGL.obj):(.SCOVP$M)
...
lld-link: error: relocation against symbol in discarded section: .text
>>> referenced by obj/third_party/angle/libGLESv2/entry_points_egl_ext.obj:(.SCOVP$M)
>>> referenced by obj/third_party/angle/libGLESv2/entry_points_egl_ext.obj:(.SCOVP$M)
...
I think the new output is more useful.
Reviewers: ruiu, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D54240
llvm-svn: 346427
Don't assume that the IAT chunk will be a DefinedImportData, it can
just as well be a DefinedRegular for gnu import libraries.
Differential Revision: https://reviews.llvm.org/D52381
llvm-svn: 343069
For this, add a few toString() calls when printing the "undefined symbol"
diagnostics; toString() already does demangling on Windows hosts.
Also make lld::demangleMSVC() (called by toString(Symbol*)) call LLVM's
microsoftDemangle() instead of UnDecorateSymbolName() so that it works on
non-Windows hosts – this makes both updating tests easier and provides a better
user experience for people doing cross-links.
This doesn't yet do the right thing for symbols starting with __imp_, but that
can be improved in a follow-up.
Differential Revision: https://reviews.llvm.org/D52104
llvm-svn: 342332
Patch by Thomas Roughton.
This patch adds support for linking with multiple definitions to LLD's
COFF driver, in line with link.exe's /force:multiple option.
Differential Revision: https://reviews.llvm.org/D50598
llvm-svn: 342191
Summary:
r338767 updated the COFF and wasm linker SymbolTable code to be
strutured more like the ELF linker's. That inadvertedly changed the
behavior of the COFF linker so that lazy symbols would be marked as
used in regular objects. This change adds an overload of the insert()
function, similar to the ELF linker, which does not perform that
marking.
Reviewers: ruiu, rnk, hans
Subscribers: aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D51720
llvm-svn: 341585
After fixing up the runtime pseudo relocation, the .refptr.<var>
will be a plain pointer with the same value as the IAT entry itself.
To save a little binary size and reduce the number of runtime pseudo
relocations, redirect references to the IAT entry (via the __imp_<var>
symbol) itself and discard the .refptr.<var> chunk (as long as the
same section chunk doesn't contain anything else than the single
pointer).
As there are now cases for both setting the Live variable to true
and false externally, remove the accessors and setters and just make
the variable public instead.
Differential Revision: https://reviews.llvm.org/D51456
llvm-svn: 341175
Normally, in order to reference exported data symbols from a different
DLL, the declarations need to have the dllimport attribute, in order to
use the __imp_<var> symbol (which contains an address to the actual
variable) instead of the variable itself directly. This isn't an issue
in the same way for functions, since any reference to the function without
the dllimport attribute will end up as a reference to a thunk which loads
the actual target function from the import address table (IAT).
GNU ld, in MinGW environments, supports automatically importing data
symbols from DLLs, even if the references didn't have the appropriate
dllimport attribute. Since the PE/COFF format doesn't support the kind
of relocations that this would require, the MinGW's CRT startup code
has an custom framework of their own for manually fixing the missing
relocations once module is loaded and the target addresses in the IAT
are known.
For this to work, the linker (originall in GNU ld) creates a list of
remaining references needing fixup, which the runtime processes on
startup before handing over control to user code.
While this feature is rather controversial, it's one of the main features
allowing unix style libraries to be used on windows without any extra
porting effort.
Some sort of automatic fixing of data imports is also necessary for the
itanium C++ ABI on windows (as clang implements it right now) for importing
vtable pointers in certain cases, see D43184 for some discussion on that.
The runtime pseudo relocation handler supports 8/16/32/64 bit addresses,
either PC relative references (like IMAGE_REL_*_REL32*) or absolute
references (IMAGE_REL_AMD64_ADDR32, IMAGE_REL_AMD64_ADDR32,
IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a
relocation against the corresponding IAT slot. For the absolute references,
a normal base relocation is created, to update the embedded address
in case the image is loaded at a different address.
The list of runtime pseudo relocations contains the RVA of the
imported symbol (the IAT slot), the RVA of the location the relocation
should be applied to, and a size of the memory location. When the
relocations are fixed at runtime, the difference between the actual
IAT slot value and the IAT slot address is added to the reference,
doing the right thing for both absolute and relative references.
With this patch alone, things work fine for i386 binaries, and mostly
for x86_64 binaries, with feature parity with GNU ld. Despite this,
there are a few gotchas:
- References to data from within code works fine on both x86 architectures,
since their relocations consist of plain 32 or 64 bit absolute/relative
references. On ARM and AArch64, references to data doesn't consist of
a plain 32 or 64 bit embedded address or offset in the code. On ARMNT,
it's usually a MOVW+MOVT instruction pair represented by a
IMAGE_REL_ARM_MOV32T relocation, each instruction containing 16 bit of
the target address), on AArch64, it's usually an ADRP+ADD/LDR/STR
instruction pair with an even more complex encoding, storing a PC
relative address (with a range of +/- 4 GB). This could theoretically
be remedied by extending the runtime pseudo relocation handler with new
relocation types, to support these instruction encodings. This isn't an
issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64.
- For x86_64, if references in code are encoded as 32 bit PC relative
offsets, the runtime relocation will fail if the target turns out to be
out of range for a 32 bit offset.
- Fixing up the relocations at runtime requires making sections writable
if necessary, with the VirtualProtect function. In Windows Store/UWP apps,
this function is forbidden.
These limitations are addressed by a few later patches in lld and
llvm.
Differential Revision: https://reviews.llvm.org/D50917
llvm-svn: 340726
newline() in ErrorHandler.cpp already tries to insert newlines between messages
that contain embedded newlines, so getSymbolLocations() shouldn't return a
string that ends in a newline -- else we end up with two newlines between error
messages.
Makes lld-link's output look more like ld.lld output.
https://reviews.llvm.org/D51117
llvm-svn: 340482