Before this change, you had to cast a symbol to DefinedRegular and then
call isCOMDAT() to determine whether a given symbol is a COMDAT symbol.
Now you can just use isa<DefinedCOMDAT>().
As to the class definition of DefinedCOMDAT, I could remove duplicate
code from DefinedRegular and DefinedCOMDAT by introducing another base
class for them, but I chose not to do that to keep the class hierarchy
shallow. This amount of code duplication is not worth defining a new
class for.
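For readers unfamiliar with the isa<> idiom, here is a minimal, self-contained
sketch of how a kind tag plus classof() makes such a check possible. The class
bodies and the isaSketch helper are simplified stand-ins written for this
example; the real helper is llvm::isa<> from llvm/Support/Casting.h, and the
real symbol classes carry much more state.

```cpp
#include <cassert>

// Simplified stand-in for the linker's symbol base class.
class SymbolBody {
public:
  enum Kind { DefinedRegularKind, DefinedCOMDATKind, UndefinedKind };
  explicit SymbolBody(Kind K) : SymbolKind(K) {}
  Kind kind() const { return SymbolKind; }

private:
  Kind SymbolKind;
};

class DefinedRegular : public SymbolBody {
public:
  DefinedRegular() : SymbolBody(DefinedRegularKind) {}
  static bool classof(const SymbolBody *S) {
    return S->kind() == DefinedRegularKind;
  }
};

class DefinedCOMDAT : public SymbolBody {
public:
  DefinedCOMDAT() : SymbolBody(DefinedCOMDATKind) {}
  static bool classof(const SymbolBody *S) {
    return S->kind() == DefinedCOMDATKind;
  }
};

// Hypothetical miniature of llvm::isa<>: asks the target class whether the
// object's kind tag matches. No cast and no extra member call are needed.
template <typename To> bool isaSketch(const SymbolBody *S) {
  assert(S && "isaSketch used on a null pointer");
  return To::classof(S);
}
```

With this shape, the caller writes a single isaSketch<DefinedCOMDAT>(Sym)
instead of a cast followed by a member call.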
llvm-svn: 240319
DLLs are usually resolved at process startup, but you can
delay-load them by passing the /delayload option to the linker.
If /delayload is specified, the linker has to create data
which is similar to a regular import table.
One notable difference is that the pointers in a delay-load
import table initially point to thunks that resolve
themselves. Each thunk loads a DLL, resolves its name, and then
overwrites the pointer with the result so that subsequent
function calls directly call the desired function. The linker
has to emit the thunks.
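The self-resolving pointer can be illustrated outside the linker with a small
sketch. Everything here (AddSlot, addThunk, realAdd, ResolveCount) is made up
for the example, and a plain function stands in for "load the DLL and look up
the name"; it is not the machine code the linker emits.

```cpp
#include <cassert>

using AddFn = int (*)(int, int);

// Stand-in for a function that would normally live in a delay-loaded DLL.
int realAdd(int A, int B) { return A + B; }

// Counts how often the slow path ran (instrumentation for the sketch only).
int ResolveCount = 0;

int addThunk(int A, int B);

// One slot of the sketch "delay-load import table". It initially points at
// the thunk, not at the real function.
AddFn AddSlot = addThunk;

// The thunk resolves the target, overwrites the slot with the result, and
// then forwards the original call.
int addThunk(int A, int B) {
  ++ResolveCount;
  AddSlot = realAdd; // a real thunk would load the DLL and resolve the name
  return AddSlot(A, B);
}

// Callers always go through the slot, so only the first call pays for
// resolution.
int callAdd(int A, int B) {
  assert(AddSlot && "import slot must never be null");
  return AddSlot(A, B);
}
```

After the first callAdd(), the slot points at realAdd and the thunk is never
entered again.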
llvm-svn: 240250
The .pdata section contains a list of triplets: function start address,
function end address, and its unwind information. Linkers have to
sort the section contents by function start address and record the
section address in the file header (so that the runtime is able to
find it and do a binary search).
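A minimal sketch of why the sort matters, assuming a simplified entry layout.
The struct and function names are illustrative and the end address is treated
as exclusive here; this is not the writer's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One .pdata entry as described above. Field names are illustrative.
struct RuntimeFunction {
  uint32_t Begin;  // function start address
  uint32_t End;    // function end address (exclusive in this sketch)
  uint32_t Unwind; // address of the unwind information
};

static bool byBegin(const RuntimeFunction &A, const RuntimeFunction &B) {
  return A.Begin < B.Begin;
}

// What the linker has to do: order the entries by function start address.
void sortPdata(std::vector<RuntimeFunction> &Entries) {
  std::sort(Entries.begin(), Entries.end(), byBegin);
}

// What the runtime can then do: binary-search for the entry covering Addr.
// Returns nullptr if no function contains the address.
const RuntimeFunction *findEntry(const std::vector<RuntimeFunction> &Sorted,
                                 uint32_t Addr) {
  assert(std::is_sorted(Sorted.begin(), Sorted.end(), byBegin));
  auto It = std::upper_bound(
      Sorted.begin(), Sorted.end(), Addr,
      [](uint32_t A, const RuntimeFunction &E) { return A < E.Begin; });
  if (It == Sorted.begin())
    return nullptr; // Addr lies before the first function
  --It;             // last entry whose Begin <= Addr
  return Addr < It->End ? &*It : nullptr;
}
```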
This change seems to resolve all but one of the remaining test failures
in check{,-clang,-lld} when building everything with clang-cl and
lld-link.
llvm-svn: 240231
This is a case where one mistake caused a very mysterious bug.
I made a mistake in calculating the addresses of common symbols, so
each common symbol pointed not to the beginning of its location
but to the end of its location. (Ouch!)
Common symbols are aligned on 16 byte boundaries. If a common
symbol is small enough to fit between the end of its real
location and whatever comes next, this bug didn't cause any harm.
However, if a common symbol is larger than that, its memory
naturally overlapped with other symbols. That means some
uninitialized variables accidentally shared memory. Because
totally unrelated memory writes mutated other variables, it was
hard to debug.
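The difference between the correct and the buggy address is one line. The
following is a minimal sketch of the layout loop, with made-up names
(CommonSym, layoutCommons); it is not the linker's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct CommonSym {
  uint64_t Size; // size of the common symbol in bytes
  uint64_t Addr; // offset assigned by layoutCommons
};

uint64_t alignTo16(uint64_t V) { return (V + 15) & ~uint64_t(15); }

// Lays out common symbols one after another on 16-byte boundaries and
// records where each one starts. Returns the total size used.
uint64_t layoutCommons(std::vector<CommonSym> &Syms) {
  uint64_t Off = 0;
  for (CommonSym &S : Syms) {
    Off = alignTo16(Off);
    S.Addr = Off; // the beginning of the location
    assert(S.Addr % 16 == 0);
    Off += S.Size;
    // The bug described above is equivalent to assigning S.Addr here,
    // after the size has been added, so it pointed at the end instead.
  }
  return Off;
}
```

For sizes 4, 40 and 8 this yields offsets 0, 16 and 64. With the buggy
assignment the offsets would be 4, 56 and 72: the 4-byte symbol still fits in
the padding before offset 16, but the 40-byte symbol would run from 56 to 96
and overlap its neighbor.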
It's surprising that LLD was able to link itself and all LLD
tests except gunit tests passed with this nasty bug.
With this fix, the new COFF linker is able to pass all tests
for LLVM, Clang and LLD if I use MSVC cl.exe as a compiler.
Only three tests are failing when used with clang-cl.
llvm-svn: 240216
This avoids undefined behaviour caused by an out-of-range access if the
vector is empty, which can happen if an object file's directive section
contains only whitespace.
llvm-svn: 240183
getName() does strlen() on the symbol table, so it's not very fast.
This is not as bad as r239332, though, because archive files export
fewer symbols than object files contain, and their names are usually
shorter.
llvm-svn: 240178
In this linker model, adding an undefined symbol may trigger chain
reactions. It may trigger a Lazy symbol to read a new file.
A new file may contain a directive section, which may contain various
command line options.
Previously, we didn't handle chain reactions well. We visited /include'd
symbols only once, so newly-added /include symbols were ignored.
This patch fixes that bug.
Now, the symbol table is versioned; every time the symbol table is
updated, the version number is incremented. We repeat adding undefined
symbols until the version number does not change. It is guaranteed to
converge -- the number of undefined symbols in the system is finite,
and adding the same undefined symbol more than once is basically a no-op.
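The loop can be sketched with a toy model. ToySymbolTable and its members are
hypothetical: a lazy symbol is modeled as a map entry whose value is the
/include list found in the file it would load.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ToySymbolTable {
  // Lazy symbol name -> names /include'd by the directive section of the
  // file that defines it.
  std::map<std::string, std::vector<std::string>> Lazy;
  std::set<std::string> Undefined;   // undefined symbols added so far
  std::vector<std::string> Includes; // /include'd names seen so far
  unsigned Version = 0;              // bumped on every real update

  void addUndefined(const std::string &Name) {
    if (!Undefined.insert(Name).second)
      return; // adding the same symbol again is a no-op
    ++Version;
    auto It = Lazy.find(Name);
    if (It == Lazy.end())
      return;
    // "Reading the file" exposes new /include options.
    for (const std::string &S : It->second)
      Includes.push_back(S);
  }

  // Repeat until a full pass leaves the version number unchanged.
  void resolveIncludes() {
    unsigned Seen;
    do {
      Seen = Version;
      // Iterate over a copy because addUndefined may append to Includes.
      std::vector<std::string> Snapshot = Includes;
      for (const std::string &S : Snapshot)
        addUndefined(S);
      assert(Version >= Seen);
    } while (Seen != Version);
  }
};
```

A single pass over the initial /include list would stop after the first
symbol; the version check is what makes the chain a -> b -> c get followed to
the end.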
llvm-svn: 240177
None of the implementations replace the SimpleFile with some other file,
they just modify the SimpleFile in-place, so a direct reference to the
file is sufficient.
llvm-svn: 240167
The alternatename option takes the form /alternatename:<from>=<to>.
Its effect is to resolve <from> as <to> if <from> is still undefined
at the end of name resolution.
If <from> is not an undefined symbol but a completely new symbol,
alternatename shouldn't do anything. Previously, it introduced a new
undefined symbol for <from>, which resulted in an undefined symbol error.
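A minimal sketch of the intended rule, using a toy symbol table. The State
enum, the SymTab alias and applyAlternateNames are names invented for this
example.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class State { Undefined, Defined };
using SymTab = std::map<std::string, State>;

// Applies /alternatename pairs at the end of name resolution and returns
// the <from> -> <to> aliases that actually took effect. The symbol table
// itself is never extended.
std::map<std::string, std::string>
applyAlternateNames(const SymTab &Symbols,
                    const std::map<std::string, std::string> &AltNames) {
  std::map<std::string, std::string> Applied;
  for (const auto &KV : AltNames) {
    auto It = Symbols.find(KV.first);
    if (It == Symbols.end())
      continue; // <from> is a completely new symbol: do nothing
    if (It->second == State::Undefined)
      Applied[KV.first] = KV.second; // still undefined: resolve as <to>
  }
  return Applied;
}
```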
llvm-svn: 240161
We don't want to insert a new symbol to the symbol table while reading
a .drectve section because it's going to be too complicated.
That we are reading a directive section means that we are currently
reading some object file. Adding a new undefined symbol to the symbol
table can trigger a library file to read a new file, so it would make
the call stack too deep.
In this patch, I add new symbol names to a list to resolve them later.
llvm-svn: 240076
Alternatename option is in the form of /alternatename:<from>=<to>.
It is an error if there are two options having the same <from> but
different <to>. It is *not* an error if both are the same.
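The duplicate check can be sketched as follows. addAlternateName is a
hypothetical helper that receives the "<from>=<to>" part of the option;
rejecting malformed input is an assumption of this sketch, not something the
text above specifies.

```cpp
#include <cassert>
#include <map>
#include <string>

// Records one "<from>=<to>" pair. Returns false on error: malformed input
// (sketch assumption) or a second pair with the same <from> but a
// different <to>.
bool addAlternateName(std::map<std::string, std::string> &AltNames,
                      const std::string &Arg) {
  std::string::size_type Eq = Arg.find('=');
  if (Eq == std::string::npos || Eq == 0 || Eq + 1 == Arg.size())
    return false;
  std::string From = Arg.substr(0, Eq);
  std::string To = Arg.substr(Eq + 1);
  // emplace leaves an existing entry untouched and reports it.
  auto Res = AltNames.emplace(From, To);
  // Either the pair is new, or it repeats an identical earlier pair.
  return Res.second || Res.first->second == To;
}
```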
llvm-svn: 240075
We skip unknown options in the command line with a warning message
being printed out, but we shouldn't do that for .drectve section.
The section is not visible to the user. We should handle unknown
options as an error.
llvm-svn: 240067
The linker has to create an XML file for each executable.
This patch supports that feature.
You can optionally embed the XML file in an executable as a .rsrc
section. If you choose to do that (by passing the /manifest:embed
option), the linker has to create a textual resource file
containing the XML file, compile it with rc.exe into a binary
resource file, convert that resource file to a COFF file using
cvtres.exe, and then link that COFF file. This patch implements
that feature too.
llvm-svn: 239978
Common symbols will be handled in a separate patch because it seems
Hexagon redefines the notion of common symbol, which I'm not (yet)
very familiar with.
llvm-svn: 239951
On Windows, we have to create a .lib file for each .dll.
When linking against DLLs, the linker doesn't use the DLL files,
but instead reads a list of dllexported symbols from the corresponding
lib files.
A library file containing descriptors of a DLL is called an
import library file.
lib.exe has a feature to create an import library file from a
module-definition file. In this patch, we create a module-definition
file and pass that to lib.exe.
We eventually want to create import library files ourselves
to eliminate the dependency on lib.exe. For now, we just use the MSVC
tool.
llvm-svn: 239937
Module-definition files (.def files) are yet another way to
specify parameters to the linker. You can write a list of dllexported
symbols in module-definition files instead of using /export command
line option. It also supports a few more directives.
The parser code is taken from lib/Driver/WinLinkModuleDef.cpp
with the following modifications:
- Variable names are updated to comply with the LLVM coding style.
- Instead of returning parsing results as "directive" objects,
  it updates the Config object directly.
llvm-svn: 239929
The current approach for initial-exec in ELF/x86_64 is to create a GOT entry
and change the relocation to R_X86_64_PC32 to be handled as a GOT offset.
However, there are two issues with this approach: 1. the R_X86_64_PC32 is
not really required since the GOT relocation will be handled dynamically, and
2. the TLS symbols are not being exported externally, so the correct
relocations are not being applied.
This patch fixes the R_X86_64_GOTTPOFF handling by just emitting a
dynamic R_X86_64_TPOFF64 relocation; it also sets R_X86_64_TPOFF64 to be
handled as a runtime relocation. For the second part, the patch uses a
strategy similar to the one used for aarch64, by reimplementing
buildDynamicSymbolTable from X86_64ExecutableWriter and adding the TLS
symbols to the dynamic symbol table.
Some tests had to be adjusted due to the now missing R_X86_64_PC32 relocation.
With this patch, the simple testcase:
* t1.c:
__thread int t0;
__thread int t1;
__thread int t2;
__thread int t3;
* t0.c:
#include <stdio.h>
extern __thread int t0;
extern __thread int t1;
extern __thread int t2;
extern __thread int t3;
__thread int t4;
__thread int t5;
__thread int t6;
__thread int t7;
int main ()
{
  t0 = 1;
  t1 = 2;
  t2 = 3;
  t3 = 4;
  t4 = 5;
  t5 = 6;
  t6 = 7;
  t7 = 8;
  printf ("%i %i %i %i\n", t0, t1, t2, t3);
  printf ("%i %i %i %i\n", t4, t5, t6, t7);
  return 0;
}
shows correct output for x86_64.
llvm-svn: 239908
This patch fixes the wrong .tbss segment size generated in cases where
multiple modules have uninitialized thread-local variables. For instance:
* t0.c
__thread int x0;
__thread int x1;
__thread int x2;
extern __thread int e0;
extern __thread int e1;
extern __thread int e2;
extern __thread int e3;
int foo0 ()
{
  return x0;
}
int main ()
{
  return x0;
}
* t1.c
__thread int e0;
__thread int e1;
__thread int e2;
__thread int e3;
lld is generating (for aarch64):
[14] .tbss NOBITS 0000000000401000 00001000
0000000000000010 0000000000000000 WAT 0 0 4
Here lld takes only the largest tbss segment into consideration, not the
segments from all objects. ld generates the correct output:
[17] .tbss NOBITS 0000000000410dec 00000dec
000000000000001c 0000000000000000 WAT 0 0 4
This issue is in 'lib/ReaderWriter/ELF/SegmentChunks.cpp', where
Segment<ELFT>::assignVirtualAddress sets wrong slice values: although the
file size of tbss segments does not play a role in the virtual address
placement of other segments, their size should still be considered.
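The two section sizes above can be reproduced with a little arithmetic: t0.c
contributes three ints (12 bytes) and t1.c four ints (16 bytes). Taking only
the largest gives 0x10, while accumulating them gives 0x1c. The following
sketch shows the accumulation; TbssSlice and totalTbssSize are names invented
for this example, not the ones used in SegmentChunks.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct TbssSlice {
  uint64_t Size;  // bytes of uninitialized TLS data from one object
  uint64_t Align; // required alignment of that data
};

uint64_t alignTo(uint64_t V, uint64_t A) {
  assert(A != 0 && "alignment must be non-zero");
  return (V + A - 1) / A * A;
}

// Accumulates the slices from all objects, honoring each one's alignment.
uint64_t totalTbssSize(const std::vector<TbssSlice> &Slices) {
  uint64_t Off = 0;
  for (const TbssSlice &S : Slices)
    Off = alignTo(Off, S.Align) + S.Size;
  return Off;
}
```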
llvm-svn: 239906
DLL files are in the same format as executables but they have export tables.
The format of the export table is described in PE/COFF spec section 5.3.
A new class, EdataContents, takes care of creating chunks for export tables.
What we need to do is to parse command line flags for dllexports, and then
instantiate the class to create chunks. For the writer, export table chunks
are opaque data -- it just adds the chunks to the .edata section.
llvm-svn: 239869
We are currently handling all combinations of SymbolBody types directly.
This patch swaps this and Other if Other->kind() < this->kind()
to reduce the number of combinations. No functionality change intended.
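A minimal sketch of the swap, assuming a toy set of kinds and made-up
resolution rules (a defined symbol beats everything, two defined symbols
conflict). Only the first two lines of compare() reflect the idea of this
patch; the rest is placeholder logic.

```cpp
#include <cassert>

// Kinds are ordered so that "stronger" kinds have smaller values.
enum Kind { DefinedKind = 0, LazyKind = 1, UndefinedKind = 2 };

struct Sym {
  Kind K;
};

// Returns 1 if A wins, -1 if B wins, and 0 if the two conflict.
int compare(const Sym &A, const Sym &B) {
  // Normalize the order first. After this, A.K <= B.K always holds, so
  // only half of the kind pairs need an explicit case below.
  if (B.K < A.K)
    return -compare(B, A);
  assert(A.K <= B.K);

  if (A.K == DefinedKind && B.K == DefinedKind)
    return 0; // duplicate definition (toy rule)
  return 1;   // otherwise the symbol with the smaller kind wins (toy rule)
}
```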
llvm-svn: 239745
Add method to query segments for specified output section name.
Return error if the section is assigned to unknown segment.
Check during layout that sections are matched to segments correctly.
NOTE: no actual functionality of using custom segments is implemented.
Differential Revision: http://reviews.llvm.org/D10359
llvm-svn: 239719
PE/COFF executables/DLLs usually contain data which is called
base relocations. Base relocations are a list of addresses that
need to be fixed by the loader if load-time relocation is needed.
Base relocations are stored in the .reloc section.
We emit one base relocation entry for each IMAGE_REL_AMD64_ADDR64
relocation.
In order to save disk space, base relocations are grouped by page.
Each group is called a block. A block starts with a 32-bit page
address followed by 16-bit offsets in the page. That is a more
efficient representation of addresses than just an array of 32-bit
addresses.
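The grouping can be sketched as below. This is a simplification: it only
models the page address and the 16-bit entries, whose top four bits hold the
relocation type (IMAGE_REL_BASED_DIR64 is 10) and whose low twelve bits hold
the offset in the page. The on-disk block header also carries a block size
field, which the sketch leaves out. groupByPage is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

const uint32_t PageSize = 4096;
const uint16_t RelBasedDir64 = 10; // IMAGE_REL_BASED_DIR64

// Groups relocation addresses by 4 KiB page. Each map entry models one
// block: the key is the page address, the value is its 16-bit entries.
std::map<uint32_t, std::vector<uint16_t>>
groupByPage(const std::vector<uint32_t> &Rvas) {
  std::map<uint32_t, std::vector<uint16_t>> Blocks;
  for (uint32_t Rva : Rvas) {
    uint32_t Page = Rva & ~(PageSize - 1);
    uint32_t Off = Rva & (PageSize - 1);
    assert(Off < PageSize);
    Blocks[Page].push_back(uint16_t((RelBasedDir64 << 12) | Off));
  }
  return Blocks;
}
```

Three relocations in two pages cost two 32-bit page addresses plus three
16-bit entries here, and the saving grows with the number of relocations per
page.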
llvm-svn: 239710
When we add a chunk to an OutputSection, we always want to create
a backreference from an OutputSection to a Chunk. To make sure
we always do, do that in addChunk(). NFC.
llvm-svn: 239706
Resource files are data files containing i18n messages, icon images, etc.
MSVC has a tool to convert a resource file to a regular COFF file so that
you can just link that file to embed resources to an executable.
However, you can directly pass resource files to the linker. If you do that,
the linker invokes the tool automatically. This patch implements that feature.
llvm-svn: 239704
As noted on Errc.h:
// * std::errc is just marked with is_error_condition_enum. This means that
// common patters like AnErrorCode == errc::no_such_file_or_directory take
// 4 virtual calls instead of two comparisons.
And on some libstdc++ those virtual functions conclude that
------------------------
#include <system_error>
int main() {
  std::error_code foo = std::make_error_code(std::errc::no_such_file_or_directory);
  return foo == std::errc::no_such_file_or_directory;
}
-------------------------
should exit with 0.
llvm-svn: 239685
In the case where either a bitcode file and a regular file or two bitcode
files export a common or comdat symbol with the same name, the linker needs
to pick one of them following COFF semantics. This patch implements a design
for resolving such symbols that pushes most of the work onto either LLD's
regular mechanism for resolving common or comdat symbols or the IR linker's
mechanism for doing the same.
We modify SymbolBody::compare to always prefer non-bitcode symbols, so that
during the initial phase of symbol resolution, the symbol table always contains
a regular symbol in any case where we need to choose between a regular and
a bitcode symbol. In SymbolTable::addCombinedLTOObject, we force export
any bitcode symbols that were initially pre-empted by a regular symbol,
and later use SymbolBody::compare to choose between the regular symbol in
the symbol table and the regular symbol from the combined LTO object file.
This design seems to be sound, so long as the resolution mechanism is defined
to be commutative and associative modulo arbitrary choices between symbols
(which seems to be the case for COFF).
Differential Revision: http://reviews.llvm.org/D10329
llvm-svn: 239563
isRoot, isLive and markLive functions are called very frequently.
Previously, they were virtual functions. This patch makes them
non-virtual.
This patch also checks a chunk's liveness before calling its mark().
Previously, we did that at the beginning of markLive(), so the virtual
function would return immediately if the chunk was already live. That
was inefficient.
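A minimal sketch of the call pattern, with a toy Chunk type. The member names
mirror the ones mentioned above, but the bodies and the MarkCalls counter are
made up for illustration.

```cpp
#include <cassert>
#include <vector>

struct Chunk {
  std::vector<Chunk *> Refs; // chunks this chunk refers to
  bool Live = false;
  unsigned MarkCalls = 0;    // instrumentation for the sketch only

  // Non-virtual and trivially cheap.
  bool isLive() const { return Live; }
  void markLive() { Live = true; }

  void mark() {
    assert(!isLive() && "callers check liveness before calling mark()");
    ++MarkCalls;
    markLive();
    for (Chunk *C : Refs)
      if (!C->isLive()) // check first, so live chunks cost no call at all
        C->mark();
  }
};

void markRoots(const std::vector<Chunk *> &Roots) {
  for (Chunk *C : Roots)
    if (!C->isLive())
      C->mark();
}
```

Even with a reference cycle, mark() runs exactly once per chunk.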
llvm-svn: 239458
The code generator may create references to runtime library symbols such as
__chkstk which were not visible via LTOModule. Handle these cases by loading
the object file from the library, but abort if we end up having loaded any
bitcode objects.
Because loading the object file may have introduced new undefined references,
call reportRemainingUndefines again to detect and report them.
Differential Revision: http://reviews.llvm.org/D10332
llvm-svn: 239386
The LLVM code generator can sometimes synthesize symbols, such as SSE
constants, that are not visible via the LTOModule interface. Allow such
symbols so long as they have definitions.
Differential Revision: http://reviews.llvm.org/D10331
llvm-svn: 239385
This change seems to make the linker about 10% faster.
Reading a symbol name is not very cheap because it needs strlen()
on the string table. We were wasting time reading non-external
symbol names that would never be used by the linker.
llvm-svn: 239332
The MSVC profiler reported that this stable_sort takes 7% of the time
when self-linking. As a result, createSection was taking 10% of the time.
Now createSection takes 3%. This small change makes
the linker slightly but perceptibly faster.
llvm-svn: 239292
We forgot to check the auxiliary symbol's type, so we sometimes read
garbage as associative section definitions.
The garbage collector does not consider associative sections live by
themselves, because they are live only when their associated sections
are live.
By reading more data (or garbage) as associative section definitions,
we treated more sections as non-GC-roots, which caused the linker to
discard too many sections by mistake. That caused another mysterious
bug (for example, some global constructors did not run at all).
llvm-svn: 239287
I don't know what the right thing to do is here, but at least 1 does
not seem like a correct value. If we do not align common chunks at
all, a small program which calls puts() from global dtors crashes
mysteriously in a kernel32's function.
I believe the crash was caused by symbols overlapping each other,
and my guess is that alignment has something to do with that, but
I am not 100% sure. Needs investigating.
llvm-svn: 239280
Chunk has writeTo function which takes uint8_t *Buf.
writeHeaderTo feels more consistent with that because this member
function also takes uint8_t *Buf.
llvm-svn: 239236
Previously, half of the constructor for .idata contents was in Chunks.cpp
and the rest was in Writer.cpp. This patch moves the latter to Chunks.cpp.
Now IdataContents class manages everything for .idata section.
llvm-svn: 239230
In this design, Chunk is the only thing that knows how to write
its contents to output file as well as how to apply relocations
there. The writer shouldn't know about the details.
llvm-svn: 239216
This test case used addends that were too large in relocations. Now the test is correct.
Later we need to implement overflow checking to catch such cases.
llvm-svn: 239177
For some reason llvm's r239045 made lld propagate data_1's size. This indicates
a bug somewhere in lld.
I hesitated between changing the test or just checking in a .o produced with
the old llvm-mc. Since the size is now correct, it seemed better to update the
test.
llvm-svn: 239067
Not only entry point symbol but also symbols specified by /include
option must be preserved, as they will never be dead-stripped.
http://reviews.llvm.org/D10220
llvm-svn: 239005
This patch fixes the TLS initial executable for AArch64. The current
implementation has two issues: 1. it does not generate a dynamic
R_AARCH64_TLS_TPREL64 relocation for external module symbols,
and 2. it does not export the TLS initial executable symbol in the
dynamic symbol table.
The fix follows the MIPS strategy of adding an arch-specific GOTSection
class to keep track of the TLS symbols that must be placed in the dynamic
symbol table. It also overrides buildDynamicSymbolTable for the
ExecutableWrite class to add the symbols.
It also refactors AArch64RelocationPass.cpp based on the ARM
backend.
llvm-svn: 238981
This patch fixes the TLS local relocations alignment done by @238258.
As pointed out, the TLS size should not be considered, but rather the
TCB size based on maximum output segment alignment. Although this did
not show up in the simple TLS cases in test-suite, more comprehensive
tests with more local TLS variables showed wrong relocation values
being generated.
The local TLS testcase is expanded to add more TLS variables (both
exported and static), initialized or not.
llvm-svn: 238960
Avoid saying this is based on sections because it's not very accurate.
That we don't split section into smaller chunks of data does not mean
that the linker is built on top of that.
In reality, most parts of the code do not care about the underlying data,
so they are based on neither "atoms" nor sections.
The symbol table only cares about symbol names and their types.
The writer handles a list of chunks, which look like just blobs,
and the writer doesn't care what those chunks are backed by.
The only thing that interacts with sections is SectionChunk, which is
abstracted away as one type of Chunk.
llvm-svn: 238902
In r238690, I made all files have only MemoryBufferRefs. This change
is to do the same thing for the bitcode file reader. Also updated
a few variable names to match with other code.
llvm-svn: 238782
Symbols exported by DLLs can be imported not by name but by
a small number, or ordinal. Usually, symbols have both ordinals
and names, and in that case ordinals are called "hints" and
used by the loader as hints.
However, symbols can have only ordinals. They are called
import-by-ordinal symbols. If you choose to use the feature, you
need to manage ordinals by hand so that they never change.
But it's supposed to make dynamic linking faster because
it needs no string comparison. Not sure if that claim still
stands in year 2015, though. Anyways, the feature exists,
and this patch implements that.
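For reference, this is how the two kinds of import are told apart in a
PE32+ import lookup table entry, per the PE/COFF specification: bit 63 set
means import by ordinal with the ordinal in the low 16 bits; otherwise the
low 31 bits hold the address of a hint/name entry. The helper names below are
made up for this sketch and are not the ones used in the patch.

```cpp
#include <cassert>
#include <cstdint>

const uint64_t OrdinalFlag = uint64_t(1) << 63;

// Entry for an import-by-ordinal symbol: flag bit plus 16-bit ordinal.
uint64_t makeOrdinalImport(uint16_t Ordinal) { return OrdinalFlag | Ordinal; }

// Entry for an import-by-name symbol: 31-bit address of a hint/name entry.
uint64_t makeNameImport(uint32_t HintNameRva) {
  assert((HintNameRva >> 31) == 0 && "address must fit in 31 bits");
  return HintNameRva;
}

bool isOrdinalImport(uint64_t Entry) { return (Entry & OrdinalFlag) != 0; }

uint16_t ordinalOf(uint64_t Entry) { return uint16_t(Entry & 0xffff); }
```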
llvm-svn: 238780
I'm adding ordinal-only (nameless) imports to the import table.
The chunk for that type is going to be different from LookupChunk.
Without this change, we cannot add objects of the new type to the
vectors.
llvm-svn: 238779