In the resolver, we maintain a list of undefined symbols, and when we
visit an archive file, we check that file if undefined symbols can be
resolved using files in the archive. The archive file class provides
find() function to lookup a symbol.
Previously, we call find() for each undefined symbols. Archive files
may be visited multiple times if they are in a --start-group and
--end-group. If we visit a file M times and if we have N undefined
symbols, find() is called M*N times. I found that that is one of the
most significant bottlenecks in LLD when linking a large executable.
find() is not a very cheap operation because it looks up a hash table
for a given string. And a string, or a symbol name, can be pretty long
if you are dealing with C++ symbols.
We can eliminate the bottleneck.
Calling find() with the same symbol multiple times is a waste. If a
result of looking up a symbol is "not found", it stays "not found"
forever because the symbol simply doesn't exist in the archive.
Thus, we should call find() only for newly-added undefined symbols.
This optimization makes O(M*N) O(N).
In this patch, all undefined symbols are added to a vector. For each
archive/shared library file, we maintain a start position P. All
symbols [0, P) are already searched. [P, end of the vector) are not
searched yet. For each file, we scan the vector only once.
This patch changes the order in which undefined symbols are looked for.
Previously, we iterated over the result of _symbolTable.undefines().
Now we iterate over the new vector. This is a benign change but caused
differences in output if remaining undefines exist. This is why some
tests are updated.
The performance improvement of this patch seems sometimes significant.
Previously, linking chrome.dll on my workstation (Xeon 2.4GHz 8 cores)
took about 70 seconds. Now it takes (only?) 30 seconds!
http://reviews.llvm.org/D8091
llvm-svn: 231434
Merge::mergeByLargestSection is half-baked since it's defined
in terms of section size, there's no way to get the section size
of an atom.
Currently we work around the issue by traversing the layout edges
to both directions and calculate the sum of all atoms reachable.
I wrote that code but I knew it's hacky. It's even not guaranteed
to work. If you add layout edges before the core linking, it
miscalculates a size.
Also it's of course slow. It's basically a linked list traversal.
In this patch I added DefinedAtom::sectionSize so that we can use
that for mergeByLargestSection. I'm not very happy to add a new
field to DefinedAtom base class, but I think it's legitimate since
mergeByLargestSection is defined for section size, and the section
size is currently just missing.
http://reviews.llvm.org/D7966
llvm-svn: 231290
Previously we didn't call the hook on a file in an archive, which
let the PE/COFF port fail to link files in archives. It was a
simple mistake. Added a call to the hook and also added a test to
catch that error.
const_cast is an unfortunate hack. Files in the resolver are usually
const, but they are not actually const objects, since they are
mutated if either a file is taken from an archive (an archive file
does never return the same file twice) or the beforeLink hook is
called. Maybe we should just remove const from there -- because they
are not const.
llvm-svn: 230808
This is yet another edge case of base relocation for symbols. Absolute
symbols are in general not target of base relocation because absolute
atom is a way to point to a specific memory location. In r229816, I
removed entries for absolute atoms from the base relocation table
(so that they won't be fixed by the loader).
However, there was one exception -- ImageBase. ImageBase points to the
start address of the current image in memory. That needs to be fixed up
at load time. This patch is to treat the symbol in a special manner.
llvm-svn: 229961
Previously we wrongly emitted a base relocation entry for an absolute symbol.
That made the loader to rewrite some instruction operands with wrong values
only when a DLL is not loaded at the default address. That caused a
misterious crash of some executable.
Absolute symbols will of course never change value wherever the binary is
loaded to memory. We shouldn't emit base relocations for absolute symbols.
llvm-svn: 229816
When this test was written, no llvm tool could print out contents
of base relocation section. Now llvm-readobj is able to dump it in
a text format. Use that tool to make this test readable.
llvm-svn: 229814
Weak aliases defined using /alternatename command line option were getting
wrong RVAs in the final output because of wrong atom ordinal. Alias atoms
were assigned large ordinals than any other regular atoms because they were
instantiated after other atoms and just got new (larger) ordinals.
Atoms are sorted by its file and atom ordinals in the order pass. Alias
atoms were located after all other atoms in the same file.
An alias atom's ordinal needs to be smaller than its alias target but larger
than the atom appeared before the target -- so that the alias is located
between the two. Since an alias has no size, the alias target will be located
at the same location as the alias.
In this patch, I made a gap between two regular atoms so that we can put
aliases after instantiating them (without re-numbering existing atoms).
llvm-svn: 229762
Previously, we incorrectly added the image base address to an absolute
symbol address (that calculation doesn't make any sense) if an
IMAGE_REL_I386_DIR32 relocation is applied to an absolute symbol.
This patch fixes the issue. With this fix, we can link Bochs using LLD.
(Choosing Bochs has no special meaining -- I just picked it up as a
test program and found it didn't work.) This also fixes one of the
issues we currently have to link Chromium using LLD.
llvm-svn: 228279
The LayoutPass is one of the slowest pass. This change is to skip
that pass. This change not only improve performance but also improve
maintainability of the code because the LayoutPass is pretty complex.
Previously we used the LayoutPass to sort all atoms in a specific way,
and reorder them again for PE/COFF in GroupedSectionPass.
I spent time on improving and fixing bugs in the LayoutPass (e.g.
r193029), but the pass is still hard to understand and hard to use.
It's better not to depend on that if we don't need. For PE/COFF, we
just wanted to sort atoms in the same order as the file order in the
command line.
The feature we used in the LayoutPass is now simplified to
compareByPosition function in OrderPass.cpp. The function is just 5
lines.
This patch changes the order of final output because it changes the
sort order a bit. The output is still correct, though.
llvm-svn: 227500
We used to manage the state whether we are in a group or not
using a counter. The counter is incremented by one if we jump from
end-group to start-group, and decremented by one if we don't.
The counter was assumed to be either zero or one, but obviously it
could be negative (if there's a group which is not repeated at all).
This is a fix for that issue.
llvm-svn: 226632
The target may be a synthetic symbol like __ImageBase. cast_or_null will ensure
that the atom is a DefinedAtom, which is not guaranteed, which was the original
reason for the cast_or_null. Switch this to dyn_cast, which should enable
building of executables for WoA. Unfortunately, the issue of missing base
relocations still needs to be investigated.
llvm-svn: 226246
r225764 broke a basic functionality on Mac OS. This change reverts
r225764, r225766, r225767, r225769, r225814, r225816, r225829, and r225832.
llvm-svn: 225859
This is necessary to support linking a basic program which references symbols
outside of the module itself. Add the import thunk for ARM NT style imports.
This allows us to create the reference. However, it is still insufficient to
generate executables that will run due to base relocations not being emitted for
the import.
llvm-svn: 225428
This adds the ability to export symbols from a DLL built for ARMNT. Add this
support first to help work towards adding support for import thunks on Windows
on ARM. In order to generate the exports, add support for
IMAGE_REL_ARM_ADDR32NB relocations.
llvm-svn: 225339
ARM NT assumes a purely THUMB execution, and as such requires that the address
of entry point is adjusted to indicate a thumb entry point. Unconditionally
adjust the AddressOfEntryPoint in the PE header for PE/COFF ARM as we only
support ARM NT at the moment.
llvm-svn: 225139
ARM NT assumes a THUMB only environment. As such, any address that is detected
as residing in an executable section is adjusted to have its bottom bit set to
indicate THUMB in case of a mode exchange.
Although the testing here seems insufficient (missing the negative cases) the
existing test cases for the IMAGE_REL_ARM_{ADDR32,MOV32T} are relevant as they
ensure that we do not incorrectly set the bit.
llvm-svn: 225104
This adds support for IMAGE_REL_ARM_BRANCH24T relocations. Similar to the
IMAGE_REL_ARM_BLX32T relocation, this relocation requires munging an
instruction. The instruction encoding is quite similar, allowing us to reuse
the same munging implementation. This is needed by the entry point stubs for
modules provided by MSVCRT.
llvm-svn: 225082
This adds support for IMAGE_REL_ARM_BLX23T relocations. Similar to the
IMAGE_REL_ARM_MOV32T relocation, this relocation requires munging an
instruction. This inches us closer to supporting a basic hello world
application.
llvm-svn: 225081
This adds support for the IMAGE_REL_ARM_MOV32T relocation. This is one of the
most complicated relocations for the Window on ARM target. It involves
re-encoding an instruction to contain an immediate value which is the relocation
target.
llvm-svn: 225072
Correct the yaml definition for the object. Adjust the symbol storage class
which was flipped for the two symbols, resulting in the link failure due to the
symbol missing. Adjust the virtual address of the section. This ripples into
the test case, since the data has been shifted up by 4 bytes.
llvm-svn: 225058
This implements the IMAGE_REL_ARM_ADDR32 relocation. There are still a few more
relocation types that need to resolved before lld can even attempt to link a
trivial program for Windows on ARM.
llvm-svn: 225057
This teaches lld about the ARM NT object types. Add a trivial test to ensure
that it can handle ARM NT object file inputs. It is still unable to perform the
necessary relocations for ARM NT, but this allows the linker to at least read
the objects.
llvm-svn: 225052
Looks like if you have symbol foo in a module-definition file
(.def file), and if the actual symbol name to match that export
description is _foo@x (where x is an integer), the exported
symbol name becomes this.
- foo in the .dll file
- foo@x in the .lib file
I have checked in a few fixes recently for exported symbol name mangling.
I haven't found a simple rule that governs all the mangling rules.
There may not ever exist. For now, this is a patch to improve .lib
file compatibility.
llvm-svn: 223524
Looks like the rule of /export is more complicated than
I was thinking. If /export:foo, for example, is given, and
if the actual symbol name in an object file is _foo@<number>,
we need to export that symbol as foo, not as the mangled name.
If only /export:_foo@<number> is given, the symbol is exported
as _foo@<number>.
If both /export:foo and /export:_foo@<number> are given,
they are considered as duplicates, and the linker needs to
choose the unmangled name.
The basic idea seems that the linker needs to export a symbol
with the same name as given as /export.
We exported mangled symbols. This patch fixes that issue.
llvm-svn: 223341
/export option can be given multiple times to specify multiple
symbols to be exported. /export accepts both decorated and
undecorated name.
If you give both undecorated and decorated name of the same symbol
to /export, they are resolved to the same symbol. In this case,
we need to de-duplicate the exported names, so that we don't have
duplicated items in the export symbol table in a DLL.
We remove duplicate items from a vector. The bug was there.
Because we had pointers pointing to elements of the vector,
after an item is removed, they would point wrong elements.
This patch is to remove these pointers. Added a test for that case.
llvm-svn: 223200
Export table entries need to be sorted in ASCII-betical order,
so that the loader can find an entry for a function by binary search.
We sorted the entries by its mangled names. That can be different
from their exported names. As a result, LLD produces incorrect export
table, from which the loader complains that a function that actually
exists in a DLL cannot be found.
This patch fixes that issue.
llvm-svn: 222452
If you have something like
__declspec(align(8192)) int foo = 1;
in your code, the compiler makes the data to be aligned to 8192-byte
boundary, and the linker align the section containing the data to 8192.
LLD always aligned the section to 4192. So, as long as alignment
requirement is smaller than 4192, it was correct, but for larger
requirements, it's wrong.
This patch fixes the issue.
llvm-svn: 222043
Each entry in the delay-import address table had a wrong alignment
requirement if 32 bit. As a result it got wrong delay-import table.
Because llvm-readobj doesn't print out that field, we don't have a
test for that. I'll submit a test that would catch this bug after
improving llvm-readobj.
llvm-svn: 221853
If /subsystem option is not given, the linker needs to infer the
subsystem based on the entry point symbol. If it fails to infer
that, the linker should error out on it.
LLD was almost correct, but it would fail to infer the subsystem
if the entry point is specified with /entry. This is because the
subsystem inference was coupled with the entry point function
searching (if no entry point name is specified, the linker needs
to find the right entry name).
This patch makes the subsystem inference an independent pass to
fix the issue. Now, as long as an entry point function is defined,
LLD can infer the subsystem no matter how it resolved the entry
point.
I don't think scanning all the defined symbols is fast, although
it shouldn't be that slow. The file class there does not provide
any easy way to find an atom by name, so this is what we can do
at this moment. I'd like to revisit this later to make it more
efficient.
llvm-svn: 221499
SECREL relocation's value is the offset to the beginning of the section.
Because of the off-by-one error, if a SECREL relocation target is at the
beginning of a section, it got wrong value.
Added a test that would have caught this.
llvm-svn: 221420
Many programs, for reasons unknown, really like to look at the
AddressOfRelocationTable to determine whether or not they are looking at
a bona fide PE file. Without this, programs like the UNIX `file'
utility will insist that they are looking at a MS DOS executable.
llvm-svn: 221335
LLD skipped COMDAT section symbols when reading them because
I thought we don't want to have symbols with the same name.
But they are actually needed because relocations may refer to
the section symbols. So we shoulnd't skip them.
llvm-svn: 221329
Normally, PE files have section names of eight characters or less.
However, this is problematic for DWARF because DWARF section names are
things like .debug_aranges.
Instead of truncating the section name, redirect the section name into
the string table.
Differential Revision: http://reviews.llvm.org/D6104
llvm-svn: 221212
This is a follow-up patch for r220333. r220333 renames exported symbols.
That raised another issue; if we have both decorated and undecorated names
for the same symbol, we'll end up have two duplicate exported symbol
entries.
This is a fix for that issue by removing duplciate entries.
llvm-svn: 220350
There are two ways to specify a symbol to be exported in the module
definition file.
1) EXPORT <external name> = <symbol>
2) EXPORT <symbol>
In (1), you give both external name and internal name. In that case,
the linker tries to find a symbol using the internal name, and write
that address to the export table with the external name. Thus, from
the outer world, the symbol seems to be exported as the external name.
In (2), internal name is basically the same as the external name
with an exception: if you give an undecorated symbol to the EXPORT
directive, and if the linker finds a decorated symbol, the external
name for the symbol will become the decorated symbol.
LLD didn't implement that exception correctly. This patch fixes that.
llvm-svn: 220333
This patch creates the import address table and sets its
address to the delay-load import table. This also creates
wrapper functions for __delayLoadHelper2.
x86 only for now.
llvm-svn: 219948