This is an initial patch for a section-based COFF linker.
The patch has 2300 lines of code including comments and blank lines.
Before diving into details, you want to start from reading README
because it should give you an overview of the design.
All important things are written in the README file, so I write
summary here.
- The linker is already able to self-link on Windows.
- It's significantly faster than the existing implementation.
The existing one takes 5 seconds to link LLD on my machine,
while the new one only takes 1.2 seconds, even though the new
one is not multi-threaded yet. (And a proof-of-concept multi-
threaded version was able to link it in 0.5 seconds.)
- It uses much less memory (250MB vs. 2GB virtual memory space
to self-host).
- IMHO the new code is much simpler and easier to read than
the existing PE/COFF port.
http://reviews.llvm.org/D10036
llvm-svn: 238458
loadFile could load mulitple files just because yaml has a feature for
putting multiple documents in one file.
Designing a linker around what yaml can do seems like a bad idea to
me. This patch changes it to read a single file.
There are further improvements to be done to the api and they
will follow shortly.
llvm-svn: 235724
Command line options --arm-target1-rel and --arm-target1-abs have been renamed to be compatible with GNU linkers.
Two tests have been updated:
test/elf/options/target-specific-args.test
test/elf/ARM/rel-arm-target1.test
Differential Revision: http://reviews.llvm.org/D9037
llvm-svn: 235499
There's (almost) never need to keep .L symbols around for production
builds. In fact, the FreeBSD kernel explicitly specify -X beacuse the
size impact (and the subsequent performance impact) might be significant,
because we keep symbols in memory.
I was tempted to make this the default, but I haven't (yet).
PR: 23232
llvm-svn: 235357
Before we only accepted --dynamic-linker=<file> and -dynamic-linker <file>
but older versions of GNU ld (e.g. 2.17.50) accept this other form, so
try to be compatible.
PR: 23233
llvm-svn: 235282
The function took either StringRef or Twine. Since string literals are
ambiguous when resolving the overloading, many code calls used this
function with explicit type conversion. That led awkward code like
make_dynamic_error_code(Twine("Error occurred")).
This patch adds a function definition for string literals, so that
you can directly call the function with literals.
llvm-svn: 234841
This doesn't compile with MSVC 2013:
include\lld/ReaderWriter/PECOFFLinkingContext.h(356) : error C2797:
'lld::PECOFFLinkingContext::_imageVersion': list initialization
inside member initializer list or non-static data member initializer
is not implemented
include\lld/ReaderWriter/PECOFFLinkingContext.h(357) : error C2797:
'lld::PECOFFLinkingContext::_imageVersion': list initialization
inside member initializer list or non-static data member initializer
is not implemented
llvm-svn: 234676
The Native file format was designed to be the fastest on-memory or
on-disk file format for object files. The problem is that no one
is working on that. No LLVM tools can produce object files in
the Native, thus the feature of supporting the format is useless
in the linker.
This patch removes the Native file support. We can add it back
if we really want it in future.
llvm-svn: 234641
This MIPS specific option controls R_MIPS_EH relocation handling.
If -pcrel-eh-reloc is specified R_MIPS_EH relocation should be handled
like R_MIPS_PC32 relocation.
llvm-svn: 234635
These functions are "constructors" of the LinkingContexts. We already
have auxiliary classes and functions for ELFLinkingContext in the header.
They fall in the same category.
llvm-svn: 234082
Only MIPS defined the member function, but this feature is not actually
MIPS-specific. Also, the dependency to the MIPS-only member function
prevented us from merging <Arch>ELF{Object,DSO}Reader classes.
This patch moves the feature from MipsLinkingContext to LinkingContext.
llvm-svn: 234068
What we are doing in ELFTarget.h was dubious. In the file, we define
partial classes of <Arch>LinkingContexts to declare only static member
functions. We have different (complete) class definitions in other files.
They would conflict if they exist in the same compilation unit (because
the ones defined in ELFTarget.h has only static member functions).
I don't think this was valid C++.
http://reviews.llvm.org/D8797
llvm-svn: 234039
This patch provides implementation of R_ARM_TARGET1 relocation with
configuration of its behaviour from a command line. This patch provides
two command line options for GnuLd driver: --arm-target1-rel and
--arm-target1-abs (similar to ld option names with extra prefix 'arm-').
So user may choose which behaviour of R_ARM_TARGET1 is preferred for his
implementation of libc.
Differential Revision: http://reviews.llvm.org/D8707
llvm-svn: 234009
This patch defines implicit conversion between integers and PowerOf2
instances, so uses of the classes is now implicit and look like
regular integers. Now we are ready to remove the scaffolding.
llvm-svn: 233245
The new constructor's type is the same, but this one takes not a log2
value but an alignment value itself, so the meaning is totally differnet.
llvm-svn: 233244
This patch is to make instantiation and conversion to an integer explicit,
so that we can mechanically replace all occurrences of the class with
integer in the next step.
Now get() returns an alignment value rather than its log2 value.
llvm-svn: 233242
We are using log2 values and values themselves to represent alignments.
For example, alignment 8 is sometimes represented as 3 (8 == 2^3).
We want to stop using log2 values.
Because both types are regular arithmetic types, we cannot get help from
a compiler to find places we mix two representations. That makes this
merging work surprisingly hard because if I make a mistake, I'll just get
wrong results at runtime (Yay types!). In this patch, I introduced
a class to represents power-of-two values, which is basically an alias
for an integer type.
Once the migration is done, the class will be removed.
llvm-svn: 233232
This commit implements the behaviour of the SECTIONS linker script directive,
used to not only define a custom mapping between input and output sections, but
also order input sections in the output file. To do this, we modify
DefaultLayout with hooks at important places that allow us to re-order input
sections according to a custom order. We also add a hook in SegmentChunk to
allow us to calculate linker script expressions while assigning virtual
addresses to the input sections that live in a segment.
Not all SECTIONS constructs are currently supported, but only the ones that do
not use special sort orders. It adds two LIT test as practical examples of
which sections directives are currently supported.
In terms of high-level changes, it creates a new class "script::Sema" that owns
all linker script ASTs and the logic for linker script semantics as well.
ELFLinkingContext owns a single copy of Sema, which will be used throughout
the object file writing process (to layout sections as proposed by the linker
script).
Other high-level change is that the writer no longer uses a "const" copy of
the linking context. This happens because linker script expressions must be
calculated *while* calculating final virtual addresses, which is a very late
step in object file writing. While calculating these expressions, we need to
update the linker script symbol table (inside the semantics object), and, thus,
we are "modifying our context" as we prepare to write the file.
http://reviews.llvm.org/D8157
llvm-svn: 232402
Handle resolution of symbols coming from linked object files lazily.
Add implementation of handling _GLOBAL_OFFSET_TABLE_ and __exidx_start/_end symbols for ARM platform.
Differential Revision: http://reviews.llvm.org/D8159
llvm-svn: 232261
GNU LD has an option named -T/--script which allows a user to specify
a linker script to be used [1]. LLD already accepts linker scripts
without this option, but the option is widely used. Therefore it is
best to support it in LLD as well.
[1] https://sourceware.org/binutils/docs/ld/Options.html#Options
llvm-svn: 232183
I converted them to non-range-based loops in r226883 and r226893
because at that moment File::parse() may have side effects and
may update the vector that the reference returned from
LinkingContext::nodes().
Now File::parse() is free from side effects. We can use range-based
loops again.
llvm-svn: 231321
This reverts commit r228955. Previously files appear in a .drectve
section are parsed synchronously to avoid threading issues. I believe
it's now safe to do that asynchronously.
llvm-svn: 230905
In doParse, we shouldn't do anything that has side effects. That function may be
called speculatively and possibly in parallel.
We called WinLinkDriver::parse from doParse to parse a command line in a .drectve
section. The parse function updates a linking context object, so it has many side
effects. It was not safe to call that function from doParse. beforeLink is a
function for a File object to do something that has side effects. Moving a call
of WinLinkDriver::parse to there.
llvm-svn: 230791
Previously we have a string -> string map to keep the weak alias
symbol mapping. Naturally we can't define more than one weak alias
with that data structure.
This patch is to allow multiple aliases for the same symbol by
changing the map type to string -> set of string map.
llvm-svn: 230702
This is mainly for back-compatibility with GNU ld.
Ideally --stats should be a general option in LinkingContext, providing
individual stats for every pass in the linking process.
In the GNU driver, a better wording could be used, but there's no need
to change it for now.
Differential Revision: D7657
Reviewed by: ruiu
llvm-svn: 230157
The round-trip passes were introduced in r193300. The intention of
the change was to make sure that LLD is capable of reading end
writing such file formats.
But that turned out to be yet another over-designed stuff that had
been slowing down everyday development.
The passes ran after the core linker and before the writer. If you
had an additional piece of information that needs to be passed from
front-end to the writer, you had to invent a way to save the data to
YAML/Native. These passes forced us to do that even if that data
was not needed to be represented neither in an object file nor in
an executable/DSO. It doesn't make sense. We don't need these passes.
http://reviews.llvm.org/D7480
llvm-svn: 230069
Looks like there's a race condition around here that caused LLD to crash
on Windows. Currently we are parsing libraries specified by .drectve section
asynchronously, and something is wrong in that process. Disable the feature
for now to make buildbots happy.
llvm-svn: 228955
Use a wrapper function for symbol. Any undefined reference to symbol will be
resolved to "__wrap_symbol". Any undefined reference to "__real_symbol" will be
resolved to symbol.
This can be used to provide a wrapper for a system function. The wrapper
function should be called "__wrap_symbol". If it wishes to call the system
function, it should call "__real_symbol".
Here is a trivial example:
void * __wrap_malloc (size_t c)
{
printf ("malloc called with %zu\n", c);
return __real_malloc (c);
}
If you link other code with this file using --wrap malloc, then all calls
to "malloc" will call the function "__wrap_malloc" instead. The call to
"__real_malloc" in "__wrap_malloc" will call the real "malloc" function.
llvm-svn: 228906
This adds the LinkingContext parameter to the ELFReader. Previously the flags in
that were needed in the Context was passed to the ELFReader, this made it very
hard to access data structures in the LinkingContext when reading an ELF file.
This change makes the ELFReader more flexible so that required parameters can be
grabbed directly from the LinkingContext.
Future patches make use of the changes.
There is no change in functionality though.
llvm-svn: 228905
We used to do like this instead of putting all command line processing
code within one gigantic switch statement. It is converted to a switch
in r188958, which introduced InputGraph.
In this patch I roll that change back. Now all "break"s are removed,
and the nesting is one level shallow.
llvm-svn: 228646
Use the environment variable "LLD_RUN_ROUNDTRIP_TEST" in the test that you want
to disable, as
RUN: env LLD_RUN_ROUNDTRIP_TEST= <run>
This was a patch that I made, but I find this a better way to accomplish what we
want to do.
llvm-svn: 228376
Only search library directories explicitly specified
on the command line. Library directories specified in linker
scripts (including linker scripts specified on the command
line) are ignored.
llvm-svn: 228375
INPUT directive is a variant of GROUP in the sense that that specifies
a list of input files. The only difference is whether the entire file
list is wrapped with a --start-group/--end-group or not.
http://reviews.llvm.org/D7390
llvm-svn: 228060
Currently, no one owns script::Parser buffers, but yet ELFLinkingContext gets
updated with StringRef pointers to data inside Parser buffers. Since this buffer
is locally owned inside GnuLdDriver::evalLinkerScript(), as soon as this
function finishes, all pointers in ELFLinkingContext that comes from linker
scripts get invalid. The problem is that we need someone to own linker scripts
data structures and, since ELFLinkingContext transports references to linker
scripts data, we can simply make it also own all linker scripts data.
Differential Revision: http://reviews.llvm.org/D7323
llvm-svn: 227975
This is needed, among others by the FreeBSD kernel linker script.
Patch by Davide Italiano!
Reviewers: ruiu, rafaelauler
Differential Revision: http://reviews.llvm.org/D7220
llvm-svn: 227694
* Removed cyclic dependency between lldPECOFF and lldDriver
* Added missing dependencies in unit tests
Differential Revision: http://reviews.llvm.org/D7185
llvm-svn: 227134
This is initial patch to support MIPS64 object files linking.
The patch just makes some classes more generalized, and rejects
attempts to interlinking O32 and N64 ABI object files.
I try to reuse the current MIPS target related classes as much as
possible because O32 and N64 MIPS ABI are tightly related and share
almost the same set of relocations, GOT, flags etc.
llvm-svn: 227058
lldELF is used by each ELF backend. lldELF's ELFLinkingContext
also held a reference to each backend, creating a link-time
cycle. This patch moves the backend references to lldDriver.
Differential Revision: http://reviews.llvm.org/D7119
llvm-svn: 226976
Moved getMemoryBuffer from DarwnLdDriver to MachOLinkingContext.
lldMachO shared library target now builds.
Differential Review: http://reviews.llvm.org/D7155
llvm-svn: 226963
lldELF is used by each ELF backend. lldELF's ELFLinkingContext
also held a reference to each backend, creating a link-time
cycle. This patch moves the backend references to lldDriver.
Differential Revision: http://reviews.llvm.org/D7119
llvm-svn: 226922
Before this patch there was a cyclic dependency between lldCore and
lldReaderWriter. Only lldConfig could be built as a shared library.
* Moved Reader and Writer base classes into lldCore.
* The following shared libraries can now be built:
lldCore
lldYAML
lldNative
lldPasses
lldReaderWriter
Differential Revision: http://reviews.llvm.org/D7105
From: Greg Fitzgerald <garious@gmail.com>
llvm-svn: 226732
The code is able to statically link the simplest case of:
int main() { return 0; }
* Only works with ARM code - no Thumb code, no interwork (-marm -mno-thumb-interwork)
* musl libc built with no interwork and no Thumb code
Differential Revision: http://reviews.llvm.org/D6716
From: Denis Protivensky <dprotivensky@accesssoftek.com>
llvm-svn: 226643
We had such class there because of InputGraph abstraction.
Previously, no one except InputGraph itself has complete picture of
input file list. In order to create a set of all defined symbols,
we had to use some indirections there to workaround InputGraph.
It can now be rewritten as simple code. No change in functionality.
llvm-svn: 226319
This patch makes File::parse() multi-thread safe. If one thread is running
File::parse(), other threads will block if they try to call the same method.
File::parse() is idempotent, so you can safely call multiple times.
With this change, we don't have to wait for all worker threads to finish
in Driver::link(). Previously, Driver::link() calls TaskGroup::sync() to
wait for all threads running File::parse(). This was not ideal because
we couldn't start the resolver until we parse all files.
This patch increase parallelism by making Driver::link() to not wait for
worker threads. The resolver calls parse() to make sure that the file
being read has been parsed, and then uses the file. In this approach,
the resolver can run with the parser threads in parallel.
http://reviews.llvm.org/D6994
llvm-svn: 226281
The previous default behavior of LLD is --as-needed. LLD linked
against a DSO only if the DSO file was actually used to link an
executable (i.e. at least one symbol was resolved using the shared
library file.)
In this patch I added a boolean flag to FileNode for --as-needed.
I also added an accessor to DSO name to shared library file class.
llvm-svn: 226274
InputElement was named that because it's an element of an InputGraph.
It's losing the origin because the InputGraph is now being removed.
InputElement's subclass is FileNode, that naming inconsistency needed
to be fixed.
llvm-svn: 226147
They were the last member functions of InputGraph (besides members()).
Now InputGraph is just a container of a vector. We are ready to replace
InputGraph with plain File vector.
llvm-svn: 226146
The member function was defined to allow subclasses to customize
error message. But since we only have one FileNode type, there's
no actual need for that.
llvm-svn: 226139
These changes depended on r225674 and had been rolled back in r225859.
Because r225674 has been re-submitted, it's safe to re-submit them.
llvm-svn: 226132
The original commit had an issue with Mac OS dylib files. It didn't
handle fat binary dylib files correctly. This patch includes a fix.
A test for that case has already been committed in r225764.
llvm-svn: 226123
r225764 broke a basic functionality on Mac OS. This change reverts
r225764, r225766, r225767, r225769, r225814, r225816, r225829, and r225832.
llvm-svn: 225859
PECOFF was the only user of the API, and the reason why we created
the API is because, although the driver creates a list of input files,
it has no knowledge on what files are being created. It was because
everything was hidden behind the InputGraph abstraction.
Now the driver knows what that's doing. We no longer need this
indirection to get the file list being processed.
llvm-svn: 225767
Instead of representing a linker script file as an "InputElement",
parse and evaluate scripts in the driver as we see them.
Linker scripts are not regular input files (regular file is one of
object, archive, or shared library file). They are more like
extended command line options. Linker script handling was needlessly
complicated because of that inappropriate abstraction (besides
excessive class hierarchy -- there is no such thing like ELF linker
script but we had two classes there for some reason.)
LinkerScript was one of a few remaining InputElement subclasses
that can be expanded to multiple files. With this patch, we are one
step closer to retire InputElement.
http://reviews.llvm.org/D6648
llvm-svn: 225330
This is a part of InputGraph cleanup to represent input files as a flat
list of Files (and some meta-nodes for group etc.)
We cannot achieve that goal in one gigantic patch, so I split the task
into small steps as shown below.
(Recap the progress so far: Currently InputGraph contains a list of
InputElements. Each InputElement contain one File (that used to have
multiple Files, but I eliminated that use case in r223867). Files are
currently instantiated in Driver::link(), but I already made a change
to separate file parsing from object instantiation (r224102), so we
can safely instantiate Files when we need them, instead of wrapping
a file with the wrapper class (FileNode class). InputGraph used to
act like a generator class by interpreting groups by itself, but it's
now just a container of a list of InputElements (r223867).)
1. Instantiate Files in the driver and wrap them with WrapperNode.
WrapperNode is a temporary class that allows us to instantiate Files
in the driver while keep using the current InputGraph data structure.
This patch demonstrates how this step 1 looks like, using Core driver
as an example.
2. Do the same thing for the other drivers.
When step 2 is done, an InputGraph consists of GroupEnd objects or
WrapperNodes each of which contains one File. Other types of
FileNode subclasses are removed.
3. Replace InputGraph with std::vector<std::unique_ptr<InputElement>>.
InputGraph is already just a container of list of InputElements,
so this step removes that needless class.
4. Remove WrapperNode.
We need some code cleanup between each step, because many classes
do a bit odd things (e.g. InputGraph::getGroupSize()). I'll straight
things up as I need to.
llvm-svn: 225313
Summary:
Work on adding -rpath support to the mach-o linker.
This patch is based on the ld64 behavior for the command line option validation.
It includes a basic test to check that the LC_RPATH load commands are properly generated when that option is used.
It also add LC_RPATH support to the binary reader, but I don't know how to test it though.
Reviewers: kledzik
Subscribers: llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D6724
llvm-svn: 224544
ReaderErrorCategory was used only at one place. We now have a
DynamicErrorCategory for this kind of one-time error, so use it.
The calling function doesn't really care the type of an error, so
ReaderErrorCategory was actually dead code.
llvm-svn: 224245
The documentation of parseFile() said that "the resulting File
object may take ownership of the MemoryBuffer." So, whether or not
the ownership of a MemoryBuffer would be taken was not clear.
A FileNode (a subclass of InputElement, which is being deprecated)
keeps the ownership if a File doesn't take it.
This patch makes File always take the ownership of a buffer.
Buffers lifespan is not always the same as File instances.
Files are able to deallocate buffers after parsing the contents.
llvm-svn: 224113
The LLD linker searches initializer and finalizer function names
and emits DT_INIT/DT_FINI dynamic table tags to point to these symbols.
The -init/-fini command line options override initializer ("_init") and
finalizer ("_fini") function names used by default.
Now the -init/-fini options do not affect .init_array/.fini_array
sections. The corresponding code has been removed.
Differential Revision: http://reviews.llvm.org/D6578
llvm-svn: 223917
This reverts commit r223330 because it broke Darwin and ELF
linkers in a way that we couldn't have caught with the existing
test cases.
llvm-svn: 223373
The aim of this patch is to reduce the excessive abstraction from
the InputGraph. We found that even a simple thing, such as sorting
input files (Mach-O) or adding a new file to the input file list
(PE/COFF), is nearly impossible with the InputGraph abstraction,
because it hides too much information behind it. As a result,
we invented complex interactions between components (e.g.
notifyProgress() mechanism) and tricky code to work around that
limitation. There were many occasions that we needed to write
awkward code.
This patch is a first step to make it cleaner. As a first step,
this removes Group class from the InputGraph. The grouping feature
is now directly handled by the Resolver. notifyProgress is removed
since we no longer need that. I could have cleaned it up even more,
but in order to keep the patch minimum, I focused on Group.
SimpleFileNode class, a container of File objects, is now limited
to have only one File. We shold have done this earlier.
We used to allow putting multiple File objects to FileNode.
Although SimpleFileNode usually has only one file, the Driver class
actually used that capability. I modified the Driver class a bit,
so that one FileNode is created for each input File.
We should now probably remove SimpleFileNode and directly store
File objects to the InputGraph in some way, because a container
that can contain only one object is useless. This is a TODO.
Mach-O input files are now sorted before they are passe to the
Resolver. DarwinInputGraph class is no longer needed, so removed.
PECOFF still has hacky code to add a new file to the input file list.
This will be cleaned up in another patch.
llvm-svn: 223330
This would allow other flavor specific contexts to override the default value,
if they want to optionally run the round trip passes.
There is some information lost like the original file owner of the atom with
RoundTripPasses. The Gnu flavor needs this information inorder to implement
LinkerScript matching and for other diagnostic outputs such as Map files.
The flag also can be used to record information in the Atom if the information
to the Writer needs to be conveyed through References too.
llvm-svn: 222983
.ilk file is a file for incremental linking. We don't create nor use
that file.
/MAXILKFILE is an undocumented option to specify the maximum size
of the .ilk file, IIUC. We should just ignore the option.
llvm-svn: 222777
There are many build files in the wild that depend on the fact that
link.exe produces a PDB file if /DEBUG option is given. They fail
if the file is not created.
This patch is to make LLD create an empty (dummy) file to satisfy
such build targets. This doesn't do anything other than "touching"
the file.
If a target depends on the content of the PDB file, this workaround
is no help, of course. Otherwise this patch should help build some
stuff.
llvm-svn: 222773
/debug makes MSVC link.exe to not remove unused sections from
the resulting executable. We did the same thing before. However,
I realized that the removal of associative section depends on
the dead-stripping pass in LLD, so we cannot disable that. Or
LLD may produce slightly broken executables that have too much
data in it (which could result in nasty subtle bugs).
This patch is a temporary measure to create correct executable.
Currently /debug does not have any real effect for LLD anyway.
I'll improve associative section handling in another patch, so that
they are removed from output without depending on the dead-stripping
pass.
llvm-svn: 222483
The user can use the max-page-size option and set the maximum page size. Dont
check for maximum allowed values for page size, as its what the kernel is
configured with.
Fix the test as well.
llvm-svn: 221858
lld generates an ELF by adhering to the ELF spec by aligning vma/fileoffset to a
page boundary, but this becomes an issue when dealing with large pages. This
adds support so that lld generated executables adheres to the ELF spec with the
rule vma % p_align = offset % p_align.
This is supported by the flag --no-align-segments.
This could be the default in few targets like X86_64 to save space on disk.
llvm-svn: 221571
My previous fix to have FileArchive own the member MemoryBuffers was not a
complete solution for darwin because nothing owned the FileArchive object.
Fixed MachOFileNode to be like ELFFileNode and have the graph node own the
archive object.
llvm-svn: 221552