forked from OSchip/llvm-project
406 lines
17 KiB
ReStructuredText
406 lines
17 KiB
ReStructuredText
.. _design:
|
|
|
|
Linker Design
|
|
=============
|
|
|
|
Introduction
|
|
------------
|
|
|
|
lld is a new generation of linker. It is not "section" based like traditional
|
|
linkers which mostly just interlace sections from multiple object files into the
|
|
output file. Instead, lld is based on "Atoms". Traditional section based
|
|
linking work well for simple linking, but their model makes advanced linking
|
|
features difficult to implement. Features like dead code stripping, reordering
|
|
functions for locality, and C++ coalescing require the linker to work at a finer
|
|
grain.
|
|
|
|
An atom is an indivisible chunk of code or data. An atom has a set of
|
|
attributes, such as: name, scope, content-type, alignment, etc. An atom also
|
|
has a list of References. A Reference contains: a kind, an optional offset, an
|
|
optional addend, and an optional target atom.
|
|
|
|
The Atom model allows the linker to use standard graph theory models for linking
|
|
data structures. Each atom is a node, and each Reference is an edge. The
|
|
feature of dead code stripping is implemented by following edges to mark all
|
|
live atoms, and then delete the non-live atoms.
|
|
|
|
|
|
Atom Model
|
|
----------
|
|
|
|
An atom is an indivisible chuck of code or data. Typically each user written
|
|
function or global variable is an atom. In addition, the compiler may emit
|
|
other atoms, such as for literal c-strings or floating point constants, or for
|
|
runtime data structures like dwarf unwind info or pointers to initializers.
|
|
|
|
A simple "hello world" object file would be modeled like this:
|
|
|
|
.. image:: hello.png
|
|
|
|
There are three atoms: main, a proxy for printf, and an anonymous atom
|
|
containing the c-string literal "hello world". The Atom "main" has two
|
|
references. One is the call site for the call to printf, and the other is a
|
|
refernce for the instruction that loads the address of the c-string literal.
|
|
|
|
File Model
|
|
----------
|
|
|
|
The linker views the input files as basically containers of Atoms and
|
|
References, and just a few attributes of their own. The linker works with three
|
|
kinds of files: object files, static libraries, and dynamic shared libraries.
|
|
Each kind of file has reader object which presents the file in the model
|
|
expected by the linker.
|
|
|
|
Object File
|
|
~~~~~~~~~~~
|
|
|
|
An object file is just a container of atoms. When linking an object file, a
|
|
reader is instantiated which parses the object file and instantiates a set of
|
|
atoms representing all content in the .o file. The linker adds all those atoms
|
|
to a master graph.
|
|
|
|
Static Library (Archive)
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
This is the traditional unix static archive which is just a collection of object
|
|
files with a "table of contents". When linking with a static library, by default
|
|
nothing is added to the master graph of atoms. Instead, if after merging all
|
|
atoms from object files into a master graph, if any "undefined" atoms are left
|
|
remaining in the master graph, the linker reads the table of contents for each
|
|
static library to see if any have the needed definitions. If so, the set of
|
|
atoms from the specified object file in the static library is added to the
|
|
master graph of atoms.
|
|
|
|
Dynamic Library (Shared Object)
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Dynamic libraries are different than object files and static libraries in that
|
|
they don't directly add any content. Their purpose is to check at build time
|
|
that the remaining undefined references can be resolved at runtime, and provide
|
|
a list of dynamic libraries (SO_NEEDED) that will be needed at runtime. The way
|
|
this is modeled in the linker is that a dynamic library contributes no atoms to
|
|
the initial graph of atoms. Instead, (like static libraries) if there are
|
|
"undefined" atoms in the master graph of all atoms, then each dynamic library is
|
|
checked to see if exports the required symbol. If so, a "shared library" atom is
|
|
instantiated by the by the reader which the linker uses to replace the
|
|
"undefined" atom.
|
|
|
|
Linking Steps
|
|
-------------
|
|
|
|
Through the use of abstract Atoms, the core of linking is architecture
|
|
independent and file format independent. All command line parsing is factored
|
|
out into a separate "options" abstraction which enables the linker to be driven
|
|
with different command line sets.
|
|
|
|
The overall steps in linking are:
|
|
|
|
#. Command line processing
|
|
|
|
#. Parsing input files
|
|
|
|
#. Resolving
|
|
|
|
#. Passes/Optimizations
|
|
|
|
#. Generate output file
|
|
|
|
The Resolving and Passes steps are done purely on the master graph of atoms, so
|
|
they have no notion of file formats such as mach-o or ELF.
|
|
|
|
Resolving
|
|
~~~~~~~~~
|
|
|
|
The resolving step takes all the atoms graphs from each object file and combines
|
|
them into one master object graph. Unfortunately, it is not as simple as
|
|
appending the atom list from each file into one big list. There are many cases
|
|
where atoms need to be coalesced. That is, two or more atoms need to be
|
|
coalesced into one atom. This is necessary to support: C language "tentative
|
|
definitions", C++ weak symbols for templates and inlines defined in headers,
|
|
replacing undefined atoms with actual definition atoms, and for merging copies
|
|
of constants like c-strings and floating point constants.
|
|
|
|
The linker support coalescing by-name and by-content. By-name is used for
|
|
tentative definitions and weak symbols. By-content is used for constant data
|
|
that can be merged.
|
|
|
|
The resolving process maintains some global linking "state", including a "symbol
|
|
table" which is a map from llvm::StringRef to lld::Atom*. With these data
|
|
structures, the linker iterates all atoms in all input files. F or each atom, it
|
|
checks if the atom is named and has a global or hidden scope. If so, the atom
|
|
is added to the symbol table map. If there already is a matching atom in that
|
|
table, that means the current atom needs to be coalesced with the found atom, or
|
|
it is a multiple definition error.
|
|
|
|
When all initial input file atoms have been processed by the resolver, a scan is
|
|
made to see if there are any undefined atoms in the graph. If there are, the
|
|
linker scans all libraries (both static and dynamic) looking for definitions to
|
|
replace the undefined atoms. It is an error if any undefined atoms are left
|
|
remaining.
|
|
|
|
Dead code stripping (if requested) is done at the end of resolving. The linker
|
|
does a simple mark-and-sweep. It starts with "root" atoms (like "main" in a main
|
|
executable) and follows each references and marks each Atom that it visits as
|
|
"live". When done, all atoms not marked "live" are removed.
|
|
|
|
The result of the Resolving phase is the creation of an lld::File object. The
|
|
goal is that the lld::File model is <b>the</b> internal representation
|
|
throughout the linker. The file readers parse (mach-o, ELF, COFF) into an
|
|
lld::File. The file writers (mach-o, ELF, COFF) taken an lld::File and produce
|
|
their file kind, and every Pass only operates on an lld::File. This is not only
|
|
a simpler, consistent model, but it enables the state of the linker to be dumped
|
|
at any point in the link for testing purposes.
|
|
|
|
|
|
Passes
|
|
~~~~~~
|
|
|
|
The Passes step is an open ended set of routines that each get a change to
|
|
modify or enhance the current lld::File object. Some example Passes are:
|
|
|
|
* stub (PLT) generation
|
|
|
|
* GOT instantiation
|
|
|
|
* order_file optimization
|
|
|
|
* branch island generation
|
|
|
|
* branch shim generation
|
|
|
|
* Objective-C optimizations (Darwin specific)
|
|
|
|
* TLV instantiation (Darwin specific)
|
|
|
|
* dtrace probe processing (Darwin specific)
|
|
|
|
* compact unwind encoding (Darwin specific)
|
|
|
|
|
|
Some of these passes are specific to Darwin's runtime environments. But many of
|
|
the passes are applicable to any OS (such as generating branch island for out of
|
|
range branch instructions).
|
|
|
|
The general structure of a pass is to iterate through the atoms in the current
|
|
lld::File object, inspecting each atom and doing something. For instance, the
|
|
stub pass, looks for call sites to shared library atoms (e.g. call to printf).
|
|
It then instantiates a "stub" atom (PLT entry) and a "lazy pointer" atom for
|
|
each proxy atom needed, and these new atoms are added to the current lld::File
|
|
object. Next, all the noted call sites to shared library atoms have their
|
|
References altered to point to the stub atom instead of the shared library atom.
|
|
|
|
Generate Output File
|
|
~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Once the passes are done, the output file writer is given current lld::File
|
|
object. The writer's job is to create the executable content file wrapper and
|
|
place the content of the atoms into it.
|
|
|
|
lld::File representations
|
|
-------------------------
|
|
|
|
Just as LLVM has three representations of its IR model, lld has three
|
|
representations of its File/Atom/Reference model:
|
|
|
|
* In memory, abstract C++ classes (lld::Atom, lld::Reference, and lld::File).
|
|
|
|
* textual (in YAML)
|
|
|
|
* binary format ("native")
|
|
|
|
Binary File Format
|
|
~~~~~~~~~~~~~~~~~~
|
|
|
|
In theory, lld::File objects could be written to disk in an existing Object File
|
|
format standard (e.g. ELF). Instead we choose to define a new binary file
|
|
format. There are two main reasons for this: fidelity and performance. In order
|
|
for lld to work as a linker on all platforms, its internal model must be rich
|
|
enough to model all CPU and OS linking features. But if we choose an existing
|
|
Object File format as the lld binary format, that means an on going need to
|
|
retrofit each platform specific feature needed from alternate platforms into the
|
|
existing Object File format. Having our own "native" binary format side steps
|
|
that issue. We still need to be able to binary encode all the features, but
|
|
once the in-memory model can represent the feature, it is straight forward to
|
|
binary encode it.
|
|
|
|
The reason to use a binary file format at all, instead of a textual file format,
|
|
is speed. You want the binary format to be as fast as possible to read into the
|
|
in-memory model. Given that we control the in-memory model and the binary
|
|
format, the obvious way to make reading super fast it to make the file format be
|
|
basically just an array of atoms. The reader just mmaps in the file and looks
|
|
at the header to see how many atoms there are and instantiate that many atom
|
|
objects with the atom attribute information coming from that array. The trick
|
|
is designing this in a way that can be extended as the Atom mode evolves and new
|
|
attributes are added.
|
|
|
|
The native object file format starts with a header that lists how many "chunks"
|
|
are in the file. A chunk is an array of "ivar data". The native file reader
|
|
instantiates an array of Atom objects (with one large malloc call). Each atom
|
|
contains just a pointer to its vtable and a pointer to its ivar data. All
|
|
methods on lld::Atom are virtual, so all the method implementations return
|
|
values based on the ivar data to which it has a pointer. If a new linking
|
|
features is added which requires a change to the lld::Atom model, a new native
|
|
reader class (e.g. version 2) is defined which knows how to read the new feature
|
|
information from the new ivar data. The old reader class (e.g. version 1) is
|
|
updated to do its best to model (the lack of the new feature) given the old ivar
|
|
data in existing native object files.
|
|
|
|
With this model for the native file format, files can be read and turned
|
|
into the in-memory graph of lld::Atoms with just a few memory allocations.
|
|
And the format can easily adapt over time to new features
|
|
|
|
|
|
Textual representations in YAML
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
In designing a textual format we want something easy for humans to read and easy
|
|
for the linker to parse. Since an atom has lots of attributes most of which are
|
|
usually just the default, we should define default values for every attribute so
|
|
that those can be omitted from the text representation. Here is the atoms for a
|
|
simple hello world program expressed in YAML::
|
|
|
|
target-triple: x86_64-apple-darwin11
|
|
|
|
atoms:
|
|
- name: _main
|
|
scope: global
|
|
type: code
|
|
content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00,
|
|
00, 00, 31, c0, 5d, c3 ]
|
|
fixups:
|
|
- offset: 07
|
|
kind: pcrel32
|
|
target: 2
|
|
- offset: 0E
|
|
kind: call32
|
|
target: _fprintf
|
|
|
|
- type: c-string
|
|
content: [ 73, 5A, 00 ]
|
|
|
|
...
|
|
|
|
The biggest use for the textual format will be writing test cases. Writing test
|
|
cases in C is problematic because the compiler may vary its output over time for
|
|
its own optimization reasons which my inadvertently disable or break the linker
|
|
feature trying to be tested. By writing test cases in the linkers own textual
|
|
format, we can exactly specify every attribute of every atom and thus target
|
|
specific linker logic.
|
|
|
|
Testing
|
|
~~~~~~~
|
|
|
|
The lld project contains a test suite which is being built up as new code is
|
|
added to lld. All new lld functionality should have a tests added to the test
|
|
suite. The test suite is `lit <http://llvm.org/cmds/lit.html/>`_ driven. Each
|
|
test is a text file with comments telling lit how to run the test and check the
|
|
result To facilitate testing, the lld project builds a tool called lld-core.
|
|
This tool reads a YAML file (default from stdin), parses it into one or more
|
|
lld::File objects in memory and then feeds those lld::File objects to the
|
|
resolver phase. The output of the resolver is written as a native object file.
|
|
It is then read back in using the native object file reader and then pass to the
|
|
YAML writer. This round-about path means that all three representations
|
|
(in-memory, binary, and text) are exercised, and any new feature has to work in
|
|
all the representations to pass the test.
|
|
|
|
|
|
Resolver testing
|
|
~~~~~~~~~~~~~~~~
|
|
|
|
Basic testing is the "core linking" or resolving phase. That is where the
|
|
linker merges object files. All test cases are written in YAML. One feature of
|
|
YAML is that it allows multiple "documents" to be encoding in one YAML stream.
|
|
That means one text file can appear to the linker as multiple .o files - the
|
|
normal case for the linker.
|
|
|
|
Here is a simple example of a core linking test case. It checks that an
|
|
undefined atom from one file will be replaced by a definition from another
|
|
file::
|
|
|
|
# RUN: lld-core %s | FileCheck %s
|
|
|
|
#
|
|
# Test that undefined atoms are replaced with defined atoms.
|
|
#
|
|
|
|
---
|
|
atoms:
|
|
- name: foo
|
|
definition: undefined
|
|
---
|
|
atoms:
|
|
- name: foo
|
|
scope: global
|
|
type: code
|
|
...
|
|
|
|
# CHECK: name: foo
|
|
# CHECK: scope: global
|
|
# CHECK: type: code
|
|
# CHECK-NOT: name: foo
|
|
# CHECK: ...
|
|
|
|
|
|
Passes testing
|
|
~~~~~~~~~~~~~~
|
|
|
|
Since Passes just operate on an lld::File object, the lld-core tool has the
|
|
option to run a particular pass (after resolving). Thus, you can write a YAML
|
|
test case with carefully crafted input to exercise areas of a Pass and the check
|
|
the resulting lld::File object as represented in YAML.
|
|
|
|
|
|
Design Issues
|
|
-------------
|
|
|
|
There are a number of open issues in the design of lld. The plan is to wait and
|
|
make these design decisions when we need to.
|
|
|
|
|
|
Debug Info
|
|
~~~~~~~~~~
|
|
|
|
Currently, the lld model says nothing about debug info. But the most popular
|
|
debug format is DWARF and there is some impedance mismatch with the lld model
|
|
and DWARF. In lld there are just Atoms and only Atoms that need to be in a
|
|
special section at runtime have an associated section. Also, Atoms do not have
|
|
addresses. The way DWARF is spec'ed different parts of DWARF are supposed to go
|
|
into specially named sections and the DWARF references function code by address.
|
|
|
|
CPU and OS specific functionality
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Currently, lld has an abstract "Platform" that deals with any CPU or OS specific
|
|
differences in linking. We just keep adding virtual methods to the base
|
|
Platform class as we find linking areas that might need customization. At some
|
|
point we'll need to structure this better.
|
|
|
|
|
|
File Attributes
|
|
~~~~~~~~~~~~~~~
|
|
|
|
Currently, lld::File just has a path and a way to iterate its atoms. We will
|
|
need to add mores attributes on a File. For example, some equivalent to the
|
|
target triple. There is also a number of cached or computed attributes that
|
|
could make various Passes more efficient. For instance, on Darwin there are a
|
|
number of Objective-C optimizations that can be done by a Pass. But it would
|
|
improve the plain C case if the Objective-C optimization Pass did not have to
|
|
scan all atoms looking for any Objective-C data structures. This could be done
|
|
if the lld::File object had an attribute that said if the file had any
|
|
Objective-C data in it. The Resolving phase would then be required to "merge"
|
|
that attribute as object files are added.
|
|
|
|
|
|
Command Line Processing
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Eventually, we may want this linker to be able to be a drop in replacement
|
|
linker for existing linker tools. That means being able to handle command line
|
|
arguments for different platforms (e.g. darwin or linux). Currently, there is
|
|
no command line processing code in lld. If clang winds up incorporating the lld
|
|
libraries into the clang binary, lld may be able to punt this work because clang
|
|
will be responsible for setting up the state for lld.
|
|
|
|
|
|
|