=======================================================
Building a JIT: Starting out with KaleidoscopeJIT
=======================================================

.. contents::
   :local:

Chapter 1 Introduction
======================

**Warning: This tutorial is currently being updated to account for ORC API
changes. Only Chapter 1 is up-to-date.**

**Example code from Chapters 2 to 4 will compile and run, but has not been
updated**

Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
tutorial runs through the implementation of a JIT compiler using LLVM's
On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
KaleidoscopeJIT class used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
introduces new features like concurrent compilation, optimization, lazy
compilation and remote execution.

The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
these APIs interact with other parts of LLVM, and teach you how to recombine
them to build a custom JIT that is suited to your use-case.

The structure of the tutorial is:

- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
  introduce some of the basic concepts of the ORC JIT APIs, including the
  idea of an ORC *Layer*.

- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding
  a new layer that will optimize IR and generated code.

- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a
  Compile-On-Demand layer to lazily compile IR.

- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by
  replacing the Compile-On-Demand layer with a custom layer that uses the ORC
  Compile Callbacks API directly to defer IR-generation until functions are
  called.

- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
  a remote process with reduced privileges using the JIT Remote APIs.

To provide input for our JIT we will use a lightly modified version of the
Kaleidoscope REPL from `Chapter 7 <LangImpl07.html>`_ of the "Implementing a
language with LLVM" tutorial.

Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
These tutorials don't assume any experience with these earlier APIs, but
readers acquainted with them will see many familiar elements. Where appropriate
we will make this connection with the earlier APIs explicit to help people who
are transitioning from them to ORC.

JIT API Basics
==============

The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
rather than compiling whole programs to disk ahead of time as a traditional
compiler does. To support that aim our initial, bare-bones JIT API will have
just two functions:

1. ``void addModule(std::unique_ptr<Module> M)`` -- Make the given IR module
   available for execution.
2. ``Expected<JITEvaluatedSymbol> lookup(StringRef Name)`` -- Search for
   pointers to symbols (functions or variables) that have been added to the
   JIT.

A basic use-case for this API, executing the 'main' function from a module,
will look like:

.. code-block:: c++

  std::unique_ptr<Module> M = buildModule();
  JIT J;
  J.addModule(std::move(M));
  // lookup returns an Expected<JITEvaluatedSymbol>; cantFail unwraps it here.
  auto MainSym = cantFail(J.lookup("main"));
  auto *Main = (int (*)(int, char *[]))MainSym.getAddress();
  int Result = Main(argc, argv); // argc and argv come from the host program.

The APIs that we build in these tutorials will all be variations on this simple
theme. Behind this API we will refine the implementation of the JIT to add
support for concurrent compilation, optimization and lazy compilation.
Eventually we will extend the API itself to allow higher-level program
representations (e.g. ASTs) to be added to the JIT.

KaleidoscopeJIT
===============

In the previous section we described our API; now we examine a simple
implementation of it: the KaleidoscopeJIT class [1]_ that was used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
input for our JIT: each time the user enters an expression the REPL will add a
new IR module containing the code for that expression to the JIT. If the
expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
use the lookup method of our JIT class to find and execute the code for the
expression. In later chapters of this tutorial we will modify the REPL to enable
new interactions with our JIT class, but for now we will take this setup for
granted and focus our attention on the implementation of our JIT itself.

Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
usual include guards and #includes [2]_, we get to the definition of our class:

.. code-block:: c++

  #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
  #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H

  #include "llvm/ADT/StringRef.h"
  #include "llvm/ExecutionEngine/JITSymbol.h"
  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
  #include "llvm/ExecutionEngine/Orc/Core.h"
  #include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
  #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
  #include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
  #include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/LLVMContext.h"
  #include <memory>

  namespace llvm {
  namespace orc {

  class KaleidoscopeJIT {
  private:

    ExecutionSession ES;
    RTDyldObjectLinkingLayer ObjectLayer{ES, getMemoryMgr};
    IRCompileLayer CompileLayer{ES, ObjectLayer,
                                ConcurrentIRCompiler(getJTMB())};
    DataLayout DL{cantFail(getJTMB().getDefaultDataLayoutForTarget())};
    MangleAndInterner Mangle{ES, DL};
    ThreadSafeContext Ctx{llvm::make_unique<LLVMContext>()};

    static JITTargetMachineBuilder getJTMB() {
      return cantFail(JITTargetMachineBuilder::detectHost());
    }

    static std::unique_ptr<SectionMemoryManager> getMemoryMgr(VModuleKey) {
      return llvm::make_unique<SectionMemoryManager>();
    }

We begin with the ExecutionSession member, ``ES``, which provides context for
our running JIT'd code. It holds the string pool for symbol names, the global
mutex that guards the critical sections of JIT operations, error logging
facilities, and other utilities. For basic use cases such as this, a default
constructed ExecutionSession is all we will need. We will investigate more
advanced uses of ExecutionSession in later chapters.

Following our ExecutionSession we have two ORC *layers*: an
RTDyldObjectLinkingLayer and an IRCompileLayer. We will be talking more about
layers in the next chapter, but for now you can think of them as analogous to
LLVM Passes: they wrap up useful JIT utilities behind an easy-to-compose
interface. The first layer, ObjectLayer, is the foundation of our JIT: it takes
in-memory object files produced by a compiler and links them on the fly to make
them executable. This JIT-on-top-of-a-linker design was introduced in MCJIT;
however, the linker was hidden inside the MCJIT class. In ORC we expose the
linker so that clients can access and configure it directly if they need to. In
this tutorial our ObjectLayer will just be used to support the next layer in our
stack: the CompileLayer, which will be responsible for taking LLVM IR, compiling
it, and passing the resulting in-memory object files down to the object linking
layer below.

Our ObjectLayer is constructed with a reference to the ExecutionSession and the
getMemoryMgr utility function, which it uses to generate a new memory manager
for each object file as it is added. Next up is our CompileLayer, which is
initialized with a reference to the ExecutionSession, a reference to the
ObjectLayer (where it will send the objects produced by the compiler), and an IR
compiler instance. In this case we are using the ConcurrentIRCompiler class,
which is constructed with a JITTargetMachineBuilder and can be called to compile
IR concurrently from several threads (though in this chapter we will only use
one).

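The way layers stack can be sketched without any LLVM types. The following is
only a toy model, not the ORC API: ``ToyObjectLayer`` and ``ToyCompileLayer``
are hypothetical names, "IR" and "object files" are plain strings, and "linking"
merely records the object. It is meant to show how a compile layer holds a
reference to the layer below it and forwards its output down the stack.

.. code-block:: c++

  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical stand-in for an object linking layer: it accepts "object
  // files" (plain strings here) and records them as linked.
  class ToyObjectLayer {
  public:
    void add(std::string Obj) { Linked.push_back(std::move(Obj)); }
    const std::vector<std::string> &linked() const { return Linked; }

  private:
    std::vector<std::string> Linked;
  };

  // Hypothetical stand-in for an IR compile layer: it "compiles" IR (also a
  // plain string) and passes the result down to the layer below it.
  class ToyCompileLayer {
  public:
    explicit ToyCompileLayer(ToyObjectLayer &Base) : Base(Base) {}
    void add(const std::string &IR) { Base.add("obj(" + IR + ")"); }

  private:
    ToyObjectLayer &Base;
  };

In this model, adding a string to a ``ToyCompileLayer`` leaves a corresponding
entry in the ``ToyObjectLayer`` it was constructed with, which mirrors the
direction in which work flows through the real layers.
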
Following the ExecutionSession and layers we have three supporting member
variables. The DataLayout, ``DL``, and MangleAndInterner, ``Mangle``, members
are used to support portable lookups based on IR symbol names (more on that when
we get to our ``lookup`` function below), and the ThreadSafeContext member,
``Ctx``, manages an LLVMContext that can be used while building IR Modules for
the JIT.

After that, we have two static utility functions. The ``getJTMB()`` function
returns a JITTargetMachineBuilder, which is a factory for building LLVM
TargetMachine instances that are used by the compiler. In this first tutorial we
will only need one (implicitly created) TargetMachine, but in future tutorials
that enable concurrent compilation we will need one per thread. This is why we
use a target machine builder, rather than a single TargetMachine. (Note: older
LLVM JIT APIs that did not support concurrent compilation were constructed with
a single TargetMachine.) The ``getMemoryMgr()`` function constructs instances
of RuntimeDyld::MemoryManager, and is used by the linking layer to generate a
new memory manager for each object file.

.. code-block:: c++

  public:

    KaleidoscopeJIT() {
      ES.getMainJITDylib().setGenerator(
        cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
    }

    const DataLayout &getDataLayout() const { return DL; }

    LLVMContext &getContext() { return *Ctx.getContext(); }

Next up we have our class constructor. Our members have already been
initialized, so the one thing that remains to do is to tweak the configuration
of the *JITDylib* that we will store our code in. We want to modify this dylib
to contain not only the symbols that we add to it, but also the symbols from
our REPL process as well. We do this by attaching a
``DynamicLibrarySearchGenerator`` instance using the
``DynamicLibrarySearchGenerator::GetForCurrentProcess`` method.

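The effect of attaching a generator can be pictured as a fallback on lookup:
names that the dylib's own symbol table does not define are handed to the
generator, which may supply a definition from elsewhere (in our case, the REPL
process). The sketch below is a toy model with no LLVM types; ``ToyDylib``, its
``Generator`` callback type and the integer "addresses" are hypothetical and
chosen only for illustration.

.. code-block:: c++

  #include <cstdint>
  #include <functional>
  #include <map>
  #include <string>
  #include <utility>

  // Hypothetical toy symbol table with an optional fallback generator.
  class ToyDylib {
  public:
    // A generator fills in Addr and returns true if it can supply Name.
    using Generator = std::function<bool(const std::string &, std::uint64_t &)>;

    void define(const std::string &Name, std::uint64_t Addr) {
      Symbols[Name] = Addr;
    }
    void setGenerator(Generator G) { Gen = std::move(G); }

    // Look in our own table first; if the name is missing, ask the generator.
    bool lookup(const std::string &Name, std::uint64_t &Addr) const {
      auto It = Symbols.find(Name);
      if (It != Symbols.end()) {
        Addr = It->second;
        return true;
      }
      return Gen ? Gen(Name, Addr) : false;
    }

  private:
    std::map<std::string, std::uint64_t> Symbols;
    Generator Gen;
  };

In this model a definition added with ``define`` always wins, and the generator
is only consulted for names the table does not know about.
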
Following the constructor we have the ``getDataLayout()`` and ``getContext()``
methods. These are used to make data structures created and managed by the JIT
(especially the LLVMContext) available to the REPL code that will build our
IR modules.

.. code-block:: c++

    void addModule(std::unique_ptr<Module> M) {
      cantFail(CompileLayer.add(ES.getMainJITDylib(),
                                ThreadSafeModule(std::move(M), Ctx)));
    }

    Expected<JITEvaluatedSymbol> lookup(StringRef Name) {
      return ES.lookup({&ES.getMainJITDylib()}, Mangle(Name.str()));
    }

Now we come to the first of our JIT API methods: addModule. This method is
responsible for adding IR to the JIT and making it available for execution. In
this initial implementation of our JIT we will make our modules "available for
execution" by adding them to the CompileLayer, which will in turn store the
Module in the main JITDylib. This process will create new symbol table entries
in the JITDylib for each definition in the module, and will defer compilation of
the module until any of its definitions is looked up. Note that this is not lazy
compilation: just referencing a definition, even if it is never used, will be
enough to trigger compilation. In later chapters we will teach our JIT to defer
compilation of functions until they're actually called.

To add our Module we must first wrap it in a ThreadSafeModule instance, which
manages the lifetime of the Module's LLVMContext (our Ctx member) in a
thread-friendly way. In our example, all modules will share the Ctx member,
which will exist for the duration of the JIT. Once we switch to concurrent
compilation in later chapters we will use a new context per module.

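The "deferred but not lazy" behaviour described for addModule can be modelled in
a few lines. This is a toy sketch, not the ORC implementation: ``ToyJIT``, its
modules (represented only by the names they define) and the ``Compiles`` counter
are all hypothetical. Adding a module only records which names it defines; the
first lookup of any of those names "compiles" the whole module, whether or not
the result is ever called.

.. code-block:: c++

  #include <cstddef>
  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical toy JIT: a module is just the list of names it defines.
  class ToyJIT {
  public:
    // Record the module's definitions, but do not compile anything yet.
    void addModule(const std::vector<std::string> &Defs) {
      Compiled.push_back(false);
      for (const auto &Name : Defs)
        Owner[Name] = Compiled.size() - 1;
    }

    // Returns true if Name is defined. The first lookup of any name in a
    // module "compiles" that whole module; later lookups reuse the result.
    bool lookup(const std::string &Name) {
      auto It = Owner.find(Name);
      if (It == Owner.end())
        return false;
      if (!Compiled[It->second]) {
        Compiled[It->second] = true;
        ++Compiles;
      }
      return true;
    }

    unsigned Compiles = 0; // Number of modules compiled so far.

  private:
    std::vector<bool> Compiled;               // Compiled flag per module.
    std::map<std::string, std::size_t> Owner; // Symbol name -> module index.
  };

Under these assumptions, adding a module that defines ``f`` and ``g`` leaves
``Compiles`` at zero; looking up ``f`` raises it to one, and a later lookup of
``g`` does not compile anything further.
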
Our last method is ``lookup``, which allows us to look up addresses for
function and variable definitions added to the JIT based on their symbol names.
As noted above, lookup will implicitly trigger compilation for any symbol
that has not already been compiled. Our lookup method calls through to
``ExecutionSession::lookup``, passing in a list of dylibs to search (in our case
just the main dylib), and the symbol name to search for, with a twist: we have
to *mangle* the name of the symbol we're searching for first. The ORC JIT
components use mangled symbols internally the same way a static compiler and
linker would, rather than using plain IR symbol names. This allows JIT'd code
to interoperate easily with precompiled code in the application or shared
libraries. The kind of mangling will depend on the DataLayout, which in turn
depends on the target platform. To allow us to remain portable and search based
on the un-mangled name, we just reproduce this mangling ourselves using our
``Mangle`` member function object.

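To make the role of the DataLayout concrete: on many targets the mangling in
question amounts to prepending a global prefix character to the IR name, for
example an underscore on Mach-O platforms and nothing on typical ELF platforms.
The function below is a simplified, hypothetical stand-in for what ``Mangle``
does; ``toyMangle`` and its ``GlobalPrefix`` parameter are assumptions made for
illustration, and real mangling handles more cases than this.

.. code-block:: c++

  #include <string>

  // Hypothetical helper: prepend the target's global prefix, if it has one.
  // A prefix of '\0' means "no prefix" (as on typical ELF targets).
  std::string toyMangle(const std::string &Name, char GlobalPrefix) {
    if (GlobalPrefix == '\0')
      return Name;
    return std::string(1, GlobalPrefix) + Name;
  }

With this model, a lookup of ``main`` searches for ``_main`` when the prefix is
an underscore and for ``main`` when there is no prefix, which is why the REPL
can keep using the plain IR name on every platform.
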
This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
but fully functioning JIT stack that you can use to take LLVM IR and make it
executable within the context of your JIT process. In the next chapter we'll
look at how to extend this JIT to produce better quality code, and in the
process take a deeper look at the ORC layer concept.

`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_

Full Code Listing
=================

Here is the complete code listing for our running example. To build this
example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
   :language: c++

.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
       simplifying assumption: symbols cannot be re-defined. This will make it
       impossible to re-define symbols in the REPL, but will make our symbol
       lookup logic simpler. Re-introducing support for symbol redefinition is
       left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
       original tutorials will be a helpful reference).

.. [2] +-----------------------------+-----------------------------------------------+
       | File                        | Reason for inclusion                          |
       +=============================+===============================================+
       | JITSymbol.h                 | Defines the lookup result type                |
       |                             | JITEvaluatedSymbol                            |
       +-----------------------------+-----------------------------------------------+
       | CompileUtils.h              | Provides the ConcurrentIRCompiler class.      |
       +-----------------------------+-----------------------------------------------+
       | Core.h                      | Core utilities such as ExecutionSession and   |
       |                             | JITDylib.                                     |
       +-----------------------------+-----------------------------------------------+
       | ExecutionUtils.h            | Provides the DynamicLibrarySearchGenerator    |
       |                             | class.                                        |
       +-----------------------------+-----------------------------------------------+
       | IRCompileLayer.h            | Provides the IRCompileLayer class.            |
       +-----------------------------+-----------------------------------------------+
       | JITTargetMachineBuilder.h   | Provides the JITTargetMachineBuilder class.   |
       +-----------------------------+-----------------------------------------------+
       | RTDyldObjectLinkingLayer.h  | Provides the RTDyldObjectLinkingLayer class.  |
       +-----------------------------+-----------------------------------------------+
       | SectionMemoryManager.h      | Provides the SectionMemoryManager class.      |
       +-----------------------------+-----------------------------------------------+
       | DataLayout.h                | Provides the DataLayout class.                |
       +-----------------------------+-----------------------------------------------+
       | LLVMContext.h               | Provides the LLVMContext class.               |
       +-----------------------------+-----------------------------------------------+