forked from OSchip/llvm-project
182 lines
9.3 KiB
ReStructuredText
182 lines
9.3 KiB
ReStructuredText
======================
|
|
Clang Offload Packager
|
|
======================
|
|
|
|
.. contents::
|
|
:local:
|
|
|
|
.. _clang-offload-packager:
|
|
|
|
Introduction
|
|
============
|
|
|
|
This tool bundles device files into a single image containing necessary
|
|
metadata. We use a custom binary format for bundling all the device images
|
|
together. The image format is a small header wrapping around a string map. This
|
|
tool creates bundled binaries so that they can be embedded into the host to
|
|
create a fat-binary.
|
|
|
|
Binary Format
|
|
=============
|
|
|
|
The binary format is marked by the ``0x10FF10AD`` magic bytes, followed by a
|
|
version. Each created binary contains its own magic bytes. This allows us to
|
|
locate all the embedded offloading sections even after they may have been merged
|
|
by the linker, such as when using relocatable linking. Conceptually, this binary
|
|
format is a serialization of a string map and an image buffer. The binary header
|
|
is described in the following :ref:`table<table-binary_header>`.
|
|
|
|
.. table:: Offloading Binary Header
|
|
:name: table-binary_header
|
|
|
|
+----------+--------------+----------------------------------------------------+
|
|
| Type | Identifier | Description |
|
|
+==========+==============+====================================================+
|
|
| uint8_t | magic | The magic bytes for the binary format (0x10FF10AD) |
|
|
+----------+--------------+----------------------------------------------------+
|
|
| uint32_t | version | Version of this format (currently version 1) |
|
|
+----------+--------------+----------------------------------------------------+
|
|
| uint64_t | size | Size of this binary in bytes |
|
|
+----------+--------------+----------------------------------------------------+
|
|
| uint64_t | entry offset | Absolute offset of the offload entries in bytes |
|
|
+----------+--------------+----------------------------------------------------+
|
|
| uint64_t | entry size | Size of the offload entries in bytes |
|
|
+----------+--------------+----------------------------------------------------+
|
|
|
|
Once identified through the magic bytes, we use the size field to take a slice
|
|
of the binary blob containing the information for a single offloading image. We
|
|
can then use the offset field to find the actual offloading entries containing
|
|
the image and metadata. The offload entry contains information about the device
|
|
image. It contains the fields shown in the following
|
|
:ref:`table<table-binary_entry>`.
|
|
|
|
.. table:: Offloading Entry Table
|
|
:name: table-binary_entry
|
|
|
|
+----------+---------------+----------------------------------------------------+
|
|
| Type | Identifier | Description |
|
|
+==========+===============+====================================================+
|
|
| uint16_t | image kind | The kind of the device image (e.g. bc, cubin) |
|
|
+----------+---------------+----------------------------------------------------+
|
|
| uint16_t | offload kind | The producer of the image (e.g. openmp, cuda) |
|
|
+----------+---------------+----------------------------------------------------+
|
|
| uint32_t | flags | Generic flags for the image |
|
|
+----------+---------------+----------------------------------------------------+
|
|
| uint64_t | string offset | Absolute offset of the string metadata table |
|
|
+----------+---------------+----------------------------------------------------+
|
|
| uint64_t | num strings | Number of string entries in the table |
|
|
+----------+---------------+----------------------------------------------------+
|
|
| uint64_t | image offset | Absolute offset of the device image in bytes |
|
|
+----------+---------------+----------------------------------------------------+
|
|
| uint64_t | image size | Size of the device image in bytes |
|
|
+----------+---------------+----------------------------------------------------+
|
|
|
|
This table contains the offsets of the string table and the device image itself
|
|
along with some other integer information. The image kind lets us easily
|
|
identify the type of image stored here without needing to inspect the binary.
|
|
The offloading kind is used to determine which registration code or linking
|
|
semantics are necessary for this image. These are stored as enumerations with
|
|
the following values for the :ref:`offload kind<table-offload_kind>` and the
|
|
:ref:`image kind<table-image_kind>`.
|
|
|
|
.. table:: Image Kind
|
|
:name: table-image_kind
|
|
|
|
+---------------+-------+---------------------------------------+
|
|
| Name | Value | Description |
|
|
+===============+=======+=======================================+
|
|
| IMG_None | 0x00 | No image information provided |
|
|
+---------------+-------+---------------------------------------+
|
|
| IMG_Object | 0x01 | The image is a generic object file |
|
|
+---------------+-------+---------------------------------------+
|
|
| IMG_Bitcode | 0x02 | The image is an LLVM-IR bitcode file |
|
|
+---------------+-------+---------------------------------------+
|
|
| IMG_Cubin | 0x03 | The image is a CUDA object file |
|
|
+---------------+-------+---------------------------------------+
|
|
| IMG_Fatbinary | 0x04 | The image is a CUDA fatbinary file |
|
|
+---------------+-------+---------------------------------------+
|
|
| IMG_PTX | 0x05 | The image is a CUDA PTX file |
|
|
+---------------+-------+---------------------------------------+
|
|
|
|
.. table:: Offload Kind
|
|
:name: table-offload_kind
|
|
|
|
+------------+-------+---------------------------------------+
|
|
| Name | Value | Description |
|
|
+============+=======+=======================================+
|
|
| OFK_None | 0x00 | No offloading information provided |
|
|
+------------+-------+---------------------------------------+
|
|
| OFK_OpenMP | 0x01 | The producer was OpenMP offloading |
|
|
+------------+-------+---------------------------------------+
|
|
| OFK_CUDA | 0x02 | The producer was CUDA |
|
|
+------------+-------+---------------------------------------+
|
|
| OFK_HIP | 0x03 | The producer was HIP |
|
|
+------------+-------+---------------------------------------+
|
|
|
|
The flags are used to signify certain conditions, such as the presence of
|
|
debugging information or whether or not LTO was used. The string entry table is
|
|
used to generically contain any arbitrary key-value pair. This is stored as an
|
|
array of the :ref:`string entry<table-binary_string>` format.
|
|
|
|
.. table:: Offloading String Entry
|
|
:name: table-binary_string
|
|
|
|
+----------+--------------+-------------------------------------------------------+
|
|
| Type | Identifier | Description |
|
|
+==========+==============+=======================================================+
|
|
| uint64_t | key offset | Absolute byte offset of the key in th string table |
|
|
+----------+--------------+-------------------------------------------------------+
|
|
| uint64_t | value offset | Absolute byte offset of the value in the string table |
|
|
+----------+--------------+-------------------------------------------------------+
|
|
|
|
The string entries simply provide offsets to a key and value pair in the
|
|
binary images string table. The string table is simply a collection of null
|
|
terminated strings with defined offsets in the image. The string entry allows us
|
|
to create a key-value pair from this string table. This is used for passing
|
|
arbitrary arguments to the image, such as the triple and architecture.
|
|
|
|
All of these structures are combined to form a single binary blob, the order
|
|
does not matter because of the use of absolute offsets. This makes it easier to
|
|
extend in the future. As mentioned previously, multiple offloading images are
|
|
bundled together by simply concatenating them in this format. Because we have
|
|
the magic bytes and size of each image, we can extract them as-needed.
|
|
|
|
Usage
|
|
=====
|
|
|
|
This tool can be used with the following arguments. Generally information is
|
|
passed as a key-value pair to the ``image=`` argument. The ``file``, ``triple``,
|
|
and ``arch`` arguments are considered mandatory to make a valid image.
|
|
|
|
.. code-block:: console
|
|
|
|
OVERVIEW: A utility for bundling several object files into a single binary.
|
|
The output binary can then be embedded into the host section table
|
|
to create a fatbinary containing offloading code.
|
|
|
|
USAGE: clang-offload-packager [options]
|
|
|
|
OPTIONS:
|
|
|
|
Generic Options:
|
|
|
|
--help - Display available options (--help-hidden for more)
|
|
--help-list - Display list of available options (--help-list-hidden for more)
|
|
--version - Display the version of this program
|
|
|
|
clang-offload-packager options:
|
|
|
|
--image=<<key>=<value>,...> - List of key and value arguments. Required
|
|
keywords are 'file' and 'triple'.
|
|
-o <file> - Write output to <file>.
|
|
|
|
Example
|
|
=======
|
|
|
|
This tool simply takes many input files from the ``image`` option and creates a
|
|
single output file with all the images combined.
|
|
|
|
.. code-block:: console
|
|
|
|
clang-offload-packager -o out.bin --image=file=input.o,triple=nvptx64,arch=sm_70
|