=================
TableGen BackEnds
=================

.. contents::
   :local:

Introduction
============

TableGen backends are at the core of TableGen's functionality. The source files
provide the semantics to a generated (in memory) structure, but it's up to the
backend to print this out in a way that is meaningful to the user (normally a
file to be included by a C program, or a textual list of warnings, options and
error messages).

TableGen is used by both LLVM and Clang with very different goals. LLVM uses it
as a way to automate the generation of massive amounts of information regarding
instructions, schedules, cores and architecture features. Some backends generate
output that is consumed by more than one source file, so they need to be created
in a way that makes it easy to use preprocessor tricks. Some backends can also
print C code structures, so that they can be directly included as-is.

Clang, on the other hand, uses it mainly for diagnostic messages (errors,
warnings, tips) and attributes, so more on the textual end of the scale.

LLVM BackEnds
=============

.. warning::
   This document is raw. Each section below needs three sub-sections: description
   of its purpose with a list of users, output generated from generic input, and
   finally why it needed a new backend (in case there's something similar).

Overall, each backend will take the same TableGen file type and transform it into
similar output for different targets/uses. There is an implicit contract between
the TableGen files, the back-ends and their users.

For instance, a global contract is that each back-end produces macro-guarded
sections. Based on whether the file is included by a header or a source file,
or even in which context of each file the include is being used, you have
to define a macro just before including it, to get the right output:

.. code-block:: c++

  #define GET_REGINFO_TARGET_DESC
  #include "ARMGenRegisterInfo.inc"

Only the matching part of the generated file is then included. This is useful if
you need the same information in multiple formats (instantiation, initialization,
getter/setter functions, etc) from the same source TableGen file without having
to re-compile the TableGen file multiple times.

Sometimes, multiple macros might be defined before the same include file to
output multiple blocks:

.. code-block:: c++

  #define GET_REGISTER_MATCHER
  #define GET_SUBTARGET_FEATURE_NAME
  #define GET_MATCHER_IMPLEMENTATION
  #include "ARMGenAsmMatcher.inc"

The macros will be undef'd automatically as they're used, in the include file.

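As a rough illustration of this contract, the sketch below inlines the contents
of a hypothetical generated file (called ``ToyGenRegisterInfo.inc`` here) at the
point where the ``#include`` would normally go. The macro names
``GET_TOY_REGINFO_ENUM`` and ``GET_TOY_REGINFO_NAMES`` and all ``Toy``
identifiers are invented for the example; real generated files are much larger,
and are only assumed to follow the same guard-then-undef pattern.

.. code-block:: c++

  // Consumer side: request only the enum section.
  #define GET_TOY_REGINFO_ENUM

  // ---- stand-in for: #include "ToyGenRegisterInfo.inc" (hypothetical) ----
  #ifdef GET_TOY_REGINFO_ENUM
  #undef GET_TOY_REGINFO_ENUM // the section consumes its own guard
  enum ToyReg { Toy_R0, Toy_R1, Toy_SP, Toy_NUM_REGS };
  #endif

  #ifdef GET_TOY_REGINFO_NAMES
  #undef GET_TOY_REGINFO_NAMES
  static const char *const ToyRegNames[] = {"r0", "r1", "sp"};
  #endif
  // ---- end of stand-in ----

  // The guard was undefined by the section that used it, so including the
  // same file again would emit nothing until a macro is defined again.
  #ifdef GET_TOY_REGINFO_ENUM
  #error "guard macro should have been undefined by the include"
  #endif

  // Only the requested section is visible; ToyRegNames was never emitted.
  constexpr unsigned toyRegisterCount() { return Toy_NUM_REGS; }

Defining ``GET_TOY_REGINFO_NAMES`` as well, before the stand-in, would make the
name table visible in the same translation unit.
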

On all LLVM back-ends, the ``llvm-tblgen`` binary will be executed on the root
TableGen file ``<Target>.td``, which should include all others. This guarantees
that all information needed is accessible, and that no duplication is needed
in the TableGen files.

CodeEmitter
-----------

**Purpose**: CodeEmitterGen uses the descriptions of instructions and their fields to
construct an automated code emitter: a function that, given a MachineInstr,
returns the (currently, 32-bit unsigned) value of the instruction.

**Output**: C++ code, implementing the target's CodeEmitter
class by overriding the virtual functions as ``<Target>CodeEmitter::function()``.

**Usage**: Included directly at the end of ``<Target>MCCodeEmitter.cpp``.

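To make the idea concrete, here is a minimal hand-written sketch of what such a
function computes. The ``ToyInst`` structure and its 32-bit layout (8-bit
opcode, two 4-bit register fields, 12-bit immediate) are invented for
illustration; they do not correspond to any real target or to the code that
CodeEmitterGen actually produces. A generated emitter is assumed to derive
equivalent shifts and masks from the instruction's bit-field description in the
``.td`` files.

.. code-block:: c++

  #include <cstdint>

  // Hypothetical stand-in for an instruction with already-resolved operands.
  struct ToyInst {
    unsigned Opcode; // 8 bits, encoded in bits 31..24
    unsigned Rd;     // 4 bits, encoded in bits 23..20
    unsigned Rn;     // 4 bits, encoded in bits 19..16
    unsigned Imm;    // 12 bits, encoded in bits 11..0
  };

  // Packs each field into its slot of the 32-bit instruction word.
  // Bits 15..12 are unused in this toy format and stay zero.
  constexpr uint32_t encodeToyInst(const ToyInst &I) {
    return (uint32_t(I.Opcode & 0xFFu) << 24) |
           (uint32_t(I.Rd & 0xFu) << 20) |
           (uint32_t(I.Rn & 0xFu) << 16) |
           uint32_t(I.Imm & 0xFFFu);
  }

For example, ``encodeToyInst(ToyInst{0x12, 3, 4, 0x5})`` yields ``0x12340005``.
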

RegisterInfo
------------

**Purpose**: This tablegen backend is responsible for emitting a description of a target
register file for a code generator. It uses instances of the Register,
RegisterAliases, and RegisterClass classes to gather this information.

**Output**: C++ code with enums and structures representing the register mappings,
properties, masks, etc.

**Usage**: Included in both ``<Target>BaseRegisterInfo`` and ``<Target>MCTargetDesc``
(headers and source files), with macros defining which file they are in, to
separate declaration from initialization.

InstrInfo
---------

**Purpose**: This tablegen backend is responsible for emitting a description of the target
instruction set for the code generator. (what are the differences from CodeEmitter?)

**Output**: C++ code with enums and structures representing the instruction mappings,
properties, masks, etc.

**Usage**: Included in both ``<Target>BaseInstrInfo`` and ``<Target>MCTargetDesc``
(headers and source files), with macros defining which file they are in, to
separate declaration from initialization.

AsmWriter
---------

**Purpose**: Emits an assembly printer for the current target.

**Output**: Implementation of ``<Target>InstPrinter::printInstruction()``, among
other things.

**Usage**: Included directly into ``InstPrinter/<Target>InstPrinter.cpp``.

AsmMatcher
----------

**Purpose**: Emits a target specifier matcher for
converting parsed assembly operands into MCInst structures. It also
emits a matcher for custom operand parsing. Extensive documentation is
written in the ``AsmMatcherEmitter.cpp`` file.

**Output**: Assembler parsers' matcher functions, declarations, etc.

**Usage**: Used in back-ends' ``AsmParser/<Target>AsmParser.cpp`` for
building the AsmParser class.

Disassembler
------------

**Purpose**: Contains disassembler table emitters for various
architectures. Extensive documentation is written in the
``DisassemblerEmitter.cpp`` file.

**Output**: Decoding tables, static decoding functions, etc.

**Usage**: Directly included in ``Disassembler/<Target>Disassembler.cpp``
to cater for all default decodings, after all hand-made ones.

PseudoLowering
--------------

**Purpose**: Generate pseudo instruction lowering.

**Output**: Implements ``<Target>AsmPrinter::emitPseudoExpansionLowering()``.

**Usage**: Included directly into ``<Target>AsmPrinter.cpp``.

CallingConv
-----------

**Purpose**: Responsible for emitting descriptions of the calling
conventions supported by this target.

**Output**: Implements static functions to deal with calling conventions
chained by matching styles, returning true when no rule matches.

**Usage**: Used in ISelLowering and FastIsel as function pointers to
implementation returned by a CC selection function.

DAGISel
-------

**Purpose**: Generate a DAG instruction selector.

**Output**: Creates huge functions for automating DAG selection.

**Usage**: Included in ``<Target>ISelDAGToDAG.cpp`` inside the target's
implementation of ``SelectionDAGISel``.

DFAPacketizer
-------------

**Purpose**: This class parses the Schedule.td file and produces an API that
can be used to reason about whether an instruction can be added to a packet
on a VLIW architecture. The class internally generates a deterministic finite
automaton (DFA) that models all possible mappings of machine instructions
to functional units as instructions are added to a packet.

**Output**: Scheduling tables for VLIW back-ends (Hexagon, AMD).

**Usage**: Included directly in ``<Target>InstrInfo.cpp``.

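The automaton idea can be sketched without any generated tables. In the
hypothetical ``ToyPacketizer`` below, each instruction class is described by a
bit mask of the functional units it may use, and the packetizer tracks every
set of units that could be occupied after the instructions added so far. A
generated DFA is assumed to precompute such state sets into a transition table;
this sketch computes them on the fly instead, and none of its names come from
LLVM.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <set>

  class ToyPacketizer {
    // Each element is one possible assignment: a mask of occupied units.
    std::set<uint32_t> States{0u};

  public:
    // True if some assignment so far leaves a unit in UnitChoices free.
    bool canReserve(uint32_t UnitChoices) const {
      for (uint32_t Used : States)
        if (UnitChoices & ~Used)
          return true;
      return false;
    }

    // Adds an instruction that needs exactly one of the units in UnitChoices.
    void reserve(uint32_t UnitChoices) {
      assert(canReserve(UnitChoices) && "instruction does not fit in packet");
      std::set<uint32_t> Next;
      for (uint32_t Used : States)
        for (uint32_t Unit = 1; Unit != 0; Unit <<= 1)
          if ((UnitChoices & Unit) && !(Used & Unit))
            Next.insert(Used | Unit);
      States = Next;
    }

    // Starts a new, empty packet.
    void clear() { States = {0u}; }
  };

With two units (masks ``0x1`` and ``0x2``), reserving a flexible instruction
(``0x3``) first still allows one that is restricted to unit ``0x1``, because
the flexible one can be mapped to the other unit. After both are reserved, the
packet is full.
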

FastISel
--------

**Purpose**: This tablegen backend emits code for use by the "fast"
instruction selection algorithm. See the comments at the top of
``lib/CodeGen/SelectionDAG/FastISel.cpp`` for background. This file
scans through the target's tablegen instruction-info files
and extracts instructions with obvious-looking patterns, and it emits
code to look up these instructions by type and operator.

**Output**: Generates ``Predicate`` and ``FastEmit`` methods.

**Usage**: Implements private methods of the targets' implementation
of ``FastISel`` class.

Subtarget
---------

**Purpose**: Generate subtarget enumerations.

**Output**: Enums, globals, local tables for sub-target information.

**Usage**: Populates ``<Target>Subtarget`` and
``MCTargetDesc/<Target>MCTargetDesc`` files (both headers and source).

Intrinsic
---------

**Purpose**: Generate (target) intrinsic information.

OptParserDefs
-------------

**Purpose**: Print enum values for a class.

SearchableTables
----------------

**Purpose**: Generate custom searchable tables.

**Output**: Enums, global tables and lookup helper functions.

**Usage**: This backend allows generating free-form, target-specific tables
from TableGen records. The ARM and AArch64 targets use this backend to generate
tables of system registers; the AMDGPU target uses it to generate meta-data
about complex image and memory buffer instructions.

More documentation is available in ``include/llvm/TableGen/SearchableTable.td``,
which also contains the definitions of TableGen classes which must be
instantiated in order to define the enums and tables emitted by this backend.

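The kind of output this backend produces can be approximated by hand as a
sorted table plus a binary-search helper. Everything in the sketch below (the
``ToySysReg`` record, the register names and encodings, and
``lookupToySysRegByEncoding``) is made up for illustration; the real emitted
code depends on the classes instantiated in the target's ``.td`` files.

.. code-block:: c++

  #include <algorithm>
  #include <cassert>
  #include <cstdint>
  #include <iterator>

  struct ToySysReg {
    const char *Name;
    uint16_t Encoding;
  };

  // Table sorted by Encoding so that it can be binary-searched.
  static const ToySysReg ToySysRegs[] = {
      {"TOY_CTRL", 0x0010},
      {"TOY_STATUS", 0x0020},
      {"TOY_TIMER", 0x0130},
  };

  // Returns the matching entry, or nullptr if the encoding is unknown.
  inline const ToySysReg *lookupToySysRegByEncoding(uint16_t Encoding) {
    assert(std::is_sorted(std::begin(ToySysRegs), std::end(ToySysRegs),
                          [](const ToySysReg &A, const ToySysReg &B) {
                            return A.Encoding < B.Encoding;
                          }) &&
           "table must be sorted by Encoding");
    const ToySysReg *I = std::lower_bound(
        std::begin(ToySysRegs), std::end(ToySysRegs), Encoding,
        [](const ToySysReg &Entry, uint16_t Key) {
          return Entry.Encoding < Key;
        });
    if (I == std::end(ToySysRegs) || I->Encoding != Encoding)
      return nullptr;
    return I;
  }

Keeping the table sorted by the lookup key is what makes the logarithmic search
possible; a second key would need its own sorted index.
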

CTags
-----

**Purpose**: This tablegen backend emits an index of definitions in ctags(1)
format. A helper script, ``utils/TableGen/tdtags``, provides an easier-to-use
interface; run ``tdtags -H`` for documentation.

X86EVEX2VEX
-----------

**Purpose**: This X86-specific tablegen backend emits tables that map
EVEX-encoded instructions to their identical VEX-encoded instructions.

Clang BackEnds
==============

ClangAttrClasses
----------------

**Purpose**: Creates Attrs.inc, which contains semantic attribute class
declarations for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
This file is included as part of ``Attr.h``.

ClangAttrParserStringSwitches
-----------------------------

**Purpose**: Creates AttrParserStringSwitches.inc, which contains
StringSwitch::Case statements for parser-related string switches. Each switch
is given its own macro (such as ``CLANG_ATTR_ARG_CONTEXT_LIST``, or
``CLANG_ATTR_IDENTIFIER_ARG_LIST``), which is expected to be defined before
including AttrParserStringSwitches.inc, and undefined after.

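A sketch of how such a macro-selected section is meant to be consumed follows.
Because a standalone example cannot include the real generated file, the lines
that ``AttrParserStringSwitches.inc`` would contribute are written inline. The
``ToyStringSwitch`` class, the ``TOY_ATTR_*`` macros and the attribute names
are all invented stand-ins; the real code is assumed to use
``llvm::StringSwitch`` and the ``CLANG_ATTR_*`` macros named above.

.. code-block:: c++

  #include <cassert>
  #include <string>
  #include <utility>

  // Minimal stand-in for a string switch: the first matching Case wins.
  template <typename T> class ToyStringSwitch {
    std::string Str;
    bool Matched = false;
    T Result = T();

  public:
    explicit ToyStringSwitch(std::string S) : Str(std::move(S)) {}
    ToyStringSwitch &Case(const char *S, T Value) {
      if (!Matched && Str == S) {
        Result = Value;
        Matched = true;
      }
      return *this;
    }
    T Default(T Value) const { return Matched ? Result : Value; }
  };

  inline bool toyAttrHasIdentifierArg(const std::string &Name) {
    assert(!Name.empty() && "attribute name expected");
  #define TOY_ATTR_IDENTIFIER_ARG_LIST // select one switch before the "include"
    return ToyStringSwitch<bool>(Name)
  // ---- stand-in for the generated .inc file ----
  #if defined(TOY_ATTR_ARG_CONTEXT_LIST)
        .Case("toy_context_attr", true)
  #endif
  #if defined(TOY_ATTR_IDENTIFIER_ARG_LIST)
        .Case("toy_mode", true)
        .Case("toy_interrupt", true)
  #endif
  // ---- end of stand-in ----
        .Default(false);
  #undef TOY_ATTR_IDENTIFIER_ARG_LIST // and undefine it afterwards
  }

Only the ``Case`` lines guarded by the macro that is currently defined become
part of the expression, so one generated file can serve several unrelated
switches.
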

ClangAttrImpl
-------------

**Purpose**: Creates AttrImpl.inc, which contains semantic attribute class
definitions for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
This file is included as part of ``AttrImpl.cpp``.

ClangAttrList
-------------

**Purpose**: Creates AttrList.inc, which is used when a list of semantic
attribute identifiers is required. For instance, ``AttrKinds.h`` includes this
file to generate the list of ``attr::Kind`` enumeration values. This list is
separated out into multiple categories: attributes, inheritable attributes, and
inheritable parameter attributes. This categorization happens automatically
based on information in ``Attr.td`` and is used to implement the ``classof``
functionality required for ``dyn_cast`` and similar APIs.

ClangAttrPCHRead
----------------

**Purpose**: Creates AttrPCHRead.inc, which is used to deserialize attributes
in the ``ASTReader::ReadAttributes`` function.

ClangAttrPCHWrite
-----------------

**Purpose**: Creates AttrPCHWrite.inc, which is used to serialize attributes in
the ``ASTWriter::WriteAttributes`` function.

ClangAttrSpellings
------------------

**Purpose**: Creates AttrSpellings.inc, which is used to implement the
``__has_attribute`` feature test macro.

ClangAttrSpellingListIndex
--------------------------

**Purpose**: Creates AttrSpellingListIndex.inc, which is used to map parsed
attribute spellings (including which syntax or scope was used) to an attribute
spelling list index. These spelling list index values are internal
implementation details exposed via
``AttributeList::getAttributeSpellingListIndex``.

ClangAttrVisitor
----------------

**Purpose**: Creates AttrVisitor.inc, which is used when implementing
recursive AST visitors.

ClangAttrTemplateInstantiate
----------------------------

**Purpose**: Creates AttrTemplateInstantiate.inc, which implements the
``instantiateTemplateAttribute`` function, used when instantiating a template
that requires an attribute to be cloned.

ClangAttrParsedAttrList
-----------------------

**Purpose**: Creates AttrParsedAttrList.inc, which is used to generate the
``AttributeList::Kind`` parsed attribute enumeration.

ClangAttrParsedAttrImpl
-----------------------

**Purpose**: Creates AttrParsedAttrImpl.inc, which is used by
``AttributeList.cpp`` to implement several functions on the ``AttributeList``
class. This functionality is implemented via the ``AttrInfoMap ParsedAttrInfo``
array, which contains one element per parsed attribute object.

ClangAttrParsedAttrKinds
------------------------

**Purpose**: Creates AttrParsedAttrKinds.inc, which is used to implement the
``AttributeList::getKind`` function, mapping a string (and syntax) to a parsed
attribute ``AttributeList::Kind`` enumeration.

ClangAttrDump
-------------

**Purpose**: Creates AttrDump.inc, which dumps information about an attribute.
It is used to implement ``ASTDumper::dumpAttr``.

ClangDiagsDefs
--------------

Generate Clang diagnostics definitions.

ClangDiagGroups
---------------

Generate Clang diagnostic groups.

ClangDiagsIndexName
-------------------

Generate Clang diagnostic name index.

ClangCommentNodes
-----------------

Generate Clang AST comment nodes.

ClangDeclNodes
--------------

Generate Clang AST declaration nodes.

ClangStmtNodes
--------------

Generate Clang AST statement nodes.

ClangSACheckers
---------------

Generate Clang Static Analyzer checkers.

ClangCommentHTMLTags
--------------------

Generate efficient matchers for HTML tag names that are used in documentation comments.

ClangCommentHTMLTagsProperties
------------------------------

Generate efficient matchers for HTML tag properties.

ClangCommentHTMLNamedCharacterReferences
----------------------------------------

Generate function to translate named character references to UTF-8 sequences.

ClangCommentCommandInfo
-----------------------

Generate command properties for commands that are used in documentation comments.

ClangCommentCommandList
-----------------------

Generate list of commands that are used in documentation comments.

ArmNeon
-------

Generate arm_neon.h for clang.

ArmNeonSema
-----------

Generate ARM NEON sema support for clang.

ArmNeonTest
-----------

Generate ARM NEON tests for clang.

AttrDocs
--------

**Purpose**: Creates ``AttributeReference.rst`` from ``AttrDocs.td``, and is
used for documenting user-facing attributes.

General BackEnds
================

JSON
----

**Purpose**: Output all the values in every ``def``, as a JSON data
structure that can be easily parsed by a variety of languages. Useful
for writing custom backends without having to modify TableGen itself,
or for performing auxiliary analysis on the same TableGen data passed
to a built-in backend.

**Output**:

The root of the output file is a JSON object (i.e. dictionary),
containing the following fixed keys:

* ``!tablegen_json_version``: a numeric version field that will
  increase if an incompatible change is ever made to the structure of
  this data. The format described here corresponds to version 1.

* ``!instanceof``: a dictionary whose keys are the class names defined
  in the TableGen input. For each key, the corresponding value is an
  array of strings giving the names of ``def`` records that derive
  from that class. So ``root["!instanceof"]["Instruction"]``, for
  example, would list the names of all the records deriving from the
  class ``Instruction``.

For each ``def`` record, the root object also has a key for the record
name. The corresponding value is a subsidiary object containing the
following fixed keys:

* ``!superclasses``: an array of strings giving the names of all the
  classes that this record derives from.

* ``!fields``: an array of strings giving the names of all the variables
  in this record that were defined with the ``field`` keyword.

* ``!name``: a string giving the name of the record. This is always
  identical to the key in the JSON root object corresponding to this
  record's dictionary. (If the record is anonymous, the name is
  arbitrary.)

* ``!anonymous``: a boolean indicating whether the record's name was
  specified by the TableGen input (if it is ``false``), or invented by
  TableGen itself (if ``true``).

For each variable defined in a record, the ``def`` object for that
record also has a key for the variable name. The corresponding value
is a translation into JSON of the variable's value, using the
conventions described below.

Some TableGen data types are translated directly into the
corresponding JSON type:

* A completely undefined value (e.g. for a variable declared without
  initializer in some superclass of this record, and never initialized
  by the record itself or any other superclass) is emitted as the JSON
  ``null`` value.

* ``int`` and ``bit`` values are emitted as numbers. Note that
  TableGen ``int`` values are capable of holding integers too large to
  be exactly representable in IEEE double precision. The integer
  literal in the JSON output will show the full exact integer value.
  So if you need to retrieve large integers with full precision, you
  should use a JSON reader capable of translating such literals back
  into 64-bit integers without losing precision, such as Python's
  standard ``json`` module.

* ``string`` and ``code`` values are emitted as JSON strings.

* ``list<T>`` values, for any element type ``T``, are emitted as JSON
  arrays. Each element of the array is represented in turn using these
  same conventions.

* ``bits`` values are also emitted as arrays. A ``bits`` array is
  ordered from least-significant bit to most-significant. So the
  element with index ``i`` corresponds to the bit described as
  ``x{i}`` in TableGen source. However, note that this means that
  scripting languages are likely to *display* the array in the
  opposite order from the way it appears in the TableGen source or in
  the diagnostic ``-print-records`` output.

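Because a ``bits`` array is least-significant-bit first, a consumer that wants
the numeric value or the source-order spelling has to account for the ordering.
The helpers below are a minimal sketch, assuming the array has already been
parsed by some JSON library into a ``std::vector<int>`` whose entries are all
plain 0 or 1; ``bitsToInteger`` and ``bitsToSourceOrder`` are hypothetical
names, and entries that are not plain numbers are not handled.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <string>
  #include <vector>

  // Bits[i] corresponds to x{i}, so element 0 is the least-significant bit.
  inline uint64_t bitsToInteger(const std::vector<int> &Bits) {
    assert(Bits.size() <= 64 && "value does not fit in 64 bits");
    uint64_t Value = 0;
    for (std::size_t I = 0; I < Bits.size(); ++I)
      if (Bits[I])
        Value |= uint64_t(1) << I;
    return Value;
  }

  // Spells the bits most-significant first, the order used in TableGen source.
  inline std::string bitsToSourceOrder(const std::vector<int> &Bits) {
    std::string Text;
    for (auto It = Bits.rbegin(); It != Bits.rend(); ++It)
      Text += *It ? '1' : '0';
    return Text;
  }

For the JSON array ``[1, 0, 1, 1]``, ``bitsToInteger`` returns 13 and
``bitsToSourceOrder`` returns ``"1101"``.
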

All other TableGen value types are emitted as a JSON object,
containing two standard fields: ``kind`` is a discriminator describing
which kind of value the object represents, and ``printable`` is a
string giving the same representation of the value that would appear
in ``-print-records``.

* A reference to a ``def`` object has ``kind=="def"``, and has an
  extra field ``def`` giving the name of the object referred to.

* A reference to another variable in the same record has
  ``kind=="var"``, and has an extra field ``var`` giving the name of
  the variable referred to.

* A reference to a specific bit of a ``bits``-typed variable in the
  same record has ``kind=="varbit"``, and has two extra fields:
  ``var`` gives the name of the variable referred to, and ``index``
  gives the index of the bit.

* A value of type ``dag`` has ``kind=="dag"``, and has two extra
  fields. ``operator`` gives the initial value after the opening
  parenthesis of the dag initializer; ``args`` is an array giving the
  following arguments. The elements of ``args`` are arrays of length
  2, giving the value of each argument followed by its colon-suffixed
  name (if any). For example, in the JSON representation of the dag
  value ``(Op 22, "hello":$foo)`` (assuming that ``Op`` is the name of
  a record defined elsewhere with a ``def`` statement):

  * ``operator`` will be an object in which ``kind=="def"`` and
    ``def=="Op"``

  * ``args`` will be the array ``[[22, null], ["hello", "foo"]]``.

* If any other kind of value or complicated expression appears in the
  output, it will have ``kind=="complex"``, and no additional fields.
  These values are not expected to be needed by backends. The standard
  ``printable`` field can be used to extract a representation of them
  in TableGen source syntax if necessary.

How to write a back-end
=======================

TODO.

Until we get a step-by-step HowTo for writing TableGen backends, you can at
least grab the boilerplate (build system, new files, etc.) from Clang's
r173931.

TODO: How they work, how to write one. This section should not contain details
about any particular backend, except maybe ``-print-enums`` as an example. This
should highlight the APIs in ``TableGen/Record.h``.