forked from OSchip/llvm-project
384 lines
11 KiB
ReStructuredText
384 lines
11 KiB
ReStructuredText
===========================
|
|
TableGen Language Reference
|
|
===========================
|
|
|
|
.. sectionauthor:: Sean Silva <silvas@purdue.edu>
|
|
|
|
.. contents::
|
|
:local:
|
|
|
|
.. warning::
|
|
This document is extremely rough. If you find something lacking, please
|
|
fix it, file a documentation bug, or ask about it on llvmdev.
|
|
|
|
Introduction
|
|
============
|
|
|
|
This document is meant to be a normative spec about the TableGen language
|
|
in and of itself (i.e. how to understand a given construct in terms of how
|
|
it affects the final set of records represented by the TableGen file). If
|
|
you are unsure if this document is really what you are looking for, please
|
|
read :doc:`/TableGenFundamentals` first.
|
|
|
|
Notation
|
|
========
|
|
|
|
The lexical and syntax notation used here is intended to imitate
|
|
`Python's`_. In particular, for lexical definitions, the productions
|
|
operate at the character level and there is no implied whitespace between
|
|
elements. The syntax definitions operate at the token level, so there is
|
|
implied whitespace between tokens.
|
|
|
|
.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation
|
|
|
|
Lexical Analysis
|
|
================
|
|
|
|
TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
|
|
comments.
|
|
|
|
The following is a listing of the basic punctuation tokens::
|
|
|
|
- + [ ] { } ( ) < > : ; . = ? #
|
|
|
|
Numeric literals take one of the following forms:
|
|
|
|
.. TableGen actually will lex some pretty strange sequences an interpret
|
|
them as numbers. What is shown here is an attempt to approximate what it
|
|
"should" accept.
|
|
|
|
.. productionlist::
|
|
TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
|
|
DecimalInteger: ["+" | "-"] ("0"..."9")+
|
|
HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
|
|
BinInteger: "0b" ("0" | "1")+
|
|
|
|
One aspect to note is that the :token:`DecimalInteger` token *includes* the
|
|
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
|
|
most languages do.
|
|
|
|
TableGen has identifier-like tokens:
|
|
|
|
.. productionlist::
|
|
ualpha: "a"..."z" | "A"..."Z" | "_"
|
|
TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
|
|
TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*
|
|
|
|
Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
|
|
begin with a number. In case of ambiguity, a token will be interpreted as a
|
|
numeric literal rather than an identifier.
|
|
|
|
TableGen also has two string-like literals:
|
|
|
|
.. productionlist::
|
|
TokString: '"' <non-'"' characters and C-like escapes> '"'
|
|
TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"
|
|
|
|
.. note::
|
|
The current implementation accepts the following C-like escapes::
|
|
|
|
\\ \' \" \t \n
|
|
|
|
TableGen also has the following keywords::
|
|
|
|
bit bits class code dag
|
|
def foreach defm field in
|
|
int let list multiclass string
|
|
|
|
TableGen also has "bang operators" which have a
|
|
wide variety of meanings:
|
|
|
|
.. productionlist::
|
|
BangOperator: one of
|
|
:!eq !if !head !tail !con
|
|
:!add !shl !sra !srl
|
|
:!cast !empty !subst !foreach !strconcat
|
|
|
|
Syntax
|
|
======
|
|
|
|
TableGen has an ``include`` mechanism. It does not play a role in the
|
|
syntax per se, since it is lexically replaced with the contents of the
|
|
included file.
|
|
|
|
.. productionlist::
|
|
IncludeDirective: "include" `TokString`
|
|
|
|
TableGen's top-level production consists of "objects".
|
|
|
|
.. productionlist::
|
|
TableGenFile: `Object`*
|
|
Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`
|
|
|
|
``class``\es
|
|
------------
|
|
|
|
.. productionlist::
|
|
Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`
|
|
|
|
A ``class`` declaration creates a record which other records can inherit
|
|
from. A class can be parametrized by a list of "template arguments", whose
|
|
values can be used in the class body.
|
|
|
|
A given class can only be defined once. A ``class`` declaration is
|
|
considered to define the class if any of the following is true:
|
|
|
|
.. break ObjectBody into its consituents so that they are present here?
|
|
|
|
#. The :token:`TemplateArgList` is present.
|
|
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
|
|
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.
|
|
|
|
You can declare an empty class by giving and empty :token:`TemplateArgList`
|
|
and an empty :token:`ObjectBody`. This can serve as a restricted form of
|
|
forward declaration: note that records deriving from the forward-declared
|
|
class will inherit no fields from it since the record expansion is done
|
|
when the record is parsed.
|
|
|
|
.. productionlist::
|
|
TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"
|
|
|
|
Declarations
|
|
------------
|
|
|
|
.. Omitting mention of arcane "field" prefix to discourage its use.
|
|
|
|
The declaration syntax is pretty much what you would expect as a C++
|
|
programmer.
|
|
|
|
.. productionlist::
|
|
Declaration: `Type` `TokIdentifier` ["=" `Value`]
|
|
|
|
It assigns the value to the identifer.
|
|
|
|
Types
|
|
-----
|
|
|
|
.. productionlist::
|
|
Type: "string" | "code" | "bit" | "int" | "dag"
|
|
:| "bits" "<" `TokInteger` ">"
|
|
:| "list" "<" `Type` ">"
|
|
:| `ClassID`
|
|
ClassID: `TokIdentifier`
|
|
|
|
Both ``string`` and ``code`` correspond to the string type; the difference
|
|
is purely to indicate programmer intention.
|
|
|
|
The :token:`ClassID` must identify a class that has been previously
|
|
declared or defined.
|
|
|
|
Values
|
|
------
|
|
|
|
.. productionlist::
|
|
Value: `SimpleValue` `ValueSuffix`*
|
|
ValueSuffix: "{" `RangeList` "}"
|
|
:| "[" `RangeList` "]"
|
|
:| "." `TokIdentifier`
|
|
RangeList: `RangePiece` ("," `RangePiece`)*
|
|
RangePiece: `TokInteger`
|
|
:| `TokInteger` "-" `TokInteger`
|
|
:| `TokInteger` `TokInteger`
|
|
|
|
The peculiar last form of :token:`RangePiece` is due to the fact that the
|
|
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
|
|
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
|
|
instead of "1", "-", and "5".
|
|
The :token:`RangeList` can be thought of as specifying "list slice" in some
|
|
contexts.
|
|
|
|
|
|
:token:`SimpleValue` has a number of forms:
|
|
|
|
|
|
.. productionlist::
|
|
SimpleValue: `TokIdentifier`
|
|
|
|
The value will be the variable referenced by the identifier. It can be one
|
|
of:
|
|
|
|
.. The code for this is exceptionally abstruse. These examples are a
|
|
best-effort attempt.
|
|
|
|
* name of a ``def``, such as the use of ``Bar`` in::
|
|
|
|
def Bar : SomeClass {
|
|
int X = 5;
|
|
}
|
|
|
|
def Foo {
|
|
SomeClass Baz = Bar;
|
|
}
|
|
|
|
* value local to a ``def``, such as the use of ``Bar`` in::
|
|
|
|
def Foo {
|
|
int Bar = 5;
|
|
int Baz = Bar;
|
|
}
|
|
|
|
* a template arg of a ``class``, such as the use of ``Bar`` in::
|
|
|
|
class Foo<int Bar> {
|
|
int Baz = Bar;
|
|
}
|
|
|
|
* value local to a ``multiclass``, such as the use of ``Bar`` in::
|
|
|
|
multiclass Foo {
|
|
int Bar = 5;
|
|
int Baz = Bar;
|
|
}
|
|
|
|
* a template arg to a ``multiclass``, such as the use of ``Bar`` in::
|
|
|
|
multiclass Foo<int Bar> {
|
|
int Baz = Bar;
|
|
}
|
|
|
|
.. productionlist::
|
|
SimpleValue: `TokInteger`
|
|
|
|
This represents the numeric value of the integer.
|
|
|
|
.. productionlist::
|
|
SimpleValue: `TokString`+
|
|
|
|
Multiple adjacent string literals are concatenated like in C/C++. The value
|
|
is the concatenation of the strings.
|
|
|
|
.. productionlist::
|
|
SimpleValue: `TokCodeFragment`
|
|
|
|
The value is the string value of the code fragment.
|
|
|
|
.. productionlist::
|
|
SimpleValue: "?"
|
|
|
|
``?`` represents an "unset" initializer.
|
|
|
|
.. productionlist::
|
|
SimpleValue: "{" `ValueList` "}"
|
|
ValueList: [`ValueListNE`]
|
|
ValueListNE: `Value` ("," `Value`)*
|
|
|
|
This represents a sequence of bits, as would be used to initialize a
|
|
``bits<n>`` field (where ``n`` is the number of bits).
|
|
|
|
.. productionlist::
|
|
SimpleValue: `ClassID` "<" `ValueListNE` ">"
|
|
|
|
This generates a new anonymous record definition (as would be created by an
|
|
unnamed ``def`` inheriting from the given class with the given template
|
|
arguments) and the value is the value of that record definition.
|
|
|
|
.. productionlist::
|
|
SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]
|
|
|
|
A list initializer. The optional :token:`Type` can be used to indicate a
|
|
specific element type, otherwise the element type will be deduced from the
|
|
given values.
|
|
|
|
.. The initial `DagArg` of the dag must start with an identifier or
|
|
!cast, but this is more of an implementation detail and so for now just
|
|
leave it out.
|
|
|
|
.. productionlist::
|
|
SimpleValue: "(" `DagArg` `DagArgList` ")"
|
|
DagArgList: `DagArg` ("," `DagArg`)*
|
|
DagArg: `Value` [":" `TokVarName`] | `TokVarName`
|
|
|
|
The initial :token:`DagArg` is called the "operator" of the dag.
|
|
|
|
.. productionlist::
|
|
SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
|
|
|
|
Bodies
|
|
------
|
|
|
|
.. productionlist::
|
|
ObjectBody: `BaseClassList` `Body`
|
|
BaseClassList: [":" `BaseClassListNE`]
|
|
BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
|
|
SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
|
|
DefmID: `TokIdentifier`
|
|
|
|
The version with the :token:`MultiClassID` is only valid in the
|
|
:token:`BaseClassList` of a ``defm``.
|
|
The :token:`MultiClassID` should be the name of a ``multiclass``.
|
|
|
|
.. put this somewhere else
|
|
|
|
It is after parsing the base class list that the "let stack" is applied.
|
|
|
|
.. productionlist::
|
|
Body: ";" | "{" BodyList "}"
|
|
BodyList: BodyItem*
|
|
BodyItem: `Declaration` ";"
|
|
:| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"
|
|
|
|
The ``let`` form allows overriding the value of an inherited field.
|
|
|
|
``def``
|
|
-------
|
|
|
|
.. TODO::
|
|
There can be pastes in the names here, like ``#NAME#``. Look into that
|
|
and document it (it boils down to ParseIDValue with IDParseMode ==
|
|
ParseNameMode). ParseObjectName calls into the general ParseValue, with
|
|
the only different from "arbitrary expression parsing" being IDParseMode
|
|
== Mode.
|
|
|
|
.. productionlist::
|
|
Def: "def" `TokIdentifier` `ObjectBody`
|
|
|
|
Defines a record whose name is given by the :token:`TokIdentifier`. The
|
|
fields of the record are inherited from the base classes and defined in the
|
|
body.
|
|
|
|
Special handling occurs if this ``def`` appears inside a ``multiclass`` or
|
|
a ``foreach``.
|
|
|
|
``defm``
|
|
--------
|
|
|
|
.. productionlist::
|
|
Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"
|
|
|
|
Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
|
|
precede any ``class``'s that appear.
|
|
|
|
``foreach``
|
|
-----------
|
|
|
|
.. productionlist::
|
|
Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
|
|
:| "foreach" `Declaration` "in" `Object`
|
|
|
|
The value assigned to the variable in the declaration is iterated over and
|
|
the object or object list is reevaluated with the variable set at each
|
|
iterated value.
|
|
|
|
Top-Level ``let``
|
|
-----------------
|
|
|
|
.. productionlist::
|
|
Let: "let" `LetList` "in" "{" `Object`* "}"
|
|
:| "let" `LetList` "in" `Object`
|
|
LetList: `LetItem` ("," `LetItem`)*
|
|
LetItem: `TokIdentifier` [`RangeList`] "=" `Value`
|
|
|
|
This is effectively equivalent to ``let`` inside the body of a record
|
|
except that it applies to multiple records at a time. The bindings are
|
|
applied at the end of parsing the base classes of a record.
|
|
|
|
``multiclass``
|
|
--------------
|
|
|
|
.. productionlist::
|
|
MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
|
|
: [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
|
|
BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
|
|
MultiClassID: `TokIdentifier`
|
|
MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
|