forked from OSchip/llvm-project

Minor spelling tweaks

Closes tensorflow/mlir#145

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/145 from kiszk:spelling_tweaks_g3doc ae9140aab5b797441e880d43e557903585815e40

PiperOrigin-RevId: 271173907

This commit is contained in:
parent 3848baec69
commit a2bce652af
@@ -38,7 +38,7 @@ Some important things to think about w.r.t. canonicalization patterns:
 
 ## Globally Applied Rules
 
-These transformation are applied to all levels of IR:
+These transformations are applied to all levels of IR:
 
 *   Elimination of operations that have no side effects and have no uses.
@@ -25,14 +25,14 @@ benefits, including but not limited to:
 
 *   **Being declarative**: The pattern creator just needs to state the rewrite
     pattern declaratively, without worrying about the concrete C++ methods to
     call.
-*   **Removing boilerplate and showing the very essense the the rewrite**:
+*   **Removing boilerplate and showing the very essence of the rewrite**:
     `mlir::RewritePattern` is already good at hiding boilerplate for defining a
     rewrite rule. But we still need to write the class and function structures
     required by the C++ programming language, inspect ops for matching, and call
     op `build()` methods for constructing. These statements are typically quite
     simple and similar, so they can be further condensed with auto-generation.
     Because we reduce the boilerplate to the bare minimum, the declarative
-    rewrite rule will just contain the very essense of the rewrite. This makes
+    rewrite rule will just contain the very essence of the rewrite. This makes
     it very easy to understand the pattern.
 
 ## Strengths and Limitations
@@ -239,7 +239,7 @@ to replace the matched `AOp`.
 
 #### Binding op results
 
-In the result pattern, we can bind to the result(s) of an newly built op by
+In the result pattern, we can bind to the result(s) of a newly built op by
 attaching symbols to the op. (But we **cannot** bind to op arguments given that
 they are referencing previously bound symbols.) This is useful for reusing
 newly created results where suitable. For example,
@@ -270,7 +270,7 @@ directly fed in as arguments to build the new op. For such cases, we can apply
 transformations on the arguments by calling into C++ helper functions. This is
 achieved by `NativeCodeCall`.
 
-For example, if we want to catpure some op's attributes and group them as an
+For example, if we want to capture some op's attributes and group them as an
 array attribute to construct a new op:
 
 ```tblgen
@@ -361,7 +361,7 @@ $in2)`, then this will be translated into C++ call `someFn($in1, $in2, $in0)`.
 
 ##### Customizing entire op building
 
 `NativeCodeCall` is not only limited to transforming arguments for building an
-op; it can also used to specify how to build an op entirely. An example:
+op; it can be also used to specify how to build an op entirely. An example:
 
 If we have a C++ function for building an op:
@@ -379,10 +379,10 @@ def : Pat<(... $input, $attr), (createMyOp $input, $attr)>;
 
 ### Supporting auxiliary ops
 
-A declarative rewrite rule supports multiple result patterns. One of the purpose
-is to allow generating _auxiliary ops_. Auxiliary ops are operations used for
-building the replacement ops; but they are not directly used for replacement
-themselves.
+A declarative rewrite rule supports multiple result patterns. One of the
+purposes is to allow generating _auxiliary ops_. Auxiliary ops are operations
+used for building the replacement ops; but they are not directly used for
+replacement themselves.
 
 For the case of uni-result ops, if there are multiple result patterns, only the
 value generated from the last result pattern will be used to replace the matched
@@ -556,7 +556,7 @@ correspond to multiple actual values.
 
 Constraints can be placed on op arguments when matching. But sometimes we need
 to also place constraints on the matched op's results or sometimes need to limit
-the matching with some constraints that cover both the arugments and the
+the matching with some constraints that cover both the arguments and the
 results. The third parameter to `Pattern` (and `Pat`) is for this purpose.
 
 For example, we can write
@@ -587,7 +587,7 @@ You can
 
 ### Adjusting benefits
 
-The benefit of a `Pattern` is an integer value indicating the benfit of matching
+The benefit of a `Pattern` is an integer value indicating the benefit of matching
 the pattern. It determines the priorities of patterns inside the pattern rewrite
 driver. A pattern with a higher benefit is applied before one with a lower
 benefit.
@@ -599,7 +599,7 @@ pattern. This is based on the heuristics and assumptions that:
 
 *   If a smaller one is applied first the larger one may not apply anymore.
 
-The forth parameter to `Pattern` (and `Pat`) allows to manually tweak a
+The fourth parameter to `Pattern` (and `Pat`) allows to manually tweak a
 pattern's benefit. Just supply `(addBenefit N)` to add `N` to the benefit value.
 
 ## Special directives
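For context on the `(addBenefit N)` directive touched in the hunk above, a usage sketch looks roughly like the following; the op names here are hypothetical illustrations and are not part of this patch:

```tblgen
// Hypothetical pattern: fuse a multiply feeding an add, and raise the
// pattern's priority by adding 10 to its default benefit.
def : Pat<(AddOp (MulOp $x, $y), $z),
          (FusedMulAddOp $x, $y, $z),
          [],
          (addBenefit 10)>;
```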
@@ -55,7 +55,7 @@ enum Kinds {
 
 ### Defining the type class
 
 As described above, `Type` objects in MLIR are value-typed and rely on having an
-implicity internal storage object that holds the actual data for the type. When
+implicitly internal storage object that holds the actual data for the type. When
 defining a new `Type` it isn't always necessary to define a new storage class.
 So before defining the derived `Type`, it's important to know which of the two
 classes of `Type` we are defining. Some types are `primitives` meaning they do
@@ -256,7 +256,7 @@ Once the dialect types have been defined, they must then be registered with a
 ```c++
 struct MyDialect : public Dialect {
   MyDialect(MLIRContext *context) : Dialect(/*name=*/"mydialect", context) {
-    /// Add these types to the dialcet.
+    /// Add these types to the dialect.
     addTypes<SimpleType, ComplexType>();
   }
 };
@@ -12,7 +12,7 @@ LLVM style guide:
 
 *   Adopts [camelBack](https://llvm.org/docs/Proposals/VariableNames.html);
 *   Except for IR units (Region, Block, and Operation), non-nullable output
-    argument are passed by non-const reference in general.
+    arguments are passed by non-const reference in general.
 *   IR constructs are not designed for [const correctness](UsageOfConst.md).
 *   Do *not* use recursive algorithms if the recursion can't be bounded
     statically: that is avoid recursion if there is a possible IR input that can
@@ -105,7 +105,7 @@ parenthesization, (2) negation, (3) modulo, multiplication, floordiv, and
 ceildiv, and (4) addition and subtraction. All of these operators associate from
 left to right.
 
-A _multi-dimensional affine expression_ is a comma separated list of
+A _multidimensional affine expression_ is a comma separated list of
 one-dimensional affine expressions, with the entire list enclosed in
 parentheses.
@@ -119,7 +119,7 @@ affine function. MLIR further extends the definition of an affine function to
 allow 'floordiv', 'ceildiv', and 'mod' with respect to positive integer
 constants. Such extensions to affine functions have often been referred to as
 quasi-affine functions by the polyhedral compiler community. MLIR uses the term
-'affine map' to refer to these multi-dimensional quasi-affine functions. As
+'affine map' to refer to these multidimensional quasi-affine functions. As
 examples, $$(i+j+1, j)$$, $$(i \mod 2, j+i)$$, $$(j, i/4, i \mod 4)$$, $$(2i+1,
 j)$$ are two-dimensional affine functions of $$(i, j)$$, but $$(i \cdot j,
 i^2)$$, $$(i \mod j, i/j)$$ are not affine functions of $$(i, j)$$.
@@ -109,7 +109,7 @@ Examples:
 
 In these operations, `<size>` must be a value of wrapped LLVM IR integer type,
 `<address>` must be a value of wrapped LLVM IR pointer type, and `<value>` must
-be a value of wrapped LLVM IR type that corresponds to the pointee type of
+be a value of wrapped LLVM IR type that corresponds to the pointer type of
 `<address>`.
 
 The `index` operands are integer values whose semantics is identical to the
@@ -20,7 +20,7 @@ format and to facilitate transformations. Therefore, it should
 
 *   Stay as the same semantic level and try to be a mechanical 1:1 mapping;
 *   But deviate representationally if possible with MLIR mechanisms.
-*   Be straightforward to serialize into and deserialize drom the SPIR-V binary
+*   Be straightforward to serialize into and deserialize from the SPIR-V binary
     format.
 
 ## Conventions
@@ -55,10 +55,10 @@ instructions are represented in the SPIR-V dialect. Notably,
 *   Requirements for capabilities, extensions, extended instruction sets,
     addressing model, and memory model is conveyed using `spv.module`
     attributes. This is considered better because these information are for the
-    exexcution environment. It's eaiser to probe them if on the module op
+    execution environment. It's easier to probe them if on the module op
     itself.
-*   Annotations/decoration instrutions are "folded" into the instructions they
-    decorate and represented as attributes on those ops. This elimiates
+*   Annotations/decoration instructions are "folded" into the instructions they
+    decorate and represented as attributes on those ops. This eliminates
     potential forward references of SSA values, improves IR readability, and
     makes querying the annotations more direct.
 *   Types are represented using MLIR standard types and SPIR-V dialect specific
@@ -252,7 +252,7 @@ block, one loop continue block, one merge block.
        ...
     \ | /
       v
-+-------------+ (may have mulitple incoming branches)
++-------------+ (may have multiple incoming branches)
 | merge block |
 +-------------+
 ```
@@ -92,7 +92,7 @@ for %i = 0 to 3 {
 
 On a GPU one could then map `i`, `j`, `k` to blocks and threads. Notice that the
 temporary storage footprint is `3 * 5` values but `3 * 4 * 5` values are
-actually transferred betwen `%A` and `%tmp`.
+actually transferred between `%A` and `%tmp`.
 
 Alternatively, if a notional vector broadcast operation were available, the
 lowered code would resemble:
@@ -349,7 +349,7 @@ that match predicates eliminate the need for dynamically computed costs in
 almost all cases: you can simply instantiate the same pattern one time for each
 possible cost and use the predicate to guard the match.
 
-The two phase nature of this API (match separate from rewrite) is important for
+The two-phase nature of this API (match separate from rewrite) is important for
 two reasons: 1) some clients may want to explore different ways to tile the
 graph, and only rewrite after committing to one tiling. 2) We want to support
 runtime extensibility of the pattern sets, but want to be able to statically
@@ -312,8 +312,8 @@ it.
 
 An MLIR Function is an operation with a name containing one [region](#regions).
 The region of a function is not allowed to implicitly capture values defined
-outside of the function, and all external references must use Function arguments
-or attributes that establish a symbolic connection(e.g. symbols referenced by
+outside of the function, and all external references must use function arguments
+or attributes that establish a symbolic connection (e.g. symbols referenced by
 name via a string attribute like [SymbolRefAttr](#symbol-reference-attribute)):
 
 ``` {.ebnf}
@@ -455,12 +455,14 @@ func @accelerator_compute(i64, i1) -> i64 {
 ^bb2:
   "accelerator.launch"() {
   ^bb0:
-    // Region of code nested under "accelerator_launch", it can reference %a but
+    // Region of code nested under "accelerator.launch", it can reference %a but
     // not %value.
     %new_value = "accelerator.do_something"(%a) : (i64) -> ()
   }
   // %new_value cannot be referenced outside of the region
   ...
 
 ^bb3:
   ...
 }
 ```
@@ -796,7 +798,7 @@ memref<16x32xf32, #identity, memspace0>
 // f32 elements.
 %T = alloc(%M, %N) [%B1, %B2] : memref<?x?xf32, #tiled_dynamic>
 
-// A memref that has a two element padding at either end. The allocation size
+// A memref that has a two-element padding at either end. The allocation size
 // will fit 16 * 68 float elements of data.
 %P = alloc() : memref<16x64xf32, #padded>
@@ -1296,7 +1298,7 @@ Syntax:
 integer-set-attribute ::= affine-map
 ```
 
-An integer-set attribute is an attribute that represents a integer-set object.
+An integer-set attribute is an attribute that represents an integer-set object.
 
 #### String Attribute
@@ -116,7 +116,7 @@ of the benefits that MLIR provides, in no particular order:
 
 The MLIR in-memory data structure has a human readable and writable format, as
 well as [a specification](LangRef.md) for that format - built just like any
-other programming language. Important properties of this format is that it is
+other programming language. Important properties of this format are that it is
 compact, easy to read, and lossless. You can dump an MLIR program out to disk
 and munge around with it, then send it through a few more passes.
@@ -139,7 +139,7 @@ the product more reliable, and making it easier to track down bugs when they
 appear - because the verifier can be run at any time, either as a compiler pass
 or with a single function call.
 
-While MLIR provides a well considered infrastructure for IR verification, and
+While MLIR provides a well-considered infrastructure for IR verification, and
 has simple checks for existing TensorFlow operations, there is a lot that should
 be added here and lots of opportunity to get involved!
@@ -166,7 +166,7 @@ turned into zero:
 
 The "CHECK" comments are interpreted by the
 [LLVM FileCheck tool](https://llvm.org/docs/CommandGuide/FileCheck.html), which
-is sort of like a really advanced grep. This test is fully self contained: it
+is sort of like a really advanced grep. This test is fully self-contained: it
 feeds the input into the [canonicalize pass](Canonicalization.md), and checks
 that the output matches the CHECK lines. See the `test/Transforms` directory for
 more examples. In contrast, standard unit testing exposes the API of the
@@ -258,7 +258,7 @@ This is still a work in progress, but we have sightlines towards a
 tiles into other DAG tiles, using a declarative pattern format. DAG to DAG
 rewriting is a generalized solution for many common compiler optimizations,
 lowerings, and other rewrites and having an IR enables us to invest in building
-a single high quality implementation.
+a single high-quality implementation.
 
 Declarative pattern rules are preferable to imperative C++ code for a number of
 reasons: they are more compact, easier to reason about, can have checkers
@@ -313,7 +313,7 @@ transformations) today, and are committed to pushing hard to make it better.
 
 MLIR has been designed to be memory and compile-time efficient in its algorithms
 and data structures, using immutable and uniqued structures, low level
-bit-packing, and other well known techniques to avoid unnecessary heap
+bit-packing, and other well-known techniques to avoid unnecessary heap
 allocations, and allow simple and safe multithreaded optimization of MLIR
 programs. There are other reasons to believe that the MLIR implementations of
 common transformations will be more efficient than the Python and C++
@@ -242,7 +242,7 @@ like `"0.5f"`, and an integer array default value should be specified as like
 `Confined` is provided as a general mechanism to help modelling further
 constraints on attributes beyond the ones brought by value types. You can use
 `Confined` to compose complex constraints out of more primitive ones. For
-example, an 32-bit integer attribute whose minimal value must be 10 can be
+example, a 32-bit integer attribute whose minimal value must be 10 can be
 expressed as `Confined<I32Attr, [IntMinValue<10>]>`.
 
 Right now, the following primitive constraints are supported:
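The `Confined` composition described in the hunk above is typically placed in an op's argument list; a minimal sketch follows, where the op and attribute names are hypothetical and not taken from this patch:

```tblgen
// Hypothetical op requiring a 32-bit integer attribute whose value
// is at least 10, expressed by composing primitive constraints.
def MyThresholdOp : Op<MyDialect, "my_threshold_op"> {
  let arguments = (ins
    Confined<I32Attr, [IntMinValue<10>]>:$threshold
  );
}
```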
@@ -373,7 +373,7 @@ def MyInterface : OpInterface<"MyInterface"> {
 
 ### Custom builder methods
 
-For each operation, there are two builder automatically generated based on the
+For each operation, there are two builders automatically generated based on the
 arguments and returns types:
 
 ```c++
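Beyond the two auto-generated builders this section mentions, ops can declare extra convenience builders; a sketch of that mechanism follows, with the op name, attribute, and builder body as illustrative assumptions rather than content of this patch:

```tblgen
// Hypothetical op adding a custom convenience builder alongside the
// two auto-generated ones.
def MyConstOp : Op<MyDialect, "my_const"> {
  let arguments = (ins F32Attr:$value);

  let builders = [OpBuilder<
    "Builder *builder, OperationState &state, float value", [{
      state.addAttribute("value", builder->getF32FloatAttr(value));
    }]>];
}
```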
@@ -388,7 +388,7 @@ static void build(Builder *, OperationState &tblgen_state,
                   ArrayRef<NamedAttribute> attributes);
 ```
 
-The above cases makes sure basic uniformity so that we can create ops using the
+The above cases make sure basic uniformity so that we can create ops using the
 same form regardless of the exact op. This is particularly useful for
 implementing declarative pattern rewrites.
@@ -572,7 +572,7 @@ a float tensor, and so on.
 
 Similarly, a set of `AttrConstraint`s are created for helping modelling
 constraints of common attribute kinds. They are the `Attr` subclass hierarchy.
-It includes `F32Attr` for the constraints of being an float attribute,
+It includes `F32Attr` for the constraints of being a float attribute,
 `F32ArrayAttr` for the constraints of being a float array attribute, and so on.
 
 ### Multi-entity constraint
@@ -648,7 +648,7 @@ replaced by the current attribute `attr` at expansion time.
 
 For more complicated predicates, you can wrap it in a single `CPred`, or you
 can use predicate combiners to combine them. For example, to write the
-constraint that an attribute `attr` is an 32-bit or 64-bit integer, you can
+constraint that an attribute `attr` is a 32-bit or 64-bit integer, you can
 write it as
 
 ```tablegen
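The predicate-combiner style referenced in the hunk above might look roughly like this; the constraint name and exact C++ predicate strings are assumptions for illustration, not part of the patch:

```tblgen
// Hypothetical constraint built with the Or combiner and two CPreds,
// accepting either a 32-bit or a 64-bit integer attribute.
def I32OrI64Attr : AttrConstraint<
    Or<[CPred<"$_self.cast<IntegerAttr>().getType().isInteger(32)">,
        CPred<"$_self.cast<IntegerAttr>().getType().isInteger(64)">]>,
    "32-bit or 64-bit integer attribute">;
```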
@@ -695,9 +695,9 @@ def MyOp : Op<...> {
 As to whether we should define the predicate using a single `CPred` wrapping
 the whole expression, multiple `CPred`s with predicate combiners, or a single
 `CPred` "invoking" a function, there are no clear-cut criteria. Defining using
-`CPred` and predicate combiners is preferrable since it exposes more information
+`CPred` and predicate combiners is preferable since it exposes more information
 (instead hiding all the logic behind a C++ function) into the op definition spec
-so that it can pontentially drive more auto-generation cases. But it will
+so that it can potentially drive more auto-generation cases. But it will
 require a nice library of common predicates as the building blocks to avoid the
 duplication, which is being worked on right now.
@@ -928,7 +928,7 @@ the output type (shape) for given input type (shape).
 
 But shape functions are determined by attributes and could be arbitrarily
 complicated with a wide-range of specification possibilities. Equality
-relationship are common (e.g., the elemental type of the output matches the
+relationships are common (e.g., the elemental type of the output matches the
 primitive type of the inputs, both inputs have exactly the same type [primitive
 type and shape]) and so these should be easy to specify. Algebraic relationships
 would also be common (e.g., a concat of `[n,m]` and `[n,m]` matrix along axis 0
@@ -79,7 +79,7 @@ In order to exactly represent the Real zero with an integral-valued affine
 value, the zero point must be an integer between the minimum and maximum affine
 value (inclusive). For example, given an affine value represented by an 8 bit
 unsigned integer, we have: $$ 0 \leq zero\_point \leq 255$$. This is important,
-because in deep neural networks's convolution-like operations, we frequently
+because in deep neural networks' convolution-like operations, we frequently
 need to zero-pad inputs and outputs, so zero must be exactly representable, or
 the result will be biased.
@@ -123,7 +123,7 @@ $$
 In the above, we assume that $$real\_value$$ is a Single, $$scale$$ is a Single,
 $$roundToNearestInteger$$ returns a signed 32 bit integer, and $$zero\_point$$
 is an unsigned 8 or 16 bit integer. Note that bit depth and number of fixed
-point values is indicative of common types on typical hardware but is not
+point values are indicative of common types on typical hardware but is not
 constrained to particular bit depths or a requirement that the entire range of
 an N-bit integer is used.
@@ -10,7 +10,7 @@ See [MLIR specification](LangRef.md) for more information about MLIR, the
 structure of the IR, operations, etc. See
 [Table-driven Operation Definition](OpDefinitions.md) and
 [Declarative Rewrite Rule](DeclarativeRewrites.md) for the detailed explanation
-of all available mechansims for defining operations and rewrites in a
+of all available mechanisms for defining operations and rewrites in a
 table-driven manner.
 
 ## Adding operation
@@ -90,7 +90,7 @@ OpFoldResult SpecificOp::fold(ArrayRef<Attribute> constOperands) {
 
 There are multiple forms of graph rewrite that can be performed in MLIR. One of
 the most common is DAG tile to DAG tile rewrite. Patterns provide a concise way
 to express this transformation as a pair of source pattern to match and
-resultant pattern. There is both the C++ classes to represent this
+resultant pattern. There are both the C++ classes to represent this
 transformation, as well as the patterns in TableGen from which these can be
 generated.
@@ -39,7 +39,7 @@ neural network accelerators.
 
 MLIR uses ideas drawn from IRs of LLVM and Swift for lower level constructs
 while combining them with ideas from the polyhedral abstraction to represent
-loop nests, multi-dimensional data (tensors), and transformations on these
+loop nests, multidimensional data (tensors), and transformations on these
 entities as first class concepts in the IR.
 
 MLIR is a multi-level IR, i.e., it represents code at a domain-specific
@@ -58,7 +58,7 @@ polyhedral abstraction.
 
 Maps, sets, and relations with affine constraints are the core structures
 underlying a polyhedral representation of high-dimensional loop nests and
-multi-dimensional arrays. These structures are represented as textual
+multidimensional arrays. These structures are represented as textual
 expressions in a form close to their mathematical form. These structures are
 used to capture loop nests, tensor data structures, and how they are reordered
 and mapped for a target architecture. All structured or "conforming" loops are
@@ -513,7 +513,7 @@ parsing/printing, will be available.
 Dialect extended types are represented as string literals wrapped inside of the
 dialect namespace. This means that the parser delegates to the dialect for
 parsing specific type instances. This differs from the representation of dialect
-defined operations, of which have a identifier name that the parser uses to
+defined operations, of which have an identifier name that the parser uses to
 identify and parse them.
 
 This representation was chosen for several reasons:
@@ -773,7 +773,7 @@ our current design in practice.
 
 The current MLIR uses a representation of polyhedral schedules using a tree of
 if/for loops. We extensively debated the tradeoffs involved in the typical
 unordered polyhedral instruction representation (where each instruction has
-multi-dimensional schedule information), discussed the benefits of schedule tree
+multidimensional schedule information), discussed the benefits of schedule tree
 forms, and eventually decided to go with a syntactic tree of affine if/else
 conditionals and affine for loops. Discussion of the tradeoff was captured in
 this document:
@@ -806,7 +806,7 @@ At a high level, we have two alternatives here:
 
 This representation is based on a simplified form of the domain/schedule
 representation used by the polyhedral compiler community. Domains represent what
 has to be executed while schedules represent the order in which domain elements
-are interleaved. We model domains as non piece-wise convex integer sets, and
+are interleaved. We model domains as non-piece-wise convex integer sets, and
 schedules as affine functions; however, the former can be disjunctive, and the
 latter can be piece-wise affine relations. In the schedule tree representation,
 domain and schedules for instructions are represented in a tree-like structure
@@ -1110,7 +1110,7 @@ The problem is that LLVM has several objects in its IR that are globally uniqued
 and also mutable: notably constants like `i32 0`. In LLVM, these constants are
 `Value*`'s, which allow them to be used as operands to instructions, and that
 they also have SSA use lists. Because these things are uniqued, every `i32 0` in
-any function share a use list. This means that optimizing multiple functions in
+any function shares a use list. This means that optimizing multiple functions in
 parallel won't work (at least without some sort of synchronization on the use
 lists, which would be unbearably inefficient).
@@ -1122,7 +1122,7 @@ expressions, types, etc are all immutable, uniqued, and immortal). 2) constants
 are defined in per-function pools, instead of being globally uniqued. 3)
 functions themselves are not SSA values either, so they don't have the same
 problem as constants. 4) FunctionPasses are copied (through their copy ctor)
-into one instances per thread, avoiding sharing of local state across threads.
+into one instance per thread, avoiding sharing of local state across threads.
 
 This allows MLIR function passes to support efficient multithreaded compilation
 and code generation.
@@ -10,7 +10,7 @@ This document is a very early design proposal (which has since been accepted)
 that explored the tradeoffs of using this simplified form vs the traditional
 polyhedral schedule list form. At some point, this document could be dusted off
 and written as a proper academic paper, but until now, it is better to included
-it in this crufty form than not to. Beware that this document uses archaic
+it in this crafty form than not to. Beware that this document uses archaic
 syntax and should not be considered a canonical reference to modern MLIR.
 
 ## Introduction
@@ -282,7 +282,7 @@ transformations want to be explicit about what they are doing.
 
 ### Simplicity of code generation
 
-A key final stage of an mlfunc is its conversion to a cfg function, which is
+A key final stage of an mlfunc is its conversion to a CFG function, which is
 required as part of lowering to the target machine. The simplified form has a
 clear advantage here: the IR has a direct correspondence to the structure of the
 generated code.
@@ -49,8 +49,8 @@ elimination, only one constant remains in the IR.
 
 FileCheck is an extremely useful utility, it allows for easily matching various
 parts of the output. This ease of use means that it becomes easy to write
 brittle tests that are essentially `diff` tests. FileCheck tests should be as
-self contained as possible and focus on testing the minimal set of functionality
-needed. Let's see an example:
+self-contained as possible and focus on testing the minimal set of
+functionalities needed. Let's see an example:
 
 ```mlir {.mlir}
 // RUN: mlir-opt %s -cse | FileCheck %s
@@ -65,7 +65,7 @@ public:
 ```
 
 Unlike more complex types, RangeType does not require a hashing key for
-unique'ing in the `MLIRContext`. Note that all MLIR types derive from
+uniquing in the `MLIRContext`. Note that all MLIR types derive from
 `mlir::Type::TypeBase` and expose `using Base::Base` to enable generic hooks to
 work properly (in this instance for llvm-style casts. RangeType does not even
 require an implementation file as the above represents the whole code for the
@@ -187,7 +187,7 @@ view it slices and pretty-prints as:
 %2 = linalg.slice %1[*, *, %0, *] : !linalg.view<?x?x?xf32>
 ```
 
-In this particular case, %2 slices dimension `2` of the four dimensional view
+In this particular case, %2 slices dimension `2` of the four-dimensional view
 %1. The returned `!linalg.view<?x?x?xf32>` indicates that the indexing is
 rank-reducing and that %0 is an `index`.
@@ -227,7 +227,7 @@ public:
   PatternMatchResult match(Operation *op) const override;
 
   // A "rewriting" function that takes an original operation `op`, a list of
-  // already rewritten opreands, and a function builder `rewriter`. It can use
+  // already rewritten operands, and a function builder `rewriter`. It can use
   // the builder to construct new operations and ultimately create new values
   // that will replace those currently produced by the original operation. It
   // needs to define as many value as the original operation, but their types
@@ -259,7 +259,7 @@ PatternMatchResult ViewOpConversion::match(Operation *op) const override {
 }
 ```
 
-The actual conversion function may become quite involved. First, Let us go over
+The actual conversion function may become quite involved. First, let us go over
 the components of a view descriptor and see how they can be constructed to
 represent a _complete_ view of a `memref`, e.g. a view that covers all its
 elements.
@@ -412,7 +412,7 @@ struct ViewDescriptor {
     return builder.getArrayAttr(attrs);
   }
 
-  // Emit instructions obtaining individual values from the decsriptor.
+  // Emit instructions obtaining individual values from the descriptor.
   Value *ptr() { return intrinsics::extractvalue(elementPtrType(), d, pos(0)); }
   Value *offset() { return intrinsics::extractvalue(indexType(), d, pos(1)); }
   Value *size(unsigned dim) {
@@ -82,7 +82,7 @@ def main() {
   # reuse the previously specialized and inferred version and return `<2, 2>`
   var d = multiply_transpose(b, a);
 
-  # A new call with `<2, 2>` for both dimension will trigger another
+  # A new call with `<2, 2>` for both dimensions will trigger another
   # specialization of `multiply_transpose`.
   var e = multiply_transpose(c, d);