Previously, OpDSL did not support rank polymorphism, which required a separate implementation of linalg.fill. This revision extends OpDSL to support rank polymorphism for a limited class of operations that access only scalars and tensors of rank zero. At operation instantiation time, it scales these scalar computations to multi-dimensional pointwise computations by replacing the empty indexing maps with identity index maps. The revision does not change the DSL itself, instead it adapts the Python emitter and the YAML generator to generate different indexing maps and and iterators depending on the rank of the first output.
Additionally, the revision introduces a `linalg.fill_tensor` operation that in a future revision shall replace the current handwritten `linalg.fill` operation. `linalg.fill_tensor` is thus only temporarily available and will be renamed to `linalg.fill`.
Reviewed By: nicolasvasilache, stellaraccident
Differential Revision: https://reviews.llvm.org/D119003
The revision renames `PrimFn` to `ArithFn`. The name resembles the newly introduced arith dialect that implements most of the arithmetic functions. An exception are log/exp that are part of the math dialect.
Depends On D115239
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D115240
This revision introduces a the `TypeFn` class that similar to the `PrimFn` class contains an extensible set of type conversion functions. Having the same mechanism for both type conversion functions and arithmetic functions improves code consistency. Additionally, having an explicit function class and function name is a prerequisite to specify a conversion or arithmetic function via attribute. In a follow up commits, we will introduce function attributes to make OpDSL operations more generic. In particular, the goal is to handle signed and unsigned computation in one operations. Today, there is a linalg.matmul and a linalg.matmul_unsigned.
The commit implements the following changes:
- Introduce the class of type conversion functions `TypeFn`
- Replace the hardwired cast and cast_unsigned ops by the `TypeFn` counterparts
- Adapt the python and C++ code generation paths to support the new cast operations
Example:
```
cast(U, A[D.m, D.k])
```
changes to
```
TypeFn.cast(U, A[D.m, D.k])
```
Depends On D115237
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D115239
Renaming `AttributeDef` to `IndexAttrDef` prepares OpDSL to support different kinds of attributes and more closely reflects the purpose of the attribute.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115237
There are several aspects of the API that either aren't easy to use, or are
deceptively easy to do the wrong thing. The main change of this commit
is to remove all of the `getValue<T>`/`getFlatValue<T>` from ElementsAttr
and instead provide operator[] methods on the ranges returned by
`getValues<T>`. This provides a much more convenient API for the value
ranges. It also removes the easy-to-be-inefficient nature of
getValue/getFlatValue, which under the hood would construct a new range for
the type `T`. Constructing a range is not necessarily cheap in all cases, and
could lead to very poor performance if used within a loop; i.e. if you were to
naively write something like:
```
DenseElementsAttr attr = ...;
for (int i = 0; i < size; ++i) {
// We are internally rebuilding the APFloat value range on each iteration!!
APFloat it = attr.getFlatValue<APFloat>(i);
}
```
Differential Revision: https://reviews.llvm.org/D113229
Update OpDSL to support unsigned integers by adding unsigned min/max/cast signatures. Add tests in OpDSL and on the C++ side to verify the proper signed and unsigned operations are emitted.
The patch addresses an issue brought up in https://reviews.llvm.org/D111170.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D111230
* This could have been removed some time ago as it only had one op left in it, which is redundant with the new approach.
* `matmul_i8_i8_i32` (the remaining op) can be trivially replaced by `matmul`, which natively supports mixed precision.
Differential Revision: https://reviews.llvm.org/D110792
The patch extends the yaml code generation to support the following new OpDSL constructs:
- captures
- constants
- iteration index accesses
- predefined types
These changes have been introduced by revision
https://reviews.llvm.org/D101364.
Differential Revision: https://reviews.llvm.org/D102075
Some variables are unused after D97383 landed. We should generate one symbol for one attrUse.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D97794
If one operand is not used in the formula, it will be considered a
shaped operand. And the result of indexing map of the operand will be the first
reduction dims.
Depends On D97383
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D97384
This will allow us to define select(pred, in, out) for TC ops, which is useful
for pooling ops.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D97312
This revision takes advantage of the newly extended `ref` directive in assembly format
to allow better region handling for LinalgOps. Specifically, FillOp and CopyOp now build their regions explicitly which allows retiring older behavior that relied on specific op knowledge in both lowering to loops and vectorization.
This reverts commit 3f22547fd1 and reland 973e133b76 with a workaround for
a gcc bug that does not accept lambda default parameters:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59949
Differential Revision: https://reviews.llvm.org/D96598
This reverts commit 973e133b76.
It triggers an issue in gcc5 that require investigation, the build is
broken with:
/tmp/ccdpj3B9.s: Assembler messages:
/tmp/ccdpj3B9.s:5821: Error: symbol `_ZNSt17_Function_handlerIFvjjEUljjE2_E9_M_invokeERKSt9_Any_dataOjS6_' is already defined
/tmp/ccdpj3B9.s:5860: Error: symbol `_ZNSt14_Function_base13_Base_managerIUljjE2_E10_M_managerERSt9_Any_dataRKS3_St18_Manager_operation' is already defined
This revision takes advantage of the newly extended `ref` directive in assembly format
to allow better region handling for LinalgOps. Specifically, FillOp and CopyOp now build their regions explicitly which allows retiring older behavior that relied on specific op knowledge in both lowering to loops and vectorization.
Differential Revision: https://reviews.llvm.org/D96598
Indexing maps for named ops can reference attributes so that
we can synthesize the indexing map dynamically. This supports
cases like strides for convolution ops. However, it does cause
an issue: now the indexing_maps() function call is dependent
on those attributes.
Linalg ops inherit LinalgOpInterfaceTraits, which calls
verifyStructuredOpInterface() to verify the interface.
verifyStructuredOpInterface() further calls indexing_maps().
Note that trait verification is done before the op itself,
where ODS generates the verification for those attributes.
So we can have indexing_maps() referencing non-existing or
invalid attribute, before the ODS-generated verification
kick in.
There isn't a dependency handling mechansim for traits.
This commit adds new interface methods to query whether an
op hasDynamicIndexingMaps() and then perform
verifyIndexingMapRequiredAttributes() in
verifyStructuredOpInterface() to handle the dependency issue.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96297
This commit adds support to generate an additional builder for
each named op that has attributes. This gives better experience
when creating the named ops.
Along the way adds support for i64.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D94733
This commit adds support for parsing attribute uses in indexing
maps. These attribute uses are represented as affine symbols in
the resultant indexing maps because we can only know their
concrete value (which are coming from op attributes and are
constants) for specific op instances. The `indxing_maps()`
calls are synthesized to read these attributes and create affine
constants to replace the placeholder affine symbols and simplify.
Depends on D94240
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94335
With this, now we can specify a list of attributes on named ops
generated from the spec. The format is defined as
```
attr-id ::= bare-id (`?`)?
attr-typedef ::= type (`[` `]`)?
attr-def ::= attr-id `:` attr-typedef
tc-attr-def ::= `attr` `(` attr-def-list `)`
tc-def ::= `def` bare-id
`(`tensor-def-list`)` `->` `(` tensor-def-list`)`
(tc-attr-def)?
```
For example,
```
ods_def<SomeCppOp>
def some_op(...) -> (...)
attr(
f32_attr: f32,
i32_attr: i32,
array_attr : f32[],
optional_attr? : f32
)
```
where `?` means optional attribute and `[]` means array type.
Reviewed By: hanchung, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94240
This reverts commit df86f15f0c.
The gcc-5 build was broken by this change:
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp:1275:77: required from here
/usr/include/c++/5/ext/new_allocator.h:120:4: error: no matching function for call to 'std::pair<const std::__cxx11::basic_string<char>, {anonymous}::TCParser::RegisteredAttr>::pair(llvm::StringRef&, {anonymous}::TCParser::RegisteredAttr'
With this, now we can specify a list of attributes on named ops
generated from the spec. The format is defined as
```
attr-id ::= bare-id (`?`)?
attr-typedef ::= type (`[` `]`)?
attr-def ::= attr-id `:` attr-typedef
tc-attr-def ::= `attr` `(` attr-def-list `)`
tc-def ::= `def` bare-id
`(`tensor-def-list`)` `->` `(` tensor-def-list`)`
(tc-attr-def)?
```
For example,
```
ods_def<SomeCppOp>
def some_op(...) -> (...)
attr(
f32_attr: f32,
i32_attr: i32,
array_attr : f32[],
optional_attr? : f32
)
```
where `?` means optional attribute and `[]` means array type.
Reviewed By: hanchung, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94240
This revision drops init_tensor arguments from Linalg on tensors and instead uniformizes the output buffers and output tensors to be consistent.
This significantly simplifies the usage of Linalg on tensors and is a stepping stone for
its evolution towards a mixed tensor and shape abstraction discussed in https://llvm.discourse.group/t/linalg-and-shapes/2421/19.
Differential Revision: https://reviews.llvm.org/D93469
The LinalgDependenceGraph and alias analysis provide the necessary analysis for the Linalg fusion on buffers case.
However this is not enough for linalg on tensors which require proper memory effects to play nicely with DCE and other transformations.
This revision adds side effects to Linalg ops that were previously missing and has 2 consequences:
1. one example in the copy removal pass now fails since the linalg.generic op has side effects and the pass does not perform alias analysis / distinguish between reads and writes.
2. a few examples in fusion-tensor.mlir need to return the resulting tensor otherwise DCE automatically kicks in as part of greedy pattern application.
Differential Revision: https://reviews.llvm.org/D90762
This revision allows representing a reduction at the level of linalg on tensors for named ops. When a structured op has a reduction and returns tensor(s), new conventions are added and documented.
As an illustration, the syntax for a `linalg.matmul` writing into a buffer is:
```
linalg.matmul ins(%a, %b : memref<?x?xf32>, tensor<?x?xf32>)
outs(%c : memref<?x?xf32>)
```
, whereas the syntax for a `linalg.matmul` returning a new tensor is:
```
%d = linalg.matmul ins(%a, %b : tensor<?x?xf32>, memref<?x?xf32>)
init(%c : memref<?x?xf32>)
-> tensor<?x?xf32>
```
Other parts of linalg will be extended accordingly to allow mixed buffer/tensor semantics in the presence of reductions.
This revision refactors and cleans up a bunch of things to simplify StructuredOpInterface
before work can proceed on Linalg on tensors:
- break out pieces of the StructuredOps trait that are part of the StructuredOpInterface,
- drop referenceIterators and referenceIndexingMaps that end up being more confusing than useful,
- drop NamedStructuredOpTrait
This revision adds support to allow named ops to lower to loops.
Linalg.batch_matmul successfully lowers to loops and to LLVM.
In the process, this test also activates linalg to affine loops.
However padded convolutions to not lower to affine.load atm so this revision overrides the type of underlying load / store operation.
Differential Revision: https://reviews.llvm.org/D79135
This revision is the first in a set of improvements that aim at allowing
more generalized named Linalg op generation from a mathematical
specification.
This revision allows creating a new op and checks that the parser,
printer and verifier are hooked up properly.
This opened up a few design points that will be addressed in the future:
1. A named linalg op has a static region builder instead of an
explicitly parsed region. This is not currently compatible with
assemblyFormat so a custom parser / printer are needed.
2. The convention for structured ops and tensor return values needs to
evolve to allow tensor-land and buffer land specifications to agree
3. ReferenceIndexingMaps and referenceIterators will need to become
static to allow building attributes at parse time.
4. Error messages will be improved once we have 3. and we pretty print
in custom form.
Differential Revision: https://reviews.llvm.org/D78327
Summary:
This revision adds a tool that generates the ODS and C++ implementation for "named" Linalg ops according to the [RFC discussion](https://llvm.discourse.group/t/rfc-declarative-named-ops-in-the-linalg-dialect/745).
While the mechanisms and language aspects are by no means set in stone, this revision allows connecting the pieces end-to-end from a mathematical-like specification.
Some implementation details and short-term decisions taken for the purpose of bootstrapping and that are not set in stone include:
1. using a "[Tensor Comprehension](https://arxiv.org/abs/1802.04730)-inspired" syntax
2. implicit and eager discovery of dims and symbols when parsing
3. using EDSC ops to specify the computation (e.g. std_addf, std_mul_f, ...)
A followup revision will connect this tool to tablegen mechanisms and allow the emission of named Linalg ops that automatically lower to various loop forms and run end to end.
For the following "Tensor Comprehension-inspired" string:
```
def batch_matmul(A: f32(Batch, M, K), B: f32(K, N)) -> (C: f32(Batch, M, N)) {
C(b, m, n) = std_addf<k>(std_mulf(A(b, m, k), B(k, n)));
}
```
With -gen-ods-decl=1, this emits (modulo formatting):
```
def batch_matmulOp : LinalgNamedStructured_Op<"batch_matmul", [
NInputs<2>,
NOutputs<1>,
NamedStructuredOpTraits]> {
let arguments = (ins Variadic<LinalgOperand>:$views);
let results = (outs Variadic<AnyRankedTensor>:$output_tensors);
let extraClassDeclaration = [{
llvm::Optional<SmallVector<StringRef, 8>> referenceIterators();
llvm::Optional<SmallVector<AffineMap, 8>> referenceIndexingMaps();
void regionBuilder(ArrayRef<BlockArgument> args);
}];
let hasFolder = 1;
}
```
With -gen-ods-impl, this emits (modulo formatting):
```
llvm::Optional<SmallVector<StringRef, 8>> batch_matmul::referenceIterators() {
return SmallVector<StringRef, 8>{ getParallelIteratorTypeName(),
getParallelIteratorTypeName(),
getParallelIteratorTypeName(),
getReductionIteratorTypeName() };
}
llvm::Optional<SmallVector<AffineMap, 8>> batch_matmul::referenceIndexingMaps()
{
MLIRContext *context = getContext();
AffineExpr d0, d1, d2, d3;
bindDims(context, d0, d1, d2, d3);
return SmallVector<AffineMap, 8>{
AffineMap::get(4, 0, {d0, d1, d3}),
AffineMap::get(4, 0, {d3, d2}),
AffineMap::get(4, 0, {d0, d1, d2}) };
}
void batch_matmul::regionBuilder(ArrayRef<BlockArgument> args) {
using namespace edsc;
using namespace intrinsics;
ValueHandle _0(args[0]), _1(args[1]), _2(args[2]);
ValueHandle _4 = std_mulf(_0, _1);
ValueHandle _5 = std_addf(_2, _4);
(linalg_yield(ValueRange{ _5 }));
}
```
Differential Revision: https://reviews.llvm.org/D77067