- improve/fix doc comments for affine apply composition related methods.
- drop makeSingleValueComposedAffineApply - really redundant and out of line in
a public API; it's just returning the first result of the composed affine
apply op, and not making a single result affine map or an affine_apply op.
PiperOrigin-RevId: 230406169
This CL also makes ScopedEDSCContexts to reset the Bindable numbering when
creating a new context.
This is useful to write minimal tests that don't use FileCheck pattern
captures for now.
PiperOrigin-RevId: 230079997
This CL performs a bunch of cleanups related to EDSCs that are generally
useful in the context of using them with a simple wrapping C API (not in this
CL) and with simple language bindings to Python and Swift.
PiperOrigin-RevId: 230066505
This CL fixes a misunderstanding in how to build DimOp which triggered
execution issues in the CPU path.
The problem is that, given a `memref<?x4x?x8x?xf32>`, the expressions to
construct the dynamic dimensions should be:
`dim %arg, 0 : memref<?x4x?x8x?xf32>`
`dim %arg, 2 : memref<?x4x?x8x?xf32>`
and
`dim %arg, 4 : memref<?x4x?x8x?xf32>`
Before this CL, we wold construct:
`dim %arg, 0 : memref<?x4x?x8x?xf32>`
`dim %arg, 1 : memref<?x4x?x8x?xf32>`
`dim %arg, 2 : memref<?x4x?x8x?xf32>`
and expect the other dimensions to be constants.
This assumption seems consistent at first glance with the syntax of alloc:
```
%tensor = alloc(%M, %N, %O) : memref<?x4x?x8x?xf32>
```
But this was actuallyincorrect.
This CL also makes the relevant functions available to EDSCs and removes
duplication of the incorrect function.
PiperOrigin-RevId: 229622766
This is mostly plumbing to start allowing testing EDSC lowering. Prototype specifying reference implementation using verbose format without any generation/binding support. Add test pass that dumps the constructed EDSC (of which there can only be one). The idea is to enable iterating from multiple sides, this is wrong on many dimensions at the moment.
PiperOrigin-RevId: 229570535
This allows load, store and ForNest to be used with both Expr and Bindable.
This simplifies writing generic pieces of MLIR snippet.
For instance, a generic pointwise add can now be written:
```cpp
// Different Bindable ivs, one per loop in the loop nest.
auto ivs = makeBindables(shapeA.size());
Bindable zero, one;
// Same bindable, all equal to `zero`.
SmallVector<Bindable, 8> zeros(ivs.size(), zero);
// Same bindable, all equal to `one`.
SmallVector<Bindable, 8> ones(ivs.size(), one);
// clang-format off
Bindable A, B, C;
Stmt scalarA, scalarB, tmp;
Stmt block = edsc::Block({
ForNest(ivs, zeros, shapeA, ones, {
scalarA = load(A, ivs),
scalarB = load(B, ivs),
tmp = scalarA + scalarB,
store(tmp, C, ivs)
}),
});
// clang-format on
```
This CL also adds some extra support for pretty printing that will be used in
a future CL when we introduce standalone testing of EDSCs. At the momen twe
are lacking the basic infrastructure to write such tests.
PiperOrigin-RevId: 229375850
Arguably the dependence of EDSCs on Analysis is not great but on the other
hand this is a strict improvement in the emitted IR and since EDSCs are an
alternative to builders it makes sense that they have as much access to
Analysis as Transforms.
PiperOrigin-RevId: 228967624
- when SSAValue/MLValue existed, code at several places was forced to create additional
aggregate temporaries of SmallVector<SSAValue/MLValue> to handle the conversion; get
rid of such redundant code
- use filling ctors instead of explicit loops
- for smallvectors, change insert(list.end(), ...) -> append(...
- improve comments at various places
- turn getMemRefAccess into MemRefAccess ctor and drop duplicated
getMemRefAccess. In the next CL, provide getAccess() accessors for load,
store, DMA op's to return a MemRefAccess.
PiperOrigin-RevId: 228243638
This CL introduces a simple set of Embedded Domain-Specific Components (EDSCs)
in MLIR components:
1. a `Type` system of shell classes that closely matches the MLIR type system. These
types are subdivided into `Bindable` leaf expressions and non-bindable `Expr`
expressions;
2. an `MLIREmitter` class whose purpose is to:
a. maintain a map of `Bindable` leaf expressions to concrete SSAValue*;
b. provide helper functionality to specify bindings of `Bindable` classes to
SSAValue* while verifying comformable types;
c. traverse the `Expr` and emit the MLIR.
This is used on a concrete example to implement MemRef load/store with clipping in the
LowerVectorTransfer pass. More specifically, the following pseudo-C++ code:
```c++
MLFuncBuilder *b = ...;
Location location = ...;
Bindable zero, one, expr, size;
// EDSL expression
auto access = select(expr < zero, zero, select(expr < size, expr, size - one));
auto ssaValue = MLIREmitter(b)
.bind(zero, ...)
.bind(one, ...)
.bind(expr, ...)
.bind(size, ...)
.emit(location, access);
```
is used to emit all the MLIR for a clipped MemRef access.
This simple EDSL can easily be extended to more powerful patterns and should
serve as the counterpart to pattern matchers (and could potentially be unified
once we get enough experience).
In the future, most of this code should be TableGen'd but for now it has
concrete valuable uses: make MLIR programmable in a declarative fashion.
This CL also adds Stmt, proper supporting free functions and rewrites
VectorTransferLowering fully using EDSCs.
The code for creating the EDSCs emitting a VectorTransferReadOp as loops
with clipped loads is:
```c++
Stmt block = Block({
tmpAlloc = alloc(tmpMemRefType),
vectorView = vector_type_cast(tmpAlloc, vectorMemRefType),
ForNest(ivs, lbs, ubs, steps, {
scalarValue = load(scalarMemRef, accessInfo.clippedScalarAccessExprs),
store(scalarValue, tmpAlloc, accessInfo.tmpAccessExprs),
}),
vectorValue = load(vectorView, zero),
tmpDealloc = dealloc(tmpAlloc.getLHS())});
emitter.emitStmt(block);
```
where `accessInfo.clippedScalarAccessExprs)` is created with:
```c++
select(i + ii < zero, zero, select(i + ii < N, i + ii, N - one));
```
The generated MLIR resembles:
```mlir
%1 = dim %0, 0 : memref<?x?x?x?xf32>
%2 = dim %0, 1 : memref<?x?x?x?xf32>
%3 = dim %0, 2 : memref<?x?x?x?xf32>
%4 = dim %0, 3 : memref<?x?x?x?xf32>
%5 = alloc() : memref<5x4x3xf32>
%6 = vector_type_cast %5 : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
for %i4 = 0 to 3 {
for %i5 = 0 to 4 {
for %i6 = 0 to 5 {
%7 = affine_apply #map0(%i0, %i4)
%8 = cmpi "slt", %7, %c0 : index
%9 = affine_apply #map0(%i0, %i4)
%10 = cmpi "slt", %9, %1 : index
%11 = affine_apply #map0(%i0, %i4)
%12 = affine_apply #map1(%1, %c1)
%13 = select %10, %11, %12 : index
%14 = select %8, %c0, %13 : index
%15 = affine_apply #map0(%i3, %i6)
%16 = cmpi "slt", %15, %c0 : index
%17 = affine_apply #map0(%i3, %i6)
%18 = cmpi "slt", %17, %4 : index
%19 = affine_apply #map0(%i3, %i6)
%20 = affine_apply #map1(%4, %c1)
%21 = select %18, %19, %20 : index
%22 = select %16, %c0, %21 : index
%23 = load %0[%14, %i1, %i2, %22] : memref<?x?x?x?xf32>
store %23, %5[%i6, %i5, %i4] : memref<5x4x3xf32>
}
}
}
%24 = load %6[%c0] : memref<1xvector<5x4x3xf32>>
dealloc %5 : memref<5x4x3xf32>
```
In particular notice that only 3 out of the 4-d accesses are clipped: this
corresponds indeed to the number of dimensions in the super-vector.
This CL also addresses the cleanups resulting from the review of the prevous
CL and performs some refactoring to simplify the abstraction.
PiperOrigin-RevId: 227367414