Commit Graph

26 Commits

Author SHA1 Message Date
Sergei Lebedev 52ec65c85e Implemented __eq__ and __ne__ in EDSC Python bindings
PiperOrigin-RevId: 232473201
2019-03-29 16:13:34 -07:00
River Riddle bf9c381d1d Remove InstWalker and move all instruction walking to the api facilities on Function/Block/Instruction.
PiperOrigin-RevId: 232388113
2019-03-29 16:12:59 -07:00
River Riddle 44e040dd63 Remove remaining references to OperationInst in all directories except for lib/Transforms.
PiperOrigin-RevId: 232322771
2019-03-29 16:10:38 -07:00
River Riddle a3d9ccaecb Replace the walkOps/visitOperationInst variants from the InstWalkers with the Instruction variants.
PiperOrigin-RevId: 232322030
2019-03-29 16:10:24 -07:00
Dimitrios Vytiniotis 9ca0691b06 Exposing logical operators in EDSC all the way up to Python.
PiperOrigin-RevId: 232299839
2019-03-29 16:10:08 -07:00
River Riddle de2d0dfbca Fold the functionality of OperationInst into Instruction. OperationInst still exists as a forward declaration and will be removed incrementally in a set of followup cleanup patches.
PiperOrigin-RevId: 232198540
2019-03-29 16:09:19 -07:00
River Riddle 5052bd8582 Define the AffineForOp and replace ForInst with it. This patch is largely mechanical, i.e. changing usages of ForInst to OpPointer<AffineForOp>. An important difference is that upon construction an AffineForOp no longer automatically creates the body and induction variable. To generate the body/iv, 'createBody' can be called on an AffineForOp with no body.
PiperOrigin-RevId: 232060516
2019-03-29 16:06:49 -07:00
Nicolas Vasilache 0353ef99eb Cleanup EDSCs and start a functional auto-generated library of custom Ops
This CL applies the following simplifications to EDSCs:
1. Rename Block to StmtList because an MLIR Block is a different, not yet
supported, notion;
2. Rework Bindable to drop specific storage and just use it as a simple wrapper
around Expr. The only value of Bindable is to force a static cast when used by
the user to bind into the emitter. For all intended purposes, Bindable is just
a lightweight check that an Expr is Unbound. This simplifies usage and reduces
the API footprint. After playing with it for some time, it wasn't worth the API
cognition overhead;
3. Replace makeExprs and makeBindables by makeNewExprs and copyExprs which is
more explicit and less easy to misuse;
4. Add generally useful functionality to MLIREmitter:
  a. expose zero and one for the ubiquitous common lower bounds and step;
  b. add support to create already bound Exprs for all function arguments as
  well as shapes and views for Exprs bound to memrefs.
5. Delete Stmt::operator= and replace by a `Stmt::set` method which is more
explicit.
6. Make Stmt::operator Expr() explicit.
7. Indexed.indices assertions are removed to pave the way for expressing slices
and views as well as to work with 0-D memrefs.

The CL plugs those simplifications with TableGen and allows emitting a full MLIR function for
pointwise add.

This "x.add" op is both type and rank-agnostic (by allowing ArrayRef of Expr
passed to For loops) and opens the door to spinning up a composable library of
existing and custom ops that should automate a lot of the tedious work in
TF/XLA -> MLIR.

Testing needs to be significantly improved but can be done in a separate CL.

PiperOrigin-RevId: 231982325
2019-03-29 16:05:23 -07:00
Nicolas Vasilache 35200435e7 Address cleanups from previous CL
This CL addresses some cleanups that were leftover after an incorrect rebase:
1. use StringSwitch
2. use // NOLINTNEXTLINE
3. remove a dead line of code

PiperOrigin-RevId: 231726640
2019-03-29 16:03:53 -07:00
Nicolas Vasilache 39d81f246a Introduce python bindings for MLIR EDSCs
This CL also introduces a set of python bindings using pybind11. The bindings
are exercised using a `test_py2andpy3.py` test suite that works for both
python 2 and 3.

`test_py3.py` on the other hand uses the more idiomatic,
python 3 only "PEP 3132 -- Extended Iterable Unpacking" to implement a rank
and type-agnostic copy with transposition.

Because python assignment is by reference, we cannot easily make the
assignment operator use the same type of sugaring as in C++; i.e. the
following:

```cpp
Stmt block = edsc::Block({
  For(ivs, zeros, shapeA, ones, {
    C[ivs] = IA[ivs] + IB[ivs]
})});
```

has no equivalent in the native Python EDSCs at this time.

However, the sugaring can be built as a simple DSL in python and is left as
future work.

PiperOrigin-RevId: 231337667
2019-03-29 15:59:14 -07:00
Nicolas Vasilache cacf05892e Add a C API for EDSCs in other languages + python
This CL adds support for calling EDSCs from other languages than C++.
Following the LLVM convention this CL:
1. declares simple opaque types and a C API in mlir-c/Core.h;
2. defines the implementation directly in lib/EDSC/Types.cpp and
lib/EDSC/MLIREmitter.cpp.

Unlike LLVM however the nomenclature for these types and API functions is not
well-defined, naming suggestions are most welcome.

To avoid the need for conversion functions, Types.h and MLIREmitter.h include
mlir-c/Core.h and provide constructors and conversion operators between the
mlir::edsc type and the corresponding C type.

In this first commit, mlir-c/Core.h only contains the types for the C API
to allow EDSCs to work from Python. This includes both a minimal set of core
MLIR
types (mlir_context_t, mlir_type_t, mlir_func_t) as well as the EDSC types
(edsc_mlir_emitter_t, edsc_expr_t, edsc_stmt_t, edsc_indexed_t). This can be
restructured in the future as concrete needs arise.

For now, the API only supports:
1. scalar types;
2. memrefs of scalar types with static or symbolic shapes;
3. functions with input and output of these types.

The C API is not complete wrt ownership semantics. This is in large part due
to the fact that python bindings are written with Pybind11 which allows very
idiomatic C++ bindings. An effort is made to write a large chunk of these
bindings using the C API but some C++isms are used where the design benefits
from this simplication. A fully isolated C API will make more sense once we
also integrate with another language like Swift and have enough use cases to
drive the design.

Lastly, this CL also fixes a bug in mlir::ExecutionEngine were the order of
declaration of llvmContext and the JIT result in an improper order of
destructors (which used to crash before the fix).

PiperOrigin-RevId: 231290250
2019-03-29 15:41:53 -07:00
Chris Lattner b42bea215a Change AffineApplyOp to produce a single result, simplifying the code that
works with it, and updating the g3docs.

PiperOrigin-RevId: 231120927
2019-03-29 15:40:38 -07:00
River Riddle 36babbd781 Change the ForInst induction variable to be a block argument of the body instead of the ForInst itself. This is a necessary step in converting ForInst into an operation.
PiperOrigin-RevId: 231064139
2019-03-29 15:40:23 -07:00
Nicolas Vasilache e4020c2d1a Add support for Return in EDSCs
This CL adds the Return op to EDSCs types and emitter.
This allows generating full function bodies that can be compiled all the way
down to LLVMIR and executed on CPU.

At this point, the MLIR lacks the testing infrastructure to exercise this.
End-to-end testing of full functions written in EDSCs is left for a future CL.

PiperOrigin-RevId: 230527530
2019-03-29 15:31:50 -07:00
Uday Bondhugula 7669204304 Improve / fix documentation for affine map composition utilities - NFC
- improve/fix doc comments for affine apply composition related methods.
- drop makeSingleValueComposedAffineApply - really redundant and out of line in
  a public API; it's just returning the first result of the composed affine
  apply op, and not making a single result affine map or an affine_apply op.

PiperOrigin-RevId: 230406169
2019-03-29 15:30:47 -07:00
Nicolas Vasilache 119af6712e Cleanup spurious printing bits in EDSCs
This CL also makes ScopedEDSCContexts to reset the Bindable numbering when
creating a new context.
This is useful to write minimal tests that don't use FileCheck pattern
captures for now.

PiperOrigin-RevId: 230079997
2019-03-29 15:28:13 -07:00
Nicolas Vasilache 9f3f39d61a Cleanup EDSCs
This CL performs a bunch of cleanups related to EDSCs that are generally
useful in the context of using them with a simple wrapping C API (not in this
CL) and with simple language bindings to Python and Swift.

PiperOrigin-RevId: 230066505
2019-03-29 15:27:58 -07:00
Nicolas Vasilache 4573a8da9a Fix improperly indexed DimOp in LowerVectorTransfers.cpp
This CL fixes a misunderstanding in how to build DimOp which triggered
execution issues in the CPU path.

The problem is that, given a `memref<?x4x?x8x?xf32>`, the expressions to
construct the dynamic dimensions should be:
`dim %arg, 0 : memref<?x4x?x8x?xf32>`
`dim %arg, 2 : memref<?x4x?x8x?xf32>`
and
`dim %arg, 4 : memref<?x4x?x8x?xf32>`

Before this CL, we wold construct:
`dim %arg, 0 : memref<?x4x?x8x?xf32>`
`dim %arg, 1 : memref<?x4x?x8x?xf32>`
`dim %arg, 2 : memref<?x4x?x8x?xf32>`

and expect the other dimensions to be constants.
This assumption seems consistent at first glance with the syntax of alloc:

```
    %tensor = alloc(%M, %N, %O) : memref<?x4x?x8x?xf32>
```

But this was actuallyincorrect.

This CL also makes the relevant functions available to EDSCs and removes
duplication of the incorrect function.

PiperOrigin-RevId: 229622766
2019-03-29 15:24:13 -07:00
Jacques Pienaar 4b2b5f5267 Enable specifying the op for which the reference implementation should be printed.
Allows emitting reference implementation of multiple ops inside the test lowering pass.

PiperOrigin-RevId: 229603494
2019-03-29 15:22:43 -07:00
Jacques Pienaar 9d4bb57189 Start a testing pass for EDSC lowering.
This is mostly plumbing to start allowing testing EDSC lowering. Prototype specifying reference implementation using verbose format without any generation/binding support. Add test pass that dumps the constructed EDSC (of which there can only be one). The idea is to enable iterating from multiple sides, this is wrong on many dimensions at the moment.

PiperOrigin-RevId: 229570535
2019-03-29 15:21:44 -07:00
Nicolas Vasilache 515ce1e68e Add edsc::Indexed helper struct to act as syntactic sugar
This CL adds edsc::Indexed.

This helper class exists purely for sugaring purposes and allows writing
expressions such as:

```mlir
   Indexed A(...), B(...), C(...);
   ForNest(ivs, zeros, shapeA, ones, {
     C[ivs] = A[ivs] + B[ivs]
   });
```

PiperOrigin-RevId: 229388644
2019-03-29 15:17:37 -07:00
Nicolas Vasilache 424041ad58 Add EDSC sugar
This allows load, store and ForNest to be used with both Expr and Bindable.
This simplifies writing generic pieces of MLIR snippet.

For instance, a generic pointwise add can now be written:

```cpp
// Different Bindable ivs, one per loop in the loop nest.
auto ivs = makeBindables(shapeA.size());
Bindable zero, one;
// Same bindable, all equal to `zero`.
SmallVector<Bindable, 8> zeros(ivs.size(), zero);
// Same bindable, all equal to `one`.
SmallVector<Bindable, 8> ones(ivs.size(), one);
// clang-format off
Bindable A, B, C;
Stmt scalarA, scalarB, tmp;
Stmt block = edsc::Block({
  ForNest(ivs, zeros, shapeA, ones, {
    scalarA = load(A, ivs),
    scalarB = load(B, ivs),
    tmp = scalarA + scalarB,
    store(tmp, C, ivs)
  }),
});
// clang-format on
```

This CL also adds some extra support for pretty printing that will be used in
a future CL when we introduce standalone testing of EDSCs. At the momen twe
are lacking the basic infrastructure to write such tests.

PiperOrigin-RevId: 229375850
2019-03-29 15:16:53 -07:00
Nicolas Vasilache 1b171e9357 Add EDSC support for operator*
PiperOrigin-RevId: 229097351
2019-03-29 15:12:55 -07:00
Nicolas Vasilache b941dc8238 [MLIR] Make MLIREmitter emit composed single-result AffineMap by construction
Arguably the dependence of EDSCs on Analysis is not great but on the other
hand this is a strict improvement in the emitted IR and since EDSCs are an
alternative to builders it makes sense that they have as much access to
Analysis as Transforms.

PiperOrigin-RevId: 228967624
2019-03-29 15:12:11 -07:00
Uday Bondhugula 56b3640b94 Misc readability and doc / code comment related improvements - NFC
- when SSAValue/MLValue existed, code at several places was forced to create additional
  aggregate temporaries of SmallVector<SSAValue/MLValue> to handle the conversion; get
  rid of such redundant code

- use filling ctors instead of explicit loops

- for smallvectors, change insert(list.end(), ...) -> append(...

- improve comments at various places

- turn getMemRefAccess into MemRefAccess ctor and drop duplicated
  getMemRefAccess. In the next CL, provide getAccess() accessors for load,
  store, DMA op's to return a MemRefAccess.

PiperOrigin-RevId: 228243638
2019-03-29 15:02:41 -07:00
Nicolas Vasilache 73f5c9c380 [MLIR] Sketch a simple set of EDSCs to declaratively write MLIR
This CL introduces a simple set of Embedded Domain-Specific Components (EDSCs)
in MLIR components:
1. a `Type` system of shell classes that closely matches the MLIR type system. These
types are subdivided into `Bindable` leaf expressions and non-bindable `Expr`
expressions;
2. an `MLIREmitter` class whose purpose is to:
  a. maintain a map of `Bindable` leaf expressions to concrete SSAValue*;
  b. provide helper functionality to specify bindings of `Bindable` classes to
     SSAValue* while verifying comformable types;
  c. traverse the `Expr` and emit the MLIR.

This is used on a concrete example to implement MemRef load/store with clipping in the
LowerVectorTransfer pass. More specifically, the following pseudo-C++ code:
```c++
MLFuncBuilder *b = ...;
Location location = ...;
Bindable zero, one, expr, size;
// EDSL expression
auto access = select(expr < zero, zero, select(expr < size, expr, size - one));
auto ssaValue = MLIREmitter(b)
    .bind(zero, ...)
    .bind(one, ...)
    .bind(expr, ...)
    .bind(size, ...)
    .emit(location, access);
```
is used to emit all the MLIR for a clipped MemRef access.

This simple EDSL can easily be extended to more powerful patterns and should
serve as the counterpart to pattern matchers (and could potentially be unified
once we get enough experience).

In the future, most of this code should be TableGen'd but for now it has
concrete valuable uses: make MLIR programmable in a declarative fashion.

This CL also adds Stmt, proper supporting free functions and rewrites
VectorTransferLowering fully using EDSCs.

The code for creating the EDSCs emitting a VectorTransferReadOp as loops
with clipped loads is:

```c++
  Stmt block = Block({
    tmpAlloc = alloc(tmpMemRefType),
    vectorView = vector_type_cast(tmpAlloc, vectorMemRefType),
    ForNest(ivs, lbs, ubs, steps, {
      scalarValue = load(scalarMemRef, accessInfo.clippedScalarAccessExprs),
      store(scalarValue, tmpAlloc, accessInfo.tmpAccessExprs),
    }),
    vectorValue = load(vectorView, zero),
    tmpDealloc = dealloc(tmpAlloc.getLHS())});
  emitter.emitStmt(block);
```

where `accessInfo.clippedScalarAccessExprs)` is created with:

```c++
select(i + ii < zero, zero, select(i + ii < N, i + ii, N - one));
```

The generated MLIR resembles:

```mlir
    %1 = dim %0, 0 : memref<?x?x?x?xf32>
    %2 = dim %0, 1 : memref<?x?x?x?xf32>
    %3 = dim %0, 2 : memref<?x?x?x?xf32>
    %4 = dim %0, 3 : memref<?x?x?x?xf32>
    %5 = alloc() : memref<5x4x3xf32>
    %6 = vector_type_cast %5 : memref<5x4x3xf32>, memref<1xvector<5x4x3xf32>>
    for %i4 = 0 to 3 {
      for %i5 = 0 to 4 {
        for %i6 = 0 to 5 {
          %7 = affine_apply #map0(%i0, %i4)
          %8 = cmpi "slt", %7, %c0 : index
          %9 = affine_apply #map0(%i0, %i4)
          %10 = cmpi "slt", %9, %1 : index
          %11 = affine_apply #map0(%i0, %i4)
          %12 = affine_apply #map1(%1, %c1)
          %13 = select %10, %11, %12 : index
          %14 = select %8, %c0, %13 : index
          %15 = affine_apply #map0(%i3, %i6)
          %16 = cmpi "slt", %15, %c0 : index
          %17 = affine_apply #map0(%i3, %i6)
          %18 = cmpi "slt", %17, %4 : index
          %19 = affine_apply #map0(%i3, %i6)
          %20 = affine_apply #map1(%4, %c1)
          %21 = select %18, %19, %20 : index
          %22 = select %16, %c0, %21 : index
          %23 = load %0[%14, %i1, %i2, %22] : memref<?x?x?x?xf32>
          store %23, %5[%i6, %i5, %i4] : memref<5x4x3xf32>
        }
      }
    }
    %24 = load %6[%c0] : memref<1xvector<5x4x3xf32>>
    dealloc %5 : memref<5x4x3xf32>
```

In particular notice that only 3 out of the 4-d accesses are clipped: this
corresponds indeed to the number of dimensions in the super-vector.

This CL also addresses the cleanups resulting from the review of the prevous
CL and performs some refactoring to simplify the abstraction.

PiperOrigin-RevId: 227367414
2019-03-29 14:50:23 -07:00