Commit Graph

90 Commits

Author SHA1 Message Date
wren romano 8cb332406c [mlir][sparse] Enhancing sparse=>sparse conversion.
Fixes: https://github.com/llvm/llvm-project/issues/51652

Depends On D122060

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D122061
2022-05-16 15:42:19 -07:00
River Riddle a8308020ac [mlir] Remove special case parsing/printing of `func` operations
This was leftover from when the standard dialect was destroyed, and
when FuncOp moved to the func dialect. Now that these transitions
have settled a bit we can drop these.

Most updates were handled using a simple regex: replace `^( *)func` with `$1func.func`

Differential Revision: https://reviews.llvm.org/D124146
2022-05-06 13:36:15 -07:00
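The regex rewrite mentioned in this commit can be sketched in a few lines of Python. This is only an illustration of the pattern quoted in the message, not the script that was actually used; the helper name `update_func_syntax` and the sample IR are hypothetical, and Python spells the backreference `\1` where the message writes `$1`.

```python
import re

# Same pattern as in the commit message: optional leading spaces, then `func`
# at the start of a line. MULTILINE makes `^` match at every line start.
FUNC_PATTERN = re.compile(r"^( *)func", flags=re.MULTILINE)


def update_func_syntax(text: str) -> str:
    """Rewrite line-leading `func` to `func.func`, keeping the indentation."""
    return FUNC_PATTERN.sub(r"\1func.func", text)


# Hypothetical test input in the old syntax.
old_ir = "func @foo() {\n  return\n}\n  func private @bar()\n"
print(update_func_syntax(old_ir))
```

Note that the pattern has no word boundary and is not idempotent, so running it twice on the same file would produce `func.func.func`. That is consistent with the message saying only "most" updates were handled this way.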
Aart Bik 952fa3018e [mlir][sparse] add more zero-preserving unary ops to sparse compiler
Although we now have semi-rings to deal with arbitrary ops,
it is still good to convey zero-preserving semantics of
ops to the sparse compiler.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D125043
2022-05-05 15:35:19 -07:00
Nick Kreeger 4620032ee3 Revert "[mlir][sparse] Expose SpareTensor passes as enums instead of opaque numbers for vectorization and parallelization options."
This reverts commit d59cf901cb.

Build fails on NVIDIA Sparse tests:
https://lab.llvm.org/buildbot/#/builders/61/builds/25447
2022-04-23 20:14:48 -05:00
Nick Kreeger d59cf901cb [mlir][sparse] Expose SpareTensor passes as enums instead of opaque numbers for vectorization and parallelization options.
The SparseTensor passes currently use opaque numbers for the CLI, despite using an enum internally. This patch exposes the enums instead of numbered items that are matched back to the enum.

Fixes GitHub issue #53389

Reviewed by: aartbik, mehdi_amini

Differential Revision: https://reviews.llvm.org/D123876
2022-04-23 19:16:57 -05:00
River Riddle fb35cd3baf [mlir][NFC] Update textual references of `func` to `func.func` in SparseTensor tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:29 -07:00
River Riddle af371f9f98 Reland [GreedPatternRewriter] Preprocess constants while building worklist when not processing top down
Reland Note: Adds a fix to properly mark a commutative operation as folded if we change the order
             of its operands. This was uncovered by the fact that we no longer re-process constants.

This avoids accidentally reversing the order of constants during successive
application, e.g. when running the canonicalizer. This helps reduce the number
of iterations, and also avoids unnecessary changes to input IR.

Fixes #51892

Differential Revision: https://reviews.llvm.org/D122692
2022-04-07 11:31:42 -07:00
Aart Bik 0b55f94d2b [mlir][sparse] replace stack-based access pattern with dyn-alloc
Rationale:
Allocating the temporary buffers for access pattern expansion on the stack
(using alloca) is a bit too aggressive, since it easily runs out of stack space
for large enveloping tensor dimensions. This revision switches these buffers
to dynamic allocation with explicit alloc/dealloc pairs.

Reviewed By: bixia, wrengr

Differential Revision: https://reviews.llvm.org/D123253
2022-04-06 17:10:43 -07:00
wren romano 63bdcaf92a [mlir][sparse] Moving `delete coo` into codegen instead of runtime library
Prior to this change there were a number of places where the allocation and deallocation of SparseTensorCOO objects were not cleanly paired, leading to inconsistencies regarding whether each function released its tensor/coo arguments or not, as well as making it easy to run afoul of memory leaks, use-after-free, or double-free errors.  This change cleans up the codegen vs runtime boundary to resolve those issues.  Now, the only time the runtime library frees an object is either (a) because it's a function explicitly designed to do so, or (b) because the allocated object is entirely local to the function and would be a memory leak if not released.  Thus, now the codegen takes complete responsibility for releasing any objects it caused to be allocated.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D122435
2022-04-01 11:08:52 -07:00
Mehdi Amini ba43d6f85c Revert "[GreedPatternRewriter] Preprocess constants while building worklist when not processing top down"
This reverts commit 59bbc7a085.

This exposes an issue breaking the contract of
`applyPatternsAndFoldGreedily` where we "converge" without applying
remaining patterns.
2022-04-01 06:16:55 +00:00
River Riddle 59bbc7a085 [GreedPatternRewriter] Preprocess constants while building worklist when not processing top down
This avoids accidentally reversing the order of constants during successive
application, e.g. when running the canonicalizer. This helps reduce the number
of iterations, and also avoids unnecessary changes to input IR.

Fixes #51892

Differential Revision: https://reviews.llvm.org/D122692
2022-03-31 12:08:55 -07:00
Javier Setoain 7783a178f5 [mlir][Sparse] Add option for VLA sparsification
Use "enable-vla-vectorization=vla" to generate vector length agnostic
loops during vectorization. This option works for vectorization strategy 2.

Differential Revision: https://reviews.llvm.org/D118379
2022-03-25 10:54:49 +00:00
wren romano df948127ac [mlir][sparse] Adding Action::kSparseToSparse for @newSparseTensor
This is work towards: https://github.com/llvm/llvm-project/issues/51652

This differential doesn't yet make use of the new kSparseToSparse, just introduces it.  The differential that finally makes use of it is D122061, which is the final differential in the chain that fixes bug 51652.

Depends On D122054

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D122055
2022-03-22 13:46:59 -07:00
Aart Bik 69a7759b40 [mlir][sparse] implement loop index value vectorization
with CHECK and integration test

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D122040
2022-03-21 10:40:38 -07:00
Jim Kitchen 414ed019ac [mlir][sparse] Introduce new binary and unary op
When the sparse_tensor dialect lowers linalg.generic,
it makes inferences about how the operations should
affect the looping logic. For example, multiplication
is an intersection while addition is a union of two
sparse tensors.

The new binary and unary op separate the looping logic
from the computation by nesting the computation code
inside a block which is merged at the appropriate level
in the lowered looping code.

The binary op can have custom computation code for the
overlap, left, and right sparse overlap regions. The
unary op can have custom computation code for the
present and absent values.

Reviewed by: aartbik

Differential Revision: https://reviews.llvm.org/D121018
2022-03-17 12:31:09 -05:00
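The intersection/union distinction and the overlap, left, and right regions described in this commit can be sketched on plain Python dictionaries. This is a conceptual model only, assuming a sparse vector is a dict from index to stored value; `merge_binary` and its parameters are hypothetical names, not the dialect's API.

```python
def merge_binary(a, b, overlap=None, left=None, right=None):
    """Combine two sparse vectors (dicts: index -> value).

    overlap(x, y) runs where both operands store a value,
    left(x) where only `a` does, right(y) where only `b` does.
    A region whose callback is None contributes nothing to the result.
    """
    result = {}
    for idx in sorted(set(a) | set(b)):
        if idx in a and idx in b:
            if overlap is not None:
                result[idx] = overlap(a[idx], b[idx])
        elif idx in a:
            if left is not None:
                result[idx] = left(a[idx])
        elif right is not None:
            result[idx] = right(b[idx])
    return result


a = {0: 1.0, 2: 2.0}
b = {2: 3.0, 5: 4.0}

# Multiplication only needs the overlap region: an intersection.
product = merge_binary(a, b, overlap=lambda x, y: x * y)

# Addition needs all three regions: a union.
total = merge_binary(a, b, overlap=lambda x, y: x + y,
                     left=lambda x: x, right=lambda y: y)

print(product)  # {2: 6.0}
print(total)    # {0: 1.0, 2: 5.0, 5: 4.0}
```

Keeping the region callbacks separate from the iteration is the point of the sketch: the looping logic stays the same while the computation per region is swapped in.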
gysit 7294be2b8e [mlir][linalg] Replace linalg.fill by OpDSL variant.
The revision removes the linalg.fill operation and renames the OpDSL generated linalg.fill_tensor operation to replace it. After the change, all named structured operations are defined via OpDSL and there are no handwritten operations left.

A side-effect of the change is that the pretty printed form
```
%1 = linalg.fill(%cst, %0) : f32, tensor<?x?xf32> -> tensor<?x?xf32>
```
changes to
```
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<?x?xf32>) -> tensor<?x?xf32>
```
Additionally, the builder signature now takes input and output value ranges, as is the case for all other OpDSL operations:
```
rewriter.create<linalg::FillOp>(loc, val, output)
```
changes to
```
rewriter.create<linalg::FillOp>(loc, ValueRange{val}, ValueRange{output})
```
All other changes remain minimal. In particular, the canonicalization patterns are the same and the `value()`, `output()`, and `result()` methods are now implemented by the FillOpInterface.

Depends On D120726

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D120728
2022-03-14 10:51:08 +00:00
Aart Bik 52fb4f53c2 [mlir][sparse] added linalg.dot to sparse kernel collection
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D121315
2022-03-09 15:10:44 -08:00
Aart Bik 53cc3a0637 [mlir][sparse] index support in sparse compiler codegen
This revision adds support for the linalg.index to the sparse compiler
pipeline. In essence, this adds the ability to refer to indices in
the tensor index expression, as illustrated below:

 Y[i, j, k, l, m] = T[i, j, k, l, m]  * i * j

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D121251
2022-03-08 17:25:36 -08:00
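The index expression from this commit, Y[i, j, k, l, m] = T[i, j, k, l, m] * i * j, can be illustrated on a coordinate-style sparse tensor. This is a sketch under the assumption that the tensor is stored as a Python dict from index tuples to values; `scale_by_indices` is a hypothetical helper and says nothing about how the sparse compiler generates its loops.

```python
def scale_by_indices(t):
    """Compute Y[i, j, k, l, m] = T[i, j, k, l, m] * i * j over stored entries.

    Only stored entries are visited; entries absent from T stay absent in Y,
    since anything multiplied by an implicit zero is still zero.
    """
    y = {}
    for idx, value in t.items():
        i, j = idx[0], idx[1]
        y[idx] = value * i * j
    return y


# Hypothetical 5-d sparse tensor with three stored entries.
t = {
    (1, 2, 0, 0, 0): 3.0,
    (0, 4, 1, 0, 0): 5.0,
    (2, 3, 0, 1, 0): 1.5,
}
y = scale_by_indices(t)
print(y[(1, 2, 0, 0, 0)])  # 6.0
```

The entry at i = 0 becomes an explicitly stored zero in this sketch; whether such zeros are kept or dropped is a storage decision the sketch does not try to model.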
Aart Bik 34381a76c1 [mlir][sparse] avoid some codeup in sparsification transformation
A very small refactoring, but a big impact on tests that expect an exact order.
This revision fixes the tests, but also makes them less brittle for similar
minor changes in the future!

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D119992
2022-02-16 17:39:04 -08:00
Matthias Springer fe0bf7d469 [mlir][vector][NFC] Use CombiningKindAttr instead of StringAttr
This makes the op consistent with other ops in vector dialect.

Differential Revision: https://reviews.llvm.org/D119343
2022-02-10 19:13:29 +09:00
River Riddle dec8af701f [mlir] Move SelectOp from Standard to Arithmetic
This is part of splitting up the standard dialect. See https://llvm.discourse.group/t/standard-dialect-the-final-chapter/ for discussion.

Differential Revision: https://reviews.llvm.org/D118648
2022-02-02 14:45:12 -08:00
Uday Bondhugula 92ccb8cc50 [MLIR][NFC] Update SCF pass cmd line names to prefix scf
Update SCF pass cmd line names to prefix `scf`. This is consistent with
guidelines/convention on how to name dialect passes. This also avoids
ambiguity on the context given the multiple `for` operations in the
tree.

NFC.

Differential Revision: https://reviews.llvm.org/D118564
2022-01-31 07:09:30 +05:30
Matthias Springer ab47418df6 [mlir][bufferize] Merge tensor-constant-bufferize into arith-bufferize
The bufferization of arith.constant ops is also switched over to BufferizableOpInterface-based bufferization. The old implementation is deleted. Both implementations utilize GlobalCreator, now renamed to just `getGlobalFor`.

GlobalCreator no longer maintains a set of all created allocations to avoid duplicate allocations of the same constant. Instead, `getGlobalFor` scans the module to see if there is already a global allocation with the same constant value.

For compatibility reasons, it is still possible to create a pass that bufferizes only `arith.constant`. This pass (createConstantBufferizePass) can be deleted once all users have switched over to One-Shot bufferization.

Differential Revision: https://reviews.llvm.org/D118483
2022-01-30 21:37:48 +09:00
Aart Bik efa15f4178 [mlir][sparse] add ability for sparse tensor output
Rationale:
Although file I/O is a bit alien to MLIR itself, we provide two convenient ways
for sparse tensor I/O. The input part was already there (behind the swiss army
knife sparse_tensor.new). Now we have a sparse_tensor.out to write out data. As
before, the ops are kept vague and may change in the future. For now this
allows us to compare TACO vs MLIR very easily.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D117850
2022-01-21 15:43:29 -08:00
wren romano bc04a47038 [mlir][sparse] adding OverheadType::kIndex
Depends On D115008

This change opens the way for D115012, and removes some corner cases in `CodegenUtils.cpp`. The `SparseTensorAttrDefs.td` already specifies that we allow `0` bitwidth for the two overhead types and that it is interpreted to mean the architecture's native width.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D115010
2022-01-04 16:15:54 -08:00
Aart Bik e1b9d80532 [mlir][sparse] add a few more sparse output tests (for generated IR)
also fixes two typos in IR doc

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115288
2021-12-07 15:31:29 -08:00
Aart Bik 4f2ec7f983 [mlir][sparse] finalize sparse output in the presence of reductions
This revision implements sparse outputs (from scratch) in all cases where
the loops can be reordered so that all but one of the parallel loops are
outermost. If the inner parallel loop appears inside one or more reduction
loops, then an access pattern expansion is required (a.k.a. workspaces in
TACO speak).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115091
2021-12-07 10:54:29 -08:00
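The "access pattern expansion" (workspace) idea referred to in this commit can be sketched for one output row of a sparse product. This is a generic illustration of the workspace technique, not the code the sparse compiler emits; the function name, the dict-based storage, and the `values`/`filled`/`added` buffer names are assumptions made for the example.

```python
def row_times_matrix_with_workspace(a_row, b_rows, num_cols):
    """Compute x[j] = sum_k a[k] * B[k, j] for one sparse row `a`.

    a_row:  dict k -> value (sparse row)
    b_rows: dict k -> dict j -> value (sparse matrix, row by row)
    The parallel index j sits inside the reduction over k, so contributions
    to x arrive out of order. A dense workspace absorbs them.
    """
    values = [0.0] * num_cols    # expanded (dense) values
    filled = [False] * num_cols  # which positions were touched
    added = []                   # touched positions, in order of first touch

    for k, a_val in a_row.items():
        for j, b_val in b_rows.get(k, {}).items():
            if not filled[j]:
                filled[j] = True
                added.append(j)
            values[j] += a_val * b_val

    # Compress: emit touched positions in index order, then reset the
    # workspace so it could be reused for the next row.
    added.sort()
    out = [(j, values[j]) for j in added]
    for j in added:
        values[j] = 0.0
        filled[j] = False
    return out


a_row = {0: 2.0, 2: 1.0}
b_rows = {0: {1: 3.0, 3: 1.0}, 2: {1: 4.0}}
print(row_times_matrix_with_workspace(a_row, b_rows, num_cols=4))
# [(1, 10.0), (3, 2.0)]
```

Resetting only the touched positions keeps the cost proportional to the number of nonzeros produced rather than to the dense row length.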
Aart Bik 0e85232fa3 [mlir][sparse] refine simply dynamic sparse tensor outputs
Proper test for sparse tensor outputs is a single condition throughout
the whole tensor index expression (not a general conjunction, since this
may include other conditions that cause cancellation).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D114810
2021-11-30 13:45:58 -08:00
Aart Bik 7d4da4e1ab [mlir][sparse] generalize sparse tensor output implementation
Moves sparse tensor output support forward by generalizing from injective
insertions only to include reductions. This revision accepts the case with all
parallel outer and all reduction inner loops, since that can be handled with
an injective insertion still. Next revision will allow the inner parallel loop
to move inward (but that will require "access pattern expansion" aka "workspace").

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D114399
2021-11-29 16:15:53 -08:00
Alexander Belyaev 57470abc41 [mlir] Move memref.[tensor_load|buffer_cast|clone] to "bufferization" dialect.
https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712

Differential Revision: https://reviews.llvm.org/D114552
2021-11-25 11:50:39 +01:00
Mogball 7c5ecc8b7e [mlir][vector] Insert/extract element can accept index
`vector::InsertElementOp` and `vector::ExtractElementOp` have had their `position`
operand changed to accept `AnySignlessIntegerOrIndex` for better operability with
operations that use `index`, such as affine loops.

LLVM's `extractelement` and `insertelement` can also accept `i64`, so lowering
directly to these operations without explicitly inserting casts is allowed. SPIRV's
equivalent ops can also accept `i64`.

Reviewed By: nicolasvasilache, jpienaar

Differential Revision: https://reviews.llvm.org/D114139
2021-11-18 22:40:29 +00:00
Aart Bik 1ce77b562d [mlir][sparse] refine lexicographic insertion to any tensor
First version was vectors only. With some clever "path" insertion,
we now support any d-dimensional tensor. Up next: reductions too

Reviewed By: bixia, wrengr

Differential Revision: https://reviews.llvm.org/D114024
2021-11-17 18:08:42 -08:00
Aart Bik f66e5769d4 [mlir][sparse] first version of "truly" dynamic sparse tensors as outputs of kernels
This revision contains all "sparsification" ops and rewriting necessary to support sparse output tensors when the kernel has no reduction (viz. insertions occur in lexicographic order and are "injective"). This will be later generalized to allow reductions too. Also, this first revision only supports sparse 1-d tensors (viz. vectors) as output in the runtime support library. This will be generalized to n-d tensors shortly. But this way, the revision is kept to a manageable size.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113705
2021-11-15 15:33:32 -08:00
Aart Bik 2f0ee17017 [mlir][sparse] test for SIMD reduction chaining in consecutive vector loops
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113197
2021-11-05 10:14:17 -07:00
Aart Bik 7373cabcda [mlir][sparse] implement full reduction "scalarization" across loop nests
The earlier reduction "scalarization" was only applied to a chain of
*innermost* and *for* loops. This revision generalizes this to any
nesting of for- and while-loops. This implies that reductions can be
implemented with a lot less load and store operations. The chaining
is implemented with a forest of yield statements (but not as bad as
if we also included the while-induction).

Fixes https://bugs.llvm.org/show_bug.cgi?id=52311

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113078
2021-11-04 17:38:47 -07:00
Aart Bik 4aa9b39824 [mlir][sparse] reject sparsity annotation in "scalar" tensors
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113152
2021-11-04 09:49:05 -07:00
wren romano 5389cdc8f6 [mlir][sparse] Adding dynamic-size support for sparse=>dense conversion
Depends On D110790

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D112674
2021-10-28 16:56:18 -07:00
wren romano 28882b6575 [mlir][sparse] Implementing sparse=>dense conversion.
Depends On D110882, D110883, D110884

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D110790
2021-10-28 15:27:35 -07:00
Aart Bik 1e6ef0cfb0 [mlir][sparse] refine trait of sparse_tensor.convert
Rationale:
The currently used trait was demanding that all types are the same
which is not true (since the sparse part may change and the dim sizes
may be relaxed). This revision uses the correct trait and makes the
rank match test explicit in the verify method.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D112576
2021-10-26 14:36:49 -07:00
Aart Bik 1b15160ef3 [mlir][sparse] lower trivial tensor.cast on identical sparse tensors
Even though tensor.cast is not part of the sparse tensor dialect,
it may be used to cast static dimension sizes to dynamic dimension
sizes for sparse tensors without changing the actual sparse tensor
itself. Those cases should be lowered properly when replacing sparse
tensor types with their opaque pointers. Likewise, no-op sparse
conversions are handled by this revision in a similar manner.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D112173
2021-10-25 10:30:19 -07:00
Aart Bik bd5494d127 [mlir][sparse] make index type explicit in public API of support library
The current implementation used explicit index->int64_t casts for some, but
not all instances of passing values of type "index" to and from the sparse
support library. This revision makes the situation more consistent by
using new "index_t" type at all such places  (which allows for less trivial
casting in the generated MLIR code).  Note that the current revision still
assumes that "index" is 64-bit wide. If we want to support targets with
alternative "index" bit widths, we need to build the support library differently.
But the current revision is a step forward by making this requirement explicit
and more visible.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D112122
2021-10-20 12:46:31 -07:00
Aart Bik 9d1db3d4a1 [mlir][sparse] generalize sparse_tensor.convert on static/dynamic dimension sizes
This revision lifts the artificial restriction on having exact matches between
source and destination type shapes. A static size may become dynamic. We still
reject changing a dynamic size into a static size to avoid the need for a
runtime "assert" on the conversion. This revision also refactors some of the
conversion code to share same-content buffers.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111915
2021-10-18 13:54:03 -07:00
Aart Bik b24788abd8 [mlir][sparse] implement sparse tensor init operation
Next step towards supporting sparse tensors outputs.
Also some minor refactoring of enum constants as well
as replacing tensor arguments with proper buffer arguments
(latter is required for more general sizes arguments for
the sparse_tensor.init operation, as well as more general
sparse_tensor.convert operations later)

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D111771
2021-10-15 09:33:16 -07:00
Aart Bik a652e5b53a [mlir][sparse] emergency fix after constant -> arith.constant change
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D111743
2021-10-13 10:26:17 -07:00
Aart Bik 35517a251d [mlir][sparse] add init sparse tensor operation
This is the first step towards supporting general sparse tensors as output
of operations. The init sparse tensor is used to materialize an empty sparse
tensor of given shape and sparsity into a subsequent computation (similar to
the dense tensor init operation counterpart).

Example:
  %c = sparse_tensor.init %d1, %d2 : tensor<?x?xf32, #SparseMatrix>
  %0 = linalg.matmul
    ins(%a, %b: tensor<?x?xf32>, tensor<?x?xf32>)
    outs(%c: tensor<?x?xf32, #SparseMatrix>) -> tensor<?x?xf32, #SparseMatrix>

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111684
2021-10-13 09:47:56 -07:00
Mogball a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
Aart Bik 849f016ce8 [mlir][sparse] accept affine subscripts in outer dimensions of dense memrefs
This relaxes vectorization of dense memrefs a bit so that affine expressions
are allowed in more outer dimensions. Vectorization of non unit stride
references is disabled though, since this seems ineffective anyway.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111469
2021-10-11 11:45:14 -07:00
Aart Bik 16b8f4ddae [mlir][sparse] add a "release" operation to sparse tensor dialect
We have several ways to materialize sparse tensors (new and convert) but no explicit operation to release the underlying sparse storage scheme at runtime (other than making an explicit delSparseTensor() library call). To simplify memory management, a sparse_tensor.release operation has been introduced that lowers to the runtime library call while keeping tensors, opaque pointers, and memrefs transparent in the initial IR.

*Note* There is obviously some tension between the concept of immutable tensors and memory management methods. This tension is addressed by simply stating that after the "release" call, no further memref related operations are allowed on the tensor value. We expect the design to evolve over time, however, and arrive at a more satisfactory view of tensors and buffers eventually.

Bug:
http://llvm.org/pr52046

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111099
2021-10-05 09:35:59 -07:00
Aart Bik ec97a205c3 [mlir][sparse] preserve zero-initialization for materializing buffers
This revision makes sure that when the output buffer materializes locally
(in contrast with the passing in of output tensors either in-place or not
in-place), the zero initialization assumption is preserved. This also adds
a bit more documentation on our sparse kernel assumption (viz. TACO
assumptions).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D110442
2021-09-27 11:22:05 -07:00
Bixia Zheng fbd5821c6f Implement the conversion from sparse constant to sparse tensors.
The sparse constant provides a constant tensor in coordinate format. We first split the sparse constant into a constant tensor for indices and a constant tensor for values. We then generate a loop to fill a sparse tensor in coordinate format using the tensors for the indices and the values. Finally, we convert the sparse tensor in coordinate format to the destination sparse tensor format.

Add tests.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D110373
2021-09-27 09:47:29 -07:00
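The three steps described in this commit (split the constant into indices and values, fill a coordinate-format tensor in a loop, convert to the destination format) can be sketched as follows. This is a minimal model, assuming a 2-d tensor and CSR as the destination format; both function names are hypothetical and the real conversion goes through the runtime support library.

```python
def coo_from_constant(indices, values):
    """Fill a coordinate-format (COO) list from separate index/value arrays."""
    coo = []
    for idx, val in zip(indices, values):
        coo.append((tuple(idx), val))
    return coo


def coo_to_csr(coo, num_rows):
    """Convert 2-d COO entries into CSR (row pointers, column indices, values)."""
    pointers = [0] * (num_rows + 1)
    col_indices = []
    vals = []
    for (row, col), val in sorted(coo):  # lexicographic (row, col) order
        pointers[row + 1] += 1           # count entries per row
        col_indices.append(col)
        vals.append(val)
    for row in range(num_rows):          # prefix sum turns counts into offsets
        pointers[row + 1] += pointers[row]
    return pointers, col_indices, vals


# Step 1: the sparse constant, already split into indices and values.
indices = [[0, 1], [1, 2], [2, 0], [2, 2]]
values = [1.0, 2.0, 3.0, 4.0]

# Step 2: loop over both arrays to build the COO tensor.
coo = coo_from_constant(indices, values)

# Step 3: convert COO to the destination format (CSR in this sketch).
pointers, col_indices, vals = coo_to_csr(coo, num_rows=3)
print(pointers, col_indices, vals)
# [0, 1, 2, 4] [1, 2, 0, 2] [1.0, 2.0, 3.0, 4.0]
```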