Commit Graph

21 Commits

Author SHA1 Message Date
gysit 7294be2b8e [mlir][linalg] Replace linalg.fill by OpDSL variant.
The revision removes the linalg.fill operation and renames the OpDSL generated linalg.fill_tensor operation to replace it. After the change, all named structured operations are defined via OpDSL and there are no handwritten operations left.

A side-effect of the change is that the pretty printed form changes from:
```
%1 = linalg.fill(%cst, %0) : f32, tensor<?x?xf32> -> tensor<?x?xf32>
```
changes to
```
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<?x?xf32>) -> tensor<?x?xf32>
```
Additionally, the builder signature now takes input and output value ranges as it is the case for all other OpDSL operations:
```
rewriter.create<linalg::FillOp>(loc, val, output)
```
changes to
```
rewriter.create<linalg::FillOp>(loc, ValueRange{val}, ValueRange{output})
```
All other changes remain minimal. In particular, the canonicalization patterns are the same and the `value()`, `output()`, and `result()` methods are now implemented by the FillOpInterface.

Depends On D120726

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D120728
2022-03-14 10:51:08 +00:00
River Riddle 5a7b919409 [mlir][NFC] Rename StandardToLLVM to FuncToLLVM
The current StandardToLLVM conversion patterns only really handle
the Func dialect. The pass itself adds patterns for Arithmetic/CFToLLVM, but
those should be/will be split out in a followup. This commit focuses solely
on being an NFC rename.

Aside from the directory change, the pattern and pass creation API have been renamed:
 * populateStdToLLVMFuncOpConversionPattern -> populateFuncToLLVMFuncOpConversionPattern
 * populateStdToLLVMConversionPatterns -> populateFuncToLLVMConversionPatterns
 * createLowerToLLVMPass -> createConvertFuncToLLVMPass

Differential Revision: https://reviews.llvm.org/D120778
2022-03-07 11:25:23 -08:00
River Riddle ace01605e0 [mlir] Split out a new ControlFlow dialect from Standard
This dialect is intended to model lower level/branch based control-flow constructs. The initial set
of operations are: AssertOp, BranchOp, CondBranchOp, SwitchOp; all split out from the current
standard dialect.

See https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061

Differential Revision: https://reviews.llvm.org/D118966
2022-02-06 14:51:16 -08:00
Uday Bondhugula 970f94d051 [MLIR] Fix integration tests broken by D118285
[MLIR] Fix integration tests broken by D118285.
2022-01-27 13:00:30 +05:30
Eugene Zhulenev 49ce40e9ab [mlir] AsyncParallelFor: align block size to be a multiple of inner loops iterations
Depends On D115263

By aligning block size to inner loop iterations parallel_compute_fn LLVM can later unroll and vectorize some of the inner loops with small number of trip counts. Up to 2x speedup in multiple benchmarks.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D115436
2021-12-09 06:50:50 -08:00
Eugene Zhulenev 68a7c001ad [mlir] Improve async parallel for tests + fix typos
Do load and store to verify that we process each element of the iteration space once.

Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D115152
2021-12-06 13:27:54 -08:00
bakhtiyar 7bd87a03fd Promote readability by factoring out creation of min/max operation. Remove unnecessary divisions.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110680
2021-11-24 16:17:23 -08:00
Alexander Belyaev 9b1d90e8ac [mlir] Move min/max ops from Std to Arith.
Differential Revision: https://reviews.llvm.org/D113881
2021-11-15 13:19:17 +01:00
Eugene Zhulenev 0d9b478932 [mlir] Reduce the number of iterations in async microbenchmarks
Differential Revision: https://reviews.llvm.org/D112609
2021-10-27 03:20:06 -07:00
Mogball a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
bakhtiyar 55dfab39a2 Rename target block size to min task size for clarity.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110604
2021-09-28 14:51:55 -07:00
Alex Zinenko 8b58ab8ccd [mlir] Factor type reconciliation out of Standard-to-LLVM conversion
Conversion to the LLVM dialect is being refactored to be more progressive and
is now performed as a series of independent passes converting different
dialects. These passes may produce `unrealized_conversion_cast` operations that
represent pending conversions between built-in and LLVM dialect types.
Historically, a more monolithic Standard-to-LLVM conversion pass did not need
these casts as all operations were converted in one shot. Previous refactorings
have led to the requirement of running the Standard-to-LLVM conversion pass to
clean up `unrealized_conversion_cast`s even though the IR had no standard
operations in it. The pass must have been also run the last among all to-LLVM
passes, in contradiction with the partial conversion logic. Additionally, the
way it was set up could produce invalid operations by removing casts between
LLVM and built-in types even when the consumer did not accept the uncasted
type, or could lead to cryptic conversion errors (recursive application of the
rewrite pattern on `unrealized_conversion_cast` as a means to indicate failure
to eliminate casts).

In fact, the need to eliminate A->B->A `unrealized_conversion_cast`s is not
specific to to-LLVM conversions and can be factored out into a separate type
reconciliation pass, which is achieved in this commit. While the cast operation
itself has a folder pattern, it is insufficient in most conversion passes as
the folder only applies to the second cast. Without complex legality setup in
the conversion target, the conversion infra will either consider the cast
operations valid and not fold them (a separate canonicalization would be
necessary to trigger the folding), or consider the first cast invalid upon
generation and stop with error. The pattern provided by the reconciliation pass
applies to the first cast operation instead. Furthermore, having a separate
pass makes it clear when `unrealized_conversion_cast`s could not have been
eliminated since it is the only reason why this pass can fail.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D109507
2021-09-09 16:51:24 +02:00
Eugene Zhulenev 6c1f655818 [mlir] Async: special handling for parallel loops with zero iterations
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106590
2021-07-23 01:22:59 -07:00
Alex Zinenko 75e5f0aac9 [mlir] factor memref-to-llvm lowering out of std-to-llvm
After the MemRef has been split out of the Standard dialect, the
conversion to the LLVM dialect remained as a huge monolithic pass.
This is undesirable for the same complexity management reasons as having
a huge Standard dialect itself, and is even more confusing given the
existence of a separate dialect. Extract the conversion of the MemRef
dialect operations to LLVM into a separate library and a separate
conversion pass.

Reviewed By: herhut, silvas

Differential Revision: https://reviews.llvm.org/D105625
2021-07-09 14:49:52 +02:00
Eugene Zhulenev f57b2420b2 [mlir:Async] Add an async reference counting pass based on the user defined policy
Depends On D104999

Automatic reference counting based on the liveness analysis can add a lot of reference counting overhead at runtime. If the IR is known to be constrained to few particular "shapes", it's much more efficient to provide a custom reference counting policy that will specify where it is required to update the async value reference count.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D105037
2021-06-29 12:53:09 -07:00
Eugene Zhulenev 9ccdaac8f9 [mlir:Async] Fix a bug in automatic refence counting around function calls
Depends On D104998

Function calls "transfer ownership" to the callee and it puts additional constraints on the reference counting optimization pass

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D104999
2021-06-29 09:35:43 -07:00
Eugene Zhulenev 86ad0af870 [mlir:Async] Implement recursive async work splitting for scf.parallel operation (async-parallel-for pass)
Depends On D104780

Recursive work splitting instead of sequential async tasks submission gives ~20%-30% speedup in microbenchmarks.

Algorithm outline:
1. Collapse scf.parallel dimensions into a single dimension
2. Compute the block size for the parallel operations from the 1d problem size
3. Launch parallel tasks
4. Each parallel task reconstructs its own bounds in the original multi-dimensional iteration space
5. Each parallel task computes the original parallel operation body using scf.for loop nest

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D104850
2021-06-25 10:34:39 -07:00
Tobias Gysi a21a6f51bc [mlir][linalg] Change the pretty printed FillOp operand order.
The patch changes the pretty printed FillOp operand order from output, value to value, output. The change is a follow up to https://reviews.llvm.org/D104121 that passes the fill value using a scalar input instead of the former capture semantics.

Differential Revision: https://reviews.llvm.org/D104356
2021-06-23 07:03:00 +00:00
Eugene Zhulenev a6628e596e [mlir] Async: add automatic reference counting at async.runtime operations level
Depends On D95311

Previous automatic-ref-counting pass worked with high level async operations (e.g. async.execute), however async values reference counting is a runtime implementation detail.

New pass mostly relies on the save liveness analysis to place drop_ref operations, and does better verification of CFG with different liveIn sets in block successors.

This is almost NFC change. No new reference counting ideas, just a cleanup of the previous version.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95390
2021-04-12 18:54:55 -07:00
Julian Gross e2310704d8 [MLIR] Create memref dialect and move dialect-specific ops from std.
Create the memref dialect and move dialect-specific ops
from std dialect to this dialect.

Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
AssumeAlignmentOp -> MemRef_AssumeAlignmentOp
DeallocOp -> MemRef_DeallocOp
DimOp -> MemRef_DimOp
MemRefCastOp -> MemRef_CastOp
MemRefReinterpretCastOp -> MemRef_ReinterpretCastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
LoadOp -> MemRef_LoadOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
SubViewOp -> MemRef_SubViewOp
TransposeOp -> MemRef_TransposeOp
TensorLoadOp -> MemRef_TensorLoadOp
TensorStoreOp -> MemRef_TensorStoreOp
TensorToMemRefOp -> MemRef_BufferCastOp
ViewOp -> MemRef_ViewOp

The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667

Differential Revision: https://reviews.llvm.org/D98041
2021-03-15 11:14:09 +01:00
Mehdi Amini 99b0032ce0 Move the MLIR integration tests as a subdirectory of test (NFC)
This does not change the behavior directly: the tests only run when
`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON` is configured. However running
`ninja check-mlir` will not run all the tests within a single
lit invocation. The previous behavior would wait for all the integration
tests to complete before starting to run the first regular test. The
test results were also reported separately. This change is unifying all
of this and allow concurrent execution of the integration tests with
regular non-regression and unit-tests.

Differential Revision: https://reviews.llvm.org/D97241
2021-02-23 05:55:47 +00:00