Commit Graph

7070 Commits

Author SHA1 Message Date
Chris Jones 344eee6f38 [MLIR] Allow `Idempotent` trait to be applied to binary ops.
Add `Idempotent` trait to `arith.{andi,ori}`.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114574
2021-11-26 18:22:49 +00:00
Michal Terepeta 7e65fc9a60 [mlir][Vector] Support 0-D vectors in `BroadcastOp`
This changes the op to produce `AnyVectorOfAnyRank` following mostly the code for 1-D vectors.

Depends On D114598

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114550
2021-11-26 17:17:18 +00:00
Michal Terepeta d0f927121e [mlir][Standard] Support 0-D vectors in `SplatOp`
This changes the op to produce `AnyVectorOfAnyRank` and implements this by just
inserting the element (skipping the shuffle that we do for the 1-D case).

Depends On D114549

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114598
2021-11-26 17:05:15 +00:00
Arjun P ad34ce94d5 [MLIR] Simplex: fix a bug when rolling back a Simplex with no solutions
Previously, when adding a constraint to a Simplex that is already marked
as having no solutions (marked empty), the Simplex would be marked empty again,
and a second UnmarkEmpty entry would be pushed to the undo log. When rolling
back, Simplex should be unmarked empty only after rolling back past the
creation of the first constraint that made it empty.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D114613
2021-11-26 22:33:48 +05:30
Arjun P f074bbb04a [MLIR] Simplex::pivot: also update the redundant rows when pivoting
Previously, the pivot function would only update the non-redundant rows when
pivoting. This is incorrect because in some cases, when rolling back past a
`detectRedundant` call, the basis being used could be different from that which
was used at the time of returning from the `detectRedundant` call. Therefore,
it is important to update the redundant rows as well during pivots. This could
also be triggered by pivots that occur when testing successive constraints for
being redundant in `detectRedundant` after some initial constraints are marked redundant.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D114614
2021-11-26 21:42:41 +05:30
Mats Petersson 30238c3676 [mlir][OpenMP] Add support for SIMD modifier
Add support for SIMD modifier in OpenMP worksharing loops.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D111051
2021-11-26 14:04:46 +00:00
Matthias Springer b62b21b980 [mlir][linalg][bufferize][NFC] InsertSliceOp no-copy detection as PostAnalysis
There is special logic for InsertSliceOp to check if a memcpy is needed. This change extracts that piece of code and makes it a PostAnalysisStep.

The purpose of this change is to untangle `bufferize` from BufferizationAliasInfo. (Not fully there yet.)

Differential Revision: https://reviews.llvm.org/D114513
2021-11-26 22:19:29 +09:00
Benjamin Kramer 8521850f20 Provide a definition for OperationPosition::kDown
This isn't necessary in C++17, but C++14 still requires it.
2021-11-26 14:11:59 +01:00
Benjamin Kramer 1b0312d280 [PDL] fix unused variable warning in Release builds 2021-11-26 14:11:58 +01:00
Stanislav Funiak d35f119094 Added line numbers to the debug output of PDL bytecode.
This is a small diff that splits out the debug output for PDL bytecode. When running bytecode with debug output on, it is useful to know the line numbers where the PDLIntepr operations are performed. Usually, these are in a single MLIR file, so it's sufficient to print out the line number rather than the entire location (which tends to be quite verbose). This debug output is gated by `LLVM_DEBUG` rather than `#ifndef NDEBUG` to make it easier to test.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D114061
2021-11-26 18:11:37 +05:30
Stanislav Funiak a76ee58f3c Multi-root PDL matching using upward traversals.
This is commit 4 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).

This PR integrates the various components (root ordering algorithm, nondeterministic execution of PDL bytecode) to implement multi-root PDL matching. The main idea is for the pattern to specify mulitple candidate roots. The PDL-to-PDLInterp lowering selects one of these roots and "hangs" the pattern from this root, traversing the edges downwards (from operation to its operands) when possible and upwards (from values to its uses) when needed. The root is selected by invoking the optimal matching multiple times, once for each candidate root, and the connectors are determined form the optimal matching. The costs in the directed graph are equal to the number of upward edges that need to be traversed when connecting the given two candidate roots. It can be shown that, for this choice of the cost function, "hanging" the pattern an inner node is no better than from the optimal root.

The following three main additions were implemented as a part of this PR:
1. OperationPos predicate has been extended to allow tracing the operation accepting a value (the opposite of operation defining a value).
2. Predicate checking if two values are not equal - this is useful to ensure that we do not traverse the edge back downwards after we traversed it upwards.
3. Function for for building the cost graph among the candidate roots.
4. Updated buildPredicateList, building the predicates optimal branching has been determined.

Testing: unit tests (an integration test to follow once the stack of commits has landed)

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D108550
2021-11-26 18:11:37 +05:30
Stanislav Funiak 6df7cc7f47 Implementation of the root ordering algorithm
This is commit 3 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).

We form a graph over the specified roots, provided in `pdl.rewrite`, where two roots are connected by a directed edge if the target root can be connected (via a chain of operations) in the underlying pattern to the source root. We place a restriction that the path connecting the two candidate roots must only contain the nodes in the subgraphs underneath these two roots. The cost of an edge is the smallest number of upward traversals (edges) required to go from the source to the target root, and the connector is a `Value` in the intersection of the two subtrees rooted at the source and target root that results in that smallest number of such upward traversals. Optimal root ordering is then formulated as the problem of finding a spanning arborescence (i.e., a directed spanning tree) of minimal weight.

In order to determine the spanning arborescence (directed spanning tree) of minimum weight, we use the [Edmonds' algorithm](https://en.wikipedia.org/wiki/Edmonds%27_algorithm). The worst-case computational complexity of this algorithm is O(_N_^3) for a single root, where _N_ is the number of specified roots. The `pdl`-to-`pdl_interp` lowering calls this algorithm as a subroutine _N_ times (once for each candidate root), so the overall complexity of root ordering is O(_N_^4). If needed, this complexity could be reduced to O(_N_^3) with a more efficient algorithm. However, note that the underlying implementation is very efficient, and _N_ in our instances tends to be very small (<10). Therefore, we believe that the proposed (asymptotically suboptimal) implementation will suffice for now.

Testing: a unit test of the algorithm

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D108549
2021-11-26 18:11:37 +05:30
Stanislav Funiak 3eb1647af0 Introduced iterative bytecode execution.
This is commit 2 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).

This commit implements the features needed for the execution of the new operations pdl_interp.get_accepting_ops, pdl_interp.choose_op:
1. The implementation of the generation and execution of the two ops.
2. The addition of Stack of bytecode positions within the ByteCodeExecutor. This is needed because in pdl_interp.choose_op, we iterate over the values returned by pdl_interp.get_accepting_ops until we reach finalize. When we reach finalize, we need to return back to the position marked in the stack.
3. The functionality to extend the lifetime of values that cross the nondeterministic choice. The existing bytecode generator allocates the values to memory positions by representing the liveness of values as a collection of disjoint intervals over the matcher positions. This is akin to register allocation, and substantially reduces the footprint of the bytecode executor. However, because with iterative operation pdl_interp.choose_op, execution "returns" back, so any values whose original liveness cross the nondeterminstic choice must have their lifetime executed until finalize.

Testing: pdl-bytecode.mlir test

Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D108547
2021-11-26 18:11:37 +05:30
Stanislav Funiak 842b6861c0 Defines new PDLInterp operations needed for multi-root matching in PDL.
This is commit 1 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).

These operations are:
* pdl.get_accepting_ops: Returns a list of operations accepting the given value or a range of values at the specified position. Thus if there are two operations `%op1 = "foo"(%val)` and `%op2 = "bar"(%val)` accepting a value at position 0, `%ops = pdl_interp.get_accepting_ops of %val : !pdl.value at 0` will return both of them. This allows us to traverse upwards from a value to operations accepting the value.
* pdl.choose_op: Iteratively chooses one operation from a range of operations. Therefore, writing `%op = pdl_interp.choose_op from %ops` in the example above will select either `%op1`or `%op2`.

Testing: Added the corresponding test cases to mlir/test/Dialect/PDLInterp/ops.mlir.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D108543
2021-11-26 17:59:22 +05:30
Matthias Springer 8e2214aa60 [mlir][linalg][bufferize][NFC] Pass BufferizationState to PostAnalysisStep
Pass BufferizationStep instead of BufferizationAliasInfo. Note: BufferizationState contains BufferizationAliasInfo.

Differential Revision: https://reviews.llvm.org/D114512
2021-11-26 11:46:14 +09:00
Matthias Springer d62b4b08af [mlir][linalg][bufferize] Compose dialect-specific bufferization state
Use composition instead of inheritance for storing dialect-specific bufferization state. This is in preparation of adding "tensor dialect"-specific bufferization state.

Differential Revision: https://reviews.llvm.org/D114508
2021-11-26 11:35:45 +09:00
Matthias Springer c94b80b438 [mlir][linalg][bufferize][NFC] Allow returning arbitrary memrefs
If `allowReturnMemref` is set to true, arbitrary memrefs may be returned from FuncOps. Also remove allocation hoisting code, which is only partly implemented at the moment.

The purpose of this commit is to untangle `bufferize` from `aliasInfo`. (Even with this change, they are not fully untangled yet.)

Differential Revision: https://reviews.llvm.org/D114507
2021-11-26 11:26:46 +09:00
Matthias Springer c637e3ea9e [mlir][linalg][bufferize][NFC] Extract func boundary bufferization
Bufferization of function boundaries is extracted from ComprehensiveBufferize into a separate file. This will become its own build target in the future.

Differential Revision: https://reviews.llvm.org/D114226
2021-11-26 10:25:36 +09:00
Matthias Springer f32c3d9528 [mlir][linalg][bufferize][NFC] Move Affine interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the Affine dialect.

Differential Revision: https://reviews.llvm.org/D114222
2021-11-26 09:27:47 +09:00
Michal Terepeta cc311a155a [mlir][Vector] Support 0-D vectors in `VectorPrintOpConversion`
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114549
2021-11-25 20:12:18 +00:00
Uday Bondhugula c89fc1eec3 [MLIR] NFC. Rename MLIR CAPI ExecutionEngine target for consistency
Rename MLIR CAPI ExecutionEngine target for consistency:
MLIRCEXECUTIONENGINE -> MLIRCAPIExecutionEngine in line with other
targets.

Differential Revision: https://reviews.llvm.org/D114596
2021-11-26 00:23:17 +05:30
Alexander Belyaev 57470abc41 [mlir] Move memref.[tensor_load|buffer_cast|clone] to "bufferization" dialect.
https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712

Differential Revision: https://reviews.llvm.org/D114552
2021-11-25 11:50:39 +01:00
Tobias Gysi 4b03906346 [mlir][linalg] Perform checks early in hoist padding.
Instead of checking for unexpected operations (any operation with a region except for scf::For and `padTensorOp` or operations with a memory effect) while cloning the packing loop nest perform the checks early. Update `dropNonIndexDependencies` to check for unexpected operations. Additionally, check all of these operations have index type operands only.

Depends On D114428

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114438
2021-11-25 10:37:12 +00:00
Tobias Gysi fd723eaa92 [mlir][linalg] Limit hoist padding to constant paddings.
Limit hoist padding to pad tensor ops that depend only on a constant value. Supporting arbitrary padding values that depend on computations part of the backward slice to hoist require complex analysis to ensure the computation can be hoisted.

Depends On D114420

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114428
2021-11-25 10:31:39 +00:00
Tobias Gysi ed7c1fb9b0 [mlir][linalg] Add backward slice filtering in hoist padding.
Adapt hoist padding to filter the backward slice before cloning the packing loop nest. The filtering removes all operations that are not used to index the hoisted pad tensor op and its extract slice op. The filtering is needed to support the more complex loop nests created after fusion. For example, fusing the producer of an output operand can added linalg ops and pad tensor ops to the backward slice. These operations have regions and currently prevent hoisting.

The following example demonstrates the effect of the newly introduced `dropNonIndexDependencies` method that filters the backward slice:
```
%source = linalg.fill(%cst, %arg0)
scf.for %i
  %unrelated = linalg.fill(%cst, %arg1)    // not used to index %source!
  scf.for %j (%arg2 = %unrelated)
    scf.for %k                             // not used to index %source!
      %ubi = affine.min #map(%i)
      %ubj = affine.min #map(%j)
      %slice = tensor.extract_slice %source [%i, %j] [%ubi, %ubj]
      %padded_slice = linalg.pad_tensor %slice
```
dropNonIndexDependencies(%padded_slice, %slice)
removes [scf.for %k, linalg.fill(%cst, %arg1)] from backwardSlice.

Depends On D114175

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114420
2021-11-25 10:30:10 +00:00
Matthias Springer 48107eaa07 [mlir][linalg][bufferize][NFC] Move SCF interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the SCF dialect.

Differential Revision: https://reviews.llvm.org/D114221
2021-11-25 19:00:17 +09:00
Alexander Belyaev 3c228573bc Revert "[mlir][SCF] Further simplify affine maps during `for-loop-canonicalization`"
This reverts commit ee1bf18672.

It breaks IREE lowering. Reverting the commit for now while we
investigate what's going on.
2021-11-25 10:54:52 +01:00
Butygin 8dae0b6b6c [mlir][spirv] arith::RemSIOp OpenCL lowering
Differential Revision: https://reviews.llvm.org/D114524
2021-11-25 12:44:06 +03:00
Matthias Springer a5c2f78287 [mlir][interfaces] Add insideMutuallyExclusiveRegions helper
Add a helper function to ControlFlowInterfaces for checking if two ops
are in mutually exclusive regions according to RegionBranchOpInterface.

Utilize this new helper in Linalg ComprehensiveBufferize. This makes the
analysis independent of the SCF dialect and generalizes it to other ops
that implement RegionBranchOpInterface.

Differential Revision: https://reviews.llvm.org/D114220
2021-11-25 17:44:39 +09:00
Matthias Springer ee1bf18672 [mlir][SCF] Further simplify affine maps during `for-loop-canonicalization`
* Implement `FlatAffineConstraints::getConstantBound(EQ)`.
* Inject a simpler constraint for loops that have at most 1 iteration.
* Taking into account constant EQ bounds of FlatAffineConstraint dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`.

Differential Revision: https://reviews.llvm.org/D114138
2021-11-25 12:44:19 +09:00
Matthias Springer 8a8c655fe7 [mlir][SCF] Fix off-by-one bug in affine analysis
This change is NFC. There were two issues when passing/reading upper bounds into/from FlatAffineConstraints that negate each other, so the bug was not apparent. However, it made debugging harder because some constraints in the FlatAffineConstraints were off by one when dumping all constraints.

Differential Revision: https://reviews.llvm.org/D114137
2021-11-25 12:37:02 +09:00
Uday Bondhugula 23d505571d [NFC] Improve debug message in getAsIntegerSet
Improve debug message in getAsIntegerSet. Add missing trailing new line
and position info.

Differential Revision: https://reviews.llvm.org/D114511
2021-11-25 08:50:21 +05:30
Matthias Springer d3bb4fec2a [mlir][linalg][bufferize][NFC] Move arith interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the arith dialect.

Differential Revision: https://reviews.llvm.org/D114219
2021-11-25 10:21:02 +09:00
bakhtiyar 7bd87a03fd Promote readability by factoring out creation of min/max operation. Remove unnecessary divisions.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110680
2021-11-24 16:17:23 -08:00
Lei Zhang cb395f66ac [mlir][spirv] Change the return type for {Min|Max}VersionBase
For synthesizing an op's implementation of the generated interface
from {Min|Max}Version, we need to define an `initializer` and
`mergeAction`. The `initializer` specifies the initial version,
and `mergeAction` specifies how version specifications from
different parts of the op should be merged to generate a final
version requirements.

Previously we use the specified version enum as the type for both
the initializer and thus the final return type. This means we need
to perform `static_cast` over some hopefully not used number (`~0u`)
as the initializer. This is quite opaque and sort of not guaranteed
to work. Also, there are ops that have an enum attribute where some
values declare version requirements (e.g., enumerant `B` requires
v1.1+) but some not (e.g., enumerant `A` requires nothing). Then a
concrete op instance with `A` will still declare it implements the
version interface (because interface implementation is static for
an op) but actually theirs no requirements for version.

So this commit changes to use an more explicit `llvm::Optional`
to wrap around the returned version enum.  This should make it
more clear.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D108312
2021-11-24 17:33:01 -05:00
Tobias Gysi 86f186efea [mlir][linalg] Add makeComposedPadHighOp.
Add the makeComposedPadHighOp method which creates a new PadTensorOp if necessary. If the source to pad is actually the result of a sequence of padded LinalgOps, the method checks if padding is needed or if we can use the padded result of the padded LinalgOp sequence directly.

Example:
```
%0 = tensor.extract_slice %arg0 [%iv0, %iv1] [%sz0, %sz1]
%1 = linalg.pad_tensor %0 low[0, 0] high[...] { linalg.yield %cst }
%2 = linalg.matmul ins(...) outs(%1)
%3 = tensor.extract_slice %2 [0, 0] [%sz0, %sz1]
```
when padding %3 return %2 instead of introducing
```
%4 = linalg.pad_tensor %3 low[0, 0] high[...] { linalg.yield %cst }
```

Depends On D114161

Reviewed By: nicolasvasilache, pifon2a

Differential Revision: https://reviews.llvm.org/D114175
2021-11-24 19:18:59 +00:00
Tobias Gysi a4fd8cb76f [mlir][linalg] Update failure conditions for padOperandToSmallestStaticBoundingBox.
Change the failure condition of padOperandToSmallestStaticBoundingBox to never fail if the operand is already statically sized.

In particular:
- if the padding value computation fails -> return failure if the operand shape is dynamic and success if it is static.
- if there is no extract slice op -> return failure if the operand shape is dynamic and success if it is static.

The latter change prevents padding from failure if the output operand passed by iteration argument is statically sized since in this case the extract / insert slice pairs are removed by canonicalization.

Depends On D114153

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114161
2021-11-24 19:10:50 +00:00
MaheshRavishankar 0a58982b08 [mlir][Linalg] Remove alloc/dealloc pair as a callback.
The alloc dealloc pair generation callback is really central to the
bufferization algorithm, it modifies the state in a way that affects
correctness. This is not really a configurable option. Moving it to
BufferizationState removes what was probably the reason it was added
as a callback.

Differential Revision: https://reviews.llvm.org/D114417
2021-11-24 10:36:34 -08:00
Nicolas Vasilache 1cfa9b4d70 [mlir][Vector] NFC - Apply some clangd suggested fixes. 2021-11-24 15:55:58 +00:00
Matthias Springer ca9d149e07 [mlir][linalg][bufferize][NFC] Move vector interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the vector dialect.

Differential Revision: https://reviews.llvm.org/D114218
2021-11-24 19:36:12 +09:00
Matthias Springer bb273a35a0 [mlir][linalg][bufferize][NFC] Move tensor interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the tensor dialect.

Differential Revision: https://reviews.llvm.org/D114217
2021-11-24 18:25:17 +09:00
Butygin 7f5d9bf13a [mlir][scf] Canonicalize scf.while with unused results
Differential Revision: https://reviews.llvm.org/D114291
2021-11-24 11:11:22 +03:00
Bixia Zheng 02710413a3 Accept symmetric sparse matrix in Matrix Market Exchange Format.
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114402
2021-11-23 19:53:17 -08:00
Uday Bondhugula 8bd08a9fd7 [MLIR] Remove duplicate `Pass` suffix from ViewOpGraph class name
Remove duplicate `Pass` suffix from view-op-graph pass class name. The
extra suffix would lead to methods like registerViewOpGraphPassPass
being generated.

Differential Revision: https://reviews.llvm.org/D114459
2021-11-24 08:00:16 +05:30
wren romano d7d7ffe254 [mlir][sparse] Adding wrappers for constantOverheadTypeEncoding
Minor code cleanup

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114392
2021-11-23 18:30:06 -08:00
Butygin 75a1bee05d [mlir][spirv] Add math to OpenCL conversion
Differential Revision: https://reviews.llvm.org/D113780
2021-11-24 02:31:21 +03:00
Rob Suderman 0f1e52afa9 [mlir][tosa] Materialize tosa.pad value and fold noop pads
Padding now can explicitly specify the padding value when non-zero is wanted.
This also includes bypassing pads when the pad does nothing.

Differential Revision: https://reviews.llvm.org/D113611
2021-11-23 12:23:42 -08:00
Rob Suderman 54eec7cafc [mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support
Transpose convolution decomposition is now performed in a separate pass. This
allows padding / constant propagation to be performed at the TOSA level. It
also adds support for striding when there is no dilation.

Differential Revision: https://reviews.llvm.org/D114409
2021-11-23 12:16:44 -08:00
MaheshRavishankar b57e2f071a [mlir][Linalg] Add pad vectorization patterns into LinalgStrategyVectorize passes.
Add an option to control whether these patterns are added to the
pattern list or not.

Differential Revision: https://reviews.llvm.org/D114290
2021-11-23 11:47:54 -08:00
Nicolas Vasilache 3ff4e5f2a4 [mlir][Vector] Thread 0-d vectors through InsertElementOp.
This revision makes concrete use of 0-d vectors to extend the semantics of
InsertElementOp.

Reviewed By: dcaballe, pifon2a

Differential Revision: https://reviews.llvm.org/D114388
2021-11-23 12:55:11 +00:00
Nicolas Vasilache e7026aba00 [mlir][Vector] Thread 0-d vectors through ExtractElementOp.
This revision starts making concrete use of 0-d vectors to extend the semantics of
ExtractElementOp.
In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in.

Differential Revision: https://reviews.llvm.org/D114387
2021-11-23 12:39:44 +00:00
Matthias Springer f24d9313cc [mlir][linalg][bufferize][NFC] Specify bufferize traversal in `bufferize`
The interface method `bufferize` controls how (and it what order) nested ops are traversed. This simplifies bufferization of scf::ForOps and scf::IfOps, which used to need special rules in scf::YieldOp.

Differential Revision: https://reviews.llvm.org/D114057
2021-11-23 21:33:19 +09:00
Alexander Belyaev c7cc70c8f8 Revert "Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td.""
This reverts and fixes commit de18b7dee6.
2021-11-23 10:49:26 +01:00
Nicolas Vasilache b2729fda60 [mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between and intrinsics and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Differential Revision: https://reviews.llvm.org/D114393
2021-11-23 07:31:22 +00:00
Sandeep Dasgupta e5a8c8c883 [mlir] Refactoring a few Parser APIs
Refactored two new parser APIs parseGenericOperationAfterOperands and
 parseCustomOperationName out of parseGenericOperation and parseCustomOperation.

Motivation: Sometimes an op can be printed in a special way if certain criteria
is met. While parsing, we need to handle all the versions.
`parseGenericOperationAfterOperands` is handy in situation where we already
parsed the operands and decide to fall back to default parsing.

`parseCustomOperationName` is useful when we need to know details (dialect,
operation name etc.) about a parsed token meant to be an mlir operation.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113719
2021-11-23 06:11:01 +00:00
Matthias Springer fb99686bfd [mlir][linalg][bufferize] Limited support for scf.execute_region
Add support for analysis only.

Differential Revision: https://reviews.llvm.org/D114055
2021-11-23 12:20:39 +09:00
Matthias Springer 26c0dd83ab [mlir][linalg][bufferize][NFC] Move helper function to op interface
This is in preparation of changing the op traversal during bufferization.

Differential Revision: https://reviews.llvm.org/D114040
2021-11-23 11:59:47 +09:00
Matthias Springer 8d0994ed21 [mlir][linalg][bufferize][NFC] Remove special casing of CallOps
Differential Revision: https://reviews.llvm.org/D113966
2021-11-23 11:14:10 +09:00
Matthias Springer b1083830d6 [mlir][linalg][bufferize][NFC] Clean up headers and function visibility
Differential Revision: https://reviews.llvm.org/D113964
2021-11-23 10:29:26 +09:00
Benjamin Kramer 966b720983 [mlir][memref] Fix expanded shape ops memref.cast folding with changed type
`memref.expand_shape` has verification logic to make sure
result dim must be static if all the collapsing src dims are static.

This can be relaxed once expand_shape supports more dynamism.

Differential Revision: https://reviews.llvm.org/D114391
2021-11-22 22:56:15 +01:00
Christian Ulmann f6718fc6d3 [mlir] FlatAffineConstraint parsing for unit tests
This patch adds functionality to parse FlatAffineConstraints from a
StringRef with the intention to be used for unit tests. This should
make the construction of FlatAffineConstraints easier for testing
purposes.

The patch contains an example usage of the functionality in a unit test that
uses FlatAffineConstraints.

Reviewed By: bondhugula, grosser

Differential Revision: https://reviews.llvm.org/D113275
2021-11-23 03:04:30 +05:30
Groverkss 98daa4e425 [MLIR] Fix incorrect removal of source loop in loop fusion
This patch fixes a bug in loop fusion pass where the source loop was removed
even when the fused loop did not cover all iterations of the source loop.

This was because the fast hueristic check for checking if source loop and
fused loop have same iterations did not take into account steps in loop.

Reviewed By: dcaballe, bondhugula

Differential Revision: https://reviews.llvm.org/D114164
2021-11-23 02:54:09 +05:30
Alexander Belyaev de18b7dee6 Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td."
This reverts commit 3028bca6a9.
For some reason using FallbackModel works with CMake and does not work
with bazel. Using `ExternalModel` works. I will check what's going on
and resubmit tomorrow.
2021-11-22 21:35:20 +01:00
Alexander Belyaev 3028bca6a9 [mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td.
Remove the interface from op defs in MemRefOps.td and make it an external model.

This is the first PR of many that will move bufferization-related ops, interfaces, passes to Dialect/Bufferize.
RFC: https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712
It is still debated if the comprehensive bufferization has to be moved there as well, so for now I am just moving the "gradual" bufferization.

Differential Revision: https://reviews.llvm.org/D114147
2021-11-22 21:00:59 +01:00
Mehdi Amini e0b7bee7cf Revert "[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)"
This reverts commit a9e236bed8.
This broke the Windows build:

mlir\include\mlir/Dialect/X86Vector/Transforms.h(28): error C2061: syntax error: identifier 'uint'
2021-11-22 19:23:18 +00:00
Lei Zhang 93284120f2 [mlir][vector] Fix TransferOpReduceRank for 0-D tensors
We cannot unconditionally generate memref.load ops for such cases;
need to check the source's type.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114376
2021-11-22 12:30:46 -05:00
Alex Zinenko 9c5982ef8e [mlir] support recursive types in type conversion infra
MLIR supports recursive types but they could not be handled by the conversion
infrastructure directly as it would result in infinite recursion in
`convertType` for elemental types. Support this case by keeping the "call
stack" of nested type conversions in the TypeConverter class and by passing it
as an optional argument to the individual conversion callback. The callback can
then check if a specific type is present on the stack more than once to detect
and handle the recursive case.

This approach is preferred to the alternative approach of having a separate
callback dedicated to handling only the recursive case as the latter was
observed to introduce ~3% time overhead on a 50MB IR file even if it did not
contain recursive types.

This approach is also preferred to keeping a local stack in type converters
that need to handle recursive types as that would compose poorly in case of
out-of-tree or cross-project extensions.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113579
2021-11-22 18:16:02 +01:00
Tobias Gysi 247a1a55eb [mlir][linalg] Use getAsOpFoldResult in padding (NFC).
After padding, we introduce a ExtractSliceOp to get the final unpadded result. This revision uses getAsOpFoldResult to compute the size of the unpadded result, which guarantees the result type has a partially static shape if some of the sizes of the unpadded result are statically known. At the moment, we rely on canonicalization to cleanup the types after padding.

Depends On D114085

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114153
2021-11-22 13:15:19 +00:00
Tobias Gysi 32c43241e7 [mlir][linalg] Always generate an extract/insert slice pair when tiling output tensors.
Adapt tiling to always generate an extract/insert slice pair for output tensors even if the tensor is not tiled. Having an explicit extract/insert slice pair simplifies followup transformations such as padding and bufferization. In particular, it makes read and written iteration argument slices explicit.

Depends On D114067

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114085
2021-11-22 13:12:43 +00:00
Tres Popp 106f307499 Rename MlirExecutionEngine lookup to lookupPacked
The purpose of the change is to make clear whether the user is
retrieving the original function or the wrapper function, in line with
the invoke commands. This new functionality is useful for users that
already have defined their own packed interface, so they do not want the
extra layer of indirection, or for users wanting to the look at the
resulting primary function rather than the wrapper function.

All locations, except the python bindings now have a `lookupPacked`
method that matches the original `lookup` functionality. `lookup`
still exists, but with new semantics.

- `lookup` returns the function with a given name. If `bool f(int,int)`
is compiled, `lookup` will return a reference to `bool(*f)(int,int)`.
- `lookupPacked` returns the packed wrapper of the function with the
given name. If `bool f(int,int)` is compiled, `lookupPacked` will return
`void(*mlir_f)(void**)`.

Differential Revision: https://reviews.llvm.org/D114352
2021-11-22 14:12:09 +01:00
Tobias Gysi f7751a3a42 [mlir][linalg] Remove tile and fuse test pass (NFC).
Remove the tile and fuse test pass that has been replaced by codegen strategy.

Depends On D114067

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114068
2021-11-22 12:33:31 +00:00
Nicolas Vasilache 050cc1cd6e [mlir] Add InitializeNativeTargetAsmParser to ExecutionEngine.
This is required to allow python to work with lowerings that use inline_asm.

Differential Revision: https://reviews.llvm.org/D114338
2021-11-22 11:28:14 +00:00
Tobias Gysi e3d386ea27 [mlir][linalg] Add a tile and fuse on tensors pattern.
Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests.

Depends On D114012

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114067
2021-11-22 11:13:21 +00:00
Nicolas Vasilache 789c88e80e [mlir] Fix unintentional mutation by VectorType/RankedTensorType::Builder dropDim
Differential Revision: https://reviews.llvm.org/D113933
2021-11-22 10:51:50 +00:00
Tobias Gysi 0ccc44cec0 [mlir][linalg] Fix tile and fuse for outermost reduction.
Tile and fuse failed if the outermost tile loop is a reduction dimension. Add the necessary check to handle outermost reductions and introduce a test case to verify the change.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114012
2021-11-22 10:44:15 +00:00
Nicolas Vasilache a9e236bed8 [mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between and intrinsics and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Reviewed By: ftynse, dcaballe

Differential Revision: https://reviews.llvm.org/D114335
2021-11-22 10:32:34 +00:00
Jacques Pienaar e5a4d0f149 [mlir] Fix unused function warning (NFC)
Delete function no longer needed as all derived classes override
printer.
2021-11-21 15:06:08 -08:00
Jacques Pienaar 6f9cceb775 [mlir] Move trait to InferTypeOpInterface
Step towards removing the hard coded behavior for this trait and to instead use common interface.

Differential Revision: https://reviews.llvm.org/D114208
2021-11-21 14:41:12 -08:00
Arjun P ad48ef1e31 [MLIR][NFC] Simplex::restoreRow: improve documentation 2021-11-21 19:23:55 +05:30
Arnab Dutta ec7b0d4d34 [MLIR] Simplify Semi-affine expressions by rule based matching and replacing "expr - q * (expr floordiv q)" with "expr mod q" expression.
Add rule based matching for detecting and transforming "expr - q * (expr floordiv q)"
to "expr mod q", where q is a symbolic exxpression, in simplifyAdd function.

Reviewed By: bondhugula, dcaballe

Differential Revision: https://reviews.llvm.org/D112985
2021-11-20 21:05:36 +05:30
Arnab Dutta 1f9ca5adba [MLIR] Avoid creation of buggy affine maps while replacing dimension and symbol
Initially before appending the newly composed dimension and symbols
to the dimension and symbol list whose size is to be passed in
AffineMap::get(), the call to the AffineMap::get() was made, resulting
in wrong dimCount and symbolCount being passed as argument. We move the
call to the AffineMap::get() after the diimension and symbol list are
updated.

Differential Revision: https://reviews.llvm.org/D114237
2021-11-20 12:01:29 +05:30
Krzysztof Drewniak a6f53afbcb [MLIR][GPU] Link in device libraries during HSA compilation if needed
To perform some operations, such as sin() or printf(), code compiled
for AMD GPUs must be linked to a series of device libraries. This
commit adds support for linking in these libraries.

However, since these device libraries are delivered as LLVM bitcode,
raising the possibility of version incompatibilities, this commit only
links in libraries when the functions from those libraries are called
by the code being compiled.

This code also sets the math flags to their most conservative values,
as MLIR doesn't have a `-ffast-math` equivalent.

Depends on D114114

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114117
2021-11-19 22:29:37 +00:00
rdzhabarov d729f4c38f [mlir] Bug fix. Stream must outlive the pass manager.
Bug fix. Stream must outlive the pass manager.

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D114277
2021-11-19 21:45:43 +00:00
Krzysztof Drewniak 20f79f8caa [MLIR][GPU] Make the path to ROCm a runtime option
Our current build assumes that the path to ROCm we find at build time
will be the path at which ROCm is located when the built code is
executed. This commit adds a --rocm-path option to SerializeToHsaco,
and removes the HIP dependency that the SerializeToHsaco previously had.

Depends on D114113

(though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107)

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114114
2021-11-19 20:51:54 +00:00
Krzysztof Drewniak bd22554af0 [MLIR][GPU] Run generic LLVM optimizations when serializing (on AMD)
- Adds hooks that allow SerializeTo* passes to arbitrarily transform
the produced LLVM Module before it is passed to the code generation
passes.

- Uses these hooks within the SerializeToHsaco pass in order to run
LLVM optimizations and to set the optimization level on the
TargetMachine.

- Adds an optLevel parameter to SerializeToHsaco

Future work may include moving much of what's been added to
SerializeToHsaco to SerializeToBlob, but that would require
confirmation from the NVVM backend maintainers that it would be
appropriate to do so.

Depends on D114107

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114113
2021-11-19 19:21:24 +00:00
Thomas Raoux 47555d73f6 [mlir][gpu] Extend shuffle op modes and add nvvm lowering
Add up, down and idx modes to gpu shuffle ops, also change the mode from
string to enum

Differential Revision: https://reviews.llvm.org/D114188
2021-11-19 11:14:31 -08:00
Thomas Raoux 7cde516513 [mlir][vector] NFC, move some vector patterns in a separate file
Move patterns related to dropping lead unit dim into their own file.

Differential Revision: https://reviews.llvm.org/D114265
2021-11-19 10:39:29 -08:00
Thomas Raoux 06dbb28569 [mlir][vector] Remove usage of shapecast to remove unit dim
Instead of using shape_cast op in the pattern removing leading unit
dimensions we use extract/broadcast ops. This is part of the effort to
restrict ShapeCastOp fuirther in the future and only allow them to
convert to or from 1D vector.

This also adds extra canonicalization to fill the gaps in simplifying
broadcast/extract ops.

Differential Revision: https://reviews.llvm.org/D114205
2021-11-19 10:25:21 -08:00
Krzysztof Drewniak f849640a0c [MLIR] Make the ROCM integration tests runnable
- Move the #define s to the GPU Transform library from GPU Ops so that
SerializeToHsaco is non-trivially compiled

- Add required includes to SerializeToHsaco

- Move MCSubtargetInfo creation to the correct point in the
compilation process

- Change mlir in ROCM tests to account for renamed/moved ops

Differential Revision: https://reviews.llvm.org/D114184
2021-11-19 17:09:53 +00:00
Valentin Clement 78d69182b7
[mlir] Expose region utils functions
As discussed in D109579, this patch exposes `runRegionDCE` and
`eraseUnreachableBlocks` so they can be used as separate utilities in
other passes.

Reviewed By: rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D114160
2021-11-19 09:24:39 +01:00
Mogball 7c5ecc8b7e [mlir][vector] Insert/extract element can accept index
`vector::InsertElementOp` and `vector::ExtractElementOp` have had their `position`
operand changed to accept `AnySignlessIntegerOrIndex` for better operability with
operations that use `index`, such as affine loops.

LLVM's `extractelement` and `insertelement` can also accept `i64`, so lowering
directly to these operations without explicitly inserting casts is allowed. SPIRV's
equivalent ops can also accept `i64`.

Reviewed By: nicolasvasilache, jpienaar

Differential Revision: https://reviews.llvm.org/D114139
2021-11-18 22:40:29 +00:00
Arjun P 3b7b4a8041 [MLIR][NFC] Simplex::markRowRedundant: assert that row is not already marked redundant 2021-11-19 03:43:25 +05:30
MaheshRavishankar d26beb0be2 [mlir][Linalg] Add method to check if LinalgTransformationFilter has been applied.
Differential Revision: https://reviews.llvm.org/D114170
2021-11-18 13:45:30 -08:00
MaheshRavishankar 526dfe3f4d [mlir][Linalg] Do not return failure when all tile sizes are zero.
Returning failure when tile sizes are all zero prevents the change in
the marker. This makes pattern rewriter run the pattern multiple times
only to exit when it hits a limit. Instead just clone the operation
(since tiling is essentially cloning in this case). Then the
transformation filter kicks in to avoid the pattern rewriter to be
invoked many times.

Differential Revision: https://reviews.llvm.org/D113949
2021-11-18 09:28:25 -08:00
Krzysztof Drewniak fb1a06aa13 [MLIR][GPU] Add target arguments to SerializeToHsaco
Compiling code for AMD GPUs requires knowledge of which chipset is
being targeted, especially if the code uses chipset-specific
intrinsics (which is the case in a downstream convolution generator).
This commit adds `target`, `chipset` and `features` arguments to the
SerializeToHsaco constructor to enable passing in this required
information.

It also amends the ROCm integration tests to pass in the target
chipset, which is set to the chipset of the first GPU on the system
executing the tests.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114107
2021-11-18 16:28:44 +00:00
Matthias Springer ebf8d74e92 [mlir][linalg][bufferize] Fix bufferize bug where non-tensor ops are not skipped
`BufferizableOpInterface::bufferize` will only be called on ops that
have tensor operands and/or results.

Differential Revision: https://reviews.llvm.org/D113962
2021-11-18 16:20:22 +09:00
Matthias Springer 26e90423f4 [mlir][linalg][bufferize][NFC] Decouple ComprehensiveBufferize from tensor dialect
Add a new BufferizableOpInterface method `isNotConflicting` that can be used to implement custom analysis rules.

Differential Revision: https://reviews.llvm.org/D113961
2021-11-18 16:11:24 +09:00
River Riddle 0c7890c844 [mlir] Convert NamedAttribute to be a class
NamedAttribute is currently represented as an std::pair, but this
creates an extremely clunky .first/.second API. This commit
converts it to a class, with better accessors (getName/getValue)
and also opens the door for more convenient API in the future.

Differential Revision: https://reviews.llvm.org/D113956
2021-11-18 05:39:29 +00:00
Aart Bik 1ce77b562d [mlir][sparse] refine lexicographic insertion to any tensor
First version was vectors only. With some clever "path" insertion,
we now support any d-dimensional tensor. Up next: reductions too

Reviewed By: bixia, wrengr

Differential Revision: https://reviews.llvm.org/D114024
2021-11-17 18:08:42 -08:00
Robert Suderman 6e41a06911 [mlir][tosa] Revert add-0 canonicalization for floating-point
Floating point optimization can produce incorrect numerical resutls for
-0.0 + 0.0 optimization as result needs to be -0.0.

Reviewed By: eric-k256

Differential Revision: https://reviews.llvm.org/D114127
2021-11-17 17:29:57 -08:00
Rob Suderman 044e7e013e [mlir][tosa] Fixed shape inference for tosa.transpose_conv2d
Transpose conv2d shape inference was incorrect, tests did not properly validate
that the shape inference was executing. Corrected shape inference, and extended
tests to actually execute.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D114026
2021-11-17 14:59:52 -08:00
River Riddle edc6c0ecb9 [mlir] Refactor AbstractOperation and OperationName
The current implementation is quite clunky; OperationName stores either an Identifier
or an AbstractOperation that corresponds to an operation. This has several problems:

* OperationNames created before and after an operation are registered are different
* Accessing the identifier name/dialect/etc. from an OperationName are overly branchy
  - they need to dyn_cast a PointerUnion to check the state

This commit refactors this such that we create a single information struct for every
operation name, even operations that aren't registered yet. When an OperationName is
created for an unregistered operation, we only populate the name field. When the
operation is registered, we populate the remaining fields. With this we now have two
new classes: OperationName and RegisteredOperationName. These both point to the
same underlying operation information struct, but only RegisteredOperationName can
assume that the operation is actually registered. This leads to a much cleaner API, and
we can also move some AbstractOperation functionality directly to OperationName.

Differential Revision: https://reviews.llvm.org/D114049
2021-11-17 22:29:57 +00:00
Michal Terepeta ddf2d62c7d [mlir][Vector] First step for 0D vector type
There seems to be a consensus that we should allow 0D vectors:
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097

This commit is only the first step: it changes the verifier and the parser to
allow vectors like `vector<f32>` (but does not allow explicit 0 dimensions,
i.e., `vector<0xf32>` is not allowed).

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114086
2021-11-17 14:58:24 +00:00
Mogball 47f76bb0f4 [mlir][lsp] Use ResultGroupDefinition struct
This struct was added and was intended to be used, but it was missed in the original patch.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D114041
2021-11-17 00:40:57 +00:00
River Riddle 195730a650 [mlir][NFC] Replace references to Identifier with StringAttr
This is part of the replacement of Identifier with StringAttr.

Differential Revision: https://reviews.llvm.org/D113953
2021-11-16 17:36:26 +00:00
William S. Moses 30d87d4a5d [MLIR][LLVM] Permit integer types in switch other than i32
LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D113955
2021-11-16 12:00:37 -05:00
Nicolas Vasilache b377807a76 [mlir][LLVM] Fix folding of LLVM::ExtractValueOp
Limit the backtracking along def-use chains when a prefix is encountered as it would generate incorrect foldings.

Differential Revision: https://reviews.llvm.org/D113975
2021-11-16 14:49:05 +00:00
Butygin 6c48f6aafe [mlir][spirv] add AtomicFAddEXTOp
Differential Revision: https://reviews.llvm.org/D113764
2021-11-16 14:24:22 +03:00
Butygin 526b71e44a [mlir] spirv: Add scf.while spirv conversion
* It works similar to scf.for coversion, but convert condition and yield ops as part of scf.whille pattern so it don't need to maintain external state

Differential Revision: https://reviews.llvm.org/D113007
2021-11-16 13:19:34 +03:00
Adrian Kuegel 921d91f3ac [mlir] Support multi-dimensional vectors in MathToLibm conversion.
Differential Revision: https://reviews.llvm.org/D113969
2021-11-16 11:13:52 +01:00
Arnab Dutta 1402299271 [MLIR] Simplify semi-affine expressions using flattening
For the semi affine expressions, whenever rhs of a floordiv, ceildiv, mod
or product expression is a symbolic expression, we introduce a local variable
representing the result, and store the floordiv/ceildiv, mod or product
affine expression in LocalExprs. In this way the expression is flattened,
and trivial addition and subtraction related simplifications are performed.
Also rule based matching for detecting and transforming "expr - q * (expr floordiv q)"
to "expr mod q", where q is a symbolic exxpression, in simplifyAdd function.

Differential Revision: https://reviews.llvm.org/D112808
2021-11-16 15:42:22 +05:30
Groverkss 11462a82c5 [MLIR] FlatAffineConstraints: Allow extraction of explicit representation of local variables
This patch extends the existing functionality of computing an explicit
representation for local variables, to also get the explicit representation,
instead of only the inequality pairs.

This is required for a future patch to remove redundant local ids based on
their explicit representation.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D113814
2021-11-16 14:51:06 +05:30
Mehdi Amini 1585b13024 Revert "[MLIR][LLVM] Permit integer types in switch other than i32"
This reverts commit 94992670fc.
Build is broken with:

tools/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.cpp.inc:23996:3: error: no matching function for call to 'printSwitchOpCases'
  printSwitchOpCases(_odsPrinter, *this, getValue().getType(), getCaseValuesAttr(), getCaseDestinations(), getCaseOperands(), getCaseOperands().getTypes());
  ^~~~~~~~~~~~~~~~~~
2021-11-16 05:59:12 +00:00
William S. Moses 94992670fc [MLIR][LLVM] Permit integer types in switch other than i32
LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D113955
2021-11-16 00:46:25 -05:00
Matthias Springer 17194ca96a [mlir][linalg][bufferize][NFC] Clean up tensor op bufferization
Differential Revision: https://reviews.llvm.org/D113730
2021-11-16 11:17:42 +09:00
Aart Bik f66e5769d4 [mlir][sparse] first version of "truly" dynamic sparse tensors as outputs of kernels
This revision contains all "sparsification" ops and rewriting necessary to support sparse output tensors when the kernel has no reduction (viz. insertions occur in lexicographic order and are "injective"). This will be later generalized to allow reductions too. Also, this first revision only supports sparse 1-d tensors (viz. vectors) as output in the runtime support library. This will be generalized to n-d tensors shortly. But this way, the revision is kept to a manageable size.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113705
2021-11-15 15:33:32 -08:00
natashaknk 381677dfbf [tosa][mlir] Refactor tosa.reshape lowering to linalg for dynamic cases.
Split tosa.reshape into three individual lowerings: collapse, expand and a
combination of both. Add simple dynamic shape support.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113936
2021-11-15 15:31:37 -08:00
not-jenni cdb0623ad8 [mlir][tosa] Add tosa.mul by one canonicalization
Multiply by one can be removed during canonicalization. This optimizes away unneeded operations.

Differential Revision: https://reviews.llvm.org/D113807
2021-11-15 14:52:16 -08:00
Nicolas Vasilache 0b17336f79 [mlir][Vector] Make vector.shape_cast based size-1 foldings opt-in and separate.
This is in prevision of dropping them altogether and using insert/extract based patterns.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D113928
2021-11-15 21:17:57 +00:00
Nicolas Vasilache b828506eca [mlir][Linalg] Add a DownscaleDepthwiseConv2DNhwcHwcOp decomposition pattern.
Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113907
2021-11-15 20:48:16 +00:00
Nicolas Vasilache 641fe70776 [mlir][Linalg] Fix and improve vectorization of depthwise convolutions.
When trying to connect the vectorization of depthwise convolutions to e2e execution
a number of problems surfaced.
Fix an off-by-one error on the size of the input vector (similary to what was previously done for regular conv).
Rewrite the lowering to vector.fma instead of vector.contract: the KW reduction dimension has already been unrolled and vector.contract requires a reduction dimension to be valid.

Differential Revision: https://reviews.llvm.org/D113884
2021-11-15 12:58:05 +00:00
Nicolas Vasilache ee80ffbf9a [mlir][Linalg] Add bounded recursion declaration to FMAOp -> LLVM conversion.
FMAOp -> LLVM conversion is done progressively by peeling off 1 dimension from FMAOp at each pattern iteration. Add the recursively bounded property declaration to the pattern so that the rewriter can apply it multiple times.

Without this, FMAOps with 3+D do not lower to LLVM.

Differential Revision: https://reviews.llvm.org/D113886
2021-11-15 12:41:52 +00:00
Alexander Belyaev 9b1d90e8ac [mlir] Move min/max ops from Std to Arith.
Differential Revision: https://reviews.llvm.org/D113881
2021-11-15 13:19:17 +01:00
Butygin 2a3878ea16 [mlir] DialectConversion: fix OperationLegalizer::isIllegal result when legality callback returns None
OperationLegalizer::isIllegal returns false if operation legality wasn't
registered by user and we expect same behaviour when dynamic legality
callback return None, but instead true was returned.

Differential Revision: https://reviews.llvm.org/D113267
2021-11-15 14:53:06 +03:00
Nicolas Vasilache f1c86b8354 [mlir][Linalg] Fix off-by-one error in conv vector size computation.
Differential Revision: https://reviews.llvm.org/D113877
2021-11-15 11:37:44 +00:00
Matthias Springer 8835a1924e [mlir][linalg][bufferize] Allow non-tensor mappings in BufferizationState
This change makes it possible to set up custom mappings in a PostAnalysisStep. Some users of Comprehensive Bufferize have custom tensor types and it is most convenient to just reuse the same bvm.

Also add some more assertions.

Differential Revision: https://reviews.llvm.org/D113726
2021-11-15 19:40:30 +09:00
Nicolas Vasilache c1a2985d7f [mlir] NFC - Add VectorType::Builder to more easily build vector types from existing ones
Differential Revision: https://reviews.llvm.org/D113875
2021-11-15 10:36:55 +00:00
Matthias Springer 542a8cfba7 [mlir][linalg][bufferize] Fix insertion point of result buffers
Differential Revision: https://reviews.llvm.org/D113723
2021-11-15 19:27:33 +09:00
Nicolas Vasilache f67171ac58 [mlir][Linalg] Make depthwise convolution naming scheme consistent.
Names should be consistent across all operations otherwise painful bugs will surface.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113762
2021-11-15 07:54:29 +00:00
Mehdi Amini d5730647ac Revert "[mlir] FlatAffineConstraint parsing for unit tests"
This reverts commit bec488b818.

This commit introduced a layering violation between MLIR libraries.
Reverting for now while discussing on the original review thread.
2021-11-15 07:22:38 +00:00
Mehdi Amini 67453c8941 Use std::make_unique instead of `new` to reinitalize a unique_ptr (NFC)
Fix a clang-tidy warning.
2021-11-14 22:28:54 +00:00
Christian Ulmann bec488b818 [mlir] FlatAffineConstraint parsing for unit tests
This patch adds functionality to parse FlatAffineConstraints from a
StringRef with the intention to be used for unit tests. This should
make the construction of FlatAffineConstraints easier for testing
purposes.

The patch contains an example usage of the functionality in a unit test that
uses FlatAffineConstraints.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D113275
2021-11-14 23:50:38 +05:30
Mehdi Amini e96214ddef Fix some clang-tidy reports in MLIR (NFC)
Mostly replace uses of `container.size()` with `container.empty()` in
conditionals when applicable.
2021-11-13 20:09:21 +00:00
Mogball 2696a9529e [mlir][ods] Cleanup of Class Codegen helper
Depends on D113331

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D113714
2021-11-12 21:22:01 +00:00
Nicolas Vasilache 99ff697bf7 [mlir][Vector] Add support for 1D depthwise conv vectorization
At this time the 2 flavors of conv are a little too different to allow significant code sharing and other will likely come up.
so we go the easy route first by duplicating and adapting.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113758
2021-11-12 13:14:09 +00:00
Nicolas Vasilache aa37318067 [mlir][Linalg] Rewrite DownscaleSizeOneWindowed2DConvolution to use rank-reducing insert/extract slices.
This rewriting enables better bufferization and canonicalizations.

Differential Revision: https://reviews.llvm.org/D113745
2021-11-12 11:57:12 +00:00
Stella Laurenzo c265170110 [mlir] Add MLIR-C dylib.
Per discussion on discord and various feature requests across bindings (Haskell and Rust bindings authors have asked me directly), we should be building a link-ready MLIR-C dylib which exports the C API and can be used without linking to anything else.

This patch:

* Adds a new MLIR-C aggregate shared library (libMLIR-C.so), which is similar in name and function to libLLVM-C.so.
* It is guarded by the new CMake option MLIR_BUILD_MLIR_C_DYLIB, which has a similar purpose/name to the LLVM_BUILD_LLVM_C_DYLIB option.
* On all platforms, this will work with both static, BUILD_SHARED_LIBS, and libMLIR builds, if supported:
  * In static builds: libMLIR-C.so will export the CAPI symbols and statically link all dependencies into itself.
  * In BUILD_SHARED_LIBS: libMLIR-C.so will export the CAPI symbols and have dynamic dependencies on implementation shared libraries.
  * In libMLIR.so mode: same as static. libMLIR.so was not finished for actual linking use within the project. An eventual relayering so that libMLIR-C.so depends on libMLIR.so is possible but requires first re-engineering the latter to use the aggregate facility.
* On Linux, exported symbols are filtered to only the CAPI. On others (MacOS, Windows), all symbols are exported. A CMake status is printed unless if global visibility is hidden indicating that this has not yet been implemented. The library should still work, but it will be larger and more likely to conflict until fixed. Someone should look at lifting the corresponding support from libLLVM-C.so and adapting. Or, for special uses, just build with `-DCMAKE_CXX_VISIBILITY_PRESET=hidden -DCMAKE_C_VISIBILITY_PRESET=hidden`.
* Includes fixes to execution engine symbol export macros to enable default visibility. Without this, the advice to use hidden visibility would have resulted in test failures and unusable execution engine support libraries.

Differential Revision: https://reviews.llvm.org/D113731
2021-11-11 22:58:13 -08:00
Matthias Springer d1c8df8743 [mlir][linalg][bufferize] Decouple ComprehensiveBufferize from Linalg
The remaining dialects will be decoupled from ComprehensiveBufferize in separate commits.

Differential Revision: https://reviews.llvm.org/D113459
2021-11-12 10:08:09 +09:00
Mogball b8186b313c [mlir][ods] Unique attribute, successor, region constraints
With `-Os` turned on, results in 2-5% binary size reduction
(depends on the original binary). Without it, the binary size
is essentially unchanged.

Depends on D113128

Differential Revision: https://reviews.llvm.org/D113331
2021-11-12 01:04:08 +00:00
Matthias Springer 1b2bda8d1a [mlir][linalg][bufferize] Add PostAnalysisStep
This helper struct allows users of ComprehensiveBufferize to inject "post analysis" steps that are implemented after the analysis but before the bufferization.

Differential Revision: https://reviews.llvm.org/D113458
2021-11-12 09:51:06 +09:00
Butygin 92fc60bc62 [mlir][spirv] Regenerate SPIRVBase.td from recent spec
* Some long names were added and script decided to change whitespaces in a lot of places
* `ImageOperand` was renamed to `ImageOperands` in spec
* Some *NV enums were renamed to *KHR (spec actually maintains both variants with same value, but script pulled only *KHR versions)

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D113667
2021-11-11 17:07:52 -05:00
Thomas Raoux e7969240dc [mlir][VectorToGPU] Support more cases in conversion to MMA ops
Support load with broadcast, elementwise divf op and remove the
hardcoded restriction on the vector size. Picking the right size should
be enfored by user and will fail conversion to llvm/spirv if it is not
supported.

Differential Revision: https://reviews.llvm.org/D113618
2021-11-11 13:10:38 -08:00
Nicolas Vasilache 800694a697 [mlir][Linalg] Make a LinalgStrategyDecomposePass available.
Differential Revision: https://reviews.llvm.org/D113684
2021-11-11 17:47:27 +00:00
Stephan Herhut b241226aec [mlir][linalg] Avoid illegal elementwise fusion into reductions
Fusing into a reduction is only valid if doing so does not erase information on a reduction dimensions size.

Differential Revision: https://reviews.llvm.org/D113500
2021-11-11 15:56:12 +01:00
Benjamin Kramer f04a1237ba [mlir][X86Vector] Fix unused variable warning 2021-11-11 13:18:19 +01:00
Nicolas Vasilache a085c4b589 [mlir][Vector] Silence recently introduced warnings 2021-11-11 12:08:48 +00:00
Matthias Springer 4397a1baef [mlir][linalg][bufferize] Remove remaining linalg dependencies
* Move "linalg.inplaceable" attr name literals to BufferizableOpInterface.
* Use `memref.copy` by default. Override to `linalg.copy` in ComprehensiveBufferizePass.

These are the last remaining code dependencies on Linalg in Comprehensive Bufferize. The next commit will make ComprehensiveBufferize independent of the Linalg dialect.

Differential Revision: https://reviews.llvm.org/D113457
2021-11-11 19:04:41 +09:00
Matthias Springer aeb1c8d0ca [mlir][linalg][bufferize] Group helpers in BufferizationState
This simplifies the signature of `bufferize`.

Differential Revision: https://reviews.llvm.org/D113388
2021-11-11 18:24:13 +09:00
Nicolas Vasilache 34ff857350 [mlir][X86Vector] Add specialized vector.transpose lowering patterns for AVX2
This revision adds an implementation of 2-D vector.transpose for 4x8 and 8x8 for
AVX2 and surfaces it to the Linalg level of control.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D113347
2021-11-11 07:33:31 +00:00
Mehdi Amini f97e72aaca Use base class AsmParser/AsmPrinter in Types and Attribute print/parse method (NFC)
This decouples the printing/parsing from the "context" in which the parsing occurs.
This will allow to invoke these methods directly using an OpAsmParser/OpAsmPrinter.

Differential Revision: https://reviews.llvm.org/D113637
2021-11-11 06:26:33 +00:00
Matthias Springer 2f5539e300 [mlir][linalg][bufferize][NFC] Move `getResultBuffer` to op interface
This is in preparation of decoupling Comprehensive Bufferize from the various dialects.

Differential Revision: https://reviews.llvm.org/D113387
2021-11-11 14:38:18 +09:00
Kazu Hirata 839d81862f [ComprehensiveBufferize] Fix a warning
This patch fixes:

  mlir/lib/Dialect/Linalg/ComprehensiveBufferize/ComprehensiveBufferize.cpp:301:20:
  error: unused function 'printValueInfo' [-Werror,-Wunused-function]
2021-11-10 21:21:32 -08:00
Matthias Springer a4547dc575 [mlir][linalg][bufferize] Move more helper functions/structs to interface
Move helper functions for traversing reverse use-def chains. These are useful for implementing custom optimizations (e.g., custom InitTensorOp eliminations).

Also move over the AllocationCallbacks struct. This is in preparation for decoupling ComprehensiveBufferize from various dialects.

Differential Revision: https://reviews.llvm.org/D113386
2021-11-11 14:16:20 +09:00
River Riddle 6de6131f02 [mlir] Optimize usage of llvm::mapped_iterator
mapped_iterator is a useful abstraction for applying a
map function over an existing iterator, but our current
usage ends up allocating storage/making indirect calls
even with the map function is a known function, which
is horribly inefficient. This commit refactors the usage
of mapped_iterator to avoid this, and allows for directly
referencing the map function when dereferencing.

Fixes PR52319

Differential Revision: https://reviews.llvm.org/D113511
2021-11-11 03:26:29 +00:00
Matthias Springer 56efafeabf [mlir][bufferize][linalg] Do not copy tensors that are overwritten
This is a generalization of "do not copy buffers for a LinalgOp output if the output is not used".

Differential Revision: https://reviews.llvm.org/D113385
2021-11-11 11:32:49 +09:00
Matthias Springer 3274145408 [mlir][linalg][bufferize] Do not copy results of non-writing ops
This is a generalization of "do not copy the result of an InitTensorOp". This commit is in preparation of decoupling `getResultBuffer` from the Linalg dialect.

Differential Revision: https://reviews.llvm.org/D113381
2021-11-11 11:25:51 +09:00
River Riddle 7961511ed8 [mlir] MicroOptimize a few hot StorageUniquer code paths
* Sprinkle `inline` on a few small and hot hashing/uniquing methods
* Use the faster DenseMapInfo hash functions instead of
   llvm::hash_value.

This provides a speed up of a few percent in workloads with lots of
attributes.
2021-11-11 02:02:24 +00:00
River Riddle 120591e126 [mlir] Replace usages of Identifier with StringAttr
Identifier and StringAttr essentially serve the same purpose, i.e. to hold a string value. Keeping these seemingly identical pieces of functionality separate has caused problems in certain situations:

* Identifier has nice accessors that StringAttr doesn't
* Identifier can't be used as an Attribute, meaning strings are often duplicated between Identifier/StringAttr (e.g. in PDL)

The only thing that Identifier has that StringAttr doesn't is support for caching a dialect that is referenced by the string (e.g. dialect.foo). This functionality is added to StringAttr, as this is useful for StringAttr in generally the same ways it was useful for Identifier.

Differential Revision: https://reviews.llvm.org/D113536
2021-11-11 02:02:24 +00:00
Matthias Springer 7f153e8ba1 [mlir][linalg][bufferize] Add `isAllocationHoistingBarrier` to op interface
This make `getResultBuffer` in ComprehensiveBufferize independent of the SCF, Affine and Linalg dialects. This commit is in preparating of decoupling op interface implementations from ComprehensiveBufferize.

Differential Revision: https://reviews.llvm.org/D113380
2021-11-11 11:00:47 +09:00
lipracer 8165eaa885 [mlir](arithmetic) Add ceildivui to the arithmetic dialect
The specific description is [[ https://llvm.discourse.group/t/adding-unsigned-integer-ceil-and-floor-in-std-dialect/4541 | Adding unsigned integer ceil in Std Dialect ]] .

When we lower ceilDivOp this will generate below code, sometimes we know m and n are unsigned intergal.Here are some redundant judgments about positive and negative.
So we need to add some unsigned operations to simplify the instructions.
```
ceilDiv(n, m)
  x = (m > 0) ? -1 : 1
  return (n*m>0) ? ((n+x) / m) + 1 : - (-n / m)
```
unsigned operations:
```
ceilDivU(n, m)
  return n ==0 ?  0 :  ((n - 1) / m) + 1
```

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D113363
2021-11-11 01:49:14 +00:00
Matthias Springer 161755770a [mlir][linalg][bufferize] Move BufferizationAliasInfo to op interface
BufferizationAliasInfo is used in BufferizationOpInterface::bufferize implementations, so it should be part of the same build target as BufferizableOpInterface.

This commit is in preparation of decoupling the ComprehensiveBufferize from the various dialects.

Differential Revision: https://reviews.llvm.org/D113378
2021-11-11 10:45:45 +09:00
Matthias Springer 2e0d821bd5 [mlir][linalg][bufferize] Store analysis results in BufferizationAliasInfo
* Store inplace bufferization decisions in `inplaceBufferized`.
* Remove `InPlaceSpec`. Use a bool instead.
* Use `BufferizableOpInterface::bufferizesToWritableMemory` and `bufferizesToWritableMemory` instead of `getInPlace(BlockArgument)`. The analysis does not care about inplacability of block arguments. It only cares whether the buffer can be written to or not.
* The `kInPlaceResultsAttrName` op attribute is for testing purposes only.

This commit further decouples BufferizationAliasInfo from other dialects such as SCF.

Differential Revision: https://reviews.llvm.org/D113375
2021-11-11 10:36:49 +09:00
Matthias Springer 996d4ffe30 [mlir][linalg][bufferize] Fix bug in InitTensor elimination
After replacing then init_tensor with a new value, the new value must be inserted into the corresponding union/equivalence sets.

Differential Revision: https://reviews.llvm.org/D113374
2021-11-11 10:28:17 +09:00
Matthias Springer 050591478e [mlir][linalg][bufferize][NFC] Move helper functions to op interface
Also enclose all bufferization code in a new namespace: `comprehensive_bufferize`

Differential Revision: https://reviews.llvm.org/D113373
2021-11-11 10:06:13 +09:00
Kazu Hirata a86ef2c827 [ComprehensiveBufferize] Fix a warning
This patch fixes:

  mlir/lib/Dialect/Linalg/ComprehensiveBufferize/ComprehensiveBufferize.cpp:2240:13:
  error: unused function 'checkAliasInfoConsistency'
  [-Werror,-Wunused-function]
2021-11-10 15:23:39 -08:00
Rob Suderman 860d3811a9 [mlir][tosa] Add lowering for tosa.pad with explicit value
New TOSA pad operation can support explicitly specifying the pad value. Added
lowering to linalg that uses the explicit value.

Differential Revision: https://reviews.llvm.org/D113515
2021-11-10 14:15:20 -08:00
Uday Bondhugula 51ae78a6d6 [MLIR][Affine][NFC] affine.store op verifier message fix and check
Fix typo in affine.store op verifier message and test case.

Differential Revision: https://reviews.llvm.org/D113360
2021-11-11 01:52:23 +05:30
Kevin Cheng bef966eb37 tosa-make-broadcatable pass now supports numpy style broadcasting only.
- fix bug that in [c,1] + [a, b, c, d] broadcast
- add test [3,3,4,1] + [4,5]

Signed-off-by: Kevin Cheng <kevin.cheng@arm.com>
Change-Id: Iaed2f04df8775f655c82c740271395274163d147

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113596
2021-11-10 11:48:35 -08:00
thomasraoux 5aa6038a40 [mlir] Make topologicalSort iterative and consider op regions
When doing topological sort we need to make sure an op is scheduled before any
of the ops within its regions.
Also change the algorithm to not be recursive in order to prevent potential
stack overflow.

Differential Revision: https://reviews.llvm.org/D113423
2021-11-10 10:05:01 -08:00
thomasraoux f309939d06 [mlir][nvvm] Remove special case ptr arithmetic lowering in gpu to nvvm
Use existing helper instead of handling only a subset of indices lowering
arithmetic. Also relax the restriction on the memref rank for the GPU mma ops
as we can now support any rank.

Differential Revision: https://reviews.llvm.org/D113383
2021-11-10 10:00:12 -08:00
Alex Zinenko e64c76672f [mlir] recursively convert builtin types to LLVM when possible
Given that LLVM dialect types may now optionally contain types from other
dialects, which itself is motivated by dialect interoperability and progressive
lowering, the conversion should no longer assume that the outermost LLVM
dialect type can be left as is. Instead, it should inspect the types it
contains and attempt to convert them to the LLVM dialect. Introduce this
capability for LLVM array, pointer and structure types. Only literal structures
are currently supported as handling identified structures requires the
converison infrastructure to have a mechanism for avoiding infite recursion in
case of recursive types.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D112550
2021-11-10 18:11:00 +01:00
Tobias Gysi 9aea27ac88 [mlir][linalg] Remove getSmallestBoundingIndex (NFC).
Remove the getSmallestBoundingIndex method that has no uses anymore.

Depends On D113548

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113549
2021-11-10 16:18:20 +00:00
Tobias Gysi 53da8600e1 [linalg][mlir] Replace getSmallestBoundingIndex in promotion (NFC).
Replace the getSmallestBoundingIndex method used in promotion by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.

Depends On D113546

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113548
2021-11-10 16:16:08 +00:00
Tobias Gysi 4e2c978f44 [mlir][linalg] Use getUpperBoundForIndex in hoisting (NFC).
Use the custom upper bound computation in hoisting by the new getUpperBoundForIndex method.

Depends On D113546

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113547
2021-11-10 15:55:55 +00:00
Tobias Gysi ea53a6938b [linalg][mlir] Replace getSmallestBoundingIndex in padding (NFC).
Replace the getSmallestBoundingIndex method used in padding by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.

Depends On D113398

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113546
2021-11-10 15:12:51 +00:00
Tobias Gysi b86b2309ce [mlir][linalg] Use AffineApplyOp to compute padding width (NFC).
Use AffineApplyOp instead of SubIOp to compute the padding width when creating a pad tensor operation.

Depends On D113382

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113404
2021-11-10 14:53:52 +00:00
Tobias Gysi ba2ac9c97c [mli][linalg] Add flag to control CodegenStrategy enable pass.
Add a flag to control if CodegenStrategy runs the EnablePass between the transformations.

Depends On D113382

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113409
2021-11-10 14:11:40 +00:00
Tobias Gysi 969243a007 [mlir][linalg] Hoist padding simplifications (NFC).
Remove unused members and store the indexing and packing loops in SmallVector.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113398
2021-11-10 14:01:49 +00:00
Tobias Gysi 0609eb1b32 [mlir][linalg] Remove padding from tiling options.
Remove the padding options from the tiling options since padding is now implemented by a separate pattern/pass introduced in https://reviews.llvm.org/D112412.

The revsion remove the tile-and-pad-tensors.mlir and replaces it with the pad.mlir that tests padding in isolation (without tiling). Similarly, hoist-padding.mlir is replaced by pad-and-hoist.mlir introduced in https://reviews.llvm.org/D112713.

Depends On D112838

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113382
2021-11-10 13:33:28 +00:00
Denys Shabalin aaea92e1cd [mlir] Reintroduce nano time to execution_engine
Prior change had a broken test that wasn't run by accident.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D113488
2021-11-10 13:14:18 +01:00
Matthias Springer 8f6119128f [mlir][linalg][bufferize] Add mustBufferizeInPlace to op interface
This is useful for ops such as scf::IfOp, which always bufferize in-place.

This commit is in preparation of decoupling BufferizationAliasInfo from the SCF dialect.

Differential Revision: https://reviews.llvm.org/D113339
2021-11-10 19:33:11 +09:00
Matthias Springer c3eb967e2a [mlir][linalg][bufferize] Bufferize ops via PreOrder traversal
The existing PostOrder traversal with special rules for certain ops was complicated and had a bug. Switch to PreOrder traversal.

Differential Revision: https://reviews.llvm.org/D113338
2021-11-10 18:51:39 +09:00
Matthias Springer be98b20b9d [mlir][linalg][bufferize] Remove special scf::IfOp rules
Remove some of the special rules for scf::IfOp (not all of them) and encode them in the op interface. This is in preparation of decoupling analysis, bufferization and dialects.

Differential Revision: https://reviews.llvm.org/D112901
2021-11-10 18:39:53 +09:00
Matthias Springer 007e55133e [mlir][linalg][bufferize] Add helper method isMemoryWrite to op interface
This is in preparating for decoupling the SCF dialect from the analysis.

Differential Revision: https://reviews.llvm.org/D113337
2021-11-10 18:35:04 +09:00
Matthias Springer 99ad2079d4 [mlir][linalg][bufferize] Fix buffer equivalence around scf.if ops
Also extend the comments for aliasInfo and equivalenceInfo.

Differential Revision: https://reviews.llvm.org/D113340
2021-11-10 18:33:08 +09:00
Matthias Springer f74f09128b [mlir][linalg][bufferize] Relax tensor.insert_slice conflict rules
A tensor.insert_slice write does not conflict with a subsequent read of the source if the source is originating from a matching tensor.extract_slice.

Differential Revision: https://reviews.llvm.org/D113446
2021-11-10 18:23:29 +09:00
Jacques Pienaar d1a688ce0e [mlir-c] Add Region iterators matching Block & Operation ones
Enables using the same iterator interface to these even though underlying storage is different.

Differential Revision: https://reviews.llvm.org/D113512
2021-11-09 17:52:56 -08:00
Mehdi Amini f30a8a6f67 Change the contract with the type/attribute parsing to let the dispatch handle the mnemonic
This breaking change requires to remove printing the mnemonic in the print()
method on Type/Attribute classes.
This makes it consistent with the parsing code which alread handles the
mnemonic outside of the parsing method.

This likely won't break the build for anyone, but tests will start
failing for dialects downstream. The fix is trivial and look like
going from:

void emitc::OpaqueType::print(DialectAsmPrinter &printer) const {
  printer << "opaque<\"";

to:

void emitc::OpaqueAttr::print(DialectAsmPrinter &printer) const {
  printer << "<\"";

Reviewed By: rriddle, aartbik

Differential Revision: https://reviews.llvm.org/D113334
2021-11-10 00:47:15 +00:00
Mehdi Amini c27d85a9c9 Emit the boilerplate for Type printer/parser dialect dispatching from ODS
Add a new `useDefaultTypePrinterParser` boolean settings on the dialect
(default to false for now) that emits the boilerplate to dispatch type
parsing/printing to the auto-generated method.
We will likely turn this on by default in the future.

Differential Revision: https://reviews.llvm.org/D113332
2021-11-10 00:41:36 +00:00
Mehdi Amini fd6b404183 Emit the boilerplate for Attribute printer/parser dialect dispatching from ODS
Add a new `useDefaultAttributePrinterParser` boolean settings on the dialect
(default to false for now) that emits the boilerplate to dispatch attribute
parsing/printing to the auto-generated method.
We will likely turn this on by default in the future.

Differential Revision: https://reviews.llvm.org/D113329
2021-11-10 00:38:19 +00:00
Mehdi Amini c296609b68 Revert "[mlir] Add nano precision clock to execution engine"
This reverts commit 48d1f099d4.

Broke the MLIR buildbots
2021-11-09 18:12:42 +00:00
Denys Shabalin 48d1f099d4 [mlir] Add nano precision clock to execution engine
Reviewed By: ftynse, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113476
2021-11-09 14:32:36 +01:00
Groverkss 6706a4720f [MLIR][NFC] FlatAffineConstraints: Refactor division representation computation
This patch factors out division representation computation from upper-lower bound
inequalities to a separate function. This is done to improve readability and reuse.

This patch is marked NFC since the only change is factoring out existing code
to a separate function.

Reviewed By: grosser

Differential Revision: https://reviews.llvm.org/D113463
2021-11-09 14:08:15 +05:30
River Riddle 937e40a8cf [mlir] Remove the non-templated DenseElementsAttr::getSplatValue
This predates the templated variant, and has been simply forwarding
to getSplatValue<Attribute> for some time. Removing this makes the
API a bit more uniform, and also helps prevent users from thinking
it is "cheap".
2021-11-09 01:40:40 +00:00
River Riddle ae40d62541 [mlir] Refactor ElementsAttr's value access API
There are several aspects of the API that either aren't easy to use, or are
deceptively easy to do the wrong thing. The main change of this commit
is to remove all of the `getValue<T>`/`getFlatValue<T>` from ElementsAttr
and instead provide operator[] methods on the ranges returned by
`getValues<T>`. This provides a much more convenient API for the value
ranges. It also removes the easy-to-be-inefficient nature of
getValue/getFlatValue, which under the hood would construct a new range for
the type `T`. Constructing a range is not necessarily cheap in all cases, and
could lead to very poor performance if used within a loop; i.e. if you were to
naively write something like:

```
DenseElementsAttr attr = ...;
for (int i = 0; i < size; ++i) {
  // We are internally rebuilding the APFloat value range on each iteration!!
  APFloat it = attr.getFlatValue<APFloat>(i);
}
```

Differential Revision: https://reviews.llvm.org/D113229
2021-11-09 00:15:08 +00:00
Chia-hung Duan 2d99c815d7 [mlir-tblgen] Support `either` in Tablegen DRR.
Add a new directive `either` to specify the operands can be matched in either order

Reviewed By: jpienaar, Mogball

Differential Revision: https://reviews.llvm.org/D110666
2021-11-08 23:16:03 +00:00
Suraj Sudhir 82568021dd [mlir][tosa] Spec v0.23 updates
Add pad_const field to tosa.pad.
Add builders to enable optional construction of pad_const in pad op.
Update documentation of tosa.clamp to match spec wording.

Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com>

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113322
2021-11-08 10:13:54 -08:00
Jeff Niu 9a2fdc369d [MLIR] Attribute and type formats in ODS
Declarative attribute and type formats with assembly formats. Define an
`assemblyFormat` field in attribute and type defs with a `mnemonic` to
generate a parser and printer.

```tablegen
def MyAttr : AttrDef<MyDialect, "MyAttr"> {
  let parameters = (ins "int64_t":$count, "AffineMap":$map);
  let mnemonic = "my_attr";
  let assemblyFormat = "`<` $count `,` $map `>`";
}
```

Use `struct` to define a comma-separated list of key-value pairs:

```tablegen
def MyType : TypeDef<MyDialect, "MyType"> {
  let parameters = (ins "int":$one, "int":$two, "int":$three);
  let mnemonic = "my_attr";
  let assemblyFormat = "`<` $three `:` struct($one, $two) `>`";
}
```

Use `struct(*)` to capture all parameters.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D111594
2021-11-08 17:38:28 +00:00
Lei Zhang 56ada0f80d [mlir][vector] Use dyn_cast instead of cast in patterns
This avoids crashes when the pattern is applied to ops with
tensor semantics.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D113415
2021-11-08 12:34:14 -05:00
Tobias Gysi 1726c956ae [mlir][linalg] Improve hoist padding buffer size computation.
Adapt the Fourier Motzkin elimination to take into account affine computations happening outside of the cloned loop nest.

Depends On D112713

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112838
2021-11-08 12:02:57 +00:00