Commit Graph

2129 Commits

Uday Bondhugula 51ae78a6d6 [MLIR][Affine][NFC] affine.store op verifier message fix and check
Fix typo in affine.store op verifier message and test case.

Differential Revision: https://reviews.llvm.org/D113360
2021-11-11 01:52:23 +05:30
Kevin Cheng bef966eb37 tosa-make-broadcastable pass now supports numpy-style broadcasting only.
- fix a bug in the [c,1] + [a, b, c, d] broadcast
- add test [3,3,4,1] + [4,5]

Signed-off-by: Kevin Cheng <kevin.cheng@arm.com>
Change-Id: Iaed2f04df8775f655c82c740271395274163d147

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113596
2021-11-10 11:48:35 -08:00
Tobias Gysi b326eb64fd [mlir][linalg] Use CodegenStrategy to test interchange (NFC).
Use CodegenStrategy instead of a separate test pass to test iterator interchange.

Depends On D113409

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113550
2021-11-10 15:44:44 +00:00
Tobias Gysi b676a67092 [mlir][linalg] Use CodegenStrategy to test hoisting (NFC).
Use CodegenStrategy instead of a separate test pass to test hoisting.

Depends On D113410

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113411
2021-11-10 15:06:31 +00:00
Tobias Gysi 0c7c532643 [mlir][linalg] Use CodegenStrategy to test padding (NFC).
Use CodegenStrategy instead of a separate test pass to test padding.

Depends On D113409

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113410
2021-11-10 15:00:06 +00:00
Tobias Gysi b86b2309ce [mlir][linalg] Use AffineApplyOp to compute padding width (NFC).
Use AffineApplyOp instead of SubIOp to compute the padding width when creating a pad tensor operation.

Depends On D113382

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113404
2021-11-10 14:53:52 +00:00
Tobias Gysi 0609eb1b32 [mlir][linalg] Remove padding from tiling options.
Remove the padding options from the tiling options since padding is now implemented by a separate pattern/pass introduced in https://reviews.llvm.org/D112412.

The revision removes tile-and-pad-tensors.mlir and replaces it with pad.mlir, which tests padding in isolation (without tiling). Similarly, hoist-padding.mlir is replaced by pad-and-hoist.mlir, introduced in https://reviews.llvm.org/D112713.

Depends On D112838

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113382
2021-11-10 13:33:28 +00:00
Matthias Springer c3eb967e2a [mlir][linalg][bufferize] Bufferize ops via PreOrder traversal
The existing PostOrder traversal with special rules for certain ops was complicated and had a bug. Switch to PreOrder traversal.

Differential Revision: https://reviews.llvm.org/D113338
2021-11-10 18:51:39 +09:00
Matthias Springer 99ad2079d4 [mlir][linalg][bufferize] Fix buffer equivalence around scf.if ops
Also extend the comments for aliasInfo and equivalenceInfo.

Differential Revision: https://reviews.llvm.org/D113340
2021-11-10 18:33:08 +09:00
Matthias Springer f74f09128b [mlir][linalg][bufferize] Relax tensor.insert_slice conflict rules
A tensor.insert_slice write does not conflict with a subsequent read of the source if the source is originating from a matching tensor.extract_slice.

Differential Revision: https://reviews.llvm.org/D113446
2021-11-10 18:23:29 +09:00
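A minimal MLIR sketch of the relaxed rule (hypothetical values; "test.use" stands in for an arbitrary later reader):

```mlir
// %v is extracted from the same [0, 0][4, 4][1, 1] region of %A that it is
// inserted back into, so the insert_slice write does not conflict with a
// later read of %v and both ops may bufferize in place.
%v = tensor.extract_slice %A[0, 0] [4, 4] [1, 1]
    : tensor<8x8xf32> to tensor<4x4xf32>
%B = tensor.insert_slice %v into %A[0, 0] [4, 4] [1, 1]
    : tensor<4x4xf32> into tensor<8x8xf32>
"test.use"(%v) : (tensor<4x4xf32>) -> ()
```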
Suraj Sudhir 82568021dd [mlir][tosa] Spec v0.23 updates
Add pad_const field to tosa.pad.
Add builders to enable optional construction of pad_const in pad op.
Update documentation of tosa.clamp to match spec wording.

Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com>

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113322
2021-11-08 10:13:54 -08:00
Tobias Gysi 1726c956ae [mlir][linalg] Improve hoist padding buffer size computation.
Adapt the Fourier-Motzkin elimination to take into account affine computations happening outside of the cloned loop nest.

Depends On D112713

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112838
2021-11-08 12:02:57 +00:00
Tobias Gysi 9fbcad3298 [mlir][linalg] Improve the padding packing loop computation.
The revision updates the packing loop search in hoist padding. Instead of considering all loops in the backward slice, we now compute a separate backward slice containing the index computations only. This modification ensures we do not add packing loops that are not used to index the packed buffer due to spurious dependencies. One instance where such spurious dependencies can appear is the extract slice operation introduced between the tile loops of a double tiling.

Depends On D112412

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112713
2021-11-08 10:20:33 +00:00
Shraiysh Vaishay 19a7e4729d [MLIR][OpenMP] Added omp.sections and omp.section
Added omp.sections and omp.section operations according to
Section 2.8.1 of the OpenMP 5.0 standard.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D110844
2021-11-06 19:27:35 +05:30
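A hedged sketch of the new ops in use (clause syntax may differ from the committed form): each omp.section region carries one work unit, and the enclosing omp.sections distributes them across the team.

```mlir
omp.sections {
  omp.section {
    // ... first structured block ...
    omp.terminator
  }
  omp.section {
    // ... second structured block ...
    omp.terminator
  }
  omp.terminator
}
```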
Aart Bik 2f0ee17017 [mlir][sparse] test for SIMD reduction chaining in consecutive vector loops
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113197
2021-11-05 10:14:17 -07:00
Aart Bik 7373cabcda [mlir][sparse] implement full reduction "scalarization" across loop nests
The earlier reduction "scalarization" was only applied to a chain of
*innermost* and *for* loops. This revision generalizes this to any
nesting of for- and while-loops. This implies that reductions can be
implemented with far fewer load and store operations. The chaining
is implemented with a forest of yield statements (but not as bad as
when we would also include the while-induction).

Fixes https://bugs.llvm.org/show_bug.cgi?id=52311

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113078
2021-11-04 17:38:47 -07:00
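The generated code follows the usual scalarized-reduction shape, sketched below with assumed names (%zero, %buf, %out): the running value travels through iter_args/yield instead of a load and store per iteration, and nested loops forward it through the chain of yields.

```mlir
// Scalarized sum reduction: %acc stays in an SSA value across iterations and
// is stored back only once, after the loop.
%sum = scf.for %i = %c0 to %n step %c1 iter_args(%acc = %zero) -> (f64) {
  %v = memref.load %buf[%i] : memref<?xf64>
  %next = arith.addf %acc, %v : f64
  scf.yield %next : f64
}
memref.store %sum, %out[] : memref<f64>
```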
not-jenni 07a029c057 Canonicalize add to a no-op if one of the inputs is zero
Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113207
2021-11-04 16:52:47 -07:00
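A sketch of the kind of IR this folds, assuming the pattern targets tosa.add (the reviewer and the author's neighboring commits are TOSA-related):

```mlir
// One input is a zero constant, so the add is a no-op.
%zero = "tosa.const"() {value = dense<0.0> : tensor<4xf32>} : () -> tensor<4xf32>
%0 = "tosa.add"(%arg0, %zero) : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>
// After canonicalization, uses of %0 are replaced by %arg0.
```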
Aart Bik 4aa9b39824 [mlir][sparse] reject sparsity annotation in "scalar" tensors
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113152
2021-11-04 09:49:05 -07:00
Tobias Gysi 29c31cb79b [mlir][linalg] Add support for transitive fusion.
Extend fusion on tensors to fuse producers greedily.

Reviewed By: nicolasvasilache, hanchung

Differential Revision: https://reviews.llvm.org/D110262
2021-11-04 16:25:06 +00:00
Butygin 1cb13fddb9 [mlir] spirv: Add some atomic ops
Differential Revision: https://reviews.llvm.org/D112812
2021-11-03 14:47:12 +03:00
Nicolas Vasilache 9c4971740b [mlir][Linalg] Refactor vectorization of conv1d more aggressively.
This better decouples transfer read/write from the vector-only rewrite of conv.
This form is close to ready to plop into a new vector.conv op and the vector.transfer operations to be generalized as part of generic vectorization once the properties of the ConvolutionOpInterface are inferred from the indexing maps.

This also results in a nice perf boost in the dw == 1 cases.

Differential revision: https://reviews.llvm.org/D112822
2021-11-03 08:18:01 +00:00
Nicolas Vasilache 7b09f157e1 [mlir][Linalg] Refactor conv vectorization to decouple memory from vector ops.
This refactoring prepares conv1d vectorization for a future integration into
the generic codegen path.
Once transfer_read / transfer_write vectorization also supports sliding windows,
the special pattern for conv can disappear.
This will also likely need a vector.conv operation.

Differential Revision: https://reviews.llvm.org/D112797
2021-11-03 08:03:40 +00:00
Nicolas Vasilache 885072820c [mlir][Vector] Add a pattern to lower 2-D vector.transpose to shape_cast+shuffle.
The 2-D case can be rewritten to generate significantly fewer instructions and a single vector.shuffle, which seems to provide a nice performance boost.
Add this arrow to our quiver by exposing it with a new vector transform option.

Differential Revision: https://reviews.llvm.org/D113062
2021-11-02 22:12:46 +00:00
Lei Zhang 7b615a87dc [mlir][linalg] Rewrite `linalg.conv_2d_nhwc_hwcf` into 1-D
We'd like to take a progressive approach towards convolution op
CodeGen, by 1) tiling it to fit compute hierarchy first, and then
2) tiling along window dimensions with size 1 to reduce the problem
to be matmul-like. After that, we can 3) downscale high-D convolution
ops to low-D by removing the size-1 window dimensions. The final
step would be 4) vectorizing the low-D convolution op directly.

We have patterns for 1), 2), and 4). This commit adds a pattern for
3) for `linalg.conv_2d_nhwc_hwcf` ops as a starter. Supporting other
high-D convolution ops should be similar and mechanical.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112928
2021-11-02 09:56:26 -04:00
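A hedged before/after sketch with made-up shapes: once the H window dimension has size 1, the pattern rewrites the 2-D op to its 1-D counterpart.

```mlir
// Before: the H input and window dimensions are both 1 after tiling.
%0 = linalg.conv_2d_nhwc_hwcf
       {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
       ins(%in, %flt : tensor<1x1x8x3xf32>, tensor<1x3x3x16xf32>)
       outs(%out : tensor<1x1x6x16xf32>) -> tensor<1x1x6x16xf32>

// After: the size-1 dimensions are dropped.
%1 = linalg.conv_1d_nwc_wcf
       {dilations = dense<1> : tensor<1xi64>, strides = dense<1> : tensor<1xi64>}
       ins(%in1, %flt1 : tensor<1x8x3xf32>, tensor<3x3x16xf32>)
       outs(%out1 : tensor<1x6x16xf32>) -> tensor<1x6x16xf32>
```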
thomasraoux 8a992b20db [mlir][gpu] Add basic support to do elementwise ops on mma matrix type
In order to support fusion with mma matrix type we need to be able to
execute elementwise operations on them. This adds an op to be able to
support some basic elementwise operations. This is not a full
solution as it only supports a limited set of operations. Ideally we would
want to be able to fuse with more kinds of operations.

Differential Revision: https://reviews.llvm.org/D112857
2021-11-01 11:51:19 -07:00
thomasraoux 77eafb8430 [mlir][nvvm] Generalize wmma ops to handle more types and shapes
wmma intrinsics have a large number of combinations, ideally we want to be able
to target all the different variants. To avoid a combinatorial explosion in the
number of MLIR ops, we use attributes to represent the different variations of
load/store/mma ops. We also generate tablegen helpers to know which
combinations are available. Using this we can avoid having to hardcode a path
for specific shapes and can support more types.
This patch also adds boilerplate for tf32 op support.

Differential Revision: https://reviews.llvm.org/D112689
2021-11-01 10:27:26 -07:00
Ahmed Taei 813fa79c15 Don't drop in_bounds when vector-transfer-collapse-inner-most-dims
When the operand is a subview, we don't infer in_bounds, and some default cases (e.g., the case in the tests) will crash with `operand is NULL` when converting to LLVM

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D112772
2021-10-29 09:07:57 -07:00
Tobias Gysi 6638112b42 [mlir][linalg] Add padding pass to strategy passes.
Add a strategy pass that pads and hoists after tiling and fusion.

Depends On D112412

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112480
2021-10-29 15:30:42 +00:00
Tobias Gysi d0ec4a8ed9 [mlir][linalg] Add pad and hoist test pass.
Adding a padding and hoisting pattern, a test pass, and tests. The patch prepares the split of tiling/fusion and padding.

Depends On D112255

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112412
2021-10-29 15:08:16 +00:00
wren romano 5389cdc8f6 [mlir][sparse] Adding dynamic-size support for sparse=>dense conversion
Depends On D110790

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D112674
2021-10-28 16:56:18 -07:00
wren romano 28882b6575 [mlir][sparse] Implementing sparse=>dense conversion.
Depends On D110882, D110883, D110884

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D110790
2021-10-28 15:27:35 -07:00
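Illustrative usage, assuming a CSR-style encoding; the convert lowers to runtime-library calls that expand the sparse storage into a dense buffer.

```mlir
#CSR = #sparse_tensor.encoding<{ dimLevelType = [ "dense", "compressed" ] }>

// Sparse to dense materialization.
%dense = sparse_tensor.convert %sparse
    : tensor<8x8xf32, #CSR> to tensor<8x8xf32>
```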
Eugene Zhulenev 627fa0b9a8 [mlir] MathApproximations: unroll virtual vectors into hardware vectors for ISA specific operation
Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D112736
2021-10-28 12:52:04 -07:00
Uday Bondhugula 57b9b29649 [MLIR][LLVM] Add llvm.mlir.global_ctors/dtors and translation support
Add llvm.mlir.global_ctors and global_dtors ops and their translation
support to LLVM global_ctors/global_dtors global variables.

Differential Revision: https://reviews.llvm.org/D112524
2021-10-28 18:09:34 +05:30
Shraiysh Vaishay 30bd11fab4 [MLIR][OpenMP] Fixed the missing inclusive clause in omp.wsloop and fixed the order clause
This patch adds the inclusive clause (which was missed in the previous
reorganization - https://reviews.llvm.org/D110903) to the omp.wsloop operation.
Added a test for validating it.

Also fixes the order clause, which was not accepting any values. It now accepts
"concurrent" as a value, as specified in the standard.

Reviewed By: kiranchandramohan, peixin, clementval

Differential Revision: https://reviews.llvm.org/D112198
2021-10-28 14:18:05 +05:30
Matthias Springer 5b98e4ed16 [mlir][linalg][bufferize] Add analysis fuzzer option
Analyze ops in a pseudo-random order to see if any assertions are triggered. Randomizing the order of analysis likely worsens the quality of the bufferization result (more out-of-place bufferizations). However, assertions should never fail, as that would indicate a problem with our implementation.

Differential Revision: https://reviews.llvm.org/D112581
2021-10-27 17:37:56 +09:00
Shraiysh Vaishay 9fb52cb3f1 [MLIR][OpenMP] Added omp.atomic.read and omp.atomic.write
This patch supports the atomic construct (read and write) following
Section 2.17.7 of the OpenMP 5.0 standard. Also added tests and a
verifier for the same.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D111992
2021-10-27 14:05:44 +05:30
River Riddle 015192c634 [mlir:DialectConversion] Restructure how argument/target materializations get invoked
The current implementation invokes materializations
whenever an input operand does not have a mapping for the
desired type, i.e. it requires materialization at the earliest possible
point. This conflicts with the goal of dialect conversion (and also the
current documentation) which states that a materialization is only
required if the materialization is supposed to persist after the
conversion process has finished.

This revision refactors this such that whenever a target
materialization "might" be necessary, we insert an
unrealized_conversion_cast to act as a temporary materialization.
This allows for deferring the invocation of the user
materialization hooks until the end of the conversion process,
where we actually have a better sense if it's actually
necessary. This has several benefits:

* In some cases a target materialization hook is no longer
   necessary
When performing a full conversion, there are some situations
where a temporary materialization is necessary. Moving forward,
these users won't need to provide any target materializations,
as the temporary materializations do not require the user to
provide materialization hooks.

* getRemappedValue can now handle values that haven't been
   converted yet
Before this commit, it wasn't well supported to get the remapped
value of a value that hadn't been converted yet (making it
difficult/impossible to convert multiple operations in many
situations). This commit updates getRemappedValue to properly
handle this case by inserting temporary materializations when
necessary.

Another code-health related benefit is that with this change we
can move a majority of the complexity related to materializations
to the end of the conversion process, instead of handling it ad hoc
while the conversion is happening.

Differential Revision: https://reviews.llvm.org/D111620
2021-10-27 02:09:04 +00:00
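The temporary materializations are plain builtin.unrealized_conversion_cast ops. A sketch of what the driver may insert between partially-converted ops ("converted.op" and "unconverted.op" are hypothetical):

```mlir
// A value produced in the new type universe is cast back to the old type so
// a not-yet-converted consumer can keep using it; the cast is resolved or
// folded away at the end of the conversion process.
%new = "converted.op"() : () -> i64
%old = builtin.unrealized_conversion_cast %new : i64 to index
"unconverted.op"(%old) : (index) -> ()
```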
Aart Bik 1e6ef0cfb0 [mlir][sparse] refine trait of sparse_tensor.convert
Rationale:
The currently used trait demanded that all types be the same,
which is not true (since the sparse part may change and the dim sizes
may be relaxed). This revision uses the correct trait and makes the
rank match test explicit in the verify method.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D112576
2021-10-26 14:36:49 -07:00
Alexander Belyaev 96cee29762 [mlir] Allow polynomial approximations for N-d vectors.
Polynomial approximation can be extended to support N-d vectors.
N-dimensional vectors are useful when vectorizing operations on N-dimensional
tiles. Before lowering to LLVM these vectors are usually unrolled or flattened
to 1-dimensional vectors.

Differential Revision: https://reviews.llvm.org/D112566
2021-10-26 20:50:00 +02:00
Amy Zhuang b9ae741d3e [mlir] Fix getVectorReductionOp
1. The combining kinds min/max of the vector reduction op have been changed to
   minf/maxf, minsi/maxsi, and minui/maxui. Modify getVectorReductionOp
   accordingly.
2. Add min/max to supported reductions.

Reviewed By: dcaballe, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D112246
2021-10-26 08:42:34 -07:00
Uday Bondhugula 41a8b46007 [MLIR] Fix AffineExpr getLargestKnownDivisor for ceildiv and floordiv
Fix AffineExpr `getLargestKnownDivisor` for ceil/floor div cases.
In these cases, nothing can be inferred about the divisor of the
result.

Add test case for `mod` as well.

Differential Revision: https://reviews.llvm.org/D112523
2021-10-26 16:21:29 +05:30
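A worked illustration (hypothetical values) of why nothing can be inferred:

```mlir
// Even if d0 is known to be a multiple of 2, d0 = 4 gives 2 while d0 = 6
// gives 3, so the largest known divisor of the result must be reported as 1.
#map = affine_map<(d0) -> (d0 floordiv 2)>
```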
Robert Suderman 58901a5a29 [mlir][tosa] Correct tosa.avg_pool2d for specification error
The specification stated that the output type for quantized average pool should be
an i32. Only the accumulator should be an i32; the result type should match the input
type.

Caused in https://reviews.llvm.org/D111590

Reviewed By: sjarus, GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D112484
2021-10-25 14:41:16 -07:00
MaheshRavishankar 2f572818b0 [mlir][Linalg] Allow comprehensive bufferization to use callbacks for alloc/dealloc.
Using callbacks for allocation/deallocation allows users to override
the default.
Also add an option to comprehensive bufferization pass to use `alloca`
instead of `alloc`s. Note that this option is just for testing. The
option to use `alloca` does not work well with the option to allow for
returning memrefs.
2021-10-25 12:43:10 -07:00
Boian Petkantchin f1b922188e [MLIR][Math] Add erf to math dialect
Add math.erf lowering to libm call.
Add math.erf polynomial approximation.

Reviewed By: silvas, ezhulenev

Differential Revision: https://reviews.llvm.org/D112200
2021-10-25 18:30:17 +00:00
Aart Bik 1b15160ef3 [mlir][sparse] lower trivial tensor.cast on identical sparse tensors
Even though tensor.cast is not part of the sparse tensor dialect,
it may be used to cast static dimension sizes to dynamic dimension
sizes for sparse tensors without changing the actual sparse tensor
itself. Those cases should be lowered properly when replacing sparse
tensor types with their opaque pointers. Likewise, no-op sparse
conversions are handled by this revision in a similar manner.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D112173
2021-10-25 10:30:19 -07:00
MaheshRavishankar 5fb46a9fa3 Revert "[mlir][Linalg] Allow comprehensive bufferization to use callbacks for alloc/dealloc."
This reverts commit c86f218fe4.

Revert because it causes a build failure.
2021-10-25 08:57:53 -07:00
MaheshRavishankar c86f218fe4 [mlir][Linalg] Allow comprehensive bufferization to use callbacks for alloc/dealloc.
Using callbacks for allocation/deallocation allows users to override
the default.
Also add an option to comprehensive bufferization pass to use `alloca`
instead of `alloc`s. Note that this option is just for testing. The
option to use `alloca` does not work well with the option to allow for
returning memrefs.

Differential Revision: https://reviews.llvm.org/D112166
2021-10-25 08:50:25 -07:00
Jacques Pienaar 42e9af9e8f [mlir] Rename to avoid overlap in accessor prefixing
Split out renaming from D112383 into standalone change.
2021-10-24 18:17:09 -07:00
Emilio Cota 35553d452b [mlir] Add polynomial approximation for vectorized math::Rsqrt
This patch adds a polynomial approximation that matches the
approximation in Eigen.

Note that the approximation only applies to vectorized inputs;
the scalar rsqrt is left unmodified.

The approximation is protected with a flag since it emits an AVX2
intrinsic (generated via the X86Vector dialect). This is the only reasonably
clean way that I could find to generate the exact approximation that
I wanted (i.e. an identical one to Eigen's).

I considered two alternatives:

1. Introduce a Rsqrt intrinsic in LLVM, which doesn't exist yet.
   I believe this is because there is no definition of Rsqrt that
   all backends could agree on, since hardware instructions that
   implement it have widely varying degrees of precision.
   This is something that the standard could mandate, but Rsqrt is
   not part of IEEE754, so I don't think this option is feasible.

2. Emit fdiv(1.0, sqrt) with fast math flags to allow reciprocal
   transformations. Although portable, this doesn't allow us
   to generate exactly the code we want; it is the LLVM backend,
   and not MLIR, who controls what code is generated based on the
   target CPU.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D112192
2021-10-23 04:56:12 -07:00
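A minimal sketch of the shapes involved: only the vectorized form (behind the AVX2-gated flag) is expanded into the Eigen-matching polynomial; the scalar op is left as-is.

```mlir
// Rewritten by the approximation pass when the flag is enabled.
%v = math.rsqrt %arg0 : vector<8xf32>
// Scalar variant: left unmodified.
%s = math.rsqrt %arg1 : f32
```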
Mats Petersson 3f00e10bdd [mlir][OpenMP]Support for modifiers in workshare loops
Pass the modifiers from the Flang parser to FIR/MLIR workshare
loop operation.

Not yet supporting the SIMD modifier, which is a bit more work
than just adding it to the list of modifiers, so will go in a
separate patch.

This adds a new field to the WsLoopOp.

Also add test for dynamic WSLoop, checking that dynamic schedule calls
the init and next functions as expected.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D111053
2021-10-22 14:19:33 +01:00
Matthias Springer 3bbc869e2e [mlir][linalg][bufferize] Support scf::IfOp
This commit adds support for scf::IfOp to comprehensive bufferization. Support is currently limited to cases where both branches yield tensors that bufferize to the same buffer.

To keep the analysis simple, scf::IfOp are treated as memory writes for analysis purposes, even if no op inside any branch is writing. (scf::ForOps are handled in the same way.)

Differential Revision: https://reviews.llvm.org/D111929
2021-10-22 10:12:55 +09:00
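A sketch of the supported shape, with assumed names: both branches yield tensors that the analysis must prove bufferize to the same buffer.

```mlir
%r = scf.if %cond -> (tensor<?xf32>) {
  // Must yield a tensor that bufferizes to the same buffer as the else
  // branch's yield for the op to bufferize in place.
  scf.yield %t0 : tensor<?xf32>
} else {
  scf.yield %t1 : tensor<?xf32>
}
```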
Mogball 516884f58b [MLIR] Fix FloorDivSIOpConverter that was failing for index type after the arithmetic op refactor
ConstantOp should be used instead of ConstantIntOp to be able to support index type.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D112191
2021-10-21 21:42:30 +00:00
thomasraoux 93d0ade17c [mlir][linalg] Remove special case for contraction vectorization
Handle contraction op like all the other generic op reductions. This
simplifies the code. We now rely on contractionOp canonicalization to
keep the same code quality.

Differential Revision: https://reviews.llvm.org/D112171
2021-10-21 14:10:54 -07:00
thomasraoux 1d8cc45b0e [mlir][vector] Add patterns to convert multidimreduce to vector.contract
Add several patterns that will simplify contraction vectorization in the
future. With those canonicalizations we will be able to remove the special
case for contraction during vectorization and rely on those transformations to
avoid materializing broadcast ops.

Differential Revision: https://reviews.llvm.org/D112121
2021-10-21 14:03:32 -07:00
Ahmed Taei 21f9e4a1ed Avoid infinity arithmetics when computing exp approximations
Otherwise this can result in a poison value on some platforms; see https://bugs.llvm.org/show_bug.cgi?id=51204

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D112115
2021-10-21 10:09:18 -07:00
Nicolas Vasilache 203accf0bd [mlir][Linalg] Improve conv vectorization for the stride==1 case.
In the stride == 1 case, conv1d reads contiguous data along the input dimension. This can be used advantageously to bulk memory transfers and compute while avoiding unrolling. Experimentally, this can yield speedups of up to 50%.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D112139
2021-10-21 15:18:28 +00:00
Matthias Springer 7a7e93f122 [mlir][linalg][bufferize] Avoid creating copies that are never read
Differential Revision: https://reviews.llvm.org/D111956
2021-10-21 21:47:00 +09:00
Matthias Springer c5501a7a5c [mlir][linalg][bufferize] Eliminate InitTensorOps of InsertSliceOp sources
An InitTensorOp is replaced with an ExtractSliceOp on the InsertSliceOp's destination. This optimization is applied after analysis and only to InsertSliceOps that were decided to bufferize inplace. Another analysis on the new ExtractSliceOp is needed after the rewrite.

Differential Revision: https://reviews.llvm.org/D111955
2021-10-21 21:33:45 +09:00
Peixin-Qiao b37e5187f2 [MLIR][OpenMP] Add support for ordered construct
This patch supports the ordered construct in OpenMP dialect following
Section 2.19.9 of the OpenMP 5.1 standard. Also lowering to LLVM IR
using the OpenMP IRBuilder. Lowering to LLVM IR for the ordered simd directive
is not supported yet since LLVM optimization passes do not support it
for now.

Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh

Differential Revision: https://reviews.llvm.org/D110015
2021-10-21 16:30:46 +08:00
Matthias Springer 9c55e718f5 [mlir][linalg][bufferize] Bufferize using PostOrder traversal
This is required for bufferization of scf::IfOp, which is added in a subsequent commit.

Some ops (scf::ForOp, TiledLoopOp) require PreOrder traversal to make sure that bbArgs are mapped before bufferizing the loop body.

Differential Revision: https://reviews.llvm.org/D111924
2021-10-21 17:21:52 +09:00
Mehdi Amini cb11ddb96c Revert "[MLIR][OpenMP] Add support for ordered construct"
This reverts commit dc2be87ecf.

Seems like this broke all the CI bots.
2021-10-21 04:53:45 +00:00
Peixin-Qiao dc2be87ecf [MLIR][OpenMP] Add support for ordered construct
This patch supports the ordered construct in OpenMP dialect following
Section 2.19.9 of the OpenMP 5.1 standard. Also lowering to LLVM IR
using the OpenMP IRBuilder. Lowering to LLVM IR for the ordered simd directive
is not supported yet since LLVM optimization passes do not support it
for now.

Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh

Differential Revision: https://reviews.llvm.org/D110015
2021-10-21 09:16:04 +08:00
Aart Bik bd5494d127 [mlir][sparse] make index type explicit in public API of support library
The current implementation used explicit index->int64_t casts for some, but
not all instances of passing values of type "index" in and from the sparse
support library. This revision makes the situation more consistent by
using new "index_t" type at all such places  (which allows for less trivial
casting in the generated MLIR code).  Note that the current revision still
assumes that "index" is 64-bit wide. If we want to support targets with
alternative "index" bit widths, we need to build the support library different.
But the current revision is a step forward by making this requirement explicit
and more visible.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D112122
2021-10-20 12:46:31 -07:00
Ahmed S. Taei a3dd4e7770 Drop transfer_read inner most unit dimensions
Add a pattern to take a rank-reducing subview and drop the inner-most
contiguous unit dim.
This is useful when lowering vector to backends with 1d vector types.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D111561
2021-10-20 19:27:04 +00:00
Shraiysh Vaishay c4c7e06bd7 [MLIR][OpenMP] Shifted hint from CriticalOp to CriticalDeclareOp
According to the OpenMP 5.0 standard, names and hints of the critical operation are
closely related. The following are the restrictions on them:
 - Unless the effect is as if `hint(omp_sync_hint_none)` was specified, the
   critical construct must specify a name.
 - If the hint clause is specified, each of the critical constructs with the
   same name must have a hint clause for which the hint-expression evaluates to
   the same value.

These restrictions will be enforced by design if the hint expression is a part
of the `omp.critical.declare` operation.
 - Any operation with no "name" will be considered to have
   `hint(omp_sync_hint_none)`.
 - All the operations with the same "name" will have the same hint value.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D112134
2021-10-20 21:36:09 +05:30
Nicolas Vasilache 6bb7d2474f [mlir][Linalg] Add a first vectorization pattern for conv1d in NWCxWCF format.
This revision uses the newly refactored StructuredGenerator to create a simple vectorization for conv1d_nwc_wcf.

Note that the pattern is not specific to the op and is technically not even specific to the ConvolutionOpInterface (modulo minor details related to dilations and strides).

The overall design follows the same ideas as the lowering of vector::ContractionOp -> vector::OuterProduct: it seeks to be minimally complex, composable and extensible while avoiding inference analysis. Instead, we metaprogram the maps/indexings we expect and we match against them.

This is just a first stab and still needs to be evaluated for performance.
Other tradeoffs are possible that should be explored.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D111894
2021-10-20 13:54:18 +00:00
Kojo Acquah 9c62bb55f4 Implementation of `ReshapeNoopOptimization` canonicalizer.
This canonicalizer replaces reshapes of constant tensors with new constants that already contain the updated shape (skipping the reshape operation).

Differential Revision: https://reviews.llvm.org/D112038
2021-10-19 16:07:34 -07:00
bakhtiyar f97f946839 Canonicalize max/min operations on integers.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D112051
2021-10-19 05:25:59 -07:00
Shraiysh Vaishay d576f45014 [MLIR][OpenMP] Added parseClauses
Code reorganized in OpenMPDialect.cpp to have all functions corresponding to an operation together.

Added parseClauses function to avoid code duplication while parsing clauses in OpenMP operations. Also added printers and verifiers for clauses, which are being used for multiple operations.

Reviewed By: kiranchandramohan, peixin

Differential Revision: https://reviews.llvm.org/D110903
2021-10-19 17:31:36 +05:30
Vladislav Vinogradov e41ebbecf9 [mlir][RFC] Refactor layout representation in MemRefType
The change is based on the proposal from the following discussion:
https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968

* Introduce `MemRefLayoutAttr` interface to get `AffineMap` from an `Attribute`
  (`AffineMapAttr` implements this interface).
* Store layout as a single generic `MemRefLayoutAttr`.

This change removes the affine map composition feature and related API.
Actually, while the `MemRefType` itself supported it, almost nothing upstream
could work with more than one affine map in `MemRefType`.

The introduced `MemRefLayoutAttr` allows re-implementing this feature
in a more stable way - via a separate attribute class.

Also, the interface allows using layout representations other than affine maps.
For example, the described "stride + offset" form, which is currently supported in the ASM parser only,
can now be expressed as a separate attribute.

Reviewed By: ftynse, bondhugula

Differential Revision: https://reviews.llvm.org/D111553
2021-10-19 12:31:15 +03:00
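For reference, a sketch of the two layout spellings mentioned above (types only, no ops):

```mlir
// Layout as a single AffineMapAttr, which implements MemRefLayoutAttr.
memref<4x8xf32, affine_map<(d0, d1) -> (d1, d0)>>

// "Stride + offset" shorthand, currently supported in the ASM parser only.
memref<4x8xf32, offset: 0, strides: [8, 1]>
```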
not-jenni 4ada6c2aaf [mlir][tosa] Adds a canonicalization to the transpose op if the perms are a no-op
Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D112037
2021-10-18 16:30:53 -07:00
Aart Bik 9d1db3d4a1 [mlir][sparse] generalize sparse_tensor.convert on static/dynamic dimension sizes
This revision lifts the artificial restriction on having exact matches between
source and destination type shapes. A static size may become dynamic. We still
reject changing a dynamic size into a static size to avoid the need for a
runtime "assert" on the conversion. This revision also refactors some of the
conversion code to share same-content buffers.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111915
2021-10-18 13:54:03 -07:00
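A hedged sketch of the now-accepted direction, assuming a CSR-style encoding: a static size may become dynamic, while the reverse remains rejected.

```mlir
#CSR = #sparse_tensor.encoding<{ dimLevelType = [ "dense", "compressed" ] }>

// OK: static sizes in the source are relaxed to dynamic sizes.
%0 = sparse_tensor.convert %t : tensor<8x8xf32> to tensor<?x?xf32, #CSR>
// Rejected: a dynamic size turning static would need a runtime assert.
```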
Eugene Zhulenev bf32bb7e05 [mlir] Update approximation range for Tanh operation
Use a wider range for approximating Tanh to match results computed in Eigen with AVX.

Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D112011
2021-10-18 10:57:31 -07:00
Ahmed Taei b0c4aaff24 Allow only valid vector.shape_cast transitive folding
When folding A->B->C => A->C only accept A->C that is valid shape cast

Reviewed By: ThomasRaoux, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111473
2021-10-18 07:57:55 -07:00
Matthias Springer e7bb8dd929 [mlir][linalg][bufferize] Relax rules for extract_slice/insert_slice matching
The rules were too restrictive, causing out-of-place bufferization when the results of two ExtractSliceOps are fed into an InsertSliceOp.

Differential Revision: https://reviews.llvm.org/D111861
2021-10-16 17:08:47 +09:00
Jacques Pienaar 965ec6dbe7 [mlir] Add folder for shape.add 2021-10-15 17:30:17 -07:00
Aart Bik b24788abd8 [mlir][sparse] implement sparse tensor init operation
Next step towards supporting sparse tensors outputs.
Also some minor refactoring of enum constants as well
as replacing tensor arguments with proper buffer arguments
(the latter is required for more general size arguments for
the sparse_tensor.init operation, as well as more general
sparse_tensor.convert operations later)

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D111771
2021-10-15 09:33:16 -07:00
Matthias Springer 7dd7078760 [mlir][linalg][bufferize] Handle scf::ForOp correctly in bufferizesToMemoryRead
From the perspective of analysis, scf::ForOp is treated as a black box. Basic block arguments do not alias with their respective OpOperands on the ForOp, so they do not participate in conflict analysis with ops defined outside of the loop.

However, bufferizesToMemoryRead and bufferizesToMemoryWrite on the scf::ForOp itself are used to determine how the scf::ForOp interacts with its surrounding ops.

Differential Revision: https://reviews.llvm.org/D111775
2021-10-15 11:24:21 +09:00
Matthias Springer d3cb6bf2d4 [mlir][linalg][bufferize] Rewrite conflict detection
For each memory read, follow SSA use-def chains to find the op that produces the data being read (i.e., the most recent write). A memory write to an alias is a conflict if it takes place after the "most recent write" but before the read.

This CL introduces two main changes:
* There is a concise definition of a conflict. Given a piece of IR with InPlaceSpec annotations and a computed alias set, it is easy to compute whether this program has a conflict. No need to consider multiple cases such as "read of operand after in-place write" etc.
* No need to check for clobbering.

Differential Revision: https://reviews.llvm.org/D111287
2021-10-15 10:31:02 +09:00
thomasraoux afad0cdf31 [mlir][vector] Refactor linalg vectorization for reductions
Emit the reduction during op vectorization instead of doing it when creating the
transfer write. This allows us to not broadcast output arguments for the reduction
initial value.

Differential Revision: https://reviews.llvm.org/D111825
2021-10-14 13:37:56 -07:00
Nicolas Vasilache 82dd977baf [mlir][Linalg] Tighten canonicalization of InsertSliceOp that triggers infinite loop
It is unclear whether this is reproducible with correct IR, but at the moment the verifier for InsertSliceOp
is not powerful enough and this triggers an infinite loop that is worth fixing independently.

Differential Revision: https://reviews.llvm.org/D111812
2021-10-14 15:26:03 +00:00
Nicolas Vasilache 0eeaad3012 [mlir][Linalg] Fix insertion point in comprehensive bufferization 2021-10-14 15:24:09 +00:00
Tobias Gysi a8f69be61f [mlir][linalg] Expose flag to control nofold attribute when padding.
Setting the nofold attribute enables packing an operand. At the moment, the attribute is set by default. The patch introduces a callback to control the flag.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111718
2021-10-14 10:07:07 +00:00
Tobias Gysi eaa52750ce [mlir][linalg] Verify every LinalgOp has a body.
After removing the last LinalgOps that have no region attached we can verify there is a region. The patch performs the following changes:
- Move the SingleBlockImplicitTerminator trait further up to the structured op base class.
- Adapt the LinalgOp verification since the trait only checks if there is 0 or 1 block.
- Introduce a getBlock method on the LinalgOp interface.
- Access the LinalgOp body using either getBlock() or getBody() if the concrete operation type is known.

This patch is a follow up to https://reviews.llvm.org/D111233.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111393
2021-10-14 09:08:39 +00:00
Aart Bik a652e5b53a [mlir][sparse] emergency fix after constant -> arith.constant change
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D111743
2021-10-13 10:26:17 -07:00
Aart Bik 35517a251d [mlir][sparse] add init sparse tensor operation
This is the first step towards supporting general sparse tensors as output
of operations. The init sparse tensor is used to materialize an empty sparse
tensor of given shape and sparsity into a subsequent computation (similar to
the dense tensor init operation counterpart).

Example:
  %c = sparse_tensor.init %d1, %d2 : tensor<?x?xf32, #SparseMatrix>
  %0 = linalg.matmul
    ins(%a, %b: tensor<?x?xf32>, tensor<?x?xf32>)
    outs(%c: tensor<?x?xf32, #SparseMatrix>) -> tensor<?x?xf32, #SparseMatrix>

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111684
2021-10-13 09:47:56 -07:00
xndcn 8c1553f0d7 [mlir][spirv] Add memory semantics verification for atomic operations
Differential Revision: https://reviews.llvm.org/D111510
2021-10-14 00:00:55 +08:00
thomasraoux cc83c2444f [mlir][vector] Add canonicalization extract + splat
Make the canonicalization that works on broadcast also work on the splat op.

Differential Revision: https://reviews.llvm.org/D111690
2021-10-13 08:08:46 -07:00
Mogball a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
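The mechanical shape of the renaming, sketched on one op:

```mlir
// Before: standard dialect spelling.
%0 = addi %a, %b : i32
// After: the op now lives in the arith dialect.
%0 = arith.addi %a, %b : i32
```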
Weiwei Li c0a6381e49 [mlir][SPIRVToLLVM] Solve ExecutionModeOp redefinition and add OpTypeSampledImage into SPV_Type
1. To avoid two ExecutionModeOps using the same name, add the value of the execution mode to the name when converting to the LLVM dialect.
2. To avoid a syntax error in spv.OpLoad, add OpTypeSampledImage into SPV_Type.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D111193
2021-10-13 10:03:25 +08:00
thomasraoux aa71f487f3 [mlir] update new linalg vectorization tests after vectorization fix 2021-10-12 16:10:30 -07:00
thomasraoux 7c97e328b3 [mlir][linalg] Fix generic reduction vectorization
We shouldn't broadcast the original value when doing reduction. Instead
we compute the reduction and then combine it with the original value.

Differential Revision: https://reviews.llvm.org/D111666
2021-10-12 15:46:04 -07:00
Diego Caballero eeb09fd646 [mlir][Linalg] Enable vectorization of 'mul', 'and', 'or' and 'xor' reductions
This patch adds support for vectorizing 'mul', 'and', 'or' and 'xor' reductions
to Linalg.

Reviewed By: pifon2a, ThomasRaoux, aartbik

Differential Revision: https://reviews.llvm.org/D111565
2021-10-12 21:08:23 +00:00
Diego Caballero 5c1d356c18 [mlir][Linalg] Enable vectorization of explicit broadcasts
This patch teaches `isProjectedPermutation` and `inverseAndBroadcastProjectedPermutation`
utilities to deal with maps representing an explicit broadcast, e.g., (d0, d1) -> (d0, 0).
This extension is needed to enable vectorization of such explicit broadcast in Linalg.

Reviewed By: pifon2a, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111563
2021-10-12 21:08:22 +00:00
Rob Suderman 95e4b71519 [mlir][tosa] Fix tosa average_pool2d to linalg type issue
Average pool assumed the same input/output type. The result type for integers
is always an i32 and should be updated appropriately.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D111590
2021-10-12 13:09:21 -07:00
Benjamin Kramer f67d57c95f [mlir][Shape] Add a pattern to turn extract from shape_of into tensor.dim
If I remember correctly this wasn't done previously because dim used to
be in the memref dialect.

Differential Revision: https://reviews.llvm.org/D111651
2021-10-12 19:09:21 +02:00
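A sketch of the rewrite with assumed names:

```mlir
// Before: extract a single extent from the materialized shape.
%shape = shape.shape_of %t : tensor<?x3xf32> -> tensor<2xindex>
%d0 = tensor.extract %shape[%c0] : tensor<2xindex>

// After: query the dimension directly.
%d0 = tensor.dim %t, %c0 : tensor<?x3xf32>
```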
Lei Zhang 519b350de0 [mlir][vector] Add folder for no-op InsertStridedSliceOp
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111636
2021-10-12 11:41:35 -04:00
Nicolas Vasilache b24c91fffc [mlir][Vector][Bugfix] Fix vector transfer to store lowering to insert a proper ExtractOp
Differential Revision: https://reviews.llvm.org/D111641
2021-10-12 13:28:12 +00:00
Nicolas Vasilache 753a67b5c9 [mlir][Linalg] Refactor and improve vectorization to add support for reduction into 0-d tensors.
This revision takes advantage of the recently added support for 0-d transfers and vector.multi_reduction that return a scalar.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D111626
2021-10-12 12:47:36 +00:00
Lei Zhang bdd37c9f49 [mlir][tensor] Add some folders for insert/extract slice ops
* Fold extract_slice immediately after insert_slice.
* Fold overlapping insert_slice.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D111439
2021-10-12 08:40:54 -04:00
Nicolas Vasilache 0c74b12a2e [mlir][Vector] NFC - Add test to exercise lowering of vector.transfer to scf
This revision also renames and moves some tests around.

Differential Revision: https://reviews.llvm.org/D111606
2021-10-12 12:38:33 +00:00
Nicolas Vasilache 47f7938a94 [mlir][Vector] Add support for lowering 0-d transfers to load/store.
Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D111603
2021-10-12 12:35:19 +00:00
Nicolas Vasilache 67b10532c6 [mlir][Vector] Allow 0-d vectors for vector transfer ops.
This revision updates the op semantics, printer, parser and verifier to allow 0-d transfers.
Until 0-d vectors are available, such transfers have a special form that transits through vector<1xt>.
This is a stepping stone towards the longer term work of adding 0-d vectors and will help significantly reduce corner cases in vectorization.

Transformations and lowerings do not yet support this form, extensions will follow.

Differential Revision: https://reviews.llvm.org/D111559
2021-10-12 11:48:42 +00:00
Nicolas Vasilache 8f1650cb65 [mlir][Linalg] NFC - Refactor vector.broadcast op verification logic and make it available as a precondition in Linalg vectorization.
Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D111558
2021-10-12 11:35:34 +00:00
Nicolas Vasilache 31270eb165 [mlir][Vector] Let vector.multi_reduction reduce down to a scalar.
vector.multi_reduction currently does not allow reducing down to a scalar.
This creates corner cases that are hard to handle during vectorization.
This revision extends the semantics and adds the proper transforms, lowerings and canonicalizations to allow lowering out of vector.multi_reduction to other abstractions all the way to LLVM.

In the future, when we also allow 0-d vectors, scalars will still be relevant: 0-d vectors and scalars are not equivalent on all hardware.

In the process, splice out the implementation patterns related to vector.multi_reduce into a new file.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D111442
2021-10-12 11:03:54 +00:00
Shraiysh Vaishay 7a79c6afea [mlir][OpenMP] OpenMP Synchronization Hints stored as IntegerAttr
`hint-expression` is an IntegerAttr, because it can be a combination of multiple values from the enum `omp_sync_hint_t` (Section 2.17.12 of OpenMP 5.0)

Reviewed By: ftynse, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D111360
2021-10-12 11:01:19 +00:00
Aart Bik 849f016ce8 [mlir][sparse] accept affine subscripts in outer dimensions of dense memrefs
This relaxes vectorization of dense memrefs a bit so that affine expressions
are allowed in more outer dimensions. Vectorization of non-unit-stride
references is disabled though, since this seems ineffective anyway.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111469
2021-10-11 11:45:14 -07:00
Uday Bondhugula b2217b36fe [MLIR] Fix affine loop unroll corner case for full unroll
Fix affine loop unroll for zero trip count loops. Add missing check.

Differential Revision: https://reviews.llvm.org/D111375
2021-10-11 10:22:24 +05:30
Amy Zhuang 5ce368cfe2 [mlir] Vectorize induction variables
1. Add support to vectorize induction variables of loops that are
   not mapped to any vector dimension in SuperVectorize pass.
2. Fix a bug in getForInductionVarOwner.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D111370
2021-10-09 12:40:24 -07:00
Matthias Springer f8453ea75f [mlir][linalg][bufferize] Rewrite "write into non-writable memory" detection
The purpose of this revision is to make "write into non-writable memory" conflict detection easier to understand.

The main idea is that there is a conflict in the case of inplace bufferization if:

1. Someone writes to (an alias of) opOperand or opResult, or the to-be-bufferized op itself writes.
2. And, opOperand or opResult aliases a non-writable buffer.

Differential Revision: https://reviews.llvm.org/D111379
2021-10-08 21:27:49 +09:00
Lei Zhang 4cd7ff6728 [mlir][linalg] Constant fold linalg.generic that are transposes
This commit adds a pattern to perform constant folding on linalg
generic ops which are essentially transposes. We see real cases
where model importers may generate such patterns.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D110597
2021-10-08 08:09:13 -04:00
Eugene Zhulenev e2a37bb540 [mlir] Add alignment option to constant tensor bufferization pass
Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D111364
2021-10-08 03:17:20 -07:00
Tobias Gysi 8ed2e8e04f [mlir][linalg] Retire Linalg ConvOp.
The convolution op is one of the remaining hard-coded Linalg operations that have no region attached. It became obsolete due to the OpDSL convolution operations. Removing it allows us to delete specialized code and tests that are not needed for the OpDSL counterparts that rely on the standard code paths.

Tests needed only for the specialized implementations are removed. Tiling and fusion tests are replaced by variants using linalg.conv_2d.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111233
2021-10-08 06:56:37 +00:00
Tobias Gysi 23800b05be [mlir][linalg] Add loop interchange to CodegenStrategy.
Add a loop interchange pass and integrate it with CodegenStrategy.

This patch depends on https://reviews.llvm.org/D110728 and https://reviews.llvm.org/D110746.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110748
2021-10-08 06:39:22 +00:00
Tobias Gysi 1ebd197bc5 [mlir][linalg] Add generalization to CodegenStrategy.
Add a generalization pass and integrate it with CodegenStrategy.

This patch depends on https://reviews.llvm.org/D110728.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110746
2021-10-08 06:31:19 +00:00
MaheshRavishankar 4281946390 [mlir][Tensor] Add ReifyRankedShapedTypeOpInterface to tensor.extract_slice.
Differential Revision: https://reviews.llvm.org/D111263
2021-10-07 17:10:35 -07:00
Amy Zhuang 5d001f58f2 [mlir] Fix a bug in Affine LICM.
Currently Affine LICM checks iterOperands and does not hoist out any
instruction containing iterOperands. We should check iterArgs instead.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D111090
2021-10-07 15:46:43 -07:00
Matthias Springer 6b1f653c94 [mlir][linalg][bufferize] tensor.cast may require a copy
Differential Revision: https://reviews.llvm.org/D110806
2021-10-07 22:24:05 +09:00
Eugene Zhulenev 8276ac13e9 [mlir] Add alignment attribute to memref.global
Revived https://reviews.llvm.org/D102435

Add alignment attribute to `memref.global` and propagate it to llvm global in memref->llvm lowering

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D111309
2021-10-07 06:21:57 -07:00
Adrian Kuegel 2bb208ddfd [mlir] Don't allow dynamic extent tensor types for ConstShapeOp.
ConstShapeOp has a constant shape, so its type can always be static.
We still allow it to have ShapeType though.

Differential Revision: https://reviews.llvm.org/D111139
2021-10-07 10:56:16 +02:00
Tobias Gysi 3fe7fe4424 [mlir][linalg] Add unsigned min/max/cast function to OpDSL.
Update OpDSL to support unsigned integers by adding unsigned min/max/cast signatures. Add tests in OpDSL and on the C++ side to verify the proper signed and unsigned operations are emitted.

The patch addresses an issue brought up in https://reviews.llvm.org/D111170.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D111230
2021-10-07 06:27:20 +00:00
Alexandre Rames fd9613324d [MLIR] Rename Shape dialect's `join` to `meet`.
For the type lattice, we (now) use the "less specialized or equal" partial
order, leading to the bottom representing the empty set, and the top
representing any type.

This naming is more in line with the generally used conventions, where the top
of the lattice is the full set, and the bottom of the lattice is the empty set.
A typical example is the powerset of a finite set: generally, meet would be the
intersection, and join would be the union.

```
top:  {a,b,c}
     /   |   \
 {a,b} {a,c} {b,c}
   |  X     X  |
   {a} { b } {c}
      \  |  /
bottom: { }
```

This is in line with the examined lattice representations in LLVM:
* lattice for `BitTracker::BitValue` in `Hexagon/BitTracker.h`
* lattice for constant propagation in `HexagonConstPropagation.cpp`
* lattice in `VarLocBasedImpl.cpp`
* lattice for address space inference code in `InferAddressSpaces.cpp`

Reviewed By: silvas, jpienaar

Differential Revision: https://reviews.llvm.org/D110766
2021-10-06 09:41:33 -07:00
Tobias Gysi a744c7e962 [mlir][linalg] Update OpDSL to use the newly introduced min and max ops.
Implement min and max using the newly introduced std operations instead of relying on compare and select.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D111170
2021-10-06 06:45:53 +00:00
Diego Caballero eaf2588a51 [mlir][Linalg] Add support for min/max reduction vectorization in linalg.generic
This patch extends Linalg core vectorization with support for min/max reductions
in linalg.generic ops. It enables the reduction detection for min/max combiner ops.
It also renames MIN/MAX combining kinds to MINS/MAXS to make the sign explicit for
floating point and signed integer types. MINU/MAXU should be introduced in the future
for unsigned integer types.

Reviewed By: pifon2a, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D110854
2021-10-05 22:47:20 +00:00
Lei Zhang 7a89444cd9 [mlir][spirv] Add ops and patterns for lowering standard max/min ops
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D111143
2021-10-05 14:27:32 -04:00
Aart Bik 16b8f4ddae [mlir][sparse] add a "release" operation to sparse tensor dialect
We have several ways to materialize sparse tensors (new and convert) but no explicit operation to release the underlying sparse storage scheme at runtime (other than making an explicit delSparseTensor() library call). To simplify memory management, a sparse_tensor.release operation has been introduced that lowers to the runtime library call while keeping tensors, opaque pointers, and memrefs transparent in the initial IR.

*Note* There is obviously some tension between the concept of immutable tensors and memory management methods. This tension is addressed by simply stating that after the "release" call, no further memref related operations are allowed on the tensor value. We expect the design to evolve over time, however, and arrive at a more satisfactory view of tensors and buffers eventually.

Bug:
http://llvm.org/pr52046

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111099
2021-10-05 09:35:59 -07:00
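Usage sketch, assuming a CSR-style encoding; after this op, no further memref-related operations on the tensor value are allowed.

```mlir
// Reclaims the underlying sparse storage scheme (lowers to a
// delSparseTensor() runtime call).
sparse_tensor.release %sp : tensor<?x?xf64, #CSR>
```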
Nicolas Vasilache af9dce18bf [mlir][Linalg] Allow operand-less scf::ExecuteRegionOp to encapsulate scf::YieldOp
These are considered noops.
Bufferization will still fail on scf.execute_region ops that yield values.
This is used to make comprehensive bufferization interoperate better with external clients.

Differential Revision: https://reviews.llvm.org/D111130
2021-10-05 11:34:53 +00:00
Alex Zinenko 01d696e563 [mlir] rename the "packing" flag of linalg.pad_tensor to "nofold"
The discussion in https://reviews.llvm.org/D110425 demonstrated that "packing"
may be a confusing term to define the behavior of this op in presence of the
attribute. Instead, indicate the intended effect of preventing the folder from
being applied.

Reviewed By: nicolasvasilache, silvas

Differential Revision: https://reviews.llvm.org/D111046
2021-10-04 21:28:11 +02:00
Nicolas Vasilache fab634b4e2 [mlir] Tighten strided layout specification.
Clarify that the strided layout specification is represented by a single semi-affine map.

Differential Revision: https://reviews.llvm.org/D110921
2021-10-04 10:37:05 +00:00
Tobias Gysi 32a7d60516 [mlir][linalg] Change tensor size in unit test (NFC).
As a follow up to https://reviews.llvm.org/D110849, adapt the input tensor size to match the iteration space.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D110906
2021-10-04 06:43:35 +00:00
Tobias Gysi bf28849745 [mlir][linalg] Retire PoolingMaxOp/PoolingMinOp/PoolingSumOp.
The pooling ops are among the last remaining hard-coded Linalg operations that have no region attached. They became obsolete due to the OpDSL pooling operations. Removing them allows us to delete specialized code and tests that are not needed for the OpDSL counterparts that rely on the standard code paths.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110909
2021-10-01 13:51:56 +00:00
Uday Bondhugula 08b63db8bb [MLIR][GPU] Add GPU launch op support for dynamic shared memory
Add support for dynamic shared memory for GPU launch ops: add an
optional operand to gpu.launch and gpu.launch_func ops to specify the
amount of "dynamic" shared memory to use. Update lowerings to connect
this operand to the GPU runtime.

Differential Revision: https://reviews.llvm.org/D110800
2021-10-01 16:46:07 +05:30
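A hedged sketch of the new optional operand on gpu.launch (clause placement as I read the op's assembly; the byte count is an i32, constants shown in the pre-arith spelling of the time):

```mlir
%c1 = constant 1 : index
%c32 = constant 32 : index
%shmem = constant 4096 : i32
gpu.launch blocks(%bx, %by, %bz) in (%gx = %c1, %gy = %c1, %gz = %c1)
           threads(%tx, %ty, %tz) in (%sx = %c32, %sy = %c1, %sz = %c1)
           dynamic_shared_memory_size %shmem {
  // Kernel body can use the dynamically sized workgroup memory here.
  gpu.terminator
}
```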
Lei Zhang cb2e651800 [mlir][linalg] Fix incorrect bound calculation for tiling conv
For convolution, the input window dimension's access affine map
is of the form `(d0 * s0 + d1)`, where `d0`/`d1` is the output/
filter window dimension, and `s0` is the stride.

When tiling, https://reviews.llvm.org/D109267 changed how the
way dimensions are acquired. Instead of directly querying using
`*.dim` ops on the original convolution op, we now get it by
applying the access affine map to the loop upper bounds. This
is fine for dimensions having single-dimension affine maps,
like matmul, but not for convolution input. It will cause
incorrect compuation and out of bound. A concrete example, say
we have 1x225x225x3 (NHWC) input, 3x3x3x32 (HWCF) filter, and
1x112x112x3 (NHWC) output with stride 2, (112 * 2 + 3) would be
227, which is different from the correct input window dimension
size 225.

Instead, we should first calculate the max indices for each loop,
and apply the affine map to them, and then plus one to get the
dimension size. Note this makes no difference for matmul-like
ops given they will have `d0 - 1 + 1` effectively.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110849
2021-09-30 13:50:57 -04:00
Stella Laurenzo 267bb194f3 [mlir] Remove old "tc" linalg ods generator.
* This could have been removed some time ago as it only had one op left in it, which is redundant with the new approach.
* `matmul_i8_i8_i32` (the remaining op) can be trivially replaced by `matmul`, which natively supports mixed precision.

Differential Revision: https://reviews.llvm.org/D110792
2021-09-30 16:30:06 +00:00
Matthias Springer 27451a05ed [mlir][vector] Fold transfer ops and tensor.extract/insert_slice.
* Fold vector.transfer_read and tensor.extract_slice.
* Fold vector.transfer_write and tensor.insert_slice.

Differential Revision: https://reviews.llvm.org/D110627
2021-09-30 09:28:00 +09:00
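A sketch of the read-side fold with assumed names and shapes: the transfer indices are composed with the slice offsets so the slice op disappears.

```mlir
// Before: the read goes through an extract_slice.
%s = tensor.extract_slice %t[%o0, %o1] [16, 16] [1, 1]
    : tensor<64x64xf32> to tensor<16x16xf32>
%v = vector.transfer_read %s[%c0, %c0], %pad
    : tensor<16x16xf32>, vector<16x16xf32>

// After: read directly from the source at the composed offsets.
%v = vector.transfer_read %t[%o0, %o1], %pad
    : tensor<64x64xf32>, vector<16x16xf32>
```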
Nicolas Vasilache 92ea624a13 [mlir][Linalg] Rewrite CodegenStrategy to populate a pass pipeline.
This revision retires a good portion of the complexity of the codegen strategy and puts the logic behind pass logic.

Differential revision: https://reviews.llvm.org/D110678
2021-09-29 13:35:45 +00:00
bakhtiyar bdde959533 Remove unnecessary async group creates and awaits.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110605
2021-09-28 14:52:08 -07:00
Amy Zhuang 7ab14b8886 [mlir] Unroll-and-jam loops with iter_args.
Unroll-and-jam currently doesn't work when the loop being unroll-and-jammed
or any of its inner loops has iter_args. This patch modifies the
unroll-and-jam utility to support loops with iter_args.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D110085
2021-09-28 14:13:27 -07:00
thomasraoux b12e4c17e0 [mlir] Fix bug in FoldSubview with rank reducing subview
Fix how we calculate the new permutation map of the transfer ops.

Differential Revision: https://reviews.llvm.org/D110638
2021-09-28 13:18:29 -07:00
Alexander Belyaev 9fb57c8c1d [mlir] Add min/max operations to Standard.
[RFC: Add min/max ops](https://llvm.discourse.group/t/rfc-add-min-max-operations/4353)

I was following the naming style for Arith dialect in
https://reviews.llvm.org/D110200,
i.e. similar to DivSIOp and DivUIOp I defined MaxSIOp, MaxUIOp.

When Arith PR is landed, I will migrate these ops as well.
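
A sketch of the intended assembly (mnemonics assumed to follow the
DivSIOp/DivUIOp convention mentioned above):

```
%0 = maxsi %a, %b : i32   // signed integer maximum
%1 = maxui %a, %b : i32   // unsigned integer maximum
%2 = minsi %a, %b : i32   // signed integer minimum
```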

Differential Revision: https://reviews.llvm.org/D110540
2021-09-28 09:40:22 +02:00
Tobias Gysi d20d0e145d [mlir][linalg] Finer-grained padding control.
Adapt the signature of the PaddingValueComputationFunction callback to either return the padding value or failure to signal padding is not desired.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110572
2021-09-27 19:21:37 +00:00
Aart Bik ec97a205c3 [mlir][sparse] preserve zero-initialization for materializing buffers
This revision makes sure that when the output buffer materializes locally
(in contrast with the passing in of output tensors either in-place or not
in-place), the zero initialization assumption is preserved. This also adds
a bit more documentation on our sparse kernel assumption (viz. TACO
assumptions).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D110442
2021-09-27 11:22:05 -07:00
Bixia Zheng fbd5821c6f Implement the conversion from sparse constant to sparse tensors.
The sparse constant provides a constant tensor in coordinate format. We first split the sparse constant into a constant tensor for indices and a constant tensor for values. We then generate a loop to fill a sparse tensor in coordinate format using the tensors for the indices and the values. Finally, we convert the sparse tensor in coordinate format to the destination sparse tensor format.
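
A sketch of the kind of input this handles (the #CSR encoding alias is
an assumption for illustration):

```
// A 2x2 sparse constant in coordinate format: an index list plus values.
%0 = constant sparse<[[0, 0], [1, 1]], [1.0, 2.0]> : tensor<2x2xf64>
// Lowered by splitting the constant into an index tensor and a value
// tensor, looping over both to fill a COO buffer, and finally converting
// the COO buffer into the destination sparse format.
%1 = sparse_tensor.convert %0 : tensor<2x2xf64> to tensor<2x2xf64, #CSR>
```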

Add tests.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D110373
2021-09-27 09:47:29 -07:00
Eugene Zhulenev 92db09cde0 [mlir] AsyncRuntime: use int64_t for ref counting operations
Workaround for SystemZ ABI problem: https://bugs.llvm.org/show_bug.cgi?id=51898

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D110550
2021-09-27 07:55:01 -07:00
Matthias Springer ffdf0a370d [mlir][vector] Fix bug in vector-transfer-full-partial-split
When splitting with linalg.copy, we cannot write into the destination alloc directly. Instead, write into a subview of the alloc.

Differential Revision: https://reviews.llvm.org/D110512
2021-09-27 18:12:17 +09:00
Lei Zhang b45476c94c [mlir][tosa] Do not fold transpose with quantized types
For such cases, the type of the constant DenseElementsAttr is
different from the transpose op return type.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D110446
2021-09-24 16:57:55 -04:00
River Riddle aca9bea199 [mlir:MemRef] Move DmaStartOp/DmaWaitOp to ODS
These are among the last operations still defined explicitly in C++. I've
tried to keep this commit as NFC as possible, but these ops
definitely need a non-NFC cleanup at some point.

Differential Revision: https://reviews.llvm.org/D110440
2021-09-24 19:35:28 +00:00
Lei Zhang e325ebb9c7 [mlir][tosa] Add some transpose folders
* If the input is a constant splat value, we just
  need to reshape it.
* If the input is a general constant with one user,
  we can also constant fold it, without bloating
  the IR.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D110439
2021-09-24 15:25:14 -04:00
Alex Zinenko 5988a3b7a0 [mlir] Linalg: ensure tile-and-pad always creates padding as requested
Initially, the padding transformation and the related operation were only used
to guarantee static shapes of subtensors in tiled operations. The
transformation would not insert the padding operation if the shapes were
already static, and the overall code generation would actively remove such
"noop" pads. However, this transformation can be also used to pack data into
smaller tensors and marshall them into faster memory, regardless of the size
mismatches. In context of expert-driven transformation, we should assume that,
if padding is requested, a potentially padded tensor must always be created.
Update the transformation accordingly. To do this, introduce an optional
`packing` attribute to the `pad_tensor` op that serves as an indication that
the padding is an intentional choice (as opposed to side effect of type
normalization) and should be left alone by cleanups.
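
A sketch of the op with the new attribute (the exact assembly placement
of `packing` is assumed here):

```
// The pad is intentional; `packing` keeps cleanups from folding it away.
%0 = linalg.pad_tensor %t packing low[0, 0] high[2, 2] {
^bb0(%i: index, %j: index):
  linalg.yield %cst : f32
} : tensor<6x6xf32> to tensor<8x8xf32>
```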

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110425
2021-09-24 18:40:13 +02:00
Alex Zinenko 3f89e339bb [mlir] add pad_tensor(tensor.cast) -> pad_tensor canonicalizer
This canonicalization pattern complements the tensor.cast(pad_tensor) one in
propagating constant type information when possible. It contributes to the
feasibility of pad hoisting.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110343
2021-09-24 12:03:47 +02:00
Matthias Springer f3f25ffc04 [mlir][linalg] Fix result type in FoldSourceTensorCast
* Do not discard static result type information that cannot be inferred from lower/upper padding.
* Add optional argument to `PadTensorOp::inferResultType` for specifying known result dimensions.

Differential Revision: https://reviews.llvm.org/D110380
2021-09-24 16:47:18 +09:00
Matthias Springer 2190f8a8b1 [mlir][linalg] Support tile+peel with TiledLoopOp
Only scf.for was supported until now.

Differential Revision: https://reviews.llvm.org/D110220
2021-09-24 10:23:31 +09:00
Matthias Springer 8dc16ba8d2 [mlir][linalg] Merge all tiling passes into a single one.
Passes such as `linalg-tile-to-tiled-loop` are merged into `linalg-tile`.

Differential Revision: https://reviews.llvm.org/D110214
2021-09-24 10:16:46 +09:00
Aart Bik a924fcc7c3 [mlir][sparse] add sparse kernels test to sparse compiler test suite
This test makes sure kernels map to efficient sparse code, i.e. all
compressed for-loops, no co-iterating while loops.  In addition, this
revision removes the special constant folding inside the sparse
compiler in favor of Mahesh's new generic linalg folding. Thanks!

NOTE: relies on Mahesh's fix, which needs to be rebased first

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D110001
2021-09-22 14:56:39 -07:00
MaheshRavishankar a40a08ed98 [mlir][Linalg] Teach constant -> generic op fusion to handle scalar constants.
The current folder of constant -> generic op only handles splat
constants. The same logic holds for scalar constants. Teach the
pattern to handle such cases.

Differential Revision: https://reviews.llvm.org/D109982
2021-09-22 13:41:47 -07:00
Aart Bik 5da21338bc [mlir][sparse] generalize reduction support in sparse compiler
Now not just SUM, but also PRODUCT, AND, OR, XOR. The reductions
MIN and MAX are still to be done (also depends on recognizing
these operations in cmp-select constructs).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D110203
2021-09-22 12:36:46 -07:00
Tobias Gysi 8b5236def5 [mlir][linalg] Simplify slice dim computation for fusion on tensors (NFC).
Compute the tiled producer slice dimensions directly starting from the consumer not using the producer at all.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110147
2021-09-21 15:09:46 +00:00
Nicolas Vasilache 101d017a64 [mlir][Linalg] Revisit heuristic ordering of tensor.insert_slice in comprehensive bufferize.
It was previously assumed that tensor.insert_slice should be bufferized first in a greedy fashion to avoid out-of-place bufferization of the large tensor. This heuristic does not hold upon further inspection.

This CL removes the special handling of such ops and adds a test that exhibits better behavior and appears in real use cases.

The only test adversely affected is an artificial test which results in a returned memref: this pattern is not allowed by comprehensive bufferization in real scenarios anyway and the offending test is deleted.

Differential Revision: https://reviews.llvm.org/D110072
2021-09-21 14:22:45 +00:00
Nicolas Vasilache 0d2c54e851 [mlir][Linalg] Revisit RAW dependence interference in comprehensive bufferize.
Previously, comprehensive bufferize would consider all aliasing reads and writes to
the result buffer and matching operand. This resulted in spurious dependences
being considered and resulted in too many unnecessary copies.

Instead, this revision revisits the gathering of read and write alias sets.
This results in fewer alloc and copies.
An exhaustive test case is added that considers all possible permutations of
`matmul(extract_slice(fill), extract_slice(fill), ...)`.
2021-09-21 14:22:22 +00:00
Morten Borup Petersen 032cb1650f [MLIR][SCF] Add for-to-while loop transformation pass
This pass transforms SCF.ForOp operations to SCF.WhileOp. The for-loop condition is placed in the 'before' region of the while operation, and the induction variable incrementation plus the loop body in the 'after' region. The loop-carried values of the while op are the induction variable (IV) of the for-loop plus any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.

This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).
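
A minimal sketch of the rewrite, assuming std-dialect arithmetic ops of
the period:

```
// Before: the IV is implicit in scf.for.
%r = scf.for %iv = %lb to %ub step %step iter_args(%acc = %init) -> (f32) {
  %next = addf %acc, %c : f32
  scf.yield %next : f32
}
// After: the condition lives in the 'before' region; the IV is carried
// explicitly and incremented together with the body in the 'after' region.
%r2:2 = scf.while (%iv = %lb, %acc = %init) : (index, f32) -> (index, f32) {
  %cond = cmpi slt, %iv, %ub : index
  scf.condition(%cond) %iv, %acc : index, f32
} do {
^bb0(%iv2: index, %acc2: f32):
  %next = addf %acc2, %c : f32
  %iv.next = addi %iv2, %step : index
  scf.yield %iv.next, %next : index, f32
}
```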

Differential Revision: https://reviews.llvm.org/D108454
2021-09-21 09:09:54 +01:00
River Riddle 4f21152af1 [mlir] Tighten verification of SparseElementsAttr
SparseElementsAttr currently does not perform any verification on construction, with the only verification existing within the parser. This revision moves the parser verification to SparseElementsAttr, and also adds additional verification for when a sparse index is not valid.

Differential Revision: https://reviews.llvm.org/D109189
2021-09-21 01:57:42 +00:00
MaheshRavishankar 4cf9bf6c9f [mlir][MemRef] Compute unused dimensions of a rank-reducing subviews using strides as well.
For `memref.subview` operations, when there are more than one
unit-dimensions, the strides need to be used to figure out which of
the unit-dims are actually dropped.

Differential Revision: https://reviews.llvm.org/D109418
2021-09-20 11:05:30 -07:00
MaheshRavishankar 0b33890f45 [mlir][Linalg] Add ConvolutionOpInterface.
Add an interface that allows grouping together all convolution and
pooling ops within Linalg named ops. The interface currently checks that:
- the indexing map used for input/image access is valid
- the filter and output are accessed using projected permutations
- all loops are characterizable as one iterating over
  - batch dimension,
  - output image dimensions,
  - filter convolved dimensions,
  - output channel dimensions,
  - input channel dimensions,
  - depth multiplier (for depthwise convolutions)

Differential Revision: https://reviews.llvm.org/D109793
2021-09-20 10:41:10 -07:00
Mehdi Amini 5edd79fc97 Revert "[MLIR][SCF] Add for-to-while loop transformation pass"
This reverts commit 644b55d57e.

The added test is failing the bots.
2021-09-20 17:21:59 +00:00
Tobias Gysi 7be28d82b4 [mlir][linalg] Add IndexOp support to fusion on tensors.
This revision depends on https://reviews.llvm.org/D109761 and https://reviews.llvm.org/D109766.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D109774
2021-09-20 15:59:35 +00:00
Morten Borup Petersen 644b55d57e [MLIR][SCF] Add for-to-while loop transformation pass
This pass transforms SCF.ForOp operations to SCF.WhileOp. The for-loop condition is placed in the 'before' region of the while operation, and the induction variable incrementation plus the loop body in the 'after' region. The loop-carried values of the while op are the induction variable (IV) of the for-loop plus any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.

This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).

Differential Revision: https://reviews.llvm.org/D108454
2021-09-20 16:57:50 +01:00
Tobias Gysi 6db928b8f3 [mlir][linalg] Fusion on tensors.
Add a new version of fusion on tensors that supports the following scenarios:
- support input and output operand fusion
- fuse a producer result passed in via tile loop iteration arguments (update the tile loop iteration arguments)
- supports only linalg operations on tensors
- supports only scf::for
- cannot add an output to the tile loop nest

The LinalgTileAndFuseOnTensors pass tiles the root operation and fuses its producers.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D109766
2021-09-20 14:45:34 +00:00
KareemErgawy-TomTom bdcf4b9b96 [MLIR][Linalg] Make detensoring cost-model more flexible.
So far, the CF cost-model for detensoring was limited to discovering
pure CF structures. This means, if while discovering the CF component,
the cost-model found any op that is not detensorable, it gives up on
detensoring altogether. This patch makes it a bit more flexible by
cleaning-up the detensorable component from non-detensorable ops without
giving up entirely.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D109965
2021-09-20 10:21:31 +02:00
Uday Bondhugula 57eda9becc [MLIR][GPU] Add constant propagator for gpu.launch op
Add a constant propagator for gpu.launch op in cases where the
grid/thread IDs can be trivially determined to take a single constant
value of zero.

Differential Revision: https://reviews.llvm.org/D109994
2021-09-18 12:02:46 +05:30
Aart Bik d4e16171e8 [mlir][sparse] add dce test for all sparse tensor ops
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D109992
2021-09-17 13:03:42 -07:00
thomasraoux 08f0cb7719 [mlir] Prevent crash in DropUnitDim pattern due to tensor with encoding
Differential Revision: https://reviews.llvm.org/D109984
2021-09-17 12:03:16 -07:00
thomasraoux 36aac53b36 [mlir][linalg] Extend drop unit dim pattern to all cases of reduction
Even with all-parallel loops, reading the output value is still allowed, so we
don't have to handle reduction loops differently.

Differential Revision: https://reviews.llvm.org/D109851
2021-09-17 10:09:57 -07:00
thomasraoux 416679615d [mlir] Linalg hoisting should ignore uses outside the loop
Differential Revision: https://reviews.llvm.org/D109859
2021-09-17 10:06:57 -07:00
Aart Bik b1d44e5902 [mlir][sparse] add affine subscripts to sparse compilation pass
This enables the sparsification of more kernels, such as convolutions
where there is a x(i+j) subscript. It also enables more tensor invariants
such as x(1) or other affine subscripts such as x(i+1). Currently, we
reject sparsity altogether for such tensors. Despite this restriction,
however, we can already handle a lot more kernels with compound subscripts
for dense access (viz. convolution with dense input and sparse filter).
Some unit tests and an integration test demonstrate new capability.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D109783
2021-09-15 20:28:04 -07:00
Rob Suderman 1ac2d195ec [mlir][linalg] Add canonicalizers for depthwise conv
There are two main versions of depthwise conv depending whether the multiplier
is 1 or not. In cases where m == 1 we should use the version without the
multiplier channel as it can perform greater optimization.

Add lowering for the quantized/float versions to have a multiplier of one.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D108959
2021-09-15 14:09:15 -07:00
Simon Camphausen 1b79efdc72 [mlir] Fix printing of EmitC attrs/types with escape characters
Attributes and types were not escaped when printing.

Reviewed By: jpienaar, marbre

Differential Revision: https://reviews.llvm.org/D109143
2021-09-15 18:15:38 +00:00
Nicolas Vasilache 96ec0ff2b7 [mlir][Linalg] Revisit insertion points in comprehensive bufferization.
This revision fixes a corner case that could appear due to incorrect insertion point behavior in comprehensive bufferization.

Differential Revision: https://reviews.llvm.org/D109830
2021-09-15 18:11:38 +00:00
Nicolas Vasilache 6fe77b1051 [mlir][Linalg] Fail comprehensive bufferization if a memref is returned.
Differential revision: https://reviews.llvm.org/D109824
2021-09-15 15:11:17 +00:00
Tobias Gysi 44a889778c [mlir][linalg] Fold ExtractSliceOps during tiling.
Add the makeComposedExtractSliceOp method that creates an ExtractSliceOp and folds chains of ExtractSliceOps by computing the sum of their offsets and by multiplying their strides.
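
A sketch of the composition (static offsets/sizes/strides for
readability):

```
%a = tensor.extract_slice %t[4] [16] [2] : tensor<64xf32> to tensor<16xf32>
%b = tensor.extract_slice %a[2] [4] [3] : tensor<16xf32> to tensor<4xf32>
// composes into offset 4 + 2 * 2 = 8 and stride 2 * 3 = 6:
%c = tensor.extract_slice %t[8] [4] [6] : tensor<64xf32> to tensor<4xf32>
```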

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D109601
2021-09-14 11:43:52 +00:00
Matthias Springer 62883459cd [mlir][linalg] makeTiledShape: No affine.min if tile size == 1
This improves codegen (more static type information) with `scalarize-dynamic-dims`.

Differential Revision: https://reviews.llvm.org/D109415
2021-09-14 10:48:20 +09:00
Matthias Springer fb1def9c66 [mlir][linalg] New tiling option: Scalarize dynamic dims
This tiling option scalarizes all dynamic dimensions, i.e., it tiles all dynamic dimensions by 1.

This option is useful for linalg ops with partly dynamic tensor dimensions. E.g., such ops can appear in the partial iteration after loop peeling. After scalarizing dynamic dims, those ops can be vectorized.
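
For instance (a sketch, assuming the fill syntax of the period):

```
// An op with a partly dynamic shape, e.g. from a peeled partial iteration:
%0 = linalg.fill(%cst, %t) : f32, tensor<?x128xf32> -> tensor<?x128xf32>
// Scalarizing tiles the dynamic dim by 1; the loop body then operates on
// tensor<1x128xf32>, which has enough static shape information to vectorize.
```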

Differential Revision: https://reviews.llvm.org/D109268
2021-09-14 10:40:50 +09:00
Matthias Springer 8faf35c0a5 [mlir][linalg] Add scf.for loop peeling to codegen strategy
Only scf.for loops are supported at the moment. linalg.tiled_loop support will be added in a subsequent commit.

Only static tensor sizes are supported. Loops for dynamic tensor sizes can be peeled, but the generated code is not optimal due to a missing canonicalization pattern.

Differential Revision: https://reviews.llvm.org/D109043
2021-09-14 10:35:01 +09:00
Matthias Springer a4a654d301 [mlir][linalg] TiledLoopOp peeling: Do not peel partial iterations
Extend the unit test with an option for skipping partial iterations during loop peeling.

Differential Revision: https://reviews.llvm.org/D109640
2021-09-14 10:01:46 +09:00
Nicolas Vasilache 181d18ef53 [mlir][Linalg] Insert static buffers as high as possible during ComprehensiveBufferization.
This revision allows hoisting static alloc/dealloc pairs as high as possible during ComprehensiveBufferization.
This also aligns such allocated buffers to 128B by default.

This change exhibited some issues wrt insertion points and a missing copy that are also fixed in this revision; tests are updated accordingly.

Differential Revision: https://reviews.llvm.org/D109684
2021-09-13 15:59:03 +00:00
Nicolas Vasilache b01d223faf [mlir][Linalg] Use reify for padded op shape derivation.
Previously, we would insert a DimOp and rely on later canonicalizations.
Unfortunately, reifyShape-style rewrites are not canonicalizations anymore.
This introduces undesirable pass dependencies.

Instead, immediately reify the result shape and avoid the DimOp altogether.
This is akin to a local folding, which avoids introducing more reliance on `-resolve-shaped-type-result-dims` (similar to compositions of `affine.apply` by construction to avoid chains of size > 1).

It does not completely get rid of the reliance on the pass as the process is merely local: calling the pass may still be necessary for global effects. Indeed, one of the tests still requires the pass.

Differential Revision: https://reviews.llvm.org/D109571
2021-09-13 11:54:59 +00:00
Rob Suderman b0532286fe [mlir][tosa] Add shape inference for tosa.while
Tosa.while shape inference requires repeatedly running shape inference across
the body of the loop until the types become static as we do not know the number
of iterations required by the loop body. Once the least specific arguments are
known they are propagated to both regions.

To determine the final end type, the least restrictive types are determined
from all yields.

Differential Revision: https://reviews.llvm.org/D108801
2021-09-10 13:11:53 -07:00
Stephan Herhut 5e6c170b3f [mlir][linalg] Fix bufferize pattern to allow unknown operations in body of generic
The original version of the bufferization pattern for linalg.generic would
manually clone operations within the region to the bufferized clone of the
operation. This triggers legality requirements on those operations in the
conversion infra. Instead, this now uses the rewriter to inline the region
instead, avoiding those legality requirements.

Differential Revision: https://reviews.llvm.org/D109581
2021-09-10 13:37:42 +02:00
Matthias Springer 0f3544d185 [mlir][scf] Loop peeling: Use scf.for for partial iteration
Generate an scf.for instead of an scf.if for the partial iteration. This is for consistency reasons: The peeling of linalg.tiled_loop also uses another loop for the partial iteration.

Note: Canonicalizations patterns may rewrite partial iterations to scf.if afterwards.

Differential Revision: https://reviews.llvm.org/D109568
2021-09-10 19:07:09 +09:00
Nicolas Vasilache 5f1a1af4bf [mlir][Linalg] Properly order extract_slice traversal in comprehensive bufferization
This revision fixes the traversal order of extract_slice during the inplace analysis.
It was previously thought that such ops could be analyzed at the very end.
This is unfortunately not true as the AliasInfo for dependents of these ops need to be updated.

This change allows the aliases introduced by the bufferization of extract_slice to be properly propagated.

Differential Revision: https://reviews.llvm.org/D109519
2021-09-10 07:10:06 +00:00
Aart Bik 066d786ce0 [mlir][sparse] add folding to sparse_tensor.convert
folds conversion between identical types (with tests)

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D109545
2021-09-09 15:45:19 -07:00
Matthias Springer c7d569b8f7 [mlir][scf] Fold dim(scf.for) to dim(iter_arg)
Fold dim ops of scf.for results to dim ops of the respective iter args if the loop is shape preserving.
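
A sketch of the fold (valid only for shape-preserving loops):

```
%r = scf.for %iv = %lb to %ub step %s iter_args(%a = %t) -> (tensor<?xf32>) {
  // ... shape-preserving body yielding a tensor<?xf32> ...
}
%d = tensor.dim %r, %c0 : tensor<?xf32>   // folds to: tensor.dim %t, %c0
```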

Differential Revision: https://reviews.llvm.org/D109430
2021-09-09 13:47:13 +09:00
Matthias Springer e2c8fcb9d0 [mlir][linalg] Fold dim(linalg.tiled_loop) to dim(output_arg)
Fold dim ops of linalg.tiled_loop results to dim ops of the respective iter args if the loop is shape preserving.

Differential Revision: https://reviews.llvm.org/D109431
2021-09-09 13:37:28 +09:00
Matthias Springer f7137da174 [mlir][linalg] Fix dim(iter_arg) canonicalization
Run a small analysis to see if the runtime type of the iter_arg is changing. Fold only if the runtime type stays the same. (Same as `DimOfIterArgFolder` in SCF.)

Differential Revision: https://reviews.llvm.org/D109299
2021-09-09 12:13:05 +09:00
Matthias Springer c95a7246a3 [mlir][linalg] Tiling: Use loop ub in extract_slice size computation if possible
When tiling a LinalgOp, extract_slice/insert_slice pairs are inserted. To avoid going out-of-bounds when the tile size does not divide the shape size evenly (at the boundary), AffineMin ops are inserted. Some ops have assumptions regarding the dimensions of inputs/outputs. E.g., in a `A * B` matmul, `dim(A, 1) == dim(B, 0)`. However, loop bounds use either `dim(A, 1)` or `dim(B, 0)`.

With this change, AffineMin ops are expressed in terms of loop bounds instead of tensor sizes. (Both have the same runtime value.) This simplifies canonicalizations.

Differential Revision: https://reviews.llvm.org/D109267
2021-09-09 11:06:22 +09:00
Chris Lattner 42431b8207 [tests] Make testsuite more resilient to "order of constant" changes. NFC. 2021-09-08 10:10:10 -07:00
Matthias Springer c57c4f888c [mlir][linalg] linalg.tiled_loop peeling
Differential Revision: https://reviews.llvm.org/D108270
2021-09-07 09:50:08 +09:00
Eugene Zhulenev fd52b4357a [mlir] Async: check awaited operand error state after sync await
Previously only await inside the async function (coroutine after lowering to async runtime) would check the error state

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D109229
2021-09-04 05:00:17 -07:00
Loren Maggiore 361458b1ce [mlir] create gpu memset op
Create a gpu memset op and corresponding CUDA and ROCm wrappers.

Reviewed By: herhut, lorenrose1013

Differential Revision: https://reviews.llvm.org/D107548
2021-09-04 08:13:04 +02:00
Mehdi Amini 78accf9f35 Make LLVM Linkage a first class attribute instead of using an integer attribute
This makes the IR more readable, in particular when this will be used on
the builtin func outside of the LLVM dialect.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D109209
2021-09-03 21:21:46 +00:00
Alexander Belyaev 5ee5bbd0ff [mlir][linalg] Extend tiled_loop to SCF conversion to generate scf.parallel.
Differential Revision: https://reviews.llvm.org/D109230
2021-09-03 18:05:54 +02:00
Aart Bik b6d1a31c1b [mlir][sparse] refine heuristic for iteration graph topsort
The sparse index order must always be satisfied, but this
may give a choice in topsorts for several cases. We broke
ties in favor of any dense index order, since this gives
good locality. However, breaking ties in favor of pushing
unrelated indices into sparse iteration spaces gives better
asymptotic complexity. This revision improves the heuristic.

Note that in the long run, we are really interested in using
ML for ML to find the best loop ordering as a replacement for
such heuristics.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D109100
2021-09-03 08:37:15 -07:00
Matthias Springer 4fa6c2734c [mlir][scf] Allow runtime type of iter_args to change
The limitation on iter_args introduced with D108806 is too restricting. Changes of the runtime type should be allowed.

Extends the dim op canonicalization with a simple analysis to determine when it is safe to canonicalize.

Differential Revision: https://reviews.llvm.org/D109125
2021-09-03 10:03:05 +09:00
Kiran Chandramohan 711aa35759 [MLIR][OpenMP] Add support for declaring critical construct names
Add an operation omp.critical.declare to declare names/symbols of
critical sections. Named omp.critical operations should use symbols
declared by omp.critical.declare. Having a declare operation ensures
that the names of critical sections are global and unique. In the
lowering flow to LLVM IR, the OpenMP IRBuilder creates unique names
for critical sections.

Reviewed By: ftynse, jeanPerier

Differential Revision: https://reviews.llvm.org/D108713
2021-09-02 14:31:19 +00:00
Weiwei Li a79d7c2c85 [mlir][SPIRV] Add Image Operands for Image Instructions
This patch is to add Image Operands in SPIR-V Dialect and also let ImageDrefGather to use Image Operands.

Image Operands are used in many image instructions. "Image Operands encodes what operands follow, as per Image Operands." Usually, they are optional to image instructions.

The format of image operands looks like:

    %0 = spv.ImageXXXX %1, ... %3 : f32 ["Bias|Lod"](%4, %5 : f32, f32) -> ...

This patch doesn't implement all operands (see Section 3.14 in the SPIR-V spec) but provides a skeleton for it. There is a TODO in the verifyImageOperands function.

Co-authored: Alan Liu <alanliu.yf@gmail.com>

Reviewed by: antiagainst

Differential Revision: https://reviews.llvm.org/D108501
2021-09-02 04:14:17 +08:00
MaheshRavishankar b686fdbf92 [mlir][Linalg] Drop output tensor from `linalg.pad_tensor` op.
The output tensor was added for tiling purposes. With use of
`TilingInterface` for tiling pad operations, there is no need for an
explicit operand for the shape of result of `linalg.pad_tensor`
op. The interface allows the tiling pattern to query the value that
can be used for the "init" needed for tiling dynamically.

Differential Revision: https://reviews.llvm.org/D108613
2021-08-31 11:12:24 -07:00
Mehdi Amini 387f95541b Add a new interface allowing to set a default dialect to be used for printing/parsing regions
Currently the builtin dialect is the default namespace used for parsing
and printing. As such module and func don't need to be prefixed.
In the case of some dialects that define new regions for their own
purpose (like SpirV modules for example), it can be beneficial to
change the default dialect in order to improve readability.

Differential Revision: https://reviews.llvm.org/D107236
2021-08-31 17:52:40 +00:00
Mehdi Amini c41b16c26b Change ASM Op printer to print the operation name in the framework instead of leaving it up to each individual operation
This aligns the printer with the parser contract: the operation isn't part of the user-controllable part of the syntax.

Differential Revision: https://reviews.llvm.org/D108804
2021-08-31 17:52:40 +00:00
Tres Popp 44485fcd97 [mlir] Prevent assertion failure in DropUnitDims
Don't assert fail on strided memrefs when dropping unit dims.
Instead just leave them unchanged.

Differential Revision: https://reviews.llvm.org/D108205
2021-08-31 12:15:13 +02:00
marina kolpakova a.k.a. geexie 0080d2aa55 [mlir][gpu] folds memref.dim of gpu.alloc
implements canonicalization which folds memref.dim(gpu.alloc(%size), %idx) -> %size
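
A sketch of the fold:

```
%m = gpu.alloc(%size) : memref<?xf32>
%d = memref.dim %m, %c0 : memref<?xf32>   // folds to %size
```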

Differential Revision: https://reviews.llvm.org/D108892
2021-08-31 12:33:10 +03:00
MaheshRavishankar ba72cfe734 [mlir] Add an interface to allow operations to specify how they can be tiled.
An interface to allow for tiling of operations is introduced. The
tiling of the linalg.pad_tensor operation is modified to use this
interface.

Differential Revision: https://reviews.llvm.org/D108611
2021-08-30 16:31:18 -07:00
Matthias Springer d18ffd61d4 [mlir][SCF] Canonicalize dim(x) where x is an iter_arg
* Add `DimOfIterArgFolder`.
* Move existing cross-dialect canonicalization patterns to `LoopCanonicalization.cpp`.
* Rename `SCFAffineOpCanonicalization` pass to `SCFForLoopCanonicalization`.
* Expand documentaton of scf.for: The type of loop-carried variables may not change with iterations. (Not even the dynamic type.)

Differential Revision: https://reviews.llvm.org/D108806
2021-08-30 01:39:56 +00:00
Aart Bik 0a7b8cc5dd [mlir][sparse] fully implement sparse tensor to sparse tensor conversions
with rigorous integration test

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108721
2021-08-27 15:08:18 -07:00
Matthias Springer a9cff97f94 [mlir][SCF] Generalize AffineMinSCFCanonicalization to min/max ops
* Add support for affine.max ops to SCF loop peeling pattern.
* Add support for affine.max ops to `AffineMinSCFCanonicalizationPattern`.
* Rename `AffineMinSCFCanonicalizationPattern` to `AffineOpSCFCanonicalizationPattern`.
* Rename `AffineMinSCFCanonicalization` pass to `SCFAffineOpCanonicalization`.

Differential Revision: https://reviews.llvm.org/D108009
2021-08-25 10:40:34 +09:00
Matthias Springer 2de2dbef2a [mlir][linalg] Replace AffineMinSCFCanonicalizationPattern with SCF reimplementation
Use the new canonicalization pattern in the SCF dialect.

Differential Revision: https://reviews.llvm.org/D107732
2021-08-25 08:52:56 +09:00
Matthias Springer 98aa694d0d [mlir][scf] Add general affine.min canonicalization pattern
This canonicalization simplifies affine.min operations inside "for loop"-like operations (e.g., scf.for and scf.parallel) based on two invariants:
* iv >= lb
* iv < lb + step * ((ub - lb - 1) floorDiv step) + 1

This commit adds a new pass `canonicalize-scf-affine-min` (instead of being a canonicalization pattern) to avoid dependencies between the Affine dialect and the SCF dialect.

Differential Revision: https://reviews.llvm.org/D107731
2021-08-25 07:32:30 +09:00
Tyler Augustine d25e91d7f6 Support alias.scope and noalias metadata
Introduces new Ops to represent 1. alias.scope metadata in LLVM, and 2. domains for these scopes. These correspond to the metadata described in https://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata. Lists of scopes are modeled the same way as access groups - as an ArrayAttr on the Op (added in https://reviews.llvm.org/D97944).

Lowering 'noalias' attributes on function parameters is already supported. However, lowering `noalias` metadata on individual Ops is not, which is added in this change. LLVM uses the same keyword for these, but this change introduces a separate attribute name 'noalias_scopes' to represent this distinct concept.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107870
2021-08-24 20:42:59 +02:00
Matthias Springer ebf35370ff [mlir][tensor] Insert explicit tensor.cast ops for insert_slice src
If additional static type information can be deduced from a insert_slice's size operands, insert an explicit cast of the op's source operand.

This enables other canonicalization patterns that are matching for tensor_cast ops such as `ForOpTensorCastFolder` in SCF.
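
A sketch of the rewrite (shapes are illustrative):

```
// The static sizes [4, 16] imply a static source type:
%r = tensor.insert_slice %src into %dst[0, 0] [4, 16] [1, 1]
    : tensor<?x?xf32> into tensor<8x16xf32>
// becomes:
%c = tensor.cast %src : tensor<?x?xf32> to tensor<4x16xf32>
%r2 = tensor.insert_slice %c into %dst[0, 0] [4, 16] [1, 1]
    : tensor<4x16xf32> into tensor<8x16xf32>
```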

Differential Revision: https://reviews.llvm.org/D108617
2021-08-24 19:45:04 +09:00
MaheshRavishankar b546f4347b [mlir]Linalg] Allow controlling fusion of linalg.generic -> linalg.tensor_expand_shape.
Differential Revision: https://reviews.llvm.org/D108565
2021-08-23 16:28:10 -07:00
Aart Bik 236a90802d [mlir][sparse] replace support lib conversion with actual MLIR codegen
Rationale:
Passing in a pointer to the memref data in order to implement the
dense to sparse conversion was a bit too low-level. This revision
improves upon that approach with a cleaner solution of generating
a loop nest in MLIR code itself that prepares the COO object before
passing it to our "swiss army knife" setup.  This is much more
intuitive *and* now also allows for dynamic shapes.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108491
2021-08-23 14:26:05 -07:00
Matthias Springer bc194a5bb5 [mlir][SCF] Do not peel loops inside partial iterations
Do not apply loop peeling to loops that are contained in the partial iteration of an already peeled loop. This is to avoid code explosion when dealing with large loop nests. Can be controlled with a new pass option `skip-partial`.

Differential Revision: https://reviews.llvm.org/D108542
2021-08-23 21:35:46 +09:00
Rob Suderman 871c812483 [mlir][linalg] Finish refactor of TC ops to YAML
Multiple operations were still defined as TC ops that had equivalent versions
as YAML operations. Reducing to a single compilation path guarantees that
frontends can lower to their equivalent operations without missing the
optimized fastpath.

Some operations are maintained purely for testing purposes (mainly conv{1,2,3}D,
as they are included as the sole tests in the vectorization transforms).

Differential Revision: https://reviews.llvm.org/D108169
2021-08-20 12:35:04 -07:00
Aart Bik 758ccf8506 [mlir][sparse] add test for DimOp folding
Folding in the MLIR uses the order of the type directly
but folding in the underlying implementation must take
the dim ordering into account. These tests clarify that
behavior and verify it is done right.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108474
2021-08-20 11:24:09 -07:00
Morten Borup Petersen 6c1436a9b0 [MLIR][SCF] Parenthesize multiple return types in scf.execute_region asm op
Previously, ExecuteRegionOps with multiple return values would fail a round-trip test due to missing parenthesis around the types.

Differential Revision: https://reviews.llvm.org/D108402
2021-08-19 21:31:51 +01:00
Matthias Springer 76a1861816 [mlir][SparseTensor] Split scf.for loop into masked/unmasked parts
Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores.

Differential Revision: https://reviews.llvm.org/D107733
2021-08-19 21:53:11 +09:00
Matthias Springer 8e8b70aa84 [mlir][scf] Simplify affine.min ops after loop peeling
Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body.

affine.min ops such as:
```
map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)>
%r = affine.min #map(%iv)[%step, %ub]
```
are rewritten into (in the case of the peeled loop):
```
%r = %step
```

To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized.

Differential Revision: https://reviews.llvm.org/D107222
2021-08-19 17:24:53 +09:00
Matthias Springer 08dbed8a57 [mlir][linalg] Canonicalize dim ops of tiled_loop block args
E.g.:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %x, %c0 : tensor<...>
}
```

is rewritten to:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %y, %c0 : tensor<...>
}
```

Differential Revision: https://reviews.llvm.org/D108272
2021-08-19 11:24:33 +09:00
Aart Bik d37d72eaf8 [mlir][sparse] use shared util for DimOp generation
This shares more code with existing utilities. Also, to be consistent,
we moved dimension permutation on the DimOp to the tensor lowering phase.
This way, both pre-existing DimOps on sparse tensors (not likely but
possible) as well as compiler generated DimOps are handled consistently.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108309
2021-08-18 17:12:32 -07:00
Chia-hung Duan 41e5dbe0fa Enables inferring return types for Shape op if possible
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D102565
2021-08-18 21:36:55 +00:00
Butygin ddc3d51d58 [mlir][spirv] Add (InBounds)PtrAccessChain ops
Differential Revision: https://reviews.llvm.org/D108070
2021-08-18 17:59:21 +03:00
Lei Zhang 4c15ad2321 [mlir][linalg] Don't drop existing attributes when creating ops
Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D108219
2021-08-17 15:44:56 -04:00
Robert Suderman 65532ea6dd [mlir][linalg] Clear unused linalg tc operations
These operations are not lowered to from any source dialect and are only
used for redundant tests. Removing these named ops, along with their
associated tests, will make migration to YAML operations much more
convenient.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D107993
2021-08-16 11:55:45 -07:00
tashuang.zk 2d45e332ba [MLIR][DISC] Revise ParallelLoopTilingPass with inbound_check mode
Expand ParallelLoopTilingPass with an inbound_check mode.

In default mode, the upper bound of the inner loop is from the min op; in
inbound_check mode, the upper bound of the inner loop is the step of the outer
loop and an additional inbound check will be emitted inside of the inner loop.

This was a 'FIXME' in the original code. A typical usage is for GPU backends,
where the outer loop and inner loop can be mapped to blocks/threads separately.

Differential Revision: https://reviews.llvm.org/D105455
2021-08-16 14:02:53 +02:00
harsh-nod e33f301ec2 [mlir] Add support for moving reductions to outer most dimensions in vector.multi_reduction
The approach for handling reductions in the outermost
dimension follows that for innermost dimensions, outlined
below:

First, transpose to move reduction dims, if needed
Convert reduction from n-d to 2-d canonical form
Then, for outer reductions, we emit the appropriate op
(add/mul/min/max/or/and/xor) and combine the results.

Differential Revision: https://reviews.llvm.org/D107675
2021-08-13 12:59:50 -07:00
Tyler Augustine 3a2ff982d7 Support post-processing Ops in unrolled loop iterations
This can be useful when one needs to know which unrolled iteration an Op belongs to, for example, conveying noalias information among memory-affecting ops in parallel-access loops.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107789
2021-08-11 23:11:10 +00:00
Rob Suderman 7de439b2be [mlir][tosa] Migrate tosa to more efficient linalg.conv
The existing linalg.conv2d is not well optimized for performance. Changed to a
version that is better aligned for optimization. Include the corresponding
transposes to use this optimized version.

This also splits the conv and depthwise conv into separate implementations
to avoid overly complex lowerings.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D107504
2021-08-11 11:05:12 -07:00
Benjamin Kramer c1ebefdf77 [mlir] Make polynomial approximation emit std instead of LLVM ops
This is a bit cleaner and removes issues with 2d vectors. It also has a
big impact on constant folding, hence the test changes.

Differential Revision: https://reviews.llvm.org/D107896
2021-08-11 16:37:21 +02:00
Alex Zinenko 79b0576dd4 [mlir] Tighten LLVM_AnyNonAggregate ODS type constraint
The constraint was checking that the type is not an LLVM structure or array
type, but was not checking that it is an LLVM-compatible type, making it accept
incorrect types. As a result, some LLVM dialect ops could process values that
are not compatible with the LLVM dialect leading to further issues with
conversions and translations that assume all values are LLVM-compatible. Make
LLVM_AnyNonAggregate only accept LLVM-compatible types.

Reviewed By: cota, akuegel

Differential Revision: https://reviews.llvm.org/D107889
2021-08-11 16:30:19 +02:00
Alexander Belyaev 1e733a8c04 Revert "Bufferization for tiled loop."
This reverts commit edaffebcb2.
2021-08-11 10:04:12 +02:00
Alexander Belyaev 967578f0b8 Revert "[mlir] Change the pattern for TiledLoopOp bufferization."
This reverts commit 2f946eaa9d.
2021-08-11 10:01:36 +02:00
Rob Suderman 2b2ebb6f98 [mlir][tosa] Add folders for trivial tosa operation cases
Some folding cases are trivial to fold away, specifically no-op cases where
an operation's input and output are the same. Canonicalizing these away
removes unneeded operations.

The current version includes tensor cast operations to resolve shape
discrepancies that occur when an operation's result type differs from the
input type. These are resolved during a tosa shape propagation pass.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D107321
2021-08-10 14:43:00 -07:00
Alexander Belyaev 2f946eaa9d [mlir] Change the pattern for TiledLoopOp bufferization.
This version does not affect the patterns for Extract/InsertSliceOp and
LinalgOps.

Differential Revision: https://reviews.llvm.org/D107858
2021-08-10 21:27:02 +02:00
bakhtiyar 391456f33c Fix a bug in algebraic simplification, and enable the tests.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D107788
2021-08-10 04:15:56 -07:00
Alexander Belyaev edaffebcb2 Bufferization for tiled loop.

Differential Revision: https://reviews.llvm.org/D107762
2021-08-09 21:57:06 +02:00
Aart Bik 05c7f450df [mlir][sparse] add dense to sparse conversion implementation
Implements lowering dense to sparse conversion, for static tensor types only.
First step towards general sparse_tensor.convert support.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D107681
2021-08-09 12:12:39 -07:00
Max Kudryavtsev 0b8cb87e0d [MLIR][STD] Add safe scalar constant propagation for FPTruncOp
Perform scalar constant propagation for FPTruncOp only if the resulting value can be represented without precision loss or rounding.

Example:
%cst = constant 1.000000e+00 : f32
%0 = fptrunc %cst : f32 to bf16
-->
%cst = constant 1.000000e+00 : bf16

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107518
2021-08-06 16:31:29 -07:00
Alexander Belyaev a552debdcf [mlir] Add patterns for vector.transfer_read/write to Linalg bufferization.
Differential Revision: https://reviews.llvm.org/D107643
2021-08-06 20:24:44 +02:00
Geoffrey Martin-Noble ca6baf1e1d [MLIR][std] Introduce bitcast operation
This patch introduces a bitcast operation to the standard dialect.
RFC: https://llvm.discourse.group/t/rfc-introduce-a-bitcast-op/3774
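
A sketch of the op, assuming it follows the existing std cast syntax:

```
// Reinterpret the bits of an f32 value as i32 (same bit width required).
%0 = bitcast %f : f32 to i32
```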

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D105376
2021-08-06 08:47:51 -07:00
Adrian Kuegel d6b4993736 [mlir][MemRef] Fix canonicalization of BufferCast(TensorLoad).
CastOp::areCastCompatible does not check whether casts are definitely compatible.
When going from dynamic to static offset or stride, the canonicalization cannot
know whether it is really cast compatible. In that case, it can only canonicalize
to an alloc plus copy.

Differential Revision: https://reviews.llvm.org/D107545
2021-08-06 08:32:35 +02:00
Jacques Pienaar 9d10be70a8 [mlir] std.call reference function return types in failure
Makes it easier to see the type mismatch locally at the failure site.

Differential Revision: https://reviews.llvm.org/D107288
2021-08-05 19:51:48 -07:00
Stephen Neuendorffer 432341d8a8 [mlir] Handle cases where transfer_read should turn into a scalar load
The existing vector transforms reduce the dimension of transfer_read
ops.  However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector.  This handles this case.

Note that in the longer term, it may be preferable to support
zero-dimension vectors; see
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.

Differential Revision: https://reviews.llvm.org/D103432
2021-08-03 22:53:40 -07:00
Rob Suderman 1b00b94ffc [mlir][tosa] Tosa shape propagation for tosa.cond_if
We can propagate the shape from tosa.cond_if operands into the true/false
regions then through the connected blocks. Then, using the tosa.yield ops
we can determine what all possible return types are.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105940
2021-08-03 17:54:54 -07:00
Rob Suderman 143edeca6d [mlir][tosa] Shape inference for a few remaining easy cases:
Handles shape inference for identity, cast, and rescale. These were missed
during the initial elementwise work. This includes resize shape propagation
which includes both attribute and input type based propagation.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105845
2021-08-03 17:20:32 -07:00
Aart Bik 817303ef34 [mlir][sparse] fix bug in permuting data structure
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D107379
2021-08-03 14:27:43 -07:00
KareemErgawy-TomTom f984a805f3 [MLIR][Linalg] Extend detensoring control flow model.
This patch extends the PureControlFlowDetectionModel to consider
detensoring br and cond_br operands.

See: https://github.com/google/iree/issues/1159#issuecomment-884322687,
for a disccusion on the need for such extension.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D107358
2021-08-03 18:08:13 +02:00
Kiran Chandramohan 59989d68ba [MLIR][OpenMP] Add support for critical construct
This patch adds the critical construct to the OpenMP dialect. The
implementation models the definition in 2.17.1 of the OpenMP 5 standard.
A name and hint can be specified. The name is a global entity or has
external linkage; it is modelled as a FlatSymbolRefAttr. The hint is
modelled as an integer enum attribute.
Also lowering to LLVM IR using the OpenMP IRBuilder.
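
A sketch of the modelling (the hint value and name are illustrative):

```
omp.critical.declare @mutex hint(contended)
omp.critical(@mutex) {
  // ... statements guarded by the critical section ...
  omp.terminator
}
```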

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D107135
2021-08-03 10:50:21 +01:00
Matthias Springer 3a41ff4883 [mlir][SCF] Peel scf.for loops for even step division
Add ForLoopBoundSpecialization pass, which specializes scf.for loops into a "main loop" where `step` divides the iteration space evenly and into an scf.if that handles the last iteration.

This transformation is useful for vectorization and loop tiling. E.g., when vectorizing loads/stores, programs will spend most of their time in the main loop, in which only unmasked loads/stores are used. Only the in the last iteration (scf.if), slower masked loads/stores are used.

Subsequent commits will apply this transformation in the SparseDialect and in Linalg's loop tiling.
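
A sketch of the specialization (the split bound computation is elided):

```
scf.for %iv = %lb to %ub step %step {
  "work"(%iv) : (index) -> ()
}
// becomes a main loop over the evenly divisible part, with
// %split = %lb + %step * ((%ub - %lb) floordiv %step):
scf.for %iv = %lb to %split step %step {
  "work"(%iv) : (index) -> ()
}
// plus a guarded last iteration:
%rem = cmpi slt, %split, %ub : index
scf.if %rem {
  "work"(%split) : (index) -> ()
}
```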

Differential Revision: https://reviews.llvm.org/D105804
2021-08-03 10:21:38 +09:00
Eugene Zhulenev b537c5b414 [mlir] Async: clone constants into async.execute functions and parallel compute functions
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107007
2021-08-02 12:17:41 -07:00
Aart Bik 697ea09d47 [mlir][sparse] add sparse tensor type conversion operation
Introduces a conversion from one (sparse) tensor type to another
(sparse) tensor type. See the operation doc for details. Actual
codegen for all cases is still TBD.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D107205
2021-07-31 12:53:31 -07:00
Nicolas Vasilache 14c1450d5c [mlir][Vector] Add vector to outerproduct lowering for the [reduction, parallel] case.
Differential Revision: https://reviews.llvm.org/D105373
2021-07-30 14:32:57 +00:00
Amy Zhuang a8b7e56f65 [mlir] Set insertion point of vector constant to the top of the vectorized loop body
When we vectorize a scalar constant, the vector constant is inserted before its
first user if the scalar constant is defined outside the loops to be vectorized.
It is possible that the vector constant does not dominate all its users. To fix
the problem, we find the innermost vectorized loop that encloses that first user
and insert the vector constant at the top of the loop body.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D106609
2021-07-29 15:42:23 -07:00
Rob Suderman 2d0ba5e144 [mlir][tosa] Fix tosa.reshape failures due to implicit broadcasting
The tosa-make-broadcastable pass needs the output shape to determine whether the
operation includes additional broadcasting. Include some canonicalizations for
TOSA to remove unneeded reshape ops.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D106846
2021-07-29 15:21:57 -07:00
Yi Zhang 9a82482313 [mlir][linalg] Fix pad tensor cast folding with changed type
`PadTensorOp` has verification logic to make sure
a result dim is static if all the padding values are static.
Cast folding might add more static information for the src operand
of `PadTensorOp`, which might turn a valid operation into an invalid one.
Change the canonicalizing pattern to fix this.
2021-07-29 17:47:01 -04:00
bakhtiyar 1c144410e7 Refactor AsyncToAsyncRuntime pass to boost understandability.
Depends On D106730

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106731
2021-07-29 12:01:07 -07:00
bakhtiyar 9a5bc83660 Add an escape-hatch for conversion of funcs with blocking awaits to coroutines.
Currently TFRT does not support top-level coroutines, so this functionality will allow to have a single blocking await at the top level until TFRT implements the necessary functionality.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106730
2021-07-29 08:52:28 -07:00
River Riddle f8479d9de5 [mlir] Set the namespace of the BuiltinDialect to 'builtin'
Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to cleanup some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though)

Differential Revision: https://reviews.llvm.org/D105149
2021-07-28 21:00:10 +00:00
bakhtiyar 6ea22d4626 Optionally eliminate blocking runtime.await calls by converting functions to coroutines.
Interop parallelism requires awaiting on results. Blocking awaits are bad for performance. TFRT supports lightweight resumption on threads, and coroutines are an abstraction that can be used to lower the kernels onto TFRT threads.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106508
2021-07-28 12:37:05 -07:00
Alex Zinenko c1f719d1a7 [mlir] harden result type verification in llvm.call
The verifier of the llvm.call operation was not checking for mismatches between
the number of operation results and the number of results in the signature of
the callee. Furthermore, it was possible to construct an llvm.call operation
producing an SSA value of !llvm.void type, which should not exist. Add the
verification and treat !llvm.void result type as absence of call results.
Update the GPU conversions to LLVM that were mistakenly assuming that it was
fine for llvm.call to produce values of !llvm.void type and ensure these calls
do not produce results.
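
A sketch of what is now rejected vs. accepted (syntax of the period
assumed):

```
llvm.func @f()                             // callee returning void
%0 = llvm.call @f() : () -> !llvm.void     // rejected: no !llvm.void values
llvm.call @f() : () -> ()                  // accepted: void call, no results
```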

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106937
2021-07-28 18:15:56 +02:00
Lei Zhang 23326b9f17 [mlir][spirv] Fix a few issues in ModuleCombiner
- Fixed symbol insertion into `symNameToModuleMap`. Insertion
  needs to happen whether symbols are renamed or not.
- Added check for the VCE triple and avoid dropping it.
- Disabled function deduplication. It requires more careful
  rules. Right now it can remove different functions.
- Added tests for symbol rename listener.
- And some other code/comment cleanups.

Reviewed By: ergawy

Differential Revision: https://reviews.llvm.org/D106886
2021-07-28 10:31:01 -04:00
Tobias Gysi ca0d244e99 [mlir][linalg] Introduce a separate EraseIdentityCopyOp Pattern.
Split out an EraseIdentityCopyOp from the existing RemoveIdentityLinalgOps pattern. Introduce an additional check to ensure the pattern checks the permutation maps match. This is a preparation step to specialize RemoveIdentityLinalgOps to GenericOp only.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105622
2021-07-28 11:18:22 +00:00
Yi Zhang 8ed66cb88b [mlir][memref] Fix collapsed shape ops memref.cast folding with changed type
`memref.collapse_shape` has verification logic to make sure
a result dim is static if all the collapsing src dims are static.
Cast folding might add more static information for the src operand
of `memref.collapse_shape` which might change a valid collapsing
operation to be invalid. Add `CollapseShapeOpMemRefCastFolder` pattern
to fix this.

Minor changes to `convertReassociationIndicesToExprs` to use `context`
instead of `builder` to avoid extra steps to construct temporary
builders.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D106670
2021-07-28 10:19:20 +00:00
River Riddle ddd8482117 [PDL] Remove RewriteEndOp and mark RewriteOp as NoTerminator
RewriteEndOp was a fake terminator operation that is no longer needed now that blocks are not required to have terminators.

Differential Revision: https://reviews.llvm.org/D106911
2021-07-27 20:45:10 +00:00
Eugene Zhulenev d94426d22a [mlir] Math: add algebraic simplification patterns to math transforms
Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D106822
2021-07-27 09:22:33 -07:00
Aart Bik c2415d67a5 [mlir][sparse] fixed bug in verification
The order of testing in two sparse tensor ops was incorrect,
which could cause an invalid cast (crashing the compiler instead
of reporting the error). This revision fixes that bug.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D106841
2021-07-27 08:49:21 -07:00
Marcel Koester 0425332015 [mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<ReturnLike>.
This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be
passed to successor regions. Similar to the BranchOpInterface, it allows to freely define the
involved operands. However, in contrast to the BranchOpInterface, it expects an additional region
number to distinguish between various use cases which might require different operands passed to
different regions.

Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and
getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equipped with the
ReturnLike trait and/or implementing the newly added interface.  This simplifies reasoning about
terminators in the scope of the nested regions.

We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities.

Differential Revision: https://reviews.llvm.org/D105018
2021-07-26 06:39:31 +02:00
Eugene Zhulenev de7a4e53a2 [mlir] Async: lower SCF operations into CFG inside coroutines
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106747
2021-07-24 14:36:26 -07:00
Yi Zhang deebf18512 [mlir][linalg] Add pooling_nchw_max, conv_2d_nchw as yaml ops.
- Add pooling_nchw_max.
- Move conv_2d_nchw to yaml ops and add strides and dilation attributes.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D106658
2021-07-23 17:37:15 +00:00
thomasraoux 73a9d6d0e2 [mlir][linalg] Fix bug in contraction op vectorization with output perm
When the output indexing map has a permutation, we need to account for
it in the contraction vector type.

Differential Revision: https://reviews.llvm.org/D106469
2021-07-23 08:39:43 -07:00
Eugene Zhulenev 6c1f655818 [mlir] Async: special handling for parallel loops with zero iterations
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106590
2021-07-23 01:22:59 -07:00
Nicolas Vasilache 06d2fb55ca [mlir][Linalg] Fix a missing copy when source of insert_slice is not inplace.
When the source tensor of a tensor.insert_slice is not equivalent to an in-place buffer, an extra copy is necessary. This revision adds the missing copy.
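
A sketch of the situation (illustrative shapes and offsets, not taken
from the revision):

    %r = tensor.insert_slice %src into %dest[0, 0] [4, 4] [1, 1]
        : tensor<4x4xf32> into tensor<8x8xf32>
    // If %src does not bufferize in-place into the buffer backing %dest,
    // bufferization must materialize a copy of %src into the destination
    // subview instead of aliasing it.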

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D106587
2021-07-23 07:41:45 +00:00
Rob Suderman cf8a1f6208 [mlir][tosa] Quantized Conv2DOp lowering to linalg added.
Includes a version of a quantized conv2D operation with a lowering from TOSA
to linalg and a corresponding test. We keep the quantized and non-quantized
variants as separate named ops to avoid the additional operations for
non-quantized convolutions.

Differential Revision: https://reviews.llvm.org/D106407
2021-07-22 15:42:26 -07:00
Rahul Joshi f8d3755f00 [MLIR][memref] Fix findDealloc() to handle > 1 dealloc for the given alloc.
- Change findDealloc() to return Optional<Operation *> and return None if > 1
  dealloc is associated with the given alloc.
- Add findDeallocs() to return all deallocs associated with the given alloc.
- Fix current uses of findDealloc() to bail out if > 1 dealloc is found.

Differential Revision: https://reviews.llvm.org/D106456
2021-07-22 09:34:19 -07:00
Uday Bondhugula 795e726f5f [MLIR] Fix affine.for empty loop body folder
Fix affine.for empty loop body folder in the presence of yield values.
The existing pattern ignored iter_args/yield values and thus crashed
when yield values had uses.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106121
2021-07-22 08:11:40 +05:30
thomasraoux 45cb4140eb [mlir] Extend scf pipeling to support loop carried dependencies
Differential Revision: https://reviews.llvm.org/D106325
2021-07-21 18:32:38 -07:00
Mehdi Amini 5a8a159bf5 Add verifier for insert/extract element/value checking the type match between the container and the inserted/extracted value, and fix vector.shuffle lowering
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D106398
2021-07-21 22:28:59 +00:00
Uday Bondhugula 104fad99c9 [MLIR] Add folder for zero trip count affine.for
AffineForOp's folding hook is expected to fold away trivially empty
affine.for.  This allows simplification to happen as part of the
canonicalizer and from wherever the folding hook is used. While more
complex analysis based zero trip count detection is available from other
passes in analysis and transforms, simple and inexpensive folding had
been missing.

Also, update/improve affine.for op documentation clarifying semantics of
the result values for zero trip count loops.
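
For instance, a hand-written sketch (std ops spelled as of this
revision) of the folding in the presence of yield values:

    %res = affine.for %i = 0 to 0 iter_args(%acc = %init) -> (f32) {
      %next = addf %acc, %acc : f32
      affine.yield %next : f32
    }
    // The body never executes, so the folder replaces all uses of %res
    // with %init.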

Differential Revision: https://reviews.llvm.org/D106123
2021-07-21 20:28:35 +05:30
Uday Bondhugula 7932d21f5d [MLIR] Introduce a new rewrite driver to simplify supplied list of ops
Introduce a new rewrite driver (MultiOpPatternRewriteDriver) to rewrite
a supplied list of ops (and, optionally, ops affected by their
rewrites). Provide a knob to restrict rewrites strictly to those ops or
also to affected ops (but still not to completely unrelated ops).

This rewrite driver is commonly needed to run any simplification and
cleanup at the end of a transforms pass or transforms utility in a way
that only simplifies relevant IR. This makes it easy to write test cases
while not performing unrelated whole IR simplification that may
invalidate other state at the caller.

The introduced utility provides more freedom to developers of transforms
and transform utilities to perform focused and local simplification. In
several cases, it provides greater efficiency as well as more
simplification when compared to repeatedly calling
`applyOpPatternsAndFold`; in other cases, it avoids the need to
undesirably call `applyPatternsAndFoldGreedily` to do unrelated
simplification in a FuncOp.

Update a few transformations that were earlier using
applyOpPatternsAndFold (SimplifyAffineStructures,
affineDataCopyGenerate, a linalg transform).

TODO:
- OpPatternRewriteDriver can be removed as it's a special case of
  MultiOpPatternRewriteDriver, i.e., both can be merged.

Differential Revision: https://reviews.llvm.org/D106232
2021-07-21 20:25:16 +05:30
Tobias Gysi 3396377743 [linalg] Add TensorDimOp to list of ops known by bufferization.
Bufferization handles all unknown ops conservatively. The patch ensures that accessing the dimension of an output tensor does not prevent in-place bufferization.
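
For example, a sketch (the constant op is spelled as in the std dialect
of the time):

    %c0 = constant 0 : index
    %d = tensor.dim %out, %c0 : tensor<?xf32>
    // Querying a dimension of the output tensor %out is a read-only use
    // and should not force %out to bufferize out of place.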

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D106356
2021-07-20 12:44:13 +00:00
thomasraoux f6f88e66ce [mlir] Add software pipelining transformation for scf.For op
This is the first step to support software pipeline for scf.for loops.
This is only the transformation to create pipelined kernel and
prologue/epilogue.
The schedule needs to be given by the user, as many different algorithms
and heuristics could be applied.
This currently doesn't handle loop arguments; this will be added in a
follow-up patch.

Differential Revision: https://reviews.llvm.org/D105868
2021-07-19 13:43:26 -07:00
MaheshRavishankar 5994201c8e [mlir][Linalg] NFC: Rename FusionOfTensors pass to FusionOfElementwiseOps pass.
This makes it more explicit what the scope of this pass is. The name
of this pass predates fusion on tensors using tile + fuse, and hence
the confusion.

Differential Revision: https://reviews.llvm.org/D106132
2021-07-19 12:51:26 -07:00
Rob Suderman 11dda1a234 [mlir][tosa] Added shape inference for tosa convolution operations
Added shape inference handling for convolution operations. This includes
conv2d, conv3d, depthwise_conv2d, and transpose_conv2d. For transpose_conv2d
we use the specified output shape when possible; however, we fall back to
shape propagation if the output shape attribute has dynamic values.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105645
2021-07-19 10:41:56 -07:00
Hanhan Wang 9c49195330 [mlir][Linalg] Migrate 2D pooling ops from tc definition to yaml definition.
This deletes all the pooling ops in LinalgNamedStructuredOpsSpec.tc. All the
uses are replaced with the yaml pooling ops.

Reviewed By: gysit, rsuderman

Differential Revision: https://reviews.llvm.org/D106181
2021-07-19 09:24:02 -07:00
Tobias Gysi 87656a3134 [mlir][linalg] Fold TensorCast into PadTensorOp.
Add a pattern to fold a TensorCast into a PadTensorOp if the cast removes static size information.
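
A sketch of the fold (illustrative shapes, not from the revision's
tests):

    %0 = tensor.cast %t : tensor<4x?xf32> to tensor<?x?xf32>
    %1 = linalg.pad_tensor %0 low[0, 0] high[%h0, %h1] {
    ^bb0(%i: index, %j: index):
      linalg.yield %cst : f32
    } : tensor<?x?xf32> to tensor<?x?xf32>
    // The cast only erases the static size 4, so the pad can consume %t
    // directly and keep the extra static information.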

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D106278
2021-07-19 15:57:38 +00:00
Alexander Belyaev 58ddeba3e0 Revert "[mlir] Introduce `linalg.tiled_yield` terminator for `linalg.tiled_loop`."
This reverts commit 3b03d9b874.
2021-07-19 14:19:49 +02:00
Alexander Belyaev 3b03d9b874 [mlir] Introduce `linalg.tiled_yield` terminator for `linalg.tiled_loop`.
https://llvm.discourse.group/t/rfc-changes-to-linalg-tiledloopop-to-unblock-reductions/3890

Differential Revision: https://reviews.llvm.org/D106066
2021-07-19 14:16:03 +02:00
Tobias Gysi 3f8f292330 [mlir][linalg] Set explicit insertion point in pad_tensor patterns.
Insert ops replacing pad_tensor in front of the associated transfer_write / insert_slice op. Otherwise we may end up with invalid IR if one of the remaining transfer_write / insert_slice operands is defined after the pad_tensor op.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D106162
2021-07-19 08:43:56 +00:00
Matthias Springer d1a9e9a7cb [mlir][vector] Remove vector.transfer_read/write to LLVM lowering
This simplifies the vector to LLVM lowering. Previously, both vector.load/store and vector.transfer_read/write lowered directly to LLVM. With this commit, there is a single path to LLVM vector load/store instructions and vector.transfer_read/write ops must first be lowered to vector.load/store ops.

* Remove vector.transfer_read/write to LLVM lowering.
* Allow non-unit memref strides on all but the most minor dimension for vector.load/store ops.
* Add maxTransferRank option to populateVectorTransferLoweringPatterns.
* vector.transfer_reads with changing element type can no longer be lowered to LLVM. (This functionality is needed only for SPIRV.)

Differential Revision: https://reviews.llvm.org/D106118
2021-07-17 14:07:27 +09:00
Matthias Springer 4a3defa629 [mlir][vector] Refactor TransferReadToVectorLoadLowering
* TransferReadToVectorLoadLowering no longer generates memref.load ops.
* Add new pattern VectorLoadToMemrefLoadLowering that lowers scalar vector.loads to memref.loads.
* Add vector::BroadcastOp canonicalization pattern that folds broadcast chains.

Differential Revision: https://reviews.llvm.org/D106117
2021-07-17 13:53:09 +09:00
Alex Zinenko 881dc34f73 [mlir] replace llvm.mlir.cast with unrealized_conversion_cast
The dialect-specific cast between builtin (ex-standard) types and LLVM
dialect types was introduced a long time before built-in support for
unrealized_conversion_cast. It has a similar purpose, but is restricted
to compatible builtin and LLVM dialect types, which may hamper
progressive lowering and composition with types from other dialects.
Replace llvm.mlir.cast with unrealized_conversion_cast, and drop the
operation that became unnecessary.

Also make unrealized_conversion_cast legal by default in
LLVMConversionTarget, as the majority of conversions using it are
partial conversions that actually want the casts to persist in the IR.
The standard-to-llvm conversion, which is still expected to run last,
cleans up the remaining casts.
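
A sketch of the replacement (illustrative types):

    // Before: %1 = llvm.mlir.cast %0 : index to i64
    // After:
    %1 = builtin.unrealized_conversion_cast %0 : index to i64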

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105880
2021-07-16 15:14:09 +02:00
Alexander Belyaev 46ef86b5d8 [mlir] Move linalg::Expand/CollapseShapeOp to memref dialect.
RFC: https://llvm.discourse.group/t/rfc-reshape-ops-restructuring/3310

Differential Revision: https://reviews.llvm.org/D106141
2021-07-16 13:32:17 +02:00
Alex Zinenko e4b79a542e [mlir] add an interface to support custom types in LLVM dialect pointers
This may be necessary in partial multi-stage conversion when a container type
from dialect A containing types from dialect B goes through the conversion
where only dialect A is converted to the LLVM dialect. We will need to keep a
pointer-to-non-LLVM type in the IR until a further conversion can convert
dialect B types to LLVM types.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D106076
2021-07-16 13:05:27 +02:00
Weiwei Li 108a320a58 [mlir][spirv] Add support for GLSL FMix
Add the spv.GLSL.FMix operation.

Co-authored-by: Alan Liu <alanliu.yf@gmail.com>

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D104153
2021-07-16 08:29:46 +08:00
Aart Bik 2b6e433230 [mlir][sparse] add shift ops support
Arbitrary shifts have some complications, but shifts by invariants
(viz. a tensor index expression only at the left-hand side) can be easily
handled with the conjunctive rule.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D106002
2021-07-15 09:43:12 -07:00
Aart Bik 68ac2e53ff [mlir][sparse] replace linalg.copy with memref.copy
Note, this revision relies on the following revision
for a bugfix in the memref copy library in order for
all sparse integration tests to pass.

https://reviews.llvm.org/D106036

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D106038
2021-07-15 07:56:50 -07:00
Nicolas Vasilache 01bdb0f75e [mlir][linalg] Improve implementation of hoist padding.
Instead of relying on ad hoc bounds calculations, use a projection-based
implementation. This simplifies the implementation and finds more static
constant sizes than previously.

Differential Revision: https://reviews.llvm.org/D106054
2021-07-15 12:10:31 +00:00
Mehdi Amini 0f9e6451a8 Defend early against operation created without a registered dialect
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105961
2021-07-15 03:52:32 +00:00
Mehdi Amini 3e25ea709c Revert "Defend early against operation created without a registered dialect"
This reverts commit 58018858e8.

The Python bindings tests are broken.
2021-07-15 03:31:44 +00:00
Mehdi Amini 58018858e8 Defend early against operation created without a registered dialect
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105961
2021-07-15 03:02:52 +00:00
Matthias Springer a0e02018be [mlir][linalg] Improve codegen when tiling PadTensor evenly
Produce simpler IR with more static type information and fewer affine expressions.

Differential Revision: https://reviews.llvm.org/D105530
2021-07-15 11:29:21 +09:00
Matthias Springer 318ce4ad92 [mlir][linalg] Improve codegen of ExtractSliceOfPadTensorSwapPattern
Generate simpler code in case low/high padding of the PadTensorOp is statically zero.

Differential Revision: https://reviews.llvm.org/D105529
2021-07-15 11:05:55 +09:00
Matthias Springer 4064b6a363 [mlir][linalg] Tile PadTensorOp
Tiling can be enabled with `linalg-tile-pad-tensor-ops`. Only scf::ForOp can be generated at the moment.

Differential Revision: https://reviews.llvm.org/D105460
2021-07-15 10:42:32 +09:00
Matthias Springer 5da010af9a [mlir][linalg] Add optional output operand to PadTensorOp
This optional operand will be used for tiling in a subsequent commit.

Differential Revision: https://reviews.llvm.org/D105459
2021-07-15 10:20:36 +09:00
Nicolas Vasilache df538fdaa9 [mlir][affine] Add single result affine.min/max -> affine.apply canonicalization.
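A hand-written sketch of the canonicalization (syntax approximate):

    %0 = affine.min affine_map<()[s0] -> (s0 + 2)>()[%s]
    // A single-result map has nothing to take the minimum over, so this
    // becomes:
    %0 = affine.apply affine_map<()[s0] -> (s0 + 2)>()[%s]
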
Differential Revision: https://reviews.llvm.org/D106014
2021-07-14 20:38:06 +00:00
Matthias Springer b70dde522d [mlir][linalg] Fix typo in ExtractSliceOfPadTensorSwapPattern
Differential Revision: https://reviews.llvm.org/D105607
2021-07-14 22:30:32 +09:00
Butygin a36e9ee09d [mlir][SCF] populateSCFStructuralTypeConversionsAndLegality WhileOp support
Differential Revision: https://reviews.llvm.org/D105923
2021-07-14 12:43:04 +03:00
MaheshRavishankar f2b5e438aa [mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`.
Differential Revision: https://reviews.llvm.org/D105852
2021-07-13 14:53:29 -07:00
Aart Bik 123e8dfcf8 [mlir][sparse] add support for std unary operations
Adds zero-preserving unary operators from std. Also adds xor.
Performs minor refactoring to remove the "zero" node, and pushes
the irregular logic for negi (not supported in std) into one place.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D105928
2021-07-13 14:51:13 -07:00
thomasraoux ae4cea38f1 [mlir] Add support for tensor.extract to comprehensive bufferization
Differential Revision: https://reviews.llvm.org/D105870
2021-07-13 09:54:46 -07:00
Nicolas Vasilache af55335924 [mlir][Linalg] Better support for bufferizing non-tensor results.
Clean up corner cases related to elemental tensor / buffer type return values that would previously fail.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D105857
2021-07-13 10:27:40 +00:00
Nicolas Vasilache e312fc49ae [mlir][Linalg] Add layout specification support to bufferization.
Previously, linalg bufferization always had to be conservative at function boundaries and assume the most dynamic strided memref layout.
This revision introduces the mechanism to specify a linalg.buffer_layout function argument attribute that carries an affine map used to set a less pessimistic layout.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D105859
2021-07-13 10:22:18 +00:00
Aart Bik 45b3cfe843 [mlir][sparse] add support for AND and OR operations
Integral AND and OR follow the simple conjunction and disjunction rules
for lattice building. This revision also completes some of the Merger
refactoring by moving the remaining merger-specific parts from
sparsification into the utils files.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D105851
2021-07-12 17:47:18 -07:00
Hanhan Wang 50529affcd [mlir][Linalg] Add 3D pooling named ops to Linalg.
Reviewed By: gysit, hanchung

Differential Revision: https://reviews.llvm.org/D105329
2021-07-12 17:26:02 -07:00
Rob Suderman f2832c2295 [mlir][tosa] Added shape propagation for TOSA pool operations.
Pooling operations perform the same shape propagation. Included the shape
propagation and tests for both avg_pool2d and max_pool2d.
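
For example, a sketch with made-up attribute values:

    %0 = "tosa.max_pool2d"(%arg0) {kernel = [2, 2], stride = [2, 2], pad = [0, 0, 0, 0]}
        : (tensor<1x32x32x8xf32>) -> tensor<?x?x?x?xf32>
    // Shape inference can refine the result type to tensor<1x16x16x8xf32>.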

Differential Revision: https://reviews.llvm.org/D105665
2021-07-12 15:40:49 -07:00
Aart Bik 622eb169f6 [mlir][sparse] add restrictive versions of division support
Right now, we only accept x/c with nonzero c, since this
conceptually can be treated as an x*(1/c) conjunction for both
FP and INT as far as lattice computations go. The codegen
keeps the division though to preserve precise semantics.

See discussion:
https://llvm.discourse.group/t/sparse-tensors-in-mlir/3389/28

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D105731
2021-07-12 14:59:48 -07:00
Rob Suderman 5a4e776010 [mlir][tosa] Added more shape inference for tosa ops
Added shape inference for:
- scatter
- gather
- transpose
- slice
- pad
- concat
- reduction operations

Also updated reshape for more aggressive shape inference.

Differential Revision: https://reviews.llvm.org/D105383
2021-07-12 10:04:49 -07:00
Nicolas Vasilache 6b1668397f [mlir][Linalg] Improve comprehensive bufferization for scf.yield.
Previously, comprehensive bufferization of scf.yield did not have enough information
to detect whether an enclosing scf::for bbarg would bufferize to a buffer equivalent
to that of the matching scf::yield operand.
As a consequence, a separate sanity check step would be required to determine whether
bufferization occurred properly.
This late check would miss the case of calling a function in a loop.

Instead, we now pass and update aliasInfo during bufferization, making it possible to
improve bufferization of scf::yield and drop that post-pass check.

Add an example use case that was failing previously.
This slightly modifies the error conditions, which are also updated as part of this
revision.

Differential Revision: https://reviews.llvm.org/D105803
2021-07-12 10:36:25 +00:00
Jacques Pienaar 51cbe4e587 [mlir] Fix broadcasting check with 1 values
The trait was inconsistent with the other broadcasting logic here. Also
fix the printing here to use ? rather than -1 in the error message.

Differential Revision: https://reviews.llvm.org/D105748
2021-07-11 20:41:33 -07:00
Alex Zinenko c282d55a38 [mlir] add support for reductions in OpenMP WsLoopOp
Use a modeling similar to SCF ParallelOp to support arbitrary parallel
reductions. The two main differences are: (1) reductions are named and declared
beforehand similarly to functions using a special op that provides the neutral
element, the reduction code and optionally the atomic reduction code; (2)
reductions go through memory instead because this is closer to the OpenMP
semantics.

See https://llvm.discourse.group/t/rfc-openmp-reduction-support/3367.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D105358
2021-07-09 17:54:20 +02:00
Yi Zhang 7c35aae35b Mark TensorDialect legal and PadTensor op illegal
`GeneralizePadTensorOpPattern` might generate a `tensor.dim` op, so the
TensorDialect should be marked legal. This pattern should also
transform all `linalg.pad_tensor` ops, so mark those as illegal. These
changes were missed in a previous change:
https://reviews.llvm.org/D105293

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D105642
2021-07-08 15:02:22 -07:00
Nicolas Vasilache 4747e1b83b [mlir][Linalg] Fix tensor.extract_slice(linalg.init_tensor) canonicalization for rank-reducing extract
Differential Revision: https://reviews.llvm.org/D105636
2021-07-08 18:13:51 +00:00
Nicolas Vasilache 31f80393bc Revert "[mlir][MemRef] Fix DimOp folding of OffsetSizeAndStrideInterface."
This reverts commit 6c0fd4db79.

This simple implementation is unfortunately not extensible and needs to be reverted.
The extensible way should be to extend https://reviews.llvm.org/D104321.
2021-07-08 10:09:00 +00:00
Tobias Gysi abfa950d86 [mlir][linalg][python] Add exp and log to the OpDSL.
Introduce the exp and log function in OpDSL. Add the soft plus operator to test the emitted IR in Python and C++.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105420
2021-07-08 08:48:23 +00:00
Nicolas Vasilache 6c0fd4db79 [mlir][MemRef] Fix DimOp folding of OffsetSizeAndStrideInterface.
This addresses the issue reported in

https://llvm.discourse.group/t/rank-reducing-memref-subview-offsetsizeandstrideopinterface-interface-issues/3805

Differential Revision: https://reviews.llvm.org/D105558
2021-07-08 08:30:24 +00:00
Tobias Gysi 511af1b1ad [mlir][linalg] Tighter StructuredOp Verification.
Verify that the number of results exactly matches the number of output tensors. Simplify the FillOp verification since part of it became redundant.

Differential Revision: https://reviews.llvm.org/D105427
2021-07-08 06:53:36 +00:00
thomasraoux 291025389c [mlir][vector] Refactor Vector Unrolling and remove Tuple ops
Simplify the vector unrolling pattern to be more aligned with the rest
of the patterns and closer to vector distribution.
The new implementation uses ExtractStridedSlice/InsertStridedSlice
instead of the Tuple ops. After this change the Tuple-based ops no
longer have any uses, so they can be removed.

This allows removing a significant amount of dead code and will allow
extending the unrolling code going forward.

Differential Revision: https://reviews.llvm.org/D105381
2021-07-07 11:11:26 -07:00
Nicolas Vasilache d0b282e10b [mlir][Linalg] Rewrite PadTensorOp to enable its comprehensive bufferization.
Add the rewrite of PadTensorOp to InitTensor + InsertSlice before the
bufferization analysis starts.

This is exercised via a more advanced integration test.

Since the new behavior triggers folding, 2 tests need to be updated.
One of those seems to exhibit a folding issue with `switch` and is modified.

Differential Revision: https://reviews.llvm.org/D105549
2021-07-07 12:39:22 +00:00
Yi Zhang 35df2f6fbd Refactor GenericPadTensorOpVectorizationPattern
Refactor the original code to rewrite a PadTensorOp into a
sequence of InitTensorOp, FillOp and InsertSliceOp without
vectorization by default. `GenericPadTensorOpVectorizationPattern`
provides a customized OptimizeCopyFn to vectorize the
copying step.
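
Roughly, the default rewrite produces a sequence like the following
(a sketch with illustrative SSA names; the exact linalg.fill form may
differ by revision):

    %init = linalg.init_tensor [%d0, %d1] : tensor<?x?xf32>
    %fill = linalg.fill(%pad_value, %init)
        : f32, tensor<?x?xf32> -> tensor<?x?xf32>
    %res = tensor.insert_slice %src into %fill[%lo0, %lo1] [%s0, %s1] [1, 1]
        : tensor<?x?xf32> into tensor<?x?xf32>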

Reviewed By: silvas, nicolasvasilache, springerm

Differential Revision: https://reviews.llvm.org/D105293
2021-07-07 11:44:32 +00:00
Nicolas Vasilache 9a0af63d05 [mlir][Linalg] Proper handling of ForOp and TiledLoopOp
The `bufferizesToMemoryRead` condition was too optimistic in the case
of operands that map to a block argument.
This is the case for ForOp and TiledLoopOp.
For such ops, forward the call to all uses of the matching BBArg.

Differential Revision: https://reviews.llvm.org/D105540
2021-07-07 11:34:05 +00:00
Adrian Kuegel 6e80e3bd1b Add Log1pOp to complex dialect.
Also add a lowering pattern from Complex to Standard/Math dialect.

Differential Revision: https://reviews.llvm.org/D105538
2021-07-07 11:33:54 +02:00
Nicolas Vasilache 0c4e538d8f [mlir][Linalg] Add an InitTensor -> DimOp canonicalization pattern.
Differential Revision: https://reviews.llvm.org/D105537
2021-07-07 08:44:54 +00:00
Srishti Srivastava 0c1a7730f5 [MLIR] Simplify affine.if having yield values and trivial conditions
When an affine.if operation is returning/yielding results and has a
trivially true or false condition, then its 'then' or 'else' block,
respectively, is promoted to the affine.if's parent block and then, the
affine.if operation is replaced by the correct results/yield values.
Relevant test cases are also added.
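
A hand-written sketch of the trivially-true case (syntax approximate):

    %r = affine.if affine_set<() : (1 >= 0)>() -> f32 {
      affine.yield %a : f32
    } else {
      affine.yield %b : f32
    }
    // The set is always true, so the 'then' block is promoted and %r is
    // replaced by %a.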

Signed-off-by: Srishti Srivastava <srishti.srivastava@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D105418
2021-07-07 13:02:10 +05:30
Uday Bondhugula 715137d0c8 [MLIR] Fix memref get constant bound size and shape method
Fix FlatAffineConstraints::getConstantBoundOnDimSize to ensure that
returned bounds on dim size are always non-negative, regardless of the
constraints on that dimension. Add an assertion at the use site.

Differential Revision: https://reviews.llvm.org/D105171
2021-07-05 23:00:41 +05:30
Simon Camphausen 4ff440b0ef [mlir] Change custom syntax of emitc.include op to resemble C
This changes the custom syntax of the emitc.include operation for standard includes.

Reviewed By: marbre

Differential Revision: https://reviews.llvm.org/D105281
2021-07-05 16:40:05 +02:00
Aart Bik b8a021dbe3 [mlir][sparse] support for negation and subtractions
This revision extends the sparse compiler support from fp/int addition and multiplication to fp/int negation and subtraction, thereby increasing the scope of sparse kernels that can be compiled.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D105306
2021-07-02 15:55:05 -07:00
MaheshRavishankar cdf7b661c2 [mlir][Linalg] Fix incorrect logic in deciding when to fuse reshapes by linearization.
Fusion by linearization should not happen when
- The reshape is expanding and it is a consumer
- The reshape is collapsing and is a producer.

The bug introduced in this logic by some recent refactoring resulted
in a crash.
To enforce this (negative) use case, add a test that reproduces the
error and verifies the fix.

Differential Revision: https://reviews.llvm.org/D104970
2021-07-02 11:16:21 -07:00
Tobias Gysi f239026f89 [mlir][linalg][python] Add min operation in OpDSL.
Add the min operation to OpDSL and introduce a min pooling operation to test the implementation. The patch is a sibling of the max operation patch https://reviews.llvm.org/D105203 and the min operation is again lowered to a compare and select pair.

Differential Revision: https://reviews.llvm.org/D105345
2021-07-02 16:27:30 +00:00
Nicolas Vasilache ad0050c607 [mlir][Linalg] Add comprehensive bufferization support for TiledLoopOp (14/n)
Differential Revision: https://reviews.llvm.org/D105335
2021-07-02 14:21:08 +00:00
Adrian Kuegel 791ddb79f1 Add LogOp to Complex dialect.
Differential Revision: https://reviews.llvm.org/D105337
2021-07-02 13:15:47 +02:00
Tobias Gysi 3b95400f78 [mlir][linalg][python] Add max operation in OpDSL
Add the max operation to the OpDSL and introduce a max pooling operation to test the implementation. As MLIR has no builtin max operation, the max function is lowered to a compare and select pair.
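
The emitted pair is roughly (a sketch using the std ops of the time;
the predicate choice is illustrative):

    %0 = cmpf ogt, %a, %b : f32
    %1 = select %0, %a, %b : f32
    // %1 yields max(%a, %b) for the ordered greater-than predicate.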

Differential Revision: https://reviews.llvm.org/D105203
2021-07-02 07:12:37 +00:00
Matthias Springer e895a670f8 [mlir] Move BufferizeDimOp to Tensor/Transforms/Bufferize.cpp
Differential Revision: https://reviews.llvm.org/D105256
2021-07-02 10:05:59 +09:00
Rob Suderman 8dea784b3e [mlir][tosa] Add tosa shape inference with InferReturnTypeComponent
Added InferReturnTypeComponents for NAry operations, reshape, and reverse.
With the additional tosa-infer-shapes pass, we can infer/propagate shapes
across a set of TOSA operations. The current version does not modify the
FuncOp type; instead, it inserts an unrealized conversion cast prior to
any new non-matching returns.

Differential Revision: https://reviews.llvm.org/D105312
2021-07-01 16:04:26 -07:00