Commit Graph

1699 Commits

Author SHA1 Message Date
Matthias Springer fb1def9c66 [mlir][linalg] New tiling option: Scalarize dynamic dims
This tiling option scalarizes all dynamic dimensions, i.e., it tiles all dynamic dimensions by 1.

This option is useful for linalg ops with partly dynamic tensor dimensions. E.g., such ops can appear in the partial iteration after loop peeling. After scalarizing dynamic dims, those ops can be vectorized.

Differential Revision: https://reviews.llvm.org/D109268
2021-09-14 10:40:50 +09:00
Matthias Springer 8faf35c0a5 [mlir][linalg] Add scf.for loop peeling to codegen strategy
Only scf.for loops are supported at the moment. linalg.tiled_loop support will be added in a subsequent commit.

Only static tensor sizes are supported. Loops for dynamic tensor sizes can be peeled, but the generated code is not optimal due to a missing canonicalization pattern.

Differential Revision: https://reviews.llvm.org/D109043
2021-09-14 10:35:01 +09:00
Matthias Springer a4a654d301 [mlir][linalg] TiledLoopOp peeling: Do not peel partial iterations
Extend the unit test with an option for skipping partial iterations during loop peeling.

Differential Revision: https://reviews.llvm.org/D109640
2021-09-14 10:01:46 +09:00
Nicolas Vasilache 181d18ef53 [mlir][Linalg] Insert static buffers as high as possible during ComprehensiveBufferization.
This revision allows hoisting static alloc/dealloc pairs as high as possible during ComprehensiveBufferization.
This also aligns such allocated buffers to 128B by default.

This change exhibited some issues wrt insertion points and a missing copy that are also fixed in this revision; tests are updated accordingly.

Differential Revision: https://reviews.llvm.org/D109684
2021-09-13 15:59:03 +00:00
Nicolas Vasilache b01d223faf [mlir][Linalg] Use reify for padded op shape derivation.
Previously, we would insert a DimOp and rely on later canonicalizations.
Unfortunately, reifyShape kind of rewrites are not canonicalizations anymore.
This introduces undesirable pass dependencies.

Instead, immediately reify the result shape and avoid the DimOp altogether.
This is akin to a local folding, which avoids introducing more reliance on `-resolve-shaped-type-result-dims` (similar to compositions of `affine.apply` by construction to avoid chains of size > 1).

It does not completely get rid of the reliance on the pass as the process is merely local: calling the pass may still be necessary for global effects. Indeed, one of the tests still requires the pass.

Differential Revision: https://reviews.llvm.org/D109571
2021-09-13 11:54:59 +00:00
Rob Suderman b0532286fe [mlir][tosa] Add shape inference for tosa.while
Tosa.while shape inference requires repeatedly running shape inference across
the body of the loop until the types become static as we do not know the number
of iterations required by the loop body. Once the least specific arguments are
known they are propagated to both regions.

To determine the final end type, the least restrictive types are determined
from all yields.

Differential Revision: https://reviews.llvm.org/D108801
2021-09-10 13:11:53 -07:00
Stephan Herhut 5e6c170b3f [mlir][linalg] Fix bufferize pattern to allow unknown operations in body of generic
The original version of the bufferization pattern for linalg.generic would
manually clone operations within the region to the bufferized clone of the
operation. This triggers legality requirements on those operations in the
conversion infra. Instead, this now uses the rewriter to inline the region
instead, avoiding those legality requirements.

Differential Revision: https://reviews.llvm.org/D109581
2021-09-10 13:37:42 +02:00
Matthias Springer 0f3544d185 [mlir][scf] Loop peeling: Use scf.for for partial iteration
Generate an scf.for instead of an scf.if for the partial iteration. This is for consistency reasons: The peeling of linalg.tiled_loop also uses another loop for the partial iteration.

Note: Canonicalizations patterns may rewrite partial iterations to scf.if afterwards.

Differential Revision: https://reviews.llvm.org/D109568
2021-09-10 19:07:09 +09:00
Nicolas Vasilache 5f1a1af4bf [mlir][Linalg] Properly order extract_slice traversal in comprehensive bufferization
This revision fixes the traversal order of extract_slice during the inplace analysis.
It was previously thought that such ops could be analyzed at the very end.
This is unfortunately not true as the AliasInfo for dependents of these ops need to be updated.

This change allows the aliases introduced by the bufferization of extract_slice to be properly propagated.

Differential Revision: https://reviews.llvm.org/D109519
2021-09-10 07:10:06 +00:00
Aart Bik 066d786ce0 [mlir][sparse] add folding to sparse_tensor.convert
folds conversion between identical types (with tests)

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D109545
2021-09-09 15:45:19 -07:00
Matthias Springer c7d569b8f7 [mlir][scf] Fold dim(scf.for) to dim(iter_arg)
Fold dim ops of scf.for results to dim ops of the respective iter args if the loop is shape preserving.

Differential Revision: https://reviews.llvm.org/D109430
2021-09-09 13:47:13 +09:00
Matthias Springer e2c8fcb9d0 [mlir][linalg] Fold dim(linalg.tiled_loop) to dim(output_arg)
Fold dim ops of linalg.tiled_loop results to dim ops of the respective iter args if the loop is shape preserving.

Differential Revision: https://reviews.llvm.org/D109431
2021-09-09 13:37:28 +09:00
Matthias Springer f7137da174 [mlir][linalg] Fix dim(iter_arg) canonicalization
Run a small analysis to see if the runtime type of the iter_arg is changing. Fold only if the runtime type stays the same. (Same as `DimOfIterArgFolder` in SCF.)

Differential Revision: https://reviews.llvm.org/D109299
2021-09-09 12:13:05 +09:00
Matthias Springer c95a7246a3 [mlir][linalg] Tiling: Use loop ub in extract_slice size computation if possible
When tiling a LinalgOp, extract_slice/insert_slice pairs are inserted. To avoid going out-of-bounds when the tile size does not divide the shape size evenly (at the boundary), AffineMin ops are inserted. Some ops have assumptions regarding the dimensions of inputs/outputs. E.g., in a `A * B` matmul, `dim(A, 1) == dim(B, 0)`. However, loop bounds use either `dim(A, 1)` or `dim(B, 0)`.

With this change, AffineMin ops are expressed in terms of loop bounds instead of tensor sizes. (Both have the same runtime value.) This simplifies canonicalizations.

Differential Revision: https://reviews.llvm.org/D109267
2021-09-09 11:06:22 +09:00
Chris Lattner 42431b8207 [tests] Make testsuite more resilient to "order of constant" changes. NFC. 2021-09-08 10:10:10 -07:00
Matthias Springer c57c4f888c [mlir][linalg] linalg.tiled_loop peeling
Differential Revision: https://reviews.llvm.org/D108270
2021-09-07 09:50:08 +09:00
Eugene Zhulenev fd52b4357a [mlir] Async: check awaited operand error state after sync await
Previously only await inside the async function (coroutine after lowering to async runtime) would check the error state

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D109229
2021-09-04 05:00:17 -07:00
Loren Maggiore 361458b1ce [mlir] create gpu memset op
Create a gpu memset op and corresponding CUDA and ROCm wrappers.

Reviewed By: herhut, lorenrose1013

Differential Revision: https://reviews.llvm.org/D107548
2021-09-04 08:13:04 +02:00
Mehdi Amini 78accf9f35 Make LLVM Linkage a first class attribute instead of using an integer attribute
This makes the IR more readable, in particular when this will be used on
the builtin func outside of the LLVM dialect.

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D109209
2021-09-03 21:21:46 +00:00
Alexander Belyaev 5ee5bbd0ff [mlir][linalg] Extend tiled_loop to SCF conversion to generate scf.parallel.
Differential Revision: https://reviews.llvm.org/D109230
2021-09-03 18:05:54 +02:00
Aart Bik b6d1a31c1b [mlir][sparse] refine heuristic for iteration graph topsort
The sparse index order must always be satisfied, but this
may give a choice in topsorts for several cases. We broke
ties in favor of any dense index order, since this gives
good locality. However, breaking ties in favor of pushing
unrelated indices into sparse iteration spaces gives better
asymptotic complexity. This revision improves the heuristic.

Note that in the long run, we are really interested in using
ML for ML to find the best loop ordering as a replacement for
such heuristics.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D109100
2021-09-03 08:37:15 -07:00
Matthias Springer 4fa6c2734c [mlir][scf] Allow runtime type of iter_args to change
The limitation on iter_args introduced with D108806 is too restricting. Changes of the runtime type should be allowed.

Extends the dim op canonicalization with a simple analysis to determine when it is safe to canonicalize.

Differential Revision: https://reviews.llvm.org/D109125
2021-09-03 10:03:05 +09:00
Kiran Chandramohan 711aa35759 [MLIR][OpenMP] Add support for declaring critical construct names
Add an operation omp.critical.declare to declare names/symbols of
critical sections. Named omp.critical operations should use symbols
declared by omp.critical.declare. Having a declare operation ensures
that the names of critical sections are global and unique. In the
lowering flow to LLVM IR, the OpenMP IRBuilder creates unique names
for critical sections.

Reviewed By: ftynse, jeanPerier

Differential Revision: https://reviews.llvm.org/D108713
2021-09-02 14:31:19 +00:00
Weiwei Li a79d7c2c85 [mlir][SPIRV] Add Image Operands for Image Instructions
This patch is to add Image Operands in SPIR-V Dialect and also let ImageDrefGather to use Image Operands.

Image Operands are used in many image instructions. "Image Operands encodes what oprands follow, as per Image Operands". And ususally, they are optional to image instructions.

The format of image operands looks like:

    %0 = spv.ImageXXXX %1, ... %3 : f32 ["Bias|Lod"](%4, %5 : f32, f32) -> ...

This patch doesn’t implement all operands (see Section 3.14 in SPIR-V Spec) but provides a skeleton of it. There is TODO in verifyImageOperands function.

Co-authored: Alan Liu <alanliu.yf@gmail.com>

Reviewed by: antiagainst

Differential Revision: https://reviews.llvm.org/D108501
2021-09-02 04:14:17 +08:00
MaheshRavishankar b686fdbf92 [mlir][Linalg] Drop output tensor from `linalg.pad_tensor` op.
The output tensor was added for tiling purposes. With use of
`TilingInterface` for tiling pad operations, there is no need for an
explicit operand for the shape of result of `linalg.pad_tensor`
op. The interface allows the tiling pattern to query the value that
can be used for the "init" needed for tiling dynamically.

Differential Revision: https://reviews.llvm.org/D108613
2021-08-31 11:12:24 -07:00
Mehdi Amini 387f95541b Add a new interface allowing to set a default dialect to be used for printing/parsing regions
Currently the builtin dialect is the default namespace used for parsing
and printing. As such module and func don't need to be prefixed.
In the case of some dialects that defines new regions for their own
purpose (like SpirV modules for example), it can be beneficial to
change the default dialect in order to improve readability.

Differential Revision: https://reviews.llvm.org/D107236
2021-08-31 17:52:40 +00:00
Mehdi Amini c41b16c26b Change ASM Op printer to print the operation name in the framework instead of leaving it up to each individual operation
This aligns the printer with the parser contract: the operation isn't part of the user-controllable part of the syntax.

Differential Revision: https://reviews.llvm.org/D108804
2021-08-31 17:52:40 +00:00
Tres Popp 44485fcd97 [mlir] Prevent assertion failure in DropUnitDims
Don't assert fail on strided memrefs when dropping unit dims.
Instead just leave them unchanged.

Differential Revision: https://reviews.llvm.org/D108205
2021-08-31 12:15:13 +02:00
marina kolpakova a.k.a. geexie 0080d2aa55 [mlir][gpu] folds memref.dim of gpu.alloc
implements canonicalization which folds memref.dim(gpu.alloc(%size), %idx) -> %size

Differential Revision: https://reviews.llvm.org/D108892
2021-08-31 12:33:10 +03:00
MaheshRavishankar ba72cfe734 [mlir] Add an interface to allow operations to specify how they can be tiled.
An interface to allow for tiling of operations is introduced. The
tiling of the linalg.pad_tensor operation is modified to use this
interface.

Differential Revision: https://reviews.llvm.org/D108611
2021-08-30 16:31:18 -07:00
Matthias Springer d18ffd61d4 [mlir][SCF] Canonicalize dim(x) where x is an iter_arg
* Add `DimOfIterArgFolder`.
* Move existing cross-dialect canonicalization patterns to `LoopCanonicalization.cpp`.
* Rename `SCFAffineOpCanonicalization` pass to `SCFForLoopCanonicalization`.
* Expand documentaton of scf.for: The type of loop-carried variables may not change with iterations. (Not even the dynamic type.)

Differential Revision: https://reviews.llvm.org/D108806
2021-08-30 01:39:56 +00:00
Aart Bik 0a7b8cc5dd [mlir][sparse] fully implement sparse tensor to sparse tensor conversions
with rigorous integration test

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108721
2021-08-27 15:08:18 -07:00
Matthias Springer a9cff97f94 [mlir][SCF] Generalize AffineMinSCFCanonicalization to min/max ops
* Add support for affine.max ops to SCF loop peeling pattern.
* Add support for affine.max ops to `AffineMinSCFCanonicalizationPattern`.
* Rename `AffineMinSCFCanonicalizationPattern` to `AffineOpSCFCanonicalizationPattern`.
* Rename `AffineMinSCFCanonicalization` pass to `SCFAffineOpCanonicalization`.

Differential Revision: https://reviews.llvm.org/D108009
2021-08-25 10:40:34 +09:00
Matthias Springer 2de2dbef2a [mlir][linalg] Replace AffineMinSCFCanonicalizationPattern with SCF reimplementation
Use the new canonicalization pattern in the SCF dialect.

Differential Revision: https://reviews.llvm.org/D107732
2021-08-25 08:52:56 +09:00
Matthias Springer 98aa694d0d [mlir][scf] Add general affine.min canonicalization pattern
This canonicalization simplifies affine.min operations inside "for loop"-like operations (e.g., scf.for and scf.parallel) based on two invariants:
* iv >= lb
* iv < lb + step * ((ub - lb - 1) floorDiv step) + 1

This commit adds a new pass `canonicalize-scf-affine-min` (instead of being a canonicalization pattern) to avoid dependencies between the Affine dialect and the SCF dialect.

Differential Revision: https://reviews.llvm.org/D107731
2021-08-25 07:32:30 +09:00
Tyler Augustine d25e91d7f6 Support alias.scope and noalias metadata
Introduces new Ops to represent 1. alias.scope metadata in LLVM, and 2. domains for these scopes. These correspond to the metadata described in https://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata. Lists of scopes are modeled the same way as access groups - as an ArrayAttr on the Op (added in https://reviews.llvm.org/D97944).

Lowering 'noalias' attributes on function parameters is already supported. However, lowering `noalias` metadata on individual Ops is not, which is added in this change. LLVM uses the same keyword for these, but this change introduces a separate attribute name 'noalias_scopes' to represent this distinct concept.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107870
2021-08-24 20:42:59 +02:00
Matthias Springer ebf35370ff [mlir][tensor] Insert explicit tensor.cast ops for insert_slice src
If additional static type information can be deduced from a insert_slice's size operands, insert an explicit cast of the op's source operand.

This enables other canonicalization patterns that are matching for tensor_cast ops such as `ForOpTensorCastFolder` in SCF.

Differential Revision: https://reviews.llvm.org/D108617
2021-08-24 19:45:04 +09:00
MaheshRavishankar b546f4347b [mlir]Linalg] Allow controlling fusion of linalg.generic -> linalg.tensor_expand_shape.
Differential Revision: https://reviews.llvm.org/D108565
2021-08-23 16:28:10 -07:00
Aart Bik 236a90802d [mlir][sparse] replace support lib conversion with actual MLIR codegen
Rationale:
Passing in a pointer to the memref data in order to implement the
dense to sparse conversion was a bit too low-level. This revision
improves upon that approach with a cleaner solution of generating
a loop nest in MLIR code itself that prepares the COO object before
passing it to our "swiss army knife" setup.  This is much more
intuitive *and* now also allows for dynamic shapes.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108491
2021-08-23 14:26:05 -07:00
Matthias Springer bc194a5bb5 [mlir][SCF] Do not peel loops inside partial iterations
Do not apply loop peeling to loops that are contained in the partial iteration of an already peeled loop. This is to avoid code explosion when dealing with large loop nests. Can be controlled with a new pass option `skip-partial`.

Differential Revision: https://reviews.llvm.org/D108542
2021-08-23 21:35:46 +09:00
Rob Suderman 871c812483 [mlir][linalg] Finish refactor of TC ops to YAML
Multiple operations were still defined as TC ops that had equivalent versions
as YAML operations. Reducing to a single compilation path guarantees that
frontends can lower to their equivalent operations without missing the
optimized fastpath.

Some operations are maintained purely for testing purposes (mainly conv{1,2,3}D
as they are included as sole tests in the vectorizaiton transforms.

Differential Revision: https://reviews.llvm.org/D108169
2021-08-20 12:35:04 -07:00
Aart Bik 758ccf8506 [mlir][sparse] add test for DimOp folding
Folding in the MLIR uses the order of the type directly
but folding in the underlying implementation must take
the dim ordering into account. These tests clarify that
behavior and verify it is done right.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108474
2021-08-20 11:24:09 -07:00
Morten Borup Petersen 6c1436a9b0 [MLIR][SCF] Parenthesize multiple return types in scf.execute_region asm op
Previously, ExecuteRegionOps with multiple return values would fail a round-trip test due to missing parenthesis around the types.

Differential Revision: https://reviews.llvm.org/D108402
2021-08-19 21:31:51 +01:00
Matthias Springer 76a1861816 [mlir][SparseTensor] Split scf.for loop into masked/unmasked parts
Apply the "for loop peeling" pattern from SCF dialect transforms. This pattern splits scf.for loops into full and partial iterations. In the full iteration, all masked loads/stores are canonicalized to unmasked loads/stores.

Differential Revision: https://reviews.llvm.org/D107733
2021-08-19 21:53:11 +09:00
Matthias Springer 8e8b70aa84 [mlir][scf] Simplify affine.min ops after loop peeling
Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body.

affine.min ops such as:
```
map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)>
%r = affine.min #affine.min #map(%iv)[%step, %ub]
```
are rewritten them into (in the case the peeled loop):
```
%r = %step
```

To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized.

Differential Revision: https://reviews.llvm.org/D107222
2021-08-19 17:24:53 +09:00
Matthias Springer 08dbed8a57 [mlir][linalg] Canonicalize dim ops of tiled_loop block args
E.g.:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %x, %c0 : tensor<...>
}
```

is rewritten to:
```
%y = ... : tensor<...>
linalg.tiled_loop ... ins(%x = %y : tensor<...>) {
  tensor.dim %y, %c0 : tensor<...>
}
```

Differential Revision: https://reviews.llvm.org/D108272
2021-08-19 11:24:33 +09:00
Aart Bik d37d72eaf8 [mlir][sparse] use shared util for DimOp generation
This shares more code with existing utilities. Also, to be consistent,
we moved dimension permutation on the DimOp to the tensor lowering phase.
This way, both pre-existing DimOps on sparse tensors (not likely but
possible) as well as compiler generated DimOps are handled consistently.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108309
2021-08-18 17:12:32 -07:00
Chia-hung Duan 41e5dbe0fa Enables inferring return types for Shape op if possible
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D102565
2021-08-18 21:36:55 +00:00
Butygin ddc3d51d58 [mlir][spirv] Add (InBounds)PtrAccessChain ops
Differential Revision: https://reviews.llvm.org/D108070
2021-08-18 17:59:21 +03:00
Lei Zhang 4c15ad2321 [mlir][linalg] Don't drop existing attributes when creating ops
Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D108219
2021-08-17 15:44:56 -04:00
Robert Suderman 65532ea6dd [mlir][linalg] Clear unused linalg tc operations
These operations are not lowered to from any source dialect and are only
used for redundant tests. Removing these named ops, along with their
associated tests, will make migration to YAML operations much more
convenient.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D107993
2021-08-16 11:55:45 -07:00
tashuang.zk 2d45e332ba [MLIR][DISC] Revise ParallelLoopTilingPass with inbound_check mode
Expand ParallelLoopTilingPass with an inbound_check mode.

In default mode, the upper bound of the inner loop is from the min op; in
inbound_check mode, the upper bound of the inner loop is the step of the outer
loop and an additional inbound check will be emitted inside of the inner loop.

This was 'FIXME' in the original codes and a typical usage is for GPU backends,
thus the outer loop and inner loop can be mapped to blocks/threads in seperate.

Differential Revision: https://reviews.llvm.org/D105455
2021-08-16 14:02:53 +02:00
harsh-nod e33f301ec2 [mlir] Add support for moving reductions to outer most dimensions in vector.multi_reduction
The approach for handling reductions in the outer most
dimension follows that for inner most dimensions, outlined
below

First, transpose to move reduction dims, if needed
Convert reduction from n-d to 2-d canonical form
Then, for outer reductions, we emit the appropriate op
(add/mul/min/max/or/and/xor) and combine the results.

Differential Revision: https://reviews.llvm.org/D107675
2021-08-13 12:59:50 -07:00
Tyler Augustine 3a2ff982d7 Support post-processing Ops in unrolled loop iterations
This can be useful when one needs to know which unrolled iteration an Op belongs to, for example, conveying noalias information among memory-affecting ops in parallel-access loops.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107789
2021-08-11 23:11:10 +00:00
Rob Suderman 7de439b2be [mlir][tosa] Migrate tosa to more efficient linalg.conv
Existing linalg.conv2d is not well optimized for performance. Changed to a
version that is more aligned for optimziation. Include the corresponding
transposes to use this optimized version.

This also splits the conv and depthwise conv into separate implementations
to avoid overly complex lowerings.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D107504
2021-08-11 11:05:12 -07:00
Benjamin Kramer c1ebefdf77 [mlir] Make polynomial approximation emit std instead of LLVM ops
This is a bit cleaner and removes issues with 2d vectors. It also has a
big impact on constant folding, hence the test changes.

Differential Revision: https://reviews.llvm.org/D107896
2021-08-11 16:37:21 +02:00
Alex Zinenko 79b0576dd4 [mlir] Tighten LLVM_AnyNonAggregate ODS type constraint
The constraint was checking that the type is not an LLVM structure or array
type, but was not checking that it is an LLVM-compatible type, making it accept
incorrect types. As a result, some LLVM dialect ops could process values that
are not compatible with the LLVM dialect leading to further issues with
conversions and translations that assume all values are LLVM-compatible. Make
LLVM_AnyNonAggregate only accept LLVM-compatible types.

Reviewed By: cota, akuegel

Differential Revision: https://reviews.llvm.org/D107889
2021-08-11 16:30:19 +02:00
Alexander Belyaev 1e733a8c04 Revert "Bufferization for tiled loop."
This reverts commit edaffebcb2.
2021-08-11 10:04:12 +02:00
Alexander Belyaev 967578f0b8 Revert "[mlir] Change the pattern for TiledLoopOp bufferization."
This reverts commit 2f946eaa9d.
2021-08-11 10:01:36 +02:00
Rob Suderman 2b2ebb6f98 [mlir][tosa] Add folders for trivial tosa operation cases
Some folding cases are trivial to fold away, specifically no-op cases where
an operation's input and output are the same. Canonicalizing these away
removes unneeded operations.

The current version includes tensor cast operations to resolve shape
discreprencies that occur when an operation's result type differs from the
input type. These are resolved during a tosa shape propagation pass.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D107321
2021-08-10 14:43:00 -07:00
Alexander Belyaev 2f946eaa9d [mlir] Change the pattern for TiledLoopOp bufferization.
This version is does not affect the patterns for Extract/InsertSliceOp and
LinalgOps.

Differential Revision: https://reviews.llvm.org/D107858
2021-08-10 21:27:02 +02:00
bakhtiyar 391456f33c Fix a bug in algebraic simplification, and enable the tests.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D107788
2021-08-10 04:15:56 -07:00
Alexander Belyaev edaffebcb2 Cloned from CL 389610703 by 'g4 patch'.
Original change by pifon@pifon:tfrt_clean:6896:citc on 2021/08/09 05:30:17.

Ad b

Differential Revision: https://reviews.llvm.org/D107762
2021-08-09 21:57:06 +02:00
Aart Bik 05c7f450df [mlir][sparse] add dense to sparse conversion implementation
Implements lowering dense to sparse conversion, for static tensor types only.
First step towards general sparse_tensor.convert support.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D107681
2021-08-09 12:12:39 -07:00
Max Kudryavtsev 0b8cb87e0d [MLIR][STD] Add safe scalar constant propagation for FPTruncOp
Perform scalar constant propagation for FPTruncOp only if the resulting value can be represented without precision loss or rounding.

Example:
%cst = constant 1.000000e+00 : f32
%0 = fptrunc %cst : f32 to bf16
-->
%cst = constant 1.000000e+00 : bf16

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107518
2021-08-06 16:31:29 -07:00
Alexander Belyaev a552debdcf [mlir] Add patterns for vector.transfer_read/write to Linalg bufferization.
Differential Revision: https://reviews.llvm.org/D107643
2021-08-06 20:24:44 +02:00
Geoffrey Martin-Noble ca6baf1e1d [MLIR][std] Introduce bitcast operation
This patch introduces a bitcast operation to the standard dialect.
RFC: https://llvm.discourse.group/t/rfc-introduce-a-bitcast-op/3774

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D105376
2021-08-06 08:47:51 -07:00
Adrian Kuegel d6b4993736 [mlir][MemRef] Fix canonicalization of BufferCast(TensorLoad).
CastOp::areCastCompatible does not check whether casts are definitely compatible.
When going from dynamic to static offset or stride, the canonicalization cannot
know whether it is really cast compatible. In that case, it can only canonicalize
to an alloc plus copy.

Differential Revision: https://reviews.llvm.org/D107545
2021-08-06 08:32:35 +02:00
Jacques Pienaar 9d10be70a8 [mlir] std.call reference function return types in failure
Makes it easier to see type mismatch from failure locally.

Differential Revision: https://reviews.llvm.org/D107288
2021-08-05 19:51:48 -07:00
Stephen Neuendorffer 432341d8a8 [mlir] Handle cases where transfer_read should turn into a scalar load
The existing vector transforms reduce the dimension of transfer_read
ops.  However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector.  This handles this case.

Note that in the longer term, it may be preferaby to support
zero-dimension vectors.  see
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.

Differential Revision: https://reviews.llvm.org/D103432
2021-08-03 22:53:40 -07:00
Rob Suderman 1b00b94ffc [mlir][tosa] Tosa shape propagation for tosa.cond_if
We can propagate the shape from tosa.cond_if operands into the true/false
regions then through the connected blocks. Then, using the tosa.yield ops
we can determine what all possible return types are.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105940
2021-08-03 17:54:54 -07:00
Rob Suderman 143edeca6d [mlir][tosa] Shape inference for a few remaining easy cases:
Handles shape inference for identity, cast, and rescale. These were missed
during the initialy elementwise work. This includes resize shape propagation
which includes both attribute and input type based propagation.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105845
2021-08-03 17:20:32 -07:00
Aart Bik 817303ef34 [mlir][sparse] fix bug in permuting data structure
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D107379
2021-08-03 14:27:43 -07:00
KareemErgawy-TomTom f984a805f3 [MLIR][Linalg] Extend detensoring control flow model.
This patch extends the PureControlFlowDetectionModel to consider
detensoring br and cond_br operands.

See: https://github.com/google/iree/issues/1159#issuecomment-884322687,
for a disccusion on the need for such extension.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D107358
2021-08-03 18:08:13 +02:00
Kiran Chandramohan 59989d68ba [MLIR][OpenMP] Add support for critical construct
This patch adds the critical construct to the OpenMP dialect. The
implementation models the definition in 2.17.1 of the OpenMP 5 standard.
A name and hint can be specified. The name is a global entity or has
external linkage, it is modelled as a FlatSymbolRefAttr. Hint is
modelled as an integer enum attribute.
Also lowering to LLVM IR using the OpenMP IRBuilder.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D107135
2021-08-03 10:50:21 +01:00
Matthias Springer 3a41ff4883 [mlir][SCF] Peel scf.for loops for even step divison
Add ForLoopBoundSpecialization pass, which specializes scf.for loops into a "main loop" where `step` divides the iteration space evenly and into an scf.if that handles the last iteration.

This transformation is useful for vectorization and loop tiling. E.g., when vectorizing loads/stores, programs will spend most of their time in the main loop, in which only unmasked loads/stores are used. Only the in the last iteration (scf.if), slower masked loads/stores are used.

Subsequent commits will apply this transformation in the SparseDialect and in Linalg's loop tiling.

Differential Revision: https://reviews.llvm.org/D105804
2021-08-03 10:21:38 +09:00
Eugene Zhulenev b537c5b414 [mlir] Async: clone constants into async.execute functions and parallel compute functions
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107007
2021-08-02 12:17:41 -07:00
Aart Bik 697ea09d47 [mlir][sparse] add sparse tensor type conversion operation
Introduces a conversion from one (sparse) tensor type to another
(sparse) tensor type. See the operation doc for details. Actual
codegen for all cases is still TBD.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D107205
2021-07-31 12:53:31 -07:00
Nicolas Vasilache 14c1450d5c [mlir][Vector] Add vector to outerproduct lowering for the [reduction, parallel] case.
Differential Revision: https://reviews.llvm.org/D105373
2021-07-30 14:32:57 +00:00
Amy Zhuang a8b7e56f65 [mlir] Set insertion point of vector constant to the top of the vectorized loop body
When we vectorize a scalar constant, the vector constant is inserted before its
first user if the scalar constant is defined outside the loops to be vectorized.
It is possible that the vector constant does not dominate all its users. To fix
the problem, we find the innermost vectorized loop that encloses that first user
and insert the vector constant at the top of the loop body.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D106609
2021-07-29 15:42:23 -07:00
Rob Suderman 2d0ba5e144 [mlir][tosa] Fix tosa.reshape failures due to implicit broadcasting
Make broadcastable needs the output shape to determine whether the operation
includes additional broadcasting. Include some canonicalizations for TOSA
to remove unneeded reshape.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D106846
2021-07-29 15:21:57 -07:00
Yi Zhang 9a82482313 [mlir][linalg] Fix pad tensor cast folding with changed type
`PadTensorOp` has verification logic to make sure
result dim must be static if all the padding values are static.
Cast folding might add more static information for the src operand
of `PadTensorOp` which might change a valid operation to be invalid.
Change the canonicalizing pattern to fix this.
2021-07-29 17:47:01 -04:00
bakhtiyar 1c144410e7 Refactor AsyncToAsyncRuntime pass to boost understandability.
Depends On D106730

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106731
2021-07-29 12:01:07 -07:00
bakhtiyar 9a5bc83660 Add an escape-hatch for conversion of funcs with blocking awaits to coroutines.
Currently TFRT does not support top-level coroutines, so this functionality will allow to have a single blocking await at the top level until TFRT implements the necessary functionality.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106730
2021-07-29 08:52:28 -07:00
River Riddle f8479d9de5 [mlir] Set the namespace of the BuiltinDialect to 'builtin'
Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to cleanup some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though)

Differential Revision: https://reviews.llvm.org/D105149
2021-07-28 21:00:10 +00:00
bakhtiyar 6ea22d4626 Optionally eliminate blocking runtime.await calls by converting functions to coroutines.
Interop parallelism requires needs awaiting on results. Blocking awaits are bad for performance. TFRT supports lightweight resumption on threads, and coroutines are an abstraction than can be used to lower the kernels onto TFRT threads.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106508
2021-07-28 12:37:05 -07:00
Alex Zinenko c1f719d1a7 [mlir] harden result type verification in llvm.call
The verifier of the llvm.call operation was not checking for mismatches between
the number of operation results and the number of results in the signature of
the callee. Furthermore, it was possible to construct an llvm.call operation
producing an SSA value of !llvm.void type, which should not exist. Add the
verification and treat !llvm.void result type as absence of call results.
Update the GPU conversions to LLVM that were mistakenly assuming that it was
fine for llvm.call to produce values of !llvm.void type and ensure these calls
do not produce results.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106937
2021-07-28 18:15:56 +02:00
Lei Zhang 23326b9f17 [mlir][spirv] Fix a few issues in ModuleCombiner
- Fixed symbol insertion into `symNameToModuleMap`. Insertion
  needs to happen whether symbols are renamed or not.
- Added check for the VCE triple and avoid dropping it.
- Disabled function deduplication. It requires more careful
  rules. Right now it can remove different functions.
- Added tests for symbol rename listener.
- And some other code/comment cleanups.

Reviewed By: ergawy

Differential Revision: https://reviews.llvm.org/D106886
2021-07-28 10:31:01 -04:00
Tobias Gysi ca0d244e99 [mlir][linalg] Introduce a separate EraseIdentityCopyOp Pattern.
Split out an EraseIdentityCopyOp from the existing RemoveIdentityLinalgOps pattern. Introduce an additional check to ensure the pattern checks the permutation maps match. This is a preparation step to specialize RemoveIdentityLinalgOps to GenericOp only.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105622
2021-07-28 11:18:22 +00:00
Yi Zhang 8ed66cb88b [mlir][memref] Fix collapsed shape ops memref.cast folding with changed type
`memref.collapse_shape` has verification logic to make sure
result dim must be static if all the collapsing src dims are static.
Cast folding might add more static information for the src operand
of `memref.collapse_shape` which might change a valid collapsing
operation to be invalid. Add `CollapseShapeOpMemRefCastFolder` pattern
to fix this.

Minor changes to `convertReassociationIndicesToExprs` to use `context`
instead of `builder` to avoid extra steps to construct temporary
builders.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D106670
2021-07-28 10:19:20 +00:00
River Riddle ddd8482117 [PDL] Remove RewriteEndOp and mark RewriteOp as NoTerminator
RewriteEndOp was a fake terminator operation that is no longer needed now that blocks are not required to have terminators.

Differential Revision: https://reviews.llvm.org/D106911
2021-07-27 20:45:10 +00:00
Eugene Zhulenev d94426d22a [mlir] Math: add algebraic simplification patterns to math transforms
Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D106822
2021-07-27 09:22:33 -07:00
Aart Bik c2415d67a5 [mlir][sparse] fixed bug in verification
The order of testing in two sparse tensor ops was incorrect,
which could cause an invalid cast (crashing the compiler instead
of reporting the error). This revision fixes that bug.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D106841
2021-07-27 08:49:21 -07:00
Marcel Koester 0425332015 [mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<ReturnLike>.
This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be
passed to successor regions. Similar to the BranchOpInterface, it allows to freely define the
involved operands. However, in contrast to the BranchOpInterface, it expects an additional region
number to distinguish between various use cases which might require different operands passed to
different regions.

Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and
getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equiped with the
ReturnLike trait and/or implementing the newly added interface.  This simplifies reasoning about
terminators in the scope of the nested regions.

We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities.

Differential Revision: https://reviews.llvm.org/D105018
2021-07-26 06:39:31 +02:00
Eugene Zhulenev de7a4e53a2 [mlir] Async: lower SCF operations into CFG inside coroutines
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106747
2021-07-24 14:36:26 -07:00
Yi Zhang deebf18512 [mlir][linalg] Add pooling_nchw_max, conv_2d_nchw as yaml ops.
- Add pooling_nchw_max.
- Move conv_2d_nchw to yaml ops and add strides and dilation attributes.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D106658
2021-07-23 17:37:15 +00:00
thomasraoux 73a9d6d0e2 [mlir][linalg] Fix bug in contraction op vectorization with output perm
When the output indexing map has a permutation we need to consider in
the contraction vector type.

Differential Revision: https://reviews.llvm.org/D106469
2021-07-23 08:39:43 -07:00
Eugene Zhulenev 6c1f655818 [mlir] Async: special handling for parallel loops with zero iterations
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106590
2021-07-23 01:22:59 -07:00
Nicolas Vasilache 06d2fb55ca [mlir][Linalg] Fix a missing copy when source of insert_slice is not inplace.
When the source tensor of a tensor.insert_slice is not equivalent to an inplace buffer an extra copy is necessary. This revision adds the missing copy.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D106587
2021-07-23 07:41:45 +00:00
Rob Suderman cf8a1f6208 [mlir][tosa] Quantized Conv2DOp lowering to linalg added.
Includes a version of a quantized conv2D operations with a lowering from TOSA
to linalg with corresponding test. We keep the quantized and quantized variants
as separate named ops to avoid the additional operations for non-quantized
convolutions.

Differential Revision: https://reviews.llvm.org/D106407
2021-07-22 15:42:26 -07:00