Commit Graph

5226 Commits

Author SHA1 Message Date
William S. Moses 93c791839a [MLIR] Canonicalize/fold select %x, 1, 0 to extui
Two canonicalizations for select %x, 1, 0
  If the return type is i1, return simply the condition %x, otherwise extui %x to the return type.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116517
2022-01-03 01:26:28 -05:00
William S. Moses 834cf3be22 [MLIR][Arith] Canonicalize and/or with ext
Replace and(ext(a),ext(b)) with ext(and(a,b)). This both reduces one instruction, and results in the computation (and/or) being done on a smaller type.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116519
2022-01-03 01:25:30 -05:00
William S. Moses 1bb9f4e482 [MLIR] Create folders for extsi/extui
Create folders/canonicalizers for extsi/extui. Specifically,

extui(extui(x)) -> extui(x)
extsi(extsi(x)) -> extsi(x)
extsi(extui(x)) -> extui(x)

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116515
2022-01-03 00:11:23 -05:00
Mehdi Amini 89af17c0c7 Define a `cppAccessorType` to const-ref in APFloatParameter and update ODS emitter to use it for verifier signatures
This reduce an unnecessary amount of copy of non-trivial objects, like
APFloat.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D116505
2022-01-03 04:57:11 +00:00
William S. Moses 1a0a177965 [MLIR] Create fold for cmp of ext
This patch creates folds for cmpi( ext(%x : i1, iN) != 0) -> %x

In essence this matches patterns matching an extension of a boolean, that != 0, which is equivalent to the original condition.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116504
2022-01-02 19:48:52 -05:00
Mehdi Amini 6786d7e4f5 Apply clang-tidy fixes for readability-simplify-boolean-expr to MLIR (NFC)
Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D116253
2022-01-02 01:59:31 +00:00
Mehdi Amini 5a1f6077ec Apply clang-tidy fixes for readability-container-size-empty for MLIR (NFC)
Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D116252
2022-01-02 01:56:38 +00:00
Mehdi Amini 1fc096af1e Apply clang-tidy fixes for performance-unnecessary-value-param to MLIR (NFC)
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D116250
2022-01-02 01:45:18 +00:00
Mehdi Amini ee1fcb2fb6 Apply clang-tidy fixes for performance-move-const-arg to MLIR (NFC)
Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D116249
2022-01-02 01:16:15 +00:00
Mehdi Amini 89de9cc8a7 Apply clang-tidy fixes for performance-for-range-copy to MLIR (NFC)
Differential Revision: https://reviews.llvm.org/D116248
2022-01-02 01:13:42 +00:00
Mehdi Amini 322c891483 Apply clang-tidy fixes for modernize-use-equals-default to MLIR (NFC)
Differential Revision: https://reviews.llvm.org/D116247
2022-01-02 01:13:27 +00:00
Mehdi Amini 3bab9d4eb0 Apply clang-tidy fixes for bugprone-copy-constructor-init to MLIR (NFC)
Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D116245
2022-01-02 01:05:30 +00:00
Mehdi Amini ced8690d84 Apply clang-tidy fixes for bugprone-argument-comment to MLIR (NFC)
Differential Revision: https://reviews.llvm.org/D116244
2022-01-02 01:05:06 +00:00
Mehdi Amini 36a6e56bff Fix possible memory leak in a MLIR unit-test
Flagged by Coverity
2022-01-01 01:42:26 +00:00
Markus Böck 3536d24a1a [mlir][LLVMIR] Add `llvm.eh.typeid.for` intrinsic
MLIR already exposes landingpads, the invokeop and the personality function on LLVM functions. With this intrinsic it should be possible to implement exception handling via the exception handling mechanisms provided by the Itanium ABI.

Differential Revision: https://reviews.llvm.org/D116436
2022-01-01 02:03:00 +01:00
Stella Laurenzo 5cd0b817e2 [mlir] Allow IntegerAttr to parse zero width integers.
https://reviews.llvm.org/D109555 added support to APInt for this, so the special case to disable it is no longer valid. It is in fact legal to construct these programmatically today, and they print properly but do not parse.

Justification: zero bit integers arise naturally in various bit reduction optimization problems, and having them defined for MLIR reduces special casing.

I think there is a solid case for i0 and ui0 being supported. I'm less convinced about si0 and opted to just allow the parser to round-trip values that already verify. The counter argument is that the proper singular value for an si0 is -1. But the counter to this counter is that the sign bit is N-1, which does not exist for si0 and it is not unreasonable to consider this non-existent bit to be 0. Various sources consider it having the singular value "0" to be the least surprising.

Reviewed By: lattner

Differential Revision: https://reviews.llvm.org/D116413
2021-12-30 20:33:00 -08:00
William S. Moses a6a583dae4 [MLIR] Move AtomicRMW into MemRef dialect and enum into Arith
Per the discussion in https://reviews.llvm.org/D116345 it makes sense
to move AtomicRMWOp out of the standard dialect. This was accentuated by the
need to add a fold op with a memref::cast. The only dialect
that would permit this is the memref dialect (keeping it in the standard dialect
or moving it to the arithmetic dialect would require those dialects to have a
dependency on the memref dialect, which breaks linking).

As the AtomicRMWKind enum is used throughout, this has been moved to Arith.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D116392
2021-12-30 14:31:33 -05:00
Nicolas Vasilache 2e69f4f012 [mlir][vector] Fix illegal vector.transfer + tensor.insert/extract_slice folding
vector.transfer operations do not have rank-reducing semantics.

Bail on illegal rank-reduction: we need to check that the rank-reduced
dims are exactly the leading dims. I.e. the following is illegal:
```
   %0 = vector.transfer_write %v, %t[0,0], %cst :
     vector<2x4xf32>, tensor<2x4xf32>
   %1 = tensor.insert_slice %0 into %tt[0,0,0][2,1,4][1,1,1] :
     tensor<2x4xf32> into tensor<2x1x4xf32>
```

Cannot fold into:
```
   %0 = vector.transfer_write %v, %t[0,0,0], %cst :
     vector<2x4xf32>, tensor<2x1x4xf32>
```
For this, check the trailing `vectorRank` dims of the insert_slice result
tensor match the trailing dims of the inferred result tensor.

Differential Revision: https://reviews.llvm.org/D116409
2021-12-30 14:55:16 +00:00
MaheshRavishankar 7df7586a0b [mlir][MemRef] Deprecate unspecified trailing offset, size, and strides semantics of `OffsetSizeAndStrideOpInterface`.
The semantics of the ops that implement the
`OffsetSizeAndStrideOpInterface` is that if the number of offsets,
sizes or strides are less than the rank of the source, then some
default values are filled along the trailing dimensions (0 for offset,
source dimension of sizes, and 1 for strides). This is confusing,
especially with rank-reducing semantics. Immediate issue here is that
the methods of `OffsetSizeAndStridesOpInterface` assumes that the
number of values is same as the source rank. This cause out-of-bounds
errors.

So simplifying the specification of `OffsetSizeAndStridesOpInterface`
to make it invalid to specify number of offsets/sizes/strides not
equal to the source rank.

Differential Revision: https://reviews.llvm.org/D115677
2021-12-29 11:18:29 -08:00
William S. Moses 180455ae5e [MLIR][LLVM] Expose powi intrinsic to MLIR
Expose the powi intrinsic to the LLVM dialect within MLIR

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116364
2021-12-29 13:09:35 -05:00
Johannes Doerfert 7e14e881c4 [OpenMP][OpenACC] Update test after encoding change in D113126 2021-12-29 01:29:07 -06:00
William S. Moses 99fc000c87 [MLIR] Expose atomicrmw and/or
LLVM (dialect and IR) have atomics for and/or. This patch enables atomic_rmw ops in the standard dialect for and/or that lower to these (in addition to the existing atomics such as addi, etc).

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116345
2021-12-29 00:23:28 -05:00
William S. Moses ca8997eb7f [MLIR] Add constant folder for fptosi and friends
This patch adds constant folds for FPToSI/FPToUI/SIToFP/UIToFP

Reviewed By: mehdi_amini, bondhugula

Differential Revision: https://reviews.llvm.org/D116321
2021-12-28 23:50:01 -05:00
Rob Suderman f0cb77d7d5 [mlir][tosa] Resubmit split tosa-to-linalg named ops out of pass
Includes dependency fix that resulted in canonicalizer pass not linking in.

Linalg named ops lowering are moved to a separate pass. This allows TOSA
canonicalizers to run between named-ops lowerings and the general TOSA
lowerings. This allows the TOSA canonicalizers to run between lowerings.

Differential Revision: https://reviews.llvm.org/D116057
2021-12-28 11:22:58 -08:00
William S. Moses 2709fd1520 [MLIR][LLVM] Add MemmoveOp to LLVM Dialect
LLVM Dialect in MLIR doesn't have a memmove op. This adds one.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116274
2021-12-24 19:48:04 -05:00
Mogball 41a64338cc [mlir] Add getNumThreads to MLIRContext
Querying threads directly from the thread pool fails if there is no thread pool or if multithreading is not enabled. Returns 1 by default.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116259
2021-12-24 02:02:54 +00:00
Mehdi Amini 735fe1da6b Revert "[mlir][tosa] Split tosa-to-linalg named ops out of pass"
This reverts commit 313de31fbb.

There is a missing CMake dependency, building with shared libraries is
broken:

55.509 [45/4/3061] Linking CXX shared library lib/libMLIRTosaToLinalg.so.14git
FAILED: lib/libMLIRTosaToLinalg.so.14git
...
TosaToLinalgPass.cpp: undefined reference to `mlir::createCanonicalizerPass()'
2021-12-24 00:09:15 +00:00
Rob Suderman 313de31fbb [mlir][tosa] Split tosa-to-linalg named ops out of pass
Linalg named ops lowering are moved to a separate pass. This allows TOSA
canonicalizers to run between named-ops lowerings and the general TOSA
lowerings. This allows the TOSA canonicalizers to run between lowerings.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D116057
2021-12-23 12:23:19 -08:00
Mehdi Amini e5639b3fa4 Fix more clang-tidy cleanups in mlir/ (NFC) 2021-12-22 20:53:11 +00:00
Adrian Kuegel 4a10457d33 [mlir][arith] Fix CmpIOP folding for vector types.
Previously, the folding assumed that it always operates on scalar types.

Differential Revision: https://reviews.llvm.org/D116151
2021-12-22 18:12:24 +01:00
Butygin 28ab10f404 [mlir][memref] ReinterpretCast: allow static sizes/strides/offset where affine map expects dynamic
* There is no reason to forbid that case
* Also, user will get very unfriendly error like `expected result type with offset = -9223372036854775808 instead of 1`

Differential Revision: https://reviews.llvm.org/D114678
2021-12-21 16:20:01 +03:00
Mehdi Amini 02b6fb218e Fix clang-tidy issues in mlir/ (NFC)
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D115956
2021-12-20 20:25:01 +00:00
Butygin c7f96d5ab1 [mlir][scf] Canonicalize nested scf.if's to scf.if + arith.and
Differential Revision: https://reviews.llvm.org/D115930
2021-12-20 21:53:03 +03:00
MaheshRavishankar 4142932a83 [mlir][Linalg] Move named op conversions out of canonicalizations.
These conversions are better suited to be applied at whole tensor
level. Applying these as canonicalizations end up triggering such
canonicalizations at all levels of the stack which might be
undesirable. For example some of the resulting code patterns wont
bufferize in-place and need additional stack buffers. Best is to be
more deliberate in when these canonicalizations apply.

Differential Revision: https://reviews.llvm.org/D115912
2021-12-20 10:19:05 -08:00
Mehdi Amini 7f9e9c7fc3 Move getAsmBlockArgumentNames from OpAsmDialectInterface to OpAsmOpInterface
This method is more suitable as an opinterface: it seems intrinsic to
individual instances of the operation instead of the dialect.
Also remove the restriction on the interface being applicable to the entry block only.

Differential Revision: https://reviews.llvm.org/D116018
2021-12-20 07:18:01 +00:00
bakhtiyar ec0e4545ca Make AsyncParallelForRewrite parameterizable with a cost model which drives deciding the parallelization granularity.
Reviewed By: ezhulenev, mehdi_amini

Differential Revision: https://reviews.llvm.org/D115423
2021-12-19 08:41:01 -08:00
Aaron DeBattista 64f694acaf [mlir][tosa] Move tosa canonicalizers to optional optimization pass
TOSA's canonicalizers that change dense operations should be moved to a
seperate optimization pass to avoid canonicalizing to operations not supported
for relevant backends.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D115890
2021-12-16 23:33:54 -08:00
Mogball 319d8cf685 [mlir][ods] Added EnumAttr, an AttrDef implementation of enum attributes
`EnumAttr` is a pure TableGen implementation of enum attributes using `AttrDef`. This is meant as a drop-in replacement for `StrEnumAttr`, which is soon to be deprecated. `StrEnumAttr` is often used over `IntEnumAttr` because its more readable in MLIR assembly formats. However, storing and manipulating strings is not efficient. Defining `StrEnumAttr` can also be awkward and relies on a lot of special logic in `EnumsGen`, and has some hidden sharp edges.

Also, `EnumAttr` stores the enum directly,  removing the need to convert to/from integers when calling attribute getters on ops.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D115181
2021-12-17 02:55:28 +00:00
Rob Suderman 0763f12213 [mlir][tosa] Handle rescale case where shift > 63
It is possible for the shift value to exceed the number of bits. In these
cases we can just multiply by zero. This is relatively rare occurence but
should be handled.

Reviewed By: not-jenni

Differential Revision: https://reviews.llvm.org/D115779
2021-12-16 15:30:48 -08:00
not-jenni f9cefc7b90 [mlir][tosa] Add tosa.max_pool2d as no-op canonicalization
When the input and output of a pool2d op are both 1x1, it can be canonicalized to a no-op

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D115908
2021-12-16 15:27:26 -08:00
Rob Suderman 9a2308e170 [mlir][tosa] Minor cleanup of tosa.conv2d canonicalizer
Slight rename and better variable type usage in tosa.conv2d to
tosa.fully_connected lowering. Included disabling pass for padded
convolutions.

Reviewed By: not-jenni

Differential Revision: https://reviews.llvm.org/D115776
2021-12-16 15:13:01 -08:00
Mogball 1aa0b84fa4 [mlir][ods] Fix OpFormatGen calling inferReturnTypes before region/segment resolution
The generated parser for ops with type inference calls `inferReturnTypes` before region resolution and segment attribute resolution, i.e. regions and the segment attributes are not passed to the `inferReturnTypes` even though it may need that information.

In particular, an op that has sized operand segments which queries those operands in its `inferReturnTypes` function will crash because the segment attributes hadn't been added yet.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D115782
2021-12-16 19:04:50 +00:00
Alexander Belyaev 88df30c8d8 [mlir] Add canonicalization for extract(tensor.from_elements) in 0d case.
Differential Revision: https://reviews.llvm.org/D115875
2021-12-16 15:46:57 +01:00
Lei Zhang 223be5f630 [mlir][spirv] Perform partial conversion in VectorToSPIRVPass
This allows the pass to participate in progressive lowering
and it also allows us to write tests better.

Along the way, cleaned up the tests.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D115756
2021-12-16 09:35:56 -05:00
Alexander Belyaev f77e9f8768 [mlir] Extend `tensor.from_elements` to support N-D case.
RFC: https://llvm.discourse.group/t/rfc-extend-tensor-fromelementsop-to-n-d/4715

Differential Revision: https://reviews.llvm.org/D115821
2021-12-16 14:52:41 +01:00
Diego Caballero 32fe1a8a25 [mlir][GPU] Extend GPU kernel outlining to generate DL specification
This patch extends the GPU kernel outlining pass so that it can take in
an optional data layout specification that will be attached to the GPU
module operation generated. If the data layout specification is not provided
the default data layout is used instead.

Reviewed By: herhut, mehdi_amini

Differential Revision: https://reviews.llvm.org/D115722
2021-12-16 11:35:53 +00:00
River Riddle 3ee44cb775 [PDLL] Add a `rewrite` statement to enable complex rewrites
The `rewrite` statement allows for rewriting a given root
operation with a block of nested rewriters. The root operation is
not implicitly erased or replaced, and any transformations to it
must be expressed within the nested rewrite block. The inner body
may contain any number of other rewrite statements, variables, or
expressions.

Differential Revision: https://reviews.llvm.org/D115299
2021-12-16 02:08:13 +00:00
River Riddle 12eebb8e37 [PDLL] Add a `replace` rewrite statement for replacing operations
This statement acts as a companion to the existing `erase`
statement, and is the corresponding PDLL construct for the
`PatternRewriter::replaceOp` C++ API. This statement replaces a
given operation with a set of values.

Differential Revision: https://reviews.llvm.org/D115298
2021-12-16 02:08:13 +00:00
River Riddle f62a57a3f0 [PDLL] Add support for tuple types and expressions
Tuples are used to group multiple elements into a single
compound value. The values in a tuple can be of any type, and
do not need to be of the same type. There is also no limit to
the number of elements held by a tuple.

Tuples will be used to support multiple results from
Constraints and Rewrites (added in a followup), and will also
make it easier to support more complex primitives (such as
range based maps that can operate on multiple values).

Differential Revision: https://reviews.llvm.org/D115297
2021-12-16 02:08:13 +00:00
River Riddle 02670c3f38 [PDLL] Add support for `op` Operation expressions
An operation expression in PDLL represents an MLIR operation. In
the match section of a pattern, this expression models one of
the input operations to the pattern. In the rewrite section of
a pattern, this expression models one of the operations to
create. The general structure of the operation expression is very
similar to that of the "generic form" of textual MLIR assembly:

```
let root = op<my_dialect.foo>(operands: ValueRange) {attr = attr: Attr} -> (resultTypes: TypeRange);
```

For now we only model the components that are within PDL, as PDL
gains support for blocks and regions so will this expression.

Differential Revision: https://reviews.llvm.org/D115296
2021-12-16 02:08:12 +00:00
River Riddle d7e7fdf3aa [PDLL] Add support for literal Attribute and Type expressions
This allows for using literal attributes and types within PDLL,
which simplifies building both constraints and rewriters. For
example, checking if an attribute is true is as simple as
`attr<"true">`.

Differential Revision: https://reviews.llvm.org/D115295
2021-12-16 02:08:12 +00:00
River Riddle 322691ab63 [PDLL] Add support for parsing pattern metadata
This allows for overriding the metadata of a pattern and
providing information such as the benefit, bounded recursion,
and more in the future.

Differential Revision: https://reviews.llvm.org/D115294
2021-12-16 02:08:12 +00:00
River Riddle 11d26bd143 [mlir][PDLL] Add an initial frontend for PDLL
This is a new pattern rewrite frontend designed from the ground
up to support MLIR constructs, and to target PDL. This frontend
language was proposed in https://llvm.discourse.group/t/rfc-pdll-a-new-declarative-rewrite-frontend-for-mlir/4798

This commit starts sketching out the base structure of the
frontend, and is intended to be a minimal starting point for
building up the language. It essentially contains support for
defining a pattern, variables, and erasing an operation. The
features mentioned in the proposal RFC (including IDE support)
will be added incrementally in followup commits.

I intend to upstream the documentation for the language in a
followup when a bit more of the pieces have been landed.

Differential Revision: https://reviews.llvm.org/D115093
2021-12-16 02:08:12 +00:00
Hanhan Wang 501674dc3b [mlir][Vector] Further fix to avoid infinite loop in InnerOuterDimReductionConversion
If all the dims are reduction dims, it is already in inner-most/outer-most
reduction form.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D115820
2021-12-15 13:54:15 -08:00
Matthias Springer 417014170b [mlir][linalg][bufferize] Replace remaining bvm usage with new API
* Call `replaceOp` instead of `mapBuffer`.
* Remove bvm and all helper functions around bvm.
* Simplify FuncOp bufferization and rely on existing functionality to generate ToMemrefOps for function BlockArguments.

Differential Revision: https://reviews.llvm.org/D115515
2021-12-15 23:21:39 +09:00
Markus Böck 1d10bddfa3 [mlir][LLVMIR] Add `llvm.umin` and `llvm.umax` intrinsics
Ops for the signed counterparts "llvm.smin" and "llvm.smax" already exist. This patch adds the unsigned versions as well.

Differential Revision: https://reviews.llvm.org/D115796
2021-12-15 13:54:31 +01:00
gysit b7f2c108eb [mlir][linalg] Replace LinalgOps.h and LinalgTypes.h by a single header.
After removing the range type, Linalg does not define any type. The revision thus consolidates the LinalgOps.h and LinalgTypes.h into a single Linalg.h header. Additionally, LinalgTypes.cpp is renamed to LinalgDialect.cpp to follow the convention adopted by other dialects such as the tensor dialect.

Depends On D115727

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115728
2021-12-15 12:15:03 +00:00
Shraiysh Vaishay 3425b1bcb4 [mlir][OpenMP] omp.sections and omp.section lowering to LLVM IR
This patch adds lowering from omp.sections and omp.section (simple lowering along with the nowait clause) to LLVM IR.
Tests for the same are also added.

Reviewed By: ftynse, kiranchandramohan

Differential Revision: https://reviews.llvm.org/D115030
2021-12-15 15:41:12 +05:30
Matthias Springer a5927737da [mlir][linalg][bufferize] Reimplementation of scf.if bufferization
Instead of modifying the existing scf.if op, create a new op with memref OpOperands/OpResults and delete the old op.

New allocations / other memrefs can now be yielded from the op. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.

Differential Revision: https://reviews.llvm.org/D115491
2021-12-15 18:40:54 +09:00
Javier Setoain a4830d14ed [mlir][RFC] Add scalable dimensions to VectorType
With VectorType supporting scalable dimensions, we don't need many of
the operations currently present in ArmSVE, like mask generation and
basic arithmetic instructions. Therefore, this patch also gets
rid of those.

Having built-in scalable vector support also simplifies the lowering of
scalable vector dialects down to LLVMIR.

Scalable dimensions are indicated with the scalable dimensions
between square brackets:

        vector<[4]xf32>

Is a scalable vector of 4 single precission floating point elements.

More generally, a VectorType can have a set of fixed-length dimensions
followed by a set of scalable dimensions:

        vector<2x[4x4]xf32>

Is a vector with 2 scalable 4x4 vectors of single precission floating
point elements.

The scale of the scalable dimensions can be obtained with the Vector
operation:

        %vs = vector.vscale

This change is being discussed in the discourse RFC:

https://llvm.discourse.group/t/rfc-add-built-in-support-for-scalable-vector-types/4484

Differential Revision: https://reviews.llvm.org/D111819
2021-12-15 09:31:37 +00:00
Matthias Springer 7161aa06ef [mlir][linalg][bufferize] Reimplementation of scf.for bufferization
Instead of modifying the existing scf.for op, create a new op with memref OpOperands/OpResults and delete the old op.

New allocations / other memrefs can now be yielded from the loop. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.

This change also introduces `replaceOp`, which will be utilized by all other `bufferize` implementations in future commits. Bufferization will then no longer rely on old (pre-bufferize) ops to DCE away. Instead old ops are deleted on the spot. This improves debuggability because there won't be any duplicate ops anymore (bufferized + not-yet-bufferized) when dumping IR during bufferization. It is also less fragile because unbufferized IR can no longer silently "hang around" due to an implementation bug.

Differential Revision: https://reviews.llvm.org/D114926
2021-12-15 18:29:22 +09:00
gysit 9912bed730 [mlir][linalg] Remove RangeOp and RangeType.
Remove the RangeOp and the RangeType that are not actively used anymore. After removing RangeType, the LinalgTypes header only includes the generated dialect header.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115727
2021-12-15 07:19:10 +00:00
Lei Zhang 96130b5dc7 [mlir][spirv] Support size-1 vector/tensor constant during conversion
Reviewed By: ThomasRaoux, mravishankar

Differential Revision: https://reviews.llvm.org/D115518
2021-12-14 15:58:08 -05:00
Krzysztof Drewniak c57b2a0635 [MLIR][GPU] Make max flat work group size for ROCDL kernels configurable
While the default value for the amdgpu-flat-work-group-size attribute,
"1, 256", matches the defaults from Clang, some users of the ROCDL dialect,
namely Tensorflow, use larger workgroups, such as 1024. Therefore,
instead of hardcoding this value, we add a rocdl.max_flat_work_group_size
attribute that can be set on GPU kernels to override the default value.

Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D115741
2021-12-14 20:12:23 +00:00
Alexander Belyaev a82a19c137 [mlir] Add a missing pattern to bufferize tensor.rank.
Differential Revision: https://reviews.llvm.org/D115745
2021-12-14 20:04:57 +01:00
Alexander Belyaev 15f8f3e20a [mlir] Split std.rank into tensor.rank and memref.rank.
Move `std.rank` similarly to how `std.dim` was moved to TensorOps and MemRefOps.

Differential Revision: https://reviews.llvm.org/D115665
2021-12-14 10:15:55 +01:00
Markus Böck ef5be2bb16 [mlir] Implement `DataLayoutTypeInterface` for `LLVMArrayType`
Implementation of the interface allows querying the size and alignments of an LLVMArrayType as well as query the size and alignment of a struct containing an LLVMArrayType.
The implementation should yield the same results as llvm::DataLayout, including support for over aligned element types.
There is no customization point for adjusting an arrays alignment; it is simply taken from the element type.

Differential Revision: https://reviews.llvm.org/D115704
2021-12-14 09:35:45 +01:00
Benoit Jacob aba437ceb2 [mlir][Vector] Patterns flattening vector transfers to 1D
This is the second part of https://reviews.llvm.org/D114993 after slicing
into 2 independent commits.

This is needed at the moment to get good codegen from 2d vector.transfer
ops that aim to compile to SIMD load/store instructions but that can
only do so if the whole 2d transfer shape is handled in one piece, in
particular taking advantage of the memref being contiguous rowmajor.

For instance, if the target architecture has 128bit SIMD then we would
expect that contiguous row-major transfers of <4x4xi8> map to one SIMD
load/store instruction each.

The current generic lowering of multi-dimensional vector.transfer ops
can't achieve that because it peels dimensions one by one, so a transfer
of <4x4xi8> becomes 4 transfers of <4xi8>.

The new patterns here are only enabled for now by
 -test-vector-transfer-flatten-patterns.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114993
2021-12-13 22:39:41 +00:00
Benoit Jacob 0aea49a730 [mlir][Vector] Patterns flattening vector transfers to 1D
This is the first part of https://reviews.llvm.org/D114993 which has been
split into small independent commits.

This is needed at the moment to get good codegen from 2d vector.transfer
ops that aim to compile to SIMD load/store instructions but that can
only do so if the whole 2d transfer shape is handled in one piece, in
particular taking advantage of the memref being contiguous rowmajor.

For instance, if the target architecture has 128bit SIMD then we would
expect that contiguous row-major transfers of <4x4xi8> map to one SIMD
load/store instruction each.

The current generic lowering of multi-dimensional vector.transfer ops
can't achieve that because it peels dimensions one by one, so a transfer
of <4x4xi8> becomes 4 transfers of <4xi8>.

The new patterns here are only enabled for now by
 -test-vector-transfer-flatten-patterns.

Reviewed By: nicolasvasilache
2021-12-13 21:49:04 +00:00
Stella Laurenzo c10995a8ad Re-apply [NFC] Generalize a couple of passes so they can operate on any FunctionLike op.
* Generalizes passes linalg-detensorize, linalg-fold-unit-extent-dims, convert-elementwise-to-linalg.
* I feel that more work could be done in the future (i.e. make FunctionLike into a proper OpInterface and extend actions in dialect conversion to be trait based), and this patch would be a good record of why that is useful.
* Note for downstreams:
  * Since these passes are now generic, they do not automatically nest with pass managers set up for implicit nesting.
  * The Detensorize pass must run on a FunctionLike, and this requires explicit nesting.
* Addressed missed comments from the original and per-suggestion removed the assert on FunctionLike in ElementwiseToLinalg and DropUnitDims.cpp, which also is what was causing the integration test to fail.

This reverts commit aa8815e42e.

Differential Revision: https://reviews.llvm.org/D115671
2021-12-13 13:33:00 -08:00
Aart Bik 312c51406d [mlir][sparse] python driven test for SDDMM
explores various sparsity combinations of
the SDMM kernel and verifies that the computed
result is the same for all cases

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115476
2021-12-13 12:48:55 -08:00
Mehdi Amini aa8815e42e Revert "[NFC] Generalize a couple of passes so they can operate on any FunctionLike op."
This reverts commit 34696e6542.

A test is crashing on the mlir-nvidia bot.
2021-12-13 20:41:25 +00:00
Bixia Zheng 2f49e6b0db Support sparse tensor output.
Add convertFromMLIRSparseTensor to the supporting C shared library to convert
SparseTensorStorage to COO-flavor format.

Add Python routine sparse_tensor_to_coo_tensor to convert sparse tensor storage
pointer to numpy values for COO-flavor format tensor.

Add a Python test for sparse tensor output.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D115557
2021-12-13 12:06:33 -08:00
Stella Laurenzo 34696e6542 [NFC] Generalize a couple of passes so they can operate on any FunctionLike op.
* Generalizes passes linalg-detensorize, linalg-fold-unit-extent-dims, convert-elementwise-to-linalg.
* I feel that more work could be done in the future (i.e. make FunctionLike into a proper OpInterface and extend actions in dialect conversion to be trait based), and this patch would be a good record of why that is useful.
* Note for downstreams:
  * Since these passes are now generic, they do not automatically nest with pass managers set up for that.
  * If running them over nested functions, you must nest explicitly. Upstream has adopted this style but *-opt still has some uses of implicit pipelines via args. See tests for argument changes needed.

Differential Revision: https://reviews.llvm.org/D115645
2021-12-13 12:01:53 -08:00
gysit 8fc0525a15 [mlir][linalg] Stage application of pad tensor op vectoriztaion.
Adapt the LinalgStrategyVectorizationPattern pass to apply the vectorization patterns in two stages. The change ensures the generic pad tensor op vectorization pattern does not run too early. Additionally, the revision adds the transfer op canonicalization patterns to the set of applied patterns, since they are needed to enable efficient vectorization for rank-reduced convolutions.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115627
2021-12-13 19:49:35 +00:00
Lei Zhang 5e55a20119 [mlir][spirv] Serialize selection with separate header block
The previous "optimization" that tries to reuse existing block for
selection header block can be problematic for deserialization
because it effectively pulls in previous ops in the selection op's
enclosing block into the selection op's header. When deserializing,
those ops will be placed in the selection op's region. If any of
the previous ops has usage after the section op, it will break. That
is, the following IR cannot round trip:

```mlir
^bb:
  %def = ...
  spv.mlir.selection { ... }
  %use = spv.SomeOp %def
```

This commit removes the "optimization" to always create new blocks
for the selection header.

Along the way, also made error reporting better in deserialization
by turning asserts into proper errors and add check of uses outside
of sinked structured control flow region blocks.

Reviewed By: Hardcode84

Differential Revision: https://reviews.llvm.org/D115582
2021-12-13 10:42:26 -05:00
Mogball 843534db3c [mlir][ods] Fix OpDefinitionsGen infer return types builder with regions
Despite handling regions and inferred return types, the builder was never generated for ops with both InferReturnTypeOpInterface and regions.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D115525
2021-12-13 15:11:35 +00:00
gysit 6c85a49e22 [mlir][memref] Use current source type in getCanonicalSubViewResultType.
Use the current instead of the new source type to compute the rank-reduction map in getCanonicalSubViewResultType. Otherwise, the computation of the rank-reduction map fails when folding a cast into a subview since the strides of the new source type cannot be related to the strides of the current result type.

Depends On D115428

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115446
2021-12-13 14:50:41 +00:00
Markus Böck 664cc9312c [mlir] Implement `DataLayoutTypeInterface` for `LLVMStructType`
Using this implementation of the interface it is possible to query the size, ABI alignment as well as the preferred alignment of a struct. It should yield the same results as LLVMs `llvm::DataLayout` on an equivalent `llvm::StructType`, including for packed structs.

Additionally it is also possible to increase the ABI and preferred alignment using a data layout entry with the type `llvm.struct<()>, which serves the same functionality as the `a:` component in LLVMs data layout string.

Differential Revision: https://reviews.llvm.org/D115600
2021-12-13 15:09:16 +01:00
gysit db7a2e9176 [mlir][linalg] Only compose PadTensorOps if no ExtractSliceOp is rank-reducing.
Do not compose pad tensor operations if the extract slice of the outer pad tensor operation is rank reducing. The inner extract slice op cannot be rank-reducing since it source type must match the desired type of the padding.

Depends On D115359

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115428
2021-12-13 13:01:30 +00:00
gysit 6859f8ed1e [mlir][linalg] Adapt the PadTensorOpVectorizationWithInsertSlicePattern matching.
Tighten the matcher of the PadTensorOpVectorizationWithInsertSlicePattern pattern. Only match if the PadOp result is used by the InsertSliceOp source. Fail if the result is used by the InsertSliceOp dest.

Depends On D115336

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115359
2021-12-13 12:55:07 +00:00
gysit f895e95138 [mlir][linalg] Make padding work for rank-reducing slice ops.
Adapt the computation of a static bounding box to take rank-reducing slice operations into account by filtering out reduced size one dimensions. The revision is needed to make padding work for decomposed convolution operations. The decomposition introduces rank reducing extract slice operations that previously let padding fail.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115336
2021-12-13 12:34:20 +00:00
Jacques Pienaar b743ff161b [mlir] Relax restriction on name location parsing
We currently restrict parsing of location to not allow nameloc being
nested inside nameloc. This restriction may be historical as there
doesn't seem to be a reason for it anymore (locations like this can be
constructed in C++ and they print fine). Relax this restriction in the
parser to allow this nesting.

Differential Revision: https://reviews.llvm.org/D115581
2021-12-12 08:06:59 -08:00
Jacques Pienaar efb7727a96 [mlir] Flag near misses in file splitting
Flags some potential cases where splitting isn't happening and so could result
in confusing results. Also update some test files where there were near misses
in splitting that seemed unintentional.

Differential Revision: https://reviews.llvm.org/D109636
2021-12-12 08:03:30 -08:00
Nicolas Vasilache 408553dd96 [mlir][Vector] Support 0-D vectors in `CreateMaskOp`
The 0-D case gets lowered in almost the same way that the 1-D case does
in VectorCreateMaskOpConversion. I also had to slightly update the
verifier for the op to always require exactly 1 operand in the 0-D case.

Depends On D115220

Reviewed by: ftynse

Differential revision: https://reviews.llvm.org/D115221
2021-12-12 13:32:29 +00:00
Michal Terepeta a0c930d312 [mlir][Vector] Support 0-D vectors in `CmpIOp`
Following the example of `VectorOfAnyRankOf`, I've done a few changes in the
`.td` files to help with adding the support for the 0-D case gradually.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D115220
2021-12-12 13:28:26 +00:00
Lei Zhang 731676b10d [mlir][spirv] Fix nested control flow serialization
If we have a `spv.mlir.selection` op nested in a `spv.mlir.loop`
op, when serializing the loop's block, we might need to jump
from the selection op's merge block, which might be different
than the immediate MLIR IR predecessor block. But we still need
to get the block argument from the MLIR IR predecessor block.

Also, if the `spv.mlir.selection` is in the `spv.mlir.loop`'s
header block, we need to make sure `OpLoopMerge` is emitted
in the current block before start processing the nested selection
op. Otherwise we'll see the LoopMerge in the wrong SPIR-V
basic block.

Reviewed By: Hardcode84

Differential Revision: https://reviews.llvm.org/D115560
2021-12-11 14:47:19 -05:00
Jacques Pienaar 1ab3efac41 [mlir][python] Add fused location 2021-12-11 10:16:13 -08:00
Nicolas Vasilache f2e945a393 Revert "[mlir][tensor] Fix insert_slice + tensor cast overflow"
This reverts commit 5601821dae.

The prefix + canonical complete behavior is actually obsolete and should not be reintroduced.
Reverting.
2021-12-10 22:53:52 +00:00
Nicolas Vasilache 5601821dae [mlir][tensor] Fix insert_slice + tensor cast overflow
InsertSliceOp may have subprefix semantics where missing trailing dimensions
are automatically inferred directly from the operand shape.
This revision fixes an overflow that occurs in such cases when the impl is based on the op rank.

Differential Revision: https://reviews.llvm.org/D115549
2021-12-10 21:41:26 +00:00
River Riddle 233e9476d8 [mlir:PDL] Allow non-bound pdl.attribute/pdl.type operations that create constants
This allows for passing in these attributes/types to constraints/rewrites as arguments.

Differential Revision: https://reviews.llvm.org/D114817
2021-12-10 19:38:43 +00:00
River Riddle 06c3b9c7be [mlir:PDL] Fix bugs in PDLPatternModule merging
* Constraints/Rewrites registered before a pattern was added were dropped
* Constraints/Rewrites may be registered multiple times (if different pattern sets depend on them)
* ModuleOp no longer has a terminator, so we shouldn't be removing the terminator from it

Differential Revision: https://reviews.llvm.org/D114816
2021-12-10 19:38:43 +00:00
River Riddle 98f5bd3489 [mlir:PDL] Adjust the assembly format for AttributeOp to avoid conflicts with DictionaryAttr
Switch the attribute creation operations to use attr-dict-with-
keyword to avoid conflicts (in the case of pdl.attribute) and
confusion(in the case of pdl_interp.create_attribute) with
having a DictionaryAttr as a value and specifying the
attributes of the operation itself (as a dictionary).

Differential Revision: https://reviews.llvm.org/D114815
2021-12-10 19:38:42 +00:00
River Riddle 9debc35f02 [mlir:PDL] Fix assembly format for pdl.apply_native_rewrite
The results of a rewrite are optional, but we currently require
them to be present in the assembly format. This commit
makes the results component in the format optional.

Differential Revision: https://reviews.llvm.org/D114814
2021-12-10 19:38:42 +00:00
Mogball e40624ae60 [mlir][ods] Fix OpFormatGen sometimes not calling inferReturnTypes
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D115522
2021-12-10 19:35:56 +00:00
Mogball 0845635eda [mlir][ir] Custom ops' parse/print fall back to dialect hooks
Custom ops that have no parser or printer should fall back to the dialect's parser and/or printer hooks. This avoids the need to define parsers and printers that simply dispatch to the dialect hook.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D115481
2021-12-10 19:34:25 +00:00
Alexander Belyaev b618880e7b [mlir] Move `linalg.tensor_expand/collapse_shape` to TensorDialect.
RFC: https://llvm.discourse.group/t/rfc-reshape-ops-restructuring/3310

linalg.fill gets a canonicalizer, because `FoldFillWithTensorReshape` cannot be moved to tensorops (it uses linalg::FillOp inside). Before it was listed as a canonicalization pattern for the reshape operations, now it became a canonicalization for FillOp.

Differential Revision: https://reviews.llvm.org/D115502
2021-12-10 12:11:48 +01:00
Rob Suderman 46c96fca0e [mlir][tosa] Fix quantized type for tosa.conv2d canonicalization
Wrong type was used for the result type in the tosa.conv_2d canonicalization.
The type should match the result element type should match the result type
not the input element type.

Differential Revision: https://reviews.llvm.org/D115463
2021-12-09 12:39:23 -08:00
Aart Bik 880021df13 [mlir][sparse] reenable asan for sampled mm integration test
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115364
2021-12-09 12:07:56 -08:00
MaheshRavishankar 9cfd8d7c6c [mlir][Vector] Avoid infinite loop in InnerOuterDimReductionConversion.
This patterns tries to convert an inner (outer) dim reduction to an
outer (inner) dim reduction. Doing this on a 1D or 0D vector results
in an infinite loop since the converted op is same as the original
operation. Just returning failure when source rank <= 1 fixes the
issue.

Differential Revision: https://reviews.llvm.org/D115426
2021-12-09 09:30:05 -08:00
Bixia Zheng 64e171c2d0 Avoid unnecessary output buffer allocation and initialization.
The sparse tensor code generator allocates memory for the output tensor. As
such, we only need to allocate a MemRefDescriptor to receive the output tensor
and do not need to allocate and initialize the storage for the tensor.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D115292
2021-12-09 08:29:02 -08:00
Krzysztof Drewniak e1da62910e [MLIR][GPU] Define gpu.printf op and its lowerings
- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments
- Define the lowering of gpu.pirntf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered

This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.

And:
[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels

This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.

In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D110448
2021-12-09 15:54:31 +00:00
Eugene Zhulenev 49ce40e9ab [mlir] AsyncParallelFor: align block size to be a multiple of inner loops iterations
Depends On D115263

By aligning block size to inner loop iterations parallel_compute_fn LLVM can later unroll and vectorize some of the inner loops with small number of trip counts. Up to 2x speedup in multiple benchmarks.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D115436
2021-12-09 06:50:50 -08:00
Eugene Zhulenev 9f151b784b [mlir] AsyncParallelFor: sink constants into the parallel compute function
With complex recursive structure of async dispatch function LLVM can't always propagate constants to the parallel_compute_fn and it often prevents optimizations like loop unrolling and vectorization. We help LLVM by pushing known constants into the parallel_compute_fn explicitly.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D115263
2021-12-09 06:48:23 -08:00
Matthias Springer cc45a13422 [mlir][linalg][bufferize] LinalgOps can bufferize inplace with input args
LinalgOp results usually bufferize inplace with output args. With this change, they may buffer inplace with input args if the value of the output arg is not used in the computation.

Differential Revision: https://reviews.llvm.org/D115022
2021-12-09 21:54:54 +09:00
Shraiysh Vaishay d82c1f4e4b [MLIR][OpenMP] Added omp.atomic.update
This patch supports the atomic construct (update) following section 2.17.7 of OpenMP 5.0 standard. Also added tests and verifier for the same.

Reviewed By: kiranchandramohan, peixin

Differential Revision: https://reviews.llvm.org/D112982
2021-12-09 15:21:24 +05:30
Nicolas Vasilache d69f5e197c [mlir][memref] Fix subview offset verification.
Offset-specific verification seems to have been lost in one of the recent refactorings.
Also add proper tests that would have caught this omission.

This addresses the immediate issues discussed in:
https://llvm.discourse.group/t/memref-subview-affine-map-and-symbols/4851

Differential Revision: https://reviews.llvm.org/D115427
2021-12-09 07:44:51 +00:00
MaheshRavishankar 6d7c9c3d0e [mlir][Linalg] Bufferize the region of LinalgOps as well.
The region of `linalg.generic` might contain `tensor` operations. For
example, current lowering of `gather` uses a `tensor.extract` in the
body of the `LinalgOp`. Bufferize the ops within a `LinalgOp` region
as well to catch such cases.

Differential Revision: https://reviews.llvm.org/D115322
2021-12-08 22:36:01 -08:00
Rob Suderman 23149d522b [mlir] Added ctlz and cttz to math dialect and LLVM dialect
Count leading/trailing zeros are an existing LLVM intrinsic. Added LLVM
support for the intrinsics with lowerings from the math dialect to LLVM
dialect.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D115206
2021-12-08 14:32:15 -08:00
Butygin d8fce785de [mlir][spirv] math.erf OpenCL lowering
Differential Revision: https://reviews.llvm.org/D115335
2021-12-08 21:59:46 +03:00
Thomas Raoux 579c1ff67d [mlir][nvvm] Add async copy ops to nvvm dialect
Differential Revision: https://reviews.llvm.org/D115314
2021-12-08 09:42:20 -08:00
Matthias Springer 847710f7b7 [mlir][linalg][bufferize] Add dialect filter to BufferizationOptions
This adds a new option `dialectFilter` to BufferizationOptions. Only ops from dialects that are allow-listed in the filter are bufferized. Other ops are left unbufferized. Note: This option requires `allowUnknownOps = true`.

To make use of `dialectFilter`, BufferizationOptions or BufferizationState must be passed to various helper functions.

The purpose of this change is to provide a better infrastructure for partial bufferization, which will be fully activated in a subsequent change.

Differential Revision: https://reviews.llvm.org/D114691
2021-12-08 23:51:18 +09:00
Mehdi Amini be0a7e9f27 Adjust "end namespace" comment in MLIR to match new agree'd coding style
See D115115 and this mailing list discussion:
https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html

Differential Revision: https://reviews.llvm.org/D115309
2021-12-08 06:05:26 +00:00
Mehdi Amini ee0908703d Change the printing/parsing behavior for Attributes used in declarative assembly format
The new form of printing attribute in the declarative assembly is eliding the `#dialect.mnemonic` prefix to only keep the `<....>` part.

Differential Revision: https://reviews.llvm.org/D113873
2021-12-08 02:02:37 +00:00
Aart Bik e1b9d80532 [mlir][sparse] add a few more sparse output tests (for generated IR)
also fixes two typos in IR doc

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115288
2021-12-07 15:31:29 -08:00
Aart Bik 4f2ec7f983 [mlir][sparse] finalize sparse output in the presence of reductions
This revision implements sparse outputs (from scratch) in all cases where
the loops can be reordered with all but one parallel loops outer. If the
inner parallel loop appears inside one or more reductions loops, then an
access pattern expansion is required (aka. workspaces in TACO speak).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D115091
2021-12-07 10:54:29 -08:00
Rob Suderman e9fae0f19e [mlir][tosa] Disable tosa.depthwise_conv2d canonicalizer for quantized case
Quantized case needs to include zero-point corrections before the tosa.mul.
Disabled for the quantized use-case.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D115264
2021-12-07 10:16:12 -08:00
Matthias Springer 8a232632c5 [mlir][linalg][bufferize] Add FuncOp bufferization pass
This passes bufferizes FuncOp bodies, but not FuncOp boundaries.

Differential Revision: https://reviews.llvm.org/D114671
2021-12-07 21:44:26 +09:00
Shraiysh Vaishay 31cf42bd9a [mlir][OpenMP] Added omp.atomic.read lowering
This patch adds lowering from omp.atomic.read to LLVM IR along with the
memory ordering clause. Tests for the same are also added.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D115134
2021-12-07 11:17:30 +05:30
not-jenni 5911a29aa9 [mlir][tosa] Add tosa.depthwise_conv2d as tosa.mul canonicalization
For a 1x1 weight and stride of 1, the input/weight can be reshaped and
multiplied elementwise then reshaped back

Reviewed By: rsuderman, KoolJBlack

Differential Revision: https://reviews.llvm.org/D115207
2021-12-06 17:28:52 -08:00
Rob Suderman 05e33d846f [mlir][tosa] Resubmit add tosa.conv2d as tosa.fully_connected canonicalization
Fixed the tosa.conv2d to tosa.fully_connected canonicalization for incorrect
output channels. Included uptes to tests to include checks for the result
shapes during canonicalization.

This allows conv2d to transform to the simpler fully_connected operation.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D115170
2021-12-06 15:33:07 -08:00
Eugene Zhulenev 68a7c001ad [mlir] Improve async parallel for tests + fix typos
Do load and store to verify that we process each element of the iteration space once.

Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D115152
2021-12-06 13:27:54 -08:00
Rob Suderman c5fef77bc3 [mlir] Add CtPop to MathOps with lowering to LLVM
math.ctpop maths to the llvm.ctpop intrinsic.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D114998
2021-12-06 11:54:20 -08:00
Alex Zinenko d64b3e47ba [mlir] Avoid needlessly converting LLVM named structs with compatible elements
Conversion of LLVM named structs leads to them being renamed since we cannot
modify the body of the struct type once it is set. Previously, this applied to
all named struct types, even if their element types were not affected by the
conversion. Make this behvaior only applicable when element types are changed.
This requires making the LLVM dialect type-compatibility check recursively look
at the element types (arguably, it should have been doing than since the moment
the LLVM dialect type system stopped being closed). In addition, have a more
lax check for outer types only to avoid repeated check when necessary (e.g.,
parser, verifiers that are going to also look at the inner type).

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D115037
2021-12-06 13:42:11 +01:00
Matthias Springer e9fb4dc9e9 [mlir][linalg][bufferize] Remove buffer equivalence from bufferize
Remove all function calls related to buffer equivalence from bufferize implementations.

Add a new PostAnalysisStep for scf.for that ensures that yielded values are equivalent to the corresponding BBArgs. (This was previously checked in `bufferize`.) This will be relaxed in a subsequent commit.

Note: This commit changes two test cases. These were broken by design
and should not have passed. With the new scf.for PostAnalysisStep, this
bug was fixed.

Differential Revision: https://reviews.llvm.org/D114927
2021-12-06 17:48:31 +09:00
Matthias Springer cb4d0bf997 [mlir][linalg][bufferize][NFC] Collect equivalent FuncOp BBArgs in PostAnalysisStep
Collect equivalent BBArgs right after the equivalence analysis of the FuncOp and before bufferizing. This is in preparation of decoupling bufferization from aliasInfo.

Also gather equivalence info for CallOps, which was missing in the
previous commit.

Differential Revision: https://reviews.llvm.org/D114847
2021-12-06 17:31:39 +09:00
Michal Terepeta caf89c0db6 [mlir][Vector] Support 0-D vectors in `ConstantMaskOp`
To support creating both a mask with just a single `true` and `false` values,
I had to relax the restriction in the verifier that the rank is always equal to
the length of the attribute array, in other words, we now allow:

- `vector.constant_mask [0] : vector<i1>` which gets lowered to
  `arith.constant dense<false> : vector<i1>`
- `vector.constant_mask [1] : vector<i1>` which gets lowered to
  `arith.constant dense<true> : vector<i1>`

(the attribute list for the 0-D case must be a singleton containing
either `0` or `1`)

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D115023
2021-12-06 08:03:04 +00:00
Mehdi Amini afb0582325 Fix TOSA verifier to emit verbose errors
Also as a test for invalid ops which was missing.
2021-12-05 19:16:54 +00:00
Butygin 91072b74f8 [mlir] Add InlinerInterface to bufferization dialect
Differential Revision: https://reviews.llvm.org/D115080
2021-12-04 23:45:56 +03:00
Hugo Pompougnac 5d49511b30 Apply the permutation map on each affine nest
When using -test-loop-permutation="permutation-map=...", applies the
permutation map on each affine nest in the function (and not only the
first one). If the size of the permutation map and the size of a nest
are not consistent, do nothing on this particular nest (instead of
making MLIR crash).

Differential Revision: https://reviews.llvm.org/D112947
2021-12-04 17:48:34 +05:30
Uday Bondhugula 2108ed0671 [MLIR] Fix affine.for unroll for multi-result upper bound maps
Fix affine.for unroll for multi-result upper bound maps: these can't be
unrolled/unroll-and-jammed in cases where the trip count isn't known to
be a multiple of the unroll factor.

Fix and clean up repeated/unnecessary checks/comments at helper callees.

Also, fix clang-tidy variable naming warnings and redundant includes.

Differential Revision: https://reviews.llvm.org/D114662
2021-12-04 07:20:26 +05:30
River Riddle 7169996159 [mlir] Allow shape dimensions larger than 2^32
Internally we use int64_t to hold shapes, but for some
reason the parser was limiting shapes to unsigned. This
change updates the parser to properly handle int64_t shape
dimensions.

Differential Revision: https://reviews.llvm.org/D115086
2021-12-04 01:29:50 +00:00
Uday Bondhugula d20249fde6 [MLIR] NFC. Rename test cases in test/mlir-cpu-runner per convention
Test case files at most places in MLIR uses hyphens and not underscores.
A counter-pattern was somehow started to use underscores in some places.
Rename test cases in test/mlir-cpu-runner to use hyphens so that it's
consistent at least within its directory.

Differential Revision: https://reviews.llvm.org/D114672
2021-12-04 06:53:39 +05:30
Mehdi Amini 48fb79effb Improve error message when declarativeAssembly contains invalid literals
Differential Revision: https://reviews.llvm.org/D115085
2021-12-04 00:27:32 +00:00
wren romano 4748cc6931 [mlir][sparse] Adding a stress test
Addresses https://bugs.llvm.org/show_bug.cgi?id=52410
Depends on D114192

Reviewed By: aartbik, mehdi_amini

Differential Revision: https://reviews.llvm.org/D114118
2021-12-03 14:59:39 -08:00
natashaknk e2d8b60742 Revert "[mlir][tosa] Add tosa.conv2d as fully_connected canonicalization"
This reverts commit 13bdb7ab4a. The commit introduced/uncovered an unintended bug in models containing Conv2D.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D115079
2021-12-03 14:35:48 -08:00
Alex Zinenko 9dd1f8dfdd [mlir] support recursive type conversion of named LLVM structs
A previous commit added support for converting elemental types contained in
LLVM dialect types in case they were not compatible with the LLVM dialect. It
was missing support for named structs as they could be recursive, which was not
supported by the conversion infra. Now that it is, add support for converting
such named structs.

Depends On D113579

Reviewed By: wsmoses

Differential Revision: https://reviews.llvm.org/D113580
2021-12-03 12:41:40 +01:00
Matthias Springer ad1ba42f68 [mlir][linalg][bufferize] Allow unbufferizable ops in input
Allow ops that are not bufferizable in the input IR. (Deactivated by default.)

bufferization::ToMemrefOp and bufferization::ToTensorOp are generated at the bufferization boundaries.

Differential Revision: https://reviews.llvm.org/D114669
2021-12-03 20:20:46 +09:00
Michal Terepeta 1423e8bf5d [mlir][Vector] Support 0-D vectors in `BitCastOp`
The implementation only allows to bit-cast between two 0-D vectors. We could
probably support casting from/to vectors like `vector<1xf32>`, but I wasn't
convinced that this would be important and it would require breaking the
invariant that `BitCastOp` works only on vectors with equal rank.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114854
2021-12-03 08:55:59 +00:00
Michal Terepeta 8e2b373396 [mlir][Vector] Add some missing tests for `broadcast` and `splat`
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114853
2021-12-03 08:52:51 +00:00
Matthias Springer d30fcadf07 [mlir][linalg][bufferize] Op interface implementation for Bufferization dialect ops
This change provides `BufferizableOpInterface` implementations for ops from the Bufferization dialects. These ops are needed at the bufferization boundaries for partial bufferization.

Differential Revision: https://reviews.llvm.org/D114618
2021-12-03 16:25:44 +09:00
Matthias Springer 4479138de8 [mlir][linalg][bufferize] Bufferization of tensor.insert
This is a lightweight operation, useful for writing unit tests. It will be utilized for testing in subsequent commits.

Differential Revision: https://reviews.llvm.org/D114693
2021-12-02 11:58:01 +09:00
Mogball 71668a9367 [mlir][ods][nfc] fixing test cases 2021-12-01 18:50:02 +00:00
Mogball ca6bd9cd43 [mlir][ods] AttrOrTypeGen uses Class
AttrOrType def generator uses `Class` code gen helper,
instead of naked raw_ostream.

Depends on D113714 and D114807

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D113715
2021-12-01 16:53:23 +00:00
Nicolas Vasilache c537a94334 [mlir][Vector] Thread 0-d vectors through vector.transfer ops
This revision adds 0-d vector support to vector.transfer ops.
In the process, numerous cleanups are applied, in particular around normalizing
and reducing the number of builders.

Reviewed By: ThomasRaoux, springerm

Differential Revision: https://reviews.llvm.org/D114803
2021-12-01 16:49:43 +00:00
Stephan Herhut 9fce961d2f [mlir][linalg] Disable tensor-matmul test under asan
The test is currently leaky. Disabling it to make the bots green.

Differential Revision: https://reviews.llvm.org/D114857
2021-12-01 16:25:31 +01:00
Matthias Springer 2fd0ea960c [mlir][linalg][bufferize] CallOps do not bufferize to memory writes
However, since CallOps have no aliasing OpResults, their OpOperands always bufferize out-of-place.

This change removes `bufferizesToMemoryWrite` from `CallOpInterface`. This method was called, but its return value did not matter.

Differential Revision: https://reviews.llvm.org/D114616
2021-12-01 18:47:28 +09:00
Thomas Raoux 69a8a7cf2d [mlir] Make sure linearizeCollapsedDims doesn't drop input map dims
The new affine map generated by linearizeCollapsedDims should not drop
dimensions. We need to make sure we create a map with at least as many
dimensions as the source map. This prevents
FoldProducerReshapeOpByLinearization from generating invalid IR.

This solves regression in IREE due to e4e4da86af

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D114838

This reverts commit 9a844c2a9b.
2021-11-30 22:51:56 -08:00
MaheshRavishankar 9a844c2a9b Revert "[mlir] Make sure linearizeCollapsedDims doesn't drop input map dims"
This reverts commit bc38673e4d.
2021-11-30 22:43:46 -08:00
MaheshRavishankar bc38673e4d [mlir] Make sure linearizeCollapsedDims doesn't drop input map dims
The new affine map generated by linearizeCollapsedDims should not drop
dimensions. We need to make sure we create a map with at least as many
dimensions as the source map. This prevents
FoldProducerReshapeOpByLinearization from generating invalid IR.

This solves regression in IREE due to e4e4da86af

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D114838
2021-11-30 22:37:53 -08:00