Commit Graph

6883 Commits

Author SHA1 Message Date
Hanhan Wang 21895a2bef [mlir][linalg] Reuse the symbol if attribute uses are identical.
Depends On D97312

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97383
2021-02-24 11:42:13 -08:00
Hanhan Wang 705068cb8c [mlir][linalg] Support for using output values in TC definitions.
This will allow us to define select(pred, in, out) for TC ops, which is useful
for pooling ops.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97312
2021-02-24 11:37:45 -08:00
Weiwei Li ce2ad938ff [mlir][spirv] Define spv.GLSL.Ldexp
co-authored-by: Alan Liu <alanliu.yf@gmail.com>

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97228
2021-02-24 13:07:46 -05:00
Lei Zhang 5f8a80882b [mlir] Add constBuilderCall to TypeAttr to simplify builders
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D97344
2021-02-24 13:04:03 -05:00
Eugene Zhulenev ce976d2db3 [mlir] Add polynomial approximation for math::LogOp (using builders API)
Replace math::LogOp with an approximations from the the Julien Pommier's SSE math library

Link: http://gruntthepeon.free.fr/ssemath

Reviewed By: asaadaldien

Differential Revision: https://reviews.llvm.org/D97304
2021-02-24 07:50:25 -08:00
Alexander Belyaev 7377ef9357 [mlir] Add a builder to `linalg.tiled_loop`.
https://llvm.discourse.group/t/rfc-add-linalg-tileop/2833

Differential Revision: https://reviews.llvm.org/D97372
2021-02-24 14:47:27 +01:00
Christian Sigg eb8d6af5e4 [mlir] Specify cuda-runner pass pipeline as command line options.
The cuda-runner registers two pass pipelines for nested passes,
so that we don't have to use verbose textual pass pipeline specification.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D97091
2021-02-24 14:36:52 +01:00
Alexander Belyaev 945b76d428 [mlir][linalg] Fix Linalg roundtrip test.
The test did not check whether the operations can be parsed again after
printing them once.

Differential Revision: https://reviews.llvm.org/D97368
2021-02-24 11:31:09 +01:00
River Riddle 59f0e4627a [mlir][Inliner] Don't optimize callees in async mode if there is only one to optimize
This avoids unnecessary async overhead in situations that won't benefit from it.
2021-02-23 18:44:09 -08:00
Kern Handa 3c4cdd0b6a [mlir] ExecutionEngine needs special handling for COFF binaries
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97141
2021-02-23 17:34:19 -08:00
River Riddle 16a50c9e64 [mlir][Inliner] Keep the number of async pass managers constant
This prevents a bug in the pass instrumentation implementation where the main thread would end up with a different pass manager in different runs of the pass.
2021-02-23 16:40:24 -08:00
River Riddle abd3c6f24c [mlir][Inliner] Use llvm::parallelForEach instead of llvm::parallelTransformReduce
llvm::parallelTransformReduce does not schedule work on the caller thread, which becomes very costly for
the inliner where a majority of SCCs are small, often ~1 element. The switch to llvm::parallelForEach solves this,
and also aligns the implementation with the PassManager (which realistically should share the same implementation).

This change dropped compile time on an internal benchmark by ~1(25%) second.

Differential Revision: https://reviews.llvm.org/D96086
2021-02-23 14:36:45 -08:00
River Riddle 65a3197a8f [mlir] Refactor InterfaceMap to use a sorted vector of interfaces, as opposed to a DenseMap
A majority of operations have a very small number of interfaces, which means that the cost of using a hash map is generally larger for interface lookups than just a binary search. In the future when there are a number of operations with large amounts of interfaces, we can switch to a hybrid approach that optimizes lookups based on the number of interfaces. For now, however, a binary search is the best approach.

This dropped compile time on a largish TF MLIR module by 20%(half a second).

Differential Revision: https://reviews.llvm.org/D96085
2021-02-23 14:36:45 -08:00
Aart Bik 17fa919847 [mlir][sparse] incorporate vector index into address computation
When computing dense address, a vectorized index must be accounted
for properly. This bug was formerly undetected because we get 0 * prev + i
in most cases, which folds away the scalar part. Now it works for all cases.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97317
2021-02-23 13:25:51 -08:00
Adam Straw af8adea155 make Affine parallel and yield ops MemRefsNormalizable
Affine parallel ops may contain and yield results from MemRefsNormalizable ops in the loop body.  Thus, both affine.parallel and affine.yield should have the MemRefsNormalizable trait.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D96821
2021-02-23 10:16:47 -08:00
Nicolas Vasilache 8cf14b8dec [mlir][Linalg] Retire hoistViewAllocOps.
This transformation was only used for quick experimentation and is not general enough.
Retire it.

Differential Revision: https://reviews.llvm.org/D97266
2021-02-23 11:45:19 +00:00
Nicolas Vasilache 551ba72760 [mlir] NFC - Use declarative assembly for scf::YieldOp 2021-02-23 11:17:30 +00:00
Frederik Gossen 1fff7c8924 Fix unused variable 2021-02-23 11:19:35 +01:00
River Riddle dc6a84fce6 [mlir] Add support for DebugCounters using the new DebugAction infrastructure
DebugCounters allow for selectively enabling the execution of a debug action based upon a "counter". This counter is comprised of two components that are used in the control of execution of an action, a "skip" value and a "count" value. The "skip" value is used to skip a certain number of initial executions of a debug action. The "count" value is used to prevent a debug action from executing after it has executed for a set number of times (not including any executions that have been skipped). For example, a counter for a debug action with `skip=47` and `count=2`, would skip the first 47 executions, then execute twice, and finally prevent any further executions.

This is effectively the same as the DebugCounter infrastructure in LLVM, but using the DebugAction infrastructure in MLIR. We can't simply reuse the DebugCounter support already present in LLVM due to its heavy reliance on global constructors (which are not allowed in MLIR). The DebugAction infrastructure already nicely supports the debug counter use case, and promotes the separation of policy and mechanism design philosophy.

Differential Revision: https://reviews.llvm.org/D96395
2021-02-23 01:01:17 -08:00
River Riddle 72d5afa4ac [mlir] Add a new debug action framework.
This revision adds the infrastructure for `Debug Actions`. This is a DEBUG only
API that allows for external entities to control various aspects of compiler
execution. This is conceptually similar to something like DebugCounters in LLVM, but at a lower level. This framework doesn't make any assumptions about how the higher level driver is controlling the execution, it merely provides a framework for connecting the two together. This means that on top of DebugCounter functionality, we could also provide more interesting drivers such as interactive execution. A high level overview of the workflow surrounding debug actions is
shown below:

*   Compiler developer defines an `action` that is taken by the a pass,
    transformation, utility that they are developing.
*   Depending on the needs, the developer dispatches various queries, pertaining
    to this action, to an `action manager` that will provide an answer as to
    what behavior the action should do.
*   An external entity registers an `action handler` with the action manager,
    and provides the logic to resolve queries on actions.

The exact definition of an `external entity` is left opaque, to allow for more
interesting handlers.

This framework was proposed here: https://llvm.discourse.group/t/rfc-debug-actions-in-mlir-debug-counters-for-the-modern-world

Differential Revision: https://reviews.llvm.org/D84986
2021-02-23 00:52:17 -08:00
KareemErgawy-TomTom 67e0d58de4 [MLIR][LinAlg] Start detensoring implementation.
This commit is the first baby step towards detensoring in
linalg-on-tensors.

Detensoring is the process through which a tensor value is convereted to one
or potentially more primitive value(s). During this process, operations with
such detensored operands are also converted to an equivalen form that works
on primitives.

The detensoring process is driven by linalg-on-tensor ops. In particular, a
linalg-on-tensor op is checked to see whether *all* its operands can be
detensored. If so, those operands are converted to thier primitive
counterparts and the linalg op is replaced by an equivalent op that takes
those new primitive values as operands.

This works towards handling github/google/iree#1159.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96271
2021-02-23 08:27:58 +01:00
Mehdi Amini 99b0032ce0 Move the MLIR integration tests as a subdirectory of test (NFC)
This does not change the behavior directly: the tests only run when
`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON` is configured. However running
`ninja check-mlir` will not run all the tests within a single
lit invocation. The previous behavior would wait for all the integration
tests to complete before starting to run the first regular test. The
test results were also reported separately. This change is unifying all
of this and allow concurrent execution of the integration tests with
regular non-regression and unit-tests.

Differential Revision: https://reviews.llvm.org/D97241
2021-02-23 05:55:47 +00:00
River Riddle 154cabe722 [mlir][pdl][NFC] Extract the execution of each bytecode operation into its own function
This makes the implementation of each bytecode operation much easier to reason about, and lets the compiler decide which implementations are beneficial to inline into the main switch.

Differential Revision: https://reviews.llvm.org/D95716
2021-02-22 19:02:48 -08:00
River Riddle ddd556f10e [mlir][pdl] Fix bug when ordering predicates
We should be ordering predicates with higher primary/secondary sums first, but we are currently ordering them last. This allows for predicates more frequently encountered to be checked first.

Differential Revision: https://reviews.llvm.org/D95715
2021-02-22 19:02:48 -08:00
River Riddle 06e25d5645 [mlir][IR] Refactor the `getChecked` and `verifyConstructionInvariants` methods on Attributes/Types
`verifyConstructionInvariants` is intended to allow for verifying the invariants of an attribute/type on construction, and `getChecked` is intended to enable more graceful error handling aside from an assert. There are a few problems with the current implementation of these methods:
* `verifyConstructionInvariants` requires an mlir::Location for emitting errors, which is prohibitively costly in the situations that would most likely use them, e.g. the parser.
This creates an unfortunate code duplication between the verifier code and the parser code, given that the parser operates on llvm::SMLoc and it is an undesirable overhead to pre-emptively convert from that to an mlir::Location.
* `getChecked` effectively requires duplicating the definition of the `get` method, creating a quite clunky workflow due to the subtle different in its signature.

This revision aims to talk the above problems by refactoring the implementation to use a callback for error emission. Using a callback allows for deferring the costly part of error emission until it is actually necessary.

Due to the necessary signature change in each instance of these methods, this revision also takes this opportunity to cleanup the definition of these methods by:
* restructuring the signature of `getChecked` such that it can be generated from the same code block as the `get` method.
* renaming `verifyConstructionInvariants` to `verify` to match the naming scheme of the rest of the compiler.

Differential Revision: https://reviews.llvm.org/D97100
2021-02-22 17:37:49 -08:00
Aart Bik 0df59f234b [sparse][mlir] simplify lattice optimization logic
Simplifies the way lattices are optimized with less, but more
powerful rules. This also fixes an inaccuracy where too many
lattices resulted (expecting a non-existing universal index).
Also puts no-side-effects on all proper getters and unifies
bufferization flags order in integration tests (for future,
more complex use cases).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97134
2021-02-22 16:52:06 -08:00
Geoffrey Martin-Noble e2224dd753 Fix typo introduced in https://reviews.llvm.org/D97006
Differential Revision: https://reviews.llvm.org/D97220
2021-02-22 13:11:37 -08:00
Geoffrey Martin-Noble 54529c4be6 Add missing dep to fix shared libs build
Followup to https://reviews.llvm.org/D97006 which broke the shared libs
build because of a missing dependency.

Differential Revision: https://reviews.llvm.org/D97213
2021-02-22 11:36:48 -08:00
Vivek 817d343fb0 [MLIR] Fix tilePerfectlyNested utility for handling non-unit step size
The current implementation of tilePerfectlyNested utility doesn't handle
the non-unit step size. We have added support to perform tiling
correctly even if the step size of the loop to be tiled is non-unit.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49188.

Differential Revision: https://reviews.llvm.org/D97037
2021-02-23 00:50:04 +05:30
Geoffrey Martin-Noble 2ce6a42cc9 [MLIR] Add Linalg support for integer (generalized) matmuls
This patch adds Linalg named ops for various types of integer matmuls.
Due to limitations in the tc spec/linalg-ods-gen ops cannot be type
polymorphic, so this instead creates new ops (improvements to the
methods for defining Linalg named ops are underway with a prototype at
https://github.com/stellaraccident/mlir-linalgpy).

To avoid the necessity of directly referencing these many new ops, this
adds additional methods to ContractionOpInterface to allow classifying
types of operations based on their indexing maps.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D97006
2021-02-22 11:13:26 -08:00
Benjamin Kramer ed4d12c2ce [mlir][Shape] Fix a crash when folding nary broadcast ops
operands[2] can be nullptr here. I'm not able to build a lit test for
this because of the commutative reordering of operands. It's possible to
trigger this with a createOrFold<BroadcastOp> though.

Differential Revision: https://reviews.llvm.org/D97206
2021-02-22 20:06:37 +01:00
Vinayaka Bandishti 15332982c3 [MLIR][affine] Prevent fusion when ops with memory effect free are present between producer and consumer
This commit fixes a bug in affine fusion pipeline where an
incorrect fusion is performed despite a dealloc op is present
between a producer and a consumer. This is done by creating a
node for dealloc op in the MDG.

Reviewed By: bondhugula, dcaballe

Differential Revision: https://reviews.llvm.org/D97032
2021-02-22 23:21:02 +05:30
Tres Popp 5b20d80a03 [mlir] Mark std.subview as NoSideEffect
Differential Revision: https://reviews.llvm.org/D96951
2021-02-22 09:34:38 +01:00
Kern Handa 2d62212b06 [mlir] Export CUDA and Vulkan runtime wrappers on Windows
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97140
2021-02-21 22:58:55 -08:00
Jacques Pienaar 04c66edd32 [mlir] Add simple jupyter kernel
Simple jupyter kernel using mlir-opt and reproducer to run passes.
Useful for local experimentation & generating examples. The export to
markdown from here is not immediately useful nor did I define a
CodeMirror synax to make the HTML output prettier. It only supports one
level of history (e.g., `_`) as I was mostly using with expanding a
pipeline one pass at a time and so was all I needed.

I placed this in utils directory next to editor & debugger utils.

Differential Revision: https://reviews.llvm.org/D95742
2021-02-21 18:16:06 -08:00
Stella Laurenzo 6c9541d4dd Implement simple type polymorphism for linalg named ops.
* It was decided that this was the end of the line for the existing custom tc parser/generator, and this is the first step to replacing it with a declarative format that maps well to mathy source languages.
* One such source language is implemented here: https://github.com/stellaraccident/mlir-linalgpy/blob/main/samples/mm.py
  * In fact, this is the exact source of the declarative `polymorphic_matmul` in this change.
  * I am working separately to clean this python implementation up and add it to MLIR (probably as `mlir.tools.linalg_opgen` or equiv). The scope of the python side is greater than just generating named ops: the ops are callable and directly emit `linalg.generic` ops fully dynamically, and this is intended to be a feature for frontends like npcomp to define custom linear algebra ops at runtime.
* There is more work required to handle full type polymorphism, especially with respect to integer formulations, since they require more specificity wrt types.
* Followups to this change will bring the new generator to feature parity with the current one and delete the current. Roughly, this involves adding support for interface declarations and attribute symbol bindings.

Differential Revision: https://reviews.llvm.org/D97135
2021-02-21 14:30:31 -08:00
Jacques Pienaar fa211f3ce9 Update test error string post pass registration change 2021-02-20 15:54:52 -08:00
Jacques Pienaar 02d7b260c6 [mlir] Register the print-op-graph pass using ODS
Move over to ODS & use pass options.
2021-02-20 15:42:02 -08:00
Aart Bik f32b3401e1 [mlir][sparse] convert function pass to module pass
Rationale:
Touching function level information can only be done within a module pass.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97102
2021-02-19 20:04:02 -08:00
Mehdi Amini f8c1f3b14a Revert "Revert "Fix MLIR Toy tutorial JIT example and add a test to cover it""
This reverts commit f36060417a and
reapply commit ae15b1e7ad.

JIT test must be annotated to not run on Windows.
2021-02-19 23:54:52 +00:00
Stella Stamenova f36060417a Revert "Fix MLIR Toy tutorial JIT example and add a test to cover it"
This reverts commit ae15b1e7ad.

This commit caused failures on the mlir windows buildbot
2021-02-19 13:38:43 -08:00
Eugene Zhulenev f99ccf6516 [mlir] Add math polynomial approximation pass
This gives ~30x speedup compared to expanding Tanh into exp operations:

```
name                  old cpu/op  new cpu/op  delta
BM_mlir_Tanh_f32/10    253ns ± 3%    55ns ± 7%  -78.35%  (p=0.000 n=44+41)
BM_mlir_Tanh_f32/100  2.21µs ± 4%  0.14µs ± 8%  -93.85%  (p=0.000 n=48+49)
BM_mlir_Tanh_f32/1k   22.6µs ± 4%   0.7µs ± 5%  -96.68%  (p=0.000 n=32+42)
BM_mlir_Tanh_f32/10k   225µs ± 5%     7µs ± 6%  -96.88%  (p=0.000 n=49+55)

name                  old time/op             new time/op             delta
BM_mlir_Tanh_f32/10    259ns ± 1%               56ns ± 2%  -78.31%        (p=0.000 n=41+39)
BM_mlir_Tanh_f32/100  2.27µs ± 1%             0.14µs ± 5%  -93.89%        (p=0.000 n=46+49)
BM_mlir_Tanh_f32/1k   22.9µs ± 1%              0.8µs ± 4%  -96.67%        (p=0.000 n=30+42)
BM_mlir_Tanh_f32/10k   230µs ± 0%                7µs ± 3%  -96.88%        (p=0.000 n=37+55)
```

This approximations is based on Eigen::generic_fast_tanh function

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96739
2021-02-19 12:43:36 -08:00
Nicolas Vasilache 0ee4bf151c [mlir] Add folding of tensor.cast -> subtensor_insert
Differential Revision: https://reviews.llvm.org/D97059
2021-02-19 17:24:16 +00:00
Geoffrey Martin-Noble 236aab0b0c [MLIR] Delete unused functions getCollapsedInitTensor and getExpandedInitTensor
These are unused since
https://reviews.llvm.org/rG81264dfbe80df08668a325a61613b64243b99c01

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97014
2021-02-19 09:23:54 -08:00
Nicolas Vasilache 62f5c46eec [mlir][Linalg] NFC - Expose more options to the CodegenStrategy 2021-02-19 14:01:44 +00:00
Alexander Belyaev 53367b8fe1 [mlir][nfc] Fix indentation in LinalgOps.td. 2021-02-19 13:02:58 +01:00
Nicolas Vasilache d12fa33d73 [mlir] Add a TensorLoadToMemref canonicalization
A folder of `tensor_load + tensor_to_memref` exists but it only applies when
source and destination memref types are the same.

This revision adds a canonicalize `tensor_load + tensor_to_memref` to `memref_cast`
when type mismatches prevent folding to kick in.

Differential Revision: https://reviews.llvm.org/D97038
2021-02-19 09:38:33 +00:00
Nicolas Vasilache b3c227a25a [mlir] Better support for rank-reducing subview / subtensor type inference.
Differential Revision: https://reviews.llvm.org/D96995
2021-02-19 08:30:50 +00:00
Aart Bik 2556d62282 [mlir][sparse] assert fail on mismatch between rank and annotations array
Rationale:
Providing the wrong number of sparse/dense annotations was silently
ignored or caused unrelated crashes. This minor change verifies that
the provided number matches the rank.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97034
2021-02-18 23:22:14 -08:00
Christian Sigg c86c96a710 [mlir] Load dynamic libraries in JitRunner from absolute paths so that GDB can find the symbol tables.
Reviewed By: mehdi_amini, ftynse

Differential Revision: https://reviews.llvm.org/D96759
2021-02-19 07:33:35 +01:00
Geoffrey Martin-Noble db011775e4 Reland "[MLIR] Make structured op tests permutation invariant"
Relands with fix swapping DEPENDS for LINK_LIBS.

This reverts commit cd8cc00b9e.

Differential Revision: https://reviews.llvm.org/D97011
2021-02-18 18:09:49 -08:00
Mehdi Amini ae15b1e7ad Fix MLIR Toy tutorial JIT example and add a test to cover it 2021-02-19 01:53:36 +00:00
Jing Pu d690cbf821 Add DivOp to the Shape dialect
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D96907
2021-02-18 16:58:47 -08:00
Mehdi Amini cd8cc00b9e Revert "[MLIR] Make structured op tests permutation invariant"
This reverts commit b9ff67099a.
The build is broken with -DBUILD_SHARED_LIBS=ON
2021-02-19 00:16:45 +00:00
Geoffrey Martin-Noble b9ff67099a [MLIR] Make structured op tests permutation invariant
Extracts the relevant dimensions from the map under test to build up the
maps to test against in a permutation-invariant way.

This also includes a fix to the indexing maps used by
isColumnMajorMatmul. The maps as currently written do not describe a
column-major matmul. The linalg named op column_major_matmul has the
correct maps (and notably fails the current test).

If `C = matmul(A, B)` we want an operation that given A in column major
format and B in column major format produces C in column major format.
Given that for a matrix, faux column major is just transpose.
`column_major_matmul(transpose(A), transpose(B)) = transpose(C)`. If
`A` is `NxK` and `B` is `KxM`, then `C` is `NxM`, so `transpose(A)` is
`KxN`, `transpose(B)` is `MxK` and `transpose(C)` is `MxN`, not `NxM`
as these maps currently have.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96984
2021-02-18 14:36:07 -08:00
Nicolas Vasilache b006902b2d [mlir] Fold trivial subtensor / subtensor_insert ops.
Static subtensor / subtensor_insert of the same size as the source / destination tensor and root @[0..0] with strides [1..1] are folded away.

Differential revision: https://reviews.llvm.org/D96991
2021-02-18 21:34:55 +00:00
Nicolas Vasilache 8e01e2ec0f [mlir][Vector] Fold tensor_cast + vector.transfer_read
Differential Revision: https://reviews.llvm.org/D96988
2021-02-18 20:47:16 +00:00
Andrew Pritchard 08c681f645 Perform memory accesses in the same addrspace as the corresponding memref.
It's not necessarily the case on all architectures that all memory is
addressable in addrspace 0, so casting the pointer to addrspace 0 is
liable to cause problems.

Reviewed By: aartbik, ftynse, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96380
2021-02-18 12:36:16 -08:00
natashaknk 25b4a6a7f0 [MLIR][TOSA] Add lowering from TOSA to Linalg for math-based and elementwise ops
This patch adds lowering to Linalg for the following TOSA ops: negate, rsqrt, mul, select, clamp and reluN and includes support for signless integer and floating point types

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D96924
2021-02-18 12:10:10 -08:00
Alexander Belyaev 624fccba87 [mlir] Add `linalg.tiled_loop` op.
`subtensor_insert` was used instead of `linalg.subtensor_yield` to make this PR
smaller. Verification will be added in a follow-up PR.

Differential Revision: https://reviews.llvm.org/D96943
2021-02-18 13:23:00 +01:00
Alexander Belyaev a89035d750 Revert "[MLIR] Create memref dialect and move several dialect-specific ops from std."
This commit introduced a cyclic dependency:
Memref dialect depends on Standard because it used ConstantIndexOp.
Std depends on the MemRef dialect in its EDSC/Intrinsics.h

Working on a fix.

This reverts commit 8aa6c3765b.
2021-02-18 12:49:52 +01:00
Julian Gross 8aa6c3765b [MLIR] Create memref dialect and move several dialect-specific ops from std.
Create the memref dialect and move several dialect-specific ops without
dependencies to other ops from std dialect to this dialect.

Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
DeallocOp -> MemRef_DeallocOp
MemRefCastOp -> MemRef_CastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
TransposeOp -> MemRef_TransposeOp
ViewOp -> MemRef_ViewOp

The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667

Differential Revision: https://reviews.llvm.org/D96425
2021-02-18 11:29:39 +01:00
Alex Zinenko 12875ed976 [mlir] generate enum translation functions with unused attribute
The functions translating enums to LLVM IR are generated in a single
file included in many places, not all of which use all translations.
Generate functions with "unused" attribute to silence compiler warnings.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96880
2021-02-18 09:44:40 +01:00
Aart Bik ff6c84b803 [mlir][sparse] generalize sparse storage format to many more types
Rationale:
Narrower types for overhead storage yield a smaller memory footprint for
sparse tensors and thus needs to be supported. Also, more value types
need to be supported to deal with all kinds of kernels. Since the
"one-size-fits-all" sparse storage scheme implementation is used
instead of actual codegen, the library needs to be able to support
all combinations of desired types. With some crafty templating and
overloading, the actual code for this is kept reasonably sized though.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D96819
2021-02-17 18:20:23 -08:00
Rob Suderman 55756f32f7 [MLIR][TOSA] Expand Tosa int types to I8 and I16
Tosa integers should include I8 and I16 values.

Differential Revision: https://reviews.llvm.org/D96900
2021-02-17 14:18:38 -08:00
Alex Zinenko 4a3473ff3b [mlir] silence unused-function warnings in table-generated code
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96695
2021-02-17 19:28:31 +01:00
Eugene Zhulenev 519f5917b4 [mlir] Add fma operation to std dialect
Will remove `vector.fma` operation in the followup CLs.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96801
2021-02-17 10:06:01 -08:00
Hanhan Wang c80484e16e [mlir][StandardToSPIRV] Add support for lowering trunci to SPIR-V to i1 types.
Add a pattern to converting some value to a boolean. spirv.S/UConvert does not
work on i1 types. Thus, the pattern is lowered to cmpi + select.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D96851
2021-02-17 07:23:41 -08:00
Weiwei Li 7742620620 [mlir][spirv] Add spv.GLSL.FrexpStruct
co-authored-by: Alan Liu <alanliu.yf@gmail.com>

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D96527
2021-02-17 09:02:03 -05:00
Benjamin Kramer 63a35f35ec [mlir][Shape] Generalize cstr_broadcastable folding for n-ary broadcasts
This is still fairly tricky code, but I tried to untangle it a bit.

Differential Revision: https://reviews.llvm.org/D96800
2021-02-17 11:44:52 +01:00
Benjamin Kramer 82b692e546 [mlir][Shape] Mark BroadcastOp as not having side effects
This allows it to be dead code eliminated when unused.

Differential Revision: https://reviews.llvm.org/D96797
2021-02-17 10:26:14 +01:00
Stella Laurenzo 4c3f1be84f [mlir][python] Add python binding for AffineMapAttribute.
Differential Revision: https://reviews.llvm.org/D96815
2021-02-16 15:43:30 -08:00
MaheshRavishankar 81264dfbe8 [mlir][Linalg] Add utility method to reshape ops to express output shape in terms of input shape.
Resolving the dim of outputs of a tensor_reshape op in terms of its
input shape allows the op to be eliminated when its used only in its
dims. The init_tensor -> tensor_reshape canonicalization can be
simplified to use the dims of the output of the tensor_reshape which
gets canonicalized away later making the tensor_reshape dead.

Differential Revision: https://reviews.llvm.org/D96635
2021-02-16 13:42:08 -08:00
Adam Straw 99c0458f2f separate AffineMapAccessInterface from AffineRead/WriteOpInterface
Separating the AffineMapAccessInterface from AffineRead/WriteOp interface so that dialects which extend Affine capabilities (e.g. PlaidML PXA = parallel extensions for Affine) can utilize relevant passes (e.g. MemRef normalization).

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D96284
2021-02-16 13:05:27 -08:00
Thomas Raoux adfd3c7083 [mlir] Fix memref_cast + subview folder when reducing rank
When the destination of the subview has a lower rank than its source we need to
fix the result type of the new subview op.

Differential Revision: https://reviews.llvm.org/D96804
2021-02-16 12:00:59 -08:00
Alex Zinenko ce8f10d6cb [mlir] Simplify ModuleTranslation for LLVM IR
A series of preceding patches changed the mechanism for translating MLIR to
LLVM IR to use dialect interface with delayed registration. It is no longer
necessary for specific dialects to derive from ModuleTranslation. Remove all
virtual methods from ModuleTranslation and factor out the entry point to be a
free function.

Also perform some cleanups in ModuleTranslation internals.

Depends On D96774

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96775
2021-02-16 18:42:52 +01:00
Alex Zinenko 2ab57c503e [mlir] tighten LLVM dialect verifiers to generate valid LLVM IR
Verification of the LLVM IR produced when translating various MLIR dialects was
only active when calling the translation programmatically. This has led to
several cases of invalid LLVM IR being generated that could not be caught with
textual mlir-translate tests. Add verifiers for these cases and fix the tests
in preparation for enforcing the validation of LLVM IR.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96774
2021-02-16 18:18:21 +01:00
Alex Zinenko 9cd47a26d5 [mlir] add verifiers for NVVM and ROCDL kernel attributes
Make sure they can only be attached to LLVM functions as a result of converting
GPU functions to the LLVM Dialect.
2021-02-16 18:06:54 +01:00
Thomas Raoux 397336dcab [mlir][vector] Add missing support for contract of integer lowering.
Some of the lowering of vector.contract didn't support integer case. Since
reduction of integer cannot accumulate we always break up the reduction op, it
should be merged by a separate canonicalization if possible.

Differential Revision: https://reviews.llvm.org/D96461
2021-02-16 07:13:30 -08:00
Thomas Raoux 807e5467f3 [mlir] Add canonicalization for tensor_cast + tensor_to_memref
This helps bufferization passes by removing tensor_cast operations.

Differential Revision: https://reviews.llvm.org/D96745
2021-02-16 07:11:09 -08:00
Lei Zhang cb1a42359b [mlir][vector] Move splitting transfer ops into a separate entry point
These patterns unrolls transfer read/write ops if the vector consumers/
producers are extract/insert slices op. Transfer ops can map to hardware
load/store functionalities, where the vector size matters for bandwidth
considerations. So these patterns should be collected separately, instead
of being generic canonicalization patterns.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96782
2021-02-16 10:04:34 -05:00
Lei Zhang d8c7f442ea [mlir][vector] Add support for unrolling vector.fma
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96781
2021-02-16 09:56:25 -05:00
Tres Popp 787d771dce [mlir] Don't return nullptrs from scf::IfOp::getSuccessorRegions
Previously this might happen if there was no elseRegion and the method
was asked for all successor regions.

Differential Revision: https://reviews.llvm.org/D96764
2021-02-16 12:06:30 +01:00
Nicolas Vasilache 21debeae78 [mlir][Linalg] Generalize vector::transfer hoisting on tensors.
This revision adds support for hoisting "subtensor + vector.transfer_read" / "subtensor_insert + vector.transfer_write pairs" across scf.for.
The unit of hoisting becomes a HoistableRead / HoistableWrite struct which contains a pair of "vector.transfer_read + optional subtensor" / "vector.transfer_write + optional subtensor_insert".
scf::ForOp canonicalization patterns are applied greedily on the successful application of the transformation to cleanup the IR more eagerly and potentially expose more transformation opportunities.

Differential revision: https://reviews.llvm.org/D96731
2021-02-16 09:45:14 +00:00
Adrian Kuegel 07cc77187a Lower math.expm1 to intrinsics in the GPUToNVVM and GPUToROCDL conversions.
This adds the lowering for expm1 for GPU backends.

Differential Revision: https://reviews.llvm.org/D96756
2021-02-16 10:23:42 +01:00
Adrian Kuegel 9f581815ae Add Expm1 op to the math dialect.
Differential Revision: https://reviews.llvm.org/D96704
2021-02-16 08:33:37 +01:00
Nicolas Vasilache d01ea0edaa [mlir] Drop reliance of SliceAnalysis on specific ops.
SliceAnalysis originally was developed in the context of affine.for within mlfunc.
It predates the notion of region.
This revision updates it to not hardcode specific ops like scf::ForOp.
When rooted at an op, the behavior of the slice computation changes as it recurses into the regions of the op. This does not support gathering all values transitively depending on a loop induction variable anymore.
Additional variants rooted at a Value are added to also support the existing behavior.

Differential revision: https://reviews.llvm.org/D96702
2021-02-16 06:34:32 +00:00
Nicolas Vasilache 02d053ed2d [mlir][Vector] Add a canonicalization pattern for vector.contract + add
Differential Revision: https://reviews.llvm.org/D96701
2021-02-15 21:22:36 +00:00
Jacques Pienaar 381a65fa06 [mlir] Add clone method to ShapedType
Allow clients to create a new ShapedType of the same "container" type
but with different element or shape. First use case is when refining
shape during shape inference without needing to consider which
ShapedType is being refined.

Differential Revision: https://reviews.llvm.org/D96682
2021-02-15 11:04:16 -08:00
Tres Popp 3842d4b679 Make shape.is_broadcastable/shape.cstr_broadcastable nary
This corresponds with the previous work to make shape.broadcast nary.
Additionally, simplify the ConvertShapeConstraints pass. It now doesn't
lower an implicit shape.is_broadcastable. This is still the same in
combination with shape-to-standard when the 2 passes are used in either
order.

Differential Revision: https://reviews.llvm.org/D96401
2021-02-15 16:05:32 +01:00
Alex Zinenko 1d6f08e61d [mlir] use new cmake targets in mlir-*-runner 2021-02-15 15:04:00 +01:00
Alex Zinenko 176379e0c8 [mlir] Use the interface-based translation for LLVM "intrinsic" dialects
Port the translation of five dialects that define LLVM IR intrinsics
(LLVMAVX512, LLVMArmNeon, LLVMArmSVE, NVVM, ROCDL) to the new dialect
interface-based mechanism. This allows us to remove individual translations
that were created for each of these dialects and just use one common
MLIR-to-LLVM-IR translation that potentially supports all dialects instead,
based on what is registered and including any combination of translatable
dialects. This removal was one of the main goals of the refactoring.

To support the addition of GPU-related metadata, the translation interface is
extended with the `amendOperation` function that allows the interface
implementation to post-process any translated operation with dialect attributes
from the dialect for which the interface is implemented regardless of the
operation's dialect. This is currently applied to "kernel" functions, but can
be used to construct other metadata in dialect-specific ways without
necessarily affecting operations.

Depends On D96591, D96504

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96592
2021-02-15 14:43:07 +01:00
Tres Popp 89d900b2a1 [mlir] Add error message on shape.broadcast verification failure 2021-02-15 10:58:53 +01:00
Alex Zinenko 34ea608a47 [mlir] Support repeated delayed registration of dialect interfaces
Dialects themselves do not support repeated addition of interfaces with the
same TypeID. However, in case of delayed registration, the registry may contain
such an interface, or have the same interface registered several times due to,
e.g., dependencies. Make sure we delayed registration does not attempt to add
an interface with the same TypeID more than once.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96606
2021-02-15 10:46:26 +01:00
Tobias Gysi 99f3510b41 Reland "[mlir] add support for verification in integration tests"
The patch extends the runner utils by verification methods that compare two memrefs. The methods compare the content of the two memrefs and print success if the data is identical up to a small numerical error. The methods are meant to simplify the development of integration tests that compare the results against a reference implementation (cf. the updates to the linalg matmul integration tests).

Originally landed in 5fa893c (https://reviews.llvm.org/D96326) and reverted in dd719fd due to a Windows build failure.

Changes:
- Remove the max function that requires the "algorithm" header on Windows
- Eliminate the truncation warning in the float specialization of verifyElem by using a float constant

Reviewed By: Kayjukh

Differential Revision: https://reviews.llvm.org/D96593
2021-02-14 20:30:05 +01:00
Nicolas Vasilache 428bc6feed [mlir][Linalg] Fix constant detection in linalg.pad_tensor vectorization. 2021-02-14 15:53:39 +00:00
Fangrui Song 3643828b51 [CMake][mlir] Fix mlir-linalg-ods-gen/CMakeLists.txt after D96645 2021-02-13 14:16:38 -08:00
daquexian 6e31a6b7c2 fix linalg ods gen cross compiling like other gen executables
Signed-off-by: daquexian <daquexian566@gmail.com>

Reviewed By: vinograd47

Differential Revision: https://reviews.llvm.org/D96645
2021-02-13 19:17:46 +00:00
Praveen Narayanan a65fb1916c Add a "kind" attribute to ContractionOp and OuterProductOp.
Currently, vector.contract joins the intermediate result and the accumulator
argument (of ranks K) using summation. We desire more joining operations ---
such as max --- to help vector.contract express reductions. This change extends
Vector_ContractionOp to take an optional attribute (called "kind", of enum type
CombiningKind) specifying the joining operation to be add/mul/min/max for int/fp
, and and/or/xor for int only. By default this attribute has value "add".

To implement this we also need to extend vector.outerproduct, since
vector.contract gets transformed to vector.outerproduct (and that to
vector.fma). The extension for vector.outerproduct is also an optional kind
attribute that uses the same enum type and possible values. The default is
"add". In case of max/min we transform vector.outerproduct to a combination of
compare and select.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D93280
2021-02-12 20:23:59 +00:00
Mehdi Amini aa4e466caa [mlir][Linalg] Improve region support in Linalg ops
This revision takes advantage of the newly extended `ref` directive in assembly format
to allow better region handling for LinalgOps. Specifically, FillOp and CopyOp now build their regions explicitly which allows retiring older behavior that relied on specific op knowledge in both lowering to loops and vectorization.

This reverts commit 3f22547fd1 and reland 973e133b76 with a workaround for
a gcc bug that does not accept lambda default parameters:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59949

Differential Revision: https://reviews.llvm.org/D96598
2021-02-12 19:11:24 +00:00