6941 Commits

Author SHA1 Message Date
Frederik Gossen 3b9667a84c Clarify documentation for `Elementwise`, `Scalarizable`, `Vectorizable`, and
`Tensorizable` traits.

Differential Revision: https://reviews.llvm.org/D97841
2021-03-08 10:35:22 +01:00
Mehdi Amini e94e55712c Forward the `LLVM_ENABLE_LIBCXX` CMake parameter to the mlir standalone test
This allows building and testing MLIR with `-DLLVM_ENABLE_LIBCXX=ON`.
2021-03-08 05:07:26 +00:00
KareemErgawy-TomTom 3fb384d50e [MLIR][SPIRV] Rename `spv.selection` to `spv.mlir.selection`.
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere. For ops that
don't have a SPIR-V spec counterpart, we use spv.mlir.snake_case.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D98014
2021-03-06 16:05:31 +01:00
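As an illustration of the naming rule described in commit 3fb384d50e above, here is a minimal Python sketch. It is not part of MLIR: the helper `new_spirv_name` and the `USES_CAMEL_CASE` table are hypothetical, and the table lists only ops that are named in the rename commits on this page.

```python
import re

# Hypothetical table, filled in from the rename commits on this page:
# True  -> the op keeps the `spv.CamelCase` form,
# False -> the op has no SPIR-V spec counterpart and moves to `spv.mlir.snake_case`.
USES_CAMEL_CASE = {
    "selection": False,
    "loop": False,
    "undef": True,
    "constant": True,
    "globalVariable": True,
}


def to_snake_case(name):
    # Insert "_" before every interior uppercase letter, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def new_spirv_name(old_name):
    # Split "spv.globalVariable" into the dialect prefix and the op name.
    prefix, op = old_name.split(".", 1)
    if USES_CAMEL_CASE[op]:
        return f"{prefix}.{op[0].upper()}{op[1:]}"
    return f"{prefix}.mlir.{to_snake_case(op)}"


renamed = {old: new_spirv_name(old) for old in ("spv.selection", "spv.undef")}
```

The lookup table stands in for the question "does this op exist in the SPIR-V spec?", which the sketch does not try to answer on its own.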
Lei Zhang bb6f5c8314 [mlir][spirv] Convert tensor.extract for very small tensors
Normally tensors will be stored in buffers before converting to SPIR-V,
given that is how a large amount of data is sent to the GPU. However,
SPIR-V supports converting from tensors directly too. This is for the
cases where the tensor contains just a small number of elements and it
makes sense to directly inline them as a small data array in the shader.
To handle this, internally the conversion might create new local
variables. SPIR-V consumers in GPU drivers may or may not optimize that
away, so this has implications for register pressure. Therefore, a
threshold is used to control when the patterns should kick in.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D98052
2021-03-06 08:03:36 -05:00
Mehdi Amini f8fe6d9f3f Use gen-dialect-doc instead of gen-op-doc for the Builtin dialect
This is fixing the missing title and menu entry on the MLIR website.
2021-03-06 05:32:46 +00:00
Matthias Springer acce0ea70c [mlir][AVX512] Add mask.compress to AVX512 dialect.
Adds mask.compress to the AVX512 dialect and defines a lowering to the LLVM dialect.

Differential Revision: https://reviews.llvm.org/D97611
2021-03-06 10:02:48 +09:00
Mehdi Amini a7cac0d9a5 Fix Dialect doc generation to special-case the Builtin dialect's empty name
This should fix the issue with an empty entry for the builtin dialect on
the website.

Differential Revision: https://reviews.llvm.org/D98074
2021-03-05 23:47:50 +00:00
Alex Zinenko 6410ee0d09 [mlir] Squash LLVM_ArmNeon dialect into ArmNeon
The two dialects are largely redundant. The former was introduced as a mirror
of the latter operating on LLVM dialect types. This is no longer necessary
since the LLVM dialect operates on built-in types. Combine the two dialects.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D98060
2021-03-05 23:33:32 +01:00
Aart Bik e5c8fc776f [mlir][vector] canonicalize unmasked gather/scatter/compress/expand directly into l/s
With the new vector.load/store operations, there is no need to go through
unmasked transfer operations (which will be canonicalized to l/s anyway).

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D98056
2021-03-05 14:23:50 -08:00
Diego Caballero 2de6dbda66 [mlir] Add 'Skip' result to Operation visitor
This patch is a follow-up on D97217. It adds a new 'Skip' result to the Operation visitor
so that a callback can stop the ongoing visit of an operation/block/region and
continue visiting the next one without fully interrupting the walk. Skipping is
needed to be able to erase an operation/block in pre-order and not continue
visiting the internals of that operation/block.

Related to the skipping mechanism, the patch also introduces the following changes:
 * Added new TestIRVisitors pass with basic testing for the IR visitors.
 * Fixed missing early increment ranges in visitor implementation.
 * Updated documentation of walk methods to include erasure information and walk
   order information.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D97820
2021-03-06 00:02:20 +02:00
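The skipping behaviour described in commit 2de6dbda66 above can be modelled with a small, self-contained Python sketch. This is a toy model, not the MLIR C++ API: the `Op` class, the `WalkResult` enum and `walk_preorder` are hypothetical names chosen for illustration.

```python
from enum import Enum


class WalkResult(Enum):
    ADVANCE = 0    # keep walking, including nested ops
    SKIP = 1       # do not descend into this op, continue with the next one
    INTERRUPT = 2  # stop the whole walk


class Op:
    """Toy stand-in for an operation with nested operations."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def walk_preorder(op, callback):
    """Visit `op` before its children; return INTERRUPT if the walk was aborted."""
    result = callback(op)
    if result is WalkResult.INTERRUPT:
        return WalkResult.INTERRUPT
    if result is WalkResult.SKIP:
        # Only the internals of this op are skipped; the walk itself goes on.
        return WalkResult.ADVANCE
    # Iterate over a copy so a callback may remove ops from the parent list
    # (a rough analogue of iterating with an early-increment range).
    for child in list(op.children):
        if walk_preorder(child, callback) is WalkResult.INTERRUPT:
            return WalkResult.INTERRUPT
    return WalkResult.ADVANCE


tree = Op("module", [Op("func", [Op("dead", [Op("inner")]), Op("ret")])])
visited = []


def visit(op):
    visited.append(op.name)
    return WalkResult.SKIP if op.name == "dead" else WalkResult.ADVANCE


walk_preorder(tree, visit)
```

In this model the callback sees `dead` but never `inner`, and the walk still reaches `ret`, which is the difference between skipping and interrupting.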
Diego Caballero 71a86245ca [mlir] Extend Operation visitor with pre-order traversal
This patch extends the Region, Block and Operation visitors to also support pre-order walks.
We introduce a new template argument that dictates the walk order (only pre-order and
post-order are supported for now). The default order for Regions, Blocks and Operations is
post-order. Mixed orders (e.g., Region/Block pre-order + Operation post-order) could easily
be implemented, as shown in NumberOfExecutions.cpp.

Reviewed By: rriddle, frgossen, bondhugula

Differential Revision: https://reviews.llvm.org/D97217
2021-03-06 00:02:20 +02:00
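To make the two walk orders from commit 71a86245ca above concrete, here is a minimal Python sketch. It is a toy model rather than the MLIR visitor: `Op` and `walk` are hypothetical, and the `order` keyword argument merely plays the role of the template argument mentioned in the commit message, with post-order as the default.

```python
class Op:
    """Toy stand-in for an operation with nested operations."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def walk(op, callback, order="post"):
    """Call `callback` on `op` and everything nested in it, in the given order."""
    if order not in ("pre", "post"):
        raise ValueError("only pre-order and post-order are supported")
    if order == "pre":
        callback(op)  # parent first, then its children
    for child in op.children:
        walk(child, callback, order)
    if order == "post":
        callback(op)  # children first, then the parent


tree = Op("func", [Op("loop", [Op("add")]), Op("ret")])

pre, post = [], []
walk(tree, lambda op: pre.append(op.name), order="pre")
walk(tree, lambda op: post.append(op.name))  # default order is post-order
```

With this tree, the pre-order walk records `func` first and the post-order walk records it last.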
Diego Caballero b635492c3f [mlir][Affine][NFC] Return BlockArgument in AffineForOp::getInductionVar
This avoids unnecessary casts when a BlockArgument is required.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D97879
2021-03-06 00:02:19 +02:00
KareemErgawy-TomTom d48ceb45e3 [MLIR][SPIRV] Rename `spv.undef` to `spv.Undef`.
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere. For ops that
don't have a SPIR-V spec counterpart, we use spv.mlir.snake_case.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D98016
2021-03-05 15:49:44 -05:00
River Riddle f175ba4a54 [mlir][AsmPrinter] Don't use string comparison when filtering list attributes
In .mlir modules with large numbers of attributes, e.g. a function with a large number of argument attributes, the string comparison filtering greatly affects compile time. This revision switches to using a SmallDenseSet in these situations, resulting in over a 10x speedup in some cases.

Differential Revision: https://reviews.llvm.org/D97980
2021-03-05 12:47:05 -08:00
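The idea behind commit f175ba4a54 above, replacing repeated string comparisons with a set lookup, can be sketched in Python. This is an analogy only: the function names are hypothetical, a Python `set` stands in for `SmallDenseSet`, and no timing claims are made for this sketch.

```python
def filter_attrs_linear(attrs, elided_names):
    # Compares every attribute name against every elided name:
    # O(len(attrs) * len(elided_names)) string comparisons.
    return [
        (name, value)
        for name, value in attrs
        if not any(name == elided for elided in elided_names)
    ]


def filter_attrs_set(attrs, elided_names):
    # Build the set once; each membership test is then O(1) on average.
    elided = set(elided_names)
    return [(name, value) for name, value in attrs if name not in elided]


attrs = [("sym_name", "f"), ("arg0.attr", 1), ("arg1.attr", 2), ("type", "fn")]
elided_names = ["sym_name", "type"]

kept_linear = filter_attrs_linear(attrs, elided_names)
kept_set = filter_attrs_set(attrs, elided_names)
```

Both functions keep the same attributes in the same order; only the cost of deciding what to drop differs.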
KareemErgawy-TomTom 29812a6195 [MLIR][SPIRV] Rename `spv.loop` to `spv.mlir.loop`.
To unify the naming scheme across all ops in the SPIR-V dialect,
we are moving from spv.camelCase to spv.CamelCase everywhere.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97918
2021-03-05 15:44:30 -05:00
Stella Laurenzo 0b5f1b859f [mlir][linalg] Add linalg_opdsl tool first draft.
* Mostly imported from experimental repo as-is with cosmetic changes.
* Temporarily left out emission code (for building ops at runtime) to keep review size down.
* Documentation and lit tests added fresh.
* Sample op library that represents current Linalg named ops included.

Differential Revision: https://reviews.llvm.org/D97995
2021-03-05 11:45:09 -08:00
Stella Laurenzo a9ccdfbc7d NFC: Glob all python sources in the MLIR Python bindings.
* Also switches to use symlinks vs copy as that enables edit-and-continue python development.
* Broken out of https://reviews.llvm.org/D97995 per request from reviewer.

Differential Revision: https://reviews.llvm.org/D98005
2021-03-05 10:21:02 -08:00
Aart Bik adc35b689f [mlir][sparse] mask reduction update
Reduction updates should be masked, just like the load and stores.
Note that alternatively, we could use the fact that masked values are
zero for += updates and mask invariants to get this working, but that
would not work for *= updates. Masking the update itself is cleanest.
This change also replaces the constant mask with a broadcast of "true"
since this constant folds much better for various folding patterns.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D98000
2021-03-05 08:56:10 -08:00
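The argument in commit adc35b689f above, that zeroed masked lanes are harmless for `+=` but not for `*=`, can be checked with a short Python sketch. The scalar loops below are a hypothetical model of a masked vector reduction, not the code the sparse compiler generates.

```python
import operator


def reduce_with_zeroed_lanes(values, mask, op, init):
    # Masked-off lanes are read as zero, but the update itself is unmasked.
    acc = init
    for value, active in zip(values, mask):
        acc = op(acc, value if active else 0)
    return acc


def reduce_with_masked_update(values, mask, op, init):
    # The update itself is masked: inactive lanes leave the accumulator alone.
    acc = init
    for value, active in zip(values, mask):
        if active:
            acc = op(acc, value)
    return acc


values = [2, 3, 4]
mask = [True, True, False]

sum_zeroed = reduce_with_zeroed_lanes(values, mask, operator.add, 0)
sum_masked = reduce_with_masked_update(values, mask, operator.add, 0)
prod_zeroed = reduce_with_zeroed_lanes(values, mask, operator.mul, 1)
prod_masked = reduce_with_masked_update(values, mask, operator.mul, 1)
```

The two sums agree because zero is the identity for addition, while the zeroed-lane product collapses to zero and only the masked update yields the product of the active lanes.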
Nicolas Vasilache c86d3c1a38 [mlir][Linalg] Fix order of dimensions in hoistPaddingOnTensors.
2021-03-05 15:11:35 +00:00
Christian Sigg 5fedf30748 [mlir] Make cuInit() call thread-safe.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D98024
2021-03-05 16:06:15 +01:00
Nicolas Vasilache 35908406dc [mlir][scf] Canonicalize scf.for last tensor iteration result.
Canonicalize the iter_args of an scf::ForOp that involve a tensor_load and
for which only the last loop iteration is actually visible outside of the
loop. The canonicalization looks for a pattern such as:
```
   %t0 = ... : tensor_type
   %0 = scf.for ... iter_args(%bb0 = %t0) -> (tensor_type) {
     ...
     // %m is either tensor_to_memref(%bb0) or defined above the loop
     %m... : memref_type
     ... // uses of %m with potential inplace updates
     %new_tensor = tensor_load %m : memref_type
     ...
     scf.yield %new_tensor : tensor_type
   }
```

`%bb0` may have either 0 or 1 use. If it has 1 use it must be exactly a
`%m = tensor_to_memref %bb0` op that feeds into the yielded `tensor_load`
op.

If no aliasing write of `%new_tensor` occurs between tensor_load and yield
then the value %0 visible outside of the loop is the last `tensor_load`
produced in the loop.

For now, we approximate the absence of aliasing by only supporting the case
when the tensor_load is the operation immediately preceding the yield.

The canonicalization rewrites the pattern as:
```
   // %m is either a tensor_to_memref or defined above
   %m... : memref_type
   scf.for ... { // no iter_args
     ... // uses of %m with potential inplace updates
   }
   %0 = tensor_load %m : memref_type
```

Differential revision: https://reviews.llvm.org/D97953
2021-03-05 09:42:19 +00:00
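The equivalence claimed by the canonicalization in commit 35908406dc above can be illustrated with a toy Python model. Everything here is an assumption made for illustration: a tuple stands in for a tensor, a list for a memref, and the buffer is taken to come from `tensor_to_memref` of the initial tensor; the function names are hypothetical.

```python
def loop_with_iter_args(t0, steps):
    m = list(t0)              # stand-in for %m = tensor_to_memref %bb0
    carried = t0              # stand-in for the iter_arg %bb0
    for i in range(steps):
        m[i % len(m)] += 1    # use of %m with an in-place update
        carried = tuple(m)    # tensor_load right before the yield
    return carried            # %0, the value visible after the loop


def loop_without_iter_args(t0, steps):
    m = list(t0)              # %m is created once, outside the loop
    for i in range(steps):
        m[i % len(m)] += 1    # same in-place updates, no iter_args
    return tuple(m)           # single tensor_load after the loop


t0 = (1, 2, 3)
before = loop_with_iter_args(t0, 4)
after = loop_without_iter_args(t0, 4)
```

Because the loaded tensor is yielded immediately, every iteration but the last produces a value nobody outside the loop observes, so loading once after the loop gives the same result in this model.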
KareemErgawy-TomTom c74eb466d2 [MLIR][SPIRV] Rename `spv.globalVariable` to `spv.GlobalVariable`.
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97919
2021-03-04 16:24:59 -05:00
KareemErgawy-TomTom 5abdca47b3 [MLIR][SPIRV] Rename `spv.constant` to `spv.Constant`.
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from `spv.camelCase` to `spv.CamelCase` everywhere.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97917
2021-03-04 16:15:56 -05:00
KareemErgawy-TomTom 4d90e460bc [MLIR][SPIRV] Rename `spv.spcConstant...` to `spv.SpcConstant...`.
To unify the naming scheme across all ops in the SPIR-V dialect, we are
moving from spv.camelCase to spv.CamelCase everywhere.

Differential Revision: https://reviews.llvm.org/D97920
2021-03-04 16:07:41 -05:00
River Riddle 2f37cdd569 [mlir][IR][NFC] Move a majority of the builtin attributes to ODS
Now that attributes can be generated using ODS, we can move the builtin attributes as well. This revision removes a majority of the builtin attributes with a few left for followup revisions. The attributes moved to ODS in this revision are: AffineMapAttr, ArrayAttr, DictionaryAttr, IntegerSetAttr, StringAttr, SymbolRefAttr, TypeAttr, and UnitAttr.

Differential Revision: https://reviews.llvm.org/D97591
2021-03-04 13:04:06 -08:00
River Riddle 1447ec5182 [mlir][AttrDefGen] Add support for specifying the value type of an attribute
The value type of the attribute can be specified by either overriding the typeBuilder field on the AttrDef, or by providing a parameter of type `AttributeSelfTypeParameter`. This removes the need to define custom storage class constructors for attributes that have a value type other than NoneType.

Differential Revision: https://reviews.llvm.org/D97590
2021-03-04 13:04:05 -08:00
River Riddle 6bc767cd07 [mlir] Add a DialectAsmParser::getChecked method
This function simplifies calling the getChecked methods on Attributes and Types from within the parser, and removes any need to use `getEncodedSourceLocation` for these methods (by using an SMLoc instead). This is much more efficient than using an mlir::Location, as the encoding process to produce an mlir::Location is inefficient and undesirable for parsing (locations used during parsing should not persist afterwards unless otherwise necessary).

Differential Revision: https://reviews.llvm.org/D97900
2021-03-04 11:53:24 -08:00
Ahmed Taei da1e37a8b0 Fold full-size subview of static shapes.
Differential Revision: https://reviews.llvm.org/D97429
2021-03-04 09:52:06 -08:00
Nicolas Vasilache f21d78633a [mlir] Tighten the rules around folding TensorLoadOp
`tensor_load(tensor_to_memref(x)) -> x` is an incorrect folding because it ignores potential aliasing.

This revision approximates no-aliasing by restricting the folding to occur only when tensor_load
immediately follows tensor_to_memref in the same block. This is a conservative step back towards
correctness until better alias analysis becomes available.

Context: https://llvm.discourse.group/t/properly-using-bufferization-related-passes/2913/6

Differential Revision: https://reviews.llvm.org/D97957
2021-03-04 17:48:09 +00:00
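Why the unconditional folding discussed in commit f21d78633a above is unsafe can be shown with a toy Python model. The helpers below are hypothetical stand-ins, not MLIR ops: a tuple models an immutable tensor and a list models a mutable buffer.

```python
def tensor_to_memref(tensor):
    # Toy model: materialize the immutable tensor as a mutable buffer.
    return list(tensor)


def tensor_load(memref):
    # Toy model: snapshot the current buffer contents as an immutable tensor.
    return tuple(memref)


x = (1, 2, 3)

# Case 1: the load directly follows the conversion, so folding to `x` is safe.
m = tensor_to_memref(x)
loaded_immediately = tensor_load(m)

# Case 2: a write happens between the conversion and the load.
m = tensor_to_memref(x)
m[0] = 42                      # write through the buffer (or through an alias)
loaded_after_write = tensor_load(m)

folded = x                     # what tensor_load(tensor_to_memref(x)) -> x would produce
```

In the second case the folded value no longer matches what the load actually observes, which is the situation the tightened rule is meant to exclude.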
Arpith C. Jacob 4e393350c5 [mlir] Add an AccessGroup attribute to load/store LLVM dialect ops and generate the access_group LLVM metadata.
This also includes LLVM dialect ops created from intrinsics.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D97944
2021-03-04 18:17:23 +01:00
Hanhan Wang b47c6c686c [mlir][linalg] Add suffix "Op" to pooling TC ops.
Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D97946
2021-03-04 07:08:30 -08:00
Nicolas Vasilache f3cc854364 [mlir][Vector] Add folding of vector transfers from/into tensor producing ops.
Add a folder to rewrite a sequence such as:

```
   %t1 = ...
   %v = vector.transfer_read %t0[%c0...], {masked = [false...]} :
     tensor<static_sizesxf32>, vector<static_sizesxf32>
   %t2 = vector.transfer_write %v, %t1[%c0...] {masked = [false...]} :
     vector<static_sizesxf32>, tensor<static_sizesxf32>
```

into:

```
   %t0
```

The producer of t1 may or may not be DCE'd depending on whether it is a
block argument or has side effects.

Differential revision: https://reviews.llvm.org/D97934
2021-03-04 14:17:42 +00:00
Nicolas Vasilache a756f12b4d [mlir][Linalg] Add folding of linalg.copy that are in fact identities.
Differential Revision: https://reviews.llvm.org/D97939
2021-03-04 13:37:26 +00:00
Nicolas Vasilache 4f4f3f1e59 [mlir] NFC - Add runner util functions to only print MemRef metadata.
These are useful to debug execution, without having to print the whole
content of a memref.
2021-03-04 12:35:45 +00:00
Nicolas Vasilache 05882157db [mlir][Linalg] NFC - Add isOutputTensor to LinalgInterfaces.td
2021-03-04 12:33:21 +00:00
Christian Sigg f69d5a7fc7 [mlir] Initialize CUDA context lazily.
So we can remove the ignore-warning pragma again.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D97864
2021-03-04 13:07:56 +01:00
Alex Zinenko 32c49c7d73 [mlir] ODS: change OpBuilderDAG to OpBuilder
We no longer have the non-DAG version.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D97856
2021-03-04 10:55:02 +01:00
Alex Zinenko 19db802e7b [mlir] make implementations of translation to LLVM IR interfaces private
There is no need for the interface implementations to be exposed, opaque
registration functions are sufficient for all users, similarly to passes.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97852
2021-03-04 09:16:32 +01:00
Arpith C. Jacob 4a2930f495 [mlir] Add loop codegen options to some LLVM dialect ops.
Add a Loop Option attribute and generate llvm metadata attached to
branch instructions to control code generation.

Reviewed By: ftynse, mehdi_amini

Differential Revision: https://reviews.llvm.org/D96820
2021-03-04 09:01:57 +01:00
Aart Bik 553cb6d473 [mlir][sparse] fix bug in reduction chain
Found with exhaustive testing, it is possible that a while loop
appears in between chainable for loops. As long as we don't
scalarize reductions in while loops, this means we need to
terminate the chain at the while. This also refactors the
reduction code into more readable helper methods.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97886
2021-03-03 17:38:22 -08:00
River Riddle 83ef862fad [mlir] Add support for generating Attribute classes for ODS
The support for attributes closely maps that of Types (basically 1-1) given that Attributes are defined in exactly the same way as Types. All of the current ODS TypeDef classes get an Attr equivalent. The generation of the attribute classes themselves share the same generator as types.

Differential Revision: https://reviews.llvm.org/D97589
2021-03-03 16:41:49 -08:00
River Riddle e07c968a6d [mlir][pdl][NFC] Rename InputOp to OperandOp
This better matches the actual IR concept that is being modeled, and is consistent with how the rest of PDL is structured.

Differential Revision: https://reviews.llvm.org/D95718
2021-03-03 15:48:00 -08:00
River Riddle 55f878bad9 [mlir][pdl] Add a new !pdl.range<> type
This type represents a range of positional values. It will be used in followup revisions to add support for variadic constructs to PDL, such as operand and result ranges.

Differential Revision: https://reviews.llvm.org/D95717
2021-03-03 15:48:00 -08:00
River Riddle 3dfa86149e [mlir][IR] Refactor the internal implementation of Value
The current implementation of Value involves a pointer int pair with several different kinds of owners, i.e. BlockArgumentImpl*, Operation *, TrailingOpResult*. This design arose from the desire to save memory overhead for operations that have a very small number of results (generally 0-2). There are, unfortunately, many problematic aspects of the current implementation that make Values difficult to work with or just inefficient.

Operation result types are stored as a separate array on the Operation. This is very inefficient for many reasons: we use TupleType for multiple results, which can lead to huge amounts of memory usage if multi-result operations change types frequently (they do). It also means that simple methods like Value::getType/Value::setType now require complex logic to get to the desired type.

Value only has one pointer bit free, severely limiting the ability to use it in things like PointerUnion/PointerIntPair. Given that we store the kind of a Value along with the "owner" pointer, we only leave one bit free for users of Value. This creates situations where we end up nesting PointerUnions to be able to use Value in one.

As noted above, most of the methods in Value need to branch on at least 3 different cases, which is inefficient, possibly error prone, and verbose. The current storage of results also creates problems for utilities like ValueRange/TypeRange, which want to efficiently store base pointers to ranges (of which Operation* isn't really useful as one).

This revision greatly simplifies the implementation of Value by the introduction of a new ValueImpl class. This class contains all of the state shared between all of the various derived value classes; i.e. the use list, the type, and the kind. This shared implementation class provides several large benefits:

* Most of the methods on value are now branchless, and often one-liners.

* The "kind" of the value is now stored in ValueImpl instead of Value
This frees up all of Value's pointer bits, allowing for users to take full advantage of PointerUnion/PointerIntPair/etc. It also allows for storing more operation results as "inline", 6 now instead of 2, freeing up 1 word per new inline result.

* Operation result types are now stored in the result, instead of a side array
This drops the size of zero-result operations by 1 word. It also removes the memory crushing use of TupleType for operation results (which could lead to up to hundreds of megabytes of "dead" TupleTypes in the context). This also allowed restructuring ValueRange, making it simpler and one word smaller.

This revision does come with two conceptual downsides:
* Operation::getResultTypes no longer returns an ArrayRef<Type>
This conceptually makes some usages slower, as the iterator increment is slightly more complex.
* OpResult::getOwner is slightly more expensive, as it now requires a little bit of arithmetic

From profiling, neither of the conceptual downsides has resulted in any perceivable hit to performance. Given the advantages of the new design, most compiles are slightly faster.

Differential Revision: https://reviews.llvm.org/D97804
2021-03-03 14:33:37 -08:00
David Blaikie 4fda0dc14b Fix use of deprecated API
2021-03-03 14:07:28 -08:00
MaheshRavishankar c118fdcd59 [mlir] Remove incorrect folding for SubTensorInsertOp
The SubTensorInsertOp has a requirement that dest type and result
type match. Just folding the tensor.cast operation violates this and
creates verification errors during canonicalization. Also fix other
canonicalization methods that weren't inserting casts properly.

Differential Revision: https://reviews.llvm.org/D97800
2021-03-03 13:58:05 -08:00
Hanhan Wang 83c56aa4ee [mlir][linalg] Add depthwise_conv_2d_input_nhwc_filter_hwcf to Linalg TC ops.
Unlike the definition in TensorFlow and TOSA, the output is [N,H,W,C,M]. This can make transforms easier in Linalg because the indexing maps are plain. E.g., when determining whether the fill op and the depthwise conv op have a dependency, the current pipeline only recognizes the dependency if they are all projected affine maps.

Reviewed By: asaadaldien

Differential Revision: https://reviews.llvm.org/D97798
2021-03-03 11:47:02 -08:00
Mehdi Amini 13cb431719 Add basic JIT Python Bindings
This offers the ability to create a JIT and invoke a function by passing
ctypes pointers to the argument and the result.

Differential Revision: https://reviews.llvm.org/D97523
2021-03-03 18:19:40 +00:00
Mehdi Amini 86c8a7857d Add C bindings for mlir::ExecutionEngine
This adds minimalistic bindings for the execution engine, allowing the
JIT to be invoked from the C API. This is still quite early and
experimental and shouldn't be considered stable in any way.

Differential Revision: https://reviews.llvm.org/D96651
2021-03-03 18:19:40 +00:00
Hanhan Wang 497b7b8c00 [mlir][linalg] Delete unused vars if there are shaped-only operands.
Reviewed By: stella.stamenova

Differential Revision: https://reviews.llvm.org/D97851
2021-03-03 09:36:08 -08:00