This new pattern mixes vector.transpose and direct lowering to vector.reduce.
This allows more progressive lowering than immediately going to insert/extract and
composes more nicely with other canonicalizations.
This has 2 use cases:
1. for very wide vectors the generated IR may be much smaller
2. when we have a custom lowering for transpose ops we can target it directly
rather than rely LLVM
Differential Revision: https://reviews.llvm.org/D85428
When any of the memrefs in a structured linalg op has a zero dimension, it becomes dead.
This is consistent with the fact that linalg ops deduce their loop bounds from their operands.
Note however that this is not the case for the `tensor<0xelt_type>` which is a special convention
that must be lowered away into either `memref<elt_type>` or just `elt_type` before this
canonicalization can kick in.
Differential Revision: https://reviews.llvm.org/D85413
The intrinsics were already supported and vector.transfer_read/write lowered
direclty into these operations. By providing them as individual ops, however,
clients can used them directly, and it opens up progressively lowering transfer
operations at higher levels (rather than direct lowering to LLVM IR as done now).
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D85357
This dialect was introduced during the bring-up of the new LLVM dialect type
system for testing purposes. The main LLVM dialect now uses the new type system
and the test dialect is no longer necessary, so remove it.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D85224
Introduces the expand and compress operations to the Vector dialect
(important memory operations for sparse computations), together
with a first reference implementation that lowers to the LLVM IR
dialect to enable running on CPU (and other targets that support
the corresponding LLVM IR intrinsics).
Reviewed By: reidtatge
Differential Revision: https://reviews.llvm.org/D84888
Simplify semi-affine expression for the operations like ceildiv,
floordiv and modulo by any given symbol by checking divisibilty by that
symbol.
Some properties used in simplification are:
1) Commutative property of the floordiv and ceildiv:
((expr1 floordiv expr2) floordiv expr3 ) = ((expr1 floordiv expr3) floordiv expr2)
((expr1 ceildiv expr2) ceildiv expr3 ) = ((expr1 ceildiv expr3) ceildiv expr2)
While simplification if operations are different no simplification is
possible as there is no property that simplify expressions like these:
((expr1 ceildiv expr2) floordiv expr3) or ((expr1 floordiv expr2)
ceildiv expr3).
2) If both expr1 and expr2 are divisible by the expr3 then:
(expr1 % expr2) / expr3 = ((expr1 / expr3) % (expr2 / expr3))
where / is divide symbol.
3) If expr1 is divisible by expr2 then expr1 % expr2 = 0.
Signed-off-by: Yash Jain <yash.jain@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D84920
This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling:
```
%1:3 = scf.if (%inBounds) {
scf.yield %view : memref<A...>, index, index
} else {
%2 = linalg.fill(%extra_alloc, %pad)
%3 = subview %view [...][...][...]
linalg.copy(%3, %alloc)
memref_cast %extra_alloc: memref<B...> to memref<A...>
scf.yield %4 : memref<A...>, index, index
}
%res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a top of the function alloca'ed buffer of one vector.
This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.
A new first-party modeling for LLVM IR types in the LLVM dialect has been
developed in parallel to the existing modeling based on wrapping LLVM `Type *`
instances. It resolves the long-standing problem of modeling identified
structure types, including recursive structures, and enables future removal of
LLVMContext and related locking mechanisms from LLVMDialect.
This commit only switches the modeling by (a) renaming LLVMTypeNew to LLVMType,
(b) removing the old implementaiton of LLVMType, and (c) updating the tests. It
is intentionally minimal. Separate commits will remove the infrastructure built
for the transition and update API uses where appropriate.
Depends On D85020
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D85021
This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling:
```
%1:3 = scf.if (%inBounds) {
scf.yield %view : memref<A...>, index, index
} else {
%2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...>
%3 = vector.type_cast %extra_alloc : memref<...> to
memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 =
memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 :
memref<A...>, index, index
}
%res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a top of the function alloca'ed buffer of one vector.
This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.
Differential Revision: https://reviews.llvm.org/D84631
This reverts commit 35b65be041.
Build is broken with -DBUILD_SHARED_LIBS=ON with some undefined
references like:
VectorTransforms.cpp:(.text._ZN4llvm12function_refIFvllEE11callback_fnIZL24createScopedInBoundsCondN4mlir25VectorTransferOpInterfaceEE3$_8EEvlll+0xa5): undefined reference to `mlir::edsc::op::operator+(mlir::Value, mlir::Value)'
The current modeling of LLVM IR types in MLIR is based on the LLVMType class
that wraps a raw `llvm::Type *` and delegates uniquing, printing and parsing to
LLVM itself. This model makes thread-safe type manipulation hard and is being
progressively replaced with a cleaner MLIR model that replicates the type
system. Introduce a set of classes reflecting the LLVM IR type system in MLIR
instead of wrapping the existing types. These are currently introduced as
separate classes without affecting the dialect flow, and are exercised through
a test dialect. Once feature parity is reached, the old implementation will be
gradually substituted with the new one.
Depends On D84171
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D84339
This revision adds a transformation and a pattern that rewrites a "maybe masked" `vector.transfer_read %view[...], %pad `into a pattern resembling:
```
%1:3 = scf.if (%inBounds) {
scf.yield %view : memref<A...>, index, index
} else {
%2 = vector.transfer_read %view[...], %pad : memref<A...>, vector<...>
%3 = vector.type_cast %extra_alloc : memref<...> to
memref<vector<...>> store %2, %3[] : memref<vector<...>> %4 =
memref_cast %extra_alloc: memref<B...> to memref<A...> scf.yield %4 :
memref<A...>, index, index
}
%res= vector.transfer_read %1#0[%1#1, %1#2] {masked = [false ... false]}
```
where `extra_alloc` is a top of the function alloca'ed buffer of one vector.
This rewrite makes it possible to realize the "always full tile" abstraction where vector.transfer_read operations are guaranteed to read from a padded full buffer.
The extra work only occurs on the boundary tiles.
Differential Revision: https://reviews.llvm.org/D84631
This patch handles loopControl and selectionControl in parsing and
printing. In order to reuse the functionality, and avoid handling cases when
`{` of the region is parsed as a dictionary attribute, `control` keyword was
introduced.`None` is a default control attribute. This functionality can be
later extended to `spv.func`.
Also, loopControl and selectionControl can now be (de)serialized.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D84175
This is an operation that can returns a new ValueShape with a different shape. Useful for composing shape function calls and reusing existing shape transfer functions.
Just adding the op in this change.
Differential Revision: https://reviews.llvm.org/D84217
This change allow CooperativeMatrix Load/Store operations to use pointer type
that may not match the matrix element type. This allow us to declare buffer
with a larger type size than the matrix element type. This follows SPIR-V spec
and this is needed to be able to use cooperative matrix in combination with
shared local memory efficiently.
Differential Revision: https://reviews.llvm.org/D84993
In a context in which `shape.broadcast` is known not to produce an error value,
we want it to operate solely on extent tensors. The operation's behavior is
then undefined in the error case as the result type cannot hold this value.
Differential Revision: https://reviews.llvm.org/D84933
Replaced definition of named ND ConvOps with tensor comprehension
syntax which reduces boilerplate code significantly. Furthermore,
new ops to support TF convolutions added (without strides and dilations).
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D84628
This commit is part of a greater project which aims to add
full end-to-end support for convolutions inside mlir. The
reason behind having conv ops for each rank rather than
having one generic ConvOp is to enable better optimizations
for every N-D case which reflects memory layout of input/kernel
buffers better and simplifies code as well. We expect plain linalg.conv
to be progressively retired.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D83879
This adds conversions for const_size and to_extent_tensor. Also, cast-like operations are now folded away if the source and target types are the same.
Differential Revision: https://reviews.llvm.org/D84745
Added a check for 'Function' storage class in `spv.globalVariable`
verifier since it only can be used with `spv.Variable`.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D84731
The current transformation to shape.reduce does not support tensor values.
This adds the required changes to make that work, including fixing the builder
for shape.reduce.
Differential Revision: https://reviews.llvm.org/D84744
- replace DotOp, now that DRR rules have been dropped.
- Capture arguments mismatch in the parser. The number of parsed arguments must
equal the number of expected arguments.
Reviewed By: ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D82952
linalg.indexed_generic (consumer) with tensor arguments.
The implementation of fusing std.constant producer with a
linalg.indexed_generic consumer was already in place. It is exposed
with this change. Also cleaning up some of the patterns that implement
the fusion to not be templated, thereby avoiding lot of conditional
checks for calling the right instantiation.
Differential Revision: https://reviews.llvm.org/D84566
This diff provides a concrete test case for the error that will be raised when the iteration space is non hyper-rectangular.
The corresponding emission method for this error message has been changed as well.
Differential Revision: https://reviews.llvm.org/D84531
Previous changes generalized some of the operands and results. Complete
a larger group of those to simplify progressive lowering. Also update
some of the declarative asm form due to generalization. Tried to keep it
mostly mechanical.
Based on https://reviews.llvm.org/D84439 but less restrictive, else we
don't allow shape_of to be able to produce a ranked output and doesn't
allow for iterative refinement here. We can consider making it more
restrictive later.
The operation `shape.shape_of` now returns an extent tensor `tensor<?xindex>` in
cases when no error are possible. All consuming operation will eventually accept
both, shapes and extent tensors.
Differential Revision: https://reviews.llvm.org/D84160
The operation `shape.const_shape` was used for constants of type shape only.
We can now also use it to create constant extent tensors.
Differential Revision: https://reviews.llvm.org/D84157
This revision adds support for much deeper type conversion integration into the conversion process, and enables auto-generating cast operations when necessary. Type conversions are now largely automatically managed by the conversion infra when using a ConversionPattern with a provided TypeConverter. This removes the need for patterns to do type cast wrapping themselves and moves the burden to the infra. This makes it much easier to perform partial lowerings when type conversions are involved, as any lingering type conversions will be automatically resolved/legalized by the conversion infra.
To support this new integration, a few changes have been made to the type materialization API on TypeConverter. Materialization has been split into three separate categories:
* Argument Materialization: This type of materialization is used when converting the type of block arguments when calling `convertRegionTypes`. This is useful for contextually inserting additional conversion operations when converting a block argument type, such as when converting the types of a function signature.
* Source Materialization: This type of materialization is used to convert a legal type of the converter into a non-legal type, generally a source type. This may be called when uses of a non-legal type persist after the conversion process has finished.
* Target Materialization: This type of materialization is used to convert a non-legal, or source, type into a legal, or target, type. This type of materialization is used when applying a pattern on an operation, but the types of the operands have not yet been converted.
Differential Revision: https://reviews.llvm.org/D82831
The `makeTiledViews` did not use the sizes of the tiled views based on
the result of the loop bound inference computation. This manifested as
an error in computing tile sizes with convolution where not all the
result expression of concatenated affine maps are simple
AffineDimExpr.
Differential Revision: https://reviews.llvm.org/D84366
linalg.conv does not support memrefs with rank smaller than 3 as stated here:
https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/nn/convolution
However it does not verify it and thus crashes with "LLVM ERROR: out of memory"
error for 1D case and "nWin > 0 && "expected at least one window dimension"" assertion
for 2D case. This commit adds check for that in the verification method.
Differential Revision: https://reviews.llvm.org/D84317
Loop bound inference is right now very limited as it supports only permutation maps and thus
it is impossible to implement convolution with linalg.generic as it requires more advanced
loop bound inference. This commits solves it for the convolution case.
Depends On D83158
Differential Revision: https://reviews.llvm.org/D83191
The underlying infrastructure supports this already, just add the
pattern matching for linalg.generic.
Differential Revision: https://reviews.llvm.org/D84335
Introduces the scatter/gather operations to the Vector dialect
(important memory operations for sparse computations), together
with a first reference implementation that lowers to the LLVM IR
dialect to enable running on CPU (and other targets that support
the corresponding LLVM IR intrinsics).
The operations can be used directly where applicable, or can be used
during progressively lowering to bring other memory operations closer to
hardware ISA support for a gather/scatter. The semantics of the operation
closely correspond to those of the corresponding llvm intrinsics.
Note that the operation allows for a dynamic index vector (which is
important for sparse computations). However, this first reference
lowering implementation "serializes" the address computation when
base + index_vector is converted to a vector of pointers. Exploring
how to use SIMD properly during these step is TBD. More general
memrefs and idiomatic versions of striding are also TBD.
Reviewed By: arpith-jacob
Differential Revision: https://reviews.llvm.org/D84039
This commit adds functionality needed for implementation of convolutions with
linalg.generic op. Since linalg.generic right now expects indexing maps to be
just permutations, offset indexing needed in convolutions is not possible.
Therefore in this commit we address the issue by adding support for symbols inside
indexing maps which enables more advanced indexing. The upcoming commit will
solve the problem of computing loop bounds from such maps.
Differential Revision: https://reviews.llvm.org/D83158