This patch adds constant folder for ExpM1Op which only supports single and double precision floating-point.
Differential Revision: https://reviews.llvm.org/D130567
This patch removes the `type` field from `Attribute` along with the
`Attribute::getType` accessor.
Going forward, this means that attributes in MLIR will no longer have
types as a first-class concept. This patch lays the groundwork to
incrementally remove or refactor code that relies on generic attributes
being typed. The immediate impact will be on attributes that rely on
`Attribute` containing a type, such as `IntegerAttr`,
`DenseElementsAttr`, and `ml_program::ExternAttr`, which will now need
to define a type parameter on their storage classes. This will save
memory as all other attribute kinds will no longer contain a type.
Moreover, it will not be possible to generically query the type of an
attribute directly. This patch provides an attribute interface
`TypedAttr` that implements only one method, `getType`, which can be
used to generically query the types of attributes that implement the
interface. This interface can be used to retain the concept of a "typed
attribute". The ODS-generated accessor for a `type` parameter
automatically implements this method.
Next steps will be to refactor the assembly formats of certain operations
that rely on `parseAttribute(type)` and `printAttributeWithoutType` to
remove special handling of type elision until `type` can be removed from
the dialect parsing hook entirely; and incrementally remove uses of
`TypedAttr`.
Reviewed By: lattner, rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D130092
Fix the hardcoded check for `FuncOp` in `getCommonBlock` utility: the
check should have been for an op that starts an affine scope. The
incorrect block returned in turn causes dependence analysis to function
incorrectly.
This change allows affine store-load forwarding to work correctly inside
any ops that start an affine scope.
Reviewed By: ftynse, dcaballe
Differential Revision: https://reviews.llvm.org/D130749
The implementation and API of GEP Op has gotten a bit convoluted over the time. Issues with it are:
* Misleading naming: `indices` actually only contains the dynamic indices, not all of them. To get the amount of indices you need to query the size of `structIndices`
* Very difficult to iterate over all indices properly: One had to iterate over `structIndices`, check whether it contains the magic constant `kDynamicIndex`, if it does, access the next value in `index` etc.
* Inconvenient to build: One either has create lots of constant ops for every index or have an odd split of passing both a `ValueRange` as well as a `ArrayRef<int32_t>` filled with `kDynamicIndex` at the correct places.
* Implementation doing verification in the build method
and more.
This patch attempts to address all these issues via convenience classes and reworking the way GEP Op works:
* Adds `GEPArg` class which is a sum type of a `int32_t` and `Value` and is used to have a single convenient easy to use `ArrayRef<GEPArg>` in the builders instead of the previous `ValueRange` + `ArrayRef<int32_t>` builders.
* Adds `GEPIndicesAdapter` which is a class used for easy random access and iteration over the indices of a GEP. It is generic and flexible enough to also instead return eg. a corresponding `Attribute` for an operand inside of `fold`.
* Rename `structIndices` to `rawConstantIndices` and `indices` to `dynamicIndices`: `rawConstantIndices` signifies one shouldn't access it directly as it is encoded, and `dynamicIndices` is more accurate and also frees up the `indices` name.
* Add `getIndices` returning a `GEPIndicesAdapter` to easily iterate over the GEP Ops indices.
* Move the verification/asserts out of the build method and into the `verify` method emitting op error messages.
* Add convenient builder methods making use of `GEPArg`.
* Add canonicalizer turning dynamic indices with constant values into constant indices to have a canonical representation.
The only breaking change is for any users building GEPOps that have so far used the old `ValueRange` + `ArrayRef<int32_t>` builder as well as those using the generic syntax.
Another follow up patch then goes through upstream and makes use of the new `ArrayRef<GEPArg>` builder to remove a lot of code building constants for GEP indices.
Differential Revision: https://reviews.llvm.org/D130730
Adding complex value with 0 for real and imaginary part can be ignored.
NOTE: This type of canonicalization can be written in an easy and tidy format using `complex.number` after constant op supports custom attribute.
Differential Revision: https://reviews.llvm.org/D130748
We can canonicalize consecutive complex.conj just by removing all conjugate operations.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D130684
It is more useful to use ComplexType as type of the attribute than to
use the element type as attribute type. This means when using this
attribute in complex::ConstantOp, we just need to check whether
the types match.
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D130703
Current implementation of decomposition of Linalg operations wouldnt
work if the `outs` operand values were used within the body of the
operation. Relax this restriction. This potentially sets the stage for
decomposing ops with reduction iterator types (but is not done here
since it requires more study).
Differential Revision: https://reviews.llvm.org/D130527
This supports lowering from parse-tree to MLIR and translation from
MLIR to LLVM IR using OMPIRBuilder for OpenMP simdlen clause in SIMD
construct.
Reviewed By: shraiysh, peixin, arnamoy10
Differential Revision: https://reviews.llvm.org/D130195
This commit folds a `tensor.cast` op into a `tensor.collapse_shape` op
when following two conditions meet:
1. the `tensor.collapse_shape` op consumes result of the `tensor.cast` op.
2. `tensor.cast` op casts to a more dynamic version of the source tensor.
This is added as a canonicalization pattern in `tensor.collapse_shape` op.
Signed-Off-By: Gaurav Shukla <gaurav@nod-labs.com>
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D130650
* https://discourse.llvm.org/t/rfc-removing-the-quant-dialect/3643/8
* Removes most ops. Leaves casts given final comment (can remove more in a followup).
* There are a few uses in Tosa keeping some of the utilities alive. In a followup, I will probably elect to just move simplified versions of them into Tosa itself vs having this quasi-library dependency.
Differential Revision: https://reviews.llvm.org/D120204
This commit extends UnifyAliasedResourcePass to handle the case
where aliased resources have different vector sizes. (It still
requires all scalar types to be of the same bitwidth.) This is
effectively reusing the code for handling different-bitwidth
scalar types.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D130671
This commit fixes spv.CompositeConstruct to assembly to list
operand types to enable vector construction out of smaller vectors.
Validation is also fixed to properly check the cases for vector
construction.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D130669
This patch adds canonicalization conditions for omp.atomic.update thus
eliminating it when it becomes just a write or a no-op due to other
changes during canonicalization.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D126531
Add custom attribute for complex dialect. Although this commit does not have significant impact on the conversion framework, it will lead us to construct complex numbers in a readable and tidy manner.
Related discussion: https://reviews.llvm.org/D127476
Reviewed By: pifon2a, akuegel
Differential Revision: https://reviews.llvm.org/D130149
The fold in it's current state only checks whether the amount of dynamic indices is 1. This does however not check for the presence of any struct indices, leading to an incorrect fold.
This patch fixes that issue by checking that struct indices are 1, which in addition to the pre-existing check that dynamic indices are 1, guarantees that the single index is a dynamic one.
Differential Revision: https://reviews.llvm.org/D129374
Combine the recently added utilities for folded-by-construction affine
operations with the attribute-based Range to enable more folding. This
decreases the amount of emitted code but has little effect on test
precisely because the tests are not checking for the spurious constants.
The difference in the shape of affine maps comes from the internals of
affine folding.
Depends on D129633
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D130167
The structured op splitting transformation is conceptually similar to
tiling in the sense that it decomposes the iteration space of the
original op into several parts. Therefore, it is possible to implement
it using the TilingInterface to operate on iteration spaces and their
parts. However, the implementation also requires to pass updated input
operands, which is not supported by the interface, so the implementation
currently remains Linalg-specific.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D129564
This patch adds constant folder for Exp2Op which only supports single and double precision floating-point.
Differential Revision: https://reviews.llvm.org/D130472
This patch adds constant folder for ExpOp which only supports single and double precision floating-point.
Differential Revision: https://reviews.llvm.org/D130318
For arith.constant operations of integer type, the operation generates
result names that include the value of the constant (i.e., the
IntegerAttr that defines the constant's value). That code currently
assumes integer widths of 64 bits or less and hits an assert with wider
constants or would create truncated and potentially ambiguous names when
built with assertions disabled.
To enable printing arith.constant ops for arbitrarily wide integer
types, change to use the IntegerAttr's function getValue() when
generating result names.
Also, add a regression test.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D129930
This change adds a new DelinearizeIndexOp to the `arith` dialect. The
operation accepts an `index` type as well as a basis (array of index
values) representing how the index should be decomposed into a
multi-index. The decomposition obeys a canonical semantic that treats
the final basis element as "fastest varying" and the first basis element
as "slowest varying". A naive lowering of the operation using a sequence
of `arith.divui` and `arith.remui` operations is also given.
Differential Revision: https://reviews.llvm.org/D129697
Folding of transfer_write into transfer_read is already supported but
this requires the read and write to have the same permuation map.
After linalg vectorization it is common to have different ppermuation
map for write followed by read even though the cases could be
propagated.
This canonicalization handle cases where the permuation maps are
different but the data read and written match and replace the transfer
ops with broadcast and permuation
Differential Revision: https://reviews.llvm.org/D130135
Add a constraint to ensure that the operand and result of the
threadprivate operation are the same.
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D128609
This is useful for building small test cases and will be utilized in a subsequent commit that adds a fusion example.
Differential Revision: https://reviews.llvm.org/D130344
This op fuses a given payload op into a given container op. Inside the container, all uses of the producer are replaced (fused) with the newly inserted op. If the producer is tileable and accessed via a tensor.extract_slice, the new op computes only the requested slice ("tile and fuse"). Otherwise, the entire tensor value is computed inside the container ("clone and fuse").
Differential Revision: https://reviews.llvm.org/D130244
Convert arith.cmpi to the canonical form with constants on the right side
to simplify further optimizations and open more opportunities for CSE.
Differential Revision: https://reviews.llvm.org/D129929
Add affine.if canonicalization to compose affine.apply ops into its set
and operands. This eliminates affine.apply ops feeding into affine.if
ops.
Differential Revision: https://reviews.llvm.org/D130242
This is the same as the existing multiplier-1 variant of DepthwiseConv2D, but in PyTorch dim order.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D128575
This is to improve consistency within the SPIR-V dialect and make these ops a bit shorter.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D130280
This change modifies `structured.tile_to_foreach_thread_op` so that
it accepts either `tile_sizes` or `num_threads` parameters. If
`tile_sizes` are specified, then the number of threads required is
derived the tile sizes rather than the other way around. In both cases,
more aggressive folding of loop parameters is enabled during the
transformation, allowing for the potential elimination of `affine.min`
and `affine.max` operations in the static shape case when calculating
the final adjusted tile size.
Differential Revision: https://reviews.llvm.org/D130139
This operation is a NavigationOp that simplifies the writing of transform IR.
Since there is no way of refering to an interface by name, the current implementation uses
an EnumAttr and depends on the interfaces it supports.
In the future, it would be worthwhile to remove this dependence and generalize.
Differential Revision: https://reviews.llvm.org/D130267
The `tileAndFuseLinalgOps` is a legacy approach for tiling + fusion of
Linalg operations. Since it was also intended to work on operations
with buffer operands, this method had fairly complex logic to make
sure tile and fuse was correct even with side-effecting linalg ops.
While complex, it still wasnt robust enough. This patch deprecates
this method and thereby deprecating the tiling + fusion method for ops
with buffer semantics. Note that the core transformation to do fusion
of a producer with a tiled consumer still exists. The deprecation here
only removes methods that auto-magically tried to tile and fuse
correctly in presence of side-effects.
The `tileAndFuseLinalgOps` also works with operations with tensor
semantics. There are at least two other ways the same functionality
exists.
1) The `tileConsumerAndFuseProducers` method. This does a similar
transformation, but using a slightly different logic to
automatically figure out the legal tile + fuse code. Note that this
is also to be deprecated soon.
2) The prefered way uses the `TilingInterface` for tile + fuse, and
relies on the caller to set the tiling options correctly to ensure
that the generated code is correct.
As proof that (2) is equivalent to the functionality provided by
`tileAndFuseLinalgOps`, relevant tests have been moved to use the
interface, where the test driver sets the tile sizes appropriately to
generate the expected code.
Differential Revision: https://reviews.llvm.org/D129901
This patch adds constant folder for LogOp which only supports single and double precision floating-point.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D130148
This is to improve the consistency within the SPIR-V dialect and to make op names a bit shorter.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D130194
This patch adds constant folder for Log1pOp which only supports single and double precision floating-point.
Differential Revision: https://reviews.llvm.org/D129979
This file contains a huge number of tests that should really be in
different dialect/files. It is monolothic because of the legacy
surrounding the old standard dialect, affine operations, etc. Splitting
this up makes the tests much more maintainable given that they are now
group with other similar tests.
This revision adds a new transformation to tile a TilingInterface `op` to a tiled `scf.foreach_thread`, applying
tiling by `num_threads`.
If non-empty, the `threadDimMapping` is added as an attribute to the resulting `scf.foreach_thread`.
0-tile sizes (i.e. tile by the full size of the data) are used to encode
that a dimension is not tiled.
Differential Revision: https://reviews.llvm.org/D129577
This op used to belong to the sparse dialect, but there are use cases for dense bufferization as well. (E.g., when a tensor alloc is returned from a function and should be deallocated at the call site.) This change moves the op to the bufferization dialect, which now has an `alloc_tensor` and a `dealloc_tensor` op.
Differential Revision: https://reviews.llvm.org/D129985
This patch adds constant folder for Log10Op which only support single and double precision floating-point.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D129740
This patch adds a pattern to decompose a `linalg.generic` operations
that
- has only parallel iterator types
- has more than 2 statements (including the yield)
into multiple `linalg.generic` operation such that each operation has
a single statement and a yield.
The pattern added here just splits the matching `linalg.generic` into
two `linalg.generic`s, one containing the first statement, and the
other containing the remaining. The same pattern can be applied
repeatedly on the second op to ultimately fully decompose the generic
op.
Differential Revision: https://reviews.llvm.org/D129704