This is the very first step toward removing the glue and clutter from linalg and
replace it with proper sparse tensor types. This revision migrates the LinalgSparseOps
into SparseTensorOps of a sparse tensor dialect. This also provides a new home for
sparse tensor related transformation.
NOTE: the actual replacement with sparse tensor types (and removal of linalg glue/clutter)
will follow but I am trying to keep the amount of changes per revision manageable.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D101488
FillOp allows complex ops, and filling a properly sized buffer with
a default zero complex number is implemented.
Differential Revision: https://reviews.llvm.org/D99939
Add a new ProgressiveVectorToSCF pass that lowers vector transfer ops to SCF by gradually unpacking one dimension at time. Unpacking stops at 1D, but can be configured to stop earlier, should the HW support (N>1)-d vectors.
The current implementation cannot handle permutation maps, masks, tensor types and unrolling yet. These will be added in subsequent commits. Once features are on par with VectorToSCF, this implementation will replace VectorToSCF.
Differential Revision: https://reviews.llvm.org/D100622
Instead of interchanging loops during the loop lowering this pass performs the interchange by permuting the indexing maps. It also updates the iterator types and the index accesses in the body of the operation.
Differential Revision: https://reviews.llvm.org/D100627
This matches the current support provided to operations, and allows attaching traits, interfaces, and using the DeclareInterfaceMethods utility. This was missed when attribute/type generation was first added.
Differential Revision: https://reviews.llvm.org/D100233
The `linalg.index` operation provides access to the iteration indexes of immediately enclosing linalg operations. It takes a dimension `dim` attribute and returns the iteration index in the given dimension. Having `linalg.index` allows us to unify `linalg.generic` and `linalg.indexed_generic` and also enables index access in named operations.
Differential Revision: https://reviews.llvm.org/D100292
Right now Elementwise operations fusion in Linalg fuses everything it
can. This can run up against resource limits of the target hardware
without some checks. This patch adds a callback function that clients
can use to implement a cost function. When two elementwise operations
are deemed structurally fusable, the callback can be used to control
if the fusion applies.
Differential Revision: https://reviews.llvm.org/D99820
This commit add utility functions for creating push constant
storage variable and loading values from it.
Along the way, performs some clean up:
* Deleted `setABIAttrs`, which is just a 4-liner function
with one user.
* Moved `SPIRVConverstionTarget` into `mlir` namespace,
to be consistent with `SPIRVTypeConverter` and
`LLVMConversionTarget`.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D99725
Fixes a bug in affine fusion pipeline where an incorrect slice is computed.
After the slice computation is done, original domain of the the source is
compared with the new domain that will result if the fusion succeeds. If the
new domain must be a subset of the original domain for the slice to be
valid. If the slice computed is incorrect, fusion based on such a slice is
avoided.
Relevant test cases are added/edited.
Fixes https://bugs.llvm.org/show_bug.cgi?id=49203
Differential Revision: https://reviews.llvm.org/D98239
This allows for the conversion to match `A(B()) -> C()` with a pattern matching
`A` and marking `B` for deletion.
Also add better assertions when an operation is erased while still having uses.
Differential Revision: https://reviews.llvm.org/D99442
A new `InterfaceMethod` is added to `InferShapedTypeOpInterface` that
allows an operation to return the `Value`s for each dim of its
results. It is intended for the case where the `Value` returned for
each dim is computed using the operands and operation attributes. This
interface method is for cases where the result dim of an operation can
be computed independently, and it avoids the need to aggregate all
dims of a result into a single shape value. This also implies that
this is not suitable for cases where the result type is unranked (for
which the existing interface methods is to be used).
Also added is a canonicalization pattern that uses this interface and
resolves the shapes of the output in terms of the shapes of the
inputs. Moving Linalg ops to use this interface, so that many
canonicalization patterns implemented for individual linalg ops to
achieve the same result can be removed in favor of the added
canonicalization pattern.
Differential Revision: https://reviews.llvm.org/D97887
The issue was introduced in D98468.
The `{0}Regions` is an array of `std::unique_ptr<Region>` objects,
so it should be processed accordingly.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D99332
In particular for Graph Regions, the terminator needs is just a
historical artifact of the generalization of MLIR from CFG region.
Operations like Module don't need a terminator, and before Module
migrated to be an operation with region there wasn't any needed.
To validate the feature, the ModuleOp is migrated to use this trait and
the ModuleTerminator operation is deleted.
This patch is likely to break clients, if you're in this case:
- you may iterate on a ModuleOp with `getBody()->without_terminator()`,
the solution is simple: just remove the ->without_terminator!
- you created a builder with `Builder::atBlockTerminator(module_body)`,
just use `Builder::atBlockEnd(module_body)` instead.
- you were handling ModuleTerminator: it isn't needed anymore.
- for generic code, a `Block::mayNotHaveTerminator()` may be used.
Differential Revision: https://reviews.llvm.org/D98468
Until now Linalg fusion only allow fusing producers whose operands
are all permutation indexing maps. It's easier to deduce the
subtensor/subview but it is an unnecessary constraint, as in tiling
we have more advanced logic to deduce the subranges even when the
operand is not of permutation indexing maps, e.g., the input operand
for convolution ops.
This patch uses the logic on tiling side to deduce subranges for
fusion. This enables fusing convolution with its consumer ops
when possible.
Along the way, we are now generating proper affine.min ops to guard
against size boundaries, if we cannot be certain they won't be
out of bounds.
Differential Revision: https://reviews.llvm.org/D99014
This is useful for bit-packing types such as vectors and tuples as well as for
exotic architectures that have non-8-bit bytes.
Depends On D98500
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D98524
ModuleOp is a natural place to provide scoped data layout information. However,
it is undesirable for ModuleOp to implement the entirety of
DataLayoutOpInterface because that would require either pushing the interface
inside the IR library instead of a separate library, or putting the default
implementation of the interface as inline functions in headers leading to
binary bloat. Instead, ModuleOp accepts an arbitrary data layout spec attribute
and has a dedicated hook to extract it, and DataLayout is modified to know
about ModuleOp particularities.
Reviewed By: herhut, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98500
Fix the BlockAndValueMapping update that was missing entries for scf.for op's blockIterArgs.
Skip cloning subtensors of the padded tensor as the logic for these is separate.
Add a filter to drop side-effecting ops.
Tests are beefed up to verify the IR is sound in all hoisting configurations for 2-level 3-D tiled matmul.
Differential Revision: https://reviews.llvm.org/D99255
This mechanism makes it possible for a dialect to not register all
operations but still answer interface-based queries.
This can useful for dialects that are "open" or connected to an external
system and still interoperate with the compiler. It can also open up the
possibility to have a more extensible compiler at runtime: the compiler
does not need a pre-registration for each operation and the dialect can
inject behavior dynamically.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D93085
To match an interface or trait, users currently have to use the `MatchAny` tag. This tag can be quite problematic for compile time for things like the canonicalizer, as the `MatchAny` patterns may get applied to *every* operation. This revision adds better support by bucketing interface/trait patterns based on which registered operations have them registered. This means that moving forward we will only attempt to match these patterns to operations that have this interface registered. Two simplify defining patterns that match traits and interfaces, two new utility classes have been added: OpTraitRewritePattern and OpInterfaceRewritePattern.
Differential Revision: https://reviews.llvm.org/D98986
The "else" group of an optional element is a collection of elements that get parsed/printed when the anchor of the main element group is *not* present. This is useful when there is a special syntax when an element is not present. The new syntax for an optional element is shown below:
```
optional-group: `(` elements `)` (`:` `(` else-elements `)`)? `?`
```
An example of how this might be used is shown below:
```tablegen
def FooOp : ... {
let arguments = (ins UnitAttr:$foo);
let assemblyFormat = "attr-dict (`foo_is_present` $foo^):(`foo_is_absent`)?";
}
```
would be formatted as such:
```mlir
// When the `foo` attribute is present:
foo.op foo_is_present
// When the `foo` attribute is not present:
foo.op foo_is_absent
```
Differential Revision: https://reviews.llvm.org/D99129
This nicely aligns the naming with RewritePatternSet. This type isn't
as widely used, but we keep a using declaration in to help with
downstream consumption of this change.
Differential Revision: https://reviews.llvm.org/D99131
This doesn't change APIs, this just cleans up the many in-tree uses of these
names to use the new preferred names. We'll keep the old names around for a
couple weeks to help transitions.
Differential Revision: https://reviews.llvm.org/D99127
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters. There are many many more to be removed.
Differential Revision: https://reviews.llvm.org/D99028
This change combines for ROCm what was done for CUDA in D97463, D98203, D98360, and D98396.
I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot though.
Reviewed By: whchung
Differential Revision: https://reviews.llvm.org/D98447
Add a feature to `EnumAttr` definition to generate
specialized Attribute class for the particular enumeration.
This class will inherit `StringAttr` or `IntegerAttr` and
will override `classof` and `getValue` methods.
With this class the enumeration predicate can be checked with simple
RTTI calls (`isa`, `dyn_cast`) and it will return the typed enumeration
directly instead of raw string/integer.
Based on the following discussion:
https://llvm.discourse.group/t/rfc-add-enum-attribute-decorator-class/2252
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97836
Supporting ranges in the byte code requires additional complexity, given that a range can't be easily representable as an opaque void *, as is possible with the existing bytecode value types (Attribute, Type, Value, etc.). To enable representing a range with void *, an auxillary storage is used for the actual range itself, with the pointer being passed around in the normal byte code memory. For type ranges, a TypeRange is stored. For value ranges, a ValueRange is stored. The above problem represents a majority of the complexity involved in this revision, the rest is adapting/adding byte code operations to support the changes made to the PDL interpreter in the parent revision.
After this revision, PDL will have initial end-to-end support for variadic operands/results.
Differential Revision: https://reviews.llvm.org/D95723
This has a numerous amount of benefits, given the overly clunky nature of CreateNativeOp:
* Users can now call into arbitrary rewrite functions from inside of PDL, allowing for more natural interleaving of PDL/C++ and enabling for more of the pattern to be in PDL.
* Removes the need for an additional set of C++ functions/registry/etc. The new ApplyNativeRewriteOp will use the same PDLRewriteFunction as the existing RewriteOp. This reduces the API surface area exposed to users.
This revision also introduces a new PDLResultList class. This class is used to provide results of native rewrite functions back to PDL. We introduce a new class instead of using a SmallVector to simplify the work necessary for variadics, given that ranges will require some changes to the structure of PDLValue.
Differential Revision: https://reviews.llvm.org/D95720
This patch introduces progressive lowering patterns for rewriting
vector.transfer_read/write to vector.load/store and vector.broadcast
in certain supported cases.
Reviewed By: dcaballe, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D97822
This allows for storage instances to store data that isn't uniqued in the context, or contain otherwise non-trivial logic, in the rare situations that they occur. Storage instances with trivial destructors will still have their destructor skipped. A consequence of this is that the storage instance definition must be visible from the place that registers the type.
Differential Revision: https://reviews.llvm.org/D98311
Data layout information allows to answer questions about the size and alignment
properties of a type. It enables, among others, the generation of various
linear memory addressing schemes for containers of abstract types and deeper
reasoning about vectors. This introduces the subsystem for modeling data
layouts in MLIR.
The data layout subsystem is designed to scale to MLIR's open type and
operation system. At the top level, it consists of attribute interfaces that
can be implemented by concrete data layout specifications; type interfaces that
should be implemented by types subject to data layout; operation interfaces
that must be implemented by operations that can serve as data layout scopes
(e.g., modules); and dialect interfaces for data layout properties unrelated to
specific types. Built-in types are handled specially to decrease the overall
query cost.
A concrete default implementation of these interfaces is provided in the new
Target dialect. Defaults for built-in types that match the current behavior are
also provided.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97067
Clean-up after D98279, remove one call to createConvertGPUKernelToBlobPass().
Depends On D98203
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D98360
This class provides efficient implementations of symbol queries related to uses, such as collecting the users of a symbol, replacing all uses, etc. This provides similar benefits to use related queries, as SymbolTableCollection did for lookup queries.
Differential Revision: https://reviews.llvm.org/D98071
This allows the caller to distinguish between a parse error or an
unmatched keyword. It fixes the redundant error that was emitted by the
caller when the generated parser would fail.
Differential Revision: https://reviews.llvm.org/D98162
This makes it easy to compose the distribution computation with
other affine computations.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D98171
This patch is a follow-up on D97217. It adds a new 'Skip' result to the Operation visitor
so that a callback can stop the ongoing visit of an operation/block/region and
continue visiting the next one without fully interrupting the walk. Skipping is
needed to be able to erase an operation/block in pre-order and do not continue
visiting the internals of that operation/block.
Related to the skipping mechanism, the patch also introduces the following changes:
* Added new TestIRVisitors pass with basic testing for the IR visitors.
* Fixed missing early increment ranges in visitor implementation.
* Updated documentation of walk methods to include erasure information and walk
order information.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D97820
The value type of the attribute can be specified by either overriding the typeBuilder field on the AttrDef, or by providing a parameter of type `AttributeSelfTypeParameter`. This removes the need to define custom storage class constructors for attributes that have a value type other than NoneType.
Differential Revision: https://reviews.llvm.org/D97590
There is no need for the interface implementations to be exposed, opaque
registration functions are sufficient for all users, similarly to passes.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97852
The support for attributes closely maps that of Types (basically 1-1) given that Attributes are defined in exactly the same way as Types. All of the current ODS TypeDef classes get an Attr equivalent. The generation of the attribute classes themselves share the same generator as types.
Differential Revision: https://reviews.llvm.org/D97589
Some elementwise operations are not scalarizable, vectorizable, or tensorizable.
Split `ElementwiseMappable` trait into the following, more precise traits.
- `Elementwise`
- `Scalarizable`
- `Vectorizable`
- `Tensorizable`
This allows for reuse of `Elementwise` in dialects like HLO.
Differential Revision: https://reviews.llvm.org/D97674
For ops that produces tensor types and implement the shaped type component interface, the type inference interface can be used. Create a grouping of these together to make it easier to specify (it cannot be added into a list of traits, but must rather be appended/concated to one as it isn't a trait but a list of traits).
Differential Revision: https://reviews.llvm.org/D97636
'getAttrs' has been explicitly marked deprecated. This patch refactors
to use Operation::getAttrs().
Reviewed By: csigg
Differential Revision: https://reviews.llvm.org/D97546
This transformation was only used for quick experimentation and is not general enough.
Retire it.
Differential Revision: https://reviews.llvm.org/D97266
`verifyConstructionInvariants` is intended to allow for verifying the invariants of an attribute/type on construction, and `getChecked` is intended to enable more graceful error handling aside from an assert. There are a few problems with the current implementation of these methods:
* `verifyConstructionInvariants` requires an mlir::Location for emitting errors, which is prohibitively costly in the situations that would most likely use them, e.g. the parser.
This creates an unfortunate code duplication between the verifier code and the parser code, given that the parser operates on llvm::SMLoc and it is an undesirable overhead to pre-emptively convert from that to an mlir::Location.
* `getChecked` effectively requires duplicating the definition of the `get` method, creating a quite clunky workflow due to the subtle different in its signature.
This revision aims to talk the above problems by refactoring the implementation to use a callback for error emission. Using a callback allows for deferring the costly part of error emission until it is actually necessary.
Due to the necessary signature change in each instance of these methods, this revision also takes this opportunity to cleanup the definition of these methods by:
* restructuring the signature of `getChecked` such that it can be generated from the same code block as the `get` method.
* renaming `verifyConstructionInvariants` to `verify` to match the naming scheme of the rest of the compiler.
Differential Revision: https://reviews.llvm.org/D97100
Rationale:
Touching function level information can only be done within a module pass.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97102
This commit introduced a cyclic dependency:
Memref dialect depends on Standard because it used ConstantIndexOp.
Std depends on the MemRef dialect in its EDSC/Intrinsics.h
Working on a fix.
This reverts commit 8aa6c3765b.
Create the memref dialect and move several dialect-specific ops without
dependencies to other ops from std dialect to this dialect.
Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
DeallocOp -> MemRef_DeallocOp
MemRefCastOp -> MemRef_CastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
TransposeOp -> MemRef_TransposeOp
ViewOp -> MemRef_ViewOp
The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667
Differential Revision: https://reviews.llvm.org/D96425
A series of preceding patches changed the mechanism for translating MLIR to
LLVM IR to use dialect interface with delayed registration. It is no longer
necessary for specific dialects to derive from ModuleTranslation. Remove all
virtual methods from ModuleTranslation and factor out the entry point to be a
free function.
Also perform some cleanups in ModuleTranslation internals.
Depends On D96774
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96775
These patterns unrolls transfer read/write ops if the vector consumers/
producers are extract/insert slices op. Transfer ops can map to hardware
load/store functionalities, where the vector size matters for bandwidth
considerations. So these patterns should be collected separately, instead
of being generic canonicalization patterns.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96782
Port the translation of five dialects that define LLVM IR intrinsics
(LLVMAVX512, LLVMArmNeon, LLVMArmSVE, NVVM, ROCDL) to the new dialect
interface-based mechanism. This allows us to remove individual translations
that were created for each of these dialects and just use one common
MLIR-to-LLVM-IR translation that potentially supports all dialects instead,
based on what is registered and including any combination of translatable
dialects. This removal was one of the main goals of the refactoring.
To support the addition of GPU-related metadata, the translation interface is
extended with the `amendOperation` function that allows the interface
implementation to post-process any translated operation with dialect attributes
from the dialect for which the interface is implemented regardless of the
operation's dialect. This is currently applied to "kernel" functions, but can
be used to construct other metadata in dialect-specific ways without
necessarily affecting operations.
Depends On D96591, D96504
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96592
This revision connects the generated sparse code with an actual
sparse storage scheme, which can be initialized from a test file.
Lacking a first-class citizen SparseTensor type (with buffer),
the storage is hidden behind an opaque pointer with some "glue"
to bring the pointer back to tensor land. Rather than generating
sparse setup code for each different annotated tensor (viz. the
"pack" methods in TACO), a single "one-size-fits-all" implementation
has been added to the runtime support library. Many details and
abstractions need to be refined in the future, but this revision
allows full end-to-end integration testing and performance
benchmarking (with on one end, an annotated Lingalg
op and, on the other end, a JIT/AOT executable).
Reviewed By: nicolasvasilache, bixia
Differential Revision: https://reviews.llvm.org/D95847
This allows for referencing nearly every component of an operation from within a custom directive.
It also fixes a bug with the current type_ref implementation, PR48478
Differential Revision: https://reviews.llvm.org/D96189
This revision adds a new `AliasAnalysis` class that represents the main alias analysis interface in MLIR. The purpose of this class is not to hold the aliasing logic itself, but to provide an interface into various different alias analysis implementations. As it evolves this should allow for users to plug in specialized alias analysis implementations for their own needs, and have them immediately usable by other analyses and transformations.
This revision also adds an initial simple generic alias, LocalAliasAnalysis, that provides support for performing stateless local alias queries between values. This class is similar in scope to LLVM's BasicAA.
Differential Revision: https://reviews.llvm.org/D92343
These properties were useful for a few things before traits had a better integration story, but don't really carry their weight well these days. Most of these properties are already checked via traits in most of the code. It is better to align the system around traits, and improve the performance/cost of traits in general.
Differential Revision: https://reviews.llvm.org/D96088
This revision fixes the fact that the padding transformation did not have enough information to set the proper type for the padding value.
Additionally, the verifier for Yield in the presence of PadTensorOp is fixed to properly report incorrect number of results or operands. Previously, the error would be silently ignored which made the core issue difficult to debug.
Differential Revision: https://reviews.llvm.org/D96264
These patterns move vector.bitcast ops to be before
insert ops or after extract ops where suitable.
With them, bitcast will happen on smaller vectors
and there are more chances to share extract/insert
ops.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D96040
This patch adds patterns to use vector.shape_cast to cast
away leading 1-dimensions from a few vector operations.
It allows exposing more canonical forms of vector.transfer_read,
vector.transfer_write, vector_extract_strided_slice, and
vector.insert_strided_slice. With this, we can have more
opportunity to cancelling extract/insert ops or forwarding
write/read ops.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D95873
This makes ignoring a result explicit by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as no discard was always the intention from the beginning, but got lost along the way.
Differential Revision: https://reviews.llvm.org/D95841
This revision takes advantage of recent extensions to vectorization to refactor contraction detection into a bona fide Linalg interface.
The mlit-linalg-ods-gen parser is extended to support adding such interfaces.
The detection that was originally enabling vectorization is refactored to serve as both a test on a generic LinalgOp as well as to verify ops that declare to conform to that interface.
This is plugged through Linalg transforms and strategies but it quickly becomes evident that the complexity and rigidity of the C++ class based templating does not pay for itself.
Therefore, this revision changes the API for vectorization patterns to get rid of templates as much as possible.
Variadic templates are relegated to the internals of LinalgTransformationFilter as much as possible and away from the user-facing APIs.
It is expected other patterns / transformations will follow the same path and drop as much C++ templating as possible from the class definition.
Differential revision: https://reviews.llvm.org/D95973
In dialect conversion infrastructure, source materialization applies as part of
the finalization procedure to results of the newly produced operations that
replace previously existing values with values having a different type.
However, such operations may be created to replace operations created in other
patterns. At this point, it is possible that the results of the _original_
operation are still in use and have mismatching types, but the results of the
_intermediate_ operation that performed the type change are not in use leading
to the absence of source materialization. For example,
%0 = dialect.produce : !dialect.A
dialect.use %0 : !dialect.A
can be replaced with
%0 = dialect.other : !dialect.A
%1 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %1 : !dialect.A
and then with
%0 = dialect.final : !dialect.B
%1 = dialect.other : !dialect.A // replaced, scheduled for removal
%2 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %2 : !dialect.A
in the same rewriting, but only the %1->%0 replacement is currently considered.
Change the logic in dialect conversion to look up all values that were replaced
by the given value and performing source materialization if any of those values
is still in use with mismatching types. This is performed by computing the
inverse value replacement mapping. This arguably expensive manipulation is
performed only if there were some type-changing replacements. An alternative
could be to consider all replaced operations and not only those that resulted
in type changes, but it would harm pattern-level composability: the pattern
that performed the non-type-changing replacement would have to be made aware of
the type converter in order to call the materialization hook.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95626
This revision defines a Linalg contraction in general terms:
1. Has 2 input and 1 output shapes.
2. Has at least one reduction dimension.
3. Has only projected permutation indexing maps.
4. its body computes `u5(u1(c) + u2(u3(a) * u4(b)))` on some field
(AddOpType, MulOpType), where u1, u2, u3, u4 and u5 represent scalar unary
operations that may change the type (e.g. for mixed-precision).
As a consequence, when vectorization of such an op occurs, the only special
behavior is that the (unique) MulOpType is vectorized into a
`vector.contract`. All other ops are handled in a generic fashion.
In the future, we may wish to allow more input arguments and elementwise and
constant operations that do not involve the reduction dimension(s).
A test is added to demonstrate the proper vectorization of matmul_i8_i8_i32.
Differential revision: https://reviews.llvm.org/D95939
We could extend this with an interface to allow dialect to perform a type
conversion, but that would make the folder creating operation which isn't
the case at the moment, and isn't necessarily always desirable.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95991
Fix a bug that was introduced where calling the codegen strategy with actual concrete C++ Op types did not trigger the expected behavior.
Also introduce a test for the behavior that was missing.
Differential Revision: https://reviews.llvm.org/D95863
In dialect conversion, signature conversions essentially perform block argument
replacement and are added to the general value remapping. However, the replaced
values were not tracked, so if a signature conversion was rolled back, the
construction of operand lists for the following patterns could have obtained
block arguments from the mapping and give them to the pattern leading to
use-after-free. Keep track of signature conversions similarly to normal block
argument replacement, and erase such replacements from the general mapping when
the conversion is rolled back.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95688
This is the last revision to migrate using SimplePadOp to PadTensorOp, and the
SimplePadOp is removed in the patch. Update a bit in SliceAnalysis because the
PadTensorOp takes a region different from SimplePadOp. This is not covered by
LinalgOp because it is not a structured op.
Also, remove a duplicated comment from cpp file, which is already described in a
header file. And update the pseudo-mlir in the comment.
This is as same as D95615 but fixing one dep in CMakeLists.txt
Different from D95671, the fix was applied to run target.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D95785
This reverts commit d9b953d84b.
This commit resulted in build bot failures and the author is away from a
computer, so I am reverting on their behalf until they have a chance to
look into this.
This is the last revision to migrate using SimplePadOp to PadTensorOp, and the
SimplePadOp is removed in the patch. Update a bit in SliceAnalysis because the
PadTensorOp takes a region different from SimplePadOp. This is not covered by
LinalgOp because it is not a structured op.
Also, remove a duplicated comment from cpp file, which is already described in a
header file. And update the pseudo-mlir in the comment.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95671
This is the last revision to migrate using SimplePadOp to PadTensorOp, and the
SimplePadOp is removed in the patch. Update a bit in SliceAnalysis because the
PadTensorOp takes a region different from SimplePadOp. This is not covered by
LinalgOp because it is not a structured op.
Also, remove a duplicated comment from cpp file, which is already described in a
header file. And update the pseudo-mlir in the comment.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95615
This revision adds a layer of SFINAE to the composable codegen strategy so it does
not have to require statically defined ops but instead can also be used with OpInterfaces, Operation* and an op name string.
A linalg.matmul_i8_i8_i32 is added to the .tc spec to demonstrate how all this works end to end.
Differential Revision: https://reviews.llvm.org/D95600
This revision improves the usage of the codegen strategy by adding a few flags that
make it easier to control for the CLI.
Usage of ModuleOp is replaced by FuncOp as this created issues in multi-threaded mode.
A simple benchmarking capability is added for linalg.matmul as well as linalg.matmul_column_major.
This latter op is also added to linalg.
Now obsolete linalg integration tests that also take too long are deleted.
Correctness checks are still missing at this point.
Differential revision: https://reviews.llvm.org/D95531
This transformation anchors on a padding op whose result is only used as an input
to a Linalg op and pulls it out of a given number of loops.
The result is a packing of padded tailes of ops that is amortized just before
the outermost loop from which the pad operation is hoisted.
Differential revision: https://reviews.llvm.org/D95243
This revision allows the base Linalg tiling pattern to optionally require padding to
a constant bounding shape.
When requested, a simple analysis is performed, similar to buffer promotion.
A temporary `linalg.simple_pad` op is added to model padding for the purpose of
connecting the dots. This will be replaced by a more fleshed out `linalg.pad_tensor`
op when it is available.
In the meantime, this temporary op serves the purpose of exhibiting the necessary
properties required from a more fleshed out pad op, to compose with transformations
properly.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95149
This revision adds support for using either operand or result types to anchor an optional group. It also removes the arbitrary restriction that type directives must refer to variables in the same group, which is overly limiting for a declarative format syntax.
Fixes PR#48784
Differential Revision: https://reviews.llvm.org/D95109
Use cases with 16- or even 8-bit pointer/index structures have been identified.
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D95015