LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D113955
Limit the backtracking along def-use chains when a prefix is encountered as it would generate incorrect foldings.
Differential Revision: https://reviews.llvm.org/D113975
* It works similar to scf.for coversion, but convert condition and yield ops as part of scf.whille pattern so it don't need to maintain external state
Differential Revision: https://reviews.llvm.org/D113007
For the semi affine expressions, whenever rhs of a floordiv, ceildiv, mod
or product expression is a symbolic expression, we introduce a local variable
representing the result, and store the floordiv/ceildiv, mod or product
affine expression in LocalExprs. In this way the expression is flattened,
and trivial addition and subtraction related simplifications are performed.
Also rule based matching for detecting and transforming "expr - q * (expr floordiv q)"
to "expr mod q", where q is a symbolic exxpression, in simplifyAdd function.
Differential Revision: https://reviews.llvm.org/D112808
This patch extends the existing functionality of computing an explicit
representation for local variables, to also get the explicit representation,
instead of only the inequality pairs.
This is required for a future patch to remove redundant local ids based on
their explicit representation.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D113814
This reverts commit 94992670fc.
Build is broken with:
tools/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.cpp.inc:23996:3: error: no matching function for call to 'printSwitchOpCases'
printSwitchOpCases(_odsPrinter, *this, getValue().getType(), getCaseValuesAttr(), getCaseDestinations(), getCaseOperands(), getCaseOperands().getTypes());
^~~~~~~~~~~~~~~~~~
LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D113955
This revision contains all "sparsification" ops and rewriting necessary to support sparse output tensors when the kernel has no reduction (viz. insertions occur in lexicographic order and are "injective"). This will be later generalized to allow reductions too. Also, this first revision only supports sparse 1-d tensors (viz. vectors) as output in the runtime support library. This will be generalized to n-d tensors shortly. But this way, the revision is kept to a manageable size.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D113705
Split tosa.reshape into three individual lowerings: collapse, expand and a
combination of both. Add simple dynamic shape support.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D113936
Multiply by one can be removed during canonicalization. This optimizes away unneeded operations.
Differential Revision: https://reviews.llvm.org/D113807
This is in prevision of dropping them altogether and using insert/extract based patterns.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D113928
This is analogous to what LLVM's PatternMatch.h supports,
but LLVM calls it m_Value for both the binding and
nonbinding versions.
This is an upstream from CIRCT and is used there.
Differential Revision: https://reviews.llvm.org/D113905
When trying to connect the vectorization of depthwise convolutions to e2e execution
a number of problems surfaced.
Fix an off-by-one error on the size of the input vector (similary to what was previously done for regular conv).
Rewrite the lowering to vector.fma instead of vector.contract: the KW reduction dimension has already been unrolled and vector.contract requires a reduction dimension to be valid.
Differential Revision: https://reviews.llvm.org/D113884
FMAOp -> LLVM conversion is done progressively by peeling off 1 dimension from FMAOp at each pattern iteration. Add the recursively bounded property declaration to the pattern so that the rewriter can apply it multiple times.
Without this, FMAOps with 3+D do not lower to LLVM.
Differential Revision: https://reviews.llvm.org/D113886
OperationLegalizer::isIllegal returns false if operation legality wasn't
registered by user and we expect same behaviour when dynamic legality
callback return None, but instead true was returned.
Differential Revision: https://reviews.llvm.org/D113267
This change makes it possible to set up custom mappings in a PostAnalysisStep. Some users of Comprehensive Bufferize have custom tensor types and it is most convenient to just reuse the same bvm.
Also add some more assertions.
Differential Revision: https://reviews.llvm.org/D113726
Names should be consistent across all operations otherwise painful bugs will surface.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D113762
This reverts commit bec488b818.
This commit introduced a layering violation between MLIR libraries.
Reverting for now while discussing on the original review thread.
Re-applies D111513:
* Adds a full-fledged Python example dialect and tests to the Standalone example (need to do a bit of tweaking in the top level CMake and lit tests to adapt better to if not building with Python enabled).
* Rips out remnants of custom extension building in favor of pybind11_add_module which does the right thing.
* Makes python and extension sources installable (outputs to src/python/${name} in the install tree): Both Python and C++ extension sources get installed as downstreams need all of this in order to build a derived version of the API.
* Exports sources targets (with our properties that make everything work) by converting them to INTERFACE libraries (which have export support), as recommended for the forseeable future by CMake devs. Renames custom properties to start with lower-case letter, as also recommended/required (groan).
* Adds a ROOT_DIR argument to declare_mlir_python_extension since now all C++ sources for an extension must be under the same directory (to line up at install time).
* Downstreams will need to adapt by:
* Remove absolute paths from any SOURCES for declare_mlir_python_extension (I believe all downstreams are just using ${CMAKE_CURRENT_SOURCE_DIR} here, which can just be ommitted). May need to set ROOT_DIR if not relative to the current source directory.
* To allow further downstreams to install/build, will need to make sure that all C++ extension headers are also listed under SOURCES for declare_mlir_python_extension.
This reverts commit 1a6c26d1f5.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D113732
This patch adds functionality to parse FlatAffineConstraints from a
StringRef with the intention to be used for unit tests. This should
make the construction of FlatAffineConstraints easier for testing
purposes.
The patch contains an example usage of the functionality in a unit test that
uses FlatAffineConstraints.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D113275
Part of the very first discussion here, but didn't upstream it before as we
didn't use it yet. Fix that for pending updates. Just adding the op here,
follow up will add the lowering to codegen.
At this time the 2 flavors of conv are a little too different to allow significant code sharing and other will likely come up.
so we go the easy route first by duplicating and adapting.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D113758
Per discussion on discord and various feature requests across bindings (Haskell and Rust bindings authors have asked me directly), we should be building a link-ready MLIR-C dylib which exports the C API and can be used without linking to anything else.
This patch:
* Adds a new MLIR-C aggregate shared library (libMLIR-C.so), which is similar in name and function to libLLVM-C.so.
* It is guarded by the new CMake option MLIR_BUILD_MLIR_C_DYLIB, which has a similar purpose/name to the LLVM_BUILD_LLVM_C_DYLIB option.
* On all platforms, this will work with both static, BUILD_SHARED_LIBS, and libMLIR builds, if supported:
* In static builds: libMLIR-C.so will export the CAPI symbols and statically link all dependencies into itself.
* In BUILD_SHARED_LIBS: libMLIR-C.so will export the CAPI symbols and have dynamic dependencies on implementation shared libraries.
* In libMLIR.so mode: same as static. libMLIR.so was not finished for actual linking use within the project. An eventual relayering so that libMLIR-C.so depends on libMLIR.so is possible but requires first re-engineering the latter to use the aggregate facility.
* On Linux, exported symbols are filtered to only the CAPI. On others (MacOS, Windows), all symbols are exported. A CMake status is printed unless if global visibility is hidden indicating that this has not yet been implemented. The library should still work, but it will be larger and more likely to conflict until fixed. Someone should look at lifting the corresponding support from libLLVM-C.so and adapting. Or, for special uses, just build with `-DCMAKE_CXX_VISIBILITY_PRESET=hidden -DCMAKE_C_VISIBILITY_PRESET=hidden`.
* Includes fixes to execution engine symbol export macros to enable default visibility. Without this, the advice to use hidden visibility would have resulted in test failures and unusable execution engine support libraries.
Differential Revision: https://reviews.llvm.org/D113731
* Depends on D111504, which provides the boilerplate for building aggregate shared libraries from installed MLIR.
* Adds a full-fledged Python example dialect and tests to the Standalone example (need to do a bit of tweaking in the top level CMake and lit tests to adapt better to if not building with Python enabled).
* Rips out remnants of custom extension building in favor of `pybind11_add_module` which does the right thing.
* Makes python and extension sources installable (outputs to src/python/${name} in the install tree): Both Python and C++ extension sources get installed as downstreams need all of this in order to build a derived version of the API.
* Exports sources targets (with our properties that make everything work) by converting them to INTERFACE libraries (which have export support), as recommended for the forseeable future by CMake devs. Renames custom properties to start with lower-case letter, as also recommended/required (groan).
* Adds a ROOT_DIR argument to `declare_mlir_python_extension` since now all C++ sources for an extension must be under the same directory (to line up at install time).
* Need to validate against a downstream or two and adjust, prior to submitting.
Downstreams will need to adapt by:
* Remove absolute paths from any SOURCES for `declare_mlir_python_extension` (I believe all downstreams are just using `${CMAKE_CURRENT_SOURCE_DIR}` here, which can just be ommitted). May need to set `ROOT_DIR` if not relative to the current source directory.
* To allow further downstreams to install/build, will need to make sure that all C++ extension headers are also listed under SOURCES for `declare_mlir_python_extension`.
Reviewed By: stephenneuendorffer, mikeurbach
Differential Revision: https://reviews.llvm.org/D111513
With `-Os` turned on, results in 2-5% binary size reduction
(depends on the original binary). Without it, the binary size
is essentially unchanged.
Depends on D113128
Differential Revision: https://reviews.llvm.org/D113331
This helper struct allows users of ComprehensiveBufferize to inject "post analysis" steps that are implemented after the analysis but before the bufferization.
Differential Revision: https://reviews.llvm.org/D113458
* Some long names were added and script decided to change whitespaces in a lot of places
* `ImageOperand` was renamed to `ImageOperands` in spec
* Some *NV enums were renamed to *KHR (spec actually maintains both variants with same value, but script pulled only *KHR versions)
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D113667
Support load with broadcast, elementwise divf op and remove the
hardcoded restriction on the vector size. Picking the right size should
be enfored by user and will fail conversion to llvm/spirv if it is not
supported.
Differential Revision: https://reviews.llvm.org/D113618
Fusing into a reduction is only valid if doing so does not erase information on a reduction dimensions size.
Differential Revision: https://reviews.llvm.org/D113500
* Move "linalg.inplaceable" attr name literals to BufferizableOpInterface.
* Use `memref.copy` by default. Override to `linalg.copy` in ComprehensiveBufferizePass.
These are the last remaining code dependencies on Linalg in Comprehensive Bufferize. The next commit will make ComprehensiveBufferize independent of the Linalg dialect.
Differential Revision: https://reviews.llvm.org/D113457
This revision adds an implementation of 2-D vector.transpose for 4x8 and 8x8 for
AVX2 and surfaces it to the Linalg level of control.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D113347
This decouples the printing/parsing from the "context" in which the parsing occurs.
This will allow to invoke these methods directly using an OpAsmParser/OpAsmPrinter.
Differential Revision: https://reviews.llvm.org/D113637
This patch fixes:
mlir/lib/Dialect/Linalg/ComprehensiveBufferize/ComprehensiveBufferize.cpp:301:20:
error: unused function 'printValueInfo' [-Werror,-Wunused-function]
Move helper functions for traversing reverse use-def chains. These are useful for implementing custom optimizations (e.g., custom InitTensorOp eliminations).
Also move over the AllocationCallbacks struct. This is in preparation for decoupling ComprehensiveBufferize from various dialects.
Differential Revision: https://reviews.llvm.org/D113386
mapped_iterator is a useful abstraction for applying a
map function over an existing iterator, but our current
usage ends up allocating storage/making indirect calls
even with the map function is a known function, which
is horribly inefficient. This commit refactors the usage
of mapped_iterator to avoid this, and allows for directly
referencing the map function when dereferencing.
Fixes PR52319
Differential Revision: https://reviews.llvm.org/D113511
This is a generalization of "do not copy buffers for a LinalgOp output if the output is not used".
Differential Revision: https://reviews.llvm.org/D113385
This is a generalization of "do not copy the result of an InitTensorOp". This commit is in preparation of decoupling `getResultBuffer` from the Linalg dialect.
Differential Revision: https://reviews.llvm.org/D113381
* Sprinkle `inline` on a few small and hot hashing/uniquing methods
* Use the faster DenseMapInfo hash functions instead of
llvm::hash_value.
This provides a speed up of a few percent in workloads with lots of
attributes.
Identifier and StringAttr essentially serve the same purpose, i.e. to hold a string value. Keeping these seemingly identical pieces of functionality separate has caused problems in certain situations:
* Identifier has nice accessors that StringAttr doesn't
* Identifier can't be used as an Attribute, meaning strings are often duplicated between Identifier/StringAttr (e.g. in PDL)
The only thing that Identifier has that StringAttr doesn't is support for caching a dialect that is referenced by the string (e.g. dialect.foo). This functionality is added to StringAttr, as this is useful for StringAttr in generally the same ways it was useful for Identifier.
Differential Revision: https://reviews.llvm.org/D113536
This make `getResultBuffer` in ComprehensiveBufferize independent of the SCF, Affine and Linalg dialects. This commit is in preparating of decoupling op interface implementations from ComprehensiveBufferize.
Differential Revision: https://reviews.llvm.org/D113380
The specific description is [[ https://llvm.discourse.group/t/adding-unsigned-integer-ceil-and-floor-in-std-dialect/4541 | Adding unsigned integer ceil in Std Dialect ]] .
When we lower ceilDivOp this will generate below code, sometimes we know m and n are unsigned intergal.Here are some redundant judgments about positive and negative.
So we need to add some unsigned operations to simplify the instructions.
```
ceilDiv(n, m)
x = (m > 0) ? -1 : 1
return (n*m>0) ? ((n+x) / m) + 1 : - (-n / m)
```
unsigned operations:
```
ceilDivU(n, m)
return n ==0 ? 0 : ((n - 1) / m) + 1
```
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D113363
BufferizationAliasInfo is used in BufferizationOpInterface::bufferize implementations, so it should be part of the same build target as BufferizableOpInterface.
This commit is in preparation of decoupling the ComprehensiveBufferize from the various dialects.
Differential Revision: https://reviews.llvm.org/D113378
* Store inplace bufferization decisions in `inplaceBufferized`.
* Remove `InPlaceSpec`. Use a bool instead.
* Use `BufferizableOpInterface::bufferizesToWritableMemory` and `bufferizesToWritableMemory` instead of `getInPlace(BlockArgument)`. The analysis does not care about inplacability of block arguments. It only cares whether the buffer can be written to or not.
* The `kInPlaceResultsAttrName` op attribute is for testing purposes only.
This commit further decouples BufferizationAliasInfo from other dialects such as SCF.
Differential Revision: https://reviews.llvm.org/D113375
After replacing then init_tensor with a new value, the new value must be inserted into the corresponding union/equivalence sets.
Differential Revision: https://reviews.llvm.org/D113374
Simply emit traits, interfaces & effects (with some minimal formatting) to the
generated docs to make this information easier to find in the docs.
Differential Revision: https://reviews.llvm.org/D113539
This patch fixes:
mlir/lib/Dialect/Linalg/ComprehensiveBufferize/ComprehensiveBufferize.cpp:2240:13:
error: unused function 'checkAliasInfoConsistency'
[-Werror,-Wunused-function]
New TOSA pad operation can support explicitly specifying the pad value. Added
lowering to linalg that uses the explicit value.
Differential Revision: https://reviews.llvm.org/D113515
When doing topological sort we need to make sure an op is scheduled before any
of the ops within its regions.
Also change the algorithm to not be recursive in order to prevent potential
stack overflow.
Differential Revision: https://reviews.llvm.org/D113423
Use existing helper instead of handling only a subset of indices lowering
arithmetic. Also relax the restriction on the memref rank for the GPU mma ops
as we can now support any rank.
Differential Revision: https://reviews.llvm.org/D113383
Given that LLVM dialect types may now optionally contain types from other
dialects, which itself is motivated by dialect interoperability and progressive
lowering, the conversion should no longer assume that the outermost LLVM
dialect type can be left as is. Instead, it should inspect the types it
contains and attempt to convert them to the LLVM dialect. Introduce this
capability for LLVM array, pointer and structure types. Only literal structures
are currently supported as handling identified structures requires the
converison infrastructure to have a mechanism for avoiding infite recursion in
case of recursive types.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D112550
Remove the getSmallestBoundingIndex method that has no uses anymore.
Depends On D113548
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113549
Replace the getSmallestBoundingIndex method used in promotion by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.
Depends On D113546
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113548
Use the custom upper bound computation in hoisting by the new getUpperBoundForIndex method.
Depends On D113546
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113547
Use CodegenStrategy instead of a separate test pass to test iterator interchange.
Depends On D113409
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113550
Remove padding test pass that was replaced by CodegenStrategy.
Depends On D113411
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113412
Replace the getSmallestBoundingIndex method used in padding by getConstantUpperBoundForIndex that uses flat affine constraints to compute a constant upper bound.
Depends On D113398
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113546
Use CodegenStrategy instead of a separate test pass to test hoisting.
Depends On D113410
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113411
Use CodegenStrategy instead of a separate test pass to test padding.
Depends On D113409
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113410
Use AffineApplyOp instead of SubIOp to compute the padding width when creating a pad tensor operation.
Depends On D113382
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113404
Add a flag to control if CodegenStrategy runs the EnablePass between the transformations.
Depends On D113382
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113409
Remove unused members and store the indexing and packing loops in SmallVector.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113398
Remove the padding options from the tiling options since padding is now implemented by a separate pattern/pass introduced in https://reviews.llvm.org/D112412.
The revsion remove the tile-and-pad-tensors.mlir and replaces it with the pad.mlir that tests padding in isolation (without tiling). Similarly, hoist-padding.mlir is replaced by pad-and-hoist.mlir introduced in https://reviews.llvm.org/D112713.
Depends On D112838
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113382
This is useful for ops such as scf::IfOp, which always bufferize in-place.
This commit is in preparation of decoupling BufferizationAliasInfo from the SCF dialect.
Differential Revision: https://reviews.llvm.org/D113339
The existing PostOrder traversal with special rules for certain ops was complicated and had a bug. Switch to PreOrder traversal.
Differential Revision: https://reviews.llvm.org/D113338
Remove some of the special rules for scf::IfOp (not all of them) and encode them in the op interface. This is in preparation of decoupling analysis, bufferization and dialects.
Differential Revision: https://reviews.llvm.org/D112901
A tensor.insert_slice write does not conflict with a subsequent read of the source if the source is originating from a matching tensor.extract_slice.
Differential Revision: https://reviews.llvm.org/D113446
This uses the buffer-deallocation pass or, in case the test case does not use bufferization, has added explicit deallocs.
Differential Revision: https://reviews.llvm.org/D111059
Enables using the same iterator interface to these even though underlying storage is different.
Differential Revision: https://reviews.llvm.org/D113512
This breaking change requires to remove printing the mnemonic in the print()
method on Type/Attribute classes.
This makes it consistent with the parsing code which alread handles the
mnemonic outside of the parsing method.
This likely won't break the build for anyone, but tests will start
failing for dialects downstream. The fix is trivial and look like
going from:
void emitc::OpaqueType::print(DialectAsmPrinter &printer) const {
printer << "opaque<\"";
to:
void emitc::OpaqueAttr::print(DialectAsmPrinter &printer) const {
printer << "<\"";
Reviewed By: rriddle, aartbik
Differential Revision: https://reviews.llvm.org/D113334
Add a new `useDefaultTypePrinterParser` boolean settings on the dialect
(default to false for now) that emits the boilerplate to dispatch type
parsing/printing to the auto-generated method.
We will likely turn this on by default in the future.
Differential Revision: https://reviews.llvm.org/D113332
Add a new `useDefaultAttributePrinterParser` boolean settings on the dialect
(default to false for now) that emits the boilerplate to dispatch attribute
parsing/printing to the auto-generated method.
We will likely turn this on by default in the future.
Differential Revision: https://reviews.llvm.org/D113329
In preparation for implementation subrange lookup on attributes.
Depends on D113039
Reviewed By: jpienaar, Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D113128
This patch factors out division representation computation from upper-lower bound
inequalities to a separate function. This is done to improve readability and reuse.
This patch is marked NFC since the only change is factoring out existing code
to a separate function.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D113463
This predates the templated variant, and has been simply forwarding
to getSplatValue<Attribute> for some time. Removing this makes the
API a bit more uniform, and also helps prevent users from thinking
it is "cheap".
There are several aspects of the API that either aren't easy to use, or are
deceptively easy to do the wrong thing. The main change of this commit
is to remove all of the `getValue<T>`/`getFlatValue<T>` from ElementsAttr
and instead provide operator[] methods on the ranges returned by
`getValues<T>`. This provides a much more convenient API for the value
ranges. It also removes the easy-to-be-inefficient nature of
getValue/getFlatValue, which under the hood would construct a new range for
the type `T`. Constructing a range is not necessarily cheap in all cases, and
could lead to very poor performance if used within a loop; i.e. if you were to
naively write something like:
```
DenseElementsAttr attr = ...;
for (int i = 0; i < size; ++i) {
// We are internally rebuilding the APFloat value range on each iteration!!
APFloat it = attr.getFlatValue<APFloat>(i);
}
```
Differential Revision: https://reviews.llvm.org/D113229
Add a new directive `either` to specify the operands can be matched in either order
Reviewed By: jpienaar, Mogball
Differential Revision: https://reviews.llvm.org/D110666
Generate static function for matching the type/attribute to reduce the
memory footprint.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D110199
Add pad_const field to tosa.pad.
Add builders to enable optional construction of pad_const in pad op.
Update documentation of tosa.clamp to match spec wording.
Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com>
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D113322
Declarative attribute and type formats with assembly formats. Define an
`assemblyFormat` field in attribute and type defs with a `mnemonic` to
generate a parser and printer.
```tablegen
def MyAttr : AttrDef<MyDialect, "MyAttr"> {
let parameters = (ins "int64_t":$count, "AffineMap":$map);
let mnemonic = "my_attr";
let assemblyFormat = "`<` $count `,` $map `>`";
}
```
Use `struct` to define a comma-separated list of key-value pairs:
```tablegen
def MyType : TypeDef<MyDialect, "MyType"> {
let parameters = (ins "int":$one, "int":$two, "int":$three);
let mnemonic = "my_attr";
let assemblyFormat = "`<` $three `:` struct($one, $two) `>`";
}
```
Use `struct(*)` to capture all parameters.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D111594
This avoids crashes when the pattern is applied to ops with
tensor semantics.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D113415
[NFC] As part of using inclusive language within the llvm project,
this patch replaces master with main when referring to `.chm` files.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D113299
Adapt the Fourier Motzkin elimination to take into account affine computations happening outside of the cloned loop nest.
Depends On D112713
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112838
The revision updates the packing loop search in hoist padding. Instead of considering all loops in the backward slice, we now compute a separate backward slice containing the index computations only. This modification ensures we do not add packing loops that are not used to index the packed buffer due to spurious dependencies. One instance where such spurious dependencies can appear is the extract slice operation introduced between the tile loops of a double tiling.
Depends On D112412
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112713
Added omp.sections and omp.section operation according to the
section 2.8.1 of OpenMP Standard 5.0.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D110844
This patch fixes:
mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorConversion.cpp:124:3:
error: default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
by removing the default case.
Previously we didn't materialize conversions for arguments in certain
cases as the implicit type propagation was being heavily relied on
by many patterns. Now that those patterns have been fixed to
properly handle type conversions, we can drop the special behavior.
Differential Revision: https://reviews.llvm.org/D113233
[NFC] This patch fixes URLs containing "master". Old URLs were either broken or
redirecting to the new URL.
Reviewed By: #libc, ldionne, mehdi_amini
Differential Revision: https://reviews.llvm.org/D113186
The ODS-based Python op bindings generator has been generating incorrect
specification of the operand segment in presence if both optional and variadic
operand groups: optional groups were treated as variadic whereas they require
separate treatement. Make sure it is the case. Also harden the tests around
generated op constructors as they could hitherto accept the code for both
optional and variadic arguments.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113259
This commit separates the bufferization from the bufferization pass in Linalg. This allows other dialects to use ComprehensiveBufferize more easily.
This commit mainly moves files to a new directory and adds a new build target.
Differential Revision: https://reviews.llvm.org/D112989
AllocationCallbacks functions allocate/deallocate only. They no longer set the insertion point.
This is in preparation of decoupling ComprehensiveBufferize from the Linalg dialect.
Differential Revision: https://reviews.llvm.org/D112991
Move dialect-specific and analysis-specific function out of BufferizationAliasInfo. BufferizationAliasInfo's only job now is to keep track of aliases.
This is in preparation of futher decoupling ComprehensiveBufferize from various dialects.
Differential Revision: https://reviews.llvm.org/D112992
By default, OpResult buffers are writable. But there are ops (e.g., ConstantOp) for which this is not the case.
The purpose of this commit is to further decouple Comprehensive Bufferize from the Standard dialect.
Differential Revision: https://reviews.llvm.org/D112908
This in preparation of decoupling BufferizableOpInterface, Comprehensive Bufferize and dialects.
The goal of this CL is to make `getResultBuffer` (and other `bufferize` functions) independent of `LinalgOps`.
Differential Revision: https://reviews.llvm.org/D112907
These two methods are redundant and removed:
* `bufferizesToAliasOnly`: If not `bufferizesToMemoryRead` and not `bufferizesToMemoryWrite` but `getAliasingOpResult` returns a non-null value, we know that this OpOperand is alias-only. This method now has a default implementation and does not have to be implemented.
* `getInplaceableOpResult`: The analysis does not differentiate between "inplaceable" and "aliasing". The only thing that matters is whether or not OpOperand and OpResult are aliasing. That is the key property that makes buffer copies necessary.
Differential Revision: https://reviews.llvm.org/D112902
The earlier reduction "scalarization" was only applied to a chain of
*innermost* and *for* loops. This revision generalizes this to any
nesting of for- and while-loops. This implies that reductions can be
implemented with a lot less load and store operations. The chaining
is implemented with a forest of yield statements (but not as bad as
when we would also include the while-induction).
Fixes https://bugs.llvm.org/show_bug.cgi?id=52311
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D113078
- String binary search does 1 less string comparison
- Identifier linear scan on large attribute list is switched to string binary search
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D112970
OpAdaptor::verify performs string lookups on an attribute dictionary. By
calling OpAdaptor::verify, Op::verify is not able to use cached attribute
identifiers for faster lookups.
Reviewed By: jpienaar, rriddle
Differential Revision: https://reviews.llvm.org/D113039
This allows for external users of Comprehensive Bufferize to specify their own InitTensorOp elimination procedures.
Differential Revision: https://reviews.llvm.org/D112686
The main benefits of this change are faster access to operands
(no need to compute the offset, as it is now right after the
operation), simpler code(no need to manage a lot of the "is the
operand storage trailing" logic we had to before). The major
downside to this though, is that operand holding operations now
grow in size by 1 word (as no matter how we do this change, there
will need to be some additional book keeping).
Differential Revision: https://reviews.llvm.org/D111695
A quick grep for NDEBUG in MLIR revealed a use in DebugActions.h that breaks ABI. This patch changes the use of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS which has the advantage of being independent of whether clients build their own app in debug or release as it is purely dependant on how MLIR itself was built.
Differential Revision: https://reviews.llvm.org/D113088
- Provide the operator overloads for constructing (semi-)affine expressions in
Python by combining existing expressions with constants.
- Make AffineExpr, AffineMap and IntegerSet hashable in Python.
- Expose the AffineExpr composition functionality.
Reviewed By: gysit, aoyal
Differential Revision: https://reviews.llvm.org/D113010
This better decouples transfer read/write from vector-only rewrite of conv.
This form is close to ready to plop into a new vector.conv op and the vector.transfer operations to be generalized as part of generic vectorization once the properties ConvolutionOpInterface are inferred from the indexing maps.
This also results in a nice perf boost in the dw == 1 cases.
Differential revision: https://reviews.llvm.org/D112822
This refactoring prepares conv1d vectorization for a future integration into
the generic codegen path.
Once transfer_read / transfer_write vectorization also supports sliding windows,
the special pattern for conv can disappear.
This will also likely need a vector.conv operation.
Differential Revision: https://reviews.llvm.org/D112797
The current setup of LinalgTransformationFilter allows a
transformation to trigger when either
1) The StringAttr is not set and no filter identifier is specified.
2) The StringAttr is set and its value matches (one of) the provided
identifier.
This misses the case where the transformation should trigger either
when the attribute is not set or its value matches (one of) the
provided identifier. Since `Identifier` does not allow empty strings,
add a boolean option to match when the attribute is not set. This
option is by default off.
Differential Revision: https://reviews.llvm.org/D113057
The 2-D case can be rewritten to generate quite fewer instructions and a single vector.shuffle which seems to provide a nice performance boost.
Add this arrow to our quiver by exposing it with a new vector transform option.
Differential Revision: https://reviews.llvm.org/D113062
We'd like to take a progressive approach towards Fconvolution op
CodeGen, by 1) tiling it to fit compute hierarchy first, and then
2) tiling along window dimensions with size 1 to reduce the problem
to be matmul-like. After that, we can 3) downscale high-D convolution
ops to low-D by removing the size-1 window dimensions. The final
step would be 4) vectorizing the low-D convolution op directly.
We have patterns for 1), 2), and 4). This commit adds a pattern for
3) for `linalg.conv_2d_nhwc_hwcf` ops as a starter. Supporting other
high-D convolution ops should be similar and mechanical.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112928
Symbol tables are a largely useful top-level IR construct, for example, they
make it easy to access functions in a module by name instead of traversing the
list of module's operations to find the corresponding function.
Depends On D112886
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112821
Inserting a symbol into a SymbolTable may lead to the name of the symbol being
changed in order to ensure uniqueness of symbol names in the table. Return this
new name to spare the caller the need to extract it from the symbol operation.
Depends On D112700
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112886
This commit moves parts of the existing bufferization code into external op interface implementations. Furthermore, Comprehensive Bufferize is adapted to use the new interface.
Future commits will decouple the interface and its op implementations from Comprehensive Bufferize and the Linalg dialect, as well as split them into multiple files with their own build targets. This commit leaves the file structure and build rules mostly unchanged.
Differential Revision: https://reviews.llvm.org/D112900
This commit adds a new op interface: BufferizableOpInterface. In the future, ops that implement this interface can be bufferized using Comprehensive Bufferize.
Note: The interface methods of this interface correspond to the "op interface" in ComprehensiveBufferize.cpp.
Differential Revision: https://reviews.llvm.org/D112974
In order to support fusion with mma matrix type we need to be able to
execute elementwise operations on them. This add an op to be able to
support some basic elementwise operations. This is a is not a full
solution as it only supports a limited scope or operations. Ideally we would
want to be able to fuse with more kind of operations.
Differential Revision: https://reviews.llvm.org/D112857
wmma intrinsics have a large number of combinations, ideally we want to be able
to target all the different variants. To avoid a combinatorial explosion in the
number of mlir op we use attributes to represent the different variation of
load/store/mma ops. We also can generate with tablegen helpers to know which
combinations are available. Using this we can avoid having too hardcode a path
for specific shapes and can support more types.
This patch also adds boiler plates for tf32 op support.
Differential Revision: https://reviews.llvm.org/D112689
Add the shufflevector conversion. It only handles the static, i.e., IntegerAttr, index.
Co-authored: Xinyi Liu <xyliuhelen@gmail.com>
Reviewed by: antiagainst
Differential revision: https://reviews.llvm.org/D112161
This makes the class usable with types that do not provide their own operator<.
Update MLIR Linalg ComprehensiveBufferize to take advantage of the new template param.
Differential Revision: https://reviews.llvm.org/D112052
Provide support for removing an operation from the block that contains it and
moving it back to detached state. This allows for the operation to be moved to
a different block, a common IR manipulation for, e.g., module merging.
Also fix a potential one-past-end iterator dereference in Operation::moveAfter
discovered in the process.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112700
The list of operations that do neither read nor write, but create an alias when bufferizing inplace, is getting longer. This commit adds a helper function so that we do not have to spell out the entire list each time.
Differential Revision: https://reviews.llvm.org/D112515
This patch reorders mergeLocalIds usage to merge locals only after number of
dimensions and symbols are same. This does not change any functionality
because it does not matter in what order identifiers are merged, since
the reason to do it is to ensure that two FACs are aligned.
The order ensured in this patch simplifies a subsequent patch to improve
mergeLocalIds which requires dimensions and symbols to be aligned.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D112841
Added a type with different pointer/index bit width. Also
added some sanity CHECKs on the stored indices.
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D112778
When operand is a subview we don't infer in_bounds and some default cases (e.g case in the tests) will crash with `operand is NULL` when converting to LLVM
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D112772
Add a strategy pass that pads and hoists after tiling and fusion.
Depends On D112412
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112480
Adding a padding and hoisting pattern, a test pass, and tests. The patch prepares the split of tiling/fusion and padding.
Depends On D112255
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112412
Using [1] for representing shape of a scalar is incorrect, and will break with vectors of size 1.
- remove redundant helper functions
- fix couple of style warnings
Reviewed By: cota
Differential Revision: https://reviews.llvm.org/D112764
InsertionGuards move constructor is currently the compiler synthesized implementation which is very bug prone. A move constructed InsertionGuard will get the same builder and insertion point as the one it is constructed from, leading to insertion point being restored twice. This can even happen in non obvious situations on some compilers, such as when returning a move constructible struct from a function.
This patch fixes the issue by properly implementing the move constructor. An InsertionGuard that was used to move construct another InsertionGuard is simply inactive and will not restore the insertion point.
I chose to explicitly delete the move assign operator as its semantics are not clear cut. If one were to strictly follow the rule of 5, you'd have to restore the insertion point before then taking ownership of the others guards fields. I believe that to be rather confusing and/or surprising however. One may still get such semantics using llvm::Optional or std::optional and the emplace method if really needed.
Differential Revision: https://reviews.llvm.org/D112749
Adapt hoistPaddingOnTensors to leave replacing and erasing the old pad tensor operation to the caller. This change makes the function pattern friendly.
Depends On D112003
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112255
Adapt the rewriteAsPaddedOp method to use the OpBuilder instead of the PatterRewriter.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D112003
This patch extends the SubElementAttr interface to allow replacing a contained sub attribute. The attribute that should be replaced is identified by an index which denotes the n-th element returned by the accompanying walkImmediateSubElements method.
Using this addition the patch implements replacing SymbolRefAttrs contained within any dialect attributes.
Differential Revision: https://reviews.llvm.org/D111357
This patch fixes:
mlir/lib/IR/BuiltinAttributes.cpp:876:39: error: unused function
'isComplexOfIntType' [-Werror,-Wunused-function]
in a release build.
Rationale:
The silent exit(1) gives little clues on where the error occurs on failure
and may even be confusing at first. The CHECK testing of all computed values
and indices may be a little bit more elaborate, but it directly pinpoints
where errors happen if they occur. This style is also consistent with
the other tests, which I actually prefer.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D112688
* Move SmallVectors outside of inner loops to avoid frequent
allocations and deallocations
* Calculate linearized index and call flat range getters to
avoid internal shape querying behind `getValue`.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D112099
Add llvm.mlir.global_ctors and global_dtors ops and their translation
support to LLVM global_ctors/global_dtors global variables.
Differential Revision: https://reviews.llvm.org/D112524
This patch adds the inclusive clause (which was missed in previous
reorganization - https://reviews.llvm.org/D110903) in omp.wsloop operation.
Added a test for validating it.
Also fixes the order clause, which was not accepting any values. It now accepts
"concurrent" as a value, as specified in the standard.
Reviewed By: kiranchandramohan, peixin, clementval
Differential Revision: https://reviews.llvm.org/D112198
Allow lowering of wmma ops with 64bits indexes. Change the default
version of the test to use default layout.
Differential Revision: https://reviews.llvm.org/D112479
Analyze ops in a pseudo-random order to see if any assertions are triggered. Randomizing the order of analysis likely worsens the quality of the bufferization result (more out-of-place bufferizations). However, assertions should never fail, as that would indicate a problem with our implementation.
Differential Revision: https://reviews.llvm.org/D112581
This patch supports the atomic construct (read and write) following
section 2.17.7 of OpenMP 5.0 standard. Also added tests and
verifier for the same.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D111992
This also fixes the vector.shuffle C++ builder which had an incorrect type assumption that triggers with this new rewrite.
The vector.shuffle semantics were correct though.
Differential revision: https://reviews.llvm.org/D112578
This patch fixes:
mlir/lib/Transforms/Utils/DialectConversion.cpp:2775:5: error:
default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
by removing the default case. This way, the compiler should issue a
warning in the future when somebody adds a new enum value without a
corresponding case in the switch statement.
The current implementation invokes materializations
whenever an input operand does not have a mapping for the
desired type, i.e. it requires materialization at the earliest possible
point. This conflicts with goal of dialect conversion (and also the
current documentation) which states that a materialization is only
required if the materialization is supposed to persist after the
conversion process has finished.
This revision refactors this such that whenever a target
materialization "might" be necessary, we insert an
unrealized_conversion_cast to act as a temporary materialization.
This allows for deferring the invocation of the user
materialization hooks until the end of the conversion process,
where we actually have a better sense if it's actually
necessary. This has several benefits:
* In some cases a target materialization hook is no longer
necessary
When performing a full conversion, there are some situations
where a temporary materialization is necessary. Moving forward,
these users won't need to provide any target materializations,
as the temporary materializations do not require the user to
provide materialization hooks.
* getRemappedValue can now handle values that haven't been
converted yet
Before this commit, it wasn't well supported to get the remapped
value of a value that hadn't been converted yet (making it
difficult/impossible to convert multiple operations in many
situations). This commit updates getRemappedValue to properly
handle this case by inserting temporary materializations when
necessary.
Another code-health related benefit is that with this change we
can move a majority of the complexity related to materializations
to the end of the conversion process, instead of handling adhoc
while conversion is happening.
Differential Revision: https://reviews.llvm.org/D111620
Dyn-cast should be checked and bailed out if the dyn_cast failed.
Reviewed By: sjarus, NatashaKnk
Differential Revision: https://reviews.llvm.org/D112574
Rationale:
The currently used trait was demanding that all types are the same
which is not true (since the sparse part may change and the dim sizes
may be relaxed). This revision uses the correct trait and makes the
rank match test explicit in the verify method.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D112576
This refactoring adds a few "event" functions (start/end loop-seq/loop) for
readability of the core function of codegen. This also prepares sparse tensor
output codegen, where these "event" functions will provide convenient
placeholders to start or stop insertion bookkeeping.
This revision also includes a few various minor changes that kept on
pending in my local workspace.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D112506