- attribute-dict production is redundant with dictionary-attribute
- definitions of attribute aliases were part of the same production as
uses of attribute aliases
- `std.dim` now accepts the dimension number as an operand, so the
example is out of date. Use the predicate of std.cmpi as a better
example.
Differential Revision: https://reviews.llvm.org/D96076
This reverts commit 953086ddbb because
it breaks GCC 5 build:
error: could not convert '(const char*)""' from 'const char*' to 'llvm::StringLiteral'
static ::llvm::StringLiteral getDialectNamespace() { return ""; }
Use `StringLiteral` for function return type if it is known to return
constant string literals only.
This will make it visible to API users, that such values can be safely
stored, since they refers to constant data, which will never be deallocated.
`StringRef` is general is not safe to store for a long term,
since it might refer to temporal data allocated in heap.
Reviewed By: mehdi_amini, bkramer
Differential Revision: https://reviews.llvm.org/D95945
To allow it usage for Operation classes defined outside of `mlir` namespace.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D95952
* Introduce separate `RankedTensorOf` class. Use it as base class for `AnyRankedTensor`.
* Add C++ class specification (`::mlir::MemRefType`) to `MemRefRankOf` and `StaticShapeMemRefOf`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95936
This revision takes advantage of recent extensions to vectorization to refactor contraction detection into a bona fide Linalg interface.
The mlit-linalg-ods-gen parser is extended to support adding such interfaces.
The detection that was originally enabling vectorization is refactored to serve as both a test on a generic LinalgOp as well as to verify ops that declare to conform to that interface.
This is plugged through Linalg transforms and strategies but it quickly becomes evident that the complexity and rigidity of the C++ class based templating does not pay for itself.
Therefore, this revision changes the API for vectorization patterns to get rid of templates as much as possible.
Variadic templates are relegated to the internals of LinalgTransformationFilter as much as possible and away from the user-facing APIs.
It is expected other patterns / transformations will follow the same path and drop as much C++ templating as possible from the class definition.
Differential revision: https://reviews.llvm.org/D95973
Historically, the Vector to LLVM dialect conversion subsumed the Standard to
LLVM dialect conversion patterns. This was necessary because the conversion
infrastructure did not have sufficient support for reconciling type
conversions. This support is now available. Only keep the patterns related to
the Vector dialect in the Vector to LLVM conversion and require type casts
operations to be inserted if necessary. These casts will be removed by
following conversions if possible. Update integration tests to also run the
Standard to LLVM conversion.
There is a significant amount of test churn, which is due to (a) unnecessarily
strict tests in VectorToLLVM and (b) many patterns actually targeting Standard
dialect ops instead of LLVM dialect ops leading to tests actually exercising a
Vector->Standard->LLVM conversion. This churn is a good illustration of the
reason to make the conversion partial: now the tests only check the code in the
Vector to LLVM conversion and will not be randomly broken by changes in
Standard to LLVM conversion.
Arguably, it may be possible to extract Vector to Standard patterns into a
separate pass, but given the ongoing splitting of the Standard dialect, such
pass will be short-lived and will require further refactoring.
Depends On D95626
Reviewed By: nicolasvasilache, aartbik
Differential Revision: https://reviews.llvm.org/D95685
In dialect conversion infrastructure, source materialization applies as part of
the finalization procedure to results of the newly produced operations that
replace previously existing values with values having a different type.
However, such operations may be created to replace operations created in other
patterns. At this point, it is possible that the results of the _original_
operation are still in use and have mismatching types, but the results of the
_intermediate_ operation that performed the type change are not in use leading
to the absence of source materialization. For example,
%0 = dialect.produce : !dialect.A
dialect.use %0 : !dialect.A
can be replaced with
%0 = dialect.other : !dialect.A
%1 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %1 : !dialect.A
and then with
%0 = dialect.final : !dialect.B
%1 = dialect.other : !dialect.A // replaced, scheduled for removal
%2 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %2 : !dialect.A
in the same rewriting, but only the %1->%0 replacement is currently considered.
Change the logic in dialect conversion to look up all values that were replaced
by the given value and performing source materialization if any of those values
is still in use with mismatching types. This is performed by computing the
inverse value replacement mapping. This arguably expensive manipulation is
performed only if there were some type-changing replacements. An alternative
could be to consider all replaced operations and not only those that resulted
in type changes, but it would harm pattern-level composability: the pattern
that performed the non-type-changing replacement would have to be made aware of
the type converter in order to call the materialization hook.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95626
This revision defines a Linalg contraction in general terms:
1. Has 2 input and 1 output shapes.
2. Has at least one reduction dimension.
3. Has only projected permutation indexing maps.
4. its body computes `u5(u1(c) + u2(u3(a) * u4(b)))` on some field
(AddOpType, MulOpType), where u1, u2, u3, u4 and u5 represent scalar unary
operations that may change the type (e.g. for mixed-precision).
As a consequence, when vectorization of such an op occurs, the only special
behavior is that the (unique) MulOpType is vectorized into a
`vector.contract`. All other ops are handled in a generic fashion.
In the future, we may wish to allow more input arguments and elementwise and
constant operations that do not involve the reduction dimension(s).
A test is added to demonstrate the proper vectorization of matmul_i8_i8_i32.
Differential revision: https://reviews.llvm.org/D95939
This separation improves the layering and paves the way for more interfaces coming up in the future.
Differential revision: https://reviews.llvm.org/D95941
We could extend this with an interface to allow dialect to perform a type
conversion, but that would make the folder creating operation which isn't
the case at the moment, and isn't necessarily always desirable.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95991
The AsyncRuntime declares prototypes for extern "C" functions inside a
namespace in the header, but not inside that namespace in the
definition. This causes Visual Studio to treat them as different
entities and thus the dllexport is ignored for the definitions.
Using the same namespace fixes this issue.
Secondly, this commit moves the dllexport to be consistent with the
JITs expectation.
This is an update to https://reviews.llvm.org/D95386 that fixes the
compile issues in old versions of Visual studio.
Differential Revision: https://reviews.llvm.org/D95933
We should be check whether lb + step >= ub to determine
whether this is a single iteration. Previously we were
checking lb + lb >= ub.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95440
Add the conversion pattern for vector.bitcast to lower it to
the LLVM Dialect.
Reviewed By: ThomasRaoux, aartbik
Differential Revision: https://reviews.llvm.org/D95579
This revision adds two new classes, RewriterBase and IRRewriter. RewriterBase is a new shared base class between IRRewriter and PatternRewriter. PatternRewriter will continue to be the base class used to perform rewrites within a rewrite pattern. IRRewriter on the other hand, is a new class that allows for tracking IR rewrites from outside of a rewrite pattern. In this revision all of the old API from PatternRewriter is moved to RewriterBase, but the distinction between IRRewriter and PatternRewriter is kept on the chance that a necessary API divergence happens in the future.
Currently if you want to have some utility that transforms a piece of IR and share it between pattern and non-pattern code, you have to duplicate it. This revision enables the creation of utilities that can be invoked from rewrite patterns and normal transformation code:
```c++
void someSharedUtility(RewriterBase &rewriter, ...) {
// Some interesting IR mutation here.
}
// Some RewritePattern
LogicalResult MyPattern::matchAndRewrite(Operation *op, PatternRewriter &rewriter) {
...
someSharedUtility(rewriter, ...);
...
}
// Some Pass
void MyPass::runOnOperation() {
...
IRRewriter rewriter(...);
someSharedUtility(rewriter, ...);
}
```
Differential Revision: https://reviews.llvm.org/D94638
The MLIR Async runtime uses different namespacing for the header file,
and the definitions of its C API. The header file places the extern "C"
functions inside namespace mlir::runtime, and the definitions are not
in a namespace. This causes issues in cl.exe. It treats the declaration
and definition as different, and thus does not apply dllexport to the
definition, which leads to the mlir_async_runtime.dll containing no
definitions, and the mlir_async_runtime.lib not being generated.
This patch moves the namespace to cover the definitions, and thus
generates the dll correctly on Windows with cl.exe.
This was tested with Visual Studio C++ 19.28.29336.
Differential Revision: https://reviews.llvm.org/D95386
Add the necessary bits to CMakeLists to make it possible to configure
MLIR against installed LLVM, and build it with minimal need for LLVM
source tree. The latter is only necessary to run unittests, and if it
is missing then unittests are skipped with a warning.
This change includes the necessary changes to tests, in particular
adding some missing substitutions and defining missing variables
for lit.site.cfg.py substitution.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D85464
Co-authored-by: Isuru Fernando <isuruf@gmail.com>
The __resume function trips up LLVM's 'X86 DAG->DAG Instruction Selection' unless optimizations are disabled.
Only adding the __resume function when it's needed allows lowering through AsyncToLLVM and LLVM without '-O0' as long as the coroutine functionality is not used.
Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D95868
It will allow to perform additional manipulation with the newly created Operation.
For example, custom attributes propagation/changes.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95525
This makes the generated code independent from actual namespace of its users.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95520
We should be check whether lb + step >= ub to determine
whether this is a single iteration. Previously we were
checking lb + lb >= ub.
Differential Revision: https://reviews.llvm.org/D95440
Fix a bug that was introduced where calling the codegen strategy with actual concrete C++ Op types did not trigger the expected behavior.
Also introduce a test for the behavior that was missing.
Differential Revision: https://reviews.llvm.org/D95863
This revision unifies Linalg vectorization and paves the way for vectorization of Linalg ops with mixed-precision operations.
The new algorithm traverses the ops in the linalg block in order and avoids recursion.
It uses a BlockAndValueMapping to keep track of vectorized operations.
The revision makes the following modifications but is otherwise NFC:
1. vector.transfer_read are created eagerly and may appear in a different order than the original order.
2. a more progressive vectorization to vector.contract results in only the multiply operation being converted to `vector.contract %a, %b, %zero`, where `%zero` is a
constant of the proper type. Later vector canonicalizations are assumed to rewrite vector.contract %a, %b, %zero + add to a proper accumulate form.
Differential revision: https://reviews.llvm.org/D95797
In dialect conversion, signature conversions essentially perform block argument
replacement and are added to the general value remapping. However, the replaced
values were not tracked, so if a signature conversion was rolled back, the
construction of operand lists for the following patterns could have obtained
block arguments from the mapping and give them to the pattern leading to
use-after-free. Keep track of signature conversions similarly to normal block
argument replacement, and erase such replacements from the general mapping when
the conversion is rolled back.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95688
[s|z]exti ops do not have the same operand and result type.
As a consequence, the lowering of the n-D vector form needs to be relaxed a bit.
This revision additionally performs a few NFC renamings of variables to make them more intuitive.
Differential Revision: https://reviews.llvm.org/D95760
Comitted log, exp, maximum, minimum, comparison, ceil and floor conversions from TOSA to LinAlg. Support for signless integer and floating point.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D95839
Add printer and parser hooks for a custom directive that allows
parsing and printing of idioms that can represent a list of values
each of which is either an integer or an SSA value. For example in
`subview %source[%offset_0, 1] [4, %size_1] [%stride_0, 3]`
each of the list (which represents offset, size and strides) is a mix
of either statically know integer values or dynamically computed SSA
values. Since this is used in many places adding a custom directive to
parse/print this idiom allows using assembly format on operations
which use this idiom.
Differential Revision: https://reviews.llvm.org/D95773
Support OpImageType in SPIRV Dialect.
This change doesn't support operand AccessQualifier since
it is optinal and only enables under Kernel capability.
co-authored-by: Alan Liu <alanliu.yf@gmail.com>
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D95580
Support OpImageType in SPIRV Dialect.
This change doesn't support operand AccessQualifier since
it is optinal and only enables under Kernel capability.
co-authored-by: Alan Liu <alanliu.yf@gmail.com>
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D95580
This is the last revision to migrate using SimplePadOp to PadTensorOp, and the
SimplePadOp is removed in the patch. Update a bit in SliceAnalysis because the
PadTensorOp takes a region different from SimplePadOp. This is not covered by
LinalgOp because it is not a structured op.
Also, remove a duplicated comment from cpp file, which is already described in a
header file. And update the pseudo-mlir in the comment.
This is as same as D95615 but fixing one dep in CMakeLists.txt
Different from D95671, the fix was applied to run target.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D95785
This reverts commit d9b953d84b.
This commit resulted in build bot failures and the author is away from a
computer, so I am reverting on their behalf until they have a chance to
look into this.
This is the last revision to migrate using SimplePadOp to PadTensorOp, and the
SimplePadOp is removed in the patch. Update a bit in SliceAnalysis because the
PadTensorOp takes a region different from SimplePadOp. This is not covered by
LinalgOp because it is not a structured op.
Also, remove a duplicated comment from cpp file, which is already described in a
header file. And update the pseudo-mlir in the comment.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95671
* Fixing missing `type` keyword in alias print
* Add test for large tuple type alias & rerun output to verify printed
form can be parsed (which caught the above).
The result values of vp2intersect are vectors of bits, i.e.,
vector<8xi1> or vector<16xi8> (instead of i8 or i16).
Differential Revision: https://reviews.llvm.org/D95678
Tuples can occupy quite a lot of space, instead of printing out tuple type
everywhere, just use the type alias if larger (arbitrarily chose a bound for
now).
Differential Revision: https://reviews.llvm.org/D95707
Update ElementsAttr::isValidIndex to handle ElementsAttr with a scalar. Scalar will have rank 0.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95663
Currently, for a scf.parallel (i,j,k) after the loop collapsing to 1D is done, the
IVs would be traversed as for an scf.parallel(k,j,i).
Differential Revision: https://reviews.llvm.org/D95693
Previously, CMake would find any version of Python3. However, the project
claims to require 3.6 or greater, and 3.6 features are being used.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D95635
The library is not actually static when BUILD_SHARED_LIBS is on, and tests need to explicitly load it already. Also, the shared objects it was linked to did not use any symbols from it and it was therefore never linked to it.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95612
This segfault could occur from out of bounds accesses when simplifying
tensor.extract with a constant index and a tensor created by
tensor.from_elements.
This IR is not necesarilly invalid as it might conditionally be
never executed.
Differential Revision: https://reviews.llvm.org/D95535
This class is looking up a dialect prefix on the identifier on initialization
and keeping a pointer to the Dialect when found.
The NamedAttribute key is now a DialectIdentifier.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D95418
This is the last revision to migrate using SimplePadOp to PadTensorOp, and the
SimplePadOp is removed in the patch. Update a bit in SliceAnalysis because the
PadTensorOp takes a region different from SimplePadOp. This is not covered by
LinalgOp because it is not a structured op.
Also, remove a duplicated comment from cpp file, which is already described in a
header file. And update the pseudo-mlir in the comment.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95615
Expand existing one to handle the common case for verifying compatible
is existing and inferred. This considers arrays equivalent if they they
have the same size and pairwise compatible elements.
Rationale:
Providing an output tensor, even if one is not used as input to
the kernel provides the right pattern for using lingalg sparse
kernels (in contrast with reusing a tensor just to provide the shape).
This prepares proper bufferization that will follow.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D95587
It is no longer necessary to also convert other "standard" ops along with the
complex dialect: the element types are now built-in integers or floating point
types, and the top-level cast between complex and struct is automatically
inserted and removed in progressive lowering.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D95625
Depending on the headers only is fine, but we do not want to use any symbols from LLVMSupport. If we do, static registration of cl options is linked in as well, and loading multiple such libraries in the cuda/rocm-runner fails because the same cl options are registered multiple times.
The cuda/rocm-runners also depend on LLVMSupport, so one could think that already loading a single such library would fail. It does not because the map of cl options is not shared between the runner and the loaded libraries (but it is shared across all loaded libraries, presumably because it has external linkage, in contrast to the static registration which has internal linkage).
This change is a preparation step for dynamically loading the mlir_async_runtime.so and cuda-runtime-wrappers.so in the same test. The async runtime depends on LLVMSupport in a more fundamental way (llvm::ThreadPool), and as explained above there can only be one.
This change also switches to add_mlir_library to make it consistent with the other runner_utils libraries.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D95613
This revision creates a build method of PadTensorOp which can be mapped to
SimplePad op. The verifier is updated to accept a static custom result type,
which has the same semantic as SimplePadOp.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95555
The subview verifier in the rank-reduced case is plainly skipping verification
when the resulting type is a memref with empty affine map. This is generally incorrect.
Instead, form the actual expected rank-reduced MemRefType that takes into account the projections of 1's dimensions. Then, check the canonicalized expected rank-reduced type against the canonicalized candidate type.
Differential Revision: https://reviews.llvm.org/D95316
This revision adds a layer of SFINAE to the composable codegen strategy so it does
not have to require statically defined ops but instead can also be used with OpInterfaces, Operation* and an op name string.
A linalg.matmul_i8_i8_i32 is added to the .tc spec to demonstrate how all this works end to end.
Differential Revision: https://reviews.llvm.org/D95600
This revision improves the usage of the codegen strategy by adding a few flags that
make it easier to control for the CLI.
Usage of ModuleOp is replaced by FuncOp as this created issues in multi-threaded mode.
A simple benchmarking capability is added for linalg.matmul as well as linalg.matmul_column_major.
This latter op is also added to linalg.
Now obsolete linalg integration tests that also take too long are deleted.
Correctness checks are still missing at this point.
Differential revision: https://reviews.llvm.org/D95531
Fixes a few small issues in the docs. It seems one of the examples was missing
the expected MLIR output due to a copy-paste typo.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95599
This prevents needless reinitialization for clients that want to reuse a pass manager multiple times. A new `getRegisryHash` function is exposed by the context to give a rough indicator of when the context registry has changed.
Differential Revision: https://reviews.llvm.org/D95493
`emplace???` functions running concurrently can set the ready flag and then pending awaiter will never be executed
Differential Revision: https://reviews.llvm.org/D95517
OffsetSizeAndStrideOpInterface now have the ability to specify only a leading subset of
offset, sizes, strides operands/attributes.
The size of that leading subset must be limited by the corresponding entry in `getArrayAttrMaxRanks` to avoid overflows.
Missing trailing dimensions are assumed to span the whole range (i.e. [0 .. dim)).
This brings more natural semantics to slice-like op on top of subview and is a simplifies to removing all uses of SliceOp in dependent projects.
Differential revision: https://reviews.llvm.org/D95441
The current context is thread-local state, and in preparation of GPU async execution (on multiple threads) we need to set the context before calling API that create resources.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D94495
This follows up on the introduction of C API for the same object and is similar
to AffineExpr and AffineMap.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D95437
Depends On D95000
Move async.execute outlining and async -> async.runtime lowering into the separate Async transformation pass
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D95311
Adds vp2intersect to the AVX512 dialect and defines a lowering to the
LLVM dialect.
Author: Matthias Springer <springerm@google.com>
Differential Revision: https://reviews.llvm.org/D95301
The `getCapsule` and `createFromCapsule` comments incorrectly state the `PyMlirContext` and `MlirContext` in `PyLocation`, `PyAttribute`, and `PyType` classes.
Differential Revision: https://reviews.llvm.org/D95413
Instead of using llvm.call operations to call LLVM coro intrinsics use Coro operations from the LLVM dialect.
(This was reviewed as a part of https://reviews.llvm.org/D94923 but was lost in arc land from local branch)
Differential Revision: https://reviews.llvm.org/D95405
[NFC] No new functionality, mostly a cleanup and one more abstraction level between Async and LLVM IR.
Instead of lowering from Async to LLVM coroutines and Async Runtime API in one shot, do it progressively via async.coro and async.runtime operations.
1. Lower from async to async.runtime/coro (e.g. async.execute to function with coro setup and runtime calls)
2. Lower from async.runtime/coro to LLVM intrinsics and runtime API calls
Intermediate coro/runtime operations will allow to run transformations on a higher level IR and do not try to match IR based on the LLVM::CallOp properties.
Although async.coro is very close to LLVM coroutines, it is not exactly the same API, instead it is optimized for usability in async lowering, and misses a lot of details that are present in @llvm.coro intrinsic.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D94923