The code assumed that a TypeConstraint in the additional constraints list specifies precisely one argument.
If the user specified none, it resulted in a crash; if more than one was given, the additional ones were silently ignored.
This patch fixes the crash and disallows such user errors by adding a check that exactly one argument is supplied to the TypeConstraint.
Differential Revision: https://reviews.llvm.org/D118763
This is completely unused upstream and does not have well-defined semantics: it is unclear
what it is supposed to do or how it fits into the ecosystem. Given that, as part of
splitting up the standard dialect it is best to simply remove this behavior rather than try
to awkwardly fit it somewhere upstream. Downstream users are encouraged to
define their own operations with clearly defined semantics for this.
This also uncovered several lingering uses of ConstantOp that weren't
updated to use arith::ConstantOp; they only worked during conversions because
the constant was removed/converted into something else before
verification.
See https://llvm.discourse.group/t/standard-dialect-the-final-chapter/ for more discussion.
Differential Revision: https://reviews.llvm.org/D118654
The Utils.cpp file in StandardOps essentially just contains utilities for interacting with arithmetic
operations, and at this point makes more sense as a utility file for the arithmetic dialect.
Differential Revision: https://reviews.llvm.org/D118280
This is part of the larger effort to split the standard dialect. This will also allow for pruning some
additional dependencies on Standard (done in a followup).
Differential Revision: https://reviews.llvm.org/D118202
Currently, if an operation requires additional verification, it specifies an inline
code block (`let verifier = "blah"`). This is quite problematic for various reasons: it requires
defining C++ inside of TableGen, which is discouraged when possible, and, more importantly,
nearly all usages simply forward to a static function `static LogicalResult verify(SomeOp op)`.
This commit adds support for a `hasVerifier` bit field that specifies if an additional verifier
is needed, and when set to `1` declares a `LogicalResult verify()` method for operations to
override. For migration purposes, the existing behavior is untouched. Upstream usages will
be replaced in a followup to keep this patch focused on the hasVerifier implementation.
One main user-facing change is that what was once `MyOp::verify` is now `MyOp::verifyInvariants`.
This better matches the name this method goes by everywhere else, and also frees up `verify` for
the user-defined additional verification. The generated `verify` function (for additional
verification) is now private to the operation class, which should also help avoid accidental usages after
this switch.
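As a rough sketch (using a hypothetical `MyOp`; the exact generated declaration may differ), an op that sets `hasVerifier = 1` then provides a definition along these lines:

```c++
// Hypothetical example, not an actual upstream op. Runs after the
// auto-generated MyOp::verifyInvariants() checks have passed.
LogicalResult MyOp::verify() {
  if ((*this)->getNumOperands() == 0)
    return emitOpError("expected at least one operand");
  return success();
}
```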
Differential Revision: https://reviews.llvm.org/D118742
This CL supports adding dependencies between trait verifiers; the
dependencies are checked statically.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D115135
Use `SmallVector` instead of `std::vector` in `getLocalRepr` function.
Also, fix the casing of a variable.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D118722
This revision avoids incorrect hoisting of alloca'd buffers across an AutomaticAllocationScope boundary.
In the more general case, we will probably need a ParallelScope-like interface.
Differential Revision: https://reviews.llvm.org/D118768
By fully qualifying the use of any types and functions from the mlir namespace, users are no longer required to add `using namespace mlir;` to the C++ file that includes the TableGen output.
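For illustration, the difference in the generated code looks roughly like this (hypothetical declaration name, not actual emitter output):

```c++
// Before: only compiles if the including file has `using namespace mlir;`.
static LogicalResult verifyFooOp(Operation *op);

// After: fully qualified, so no using-directive is required.
static ::mlir::LogicalResult verifyFooOp(::mlir::Operation *op);
```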
Differential Revision: https://reviews.llvm.org/D118767
Use type inference when building the TransferWriteOp in the TransferWritePermutationLowering. Previously, the result type was set to Type(), which triggered an assertion if the pattern was used with tensors instead of memrefs.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D118758
Some FormatElement subclasses contain `std::vector`. Since these use
BumpPtrAllocator, they need to be converted to trailing objects.
However, this is not a trivial fix so I will leave it as a FIXME and use
a workaround.
Move the functions that retrieve the supporting C library, compile an MLIR
module and build a JIT execution engine to mlir_pytaco_utils.
Add a function to create an MLIR sparse tensor from a file and return a pointer
to the MLIR sparse tensor as well as the shape of the sparse tensor.
Add unit tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D118496
Exposes mlir::DialectRegistry to the C API as MlirDialectRegistry along with
helper functions. A hook has been added to MlirDialectHandle that inserts
the dialect into a registry.
A possible future change is removing mlirDialectHandleRegisterDialect in
favor of mlirDialectHandleInsertDialect, in terms of which it is now implemented.
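A minimal usage sketch (assuming the MLIR C API headers and a hypothetical `foo` dialect handle; exact names should be checked against mlir-c/IR.h):

```c++
#include "mlir-c/IR.h"

void registerFooDialect(MlirContext ctx) {
  MlirDialectRegistry registry = mlirDialectRegistryCreate();
  // New hook: insert the dialect behind the handle into the registry.
  // mlirGetDialectHandle__foo__ is a hypothetical per-dialect handle getter.
  mlirDialectHandleInsertDialect(mlirGetDialectHandle__foo__(), registry);
  // Make all dialects in the registry available to the context.
  mlirContextAppendDialectRegistry(ctx, registry);
  mlirDialectRegistryDestroy(registry);
}
```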
Differential Revision: https://reviews.llvm.org/D118293
When attempting to cast a pybind11 handle to an MLIR C API object through
capsules, the binding code would attempt to directly access the "_CAPIPtr"
attribute on the object, leading to a rather obscure AttributeError when the
attribute was missing, e.g., on non-MLIR types. Check for its presence and
throw a TypeError instead.
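A minimal sketch of the check (a hypothetical helper; the actual binding code is structured differently):

```c++
#include <pybind11/pybind11.h>
namespace py = pybind11;

// Fetch the MLIR C API capsule from a Python object, raising TypeError
// rather than AttributeError when the object is not an MLIR wrapper.
static py::object getCapiCapsule(py::object obj) {
  if (!py::hasattr(obj, "_CAPIPtr"))
    throw py::type_error("expected an MLIR object exposing _CAPIPtr");
  return obj.attr("_CAPIPtr");
}
```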
Depends On D117646
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D117658
Somehow the test introduced in https://reviews.llvm.org/D118006 produces the expected result, but running it
through lli with Intel SDE activated sneaks in an error code 2 (before this commit) or an error code 10
(after this commit).
The test as-is is still meaningful in that the LLVM IR generation would crash if the `elementtype` were set
improperly.
Still, this should run with lli turned on.
This revision adds enough support to allow InlineAsmOp to work properly with indirect memory constraints "*m".
These require an explicit "elementtype" TypeAttr on the operands, which must be provided to pass LLVM verification.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D118006
Extract the division representation from equality constraints.
For example:
32*k == 16*i + j - 31          <-- k is the localVariable
expr = 16*i + j - 31, divisor = 32
k = (16*i + j - 31) floordiv 32
Since the equality forces 16*i + j - 31 to be an exact multiple of 32, the exact division and the floor division coincide.
The dividend of the division is set to [16, 1, -31] and the divisor is set
to 32.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117959
Following the discussion in D118318, mark `arith.addf/mulf` commutative.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D118600
Part 2 of 3 of unifying the assembly formats of attributes/types and operations. The last patch, which introduced attribute/type formats (D111594), factored out the format lexer entirely. This patch factors out most of the format parsers such that the attribute/type and op parsers only need to implement handling for their specific elements.
Certain things could be factored better (element verification, 'seen' variables), but the primary goal of this factoring is to make features usable across both assembly formats.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117971
This matches the API usage of attributes/ops/types. For example:
```c++
Dialect *dialect = ...;
// Instead of this:
if (auto *interface = dialect->getRegisteredInterface<DialectInlinerInterface>())
// You can do this:
if (auto *interface = dyn_cast<DialectInlinerInterface>(dialect))
```
Differential Revision: https://reviews.llvm.org/D117859
- Remove the `{Op,Attr,Type}Trait` TableGen classes and replace with `Trait`
- Rename `OpTraitList` to `TraitList` and use it in a few places
The bulk of this change is a mechanical s/OpTrait/Trait/ throughout the codebase.
Reviewed By: rriddle, jpienaar, herhut
Differential Revision: https://reviews.llvm.org/D118543
This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects.
Differential Revision: https://reviews.llvm.org/D118533
Support affine.load/store ops in the fold-memref-subview ops pass. The
existing pass just "inlines" the subview operation into the load/stores by
inserting affine.apply ops in front of the memref load/store ops: this
is by design always consistent with the semantics of affine.load/store
ops, and the same approach works even more naturally/intuitively with the
latter.
Differential Revision: https://reviews.llvm.org/D118565
Update SCF pass command-line names to use the `scf` prefix. This is consistent with
the guidelines/convention on how to name dialect passes. This also avoids
ambiguity about the context, given the multiple `for` operations in the
tree.
NFC.
Differential Revision: https://reviews.llvm.org/D118564
There was a bug where some of the OpOperands needed in the replacement op were not in scope.
It does not matter where the replacement op is inserted. Any insertion point is OK as long as there are no dominance errors. In the worst case, the newly inserted op will bufferize out-of-place. This is no worse than not eliminating the InitTensorOp at all.
Differential Revision: https://reviews.llvm.org/D117685
Also reimplement `std-bufferize` in terms of BufferizableOpInterface-based bufferization. The old `std.select` bufferization pattern is no longer needed and deleted.
Differential Revision: https://reviews.llvm.org/D118559
The bufferization of arith.constant ops is also switched over to BufferizableOpInterface-based bufferization. The old implementation is deleted. Both implementations utilize GlobalCreator, now renamed to just `getGlobalFor`.
GlobalCreator no longer maintains a set of all created allocations to avoid duplicate allocations of the same constant. Instead, `getGlobalFor` scans the module to see if there is already a global allocation with the same constant value.
For compatibility reasons, it is still possible to create a pass that bufferizes only `arith.constant`. This pass (createConstantBufferizePass) can be deleted once all users have switched over to One-Shot bufferization.
Differential Revision: https://reviews.llvm.org/D118483
llvm/Support/Path.h was likely previously implicitly included, and a
refactoring removed that inclusion, breaking the pass.
Differential Revision: https://reviews.llvm.org/D118508
Type constraints such as BoolLike and SignlessIntegerLike appear to
have been defined by copy-paste and all share an underlying TypesLike
structure that can be factored out.
This also allows for defining additional constraints of a similar
form, such as F32Like.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118382
This patch adds the vector.scan op which computes the
scan for a given n-d vector. It requires specifying the operator,
the identity element and whether the scan is inclusive or
exclusive.
TEST: Added test in ops.mlir
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D117171
The unit tests for PyTACO haven't been upstreamed yet. A unit test for this
change will be added when we upstream all the unit tests for PyTACO.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D118417
mlir-cpu-runner has a dependency on ExecutionEngine which is only built for the native arch. So currently mlir-cpu-runner does not link correctly when the native arch is not targeted.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D118422
This patch introduces a class LexSimplex that can currently be used to find the
lexicographically minimal rational point in an IntegerPolyhedron. This is part of a
series of patches leading to computing the lexicographically minimal integer
lattice point as well as parametric lexicographic minimization.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D117437
With opaque pointers, we can no longer derive this from the pointer
type, so we need to explicitly provide the element type the atomic
operation should work with.
Differential Revision: https://reviews.llvm.org/D118359
This is the start of the MLIR benchmarks. It sets up a command
line tool along with conventions to define and run benchmarks
using MLIR's Python bindings.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D115174
Although cmake should be platform-independent, we've observed
that some aspects of Python detection don't work on all platforms,
even with recent versions of cmake. This appears to be due to bugs
in the python detection logic, especially when the NumPy component
is required and not located within the python installation.
As a workaround, this patch first searches for "Development" before
searching for "Development.Module", which seems to work around the
issue.
Differential Revision: https://reviews.llvm.org/D118148
This is in preparation of switching `-tensor-constant-bufferize` and `-arith-bufferize` to BufferizableOpInterface-based implementations.
Differential Revision: https://reviews.llvm.org/D118324
This commit switches the `tensor-bufferize` pass over to BufferizableOpInterface-based bufferization.
Differential Revision: https://reviews.llvm.org/D118246
The pass currently cannot handle to_memref(to_tensor(x)) foldings where a cast is necessary. This is required with the new unified bufferization. There is already a canonicalization pattern that handles such foldings, and it should be used during this pass.
Differential Revision: https://reviews.llvm.org/D117988
Stop the Cpp target from emitting unused labels. The previously generated
code produced warnings if `-Wunused-label` was passed to the compiler.
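For illustration, emitted code of roughly this shape (hypothetical, not actual emitter output) triggers the warning when nothing branches to the label:

```c++
#include <cstdint>

int32_t forward(int32_t v1) {
label1:  // warning: unused label 'label1' [-Wunused-label]
  return v1;
}
```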
Co-authored-by: Simon Camphausen <simon.camphausen@iml.fraunhofer.de>
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D118154
Prefix "affine-" to affine transform passes that were missing it -- to
avoid ambiguity and for uniformity. There were only two needed this.
Move mispaced affine coalescing test case file.
NFC.
Differential Revision: https://reviews.llvm.org/D118314
OwningRewritePatternList has been deprecated for ~10 months now; we can remove
the leftover using directives at this point.
Differential Revision: https://reviews.llvm.org/D118287
These transformations already operate on memref operations (as part of
splitting up the standard dialect). Now that the operations have moved,
it's time for these transformations to move as well.
Differential Revision: https://reviews.llvm.org/D118285
We already convert to BitVector internally, and other APIs (namely Operation::eraseOperands)
already use BitVector as well. Switching over provides a common format between
APIs and also reduces the number of format conversions necessary.
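For example, a small sketch (assuming the usual MLIR headers):

```c++
#include "mlir/IR/Operation.h"

// Erase operands #0 and #2 of `op` in a single call, using a BitVector mask.
void eraseFirstAndThirdOperand(mlir::Operation *op) {
  llvm::BitVector eraseIndices(op->getNumOperands());
  eraseIndices.set(0);
  eraseIndices.set(2);
  op->eraseOperands(eraseIndices);
}
```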
Fixes #53325
Differential Revision: https://reviews.llvm.org/D118083
The option parser currently does not properly handle nested options, meaning that
in some cases we can print pass pipelines that we can't actually parse back in. For example,
from #52885 we currently can't parse in inliner pipelines that have nested pipeline strings.
This commit adds handling for string values (e.g. "...") and nested options
(e.g. `foo{baz{bar=10 pizza=11}}`).
Fixes #52885
Differential Revision: https://reviews.llvm.org/D118078
Rationale:
Demonstrates the maximum tile size allowed for the f32 <= bf16 x bf16 op
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D118277
This diff modifies the tablegen specification and code generation for
BitEnumAttr attributes in MLIR Operation Definition Specification (ODS) files.
Specifically:
- there is a new tablegen class for "none" values (i.e. no bits set)
- single-bit enum cases are specified via bit index (i.e. [0, 31]) instead of
the resulting enum integer value
- there is a new tablegen class to represent a "grouped" bitwise OR of other
enum values
This diff is intended as an initial step towards improving "fastmath"
optimization support in MLIR, to allow more precise control of whether certain
floating point optimizations are applied in MLIR passes. "Fast" math options
for floating point MLIR operations would (following subsequent RFC and
discussion) be specified by using the improved enum bit support in this diff.
For example, a "fast" enum value would act as an alias for a group of other
cases (e.g. finite-math-only, no-signed-zeros, etc.), in a way that is similar
to support in C/C++ compilers (clang, gcc).
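As a hypothetical C++ view of the new semantics (not the generated code), a case declared at bit index n corresponds to the value 1 << n, and a "grouped" case is the bitwise OR of its members:

```c++
#include <cstdint>

// Hypothetical flags; the actual fastmath attribute is left to a later RFC.
enum class FastMathFlags : uint32_t {
  none = 0,                  // the new "none" case: no bits set
  nnan = 1u << 0,            // declared via bit index 0
  ninf = 1u << 1,            // declared via bit index 1
  nsz  = 1u << 2,            // declared via bit index 2
  fast = nnan | ninf | nsz,  // a "grouped" case: OR of other cases
};
```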
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117029
This is part of splitting up the standard dialect. The move makes sense anyway,
given that the memref dialect already holds memref.atomic_rmw, which is the non-region
sibling operation of std.generic_atomic_rmw (the relationship is even clearer given that
they have nearly the same description, modulo how they represent the inner computation).
Differential Revision: https://reviews.llvm.org/D118209
The GPU dialect currently contains an explicit reference to LLVMFuncOp
during verification to handle the situation where the kernel has already been
converted. This commit changes that reference to instead use FunctionOpInterface,
which has two main benefits:
* It allows for removing an otherwise unnecessary dependency on the LLVM dialect
* It removes hardcoded assumptions about the lowering path and use of the GPU dialect
Differential Revision: https://reviews.llvm.org/D118172
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
This is for compatibility with existing bufferization passes. Also clean up memref type generation a bit.
Differential Revision: https://reviews.llvm.org/D118243
This revision adds enough support to allow InlineAsmOp to work properly with indirect memory constraints "*m".
These require an explicit "elementtype" TypeAttr on the operands, which must be provided to pass LLVM verification.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D118006
The existing implementation called DenseMap::insert, which is a no-op if the map already contains an entry with the same key.
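For reference, a small sketch of the behavior behind the bug:

```c++
#include "llvm/ADT/DenseMap.h"

void denseMapInsertDoesNotOverwrite() {
  llvm::DenseMap<int, int> map;
  map.insert({1, 10});
  map.insert({1, 20}); // No-op: key 1 is already present, value stays 10.
  map[1] = 20;         // Overwrites: value becomes 20.
}
```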
Differential Revision: https://reviews.llvm.org/D118165
If we are extracting, it is more useful to push the index_cast past the
extraction. This increases the chance that the tensor.extract can be evaluated at
compile time.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118204
There is not much of a benefit to reshaping a from_elements vs. reloading it.
Updated to propagate shape manipulations into the output type of
tensor.from_elements.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D118201
Add methods to
- Get the block argument that is tied to an opOperand.
- Get the yield value that is tied to an output opOperand.
Differential Revision: https://reviews.llvm.org/D118085
Fusion of reshape ops by linearization incorrectly inverted the
indexing map before linearizing dimensions. This led to incorrect
indexing maps being used in the fused operation.
Differential Revision: https://reviews.llvm.org/D117908
This pattern is not written to handle operations with `linalg.index`
operations in their bodies, i.e. operations that have index semantics.
Differential Revision: https://reviews.llvm.org/D117856
This is mostly a copy of the existing tensor.from_elements bufferization. Once TensorInterfaceImpl.cpp is moved to the tensor dialect, the existing rewrite pattern can be deleted.
Differential Revision: https://reviews.llvm.org/D117775
This is mostly a copy of the existing tensor.generate bufferization. Once TensorInterfaceImpl.cpp is moved to the tensor dialect, the existing rewrite pattern can be deleted.
Differential Revision: https://reviews.llvm.org/D117770
This patch supports the atomic construct (capture) following section 2.17.7 of OpenMP 5.0 standard. Also added tests for the same.
Reviewed By: peixin, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D115851
This is too important a debugging option to stay hidden. Also,
improve its command-line description; the current description gives no hint
that this is the option to use to have locations printed inline.
Out-of-line locations are also unproductive to work with in many cases
where the locations are actually compact, which is another reason this option
should be more visible. This revision doesn't change its default,
though.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D117186
A lot of dialects have dependencies that are unnecessary, either because of copy/paste
of files when creating things or for some other reason. This commit cleans up a bunch of
the simple ones:
* Copy/Paste or missed during refactoring
Most of the dependencies cleaned up here look like copy/paste errors when creating
new dialects/transformations, or because the dependency wasn't removed during a
refactoring (e.g. when splitting the standard dialect).
* Unnecessary hard coding of constant operations in matchers
There are a few instances where a dialect had a dependency because it
was hardcoding checks for constant operations instead of using the better m_Constant
approach.
Differential Revision: https://reviews.llvm.org/D118062
This has been a major TODO for a very long time, and is necessary for establishing a proper
dialect-free dependency layering for the Transforms library. Code was moved to effectively
two main locations:
* Affine/
There was quite a bit of affine-dialect-related code in Transforms/ due to historical reasons
(from a time way back in MLIR's past). The following headers were moved:
Transforms/LoopFusionUtils.h -> Dialect/Affine/LoopFusionUtils.h
Transforms/LoopUtils.h -> Dialect/Affine/LoopUtils.h
Transforms/Utils.h -> Dialect/Affine/Utils.h
The following transforms were also moved:
AffineLoopFusion, AffinePipelineDataTransfer, LoopCoalescing
* SCF/
Only one SCF pass was in Transforms/ (likely accidentally placed here): ParallelLoopCollapsing
The SCF specific utilities in LoopUtils have been moved to SCF/Utils.h
* Misc:
mlir::moveLoopInvariantCode was also moved to LoopLikeInterface.h given
that it is a simple utility defined in terms of LoopLikeOpInterface.
Differential Revision: https://reviews.llvm.org/D117848
Transforms/ should only contain transformations that are dialect-independent and
this pass interacts with MemRef operations (making it a better fit for living in that
dialect).
Differential Revision: https://reviews.llvm.org/D117841
Transforms/ should only contain dialect-independent transformations,
and these files are a much better fit for the bufferization dialect anyways.
Differential Revision: https://reviews.llvm.org/D117839
The current lowering from GPU to NVVM does
not correctly handle the following cases when
lowering the gpu shuffle op.
1. When the active width is set to 32 (all lanes),
the current approach computes (1 << 32) - 1, which
results in poison values in the LLVM IR. We fix this by
defining the active mask as (-1) >> (32 - width); see the sketch after this list.
2. In the case of shuffle up, the computation of the third
operand c has to be different from the other 3 modes due to
the op definition in the ISA reference.
(https://docs.nvidia.com/cuda/parallel-thread-execution/index.html)
Specifically, the predicate value is computed as j >= maxLane
for up and j <= maxLane for all other modes. We fix this by
computing maskAndClamp as 32 - width for this mode.
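A scalar C++ sketch of the mask computation from point 1 (illustrative only, not the lowering code itself):

```c++
#include <cassert>
#include <cstdint>

// For width in [1, 32], compute a mask with the low `width` bits set without
// ever evaluating the poison-producing (1 << 32).
uint32_t activeMask(uint32_t width) {
  assert(width >= 1 && width <= 32);
  return ~0u >> (32 - width); // width == 32 yields 0xFFFFFFFF directly
}
```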
TEST: We modify the existing test and add more checks for the up mode.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D118086