When lowering to memrefCopy call, the size for i1 type was calculated as 0.
Instead of using getTypeSizeInBits() and dividing by 8, we should just use getTypeSize().
Differential Revision: https://reviews.llvm.org/D119540
The lowering creates llvm.insertvalue with the rank value, so it needs to use
index type instead of 64 bit integer type. Otherwise, we get an error:
llvm.insertvalue' op Type mismatch: cannot insert 'i64' into '!llvm.struct<(i32, ptr<i8>)>'
Differential Revision: https://reviews.llvm.org/D119534
This patch simply adds an optional garbage collector attribute to LLVMFuncOp which maps 1:1 to the "gc" property of functions in LLVM.
Differential Revision: https://reviews.llvm.org/D119492
This allows operations to control the block ids used by the printer in nested regions.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D115849
Previously, OpDSL did not support rank polymorphism, which required a separate implementation of linalg.fill. This revision extends OpDSL to support rank polymorphism for a limited class of operations that access only scalars and tensors of rank zero. At operation instantiation time, it scales these scalar computations to multi-dimensional pointwise computations by replacing the empty indexing maps with identity index maps. The revision does not change the DSL itself, instead it adapts the Python emitter and the YAML generator to generate different indexing maps and and iterators depending on the rank of the first output.
Additionally, the revision introduces a `linalg.fill_tensor` operation that in a future revision shall replace the current handwritten `linalg.fill` operation. `linalg.fill_tensor` is thus only temporarily available and will be renamed to `linalg.fill`.
Reviewed By: nicolasvasilache, stellaraccident
Differential Revision: https://reviews.llvm.org/D119003
Add new operations to the gpu dialect to represent device side
asynchronous copies. This also add the lowering of those operations to
nvvm dialect.
Those ops are meant to be low level and map directly to llvm dialects
like nvvm or rocdl.
We can further add higher level of abstraction by building on top of
those operations.
This has been discuss here:
https://discourse.llvm.org/t/modeling-gpu-async-copy-ampere-feature/4924
Differential Revision: https://reviews.llvm.org/D119191
These functions allow for defining pattern fragments usable within the `match` and `rewrite` sections of a pattern. The main structure of Constraints and Rewrites functions are the same, and are similar to functions in other languages; they contain a signature (i.e. name, argument list, result list) and a body:
```pdll
// Constraint that takes a value as an input, and produces a value:
Constraint Cst(arg: Value) -> Value { ... }
// Constraint that returns multiple values:
Constraint Cst() -> (result1: Value, result2: ValueRange);
```
When returning multiple results, each result can be optionally be named (the result of a Constraint/Rewrite in the case of multiple results is a tuple).
These body of a Constraint/Rewrite functions can be specified in several ways:
* Externally
In this case we are importing an external function (registered by the user outside of PDLL):
```pdll
Constraint Foo(op: Op);
Rewrite Bar();
```
* In PDLL (using PDLL constructs)
In this case, the body is defined using PDLL constructs:
```pdll
Rewrite BuildFooOp() {
// The result type of the Rewrite is inferred from the return.
return op<my_dialect.foo>;
}
// Constraints/Rewrites can also implement a lambda/expression
// body for simple one line bodies.
Rewrite BuildFooOp() => op<my_dialect.foo>;
```
* In PDLL (using a native/C++ code block)
In this case the body is specified using a C++(or potentially other language at some point) code block. When building PDLL in AOT mode this will generate a native constraint/rewrite and register it with the PDL bytecode.
```pdll
Rewrite BuildFooOp() -> Op<my_dialect.foo> [{
return rewriter.create<my_dialect::FooOp>(...);
}];
```
Differential Revision: https://reviews.llvm.org/D115836
This allows for defining simple patterns in a single line. The lambda
body of a Pattern expects a single operation rewrite statement:
```
Pattern => replace op<my_dialect.foo>(operands: ValueRange) with operands;
```
Differential Revision: https://reviews.llvm.org/D115835
If the result operand has a unit leading dim it is removed from all operands.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D119206
Fix fold-memref-subview-ops for affine.load/store. We need to expand out
the affine apply on its operands.
Differential Revision: https://reviews.llvm.org/D119402
removed obsoleted TODO
removed strange Fp precision for coordinates
lined up meta data testing code for readability
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D119377
They used to be classes with a virtual `run` function. This was inconvenient because post analysis steps are stored in BufferizationOptions. Because of this design choice, BufferizationOptions were not copyable.
Differential Revision: https://reviews.llvm.org/D119258
Supports whitespace elements: ` ` and `\\n` as well as the "empty" whitespace `` that removes an otherwise printed space.
Depends on D118208
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118210
Reuse the higher precision F32 approximation for the F16 one (by expanding and
truncating). This is partly RFC as I'm not sure what the expectations are here
(e.g., these are only for F32 and should not be expanded, that reusing
higher-precision ones for lower precision is undesirable due to increased
compute cost and only approximations per exact type is preferred, or this is
appropriate [at least as fallback] but we need to see how to make it more
generic across all the patterns here).
Differential Revision: https://reviews.llvm.org/D118968
Implements optional attribute or type parameters, including support for such parameters in the assembly format `struct` directive. Also implements optional groups.
Depends on D117971
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118208
For 0-D as well as 1-D vectors, both these patterns should
return a failure as there is no need to collapse the shape
of the source. Currently, only 1-D vectors were handled. This
patch handles the 0-D case as well.
Reviewed By: Benoit, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D119202
There are a few different test passes that check elementwise fusion in
Linalg. Consolidate them to a single pass controlled by different pass
options (in keeping with how `TestLinalgTransforms` exists).
Add a Python method, output_sparse_tensor, to use sparse_tensor.out to write
a sparse tensor value to a file.
Modify the method that evaluates a tensor expression to return a pointer of the
MLIR sparse tensor for the result to delay the extraction of the coordinates and
non-zero values.
Implement the Tensor to_file method to evaluate the tensor assignment and write
the result to a file.
Add unit tests. Modify test golden files to reflect the change that TNS outputs
now have a comment line and two meta data lines.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D118956
Currently if an operation wants a C++ implemented parser/printer, it specifies inline
code blocks. This is quite problematic for various reasons, e.g. it requires defining
C++ inside of Tablegen which is discouraged when possible, but mainly because
nearly all usages simply forward to static functions (e.g. `static void parseSomeOp(...)`)
with users devising their own standards for how these are defined.
This commit adds support for a `hasCustomAssemblyFormat` bit field that specifies if
a C++ parser/printer is needed, and when set to 1 declares the parse/print methods for
operations to override. For migration purposes, the existing behavior is untouched. Upstream
usages will be replaced in a followup to keep this patch focused on the new implementation.
Differential Revision: https://reviews.llvm.org/D119054
There are a few different test passes that check elementwise fusion in
Linalg. Consolidate them to a single pass controlled by different pass
options (in keeping with how `TestLinalgTransforms` exists).
Fix the verification function of spirv::ConstantOp to allow nesting
array attributes.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D118939
* Implement `FlatAffineConstraints::getConstantBound(EQ)`.
* Inject a simpler constraint for loops that have at most 1 iteration.
* Taking into account constant EQ bounds of FlatAffineConstraint dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`.
Differential Revision: https://reviews.llvm.org/D119153
This is both more efficient and more ergonomic to use, as inverting a
bit vector is trivial while inverting a set is annoying.
Sadly this leaks into a bunch of APIs downstream, so adapt them as well.
This would be NFC, but there is an ordering dependency in MemRefOps's
computeMemRefRankReductionMask. This is now deterministic, previously it
was dependent on SmallDenseSet's unspecified iteration order.
Differential Revision: https://reviews.llvm.org/D119076
Adapt `tileConsumerAndFuseProducers` to return failure if the generated tile loop nest is empty since all tile sizes are zero. Additionally, fix `LinalgTileAndFuseTensorOpsPattern` to return success if the pattern applied successfully.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D118878
Simple pass that changes all symbols to private unless symbol is excluded (and
in which case there is no change to symbol's visibility).
Differential Revision: https://reviews.llvm.org/D118752
Induction variable calculation was ignoring scf.for step value. Fix it to get
the correct induction variable value in the prologue.
Differential Revision: https://reviews.llvm.org/D118932
Some translations do work with unregistered dialects, this allows one
to write testcases against them. It works the same way as it does for
mlir-opt.
Differential Revision: https://reviews.llvm.org/D118872
Replace the Python implementation for reading tensor input data from files with
create_sparse_tensor that uses sparse_tensor.new.
The MLIR TNS format has two extra meta data lines. Add the extra meta data to a
test data file.
Implement TACO tensor methods evaluate and unpack.
Add unit tests.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D118803
-- This commit adds a canonicalization pattern on scf.while to remove
the loop invariant arguments.
-- An argument is considered loop invariant if the iteration argument value is
the same as the corresponding one being yielded (at the same position) in both
the before/after block of scf.while.
-- For the arguments removed, their use within scf.while and their corresponding
scf.while's result are replaced with their corresponding initial value.
Signed-off-by: Abhishek Varma <abhishek.varma@polymagelabs.com>
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D116923
Prior to this patch, using an operation without any results as the location would result in the generation of invalid C++ code. It'd try to format using the result values, which would would end up being an empty string for an operation without any.
This patch fixes that issue by instead using getValueAndRangeUse which handles both ranges as well as the case for an op without any results.
Differential Revision: https://reviews.llvm.org/D118885
The code assumes that a TypeConstraint in the additional constraints list specifies precisely one argument.
If the user were to not specify any, it'd result in a crash. If given more than one, the additional ones were ignored.
This patch fixes the crash and disallows user errors by adding a check that a single argument is supplied to the TypeConstraint
Differential Revision: https://reviews.llvm.org/D118763