Initially, the padding transformation and the related operation were only used
to guarantee static shapes of subtensors in tiled operations. The
transformation would not insert the padding operation if the shapes were
already static, and the overall code generation would actively remove such
"noop" pads. However, this transformation can be also used to pack data into
smaller tensors and marshall them into faster memory, regardless of the size
mismatches. In context of expert-driven transformation, we should assume that,
if padding is requested, a potentially padded tensor must be always created.
Update the transformation accordingly. To do this, introduce an optional
`packing` attribute to the `pad_tensor` op that serves as an indication that
the padding is an intentional choice (as opposed to side effect of type
normalization) and should be left alone by cleanups.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110425
Add support for intersecting, subtracting, complementing and checking equality of sets having divisions.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D110138
This canonicalization pattern complements the tensor.cast(pad_tensor) one in
propagating constant type information when possible. It contributes to the
feasibility of pad hoisting.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110343
* Do not discard static result type information that cannot be inferred from lower/upper padding.
* Add optional argument to `PadTensorOp::inferResultType` for specifying known result dimensions.
Differential Revision: https://reviews.llvm.org/D110380
This is only noticeable when using an attribute across dialects I think.
Previously the namespace would be ommited, but it wouldn't matter as
long as the generated code stays within a single namespace.
Differential Revision: https://reviews.llvm.org/D110367
When generating code to add an element to SparseTensorCOO (e.g., when doing dense=>sparse conversion), we used to check for nonzero values on the runtime side, whereas now we generate MLIR code to do that check.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D110121
Current warning message in method `addAffineForOpDomain` of mlir/lib/Analysis/AffineStructures.cpp is being printed to the stdout/stderr.
This patch redirects the warning with LLVM_DEBUG following standard llvm practice.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D108340
clang-cl errors out while handling the templated version of tgfmt. This
patch works around the issue by explicitly choosing the non-templated
version of tgfmt, which takes an ArrayRef<std::string>.
More details in this thread:
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068936.html
Thanks @Mehdi Amini for suggesting the fix :)
Differential Revision: https://reviews.llvm.org/D110223
Enables putting types and attributes in sets and in dicts as keys.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D110301
This test makes sure kernels map to efficient sparse code, i.e. all
compressed for-loops, no co-iterating while loops. In addition, this
revision removes the special constant folding inside the sparse
compiler in favor of Mahesh' new generic linalg folding. Thanks!
NOTE: relies on Mahesh fix, which needs to be rebased first
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110001
When both a DefaultValuedAttr and a successor or variadic region was specified, this would generate invalid C++ declaration. There would be the parameter with a default value, followed by the successors/regions, which don't have a default, which is invalid.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110205
The current folder of constant -> generic op only handles splat
constants. The same logic holds for scalar constants. Teach the
pattern to handle such cases.
Differential Revision: https://reviews.llvm.org/D109982
This fixes a bug where we discover new information about the arguments of an
already executable edge, but don't visit the arguments. We only visit the arguments, and not the block itself, so this commit shouldn't really affect performance at all.
Fixes PR#51871
Differential Revision: https://reviews.llvm.org/D110197
Should reset the operation to original state when canceling the updates.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D110176
Now not just SUM, but also PRODUCT, AND, OR, XOR. The reductions
MIN and MAX are still to be done (also depends on recognizing
these operations in cmp-select constructs).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110203
If no interchange vector is given initialize it with the identity permutation from 0 to number of loops.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D110249
This revision removes the ad-hoc MemRefs that were needed using the old
ABI (when we still passed by value) and replaces them with the shared
StridedMemRef definitions of CRunnerUtils (possible now that we pass by
pointer). This avoids code duplication and makes sure we have a consistent
view of strided memory references in all our support libraries.
Reviewed By: jsetoain
Differential Revision: https://reviews.llvm.org/D110221
This change adds automatic wrapper functoins with emit_c_interface
to all methods in the sparse support library that deal with MEMREFs.
The wrappers will take care of passing MEMREFs by value internally
and by pointer externally, thereby avoiding ABI issues across platforms.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110219
DialectAsmParser has a `parseAttribute` member that takes a
contextual type, but DialectAsmPrinter doesn't have the corresponding
member to take advantage of it. As such, custom attribute
implementations can't really use it. This adds the obvious missing
method which fills this hole.
Differential Revision: https://reviews.llvm.org/D110211
Previously, the translation to LLVM IR would emit IR that directly uses
a scope metadata node in case only one scope was in use in alias.scopes
or noalias metadata. It should always be a list of scopes. The verifier
change in 8700f2bd36 enforced this and
broke the test. Fix the translation to always create a list of scopes
using a new metadata node, update and reenable the respective test.
Fixes PR51919.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110140
Add a PointerProxy similar to the existing iterator_facade_base::ReferenceProxy and return it from the arrow operator. This prevents iterator facades with a reference type that is not a true reference to take the address of a temporary.
Forward the reference type of the mapped_iterator to the iterator adaptor which in turn forwards it to the iterator facade. This fixes mlir::op_iterator::operator->() to take the address of a temporary.
Make some polishing changes to op_iterator and op_filter_iterator.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D109490
Compute the tiled producer slice dimensions directly starting from the consumer not using the producer at all.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110147
Add a helper method to check if an index vector contains a permutation of its indices. Additionally, refactor applyPermutationToVector to take int64_t.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D110135
It was previously assumed that tensor.insert_slice should be bufferized first in a greedy fashion to avoid out-of-place bufferization of the large tensor. This heuristic does not hold upon further inspection.
This CL removes the special handling of such ops and adds a test that exhibits better behavior and appears in real use cases.
The only test adversely affected is an artificial test which results in a returned memref: this pattern is not allowed by comprehensive bufferization in real scenarios anyway and the offending test is deleted.
Differential Revision: https://reviews.llvm.org/D110072
Previously, comprehensive bufferize would consider all aliasing reads and writes to
the result buffer and matching operand. This resulted in spurious dependences
being considered and resulted in too many unnecessary copies.
Instead, this revision revisits the gathering of read and write alias sets.
This results in fewer alloc and copies.
An exhaustive test cases is added that considers all possible permutations of
`matmul(extract_slice(fill), extract_slice(fill), ...)`.
This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.
This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).
Differential Revision: https://reviews.llvm.org/D108454
This patch adds mergeLocalIds andmergeSymbolIds as public functions
for FlatAffineConstraints and FlatAffineValueConstraints respectively.
mergeLocalIds is also required to support divisions in intersection,
subtraction, equality checks, and complement for PresburgerSet.
This patch is part of a series of patches aimed at generalizing affine
dependence analysis.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D110045
Lots of custom ops have hand-rolled comma-delimited parsing loops, as does
the MLIR parser itself. Provides a standard interface for doing this that
is less error prone and less boilerplate.
While here, extend Delimiter to support <> and {} delimited sequences as
well (I have a use for <> in CIRCT specifically).
Differential Revision: https://reviews.llvm.org/D110122
This revision refactors ElementsAttr into an Attribute Interface.
This enables a common interface with which to interact with
element attributes, without needing to modify the builtin
dialect. It also removes a majority (if not all?) of the need for
the current OpaqueElementsAttr, which was originally intended as
a way to opaquely represent data that was not representable by
the other builtin constructs.
The new ElementsAttr interface not only allows for users to
natively represent their data in the way that best suits them,
it also allows for efficient opaque access and iteration of the
underlying data. Attributes using the ElementsAttr interface
can directly expose support for interacting with the held
elements using any C++ data type they claim to support. For
example, DenseIntOrFpElementsAttr supports iteration using
various native C++ integer/float data types, as well as
APInt/APFloat, and more. ElementsAttr instances that refer to
DenseIntOrFpElementsAttr can use all of these data types for
iteration:
```c++
DenseIntOrFpElementsAttr intElementsAttr = ...;
ElementsAttr attr = intElementsAttr;
for (uint64_t value : attr.getValues<uint64_t>())
...;
for (APInt value : attr.getValues<APInt>())
...;
for (IntegerAttr value : attr.getValues<IntegerAttr>())
...;
```
ElementsAttr also supports failable range/iterator access,
allowing for selective code paths depending on data type
support:
```c++
ElementsAttr attr = ...;
if (auto range = attr.tryGetValues<uint64_t>()) {
for (uint64_t value : *range)
...;
}
```
Differential Revision: https://reviews.llvm.org/D109190
Currently DenseElementsAttr only exposes the ability to get the full range of values for a given type T, but there are many situations where we just want the beginning/end iterator. This revision adds proper value_begin/value_end methods for all of the supported T types, and also cleans up a bit of the interface.
Differential Revision: https://reviews.llvm.org/D104173
SparseElementsAttr currently does not perform any verfication on construction, with the only verification existing within the parser. This revision moves the parser verification to SparseElementsAttr, and also adds additional verification for when a sparse index is not valid.
Differential Revision: https://reviews.llvm.org/D109189
* ODS generated operations extend _OperationBase and without this, cannot be marshalled to CAPI functions.
* No test case updates: this kind of interop is quite hard to verify with in-tree tests.
Differential Revision: https://reviews.llvm.org/D110030
Some patterns may share the common DAG structures. Generate a static
function to do the match logic to reduce the binary size.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105797
For `memref.subview` operations, when there are more than one
unit-dimensions, the strides need to be used to figure out which of
the unit-dims are actually dropped.
Differential Revision: https://reviews.llvm.org/D109418
Add an interface that allows grouping together all covolution and
pooling ops within Linalg named ops. The interface currently
- the indexing map used for input/image access is valid
- the filter and output are accessed using projected permutations
- that all loops are charecterizable as one iterating over
- batch dimension,
- output image dimensions,
- filter convolved dimensions,
- output channel dimensions,
- input channel dimensions,
- depth multiplier (for depthwise convolutions)
Differential Revision: https://reviews.llvm.org/D109793
This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.
This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).
Differential Revision: https://reviews.llvm.org/D108454
Add a new version of fusion on tensors that supports the following scenarios:
- support input and output operand fusion
- fuse a producer result passed in via tile loop iteration arguments (update the tile loop iteration arguments)
- supports only linalg operations on tensors
- supports only scf::for
- cannot add an output to the tile loop nest
The LinalgTileAndFuseOnTensors pass tiles the root operation and fuses its producers.
Reviewed By: nicolasvasilache, mravishankar
Differential Revision: https://reviews.llvm.org/D109766
Make use of runtime extension for the second reference counter used in
structured data region. This extension is implemented in D106510 and D106509.
Differential Revision: https://reviews.llvm.org/D106517