Adding the accumulator value after the `vector.contract` changes the
precision of the operation. This makes sure the accumulator is carried
through to `vector.reduce` (and down to LLVM).
Differential Revision: https://reviews.llvm.org/D128674
This patch adds a `convertFromStorage` field to attribute or type parameters that can implement more complex logic for converting from the parameter's C++ storage type (e.g. `Optional<SmallVector<T>>`) to its C++ type (e.g. `Optional<ArrayRef<T>>`).
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D128293
The parser currently can't parse bare identifiers like 'i0' in affine
maps and sets, and similarly ids like f16/f32. These bare ids are valid
according to the grammar, even though they also name primitive types.
```
error: expected bare identifier
set = affine_set<(i0, i1) : ()>
^
```
This patch allows the parser for AffineMap/IntegerSet to parse bare
identifiers as defined by the grammar.
Reviewed By: bondhugula, rriddle
Differential Revision: https://reviews.llvm.org/D127076
Adds more test cases for sparse_tensor.BinaryOp, covering the different cases where the overlap/left/right region is implemented, empty, or identity.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D128383
Previously, the sparse_tensor.unary integration test did not contain cases that use `linalg.index` (previously unsupported). This commit adds test cases that use `linalg.index` operators.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D128460
This patch memoizes compatible LLVM types in `LLVM::isCompatibleType` in
order to avoid redundant work.
This is especially useful when the program is large and there are
multiple occurrences of some deeply nested LLVM struct types, in which
case this patch can yield a noticeable speedup.
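A minimal sketch of the memoization idea, not the actual
`LLVM::isCompatibleType` code. `Type` and `CompatibilityChecker` are
hypothetical stand-ins introduced only for illustration; the sketch assumes
types are uniqued so that pointer identity can serve as the cache key.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a uniqued type: a leaf, or a struct of elements.
struct Type {
  bool compatibleLeaf = true;         // only consulted for leaf types
  std::vector<const Type *> elements; // empty for leaf types
};

// Hypothetical checker that remembers types already known to be compatible.
struct CompatibilityChecker {
  std::unordered_set<const Type *> known;
  int visits = 0; // number of types actually inspected (for illustration)

  bool check(const Type *type) {
    assert(type && "expected a non-null type");
    if (known.count(type))
      return true; // already verified, skip the nested walk
    ++visits;
    if (type->elements.empty() && !type->compatibleLeaf)
      return false;
    for (const Type *element : type->elements)
      if (!check(element))
        return false;
    known.insert(type); // cache only positive answers
    return true;
  }
};
```

With a struct nested three levels deep whose elements repeat at every level,
each distinct type is inspected once instead of once per occurrence.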
Differential Revision: https://reviews.llvm.org/D127918
This patch adds three new LLVM intrinsic operations,
llvm.intr.vastart/copy/end, along with their translation from LLVM IR.
This effectively removes a restriction, imposed by 0126dcf1f0, where
non-external functions in LLVM dialect cannot be variadic. At that time
it was not clear how LLVM intrinsics are going to be modeled, which
indirectly affects va_start/copy/end, the core intrinsics used in
variadic functions. But now that LLVM intrinsics are modeled as normal
MLIR operations, it's not a problem anymore.
Differential Revision: https://reviews.llvm.org/D127540
This change updates all remaining bufferization patterns (except for scf.while) and the remaining bufferization infrastructure to infer the memory space whenever possible instead of falling back to "0". (If a default memory space is set in the bufferization options, we still fall back to that value if the memory space could not be inferred.)
Differential Revision: https://reviews.llvm.org/D128423
Add a failure return value and bufferization options argument. This is to keep a subsequent change smaller.
Differential Revision: https://reviews.llvm.org/D128278
These intrinsics will be needed to convert between fixed-length vectors
and scalable vectors.
This operation will be needed for VLS (vector-length specific)
vectorization, when interfacing with vector functions or intrinsics that
take scalable vectors as operands in a context where the length of our
vectors is known or assumed at compile time, but we still want to
generate scalable vector instructions.
Differential Revision: https://reviews.llvm.org/D127100
An optional thread_dim_mapping index array attribute specifies for each
virtual thread dimension, how it remaps 1-1 to a set of concrete processing
element resources (e.g. a CUDA grid dimension or a level of concrete nested
async parallelism). At this time, the specification is backend-dependent and
is not verified by the op, beyond being an index array attribute.
It is the responsibility of the lowering to interpret the index array in the
context of the concrete target the op is lowered to, or to ignore it when
the specification is ill-formed or unsupported for a particular target.
Differential Revision: https://reviews.llvm.org/D128633
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128422
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128581
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128580
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128579
This is useful because the result type of an op can sometimes be inferred from its body (e.g., `scf.if`). This will be utilized in subsequent changes.
Also introduces a new `getBufferType` interface method on BufferizableOpInterface. This method is useful for computing a bufferized block argument type with respect to OpOperand types of the parent op.
Differential Revision: https://reviews.llvm.org/D128420
This attribute is currently supported on AllocTensorOp only. Future changes will add support to other ops. Furthermore, the memory space is not propagated properly in all bufferization patterns and some of the core bufferization infrastructure. This will be addressed in a subsequent change.
Differential Revision: https://reviews.llvm.org/D128274
Seems to have been an accident of history and none of these had any reason to be restricted to FuncOp.
Differential Revision: https://reviews.llvm.org/D128614
This patch fixes:
llvm-project/mlir/lib/Dialect/Linalg/Transforms/SplitReduction.cpp:300:26:
error: comparison of integers of different signs: 'int64_t' (aka
'long') and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
This paves the way for integer-exact projection, and for supporting
non-division locals in subtraction, complement, and equality checks.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D127463
Static loop unrolling does not change the operation type, so the check can strictly require affine.store.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D128237
Adds a validation test for quant.region with an incompatible output spec.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D128245
When creating an scf.for without arguments, an scf.yield is automatically
created. Make sure we don't create a second one.
Differential Revision: https://reviews.llvm.org/D128405
Putting some direct use restrictions on tensor allocations in the
sparse case enables the use of simplifying assumptions in the
bufferization analysis.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D128463
Only the analysis part of the interface is implemented. The bufferization itself is performed by the SparseTensorConversion pass.
Differential Revision: https://reviews.llvm.org/D128138
spv.bitcast from a vector to a scalar expects the lower-numbered
components of the vector to map to the lower-ordered bits of
the scalar. That already matches how little endian stores
data in memory. So we just need to read and push to the back
of the vector sequentially.
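The component-to-bit mapping described above can be sketched as follows.
This is an illustration only, not the code from the patch; `packLowFirst`
and `unpackLowFirst` are hypothetical helper names, and the sketch assumes
8-bit components packed into a 32-bit scalar.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Component i of the vector lands in bits [8*i, 8*i + 8) of the scalar.
std::uint32_t packLowFirst(const std::vector<std::uint8_t> &components) {
  assert(components.size() <= 4 && "components must fit in 32 bits");
  std::uint32_t scalar = 0;
  for (std::size_t i = 0; i < components.size(); ++i)
    scalar |= static_cast<std::uint32_t>(components[i]) << (8 * i);
  return scalar;
}

// Inverse: read bytes from the low bits upward and push them back in order.
std::vector<std::uint8_t> unpackLowFirst(std::uint32_t scalar) {
  std::vector<std::uint8_t> components;
  for (unsigned i = 0; i < 4; ++i)
    components.push_back(static_cast<std::uint8_t>((scalar >> (8 * i)) & 0xFF));
  return components;
}
```

Packing the components 0x11, 0x22, 0x33, 0x44 gives 0x44332211, which is the
value a little-endian machine reads from those four bytes laid out in memory
in that order.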
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D128473
Add the reverse functions to the ViewLikeInterface's functions
`getMixedStrides`, `getMixedSizes`, and `getMixedOffsets`. The new functions
are useful to build view-like operations from an array of mixed static/dynamic
values.
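A rough sketch of what such a reverse direction might look like, under
stated assumptions: `MixedValue`, `kDynamic`, and `decompose` are
hypothetical names for illustration and are not the functions added by this
change. A mixed list is split into a static array, with a sentinel marking
dynamic entries, plus the dynamic values in order.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical sentinel marking "this entry is dynamic" in the static array.
constexpr std::int64_t kDynamic = std::numeric_limits<std::int64_t>::min();

// Hypothetical stand-in for a mixed entry: a constant or a named SSA value.
struct MixedValue {
  bool isStatic;
  std::int64_t constant; // meaningful only when isStatic
  std::string value;     // meaningful only when !isStatic
};

// Splits mixed entries into static values (with sentinels) and dynamic values.
void decompose(const std::vector<MixedValue> &mixed,
               std::vector<std::int64_t> &staticVals,
               std::vector<std::string> &dynamicVals) {
  for (const MixedValue &entry : mixed) {
    if (entry.isStatic) {
      assert(entry.constant != kDynamic && "constant collides with sentinel");
      staticVals.push_back(entry.constant);
    } else {
      staticVals.push_back(kDynamic);
      dynamicVals.push_back(entry.value);
    }
  }
}
```

For the mixed list (4, %n, 1), the static array becomes (4, kDynamic, 1) and
the dynamic list holds only %n.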
Differential Revision: https://reviews.llvm.org/D128376
Such situations manifest themselves with an empty payload which ends up producing empty results.
In such cases, we still want to match the transform op contract and return as many empty SmallVector<Operation*>
as the op requires.
Differential Revision: https://reviews.llvm.org/D128456
All bufferizable ops that bufferize to an allocation receive a `bufferization.escape` attribute during TensorCopyInsertion.
Differential Revision: https://reviews.llvm.org/D128137
Command line options injected by the tablegen rule cannot be respected by
PDLL here, so add a new helper function that is a copy of the original
without any additional flags injected. This avoids a compilation failure
when compiler warnings are disabled.
Kept it as a mechanical copy.
Fixes #55716
The result of applying an N-result producing transformation to M payload ops
is an M-wide result, each containing N result operations.
This requires a transposition of the results obtained by calling `applyToOne`.
This revision fixes the issue and adds more advanced tests that exercise the behavior.
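The transposition can be sketched as below. This is an illustration under
assumptions, not the code from this revision: `Op` is a hypothetical
stand-in for `Operation *`, and `transposeResults` is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Op = std::string; // hypothetical stand-in for Operation *

// perPayload[i][j] is result j obtained for payload op i (M rows, N entries).
// Returns perResult[j][i]: for each of the N results, the M ops it maps to.
std::vector<std::vector<Op>>
transposeResults(const std::vector<std::vector<Op>> &perPayload,
                 std::size_t numResults) {
  std::vector<std::vector<Op>> perResult(numResults);
  for (const std::vector<Op> &row : perPayload) {
    assert(row.size() == numResults && "each application must yield N results");
    for (std::size_t j = 0; j < numResults; ++j)
      perResult[j].push_back(row[j]);
  }
  return perResult;
}
```

Note that with an empty payload the sketch still returns N (empty) result
lists, one per declared result.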
Differential Revision: https://reviews.llvm.org/D128414
This revision proposes a different implementation of the SplitReduction transformation that does
not rely on tensor::ExpandShapeOp.
Previously, a dimension `[k]` would be split into `[k][kk]` via an ExpandShapeOp.
Instead, this revision proposes to rewrite `[k]` into `[factor * k + kk]`.
There are different tradeoffs involved but the proposed implementation is more general because
the affine rewrite is well-defined. In particular, it works naturally with `?` parallel dimensions and
non-trivial indexing maps.
A further rewrite of `[factor * k + kk]` + ExpandShapeOp is possible as a followup.
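As a scalar illustration of the index rewrite (a sketch only, not the
Linalg implementation): a sum over one reduction dimension is split into
partial sums indexed by `k`, each accumulating the elements at
`factor * k + kk`. The helper name `splitReduceSum` is hypothetical, and the
sketch assumes the reduction size is a multiple of `factor`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums `in` by first reducing over kk for each k, then over k.
long long splitReduceSum(const std::vector<long long> &in, std::size_t factor) {
  assert(factor > 0 && in.size() % factor == 0 &&
         "sketch assumes the size is a multiple of factor");
  const std::size_t outer = in.size() / factor;
  std::vector<long long> partial(outer, 0);
  for (std::size_t k = 0; k < outer; ++k)
    for (std::size_t kk = 0; kk < factor; ++kk)
      partial[k] += in[factor * k + kk]; // the rewritten index
  long long total = 0;
  for (long long value : partial)
    total += value; // final reduction over the new dimension
  return total;
}
```

Because the input is addressed through the affine expression rather than
reshaped, no expanded copy of the operand is needed in this sketch.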
Differential Revision: https://reviews.llvm.org/D128266
The binary size impact on `clang` is trivial: the numerical value doesn't
change when measured in MiB, and the `.data` section increases from 139 Ki to
173 Ki.
Differential Revision: https://reviews.llvm.org/D128070
This revision makes sure we accept sparse tensors as arguments
of the expand/collapse reshaping operations in the tensor dialect.
Note that the actual lowering to runnable IR is still TBD.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D128311
The Presburger library currently uses int64_t throughout for its integers.
This runs the risk of silently producing incorrect results when overflows occur.
Fixing this issue requires some sort of multiprecision integer
that transparently supports arbitrary arithmetic computations.
The class SlowMPInt provides this functionality, and is intended to be used
as the slow path fallback for a more optimized upcoming class, MPInt, that optimizes
for the Presburger library's workloads.
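To illustrate the overflow risk with plain int64_t, here is a small sketch of
an overflow-checked addition. It is not part of SlowMPInt or the Presburger
library; `checkedAdd` and `addExact` are hypothetical helpers showing the
kind of check a fast path would need before falling back to a
multiprecision type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stores a + b in `out` and returns true, or returns false on overflow.
bool checkedAdd(std::int64_t a, std::int64_t b, std::int64_t &out) {
  const std::int64_t maxVal = std::numeric_limits<std::int64_t>::max();
  const std::int64_t minVal = std::numeric_limits<std::int64_t>::min();
  if (b > 0 && a > maxVal - b)
    return false; // would exceed the maximum
  if (b < 0 && a < minVal - b)
    return false; // would go below the minimum
  out = a + b;
  return true;
}

// Adds two values, asserting that the result is representable.
std::int64_t addExact(std::int64_t a, std::int64_t b) {
  std::int64_t out = 0;
  const bool ok = checkedAdd(a, b, out);
  assert(ok && "int64_t overflow");
  (void)ok;
  return out;
}
```

Without such a check, signed overflow in C++ is undefined behavior, so an
incorrect result may go unnoticed.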
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D123758
The erf and round ops can now be lowered to libm, with vector type support like the other math operations.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D127934
Marking bufferization allocation operation as invalid
during sparse lowering is too strict, since dense and
sparse allocation can co-exist. This revision refines
the lowering with a dynamic type check.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D128305
This patch implements tile and fuse transformation for ops that
implement the tiling interface. To do so,
- `TilingInterface` needs a new method that generates a tiled
implementation of the operation based on the tile of the result
needed.
- A pattern is added that replaces a `tensor.extract_slice` whose
source is defined by an operation that implements the
`TilingInterface` with a tiled implementation that produces the
extracted slice in-place (using the method added to
`TilingInterface`).
- A pattern is added that takes a sequence of operations that
implement the `TilingInterface` (for now `LinalgOp`s), tiles the
consumer, and greedily fuses its producers iteratively.
Differential Revision: https://reviews.llvm.org/D127809
Some GDB versions require all prettyprinter classes to define to_string.
This commit adds these definitions.
Reviewed By: csigg
Differential Revision: https://reviews.llvm.org/D127969
This revision separates the `LinalgSplitReduction` pattern, whose application is based on attributes,
from its implementation.
A transform dialect op extension is added to control the application of the transformation at a finer granularity.
Differential Revision: https://reviews.llvm.org/D128165
This revision adds the necessary plumbing for canonicalizing scf::ForeachThread with the
`AffineOpSCFCanonicalizationPattern`.
In the process the `loopMatcher` helper is updated to take OpFoldResult instead of just values.
This allows composing various scenarios without the need for an artificial builder.
Differential Revision: https://reviews.llvm.org/D128244
This patch adds omp.taskgroup operation according to OpenMP 5.0 2.17.6.
Also added tests for the same.
Reviewed By: kiranchandramohan, peixin
Differential Revision: https://reviews.llvm.org/D127250
In order to support newer hardware, define wrappers around MFMA
intrinsics that have not previously been exposed in the ROCDL dialect.
An `amdgpu.mfma` wrapper around these instructions is in development
and will provide a more user-friendly interface to them.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D128079
This aligns the SCF dialect file layout with the majority of the dialects.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D128049
Patch created by running:
rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/'
No behavior change.
Differential Revision: https://reviews.llvm.org/D128140
Follow-up to flipping dialects to _Both: flip the accessors used to the
prefixed variant ahead of flipping from _Both to _Prefixed. This just
switches to the accessors introduced in the preceding change, which are
the prefixed forms of the existing accessors.
Mechanical change using helper script
https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.
Marked all dialects that could be (reasonably) easily flipped to the _Both
prefix. Updating the accessors to the prefixed form will happen in a
follow-up. This change was to flush out conflicts and to mark all dialects
explicitly, as I plan to flip the OpBase default to _Prefixed to avoid
needing to migrate new dialects.
The exception is the Standalone example, which got flipped to _Prefixed.
Differential Revision: https://reviews.llvm.org/D128027
Support complex types of float and double. See the added test for an example.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D128076
This adds weak versions of the truncation libcalls in case the runtime
environment doesn't have them.
Differential Revision: https://reviews.llvm.org/D128091