The InlineAsmOp mirrors the underlying LLVM semantics with a notable
exception: the embedded `asm_string` is not allowed to define or reference
any symbol or any global variable: only the operands of the op may be read,
written, or referenced.
Attempting to define or reference any symbol or any global behavior is
considered undefined behavior at this time.
The asm dialect syntax is currently specified with an integer (0 [default] for the "att dialect", 1 for the intel dialect) to circumvent the ODS limitation on string enums.
Translation to LLVM is provided and raises the fact that the asm constraints string must be well-formed with respect to in/out operands. No check is performed on the asm_string.
An InlineAsm instruction in LLVM is a special call operation to a function that is constructed on the fly.
It does not fit the current model of MLIR calls with symbols.
As a consequence, the current implementation constructs the function type in ModuleTranslation.cpp.
This should be refactored in the future.
The mlir-cpu-runner is augmented with the global initialization of the X86 asm parser to allow proper execution in JIT mode. Previously, only the X86 asm printer was initialized.
Differential revision: https://reviews.llvm.org/D92166
* If ODS redefines this, it is fine, but I have found this accessor to be universally useful in the old npcomp bindings and I'm closing gaps that will let me switch.
Differential Revision: https://reviews.llvm.org/D92287
* Add capsule get/create for Attribute and Type, which already had capsule interop defined.
* Add capsule interop and get/create for Location.
* Add Location __eq__.
* Use get() and implicit cast to go from PyAttribute, PyType, PyLocation to MlirAttribute, MlirType, MlirLocation (bundled with this change because I didn't want to continue the pattern one more time).
Differential Revision: https://reviews.llvm.org/D92283
Op with mapping from ops to corresponding shape functions for those op
in the library and mechanism to associate shape functions to functions.
The mapping of operand to shape function is kept separate from the shape
functions themselves as the operation is associated to the shape
function and not vice versa, and one could have a common library of
shape functions that can be used in different contexts.
Use fully qualified names and require a name for shape fn lib ops for
now and an explicit print/parse (based around the generated one & GPU
module op ones).
This commit reverts d9da4c3e73. Fixes
missing headers (don't know how that was working locally).
Differential Revision: https://reviews.llvm.org/D91672
Op with mapping from ops to corresponding shape functions for those op
in the library and mechanism to associate shape functions to functions.
The mapping of operand to shape function is kept separate from the shape
functions themselves as the operation is associated to the shape
function and not vice versa, and one could have a common library of
shape functions that can be used in different contexts.
Use fully qualified names and require a name for shape fn lib ops for
now and an explicit print/parse (based around the generated one & GPU
module op ones).
Differential Revision: https://reviews.llvm.org/D91672
A splat attribute have a single element during printing so we should
treat it as such when we decide if we elide it or not based on the flag
intended to elide large attributes.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D92165
The ops are very similar to the std variants, but support async GPU execution.
gpu.alloc does not currently support an alignment attribute, and the new ops do not have
canonicalizers/folders like their std siblings do.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D91698
The rewrite logic has an optimization to drop a cast operation after
rewriting block arguments if the cast operation has no users. This is
unsafe as there might be a pending rewrite that replaced the cast operation
itself and hence would trigger a second free.
Instead, do not remove the casts and leave it up to a later canonicalization
to do so.
Differential Revision: https://reviews.llvm.org/D92184
This enables partial bufferization that includes function signatures. To test this, this
change also makes the func-bufferize partial and adds a dedicated finalizing-bufferize pass.
Differential Revision: https://reviews.llvm.org/D92032
This change gives sparse compiler clients more control over selecting
individual types for the pointers and indices in the sparse storage schemes.
Narrower width obviously results in smaller memory footprints, but the
range should always suffice for the maximum number of entries or index value.
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D92126
This CL adds the ability to request different parallelization strategies
for the generate code. Every "parallel" loop is a candidate, and converted
to a parallel op if it is an actual for-loop (not a while) and the strategy
allows dense/sparse outer/inner parallelization.
This will connect directly with the work of @ezhulenev on parallel loops.
Still TBD: vectorization strategy
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D91978
SameOperandsAndResultShape and ElementwiseMappable have similar
verification, but in general neither is strictly redundant with the
other.
Examples:
- SameOperandsAndResultShape allows
`"foo"(%0) : tensor<2xf32> -> tensor<?xf32> but ElementwiseMappable
does not.
- ElementwiseMappable allows
`select %scalar_pred, %true_tensor, %false_tensor` but
SameOperandsAndResultShape does not.
SameOperandsAndResultShape is redundant with ElementwiseMappable when
we can prove that the mixed scalar/non-scalar case cannot happen. In
those situations, `ElementwiseMappable & SameOperandsAndResultShape ==
ElementwiseMappable`:
- Ops with 1 operand: the case of mixed scalar and non-scalar operands
cannot happen since there is only one operand.
- When SameTypeOperands is also present, the mixed scalar/non-scalar
operand case cannot happen.
Differential Revision: https://reviews.llvm.org/D91396
Generalizes invariant handling to anything defined outside the Linalg op
(parameters and SSA computations). Fixes bug that was using parameter number
as tensor number.
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D91985
Introduce a conversion pass from SCF parallel loops to OpenMP dialect
constructs - parallel region and workshare loop. Loops with reductions are not
supported because the OpenMP dialect cannot model them yet.
The conversion currently targets only one level of parallelism, i.e. only
one top-level `omp.parallel` operation is produced even if there are nested
`scf.parallel` operations that could be mapped to `omp.wsloop`. Nested
parallelism support is left for future work.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D91982
* Was missed in the initial submission and is required for a ConstantLike op.
* Also adds a materializeConstant hook to preserve it.
* Tightens up the argument constraint on tosa.const to match what is actually legal.
Differential Revision: https://reviews.llvm.org/D92040
This revision will make it easier to create new ops base on the strided memref abstraction outside of the std dialect.
OffsetSizeAndStrideOpInterface is an interface for ops that allow specifying mixed dynamic and static offsets, sizes and strides variadic operands.
Ops that implement this interface need to expose the following methods:
1. `getArrayAttrRanks` to specify the length of static integer
attributes.
2. `offsets`, `sizes` and `strides` variadic operands.
3. `static_offsets`, resp. `static_sizes` and `static_strides` integer
array attributes.
The invariants of this interface are:
1. `static_offsets`, `static_sizes` and `static_strides` have length
exactly `getArrayAttrRanks()`[0] (resp. [1], [2]).
2. `offsets`, `sizes` and `strides` have each length at most
`getArrayAttrRanks()`[0] (resp. [1], [2]).
3. if an entry of `static_offsets` (resp. `static_sizes`,
`static_strides`) is equal to a special sentinel value, namely
`ShapedType::kDynamicStrideOrOffset` (resp. `ShapedType::kDynamicSize`,
`ShapedType::kDynamicStrideOrOffset`), then the corresponding entry is
a dynamic offset (resp. size, stride).
4. a variadic `offset` (resp. `sizes`, `strides`) operand must be present
for each dynamic offset (resp. size, stride).
This interface is useful to factor out common behavior and provide support
for carrying or injecting static behavior through the use of the static
attributes.
Differential Revision: https://reviews.llvm.org/D92011
This file is intended to be included by other files, including
out-of-tree dialects, and makes more sense in `include` than in `lib`.
Depends On D91652
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D91961
Attributes represent additional data about an operation and are intended to be
modifiable during the lifetime of the operation. In the dialect-specific Python
bindings, attributes are exposed as properties on the operation class. Allow
for assigning values to these properties. Also support creating new and
deleting existing attributes through the generic "attributes" property of an
operation. Any validity checking must be performed by the op verifier after the
mutation, similarly to C++. Operations are not invalidated in the process: no
dangling pointers can be created as all attributes are owned by the context and
will remain live even if they are not used in any operation.
Introduce a Python Test dialect by analogy with the Test dialect and to avoid
polluting the latter with Python-specific constructs. Use this dialect to
implement a test for the attribute access and mutation API.
Reviewed By: stellaraccident, mehdi_amini
Differential Revision: https://reviews.llvm.org/D91652
It is a simple conversion that only requires to change the region argument
types, generalize it from ParallelOp.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D91989
While this makes the unit tests a bit more verbose, this simplifies the creation of bindings because only the bidirectional mapping between the host language's string type and MlirStringRef need to be implemented.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D91905
Enhance the tile+fuse logic to allow fusing a sequence of operations.
Make sure the value used to obtain tile shape is a
SubViewOp/SubTensorOp. Current logic used to get the bounds of loop
depends on the use of `getOrCreateRange` method on `SubViewOp` and
`SubTensorOp`. Make sure that the value/dim used to compute the range
is from such ops. This fix is a reasonable WAR, but a btter fix would
be to make `getOrCreateRange` method be a method of `ViewInterface`.
Differential Revision: https://reviews.llvm.org/D90991
Previously, there was no way to add context to the diagnostic engine via the C API. Adding this ability makes it much easier to reason about memory ownership, particularly in reference-counted languages such as Swift. There are more details in the review comments.
Reviewed By: ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D91738
An SCF 'for' loop does not iterate if its lower bound is equal to its upper
bound. Remove loops where both bounds are the same SSA value as such bounds are
guaranteed to be equal. Similarly, remove 'parallel' loops where at least one
pair of respective lower/upper bounds is specified by the same SSA value.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D91880
The existing implementation of the conversion from SCF Parallel operation to
SCF "for" loops in order to further convert those loops to branch-based CFG has
been cloning the loop and reduction body operations into the new loop because
ConversionPatternRewriter was missing support for moving blocks while replacing
their arguments. This functionality now available, use it to implement the
conversion and avoid cloning operations, which may lead to doubling of the IR
size during the conversion.
In addition, this fixes an issue with converting nested SCF "if" conditionals
present in "parallel" operations that would cause the conversion infrastructure
to stop because of the repeated application of the pattern converting "newly"
created "if"s (which were in fact just moved). Arguably, this should be fixed
at the infrastructure level and this fix is a workaround.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D91955
This revision refactors code used in various Linalg transformations and makes it a first class citizen to the LinalgStructureOpInterface. This is in preparation to allowing more advanced Linalg behavior but is otherwise NFC.
Differential revision: https://reviews.llvm.org/D91863
- Fixes bug 48242 point 3 crash.
- Makes the improvments from points 1 & 2.
https://bugs.llvm.org/show_bug.cgi?id=48262
```
def RTLValueType : Type<CPred<"isRTLValueType($_self)">, "Type"> {
string cppType = "::mlir::Type";
}
```
Works now, but merely by happenstance. Parameters expects a `TypeParameter` class def or a string representing a c++ type but doesn't enforce it.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D91939
Adds tests for full sum reduction (tensors summed up into scalars)
and the well-known sampled-dense-dense-matrix-product. Refines
the optimizations rules slightly to handle the summation better.
Reviewed By: penpornk
Differential Revision: https://reviews.llvm.org/D91818
Add transformation to be able to forward transfer_write into transfer_read
operation and to be able to remove dead transfer_write when a transfer_write is
overwritten before being read.
Differential Revision: https://reviews.llvm.org/D91321
Block merging in MLIR will incorrectly merge blocks with operations whose values are used outside of that block. This change forbids this behavior and provides a test where it is illegal to perform such a merge.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D91745
Add canoncalization patterns to remove zero-iteration 'for' loops, replace
single-iteration 'for' loops with their bodies; remove known-false conditionals
with no 'else' branch and replace conditionals with known value by the
respective region. Although similar transformations are performed at the CFG
level, not all flows reach that level, e.g., the GPU flow may want to remove
single-iteration loops before deciding on loop mapping to thread dimensions.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D91865
This canonicalization is useful to resolve loads into scalar values when
doing partial bufferization.
Differential Revision: https://reviews.llvm.org/D91855
This reverts commit f8284d21a8.
Revert "[mlir][Linalg] NFC: Expose some utility functions used for promotion."
This reverts commit 0c59f51592.
Revert "Remove unused isZero function"
This reverts commit 0f9f0a4046.
Change f8284d21 led to multiple failures in IREE compilation.
Depends On D89963
**Automatic reference counting algorithm outline:**
1. `ReturnLike` operations forward the reference counted values without
modifying the reference count.
2. Use liveness analysis to find blocks in the CFG where the lifetime of
reference counted values ends, and insert `drop_ref` operations after
the last use of the value.
3. Insert `add_ref` before the `async.execute` operation capturing the
value, and pairing `drop_ref` before the async body region terminator,
to release the captured reference counted value when execution
completes.
4. If the reference counted value is passed only to some of the block
successors, insert `drop_ref` operations in the beginning of the blocks
that do not have reference coutned value uses.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D90716
Null types are commonly used as an error marker. Catch them in the constructor
of Operation if they are present in the result type list, as otherwise this
could lead to further surprising behavior when querying op result types.
Fix AsyncToLLVM and StandardToLLVM that were using null types when constructing
operations.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D91770
This commit does the renaming mentioned in the title in order to bring
'spv' dialect closer to the MLIR naming conventions.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D91792