This was leftover from when the standard dialect was destroyed, and
when FuncOp moved to the func dialect. Now that these transitions
have settled a bit we can drop these.
Most updates were handled using a simple regex: replace `^( *)func` with `$1func.func`
Differential Revision: https://reviews.llvm.org/D124146
This follows the same implementation strategy as scf::ForOp and common functionality is extracted into helper functions.
This implementation works well in cases where each yielded value (from either body/condition region) is equivalent to the corresponding bbArg of the parent block. In that case, each OpResult of the loop may be aliasing with the corresponding OpOperand of the loop (and with no other OpOperand).
In the absence of said equivalence relationship, new buffer copies must be inserted, so that the aliasing OpOperand/OpResult contract of scf::WhileOp is honored. In essence, by yielding a newly allocated buffer, we can enforce the specified may-alias relationship. (Newly allocated buffers cannot alias with any OpOperands of the loop.)
Differential Revision: https://reviews.llvm.org/D124929
Although we now have semi-rings to deal with arbitrary ops,
it is still good to convey zero-preserving semantics of
ops to the sparse compiler.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D125043
The fallback attribute parse path is parsing a Type attribute, but this results
in a really unintuitive error message: `expected non-function type`, which
doesn't really hint at tall that we were trying to parse an attribute. This
commit fixes this by trying to optionally parse a type, and on failure
emitting an error that we were expecting an attribute.
Differential Revision: https://reviews.llvm.org/D124870
The names of the functions that are supposed to be exported do not match the implementations. This is due in part to cac7aabbd8.
This change makes the implementations and declarations match and adds a couple missing declarations.
The new names follow the pattern of the existing `verify` functions where the prefix is maintained as `_mlir_ciface_` but the suffix follows the new naming convention.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124891
The NVVM dialect test coverage for all possible type/shape combinations
in the `nvvm.mma.sync` op is mostly complete. However, there were tests
missing for TF32 datatype support. This change adds tests for the one
relevant shape/type combination. This uncovered a small bug in the op
verifier, which this change also fixes.
Differential Revision: https://reviews.llvm.org/D124975
The previous error message was technically incorrect. We do not compare equivalence of YieldOp operands and ForOp operands.
Differential Revision: https://reviews.llvm.org/D124934
This commit relaxes the rules around ops that define a value but do not specify the tensor's contents. (The only such op at the moment is init_tensor.)
When such a tensor is written in a loop, it should not cause out-of-place bufferization.
Differential Revision: https://reviews.llvm.org/D124849
Add the mechanism for TransformState extensions to update the mapping between
Transform IR values and Payload IR operations held by the state. The mechanism
is intentionally restrictive, similarly to how results of the transform op are
handled.
Introduce test ops that exercise a simple extension that maintains information
across the application of multiple transform ops.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D124778
This patch restricts the value of `if` clause expression to an I1 value.
It also restricts the value of `num_threads` clause expression to an I32
value.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D124142
The current implementation uses a discrete "pdl_interp.inferred_types"
operation, which acts as a "fake" handle to a type range. This op is
used as a signal to pdl_interp.create_operation that types should be
inferred. This is terribly awkward and clunky though:
* This op doesn't have a byte code representation, and its conversion
to bytecode kind of assumes that it is only used in a certain way. The
current lowering is also broken and seemingly untested.
* Given that this is a different operation, it gives off the assumption
that it can be used multiple times, or that after the first use
the value contains the inferred types. This isn't the case though,
the resultant type range can never actually be used as a type range.
This commit refactors the representation by removing the discrete
InferredTypesOp, and instead adds a UnitAttr to
pdl_interp.CreateOperation that signals when the created operations
should infer their types. This leads to a much much cleaner abstraction,
a more optimal bytecode lowering, and also allows for better error
handling and diagnostics when a created operation doesn't actually
support type inferrence.
Differential Revision: https://reviews.llvm.org/D124587
MLIR has a common pattern for "arguments" that uses syntax
like `%x : i32 {attrs} loc("sourceloc")` which is implemented
in adhoc ways throughout the codebase. The approach this uses
is verbose (because it is implemented with parallel arrays) and
inconsistent (e.g. lots of things drop source location info).
Solve this by introducing OpAsmParser::Argument and make addRegion
(which sets up BlockArguments for the region) take it. Convert the
world to propagating this down. This means that we correctly
capture and propagate source location information in a lot more
cases (e.g. see the affine.for testcase example), and it also
simplifies much code.
Differential Revision: https://reviews.llvm.org/D124649
We currently emit an error during verification if a pdl.operation with non-inferrable
results is used within a rewrite. This allows for catching some errors during compile
time, but is slightly broken. For one, the verification at the PDL level assumes that
all dialects have been loaded, which is true at run time, but may not be true when
the PDL is generated (such as via PDLL). This commit fixes this by not emitting the
error if the operation isn't registered, i.e. it uses the `mightHave` variant of trait/interface
methods.
Secondly, we currently don't verify when a pdl.operation has no explicit results, but the
operation being created is known to expect at least one. This commit adds a heuristic
error to detect these cases when possible and fail. We can't always capture when the user
made an error, but we can capture the most common case where the user expected an
operation to infer its result types (when it actually isn't possible).
Differential Revision: https://reviews.llvm.org/D124583
pdl.attribute currently has a syntax ambiguity that leads to the incorrect parsing
of pdl.attribute operations with locations that don't also have a constant value. For example:
```
pdl.attribute loc("foo")
```
The above IR is treated as being a pdl.attribute with a constant value containing the location,
`loc("foo")`, which is incorrect. This commit changes the syntax to use `= <constant-value>` to
clearly distinguish when the constant value is present, as opposed to just trying to parse an attribute.
Differential Revision: https://reviews.llvm.org/D124582
This allows for using attribute types in result type inference for use with
InferTypeOpInterface. This was a TODO before, but it isn't much
additional work to properly support this. After this commit,
arith::ConstantOp can now have its InferTypeOpInterface implementation automatically
generated.
Differential Revision: https://reviews.llvm.org/D124580
The asm parser had a notional distinction between parsing an
operand (like "%foo" or "%4#3") and parsing a region argument
(which isn't supposed to allow a result number like #3).
Unfortunately the implementation has two problems:
1) It didn't actually check for the result number and reject
it. parseRegionArgument and parseOperand were identical.
2) It had a lot of machinery built up around it that paralleled
operand parsing. This also was functionally identical, but
also had some subtle differences (e.g. the parseOptional
stuff had a different result type).
I thought about just removing all of this, but decided that the
missing error checking was important, so I reimplemented it with
a `allowResultNumber` flag on parseOperand. This keeps the
codepaths unified and adds the missing error checks.
Differential Revision: https://reviews.llvm.org/D124470
This adds a cast operation that allows to perform an explicit type
conversion. The cast op is emitted as a C-style cast. It can be applied
to integer, float, index and EmitC types.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D123514
Fordbids to express pointer via the `!emitc.opaque` type. Point the user
to use the `!emitc.ptr` type instead.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D124002
The verifier of llvm.mlir.addressof did not properly account for opaque pointers, that is, the pointer type not having an element type equal to the type of the referenced global or function. This patch fixes that by skipping the test for the element type if the pointer is opaque.
Differential Revision: https://reviews.llvm.org/D124333
After https://reviews.llvm.org/D119743 added the `AutomaticAllocationScope`
trait to loop-like constructs, the vector transfer full/partial splitting pass
started inserting allocations for temporaries within the closest loop rather
than the closest function (or other allocation scope such as `async.execute`).
While this is correct as long as the lowered code takes care of automatic
deallocation at the end of each iteration of the loop, this interferes with
downstream optimizations that expect `alloca`s to be at the function level.
Step over loops when looking for the closest allocation scope in vector
transfer full/partial splitting pass thus restoring the original behavior.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D124366
This is likely preferable to having it crash if one were to specify an opaque pointer type, and the actual element type is unused either way.
Differential Revision: https://reviews.llvm.org/D124334
The SparseTensor passes currently use opaque numbers for the CLI, despite using an enum internally. This patch exposes the enums instead of numbered items that are matched back to the enum.
Fixes GitHub issue #53389
Reviewed by: aartbik, mehdi_amini
Differential Revision: https://reviews.llvm.org/D123876
Run `one-shot-bufferize` instead of `linalg-comprehensive-module-bufferize` and move some test cases to their respective dialects.
Differential Revision: https://reviews.llvm.org/D124323
As a fallback mechanism, if no entry was supplied for a given address space, the size or alignment for a pointer type with the default address space is returned instead.
This code currently crashes with opaque pointers, as it tries to construct a typed pointer type from the opaque pointer type, leading to a null pointer dereference when fetching the element type.
This patch fixes the issue by handling the opaque pointer cases explicitly.
Differential Revision: https://reviews.llvm.org/D124290
This change fixes `CollapsedLayoutMap` for cases where the collapsed
dims are size 1. The cases where inner most dims are size 1 and
noncontiguous can be represented by the strided form and therefore can
be allowed. For such cases, the new stride should be of the next entry
in an association whose dimension is not size 1. If the next entry is
dynamic, it's not possible to decide which stride to use at compilation
time and the stride is set to dynamic.
Differential Revision: https://reviews.llvm.org/D124137
Currently, the sequence of Transform dialect operations only supports a single
use of each operand (verified by the `transform.sequence` operation). This was
originally motivated by the need to guard against accessing a payload IR
operation associated with a transform IR value after this operation has likely
been rewritten by a transformation. However, not all Transform dialect
operations rewrite payload IR, in particular the "navigation" operation such as
`transform.pdl_match` do not.
Introduce memory effects to the Transform dialect operations to describe their
effect on the payload IR and the mapping between payload IR opreations and
transform IR values. Use these effects to replace the single-use rule, allowing
repeated reads and disallowing use-after-free, where operations with the "free"
effect are considered to "consume" the transform IR value and rewrite the
corresponding payload IR operations). As an additional improvement, this
enables code motion transformation on the transform IR itself.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D124181
The bubble up logic was written by assuming the slice operation is
always a normal slice that outputs a tensor with the same rank.
Differential Revision: https://reviews.llvm.org/D124283
If there is only one single element in the vector, then we can
just extract the element to compute the final result.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D124129
vector.broadcast can inject all size one dimensions. If it's
followed by a vector.shape_cast to the original type, we can
cancel the op pair, like cancelling consecutive shape_cast ops.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D124094
* Move Module Bufferization to the bufferization dialect. The implementation is split into `OneShotModuleBufferize.cpp` and `FuncBufferizableOpInterfaceImpl.cpp`, so that the external model implementation can be easily moved to the func dialect in the future.
* Split and clean up test cases. A few test cases are still remaining in Linalg and will be updated separately.
* `linalg.inplaceable` is renamed to `bufferization.writable` to accurately reflect its current usage.
* Attributes and their verifiers are moved from the Linalg dialect to the Bufferization dialect.
* Expand documentation.
* Add a new flag to One-Shot Bufferize to allow for function boundary bufferization.
Differential Revision: https://reviews.llvm.org/D122229
FuncOps are now less special. They must still be analyzed + bufferized in a certain order, but they are now bufferized same as other ops that have a region: Bufferize the op first (`bufferize` interface method), then bufferize the region body with other bufferization patterns. In the case of FuncOps, the function signature is bufferized together with ReturnOps. Similar to how, e.g., scf.for ops are bufferized together with scf.yield ops.
This change is essentially a reimplementation of the FuncOp bufferization, but mostly NFC from a user's perspective (apart from error messages). This change is in preparation of moving the code to the bufferization dialect.
Differential Revision: https://reviews.llvm.org/D123214
The bufferization driver was previously using a GreedyPatternRewriter. This was problematic because bufferization must traverse ops top-to-bottom. The GreedyPatternRewriter was previously configured via `useTopDownTraversal`, but this was a hack; this API was just meant for performance improvements and should not affect the result of the rewrite.
BEGIN_PUBLIC
No public commit message needed.
END_PUBLIC
Differential Revision: https://reviews.llvm.org/D123618
This patch replaces current fold function with the common constant fold funtion in order to cover the situation of constant splat.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D124236
Insert the select op before the combiner op when vectorizing a
reduction loop that needs a mask, so the vectorized reduction loop
can pass isLoopParallel check and be transformed correctly in later
passes.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D124047
When Location tracking support for block arguments was added, we
discussed various approaches to threading support for this through
function-like argument parsing. At the time, we added a parallel array
of locations that could hold this. It turns out that that approach was
verbose and error prone, roughly no one adopted it.
This patch takes a different approach, adding an optional source
locator to the UnresolvedOperand class. This fits much more naturally
into the standard structure we use for representing locators, and gives
all the function like dialects locator support for free (e.g. see the
test adding an example for the LLVM dialect).
Differential Revision: https://reviews.llvm.org/D124188
This introduces a pair of ops to the Transform dialect that connect it to PDL
patterns. Transform dialect relies on PDL for matching the Payload IR ops that
are about to be transformed. For this purpose, it provides a container op for
patterns, a "pdl_match" op and transform interface implementations that call
into the pattern matching infrastructure.
To enable the caching of compiled patterns, this also provides the extension
mechanism for TransformState. Extensions allow one to store additional
information in the TransformState and thus communicate it between different
Transform dialect operations when they are applied. They can be added and
removed when applying transform ops. An extension containing a symbol table in
which the pattern names are resolved and a pattern compilation cache is
introduced as the first client.
Depends On D123664
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D124007
Add async dependencies support for gpu.launch op: this allows specifying
a list of async tokens ("streams") as dependencies for the launch.
Update the GPU kernel outlining pass lowering to propagate async
dependencies from gpu.launch to gpu.launch_func op. Previously, a new
stream was being created and destroyed for a kernel launch. The async
deps support allows the kernel launch to be serialized on an existing
stream.
Differential Revision: https://reviews.llvm.org/D123499