Fixes a bug in affine fusion pipeline where an incorrect fusion is performed
despite a Call Op that potentially modifies memrefs under consideration
exists between source and target.
Fixes part of https://bugs.llvm.org/show_bug.cgi?id=49220
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D97252
This patch handles defining ops between the source and dest loop nests, and prevents loop nests with `iter_args` from being fused.
If there is any SSA value in the dest loop nest whose defining op has dependence from the source loop nest, we cannot fuse the loop nests.
If there is a `affine.for` with `iter_args`, prevent it from being fused.
Reviewed By: dcaballe, bondhugula
Differential Revision: https://reviews.llvm.org/D97030
Affine parallel ops may contain and yield results from MemRefsNormalizable ops in the loop body. Thus, both affine.parallel and affine.yield should have the MemRefsNormalizable trait.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D96821
This commit fixes a bug in affine fusion pipeline where an
incorrect fusion is performed despite a dealloc op is present
between a producer and a consumer. This is done by creating a
node for dealloc op in the MDG.
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D97032
This commit introduced a cyclic dependency:
Memref dialect depends on Standard because it used ConstantIndexOp.
Std depends on the MemRef dialect in its EDSC/Intrinsics.h
Working on a fix.
This reverts commit 8aa6c3765b.
Create the memref dialect and move several dialect-specific ops without
dependencies to other ops from std dialect to this dialect.
Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
DeallocOp -> MemRef_DeallocOp
MemRefCastOp -> MemRef_CastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
TransposeOp -> MemRef_TransposeOp
ViewOp -> MemRef_ViewOp
The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667
Differential Revision: https://reviews.llvm.org/D96425
This revision takes advantage of the newly extended `ref` directive in assembly format
to allow better region handling for LinalgOps. Specifically, FillOp and CopyOp now build their regions explicitly which allows retiring older behavior that relied on specific op knowledge in both lowering to loops and vectorization.
This reverts commit 3f22547fd1 and reland 973e133b76 with a workaround for
a gcc bug that does not accept lambda default parameters:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59949
Differential Revision: https://reviews.llvm.org/D96598
This reverts commit 973e133b76.
It triggers an issue in gcc5 that require investigation, the build is
broken with:
/tmp/ccdpj3B9.s: Assembler messages:
/tmp/ccdpj3B9.s:5821: Error: symbol `_ZNSt17_Function_handlerIFvjjEUljjE2_E9_M_invokeERKSt9_Any_dataOjS6_' is already defined
/tmp/ccdpj3B9.s:5860: Error: symbol `_ZNSt14_Function_base13_Base_managerIUljjE2_E10_M_managerERSt9_Any_dataRKS3_St18_Manager_operation' is already defined
This revision takes advantage of the newly extended `ref` directive in assembly format
to allow better region handling for LinalgOps. Specifically, FillOp and CopyOp now build their regions explicitly which allows retiring older behavior that relied on specific op knowledge in both lowering to loops and vectorization.
Differential Revision: https://reviews.llvm.org/D96598
In dialect conversion infrastructure, source materialization applies as part of
the finalization procedure to results of the newly produced operations that
replace previously existing values with values having a different type.
However, such operations may be created to replace operations created in other
patterns. At this point, it is possible that the results of the _original_
operation are still in use and have mismatching types, but the results of the
_intermediate_ operation that performed the type change are not in use leading
to the absence of source materialization. For example,
%0 = dialect.produce : !dialect.A
dialect.use %0 : !dialect.A
can be replaced with
%0 = dialect.other : !dialect.A
%1 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %1 : !dialect.A
and then with
%0 = dialect.final : !dialect.B
%1 = dialect.other : !dialect.A // replaced, scheduled for removal
%2 = dialect.produce : !dialect.A // replaced, scheduled for removal
dialect.use %2 : !dialect.A
in the same rewriting, but only the %1->%0 replacement is currently considered.
Change the logic in dialect conversion to look up all values that were replaced
by the given value and performing source materialization if any of those values
is still in use with mismatching types. This is performed by computing the
inverse value replacement mapping. This arguably expensive manipulation is
performed only if there were some type-changing replacements. An alternative
could be to consider all replaced operations and not only those that resulted
in type changes, but it would harm pattern-level composability: the pattern
that performed the non-type-changing replacement would have to be made aware of
the type converter in order to call the materialization hook.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95626
We could extend this with an interface to allow dialect to perform a type
conversion, but that would make the folder creating operation which isn't
the case at the moment, and isn't necessarily always desirable.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95991
In dialect conversion, signature conversions essentially perform block argument
replacement and are added to the general value remapping. However, the replaced
values were not tracked, so if a signature conversion was rolled back, the
construction of operand lists for the following patterns could have obtained
block arguments from the mapping and give them to the pattern leading to
use-after-free. Keep track of signature conversions similarly to normal block
argument replacement, and erase such replacements from the general mapping when
the conversion is rolled back.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D95688
Currently, for a scf.parallel (i,j,k) after the loop collapsing to 1D is done, the
IVs would be traversed as for an scf.parallel(k,j,i).
Differential Revision: https://reviews.llvm.org/D95693
This patch adds support for producer-consumer fusion scenarios with
multiple producer stores to the AffineLoopFusion pass. The patch
introduces some changes to the producer-consumer algorithm, including:
* For a given consumer loop, producer-consumer fusion iterates over its
producer candidates until a fixed point is reached.
* Producer candidates are gathered beforehand for each iteration of the
consumer loop and visited in reverse program order (not strictly guaranteed)
to maximize the number of loops fused per iteration.
In general, these changes were needed to simplify the multi-store producer
support and remove some of the workarounds that were introduced in the past
to support more fusion cases under the single-store producer limitation.
This patch also preserves the existing functionality of AffineLoopFusion with
one minor change in behavior. Producer-consumer fusion didn't fuse scenarios
with escaping memrefs and multiple outgoing edges (from a single store).
Multi-store producer scenarios will usually (always?) have multiple outgoing
edges so we couldn't fuse any with escaping memrefs, which would greatly limit
the applicability of this new feature. Therefore, the patch enables fusion for
these scenarios. Please, see modified tests for specific details.
Reviewed By: andydavis1, bondhugula
Differential Revision: https://reviews.llvm.org/D92876
This patch adds support for producer-consumer fusion scenarios with
multiple producer stores to the AffineLoopFusion pass. The patch
introduces some changes to the producer-consumer algorithm, including:
* For a given consumer loop, producer-consumer fusion iterates over its
producer candidates until a fixed point is reached.
* Producer candidates are gathered beforehand for each iteration of the
consumer loop and visited in reverse program order (not strictly guaranteed)
to maximize the number of loops fused per iteration.
In general, these changes were needed to simplify the multi-store producer
support and remove some of the workarounds that were introduced in the past
to support more fusion cases under the single-store producer limitation.
This patch also preserves the existing functionality of AffineLoopFusion with
one minor change in behavior. Producer-consumer fusion didn't fuse scenarios
with escaping memrefs and multiple outgoing edges (from a single store).
Multi-store producer scenarios will usually (always?) have multiple outgoing
edges so we couldn't fuse any with escaping memrefs, which would greatly limit
the applicability of this new feature. Therefore, the patch enables fusion for
these scenarios. Please, see modified tests for specific details.
Reviewed By: andydavis1, bondhugula
Differential Revision: https://reviews.llvm.org/D92876
Add a check if regions do not implement the RegionBranchOpInterface. This is not
allowed in the current deallocation steps. Furthermore, we handle edge-cases,
where a single region is attached and the parent operation has no results.
This fixes: https://bugs.llvm.org/show_bug.cgi?id=48575
Differential Revision: https://reviews.llvm.org/D94586
The standard and gpu dialect both have `alloc` operations which use the
memory effect `MemAlloc`. In both cases, it is specified on both the
operation itself and on the result. This results in two memory effects
being created for these operations. When `MemAlloc` is defined on an
operation, it represents some background effect which the compiler
cannot reason about, and inhibits the ability of the compiler to
remove dead `std.alloc` operations. This change removes the uneeded
`MemAlloc` effect from these operations and leaves the effect on the
result, which allows dead allocs to be erased.
There is the same problem, but to a lesser extent, with MemFree, MemRead
and MemWrite. Over-specifying these traits is not currently inhibiting
any optimization.
Differential Revision: https://reviews.llvm.org/D94662
This revision adds a new `replaceOpWithIf` hook that replaces uses of an operation that satisfy a given functor. If all uses are replaced, the operation gets erased in a similar manner to `replaceOp`. DialectConversion support will be added in a followup as this requires adjusting how replacements are tracked there.
Differential Revision: https://reviews.llvm.org/D94632
In the overwhelmingly common case, enum attribute case strings represent valid identifiers in MLIR syntax. This revision updates the format generator to format as a keyword in these cases, removing the need to wrap values in a string. The parser still retains the ability to parse the string form, but the printer will use the keyword form when applicable.
Differential Revision: https://reviews.llvm.org/D94575
The LLVM dialect type system has been closed until now, i.e. did not support
types from other dialects inside containers. While this has had obvious
benefits of deriving from a common base class, it has led to some simple types
being almost identical with the built-in types, namely integer and floating
point types. This in turn has led to a lot of larger-scale complexity: simple
types must still be converted, numerous operations that correspond to LLVM IR
intrinsics are replicated to produce versions operating on either LLVM dialect
or built-in types leading to quasi-duplicate dialects, lowering to the LLVM
dialect is essentially required to be one-shot because of type conversion, etc.
In this light, it is reasonable to trade off some local complexity in the
internal implementation of LLVM dialect types for removing larger-scale system
complexity. Previous commits to the LLVM dialect type system have adapted the
API to support types from other dialects.
Replace LLVMIntegerType with the built-in IntegerType plus additional checks
that such types are signless (these are isolated in a utility function that
replaced `isa<LLVMType>` and in the parser). Temporarily keep the possibility
to parse `!llvm.i32` as a synonym for `i32`, but add a deprecation notice.
Reviewed By: mehdi_amini, silvas, antiagainst
Differential Revision: https://reviews.llvm.org/D94178
Now that passes have support for running nested pipelines, the inliner can now allow for users to provide proper nested pipelines to use for optimization during inlining. This revision also changes the behavior of optimization during inlining to optimize before attempting to inline, which should lead to a more accurate cost model and prevents the need for users to schedule additional duplicate cleanup passes before/after the inliner that would already be run during inlining.
Differential Revision: https://reviews.llvm.org/D91211
This reverts commit 0d48d265db.
This reapplies the following commit, with a fix for CAPI/ir.c:
[mlir] Start splitting the `tensor` dialect out of `std`.
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).
Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.
This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.
Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2
Differential Revision: https://reviews.llvm.org/D92991
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).
Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.
This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.
Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2
Differential Revision: https://reviews.llvm.org/D92991
This fixes a subtle bug where SCCP could incorrectly optimize a private callable while waiting for its arguments to be resolved.
Fixes PR#48457
Differential Revision: https://reviews.llvm.org/D92976
Memrefs with affine_map in the results of normalizable operation were
not normalized by `--normalize-memrefs` option. This patch normalizes
them.
Differential Revision: https://reviews.llvm.org/D88719
Extended promote buffers to stack pass to support dynamically shaped allocas.
The conversion is limited by the rank of the underlying tensor.
An option is added to the pass to adjust the given rank.
Differential Revision: https://reviews.llvm.org/D91969
- Address TODO in scf-bufferize: the argument materialization issue is
now fixed and the code is now in Transforms/Bufferize.cpp
- Tighten up finalizing-bufferize to avoid creating invalid IR when
operand types potentially change
- Tidy up the testing of func-bufferize, and move appropriate tests
to a new finalizing-bufferize.mlir
- The new stricter checking in finalizing-bufferize revealed that we
needed a DimOp conversion pattern (found when integrating into npcomp).
Previously, the converion infrastructure was blindly changing the
operand type during finalization, which happened to work due to
DimOp's tensor/memref polymorphism, but is generally not encouraged
(the new pattern is the way to tell the conversion infrastructure that
it is legal to change that type).
The rewrite logic has an optimization to drop a cast operation after
rewriting block arguments if the cast operation has no users. This is
unsafe as there might be a pending rewrite that replaced the cast operation
itself and hence would trigger a second free.
Instead, do not remove the casts and leave it up to a later canonicalization
to do so.
Differential Revision: https://reviews.llvm.org/D92184
Block merging in MLIR will incorrectly merge blocks with operations whose values are used outside of that block. This change forbids this behavior and provides a test where it is illegal to perform such a merge.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D91745
This replaces the old type decomposition logic that was previously mixed
into bufferization, and makes it easily accessible.
This also deletes TestFinalizingBufferize, because after we remove the type
decomposition, it doesn't do anything that is not already provided by
func-bufferize.
Differential Revision: https://reviews.llvm.org/D90899
The index type does not have a bitsize and hence the size of corresponding allocations cannot be computed. Instead, the promotion pass now has an explicit option to specify the size of index.
Differential Revision: https://reviews.llvm.org/D91360
The previous logic for inlining a region A with N blocks into region B
would produce incorrect results on rollback for N greater than 1. This
rollback logic would leave blocks 1..N in region B and only move block 0
to region A.
The new inlining action recording stores the block move actions from N-1
to 0. Now on roll back, block 0 is moved to region A and then 1..N is
appended to the list of blocks in region A.
Differential Revision: https://reviews.llvm.org/D91185
Locations often get very long and clutter up operations when printed inline with them. This revision adds support for using aliases with trailing operation locations, and makes printing with aliases the default behavior. Aliases in the trailing location take the form `loc(<alias>)`, such as `loc(#loc0)`. As with all aliases, using `mlir-print-local-scope` can be used to disable them and get the inline behavior.
Differential Revision: https://reviews.llvm.org/D90652
This revision refactors the way that attributes/types are considered when generating aliases. Instead of considering all of the attributes/types of every operation, we perform a "fake" print step that prints the operations using a dummy printer to collect the attributes and types that would actually be printed during the real process. This removes a lot of attributes/types from consideration that generally won't end up in the final output, e.g. affine map attributes in an `affine.apply`/`affine.for`.
This resolves a long standing TODO w.r.t aliases, and helps to have a much cleaner textual output format. As a datapoint to the latter, as part of this change several tests were identified as testing for the presence of attributes aliases that weren't actually referenced by the custom form of any operation.
To ensure that this wouldn't cause a large degradation in compile time due to the second full print, I benchmarked this change on a very large module with a lot of operations(The file is ~673M/~4.7 million lines long). This file before this change take ~6.9 seconds to print in the custom form, and ~7 seconds after this change. In the custom assembly case, this added an average of a little over ~100 miliseconds to the compile time. This increase was due to the way that argument attributes on functions are structured and how they get printed; i.e. with a better representation the negative impact here can be greatly decreased. When printing in the generic form, this revision had no observable impact on the compile time. This benchmarking leads me to believe that the impact of this change on compile time w.r.t printing is closely related to `print` methods that perform a lot of additional/complex processing outside of the OpAsmPrinter.
Differential Revision: https://reviews.llvm.org/D90512
- Change syntax for FuncOp to be `func <visibility>? @name` instead of printing the
visibility in the attribute dictionary.
- Since printFunctionLikeOp() and parseFunctionLikeOp() are also used by other
operations, make the "inline visibility" an opt-in feature.
- Updated unit test to use and check the new syntax.
Differential Revision: https://reviews.llvm.org/D90859
There exists a generic folding facility that folds the operand of a memref_cast
into users of memref_cast that support this. However, it was not used for the
memref_cast itself. Fix it to enable elimination of memref_cast chains such as
%1 = memref_cast %0 : A to B
%2 = memref_cast %1 : B to A
that is achieved by combining the folding with the existing "A to A" cast
elimination.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D90910
This functionality is superceded by BufferResultsToOutParams pass (see
https://reviews.llvm.org/D90071) for users the require buffers to be
out-params. That pass should be run immediately after all tensors are gone from
the program (before buffer optimizations and deallocation insertion), such as
immediately after a "finalizing" bufferize pass.
The -test-finalizing-bufferize pass now defaults to what used to be the
`allowMemrefFunctionResults=true` flag. and the
finalizing-bufferize-allowed-memref-results.mlir file is moved
to test/Transforms/finalizing-bufferize.mlir.
Differential Revision: https://reviews.llvm.org/D90778
The LinalgDependenceGraph and alias analysis provide the necessary analysis for the Linalg fusion on buffers case.
However this is not enough for linalg on tensors which require proper memory effects to play nicely with DCE and other transformations.
This revision adds side effects to Linalg ops that were previously missing and has 2 consequences:
1. one example in the copy removal pass now fails since the linalg.generic op has side effects and the pass does not perform alias analysis / distinguish between reads and writes.
2. a few examples in fusion-tensor.mlir need to return the resulting tensor otherwise DCE automatically kicks in as part of greedy pattern application.
Differential Revision: https://reviews.llvm.org/D90762
BufferizeTests.
Summary:
Added test operations to replace the LinalgDialect dependency in tests
which use the buffer-deallocation, buffer-hoisting,
buffer-loop-hoisting, promote-buffers-to-stack,
buffer-placement-preparation-allowed-memref-resutls and
buffer-placement-preparation pass. Adapted the corresponding tests cases
and TestBufferPlacement.cpp.
Differential Revision: https://reviews.llvm.org/D90037
BufferPlacement is no longer part of bufferization. However, this test
is an important test of "finalizing" bufferize passes.
A "finalizing" bufferize conversion is one that performs a "full"
conversion and expects all tensors to be gone from the program. This in
particular involves rewriting funcs (including block arguments of the
contained region), calls, and returns. The unique property of finalizing
bufferization passes is that they cannot be done via a local
transformation with suitable materializations to ensure composability
(as other bufferization passes do). For example, if a call is
rewritten, the callee needs to be rewritten otherwise the IR will end up
invalid. Thus, finalizing bufferization passes require an atomic change
to the entire program (e.g. the whole module).
This new designation makes it clear also that it shouldn't be testing
bufferization of linalg ops, so the tests have been updated to not use
linalg.generic ops. (linalg.copy is still used as the "copy" op for
copying into out-params)
Differential Revision: https://reviews.llvm.org/D89979
This pass allows removing getResultConversionKind from
BufferizeTypeConverter. This pass replaces the AppendToArgumentsList
functionality. As far as I could tell, the only use of this functionlity
is to perform the transformation that is implemented in this pass.
Future patches will remove the getResultConversionKind machinery from
BufferizeTypeConverter, but sending this patch for individual review for
clarity.
Differential Revision: https://reviews.llvm.org/D90071
Previously they were separated into "instance" and "kind" aliases, and also required that the dialect know ahead of time all of the instances that would have a corresponding alias. This approach was very clunky and not ergonomic to interact with. The new approach is to provide the dialect with an instance of an attribute/type to provide an alias for, fully replacing the original split approach.
Differential Revision: https://reviews.llvm.org/D89354
In certain situations it isn't legal to inline a call operation, but this isn't something that is possible(at least not easily) to prevent with the current hooks. This revision adds a new hook so that dialects with call operations that shouldn't be inlined can prevent it.
Differential Revision: https://reviews.llvm.org/D90359
Added optimization pass to convert heap-based allocs to stack-based allocas in
buffer placement. Added the corresponding test file.
Differential Revision: https://reviews.llvm.org/D89688
The current BufferPlacement transformation contains several concepts for
hoisting allocations. However, more advanced hoisting techniques should not be
integrated into the BufferPlacement transformation. Hence, this CL refactors the
current BufferPlacement pass into three separate pieces: BufferDeallocation and
BufferAllocation(Loop)Hoisting. Moreover, it extends the hoisting functionality
by allowing to move allocations out of loops.
Differential Revision: https://reviews.llvm.org/D87756
Parsing of a scalar subview did not create the required static_offsets attribute.
This also adds support for folding scalar subviews away.
Differential Revision: https://reviews.llvm.org/D89467
The buffer placement preparation tests in
test/Transforms/buffer-placement-preparation* are using Linalg as a test
dialect which leads to confusion and "copy-pasta", i.e. Linalg is being
extended now and when TensorsToBuffers.cpp is changed, TestBufferPlacement is
sometimes kept in-sync, which should not be the case.
This has led to the unnoticed bug, because the tests were in a different directory and the patterns were slightly off.
Differential Revision: https://reviews.llvm.org/D89209
This revision adds init_tensors support to buffer allocation for Linalg on tensors.
Currently makes the assumption that the init_tensors fold onto the first output tensors.
This assumption is not currently enforced or cast in stone and requires experimenting with tiling linalg on tensors for ops **without reductions**.
Still this allows progress towards the end-to-end goal.
Normalizing memrefs failed when a caller of symbolic use in a function
can not be casted to `CallOp`. This patch avoids the failure by checking
the result of the casting. If the caller can not be casted to `CallOp`,
it is skipped.
Differential Revision: https://reviews.llvm.org/D87746
This adds support for the interface and provides unambigious information
on the control flow as it is unconditional on any runtime values.
The code is tested through confirming that buffer-placement behaves as
expected.
Differential Revision: https://reviews.llvm.org/D87894
Adds a pattern that replaces a chain of two tensor_cast operations by a single tensor_cast operation if doing so will not remove constraints on the shapes.
This add canonicalizer for
- extracting an element from a dynamic_tensor_from_elements
- propagating constant operands to the type of dynamic_tensor_from_elements
Differential Revision: https://reviews.llvm.org/D87525
The current BufferPlacement transformation cannot handle loops properly. Buffers
passed via backedges will not be freed automatically introducing memory leaks.
This CL adds support for loops to overcome these limitations.
Differential Revision: https://reviews.llvm.org/D85513
In this PR, the users of BufferPlacement can configure
BufferAssginmentTypeConverter. These new configurations would give the user more
freedom in the process of converting function signature, and return and call
operation conversions.
These are the new features:
- Accepting callback functions for decomposing types (i.e. 1 to N type
conversion such as unpacking tuple types).
- Defining ResultConversionKind for specifying whether a function result
with a certain type should be appended to the function arguments list or
should be kept as function result. (Usage:
converter.setResultConversionKind<MemRefType>(AppendToArgumentList))
- Accepting callback functions for composing or decomposing values (i.e. N
to 1 and 1 to N value conversion).
Differential Revision: https://reviews.llvm.org/D85133
This reverts commit 94f5d24877 because
of failing the following tests:
MLIR :: Dialect/Linalg/tensors-to-buffers.mlir
MLIR :: Transforms/buffer-placement-preparation-allowed-memref-results.mlir
MLIR :: Transforms/buffer-placement-preparation.mlir
In this PR, the users of BufferPlacement can configure
BufferAssginmentTypeConverter. These new configurations would give the user more
freedom in the process of converting function signature, and return and call
operation conversions.
These are the new features:
- Accepting callback functions for decomposing types (i.e. 1 to N type
conversion such as unpacking tuple types).
- Defining ResultConversionKind for specifying whether a function result
with a certain type should be appended to the function arguments list or
should be kept as function result. (Usage:
converter.setResultConversionKind<MemRefType>(AppendToArgumentList))
- Accepting callback functions for composing or decomposing values (i.e. N
to 1 and 1 to N value conversion).
Differential Revision: https://reviews.llvm.org/D85133
The prior diff that introduced `addAffineIfOpDomain` missed appending
constraints from the ifOp domain. This revision fixes this problem.
Differential Revision: https://reviews.llvm.org/D86421
When dealing with dialects that will results in function calls to
external libraries, it is important to be able to handle maps as some
dialects may require mapped data. Before this patch, the detection of
whether normalization can apply or not, operations are compared to an
explicit list of operations (`alloc`, `dealloc`, `return`) or to the
presence of specific operation interfaces (`AffineReadOpInterface`,
`AffineWriteOpInterface`, `AffineDMAStartOp`, or `AffineDMAWaitOp`).
This patch add a trait, `MemRefsNormalizable` to determine if an
operation can have its `memrefs` normalized.
This trait can be used in turn by dialects to assert that such
operations are compatible with normalization of `memrefs` with
nontrivial memory layout specification. An example is given in the
literal tests.
Differential Revision: https://reviews.llvm.org/D86236
This exercises the corner case that was fixed in
https://reviews.llvm.org/rG8979a9cdf226066196f1710903d13492e6929563.
The bug can be reproduced when there is a @callee with a custom type argument and @caller has a producer of this argument passed to the @callee.
Example:
func @callee(!test.test_type) -> i32
func @caller() -> i32 {
%arg = "test.type_producer"() : () -> !test.test_type
%out = call @callee(%arg) : (!test.test_type) -> i32
return %out : i32
}
Even though there is a type conversion for !test.test_type, the output IR (before the fix) contained a DialectCastOp:
module {
llvm.func @callee(!llvm.ptr<i8>) -> !llvm.i32
llvm.func @caller() -> !llvm.i32 {
%0 = llvm.mlir.null : !llvm.ptr<i8>
%1 = llvm.mlir.cast %0 : !llvm.ptr<i8> to !test.test_type
%2 = llvm.call @callee(%1) : (!test.test_type) -> !llvm.i32
llvm.return %2 : !llvm.i32
}
}
instead of
module {
llvm.func @callee(!llvm.ptr<i8>) -> !llvm.i32
llvm.func @caller() -> !llvm.i32 {
%0 = llvm.mlir.null : !llvm.ptr<i8>
%1 = llvm.call @callee(%0) : (!llvm.ptr<i8>) -> !llvm.i32
llvm.return %1 : !llvm.i32
}
}
Differential Revision: https://reviews.llvm.org/D85914
-- This commit handles the returnOp in memref map layout normalization.
-- An initial filter is applied on FuncOps which helps us know which functions can be
a suitable candidate for memref normalization which doesn't lead to invalid IR.
-- Handles memref map normalization for external function assuming the external function
is normalizable.
Differential Revision: https://reviews.llvm.org/D85226
This diff attempts to resolve the TODO in `getOpIndexSet` (formerly
known as `getInstIndexSet`), which states "Add support to handle IfInsts
surronding `op`".
Major changes in this diff:
1. Overload `getIndexSet`. The overloaded version considers both
`AffineForOp` and `AffineIfOp`.
2. The `getInstIndexSet` is updated accordingly: its name is changed to
`getOpIndexSet` and its implementation is based on a new API `getIVs`
instead of `getLoopIVs`.
3. Add `addAffineIfOpDomain` to `FlatAffineConstraints`, which extracts
new constraints from the integer set of `AffineIfOp` and merges it to
the current constraint system.
4. Update how a `Value` is determined as dim or symbol for
`ValuePositionMap` in `buildDimAndSymbolPositionMaps`.
Differential Revision: https://reviews.llvm.org/D84698
Always define a remapping for the memref replacement (`indexRemap`)
with the proper number of inputs, including all the `outerIVs`, so that
the number of inputs and the operands provided for the map don't mismatch.
Reviewed By: bondhugula, andydavis1
Differential Revision: https://reviews.llvm.org/D85177
-- Introduces a pass that normalizes the affine layout maps to the identity layout map both within and across functions by rewriting function arguments and call operands where necessary.
-- Memref normalization is now implemented entirely in the module pass '-normalize-memrefs' and the limited intra-procedural version has been removed from '-simplify-affine-structures'.
-- Run using -normalize-memrefs.
-- Return ops are not handled and would be handled in the subsequent revisions.
Signed-off-by: Abhishek Varma <abhishek.varma@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D84490
The MemRefDataFlow pass does store to load forwarding
only for affine store/loads. This patch updates the pass
to use affine read/write interface which enables vector
forwarding.
Reviewed By: dcaballe, bondhugula, ftynse
Differential Revision: https://reviews.llvm.org/D84302
This revision adds support for much deeper type conversion integration into the conversion process, and enables auto-generating cast operations when necessary. Type conversions are now largely automatically managed by the conversion infra when using a ConversionPattern with a provided TypeConverter. This removes the need for patterns to do type cast wrapping themselves and moves the burden to the infra. This makes it much easier to perform partial lowerings when type conversions are involved, as any lingering type conversions will be automatically resolved/legalized by the conversion infra.
To support this new integration, a few changes have been made to the type materialization API on TypeConverter. Materialization has been split into three separate categories:
* Argument Materialization: This type of materialization is used when converting the type of block arguments when calling `convertRegionTypes`. This is useful for contextually inserting additional conversion operations when converting a block argument type, such as when converting the types of a function signature.
* Source Materialization: This type of materialization is used to convert a legal type of the converter into a non-legal type, generally a source type. This may be called when uses of a non-legal type persist after the conversion process has finished.
* Target Materialization: This type of materialization is used to convert a non-legal, or source, type into a legal, or target, type. This type of materialization is used when applying a pattern on an operation, but the types of the operands have not yet been converted.
Differential Revision: https://reviews.llvm.org/D82831
AllocOp is updated in normalizeMemref(AllocOp allocOp), but, when the
AllocOp has `alignment` attribute, it was ignored and updated AllocOp
does not have `alignment` attribute. This patch fixes it.
Differential Revision: https://reviews.llvm.org/D83656
Up until now, there has been an implicit agreement that when an operation is marked as
"erased" all uses of that operation's results are guaranteed to be removed during conversion. How this works in practice is that there is either an assert/crash/asan failure/etc. This revision adds support for properly detecting when an erased operation has dangling users, emits and error and fails the conversion.
Differential Revision: https://reviews.llvm.org/D82830
ViewLikeOpInterfaces introduce new aliases that need to be added to the alias
list. This is necessary to place deallocs in the right positions.
Differential Revision: https://reviews.llvm.org/D83044
This pass removes redundant dialect-independent Copy operations in different
situations like the following:
%from = ...
%to = ...
... (no user/alias for %to)
copy(%from, %to)
... (no user/alias for %from)
dealloc %from
use(%to)
Differential Revision: https://reviews.llvm.org/D82757
Summary: The current BufferPlacement implementation does not support
nested region control flow. This CL adds support for nested regions via
the RegionBranchOpInterface and the detection of branch-like
(ReturnLike) terminators inside nested regions.
Differential Revision: https://reviews.llvm.org/D81926
Summary: The patch fixes an off by one error in the method collapseParallelLoops. It ensures the same normalized bound is used for the computation of the division and the remainder.
Reviewers: herhut
Reviewed By: herhut
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D82634
When there is a mix of affine load/store and non-affine operations (e.g. std.load, std.store),
affine-loop-fusion ignores the present of non-affine ops, thus changing the program semantics.
E.g. we have a program of three affine loops operating on the same memref in which one of them uses std.load and std.store, as follows.
```
affine.for
affine.store %1
affine.for
std.load %1
std.store %1
affine.for
affine.load %1
affine.store %1
```
affine-loop-fusion will produce the following result which changed the program semantics:
```
affine.for
std.load %1
std.store %1
affine.for
affine.store %1
affine.load %1
affine.store %1
```
This patch is to fix the above problem by checking non-affine users of the memref that are between the source and destination nodes of interest.
Differential Revision: https://reviews.llvm.org/D82158
This revision removes the TypeConverter parameter passed to the apply* methods, and instead moves the responsibility of region type conversion to patterns. The types of a region can be converted using the 'convertRegionTypes' method, which acts similarly to the existing 'applySignatureConversion'. This method ensures that all blocks within, and including those moved into, a region will have the block argument types converted using the provided converter.
This has the benefit of making more of the legalization logic controlled by patterns, instead of being handled explicitly by the driver. It also opens up the possibility to support multiple type conversions at some point in the future.
This revision also adds a new utility class `FailureOr<T>` that provides a LogicalResult friendly facility for returning a failure or a valid result value.
Differential Revision: https://reviews.llvm.org/D81681
Traditionally patterns have always had the root operation kind hardcoded to a specific operation name. This has worked well for quite some time, but it has certain limitations that make it undesirable. For example, some lowering have the same implementation for many different operations types with a few lowering entire dialects using the same pattern implementation. This problem has led to several "solutions":
a) Provide a template implementation to the user so that they can instantiate it for each operation combination, generally requiring the inclusion of the auto-generated operation definition file.
b) Use a non-templated pattern that allows for providing the name of the operation to match
- No one ever does this, because enumerating operation names can be cumbersome and so this quickly devolves into solution a.
This revision removes the restriction that patterns have a hardcoded root type, and allows for a class patterns that could match "any" operation type. The major downside of root-agnostic patterns is that they make certain pattern analyses more difficult, so it is still very highly encouraged that an operation specific pattern be used whenever possible.
Differential Revision: https://reviews.llvm.org/D82066
We previously weren't properly updating the SCC iterator when nodes were removed, leading to asan failures in certain situations. This commit adds a CallGraphSCC class and defers operation deletion until inlining has finished.
Differential Revision: https://reviews.llvm.org/D81984
Fix memref region compute for 0-d memref accesses in certain cases (when
there are loops surrounding such 0-d accesses).
Differential Revision: https://reviews.llvm.org/D81792
allocations cannot be moved freely and can remain in divergent control flow.
The current BufferPlacement pass does not support allocation nodes that carry
additional dependencies (like in the case of dynamic shaped types). These
allocations can often not be moved freely and in turn might remain in divergent
control-flow branches. This requires a different strategy with respect to block
arguments and aliases. This CL adds additinal functionality to support
allocation nodes in divergent control flow while avoiding memory leaks.
Differential Revision: https://reviews.llvm.org/D79850
This option avoids to accidentally reuse variable across -LABEL match,
it can be explicitly opted-in by prefixing the variable name with $
Differential Revision: https://reviews.llvm.org/D81531
This patch changes the fusion algorithm so that after fusing two loop nests
we revisit previously visited nodes so that they are considered again for
fusion in the context of the new fused loop nest.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D81609
Allow for dynamic indices in the `dim` operation.
Rather than an attribute, the index is now an operand of type `index`.
This allows to apply the operation to dynamically ranked tensors.
The correct lowering of dynamic indices remains to be implemented.
Differential Revision: https://reviews.llvm.org/D81551
Having the input dumped on failure seems like a better
default: I debugged FileCheck tests for a while without knowing
about this option, which really helps to understand failures.
Remove `-dump-input-on-failure` and the environment variable
FILECHECK_DUMP_INPUT_ON_FAILURE which are now obsolete.
Differential Revision: https://reviews.llvm.org/D81422
This parameter gives the developers the freedom to choose their desired function
signature conversion for preparing their functions for buffer placement. It is
introduced for BufferAssignmentFuncOpConverter, and also for
BufferAssignmentReturnOpConverter, and BufferAssignmentCallOpConverter to adapt
the return and call operations with the selected function signature conversion.
If the parameter is set, buffer placement won't also deallocate the returned
buffers.
Differential Revision: https://reviews.llvm.org/D81137
This simplifies a lot of handling of BoolAttr/IntegerAttr. For example, a lot of places currently have to handle both IntegerAttr and BoolAttr. In other places, a decision is made to pick one which can lead to surprising results for users. For example, DenseElementsAttr currently uses BoolAttr for i1 even if the user initialized it with an Array of i1 IntegerAttrs.
Differential Revision: https://reviews.llvm.org/D81047
This patch enables affine loop fusion for loops with affine vector loads
and stores. For that, we only had to use affine memory op interfaces in
LoopFusionUtils.cpp and Utils.cpp so that vector loads and stores are
also taken into account.
Reviewed By: andydavis1, ftynse
Differential Revision: https://reviews.llvm.org/D80971
Dialect conversion infrastructure supports 1->N type conversions by requiring
individual conversions to provide facilities to generate operations
retrofitting N values into 1 of the original type when N > 1. This
functionality can also be used to materialize explicit "cast"-like operations,
but it did not support 1->1 type conversions until now. Modify TypeConverter to
support materialization of cast operations for 1-1 conversions.
This also makes materialization specification more extensible following the
same pattern as type conversions. Instead of overloading a virtual function,
users or subclasses of TypeConversion can now register type-specific
materialization callbacks that will be called in order for the given type.
Differential Revision: https://reviews.llvm.org/D79729
Add BufferAssignmentCallOpConverter as a pattern rewriter for Buffer
Placement. It matches the signature of the caller operation with the callee
after rewriting the callee with FunctionAndBlockSignatureConverter.
Differential Revision: https://reviews.llvm.org/D80785
This utility factors out the machinery required to add iterArgs and yield values to an scf.ForOp.
Differential Revision: https://reviews.llvm.org/D80656
Buffer placement can now operates on functions that return buffers. These
buffers escape from the deallocation phase of buffer placement.
Differential Revision: https://reviews.llvm.org/D80696
PatternRewriter has support for erasing a Block from its parent region, but
this feature has not been implemented for ConversionPatternRewriter that needs
to keep track of and be able to undo block actions. Introduce support for
undoing block erasure in the ConversionPatternRewriter by marking all the ops
it contains for erasure and by detaching the block from its parent region. The
detached block is stored in the action description and is not actually deleted
until the rewrites are applied.
Differential Revision: https://reviews.llvm.org/D80135
Dialect conversion infrastructure may roll back op creation by erasing the
operations in the reverse order of their creation. While this guarantees uses
of values will be deleted before their definitions, this does not guarantee
that a parent operation will not be deleted before its child. (This may happen
in case of block inlining or if child operations, such as terminators, are
created in the parent's `build` function before the parent itself.) Handle the
parent/child relationship between ops by removing all child ops from the blocks
before erasing the parent. The child ops remain live, detached from a block,
and will be safely destroyed in their turn, which may come later than that of
the parent.
Differential Revision: https://reviews.llvm.org/D80134
Making these two converters more generic. FunctionAndBlockSignatureConverter now
moves only memref results (after type conversion) to the function argument and
keeps other legal function results unchanged. NonVoidToVoidReturnOpConverter is
renamed to NoBufferOperandsReturnOpConverter. It removes only the buffer
operands from the operands of the converted ReturnOp and inserts CopyOps to copy
each buffer to the target function argument.
Differential Revision: https://reviews.llvm.org/D79329
DimOp folding is using bare accesses to underlying SubViewOp operands.
This is generally incorrect and is fixed in this revision.
Differential Revision: https://reviews.llvm.org/D80017
All ops of the SCF dialect now use the `scf.` prefix instead of `loop.`. This
is a part of dialect renaming.
Differential Revision: https://reviews.llvm.org/D79844
The existing implementation of SubViewOp::getRanges relies on all
offsets/sizes/strides to be dynamic values and does not work in
combination with canonicalization. This revision adds a
SubViewOp::getOrCreateRanges to create the missing constants in the
canonicalized case.
This allows reactivating the fused pass with staged pattern
applications.
However another issue surfaces that the SubViewOp verifier is now too
strict to allow folding. The existing folding pattern is turned into a
canonicalization pattern which rewrites memref_cast + subview into
subview + memref_cast.
The transform-patterns-matmul-to-vector can then be reactivated.
Differential Revision: https://reviews.llvm.org/D79759
Due to the extension of Liveness, Buffer Assignment can now work on nested regions. This PR provides a test case to show that existing functionally of BA works properly.
Differential Revision: https://reviews.llvm.org/D79332
The main objective of this revision is to change the way static information is represented, propagated and canonicalized in the SubViewOp.
In the current implementation the issue is that canonicalization may strictly lose information because static offsets are combined in irrecoverable ways into the result type, in order to fit the strided memref representation.
The core semantics of the op do not change but the parser and printer do: the op always requires `rank` offsets, sizes and strides. These quantities can now be either SSA values or static integer attributes.
The result type is automatically deduced from the static information and more powerful canonicalizations (as powerful as the representation with sentinel `?` values allows). Previously static information was inferred on a best-effort basis from looking at the source and destination type.
Relevant tests are rewritten to use the idiomatic `offset: x, strides : [...]`-form. Bugs are corrected along the way that were not trivially visible in flattened strided memref form.
Lowering to LLVM is updated, simplified and now supports all cases.
A mixed static-dynamic mode test that wouldn't previously lower is added.
It is an open question, and a longer discussion, whether a better result type representation would be a nicer alternative. For now, the subview op carries the required semantic.
Differential Revision: https://reviews.llvm.org/D79662
This reverts commit 80d133b24f.
Per Stephan Herhut: The canonicalizer pattern that was added creates
forms of the subview op that cannot be lowered.
This is shown by failing Tensorflow XLA tests such as:
tensorflow/compiler/xla/service/mlir_gpu/tests:abs.hlo.test
Will provide more details offline, they rely on logs from private CI.
Summary:
The main objective of this revision is to change the way static information is represented, propagated and canonicalized in the SubViewOp.
In the current implementation the issue is that canonicalization may strictly lose information because static offsets are combined in irrecoverable ways into the result type, in order to fit the strided memref representation.
The core semantics of the op do not change but the parser and printer do: the op always requires `rank` offsets, sizes and strides. These quantities can now be either SSA values or static integer attributes.
The result type is automatically deduced from the static information and more powerful canonicalizations (as powerful as the representation with sentinel `?` values allows). Previously static information was inferred on a best-effort basis from looking at the source and destination type.
Relevant tests are rewritten to use the idiomatic `offset: x, strides : [...]`-form. Bugs are corrected along the way that were not trivially visible in flattened strided memref form.
It is an open question, and a longer discussion, whether a better result type representation would be a nicer alternative. For now, the subview op carries the required semantic.
Reviewers: ftynse, mravishankar, antiagainst, rriddle!, andydavis1, timshen, asaadaldien, stellaraccident
Reviewed By: mravishankar
Subscribers: aartbik, bondhugula, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, bader, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79662
This [discussion](https://llvm.discourse.group/t/viewop-isnt-expressive-enough/991/2) raised some concerns with ViewOp.
In particular, the handling of offsets is incorrect and does not match the op description.
Note that with an elemental type change, offsets cannot be part of the type in general because sizeof(srcType) != sizeof(dstType).
Howerver, offset is a poorly chosen term for this purpose and is renamed to byte_shift.
Additionally, for all intended purposes, trying to support non-identity layouts for this op does not bring expressive power but rather increases code complexity.
This revision simplifies the existing semantics and implementation.
This simplification effort is voluntarily restrictive and acts as a stepping stone towards supporting richer semantics: treat the non-common cases as YAGNI for now and reevaluate based on concrete use cases once a round of simplification occurred.
Differential revision: https://reviews.llvm.org/D79541
The list of destination load ops while evaluating producer-consumer
fusion wasn't being maintained as a set, and as such, duplicate load ops
were being added to it. Although this is harmless correctness-wise, it's
a killer efficiency-wise and it prevents interesting/useful fusions
(including for eg. reshapes into a matmul). The reason the latter
fusions would be missed is that a slice union would be unnecessarily
needed due to the duplicate load ops on a memref added to the 'dst
loads' list. Since slice union is unimplemented for the local var case,
a single destination load op that leads to local vars (like a floordiv /
mod producing fusion), a common case, would not get fused due to an
unnecessary union being tried with itself. (The union would actually be
the same thing but we would bail out.)
Besides the above, this would also significantly speed up fusion as all
the unnecessary slice computations / unions, checks, etc. due to the
duplicates go away.
Differential Revision: https://reviews.llvm.org/D79547
Summary:
This revision adds a conservative canonicalization pattern for MemRefCastOp that are typically inserted during ViewOp and SubViewOp canonicalization.
Ideally such canonicalizations would propagate the type to consumers but this is not a local behavior. As a consequence MemRefCastOp are introduced to keep type compatibility but need to be cleaned up later, in the case where more dynamic behavior than necessary is introduced.
Differential Revision: https://reviews.llvm.org/D79438
This revision adds support for merging identical blocks, or those with the same operations that branch to the same successors. Operands that mismatch between the different blocks are replaced with new block arguments added to the merged block.
Differential Revision: https://reviews.llvm.org/D79134
There are three op conversion modes: Partial, Full, and Analysis. This change modifies the Partial mode to optionally take a set of non-legalizable ops. If this parameter is specified, all ops that are not legalizable (i.e. would cause full conversion to fail) are tracked throughout the partial legalization.
Differential Revision: https://reviews.llvm.org/D78788
Summary:
This change results in tests also being changed to prevent dead
affine.load operations from being folded away during rewrites.
Also move AffineStoreOp and AffineLoadOp to an ODS file.
Differential Revision: https://reviews.llvm.org/D78930
We have provided a generic buffer assignment transformation ported from
TensorFlow. This generic transformation pass automatically analyzes the values
and their aliases (also in other blocks) and returns the valid positions for
Alloc and Dealloc operations. To find these positions, the algorithm uses the
block Dominator and Post-Dominator analyses. In our proposed algorithm, we have
considered aliasing, liveness, nested regions, branches, conditional branches,
critical edges, and independency to custom block terminators. This
implementation doesn't support block loops. However, we have considered this in
our design. For this purpose, it is only required to have a loop analysis to
insert Alloc and Dealloc operations outside of these loops in some special
cases.
Differential Revision: https://reviews.llvm.org/D78484
- Adds a folder for integer division by one with the `divi_signed` and `divi_unsigned` ops.
- Creates tests for scalar and tensor versions of these ops.
- Modifies the test in `parallel-loop-collapsing.mlir` so that it doesn't assume division by one will be in the output.
Differential Revision: https://reviews.llvm.org/D78518
This revision adds support for propagating constants across symbol-based callgraph edges. It uses the existing Call/CallableOpInterfaces to detect the dataflow edges, and propagates constants through arguments and out of returns.
Differential Revision: https://reviews.llvm.org/D78592
Summary:
Previously operations like std.load created methods for obtaining their
effects but did not inherit from the SideEffect interfaces when their
parameters were decorated with the information. The resulting situation
was that passes had no information on the SideEffects of std.load/store
and had to treat them more cautiously. This adds the inheritance
information when creating the methods.
As a side effect, many tests are modified, as they were using std.load
for testing and this oepration would be folded away as part of pattern
rewriting. Tests are modified to use store or to reutn the result of the
std.load.
Reviewers: mravishankar, antiagainst, nicolasvasilache, herhut, aartbik, ftynse!
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, bader, grosul1, frgossen, Kayjukh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78802
The current implementation of this method performs the replacement directly, and thus doesn't support proper back tracking.
Differential Revision: https://reviews.llvm.org/D78790
This is possible by adding two new ControlFlowInterface additions:
- A new interface, RegionBranchOpInterface
This interface allows for region holding operations to describe how control flows between regions. This interface initially contains two methods:
* getSuccessorEntryOperands
Returns the operands of this operation used as the entry arguments when entering the region at `index`, which was specified as a successor by `getSuccessorRegions`. when entering. These operands should correspond 1-1 with the successor inputs specified in `getSuccessorRegions`, and may be a subset of the entry arguments for that region.
* getSuccessorRegions
Returns the viable successors of a region, or the possible successor when branching from the parent op. This allows for describing which regions may be executed when entering an operation, and which regions are executed after having executed another region of the parent op. For example, a structured loop operation may always enter into the loop body region. The loop body region may branch back to itself, or exit to the operation.
- A trait, ReturnLike
This trait signals that a terminator exits a region and forwards all of its operands as "exiting" values.
These additions allow for performing more general dataflow analysis in the presence of region holding operations.
Differential Revision: https://reviews.llvm.org/D78447
This revision adds the initial pass for performing SCCP generically in MLIR. SCCP is an algorithm for propagating constants across control flow, and optimistically assumes all values to be constant unless proven otherwise. It currently supports branching control, with support for regions and inter-procedural propagation being added in followups.
Differential Revision: https://reviews.llvm.org/D78397
The previous code result a mismatch between block argument types and
predecessor successor args when a type conversion was needed in a
multiblock case. It was assuming the replaced result types matched the
region result types.
Also, slighly improve the debug output from the inliner.
Differential Revision: https://reviews.llvm.org/D78415
Summary: Some pattern rewriters, like dialect conversion, prohibit the unbounded recursion(or reapplication) of patterns on generated IR. Most patterns are not written with recursive application in mind, so will generally explode the stack if uncaught. This revision adds a hook to RewritePattern, `hasBoundedRewriteRecursion`, to signal that the pattern can safely be applied to the generated IR of a previous application of the same pattern. This allows for establishing a contract between the pattern and rewriter that the pattern knows and can handle the potential recursive application.
Differential Revision: https://reviews.llvm.org/D77782