Introduce the execute_region op, which holds a region that it executes
exactly once. The op encapsulates a CFG within itself while isolating
it from the surrounding control flow. Proposal discussed here:
https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282
execute_region enables one to inline a function without first lowering
all other higher-level control flow constructs (affine.for/if,
scf.for/if) to a flat list of blocks / CFG form. It thus keeps the
benefit of transforms on higher-level control flow ops available in the
presence of inlined calls, while the inlined calls continue to benefit
from propagation of SSA values across their top boundary. Functions no
longer have to remain outlined until later than desired. Abstractions
like affine execute_regions and lambdas with implicit captures could be
lowered to this op without first lowering out structured loops/ifs or
outlining. Two potential early use cases are: (1) an early inliner
(which can inline functions by introducing execute_region ops), and
(2) lowering of an affine.execute_region, which cleanly maps to an
scf.execute_region when going from the affine dialect to the scf
dialect.
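For illustration, a minimal sketch of the op in use (syntax as
proposed; %a, %b, and %cond are assumed to be defined above, and the
multi-block body emphasizes that a full CFG can be encapsulated):
```
%v = scf.execute_region -> i32 {
  // The region may hold an arbitrary CFG, isolated from the
  // surrounding control flow; SSA values from above are still visible.
  cond_br %cond, ^bb1, ^bb2
^bb1:
  scf.yield %a : i32
^bb2:
  scf.yield %b : i32
}
```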
Differential Revision: https://reviews.llvm.org/D75837
Splitting off the memref dialect led to the introduction of several
dependencies to avoid compilation issues. The canonicalize pass also
depends on the memref dialect, but it shouldn't. This patch resolves
those dependencies and removes the unintuitive includes. However, the
dependency moves to the constructor of the std dialect.
Differential Revision: https://reviews.llvm.org/D102060
This covers the extremely common case of replacing all uses of a Value
with a new op that is itself a user of the original Value.
This should also be a little bit more efficient than the
`SmallPtrSet<Operation *, 1>{op}` idiom that was being used before.
Differential Revision: https://reviews.llvm.org/D102373
Add two canonicalizations for scf.if.
1) A canonicalization that allows users of the condition within the if
to assume the condition is true inside the true region, and false
inside the false region.
2) A canonicalization that removes yielded values that are equivalent
to the condition or its negation, replacing their uses outside the if
accordingly.
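As a hedged sketch of pattern (1), with %a and %b assumed to be defined
above:
```
%0 = scf.if %cond -> (i1) {
  // Inside the true region, %cond is known to be true, so this use of
  // %cond can be folded to the constant true.
  %1 = and %cond, %a : i1
  scf.yield %1 : i1   // folds toward: scf.yield %a : i1
} else {
  scf.yield %b : i1
}
```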
Differential Revision: https://reviews.llvm.org/D101012
Break up the dependency between SCF ops and the substituteMin helper
and make a more generic version of AffineMinSCFCanonicalization. This
reduces dependencies between linalg and SCF and will allow the logic to
be used with other kinds of ops (like ID ops).
Differential Revision: https://reviews.llvm.org/D100321
This doesn't change APIs; it just cleans up the many in-tree uses of
these names to use the new preferred names. We'll keep the old names
around for a couple of weeks to help transitions.
Differential Revision: https://reviews.llvm.org/D99127
This updates the codebase to pass the context when creating an instance
of OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters. There are many, many more to be removed.
Differential Revision: https://reviews.llvm.org/D99028
* Do we need a threshold on the maximum number of yielded arguments
processed (i.e., the maximum number of SelectOps to be generated)?
* Had to modify some old IfOp tests so they are not optimized away by
this pattern.
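For context, a hedged sketch of the kind of rewrite this pattern
performs (names illustrative):
```
// Before: an scf.if whose regions only yield values.
%0 = scf.if %cond -> (i32) {
  scf.yield %a : i32
} else {
  scf.yield %b : i32
}
// After: each yielded value becomes a select on the condition.
%0 = select %cond, %a, %b : i32
```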
Differential Revision: https://reviews.llvm.org/D98592
'ForOpIterArgsFolder' can now remove iterator arguments (and corresponding
results) with no use.
Example:
```
%cst = constant 32 : i32
%0:2 = scf.for %arg1 = %lb to %ub step %step iter_args(%arg2 = %arg0, %arg3 = %cst)
    -> (i32, i32) {
  %1 = addi %arg2, %cst : i32
  scf.yield %1, %1 : i32, i32
}
use(%0#0)
```
%arg3 is not used in the block and its corresponding result `%0#1` has
no use, so the iter argument and its result can be removed, as sketched
below.
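A sketch of the loop after the canonicalization:
```
%cst = constant 32 : i32
%0 = scf.for %arg1 = %lb to %ub step %step iter_args(%arg2 = %arg0)
    -> (i32) {
  %1 = addi %arg2, %cst : i32
  scf.yield %1 : i32
}
use(%0)
```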
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98711
Enhance 'ForOpIterArgsFolder' to remove unused iteration arguments in
an scf::ForOp. If the block argument corresponding to the given
iterator has no use and the yielded value equals the input, we fold it
away, as sketched below.
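A minimal sketch of the case being folded (the @body callee is a
hypothetical placeholder):
```
// %arg2 has no use and the yielded value equals the input %init, so
// the iter_arg and result fold away; uses of %0 become %init.
%0 = scf.for %i = %lb to %ub step %step iter_args(%arg2 = %init)
    -> (i32) {
  call @body() : () -> ()
  scf.yield %init : i32
}
```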
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98503
Canonicalize the iter_args of an scf::ForOp that involve a tensor_load and
for which only the last loop iteration is actually visible outside of the
loop. The canonicalization looks for a pattern such as:
```
%t0 = ... : tensor_type
%0 = scf.for ... iter_args(%bb0 : %t0) -> (tensor_type) {
  ...
  // %m is either tensor_to_memref(%bb0) or defined above the loop
  %m... : memref_type
  ... // uses of %m with potential inplace updates
  %new_tensor = tensor_load %m : memref_type
  ...
  scf.yield %new_tensor : tensor_type
}
```
`%bb0` may have either 0 or 1 use. If it has 1 use it must be exactly a
`%m = tensor_to_memref %bb0` op that feeds into the yielded `tensor_load`
op.
If no aliasing write of `%new_tensor` occurs between the tensor_load
and the yield, then the value `%0` visible outside of the loop is the
last `tensor_load` produced in the loop.
For now, we approximate the absence of aliasing by only supporting the case
when the tensor_load is the operation immediately preceding the yield.
The canonicalization rewrites the pattern as:
```
// %m is either a tensor_to_memref or defined above
%m... : memref_type
scf.for ... { // no iter_args
  ... // uses of %m with potential inplace updates
}
%0 = tensor_load %m : memref_type
```
Differential Revision: https://reviews.llvm.org/D97953
This commit introduced a cyclic dependency:
the MemRef dialect depends on Standard because it uses ConstantIndexOp,
and Std depends on the MemRef dialect in its EDSC/Intrinsics.h.
Working on a fix.
This reverts commit 8aa6c3765b.
Create the memref dialect and move several dialect-specific ops that
have no dependencies on other ops from the std dialect to this new
dialect.
Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
DeallocOp -> MemRef_DeallocOp
MemRefCastOp -> MemRef_CastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
TransposeOp -> MemRef_TransposeOp
ViewOp -> MemRef_ViewOp
The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667
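In the IR, the split shows up as a dialect-prefix change on the moved
ops, for example (a sketch):
```
// Before the split (std):
%m = alloc() : memref<8xf32>
dealloc %m : memref<8xf32>
// After the split (memref dialect):
%m2 = memref.alloc() : memref<8xf32>
memref.dealloc %m2 : memref<8xf32>
```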
Differential Revision: https://reviews.llvm.org/D96425
Previously this might happen if there was no elseRegion and the method
was asked for all successor regions.
Differential Revision: https://reviews.llvm.org/D96764
We should check whether lb + step >= ub to determine whether this is a
single iteration; e.g., lb = 0, step = 4, ub = 4 executes exactly once.
Previously we were checking lb + lb >= ub.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95440
This better matches the rest of the infrastructure, is much simpler, and makes it easier to move these types to being declaratively specified.
Differential Revision: https://reviews.llvm.org/D93432
Given that OpState already implicitly converts to Operation*, this
seems reasonable. The alternative would be to add more functions to
OpState which forward to Operation.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D92266
- Address TODO in scf-bufferize: the argument materialization issue is
now fixed and the code is now in Transforms/Bufferize.cpp
- Tighten up finalizing-bufferize to avoid creating invalid IR when
operand types potentially change
- Tidy up the testing of func-bufferize, and move appropriate tests
to a new finalizing-bufferize.mlir
- The new stricter checking in finalizing-bufferize revealed that we
needed a DimOp conversion pattern (found when integrating into npcomp).
Previously, the conversion infrastructure was blindly changing the
operand type during finalization, which happened to work due to
DimOp's tensor/memref polymorphism, but is generally not encouraged
(the new pattern is the way to tell the conversion infrastructure that
it is legal to change that type).
An SCF 'for' loop does not iterate if its lower bound is equal to its
upper bound. Remove loops where both bounds are the same SSA value, as
such bounds are guaranteed to be equal. Similarly, remove 'parallel'
loops where at least one pair of respective lower/upper bounds is
specified by the same SSA value.
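A sketch of the trivially dead case (@body is a hypothetical callee):
```
// Both bounds are the same SSA value, so the loop runs zero times and
// can be removed.
scf.for %i = %n to %n step %c1 {
  call @body(%i) : (index) -> ()
}
```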
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D91880
Add canonicalization patterns to remove zero-iteration 'for' loops,
replace single-iteration 'for' loops with their bodies, remove
known-false conditionals with no 'else' branch, and replace
conditionals whose condition is a known value by the respective region.
Although similar transformations are performed at the CFG level, not
all flows reach that level; e.g., the GPU flow may want to remove
single-iteration loops before deciding on mapping loops to thread
dimensions.
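For instance, the single-iteration case (a sketch; @body is a
hypothetical callee):
```
%c0 = constant 0 : index
%c1 = constant 1 : index
// Exactly one iteration (0 + 1 >= 1), so the loop is replaced by its
// body with %i substituted by %c0.
scf.for %i = %c0 to %c1 step %c1 {
  call @body(%i) : (index) -> ()
}
// becomes:
call @body(%c0) : (index) -> ()
```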
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D91865
These includes have been deprecated in favor of BuiltinDialect.h, which contains the definitions of ModuleOp and FuncOp.
Differential Revision: https://reviews.llvm.org/D91572
scf.parallel is currently not a good fit for tiling on tensors.
Instead, provide a path to parallelism directly through scf.for. For
now, this transformation ignores the distribution scheme and always
does a block-cyclic mapping (where the block is the tile size), as
sketched below.
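Schematically, a block-cyclic mapping gives each processor the tiles
id, id + np, id + 2*np, ... (a sketch; %id, %np, and %ts are assumed to
be the processor id, processor count, and tile size):
```
// Offset the lower bound by this processor's first tile and stride by
// the combined tile footprint of all processors.
%plb = muli %id, %ts : index
%pstep = muli %np, %ts : index
scf.for %iv = %plb to %ub step %pstep {
  ... // process the tile starting at %iv
}
```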
Differential revision: https://reviews.llvm.org/D90475
The new construct represents a generic loop with two regions: one
executed before the loop condition is verified and another after it.
This construct can be used to express both a "while" loop and a
"do-while" loop, depending on where the main payload is located. It is
intended as an intermediate abstraction for lowering, which will be
added later. This form is relatively easy to target from higher-level
abstractions and supports transformations such as loop rotation and
LICM.
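A minimal sketch of a "while"-style use, with the payload in the
"after" region (@keep_going and @step are hypothetical helpers):
```
%res = scf.while (%arg1 = %init) : (i32) -> i32 {
  // "Before" region: decide whether to run another iteration.
  %cond = call @keep_going(%arg1) : (i32) -> i1
  scf.condition(%cond) %arg1 : i32
} do {
^bb0(%arg2: i32):
  // "After" region: the loop payload.
  %next = call @step(%arg2) : (i32) -> i32
  scf.yield %next : i32
}
```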
Differential Revision: https://reviews.llvm.org/D90255
Oftentimes the legality of inlining can change depending on whether the callable is going to be inlined in-place or cloned. For example, some operations are not allowed to be duplicated and can only be inlined if the original callable will cease to exist afterwards. The new `wouldBeCloned` flag allows dialects to hook into this when determining legality.
Differential Revision: https://reviews.llvm.org/D90360
This fixes a subtle issue, described in the comment starting with
"Clone the op without the regions and inline the regions from the old op",
which prevented this conversion from working on non-trivial examples.
Differential Revision: https://reviews.llvm.org/D90203
This class represents a rewrite pattern list that has been frozen and is thus immutable. This replaces the uses of OwningRewritePatternList in pattern-driver-related APIs, such as dialect conversion. When PDL becomes more prevalent, this API will allow for optimizing a set of patterns once, without the need to do so per run of a pass.
Differential Revision: https://reviews.llvm.org/D89104
A "structural" type conversion is one where the underlying ops are
completely agnostic to the actual types involved and simply need to update
their types. An example of this is scf.if -- the scf.if op and the
corresponding scf.yield ops need to update their types accordingly to the
TypeConverter, but otherwise don't care what type conversions are happening.
To test the structural type conversions, it is convenient to define a
bufferize pass for a dialect, which exercises them nicely.
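For instance, bufferizing an scf.if only requires the op and its yields
to switch types (a sketch; the yielded values are assumed to have been
converted by other patterns):
```
// Before: the scf.if yields a tensor.
%0 = scf.if %cond -> (tensor<?xf32>) {
  scf.yield %t : tensor<?xf32>
} else {
  scf.yield %u : tensor<?xf32>
}
// After the structural conversion: same structure, memref types.
%1 = scf.if %cond -> (memref<?xf32>) {
  scf.yield %tm : memref<?xf32>
} else {
  scf.yield %um : memref<?xf32>
}
```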
Differential Revision: https://reviews.llvm.org/D89757