Rewrite patterns may make modifications to the CFG, including dropping edges between blocks. This change adds a simple unreachable block elimination run at the end of each iteration to ensure that the CFG remains valid.
PiperOrigin-RevId: 277545805
In some cases, it may be desirable to mark entire regions of operations as legal. This provides an additional granularity of context to the concept of "legal". The `ConversionTarget` supports marking operations, that were previously added as `Legal` or `Dynamic`, as `recursively` legal. Recursive legality means that if an operation instance is legal, either statically or dynamically, all of the operations nested within are also considered legal. An operation can be marked via `markOpRecursivelyLegal<>`:
```c++
ConversionTarget &target = ...;
/// The operation must first be marked as `Legal` or `Dynamic`.
target.addLegalOp<MyOp>(...);
target.addDynamicallyLegalOp<MySecondOp>(...);
/// Mark the operation as always recursively legal.
target.markOpRecursivelyLegal<MyOp>();
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp, MySecondOp>([](Operation *op) { ... });
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp>([](MyOp op) { ... });
```
PiperOrigin-RevId: 277086382
This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore.
PiperOrigin-RevId: 276573038
This simplifies defining expected-* directives when there are multiple that apply to the next or previous line. @below applies the directive to the next non-designator line, i.e. the next line that does not contain an expected-* designator. @above applies to the previous non designator line.
Examples:
// Expect an error on the next line that does not contain a designator.
// expected-remark@below {{remark on function below}}
// expected-remark@below {{another remark on function below}}
func @bar(%a : f32)
// Expect an error on the previous line that does not contain a designator.
func @baz(%a : f32)
// expected-remark@above {{remark on function above}}
// expected-remark@above {{another remark on function above}}
PiperOrigin-RevId: 276369085
This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189.
PiperOrigin-RevId: 275543361
When dealing with regions, or other patterns that need to generate temporary operations, it is useful to be able to replace other operations than the root op being matched. Before this PR, these operations would still be considered for legalization meaning that the conversion would either fail, erroneously need to mark these ops as legal, or add unnecessary patterns.
PiperOrigin-RevId: 274598513
This will allow for inlining newly devirtualized calls, as well as give a more accurate cost model(when we have one). Currently canonicalization will only run for nodes that have no child edges, as the child nodes may be erased during canonicalization. We can support this in the future, but it requires more intricate deletion tracking.
PiperOrigin-RevId: 274011386
When an operation with regions gets replaced, we currently require that all of the remaining nested operations are still converted even though they are going to be replaced when the rewrite is finished. This cl adds a tracking for a minimal set of operations that are known to be "dead". This allows for ignoring the legalization of operations that are won't survive after conversion.
PiperOrigin-RevId: 274009003
variadic result.
Add missing test for single line fix to `void OpEmitter::genFolderDecls()`
entitled "Fold away reduction over 0 dimensions."
PiperOrigin-RevId: 273880337
This PR is a stepping stone towards supporting generic multi-store
source loop nests in affine loop fusion. It extends the algorithm to
support fusion of multi-store loop nests that:
1. have only one store that writes to a function-local live out, and
2. the remaining stores are involved in loop nest self dependences
or no dependences within the function.
Closestensorflow/mlir#162
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45
PiperOrigin-RevId: 273773907
This allows individual passes to define options structs and for these options to be parsed per instance of the pass while building the pass pipeline from the command line provided textual specification.
The user can specify these per-instance pipeline options like so:
```
struct MyPassOptions : public PassOptions<MyPassOptions> {
Option<int> exampleOption{*this, "flag-name", llvm:🆑:desc("...")};
List<int> exampleListOption{*this, "list-flag-name", llvm:🆑:desc("...")};
};
static PassRegistration<MyPass, MyPassOptions> pass("my-pass", "description");
```
PiperOrigin-RevId: 273650140
This is similar to the `inlineRegionBefore` hook, except the original blocks are unchanged. The region to be cloned *must* not have been modified during the conversion process at the point of cloning, i.e. it must belong an operation that has yet to be converted, or the operation that is currently being converted.
PiperOrigin-RevId: 273622533
- bodies would earlier appear in the order (i, i+3, i+2, i+1) instead of
(i, i+1, i+2, i+3) for example for factor 4.
- clean up hardcoded test cases
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#170
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/170 from bondhugula:ujam b66b405b2b1894a03b376952e32a9d0292042665
PiperOrigin-RevId: 273613131
Some dialects have implicit conversions inherent in their modeling, meaning that a call may have a different type that the type that the callable expects. To support this, a hook is added to the dialect interface that allows for materializing conversion operations during inlining when there is a mismatch. A hook is also added to the callable interface to allow for introspecting the expected result types.
PiperOrigin-RevId: 272814379
This allows for the inliner to work on arbitrary call operations. The updated inliner will also work bottom-up through the callgraph enabling support for multiple levels of inlining.
PiperOrigin-RevId: 272813876
- introduce splat op in standard dialect (currently for int/float/index input
type, output type can be vector or statically shaped tensor)
- implement LLVM lowering (when result type is 1-d vector)
- add constant folding hook for it
- while on Ops.cpp, fix some stale names
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#141
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/141 from bondhugula:splat 48976a6aa0a75be6d91187db6418de989e03eb51
PiperOrigin-RevId: 270965304
- fix store to load forwarding for a certain set of cases (where
forwarding shouldn't have happened); use AffineValueMap difference
based MemRefAccess equality checking; utility logic is also greatly
simplified
- add missing equality/inequality operators for AffineExpr ==/!= ints
- add == != operators on MemRefAccess
Closestensorflow/mlir#136
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/136 from bondhugula:store-load-forwarding d79fd1add8bcfbd9fa71d841a6a9905340dcd792
PiperOrigin-RevId: 270457011
- allow symbols in index remapping provided for memref replacement
- fix memref normalize crash on cases with layout maps with symbols
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Reported by: Alex Zinenko
Closestensorflow/mlir#139
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/139 from bondhugula:memref-rep-symbols 2f48c1fdb5d4c58915bbddbd9f07b18541819233
PiperOrigin-RevId: 269851182
- add canonicalization pattern to compose maps into affine loads/stores;
templatize the pattern and reuse it for affine.apply as well
- rename getIndices -> getMapOperands() (getIndices is confusing since
these are no longer the indices themselves but operands to the map
whose results are the indices). This also makes the accessor uniform
across affine.apply/load/store. Change arg names on the affine
load/store builder to avoid confusion. Drop an unused confusing build
method on AffineStoreOp.
- update incomplete doc comment for canonicalizeMapAndOperands (this was
missed from a previous update).
Addresses issue tensorflow/mlir#121
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#122
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/122 from bondhugula:compose-load-store e71de1771e56a85c4282c10cb43f30cef0701c4f
PiperOrigin-RevId: 269619540
When performing A->B->C conversion, an operation may still refer to an operand of A. This makes it necessary to unmap through multiple levels of replacement for a specific value.
PiperOrigin-RevId: 269367859
- take care of symbolic operands with alloc
- add missing check for compose map failure and a test case
- add test cases on strides
- drop incorrect check for one-to-one'ness
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#132
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/132 from bondhugula:normalize-memrefs 8aebf285fb0d7c19269d85255aed644657e327b7
PiperOrigin-RevId: 269105947
- NFC - on any pass/utility logic/output.
- Resolve TODO; the method building loop trip count maps was
creating and deleting affine.apply ops (transforming IR from under
analysis!, strictly speaking). Introduce AffineValueMap::difference to
do this correctly (without the need to create any IR).
- Move AffineApplyNormalizer out so that its methods are reusable from
AffineStructures.cpp; add a helper method 'normalize' to it. Fix
AffineApplyNormalize::renumberOneDim (Issue tensorflow/mlir#89).
- Trim includes on files touched.
- add test case on a scenario previously not covered
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#133
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/133 from bondhugula:trip-count-build 7fc34d857f7788f98b641792cafad6f5bd50e47b
PiperOrigin-RevId: 269101118
- add missing canonicalization pattern to fold memref_cast + dim to
dim (needed to propagate constant when folding a dynamic shape to
a static one)
- also fix an outdated/inconsistent comment in StandardOps/Ops.td
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#126
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/126 from bondhugula:quickfix 4566e75e49685c532faffff91d64c5d83d4da524
PiperOrigin-RevId: 269020058
This defines a set of initial utilities for inlining a region(or a FuncOp), and defines a simple inliner pass for testing purposes.
A new dialect interface is defined, DialectInlinerInterface, that allows for dialects to override hooks controlling inlining legality. The interface currently provides the following hooks, but these are just premilinary and should be changed/added to/modified as necessary:
* isLegalToInline
- Determine if a region can be inlined into one of this dialect, *or* if an operation of this dialect can be inlined into a given region.
* shouldAnalyzeRecursively
- Determine if an operation with regions should be analyzed recursively for legality. This allows for child operations to be closed off from the legality checks for operations like lambdas.
* handleTerminator
- Process a terminator that has been inlined.
This cl adds support for inlining StandardOps, but other dialects will be added in followups as necessary.
PiperOrigin-RevId: 267426759
- address remaining comments from PR tensorflow/mlir#87 for better test coverage for
pipeline-data-transfer/replaceAllMemRefUsesWith
- remove dead tag allocs the same way they are removed for the replaced buffers
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#106
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/106 from bondhugula:followup 9e868666d047e8d43e5f82f43e4093b838c710fa
PiperOrigin-RevId: 267144774
- introduce utility to convert memrefs with non-identity layout maps to
ones with identity layout maps: convert the type and rewrite/remap all
its uses
- add this utility to -simplify-affine-structures pass for testing
purposes
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#104
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/104 from bondhugula:memref-normalize f2c914aa1890e8860326c9e33f9aa160b3d65e6d
PiperOrigin-RevId: 266985317
- the [begin, end) range identified for copying could end in between the
block, which makes hoisting invalid in some cases. Change the range
identification to always end with end of block.
- add test case to exercise these (with fast mem capacity set to minimal so
that single element memref buffers are generated at the innermost loop)
- the location of begin/end of the block range for data copying was
being confused with the insert points for copy in and copy out code.
In cases, where we choose to hoist transfers, these are separate.
- when copy loops are single iteration ones, promote their bodies at
the end of the pass.
- change default fast mem space to 1 (setting it to zero made it
generate DMA op's that won't verify in the default case - since the
DMA ops have a check for src/dest memref spaces being different).
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Co-Authored-By: Mehdi Amini <joker.eph@gmail.com>
Closestensorflow/mlir#88
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/88 from bondhugula:datacopy 88697267c45e850c3ced87671e16e4a930c02a42
PiperOrigin-RevId: 266980911
This interface will allow for providing hooks to interrop with operation folding. The first hook, 'shouldMaterializeInto', will allow for controlling which region to insert materialized constants into. The folder will generally materialize constants into the top-level isolated region, this allows for materializing into a lower level ancestor region if it is more profitable/correct.
PiperOrigin-RevId: 266702972
- extend canonicalizeMapAndOperands to propagate constant operands into
the map's expressions (and thus drop those operands).
- canonicalizeMapAndOperands previously only dropped duplicate and
unused operands; however, operands that were constants were
retained.
This change makes IR maps/expressions generated by various
utilities/passes even simpler; also makes some of the test checks more
accurate and simpler -- for eg., 0' instead of symbol(%{{.*}}).
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#107
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/107 from bondhugula:canonicalize-maps c889a51486d14fbf7db489f224f881e7e1ff7d72
PiperOrigin-RevId: 266085289
Refactor replaceAllMemRefUsesWith to split it into two methods: the new
method does the replacement on a single op, and is used by the existing
one.
- make the methods return LogicalResult instead of bool
- Earlier, when replacement failed (due to non-deferencing uses of the
memref), the set of ops that had already been processed would have
been replaced leaving the IR in an inconsistent state. Now, a
pass is made over all ops to first check for non-deferencing
uses, and then replacement is performed. No test cases were affected
because all clients of this method were first checking for
non-deferencing uses before calling this method (for other reasons).
This isn't true for a use case in another upcoming PR (scalar
replacement); clients can now bail out with consistent IR on failure
of replaceAllMemRefUsesWith. Add test case.
- multiple deferencing uses of the same memref in a single op is
possible (we have no such use cases/scenarios), and this has always
remained unsupported. Add an assertion for this.
- minor fix to another test pipeline-data-transfer case.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#87
PiperOrigin-RevId: 265808183
This will allow for naming values the same as existing SSA values for regions attached to operations that are isolated from above. This fits in with how the system already allows separate name scopes for sibling regions. This name shadowing can be enabled in the custom parser of operations by setting the 'enableNameShadowing' flag to true when calling 'parseRegion'.
%arg = constant 10 : i32
foo.op {
%arg = constant 10 : i32
}
PiperOrigin-RevId: 264255999
All 'getValue' variants now require that the index is valid, queryable via 'isValidIndex'. 'getSplatValue' now requires that the attribute is a proper splat. This allows for querying these methods on DenseElementAttr with all possible value types; e.g. float, int, APInt, etc. This also allows for removing unnecessary conversions to Attribute that really want the underlying value.
PiperOrigin-RevId: 263437337
In the backward slice computation, BlockArgument coming from function arguments represent a natural boundary for the traversal and should not trigger llvm_unreachable.
This CL also improves the error message and adds a relevant test.
PiperOrigin-RevId: 260118630
Clipping creates non-affine memory accesses, use std_load and std_store instead of affine_load and affine_store.
In the future we may also want a fill with the neutral element rather than clip, this would make the accesses affine if we wanted more analyses and transformations to happen post lowering to pointwise copies.
PiperOrigin-RevId: 260110503
This mode analyzes which operations are legalizable to the given target if a conversion were to be applied, i.e. no rewrites are ever performed even on success. This mode is useful for device partitioning or other utilities that may want to analyze the effect of conversion to different targets before performing it.
The analysis method currently just fills a provided set with the operations that were found to be legalizable. This can be extended in the future to capture more information as necessary.
PiperOrigin-RevId: 259987105
This CL fixes an oversight with dealing with loops in slicing analysis.
The forward slice computation properly propagates through loops but not the backward slice.
Add relevant unit tests.
PiperOrigin-RevId: 259903396
The loop parallelism detection utility only collects the affine.load and
affine.store operations appearing inside the loop to analyze the access
patterns for the absence of dependences. However, any operation, including
unregistered operations, can appear in a body of an affine loop. If such
operation has side effects, the result of parallelism analysis is incorrect.
Conservatively assume affine loops are not parallel in presence of operations
other than affine.load, affine.store, affine.for, affine.terminator that may
have side effects.
This required to update the loop-fusion unit test that relies on parallelism
analysis and was exercising loop fusion in presence of an unregistered
operation.
PiperOrigin-RevId: 259560935
- introduce parseRegionArgumentList (similar to parseOperandList) to parse a
list of region arguments with a delimiter
- allows defining custom parse for op's with multiple/variadic number of
region arguments
- use this on the gpu.launch op (although the latter has a fixed number
of region arguments)
- add a test dialect op to test region argument list parsing (with the
no delimiter case)
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Closestensorflow/mlir#40
PiperOrigin-RevId: 259442536