llvm-project

Commit Graph

Author	SHA1	Message	Date
Chia-hung Duan	2afd16fe72	[mlir] Enable MLIRDialectUtilsTests Also remove `TooFewDims` test which tried to create an invalid AffineMap. The creation of an invalid AffineMap is rejected by `willBeValidAffineMap`, as a result we can deprecate the test. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D114657	2021-11-27 22:36:43 +00:00
Stanislav Funiak	a19e163526	Fixed broken build under GCC 5.4. This diff fixes broken build caused by D108550. Under GCC 5, auto lambdas that capture this require `this->` for member calls. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D114659	2021-11-27 09:03:27 +05:30
Kazu Hirata	803cec0268	[mlir] Fix a warning This patch fixes: mlir/lib/IR/MLIRContext.cpp:1020:3: error: use of the 'nodiscard' attribute is a C++17 extension [-Werror,-Wc++17-extensions]	2021-11-26 12:27:11 -08:00
Arnab Dutta	c2280b5517	[MLIR] Avoid creation of buggy affine maps when incorrect values of number of dimensions and number of symbols are provided. We check whether the maximum index of dimensional identifier present in the result expressions is less than dimCount (number of dimensional identifiers) argument passed in the AffineMap::get() and the maximum index of symbolic identifier present in the result expressions is less than symbolCount (number of symbolic identifiers) argument passed in AffineMap::get(). Reviewed By: nicolasvasilache, bondhugula Differential Revision: https://reviews.llvm.org/D114238	2021-11-27 00:37:08 +05:30
Arnab Dutta	e4e4da86af	[MLIR] Prevent creation of buggy affine map after linearizing collapsed dimensions of source map Initially we were passing the wrong numSymbols argument while calling AffineMap::get() for creating the affine map with linearized result expressions. The main problem was that the number of symbols of the newly to be created map may be different from that of the source map, as new symbolic identifiers may be introduced while creating strided layout linearized expressions. Reviewed By: nicolasvasilache, bondhugula Differential Revision: https://reviews.llvm.org/D114240	2021-11-27 00:32:58 +05:30
Chris Jones	344eee6f38	[MLIR] Allow `Idempotent` trait to be applied to binary ops. Add `Idempotent` trait to `arith.{andi,ori}`. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114574	2021-11-26 18:22:49 +00:00
Michal Terepeta	7e65fc9a60	[mlir][Vector] Support 0-D vectors in `BroadcastOp` This changes the op to produce `AnyVectorOfAnyRank` following mostly the code for 1-D vectors. Depends On D114598 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114550	2021-11-26 17:17:18 +00:00
Michal Terepeta	d0f927121e	[mlir][Standard] Support 0-D vectors in `SplatOp` This changes the op to produce `AnyVectorOfAnyRank` and implements this by just inserting the element (skipping the shuffle that we do for the 1-D case). Depends On D114549 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114598	2021-11-26 17:05:15 +00:00
Arjun P	ad34ce94d5	[MLIR] Simplex: fix a bug when rolling back a Simplex with no solutions Previously, when adding a constraint to a Simplex that is already marked as having no solutions (marked empty), the Simplex would be marked empty again, and a second UnmarkEmpty entry would be pushed to the undo log. When rolling back, Simplex should be unmarked empty only after rolling back past the creation of the first constraint that made it empty. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D114613	2021-11-26 22:33:48 +05:30
Arjun P	f074bbb04a	[MLIR] Simplex::pivot: also update the redundant rows when pivoting Previously, the pivot function would only update the non-redundant rows when pivoting. This is incorrect because in some cases, when rolling back past a `detectRedundant` call, the basis being used could be different from that which was used at the time of returning from the `detectRedundant` call. Therefore, it is important to update the redundant rows as well during pivots. This could also be triggered by pivots that occur when testing successive constraints for being redundant in `detectRedundant` after some initial constraints are marked redundant. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D114614	2021-11-26 21:42:41 +05:30
Mats Petersson	30238c3676	[mlir][OpenMP] Add support for SIMD modifier Add support for SIMD modifier in OpenMP worksharing loops. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D111051	2021-11-26 14:04:46 +00:00
Matthias Springer	b62b21b980	[mlir][linalg][bufferize][NFC] InsertSliceOp no-copy detection as PostAnalysis There is special logic for InsertSliceOp to check if a memcpy is needed. This change extracts that piece of code and makes it a PostAnalysisStep. The purpose of this change is to untangle `bufferize` from BufferizationAliasInfo. (Not fully there yet.) Differential Revision: https://reviews.llvm.org/D114513	2021-11-26 22:19:29 +09:00
Benjamin Kramer	8521850f20	Provide a definition for OperationPosition::kDown This isn't necessary in C++17, but C++14 still requires it.	2021-11-26 14:11:59 +01:00
Benjamin Kramer	1b0312d280	[PDL] fix unused variable warning in Release builds	2021-11-26 14:11:58 +01:00
Stanislav Funiak	d35f119094	Added line numbers to the debug output of PDL bytecode. This is a small diff that splits out the debug output for PDL bytecode. When running bytecode with debug output on, it is useful to know the line numbers where the PDLInterp operations are performed. Usually, these are in a single MLIR file, so it's sufficient to print out the line number rather than the entire location (which tends to be quite verbose). This debug output is gated by `LLVM_DEBUG` rather than `#ifndef NDEBUG` to make it easier to test. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D114061	2021-11-26 18:11:37 +05:30
Stanislav Funiak	a76ee58f3c	Multi-root PDL matching using upward traversals. This is commit 4 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). This PR integrates the various components (root ordering algorithm, nondeterministic execution of PDL bytecode) to implement multi-root PDL matching. The main idea is for the pattern to specify multiple candidate roots. The PDL-to-PDLInterp lowering selects one of these roots and "hangs" the pattern from this root, traversing the edges downwards (from operation to its operands) when possible and upwards (from values to its uses) when needed. The root is selected by invoking the optimal matching multiple times, once for each candidate root, and the connectors are determined from the optimal matching. The costs in the directed graph are equal to the number of upward edges that need to be traversed when connecting the given two candidate roots. It can be shown that, for this choice of the cost function, "hanging" the pattern from an inner node is no better than from the optimal root. The following four main additions were implemented as a part of this PR: 1. OperationPos predicate has been extended to allow tracing the operation accepting a value (the opposite of operation defining a value). 2. Predicate checking if two values are not equal - this is useful to ensure that we do not traverse the edge back downwards after we traversed it upwards. 3. Function for building the cost graph among the candidate roots. 4. Updated buildPredicateList, building the predicates after the optimal branching has been determined. Testing: unit tests (an integration test to follow once the stack of commits has landed) Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108550	2021-11-26 18:11:37 +05:30
Stanislav Funiak	6df7cc7f47	Implementation of the root ordering algorithm This is commit 3 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). We form a graph over the specified roots, provided in `pdl.rewrite`, where two roots are connected by a directed edge if the target root can be connected (via a chain of operations) in the underlying pattern to the source root. We place a restriction that the path connecting the two candidate roots must only contain the nodes in the subgraphs underneath these two roots. The cost of an edge is the smallest number of upward traversals (edges) required to go from the source to the target root, and the connector is a `Value` in the intersection of the two subtrees rooted at the source and target root that results in that smallest number of such upward traversals. Optimal root ordering is then formulated as the problem of finding a spanning arborescence (i.e., a directed spanning tree) of minimal weight. In order to determine the spanning arborescence (directed spanning tree) of minimum weight, we use [Edmonds' algorithm](https://en.wikipedia.org/wiki/Edmonds%27_algorithm). The worst-case computational complexity of this algorithm is O(_N_^3) for a single root, where _N_ is the number of specified roots. The `pdl`-to-`pdl_interp` lowering calls this algorithm as a subroutine _N_ times (once for each candidate root), so the overall complexity of root ordering is O(_N_^4). If needed, this complexity could be reduced to O(_N_^3) with a more efficient algorithm. However, note that the underlying implementation is very efficient, and _N_ in our instances tends to be very small (<10). Therefore, we believe that the proposed (asymptotically suboptimal) implementation will suffice for now. Testing: a unit test of the algorithm Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108549	2021-11-26 18:11:37 +05:30
Stanislav Funiak	3eb1647af0	Introduced iterative bytecode execution. This is commit 2 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). This commit implements the features needed for the execution of the new operations pdl_interp.get_accepting_ops, pdl_interp.choose_op: 1. The implementation of the generation and execution of the two ops. 2. The addition of Stack of bytecode positions within the ByteCodeExecutor. This is needed because in pdl_interp.choose_op, we iterate over the values returned by pdl_interp.get_accepting_ops until we reach finalize. When we reach finalize, we need to return back to the position marked in the stack. 3. The functionality to extend the lifetime of values that cross the nondeterministic choice. The existing bytecode generator allocates the values to memory positions by representing the liveness of values as a collection of disjoint intervals over the matcher positions. This is akin to register allocation, and substantially reduces the footprint of the bytecode executor. However, because execution "returns" back with the iterative operation pdl_interp.choose_op, any values whose original liveness crosses the nondeterministic choice must have their lifetime extended until finalize. Testing: pdl-bytecode.mlir test Reviewed By: rriddle, Mogball Differential Revision: https://reviews.llvm.org/D108547	2021-11-26 18:11:37 +05:30
Stanislav Funiak	842b6861c0	Defines new PDLInterp operations needed for multi-root matching in PDL. This is commit 1 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). These operations are: * pdl_interp.get_accepting_ops: Returns a list of operations accepting the given value or a range of values at the specified position. Thus if there are two operations `%op1 = "foo"(%val)` and `%op2 = "bar"(%val)` accepting a value at position 0, `%ops = pdl_interp.get_accepting_ops of %val : !pdl.value at 0` will return both of them. This allows us to traverse upwards from a value to operations accepting the value. * pdl_interp.choose_op: Iteratively chooses one operation from a range of operations. Therefore, writing `%op = pdl_interp.choose_op from %ops` in the example above will select either `%op1` or `%op2`. Testing: Added the corresponding test cases to mlir/test/Dialect/PDLInterp/ops.mlir. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108543	2021-11-26 17:59:22 +05:30
Tobias Gysi	8d07ba817c	[mlir][linalg] Simplify the hoist padding tests. Use primarily matvec instead of matmul to test hoist padding. Test the hoisting only starting from already padded IR. Use one-dimensional tiling only except for the tile_and_fuse test that exercises hoisting on a larger loop nest with fill and pad tensor operations in the backward slice. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114608	2021-11-26 07:40:22 +00:00
Michal Terepeta	c47108c041	[mlir][Vector] Minor formatting fixes in Vector.md Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D113854	2021-11-26 07:16:07 +00:00
Matthias Springer	8e2214aa60	[mlir][linalg][bufferize][NFC] Pass BufferizationState to PostAnalysisStep Pass BufferizationState instead of BufferizationAliasInfo. Note: BufferizationState contains BufferizationAliasInfo. Differential Revision: https://reviews.llvm.org/D114512	2021-11-26 11:46:14 +09:00
Matthias Springer	d62b4b08af	[mlir][linalg][bufferize] Compose dialect-specific bufferization state Use composition instead of inheritance for storing dialect-specific bufferization state. This is in preparation of adding "tensor dialect"-specific bufferization state. Differential Revision: https://reviews.llvm.org/D114508	2021-11-26 11:35:45 +09:00
Matthias Springer	c94b80b438	[mlir][linalg][bufferize][NFC] Allow returning arbitrary memrefs If `allowReturnMemref` is set to true, arbitrary memrefs may be returned from FuncOps. Also remove allocation hoisting code, which is only partly implemented at the moment. The purpose of this commit is to untangle `bufferize` from `aliasInfo`. (Even with this change, they are not fully untangled yet.) Differential Revision: https://reviews.llvm.org/D114507	2021-11-26 11:26:46 +09:00
Matthias Springer	c637e3ea9e	[mlir][linalg][bufferize][NFC] Extract func boundary bufferization Bufferization of function boundaries is extracted from ComprehensiveBufferize into a separate file. This will become its own build target in the future. Differential Revision: https://reviews.llvm.org/D114226	2021-11-26 10:25:36 +09:00
Matthias Springer	f32c3d9528	[mlir][linalg][bufferize][NFC] Move Affine interface impl to new build target This makes ComprehensiveBufferize entirely independent of the Affine dialect. Differential Revision: https://reviews.llvm.org/D114222	2021-11-26 09:27:47 +09:00
Mehdi Amini	850e8b4504	Fix link to the other docs from the Bufferization dialect	2021-11-26 00:13:32 +00:00
Michal Terepeta	cc311a155a	[mlir][Vector] Support 0-D vectors in `VectorPrintOpConversion` Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114549	2021-11-25 20:12:18 +00:00
Uday Bondhugula	c89fc1eec3	[MLIR] NFC. Rename MLIR CAPI ExecutionEngine target for consistency Rename MLIR CAPI ExecutionEngine target for consistency: MLIRCEXECUTIONENGINE -> MLIRCAPIExecutionEngine in line with other targets. Differential Revision: https://reviews.llvm.org/D114596	2021-11-26 00:23:17 +05:30
Tres Popp	6eca1957ee	Don't store nullptrs in mlir::FuncOp::getAll*Attrs' result These methods for results and arguments would create an ArrayRef full of nullptrs when there were no argument attributes. This is problematic because this result could not be passed to the FuncOp::build creator without causing a segfault. Now the list will have empty attributes. Differential Revision: https://reviews.llvm.org/D114358	2021-11-25 15:12:29 +01:00
seongwon bang	35c1e6ac1a	[MLIR] [docs] Fix misguided examples in memref.subview operation. The examples in `memref.subview` operation are misguided in that subview's strides operands mean "memref-rank number of strides that compose multiplicatively with the base memref strides in each dimension.". So the below examples should be changed from `Strides: [64, 4, 1]` to `Strides: [1, 1, 1]` Before changes ``` // Subview with constant offsets, sizes and strides. %1 = memref.subview %0[0, 2, 0][4, 4, 4][64, 4, 1] : memref<8x16x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2)> to memref<4x4x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2 + 8)> ``` After changes ``` // Subview with constant offsets, sizes and strides. %1 = memref.subview %0[0, 2, 0][4, 4, 4][1, 1, 1] : memref<8x16x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2)> to memref<4x4x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2 + 8)> ``` Also I fixed some syntax issues in docs related with memref layout map and added detailed explanation in subview rank reducing case. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D114500	2021-11-25 21:24:10 +09:00
Alexander Belyaev	57470abc41	[mlir] Move memref.[tensor_load\|buffer_cast\|clone] to "bufferization" dialect. https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712 Differential Revision: https://reviews.llvm.org/D114552	2021-11-25 11:50:39 +01:00
Tobias Gysi	43dc6d5d57	[mlir][linalg] Cleanup hoisting test (NFC). Rename the check prefixes to HOIST21 and HOIST32 to clarify the different flag configurations. Depends On D114438 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114442	2021-11-25 10:42:24 +00:00
Tobias Gysi	4b03906346	[mlir][linalg] Perform checks early in hoist padding. Instead of checking for unexpected operations (any operation with a region except for scf::For and `padTensorOp` or operations with a memory effect) while cloning the packing loop nest perform the checks early. Update `dropNonIndexDependencies` to check for unexpected operations. Additionally, check all of these operations have index type operands only. Depends On D114428 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114438	2021-11-25 10:37:12 +00:00
Tobias Gysi	fd723eaa92	[mlir][linalg] Limit hoist padding to constant paddings. Limit hoist padding to pad tensor ops that depend only on a constant value. Supporting arbitrary padding values that depend on computations that are part of the backward slice to hoist requires complex analysis to ensure the computation can be hoisted. Depends On D114420 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114428	2021-11-25 10:31:39 +00:00
Tobias Gysi	ed7c1fb9b0	[mlir][linalg] Add backward slice filtering in hoist padding. Adapt hoist padding to filter the backward slice before cloning the packing loop nest. The filtering removes all operations that are not used to index the hoisted pad tensor op and its extract slice op. The filtering is needed to support the more complex loop nests created after fusion. For example, fusing the producer of an output operand can add linalg ops and pad tensor ops to the backward slice. These operations have regions and currently prevent hoisting. The following example demonstrates the effect of the newly introduced `dropNonIndexDependencies` method that filters the backward slice: ``` %source = linalg.fill(%cst, %arg0) scf.for %i %unrelated = linalg.fill(%cst, %arg1) // not used to index %source! scf.for %j (%arg2 = %unrelated) scf.for %k // not used to index %source! %ubi = affine.min #map(%i) %ubj = affine.min #map(%j) %slice = tensor.extract_slice %source [%i, %j] [%ubi, %ubj] %padded_slice = linalg.pad_tensor %slice ``` dropNonIndexDependencies(%padded_slice, %slice) removes [scf.for %k, linalg.fill(%cst, %arg1)] from backwardSlice. Depends On D114175 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114420	2021-11-25 10:30:10 +00:00
Matthias Springer	48107eaa07	[mlir][linalg][bufferize][NFC] Move SCF interface impl to new build target This makes ComprehensiveBufferize entirely independent of the SCF dialect. Differential Revision: https://reviews.llvm.org/D114221	2021-11-25 19:00:17 +09:00
Alexander Belyaev	3c228573bc	Revert "[mlir][SCF] Further simplify affine maps during `for-loop-canonicalization`" This reverts commit `ee1bf18672`. It breaks IREE lowering. Reverting the commit for now while we investigate what's going on.	2021-11-25 10:54:52 +01:00
Butygin	8dae0b6b6c	[mlir][spirv] arith::RemSIOp OpenCL lowering Differential Revision: https://reviews.llvm.org/D114524	2021-11-25 12:44:06 +03:00
Butygin	467acf3b6b	[mlir][spirv] Float atomics should not imply Shader Differential Revision: https://reviews.llvm.org/D114551	2021-11-25 12:07:28 +03:00
Matthias Springer	a5c2f78287	[mlir][interfaces] Add insideMutuallyExclusiveRegions helper Add a helper function to ControlFlowInterfaces for checking if two ops are in mutually exclusive regions according to RegionBranchOpInterface. Utilize this new helper in Linalg ComprehensiveBufferize. This makes the analysis independent of the SCF dialect and generalizes it to other ops that implement RegionBranchOpInterface. Differential Revision: https://reviews.llvm.org/D114220	2021-11-25 17:44:39 +09:00
Uday Bondhugula	25d173499e	[MLIR] Rename test/python/dialects/math.py -> math_dialect.py Rename test/python/dialects/math.py -> math_dialect.py to avoid a collision with a Python standard package of the same name. These test scripts are run by path and are not part of a package. Python apparently implicitly adds the containing directory to its PYTHONPATH. As such, test scripts with common names run the risk of conflicting with global names and resolution of an import for the latter happens to the former. Differential Revision: https://reviews.llvm.org/D114568	2021-11-25 09:51:49 +05:30
Matthias Springer	ee1bf18672	[mlir][SCF] Further simplify affine maps during `for-loop-canonicalization` * Implement `FlatAffineConstraints::getConstantBound(EQ)`. * Inject a simpler constraint for loops that have at most 1 iteration. * Taking into account constant EQ bounds of FlatAffineConstraint dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`. Differential Revision: https://reviews.llvm.org/D114138	2021-11-25 12:44:19 +09:00
Matthias Springer	8a8c655fe7	[mlir][SCF] Fix off-by-one bug in affine analysis This change is NFC. There were two issues when passing/reading upper bounds into/from FlatAffineConstraints that negate each other, so the bug was not apparent. However, it made debugging harder because some constraints in the FlatAffineConstraints were off by one when dumping all constraints. Differential Revision: https://reviews.llvm.org/D114137	2021-11-25 12:37:02 +09:00
Uday Bondhugula	23d505571d	[NFC] Improve debug message in getAsIntegerSet Improve debug message in getAsIntegerSet. Add missing trailing new line and position info. Differential Revision: https://reviews.llvm.org/D114511	2021-11-25 08:50:21 +05:30
Matthias Springer	d3bb4fec2a	[mlir][linalg][bufferize][NFC] Move arith interface impl to new build target This makes ComprehensiveBufferize entirely independent of the arith dialect. Differential Revision: https://reviews.llvm.org/D114219	2021-11-25 10:21:02 +09:00
bakhtiyar	7bd87a03fd	Promote readability by factoring out creation of min/max operation. Remove unnecessary divisions. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D110680	2021-11-24 16:17:23 -08:00
Lei Zhang	cb395f66ac	[mlir][spirv] Change the return type for {Min\|Max}VersionBase For synthesizing an op's implementation of the generated interface from {Min\|Max}Version, we need to define an `initializer` and `mergeAction`. The `initializer` specifies the initial version, and `mergeAction` specifies how version specifications from different parts of the op should be merged to generate the final version requirement. Previously we used the specified version enum as the type for both the initializer and thus the final return type. This means we need to perform `static_cast` over some hopefully not used number (`~0u`) as the initializer. This is quite opaque and sort of not guaranteed to work. Also, there are ops that have an enum attribute where some values declare version requirements (e.g., enumerant `B` requires v1.1+) but some do not (e.g., enumerant `A` requires nothing). Then a concrete op instance with `A` will still declare it implements the version interface (because interface implementation is static for an op) but actually there are no requirements for version. So this commit changes to use a more explicit `llvm::Optional` to wrap around the returned version enum. This should make it more clear. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D108312	2021-11-24 17:33:01 -05:00
Tobias Gysi	b6e7b1be73	[mlir][linalg] Simplify padding test (NFC). The padding tests previously contained the tile loops. This revision removes the tile loops since padding itself does not consider the loops. Instead the induction variables are passed in as function arguments which promotes them to symbols in the affine expressions. Note that the pad-and-hoist.mlir test still exercises padding in the context of the full loop nest. Depends On D114175 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114227	2021-11-24 19:21:50 +00:00
Tobias Gysi	86f186efea	[mlir][linalg] Add makeComposedPadHighOp. Add the makeComposedPadHighOp method which creates a new PadTensorOp if necessary. If the source to pad is actually the result of a sequence of padded LinalgOps, the method checks if padding is needed or if we can use the padded result of the padded LinalgOp sequence directly. Example: ``` %0 = tensor.extract_slice %arg0 [%iv0, %iv1] [%sz0, %sz1] %1 = linalg.pad_tensor %0 low[0, 0] high[...] { linalg.yield %cst } %2 = linalg.matmul ins(...) outs(%1) %3 = tensor.extract_slice %2 [0, 0] [%sz0, %sz1] ``` when padding %3 return %2 instead of introducing ``` %4 = linalg.pad_tensor %3 low[0, 0] high[...] { linalg.yield %cst } ``` Depends On D114161 Reviewed By: nicolasvasilache, pifon2a Differential Revision: https://reviews.llvm.org/D114175	2021-11-24 19:18:59 +00:00
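
Commit 35c1e6ac1a in the table above notes that the stride operands of `memref.subview` compose multiplicatively with the strides of the base memref. The arithmetic can be sketched in Python as follows. This is a minimal illustration and not MLIR code: the function name `compose_subview_layout` and its list-based layout representation are assumptions made for this example.

```python
def compose_subview_layout(base_strides, base_offset, offsets, sub_strides):
    """Return (result_strides, result_offset) of a strided subview.

    Hypothetical helper for illustration only; it mirrors the rule quoted in
    the commit message, not any actual MLIR API.
    """
    if not (len(base_strides) == len(offsets) == len(sub_strides)):
        raise ValueError("need one offset and one stride per memref dimension")
    # Subview strides multiply the base strides dimension by dimension.
    result_strides = [b * s for b, s in zip(base_strides, sub_strides)]
    # Each subview offset is scaled by the base stride of its dimension.
    result_offset = base_offset + sum(o * b for o, b in zip(offsets, base_strides))
    return result_strides, result_offset


# Base layout from the commit message: d0 * 64 + d1 * 4 + d2.
print(compose_subview_layout([64, 4, 1], 0, [0, 2, 0], [1, 1, 1]))
# -> ([64, 4, 1], 8), i.e. d0 * 64 + d1 * 4 + d2 + 8
```

With the old stride operands `[64, 4, 1]`, the same function returns strides `[4096, 16, 1]`, which does not match the result type shown in the documentation example.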

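Commit c2280b5517 in the table above describes the check that guards `AffineMap::get()`: the largest dimension index used in the result expressions must be less than `dimCount`, and the largest symbol index must be less than `symbolCount`. The sketch below shows that check on a toy expression tree. The tuple encoding and the names `max_positions` and `will_be_valid_map` are assumptions for illustration and do not reproduce the C++ implementation.

```python
# Toy affine expressions (hypothetical encoding):
#   ("dim", i), ("sym", i), ("const", c), ("add", lhs, rhs), ("mul", lhs, rhs)

def max_positions(expr):
    """Return (max dim index, max symbol index) used in expr; -1 means none."""
    kind = expr[0]
    if kind == "dim":
        return expr[1], -1
    if kind == "sym":
        return -1, expr[1]
    if kind == "const":
        return -1, -1
    # Binary node: recurse into both operands.
    left_dim, left_sym = max_positions(expr[1])
    right_dim, right_sym = max_positions(expr[2])
    return max(left_dim, right_dim), max(left_sym, right_sym)


def will_be_valid_map(dim_count, symbol_count, results):
    """True if every identifier used in results is within the declared counts."""
    max_dim = max_sym = -1
    for expr in results:
        d, s = max_positions(expr)
        max_dim, max_sym = max(max_dim, d), max(max_sym, s)
    return max_dim < dim_count and max_sym < symbol_count


expr = ("add", ("dim", 0), ("sym", 1))       # d0 + s1
print(will_be_valid_map(1, 2, [expr]))       # True
print(will_be_valid_map(1, 1, [expr]))       # False: s1 needs symbolCount >= 2
```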

9341 Commits
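
Commit 6df7cc7f47 in the table above formulates root ordering as finding a minimum-weight spanning arborescence with Edmonds' algorithm, invoked once per candidate root. The following Python sketch shows the general technique on a plain edge list. It is a textbook-style contraction implementation written for this example, not the code from the commit, and the names `min_arborescence_cost` and `best_root` are hypothetical.

```python
INF = float("inf")


def min_arborescence_cost(n, root, edges):
    """Weight of a minimum spanning arborescence rooted at root, or None.

    edges is a list of (source, target, weight) over nodes 0..n-1.
    """
    total = 0
    while True:
        # Pick the cheapest incoming edge for every node.
        min_in = [INF] * n
        pre = [-1] * n
        for u, v, w in edges:
            if u != v and w < min_in[v]:
                min_in[v], pre[v] = w, u
        if any(v != root and min_in[v] == INF for v in range(n)):
            return None  # Some node cannot be reached from the root.
        min_in[root] = 0
        # Walk the chosen edges backwards to detect cycles.
        comp = [-1] * n
        visited = [-1] * n
        count = 0
        for v in range(n):
            total += min_in[v]
            u = v
            while visited[u] != v and comp[u] == -1 and u != root:
                visited[u] = v
                u = pre[u]
            if u != root and comp[u] == -1:
                # u lies on a new cycle: give all its nodes one component id.
                x = pre[u]
                while x != u:
                    comp[x] = count
                    x = pre[x]
                comp[u] = count
                count += 1
        if count == 0:
            return total  # The chosen edges already form an arborescence.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        # Contract cycles and reduce weights by the edge already paid for.
        edges = [(comp[u], comp[v], w - min_in[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        root, n = comp[root], count


def best_root(n, edges):
    """Try every candidate root and return (root, cost) with the lowest cost."""
    best = None
    for root in range(n):
        cost = min_arborescence_cost(n, root, edges)
        if cost is not None and (best is None or cost < best[1]):
            best = (root, cost)
    return best


edges = [(0, 1, 0), (1, 0, 1), (1, 2, 0), (2, 1, 2)]
print(best_root(3, edges))  # (0, 0)
```

In the commit's setting the edge weights would be the number of upward traversals between two candidate roots; here they are arbitrary integers chosen for the example.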