llvm-project

Commit Graph

Author	SHA1	Message	Date
Morten Borup Petersen	032cb1650f	[MLIR][SCF] Add for-to-while loop transformation pass This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and induction variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop. Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable. This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable). Differential Revision: https://reviews.llvm.org/D108454	2021-09-21 09:09:54 +01:00
Mehdi Amini	5edd79fc97	Revert "[MLIR][SCF] Add for-to-while loop transformation pass" This reverts commit `644b55d57e`. The added test is failing the bots.	2021-09-20 17:21:59 +00:00
Morten Borup Petersen	644b55d57e	[MLIR][SCF] Add for-to-while loop transformation pass This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and induction variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop. Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable. This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable). Differential Revision: https://reviews.llvm.org/D108454	2021-09-20 16:57:50 +01:00
Matthias Springer	0f3544d185	[mlir][scf] Loop peeling: Use scf.for for partial iteration Generate an scf.for instead of an scf.if for the partial iteration. This is for consistency reasons: The peeling of linalg.tiled_loop also uses another loop for the partial iteration. Note: Canonicalization patterns may rewrite partial iterations to scf.if afterwards. Differential Revision: https://reviews.llvm.org/D109568	2021-09-10 19:07:09 +09:00
Matthias Springer	c7d569b8f7	[mlir][scf] Fold dim(scf.for) to dim(iter_arg) Fold dim ops of scf.for results to dim ops of the respective iter args if the loop is shape preserving. Differential Revision: https://reviews.llvm.org/D109430	2021-09-09 13:47:13 +09:00
Matthias Springer	4fa6c2734c	[mlir][scf] Allow runtime type of iter_args to change The limitation on iter_args introduced with D108806 is too restrictive. Changes of the runtime type should be allowed. Extends the dim op canonicalization with a simple analysis to determine when it is safe to canonicalize. Differential Revision: https://reviews.llvm.org/D109125	2021-09-03 10:03:05 +09:00
Matthias Springer	d18ffd61d4	[mlir][SCF] Canonicalize dim(x) where x is an iter_arg * Add `DimOfIterArgFolder`. * Move existing cross-dialect canonicalization patterns to `LoopCanonicalization.cpp`. * Rename `SCFAffineOpCanonicalization` pass to `SCFForLoopCanonicalization`. * Expand documentation of scf.for: The type of loop-carried variables may not change with iterations. (Not even the dynamic type.) Differential Revision: https://reviews.llvm.org/D108806	2021-08-30 01:39:56 +00:00
Matthias Springer	a9cff97f94	[mlir][SCF] Generalize AffineMinSCFCanonicalization to min/max ops * Add support for affine.max ops to SCF loop peeling pattern. * Add support for affine.max ops to `AffineMinSCFCanonicalizationPattern`. * Rename `AffineMinSCFCanonicalizationPattern` to `AffineOpSCFCanonicalizationPattern`. * Rename `AffineMinSCFCanonicalization` pass to `SCFAffineOpCanonicalization`. Differential Revision: https://reviews.llvm.org/D108009	2021-08-25 10:40:34 +09:00
Matthias Springer	2de2dbef2a	[mlir][linalg] Replace AffineMinSCFCanonicalizationPattern with SCF reimplementation Use the new canonicalization pattern in the SCF dialect. Differential Revision: https://reviews.llvm.org/D107732	2021-08-25 08:52:56 +09:00
Matthias Springer	98aa694d0d	[mlir][scf] Add general affine.min canonicalization pattern This canonicalization simplifies affine.min operations inside "for loop"-like operations (e.g., scf.for and scf.parallel) based on two invariants: * iv >= lb * iv < lb + step * ((ub - lb - 1) floorDiv step) + 1 This commit adds a new pass `canonicalize-scf-affine-min` (instead of being a canonicalization pattern) to avoid dependencies between the Affine dialect and the SCF dialect. Differential Revision: https://reviews.llvm.org/D107731	2021-08-25 07:32:30 +09:00
Matthias Springer	ebf35370ff	[mlir][tensor] Insert explicit tensor.cast ops for insert_slice src If additional static type information can be deduced from an insert_slice's size operands, insert an explicit cast of the op's source operand. This enables other canonicalization patterns that are matching for tensor_cast ops such as `ForOpTensorCastFolder` in SCF. Differential Revision: https://reviews.llvm.org/D108617	2021-08-24 19:45:04 +09:00
Matthias Springer	bc194a5bb5	[mlir][SCF] Do not peel loops inside partial iterations Do not apply loop peeling to loops that are contained in the partial iteration of an already peeled loop. This is to avoid code explosion when dealing with large loop nests. Can be controlled with a new pass option `skip-partial`. Differential Revision: https://reviews.llvm.org/D108542	2021-08-23 21:35:46 +09:00
Morten Borup Petersen	6c1436a9b0	[MLIR][SCF] Parenthesize multiple return types in scf.execute_region asm op Previously, ExecuteRegionOps with multiple return values would fail a round-trip test due to missing parenthesis around the types. Differential Revision: https://reviews.llvm.org/D108402	2021-08-19 21:31:51 +01:00
Matthias Springer	8e8b70aa84	[mlir][scf] Simplify affine.min ops after loop peeling Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body. affine.min ops such as: ``` map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)> %r = affine.min #map(%iv)[%step, %ub] ``` are rewritten into (in the case of the peeled loop): ``` %r = %step ``` To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized. Differential Revision: https://reviews.llvm.org/D107222	2021-08-19 17:24:53 +09:00
tashuang.zk	2d45e332ba	[MLIR][DISC] Revise ParallelLoopTilingPass with inbound_check mode Expand ParallelLoopTilingPass with an inbound_check mode. In default mode, the upper bound of the inner loop is from the min op; in inbound_check mode, the upper bound of the inner loop is the step of the outer loop and an additional inbound check will be emitted inside of the inner loop. This was 'FIXME' in the original code and a typical usage is for GPU backends, thus the outer loop and inner loop can be mapped to blocks/threads separately. Differential Revision: https://reviews.llvm.org/D105455	2021-08-16 14:02:53 +02:00
Tyler Augustine	3a2ff982d7	Support post-processing Ops in unrolled loop iterations This can be useful when one needs to know which unrolled iteration an Op belongs to, for example, conveying noalias information among memory-affecting ops in parallel-access loops. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D107789	2021-08-11 23:11:10 +00:00
Matthias Springer	3a41ff4883	[mlir][SCF] Peel scf.for loops for even step division Add ForLoopBoundSpecialization pass, which specializes scf.for loops into a "main loop" where `step` divides the iteration space evenly and into an scf.if that handles the last iteration. This transformation is useful for vectorization and loop tiling. E.g., when vectorizing loads/stores, programs will spend most of their time in the main loop, in which only unmasked loads/stores are used. Only in the last iteration (scf.if), slower masked loads/stores are used. Subsequent commits will apply this transformation in the SparseDialect and in Linalg's loop tiling. Differential Revision: https://reviews.llvm.org/D105804	2021-08-03 10:21:38 +09:00
River Riddle	f8479d9de5	[mlir] Set the namespace of the BuiltinDialect to 'builtin' Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to cleanup some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though) Differential Revision: https://reviews.llvm.org/D105149	2021-07-28 21:00:10 +00:00
Marcel Koester	0425332015	[mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<ReturnLike>. This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be passed to successor regions. Similar to the BranchOpInterface, it allows to freely define the involved operands. However, in contrast to the BranchOpInterface, it expects an additional region number to distinguish between various use cases which might require different operands passed to different regions. Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equipped with the ReturnLike trait and/or implementing the newly added interface. This simplifies reasoning about terminators in the scope of the nested regions. We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities. Differential Revision: https://reviews.llvm.org/D105018	2021-07-26 06:39:31 +02:00
thomasraoux	45cb4140eb	[mlir] Extend scf pipelining to support loop carried dependencies Differential Revision: https://reviews.llvm.org/D106325	2021-07-21 18:32:38 -07:00
thomasraoux	f6f88e66ce	[mlir] Add software pipelining transformation for scf.For op This is the first step to support software pipelining for scf.for loops. This is only the transformation to create the pipelined kernel and prologue/epilogue. The scheduling needs to be given by the user, as many different algorithms and heuristics could be applied. This currently doesn't handle loop arguments; this will be added in a follow-up patch. Differential Revision: https://reviews.llvm.org/D105868	2021-07-19 13:43:26 -07:00
Butygin	a36e9ee09d	[mlir][SCF] populateSCFStructuralTypeConversionsAndLegality WhileOp support Differential Revision: https://reviews.llvm.org/D105923	2021-07-14 12:43:04 +03:00
William S. Moses	dfb34c0df9	[MLIR][SCF] Inline ExecuteRegion if parent can contain multiple blocks The executeregionop is used to allow multiple blocks within SCF constructs. If the container allows multiple blocks, inline the region Differential Revision: https://reviews.llvm.org/D104960	2021-06-30 10:03:22 -04:00
William S. Moses	2ab27758d5	Revert "[MLIR][SCF] Inline ExecuteRegion if parent can contain multiple blocks" This reverts commit `5d6240b77e`. The commit was mistakenly landed without a PR approval, this will be reverted now and resubmitted.	2021-06-28 13:52:30 -04:00
William S. Moses	5d6240b77e	[MLIR][SCF] Inline ExecuteRegion if parent can contain multiple blocks The executeregionop is used to allow multiple blocks within SCF constructs. If the container allows multiple blocks, inline the region Differential Revision: https://reviews.llvm.org/D104960	2021-06-28 13:09:22 -04:00
William S. Moses	44985872b8	[MLIR][SCF] Inline single block ExecuteRegionOp This commit adds a canonicalization pass which inlines any single block execute region Differential Revision: https://reviews.llvm.org/D104865	2021-06-24 13:15:26 -04:00
Anthony Canino	3f429e82d3	Implement an scf.for range folding optimization pass. In cases where arithmetic (addi/muli) ops are performed on an scf.for loop's induction variable with a single use, we can fold those ops directly into the scf.for loop. For example, in the following code: ``` scf.for %i = %c0 to %arg1 step %c1 { %0 = addi %arg2, %i : index %1 = muli %0, %c4 : index %2 = memref.load %arg0[%1] : memref<?xi32> %3 = muli %2, %2 : i32 memref.store %3, %arg0[%1] : memref<?xi32> } ``` we can lift `%0` up into the scf.for loop range, as it is the only user of %i: ``` %lb = addi %arg2, %c0 : index %ub = addi %arg2, %arg1 : index scf.for %i = %lb to %ub step %c1 { %1 = muli %i, %c4 : index %2 = memref.load %arg0[%1] : memref<?xi32> %3 = muli %2, %2 : i32 memref.store %3, %arg0[%1] : memref<?xi32> } ``` Reviewed By: mehdi_amini, ftynse, Anthony Differential Revision: https://reviews.llvm.org/D104289	2021-06-24 01:07:28 +00:00
Matthias Springer	060208b4c8	[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect. * Rename SubTensorOp -> tensor.extract_slice, SubTensorInsertOp -> tensor.insert_slice. * Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit. * Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard * Remove dialect dependencies: Standard --> Tensor * Move canonicalization test cases to correct dialect (Tensor/MemRef). Note: This is a fixed version of https://reviews.llvm.org/D104499, which was reverted due to a missing update to two CMakeFile.txt. Differential Revision: https://reviews.llvm.org/D104676	2021-06-22 17:55:53 +09:00
Mehdi Amini	60d97fb4cf	Revert "[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect" This reverts commit `83bf801f5f`. This breaks the build with -DBUILD_SHARED_LIBS=ON	2021-06-21 16:39:24 +00:00
Matthias Springer	83bf801f5f	[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect. * Rename ops: SubTensorOp --> ExtractTensorOp, SubTensorInsertOp --> InsertTensorOp * Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit. * Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard * Remove dialect dependencies: Standard --> Tensor * Move canonicalization test cases to correct dialect (Tensor/MemRef). Differential Revision: https://reviews.llvm.org/D104499	2021-06-22 00:11:21 +09:00
Uday Bondhugula	18c8c934d8	[MLIR] Introduce scf.execute_region op Introduce the execute_region op that is able to hold a region which it executes exactly once. The op encapsulates a CFG within itself while isolating it from the surrounding control flow. Proposal discussed here: https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282 execute_region enables one to inline a function without lowering out all other higher level control flow constructs (affine.for/if, scf.for/if) to the flat list of blocks / CFG form. It thus allows the benefit of transforms on higher level control flow ops available in the presence of the inlined calls. The inlined calls continue to benefit from propagation of SSA values across their top boundary. Functions won’t have to remain outlined until later than desired. Abstractions like affine execute_regions, lambdas with implicit captures could be lowered to this without first lowering out structured loops/ifs or outlining. But two potential early use cases are of: (1) an early inliner (which can inline functions by introducing execute_region ops), (2) lowering of an affine.execute_region, which cleanly maps to an scf.execute_region when going from the affine dialect to the scf dialect. Differential Revision: https://reviews.llvm.org/D75837	2021-06-18 15:22:33 +05:30
MaheshRavishankar	621d93d263	[mlir][SCF] Remove empty else blocks of `scf.if` operations. Differential Revision: https://reviews.llvm.org/D104273	2021-06-15 15:07:20 -07:00
Chris Lattner	a004da0d77	[Canonicalize] Switch the default setting to "top down". This provides a sizable compile time improvement by seeding the worklist in an order that leads to less iterations of the worklist. This patch only changes the behavior of the Canonicalize pass itself, it does not affect other passes that use the GreedyPatternRewrite driver Differential Revision: https://reviews.llvm.org/D103053	2021-05-25 13:42:11 -07:00
Butygin	4184018253	[mlir][SCF] Canonicalize nested ParallelOp's Differential Revision: https://reviews.llvm.org/D102799	2021-05-22 14:00:00 +03:00
William S. Moses	f4a2dbfe29	[MLIR][SCF] Combine adjacent scf.if with same condition Differential Revision: https://reviews.llvm.org/D101798	2021-05-05 00:39:58 -04:00
William S. Moses	8e211bf1c8	[MLIR][SCF] Assume uses of condition in the body of scf.while is true Differential Revision: https://reviews.llvm.org/D101801	2021-05-04 11:39:07 -04:00
William S. Moses	ca27260701	[MLIR] Add SCF.if Condition Canonicalizations Add two canonicalizations for scf.if. 1) A canonicalization that allows users of a condition within an if to assume the condition is true if in the true region, etc. 2) A canonicalization that removes yielded statements that are equivalent to the condition or its negation Differential Revision: https://reviews.llvm.org/D101012	2021-04-26 20:13:08 -04:00
Nicolas Vasilache	843f1fc825	[mlir][scf] Add scf.for + tensor.cast canonicalization pattern Fold scf.for iter_arg/result pairs that go through an incoming/outgoing tensor.cast op pair so as to pull the tensor.cast inside the scf.for: ``` %0 = tensor.cast %t0 : tensor<32x1024xf32> to tensor<?x?xf32> %1 = scf.for %i = %c0 to %c1024 step %c32 iter_args(%iter_t0 = %0) -> (tensor<?x?xf32>) { %2 = call @do(%iter_t0) : (tensor<?x?xf32>) -> tensor<?x?xf32> scf.yield %2 : tensor<?x?xf32> } %2 = tensor.cast %1 : tensor<?x?xf32> to tensor<32x1024xf32> use_of(%2) ``` folds into: ``` %0 = scf.for %arg2 = %c0 to %c1024 step %c32 iter_args(%arg3 = %arg0) -> (tensor<32x1024xf32>) { %2 = tensor.cast %arg3 : tensor<32x1024xf32> to tensor<?x?xf32> %3 = call @do(%2) : (tensor<?x?xf32>) -> tensor<?x?xf32> %4 = tensor.cast %3 : tensor<?x?xf32> to tensor<32x1024xf32> scf.yield %4 : tensor<32x1024xf32> } use_of(%0) ``` Differential Revision: https://reviews.llvm.org/D100661	2021-04-16 16:50:21 +00:00
Butygin	eb31540066	[mlir] Canonicalize single-iteration ParallelOp Differential Revision: https://reviews.llvm.org/D100248	2021-04-13 13:42:19 +03:00
Butygin	5657f93e78	[mlir] Canonicalize IfOp with trivial `then` and `else` bodies to list of SelectOp's * Do we need a threshold on maximum number of Yield arguments processed (maximum number of SelectOp's to be generated)? * Had to modify some old IfOp tests to not get optimized by this pattern Differential Revision: https://reviews.llvm.org/D98592	2021-03-20 12:18:49 +03:00
Chris Lattner	b2f232b830	[testsuite] Make testsuite more stable vs canonicalization change. NFC. Differential Revision: https://reviews.llvm.org/D98998	2021-03-19 18:11:12 -07:00
River Riddle	d75a611afb	[mlir] Update `simplifyRegions` to use RewriterBase for erasure notifications This allows for notifying callers when operations/blocks get erased, which is especially useful for the greedy pattern driver. The current greedy pattern driver "throws away" all information on constants in the operation folder because it doesn't know if they get erased or not. By passing in RewriterBase, we can directly track this and prevent the need for the pattern driver to rediscover all of the existing constants. In some situations this cuts the compile time of the canonicalizer in half. Differential Revision: https://reviews.llvm.org/D98755	2021-03-19 16:33:54 -07:00
lorenzo chelini	0a74a7161b	[mlir] scf::ForOp: Drop iter arguments (and corresponding result) with no use 'ForOpIterArgsFolder' can now remove iterator arguments (and corresponding results) with no use. Example: ``` %cst = constant 32 : i32 %0:2 = scf.for %arg1 = %lb to %ub step %step iter_args(%arg2 = %arg0, %arg3 = %cst) -> (i32, i32) { %1 = addu %arg2, %cst : i32 scf.yield %1, %1 : i32, i32 } use(%0#0) ``` %arg3 is not used in the block, and its corresponding result `%0#1` has no use, thus remove the iter argument. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D98711	2021-03-17 12:06:17 +00:00
Lorenzo Chelini	fd7eee64c5	scf::ForOp: Fold away iterator arguments with no use and for which the corresponding input is yielded Enhance 'ForOpIterArgsFolder' to remove unused iteration arguments in a scf::ForOp. If the block argument corresponding to the given iterator has no use and the yielded value equals the input, we fold it away. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D98503	2021-03-16 07:01:25 +00:00
Julian Gross	e2310704d8	[MLIR] Create memref dialect and move dialect-specific ops from std. Create the memref dialect and move dialect-specific ops from std dialect to this dialect. Moved ops: AllocOp -> MemRef_AllocOp AllocaOp -> MemRef_AllocaOp AssumeAlignmentOp -> MemRef_AssumeAlignmentOp DeallocOp -> MemRef_DeallocOp DimOp -> MemRef_DimOp MemRefCastOp -> MemRef_CastOp MemRefReinterpretCastOp -> MemRef_ReinterpretCastOp GetGlobalMemRefOp -> MemRef_GetGlobalOp GlobalMemRefOp -> MemRef_GlobalOp LoadOp -> MemRef_LoadOp PrefetchOp -> MemRef_PrefetchOp ReshapeOp -> MemRef_ReshapeOp StoreOp -> MemRef_StoreOp SubViewOp -> MemRef_SubViewOp TransposeOp -> MemRef_TransposeOp TensorLoadOp -> MemRef_TensorLoadOp TensorStoreOp -> MemRef_TensorStoreOp TensorToMemRefOp -> MemRef_BufferCastOp ViewOp -> MemRef_ViewOp The roadmap to split the memref dialect from std is discussed here: https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667 Differential Revision: https://reviews.llvm.org/D98041	2021-03-15 11:14:09 +01:00
Alex Zinenko	40d8e4d3f9	Revert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants." This reverts commit `b5d9a3c923`. The commit introduced a memory error in canonicalization/operation walking that is exposed when compiled with ASAN. It leads to crashes in some "release" configurations.	2021-03-15 10:27:55 +01:00
Chris Lattner	b5d9a3c923	[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants. Two changes: 1) Change the canonicalizer to walk the function in top-down order instead of bottom-up order. This composes well with the "top down" nature of constant folding and simplification, reducing iterations and re-evaluation of ops in simple cases. 2) Explicitly enter existing constants into the OperationFolder table before canonicalizing. Previously we would "constant fold" them and rematerialize them, wastefully recreating a bunch of constants, which led to pointless memory traffic. Both changes together provide a 33% speedup for canonicalize on some mid-size CIRCT examples. One artifact of this change is that the constants generated in normal pattern application get inserted at the top of the function as the patterns are applied. Because of this, we get "inverted" constants more often, which is an aesthetic change to the IR but does permute some testcases. Differential Revision: https://reviews.llvm.org/D98609	2021-03-14 18:21:42 -07:00
Nicolas Vasilache	35908406dc	[mlir][scf] Canonicalize scf.for last tensor iteration result. Canonicalize the iter_args of an scf::ForOp that involve a tensor_load and for which only the last loop iteration is actually visible outside of the loop. The canonicalization looks for a pattern such as: ``` %t0 = ... : tensor_type %0 = scf.for ... iter_args(%bb0 : %t0) -> (tensor_type) { ... // %m is either tensor_to_memref(%bb0) or defined above the loop %m... : memref_type ... // uses of %m with potential inplace updates %new_tensor = tensor_load %m : memref_type ... scf.yield %new_tensor : tensor_type } ``` `%bb0` may have either 0 or 1 use. If it has 1 use it must be exactly a `%m = tensor_to_memref %bb0` op that feeds into the yielded `tensor_load` op. If no aliasing write of `%new_tensor` occurs between tensor_load and yield then the value %0 visible outside of the loop is the last `tensor_load` produced in the loop. For now, we approximate the absence of aliasing by only supporting the case when the tensor_load is the operation immediately preceding the yield. The canonicalization rewrites the pattern as: ``` // %m is either a tensor_to_memref or defined above %m... : memref_type scf.for ... { // no iter_args ... // uses of %m with potential inplace updates } %0 = tensor_load %m : memref_type ``` Differential revision: https://reviews.llvm.org/D97953	2021-03-05 09:42:19 +00:00
Alexander Belyaev	a89035d750	Revert "[MLIR] Create memref dialect and move several dialect-specific ops from std." This commit introduced a cyclic dependency: Memref dialect depends on Standard because it used ConstantIndexOp. Std depends on the MemRef dialect in its EDSC/Intrinsics.h Working on a fix. This reverts commit `8aa6c3765b`.	2021-02-18 12:49:52 +01:00
Julian Gross	8aa6c3765b	[MLIR] Create memref dialect and move several dialect-specific ops from std. Create the memref dialect and move several dialect-specific ops without dependencies to other ops from std dialect to this dialect. Moved ops: AllocOp -> MemRef_AllocOp AllocaOp -> MemRef_AllocaOp DeallocOp -> MemRef_DeallocOp MemRefCastOp -> MemRef_CastOp GetGlobalMemRefOp -> MemRef_GetGlobalOp GlobalMemRefOp -> MemRef_GlobalOp PrefetchOp -> MemRef_PrefetchOp ReshapeOp -> MemRef_ReshapeOp StoreOp -> MemRef_StoreOp TransposeOp -> MemRef_TransposeOp ViewOp -> MemRef_ViewOp The roadmap to split the memref dialect from std is discussed here: https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667 Differential Revision: https://reviews.llvm.org/D96425	2021-02-18 11:29:39 +01:00
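
Illustrative note on the for-to-while commit (032cb1650f): the rewrite it describes can be modelled outside MLIR. The Python sketch below is not the pass itself and uses no MLIR API; `run_for`, `run_while`, and `for_to_while` are hypothetical names introduced only for this illustration, and a positive step with signed comparison is assumed. It shows the loop condition evaluated in a 'before' function, the body plus IV increment in an 'after' function, and the IV carried alongside the iter_args.

```python
def run_for(lb, ub, step, init_args, body):
    """Model of an scf.for-style loop with loop-carried values (assumes step > 0)."""
    args = tuple(init_args)
    iv = lb
    while iv < ub:
        args = tuple(body(iv, *args))
        iv += step
    return args


def run_while(before, after, init_vals):
    """Model of an scf.while-style loop.

    `before` returns (condition, forwarded values); `after` returns the
    values passed to the next evaluation of `before`.
    """
    vals = tuple(init_vals)
    while True:
        cond, forwarded = before(*vals)
        if not cond:
            return tuple(forwarded)
        vals = tuple(after(*forwarded))


def for_to_while(lb, ub, step, init_args, body):
    """Express the for-loop as a while-loop carrying (iv, *iter_args)."""

    def before(iv, *args):
        # The for-loop condition is placed in the 'before' region.
        return iv < ub, (iv, *args)

    def after(iv, *args):
        # The loop body and the IV increment are placed in the 'after'
        # region; the yield is extended with the incremented IV.
        new_args = body(iv, *args)
        return (iv + step, *new_args)

    result = run_while(before, after, (lb, *init_args))
    return result[1:]  # drop the carried IV to recover the for-loop results
```

For a body such as `lambda iv, acc: (acc + iv,)`, both `run_for(0, 10, 3, (0,), body)` and `for_to_while(0, 10, 3, (0,), body)` visit the same induction values and return the same carried result.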


80 Commits