llvm-project

Commit Graph

Author	SHA1	Message	Date
Butygin	c7f96d5ab1	[mlir][scf] Canonicalize nested scf.if's to scf.if + arith.and Differential Revision: https://reviews.llvm.org/D115930	2021-12-20 21:53:03 +03:00
Jacques Pienaar	efb7727a96	[mlir] Flag near misses in file splitting Flags some potential cases where splitting isn't happening and so could result in confusing results. Also update some test files where there were near misses in splitting that seemed unintentional. Differential Revision: https://reviews.llvm.org/D109636	2021-12-12 08:03:30 -08:00
Nicolas Vasilache	a08b750ce9	[mlir][tensor] InsertSliceOp verification. This revision reintroduces tensor.insert_slice verification which seems to have vanished over time: a verifier was initially introduced in `cf9503c1b7` but for some reason the invalid.mlir was not properly updated; as time passed the verifier was not called anymore and later the code was deleted. As a consequence, a non-negligible portion of tests has run astray using invalid tensor.insert_slice semantics and needed to be fixed. Also, extract isRankReducedType from TensorOps for better reuse Originally, this facility was used by both tensor and memref forms but it got copied around as dialects were split. Differential Revision: https://reviews.llvm.org/D114715	2021-11-30 20:37:06 +00:00
Alexander Belyaev	57470abc41	[mlir] Move memref.[tensor_load\|buffer_cast\|clone] to "bufferization" dialect. https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712 Differential Revision: https://reviews.llvm.org/D114552	2021-11-25 11:50:39 +01:00
Alexander Belyaev	3c228573bc	Revert "[mlir][SCF] Further simplify affine maps during `for-loop-canonicalization`" This reverts commit `ee1bf18672`. It breaks IREE lowering. Reverting the commit for now while we investigate what's going on.	2021-11-25 10:54:52 +01:00
Matthias Springer	ee1bf18672	[mlir][SCF] Further simplify affine maps during `for-loop-canonicalization` * Implement `FlatAffineConstraints::getConstantBound(EQ)`. * Inject a simpler constraint for loops that have at most 1 iteration. * Taking into account constant EQ bounds of FlatAffineConstraint dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`. Differential Revision: https://reviews.llvm.org/D114138	2021-11-25 12:44:19 +09:00
Butygin	7f5d9bf13a	[mlir][scf] Canonicalize scf.while with unused results Differential Revision: https://reviews.llvm.org/D114291	2021-11-24 11:11:22 +03:00
Arnab Dutta	1402299271	[MLIR] Simplify semi-affine expressions using flattening For the semi affine expressions, whenever rhs of a floordiv, ceildiv, mod or product expression is a symbolic expression, we introduce a local variable representing the result, and store the floordiv/ceildiv, mod or product affine expression in LocalExprs. In this way the expression is flattened, and trivial addition and subtraction related simplifications are performed. Also rule based matching for detecting and transforming "expr - q * (expr floordiv q)" to "expr mod q", where q is a symbolic exxpression, in simplifyAdd function. Differential Revision: https://reviews.llvm.org/D112808	2021-11-16 15:42:22 +05:30
River Riddle	015192c634	[mlir:DialectConversion] Restructure how argument/target materializations get invoked The current implementation invokes materializations whenever an input operand does not have a mapping for the desired type, i.e. it requires materialization at the earliest possible point. This conflicts with goal of dialect conversion (and also the current documentation) which states that a materialization is only required if the materialization is supposed to persist after the conversion process has finished. This revision refactors this such that whenever a target materialization "might" be necessary, we insert an unrealized_conversion_cast to act as a temporary materialization. This allows for deferring the invocation of the user materialization hooks until the end of the conversion process, where we actually have a better sense if it's actually necessary. This has several benefits: * In some cases a target materialization hook is no longer necessary When performing a full conversion, there are some situations where a temporary materialization is necessary. Moving forward, these users won't need to provide any target materializations, as the temporary materializations do not require the user to provide materialization hooks. * getRemappedValue can now handle values that haven't been converted yet Before this commit, it wasn't well supported to get the remapped value of a value that hadn't been converted yet (making it difficult/impossible to convert multiple operations in many situations). This commit updates getRemappedValue to properly handle this case by inserting temporary materializations when necessary. Another code-health related benefit is that with this change we can move a majority of the complexity related to materializations to the end of the conversion process, instead of handling adhoc while conversion is happening. Differential Revision: https://reviews.llvm.org/D111620	2021-10-27 02:09:04 +00:00
Mogball	a54f4eae0e	[MLIR] Replace std ops with arith dialect ops Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797	2021-10-13 03:07:03 +00:00
Morten Borup Petersen	032cb1650f	[MLIR][SCF] Add for-to-while loop transformation pass This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop. Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable. This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable). Differential Revision: https://reviews.llvm.org/D108454	2021-09-21 09:09:54 +01:00
Mehdi Amini	5edd79fc97	Revert "[MLIR][SCF] Add for-to-while loop transformation pass" This reverts commit `644b55d57e`. The added test is failing the bots.	2021-09-20 17:21:59 +00:00
Morten Borup Petersen	644b55d57e	[MLIR][SCF] Add for-to-while loop transformation pass This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop. Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable. This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable). Differential Revision: https://reviews.llvm.org/D108454	2021-09-20 16:57:50 +01:00
Matthias Springer	0f3544d185	[mlir][scf] Loop peeling: Use scf.for for partial iteration Generate an scf.for instead of an scf.if for the partial iteration. This is for consistency reasons: The peeling of linalg.tiled_loop also uses another loop for the partial iteration. Note: Canonicalizations patterns may rewrite partial iterations to scf.if afterwards. Differential Revision: https://reviews.llvm.org/D109568	2021-09-10 19:07:09 +09:00
Matthias Springer	c7d569b8f7	[mlir][scf] Fold dim(scf.for) to dim(iter_arg) Fold dim ops of scf.for results to dim ops of the respective iter args if the loop is shape preserving. Differential Revision: https://reviews.llvm.org/D109430	2021-09-09 13:47:13 +09:00
Matthias Springer	4fa6c2734c	[mlir][scf] Allow runtime type of iter_args to change The limitation on iter_args introduced with D108806 is too restricting. Changes of the runtime type should be allowed. Extends the dim op canonicalization with a simple analysis to determine when it is safe to canonicalize. Differential Revision: https://reviews.llvm.org/D109125	2021-09-03 10:03:05 +09:00
Matthias Springer	d18ffd61d4	[mlir][SCF] Canonicalize dim(x) where x is an iter_arg * Add `DimOfIterArgFolder`. * Move existing cross-dialect canonicalization patterns to `LoopCanonicalization.cpp`. * Rename `SCFAffineOpCanonicalization` pass to `SCFForLoopCanonicalization`. * Expand documentaton of scf.for: The type of loop-carried variables may not change with iterations. (Not even the dynamic type.) Differential Revision: https://reviews.llvm.org/D108806	2021-08-30 01:39:56 +00:00
Matthias Springer	a9cff97f94	[mlir][SCF] Generalize AffineMinSCFCanonicalization to min/max ops * Add support for affine.max ops to SCF loop peeling pattern. * Add support for affine.max ops to `AffineMinSCFCanonicalizationPattern`. * Rename `AffineMinSCFCanonicalizationPattern` to `AffineOpSCFCanonicalizationPattern`. * Rename `AffineMinSCFCanonicalization` pass to `SCFAffineOpCanonicalization`. Differential Revision: https://reviews.llvm.org/D108009	2021-08-25 10:40:34 +09:00
Matthias Springer	2de2dbef2a	[mlir][linalg] Replace AffineMinSCFCanonicalizationPattern with SCF reimplementation Use the new canonicalization pattern in the SCF dialect. Differential Revision: https://reviews.llvm.org/D107732	2021-08-25 08:52:56 +09:00
Matthias Springer	98aa694d0d	[mlir][scf] Add general affine.min canonicalization pattern This canonicalization simplifies affine.min operations inside "for loop"-like operations (e.g., scf.for and scf.parallel) based on two invariants: * iv >= lb * iv < lb + step * ((ub - lb - 1) floorDiv step) + 1 This commit adds a new pass `canonicalize-scf-affine-min` (instead of being a canonicalization pattern) to avoid dependencies between the Affine dialect and the SCF dialect. Differential Revision: https://reviews.llvm.org/D107731	2021-08-25 07:32:30 +09:00
Matthias Springer	ebf35370ff	[mlir][tensor] Insert explicit tensor.cast ops for insert_slice src If additional static type information can be deduced from a insert_slice's size operands, insert an explicit cast of the op's source operand. This enables other canonicalization patterns that are matching for tensor_cast ops such as `ForOpTensorCastFolder` in SCF. Differential Revision: https://reviews.llvm.org/D108617	2021-08-24 19:45:04 +09:00
Matthias Springer	bc194a5bb5	[mlir][SCF] Do not peel loops inside partial iterations Do not apply loop peeling to loops that are contained in the partial iteration of an already peeled loop. This is to avoid code explosion when dealing with large loop nests. Can be controlled with a new pass option `skip-partial`. Differential Revision: https://reviews.llvm.org/D108542	2021-08-23 21:35:46 +09:00
Morten Borup Petersen	6c1436a9b0	[MLIR][SCF] Parenthesize multiple return types in scf.execute_region asm op Previously, ExecuteRegionOps with multiple return values would fail a round-trip test due to missing parenthesis around the types. Differential Revision: https://reviews.llvm.org/D108402	2021-08-19 21:31:51 +01:00
Matthias Springer	8e8b70aa84	[mlir][scf] Simplify affine.min ops after loop peeling Simplify affine.min ops, enabling various other canonicalizations inside the peeled loop body. affine.min ops such as: ``` map = affine_map<(d0)[s0, s1] -> (s0, -d0 + s1)> %r = affine.min #affine.min #map(%iv)[%step, %ub] ``` are rewritten them into (in the case the peeled loop): ``` %r = %step ``` To determine how an affine.min op should be rewritten and to prove its correctness, FlatAffineConstraints is utilized. Differential Revision: https://reviews.llvm.org/D107222	2021-08-19 17:24:53 +09:00
tashuang.zk	2d45e332ba	[MLIR][DISC] Revise ParallelLoopTilingPass with inbound_check mode Expand ParallelLoopTilingPass with an inbound_check mode. In default mode, the upper bound of the inner loop is from the min op; in inbound_check mode, the upper bound of the inner loop is the step of the outer loop and an additional inbound check will be emitted inside of the inner loop. This was 'FIXME' in the original codes and a typical usage is for GPU backends, thus the outer loop and inner loop can be mapped to blocks/threads in seperate. Differential Revision: https://reviews.llvm.org/D105455	2021-08-16 14:02:53 +02:00
Tyler Augustine	3a2ff982d7	Support post-processing Ops in unrolled loop iterations This can be useful when one needs to know which unrolled iteration an Op belongs to, for example, conveying noalias information among memory-affecting ops in parallel-access loops. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D107789	2021-08-11 23:11:10 +00:00
Matthias Springer	3a41ff4883	[mlir][SCF] Peel scf.for loops for even step divison Add ForLoopBoundSpecialization pass, which specializes scf.for loops into a "main loop" where `step` divides the iteration space evenly and into an scf.if that handles the last iteration. This transformation is useful for vectorization and loop tiling. E.g., when vectorizing loads/stores, programs will spend most of their time in the main loop, in which only unmasked loads/stores are used. Only the in the last iteration (scf.if), slower masked loads/stores are used. Subsequent commits will apply this transformation in the SparseDialect and in Linalg's loop tiling. Differential Revision: https://reviews.llvm.org/D105804	2021-08-03 10:21:38 +09:00
River Riddle	f8479d9de5	[mlir] Set the namespace of the BuiltinDialect to 'builtin' Historically the builtin dialect has had an empty namespace. This has unfortunately created a very awkward situation, where many utilities either have to special case the empty namespace, or just don't work at all right now. This revision adds a namespace to the builtin dialect, and starts to cleanup some of the utilities to no longer handle empty namespaces. For now, the assembly form of builtin operations does not require the `builtin.` prefix. (This should likely be re-evaluated though) Differential Revision: https://reviews.llvm.org/D105149	2021-07-28 21:00:10 +00:00
Marcel Koester	0425332015	[mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<ReturnLike>. This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be passed to successor regions. Similar to the BranchOpInterface, it allows to freely define the involved operands. However, in contrast to the BranchOpInterface, it expects an additional region number to distinguish between various use cases which might require different operands passed to different regions. Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equiped with the ReturnLike trait and/or implementing the newly added interface. This simplifies reasoning about terminators in the scope of the nested regions. We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities. Differential Revision: https://reviews.llvm.org/D105018	2021-07-26 06:39:31 +02:00
thomasraoux	45cb4140eb	[mlir] Extend scf pipeling to support loop carried dependencies Differential Revision: https://reviews.llvm.org/D106325	2021-07-21 18:32:38 -07:00
thomasraoux	f6f88e66ce	[mlir] Add software pipelining transformation for scf.For op This is the first step to support software pipeline for scf.for loops. This is only the transformation to create pipelined kernel and prologue/epilogue. The scheduling needs to be given by user as many different algorithm and heuristic could be applied. This currently doesn't handle loop arguments, this will be added in a follow up patch. Differential Revision: https://reviews.llvm.org/D105868	2021-07-19 13:43:26 -07:00
Butygin	a36e9ee09d	[mlir][SCF] populateSCFStructuralTypeConversionsAndLegality WhileOp support Differential Revision: https://reviews.llvm.org/D105923	2021-07-14 12:43:04 +03:00
William S. Moses	dfb34c0df9	[MLIR][SCF] Inline ExecuteRegion if parent can contain multiple blocks The executeregionop is used to allow multiple blocks within SCF constructs. If the container allows multiple blocks, inline the region Differential Revision: https://reviews.llvm.org/D104960	2021-06-30 10:03:22 -04:00
William S. Moses	2ab27758d5	Revert "[MLIR][SCF] Inline ExecuteRegion if parent can contain multiple blocks" This reverts commit `5d6240b77e`. The commit was mistakenly landed without a PR approval, this will be reverted now and resubmitted.	2021-06-28 13:52:30 -04:00
William S. Moses	5d6240b77e	[MLIR][SCF] Inline ExecuteRegion if parent can contain multiple blocks The executeregionop is used to allow multiple blocks within SCF constructs. If the container allows multiple blocks, inline the region Differential Revision: https://reviews.llvm.org/D104960	2021-06-28 13:09:22 -04:00
William S. Moses	44985872b8	[MLIR][SCF] Inline single block ExecuteRegionOp This commit adds a canonicalization pass which inlines any single block execute region Differential Revision: https://reviews.llvm.org/D104865	2021-06-24 13:15:26 -04:00
Anthony Canino	3f429e82d3	Implement an scf.for range folding optimization pass. In cases where arithmetic (addi/muli) ops are performed on an scf.for loops induction variable with a single use, we can fold those ops directly into the scf.for loop. For example, in the following code: ``` scf.for %i = %c0 to %arg1 step %c1 { %0 = addi %arg2, %i : index %1 = muli %0, %c4 : index %2 = memref.load %arg0[%1] : memref<?xi32> %3 = muli %2, %2 : i32 memref.store %3, %arg0[%1] : memref<?xi32> } ``` we can lift `%0` up into the scf.for loop range, as it is the only user of %i: ``` %lb = addi %arg2, %c0 : index %ub = addi %arg2, %i : index scf.for %i = %lb to %ub step %c1 { %1 = muli %0, %c4 : index %2 = memref.load %arg0[%1] : memref<?xi32> %3 = muli %2, %2 : i32 memref.store %3, %arg0[%1] : memref<?xi32> } ``` Reviewed By: mehdi_amini, ftynse, Anthony Differential Revision: https://reviews.llvm.org/D104289	2021-06-24 01:07:28 +00:00
Matthias Springer	060208b4c8	[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect. * Rename SubTensorOp -> tensor.extract_slice, SubTensorInsertOp -> tensor.insert_slice. * Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit. * Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard * Remove dialect dependencies: Standard --> Tensor * Move canonicalization test cases to correct dialect (Tensor/MemRef). Note: This is a fixed version of https://reviews.llvm.org/D104499, which was reverted due to a missing update to two CMakeFile.txt. Differential Revision: https://reviews.llvm.org/D104676	2021-06-22 17:55:53 +09:00
Mehdi Amini	60d97fb4cf	Revert "[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect" This reverts commit `83bf801f5f`. This breaks the build with -DBUILD_SHARED_LIBS=ON	2021-06-21 16:39:24 +00:00
Matthias Springer	83bf801f5f	[mlir][NFC] Move SubTensorOp and SubTensorInsertOp to TensorDialect The main goal of this commit is to remove the dependency of Standard dialect on the Tensor dialect. * Rename ops: SubTensorOp --> ExtractTensorOp, SubTensorInsertOp --> InsertTensorOp * Some helper functions are (already) duplicated between the Tensor dialect and the MemRef dialect. To keep this commit smaller, this will be cleaned up in a separate commit. * Additional dialect dependencies: Shape --> Tensor, Tensor --> Standard * Remove dialect dependencies: Standard --> Tensor * Move canonicalization test cases to correct dialect (Tensor/MemRef). Differential Revision: https://reviews.llvm.org/D104499	2021-06-22 00:11:21 +09:00
Uday Bondhugula	18c8c934d8	[MLIR] Introduce scf.execute_region op Introduce the execute_region op that is able to hold a region which it executes exactly once. The op encapsulates a CFG within itself while isolating it from the surrounding control flow. Proposal discussed here: https://llvm.discourse.group/t/introduce-std-inlined-call-op-proposal/282 execute_region enables one to inline a function without lowering out all other higher level control flow constructs (affine.for/if, scf.for/if) to the flat list of blocks / CFG form. It thus allows the benefit of transforms on higher level control flow ops available in the presence of the inlined calls. The inlined calls continue to benefit from propagation of SSA values across their top boundary. Functions won’t have to remain outlined until later than desired. Abstractions like affine execute_regions, lambdas with implicit captures could be lowered to this without first lowering out structured loops/ifs or outlining. But two potential early use cases are of: (1) an early inliner (which can inline functions by introducing execute_region ops), (2) lowering of an affine.execute_region, which cleanly maps to an scf.execute_region when going from the affine dialect to the scf dialect. Differential Revision: https://reviews.llvm.org/D75837	2021-06-18 15:22:33 +05:30
MaheshRavishankar	621d93d263	[mlir][SCF] Remove empty else blocks of `scf.if` operations. Differential Revision: https://reviews.llvm.org/D104273	2021-06-15 15:07:20 -07:00
Chris Lattner	a004da0d77	[Canonicalize] Switch the default setting to "top down". This provides a sizable compile time improvement by seeding the worklist in an order that leads to less iterations of the worklist. This patch only changes the behavior of the Canonicalize pass itself, it does not affect other passes that use the GreedyPatternRewrite driver Differential Revision: https://reviews.llvm.org/D103053	2021-05-25 13:42:11 -07:00
Butygin	4184018253	[mlir][SCF] Canonicalize nested ParallelOp's Differential Revision: https://reviews.llvm.org/D102799	2021-05-22 14:00:00 +03:00
William S. Moses	f4a2dbfe29	[MLIR][SCF] Combine adjacent scf.if with same condition Differential Revision: https://reviews.llvm.org/D101798	2021-05-05 00:39:58 -04:00
William S. Moses	8e211bf1c8	[MLIR][SCF] Assume uses of condition in the body of scf.while is true Differential Revision: https://reviews.llvm.org/D101801	2021-05-04 11:39:07 -04:00
William S. Moses	ca27260701	[MLIR] Add SCF.if Condition Canonicalizations Add two canoncalizations for scf.if. 1) A canonicalization that allows users of a condition within an if to assume the condition is true if in the true region, etc. 2) A canonicalization that removes yielded statements that are equivalent to the condition or its negation Differential Revision: https://reviews.llvm.org/D101012	2021-04-26 20:13:08 -04:00
Nicolas Vasilache	843f1fc825	[mlir][scf] Add scf.for + tensor.cast canonicalization pattern Fold scf.for iter_arg/result pairs that go through incoming/ougoing a tensor.cast op pair so as to pull the tensor.cast inside the scf.for: ``` %0 = tensor.cast %t0 : tensor<32x1024xf32> to tensor<?x?xf32> %1 = scf.for %i = %c0 to %c1024 step %c32 iter_args(%iter_t0 = %0) -> (tensor<?x?xf32>) { %2 = call @do(%iter_t0) : (tensor<?x?xf32>) -> tensor<?x?xf32> scf.yield %2 : tensor<?x?xf32> } %2 = tensor.cast %1 : tensor<?x?xf32> to tensor<32x1024xf32> use_of(%2) ``` folds into: ``` %0 = scf.for %arg2 = %c0 to %c1024 step %c32 iter_args(%arg3 = %arg0) -> (tensor<32x1024xf32>) { %2 = tensor.cast %arg3 : tensor<32x1024xf32> to tensor<?x?xf32> %3 = call @do(%2) : (tensor<?x?xf32>) -> tensor<?x?xf32> %4 = tensor.cast %3 : tensor<?x?xf32> to tensor<32x1024xf32> scf.yield %4 : tensor<32x1024xf32> } use_of(%0) ``` Differential Revision: https://reviews.llvm.org/D100661	2021-04-16 16:50:21 +00:00
Butygin	eb31540066	[mlir] Canonicalize single-iteration ParallelOp Differential Revision: https://reviews.llvm.org/D100248	2021-04-13 13:42:19 +03:00
Butygin	5657f93e78	[mlir] Canonicalize IfOp with trivial `then` and `else` bodies to list of SelectOp's * Do we need a threshold on maximum number of Yeild arguments processed (maximum number of SelectOp's to be generated)? * Had to modify some old IfOp tests to not get optimized by this pattern Differential Revision: https://reviews.llvm.org/D98592	2021-03-20 12:18:49 +03:00

1 2

90 Commits