BlockArguments gained the ability to have locations attached a while ago, but they
have always been optional. This goes against the core tenant of MLIR where location
information is a requirement, so this commit updates the API to require locations.
Fixes#53279
Differential Revision: https://reviews.llvm.org/D117633
The tree merging of pattern predicates places the predicates in an unordered set. When the predicates are sorted, they are taken in the set order, not the insertion order. This results in nondeterministic behavior.
One solution to this problem would be to use `SetVector`. However, the value `SetVector` does not provide a `find` function for fast O(1) lookups and stores the predicates twice -- once in the set and once in the vector, which is undesirable, because we store patternToAnswer in each predicate. A simpler solution is to store the tie breaking ID (which follows the insertion order), and use this ID to break any ties when comparing predicates.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D116081
When the original version of multi-root patterns was reviewed, several improvements were made to the pdl_interp operations during the review process. Specifically, the "get users of a value at the specified operand index" was split up into "get users" and "compare the users' operands with that value". The iterative execution was also cleaned up to `pdl_interp.foreach`. However, the positions in the pdl-to-pdl_interp lowering were not similarly refactored. This introduced several problems, including hard-to-detect bugs in the lowering and duplicate evaluation of `pdl_interp.get_users`.
This diff cleans up the positions. The "upward" `OperationPosition` was split-out into `UsersPosition` and `ForEachPosition`, and the operand comparison was replaced with a simple predicate. In the process, I fixed three bugs:
1. When multiple roots were had the same connector (i.e., a node that they shared with a subtree at the previously visited root), we would generate a single foreach loop rather than one foreach loop for each such root. The reason for this is that such connectors shared the position. The solution for this is to add root index as an id to the newly introduced `ForEachPosition`.
2. Previously, we would use `pdl_interp.get_operands` indiscriminately, whether or not the operand was variadic. We now correctly detect variadic operands and insert `pdl_interp.get_operand` when needed.
3. In certain corner cases, we would trigger the "connector has not been traversed yet" assertion. This was caused by not inserting the values during the upward traversal correctly. This has now been fixed.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D116080
Previously, we defined a struct named `RootOrderingCost`, which stored the cost (a pair consisting of the depth of the connector and a tie breaking ID), as well as the connector itself. This created some confusion, because we would sometimes write, e.g., `cost.cost.first` (the first `cost` referring to the struct, the second one referring to the `cost` field, and `first` referring to the depth). In order to address this confusion, here we rename `RootOrderingCost` to `RootOrderingEntry` (keeping the fields and their names as-is).
This clarification exposed non-determinism in the optimal branching algorithm. When choosing the best local parent, we were previuosly only considering its depth (`cost.first`) and not the tie-breaking ID (`cost.second`). This led to non-deterministic choice of the parent when multiple potential parents had the same depth. The solution is to compare both the depth and the tie-breaking ID.
Testing: Rely on existing unit tests. Non-detgerminism is hard to unit-test.
Reviewed By: rriddle, Mogball
Differential Revision: https://reviews.llvm.org/D116079
This diff fixes broken build caused by D108550. Under GCC 5, auto lambdas that capture this require `this->` for member calls.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D114659
This is commit 4 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).
This PR integrates the various components (root ordering algorithm, nondeterministic execution of PDL bytecode) to implement multi-root PDL matching. The main idea is for the pattern to specify mulitple candidate roots. The PDL-to-PDLInterp lowering selects one of these roots and "hangs" the pattern from this root, traversing the edges downwards (from operation to its operands) when possible and upwards (from values to its uses) when needed. The root is selected by invoking the optimal matching multiple times, once for each candidate root, and the connectors are determined form the optimal matching. The costs in the directed graph are equal to the number of upward edges that need to be traversed when connecting the given two candidate roots. It can be shown that, for this choice of the cost function, "hanging" the pattern an inner node is no better than from the optimal root.
The following three main additions were implemented as a part of this PR:
1. OperationPos predicate has been extended to allow tracing the operation accepting a value (the opposite of operation defining a value).
2. Predicate checking if two values are not equal - this is useful to ensure that we do not traverse the edge back downwards after we traversed it upwards.
3. Function for for building the cost graph among the candidate roots.
4. Updated buildPredicateList, building the predicates optimal branching has been determined.
Testing: unit tests (an integration test to follow once the stack of commits has landed)
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D108550
This is commit 3 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).
We form a graph over the specified roots, provided in `pdl.rewrite`, where two roots are connected by a directed edge if the target root can be connected (via a chain of operations) in the underlying pattern to the source root. We place a restriction that the path connecting the two candidate roots must only contain the nodes in the subgraphs underneath these two roots. The cost of an edge is the smallest number of upward traversals (edges) required to go from the source to the target root, and the connector is a `Value` in the intersection of the two subtrees rooted at the source and target root that results in that smallest number of such upward traversals. Optimal root ordering is then formulated as the problem of finding a spanning arborescence (i.e., a directed spanning tree) of minimal weight.
In order to determine the spanning arborescence (directed spanning tree) of minimum weight, we use the [Edmonds' algorithm](https://en.wikipedia.org/wiki/Edmonds%27_algorithm). The worst-case computational complexity of this algorithm is O(_N_^3) for a single root, where _N_ is the number of specified roots. The `pdl`-to-`pdl_interp` lowering calls this algorithm as a subroutine _N_ times (once for each candidate root), so the overall complexity of root ordering is O(_N_^4). If needed, this complexity could be reduced to O(_N_^3) with a more efficient algorithm. However, note that the underlying implementation is very efficient, and _N_ in our instances tends to be very small (<10). Therefore, we believe that the proposed (asymptotically suboptimal) implementation will suffice for now.
Testing: a unit test of the algorithm
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D108549
The StringAttr version doesn't need a context, so we can just use the
existing `SymbolRefAttr::get` form. The StringRef version isn't preferred
so we want to encourage people to use StringAttr.
There is an additional form of getSymbolRefAttr that takes a (SymbolTrait
implementing) operation. This should also be moved, but I'll do that as
a separate patch.
Differential Revision: https://reviews.llvm.org/D108922
There's no need for `toSmallVector()` as `SmallVector.h` already provides a `to_vector` free function that takes a range.
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D104024
This is a roll forward of D102679.
This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse.
It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++().
Note: Compared to D102679, this patch introduces a `asSmallVector()` member function and fixes compilation issue with GCC 5.
Differential Revision: https://reviews.llvm.org/D103948
This reverts commit e772216e70
(and fixup 7f6c878a2c).
The build is broken with gcc5 host compiler:
In file included from
from mlir/lib/Dialect/Utils/StructuredOpsUtils.cpp:9:
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: error: type/value mismatch at argument 1 in template parameter list for 'template<class ItTy, class FuncTy, class FuncReturnTy> class llvm::mapped_iterator'
std::function<T(ptrdiff_t)>>;
^
tools/mlir/include/mlir/IR/BuiltinAttributes.h.inc:424:57: note: expected a type, got 'decltype (seq<ptrdiff_t>(0, 0))::const_iterator'
This patch simplifies the implementation of Sequence and makes it compatible with llvm::reverse.
It exposes the reverse iterators through rbegin/rend which prevents a dangling reference in std::reverse_iterator::operator++().
Differential Revision: https://reviews.llvm.org/D102679
This revision extends the PDL Interpreter dialect to add support for variadic operands and results, with ranges of these values represented via the recently added !pdl.range type. To support this extension, three new operations have been added that closely match the single variant:
* pdl_interp.check_types : Compare a range of types with a known range.
* pdl_interp.create_types : Create a constant range of types.
* pdl_interp.get_operands : Get a range of operands from an operation.
* pdl_interp.get_results : Get a range of results from an operation.
* pdl_interp.switch_types : Switch on a range of types.
This revision handles adding support in the interpreter dialect and the conversion from PDL to PDLInterp. Support for variadic operands and results in the bytecode will be added in a followup revision.
Differential Revision: https://reviews.llvm.org/D95722
This has a numerous amount of benefits, given the overly clunky nature of CreateNativeOp:
* Users can now call into arbitrary rewrite functions from inside of PDL, allowing for more natural interleaving of PDL/C++ and enabling for more of the pattern to be in PDL.
* Removes the need for an additional set of C++ functions/registry/etc. The new ApplyNativeRewriteOp will use the same PDLRewriteFunction as the existing RewriteOp. This reduces the API surface area exposed to users.
This revision also introduces a new PDLResultList class. This class is used to provide results of native rewrite functions back to PDL. We introduce a new class instead of using a SmallVector to simplify the work necessary for variadics, given that ranges will require some changes to the structure of PDLValue.
Differential Revision: https://reviews.llvm.org/D95720
Up until now, results have been represented as additional results to a pdl.operation. This is fairly clunky, as it mismatches the representation of the rest of the IR constructs(e.g. pdl.operand) and also isn't a viable representation for operations returned by pdl.create_native. This representation also creates much more difficult problems when factoring in support for variadic result groups, optional results, etc. To resolve some of these problems, and simplify adding support for variable length results, this revision extracts the representation for results out of pdl.operation in the form of a new `pdl.result` operation. This operation returns the result of an operation at a given index, e.g.:
```
%root = pdl.operation ...
%result = pdl.result 0 of %root
```
Differential Revision: https://reviews.llvm.org/D95719
This better matches the actual IR concept that is being modeled, and is consistent with how the rest of PDL is structured.
Differential Revision: https://reviews.llvm.org/D95718
We should be ordering predicates with higher primary/secondary sums first, but we are currently ordering them last. This allows for predicates more frequently encountered to be checked first.
Differential Revision: https://reviews.llvm.org/D95715
There isn't a good reason for anything within IR to specifically reference any of the builtin operations. The only place that had a good reason in the past was AsmPrinter, but the behavior there doesn't need to hardcode ModuleOp anymore.
Differential Revision: https://reviews.llvm.org/D92448
Given that OpState already implicit converts to Operator*, this seems reasonable.
The alternative would be to add more functions to OpState which forward to Operation.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D92266
These includes have been deprecated in favor of BuiltinDialect.h, which contains the definitions of ModuleOp and FuncOp.
Differential Revision: https://reviews.llvm.org/D91572
The conversion between PDL and the interpreter is split into several different parts.
** The Matcher:
The matching section of all incoming pdl.pattern operations is converted into a predicate tree and merged. Each pattern is first converted into an ordered list of predicates starting from the root operation. A predicate is composed of three distinct parts:
* Position
- A position refers to a specific location on the input DAG, i.e. an
existing MLIR entity being matched. These can be attributes, operands,
operations, results, and types. Each position also defines a relation to
its parent. For example, the operand `[0] -> 1` has a parent operation
position `[0]` (the root).
* Question
- A question refers to a query on a specific positional value. For
example, an operation name question checks the name of an operation
position.
* Answer
- An answer is the expected result of a question. For example, when
matching an operation with the name "foo.op". The question would be an
operation name question, with an expected answer of "foo.op".
After the predicate lists have been created and ordered(based on occurrence of common predicates and other factors), they are formed into a tree of nodes that represent the branching flow of a pattern match. This structure allows for efficient construction and merging of the input patterns. There are currently only 4 simple nodes in the tree:
* ExitNode: Represents the termination of a match
* SuccessNode: Represents a successful match of a specific pattern
* BoolNode/SwitchNode: Branch to a specific child node based on the expected answer to a predicate question.
Once the matcher tree has been generated, this tree is walked to generate the corresponding interpreter operations.
** The Rewriter:
The rewriter portion of a pattern is generated in a very straightforward manor, similarly to lowerings in other dialects. Each PDL operation that may exist within a rewrite has a mapping into the interpreter dialect. The code for the rewriter is generated within a FuncOp, that is invoked by the interpreter on a successful pattern match. Referenced values defined in the matcher become inputs the generated rewriter function.
An example lowering is shown below:
```mlir
// The following high level PDL pattern:
pdl.pattern : benefit(1) {
%resultType = pdl.type
%inputOperand = pdl.input
%root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType
pdl.rewrite %root {
pdl.replace %root with (%inputOperand)
}
}
// is lowered to the following:
module {
// The matcher function takes the root operation as an input.
func @matcher(%arg0: !pdl.operation) {
pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1
^bb1:
pdl_interp.return
^bb2:
pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1
^bb3:
pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1
^bb4:
%0 = pdl_interp.get_operand 0 of %arg0
pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1
^bb5:
%1 = pdl_interp.get_result 0 of %arg0
pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1
^bb6:
// This operation corresponds to a successful pattern match.
pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1
}
module @rewriters {
// The inputs to the rewriter from the matcher are passed as arguments.
func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) {
pdl_interp.replace %arg1 with(%arg0)
pdl_interp.return
}
}
}
```
Differential Revision: https://reviews.llvm.org/D84580