Due to legacy reasons (ML/CFG function separation), regions in affine control
flow operations require contained blocks not to have terminators. This is
inconsistent with the notion of the block and may complicate code motion
between regions of affine control operations and other regions.
Introduce `affine.terminator`, a special terminator operation that must be used
to terminate blocks inside affine operations and transfers the control back to
he region enclosing the affine operation. For brevity and readability reasons,
allow `affine.for` and `affine.if` to omit the `affine.terminator` in their
regions when using custom printing and parsing format. The custom parser
injects the `affine.terminator` if it is missing so as to always have it
present in constructed operations.
Update transformations to account for the presence of terminator. In
particular, most code motion transformation between loops should leave the
terminator in place, and code motion between loops and non-affine blocks should
drop the terminator.
PiperOrigin-RevId: 240536998
Before this CL, the result type of the pattern match results need to be as same
as the first operand type, operand broadcast type or a generic tensor type.
This CL adds a new trait to set the result type by attribute. For example, the
TFL_ConstOp can use this to set the output type to its value attribute.
PiperOrigin-RevId: 240441249
Currently, regions can only be constructed by passing in a `Function` or an
`Instruction` pointer referencing the parent object, unlike `Function`s or
`Instruction`s themselves that can be created without a parent. It leads to a
rather complex flow in operation construction where one has to create the
operation first before being able to work with its regions. It may be
necessary to work with the regions before the operation is created. In
particular, in `build` and `parse` functions that are executed _before_ the
operation is created in cases where boilerplate region manipulation is required
(for example, inserting the hypothetical default terminator in affine regions).
Allow creating standalone regions. Such regions are meant to own a list of
blocks and transfer them to other regions on demand.
Each instruction stores a fixed number of regions as trailing objects and has
ownership of them. This decreases the size of the Instruction object for the
common case of instructions without regions. Keep this behavior intact. To
allow some flexibility in construction, make OperationState store an owning
vector of regions. When the Builder creates an Instruction from
OperationState, the bodies of the regions are transferred into the
instruction-owned regions to minimize copying. Thus, it becomes possible to
fill standalone regions with blocks and move them to an operation when it is
constructed, or move blocks from a region to an operation region, e.g., for
inlining.
PiperOrigin-RevId: 240368183
a pointer. This makes it consistent with all the other methods in
FunctionPass, as well as with ModulePass::getModule(). NFC.
PiperOrigin-RevId: 240257910
This combined match/rewrite functionality allows simplifying the majority of existing RewritePatterns, as they do not benefit from separate match and rewrite functions.
Some of the existing canonicalization patterns in StandardOps have been modified to take advantage of this functionality.
PiperOrigin-RevId: 240187856
Previously we have multiple mechanisms to specify op definition and match constraints:
TypeConstraint, AttributeConstraint, Type, Attr, mAttr, mAttrAnyOf, mPat. These variants
are not added because there are so many distinct cases we need to model; essentially,
they are all carrying a predicate. It's just an artifact of implementation.
It's quite confusing for users to grasp these variants and choose among them. Instead,
as the OpBase TableGen file, we need to strike to provide an unified mechanism. Each
dialect has the flexibility to define its own aliases if wanted.
This CL removes mAttr, mAttrAnyOf, mPat. A new base class, Constraint, is added. Now
TypeConstraint and AttrConstraint derive from Constraint. Type and Attr further derive
from TypeConstraint and AttrConstraint, respectively.
Comments are revised and examples are added to make it clear how to use constraints.
PiperOrigin-RevId: 240125076
Dialect implementer are expected to inherit from this class when implementing their types. It does not seems right when using MLIR "from the outside" to use directly something from `mlir::detail::`.
PiperOrigin-RevId: 240075769
inherited constructors, which is cleaner and means you can now use DimOp()
to get a null op, instead of having to use Instruction::getNull<DimOp>().
This removes another 200 lines of code.
PiperOrigin-RevId: 240068113
Using global constructors should not be mandatory when possible, clients should be able to register a dialect explicitly when they want.
PiperOrigin-RevId: 240064244
We just need a way to unpack ArrayRef<ValueHandle> to ArrayRef<Value*>.
No need to expose this to the user.
This reduces the cognitive overhead for the tutorial.
PiperOrigin-RevId: 240037425
tblgen be non-const. This requires introducing some const_cast's at the
moment, but those (and lots more stuff) will disappear in subsequent patches.
This significantly simplifies those patches because the various tblgen op emitters
get adjusted.
PiperOrigin-RevId: 239954566
Enable users specifying operand type constraint combinations (e.g., considering multiple operands). Some of these will be refactored (particularly the OpBase change and that should also not be needed to be done by most users), but the focus is more on user side (shown in test). The generated code for this does not take any known facts into account or perform any simplification.
Start with 2 primities to specify 1) whether an operand has a specific element type, and 2) whether an operand's element type matches another operands element type.
PiperOrigin-RevId: 239875712
This also eliminates some incorrect reinterpret_cast logic working around it, and numerous const-incorrect issues (like block argument iteration).
PiperOrigin-RevId: 239712029
This allows the indexing sugar to just work naturally with other type of load and store ops than the affine ones we currently have.
This is needed for the EuroLLVM tutorial.
PiperOrigin-RevId: 239602257
This is useful when developing one or multiple dialects in a private context without having to register them with MLIR Core.
PiperOrigin-RevId: 239601844
This eliminate ConstOpPointer (but keeps OpPointer for now) by making OpPointer
implicitly launder const in a const incorrect way. It will eventually go away
entirely, this is a progressive step towards the new const model.
PiperOrigin-RevId: 239512640
Previously we emit both op declaration and definition into one file and include it
in *Ops.h. That pulls in lots of implementation details in the header file and we
cannot hide symbols local to implementation. This CL splits them to provide a cleaner
interface.
The way how we define custom builders in TableGen is changed accordingly because now
we need to distinguish signatures and implementation logic. Some custom builders with
complicated logic now can be moved to be implemented in .cpp entirely.
PiperOrigin-RevId: 239509594
Avoids including function in C++ side that resulted in OSS C++ errors:
include/mlir-c/Core.h:228:16: error: functions that differ only in their
return type cannot be overloaded
edsc_indexed_t index(edsc_indexed_t indexed, edsc_expr_list_t indices);
~~~~~~~~~~~~~~ ^
/usr/include/string.h:484:14: note: previous declaration is here
extern char *index (const char *__s, int __c)
And as these are going away soon, just removing the function requires the least changes.
PiperOrigin-RevId: 239110470
Previously Value was a pair of name & Type, but for operands/result a TypeConstraint rather then a Type is specified. Update C++ side to match declarative side.
PiperOrigin-RevId: 238984799
This CL introduces a ValueArrayHandle helper to manage the implicit conversion
of ArrayRef<ValueHandle> -> ArrayRef<Value*> by converting first to ValueArrayHandle.
Without this, boilerplate operations that take ArrayRef<Value*> cannot be removed easily.
This all seems to boil down to decoupling Value from Type.
Alternative solutions exist (e.g. MLIR using Value by value everywhere) but they would be very intrusive. This seems to be the lowest impedance change.
Intrinsics are also lowercased by popular demand.
PiperOrigin-RevId: 238974125
This CL removes the dependency of LowerVectorTransfers on the AST version of EDSCs which will be retired.
This exhibited a pretty fundamental staging difference in AST-based vs declarative based emission.
Since the delayed creation with an AST was staged, the loop order came into existence after the clipping expressions were computed.
This now changes as the loops first need to be created declaratively in fixed order and then the clipping expressions are created.
Also, due to lack of staging, coalescing cannot be done on the fly anymore and
needs to be done either as a pre-pass (current implementation) or as a local transformation on the generated IR (future work).
Tests are updated accordingly.
PiperOrigin-RevId: 238971631
* print-ir-before=(comma-separated-pass-list)
- Print the IR before each of the passes provided within the pass list.
* print-ir-before-all
- Print the IR before every pass in the pipeline.
* print-ir-after=(comma-separated-pass-list)
- Print the IR after each of the passes provided within the pass list.
* print-ir-after-all
- Print the IR after every pass in the pipeline.
* print-ir-module-scope
- Always print the Module IR, even for non module passes.
PiperOrigin-RevId: 238523649
In particular, expose comparison operators as Python operator overloads on
ValueHandles. The comparison currently emits signed integer comparisons only,
which is compatible with the behavior of emitter-based EDSC interface. This is
sub-optimal and must be reconsidered in the future. Note that comparison
operators are not overloaded in the C++ declarative builder API precisely
because this avoids the premature decision on the signedness of comparisons.
Implement the declarative construction of boolean expressions using
ValueHandles by overloading the boolean operators in the `op` namespace to
differentiate between `operator!` for nullity check and for boolean negation.
The operands must be of i1 type. Also expose boolean operations as Python
operator overloads on ValueHandles.
PiperOrigin-RevId: 238421615
- emit a note on the loop being parallel instead of setting a loop attribute
- rename the pass -test-detect-parallel (from -detect-parallel)
PiperOrigin-RevId: 238122847
Add support to create a new attribute from multiple attributes. It extended the
DagNode class to represent attribute creation dag. It also changed the
RewriterGen::emitOpCreate method to support this nested dag emit.
An unit test is added.
PiperOrigin-RevId: 238090229
- fix for getConstantBoundOnDimSize: floordiv -> ceildiv for extent
- make getConstantBoundOnDimSize also return the identifier upper bound
- fix unionBoundingBox to correctly use the divisor and upper bound identified by
getConstantBoundOnDimSize
- deal with loop step correctly in addAffineForOpDomain (covers most cases now)
- fully compose bound map / operands and simplify/canonicalize before adding
dim/symbol to FlatAffineConstraints; fixes false positives in -memref-bound-check; add
test case there
- expose mlir::isTopLevelSymbol from AffineOps
PiperOrigin-RevId: 238050395
This CL sorts attribute kinds in OpBase.td according to a logical order: simple
cases ahead of complicated ones. The logic of attribute kinds involved are
completely untouched.
Comments on AttrConstraint and Attr are revised slightly.
PiperOrigin-RevId: 238031275
This CL also changes IntegerAttrBase to use APInt as return value to defer bitwidth
handling to API call sites and be consistent with FloatAttrBase. Call sites are
adjusted accordingly.
PiperOrigin-RevId: 238030614
multi-result upper bounds, complete TODOs, fix/improve test cases.
- complete TODOs for loop unroll/unroll-and-jam. Something as simple as
"for %i = 0 to %N" wasn't being unrolled earlier (unless it had been written
as "for %i = ()[s0] -> (0)()[%N] to %N"; addressed now.
- update/replace getTripCountExpr with buildTripCountMapAndOperands; makes it
more powerful as it composes inputs into it
- getCleanupLowerBound and getUnrolledLoopUpperBound actually needed the same
code; refactor and remove one.
- reorganize test cases, write previous ones better; most of these changes are
"label replacements".
- fix wrongly labeled test cases in unroll-jam.mlir
PiperOrigin-RevId: 238014653
This CL makes some minor changes to the declarative builder Helpers:
1. adds lb, ub, step methods to MemRefView to avoid always having to go through std::get + range;
2. drops MemRefView& from IndexedValue which was just creating ownership concerns. Instead, an IndexedValue only needs to keep track of the ValueHandle from which a MemRefView can be constructed on-demand if necessary.
PiperOrigin-RevId: 237861493
TensorFlow comparison ops like tf.Less supports broadcast behavior but the result
type have different element types as the input types. Extend broadcastable trait
to allow such cases. Added tf.Less to demonstrate it.
PiperOrigin-RevId: 237846127
So that we can use this function to deduce broadcasted shapes elsewhere.
Also added support for unknown dimensions, by following TensorFlow behavior.
PiperOrigin-RevId: 237846065
Below shows the output for an example mlir-opt command line.
mlir-opt foo.mlir -verify-each=false -cse -canonicalize -cse -cse -pass-timing
list view (-pass-timing-display=list):
* In this mode the results are displayed in a list sorted by total time; with each pass/analysis instance aggregated into one unique result. This mode is similar to the output of 'time-passes' in llvm-opt.
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0097 seconds (0.0096 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0051 ( 58.3%) 0.0001 ( 12.2%) 0.0052 ( 53.8%) 0.0052 ( 53.8%) Canonicalizer
0.0025 ( 29.1%) 0.0005 ( 58.2%) 0.0031 ( 31.9%) 0.0031 ( 32.0%) CSE
0.0011 ( 12.6%) 0.0003 ( 29.7%) 0.0014 ( 14.3%) 0.0014 ( 14.2%) DominanceInfo
0.0087 (100.0%) 0.0009 (100.0%) 0.0097 (100.0%) 0.0096 (100.0%) Total
pipeline view (-pass-timing-display=pipeline):
* In this mode the results are displayed in a nested pipeline view that mirrors the internal pass pipeline that is being executed in the pass manager. This view is useful for understanding specifically which parts of the pipeline are taking the most time, and can also be used to identify when analyses are being invalidated and recomputed.
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0082 seconds (0.0081 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0042 (100.0%) 0.0039 (100.0%) 0.0082 (100.0%) 0.0081 (100.0%) Function Pipeline
0.0005 ( 11.6%) 0.0008 ( 21.1%) 0.0013 ( 16.1%) 0.0013 ( 16.2%) CSE
0.0002 ( 5.0%) 0.0004 ( 9.3%) 0.0006 ( 7.0%) 0.0006 ( 7.0%) (A) DominanceInfo
0.0026 ( 61.8%) 0.0018 ( 45.6%) 0.0044 ( 54.0%) 0.0044 ( 54.1%) Canonicalizer
0.0005 ( 11.7%) 0.0005 ( 13.0%) 0.0010 ( 12.3%) 0.0010 ( 12.4%) CSE
0.0003 ( 6.1%) 0.0003 ( 8.3%) 0.0006 ( 7.2%) 0.0006 ( 7.1%) (A) DominanceInfo
0.0002 ( 3.8%) 0.0001 ( 2.8%) 0.0003 ( 3.3%) 0.0003 ( 3.3%) CSE
0.0042 (100.0%) 0.0039 (100.0%) 0.0082 (100.0%) 0.0081 (100.0%) Total
PiperOrigin-RevId: 237825367
* before/after pass execution
* after a pass fails
* before/after an analysis is computed
After getting this infrastructure in place, we can start providing common developer utilities like pass timing, IR printing after pass execution, etc.
PiperOrigin-RevId: 237709692
This CL addresses a few post-submit comments:
1. better comments,
2. check number of results before dyn_cast (which is a less common case)
3. test usage for multi-result InstructionHandle
PiperOrigin-RevId: 237549333
This CL adds support for named custom instructions in declarative builders.
To allow this, it introduces a templated `CustomInstruction` class.
This CL also splits ValueHandle which can capture only the **value** in single-valued instructions from InstructionHandle which can capture any instruction but provide no typing and sugaring to extract the potential Value*.
PiperOrigin-RevId: 237543222
This CL added the ability to generate multiple ops using multiple result
patterns, with each of them replacing one result of the matched source op.
Specifically, the syntax is
```
def : Pattern<(SourceOp ...),
[(ResultOp1 ...), (ResultOp2 ...), (ResultOp3 ...)]>;
```
Assuming `SourceOp` has three results.
Currently we require that each result op must generate one result, which
can be lifted later when use cases arise.
To help with cases that certain output is unused and we don't care about it,
this CL also introduces a new directive: `verifyUnusedValue`. Checks will
be emitted in the `match()` method to make sure if the corresponding output
is not unused, `match()` returns with `matchFailure()`.
PiperOrigin-RevId: 237513904
TensorFlow does not allow integers of random bitwidths. It only accepts 8-,
16-, 32-, and 64-bit integer types. Similarly for floating point types, only
half, single, double, and bfloat16 types.
PiperOrigin-RevId: 237483913