This allows the indexing sugar to just work naturally with types of load and store ops other than the affine ones we currently have.
This is needed for the EuroLLVM tutorial.
PiperOrigin-RevId: 239602257
This is useful when developing one or multiple dialects in a private context without having to register them with MLIR Core.
PiperOrigin-RevId: 239601844
This eliminates ConstOpPointer (but keeps OpPointer for now) by making OpPointer
implicitly launder const in a const-incorrect way. It will eventually go away
entirely; this is a progressive step towards the new const model.
PiperOrigin-RevId: 239512640
Previously we emitted both op declarations and definitions into one file and included it
in *Ops.h. That pulled lots of implementation details into the header file, and we
could not hide symbols local to the implementation. This CL splits them to provide a cleaner
interface.
The way custom builders are defined in TableGen is changed accordingly because we now
need to distinguish signatures from implementation logic. Some custom builders with
complicated logic can now be implemented entirely in the .cpp file.
PiperOrigin-RevId: 239509594
This CL fixes an issue where cloned loop induction variables were not properly
propagated and beefs up the corresponding test.
PiperOrigin-RevId: 239422961
Avoids including a function on the C++ side that resulted in OSS C++ errors:
include/mlir-c/Core.h:228:16: error: functions that differ only in their
return type cannot be overloaded
edsc_indexed_t index(edsc_indexed_t indexed, edsc_expr_list_t indices);
~~~~~~~~~~~~~~ ^
/usr/include/string.h:484:14: note: previous declaration is here
extern char *index (const char *__s, int __c)
Since these are going away soon, simply removing the function requires the fewest changes.
PiperOrigin-RevId: 239110470
Previously Value was a pair of name & Type, but for operands/results a TypeConstraint rather than a Type is specified. Update the C++ side to match the declarative side.
PiperOrigin-RevId: 238984799
This CL introduces a ValueArrayHandle helper to manage the implicit conversion
of ArrayRef<ValueHandle> -> ArrayRef<Value*> by converting first to ValueArrayHandle.
Without this, boilerplate operations that take ArrayRef<Value*> cannot be removed easily.
This all seems to boil down to decoupling Value from Type.
Alternative solutions exist (e.g. MLIR using Value by value everywhere) but they would be very intrusive. This seems to be the lowest impedance change.
Intrinsics are also lowercased by popular demand.
PiperOrigin-RevId: 238974125
This CL removes the dependency of LowerVectorTransfers on the AST version of EDSCs which will be retired.
This exhibited a pretty fundamental staging difference between AST-based and declarative-based emission.
Since the delayed creation with an AST was staged, the loop order came into existence after the clipping expressions were computed.
This now changes as the loops first need to be created declaratively in fixed order and then the clipping expressions are created.
Also, due to lack of staging, coalescing cannot be done on the fly anymore and
needs to be done either as a pre-pass (current implementation) or as a local transformation on the generated IR (future work).
Tests are updated accordingly.
PiperOrigin-RevId: 238971631
* print-ir-before=(comma-separated-pass-list)
- Print the IR before each of the passes provided within the pass list.
* print-ir-before-all
- Print the IR before every pass in the pipeline.
* print-ir-after=(comma-separated-pass-list)
- Print the IR after each of the passes provided within the pass list.
* print-ir-after-all
- Print the IR after every pass in the pipeline.
* print-ir-module-scope
- Always print the Module IR, even for non-module passes.
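For example, a hypothetical invocation combining these options (the pass names are illustrative) could look like:
mlir-opt foo.mlir -cse -canonicalize -print-ir-after=cse -print-ir-module-scope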
PiperOrigin-RevId: 238523649
In particular, expose comparison operators as Python operator overloads on
ValueHandles. The comparison currently emits signed integer comparisons only,
which is compatible with the behavior of the emitter-based EDSC interface. This is
suboptimal and must be reconsidered in the future. Note that comparison
operators are not overloaded in the C++ declarative builder API, precisely to
avoid a premature decision on the signedness of comparisons.
Implement the declarative construction of boolean expressions using
ValueHandles by overloading the boolean operators in the `op` namespace,
differentiating `operator!` used for a nullity check from `operator!` used for
boolean negation.
The operands must be of i1 type. Also expose boolean operations as Python
operator overloads on ValueHandles.
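A minimal sketch of the intended Python usage, assuming `i` and `rhs` are ValueHandles live inside an EDSC function/loop context (the names are illustrative):
  cond = i < rhs  # overloaded comparison on ValueHandles; currently lowers to a signed integer comparison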
PiperOrigin-RevId: 238421615
In particular, expose `cond_br`, `select` and `call` operations with syntax
similar to that of the previous emitter-based EDSC interface. These are
provided for backwards-compatibility. Ideally, we want them to be
Table-generated from the Op definitions when those definitions are declarative.
Additionally, expose the ability to construct any op given its canonical name,
which also exercises the construction of unregistered ops.
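As an illustrative sketch, not a definitive API (operand and type names are placeholders, and the generic helper shown for building an op from its canonical name is hypothetical):
  v = select(cond, lhs, rhs)           # backwards-compatible builder
  op("foo.some_op", [v], [f32Type])    # hypothetical: construct any (possibly unregistered) op by canonical name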
PiperOrigin-RevId: 238421583
Expose edsc::IndexedValue using a syntax similar to that of edsc::Indexed to
ensure backwards-compatibility. It remains possible to write array-indexed
loads and stores as
C.store([i, j], A.load([i, k]) * B.load([k, j]))
after taking a "view" of some value handle using IndexedValue as
A = IndexedValue(originalValueHandle)
provided that all indices are also value handles.
PiperOrigin-RevId: 238421544
In the original implementation, constants could be bound to EDSC expressions in
the binder, independently from other MLIR Values. A rework of EDSC including
early typing provided the functionality to use MLIR's `constant` operation to
define typed constants instead of binding them separately, but only used it for
index types. The new declarative builder implementation followed suit by providing
a call for building `constant` operations of index type, but nothing more.
Expose similar builders for integers, floats, and functions to match what the
binders allowed one to use.
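A minimal sketch of the intended usage; all builder names below are hypothetical placeholders, since only the existence of an index-typed constant builder is implied above:
  c0 = constant_index(0)           # hypothetical name: index-typed constant
  c42 = constant_int(42, 32)       # hypothetical name: 32-bit integer constant
  c1f = constant_float(1.0, f32)   # hypothetical name: float constant of a given type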
PiperOrigin-RevId: 238421508
Provide a function `arg` that returns the function argument as a value handle,
similar to block arguments. This makes function context managers in Python
similar to block context managers, which is more consistent given that the
function context manager sets the insertion point to the first block of the
function and that the arguments of that block are those of the function. This
prepares for the removal of the PythonMLIREmitter class and its
bind_function_arguments helper.
Additionally, provide a helper method in PythonMLIRModule to define a function
and immediately create a context for it. Update the tests that are already
using context managers to use the function context manager instead of creating
the function manually.
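A rough sketch of the intended pattern; the module helper's name and the `arg` signature shown are hypothetical:
  # hypothetical: define a function and immediately enter its context
  with module.function_context("foo", [indexType, indexType], []) as fun:
    a = arg(0)  # function arguments as value handles, like block arguments
    b = arg(1)
    a + b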
PiperOrigin-RevId: 238421087
- emit a note on the loop being parallel instead of setting a loop attribute
- rename the pass to -test-detect-parallel (from -detect-parallel)
PiperOrigin-RevId: 238122847
- this is really not a hard error; emit a warning instead (for the inability to compute
a footprint when the union fails due to unimplemented cases)
- remove a misleading warning from LoopFusion.cpp
PiperOrigin-RevId: 238118711
Add support for creating a new attribute from multiple attributes. This extends the
DagNode class to represent an attribute-creation dag. It also changes the
RewriterGen::emitOpCreate method to support emitting this nested dag.
A unit test is added.
PiperOrigin-RevId: 238090229
- fix for getConstantBoundOnDimSize: floordiv -> ceildiv for extent
- make getConstantBoundOnDimSize also return the identifier upper bound
- fix unionBoundingBox to correctly use the divisor and upper bound identified by
getConstantBoundOnDimSize
- deal with loop step correctly in addAffineForOpDomain (covers most cases now)
- fully compose bound map / operands and simplify/canonicalize before adding
dim/symbol to FlatAffineConstraints; fixes false positives in -memref-bound-check; add
test case there
- expose mlir::isTopLevelSymbol from AffineOps
PiperOrigin-RevId: 238050395
This CL sorts attribute kinds in OpBase.td according to a logical order: simple
cases ahead of complicated ones. The logic of the attribute kinds involved is
completely untouched.
Comments on AttrConstraint and Attr are revised slightly.
PiperOrigin-RevId: 238031275
This CL also changes IntegerAttrBase to use APInt as the return value to defer bitwidth
handling to API call sites and be consistent with FloatAttrBase. Call sites are
adjusted accordingly.
PiperOrigin-RevId: 238030614
multi-result upper bounds, complete TODOs, fix/improve test cases.
- complete TODOs for loop unroll/unroll-and-jam. Something as simple as
"for %i = 0 to %N" wasn't being unrolled earlier (unless it had been written
as "for %i = ()[s0] -> (0)()[%N] to %N"); addressed now.
- update/replace getTripCountExpr with buildTripCountMapAndOperands; makes it
more powerful as it composes inputs into it
- getCleanupLowerBound and getUnrolledLoopUpperBound actually needed the same
code; refactor and remove one.
- reorganize test cases, write previous ones better; most of these changes are
"label replacements".
- fix wrongly labeled test cases in unroll-jam.mlir
PiperOrigin-RevId: 238014653
Expose EDSC block builders as Python context managers, similarly to loop
builders. Note that blocks, unlike loops, are addressable and may need to be
"declared" without necessarily filling their bodies with instructions. This is
the case, for example, when branching to a new block from the existing block.
Therefore, creating the block context manager immediately creates the block
(unless the manager captures an existing block) by creating and destroying the
block builder. With this approach, one can either fill in the block and refer
to it later leveraging Python's dynamic variable lookup
  with BlockContext([indexType]) as b:
    op(...)  # operation inside the block
    ret()
  op(...)       # operation outside the block (in the function entry block)
  br(b, [...])  # branching to the block created above
or declare the block contexts upfront and enter them on demand
  bb1 = BlockContext()  # empty block created in the
  bb2 = BlockContext()  # surrounding function context
  cond_br(bb1.handle, [], bb2.handle, [])  # branch to blocks from here
  with bb1:
    op(...)  # operation inside the first block
  with bb2:
    op(...)  # operation inside the second block
  with bb1:
    op(...)  # append operation to the first block
Additionally, one can create multiple throw-away contexts that append to the
same block
  with BlockContext() as b:
    op(...)  # operation inside the block
  with BlockContext(appendTo(b)):
    op(...)  # new context appends to the block
which has the potential of being extended to control the insertion point of the
block at a finer level of granularity.
PiperOrigin-RevId: 238005298
Historically, the Python bindings used the full path including third_party for
most headers, but not all of them. This is inconsistent with the rest of MLIR.
Drop the prefix path in #include directives.
PiperOrigin-RevId: 237999346
This CL makes some minor changes to the declarative builder Helpers:
1. adds lb, ub, step methods to MemRefView to avoid always having to go through std::get + range;
2. drops MemRefView& from IndexedValue which was just creating ownership concerns. Instead, an IndexedValue only needs to keep track of the ValueHandle from which a MemRefView can be constructed on-demand if necessary.
PiperOrigin-RevId: 237861493
TensorFlow comparison ops like tf.Less support broadcast behavior, but the result
type has a different element type than the input types. Extend the broadcastable trait
to allow such cases. Added tf.Less to demonstrate it.
PiperOrigin-RevId: 237846127
So that we can use this function to deduce broadcasted shapes elsewhere.
Also added support for unknown dimensions, by following TensorFlow behavior.
PiperOrigin-RevId: 237846065
* Separate MyAnalysis into MyFunctionAnalysis/MyModuleAnalysis to avoid potential confusion.
* Add an example of an inline lambda builder for PassPipelineRegistration.
* Clarify the wording on a few of the pass restrictions.
PiperOrigin-RevId: 237840325
Below is the output for an example mlir-opt command line.
mlir-opt foo.mlir -verify-each=false -cse -canonicalize -cse -cse -pass-timing
list view (-pass-timing-display=list):
* In this mode the results are displayed in a list sorted by total time; with each pass/analysis instance aggregated into one unique result. This mode is similar to the output of 'time-passes' in llvm-opt.
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0097 seconds (0.0096 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0051 ( 58.3%) 0.0001 ( 12.2%) 0.0052 ( 53.8%) 0.0052 ( 53.8%) Canonicalizer
0.0025 ( 29.1%) 0.0005 ( 58.2%) 0.0031 ( 31.9%) 0.0031 ( 32.0%) CSE
0.0011 ( 12.6%) 0.0003 ( 29.7%) 0.0014 ( 14.3%) 0.0014 ( 14.2%) DominanceInfo
0.0087 (100.0%) 0.0009 (100.0%) 0.0097 (100.0%) 0.0096 (100.0%) Total
pipeline view (-pass-timing-display=pipeline):
* In this mode the results are displayed in a nested pipeline view that mirrors the internal pass pipeline that is being executed in the pass manager. This view is useful for understanding specifically which parts of the pipeline are taking the most time, and can also be used to identify when analyses are being invalidated and recomputed.
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0082 seconds (0.0081 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0042 (100.0%) 0.0039 (100.0%) 0.0082 (100.0%) 0.0081 (100.0%) Function Pipeline
0.0005 ( 11.6%) 0.0008 ( 21.1%) 0.0013 ( 16.1%) 0.0013 ( 16.2%) CSE
0.0002 ( 5.0%) 0.0004 ( 9.3%) 0.0006 ( 7.0%) 0.0006 ( 7.0%) (A) DominanceInfo
0.0026 ( 61.8%) 0.0018 ( 45.6%) 0.0044 ( 54.0%) 0.0044 ( 54.1%) Canonicalizer
0.0005 ( 11.7%) 0.0005 ( 13.0%) 0.0010 ( 12.3%) 0.0010 ( 12.4%) CSE
0.0003 ( 6.1%) 0.0003 ( 8.3%) 0.0006 ( 7.2%) 0.0006 ( 7.1%) (A) DominanceInfo
0.0002 ( 3.8%) 0.0001 ( 2.8%) 0.0003 ( 3.3%) 0.0003 ( 3.3%) CSE
0.0042 (100.0%) 0.0039 (100.0%) 0.0082 (100.0%) 0.0081 (100.0%) Total
PiperOrigin-RevId: 237825367
* before/after pass execution
* after a pass fails
* before/after an analysis is computed
After getting this infrastructure in place, we can start providing common developer utilities like pass timing, IR printing after pass execution, etc.
PiperOrigin-RevId: 237709692
Declarative builders want to provide the same nesting interface for blocks and loops. MLIR on the other hand has different behaviors:
1. when an AffineForOp is created the insertion point does not enter the loop body;
2. when a Block is created, the insertion point does enter the block body.
Guard against the second behavior in EDSC to make the interface unsurprising.
This also surfaces two places in the eager branch API where I was guarding against this behavior indirectly by creating a new ScopedContext.
Instead, uniformize everything to properly reset the insertion point in the unique place that builds the mlir::Block*.
PiperOrigin-RevId: 237619513
This CL addresses a few post-submit comments:
1. better comments,
2. check number of results before dyn_cast (which is a less common case)
3. test usage for multi-result InstructionHandle
PiperOrigin-RevId: 237549333
This CL adds support for named custom instructions in declarative builders.
To allow this, it introduces a templated `CustomInstruction` class.
This CL also splits ValueHandle, which can capture only the **value** in single-valued instructions, from InstructionHandle, which can capture any instruction but provides no typing or sugaring to extract the potential Value*.
PiperOrigin-RevId: 237543222
There are two ways that we can attach a name to a DAG node:
1) (Op:$name ...)
2) (Op ...):$name
The problem with 2) is that we cannot do it on the outermost DAG node in a tree.
Switch from 2) to 1).
PiperOrigin-RevId: 237513962
This CL added the ability to generate multiple ops using multiple result
patterns, with each of them replacing one result of the matched source op.
Specifically, the syntax is
```
def : Pattern<(SourceOp ...),
[(ResultOp1 ...), (ResultOp2 ...), (ResultOp3 ...)]>;
```
Assuming `SourceOp` has three results.
Currently we require that each result op generates exactly one result; this restriction
can be lifted later when use cases arise.
To help with cases where a certain output is unused and we don't care about it,
this CL also introduces a new directive: `verifyUnusedValue`. Checks will
be emitted in the `match()` method to make sure that if the corresponding output
is not unused, `match()` returns with `matchFailure()`.
PiperOrigin-RevId: 237513904
TensorFlow does not allow integers of arbitrary bitwidths. It only accepts 8-,
16-, 32-, and 64-bit integer types. Similarly, for floating-point types, it only
accepts half, single, double, and bfloat16 types.
PiperOrigin-RevId: 237483913
- compute tile sizes based on a simple model that looks at memory footprints
(instead of using the hardcoded default value)
- adjust tile sizes to make them factors of trip counts based on an option
- update loop fusion CL options to allow setting maximal fusion at pass creation
- change an emitError to emitWarning (since it's not a hard error unless the client
treats it that way, in which case, it can emit one)
$ mlir-opt -debug-only=loop-tile -loop-tile test/Transforms/loop-tiling.mlir
test/Transforms/loop-tiling.mlir:81:3: note: using tile sizes [4 4 5 ]
for %i = 0 to 256 {
for %i0 = 0 to 256 step 4 {
  for %i1 = 0 to 256 step 4 {
    for %i2 = 0 to 250 step 5 {
      for %i3 = #map4(%i0) to #map11(%i0) {
        for %i4 = #map4(%i1) to #map11(%i1) {
          for %i5 = #map4(%i2) to #map12(%i2) {
            %0 = load %arg0[%i3, %i5] : memref<8x8xvector<64xf32>>
            %1 = load %arg1[%i5, %i4] : memref<8x8xvector<64xf32>>
            %2 = load %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
            %3 = mulf %0, %1 : vector<64xf32>
            %4 = addf %2, %3 : vector<64xf32>
            store %4, %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
          }
        }
      }
    }
  }
}
PiperOrigin-RevId: 237461836
Recently, EDSC introduced an eager mode for building IR in different contexts.
Introduce Python bindings support for loop and loop nest contexts of EDSC
builders. The eager mode is built around the notion of ValueHandle, which is
a convenience class for delayed initialization and operator overloads. Expose
this class and overloads directly. The model of insertion contexts maps
naturally to Python context manager mechanism, therefore new bindings are
defined bypassing the C APIs. The bindings now provide three new context
manager classes: FunctionContext, LoopContext and LoopNestContext. The last
two can be used with the `with`-construct in Python to create loop (nests) and
obtain handles to the loop induction variables seamlessly:
  with LoopContext(lhs, rhs, 1) as i:
    lhs + rhs + i
    with LoopContext(rhs, rhs + rhs, 2) as j:
      x = i + j
Any statement within the Python context will trigger immediate emission of the
corresponding IR constructs into the context owned by the nearest context
manager.
PiperOrigin-RevId: 237447732
The first version of TableGen-defined LLVM IR Dialect did not include the
mandatory or optional attributes of the operations due to the missing support
for some of the relevant attribute types. This support has been recently
introduced, along with named attributes as arguments in the TableGen operation
definitions. With these changes, LLVM IR Dialect operations now have factory
functions accepting (unnamed) attributes and attaching their canonical names.
Use these factories instead of manually constructing named attributes in the
dialect converter to avoid hardcoded attribute names in unexpected places.
PiperOrigin-RevId: 237237769
These cleanups reflect some recent changes to the LLVM IR Dialect and the
infrastructure that affects it. In particular, add documentation on direct and
indirect function calls as well as remove the `call` and `call0` separation.
Change the prefix of custom types from `!llvm.type` to `!llvm` so that it
matches the IR. Remove the verifier check disallowing conditional branches to
the same block with arguments: identical arguments are now supported, and
different arguments will be caught later.
PiperOrigin-RevId: 237203452
The LLVM IR Dialect strives to be close to the original LLVM IR instructions.
The conversion from the LLVM IR Dialect to LLVM IR proper is mostly mechanical
and can be automated. Implement TableGen support for generating conversions
from a concise pattern form in the TableGen definition of the LLVM IR Dialect
operations. It is used for all operations except calls and branches. These
operations need access to function and block remapping tables and would require
significantly more code to generate the conversions from TableGen definitions
than the current manually written conversions.
This implementation is accompanied by various necessary changes to the TableGen
operation definition infrastructure. In particular, operation definitions now
contain named accessors to results as well as named accessors to the variadic
operand (returning a vector of operands). The base operation support TableGen
file now contains a FunctionAttr definition. TableGen now allows querying
the names of the operation results.
PiperOrigin-RevId: 237203077
* bool succeeded(Status)
- Returns true if the status corresponds to a success value.
* bool failed(Status)
- Returns true if the status corresponds to a failure value.
PiperOrigin-RevId: 237153884