- getDimensionBounds() was added initially for quick experimentation - no
longer used (getConstantBoundOnDimSize is the more powerful/complete
replacement).
- FlatAffineConstraints::getConstantLower/UpperBound are incomplete,
functionality/naming-wise misleading, and not used currently. Removing these;
complete/fixed version will be added in an upcoming CL.
PiperOrigin-RevId: 225075061
For each op, generate another builder with the following signature:
static void build(Builder* builder, OperationState* result,
ArrayRef<Type> resultTypes,
ArrayRef<SSAValue*> args,
ArrayRef<NamedAttribute> attributes);
PiperOrigin-RevId: 225066007
An extensive discussion demonstrated that it is difficult to support `index`
types as elements of compound (vector, memref, tensor) types. In particular,
their size is unknown until the target-specific lowering takes place. MLIR may
need to store constants of the fixed-shape compound types (e.g.,
vector<4 x index>) internally and must know the size of the element type and
data layout constraints. The same information is necessary for target-specific
lowering and translation to reliably support compound types with `index`
elements, but MLIR does not have a dedicated target description mechanism yet.
The uses cases for compound types with `index` elements, should they appear,
can be handled via an `index_cast` operation that converts between `index` and
fixed-size integer types at the SSA value level instead of the type level.
PiperOrigin-RevId: 225064373
- loop step wasn't handled and there wasn't a TODO or an assertion; fix this.
- rename 'delay' to shift for consistency/readability.
- other readability changes.
- remove duplicate attribute print for DmaStartOp; fix misplaced attribute
print for DmaWaitOp
- add build method for AddFOp (unrelated to this CL, but add it anyway)
PiperOrigin-RevId: 224892958
- add method normalizeConstraintsByGCD
- call normalizeConstraintsByGCD() and GCDTightenInequalities() at the end of
projectOut.
- remove call to GCDTightenInequalities() from getMemRefRegion
- change isEmpty() to check isEmptyByGCDTest() / hasInvalidConstraint() each
time an identifier is eliminated (to detect emptiness early).
- make FourierMotzkinEliminate, gaussianEliminateId(s),
GCDTightenInequalities() private
- improve / update stale comments
PiperOrigin-RevId: 224866741
- fix replaceAllMemRefUsesWith call to replace only inside loop body.
- handle the case where DMA buffers are dynamic; extend doubleBuffer() method
to handle dynamically shaped DMA buffers (pass the right operands to AllocOp)
- place alloc's for DMA buffers at the depth at which pipelining is being done
(instead of at top-level)
- add more test cases
PiperOrigin-RevId: 224852231
- generate DMAs correctly now using strided DMAs where needed
- add support for multi-level/nested strides; op still supports one level of
stride for now.
Other things
- add test case for symbolic lower/upper bound; cases where the DMA buffer
size can't be bounded by a known constant
- add test case for dynamic shapes where the DMA buffers are however bounded by
constants
- refactor some of the '-dma-generate' code
PiperOrigin-RevId: 224584529
This CL adds a pass that lowers VectorTransferReadOp and VectorTransferWriteOp
to a simple loop nest via local buffer allocations.
This is an MLIR->MLIR lowering based on builders.
A few TODOs are left to address in particular:
1. invert the permutation map so the accesses to the remote memref are coalesced;
2. pad the alloc for bank conflicts in local memory (e.g. GPUs shared_memory);
3. support broadcast / avoid copies when permutation_map is not of full column rank
4. add a proper "element_cast" op
One notable limitation is this does not plan on supporting boundary conditions.
It should be significantly easier to use pre-baked MLIR functions to handle such paddings.
This is left for future consideration.
Therefore the current CL only works properly for full-tile cases atm.
This CL also adds 2 simple tests:
```mlir
for %i0 = 0 to %M step 3 {
for %i1 = 0 to %N step 4 {
for %i2 = 0 to %O {
for %i3 = 0 to %P step 5 {
vector_transfer_write %f1, %A, %i0, %i1, %i2, %i3 {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d0)} : vector<5x4x3xf32>, memref<?x?x?x?xf32, 0>, index, index, index, index
```
lowers into:
```mlir
for %i0 = 0 to %arg0 step 3 {
for %i1 = 0 to %arg1 step 4 {
for %i2 = 0 to %arg2 {
for %i3 = 0 to %arg3 step 5 {
%1 = alloc() : memref<5x4x3xf32>
%2 = "element_type_cast"(%1) : (memref<5x4x3xf32>) -> memref<1xvector<5x4x3xf32>>
store %cst, %2[%c0] : memref<1xvector<5x4x3xf32>>
for %i4 = 0 to 5 {
%3 = affine_apply (d0, d1) -> (d0 + d1) (%i3, %i4)
for %i5 = 0 to 4 {
%4 = affine_apply (d0, d1) -> (d0 + d1) (%i1, %i5)
for %i6 = 0 to 3 {
%5 = affine_apply (d0, d1) -> (d0 + d1) (%i0, %i6)
%6 = load %1[%i4, %i5, %i6] : memref<5x4x3xf32>
store %6, %0[%5, %4, %i2, %3] : memref<?x?x?x?xf32>
dealloc %1 : memref<5x4x3xf32>
```
and
```mlir
for %i0 = 0 to %M step 3 {
for %i1 = 0 to %N {
for %i2 = 0 to %O {
for %i3 = 0 to %P step 5 {
%f = vector_transfer_read %A, %i0, %i1, %i2, %i3 {permutation_map: (d0, d1, d2, d3) -> (d3, 0, d0)} : (memref<?x?x?x?xf32, 0>, index, index, index, index) -> vector<5x4x3xf32>
```
lowers into:
```mlir
for %i0 = 0 to %arg0 step 3 {
for %i1 = 0 to %arg1 {
for %i2 = 0 to %arg2 {
for %i3 = 0 to %arg3 step 5 {
%1 = alloc() : memref<5x4x3xf32>
%2 = "element_type_cast"(%1) : (memref<5x4x3xf32>) -> memref<1xvector<5x4x3xf32>>
for %i4 = 0 to 5 {
%3 = affine_apply (d0, d1) -> (d0 + d1) (%i3, %i4)
for %i5 = 0 to 4 {
for %i6 = 0 to 3 {
%4 = affine_apply (d0, d1) -> (d0 + d1) (%i0, %i6)
%5 = load %0[%4, %i1, %i2, %3] : memref<?x?x?x?xf32>
store %5, %1[%i4, %i5, %i6] : memref<5x4x3xf32>
%6 = load %2[%c0] : memref<1xvector<5x4x3xf32>>
dealloc %1 : memref<5x4x3xf32>
```
PiperOrigin-RevId: 224552717
This CL adds a finer grain composition function between AffineExpr and an
unbounded map. This will be used in the next CL.
Also cleans up some comments remaining from a previous CL.
PiperOrigin-RevId: 224536314
This simplifies call-sites returning true after emitting an error. After the
conversion, dropped braces around single statement blocks as that seems more
common.
Also, switched to emitError method instead of emitting Error kind using the
emitDiagnostic method.
TESTED with existing unit tests
PiperOrigin-RevId: 224527868
If no custom builder is supplied for an op, TableGen now generates
a default builder for it with the following signature:
static void build(Builder *builder, OperationState* result,
<list-of-all-result-types>,
<list-of-all-operands>,
<list-of-all-attributes>);
PiperOrigin-RevId: 224382473
This CLs adds proper error emission, removes NYI assertions and documents
assumptions that are required in the relevant functions.
PiperOrigin-RevId: 224377143
This CL adds the following free functions:
```
/// Returns the AffineExpr e o m.
AffineExpr compose(AffineExpr e, AffineMap m);
/// Returns the AffineExpr f o g.
AffineMap compose(AffineMap f, AffineMap g);
```
This addresses the issue that AffineMap composition is only available at a
distance via AffineValueMap and is thus unusable on Attributes.
This CL thus implements AffineMap composition in a more modular and composable
way.
This CL does not claim that it can be a good replacement for the
implementation in AffineValueMap, in particular it does not support bounded
maps atm.
Standalone tests are added that replicate some of the logic of the AffineMap
composition pass.
Lastly, affine map composition is used properly inside MaterializeVectors and
a standalone test is added that requires permutation_map composition with a
projection map.
PiperOrigin-RevId: 224376870
This CL hooks up and uses permutation_map in vector_transfer ops.
In particular, when going into the nuts and bolts of the implementation, it
became clear that cases arose that required supporting broadcast semantics.
Broadcast semantics are thus added to the general permutation_map.
The verify methods and tests are updated accordingly.
Examples of interest include.
Example 1:
The following MLIR snippet:
```mlir
for %i3 = 0 to %M {
for %i4 = 0 to %N {
for %i5 = 0 to %P {
%a5 = load %A[%i4, %i5, %i3] : memref<?x?x?xf32>
}}}
```
may vectorize with {permutation_map: (d0, d1, d2) -> (d2, d1)} into:
```mlir
for %i3 = 0 to %0 step 32 {
for %i4 = 0 to %1 {
for %i5 = 0 to %2 step 256 {
%4 = vector_transfer_read %arg0, %i4, %i5, %i3
{permutation_map: (d0, d1, d2) -> (d2, d1)} :
(memref<?x?x?xf32>, index, index) -> vector<32x256xf32>
}}}
````
Meaning that vector_transfer_read will be responsible for reading the 2-D slice:
`%arg0[%i4, %i5:%15+256, %i3:%i3+32]` into vector<32x256xf32>. This will
require a transposition when vector_transfer_read is further lowered.
Example 2:
The following MLIR snippet:
```mlir
%cst0 = constant 0 : index
for %i0 = 0 to %M {
%a0 = load %A[%cst0, %cst0] : memref<?x?xf32>
}
```
may vectorize with {permutation_map: (d0) -> (0)} into:
```mlir
for %i0 = 0 to %0 step 128 {
%3 = vector_transfer_read %arg0, %c0_0, %c0_0
{permutation_map: (d0, d1) -> (0)} :
(memref<?x?xf32>, index, index) -> vector<128xf32>
}
````
Meaning that vector_transfer_read will be responsible of reading the 0-D slice
`%arg0[%c0, %c0]` into vector<128xf32>. This will require a 1-D vector
broadcast when vector_transfer_read is further lowered.
Additionally, some minor cleanups and refactorings are performed.
One notable thing missing here is the composition with a projection map during
materialization. This is because I could not find an AffineMap composition
that operates on AffineMap directly: everything related to composition seems
to require going through SSAValue and only operates on AffinMap at a distance
via AffineValueMap. I have raised this concern a bunch of times already, the
followup CL will actually do something about it.
In the meantime, the projection is hacked at a minimum to pass verification
and materialiation tests are temporarily incorrect.
PiperOrigin-RevId: 224376828
The implementation of OpPointer<OpType> provides an implicit conversion to
Operation *, but not to the underlying OpType *. This has led to
awkward-looking code when an OpPointer needs to be passed to a function
accepting an OpType *. For example,
if (auto someOp = genericOp.dyn_cast<OpType>())
someFunction(&*someOp);
where "&*" makes it harder to read. Arguably, one does not want to spell out
OpPointer<OpType> in the line with dyn_cast. More generally, OpPointer is now
being used as an owning pointer to OpType rather than to operation.
Replace the implicit conversion to Operation* with the conversion to OpType*
taking into account const-ness of the type. An Operation* can be obtained from
an OpType with a simple call. Since an instance of OpPointer owns the OpType
value, the pointer to it is never null. However, the OpType value may not be
associated with any Operation*. In this case, return nullptr when conversion
is attempted to maintain consistency with the existing null checks.
PiperOrigin-RevId: 224368103
- add optional stride arguments for DmaStartOp
- add DmaStartOp::verify(), and missing test cases for DMA op's in
test/IR/memory-ops.mlir.
PiperOrigin-RevId: 224232466
update/improve/clean up API.
- update FlatAffineConstraints::getConstBoundDifference; return constant
differences between symbolic affine expressions, look at equalities as well.
- fix buffer size computation when generating DMAs symbolic in outer loops,
correctly handle symbols at various places (affine access maps, loop bounds,
loop IVs outer to the depth at which DMA generation is being done)
- bug fixes / complete some TODOs for getMemRefRegion
- refactor common code b/w memref dependence check and getMemRefRegion
- FlatAffineConstraints API update; added methods employ trivial checks /
detection - sufficient to handle hyper-rectangular cases in a precise way
while being fast / low complexity. Hyper-rectangular cases fall out as
trivial cases for these methods while other cases still do not cause failure
(either return conservative or return failure that is handled by the caller).
PiperOrigin-RevId: 224229879
The checks for `isa<IndexType>() || isa<IntegerType>()` and
`isa<IndexType>() || isa<IntegerType>() || isa<FloatType>()`
are frequently used, so it's useful to have some helper
methods for them.
PiperOrigin-RevId: 224133596
removeColumnRange
- remove functionally duplicate code in removeId.
- rename removeColumnRange -> removeIdRange - restrict valid input to just the
identifier columns (not the constant term column).
PiperOrigin-RevId: 224054064
This is an obvious bug, but none of the test cases exposed it since numIds was
correctly updated, and the dimensional identifiers were always eliminated
before the symbolic identifiers in all cases that removeId was getting
called from. However, other work in progress exercises the other scenarios and
exposes this bug.
Add an hasConsistentState() private method to move common assertion checks, and call it
from several base methods. Make hasInvalidConstraint() a private method as
well (from a file static one).
PiperOrigin-RevId: 224032721
As it turns out, the TFLite runtime already supports implicit broadcasting
for math binary ops. As the instruction set for TFLite runtime, the tfl
dialect should reflect that, instead of requiring both operands for binary
ops to be of the same type.
To support implicit broadcast means it's not suitable to provide the
short-form assembly for TFLite binary ops anymore. So by default, we should
just provide the canonical-form assembly parser/printer for base binary op.
It's subclasses' choices whether to opt in to short-form.
Added BroadcastableTwoOperandsOneResult as a new dialect trait for checking
the operand and result types for TFLite binary ops.
Also added SameOperandsAndResultType to several neural network ops.
PiperOrigin-RevId: 224027445
This CL implements and uses VectorTransferOps in lieu of the former custom
call op. Tests are updated accordingly.
VectorTransferOps come in 2 flavors: VectorTransferReadOp and
VectorTransferWriteOp.
VectorTransferOps can be thought of as a backend-independent
pseudo op/library call that needs to be legalized to MLIR (whiteboxed) before
it can be lowered to backend-dependent IR.
Note that the current implementation does not yet support a real permutation
map. Proper support will come in a followup CL.
VectorTransferReadOp
====================
VectorTransferReadOp performs a blocking read from a scalar memref
location into a super-vector of the same elemental type. This operation is
called 'read' by opposition to 'load' because the super-vector granularity
is generally not representable with a single hardware register. As a
consequence, memory transfers will generally be required when lowering
VectorTransferReadOp. A VectorTransferReadOp is thus a mid-level abstraction
that supports super-vectorization with non-effecting padding for full-tile
only code.
A vector transfer read has semantics similar to a vector load, with additional
support for:
1. an optional value of the elemental type of the MemRef. This value
supports non-effecting padding and is inserted in places where the
vector read exceeds the MemRef bounds. If the value is not specified,
the access is statically guaranteed to be within bounds;
2. an attribute of type AffineMap to specify a slice of the original
MemRef access and its transposition into the super-vector shape. The
permutation_map is an unbounded AffineMap that must represent a
permutation from the MemRef dim space projected onto the vector dim
space.
Example:
```mlir
%A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32>
...
%val = `ssa-value` : f32
// let %i, %j, %k, %l be ssa-values of type index
%v0 = vector_transfer_read %src, %i, %j, %k, %l
{permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
(memref<?x?x?x?xf32>, index, index, index, index) ->
vector<16x32x64xf32>
%v1 = vector_transfer_read %src, %i, %j, %k, %l, %val
{permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
(memref<?x?x?x?xf32>, index, index, index, index, f32) ->
vector<16x32x64xf32>
```
VectorTransferWriteOp
=====================
VectorTransferWriteOp performs a blocking write from a super-vector to
a scalar memref of the same elemental type. This operation is
called 'write' by opposition to 'store' because the super-vector
granularity is generally not representable with a single hardware register. As
a consequence, memory transfers will generally be required when lowering
VectorTransferWriteOp. A VectorTransferWriteOp is thus a mid-level
abstraction that supports super-vectorization with non-effecting padding
for full-tile only code.
A vector transfer write has semantics similar to a vector store, with
additional support for handling out-of-bounds situations.
Example:
```mlir
%A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32>.
%val = `ssa-value` : vector<16x32x64xf32>
// let %i, %j, %k, %l be ssa-values of type index
vector_transfer_write %val, %src, %i, %j, %k, %l
{permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
(vector<16x32x64xf32>, memref<?x?x?x?xf32>, index, index, index, index)
```
PiperOrigin-RevId: 223873234
FlatAffineConstraints::composeMap: should return false instead of asserting on
a semi-affine map. Make getMemRefRegion just propagate false when encountering
semi-affine maps (instead of crashing!)
PiperOrigin-RevId: 223828743
The algorithm collects defining operations within a scoped hash table. The scopes within the hash table correspond to nodes within the dominance tree for a function. This cl only adds support for simple operations, i.e non side-effecting. Such operations, e.g. load/store/call, will be handled in later patches.
PiperOrigin-RevId: 223811328
Remove tfl.reshape for the following two cases:
1. A tfl.reshape's input is from another tfl.reshape.
Then these two tfl.reshape ops can be merged.
2. A tfl.reshape's result type is the same as its input type.
This tfl.reshape op does nothing, which can be removed.
These transformations are put in a new source file, Canonicalizer.cpp,
because they are TFLite op to TFLite op transformations, and aiming
to making TFLite ops more canonicalized.
Also added a hasCanonicalizationPatterns marker in TableGen Op class
to indicate whether an op has custom getCanonicalizationPatterns().
PiperOrigin-RevId: 223806921
class. This change is NFC, but allows for new kinds of patterns, specifically
LegalizationPatterns which will be allowed to change the types of things they
rewrite.
PiperOrigin-RevId: 223243783
This CL added two new traits, SameOperandsAndResultShape and
ResultsAreBoolLike, and changed CmpIOp to embody these two
traits. As a consequence, CmpIOp's result type now is verified
to be bool-like.
PiperOrigin-RevId: 223208438
Derived attributes are attributes that are derived from other properties of the operation (e.g., the shape returned from the type). DerivedAttr is parameterized on the return type and function body.
PiperOrigin-RevId: 223180315
The semantics of 'select' is conventional: return the second operand if the
first operand is true (1 : i1) and the third operand otherwise. It is
applicable to vectors and tensors element-wise, similarly to LLVM instruction.
This operation is necessary to implement min/max to lower 'for' loops with
complex bounds to CFG functions and to support ternary operations in ML
functions. It is preferred to first-class min/max because of its simplicity,
e.g. it is not concered with signedness.
PiperOrigin-RevId: 223160860
* Added TF::FusedBatchNormOp
* Validated TF::FusedBatchNormOp's operands
* Added converter from tf.FusedBatchNorm to tfl math ops
In the converter, we additionally check that the 'is_training'
attribute in tf.FusedBatchNorm is false and the last 4 outputs
are all not used (true for inference). These requirements do
not exist in the original TOCO source code, which just silently
ignores the last 4 outputs.
PiperOrigin-RevId: 223027333
Several things were suggested in post-submission reviews. In particular, use
pointers in function interfaces instead of references (still use references
internally). Clarify the behavior of the pass in presence of MLFunctions.
PiperOrigin-RevId: 222556851
This CL adds tooling for computing slices as an independent CL.
The first consumer of this analysis will be super-vector materialization in a
followup CL.
In particular, this adds:
1. a getForwardStaticSlice function with documentation, example and a
standalone unit test;
2. a getBackwardStaticSlice function with documentation, example and a
standalone unit test;
3. a getStaticSlice function with documentation, example and a standalone unit
test;
4. a topologicalSort function that is exercised through the getStaticSlice
unit test.
The getXXXStaticSlice functions take an additional root (resp. terminators)
parameter which acts as a boundary that the transitive propagation algorithm
is not allowed to cross.
PiperOrigin-RevId: 222446208
Not having self-contained headers in LLVM is a constant pain. Don't make the
same mistake in mlir. The only interesting change here is moving setSuccessor
to Instructions.cpp, which breaks the cycle between Instructions.h and
BasicBlock.h.
PiperOrigin-RevId: 222440816
cases.
- fix bug in calculating index expressions for DMA buffers in certain cases
(affected tiled loop nests); add more test cases for better coverage.
- introduce an additional optional argument to replaceAllMemRefUsesWith;
additional operands to the index remap AffineMap can now be supplied by the
client.
- FlatAffineConstraints::addBoundsForStmt - fix off by one upper bound,
::composeMap - fix position bug.
- Some clean up and more comments
PiperOrigin-RevId: 222434628
This function pass replaces affine_apply operations in CFG functions with
sequences of primitive arithmetic instructions that form the affine map.
The actual replacement functionality is located in LoweringUtils as a
standalone function operating on an individual affine_apply operation and
inserting the result at the location of the original operation. It is expected
to be useful for other, target-specific lowering passes that may start at
MLFunction level that Deaffinator does not support.
PiperOrigin-RevId: 222406692
Initial restricted implementaiton of the MLIR to LLVM IR translation.
Introduce a new flow into the mlir-translate tool taking an MLIR module
containing CFG functions only and producing and LLVM IR module. The MLIR
features supported by the translator are as follows:
- primitive and function types;
- integer constants;
- cfg and ext functions with 0 or 1 return values;
- calls to these functions;
- basic block conversion translation of arguments to phi nodes;
- conversion between arguments of the first basic block and function arguments;
- (conditional) branches;
- integer addition and comparison operations.
Are NOT supported:
- vector and tensor types and operations on them;
- memrefs and operations on them;
- allocations;
- functions returning multiple values;
- LLVM Module triple and data layout (index type is hardcoded to i64).
Create a new MLIR library and place it under lib/Target/LLVMIR. The "Target"
library group is similar to the one present in LLVM and is intended to contain
all future public MLIR translation targets.
The general flow of MLIR to LLVM IR convresion will include several lowering
and simplification passes on the MLIR itself in order to make the translation
as simple as possible. In particular, ML functions should be transformed to
CFG functions by the recently introduced pass, operations on structured types
will be converted to sequences of operations on primitive types, complex
operations such as affine_apply will be converted into sequence of primitive
operations, primitive operations themselves may eventually be converted to an
LLVM dialect that uses LLVM-like operations.
Introduce the first translation test so that further changes make sure the
basic translation functionality is not broken.
PiperOrigin-RevId: 222400112
This has been a long-standing TODO in the build system. Now that we need to
share the non-inlined implementation of file utilities for translators, create
a separate library for support functionality. Move Support/* headers to the
new library in the build system.
PiperOrigin-RevId: 222398880
Translations performed by mlir-translate only have MLIR on one end.
MLIR-to-MLIR conversions (including dialect changes) should be treated as
passes and run by mlir-opt. Individual translations should not care about
reading or writing MLIR and should work on in-memory representation of MLIR
modules instead. Split the TranslateFunction interface and the translate
registry into two parts: "from MLIR" and "to MLIR".
Update mlir-translate to handle both registries together by wrapping
translation functions into source-to-source convresions. Remove MLIR parsing
and writing from individual translations and make them operate on Modules
instead. This removes the need for individual translators to include
tools/mlir-translate/mlir-translate.h, which can now be safely removed.
Remove mlir-to-mlir translation that only existed as a registration example and
use mlir-opt instead for tests.
PiperOrigin-RevId: 222398707
The mlir-translate tool is expected to discover individual translations at link
time. These translations must register themselves and may need the utilities
that are currently defined in mlir-translate.cpp for their entry point
functions. Since mlir-translate is linking against individual translations,
the translations cannot link against mlir-translate themselves. Extract out
the utilities into a separate "Translation" library to avoid the potential
dependency cycle. Individual translations link to that library to access
TranslateRegistration. The mlir-translate tool links to individual translations
and to the "Translation" library because it needs the utilities as well.
The main header of the new library is located in include/mlir/Translation.h to
make it easily accessible by translators. The rationale for putting it to
include/mlir rather than to one of its subdirectories is that its purpose is
similar to that of include/mlir/Pass.h so it makes sense to put them at the
same level.
PiperOrigin-RevId: 222398617
Also, added iterators for VariadicResults class.
TESTED with unit tests
TODOs:
- Handle non-bool condition results (similar to the IfOp)
- Use PatternRewriter
PiperOrigin-RevId: 222340376
Existing default visitation function for dimension and symbols were called
"visitAffineDimExpr" and "visitAffineSymbolExpr". However, generic CRTP-based
visit and walk methods were calling "visitDimExpr" and "visitSymbolExpr",
respectively, on derived classes. This has not been discovered before because
all existing affine expression visitors (re)define functions for dimensions and
symbols. Change the names of the default empty visitation functions to the
latter form.
PiperOrigin-RevId: 222312114
This reverts the previous method which needs to create a new dialect with the
constant fold hook from TensorFlow. This new method uses a function object in
dialect to store the constant fold hook. Once a hook is registered to the
dialect, this function object will be assigned when the dialect is added to the
MLIRContext.
For the operations which are not registered, a new method getRegisteredDialects
is added to the MLIRContext to query the dialects which matches their op name
prefixes.
PiperOrigin-RevId: 222310149
This CL refactors a few things in Vectorize.cpp:
1. a clear distinction is made between:
a. the LoadOp are the roots of vectorization and must be vectorized
eagerly and propagate their value; and
b. the StoreOp which are the terminals of vectorization and must be
vectorized late (i.e. they do not produce values that need to be
propagated).
2. the StoreOp must be vectorized late because in general it can store a value
that is not reachable from the subset of loads defined in the
current pattern. One trivial such case is storing a constant defined at the
top-level of the MLFunction and that needs to be turned into a splat.
3. a description of the algorithm is given;
4. the implementation matches the algorithm;
5. the last example is made parametric, in practice it will fully rely on the
implementation of vector_transfer_read/write which will handle boundary
conditions and padding. This will happen by lowering to a lower-level
abstraction either:
a. directly in MLIR (whether DMA or just loops or any async tasks in the
future) (whiteboxing);
b. in LLO/LLVM-IR/whatever blackbox library call/ search + swizzle inventor
one may want to use;
c. a partial mix of a. and b. (grey-boxing)
5. minor cleanups are applied;
6. mistakenly disabled unit tests are re-enabled (oopsie).
With this CL, this MLIR snippet:
```
mlfunc @vector_add_2d(%M : index, %N : index) -> memref<?x?xf32> {
%A = alloc (%M, %N) : memref<?x?xf32>
%B = alloc (%M, %N) : memref<?x?xf32>
%C = alloc (%M, %N) : memref<?x?xf32>
%f1 = constant 1.0 : f32
%f2 = constant 2.0 : f32
for %i0 = 0 to %M {
for %i1 = 0 to %N {
// non-scoped %f1
store %f1, %A[%i0, %i1] : memref<?x?xf32>
}
}
for %i4 = 0 to %M {
for %i5 = 0 to %N {
%a5 = load %A[%i4, %i5] : memref<?x?xf32>
%b5 = load %B[%i4, %i5] : memref<?x?xf32>
%s5 = addf %a5, %b5 : f32
// non-scoped %f1
%s6 = addf %s5, %f1 : f32
store %s6, %C[%i4, %i5] : memref<?x?xf32>
}
}
return %C : memref<?x?xf32>
}
```
vectorized with these arguments:
```
-vectorize -virtual-vector-size 256 --test-fastest-varying=0
```
vectorization produces this standard innermost-loop vectorized code:
```
mlfunc @vector_add_2d(%arg0 : index, %arg1 : index) -> memref<?x?xf32> {
%0 = alloc(%arg0, %arg1) : memref<?x?xf32>
%1 = alloc(%arg0, %arg1) : memref<?x?xf32>
%2 = alloc(%arg0, %arg1) : memref<?x?xf32>
%cst = constant 1.000000e+00 : f32
%cst_0 = constant 2.000000e+00 : f32
for %i0 = 0 to %arg0 {
for %i1 = 0 to %arg1 step 256 {
%cst_1 = constant splat<vector<256xf32>, 1.000000e+00> : vector<256xf32>
"vector_transfer_write"(%cst_1, %0, %i0, %i1) : (vector<256xf32>, memref<?x?xf32>, index, index) -> ()
}
}
for %i2 = 0 to %arg0 {
for %i3 = 0 to %arg1 step 256 {
%3 = "vector_transfer_read"(%0, %i2, %i3) : (memref<?x?xf32>, index, index) -> vector<256xf32>
%4 = "vector_transfer_read"(%1, %i2, %i3) : (memref<?x?xf32>, index, index) -> vector<256xf32>
%5 = addf %3, %4 : vector<256xf32>
%cst_2 = constant splat<vector<256xf32>, 1.000000e+00> : vector<256xf32>
%6 = addf %5, %cst_2 : vector<256xf32>
"vector_transfer_write"(%6, %2, %i2, %i3) : (vector<256xf32>, memref<?x?xf32>, index, index) -> ()
}
}
return %2 : memref<?x?xf32>
}
```
Of course, much more intricate n-D imperfectly-nested patterns can be emitted too in a fully declarative fashion, but this is enough for now.
PiperOrigin-RevId: 222280209
Added TF::Conv2D op and TFL::Conv2D op, and converted TF::Conv2D to
TFL::Conv2D, which need to address the operand numberr mismatch
and attribute conversion.
PiperOrigin-RevId: 222277554
This CL adds some vector support in prevision of the upcoming vector
materialization pass. In particular this CL adds 2 functions to:
1. compute the multiplicity of a subvector shape in a supervector shape;
2. help match operations on strict super-vectors. This is defined for a given
subvector shape as an operation that manipulates a vector type that is an
integral multiple of the subtype, with multiplicity at least 2.
This CL also adds a TestUtil pass where we can dump arbitrary testing of
functions and analysis that operate at a much smaller granularity than a pass
(e.g. an analysis for which it is convenient to write a bit of artificial MLIR
and write some custom test). This is in order to keep using Filecheck for
things that essentially look and feel like C++ unit tests.
PiperOrigin-RevId: 222250910
and getMemRefRegion() to work with specified loop depths; add support for
outgoing DMAs, store op's.
- add support for getMemRefRegion symbolic in outer loops - hence support for
DMAs symbolic in outer surrounding loops.
- add DMA generation support for outgoing DMAs (store op's to lower memory
space); extend getMemoryRegion to store op's. -memref-bound-check now works
with store op's as well.
- fix dma-generate (references to the old memref in the dma_start op were also
being replaced with the new buffer); we need replace all memref uses to work
only on a subset of the uses - add a new optional argument for
replaceAllMemRefUsesWith. update replaceAllMemRefUsesWith to take an optional
'operation' argument to serve as a filter - if provided, only those uses that
are dominated by the filter are replaced.
- Add missing print for attributes for dma_start, dma_wait op's.
- update the FlatAffineConstraints API
PiperOrigin-RevId: 221889223
This would also make the CallOp and ExtractElementOp invocations from eliminateIfOp function always valid and removes the need for error handling.
Also, verify TensorFlowOp trait.
PiperOrigin-RevId: 221737192
We do some limited renaming here but define an alias for OperationInst so that a follow up cl can solely perform the large scale renaming.
PiperOrigin-RevId: 221726963
* Optionally attach the type of integer and floating point attributes to the attributes, this allows restricting a int/float to specific width.
- Currently this allows suffixing int/float constant with type [this might be revised in future].
- Default to i64 and f32 if not specified.
* For index types the APInt width used is 64.
* Change callers to request a specific attribute type.
* Store iN type with APInt of width N.
* This change does not handle the folding of constants of different types (e.g., doing int type promotions to support constant folding i3 and i32), and instead restricts the constant folding to only operate on the same types.
PiperOrigin-RevId: 221722699
Unranked tensors used to return an empty list of dimensions as their shape. This is confusing since an empty list of dimensions is also returned for 0-D tensors. In particular, hasStaticShape() method used to check if any of the dimensions are -1, which held for unranked tensors even though they don't have static shape.
PiperOrigin-RevId: 221571138
Array attributes can nested and function attributes can appear anywhere at that
level. They should be remapped to point to the generated CFGFunction after
ML-to-CFG conversion, similarly to plain function attributes. Extract the
nested attribute remapping functionality from the Parser to Utils. Extract out
the remapping function for individual Functions from the module remapping
function. Use these new functions in the ML-to-CFG conversion pass and in the
parser.
PiperOrigin-RevId: 221510997
This CL adds support for and a vectorization test to perform scalar 2-D addf.
The support extension notably comprises:
1. extend vectorizable test to exclude vector_transfer operations and
expose them to LoopAnalysis where they are needed. This is a temporary
solution a concrete MLIR Op exists;
2. add some more functional sugar mapKeys, apply and ScopeGuard (which became
relevant again);
3. fix improper shifting during coarsening;
4. rename unaligned load/store to vector_transfer_read/write and simplify the
design removing the unnecessary AllocOp that were introduced prematurely:
vector_transfer_read currently has the form:
(memref<?x?x?xf32>, index, index, index) -> vector<32x64x256xf32>
vector_transfer_write currently has the form:
(vector<32x64x256xf32>, memref<?x?x?xf32>, index, index, index) -> ()
5. adds vectorizeOperations which traverses the operations in a ForStmt and
rewrites them to their vector form;
6. add support for vector splat from a constant.
The relevant tests are also updated.
PiperOrigin-RevId: 221421426
* Add skeleton br/cond_br builtin ops.
* Add a terminator trait for operations.
* Mark ReturnOp as a Terminator.
The functionality for managing/parsing/verifying successors will be added in a follow up cl.
PiperOrigin-RevId: 221283000
Similarly to other types, introduce "get" and "getChecked" static member
functions for IntegerType. The latter emits errors to the error handler
registered with the MLIR context and returns a null type for the caller to
handle errors gracefully. This deduplicates type consistency checks between
the parser and the builder. Update the parser to call IntegerType::getChecked
for error reporting instead of the builder that would simply assert.
This CL completes the type system error emission refactoring: the parser now
only emits syntax-related errors for types while type factory systems may emit
type consistency errors.
PiperOrigin-RevId: 221165207
Implement a pass converting a subset of MLFunctions to CFGFunctions. Currently
supports arbitrarily complex imperfect loop nests with statically constant
(i.e., not affine map) bounds filled with operations. Does NOT support
branches and non-constant loop bounds.
Conversion is performed per-function and the function names are preserved to
avoid breaking any external references to the current module. In-memory IR is
updated to point to the right functions in direct calls and constant loads.
This behavior is tested via a really hidden flag that enables function
renaming.
Inside each function, the control flow conversion is based on single-entry
single-exit regions, i.e. subgraphs of the CFG that have exactly one incoming
and exactly one outgoing edge. Since an MLFunction must have a single "return"
statement as per MLIR spec, it constitutes an SESE region. Individual
operations are appended to this region. Control flow statements are
recursively converted into such regions that are concatenated with the current
region. Bodies of the compound statement also form SESE regions, which allows
to nest control flow statements easily. Note that SESE regions are not
materialized in the code. It is sufficent to keep track of the end of the
region as the current instruction insertion point as long as all recursive
calls update the insertion point in the end.
The converter maintains a mapping between SSA values in ML functions and their
CFG counterparts. The mapping is used to find the operands for each operation
and is updated to contain the results of each operation as the conversion
continues.
PiperOrigin-RevId: 221162602
Change the storage type to APInt from int64_t for IntegerAttr (following the change to APFloat storage in FloatAttr). Effectively a direct change from int64_t to 64-bit APInt throughout (the bitwidth hardcoded). This change also adds a getInt convenience method to IntegerAttr and replaces previous getValue calls with getInt calls.
While this changes updates the storage type, it does not update all constant folding calls.
PiperOrigin-RevId: 221082788
time. The "Fast and Flexible Instruction Selection With Constraints" paper
from CC2018 makes a credible argument that dynamic costs aren't actually
necessary/important, and we are not using them.
- Check in my "MLIR Generic DAG Rewriter Infrastructure" design doc into the
source tree.
PiperOrigin-RevId: 221017546
These are locations that form a collection of other source locations with an optional metadata attribute.
- Add initial support for print/dump for locations.
Location Printing Examples:
* Unknown : [unknown-location]
* FileLineColLoc : third_party/llvm/llvm/projects/google-mlir/test/TensorFlowLite/legalize.mlir:6:8
* FusedLoc : <"tfl-legalize">[third_party/llvm/llvm/projects/google-mlir/test/TensorFlowLite/legalize.mlir:6:8, third_party/llvm/llvm/projects/google-mlir/test/TensorFlowLite/legalize.mlir:7:8]
- Add diagnostic support for fused locs.
* Prints the first location as the main location and the remaining as "fused from here" notes:
e.g.
third_party/llvm/llvm/projects/google-mlir/test/TensorFlowLite/legalize.mlir:6:8: error: This is an error.
%1 = "tf.add"(%arg0, %0) : (i32, i32) -> i32
^
third_party/llvm/llvm/projects/google-mlir/test/TensorFlowLite/legalize.mlir:7:8: error: Fused from here.
%2 = "tf.relu"(%1) : (i32) -> i32
^
PiperOrigin-RevId: 220835552
Updates MemRefDependenceCheck to check and report on all memref access pairs at all loop nest depths.
Updates old and adds new memref dependence check tests.
Resolves multiple TODOs.
PiperOrigin-RevId: 220816515
- constant bounded memory regions, static shapes, no handling of
overlapping/duplicate regions (through union) for now; also only, load memory
op's.
- add build methods for DmaStartOp, DmaWaitOp.
- move getMemoryRegion() into Analysis/Utils and expose it.
- fix addIndexSet, getMemoryRegion() post switch to exclusive upper bounds;
update test cases for memref-bound-check and memref-dependence-check for
exclusive bounds (missed in a previous CL)
PiperOrigin-RevId: 220729810
This CL introduces the following related changes:
- move tensor element type validity checking to a static member function
TensorType::isValidElementType
- introduce get/getChecked similarly to MemRefType, where the checked function
emits errors and returns nullptrs;
- remove duplicate element type validity checking from the parser and rely on
the type constructor to emit errors instead.
PiperOrigin-RevId: 220694831
This CL introduces the following related changes:
- factor out element type validity checking to a static member function
VectorType::isValidElementType;
- introduce get/getChecked similarly to MemRefType, where the checked function
emits errors and returns nullptrs;
- remove duplicate element type validity checking from the parser and rely on
the type constructor to emit errors instead.
PiperOrigin-RevId: 220693828
Value type abstraction for locations differ from others in that a Location can NOT be null. NOTE: dyn_cast returns an Optional<T>.
PiperOrigin-RevId: 220682078
Arithmetic and comparison instructions are necessary to implement, e.g.,
control flow when lowering MLFunctions to CFGFunctions. (While it is possible
to replace some of the arithmetics by affine_apply instructions for loop
bounds, it is still necessary for loop bounds checking, steps, if-conditions,
non-trivial memref subscripts, etc.) Furthermore, working with indirect
accesses in, e.g., lookup tables for large embeddings, may require operating on
tensors of indexes. For example, the equivalents to C code "LUT[Index[i]]" or
"ResultIndex[i] = i + j" where i, j are loop induction variables require the
arithmetics on indices as well as the possibility to operate on tensors
thereof. Allow arithmetic and comparison operations to apply to index types by
declaring them integer-like. Allow tensors whose element type is index for
indirection purposes.
The absence of vectors with "index" element type is explicitly tested, but the
only justification for this restriction in the CL introducing the test is
"because we don't need them". Do NOT enable vectors of index types, although
it makes vector and tensor types inconsistent with respect to allowed element
types.
PiperOrigin-RevId: 220614055
Previously, index (aka affint) type was hidden under OtherType in the type API.
We will need to identify and operate on values of index types in the upcoming
MLFunc->CFGFunc(->LLVM) lowering passes. Materialize index type into a
separate class and make it visible to LLVM RTTI hierarchy directly.
Practically, index is an integer type of unknown bit width and is accetable in
most places where regular integer types are. This is purely an API change that
does not affect the IR.
After IndexType is separated out from OtherType, the remaining "other types"
are, in fact, TF-specific types only. Further renaming may be of interest.
PiperOrigin-RevId: 220614026
This binary operation is applicable to integers, vectors and tensors thereof
similarly to binary arithmetic operations. The operand types must match
exactly, and the shape of the result type is the same as that of the operands.
The element type of the result is always i1. The kind of the comparison is
defined by the "predicate" integer attribute. This attribute requests one of:
- equals to;
- not equals to;
- signed less than;
- signed less than or equals;
- signed greater than;
- signed greater than or equals;
- unsigned less than;
- unsigned less than or equals;
- unsigned greater than;
- unsigned greater than or equals.
Since integer values themselves do not have a sign, the comparison operator
specifies whether to use signed or unsigned comparison logic, i.e. whether to
interpret values where the foremost bit is set as negatives expressed as two's
complements or as positive values. For non-scalar operands, pairwise
per-element comparison is performed. Comparison operators on scalars are
necessary to implement basic control flow with conditional branches.
PiperOrigin-RevId: 220613566
The passID is not currently stored in Pass but this avoids the unused variable warning. The passID is used to uniquely identify passes, currently this is only stored/used in PassInfo.
PiperOrigin-RevId: 220485662
Common transformation is replacing a op with single result with new op. Adding two variants to enable specifying list of ops that could be removed if they are dead.
PiperOrigin-RevId: 220473903
Add static pass registration and change mlir-opt to use it. Future work is needed to refactor the registration for PassManager usage.
Change build targets to alwayslink to enforce registration.
PiperOrigin-RevId: 220390178
Introduce new OpTraits verifying relation between operands of an Operation,
similarly to its results. Arithmetic operations are defined separately for
integer and floating point types. While we are currently leveraging the
equality of result and operand types to make sure the right arithmetic
operations are used for the right types, we may eventually want to verify
operand types directly. Furthermore, for upcoming comparison operations, the
type of the result differs from those of the operands so we need to verify the
operand types directly. Similarly, we will want to restrict comparisons (and
potentially binary arithmetic operations) to operands of the same type.
PiperOrigin-RevId: 220365629
- simple perfectly nested band tiling with fixed tile sizes.
- only the hyper-rectangular case is handled, with other limitations of
getIndexSet applying (constant loop bounds, etc.); once
the latter utility is extended, tiled code generation should become more
general.
- Add FlatAffineConstraints::isHyperRectangular()
PiperOrigin-RevId: 220324933
simple utility methods.
- clean up some of the analysis utilities used by memref dep checking
- add additional asserts / comments at places in analysis utilities
- add additional simple methods to the FlatAffineConstraints API.
PiperOrigin-RevId: 220124523
Adds equality constraints to dependence constraint system for accesses using dims/symbols where the defining operation of the dim/symbol is a constant.
PiperOrigin-RevId: 219814740
Introduce a new public static member function, MemRefType::getChecked, intended
for the users that want detailed error messages to be emitted during MemRefType
construction and can gracefully handle these errors. This function takes a
Location of the "MemRef" token if known. The parser is one user of getChecked
that has location information, it outputs errors as compiler diagnostics.
Other users may pass in an instance of UnknownLoc and still have error messages
emitted. Compiler-internal users not expecting the MemRefType construction to
fail should call MemRefType::get, which now aborts on failure with a generic
message.
Both "getChecked" and "get" call to a static free function that does actual
construction with well-formedness checks, optionally emits errors and returns
nullptr on failure.
The location information passed to getChecked has voluntarily coarse precision.
The error messages are intended for compiler engineers and do not justify
heavier API than a single location. The text of the messages can be written so
that it pinpoints the actual location of the error within a MemRef declaration.
PiperOrigin-RevId: 219765902
variables from mod's and div's when converting to flat form.
- propagate mod, floordiv, ceildiv / local variables constraint information
when flattening affine expressions and converting them into flat affine
constraints; resolve multiple TODOs.
- enables memref bound checker to work with arbitrary affine expressions
- update FlatAffineConstraints API with several new methods
- test/exercise functionality mostly through -memref-bound-check
- other analyses such as dependence tests, etc. should now be able to work in the
presence of any affine composition of add, mul, floor, ceil, mod.
PiperOrigin-RevId: 219711806
- Builds access functions and iterations domains for each access.
- Builds dependence polyhedron constraint system which has equality constraints for equated access functions and inequality constraints for iteration domain loop bounds.
- Runs elimination on the dependence polyhedron to test if no dependence exists between the accesses.
- Adds a trivial LoopFusion transformation pass with a simple test policy to test dependence between accesses to the same memref in adjacent loops.
- The LoopFusion pass will be extended in subsequent CLs.
PiperOrigin-RevId: 219630898
This CL adds support for vectorization using more interesting 2-D and 3-D
patterns. Note in particular the fact that we match some pretty complex
imperfectly nested 2-D patterns with a quite minimal change to the
implementation: we just add a bit of recursion to traverse the matched
patterns and actually vectorize the loops.
For instance, vectorizing the following loop by 128:
```
for %i3 = 0 to %0 {
%7 = affine_apply (d0) -> (d0)(%i3)
%8 = load %arg0[%c0_0, %7] : memref<?x?xf32>
}
```
Currently generates:
```
#map0 = ()[s0] -> (s0 + 127)
#map1 = (d0) -> (d0)
for %i3 = 0 to #map0()[%0] step 128 {
%9 = affine_apply #map1(%i3)
%10 = alloc() : memref<1xvector<128xf32>>
%11 = "n_d_unaligned_load"(%arg0, %c0_0, %9, %10, %c0) :
(memref<?x?xf32>, index, index, memref<1xvector<128xf32>>, index) ->
(memref<?x?xf32>, index, index, memref<1xvector<128xf32>>, index)
%12 = load %10[%c0] : memref<1xvector<128xf32>>
}
```
The above is subject to evolution.
PiperOrigin-RevId: 219629745
FuncBuilder is useful to build a operation to replace an existing operation, so change the constructor to allow constructing it with an existing operation. Change FuncBuilder to contain (effectively) a tagged union of CFGFuncBuilder and MLFuncBuilder (as these should be cheap to copy and avoid allocating/deletion when created via a operation).
PiperOrigin-RevId: 219532952
Introduce analysis to check memref accesses (in MLFunctions) for out of bound
ones. It works as follows:
$ mlir-opt -memref-bound-check test/Transforms/memref-bound-check.mlir
/tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:10:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#2
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:10:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#2
%x = load %A[%idxtensorflow/mlir#0, %idxtensorflow/mlir#1] : memref<9 x 9 x i32>
^
/tmp/single.mlir:12:12: error: 'load' op memref out of upper bound access along dimension tensorflow/mlir#1
%y = load %B[%idy] : memref<128 x i32>
^
/tmp/single.mlir:12:12: error: 'load' op memref out of lower bound access along dimension tensorflow/mlir#1
%y = load %B[%idy] : memref<128 x i32>
^
#map0 = (d0, d1) -> (d0, d1)
#map1 = (d0, d1) -> (d0 * 128 - d1)
mlfunc @test() {
%0 = alloc() : memref<9x9xi32>
%1 = alloc() : memref<128xi32>
for %i0 = -1 to 9 {
for %i1 = -1 to 9 {
%2 = affine_apply #map0(%i0, %i1)
%3 = load %0[%2tensorflow/mlir#0, %2tensorflow/mlir#1] : memref<9x9xi32>
%4 = affine_apply #map1(%i0, %i1)
%5 = load %1[%4] : memref<128xi32>
}
}
return
}
- Improves productivity while manually / semi-automatically developing MLIR for
testing / prototyping; also provides an indirect way to catch errors in
transformations.
- This pass is an easy way to test the underlying affine analysis
machinery including low level routines.
Some code (in getMemoryRegion()) borrowed from @andydavis cl/218263256.
While on this:
- create mlir/Analysis/Passes.h; move Pass.h up from mlir/Transforms/ to mlir/
- fix a bug in AffineAnalysis.cpp::toAffineExpr
TODO: extend to non-constant loop bounds (straightforward). Will transparently
work for all accesses once floordiv, mod, ceildiv are supported in the
AffineMap -> FlatAffineConstraints conversion.
PiperOrigin-RevId: 219397961
This is done by changing Type to be a POD interface around an underlying pointer storage and adding in-class support for isa/dyn_cast/cast.
PiperOrigin-RevId: 219372163
Separate the storage and return type more explicitly. This is in preparation for, among others, allowing supporting enum attributes where the return type is a enum class. NFC.
PiperOrigin-RevId: 219368487
- add methods addConstantLowerBound, addConstantUpperBound, setIdToConstant,
addDimsForMap
- update coefficient storage to use numReservedCols * rows instead of numCols *
rows (makes the code simpler/natural; reduces movement of data when new
columns are added, eliminates movement of data when columns are added to the
end).
(addDimsForMap is tested in the child CL on memref bound checking: cl/219000460)
PiperOrigin-RevId: 219358376
- We already have Pattern::match(). Using mlir::match() in Pattern::match()
confuses the compiler.
- Created m_Op() to avoid using detailed matcher implementation in
StandardOps.cpp.
PiperOrigin-RevId: 219328823
This CL is a first in a series that implements early vectorization of
increasingly complex patterns. In particular, early vectorization will support
arbitrary loop nesting patterns (both perfectly and imperfectly nested), at
arbitrary depths in the loop tree.
This first CL builds the minimal support for applying 1-D patterns.
It relies on an unaligned load/store op abstraction that can be inplemented
differently on different HW.
Future CLs will support higher dimensional patterns, but 1-D patterns already
exhibit interesting properties.
In particular, we want to separate pattern matching (i.e. legality both
structural and dependency analysis based), from profitability analysis, from
application of the transformation.
As a consequence patterns may intersect and we need to verify that a pattern
can still apply by the time we get to applying it.
A non-greedy analysis on profitability that takes into account pattern
intersection is left for future work.
Additionally the CL makes the following cleanups:
1. the matches method now returns a value, not a reference;
2. added comments about the MLFunctionMatcher and MLFunctionMatches usage by
value;
3. added size and empty methods to matches;
4. added a negative vectorization test with a conditional, this exhibited a
but in the iterators. Iterators now return nullptr if the underlying storage
is nullpt.
PiperOrigin-RevId: 219299489
- There are several places where we are casting the type of the memref obtained
from the load/store op to a memref type, and this will become even more
common (some upcoming CLs this week). Add a getMemRefType and use it at
several places where the cast was being used.
PiperOrigin-RevId: 219164326
- Added a mechanism for specifying pattern matching more concisely like LLVM.
- Added support for canonicalization of addi/muli over vector/tensor splat
- Added ValueType to Attribute class hierarchy
- Allowed creating constant splat
PiperOrigin-RevId: 219149621
Mostly a mechanical change to make it easier to try reuse the same core definitions. Added additional string members summary/description, that mirrors OpDef's documentation (thinking about document generation :))
PiperOrigin-RevId: 219044183
Statement, which paves the way to make SSAValue's have a useful owner
available, which will allow subsequent patches to improve their use/def
chains.
While I'm poking at this, shrink sizeof(Instruction) and sizeof(Statement) by a
word by packing the kind and location together into a single PointerIntPair.
NFC.
PiperOrigin-RevId: 218959651
distinction. FunctionPasses can now choose to get called on all functions, or
have the driver split CFG/ML Functions up for them. NFC.
PiperOrigin-RevId: 218775885
make operations provide a list of canonicalizations that can be applied to
them. This allows canonicalization to be general to any IR definition.
As part of this, sink PatternMatch.h/cpp down to the IR library to fix a
layering problem.
PiperOrigin-RevId: 218773981
This is done by changing Attribute to be a POD interface around an underlying pointer storage and adding in-class support for isa/dyn_cast/cast.
PiperOrigin-RevId: 218764173
helper function, in preparation for it being used by other passes.
There is still a lot of room for improvement in its design, this patch is
intended as an NFC refactoring, and the improvements will continue after this
lands.
PiperOrigin-RevId: 218737116
- Introduce Fourier-Motzkin variable elimination to eliminate a dimension from
a system of linear equalities/inequalities. Update isEmpty to use this.
Since FM is only exact on rational/real spaces, an emptiness check based on
this is guaranteed to be exact whenever it says the underlying set is empty;
if it says, it's not empty, there may still be no integer points in it.
Also, supports a version that computes "dark shadows".
- Test this by checking for "always false" conditionals in if statements.
- Unique IntegerSet's that are small (few constraints, few variables). This
basically means the canonical empty set and other small sets that are
likely commonly used get uniqued; allows checking for the canonical empty set
by pointer. IntegerSet::kUniquingThreshold gives the threshold constraint size
for uniqui'ing.
- rename simplify-affine-expr -> simplify-affine-structures
Other cleanup
- IntegerSet::numConstraints, AffineMap::numResults are no longer needed;
remove them.
- add copy assignment operators for AffineMap, IntegerSet.
- rename Invalid() -> Null() on AffineExpr, AffineMap, IntegerSet
- Misc cleanup for FlatAffineConstraints API
PiperOrigin-RevId: 218690456
- Adds FlatAffineConstraints::isEmpty method to test if there are no solutions to the system.
- Adds GCD test check if equality constraints have no solution.
- Adds unit test cases.
PiperOrigin-RevId: 218546319
"shape_cast" only applies to tensors, and there are other operations that
actually affect shape, for example "reshape". Rename "shape_cast" to
"tensor_cast" in both the code and the documentation.
PiperOrigin-RevId: 218528122
is a straight-forward change, but required adding missing moveBefore() methods
on operations (requiring moving some traits around to make C++ happy). This
also fixes a constness issue with the getBlock/getFunction() methods on
Instruction, and adds a missing getFunction() method on MLFuncBuilder.
PiperOrigin-RevId: 218523905
For some of the constant vector / tesor, if the compiler doesn't need to
interpret their elements content, they can be stored in this class to save the
serialize / deserialize cost.
syntax:
`opaque<` tensor-type `,` opaque-string `>`
opaque-string ::= `0x` [0-9a-fA-F]*
PiperOrigin-RevId: 218399426
- Add a few canonicalization patterns to fold memref_cast into
load/store/dealloc.
- Canonicalize alloc(constant) into an alloc with a constant shape followed by
a cast.
- Add a new PatternRewriter::updatedRootInPlace API to make this more convenient.
SimplifyAllocConst and the testcase is heavily based on Uday's implementation work, just
in a different framework.
PiperOrigin-RevId: 218361237
- Change AllocOp to have a getType() that always returns a MemRefType, since
that is what it requires.
- Rename StandardOps/StandardOpRegistration.cpp ->
StandardOps/OpRegistration.cpp to align with other op sets.
- Add AffineMap::getContext() helper and use it in the asmprinter.
PiperOrigin-RevId: 218205527
a step forward because now every AbstractOperation knows which Dialect it is
associated with, enabling things in the future like "constant folding
hooks" which will be important for layering. This is also a bit nicer on
the registration side of things.
PiperOrigin-RevId: 218104230
PatternMatcher clients up to date and provide a funnel point for newly added
operations. This is also progress towards the canonicalizer supporting
CFGFunctions.
This paves the way for more complex patterns, but by itself doesn't do much
useful, so no testcase.
PiperOrigin-RevId: 218101737
We should be able to represent arbitrary precision Float-point values inside
the IR, so compiler optimizations, such as constant folding can be done
independently on the compiling platform.
This CL also added a new field, AttrValueGetter, to the Attr class definition
for TableGen. This field is used to customize which mlir::Attr getter method to
get the defined PrimitiveType.
PiperOrigin-RevId: 218034983
Also rename Operation::is to Operation::isa
Introduce Operation::cast
All of these are for consistency with global dyn_cast/cast/isa operators.
PiperOrigin-RevId: 217878786
The SparseElementsAttr uses (COO) Coordinate List encoding to represents a
sparse tensor / vector. Specifically, the coordinates and values are stored as
two dense elements attributes. The first dense elements attribute is a 2-D
attribute with shape [N, ndims], which contains the indices of the elements
with nonzero values in the constant vector/tensor. The second elements
attribute is a 1-D attribute list with shape [N], which supplies the values for
each element in the first elements attribute. ndims is the rank of the
vector/tensor and N is the total nonzero elements.
The syntax is:
`sparse<` (tensor-type | vector-type)`, ` indices-attribute-list, values-attribute-list `>`
Example: a sparse tensor
sparse<vector<3x4xi32>, [[0, 0], [1, 2]], [1, 2]> represents the dense tensor
[[1, 0, 0, 0]
[0, 0, 2, 0]
[0, 0, 0, 0]]
PiperOrigin-RevId: 217764319
The syntax of dense vecor/tensor attribute value is
`dense<` (tensor-type | vector-type)`,` attribute-list`>`
and
attribute-list ::= `[` attribute-list (`, ` attribute-list)* `]`.
The construction of the dense vector/tensor attribute takes a vector/tensor
type and a character array as arguments. The size of the input array should be
larger than the size specified by the type argument. It also assumes the
elements of the vector or tensor have been trunked to the data type sizes in
the input character array, so it extends the trunked data to 64 bits when it is
retrieved.
PiperOrigin-RevId: 217762811
multiple TODOs.
- replace the fake test pass (that worked on just the first loop in the
MLFunction) to perform DMA pipelining on all suitable loops.
- nested DMAs work now (DMAs in an outer loop, more DMAs in nested inner loops)
- fix bugs / assumptions: correctly copy memory space and elemental type of source
memref for double buffering.
- correctly identify matching start/finish statements, handle multiple DMAs per
loop.
- introduce dominates/properlyDominates utitilies for MLFunction statements.
- move checkDominancePreservationOnShifts to LoopAnalysis.h; rename it
getShiftValidity
- refactor getContainingStmtPos -> findAncestorStmtInBlock - move into
Analysis/Utils.h; has two users.
- other improvements / cleanup for related API/utilities
- add size argument to dma_wait - for nested DMAs or in general, it makes it
easy to obtain the size to use when lowering the dma_wait since we wouldn't
want to identify the matching dma_start, and more importantly, in general/in the
future, there may not always be a dma_start dominating the dma_wait.
- add debug information in the pass
PiperOrigin-RevId: 217734892
This CL implements a very simple loop vectorization **test** and the basic
infrastructure to support it.
The test simply consists in:
1. matching the loops in the MLFunction and all the Load/Store operations
nested under the loop;
2. testing whether all the Load/Store are contiguous along the innermost
memory dimension along that particular loop. If any reference is
non-contiguous (i.e. the ForStmt SSAValue appears in the expression), then
the loop is not-vectorizable.
The simple test above can gradually be extended with more interesting
behaviors to account for the fact that a layout permutation may exist that
enables contiguity etc. All these will come in due time but it is worthwhile
noting that the test already supports detection of outer-vetorizable loops.
In implementing this test, I also added a recursive MLFunctionMatcher and some
sugar that can capture patterns
such as `auto gemmLike = Doall(Doall(Red(LoadStore())))` and allows iterating
on the matched IR structures. For now it just uses in order traversal but
post-order DFS will be useful in the future once IR rewrites start occuring.
One may note that the memory management design decision follows a different
pattern from MLIR. After evaluating different designs and how they quickly
increase cognitive overhead, I decided to opt for the simplest solution in my
view: a class-wide (threadsafe) RAII context.
This way, a pass that needs MLFunctionMatcher can just have its own locally
scoped BumpPtrAllocator and everything is cleaned up when the pass is destroyed.
If passes are expected to have a longer lifetime, then the contexts can easily
be scoped inside the runOnMLFunction call and storage lifetime reduced.
Lastly, whatever the scope of threading (module, function, pass), this is
expected to also be future-proof wrt concurrency (but this is a detail atm).
PiperOrigin-RevId: 217622889
Updates ComposeAffineMaps test pass to use this method.
Updates affine map composition test cases to handle the new pass, which can be reused when this method is used in a future instruction combine pass.
PiperOrigin-RevId: 217163351
- Make it so OpPointer implicitly converts to SSAValue* when the underlying op
has a single value. This eliminates a lot more ->getResult() calls and makes
the behavior more LLVM-like
- Fill out PatternBenefit to be typed instead of just a typedef for int with
magic numbers.
- Simplify various code due to these changes.
PiperOrigin-RevId: 217020717
- add util to create a private / exclusive / single use affine
computation slice for an op stmt (see method doc comment); a single
multi-result affine_apply op is prepended to the op stmt to provide all
results needed for its operands as a function of loop iterators and symbols.
- use it for DMA pipelining (to create private slices for DMA start stmt's);
resolve TODOs/feature request (b/117159533)
- move createComposedAffineApplyOp to Transforms/Utils; free it from taking a
memref as input / generalize it.
PiperOrigin-RevId: 216926818
out canonicalization pass to drive it, and a simple (x-x) === 0 pattern match
as a test case.
There is a tremendous number of improvements that need to land, and the
matcher/rewriter and patterns will be split out of this file, but this is a
starting point.
PiperOrigin-RevId: 216788604
* Move Return, Constant and AffineApply out into BuiltinOps;
* BuiltinOps are always registered, while StandardOps follow the same dynamic registration;
* Kept isValidX in MLValue as we don't have a verify on AffineMap so need to keep it callable from Parser (I wanted to move it to be called in verify instead);
PiperOrigin-RevId: 216592527
We allow the name of an operation to be different from the name of the
'ConcreteType' op it was instantiated with. This can happen when you sub-class
an existing op and provide a getOperationName for it. Such a situation leads to
an assertion too deep and at a place seeminly unrelated, and typically when the
module is printed with the trace:
printOperation, printAssembly, Op::print, getOperand, dyn_cast<OperationStmt>,
isa. 'isa' will complain about being called on a null pointer, and the null
pointer actually comes from the getAs<> in printAssembly. This should have been
caught in printAssembly.
On another note, it is also weird that we allow setting the op's name to
something independent of the ConcreteType that op was instantiated with - so,
getAs<ConcreteType> will fail since ConcreteType::isClassFor won't succeed on
it.
PiperOrigin-RevId: 216580294
This CL applies the same pattern as AffineMap to IntegerSet: a simple struct
that acts as the storage is allocated in the bump pointer. The IntegerSet is
immutable and accessed everywhere by value.
Note that unlike AffineMap, it is not possible to remove the MLIRContext
parameter when constructing an IntegerSet for now. One possible way to achieve
this would be to add an enum to distinguish between the mathematically empty
set, the universe set and other sets.
This is left for future discussion.
PiperOrigin-RevId: 216545361
This attribute represents a reference to a splat vector or tensor, where all
the elements have the same value. The syntax of the attribute is:
`splat<` (tensor-type | vector-type)`,` attribute-value `>`
PiperOrigin-RevId: 216537997
AbstractOperation* or an Identifier. This makes it possible to get to stuff in
AbstractOperation faster than going through a hash table lookup. This makes
constant folding a bit faster now, but will become more important with
subsequent changes.
PiperOrigin-RevId: 216476772
This CL applies the same pattern as AffineExpr to AffineMap: a simple struct
that acts as the storage is allocated in the bump pointer. The AffineMap is
immutable and accessed everywhere by value.
PiperOrigin-RevId: 216445930
Add target independent standard DMA ops: dma.start, dma.wait. Update pipeline
data transfer to use these to detect DMA ops.
While on this
- return failure from mlir-opt::performActions if a pass generates invalid output
- improve error message for verify 'n' operand traits
PiperOrigin-RevId: 216429885
This CL sketches what it takes for AffineExpr to fully have by-value semantics
and not be a not-so-smart pointer anymore.
This essentially makes the underyling class a simple storage struct and
implements the operations on the value type directly. Since there is no
forwarding of operations anymore, we can full isolate the storage class and
make a hard visibility barrier by moving detail::AffineExpr into
AffineExprDetail.h.
AffineExprDetail.h is only included where storage-related information is
needed.
PiperOrigin-RevId: 216385459
This CL:
1. performs the global codemod AffineXExpr->AffineXExprClass and
AffineXExprRef -> AffineXExpr;
2. simplifies function calls by removing the redundant MLIRContext parameter;
3. adds missing binary operator versions of scalar op AffineExpr where it
makes sense.
PiperOrigin-RevId: 216242674
This CL introduces a series of cleanups for AffineExpr value types:
1. to make it clear that the value types should be used, the pointer
AffineExpr types are put in the detail namespace. Unfortunately, since the
value type operator-> only forwards to the underlying pointer type, we
still
need to expose this in the include file for now;
2. AffineExprKind is ok to use, it thus comes out of detail and thus of
AffineExpr
3. getAffineDimExpr, getAffineSymbolExpr, getAffineConstantExpr are
similarly
extracted as free functions and their naming is mande consistent across
Builder, MLContext and AffineExpr
4. AffineBinaryOpEx::simplify functions are made into static free
functions.
In particular it is moved away from AffineMap.cpp where it does not belong
5. operator AffineExprType is made explicit
6. uses the binary operators everywhere possible
7. drops the pointer usage everywhere outside of AffineExpr.cpp,
MLIRContext.cpp and AsmPrinter.cpp
PiperOrigin-RevId: 216207212
This CL makes AffineExprRef into a value type.
Notably:
1. drops llvm isa, cast, dyn_cast on pointer type and uses member functions on
the value type. It may be possible to still use classof (in a followup CL)
2. AffineBaseExprRef aggressively casts constness away: if we mean the type is
immutable then let's jump in with both feet;
3. Drop implicit casts to the underlying pointer type because that always
results in surprising behavior and is not needed in practice once enough
cleanup has been applied.
The remaining negative I see is that we still need to mix operator. and
operator->. There is an ugly solution that forwards the methods but that ends
up duplicating the class hierarchy which I tried to avoid as much as
possible. But maybe it's not that bad anymore since AffineExpr.h would still
contain a single class hierarchy (the duplication would be impl detail in.cpp)
PiperOrigin-RevId: 216188003
1) affineint (as it is named) is not a type suitable for general computation (e.g. the multiply/adds in an integer matmul). It has undefined width and is undefined on overflow. They are used as the indices for forstmt because they are intended to be used as indexes inside the loop.
2) It can be used in both cfg and ml functions, and in cfg functions. As you mention, “symbols” are not affine, and we use affineint values for symbols.
3) Integers aren’t affine, the algorithms applied to them can be. :)
4) The only suitable use for affineint in MLIR is for indexes and dimension sizes (i.e. the bounds of those indexes).
PiperOrigin-RevId: 216057974
- Fold the lower/upper bound of a loop to a constant whenever the result of the
application of the bound's affine map on the operand list yields a constant.
- Update/complete 'for' stmt's API to set lower/upper bounds with operands.
Resolve TODOs for ForStmt::set{Lower,Upper}Bound.
- Moved AffineExprConstantFolder into AffineMap.cpp and added
AffineMap::constantFold to be used by both AffineApplyOp and
ForStmt::constantFoldBound.
PiperOrigin-RevId: 215997346
with a new one (of a potentially different rank/shape) with an optional index
remapping.
- introduce Utils::replaceAllMemRefUsesWith
- use this for DMA double buffering
(This CL also adds a few temporary utilities / code that will be done away with
once:
1) abstract DMA op's are added
2) memref deferencing side-effect / trait is available on op's
3) b/117159533 is resolved (memref index computation slices).
PiperOrigin-RevId: 215831373
This CL starts by replacing AffineExpr* with value-type AffineExprRef in a few
places in the IR. By a domino effect that is pretty telling of the
inconsistencies in the codebase, const is removed where it makes sense.
The rationale is that the decision was concisously made that unique'd types
have pointer semantics without const specifier. This is fine but we should be
consistent. In the end, the only logical invariant is that there should never
be such a thing as a const AffineExpr*, const AffineMap* or const IntegerSet*
in our codebase.
This CL takes a number of shortcuts to killing const with fire, in particular
forcing const AffineExprRef to return the underlying non-const
AffineExpr*. This will be removed once AffineExpr* has disappeared in
containers but for now such shortcuts allow a bit of sanity in this long quest
for cleanups.
The **only** places where const AffineExpr*, const AffineMap* or const
IntegerSet* may still appear is by transitive needs from containers,
comparison operators etc.
There is still one major thing remaining here: figure out why cast/dyn_cast
return me a const AffineXXX*, which in turn requires a bunch of ugly
const_casts. I suspect this is due to the classof
taking const AffineXXXExpr*. I wonder whether this is a side effect of 1., if
it is coming from llvm itself (I'd doubt it) or something else (clattner@?)
In light of this, the whole discussion about const makes total sense to me now
and I would systematically apply the rule that in the end, we should never
have any const XXX in our codebase for unique'd types (assuming we can remove
them all in containers and no additional constness constraint is added on us
from the outside world).
PiperOrigin-RevId: 215811554
This CL implements AffineExprBaseRef as a templated type to allow LLVM-style
casts to work properly. This also allows making AffineExprBaseRef::expr
private.
To achieve this, it is necessary to use llvm::simplify_type and make
AffineConstExpr derive from both AffineExpr and llvm::simplify<AffineExprRef>.
Note that llvm::simplify_type is just an interface to enable the proper
template resolution of isa/cast/dyn_cast but it otherwise holds no value.
Lastly note that certain dyn_cast operations wanted the const AffineExpr* form
of AffineExprBaseRef so I made the implicit constructor take that by default
and documented the immutable behavior. I think this is consistent with the
decision to make unique'd type immutable by convention and never use const on
them.
PiperOrigin-RevId: 215642247
This CL uniformizes the uses of AffineExprWrap outside of IR.
The public API of AffineExpr builder is modified to only use AffineExprWrap.
A few places access AffineExprWrap.expr, this is only while the API is in
transition to easily keep track (i.e. make expr private and let the compiler
track the errors).
Parser.cpp exhibits patterns that are dependent on nullptr values so
converting it is left for another CL.
PiperOrigin-RevId: 215642005
This CL proposes adding MLIRContext* to AffineExpr as discussed previously.
This allows the value class to not require the context in its constructor and
makes it a POD that it makes sense to pass by value everywhere.
A list of other RFC CLs will build on this. The RFC CLs are small incremental
pushes of the API which would be a pretty big change otherwise.
Pushing the thinking a little bit more it seems reasonable to use implicit
cast/constructor to/from AffineExpr*.
As this thing evolves, it looks to me like IR (and
probably Parser, for not so good reasons) want to operate on AffineExpr* and
the rest of the code wants to operate on the value type.
For this reason I think AffineExprImpl*/AffineExpr may also make sense but I
do not have a particular naming preference.
The jury is still out for naming decision between the above and
AffineExprBase*/AffineExpr or AffineExpr*/AffineExprRef.
PiperOrigin-RevId: 215641596
This CL argues that the builder API for AffineExpr should be used
with a lightweight wrapper that supports operators chaining.
This CL takes the ill-named AffineExprWrap and proposes a simple
set of operators with builtin constant simplifications.
This allows:
1. removing the getAddMulPureAffineExpr function;
2. avoiding concerns about constant vs non-constant simplifications
at **every call site**;
3. writing the mathematical expressions we want to write without unnecessary
obfuscations.
The points above represent pure technical debt that we don't want to carry on.
It is important to realize that this is not a mere convenience or "just sugar"
but reduction in cognitive overhead.
This thinking can be pushed significantly further, I have added some comments
with some basic ideas but we could make AffineMap, AffineApply and other
objects that use map applications more functional and value-based.
I am putting this out to get a first batch of reviews and see what people
think.
I think in my preferred design I would have the Builder directly return such
AffineExprPtr objects by value everywhere and avoid the boilerplate explicit
creations that I am doing by hand at this point.
Yes this AffineExprPtr would implicitly convert to AffineExpr* because that is
what it is.
PiperOrigin-RevId: 215641317
- introduce mlir::{floorDiv, ceilDiv, mod} for constant inputs in
mlir/Support/MathExtras.h
- consistently use these everywhere in IR, Analysis, and Transforms.
PiperOrigin-RevId: 215580677
ResultIsFloatLike/ResultIsIntegerLike, move some code out of templates into
shared code, keep the ops in StandardOps.cpp/h sorted.
This significantly reduces the boilerplate for Add/Mul sorts of ops. In a subsequent patch, I plan to rename OpBase to Op, but didn't want to clutter this diff.
PiperOrigin-RevId: 214622871
This CL adds support for `mulf` which is necessary to write/emit a simple scalar
matmul in MLIR. This CL does not consider automation of generation of ops but
mulf is important and useful enough to be added on its own atm.
PiperOrigin-RevId: 214496098
Instead of linking in different initializeMLIRContext functions, add a registry mechanism and function to initialize all registered ops in a given MLIRContext. Initialize all registered ops along with the StandardOps when constructing a MLIRContext.
PiperOrigin-RevId: 214073842
consolidate the implementations in CFGFunctionViewGraph.cpp into it, and
implement the missing const specializations for functions. NFC.
PiperOrigin-RevId: 214048649
verifier. We get most of this infrastructure directly from LLVM, we just
need to adapt it to our CFG abstraction.
This has a few unrelated changes engangled in it:
- getFunction() in various classes was const incorrect, fix it.
- This moves Verifier.cpp to the analysis library, since Verifier depends on
dominance and these are both really analyses.
- IndexedAccessorIterator::reference was defined wrong, leading to really
exciting template errors that were fun to diagnose.
- This flips the boolean sense of the foldOperation() function in constant
folding pass in response to previous patch feedback.
PiperOrigin-RevId: 214046593
Alternatively, we can defined a TFComplexType with a width parameter in the
mlir, then both types can be converted to the same mlir type with different width (like IntegerType).
We chose to use a direct mapping because there are only two TF Complex types.
PiperOrigin-RevId: 213856651
optimization pass:
- Give the ability for operations to implement a constantFold hook (a simple
one for single-result ops as well as general support for multi-result ops).
- Implement folding support for constant and addf.
- Implement support in AbstractOperation and Operation to make this usable by
clients.
- Implement a very simple constant folding pass that does top down folding on
CFG and ML functions, with a testcase that exercises all the above stuff.
Random cleanups:
- Improve the build APIs for ConstantOp.
- Stop passing "-o -" to mlir-opt in the testsuite, since that is the default.
PiperOrigin-RevId: 213749809
- extend loop unroll-jam similar to loop unroll for affine bounds
- extend both loop unroll/unroll-jam to deal with cleanup loop for non multiple
of unroll factor.
- extend promotion of single iteration loops to work with affine bounds
- fix typo bugs in loop unroll
- refactor common code b/w loop unroll and loop unroll-jam
- move prototypes of non-pass transforms to LoopUtils.h
- add additional builder methods.
- introduce loopUnrollUpTo(factor) to unroll by either factor or trip count,
whichever is less.
- remove Statement::isInnermost (not used for now - will come back at the right
place/in right form later)
PiperOrigin-RevId: 213471227
- add builder method for ReturnOp
- expose API from Transforms/ to work on specific ML statements (do this for
LoopUnroll, LoopUnrollAndJam)
- add MLFuncBuilder::getForStmtBodyBuilder, ::getBlock
PiperOrigin-RevId: 213074178
Use these methods to simplify existing code. Rename getConstantMap
getConstantAffineMap. Move declarations to group similar ones together.
PiperOrigin-RevId: 212814829
unroll/unroll-and-jam more powerful; add additional affine expr builder methods
- use previously added analysis/simplification to infer multiple of unroll
factor trip counts, making loop unroll/unroll-and-jam more general.
- for loop unroll, support bounds that are single result affine map's with the
same set of operands. For unknown loop bounds, loop unroll will now work as
long as trip count can be determined to be a multiple of unroll factor.
- extend getConstantTripCount to deal with single result affine map's with the
same operands. move it to mlir/Analysis/LoopAnalysis.cpp
- add additional builder utility methods for affine expr arithmetic
(difference, mod/floordiv/ceildiv w.r.t postitive constant). simplify code to
use the utility methods.
- move affine analysis routines to AffineAnalysis.cpp/.h from
AffineStructures.cpp/.h.
- Rename LoopUnrollJam to LoopUnrollAndJam to match class name.
- add an additional simplification for simplifyFloorDiv, simplifyCeilDiv
- Rename AffineMap::getNumOperands() getNumInputs: an affine map by itself does
not have operands. Operands are passed to it through affine_apply, from loop
bounds/if condition's, etc., operands are stored in the latter.
This should be sufficiently powerful for now as far as unroll/unroll-and-jam go for TPU
code generation, and can move to other analyses/transformations.
Loop nests like these are now unrolled without any cleanup loop being generated.
for %i = 1 to 100 {
// unroll factor 4: no cleanup loop will be generated.
for %j = (d0) -> (d0) (%i) to (d0) -> (5*d0 + 3) (%i) {
%x = "foo"(%j) : (affineint) -> i32
}
}
for %i = 1 to 100 {
// unroll factor 4: no cleanup loop will be generated.
for %j = (d0) -> (d0) (%i) to (d0) -> (d0 - d mod 4 - 1) (%i) {
%y = "foo"(%j) : (affineint) -> i32
}
}
for %i = 1 to 100 {
for %j = (d0) -> (d0) (%i) to (d0) -> (d0 + 128) (%i) {
%x = "foo"() : () -> i32
}
}
TODO(bondhugula): extend this to LoopUnrollAndJam as well in the next CL (with minor
changes).
PiperOrigin-RevId: 212661212
infrastructure, instead of returning a const char*. This allows custom
formatting and more interesting diagnostics.
This patch regresses the error message quality from the control flow
lowering pass, I'll address this in a subsequent patch.
PiperOrigin-RevId: 212210681
loop counts. Improve / refactor loop unroll / loop unroll and jam.
- add utility to remove single iteration loops.
- use this utility to promote single iteration loops after unroll/unroll-and-jam
- use loopUnrollByFactor for loopUnrollFull and remove most of the latter.
- add methods for getting constant loop trip count
PiperOrigin-RevId: 212039569
- Compress the identifier/kind of a Function into a single word.
- Eliminate otherFailure from verifier now that we always have a location
- Eliminate the error string from the verifier now that we always have
locations.
- Simplify the parser's handling of fn forward references, using the location
tracked by the function.
PiperOrigin-RevId: 211985101
inserting shape_casts as necessary.
Along the way:
- Add some missing accessors to the AtLeastNOperands trait.
- Implement shape_cast / ShapeCastOp standard op.
- Improve handling of errors in mlir-opt, making it easier to understand
errors when invalid IR is rejected by the verifier.
PiperOrigin-RevId: 211897877
- Make the tf-lower-control flow handle error cases better. Add a testcase
that (currently) fails due to type mismatches.
- Factor more code in the verifier for basic block argument checking, and
check more invariants.
- Fix a crasher in the asmprinter on null instructions (which only occurs on
invalid code).
- Fix a bug handling conditional branches with no block operands, it would
access &operands[0] instead of using operands.data().
- Enhance the mlir-opt driver to use the verifier() in a non-crashing mode,
allowing issues to be reported as diagnostics.
PiperOrigin-RevId: 211818291
Enable using GraphWriter to dump graphviz in debug mode (kept to debug builds completely as this is only for debugging). Add option to mlir-opt to print CFGFunction after every transform in debug mode.
PiperOrigin-RevId: 211578699
- handle floordiv/ceildiv in AffineExprFlattener; update the simplification to
work even if mod/floordiv/ceildiv expressions appearing in the tree can't be eliminated.
- refactor the flattening / analysis to move it out of lib/Transforms/
- fix MutableAffineMap::isMultipleOf
- add AffineBinaryOpExpr:getAdd/getMul/... utility methods
PiperOrigin-RevId: 211540536
- Add a new -verify mode to the mlir-opt tool that allows writing test cases
for optimization and other passes that produce diagnostics.
- Refactor existing the -check-parser-errors flag to mlir-opt into a new
-split-input-file option which is orthogonal to -verify.
- Eliminate the special error hook the parser maintained and use the standard
MLIRContext's one instead.
- Enhance the default MLIRContext error reporter to print file/line/col of
errors when it is available.
- Add new createChecked() methods to the builder that create ops and invoke
the verify hook on them, use this to detected unhandled code in the
RaiseControlFlow pass.
- Teach mlir-opt about expected-error @+, it previously only worked with @-
PiperOrigin-RevId: 211305770
Restored the order of deleting then-clause before else-clause in if-statement destructor, since it does not and should not matter.
PiperOrigin-RevId: 211273720
Note: I've tested this in a forth coming transformation pass CL that iterates through all uses, but do not have an associated unit test in this CL.
PiperOrigin-RevId: 211104365
Outside of IR/
- simplify a MutableAffineMap by flattening the affine expressions
- add a simplify affine expression pass that uses this analysis
- update the FlatAffineConstraints API (to be used in the next CL)
In IR:
- add isMultipleOf and getKnownGCD for AffineExpr, and make the in-IR
simplication of simplifyMod simpler and more powerful.
- rename the AffineExpr visitor methods to distinguish b/w visiting and
walking, and to simplify API names based on context.
The next CL will use some of these for the loop unrolling/unroll-jam to make
the detection for the need of cleanup loop powerful/non-trivial.
A future CL will finally move this simplification to FlatAffineConstraints to
make it more powerful. For eg., currently, even if a mod expr appearing in a
part of the expression tree can't be simplified, the whole thing won't be
simplified.
PiperOrigin-RevId: 211012256
- for test purposes, the unroll-jam pass unroll jams the first outermost loop.
While on this:
- fix StmtVisitor to allow overriding of function to iterate walk over children
of a stmt.
PiperOrigin-RevId: 210644813
This CL also includes two other minor changes:
- change the implemented syntax from 'if (cond)' to 'if cond', as specified by MLIR spec.
- a minor fix to the implementation of the ForStmt.
PiperOrigin-RevId: 210618122
(and more useful) way rather than hacking up a pile of attributes for it. In
the future this will grow to represent inlined locations, fusion cases etc, but
for now we start with simple Unknown and File/Line/Col locations. NFC.
PiperOrigin-RevId: 210485775
This commit creates a static constexpr limit for the IntegerType
bitwidth and uses it. The check had to be moved because Token is
not aware of IR/Type and it was a sign the abstraction leaked:
bitwidth limit is not a property of the Token but of the IntegerType.
Added a positive and a negative test at the limit.
PiperOrigin-RevId: 210388192
- introduce hyper-rectangular set representation and API sketch for
analysis/code generation
- implement the 'intersect' and 'project out' operations.
The represention is lighter weight, and operations and other queries on it are
much faster on such domains when compared to general polyhedral domains.
PiperOrigin-RevId: 210245882
This revamps implementation of the loop bounds in the ForStmt, using general representation that supports operands. The frequent case of constant bounds is supported
via special access methods.
This also includes:
- Operand iterators for the Statement class.
- OpPointer::is() method to query the class of the Operation.
- Support for the bound shorthand notation parsing and printing.
- Validity checks for the bound operands used as dim ids and symbols
I didn't mean this CL to be so large. It just happened this way, as one thing led to another.
PiperOrigin-RevId: 210204858
- Implement support for the TensorFlow 'If' op, the first TF op definition.
- Fill in some missing basic infra, including the ability to split a basic block, the ability to create a branch with operands, etc.
- Implement basic lowering for some simple forms of If, where the condition is a zero-D bool tensor and when all the types line up. Future patches will generalize this.
There is still much to be done here. I'd like to get some example graphs coming from the converter to play with to direct this work.
PiperOrigin-RevId: 210198760
parser hooks, as it has been subsumed by a simpler and cleaner mechanism.
Second, remove the "Inst" suffixes from a few methods in CFGFuncBuilder since
they are redundant and this is inconsistent with the other builders. NFC.
PiperOrigin-RevId: 210006263
operation and statement to have a location, and make it so a location is
required to be specified whenever you make one (though a null location is still
allowed). This is to encourage compiler authors to propagate loc info
properly, allowing our failability story to work well.
This is still a WIP - it isn't clear if we want to continue abusing Attribute
for location information, or whether we should introduce a new class heirarchy
to do so. This is good step along the way, and unblocks some of the tf/xla
work that builds upon it.
PiperOrigin-RevId: 210001406
new VectorOrTensorType class that provides a common interface between vector
and tensor since a number of operations will be uniform across them (including
extract_element). Improve the LoadOp verifier.
I also updated the MLIR spec doc as well.
PiperOrigin-RevId: 209953189
OperationState contain a context and have the generic builder mechanics handle
the job of initializing the OperationState and setting the op name. NFC.
PiperOrigin-RevId: 209869948
FlatAffineConstraints, and MutableAffineMap.
All four classes introduced reside in lib/Analysis and are not meant to be
used in the IR (from lib/IR or lib/Parser/). They are all mutable, alloc'ed,
dealloc'ed - although with their fields pointing to immutable affine
expressions (AffineExpr *).
While on this, update simplifyMod to fold mod to a zero when possible.
PiperOrigin-RevId: 209618437
- Have the parser rewrite forward references to their resolved values at the
end of parsing.
- Implement verifier support for detecting malformed function attrs.
- Add efficient query for (in general, recursive) attributes to tell if they
contain a function.
As part of this, improve other general infrastructure:
- Implement support for verifying OperationStmt's in ml functions, refactoring
and generalizing support for operations in the verifier.
- Refactor location handling code in mlir-opt to have the non-error expecting
form of mlir-opt invocations to report error locations precisely.
- Fix parser to detect verifier failures and report them through errorReporter
instead of printing the error and crashing.
This regresses the location info for verifier errors in the parser that were
previously ascribed to the function. This will get resolved in future patches
by adding support for function attributes, which we can use to manage location
information.
PiperOrigin-RevId: 209600980
resolver support.
Still TODO are verifier support (to make sure you don't use an attribute for a
function in another module) and the TODO in ModuleParser::finalizeModule that I
will handle in the next patch.
PiperOrigin-RevId: 209361648
print floating point in a structured form that we know can round trip,
enumerate attributes in the visitor so we print affine mapping attributes
symbolically (the majority of the testcase updates).
We still have an issue where the hexadecimal floating point syntax is reparsed
as an integer, but that can evolve in subsequent patches.
PiperOrigin-RevId: 208828876
Prior to this CL, return statement had no explicit representation in MLIR. Now, it is represented as ReturnOp standard operation and is pretty printed according to the return statement syntax. This way statement walkers can process ML function return operands without making special case for them.
PiperOrigin-RevId: 208092424
an operand mapping, which simplifies it a bit. Implement cloning for IfStmt,
rename getThenClause() to getThen() which is unambiguous and less repetitive in
use cases.
PiperOrigin-RevId: 207915990
- introduce affine integer sets into the IR
- parse and print affine integer sets (both inline or outlined) similar to
affine maps
- use integer set for IfStmt's conditional, and implement parsing of IfStmt's
conditional
- fixed an affine expr paren omission bug while one this.
TODO: parse/represent/print MLValue operands to affine integer set references.
PiperOrigin-RevId: 207779408
encapsulates an operation that is yet to be created. This is a patch towards
custom ops providing create methods that don't need to be templated, allowing
them to move out of line in the future.
PiperOrigin-RevId: 207725557
- fix/complete forStmt cloning for unrolling to work for outer loops
- create IV const's only when needed
- test outer loop unrolling by creating a short trip count unroll pass for
loops with trip counts <= <parameter>
- add unrolling test cases for multiple op results, outer loop unrolling
- fix/clean up StmtWalker class while on this
- switch unroll loop iterator values from i32 to affineint
PiperOrigin-RevId: 207645967
Unrelated minor change - remove OperationStmt::dropReferences(). Since MLFunction does not have cyclic operand references (it's an AST) destruction can be safely done w/o a special pass to drop references.
PiperOrigin-RevId: 207583024
- Implement a diagnostic hook in one of the paths in mlir-opt which
captures and reports the diagnostics nicely.
- Have the parser capture simple location information from the parser
indicating where each op came from in the source .mlir file.
- Add a verifyDominance() method to MLFuncVerifier to demo this, resolving b/112086163
- Add some PrettyStackTrace handlers to make crashes in the testsuite easier
to track down.
PiperOrigin-RevId: 207488548
- deal with non-operation stmt's (if/for stmt's) in loops being unrolled
(unrolling of non-innermost loops works).
- update uses in unrolled bodies to use results of new operations that may be
introduced in the unrolled bodies.
Unrolling now works for all kinds of loop nests - perfect nests, imperfect
nests, loops at any depth, and with any kind of operation in the body. (IfStmt
support not done, hence untested there).
Added missing dump/print method for StmtBlock.
TODO: add test case for outer loop unrolling.
PiperOrigin-RevId: 207314286
Add create function on builder to make it easier to create ops of registered types. Enables doing `builder.create<AddFOp>(lhs, rhs)` as well as having default values on the build method.
This CL does not add a default build method (i.e., create<DimOp>(...) would fail).
PiperOrigin-RevId: 207268882
MLFunctions.
- MLStmt cloning and IV replacement
- While at this, fix the innermostLoopGatherer to actually gather all the
innermost loops (it was stopping its walk at the first innermost loop it
found)
- Improve comments for MLFunction statement classes, fix inheritance order.
- Fixed StmtBlock destructor.
PiperOrigin-RevId: 207049173
generalize the asmprinters handling of pretty names to allow arbitrary sugar to
be dumped on various constructs. Give CFG function arguments nice "arg0" names
like MLFunctions get, and give constant integers pretty names like %c37 for a
constant 377
PiperOrigin-RevId: 206953080
handlers and to feed them with errors and warnings produced by the compiler.
Enhance Operation to be able to get its own MLIRContext on demand, simplifying
some clients. Change the verifier to emit certain errors with the diagnostic
handler.
This is steps towards reworking the verifier and diagnostic propagation but is
itself not particularly useful. More to come.
PiperOrigin-RevId: 206948643
Fix b/112039912 - we were recording 'i' instead of '%i' for loop induction variables causing "use of undefined SSA value" error.
PiperOrigin-RevId: 206884644
Two problems: 1) we didn't visit the types in ops correctly, and 2) the
general "T" version of the OpAsmPrinter inserter would match things like
MemRefType& and print it directly.
PiperOrigin-RevId: 206863642
createOperation needs to insert the operation at 'insertPoint', which can be
anywhere (not necessarily at the end of 'statements').
PiperOrigin-RevId: 206827159
Induction variables are implemented by inheriting ForStmt from MLValue. ForStmt provides APIs that make this design decision invisible to the ForStmt users.
This CL in combination with cl/206253643 resolves http://b/111769060.
PiperOrigin-RevId: 206655937
* TensorFlow Control Flow supports variadic number of control inputs, add variant of NOperands to support at least N operands;
* Update example:
- All ops will produce control output;
- Use tf$name for mapping back to TensorFlow (a op can have a name as well as an attribute _name, using tf$name disambiguates that);
PiperOrigin-RevId: 206621280
- Generalize TwoOperands and TwoResults to NOperands and NResults, which can
be used for any fixed N.
- Rename OpImpl namespace to OpTrait, OpImpl::Base to OpBase, and TraitImpl to
TraitBase to better reflect what these are.
PiperOrigin-RevId: 206588634
and control edges. Wire the TensorFlow tests into the harness properly. Fix a bug in the CFGBuilder (not inserting at the insertion point) and flesh it out a bit more.
PiperOrigin-RevId: 206511270
- Sketch out a TensorFlow/IR directory that will hold op definitions and common TF support logic. We will eventually have TensorFlow/TF2HLO, TensorFlow/Grappler, TensorFlow/TFLite, etc.
- Add sketches of a Switch/Merge op definition, including some missing stuff like the TwoResults trait. Add a skeleton of a pass to raise this form.
- Beef up the Pass/FunctionPass definitions slightly, moving the common code out of LoopUnroll.cpp into a new IR/Pass.cpp file.
- Switch ConvertToCFG.cpp to be a ModulePass.
- Allow _ to start bare identifiers, since this is important for TF attributes.
PiperOrigin-RevId: 206502517
and OtherType. Other type is now the thing that holds AffineInt, Control,
eventually Resource, Variant, String, etc. FloatType holds the floating point
types, and allows convenient query of isa<FloatType>().
This fixes issues where we allowed control to be the element type of tensor,
memref, vector. At the same time, ban AffineInt from being an element of a
vector/memref/tensor as well since we don't need it.
I updated the spec to match this as well.
PiperOrigin-RevId: 206361942
* Add tf_control as primitive type;
* Allow $ in bare-id to allow attributes with $ (to make it trivially to mangle a TF attribute);
PiperOrigin-RevId: 206342642
- Update InnermostLoopGatherer to use a post order traversal (linear
time/single traversal).
- Drop getNumNestedLoops().
- Update isInnermost() to use the StmtWalker.
When using return values in conjunction with walkers, the StmtWalker CRTP
pattern doesn't appear to be of any use. It just requires overriding nearly all
of the methods, which is what InnermostLoopGatherer currently does. Please see
FIXME/ENLIGHTENME comments. TODO: figure this out from this CL discussion.
Note
- Comments on visitor/walker base class are out of date; will update when this
CL is finalized.
PiperOrigin-RevId: 206340901
BasicBlockOperand to be more consistent with InstOperand. Rename
getDestinations() to getBasicBlockOperands() to reduce confusion on their role.
PiperOrigin-RevId: 206192327
I tried to do the same with OperationInst; unfortunately that leads to some spicy ambiguities (should getOperand return CFGValue or SSAValue?) due to multiple inheritance from Operation which also has operand accessors.
PiperOrigin-RevId: 206094072
The Instruction subclasses just need getInstOperands() and getNumOperands(). Operand iterators and the other accessors are provided by Instruction for free; why would we not just use those?
PiperOrigin-RevId: 206075000
pointer, and ensure that functions are deleted when the module is destroyed.
This exposed the fact that MLFunction had no dtor, and that the dtor in
CFGFunction was broken with cyclic references. Fix both of these problems.
PiperOrigin-RevId: 206051666
This regresses parser error recovery in some cases (in invalid.mlir) which I'll
consider in a follow-up patch. The important thing in this patch is that the
parse methods in StandardOps.cpp are nice and simple.
PiperOrigin-RevId: 206023308
- Implement a full loop unroll for innermost loops.
- Use it to implement a pass that unroll all the innermost loops of all
mlfunction's in a module. ForStmt's parsed currently have constant trip
counts (and constant loop bounds).
- Implement StmtVisitor based (Visitor pattern)
Loop IVs aren't currently parsed and represented as SSA values. Replacing uses
of loop IVs in unrolled bodies is thus a TODO. Class comments are sparse at some places - will add them after one round of comments.
A cmd-line flag triggers this for now.
Original:
mlfunc @loops() {
for x = 1 to 100 step 2 {
for x = 1 to 4 {
"Const"(){value: 1} : () -> ()
}
}
return
}
After unrolling:
mlfunc @loops() {
for x = 1 to 100 step 2 {
"Const"(){value: 1} : () -> ()
"Const"(){value: 1} : () -> ()
"Const"(){value: 1} : () -> ()
"Const"(){value: 1} : () -> ()
}
return
}
PiperOrigin-RevId: 205933235
This looks heavyweight but most of the code is in the massive number of operand accessors!
We need to be able to iterate over all operands to the condbr (all live-outs) but also just
the true/just the false operands too.
PiperOrigin-RevId: 205897704
- Op classes can now provide customized matchers, allowing specializations
beyond just a name match.
- We now provide default implementations of verify/print hooks, so Op classes
only need to implement them if they're doing custom stuff, and only have to
implement the ones they're interested in.
- "Base" now takes a variadic list of template template arguments, allowing
concrete Op types to avoid passing the Concrete type multiple times.
- Add new ZeroOperands trait.
- Add verification hooks to Zero/One/Two operands and OneResult to check that
ops using them are correctly formed.
- Implement getOperand hooks to zero/one/two operand traits, and
getResult/getType hook to OneResult trait.
- Add a new "constant" op to show some of this off, with a specialization for
the constant case.
This patch also splits op validity checks out to a new test/IR/invalid-ops.mlir
file.
This stubs out support for default asmprinter support. My next planned patch
building on top of this will make asmprinter hooks real and will revise this.
PiperOrigin-RevId: 205833214
to all the things. Fill out the OneOperand trait class with support for
getting and setting operands, allowing DimOp to have a working
get/setOperand() method.
I'm not thrilled with the extra template argument on OneOperand, I'll will
investigate removing that in a follow-on patch.
PiperOrigin-RevId: 205679696
details, returning things in terms of values (which is what most clients want).
Implement support for operands and results on Operation, and simplify the
asmprinter to use it.
PiperOrigin-RevId: 205608853
This patch adds support for basic block arguments including parsing and printing.
In doing so noticed that `ssa-id-and-type` is undefined in the MLIR spec; suggested an implementation in the spec doc.
PiperOrigin-RevId: 205593369
there is now an explicit state class - which only has one instance per top
level FooThing::print call. The FunctionPrinter's now subclass ModulePrinter
so they can just call print on their types and other global stuff. This also
makes the contract strict that the global FooThing::print calls are the public
entrypoints and that the printer implementation is otherwise self contained.
No Functionality Change.
PiperOrigin-RevId: 205409317
is still limited in several ways, which i'll build out in subsequent patches.
Rename the accessor for inst operands/results to make the Operand/Result
versions of these more obscure, allowing getOperand/getResult to traffic
in values (which is what - by far - most clients actually care about).
PiperOrigin-RevId: 205408439
- Drop sub-classing of affine binary op expressions.
- Drop affine expr op kind sub. Represent it as multiply by -1 and add. This
will also be in line with the math form when we'll need to represent a system of
linear equalities/inequalities: the negative number goes into the coefficient
of an affine form. (For eg. x_1 + (-1)*x_2 + 3*x_3 + (-2) >= 0). The folding
simplification will transparently deal with multiplying the -1 with any other
constants. This also means we won't need to simplify a multiply expression
like in x_1 + (-2)*x_2 to a subtract expression (x_1 - 2*x_2) for
canonicalization/uniquing.
- When we print the IR, we will still pretty print to a subtract when possible.
PiperOrigin-RevId: 205298958
Loop bounds and presumed to be constants for now and are stored in ForStmt as affine constant expressions. ML function arguments, return statement operands and loop variable name are dropped for now.
PiperOrigin-RevId: 205256208
- This introduces a new FunctionParser base class to handle logic common
between the kinds of functions we have, e.g. ssa operand/def parsing.
- This introduces a basic symbol table (without support for forward
references!) and links defs and uses.
- CFG functions now parse and build operand lists for operations. The printer
isn't set up for them yet tho.
PiperOrigin-RevId: 205246110
the instruction side of the house.
This has a number of limitations, including that we are still dropping
operands on the floor in the parser. Also, most of the convenience methods
aren't wired up yet. This is enough to get result type lists round tripping
through.
PiperOrigin-RevId: 205148223
Refactors operation parsing to share functionality between CFG and ML functions. ML function construction now goes through a builder, similar to the way it is done for
CFG functions.
PiperOrigin-RevId: 204779279
- fold constants when possible.
- for a mul expression, canonicalize to always keep the LHS as the
constant/symbolic term, and similarly, the RHS for an add expression to keep
it closer to the mathematical form. (Eg: f(x) = 3*x + 5)); other similar simplifications;
- verify binary op expressions at creation time.
TODO: we can completely drop AffineSubExpr, and instead use add and mul by -1.
This way something like x - 4 and -4 + x get canonicalized to x + -1 * 4
instead of being x - 4 and x + -4. (The other alternative if wanted to retain
AffineSubExpr would be to simplify x + -1*y to x - y and x + <neg number> to x
- <pos number>).
PiperOrigin-RevId: 204240258
use it.
This also removes "operand" from the affine expr classes: it is unnecessary
verbosity and "operand" will mean something very specific for SSA stuff (we
will have an Operand type).
PiperOrigin-RevId: 203976504
- check for non-affine expressions
- handle negative numbers and negation of id's, expressions
- functions to check if a map is pure affine or semi-affine
- simplify/clean up affine map parsing code
- report more errors messages, more accurate error messages
PiperOrigin-RevId: 203773633
reducing the memory impact on Operation to one word instead of 3 from an
std::vector.
Implement Jacques' suggestion to merge OpImpl::Storage into OpImpl::Base.
PiperOrigin-RevId: 203426518
properties:
- They allow type checked dynamic casting from their base Operation.
- They allow nice accessors for C++ clients, e.g. a "getIndex()" method on
'dim' that returns an unsigned.
- They work with both OperationInst/OperationStmt (once OperationStmt is
implemented).
- They get custom printing logic. They will eventually get custom parsing,
verifier, and builder logic as well.
- Out of tree clients can register their own operation set without having to
change MLIR core, e.g. for TensorFlow or custom target instructions.
This registers addf and dim as examples.
PiperOrigin-RevId: 203382993
A recursive descent parser for affine maps/expressions with operator precedence and
associativity. (While on this, sketch out uniqui'ing functionality for affine maps
and affine binary op expressions (partly).)
PiperOrigin-RevId: 203222063
Add a default error reporter for the parser that uses the SourceManager to print the error. Also and OptResult enum (mirroring ParseResult) to make the behavior self-documenting.
PiperOrigin-RevId: 203173647
in a container automatically maintain their parent pointers, and change storage
from std::vector to the proper llvm::iplist type.
PiperOrigin-RevId: 202889037
important for low-bitwidth inference cases and hardware synthesis targets.
Rename 'int' to 'affineint' to avoid confusion between "the integers" and "the int
type".
PiperOrigin-RevId: 202751508
Run test case:
$ mlir-opt test/IR/parser-affine-map.mlir
test/IR/parser-affine-map.mlir:3:30: error: expect '(' at start of map range
#hello_world2 (i, j) [s0] -> i+s0, j)
^
PiperOrigin-RevId: 202736856
class.
Introduce an Identifier class to MLIRContext to represent uniqued identifiers,
introduce string literal support to the lexer, introducing parser and printer
support etc.
PiperOrigin-RevId: 202592007
- parsing affine map identifiers
- place-holder classes for AffineMap
- module contains a list of affine maps (defined at the top level).
PiperOrigin-RevId: 202336919
Add diagnostic reporter function to lexer/parser and use that from mlir-opt to report errors instead of having the lexer/parser print the errors.
PiperOrigin-RevId: 201892004
This is pretty much minimal scaffolding for this step. Basic block arguments,
instructions, other terminators, a proper IR representation for
blocks/instructions, etc are all coming.
PiperOrigin-RevId: 201826439