This CL retricts shorthand notation printing to only the bounds that can
be roundtripped unambiguously; i.e.:
1. ()[]->(%some_cst) ()[]
2. ()[s0]->(s0) ()[%some_symbol]
Upon inspection it turns out that the constant case was lossy so this CL also
updates it.
Note however that fixing this issue exhibits a potential issues in unroll.mlir.
L488 exhibits a map ()[s0] -> (1)()[%arg0] which could be simplified down to
()[]->(1)()[].
This does not seem like a bug but maybe an undesired complexity in the maps
generated by unrolling.
bondhugula@, care to take a look?
PiperOrigin-RevId: 214531410
optimization pass:
- Give the ability for operations to implement a constantFold hook (a simple
one for single-result ops as well as general support for multi-result ops).
- Implement folding support for constant and addf.
- Implement support in AbstractOperation and Operation to make this usable by
clients.
- Implement a very simple constant folding pass that does top down folding on
CFG and ML functions, with a testcase that exercises all the above stuff.
Random cleanups:
- Improve the build APIs for ConstantOp.
- Stop passing "-o -" to mlir-opt in the testsuite, since that is the default.
PiperOrigin-RevId: 213749809
- extend loop unroll-jam similar to loop unroll for affine bounds
- extend both loop unroll/unroll-jam to deal with cleanup loop for non multiple
of unroll factor.
- extend promotion of single iteration loops to work with affine bounds
- fix typo bugs in loop unroll
- refactor common code b/w loop unroll and loop unroll-jam
- move prototypes of non-pass transforms to LoopUtils.h
- add additional builder methods.
- introduce loopUnrollUpTo(factor) to unroll by either factor or trip count,
whichever is less.
- remove Statement::isInnermost (not used for now - will come back at the right
place/in right form later)
PiperOrigin-RevId: 213471227
unroll/unroll-and-jam more powerful; add additional affine expr builder methods
- use previously added analysis/simplification to infer multiple of unroll
factor trip counts, making loop unroll/unroll-and-jam more general.
- for loop unroll, support bounds that are single result affine map's with the
same set of operands. For unknown loop bounds, loop unroll will now work as
long as trip count can be determined to be a multiple of unroll factor.
- extend getConstantTripCount to deal with single result affine map's with the
same operands. move it to mlir/Analysis/LoopAnalysis.cpp
- add additional builder utility methods for affine expr arithmetic
(difference, mod/floordiv/ceildiv w.r.t postitive constant). simplify code to
use the utility methods.
- move affine analysis routines to AffineAnalysis.cpp/.h from
AffineStructures.cpp/.h.
- Rename LoopUnrollJam to LoopUnrollAndJam to match class name.
- add an additional simplification for simplifyFloorDiv, simplifyCeilDiv
- Rename AffineMap::getNumOperands() getNumInputs: an affine map by itself does
not have operands. Operands are passed to it through affine_apply, from loop
bounds/if condition's, etc., operands are stored in the latter.
This should be sufficiently powerful for now as far as unroll/unroll-and-jam go for TPU
code generation, and can move to other analyses/transformations.
Loop nests like these are now unrolled without any cleanup loop being generated.
for %i = 1 to 100 {
// unroll factor 4: no cleanup loop will be generated.
for %j = (d0) -> (d0) (%i) to (d0) -> (5*d0 + 3) (%i) {
%x = "foo"(%j) : (affineint) -> i32
}
}
for %i = 1 to 100 {
// unroll factor 4: no cleanup loop will be generated.
for %j = (d0) -> (d0) (%i) to (d0) -> (d0 - d mod 4 - 1) (%i) {
%y = "foo"(%j) : (affineint) -> i32
}
}
for %i = 1 to 100 {
for %j = (d0) -> (d0) (%i) to (d0) -> (d0 + 128) (%i) {
%x = "foo"() : () -> i32
}
}
TODO(bondhugula): extend this to LoopUnrollAndJam as well in the next CL (with minor
changes).
PiperOrigin-RevId: 212661212
loop counts. Improve / refactor loop unroll / loop unroll and jam.
- add utility to remove single iteration loops.
- use this utility to promote single iteration loops after unroll/unroll-and-jam
- use loopUnrollByFactor for loopUnrollFull and remove most of the latter.
- add methods for getting constant loop trip count
PiperOrigin-RevId: 212039569
- handle floordiv/ceildiv in AffineExprFlattener; update the simplification to
work even if mod/floordiv/ceildiv expressions appearing in the tree can't be eliminated.
- refactor the flattening / analysis to move it out of lib/Transforms/
- fix MutableAffineMap::isMultipleOf
- add AffineBinaryOpExpr:getAdd/getMul/... utility methods
PiperOrigin-RevId: 211540536
Outside of IR/
- simplify a MutableAffineMap by flattening the affine expressions
- add a simplify affine expression pass that uses this analysis
- update the FlatAffineConstraints API (to be used in the next CL)
In IR:
- add isMultipleOf and getKnownGCD for AffineExpr, and make the in-IR
simplication of simplifyMod simpler and more powerful.
- rename the AffineExpr visitor methods to distinguish b/w visiting and
walking, and to simplify API names based on context.
The next CL will use some of these for the loop unrolling/unroll-jam to make
the detection for the need of cleanup loop powerful/non-trivial.
A future CL will finally move this simplification to FlatAffineConstraints to
make it more powerful. For eg., currently, even if a mod expr appearing in a
part of the expression tree can't be simplified, the whole thing won't be
simplified.
PiperOrigin-RevId: 211012256
- for test purposes, the unroll-jam pass unroll jams the first outermost loop.
While on this:
- fix StmtVisitor to allow overriding of function to iterate walk over children
of a stmt.
PiperOrigin-RevId: 210644813
Collect loops through a post order walk instead of a pre-order so that loops
are collected from inner loops are collected before outer surrounding ones.
Add a complex test case.
PiperOrigin-RevId: 209041057
print floating point in a structured form that we know can round trip,
enumerate attributes in the visitor so we print affine mapping attributes
symbolically (the majority of the testcase updates).
We still have an issue where the hexadecimal floating point syntax is reparsed
as an integer, but that can evolve in subsequent patches.
PiperOrigin-RevId: 208828876
- fix/complete forStmt cloning for unrolling to work for outer loops
- create IV const's only when needed
- test outer loop unrolling by creating a short trip count unroll pass for
loops with trip counts <= <parameter>
- add unrolling test cases for multiple op results, outer loop unrolling
- fix/clean up StmtWalker class while on this
- switch unroll loop iterator values from i32 to affineint
PiperOrigin-RevId: 207645967
Unrelated minor change - remove OperationStmt::dropReferences(). Since MLFunction does not have cyclic operand references (it's an AST) destruction can be safely done w/o a special pass to drop references.
PiperOrigin-RevId: 207583024
- deal with non-operation stmt's (if/for stmt's) in loops being unrolled
(unrolling of non-innermost loops works).
- update uses in unrolled bodies to use results of new operations that may be
introduced in the unrolled bodies.
Unrolling now works for all kinds of loop nests - perfect nests, imperfect
nests, loops at any depth, and with any kind of operation in the body. (IfStmt
support not done, hence untested there).
Added missing dump/print method for StmtBlock.
TODO: add test case for outer loop unrolling.
PiperOrigin-RevId: 207314286
MLFunctions.
- MLStmt cloning and IV replacement
- While at this, fix the innermostLoopGatherer to actually gather all the
innermost loops (it was stopping its walk at the first innermost loop it
found)
- Improve comments for MLFunction statement classes, fix inheritance order.
- Fixed StmtBlock destructor.
PiperOrigin-RevId: 207049173
Induction variables are implemented by inheriting ForStmt from MLValue. ForStmt provides APIs that make this design decision invisible to the ForStmt users.
This CL in combination with cl/206253643 resolves http://b/111769060.
PiperOrigin-RevId: 206655937
- Implement a full loop unroll for innermost loops.
- Use it to implement a pass that unroll all the innermost loops of all
mlfunction's in a module. ForStmt's parsed currently have constant trip
counts (and constant loop bounds).
- Implement StmtVisitor based (Visitor pattern)
Loop IVs aren't currently parsed and represented as SSA values. Replacing uses
of loop IVs in unrolled bodies is thus a TODO. Class comments are sparse at some places - will add them after one round of comments.
A cmd-line flag triggers this for now.
Original:
mlfunc @loops() {
for x = 1 to 100 step 2 {
for x = 1 to 4 {
"Const"(){value: 1} : () -> ()
}
}
return
}
After unrolling:
mlfunc @loops() {
for x = 1 to 100 step 2 {
"Const"(){value: 1} : () -> ()
"Const"(){value: 1} : () -> ()
"Const"(){value: 1} : () -> ()
"Const"(){value: 1} : () -> ()
}
return
}
PiperOrigin-RevId: 205933235