1) We incorrectly reassociated non-reassociative operations like subi, causing
miscompilations.
2) When constant folding, we didn't add users of the new constant back to the
worklist for reprocessing, causing us to miss some cases (pointed out by
Uday).
The code for fix 2) is gross, but I'll add the new APIs in a followup patch.
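As an illustration of the two fixes above, here is a minimal sketch in Python. It is not the MLIR C++ code: the `Op` class, the `FOLDERS` table, and the `fold_constants` driver are hypothetical names invented for this example. It shows why subtraction must not be reassociated, and how pushing the users of a newly folded constant back onto the worklist lets a chain fold completely.

```python
from collections import deque

# Subtraction is not associative, so (a - b) - c must not become a - (b - c).
a, b, c = 10, 4, 3
assert (a - b) - c != a - (b - c)  # 3 versus 9


class Op:
    """Hypothetical SSA-like operation: a constant or a binary integer op."""

    def __init__(self, kind, operands=(), value=None):
        self.kind = kind                # "const", "addi", or "subi"
        self.operands = list(operands)
        self.value = value              # only meaningful for "const"
        self.users = []                 # operations that consume this result
        for operand in self.operands:
            operand.users.append(self)


# Hypothetical folding table: how to evaluate each foldable operation.
FOLDERS = {"addi": lambda x, y: x + y, "subi": lambda x, y: x - y}


def fold_constants(ops):
    """Worklist-driven constant folding; returns the number of folded ops."""
    worklist = deque(ops)
    folded = 0
    while worklist:
        op = worklist.popleft()
        if op.kind not in FOLDERS:
            continue
        if not all(operand.kind == "const" for operand in op.operands):
            continue
        lhs, rhs = op.operands
        op.value = FOLDERS[op.kind](lhs.value, rhs.value)
        op.kind, op.operands = "const", []
        folded += 1
        # The users of the new constant may have become foldable: revisit them.
        worklist.extend(op.users)
    return folded
```

If `outer = addi(subi(7, 2), 7)` is visited before its `subi` operand, it is skipped on the first visit and only folds because the `subi` re-queues it after becoming a constant.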
PiperOrigin-RevId: 218803984
distinction. FunctionPasses can now choose to get called on all functions, or
have the driver split CFG/ML Functions up for them. NFC.
PiperOrigin-RevId: 218775885
make operations provide a list of canonicalizations that can be applied to
them. This allows canonicalization to be general to any IR definition.
As part of this, sink PatternMatch.h/cpp down to the IR library to fix a
layering problem.
PiperOrigin-RevId: 218773981
This is done by changing Attribute to be a POD interface around an underlying pointer storage and adding in-class support for isa/dyn_cast/cast.
PiperOrigin-RevId: 218764173
just having the pattern matcher in its own library. At this point,
lib/Transforms/*.cpp are all actually passes themselves (and will probably
eventually be moved to a new subdirectory themselves as we accrete more).
PiperOrigin-RevId: 218745193
helper function, in preparation for it being used by other passes.
There is still a lot of room for improvement in its design; this patch is
intended as an NFC refactoring, and the improvements will continue after this
lands.
PiperOrigin-RevId: 218737116
- Introduce Fourier-Motzkin variable elimination to eliminate a dimension from
a system of linear equalities/inequalities. Update isEmpty to use this.
Since FM is only exact on rational/real spaces, an emptiness check based on
this is guaranteed to be exact whenever it says the underlying set is empty;
if it says the set is not empty, there may still be no integer points in it.
Also supports a version that computes "dark shadows".
- Test this by checking for "always false" conditionals in if statements.
- Unique IntegerSets that are small (few constraints, few variables). This
basically means the canonical empty set and other small sets that are
likely commonly used get uniqued; allows checking for the canonical empty set
by pointer. IntegerSet::kUniquingThreshold gives the threshold constraint size
for uniquing.
- rename simplify-affine-expr -> simplify-affine-structures
Other cleanup
- IntegerSet::numConstraints, AffineMap::numResults are no longer needed;
remove them.
- add copy assignment operators for AffineMap, IntegerSet.
- rename Invalid() -> Null() on AffineExpr, AffineMap, IntegerSet
- Misc cleanup for FlatAffineConstraints API
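The rational emptiness check described above can be sketched as follows. This is a minimal illustration in Python, not the FlatAffineConstraints implementation: the row layout (coefficients followed by a constant, meaning `sum(a[i]*x[i]) + const >= 0`) and the function names `eliminate` and `is_empty_over_rationals` are assumptions made for this example, and the "dark shadow" variant is not covered.

```python
from fractions import Fraction


def eliminate(rows, k):
    """Project variable k out of rows meaning sum(a[i]*x[i]) + a[-1] >= 0."""
    lower = [r for r in rows if r[k] > 0]    # rows giving lower bounds on x[k]
    upper = [r for r in rows if r[k] < 0]    # rows giving upper bounds on x[k]
    result = [r for r in rows if r[k] == 0]  # rows that do not mention x[k]
    for lo in lower:
        for up in upper:
            # Positive combination of the two rows that cancels x[k].
            result.append([lo[k] * u - up[k] * l for l, u in zip(lo, up)])
    # Drop the (now all-zero) column for x[k].
    return [r[:k] + r[k + 1:] for r in result]


def is_empty_over_rationals(rows, num_vars):
    """True if the system has no rational solution (hence no integer one)."""
    rows = [[Fraction(v) for v in r] for r in rows]
    for _ in range(num_vars):
        rows = eliminate(rows, 0)
    # Only constants remain; each row now reads "const >= 0".
    return any(r[0] < 0 for r in rows)
```

For example, `x >= 1, x <= 0` is reported empty, while `2x >= 1, 2x <= 1` is reported non-empty even though its only solution, `x = 1/2`, is not an integer. That is the one-sided exactness noted above.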
PiperOrigin-RevId: 218690456
- Adds FlatAffineConstraints::isEmpty method to test if there are no solutions to the system.
- Adds a GCD test to check whether equality constraints have no solution.
- Adds unit test cases.
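For reference, the idea behind a GCD test on a single equality can be sketched as below. This is a small Python illustration under stated assumptions, not the FlatAffineConstraints code: the function name and argument layout are hypothetical. An equality `a1*x1 + ... + an*xn + c == 0` has an integer solution exactly when the GCD of the coefficients divides the constant term.

```python
from functools import reduce
from math import gcd


def equality_has_integer_solution(coeffs, constant):
    """GCD test for a1*x1 + ... + an*xn + constant == 0 over the integers."""
    g = reduce(gcd, (abs(a) for a in coeffs), 0)
    if g == 0:
        # No variables are involved, so the equality is just "constant == 0".
        return constant == 0
    return constant % g == 0
```

For instance, `2x + 4y - 3 == 0` fails the test (the GCD is 2 and does not divide 3), so a system containing it is empty.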
PiperOrigin-RevId: 218546319
"shape_cast" only applies to tensors, and there are other operations that
actually affect shape, for example "reshape". Rename "shape_cast" to
"tensor_cast" in both the code and the documentation.
PiperOrigin-RevId: 218528122
This CL only converts the document to the g3doc format and does some minor
typesetting, e.g. removing unicode ellipsis and mdash symbols, replacing single
quotes with backticks to trigger tt-type display, etc. The original document
is located at
https://docs.google.com/document/d/1KoVYgp-m-dgAyKwqRne2c72j0FoxpsdNgfa9DTfWGgw/view
Links to the sections of the same document are updated to point to the anchors
in the converted document whereas links to external documents are kept as is.
Cross-links between LangRef.md and Rationale.md are updated to point to the
relevant anchors in the g3doc files.
PiperOrigin-RevId: 218527560
is a straight-forward change, but required adding missing moveBefore() methods
on operations (requiring moving some traits around to make C++ happy). This
also fixes a constness issue with the getBlock/getFunction() methods on
Instruction, and adds a missing getFunction() method on MLFuncBuilder.
PiperOrigin-RevId: 218523905
For some constant vectors / tensors, if the compiler doesn't need to
interpret the content of their elements, they can be stored in this class to
save the serialize / deserialize cost.
syntax:
`opaque<` tensor-type `,` opaque-string `>`
opaque-string ::= `0x` [0-9a-fA-F]*
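As a rough illustration of the `opaque-string` production above, the following Python sketch validates such a string and decodes it to raw bytes. The function name is hypothetical, and the requirement of an even number of hex digits is an assumption of this sketch (so that the payload maps to whole bytes); the grammar above does not state it.

```python
import re

# Mirrors the production: opaque-string ::= `0x` [0-9a-fA-F]*
OPAQUE_STRING = re.compile(r"0x[0-9a-fA-F]*")


def decode_opaque_string(text):
    """Validate an opaque-string and return its payload as bytes."""
    if not OPAQUE_STRING.fullmatch(text):
        raise ValueError(f"not an opaque-string: {text!r}")
    digits = text[2:]
    if len(digits) % 2 != 0:
        # Assumption of this sketch: the payload is a whole number of bytes.
        raise ValueError("odd number of hex digits")
    return bytes.fromhex(digits)
```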
PiperOrigin-RevId: 218399426
- Add a few canonicalization patterns to fold memref_cast into
load/store/dealloc.
- Canonicalize alloc(constant) into an alloc with a constant shape followed by
a cast.
- Add a new PatternRewriter::updatedRootInPlace API to make this more convenient.
SimplifyAllocConst and the testcase are heavily based on Uday's implementation work, just
in a different framework.
PiperOrigin-RevId: 218361237
This was left as a TODO in the code. Move the type verification from
MLFuncVerifier::verifyReturn to ReturnOp::verify. Since the return operation
can only appear as the last statement of an MLFunction, i.e. where the
surrounding block is the function itself, it is easy to access the function
descriptor (ReturnOp::verify already relies on this). From the function
descriptor, one can easily access the type information. Note that this
slightly modifies the error message due to the use of emitOpError instead of a
plain emitError.
Drop the obsolete TODO comment in MLFunction::verify about checking that
"return" only appears as the last operation of an MLFunction since
ReturnOp::verify explicitly checks for that.
PiperOrigin-RevId: 218347843
This was left as a TODO in the code. Note that the spec does not explicitly
prohibit the first basic block from having a predecessor, and may be worth
updating.
The error is reported at the location of the cfgfunc to which the basic block
belongs since the location information of the block label is not propagated
beyond the IR parser. Arguably, pointing to a function that starts with an
ill-formed block is better than pointing to the first operation in that block
as it makes it easier to follow the code down until the first block label.
PiperOrigin-RevId: 218343654
- Change AllocOp to have a getType() that always returns a MemRefType, since
that is what it requires.
- Rename StandardOps/StandardOpRegistration.cpp ->
StandardOps/OpRegistration.cpp to align with other op sets.
- Add AffineMap::getContext() helper and use it in the asmprinter.
PiperOrigin-RevId: 218205527
a step forward because now every AbstractOperation knows which Dialect it is
associated with, enabling things in the future like "constant folding
hooks" which will be important for layering. This is also a bit nicer on
the registration side of things.
PiperOrigin-RevId: 218104230
PatternMatcher clients up to date and provide a funnel point for newly added
operations. This is also progress towards the canonicalizer supporting
CFGFunctions.
This paves the way for more complex patterns, but by itself doesn't do much
useful, so no testcase.
PiperOrigin-RevId: 218101737
We should be able to represent arbitrary-precision floating-point values inside
the IR, so that compiler optimizations such as constant folding can be done
independently of the compiling platform.
This CL also adds a new field, AttrValueGetter, to the Attr class definition
for TableGen. This field is used to customize which mlir::Attr getter method is
used to get the defined PrimitiveType.
PiperOrigin-RevId: 218034983
Also rename Operation::is to Operation::isa
Introduce Operation::cast
All of these are for consistency with global dyn_cast/cast/isa operators.
PiperOrigin-RevId: 217878786
The SparseElementsAttr uses Coordinate List (COO) encoding to represent a
sparse tensor / vector. Specifically, the coordinates and values are stored as
two dense elements attributes. The first dense elements attribute is a 2-D
attribute with shape [N, ndims], which contains the indices of the elements
with nonzero values in the constant vector/tensor. The second elements
attribute is a 1-D attribute list with shape [N], which supplies the values for
each element in the first elements attribute. ndims is the rank of the
vector/tensor and N is the total number of nonzero elements.
The syntax is:
`sparse<` (tensor-type | vector-type)`, ` indices-attribute-list, values-attribute-list `>`
Example: a sparse tensor
sparse<vector<3x4xi32>, [[0, 0], [1, 2]], [1, 2]> represents the dense tensor
[[1, 0, 0, 0]
[0, 0, 2, 0]
[0, 0, 0, 0]]
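The COO encoding described above can be expanded to a dense value as in the following Python sketch. It is only an illustration of the encoding, not the SparseElementsAttr implementation; `sparse_to_dense` is a hypothetical helper, and it assumes a shape of rank 1 or higher.

```python
def sparse_to_dense(shape, indices, values):
    """Expand COO data (indices of shape [N, ndims], values of shape [N])."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length N")

    def zeros(dims):
        # Build a nested list of zeros with the requested dimensions.
        if len(dims) == 1:
            return [0] * dims[0]
        return [zeros(dims[1:]) for _ in range(dims[0])]

    dense = zeros(list(shape))
    for index, value in zip(indices, values):
        if len(index) != len(shape):
            raise ValueError("each index must have ndims coordinates")
        cell = dense
        for coordinate in index[:-1]:
            cell = cell[coordinate]
        cell[index[-1]] = value
    return dense
```

Calling `sparse_to_dense((3, 4), [[0, 0], [1, 2]], [1, 2])` reproduces the dense 3x4 value from the example above.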
PiperOrigin-RevId: 217764319
The syntax of a dense vector/tensor attribute value is
`dense<` (tensor-type | vector-type)`,` attribute-list`>`
and
attribute-list ::= `[` attribute-list (`, ` attribute-list)* `]`.
The construction of the dense vector/tensor attribute takes a vector/tensor
type and a character array as arguments. The size of the input array should be
larger than the size specified by the type argument. It also assumes the
elements of the vector or tensor have been truncated to the data type sizes in
the input character array, so it extends the truncated data to 64 bits when it
is retrieved.
PiperOrigin-RevId: 217762811
multiple TODOs.
- replace the fake test pass (that worked on just the first loop in the
MLFunction) to perform DMA pipelining on all suitable loops.
- nested DMAs work now (DMAs in an outer loop, more DMAs in nested inner loops)
- fix bugs / assumptions: correctly copy memory space and elemental type of source
memref for double buffering.
- correctly identify matching start/finish statements, handle multiple DMAs per
loop.
- introduce dominates/properlyDominates utilities for MLFunction statements.
- move checkDominancePreservationOnShifts to LoopAnalysis.h; rename it
getShiftValidity
- refactor getContainingStmtPos -> findAncestorStmtInBlock - move into
Analysis/Utils.h; has two users.
- other improvements / cleanup for related API/utilities
- add size argument to dma_wait - for nested DMAs or in general, it makes it
easy to obtain the size to use when lowering the dma_wait since we wouldn't
want to identify the matching dma_start, and more importantly, in general/in the
future, there may not always be a dma_start dominating the dma_wait.
- add debug information in the pass
PiperOrigin-RevId: 217734892
This CL implements a very simple loop vectorization **test** and the basic
infrastructure to support it.
The test simply consists in:
1. matching the loops in the MLFunction and all the Load/Store operations
nested under the loop;
2. testing whether all the Load/Store are contiguous along the innermost
memory dimension along that particular loop. If any reference is
non-contiguous (i.e. the ForStmt SSAValue appears in the expression), then
the loop is not-vectorizable.
The simple test above can gradually be extended with more interesting
behaviors to account for the fact that a layout permutation may exist that
enables contiguity etc. All these will come in due time but it is worthwhile
noting that the test already supports detection of outer-vectorizable loops.
In implementing this test, I also added a recursive MLFunctionMatcher and some
sugar that can capture patterns
such as `auto gemmLike = Doall(Doall(Red(LoadStore())))` and allows iterating
on the matched IR structures. For now it just uses in-order traversal but
post-order DFS will be useful in the future once IR rewrites start occurring.
One may note that the memory management design decision follows a different
pattern from MLIR. After evaluating different designs and how they quickly
increase cognitive overhead, I decided to opt for the simplest solution in my
view: a class-wide (threadsafe) RAII context.
This way, a pass that needs MLFunctionMatcher can just have its own locally
scoped BumpPtrAllocator and everything is cleaned up when the pass is destroyed.
If passes are expected to have a longer lifetime, then the contexts can easily
be scoped inside the runOnMLFunction call and storage lifetime reduced.
Lastly, whatever the scope of threading (module, function, pass), this is
expected to also be future-proof wrt concurrency (but this is a detail atm).
PiperOrigin-RevId: 217622889
Change how attributes can be added to an Op to make the syntax in the td file a bit cleaner. Also avoid unnecessarily emitting a verify method (the trivial return-false one that's already in the base) and use custom syntax in the test.
PiperOrigin-RevId: 217330036
Updates ComposeAffineMaps test pass to use this method.
Updates affine map composition test cases to handle the new pass, which can be reused when this method is used in a future instruction combine pass.
PiperOrigin-RevId: 217163351
Create tblgen based tool to generate the C++ Op definitions. The modelling is
currently simple (ops, attributes, properties), with the bodies of the
printer/parser/verifier functions and the builders being very explicit.
PiperOrigin-RevId: 217150213
Associate BasicBlocks with the function being parsed to avoid leaks in the case of parse failures. Associating with the function means that we can no longer determine whether a block is defined or forward declared simply by considering if a BasicBlock has an associated function, so track forward declared block references explicitly (this should also allow flagging multiple undeclared fwd references). Split out getting the named block from defining it; in the case of a definition, move the block to the end of the function.
Also destroy all forward reference placeholders in FunctionParser.
Return parse failure in parseAttributeDict if there is no left brace instead of
asserting.
PiperOrigin-RevId: 217049507
- Make it so OpPointer implicitly converts to SSAValue* when the underlying op
has a single value. This eliminates a lot more ->getResult() calls and makes
the behavior more LLVM-like
- Fill out PatternBenefit to be typed instead of just a typedef for int with
magic numbers.
- Simplify various code due to these changes.
PiperOrigin-RevId: 217020717
- add util to create a private / exclusive / single use affine
computation slice for an op stmt (see method doc comment); a single
multi-result affine_apply op is prepended to the op stmt to provide all
results needed for its operands as a function of loop iterators and symbols.
- use it for DMA pipelining (to create private slices for DMA start stmt's);
resolve TODOs/feature request (b/117159533)
- move createComposedAffineApplyOp to Transforms/Utils; free it from taking a
memref as input / generalize it.
PiperOrigin-RevId: 216926818
out canonicalization pass to drive it, and a simple (x-x) === 0 pattern match
as a test case.
There is a tremendous number of improvements that need to land, and the
matcher/rewriter and patterns will be split out of this file, but this is a
starting point.
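To make the (x-x) pattern concrete, here is a minimal Python sketch of the rewrite on a toy expression tree. It is not the matcher/rewriter from this patch: the tuple encoding and the `canonicalize` function are hypothetical, and structural equality of side-effect-free subtrees stands in for "same SSA value".

```python
def canonicalize(expr):
    """Rewrite sub(x, x) to const 0, bottom-up, on a toy expression tree.

    Leaves are ("const", n) or ("arg", name); inner nodes are (op, lhs, rhs).
    """
    if expr[0] in ("const", "arg"):
        return expr
    op, lhs, rhs = expr
    # Canonicalize operands first so nested matches are exposed.
    lhs, rhs = canonicalize(lhs), canonicalize(rhs)
    if op == "sub" and lhs == rhs:
        return ("const", 0)
    return (op, lhs, rhs)
```

For example, `canonicalize(("sub", ("arg", "x"), ("arg", "x")))` yields `("const", 0)`, while `("sub", ("arg", "x"), ("arg", "y"))` is left unchanged.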
PiperOrigin-RevId: 216788604