loops), (2) take into account fast memory space capacity and lower 'dmaDepth'
to fit, (3) add location information for debug info / errors
- change dma-generate pass to work on blocks of instructions (start/end
iterators) instead of 'for' loops; complete TODOs - allows DMA generation for
straightline blocks of operation instructions interspersed b/w loops
- take into account fast memory capacity: check whether memory footprint fits
in fastMemoryCapacity parameter, and recurse/lower the depth at which DMA
generation is performed until it does fit in the provided memory
- add location information to MemRefRegion; any insufficient fast memory
capacity errors or debug info w.r.t dma generation shows location information
- allow DMA generation pass to be instantiated with a fast memory capacity
option (besides command line flag)
- change getMemRefRegion to return unique_ptr's
- change getMemRefFootprintBytes to work on a 'Block' instead of 'ForInst'
- other helper methods; add postDomInstFilter option for
replaceAllMemRefUsesWith; drop forInst->walkOps, add Block::walkOps methods
Eg. output
$ mlir-opt -dma-generate -dma-fast-mem-capacity=1 /tmp/single.mlir
/tmp/single.mlir:9:13: error: Total size of all DMA buffers' for this block exceeds fast memory capacity
for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) {
^
$ mlir-opt -debug-only=dma-generate -dma-generate -dma-fast-mem-capacity=400 /tmp/single.mlir
/tmp/single.mlir:9:13: note: 8 KiB of DMA buffers in fast memory space for this block
for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) {
PiperOrigin-RevId: 232297044
They are essentially both modelling MLIR OpTrait; the former achieves the
purpose via introducing corresponding symbols in TableGen, while the latter
just uses plain strings.
Unify them to provide a single mechanism to avoid confusion and to better
reflect the definitions on MLIR C++ side.
Ideally we should be able to deduce lots of these traits automatically via
other bits of op definitions instead of manually specifying them; but not
for now though.
PiperOrigin-RevId: 232191401
These attribute kinds are different from the rest in the sense that their types are defined
in MLIR's type hierarchy and we can build constant op out of them.
By defining this middle-level base class, we have a unified way to test and query the type
of these attributes, which will be useful when constructing constant ops of various dialects.
This CL also added asserts to reject non-NumericAttr in constant op's build() method.
PiperOrigin-RevId: 232188178
- fusion already includes the necessary analysis to create small/local buffers
post fusion; allocate these buffers in a higher memory space if the necessary
pass parameters are provided (threshold size, memory space id)
- although there will be a separate utility at some point to directly detect
and promote small local buffers to higher memory spaces, doing it while fusion
when possible is much less expensive, comes free with fusion analysis, and covers
a key common case.
PiperOrigin-RevId: 232063894
Nothing in the loop can (legally) cause curPtr -> nullptr. And if it did, we
would null dereference right below anyway.
This loop still reads funny to me but doesn't make me stare at it and wonder
what I am missing anymore.
--
PiperOrigin-RevId: 232062076
This CL added a tblgen::DagLeaf wrapper class with several helper methods for handling
DAG arguments. It helps to refactor the rewriter generation logic to be more higher
level.
This CL also added a tblgen::ConstantAttr wrapper class for constant attributes.
PiperOrigin-RevId: 232050683
- getTerminator() on a block can return nullptr; moreover, blocks that are improperly
constructed/transformed by utilities/passes may not have terminators even for the
top-level blocks
PiperOrigin-RevId: 232025963
This CL applies the following simplifications to EDSCs:
1. Rename Block to StmtList because an MLIR Block is a different, not yet
supported, notion;
2. Rework Bindable to drop specific storage and just use it as a simple wrapper
around Expr. The only value of Bindable is to force a static cast when used by
the user to bind into the emitter. For all intended purposes, Bindable is just
a lightweight check that an Expr is Unbound. This simplifies usage and reduces
the API footprint. After playing with it for some time, it wasn't worth the API
cognition overhead;
3. Replace makeExprs and makeBindables by makeNewExprs and copyExprs which is
more explicit and less easy to misuse;
4. Add generally useful functionality to MLIREmitter:
a. expose zero and one for the ubiquitous common lower bounds and step;
b. add support to create already bound Exprs for all function arguments as
well as shapes and views for Exprs bound to memrefs.
5. Delete Stmt::operator= and replace by a `Stmt::set` method which is more
explicit.
6. Make Stmt::operator Expr() explicit.
7. Indexed.indices assertions are removed to pave the way for expressing slices
and views as well as to work with 0-D memrefs.
The CL plugs those simplifications with TableGen and allows emitting a full MLIR function for
pointwise add.
This "x.add" op is both type and rank-agnostic (by allowing ArrayRef of Expr
passed to For loops) and opens the door to spinning up a composable library of
existing and custom ops that should automate a lot of the tedious work in
TF/XLA -> MLIR.
Testing needs to be significantly improved but can be done in a separate CL.
PiperOrigin-RevId: 231982325
This allow for arbitrarily complex builder patterns which is meant to cover initial cases while the modelling is improved and long tail cases/cases for which expanding the DSL would result in worst overall system.
NFC just sorting the emit replace methods alphabetical in the class and file body.
PiperOrigin-RevId: 231890352
This CL introduces a hotfix post refactoring of NestedMatchers:
- fix uninitialized read to skip
- avoid bumpptr allocating with 0 elements
Interestingly the latter issue only surfaced in fastbuild mode with no-san and
manifested itself by a SIGILL. All other combinations that were tried failed to
reproduce the issue (dbg, opt, fastbuild with asan)
PiperOrigin-RevId: 231787642
A performance issue was reported due to the usage of NestedMatcher in
ComposeAffineMaps. The main culprit was the ubiquitous copies that were
occuring when appending even a single element in `matchOne`.
This CL generally simplifies the implementation and removes one level of indirection by getting rid of
auxiliary storage as well as simplifying the API.
The users of the API are updated accordingly.
The implementation was tested on a heavily unrolled example with
ComposeAffineMaps and is now close in performance with an implementation based
on stateless InstWalker.
As a reminder, the whole ComposeAffineMaps pass is slated to disappear but the bug report was very useful as a stress test for NestedMatchers.
Lastly, the following cleanups reported by @aminim were addressed:
1. make NestedPatternContext scoped within runFunction rather than at the Pass level. This was caused by a previous misunderstanding of Pass lifetime;
2. use defensive assertions in the constructor of NestedPatternContext to make it clear a unique such locally scoped context is allowed to exist.
PiperOrigin-RevId: 231781279
This CL addresses some cleanups that were leftover after an incorrect rebase:
1. use StringSwitch
2. use // NOLINTNEXTLINE
3. remove a dead line of code
PiperOrigin-RevId: 231726640
a trivial inst walker :-) (reduces pass time from several minutes non-terminating to 120ms) - (fixes b/123541184)
- use a simple 7-line inst walker to collect affine_apply op's instead of the nested
matcher; -compose-affine-maps pass runs in 120ms now instead of 5 minutes + (non-
terminating / out of memory) - on a realistic test case that is 20,000 lines 12-d
loop nest
- this CL is also pushing for simple existing/standard patterns unless there
is a real efficiency issue (OTOH, fixing nested matcher to address this issue requires
cl/231400521)
- the improvement is from swapping out the nested walker as opposed to from a bug
or anything else that this CL changes
- update stale comment
PiperOrigin-RevId: 231623619
This CL mandated TypeConstraint and Type to provide descriptions and fixed
various subclasses and definitions to provide so. The purpose is to enforce
good documentation; using empty string as the default just invites oversight.
PiperOrigin-RevId: 231579629
* Emitted result lists for ops.
* Changed to allow empty summary and description for ops.
* Avoided indenting description to allow proper MarkDown rendering of
formatting markers inside description content.
* Used fixed width font for operand/attribute names.
* Massaged TensorFlow op docs and generated dialect op doc.
PiperOrigin-RevId: 231427574
Similar to op operands and attributes, use DAG to specify operation's results.
This will allow us to provide names and matchers for outputs.
Also Defined `outs` as a marker to indicate the start of op result list.
PiperOrigin-RevId: 231422455
This CL also introduces a set of python bindings using pybind11. The bindings
are exercised using a `test_py2andpy3.py` test suite that works for both
python 2 and 3.
`test_py3.py` on the other hand uses the more idiomatic,
python 3 only "PEP 3132 -- Extended Iterable Unpacking" to implement a rank
and type-agnostic copy with transposition.
Because python assignment is by reference, we cannot easily make the
assignment operator use the same type of sugaring as in C++; i.e. the
following:
```cpp
Stmt block = edsc::Block({
For(ivs, zeros, shapeA, ones, {
C[ivs] = IA[ivs] + IB[ivs]
})});
```
has no equivalent in the native Python EDSCs at this time.
However, the sugaring can be built as a simple DSL in python and is left as
future work.
PiperOrigin-RevId: 231337667
Python modules cannot be defined under a directory that has a `-` character in its name inside of Google code.
Rename to `google_mlir` which circumvents this limitation.
PiperOrigin-RevId: 231329321
Update to allow constant attribute values to be used to match or as result in rewrite rule. Define variable ctx in the matcher to allow matchers to refer to the context of the operation being matched.
PiperOrigin-RevId: 231322019
This CL adds support for calling EDSCs from other languages than C++.
Following the LLVM convention this CL:
1. declares simple opaque types and a C API in mlir-c/Core.h;
2. defines the implementation directly in lib/EDSC/Types.cpp and
lib/EDSC/MLIREmitter.cpp.
Unlike LLVM however the nomenclature for these types and API functions is not
well-defined, naming suggestions are most welcome.
To avoid the need for conversion functions, Types.h and MLIREmitter.h include
mlir-c/Core.h and provide constructors and conversion operators between the
mlir::edsc type and the corresponding C type.
In this first commit, mlir-c/Core.h only contains the types for the C API
to allow EDSCs to work from Python. This includes both a minimal set of core
MLIR
types (mlir_context_t, mlir_type_t, mlir_func_t) as well as the EDSC types
(edsc_mlir_emitter_t, edsc_expr_t, edsc_stmt_t, edsc_indexed_t). This can be
restructured in the future as concrete needs arise.
For now, the API only supports:
1. scalar types;
2. memrefs of scalar types with static or symbolic shapes;
3. functions with input and output of these types.
The C API is not complete wrt ownership semantics. This is in large part due
to the fact that python bindings are written with Pybind11 which allows very
idiomatic C++ bindings. An effort is made to write a large chunk of these
bindings using the C API but some C++isms are used where the design benefits
from this simplication. A fully isolated C API will make more sense once we
also integrate with another language like Swift and have enough use cases to
drive the design.
Lastly, this CL also fixes a bug in mlir::ExecutionEngine were the order of
declaration of llvmContext and the JIT result in an improper order of
destructors (which used to crash before the fix).
PiperOrigin-RevId: 231290250
Similar to other tblgen:: abstractions, tblgen::Pattern hides the native TableGen
API and provides a nicer API that is more coherent with the TableGen definitions.
PiperOrigin-RevId: 231285143
* Matching an attribute and specifying a attribute constraint is the same thing executionally, so represent it such.
* Extract AttrConstraint helper to match TypeConstraint and use that where mAttr was previously used in RewriterGen.
PiperOrigin-RevId: 231213580
Cleanup a usage of functional::map that is deemed too obscure in
`reindexAffineIndices`. Also fix a stale comment in `reindexAffineIndices`.
PiperOrigin-RevId: 231211184