Commit Graph

743 Commits

Author SHA1 Message Date
Nicolas Vasilache d4921f4a96 Address Performance issue in NestedMatcher
A performance issue was reported due to the usage of NestedMatcher in
ComposeAffineMaps. The main culprit was the ubiquitous copies that were
occuring when appending even a single element in `matchOne`.

This CL generally simplifies the implementation and removes one level of indirection by getting rid of
auxiliary storage as well as simplifying the API.
The users of the API are updated accordingly.

The implementation was tested on a heavily unrolled example with
ComposeAffineMaps and is now close in performance with an implementation based
on stateless InstWalker.

As a reminder, the whole ComposeAffineMaps pass is slated to disappear but the bug report was very useful as a stress test for NestedMatchers.

Lastly, the following cleanups reported by @aminim were addressed:
1. make NestedPatternContext scoped within runFunction rather than at the Pass level. This was caused by a previous misunderstanding of Pass lifetime;
2. use defensive assertions in the constructor of NestedPatternContext to make it clear a unique such locally scoped context is allowed to exist.

PiperOrigin-RevId: 231781279
2019-03-29 16:04:07 -07:00
Nicolas Vasilache 35200435e7 Address cleanups from previous CL
This CL addresses some cleanups that were leftover after an incorrect rebase:
1. use StringSwitch
2. use // NOLINTNEXTLINE
3. remove a dead line of code

PiperOrigin-RevId: 231726640
2019-03-29 16:03:53 -07:00
MLIR Team 1e85191d07 Fix ASAN issue: snapshot edge list before loop which can modify this list.
PiperOrigin-RevId: 231686040
2019-03-29 16:03:38 -07:00
MLIR Team d7c824451f LoopFusion: insert the source loop nest slice at a depth in the destination loop nest which preserves dependences (above any loop carried or other dependences). This is accomplished by updating the maximum destination loop depth based on dependence checks between source loop nest loads and stores which access the memref on which the source loop nest has a store op. In addition, prevent fusing in source loop nests which write to memrefs which escape or are live out.
PiperOrigin-RevId: 231684492
2019-03-29 16:03:23 -07:00
Uday Bondhugula 44064d5b3b 3000x speed improvement on compose-affine-maps by dropping NestedMatcher for
a trivial inst walker :-) (reduces pass time from several minutes non-terminating to 120ms) - (fixes b/123541184)

- use a simple 7-line inst walker to collect affine_apply op's instead of the nested
  matcher; -compose-affine-maps pass runs in 120ms now instead of 5 minutes + (non-
  terminating / out of memory) - on a realistic test case that is 20,000 lines 12-d
  loop nest

- this CL is also pushing for simple existing/standard patterns unless there
  is a real efficiency issue (OTOH, fixing nested matcher to address this issue requires
  cl/231400521)

- the improvement is from swapping out the nested walker as opposed to from a bug
  or anything else that this CL changes

- update stale comment

PiperOrigin-RevId: 231623619
2019-03-29 16:02:53 -07:00
River Riddle b6928c945c Standardize the spelling of debug info to "debuginfo" in opt flags.
PiperOrigin-RevId: 231610337
2019-03-29 16:02:38 -07:00
Lei Zhang 66647a313a [tablegen] Use tblgen:: classes for NamedAttribute and Operand fields
This is another step towards hiding raw TableGen API calls.

PiperOrigin-RevId: 231580827
2019-03-29 16:02:23 -07:00
Lei Zhang 726dc08e4d [doc] Generate more readable description for attributes
This CL added "description" field to AttrConstraint and Attr, like what we
have for type classes.

PiperOrigin-RevId: 231579853
2019-03-29 16:01:53 -07:00
Lei Zhang 18219caeb2 [doc] Generate more readable description for operands
This CL mandated TypeConstraint and Type to provide descriptions and fixed
various subclasses and definitions to provide so. The purpose is to enforce
good documentation; using empty string as the default just invites oversight.

PiperOrigin-RevId: 231579629
2019-03-29 16:01:38 -07:00
River Riddle 994111238b Fold CallIndirectOp to CallOp when the callee operand is a known constant function.
PiperOrigin-RevId: 231511697
2019-03-29 16:01:23 -07:00
Jacques Pienaar b52dd7f788 Use formatv for the error instead of string stream.
PiperOrigin-RevId: 231507680
2019-03-29 16:01:08 -07:00
Uday Bondhugula c0e9e5eb07 Fix getFullMemRefAsRegion() and FlatAffineConstraints::reset
PiperOrigin-RevId: 231426734
2019-03-29 16:00:39 -07:00
Lei Zhang c224a518f5 TableGen: Use DAG for op results
Similar to op operands and attributes, use DAG to specify operation's results.
This will allow us to provide names and matchers for outputs.

Also Defined `outs` as a marker to indicate the start of op result list.

PiperOrigin-RevId: 231422455
2019-03-29 16:00:22 -07:00
MLIR Team a0f3db4024 Support fusing loop nests which require insertion into a new instruction Block position while preserving dependences, opening up additional fusion opportunities.
- Adds SSA Value edges to the data dependence graph used in the loop fusion pass.

PiperOrigin-RevId: 231417649
2019-03-29 16:00:04 -07:00
Lei Zhang 1dfc3ac5ce Prefix Operator getter methods with "get" to be consistent
PiperOrigin-RevId: 231416230
2019-03-29 15:59:46 -07:00
River Riddle 755538328b Recommit: Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst.
PiperOrigin-RevId: 231342063
2019-03-29 15:59:30 -07:00
Nicolas Vasilache 39d81f246a Introduce python bindings for MLIR EDSCs
This CL also introduces a set of python bindings using pybind11. The bindings
are exercised using a `test_py2andpy3.py` test suite that works for both
python 2 and 3.

`test_py3.py` on the other hand uses the more idiomatic,
python 3 only "PEP 3132 -- Extended Iterable Unpacking" to implement a rank
and type-agnostic copy with transposition.

Because python assignment is by reference, we cannot easily make the
assignment operator use the same type of sugaring as in C++; i.e. the
following:

```cpp
Stmt block = edsc::Block({
  For(ivs, zeros, shapeA, ones, {
    C[ivs] = IA[ivs] + IB[ivs]
})});
```

has no equivalent in the native Python EDSCs at this time.

However, the sugaring can be built as a simple DSL in python and is left as
future work.

PiperOrigin-RevId: 231337667
2019-03-29 15:59:14 -07:00
Nicolas Vasilache ae772b7965 Automated rollback of changelist 231318632.
PiperOrigin-RevId: 231327161
2019-03-29 15:42:38 -07:00
River Riddle 5ecef2b3f6 Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst.
PiperOrigin-RevId: 231318632
2019-03-29 15:42:08 -07:00
Nicolas Vasilache cacf05892e Add a C API for EDSCs in other languages + python
This CL adds support for calling EDSCs from other languages than C++.
Following the LLVM convention this CL:
1. declares simple opaque types and a C API in mlir-c/Core.h;
2. defines the implementation directly in lib/EDSC/Types.cpp and
lib/EDSC/MLIREmitter.cpp.

Unlike LLVM however the nomenclature for these types and API functions is not
well-defined, naming suggestions are most welcome.

To avoid the need for conversion functions, Types.h and MLIREmitter.h include
mlir-c/Core.h and provide constructors and conversion operators between the
mlir::edsc type and the corresponding C type.

In this first commit, mlir-c/Core.h only contains the types for the C API
to allow EDSCs to work from Python. This includes both a minimal set of core
MLIR
types (mlir_context_t, mlir_type_t, mlir_func_t) as well as the EDSC types
(edsc_mlir_emitter_t, edsc_expr_t, edsc_stmt_t, edsc_indexed_t). This can be
restructured in the future as concrete needs arise.

For now, the API only supports:
1. scalar types;
2. memrefs of scalar types with static or symbolic shapes;
3. functions with input and output of these types.

The C API is not complete wrt ownership semantics. This is in large part due
to the fact that python bindings are written with Pybind11 which allows very
idiomatic C++ bindings. An effort is made to write a large chunk of these
bindings using the C API but some C++isms are used where the design benefits
from this simplication. A fully isolated C API will make more sense once we
also integrate with another language like Swift and have enough use cases to
drive the design.

Lastly, this CL also fixes a bug in mlir::ExecutionEngine were the order of
declaration of llvmContext and the JIT result in an improper order of
destructors (which used to crash before the fix).

PiperOrigin-RevId: 231290250
2019-03-29 15:41:53 -07:00
Lei Zhang eb753f4aec Add tblgen::Pattern to model Patterns defined in TableGen
Similar to other tblgen:: abstractions, tblgen::Pattern hides the native TableGen
API and provides a nicer API that is more coherent with the TableGen definitions.

PiperOrigin-RevId: 231285143
2019-03-29 15:41:38 -07:00
Jacques Pienaar 0fbf4ff232 Define mAttr in terms of AttrConstraint.
* Matching an attribute and specifying a attribute constraint is the same thing executionally, so represent it such.
* Extract AttrConstraint helper to match TypeConstraint and use that where mAttr was previously used in RewriterGen.

PiperOrigin-RevId: 231213580
2019-03-29 15:41:23 -07:00
Nicolas Vasilache 1a5287d594 Replace too obscure usage of functional::map by declare + reserve + loop.
Cleanup a usage of functional::map that is deemed too obscure in
`reindexAffineIndices`. Also fix a stale comment in `reindexAffineIndices`.

PiperOrigin-RevId: 231211184
2019-03-29 15:41:08 -07:00
Chris Lattner b42bea215a Change AffineApplyOp to produce a single result, simplifying the code that
works with it, and updating the g3docs.

PiperOrigin-RevId: 231120927
2019-03-29 15:40:38 -07:00
River Riddle 36babbd781 Change the ForInst induction variable to be a block argument of the body instead of the ForInst itself. This is a necessary step in converting ForInst into an operation.
PiperOrigin-RevId: 231064139
2019-03-29 15:40:23 -07:00
Nicolas Vasilache 0e7a8a9027 Drop AffineMap::Null and IntegerSet::Null
Addresses b/122486036

This CL addresses some leftover crumbs in AffineMap and IntegerSet by removing
the Null method and cleaning up the constructors.

As the ::Null uses were tracked down, opportunities appeared to untangle some
of the Parsing logic and make it explicit where AffineMap/IntegerSet have
ambiguous syntax. Previously, ambiguous cases were hidden behind the implicit
pointer values of AffineMap* and IntegerSet* that were passed as function
parameters. Depending the values of those pointers one of 3 behaviors could
occur.

This parsing logic convolution is one of the rare cases where I would advocate
for code duplication. The more proper fix would be to make the syntax
unambiguous or to allow some lookahead.

PiperOrigin-RevId: 231058512
2019-03-29 15:40:08 -07:00
Nicolas Vasilache 81c7f2e2f3 Cleanup resource management and rename recursive matchers
This CL follows up on a memory leak issue related to SmallVector growth that
escapes the BumpPtrAllocator.
The fix is to properly use ArrayRef and placement new to define away the
issue.

The following renaming is also applied:
1. MLFunctionMatcher -> NestedPattern
2. MLFunctionMatches -> NestedMatch

As a consequence all allocations are now guaranteed to live on the BumpPtrAllocator.

PiperOrigin-RevId: 231047766
2019-03-29 15:39:53 -07:00
River Riddle 75c21e1de0 Wrap cl::opt flags within passes in a category with the pass name. This improves the help output of tools like mlir-opt.
Example:

dma-generate options:

  -dma-fast-mem-capacity                 - Set fast memory space  ...
  -dma-fast-mem-space=<uint>             - Set fast memory space  ...

loop-fusion options:

  -fusion-compute-tolerance=<number>     - Fractional increase in  ...
  -fusion-maximal                        - Enables maximal loop fusion

loop-tile options:

  -tile-size=<uint>                      - Use this tile size for  ...

loop-unroll options:

  -unroll-factor=<uint>                  - Use this unroll factor  ...
  -unroll-full                           - Fully unroll loops
  -unroll-full-threshold=<uint>          - Unroll all loops with  ...
  -unroll-num-reps=<uint>                - Unroll innermost loops  ...

loop-unroll-jam options:

  -unroll-jam-factor=<uint>              - Use this unroll jam factor ...

PiperOrigin-RevId: 231019363
2019-03-29 15:39:38 -07:00
Chris Lattner 146ad7cf43 Finish removing multi-result affine maps from the testsuite, and disable them.
PiperOrigin-RevId: 231014261
2019-03-29 15:39:23 -07:00
Feng Liu ebac3528d0 Add an option to improve the readibility of the printed MLIR debuginfo
Use `-mlir-pretty-debuginfo` if the user wants line breaks between different callsite lines.
The print results before and after this CL are shown in the tests.

PiperOrigin-RevId: 231013812
2019-03-29 15:39:08 -07:00
Uday Bondhugula b4a1443508 Update replaceAllMemRefUsesWith to generate single result affine_apply's for
index remapping
- generate a sequence of single result affine_apply's for the index remapping
  (instead of one multi result affine_apply)
- update dma-generate and loop-fusion test cases; while on this, change test cases
  to use single result affine apply ops
- some fusion comment fix/cleanup

PiperOrigin-RevId: 230985830
2019-03-29 15:38:23 -07:00
Nicolas Vasilache 629f5b7fcb Add a simple arity-agnostic invocation of JIT-compiled functions.
This is useful to call generic function with unspecified number of arguments
e.g. when interfacing with ML frameworks.

PiperOrigin-RevId: 230974736
2019-03-29 15:38:08 -07:00
Uday Bondhugula b588d58c5f Update createAffineComputationSlice to generate single result affine maps
- Update createAffineComputationSlice to generate a sequence of single result
  affine apply ops instead of one multi-result affine apply
- update pipeline-data-transfer test case; while on this, also update the test
  case to use only single result affine maps, and make it more robust to
  change.

PiperOrigin-RevId: 230965478
2019-03-29 15:37:53 -07:00
River Riddle c3424c3c75 Allow operations to hold a blocklist and add support for parsing/printing a block list for verbose printing.
PiperOrigin-RevId: 230951462
2019-03-29 15:37:37 -07:00
Alex Zinenko 6d37a255e2 Generic dialect conversion pass exercised by LLVM IR lowering
This commit introduces a generic dialect conversion/lowering/legalization pass
and illustrates it on StandardOps->LLVMIR conversion.

It partially reuses the PatternRewriter infrastructure and adds the following
functionality:
- an actual pass;
- non-default pattern constructors;
- one-to-many rewrites;
- rewriting terminators with successors;
- not applying patterns iteratively (unlike the existing greedy rewrite driver);
- ability to change function signature;
- ability to change basic block argument types.

The latter two things required, given the existing API, to create new functions
in the same module.  Eventually, this should converge with the rest of
PatternRewriter.  However, we may want to keep two pass versions: "heavy" with
function/block argument conversion and "light" that only touches operations.

This pass creates new functions within a module as a means to change function
signature, then creates new blocks with converted argument types in the new
function.  Then, it traverses the CFG in DFS-preorder to make sure defs are
converted before uses in the dominated blocks.  The generic pass has a minimal
interface with two hooks: one to fill in the set of patterns, and another one
to convert types for functions and blocks.  The patterns are defined as
separate classes that can be table-generated in the future.

The LLVM IR lowering pass partially inherits from the existing LLVM IR
translator, in particular for type conversion.  It defines a conversion pattern
template, instantiated for different operations, and is a good candidate for
tablegen.  The lowering does not yet support loads and stores and is not
connected to the translator as it would have broken the existing flows.  Future
patches will add missing support before switching the translator in a single
patch.

PiperOrigin-RevId: 230951202
2019-03-29 15:37:23 -07:00
Mehdi Amini d9ce382fc9 Use a unique_ptr instead of manual deletion for PIMPL idiom (NFC)
PiperOrigin-RevId: 230930254
2019-03-29 15:37:07 -07:00
Lei Zhang ba1715f407 Pull TableGen op argument definitions into their own files
PiperOrigin-RevId: 230923050
2019-03-29 15:36:52 -07:00
Uday Bondhugula 95f19d558c Fix return value logic / error reporting in -dma-generate
PiperOrigin-RevId: 230906158
2019-03-29 15:36:23 -07:00
Alex Zinenko 5a4403787f Simple CPU runner
This implements a simple CPU runner based on LLVM Orc JIT.  The base
functionality is provided by the ExecutionEngine class that compiles and links
the module, and provides an interface for obtaining function pointers to the
JIT-compiled MLIR functions and for invoking those functions directly.  Since
function pointers need to be casted to the correct pointer type, the
ExecutionEngine wraps LLVM IR functions obtained from MLIR into a helper
function with the common signature `void (void **)` where the single argument
is interpreted as a list of pointers to the actual arguments passed to the
function, eventually followed by a pointer to the result of the function.
Additionally, the ExecutionEngine is set up to resolve library functions to
those available in the current process, enabling support for, e.g., simple C
library calls.

For integration purposes, this also provides a simplistic runtime for memref
descriptors as expected by the LLVM IR code produced by MLIR translation.  In
particular, memrefs are transformed into LLVM structs (can be mapped to C
structs) with a pointer to the data, followed by dynamic sizes.  This
implementation only supports statically-shaped memrefs of type float, but can
be extened if necessary.

Provide a binary for the runner and a test that exercises it.

PiperOrigin-RevId: 230876363
2019-03-29 15:36:08 -07:00
MLIR Team 5c5739d42b Change the dependence check in the loop fusion pass to use the MLIR instruction list ordering (instead of the dependence graph node id ordering). This breaks the overloading of dependence graph node ids as both edge endpoints and instruction list position.
PiperOrigin-RevId: 230849232
2019-03-29 15:35:53 -07:00
Uday Bondhugula f94b15c247 Update dma-generate: update for multiple load/store op's per memref
- introduce a way to compute union using symbolic rectangular bounding boxes
- handle multiple load/store op's to the same memref by taking a union of the regions
- command-line argument to provide capacity of the fast memory space
- minor change to replaceAllMemRefUsesWith to not generate affine_apply if the
  supplied index remap was identity

PiperOrigin-RevId: 230848185
2019-03-29 15:35:38 -07:00
River Riddle 4a7dfa7882 Add order bit to instructions to lazily track dominance queries. This improves the performance of dominance queries, which are used quite often within the compiler(especially within the verifier).
This reduced the execution time of a few internal tests from ~2 minutes to ~4 seconds.

PiperOrigin-RevId: 230819723
2019-03-29 15:35:23 -07:00
Uday Bondhugula 06d21d9f64 loop-fusion: debug info cleanup
PiperOrigin-RevId: 230817383
2019-03-29 15:35:08 -07:00
Chris Lattner 934b6d125f Introduce a new operation hook point for implementing simple local
canonicalizations of operations.  The ultimate important user of this is
going to be a funcBuilder->foldOrCreate<YourOp>(...) API, but for now it
is just a more convenient way to write certain classes of canonicalizations
(see the change in StandardOps.cpp).

NFC.

PiperOrigin-RevId: 230770021
2019-03-29 15:34:35 -07:00
River Riddle 451869f394 Add cloning functionality to Block and Function, this also adds support for remapping successor block operands of terminator operations. We define a new BlockAndValueMapping class to simplify mapping between cloned values.
PiperOrigin-RevId: 230768759
2019-03-29 15:34:20 -07:00
Uday Bondhugula 72e5c7f428 Minor updates + cleanup to dma-generate
- switch some debug info to emitError
- use a single constant op for zero index to make it easier to write/update
  test cases; avoid creating new constant op's for common zero index cases
- test case cleanup

This is in preparation for an upcoming major update to this pass.

PiperOrigin-RevId: 230728379
2019-03-29 15:34:06 -07:00
River Riddle f319bbbd28 Add a function pass to strip debug info from functions and instructions.
PiperOrigin-RevId: 230654315
2019-03-29 15:33:50 -07:00
River Riddle 98c729d6f1 Change trailing locations printing to also print unknown locations. This will allow for truly round tripping debug locations given that we assign locations while parsing IR.
PiperOrigin-RevId: 230627191
2019-03-29 15:33:35 -07:00
River Riddle 6859f33292 Migrate VectorOrTensorType/MemRefType shape api to use int64_t instead of int.
PiperOrigin-RevId: 230605756
2019-03-29 15:33:20 -07:00
Feng Liu b64998a6b3 Add a method to construct a CallSiteLoc which represents a stack of locations.
PiperOrigin-RevId: 230592860
2019-03-29 15:33:05 -07:00
River Riddle 1210e92d86 Add asmparser/printer support for locations to make them round-trippable. Location printing is currently behind a command line flag "mlir-print-debuginfo", we can rethink this when we have a pass for stripping debug info or when we have support for printer flags.
Example inline notation:

  trailing-location ::= 'loc' '(' location ')'

  // FileLineCol Location.
  %1 = "foo"() : () -> i1 loc("mysource.cc":10:8)

  // Name Location
  return loc("foo")

  // CallSite Location
  return loc(callsite("foo" at "mysource.cc":19:9))

  // Fused Location
  /// Without metadata
  func @inline_notation() loc(fused["foo", "mysource.cc":10:8])

  /// With metadata
  return loc(fused<"myPass">["foo", "foo2"])

  // Unknown location.
  return loc(unknown)

Locations are currently only printed with inline notation at the line of each instruction. Further work is needed to allow for reference notation, e.g:
     ...
     return loc 1
   }
   ...
   loc 1 = "source.cc":10:1

PiperOrigin-RevId: 230587621
2019-03-29 15:32:49 -07:00
Lei Zhang 5654450853 Unify terms regarding assembly form to use generic vs. custom
This CL just changes various docs and comments to use the term "generic" and
"custom" when mentioning assembly forms. To be consist, several methods are
also renamed:

* FunctionParser::parseVerboseOperation() -> parseGenericOperation()
* ModuleState::hasShorthandForm() -> hasCustomForm()
* OpAsmPrinter::printDefaultOp() -> printGenericOp()

PiperOrigin-RevId: 230568819
2019-03-29 15:32:35 -07:00
MLIR Team b28009b681 Fix single producer check in loop fusion pass.
PiperOrigin-RevId: 230565482
2019-03-29 15:32:20 -07:00
Uday Bondhugula 864d9e02a1 Update fusion cost model + some additional infrastructure and debug information for -loop-fusion
- update fusion cost model to fuse while tolerating a certain amount of redundant
  computation; add cl option -fusion-compute-tolerance
  evaluate memory footprint and intermediate memory reduction
- emit debug info from -loop-fusion showing what was fused and why
- introduce function to compute memory footprint for a loop nest
- getMemRefRegion readability update - NFC

PiperOrigin-RevId: 230541857
2019-03-29 15:32:06 -07:00
Nicolas Vasilache e4020c2d1a Add support for Return in EDSCs
This CL adds the Return op to EDSCs types and emitter.
This allows generating full function bodies that can be compiled all the way
down to LLVMIR and executed on CPU.

At this point, the MLIR lacks the testing infrastructure to exercise this.
End-to-end testing of full functions written in EDSCs is left for a future CL.

PiperOrigin-RevId: 230527530
2019-03-29 15:31:50 -07:00
Uday Bondhugula 92e9d9484c loop unroll update: unroll factor one for a single iteration loop
- unrolling a single iteration loop by a factor of one should promote its body
  into its parent; this makes it consistent with the behavior/expectation that
  unrolling a loop by a factor equal to its trip count makes the loop go away.

PiperOrigin-RevId: 230426499
2019-03-29 15:31:35 -07:00
Uday Bondhugula 1b735dfe27 Refactor -dma-generate walker - NFC
- ForInst::walkOps will also be used in an upcoming CL (cl/229438679); better to have
  this instead of deriving from the InstWalker

PiperOrigin-RevId: 230413820
2019-03-29 15:31:03 -07:00
Uday Bondhugula 7669204304 Improve / fix documentation for affine map composition utilities - NFC
- improve/fix doc comments for affine apply composition related methods.
- drop makeSingleValueComposedAffineApply - really redundant and out of line in
  a public API; it's just returning the first result of the composed affine
  apply op, and not making a single result affine map or an affine_apply op.

PiperOrigin-RevId: 230406169
2019-03-29 15:30:47 -07:00
Uday Bondhugula 94a03f864f Allocate private/local buffers for slices accurately during fusion
- the size of the private memref created for the slice should be based on
  the memref region accessed at the depth at which the slice is being
  materialized, i.e., symbolic in the outer IVs up until that depth, as opposed
  to the region accessed based on the entire domain.

- leads to a significant contraction of the temporary / intermediate memref
  whenever the memref isn't reduced to a single scalar (through store fwd'ing).

Other changes

- update to promoteIfSingleIteration - avoid introducing unnecessary identity
  map affine_apply from IV; makes it much easier to write and read test cases
  and pass output for all passes that use promoteIfSingleIteration; loop-fusion
  test cases become much simpler

- fix replaceAllMemrefUsesWith bug that was exposed by the above update -
  'domInstFilter' could be one of the ops erased due to a memref replacement in
  it.

- fix getConstantBoundOnDimSize bug: a division by the coefficient of the identifier was
  missing (the latter need not always be 1); add lbFloorDivisors output argument

- rename getBoundingConstantSizeAndShape -> getConstantBoundingSizeAndShape

PiperOrigin-RevId: 230405218
2019-03-29 15:30:31 -07:00
MLIR Team 71495d58a7 Handle escaping memrefs in loop fusion pass:
*) Do not remove loop nests which write to memrefs which escape the function.
*) Do not remove memrefs which escape the function (e.g. are used in the return instruction).

PiperOrigin-RevId: 230398630
2019-03-29 15:30:14 -07:00
Jacques Pienaar 34c6f8c6e4 Add default attr value & define tf.AvgPool op and use pattern for rewrite.
Add default values to attributes, to allow attribute being left unspecified.  The attr getter will always return an attribute so callers need not check for it, if the attribute is not set then the default will be returned (at present the default will be constructed upon query but this will be changed).

Add op definition for tf.AvgPool in ops.td, rewrite matcher using pattern using attribute matching & transforms. Adding some helper functions to make it simpler.

Handle attributes with dialect prefix and map them to getter without dialect prefix.

Note: VerifyAvgPoolOp could probably be autogenerated by know given the predicate specification on attributes, but deferring that to a follow up.
PiperOrigin-RevId: 230364857
2019-03-29 15:29:59 -07:00
Uday Bondhugula d2aaa175ca Fix FlatAffineConstraints::removeIdRange
- the number of symbols/local ids was being incorrectly updated; the code in
  cl/230112574 exposes this.

PiperOrigin-RevId: 230358327
2019-03-29 15:29:44 -07:00
Jacques Pienaar a280e3997e Start doc generation pass.
Start doc generation pass that generates simple markdown output. The output is formatted simply[1] in markdown, but this allows seeing what info we have, where we can refine the op description (e.g., the inputs is probably redundant), what info is missing (e.g., the attributes could probably have a description).

The formatting of the description is still left up to whatever was in the op definition (which luckily, due to the uniformity in the .td file, turned out well but relying on the indentation there is fragile). The mechanism to autogenerate these post changes has not been added yet either. The output file could be run through a markdown formatter too to remove extra spaces.

[1]. This is not proposal for final style :) There could also be a discussion around single doc vs multiple (per dialect, per op), whether we want a TOC, whether operands/attributes should be headings or just formatted differently ...

PiperOrigin-RevId: 230354538
2019-03-29 15:29:29 -07:00
Lei Zhang 57aade19b3 Add assertions to SplatElementsAttr and ConstantOp builders and fix failures
1) Fix FloatAttr type inconsistency in conversion from tf.FusedBatchNorm to TFLite ops

We used to compose the splat tensor out of the scalar epsilon attribute by using the
type of the variance operand. However, the epsilon attribute may have a different
bitwidth than the one in the variance operand. So it ends up we were creating
inconsistent types within the FloatAttr itself.

2) Fix SplatElementsAttr type inconsistency in AnnotateInputArrays

We need to create the zero-valued attribute according to the type provided as the
command-line arguments.

3) Concretize the result type of tf.Shape constant folding test case

Currently the resultant constant is created by the constant folding harness, using
the result type of the original op as the constant's result type. That can be
a different type than the constant's internal DenseElementsAttr.

PiperOrigin-RevId: 230244665
2019-03-29 15:28:59 -07:00
Uday Bondhugula c1880a857d AffineExpr pretty print - add missing handling to print expr * - 1 as -expr
- print multiplication by -1 as unary negate; expressions like s0 * -1, d0 * -1
  + d1 will now appear as -s0, -d0 + d1 resp.
- a minor cleanup while on printAffineExprInternal

PiperOrigin-RevId: 230222151
2019-03-29 15:28:44 -07:00
River Riddle 512d87cefc Add a constant folding hook to ExtractElementOp to fold extracting the element of a constant. This also adds a 'getValue' function to DenseElementsAttr and SparseElementsAttr to get the element at a constant index.
PiperOrigin-RevId: 230098938
2019-03-29 15:28:28 -07:00
Nicolas Vasilache 119af6712e Cleanup spurious printing bits in EDSCs
This CL also makes ScopedEDSCContexts to reset the Bindable numbering when
creating a new context.
This is useful to write minimal tests that don't use FileCheck pattern
captures for now.

PiperOrigin-RevId: 230079997
2019-03-29 15:28:13 -07:00
Nicolas Vasilache 9f3f39d61a Cleanup EDSCs
This CL performs a bunch of cleanups related to EDSCs that are generally
useful in the context of using them with a simple wrapping C API (not in this
CL) and with simple language bindings to Python and Swift.

PiperOrigin-RevId: 230066505
2019-03-29 15:27:58 -07:00
River Riddle 174f66bc8a Restructure FloatAttr::get(Type, double) to allow for loss of precision when converting the double value to the target type semantics. A comment is added to discourage the use of this method for non simple constants. The new handling also removes the direct use of the float constructor for APFloat to avoid runtime float cast asan errors.
PiperOrigin-RevId: 230014696
2019-03-29 15:27:44 -07:00
River Riddle b04c9a47ca Fix raw buffer size when creating a DenseElementsAttr from an array of attributes.
PiperOrigin-RevId: 229973134
2019-03-29 15:27:13 -07:00
Lei Zhang 1e484b5ef4 Mark (void)indexRemap to please compiler for unused variable check
PiperOrigin-RevId: 229957023
2019-03-29 15:26:59 -07:00
River Riddle a1c0da42ec Rewrite OpStats to use llvm formatting utilities.
Example Output:

Operations encountered:
-----------------------
      addf                  , 11
      constant              , 4
      return                , 19
      some_op               , 1
   tf.AvgPool               , 3
   tf.DepthwiseConv2dNative , 3
   tf.FusedBatchNorm        , 2
  tfl.add                   , 7
  tfl.average_pool_2d       , 1
  tfl.leaky_relu            , 1

PiperOrigin-RevId: 229937190
2019-03-29 15:26:29 -07:00
MLIR Team c4237ae990 LoopFusion: Creates private MemRefs which are used only by operations in the fused loop.
*) Enables reduction of private memref size based on MemRef region accessed by fused slice.
*) Enables maximal fusion by creating a private memref to break a fusion-preventing dependence.
*) Adds maximal fusion flag to enable fusing as much as possible (though it still fuses the minimum cost computation slice).

PiperOrigin-RevId: 229936698
2019-03-29 15:26:15 -07:00
Nicolas Vasilache 24e5a72dac Fix AffineApply corner case
This CL adds a test reported by andydavis@ and fixes the corner case that
appears when operands do not come from an AffineApply and no Dim composition
is needed.

In such cases, we would need to create an empty map which is disallowed.
The composition in such cases becomes trivial: there is no composition.

This CL also updates the name AffineNormalizer to AffineApplyNormalizer.

PiperOrigin-RevId: 229819234
2019-03-29 15:25:59 -07:00
River Riddle 0e81d7c420 [MLIR] Add functionality for constructing a DenseElementAttr from an array of attributes and rerwite DenseElementsAttr::writeBits/readBits to handle non uniform bitwidths. This fixes asan failures that happen when using non uniform bitwidths.
PiperOrigin-RevId: 229815107
2019-03-29 15:25:45 -07:00
Uday Bondhugula 40f7535571 Update stale / target-specific information in comments - NFC
PiperOrigin-RevId: 229800834
2019-03-29 15:25:29 -07:00
Jacques Pienaar d6f84fa5d9 Add AttrConstraint to enable generating verification for attribute values.
Change MinMaxAttr to match hasValidMinMaxAttribute behavior. Post rewriting the other users of that function it could be removed too. The currently generated error message is:

error: 'tfl.fake_quant' op attribute 'minmax' failed to satisfy constraint of MinMaxAttr
PiperOrigin-RevId: 229775631
2019-03-29 15:25:13 -07:00
Smit Hinsu 0eebe6ffd9 Update comment in the constant folding pass as constant folding is supported even when not all operands are constants
PiperOrigin-RevId: 229670189
2019-03-29 15:24:28 -07:00
Nicolas Vasilache 4573a8da9a Fix improperly indexed DimOp in LowerVectorTransfers.cpp
This CL fixes a misunderstanding in how to build DimOp which triggered
execution issues in the CPU path.

The problem is that, given a `memref<?x4x?x8x?xf32>`, the expressions to
construct the dynamic dimensions should be:
`dim %arg, 0 : memref<?x4x?x8x?xf32>`
`dim %arg, 2 : memref<?x4x?x8x?xf32>`
and
`dim %arg, 4 : memref<?x4x?x8x?xf32>`

Before this CL, we wold construct:
`dim %arg, 0 : memref<?x4x?x8x?xf32>`
`dim %arg, 1 : memref<?x4x?x8x?xf32>`
`dim %arg, 2 : memref<?x4x?x8x?xf32>`

and expect the other dimensions to be constants.
This assumption seems consistent at first glance with the syntax of alloc:

```
    %tensor = alloc(%M, %N, %O) : memref<?x4x?x8x?xf32>
```

But this was actuallyincorrect.

This CL also makes the relevant functions available to EDSCs and removes
duplication of the incorrect function.

PiperOrigin-RevId: 229622766
2019-03-29 15:24:13 -07:00
Uday Bondhugula c1ca23ef6e Some loop fusion code cleanup/simplification post cl/229575126
- enforce the assumptions better / in a simpler way

PiperOrigin-RevId: 229612424
2019-03-29 15:23:43 -07:00
Lei Zhang 3766332533 Change impl::printBinaryOp() to consider operand and result type
The operand and result types of binary ops are not necessarily the
same. For those binary ops, we cannot print in the short-form assembly.

Enhance impl:::printBinaryOp to consider operand and result types
to select which assembly form to use.

PiperOrigin-RevId: 229608142
2019-03-29 15:23:28 -07:00
River Riddle 5843e5a7c0 Add a canonicalization pattern to remove Dealloc operations if the memref is an AllocOp that is only used by Dealloc operations.
PiperOrigin-RevId: 229606558
2019-03-29 15:23:13 -07:00
Alex Zinenko 05b02bb98e TableGen: implement predicate tree and basic simplification
A recent change in TableGen definitions allowed arbitrary AND/OR predicate
compositions at the cost of removing known-true predicate simplification.
Introduce a more advanced simplification mechanism instead.

In particular, instead of folding predicate C++ expressions directly in
TableGen, keep them as is and build a predicate tree in TableGen C++ library.
The predicate expression-substitution mechanism, necessary to implement complex
predicates for nested classes such as `ContainerType`, is replaced by a
dedicated predicate.  This predicate appears in the predicate tree and can be
used for tree matching and separation.  More specifically, subtrees defined
below such predicate may be subject to different transformations than those
that appear above.  For example, a subtree known to be true above the
substitution predicate is not necessarily true below it.

Use the predicate tree structure to eliminate known-true and known-false
predicates before code emission, as well as to collapse AND and OR predicates
if their value can be deduced based on the value of one child.

PiperOrigin-RevId: 229605997
2019-03-29 15:22:58 -07:00
Jacques Pienaar 4b2b5f5267 Enable specifying the op for which the reference implementation should be printed.
Allows emitting reference implementation of multiple ops inside the test lowering pass.

PiperOrigin-RevId: 229603494
2019-03-29 15:22:43 -07:00
River Riddle ada685f352 Add canonicalization to remove AllocOps if there are no uses. AllocOp has side effects on the heap, but can still be deleted if it has zero uses.
PiperOrigin-RevId: 229596556
2019-03-29 15:22:28 -07:00
Jacques Pienaar a5827fc91d Add attribute matching and transform to pattern rewrites.
Start simple with single predicate match & transform rules for attributes.
* Its unclear whether modelling Attr predicates will be needed so start with allowing matching attributes with a single predicate.
*  The input and output attr type often differs and so add ability to specify a transform between the input and output format.

PiperOrigin-RevId: 229580879
2019-03-29 15:22:14 -07:00
MLIR Team 27d067e164 LoopFusion improvements:
*) Adds support for fusing into consumer loop nests with multiple loads from the same memref.
*) Adds support for reducing slice loop trip count by projecting out destination loop IVs greater than destination loop depth.
*) Removes dependence on src loop depth and simplifies cost model computation.

PiperOrigin-RevId: 229575126
2019-03-29 15:21:59 -07:00
Jacques Pienaar 9d4bb57189 Start a testing pass for EDSC lowering.
This is mostly plumbing to start allowing testing EDSC lowering. Prototype specifying reference implementation using verbose format without any generation/binding support. Add test pass that dumps the constructed EDSC (of which there can only be one). The idea is to enable iterating from multiple sides, this is wrong on many dimensions at the moment.

PiperOrigin-RevId: 229570535
2019-03-29 15:21:44 -07:00
Alex Zinenko bd161ae5bc TableGen: untie Attr from Type
In TableGen definitions, the "Type" class has been used for types of things
that can be stored in Attributes, but not necessarily present in the MLIR type
system.  As a consequence, records like "String" or "DerviedAttrBody" were of
class "Type", which can be confusing.  Furthermore, the "builderCall" field of
the "Type" class serves only for attribute construction.  Some TableGen "Type"
subclasses that correspond to MLIR kinds of types do not have a canonical way
of construction only from the data available in TableGen, e.g. MemRefType would
require the list of affine maps.  This leads to a conclusion that the entities
that describe types of objects appearing in Attributes should be independent of
"Type": they have some properties "Type"s don't and vice versa.

Do not parameterize Tablegen "Attr" class by an instance of "Type".  Instead,
provide a "constBuilderCall" field that can be used to build an attribute from
a constant value stored in TableGen instead of indirectly going through
Attribute.Type.builderCall.  Some attributes still don't have a
"constBuilderCall" because they used to depend on types without a
"builderCall".

Drop definitions of class "Type" that don't correspond to MLIR Types.  Provide
infrastructure to define type-dependent attributes and string-backed attributes
for convenience.

PiperOrigin-RevId: 229570087
2019-03-29 15:21:28 -07:00
Lei Zhang 590012772d Promote broadcast logic from TensorFlowLite to Dialect/ directory
We also need the broadcast logic in the TensorFlow dialect. Move it to a
Dialect/ directory for a broader scope. This Dialect/ directory is intended
for code not in core IR, but can potentially be shared by multiple dialects.

Apart from fixing TensorFlow op TableGen to use this trait, this CL only
contains mechanical code shuffling.

PiperOrigin-RevId: 229563911
2019-03-29 15:21:14 -07:00
Uday Bondhugula f99a44a7cd Address documentation/readability related comments from cl/227252907 on memref
store forwarding - NFC.

PiperOrigin-RevId: 229561933
2019-03-29 15:20:59 -07:00
River Riddle 18fe1ffcd7 Move the storage of uniqued TypeStorage objects into TypeUniquer and give each context a unique TypeUniquer instance.
PiperOrigin-RevId: 229460053
2019-03-29 15:19:56 -07:00
Uday Bondhugula 03e15e1b9f Minor code cleanup - NFC.
- readability changes

PiperOrigin-RevId: 229443430
2019-03-29 15:19:41 -07:00
Lei Zhang b7dbfd04eb Const fold splat tensors for TFLite AddOp, SubOp, MulOp
The constant folding rules assumes value attributes of operands are already
verified to be in good standing.

For each op in the above, the constant folding rules support both integer and
floating point cases. Broadcast behavior is also supported as per the semantics
of TFLite ops.

This CL does not handle overflow/underflow cases yet.

PiperOrigin-RevId: 229441221
2019-03-29 15:19:26 -07:00
River Riddle f9d2eb1c8c Change derived type storage objects to define an 'operator==(const KeyTy &)' instead of converting to the KeyTy. This allows for handling cases where the KeyTy does not provide an equality operator on itself.
PiperOrigin-RevId: 229423249
2019-03-29 15:19:11 -07:00
River Riddle f8341cfe06 Verify that the parsed predicate attribute of a cmpi operation is a string.
PiperOrigin-RevId: 229419703
2019-03-29 15:18:53 -07:00
Alex Zinenko 0e58de70e7 Initial version of the LLVM IR dialect
LLVM IR types are defined using MLIR's extendable type system.  The dialect
provides the only type kind, LLVMType, that wraps an llvm::Type*.  Since LLVM
IR types are pointer-unique, MLIR type systems relies on those pointers to
perform its own type unique'ing.  Type parsing and printing is delegated to
LLVM libraries.

Define MLIR operations for the LLVM IR instructions currently used by the
translation to the LLVM IR Target to simplify eventual transition.  Operations
classes are defined using TableGen.  LLVM IR instruction operands that are only
allowed to take constant values are accepted as attributes instead.  All
operations are using verbose form for printing and parsing.

PiperOrigin-RevId: 229400375
2019-03-29 15:18:37 -07:00
Alex Zinenko 44e9869f1a TableGen: extract TypeConstraints from Type
MLIR has support for type-polymorphic instructions, i.e. instructions that may
take arguments of different types.  For example, standard arithmetic operands
take scalars, vectors or tensors.  In order to express such instructions in
TableGen, we need to be able to verify that a type object satisfies certain
constraints, but we don't need to construct an instance of this type.  The
existing TableGen definition of Type requires both.  Extract out a
TypeConstraint TableGen class to define restrictions on types.  Define the Type
TableGen class as a subclass of TypeConstraint for consistency.  Accept records
of the TypeConstraint class instead of the Type class as values in the
Arguments class when defining operators.

Replace the predicate logic TableGen class based on conjunctive normal form
with the predicate logic classes allowing for abitrary combinations of
predicates using Boolean operators (AND/OR/NOT).  The combination is
implemented using simple string rewriting of C++ expressions and, therefore,
respects the short-circuit evaluation order.  No logic simplification is
performed at the TableGen level so all expressions must be valid C++.
Maintaining CNF using TableGen only would have been complicated when one needed
to introduce top-level disjunction.  It is also unclear if it could lead to a
significantly simpler emitted C++ code.  In the future, we may replace inplace
predicate string combination with a tree structure that can be simplified in
TableGen's C++ driver.

Combined, these changes allow one to express traits like ArgumentsAreFloatLike
directly in TableGen instead of relying on C++ trait classes.

PiperOrigin-RevId: 229398247
2019-03-29 15:18:23 -07:00
Uday Bondhugula 4598dafa30 Parsing DmaStartOp: check if source, destination, and tag are of memref type.
- fix along the lines of cl/229390720 by @riverriddle

PiperOrigin-RevId: 229395218
2019-03-29 15:18:07 -07:00
River Riddle d50dc4fd6d When parsing DmaWait, check that the tag is a MemRef type.
PiperOrigin-RevId: 229390720
2019-03-29 15:17:52 -07:00