Commit Graph

498 Commits

Author SHA1 Message Date
Lei Zhang d2d7c11f19 Auto-generate op builder with TableGen
If no custom builder is supplied for an op, TableGen now generates
a default builder for it with the following signature:

  static void build(Builder *builder, OperationState* result,
                    <list-of-all-result-types>,
                    <list-of-all-operands>,
                    <list-of-all-attributes>);

PiperOrigin-RevId: 224382473
2019-03-29 14:21:51 -07:00
Nicolas Vasilache 13bc77045e [MLIR] Drop assert for NYI in Vectorize.cpp
This CLs adds proper error emission, removes NYI assertions and documents
assumptions that are required in the relevant functions.

PiperOrigin-RevId: 224377207
2019-03-29 14:21:37 -07:00
Nicolas Vasilache 2408f0eba5 [MLIR] Drop assert for NYI in VectorAnalysis
This CLs adds proper error emission, removes NYI assertions and documents
assumptions that are required in the relevant functions.

PiperOrigin-RevId: 224377143
2019-03-29 14:21:22 -07:00
Nicolas Vasilache 48d22e83e3 [MLIR] Drop unnecessary mention of NYI.
This CL also documents the `substExpr` helper function assumptions.
The assumptions are properly propagated up already.

PiperOrigin-RevId: 224377072
2019-03-29 14:21:07 -07:00
Nicolas Vasilache a019379cdb [MLIR] Remove NYI assertions in LoopAnalysis.cpp
This CL also cleans up some loose ends and returns conservative answers while
emitting errors in the NYI cases.

PiperOrigin-RevId: 224377004
2019-03-29 14:20:52 -07:00
Nicolas Vasilache 5b610630b2 [MLIR] Error handling in MaterializeVectors
This removes assertions as a means to capture NYI behavior and propagates
errors up.

PiperOrigin-RevId: 224376935
2019-03-29 14:20:37 -07:00
Nicolas Vasilache 4adc169bd0 [MLIR] Add AffineMap composition and use it in Materialization
This CL adds the following free functions:
```
/// Returns the AffineExpr e o m.
AffineExpr compose(AffineExpr e, AffineMap m);
/// Returns the AffineExpr f o g.
AffineMap compose(AffineMap f, AffineMap g);
```

This addresses the issue that AffineMap composition is only available at a
distance via AffineValueMap and is thus unusable on Attributes.
This CL thus implements AffineMap composition in a more modular and composable
way.

This CL does not claim that it can be a good replacement for the
implementation in AffineValueMap, in particular it does not support bounded
maps atm.

Standalone tests are added that replicate some of the logic of the AffineMap
composition pass.

Lastly, affine map composition is used properly inside MaterializeVectors and
a standalone test is added that requires permutation_map composition with a
projection map.

PiperOrigin-RevId: 224376870
2019-03-29 14:20:22 -07:00
Nicolas Vasilache df0a25efee [MLIR] Add support for permutation_map
This CL hooks up and uses permutation_map in vector_transfer ops.
In particular, when going into the nuts and bolts of the implementation, it
became clear that cases arose that required supporting broadcast semantics.
Broadcast semantics are thus added to the general permutation_map.
The verify methods and tests are updated accordingly.

Examples of interest include.

Example 1:
The following MLIR snippet:
```mlir
   for %i3 = 0 to %M {
     for %i4 = 0 to %N {
       for %i5 = 0 to %P {
         %a5 = load %A[%i4, %i5, %i3] : memref<?x?x?xf32>
   }}}
```
may vectorize with {permutation_map: (d0, d1, d2) -> (d2, d1)} into:
```mlir
   for %i3 = 0 to %0 step 32 {
     for %i4 = 0 to %1 {
       for %i5 = 0 to %2 step 256 {
         %4 = vector_transfer_read %arg0, %i4, %i5, %i3
              {permutation_map: (d0, d1, d2) -> (d2, d1)} :
              (memref<?x?x?xf32>, index, index) -> vector<32x256xf32>
   }}}
````
Meaning that vector_transfer_read will be responsible for reading the 2-D slice:
`%arg0[%i4, %i5:%15+256, %i3:%i3+32]` into vector<32x256xf32>. This will
require a transposition when vector_transfer_read is further lowered.

Example 2:
The following MLIR snippet:
```mlir
   %cst0 = constant 0 : index
   for %i0 = 0 to %M {
     %a0 = load %A[%cst0, %cst0] : memref<?x?xf32>
   }
```
may vectorize with {permutation_map: (d0) -> (0)} into:
```mlir
   for %i0 = 0 to %0 step 128 {
     %3 = vector_transfer_read %arg0, %c0_0, %c0_0
          {permutation_map: (d0, d1) -> (0)} :
          (memref<?x?xf32>, index, index) -> vector<128xf32>
   }
````
Meaning that vector_transfer_read will be responsible of reading the 0-D slice
`%arg0[%c0, %c0]` into vector<128xf32>. This will require a 1-D vector
broadcast when vector_transfer_read is further lowered.

Additionally, some minor cleanups and refactorings are performed.

One notable thing missing here is the composition with a projection map during
materialization. This is because I could not find an AffineMap composition
that operates on AffineMap directly: everything related to composition seems
to require going through SSAValue and only operates on AffinMap at a distance
via AffineValueMap. I have raised this concern a bunch of times already, the
followup CL will actually do something about it.

In the meantime, the projection is hacked at a minimum to pass verification
and materialiation tests are temporarily incorrect.

PiperOrigin-RevId: 224376828
2019-03-29 14:20:07 -07:00
Alex Zinenko 7c89a225cf ConvertToCFG: support min/max in loop bounds.
The recently introduced `select` operation enables ConvertToCFG to support
min(max) in loop bounds.  Individual min(max) is implemented as
`cmpi "lt"`(`cmpi "gt"`) followed by a `select` between the compared values.
Multiple results of an `affine_apply` operation extracted from the loop bounds
are reduced using min(max) in a sequential manner.  While this may decrease the
potential for instruction-level parallelism, it is easier to recognize for the
following passes, in particular for the vectorizer.

PiperOrigin-RevId: 224376233
2019-03-29 14:19:52 -07:00
Alex Zinenko 513d6d896c OpPointer: replace conversion operator to Operation* to OpType*.
The implementation of OpPointer<OpType> provides an implicit conversion to
Operation *, but not to the underlying OpType *.  This has led to
awkward-looking code when an OpPointer needs to be passed to a function
accepting an OpType *.  For example,

    if (auto someOp = genericOp.dyn_cast<OpType>())
      someFunction(&*someOp);

where "&*" makes it harder to read.  Arguably, one does not want to spell out
OpPointer<OpType> in the line with dyn_cast.  More generally, OpPointer is now
being used as an owning pointer to OpType rather than to operation.

Replace the implicit conversion to Operation* with the conversion to OpType*
taking into account const-ness of the type.  An Operation* can be obtained from
an OpType with a simple call.  Since an instance of OpPointer owns the OpType
value, the pointer to it is never null.  However, the OpType value may not be
associated with any Operation*.  In this case, return nullptr when conversion
is attempted to maintain consistency with the existing null checks.

PiperOrigin-RevId: 224368103
2019-03-29 14:19:37 -07:00
Uday Bondhugula 73fc0223e4 Fix cases where unsigned / signed arithmetic was being mixed (following up on
cl/224246657); eliminate repeated evaluation of exprs in loop upper bounds.

- while on this, sweep through and fix potential repeated evaluation of
  expressions in loop upper bounds

PiperOrigin-RevId: 224268918
2019-03-29 14:19:22 -07:00
MLIR Team a53ed1b767 Fix bug in GCD calculation when flattening AffineExpr (adds unit test which triggers the bug and tests the fix).
PiperOrigin-RevId: 224246657
2019-03-29 14:19:07 -07:00
Tatiana Shpeisman 8ad72bd6be Make examples semantically meaningful and fix miscellaneous typos. Thanks to @rocky for pointing out the bugs.
PiperOrigin-RevId: 224239160
2019-03-29 14:18:52 -07:00
Uday Bondhugula 9f77faae87 Strided DMA support for DmaStartOp
- add optional stride arguments for DmaStartOp
- add DmaStartOp::verify(), and missing test cases for DMA op's in
  test/IR/memory-ops.mlir.

PiperOrigin-RevId: 224232466
2019-03-29 14:18:37 -07:00
Uday Bondhugula a92130880e Complete multiple unhandled cases for DmaGeneration / getMemRefRegion;
update/improve/clean up API.

- update FlatAffineConstraints::getConstBoundDifference; return constant
  differences between symbolic affine expressions, look at equalities as well.
- fix buffer size computation when generating DMAs symbolic in outer loops,
  correctly handle symbols at various places (affine access maps, loop bounds,
  loop IVs outer to the depth at which DMA generation is being done)
- bug fixes / complete some TODOs for getMemRefRegion
- refactor common code b/w memref dependence check and getMemRefRegion
- FlatAffineConstraints API update; added methods employ trivial checks /
  detection - sufficient to handle hyper-rectangular cases in a precise way
  while being fast / low complexity. Hyper-rectangular cases fall out as
  trivial cases for these methods while other cases still do not cause failure
  (either return conservative or return failure that is handled by the caller).

PiperOrigin-RevId: 224229879
2019-03-29 14:18:22 -07:00
Lei Zhang ff3b9149b3 Clean up base TableGen definitions
* Removed unused builder field for type definitions
* Refined comments and reordered classes

PiperOrigin-RevId: 224223038
2019-03-29 14:18:07 -07:00
Jacques Pienaar c143132a56 Enable using bare attributes.
Useful for defining ops such as <dialect>.Const where multiple kinds of attributes are legal.

PiperOrigin-RevId: 224210511
2019-03-29 14:17:53 -07:00
Lei Zhang b572322859 Add isIntOrIndex() and isIntOrIndexOrFloat() into Type
The checks for `isa<IndexType>() || isa<IntegerType>()` and
`isa<IndexType>() || isa<IntegerType>() || isa<FloatType>()`
are frequently used, so it's useful to have some helper
methods for them.

PiperOrigin-RevId: 224133596
2019-03-29 14:17:38 -07:00
Uday Bondhugula f9af62998b Remove duplicate FlatAffineConstraints::removeId - refactor to use
removeColumnRange

- remove functionally duplicate code in removeId.

- rename removeColumnRange -> removeIdRange - restrict valid input to just the
  identifier columns (not the constant term column).

PiperOrigin-RevId: 224054064
2019-03-29 14:17:24 -07:00
Uday Bondhugula 7c2347266d FlatAffineConstraints::removeId() fix.
This is an obvious bug, but none of the test cases exposed it since numIds was
correctly updated, and the dimensional identifiers were always eliminated
before the symbolic identifiers in all cases that removeId was getting
called from. However, other work in progress exercises the other scenarios and
exposes this bug.

Add an hasConsistentState() private method to move common assertion checks, and call it
from several base methods. Make hasInvalidConstraint() a private method as
well (from a file static one).

PiperOrigin-RevId: 224032721
2019-03-29 14:17:10 -07:00
Lei Zhang 86f5a467d2 Change TFLite binary ops to support implicit broadcasting
As it turns out, the TFLite runtime already supports implicit broadcasting
for math binary ops. As the instruction set for TFLite runtime, the tfl
dialect should reflect that, instead of requiring both operands for binary
ops to be of the same type.

To support implicit broadcast means it's not suitable to provide the
short-form assembly for TFLite binary ops anymore. So by default, we should
just provide the canonical-form assembly parser/printer for base binary op.
It's subclasses' choices whether to opt in to short-form.

Added BroadcastableTwoOperandsOneResult as a new dialect trait for checking
the operand and result types for TFLite binary ops.

Also added SameOperandsAndResultType to several neural network ops.

PiperOrigin-RevId: 224027445
2019-03-29 14:16:55 -07:00
MLIR Team 753109547d During forward substitution, merge symbols from input AffineMap with the symbol list of the target AffineMap.
Symbols can be used as dim identifiers and symbolic identifiers, and so we must preserve the symbolic identifies from the input AffineMap during forward substitution, even if that same identifier is used as a dimension identifier in the target AffineMap.
Test case added.

Going forward, we may want to explore solutions where we do not maintain this split between dimensions and symbols, and instead verify the validity of each use of each AffineMap operand AffineMap in the context where the AffineMap operand usage is required to be a symbol: in the denominator of floordiv/ceildiv/mod for semi-affine maps, and in instructions that can capture symbols (i.e. alloc)

PiperOrigin-RevId: 224017364
2019-03-29 14:16:40 -07:00
Jacques Pienaar f24628b1f0 Fix off by one in OpStats.
PiperOrigin-RevId: 223977444
2019-03-29 14:16:25 -07:00
Alex Zinenko 7868abd9d8 ConvertToCFG: convert "if" statements.
The condition of the "if" statement is an integer set, defined as a conjunction
of affine constraints.  An affine constraints consists of an affine expression
and a flag indicating whether the expression is strictly equal to zero or is
also allowed to be greater than zero.  Affine maps, accepted by `affine_apply`
are also formed from affine expressions.  Leverage this fact to implement the
checking of "if" conditions.  Each affine expression from the integer set is
converted into an affine map.  This map is applied to the arguments of the "if"
statement.  The result of the application is compared with zero given the
equality flag to obtain the final boolean value.  The conjunction of conditions
is tested sequentially with short-circuit branching to the "else" branch if any
of the condition evaluates to false.

Create an SESE region for the if statement (including its "then" and optional
"else" statement blocks) and append it to the end of the current region.  The
conditional region consists of a sequence of condition-checking blocks that
implement the short-circuit scheme, followed by a "then" SESE region and an
"else" SESE region, and the continuation block that post-dominates all blocks
of the "if" statement.  The flow of blocks that correspond to the "then" and
"else" clauses are constructed recursively, enabling easy nesting of "if"
statements and if-then-else-if chains.

Note that MLIR semantics does not require nor prohibit short-circuit
evaluation.  Since affine expressions do not have side effects, there is no
observable difference in the program behavior.  We may trade off extra
operations for operation-level parallelism opportunity by first performing all
`affine_apply` and comparison operations independently, and then performing a
tree pattern reduction of the resulting boolean values with the `muli i1`
operations (in absence of the dedicated bit operations).  The pros and cons are
not clear, and since MLIR does not include parallel semantics, we prefer to
minimize the number of sequentially executed operations.

PiperOrigin-RevId: 223970248
2019-03-29 14:16:10 -07:00
Alex Zinenko dee51d0961 LLVM IR Lowering: support multi-value returns.
Unlike MLIR, LLVM IR does not support functions that return multiple values.
Simulate this by packing values into the LLVM structure type in the same order
as they appear in the MLIR return.  If the function returns only a single
value, return it directly without packing.

PiperOrigin-RevId: 223964886
2019-03-29 14:15:56 -07:00
Nicolas Vasilache ebb3d38471 [MLIR] Separate and split vectorization tests
These tests have become too bulky and unwiedly.
Splitting simplifies modifications that will occur in the next CL.

PiperOrigin-RevId: 223874321
2019-03-29 14:15:40 -07:00
Nicolas Vasilache b39d1f0bdb [MLIR] Add VectorTransferOps
This CL implements and uses VectorTransferOps in lieu of the former custom
call op. Tests are updated accordingly.

VectorTransferOps come in 2 flavors: VectorTransferReadOp and
VectorTransferWriteOp.

VectorTransferOps can be thought of as a backend-independent
pseudo op/library call that needs to be legalized to MLIR (whiteboxed) before
it can be lowered to backend-dependent IR.

Note that the current implementation does not yet support a real permutation
map. Proper support will come in a followup CL.

VectorTransferReadOp
====================
VectorTransferReadOp performs a blocking read from a scalar memref
location into a super-vector of the same elemental type. This operation is
called 'read' by opposition to 'load' because the super-vector granularity
is generally not representable with a single hardware register. As a
consequence, memory transfers will generally be required when lowering
VectorTransferReadOp. A VectorTransferReadOp is thus a mid-level abstraction
that supports super-vectorization with non-effecting padding for full-tile
only code.

A vector transfer read has semantics similar to a vector load, with additional
support for:
  1. an optional value of the elemental type of the MemRef. This value
     supports non-effecting padding and is inserted in places where the
     vector read exceeds the MemRef bounds. If the value is not specified,
     the access is statically guaranteed to be within bounds;
  2. an attribute of type AffineMap to specify a slice of the original
     MemRef access and its transposition into the super-vector shape. The
     permutation_map is an unbounded AffineMap that must represent a
     permutation from the MemRef dim space projected onto the vector dim
     space.

Example:
```mlir
  %A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32>
  ...
  %val = `ssa-value` : f32
  // let %i, %j, %k, %l be ssa-values of type index
  %v0 = vector_transfer_read %src, %i, %j, %k, %l
        {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
          (memref<?x?x?x?xf32>, index, index, index, index) ->
            vector<16x32x64xf32>
  %v1 = vector_transfer_read %src, %i, %j, %k, %l, %val
        {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
          (memref<?x?x?x?xf32>, index, index, index, index, f32) ->
            vector<16x32x64xf32>
```

VectorTransferWriteOp
=====================
VectorTransferWriteOp performs a blocking write from a super-vector to
a scalar memref of the same elemental type. This operation is
called 'write' by opposition to 'store' because the super-vector
granularity is generally not representable with a single hardware register. As
a consequence, memory transfers will generally be required when lowering
VectorTransferWriteOp. A VectorTransferWriteOp is thus a mid-level
abstraction that supports super-vectorization with non-effecting padding
for full-tile only code.
A vector transfer write has semantics similar to a vector store, with
additional support for handling out-of-bounds situations.

Example:
```mlir
  %A = alloc(%size1, %size2, %size3, %size4) : memref<?x?x?x?xf32>.
  %val = `ssa-value` : vector<16x32x64xf32>
  // let %i, %j, %k, %l be ssa-values of type index
  vector_transfer_write %val, %src, %i, %j, %k, %l
    {permutation_map: (d0, d1, d2, d3) -> (d3, d1, d2)} :
  (vector<16x32x64xf32>, memref<?x?x?x?xf32>, index, index, index, index)
```
PiperOrigin-RevId: 223873234
2019-03-29 14:15:25 -07:00
Jacques Pienaar bb3ffc1c22 Fix two more getHashValues.
These were still returning the hash of the pointers resulting in the two getHashValues being different.

PiperOrigin-RevId: 223862743
2019-03-29 14:15:11 -07:00
Uday Bondhugula 89c41fdca1 FlatAffineConstraints::composeMap: return failure instead of asserting on semi-affine maps
FlatAffineConstraints::composeMap: should return false instead of asserting on
a semi-affine map. Make getMemRefRegion just propagate false when encountering
semi-affine maps (instead of crashing!)
PiperOrigin-RevId: 223828743
2019-03-29 14:14:56 -07:00
Uday Bondhugula 5f76245cfe Minor fix for replaceAllMemRefUsesWith.
The check for whether the memref was used in a non-derefencing context had to
be done inside, i.e., only for the op stmt's that the replacement was specified
to be performed on (by the domStmtFilter arg if provided). As such, it is
completely fine for example for a function to return a memref while the replacement
is being performed only a specific loop's body (as in the case of DMA
generation).

PiperOrigin-RevId: 223827753
2019-03-29 14:14:43 -07:00
River Riddle 7669a259c4 Add a simple common sub expression elimination pass.
The algorithm collects defining operations within a scoped hash table. The scopes within the hash table correspond to nodes within the dominance tree for a function. This cl only adds support for simple operations, i.e non side-effecting. Such operations, e.g. load/store/call, will be handled in later patches.

PiperOrigin-RevId: 223811328
2019-03-29 14:14:28 -07:00
Lei Zhang 5858102ab1 Remove tfl.reshape op when possible
Remove tfl.reshape for the following two cases:

1. A tfl.reshape's input is from another tfl.reshape.
   Then these two tfl.reshape ops can be merged.

2. A tfl.reshape's result type is the same as its input type.
   This tfl.reshape op does nothing, which can be removed.

These transformations are put in a new source file, Canonicalizer.cpp,
because they are TFLite op to TFLite op transformations, and aiming
to making TFLite ops more canonicalized.

Also added a hasCanonicalizationPatterns marker in TableGen Op class
to indicate whether an op has custom getCanonicalizationPatterns().

PiperOrigin-RevId: 223806921
2019-03-29 14:14:13 -07:00
Jacques Pienaar 3277f94bf4 Update getHashValue for ptr values stored in a DenseMap/Set to use getHasValue of KeyTy.
Ensures both hash values returned are the same. Tested by triggering resize of map/set and verifying failure before change.

PiperOrigin-RevId: 223651443
2019-03-29 14:13:58 -07:00
Jacques Pienaar 45e3139bc8 RankedTensorType: Use getHashValue(KeyTy) when calling getHashValue(RankedTensorTypeStorage*).
PiperOrigin-RevId: 223649958
2019-03-29 14:13:44 -07:00
Alex Zinenko 9769ba7489 Document SelectOp class
This was missing from the commit that introduced SelectOp although the
documentation was present in the LangRef.md.

PiperOrigin-RevId: 223476888
2019-03-29 14:13:29 -07:00
Jacques Pienaar 21ed46abb8 Avoid failing when attempting to print null Attribute.
This avoids segfaulting when dumping during debugging of failures.

PiperOrigin-RevId: 223449494
2019-03-29 14:13:14 -07:00
Uday Bondhugula a619b5c295 Debug output / logging memref sizes in DMA generation + related changes
- Add method to get a memref's size in bytes
- clean up a loop tiling pass helper (NFC)

PiperOrigin-RevId: 223422077
2019-03-29 14:12:56 -07:00
Nicolas Vasilache 1ae66f6520 [MLIR] Reenable materialize_vectors test
Fixes one of the Filecheck'ed test which was mistakenly disabled.

PiperOrigin-RevId: 223401978
2019-03-29 14:12:40 -07:00
River Riddle 5668887a1d Add support for result type iteration in Operation/Instruction/OperationStmt.
PiperOrigin-RevId: 223264992
2019-03-29 14:12:21 -07:00
Chris Lattner 3f2530cdf5 Split "rewrite" functionality out of Pattern into a new RewritePattern derived
class.  This change is NFC, but allows for new kinds of patterns, specifically
LegalizationPatterns which will be allowed to change the types of things they
rewrite.

PiperOrigin-RevId: 223243783
2019-03-29 14:12:07 -07:00
Lei Zhang 1f5330ac90 Verify CmpIOp's result type to be bool-like
This CL added two new traits, SameOperandsAndResultShape and
ResultsAreBoolLike, and changed CmpIOp to embody these two
traits. As a consequence, CmpIOp's result type now is verified
to be bool-like.

PiperOrigin-RevId: 223208438
2019-03-29 14:11:53 -07:00
Jacques Pienaar 16f525bc27 Add derived attribute support.
Derived attributes are attributes that are derived from other properties of the operation (e.g., the shape returned from the type). DerivedAttr is parameterized on the return type and function body.

PiperOrigin-RevId: 223180315
2019-03-29 14:11:40 -07:00
Alex Zinenko a3fb6d0da3 StandardOps: introduce 'select'.
The semantics of 'select' is conventional: return the second operand if the
first operand is true (1 : i1) and the third operand otherwise.  It is
applicable to vectors and tensors element-wise, similarly to LLVM instruction.
This operation is necessary to implement min/max to lower 'for' loops with
complex bounds to CFG functions and to support ternary operations in ML
functions.  It is preferred to first-class min/max because of its simplicity,
e.g. it is not concered with signedness.

PiperOrigin-RevId: 223160860
2019-03-29 14:11:25 -07:00
Alex Zinenko e7f43c8361 LLVM IR lowering: support 'dim' operation.
Add support for translating 'dim' opreation on MemRefs to LLVM IR.  For a
static size, this operation merely defines an LLVM IR constant value that may
not appear in the output IR if not used (and had not been removed before by
DCE).  For a dynamic size, this operation is translated into an access to the
MemRef descriptor that contains the dynamic size.

PiperOrigin-RevId: 223160774
2019-03-29 14:11:10 -07:00
Alex Zinenko 90d1b6b5f2 LLVM IR lowering: support simple MemRef types
Introduce initial support for MemRef types, including type conversion,
allocation and deallocation, read and write element-wise access, passing
MemRefs to and returning from functions.  Affine map compositions and
non-default memory spaces are NOT YET supported.

Lowered code needs to handle potentially dynamic sizes of the MemRef.  To do
so, it replaces a MemRef-typed value with a special MemRef descriptor that
carries the data and the dynamic sizes together.  A MemRef type is converted to
LLVM's first-class structure type with the first element being the pointer to
the data buffer with data layed out linearly, followed by as many integer-typed
elements as MemRef has dynamic sizes.  The type of these elements is that of
MLIR index lowered to LLVM.  For example, `memref<?x42x?xf32>` is converted to
`{ f32*, i64, i64 }` provided `index` is lowered to `i64`.  While it is
possible to convert MemRefs with fully static sizes to simple pointers to their
elemental types, we opted for consistency and convert them to the
single-element structure.  This makes the conversion code simpler and the
calling convention of the generated LLVM IR functions consistent.

Loads from and stores to a MemRef element are lowered to a sequence of LLVM
instructions that, first, computes the linearized index of the element in the
data buffer using the access indices and combining the static sizes with the
dynamic sizes stored in the descriptor, and then loads from or stores to the
buffer element indexed by the linearized subscript.  While some of the index
computations may be redundant (i.e., consecutive load and store to the same
location in the same scope could reuse the linearized index), we emit them for
every operation.  A subsequent optimization pass may eliminate them if
necessary.

MemRef allocation and deallocation is performed using external functions
`__mlir_alloc(index) -> i8*` and `__mlir_free(i8*)` that must be implemented by
the caller.  These functions behave similarly to `malloc` and `free`, but can
be extended to support different memory spaces in future.  Allocation and
deallocation instructions take care of casting the pointers.  Prior to calling
the allocation function, the emitted code creates an SSA Value for the
descriptor and uses it to store the dynamic sizes of the MemRef passed to the
allocation operation.  It further emits instructions that compute the dynamic
amount of memory to allocate in bytes.  Finally, the allocation stores the
result of calling the `__mlir_alloc` in the MemRef descriptor.  Deallocation
extracts the pointer to the allocated memory from the descriptor and calls
`__mlir_free` on it.  The descriptor itself is not modified and, being
stack-allocated, ceases to exist when it goes out of scope.

MLIR functions that access MemRef values as arguments or return them are
converted to LLVM IR functions that accept MemRef descriptors as LLVM IR
structure types by value.  This significantly simplifies the calling convention
at the LLVM IR level and avoids handling descriptors in the dynamic memory,
however is not always comaptible with LLVM IR functions emitted from C code
with similar signatures.  A separate LLVM pass may be introduced in the future
to provide C-compatible calling conventions for LLVM IR functions generated
from MLIR.

PiperOrigin-RevId: 223134883
2019-03-29 14:10:55 -07:00
River Riddle 312d8ee96b Make operation names hashable.
PiperOrigin-RevId: 223104253
2019-03-29 14:10:41 -07:00
Alex Zinenko 67939e8b70 Create Passes.md.
Start the documentation file listing available MLIR passes.  Briefly describe
the `-convert-to-cfg` and the `-lower-affine-apply` passes.  These passes
serve as description templates for other passes.  In particular, they include
the dialect and operation restrictions in the pass input and output.

PiperOrigin-RevId: 223076894
2019-03-29 14:10:27 -07:00
Jacques Pienaar 17b8105761 Fix typo.
Tensor has as element type a tensor-memref-element-type rather than a vector-element-type.

PiperOrigin-RevId: 223062135
2019-03-29 14:10:12 -07:00
Lei Zhang fce05646d7 Convert tf.FusedBatchNorm into tfl primary math ops
* Added TF::FusedBatchNormOp
* Validated TF::FusedBatchNormOp's operands
* Added converter from tf.FusedBatchNorm to tfl math ops

In the converter, we additionally check that the 'is_training'
attribute in tf.FusedBatchNorm is false and the last 4 outputs
are all not used (true for inference). These requirements do
not exist in the original TOCO source code, which just silently
ignores the last 4 outputs.

PiperOrigin-RevId: 223027333
2019-03-29 14:09:58 -07:00
River Riddle 759fd1c6a3 Add support for setting the location of an IROperandOwner.
PiperOrigin-RevId: 222995814
2019-03-29 14:09:43 -07:00