Commit Graph

1147 Commits

Author SHA1 Message Date
Chris Lattner d2d89cbc19 Rename affineint type to index type. The name 'index' may not be perfect, but is better than the old name. Here is some justification:
1) affineint (as it is named) is not a type suitable for general computation (e.g. the multiply/adds in an integer matmul).  It has undefined width and is undefined on overflow.  They are used as the indices for forstmt because they are intended to be used as indexes inside the loop.

2) It can be used in both cfg and ml functions, and in cfg functions.  As you mention, “symbols” are not affine, and we use affineint values for symbols.

3) Integers aren’t affine, the algorithms applied to them can be. :)

4) The only suitable use for affineint in MLIR is for indexes and dimension sizes (i.e. the bounds of those indexes).

PiperOrigin-RevId: 216057974
2019-03-29 13:24:16 -07:00
Uday Bondhugula d18ae9e2c7 Constant folding for loop bounds.
- Fold the lower/upper bound of a loop to a constant whenever the result of the
  application of the bound's affine map on the operand list yields a constant.

- Update/complete 'for' stmt's API to set lower/upper bounds with operands.
  Resolve TODOs for ForStmt::set{Lower,Upper}Bound.

- Moved AffineExprConstantFolder into AffineMap.cpp and added
  AffineMap::constantFold to be used by both AffineApplyOp and
  ForStmt::constantFoldBound.

PiperOrigin-RevId: 215997346
2019-03-29 13:24:01 -07:00
Uday Bondhugula f069d796f3 Fix opt build breakage - lib/Transforms/Utils.cpp
PiperOrigin-RevId: 215924308
2019-03-29 13:23:46 -07:00
Chris Lattner 6822c4e29c Implement support for constant folding operations even when their operands are
not all constant.  Implement support for folding dim, x*0, and affine_apply.

PiperOrigin-RevId: 215917432
2019-03-29 13:23:32 -07:00
Uday Bondhugula 6cfdb756b1 Introduce memref replacement/rewrite support: to replace an existing memref
with a new one (of a potentially different rank/shape) with an optional index
remapping.

- introduce Utils::replaceAllMemRefUsesWith
- use this for DMA double buffering

(This CL also adds a few temporary utilities / code that will be done away with
once:
1) abstract DMA op's are added
2) memref deferencing side-effect / trait is available on op's
3) b/117159533 is resolved (memref index computation slices).
PiperOrigin-RevId: 215831373
2019-03-29 13:23:19 -07:00
Nicolas Vasilache b55b407601 [RFC][MLIR] Use AffineExprRef in place of AffineExpr* in IR
This CL starts by replacing AffineExpr* with value-type AffineExprRef in a few
places in the IR. By a domino effect that is pretty telling of the
inconsistencies in the codebase, const is removed where it makes sense.

The rationale is that the decision was concisously made that unique'd types
have pointer semantics without const specifier. This is fine but we should be
consistent. In the end, the only logical invariant is that there should never
be such a thing as a const AffineExpr*, const AffineMap* or const IntegerSet*
in our codebase.

This CL takes a number of shortcuts to killing const with fire, in particular
forcing const AffineExprRef to return the underlying non-const
AffineExpr*. This will be removed once AffineExpr* has disappeared in
containers but for now such shortcuts allow a bit of sanity in this long quest
for cleanups.

The **only** places where const AffineExpr*, const AffineMap* or const
IntegerSet* may still appear is by transitive needs from containers,
comparison operators etc.

There is still one major thing remaining here: figure out why cast/dyn_cast
return me a const AffineXXX*, which in turn requires a bunch of ugly
const_casts. I suspect this is due to the classof
taking const AffineXXXExpr*. I wonder whether this is a side effect of 1., if
it is coming from llvm itself (I'd doubt it) or something else (clattner@?)

In light of this, the whole discussion about const makes total sense to me now
and I would systematically apply the rule that in the end, we should never
have any const XXX in our codebase for unique'd types (assuming we can remove
them all in containers and no additional constness constraint is added on us
from the outside world).

PiperOrigin-RevId: 215811554
2019-03-29 13:23:05 -07:00
Nicolas Vasilache 5b8017db18 [MLIR] Templated AffineExprBaseRef
This CL implements AffineExprBaseRef as a templated type to allow LLVM-style
casts to work properly. This also allows making AffineExprBaseRef::expr
private.

To achieve this, it is necessary to use llvm::simplify_type and make
AffineConstExpr derive from both AffineExpr and llvm::simplify<AffineExprRef>.
Note that llvm::simplify_type is just an interface to enable the proper
template resolution of isa/cast/dyn_cast but it otherwise holds no value.

Lastly note that certain dyn_cast operations wanted the const AffineExpr* form
of AffineExprBaseRef so I made the implicit constructor take that by default
and documented the immutable behavior. I think this is consistent with the
decision to make unique'd type immutable by convention and never use const on
them.

PiperOrigin-RevId: 215642247
2019-03-29 13:22:49 -07:00
Nicolas Vasilache 544f5e7a9b [MLIR] Remove uses of AffineExpr* outside of IR
This CL uniformizes the uses of AffineExprWrap outside of IR.
The public API of AffineExpr builder is modified to only use AffineExprWrap.
A few places access AffineExprWrap.expr, this is only while the API is in
transition to easily keep track (i.e. make expr private and let the compiler
track the errors).

Parser.cpp exhibits patterns that are dependent on nullptr values so
converting it is left for another CL.

PiperOrigin-RevId: 215642005
2019-03-29 13:22:35 -07:00
Nicolas Vasilache 9ef87c4b6b [MLIR] AffineExpr lightweight value type for operators
This CL proposes adding MLIRContext* to AffineExpr as discussed previously.
This allows the value class to not require the context in its constructor and
makes it a POD that it makes sense to pass by value everywhere.
A list of other RFC CLs will build on this. The RFC CLs are small incremental
pushes of the API which would be a pretty big change otherwise.

Pushing the thinking a little bit more it seems reasonable to use implicit
cast/constructor to/from AffineExpr*.
As this thing evolves, it looks to me like IR (and
probably Parser, for not so good reasons) want to operate on AffineExpr* and
the rest of the code wants to operate on the value type.

For this reason I think AffineExprImpl*/AffineExpr may also make sense but I
do not have a particular naming preference.
The jury is still out for naming decision between the above and
AffineExprBase*/AffineExpr or AffineExpr*/AffineExprRef.

PiperOrigin-RevId: 215641596
2019-03-29 13:22:21 -07:00
Nicolas Vasilache 4805e629c5 [MLIR] Use chainable ligthweight wrapper for AffineExpr
This CL argues that the builder API for AffineExpr should be used
with a lightweight wrapper that supports operators chaining.
This CL takes the ill-named AffineExprWrap and proposes a simple
set of operators with builtin constant simplifications.

This allows:
1. removing the getAddMulPureAffineExpr function;
2. avoiding concerns about constant vs non-constant simplifications
at **every call site**;
3. writing the mathematical expressions we want to write without unnecessary
obfuscations.

The points above represent pure technical debt that we don't want to carry on.
It is important to realize that this is not a mere convenience or "just sugar"
but reduction in cognitive overhead.

This thinking can be pushed significantly further, I have added some comments
with some basic ideas but we could make AffineMap, AffineApply and other
objects that use map applications more functional and value-based.

I am putting this out to get a first batch of reviews and see what people
think.

I think in my preferred design I would have the Builder directly return such
AffineExprPtr objects by value everywhere and avoid the boilerplate explicit
creations that I am doing by hand at this point.

Yes this AffineExprPtr would implicitly convert to AffineExpr* because that is
what it is.

PiperOrigin-RevId: 215641317
2019-03-29 13:22:07 -07:00
Uday Bondhugula 041817a45e Introduce loop body skewing / loop pipelining / loop shifting utility.
- loopBodySkew shifts statements of a loop body by stmt-wise delays, and is
  typically meant to be used to:
  - allow overlap of non-blocking start/wait until completion operations with
    other computation
  - allow shifting of statements (for better register
    reuse/locality/parallelism)
  - software pipelining (when applied to the innermost loop)
- an additional argument specifies whether to unroll the prologue and epilogue.
- add method to check SSA dominance preservation.
- add a fake loop pipeline pass to test this utility.

Sample input/output are below. While on this, fix/add following:

- fix minor bug in getAddMulPureAffineExpr
- add additional builder methods for common affine map cases
- fix const_operand_iterator's for ForStmt, etc. When there is no such thing
  as 'const MLValue', the iterator shouldn't be returning const MLValue's.
  Returning MLValue is const correct.

Sample input/output examples:

1) Simplest case: shift second statement by one.

Input:

for %i = 0 to 7 {
  %y = "foo"(%i) : (affineint) -> affineint
  %x = "bar"(%i) : (affineint) -> affineint
}

Output:

#map0 = (d0) -> (d0 - 1)
mlfunc @loop_nest_simple1() {
  %c8 = constant 8 : affineint
  %c0 = constant 0 : affineint
  %0 = "foo"(%c0) : (affineint) -> affineint
  for %i0 = 1 to 7 {
    %1 = "foo"(%i0) : (affineint) -> affineint
    %2 = affine_apply #map0(%i0)
    %3 = "bar"(%2) : (affineint) -> affineint
  }
  %4 = affine_apply #map0(%c8)
  %5 = "bar"(%4) : (affineint) -> affineint
  return
}

2) DMA overlap: shift dma.wait and compute by one.

Input
  for %i = 0 to 7 {
    %pingpong = affine_apply (d0) -> (d0 mod 2) (%i)
    "dma.enqueue"(%pingpong) : (affineint) -> affineint
    %pongping = affine_apply (d0) -> (d0 mod 2) (%i)
    "dma.wait"(%pongping) : (affineint) -> affineint
    "compute1"(%pongping) : (affineint) -> affineint
  }

Output

#map0 = (d0) -> (d0 mod 2)
#map1 = (d0) -> (d0 - 1)
#map2 = ()[s0] -> (s0 + 7)
mlfunc @loop_nest_dma() {
  %c8 = constant 8 : affineint
  %c0 = constant 0 : affineint
  %0 = affine_apply #map0(%c0)
  %1 = "dma.enqueue"(%0) : (affineint) -> affineint
  for %i0 = 1 to 7 {
    %2 = affine_apply #map0(%i0)
    %3 = "dma.enqueue"(%2) : (affineint) -> affineint
    %4 = affine_apply #map1(%i0)
    %5 = affine_apply #map0(%4)
    %6 = "dma.wait"(%5) : (affineint) -> affineint
    %7 = "compute1"(%5) : (affineint) -> affineint
  }
  %8 = affine_apply #map1(%c8)
  %9 = affine_apply #map0(%8)
  %10 = "dma.wait"(%9) : (affineint) -> affineint
  %11 = "compute1"(%9) : (affineint) -> affineint
  return
}

3) With arbitrary affine bound maps:

Shift last two statements by two.

Input:

  for %i = %N to ()[s0] -> (s0 + 7)()[%N] {
    %y = "foo"(%i) : (affineint) -> affineint
    %x = "bar"(%i) : (affineint) -> affineint
    %z = "foo_bar"(%i) : (affineint) -> (affineint)
    "bar_foo"(%i) : (affineint) -> (affineint)
  }

Output

#map0 = ()[s0] -> (s0 + 1)
#map1 = ()[s0] -> (s0 + 2)
#map2 = ()[s0] -> (s0 + 7)
#map3 = (d0) -> (d0 - 2)
#map4 = ()[s0] -> (s0 + 8)
#map5 = ()[s0] -> (s0 + 9)

  for %i0 = %arg0 to #map0()[%arg0] {
    %0 = "foo"(%i0) : (affineint) -> affineint
    %1 = "bar"(%i0) : (affineint) -> affineint
  }
  for %i1 = #map1()[%arg0] to #map2()[%arg0] {
    %2 = "foo"(%i1) : (affineint) -> affineint
    %3 = "bar"(%i1) : (affineint) -> affineint
    %4 = affine_apply #map3(%i1)
    %5 = "foo_bar"(%4) : (affineint) -> affineint
    %6 = "bar_foo"(%4) : (affineint) -> affineint
  }
  for %i2 = #map4()[%arg0] to #map5()[%arg0] {
    %7 = affine_apply #map3(%i2)
    %8 = "foo_bar"(%7) : (affineint) -> affineint
    %9 = "bar_foo"(%7) : (affineint) -> affineint
  }

4) Shift one by zero, second by one, third by two

  for %i = 0 to 7 {
    %y = "foo"(%i) : (affineint) -> affineint
    %x = "bar"(%i) : (affineint) -> affineint
    %z = "foobar"(%i) : (affineint) -> affineint
  }

#map0 = (d0) -> (d0 - 1)
#map1 = (d0) -> (d0 - 2)
#map2 = ()[s0] -> (s0 + 7)

  %c9 = constant 9 : affineint
  %c8 = constant 8 : affineint
  %c1 = constant 1 : affineint
  %c0 = constant 0 : affineint
  %0 = "foo"(%c0) : (affineint) -> affineint
  %1 = "foo"(%c1) : (affineint) -> affineint
  %2 = affine_apply #map0(%c1)
  %3 = "bar"(%2) : (affineint) -> affineint
  for %i0 = 2 to 7 {
    %4 = "foo"(%i0) : (affineint) -> affineint
    %5 = affine_apply #map0(%i0)
    %6 = "bar"(%5) : (affineint) -> affineint
    %7 = affine_apply #map1(%i0)
    %8 = "foobar"(%7) : (affineint) -> affineint
  }
  %9 = affine_apply #map0(%c8)
  %10 = "bar"(%9) : (affineint) -> affineint
  %11 = affine_apply #map1(%c8)
  %12 = "foobar"(%11) : (affineint) -> affineint
  %13 = affine_apply #map1(%c9)
  %14 = "foobar"(%13) : (affineint) -> affineint

5) SSA dominance violated; no shifting if a shift is specified for the second
statement.

  for %i = 0 to 7 {
    %x = "foo"(%i) : (affineint) -> affineint
    "bar"(%x) : (affineint) -> affineint
  }

PiperOrigin-RevId: 214975731
2019-03-29 13:21:26 -07:00
Uday Bondhugula 591fa9698e Change behavior of loopUnrollFull with unroll factor 1
Using loopUnrollFull with unroll factor 1 should promote the loop body as
opposed to doing nothing.

PiperOrigin-RevId: 214812126
2019-03-29 13:20:59 -07:00
Uday Bondhugula 501462ac47 Use statement walker for constant folding.
- makes the code compact (gets rid of MLFunction walking logic)
- makes it natural to extend to fold affine map loop bounds
  and if conditions (upcoming CL)

PiperOrigin-RevId: 214668957
2019-03-29 13:19:32 -07:00
Chris Lattner cdb9551aba Move the GraphTraits implementations for CFGs out to their own header,
consolidate the implementations in CFGFunctionViewGraph.cpp into it, and
implement the missing const specializations for functions.  NFC.

PiperOrigin-RevId: 214048649
2019-03-29 13:17:35 -07:00
Chris Lattner d6f8ec7bac Introduce [post]dominator tree and related infrastructure, use it in CFG func
verifier.  We get most of this infrastructure directly from LLVM, we just
need to adapt it to our CFG abstraction.

This has a few unrelated changes engangled in it:
 - getFunction() in various classes was const incorrect, fix it.
 - This moves Verifier.cpp to the analysis library, since Verifier depends on
   dominance and these are both really analyses.
 - IndexedAccessorIterator::reference was defined wrong, leading to really
   exciting template errors that were fun to diagnose.
 - This flips the boolean sense of the foldOperation() function in constant
   folding pass in response to previous patch feedback.

PiperOrigin-RevId: 214046593
2019-03-29 13:17:20 -07:00
Chris Lattner 82eb284a53 Implement support for constant folding operations and a simple constant folding
optimization pass:

 - Give the ability for operations to implement a constantFold hook (a simple
   one for single-result ops as well as general support for multi-result ops).
 - Implement folding support for constant and addf.
 - Implement support in AbstractOperation and Operation to make this usable by
   clients.
 - Implement a very simple constant folding pass that does top down folding on
   CFG and ML functions, with a testcase that exercises all the above stuff.

Random cleanups:
 - Improve the build APIs for ConstantOp.
 - Stop passing "-o -" to mlir-opt in the testsuite, since that is the default.

PiperOrigin-RevId: 213749809
2019-03-29 13:16:33 -07:00
Feng Liu 7e004efae2 Add function attributes for ExtFunction, CFGFunction and MLFunction.
PiperOrigin-RevId: 213540509
2019-03-29 13:15:35 -07:00
Uday Bondhugula ab4797229c Extend loop unroll/unroll-and-jam to affine bounds + refactor related code.
- extend loop unroll-jam similar to loop unroll for affine bounds
- extend both loop unroll/unroll-jam to deal with cleanup loop for non multiple
  of unroll factor.
- extend promotion of single iteration loops to work with affine bounds
- fix typo bugs in loop unroll
- refactor common code b/w loop unroll and loop unroll-jam
- move prototypes of non-pass transforms to LoopUtils.h
- add additional builder methods.
- introduce loopUnrollUpTo(factor) to unroll by either factor or trip count,
  whichever is less.
- remove Statement::isInnermost (not used for now - will come back at the right
  place/in right form later)

PiperOrigin-RevId: 213471227
2019-03-29 13:15:06 -07:00
Tatiana Shpeisman 52111cefc0 Store 'then' clause statements directly in the 'if' statement.
Also a few minor changes.

PiperOrigin-RevId: 213359024
2019-03-29 13:14:23 -07:00
Uday Bondhugula 37a3f638ea Misc changes to builder's and Transforms/ API to allow code generation.
- add builder method for ReturnOp
- expose API from Transforms/ to work on specific ML statements (do this for
  LoopUnroll, LoopUnrollAndJam)
- add MLFuncBuilder::getForStmtBodyBuilder, ::getBlock

PiperOrigin-RevId: 213074178
2019-03-29 13:14:09 -07:00
Jacques Pienaar fb3116f59e Add PassResult and have passes return PassResult to indicate failure/success.
For FunctionPass's for passes that want to stop upon error encountered.

PiperOrigin-RevId: 213058651
2019-03-29 13:13:55 -07:00
Uday Bondhugula 64812a56c7 Extend getConstantTripCount to deal with a larger subset of loop bounds; make loop
unroll/unroll-and-jam more powerful; add additional affine expr builder methods

- use previously added analysis/simplification to infer multiple of unroll
  factor trip counts, making loop unroll/unroll-and-jam more general.

- for loop unroll, support bounds that are single result affine map's with the
  same set of operands. For unknown loop bounds, loop unroll will now work as
  long as trip count can be determined to be a multiple of unroll factor.

- extend getConstantTripCount to deal with single result affine map's with the
  same operands. move it to mlir/Analysis/LoopAnalysis.cpp

- add additional builder utility methods for affine expr arithmetic
  (difference, mod/floordiv/ceildiv w.r.t postitive constant). simplify code to
  use the utility methods.

- move affine analysis routines to AffineAnalysis.cpp/.h from
  AffineStructures.cpp/.h.

- Rename LoopUnrollJam to LoopUnrollAndJam to match class name.

- add an additional simplification for simplifyFloorDiv, simplifyCeilDiv

- Rename AffineMap::getNumOperands() getNumInputs: an affine map by itself does
  not have operands. Operands are passed to it through affine_apply, from loop
  bounds/if condition's, etc., operands are stored in the latter.

This should be sufficiently powerful for now as far as unroll/unroll-and-jam go for TPU
code generation, and can move to other analyses/transformations.

Loop nests like these are now unrolled without any cleanup loop being generated.

  for %i = 1 to 100 {
    // unroll factor 4: no cleanup loop will be generated.
    for %j = (d0) -> (d0) (%i) to (d0) -> (5*d0 + 3) (%i) {
      %x = "foo"(%j) : (affineint) -> i32
    }
  }

  for %i = 1 to 100 {
    // unroll factor 4: no cleanup loop will be generated.
    for %j = (d0) -> (d0) (%i) to (d0) -> (d0 - d mod 4 - 1) (%i) {
      %y = "foo"(%j) : (affineint) -> i32
    }
  }

  for %i = 1 to 100 {
    for %j = (d0) -> (d0) (%i) to (d0) -> (d0 + 128) (%i) {
      %x = "foo"() : () -> i32
    }
  }

TODO(bondhugula): extend this to LoopUnrollAndJam as well in the next CL (with minor
changes).

PiperOrigin-RevId: 212661212
2019-03-29 13:13:00 -07:00
Uday Bondhugula 3bae041e5d Add utility to promote single iteration loops. Add methods for getting constant
loop counts. Improve / refactor loop unroll / loop unroll and jam.

- add utility to remove single iteration loops.
- use this utility to promote single iteration loops after unroll/unroll-and-jam
- use loopUnrollByFactor for loopUnrollFull and remove most of the latter.
- add methods for getting constant loop trip count

PiperOrigin-RevId: 212039569
2019-03-29 13:11:21 -07:00
Chris Lattner 348f31a4fa Add location specifier to MLIR Functions, and:
- Compress the identifier/kind of a Function into a single word.
 - Eliminate otherFailure from verifier now that we always have a location
 - Eliminate the error string from the verifier now that we always have
   locations.
 - Simplify the parser's handling of fn forward references, using the location
   tracked by the function.

PiperOrigin-RevId: 211985101
2019-03-29 13:10:55 -07:00
Jacques Pienaar 95f31d53d5 Add GraphTraits and DOTGraphTraits for CFGFunction in debug builds.
Enable using GraphWriter to dump graphviz in debug mode (kept to debug builds completely as this is only for debugging). Add option to mlir-opt to print CFGFunction after every transform in debug mode.

PiperOrigin-RevId: 211578699
2019-03-29 13:09:31 -07:00
Uday Bondhugula d5416f299e Complete AffineExprFlattener based simplification for floordiv/ceildiv.
- handle floordiv/ceildiv in AffineExprFlattener; update the simplification to
  work even if mod/floordiv/ceildiv expressions appearing in the tree can't be eliminated.
- refactor the flattening / analysis to move it out of lib/Transforms/
- fix MutableAffineMap::isMultipleOf
- add AffineBinaryOpExpr:getAdd/getMul/... utility methods

PiperOrigin-RevId: 211540536
2019-03-29 13:09:18 -07:00
Uday Bondhugula 0122a99cbb Affine expression analysis and simplification.
Outside of IR/
- simplify a MutableAffineMap by flattening the affine expressions
- add a simplify affine expression pass that uses this analysis
- update the FlatAffineConstraints API (to be used in the next CL)

In IR:
- add isMultipleOf and getKnownGCD for AffineExpr, and make the in-IR
  simplication of simplifyMod simpler and more powerful.
- rename the AffineExpr visitor methods to distinguish b/w visiting and
  walking, and to simplify API names based on context.

The next CL will use some of these for the loop unrolling/unroll-jam to make
the detection for the need of cleanup loop powerful/non-trivial.

A future CL will finally move this simplification to FlatAffineConstraints to
make it more powerful. For eg., currently, even if a mod expr appearing in a
part of the expression tree can't be simplified, the whole thing won't be
simplified.

PiperOrigin-RevId: 211012256
2019-03-29 13:07:44 -07:00
Uday Bondhugula e9fb4b492d Introduce loop unroll jam transformation.
- for test purposes, the unroll-jam pass unroll jams the first outermost loop.

While on this:
- fix StmtVisitor to allow overriding of function to iterate walk over children
  of a stmt.

PiperOrigin-RevId: 210644813
2019-03-29 13:07:30 -07:00
Tatiana Shpeisman d32a28c520 Implement operands for the lower and upper bounds of the for statement.
This revamps implementation of the loop bounds in the ForStmt, using general representation that supports operands. The frequent case of constant bounds is supported
via special access methods.

This also includes:
- Operand iterators for the Statement class.
- OpPointer::is() method to query the class of the Operation.
- Support for the bound shorthand notation parsing and printing.
- Validity checks for the bound operands used as dim ids and symbols

I didn't mean this CL to be so large. It just happened this way, as one thing led to another.

PiperOrigin-RevId: 210204858
2019-03-29 13:05:16 -07:00
Chris Lattner dfc58848e3 Two unrelated API cleanups: remove the location processing stuff from custom op
parser hooks, as it has been subsumed by a simpler and cleaner mechanism.
Second, remove the "Inst" suffixes from a few methods in CFGFuncBuilder since
they are redundant and this is inconsistent with the other builders.  NFC.

PiperOrigin-RevId: 210006263
2019-03-29 13:04:47 -07:00
Chris Lattner 956e0f7e21 Push location information more tightly into the IR, providing space for every
operation and statement to have a location, and make it so a location is
required to be specified whenever you make one (though a null location is still
allowed).  This is to encourage compiler authors to propagate loc info
properly, allowing our failability story to work well.

This is still a WIP - it isn't clear if we want to continue abusing Attribute
for location information, or whether we should introduce a new class heirarchy
to do so.  This is good step along the way, and unblocks some of the tf/xla
work that builds upon it.

PiperOrigin-RevId: 210001406
2019-03-29 13:04:33 -07:00
Uday Bondhugula 00bed4bd99 Extend loop unrolling to unroll by a given factor; add builder for affine
apply op.

- add builder for AffineApplyOp (first one for an operation that has
  non-zero operands)
- add support for loop unrolling by a given factor; uses the affine apply op
  builder.

While on this, change 'step' of ForStmt to be 'unsigned' instead of
AffineConstantExpr *. Add setters for ForStmt lb, ub, step.

Sample Input:

// CHECK-LABEL: mlfunc @loop_nest_unroll_cleanup() {
mlfunc @loop_nest_unroll_cleanup() {
  for %i = 1 to 100 {
    for %j = 0 to 17 {
      %x = "addi32"(%j, %j) : (affineint, affineint) -> i32
      %y = "addi32"(%x, %x) : (i32, i32) -> i32
    }
  }
  return
}

Output:

$ mlir-opt -loop-unroll -unroll-factor=4 /tmp/single2.mlir
#map0 = (d0) -> (d0 + 1)
#map1 = (d0) -> (d0 + 2)
#map2 = (d0) -> (d0 + 3)
mlfunc @loop_nest_unroll_cleanup() {
  for %i0 = 1 to 100 {
    for %i1 = 0 to 17 step 4 {
      %0 = "addi32"(%i1, %i1) : (affineint, affineint) -> i32
      %1 = "addi32"(%0, %0) : (i32, i32) -> i32
      %2 = affine_apply #map0(%i1)
      %3 = "addi32"(%2, %2) : (affineint, affineint) -> i32
      %4 = affine_apply #map1(%i1)
      %5 = "addi32"(%4, %4) : (affineint, affineint) -> i32
      %6 = affine_apply #map2(%i1)
      %7 = "addi32"(%6, %6) : (affineint, affineint) -> i32
    }
    for %i2 = 16 to 17 {
      %8 = "addi32"(%i2, %i2) : (affineint, affineint) -> i32
      %9 = "addi32"(%8, %8) : (i32, i32) -> i32
    }
  }
  return
}

PiperOrigin-RevId: 209676220
2019-03-29 13:03:38 -07:00
Chris Lattner ae79d69922 Implement a module-level symbol table for functions, enforcing uniqueness of
names across the module and auto-renaming conflicts.  Have the parser reject
malformed modules that have redefinitions.

PiperOrigin-RevId: 209227560
2019-03-29 13:02:30 -07:00
Uday Bondhugula 98a24881d3 ShortLoopUnroll - bug fix.
Collect loops through a post order walk instead of a pre-order so that loops
are collected from inner loops are collected before outer surrounding ones.

Add a complex test case.

PiperOrigin-RevId: 209041057
2019-03-29 13:01:22 -07:00
Uday Bondhugula 3e92be9c71 Move Pass.{h,cpp} from lib/IR/ to lib/Transforms/.
PiperOrigin-RevId: 208571437
2019-03-29 12:59:07 -07:00
Chris Lattner 8159186f57 Rework the cloning infrastructure for statements to be able to take and update
an operand mapping, which simplifies it a bit.  Implement cloning for IfStmt,
rename getThenClause() to getThen() which is unambiguous and less repetitive in
use cases.

PiperOrigin-RevId: 207915990
2019-03-29 12:57:38 -07:00
Uday Bondhugula d8490d8d4f Loop unrolling pass update
- fix/complete forStmt cloning for unrolling to work for outer loops
- create IV const's only when needed
- test outer loop unrolling by creating a short trip count unroll pass for
  loops with trip counts <= <parameter>
- add unrolling test cases for multiple op results, outer loop unrolling
- fix/clean up StmtWalker class while on this
- switch unroll loop iterator values from i32 to affineint

PiperOrigin-RevId: 207645967
2019-03-29 12:56:16 -07:00
Uday Bondhugula 65b6e73245 Loop unrolling update.
- deal with non-operation stmt's (if/for stmt's) in loops being unrolled
  (unrolling of non-innermost loops works).
- update uses in unrolled bodies to use results of new operations that may be
  introduced in the unrolled bodies.

Unrolling now works for all kinds of loop nests - perfect nests, imperfect
nests, loops at any depth, and with any kind of operation in the body. (IfStmt
support not done, hence untested there).

Added missing dump/print method for StmtBlock.

TODO: add test case for outer loop unrolling.
PiperOrigin-RevId: 207314286
2019-03-29 12:55:19 -07:00
Uday Bondhugula 2a003256ae MLStmt cloning and IV replacement for loop unrolling, add constant pool to
MLFunctions.

- MLStmt cloning and IV replacement
- While at this, fix the innermostLoopGatherer to actually gather all the
  innermost loops (it was stopping its walk at the first innermost loop it
  found)
- Improve comments for MLFunction statement classes, fix inheritance order.

- Fixed StmtBlock destructor.

PiperOrigin-RevId: 207049173
2019-03-29 12:53:02 -07:00
Tatiana Shpeisman 8189a12bce Clean up and extend MLFuncBuilder to allow creating statements in the middle of a statement block. Rename Statement::getFunction() and StmtBlock()::getFunction() to findFunction() to make it clear that this is not a constant time getter.
Fix b/112039912 - we were recording 'i' instead of '%i' for loop induction variables causing "use of undefined SSA value" error.

PiperOrigin-RevId: 206884644
2019-03-29 12:51:38 -07:00
Uday Bondhugula dfd48dc24c LoopUnroll post order walk: fix misleading naming
PiperOrigin-RevId: 206609084
2019-03-29 12:48:44 -07:00
Chris Lattner 12adbeb872 Prepare for implementation of TensorFlow passes:
- Sketch out a TensorFlow/IR directory that will hold op definitions and common TF support logic.  We will eventually have TensorFlow/TF2HLO, TensorFlow/Grappler, TensorFlow/TFLite, etc.
 - Add sketches of a Switch/Merge op definition, including some missing stuff like the TwoResults trait.  Add a skeleton of a pass to raise this form.
 - Beef up the Pass/FunctionPass definitions slightly, moving the common code out of LoopUnroll.cpp into a new IR/Pass.cpp file.
 - Switch ConvertToCFG.cpp to be a ModulePass.
 - Allow _ to start bare identifiers, since this is important for TF attributes.

PiperOrigin-RevId: 206502517
2019-03-29 12:47:25 -07:00
Uday Bondhugula 0af97111d2 Stmt visitors and walkers.
- Update InnermostLoopGatherer to use a post order traversal (linear
  time/single traversal).
- Drop getNumNestedLoops().
- Update isInnermost() to use the StmtWalker.

When using return values in conjunction with walkers, the StmtWalker CRTP
pattern doesn't appear to be of any use. It just requires overriding nearly all
of the methods, which is what InnermostLoopGatherer currently does. Please see
FIXME/ENLIGHTENME comments. TODO: figure this out from this CL discussion.

Note
- Comments on visitor/walker base class are out of date; will update when this
  CL is finalized.

PiperOrigin-RevId: 206340901
2019-03-29 12:46:17 -07:00
Tatiana Shpeisman 9ebd3c7df8 Implement MLValue, statement operands, operation statement operands and values. ML functions now have full support for expressing operations. Induction variables, function arguments and return values are still todo.
PiperOrigin-RevId: 206253643
2019-03-29 12:46:04 -07:00
Chris Lattner f964bad6d1 Implement a proper function list in module, which auto-maintain the parent
pointer, and ensure that functions are deleted when the module is destroyed.

This exposed the fact that MLFunction had no dtor, and that the dtor in
CFGFunction was broken with cyclic references.  Fix both of these problems.

PiperOrigin-RevId: 206051666
2019-03-29 12:43:57 -07:00
Uday Bondhugula a0abd666a7 Sketch out loop unrolling transformation.
- Implement a full loop unroll for innermost loops.
- Use it to implement a pass that unroll all the innermost loops of all
  mlfunction's in a module. ForStmt's parsed currently have constant trip
  counts (and constant loop bounds).
- Implement StmtVisitor based (Visitor pattern)

Loop IVs aren't currently parsed and represented as SSA values. Replacing uses
of loop IVs in unrolled bodies is thus a TODO. Class comments are sparse at some places - will add them after one round of comments.

A cmd-line flag triggers this for now.

Original:

mlfunc @loops() {
  for x = 1 to 100 step 2 {
    for x = 1 to 4 {
      "Const"(){value: 1} : () -> ()
    }
  }
  return
}

After unrolling:

mlfunc @loops() {
  for x = 1 to 100 step 2 {
    "Const"(){value: 1} : () -> ()
    "Const"(){value: 1} : () -> ()
    "Const"(){value: 1} : () -> ()
    "Const"(){value: 1} : () -> ()
  }
  return
}

PiperOrigin-RevId: 205933235
2019-03-29 12:43:01 -07:00
Tatiana Shpeisman 1b24c48b91 Scaffolding for convertToCFG pass that replaces all instances of ML functions with equivalent CFG functions. Traverses module MLIR, generates CFG functions (empty for now) and removes ML functions. Adds Transforms library and tests.
PiperOrigin-RevId: 205848367
2019-03-29 12:41:15 -07:00