Commit Graph

1256 Commits

Author SHA1 Message Date
Uday Bondhugula 075090f891 Extend loop unrolling and unroll-jamming to non-matching bound operands and
multi-result upper bounds, complete TODOs, fix/improve test cases.

- complete TODOs for loop unroll/unroll-and-jam. Something as simple as
  "for %i = 0 to %N" wasn't being unrolled earlier (unless it had been written
  as "for %i = ()[s0] -> (0)()[%N] to %N"; addressed now.

- update/replace getTripCountExpr with buildTripCountMapAndOperands; makes it
  more powerful as it composes inputs into it

- getCleanupLowerBound and getUnrolledLoopUpperBound actually needed the same
  code; refactor and remove one.

- reorganize test cases, write previous ones better; most of these changes are
  "label replacements".

- fix wrongly labeled test cases in unroll-jam.mlir

PiperOrigin-RevId: 238014653
2019-03-29 17:14:12 -07:00
Alex Zinenko 9abea4a466 Python bindings: provide context managers for the Blocks
Expose EDSC block builders as Python context managers, similarly to loop
builders.  Note that blocks, unlike loops, are addressable and may need to be
"declared" without necessarily filling their bodies with instructions.  This is
the case, for example, when branching to a new block from the existing block.
Therefore, creating the block context manager immediately creates the block
(unless the manager captures an existing block) by creating and destroying the
block builder.  With this approach, one can either fill in the block and refer
to it later leveraging Python's dynamic variable lookup

    with BlockContext([indexType]) as b:
      op(...)  # operation inside the block
      ret()
    op(...)  # operation outside the block (in the function entry block)
    br(b, [...])    # branching to the block created above

or declare the block contexts upfront and enter them on demand

    bb1 = BlockContext()  # empty block created in the surrounding function
    bb2 = BlockContext()  # context
    cond_br(bb1.handle, [], bb2.handle, [])  # branch to blocks from here
    with bb1:
      op(...)  # operation inside the first block
    with bb2:
      op(...)  # operation inside the second block
    with bb1:
      op(...)  # append operation to the first block

Additionally, one can create multiple throw-away contexts that append to the
same block

    with BlockContext() as b:
      op(...)  # operation inside the block
    with BlockContext(appendTo(b)):
      op(...)  # new context appends to the block

which has a potential of being extended to control the insertion point of the
block at a finer level of granularity.

PiperOrigin-RevId: 238005298
2019-03-29 17:13:57 -07:00
Alex Zinenko b0cc81883c Python bindings: drop third_party/ in includes
Historically, Python bindings were using full path including third_party for
most headers but not all of them.  This is inconsistent with the rest of MLIR.
Drop the prefix path in #include directives.

PiperOrigin-RevId: 237999346
2019-03-29 17:13:42 -07:00
MLIR Team 8d62a6092f Clean up some stray mlfunc/cfgfunc leftovers.
PiperOrigin-RevId: 237936610
2019-03-29 17:13:26 -07:00
River Riddle fde5bcdae7 Add documentation for the pass instrumentation framework to the WritingAPass document.
PiperOrigin-RevId: 237919897
2019-03-29 17:13:11 -07:00
River Riddle 59b0839206 NFC: Remove old comment referencing CFG/EXT/ML functions.
PiperOrigin-RevId: 237902039
2019-03-29 17:12:56 -07:00
Nicolas Vasilache dfd904d4a9 Minor changes to the EDSC API NFC
This CL makes some minor changes to the declarative builder Helpers:
1. adds lb, ub, step methods to MemRefView to avoid always having to go through std::get + range;
2. drops MemRefView& from IndexedValue which was just creating ownership concerns. Instead, an IndexedValue only needs to keep track of the ValueHandle from which a MemRefView can be constructed on-demand if necessary.

PiperOrigin-RevId: 237861493
2019-03-29 17:12:41 -07:00
Lei Zhang e1595df1af Allow input and output to have different element types for broadcastable ops
TensorFlow comparison ops like tf.Less supports broadcast behavior but the result
type have different element types as the input types. Extend broadcastable trait
to allow such cases. Added tf.Less to demonstrate it.

PiperOrigin-RevId: 237846127
2019-03-29 17:12:26 -07:00
Lei Zhang 7972dcef84 Pull shape broadcast out as a stand-alone utility function
So that we can use this function to deduce broadcasted shapes elsewhere.

Also added support for unknown dimensions, by following TensorFlow behavior.

PiperOrigin-RevId: 237846065
2019-03-29 17:12:11 -07:00
River Riddle 0cc212f2b7 Ensure that pass timing is the last added pass instrumentation. This also updates the PassInstrumentor to iterate in reverse for the "after" hooks. This ensures that the instrumentations run in a stack like fashion.
PiperOrigin-RevId: 237840808
2019-03-29 17:11:56 -07:00
River Riddle dc141c307b Tidy up some of the pass infrastructure g3doc.
* Separate MyAnalysis into MyFunctionAnalysis/MyModuleAnalysis to avoid potential confusion.
* Add an example of an inline lambda builder for PassPipelineRegistration.
* Clarify the wording on a few of the pass restrictions.

PiperOrigin-RevId: 237840325
2019-03-29 17:11:41 -07:00
River Riddle e46ba31c66 Add a new instrumentation for timing pass and analysis execution. This is made available in mlir-opt via the 'pass-timing' and 'pass-timing-display' flags. The 'pass-timing-display' flag toggles between the different available display modes for the timing results. The current display modes are 'list' and 'pipeline', with 'list' representing the default.
Below shows the output for an example mlir-opt command line.

mlir-opt foo.mlir -verify-each=false -cse -canonicalize -cse -cse -pass-timing

list view (-pass-timing-display=list):
* In this mode the results are displayed in a list sorted by total time; with each pass/analysis instance aggregated into one unique result. This mode is similar to the output of 'time-passes' in llvm-opt.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0097 seconds (0.0096 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0051 ( 58.3%)   0.0001 ( 12.2%)   0.0052 ( 53.8%)   0.0052 ( 53.8%)  Canonicalizer
   0.0025 ( 29.1%)   0.0005 ( 58.2%)   0.0031 ( 31.9%)   0.0031 ( 32.0%)  CSE
   0.0011 ( 12.6%)   0.0003 ( 29.7%)   0.0014 ( 14.3%)   0.0014 ( 14.2%)  DominanceInfo
   0.0087 (100.0%)   0.0009 (100.0%)   0.0097 (100.0%)   0.0096 (100.0%)  Total

pipeline view (-pass-timing-display=pipeline):
* In this mode the results are displayed in a nested pipeline view that mirrors the internal pass pipeline that is being executed in the pass manager. This view is useful for understanding specifically which parts of the pipeline are taking the most time, and can also be used to identify when analyses are being invalidated and recomputed.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0082 seconds (0.0081 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Function Pipeline
   0.0005 ( 11.6%)   0.0008 ( 21.1%)   0.0013 ( 16.1%)   0.0013 ( 16.2%)    CSE
   0.0002 (  5.0%)   0.0004 (  9.3%)   0.0006 (  7.0%)   0.0006 (  7.0%)      (A) DominanceInfo
   0.0026 ( 61.8%)   0.0018 ( 45.6%)   0.0044 ( 54.0%)   0.0044 ( 54.1%)    Canonicalizer
   0.0005 ( 11.7%)   0.0005 ( 13.0%)   0.0010 ( 12.3%)   0.0010 ( 12.4%)    CSE
   0.0003 (  6.1%)   0.0003 (  8.3%)   0.0006 (  7.2%)   0.0006 (  7.1%)      (A) DominanceInfo
   0.0002 (  3.8%)   0.0001 (  2.8%)   0.0003 (  3.3%)   0.0003 (  3.3%)    CSE
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Total

PiperOrigin-RevId: 237825367
2019-03-29 17:11:25 -07:00
Mehdi Amini 732160eaa5 Move `createConvertToLLVMIRPass()` to its own header matching the target library clients need to link
PiperOrigin-RevId: 237723197
2019-03-29 17:11:07 -07:00
River Riddle 5e1f1d2cab Update the constantFold/fold API to use LogicalResult instead of bool.
PiperOrigin-RevId: 237719658
2019-03-29 17:10:50 -07:00
River Riddle 43d0ca8419 NFC: Move the PassExecutor and PassAdaptor classes into PassDetail.h so that they can be referenced throughout lib/Pass.
PiperOrigin-RevId: 237712736
2019-03-29 17:10:36 -07:00
River Riddle 0310d49f46 Move the success/failure functions out of LogicalResult and into the mlir namespace.
PiperOrigin-RevId: 237712180
2019-03-29 17:10:21 -07:00
River Riddle 2d2b40bce5 Add basic infrastructure for instrumenting pass execution and analysis computation. A virtual class, PassInstrumentation, is provided to allow for different parts of the pass manager infrastructure. The currently available hooks allow for instrumenting:
* before/after pass execution
* after a pass fails
* before/after an analysis is computed

After getting this infrastructure in place, we can start providing common developer utilities like pass timing, IR printing after pass execution, etc.

PiperOrigin-RevId: 237709692
2019-03-29 17:10:06 -07:00
Nicolas Vasilache 861eb87471 [EDSC] Cleanup declarative builder insertion point with blocks
Declarative builders want to provide the same nesting interface for blocks and loops. MLIR on the other hand has different behaviors:
1. when an AffineForOp is created the insertion point does not enter the loop body;
2. when a Block is created, the insertion point does enter the block body.

Guard against the second behavior in EDSC to make the interface unsurprising.
This also surfaces two places in the eager branch API where I was guarding against this behavior indirectly by creating a new ScopedContext.

Instead, uniformize everything to properly reset the insertion point in the unique place that builds the mlir::Block*.

PiperOrigin-RevId: 237619513
2019-03-29 17:09:51 -07:00
Jacques Pienaar 497d645337 Delete dead function.
Can reintroduce when needed.

PiperOrigin-RevId: 237599264
2019-03-29 17:09:35 -07:00
Nicolas Vasilache 0d925c5510 Follow up on custom instruction support.
This CL addresses a few post-submit comments:
1. better comments,
2. check number of results before dyn_cast (which is a less common case)
3. test usage for multi-result InstructionHandle

PiperOrigin-RevId: 237549333
2019-03-29 17:09:20 -07:00
Nicolas Vasilache eb19b4eefc Add support for custom ops in declarative builders.
This CL adds support for named custom instructions in declarative builders.
To allow this, it introduces a templated `CustomInstruction` class.
This CL also splits ValueHandle which can capture only the **value** in single-valued instructions from InstructionHandle which can capture any instruction but provide no typing and sugaring to extract the potential Value*.

PiperOrigin-RevId: 237543222
2019-03-29 17:09:05 -07:00
River Riddle 80d3568c0a Rename Status to LogicalResult to avoid conflictions with the Status in xla/tensorflow/etc.
PiperOrigin-RevId: 237537341
2019-03-29 17:08:50 -07:00
River Riddle e2c301441e Don't run verifyOperation in verifyDominance, as it is already run as part of verifyBlock. This caused the verifier to run in exponential time for nested regions.
PiperOrigin-RevId: 237519751
2019-03-29 17:08:35 -07:00
Lei Zhang d6afced006 [TF] Define tf.FusedBatchNormOp in TableGen
Also fixed wrong epsilon attribute types for tf.FusedBatchNormOp in test cases.

PiperOrigin-RevId: 237514017
2019-03-29 17:08:20 -07:00
Lei Zhang 684cc6e8da [TableGen] Change to attach the name to DAG operator in result patterns
There are two ways that we can attach a name to a DAG node:

1) (Op:$name ...)
2) (Op ...):$name

The problem with 2) is that we cannot do it on the outmost DAG node in a tree.
Switch from 2) to 1).

PiperOrigin-RevId: 237513962
2019-03-29 17:08:05 -07:00
Lei Zhang 18fde7c9d8 [TableGen] Support multiple result patterns
This CL added the ability to generate multiple ops using multiple result
patterns, with each of them replacing one result of the matched source op.

Specifically, the syntax is

```
def : Pattern<(SourceOp ...),
              [(ResultOp1 ...), (ResultOp2 ...), (ResultOp3 ...)]>;
```

Assuming `SourceOp` has three results.

Currently we require that each result op must generate one result, which
can be lifted later when use cases arise.

To help with cases that certain output is unused and we don't care about it,
this CL also introduces a new directive: `verifyUnusedValue`. Checks will
be emitted in the `match()` method to make sure if the corresponding output
is not unused, `match()` returns with `matchFailure()`.

PiperOrigin-RevId: 237513904
2019-03-29 17:07:50 -07:00
Uday Bondhugula 87884ab4b6 Refactor and share common code across addAffineForOpDomain / addSliceBounds
PiperOrigin-RevId: 237508755
2019-03-29 17:07:35 -07:00
Lei Zhang 999a0c8736 [TF] Improve verification for integer and floating-point tensor types
TensorFlow does not allow integers of random bitwidths. It only accepts 8-,
16-, 32-, and 64-bit integer types. Similarly for floating point types, only
half, single, double, and bfloat16 types.

PiperOrigin-RevId: 237483913
2019-03-29 17:07:20 -07:00
River Riddle 2c78469a93 Introduce a TypeID class to provide unique identifiers for derived type classes. This removes the need for derived types to define a static typeID field.
PiperOrigin-RevId: 237482890
2019-03-29 17:07:06 -07:00
Uday Bondhugula ce7e59536c Add a basic model to set tile sizes + some cleanup
- compute tile sizes based on a simple model that looks at memory footprints
  (instead of using the hardcoded default value)
- adjust tile sizes to make them factors of trip counts based on an option
- update loop fusion CL options to allow setting maximal fusion at pass creation
- change an emitError to emitWarning (since it's not a hard error unless the client
  treats it that way, in which case, it can emit one)

$ mlir-opt -debug-only=loop-tile -loop-tile test/Transforms/loop-tiling.mlir

test/Transforms/loop-tiling.mlir:81:3: note: using tile sizes [4 4 5 ]

  for %i = 0 to 256 {

for %i0 = 0 to 256 step 4 {
    for %i1 = 0 to 256 step 4 {
      for %i2 = 0 to 250 step 5 {
        for %i3 = #map4(%i0) to #map11(%i0) {
          for %i4 = #map4(%i1) to #map11(%i1) {
            for %i5 = #map4(%i2) to #map12(%i2) {
              %0 = load %arg0[%i3, %i5] : memref<8x8xvector<64xf32>>
              %1 = load %arg1[%i5, %i4] : memref<8x8xvector<64xf32>>
              %2 = load %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
              %3 = mulf %0, %1 : vector<64xf32>
              %4 = addf %2, %3 : vector<64xf32>
              store %4, %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
            }
          }
        }
      }
    }
  }

PiperOrigin-RevId: 237461836
2019-03-29 17:06:51 -07:00
Alex Zinenko 8b4b9b31f1 Python bindings: introduce loop and loop nest contexts
Recently, EDSC introduced an eager mode for building IR in different contexts.
Introduce Python bindings support for loop and loop nest contexts of EDSC
builders.  The eager mode is built around the notion of ValueHandle, which is
convenience class for delayed initialization and operator overloads.  Expose
this class and overloads directly.  The model of insertion contexts maps
naturally to Python context manager mechanism, therefore new bindings are
defined bypassing the C APIs.  The bindings now provide three new context
manager classes: FunctionContext, LoopContext and LoopNestContext.  The last
two can be used with the `with`-construct in Python to create loop (nests) and
obtain handles to the loop induction variables seamlessly:

    with LoopContext(lhs, rhs, 1) as i:
      lhs + rhs + i
      with LoopContext(rhs, rhs + rhs, 2) as j:
        x = i + j

Any statement within the Python context will trigger immediate emission of the
corresponding IR constructs into the context owned by the nearest context
manager.

PiperOrigin-RevId: 237447732
2019-03-29 17:06:36 -07:00
River Riddle 1e55ae19a0 Convert ambiguous bool returns in /Analysis to use Status instead.
PiperOrigin-RevId: 237390240
2019-03-29 17:06:21 -07:00
River Riddle 10ddae6d88 Use Status instead of bool in DialectConversion.
PiperOrigin-RevId: 237339277
2019-03-29 17:06:06 -07:00
River Riddle f427bddd06 Update the PassManager infrastructure to return Status instead of bool.
PiperOrigin-RevId: 237261205
2019-03-29 17:05:51 -07:00
Uday Bondhugula b5f7b7fd59 Fix unionBoundingBox bug introduced by cl/237141668
- add test case

PiperOrigin-RevId: 237241598
2019-03-29 17:05:36 -07:00
Alex Zinenko 6621f39d19 LLVM IR Dialect conversion: use builder arguments instead of named attributes
The first version of TableGen-defined LLVM IR Dialect did not include the
mandatory or optional attributes of the operations due to the missing support
for some of the relevant attribute types.  This support has been recently
introduced, along with named attributes as arguments in the TableGen operation
definitions.  With these changes, LLVM IR Dialect operations now have factory
functions accepting (unnamed) attributes and attaching their canonical names.
Use these factories instead of manually constructing named attributes in the
dialect convreter to avoid hardcoded attribute names in unexpected places.

PiperOrigin-RevId: 237237769
2019-03-29 17:05:20 -07:00
Alex Zinenko b9724e98c2 Cleanups in the LLVM IR Dialect
These cleanups reflects some recent changes to the LLVM IR Dialect and the
infrastructure that affects it.  In particular, add documentation on direct and
indirect function calls as well as remove the `call` and `call0` separation.
Change the prefix of custom types from `!llvm.type` to `!llvm` so that it
matches the IR.  Remove the verifier check disallowing conditional branches to
the same block with arguments: identical arguments are now supported, and
different arguments will be caught later.

PiperOrigin-RevId: 237203452
2019-03-29 17:05:05 -07:00
Alex Zinenko dbaab04a80 TableGen most of the LLVM IR Dialect to LLVM IR conversions
The LLVM IR Dialect strives to be close to the original LLVM IR instructions.
The conversion from the LLVM IR Dialect to LLVM IR proper is mostly mechanical
and can be automated.  Implement TableGen support for generating conversions
from a concise pattern form in the TableGen definition of the LLVM IR Dialect
operations.  It is used for all operations except calls and branches.  These
operations need access to function and block remapping tables and would require
significantly more code to generate the conversions from TableGen definitions
than the current manually written conversions.

This implementation is accompanied by various necessary changes to the TableGen
operation definition infrastructure.  In particular, operation definitions now
contain named accessors to results as well as named accessors to the variadic
operand (returning a vector of operands).  The base operation support TableGen
file now contains a FunctionAttr definition.  The TableGen now allows to query
the names of the operation results.

PiperOrigin-RevId: 237203077
2019-03-29 17:04:50 -07:00
Mehdi Amini 056fc2fd09 Change assert message to mention `nullptr` instead of `sentinel`: this is likely more helpful to the user when it fires
PiperOrigin-RevId: 237170067
2019-03-29 17:04:35 -07:00
River Riddle ba6fdc8b01 Move UtilResult into the Support directory and rename it to Status. Status provides an unambiguous way to specify success/failure results. These can be generated by 'Status::success()' and Status::failure()'. Status provides no implicit conversion to bool and should be consumed by one of the following utility functions:
* bool succeeded(Status)
  - Return if the status corresponds to a success value.

* bool failed(Status)
  - Return if the status corresponds to a failure value.

PiperOrigin-RevId: 237153884
2019-03-29 17:04:19 -07:00
River Riddle 157e3cdb19 Add documentation for the new pass infrastructure.
PiperOrigin-RevId: 237153501
2019-03-29 17:04:03 -07:00
MLIR Team 11b099c012 Adds offset argument to specified range of ids know to be aligned when calling mergeAndAlignIds (used by FlatAffineConstraints).
Supports use case where FlatAffineConstraints::composeMap adds dim identifiers with no SSA values (because the identifiers are the result of an AffineValueMap which is not materialized in the IR and thus has no SSA Value results).

PiperOrigin-RevId: 237145506
2019-03-29 17:03:47 -07:00
MLIR Team 1678fd1584 Fix opt build.
PiperOrigin-RevId: 237141751
2019-03-29 17:03:32 -07:00
Uday Bondhugula b8b15c7700 Add FlatAffineConstraints::containsId to avoid using findId when position isn't
needed + other cleanup
- clean up unionBoundingBox (hoist SmallVector allocations out of loop).

PiperOrigin-RevId: 237141668
2019-03-29 17:03:17 -07:00
Nicolas Vasilache 9e425a06f7 Fix an incorrect comment in builder-api-test.
Also address post commit cleanups that were missed.

PiperOrigin-RevId: 237122077
2019-03-29 17:03:00 -07:00
Nicolas Vasilache 7c0b9e8b62 Add helper classes to declarative builders to help write end-to-end custom ops.
This CL adds the same helper classes that exist in the AST form of EDSCs to support a basic indexing notation and emit the proper load and store operations and capture MemRefViews as function arguments.

This CL also adds a wrapper class LoopNestBuilder to allow generic rank-agnostic loops over indices.

PiperOrigin-RevId: 237113755
2019-03-29 17:02:41 -07:00
Lei Zhang 4fc9b51727 [TableGen] Emit verification code for op results
They can be verified using the same logic as operands.

PiperOrigin-RevId: 237101461
2019-03-29 17:02:26 -07:00
River Riddle d43f630de8 NFC: Remove 'Result' from the analysis manager api to better reflect the implementation. There is no distinction between analysis computation and result.
PiperOrigin-RevId: 237093101
2019-03-29 17:02:12 -07:00
Dimitrios Vytiniotis 32943f5783 More graceful failure when verifying llvm.noalias.
PiperOrigin-RevId: 237081778
2019-03-29 17:01:56 -07:00
River Riddle 1d87b62afe Add support for preserving specific analyses in the analysis manager. Passes can now preserve specific analyses via 'markAnalysesPreserved'.
Example:

markAnalysesPreserved<DominanceInfo>();
markAnalysesPreserved<DominanceInfo, PostDominanceInfo>();

PiperOrigin-RevId: 237081454
2019-03-29 17:01:41 -07:00