Commit Graph

871 Commits

Author SHA1 Message Date
Nicolas Vasilache dfd904d4a9 Minor changes to the EDSC API NFC
This CL makes some minor changes to the declarative builder Helpers:
1. adds lb, ub, step methods to MemRefView to avoid always having to go through std::get + range;
2. drops MemRefView& from IndexedValue which was just creating ownership concerns. Instead, an IndexedValue only needs to keep track of the ValueHandle from which a MemRefView can be constructed on-demand if necessary.

PiperOrigin-RevId: 237861493
2019-03-29 17:12:41 -07:00
Lei Zhang e1595df1af Allow input and output to have different element types for broadcastable ops
TensorFlow comparison ops like tf.Less supports broadcast behavior but the result
type have different element types as the input types. Extend broadcastable trait
to allow such cases. Added tf.Less to demonstrate it.

PiperOrigin-RevId: 237846127
2019-03-29 17:12:26 -07:00
Lei Zhang 7972dcef84 Pull shape broadcast out as a stand-alone utility function
So that we can use this function to deduce broadcasted shapes elsewhere.

Also added support for unknown dimensions, by following TensorFlow behavior.

PiperOrigin-RevId: 237846065
2019-03-29 17:12:11 -07:00
River Riddle 0cc212f2b7 Ensure that pass timing is the last added pass instrumentation. This also updates the PassInstrumentor to iterate in reverse for the "after" hooks. This ensures that the instrumentations run in a stack like fashion.
PiperOrigin-RevId: 237840808
2019-03-29 17:11:56 -07:00
River Riddle e46ba31c66 Add a new instrumentation for timing pass and analysis execution. This is made available in mlir-opt via the 'pass-timing' and 'pass-timing-display' flags. The 'pass-timing-display' flag toggles between the different available display modes for the timing results. The current display modes are 'list' and 'pipeline', with 'list' representing the default.
Below shows the output for an example mlir-opt command line.

mlir-opt foo.mlir -verify-each=false -cse -canonicalize -cse -cse -pass-timing

list view (-pass-timing-display=list):
* In this mode the results are displayed in a list sorted by total time; with each pass/analysis instance aggregated into one unique result. This mode is similar to the output of 'time-passes' in llvm-opt.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0097 seconds (0.0096 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0051 ( 58.3%)   0.0001 ( 12.2%)   0.0052 ( 53.8%)   0.0052 ( 53.8%)  Canonicalizer
   0.0025 ( 29.1%)   0.0005 ( 58.2%)   0.0031 ( 31.9%)   0.0031 ( 32.0%)  CSE
   0.0011 ( 12.6%)   0.0003 ( 29.7%)   0.0014 ( 14.3%)   0.0014 ( 14.2%)  DominanceInfo
   0.0087 (100.0%)   0.0009 (100.0%)   0.0097 (100.0%)   0.0096 (100.0%)  Total

pipeline view (-pass-timing-display=pipeline):
* In this mode the results are displayed in a nested pipeline view that mirrors the internal pass pipeline that is being executed in the pass manager. This view is useful for understanding specifically which parts of the pipeline are taking the most time, and can also be used to identify when analyses are being invalidated and recomputed.

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0082 seconds (0.0081 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Function Pipeline
   0.0005 ( 11.6%)   0.0008 ( 21.1%)   0.0013 ( 16.1%)   0.0013 ( 16.2%)    CSE
   0.0002 (  5.0%)   0.0004 (  9.3%)   0.0006 (  7.0%)   0.0006 (  7.0%)      (A) DominanceInfo
   0.0026 ( 61.8%)   0.0018 ( 45.6%)   0.0044 ( 54.0%)   0.0044 ( 54.1%)    Canonicalizer
   0.0005 ( 11.7%)   0.0005 ( 13.0%)   0.0010 ( 12.3%)   0.0010 ( 12.4%)    CSE
   0.0003 (  6.1%)   0.0003 (  8.3%)   0.0006 (  7.2%)   0.0006 (  7.1%)      (A) DominanceInfo
   0.0002 (  3.8%)   0.0001 (  2.8%)   0.0003 (  3.3%)   0.0003 (  3.3%)    CSE
   0.0042 (100.0%)   0.0039 (100.0%)   0.0082 (100.0%)   0.0081 (100.0%)  Total

PiperOrigin-RevId: 237825367
2019-03-29 17:11:25 -07:00
Mehdi Amini 732160eaa5 Move `createConvertToLLVMIRPass()` to its own header matching the target library clients need to link
PiperOrigin-RevId: 237723197
2019-03-29 17:11:07 -07:00
River Riddle 5e1f1d2cab Update the constantFold/fold API to use LogicalResult instead of bool.
PiperOrigin-RevId: 237719658
2019-03-29 17:10:50 -07:00
River Riddle 0310d49f46 Move the success/failure functions out of LogicalResult and into the mlir namespace.
PiperOrigin-RevId: 237712180
2019-03-29 17:10:21 -07:00
River Riddle 2d2b40bce5 Add basic infrastructure for instrumenting pass execution and analysis computation. A virtual class, PassInstrumentation, is provided to allow for different parts of the pass manager infrastructure. The currently available hooks allow for instrumenting:
* before/after pass execution
* after a pass fails
* before/after an analysis is computed

After getting this infrastructure in place, we can start providing common developer utilities like pass timing, IR printing after pass execution, etc.

PiperOrigin-RevId: 237709692
2019-03-29 17:10:06 -07:00
Nicolas Vasilache 0d925c5510 Follow up on custom instruction support.
This CL addresses a few post-submit comments:
1. better comments,
2. check number of results before dyn_cast (which is a less common case)
3. test usage for multi-result InstructionHandle

PiperOrigin-RevId: 237549333
2019-03-29 17:09:20 -07:00
Nicolas Vasilache eb19b4eefc Add support for custom ops in declarative builders.
This CL adds support for named custom instructions in declarative builders.
To allow this, it introduces a templated `CustomInstruction` class.
This CL also splits ValueHandle which can capture only the **value** in single-valued instructions from InstructionHandle which can capture any instruction but provide no typing and sugaring to extract the potential Value*.

PiperOrigin-RevId: 237543222
2019-03-29 17:09:05 -07:00
River Riddle 80d3568c0a Rename Status to LogicalResult to avoid conflictions with the Status in xla/tensorflow/etc.
PiperOrigin-RevId: 237537341
2019-03-29 17:08:50 -07:00
Lei Zhang d6afced006 [TF] Define tf.FusedBatchNormOp in TableGen
Also fixed wrong epsilon attribute types for tf.FusedBatchNormOp in test cases.

PiperOrigin-RevId: 237514017
2019-03-29 17:08:20 -07:00
Lei Zhang 18fde7c9d8 [TableGen] Support multiple result patterns
This CL added the ability to generate multiple ops using multiple result
patterns, with each of them replacing one result of the matched source op.

Specifically, the syntax is

```
def : Pattern<(SourceOp ...),
              [(ResultOp1 ...), (ResultOp2 ...), (ResultOp3 ...)]>;
```

Assuming `SourceOp` has three results.

Currently we require that each result op must generate one result, which
can be lifted later when use cases arise.

To help with cases that certain output is unused and we don't care about it,
this CL also introduces a new directive: `verifyUnusedValue`. Checks will
be emitted in the `match()` method to make sure if the corresponding output
is not unused, `match()` returns with `matchFailure()`.

PiperOrigin-RevId: 237513904
2019-03-29 17:07:50 -07:00
Uday Bondhugula 87884ab4b6 Refactor and share common code across addAffineForOpDomain / addSliceBounds
PiperOrigin-RevId: 237508755
2019-03-29 17:07:35 -07:00
Lei Zhang 999a0c8736 [TF] Improve verification for integer and floating-point tensor types
TensorFlow does not allow integers of random bitwidths. It only accepts 8-,
16-, 32-, and 64-bit integer types. Similarly for floating point types, only
half, single, double, and bfloat16 types.

PiperOrigin-RevId: 237483913
2019-03-29 17:07:20 -07:00
River Riddle 2c78469a93 Introduce a TypeID class to provide unique identifiers for derived type classes. This removes the need for derived types to define a static typeID field.
PiperOrigin-RevId: 237482890
2019-03-29 17:07:06 -07:00
Uday Bondhugula ce7e59536c Add a basic model to set tile sizes + some cleanup
- compute tile sizes based on a simple model that looks at memory footprints
  (instead of using the hardcoded default value)
- adjust tile sizes to make them factors of trip counts based on an option
- update loop fusion CL options to allow setting maximal fusion at pass creation
- change an emitError to emitWarning (since it's not a hard error unless the client
  treats it that way, in which case, it can emit one)

$ mlir-opt -debug-only=loop-tile -loop-tile test/Transforms/loop-tiling.mlir

test/Transforms/loop-tiling.mlir:81:3: note: using tile sizes [4 4 5 ]

  for %i = 0 to 256 {

for %i0 = 0 to 256 step 4 {
    for %i1 = 0 to 256 step 4 {
      for %i2 = 0 to 250 step 5 {
        for %i3 = #map4(%i0) to #map11(%i0) {
          for %i4 = #map4(%i1) to #map11(%i1) {
            for %i5 = #map4(%i2) to #map12(%i2) {
              %0 = load %arg0[%i3, %i5] : memref<8x8xvector<64xf32>>
              %1 = load %arg1[%i5, %i4] : memref<8x8xvector<64xf32>>
              %2 = load %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
              %3 = mulf %0, %1 : vector<64xf32>
              %4 = addf %2, %3 : vector<64xf32>
              store %4, %arg2[%i3, %i4] : memref<8x8xvector<64xf32>>
            }
          }
        }
      }
    }
  }

PiperOrigin-RevId: 237461836
2019-03-29 17:06:51 -07:00
Alex Zinenko 8b4b9b31f1 Python bindings: introduce loop and loop nest contexts
Recently, EDSC introduced an eager mode for building IR in different contexts.
Introduce Python bindings support for loop and loop nest contexts of EDSC
builders.  The eager mode is built around the notion of ValueHandle, which is
convenience class for delayed initialization and operator overloads.  Expose
this class and overloads directly.  The model of insertion contexts maps
naturally to Python context manager mechanism, therefore new bindings are
defined bypassing the C APIs.  The bindings now provide three new context
manager classes: FunctionContext, LoopContext and LoopNestContext.  The last
two can be used with the `with`-construct in Python to create loop (nests) and
obtain handles to the loop induction variables seamlessly:

    with LoopContext(lhs, rhs, 1) as i:
      lhs + rhs + i
      with LoopContext(rhs, rhs + rhs, 2) as j:
        x = i + j

Any statement within the Python context will trigger immediate emission of the
corresponding IR constructs into the context owned by the nearest context
manager.

PiperOrigin-RevId: 237447732
2019-03-29 17:06:36 -07:00
River Riddle 1e55ae19a0 Convert ambiguous bool returns in /Analysis to use Status instead.
PiperOrigin-RevId: 237390240
2019-03-29 17:06:21 -07:00
River Riddle 10ddae6d88 Use Status instead of bool in DialectConversion.
PiperOrigin-RevId: 237339277
2019-03-29 17:06:06 -07:00
River Riddle f427bddd06 Update the PassManager infrastructure to return Status instead of bool.
PiperOrigin-RevId: 237261205
2019-03-29 17:05:51 -07:00
Alex Zinenko b9724e98c2 Cleanups in the LLVM IR Dialect
These cleanups reflects some recent changes to the LLVM IR Dialect and the
infrastructure that affects it.  In particular, add documentation on direct and
indirect function calls as well as remove the `call` and `call0` separation.
Change the prefix of custom types from `!llvm.type` to `!llvm` so that it
matches the IR.  Remove the verifier check disallowing conditional branches to
the same block with arguments: identical arguments are now supported, and
different arguments will be caught later.

PiperOrigin-RevId: 237203452
2019-03-29 17:05:05 -07:00
Alex Zinenko dbaab04a80 TableGen most of the LLVM IR Dialect to LLVM IR conversions
The LLVM IR Dialect strives to be close to the original LLVM IR instructions.
The conversion from the LLVM IR Dialect to LLVM IR proper is mostly mechanical
and can be automated.  Implement TableGen support for generating conversions
from a concise pattern form in the TableGen definition of the LLVM IR Dialect
operations.  It is used for all operations except calls and branches.  These
operations need access to function and block remapping tables and would require
significantly more code to generate the conversions from TableGen definitions
than the current manually written conversions.

This implementation is accompanied by various necessary changes to the TableGen
operation definition infrastructure.  In particular, operation definitions now
contain named accessors to results as well as named accessors to the variadic
operand (returning a vector of operands).  The base operation support TableGen
file now contains a FunctionAttr definition.  The TableGen now allows to query
the names of the operation results.

PiperOrigin-RevId: 237203077
2019-03-29 17:04:50 -07:00
River Riddle ba6fdc8b01 Move UtilResult into the Support directory and rename it to Status. Status provides an unambiguous way to specify success/failure results. These can be generated by 'Status::success()' and Status::failure()'. Status provides no implicit conversion to bool and should be consumed by one of the following utility functions:
* bool succeeded(Status)
  - Return if the status corresponds to a success value.

* bool failed(Status)
  - Return if the status corresponds to a failure value.

PiperOrigin-RevId: 237153884
2019-03-29 17:04:19 -07:00
Uday Bondhugula b8b15c7700 Add FlatAffineConstraints::containsId to avoid using findId when position isn't
needed + other cleanup
- clean up unionBoundingBox (hoist SmallVector allocations out of loop).

PiperOrigin-RevId: 237141668
2019-03-29 17:03:17 -07:00
Nicolas Vasilache 7c0b9e8b62 Add helper classes to declarative builders to help write end-to-end custom ops.
This CL adds the same helper classes that exist in the AST form of EDSCs to support a basic indexing notation and emit the proper load and store operations and capture MemRefViews as function arguments.

This CL also adds a wrapper class LoopNestBuilder to allow generic rank-agnostic loops over indices.

PiperOrigin-RevId: 237113755
2019-03-29 17:02:41 -07:00
Lei Zhang 4fc9b51727 [TableGen] Emit verification code for op results
They can be verified using the same logic as operands.

PiperOrigin-RevId: 237101461
2019-03-29 17:02:26 -07:00
River Riddle d43f630de8 NFC: Remove 'Result' from the analysis manager api to better reflect the implementation. There is no distinction between analysis computation and result.
PiperOrigin-RevId: 237093101
2019-03-29 17:02:12 -07:00
River Riddle 1d87b62afe Add support for preserving specific analyses in the analysis manager. Passes can now preserve specific analyses via 'markAnalysesPreserved'.
Example:

markAnalysesPreserved<DominanceInfo>();
markAnalysesPreserved<DominanceInfo, PostDominanceInfo>();

PiperOrigin-RevId: 237081454
2019-03-29 17:01:41 -07:00
Dimitrios Vytiniotis 480cc2b063 Using llvm.noalias attribute when generating LLVMIR.
PiperOrigin-RevId: 237063104
2019-03-29 17:01:11 -07:00
Nicolas Vasilache ee4a80bbd6 Add an eager API version for BR and COND_BR
When building unstructured control-flow there is a need to construct mlir::Block* before being able to fill them. This invites goto-style programming.

This CL introduces an alternative eager API for BR and COND_BR in which blocks are created eagerly and captured on the fly.

This allows reducing the number of calls to `BlockBuilder` from 4 to 2 in the `builder_blocks_eager` test and from 3 to 2 in the `builder_cond_branch_eager` test.

PiperOrigin-RevId: 237046114
2019-03-29 17:00:26 -07:00
Nicolas Vasilache 38f1d2d77e Add support for Branches in edsc::Builder
This CL adds support for BranchHandle and BranchBuilder that require a slightly different
abstraction since an mlir::Block is not an mlir::Value.

This CL also adds support for the BR and COND_BR instructions and the relevant tests.

PiperOrigin-RevId: 237034312
2019-03-29 17:00:09 -07:00
Nicolas Vasilache af6c3f7a63 Start a new implementation for edsc::Builder
This CL reworks the design of EDSCs from first principles.
It introduces a ValueHandle which can hold either:
1. an eagerly typed, delayed Value*
2. a precomputed Value*

A ValueHandle can be manipulated with intrinsic operations a nested within a NestedBuilder. These NestedBuilder are a more idiomatic nested builder abstraction that should feel intuitive to program in C++.

Notably, this abstraction does not require an AST to stage the construction of MLIR snippets from C++. Instead, the abstraction makes use of orderings between definition and declaration of ValueHandles and provides a NestedBuilder and a LoopBuilder helper classes to handle those orderings.

All instruction creations are meant to use the templated ValueHandle::create<> which directly calls mlir::Builder.create<>.

For now the EDSC AST and the builders live side-by-side until the C API is ported.

PiperOrigin-RevId: 237030945
2019-03-29 16:59:50 -07:00
Alex Zinenko 95949a0d09 TableGen: allow mixing attributes and operands in the Arguments DAG of Op definition
The existing implementation of the Op definition generator assumes and relies
on the fact that native Op Attributes appear after its value-based operands in
the Arguments list.  Furthermore, the same order is used in the generated
`build` function for the operation.  This is not desirable for some operations
with mandatory attributes that would want the attribute to appear upfront for
better consistency with their textual representation, for example `cmpi` would
prefer the `predicate` attribute to be foremost in the argument list.

Introduce support for using attributes and operands in the Arguments DAG in no
particular order.  This is achieved by maintaining a list of Arguments that
point to either the value or the attribute and are used to generate the `build`
method.

PiperOrigin-RevId: 237002921
2019-03-29 16:59:35 -07:00
MLIR Team c1ff9e866e Use FlatAffineConstraints::unionBoundingBox to perform slice bounds union for loop fusion pass (WIP).
Adds utility to convert slice bounds to a FlatAffineConstraints representation.
Adds utility to FlatAffineConstraints to promote loop IV symbol identifiers to dim identifiers.

PiperOrigin-RevId: 236973261
2019-03-29 16:59:21 -07:00
Uday Bondhugula 02af8c22df Change Pass:getFunction() to return pointer instead of ref - NFC
- change this for consistency - everything else similar takes/returns a
  Function pointer - the FuncBuilder ctor,
  Block/Value/Instruction::getFunction(), etc.
- saves a whole bunch of &s everywhere

PiperOrigin-RevId: 236928761
2019-03-29 16:58:35 -07:00
River Riddle 89d42f15a7 NFC: Move OperandStorage into a new header file for instruction support utilities, InstructionSupport.h.
PiperOrigin-RevId: 236892857
2019-03-29 16:57:50 -07:00
Nicolas Vasilache 069c818f40 Fix lower/upper bound mismatch in stripmineSink
Also beef up the corresponding test case.

PiperOrigin-RevId: 236878818
2019-03-29 16:57:21 -07:00
River Riddle 2dfefdafea Fix dialect attribute hooks so that they accept a NamedAttribute instead of an Attribute.
PiperOrigin-RevId: 236869321
2019-03-29 16:57:05 -07:00
Dimitrios Vytiniotis a60ba7d908 Supporting conversion of argument attributes along their types.
This fixes a bug: previously, during conversion function argument
attributes were neither beings passed through nor converted. This fix
extends DialectConversion to allow for simultaneous conversion of the
function type and the argument attributes.

This was important when lowering MLIR to LLVM where attribute
information (e.g. noalias) needs to be preserved in MLIR(LLVMDialect).

Longer run it seems reasonable that we want to convert both the
function attribute and its type and the argument attributes, but that
requires a small refactoring in Function.h to aggregate these three
fields in an inner struct, which will require some discussion.

PiperOrigin-RevId: 236709409
2019-03-29 16:55:51 -07:00
River Riddle 50efe0fc85 Add a 'verifyPasses' flag to the PassManager that specifies if the IR should be verified after each pass. This also adds a "verify-each" flag to mlir-opt to optionally disable running the verifier after each pass.
PiperOrigin-RevId: 236703760
2019-03-29 16:55:35 -07:00
River Riddle a495f960e0 Introduce the notion of dialect attributes and dependent attributes. A dialect attribute derives its context from a specific dialect, whereas a dependent attribute derives context from what it is attached to. Following this, we now enforce that functions and function arguments may only contain dialect specific attributes. These are generic entities and cannot provide any specific context for a dependent attribute.
Dialect attributes are defined as:

        dialect-namespace `.` attr-name `:` attribute-value

Dialects can override any of the following hooks to verify the validity of a given attribute:
  * verifyFunctionAttribute
  * verifyFunctionArgAttribute
  * verifyInstructionAttribute

PiperOrigin-RevId: 236507970
2019-03-29 16:55:05 -07:00
River Riddle 485746f524 Implement the initial AnalysisManagement infrastructure, with the introduction of the FunctionAnalysisManager and ModuleAnalysisManager classes. These classes provide analysis computation, caching, and invalidation for a specific IR unit. The invalidation is currently limited to either all or none, i.e. you cannot yet preserve specific analyses.
An analysis can be any class, but it must provide the following:
* A constructor for a given IR unit.

struct MyAnalysis {
  // Compute this analysis with the provided module.
  MyAnalysis(Module *module);
};

Analyses can be accessed from a Pass by calling either the 'getAnalysisResult<AnalysisT>' or 'getCachedAnalysisResult<AnalysisT>' methods. A FunctionPass may query for a cached analysis on the parent module with 'getCachedModuleAnalysisResult'. Similary, a ModulePass may query an analysis, it doesn't need to be cached, on a child function with 'getFunctionAnalysisResult'.

By default, when running a pass all cached analyses are set to be invalidated. If no transformation was performed, a pass can use the method 'markAllAnalysesPreserved' to preserve all analysis results. As noted above, preserving specific analyses is not yet supported.

PiperOrigin-RevId: 236505642
2019-03-29 16:54:50 -07:00
River Riddle eeeef090ef Set the namespace of the StandardOps dialect to "std", but add a special case to the parser to allow parsing standard operations without the "std" prefix. This will now allow for the standard dialect to be looked up dynamically by name.
PiperOrigin-RevId: 236493865
2019-03-29 16:54:20 -07:00
River Riddle f37651c708 NFC. Move all of the remaining operations left in BuiltinOps to StandardOps. The only thing left in BuiltinOps are the core MLIR types. The standard types can't be moved because they are referenced within the IR directory, e.g. in things like Builder.
PiperOrigin-RevId: 236403665
2019-03-29 16:53:35 -07:00
Lei Zhang 85d9b6c8f7 Use consistent names for dialect op source files
This CL changes dialect op source files (.h, .cpp, .td) to follow the following
convention:

  <full-dialect-name>/<dialect-namespace>Ops.{h|cpp|td}

Builtin and standard dialects are specially treated, though. Both of them do
not have dialect namespace; the former is still named as BuiltinOps.* and the
latter is named as Ops.*.

Purely mechanical. NFC.

PiperOrigin-RevId: 236371358
2019-03-29 16:53:19 -07:00
Uday Bondhugula 8254aabd4a A simple pass to detect and mark all parallel loops
- detect all parallel loops based on dep information and mark them with a
  "parallel" attribute
- add mlir::isLoopParallel(OpPointer<AffineForOp> ...), and refactor an existing method
  to use that (reuse some code from @andydavis (cl/236007073) for this)
- a simple/meaningful way to test memref dep test as well

Ex:

$ mlir-opt -detect-parallel test/Transforms/parallelism-detection.mlir
#map1 = ()[s0] -> (s0)
func @foo(%arg0: index) {
  %0 = alloc() : memref<1024x1024xvector<64xf32>>
  %1 = alloc() : memref<1024x1024xvector<64xf32>>
  %2 = alloc() : memref<1024x1024xvector<64xf32>>
  for %i0 = 0 to %arg0 {
    for %i1 = 0 to %arg0 {
      for %i2 = 0 to %arg0 {
        %3 = load %0[%i0, %i2] : memref<1024x1024xvector<64xf32>>
        %4 = load %1[%i2, %i1] : memref<1024x1024xvector<64xf32>>
        %5 = load %2[%i0, %i1] : memref<1024x1024xvector<64xf32>>
        %6 = mulf %3, %4 : vector<64xf32>
        %7 = addf %5, %6 : vector<64xf32>
        store %7, %2[%i0, %i1] : memref<1024x1024xvector<64xf32>>
      } {parallel: false}
    } {parallel: true}
  } {parallel: true}
  return
}

PiperOrigin-RevId: 236367368
2019-03-29 16:53:03 -07:00
MLIR Team d038e34735 Loop fusion for input reuse.
*) Breaks fusion pass into multiple sub passes over nodes in data dependence graph:
- first pass fuses single-use producers into their unique consumer.
- second pass enables fusing for input-reuse by fusing sibling nodes which read from the same memref, but which do not share dependence edges.
- third pass fuses remaining producers into their consumers (Note that the sibling fusion pass may have transformed a producer with multiple uses into a single-use producer).
*) Fusion for input reuse is enabled by computing a sibling node slice using the load/load accesses to the same memref, and fusion safety is guaranteed by checking that the sibling node memref write region (to a different memref) is preserved.
*) Enables output vector and output matrix computations from KFAC patches-second-moment operation to fuse into a single loop nest and reuse input from the image patches operation.
*) Adds a generic loop utilitiy for finding all sequential loops in a loop nest.
*) Adds and updates unit tests.

PiperOrigin-RevId: 236350987
2019-03-29 16:52:35 -07:00
River Riddle 269c872ee8 Add support for parsing and printing affine.if and affine.for attributes. The attribute dictionaries are printed after the final block list for both operations:
for %i = 0 to 10 {
     ...
  } {some_attr: true}

  if () : () () {
    ...
  } {some_attr: true}

  if () : () () {
    ...
  } else {
    ...
  } {some_attr: true}

PiperOrigin-RevId: 236346983
2019-03-29 16:52:19 -07:00
Uday Bondhugula 6ef5fc582e Method to align/merge dimensional/symbolic identifiers between two FlatAffineConstraints
- add a method to merge and align the spaces (identifiers) of two
  FlatAffineConstraints (both get dimension-wise and symbol-wise unique
  columns)
- this completes several TODOs, gets rid of previous assumptions/restrictions
  in composeMap, unionBoundingBox, and reuses common code
- remove previous workarounds / duplicated funcitonality in
  FlatAffineConstraints::composeMap and unionBoundingBox, use mergeAlignIds
  from both

PiperOrigin-RevId: 236320581
2019-03-29 16:51:47 -07:00
Alex Zinenko 4bd5d28391 EDSC bindings: expose generic Op construction interface
EDSC Expressions can now be used to build arbitrary MLIR operations identified
by their canonical name, i.e. the name obtained from
`OpClass::getOperationName()` for registered operations.  Expose this
functionality to the C API and Python bindings.  This exposes builder-level
interface to Python and avoids the need for experimental Python code to
implement EDSC free function calls for constructing each op type.

This modification required exposing mlir::Attribute to the C API and Python
bindings, which only supports integer attributes for now.

This is step 4/n to making EDSCs more generalizable.

PiperOrigin-RevId: 236306776
2019-03-29 16:51:32 -07:00
River Riddle ddc6788cc7 Provide a Builder::getNamedAttr and (Instruction|Function)::setAttr(StringRef, Attribute) to simplify attribute manipulation.
PiperOrigin-RevId: 236222504
2019-03-29 16:50:59 -07:00
River Riddle ed5fe2098b Remove PassResult and have the runOnFunction/runOnModule functions return void instead. To signal a pass failure, passes should now invoke the 'signalPassFailure' method. This provides the equivalent functionality when needed, but isn't an intrusive part of the API like PassResult.
PiperOrigin-RevId: 236202029
2019-03-29 16:50:44 -07:00
River Riddle db1757f858 Add support for named function argument attributes. The attribute dictionary is printed after the argument type:
func @arg_attrs(i32 {arg_attr: 10})

func @arg_attrs(%arg0: i32 {arg_attr: 10})

PiperOrigin-RevId: 236136830
2019-03-29 16:50:15 -07:00
Alex Zinenko 8cc50208a6 LLVM IR Dialect: unify call and call0 operations
When the LLVM IR dialect was implemented, TableGen operation definition scheme
did not support operations with variadic results.  Therefore, the `call`
instruction was split into `call` and `call0` for the single- and zero-result
calls (LLVM does not support multi-result operations).  Unify `call` and
`call0` using the recently added TableGen support for operations with Variadic
results.  Explicitly verify that the new operation has 0 or 1 results.  As a
side effect, this change enables clean-ups in the conversion to the LLVM IR
dialect that no longer needs to rely on wrapped LLVM IR void types when
constructing zero-result calls.

PiperOrigin-RevId: 236119197
2019-03-29 16:49:59 -07:00
River Riddle 300e4126c5 Move the PassExecutor and ModuleToFunctionPassAdaptor classes from PassManager.h to Pass.cpp. This allows for us to remove a dependency on Pass.h from PassManager.h.
PiperOrigin-RevId: 236029339
2019-03-29 16:49:15 -07:00
River Riddle 303b768579 Add a generic getValue to ElementsAttr for accessing a value at a given index.
PiperOrigin-RevId: 236013669
2019-03-29 16:48:59 -07:00
River Riddle 1c1767621c Remove the stubs for getValue from DenseIntElementsAttr and DenseFPElementsAttr as they aren't implemented. The type for the index is also wrong.
PiperOrigin-RevId: 236010720
2019-03-29 16:48:44 -07:00
River Riddle 091ff3dc3f Add support for registering pass pipelines to the PassRegistry. This is done by providing a static registration facility PassPipelineRegistration that works similarly to PassRegistration except for it also takes a function that will add necessary passes to a provided PassManager.
void pipelineBuilder(PassManager &pm) {
      pm.addPass(new MyPass());
      pm.addPass(new MyOtherPass());
  }

  static PassPipelineRegistration Unused("unused", "Unused pass", pipelineBuilder);

This is also useful for registering specializations of existing passes:

  Pass *createFooPass10() { return new FooPass(10); }

  static PassPipelineRegistration Unused("unused", "Unused pass", createFooPass10);

PiperOrigin-RevId: 235996282
2019-03-29 16:48:29 -07:00
Jacques Pienaar e31c23853b Fix incorrect line split in header guard.
PiperOrigin-RevId: 235994785
2019-03-29 16:48:14 -07:00
Uday Bondhugula a003179367 Detect more trivially redundant constraints better
- detect more trivially redundant constraints in
  FlatAffineConstraints::removeTrivialRedundantConstraints. Redundancy due to
  constraints that only differ in the constant part (eg., 32i + 64j - 3 >= 0, 32 +
  64j - 8 >= 0) is now detected.  The method is still linear-time and does
  a single scan over the FlatAffineConstraints buffer. This detection is useful
  and needed to eliminate redundant constraints generated after FM elimination.

- update GCDTightenInequalities so that we also normalize by the GCD while at
  it. This way more constraints will show up as redundant (232i - 203 >= 0
  becomes i - 1 >= 0 instead of 232i - 232 >= 0) without having to call
  normalizeConstraintsByGCD.

- In FourierMotzkinEliminate, call GCDTightenInequalities and
  normalizeConstraintsByGCD before calling removeTrivialRedundantConstraints()
  - so that more redundant constraints are detected. As a result, redundancy
    due to constraints like i - 5 >= 0, i - 7 >= 0, 2i - 5 >= 0, 232i - 203 >=
    0 is now detected (here only i >= 7 is non-redundant).

As a result of these, a -memref-bound-check on the added test case runs in 16ms
instead of 1.35s (opt build) and no longer returns a conservative result.

PiperOrigin-RevId: 235983550
2019-03-29 16:47:59 -07:00
MLIR Team c2766f3760 Fix bug in memref region computation with slice loop bounds. Adds loop IV values to ComputationSliceState which are used in FlatAffineConstraints::addSliceBounds, to ensure that constraints are only added for loop IV values which are present in the constraint system.
PiperOrigin-RevId: 235952912
2019-03-29 16:47:29 -07:00
River Riddle c6c534493d Port all of the existing passes over to the new pass manager infrastructure. This is largely NFC.
PiperOrigin-RevId: 235952357
2019-03-29 16:47:14 -07:00
River Riddle 6067cdebaa Implement the initial pass management functionality.
The definitions of derived passes have now changed and passes must adhere to the following:

* Inherit from a CRTP base class FunctionPass/ModulePass.
   - This class provides several necessary utilities for the transformation:
       . Access to the IR unit being transformed (getFunction/getModule)
       . Various utilities for pass identification and registration.

* Provide a 'PassResult runOn(Function|Module)()' method to transform the IR.
   - This replaces the runOn* functions from before.

This patch also introduces the notion of the PassManager. This allows for simplified construction of pass pipelines and acts as the sole interface for executing passes. This is important as FunctionPass will no longer have a 'runOnModule' method.

PiperOrigin-RevId: 235952008
2019-03-29 16:46:59 -07:00
Lei Zhang 9e18783e41 [TableGen] Add more scalar integer and floating-point types
PiperOrigin-RevId: 235918286
2019-03-29 16:46:26 -07:00
Ben Vanik d3918fc8cd Adding an IREE type kind range definition.
PiperOrigin-RevId: 235849609
2019-03-29 16:45:55 -07:00
River Riddle 302fb03961 Add a new class NamedAttributeList to deduplicate named attribute handling between Function and Instruction.
PiperOrigin-RevId: 235830304
2019-03-29 16:45:40 -07:00
River Riddle 3b3e11da93 Validate the names of attribute, dialect, and functions during verification. This essentially enforces the parsing rules upon their names.
PiperOrigin-RevId: 235818842
2019-03-29 16:44:53 -07:00
Uday Bondhugula d4b3ff1096 Loop fusion comand line options cleanup
- clean up loop fusion CL options for promoting local buffers to fast memory
  space
- add parameters to loop fusion pass instantiation

PiperOrigin-RevId: 235813419
2019-03-29 16:44:38 -07:00
River Riddle 79944e5eef Add a Function::isExternal utility to simplify checks for external functions.
PiperOrigin-RevId: 235746553
2019-03-29 16:43:50 -07:00
River Riddle cdbfd48471 Rewrite the dominance info classes to allow for operating on arbitrary control flow within operation regions. The CSE pass is also updated to properly handle nested dominance.
PiperOrigin-RevId: 235742627
2019-03-29 16:43:35 -07:00
Alex Zinenko 1da1b4c321 LLVM IR dialect and translation: support conditional branches with arguments
Since the goal of the LLVM IR dialect is to reflect LLVM IR in MLIR, the
dialect and the conversion procedure must account for the differences betweeen
block arguments and LLVM IR PHI nodes. In particular, LLVM IR disallows PHI
nodes with different values coming from the same source. Therefore, the LLVM IR
dialect now disallows `cond_br` operations that have identical successors
accepting arguments, which would lead to invalid PHI nodes. The conversion
process resolves the potential PHI source ambiguity by injecting dummy blocks
if the same block is used more than once as a successor in an instruction.
These dummy blocks branch unconditionally to the original successors, pass them
the original operands (available in the dummy block because it is dominated by
the original block) and are used instead of them in the original terminator
operation.

PiperOrigin-RevId: 235682798
2019-03-29 16:43:05 -07:00
Uday Bondhugula b269481106 Cleanup post cl/235283610 - NFC
- remove stale comments + cleanup
- drop MLIRContext * field from expr flattener

PiperOrigin-RevId: 235621178
2019-03-29 16:42:20 -07:00
River Riddle b4f033f6c6 Convert the dialect type parse/print hooks into virtual functions on the Dialect class.
PiperOrigin-RevId: 235589945
2019-03-29 16:42:05 -07:00
River Riddle f1f86eac60 Add support for constructing DenseIntElementsAttr with an array of APInt and
DenseFPElementsAttr with an array of APFloat.

PiperOrigin-RevId: 235581794
2019-03-29 16:41:50 -07:00
Lei Zhang 3f644705eb [TableGen] Use ArrayRef instead of SmallVectorImpl for suitable method
PiperOrigin-RevId: 235577399
2019-03-29 16:41:35 -07:00
Nicolas Vasilache 62c54a2ec4 Add a stripmineSink and imperfectly nested tiling primitives.
This CL adds a primitive to perform stripmining of a loop by a given factor and
sinking it under multiple target loops.
In turn this is used to implement imperfectly nested loop tiling (with interchange) by repeatedly calling the stripmineSink primitive.

The API returns the point loops and allows repeated invocations of tiling to achieve declarative, multi-level, imperfectly-nested tiling.

Note that this CL is only concerned with the mechanical aspects and does not worry about analysis and legality.

The API is demonstrated in an example which creates an EDSC block, emits the corresponding MLIR and applies imperfectly-nested tiling:

```cpp
    auto block = edsc::block({
      For(ArrayRef<edsc::Expr>{i, j}, {zero, zero}, {M, N}, {one, one}, {
        For(k1, zero, O, one, {
          C({i, j, k1}) = A({i, j, k1}) + B({i, j, k1})
        }),
        For(k2, zero, O, one, {
          C({i, j, k2}) = A({i, j, k2}) + B({i, j, k2})
        }),
      }),
    });
    // clang-format on
    emitter.emitStmts(block.getBody());

    auto l_i = emitter.getAffineForOp(i), l_j = emitter.getAffineForOp(j),
         l_k1 = emitter.getAffineForOp(k1), l_k2 = emitter.getAffineForOp(k2);
    auto indicesL1 = mlir::tile({l_i, l_j}, {512, 1024}, {l_k1, l_k2});
    auto l_ii1 = indicesL1[0][0], l_jj1 = indicesL1[1][0];
    mlir::tile({l_jj1, l_ii1}, {32, 16}, l_jj1);
```

The edsc::Expr for the induction variables (i, j, k_1, k_2) provide the programmatic hooks from which tiling can be applied declaratively.

PiperOrigin-RevId: 235548228
2019-03-29 16:41:20 -07:00
Alex Zinenko e7193a70f8 EDSC: support conditional branch instructions
Leverage the recently introduced support for multiple argument groups and
multiple destination blocks in EDSC Expressions to implement conditional
branches in EDSC.  Conditional branches have two successors and three argument
groups.  The first group contains a single expression of i1 type that
corresponds to the condition of the branch.  The two following groups contain
arguments of the two successors of the conditional branch instruction, in the
same order as the successors.  Expose this instruction to the C API and Python
bindings.

PiperOrigin-RevId: 235542768
2019-03-29 16:41:05 -07:00
Alex Zinenko 83e8db2193 EDSC: support branch instructions
The new implementation of blocks was designed to support blocks with arguments.
More specifically, StmtBlock can be constructed with a list of Bindables that
will be bound to block aguments upon construction.  Leverage this functionality
to implement branch instructions with arguments.

This additionally requires the statement storage to have a list of successors,
similarly to core IR operations.

Becauase successor chains can form loops, we need a possibility to decouple
block declaration, after which it becomes usable by branch instructions, from
block body definition.  This is achieved by creating an empty block and by
resetting its body with a new list of instructions.  Note that assigning a
block from another block will not affect any instructions that may have
designated this block as their successor (this behavior is necessary to make
value-type semantics of EDSC types consistent).  Combined, one can now write
generators like

    EDSCContext context;
    Type indexType = ...;
    Bindable i(indexType), ii(indexType), zero(indexType), one(indexType);
    StmtBlock loopBlock({i}, {});
    loopBlock.set({ii = i + one,
                   Branch(loopBlock, {ii})});
    MLIREmitter(&builder)
        .bindConstant<ConstantIndexOp>(zero, 0)
        .bindConstant<ConstantIndexOp>(one, 1)
	.emitStmt(Branch(loopBlock, {zero}));

where the emitter will emit the statement and its successors, if present.

PiperOrigin-RevId: 235541892
2019-03-29 16:40:50 -07:00
Tatiana Shpeisman 8b99d1bdbf Use dialect hook registration for constant folding hook.
Deletes specialized mechanism for registering constant folding hook and uses dialect hooks registration mechanism instead.

PiperOrigin-RevId: 235535410
2019-03-29 16:40:35 -07:00
River Riddle a51d21538c Add constant folding for ExtractElementOp when the aggregate is an OpaqueElementsAttr.
PiperOrigin-RevId: 235533283
2019-03-29 16:40:20 -07:00
Uday Bondhugula dfe07b7bf6 Refactor AffineExprFlattener and move FlatAffineConstraints out of IR into
Analysis - NFC

- refactor AffineExprFlattener (-> SimpleAffineExprFlattener) so that it
  doesn't depend on FlatAffineConstraints, and so that FlatAffineConstraints
  could be moved out of IR/; the simplification that the IR needs for
  AffineExpr's doesn't depend on FlatAffineConstraints
- have AffineExprFlattener derive from SimpleAffineExprFlattener to use for
  all Analysis/Transforms purposes; override addLocalFloorDivId in the derived
  class

- turn addAffineForOpDomain into a method on FlatAffineConstraints
- turn AffineForOp::getAsValueMap into an AffineValueMap ctor

PiperOrigin-RevId: 235283610
2019-03-29 16:39:32 -07:00
Stella Laurenzo c81b16e279 Spike to define real math ops and lowering of one variant of add to corresponding integer ops.
The only reason in starting with a fixedpoint add is that it is the absolute simplest variant and illustrates the level of abstraction I'm aiming for.

The overall flow would be:
  1. Determine quantization parameters (out of scope of this cl).
  2. Source dialect rules to lower supported math ops to the quantization dialect (out of scope of this cl).
  3. Quantization passes: [-quant-convert-const, -quant-lower-uniform-real-math, -quant-lower-unsupported-to-float] (the last one not implemented yet)
  4. Target specific lowering of the integral arithmetic ops (roughly at the level of gemmlowp) to more fundamental operations (i.e. calls to gemmlowp, simd instructions, DSP instructions, etc).

How I'm doing this should facilitate implementation of just about any kind of backend except TFLite, which has a very course, adhoc surface area for its quantized kernels. Options there include (I'm not taking an opinion on this - just trying to provide options):
  a) Not using any of this: just match q/dbarrier + tf math ops to the supported TFLite quantized op set.
  b) Implement the more fundamental integer math ops on TFLite and convert to those instead of the current op set.

Note that I've hand-waved over the process of choosing appropriate quantization parameters. Getting to that next. As you can see, different implementations will likely have different magic combinations of specific math support, and we will need the target system that has been discussed for some of the esoteric cases (i.e. many DSPs only support POT fixedpoint).

Two unrelated changes to the overall goal of this CL and can be broken out of desired:
  - Adding optional attribute support to TabelGen
  - Allowing TableGen native rewrite hooks to return nullptr, signalling that no rewrite has been done.

PiperOrigin-RevId: 235267229
2019-03-29 16:39:13 -07:00
River Riddle f48716146e NFC: Make DialectConversion not directly inherit from ModulePass. It is now just a utility class that performs dialect conversion on a provided module.
PiperOrigin-RevId: 235194067
2019-03-29 16:38:57 -07:00
River Riddle 5410dff790 Rewrite MLPatternLoweringPass to no longer inherit from FunctionPass and just provide a utility function that applies ML patterns.
PiperOrigin-RevId: 235194034
2019-03-29 16:38:41 -07:00
River Riddle 3e656599f1 Define a PassID class to use when defining a pass. This allows for the type used for the ID field to be self documenting. It also allows for the compiler to know the set alignment of the ID object, which is useful for storing pointer identifiers within llvm data structures.
PiperOrigin-RevId: 235107957
2019-03-29 16:37:12 -07:00
Sergei Lebedev 1cc9305c71 Exposed division and remainder operations in EDSC
This change introduces three new operators in EDSC: Div (also exposed
via Expr.__div__ aka /) -- floating-point division, FloorDiv and CeilDiv
for flooring/ceiling index division.

The lowering to LLVM will be implemented in b/124872679.

PiperOrigin-RevId: 234963217
2019-03-29 16:36:41 -07:00
Alex Zinenko 59a209721e EDSC: support call instructions
Introduce support for binding MLIR functions as constant expressions.  Standard
constant operation supports functions as possible constant values.

Provide C APIs to look up existing named functions in an MLIR module and expose
them to the Python bindings.  Provide Python bindings to declare a function in
an MLIR module without defining it and to add a definition given a function
declaration.  These declarations are useful when attempting to link MLIR
modules with, e.g., the standard library.

Introduce EDSC support for direct and indirect calls to other MLIR functions.
Internally, an indirect call is always emitted to leverage existing support for
delayed construction of MLIR Values using EDSC Exprs.  If the expression is
bound to a constant function (looked up or declared beforehand), MLIR constant
folding will be able to replace an indirect call by a direct call.  Currently,
only zero- and one-result functions are supported since we don't have support
for multi-valued expressions in EDSC.

Expose function calling interface to Python bindings on expressions by defining
a `__call__` function accepting a variable number of arguments.

PiperOrigin-RevId: 234959444
2019-03-29 16:36:26 -07:00
Jacques Pienaar 1725b485eb Create OpTrait base class & allow operation predicate OpTraits.
* Introduce a OpTrait class in C++ to wrap the TableGen definition;
* Introduce PredOpTrait and rename previous usage of OpTrait to NativeOpTrait;
* PredOpTrait allows specifying a trait of the operation by way of predicate on the operation. This will be used in future to create reusable set of trait building blocks in the definition of operations. E.g., indicating whether to operands have the same type and allowing locally documenting op requirements by trait composition.
  - Some of these building blocks could later evolve into known fixed set as LLVMs backends do, but that can be considered with more data.
* Use the modelling to address one verify TODO in a very local manner.

This subsumes the current custom verify specification which will be removed in a separate mechanical CL.

PiperOrigin-RevId: 234827169
2019-03-29 16:35:11 -07:00
Alex Zinenko 0a95aac7c7 Allow Builder to create function-type constants
A recent change made ConstantOp::build accept a NumericAttr or assert that a
generic Attribute is in fact a NumericAttr.  The rationale behind the change
was that NumericAttrs have a type that can be used as the result type of the
constant operation.  FunctionAttr also has a type, and it is valid to construct
function-typed constants as exercised by the parser.mlir test.  Relax
ConstantOp::build back to take a generic Attribute.  In the overload that only
takes an attribute, assert that the Attribute is either a NumericAttr or a
FunctionAttr, because it is necessary to extract the type.  In the overload
that takes both type type and the attribute, delegate the attribute type
checking to ConstantOp::verify to prevent non-Builder-based Op construction
mechanisms from creating invalid IR.
PiperOrigin-RevId: 234798569
2019-03-29 16:34:41 -07:00
Alex Zinenko 21bd4540f3 EDSC: introduce min/max only usable inside for upper/lower bounds of a loop
Introduce a type-safe way of building a 'for' loop with max/min bounds in EDSC.
Define new types MaxExpr and MinExpr in C++ EDSC API and expose them to Python
bindings.  Use values of these type to construct 'for' loops with max/min in
newly introduced overloads of the `edsc::For` factory function.  Note that in C
APIs, we still must expose MaxMinFor as a different function because C has no
overloads.  Also note that MaxExpr and MinExpr do _not_ derive from Expr
because they are not allowed to be used in a regular Expr context (which may
produce `affine.apply` instructions not expecting `min` or `max`).

Factory functions `Min` and `Max` in Python can be further overloaded to
produce chains of comparisons and selects on non-index types.  This is not
trivial in C++ since overloaded functions cannot differ by the return type only
(`MaxExpr` or `Expr`) and making `MaxExpr` derive from `Expr` defies the
purpose of type-safe construction.

PiperOrigin-RevId: 234786131
2019-03-29 16:34:11 -07:00
Alex Zinenko d055a4e100 EDSC: support multi-expression loop bounds
MLIR supports 'for' loops with lower(upper) bound defined by taking a
maximum(minimum) of a list of expressions, but does not have first-class affine
constructs for the maximum(minimum).  All these expressions must have affine
provenance, similarly to a single-expression bound.  Add support for
constructing such loops using EDSC.  The expression factory function is called
`edsc::MaxMinFor` to (1) highlight that the maximum(minimum) operation is
applied to the lower(upper) bound expressions and (2) differentiate it from a
`edsc::For` that creates multiple perfectly nested loops (and should arguably
be called `edsc::ForNest`).

PiperOrigin-RevId: 234785996
2019-03-29 16:33:56 -07:00
Alex Zinenko a2a433652d EDSC: create constants as expressions
Introduce a functionality to create EDSC expressions from typed constants.
This complements the current functionality that uses "unbound" expressions and
binds them to a specific constant before emission.  It comes in handy in cases
where we want to check if something is a constant early during construciton
rather than late during emission, for example multiplications and divisions in
affine expressions.  This is also consistent with MLIR vision of constants
being defined by an operation (rather than being special kinds of values in the
IR) by exposing this operation as EDSC expression.

PiperOrigin-RevId: 234758020
2019-03-29 16:33:41 -07:00
Uday Bondhugula a1dad3a5d9 Extend/improve getSliceBounds() / complete TODO + update unionBoundingBox
- compute slices precisely where the destination iteration depends on multiple source
  iterations (instead of over-approximating to the whole source loop extent)
- update unionBoundingBox to deal with input with non-matching symbols
- reenable disabled backend test case

PiperOrigin-RevId: 234714069
2019-03-29 16:33:11 -07:00
River Riddle 48ccae2476 NFC: Refactor the files related to passes.
* PassRegistry is split into its own source file.
* Pass related files are moved to a new library 'Pass'.

PiperOrigin-RevId: 234705771
2019-03-29 16:32:56 -07:00
River Riddle da0ebe0670 Add a generic pattern matcher for matching constant values produced by an operation with zero operands and a single result.
PiperOrigin-RevId: 234616691
2019-03-29 16:31:56 -07:00
Alex Zinenko 05f37d52d0 EDSC: clean up type casting mechanism
Originally, edsc::Expr had a long enum edsc::ExprKind with all supported types
of operations.  Recent Expr extensibility support removed the need to specify
supported types in advance.  Replace the no-longer-used blocks of enum values
reserved for unary/binary/ternary/variadic expressions with simple values (it
is still useful to know if an expression is, e.g., binary to access it through
a simpler API).

Furthermore, wrap string-comparison now used to identify specific ops into an
`Expr::is_op<>` function template, that acts similarly to `Instruction::isa<>`.
Introduce `{Unary,Binary,Ternary,Variadic}Expr::make<> ` function template that
creates a Expression emitting the MLIR Op specified as template argument.

PiperOrigin-RevId: 234612916
2019-03-29 16:31:41 -07:00
Alex Zinenko b4dba895a6 EDSC: make Expr typed and extensible
Expose the result types of edsc::Expr, which are now stored for all types of
Exprs and not only for the variadic ones.  Require return types when an Expr is
constructed, if it will ever have some.  An empty return type list is
interpreted as an Expr that does not create a value (e.g. `return` or `store`).

Conceptually, all edss::Exprs are now typed, with the type being a (potentially
empty) tuple of return types.  Unbound expressions and Bindables must now be
constructed with a specific type they will take.  This makes EDSC less
evidently type-polymorphic, but we can still write generic code such as

    Expr sumOfSquares(Expr lhs, Expr rhs) { return lhs * lhs + rhs * rhs; }

and use it to construct different typed expressions as

    sumOfSquares(Bindable(IndexType::get(ctx)), Bindable(IndexType::get(ctx)));
    sumOfSquares(Bindable(FloatType::getF32(ctx)),
                 Bindable(FloatType::getF32(ctx)));

On the positive side, we get the following.
1. We can now perform type checking when constructing Exprs rather than during
   MLIR emission.  Nevertheless, this is still duplicates the Op::verify()
   until we can factor out type checking from that.
2. MLIREmitter is significantly simplified.
3. ExprKind enum is only used for actual kinds of expressions.  Data structures
   are converging with AbstractOperation, and the users can now create a
   VariadicExpr("canonical_op_name", {types}, {exprs}) for any operation, even
   an unregistered one without having to extend the enum and make pervasive
   changes to EDSCs.

On the negative side, we get the following.
1. Typed bindables are more verbose, even in Python.
2. We lose the ability to do print debugging for higher-level EDSC abstractions
   that are implemented as multiple MLIR Ops, for example logical disjunction.

This is the step 2/n towards making EDSC extensible.

***

Move MLIR Op construction from MLIREmitter::emitExpr to Expr::build since Expr
now has sufficient information to build itself.

This is the step 3/n towards making EDSC extensible.

Both of these strive to minimize the amount of irrelevant changes.  In
particular, this introduces more complex pretty-printing for affine and binary
expression to make sure tests continue to pass.  It also relies on string
comparison to identify specific operations that an Expr produces.

PiperOrigin-RevId: 234609882
2019-03-29 16:31:26 -07:00
Lei Zhang e0fc503896 [TableGen] Support using Variadic<Type> in results
This CL extended TableGen Operator class to provide accessors for information on op
results.

In OpDefinitionGen, added checks to make sure only the last result can be variadic,
and adjusted traits and builders generation to consider variadic results.

PiperOrigin-RevId: 234596124
2019-03-29 16:31:11 -07:00
Alex Zinenko 0a4c940c1b EDSC: introduce support for blocks
EDSC currently implement a block as a statement that is itself a list of
statements.  This suffers from two modeling problems: (1) these blocks are not
addressable, i.e. one cannot create an instruction where thus constructed block
is a successor; (2) they support block nesting, which is not supported by MLIR
blocks.  Furthermore, emitting such "compound statement" (misleadingly named
`Block` in Python bindings) does not actually produce a new Block in the IR.

Implement support for creating actual IR Blocks in EDSC.  In particular, define
a new StmtBlock EDSC class that is neither an Expr nor a Stmt but contains a
list of Stmts.  Additionally, StmtBlock may have (early-) typed arguments.
These arguments are Bindable expressions that can be used inside the block.
Provide two calls in the MLIREmitter, `emitBlock` that actually emits a new
block and `emitBlockBody` that only emits the instructions contained in the
block without creating a new block.  In the latter case, the instructions must
not use block arguments.

Update Python bindings to make it clear when instruction emission happens
without creating a new block.

PiperOrigin-RevId: 234556474
2019-03-29 16:30:56 -07:00
Lei Zhang 911b9960ba [TableGen] Fix discrepancy between parameter meaning and code logic
The parameter to emitStandaloneParamBuilder() was renamed from hasResultType to
isAllSameType, which is the opposite boolean value. The logic should be changed
to make them consistent.

Also re-ordered some methods in Operator. And few other tiny improvements.

PiperOrigin-RevId: 234478316
2019-03-29 16:30:41 -07:00
Uday Bondhugula f97c1c5b06 Misc. updates/fixes to analysis utils used for DMA generation; update DMA
generation pass to make it drop certain assumptions, complete TODOs.

- multiple fixes for getMemoryFootprintBytes
  - pass loopDepth correctly from getMemoryFootprintBytes()
  - use union while computing memory footprints

- bug fixes for addAffineForOpDomain
  - take into account loop step
  - add domains of other loop IVs in turn that might have been used in the bounds

- dma-generate: drop assumption of "non-unit stride loops being tile space loops
  and skipping those and recursing to inner depths"; DMA generation is now purely
  based on available fast mem capacity and memory footprint's calculated

- handle memory region compute failures/bailouts correctly from dma-generate

- loop tiling cleanup/NFC

- update some debug and error messages to use emitNote/emitError in
  pipeline-data-transfer pass - NFC

PiperOrigin-RevId: 234245969
2019-03-29 16:30:26 -07:00
Alex Zinenko 4bb31f7377 ExecutionEngine: provide utils for running CLI-configured LLVM passes
A recent change introduced a possibility to run LLVM IR transformation during
JIT-compilation in the ExecutionEngine.  Provide helper functions that
construct IR transformers given either clang-style optimization levels or a
list passes to run.  The latter wraps the LLVM command line option parser to
parse strings rather than actual command line arguments.  As a result, we can
run either of

    mlir-cpu-runner -O3 input.mlir
    mlir-cpu-runner -some-mlir-pass -llvm-opts="-llvm-pass -other-llvm-pass"

to combine different transformations.  The transformer builder functions are
provided as a separate library that depends on LLVM pass libraries unlike the
main execution engine library.  The library can be used for integrating MLIR
execution engine into external frameworks.

PiperOrigin-RevId: 234173493
2019-03-29 16:29:41 -07:00
MLIR Team 8f5f2c765d LoopFusion: perform a series of loop interchanges to increase the loop depth at which slices of producer loop nests can be fused into constumer loop nests.
*) Adds utility to LoopUtils to perform loop interchange of two AffineForOps.
*) Adds utility to LoopUtils to sink a loop to a specified depth within a loop nest, using a series of loop interchanges.
*) Computes dependences between all loads and stores in the loop nest, and classifies each loop as parallel or sequential.
*) Computes loop interchange permutation required to sink sequential loops (and raise parallel loop nests) while preserving relative order among them.
*) Checks each dependence against the permutation to make sure that dependences would not be violated by the loop interchange transformation.
*) Calls loop interchange in LoopFusion pass on consumer loop nests before fusing in producers, sinking loops with loop carried dependences deeper into the consumer loop nest.
*) Adds and updates related unit tests.

PiperOrigin-RevId: 234158370
2019-03-29 16:29:26 -07:00
Lei Zhang 081299333b [TableGen] Rename Operand to Value to prepare sharing between operand and result
We specify op operands and results in TableGen op definition using the same syntax.
They should be modelled similarly in TableGen driver wrapper classes.

PiperOrigin-RevId: 234153332
2019-03-29 16:29:11 -07:00
Alex Zinenko d7aa700ccb Dialect conversion: decouple function signature conversion from type conversion
Function types are built-in in MLIR and affect the validity of the IR itself.
However, advanced target dialects such as the LLVM IR dialect may include
custom function types.  Until now, dialect conversion was expecting function
types not to be converted to the custom type: although the signatures was
allowed to change, the outer type must have been an mlir::FunctionType.  This
effectively prevented dialect conversion from creating instructions that
operate on values of the custom function type.

Dissociate function signature conversion from general type conversion.
Function signature conversion must still produce an mlir::FunctionType and is
used in places where built-in types are required to make IR valid.  General
type conversion is used for SSA values, including function and block arguments
and function results.

Exercise this behavior in the LLVM IR dialect conversion by converting function
types to LLVM IR function pointer types.  The pointer to a function is chosen
to provide consistent lowering of higher-order functions: while it is possible
to have a value of function type, it is not possible to create a function type
accepting a returning another function type.

PiperOrigin-RevId: 234124494
2019-03-29 16:28:41 -07:00
MLIR Team affb2193cc Update direction vector computation to use FlatAffineConstraints::getLower/UpperBounds.
Update FlatAffineConstraints::getLower/UpperBounds to project to the identifier for which bounds are being computed. This change enables computing bounds on an identifier which were previously dependent on the bounds of another identifier.

PiperOrigin-RevId: 234017514
2019-03-29 16:28:25 -07:00
Lei Zhang 93d8f14c0f [TFLite] Fuse AddOp into preceding convolution ops
If we see an add op adding a constant value to a convolution op with constant
bias, we can fuse the add into the convolution op by constant folding the
bias and the add op's constant operand.

This CL also removes dangling RewriterGen check that prevents us from using
nested DAG nodes in result patterns, which is already supported.

PiperOrigin-RevId: 233989654
2019-03-29 16:27:55 -07:00
Lei Zhang eb3f8dcb93 [TableGen] Use deduced result types for build() of suitable ops
For ops with the SameOperandsAndResultType trait, we know that all result types
should be the same as the first operand's type. So we can generate a build()
method without requiring result types as parameters and also invoke this method
when constructing such ops during expanding rewrite patterns.

Similarly for ops have broadcast behavior, we can define build() method to use
the deduced type as the result type. So we can also calling into this build()
method when constructing ops in RewriterGen.

PiperOrigin-RevId: 233988307
2019-03-29 16:27:40 -07:00
Jacques Pienaar 388fb3751e Add pattern constraints.
Enable matching pattern only if constraint is met. Start with type constraints and more general C++ constraints.

PiperOrigin-RevId: 233830768
2019-03-29 16:26:53 -07:00
Alex Zinenko bc184cff3f EDSC: unify Expr storage
EDSC expressions evolved to have different types of underlying storage.
Separate classes are used for unary, binary, ternary and variadic expressions.
The latter covers all the needs of the three special cases.  Remove these
special cases and use a single ExprStorage class everywhere while maintaining
the same APIs at the Expr level (ExprStorage is an internal implementation
class).

This is step 1/n to converging EDSC expressions and Ops and making EDSCs
support custom operations.

PiperOrigin-RevId: 233704912
2019-03-29 16:26:37 -07:00
River Riddle 2f11f86846 Add langref descriptions for the attribute values supported in MLIR.
PiperOrigin-RevId: 233661338
2019-03-29 16:26:08 -07:00
River Riddle 4755774d16 Make IndexType a standard type instead of a builtin. This also cleans up some unnecessary factory methods on the Type class.
PiperOrigin-RevId: 233640730
2019-03-29 16:25:38 -07:00
Alex Zinenko 8de7f6c471 LLVM IR Dialect: add select op and lower standard select to it
This is a similar one-to-one mapping.

PiperOrigin-RevId: 233621006
2019-03-29 16:25:23 -07:00
Alex Zinenko 0e59e5c49b EDSC: move Expr and Stmt construction operators to a namespace
In the current state, edsc::Expr and edsc::Stmt overload operators to construct
other Exprs and Stmts.  This includes some unconventional overloads of the
`operator==` to create a comparison expression and of the `operator!` to create
a negation expression.  This situation could lead to unpleasant surprises where
the code does not behave like expected.  Make all Expr and Stmt construction
operators free functions and move them to the `edsc::op` namespace.  Callers
willing to use these operators must explicitly include them with the `using`
declaration.  This can be done in some local scope.

Additionally, we currently emit signed comparisons for order-comparison
operators.  With namespaces, we can later introduce two sets of operators in
different namespace, e.g. `edsc::op::sign` and `edsc::op::unsign` to clearly
state which kind of comparison is implied.

PiperOrigin-RevId: 233578674
2019-03-29 16:25:08 -07:00
Tatiana Shpeisman 2e6cd60d3b Add dialect-specific decoding for opaque constants.
Associates opaque constants with a particular dialect. Adds general mechanism to register dialect-specific hooks defined in external components. Adds hooks to decode opaque tensor constant and extract an element of an opaque tensor constant.

This CL does not change the existing mechanism for registering constant folding hook yet. One thing at a time.

PiperOrigin-RevId: 233544757
2019-03-29 16:24:38 -07:00
Jacques Pienaar 4b88e7a245 Fix incorrect type in iterator.
PiperOrigin-RevId: 233542711
2019-03-29 16:24:23 -07:00
Uday Bondhugula 8b3f841daf Generate dealloc's for the alloc's of dma-generate.
- for the DMA buffers being allocated (and their tags), generate corresponding deallocs
- minor related update to replaceAllMemRefUsesWith and PipelineDataTransfer pass

Code generation for DMA transfers was being done with the initial simplifying
assumption that the alloc's would map to scoped allocations, and so no
deallocations would be necessary. Drop this assumption to generalize. Note that
even with scoped allocations, unrolling loops that have scoped allocations
could create a series of allocations and exhaustion of fast memory. Having a
end of lifetime marker like a dealloc in fact allows creating new scopes if
necessary when lowering to a backend and still utilize scoped allocation.
DMA buffers created by -dma-generate are guaranteed to have either
non-overlapping lifetimes or nested lifetimes.

PiperOrigin-RevId: 233502632
2019-03-29 16:24:08 -07:00
Lei Zhang a57b398906 [TableGen] Assign created ops to variables and rewrite with PatternRewriter::replaceOp()
Previously we were using PatternRewrite::replaceOpWithNewOp() to both create the new op
inline and rewrite the matched op. That does not work well if we want to generate multiple
ops in a sequence. To support that, this CL changed to assign each newly created op to a
separate variable.

This CL also refactors how PatternEmitter performs the directive dispatch logic.

PiperOrigin-RevId: 233206819
2019-03-29 16:22:53 -07:00
River Riddle 366ebcf6aa Remove the restriction that only registered terminator operations may terminate a block and have block operands. This allows for any operation to hold block operands. It also introduces the notion that unregistered operations may terminate a block. As such, the 'isTerminator' api on Instruction has been split into 'isKnownTerminator' and 'isKnownNonTerminator'.
PiperOrigin-RevId: 233076831
2019-03-29 16:22:23 -07:00
Uday Bondhugula c419accea3 Automated rollback of changelist 232728977.
PiperOrigin-RevId: 232944889
2019-03-29 16:21:38 -07:00
Smit Hinsu c201e6ef05 Handle dynamic shapes in Broadcastable op trait
That allows TensorFlow Add and Div ops to use Broadcastable op trait instead of
more restrictive SameValueType op trait.

That in turn allows TensorFlow ops to be registered by defining GET_OP_LIST and
including the generated ops file. Currently, tf-raise-control-flow pass tests
are using dynamic shapes in tf.Add op and AddOp can't be registered without
supporting the dynamic shapes.

TESTED with unit tests

PiperOrigin-RevId: 232927998
2019-03-29 16:21:23 -07:00
Jacques Pienaar 351eed0dd1 Add tf.LeakyRelu.
* Add tf.LeakyRelu op definition + folders (well one is really canonicalizer)
* Change generated error message to use attribute description instead;
* Change the return type of F32Attr to be APFloat - internally it is already
  stored as APFloat so let the caller decides if they want to convert it or
  not. I could see varying opinions here though :) (did not change i32attr
  similarly)

PiperOrigin-RevId: 232923358
2019-03-29 16:20:53 -07:00
Alex Zinenko 36c0516c78 Disallow zero dimensions in vectors and memrefs
Aggregate types where at least one dimension is zero do not fully make sense as
they cannot contain any values (their total size is zero).  However, TensorFlow
and XLA support tensors with zero sizes, so we must support those too.  This is
relatively safe since, unlike vectors and memrefs, we don't have first-class
element accessors for MLIR tensors.

To support sparse element attributes of vector types that have no non-zero
elements, make sure that index and value element attributes have tensor type so
that we never need to create a zero vector type internally.  Note that this is
already consistent with the inline documentation of the sparse elements
attribute.  Users of the sparse elements attribute should not rely on the
storage schema anyway.

PiperOrigin-RevId: 232896707
2019-03-29 16:20:38 -07:00
River Riddle a886625813 Modify the canonicalizations of select and muli to use the fold hook.
This also extends the greedy pattern rewrite driver to add the operands of folded operations back to the worklist.

PiperOrigin-RevId: 232878959
2019-03-29 16:20:06 -07:00
Alex Zinenko 8093f17a66 ExecutionEngine: provide a hook for LLVM IR passes
The current ExecutionEngine flow generates the LLVM IR from MLIR and
JIT-compiles it as is without any transformation.  It thus misses the
opportunity to perform optimizations supported by LLVM or collect statistics
about the module.  Modify the Orc JITter to perform transformations on the LLVM
IR.  Accept an optional LLVM module transformation function when constructing
the ExecutionEngine and use it while JIT-compiling.  This prevents MLIR
ExecutionEngine from depending on LLVM passes; its clients should depend on the
passes they require.

PiperOrigin-RevId: 232877060
2019-03-29 16:19:49 -07:00
Uday Bondhugula 4ba8c9147d Automated rollback of changelist 232717775.
PiperOrigin-RevId: 232807986
2019-03-29 16:19:33 -07:00
River Riddle fd2d7c857b Rename the 'if' operation in the AffineOps dialect to 'affine.if' and namespace
the AffineOps dialect with 'affine'.

PiperOrigin-RevId: 232728977
2019-03-29 16:18:59 -07:00
Lei Zhang 888b9fa8a6 Add constant build() method not requiring result type
Instead, we deduce the result type from the given attribute.

This is in preparation for generating constant ops with TableGen.

PiperOrigin-RevId: 232723467
2019-03-29 16:18:44 -07:00
Stella Laurenzo c78d708487 Implement Quantization dialect and minimal UniformQuantizedType.
PiperOrigin-RevId: 232723240
2019-03-29 16:18:29 -07:00
River Riddle 90d10b4e00 NFC: Rename the 'for' operation in the AffineOps dialect to 'affine.for'. The is the second step to adding a namespace to the AffineOps dialect.
PiperOrigin-RevId: 232717775
2019-03-29 16:17:59 -07:00
River Riddle 3227dee15d NFC: Rename affine_apply to affine.apply. This is the first step to adding a namespace to the affine dialect.
PiperOrigin-RevId: 232707862
2019-03-29 16:17:29 -07:00
MLIR Team b9dde91ea6 Adds the ability to compute the MemRefRegion of a sliced loop nest. Utilizes this feature during loop fusion cost computation, to compute what the write region of a fusion candidate loop nest slice would be (without having to materialize the slice or change the IR).
*) Adds parameter to public API of MemRefRegion::compute for passing in the slice loop bounds to compute the memref region of the loop nest slice.
*) Exposes public method MemRefRegion::getRegionSize for computing the size of the memref region in bytes.

PiperOrigin-RevId: 232706165
2019-03-29 16:17:15 -07:00
River Riddle 42a2d7d6e1 Remove findInstPositionInBlock from the Block api.
PiperOrigin-RevId: 232704766
2019-03-29 16:16:43 -07:00
Lei Zhang 1df6ca5053 [TableGen] Model variadic operands using Variadic<Type>
Previously, we were using the trait mechanism to specify that an op has variadic operands.
That led a discrepancy between how we handle ops with deterministic number of operands.
Besides, we have no way to specify the constraints and match against the variadic operands.

This CL introduced Variadic<Type> as a way to solve the above issues.

PiperOrigin-RevId: 232656104
2019-03-29 16:16:28 -07:00
River Riddle 0c65cf283c Move the AffineFor loop bound folding to a canonicalization pattern on the AffineForOp.
PiperOrigin-RevId: 232610715
2019-03-29 16:16:11 -07:00
River Riddle 10237de8eb Refactor the affine analysis by moving some functionality to IR and some to AffineOps. This is important for allowing the affine dialect to define canonicalizations directly on the operations instead of relying on transformation passes, e.g. ComposeAffineMaps. A summary of the refactoring:
* AffineStructures has moved to IR.

* simplifyAffineExpr/simplifyAffineMap/getFlattenedAffineExpr have moved to IR.

* makeComposedAffineApply/fullyComposeAffineMapAndOperands have moved to AffineOps.

* ComposeAffineMaps is replaced by AffineApplyOp::canonicalize and deleted.

PiperOrigin-RevId: 232586468
2019-03-29 16:15:41 -07:00
Smit Hinsu 2927297a1c Add derived type attributes for TensorFlow ops generated by TableGen
Motivation for this change is to remove redundant TF type attributes for
TensorFlow ops. For example, tf$T: "tfdtype$DT_FLOAT". Type attributes can be derived using the MLIR operand or result MLIR types, attribute names and their mapping. This will also allow constant folding of instructions generated within MLIR (and not imported from TensorFlow) without adding type attributes for the instruction.

Derived attributes are populated while exporting MLIR to TF GraphDef using
auto-generated populators. Populators are only available for the ops that are generated by the TableGen.

Also, fixed Operator::getNumArgs method to exclude derived attributes as they are not
part of the arguments.

TESTED with unit test

PiperOrigin-RevId: 232531561
2019-03-29 16:15:08 -07:00
Alex Zinenko 3fa22b88de Print non-default attribute types in optional attr dictionary
In optional attribute dictionary used, among others, in the generic form of the
ops, attribute types for integers and floats are omitted.  This could lead to
inconsistencies when round-tripping the IR, in particular the attributes are
created with incorrect types after parsing (integers default to i64, floats
default to f64).  Provide API to emit a trailing type after the attribute for
integers and floats.  Use it while printing the optional attribute dictionary.

Omitting types for i64 and f64 is a pragmatic decision that minimizes changes
in tests.  We may want to reconsider in the future and always print types of
attributes in the generic form.

PiperOrigin-RevId: 232480116
2019-03-29 16:14:05 -07:00
Sergei Lebedev 52ec65c85e Implemented __eq__ and __ne__ in EDSC Python bindings
PiperOrigin-RevId: 232473201
2019-03-29 16:13:34 -07:00
River Riddle bf9c381d1d Remove InstWalker and move all instruction walking to the api facilities on Function/Block/Instruction.
PiperOrigin-RevId: 232388113
2019-03-29 16:12:59 -07:00
River Riddle c9ad4621ce NFC: Move AffineApplyOp to the AffineOps dialect. This also moves the isValidDim/isValidSymbol methods from Value to the AffineOps dialect.
PiperOrigin-RevId: 232386632
2019-03-29 16:12:40 -07:00
Uday Bondhugula 0f50414fa4 Refactor common code getting memref access in getMemRefRegion - NFC
- use getAccessMap() instead of repeating it
- fold getMemRefRegion into MemRefRegion ctor (more natural, avoid heap
  allocation and unique_ptr where possible)

- change extractForInductionVars - MutableArrayRef -> ArrayRef for the
  arguments. Since the method is just returning copies of 'Value *', the client
  can't mutate the pointers themselves; it's fine to mutate the 'Value''s
  themselves, but that doesn't mutate the pointers to those.

- change the way extractForInductionVars returns (see b/123437690)

PiperOrigin-RevId: 232359277
2019-03-29 16:12:25 -07:00
River Riddle 74adaa5b31 Remove the OwnerTy template parameter of IROperandImpl and ValueUseIterator as it is no longer necessary now that all instructions are operations.
PiperOrigin-RevId: 232356323
2019-03-29 16:11:53 -07:00
Jacques Pienaar 5e88422f1d No need to specify default behavior. NFC.
This avoids overriding the class members + setting the printer/parser hooks only to fall back to generic.

PiperOrigin-RevId: 232348307
2019-03-29 16:11:23 -07:00
River Riddle 2d75501691 Remove the forward definition of OperationInst now that no references remain.
PiperOrigin-RevId: 232325321
2019-03-29 16:11:08 -07:00
River Riddle a3d9ccaecb Replace the walkOps/visitOperationInst variants from the InstWalkers with the Instruction variants.
PiperOrigin-RevId: 232322030
2019-03-29 16:10:24 -07:00
Dimitrios Vytiniotis 9ca0691b06 Exposing logical operators in EDSC all the way up to Python.
PiperOrigin-RevId: 232299839
2019-03-29 16:10:08 -07:00
Uday Bondhugula b26900dce5 Update dma-generate pass to (1) work on blocks of instructions (instead of just
loops), (2) take into account fast memory space capacity and lower 'dmaDepth'
to fit, (3) add location information for debug info / errors

- change dma-generate pass to work on blocks of instructions (start/end
  iterators) instead of 'for' loops; complete TODOs - allows DMA generation for
  straightline blocks of operation instructions interspersed b/w loops
- take into account fast memory capacity: check whether memory footprint fits
  in fastMemoryCapacity parameter, and recurse/lower the depth at which DMA
  generation is performed until it does fit in the provided memory
- add location information to MemRefRegion; any insufficient fast memory
  capacity errors or debug info w.r.t dma generation shows location information
- allow DMA generation pass to be instantiated with a fast memory capacity
  option (besides command line flag)

- change getMemRefRegion to return unique_ptr's
- change getMemRefFootprintBytes to work on a 'Block' instead of 'ForInst'
- other helper methods; add postDomInstFilter option for
  replaceAllMemRefUsesWith; drop forInst->walkOps, add Block::walkOps methods

Eg. output

$ mlir-opt  -dma-generate -dma-fast-mem-capacity=1 /tmp/single.mlir
/tmp/single.mlir:9:13: error: Total size of all DMA buffers' for this block exceeds fast memory capacity

        for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) {
            ^

$ mlir-opt -debug-only=dma-generate  -dma-generate -dma-fast-mem-capacity=400 /tmp/single.mlir
/tmp/single.mlir:9:13: note: 8 KiB of DMA buffers in fast memory space for this block

        for %i3 = (d0) -> (d0)(%i1) to (d0) -> (d0 + 32)(%i1) {

PiperOrigin-RevId: 232297044
2019-03-29 16:09:52 -07:00
River Riddle 870d778350 Begin the process of fully removing OperationInst. This patch cleans up references to OperationInst in the /include, /AffineOps, and lib/Analysis.
PiperOrigin-RevId: 232199262
2019-03-29 16:09:36 -07:00
River Riddle de2d0dfbca Fold the functionality of OperationInst into Instruction. OperationInst still exists as a forward declaration and will be removed incrementally in a set of followup cleanup patches.
PiperOrigin-RevId: 232198540
2019-03-29 16:09:19 -07:00
Lei Zhang b2dbbdb704 Merge OpProperty and Traits into OpTrait
They are essentially both modelling MLIR OpTrait; the former achieves the
purpose via introducing corresponding symbols in TableGen, while the latter
just uses plain strings.

Unify them to provide a single mechanism to avoid confusion and to better
reflect the definitions on MLIR C++ side.

Ideally we should be able to deduce lots of these traits automatically via
other bits of op definitions instead of manually specifying them; but not
for now though.

PiperOrigin-RevId: 232191401
2019-03-29 16:09:03 -07:00
Lei Zhang 8b75cc5741 Define NumericAttr as the base class for BoolAttr, IntegerAttr, FloatAttr, and ElementsAttr
These attribute kinds are different from the rest in the sense that their types are defined
in MLIR's type hierarchy and we can build constant op out of them.

By defining this middle-level base class, we have a unified way to test and query the type
of these attributes, which will be useful when constructing constant ops of various dialects.

This CL also added asserts to reject non-NumericAttr in constant op's build() method.

PiperOrigin-RevId: 232188178
2019-03-29 16:08:43 -07:00
River Riddle dae0263e0b Fold IROperandOwner into Instruction.
PiperOrigin-RevId: 232159334
2019-03-29 16:08:11 -07:00
River Riddle 5052bd8582 Define the AffineForOp and replace ForInst with it. This patch is largely mechanical, i.e. changing usages of ForInst to OpPointer<AffineForOp>. An important difference is that upon construction an AffineForOp no longer automatically creates the body and induction variable. To generate the body/iv, 'createBody' can be called on an AffineForOp with no body.
PiperOrigin-RevId: 232060516
2019-03-29 16:06:49 -07:00
Lei Zhang e0774c008f [TableGen] Use tblgen::DagLeaf to model DAG arguments
This CL added a tblgen::DagLeaf wrapper class with several helper methods for handling
DAG arguments. It helps to refactor the rewriter generation logic to be more higher
level.

This CL also added a tblgen::ConstantAttr wrapper class for constant attributes.

PiperOrigin-RevId: 232050683
2019-03-29 16:06:31 -07:00
Nicolas Vasilache 0353ef99eb Cleanup EDSCs and start a functional auto-generated library of custom Ops
This CL applies the following simplifications to EDSCs:
1. Rename Block to StmtList because an MLIR Block is a different, not yet
supported, notion;
2. Rework Bindable to drop specific storage and just use it as a simple wrapper
around Expr. The only value of Bindable is to force a static cast when used by
the user to bind into the emitter. For all intended purposes, Bindable is just
a lightweight check that an Expr is Unbound. This simplifies usage and reduces
the API footprint. After playing with it for some time, it wasn't worth the API
cognition overhead;
3. Replace makeExprs and makeBindables by makeNewExprs and copyExprs which is
more explicit and less easy to misuse;
4. Add generally useful functionality to MLIREmitter:
  a. expose zero and one for the ubiquitous common lower bounds and step;
  b. add support to create already bound Exprs for all function arguments as
  well as shapes and views for Exprs bound to memrefs.
5. Delete Stmt::operator= and replace by a `Stmt::set` method which is more
explicit.
6. Make Stmt::operator Expr() explicit.
7. Indexed.indices assertions are removed to pave the way for expressing slices
and views as well as to work with 0-D memrefs.

The CL plugs those simplifications with TableGen and allows emitting a full MLIR function for
pointwise add.

This "x.add" op is both type and rank-agnostic (by allowing ArrayRef of Expr
passed to For loops) and opens the door to spinning up a composable library of
existing and custom ops that should automate a lot of the tedious work in
TF/XLA -> MLIR.

Testing needs to be significantly improved but can be done in a separate CL.

PiperOrigin-RevId: 231982325
2019-03-29 16:05:23 -07:00
River Riddle 9f22a2391b Define an detail::OperandStorage class to handle managing instruction operands. This class stores operands in a similar way to SmallVector except for two key differences. The first is the inline storage, which is a trailing objects array. The second is that being able to dynamically resize the operand list is optional. This means that we can enable the cases where operations need to change the number of operands after construction without losing the spatial locality benefits of the common case (operation instructions / non-control flow instructions with a lifetime fixed number of operands).
PiperOrigin-RevId: 231910497
2019-03-29 16:05:08 -07:00
Jacques Pienaar 82dc6a878c Add fallback to native code op builder specification for patterns.
This allow for arbitrarily complex builder patterns which is meant to cover initial cases while the modelling is improved and long tail cases/cases for which expanding the DSL would result in worst overall system.

NFC just sorting the emit replace methods alphabetical in the class and file body.

PiperOrigin-RevId: 231890352
2019-03-29 16:04:53 -07:00
Jacques Pienaar 4161d44bd5 Enable using constant attribute as matchers.
Straight roll-forward of cl/231322019 that got accidentally reverted in the move.

PiperOrigin-RevId: 231791464
2019-03-29 16:04:38 -07:00
Nicolas Vasilache d4921f4a96 Address Performance issue in NestedMatcher
A performance issue was reported due to the usage of NestedMatcher in
ComposeAffineMaps. The main culprit was the ubiquitous copies that were
occuring when appending even a single element in `matchOne`.

This CL generally simplifies the implementation and removes one level of indirection by getting rid of
auxiliary storage as well as simplifying the API.
The users of the API are updated accordingly.

The implementation was tested on a heavily unrolled example with
ComposeAffineMaps and is now close in performance with an implementation based
on stateless InstWalker.

As a reminder, the whole ComposeAffineMaps pass is slated to disappear but the bug report was very useful as a stress test for NestedMatchers.

Lastly, the following cleanups reported by @aminim were addressed:
1. make NestedPatternContext scoped within runFunction rather than at the Pass level. This was caused by a previous misunderstanding of Pass lifetime;
2. use defensive assertions in the constructor of NestedPatternContext to make it clear a unique such locally scoped context is allowed to exist.

PiperOrigin-RevId: 231781279
2019-03-29 16:04:07 -07:00
Lei Zhang 66647a313a [tablegen] Use tblgen:: classes for NamedAttribute and Operand fields
This is another step towards hiding raw TableGen API calls.

PiperOrigin-RevId: 231580827
2019-03-29 16:02:23 -07:00
Lei Zhang 726dc08e4d [doc] Generate more readable description for attributes
This CL added "description" field to AttrConstraint and Attr, like what we
have for type classes.

PiperOrigin-RevId: 231579853
2019-03-29 16:01:53 -07:00
Lei Zhang 18219caeb2 [doc] Generate more readable description for operands
This CL mandated TypeConstraint and Type to provide descriptions and fixed
various subclasses and definitions to provide so. The purpose is to enforce
good documentation; using empty string as the default just invites oversight.

PiperOrigin-RevId: 231579629
2019-03-29 16:01:38 -07:00
River Riddle 994111238b Fold CallIndirectOp to CallOp when the callee operand is a known constant function.
PiperOrigin-RevId: 231511697
2019-03-29 16:01:23 -07:00
Lei Zhang a759cf3190 Include op results in generate TensorFlow/TFLite op docs
* Emitted result lists for ops.
* Changed to allow empty summary and description for ops.
* Avoided indenting description to allow proper MarkDown rendering of
  formatting markers inside description content.
* Used fixed width font for operand/attribute names.
* Massaged TensorFlow op docs and generated dialect op doc.

PiperOrigin-RevId: 231427574
2019-03-29 16:00:53 -07:00
Lei Zhang c224a518f5 TableGen: Use DAG for op results
Similar to op operands and attributes, use DAG to specify operation's results.
This will allow us to provide names and matchers for outputs.

Also Defined `outs` as a marker to indicate the start of op result list.

PiperOrigin-RevId: 231422455
2019-03-29 16:00:22 -07:00
Lei Zhang 1dfc3ac5ce Prefix Operator getter methods with "get" to be consistent
PiperOrigin-RevId: 231416230
2019-03-29 15:59:46 -07:00
River Riddle 755538328b Recommit: Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst.
PiperOrigin-RevId: 231342063
2019-03-29 15:59:30 -07:00
Nicolas Vasilache 0f9436e56a Move google-mlir to google_mlir
Python modules cannot be defined under a directory that has a `-` character in its name inside of Google code.
Rename to `google_mlir` which circumvents this limitation.

PiperOrigin-RevId: 231329321
2019-03-29 15:42:55 -07:00
Nicolas Vasilache ae772b7965 Automated rollback of changelist 231318632.
PiperOrigin-RevId: 231327161
2019-03-29 15:42:38 -07:00
Jacques Pienaar ad637f3cce Enable using constant attribute as matchers.
Update to allow constant attribute values to be used to match or as result in rewrite rule. Define variable ctx in the matcher to allow matchers to refer to the context of the operation being matched.

PiperOrigin-RevId: 231322019
2019-03-29 15:42:23 -07:00
River Riddle 5ecef2b3f6 Define a AffineOps dialect as well as an AffineIfOp operation. Replace all instances of IfInst with AffineIfOp and delete IfInst.
PiperOrigin-RevId: 231318632
2019-03-29 15:42:08 -07:00
Nicolas Vasilache cacf05892e Add a C API for EDSCs in other languages + python
This CL adds support for calling EDSCs from other languages than C++.
Following the LLVM convention this CL:
1. declares simple opaque types and a C API in mlir-c/Core.h;
2. defines the implementation directly in lib/EDSC/Types.cpp and
lib/EDSC/MLIREmitter.cpp.

Unlike LLVM however the nomenclature for these types and API functions is not
well-defined, naming suggestions are most welcome.

To avoid the need for conversion functions, Types.h and MLIREmitter.h include
mlir-c/Core.h and provide constructors and conversion operators between the
mlir::edsc type and the corresponding C type.

In this first commit, mlir-c/Core.h only contains the types for the C API
to allow EDSCs to work from Python. This includes both a minimal set of core
MLIR
types (mlir_context_t, mlir_type_t, mlir_func_t) as well as the EDSC types
(edsc_mlir_emitter_t, edsc_expr_t, edsc_stmt_t, edsc_indexed_t). This can be
restructured in the future as concrete needs arise.

For now, the API only supports:
1. scalar types;
2. memrefs of scalar types with static or symbolic shapes;
3. functions with input and output of these types.

The C API is not complete wrt ownership semantics. This is in large part due
to the fact that python bindings are written with Pybind11 which allows very
idiomatic C++ bindings. An effort is made to write a large chunk of these
bindings using the C API but some C++isms are used where the design benefits
from this simplication. A fully isolated C API will make more sense once we
also integrate with another language like Swift and have enough use cases to
drive the design.

Lastly, this CL also fixes a bug in mlir::ExecutionEngine were the order of
declaration of llvmContext and the JIT result in an improper order of
destructors (which used to crash before the fix).

PiperOrigin-RevId: 231290250
2019-03-29 15:41:53 -07:00
Lei Zhang eb753f4aec Add tblgen::Pattern to model Patterns defined in TableGen
Similar to other tblgen:: abstractions, tblgen::Pattern hides the native TableGen
API and provides a nicer API that is more coherent with the TableGen definitions.

PiperOrigin-RevId: 231285143
2019-03-29 15:41:38 -07:00
Jacques Pienaar 0fbf4ff232 Define mAttr in terms of AttrConstraint.
* Matching an attribute and specifying a attribute constraint is the same thing executionally, so represent it such.
* Extract AttrConstraint helper to match TypeConstraint and use that where mAttr was previously used in RewriterGen.

PiperOrigin-RevId: 231213580
2019-03-29 15:41:23 -07:00
Jacques Pienaar 8c7f106e53 Add value member to constant attribute specification base.
String specification of the default value is the common case so just make it so.

PiperOrigin-RevId: 231204081
2019-03-29 15:40:53 -07:00
Chris Lattner b42bea215a Change AffineApplyOp to produce a single result, simplifying the code that
works with it, and updating the g3docs.

PiperOrigin-RevId: 231120927
2019-03-29 15:40:38 -07:00
River Riddle 36babbd781 Change the ForInst induction variable to be a block argument of the body instead of the ForInst itself. This is a necessary step in converting ForInst into an operation.
PiperOrigin-RevId: 231064139
2019-03-29 15:40:23 -07:00
Nicolas Vasilache 0e7a8a9027 Drop AffineMap::Null and IntegerSet::Null
Addresses b/122486036

This CL addresses some leftover crumbs in AffineMap and IntegerSet by removing
the Null method and cleaning up the constructors.

As the ::Null uses were tracked down, opportunities appeared to untangle some
of the Parsing logic and make it explicit where AffineMap/IntegerSet have
ambiguous syntax. Previously, ambiguous cases were hidden behind the implicit
pointer values of AffineMap* and IntegerSet* that were passed as function
parameters. Depending the values of those pointers one of 3 behaviors could
occur.

This parsing logic convolution is one of the rare cases where I would advocate
for code duplication. The more proper fix would be to make the syntax
unambiguous or to allow some lookahead.

PiperOrigin-RevId: 231058512
2019-03-29 15:40:08 -07:00
Nicolas Vasilache 81c7f2e2f3 Cleanup resource management and rename recursive matchers
This CL follows up on a memory leak issue related to SmallVector growth that
escapes the BumpPtrAllocator.
The fix is to properly use ArrayRef and placement new to define away the
issue.

The following renaming is also applied:
1. MLFunctionMatcher -> NestedPattern
2. MLFunctionMatches -> NestedMatch

As a consequence all allocations are now guaranteed to live on the BumpPtrAllocator.

PiperOrigin-RevId: 231047766
2019-03-29 15:39:53 -07:00
Nicolas Vasilache 629f5b7fcb Add a simple arity-agnostic invocation of JIT-compiled functions.
This is useful to call generic function with unspecified number of arguments
e.g. when interfacing with ML frameworks.

PiperOrigin-RevId: 230974736
2019-03-29 15:38:08 -07:00
Uday Bondhugula b588d58c5f Update createAffineComputationSlice to generate single result affine maps
- Update createAffineComputationSlice to generate a sequence of single result
  affine apply ops instead of one multi-result affine apply
- update pipeline-data-transfer test case; while on this, also update the test
  case to use only single result affine maps, and make it more robust to
  change.

PiperOrigin-RevId: 230965478
2019-03-29 15:37:53 -07:00
River Riddle c3424c3c75 Allow operations to hold a blocklist and add support for parsing/printing a block list for verbose printing.
PiperOrigin-RevId: 230951462
2019-03-29 15:37:37 -07:00
Alex Zinenko 6d37a255e2 Generic dialect conversion pass exercised by LLVM IR lowering
This commit introduces a generic dialect conversion/lowering/legalization pass
and illustrates it on StandardOps->LLVMIR conversion.

It partially reuses the PatternRewriter infrastructure and adds the following
functionality:
- an actual pass;
- non-default pattern constructors;
- one-to-many rewrites;
- rewriting terminators with successors;
- not applying patterns iteratively (unlike the existing greedy rewrite driver);
- ability to change function signature;
- ability to change basic block argument types.

The latter two things required, given the existing API, to create new functions
in the same module.  Eventually, this should converge with the rest of
PatternRewriter.  However, we may want to keep two pass versions: "heavy" with
function/block argument conversion and "light" that only touches operations.

This pass creates new functions within a module as a means to change function
signature, then creates new blocks with converted argument types in the new
function.  Then, it traverses the CFG in DFS-preorder to make sure defs are
converted before uses in the dominated blocks.  The generic pass has a minimal
interface with two hooks: one to fill in the set of patterns, and another one
to convert types for functions and blocks.  The patterns are defined as
separate classes that can be table-generated in the future.

The LLVM IR lowering pass partially inherits from the existing LLVM IR
translator, in particular for type conversion.  It defines a conversion pattern
template, instantiated for different operations, and is a good candidate for
tablegen.  The lowering does not yet support loads and stores and is not
connected to the translator as it would have broken the existing flows.  Future
patches will add missing support before switching the translator in a single
patch.

PiperOrigin-RevId: 230951202
2019-03-29 15:37:23 -07:00
Mehdi Amini d9ce382fc9 Use a unique_ptr instead of manual deletion for PIMPL idiom (NFC)
PiperOrigin-RevId: 230930254
2019-03-29 15:37:07 -07:00
Lei Zhang ba1715f407 Pull TableGen op argument definitions into their own files
PiperOrigin-RevId: 230923050
2019-03-29 15:36:52 -07:00
Lei Zhang 2de5e9fd19 Support op removal patterns in TableGen
This CL adds a new marker, replaceWithValue, to indicate that no new result
op is generated by applying a pattern. Instead, the matched DAG is replaced
by an existing SSA value.

Converted the tf.Identity converter to use the pattern.

PiperOrigin-RevId: 230922323
2019-03-29 15:36:37 -07:00
Alex Zinenko 5a4403787f Simple CPU runner
This implements a simple CPU runner based on LLVM Orc JIT.  The base
functionality is provided by the ExecutionEngine class that compiles and links
the module, and provides an interface for obtaining function pointers to the
JIT-compiled MLIR functions and for invoking those functions directly.  Since
function pointers need to be casted to the correct pointer type, the
ExecutionEngine wraps LLVM IR functions obtained from MLIR into a helper
function with the common signature `void (void **)` where the single argument
is interpreted as a list of pointers to the actual arguments passed to the
function, eventually followed by a pointer to the result of the function.
Additionally, the ExecutionEngine is set up to resolve library functions to
those available in the current process, enabling support for, e.g., simple C
library calls.

For integration purposes, this also provides a simplistic runtime for memref
descriptors as expected by the LLVM IR code produced by MLIR translation.  In
particular, memrefs are transformed into LLVM structs (can be mapped to C
structs) with a pointer to the data, followed by dynamic sizes.  This
implementation only supports statically-shaped memrefs of type float, but can
be extened if necessary.

Provide a binary for the runner and a test that exercises it.

PiperOrigin-RevId: 230876363
2019-03-29 15:36:08 -07:00
Uday Bondhugula f94b15c247 Update dma-generate: update for multiple load/store op's per memref
- introduce a way to compute union using symbolic rectangular bounding boxes
- handle multiple load/store op's to the same memref by taking a union of the regions
- command-line argument to provide capacity of the fast memory space
- minor change to replaceAllMemRefUsesWith to not generate affine_apply if the
  supplied index remap was identity

PiperOrigin-RevId: 230848185
2019-03-29 15:35:38 -07:00
River Riddle 4a7dfa7882 Add order bit to instructions to lazily track dominance queries. This improves the performance of dominance queries, which are used quite often within the compiler(especially within the verifier).
This reduced the execution time of a few internal tests from ~2 minutes to ~4 seconds.

PiperOrigin-RevId: 230819723
2019-03-29 15:35:23 -07:00
Chris Lattner 934b6d125f Introduce a new operation hook point for implementing simple local
canonicalizations of operations.  The ultimate important user of this is
going to be a funcBuilder->foldOrCreate<YourOp>(...) API, but for now it
is just a more convenient way to write certain classes of canonicalizations
(see the change in StandardOps.cpp).

NFC.

PiperOrigin-RevId: 230770021
2019-03-29 15:34:35 -07:00
River Riddle 451869f394 Add cloning functionality to Block and Function, this also adds support for remapping successor block operands of terminator operations. We define a new BlockAndValueMapping class to simplify mapping between cloned values.
PiperOrigin-RevId: 230768759
2019-03-29 15:34:20 -07:00
River Riddle f319bbbd28 Add a function pass to strip debug info from functions and instructions.
PiperOrigin-RevId: 230654315
2019-03-29 15:33:50 -07:00
River Riddle 6859f33292 Migrate VectorOrTensorType/MemRefType shape api to use int64_t instead of int.
PiperOrigin-RevId: 230605756
2019-03-29 15:33:20 -07:00
Feng Liu b64998a6b3 Add a method to construct a CallSiteLoc which represents a stack of locations.
PiperOrigin-RevId: 230592860
2019-03-29 15:33:05 -07:00
River Riddle 1210e92d86 Add asmparser/printer support for locations to make them round-trippable. Location printing is currently behind a command line flag "mlir-print-debuginfo", we can rethink this when we have a pass for stripping debug info or when we have support for printer flags.
Example inline notation:

  trailing-location ::= 'loc' '(' location ')'

  // FileLineCol Location.
  %1 = "foo"() : () -> i1 loc("mysource.cc":10:8)

  // Name Location
  return loc("foo")

  // CallSite Location
  return loc(callsite("foo" at "mysource.cc":19:9))

  // Fused Location
  /// Without metadata
  func @inline_notation() loc(fused["foo", "mysource.cc":10:8])

  /// With metadata
  return loc(fused<"myPass">["foo", "foo2"])

  // Unknown location.
  return loc(unknown)

Locations are currently only printed with inline notation at the line of each instruction. Further work is needed to allow for reference notation, e.g:
     ...
     return loc 1
   }
   ...
   loc 1 = "source.cc":10:1

PiperOrigin-RevId: 230587621
2019-03-29 15:32:49 -07:00
Lei Zhang 5654450853 Unify terms regarding assembly form to use generic vs. custom
This CL just changes various docs and comments to use the term "generic" and
"custom" when mentioning assembly forms. To be consist, several methods are
also renamed:

* FunctionParser::parseVerboseOperation() -> parseGenericOperation()
* ModuleState::hasShorthandForm() -> hasCustomForm()
* OpAsmPrinter::printDefaultOp() -> printGenericOp()

PiperOrigin-RevId: 230568819
2019-03-29 15:32:35 -07:00
Uday Bondhugula 864d9e02a1 Update fusion cost model + some additional infrastructure and debug information for -loop-fusion
- update fusion cost model to fuse while tolerating a certain amount of redundant
  computation; add cl option -fusion-compute-tolerance
  evaluate memory footprint and intermediate memory reduction
- emit debug info from -loop-fusion showing what was fused and why
- introduce function to compute memory footprint for a loop nest
- getMemRefRegion readability update - NFC

PiperOrigin-RevId: 230541857
2019-03-29 15:32:06 -07:00