686 Commits

Author SHA1 Message Date
Uday Bondhugula 6136f33d59 unroll and jam: fix order of jammed bodies
- bodies would earlier appear in the order (i, i+3, i+2, i+1) instead of
  (i, i+1, i+2, i+3) for example for factor 4.

- clean up hardcoded test cases

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#170

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/170 from bondhugula:ujam b66b405b2b1894a03b376952e32a9d0292042665
PiperOrigin-RevId: 273613131
2019-10-08 15:13:11 -07:00
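The body order that 6136f33d59 restores can be illustrated outside MLIR with a plain loop nest. The sketch below is not the pass itself: `unrollJamTrace` is a hypothetical helper that records which (i, j) iterations run, in order, once the outer loop has been unroll-jammed by a given factor. It assumes the trip count is a multiple of the factor, so no cleanup loop is modelled.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Records the (i, j) iterations visited by a 2-d loop nest after the outer
// loop has been unroll-jammed by `factor`. Hypothetical helper for
// illustration; assumes n is a multiple of factor (no cleanup loop).
std::vector<std::pair<int, int>> unrollJamTrace(int n, int m, int factor) {
  assert(factor > 0 && n % factor == 0 && "sketch handles no cleanup loop");
  std::vector<std::pair<int, int>> trace;
  for (int i = 0; i < n; i += factor) {
    for (int j = 0; j < m; ++j) {
      // Jammed bodies appear in increasing order: i, i+1, ..., i+factor-1.
      for (int u = 0; u < factor; ++u)
        trace.emplace_back(i + u, j);
    }
  }
  return trace;
}
```

For factor 4, each inner iteration j runs the copies for i, i+1, i+2, i+3 in that order, which is the order the commit message describes as correct.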
Jing Pu 17606a108b Print result types when dumping graphviz.
PiperOrigin-RevId: 273406833
2019-10-07 16:45:53 -07:00
Uday Bondhugula 89e7a76a1c fix simplify-affine-structures bug
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#157

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/157 from bondhugula:quickfix bd1fcd79825fc0bd5b4a3e688153fa0993ab703d
PiperOrigin-RevId: 273316498
2019-10-07 10:04:50 -07:00
Christian Sigg 85dcaf19c7 Fix typos, NFC.
PiperOrigin-RevId: 272851237
2019-10-04 04:37:53 -07:00
River Riddle 5830f71a45 Add support for inlining calls with different arg/result types from the callable.
Some dialects have implicit conversions inherent in their modeling, meaning that a call may have a different type than the type that the callable expects. To support this, a hook is added to the dialect interface that allows for materializing conversion operations during inlining when there is a mismatch. A hook is also added to the callable interface to allow for introspecting the expected result types.

PiperOrigin-RevId: 272814379
2019-10-03 23:10:51 -07:00
River Riddle a20d96e436 Update the Inliner pass to work on SCCs of the CallGraph.
This allows for the inliner to work on arbitrary call operations. The updated inliner will also work bottom-up through the callgraph enabling support for multiple levels of inlining.

PiperOrigin-RevId: 272813876
2019-10-03 23:05:21 -07:00
Jacques Pienaar 2b86e27dbd Show type even if elementsattr is elided in graph
The type is quite useful for debugging and shouldn't be too large.

PiperOrigin-RevId: 272390311
2019-10-02 01:46:12 -07:00
Jacques Pienaar c57f202c8c Switch explicit create methods to match generated build's order
The generated build methods have result type before the arguments (operands and attributes, which are also now adjacent in the explicit create method). This also results in changing the create method's ordering to match most build methods' ordering.

PiperOrigin-RevId: 271755054
2019-09-28 09:35:58 -07:00
Uday Bondhugula 74eabdd14e NFC - clean up op accessor usage, std.load/store op verify, other stale info
- also remove stale terminology/references in docs

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#148

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/148 from bondhugula:cleanup e846b641a3c2936e874138aff480a23cdbf66591
PiperOrigin-RevId: 271618279
2019-09-27 11:58:24 -07:00
Nicolas Vasilache ddf737c5da Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering.
The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues.
Solving the ABI problem generally is a very complex problem and most likely involves taking
a dependence on clang that we do not want atm.

A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implements.

PiperOrigin-RevId: 271591708
2019-09-27 09:57:36 -07:00
Jing Pu 47a7021cc3 Change the return type of createPrintCFGGraphPass to match other passes.
PiperOrigin-RevId: 271252404
2019-09-25 18:33:47 -07:00
Mehdi Amini 5583252173 Add convenience methods to set an OpBuilder insertion point after an Operation (NFC)
PiperOrigin-RevId: 270727180
2019-09-23 11:54:55 -07:00
Christian Sigg c900d4994e Fix a number of Clang-Tidy warnings.
PiperOrigin-RevId: 270632324
2019-09-23 02:34:27 -07:00
Uday Bondhugula f559c38c28 Upgrade/fix/simplify store to load forwarding
- fix store to load forwarding for a certain set of cases (where
  forwarding shouldn't have happened); use AffineValueMap difference
  based MemRefAccess equality checking; utility logic is also greatly
  simplified

- add missing equality/inequality operators for AffineExpr ==/!= ints

- add == != operators on MemRefAccess

Closes tensorflow/mlir#136

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/136 from bondhugula:store-load-forwarding d79fd1add8bcfbd9fa71d841a6a9905340dcd792
PiperOrigin-RevId: 270457011
2019-09-21 10:08:56 -07:00
River Riddle 91125d33ed Avoid iterator invalidation when recursively computing pattern depth.
computeDepth calls itself recursively, which may insert into minPatternDepth. minPatternDepth is a DenseMap, which invalidates iterators on insertion, so this may lead to asan failures.

PiperOrigin-RevId: 270374203
2019-09-20 16:30:29 -07:00
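The hazard described in 91125d33ed can be sketched with standard containers. This is an illustration only, not the MLIR code: `computeDepthSketch`, `DependencyMap` and `DepthCache` are hypothetical names, `std::unordered_map` stands in for `DenseMap`, and the "depth" computed here is simply the longest dependency chain in an acyclic graph. The point is the shape of the fix: never hold an iterator into the cache across a recursive call that may insert into it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

using DependencyMap = std::unordered_map<std::string, std::vector<std::string>>;
using DepthCache = std::unordered_map<std::string, unsigned>;

// Hypothetical sketch: memoized depth computation over an acyclic graph.
unsigned computeDepthSketch(const std::string &name, const DependencyMap &deps,
                            DepthCache &cache) {
  // The iterator is used immediately and not kept across the recursion below.
  auto cached = cache.find(name);
  if (cached != cache.end())
    return cached->second;

  unsigned depth = 0;
  auto depIt = deps.find(name); // `deps` is never modified, so this is safe.
  if (depIt != deps.end())
    for (const std::string &child : depIt->second)
      depth = std::max(depth, computeDepthSketch(child, deps, cache) + 1);

  // Insert only after every recursive call has returned.
  cache[name] = depth;
  return depth;
}
```

Writing `auto &slot = cache[name];` before the loop and assigning to `slot` afterwards would be the risky variant with a container that invalidates references on insertion.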
Uday Bondhugula 727a50ae2d Support symbolic operands for memref replacement; fix memrefNormalize
- allow symbols in index remapping provided for memref replacement
- fix memref normalize crash on cases with layout maps with symbols

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Reported by: Alex Zinenko

Closes tensorflow/mlir#139

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/139 from bondhugula:memref-rep-symbols 2f48c1fdb5d4c58915bbddbd9f07b18541819233
PiperOrigin-RevId: 269851182
2019-09-18 11:26:11 -07:00
MLIR Team 1c73be76d8 Unify error messages to start with lower-case.
PiperOrigin-RevId: 269803466
2019-09-18 07:45:17 -07:00
Uday Bondhugula bd7de6d4df Add rewrite pattern to compose maps into affine load/stores
- add canonicalization pattern to compose maps into affine loads/stores;
  templatize the pattern and reuse it for affine.apply as well

- rename getIndices -> getMapOperands() (getIndices is confusing since
  these are no longer the indices themselves but operands to the map
  whose results are the indices). This also makes the accessor uniform
  across affine.apply/load/store. Change arg names on the affine
  load/store builder to avoid confusion. Drop an unused confusing build
  method on AffineStoreOp.

- update incomplete doc comment for canonicalizeMapAndOperands (this was
  missed from a previous update).

Addresses issue tensorflow/mlir#121

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#122

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/122 from bondhugula:compose-load-store e71de1771e56a85c4282c10cb43f30cef0701c4f
PiperOrigin-RevId: 269619540
2019-09-17 11:49:45 -07:00
River Riddle 9619ba10d4 Add support for multi-level value mapping to DialectConversion.
When performing A->B->C conversion, an operation may still refer to an operand of A. This makes it necessary to unmap through multiple levels of replacement for a specific value.

PiperOrigin-RevId: 269367859
2019-09-16 10:38:19 -07:00
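The multi-level lookup motivated in 9619ba10d4 can be pictured as following a chain of replacements until a value with no further mapping is reached. The sketch below is a simplified model, not the DialectConversion implementation: `ValueMapping` and `ValueId` are hypothetical, values are plain integers, and the mapping is assumed to be free of cycles.

```cpp
#include <cassert>
#include <unordered_map>

using ValueId = int; // Stand-in for an IR value; an assumption of this sketch.

// Hypothetical mapping that resolves A -> B -> C chains of replacements.
class ValueMapping {
public:
  void map(ValueId from, ValueId to) { mapping[from] = to; }

  // Follows replacements through every level; returns the input value
  // unchanged if it was never replaced. Assumes the mapping has no cycles.
  ValueId lookupOrDefault(ValueId value) const {
    auto it = mapping.find(value);
    while (it != mapping.end()) {
      value = it->second;
      it = mapping.find(value);
    }
    return value;
  }

private:
  std::unordered_map<ValueId, ValueId> mapping;
};
```

With `map(1, 2)` followed by `map(2, 3)`, an operand that still refers to value 1 resolves to value 3.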
Uday Bondhugula 4f32ae61b4 NFC - Move explicit copy/dma generation utility out of pass and into LoopUtils
- turn copy/dma generation method into a utility in LoopUtils, allowing
  it to be reused elsewhere.

- no functional/logic change to the pass/utility

- trim down header includes in files affected

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#124

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/124 from bondhugula:datacopy 9f346e62e5bd9dd1986720a30a35f302eb4d3252
PiperOrigin-RevId: 269106088
2019-09-14 13:23:48 -07:00
Uday Bondhugula 1366467a3b update normalizeMemRef utility; handle missing failure check + add more tests
- take care of symbolic operands with alloc
- add missing check for compose map failure and a test case
- add test cases on strides
- drop incorrect check for one-to-one'ness

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#132

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/132 from bondhugula:normalize-memrefs 8aebf285fb0d7c19269d85255aed644657e327b7
PiperOrigin-RevId: 269105947
2019-09-14 13:21:35 -07:00
River Riddle f1b100c77b NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase.
These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass.

PiperOrigin-RevId: 268970259
2019-09-13 13:34:27 -07:00
Smit Hinsu 1854c64c7c Log the name of the generated illegal operation in DialectConversion debug mode
PiperOrigin-RevId: 268859399
2019-09-13 01:37:38 -07:00
Jacques Pienaar a23f69a37b Remove redundant qualification
Address GCC error: extra qualification not allowed [-fpermissive]

PiperOrigin-RevId: 268133737
2019-09-09 19:50:53 -07:00
Jacques Pienaar 2660623a88 Add pass to generate, per block in a function, a GraphViz Dot graph with ops as nodes
* Add GraphTraits that treat a block as a graph, Operation* as node and use-relationship for edges;
  - Just basic graph output;
* Add use iterator to iterate over all uses of an Operation;
* Add testing pass to generate op graph;

This does not yet support operations other than functions, nor nested regions.

PiperOrigin-RevId: 268121782
2019-09-09 18:12:41 -07:00
Mehdi Amini 6443583bfd Refactor getUsedValuesDefinedAbove to expose a variant taking a callback (NFC)
This will allow clients to implement a different collection strategy on these
values, including collecting each use within the region for example.

PiperOrigin-RevId: 267803978
2019-09-07 17:03:01 -07:00
River Riddle 0ba0087887 Add the initial inlining infrastructure.
This defines a set of initial utilities for inlining a region (or a FuncOp), and defines a simple inliner pass for testing purposes.
A new dialect interface is defined, DialectInlinerInterface, that allows for dialects to override hooks controlling inlining legality. The interface currently provides the following hooks, but these are just preliminary and should be changed/added to/modified as necessary:

* isLegalToInline
  - Determine if a region can be inlined into one of this dialect, *or* if an operation of this dialect can be inlined into a given region.

* shouldAnalyzeRecursively
  - Determine if an operation with regions should be analyzed recursively for legality. This allows for child operations to be closed off from the legality checks for operations like lambdas.

* handleTerminator
  - Process a terminator that has been inlined.

This cl adds support for inlining StandardOps, but other dialects will be added in followups as necessary.

PiperOrigin-RevId: 267426759
2019-09-05 12:24:13 -07:00
Uday Bondhugula 8c9dc690eb pipeline-data-transfer: remove dead tag alloc's and improve test coverage for replaceMemRefUsesWith / pipeline-data-transfer
- address remaining comments from PR tensorflow/mlir#87 for better test coverage for
  pipeline-data-transfer/replaceAllMemRefUsesWith
- remove dead tag allocs the same way they are removed for the replaced buffers

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#106

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/106 from bondhugula:followup 9e868666d047e8d43e5f82f43e4093b838c710fa
PiperOrigin-RevId: 267144774
2019-09-04 06:59:09 -07:00
Uday Bondhugula 54d674f51e Utility to normalize memrefs with non-identity layout maps
- introduce utility to convert memrefs with non-identity layout maps to
  ones with identity layout maps: convert the type and rewrite/remap all
  its uses

- add this utility to -simplify-affine-structures pass for testing
  purposes

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#104

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/104 from bondhugula:memref-normalize f2c914aa1890e8860326c9e33f9aa160b3d65e6d
PiperOrigin-RevId: 266985317
2019-09-03 12:14:28 -07:00
Uday Bondhugula b1ef9dc22c Fix affine data copy generation corner cases/bugs
- the [begin, end) range identified for copying could end in between the
  block, which makes hoisting invalid in some cases. Change the range
  identification to always end with end of block.

- add test case to exercise these (with fast mem capacity set to minimal so
  that single element memref buffers are generated at the innermost loop)

- the location of begin/end of the block range for data copying was
  being confused with the insert points for copy in and copy out code.
  In cases, where we choose to hoist transfers, these are separate.

- when copy loops are single iteration ones, promote their bodies at
  the end of the pass.

- change default fast mem space to 1 (setting it to zero made it
  generate DMA op's that won't verify in the default case - since the
  DMA ops have a check for src/dest memref spaces being different).

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Co-Authored-By: Mehdi Amini <joker.eph@gmail.com>

Closes tensorflow/mlir#88

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/88 from bondhugula:datacopy 88697267c45e850c3ced87671e16e4a930c02a42
PiperOrigin-RevId: 266980911
2019-09-03 11:53:16 -07:00
River Riddle 6563b1c446 Add a new dialect interface for the OperationFolder `OpFolderDialectInterface`.
This interface will allow for providing hooks to interop with operation folding. The first hook, 'shouldMaterializeInto', will allow for controlling which region to insert materialized constants into. The folder will generally materialize constants into the top-level isolated region; this allows for materializing into a lower level ancestor region if it is more profitable/correct.

PiperOrigin-RevId: 266702972
2019-09-01 20:07:08 -07:00
Mehdi Amini ce702fc8da Add a `getUsedValuesDefinedAbove()` overload that takes an `Operation` pointer (NFC)
This is a convenient utility around the existing `getUsedValuesDefinedAbove()`
that takes two regions.

PiperOrigin-RevId: 266686854
2019-09-01 16:32:10 -07:00
River Riddle 9c8a8a7d0d Add a canonicalization to erase empty AffineForOps.
AffineForOps themselves are pure and can be removed if there are no internal operations.

PiperOrigin-RevId: 266481293
2019-08-30 16:49:32 -07:00
River Riddle 037742cdf2 Add support for early exit walk methods.
This is done by providing a walk callback that returns a WalkResult. This result is either `advance` or `interrupt`. `advance` means that the walk should continue, whereas `interrupt` signals that the walk should stop immediately. An example is shown below:

auto result = op->walk([](Operation *op) {
  if (some_invariant)
    return WalkResult::interrupt();
  return WalkResult::advance();
});

if (result.wasInterrupted())
  ...;

PiperOrigin-RevId: 266436700
2019-08-30 12:47:53 -07:00
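The interrupt/advance protocol from 037742cdf2 can be modelled on a toy tree to show how an interrupt propagates out of nested walks. This is a self-contained sketch under stated assumptions: `Node` and the free function `walk` are hypothetical, and the `WalkResult` class below only mimics the three members named in the commit message (`advance`, `interrupt`, `wasInterrupted`); it is not the MLIR class.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Minimal stand-in for the result type described in the commit message.
class WalkResult {
  enum Kind { Advance, Interrupt } kind;
  explicit WalkResult(Kind k) : kind(k) {}

public:
  static WalkResult advance() { return WalkResult(Advance); }
  static WalkResult interrupt() { return WalkResult(Interrupt); }
  bool wasInterrupted() const { return kind == Interrupt; }
};

// Hypothetical tree node standing in for an operation with nested ops.
struct Node {
  std::string name;
  std::vector<Node> children;
};

// Post-order walk that stops as soon as any callback returns interrupt().
WalkResult walk(const Node &node,
                const std::function<WalkResult(const Node &)> &callback) {
  for (const Node &child : node.children)
    if (walk(child, callback).wasInterrupted())
      return WalkResult::interrupt(); // Propagate upwards without visiting more.
  return callback(node);
}
```

A caller checks `wasInterrupted()` on the returned value, as in the commit's own example, to learn whether the traversal finished or stopped early.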
River Riddle 4bfae66d70 Refactor the 'walk' methods for operations.
This change refactors and cleans up the implementation of the operation walk methods. After this refactoring, the explicit template parameter for the operation type is no longer needed for the explicit op walks. For example:

    op->walk<AffineForOp>([](AffineForOp op) { ... });

is now accomplished via:

    op->walk([](AffineForOp op) { ... });

PiperOrigin-RevId: 266209552
2019-08-29 13:04:50 -07:00
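One way a walk can drop the explicit template parameter, as 4bfae66d70 describes, is to deduce the operation type from the callback's own parameter. The sketch below shows that general C++ technique on toy classes; it is not taken from the MLIR sources. `Operation`, `ForOp`, `IfOp`, `FirstArg` and `walk` are all hypothetical here, and ops are modelled as polymorphic pointers rather than MLIR's value-typed op wrappers.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

// Toy operation hierarchy (an assumption of this sketch).
struct Operation {
  virtual ~Operation() = default;
};
struct ForOp : Operation {};
struct IfOp : Operation {};

// Extracts the parameter type of a single-argument, non-mutable lambda by
// pattern matching on the type of its operator().
template <typename T> struct FirstArg;
template <typename C, typename R, typename A>
struct FirstArg<R (C::*)(A) const> {
  using type = A;
};

// Calls `callback` only on the ops whose dynamic type matches the callback's
// parameter type, so no explicit template argument is needed at the call site.
template <typename Callback>
void walk(const std::vector<std::unique_ptr<Operation>> &ops,
          Callback &&callback) {
  using Fn = std::decay_t<Callback>;
  using OpPtr = typename FirstArg<decltype(&Fn::operator())>::type;
  for (const auto &op : ops)
    if (auto typed = dynamic_cast<OpPtr>(op.get()))
      callback(typed);
}
```

A call such as `walk(ops, [](ForOp *op) { ... });` then visits only the `ForOp` instances, mirroring the shorter call form shown in the commit message.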
Uday Bondhugula bc2a543225 fix loop unroll and jam - operand mapping - imperfect nest case
- fix operand mapping while cloning sub-blocks to jam - was incorrect
  for imperfect nests where def/use was across sub-blocks
- strengthen/generalize the first test case to cover the previously
  missed scenario
- clean up the other cases while on this.

Previously, unroll-jamming the following nest
```
    affine.for %arg0 = 0 to 2048 {
      %0 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %1 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
    }
```

would yield

```
      %0 = alloc() : memref<512x10xf32>
      %1 = affine.apply #map0(%arg0)
      %2 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
        %5 = affine.apply #map0(%arg0)
        %6 = affine.load %0[%5, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
      %3 = affine.apply #map0(%arg0)
      dealloc %0 : memref<512x10xf32>

```

instead of

```

module {
    affine.for %arg0 = 0 to 2048 step 2 {
      %0 = alloc() : memref<512x10xf32>
      %1 = affine.apply #map0(%arg0)
      %2 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
        %5 = affine.apply #map0(%arg0)
        %6 = affine.load %2[%5, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
      %3 = affine.apply #map0(%arg0)
      dealloc %2 : memref<512x10xf32>
    }
```

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#98

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/98 from bondhugula:ujam ddbc853f69b5608b3e8ff9b5ac1f6a5a0bb315a4
PiperOrigin-RevId: 266073460
2019-08-28 23:42:50 -07:00
Uday Bondhugula aa2cee9cf5 Refactor / improve replaceAllMemRefUsesWith
Refactor replaceAllMemRefUsesWith to split it into two methods: the new
method does the replacement on a single op, and is used by the existing
one.

- make the methods return LogicalResult instead of bool

- Earlier, when replacement failed (due to non-dereferencing uses of the
  memref), the set of ops that had already been processed would have
  been replaced, leaving the IR in an inconsistent state. Now, a
  pass is made over all ops to first check for non-dereferencing
  uses, and then replacement is performed. No test cases were affected
  because all clients of this method were first checking for
  non-dereferencing uses before calling this method (for other reasons).
  This isn't true for a use case in another upcoming PR (scalar
  replacement); clients can now bail out with consistent IR on failure
  of replaceAllMemRefUsesWith. Add test case.

- multiple dereferencing uses of the same memref in a single op are
  possible (we have no such use cases/scenarios), and this has always
  remained unsupported. Add an assertion for this.

- minor fix to another pipeline-data-transfer test case.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#87

PiperOrigin-RevId: 265808183
2019-08-27 17:56:56 -07:00
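The check-then-replace structure described in aa2cee9cf5 can be sketched on a toy model of memref uses. Everything here is hypothetical and simplified: `Use`, `ReplaceResult` and `replaceAllUsesSketch` are illustration names, and a "use" is reduced to a buffer name plus a flag saying whether it dereferences the buffer. The sketch only shows why validating every use before mutating any of them keeps the data consistent on failure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of a memref use (an assumption of this sketch).
struct Use {
  std::string memref; // Name of the buffer this use refers to.
  bool dereferencing; // True for load/store-like uses.
};

enum class ReplaceResult { Success, Failure };

// Replaces every use of `oldName` with `newName`, or changes nothing at all.
ReplaceResult replaceAllUsesSketch(std::vector<Use> &uses,
                                   const std::string &oldName,
                                   const std::string &newName) {
  // Pass 1: validation only. Bail out before any mutation has happened.
  for (const Use &use : uses)
    if (use.memref == oldName && !use.dereferencing)
      return ReplaceResult::Failure;

  // Pass 2: every use is known to be replaceable, so rewrite them all.
  for (Use &use : uses)
    if (use.memref == oldName)
      use.memref = newName;
  return ReplaceResult::Success;
}
```

With a single interleaved loop, a failure on the second use would leave the first one already rewritten, which is the inconsistent state the commit message describes.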
River Riddle 2f59f76876 NFC: Remove the explicit context from Operation::create and OperationState.
The context can easily be recovered from the Location in these situations.

PiperOrigin-RevId: 265578574
2019-08-26 17:34:48 -07:00
Andy Ly 6a501e3d1b Support folding of ops with inner ops in GreedyPatternRewriteDriver.
This fixes a bug when folding ops with inner ops and inner ops are still being visited.

PiperOrigin-RevId: 265475780
2019-08-26 09:44:39 -07:00
River Riddle 32052c8417 NFC: Add a note to 'applyPatternsGreedily' that it also performs folding/dce.
Fixes tensorflow/mlir#72

PiperOrigin-RevId: 265097597
2019-08-23 11:28:45 -07:00
River Riddle ffde975e21 NFC: Move AffineOps dialect to the Dialect sub-directory.
PiperOrigin-RevId: 264482571
2019-08-20 15:36:39 -07:00
Nicolas Vasilache b628194013 Move Linalg and VectorOps dialects to the Dialect subdir - NFC
PiperOrigin-RevId: 264277760
2019-08-19 17:11:38 -07:00
River Riddle ba0fa92524 NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory.
PiperOrigin-RevId: 264193915
2019-08-19 11:01:25 -07:00
Jacques Pienaar 79f53b0cf1 Change from llvm::make_unique to std::make_unique
Switch to C++14 standard method as llvm::make_unique has been removed (
https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next
integrates.

PiperOrigin-RevId: 263953918
2019-08-17 11:06:03 -07:00
River Riddle 9c29273ddc Refactor DialectConversion to convert the signatures of blocks when they are moved.
Often we want to ensure that block arguments are converted before operations that use them. This refactors the current implementation to be cleaner/less frequent by triggering conversion when a set of blocks are moved/inlined; or when legalization is successful.

PiperOrigin-RevId: 263795005
2019-08-16 10:16:38 -07:00
Mehdi Amini 926fb685de Express ownership transfer in PassManager API through std::unique_ptr (NFC)
Since raw pointers are always passed around for IR construct without
implying any ownership transfer, it can be error prone to have implicit
ownership transferred the same way.
For example this code can seem harmless:

  Pass *pass = ....
  pm.addPass(pass);
  pm.addPass(pass);
  pm.run(module);

PiperOrigin-RevId: 263053082
2019-08-12 19:13:12 -07:00
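The ownership argument in 926fb685de can be made concrete with a toy pass manager. This is a sketch, not the MLIR API: `Pass`, `CountingPass` and `PassManager` below are hypothetical stand-ins. It shows that once `addPass` takes a `std::unique_ptr`, the transfer is visible at the call site and the same pass object cannot be handed over twice by accident.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy pass interface (an assumption of this sketch).
struct Pass {
  virtual ~Pass() = default;
  virtual void run() = 0;
};

struct CountingPass : Pass {
  explicit CountingPass(int &counter) : counter(counter) {}
  void run() override { ++counter; }
  int &counter;
};

// The manager owns its passes; callers must give up ownership explicitly.
class PassManager {
public:
  void addPass(std::unique_ptr<Pass> pass) { passes.push_back(std::move(pass)); }
  void run() {
    for (auto &pass : passes)
      pass->run();
  }
  std::size_t size() const { return passes.size(); }

private:
  std::vector<std::unique_ptr<Pass>> passes;
};
```

After `pm.addPass(std::move(pass));` the caller's `pass` is null, so a second `addPass` with the same variable no longer silently registers one object twice.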
River Riddle 5290e8c36d NFC: Update pattern rewrite API to pass OwningRewritePatternList by const reference.
The pattern list is not modified by any of these APIs and should thus be passed with const.

PiperOrigin-RevId: 262844002
2019-08-11 18:34:14 -07:00
River Riddle 1e42954032 NFC: Standardize the terminology used for parent ops/regions/etc.
There are currently several different terms used to refer to a parent IR unit in 'get' methods: getParent/getEnclosing/getContaining. This cl standardizes all of these methods to use 'getParent*'.

PiperOrigin-RevId: 262680287
2019-08-09 20:07:52 -07:00
River Riddle 41968fb475 NFC: Update usages of OwningRewritePatternList to pass by & instead of &&.
This will allow for reusing the same pattern list, which may be costly to continually reconstruct, on multiple invocations.

PiperOrigin-RevId: 262664599
2019-08-09 17:20:29 -07:00
Nicolas Vasilache 39f1b9a053 Add a higher-order vector.extractelement operation in MLIR
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.

This CL adds the vector.extractelement operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.

PiperOrigin-RevId: 262545089
2019-08-09 05:58:47 -07:00
River Riddle 8089f93746 Add utility 'replaceAllUsesWith' methods to Operation.
These methods will allow replacing the uses of results with an existing operation, with the same number of results, or a range of values. This removes a number of hand-rolled result replacement loops and simplifies replacement for operations with multiple results.

PiperOrigin-RevId: 262206600
2019-08-07 13:48:52 -07:00
Andy Ly 55f2e24ab3 Remove ops in regions/blocks from worklist when parent op is being removed via GreedyPatternRewriteDriver::replaceOp.
This fixes a bug where ops inside the parent op are visited even though the parent op has been removed.

PiperOrigin-RevId: 261953580
2019-08-06 11:08:54 -07:00
Nicolas Vasilache 24647750d4 Refactor Linalg ops to loop lowering (NFC)
This CL modifies the LowerLinalgToLoopsPass to use RewritePattern.
This will make it easier to inline Linalg generic functions and regions when emitting to loops in a subsequent CL.

PiperOrigin-RevId: 261894120
2019-08-06 05:38:16 -07:00
River Riddle a0df3ebd15 NFC: Implement OwningRewritePatternList as a class instead of a using directive.
This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.

PiperOrigin-RevId: 261816316
2019-08-05 18:38:22 -07:00
Mehdi Amini 0c3923e1dc Fix clang 5.0 by using type aliases for LLVM DenseSet/Map
When inlining the declaration for llvm::DenseSet/DenseMap in the mlir
namespace from a forward declaration, clang does not take the default
for the template parameters if they are declared later.

namespace llvm {
  template<typename Foo>
  class DenseMap;
}
namespace mlir {
  using llvm::DenseMap;
}
namespace llvm {
  template<typename Foo = int>
  class DenseMap {};
}

namespace mlir {
  DenseMap<> map;
}

PiperOrigin-RevId: 261495612
2019-08-03 11:35:50 -07:00
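The workaround named in the subject of 0c3923e1dc, a type alias instead of a using-declaration, can be sketched on the same kind of toy template the commit message uses. The namespaces `lib` and `app` and the class `Map` are hypothetical stand-ins for `llvm`, `mlir` and `DenseMap`. The alias template restates the default argument itself, so it does not depend on the compiler picking up a default that is declared later.

```cpp
#include <cassert>
#include <type_traits>

namespace lib {
// Forward declaration without default template arguments.
template <typename Foo> class Map;
} // namespace lib

namespace app {
// Alias template that carries its own default, instead of `using lib::Map;`.
template <typename Foo = int> using Map = lib::Map<Foo>;
} // namespace app

namespace lib {
// Full definition, where the default argument finally appears.
template <typename Foo = int> class Map {
public:
  using value_type = Foo;
};
} // namespace lib

namespace app {
// `Map<>` resolves through the alias, independent of the later default.
inline Map<> makeMap() { return Map<>(); }
} // namespace app
```

The cost of this approach is that the defaults are written twice and have to be kept in sync by hand.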
Alex Zinenko 58e66d71e7 AffineDataCopyGeneration: don't use CL flag values inside the pass
AffineDataCopyGeneration pass relied on command line flags for internal logic
in several places, which makes it unusable in a library context (i.e. outside a
standalone mlir-opt binary that does the command line parsing).  Define
configuration flags in the constructor instead, and set them up to command
line-based defaults to maintain the original behavior.

PiperOrigin-RevId: 261322364
2019-08-02 08:04:30 -07:00
Uday Bondhugula 18b8d4352b Introduce explicit copying optimization by generalizing the DMA generation pass
Explicit copying to contiguous buffers is a standard technique to avoid
conflict misses and TLB misses, and improve hardware prefetching
performance. When done in conjunction with cache tiling, it nearly
eliminates all cache conflict and TLB misses, and a single hardware
prefetch stream is needed per data tile.

- generalize/extend DMA generation pass (renamed data copying pass) to
  perform either point-wise explicit copies to fast memory buffers or
  DMAs (depending on a cmd line option). All logic is the same as
  erstwhile -dma-generate.

- -affine-dma-generate is now renamed -affine-data-copy; when -dma flag is
  provided, DMAs are generated, or else explicit copy loops are generated
  (point-wise) by default.

- point-wise copying could be used for CPUs (or GPUs); some indicative
  performance numbers with a "C" version of the MLIR when compiled with
  and without this optimization (about 2x improvement here).

  With a matmul on 4096^2 matrices on a single core of an Intel Core i7
  Skylake i7-8700K with clang 8.0.0:

  clang -O3:                                  518s
  clang -O3 with MLIR tiling (128x128):       24.5s
  clang -O3 with MLIR tiling + data copying:  12.4s
  (code equivalent to test/Transforms/data-copy.mlir func @matmul)

- fix some misleading comments.

- change default fast-mem space to 0 (more intuitive now with the
  default copy generation using point-wise copies instead of DMAs)

On a simple 3-d matmul loop nest, code generated with -affine-data-copy:

```
  affine.for %arg3 = 0 to 4096 step 128 {
    affine.for %arg4 = 0 to 4096 step 128 {
      %0 = affine.apply #map0(%arg3, %arg4)
      %1 = affine.apply #map1(%arg3, %arg4)
      %2 = alloc() : memref<128x128xf32, 2>
      // Copy-in Out matrix.
      affine.for %arg5 = 0 to 128 {
        %5 = affine.apply #map2(%arg3, %arg5)
        affine.for %arg6 = 0 to 128 {
          %6 = affine.apply #map2(%arg4, %arg6)
          %7 = load %arg2[%5, %6] : memref<4096x4096xf32>
          affine.store %7, %2[%arg5, %arg6] : memref<128x128xf32, 2>
        }
      }
      affine.for %arg5 = 0 to 4096 step 128 {
        %5 = affine.apply #map0(%arg3, %arg5)
        %6 = affine.apply #map1(%arg3, %arg5)
        %7 = alloc() : memref<128x128xf32, 2>
        // Copy-in LHS.
        affine.for %arg6 = 0 to 128 {
          %11 = affine.apply #map2(%arg3, %arg6)
          affine.for %arg7 = 0 to 128 {
            %12 = affine.apply #map2(%arg5, %arg7)
            %13 = load %arg0[%11, %12] : memref<4096x4096xf32>
            affine.store %13, %7[%arg6, %arg7] : memref<128x128xf32, 2>
          }
        }
        %8 = affine.apply #map0(%arg5, %arg4)
        %9 = affine.apply #map1(%arg5, %arg4)
        %10 = alloc() : memref<128x128xf32, 2>
        // Copy-in RHS.
        affine.for %arg6 = 0 to 128 {
          %11 = affine.apply #map2(%arg5, %arg6)
          affine.for %arg7 = 0 to 128 {
            %12 = affine.apply #map2(%arg4, %arg7)
            %13 = load %arg1[%11, %12] : memref<4096x4096xf32>
            affine.store %13, %10[%arg6, %arg7] : memref<128x128xf32, 2>
          }
        }
        // Compute.
        affine.for %arg6 = #map7(%arg3) to #map8(%arg3) {
          affine.for %arg7 = #map7(%arg4) to #map8(%arg4) {
            affine.for %arg8 = #map7(%arg5) to #map8(%arg5) {
              %11 = affine.load %7[-%arg3 + %arg6, -%arg5 + %arg8] : memref<128x128xf32, 2>
              %12 = affine.load %10[-%arg5 + %arg8, -%arg4 + %arg7] : memref<128x128xf32, 2>
              %13 = affine.load %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
              %14 = mulf %11, %12 : f32
              %15 = addf %13, %14 : f32
              affine.store %15, %2[-%arg3 + %arg6, -%arg4 + %arg7] : memref<128x128xf32, 2>
            }
          }
        }
        dealloc %10 : memref<128x128xf32, 2>
        dealloc %7 : memref<128x128xf32, 2>
      }
      %3 = affine.apply #map0(%arg3, %arg4)
      %4 = affine.apply #map1(%arg3, %arg4)
      // Copy out result matrix.
      affine.for %arg5 = 0 to 128 {
        %5 = affine.apply #map2(%arg3, %arg5)
        affine.for %arg6 = 0 to 128 {
          %6 = affine.apply #map2(%arg4, %arg6)
          %7 = affine.load %2[%arg5, %arg6] : memref<128x128xf32, 2>
          store %7, %arg2[%5, %6] : memref<4096x4096xf32>
        }
      }
      dealloc %2 : memref<128x128xf32, 2>
    }
  }
```

With -affine-data-copy -dma:

```
  affine.for %arg3 = 0 to 4096 step 128 {
    %0 = affine.apply #map3(%arg3)
    %1 = alloc() : memref<128xf32, 2>
    %2 = alloc() : memref<1xi32>
    affine.dma_start %arg2[%arg3], %1[%c0], %2[%c0], %c128_0 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
    affine.dma_wait %2[%c0], %c128_0 : memref<1xi32>
    %3 = alloc() : memref<1xi32>
    affine.for %arg4 = 0 to 4096 step 128 {
      %5 = affine.apply #map0(%arg3, %arg4)
      %6 = affine.apply #map1(%arg3, %arg4)
      %7 = alloc() : memref<128x128xf32, 2>
      %8 = alloc() : memref<1xi32>
      affine.dma_start %arg0[%arg3, %arg4], %7[%c0, %c0], %8[%c0], %c16384, %c4096, %c128_2 : memref<4096x4096xf32>, memref<128x128xf32, 2>, memref<1xi32>
      affine.dma_wait %8[%c0], %c16384 : memref<1xi32>
      %9 = affine.apply #map3(%arg4)
      %10 = alloc() : memref<128xf32, 2>
      %11 = alloc() : memref<1xi32>
      affine.dma_start %arg1[%arg4], %10[%c0], %11[%c0], %c128_1 : memref<4096xf32>, memref<128xf32, 2>, memref<1xi32>
      affine.dma_wait %11[%c0], %c128_1 : memref<1xi32>
      affine.for %arg5 = #map3(%arg3) to #map5(%arg3) {
        affine.for %arg6 = #map3(%arg4) to #map5(%arg4) {
          %12 = affine.load %7[-%arg3 + %arg5, -%arg4 + %arg6] : memref<128x128xf32, 2>
          %13 = affine.load %10[-%arg4 + %arg6] : memref<128xf32, 2>
          %14 = affine.load %1[-%arg3 + %arg5] : memref<128xf32, 2>
          %15 = mulf %12, %13 : f32
          %16 = addf %14, %15 : f32
          affine.store %16, %1[-%arg3 + %arg5] : memref<128xf32, 2>
        }
      }
      dealloc %11 : memref<1xi32>
      dealloc %10 : memref<128xf32, 2>
      dealloc %8 : memref<1xi32>
      dealloc %7 : memref<128x128xf32, 2>
    }
    %4 = affine.apply #map3(%arg3)
    affine.dma_start %1[%c0], %arg2[%arg3], %3[%c0], %c128 : memref<128xf32, 2>, memref<4096xf32>, memref<1xi32>
    affine.dma_wait %3[%c0], %c128 : memref<1xi32>
    dealloc %3 : memref<1xi32>
    dealloc %2 : memref<1xi32>
    dealloc %1 : memref<128xf32, 2>
  }
```

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#50

PiperOrigin-RevId: 261221903
2019-08-01 16:31:58 -07:00
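The point-wise copy structure that 18b8d4352b shows in generated IR can be written by hand as a small C++ routine. This is only a sketch of the technique (tiling plus explicit copies into contiguous buffers), not output of the pass: `matmulTiledWithCopy` is a hypothetical function, matrices are square and row-major, and the size is assumed to be a multiple of the tile size.

```cpp
#include <cassert>
#include <vector>

// Computes C += A * B for n x n row-major matrices using t x t tiles that are
// first copied into small contiguous buffers. Assumes n % t == 0.
void matmulTiledWithCopy(const std::vector<float> &a,
                         const std::vector<float> &b, std::vector<float> &c,
                         int n, int t) {
  assert(t > 0 && n % t == 0 && "sketch handles full tiles only");
  std::vector<float> aBuf(t * t), bBuf(t * t), cBuf(t * t);
  for (int i0 = 0; i0 < n; i0 += t) {
    for (int j0 = 0; j0 < n; j0 += t) {
      // Copy in the output tile.
      for (int i = 0; i < t; ++i)
        for (int j = 0; j < t; ++j)
          cBuf[i * t + j] = c[(i0 + i) * n + (j0 + j)];
      for (int k0 = 0; k0 < n; k0 += t) {
        // Copy in the LHS tile.
        for (int i = 0; i < t; ++i)
          for (int k = 0; k < t; ++k)
            aBuf[i * t + k] = a[(i0 + i) * n + (k0 + k)];
        // Copy in the RHS tile.
        for (int k = 0; k < t; ++k)
          for (int j = 0; j < t; ++j)
            bBuf[k * t + j] = b[(k0 + k) * n + (j0 + j)];
        // Compute on the contiguous buffers only.
        for (int i = 0; i < t; ++i)
          for (int j = 0; j < t; ++j)
            for (int k = 0; k < t; ++k)
              cBuf[i * t + j] += aBuf[i * t + k] * bBuf[k * t + j];
      }
      // Copy the result tile back out.
      for (int i = 0; i < t; ++i)
        for (int j = 0; j < t; ++j)
          c[(i0 + i) * n + (j0 + j)] = cBuf[i * t + j];
    }
  }
}
```

The loop structure follows the first IR listing in the commit message: copy-in of the output tile, copy-in of LHS and RHS tiles inside the reduction loop, compute, then copy-out.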
Jacques Pienaar 0fa1ea704c Initialize union to avoid -Wmissing-field-initializers warning.
Reported by clang-6.

PiperOrigin-RevId: 260311814
2019-07-27 11:47:26 -07:00
Nicolas Vasilache fae4d94990 Use "standard" load and stores in LowerVectorTransfers
Clipping creates non-affine memory accesses, use std_load and std_store instead of affine_load and affine_store.
In the future we may also want a fill with the neutral element rather than clip; this would make the accesses affine if we wanted more analyses and transformations to happen post lowering to pointwise copies.

PiperOrigin-RevId: 260110503
2019-07-26 02:34:24 -07:00
River Riddle 1293708473 Add support for an analysis mode to DialectConversion.
This mode analyzes which operations are legalizable to the given target if a conversion were to be applied, i.e. no rewrites are ever performed even on success. This mode is useful for device partitioning or other utilities that may want to analyze the effect of conversion to different targets before performing it.

The analysis method currently just fills a provided set with the operations that were found to be legalizable. This can be extended in the future to capture more information as necessary.

PiperOrigin-RevId: 259987105
2019-07-25 11:31:07 -07:00
Nicolas Vasilache 48a1baeb8a Refactor LoopParametricTiling as a test pass - NFC
This CL moves LoopParametricTiling into test/lib as a pass for purely testing purposes.

PiperOrigin-RevId: 259300264
2019-07-22 04:31:17 -07:00
River Riddle 00bdc8e070 Refactor region type signature conversion to be explicit via patterns.
This cl enforces that the conversion of the type signatures for regions, and thus their entry blocks, is handled via ConversionPatterns. A new hook 'applySignatureConversion' is added to the ConversionPatternRewriter to perform the desired conversion on a region. This also means that the handling of rewriting the signature of a FuncOp is moved to a pattern. A default implementation is provided via 'mlir::populateFuncOpTypeConversionPattern'. This removes the hacky implicit 'dynamically legal' status of FuncOp that was present previously, and leaves it up to the user to decide when/how to convert the signature of a function.

PiperOrigin-RevId: 259161999
2019-07-20 19:06:07 -07:00
Nicolas Vasilache d2a872922f Refactor stripmineSink for AffineForOp - NFC
More moving less cloning.

PiperOrigin-RevId: 258947575
2019-07-19 11:40:37 -07:00
Nicolas Vasilache db4cd1c8dc Utility function to map a loop on a parametric grid of virtual processors
This CL introduces a simple loop utility function which rewrites the bounds and step of a loop so as to become mappable on a regular grid of processors whose identifiers are given by SSA values.

A corresponding unit test is added.

For example, using CUDA terminology, and assuming a 2-d grid with processorIds = [blockIdx.x, threadIdx.x] and numProcessors = [gridDim.x, blockDim.x], the loop:
```
   loop.for %i = %lb to %ub step %step {
     ...
   }
```
is rewritten into a version resembling the following pseudo-IR:
```
   loop.for %i = %lb + threadIdx.x + blockIdx.x * blockDim.x to %ub
      step %gridDim.x * blockDim.x {
     ...
   }
```
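The effect of this rewrite can be checked with a small standalone simulation. The sketch below is plain C++, not MLIR code; `iterationsForProcessor` and `iterationCoverage` are hypothetical helper names introduced only for illustration. It assumes a positive step and scales the per-processor offset and the stride by the original step, which reduces to the pseudo-IR above when the step is 1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not an MLIR API: lists the iterations that one virtual
// processor executes once `for (i = lb; i < ub; i += step)` has been mapped
// onto gridDim * blockDim processors.
std::vector<int64_t> iterationsForProcessor(int64_t lb, int64_t ub,
                                            int64_t step, int64_t blockIdx,
                                            int64_t threadIdx, int64_t gridDim,
                                            int64_t blockDim) {
  assert(step > 0 && gridDim > 0 && blockDim > 0 && "expected positive sizes");
  int64_t procId = threadIdx + blockIdx * blockDim; // linearized processor id
  int64_t numProcs = gridDim * blockDim;            // total processor count
  std::vector<int64_t> iterations;
  // New lower bound: lb + procId * step. New step: numProcs * step.
  for (int64_t i = lb + procId * step; i < ub; i += numProcs * step)
    iterations.push_back(i);
  return iterations;
}

// Counts how often each original iteration is executed across the whole grid.
// Every entry is expected to be 1 if the mapping partitions the iterations.
std::vector<int> iterationCoverage(int64_t lb, int64_t ub, int64_t step,
                                   int64_t gridDim, int64_t blockDim) {
  assert(ub >= lb && step > 0 && "expected a non-negative range");
  std::vector<int> hits((ub - lb + step - 1) / step, 0);
  for (int64_t b = 0; b < gridDim; ++b)
    for (int64_t t = 0; t < blockDim; ++t)
      for (int64_t i :
           iterationsForProcessor(lb, ub, step, b, t, gridDim, blockDim))
        ++hits[(i - lb) / step];
  return hits;
}
```

Calling `iterationCoverage` on a few ranges is a quick way to convince oneself that no iteration is dropped or duplicated by the mapping.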

PiperOrigin-RevId: 258945942
2019-07-19 11:40:31 -07:00
Nicolas Vasilache 5bc344743c Uniformize the API for the mlir::tile functions on AffineForOp and loop::ForOp
This CL adapts the recently introduced parametric tiling to have an API matching the tiling
of AffineForOp. The transformation using stripmineSink is more general and produces imperfectly nested loops.

Perfect nesting invariants of the tiled version are obtained by selectively applying hoisting of ops to isolate perfectly nested bands. Such hoisting may fail to produce a perfect loop nest in cases where a ForOp transitively depends on enclosing induction variables. In such cases, the API returns a LogicalResult, but the SimpleParametricLoopTilingPass does not currently use this result.

A new unit test is added with a triangular loop for which the perfect nesting property does not hold. For this example, the old behavior was to produce IR that did not verify (some use was not dominated by its def).

PiperOrigin-RevId: 258928309
2019-07-19 11:40:25 -07:00
River Riddle 28057ff3da Add support for providing a legality callback for dynamic legality in DialectConversion.
This allows for providing specific handling for dynamically legal operations/dialects without overriding the general 'isDynamicallyLegal' hook. This also means that a derived ConversionTarget class need not always be defined when some operations are dynamically legal.

Example usage:

ConversionTarget target(...);
target.addDynamicallyLegalOp<ReturnOp>([](ReturnOp op) {
  return ...
});
target.addDynamicallyLegalDialect<StandardOpsDialect>([](Operation *op) {
  return ...
});

PiperOrigin-RevId: 258884753
2019-07-19 11:40:19 -07:00
River Riddle 8b447b6cad NFC: Expose a ConversionPatternRewriter for use with ConversionPatterns.
This specific PatternRewriter will allow for exposing hooks in the future that are only useful for the conversion framework, e.g. type conversions.

PiperOrigin-RevId: 258818122
2019-07-19 11:40:00 -07:00
River Riddle 9e3c2650d2 Refactor the conversion of block argument types in DialectConversion.
This cl begins a large refactoring over how signature types are converted in the DialectConversion infrastructure. The signatures of blocks are now converted on-demand when an operation held by that block is being converted. This allows for handling the case where a region is created as part of a pattern, something that wasn't possible previously.

This cl also generalizes the region signature conversion used by FuncOp to work on any region of any operation. This generalization allows for removing the 'apply*Conversion' functions that were specific to FuncOp/ModuleOp. The implementation currently uses a new hook on TypeConverter, 'convertRegionSignature', but this should ideally be removed in favor of using Patterns. That depends on adding support to the PatternRewriter used by ConversionPattern to allow applying signature conversions to regions, which should be coming in a followup.

PiperOrigin-RevId: 258645733
2019-07-19 11:38:45 -07:00
River Riddle 491ef84dc4 Add support for explicitly marking dialects and operations as illegal.
This explicit tag is useful in several ways:
*) This simplifies how to mark sub sections of a dialect as explicitly unsupported, e.g. my target supports all operations in the foo dialect except for these select few. This is useful for partial lowerings between dialects.
*) Partial conversions will now verify that operations that were explicitly marked as illegal must be converted. This provides some guarantee that the operations that need to be lowered by a specific pass will be.

PiperOrigin-RevId: 258582879
2019-07-19 11:38:25 -07:00
Nicolas Vasilache 0002e2964d Move affine.for and affine.if to ODS
As the move to ODS is made, body and region names across affine and loop dialects are uniformized.

PiperOrigin-RevId: 258416590
2019-07-16 13:45:47 -07:00
River Riddle 2b9855b5b4 Refactor DialectConversion to support different conversion modes.
Users generally want several different modes of conversion. This cl refactors DialectConversion to provide two:
* Partial (applyPartialConversion)
  - This mode allows for illegal operations to exist in the IR, and does not fail if an operation fails to be legalized.

* Full (applyFullConversion)
  - This mode fails if any operation is not properly legalized to the conversion target. This allows for ensuring that the IR after a conversion only contains operations legal for the target.
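The difference between the two modes can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the DialectConversion API: an operation is reduced to its name, `ToyTarget`, `legalize`, `toyPartialConversion` and `toyFullConversion` are hypothetical names, and each rewrite is assumed to produce a legal operation in one step. The toy full mode leaves its input untouched on failure; the commit message does not say how the real framework behaves in that respect.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins, not MLIR classes: an "operation" is just its name.
struct ToyTarget {
  std::set<std::string> legalOps;              // operations legal on the target
  std::map<std::string, std::string> rewrites; // illegal op -> legal op
};

// Tries to legalize a single operation in place; returns false on failure.
bool legalize(const ToyTarget &target, std::string &op) {
  if (target.legalOps.count(op))
    return true;
  auto it = target.rewrites.find(op);
  if (it == target.rewrites.end())
    return false;
  assert(target.legalOps.count(it->second) &&
         "toy rewrites must produce legal ops");
  op = it->second;
  return true;
}

// Partial mode: converts what it can and tolerates leftover illegal ops.
bool toyPartialConversion(const ToyTarget &target,
                          std::vector<std::string> &ops) {
  for (std::string &op : ops)
    (void)legalize(target, op);
  return true;
}

// Full mode: fails as soon as one operation cannot be legalized.
bool toyFullConversion(const ToyTarget &target,
                       std::vector<std::string> &ops) {
  std::vector<std::string> converted = ops;
  for (std::string &op : converted)
    if (!legalize(target, op))
      return false;
  ops = converted;
  return true;
}
```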

PiperOrigin-RevId: 258412243
2019-07-16 13:45:41 -07:00
River Riddle 2087bf6386 Remove lowerAffineConstructs and lowerControlFlow in favor of providing patterns.
These methods don't compose well with the rest of the conversion framework and create artificial breaks in conversion. Replace them with two functions (populateAffineToStdConversionPatterns and populateLoopToStdConversionPatterns, respectively) that populate a list of patterns providing the same behavior.

PiperOrigin-RevId: 258219277
2019-07-16 13:44:45 -07:00
River Riddle e7a2ef21f9 Update 'applyPatternsGreedily' to work on the regions of any operations.
'applyPatternsGreedily' is a useful utility outside of just function regions.

PiperOrigin-RevId: 258182937
2019-07-16 13:44:39 -07:00
River Riddle 7d1e1e6721 Refactor the traversal of operations to Convert in DialectConversion.
This cl changes the way that operations/blocks to convert are collected/traversed so that parent region operations can be legalized before their bodies. Most RewritePatterns for region operations assume that the entry arguments to each region are yet to be converted. Given that the bodies are currently converted first, this makes it difficult to fit these patterns into the same run as one converting types.

The operations/blocks to convert are now collected before any legalization has run, which simplifies the conversion logic itself, as legalization may insert new operations, move blocks, etc.

PiperOrigin-RevId: 258170158
2019-07-16 13:44:33 -07:00
River Riddle 40715789f8 Refactor LowerAffine to use OpRewritePattern instead of ConversionPattern.
ConversionPattern should ideally only be used when the types of the operands are changing, which in this case they aren't. Using OpRewritePattern also leads to much simpler code.

PiperOrigin-RevId: 258158474
2019-07-16 13:44:09 -07:00
Alex Zinenko fc044e8929 Introduce loop coalescing utility and a simple pass
Multiple (perfectly) nested loops with independent bounds can be combined into
a single loop and then subdivided into blocks of arbitrary size for load
balancing or more efficient parallelism exploitation.  However, MLIR wants to
preserve the multi-dimensional multi-loop structure at higher levels of
abstraction. Introduce a transformation that coalesces nested loops with
independent bounds so that they can be further subdivided by tiling.
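As a rough illustration of the idea, the sketch below coalesces a normalized 2-d nest (both loops start at 0 with step 1) into a single loop in plain C++. Recovering the original indices with division and remainder is an assumption made for this example; the commit message does not describe how the utility reconstructs them. `nestedOrder` and `coalescedOrder` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using IndexPair = std::pair<int64_t, int64_t>;

// Reference: the (i, j) pairs visited by the original 2-d loop nest.
std::vector<IndexPair> nestedOrder(int64_t n, int64_t m) {
  assert(n >= 0 && m >= 0 && "expected non-negative trip counts");
  std::vector<IndexPair> order;
  for (int64_t i = 0; i < n; ++i)
    for (int64_t j = 0; j < m; ++j)
      order.emplace_back(i, j);
  return order;
}

// Coalesced form: one loop over n * m iterations; the original induction
// variables are recovered from the linear index k.
std::vector<IndexPair> coalescedOrder(int64_t n, int64_t m) {
  assert(n >= 0 && m >= 0 && "expected non-negative trip counts");
  std::vector<IndexPair> order;
  for (int64_t k = 0; k < n * m; ++k)
    order.emplace_back(k / m, k % m);
  return order;
}
```

The single loop over `k` can then be split into blocks of any size, independently of `n` and `m`.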

PiperOrigin-RevId: 258151016
2019-07-16 13:43:44 -07:00
Nicolas Vasilache cca53e8527 Extract std.for std.if and std.terminator in their own dialect
These ops should not belong to the std dialect.
This CL extracts them in their own dialect and updates the corresponding conversions and tests.

PiperOrigin-RevId: 258123853
2019-07-16 13:43:18 -07:00
River Riddle a764c19d17 Fix a bug in DialectConversion when using RewritePattern.
When using a RewritePattern and replacing an operation with an existing value, that value may have already been replaced by something else. This cl ensures that only the final value is used when applying rewrites.
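The underlying issue is a chain of replacements: a value used as a replacement may itself be replaced later. A minimal sketch of resolving such a chain is shown below, assuming values are identified by name and replacements are stored in a map. `ReplacementMap` and `resolveFinalValue` are hypothetical names for illustration and do not reflect the actual data structures in DialectConversion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical model: maps a replaced value to the value that replaced it.
using ReplacementMap = std::unordered_map<std::string, std::string>;

// Follows the replacement chain until reaching a value that was not replaced.
std::string resolveFinalValue(const ReplacementMap &replacements,
                              std::string value) {
  std::size_t steps = 0;
  auto it = replacements.find(value);
  while (it != replacements.end()) {
    ++steps;
    // A chain longer than the map itself can only come from a cycle.
    assert(steps <= replacements.size() && "replacement cycle");
    value = it->second;
    it = replacements.find(value);
  }
  return value;
}
```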

PiperOrigin-RevId: 258058488
2019-07-16 13:43:12 -07:00
River Riddle e50a8bd19c NFC: Add header blocks to DialectConversion.h to improve readability.
PiperOrigin-RevId: 257903383
2019-07-13 05:55:50 -07:00
River Riddle 2566a72a21 Update the PatternRewriter constructor to take a context instead of a region.
This will allow for cleanly using a rewriter for multiple different regions.

PiperOrigin-RevId: 257845371
2019-07-12 17:42:52 -07:00
River Riddle 8e349a48b6 Remove the 'region' field from OpBuilder.
This field wasn't updated as the insertion point changed, making it potentially dangerous given the multi-level nature of MLIR (e.g. 'createBlock' would always insert the new block in 'region'). This also allows for building an OpBuilder with just a context.

PiperOrigin-RevId: 257829135
2019-07-12 17:42:41 -07:00
River Riddle 60a2983779 Fix a bug in the canonicalizer when replacing constants via patterns.
The GreedyPatternRewriteDriver currently does not notify the OperationFolder when constants are removed as part of a pattern match. This manifests as a nasty bug where a different operation may be allocated at the same address, which triggers an assertion in the OperationFolder when it gets notified of the new operation's removal.

PiperOrigin-RevId: 257817627
2019-07-12 17:42:24 -07:00
Nicolas Vasilache cab671d166 Lower affine control flow to std control flow to LLVM dialect
This CL splits the lowering of affine to LLVM into 2 parts:
1. affine -> std
2. std -> LLVM

The conversions mostly consist of splitting concerns between the affine and non-affine worlds from existing conversions.
Short-circuiting of affine `if` conditions was never tested or exercised and is removed in the process; it can be reintroduced later if needed.

LoopParametricTiling.cpp is updated to reflect the newly added ForOp::build.

PiperOrigin-RevId: 257794436
2019-07-12 08:44:28 -07:00
River Riddle 9dbef0bf96 Rename FunctionAttr to SymbolRefAttr.
This allows for the attribute to hold symbolic references to other operations than FuncOp. This also allows for removing the dependence on FuncOp from the base Builder.

PiperOrigin-RevId: 257650017
2019-07-12 08:43:42 -07:00
Alex Zinenko 054e25c079 EDSC: use affine.load/store instead of std.load/store
Standard load and store operations are evolving to be separated from the Affine
constructs.  Special affine.load/store have been introduced to uphold the
restrictions of the Affine control flow constructs on their operands.
EDSC-produced loads and stores were originally intended to uphold those
restrictions as well so they should use affine.load/store instead of
std.load/store.

PiperOrigin-RevId: 257443307
2019-07-12 08:42:28 -07:00
River Riddle fec20e590f NFC: Rename Module to ModuleOp.
Module is a legacy name that only exists as a typedef of ModuleOp.

PiperOrigin-RevId: 257427248
2019-07-10 10:11:21 -07:00
River Riddle 8c44367891 NFC: Rename Function to FuncOp.
PiperOrigin-RevId: 257293379
2019-07-10 10:10:53 -07:00
Alex Zinenko 7a2e8726e8 Fix a test broken on some systems due to a mis-rebase.
PiperOrigin-RevId: 257190161
2019-07-09 07:43:42 -07:00
Alex Zinenko 9d03f5674f Implement parametric tiling on standard for loops
Parametric tiling can be used to extract outer loops with fixed number of
iterations.  This in turn enables mapping to GPU kernels on a fixed grid
independently of the range of the original loops, which may be unknown
statically, making the kernel adaptable to different sizes.  Provide a utility
function that also computes the parametric tile size given the range of the
loop.  Exercise the utility function through a simple pass that applies it to
all top-level loop nests.  Permutability or parallelism checks must be
performed before calling this utility function in actual passes.

Note that parametric tiling cannot be implemented in a purely affine way,
although it can be encoded using semi-affine maps.  The choice to implement it
on standard loops is guided by them being the common representation between
Affine loops, Linalg and GPU kernels.
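To make the tile-size computation concrete, here is a plain C++ sketch under stated assumptions: the tile size is taken as the ceiling of the loop range divided by the desired number of outer iterations, and the loop has unit step. `parametricTileSize` and `tileLoop` are hypothetical names and are not the utility added by this change. With this choice the outer loop runs at most the requested number of times.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Ceiling division for a non-negative numerator and a positive denominator.
int64_t ceilDiv(int64_t a, int64_t b) {
  assert(a >= 0 && b > 0 && "expected a >= 0 and b > 0");
  return (a + b - 1) / b;
}

// Tile size chosen so that the outer loop has at most numOuterIterations
// iterations, whatever the (possibly runtime) range [lb, ub) is.
int64_t parametricTileSize(int64_t lb, int64_t ub, int64_t numOuterIterations) {
  return ceilDiv(ub - lb, numOuterIterations);
}

// Executes the tiled loop and returns the iterations grouped by tile.
std::vector<std::vector<int64_t>> tileLoop(int64_t lb, int64_t ub,
                                           int64_t numOuterIterations) {
  int64_t tile = parametricTileSize(lb, ub, numOuterIterations);
  std::vector<std::vector<int64_t>> tiles;
  if (tile == 0)
    return tiles; // empty range, nothing to execute
  for (int64_t outer = lb; outer < ub; outer += tile) {
    tiles.emplace_back();
    // The inner loop is clamped so the last tile does not run past ub.
    for (int64_t i = outer; i < std::min(outer + tile, ub); ++i)
      tiles.back().push_back(i);
  }
  return tiles;
}
```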

PiperOrigin-RevId: 257180251
2019-07-09 06:37:41 -07:00
River Riddle 626b8b6a5d NFC: Remove `Module::getFunctions` in favor of a general `getOps<T>`.
Modules can now contain more than just Functions, this just updates the iteration API to reflect that. The 'begin'/'end' methods have also been updated to iterate over opaque Operations.

PiperOrigin-RevId: 257099084
2019-07-08 18:28:17 -07:00
River Riddle ce502af9cd NFC: Remove the various "::getFunction" methods.
These methods assume that a function is a valid builtin top-level operation, and removing these methods allows for decoupling FuncOp and IR/. Utility "getParentOfType" methods have been added to Operation/OpState to allow for querying the first parent operation of a given type.

PiperOrigin-RevId: 257018913
2019-07-08 12:40:08 -07:00
River Riddle 474e354179 NFC: Remove Region::getContainingFunction as Functions are now Operations.
PiperOrigin-RevId: 256579717
2019-07-04 13:23:10 -07:00
Andy Davis 2e1187dd25 Globally change load/store/dma_start/dma_wait operations over to affine.load/store/dma_start/dma_wait.
In most places, this is just a name change (with the exception of affine.dma_start swapping the operand positions of its tag memref and num_elements operands).
Significant code changes occur here:
*) Vectorization: LoopAnalysis.cpp, Vectorize.cpp
*) Affine Transforms: Transforms/Utils/Utils.cpp

PiperOrigin-RevId: 256395088
2019-07-03 14:37:06 -07:00
River Riddle 206e55cc16 NFC: Refactor Module to be value typed.
As with Functions, Module will soon become an operation, and operations are value-typed. This eases the transition from Module to ModuleOp. A new class, OwningModuleRef, is provided to own a reference to a Module; it auto-deletes the held module on destruction.

PiperOrigin-RevId: 256196193
2019-07-02 16:43:36 -07:00
River Riddle 705b80918d Generalize the CFG graph printing for Functions to work on Regions instead.
PiperOrigin-RevId: 256029944
2019-07-01 17:02:51 -07:00
River Riddle 694975ddbc Standardize the definition and usage of getAllArgAttrs between FuncOp and Function.
PiperOrigin-RevId: 255988352
2019-07-01 11:39:12 -07:00
River Riddle 54cd6a7e97 NFC: Refactor Function to be value typed.
Move the data members out of Function and into a new impl storage class 'FunctionStorage'. This allows for Function to become value typed, which will greatly simplify the transition of Function to FuncOp(given that FuncOp is also value typed).

PiperOrigin-RevId: 255983022
2019-07-01 11:39:00 -07:00
Alex Zinenko 5eef726bc8 TypeConversion: do not materialize conversion of the type to itself
Type conversion does not necessarily affect all types, some of them may remain
untouched.  The type conversion tool from the dialect conversion framework will
unconditionally insert a temporary cast operation from the type to itself
anyway, and will try to materialize it to a real conversion operation if there
are remaining uses.  Simply use the original value instead.

PiperOrigin-RevId: 255975450
2019-07-01 09:56:56 -07:00
Andy Davis f487d20bf0 Add affine-to-standard lowerings for affine.load/store/dma_start/dma_wait.
PiperOrigin-RevId: 255960171
2019-07-01 09:56:22 -07:00
Nicolas Vasilache e7f51ad08a Add a folder-based EDSC intrinsics constructor (NFC)
PiperOrigin-RevId: 255908660
2019-07-01 09:55:35 -07:00
River Riddle 7c755d06aa Refactor DialectConversion to use 'materializeConversion' when a type conversion must persist after the conversion has finished.
During conversion, if a type conversion has dangling uses, it must persist after conversion has finished to maintain valid IR. In these cases, we now query the TypeConverter to materialize a conversion for us. This allows for the default case of a full conversion to continue working as expected, but also handles the degenerate cases more robustly.

PiperOrigin-RevId: 255637171
2019-06-28 11:29:04 -07:00
River Riddle a4c3a6455c Move the emitError/Warning/Remark utility methods out of MLIRContext and into the mlir namespace.
Now that Locations are attributes, they have direct access to the MLIR context. This allows for simplifying error emission by removing unnecessary context lookups.

PiperOrigin-RevId: 255112791
2019-06-25 21:32:23 -07:00
River Riddle 49162524d8 NFC: Uniformize the return of the LocationAttr 'get' methods to 'Location'.
PiperOrigin-RevId: 255078768
2019-06-25 16:57:56 -07:00
River Riddle 66ed7d6d83 Update the OperationFolder to find a valid insertion point when materializing constants.
The OperationFolder currently just inserts into the entry block of a Function, but regions may be isolated above, i.e. explicit capture only, and blindly inserting constants may break the invariants of these regions.

PiperOrigin-RevId: 254987796
2019-06-25 09:43:21 -07:00
River Riddle c32080a1b0 NFC: Move the ArgConverter methods out-of-line to improve readability.
PiperOrigin-RevId: 254872695
2019-06-24 17:47:51 -07:00
Nicolas Vasilache 95cfd99616 Fix OSS build
PiperOrigin-RevId: 254847773
2019-06-24 17:47:27 -07:00
Nicolas Vasilache dac75ae5ff Split test-specific passes out of mlir-opt
Instead put their impl in test/lib and link them into mlir-test-opt

PiperOrigin-RevId: 254837439
2019-06-24 17:47:12 -07:00
River Riddle b67cab4c44 Update CSE to respect nested regions that are isolated from above. This cl also removes the unused 'NthRegionIsIsolatedFromAbove' trait as it was replaced with a more general 'IsIsolatedFromAbove'.
PiperOrigin-RevId: 254709704
2019-06-24 13:44:53 -07:00
River Riddle bcacef1a70 Add a new dialect hook 'materializeConstant' to create a constant operation that materializes an attribute value with the given type. This effectively adds support for dialect specific constant values that have different invariants than std.constant. 'OperationFolder' is updated to use this new hook, or attempt to default to std.constant when legal.
PiperOrigin-RevId: 254570153
2019-06-22 13:05:27 -07:00
River Riddle 48d6cf1ced NFC: Remove the 'context' parameter from OperationState.
Now that Locations are Attributes they contain a direct reference to the MLIRContext, i.e. the context can be directly accessed from the given location instead of being explicitly passed in.

PiperOrigin-RevId: 254568329
2019-06-22 13:05:10 -07:00
River Riddle 704a7fb13e Add support for 1->N type mappings in the dialect conversion infrastructure. To support these mappings, a hook must be overridden on the type converter, 'materializeConversion', to generate a cast operation from the new types to the old type. This operation is automatically erased if all uses are removed; otherwise it remains in the IR for the user to handle.
PiperOrigin-RevId: 254411383
2019-06-22 09:16:06 -07:00
River Riddle 3e99d99553 Add an overload to 'PatternRewriter::inlineRegionBefore' that accepts a parent region for the insertion position. This allows for inlining the given region into the end of another region.
PiperOrigin-RevId: 254367375
2019-06-22 09:15:21 -07:00
Nicolas Vasilache 0804750c9b Uniformize usage of OpBuilder& (NFC)
Historically the pointer-based version of builders was used.
This CL uniformizes to OpBuilder &

PiperOrigin-RevId: 254280885
2019-06-22 09:14:49 -07:00
River Riddle 7202c4e69d Rename ConversionTarget::isLegal to isDynamicallyLegal to better represent what the function is actually checking.
PiperOrigin-RevId: 254141073
2019-06-22 09:13:45 -07:00
River Riddle 9764ae3f24 Refactor the TypeConverter to support more robust type conversions:
* Support for 1->0 type mappings, i.e. when the argument is being removed.
* Reordering types when converting a type signature.
* Adding new inputs when converting a type signature.

This cl also lays down the initial foundation for supporting 1->N type mappings, but full support will come in a followup.

Moving forward, function signature changes will be driven by populating a SignatureConversion instance. This class contains all of the necessary information for adding/removing/remapping function signatures; e.g. addInputs, addResults, remapInputs, etc.

PiperOrigin-RevId: 254064665
2019-06-19 23:08:33 -07:00
River Riddle 30bbd91056 Simplify usages of SplatElementsAttr now that it inherits from DenseElementsAttr.
PiperOrigin-RevId: 253910543
2019-06-19 23:07:34 -07:00
Andy Davis 59b68146ff Factor fusion compute cost calculation out of LoopFusion and into LoopFusionUtils (NFC).
PiperOrigin-RevId: 253797886
2019-06-19 23:06:26 -07:00
Alex Zinenko 4291ae7431 Factor Region::getUsedValuesDefinedAbove into Transforms/RegionUtils
Arguably, this function is only useful for transformations and should not
pollute the main IR.  Also make sure it accepts the resulting container
by reference instead of returning it.

PiperOrigin-RevId: 253622981
2019-06-19 23:03:51 -07:00
Andy Davis 898cf0e968 LoopFusion: adds support for computing forward computation slices, which will enable fusion of consumer loop nests into their producers in subsequent CLs.
PiperOrigin-RevId: 253601994
2019-06-19 23:03:42 -07:00
Alex Zinenko ee6f84aebd Convert a nest of affine loops to a GPU kernel
This converts entire loops into threads/blocks.  No check on the size of the
block or grid, or on the validity of parallelization, is performed; it is the
responsibility of the caller to strip-mine the loops and to perform the
dependence analysis before calling the conversion.

PiperOrigin-RevId: 253189268
2019-06-19 23:02:02 -07:00
River Riddle 6a0555a875 Refactor SplatElementsAttr to inherit from DenseElementsAttr as opposed to being a separate Attribute type. DenseElementsAttr provides a better internal representation for splat values as well as better API for accessing elements.
PiperOrigin-RevId: 253138287
2019-06-19 23:01:52 -07:00
River Riddle 5da741f671 Add basic cost modeling to the dialect conversion infrastructure. This initial cost model favors specific patterns based upon two criteria:
1) Lowest minimum pattern stack depth when legalizing.
  - This leads the system to favor patterns that have lower legalization stacks, i.e. represent a more direct mapping to the target.

2)  Pattern benefit.
  - When considering multiple patterns with the same legalization depth, this favors patterns with a larger specified benefit.
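The two criteria amount to a lexicographic ordering of candidate patterns. The sketch below shows one way to express that ordering in plain C++; `CandidatePattern`, `sortByCost` and `preferredPattern` are hypothetical names for illustration, and the actual bookkeeping in the conversion infrastructure is not described in this message.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of a pattern as seen by the cost model.
struct CandidatePattern {
  std::string name;
  unsigned minLegalizationDepth; // criterion 1: lower is better
  unsigned benefit;              // criterion 2: higher is better
};

// Orders candidates by legalization depth first, then by decreasing benefit.
void sortByCost(std::vector<CandidatePattern> &patterns) {
  std::stable_sort(
      patterns.begin(), patterns.end(),
      [](const CandidatePattern &a, const CandidatePattern &b) {
        if (a.minLegalizationDepth != b.minLegalizationDepth)
          return a.minLegalizationDepth < b.minLegalizationDepth;
        return a.benefit > b.benefit;
      });
}

// Returns the candidate that this ordering would try first.
CandidatePattern preferredPattern(std::vector<CandidatePattern> patterns) {
  assert(!patterns.empty() && "expected at least one candidate");
  sortByCost(patterns);
  return patterns.front();
}
```

Using a stable sort keeps the original registration order among candidates that tie on both criteria, which is a design choice of this sketch rather than something stated in the source.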

PiperOrigin-RevId: 252713470
2019-06-19 22:59:06 -07:00
River Riddle eb28b30940 NFC: Clean up the naming scheme for registering legalization actions to be consistent, and move a few functions to the source file.
PiperOrigin-RevId: 252639629
2019-06-11 10:14:35 -07:00
Alex Zinenko 8ad35b90ec Use DialectConversion to lower the Affine dialect to the Standard dialect
This introduces the support for region-containing operations to the dialect
conversion framework in order to support the conversion of affine control-flow
operations into the standard control flow with branches.  Regions that belong
to an operation are converted before the operation itself.  The
DialectConversionPattern can therefore access the converted regions of the
original operation and process them further if necessary.  In particular, the
conversion is allowed to move the blocks from the original region to other
regions and to split blocks into multiple blocks.  All block manipulations must
be performed through the PatternRewriter to ensure they will be undone if the
conversion fails.

Port the pass converting from the affine dialect (loops and ifs with bodies as
regions) to the standard dialect (branch-based cfg) to use DialectConversion in
order to exercise this new functionality.  The modifications to the lowering
functions are minor and focus on using the PatternRewriter instead of
directly modifying the IR.

PiperOrigin-RevId: 252625169
2019-06-11 10:14:27 -07:00
Andy Davis e33e36f178 Return dependence result enum to distinguish between dependence result and error cases (NFC).
PiperOrigin-RevId: 252437616
2019-06-11 10:12:36 -07:00
River Riddle e7ccfb2ae8 Add support to ConversionTarget for storing legalization actions for entire dialects as opposed to individual operations. This allows for better support of unregistered operations, as well as removing the need to collect all of the operations for a given dialect(which may be very expensive).
PiperOrigin-RevId: 251943590
2019-06-09 16:21:32 -07:00
River Riddle e25796ef6e Add support for matchAndRewrite to the DialectConversion patterns. This also drops the default "always succeed" match override to better align with RewritePattern.
PiperOrigin-RevId: 251941625
2019-06-09 16:21:20 -07:00
River Riddle 0560f153b8 Add utility 'create' methods to OperationFolder that will create an operation with a given OpBuilder and automatically try to fold it, similarly to OpBuilder::createOrFold. The difference here is that these methods enable folding to constants in addition to existing values. This functionality is then used to replace linalg::FunctionConstants.
PiperOrigin-RevId: 251716247
2019-06-09 16:19:51 -07:00
River Riddle 9fc00cf840 Always remap results when replacing an operation. This prevents a crash when lowering identity(passthrough) operations to the same resultant type as the original operation.
PiperOrigin-RevId: 251665492
2019-06-09 16:18:44 -07:00
River Riddle 0d2492eb2e When cleaning up after a failed legalization pattern, make sure to remove any newly created value mappings.
PiperOrigin-RevId: 251658984
2019-06-09 16:18:32 -07:00
River Riddle f1b848e470 NFC: Rename FuncBuilder to OpBuilder and refactor to take a top level region instead of a function.
PiperOrigin-RevId: 251563898
2019-06-09 16:17:59 -07:00
River Riddle 9b4a02c1e9 NFC: Rename FoldHelper to OperationFolder and split a large function in two.
PiperOrigin-RevId: 251485843
2019-06-09 16:17:11 -07:00
Ben Vanik 9fc4193eea Adding additional dialect parsing utilities, conversion wrappers, and traversal helpers.
- added a typed walk to Block (matching the equivalent on Function)
- added token parsers (incl optional variants) for : and (
- added applyConversionPatterns that takes a list of functions to apply patterns to

PiperOrigin-RevId: 251481608
2019-06-09 16:16:59 -07:00
River Riddle 95eaca3e0f Refactor the dialect conversion framework to support multi-level conversions. Multi-level conversions are those that require multiple patterns to be applied before an operation is completely legalized. This essentially means that conversion patterns do not have to directly generate legal operations, and may be chained together to produce legal code.
To accomplish this, moving forward users will need to provide a legalization target that defines what operations are legal for the conversion. A target can mark an operation as legal by providing a specific legalization action. The initial actions are:
* Legal
  - This action signals that every instance of the given operation is legal,
    i.e. any combination of attributes, operands, types, etc. is valid.
* Dynamic
  - This action signals that only some instances of a given operation are legal. This
    allows for defining fine-tuned constraints, e.g. std.add is only legal when
    operating on 32-bit integers.

An example target is shown below:
struct MyTarget : public ConversionTarget {
  MyTarget(MLIRContext &ctx) : ConversionTarget(ctx) {
    // All operations in the LLVM dialect are legal.
    addLegalDialect<LLVMDialect>();

    // std.constant op is always legal on this target.
    addLegalOp<ConstantOp>();

    // std.return op has dynamic legality constraints.
    addDynamicallyLegalOp<ReturnOp>();
  }

  /// Implement the custom legalization handler to handle
  /// std.return.
  bool isLegal(Operation *op) override {
    // Process the dynamic handling for a std.return op (and any others that were
    // marked "dynamic").
    ...
  }
};

PiperOrigin-RevId: 251289374
2019-06-03 19:27:02 -07:00
Amit Sabne 7a43da6060 Loop invariant code motion - remove reliance on getForwardSlice. Add more tests.
--

PiperOrigin-RevId: 250950703
2019-06-01 20:13:30 -07:00
Geoffrey Martin-Noble 60d6249fbd Replace checks against numDynamicDims with hasStaticShape
--

PiperOrigin-RevId: 250782165
2019-06-01 20:11:31 -07:00
Jacques Pienaar 4a697a91de Fix 5 ClangTidy - Readability findings.
* the 'empty' method should be used to check for emptiness instead of 'size'
* using decl 'CapturableHandle' is unused
* redundant get() call on smart pointer
* using decl 'apply' is unused
* using decl 'ScopeGuard' is unused

--

PiperOrigin-RevId: 250623863
2019-06-01 20:10:22 -07:00
River Riddle 9abdbb3189 NFC: Inline toString as operations can be streamed directly into raw_ostream.
--

PiperOrigin-RevId: 250619765
2019-06-01 20:10:12 -07:00
MLIR Team 5a91b9896c Remove "size" property of affine maps.
--

PiperOrigin-RevId: 250572818
2019-06-01 20:09:02 -07:00
Andy Davis 1de0f97fff LoopFusionUtils CL 2/n: Factor out and generalize slice union computation.
*) Factors slice union computation out of LoopFusion into Analysis/Utils (where other iteration slice utilities exist).
*) Generalizes slice union computation to take the union of slices computed on all loads/stores pairs between source and destination loop nests.
*) Fixes a bug in FlatAffineConstraints::addSliceBounds where redundant constraints were added.
*) Takes care of a TODO to expose FlatAffineConstraints::mergeAndAlignIds as a public method.

--

PiperOrigin-RevId: 250561529
2019-06-01 20:08:52 -07:00
Alex Zinenko c2d105811a Do not assume Blocks belong to Functions
Fix Block::splitBlock and Block::eraseFromFunction that erroneously assume
blocks belong to functions.  They now belong to regions.  When splitting, new
blocks should be created in the same region as the existing block.  When
erasing a block, it should be removed from the region rather than from the
function body that transitively contains the region.

Also rename Block::eraseFromFunction to Block::erase for consistency with other
IR containers.

--

PiperOrigin-RevId: 250278272
2019-06-01 20:05:21 -07:00
Alex Zinenko d4c071cc69 Decouple affine->standard lowering from the pass
The lowering from the Affine dialect to the Standard dialect was originally
    implemented as a standalone pass.  However, it may be used by other passes
    willing to lower away some of the affine constructs as a part of their
    operation.  Decouple the transformation functions from the pass infrastructure
    and expose the entry point for the lowering.

    Also update the lowering functions to use `LogicalResult` instead of bool for
    return values.

--

PiperOrigin-RevId: 250229198
2019-06-01 20:05:01 -07:00
River Riddle c2d069323b Rename DialectConversion to TypeConverter and split out pattern construction. This simplifies building the conversion pattern list from multiple sources.
--

PiperOrigin-RevId: 249930583
2019-06-01 20:02:03 -07:00
Lei Zhang ba104f871c Add TestLoopFusion.cpp to CMakeLists.txt
--

PiperOrigin-RevId: 249901490
2019-06-01 20:00:52 -07:00
Andy Davis e53b7d2c02 Add LoopFusionUtils.cpp to CMakeLists.
--

PiperOrigin-RevId: 249887371
2019-06-01 20:00:33 -07:00
Andy Davis a560f2c646 Affine Loop Fusion Utility Module (1/n).
*) Adds LoopFusionUtils which will expose a set of loop fusion utilities (e.g. dependence checks, fusion cost/storage reduction, loop fusion transformation) for use by loop fusion algorithms. Support for checking block-level fusion-preventing dependences is added in this CL (additional loop fusion utilities will be added in subsequent CLs).
    *) Adds TestLoopFusion test pass for testing LoopFusionUtils at a fine granularity.
    *) Adds unit test for testing dependence check for block-level fusion-preventing dependences.

--

PiperOrigin-RevId: 249861071
2019-06-01 20:00:23 -07:00
River Riddle ae1651368f NFC: Rename DialectConversionPattern to ConversionPattern.
--

PiperOrigin-RevId: 249857277
2019-06-01 20:00:13 -07:00
Alex Zinenko fe2716aee3 Detemplatize convertRegion in DialectConversion
Originally, FunctionConverter::convertRegion in the DialectConversion framework
    was implemented as a function template because it was creating a new region in
    the parent object, which could have been an op or a function.  Since
    DialectConversion now operates in place, a new region is no longer created, so
    there is no need for convertRegion to be aware of the parent, only of the error
    reporting location.

--

PiperOrigin-RevId: 249826392
2019-06-01 20:00:04 -07:00
River Riddle 4958ec2414 Apply operation rewrites before updating arguments.
--

PiperOrigin-RevId: 249678839
2019-06-01 19:58:14 -07:00
River Riddle 14d1cfbccb Decouple running a conversion from the DialectConversion class. The DialectConversion class is only necessary for type signature changes (block arguments or function arguments), which are not always desired when performing a dialect conversion. This allows conversions that do not need signature changes to run per function instead of per module.
--

PiperOrigin-RevId: 249657549
2019-06-01 19:58:04 -07:00