Commit Graph

686 Commits

Author SHA1 Message Date
River Riddle d6ee6a0310 Update the builder API to take ValueRange instead of ArrayRef<Value *>
This allows for users to provide operand_range and result_range in builder.create<> calls, instead of requiring an explicit copy into a separate data structure like SmallVector/std::vector.

PiperOrigin-RevId: 284360710
2019-12-07 10:35:41 -08:00
River Riddle 9d1a0c72b4 Add a new ValueRange class.
This class represents a generic abstraction over the different ways to represent a range of Values: ArrayRef<Value *>, operand_range, result_range. This class will allow for removing the many instances of explicit SmallVector<Value *, N> construction. It has the same memory cost as ArrayRef, and only suffers cost from indexing(if+elsing the different underlying representations).

This change only updates a few of the existing usages, with more to be changed in followups; e.g. 'build' API.

PiperOrigin-RevId: 284307996
2019-12-06 20:07:23 -08:00
Alexandre E. Eichenberger 3c69ca1e69 fix examples in comments
Closes tensorflow/mlir#301

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/301 from AlexandreEichenberger:vect-doc-update 7e5418a9101a4bdad2357882fe660b02bba8bd01
PiperOrigin-RevId: 284202462
2019-12-06 09:40:50 -08:00
Kazuaki Ishizaki 84a6182ddd minor spelling tweaks
Closes tensorflow/mlir#290

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/290 from kiszk:spelling_tweaks_201912 9d9afd16a723dd65754a04698b3976f150a6054a
PiperOrigin-RevId: 284169681
2019-12-06 05:59:30 -08:00
River Riddle 33a64540ad Add support for instance specific pass statistics.
Statistics are a way to keep track of what the compiler is doing and how effective various optimizations are. It is useful to see what optimizations are contributing to making a particular program run faster. Pass-instance specific statistics take this even further as you can see the effect of placing a particular pass at specific places within the pass pipeline, e.g. they could help answer questions like "what happens if I run CSE again here".

Statistics can be added to a pass by simply adding members of type 'Pass::Statistics'. This class takes as a constructor arguments: the parent pass pointer, a name, and a description. Statistics can be dumped by the pass manager in a similar manner to how pass timing information is dumped, i.e. via PassManager::enableStatistics programmatically; or -pass-statistics and -pass-statistics-display via the command line pass manager options.

Below is an example:

struct MyPass : public OperationPass<MyPass> {
  Statistic testStat{this, "testStat", "A test statistic"};

  void runOnOperation() {
    ...
    ++testStat;
    ...
  }
};

$ mlir-opt -pass-pipeline='func(my-pass,my-pass)' foo.mlir -pass-statistics

Pipeline Display:
===-------------------------------------------------------------------------===
                         ... Pass statistics report ...
===-------------------------------------------------------------------------===
'func' Pipeline
  MyPass
    (S) 15 testStat - A test statistic
  MyPass
    (S)  6 testStat - A test statistic

List Display:
===-------------------------------------------------------------------------===
                         ... Pass statistics report ...
===-------------------------------------------------------------------------===
MyPass
  (S) 21 testStat - A test statistic

PiperOrigin-RevId: 284022014
2019-12-05 11:53:28 -08:00
River Riddle 6f895bec7d [CSE] NFC: Hash the attribute dictionary pointer instead of the list of attributes.
PiperOrigin-RevId: 283810829
2019-12-04 12:32:08 -08:00
Nicolas Vasilache edfaf925cf Drop MaterializeVectorTransfers in favor of simpler declarative unrolling
Now that we have unrolling as a declarative pattern, we can drop a full pass that has gone stale. In the future we may want to add specific unrolling patterns for VectorTransferReadOp.

PiperOrigin-RevId: 283806880
2019-12-04 12:11:42 -08:00
Alex Zinenko 75175134d4 Loop coalescing: fix pointer chainsing in use-chain traversal
In the replaceAllUsesExcept utility function called from loop coalescing the
iteration over the use-chain is incorrect. The use list nodes (IROperands) have
next/prev links, and bluntly resetting the use would make the loop to continue
on uses of the value that was replaced instead of the original one. As a
result, it could miss the existing uses and update the wrong ones. Make sure we
increment the iterator before updating the use in the loop body.

Reported-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#291.

PiperOrigin-RevId: 283754195
2019-12-04 07:42:29 -08:00
Nicolas Vasilache 5c0c51a997 Refactor dependencies to expose Vector transformations as patterns - NFC
This CL refactors some of the MLIR vector dependencies to allow decoupling VectorOps, vector analysis, vector transformations and vector conversions from each other.
This makes the system more modular and allows extracting VectorToVector into VectorTransforms that do not depend on vector conversions.

This refactoring exhibited a bunch of cyclic library dependencies that have been cleaned up.

PiperOrigin-RevId: 283660308
2019-12-03 17:52:10 -08:00
Diego Caballero 330d1ff00e AffineLoopFusion: Prevent fusion of multi-out-edge producer loops
tensorflow/mlir#162 introduced a bug that
incorrectly allowed fusion of producer loops with multiple outgoing
edges. This commit fixes that problem. It also introduces a new flag to
disable sibling loop fusion so that we can test producer-consumer fusion
in isolation.

Closes tensorflow/mlir#259

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/259 from dcaballe:dcaballe/fix_multi_out_edge_producer_fusion 578d5661705fd5c56c555832d5e0528df88c5282
PiperOrigin-RevId: 283531105
2019-12-03 06:09:50 -08:00
Mahesh Ravishankar bd485afda0 Introduce attributes that specify the final ABI for a spirv::ModuleOp.
To simplify the lowering into SPIR-V, while still respecting the ABI
requirements of SPIR-V/Vulkan, split the process into two
1) While lowering a function to SPIR-V (when the function is an entry
   point function), allow specifying attributes on arguments and
   function itself that describe the ABI of the function.
2) Add a pass that materializes the ABI described in the function.

Two attributes are needed.
1) Attribute on arguments of the entry point function that describe
   the descriptor_set, binding, storage class, etc, of the
   spv.globalVariable this argument will be replaced by
2) Attribute on function that specifies workgroup size, etc. (for now
   only workgroup size).

Add the pass -spirv-lower-abi-attrs to materialize the ABI described
by the attributes.

This change makes the SPIRVBasicTypeConverter class unnecessary and is
removed, further simplifying the SPIR-V lowering path.

PiperOrigin-RevId: 282387587
2019-11-25 11:19:56 -08:00
Jean-Michel Gorius 104777d8e6 Unify vector op names with other dialects.
Change vector op names from VectorFooOp to Vector_FooOp and from
vector::VectorFooOp to vector::FooOp.

Closes tensorflow/mlir#257

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/257 from Kayjukh:master dfc3a0e04114885aaec8740d5951d6984d6e1577
PiperOrigin-RevId: 281967461
2019-11-22 08:24:49 -08:00
River Riddle 4ea92a0586 NFC: Use Region::getBlocks to fix build failure with drop_begin.
PiperOrigin-RevId: 281656603
2019-11-20 19:30:46 -08:00
River Riddle fafb708b9a Merge DCE and unreachable block elimination into a new utility 'simplifyRegions'.
This moves the different canonicalizations of regions into one place and invokes them in the fixed-point iteration of the canonicalizer.

PiperOrigin-RevId: 281617072
2019-11-20 15:53:19 -08:00
Sean Silva e4f83c6c26 Add multi-level DCE pass.
This is a simple multi-level DCE pass that operates pretty generically on
the IR. Its key feature compared to the existing peephole dead op folding
that happens during canonicalization is being able to delete recursively
dead cycles of the use-def graph, including block arguments.

PiperOrigin-RevId: 281568202
2019-11-20 12:55:10 -08:00
Nicolas Vasilache fa14d4f6ab Implement unrolling of vector ops to finer-grained vector ops as a pattern.
This CL uses the pattern rewrite infrastructure to implement a simple VectorOps -> VectorOps legalization strategy to unroll coarse-grained vector operations into finer grained ones.
The transformation is written using local pattern rewrites to allow composition with other rewrites. It proceeds by iteratively introducing fake cast ops and cleaning canonicalizing or lowering them away where appropriate.

This is an example of writing transformations as compositions of local pattern rewrites that should enable us to make them significantly more declarative.

PiperOrigin-RevId: 281555100
2019-11-20 11:49:36 -08:00
Diego Caballero dd5a7cb488 Add getRemappedValue to ConversionPatternRewriter
This method is needed for N->1 conversion patterns to retrieve remapped
Values used in the original N operations.

Closes tensorflow/mlir#237

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/237 from dcaballe:dcaballe/getRemappedValue 1f64fadcf2b203f7b336ff0c5838b116ae3625db
PiperOrigin-RevId: 281321881
2019-11-19 11:09:39 -08:00
Jing Pu 563b5910a8 Also elide large array attribute in OpGraph Dump
PiperOrigin-RevId: 281114034
2019-11-18 11:27:43 -08:00
Andy Davis 68a8da4a93 Fix Affine Loop Fusion test case reported on github.
This CL utilizies the more robust fusion feasibility analysis being built out in LoopFusionUtils, which will eventually be used to replace the current affine loop fusion pass.

PiperOrigin-RevId: 281112340
2019-11-18 11:20:37 -08:00
Lei Zhang a0986bf43d NFC: Convert CmpIPredicate in StandardOps to use EnumAttr
This turns several hand-written functions to auto-generated ones.

PiperOrigin-RevId: 280684326
2019-11-15 10:17:31 -08:00
Nicolas Vasilache 0b271b7dfe Refactor the LowerVectorTransfers pass to use the RewritePattern infra - NFC
This is step 1/n in refactoring infrastructure along the Vector dialect to make it ready for retargetability and composable progressive lowering.

PiperOrigin-RevId: 280529784
2019-11-14 15:40:07 -08:00
Alex Zinenko 971b8dd4d8 Move Affine to Standard conversion to lib/Conversion
This is essentially a dialect conversion and conceptually belongs to
conversions.

PiperOrigin-RevId: 280460034
2019-11-14 10:35:21 -08:00
Nicolas Vasilache f2b6ae9991 Move VectorOps to Tablegen - (almost) NFC
This CL moves VectorOps to Tablegen and cleans up the implementation.

This is almost NFC but 2 changes occur:
  1. an interface change occurs in the padding value specification in vector_transfer_read:
     the value becomes non-optional. As a shortcut we currently use %f0 for all paddings.
     This should become an OpInterface for vectorization in the future.
  2. the return type of vector.type_cast is trivial and simplified to `memref<vector<...>>`

Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect.

The op documentation is moved to the .td file.

PiperOrigin-RevId: 280430869
2019-11-14 08:15:23 -08:00
River Riddle d985c74883 NFC: Refactor block signature conversion to not erase the original arguments.
This refactors the implementation of block signature(type) conversion to not insert fake cast operations to perform the type conversion, but to instead create a new block containing the proper signature. This has the benefit of enabling the use of pre-computed analyses that rely on mapping values. It also leads to a much cleaner implementation overall. The major user facing change is that applySignatureConversion will now replace the entry block of the region, meaning that blocks generally shouldn't be cached over calls to applySignatureConversion.

PiperOrigin-RevId: 280226936
2019-11-13 10:27:53 -08:00
Jacques Pienaar bcfb3d4cd6 Explicitly initialize isRecursivelyLegal
This also previously triggered the warning:

warning: missing field 'isRecursivelyLegal' initializer [-Wmissing-field-initializers]
  legalOperations[op] = {action};
                               ^
PiperOrigin-RevId: 279399175
2019-11-08 15:06:34 -08:00
Sean Silva f6188b5b07 Replace some remnant uses of "inst" with "op".
PiperOrigin-RevId: 278961676
2019-11-06 16:09:23 -08:00
River Riddle 2366561a39 Add a PatternRewriter hook to merge blocks, and use it to support for folding branches.
A pattern rewriter hook, mergeBlock, is added that allows for merging the operations of one block into the end of another. This is used to support a canonicalization pattern for branch operations that folds the branch when the successor has a single predecessor(the branch block).

Example:
  ^bb0:
    %c0_i32 = constant 0 : i32
    br ^bb1(%c0_i32 : i32)
  ^bb1(%x : i32):
    return %x : i32

becomes:
  ^bb0:
    %c0_i32 = constant 0 : i32
    return %c0_i32 : i32
PiperOrigin-RevId: 278677825
2019-11-05 11:57:38 -08:00
Mahesh Ravishankar 9cbbd8f4df Support lowering of imperfectly nested loops into GPU dialect.
The current lowering of loops to GPU only supports lowering of loop
nests where the loops mapped to workgroups and workitems are perfectly
nested. Here a new lowering is added to handle lowering of imperfectly
nested loop body with the following properties
1) The loops partitioned to workgroups are perfectly nested.
2) The loop body of the inner most loop partitioned to workgroups can
contain one or more loop nests that are to be partitioned across
workitems. Each individual loops nests partitioned to workitems should
also be perfectly nested.
3) The number of workgroups and workitems are not deduced from the
loop bounds but are passed in by the caller of the lowering as values.
4) For statements within the perfectly nested loop nest partitioned
across workgroups that are not loops, it is valid to have all threads
execute that statement. This is NOT verified.

PiperOrigin-RevId: 277958868
2019-11-01 10:52:06 -07:00
Jing Pu 736ad2061c Dump op location in createPrintOpGraphPass for easier debugging.
PiperOrigin-RevId: 277546527
2019-10-30 11:22:22 -07:00
River Riddle a32f0dcb5d Add support to GreedyPatternRewriter for erasing unreachable blocks.
Rewrite patterns may make modifications to the CFG, including dropping edges between blocks. This change adds a simple unreachable block elimination run at the end of each iteration to ensure that the CFG remains valid.

PiperOrigin-RevId: 277545805
2019-10-30 11:19:24 -07:00
Diego Caballero c87c7f5732 Bugfix: Keep worklistMap in sync with worklist in GreedyPatternRewriter
When we removed a pattern, we removed it from worklist but not from
worklistMap. Then, when we tried to add a new pattern on the same Operation
again, the pattern wasn't added since it already existed in the
worklistMap (but not in the worklist).

Closes tensorflow/mlir#211

PiperOrigin-RevId: 277319669
2019-10-29 10:58:31 -07:00
River Riddle 2f4d0c085a Add support for marking an operation as recursively legal.
In some cases, it may be desirable to mark entire regions of operations as legal. This provides an additional granularity of context to the concept of "legal". The `ConversionTarget` supports marking operations, that were previously added as `Legal` or `Dynamic`, as `recursively` legal. Recursive legality means that if an operation instance is legal, either statically or dynamically, all of the operations nested within are also considered legal. An operation can be marked via `markOpRecursivelyLegal<>`:

```c++
ConversionTarget &target = ...;

/// The operation must first be marked as `Legal` or `Dynamic`.
target.addLegalOp<MyOp>(...);
target.addDynamicallyLegalOp<MySecondOp>(...);

/// Mark the operation as always recursively legal.
target.markOpRecursivelyLegal<MyOp>();
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp, MySecondOp>([](Operation *op) { ... });
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp>([](MyOp op) { ... });
```

PiperOrigin-RevId: 277086382
2019-10-28 10:04:34 -07:00
River Riddle 2b61b7979e Convert the Canonicalize and CSE passes to generic Operation Passes.
This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore.

PiperOrigin-RevId: 276573038
2019-10-24 15:01:09 -07:00
Alex Zinenko edffbbcdae Fix "set-but-unused" warning in DialectConversion
The variable in question is only used in an assertion,
leading to a warning in opt builds.

PiperOrigin-RevId: 276352259
2019-10-23 14:32:13 -07:00
Kazuaki Ishizaki 8bfedb3ca5 Fix minor spelling tweaks (NFC)
Closes tensorflow/mlir#177

PiperOrigin-RevId: 275692653
2019-10-20 00:11:34 -07:00
Nicolas Vasilache 9e7e297da3 Lower vector transfer ops to loop.for operations.
This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189.

PiperOrigin-RevId: 275543361
2019-10-18 14:10:10 -07:00
River Riddle 2acc220f17 NFC: Remove trivial builder get methods.
These don't add any value, and some are even more restrictive than the respective static 'get' method.

PiperOrigin-RevId: 275391240
2019-10-17 20:08:34 -07:00
Geoffrey Martin-Noble 6090643877 Introduce a wrapper around ConversionPattern that operates on the derived class
Analogous to OpRewritePattern, this makes writing conversion patterns more convenient.

PiperOrigin-RevId: 275349854
2019-10-17 15:30:38 -07:00
Nicolas Vasilache 10039d04e2 Rename LoopNestBuilder to AffineLoopNestBuilder - NFC
PiperOrigin-RevId: 275310747
2019-10-17 12:13:59 -07:00
Sana Damani 3940b90d84 Update Chapter 4 of the Toy tutorial
This Chapter now introduces and makes use of the Interface concept
in MLIR to demonstrate ShapeInference.
END_PUBLIC

Closes tensorflow/mlir#191

PiperOrigin-RevId: 275085151
2019-10-16 12:19:39 -07:00
Mahesh Ravishankar e7b49eef1d Allow for remapping argument to a Value in SignatureConversion.
The current SignatureConversion framework (part of DialectConversion)
allows remapping input arguments to a function from 1->0, 1->1 or
1->many arguments during conversion. Another case is where the
argument itself is dropped, but it's use are remapped to another
Value*.

An example of this is: The Vulkan/SPIR-V spec requires entry functions
to be of type void(void). The GPU -> SPIR-V conversion implemented
this without having the DialectConversion framework track the
remapping that lead to some undefined behavior. The changes here
addresses that.

PiperOrigin-RevId: 275059656
2019-10-16 10:21:03 -07:00
River Riddle dfe09cc621 Add support for PatternRewriter::eraseOp.
This hook is useful when an operation is known to be dead, and no replacement values make sense.

PiperOrigin-RevId: 275052756
2019-10-16 09:50:57 -07:00
Mehdi Amini f1f9e3b8d1 Fix CMake configuration after introduction of LICM and LoopLikeInterface
b843cc5d5a introduced a new op LICM transformation and a LoopLike interface,
but missed the CMake aspects of it. This should fix the build.

PiperOrigin-RevId: 275038533
2019-10-16 08:37:39 -07:00
Stephan Herhut b843cc5d5a Implement simple loop-invariant-code-motion based on dialect interfaces.
PiperOrigin-RevId: 275004258
2019-10-16 04:28:38 -07:00
River Riddle 96de7091bc Allowing replacing non-root operations in DialectConversion.
When dealing with regions, or other patterns that need to generate temporary operations, it is useful to be able to replace other operations than the root op being matched. Before this PR, these operations would still be considered for legalization meaning that the conversion would either fail, erroneously need to mark these ops as legal, or add unnecessary patterns.

PiperOrigin-RevId: 274598513
2019-10-14 10:01:59 -07:00
River Riddle 6b1cc3c6ea Add support for canonicalizing callable regions during inlining.
This will allow for inlining newly devirtualized calls, as well as give a more accurate cost model(when we have one). Currently canonicalization will only run for nodes that have no child edges, as the child nodes may be erased during canonicalization. We can support this in the future, but it requires more intricate deletion tracking.

PiperOrigin-RevId: 274011386
2019-10-10 17:06:33 -07:00
River Riddle 438dc176b1 Remove the need to convert operations in regions of operations that have been replaced.
When an operation with regions gets replaced, we currently require that all of the remaining nested operations are still converted even though they are going to be replaced when the rewrite is finished. This cl adds a tracking for a minimal set of operations that are known to be "dead". This allows for ignoring the legalization of operations that are won't survive after conversion.

PiperOrigin-RevId: 274009003
2019-10-10 17:06:25 -07:00
Christian Sigg 35bb732032 Guard rewriter insertion point during signature conversion.
Avoid unexpected side effect in rewriter insertion point.

PiperOrigin-RevId: 273785794
2019-10-09 11:33:28 -07:00
Diego Caballero 3451055614 Add support for some multi-store cases in affine fusion
This PR is a stepping stone towards supporting generic multi-store
source loop nests in affine loop fusion. It extends the algorithm to
support fusion of multi-store loop nests that:
 1. have only one store that writes to a function-local live out, and
 2. the remaining stores are involved in loop nest self dependences
    or no dependences within the function.

Closes tensorflow/mlir#162

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45
PiperOrigin-RevId: 273773907
2019-10-09 10:37:30 -07:00
River Riddle 49b29dd186 Add a PatternRewriter hook for cloning a region into another.
This is similar to the `inlineRegionBefore` hook, except the original blocks are unchanged. The region to be cloned *must* not have been modified during the conversion process at the point of cloning, i.e. it must belong an operation that has yet to be converted, or the operation that is currently being converted.

PiperOrigin-RevId: 273622533
2019-10-08 15:45:08 -07:00
Uday Bondhugula 6136f33d59 unroll and jam: fix order of jammed bodies
- bodies would earlier appear in the order (i, i+3, i+2, i+1) instead of
  (i, i+1, i+2, i+3) for example for factor 4.

- clean up hardcoded test cases

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#170

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/170 from bondhugula:ujam b66b405b2b1894a03b376952e32a9d0292042665
PiperOrigin-RevId: 273613131
2019-10-08 15:13:11 -07:00
Jing Pu 17606a108b Print result types when dumping graphviz.
PiperOrigin-RevId: 273406833
2019-10-07 16:45:53 -07:00
Uday Bondhugula 89e7a76a1c fix simplify-affine-structures bug
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#157

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/157 from bondhugula:quickfix bd1fcd79825fc0bd5b4a3e688153fa0993ab703d
PiperOrigin-RevId: 273316498
2019-10-07 10:04:50 -07:00
Christian Sigg 85dcaf19c7 Fix typos, NFC.
PiperOrigin-RevId: 272851237
2019-10-04 04:37:53 -07:00
River Riddle 5830f71a45 Add support for inlining calls with different arg/result types from the callable.
Some dialects have implicit conversions inherent in their modeling, meaning that a call may have a different type that the type that the callable expects. To support this, a hook is added to the dialect interface that allows for materializing conversion operations during inlining when there is a mismatch. A hook is also added to the callable interface to allow for introspecting the expected result types.

PiperOrigin-RevId: 272814379
2019-10-03 23:10:51 -07:00
River Riddle a20d96e436 Update the Inliner pass to work on SCCs of the CallGraph.
This allows for the inliner to work on arbitrary call operations. The updated inliner will also work bottom-up through the callgraph enabling support for multiple levels of inlining.

PiperOrigin-RevId: 272813876
2019-10-03 23:05:21 -07:00
Jacques Pienaar 2b86e27dbd Show type even if elementsattr is elided in graph
The type is quite useful for debugging and shouldn't be too large.

PiperOrigin-RevId: 272390311
2019-10-02 01:46:12 -07:00
Jacques Pienaar c57f202c8c Switch explicit create methods to match generated build's order
The generated build methods have result type before the arguments (operands and attributes, which are also now adjacent in the explicit create method). This also results in changing the create method's ordering to match most build method's ordering.

PiperOrigin-RevId: 271755054
2019-09-28 09:35:58 -07:00
Uday Bondhugula 74eabdd14e NFC - clean up op accessor usage, std.load/store op verify, other stale info
- also remove stale terminology/references in docs

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#148

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/148 from bondhugula:cleanup e846b641a3c2936e874138aff480a23cdbf66591
PiperOrigin-RevId: 271618279
2019-09-27 11:58:24 -07:00
Nicolas Vasilache ddf737c5da Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering.
The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues.
Solving the ABI problem generally is a very complex problem and most likely involves taking
a dependence on clang that we do not want atm.

A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implement.

PiperOrigin-RevId: 271591708
2019-09-27 09:57:36 -07:00
Jing Pu 47a7021cc3 Change the return type of createPrintCFGGraphPass to match other passes.
PiperOrigin-RevId: 271252404
2019-09-25 18:33:47 -07:00
Mehdi Amini 5583252173 Add convenience methods to set an OpBuilder insertion point after an Operation (NFC)
PiperOrigin-RevId: 270727180
2019-09-23 11:54:55 -07:00
Christian Sigg c900d4994e Fix a number of Clang-Tidy warnings.
PiperOrigin-RevId: 270632324
2019-09-23 02:34:27 -07:00
Uday Bondhugula f559c38c28 Upgrade/fix/simplify store to load forwarding
- fix store to load forwarding for a certain set of cases (where
  forwarding shouldn't have happened); use AffineValueMap difference
  based MemRefAccess equality checking; utility logic is also greatly
  simplified

- add missing equality/inequality operators for AffineExpr ==/!= ints

- add == != operators on MemRefAccess

Closes tensorflow/mlir#136

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/136 from bondhugula:store-load-forwarding d79fd1add8bcfbd9fa71d841a6a9905340dcd792
PiperOrigin-RevId: 270457011
2019-09-21 10:08:56 -07:00
River Riddle 91125d33ed Avoid iterator invalidation when recursively computing pattern depth.
computeDepth calls itself recursively, which may insert into minPatternDepth. minPatternDepth is a DenseMap, which invalidates iterators on insertion, so this may lead to asan failures.

PiperOrigin-RevId: 270374203
2019-09-20 16:30:29 -07:00
Uday Bondhugula 727a50ae2d Support symbolic operands for memref replacement; fix memrefNormalize
- allow symbols in index remapping provided for memref replacement
- fix memref normalize crash on cases with layout maps with symbols

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Reported by: Alex Zinenko

Closes tensorflow/mlir#139

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/139 from bondhugula:memref-rep-symbols 2f48c1fdb5d4c58915bbddbd9f07b18541819233
PiperOrigin-RevId: 269851182
2019-09-18 11:26:11 -07:00
MLIR Team 1c73be76d8 Unify error messages to start with lower-case.
PiperOrigin-RevId: 269803466
2019-09-18 07:45:17 -07:00
Uday Bondhugula bd7de6d4df Add rewrite pattern to compose maps into affine load/stores
- add canonicalization pattern to compose maps into affine loads/stores;
  templatize the pattern and reuse it for affine.apply as well

- rename getIndices -> getMapOperands() (getIndices is confusing since
  these are no longer the indices themselves but operands to the map
  whose results are the indices). This also makes the accessor uniform
  across affine.apply/load/store. Change arg names on the affine
  load/store builder to avoid confusion. Drop an unused confusing build
  method on AffineStoreOp.

- update incomplete doc comment for canonicalizeMapAndOperands (this was
  missed from a previous update).

Addresses issue tensorflow/mlir#121

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#122

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/122 from bondhugula:compose-load-store e71de1771e56a85c4282c10cb43f30cef0701c4f
PiperOrigin-RevId: 269619540
2019-09-17 11:49:45 -07:00
River Riddle 9619ba10d4 Add support for multi-level value mapping to DialectConversion.
When performing A->B->C conversion, an operation may still refer to an operand of A. This makes it necessary to unmap through multiple levels of replacement for a specific value.

PiperOrigin-RevId: 269367859
2019-09-16 10:38:19 -07:00
Uday Bondhugula 4f32ae61b4 NFC - Move explicit copy/dma generation utility out of pass and into LoopUtils
- turn copy/dma generation method into a utility in LoopUtils, allowing
  it to be reused elsewhere.

- no functional/logic change to the pass/utility

- trim down header includes in files affected

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#124

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/124 from bondhugula:datacopy 9f346e62e5bd9dd1986720a30a35f302eb4d3252
PiperOrigin-RevId: 269106088
2019-09-14 13:23:48 -07:00
Uday Bondhugula 1366467a3b update normalizeMemRef utility; handle missing failure check + add more tests
- take care of symbolic operands with alloc
- add missing check for compose map failure and a test case
- add test cases on strides
- drop incorrect check for one-to-one'ness

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#132

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/132 from bondhugula:normalize-memrefs 8aebf285fb0d7c19269d85255aed644657e327b7
PiperOrigin-RevId: 269105947
2019-09-14 13:21:35 -07:00
River Riddle f1b100c77b NFC: Finish replacing FunctionPassBase/ModulePassBase with OpPassBase.
These directives were temporary during the generalization of FunctionPass/ModulePass to OpPass.

PiperOrigin-RevId: 268970259
2019-09-13 13:34:27 -07:00
Smit Hinsu 1854c64c7c Log name of the generated illegal operation name in DialectConversion debug mode
PiperOrigin-RevId: 268859399
2019-09-13 01:37:38 -07:00
Jacques Pienaar a23f69a37b Remove redundant qualification
Address GCC error: extra qualification not allowed [-fpermissive]

PiperOrigin-RevId: 268133737
2019-09-09 19:50:53 -07:00
Jacques Pienaar 2660623a88 Add pass generate per block in a function a GraphViz Dot graph with ops as nodes
* Add GraphTraits that treat a block as a graph, Operation* as node and use-relationship for edges;
  - Just basic graph output;
* Add use iterator to iterate over all uses of an Operation;
* Add testing pass to generate op graph;

This does not support arbitrary operations other than function nor nested regions yet.

PiperOrigin-RevId: 268121782
2019-09-09 18:12:41 -07:00
Mehdi Amini 6443583bfd Refactor getUsedValuesDefinedAbove to expose a variant taking a callback (NFC)
This will allow clients to implement a different collection strategy on these
values, including collecting each uses within the region for example.

PiperOrigin-RevId: 267803978
2019-09-07 17:03:01 -07:00
River Riddle 0ba0087887 Add the initial inlining infrastructure.
This defines a set of initial utilities for inlining a region(or a FuncOp), and defines a simple inliner pass for testing purposes.
A new dialect interface is defined, DialectInlinerInterface, that allows for dialects to override hooks controlling inlining legality. The interface currently provides the following hooks, but these are just premilinary and should be changed/added to/modified as necessary:

* isLegalToInline
  - Determine if a region can be inlined into one of this dialect, *or* if an operation of this dialect can be inlined into a given region.

* shouldAnalyzeRecursively
  - Determine if an operation with regions should be analyzed recursively for legality. This allows for child operations to be closed off from the legality checks for operations like lambdas.

* handleTerminator
  - Process a terminator that has been inlined.

This cl adds support for inlining StandardOps, but other dialects will be added in followups as necessary.

PiperOrigin-RevId: 267426759
2019-09-05 12:24:13 -07:00
Uday Bondhugula 8c9dc690eb pipeline-data-transfer: remove dead tag alloc's and improve test coverage for replaceMemRefUsesWith / pipeline-data-transfer
- address remaining comments from PR tensorflow/mlir#87 for better test coverage for
  pipeline-data-transfer/replaceAllMemRefUsesWith
- remove dead tag allocs the same way they are removed for the replaced buffers

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#106

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/106 from bondhugula:followup 9e868666d047e8d43e5f82f43e4093b838c710fa
PiperOrigin-RevId: 267144774
2019-09-04 06:59:09 -07:00
Uday Bondhugula 54d674f51e Utility to normalize memrefs with non-identity layout maps
- introduce utility to convert memrefs with non-identity layout maps to
  ones with identity layout maps: convert the type and rewrite/remap all
  its uses

- add this utility to -simplify-affine-structures pass for testing
  purposes

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#104

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/104 from bondhugula:memref-normalize f2c914aa1890e8860326c9e33f9aa160b3d65e6d
PiperOrigin-RevId: 266985317
2019-09-03 12:14:28 -07:00
Uday Bondhugula b1ef9dc22c Fix affine data copy generation corner cases/bugs
- the [begin, end) range identified for copying could end in between the
  block, which makes hoisting invalid in some cases. Change the range
  identification to always end with end of block.

- add test case to exercise these (with fast mem capacity set to minimal so
  that single element memref buffers are generated at the innermost loop)

- the location of begin/end of the block range for data copying was
  being confused with the insert points for copy in and copy out code.
  In cases, where we choose to hoist transfers, these are separate.

- when copy loops are single iteration ones, promote their bodies at
  the end of the pass.

- change default fast mem space to 1 (setting it to zero made it
  generate DMA op's that won't verify in the default case - since the
  DMA ops have a check for src/dest memref spaces being different).

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Co-Authored-By: Mehdi Amini <joker.eph@gmail.com>

Closes tensorflow/mlir#88

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/88 from bondhugula:datacopy 88697267c45e850c3ced87671e16e4a930c02a42
PiperOrigin-RevId: 266980911
2019-09-03 11:53:16 -07:00
River Riddle 6563b1c446 Add a new dialect interface for the OperationFolder `OpFolderDialectInterface`.
This interface will allow for providing hooks to interrop with operation folding. The first hook, 'shouldMaterializeInto', will allow for controlling which region to insert materialized constants into. The folder will generally materialize constants into the top-level isolated region, this allows for materializing into a lower level ancestor region if it is more profitable/correct.

PiperOrigin-RevId: 266702972
2019-09-01 20:07:08 -07:00
Mehdi Amini ce702fc8da Add a `getUsedValuesDefinedAbove()` overload that takes an `Operation` pointer (NFC)
This is a convenient utility around the existing `getUsedValuesDefinedAbove()`
that take two regions.

PiperOrigin-RevId: 266686854
2019-09-01 16:32:10 -07:00
River Riddle 9c8a8a7d0d Add a canonicalization to erase empty AffineForOps.
AffineForOp themselves are pure and can be removed if there are no internal operations.

PiperOrigin-RevId: 266481293
2019-08-30 16:49:32 -07:00
River Riddle 037742cdf2 Add support for early exit walk methods.
This is done by providing a walk callback that returns a WalkResult. This result is either `advance` or `interrupt`. `advance` means that the walk should continue, whereas `interrupt` signals that the walk should stop immediately. An example is shown below:

auto result = op->walk([](Operation *op) {
  if (some_invariant)
    return WalkResult::interrupt();
  return WalkResult::advance();
});

if (result.wasInterrupted())
  ...;

PiperOrigin-RevId: 266436700
2019-08-30 12:47:53 -07:00
River Riddle 4bfae66d70 Refactor the 'walk' methods for operations.
This change refactors and cleans up the implementation of the operation walk methods. After this refactoring is that the explicit template parameter for the operation type is no longer needed for the explicit op walks. For example:

    op->walk<AffineForOp>([](AffineForOp op) { ... });

is now accomplished via:

    op->walk([](AffineForOp op) { ... });

PiperOrigin-RevId: 266209552
2019-08-29 13:04:50 -07:00
Uday Bondhugula bc2a543225 fix loop unroll and jam - operand mapping - imperfect nest case
- fix operand mapping while cloning sub-blocks to jam - was incorrect
  for imperfect nests where def/use was across sub-blocks
- strengthen/generalize the first test case to cover the previously
  missed scenario
- clean up the other cases while on this.

Previously, unroll-jamming the following nest
```
    affine.for %arg0 = 0 to 2048 {
      %0 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %1 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
    }
```

would yield

```
      %0 = alloc() : memref<512x10xf32>
      %1 = affine.apply #map0(%arg0)
      %2 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
        %5 = affine.apply #map0(%arg0)
        %6 = affine.load %0[%5, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
      %3 = affine.apply #map0(%arg0)
      dealloc %0 : memref<512x10xf32>

```

instead of

```

module {
    affine.for %arg0 = 0 to 2048 step 2 {
      %0 = alloc() : memref<512x10xf32>
      %1 = affine.apply #map0(%arg0)
      %2 = alloc() : memref<512x10xf32>
      affine.for %arg1 = 0 to 10 {
        %4 = affine.load %0[%arg0, %arg1] : memref<512x10xf32>
        %5 = affine.apply #map0(%arg0)
        %6 = affine.load %2[%5, %arg1] : memref<512x10xf32>
      }
      dealloc %0 : memref<512x10xf32>
      %3 = affine.apply #map0(%arg0)
      dealloc %2 : memref<512x10xf32>
    }
```

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#98

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/98 from bondhugula:ujam ddbc853f69b5608b3e8ff9b5ac1f6a5a0bb315a4
PiperOrigin-RevId: 266073460
2019-08-28 23:42:50 -07:00
Uday Bondhugula aa2cee9cf5 Refactor / improve replaceAllMemRefUsesWith
Refactor replaceAllMemRefUsesWith to split it into two methods: the new
method does the replacement on a single op, and is used by the existing
one.

- make the methods return LogicalResult instead of bool

- Earlier, when replacement failed (due to non-deferencing uses of the
  memref), the set of ops that had already been processed would have
  been replaced leaving the IR in an inconsistent state. Now, a
  pass is made over all ops to first check for non-deferencing
  uses, and then replacement is performed. No test cases were affected
  because all clients of this method were first checking for
  non-deferencing uses before calling this method (for other reasons).
  This isn't true for a use case in another upcoming PR (scalar
  replacement); clients can now bail out with consistent IR on failure
  of replaceAllMemRefUsesWith. Add test case.

- multiple deferencing uses of the same memref in a single op is
  possible (we have no such use cases/scenarios), and this has always
  remained unsupported. Add an assertion for this.

- minor fix to another test pipeline-data-transfer case.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#87

PiperOrigin-RevId: 265808183
2019-08-27 17:56:56 -07:00
River Riddle 2f59f76876 NFC: Remove the explicit context from Operation::create and OperationState.
The context can easily be recovered from the Location in these situations.

PiperOrigin-RevId: 265578574
2019-08-26 17:34:48 -07:00
Andy Ly 6a501e3d1b Support folding of ops with inner ops in GreedyPatternRewriteDriver.
This fixes a bug when folding ops with inner ops and inner ops are still being visited.

PiperOrigin-RevId: 265475780
2019-08-26 09:44:39 -07:00
River Riddle 32052c8417 NFC: Add a note to 'applyPatternsGreedily' that it also performs folding/dce.
Fixes tensorflow/mlir#72

PiperOrigin-RevId: 265097597
2019-08-23 11:28:45 -07:00
River Riddle ffde975e21 NFC: Move AffineOps dialect to the Dialect sub-directory.
PiperOrigin-RevId: 264482571
2019-08-20 15:36:39 -07:00
Nicolas Vasilache b628194013 Move Linalg and VectorOps dialects to the Dialect subdir - NFC
PiperOrigin-RevId: 264277760
2019-08-19 17:11:38 -07:00
River Riddle ba0fa92524 NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory.
PiperOrigin-RevId: 264193915
2019-08-19 11:01:25 -07:00
Jacques Pienaar 79f53b0cf1 Change from llvm::make_unique to std::make_unique
Switch to C++14 standard method as llvm::make_unique has been removed (
https://reviews.llvm.org/D66259). Also mark some targets as c++14 to ease next
integrates.

PiperOrigin-RevId: 263953918
2019-08-17 11:06:03 -07:00
River Riddle 9c29273ddc Refactor DialectConversion to convert the signatures of blocks when they are moved.
Often we want to ensure that block arguments are converted before operations that use them. This refactors the current implementation to be cleaner/less frequent by triggering conversion when a set of blocks are moved/inlined; or when legalization is successful.

PiperOrigin-RevId: 263795005
2019-08-16 10:16:38 -07:00
Mehdi Amini 926fb685de Express ownership transfer in PassManager API through std::unique_ptr (NFC)
Since raw pointers are always passed around for IR construct without
implying any ownership transfer, it can be error prone to have implicit
ownership transferred the same way.
For example this code can seem harmless:

  Pass *pass = ....
  pm.addPass(pass);
  pm.addPass(pass);
  pm.run(module);

PiperOrigin-RevId: 263053082
2019-08-12 19:13:12 -07:00
River Riddle 5290e8c36d NFC: Update pattern rewrite API to pass OwningRewritePatternList by const reference.
The pattern list is not modified by any of these APIs and should thus be passed with const.

PiperOrigin-RevId: 262844002
2019-08-11 18:34:14 -07:00
River Riddle 1e42954032 NFC: Standardize the terminology used for parent ops/regions/etc.
There are currently several different terms used to refer to a parent IR unit in 'get' methods: getParent/getEnclosing/getContaining. This cl standardizes all of these methods to use 'getParent*'.

PiperOrigin-RevId: 262680287
2019-08-09 20:07:52 -07:00
River Riddle 41968fb475 NFC: Update usages of OwningRewritePatternList to pass by & instead of &&.
This will allow for reusing the same pattern list, which may be costly to continually reconstruct, on multiple invocations.

PiperOrigin-RevId: 262664599
2019-08-09 17:20:29 -07:00
Nicolas Vasilache 39f1b9a053 Add a higher-order vector.extractelement operation in MLIR
This CL is step 2/n towards building a simple, programmable and portable vector abstraction in MLIR that can go all the way down to generating assembly vector code via LLVM's opt and llc tools.

This CL adds the vector.extractelement operation to the MLIR vector dialect as well as the appropriate roundtrip test. Lowering to LLVM will occur in the following CL.

PiperOrigin-RevId: 262545089
2019-08-09 05:58:47 -07:00