479 Commits

Author SHA1 Message Date
Nicolas Vasilache 6ed61a26c2 [mlir] Simplify and better document std.view semantics
This [discussion](https://llvm.discourse.group/t/viewop-isnt-expressive-enough/991/2) raised some concerns with ViewOp.

In particular, the handling of offsets is incorrect and does not match the op description.
Note that with an elemental type change, offsets cannot be part of the type in general because sizeof(srcType) != sizeof(dstType).

However, offset is a poorly chosen term for this purpose and is renamed to byte_shift.

Additionally, for all intended purposes, trying to support non-identity layouts for this op does not bring expressive power but rather increases code complexity.

This revision simplifies the existing semantics and implementation.
This simplification effort is voluntarily restrictive and acts as a stepping stone towards supporting richer semantics: treat the non-common cases as YAGNI for now and reevaluate based on concrete use cases once a round of simplification has occurred.
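
To illustrate why a shift expressed in bytes stays unambiguous under an element type change, here is a minimal Python sketch that uses plain `memoryview` objects rather than MLIR. The buffer size and the helper name `view_as` are assumptions made for illustration only; this is not the std.view implementation.

```python
import struct

def view_as(buffer: bytearray, byte_shift: int, fmt: str, count: int) -> memoryview:
    """Reinterpret `count` elements of format `fmt`, starting `byte_shift` bytes into `buffer`."""
    itemsize = struct.calcsize(fmt)
    window = memoryview(buffer)[byte_shift:byte_shift + count * itemsize]
    return window.cast(fmt)

raw = bytearray(16)                    # source buffer of 16 one-byte elements
struct.pack_into("f", raw, 8, 1.5)     # hypothetical payload: a float stored at byte 8
floats = view_as(raw, byte_shift=8, fmt="f", count=2)

# Byte 8 is element 8 of the byte buffer but element 2 of a float view,
# so an element offset would depend on which element type is meant.
element_index_as_float = 8 // struct.calcsize("f")
```

The same position is element 8 in the source type and element 2 in the destination type, which is why a single element offset cannot describe both sides.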

Differential revision: https://reviews.llvm.org/D79541
2020-05-11 12:29:23 -04:00
Alex Zinenko 4809580463 [mlir] Add a test for OperationFolder
Adds a test exercising the rewriting pattern in the test dialect that calls
OperationFolder.create.
2020-05-07 12:39:24 +02:00
Uday Bondhugula 2affcd664e [MLIR] Fix affine fusion bug/efficiency issue / enable more fusion
The list of destination load ops while evaluating producer-consumer
fusion wasn't being maintained as a set, and as such, duplicate load ops
were being added to it. Although this is harmless correctness-wise, it's
a killer efficiency-wise and it prevents interesting/useful fusions
(including, e.g., reshapes into a matmul). The reason the latter
fusions would be missed is that a slice union would be unnecessarily
needed due to the duplicate load ops on a memref added to the 'dst
loads' list. Since slice union is unimplemented for the local var case,
a single destination load op that leads to local vars (like a floordiv /
mod producing fusion), a common case, would not get fused due to an
unnecessary union being tried with itself.  (The union would actually be
the same thing but we would bail out.)

Besides the above, this would also significantly speed up fusion as all
the unnecessary slice computations / unions, checks, etc. due to the
duplicates go away.
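
The idea of keeping the destination loads as a set can be sketched as follows. This is a hedged Python illustration, not the C++ code of the fusion pass; the function name `collect_dst_loads` and the string stand-ins for load ops are hypothetical.

```python
def collect_dst_loads(load_ops):
    """Return load ops in first-seen order, dropping duplicates."""
    seen = set()
    unique = []
    for op in load_ops:
        if op not in seen:
            seen.add(op)
            unique.append(op)
    return unique

# Hypothetical stand-ins for load ops on memrefs %A and %B.
loads = ["load %A", "load %B", "load %A", "load %A"]
unique_loads = collect_dst_loads(loads)
```

With a single entry per load op, no slice union of an op with itself is ever attempted.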

Differential Revision: https://reviews.llvm.org/D79547
2020-05-07 10:51:34 +05:30
Nicolas Vasilache 94438c86ad [mlir] Add a MemRefCastOp canonicalization pattern.
Summary:
This revision adds a conservative canonicalization pattern for MemRefCastOp that are typically inserted during ViewOp and SubViewOp canonicalization.
Ideally such canonicalizations would propagate the type to consumers but this is not a local behavior. As a consequence MemRefCastOp are introduced to keep type compatibility but need to be cleaned up later, in the case where more dynamic behavior than necessary is introduced.

Differential Revision: https://reviews.llvm.org/D79438
2020-05-06 09:10:05 -04:00
River Riddle 469c02d058 [mlir] Add support for merging identical blocks during canonicalization
This revision adds support for merging identical blocks, or those with the same operations that branch to the same successors. Operands that mismatch between the different blocks are replaced with new block arguments added to the merged block.

Differential Revision: https://reviews.llvm.org/D79134
2020-05-04 19:56:46 -07:00
Lucy Fox 8de482ea9a [MLIR] Modify Partial op conversion mode to optionally track all non-legalizable operations.
There are three op conversion modes: Partial, Full, and Analysis. This change modifies the Partial mode to optionally take a set of non-legalizable ops. If this parameter is specified, all ops that are not legalizable (i.e. would cause full conversion to fail) are tracked throughout the partial legalization.

Differential Revision: https://reviews.llvm.org/D78788
2020-04-30 09:52:37 -07:00
Tres Popp f66c87637a [MLIR] Give AffineStoreOp and AffineLoadOp Memory SideEffects.
Summary:
This change results in tests also being changed to prevent dead
affine.load operations from being folded away during rewrites.

Also move AffineStoreOp and AffineLoadOp to an ODS file.

Differential Revision: https://reviews.llvm.org/D78930
2020-04-28 15:45:25 +02:00
Ehsan Toosi 5c352e69e7 Providing buffer assignment for MLIR
We have provided a generic buffer assignment transformation ported from
TensorFlow. This generic transformation pass automatically analyzes the values
and their aliases (also in other blocks) and returns the valid positions for
Alloc and Dealloc operations. To find these positions, the algorithm uses the
block Dominator and Post-Dominator analyses. In our proposed algorithm, we have
considered aliasing, liveness, nested regions, branches, conditional branches,
critical edges, and independence from custom block terminators. This
implementation doesn't support block loops. However, we have considered this in
our design. For this purpose, it is only required to have a loop analysis to
insert Alloc and Dealloc operations outside of these loops in some special
cases.

Differential Revision: https://reviews.llvm.org/D78484
2020-04-28 10:17:59 +02:00
Phoenix Meadowlark 622aac6a0a Add a folder for division by one.
- Adds a folder for integer division by one with the `divi_signed` and `divi_unsigned` ops.
- Creates tests for scalar and tensor versions of these ops.
- Modifies the test in `parallel-loop-collapsing.mlir` so that it doesn't assume division by one will be in the output.
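
As a rough illustration of what such a folder does, the sketch below models operands as either Python integers (known constants) or strings (unknown SSA values). The function name `fold_div_by_one` and this operand encoding are assumptions, not the MLIR folder API.

```python
def fold_div_by_one(lhs, rhs):
    """Return the folded value of lhs / rhs when rhs is the constant one, else None."""
    # x / 1 == x holds for both signed and unsigned integer division.
    if isinstance(rhs, int) and rhs == 1:
        return lhs
    return None  # nothing to fold

folded = fold_div_by_one("%x", 1)
not_folded = fold_div_by_one("%x", 2)
```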

Differential Revision: https://reviews.llvm.org/D78518
2020-04-27 22:35:10 +00:00
River Riddle a90151d67e [mlir][SCCP] Add support for propagating across symbol based calls
This revision adds support for propagating constants across symbol-based callgraph edges. It uses the existing Call/CallableOpInterfaces to detect the dataflow edges, and propagates constants through arguments and out of returns.

Differential Revision: https://reviews.llvm.org/D78592
2020-04-27 13:04:49 -07:00
Tres Popp 2d2d696137 [MLIR] Propagate input side effect information
Summary:
Previously operations like std.load created methods for obtaining their
effects but did not inherit from the SideEffect interfaces when their
parameters were decorated with the information. The resulting situation
was that passes had no information on the SideEffects of std.load/store
and had to treat them more cautiously. This adds the inheritance
information when creating the methods.

As a side effect, many tests are modified, as they were using std.load
for testing and this operation would be folded away as part of pattern
rewriting. Tests are modified to use store or to return the result of the
std.load.

Reviewers: mravishankar, antiagainst, nicolasvasilache, herhut, aartbik, ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, bader, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78802
2020-04-27 11:35:52 +02:00
River Riddle 0816de167a [mlir][DialectConversion] Add support for properly tracking replaceUsesOfBlockArgument
The current implementation of this method performs the replacement directly, and thus doesn't support proper back tracking.

Differential Revision: https://reviews.llvm.org/D78790
2020-04-24 12:37:32 -07:00
River Riddle 2f4b303d68 [mlir][Standard] Add canonicalization for collapsing pass through cond_br successors.
This revision adds support for the following canonicalization:

```
   cond_br %cond, ^bb1, ^bb2
 ^bb1
   br ^bbN(...)
 ^bb2
   br ^bbK(...)

   cond_br %cond, ^bbN(...), ^bbK(...)
```
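
The rewrite above can be modeled on a toy CFG. This is a sketch under stated assumptions: branch destinations are `(block, operands)` tuples, `passthrough` is a hypothetical map from a block containing only a `br` to that branch's destination, and only argument-free pass-through blocks are forwarded.

```python
def collapse_passthrough(cond_br, passthrough):
    """Retarget a conditional branch past blocks that only contain `br`.

    cond_br:     (condition, (true_block, operands), (false_block, operands))
    passthrough: maps a block name to the (target, operands) of its sole `br`
    """
    cond, true_dest, false_dest = cond_br

    def forward(dest):
        block, operands = dest
        # Assumption: only argument-free pass-through blocks are forwarded.
        if block in passthrough and not operands:
            return passthrough[block]
        return dest

    return (cond, forward(true_dest), forward(false_dest))

passthrough = {"bb1": ("bbN", ("%a",)), "bb2": ("bbK", ("%b",))}
before = ("%cond", ("bb1", ()), ("bb2", ()))
after = collapse_passthrough(before, passthrough)
```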

Differential Revision: https://reviews.llvm.org/D78681
2020-04-23 04:42:01 -07:00
River Riddle 2eda87dfbe [mlir][SCCP] Add support for propagating constants across inter-region control flow.
This is made possible by two new ControlFlowInterface additions:

- A new interface, RegionBranchOpInterface
This interface allows for region holding operations to describe how control flows between regions. This interface initially contains two methods:

* getSuccessorEntryOperands
Returns the operands of this operation used as the entry arguments when entering the region at `index`, which was specified as a successor by `getSuccessorRegions`. These operands should correspond 1-1 with the successor inputs specified in `getSuccessorRegions`, and may be a subset of the entry arguments for that region.

*  getSuccessorRegions
Returns the viable successors of a region, or the possible successor when branching from the parent op. This allows for describing which regions may be executed when entering an operation, and which regions are executed after having executed another region of the parent op. For example, a structured loop operation may always enter into the loop body region. The loop body region may branch back to itself, or exit to the operation.

- A trait, ReturnLike
This trait signals that a terminator exits a region and forwards all of its operands as "exiting" values.

These additions allow for performing more general dataflow analysis in the presence of region holding operations.

Differential Revision: https://reviews.llvm.org/D78447
2020-04-21 02:59:25 -07:00
River Riddle 152d29cc74 [mlir][Transforms] Add pass to perform sparse conditional constant propagation
This revision adds the initial pass for performing SCCP generically in MLIR. SCCP is an algorithm for propagating constants across control flow, and optimistically assumes all values to be constant unless proven otherwise. It currently supports branching control, with support for regions and inter-procedural propagation being added in followups.
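
The optimistic assumption can be illustrated with the textbook three-level SCCP lattice. This is a minimal sketch of the general algorithm, not of the MLIR pass itself; the names `UNKNOWN`, `OVERDEFINED`, `merge`, and `block_argument_value` are chosen for illustration.

```python
UNKNOWN = object()      # optimistic state: no evidence against being constant yet
OVERDEFINED = object()  # proven not to be a single constant

def merge(a, b):
    """Combine two lattice values flowing into the same SSA value."""
    if a is UNKNOWN:
        return b
    if b is UNKNOWN:
        return a
    if a is OVERDEFINED or b is OVERDEFINED:
        return OVERDEFINED
    return a if a == b else OVERDEFINED

def block_argument_value(incoming):
    """Merge values arriving over control-flow edges, ignoring non-executable edges."""
    result = UNKNOWN
    for executable, value in incoming:
        if executable:
            result = merge(result, value)
    return result

# Only the first edge is known to be executable, so the argument stays constant.
arg = block_argument_value([(True, 7), (False, 9)])
```

Skipping edges that are not known to be executable is what makes the propagation "conditional": a value reaching a block only over dead edges cannot spoil the constant.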

Differential Revision: https://reviews.llvm.org/D78397
2020-04-21 02:59:25 -07:00
Sean Silva 22219cfc6a Fix inlining multi-block callees with type conversion.
The previous code resulted in a mismatch between block argument types and
predecessor successor args when a type conversion was needed in a
multiblock case. It was assuming the replaced result types matched the
region result types.

Also, slightly improve the debug output from the inliner.

Differential Revision: https://reviews.llvm.org/D78415
2020-04-20 16:54:01 -07:00
River Riddle 43cf489cf5 [mlir][SymbolDCE][NFC] Fix the visibility of the symbols within the test and
move it to test/Transforms/
2020-04-13 00:33:11 -07:00
River Riddle bd1ccfe6df [mlir] Add a new RewritePattern::hasBoundedRewriteRecursion hook.
Summary: Some pattern rewriters, like dialect conversion, prohibit the unbounded recursion (or reapplication) of patterns on generated IR. Most patterns are not written with recursive application in mind, so will generally explode the stack if uncaught. This revision adds a hook to RewritePattern, `hasBoundedRewriteRecursion`, to signal that the pattern can safely be applied to the generated IR of a previous application of the same pattern. This allows for establishing a contract between the pattern and rewriter that the pattern knows and can handle the potential recursive application.

Differential Revision: https://reviews.llvm.org/D77782
2020-04-09 12:42:28 -07:00
River Riddle 400ad6f95d [mlir] Eliminate the remaining usages of cl::opt instead of PassOption.
Summary: Pass options are a better choice for various reasons and avoid the need for static constructors.

Differential Revision: https://reviews.llvm.org/D77707
2020-04-08 13:05:08 -07:00
Christian Sigg 06ddb7946b [MLIR] Add missing colon after CHECKs.
Reviewers: herhut

Reviewed By: herhut

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77709
2020-04-08 11:16:06 +02:00
Uday Bondhugula 7023f4b4cb [MLIR] Introduce std.alloca op
Introduce the alloca op for stack memory allocation. When converting to the
LLVM dialect, this is lowered to an llvm.alloca. Refactor the std to
llvm conversion for alloc op to reuse with alloca. Drop useAlloca option
with alloc op lowering.

Differential Revision: https://reviews.llvm.org/D76602
2020-04-07 15:45:07 +05:30
Uday Bondhugula 3f9cdd44d7 [MLIR] Add pattern rewriter util to erase block; remove dead else
Add a pattern rewriter utility to erase blocks (while notifying the
pattern rewriting driver of the erased ops). Use this to remove trivial
else blocks in affine.if ops.

Differential Revision: https://reviews.llvm.org/D77083
2020-04-05 19:24:43 +05:30
Alex Zinenko f27f1e8c27 [mlir] DialectConversion: support block creation in ConversionPatternRewriter
PatternRewriter and derived classes provide a set of virtual methods to
manipulate blocks, which ConversionPatternRewriter overrides to keep track of
the manipulations and undo them in case the conversion fails. However, one can
currently create a block only by splitting another block into two. This not
only makes the API inconsistent (`splitBlock` is allowed in conversion
patterns, but `createBlock` is not), but it also makes it impossible for one to
create blocks with argument lists different from those of already existing
blocks since in-place block updates are not supported either. These
limitations preclude dialect conversion infrastructure from being used more
extensively on region-containing ops, for example, for value-returning "if"
operations. At the same time, ConversionPatternRewriter already allows one to
undo block creation as block creation is one of the primitive operations in
already supported region inlining.

Support block creation in conversion patterns by hooking `createBlock` on the
block action undo mechanism. This requires making `Builder::createBlock`
virtual, similarly to Op insertion. This is a minimal change to the Builder
infrastructure that will later help support additional use cases such as block
signature changes. `createBlock` now additionally takes the types of the block
arguments that are added immediately so as to avoid in-place argument list
manipulation that would be illegal in conversion patterns.
2020-04-03 20:30:03 +02:00
Tres Popp a67cd71acd [MLIR] Implement LoopLikeInterface for loop.parallel
Summary:
This is to allow optimizations like loop invariant code motion to work
on the ParallelOp.

Additional small cleanup on the ForOp implementation of
LoopLikeInterface and the test file of loop-invariant-code-motion.

Differential Revision: https://reviews.llvm.org/D77128
2020-04-01 16:47:57 +02:00
Tres Popp 90b7bbffdd [MLIR] Rename collapsePLoops -> collapseParallelLoops
Summary:
Additionally, NFC code cleanups were done.

This is to address additional comments on
https://reviews.llvm.org/D76363

Differential Revision: https://reviews.llvm.org/D77052
2020-04-01 10:15:13 +02:00
Uday Bondhugula 5f9bf3f656 [MLIR][NFC] Move test/Transforms/lower-affine.mlir -> test/Conversion
Move lower-affine.mlir from test/Transforms to
test/Conversion/AffineToStandard/. Other related NFC.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D77008
2020-03-31 23:34:50 +05:30
Mehdi Amini bab5bcf8fd Add a flag on the context to protect against creation of operations in unregistered dialects
Differential Revision: https://reviews.llvm.org/D76903
2020-03-30 19:37:31 +00:00
Tres Popp 27c201aa1d [MLIR] Add parallel loop collapsing.
This allows conversion of a ParallelLoop from N induction variables to
some number of induction variables less than N.

The first intended use of this is for the GPUDialect to convert
ParallelLoops to iterate over 3 dimensions so they can be launched as
GPU Kernels.

To implement this:
- Normalize each iteration space of the ParallelLoop
- Use the same induction variable in a new ParallelLoop for multiple
  original iterations.
- Split the new induction variable back into the original set of values
  inside the body of the ParallelLoop.
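
The three steps above can be sketched in Python for the special case where all induction variables are collapsed into one. This is an illustration under assumptions (positive steps, a hypothetical helper name `collapsed_iterations`), not the pass implementation, which can also collapse to more than one variable.

```python
from itertools import product

def collapsed_iterations(bounds):
    """Enumerate original induction values from a single collapsed loop.

    bounds: list of (lower, upper, step) triples, one per original loop.
    """
    # Step 1: normalize each loop to 0..count-1 by computing its trip count.
    counts = [len(range(lo, hi, st)) for lo, hi, st in bounds]
    total = 1
    for count in counts:
        total *= count

    result = []
    # Step 2: a single induction variable covers the whole iteration space.
    for flat in range(total):
        ivs = []
        rem = flat
        # Step 3: split the flat index back into the original values,
        # innermost loop first.
        for (lo, _, st), count in zip(reversed(bounds), reversed(counts)):
            ivs.append(lo + (rem % count) * st)
            rem //= count
        result.append(tuple(reversed(ivs)))
    return result

bounds = [(0, 4, 2), (1, 7, 3)]
points = collapsed_iterations(bounds)
```

The collapsed loop visits the same points, in the same order, as the original loop nest.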

Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76363
2020-03-26 09:32:52 +01:00
Uday Bondhugula b873761496 [MLIR][NFC] Move some of the affine transforms / tests to dialect dirs
Move some of the affine transforms and their test cases to their
respective dialect directory. This patch does not complete the move, but
takes care of a good part.

Renames: prefix 'affine' to affine loop tiling cl options,
vectorize -> super-vectorize

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76565
2020-03-23 08:25:07 +05:30
River Riddle e9482ed194 [mlir] Move several static cl::opts to be pass options instead.
This removes the reliance on global options, and also simplifies the pass registration.

Differential Revision: https://reviews.llvm.org/D76552
2020-03-22 03:16:21 -07:00
Rob Suderman cd1212deff [mlir] Introduced CallOp Dialect Conversion
Summary:
Utility to perform CallOp Dialect conversion, specifically handling cases where
an argument type has changed and the corresponding CallOp needs to be updated.

Differential Revision: https://reviews.llvm.org/D76326
2020-03-18 20:07:38 -07:00
River Riddle 4be504a97f [mlir] Add support for detecting single use callables in the Inliner.
Summary: This is somewhat complex (annoying) as it involves directly tracking the uses within each of the callgraph nodes, and updating them as needed during inlining. The benefit of this is that we can have a more exact cost model, enable inlining some otherwise non-inlinable cases, and also ensure that newly dead callables are properly disposed of.

Differential Revision: https://reviews.llvm.org/D75476
2020-03-18 13:10:41 -07:00
Uday Bondhugula d811aee5d9 [MLIR][NFC] update/clean up affine PDT, related utils, its test case
- rename vars that had inst suffixes (due to ops earlier being
  known as insts); other renames for better readability
- drop unnecessary matches in test cases
- iterate without block terminator
- comment/doc updates
- instBodySkew -> affineForOpBodySkew

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D76214
2020-03-17 06:12:16 +05:30
River Riddle 0ddba0bd59 [mlir][SideEffects] Replace HasNoSideEffect with the memory effect interfaces.
HasNoSideEffect can now be implemented using the MemoryEffectInterface, removing the need to check multiple things for the same information. This also removes an easy foot-gun for users as 'Operation::hasNoSideEffect' would ignore operations that dynamically, or recursively, have no side effects. This also leads to an immediate improvement in some of the existing users, such as DCE, now that they have access to more information.

Differential Revision: https://reviews.llvm.org/D76036
2020-03-12 14:26:15 -07:00
Tim Shen d00f5632f3 [mlir] Add a simplifying wrapper for generateCopy and expose it.
Summary:
affineDataCopyGenerate is a monolithic function that
combines several steps for good reasons, but it makes customizing
the behavior even harder. The two major steps of affineDataCopyGenerate are:
a) Identify interesting memrefs and collect their uses.
b) Create new buffers to forward these uses.

Step (a) actually requires tremendous customization options. One could see
that from the recently added filterMemRef parameter.

This patch adds a function that only does (b), in the hope that (a)
can be directly implemented by the callers. In fact, (a) is quite
simple if the caller has only one buffer to consider, or even one use.

Differential Revision: https://reviews.llvm.org/D75965
2020-03-11 16:22:31 -07:00
River Riddle b10c662514 [mlir][SideEffects] Replace the old SideEffects dialect interface with the newly added op interfaces/traits.
Summary:
The old interface was a temporary stopgap to allow for implementing simple LICM that took side effects of region operations into account. Now that MLIR has proper support for specifying memory effects, this interface can be deleted.

Differential Revision: https://reviews.llvm.org/D74441
2020-03-09 16:02:21 -07:00
Uday Bondhugula 5e080dff75 [MLIR] NFC: modernize affine loop fusion test cases
- update test case for readability, avoid unnecessary matches

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D75823
2020-03-09 04:27:51 +00:00
River Riddle cb1777127c [mlir] Remove successor operands from the Operation class
Summary:
This revision removes all of the functionality related to successor operands on the core Operation class. This greatly simplifies a lot of handling of operands, as well as successors. For example, DialectConversion no longer needs a special "matchAndRewrite" for branching terminator operations. (Note, the existing method was also broken for operations with variadic successors!)

This also enables terminator operations to define their own relationships with successor arguments, instead of the hardcoded "pass-through" behavior that exists today.

Differential Revision: https://reviews.llvm.org/D75318
2020-03-05 12:53:02 -08:00
River Riddle 01f7431b5b [mlir][DeclarativeParser] Add support for formatting operations with AttrSizedOperandSegments.
This attribute details the segment sizes for operand groups within the operation. This revision adds support for automatically populating this attribute in the declarative parser.

Differential Revision: https://reviews.llvm.org/D75315
2020-03-05 12:51:28 -08:00
Diego Caballero d7058acc14 [mlir] Add MemRef filter to affine data copy optimization
This patch extends affine data copy optimization utility with an
optional memref filter argument. When the memref filter is used, data
copy optimization will only generate copies for such a memref.

Note: this patch is just porting the memref filter feature from Uday's
'hop' branch: https://github.com/bondhugula/llvm-project/tree/hop.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D74342
2020-02-14 13:41:45 -08:00
Andy Davis 40b2eb3530 [mlir][AffineOps] Adds affine loop fusion transformation function to LoopFusionUtils.
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.
Includes ASAN bug fix for D73190.

Reviewers: bondhugula, dcaballe

Reviewed By: bondhugula, dcaballe

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74330
2020-02-11 13:56:26 -08:00
Stephen Neuendorffer ed56633fb9 [MLIR][Standard] Implement constant folding for IndexCast
Differential Revision: https://reviews.llvm.org/D73672
2020-02-10 10:23:56 -08:00
Stephen Neuendorffer 12df427fb2 [MLIR][Standard] Add folding for indexCast(indexCast(x)) -> x
Allow this only if the types are the same.  e.g.:
i16 -> index -> i16  or
index -> i16 -> index
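
The type condition can be made concrete with a small model. The classes `Value` and `IndexCast` and the function `fold_index_cast` below are hypothetical stand-ins, not MLIR classes; the sketch only shows why the fold is restricted to round trips that end at the starting type.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Value:
    name: str
    type: str

@dataclass(frozen=True)
class IndexCast:
    operand: Union["IndexCast", Value]
    type: str  # result type of the cast

def fold_index_cast(cast: IndexCast) -> Optional[Union[IndexCast, Value]]:
    """Fold indexCast(indexCast(x)) to x when the outer result type equals x's type."""
    inner = cast.operand
    if isinstance(inner, IndexCast) and inner.operand.type == cast.type:
        return inner.operand
    return None  # e.g. i16 -> index -> i32 changes the width, so no fold

x = Value("%x", "i16")
round_trip = IndexCast(IndexCast(x, "index"), "i16")
widening = IndexCast(IndexCast(x, "index"), "i32")
```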

Differential Revision: https://reviews.llvm.org/D73671
2020-02-10 10:23:56 -08:00
Stephen Neuendorffer b80a9ca8cb [MLIR] Allow non-binary operations to be commutative
NFC for binary operations.

Differential Revision: https://reviews.llvm.org/D73670
2020-02-10 10:23:55 -08:00
River Riddle abe3e5babd [mlir] Add support for generating debug locations from intermediate levels of the IR.
Summary:
This revision adds a utility to generate debug locations from the IR during compilation, by snapshotting to an output stream and using the locations that operations were dumped in that stream. The new locations may either:
* Replace the original location of the operation.

old:
   loc("original_source.cpp":1:1)
new:
   loc("snapshot_source.mlir":10:10)

* Fuse with the original locations as NamedLocs with a specific tag.

old:
    loc("original_source.cpp":1:1)
new:
    loc(fused["original_source.cpp":1:1, "snapshot"("snapshot_source.mlir":10:10)])

This feature may be used by a debugger to display the code at various different levels of the IR. It would also be able to show the different levels of IR attached to a specific source line in the original source file.

This feature may also be used to generate locations for operations generated during compilation, that don't necessarily have a user source location to attach to.

This requires changes in the printer to track the locations of operations emitted in the stream. Moving forward we need to properly (and efficiently) track the number of newlines emitted to the stream during printing.

Differential Revision: https://reviews.llvm.org/D74019
2020-02-08 15:11:29 -08:00
Mehdi Amini 2724ada8d2 Revert "[mlir] Adds affine loop fusion transformation function to LoopFusionUtils."
This reverts commit 64871f778d.

ASAN indicates a use-after-free in mlir::canFuseLoops(mlir::AffineForOp, mlir::AffineForOp, unsigned int, mlir::ComputationSliceState*) lib/Transforms/Utils/LoopFusionUtils.cpp:202:41
2020-02-06 16:46:28 +00:00
OuHangKresnik 5c3b34930c [mlir] Add AffineMaxOp
Differential Revision: https://reviews.llvm.org/D73848
2020-02-06 10:26:50 +01:00
Andy Davis 64871f778d [mlir] Adds affine loop fusion transformation function to LoopFusionUtils.
Summary:
Adds affine loop fusion transformation function to LoopFusionUtils.
Updates TestLoopFusion utility to run loop fusion transformation until a fixed point is reached.
Adds unit tests to test the transformation.

Reviewers: bondhugula, dcaballe, nicolasvasilache

Reviewed By: bondhugula, dcaballe

Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73190
2020-02-05 16:01:06 -08:00
Stephen Neuendorffer b692f43e42 [MLIR] Rename MemRefBoundCheck.cpp -> TestMemRefBoundCheck.cpp
Summary:

This makes it consistent with other test passes.

Reviewers: rriddle

Reviewed By: rriddle

Subscribers: merge_guards_bot, mgorny, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74068
2020-02-05 11:27:09 -08:00
Kazuaki Ishizaki 549588698f [mlir] NFC: Fix trivial typo in comment
Summary: Also, an exercise to merge this into the master myself after a reviewer gives LGTM.

Reviewers: nicolasvasilache, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73432
2020-02-03 17:39:56 +09:00
River Riddle ce674b131b [mlir] Add support for marking 'unknown' operations as dynamically legal.
Summary: This allows for providing a default "catchall" legality check that is not dependent on specific operations or dialects. For example, this can be useful to check legality based on the specific types of operation operands or results.

Differential Revision: https://reviews.llvm.org/D73379
2020-01-27 19:50:52 -08:00
Alex Zinenko 51ba5b528a [mlir] add lowering from affine.min to std
Summary:
Affine minimum computation will be used in tiling transformation. The
implementation is mostly boilerplate as we already lower the minimum in the
upper bound of an affine loop.

Differential Revision: https://reviews.llvm.org/D73488
2020-01-27 22:30:52 +01:00
Ahmed Taei ab03564706 [mlir] : Fix ViewOp shape folder for identity affine maps
Summary: Fix the ViewOpShapeFolder for the case where no affine map is associated with a MemRef: construct an identity mapping.

Reviewers: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72735
2020-01-15 00:54:00 +00:00
River Riddle 4268e4f4b8 [mlir] Change the syntax of AffineMapAttr and IntegerSetAttr to avoid conflicts with function types.
Summary: The current syntax for AffineMapAttr and IntegerSetAttr conflicts with function types, making it currently impossible to round-trip function types (and e.g. FuncOp) in the IR. This revision changes the syntax for the attributes by wrapping them in a keyword. AffineMapAttr is wrapped with `affine_map<>` and IntegerSetAttr is wrapped with `affine_set<>`.

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D72429
2020-01-13 13:24:39 -08:00
Ahmed Taei f84d320052 [MLIR] Don't use SSA names directly for std.view canonicalization test
Reviewers: rriddle, nicolasvasilache

Subscribers: mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72408
2020-01-08 14:39:35 -08:00
Ahmed Taei 1e25109f93 Canonicalize static alloc followed by memref_cast and std.view
Summary: Rewrite alloc, memref_cast, std.view into alloc, std.view by dropping memref_cast.

Reviewers: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72379
2020-01-08 11:50:33 -08:00
Manuel Freiberger 22954a0e40 Add integer bit-shift operations to the standard dialect.
Rename the 'shlis' operation in the standard dialect to 'shift_left'. Add tests
for this operation (these have been missing so far) and add a lowering to the
'shl' operation in the LLVM dialect.

Add also 'shift_right_signed' (lowered to LLVM's 'ashr') and 'shift_right_unsigned'
(lowered to 'lshr').
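
The difference between the two right shifts can be shown on fixed-width integers. The sketch below emulates the usual two's complement semantics of 'shl', 'ashr' and 'lshr' in Python; the function names mirror the new op names, while the explicit `width` parameter and the assumption `0 <= amount < width` are illustrative choices.

```python
def shift_left(value, amount, width):
    """Shift left and discard bits that fall outside `width` bits."""
    mask = (1 << width) - 1
    return (value << amount) & mask

def shift_right_unsigned(value, amount, width):
    """Logical shift right: vacated high bits are filled with zeros."""
    mask = (1 << width) - 1
    return (value & mask) >> amount

def shift_right_signed(value, amount, width):
    """Arithmetic shift right: vacated high bits copy the sign bit."""
    mask = (1 << width) - 1
    value &= mask
    if value & (1 << (width - 1)):
        value -= 1 << width  # reinterpret the bit pattern as a negative number
    return (value >> amount) & mask

bits = 0xF0  # 8-bit pattern 1111_0000
logical = shift_right_unsigned(bits, 4, 8)
arithmetic = shift_right_signed(bits, 4, 8)
```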

The original plan was to name these operations 'shift.left', 'shift.right.signed'
and 'shift.right.unsigned'. This works if the operations are prefixed with 'std.'
in MLIR assembly. Unfortunately during import the short form is ambiguous with
operations from a hypothetical 'shift' dialect. The best solution seems to be to omit
dots in standard operations for now.

Closes tensorflow/mlir#226

PiperOrigin-RevId: 286803388
2019-12-22 10:02:13 -08:00
Uday Bondhugula e5691c512f fix isValidDim for block arg case
- a block argument associated with an arbitrary op can't be a valid
  dimensional identifier; it has to be the block argument of either
  a function op or an affine.for.

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#331

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/331 from bondhugula:valid_dim 3273b4fcbaa31fb7b6671d93c9e42a6b2a6a4e4c
PiperOrigin-RevId: 286593693
2019-12-20 09:44:03 -08:00
Uday Bondhugula 47034c4bc5 Introduce prefetch op: affine -> std -> llvm intrinsic
Introduce affine.prefetch: op to prefetch using a multi-dimensional
subscript on a memref; similar to affine.load but has no effect on
semantics, but only on performance.

Provide lowering through std.prefetch, llvm.prefetch and map to llvm's
prefetch intrinsic. All attributes reflected through the lowering -
locality hint, rw, and instr/data cache.

  affine.prefetch %0[%i, %j + 5], false, 3, true : memref<400x400xi32>

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#225

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/225 from bondhugula:prefetch 4c3b4e93bc64d9a5719504e6d6e1657818a2ead0
PiperOrigin-RevId: 286212997
2019-12-18 10:00:04 -08:00
River Riddle b030e4a4ec Try to fold operations in DialectConversion when trying to legalize.
This change allows for DialectConversion to attempt folding as a mechanism to legalize illegal operations. This also expands folding support in OpBuilder::createOrFold to generate new constants when folding, and also enables it to work in the context of a PatternRewriter.

PiperOrigin-RevId: 285448440
2019-12-13 16:47:26 -08:00
Uday Bondhugula 36a415bcc5 More affine expr simplifications for floordiv and mod
Add one more simplification for floordiv and mod affine expressions.
Examples:
 (2*d0 + 1) floordiv 2 is simplified to d0
 (8*d0 + 4*d1 + d2) floordiv 4 is simplified to 2*d0 + d1 + d2 floordiv 4.
 etc.

 Similarly, (4*d1 + 1) mod 2 is simplified to 1,
            (2*d0 + 8*d1) mod 8 simplified to 2*d0 mod 8.
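
The identities above can be sanity-checked by brute force over a small range of values. The following is only an illustrative sketch, not code from this commit: `floorDiv`, `mod`, and `checkSimplifications` are hypothetical helper names, and the leading term of the second identity is taken as 2*d0, since (8*d0) floordiv 4 = 2*d0.

```c++
#include <cassert>
#include <cstdint>

// Floor division for a positive divisor. C++ '/' truncates toward zero,
// so negative non-multiples need a correction of -1.
int64_t floorDiv(int64_t lhs, int64_t rhs) {
  assert(rhs > 0 && "divisor must be positive");
  int64_t q = lhs / rhs;
  return (lhs % rhs != 0 && lhs < 0) ? q - 1 : q;
}

// Modulo consistent with floorDiv: the result is always in [0, rhs).
int64_t mod(int64_t lhs, int64_t rhs) {
  return lhs - floorDiv(lhs, rhs) * rhs;
}

// Returns true if all four identities hold for every d0, d1, d2 in
// [-range, range].
bool checkSimplifications(int64_t range) {
  for (int64_t d0 = -range; d0 <= range; ++d0)
    for (int64_t d1 = -range; d1 <= range; ++d1)
      for (int64_t d2 = -range; d2 <= range; ++d2) {
        if (floorDiv(2 * d0 + 1, 2) != d0)
          return false;
        if (floorDiv(8 * d0 + 4 * d1 + d2, 4) !=
            2 * d0 + d1 + floorDiv(d2, 4))
          return false;
        if (mod(4 * d1 + 1, 2) != 1)
          return false;
        if (mod(2 * d0 + 8 * d1, 8) != mod(2 * d0, 8))
          return false;
      }
  return true;
}
```

Calling `checkSimplifications(8)` exercises each identity on 4913 combinations, including negative values, where floor and truncating division differ.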

Change getLargestKnownDivisor to return int64_t to be consistent and
to avoid casting at call sites (since the return value is used in expressions
of int64_t/index type).

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#202

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/202 from bondhugula:affine b13fcb2f1c00a39ca5434613a02408e085a80e77
PiperOrigin-RevId: 284866710
2019-12-10 16:00:53 -08:00
Kazuaki Ishizaki ae05cf27c6 Minor spelling tweaks
Closes tensorflow/mlir#304

PiperOrigin-RevId: 284568358
2019-12-09 09:23:48 -08:00
Uday Bondhugula 3ade6a7d15 DimOp folding for alloc/view dynamic dimensions
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#253

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/253 from bondhugula:dimop a4b464f24ae63fd259114558d87e11b8ee4dae86
PiperOrigin-RevId: 284169689
2019-12-06 06:00:54 -08:00
Nicolas Vasilache edfaf925cf Drop MaterializeVectorTransfers in favor of simpler declarative unrolling
Now that we have unrolling as a declarative pattern, we can drop a full pass that has gone stale. In the future we may want to add specific unrolling patterns for VectorTransferReadOp.

PiperOrigin-RevId: 283806880
2019-12-04 12:11:42 -08:00
Alex Zinenko 75175134d4 Loop coalescing: fix pointer chasing in use-chain traversal
In the replaceAllUsesExcept utility function called from loop coalescing the
iteration over the use-chain is incorrect. The use list nodes (IROperands) have
next/prev links, and bluntly resetting the use would make the loop continue
on uses of the value that was replaced instead of the original one. As a
result, it could miss the existing uses and update the wrong ones. Make sure we
increment the iterator before updating the use in the loop body.
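
The pitfall can be reproduced with a toy intrusive use list. This is a minimal sketch under stated assumptions, not the MLIR implementation: `Use`, `Value`, and this `replaceAllUsesExcept` are simplified stand-ins with a singly linked list, but they show why the iterator must be advanced before the use is reset.

```c++
#include <cassert>

struct Value;

// A use of a Value. All uses of the same Value form a singly linked list.
struct Use {
  Value *value = nullptr;
  Use *next = nullptr;
  // Unlinks this use from its current value and links it into newValue's list.
  void set(Value *newValue);
};

struct Value {
  Use *firstUse = nullptr;
  int numUses() const {
    int n = 0;
    for (Use *u = firstUse; u; u = u->next)
      ++n;
    return n;
  }
};

void Use::set(Value *newValue) {
  if (value) {
    // Find the link that points at this use and bypass it.
    Use **link = &value->firstUse;
    while (*link != this)
      link = &(*link)->next;
    *link = next;
  }
  value = newValue;
  next = nullptr;
  if (newValue) {
    next = newValue->firstUse;
    newValue->firstUse = this;
  }
}

// Replaces every use of `from` with `to`, except the use `except`.
void replaceAllUsesExcept(Value &from, Value &to, const Use *except) {
  for (Use *use = from.firstUse; use;) {
    Use *current = use;
    // Advance first: set() rewrites current->next to point into the use
    // list of `to`, so reading it afterwards would walk the wrong list.
    use = use->next;
    if (current != except)
      current->set(&to);
  }
}
```

If the loop instead read `current->next` after calling `set`, it would continue through the uses of `to` and skip the remaining uses of `from`.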

Reported-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#291.

PiperOrigin-RevId: 283754195
2019-12-04 07:42:29 -08:00
Diego Caballero 330d1ff00e AffineLoopFusion: Prevent fusion of multi-out-edge producer loops
tensorflow/mlir#162 introduced a bug that
incorrectly allowed fusion of producer loops with multiple outgoing
edges. This commit fixes that problem. It also introduces a new flag to
disable sibling loop fusion so that we can test producer-consumer fusion
in isolation.

Closes tensorflow/mlir#259

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/259 from dcaballe:dcaballe/fix_multi_out_edge_producer_fusion 578d5661705fd5c56c555832d5e0528df88c5282
PiperOrigin-RevId: 283531105
2019-12-03 06:09:50 -08:00
Ben Vanik 38d7870ee5 Make std.divis and std.diviu support ElementsAttr folding.
PiperOrigin-RevId: 282434465
2019-11-25 14:31:43 -08:00
Ben Vanik d2284f1f0b Support folding of StandardOps with DenseElementsAttr.
PiperOrigin-RevId: 282270243
2019-11-24 19:23:38 -08:00
Mahesh Ravishankar 6db8530c26 Add more canonicalizations for SubViewOp.
Depending on which of the offsets, sizes, or strides are constant, the
subview op can be canonicalized in different ways. Add such
canonicalizations, which generalize the existing approach of
canonicalizing subview op only if all of offsets, sizes and shapes are
constants.

PiperOrigin-RevId: 282010703
2019-11-22 12:14:18 -08:00
MLIR Team 75379a684f Correctly parse empty affine maps.
Previously, the test case crashed or produced an error.

PiperOrigin-RevId: 281630540
2019-11-20 18:30:15 -08:00
River Riddle fafb708b9a Merge DCE and unreachable block elimination into a new utility 'simplifyRegions'.
This moves the different canonicalizations of regions into one place and invokes them in the fixed-point iteration of the canonicalizer.

PiperOrigin-RevId: 281617072
2019-11-20 15:53:19 -08:00
Sean Silva e4f83c6c26 Add multi-level DCE pass.
This is a simple multi-level DCE pass that operates pretty generically on
the IR. Its key feature compared to the existing peephole dead op folding
that happens during canonicalization is being able to delete recursively
dead cycles of the use-def graph, including block arguments.
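
To illustrate why dead cycles need more than peephole folding, here is a small sketch of liveness propagation on an abstract use-def graph. It is not the pass added by this commit: `Node` and `computeLiveness` are hypothetical names, and "has side effects" stands in for whatever makes an operation a root. A use-count check never removes a cycle, because each node in it still has a user; marking from the roots does.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// A node of a simplified use-def graph.
struct Node {
  std::vector<std::size_t> operands; // indices of the nodes this node uses
  bool hasSideEffects = false;       // roots: must be kept regardless of uses
};

// Marks every node that is transitively used by a side-effecting node.
// Anything left unmarked is dead, including nodes that only use each other.
std::vector<bool> computeLiveness(const std::vector<Node> &nodes) {
  std::vector<bool> live(nodes.size(), false);
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].hasSideEffects) {
      live[i] = true;
      worklist.push_back(i);
    }
  }
  while (!worklist.empty()) {
    std::size_t n = worklist.back();
    worklist.pop_back();
    for (std::size_t operand : nodes[n].operands) {
      if (!live[operand]) {
        live[operand] = true;
        worklist.push_back(operand);
      }
    }
  }
  return live;
}
```

For example, with node 0 side-effecting and using node 1, and nodes 2 and 3 using only each other, the result marks 0 and 1 live and leaves the 2-3 cycle dead.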

PiperOrigin-RevId: 281568202
2019-11-20 12:55:10 -08:00
Alexander Belyaev e50261657f Fix 'the the' typo.
PiperOrigin-RevId: 281501234
2019-11-20 05:38:14 -08:00
Diego Caballero dd5a7cb488 Add getRemappedValue to ConversionPatternRewriter
This method is needed for N->1 conversion patterns to retrieve remapped
Values used in the original N operations.

Closes tensorflow/mlir#237

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/237 from dcaballe:dcaballe/getRemappedValue 1f64fadcf2b203f7b336ff0c5838b116ae3625db
PiperOrigin-RevId: 281321881
2019-11-19 11:09:39 -08:00
Andy Davis a6a287335d Fix SubViewOp stride calculation in constant folding.
Adds unit tests for subview offset and stride argument constant folding.

PiperOrigin-RevId: 281161041
2019-11-18 15:01:08 -08:00
Andy Davis 68a8da4a93 Fix Affine Loop Fusion test case reported on github.
This CL utilizes the more robust fusion feasibility analysis being built out in LoopFusionUtils, which will eventually be used to replace the current affine loop fusion pass.

PiperOrigin-RevId: 281112340
2019-11-18 11:20:37 -08:00
Stephan Herhut f0f3b71d67 Implement folding of pattern dim(subview(_)[...][s1, ..., sn][...], i) -> si.
PiperOrigin-RevId: 281042016
2019-11-18 04:31:33 -08:00
Stephan Herhut 57bafc674e Mark std.view as no-sideeffect.
The same reasoning as for std.subview applies.

PiperOrigin-RevId: 280639308
2019-11-15 05:28:31 -08:00
Stephan Herhut 9c7bceb4fe Mark std.subview as no-sideeffect.
In essence, std.subview is just an abstract indexing transformation (somewhat
akin to a gep in llvm) and by itself has no effect. From a practical perspective
this helps, as it allows removing dead subview operations.

PiperOrigin-RevId: 280630046
2019-11-15 04:00:31 -08:00
Nicolas Vasilache 0b271b7dfe Refactor the LowerVectorTransfers pass to use the RewritePattern infra - NFC
This is step 1/n in refactoring infrastructure along the Vector dialect to make it ready for retargetability and composable progressive lowering.

PiperOrigin-RevId: 280529784
2019-11-14 15:40:07 -08:00
Andy Davis a4669cd3b4 Adds canonicalizer to SubViewOp which folds constants from base memref and operands into the subview result memref type.
Changes SubViewOp to support zero operands case, when offset, strides and sizes are all constant.

PiperOrigin-RevId: 280485075
2019-11-14 12:23:04 -08:00
Nicolas Vasilache f2b6ae9991 Move VectorOps to Tablegen - (almost) NFC
This CL moves VectorOps to Tablegen and cleans up the implementation.

This is almost NFC but 2 changes occur:
  1. an interface change occurs in the padding value specification in vector_transfer_read:
     the value becomes non-optional. As a shortcut we currently use %f0 for all paddings.
     This should become an OpInterface for vectorization in the future.
  2. the return type of vector.type_cast is trivial and simplified to `memref<vector<...>>`

Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect.

The op documentation is moved to the .td file.

PiperOrigin-RevId: 280430869
2019-11-14 08:15:23 -08:00
River Riddle d985c74883 NFC: Refactor block signature conversion to not erase the original arguments.
This refactors the implementation of block signature (type) conversion to not insert fake cast operations to perform the type conversion, but to instead create a new block containing the proper signature. This has the benefit of enabling the use of pre-computed analyses that rely on mapping values. It also leads to a much cleaner implementation overall. The major user-facing change is that applySignatureConversion will now replace the entry block of the region, meaning that blocks generally shouldn't be cached over calls to applySignatureConversion.

PiperOrigin-RevId: 280226936
2019-11-13 10:27:53 -08:00
Stephan Herhut e04d4bf865 Also consider index constants when folding integer arithmetics with constants.
PiperOrigin-RevId: 279698088
2019-11-11 02:34:21 -08:00
Andy Davis 8f00b4494d Swap operand order in std.view operation so that offset appears before dynamic sizes in the operand list.
PiperOrigin-RevId: 279114236
2019-11-07 10:20:23 -08:00
Andy Davis 5fbdb67b0a Add canonicalizer for ViewOp which folds constants into the ViewOp memref shape and layout map strides and offset.
PiperOrigin-RevId: 279088023
2019-11-07 08:05:03 -08:00
River Riddle 2366561a39 Add a PatternRewriter hook to merge blocks, and use it to support folding branches.
A pattern rewriter hook, mergeBlock, is added that allows for merging the operations of one block into the end of another. This is used to support a canonicalization pattern for branch operations that folds the branch when the successor has a single predecessor (the branch block).

Example:
  ^bb0:
    %c0_i32 = constant 0 : i32
    br ^bb1(%c0_i32 : i32)
  ^bb1(%x : i32):
    return %x : i32

becomes:
  ^bb0:
    %c0_i32 = constant 0 : i32
    return %c0_i32 : i32
PiperOrigin-RevId: 278677825
2019-11-05 11:57:38 -08:00
Mahesh Ravishankar 9cbbd8f4df Support lowering of imperfectly nested loops into GPU dialect.
The current lowering of loops to GPU only supports lowering of loop
nests where the loops mapped to workgroups and workitems are perfectly
nested. Here a new lowering is added to handle lowering of imperfectly
nested loop body with the following properties
1) The loops partitioned to workgroups are perfectly nested.
2) The loop body of the innermost loop partitioned to workgroups can
contain one or more loop nests that are to be partitioned across
workitems. Each individual loop nest partitioned to workitems should
also be perfectly nested.
3) The number of workgroups and workitems are not deduced from the
loop bounds but are passed in by the caller of the lowering as values.
4) For statements within the perfectly nested loop nest partitioned
across workgroups that are not loops, it is valid to have all threads
execute that statement. This is NOT verified.

PiperOrigin-RevId: 277958868
2019-11-01 10:52:06 -07:00
River Riddle a32f0dcb5d Add support to GreedyPatternRewriter for erasing unreachable blocks.
Rewrite patterns may make modifications to the CFG, including dropping edges between blocks. This change adds a simple unreachable block elimination run at the end of each iteration to ensure that the CFG remains valid.

PiperOrigin-RevId: 277545805
2019-10-30 11:19:24 -07:00
River Riddle 2f4d0c085a Add support for marking an operation as recursively legal.
In some cases, it may be desirable to mark entire regions of operations as legal. This provides an additional granularity of context to the concept of "legal". The `ConversionTarget` supports marking operations, that were previously added as `Legal` or `Dynamic`, as `recursively` legal. Recursive legality means that if an operation instance is legal, either statically or dynamically, all of the operations nested within are also considered legal. An operation can be marked via `markOpRecursivelyLegal<>`:

```c++
ConversionTarget &target = ...;

/// The operation must first be marked as `Legal` or `Dynamic`.
target.addLegalOp<MyOp>(...);
target.addDynamicallyLegalOp<MySecondOp>(...);

/// Mark the operation as always recursively legal.
target.markOpRecursivelyLegal<MyOp>();
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp, MySecondOp>([](Operation *op) { ... });
/// Mark optionally with a callback to allow selective marking.
target.markOpRecursivelyLegal<MyOp>([](MyOp op) { ... });
```

PiperOrigin-RevId: 277086382
2019-10-28 10:04:34 -07:00
River Riddle 2b61b7979e Convert the Canonicalize and CSE passes to generic Operation Passes.
This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore.

PiperOrigin-RevId: 276573038
2019-10-24 15:01:09 -07:00
River Riddle 21ee4e987f Add @below and @above directives to verify-diagnostics.
This simplifies defining expected-* directives when there are multiple that apply to the next or previous line. @below applies the directive to the next non-designator line, i.e. the next line that does not contain an expected-* designator. @above applies to the previous non designator line.

Examples:

// Expect a remark on the next line that does not contain a designator.
// expected-remark@below {{remark on function below}}
// expected-remark@below {{another remark on function below}}
func @bar(%a : f32)

// Expect a remark on the previous line that does not contain a designator.
func @baz(%a : f32)
// expected-remark@above {{remark on function above}}
// expected-remark@above {{another remark on function above}}
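
The line-resolution rule described above can be sketched as a small search over the source lines. This is an assumption-laden illustration rather than the verifier's actual code: `hasDesignator` and `resolveTarget` are hypothetical helpers, and a line counts as a designator line simply if it contains the text `expected-`.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical check: a line is a designator line if it carries any
// expected-* directive.
bool hasDesignator(const std::string &line) {
  return line.find("expected-") != std::string::npos;
}

// Returns the 0-based index of the line that a directive on `directiveLine`
// applies to: the nearest following (@below) or preceding (@above) line
// without a designator. Returns -1 if no such line exists.
std::ptrdiff_t resolveTarget(const std::vector<std::string> &lines,
                             std::size_t directiveLine, bool below) {
  const std::ptrdiff_t step = below ? 1 : -1;
  const auto count = static_cast<std::ptrdiff_t>(lines.size());
  for (auto i = static_cast<std::ptrdiff_t>(directiveLine) + step;
       i >= 0 && i < count; i += step) {
    if (!hasDesignator(lines[static_cast<std::size_t>(i)]))
      return i;
  }
  return -1;
}
```

With the example above, both `@below` remarks resolve to the `func @bar` line and both `@above` remarks resolve to the `func @baz` line, because intermediate designator lines are skipped.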

PiperOrigin-RevId: 276369085
2019-10-23 15:56:29 -07:00
Kazuaki Ishizaki f28c5aca17 Fix minor spelling tweaks (NFC)
Closes tensorflow/mlir#175

PiperOrigin-RevId: 275726876
2019-10-20 09:44:36 -07:00
Nicolas Vasilache 9e7e297da3 Lower vector transfer ops to loop.for operations.
This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189.

PiperOrigin-RevId: 275543361
2019-10-18 14:10:10 -07:00
Stephan Herhut b843cc5d5a Implement simple loop-invariant-code-motion based on dialect interfaces.
PiperOrigin-RevId: 275004258
2019-10-16 04:28:38 -07:00
River Riddle 96de7091bc Allowing replacing non-root operations in DialectConversion.
When dealing with regions, or other patterns that need to generate temporary operations, it is useful to be able to replace other operations than the root op being matched. Before this PR, these operations would still be considered for legalization meaning that the conversion would either fail, erroneously need to mark these ops as legal, or add unnecessary patterns.

PiperOrigin-RevId: 274598513
2019-10-14 10:01:59 -07:00
River Riddle 6b1cc3c6ea Add support for canonicalizing callable regions during inlining.
This will allow for inlining newly devirtualized calls, as well as give a more accurate cost model (when we have one). Currently, canonicalization will only run for nodes that have no child edges, as the child nodes may be erased during canonicalization. We can support this in the future, but it requires more intricate deletion tracking.

PiperOrigin-RevId: 274011386
2019-10-10 17:06:33 -07:00
River Riddle 438dc176b1 Remove the need to convert operations in regions of operations that have been replaced.
When an operation with regions gets replaced, we currently require that all of the remaining nested operations are still converted even though they are going to be replaced when the rewrite is finished. This CL adds tracking for a minimal set of operations that are known to be "dead". This allows for ignoring the legalization of operations that won't survive after conversion.

PiperOrigin-RevId: 274009003
2019-10-10 17:06:25 -07:00
Parker Schuh 309b4556d0 Add test for fix to tablegen for custom folders for ops that return a single
variadic result.

Add missing test for single line fix to `void OpEmitter::genFolderDecls()`
entitled "Fold away reduction over 0 dimensions."

PiperOrigin-RevId: 273880337
2019-10-09 20:44:30 -07:00
Diego Caballero 3451055614 Add support for some multi-store cases in affine fusion
This PR is a stepping stone towards supporting generic multi-store
source loop nests in affine loop fusion. It extends the algorithm to
support fusion of multi-store loop nests that:
 1. have only one store that writes to a function-local live out, and
 2. the remaining stores are involved in loop nest self dependences
    or no dependences within the function.

Closes tensorflow/mlir#162

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/162 from dcaballe:dcaballe/multi-output-fusion 7fb7dec6fe8b45f5ce176f018bfe37b256420c45
PiperOrigin-RevId: 273773907
2019-10-09 10:37:30 -07:00