Commit Graph

2404 Commits

Author SHA1 Message Date
Alex Zinenko ed749b7689 Make "LowerToCFG" an operation pass
The conversion from the Loops dialect to the Standard dialect, also known as
loop-to-cfg lowering, has historically been a function pass. It can also be
required on non-Standard function ops, in particular the recently introduced GPU
functions. Make the conversion an operation pass instead of a function pass.

PiperOrigin-RevId: 285814560
2019-12-16 11:36:02 -08:00
Jose Ignacio Gomez 3ae56c4135 [Linalg] Expose subview promotion as a declarative pattern
This PR targets issue tensorflow/mlir#295. It exposes the already existing
subview promotion pass as a declarative pattern.

Change-Id: If901ebef9fb53fcd0b12ecc536f6b174ce320b92

Closes tensorflow/mlir#315

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/315 from tetuante:issue295 8e5f268b6d85f31015c33505329dbd7a4db97ac5
PiperOrigin-RevId: 285801463
2019-12-16 10:50:45 -08:00
Aart Bik cd5dab8ad7 [VectorOps] Add [insert/extract]element definition together with lowering to LLVM
Similar to the insert/extract vector instructions, but these ops
(1) work on 1-D vectors only, and
(2) allow for a dynamic index.

  %c3 = constant 3 : index
  %0 = vector.insertelement %arg0, %arg1[%c3 : index] : vector<4xf32>
  %1 = vector.extractelement %arg0[%c3 : index] : vector<4xf32>

PiperOrigin-RevId: 285792205
2019-12-16 09:52:46 -08:00
Andy Davis 73ec37c8bb Adds ExtractSlicesOp to the VectorOps dialect.
ExtractSlicesOp extracts slices of its vector operand according to a specified tiling scheme.
This operation centralizes the tiling scheme around a single op, which simplifies vector op unrolling and subsequent pattern rewrite transformations.
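
As an illustrative sketch only (not part of the commit; the exact assembly syntax, slice sizes, and strides below are assumptions), extracting 2x2 slices from a 4x2 vector might look like:

  // Hypothetical example; op syntax and result tuple type are assumed.
  %0 = vector.extract_slices %arg0, [2, 2], [1, 1]
    : vector<4x2xf32> into tuple<vector<2x2xf32>, vector<2x2xf32>>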

PiperOrigin-RevId: 285761129
2019-12-16 06:39:09 -08:00
Alex Zinenko 0684aa9a8b Make memref promotion during std->LLVM lowering the default calling convention
During the conversion from the standard dialect to the LLVM dialect,
memref-typed arguments are promoted from registers to memory and passed into
functions by pointer. This had been introduced into the lowering to work around
the absence of calling convention modeling in MLIR to enable better
interoperability with LLVM IR generated from C, and has been exercised for
several months. Make this promotion the default calling convention when
converting to the LLVM dialect. This adds the documentation, simplifies the
code and makes the conversion consistent across function operations and
function types used in other places, e.g. in high-order functions or
attributes, which would not follow the same rule previously.
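
For illustration only (this sketch is not from the commit; the descriptor layout and exact LLVM dialect type spellings are assumptions), a function taking a 1-D dynamically sized memref

  func @callee(memref<?xf32>)

would, under this calling convention, have its memref descriptor (allocated pointer, aligned pointer, offset, sizes, strides) promoted to memory and receive a pointer to it:

  // Hypothetical lowered declaration; type spelling assumed.
  llvm.func @callee(!llvm<"{ float*, float*, i64, [1 x i64], [1 x i64] }*">)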

PiperOrigin-RevId: 285751280
2019-12-16 05:17:14 -08:00
Tres Popp 44fc7d72b3 Remove LLVM dependency on mlir::Module and instead check Traits.
PiperOrigin-RevId: 285724678
2019-12-16 01:45:44 -08:00
Smit Hinsu 2d22b1e04e Add verifyCompatibleShape function overload with shapes
PiperOrigin-RevId: 285574334
2019-12-14 11:18:38 -08:00
Nicolas Vasilache 200beb8446 Apply a level of sugaring to the linalg.generic EDSC - NFC
Make the declarative C++ builder API simpler to use so we can start chaining these ops together.

PiperOrigin-RevId: 285496266
2019-12-13 17:39:46 -08:00
River Riddle 7ac42fa26e Refactor various canonicalization patterns as in-place folds.
This is more efficient, and allows for these to fire in more situations: e.g. createOrFold, DialectConversion, etc.

PiperOrigin-RevId: 285476837
2019-12-13 17:19:02 -08:00
Nicolas Vasilache 7923abd357 Add a layer of EDSC for linalg.GenericOp
This will be evolved into a simple programming model for custom ops and custom layers in followup CLs.

This CL also deletes the obsolete tablegen's reference-impl.td that was using EDSCs.

PiperOrigin-RevId: 285459545
2019-12-13 16:57:57 -08:00
River Riddle b030e4a4ec Try to fold operations in DialectConversion when trying to legalize.
This change allows for DialectConversion to attempt folding as a mechanism to legalize illegal operations. This also expands folding support in OpBuilder::createOrFold to generate new constants when folding, and also enables it to work in the context of a PatternRewriter.

PiperOrigin-RevId: 285448440
2019-12-13 16:47:26 -08:00
Christian Sigg 8846557672 Fix maskAndClamp in gpu.all_reduce.
The clamp value determines the returned predicate. Previously, the clamp value was fixed to 31 and the predicate was therefore always true. This is incorrect for partial warp reductions, but went unnoticed because the returned values happened to be zero (but they could have been anything).

PiperOrigin-RevId: 285343160
2019-12-13 15:28:58 -08:00
River Riddle e7aa47ff11 NFC: Cleanup the various Op::print methods.
This cleans up the implementation of the various operation print methods. This is done via a combination of code cleanup, adding new streaming methods to the printer (e.g. operand ranges), etc.

PiperOrigin-RevId: 285285181
2019-12-12 15:32:21 -08:00
Aart Bik 1c81adf362 [VectorOps] Add lowering of vector.shuffle to LLVM IR
For example, a shuffle

%1 = vector.shuffle %arg0, %arg1 [0 : i32, 1 : i32] : vector<2xf32>, vector<2xf32>

becomes a direct LLVM shuffle

%0 = llvm.shufflevector %arg0, %arg1 [0 : i32, 1 : i32] : !llvm<"<2 x float>">, !llvm<"<2 x float>">

but

%1 = vector.shuffle %a, %b [1 : i32, 0 : i32, 2 : i32] : vector<1x4xf32>, vector<2x4xf32>

becomes the more elaborate (note the index permutation that drives
argument selection for the extract operations)

%0 = llvm.mlir.undef : !llvm<"[3 x <4 x float>]">
%1 = llvm.extractvalue %arg1[0] : !llvm<"[2 x <4 x float>]">
%2 = llvm.insertvalue %1, %0[0] : !llvm<"[3 x <4 x float>]">
%3 = llvm.extractvalue %arg0[0] : !llvm<"[1 x <4 x float>]">
%4 = llvm.insertvalue %3, %2[1] : !llvm<"[3 x <4 x float>]">
%5 = llvm.extractvalue %arg1[1] : !llvm<"[2 x <4 x float>]">
%6 = llvm.insertvalue %5, %4[2] : !llvm<"[3 x <4 x float>]">

PiperOrigin-RevId: 285268164
2019-12-12 14:11:56 -08:00
Nicolas Vasilache 782ae29678 Retire !linalg.buffer type - NFC
This type is not used anymore now that Linalg view and subview have graduated to std and that alignment is supported on alloc.

PiperOrigin-RevId: 285213424
2019-12-12 10:03:57 -08:00
Ehsan Toosi f7bffad5a7 Added lowering of `std.tanh` to LLVM function calls to `tanh` and `tanhf`.
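A hedged before/after sketch for the f32 case (not taken from the commit; the printed op form, the callee name, and the LLVM dialect types are assumptions):

  // Hypothetical f32 case; the f64 case would presumably call `tanh` instead.
  %0 = tanh %arg0 : f32
  // after lowering to the LLVM dialect:
  %0 = llvm.call @tanhf(%arg0) : (!llvm.float) -> !llvm.float
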
Closes tensorflow/mlir#312

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/312 from dfki-ehna:tanh 9e89b072ff91ff390ad739501745114feb3ac856
PiperOrigin-RevId: 285205674
2019-12-12 09:25:15 -08:00
Christian Sigg 9b85582682 Automated rollback of commit f68ac464d8
PiperOrigin-RevId: 285162061
2019-12-12 03:48:38 -08:00
Christian Sigg f68ac464d8 Switch from shfl.bfly to shfl.down.
Both work for the current use case, but the latter allows implementing
prefix sums and is a little easier to understand for partial warps.

PiperOrigin-RevId: 285145287
2019-12-12 01:28:01 -08:00
River Riddle 851a8516d3 Make OpBuilder::insert virtual instead of OpBuilder::createOperation.
It is sometimes useful to create operations separately from the builder before insertion as it may be easier to erase them in isolation if necessary. One example use case for this is folding, as we will only want to insert newly generated constant operations on success. This has the added benefit of fixing some silent PatternRewriter failures related to cloning, as the OpBuilder 'clone' methods don't call createOperation.

PiperOrigin-RevId: 285086242
2019-12-11 16:26:45 -08:00
Nicolas Vasilache 9dfa84a269 Add std.log* and llvm.intr.log* that correspond to the LLVMIR intrinsics
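A minimal sketch of what the new ops could look like (the printed forms below are assumptions based on the commit title; the generic form is used for the intrinsic op):

  // Hypothetical printed forms.
  %0 = log %arg0 : f32
  %1 = log2 %arg0 : f32
  %2 = log10 %arg0 : f32
  %3 = "llvm.intr.log"(%arg1) : (!llvm.float) -> !llvm.float
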
PiperOrigin-RevId: 285073483
2019-12-11 15:25:34 -08:00
Mahesh Ravishankar 652fc261d7 Expose a convenience function to add interface attributes to a function.
PiperOrigin-RevId: 285036647
2019-12-11 12:21:42 -08:00
Denis Khalikov d968f9696d [spirv] Add lowering for std.fdiv, std.frem, std.fsub
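A hedged before/after sketch for the f32 case (not part of the commit; operand names and exact spellings are assumptions):

  // Hypothetical f32 case.
  %0 = divf %a, %b : f32    // lowers to: %0 = spv.FDiv %a, %b : f32
  %1 = remf %a, %b : f32    // lowers to: %1 = spv.FRem %a, %b : f32
  %2 = subf %a, %b : f32    // lowers to: %2 = spv.FSub %a, %b : f32
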
Closes tensorflow/mlir#313

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/313 from denis0x0D:sandbox/lowering_std_farith 41715070a74d13bfa9401957478978c1bb8006c0
PiperOrigin-RevId: 285023586
2019-12-11 11:17:35 -08:00
Nicolas Vasilache 508d4e672e Continue refactoring StructuredOps utilities
This CL adds more common information to StructuredOpsUtils.h
The n_view attribute is retired in favor of args_in + args_out but the CL is otherwise NFC.

PiperOrigin-RevId: 285000621
2019-12-11 09:27:34 -08:00
Christian Sigg c5fb4c1303 NFC: Fix naming inconsistency: FuncOpLowering -> GPUFuncOpLowering.
Remove nested anonymous namespace.

PiperOrigin-RevId: 284987357
2019-12-11 08:24:58 -08:00
Alexander Belyaev 4b0198acb5 Roll-forward initial liveness analysis including test cases.
Fix the usage of the map size when appending to the map with [].

PiperOrigin-RevId: 284985916
2019-12-11 08:13:43 -08:00
Alexander Belyaev 984fdde269 Automated rollback of commit 98fbf41044
PiperOrigin-RevId: 284979684
2019-12-11 07:17:21 -08:00
Stephan Herhut b96f86daaf Add a function to get lowering patterns from GPU to NVVM.
This enables combining the patterns with other patterns into larger lowerings.

PiperOrigin-RevId: 284979271
2019-12-11 07:14:33 -08:00
Alexander Belyaev bae8a7a724 [Linalg] Add tiling for IndexedGenericOp with a region.
PiperOrigin-RevId: 284949355
2019-12-11 02:56:40 -08:00
Marcel Koester 98fbf41044 Add initial liveness analysis including test cases.
Closes tensorflow/mlir#255

PiperOrigin-RevId: 284935454
2019-12-11 01:03:25 -08:00
Aart Bik 9826fe5c9f [VectorOps] Add lowering of vector.insert to LLVM IR
For example, an insert

  %0 = vector.insert %arg0, %arg1[3 : i32] : f32 into vector<4xf32>

becomes

  %0 = llvm.mlir.constant(3 : i32) : !llvm.i32
  %1 = llvm.insertelement %arg0, %arg1[%0 : !llvm.i32] : !llvm<"<4 x float>">

A more elaborate example, inserting an element in a higher dimension
vector

  %0 = vector.insert %arg0, %arg1[3 : i32, 7 : i32, 15 : i32] : f32 into vector<4x8x16xf32>

becomes

  %0 = llvm.extractvalue %arg1[3 : i32, 7 : i32] : !llvm<"[4 x [8 x <16 x float>]]">
  %1 = llvm.mlir.constant(15 : i32) : !llvm.i32
  %2 = llvm.insertelement %arg0, %0[%1 : !llvm.i32] : !llvm<"<16 x float>">
  %3 = llvm.insertvalue %2, %arg1[3 : i32, 7 : i32] : !llvm<"[4 x [8 x <16 x float>]]">

PiperOrigin-RevId: 284882443
2019-12-10 17:12:49 -08:00
Andy Davis 4d8ba88610 Add VectorOp transform pattern which splits vector TransferReadOps to target vector unroll size.
PiperOrigin-RevId: 284880592
2019-12-10 17:02:51 -08:00
Uday Bondhugula 36a415bcc5 More affine expr simplifications for floordiv and mod
Add one more simplification for floordiv and mod affine expressions.
Examples:
 (2*d0 + 1) floordiv 2 is simplified to d0
 (8*d0 + 4*d1 + d2) floordiv 4 is simplified to 2*d0 + d1 + d2 floordiv 4.
 etc.

 Similarly, (4*d1 + 1) mod 2 is simplified to 1,
            (2*d0 + 8*d1) mod 8 simplified to 2*d0 mod 8.

Change getLargestKnownDivisor to return int64_t to be consistent and
to avoid casting at call sites (since the return value is used in expressions
of int64_t/index type).

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#202

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/202 from bondhugula:affine b13fcb2f1c00a39ca5434613a02408e085a80e77
PiperOrigin-RevId: 284866710
2019-12-10 16:00:53 -08:00
Alex Zinenko d1213ae51d Move gpu.launch_func to ODS. NFC
Move the definition of gpu.launch_func operation from hand-rolled C++
implementation to the ODS framework. Also move the documentation. This only
performs the move and remains a non-functional change; a follow-up will clean
up the custom functions that can be auto-generated using ODS.

PiperOrigin-RevId: 284842252
2019-12-10 13:55:21 -08:00
River Riddle 9ed22ae5b8 Refactor the various operand/result/type iterators to use indexed_accessor_range.
This has several benefits:
* The implementation is much cleaner and more efficient.
* The ranges now have support for many useful operations: operator[], slice, drop_front, size, etc.
* Value ranges can now directly query a range for their types via 'getTypes()', e.g.:
   void foo(Operation::operand_range operands) {
     auto operandTypes = operands.getTypes();
   }

PiperOrigin-RevId: 284834912
2019-12-10 13:21:22 -08:00
Jose Ignacio Gomez b19fed5415 [Linalg] Add a Linalg iterator permutation transformation
This patch closes issue tensorflow/mlir#272
We add a standalone iterator permutation transformation to Linalg.
This transformation composes a permutation map with the maps in the
"indexing_maps" attribute. It also permutes "iterator_types"
accordingly.

Change-Id: I7c1e693b8203aeecc595a7c012e738ca1100c857

Closes tensorflow/mlir#307

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/307 from tetuante:issue272 f7908d58792f4111119721885e247045104f1131
PiperOrigin-RevId: 284824102
2019-12-10 12:25:43 -08:00
Nicolas Vasilache ad38e49806 Uniformize Vector transforms as patterns on the model of Linalg - NFC
This reorganizes the vector transformations to be more easily testable as patterns and more easily composable into fused passes in the future.

PiperOrigin-RevId: 284817474
2019-12-10 11:54:33 -08:00
Mahesh Ravishankar 04fdd33daf More convenience build methods for SPIR-V ops.
Add some convenience build methods to SPIR-V ops and update the
lowering to use these methods where possible.

For SPIRV::CompositeExtractOp, move the method that deduces the element
type from the base and indices into a convenience function. Some additional
functionality was needed to handle differences between the parsing and
verification methods.

PiperOrigin-RevId: 284794404
2019-12-10 10:11:50 -08:00
Alex Zinenko ac4873322f Drop Markdown style annotations
These come from a non-standard extension that is not available on GitHub, so it
only clutters the documentation source with {.mlir} or {.ebnf} tags.

PiperOrigin-RevId: 284733003
2019-12-10 03:00:57 -08:00
Aart Bik 1fe65688d4 [VectorOps] Add a ShuffleOp to the VectorOps dialect
For example

 %0 = vector.shuffle %x, %y [3 : i32, 2 : i32, 1 : i32, 0 : i32] : vector<2xf32>, vector<2xf32>

yields a vector<4xf32> result with a permutation of the elements of %x and %y

PiperOrigin-RevId: 284657191
2019-12-09 16:15:41 -08:00
Aart Bik 0e963b9c42 [VectorOps] Fix off-by-one error in insert/extract validation
PiperOrigin-RevId: 284652653
2019-12-09 15:54:23 -08:00
River Riddle 3f9744a6b7 Refactor the Block support classes.
Each of the support classes for Block has now been moved into a new header, BlockSupport.h. The successor iterator class is also reimplemented as an indexed_accessor_range. This makes the class more efficient, and expands on its available functionality.

PiperOrigin-RevId: 284646792
2019-12-09 15:24:43 -08:00
River Riddle 7be6a40ab9 Add new indexed_accessor_range_base and indexed_accessor_range classes that simplify defining index-able ranges.
Many ranges want similar functionality from a range type (e.g. slice/drop_front/operator[]/etc.), so these classes provide a generic implementation that may be used by many different types of ranges. This removes some code duplication, and also empowers many of the existing range types in MLIR (e.g. result type ranges, operand ranges, ElementsAttr ranges, etc.). This change only updates RegionRange and ValueRange; more ranges will be updated in follow-up commits.

PiperOrigin-RevId: 284615679
2019-12-09 12:55:40 -08:00
Denis Khalikov 34265dad65 [spirv] Add CompositeConstruct operation.
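For example (an illustrative sketch; the operand names and result type are assumptions):

  // Hypothetical construction of a 3-element vector from scalar constituents.
  %0 = spv.CompositeConstruct %a, %b, %c : vector<3xf32>
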
Closes tensorflow/mlir#308

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/308 from denis0x0D:sandbox/composite_construct 9ef7180f77f9374bcd05afc4f9e6c1d2d72d02b7
PiperOrigin-RevId: 284613617
2019-12-09 12:43:53 -08:00
Lei Zhang 2c7e8ed7c6 [spirv] Add spv.IAdd, spv.ISub, and spv.IMul folders
The patterns being folded away are commonly generated
during lowering to SPIR-V.
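
As an illustrative sketch (the constant names are assumptions, and the commit does not spell out the exact set of folded patterns), folding constant operands could look like:

  // Hypothetical folds of constant operands.
  %c2 = spv.constant 2 : i32
  %c3 = spv.constant 3 : i32
  %0 = spv.IAdd %c2, %c3 : i32   // folds to a constant 5 : i32
  %1 = spv.IMul %c2, %c3 : i32   // folds to a constant 6 : i32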

PiperOrigin-RevId: 284604855
2019-12-09 11:59:10 -08:00
Nicolas Vasilache 5a48e40a65 Factor out commonly reusable names across structured ops dialects
This CL starts extracting commonalities between dialects that use the structured ops abstractions. Also fixes an OSS build issue where StringRef were incorrectly used with constexpr.

PiperOrigin-RevId: 284591114
2019-12-09 11:01:40 -08:00
Mahesh Ravishankar 4a62019eb8 Add lowering for module with gpu.kernel_module attribute.
The existing GPU to SPIR-V lowering created a spv.module for every
function with gpu.kernel attribute. A better approach is to lower the
module that the function lives in (which has the attribute
gpu.kernel_module) to a spv.module operation. This better captures the
host-device separation modeled by the GPU dialect and simplifies the
lowering as well.
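
A hedged structural skeleton (the attribute spellings and the spv.module header below reflect the dialects of the time and are assumptions; bodies are elided):

  // Hypothetical input: a module carrying the gpu.kernel_module attribute.
  module @kernels attributes {gpu.kernel_module} {
    func @kernel(%arg0: f32) attributes {gpu.kernel} {
      return
    }
  }
  // After lowering, the whole kernel module becomes a single spv.module:
  spv.module "Logical" "GLSL450" {
    // lowered kernel function and interface variables go here
  }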

PiperOrigin-RevId: 284574688
2019-12-09 09:52:21 -08:00
Andy Davis 312ccb1c0f Unify vector op unrolling transformation.
Unifies the vector op unrolling transformation by using the same unrolling implementation for contraction and elementwise operations.
Removes the fakefork/join operations, which are no longer needed now that we have the InsertStridedSlice operation.

PiperOrigin-RevId: 284570784
2019-12-09 09:35:15 -08:00
Kazuaki Ishizaki ae05cf27c6 Minor spelling tweaks
Closes tensorflow/mlir#304

PiperOrigin-RevId: 284568358
2019-12-09 09:23:48 -08:00
Nicolas Vasilache 91c0074624 [StructuredOps][Linalg] Add a primitive pattern to rewrite the linalg.generic form of matmul to vector form.
This CL uses the newly expanded matcher support to easily detect when a linalg.generic has a multiply-accumulate body. A linalg.generic with such a body is rewritten as a vector contraction.
This CL additionally limits the rewrite to the case of matrix multiplication on contiguous and statically shaped memrefs for now.

Before expanding further, we should harden the infrastructure for expressing custom ops with the structured ops abstraction.

PiperOrigin-RevId: 284566659
2019-12-09 09:14:39 -08:00
Jacques Pienaar 70aeb4566e Add RegionRange for when need to abstract over different region iteration
Follows ValueRange in representing a generic abstraction over the different
ways to represent a range of Regions. This wrapper is not as general as ValueRange and only
considers the current cases of interest: MutableArrayRef<Region> and
ArrayRef<std::unique_ptr<Region>> as occurs during op construction vs op region
querying.

Note: ArrayRef<std::unique_ptr<Region>> allows for unset regions, so this range
returns a pointer to a Region instead of a Region.
PiperOrigin-RevId: 284563229
2019-12-09 08:57:56 -08:00