Commit Graph

244 Commits

Author SHA1 Message Date
Tres Popp 4624a1e8ac [mlir] Create a gpu.module operation for the GPU Dialect.
Summary:
This is based on the use of code constantly checking for an attribute on
a model and instead represents the distinct operaion with a different
op. Instead, this op can be used to provide better filtering.

Reviewers: herhut, mravishankar, antiagainst, rriddle

Reviewed By: herhut, antiagainst, rriddle

Subscribers: liufengdb, aartbik, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72336
2020-01-14 12:05:47 +01:00
Adrian Kuegel 018b042593 [mlir] Add loop.parallel, loop.reduce and loop.reduce.return operations.
Summary:
These operations can be used to specify a loop nest with a body that can
contain reductions. The iteration space can be iterated in any order.

RFC: https://groups.google.com/a/tensorflow.org/d/topic/mlir/pwtSgiKFPis/discussion

Differential Revision: https://reviews.llvm.org/D72394
2020-01-14 11:35:41 +01:00
River Riddle 4268e4f4b8 [mlir] Change the syntax of AffineMapAttr and IntegerSetAttr to avoid conflicts with function types.
Summary: The current syntax for AffineMapAttr and IntegerSetAttr conflict with function types, making it currently impossible to round-trip function types(and e.g. FuncOp) in the IR. This revision changes the syntax for the attributes by wrapping them in a keyword. AffineMapAttr is wrapped with `affine_map<>` and IntegerSetAttr is wrapped with `affine_set<>`.

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D72429
2020-01-13 13:24:39 -08:00
Alex Zinenko 08778d8c4f [mlir][GPU] introduce utilities for promotion to workgroup memory
Introduce a set of function that promote a memref argument of a `gpu.func` to
workgroup memory using memory attribution. The promotion boils down to
additional loops performing the copy from the original argument to the
attributed memory in the beginning of the function, and back at the end of the
function using all available threads. The loop bounds are specified so as to
adapt to any size of the workgroup. These utilities are intended to compose
with other existing utilities (loop coalescing and tiling) in cases where the
distribution of work across threads is uneven, e.g. copying a 2D memref with
only the threads along the "x" dimension. Similarly, specialization of the
kernel to specific launch sizes should be implemented as a separate pass
combining constant propagation and canonicalization.

Introduce a simple attribute-driven pass to test the promotion transformation
since we don't have a heuristic at the moment.

Differential revision: https://reviews.llvm.org/D71904
2020-01-09 10:06:00 +01:00
Nicolas Vasilache 766ce87e9b [mlir][Linalg] Lower linalg.reshape to LLVM for the static case
Summary:
This diff adds lowering of the linalg.reshape op to LLVM.

A new descriptor is created with fields initialized as follows:
1. allocatedPTr, alignedPtr and offset are copied from the source descriptor
2. sizes are copied from the static destination shape
3. strides are copied from the static strides collected with `getStridesAndOffset`

Only the static case in which the target view conforms to strided memref
semantics is supported. Other cases are left for future work and will be added on
a per-need basis.

Reviewers: ftynse, mravishankar

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72316
2020-01-08 13:07:41 -05:00
Denis Khalikov dd495e8a87 [mlir][spirv] Add lowering for std cmp ops.
Differential Revision: https://reviews.llvm.org/D72296
2020-01-07 21:51:51 -05:00
Denis Khalikov 9883b14cd1 [mlir][spirv] Add lowering for standard bit ops
Differential Revision: https://reviews.llvm.org/D72205
2020-01-07 21:45:54 -05:00
Nicolas Vasilache e3750cafdb [mlir][Linalg] Add a linalg.reshape op
Summary:
This diff adds a new operation to linalg to allow reshaping of an
existing view into a new view in the same buffer at the same offset.

More specifically:
The `linalg.reshape` op produces a new view whose sizes are a reassociation
of the original `view`. Depending on whether or not the reassociated
MemRefType is contiguous, the resulting memref may require explicit alloc
and copies.

A reassociation is defined as a continous grouping of dimensions and is
represented with a affine map array attribute. In the future, non-continous
groupings may be allowed (i.e. permutations, reindexings etc).

For now, it is assumed that either:
  1. a reassociation produces and consumes contiguous MemRefType or,
  2. the reshape op will be folded into its consumers (by changing the shape
     of the computations).
All other cases are undefined behavior and a reshape op may not lower to
LLVM if it cannot be proven statically that it does not require alloc+copy.

A reshape may either collapse or expand dimensions, depending on the
relationship between source and target memref ranks. The verification rule
is that the reassociation maps are applied to the memref with the larger
rank to obtain the memref with the smaller rank. In the case of a dimension
expansion, the reassociation maps can be interpreted as inverse maps.

Examples:

```mlir
   // Dimension collapse (i, j) -> i' and k -> k'
   %1 = linalg.reshape %0 [(i, j, k) -> (i, j),
                           (i, j, k) -> (k)] :
     memref<?x?x?xf32, stride_spec> into memref<?x?xf32, stride_spec_2>
```

```mlir
   // Dimension expansion i -> (i', j') and (k) -> (k')
   %1 = linalg.reshape %0 [(i, j, k) -> (i, j),
                           (i, j, k) -> (k)] :
     memref<?x?xf32, stride_spec> into memref<?x?x?xf32, stride_spec_2>
```

The relevant invalid and roundtripping tests are added.

Reviewers: AlexEichenberger, ftynse, rriddle, asaadaldien, yangjunpro

Subscribers: kiszk, merge_guards_bot, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72168
2020-01-06 22:21:19 -05:00
Ahmed Taei 14ee51581a [mlir][linalg] Lower linalg to affine loops
Reviewers: nicolasvasilache

Reviewed By: nicolasvasilache

Subscribers: mgester, lucyrfox, merge_guards_bot, AlexEichenberger, mravishankar, ftynse, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72094
2020-01-03 13:21:10 -05:00
Lei Zhang b3d2867769 [mlir][spirv] Fix shader ABI attribute prefix and add verification
This commit fixes shader ABI attributes to use `spv.` as the prefix
so that they match the dialect's namespace. This enables us to add
verification hooks in the SPIR-V dialect to verify them.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D72062
2020-01-03 07:44:27 -05:00
Lei Zhang 98856b22cd [mlir][spirv] Update SPIR-V enums and ops with availability spec
This commit updates gen_spirv_dialect.py to query the grammar and
generate availability spec for various enum attribute definitions
and all defined ops.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D72095
2020-01-02 14:09:02 -05:00
Nicolas Vasilache 2140a973f2 [mlir][Linalg] Extend generic ops to allow tensors
Summary:
    This diff adds support to allow `linalg.generic` and
    `linalg.indexed_generic` to take tensor input and output
    arguments.

    The subset of output tensor operand types must appear
    verbatim in the result types after an arrow. The parser,
    printer and verifier are extended to accomodate this
    behavior.

    The Linalg operations now support variadic ranked tensor
    return values. This extension exhibited issues with the
    current handling of NativeCall in RewriterGen.cpp. As a
    consequence, an explicit cast to `SmallVector<Value, 4>`
    is added in the proper place to support the new behavior
    (better suggestions are welcome).

    Relevant cleanups and name uniformization are applied.

    Relevant invalid and roundtrip test are added.

    Reviewers: mehdi_amini, rriddle, jpienaar, antiagainst, ftynse

    Subscribers: burmako, shauheen, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D72022
2020-01-02 13:54:57 -05:00
Lei Zhang a81cb1b8bf [mlir][spirv] Allow specifying availability on enum attribute cases
Lots of SPIR-V ops take enum attributes and certain enum cases
need extra capabilities or extensions to be available. This commit
extends to allow specifying availability spec on enum cases.
Extra utility functions are generated for the corresponding enum
classes to return the availability requirement. The availability
interface implemention for a SPIR-V op now goes over all enum
attributes to collect the availability requirements.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D71947
2020-01-02 13:19:44 -05:00
Lei Zhang b30d87a90b [mlir][spirv] Add basic definitions for supporting availability
SPIR-V has a few mechanisms to control op availability: version,
extension, and capabilities. These mechanisms are considered as
different availability classes.

This commit introduces basic definitions for modelling SPIR-V
availability classes. Specifically, an `Availability` class is
added to SPIRVBase.td, along with two subclasses: MinVersion
and MaxVersion for versioning. SPV_Op is extended to take a
list of `Availability`. Each `Availability` instance carries
information for generating op interfaces for the corresponding
availability class and also the concrete availability
requirements.

With the availability spec on ops, we can now auto-generate the
op interfaces of all SPIR-V availability classes and also
synthesize the op's implementations of these interfaces. The
interface generation is done via new TableGen backends
-gen-avail-interface-{decls|defs}. The op's implementation is
done via -gen-spirv-avail-impls.

Differential Revision: https://reviews.llvm.org/D71930
2019-12-27 16:25:09 -05:00
Aart Bik 1d47564a53 [VectorOps] unify vector dialect "subscripts"
PiperOrigin-RevId: 286650682
2019-12-20 15:33:04 -08:00
Aart Bik 67c019ddac [VectorOps] remove redundant returns from invalid ops test
PiperOrigin-RevId: 286640660
2019-12-20 14:27:42 -08:00
Christian Sigg 42d46b4efa Add gpu.shuffle op.
This will allow us to lower most of gpu.all_reduce (when all_reduce
doesn't exist in the target dialect) within the GPU dialect, and only do
target-specific lowering for the shuffle op.

PiperOrigin-RevId: 286548256
2019-12-20 02:52:52 -08:00
Andy Davis 8020ad3e39 [VectorOps] Update vector transfer_read/write ops to operatate on memrefs with vector element type.
Update vector transfer_read/write ops to operatate on memrefs with vector element type.
This handle cases where the memref vector element type represents the minimal memory transfer unit (or multiple of the minimal memory transfer unit).

PiperOrigin-RevId: 286482115
2019-12-19 16:05:32 -08:00
Andy Davis 1d798b1d27 [VectorOps] Add vector ReshapeOp to the VectorOps dialect.
Adds vector ReshapeOp to the VectorOps dialect. An aggregate vector reshape operation, which aggregates multiple hardware vectors, can enable optimizations during decomposition (e.g. loading one input hardware vector and performing multiple rotate and scatter store operations to the vector output).

PiperOrigin-RevId: 286440658
2019-12-19 12:27:59 -08:00
Aart Bik 15f800f4bc [VectorOps] minor cleanup: vector dialect "subscripts" are i32
Introduces some centralized methods to move towards
consistent use of i32 as vector subscripts.

Note: sizes/strides/offsets attributes are still i64
PiperOrigin-RevId: 286434133
2019-12-19 11:51:08 -08:00
Aart Bik d9b500d3bb [VectorOps] Add vector.print definition, with lowering support
Examples:

  vector.print %f : f32
  vector.print %x : vector<4xf32>
  vector.print %y : vector<3x4xf32>
  vector.print %z : vector<2x3x4xf32>

LLVM lowering replaces these with fully unrolled calls
into a small runtime support library that provides some
basic printing operations (single value, opening closing
bracket, comma, newline).

PiperOrigin-RevId: 286230325
2019-12-18 11:31:34 -08:00
Alex Zinenko 40ef46fba4 Harden the requirements to memory attribution types in gpu.func
When memory attributions are present in `gpu.func`, require that they are of
memref type and live in memoryspaces 3 and 5 for workgroup and private memory
attributions, respectively. Adapt the conversion from the GPU dialect to the
NVVM dialect to drop the private memory space from attributions as NVVM is able
to model them as local `llvm.alloca`s in the default memory space.

PiperOrigin-RevId: 286161763
2019-12-18 03:38:55 -08:00
Andy Davis 6fa3bd5b3e Add pattern rewrite which splits a vector TransferWriteOp into slices according to the unrolling/slicing scheme of its InsertSlicesOp operand.
PiperOrigin-RevId: 286042578
2019-12-17 13:17:10 -08:00
Mahesh Ravishankar 319cca3bbe Add missing virtual inliner interface method in SPIR-V dialect.
The inline interface uses two methods to check legality of inling:
1) Can a region be inlined into another.
2) Can an operation be inlined into another.
Setting the former to true, allows the inliner to use the second for
legality checks. Add this method to the SPIR-V dialect inlining
interface.

PiperOrigin-RevId: 286041734
2019-12-17 13:06:05 -08:00
Andy Davis d1fb285b32 Add pattern rewrite to forward vector tuple elements to their users.
User(TupleGetOp(ExtractSlicesOp(InsertSlicesOp(TupleOp(Producer))) -> User(Producer)

PiperOrigin-RevId: 286020249
2019-12-17 11:21:45 -08:00
Andy Davis 038ad1d856 Add pattern rewrite which splits a vector TransferReadOp into slices according to the unrolling/slicing scheme of its ExtractSlicesOp user.
PiperOrigin-RevId: 285975613
2019-12-17 07:29:06 -08:00
Andy Davis 4e825c59be Update vector op unrolling transformation to generate ExtractSlicesOp and InsertSlicesOp (instead of less structured chain of StridedSliceOps and InsertStridedSliceOps).
PiperOrigin-RevId: 285968051
2019-12-17 06:27:01 -08:00
Mahesh Ravishankar 80ec474a65 Add atomic operations to SPIR-V dialect.
Some changes to the dialect generation script to allow specification
of different base class to derive from in ODS.

PiperOrigin-RevId: 285859230
2019-12-16 15:05:51 -08:00
Lei Zhang 659150b570 [spirv] Re-enable nested loop (de)serialization test
PiperOrigin-RevId: 285849308
2019-12-16 14:21:52 -08:00
Andy Davis 11e92875f0 Add InsertSlicesOp to the VectorOps dialect.
PiperOrigin-RevId: 285830394
2019-12-16 12:56:38 -08:00
Alex Zinenko 6273fa0c6a Plug gpu.func into the GPU lowering pipelines
This updates the lowering pipelines from the GPU dialect to lower-level
dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation
instead of a standard function annotated with an attribute. In particular, the
kernel outlining is updated to produce gpu.func instead of std.func and the
individual conversions are updated to consume gpu.funcs and disallow standard
funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved
in the generic syntax, but can also be used with the custom syntax on
gpu.funcs. The special kind of function for GPU allows one to use additional
features such as memory attribution.

PiperOrigin-RevId: 285822272
2019-12-16 12:12:48 -08:00
Jose Ignacio Gomez 3ae56c4135 [Linalg] Expose subview promotion as a declarative pattern
This PR targest issue tensorflow/mlir#295. It exposes the already existing
subiew promotion pass as a declarative pattern

Change-Id: If901ebef9fb53fcd0b12ecc536f6b174ce320b92

Closes tensorflow/mlir#315

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/315 from tetuante:issue295 8e5f268b6d85f31015c33505329dbd7a4db97ac5
PiperOrigin-RevId: 285801463
2019-12-16 10:50:45 -08:00
Aart Bik cd5dab8ad7 [VectorOps] Add [insert/extract]element definition together with lowering to LLVM
Similar to insert/extract vector instructions but
(1) work on 1-D vectors only
(2) allow for a dynamic index

  %c3 = constant 3 : index
  %0 = vector.insertelement %arg0, %arg1[%c : index] : vector<4xf32>
  %1 = vector.extractelement %arg0[%c3 : index] : vector<4xf32>

PiperOrigin-RevId: 285792205
2019-12-16 09:52:46 -08:00
Andy Davis 73ec37c8bb Adds ExtractSlicesOp to the VectorOps dialect.
ExtractSlicesOp extracts slices of its vector operand and with a specified tiling scheme.
This operation centralizes the tiling scheme around a single op, which simplifies vector op unrolling and subsequent pattern rewrite transformations.

PiperOrigin-RevId: 285761129
2019-12-16 06:39:09 -08:00
Alexander Belyaev 1b579d998a [Linalg] Add test for fusion of GenericOp with IndexedGenericOp.
PiperOrigin-RevId: 285211797
2019-12-12 09:56:45 -08:00
Christian Sigg 9b85582682 Automated rollback of commit f68ac464d8
PiperOrigin-RevId: 285162061
2019-12-12 03:48:38 -08:00
Christian Sigg f68ac464d8 Switch from shfl.bfly to shfl.down.
Both work for the current use case, but the latter allows implementing
prefix sums and is a little easier to understand for partial warps.

PiperOrigin-RevId: 285145287
2019-12-12 01:28:01 -08:00
Nicolas Vasilache 508d4e672e Continue refactoring StructuredOps utilities
This CL adds more common information to StructuredOpsUtils.h
The n_view attribute is retired in favor of args_in + args_out but the CL is otherwise NFC.

PiperOrigin-RevId: 285000621
2019-12-11 09:27:34 -08:00
Alexander Belyaev bae8a7a724 [Linalg] Add tiling for IndexedGenericOp with a region.
PiperOrigin-RevId: 284949355
2019-12-11 02:56:40 -08:00
Andy Davis 4d8ba88610 Add VectorOp transform pattern which splits vector TransferReadOps to target vector unroll size.
PiperOrigin-RevId: 284880592
2019-12-10 17:02:51 -08:00
Nicolas Vasilache 995048d7b7 Fold TestLinalgTilePermutePatterns into TestLinalgTransformPatterns - NFC
Centralize all patterns that test Linalg transforms in a single pass.

PiperOrigin-RevId: 284835938
2019-12-10 13:26:15 -08:00
Jose Ignacio Gomez b19fed5415 [Linalg] Add a Linalg iterator permutation transformation
This patch closes issue tensorflow/mlir#272
We add a standalone iterator permutation transformation to Linalg.
This transformation composes a permutation map with the maps in the
"indexing_maps" attribute. It also permutes "iterator_types"
accordingly.

Change-Id: I7c1e693b8203aeecc595a7c012e738ca1100c857

Closes tensorflow/mlir#307

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/307 from tetuante:issue272 f7908d58792f4111119721885e247045104f1131
PiperOrigin-RevId: 284824102
2019-12-10 12:25:43 -08:00
Nicolas Vasilache ad38e49806 Uniformize Vector transforms as patterns on the model of Linalg - NFC
This reorganizes the vector transformations to be more easily testable as patterns and more easily composable into fused passes in the future.

PiperOrigin-RevId: 284817474
2019-12-10 11:54:33 -08:00
Aart Bik 1fe65688d4 [VectorOps] Add a ShuffleOp to the VectorOps dialect
For example

 %0 = vector.shuffle %x, %y [3 : i32, 2 : i32, 1 : i32, 0 : i32] : vector<2xf32>, vector<2xf32>

yields a vector<4xf32> result with a permutation of the elements of %x and %y

PiperOrigin-RevId: 284657191
2019-12-09 16:15:41 -08:00
Aart Bik 0e963b9c42 [VectorOps] Fix off-by-one error in insert/extract validation
PiperOrigin-RevId: 284652653
2019-12-09 15:54:23 -08:00
Denis Khalikov 34265dad65 [spirv] Add CompositeConstruct operation.
Closes tensorflow/mlir#308

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/308 from denis0x0D:sandbox/composite_construct 9ef7180f77f9374bcd05afc4f9e6c1d2d72d02b7
PiperOrigin-RevId: 284613617
2019-12-09 12:43:53 -08:00
Lei Zhang 2c7e8ed7c6 [spirv] Add spv.IAdd, spv.ISub, and spv.IMul folders
The patterns to be folded away can be commonly generated
during lowering to SPIR-V.

PiperOrigin-RevId: 284604855
2019-12-09 11:59:10 -08:00
Kazuaki Ishizaki ae05cf27c6 Minor spelling tweaks
Closes tensorflow/mlir#304

PiperOrigin-RevId: 284568358
2019-12-09 09:23:48 -08:00
Nicolas Vasilache 91c0074624 [StructuredOps][Linalg] Add a primitive pattern to rewrite the linalg.generic form of matmul to vector form.
This CL uses the newly expanded matcher support to easily detect when a linalg.generic has a multiply-accumulate body. A linalg.generic with such a body is rewritten as a vector contraction.
This CL additionally limits the rewrite to the case of matrix multiplication on contiguous and statically shaped memrefs for now.

Before expanding further, we should harden the infrastructure for expressing custom ops with the structured ops abstraction.

PiperOrigin-RevId: 284566659
2019-12-09 09:14:39 -08:00
Aart Bik d37f27251f [VecOps] Rename vector.[insert|extract]element to just vector.[insert|extract]
Since these operations lower to [insert|extract][element|value] at LLVM
dialect level, neither element nor value would correctly reflect the meaning.

PiperOrigin-RevId: 284240727
2019-12-06 12:39:25 -08:00