llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazuaki Ishizaki	5aacce3db2	[mlir] NFC: Fix trivial typo Differential Revision: https://reviews.llvm.org/D77473	2020-04-05 11:30:30 +09:00
Alex Grosul	3a5192098c	[mlir][VectorOps] Implement canonicalization for TransposeOp. Two back-to-back transpose operations are combined into a single transpose, which uses a combination of their permutation vectors. Differential Revision: https://reviews.llvm.org/D77331	2020-04-02 18:36:40 -07:00
Uday Bondhugula	7c771631c6	[MLIR][NFC] drop unnecessary matches in affine dma generate test case Drop unnecessary matches in affine DMA generate test case. Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77243	2020-04-02 03:02:07 +05:30
Uday Bondhugula	5e8093134a	[MLIR] Add method to drop duplicate result exprs from AffineMap Add a method that given an affine map returns another with just its unique results. Use this to drop redundant bounds in max/min for affine.for. Update affine.for's canonicalization pattern and createCanonicalizedForOp to use this. Differential Revision: https://reviews.llvm.org/D77237	2020-04-02 03:00:19 +05:30
Uday Bondhugula	572890f1d3	[MLIR][NFC] clean up affine data copy test case Capture maps to test better; drop unnecessary matches Differential Revision: https://reviews.llvm.org/D77196	2020-04-01 22:37:49 +05:30
Hanhan Wang	6dd696ae4f	[mlir][Linalg] Extend fusion to support WAW atm on buffers. Summary: The RAW fusion happens only if the producer block dominates the consumer block. The WAW pattern also works with the precondition. I.e., if a producer can dominate the consumer, they can fairly fuse together. Since they are all tilable, we can think of the pattern this way: Input: ``` linalg_op1 view tile_loop subview_2 linalg_op2 subview_2 ``` Tile the first Linalg op the same way as the second Linalg op. ``` tile_loop subview_1 linalg_op1 subview_1 tile_loop subview_2 linalg_op2 subview_2 ``` Since the first Linalg op is tilable in the same way and the computations are independent, it's fair to fuse it with the second Linalg op. ``` tile_loop subview_1 linalg_op1 subview_1 linalg_op2 subview_2 ``` In short, this patch includes: - Handling both RAW and WAW pattern. - Adding an interface method to get input and output buffers. - Exposing a method to get a StringRef of a dependency type. - Fixing existing WAW tests and add one more use case: initialize the buffer before conv op. Differential Revision: https://reviews.llvm.org/D76897	2020-03-31 21:33:50 -07:00
Hanhan Wang	69ddee1d2a	[mlir][Linalg] Introduce linalg.pooling_min/max/sum op. Summary: Performs an N-D pooling operation similarly to the description in the TF documentation: https://www.tensorflow.org/api_docs/python/tf/nn/pool Different from the description, this operation doesn't perform on batch and channel. It only takes tensors of rank `N`. ``` output[x[0], ..., x[N-1]] = REDUCE_{z[0], ..., z[N-1]} input[ x[0] * strides[0] - pad_before[0] + dilation_rate[0]*z[0], ... x[N-1]*strides[N-1] - pad_before[N-1] + dilation_rate[N-1]*z[N-1] ], ``` The required optional arguments are: - strides: an i64 array specifying the stride (i.e. step) for window loops. - dilations: an i64 array specifying the filter upsampling/input downsampling rate - padding: an i64 array of pairs (low, high) specifying the number of elements to pad along a dimension. If strides or dilations attributes are missing then the default value is one for each of the input dimensions. Similarly, padding values are zero for both low and high in each of the dimensions, if not specified. Differential Revision: https://reviews.llvm.org/D76414	2020-03-31 21:21:54 -07:00
Alex Grosul	855e738be2	[VectorOps] Implement a simple folder for identity vector.transpose operations. Differential Revision: https://reviews.llvm.org/D77088	2020-03-31 17:03:10 -07:00
MaheshRavishankar	da7b6fe942	[mlir][Linalg] Allow tiling of batch dimension for convolution ops with padding. Existing tiling implementation of Linalg would still work for tiling the batch dimensions of the convolution op. Differential Revision: https://reviews.llvm.org/D76637	2020-03-31 09:22:38 -07:00
Andy Davis	31a346cc35	[MLIR][Vector] Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp. Summary: Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp. Vector-to-vector transformations for unrolling and lowering to hardware vectors can generate chains of structured vector operations (InsertSlicesOp, ExtractSlicesOp and ShapeCastOp) between the producer of a hardware vector value and its consumer. Because InsertSlicesOp, ExtractSlicesOp and ShapeCastOp are structured, we can track the location (tuple index and vector offsets) of the consumer vector value through the chain of structured operations to the producer, enabling a much more powerful producer-consumer forwarding of values through structured ops and tuple, which in turn enables a more powerful TupleGetOp folding transformation. Reviewers: nicolasvasilache, aartbik Reviewed By: aartbik Subscribers: grosul1, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76889	2020-03-31 08:39:17 -07:00
Ahmed Taei	221fa96cd4	Fix linalg.generic access of hoisted constants Summary: Otherwise the added @generic_const_int will fail Reviewers: nicolasvasilache, rriddle, mravishankar Subscribers: mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77109	2020-03-30 21:15:41 -07:00
Mehdi Amini	111c932855	Fix test: add `-allow-unregistered-dialect` to Affine/loop-permute.mlir test (missed during rebase)	2020-03-30 20:14:47 +00:00
Mehdi Amini	bab5bcf8fd	Add a flag on the context to protect against creation of operations in unregistered dialects Differential Revision: https://reviews.llvm.org/D76903	2020-03-30 19:37:31 +00:00
Uday Bondhugula	f273e5c507	[MLIR] Fix permuteLoops utility Rewrite mlir::permuteLoops (affine loop permutation utility) to fix incorrect approach. Avoiding using sinkLoops entirely - use single move approach. Add test pass. This fixes https://bugs.llvm.org/show_bug.cgi?id=45328 Depends on D77003. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Differential Revision: https://reviews.llvm.org/D77004	2020-03-30 23:38:23 +05:30
Uday Bondhugula	43a95a543f	[MLIR] Introduce full/partial tile separation using if/else This patch introduces a utility to separate full tiles from partial tiles when tiling affine loop nests where trip counts are unknown or where tile sizes don't divide trip counts. A conditional guard is generated to separate out the full tile (with constant trip count loops) into the then block of an 'affine.if' and the partial tile to the else block. The separation allows the 'then' block (which has constant trip count loops) to be optimized better subsequently: for eg. for unroll-and-jam, register tiling, vectorization without leading to cleanup code, or to offload to accelerators. Among techniques from the literature, the if/else based separation leads to the most compact cleanup code for multi-dimensional cases (because a single version is used to model all partial tiles). INPUT affine.for %i0 = 0 to %M { affine.for %i1 = 0 to %N { "foo"() : () -> () } } OUTPUT AFTER TILING W/O SEPARATION map0 = affine_map<(d0) -> (d0)> map1 = affine_map<(d0)[s0] -> (d0 + 32, s0)> affine.for %arg2 = 0 to %M step 32 { affine.for %arg3 = 0 to %N step 32 { affine.for %arg4 = #map0(%arg2) to min #map1(%arg2)[%M] { affine.for %arg5 = #map0(%arg3) to min #map1(%arg3)[%N] { "foo"() : () -> () } } } } OUTPUT AFTER TILING WITH SEPARATION map0 = affine_map<(d0) -> (d0)> map1 = affine_map<(d0) -> (d0 + 32)> map2 = affine_map<(d0)[s0] -> (d0 + 32, s0)> #set0 = affine_set<(d0, d1)[s0, s1] : (-d0 + s0 - 32 >= 0, -d1 + s1 - 32 >= 0)> affine.for %arg2 = 0 to %M step 32 { affine.for %arg3 = 0 to %N step 32 { affine.if #set0(%arg2, %arg3)[%M, %N] { // Full tile. affine.for %arg4 = #map0(%arg2) to #map1(%arg2) { affine.for %arg5 = #map0(%arg3) to #map1(%arg3) { "foo"() : () -> () } } } else { // Partial tile. affine.for %arg4 = #map0(%arg2) to min #map2(%arg2)[%M] { affine.for %arg5 = #map0(%arg3) to min #map2(%arg3)[%N] { "foo"() : () -> () } } } } } The separation is tested via a cmd line flag on the loop tiling pass. The utility itself allows one to pass in any band of contiguously nested loops, and can be used by other transforms/utilities. The current implementation works for hyperrectangular loop nests. Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Differential Revision: https://reviews.llvm.org/D76700	2020-03-28 06:58:35 +05:30
Uday Bondhugula	92744f6247	[MLIR] Add flat affine constraints method to round trip integer set - add method to get back an integer set from flat affine constraints; this allows a round trip - use this to complete the simplification of integer sets in -simplify-affine-structures - update FlatAffineConstraints::removeTrivialRedundancy to also do GCD tightening and normalize by GCD (while still keeping it linear time). Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>	2020-03-26 12:07:13 +05:30
Frej Drejhammar	d8981ce5b9	[mlir][Parser] Fix attribute parser errors for ui64 Summary: The attribute parser fails to correctly parse unsigned 64 bit attributes as the check `isNegative ? (int64_t)-val.getValue() >= 0 : (int64_t)val.getValue() < 0` will falsely detect an overflow for unsigned values larger than 2^63-1. This patch reworks the overflow logic to use APInt::isSignBitSet() and knowledge of the attribute type instead of doing arithmetic on int64_t. Test-cases which verify the de-facto behavior of the parser and triggered the previous faulty handling of unsigned 64 bit attributes are also added. Differential Revision: https://reviews.llvm.org/D76493	2020-03-25 11:57:16 -07:00
aartbik	8d46bfa808	[mlir] [VectorOps] A "reference" lowering of vector.transpose to LLVM IR Summary: Makes the vector.transpose runnable on CPU. Reviewers: nicolasvasilache, andydavis1, rriddle Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76644	2020-03-23 19:01:38 -07:00
Uday Bondhugula	b873761496	[MLIR][NFC] Move some of the affine transforms / tests to dialect dirs Move some of the affine transforms and their test cases to their respective dialect directory. This patch does not complete the move, but takes care of a good part. Renames: prefix 'affine' to affine loop tiling cl options, vectorize -> super-vectorize Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Differential Revision: https://reviews.llvm.org/D76565	2020-03-23 08:25:07 +05:30
River Riddle	e9482ed194	[mlir] Move several static cl::opts to be pass options instead. This removes the reliance on global options, and also simplifies the pass registration. Differential Revision: https://reviews.llvm.org/D76552	2020-03-22 03:16:21 -07:00
aartbik	479ee11061	[mlir] [VectorOps] Introduce vector.transpose Summary: Introduced in order to introduce specialized lowering passes that implement transposition operations efficiently. Reviewers: nicolasvasilache, andydavis1 Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76460	2020-03-20 15:47:44 -07:00
Rob Suderman	e708471395	[mlir][NFC] Cleanup AffineOps directory structure Summary: Change AffineOps Dialect structure to better group both IR and Transforms. This included extracting transforms directly related to AffineOps. Also move AffineOps to Affine. Differential Revision: https://reviews.llvm.org/D76161	2020-03-20 14:23:43 -07:00
Ahmed Taei	08a9147349	[mlir][LLVMIR] Fix fusion for rank-0 tensors Summary: This diff fixes fusion crashing for ops with rank-0 tensors. Reviewers: mravishankar, nicolasvasilache, rriddle Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76479	2020-03-20 13:17:19 -07:00
Nicolas Vasilache	462db62053	[mlir][AVX512] Start a primitive AVX512 dialect The Vector Dialect [document](https://mlir.llvm.org/docs/Dialects/Vector/) discusses the vector abstractions that MLIR supports and the various tradeoffs involved. One of the layers that is missing in OSS atm is the Hardware Vector Ops (HWV) level. This revision proposes adding a new Dialect/Targets/AVX512 dialect that would directly target AVX512-specific intrinsics. Atm, we rely too much on LLVM’s peephole optimizer to do a good job from small insertelement/extractelement/shufflevector. In the future, when possible, generic abstractions such as VP intrinsics should be preferred. The revision will allow trading off HW-specific vs generic abstractions in MLIR. Differential Revision: https://reviews.llvm.org/D75987	2020-03-20 14:11:57 -04:00
Valentin Clement	d4d62fcab6	[MLIR] Add test for multiple gpu.all_reduce in the same kernel when lowering to NVVM Summary: This patch adds tests for lowering multiple `gpu.all_reduce` operations in the same kernel. This was previously failing. Differential Revision: https://reviews.llvm.org/D75930	2020-03-19 16:36:38 +01:00
Shraiysh Vaishay	ff77397fcf	[mlir] Added llvm.resume and personality functions in LLVM IR Dialect `llvm.resume` is similar to `llvm.return` except that it must have exactly one operand, which should be derived from a `llvm.landingpad` instruction. Any function having a `llvm.landingpad` instruction must have a personality attribute. Example: LLVM IR ``` define dso_local i32 @main() personality i32 (...)* @__gxx_personality_v0 { invoke void @foo(i32 42) to label %3 unwind label %1 1: ; preds = %0 %2 = landingpad i8* catch i8** @_ZTIi catch i8* bitcast (i8** @_ZTIi to i8*) resume i8* %2 3: ; preds = %0 ret i32 1 } ``` MLIR - LLVM IR Dialect ``` llvm.func @main() -> !llvm.i32 attributes {personality = @__gxx_personality_v0} { %0 = llvm.mlir.constant(1 : i32) : !llvm.i32 %1 = llvm.mlir.addressof @_ZTIi : !llvm<"i8**"> %2 = llvm.bitcast %1 : !llvm<"i8**"> to !llvm<"i8*"> %3 = llvm.mlir.addressof @_ZTIi : !llvm<"i8**"> %4 = llvm.mlir.constant(42 : i32) : !llvm.i32 llvm.invoke @foo(%4) to ^bb2 unwind ^bb1 : (!llvm.i32) -> () ^bb1: // pred: ^bb0 %5 = llvm.landingpad (catch %3 : !llvm<"i8**">) (catch %2 : !llvm<"i8*">) : !llvm<"i8*"> llvm.resume %5 : !llvm<"i8*"> ^bb2: // pred: ^bb0 llvm.return %0 : !llvm.i32 } ``` Differential Revision: https://reviews.llvm.org/D71888	2020-03-19 13:14:25 +01:00
Alex Zinenko	bc18624b40	[mlir] vector.type_cast: disallow memrefs with layout in verifier Summary: These are not supported by any of the code using `type_cast`. In the general case, such casting would require memrefs to handle a non-contiguous vector representation or misaligned vectors (e.g., if the offset of the source memref is not divisible by vector size, since offset in the target memref is expressed in the number of elements). Differential Revision: https://reviews.llvm.org/D76349	2020-03-19 10:15:41 +01:00
Lei Zhang	58df5e6d9a	[mlir][spirv] Plumbing target environment into type converter This commit unifies target environment queries into a new wrapper class spirv::TargetEnv and shares across various places needing the functionality. We still create multiple instances of TargetEnv though given the parent components (type converters, passes, conversion targets) have different lifetimes. In the meantime, LowerABIAttributesPass is updated to take into consideration the target environment, which requires updates to tests to provide that. Differential Revision: https://reviews.llvm.org/D76242	2020-03-18 20:11:05 -04:00
Lei Zhang	3b35f9d8b5	[mlir][spirv] Use memref memory space for storage class Previously in SPIRVTypeConverter, we always convert memref types to StorageBuffer regardless of their memory spaces. This commit fixes that to let the conversion look into memory space properly. For this purpose, a mapping between SPIR-V storage class and memref memory space is introduced. The mapping is arbitrarily decided at the moment and the hope is that we can leverage string memory space later to be more clear. Now spv.interface_var_abi cannot contain storage class unless it's attached to a scalar value, where we need the storage class as side channel information. Verifications and tests are properly adjusted. Differential Revision: https://reviews.llvm.org/D76241	2020-03-18 20:11:04 -04:00
Nicolas Vasilache	2fae7878d5	[mlir][Vector] Mostly-NFC - Restructure options for lowering to LLVM Matrix Intrinsics Summary: This revision restructures the calling of vector transforms to make it more flexible to ask for lowering through LLVM matrix intrinsics. This also makes sure we bail out in degenerate cases (i.e. 1) in which LLVM complains about not being able to scalarize. Differential Revision: https://reviews.llvm.org/D76266	2020-03-17 22:58:02 -04:00
Rob Suderman	4d60f47b08	[mlir][NFC] Renamed VectorOps to Vector Summary: Renamed VectorOps to Vector to avoid the redundant Ops suffix. Differential Revision: https://reviews.llvm.org/D76317	2020-03-17 15:28:08 -07:00
Rob Suderman	363dd3f394	[mlir][NFC] Rename QuantOps to Quant Summary: Renamed QuantOps to Quant to avoid the Ops suffix. All dialects will contain ops, so the Ops suffix is redundant. Differential Revision: https://reviews.llvm.org/D76318	2020-03-17 15:16:47 -07:00
Jacques Pienaar	9a65d683e0	[mlir] Add target for Shape dialect Summary: Add targets and basic printing/parsing of types in Shape dialect. Differential Revision: https://reviews.llvm.org/D76321	2020-03-17 14:54:25 -07:00
Sagar Jain	76cf14035b	[MLIR] Added llvm.fence This patch adds llvm.fence. I tried not to change the syntax much. syntax: LLVM IR `fence [syncscope("<target-scope>")] <ordering>` MLIR LLVM Dialect `llvm.fence [syncscope("<target-scope>")] <ordering>` example: LLVM IR: `fence syncscope("agent") seq_cst` MLIR: `llvm.fence syncscope("agent") seq_cst` Differential Revision: https://reviews.llvm.org/D75645	2020-03-17 17:53:37 +01:00
Alex Zinenko	e119980f3f	[mlir] LLVM dialect: move ensureDistinctSuccessors out of std->LLVM conversion MLIR supports terminators that have the same successor block with different block operands, which cannot be expressed in the LLVM's phi-notation as the block identifier is used to tell apart the predecessors. This limitation can be worked around by branching to a new block instead, with this new block unconditionally branching to the original successor and forwarding the argument. Until now, this transformation was performed during the conversion from the Standard to the LLVM dialect. This does not scale well to multiple dialects targeting the LLVM dialect as all of them would have to be aware of this limitation and perform the preparatory transformation. Instead, do it as a separate pass and run it immediately before the translation. Differential Revision: https://reviews.llvm.org/D75619	2020-03-17 15:22:14 +01:00
Uday Bondhugula	332f0b3cd4	Affine expr simplification for add of const multiple of same expression - Detect "c_1 * expr + c_2 * expr" as (c_1 + c_2) * expr - subsumes things like 'expr - expr' and "expr * -1 + expr" as 0. - change AffineConstantExpr ctor to allow default null init Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Differential Revision: https://reviews.llvm.org/D76233	2020-03-17 08:22:17 +05:30
Feng Liu	166f83f436	[QuantOps] Add the quant region definition Summary: This regional op in the QuantOps dialect will be used to wrap high-precision ops into atomic units for quantization. All the values used by the internal ops are captured explicitly by the op inputs. The quantization parameters of the inputs and outputs are stored in the attributes. Subscribers: jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75972	2020-03-16 15:44:43 -07:00
Hanhan Wang	92f7e8133a	[mlir][Linalg] Implement padding for linalg.conv and lowering to loops. Summary: To enable this, two changes are needed: 1) Add an optional attribute `padding` to linalg.conv. 2) Compute if the indices accessing is out of bound in the loops. If so, use the padding value `0`. Otherwise, use the value derived from load. In the patch, the padding only works for lowering without other transformations, e.g., tiling, fusion, etc. Differential Revision: https://reviews.llvm.org/D75722	2020-03-13 14:35:58 -07:00
Nicolas Vasilache	bbf3ef8541	[mlir][Vector]Lower vector.contract to llvm.intr.matrix_multiply Summary: This revision adds lowering of vector.contract to llvm.intr.matrix_multiply. Note that there is currently a mismatch between the MLIR vector dialect which expects row-major layout and the LLVM matrix intrinsics which expect column major layout. As a consequence, we currently only match a vector.contract with indexing maps that express column-major matrix multiplication. Other cases would require additional transposes and it is better to wait for LLVM intrinsics to provide a per-operation attribute that would specify which layout is expected. A separate integration test, not submitted to MLIR core, has independently verified that correct execution occurs on a 2x2x2 matrix multiplication. Differential Revision: https://reviews.llvm.org/D76014	2020-03-13 16:33:23 -04:00
aartbik	a213ece30b	[mlir] [VectorOps,LinAlg] Remove direct LLVM lowering for vector.broadcast Summary: The direct lowering of vector.broadcast into LLVM has been replaced by progressive lowering into elementary vector ops. This also required a small refactoring of a llvm.mlir test that used a direct vector.broadcast operator (just to define a matmul). Reviewers: nicolasvasilache, andydavis1, rriddle Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76143	2020-03-13 11:42:51 -07:00
Lei Zhang	e5c85a5a4f	[mlir][spirv] Support querying type extension/capability requirements Previously we only consider the version/capability/extension requirements on ops themselves. Some types in SPIR-V also require special extensions or capabilities to be used. For example, non-32-bit integers/floats will require different capabilities and/or extensions depending on where they are used because it may mean special hardware abilities. This commit adds query methods to SPIR-V type class hierarchy to support querying extensions and capabilities. We don't go through ODS for auto-generating such information given that we don't have them in SPIR-V machine readable grammar and there are just a few types. Differential Revision: https://reviews.llvm.org/D75875	2020-03-12 19:37:45 -04:00
Lei Zhang	3148f10b17	[mlir][spirv] Use spv.vce in spv.module and wire up (de)serialization This commit changes the definition of spv.module to use the #spv.vce attribute for specifying (version, capabilities, extensions) triple so that we can have better API and custom assembly form. Since now we have proper modelling of the triple, (de)serialization is wired up to use them. With the new UpdateVCEPass, we don't need to manually specify the required extensions and capabilities anymore when creating a spv.module. One just needs to call UpdateVCEPass before serialization to get the needed version/extensions/capabilities. Differential Revision: https://reviews.llvm.org/D75872	2020-03-12 19:37:45 -04:00
Lei Zhang	9414db1090	[mlir][spirv] Add a pass to deduce version/extension/capability Creates an operation pass that deduces and attaches the minimal version/ capabilities/extensions requirements for spv.module ops. For each spv.module op, this pass requires a `spv.target_env` attribute on it or an enclosing module-like op to drive the deduction. The reason is that an op can be enabled by multiple extensions/capabilities. So we need to know which one to pick. `spv.target_env` gives the hard limit as for what the target environment can support; this pass deduces what are actually needed for a specific spv.module op. Differential Revision: https://reviews.llvm.org/D75870	2020-03-12 19:37:45 -04:00
Lei Zhang	e115a40f50	[mlir][spirv] Use separate attribute for (version, capabilities, extensions) We also need the (version, capabilities, extensions) triple on the spv.module op. Thus far we have been using separate 'extensions' and 'capabilities' attributes there and 'version' is missing. Creating a separate attribute for the triple allows us to reuse the assembly form and verification. Differential Revision: https://reviews.llvm.org/D75868	2020-03-12 19:37:45 -04:00
River Riddle	907403f342	[mlir] Add a new `ConstantLike` trait to better identify operations that represent a "constant". The current mechanism for identifying is a bit hacky and extremely adhoc, i.e. we explicit check 1-result, 0-operand, no side-effect, and always foldable and then assume that this is a constant. Adding a trait adds structure to this, and makes checking for a constant much more efficient as we can guarantee that all of these things have already been verified. Differential Revision: https://reviews.llvm.org/D76020	2020-03-12 14:26:15 -07:00
River Riddle	7c211cf3af	[mlir][NFC] Move the definition of AffineApplyOp to ODS This has been a long standing cleanup TODO. Differential Revision: https://reviews.llvm.org/D76019	2020-03-12 14:26:15 -07:00
aartbik	078776a679	[mlir] [VectorOps] Progressively lower vector.outerproduct to LLVM Summary: This replaces the direct lowering of vector.outerproduct to LLVM with progressive lowering into elementary vectors ops to avoid having the similar lowering logic at several places. NOTE1: with the new progressive rule, the lowered llvm is slightly more elaborate than with the direct lowering, but the generated assembly is just as optimized; still if we want to stay closer to the original, we should add a "broadcast on extract" to shuffle rewrite (rather than special cases all the lowering steps) NOTE2: the original outerproduct lowering code should now be removed but some linalg test work directly on vector and contain some dead code, so this requires another CL Reviewers: nicolasvasilache, andydavis1 Reviewed By: nicolasvasilache, andydavis1 Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75956	2020-03-12 13:45:42 -07:00
Valentin Clement	c7380995f8	[MLIR] Add `and`, `or`, `xor`, `min`, `max` too gpu.all_reduce and the nvvm lowering Summary: This patch adds some builtin operations for the gpu.all_reduce op. - for Integer only: `and`, `or`, `xor` - for Float and Integer: `min`, `max` This is useful for higher level dialects like OpenACC or OpenMP that can lower to the GPU dialect. Differential Revision: https://reviews.llvm.org/D75766	2020-03-11 14:07:04 +01:00
Stephan Herhut	f6790a1c63	Revert "[MLIR] Add `and`, `or`, `xor`, `min`, `max` too gpu.all_reduce and the nvvm lowering" Attribution to original author got lost.	2020-03-11 14:07:04 +01:00
Stephan Herhut	2eff566b07	[MLIR] Add `and`, `or`, `xor`, `min`, `max` too gpu.all_reduce and the nvvm lowering Summary: This patch adds some builtin operations for the gpu.all_reduce op. - for Integer only: `and`, `or`, `xor` - for Float and Integer: `min`, `max` This is useful for higher level dialects like OpenACC or OpenMP that can lower to the GPU dialect. Differential Revision: https://reviews.llvm.org/D75766	2020-03-10 21:09:06 +01:00
Nicolas Vasilache	47ec8702cb	[mlir][Linalg] Revisit 0-D abstraction This revision takes advantage of the empty AffineMap to specify the 0-D edge case. This allows removing a bunch of annoying corner cases that ended up impacting users of Linalg. Differential Revision: https://reviews.llvm.org/D75831	2020-03-10 15:14:09 -04:00
Nicolas Vasilache	90322403c2	[mlir][Vector] Allow lowering of vector.shape_cast 2D <-> 1D Summary: This will support the progressive lowering of: ``` vector.contract -> downcast + vector.matrix_multiply + upcast -> llvm.intr.matrix ``` Differential Revision: https://reviews.llvm.org/D75776	2020-03-09 13:14:39 -04:00
Alexander Belyaev	3147342ae7	[MLIR] Change custom printer/parser for loop.parallel and loop.reduce.	2020-03-09 15:11:48 +01:00
aartbik	0d924700a6	[mlir] [VectorOps] Merge VectorReduction/VectorReductionV2 into one Op Summary: Paying off some technical debt in VectorOps, where I introduced a special op for a fused accumulator into reduction to avoid some issues around printing and parsing an optional accumulator. This CL merges the two into one op again and does things the right way (still would be nice to have "assemblyFormat" for optional operands though....). Reviewers: nicolasvasilache, andydavis1, ftynse, rriddle Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75699	2020-03-05 13:07:31 -08:00
River Riddle	cb1777127c	[mlir] Remove successor operands from the Operation class Summary: This revision removes all of the functionality related to successor operands on the core Operation class. This greatly simplifies a lot of handling of operands, as well as successors. For example, DialectConversion no longer needs a special "matchAndRewrite" for branching terminator operations.(Note, the existing method was also broken for operations with variadic successors!!) This also enables terminator operations to define their own relationships with successor arguments, instead of the hardcoded "pass-through" behavior that exists today. Differential Revision: https://reviews.llvm.org/D75318	2020-03-05 12:53:02 -08:00
River Riddle	01f7431b5b	[mlir][DeclarativeParser] Add support for formatting operations with AttrSizedOperandSegments. This attribute details the segment sizes for operand groups within the operation. This revision add support for automatically populating this attribute in the declarative parser. Differential Revision: https://reviews.llvm.org/D75315	2020-03-05 12:51:28 -08:00
River Riddle	c0fd5e657e	[mlir] Add traits for verifying the number of successors and providing relevant accessors. This allows for simplifying OpDefGen, as well providing specializing accessors for the different successor counts. This mirrors the existing traits for operands and results. Differential Revision: https://reviews.llvm.org/D75313	2020-03-05 12:49:59 -08:00
MaheshRavishankar	3f44495dfd	[mlir][GPU] Expose the functionality to create a GPUFuncOp from a LaunchOp The current setup of the GPU dialect is to model both the host and device side codegen. For cases (like IREE) the host side modeling might not directly fit its use case, but device-side codegen is still valuable. First step in accessing just the device-side functionality of the GPU dialect is to allow just creating a gpu.func operation from a gpu.launch operation. In addition this change also "inlines" operations into the gpu.func op at time of creation instead of this being a later step. Differential Revision: https://reviews.llvm.org/D75287	2020-03-05 11:03:51 -08:00
MaheshRavishankar	755c050200	[mlir][Linalg] Fix load/store operations generated while lowering to loops when output has zero rank. While lowering to loops, no indices should be used in the load/store operation if the buffer is zero-rank. Differential Revision: https://reviews.llvm.org/D75391	2020-03-04 17:04:30 -08:00
Lei Zhang	9600b55ac8	[mlir][spirv] Support integer signedness This commit updates SPIR-V dialect to support integer signedness by relaxing various checks for signless to just normal integers. The hack for spv.Bitcast can now be removed. Differential Revision: https://reviews.llvm.org/D75611	2020-03-04 15:14:11 -05:00
Lei Zhang	5b2cc6c3d0	[mlir][ods] Improve integer signedness modelling A previous commit added support for integer signedness in C++ IntegerType. This change introduces ODS definitions for integer types and integer (element) attributes w.r.t. signedness. This commit also updates various existing definitions' descriptions to mention signless where suitable to make it more clear. Positive and non-negative integer attributes are removed to avoid the explosion of subclasses. Instead, one should use more atomic constraints together with Confined to model that. For example, `Confined<..., [IntPositive]>`. Differential Revision: https://reviews.llvm.org/D75610	2020-03-04 15:05:42 -05:00
Alexander Belyaev	0145a26c65	[MLIR] Add explicit initial values for loop.parallel op. Differential Revision: https://reviews.llvm.org/D75206	2020-03-03 15:36:10 +01:00
Sagar Jain	d85821dfa6	[MLIR] Added llvm.freeze This patch adds llvm.freeze & processes undef constants from LLVM IR. Syntax: LLVM IR `<result> = freeze ty <val>` MLIR LLVM Dialect: `llvm.freeze val attr-dict : type` Example: LLVM IR: `%3 = freeze i32 5` MLIR: `%6 = llvm.freeze %5 : !llvm.i32` Differential Revision: https://reviews.llvm.org/D75329	2020-03-02 10:24:01 +01:00
Benjamin Kramer	5abf128d64	Add a pass that specializes parallel loops for easier unrolling and vectorization This matches loops with an affine.min upper bound, limiting the trip count to a constant, and rewrites them into two loops, one with constant upper bound and one with variable upper bound. The assumption is that the constant upper bound loop will be unrolled and vectorized, which is preferable if this is the hot path. Differential Revision: https://reviews.llvm.org/D75240	2020-02-28 19:47:23 +01:00
Alex Zinenko	54e5600e4d	[mlir] fix wrong symbol order in AffineApplyNormalizer Summary: AffineApplyNormalizer provides common logic for folding affine maps that appear in affine.apply into other affine operations that use the result of said affine.apply. In the process, affine maps of both operations are composed. During the composition `A.compose(B)` the symbols from the map A are placed before those of the map B in a single concatenated symbol list. However, AffineApplyNormalizer was ordering the operands of the operation being normalized by iteratively appending the symbols into a single list according to the operand order, regardless of whether these operands are symbols of the current operation or of the map that is being folded into it. This could lead to wrong order of symbols and, when the symbols were bound to constant values, to visibly incorrect folding of constants into affine maps as reported in PR45031. Make sure symbols operands to the current operation are always placed before symbols coming from the folded maps. Update the test that was exercising the incorrect folder behavior. For some reason, the order of symbol operands was swapped in the test input compared to the previous operations, making it easy to assume the correct maps were produced whereas they were swapping the symbols back due to the problem described above. Closes https://bugs.llvm.org/show_bug.cgi?id=45031 Differential Revision: https://reviews.llvm.org/D75247	2020-02-27 15:15:29 +01:00
Lei Zhang	5bc6ff6455	[mlir][spirv] Add some folders for spv.LogicalAnd/spv.LogicalOr This commit handles folding spv.LogicalAnd/spv.LogicalOr when one of the operands is constant true/false. Differential Revision: https://reviews.llvm.org/D75195	2020-02-26 15:13:37 -05:00
Alex Zinenko	305320b005	[mlir] NFC: move AffineOps tests from test/ to test/Dialect AffineOps dialect lives under lib/Dialect/AffineOps and so should its tests.	2020-02-25 14:20:40 +01:00
Stephan Herhut	7a7eacc797	[MLIR][GPU] Implement a simple greedy loop mapper. Summary: The mapper assigns annotations to loop.parallel operations that are compatible with the loop to gpu mapping pass. The outermost loop uses the grid dimensions, followed by block dimensions. All remaining loops are mapped to sequential loops. Differential Revision: https://reviews.llvm.org/D74963	2020-02-25 11:42:42 +01:00
Lei Zhang	8358ddbe5d	[mlir][spirv] NFC: Move test passes to test/lib Previously C++ test passes for SPIR-V were put under test/Dialect/SPIRV. Move them to test/lib/Dialect/SPIRV to create a better structure. Also fixed one of the test passes to use the new PassRegistration mechanism. Differential Revision: https://reviews.llvm.org/D75066	2020-02-24 14:17:02 -05:00
Benjamin Kramer	bc1947a6f5	Add a basic tiling pass for parallel loops This exploits the fact that the iterations of parallel loops are independent so tiling becomes just an index transformation. This pass only tiles the innermost loop of a loop nest. The ultimate goal is to allow vectorization of the tiled loops, but I don't think we're there yet with the current rewriting, as the tiled loops don't have a constant trip count. Differential Revision: https://reviews.llvm.org/D74954	2020-02-24 11:44:40 +01:00
River Riddle	9eb436feaa	[mlir][DeclarativeParser] Add support for formatting the successors of an operation. This revision adds support for formatting successor variables in a similar way to operands, attributes, etc. Differential Revision: https://reviews.llvm.org/D74789	2020-02-21 15:15:32 -08:00
River Riddle	b1de971ba8	[mlir][ODS] Add support for specifying the successors of an operation. This revision adds support in ODS for specifying the successors of an operation. Successors are specified via the `successors` list: ``` let successors = (successor AnySuccessor:$target, AnySuccessor:$otherTarget); ``` Differential Revision: https://reviews.llvm.org/D74783	2020-02-21 15:15:32 -08:00
Hanhan Wang	29ad9d6b26	[mlir][spirv] Add lowering for load/store zero-rank memref from std to SPIR-V. Differential Revision: https://reviews.llvm.org/D74874	2020-02-21 14:41:12 -05:00
Nagy Mostafa	bc7b26c333	[MLIR] Allow Loop dialect IfOp and ForOp to define values This patch implements the RFCs proposed here: https://llvm.discourse.group/t/rfc-modify-ifop-in-loop-dialect-to-yield-values/463 https://llvm.discourse.group/t/rfc-adding-operands-and-results-to-loop-for/459/19. It introduces the following changes: - All Loop Ops regions, except for ReduceOp, terminate with a YieldOp. - YieldOp can have variadic operands that are used to return values out of IfOp and ForOp regions. - Change IfOp and ForOp syntax and representation to define values. - Add unit-tests and update .td documentation. - YieldOp is a terminator to loop.for/if/parallel - YieldOp custom parser and printer Lowering is not supported at the moment, and will be in a follow-up PR. Thanks. Reviewed By: bondhugula, nicolasvasilache, rriddle Differential Revision: https://reviews.llvm.org/D74174	2020-02-21 10:05:32 -08:00
aartbik	ee260c1a0f	[mlir] [VectorOps] Multi-dim reductions for lowering vector.contract Summary: This implements the last step for lowering vector.contract progressively to LLVM IR (except for masks). Multi-dimensional reductions that remain after expanding all parallel dimensions are lowered into simpler vector.contract operations until a trivial 1-dim reduction remains. Reviewers: nicolasvasilache, andydavis1 Reviewed By: andydavis1 Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74880	2020-02-20 14:16:50 -08:00
Hanhan Wang	28e0449ec6	[mlir][Linalg] Allow specifying zero-rank shaped type operands to linalg.indexed_generic ops. Patch D74638 allows linalg.generic ops to use zero-rank shaped type operands; this can also be applied to linalg.indexed_generic ops.	2020-02-19 19:24:27 -05:00
aartbik	0ba9ee9f0e	[mlir] [VectorOps] Framework for progressive lowering of vector.contract Summary: Lowers all free/batch dimensions in a vector.contract progressively into simpler vector.contract operations until a direct vector.reduction operation is reached. Then lowers 1-D reductions into vector.reduce. Still TBD: multi-dimensional contractions that remain after removing all the parallel dims Reviewers: nicolasvasilache, andydavis1, rriddle Reviewed By: andydavis1 Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74797	2020-02-19 11:36:11 -08:00
River Riddle	6b6c96695c	[mlir][ODS] Add a new trait `TypesMatchWith` Summary: This trait takes three arguments: lhs, rhs, transformer. It verifies that the type of 'rhs' matches the type of 'lhs' when the given 'transformer' is applied to 'lhs'. This allows for adding constraints like: "the type of 'a' must match the element type of 'b'". A followup revision will add support in the declarative parser for using these equality constraints to port more c++ parsers to the declarative form. Differential Revision: https://reviews.llvm.org/D74647	2020-02-19 10:18:58 -08:00
Alexander Belyaev	284279ac23	[MLIR] Add naive fusion of parallel loops.	2020-02-19 14:51:09 +01:00
Tim Shen	b762bbd4c8	[MLIR] change NVVM.mma.sync to the most useful variant. Summary: the .row.col variant turns out to be the popular one, contrary to what I thought as .row.row. Since .row.col is so prevailing (as I inspect cuDNN's behavior), I'm going to remove the .row.row support here, which makes the patch a little bit easier. Reviewers: ftynse Subscribers: jholewinski, bixia, sanjoy.google, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74655	2020-02-18 17:57:04 -08:00
MaheshRavishankar	a8355b5c0f	[mlir][Linalg] Allow specifying zero-rank shaped type operands to linalg.generic ops. Fixing a bug where using a zero-rank shaped type operand to linalg.generic ops hit an unrelated assert. This also meant that lowering the operation to loops was not supported. Adding roundtrip tests and lowering to loops test for zero-rank shaped type operand with fixes to make the test pass. Differential Revision: https://reviews.llvm.org/D74638	2020-02-18 13:23:28 -08:00
Pierre Oechsel	0acd7e02f2	[mlir] Linalg: Extend promotion to non f32 buffers. Summary: Linalg's promotion pass was only supporting f32 buffers due to how the zero value was built for the `fill` operation. Moreover, `promoteSubViewOperands` was returning a vector with one entry per float subview while omitting integer subviews. For a program with only integer subviews the return vector would be of size 0. However, `promoteSubViewsOperands` would try to access a non zero number of entries of this vector, resulting in a segfault. Reviewers: nicolasvasilache, ftynse Reviewed By: ftynse Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74532	2020-02-17 15:56:49 +01:00
River Riddle	5756bc4382	[mlir][DeclarativeParser] Add support for formatting enum attributes in the string form. Summary: This revision adds support to the declarative parser for formatting enum attributes in the symbolized form. It uses this new functionality to port several of the SPIRV parsers over to the declarative form. Differential Revision: https://reviews.llvm.org/D74525	2020-02-13 17:11:48 -08:00
aartbik	b21c799952	[mlir] [VectorOps] Initial framework for progressively lowering vector.contract Summary: This sets the basic framework for lowering vector.contract progressively into simpler vector.contract operations until a direct vector.reduction operation is reached. More details will be filled out progressively as well. Reviewers: nicolasvasilache Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74520	2020-02-13 15:07:57 -08:00
Alex Zinenko	5ae9c4c868	[mlir] Linalg fusion: ignore indexed_generic producers They are currently not supported and we should not attempt fusing them.	2020-02-12 15:13:21 +01:00
Lei Zhang	d3e7816d85	[mlir][spirv] Introduce spv.func Thus far we have been using builtin func op to model SPIR-V functions. It was because builtin func op used to have special treatment in various parts of the core codebase (e.g., pass pipelines, etc.) and it's easy to bootstrap the development of the SPIR-V dialect. But nowadays with general op concepts and region support we don't have such limitations and it's time to tighten the SPIR-V dialect for completeness. This commit introduces a spv.func op to properly model SPIR-V functions. Compared to builtin func op, it can provide the following benefits: * We can control the full op so we can integrate SPIR-V information bits (e.g., function control) in a more integrated way, define our own assembly form, and enforce better verification. * We can have a better dialect and library boundary. At the current moment only functions are modelled with an external op. With this change, all ops modelling SPIR-V concepts will be spv.* ops and registered to the SPIR-V dialect. * We don't need to special-case func op anymore when creating ConversionTarget declaring SPIR-V dialect as legal. This is quite important given we'll see more and more conversions in the future. In the process, bumps a few FuncOp methods to the FunctionLike trait. Differential Revision: https://reviews.llvm.org/D74226	2020-02-12 07:46:43 -05:00
Mehdi Amini	c64770506b	Remove static registration for dialects, and the "alwayslink" hack for passes In the previous state, we were relying on forcing the linker to include all libraries in the final binary and the global initializer to self-register every piece of the system. This change helps move away from this model and allows users to compose pieces more freely. The current change is only "fixing" the dialect registration and avoiding relying on "whole link" for the passes. The translation is still relying on the global registry, and some refactoring is needed to make this all more convenient. Differential Revision: https://reviews.llvm.org/D74461	2020-02-12 09:13:02 +00:00
Andy Davis	813bfffec3	[mlir][VectorOps] Adds canonicalization rewrite patterns for vector ShapeCastOp. Summary: Adds two rewrite patterns for the vector ShapeCastOp. 1) ShapeCastOp decomposer: decomposes ShapeCastOp on tuple-of-vectors to multiple ShapeCastOps each on vector types. 2) ShapeCastOp folder: folds canceling shape cast ops (e.g. shape_cast A -> B followed by shape_cast B -> A) away. Reviewers: nicolasvasilache, aartbik Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74327	2020-02-11 13:11:45 -08:00
aartbik	e83b7b99da	[mlir] [VectorOps] Implement vector.reduce operation Summary: This new operation operates on 1-D vectors and forms the bridge between vector.contract and llvm intrinsics for vector reductions. Reviewers: nicolasvasilache, andydavis1, ftynse Reviewed By: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74370	2020-02-11 11:31:59 -08:00
Hanhan Wang	4687822b9e	[mlir][Linalg] Add a roundtrip test for indexed_generic op with tensors. Summary: After D72555 has been landed, `linalg.indexed_generic` also accepts ranked tensor as input and output. Add a test for it. Differential Revision: https://reviews.llvm.org/D74267	2020-02-10 15:51:59 -05:00
Alex Zinenko	5a1778057f	[mlir] use unpacked memref descriptors at function boundaries The existing (default) calling convention for memrefs in standard-to-LLVM conversion was motivated by interfacing with LLVM IR produced from C sources. In particular, it passes a pointer to the memref descriptor structure when calling the function. Therefore, the descriptor is allocated on stack before the call. This convention leads to several problems. PR44644 indicates a problem with stack exhaustion when calling functions with memref-typed arguments in a loop. Allocating outside of the loop may lead to concurrent access problems in case the loop is parallel. When targeting GPUs, the contents of the stack-allocated memory for the descriptor (passed by pointer) needs to be explicitly copied to the device. Using an aggregate type makes it impossible to attach pointer-specific argument attributes pertaining to alignment and aliasing in the LLVM dialect. Change the default calling convention for memrefs in standard-to-LLVM conversion to transform a memref into a list of arguments, each of primitive type, that are comprised in the memref descriptor. This avoids stack allocation for ranked memrefs (and thus stack exhaustion and potential concurrent access problems) and simplifies the device function invocation on GPUs. Provide an option in the standard-to-LLVM conversion to generate auxiliary wrapper function with the same interface as the previous calling convention, compatible with LLVM IR produced from C sources. These auxiliary functions pack the individual values into a descriptor structure or unpack it. They also handle descriptor stack allocation if necessary, serving as an allocation scope: the memory reserved by `alloca` will be freed on exiting the auxiliary function. The effect of this change on MLIR-generated only LLVM IR is minimal. 
When interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration only needs to require auxiliary functions and change the function name to call the wrapper function instead of the original function. This also opens the door to forwarding aliasing and alignment information from memrefs to LLVM IR pointers in the standard-to-LLVM conversion.	2020-02-10 15:03:43 +01:00
River Riddle	2f94ce0dcf	[mlir][DeclarativeParser] Move several missed parsers over to the declarative form. Differential Revision: https://reviews.llvm.org/D74283	2020-02-08 15:47:55 -08:00
natashaknk	9c1c825b72	[mlir][spirv] Adding sin op in the GLSL extension Differential Revision: https://reviews.llvm.org/D74151	2020-02-07 16:36:12 -05:00
Nicolas Vasilache	681f929f59	[mlir][VectorOps] Introduce a `vector.fma` op that works on n-D vectors and lowers to `llvm.intrin.fmuladd` Summary: The `vector.fma` operation is portable enough across targets that we do not want to keep it wrapped under `vector.outerproduct` and `llvm.intrin.fmuladd`. This revision lifts the op into the vector dialect and implements the lowering to LLVM by using two patterns: 1. a pattern that lowers from n-D to (n-1)-D by unrolling when n > 2 2. a pattern that converts from 1-D to the proper LLVM representation Reviewers: ftynse, stellaraccident, aartbik, dcaballe, jsetoain, tetuante Reviewed By: aartbik Subscribers: fhahn, dcaballe, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74075	2020-02-07 15:44:53 -05:00
Nicolas Vasilache	499ad45877	[mlir][VectorOps] Expose and use llvm.intrin.fma* Summary: This revision exposes the portable `llvm.fma` intrinsic in LLVMOps and uses it in lieu of `llvm.fmuladd` when lowering the `vector.outerproduct` op to LLVM. This guarantees proper `fma` instructions will be emitted if the target ISA supports it. `llvm.fmuladd` does not have this guarantee in its semantics, despite evidence that the proper x86 instructions are emitted. For more details, see https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic. Reviewers: ftynse, aartbik, dcaballe, fhahn Reviewed By: aartbik Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74219	2020-02-07 15:38:40 -05:00
MaheshRavishankar	d06dd29e09	[mlir][Linalg] Implement fusion of linalg.generic operation on tensors. The initial implementation of the fusion operation exposes a method to fuse a consumer with its producer, when - both the producer and consumer operate on tensors - the producer has only a single result value - the producer has only "parallel" iterator types A new interface method hasTensorSemantics is added to verify that an operation has all operands and results of type RankedTensorType. Differential Revision: https://reviews.llvm.org/D74172	2020-02-07 10:36:53 -08:00
aartbik	6e2309d7fa	[mlir] [VectorOps] generalized vector.contract semantics Summary: Previously, vector.contract did not allow an empty set of free or batch dimensions (K = 0) which defines a basic reduction into a scalar (like a dot product). This CL relaxes that restriction. Also adds constraints on element type of operands and results. With tests. Reviewers: nicolasvasilache, andydavis1, rriddle Reviewed By: andydavis1 Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74014	2020-02-05 17:20:32 -08:00
Andy Davis	f9efce1dd5	[mlir][VectorOps] Support vector transfer_read/write unrolling for memrefs with vector element type. Summary: [mlir][VectorOps] Support vector transfer_read/write unrolling for memrefs with vector element type. When unrolling vector transfer read/write on memrefs with vector element type, the indices used to index the memref argument must be updated to reflect the unrolled operation. However, in the case of memrefs with vector element type, we need to be careful to only update the relevant memref indices. For example, a vector transfer read with the following source/result types, memref<6x2x1xvector<2x4xf32>>, vector<2x1x2x4xf32>, should only update memref indices 1 and 2 during unrolling. Reviewers: nicolasvasilache, aartbik Reviewed By: nicolasvasilache, aartbik Subscribers: lebedev.ri, Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72965	2020-02-05 16:21:58 -08:00
Andy Davis	3ce8095c29	[mlir][VectorOps] Add ShapeCastOp to the vector ops dialect. Summary: Add ShapeCastOp to the vector ops dialect. The shape_cast operation casts between an n-D source vector shape and a k-D result vector shape (the element type remains the same). Reviewers: nicolasvasilache, aartbik Reviewed By: nicolasvasilache Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73635	2020-02-05 15:45:12 -08:00
Lei Zhang	13b197c7d1	[mlir][spirv] Add dialect-specific attribute for target environment We were using normal dictionary attribute for target environment specification. It becomes cumbersome with more and more fields. This commit changes the modelling to a dialect-specific attribute, where we can have control over its storage and assembly form. Differential Revision: https://reviews.llvm.org/D73959	2020-02-04 21:33:13 -05:00
Stephen Neuendorffer	d7cbef2714	[MLIR] Fixes for shared library dependencies. Summary: This patch is a step towards enabling BUILD_SHARED_LIBS=on, which builds most libraries as DLLs instead of statically linked libraries. The main effect of this is that incremental build times are greatly reduced, since usually only one library need be relinked in response to isolated code changes. The bulk of this patch is fixing incorrect usage of cmake, where library dependencies are listed under add_dependencies rather than under target_link_libraries or under the LINK_LIBS tag. Correct usage should be like this: add_dependencies(MLIRfoo MLIRfooIncGen) target_link_libraries(MLIRfoo MLIRlib1 MLIRlib2) A separate issue is that in cmake, dependencies between static libraries are automatically included in dependencies. In the above example, if MLIRlib1 depends on MLIRlib2, then it is sufficient to have only MLIRlib1 in the target_link_libraries. When compiling with shared libraries, it is necessary to have both MLIRlib1 and MLIRlib2 specified if MLIRfoo uses symbols from both. Reviewers: mravishankar, antiagainst, nicolasvasilache, vchuravy, inouehrs, mehdi_amini, jdoerfert Reviewed By: nicolasvasilache, mehdi_amini Subscribers: Joonsoo, merge_guards_bot, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, aartbik, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73653	2020-02-04 08:56:37 -08:00
Alexander Belyaev	baecae838d	[Linalg] Add tiling of Linalg to parallel loops. Differential Revision: https://reviews.llvm.org/D73955	2020-02-04 14:51:19 +01:00
Lei Zhang	399887c9e4	[mlir][spirv] Add resource limits into target environment This commit adds two resource limits, max_compute_workgroup_size and max_compute_workgroup_invocations as resource limits to the target environment. They are not used at the current moment, but they will affect the SPIR-V CodeGen. Adding for now to have a proper target environment modelling. Differential Revision: https://reviews.llvm.org/D73905	2020-02-04 08:35:19 -05:00
River Riddle	7ef37a5f99	[mlir] Initial support for type constraints in the declarative assembly format Summary: This revision adds support for accepting a few type constraints, e.g. AllTypesMatch, when inferring types for operands and results. This is used to remove the c++ parsers for several additional operations. Differential Revision: https://reviews.llvm.org/D73735	2020-02-03 21:55:09 -08:00
Alexander Belyaev	eda6b2e2b3	[MLIR][Linalg] Allow fusion of more than 2 linalg ops. LinalgDependenceGraph was not updated after successful producer-consumer fusion for linalg ops. In this patch it is fixed by reconstructing LinalgDependenceGraph on every iteration. This is very inefficient and should be improved by updating LDGraph only when it is necessary.	2020-02-03 21:00:23 +01:00
Alexander Belyaev	3dcc1fc61b	[MLIR][Linalg] Lower linalg.generic to ploops. Differential Revision: https://reviews.llvm.org/D73684	2020-02-03 11:52:23 +01:00
Stephan Herhut	283b5e733d	[MLIR] Make gpu.launch implicitly capture uses of values defined above. Summary: In the original design, gpu.launch required explicit capture of uses and passing them as operands to the gpu.launch operation. This was motivated by infrastructure restrictions rather than design. This change lifts the requirement and removes the concept of kernel arguments from gpu.launch. Instead, the kernel outlining transformation now does the explicit capturing. This is a breaking change for users of gpu.launch. Differential Revision: https://reviews.llvm.org/D73769	2020-02-03 10:08:48 +01:00
Nicolas Vasilache	dc1d43cfa0	[mlir][Linalg] NFC - Cleanup and split input file for roundtrip.mlir	2020-01-31 22:01:56 -05:00
Alex Zinenko	9dfcddfaae	[mlir] Linalg tiling: generate code avoiding out-of-bounds accesses Summary: After the `subview` operation was migrated from Linalg to Standard, it changed semantics and does not guarantee the absence of out-of-bounds accesses through the created view anymore. Compute the size of the subview to make sure it always fits within the view (subviews in last iterations of the loops may be smaller than those in other iterations). Differential Revision: https://reviews.llvm.org/D73614	2020-01-31 19:43:47 +01:00
River Riddle	389b126210	[mlir][NFC] Update several SPIRV operations to use declarative parsers. Differential Revision: https://reviews.llvm.org/D73504	2020-01-30 11:43:41 -08:00
River Riddle	528adb2e48	[mlir][NFC] Use declarative format for several operations in LLVM and Linalg dialects Differential Revision: https://reviews.llvm.org/D73503	2020-01-30 11:43:41 -08:00
River Riddle	82170d5619	[mlir] Update various operations to declaratively specify their assembly format. Summary: This revision switches over many operations to use the declarative methods for defining the assembly specification. This updates operations in the NVVM, ROCDL, Standard, and VectorOps dialects. Differential Revision: https://reviews.llvm.org/D73407	2020-01-30 11:43:40 -08:00
Denis Khalikov	4801522432	[mlir][spirv] Add GroupNonUniform min and max operations. Add GroupNonUniform arithmetic operations: FMax, FMin, SMax, SMin, UMax, UMin. Differential Revision: https://reviews.llvm.org/D73563	2020-01-30 10:25:15 -05:00
Alexander Belyaev	9109cccb4f	[Linalg] Format Linalg/fusion.mlir. Differential Revision: https://reviews.llvm.org/D73689	2020-01-30 14:17:52 +01:00
Shraiysh Vaishay	d242aa245c	[MLIR] Added llvm.invoke and llvm.landingpad Summary: I have tried to implement `llvm.invoke` and `llvm.landingpad`. # `llvm.invoke` is similar to `llvm.call` with two successors added, the first one is the normal label and the second one is unwind label. # `llvm.landingpad` takes a variable number of args with either `catch` or `filter` associated with them. Catch clauses are not array types and filter clauses are array types. This is same as the criteria used by LLVM (`4f82af81a0/llvm/include/llvm/IR/Instructions.h (L2866)`) Examples: LLVM IR ``` define i32 @caller(i32 %a) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8) { invoke i32 @foo(i32 2) to label %success unwind label %fail success: ret i32 2 fail: landingpad {i8, i32} catch i8 @_ZTIi catch i8 null catch i8* bitcast (i8** @_ZTIi to i8) filter [1 x i8] [ i8 1 ] ret i32 3 } ``` MLIR LLVM Dialect ``` llvm.func @caller(%arg0: !llvm.i32) -> !llvm.i32 { %0 = llvm.mlir.constant(3 : i32) : !llvm.i32 %1 = llvm.mlir.constant("\01") : !llvm<"[1 x i8]"> %2 = llvm.mlir.addressof @_ZTIi : !llvm<"i8"> %3 = llvm.bitcast %2 : !llvm<"i8"> to !llvm<"i8"> %4 = llvm.mlir.null : !llvm<"i8"> %5 = llvm.mlir.addressof @_ZTIi : !llvm<"i8"> %6 = llvm.mlir.constant(2 : i32) : !llvm.i32 %7 = llvm.invoke @foo(%6) to ^bb1 unwind ^bb2 : (!llvm.i32) -> !llvm.i32 ^bb1: // pred: ^bb0 llvm.return %6 : !llvm.i32 ^bb2: // pred: ^bb0 %8 = llvm.landingpad (catch %5 : !llvm<"i8">) (catch %4 : !llvm<"i8">) (catch %3 : !llvm<"i8">) (filter %1 : !llvm<"[1 x i8]">) : !llvm<"{ i8, i32 }"> llvm.return %0 : !llvm.i32 } ``` Signed-off-by: Shraiysh Vaishay <cs17btech11050@iith.ac.in> Differential Revision: https://reviews.llvm.org/D72006	2020-01-30 12:55:28 +01:00
Stephan Herhut	2692751895	Add 'gpu.terminator' operation. Summary: The 'gpu.terminator' operation is used as the terminator for the regions of gpu.launch. This is to disambiguate them from the return operation on 'gpu.func' functions. This is a breaking change and users of the gpu dialect will need to adapt their code when producing 'gpu.launch' operations. Reviewers: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73620	2020-01-30 12:41:41 +01:00
Nicolas Vasilache	ea1e3369f7	[mlir][Linalg] Introduce folding patterns to remove certain MemRefCastOp Summary: Canonicalization and folding patterns in StandardOps may interfere with the needs of Linalg. This revision introduces specific foldings for dynamic memrefs that can be proven to be static. Very concretely: Determines whether it is possible to fold it away in the parent Linalg op: ```mlir %1 = memref_cast %0 : memref<8x16xf32> to memref<?x?xf32> %2 = linalg.slice %1 ... : memref<?x?xf32> ... // or %1 = memref_cast %0 : memref<8x16xf32, affine_map<(i, j)->(16 * i + j)>> to memref<?x?xf32> linalg.generic(%1 ...) : memref<?x?xf32> ... ``` into ```mlir %2 = linalg.slice %0 ... : memref<8x16xf32> ... // or linalg.generic(%0 ... : memref<8x16xf32, affine_map<(i, j)->(16 * i + j)>> ``` Reviewers: ftynse, aartbik, jsetoain, tetuante, asaadaldien Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73565	2020-01-29 09:52:51 -05:00
David Truby	63c8972562	[MLIR] Add OpenMP dialect with barrier operation Summary: Barrier is a simple operation that takes no arguments and returns nothing, but implies a side effect (synchronization of all threads) Reviewers: jdoerfert Subscribers: mgorny, guansong, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72400	2020-01-29 11:34:58 +00:00
Ahmed Taei	16e82d855a	[mlir] Add primitive transform pattern to rewrite linalg.fill into vector.broadcast form. Summary: This diff adds a transformation pattern to rewrite linalg.fill as broadcasting a scalar into a vector. It uses the same preconditioning as matmul (memory is contiguous). Reviewers: nicolasvasilache Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73391	2020-01-28 11:21:56 -08:00
Denis Khalikov	731b140a52	[mlir][spirv] Add GroupNonUniform arithmetic operations. Add GroupNonUniform arithmetic operations: FAdd, FMul, IMul. Unify parser, printer, verifier for GroupNonUniform arithmetic operations. Differential Revision: https://reviews.llvm.org/D73491	2020-01-28 10:21:56 -05:00
Lei Zhang	60d541e1b9	[mlir][spirv] Relax verification to allow flexible placement Thus far certain SPIR-V ops have been required to be in spv.module. While this provides strong verification to catch unexpected errors, it's quite rigid and makes progressive lowering difficult. Sometimes we would like to partially lower ops from other dialects, which may involve creating ops like global variables that should be placed in other module-like ops. So this commit relaxes the requirement of such SPIR-V ops' scope to module-like ops. Similarly for function-like ops. Differential Revision: https://reviews.llvm.org/D73415	2020-01-26 10:39:45 -05:00
Lei Zhang	ae21e37eb4	[mlir][spirv] Add spv.GroupNonUniformElect and spv.GroupNonUniformIAdd Differential Revision: https://reviews.llvm.org/D73349	2020-01-26 10:20:40 -05:00
aartbik	303fddeeab	[mlir] [VectorOps] Rewriting of vector.extract/insert_slices to other vector ops Summary: Rewrites the extract/insert_slices operation in terms of strided_slice/insert_strided_slice ops with intermediate tuple uses (that should get optimized away with typical usage). This is done in a separate "pass" to enable testing this particular rewriting in isolation. Reviewers: nicolasvasilache, andydavis1, ftynse Reviewed By: nicolasvasilache Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73295	2020-01-24 16:24:45 -08:00
aartbik	ed8222b2ca	[mlir] [VectorOps] Implement vector tuple get folding Summary: Rewrites get-i tup<a1,...,an> into ai Reviewers: nicolasvasilache, rriddle, andydavis1 Reviewed By: nicolasvasilache, rriddle, andydavis1 Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73213	2020-01-23 14:15:27 -08:00
Tres Popp	9a52ea5cf9	Create a gpu.module operation for the GPU Dialect. Summary: This is based on the use of code constantly checking for an attribute on a model and instead represents the distinct operation with a different op. Instead, this op can be used to provide better filtering. Reverts "Revert "[mlir] Create a gpu.module operation for the GPU Dialect."" This reverts commit ac446302ca4145cdc89f377c0c364c29ee303be5 after fixing internal Google issues. This additionally updates ROCDL lowering to use the new gpu.module. Reviewers: herhut, mravishankar, antiagainst, nicolasvasilache Subscribers: jholewinski, mgorny, mehdi_amini, jpienaar, burmako, shauheen, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits, mravishankar, rriddle, antiagainst, bkramer Tags: #llvm Differential Revision: https://reviews.llvm.org/D72921	2020-01-21 14:05:03 +01:00
Frank Laub	fffea2842d	[MLIR] LLVM Dialect: add llvm.cmpxchg and improve llvm.atomicrmw custom parser Summary: Add a `llvm.cmpxchg` op as a counterpart to LLVM IR's `cmpxchg` instruction. Note that the `weak`, `volatile`, and `syncscope` attributes are not yet supported. This will be useful for upcoming parallel versions of affine.for and generally for reduction-like semantics (especially for reductions that can't make use of `atomicrmw`, e.g. `fmax`). Reviewers: ftynse, nicolasvasilache Reviewed By: ftynse Subscribers: merge_guards_bot, jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72995	2020-01-21 01:09:42 -08:00
Mehdi Amini	fdb9cc7dc5	Fix printer for llvm.addressof symbol name that need escaping Differential Revision: https://reviews.llvm.org/D73065	2020-01-20 22:09:18 +00:00
Jacques Pienaar	b70e4efb75	[mlir] Generalize broadcastable trait operands Summary: Generalize broadcastable trait to variadic operands. Update the documentation that still talked about element type as part of broadcastable trait (that bug was already fixed). Also rename Broadcastable to ResultBroadcastableShape to be more explicit that the trait affects the result shape (it is possible for op to allow broadcastable operands but not have result shape that is broadcast compatible with operands). Doing some intermediate work to have getBroadcastedType take an optional elementType as input and use that if specified, instead of the common element type of type1 and type2 in this function. Differential Revision: https://reviews.llvm.org/D72559	2020-01-20 13:02:14 -08:00
Christian Sigg	8b2eb7c494	[mlir] Add in-dialect lowering of gpu.all_reduce. Reviewers: ftynse, nicolasvasilache, herhut Reviewed By: ftynse, herhut Subscribers: liufengdb, aartbik, herhut, merge_guards_bot, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72129	2020-01-20 13:43:43 +01:00
Kazuaki Ishizaki	fc817b09e2	[mlir] NFC: Fix trivial typos in comments Differential Revision: https://reviews.llvm.org/D73012	2020-01-20 03:17:03 +00:00
Frank Laub	ee2de95507	[MLIR] LLVM dialect: modernize and cleanups Summary: Modernize some of the existing custom parsing code in the LLVM dialect. While this reduces some boilerplate code, it also reduces the precision of the diagnostic error messages. Reviewers: ftynse, nicolasvasilache, rriddle Reviewed By: rriddle Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72967	2020-01-17 17:11:50 -08:00
Nicolas Vasilache	64c4dcb5ee	[mlir][Linalg] Extend linalg vectorization to MatmulOp Summary: This is a simple extension to allow vectorization to work not only on GenericLinalgOp but more generally across named ops too. For now, this still only vectorizes matmul-like ops but is a step towards more generic vectorization of Linalg ops. Reviewers: ftynse Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72942	2020-01-17 17:09:47 -05:00
Frank Laub	60a0c612df	[MLIR] LLVM dialect: Add llvm.atomicrmw Summary: This op is the counterpart to LLVM's atomicrmw instruction. Note that volatile and syncscope attributes are not yet supported. This will be useful for upcoming parallel versions of `affine.for` and generally for reduction-like semantics. Differential Revision: https://reviews.llvm.org/D72741	2020-01-17 21:17:14 +01:00
Eric Schweitz	37e2560d3d	[Flang][mlir] add a band-aid to support the creation of mutually recursive types when lowering to LLVM IR Summary: This is a temporary implementation to support Flang. The LLVM-IR parser will need to be extended in some way to support recursive types. The exact approach here is still a work-in-progress. Unfortunately, this won't pass roundtrip testing yet. Adding a comment to the test file as a reminder. Differential Revision: https://reviews.llvm.org/D72542	2020-01-17 21:17:06 +01:00
Lei Zhang	267483ac70	[mlir][spirv] Support implied extensions and capabilities In SPIR-V, when a new version is introduced, it is possible some existing extensions will be incorporated into it so that it becomes implicitly declared if targeting the new version. This affects conversion target specification because we need to take this into account when allowing what extensions to use. A capability may also imply some other capabilities, for example, the `Shader` capability implies the `Matrix` capability. This should also be taken into consideration when preparing the conversion target: when we specify a capability is allowed, all its recursively implied capabilities are also allowed. This commit adds utility functions to query implied extensions for a given version and implied capabilities for a given capability and updated SPIRVConversionTarget to use them. This commit also fixes a bug in availability spec. When a symbol (op or enum case) can be enabled by an extension, we should drop its minimal version requirement. Being enabled by an extension naturally means the symbol can be used by any SPIR-V version as long as the extension is supported. The grammar still encodes the 'version' field for such cases, but it should be interpreted in a different way: rather than meaning a minimal version requirement, it says the symbol becomes core at that specific version. Differential Revision: https://reviews.llvm.org/D72765	2020-01-17 08:01:57 -05:00
Lei Zhang	961174f878	[mlir][spirv] Fix SPV_MM_Vulkan extension requirements SPV_MM_Vulkan can be enabled by the SPV_KHR_vulkan_memory_model extension. Differential Revision: https://reviews.llvm.org/D72764	2020-01-16 21:32:01 -05:00
Benjamin Kramer	0133cc60e4	Revert "[mlir] Create a gpu.module operation for the GPU Dialect." This reverts commit `4624a1e8ac`. Causing problems downstream.	2020-01-15 17:52:17 +01:00
Lei Zhang	47c6ab2b97	[mlir][spirv] Properly support SPIR-V conversion target This commit defines a new SPIR-V dialect attribute for specifying a SPIR-V target environment. It is a dictionary attribute containing the SPIR-V version, supported extension list, and allowed capability list. A SPIRVConversionTarget subclass is created to take in the target environment and sets proper dynamically legal ops by querying the op availability interface of SPIR-V ops to make sure they are available in the specified target environment. All existing conversions targeting SPIR-V are changed to use this SPIRVConversionTarget. It probes whether the input IR has a `spv.target_env` attribute, otherwise, it uses the default target environment: SPIR-V 1.0 with Shader capability and no extra extensions. Differential Revision: https://reviews.llvm.org/D72256	2020-01-14 19:18:42 -05:00
Nicolas Vasilache	f52d71736b	[mlir][Linalg] Update the semantics, verifier and test for Linalg with tensors. Summary: This diff fixes issues with the semantics of linalg.generic on tensors that appeared when converting directly from HLO to linalg.generic. The changes are self-contained within MLIR and can be captured and tested independently of XLA. The linalg.generic and indexed_generic are updated to: To allow progressive lowering from the value world (a.k.a tensor values) to the buffer world (a.k.a memref values), a linalg.generic op accepts mixing input and output ranked tensor values with input and output memrefs. ``` %1 = linalg.generic #trait_attribute %A, %B {other-attributes} : tensor<?x?xf32>, memref<?x?xf32, stride_specification> -> (tensor<?x?xf32>) ``` In this case, the number of outputs (args_out) must match the sum of (1) the number of output buffer operands and (2) the number of tensor return values. The semantics is that the linalg.indexed_generic op produces (i.e. allocates and fills) its return values. Tensor values must be legalized by a buffer allocation pass before most transformations can be applied. Such legalization moves tensor return values into output buffer operands and updates the region argument accordingly. Transformations that create control-flow around linalg.indexed_generic operations are not expected to mix with tensors because SSA values do not escape naturally. Still, transformations and rewrites that take advantage of tensor SSA values are expected to be useful and will be added in the near future. Subscribers: bmahjour, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72555	2020-01-14 17:25:28 -05:00
Tres Popp	4624a1e8ac	[mlir] Create a gpu.module operation for the GPU Dialect. Summary: This is based on the use of code constantly checking for an attribute on a model and instead represents the distinct operation with a different op. Instead, this op can be used to provide better filtering. Reviewers: herhut, mravishankar, antiagainst, rriddle Reviewed By: herhut, antiagainst, rriddle Subscribers: liufengdb, aartbik, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72336	2020-01-14 12:05:47 +01:00
Adrian Kuegel	018b042593	[mlir] Add loop.parallel, loop.reduce and loop.reduce.return operations. Summary: These operations can be used to specify a loop nest with a body that can contain reductions. The iteration space can be iterated in any order. RFC: https://groups.google.com/a/tensorflow.org/d/topic/mlir/pwtSgiKFPis/discussion Differential Revision: https://reviews.llvm.org/D72394	2020-01-14 11:35:41 +01:00
River Riddle	4268e4f4b8	[mlir] Change the syntax of AffineMapAttr and IntegerSetAttr to avoid conflicts with function types. Summary: The current syntax for AffineMapAttr and IntegerSetAttr conflicts with function types, making it currently impossible to round-trip function types (and e.g. FuncOp) in the IR. This revision changes the syntax for the attributes by wrapping them in a keyword. AffineMapAttr is wrapped with `affine_map<>` and IntegerSetAttr is wrapped with `affine_set<>`. Reviewed By: nicolasvasilache, ftynse Differential Revision: https://reviews.llvm.org/D72429	2020-01-13 13:24:39 -08:00
Alex Zinenko	08778d8c4f	[mlir][GPU] introduce utilities for promotion to workgroup memory Introduce a set of functions that promote a memref argument of a `gpu.func` to workgroup memory using memory attribution. The promotion boils down to additional loops performing the copy from the original argument to the attributed memory in the beginning of the function, and back at the end of the function using all available threads. The loop bounds are specified so as to adapt to any size of the workgroup. These utilities are intended to compose with other existing utilities (loop coalescing and tiling) in cases where the distribution of work across threads is uneven, e.g. copying a 2D memref with only the threads along the "x" dimension. Similarly, specialization of the kernel to specific launch sizes should be implemented as a separate pass combining constant propagation and canonicalization. Introduce a simple attribute-driven pass to test the promotion transformation since we don't have a heuristic at the moment. Differential revision: https://reviews.llvm.org/D71904	2020-01-09 10:06:00 +01:00
Nicolas Vasilache	766ce87e9b	[mlir][Linalg] Lower linalg.reshape to LLVM for the static case Summary: This diff adds lowering of the linalg.reshape op to LLVM. A new descriptor is created with fields initialized as follows: 1. allocatedPtr, alignedPtr and offset are copied from the source descriptor 2. sizes are copied from the static destination shape 3. strides are copied from the static strides collected with `getStridesAndOffset` Only the static case in which the target view conforms to strided memref semantics is supported. Other cases are left for future work and will be added on a per-need basis. Reviewers: ftynse, mravishankar Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72316	2020-01-08 13:07:41 -05:00
Denis Khalikov	dd495e8a87	[mlir][spirv] Add lowering for std cmp ops. Differential Revision: https://reviews.llvm.org/D72296	2020-01-07 21:51:51 -05:00
Denis Khalikov	9883b14cd1	[mlir][spirv] Add lowering for standard bit ops Differential Revision: https://reviews.llvm.org/D72205	2020-01-07 21:45:54 -05:00
Nicolas Vasilache	e3750cafdb	[mlir][Linalg] Add a linalg.reshape op Summary: This diff adds a new operation to linalg to allow reshaping of an existing view into a new view in the same buffer at the same offset. More specifically: The `linalg.reshape` op produces a new view whose sizes are a reassociation of the original `view`. Depending on whether or not the reassociated MemRefType is contiguous, the resulting memref may require explicit alloc and copies. A reassociation is defined as a continuous grouping of dimensions and is represented with an affine map array attribute. In the future, non-continuous groupings may be allowed (i.e. permutations, reindexings etc). For now, it is assumed that either: 1. a reassociation produces and consumes contiguous MemRefType or, 2. the reshape op will be folded into its consumers (by changing the shape of the computations). All other cases are undefined behavior and a reshape op may not lower to LLVM if it cannot be proven statically that it does not require alloc+copy. A reshape may either collapse or expand dimensions, depending on the relationship between source and target memref ranks. The verification rule is that the reassociation maps are applied to the memref with the larger rank to obtain the memref with the smaller rank. In the case of a dimension expansion, the reassociation maps can be interpreted as inverse maps. Examples: ```mlir // Dimension collapse (i, j) -> i' and k -> k' %1 = linalg.reshape %0 [(i, j, k) -> (i, j), (i, j, k) -> (k)] : memref<?x?x?xf32, stride_spec> into memref<?x?xf32, stride_spec_2> ``` ```mlir // Dimension expansion i -> (i', j') and (k) -> (k') %1 = linalg.reshape %0 [(i, j, k) -> (i, j), (i, j, k) -> (k)] : memref<?x?xf32, stride_spec> into memref<?x?x?xf32, stride_spec_2> ``` The relevant invalid and roundtripping tests are added.
Reviewers: AlexEichenberger, ftynse, rriddle, asaadaldien, yangjunpro Subscribers: kiszk, merge_guards_bot, mehdi_amini, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72168	2020-01-06 22:21:19 -05:00
Ahmed Taei	14ee51581a	[mlir][linalg] Lower linalg to affine loops Reviewers: nicolasvasilache Reviewed By: nicolasvasilache Subscribers: mgester, lucyrfox, merge_guards_bot, AlexEichenberger, mravishankar, ftynse, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72094	2020-01-03 13:21:10 -05:00
Lei Zhang	b3d2867769	[mlir][spirv] Fix shader ABI attribute prefix and add verification This commit fixes shader ABI attributes to use `spv.` as the prefix so that they match the dialect's namespace. This enables us to add verification hooks in the SPIR-V dialect to verify them. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D72062	2020-01-03 07:44:27 -05:00
Lei Zhang	98856b22cd	[mlir][spirv] Update SPIR-V enums and ops with availability spec This commit updates gen_spirv_dialect.py to query the grammar and generate availability spec for various enum attribute definitions and all defined ops. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D72095	2020-01-02 14:09:02 -05:00
Nicolas Vasilache	2140a973f2	[mlir][Linalg] Extend generic ops to allow tensors Summary: This diff adds support to allow `linalg.generic` and `linalg.indexed_generic` to take tensor input and output arguments. The subset of output tensor operand types must appear verbatim in the result types after an arrow. The parser, printer and verifier are extended to accommodate this behavior. The Linalg operations now support variadic ranked tensor return values. This extension exhibited issues with the current handling of NativeCall in RewriterGen.cpp. As a consequence, an explicit cast to `SmallVector<Value, 4>` is added in the proper place to support the new behavior (better suggestions are welcome). Relevant cleanups and name uniformization are applied. Relevant invalid and roundtrip tests are added. Reviewers: mehdi_amini, rriddle, jpienaar, antiagainst, ftynse Subscribers: burmako, shauheen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72022	2020-01-02 13:54:57 -05:00
Lei Zhang	a81cb1b8bf	[mlir][spirv] Allow specifying availability on enum attribute cases Lots of SPIR-V ops take enum attributes and certain enum cases need extra capabilities or extensions to be available. This commit extends to allow specifying availability spec on enum cases. Extra utility functions are generated for the corresponding enum classes to return the availability requirement. The availability interface implementation for a SPIR-V op now goes over all enum attributes to collect the availability requirements. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D71947	2020-01-02 13:19:44 -05:00
Lei Zhang	b30d87a90b	[mlir][spirv] Add basic definitions for supporting availability SPIR-V has a few mechanisms to control op availability: version, extension, and capabilities. These mechanisms are considered as different availability classes. This commit introduces basic definitions for modelling SPIR-V availability classes. Specifically, an `Availability` class is added to SPIRVBase.td, along with two subclasses: MinVersion and MaxVersion for versioning. SPV_Op is extended to take a list of `Availability`. Each `Availability` instance carries information for generating op interfaces for the corresponding availability class and also the concrete availability requirements. With the availability spec on ops, we can now auto-generate the op interfaces of all SPIR-V availability classes and also synthesize the op's implementations of these interfaces. The interface generation is done via new TableGen backends -gen-avail-interface-{decls|defs}. The op's implementation is done via -gen-spirv-avail-impls. Differential Revision: https://reviews.llvm.org/D71930	2019-12-27 16:25:09 -05:00
Aart Bik	1d47564a53	[VectorOps] unify vector dialect "subscripts" PiperOrigin-RevId: 286650682	2019-12-20 15:33:04 -08:00
Aart Bik	67c019ddac	[VectorOps] remove redundant returns from invalid ops test PiperOrigin-RevId: 286640660	2019-12-20 14:27:42 -08:00
Christian Sigg	42d46b4efa	Add gpu.shuffle op. This will allow us to lower most of gpu.all_reduce (when all_reduce doesn't exist in the target dialect) within the GPU dialect, and only do target-specific lowering for the shuffle op. PiperOrigin-RevId: 286548256	2019-12-20 02:52:52 -08:00
Andy Davis	8020ad3e39	[VectorOps] Update vector transfer_read/write ops to operate on memrefs with vector element type. Update vector transfer_read/write ops to operate on memrefs with vector element type. This handles cases where the memref vector element type represents the minimal memory transfer unit (or multiple of the minimal memory transfer unit). PiperOrigin-RevId: 286482115	2019-12-19 16:05:32 -08:00
Andy Davis	1d798b1d27	[VectorOps] Add vector ReshapeOp to the VectorOps dialect. Adds vector ReshapeOp to the VectorOps dialect. An aggregate vector reshape operation, which aggregates multiple hardware vectors, can enable optimizations during decomposition (e.g. loading one input hardware vector and performing multiple rotate and scatter store operations to the vector output). PiperOrigin-RevId: 286440658	2019-12-19 12:27:59 -08:00
Aart Bik	15f800f4bc	[VectorOps] minor cleanup: vector dialect "subscripts" are i32 Introduces some centralized methods to move towards consistent use of i32 as vector subscripts. Note: sizes/strides/offsets attributes are still i64 PiperOrigin-RevId: 286434133	2019-12-19 11:51:08 -08:00
Aart Bik	d9b500d3bb	[VectorOps] Add vector.print definition, with lowering support Examples: vector.print %f : f32 vector.print %x : vector<4xf32> vector.print %y : vector<3x4xf32> vector.print %z : vector<2x3x4xf32> LLVM lowering replaces these with fully unrolled calls into a small runtime support library that provides some basic printing operations (single value, opening closing bracket, comma, newline). PiperOrigin-RevId: 286230325	2019-12-18 11:31:34 -08:00
Alex Zinenko	40ef46fba4	Harden the requirements to memory attribution types in gpu.func When memory attributions are present in `gpu.func`, require that they are of memref type and live in memoryspaces 3 and 5 for workgroup and private memory attributions, respectively. Adapt the conversion from the GPU dialect to the NVVM dialect to drop the private memory space from attributions as NVVM is able to model them as local `llvm.alloca`s in the default memory space. PiperOrigin-RevId: 286161763	2019-12-18 03:38:55 -08:00
Andy Davis	6fa3bd5b3e	Add pattern rewrite which splits a vector TransferWriteOp into slices according to the unrolling/slicing scheme of its InsertSlicesOp operand. PiperOrigin-RevId: 286042578	2019-12-17 13:17:10 -08:00
Mahesh Ravishankar	319cca3bbe	Add missing virtual inliner interface method in SPIR-V dialect. The inline interface uses two methods to check legality of inlining: 1) Can a region be inlined into another. 2) Can an operation be inlined into another. Setting the former to true, allows the inliner to use the second for legality checks. Add this method to the SPIR-V dialect inlining interface. PiperOrigin-RevId: 286041734	2019-12-17 13:06:05 -08:00
Andy Davis	d1fb285b32	Add pattern rewrite to forward vector tuple elements to their users. User(TupleGetOp(ExtractSlicesOp(InsertSlicesOp(TupleOp(Producer))))) -> User(Producer) PiperOrigin-RevId: 286020249	2019-12-17 11:21:45 -08:00
Andy Davis	038ad1d856	Add pattern rewrite which splits a vector TransferReadOp into slices according to the unrolling/slicing scheme of its ExtractSlicesOp user. PiperOrigin-RevId: 285975613	2019-12-17 07:29:06 -08:00
Andy Davis	4e825c59be	Update vector op unrolling transformation to generate ExtractSlicesOp and InsertSlicesOp (instead of less structured chain of StridedSliceOps and InsertStridedSliceOps). PiperOrigin-RevId: 285968051	2019-12-17 06:27:01 -08:00
Mahesh Ravishankar	80ec474a65	Add atomic operations to SPIR-V dialect. Some changes to the dialect generation script to allow specification of different base class to derive from in ODS. PiperOrigin-RevId: 285859230	2019-12-16 15:05:51 -08:00
Lei Zhang	659150b570	[spirv] Re-enable nested loop (de)serialization test PiperOrigin-RevId: 285849308	2019-12-16 14:21:52 -08:00
Andy Davis	11e92875f0	Add InsertSlicesOp to the VectorOps dialect. PiperOrigin-RevId: 285830394	2019-12-16 12:56:38 -08:00
Alex Zinenko	6273fa0c6a	Plug gpu.func into the GPU lowering pipelines This updates the lowering pipelines from the GPU dialect to lower-level dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation instead of a standard function annotated with an attribute. In particular, the kernel outlining is updated to produce gpu.func instead of std.func and the individual conversions are updated to consume gpu.funcs and disallow standard funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved in the generic syntax, but can also be used with the custom syntax on gpu.funcs. The special kind of function for GPU allows one to use additional features such as memory attribution. PiperOrigin-RevId: 285822272	2019-12-16 12:12:48 -08:00
Jose Ignacio Gomez	3ae56c4135	[Linalg] Expose subview promotion as a declarative pattern This PR targets issue tensorflow/mlir#295. It exposes the already existing subview promotion pass as a declarative pattern Change-Id: If901ebef9fb53fcd0b12ecc536f6b174ce320b92 Closes tensorflow/mlir#315 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/315 from tetuante:issue295 8e5f268b6d85f31015c33505329dbd7a4db97ac5 PiperOrigin-RevId: 285801463	2019-12-16 10:50:45 -08:00
Aart Bik	cd5dab8ad7	[VectorOps] Add [insert/extract]element definition together with lowering to LLVM Similar to insert/extract vector instructions but (1) work on 1-D vectors only (2) allow for a dynamic index %c3 = constant 3 : index %0 = vector.insertelement %arg0, %arg1[%c3 : index] : vector<4xf32> %1 = vector.extractelement %arg0[%c3 : index] : vector<4xf32> PiperOrigin-RevId: 285792205	2019-12-16 09:52:46 -08:00
Andy Davis	73ec37c8bb	Adds ExtractSlicesOp to the VectorOps dialect. ExtractSlicesOp extracts slices of its vector operand with a specified tiling scheme. This operation centralizes the tiling scheme around a single op, which simplifies vector op unrolling and subsequent pattern rewrite transformations. PiperOrigin-RevId: 285761129	2019-12-16 06:39:09 -08:00
Alexander Belyaev	1b579d998a	[Linalg] Add test for fusion of GenericOp with IndexedGenericOp. PiperOrigin-RevId: 285211797	2019-12-12 09:56:45 -08:00
Christian Sigg	9b85582682	Automated rollback of commit `f68ac464d8` PiperOrigin-RevId: 285162061	2019-12-12 03:48:38 -08:00
Christian Sigg	f68ac464d8	Switch from shfl.bfly to shfl.down. Both work for the current use case, but the latter allows implementing prefix sums and is a little easier to understand for partial warps. PiperOrigin-RevId: 285145287	2019-12-12 01:28:01 -08:00
Nicolas Vasilache	508d4e672e	Continue refactoring StructuredOps utilities This CL adds more common information to StructuredOpsUtils.h The n_view attribute is retired in favor of args_in + args_out but the CL is otherwise NFC. PiperOrigin-RevId: 285000621	2019-12-11 09:27:34 -08:00
Alexander Belyaev	bae8a7a724	[Linalg] Add tiling for IndexedGenericOp with a region. PiperOrigin-RevId: 284949355	2019-12-11 02:56:40 -08:00
Andy Davis	4d8ba88610	Add VectorOp transform pattern which splits vector TransferReadOps to target vector unroll size. PiperOrigin-RevId: 284880592	2019-12-10 17:02:51 -08:00
Nicolas Vasilache	995048d7b7	Fold TestLinalgTilePermutePatterns into TestLinalgTransformPatterns - NFC Centralize all patterns that test Linalg transforms in a single pass. PiperOrigin-RevId: 284835938	2019-12-10 13:26:15 -08:00
Jose Ignacio Gomez	b19fed5415	[Linalg] Add a Linalg iterator permutation transformation This patch closes issue tensorflow/mlir#272 We add a standalone iterator permutation transformation to Linalg. This transformation composes a permutation map with the maps in the "indexing_maps" attribute. It also permutes "iterator_types" accordingly. Change-Id: I7c1e693b8203aeecc595a7c012e738ca1100c857 Closes tensorflow/mlir#307 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/307 from tetuante:issue272 f7908d58792f4111119721885e247045104f1131 PiperOrigin-RevId: 284824102	2019-12-10 12:25:43 -08:00
Nicolas Vasilache	ad38e49806	Uniformize Vector transforms as patterns on the model of Linalg - NFC This reorganizes the vector transformations to be more easily testable as patterns and more easily composable into fused passes in the future. PiperOrigin-RevId: 284817474	2019-12-10 11:54:33 -08:00
Aart Bik	1fe65688d4	[VectorOps] Add a ShuffleOp to the VectorOps dialect For example %0 = vector.shuffle %x, %y [3 : i32, 2 : i32, 1 : i32, 0 : i32] : vector<2xf32>, vector<2xf32> yields a vector<4xf32> result with a permutation of the elements of %x and %y PiperOrigin-RevId: 284657191	2019-12-09 16:15:41 -08:00
Aart Bik	0e963b9c42	[VectorOps] Fix off-by-one error in insert/extract validation PiperOrigin-RevId: 284652653	2019-12-09 15:54:23 -08:00
Denis Khalikov	34265dad65	[spirv] Add CompositeConstruct operation. Closes tensorflow/mlir#308 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/308 from denis0x0D:sandbox/composite_construct 9ef7180f77f9374bcd05afc4f9e6c1d2d72d02b7 PiperOrigin-RevId: 284613617	2019-12-09 12:43:53 -08:00
Lei Zhang	2c7e8ed7c6	[spirv] Add spv.IAdd, spv.ISub, and spv.IMul folders The patterns to be folded away can be commonly generated during lowering to SPIR-V. PiperOrigin-RevId: 284604855	2019-12-09 11:59:10 -08:00
Kazuaki Ishizaki	ae05cf27c6	Minor spelling tweaks Closes tensorflow/mlir#304 PiperOrigin-RevId: 284568358	2019-12-09 09:23:48 -08:00
Nicolas Vasilache	91c0074624	[StructuredOps][Linalg] Add a primitive pattern to rewrite the linalg.generic form of matmul to vector form. This CL uses the newly expanded matcher support to easily detect when a linalg.generic has a multiply-accumulate body. A linalg.generic with such a body is rewritten as a vector contraction. This CL additionally limits the rewrite to the case of matrix multiplication on contiguous and statically shaped memrefs for now. Before expanding further, we should harden the infrastructure for expressing custom ops with the structured ops abstraction. PiperOrigin-RevId: 284566659	2019-12-09 09:14:39 -08:00
Aart Bik	d37f27251f	[VecOps] Rename vector.[insert|extract]element to just vector.[insert|extract] Since these operations lower to [insert|extract][element|value] at LLVM dialect level, neither element nor value would correctly reflect the meaning. PiperOrigin-RevId: 284240727	2019-12-06 12:39:25 -08:00
Aart Bik	b36aaeafb1	[VectorOps] Add lowering of vector.broadcast to LLVM IR For example, a scalar broadcast %0 = vector.broadcast %x : f32 to vector<2xf32> return %0 : vector<2xf32> which expands scalar x into vector [x,x] by lowering to the following LLVM IR dialect to implement the duplication over the leading dimension. %0 = llvm.mlir.undef : !llvm<"<2 x float>"> %1 = llvm.mlir.constant(0 : index) : !llvm.i64 %2 = llvm.insertelement %x, %0[%1 : !llvm.i64] : !llvm<"<2 x float>"> %3 = llvm.shufflevector %2, %0 [0 : i32, 0 : i32] : !llvm<"<2 x float>">, !llvm<"<2 x float>"> return %3 : vector<2xf32> In the trailing dimensions, the operand is simply "passed through", unless a more elaborate "stretch" is required. For example %0 = vector.broadcast %arg0 : vector<1xf32> to vector<4xf32> return %0 : vector<4xf32> becomes %0 = llvm.mlir.undef : !llvm<"<4 x float>"> %1 = llvm.mlir.constant(0 : index) : !llvm.i64 %2 = llvm.extractelement %arg0[%1 : !llvm.i64] : !llvm<"<1 x float>"> %3 = llvm.mlir.constant(0 : index) : !llvm.i64 %4 = llvm.insertelement %2, %0[%3 : !llvm.i64] : !llvm<"<4 x float>"> %5 = llvm.shufflevector %4, %0 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : !llvm<"<4 x float>">, !llvm<"<4 x float>"> llvm.return %5 : !llvm<"<4 x float>"> PiperOrigin-RevId: 284219926	2019-12-06 11:02:29 -08:00
Andy Davis	41f8e105fa	Unroll vector masks along with their associated vector arguments. Updates vector ContractionOp to use proper vector masks (produced by CreateMaskOp/ConstantMaskOp). Leverages the following canonicalizations in unrolling unit test: CreateMaskOp -> ConstantMaskOp, StridedSliceOp(ConstantMaskOp) -> ConstantMaskOp Removes IndexTupleOp (no longer needed now that we have vector mask ops). Updates all unit tests. PiperOrigin-RevId: 284182168	2019-12-06 07:37:28 -08:00
Uday Bondhugula	3ade6a7d15	DimOp folding for alloc/view dynamic dimensions Signed-off-by: Uday Bondhugula <uday@polymagelabs.com> Closes tensorflow/mlir#253 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/253 from bondhugula:dimop a4b464f24ae63fd259114558d87e11b8ee4dae86 PiperOrigin-RevId: 284169689	2019-12-06 06:00:54 -08:00
Alex Zinenko	58adf99ed1	LLVM::AddressOfOp: properly take into account the address space The AddressOf operation in the LLVM dialect returns a pointer to a global variable. The latter may be in a non-default address space as indicated by the "addr_space" attribute. Check that the address space of the pointer returned by AddressOfOp matches that of the referenced GlobalOp. Update the AddressOfOp builder to respect this constraint. PiperOrigin-RevId: 284138860	2019-12-06 01:09:13 -08:00
Jose Ignacio Gomez	f60bbb6c3b	[Linalg] Add permutation information to tiling This patch closes issue tensorflow/mlir#271. It adds an optional permutation map to declarative tiling transformations. The map is expressed as a list of integers. Closes tensorflow/mlir#288 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/288 from tetuante:issue271 2df2938d6a1f01b3bc404ded08dea2dd1e10b588 PiperOrigin-RevId: 284064151	2019-12-05 15:14:59 -08:00
Denis Khalikov	e67acfa468	[spirv] Add CompositeInsertOp operation A CompositeInsertOp operation makes a copy of a composite object, while modifying one part of it. Closes tensorflow/mlir#292 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/292 from denis0x0D:sandbox/composite_insert 2200962b9057bda53cd2f2866b461e2797196380 PiperOrigin-RevId: 284036551	2019-12-05 13:10:44 -08:00
Lei Zhang	037044b0ae	Add spv.AtomicCompareExchangeWeak PiperOrigin-RevId: 283997917	2019-12-05 10:06:24 -08:00
Lei Zhang	c0a9de29ad	[spirv] Fix nested loop (de)serialization For serialization, when we have nested ops, the inner loop will create multiple SPIR-V blocks. If the outer loop has block arguments (which correspond to OpPhi instructions), we defer handling of the OpPhi's parent block until we have serialized all blocks and then fix it up with the result <id>. These two cases happening together generated an invalid SPIR-V blob because we previously assumed the parent block to be the block containing the terminator. That is not true anymore when the block contains structured control flow ops. If that happens, it should be fixed to use the structured control flow op's merge block. For deserialization, we record a map from header blocks to their corresponding merge and continue blocks during the initial deserialization and then use the info to construct spv.selection/spv.loop. The existing implementation will also fall apart when we have nested loops. If so, we clone all blocks for the outer loop, including the ones for the inner loop, to the spv.loop's region. So the map for header blocks' merge info needs to be updated; otherwise we are operating on already deleted blocks. PiperOrigin-RevId: 283949230	2019-12-05 04:39:37 -08:00
Andy Davis	d20d763241	Add canonicalization patterns for vector CreateMaskOp and StridedSliceOp to be used in the unroll vector op transformation. Adds a ConstantMaskOp to the vector ops dialect. Adds the following canonicalization patterns: CreateMaskOp -> ConstantMaskOp StridedSliceOp(ConstantMaskOp) -> ConstantMaskOp PiperOrigin-RevId: 283816752	2019-12-04 13:00:43 -08:00
Scott Todd	bf45ff6aab	[spirv] Adding sqrt op in the GLSL extension. PiperOrigin-RevId: 283769736	2019-12-04 09:16:23 -08:00
Lei Zhang	50b2b26e70	[spirv] Add spv.GroupNonUniformBallot This CL also did the following cleanup: - Moved the test for spv.SubgroupBallotKHR to its own file - Wrapped generated canonicalization patterns in anonymous namespace - Updated header comments in SPVOps.td PiperOrigin-RevId: 283650091	2019-12-03 16:44:09 -08:00
Andy Davis	2c13fd9f17	Add CreateMaskOp to the VectorOps dialect. PiperOrigin-RevId: 283591888	2019-12-03 11:55:54 -08:00
Alex Zinenko	993e79e9bd	Fix ViewOp to have at most one offset operand As described in the documentation, ViewOp is expected to take an optional dynamic offset followed by a list of dynamic sizes. However, the ViewOp parser did not include a check for the offset being a single value and accepted a list of values instead. Furthermore, several tests have been exercising the wrong syntax of a ViewOp, passing multiple values to the dynamic stride list, which was not caught by the parser. The trailing values could have been erroneously interpreted as dynamic sizes. This is likely due to resyntaxing of the ViewOp, with the previous syntax taking the list of sizes before the offset. Update the tests to use the syntax with the offset preceding the sizes. Worse, the conversion of ViewOp to the LLVM dialect assumed the wrong order of operands with offset in the trailing position, and erroneously relied on the permissive parsing that interpreted trailing dynamic offset values as leading dynamic sizes. Fix the lowering to use the correct order of operands. PiperOrigin-RevId: 283532506	2019-12-03 06:23:04 -08:00
Lei Zhang	1af9633d85	[spirv] Add spv.SubgroupBallotKHROp PiperOrigin-RevId: 283522284	2019-12-03 04:49:56 -08:00
Alex Zinenko	fdbb99cd62	Add linkage support to LLVMFuncOp A recent commit introduced the Linkage attribute to the LLVM dialect and used it in the Global Op. Also use it in LLVMFuncOp. As per LLVM Language Reference, if the linkage attribute is omitted, the function is assumed to have external linkage. PiperOrigin-RevId: 283493299	2019-12-03 00:26:44 -08:00
Aart Bik	3126004a5a	[VectorOps] Add legality rules to broadcast PiperOrigin-RevId: 283360101	2019-12-02 09:57:27 -08:00
Alexander Belyaev	9630fcbc52	Lower linalg.indexed_generic with libcall to LLVM. PiperOrigin-RevId: 283328994	2019-12-02 06:30:52 -08:00
Alex Zinenko	d5e627f84b	Introduce Linkage attribute to the LLVM dialect LLVM IR supports linkage on global objects such as global variables and functions. Introduce the Linkage attribute into the LLVM dialect, backed by an integer storage. Use this attribute on LLVM::GlobalOp and make it mandatory. Implement parsing/printing of the attribute and conversion to LLVM IR. See tensorflow/mlir#277. PiperOrigin-RevId: 283309328	2019-12-02 03:28:10 -08:00
Denis Khalikov	cd556f25de	[spirv] Check that operand of `spirv::CompositeExtractOp` is constant while folding. Closes tensorflow/mlir#281 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/281 from denis0x0D:sandbox/composite_ex_fold d02d73658bd1b9eaa515eb4e0aee34bc41d4252b PiperOrigin-RevId: 282971563	2019-11-28 13:27:56 -08:00
Jose Ignacio Gomez	0494ef60f7	[Linalg] Change attribute n_loop_types to iterator This addresses issue tensorflow/mlir#270. Linalg is updated to take the same form of iterator_types as vector contraction. Closes tensorflow/mlir#280 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/280 from tetuante:PRissue270 d26d88d090d3765d3b9884bfabdd023143f27287 PiperOrigin-RevId: 282905396	2019-11-28 01:59:55 -08:00
Lei Zhang	d4e4387fbf	[spirv] Add folders for spv.IAdd and spv.IMul Adding zero and multiplying one can be common when generating code for index calculation. This CL also sorted canonicalize.mlir to alphabetical order. PiperOrigin-RevId: 282828055	2019-11-27 13:46:52 -08:00
Nicolas Vasilache	1fa8c8070b	Implement Linalg to loops lowering as a pattern This CL rewrites the linalg ops to loops transformations as patterns that can be targeted directly from Tablegen. Reliance on OpFolder is removed and to cope with it we introduce local folding patterns that are applied greedily. PiperOrigin-RevId: 282765550	2019-11-27 07:32:13 -08:00
Aart Bik	e2232fbcee	[VectorOps] Refine BroadcastOp in VectorOps dialect Since the second argument is always fully overwritten and the shape is defined in the "to" clause, it is not needed. Also renamed "into" to "to" now that the arg is dropped. PiperOrigin-RevId: 282686475	2019-11-26 19:52:38 -08:00
Aart Bik	cf97263cb8	[VectorOps] Add a BroadcastOp to the VectorOps dialect PiperOrigin-RevId: 282643305	2019-11-26 14:43:31 -08:00
Mahesh Ravishankar	03620fa70a	Misc changes to lowering to SPIR-V. These changes to SPIR-V lowering were made while adding support for lowering SubViewOp, but are not directly related. - Change the lowering of MemRefType to !spv.ptr<!spv.struct<!spv.array<...>[offset]>, ..> This is consistent with the Vulkan spec. - To enable testing, a simple pattern for lowering functions is added to ConvertStandardToSPIRVPass. This is just used to convert the type of the arguments of the function. The added function lowering itself is not meant to be the way functions are eventually lowered into SPIR-V dialect. PiperOrigin-RevId: 282589644	2019-11-26 10:11:34 -08:00
Nicolas Vasilache	36469f7d2a	Add a vector.InsertStridedSliceOp This new op is the counterpart of vector.StridedSliceOp and will be used in the pattern rewrites for vector unrolling. PiperOrigin-RevId: 282447414	2019-11-25 15:37:13 -08:00
MLIR Team	1012c492f0	Allow LLVM::ExtractElementOp to have non-i32 indices. Also change the text format a bit, so that indices are braced by squares. PiperOrigin-RevId: 282437095	2019-11-25 14:44:52 -08:00
Andy Davis	8fc44a4d13	Update VectorContractionOp to take iterator types and index mapping attributes compatible with linalg ops. PiperOrigin-RevId: 282412311	2019-11-25 12:40:00 -08:00
Mahesh Ravishankar	bd485afda0	Introduce attributes that specify the final ABI for a spirv::ModuleOp. To simplify the lowering into SPIR-V, while still respecting the ABI requirements of SPIR-V/Vulkan, split the process into two 1) While lowering a function to SPIR-V (when the function is an entry point function), allow specifying attributes on arguments and function itself that describe the ABI of the function. 2) Add a pass that materializes the ABI described in the function. Two attributes are needed. 1) Attribute on arguments of the entry point function that describe the descriptor_set, binding, storage class, etc, of the spv.globalVariable this argument will be replaced by 2) Attribute on function that specifies workgroup size, etc. (for now only workgroup size). Add the pass -spirv-lower-abi-attrs to materialize the ABI described by the attributes. This change makes the SPIRVBasicTypeConverter class unnecessary and is removed, further simplifying the SPIR-V lowering path. PiperOrigin-RevId: 282387587	2019-11-25 11:19:56 -08:00
Nicolas Vasilache	01145544aa	Add vector.insertelement op This is the counterpart of vector.extractelement op and has the same limitations at the moment (static I64IntegerArrayAttr to express position). This restriction will be lifted in the future. LLVM lowering will be added in a subsequent commit. PiperOrigin-RevId: 282365760	2019-11-25 08:47:15 -08:00
Alex Zinenko	bf4692dc49	Introduce gpu.func Introduce a new function-like operation to the GPU dialect to provide a placeholder for the execution semantic description and to add support for GPU memory hierarchy. This aligns with the overall goal of the dialect to expose the common abstraction layer for GPU devices, in particular by providing an MLIR unit of semantics (i.e. an operation) for memory modeling. This proposal has been discussed in the mailing list: https://groups.google.com/a/tensorflow.org/d/msg/mlir/RfXNP7Hklsc/MBNN7KhjAgAJ As decided, the "convergence" aspect of the execution model will be factored out into a new discussion and therefore is not included in this commit. This commit only introduces the operation but does not hook it up with the remaining flow. The intention is to develop the new flow while keeping the old flow operational and do the switch in a simple, separately reversible commit. PiperOrigin-RevId: 282357599	2019-11-25 08:10:37 -08:00
Denis Khalikov	a5cda4763f	[spirv] Add a canonicalizer for `spirv::LogicalNotOp`. Add a canonicalizer for `spirv::LogicalNotOp`. Converts: * spv.LogicalNot(spv.IEqual(...)) -> spv.INotEqual(...) * spv.LogicalNot(spv.INotEqual(...)) -> spv.IEqual(...) * spv.LogicalNot(spv.LogicalEqual(...)) -> spv.LogicalNotEqual(...) * spv.LogicalNot(spv.LogicalNotEqual(...)) -> spv.LogicalEqual(...) Also moved the test for spv.IMul to arithmetic tests. Closes tensorflow/mlir#256 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/256 from denis0x0D:sandbox/canon_logical_not 76ab5787b2c777f948c8978db061d99e76453d44 PiperOrigin-RevId: 282012356	2019-11-22 12:25:52 -08:00
Alex Zinenko	b5af3784a6	Don't force newline before function attributes Due to legacy reasons, a newline character followed by two spaces was always inserted before the attributes of the function Op in pretty form. This breaks formatting when functions are nested in some other operations. Don't print the newline and just put the attributes on the same line, which is also more consistent with module Op. Line breaking aware of indentation can be introduced separately into the parser if deemed useful. PiperOrigin-RevId: 281721793	2019-11-21 05:08:19 -08:00
Andy Davis	d6a70b31be	Add VectorContractionOp to the VectorOps dialect. PiperOrigin-RevId: 281605471	2019-11-20 14:53:57 -08:00
Stephan Herhut	abb626686d	Extend kernel outlining to also consider dim worth inlining. PiperOrigin-RevId: 281483447	2019-11-20 02:59:35 -08:00
Nicolas Vasilache	ee95f6f259	Add VectorOps.StridedSliceOp The `vector.strided_slice` takes an n-D vector, k-D `offsets` integer array attribute, a k-D `sizes` integer array attribute, a k-D `strides` integer array attribute and extracts the n-D subvector at the proper offset. Returns an n-D vector where the first k-D dimensions match the `sizes` attribute. The returned subvector contains the elements starting at offset `offsets` and ending at `offsets + sizes`. Example: ``` %1 = vector.strided_slice %0 {offsets : [0, 2], sizes : [2, 4], strides : [1, 1]}: vector<4x8x16xf32> // returns a vector<2x4x16xf32> ``` This op will be useful for progressive lowering within the VectorOp dialect. PiperOrigin-RevId: 281352749	2019-11-19 12:22:34 -08:00
Hanhan Wang	c614c92fdc	Support SPIR-V constant op to take DenseElementsAttr as input. Iterates each element to build the array. This includes a little refactor to combine bool/int/float into a function, since they are similar. The only difference is calling different function in the end. PiperOrigin-RevId: 281210288	2019-11-18 20:02:05 -08:00
Alexander Belyaev	8c6a5233d5	Lower linalg.indexed_generic to loops. PiperOrigin-RevId: 281169885	2019-11-18 16:55:15 -08:00
River Riddle	9873a29817	Add a parseAttribute<AttrType> overload for the non-type case. The variant that accepts a type will check that the parsed attribute is a valid instance of AttrType. The non-type variant would silently fail in this case, leading to garbage attribute values. PiperOrigin-RevId: 281136528	2019-11-18 13:11:36 -08:00
Denis Khalikov	6c77e59bfd	[spirv] Add a canonicalizer for BitcastOp. Convert chained `spirv::BitcastOp` operations into one `spirv::BitcastOp` operation. Closes tensorflow/mlir#238 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/238 from denis0x0D:sandbox/canon_bitcast 4352ed4f81b959ec92f849c599e733b62a99c010 PiperOrigin-RevId: 281129234	2019-11-18 12:37:00 -08:00
Denis Khalikov	68e48ba111	[spirv] Add bit ops This CL added op definitions for a few bit operations: * OpBitFieldInsert * OpBitFieldSExtract * OpBitFieldUExtract Closes tensorflow/mlir#233 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/233 from denis0x0D:sandbox/bit_field_ops e7fd85b00d72d483d7992dc42b9cc4d673903455 PiperOrigin-RevId: 280691816	2019-11-15 11:03:19 -08:00
Nicolas Vasilache	f2b6ae9991	Move VectorOps to Tablegen - (almost) NFC This CL moves VectorOps to Tablegen and cleans up the implementation. This is almost NFC but 2 changes occur: 1. an interface change occurs in the padding value specification in vector_transfer_read: the value becomes non-optional. As a shortcut we currently use %f0 for all paddings. This should become an OpInterface for vectorization in the future. 2. the return type of vector.type_cast is trivial and simplified to `memref<vector<...>>` Relevant roundtrip and invalid tests that used to sit in core are moved to the vector dialect. The op documentation is moved to the .td file. PiperOrigin-RevId: 280430869	2019-11-14 08:15:23 -08:00
Nicolas Vasilache	0bd6390b54	Deprecate linalg.subview in favor of std.subview This CL uses the now standard std.subview in linalg. Two shortcuts are currently taken to allow this port: 1. the type resulting from a view is currently degraded to fully dynamic to pass the SubViewOp verifier. 2. indexing into SubViewOp may access out of bounds since lowering to LLVM does not currently enforce it by construction. These will be fixed in subsequent commits after discussions. PiperOrigin-RevId: 280250129	2019-11-13 12:10:09 -08:00
Hanhan Wang	85d7fb3324	Make VariableOp instructions be in the first block in the function. Since VariableOp is serialized during processBlock, we add two more fields, `functionHeader` and `functionBody`, to collect instructions for a function. After all the blocks have been processed, we append them to the `functions`. Also, fix a bug in processGlobalVariableOp. The global variables should be encoded into `typesGlobalValues`. PiperOrigin-RevId: 280105366	2019-11-12 18:59:15 -08:00
Lei Zhang	b259c26eb0	Add support for OpPhi in loop header block During deserialization, the loop header block will be moved into the spv.loop's region. If the loop header block has block arguments, we need to make sure it is correctly carried over to the block where the new spv.loop resides. During serialization, we need to make sure block arguments from the spv.loop's entry block are not silently dropped. PiperOrigin-RevId: 280021777	2019-11-12 12:00:28 -08:00
Nicolas Vasilache	f51a155337	Add support for alignment attribute in std.alloc. This CL adds an extra pointer to the memref descriptor to allow specifying alignment. In a previous implementation, we used 2 types: `linalg.buffer` and `view` where the buffer type was the unit of allocation/deallocation/alignment and `view` was the unit of indexing. After multiple discussions it was decided to use a single type, which conflates both, so the memref descriptor now needs to carry both pointers. This is consistent with the [RFC-Proposed Changes to MemRef and Tensor MLIR Types](https://groups.google.com/a/tensorflow.org/forum/#!searchin/mlir/std.view%7Csort:date/mlir/-wKHANzDNTg/4K6nUAp8AAAJ). PiperOrigin-RevId: 279959463	2019-11-12 07:06:54 -08:00
MLIR Team	9fbf52e330	Look for SymbolRefAttr in KernelOutlining instead of hard-coding CallOp This code should be exercised using the existing kernel outlining unit test, but let me know if I should add a dedicated unit test using a fake call instruction as well. PiperOrigin-RevId: 279436321	2019-11-08 19:13:13 -08:00
Denis Khalikov	4697d657b7	[spirv] Add bit ops This CL added op definitions for a few bit operations: * OpShiftLeftLogical * OpShiftRightArithmetic * OpShiftRightLogical * OpBitCount * OpBitReverse * OpNot Also moved the definition of spv.BitwiseAnd to follow the lexicographical order. Closes tensorflow/mlir#215 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/215 from denis0x0D:sandbox/bit_ops d9b0852b689ac6c4879a9740b1740a2357f44d24 PiperOrigin-RevId: 279350470	2019-11-08 11:17:05 -08:00
Andy Davis	8f00b4494d	Swap operand order in std.view operation so that offset appears before dynamic sizes in the operand list. PiperOrigin-RevId: 279114236	2019-11-07 10:20:23 -08:00
Nicolas Vasilache	72040bf7c8	Update Linalg to use std.view Now that a view op has graduated to the std dialect, we can update Linalg to use it and remove ops that have become obsolete. As a byproduct, the linalg buffer and associated ops can also disappear. PiperOrigin-RevId: 279073591	2019-11-07 06:33:10 -08:00
Alexander Belyaev	eee9cbdeb7	Add IndexedGenericOp to Linalg. PiperOrigin-RevId: 279013404	2019-11-06 22:36:25 -08:00
Eric Schweitz	0d545921ea	Add support for the LLVM FNeg instruction Closes tensorflow/mlir#216 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/216 from schweitzpgi:llvmir-fneg-op f9b5f185845d671b745ab6fc213d5d9aff044b34 PiperOrigin-RevId: 278795325	2019-11-06 00:02:10 -08:00
James Molloy	250a11ae0f	[llvm] Allow GlobalOp to take a region for complex initializers This allows GlobalOp to either take a value attribute (for simple constants) or a region that can contain IR instructions (that must be constant-foldable) to create a ConstantExpr initializer. Example: // A complex initializer is constructed with an initializer region. llvm.mlir.global constant @int_gep() : !llvm<"i32*"> { %0 = llvm.mlir.addressof @g2 : !llvm<"i32*"> %1 = llvm.mlir.constant(2 : i32) : !llvm.i32 %2 = llvm.getelementptr %0[%1] : (!llvm<"i32*">, !llvm.i32) -> !llvm<"i32*"> llvm.return %2 : !llvm<"i32*"> } PiperOrigin-RevId: 278717836	2019-11-05 15:11:01 -08:00
MLIR Team	1f43d0d000	[NVVM] Add mma.sync operation. PiperOrigin-RevId: 278440547	2019-11-04 12:36:37 -08:00
River Riddle	e4a912eb5a	Update the SPV dialect type parser to use the methods on DialectAsmParser directly. This simplifies the implementation quite a bit, and removes the need for explicit string munging. One change is made to some of the enum elements of SPV_DimAttr to ensure that they are proper identifiers; The string form is now prefixed with 'Dim'. PiperOrigin-RevId: 278027132	2019-11-01 16:55:25 -07:00
River Riddle	68cfc89a0d	Refactor LinalgDialect::parseType to use the DialectAsmParser methods directly. This simplifies the implementation, and removes the need to do explicit string manipulation. A utility method 'parseDimensionList' is added to the DialectAsmParser to simplify defining types and attributes that contain shapes. PiperOrigin-RevId: 278020604	2019-11-01 16:14:10 -07:00
River Riddle	e94a8bfca8	Refactor QuantOps TypeParser to use the DialectAsmParser methods directly. This greatly simplifies the implementation and removes custom parser functionality. The necessary methods are added to the DialectAsmParser. PiperOrigin-RevId: 278015983	2019-11-01 15:47:03 -07:00
Nicolas Vasilache	e20a2aa9f2	Delete spurious file PiperOrigin-RevId: 277967079	2019-11-01 11:28:15 -07:00
Nicolas Vasilache	bd94a10c02	Add Linalg pattern for producer-consumer fusion This CL adds a simple pattern for specifying producer-consumer fusion on Linalg operations. Implementing such an extension reveals some interesting properties. Since Linalg operates on a buffer abstraction, the output buffers are specified as in/out parameters to the ops. As a consequence, there are no SSA use-def chains and one cannot specify complex dag input patterns with the current infrastructure. Instead this CL uses constraints based on the existing linalg dependence analysis to focus the pattern and refine patterns based on the type of op that last wrote in a buffer. This is a very local property and is less powerful than the generic dag specification based on SSA use-def chains. This will be generalized in the future. PiperOrigin-RevId: 277931503	2019-11-01 08:30:38 -07:00
James Molloy	96531e2f87	[mlir][llvm] Add missing cast ops Also adds a builder method for fcmp, identical to that for icmp. PiperOrigin-RevId: 277923158	2019-11-01 07:32:09 -07:00
Denis Khalikov	d423d4a338	[spirv] Add cast operations This CL added op definitions for a few cast operations: * OpConvertFToU * OpConvertFToS * OpConvertSToF * OpConvertUToF * OpUConvert * OpSConvert * OpFConvert Also moved the definition of spv.Bitcast to the new file. Closes tensorflow/mlir#208 and tensorflow/mlir#174 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/208 from denis0x0D:sandbox/cast_ops 79bc9b37398aafddee6cf6beb301807988fe67f9 PiperOrigin-RevId: 277587891	2019-10-30 14:53:04 -07:00
Nicolas Vasilache	05a5a41416	Add basic support for declarative Linalg transformations Linalg ops provide a good anchor for pattern matching/rewriting transformations. This CL adds a simple example of how multi-level tiling may be specified by attaching a simple StringAttr to ops as they are transformed so we can easily specify partial lowering to control transformation application. This is a first stab at taking advantage of higher-level information contained in Linalg ops and will evolve in the future. PiperOrigin-RevId: 277497958	2019-10-30 07:12:33 -07:00
Lei Zhang	80213ba5f0	[spirv] Fix gen_spirv_dialect.py and add spv.Unreachable This CL fixed gen_spirv_dialect.py to support nested delimiters when chunking existing ODS entries in .td files and to allow ops without correspondence in the spec. This is needed to pull in the definition of OpUnreachable. PiperOrigin-RevId: 277486465	2019-10-30 05:41:18 -07:00
Lei Zhang	ca2538e9a7	[spirv] Support OpPhi using block arguments This CL adds another control flow instruction in SPIR-V: OpPhi. It is modelled as block arguments to be idiomatic with MLIR. See the rationale.md doc for "Block Arguments vs PHI nodes". Serialization and deserialization is updated to convert between block arguments and SPIR-V OpPhi instructions. PiperOrigin-RevId: 277161545	2019-10-28 15:58:42 -07:00
Denis Khalikov	dd2e444325	[spirv] AccessChainOp canonicalization. Combine chained `spirv::AccessChainOp` operations into one `spirv::AccessChainOp` operation. Closes tensorflow/mlir#198 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/198 from denis0x0D:sandbox/canon_access_chain 0cb87955a85511071143d62637ff939d0dabc2bd PiperOrigin-RevId: 276609345	2019-10-24 18:41:34 -07:00
River Riddle	2b61b7979e	Convert the Canonicalize and CSE passes to generic Operation Passes. This allows for them to be used on other non-function, or even other function-like, operations. The algorithms are already generic, so this is simply changing the derived pass type. The majority of this change is just ensuring that the nesting of these passes remains the same, as the pass manager won't auto-nest them anymore. PiperOrigin-RevId: 276573038	2019-10-24 15:01:09 -07:00
Uday Bondhugula	ad6925f479	Update loop.for verifier message fix: nonnegative -> positive Closes tensorflow/mlir#206 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/206 from bondhugula:bondhugula-patch-1 9a47ca7dfd230180a9df33e9a64b33d02252d30a PiperOrigin-RevId: 276060885	2019-10-22 07:34:56 -07:00
Lei Zhang	d9fe892e42	[spirv] Allow block arguments on spv.Branch(Conditional) We will use block arguments as the way to model SPIR-V OpPhi in the SPIR-V dialect. This CL also adds a few useful helper methods to both ops to get the block arguments. Also added tests for branch weight (de)serialization. PiperOrigin-RevId: 275960797	2019-10-21 17:32:00 -07:00
River Riddle	9ac459e871	Add a Symbol trait to simplify defining operations that represent symbols. This trait provides accessors for the name, symbol use list methods, verification, with more to be added. PiperOrigin-RevId: 275864554	2019-10-21 09:58:59 -07:00
Kazuaki Ishizaki	f28c5aca17	Fix minor spelling tweaks (NFC) Closes tensorflow/mlir#175 PiperOrigin-RevId: 275726876	2019-10-20 09:44:36 -07:00
Kazuaki Ishizaki	8bfedb3ca5	Fix minor spelling tweaks (NFC) Closes tensorflow/mlir#177 PiperOrigin-RevId: 275692653	2019-10-20 00:11:34 -07:00
Christian Sigg	c3e56cd12c	Get active source lane predicate from shuffle instruction. nvvm.shfl.sync.bfly optionally returns a predicate whether source lane was active. Support for this was added to clang in https://reviews.llvm.org/D68892. Add an optional 'pred' unit attribute to the instruction to return this predicate. Specify this attribute in the partial warp reduction so we don't need to manually compute the predicate. PiperOrigin-RevId: 275616564	2019-10-19 01:53:25 -07:00
Nicolas Vasilache	9e7e297da3	Lower vector transfer ops to loop.for operations. This allows mixing linalg operations with vector transfer operations (with additional modifications to affine ops) and is a step towards solving tensorflow/mlir#189. PiperOrigin-RevId: 275543361	2019-10-18 14:10:10 -07:00
Stephan Herhut	3622e1833f	Use StrEnumAttr for gpu.allreduce op instead of StringAttr to better encode constraints. PiperOrigin-RevId: 275448372	2019-10-18 04:44:48 -07:00
Christian Sigg	fe0ee32da5	Add gpu.barrier op to synchronize invocations of a local workgroup. Adding gen table for rewrite patterns from GPU to NVVM dialect. Copy missing op documentation from GPUOps.td to GPU.md. PiperOrigin-RevId: 275419588	2019-10-18 00:30:44 -07:00
Nicolas Vasilache	5b03e692f6	Decouple Linalg promotion from Linalg tiling - NFC This CL creates a new Linalg promotion pass that operates on SubViewOp and decouples it from Linalg tiling. This is mostly moving code around. PiperOrigin-RevId: 275329213	2019-10-17 13:41:17 -07:00
Denis Khalikov	a560505d1a	[spirv] Add a canonicalization pattern for spv.selection. Add a canonicalization pattern for spv.selection operation. Convert spv.selection operation to spv.Select based on simple pattern. Closes tensorflow/mlir#183 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/183 from denis0x0D:sandbox/canon_select 43d04d923272dd60b9da39f70bdbc51a5168db62 PiperOrigin-RevId: 275312748	2019-10-17 12:36:47 -07:00
Lei Zhang	057dc41bf6	Allow '_' when pretty printing dialect symbols '_' is used frequently enough as the separator of words in symbols. We should allow it in dialect symbols when considering pretty printing. Also updated LangRef.md regarding pretty form. PiperOrigin-RevId: 275312494	2019-10-17 12:24:18 -07:00
Lei Zhang	0e3efb32c6	[spirv] Implement inliner interface We just need to implement a few interface hooks to DialectInlinerInterface and CallOpInterface to gain the benefits of an inliner. :) Right now only supports some trivial cases: * Inlining single block with spv.Return/spv.ReturnValue * Inlining multi block with spv.Return * Inlining spv.selection/spv.loop without return ops More advanced cases will require block argument and Phi support. PiperOrigin-RevId: 275151132	2019-10-16 17:46:19 -07:00
Christian Sigg	d2f0f847af	Support custom accumulator provided as region to gpu.all_reduce. In addition to specifying the type of accumulation through the 'op' attribute, the accumulation can now also be specified as arbitrary code region. Adds a gpu.yield op to specify the result of the accumulation. Also support more types (integers) and accumulations (mul). PiperOrigin-RevId: 275065447	2019-10-16 10:43:44 -07:00
Hanhan Wang	950979745a	Add support for OpBitwiseOr, OpBitwiseXor, and OpBitwiseAnd in SPIR-V dialect. PiperOrigin-RevId: 274935374	2019-10-15 18:42:40 -07:00
Lei Zhang	e03e151983	[spirv] Add support for SpecId decoration on spv.specConstant The SpecId decoration is the handle for providing external specialization. Similar to descriptor set and binding on global variables, we directly bake it into assembly parsing and printing. PiperOrigin-RevId: 274893879	2019-10-15 14:53:30 -07:00
Nicolas Vasilache	5c5d83afb4	Fix linalg.subview behavior in (partially) static cases. When the implementation of the strided memref [RFC](https://groups.google.com/a/tensorflow.org/forum/#!msg/mlir/MaL8m2nXuio/1scRqZa6AQAJ) landed, linalg started using this type instead of the now retired !linalg.view. As static and partially static cases appear, the stride information needs to be maintained properly. In particular, the result type of the subview op was generally incorrect. This CL fixes the issue by computing a return type that: 1. always has dynamic sizes, which is generally the only correct way to construct a subview in the absence of data padding and/or code versioning. 2. has the same strides as the base strided memref. Point 1. above can be further refined but will need further analysis and canonicalization to optimize the particular case where: 1. The base memref has static size along a given dimension. 2. The subview size can be statically derived (e.g. after canonicalization). 3. And the subview size is an even divisor of the base memref. This 3rd constraint is well-known in the case of tiled layouts that don't assume implicit padding: the boundary tile may be only partial and has size given by `problem_size % tile_size`. Tests are updated as appropriate. PiperOrigin-RevId: 274578624	2019-10-14 08:43:53 -07:00
Nicolas Vasilache	c2285b619d	Add lowering of VectorOps dialect to LLVM to the Linalg LLVM lowering pass This fixes an omission that prevents Linalg from lowering generic op regions operating on ops in the VectorOps dialect. To achieve this we simply need to `populateVectorToLLVMConversionPatterns` in the conversion. Relevant tests are added. PiperOrigin-RevId: 274577325	2019-10-14 08:43:26 -07:00
Eric Schweitz	a3d084848d	Add LLVM IR dialect hooks for FP128 and X86_FP80 types Closes tensorflow/mlir#184 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/184 from schweitzpgi:more-float-types ca27d00510a86ffc9c79c65fb3a0193b5ea097a0 PiperOrigin-RevId: 274288813	2019-10-11 18:35:33 -07:00
Alex Zinenko	71b82bcbf6	LLVM Dialect: introduce llvm.mlir.null operation Similarly to `llvm.mlir.undef`, this auxiliary operation creates an SSA value that corresponds to `null` in LLVM IR. This operation is necessary to model sizeof(<...>) behavior when allocating memory. PiperOrigin-RevId: 274158760	2019-10-11 06:32:24 -07:00
Mahesh Ravishankar	28d7f9c052	Add lowering of constant ops to SPIR-V. The lowering is specified as a pattern and is done only if the result is a SPIR-V scalar type or vector type. Handling ConstantOp with index return type needs special handling since SPIR-V dialect does not have index types. Based on the bitwidth of the attribute value, either i32 or i64 is chosen. Other constant lowerings are left as a TODO. PiperOrigin-RevId: 274056805	2019-10-10 17:19:57 -07:00
Denis Khalikov	d21ba951de	[spirv] Add a pass to decorate the composite types with layout info. Add a pass to decorate the composite types used by composite objects in the StorageBuffer, PhysicalStorageBuffer, Uniform, and PushConstant storage classes with layout information. Closes tensorflow/mlir#156 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/156 from denis0x0D:sandbox/layout_info_decoration 7c50840fd38ca169a2da7ce9886b52b50c868b84 PiperOrigin-RevId: 273634140	2019-10-08 16:54:11 -07:00
Alex Zinenko	90d65d32d6	Use named modules for gpu.launch_func The kernel function called by gpu.launch_func is now placed into an isolated nested module during the outlining stage to simplify separate compilation. Until recently, modules did not have names and could not be referenced. This limitation was circumvented by introducing a stub kernel at the same name at the same nesting level as the module containing the actual kernel. This relation is only effective in one direction: from actual kernel function to its launch_func "caller". Leverage the recently introduced symbol name attributes on modules to refer to a specific nested module from `gpu.launch_func`. This removes the implicit connection between the identically named stub and kernel functions. It also enables support for `gpu.launch_func`s to call different kernels located in the same module. PiperOrigin-RevId: 273491891	2019-10-08 04:30:32 -07:00
Lei Zhang	5a1108c9a6	[spirv] Disable a crashing spv.loop test PiperOrigin-RevId: 273379318	2019-10-07 14:40:49 -07:00
Mahesh Ravishankar	9e9c3a009a	Update UndefOp (de)serialization to generate OpUndef at module level. The SPIR-V spec recommends all OpUndef instructions be generated at module level. For the SPIR-V dialect it's better for UndefOp to produce an SSA value for use with other instructions. If UndefOp is to be used at module level, it cannot produce an SSA value (use of this SSA value within FuncOp would need implicit capture). To satisfy the needs of the SPIR-V spec while making it simpler to represent UndefOp in the SPIR-V dialect, the serialization is updated to create the OpUndef instruction at module scope. PiperOrigin-RevId: 273355526	2019-10-07 12:56:38 -07:00
Lei Zhang	ebf584b813	[spirv] Fix function entry block erase after moving to spv.selection The structured selection/loop's entry block does not have arguments. If the function's header block is also part of the structured control flow, we cannot just simply erase it because it may contain arguments matching the function signature and used by the cloned blocks. Instead, turn it into a block only containing a spv.Branch op. Also, we can directly emit instructions for the spv.selection header block to the block containing the spv.selection op. This eliminates unnecessary branches in the SPIR-V blob. Added a test for nested spv.loop. PiperOrigin-RevId: 273351424	2019-10-07 12:37:13 -07:00
Nicolas Vasilache	9f98bcda47	Support AllocOp terminal in Linalg::AliasAnalysis. Now that linalg.view and strided memrefs are unified, there is no reason to disallow AllocOp in alias analysis. This CL adds support for AllocOp which allows writing shorter tests that do not require explicitly creating a view for each operation. PiperOrigin-RevId: 273303060	2019-10-07 09:01:18 -07:00
Lei Zhang	c020480fc6	[spirv] Allow return ops to be in control flow ops Use `getParentOfType<FunctionOp>()` instead of `cast<FuncOp>(getParentOp())` to avoid crash when return ops are used inside spv.selection/spv.loop. PiperOrigin-RevId: 273006041	2019-10-04 20:08:52 -07:00
Mahesh Ravishankar	3f8bde40cb	Add spv.Undef op to support OpUndef instruction in SPIR-V. Adding support for OpUndef instruction. Updating the dialect generation script to fix a few bugs in the instruction spec generation. PiperOrigin-RevId: 272975685	2019-10-04 16:00:22 -07:00
Nicolas Vasilache	516f6a3477	Add missing Linalg lowerings to allow roundtrip.mlir to lower to LLVM Certain lowering patterns were reported as [missing](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/dkdmHa77sSQ). This CL adds them and allows Linalg/roundtrip.mlir and Linalg/loops.mlir to lower to LLVM directly. Those 2 tests are updated to additionally check that the direct lowering to LLVM does not crash. The following points, left as TODOs still need to be addressed for correct end-to-end execution: 1. the lowering for ConvOp needs to pass attributes such as strides and dilations; the external library call needs to support it. 2. the lowering for GenericOp needs to support lowering to loops as a DialectConversion pattern. This is blocked on the DialectConversion infrastructure accepting an OperationFolder. PiperOrigin-RevId: 272878131	2019-10-04 08:07:54 -07:00
Feng Liu	8c95223e3c	Add `axis` attribute to the quant.stats op The first dim length of the axisStats attribute should equal the slice size of the input argument when split along the axis dimension. PiperOrigin-RevId: 272798042	2019-10-03 20:29:08 -07:00
Nicolas Vasilache	218f0e611a	Add syntactic sugar for strided memref parsing. This CL implements the last remaining bit of the [strided memref proposal](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio). The syntax is a bit more explicit than what was originally proposed and resembles: `memref<?x?xf32, offset: 0 strides: [?, 1]>` Nonnegative strides and offsets are currently supported. Future extensions will include negative strides. This also gives a concrete example of syntactic sugar for the [\[RFC\] Proposed Changes to MemRef and Tensor MLIR Types](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/-wKHANzDNTg). The underlying implementation still uses AffineMap layout. PiperOrigin-RevId: 272717437	2019-10-03 12:34:36 -07:00
Lei Zhang	f294e0e513	[spirv] Add support for spv.selection Similar to spv.loop, spv.selection is another op for modelling SPIR-V structured control flow. It covers both OpBranchConditional and OpSwitch with OpSelectionMerge. Instead of having a `spv.SelectionMerge` op to directly model selection merge instruction for indicating the merge target, we use regions to delimit the boundary of the selection: the merge target is the next op following the `spv.selection` op. This way it's easier to discover all blocks belonging to the selection and it plays nicer with the MLIR system. PiperOrigin-RevId: 272475006	2019-10-02 11:01:57 -07:00
Nicolas Vasilache	e36337a998	Unify Linalg types by using strided memrefs This CL finishes the implementation of the Linalg + Affine type unification of the [strided memref RFC](https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio). As a consequence, the !linalg.view type, linalg::DimOp, linalg::LoadOp and linalg::StoreOp can now disappear and Linalg can use standard types everywhere. PiperOrigin-RevId: 272187165	2019-10-01 05:23:21 -07:00
Denis Khalikov	219421ece7	[spirv] Add array length check. According to the SPIR-V spec: "Length is the number of elements in the array. It must be at least 1." Closes tensorflow/mlir#160 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/160 from denis0x0D:sandbox/array_len 0840dc0986ad0088a3aa7d5d8d3e97d489377ed9 PiperOrigin-RevId: 272094669	2019-09-30 16:43:26 -07:00
Mahesh Ravishankar	2f7bb1e25f	Add support for Logical Ops in SPIR-V dialect Add operations corresponding to OpLogicalAnd, OpLogicalNot, OpLogicalEqual, OpLogicalNotEqual and OpLogicalOr instructions in SPIR-V dialect. This needs changes to class hierarchy in SPIR-V TableGen files to split SPIRVLogicalOp into SPIRVLogicalUnaryOp and SPIRVLogicalBinaryOp. All derived classes of SPIRVLogicalOp are updated accordingly. Update the spirv dialect generation script to 1) Allow specifying base class to use for instruction spec generation and file name to generate the specification in separately. 2) Use the existing descriptions for operations. 3) Update define_inst.sh to also invoke define_opcode.sh to also define the corresponding SPIR-V instruction opcode enum. PiperOrigin-RevId: 272014876	2019-09-30 10:40:36 -07:00
Nicolas Vasilache	ddf737c5da	Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering. The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio). Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues. Solving the ABI problem generally is a very complex problem and most likely involves taking a dependence on clang that we do not want atm. A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implements. PiperOrigin-RevId: 271591708	2019-09-27 09:57:36 -07:00
Deven Desai	fee40fef5c	[ROCm] Adding ROCDL Dialect. This commit introduces the ROCDL Dialect (i.e. the ROCDL ops + the code to lower those ROCDL ops to LLVM intrinsics/functions). Think of ROCDL Dialect as analogous to the NVVM Dialect, but for AMD GPUs. This patch contains just the essentials needed to get a simple example up and running. We expect to make further additions to the ROCDL Dialect. This is the first of 3 commits, the follow-up will be: * add a pass that lowers GPU Dialect to ROCDL Dialect * add a "mlir-rocm-runner" utility Closes tensorflow/mlir#146 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/146 from deven-amd:deven-rocdl-dialect e78e8005c75a78912631116c78dc844fcc4b0de9 PiperOrigin-RevId: 271511259	2019-09-27 00:22:32 -07:00
Nicolas Vasilache	445232df0b	Decouple tiling from fusion in Linalg. This CL modifies the linalg-fusion pass such that it does not tile anymore as part of the pass. Tiling is a separate concern that enables linalg fusion but should happen before. This makes fusion more composable with other decisions. In particular the fusion pass now becomes greedy and only applies the transformation on a best-effort basis. This should also let fusion work in a multi-hop fashion with chains of producer/consumers. Since the fusion pass does not perform tiling anymore, tests are rewritten to be in pretiled form and make the intent of the test clearer (albeit more verbose). PiperOrigin-RevId: 271357741	2019-09-26 08:44:31 -07:00
Christian Sigg	116dac00ba	Add AllReduceOp to GPU dialect with lowering to NVVM. The reduction operation is currently fixed to "add", and the scope is fixed to "workgroup". The implementation is currently limited to sizes that are multiples of 32 (warp size) and no larger than 1024. PiperOrigin-RevId: 271290265	2019-09-26 00:17:50 -07:00
Mahesh Ravishankar	6f0e65441c	Add spv.Bitcast operation to SPIR-V dialect Support the OpBitcast instruction of SPIR-V using the spv.Bitcast operation. The semantics implemented in the dialect differ from the SPIR-V spec in that the dialect does not allow conversion to/from pointer types from/to non-pointer types. PiperOrigin-RevId: 271255957	2019-09-25 19:01:53 -07:00
Lei Zhang	ae13c28f3f	[spirv] Add SPV_UnaryOp and spv.FNegate This CL also moves common parsers and printers to the same section in SPIRVOps.cpp. PiperOrigin-RevId: 271233546	2019-09-25 16:35:08 -07:00
Mahesh Ravishankar	c5284fe85e	Add support for GLSL Binary ops, and use it to implement GLSL FMax. A base class is added to implement all GLSL Binary operations and is used to implement the FMax operation. The existing framework already generates all the necessary (de)serialization code. PiperOrigin-RevId: 271037166	2019-09-24 19:42:11 -07:00
Christian Sigg	74cdbf5909	Clone called functions into nested GPU module. PiperOrigin-RevId: 270891190	2019-09-24 06:29:54 -07:00
Lei Zhang	6caa4f500b	[spirv] NFC: clean up (de)serialization tests This CL uses the newly added -split-input-file CLI option to mlir-translate to combine certain (de)serialization tests. It also renames certain test filenames. PiperOrigin-RevId: 270816324	2019-09-23 19:57:17 -07:00
Mahesh Ravishankar	69af468754	Make spirv::RuntimeArrayType part of spirv::CompositeType. According to SPIR-V spec, spirv::CompositeType includes spirv::RuntimeArrayType. This allows using objects of spirv::RuntimeArrayType with spirv::AccessChainOp. PiperOrigin-RevId: 270809492	2019-09-23 18:50:47 -07:00
Mahesh Ravishankar	75906bd565	Handle OpMemberName instruction in SPIR-V deserializer. Add support in deserializer for OpMemberName instruction. For now the name is just processed and not associated with the spirv::StructType being built. That needs an enhancement to spirv::StructTypes itself. Add tests to check for errors reported during deserialization with some refactoring to common out some utility functions. PiperOrigin-RevId: 270794524	2019-09-23 17:11:18 -07:00
Mahesh Ravishankar	98d1d3fc43	Simplify the way spirv::StructTypes are parsed. The existing logic to parse spirv::StructTypes is very brittle. This change simplifies the parsing logic a lot. The simplification also allows for memberdecorations to be separated by commas instead of spaces (which was an artifact of the existing parsing logic). The change also needs a modification to mlir::parseType to return the number of chars parsed. Adding a new parseType method to do so. Also allow specification of spirv::StructType with no members. PiperOrigin-RevId: 270739672	2019-09-23 12:53:06 -07:00
Christian Sigg	b8676da1fc	Outline GPU kernel function into a nested module. Roll forward of commit `5684a12`. When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module. PiperOrigin-RevId: 270639748	2019-09-23 03:17:01 -07:00
Jing Pu	54f4522a5c	Specialize f32->i8/u8 Quantization with C++ native arithmetic to optimize performance. The CL adds a rounding mode flag to the class and changes the default to rmNearestTiesToAway from rmNearestTiesToEven because 1) Tensorflow QuantizeV2 ops use rmNearestTiesToAway; 2) the specialization only implements rmNearestTiesToAway. PiperOrigin-RevId: 270600739	2019-09-22 22:07:51 -07:00
Denis Khalikov	2ec8e2be1f	[spirv] Add OpControlBarrier and OpMemoryBarrier. Add OpControlBarrier and OpMemoryBarrier (de)serialization. Closes tensorflow/mlir#130 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/130 from denis0x0D:sandbox/memory_barrier 2e3fff16bca44904dc1039592cb9a65d526faea8 PiperOrigin-RevId: 270457478	2019-09-21 10:18:34 -07:00
Christian Sigg	33a3a91ba2	Make GlobalOp's value attribute optional. Make GlobalOp's value attribute an OptionalAttr. Change code that uses the value to handle 'nullopt'. Translate an uninitialized value attribute to llvm::UndefValue. PiperOrigin-RevId: 270423646	2019-09-21 01:20:28 -07:00
Mahesh Ravishankar	9a4f5d2ee3	Allow specification of decorators on SPIR-V StructType members. Allow specification of decorators on SPIR-V StructType members. If the struct has layout information, these decorations are to be specified after the offset specification of the member. These decorations are emitted as OpMemberDecorate instructions on the struct <id>. Update (de)serialization to handle these decorations. PiperOrigin-RevId: 270130136	2019-09-19 14:50:05 -07:00
George Karpenkov	2df646bef6	Automated rollback of commit `5684a12434` PiperOrigin-RevId: 270126672	2019-09-19 14:34:30 -07:00
Feng Liu	c8961d408e	Quantize attribute values by per axis quantization parameters A new converter with per axis quantization parameters is added to quantize a dense elements attribute. For each slice along the quantization axis, it creates an uniform quantized value converter, with different scale and zero point, and quantizes the values in the slice. The current implementation doesn't handle sparse elements attributes. PiperOrigin-RevId: 270121986	2019-09-19 14:12:08 -07:00
MLIR Team	e79bfefb89	Add address space attribute to LLVMIR's GlobalOp. PiperOrigin-RevId: 270012505	2019-09-19 04:50:46 -07:00
MLIR Team	5684a12434	Outline GPU kernel function into a nested module. When outlining GPU kernels, put the kernel function inside a nested module. Then use a nested pipeline to generate the cubins, independently per kernel. In a final pass, move the cubins back to the parent module. PiperOrigin-RevId: 269987720	2019-09-19 01:51:28 -07:00
Mahesh Ravishankar	9330c1b9a1	Add (de)serialization support for OpRuntimeArray. Update the SPIR-V (de)serialization to handle RuntimeArrayType. PiperOrigin-RevId: 269667196	2019-09-17 15:21:57 -07:00
Lei Zhang	af45ca844f	Register a -test-spirv-roundtrip hook to mlir-translate This CL registers a new mlir-translate hook, -test-spirv-roundtrip, for testing SPIR-V serialization and deserialization round-trip. This CL also moves the existing -serialize-spirv and -deserialize-spirv hooks to one source file. PiperOrigin-RevId: 269659528	2019-09-17 14:48:24 -07:00
Mahesh Ravishankar	2d86ad79f0	Autogenerate (de)serialization for Extended Instruction Sets A generic mechanism for (de)serialization of extended instruction sets is added with this CL. To facilitate this, a new class "SPV_ExtendedInstSetOp" is added which is a base class for all operations corresponding to extended instruction sets. The methods to (de)serialize such ops as well as their dispatch are generated automatically. The behavior controlled by autogenSerialization and hasOpcode is also slightly modified to enable this. They are now decoupled. 1) Setting hasOpcode=1 means the operation has a corresponding opcode in SPIR-V binary format, and its dispatch for (de)serialization is automatically generated. 2) Setting autogenSerialization=1 generates the function for (de)serialization automatically. So now it is possible to have hasOpcode=0 and autogenSerialization=1 (for example SPV_ExtendedInstSetOp). Since the dispatch functions are also auto-generated, the input file needs to contain all operations. To this effect, SPIRVGLSLOps.td is included into SPIRVOps.td. This makes the previously added SPIRVGLSLOps.h and SPIRVGLSLOps.cpp unnecessary, and they are deleted. The SPIRVUtilsGen.cpp is also changed to make better use of formatv, making the code more readable. PiperOrigin-RevId: 269456263	2019-09-16 17:12:33 -07:00
Denis Khalikov	8a34d5d18c	[spirv] Add support for function calls. Add spv.FunctionCall operation and (de)serialization. Closes tensorflow/mlir#137 COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/137 from denis0x0D:sandbox/function_call_op e2e6f07d21e7f23e8b44c7df8a8ab784f3356ce4 PiperOrigin-RevId: 269437167	2019-09-16 15:39:54 -07:00
Lei Zhang	6934a337f0	[spirv] Add support for BitEnumAttr Certain enum classes in SPIR-V, like function/loop control and memory access, are bitmasks. This CL introduces a BitEnumAttr to properly model this and drive auto-generation of verification code and utility functions. We still store the attribute using an 32-bit IntegerAttr for minimal memory footprint and easy (de)serialization. But utility conversion functions are adjusted to inspect each bit and generate "\|"-concatenated strings for the bits; vice versa. Each such enum class has a "None" case that means no bit is set. We need special handling for "None". Because of this, the logic is not general anymore. So right now the definition is placed in the SPIR-V dialect. If later this turns out to be useful for other dialects, then we can see how to properly adjust it and move to OpBase.td. Added tests for SPV_MemoryAccess to check and demonstrate. PiperOrigin-RevId: 269350620	2019-09-16 09:23:22 -07:00
Mahesh Ravishankar	9814b3fa0d	Add mechanism to specify extended instruction sets in SPIR-V. Add support for specifying extended instructions sets. The operations in SPIR-V dialect are named as 'spv.<extension-name>.<op-name>'. Use this mechanism to define a 'Exp' operation from GLSL(450) instructions. Later CLs will add support for (de)serialization of these operations, and update the dialect generation scripts to auto-generate the specification using the spec directly. Additional changes: Add a Type Constraint to OpBase.td to check for vector of specified lengths. This is used to check that the vector type used in SPIR-V dialect are of lengths 2, 3 or 4. Update SPIRVBase.td to use this Type constraints for vectors. PiperOrigin-RevId: 269234377	2019-09-15 19:40:07 -07:00
Lei Zhang	113aadddf9	Update SPIR-V symbols and use GLSL450 instead of VulkanKHR SPIR-V recently published v1.5, which brings a bunch of symbols into core. So the suffix "KHR"/"EXT"/etc. is removed from the symbols. We use a script to pull information from the spec directly. Also changed conversion and tests to use GLSL450 instead of VulkanKHR memory model. GLSL450 is still the main memory model supported by Vulkan shaders and it does not require extra capability to enable. PiperOrigin-RevId: 268992661	2019-09-13 15:26:32 -07:00
Lei Zhang	a84bc68acc	[spirv] Add support for spv.loop (de)serialization This CL adds support for serializing and deserializing spv.loop ops. This adds support for spv.Branch and spv.BranchConditional op (de)serialization, too, because they are needed for spv.loop. PiperOrigin-RevId: 268536962	2019-09-11 14:02:59 -07:00
Lei Zhang	ee8cbccacf	Add folding rule for spv.CompositeExtract If the composite is a constant, we can fold it away. This only supports vector and array constants for now, given that struct constant is not supported in spv.constant yet. PiperOrigin-RevId: 268350340	2019-09-10 17:48:24 -07:00
Feng Liu	cf0a782339	Remove the constraint that min / max should stride zero Since we apply nudging for the zero point to make sure the nudged zero points can be in the range of [qmin, qmax], the constraint that rmin / rmax should stride zero isn't necessary. This also matches the documentation of tensorflow's FakeQuantWithMinMaxArgs op, where min and max don't need to stride zero: https://www.tensorflow.org/api_docs/python/tf/quantization/fake_quant_with_min_max_args PiperOrigin-RevId: 268296285	2019-09-10 13:26:46 -07:00
Feng Liu	c68d5467d6	Convert ConstFakeQuantPerAxis to qcast and dcast pair This is also to add the test to the fakeQuantAttrsToType for per-channel fake quant. PiperOrigin-RevId: 268260032	2019-09-10 10:50:57 -07:00
Feng Liu	f4ae4762bf	Add quant.const_fake_quant_per_axis op Compared to the existing quant.const_fake_quant op, the min and max attributes of this new op are for each channel of the last dimension of the input. PiperOrigin-RevId: 268093722	2019-09-09 15:42:37 -07:00
Stephan Herhut	318ff019cf	Addressing some late review comments on kernel inlining. Just formatting and better lit tests, no functional change. PiperOrigin-RevId: 267942907	2019-09-09 01:15:47 -07:00
Nicolas Vasilache	1b8eff8fcd	Simplify Linalg ABI integration with external function calls. View descriptors are converted to pointer to LLVM struct to avoid ABI issues related to C struct packing. This creates unnecessary complexity and hampers unification with memrefs. Instead, this CL makes view descriptors convert to LLVM struct (as it was originally) and promotes all structs to pointers right before calling an external function. PiperOrigin-RevId: 267602693	2019-09-06 08:31:19 -07:00
Lei Zhang	916eb980b0	[spirv] Add spv.loop SPIR-V can explicitly declare structured control-flow constructs using merge instructions. These explicitly declare a header block before the control flow diverges and a merge block where control flow subsequently converges. These blocks delimit constructs that must nest, and can only be entered and exited in structured ways. Instead of having a `spv.LoopMerge` op to directly model loop merge instruction for indicating the merge and continue target, we use regions to delimit the boundary of the loop: the merge target is the next op following the `spv.loop` op and the continue target is the block that has a back-edge pointing to the entry block inside the `spv.loop`'s region. This way it's easier to discover all blocks belonging to a construct and it plays nicer with the MLIR system. Updated the SPIR-V.md doc. PiperOrigin-RevId: 267431010	2019-09-05 12:45:53 -07:00
Stephan Herhut	7eb25cd367	Make GPU kernel outlining test independent of value names. PiperOrigin-RevId: 267323604	2019-09-05 01:46:26 -07:00
Alex Zinenko	c6f8adad8e	Move LLVMIR dialect tests from test/LLVMIR to test/Dialect and test/Conversion This follows up on the recent restructuring that moved the dialects under lib/Dialect and inter-dialect conversions to lib/Conversion. Originally, the tests for both the LLVMIR dialect itself and the conversion from Standard to LLVMIR dialect lived under test/LLVMIR. This no longer reflects the code structure. Move the tests to either test/Dialect/LLVMIR or test/Conversion/StandardToLLVM depending on the features they exercise. PiperOrigin-RevId: 267159219	2019-09-04 08:38:18 -07:00
Alex Zinenko	6395229509	Move Linalg dialect tests to test/Dialect/Linalg This was missing from the commit that moved the Linalg dialect to lib/Dialect. PiperOrigin-RevId: 267141176	2019-09-04 06:33:23 -07:00
Stephan Herhut	dfd06af562	Make GPU kernel outlining inline constants. It is generally beneficial to pass fewer arguments to a kernel, so cloning constants into the kernel is beneficial. PiperOrigin-RevId: 267139084	2019-09-04 06:16:07 -07:00
Lei Zhang	5593e005c6	Add folding rule and dialect materialization hook for spv.constant This will allow us to use MLIR's folding infrastructure to deduplicate SPIR-V constants. This CL also changed isValidSPIRVType in SPIRVDialect to a static method. PiperOrigin-RevId: 266984403	2019-09-03 12:09:58 -07:00
Mahesh Ravishankar	2acd0dbf05	Add Select operation to SPIR-V dialect. The SelectOp models the semantics of OpSelect from SPIR-V spec. PiperOrigin-RevId: 266849559	2019-09-02 21:07:18 -07:00
Lei Zhang	4f6c29223e	Add spv.Branch and spv.BranchConditional This CL just covers the op definition, its parsing, printing, and verification. (De)serialization is to be implemented in a subsequent CL. PiperOrigin-RevId: 266431077	2019-08-30 12:17:53 -07:00
Feng Liu	6de6c2c138	Add tests to verify 0.0 is quantized correctly We should consider both signed and narrow_range cases. PiperOrigin-RevId: 266167366	2019-08-29 10:09:22 -07:00
Stephan Herhut	e90542c03b	Add verification for dimension attribute on GPUDialect index operations. PiperOrigin-RevId: 266073204	2019-08-28 23:39:57 -07:00
Lei Zhang	0e131d83fe	[spirv] NFC: move SPIR-V control flow ops to a separate file This CL is also purely moving code around for better file organization. PiperOrigin-RevId: 265092566	2019-08-23 11:07:52 -07:00
Lei Zhang	21b77fc11f	[spirv] NFC: move arithmetic and logical ops to separate files This is purely moving code around for better file organization. PiperOrigin-RevId: 265082517	2019-08-23 10:26:45 -07:00
Lei Zhang	51cbf97b53	[spirv] Add support for extension (de)serialization Only a few important KHR extensions are registered to the SPIR-V dialect for now. PiperOrigin-RevId: 264939428	2019-08-22 16:01:35 -07:00
Lei Zhang	27ed82f99c	[spirv] Add support for capability (de)serialization This CL pulls in capabilities defined in the spec and adds support for (de)serialize capabilities of a spv.module. PiperOrigin-RevId: 264877413	2019-08-22 11:15:41 -07:00
Lei Zhang	1d10eb162c	Point to spv.AccessChain when reporting spv.AccessChain errors PiperOrigin-RevId: 264742130	2019-08-21 18:54:06 -07:00
Lei Zhang	748edce6b8	Remove the wrapping function in SPIR-V (de)serialization Previously Module and Function are builtin constructs in MLIR. Due to the structural requirements we must wrap the SPIR-V module inside a Function inside a Module. Now the requirement is lifted and we can remove the wrapping function! :) PiperOrigin-RevId: 264736051	2019-08-21 18:05:24 -07:00
Lei Zhang	8d18fdf2d3	[spirv] Support i1 as bool type PiperOrigin-RevId: 264612014	2019-08-21 08:17:50 -07:00
Lei Zhang	69cf811d5b	Materialize spv.constants at use sites In SPIR-V binary format, constants are placed at the module level and referenced by instructions inside functions using their result <id>s. To model this natively (using SSA values for result <id>s), it means we need to have implicit capturing functions. We will lose the ability to have function passes if going down that path. Instead, this CL changes to materialize constants at their use sites in deserialization. It's cheap to copy constants in MLIR given that attributes are uniqued to MLIRContext. By localizing constants into functions, we can preserve isolated functions. PiperOrigin-RevId: 264582532	2019-08-21 04:45:49 -07:00
Lei Zhang	f4934bcc3e	Add spv.specConstant and spv._reference_of Similar to global variables, specialization constants also live in the module scope and can be referenced by instructions in functions in native SPIR-V. A direct modelling would be to allow functions in the SPIR-V dialect to implicitly capture, but it means we are losing the ability to write passes for Functions. While in SPIR-V normally we want to process the module as a whole, it's not common to see multiple functions get used so we'd like to leave the door open for those cases. Therefore, similar to global variables, we introduce spv.specConstant to model three SPIR-V instructions: OpSpecConstantTrue, OpSpecConstantFalse, and OpSpecConstant. They do not return SSA value results; instead they have symbols and can only be referenced by the symbols. To use it in a function, we need to have another op spv._reference_of to turn the symbol into an SSA value. This breaks the tie and makes functions still explicit capture. Previously specialization constants were handled similarly as normal constants. That is incorrect given that a specialization constant actually acts more like a variable (without need to load and store). E.g., they cannot be de-duplicated like normal constants. This CL also refines various documents and comments. PiperOrigin-RevId: 264455172	2019-08-20 13:34:13 -07:00
Denis Khalikov	82cf6051ee	[spirv] Support (de)serialization of spv.struct Support (de)serialization of spv.struct with offset decorations. Closes tensorflow/mlir#94 PiperOrigin-RevId: 264421427	2019-08-20 11:03:42 -07:00
Mahesh Ravishankar	377bfb3a14	Fix parsing/printing of spv.globalVariable and spv._address_of Change the printing/parsing of spv.globalVariable to print the type of the variable after the ':' to be consistent with MLIR convention. The spv._address_of should print the variable type after the ':'. It was mistakenly printing the address of the return value. Add a (missing) test that should have caught that. Also move spv.globalVariable and spv._address_of tests to structure-ops.mlir. PiperOrigin-RevId: 264204686	2019-08-19 11:39:25 -07:00
Lei Zhang	64abcd983d	[spirv] Add spv.ReturnValue This CL adds the spv.ReturnValue op and its tests. Also adds a InFunctionScope trait to make sure that the op stays inside a function. To be consistent, ModuleOnly trait is changed to InModuleScope. PiperOrigin-RevId: 264193081	2019-08-19 10:58:10 -07:00
Mahesh Ravishankar	d745101339	Add spirv::GlobalVariableOp that allows module level definition of variables FuncOps in MLIR use explicit capture. So global variables defined in module scope need to have a symbol name and this should be used to refer to the variable within the function. This deviates from SPIR-V spec, which assigns an SSA value to variables at all scopes that can be used to refer to the variable, which requires SPIR-V functions to allow implicit capture. To handle this add a new op, spirv::GlobalVariableOp that can be used to define module scope variables. Since instructions need an SSA value, a new spirv::AddressOfOp is added to convert a symbol reference to an SSA value for use with other instructions. This also means the spirv::EntryPointOp instruction needs to change to allow initializers to be specified using symbol reference instead of SSA value. The current spirv::VariableOp which returns an SSA value (as defined by SPIR-V spec) can still be used to define function-scope variables. PiperOrigin-RevId: 263951109	2019-08-17 10:20:13 -07:00
Denis Khalikov	cf358017e6	[spirv] Extend spv.array with Layoutinfo Extend spv.array with Layoutinfo to support (de)serialization. Closes tensorflow/mlir#80 PiperOrigin-RevId: 263795304	2019-08-16 10:18:14 -07:00

683 Commits