llvm-project

Commit Graph

Author	SHA1	Message	Date
Tres Popp	9adc64539f	[mlir] Add std.powf to ROCDL lowering. Differential Revision: https://reviews.llvm.org/D93313	2020-12-15 18:47:49 +01:00
Tres Popp	f3e8f27ca1	[mlir] Fix GPUToNVVM test	2020-12-15 18:41:16 +01:00
Tres Popp	e04785b131	[mlir] Add NVVM lowering for std.pow Differential Revision: https://reviews.llvm.org/D93303	2020-12-15 18:28:23 +01:00
Javier Setoain	aece4e2793	[mlir][ArmSVE][RFC] Add an ArmSVE dialect This revision starts an Arm-specific ArmSVE dialect discussed in the discourse RFC thread: https://llvm.discourse.group/t/rfc-vector-dialects-neon-and-sve/2284 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D92172	2020-12-14 21:35:01 +00:00
Frederik Gossen	75d9a46090	[MLIR] Add atan and atan2 lowerings to CUDA intrinsics Differential Revision: https://reviews.llvm.org/D93124	2020-12-14 10:45:28 +01:00
Frederik Gossen	1c6bc2c0b5	[MLIR] Add lowerings for atan and atan2 to ROCDL intrinsics Differential Revision: https://reviews.llvm.org/D93123	2020-12-14 10:43:19 +01:00
Sean Silva	444822d77a	Revert "Revert "[mlir] Start splitting the `tensor` dialect out of `std`."" This reverts commit `0d48d265db`. This reapplies the following commit, with a fix for CAPI/ir.c: [mlir] Start splitting the `tensor` dialect out of `std`. This starts by moving `std.extract_element` to `tensor.extract` (this mirrors the naming of `vector.extract`). Curiously, `std.extract_element` supposedly works on vectors as well, and this patch removes that functionality. I would tend to do that in a separate patch, but I couldn't find any downstream users relying on this, and the fact that we have `vector.extract` made it seem safe enough to lump in here. This also sets up the `tensor` dialect as a dependency of the `std` dialect, as some ops that currently live in `std` depend on `tensor.extract` via their canonicalization patterns. Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2 Differential Revision: https://reviews.llvm.org/D92991	2020-12-11 14:30:50 -08:00
Sean Silva	0d48d265db	Revert "[mlir] Start splitting the `tensor` dialect out of `std`." This reverts commit `cab8dda90f`. I mistakenly thought that CAPI/ir.c failure was unrelated to this change. Need to debug it.	2020-12-11 14:15:41 -08:00
Sean Silva	cab8dda90f	[mlir] Start splitting the `tensor` dialect out of `std`. This starts by moving `std.extract_element` to `tensor.extract` (this mirrors the naming of `vector.extract`). Curiously, `std.extract_element` supposedly works on vectors as well, and this patch removes that functionality. I would tend to do that in a separate patch, but I couldn't find any downstream users relying on this, and the fact that we have `vector.extract` made it seem safe enough to lump in here. This also sets up the `tensor` dialect as a dependency of the `std` dialect, as some ops that currently live in `std` depend on `tensor.extract` via their canonicalization patterns. Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2 Differential Revision: https://reviews.llvm.org/D92991	2020-12-11 13:50:55 -08:00
Nicolas Vasilache	7310501f74	[mlir][ArmNeon][RFC] Add a Neon dialect This revision starts an Arm-specific ArmNeon dialect discussed in the [discourse RFC thread](https://llvm.discourse.group/t/rfc-vector-dialects-neon-and-sve/2284). Differential Revision: https://reviews.llvm.org/D92171	2020-12-11 13:49:40 +00:00
Adrian Kuegel	ada4c7a351	Add rsqrt lowering from standard to ROCDL. Add a lowering for rsqrt from standard dialect to ROCDL. Differential Revision: https://reviews.llvm.org/D93011	2020-12-11 13:18:57 +01:00
Adrian Kuegel	09f717b929	Add sqrt lowering from standard to ROCDL Add a lowering for sqrt from standard dialect to ROCDL. Differential Revision: https://reviews.llvm.org/D92921	2020-12-10 09:47:37 +01:00
Frederik Gossen	b4750f58d8	Add sqrt lowering from standard to NVVM Differential Revision: https://reviews.llvm.org/D92850	2020-12-08 17:08:27 +01:00
Frederik Gossen	bb7d43e7d5	Add rsqrt lowering from standard to NVVM Differential Revision: https://reviews.llvm.org/D92838	2020-12-08 14:33:58 +01:00
Aart Bik	c95acf052b	[mlir][vector][avx512] move avx512 lowering pass into general vector lowering A separate AVX512 lowering pass does not compose well with the regular vector lowering pass. As such, it is at risk of code duplication and lowering inconsistencies. This change removes the separate AVX512 lowering pass and makes it an "option" in the regular vector lowering pass (viz. vector dialect "augmented" with AVX512 dialect). Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D92614	2020-12-03 17:23:46 -08:00
Christian Sigg	5535696c38	[mlir] Add gpu.allocate, gpu.deallocate ops with LLVM lowering to runtime function calls. The ops are very similar to the std variants, but support async GPU execution. gpu.alloc does not currently support an alignment attribute, and the new ops do not have canonicalizers/folders like their std siblings do. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D91698	2020-11-27 09:40:59 +01:00
Alex Zinenko	119545f433	[mlir] Add conversion from SCF parallel loops to OpenMP Introduce a conversion pass from SCF parallel loops to OpenMP dialect constructs - parallel region and workshare loop. Loops with reductions are not supported because the OpenMP dialect cannot model them yet. The conversion currently targets only one level of parallelism, i.e. only one top-level `omp.parallel` operation is produced even if there are nested `scf.parallel` operations that could be mapped to `omp.wsloop`. Nested parallelism support is left for future work. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D91982	2020-11-24 21:12:56 +01:00
Alex Zinenko	f7d033f4d8	[mlir] Support WsLoopOp in OpenMP to LLVM dialect conversion It is a simple conversion that only requires changing the region argument types, so generalize it from ParallelOp. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D91989	2020-11-23 23:28:02 +01:00
Alex Zinenko	1ec60862d7	[mlir] Avoid cloning ops in SCF parallel conversion to CFG The existing implementation of the conversion from SCF Parallel operation to SCF "for" loops in order to further convert those loops to branch-based CFG has been cloning the loop and reduction body operations into the new loop because ConversionPatternRewriter was missing support for moving blocks while replacing their arguments. This functionality is now available, so use it to implement the conversion and avoid cloning operations, which may lead to doubling of the IR size during the conversion. In addition, this fixes an issue with converting nested SCF "if" conditionals present in "parallel" operations that would cause the conversion infrastructure to stop because of the repeated application of the pattern converting "newly" created "if"s (which were in fact just moved). Arguably, this should be fixed at the infrastructure level and this fix is a workaround. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D91955	2020-11-23 14:01:22 +01:00
Eugene Zhulenev	a86a9b5ef7	[mlir] Automatic reference counting for Async values + runtime support for ref counted objects Depends On D89963 Automatic reference counting algorithm outline: 1. `ReturnLike` operations forward the reference counted values without modifying the reference count. 2. Use liveness analysis to find blocks in the CFG where the lifetime of reference counted values ends, and insert `drop_ref` operations after the last use of the value. 3. Insert `add_ref` before the `async.execute` operation capturing the value, and pairing `drop_ref` before the async body region terminator, to release the captured reference counted value when execution completes. 4. If the reference counted value is passed only to some of the block successors, insert `drop_ref` operations in the beginning of the blocks that do not have reference counted value uses. Reviewed By: silvas Differential Revision: https://reviews.llvm.org/D90716	2020-11-20 03:08:44 -08:00
Alex Zinenko	9bb5bff570	[mlir] Add an assertion on creating an Operation with null result types Null types are commonly used as an error marker. Catch them in the constructor of Operation if they are present in the result type list, as otherwise this could lead to further surprising behavior when querying op result types. Fix AsyncToLLVM and StandardToLLVM that were using null types when constructing operations. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D91770	2020-11-19 22:28:38 +01:00
ergawy	2f3adc54b5	[MLIR][SPIRV] Rename `spv._module_end` to `spv.mlir.endmodule` This commit does the renaming mentioned in the title in order to bring 'spv' dialect closer to the MLIR naming conventions. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D91792	2020-11-19 13:25:13 -05:00
ergawy	9bd50abc4c	[MLIR][SPIRV] Rename `spv._merge` to `spv.mlir.merge` This commit does the renaming mentioned in the title in order to bring 'spv' dialect closer to the MLIR naming conventions. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D91797	2020-11-19 10:04:35 -05:00
Christian Sigg	8b97e17d16	[mlir] Simplify code generated by ConvertToLLVMPattern::getStridedElementPtr(). Make the interface match the one of ConvertToLLVMPattern::getDataPtr() (to be removed in a separate change). Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D91599	2020-11-18 11:52:09 +01:00
Christian Sigg	bedaad4495	[mlir] Simplify std.alloc lowering to LLVM. std.alloc only supports memrefs with identity layout, which means we can simplify the lowering to LLVM and compute strides only from (static and dynamic) sizes. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D91549	2020-11-17 18:55:34 +01:00
ergawy	9793edd5bf	[MLIR][SPIRV] Rename `spv._address_of` to `spv.mlir.addressof` This commit does the renaming mentioned in the title in order to bring `spv` dialect closer to the MLIR naming conventions. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D91609	2020-11-17 12:12:27 -05:00
Rahul Joshi	b7382ed3fe	[MLIR] Extend Symbol verification to reject public symbol declarations. - Extend the Symbol interface with `isDeclaration` to identify operations that declare a symbol as opposed to define it. - Extend verification to disallow public declarations as per the discussion in https://llvm.discourse.group/t/rfc-symbol-definition-declaration-x-visibility-checks/2140 - Adopt the new interface for `FuncOp` and fix test and code to not have/create public function declarations. Differential Revision: https://reviews.llvm.org/D91456	2020-11-16 16:05:32 -08:00
Christian Sigg	04481f26fa	[mlir] Require std.alloc() ops to have canonical layout during LLVM lowering. The current code allows strided layouts, but the number of elements allocated is ambiguous. It could be either the number of elements in the shape (the current implementation), or the amount of elements required to not index out-of-bounds with the given maps (which would require evaluating the layout map). If we require the canonical layouts, the two will be the same. Reviewed By: nicolasvasilache, ftynse Differential Revision: https://reviews.llvm.org/D91523	2020-11-16 17:29:36 +01:00
Hanhan Wang	47fd19f22e	[mlir][StandardToSPIRV] Extend support for lowering cmpi to SPIRV. The logic for vectors of booleans was missing. This patch adds the logic and a test for it. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D91403	2020-11-16 06:51:05 -08:00
Eugene Zhulenev	c30ab6c2a3	[mlir] Transform scf.parallel to scf.for + async.execute Depends On D89958 1. Adds `async.group`/`async.awaitall` to group together multiple async tokens/values 2. Rewrite scf.parallel operation into multiple concurrent async.execute operations over non-overlapping subranges of the original loop. Example: ``` scf.parallel (%i, %j) = (%lbi, %lbj) to (%ubi, %ubj) step (%si, %sj) { "do_some_compute"(%i, %j): () -> () } ``` Converted to: ``` %c0 = constant 0 : index %c1 = constant 1 : index // Compute blocks sizes for each induction variable. %num_blocks_i = ... : index %num_blocks_j = ... : index %block_size_i = ... : index %block_size_j = ... : index // Create an async group to track async execute ops. %group = async.create_group scf.for %bi = %c0 to %num_blocks_i step %c1 { %block_start_i = ... : index %block_end_i = ... : index scf.for %bj = %c0 to %num_blocks_j step %c1 { %block_start_j = ... : index %block_end_j = ... : index // Execute the body of original parallel operation for the current // block. %token = async.execute { scf.for %i = %block_start_i to %block_end_i step %si { scf.for %j = %block_start_j to %block_end_j step %sj { "do_some_compute"(%i, %j): () -> () } } } // Add produced async token to the group. async.add_to_group %token, %group } } // Await completion of all async.execute operations. async.await_all %group ``` In this example, the outer loop launches inner block level loops as separate async execute operations which will be executed concurrently. At the end it waits for the completion of all async execute operations. Reviewed By: ftynse, mehdi_amini Differential Revision: https://reviews.llvm.org/D89963	2020-11-13 04:02:56 -08:00
Stephan Herhut	5da2423bc0	[mlir][gpu] Only transform mapped parallel loops to GPU. This exposes a hook to configure legality of operations such that only `scf.parallel` operations that have mapping attributes are marked as illegal. Consequently, the transformation can now also be applied to mixed forms. Differential Revision: https://reviews.llvm.org/D91340	2020-11-13 09:15:17 +01:00
George Mitenkov	de3ad5bb09	[MLIR][SPIRVToLLVM] Enhanced conversion for execution mode This patch introduces a new conversion pattern for `spv.ExecutionMode`. `spv.ExecutionMode` may contain important information about the entry point, which we want to preserve. For example, `LocalSize` provides information about the work-group size that can be reused. Hence, the pattern for entry-point ops changes to the following: - `spv.EntryPoint` is still simply removed - Info from `spv.ExecutionMode` is used to create a global struct variable, which looks like: ``` struct { int32_t executionMode; int32_t values[]; // optional values }; ``` Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D89989	2020-11-10 18:33:54 +03:00
Artur Bialas	3035e676a3	[mlir][spirv] Add VectorInsertDynamicOp and vector.insertelement lowering VectorInsertDynamicOp in SPIRV dialect conversion from vector.insertelement to spirv VectorInsertDynamicOp Differential Revision: https://reviews.llvm.org/D90927	2020-11-10 09:49:12 +01:00
River Riddle	ebcc022507	[mlir][AsmPrinter] Refactor printing to only print aliases for attributes/types that will exist in the output. This revision refactors the way that attributes/types are considered when generating aliases. Instead of considering all of the attributes/types of every operation, we perform a "fake" print step that prints the operations using a dummy printer to collect the attributes and types that would actually be printed during the real process. This removes a lot of attributes/types from consideration that generally won't end up in the final output, e.g. affine map attributes in an `affine.apply`/`affine.for`. This resolves a long standing TODO w.r.t aliases, and helps to have a much cleaner textual output format. As a datapoint to the latter, as part of this change several tests were identified as testing for the presence of attributes aliases that weren't actually referenced by the custom form of any operation. To ensure that this wouldn't cause a large degradation in compile time due to the second full print, I benchmarked this change on a very large module with a lot of operations (the file is ~673M/~4.7 million lines long). This file before this change takes ~6.9 seconds to print in the custom form, and ~7 seconds after this change. In the custom assembly case, this added an average of a little over ~100 milliseconds to the compile time. This increase was due to the way that argument attributes on functions are structured and how they get printed; i.e. with a better representation the negative impact here can be greatly decreased. When printing in the generic form, this revision had no observable impact on the compile time. This benchmarking leads me to believe that the impact of this change on compile time w.r.t printing is closely related to `print` methods that perform a lot of additional/complex processing outside of the OpAsmPrinter. Differential Revision: https://reviews.llvm.org/D90512	2020-11-09 21:54:47 -08:00
Rahul Joshi	8b5a3e4632	[MLIR] Change FuncOp assembly syntax to print visibility inline instead of in attrib dict. - Change syntax for FuncOp to be `func <visibility>? @name` instead of printing the visibility in the attribute dictionary. - Since printFunctionLikeOp() and parseFunctionLikeOp() are also used by other operations, make the "inline visibility" an opt-in feature. - Updated unit test to use and check the new syntax. Differential Revision: https://reviews.llvm.org/D90859	2020-11-09 11:08:08 -08:00
Rahul Joshi	a97e357e8e	[MLIR] Support `global_memref` and `get_global_memref` in standard -> LLVM conversion. - Convert `global_memref` to LLVM::GlobalOp. - Convert `get_global_memref` to a memref descriptor with a pointer to the first element of the global stashed in it. - Extend unit test and a mlir-cpu-runner test to validate the generated LLVM IR. Differential Revision: https://reviews.llvm.org/D90803	2020-11-09 10:54:21 -08:00
George Mitenkov	89eed79c1f	[MLIR][SPIRVToLLVM] Added module name conversion Since SPIR-V module has an optional name, this patch makes a change to pass it to `ModuleOp` during conversion. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D90904	2020-11-07 12:27:44 +03:00
Artur Bialas	f9dca1039a	[mlir][spirv] Add VectorExtractDynamicOp and vector.extractelement lowering VectorExtractDynamicOp in SPIRV dialect conversion from vector.extractelement to spirv VectorExtractDynamicOp Differential Revision: https://reviews.llvm.org/D90679	2020-11-05 08:26:54 +01:00
Alex Zinenko	8475fa6ed6	[mlir] Add a simpler lowering pattern for WhileOp representing a do-while loop When the "after" region of a WhileOp is merely forwarding its arguments back to the "before" region, i.e. WhileOp is a canonical do-while loop, a simpler CFG subgraph that omits the "after" region with its extra branch operation can be produced. Loop rotation from general "while" to "if { do-while }" is left for a future canonicalization pattern when it becomes necessary. Differential Revision: https://reviews.llvm.org/D90604	2020-11-04 09:43:13 +01:00
Alex Zinenko	4c0e255c98	[mlir] Add lowering to CFG for WhileOp The lowering is a straightforward inlining of the "before" and "after" regions connected by (conditional) branches. This plugs the WhileOp into the progressive lowering scheme. Future commits may choose to target WhileOp instead of CFG when lowering ForOp. Differential Revision: https://reviews.llvm.org/D90603	2020-11-04 09:43:13 +01:00
Alexander Belyaev	9925168576	[mlir] Convert `memref_reshape` to LLVM. https://llvm.discourse.group/t/rfc-standard-memref-cast-ops/1454/15 Differential Revision: https://reviews.llvm.org/D90377	2020-11-03 11:39:08 +01:00
Tres Popp	d05d42199f	[mlir] Add partial lowering of shape.cstr_broadcastable. Because cstr operations allow more instruction reordering than asserts, we only lower cstr_broadcastable to std ops with cstr_require. This ensures that the more drastic lowering to asserts can happen specifically with the user's desire. Differential Revision: https://reviews.llvm.org/D89325	2020-11-03 09:57:23 +01:00
Eugene Zhulenev	f507aa17b7	[mlir] Implement lowering to LLVM of async.execute ops with token dependencies Add support for lowering `async.execute` operations with token dependencies Example: ``` %dep = ... : !async.token %token = async.execute[%dep] { ... } ``` Token dependencies lowered to `async.await` operations inside the outline coroutine body. Reviewed By: herhut, mehdi_amini, ftynse Differential Revision: https://reviews.llvm.org/D89958	2020-10-30 05:59:03 -07:00
Tres Popp	511484f27d	[mlir] Add lowering for IsBroadcastable to Std dialect. Differential Revision: https://reviews.llvm.org/D90407	2020-10-30 10:44:27 +01:00
Christian Sigg	b22f111023	[mlir][gpu] NFC: Change gpu.launch_func ops to custom format. This should fix the reason for the failures after `ec7780ebda`. I will roll forward in a separate change. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D90410	2020-10-29 21:21:30 +01:00
Christian Sigg	97b351a827	[mlir][gpu] Fix leaked stream and module when lowering gpu.launch_func to runtime calls. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D90370	2020-10-29 08:40:51 +01:00
Qingyi Liu	1ec893c574	MLIR: add SinOp Lowering to __nv_sinf and __nv_sin Added lowering rule from `SinOp` to `__nv_sinf` and `__nv_sin` Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D90147	2020-10-28 14:15:26 +01:00
River Riddle	8a1ca2cd34	[mlir] Add a conversion pass between PDL and the PDL Interpreter Dialect The conversion between PDL and the interpreter is split into several different parts. ** The Matcher: The matching section of all incoming pdl.pattern operations is converted into a predicate tree and merged. Each pattern is first converted into an ordered list of predicates starting from the root operation. A predicate is composed of three distinct parts: * Position - A position refers to a specific location on the input DAG, i.e. an existing MLIR entity being matched. These can be attributes, operands, operations, results, and types. Each position also defines a relation to its parent. For example, the operand `[0] -> 1` has a parent operation position `[0]` (the root). * Question - A question refers to a query on a specific positional value. For example, an operation name question checks the name of an operation position. * Answer - An answer is the expected result of a question. For example, when matching an operation with the name "foo.op", the question would be an operation name question, with an expected answer of "foo.op". After the predicate lists have been created and ordered (based on occurrence of common predicates and other factors), they are formed into a tree of nodes that represent the branching flow of a pattern match. This structure allows for efficient construction and merging of the input patterns. There are currently only 4 simple nodes in the tree: * ExitNode: Represents the termination of a match * SuccessNode: Represents a successful match of a specific pattern * BoolNode/SwitchNode: Branch to a specific child node based on the expected answer to a predicate question. Once the matcher tree has been generated, this tree is walked to generate the corresponding interpreter operations. ** The Rewriter: The rewriter portion of a pattern is generated in a very straightforward manner, similarly to lowerings in other dialects. Each PDL operation that may exist within a rewrite has a mapping into the interpreter dialect. The code for the rewriter is generated within a FuncOp, that is invoked by the interpreter on a successful pattern match. Referenced values defined in the matcher become inputs to the generated rewriter function. An example lowering is shown below: ```mlir // The following high level PDL pattern: pdl.pattern : benefit(1) { %resultType = pdl.type %inputOperand = pdl.input %root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType pdl.rewrite %root { pdl.replace %root with (%inputOperand) } } // is lowered to the following: module { // The matcher function takes the root operation as an input. func @matcher(%arg0: !pdl.operation) { pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1 ^bb1: pdl_interp.return ^bb2: pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1 ^bb3: pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1 ^bb4: %0 = pdl_interp.get_operand 0 of %arg0 pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1 ^bb5: %1 = pdl_interp.get_result 0 of %arg0 pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1 ^bb6: // This operation corresponds to a successful pattern match. pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1 } module @rewriters { // The inputs to the rewriter from the matcher are passed as arguments. func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) { pdl_interp.replace %arg1 with(%arg0) pdl_interp.return } } } ``` Differential Revision: https://reviews.llvm.org/D84580	2020-10-26 18:01:06 -07:00
Alexander Belyaev	d6ab0474c6	[mlir] Convert MemRefReinterpretCastOp to LLVM. https://llvm.discourse.group/t/rfc-standard-memref-cast-ops/1454/15 Differential Revision: https://reviews.llvm.org/D90033	2020-10-26 20:13:17 +01:00
George Mitenkov	cae4067ec1	[MLIR][mlir-spirv-cpu-runner] A pass to emulate a call to kernel in LLVM This patch introduces a pass for running `mlir-spirv-cpu-runner` - LowerHostCodeToLLVMPass. This pass emulates `gpu.launch_func` call in LLVM dialect and lowers the host module code to LLVM. It removes the `gpu.module`, creates a sequence of global variables that are later linked to the variables in the kernel module, as well as a series of copies to/from them to emulate the memory transfer to/from the host or to/from the device sides. It also converts the remaining Standard dialect into LLVM dialect, emitting C wrappers. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D86112	2020-10-26 08:11:04 -04:00
Lei Zhang	36ce915ac5	Revert "Revert "[mlir] Convert from Async dialect to LLVM coroutines"" This reverts commit `4986d5eaff` with proper patches to CMakeLists.txt: - Add MLIRAsync as a dependency to MLIRAsyncToLLVM - Add Coroutines as a dependency to MLIRExecutionEngine	2020-10-22 15:23:11 -04:00
Mehdi Amini	4986d5eaff	Revert "[mlir] Convert from Async dialect to LLVM coroutines" This reverts commit `a8b0ae3bdd` and commit `f8fcff5a9d`. The build with SHARED_LIBRARY=ON is broken.	2020-10-22 19:12:19 +00:00
Christian Sigg	9ab5362bab	[mlir][gpu] NFC: switch occurrences of gpu.launch_func to custom format. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D89929	2020-10-22 17:27:19 +02:00
Eugene Zhulenev	f8fcff5a9d	[mlir] Convert from Async dialect to LLVM coroutines Lower from Async dialect to LLVM by converting async regions attached to `async.execute` operations into LLVM coroutines (https://llvm.org/docs/Coroutines.html): 1. Outline all async regions to functions 2. Add LLVM coro intrinsics to mark coroutine begin/end 3. Use MLIR conversion framework to convert all remaining async types and ops to LLVM + Async runtime function calls All `async.await` operations inside async regions converted to coroutine suspension points. Await operation outside of a coroutine converted to the blocking wait operations. Implement simple runtime to support concurrent execution of coroutines. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D89292	2020-10-22 06:30:46 -07:00
Thomas Raoux	ac2cf07195	[spirv] Fix legalize standard to spir-v for transfer ops Forward missing attributes when creating the new transfer op otherwise the builder would use default values. Differential Revision: https://reviews.llvm.org/D89907	2020-10-21 13:56:01 -07:00
Christian Sigg	3ac561d8c3	[mlir][gpu] Add lowering to LLVM for `gpu.wait` and `gpu.wait async`. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D89686	2020-10-21 18:20:42 +02:00
Tres Popp	72d5ac90b9	[mlir] Use affine dim instead of symbol in SCFToGPU lowering. This still satisfies the constraints required by the affine dialect and gives more flexibility in what iteration bounds can be used when lowering to the GPU dialect. Differential Revision: https://reviews.llvm.org/D89782	2020-10-20 11:56:34 +02:00
Sean Silva	57211fd239	[mlir] Use dynamic_tensor_from_elements in shape.broadcast conversion Now, convert-shape-to-std doesn't internally create memrefs, which was previously a bit of a layering violation. The conversion to memrefs should logically happen as part of bufferization. Differential Revision: https://reviews.llvm.org/D89669	2020-10-19 15:51:46 -07:00
ergawy	bddaa7a848	[MLIR][SPIRV] Support identified and recursive structs. This PR adds support for identified and recursive structs. This includes: parsing, printing, serializing, and deserializing such structs. The following C struct: ```C struct A { A* next; }; ``` which is translated to the following MLIR code as: ```mlir !spv.struct<A, (!spv.ptr<!spv.struct<A>, Generic>)> ``` would be represented in the SPIR-V module as: ```spirv OpName %A "A" OpTypeForwardPointer %APtr Generic %A = OpTypeStruct %APtr %APtr = OpTypePointer Generic %A ``` In particular the following changes are included: - SPIR-V structs can now be either identified or literal (i.e. non-identified). - All structs now have their members surrounded by a ()-pair. - For recursive references, (1) an OpTypeForwardPointer instruction is emitted before the OpTypeStruct instruction defining the recursive struct (2) an OpTypePointer instruction is emitted after the OpTypeStruct instruction which actually defines the recursive pointer to struct type. Reviewed By: antiagainst, rriddle, ftynse Differential Revision: https://reviews.llvm.org/D87206	2020-10-13 10:18:21 -04:00
Tres Popp	8178e41dc1	[mlir] Type erase inputs to select statements in shape.broadcast lowering. This is required or broadcasting with operands of different ranks will lead to failures as the select op requires both possible outputs and its output type to be the same. Differential Revision: https://reviews.llvm.org/D89134	2020-10-11 21:58:06 +02:00
Amara Emerson	322d0afd87	[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787	2020-10-07 10:36:44 -07:00
Thomas Raoux	6e557bc405	[mlir][spirv] Add Vector to SPIR-V conversion pass Add conversion pass for Vector dialect to SPIR-V dialect and add some simple conversion pattern for vector.broadcast, vector.insert, vector.extract. Differential Revision: https://reviews.llvm.org/D88761	2020-10-06 11:53:23 -07:00
George Mitenkov	b81bedf714	[MLIR][SPIRVToLLVM] Conversion for composite extract and insert A pattern to convert `spv.CompositeInsert` and `spv.CompositeExtract`. In LLVM, there are 2 ops that correspond to each instruction depending on the container type. If the container type is a vector type, then the result of conversion is `llvm.insertelement` or `llvm.extractelement`. If the container type is an aggregate type (i.e. struct, array), the result of conversion is `llvm.insertvalue` or `llvm.extractvalue`. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D88205	2020-10-06 11:46:25 +03:00
Christian Sigg	665371d0b2	[mlir] Split alloc-like op LLVM lowerings into base and separate derived classes. The previous code did the lowering to alloca, malloc, and aligned_malloc in a single class with different code paths that are somewhat difficult to follow. This change moves the common code to a base class and has a separate derived class per lowering target that contains the specifics. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D88696	2020-10-05 17:36:01 +02:00
Benjamin Kramer	6e2b267d1c	Promote transpose from linalg to standard dialect While affine maps are part of the builtin memref type, there is very limited support for manipulating them in the standard dialect. Add transpose to the set of ops to complement the existing view/subview ops. This is a metadata transformation that encodes the transpose into the strides of a memref. I'm planning to use this when lowering operations on strided memrefs, using the transpose to remove the stride without adding a dependency on linalg dialect. Differential Revision: https://reviews.llvm.org/D88651	2020-10-05 10:58:20 +02:00
Diego Caballero	a611f9a5c6	[mlir] Fix call op conversion in bare-ptr calling convention We hit an llvm_unreachable related to unranked memrefs for call ops with scalar types. Removing the llvm_unreachable since the conversion should gracefully bail out in the presence of unranked memrefs. Adding tests to verify that. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D88709	2020-10-02 08:48:21 -07:00
Jakub Lichman	0b17d4754a	[mlir][Linalg] Tile sizes for Conv ops vectorization added as pass arguments Current setup for conv op vectorization does not enable user to specify tile sizes as well as dimensions for vectorization. In this commit we change that by adding tile sizes as pass arguments. Every dimension with corresponding tile size > 1 is automatically vectorized. Differential Revision: https://reviews.llvm.org/D88533	2020-09-30 11:31:28 +00:00
Diego Caballero	a89fc12653	[mlir] Support return and call ops in bare-ptr calling convention This patch adds support for the 'return' and 'call' ops to the bare-ptr calling convention. These changes also align the bare-ptr calling convention code with the latest changes in the default calling convention and reduce the amount of customization code needed. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D87724	2020-09-29 12:00:47 -07:00
Sean Silva	a975be0e00	[mlir][shape] Make conversion passes more consistent. - use select-ops to make the lowering simpler - change style of FileCheck variables names to be consistent - change some variable names in the code to be more explicit Differential Revision: https://reviews.llvm.org/D88258	2020-09-28 14:55:42 -07:00
Aart Bik	54759cefdb	[mlir] [VectorOps] changes to printing support for integers (1) simplify integer printing logic by always using 64-bit print (2) add index support (since vector<16xindex> is planned to be added) (3) adjust naming convention print_x -> printX Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D88436	2020-09-28 11:43:31 -07:00
Aart Bik	b8880f5f97	[mlir] [VectorOps] generalize printing support for integers This generalizes printing beyond just i1,i32,i64 and also accounts for signed and unsigned interpretation in the output. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D88290	2020-09-25 04:52:21 -07:00
Artur Bialas	396e7f4548	[mlir][SCFToGPU] LaunchOp propagate optional attributes Allow propagating optional user defined attributes during SCF to GPU conversion. Gives opportunity to use user defined attributes in the further lowering. For example setting subgroup size, or other options for GPU dispatch. This does not break backward compatibility and does not require new attributes, just allow passing optional ones. Differential Revision: https://reviews.llvm.org/D88203	2020-09-25 09:21:16 +02:00
Sean Silva	9ed1e5873c	[mlir][shape] Start a pass that lowers shape constraints. This pass converts shape.cstr_* ops to eager (side-effecting) error-handling code. After that conversion is done, the witnesses are trivially satisfied and are replaced with `shape.const_witness true`. Differential Revision: https://reviews.llvm.org/D87941	2020-09-24 12:25:30 -07:00
Alexander Belyaev	56ffb8d169	[mlir] Stop allowing LLVMType Int arguments for GPULaunchFuncOp. Conversion to LLVM becomes confusing and incorrect if someone tries to lower STD -> LLVM and only then GPULaunchFuncOp to LLVM separately. Although it is technically allowed now, it works incorrectly because of the argument promotion. The correct way to use this conversion pattern is to add to the STD->LLVM patterns before running the pass. Differential Revision: https://reviews.llvm.org/D88147	2020-09-24 11:16:23 +02:00
Nicolas Vasilache	ed229132f1	[mlir][Linalg] Uniformize linalg.generic with named ops. This revision allows representing a reduction at the level of linalg on tensors for generic ops by uniformizing with the named ops approach.	2020-09-22 04:13:22 -04:00
Benjamin Kramer	2d76274b99	[mlir][VectorOps] Loosen restrictions on vector.reduction types LLVM can deal with any integer or float type, don't arbitrarily restrict it to f32/f64/i32/i64. Differential Revision: https://reviews.llvm.org/D88010	2020-09-21 12:45:23 +02:00
Hanhan Wang	1909b6ac0d	[mlir][StandardToSPIRV] Handle vector of i1 case for lowering zexti to SPIR-V. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D87887	2020-09-18 07:07:22 -07:00
Nicolas Vasilache	93fd30bac3	[mlir][Linalg] Evolve named ops to use assembly form and support linalg on tensors. This revision allows representing a reduction at the level of linalg on tensors for named ops. When a structured op has a reduction and returns tensor(s), new conventions are added and documented. As an illustration, the syntax for a `linalg.matmul` writing into a buffer is: ``` linalg.matmul ins(%a, %b : memref<?x?xf32>, tensor<?x?xf32>) outs(%c : memref<?x?xf32>) ``` , whereas the syntax for a `linalg.matmul` returning a new tensor is: ``` %d = linalg.matmul ins(%a, %b : tensor<?x?xf32>, memref<?x?xf32>) init(%c : memref<?x?xf32>) -> tensor<?x?xf32> ``` Other parts of linalg will be extended accordingly to allow mixed buffer/tensor semantics in the presence of reductions.	2020-09-18 06:14:30 -04:00
Jakub Lichman	347d59b16c	[mlir][Linalg] Convolution tiling added to ConvOp vectorization pass ConvOp vectorization supports now only convolutions of static shapes with dimensions of size either 3(vectorized) or 1(not) as underlying vectors have to be of static shape as well. In this commit we add support for convolutions of any size as well as dynamic shapes by leveraging existing matmul infrastructure for tiling of both input and kernel to sizes accepted by the previous version of ConvOp vectorization. In the future this pass can be extended to take "tiling mask" as a user input which will enable vectorization of user specified dimensions. Differential Revision: https://reviews.llvm.org/D87676	2020-09-17 09:39:41 +00:00
Alex Zinenko	967c7b6936	[mlir] check for failures when packing function signatures in std->llvm conversion When packing function results into a structure during the standard-to-llvm dialect conversion, do not assume the conversion was successful and propagate nullptr as error state. Fixes PR45184. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D87605	2020-09-15 12:30:44 +02:00
Lubomir Litchev	ef7a255c03	Add support for casting elements in vectors for certain Std dialect type conversion operations. Added support to the Std dialect cast operations to do casts in vector types when feasible. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D87410	2020-09-14 07:45:46 -07:00
Alex Zinenko	5cac85c931	[mlir] Check for type conversion success in std->llvm function conversion Type converter may fail and return nullptr on unconvertible types. The function conversion did not include a check and was attempting to use a nullptr type to construct an LLVM function, leading to a crash. Add a check and return early. The rest of the call stack propagates errors properly. Fixes PR47403. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D87075	2020-09-14 13:16:42 +02:00
Sean Silva	84a6da67e6	[mlir] Fix some edge cases around 0-element TensorFromElementsOp This introduces a builder for the more general case that supports zero elements (where the element type can't be inferred from the ValueRange, since it might be empty). Also, fix up some cases in ShapeToStandard lowering that hit this. It happens very easily when dealing with shapes of 0-D tensors. The SameOperandsAndResultElementType is redundant with the new TypesMatchWith and prevented having zero elements. Differential Revision: https://reviews.llvm.org/D87492	2020-09-11 10:58:35 -07:00
aartbik	3c42c0dcf6	[mlir] [VectorOps] Enable 32-bit index optimizations Rationale: After some discussion we decided that it is safe to assume 32-bit indices for all subscripting in the vector dialect (it is unlikely the dialect will be used; or even work; for such long vectors). So rather than detecting specific situations that can exploit 32-bit indices with higher parallel SIMD, we just optimize it by default, and let users that don't want it opt-out. Reviewed By: nicolasvasilache, bkramer Differential Revision: https://reviews.llvm.org/D87404	2020-09-10 00:26:27 -07:00
Frederik Gossen	5106a8b8f8	[MLIR][Shape] Lower `shape_of` to `dynamic_tensor_from_elements` Take advantage of the new `dynamic_tensor_from_elements` operation in `std`. Instead of stack-allocated memory, we can now lower directly to a single `std` operation. Differential Revision: https://reviews.llvm.org/D86935	2020-09-09 07:55:13 +00:00
Frederik Gossen	133322d2e3	[MLIR][Standard] Update `tensor_from_elements` assembly format Remove the redundant parentheses that are used for none of the other operation formats. Differential Revision: https://reviews.llvm.org/D86287	2020-09-09 07:45:46 +00:00
Benjamin Kramer	239eff502b	[mlir][VectorOps] Redo the scalar loop emission in VectorToSCF to pad instead of clipping This replaces the select chain for edge-padding with an scf.if that performs the memory operation when the index is in bounds and uses the pad value when it's not. For transfer_write the same mechanism is used, skipping the store when the index is out of bounds. The integration test has a bunch of cases of how I believe this should work. Differential Revision: https://reviews.llvm.org/D87241	2020-09-08 11:15:25 +02:00
Jakub Lichman	67b37f571c	[mlir] Conv ops vectorization pass In this commit a new way of convolution ops lowering is introduced. The conv op vectorization pass lowers linalg convolution ops into vector contractions. This lowering is possible when conv op is first tiled by 1 along specific dimensions which transforms it into dot product between input and kernel subview memory buffers. This pass converts such conv op into vector contraction and does all necessary vector transfers that make it work. Differential Revision: https://reviews.llvm.org/D86619	2020-09-08 08:47:42 +00:00
Nicolas Vasilache	9be6178449	[mlir][Vector] Make VectorToSCF deterministic Differential Revision: https://reviews.llvm.org/D87273	2020-09-08 04:18:22 -04:00
Frederik Gossen	a70f2eb3e3	[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings. Merge the two lowering passes because they are not useful by themselves. The new pass lowers to `std` and `scf` is considered an auxiliary dialect. See also https://llvm.discourse.group/t/conversions-with-multiple-target-dialects/1541/12 Differential Revision: https://reviews.llvm.org/D86779	2020-09-07 14:39:37 +00:00
David Truby	973800dc7c	Revert "[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings." This reverts commit `15acdd7543`.	2020-09-07 13:37:32 +01:00
Nicolas Vasilache	1c849ec40a	[MLIR] Fix Win test due to partial order of CHECK directives Differential Revision: https://reviews.llvm.org/D87230	2020-09-07 08:14:35 -04:00
Frederik Gossen	15acdd7543	[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings. Merge the two lowering passes because they are not useful by themselves. The new pass lowers to `std` and `scf` is considered an auxiliary dialect. See also https://llvm.discourse.group/t/conversions-with-multiple-target-dialects/1541/12 Differential Revision: https://reviews.llvm.org/D86779	2020-09-07 12:12:36 +00:00
Nicolas Vasilache	8d64df9f13	[mlir][Vector] Revisit VectorToSCF. Vector to SCF conversion still had issues due to the interaction with the natural alignment derived by the LLVM data layout. One traditional workaround is to allocate aligned. However, this does not always work for vector sizes that are non-powers of 2. This revision implements a more portable mechanism where the intermediate allocation is always a memref of elemental vector type. AllocOp is extended to use the natural LLVM DataLayout alignment for non-scalar types, when the alignment is not specified in the first place. An integration test is added that exercises the transfer to scf.for + scalar lowering with a 5x5 transposition. Differential Revision: https://reviews.llvm.org/D87150	2020-09-07 05:19:43 -04:00
aartbik	060c9dd1cc	[mlir] [VectorOps] Improve SIMD compares with narrower indices When allowed, use 32-bit indices rather than 64-bit indices in the SIMD computation of masks. This runs up to 2x and 4x faster on a number of AVX2 and AVX512 microbenchmarks. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D87116	2020-09-03 21:43:38 -07:00
Lei Zhang	8d420fb3a0	[spirv][nfc] Simplify resource limit with default values These default values are gotten from Vulkan required limits. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D87090	2020-09-03 13:29:26 -04:00
Benjamin Kramer	dfb7b3fe02	[mlir][VectorOps] Fall back to a loop when accessing a vector from a strided memref The scalar loop is slow but correct. Differential Revision: https://reviews.llvm.org/D87082	2020-09-03 16:05:38 +02:00
Jakub Lichman	f5ed22f09d	[mlir][VectorToSCF] 128 byte alignment of alloc ops Added 128 byte alignment to alloc ops created in VectorToSCF pass. 128b alignment was already introduced to this pass but not to all alloc ops. This commit changes that by adding 128b alignment to the remaining ops. The point of specifying alignment is to prevent possible memory alignment errors on weakly tested architectures. Differential Revision: https://reviews.llvm.org/D86454	2020-09-02 12:37:35 +00:00
Kiran Chandramohan	875074c8a9	[OpenMP][MLIR] Conversion pattern for OpenMP to LLVM Adding a conversion pattern for the parallel Operation. This will help the conversion of parallel operation with standard dialect to parallel operation with llvm dialect. The type conversion of the block arguments in a parallel region are controlled by the pattern for the parallel Operation. Without this pattern, a parallel Operation with block arguments cannot be converted from standard to LLVM dialect. Other OpenMP operations without regions are marked as legal. When translation of OpenMP operations with regions are added then patterns for these operations can also be added. Also uses all the standard to llvm patterns. Patterns of other dialects can be added later if needed. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D86273	2020-08-27 19:32:15 +01:00
George Mitenkov	d48b84eb8a	[MLIR][GPUToSPIRV] Passing gpu module name to SPIR-V module This patch allows to pass the gpu module name to SPIR-V module during conversion. This has many benefits as we can lookup converted to SPIR-V kernel in the symbol table. In order to avoid symbol conflicts, `"__spv__"` is added to the gpu module name to form the new one. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D86384	2020-08-27 09:19:24 +03:00

550 Commits