llvm-project

Commit Graph

Author	SHA1	Message	Date
gysit	9912bed730	[mlir][linalg] Remove RangeOp and RangeType. Remove the RangeOp and the RangeType that are not actively used anymore. After removing RangeType, the LinalgTypes header only includes the generated dialect header. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D115727	2021-12-15 07:19:10 +00:00
Lei Zhang	96130b5dc7	[mlir][spirv] Support size-1 vector/tensor constant during conversion Reviewed By: ThomasRaoux, mravishankar Differential Revision: https://reviews.llvm.org/D115518	2021-12-14 15:58:08 -05:00
Alexander Belyaev	15f8f3e20a	[mlir] Split std.rank into tensor.rank and memref.rank. Move `std.rank` similarly to how `std.dim` was moved to TensorOps and MemRefOps. Differential Revision: https://reviews.llvm.org/D115665	2021-12-14 10:15:55 +01:00
River Riddle	233e9476d8	[mlir:PDL] Allow non-bound pdl.attribute/pdl.type operations that create constants This allows for passing in these attributes/types to constraints/rewrites as arguments. Differential Revision: https://reviews.llvm.org/D114817	2021-12-10 19:38:43 +00:00
Alexander Belyaev	b618880e7b	[mlir] Move `linalg.tensor_expand/collapse_shape` to TensorDialect. RFC: https://llvm.discourse.group/t/rfc-reshape-ops-restructuring/3310 linalg.fill gets a canonicalizer, because `FoldFillWithTensorReshape` cannot be moved to tensorops (it uses linalg::FillOp inside). Before it was listed as a canonicalization pattern for the reshape operations, now it became a canonicalization for FillOp. Differential Revision: https://reviews.llvm.org/D115502	2021-12-10 12:11:48 +01:00
Mehdi Amini	79a0330a52	Fix crash from use of a temporary after its scope exit Introduced in D110448 and broke some bots (reported by ASAN). Differential Revision: https://reviews.llvm.org/D110448	2021-12-10 05:04:23 +00:00
Krzysztof Drewniak	e1da62910e	[MLIR][GPU] Define gpu.printf op and its lowerings - Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments. - Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP. - Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered. This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle. And: [MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support. In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D110448	2021-12-09 15:54:31 +00:00
Rob Suderman	23149d522b	[mlir] Added ctlz and cttz to math dialect and LLVM dialect Count leading/trailing zeros are existing LLVM intrinsics. Added LLVM support for the intrinsics with lowerings from the math dialect to LLVM dialect. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D115206	2021-12-08 14:32:15 -08:00
Butygin	d8fce785de	[mlir][spirv] math.erf OpenCL lowering Differential Revision: https://reviews.llvm.org/D115335	2021-12-08 21:59:46 +03:00
Mehdi Amini	be0a7e9f27	Adjust "end namespace" comment in MLIR to match new agreed coding style See D115115 and this mailing list discussion: https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html Differential Revision: https://reviews.llvm.org/D115309	2021-12-08 06:05:26 +00:00
Rob Suderman	c5fef77bc3	[mlir] Add CtPop to MathOps with lowering to LLVM math.ctpop maps to the llvm.ctpop intrinsic. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D114998	2021-12-06 11:54:20 -08:00
Alex Zinenko	d64b3e47ba	[mlir] Avoid needlessly converting LLVM named structs with compatible elements Conversion of LLVM named structs leads to them being renamed since we cannot modify the body of the struct type once it is set. Previously, this applied to all named struct types, even if their element types were not affected by the conversion. Make this behavior only applicable when element types are changed. This requires making the LLVM dialect type-compatibility check recursively look at the element types (arguably, it should have been doing that since the moment the LLVM dialect type system stopped being closed). In addition, have a more lax check for outer types only to avoid repeated checks when necessary (e.g., parser, verifiers that are going to also look at the inner type). Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D115037	2021-12-06 13:42:11 +01:00
Alex Zinenko	9dd1f8dfdd	[mlir] support recursive type conversion of named LLVM structs A previous commit added support for converting elemental types contained in LLVM dialect types in case they were not compatible with the LLVM dialect. It was missing support for named structs as they could be recursive, which was not supported by the conversion infra. Now that it is, add support for converting such named structs. Depends On D113579 Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D113580	2021-12-03 12:41:40 +01:00
Michal Terepeta	1423e8bf5d	[mlir][Vector] Support 0-D vectors in `BitCastOp` The implementation only allows to bit-cast between two 0-D vectors. We could probably support casting from/to vectors like `vector<1xf32>`, but I wasn't convinced that this would be important and it would require breaking the invariant that `BitCastOp` works only on vectors with equal rank. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114854	2021-12-03 08:55:59 +00:00
Nicolas Vasilache	c537a94334	[mlir][Vector] Thread 0-d vectors through vector.transfer ops This revision adds 0-d vector support to vector.transfer ops. In the process, numerous cleanups are applied, in particular around normalizing and reducing the number of builders. Reviewed By: ThomasRaoux, springerm Differential Revision: https://reviews.llvm.org/D114803	2021-12-01 16:49:43 +00:00
Jacques Pienaar	62fea88bc5	[mlir] Update accessors prefixed form (NFC)	2021-11-30 19:42:37 -08:00
Stephen Neuendorffer	7386364889	Revert "[MLIR] Update Vector To LLVM conversion to be aware of assume_alignment" This reverts commit `29a50c5864`. After LLVM lowering, the original patch incorrectly moved alignment information across an unconstrained GEP operation. This is only correct for some index offsets in the GEP. It seems that the best approach is, in fact, to rely on LLVM to propagate information from the llvm.assume() to users. Thanks to Thomas Raoux for catching this.	2021-11-30 15:18:22 -08:00
Nicolas Vasilache	a08b750ce9	[mlir][tensor] InsertSliceOp verification. This revision reintroduces tensor.insert_slice verification which seems to have vanished over time: a verifier was initially introduced in `cf9503c1b7` but for some reason the invalid.mlir was not properly updated; as time passed the verifier was not called anymore and later the code was deleted. As a consequence, a non-negligible portion of tests has run astray using invalid tensor.insert_slice semantics and needed to be fixed. Also, extract isRankReducedType from TensorOps for better reuse Originally, this facility was used by both tensor and memref forms but it got copied around as dialects were split. Differential Revision: https://reviews.llvm.org/D114715	2021-11-30 20:37:06 +00:00
Alexander Belyaev	f910aa9105	[mlir] Fix BufferizationToMemRef build.	2021-11-30 13:10:54 +01:00
Julian Gross	ae1ea0bead	[mlir] Decompose Bufferization Clone operation into Memref Alloc and Copy. This patch introduces a new conversion to convert bufferization.clone operations into a memref.alloc and a memref.copy operation. This transformation is needed to transform all remaining clones which "survive" all previous transformations, before a given program is lowered further (to LLVM e.g.). Otherwise, these operations cannot be handled anymore and lead to compile errors. See: https://llvm.discourse.group/t/bufferization-error-related-to-memref-clone/4665 Differential Revision: https://reviews.llvm.org/D114233	2021-11-30 10:15:56 +01:00
Stanislav Funiak	a19e163526	Fixed broken build under GCC 5.4. This diff fixes broken build caused by D108550. Under GCC 5, auto lambdas that capture this require `this->` for member calls. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D114659	2021-11-27 09:03:27 +05:30
Michal Terepeta	d0f927121e	[mlir][Standard] Support 0-D vectors in `SplatOp` This changes the op to produce `AnyVectorOfAnyRank` and implements this by just inserting the element (skipping the shuffle that we do for the 1-D case). Depends On D114549 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114598	2021-11-26 17:05:15 +00:00
Benjamin Kramer	8521850f20	Provide a definition for OperationPosition::kDown This isn't necessary in C++17, but C++14 still requires it.	2021-11-26 14:11:59 +01:00
Benjamin Kramer	1b0312d280	[PDL] fix unused variable warning in Release builds	2021-11-26 14:11:58 +01:00
Stanislav Funiak	a76ee58f3c	Multi-root PDL matching using upward traversals. This is commit 4 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). This PR integrates the various components (root ordering algorithm, nondeterministic execution of PDL bytecode) to implement multi-root PDL matching. The main idea is for the pattern to specify multiple candidate roots. The PDL-to-PDLInterp lowering selects one of these roots and "hangs" the pattern from this root, traversing the edges downwards (from operation to its operands) when possible and upwards (from values to its uses) when needed. The root is selected by invoking the optimal matching multiple times, once for each candidate root, and the connectors are determined from the optimal matching. The costs in the directed graph are equal to the number of upward edges that need to be traversed when connecting the given two candidate roots. It can be shown that, for this choice of the cost function, "hanging" the pattern from an inner node is no better than from the optimal root. The following four main additions were implemented as a part of this PR: 1. OperationPos predicate has been extended to allow tracing the operation accepting a value (the opposite of operation defining a value). 2. Predicate checking if two values are not equal - this is useful to ensure that we do not traverse the edge back downwards after we traversed it upwards. 3. Function for building the cost graph among the candidate roots. 4. Updated buildPredicateList, building the predicates once the optimal branching has been determined. Testing: unit tests (an integration test to follow once the stack of commits has landed) Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108550	2021-11-26 18:11:37 +05:30
Stanislav Funiak	6df7cc7f47	Implementation of the root ordering algorithm This is commit 3 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). We form a graph over the specified roots, provided in `pdl.rewrite`, where two roots are connected by a directed edge if the target root can be connected (via a chain of operations) in the underlying pattern to the source root. We place a restriction that the path connecting the two candidate roots must only contain the nodes in the subgraphs underneath these two roots. The cost of an edge is the smallest number of upward traversals (edges) required to go from the source to the target root, and the connector is a `Value` in the intersection of the two subtrees rooted at the source and target root that results in that smallest number of such upward traversals. Optimal root ordering is then formulated as the problem of finding a spanning arborescence (i.e., a directed spanning tree) of minimal weight. In order to determine the spanning arborescence (directed spanning tree) of minimum weight, we use the [Edmonds' algorithm](https://en.wikipedia.org/wiki/Edmonds%27_algorithm). The worst-case computational complexity of this algorithm is O(_N_^3) for a single root, where _N_ is the number of specified roots. The `pdl`-to-`pdl_interp` lowering calls this algorithm as a subroutine _N_ times (once for each candidate root), so the overall complexity of root ordering is O(_N_^4). If needed, this complexity could be reduced to O(_N_^3) with a more efficient algorithm. However, note that the underlying implementation is very efficient, and _N_ in our instances tends to be very small (<10). Therefore, we believe that the proposed (asymptotically suboptimal) implementation will suffice for now. Testing: a unit test of the algorithm Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108549	2021-11-26 18:11:37 +05:30
Michal Terepeta	cc311a155a	[mlir][Vector] Support 0-D vectors in `VectorPrintOpConversion` Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114549	2021-11-25 20:12:18 +00:00
Butygin	8dae0b6b6c	[mlir][spirv] arith::RemSIOp OpenCL lowering Differential Revision: https://reviews.llvm.org/D114524	2021-11-25 12:44:06 +03:00
Butygin	75a1bee05d	[mlir][spirv] Add math to OpenCL conversion Differential Revision: https://reviews.llvm.org/D113780	2021-11-24 02:31:21 +03:00
Rob Suderman	54eec7cafc	[mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support Transpose convolution decomposition is now performed in a separate pass. This allows padding / constant propagation to be performed at the TOSA level. It also adds support for striding when there is no dilation. Differential Revision: https://reviews.llvm.org/D114409	2021-11-23 12:16:44 -08:00
Nicolas Vasilache	3ff4e5f2a4	[mlir][Vector] Thread 0-d vectors through InsertElementOp. This revision makes concrete use of 0-d vectors to extend the semantics of InsertElementOp. Reviewed By: dcaballe, pifon2a Differential Revision: https://reviews.llvm.org/D114388	2021-11-23 12:55:11 +00:00
Nicolas Vasilache	e7026aba00	[mlir][Vector] Thread 0-d vectors through ExtractElementOp. This revision starts making concrete use of 0-d vectors to extend the semantics of ExtractElementOp. In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in. Differential Revision: https://reviews.llvm.org/D114387	2021-11-23 12:39:44 +00:00
Thomas Raoux	47555d73f6	[mlir][gpu] Extend shuffle op modes and add nvvm lowering Add up, down and idx modes to gpu shuffle ops, also change the mode from string to enum Differential Revision: https://reviews.llvm.org/D114188	2021-11-19 11:14:31 -08:00
Mogball	7c5ecc8b7e	[mlir][vector] Insert/extract element can accept index `vector::InsertElementOp` and `vector::ExtractElementOp` have had their `position` operand changed to accept `AnySignlessIntegerOrIndex` for better operability with operations that use `index`, such as affine loops. LLVM's `extractelement` and `insertelement` can also accept `i64`, so lowering directly to these operations without explicitly inserting casts is allowed. SPIRV's equivalent ops can also accept `i64`. Reviewed By: nicolasvasilache, jpienaar Differential Revision: https://reviews.llvm.org/D114139	2021-11-18 22:40:29 +00:00
River Riddle	0c7890c844	[mlir] Convert NamedAttribute to be a class NamedAttribute is currently represented as an std::pair, but this creates an extremely clunky .first/.second API. This commit converts it to a class, with better accessors (getName/getValue) and also opens the door for more convenient API in the future. Differential Revision: https://reviews.llvm.org/D113956	2021-11-18 05:39:29 +00:00
River Riddle	195730a650	[mlir][NFC] Replace references to Identifier with StringAttr This is part of the replacement of Identifier with StringAttr. Differential Revision: https://reviews.llvm.org/D113953	2021-11-16 17:36:26 +00:00
Butygin	526b71e44a	[mlir] spirv: Add scf.while spirv conversion * It works similarly to the scf.for conversion, but converts the condition and yield ops as part of the scf.while pattern, so it doesn't need to maintain external state Differential Revision: https://reviews.llvm.org/D113007	2021-11-16 13:19:34 +03:00
Adrian Kuegel	921d91f3ac	[mlir] Support multi-dimensional vectors in MathToLibm conversion. Differential Revision: https://reviews.llvm.org/D113969	2021-11-16 11:13:52 +01:00
natashaknk	381677dfbf	[tosa][mlir] Refactor tosa.reshape lowering to linalg for dynamic cases. Split tosa.reshape into three individual lowerings: collapse, expand and a combination of both. Add simple dynamic shape support. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D113936	2021-11-15 15:31:37 -08:00
Nicolas Vasilache	ee80ffbf9a	[mlir][Linalg] Add bounded recursion declaration to FMAOp -> LLVM conversion. FMAOp -> LLVM conversion is done progressively by peeling off 1 dimension from FMAOp at each pattern iteration. Add the recursively bounded property declaration to the pattern so that the rewriter can apply it multiple times. Without this, FMAOps with 3+D do not lower to LLVM. Differential Revision: https://reviews.llvm.org/D113886	2021-11-15 12:41:52 +00:00
Alexander Belyaev	9b1d90e8ac	[mlir] Move min/max ops from Std to Arith. Differential Revision: https://reviews.llvm.org/D113881	2021-11-15 13:19:17 +01:00
Nicolas Vasilache	f67171ac58	[mlir][Linalg] Make depthwise convolution naming scheme consistent. Names should be consistent across all operations otherwise painful bugs will surface. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D113762	2021-11-15 07:54:29 +00:00
Thomas Raoux	e7969240dc	[mlir][VectorToGPU] Support more cases in conversion to MMA ops Support load with broadcast, elementwise divf op and remove the hardcoded restriction on the vector size. Picking the right size should be enforced by the user and will fail conversion to llvm/spirv if it is not supported. Differential Revision: https://reviews.llvm.org/D113618	2021-11-11 13:10:38 -08:00
Rob Suderman	860d3811a9	[mlir][tosa] Add lowering for tosa.pad with explicit value New TOSA pad operation can support explicitly specifying the pad value. Added lowering to linalg that uses the explicit value. Differential Revision: https://reviews.llvm.org/D113515	2021-11-10 14:15:20 -08:00
thomasraoux	f309939d06	[mlir][nvvm] Remove special case ptr arithmetic lowering in gpu to nvvm Use existing helper instead of handling only a subset of indices lowering arithmetic. Also relax the restriction on the memref rank for the GPU mma ops as we can now support any rank. Differential Revision: https://reviews.llvm.org/D113383	2021-11-10 10:00:12 -08:00
Alex Zinenko	e64c76672f	[mlir] recursively convert builtin types to LLVM when possible Given that LLVM dialect types may now optionally contain types from other dialects, which itself is motivated by dialect interoperability and progressive lowering, the conversion should no longer assume that the outermost LLVM dialect type can be left as is. Instead, it should inspect the types it contains and attempt to convert them to the LLVM dialect. Introduce this capability for LLVM array, pointer and structure types. Only literal structures are currently supported as handling identified structures requires the conversion infrastructure to have a mechanism for avoiding infinite recursion in case of recursive types. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D112550	2021-11-10 18:11:00 +01:00
River Riddle	937e40a8cf	[mlir] Remove the non-templated DenseElementsAttr::getSplatValue This predates the templated variant, and has been simply forwarding to getSplatValue<Attribute> for some time. Removing this makes the API a bit more uniform, and also helps prevent users from thinking it is "cheap".	2021-11-09 01:40:40 +00:00
River Riddle	ae40d62541	[mlir] Refactor ElementsAttr's value access API There are several aspects of the API that either aren't easy to use, or are deceptively easy to do the wrong thing. The main change of this commit is to remove all of the `getValue<T>`/`getFlatValue<T>` from ElementsAttr and instead provide operator[] methods on the ranges returned by `getValues<T>`. This provides a much more convenient API for the value ranges. It also removes the easy-to-be-inefficient nature of getValue/getFlatValue, which under the hood would construct a new range for the type `T`. Constructing a range is not necessarily cheap in all cases, and could lead to very poor performance if used within a loop; i.e. if you were to naively write something like: ``` DenseElementsAttr attr = ...; for (int i = 0; i < size; ++i) { // We are internally rebuilding the APFloat value range on each iteration!! APFloat it = attr.getFlatValue<APFloat>(i); } ``` Differential Revision: https://reviews.llvm.org/D113229	2021-11-09 00:15:08 +00:00
thomasraoux	d88cc07943	[mlir][gpuTonvvm] Remove hardcoded values in MMAType to llvm struct Also relax the types allowed in GPU wmma ops Differential Revision: https://reviews.llvm.org/D112969	2021-11-02 08:12:27 -07:00
thomasraoux	7fbb0678fa	[mlir][VectorToGPU] Add support for elementwise mma to vector to GPU Differential Revision: https://reviews.llvm.org/D112960	2021-11-02 08:01:04 -07:00
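
Sketch for commit `6df7cc7f47` (root ordering): the message above casts root ordering as finding a minimum-weight spanning arborescence, solved once per candidate root. The Python below is only a minimal illustration of that idea using the textbook Chu-Liu/Edmonds contraction scheme. It is not the MLIR implementation, it returns just the total cost rather than the tree, and the names `min_arborescence_cost` and `pick_root`, the `(src, dst, cost)` edge encoding, and the example costs are all hypothetical.

```python
def min_arborescence_cost(n, root, edges):
    """Total weight of a minimum spanning arborescence rooted at `root`.

    `edges` is a list of (src, dst, cost) tuples over nodes 0..n-1.
    Returns None if some node cannot be reached from `root`.
    """
    total = 0
    while True:
        # Step 1: pick the cheapest incoming edge for every non-root node.
        best_in = [None] * n  # best_in[v] = (src, cost)
        for u, v, w in edges:
            if u != v and v != root and (best_in[v] is None or w < best_in[v][1]):
                best_in[v] = (u, w)
        for v in range(n):
            if v != root:
                if best_in[v] is None:
                    return None
                total += best_in[v][1]

        # Step 2: find cycles formed by the chosen edges.
        comp = [-1] * n  # id of the contracted component, -1 if unassigned
        mark = [-1] * n  # which walk first visited the node
        count = 0
        for start in range(n):
            v = start
            while v != root and mark[v] == -1 and comp[v] == -1:
                mark[v] = start
                v = best_in[v][0]
            if v != root and comp[v] == -1 and mark[v] == start:
                # The walk came back to a node it marked itself: a new cycle.
                u = best_in[v][0]
                while u != v:
                    comp[u] = count
                    u = best_in[u][0]
                comp[v] = count
                count += 1
        if count == 0:
            return total  # no cycles: the chosen edges form the arborescence

        # Step 3: contract each cycle into one node and re-weight edges.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        edges = [
            (comp[u], comp[v], w - (best_in[v][1] if v != root else 0))
            for u, v, w in edges
            if comp[u] != comp[v]
        ]
        root = comp[root]
        n = count


def pick_root(n, edges):
    """Try every node as the root; return (root, cost) with the lowest cost."""
    best = None
    for r in range(n):
        cost = min_arborescence_cost(n, r, edges)
        if cost is not None and (best is None or cost < best[1]):
            best = (r, cost)
    return best


if __name__ == "__main__":
    # Hypothetical costs: number of upward traversals between three roots.
    costs = [(0, 1, 0), (1, 0, 2), (1, 2, 1), (2, 1, 1), (0, 2, 3), (2, 0, 0)]
    print(pick_root(3, costs))  # prints (2, 0)
```

Calling the single-root routine once per candidate, as `pick_root` does, mirrors the "once for each candidate root" structure described in the commit message and is where the extra factor of _N_ in the stated complexity comes from.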

1318 Commits