llvm-project

Commit Graph

Author	SHA1	Message	Date
Alex Zinenko	5c73db24df	[mlir] disallow side-effecting ops in llvm.mlir.global The llvm.mlir.global operation accepts a region as initializer. This region corresponds to an LLVM IR constant expression and therefore should not accept operations with side effects. Add a corresponding verifier. Reviewed By: wsmoses, bondhugula Differential Revision: https://reviews.llvm.org/D120632	2022-03-01 14:16:09 +01:00
gysit	e9085d0d25	[mlir][OpDSL] Rename function to make signedness explicit (NFC). The revision renames the following OpDSL functions: ``` TypeFn.cast -> TypeFn.cast_signed BinaryFn.min -> BinaryFn.min_signed BinaryFn.max -> BinaryFn.max_signed ``` The corresponding enum values on the C++ side are renamed accordingly: ``` #linalg.type_fn<cast> -> #linalg.type_fn<cast_signed> #linalg.binary_fn<min> -> #linalg.binary_fn<min_signed> #linalg.binary_fn<max> -> #linalg.binary_fn<max_signed> ``` Depends On D120110 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120562	2022-03-01 08:15:53 +00:00
gysit	24357fec8d	[mlir][OpDSL] Add arithmetic function attributes. The revision extends OpDSL with unary and binary function attributes. A function attribute makes the operations used in the body of a structured operation configurable. For example, a pooling operation may take an aggregation function attribute that specifies if the op shall implement a min or a max pooling. The goal of this revision is to define fewer and more flexible operations. We may thus, for example, define an element-wise op: ``` linalg.elem(lhs, rhs, outs=[out], op=BinaryFn.mul) ``` If the op argument is not set, the default operation is used. Depends On D120109 Reviewed By: nicolasvasilache, aartbik Differential Revision: https://reviews.llvm.org/D120110	2022-03-01 07:45:47 +00:00
Michael Kruse	a66f7769a3	[OpenMPIRBuilder] Implement static-chunked workshare-loop schedules. Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads. This patch includes the following related changes: * Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder. * Remove the chunk parameter from applyStaticWorkshareLoop; it is ignored by the runtime. Change the value passed to the init function to 0, as without OpenMPIRBuilder. * Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop. * Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause. Differential Revision: https://reviews.llvm.org/D114413	2022-02-28 18:18:33 -06:00
Okwan Kwon	4c901bf447	[mlir] Match Arithmetic::ConstantOp and Tensor::ExtractSliceOp. Add a pattern matcher for ExtractSliceOp when its source is a constant. The matching heuristics can be governed by the control function since generating a new constant is not always beneficial. Differential Revision: https://reviews.llvm.org/D119605	2022-02-28 23:09:03 +00:00
Lei Zhang	96bc2233c4	[mlir][linalg] Enhance FoldInsertPadIntoFill to support op chain If we have a chain of `tensor.insert_slice` ops inserting some `tensor.pad` op into a `linalg.fill` and ranges do not overlap, we can also elide the `tensor.pad` later. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D120446	2022-02-28 16:51:17 -05:00
Lei Zhang	5d47332783	[mlir][linalg] Fold tensor.pad when inserting into linalg.fill Fold tensor.insert_slice(tensor.pad(<input>), linalg.fill) into tensor.insert_slice(<input>, linalg.fill) if the padding value and the filling value are the same. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D120410	2022-02-28 16:42:32 -05:00
Okwan Kwon	4f5eb53e68	Revert "[mlir] Fold Arithmetic::ConstantOp and Tensor::ExtractSliceOp." This reverts commit `3104994104`.	2022-02-28 19:14:05 +00:00
Okwan Kwon	3104994104	[mlir] Fold Arithmetic::ConstantOp and Tensor::ExtractSliceOp. Fold ExtractSliceOp when the source is a constant.	2022-02-28 17:47:29 +00:00
gysit	11d144c576	[mlir][linalg] Check the iterator types are valid. Improve the LinalgOp verification to ensure the iterator types are known. Previously, unknown iterator types were ignored without warning, which could lead to confusing bugs. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D120649	2022-02-28 11:25:40 +00:00
Adrian Kuegel	44adca60d4	[mlir] Remove unused static variables (NFC)	2022-02-28 11:52:39 +01:00
Alexander Belyaev	1a829d2d06	[mlir] Purge linalg.tiled_loop. Differential Revision: https://reviews.llvm.org/D119415	2022-02-28 09:05:18 +01:00
River Riddle	b474ca1d5a	[PDLL] Properly error out on returning results from native constraints PDL currently doesn't support result values from constraints, meaning we need to error out until this is actually supported to avoid crashes. Differential Revision: https://reviews.llvm.org/D119782	2022-02-26 11:08:51 -08:00
River Riddle	9ad64a5c78	[mlir:PDLL] Add support for C++ generation This commit adds a C++ generator to PDLL that generates wrapper PDL patterns directly usable in C++ code, and also generates the definitions of native constraints/rewrites that have code bodies specified in PDLL. This generator is effectively the PDLL equivalent of the current DRR generator, and will allow easy replacement of DRR patterns with PDLL patterns. A followup will start to utilize this for end-to-end integration testing and showcase how to use this as a drop-in replacement for DRR tablegen usage. Differential Revision: https://reviews.llvm.org/D119781	2022-02-26 11:08:51 -08:00
River Riddle	a486cf5e98	[mlir:PDLL] Fix handling of unspecified operands/results on operation expressions If the operand list or result list of an operation expression is not specified, we interpret this as meaning that the operands/results are "unconstrained" (i.e. "could be anything"). We currently don't properly handle differentiating this case from the case of "no operands/results". This commit adds the insertion of implicit value/type range variables when these lists are unspecified. This allows for adding proper support for when zero operands or results are expected. Differential Revision: https://reviews.llvm.org/D119780	2022-02-26 11:08:51 -08:00
River Riddle	95b4e88b1d	[mlir:PDLL] Add support for PDL MLIR code generation This commit starts to plumb PDLL down into MLIR and adds an initial PDL generator. After this commit, we will conceptually support end-to-end execution of PDLL. Followups will add CPP generation to match the current DRR setup, and begin to add various end-to-end tests to test PDLL execution. Differential Revision: https://reviews.llvm.org/D119779	2022-02-26 11:08:51 -08:00
Aart Bik	180c9f9efe	[mlir][sparse] enable scalar test Removed TODO now that we support scalars properly Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D120590	2022-02-25 15:05:25 -08:00
Bixia Zheng	6f07191101	[mlir][sparse][taco] Support reduction to scalar tensors. The PyTACO DSL doesn't support reduction to scalars. This change enhances the MLIR-PyTACO implementation to support reduction to scalars. Extend an existing test to show the syntax of reduction to scalars and two methods to retrieve the scalar values. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120572	2022-02-25 14:17:45 -08:00
Hanhan Wang	748bf4bb28	[mlir][Linalg] Add support for tileFuseAndDistribute on tensors. This extends TileAndFuse to handle distribution on tensors. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D120441	2022-02-25 11:51:11 -08:00
Diego Caballero	875bbce9f7	[mlir][Vector] Prevent AVX2 lowering for non-f32 transpose ops The AVX2 lowering for transpose operations is only applicable to f32 vector types. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120427	2022-02-25 19:27:32 +00:00
Diego Caballero	d7e0a0846b	[mlir][Vector] Generalize AVX2 transpose lowering to n-D vectors The existing AVX2 lowering patterns for the transpose op only trigger if the input vector is 2-D. This patch extends the patterns to trigger for n-D vectors which are effectively 2-D vectors (e.g., vector<1x4x1x8x1). The main constraint for the generalized AVX2 patterns to be applicable to these vectors is that the dimensions that are greater than one must be transposed. Otherwise, the existing patterns are not applicable. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D119505	2022-02-25 19:27:32 +00:00
Chia-hung Duan	9445b39673	[mlir] Support verification order (2/3) This change gives explicit order of verifier execution and adds `hasRegionVerifier` and `verifyWithRegions` to increase the granularity of verifier classification. The order is as follows: 1. InternalOpTrait will be verified first, they can be run independently. 2. `verifyInvariants` which is constructed by ODS, it verifies the type, attributes, etc. 3. Other Traits/Interfaces that have marked their verifier as `verifyTrait` or `verifyWithRegions=0`. 4. Custom verifier which is defined in the op and has marked `hasVerifier=1` If an operation has regions, then it may have the second phase, 5. Traits/Interfaces that have marked their verifier as `verifyRegionTrait` or `verifyWithRegions=1`. This implies the verifier needs to access the operations in its regions. 6. Custom verifier which is defined in the op and has marked `hasRegionVerifier=1` Note that the second phase will be run after the operations in the region are verified. Based on the verification order, you will be able to avoid verifying duplicate things. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D116789	2022-02-25 19:04:56 +00:00
Bixia Zheng	c601dfbcc2	[mlir][sparse][taco] Use np.array_equal to compare integer values. Fix MLIR-PyTACO and some tests to use np.array_equal to compare integer values. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120526	2022-02-25 07:38:15 -08:00
gysit	cd2776b0d5	[mlir][OpDSL] Split arithmetic functions. Split arithmetic function into unary and binary functions. The revision prepares the introduction of unary and binary function attributes that work similar to type function attributes. Depends On D120108 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120109	2022-02-25 15:27:42 +00:00
Bixia Zheng	90f22ab3ad	[mlir][sparse][taco] Add support for scalar tensors. This change allows the use of scalar tensors with index 0 in tensor index expressions. In this case, the scalar value is broadcast to match the dimensions of other tensors in the same expression. Using scalar tensors as a destination in tensor index expressions is not supported in the PyTACO DSL. Add a PyTACO test to show the use of scalar tensors. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120524	2022-02-25 07:20:15 -08:00
gysit	4d4cb17da8	[mlir][OpDSL] Refactor function handling. Prepare the OpDSL function handling to introduce more function classes. A follow up commit will split ArithFn into UnaryFn and BinaryFn. This revision prepares the split by adding a function kind enum to handle different function types using a single class on the various levels of the stack (for example, there is now one TensorFn and one ScalarFn). Depends On D119718 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120108	2022-02-25 15:05:32 +00:00
gysit	51fdd802c7	[mlir][OpDSL] Add type function attributes. Previously, OpDSL operations used hardcoded type conversion operations (cast or cast_unsigned). Supporting signed and unsigned casts thus meant implementing two different operations. Type function attributes allow us to define a single operation that has a cast type function attribute which at operation instantiation time may be set to cast or cast_unsigned. We may, for example, define a matmul operation with a cast argument: ``` @linalg_structured_op def matmul(A=TensorDef(T1, S.M, S.K), B=TensorDef(T2, S.K, S.N), C=TensorDef(U, S.M, S.N, output=True), cast=TypeFnAttrDef(default=TypeFn.cast)): C[D.m, D.n] += cast(U, A[D.m, D.k]) * cast(U, B[D.k, D.n]) ``` When instantiating the operation the attribute may be set to the desired cast function: ``` linalg.matmul(lhs, rhs, outs=[out], cast=TypeFn.cast_unsigned) ``` The revision introduces an enum in the Linalg dialect that maps one-to-one to the type functions defined by OpDSL. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D119718	2022-02-25 08:25:23 +00:00
Thomas Raoux	b1357fe618	[mlir][memref] Add transformation to do loop multi-buffering This transformation is useful to break dependency between consecutive loop iterations by increasing the size of a temporary buffer. This is usually combined with heavy software pipelining. Differential Revision: https://reviews.llvm.org/D119406	2022-02-24 09:41:21 -08:00
Marius Brehler	1fa1251116	[mlir][emitc] Add a variable op This adds a variable op, emitted as C/C++ local variable, which can be used if the `emitc.constant` op is not sufficient. As an example, the canonicalization pass would transform ```mlir %0 = "emitc.constant"() {value = 0 : i32} : () -> i32 %1 = "emitc.constant"() {value = 0 : i32} : () -> i32 %2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32> %3 = emitc.apply "&"(%1) : (i32) -> !emitc.ptr<i32> emitc.call "write"(%2, %3) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> () ``` into ```mlir %0 = "emitc.constant"() {value = 0 : i32} : () -> i32 %1 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32> %2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32> emitc.call "write"(%1, %2) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> () ``` resulting in pointer aliasing, as %1 and %2 point to the same address. In such a case, the `emitc.variable` operation can be used instead. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D120098	2022-02-24 15:25:21 +00:00
Benjamin Kramer	92cf9f1481	[mlir][linalg] Cast back to the original type after making linalg.generic outputs more static This codepath was entirely untested. Differential Revision: https://reviews.llvm.org/D120473	2022-02-24 13:35:54 +01:00
Javier Setoain	cd0d21b47b	[mlir][LLVM] Allow scalable vectors in ShuffleVectorOp The current implementation of ShuffleVectorOp assumes all vectors are scalable. LLVM IR allows shufflevector operations on scalable vectors, and the current translation between LLVM Dialect and LLVM IR does the right thing when the shuffle mask is all zeroes. This is required to do a splat operation on a scalable vector, but it doesn't make sense for scalable vectors outside of that operation, i.e., with non-all-zero masks. Differential Revision: https://reviews.llvm.org/D118371	2022-02-24 11:24:34 +00:00
Matthias Springer	25bc684603	[mlir][linalg][bufferize] Always bufferize in-place with "out" operands by default In D115022, we introduced an optimization where OpResults of a `linalg.generic` may bufferize in-place with an "in" OpOperand if the corresponding "out" OpOperand is not used in the computation. This optimization can lead to unexpected behavior if the newly chosen OpOperand is in the same alias set as another OpOperand (that is used in the computation). In that case, the newly chosen OpOperand must bufferize out-of-place. This can be confusing to users, as always choosing the "out" OpOperand (regardless of whether it is used) would be expected when having the notion of "destination-passing style" in mind. With this change, we go back to always bufferizing in-place with "out" OpOperands by default, but letting users override the behavior with a bufferization option. Differential Revision: https://reviews.llvm.org/D120182	2022-02-24 19:58:05 +09:00
rkayaith	e9db306dcd	[mlir][python] Support more types in IntegerAttr.value Previously only accessing values for `index` and signless int types would work; signed and unsigned ints would hit an assert in `IntegerAttr::getInt`. This exposes `IntegerAttr::get{S,U}Int` to the C API and calls the appropriate function from the python bindings. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D120194	2022-02-24 10:26:31 +01:00
Bixia Zheng	c8ae8cfb5d	[mlir][sparse][taco] Add support for float32. Previously, we only supported float64. We now support float32 and float64. When constructing a tensor without providing a data type, the default is float32. Fix the tests for data type consistency. All PyTACO application tests now use float32 to match the default data type of TACO. Other tests may use float32 or float64. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D120356	2022-02-23 18:24:22 -08:00
Aart Bik	652b39b46f	[mlir][sparse][linalg] add linalg rewriting specific to sparse tensors Now that sparse tensor types are first-class citizens and the sparse compiler is taking shape, it is time to make sure other compiler optimizations compose well with sparse tensors. Mostly, this should be completely transparent (i.e., dense and sparse take the same path). However, in some cases, optimizations only make sense in the context of sparse tensors. This is a first example of such an optimization, where fusing a sampled elt-wise multiplication only makes sense when the resulting kernel has a potential lower asymptotic complexity due to the sparsity. As an extreme example, running SDDMM with 1024x1024 matrices and a sparse sampling matrix with only two elements runs in 463.55ms in the unfused case but just 0.032ms in the fused case, with a speedup of 14485x that is only possible in the exciting world of sparse computations! Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D120429	2022-02-23 17:29:41 -08:00
William S. Moses	1b2a1f8473	[MLIR][Arith] Canonicalize cmpf(int to fp) to cmpi Given a cmpf of either uitofp or sitofp and a constant, attempt to canonicalize it to a cmpi. This PR rewrites equivalent code within LLVM to now apply to MLIR arith. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D117257	2022-02-23 14:09:20 -05:00
Eugene Zhulenev	beff16f7bd	[mlir] Async: update condition for dispatching block-aligned compute function + compare block size with the unrollable inner dimension + reduce nesting in the code and simplify a bit IR building Reviewed By: cota Differential Revision: https://reviews.llvm.org/D120075	2022-02-23 10:29:55 -08:00
Aart Bik	8b83b8f131	[mlir][sparse] refactor sparse compiler pipeline to single place Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D120347	2022-02-22 16:23:56 -08:00
Okwan Kwon	f79f430d4b	Fold Tensor.extract_slice into a constant splat. Fold tensor.extract_slice into arith.constant when the source is a constant splat and the result type is statically shaped.	2022-02-22 21:39:57 +00:00
Matthias Springer	d2dacde5d8	[mlir][bufferize][NFC] Rename `comprehensive-function-bufferize` to `one-shot-bufferize` The related functionality is moved over to the bufferization dialect. Test cases are cleaned up a bit. Differential Revision: https://reviews.llvm.org/D120191	2022-02-22 17:19:20 +09:00
Matthias Springer	41cb504b7c	[mlir][linalg][bufferize][NFC] Move interface impl to Linalg Transforms This is for consistency with other dialects. Differential Revision: https://reviews.llvm.org/D120190	2022-02-21 17:14:24 +09:00
Prateek Gupta	1a2bb03eda	[MLIR][LINALG] Add canonicalization pattern in `linalg.generic` op for static shape inference. This commit adds a canonicalization pattern in `linalg.generic` op for static shape inference. If any of the inputs or outputs have static shape or is cast from a tensor of static shape, then shapes of all the inputs and outputs can be inferred by using the affine map of the static shape input/output. Signed-Off-By: Prateek Gupta <prateek@nod-labs.com> Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D118929	2022-02-21 07:51:13 +00:00
Shraiysh Vaishay	c1e4e01945	[mlir][OpenMP] Added assemblyFormat for SectionsOp This patch adds assemblyFormat for omp.sections operation. Some existing functions have been altered to fit the custom directive in assemblyFormat. This has led to their callsites to get modified too, but those will be removed in later patches, when other operations get their assemblyFormat. All operations were not changed in one patch for ease of review. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D120176	2022-02-21 13:01:49 +05:30
Matthias Springer	4ec00fb3ea	[mlir][bufferize] Add a way for ops to fail the analysis Add `BufferizableOpInterface::verifyAnalysis`. Ops can implement this method to check for expected invariants and limitations. The purpose of this change is to introduce a modular way of checking assertions such as `assertScfForAliasingProperties`. Differential Revision: https://reviews.llvm.org/D120189	2022-02-20 05:51:18 +09:00
Shraiysh Vaishay	39151717db	[mlir][OpenMP] Added assemblyFormat for ParallelOp This patch adds assemblyFormat for omp.parallel operation. Some existing functions have been altered to fit the custom directive in assemblyFormat. This has led to their callsites to get modified too, but those will be removed in later patches, when other operations get their assemblyFormat. All operations were not changed in one patch for ease of review. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D120157	2022-02-19 10:28:58 +05:30
Aart Bik	9b9a084af0	[mlir][sparse][pytaco] test with 3-dim tensor and scalar Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D120163	2022-02-18 15:16:35 -08:00
Aart Bik	6438783fda	[mlir][sparse] provide more types for external to/from MLIR routines These routines will need to be specialized a lot more based on value types, index types, pointer types, and permutation/dimension ordering. This is a careful first step, providing some functionality needed in PyTACO bridge. Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D120154	2022-02-18 13:36:52 -08:00
Shraiysh Vaishay	60210f9acb	[mlir][OpenMP] Added assemblyformat for TargetOp This patch removes custom parser/printer for `omp.target` and adds assemblyformat. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D120138	2022-02-19 01:22:59 +05:30
Shraiysh Vaishay	5ee500acbb	[mlir][OpenMP] Remove clauses that are not being handled This patch removes the following clauses from OpenMP Dialect: - private - firstprivate - lastprivate - shared - default - copyin - copyprivate The privatization clauses are being handled in the flang frontend. The data copying clauses are not being handled anywhere for now. Once we have a better picture of how to handle these clauses in OpenMP Dialect, we can add these. For the time being, removing unneeded clauses. For detailed discussion about this refer to [[ https://discourse.llvm.org/t/rfc-privatisation-in-openmp-dialect/3526 \| Privatisation in OpenMP dialect ]] Reviewed By: kiranchandramohan, clementval Differential Revision: https://reviews.llvm.org/D120029	2022-02-19 01:13:05 +05:30
William S. Moses	670aeece51	[MLIR][OpenMP][SCF] Mark parallel regions as allocation scopes MLIR has the notion of allocation scopes which specify that stack allocations (e.g. memref.alloca, llvm.alloca) should be freed or equivalently aren't available at the end of the corresponding region. Currently neither OpenMP parallel nor SCF parallel regions have the notion of such a scope. This clearly makes sense for an OpenMP parallel as this is implemented with a new function which outlines the region, and clearly any allocations in that newly outlined function have a lifetime that ends at the return of the function, by definition. While SCF.parallel doesn't have a guaranteed runtime which it is implemented with, this similarly makes sense for SCF.parallel since otherwise an allocation within an SCF.parallel will needlessly continue to allocate stack memory that isn't cleaned up until the function (or other allocation scope op) which contains the SCF.parallel returns. This means that it is impossible to represent thread or iteration-local memory without causing a stack blow-up. In the case that this stack blow-up behavior is intended, this can be equivalently represented with an allocation outside of the SCF.parallel with a size equal to the number of iterations. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D119743	2022-02-18 11:06:32 -05:00


5516 Commits