llvm-project

Commit Graph

Author	SHA1	Message	Date
Eugene Zhulenev	68a7c001ad	[mlir] Improve async parallel for tests + fix typos Do load and store to verify that we process each element of the iteration space once. Reviewed By: cota Differential Revision: https://reviews.llvm.org/D115152	2021-12-06 13:27:54 -08:00
Rob Suderman	c5fef77bc3	[mlir] Add CtPop to MathOps with lowering to LLVM math.ctpop maps to the llvm.ctpop intrinsic. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D114998	2021-12-06 11:54:20 -08:00
Alex Zinenko	d64b3e47ba	[mlir] Avoid needlessly converting LLVM named structs with compatible elements Conversion of LLVM named structs leads to them being renamed since we cannot modify the body of the struct type once it is set. Previously, this applied to all named struct types, even if their element types were not affected by the conversion. Make this behavior only applicable when element types are changed. This requires making the LLVM dialect type-compatibility check recursively look at the element types (arguably, it should have been doing that since the moment the LLVM dialect type system stopped being closed). In addition, have a more lax check for outer types only to avoid repeated checks when necessary (e.g., parser, verifiers that are going to also look at the inner type). Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D115037	2021-12-06 13:42:11 +01:00
Matthias Springer	e761c49a14	[mlir][linalg][bufferize][NFC] Utilize isWritable for FuncOps This is a cleanup of ModuleBufferization. Instead of storing information about writable function arguments in BufferizationAliasInfo, we can use isWritable and make the decision there, based on dialect-specific bufferization state. Differential Revision: https://reviews.llvm.org/D114930	2021-12-06 18:36:54 +09:00
Matthias Springer	e9fb4dc9e9	[mlir][linalg][bufferize] Remove buffer equivalence from bufferize Remove all function calls related to buffer equivalence from bufferize implementations. Add a new PostAnalysisStep for scf.for that ensures that yielded values are equivalent to the corresponding BBArgs. (This was previously checked in `bufferize`.) This will be relaxed in a subsequent commit. Note: This commit changes two test cases. These were broken by design and should not have passed. With the new scf.for PostAnalysisStep, this bug was fixed. Differential Revision: https://reviews.llvm.org/D114927	2021-12-06 17:48:31 +09:00
MaheshRavishankar	3ec6b1bfac	[mlir] Add default implementations for methods in `TilingInterface`. Adding the default implementation of `getLoopIteratorTypes` and `getLoopBounds` allows ExternalModels to override these methods. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D115101	2021-12-06 08:35:55 +00:00
Matthias Springer	cb4d0bf997	[mlir][linalg][bufferize][NFC] Collect equivalent FuncOp BBArgs in PostAnalysisStep Collect equivalent BBArgs right after the equivalence analysis of the FuncOp and before bufferizing. This is in preparation of decoupling bufferization from aliasInfo. Also gather equivalence info for CallOps, which was missing in the previous commit. Differential Revision: https://reviews.llvm.org/D114847	2021-12-06 17:31:39 +09:00
Michal Terepeta	caf89c0db6	[mlir][Vector] Support 0-D vectors in `ConstantMaskOp` To support creating a mask with just a single `true` or `false` value, I had to relax the restriction in the verifier that the rank is always equal to the length of the attribute array. In other words, we now allow: - `vector.constant_mask [0] : vector<i1>` which gets lowered to `arith.constant dense<false> : vector<i1>` - `vector.constant_mask [1] : vector<i1>` which gets lowered to `arith.constant dense<true> : vector<i1>` (the attribute list for the 0-D case must be a singleton containing either `0` or `1`) Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D115023	2021-12-06 08:03:04 +00:00
gysit	69bcff46bf	[mlir][linalg] Pad independent of application order (NFC). This revision makes the padding pattern independent of the application order. It addresses the concern that we cannot rely on the execution order of the greedy rewriter (https://reviews.llvm.org/D114689). Instead, the pattern is updated to apply repeatedly till all operations are padded. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114851	2021-12-06 07:26:15 +00:00
Mehdi Amini	afb0582325	Fix TOSA verifier to emit verbose errors Also add a test for invalid ops which was missing.	2021-12-05 19:16:54 +00:00
Butygin	91072b74f8	[mlir] Add InlinerInterface to bufferization dialect Differential Revision: https://reviews.llvm.org/D115080	2021-12-04 23:45:56 +03:00
Hugo Pompougnac	5d49511b30	Apply the permutation map on each affine nest When using -test-loop-permutation="permutation-map=...", applies the permutation map on each affine nest in the function (and not only the first one). If the size of the permutation map and the size of a nest are not consistent, do nothing on this particular nest (instead of making MLIR crash). Differential Revision: https://reviews.llvm.org/D112947	2021-12-04 17:48:34 +05:30
Chia-hung Duan	b8c6b15283	[mlir] Support collecting logs from notifyMatchFailure(). Let users register their own handler to process the match failure information. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D110896	2021-12-04 04:35:24 +00:00
Mehdi Amini	4022152b35	Use LLVM_ATTRIBUTE_UNUSED to silence warning for static function used in assert only (NFC)	2021-12-04 04:23:21 +00:00
Matthias Springer	5fa0b3561a	[mlir][linalg][bufferize] Implement equivalence analysis Instead of checking buffer equivalence during bufferization, gather buffer equivalence information right after the analysis. This is in preparation of decoupling bufferization from BufferizationAliasInfo. This change also fixes equivalence analysis for scf.if op results, which was not fully implemented. scf.if op results are equivalent to their corresponding yield values if both yield values are equivalent. Differential Revision: https://reviews.llvm.org/D114774	2021-12-04 11:52:04 +09:00
Uday Bondhugula	2108ed0671	[MLIR] Fix affine.for unroll for multi-result upper bound maps Fix affine.for unroll for multi-result upper bound maps: these can't be unrolled/unroll-and-jammed in cases where the trip count isn't known to be a multiple of the unroll factor. Fix and clean up repeated/unnecessary checks/comments at helper callees. Also, fix clang-tidy variable naming warnings and redundant includes. Differential Revision: https://reviews.llvm.org/D114662	2021-12-04 07:20:26 +05:30
Matthias Springer	9e42f2aa0b	[mlir][linalg][bufferize][NFC] Add inPlaceAnalysis overload Differential Revision: https://reviews.llvm.org/D114773	2021-12-04 10:41:57 +09:00
River Riddle	7169996159	[mlir] Allow shape dimensions larger than 2^32 Internally we use int64_t to hold shapes, but for some reason the parser was limiting shapes to unsigned. This change updates the parser to properly handle int64_t shape dimensions. Differential Revision: https://reviews.llvm.org/D115086	2021-12-04 01:29:50 +00:00
Uday Bondhugula	ecf458507e	[MLIR] Improve error message on missing getArgument() override on pass Improve error message while registering a pass with a missing getArgument() override. Differential Revision: https://reviews.llvm.org/D114744	2021-12-04 06:54:52 +05:30
Uday Bondhugula	d20249fde6	[MLIR] NFC. Rename test cases in test/mlir-cpu-runner per convention Test case files in most places in MLIR use hyphens and not underscores. A counter-pattern was somehow started to use underscores in some places. Rename test cases in test/mlir-cpu-runner to use hyphens so that it's consistent at least within its directory. Differential Revision: https://reviews.llvm.org/D114672	2021-12-04 06:53:39 +05:30
Matthias Springer	6db200736c	[mlir][linalg][bufferize][NFC] Use same OpBuilder throughout bufferization Also set insertion point right before calling `bufferize`. No need to put an InsertionGuard anymore. Differential Revision: https://reviews.llvm.org/D114928	2021-12-04 09:57:26 +09:00
Mehdi Amini	48fb79effb	Improve error message when declarativeAssembly contains invalid literals Differential Revision: https://reviews.llvm.org/D115085	2021-12-04 00:27:32 +00:00
wren romano	4748cc6931	[mlir][sparse] Adding a stress test Addresses https://bugs.llvm.org/show_bug.cgi?id=52410 Depends on D114192 Reviewed By: aartbik, mehdi_amini Differential Revision: https://reviews.llvm.org/D114118	2021-12-03 14:59:39 -08:00
natashaknk	e2d8b60742	Revert "[mlir][tosa] Add tosa.conv2d as fully_connected canonicalization" This reverts commit `13bdb7ab4a`. The commit introduced/uncovered an unintended bug in models containing Conv2D. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D115079	2021-12-03 14:35:48 -08:00
Matthias Springer	e359a1e548	[mlir][linalg][bufferize][NFC] Map only tensors in BufferizationState BufferizationState had map/lookup overloads for non-tensor values. This was necessary for IREE. There is now a better way to do this, so these overloads can be removed. Differential Revision: https://reviews.llvm.org/D114929	2021-12-03 23:07:09 +09:00
Matthias Springer	ed8c63115e	[mlir][linalg][bufferize][NFC] Provide default implementation of getAliasingOpOperand This simplifies op interface implementations. Differential Revision: https://reviews.llvm.org/D115025	2021-12-03 22:36:22 +09:00
Adrian Kuegel	04d083b19e	[mlir][NFC] Use const reference for loop variables.	2021-12-03 13:07:54 +01:00
Alex Zinenko	9dd1f8dfdd	[mlir] support recursive type conversion of named LLVM structs A previous commit added support for converting elemental types contained in LLVM dialect types in case they were not compatible with the LLVM dialect. It was missing support for named structs as they could be recursive, which was not supported by the conversion infra. Now that it is, add support for converting such named structs. Depends On D113579 Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D113580	2021-12-03 12:41:40 +01:00
Matthias Springer	5e1c038f7d	[mlir][linalg][bufferize][NFC] Move FuncOp boundary bufferization to ModuleBufferization Differential Revision: https://reviews.llvm.org/D114670	2021-12-03 20:29:39 +09:00
Matthias Springer	ad1ba42f68	[mlir][linalg][bufferize] Allow unbufferizable ops in input Allow ops that are not bufferizable in the input IR. (Deactivated by default.) bufferization::ToMemrefOp and bufferization::ToTensorOp are generated at the bufferization boundaries. Differential Revision: https://reviews.llvm.org/D114669	2021-12-03 20:20:46 +09:00
Matthias Springer	867cd948ac	[mlir][linalg][bufferize][NFC] Move BufferizationOptions to op interface Also store a reference to BufferizationOptions in BufferizationState. This is in preparation of adding support for partial bufferization. Differential Revision: https://reviews.llvm.org/D114661	2021-12-03 19:51:34 +09:00
Michal Terepeta	1423e8bf5d	[mlir][Vector] Support 0-D vectors in `BitCastOp` The implementation only allows to bit-cast between two 0-D vectors. We could probably support casting from/to vectors like `vector<1xf32>`, but I wasn't convinced that this would be important and it would require breaking the invariant that `BitCastOp` works only on vectors with equal rank. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114854	2021-12-03 08:55:59 +00:00
Michal Terepeta	8e2b373396	[mlir][Vector] Add some missing tests for `broadcast` and `splat` Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114853	2021-12-03 08:52:51 +00:00
Matthias Springer	d30fcadf07	[mlir][linalg][bufferize] Op interface implementation for Bufferization dialect ops This change provides `BufferizableOpInterface` implementations for ops from the Bufferization dialects. These ops are needed at the bufferization boundaries for partial bufferization. Differential Revision: https://reviews.llvm.org/D114618	2021-12-03 16:25:44 +09:00
Mehdi Amini	d2386ab6ad	Using make_unique instead of `new` (NFC) Fix a clang-tidy warning.	2021-12-03 01:53:42 +00:00
Mogball	29d990e439	[mlir][ods] update attr/type def format docs	2021-12-02 23:42:47 +00:00
Ulysse Beaugnon	e45705ad50	[MLIR] Use a shared uniquer for affine maps and integer sets. Affine maps and integer sets previously relied on a single lock for creating unique instances. In a multi-threaded setting, this lock becomes a contention point. This commit updates AffineMap and IntegerSet to use StorageUniquer instead. StorageUniquer internally uses sharded locks and thread-local caches to reduce contention. It is already used for affine expressions, types and attributes. On my local machine, this gives me a 5X speedup for an application that manipulates a lot of affine maps and integer sets. This commit also removes the integer set uniquer threshold. The threshold was used to avoid adding integer sets with a lot of constraints to the hash_map containing unique instances, but the constraints and the integer set were still allocated in the same allocator and never freed, thus not saving any space except for the hash-map entry. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D114942	2021-12-02 23:49:32 +01:00
Groverkss	d257f7c1bf	[MLIR][FlatAffineConstraints] Remove duplicate divisions while merging local ids This patch implements detecting duplicate local identifiers by extracting their division representation while merging local identifiers. For example, given the FACs A, B: ``` A: (x, y)[s0] : (exists d0 = [x / 4], d1 = [y / 4]: d0 <= s0, d1 <= s0, x + y >= 2) B: (x, y)[s0] : (exists d0 = [x / 4], d1 = [y / 4]: d0 <= s0, d1 <= s0, x + y >= 5) ``` The intersection of A and B without this patch would lead to the following FAC: ``` (x, y)[s0] : (exists d0 = [x / 4], d1 = [y / 4], d2 = [x / 4], d3 = [y / 4]: d0 <= s0, d1 <= s0, d2 <= s0, d3 <= s0, x + y >= 2, x + y >= 5) ``` After this patch, merging of local ids will detect that `d0 = d2` and `d1 = d3`, and the intersection of these two FACs will be (after removing duplicate constraints): ``` (x, y)[s0] : (exists d0 = [x / 4], d1 = [y / 4] : d0 <= s0, d1 <= s0, x + y >= 2, x + y >= 5) ``` This reduces the number of constraints by 2 (constraints) + 4 (2 constraints for each extra division) for this case. This is used to reduce the output size representation of operations like PresburgerSet::subtract, PresburgerSet::intersect which require merging local variables. Reviewed By: arjunp, bondhugula Differential Revision: https://reviews.llvm.org/D112867	2021-12-03 03:44:47 +05:30
Groverkss	cff427ee20	Revert changes that should have been sent as a patch Revert changes that were meant to be sent as a single commit with summary for the differential review, but were accidentally sent directly. This reverts commit `3bc5353fc6`.	2021-12-03 03:42:37 +05:30
Groverkss	c15724ab34	Address bondhugula's comments	2021-12-03 03:23:22 +05:30
Groverkss	d82a676227	Addressed comments	2021-12-03 03:23:22 +05:30
Groverkss	b912bf240e	Fix doc comment for mergeLocalIds.	2021-12-03 03:23:21 +05:30
Groverkss	76ad74a4a9	Address more comments.	2021-12-03 03:23:21 +05:30
Groverkss	a8b79d116a	Addressed more comments	2021-12-03 03:23:20 +05:30
Groverkss	1e0d7fd769	Fix asserts as suggested by Arjun	2021-12-03 03:23:20 +05:30
Groverkss	19352630c0	Fix clang-format errors	2021-12-03 03:23:19 +05:30
Groverkss	b8ea299628	Update docs	2021-12-03 03:23:19 +05:30
Groverkss	7f11dbec6e	Update tests for mergeLocalIds	2021-12-03 03:23:19 +05:30
Groverkss	8a0967481f	Address arjun's comments	2021-12-03 03:23:18 +05:30
Groverkss	c9cea1909f	Move division representation to a common function	2021-12-03 03:23:18 +05:30
Groverkss	985789ce0b	Update mergeLocalIds docs	2021-12-03 03:23:17 +05:30
Groverkss	06a119a3bd	Update docs for mergeLocalIds	2021-12-03 03:23:17 +05:30
Groverkss	3bc5353fc6	Implement division merging	2021-12-03 03:23:16 +05:30
Mogball	75dfeef9ad	[mlir][ods] fix defgen on empty files	2021-12-02 21:25:59 +00:00
Aart Bik	543924284f	[mlir][bufferization] fixed typo in to_memref doc Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D114824	2021-12-02 10:55:57 -08:00
Jacques Pienaar	86eb57b728	[mlir][drr] Simple heuristic to reduce chance of accidental nullptr dereference When an attribute is optional and is given an additional constraint in a rewrite pattern, this could lead to dereferencing a null Attribute. Avoid cases where the constraint checks the attribute but has no check for null. This should be improved to be more uniformly guarded.	2021-12-01 20:45:08 -08:00
Matthias Springer	4479138de8	[mlir][linalg][bufferize] Bufferization of tensor.insert This is a lightweight operation, useful for writing unit tests. It will be utilized for testing in subsequent commits. Differential Revision: https://reviews.llvm.org/D114693	2021-12-02 11:58:01 +09:00
Kazu Hirata	afe43e0713	[mlir] Remove extractVectorTypeFromShapedValue This patch fixes the build by removing extractVectorTypeFromShapedValue. The last use was removed Dec 1, 2021 in commit extractVectorTypeFromShapedValue.	2021-12-01 13:43:17 -08:00
Mogball	71668a9367	[mlir][ods][nfc] fixing test cases	2021-12-01 18:50:02 +00:00
Mogball	ecaad4a876	[mlir][ods][nfc] fix gcc-5 build	2021-12-01 18:34:59 +00:00
Mogball	ca6bd9cd43	[mlir][ods] AttrOrTypeGen uses Class AttrOrType def generator uses `Class` code gen helper, instead of naked raw_ostream. Depends on D113714 and D114807 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D113715	2021-12-01 16:53:23 +00:00
Nicolas Vasilache	c537a94334	[mlir][Vector] Thread 0-d vectors through vector.transfer ops This revision adds 0-d vector support to vector.transfer ops. In the process, numerous cleanups are applied, in particular around normalizing and reducing the number of builders. Reviewed By: ThomasRaoux, springerm Differential Revision: https://reviews.llvm.org/D114803	2021-12-01 16:49:43 +00:00
Stephan Herhut	9fce961d2f	[mlir][linalg] Disable tensor-matmul test under asan The test is currently leaky. Disabling it to make the bots green. Differential Revision: https://reviews.llvm.org/D114857	2021-12-01 16:25:31 +01:00
Stanislav Funiak	810b284918	Fixed a memory leak in the PDLToPDLInterp RootOrderingTest. RootOrderingTest is a low-level unit test that creates values and uses them as vertices in a directed graph. These vertices were created using `builder.create`, but never freed, due to my insufficient understanding of the MLIR infrastructure. Reviewed By: mehdi_amini, bondhugula, rriddle Differential Revision: https://reviews.llvm.org/D114745	2021-12-01 17:40:46 +05:30
Matthias Springer	2fd0ea960c	[mlir][linalg][bufferize] CallOps do not bufferize to memory writes However, since CallOps have no aliasing OpResults, their OpOperands always bufferize out-of-place. This change removes `bufferizesToMemoryWrite` from `CallOpInterface`. This method was called, but its return value did not matter. Differential Revision: https://reviews.llvm.org/D114616	2021-12-01 18:47:28 +09:00
Alexander Belyaev	3a6c4f307b	[mlir] Add a helper for TiledLoopOp to get an operand tied to the bbArg. Differential Revision: https://reviews.llvm.org/D114852	2021-12-01 09:32:00 +01:00
Thomas Raoux	69a8a7cf2d	[mlir] Make sure linearizeCollapsedDims doesn't drop input map dims The new affine map generated by linearizeCollapsedDims should not drop dimensions. We need to make sure we create a map with at least as many dimensions as the source map. This prevents FoldProducerReshapeOpByLinearization from generating invalid IR. This solves regression in IREE due to `e4e4da86af` Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D114838 This reverts commit `9a844c2a9b`.	2021-11-30 22:51:56 -08:00
MaheshRavishankar	9a844c2a9b	Revert "[mlir] Make sure linearizeCollapsedDims doesn't drop input map dims" This reverts commit `bc38673e4d`.	2021-11-30 22:43:46 -08:00
MaheshRavishankar	bc38673e4d	[mlir] Make sure linearizeCollapsedDims doesn't drop input map dims The new affine map generated by linearizeCollapsedDims should not drop dimensions. We need to make sure we create a map with at least as many dimensions as the source map. This prevents FoldProducerReshapeOpByLinearization from generating invalid IR. This solves regression in IREE due to `e4e4da86af` Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D114838	2021-11-30 22:37:53 -08:00
Jacques Pienaar	62fea88bc5	[mlir] Update accessors prefixed form (NFC)	2021-11-30 19:42:37 -08:00
Aart Bik	61e353e0b6	[mlir][sparse] added sparse out element wise mult integration test Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D114822	2021-11-30 16:44:38 -08:00
Aart Bik	fe0508dc9d	[mlir][sparse] fix typos in integration tests Reviewed By: bixia, wrengr Differential Revision: https://reviews.llvm.org/D114820	2021-11-30 15:32:20 -08:00
Stephen Neuendorffer	7386364889	Revert "[MLIR] Update Vector To LLVM conversion to be aware of assume_alignment" This reverts commit `29a50c5864`. After LLVM lowering, the original patch incorrectly moved alignment information across an unconstrained GEP operation. This is only correct for some index offsets in the GEP. It seems that the best approach is, in fact, to rely on LLVM to propagate information from the llvm.assume() to users. Thanks to Thomas Raoux for catching this.	2021-11-30 15:18:22 -08:00
Aart Bik	0e85232fa3	[mlir][sparse] refine simply dynamic sparse tensor outputs Proper test for sparse tensor outputs is a single condition throughout the whole tensor index expression (not a general conjunction, since this may include other conditions that cause cancellation). Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D114810	2021-11-30 13:45:58 -08:00
Nicolas Vasilache	a08b750ce9	[mlir][tensor] InsertSliceOp verification. This revision reintroduces tensor.insert_slice verification which seems to have vanished over time: a verifier was initially introduced in `cf9503c1b7` but for some reason the invalid.mlir was not properly updated; as time passed the verifier was not called anymore and later the code was deleted. As a consequence, a non-negligible portion of tests has run astray using invalid tensor.insert_slice semantics and needed to be fixed. Also, extract isRankReducedType from TensorOps for better reuse Originally, this facility was used by both tensor and memref forms but it got copied around as dialects were split. Differential Revision: https://reviews.llvm.org/D114715	2021-11-30 20:37:06 +00:00
MaheshRavishankar	311dd55c9e	[mlir][MemRef] Fix SubViewOp canonicalization when a subset of unit-dims are dropped. The canonical type of the result of the `memref.subview` needs to make sure that the previously dropped unit-dimensions are the ones dropped for the canonicalized type as well. This means the generic `inferRankReducedResultType` cannot be used. Instead the current dropped dimensions need to be queried and the same ones need to be dropped. Reviewed By: nicolasvasilache, ThomasRaoux Differential Revision: https://reviews.llvm.org/D114751	2021-11-30 20:37:06 +00:00
not-jenni	13bdb7ab4a	[mlir][tosa] Add tosa.conv2d as fully_connected canonicalization For a 1x1 weight and stride of 1, the input/weight can be reshaped and passed into a fully connected op then reshaped back Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D114757	2021-11-30 12:01:14 -08:00
gysit	c8f2139eb0	[mlir][linalg] Add decompose to CodegenStrategy. Add the decompose patterns that lower higher dimensional convolutions to lower dimensional ones to CodegenStrategy and use CodegenStrategy to test the decompose patterns. Additionally, remove the assertion that checks the anchor op name is set in the CodegenStrategyTest pass. Removing the assertion allows us to simplify the pipelines used in the interchange and decompose tests. Depends On D114797 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114798	2021-11-30 15:48:29 +00:00
gysit	98dbcff19c	[mlir][linalg] Adapt the decompose patterns to use a filter (NFC). The revision updates the convolution decomposition patterns to take a linalg transformation filter. The transformation filter allows using the patterns from CodegenStrategy in a later revision. Depends On D114690 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114797	2021-11-30 15:46:10 +00:00
gysit	316e627c2b	[mlir][linalg] Support the empty anchor op string when padding. Add support for an empty anchor op string in vectorization. An empty anchor op string is useful after fusion when there are multiple different operations to vectorize. Depends On D114689 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114690	2021-11-30 15:32:13 +00:00
gysit	7f7103cd06	[mlir][linalg] Use top down traversal for padding. Pad the operation using a top down traversal. The top down traversal unlocks folding opportunities and dim op canonicalizations due to the introduced extract slice operation after the padded operation. Depends On D114585 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114689	2021-11-30 15:30:45 +00:00
gysit	1ae7342a7d	[mlir][linalg] Fix windows build issue in hoist padding. Iterating backwardSlice and removing elements at the same time can fail on windows for specific build configurations (the code was introduced in https://reviews.llvm.org/D114420). This revision introduces a second vector to collect all operations and removes them after finishing the reverse iteration. Reviewed By: hpmorgan Differential Revision: https://reviews.llvm.org/D114775	2021-11-30 15:21:53 +00:00
gysit	914e72d400	[mlir][linalg] Run CSE after every CodegenStrategy transformation. Add CSE after every transformation. Transformations such as tiling introduce redundant computation, for example, one AffineMinOp for every operand dimension pair. Follow up transformations such as Padding and Hoisting benefit from CSE since comparing slice sizes simplifies to comparing SSA values instead of analyzing affine expressions. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114585	2021-11-30 15:07:51 +00:00
Alexander Belyaev	f910aa9105	[mlir] Fix BufferizationToMemRef build.	2021-11-30 13:10:54 +01:00
Julian Gross	ae1ea0bead	[mlir] Decompose Bufferization Clone operation into Memref Alloc and Copy. This patch introduces a new conversion to convert bufferization.clone operations into a memref.alloc and a memref.copy operation. This transformation is needed to transform all remaining clones which "survive" all previous transformations, before a given program is lowered further (to LLVM e.g.). Otherwise, these operations cannot be handled anymore and lead to compile errors. See: https://llvm.discourse.group/t/bufferization-error-related-to-memref-clone/4665 Differential Revision: https://reviews.llvm.org/D114233	2021-11-30 10:15:56 +01:00
Alexander Belyaev	f89bb3c012	[mlir] Move bufferization-related passes to `bufferization` dialect. [RFC](https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712) Differential Revision: https://reviews.llvm.org/D114698	2021-11-30 09:58:47 +01:00
gysit	0d0371f58f	[mlir][OpDSL] Fix OpDSL tests after https://reviews.llvm.org/D114680 . Update the shapes of the convolution / pooling tests that were detected after enabling verification during printing (https://reviews.llvm.org/D114680). Also split the emit_structured_generic.py file that previously contained all tests into multiple separate files to simplify debugging. Reviewed By: stellaraccident Differential Revision: https://reviews.llvm.org/D114731	2021-11-30 08:57:28 +00:00
Stella Laurenzo	a88bb5b9fe	[mlir][python] Audit and fix a lot of the Python pyi stubs. * Classes that are still todo are marked with "# TODO: Auto-generated. Audit and fix." * Those without this note have been cross-checked with C++ sources and most have been spot checked by hovering in VsCode. Differential Revision: https://reviews.llvm.org/D114767	2021-11-29 21:40:28 -08:00
Stella Laurenzo	bdc3183742	[mlir][python] Implement more SymbolTable methods. * set_symbol_name, get_symbol_name, set_visibility, get_visibility, replace_all_symbol_uses, walk_symbol_tables * In integrations I've been doing, I've been reaching for all of these to do both general IR manipulation and module merging. * I don't love the replace_all_symbol_uses underlying APIs since they necessitate SYMBOL_COUNT walks and have various sharp edges. I'm hoping that whatever emerges eventually for this can still retain this simple API as a one-shot. Differential Revision: https://reviews.llvm.org/D114687	2021-11-29 20:31:13 -08:00
Stella Laurenzo	a6e7d024a9	[mlir][python] Add pyi stub files to enable auto completion. There is no completely automated facility for generating stubs that are both accurate and comprehensive for native modules. After some experimentation, I found that MyPy's stubgen does the best at generating correct stubs with a few caveats that are relatively easy to fix: * Some types resolve to cross module symbols incorrectly. * staticmethod and classmethod signatures seem to always be completely generic and need to be manually provided. * It does not generate an __all__ which, from testing, causes namespace pollution to be visible to IDE code completion. As a first step, I did the following: * Ran `stubgen` for `_mlir.ir`, `_mlir.passmanager`, and `_mlirExecutionEngine`. * Manually looked for all instances where unnamed arguments were being emitted (i.e. as 'arg0', etc) and updated the C++ side to include names (and re-ran stubgen to get a good initial state). * Made/noted a few structural changes to each `pyi` file to make it minimally functional. * Added the `pyi` files to the CMake rules so they are installed and visible. To test, I added a `.env` file to the root of the project with `PYTHONPATH=...` set as per instructions. Then reload the developer window (in VsCode) and verify that completion works for various changes to test cases. There are still a number of overly generic signatures, but I want to check in this low-touch baseline before iterating on more ambiguous changes. This is already a big improvement. Differential Revision: https://reviews.llvm.org/D114679	2021-11-29 19:58:58 -08:00
Aart Bik	7d4da4e1ab	[mlir][sparse] generalize sparse tensor output implementation Moves sparse tensor output support forward by generalizing from injective insertions only to include reductions. This revision accepts the case with all parallel outer and all reduction inner loops, since that can be handled with an injective insertion still. Next revision will allow the inner parallel loop to move inward (but that will require "access pattern expansion" aka "workspace"). Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D114399	2021-11-29 16:15:53 -08:00
Aart Bik	52668355f4	[mlir][sparse] some leftover cleanup from migration to bufferization dialect Reviewed By: pifon2a Differential Revision: https://reviews.llvm.org/D114730	2021-11-29 12:46:01 -08:00
Benjamin Kramer	8d474f1d15	[mlir] Handle an edge case when folding reshapes with multiple trailing 1 dimensions We would exit early and miss this case. Differential Revision: https://reviews.llvm.org/D114711	2021-11-29 18:31:43 +01:00
Stephan Herhut	95f34e318c	[mlir][memref] Fix bug in verification of memref.collapse_shape The verifier computed an illegal type with negative dimension size when collapsing partially static memrefs. Differential Revision: https://reviews.llvm.org/D114702	2021-11-29 15:47:12 +01:00
Stella Laurenzo	ace1d0ad3d	[mlir][python] Normalize asm-printing IR behavior. While working on an integration, I found a lot of inconsistencies on IR printing and verification. It turns out that we were: * Only doing "soft fail" verification on IR printing of Operation, not of a Module. * Failed verification was interacting badly with binary=True IR printing (causing a TypeError trying to pass an `str` to a `bytes` based handle). * For systematic integrations, it is often desirable to control verification yourself so that you can explicitly handle errors. This patch: * Trues up the "soft fail" semantics by having `Module.__str__` delegate to `Operation.__str__` vs having a shortcut implementation. * Fixes soft fail in the presence of binary=True (and adds an additional happy path test case to make sure the binary functionality works). * Adds an `assume_verified` boolean flag to the `print`/`get_asm` methods which disables internal verification, presupposing that the caller has taken care of it. It turns out that we had a number of tests which were generating illegal IR but it wasn't being caught because they were doing a print on the `Module` vs operation. All except two were trivially fixed: * linalg/ops.py : Had two tests for direct constructing a Matmul incorrectly. Fixing them made them just like the next two tests so just deleted (no need to test the verifier only at this level). * linalg/opdsl/emit_structured_generic.py : Hand coded conv and pooling tests appear to be using illegal shaped inputs/outputs, causing a verification failure. I just used the `assume_verified=` flag to restore the original behavior and left a TODO. Will get someone who owns that to fix it properly in a followup (would also be nice to break this file up into multiple test modules as it is hard to tell exactly what is failing). Notes to downstreams: * If, like some of our tests, you get verification failures after this patch, it is likely that your IR was always invalid and you will need to fix the root cause. To temporarily revert to prior (broken) behavior, replace calls like `print(module)` with `print(module.operation.get_asm(assume_verified=True))`. Differential Revision: https://reviews.llvm.org/D114680	2021-11-28 18:02:01 -08:00
Nicolas Vasilache	f5a9bfdf8f	[mlir] NFC - Move invalid.mlir tests to the proper dialects	2021-11-28 21:30:40 +00:00
Chia-hung Duan	2afd16fe72	[mlir] Enable MLIRDialectUtilsTests Also remove `TooFewDims` test which tried to create an invalid AffineMap. The creation of an invalid AffineMap is rejected by `willBeValidAffineMap`, as a result we can deprecate the test. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D114657	2021-11-27 22:36:43 +00:00
Stanislav Funiak	a19e163526	Fixed broken build under GCC 5.4. This diff fixes broken build caused by D108550. Under GCC 5, auto lambdas that capture this require `this->` for member calls. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D114659	2021-11-27 09:03:27 +05:30
Kazu Hirata	803cec0268	[mlir] Fix a warning This patch fixes: mlir/lib/IR/MLIRContext.cpp:1020:3: error: use of the 'nodiscard' attribute is a C++17 extension [-Werror,-Wc++17-extensions]	2021-11-26 12:27:11 -08:00
Arnab Dutta	c2280b5517	[MLIR] Avoid creation of buggy affine maps when incorrect values of number of dimensions and number of symbols are provided. We check whether the maximum index of dimensional identifier present in the result expressions is less than dimCount (number of dimensional identifiers) argument passed in the AffineMap::get() and the maximum index of symbolic identifier present in the result expressions is less than symbolCount (number of symbolic identifiers) argument passed in AffineMap::get(). Reviewed By: nicolasvasilache, bondhugula Differential Revision: https://reviews.llvm.org/D114238	2021-11-27 00:37:08 +05:30
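The check described in this commit message can be sketched in a few lines. This is only an illustrative sketch and not the MLIR implementation: the tuple-based expression encoding and the names `max_position` and `will_be_valid_map` are hypothetical, chosen to mirror the wording above.

```python
# Hypothetical expression encoding (not MLIR's):
#   ("dim", i), ("sym", i), ("const", c), ("add", lhs, rhs), ("mul", lhs, rhs)

def max_position(expr, kind):
    """Return the largest index of identifiers of `kind` in `expr`, or -1."""
    tag = expr[0]
    if tag in ("dim", "sym"):
        return expr[1] if tag == kind else -1
    if tag == "const":
        return -1
    # Binary node: recurse into both operands.
    return max(max_position(expr[1], kind), max_position(expr[2], kind))


def will_be_valid_map(dim_count, symbol_count, results):
    """Accept a map only if every dim/symbol index used is below its count."""
    max_dim = max((max_position(e, "dim") for e in results), default=-1)
    max_sym = max((max_position(e, "sym") for e in results), default=-1)
    return max_dim < dim_count and max_sym < symbol_count


# d0 + d1 * s0 needs at least two dims and one symbol.
expr = ("add", ("dim", 0), ("mul", ("dim", 1), ("sym", 0)))
print(will_be_valid_map(2, 1, [expr]))
print(will_be_valid_map(1, 1, [expr]))
```

The second call is rejected because the expression refers to `d1` while only one dimension is declared.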
Arnab Dutta	e4e4da86af	[MLIR] Prevent creation of buggy affine map after linearizing collapsed dimensions of source map Initially we were passing the wrong numSymbols argument while calling AffineMap::get() for creating the affine map with linearized result expressions. The main problem was that the number of symbols of the newly created map may be different from that of the source map, as new symbolic identifiers may be introduced while creating strided layout linearized expressions. Reviewed By: nicolasvasilache, bondhugula Differential Revision: https://reviews.llvm.org/D114240	2021-11-27 00:32:58 +05:30
Chris Jones	344eee6f38	[MLIR] Allow `Idempotent` trait to be applied to binary ops. Add `Idempotent` trait to `arith.{andi,ori}`. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114574	2021-11-26 18:22:49 +00:00
Michal Terepeta	7e65fc9a60	[mlir][Vector] Support 0-D vectors in `BroadcastOp` This changes the op to produce `AnyVectorOfAnyRank` following mostly the code for 1-D vectors. Depends On D114598 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114550	2021-11-26 17:17:18 +00:00
Michal Terepeta	d0f927121e	[mlir][Standard] Support 0-D vectors in `SplatOp` This changes the op to produce `AnyVectorOfAnyRank` and implements this by just inserting the element (skipping the shuffle that we do for the 1-D case). Depends On D114549 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114598	2021-11-26 17:05:15 +00:00
Arjun P	ad34ce94d5	[MLIR] Simplex: fix a bug when rolling back a Simplex with no solutions Previously, when adding a constraint to a Simplex that is already marked as having no solutions (marked empty), the Simplex would be marked empty again, and a second UnmarkEmpty entry would be pushed to the undo log. When rolling back, Simplex should be unmarked empty only after rolling back past the creation of the first constraint that made it empty. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D114613	2021-11-26 22:33:48 +05:30
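The rollback behavior described above can be illustrated with a toy undo log. This is a minimal sketch under stated assumptions: `ToySimplex` and its methods are hypothetical stand-ins and do not reflect the real `Simplex` class. The only point shown is that an "unmark empty" entry is logged once, when the state first becomes empty.

```python
class ToySimplex:
    """Hypothetical stand-in that tracks only the 'empty' flag and an undo log."""

    def __init__(self):
        self.empty = False
        self.undo_log = []

    def snapshot(self):
        # A snapshot is just the current length of the undo log.
        return len(self.undo_log)

    def mark_empty(self):
        # The fix: if already empty, do not push a second "unmark_empty" entry.
        if self.empty:
            return
        self.empty = True
        self.undo_log.append("unmark_empty")

    def add_infeasible_constraint(self):
        # Assumption for the sketch: every constraint added here is infeasible.
        self.undo_log.append("remove_constraint")
        self.mark_empty()

    def rollback(self, snapshot):
        while len(self.undo_log) > snapshot:
            entry = self.undo_log.pop()
            if entry == "unmark_empty":
                self.empty = False


s = ToySimplex()
start = s.snapshot()
s.add_infeasible_constraint()   # first constraint makes the set empty
middle = s.snapshot()
s.add_infeasible_constraint()   # already empty, no second unmark entry
s.rollback(middle)
print(s.empty)                  # still empty: the first constraint remains
s.rollback(start)
print(s.empty)                  # no longer empty
```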
Arjun P	f074bbb04a	[MLIR] Simplex::pivot: also update the redundant rows when pivoting Previously, the pivot function would only update the non-redundant rows when pivoting. This is incorrect because in some cases, when rolling back past a `detectRedundant` call, the basis being used could be different from that which was used at the time of returning from the `detectRedundant` call. Therefore, it is important to update the redundant rows as well during pivots. This could also be triggered by pivots that occur when testing successive constraints for being redundant in `detectRedundant` after some initial constraints are marked redundant. Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D114614	2021-11-26 21:42:41 +05:30
Mats Petersson	30238c3676	[mlir][OpenMP] Add support for SIMD modifier Add support for SIMD modifier in OpenMP worksharing loops. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D111051	2021-11-26 14:04:46 +00:00
Matthias Springer	b62b21b980	[mlir][linalg][bufferize][NFC] InsertSliceOp no-copy detection as PostAnalysis There is special logic for InsertSliceOp to check if a memcpy is needed. This change extracts that piece of code and makes it a PostAnalysisStep. The purpose of this change is to untangle `bufferize` from BufferizationAliasInfo. (Not fully there yet.) Differential Revision: https://reviews.llvm.org/D114513	2021-11-26 22:19:29 +09:00
Benjamin Kramer	8521850f20	Provide a definition for OperationPosition::kDown This isn't necessary in C++17, but C++14 still requires it.	2021-11-26 14:11:59 +01:00
Benjamin Kramer	1b0312d280	[PDL] fix unused variable warning in Release builds	2021-11-26 14:11:58 +01:00
Stanislav Funiak	d35f119094	Added line numbers to the debug output of PDL bytecode. This is a small diff that splits out the debug output for PDL bytecode. When running bytecode with debug output on, it is useful to know the line numbers where the PDLInterp operations are performed. Usually, these are in a single MLIR file, so it's sufficient to print out the line number rather than the entire location (which tends to be quite verbose). This debug output is gated by `LLVM_DEBUG` rather than `#ifndef NDEBUG` to make it easier to test. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D114061	2021-11-26 18:11:37 +05:30
Stanislav Funiak	a76ee58f3c	Multi-root PDL matching using upward traversals. This is commit 4 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). This PR integrates the various components (root ordering algorithm, nondeterministic execution of PDL bytecode) to implement multi-root PDL matching. The main idea is for the pattern to specify multiple candidate roots. The PDL-to-PDLInterp lowering selects one of these roots and "hangs" the pattern from this root, traversing the edges downwards (from an operation to its operands) when possible and upwards (from values to their uses) when needed. The root is selected by invoking the optimal matching multiple times, once for each candidate root, and the connectors are determined from the optimal matching. The costs in the directed graph are equal to the number of upward edges that need to be traversed when connecting the given two candidate roots. It can be shown that, for this choice of the cost function, "hanging" the pattern from an inner node is no better than from the optimal root. The following four main additions were implemented as a part of this PR: 1. OperationPos predicate has been extended to allow tracing the operation accepting a value (the opposite of operation defining a value). 2. Predicate checking if two values are not equal - this is useful to ensure that we do not traverse the edge back downwards after we traversed it upwards. 3. Function for building the cost graph among the candidate roots. 4. Updated buildPredicateList, building the predicates once the optimal branching has been determined. Testing: unit tests (an integration test to follow once the stack of commits has landed) Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108550	2021-11-26 18:11:37 +05:30
Stanislav Funiak	6df7cc7f47	Implementation of the root ordering algorithm This is commit 3 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). We form a graph over the specified roots, provided in `pdl.rewrite`, where two roots are connected by a directed edge if the target root can be connected (via a chain of operations) in the underlying pattern to the source root. We place a restriction that the path connecting the two candidate roots must only contain the nodes in the subgraphs underneath these two roots. The cost of an edge is the smallest number of upward traversals (edges) required to go from the source to the target root, and the connector is a `Value` in the intersection of the two subtrees rooted at the source and target root that results in that smallest number of such upward traversals. Optimal root ordering is then formulated as the problem of finding a spanning arborescence (i.e., a directed spanning tree) of minimal weight. In order to determine the spanning arborescence (directed spanning tree) of minimum weight, we use the [Edmonds' algorithm](https://en.wikipedia.org/wiki/Edmonds%27_algorithm). The worst-case computational complexity of this algorithm is O(_N_^3) for a single root, where _N_ is the number of specified roots. The `pdl`-to-`pdl_interp` lowering calls this algorithm as a subroutine _N_ times (once for each candidate root), so the overall complexity of root ordering is O(_N_^4). If needed, this complexity could be reduced to O(_N_^3) with a more efficient algorithm. However, note that the underlying implementation is very efficient, and _N_ in our instances tends to be very small (<10). Therefore, we believe that the proposed (asymptotically suboptimal) implementation will suffice for now. 
Testing: a unit test of the algorithm Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108549	2021-11-26 18:11:37 +05:30
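The root-ordering formulation above can be sketched as follows. This is a compact textbook Chu-Liu/Edmonds contraction loop and not the code from this commit; the function names, the `(source, target, weight)` edge encoding, and the example weights are all assumptions made for illustration. `pick_root` mirrors the description of running the algorithm once per candidate root.

```python
def min_arborescence_cost(n, root, edges):
    """Weight of a minimum spanning arborescence rooted at `root`, or None.

    `edges` is a list of (source, target, weight) tuples over nodes 0..n-1.
    """
    inf = float("inf")
    total = 0
    while True:
        # Pick the cheapest incoming edge for every node.
        in_w = [inf] * n
        pre = [-1] * n
        for u, v, w in edges:
            if u != v and w < in_w[v]:
                in_w[v], pre[v] = w, u
        if any(v != root and in_w[v] == inf for v in range(n)):
            return None  # some node is unreachable from the root
        in_w[root] = 0
        # Detect cycles formed by the chosen edges.
        count = 0
        comp = [-1] * n
        seen = [-1] * n
        for i in range(n):
            total += in_w[i]
            v = i
            while seen[v] != i and comp[v] == -1 and v != root:
                seen[v] = i
                v = pre[v]
            if v != root and comp[v] == -1:
                u = pre[v]
                while u != v:
                    comp[u] = count
                    u = pre[u]
                comp[v] = count
                count += 1
        if count == 0:
            return total  # no cycles: the chosen edges form the arborescence
        # Contract each cycle into one node and reweight the remaining edges.
        for i in range(n):
            if comp[i] == -1:
                comp[i] = count
                count += 1
        edges = [(comp[u], comp[v], w - in_w[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        n, root = count, comp[root]


def pick_root(n, edges):
    """Try every candidate root and keep the cheapest one (first on ties)."""
    costs = [min_arborescence_cost(n, r, edges) for r in range(n)]
    best = min(range(n), key=lambda r: float("inf") if costs[r] is None else costs[r])
    return best, costs[best]


# Hypothetical cost graph over three candidate roots.
example = [(0, 1, 1), (1, 2, 1), (2, 0, 0), (0, 2, 5), (1, 0, 3), (2, 1, 4)]
print(pick_root(3, example))
```

In this example the per-root costs are 2, 1 and 1, so the sketch selects root 1 at cost 1.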
Stanislav Funiak	3eb1647af0	Introduced iterative bytecode execution. This is commit 2 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). This commit implements the features needed for the execution of the new operations pdl_interp.get_accepting_ops, pdl_interp.choose_op: 1. The implementation of the generation and execution of the two ops. 2. The addition of Stack of bytecode positions within the ByteCodeExecutor. This is needed because in pdl_interp.choose_op, we iterate over the values returned by pdl_interp.get_accepting_ops until we reach finalize. When we reach finalize, we need to return to the position marked in the stack. 3. The functionality to extend the lifetime of values that cross the nondeterministic choice. The existing bytecode generator allocates the values to memory positions by representing the liveness of values as a collection of disjoint intervals over the matcher positions. This is akin to register allocation, and substantially reduces the footprint of the bytecode executor. However, because execution "returns" back with the iterative operation pdl_interp.choose_op, any values whose original liveness crosses the nondeterministic choice must have their lifetime extended until finalize. Testing: pdl-bytecode.mlir test Reviewed By: rriddle, Mogball Differential Revision: https://reviews.llvm.org/D108547	2021-11-26 18:11:37 +05:30
Stanislav Funiak	842b6861c0	Defines new PDLInterp operations needed for multi-root matching in PDL. This is commit 1 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review). These operations are: * pdl_interp.get_accepting_ops: Returns a list of operations accepting the given value or a range of values at the specified position. Thus if there are two operations `%op1 = "foo"(%val)` and `%op2 = "bar"(%val)` accepting a value at position 0, `%ops = pdl_interp.get_accepting_ops of %val : !pdl.value at 0` will return both of them. This allows us to traverse upwards from a value to operations accepting the value. * pdl_interp.choose_op: Iteratively chooses one operation from a range of operations. Therefore, writing `%op = pdl_interp.choose_op from %ops` in the example above will select either `%op1` or `%op2`. Testing: Added the corresponding test cases to mlir/test/Dialect/PDLInterp/ops.mlir. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D108543	2021-11-26 17:59:22 +05:30
Tobias Gysi	8d07ba817c	[mlir][linalg] Simplify the hoist padding tests. Use primarily matvec instead of matmul to test hoist padding. Test the hoisting only starting from already padded IR. Use one-dimensional tiling only except for the tile_and_fuse test that exercises hoisting on a larger loop nest with fill and pad tensor operations in the backward slice. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114608	2021-11-26 07:40:22 +00:00
Michal Terepeta	c47108c041	[mlir][Vector] Minor formatting fixes in Vector.md Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D113854	2021-11-26 07:16:07 +00:00
Matthias Springer	8e2214aa60	[mlir][linalg][bufferize][NFC] Pass BufferizationState to PostAnalysisStep Pass BufferizationState instead of BufferizationAliasInfo. Note: BufferizationState contains BufferizationAliasInfo. Differential Revision: https://reviews.llvm.org/D114512	2021-11-26 11:46:14 +09:00
Matthias Springer	d62b4b08af	[mlir][linalg][bufferize] Compose dialect-specific bufferization state Use composition instead of inheritance for storing dialect-specific bufferization state. This is in preparation of adding "tensor dialect"-specific bufferization state. Differential Revision: https://reviews.llvm.org/D114508	2021-11-26 11:35:45 +09:00
Matthias Springer	c94b80b438	[mlir][linalg][bufferize][NFC] Allow returning arbitrary memrefs If `allowReturnMemref` is set to true, arbitrary memrefs may be returned from FuncOps. Also remove allocation hoisting code, which is only partly implemented at the moment. The purpose of this commit is to untangle `bufferize` from `aliasInfo`. (Even with this change, they are not fully untangled yet.) Differential Revision: https://reviews.llvm.org/D114507	2021-11-26 11:26:46 +09:00
Matthias Springer	c637e3ea9e	[mlir][linalg][bufferize][NFC] Extract func boundary bufferization Bufferization of function boundaries is extracted from ComprehensiveBufferize into a separate file. This will become its own build target in the future. Differential Revision: https://reviews.llvm.org/D114226	2021-11-26 10:25:36 +09:00
Matthias Springer	f32c3d9528	[mlir][linalg][bufferize][NFC] Move Affine interface impl to new build target This makes ComprehensiveBufferize entirely independent of the Affine dialect. Differential Revision: https://reviews.llvm.org/D114222	2021-11-26 09:27:47 +09:00
Mehdi Amini	850e8b4504	Fix link to the other docs from the Bufferization dialect	2021-11-26 00:13:32 +00:00
Michal Terepeta	cc311a155a	[mlir][Vector] Support 0-D vectors in `VectorPrintOpConversion` Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114549	2021-11-25 20:12:18 +00:00
Uday Bondhugula	c89fc1eec3	[MLIR] NFC. Rename MLIR CAPI ExecutionEngine target for consistency Rename MLIR CAPI ExecutionEngine target for consistency: MLIRCEXECUTIONENGINE -> MLIRCAPIExecutionEngine in line with other targets. Differential Revision: https://reviews.llvm.org/D114596	2021-11-26 00:23:17 +05:30
Tres Popp	6eca1957ee	Don't store nullptrs in mlir::FuncOp::getAll*Attrs' result These methods for results and arguments would create an ArrayRef full of nullptrs when there were no argument attributes. This is problematic because this result could not be passed to the FuncOp::build creator without causing a segfault. Now the list will have empty attributes. Differential Revision: https://reviews.llvm.org/D114358	2021-11-25 15:12:29 +01:00
seongwon bang	35c1e6ac1a	[MLIR] [docs] Fix misguided examples in memref.subview operation. The examples in `memref.subview` operation are misguided in that subview's strides operands mean "memref-rank number of strides that compose multiplicatively with the base memref strides in each dimension.". So the below examples should be changed from `Strides: [64, 4, 1]` to `Strides: [1, 1, 1]` Before changes ``` // Subview with constant offsets, sizes and strides. %1 = memref.subview %0[0, 2, 0][4, 4, 4][64, 4, 1] : memref<8x16x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2)> to memref<4x4x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2 + 8)> ``` After changes ``` // Subview with constant offsets, sizes and strides. %1 = memref.subview %0[0, 2, 0][4, 4, 4][1, 1, 1] : memref<8x16x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2)> to memref<4x4x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2 + 8)> ``` Also I fixed some syntax issues in docs related with memref layout map and added detailed explanation in subview rank reducing case. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D114500	2021-11-25 21:24:10 +09:00
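The multiplicative composition described in this commit message can be checked with a little arithmetic. The helper below is a hypothetical sketch, not an MLIR API: it multiplies the subview strides by the base strides and folds the offsets into a single linear offset.

```python
def subview_layout(base_strides, base_offset, offsets, strides):
    """Compose subview offsets/strides with the strides of the base memref."""
    result_strides = [s * b for s, b in zip(strides, base_strides)]
    result_offset = base_offset + sum(o * b for o, b in zip(offsets, base_strides))
    return result_strides, result_offset


# Base layout (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2), subview at [0, 2, 0].
print(subview_layout([64, 4, 1], 0, [0, 2, 0], [1, 1, 1]))
print(subview_layout([64, 4, 1], 0, [0, 2, 0], [64, 4, 1]))
```

With unit strides the result is `([64, 4, 1], 8)`, matching the `d0 * 64 + d1 * 4 + d2 + 8` layout in the corrected example. Passing `[64, 4, 1]` as subview strides would instead yield strides `[4096, 16, 1]`, which is why the original documentation example was wrong.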
Alexander Belyaev	57470abc41	[mlir] Move memref.[tensor_load\|buffer_cast\|clone] to "bufferization" dialect. https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712 Differential Revision: https://reviews.llvm.org/D114552	2021-11-25 11:50:39 +01:00
Tobias Gysi	43dc6d5d57	[mlir][linalg] Cleanup hoisting test (NFC). Rename the check prefixes to HOIST21 and HOIST32 to clarify the different flag configurations. Depends On D114438 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114442	2021-11-25 10:42:24 +00:00
Tobias Gysi	4b03906346	[mlir][linalg] Perform checks early in hoist padding. Instead of checking for unexpected operations (any operation with a region except for scf::For and `padTensorOp` or operations with a memory effect) while cloning the packing loop nest perform the checks early. Update `dropNonIndexDependencies` to check for unexpected operations. Additionally, check all of these operations have index type operands only. Depends On D114428 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114438	2021-11-25 10:37:12 +00:00
Tobias Gysi	fd723eaa92	[mlir][linalg] Limit hoist padding to constant paddings. Limit hoist padding to pad tensor ops that depend only on a constant value. Supporting arbitrary padding values that depend on computations part of the backward slice to hoist require complex analysis to ensure the computation can be hoisted. Depends On D114420 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114428	2021-11-25 10:31:39 +00:00
Tobias Gysi	ed7c1fb9b0	[mlir][linalg] Add backward slice filtering in hoist padding. Adapt hoist padding to filter the backward slice before cloning the packing loop nest. The filtering removes all operations that are not used to index the hoisted pad tensor op and its extract slice op. The filtering is needed to support the more complex loop nests created after fusion. For example, fusing the producer of an output operand can add linalg ops and pad tensor ops to the backward slice. These operations have regions and currently prevent hoisting. The following example demonstrates the effect of the newly introduced `dropNonIndexDependencies` method that filters the backward slice: ``` %source = linalg.fill(%cst, %arg0) scf.for %i %unrelated = linalg.fill(%cst, %arg1) // not used to index %source! scf.for %j (%arg2 = %unrelated) scf.for %k // not used to index %source! %ubi = affine.min #map(%i) %ubj = affine.min #map(%j) %slice = tensor.extract_slice %source [%i, %j] [%ubi, %ubj] %padded_slice = linalg.pad_tensor %slice ``` dropNonIndexDependencies(%padded_slice, %slice) removes [scf.for %k, linalg.fill(%cst, %arg1)] from backwardSlice. Depends On D114175 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114420	2021-11-25 10:30:10 +00:00
Matthias Springer	48107eaa07	[mlir][linalg][bufferize][NFC] Move SCF interface impl to new build target This makes ComprehensiveBufferize entirely independent of the SCF dialect. Differential Revision: https://reviews.llvm.org/D114221	2021-11-25 19:00:17 +09:00
Alexander Belyaev	3c228573bc	Revert "[mlir][SCF] Further simplify affine maps during `for-loop-canonicalization`" This reverts commit `ee1bf18672`. It breaks IREE lowering. Reverting the commit for now while we investigate what's going on.	2021-11-25 10:54:52 +01:00
Butygin	8dae0b6b6c	[mlir][spirv] arith::RemSIOp OpenCL lowering Differential Revision: https://reviews.llvm.org/D114524	2021-11-25 12:44:06 +03:00
Butygin	467acf3b6b	[mlir][spirv] Float atomics should not imply Shader Differential Revision: https://reviews.llvm.org/D114551	2021-11-25 12:07:28 +03:00
Matthias Springer	a5c2f78287	[mlir][interfaces] Add insideMutuallyExclusiveRegions helper Add a helper function to ControlFlowInterfaces for checking if two ops are in mutually exclusive regions according to RegionBranchOpInterface. Utilize this new helper in Linalg ComprehensiveBufferize. This makes the analysis independent of the SCF dialect and generalizes it to other ops that implement RegionBranchOpInterface. Differential Revision: https://reviews.llvm.org/D114220	2021-11-25 17:44:39 +09:00
Uday Bondhugula	25d173499e	[MLIR] Rename test/python/dialects/math.py -> math_dialect.py Rename test/python/dialects/math.py -> math_dialect.py to avoid a collision with a Python standard package of the same name. These test scripts are run by path and are not part of a package. Python apparently implicitly adds the containing directory to its PYTHONPATH. As such, test scripts with common names run the risk of conflicting with global names and resolution of an import for the latter happens to the former. Differential Revision: https://reviews.llvm.org/D114568	2021-11-25 09:51:49 +05:30
Matthias Springer	ee1bf18672	[mlir][SCF] Further simplify affine maps during `for-loop-canonicalization` * Implement `FlatAffineConstraints::getConstantBound(EQ)`. * Inject a simpler constraint for loops that have at most 1 iteration. * Taking into account constant EQ bounds of FlatAffineConstraint dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`. Differential Revision: https://reviews.llvm.org/D114138	2021-11-25 12:44:19 +09:00
Matthias Springer	8a8c655fe7	[mlir][SCF] Fix off-by-one bug in affine analysis This change is NFC. There were two issues when passing/reading upper bounds into/from FlatAffineConstraints that negate each other, so the bug was not apparent. However, it made debugging harder because some constraints in the FlatAffineConstraints were off by one when dumping all constraints. Differential Revision: https://reviews.llvm.org/D114137	2021-11-25 12:37:02 +09:00
Uday Bondhugula	23d505571d	[NFC] Improve debug message in getAsIntegerSet Improve debug message in getAsIntegerSet. Add missing trailing new line and position info. Differential Revision: https://reviews.llvm.org/D114511	2021-11-25 08:50:21 +05:30
Matthias Springer	d3bb4fec2a	[mlir][linalg][bufferize][NFC] Move arith interface impl to new build target This makes ComprehensiveBufferize entirely independent of the arith dialect. Differential Revision: https://reviews.llvm.org/D114219	2021-11-25 10:21:02 +09:00
bakhtiyar	7bd87a03fd	Promote readability by factoring out creation of min/max operation. Remove unnecessary divisions. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D110680	2021-11-24 16:17:23 -08:00
Lei Zhang	cb395f66ac	[mlir][spirv] Change the return type for {Min\|Max}VersionBase For synthesizing an op's implementation of the generated interface from {Min\|Max}Version, we need to define an `initializer` and `mergeAction`. The `initializer` specifies the initial version, and `mergeAction` specifies how version specifications from different parts of the op should be merged to generate the final version requirement. Previously we used the specified version enum as the type for both the initializer and the final return type. This means we need to perform `static_cast` over some hopefully unused number (`~0u`) as the initializer. This is quite opaque and not guaranteed to work. Also, there are ops that have an enum attribute where some values declare version requirements (e.g., enumerant `B` requires v1.1+) but some do not (e.g., enumerant `A` requires nothing). Then a concrete op instance with `A` will still declare it implements the version interface (because interface implementation is static for an op) but actually there are no requirements for version. So this commit changes to use a more explicit `llvm::Optional` to wrap around the returned version enum. This should make it clearer. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D108312	2021-11-24 17:33:01 -05:00
Tobias Gysi	b6e7b1be73	[mlir][linalg] Simplify padding test (NFC). The padding tests previously contained the tile loops. This revision removes the tile loops since padding itself does not consider the loops. Instead the induction variables are passed in as function arguments which promotes them to symbols in the affine expressions. Note that the pad-and-hoist.mlir test still exercises padding in the context of the full loop nest. Depends On D114175 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114227	2021-11-24 19:21:50 +00:00
Tobias Gysi	86f186efea	[mlir][linalg] Add makeComposedPadHighOp. Add the makeComposedPadHighOp method which creates a new PadTensorOp if necessary. If the source to pad is actually the result of a sequence of padded LinalgOps, the method checks if padding is needed or if we can use the padded result of the padded LinalgOp sequence directly. Example: ``` %0 = tensor.extract_slice %arg0 [%iv0, %iv1] [%sz0, %sz1] %1 = linalg.pad_tensor %0 low[0, 0] high[...] { linalg.yield %cst } %2 = linalg.matmul ins(...) outs(%1) %3 = tensor.extract_slice %2 [0, 0] [%sz0, %sz1] ``` when padding %3 return %2 instead of introducing ``` %4 = linalg.pad_tensor %3 low[0, 0] high[...] { linalg.yield %cst } ``` Depends On D114161 Reviewed By: nicolasvasilache, pifon2a Differential Revision: https://reviews.llvm.org/D114175	2021-11-24 19:18:59 +00:00
Tobias Gysi	a4fd8cb76f	[mlir][linalg] Update failure conditions for padOperandToSmallestStaticBoundingBox. Change the failure condition of padOperandToSmallestStaticBoundingBox to never fail if the operand is already statically sized. In particular: - if the padding value computation fails -> return failure if the operand shape is dynamic and success if it is static. - if there is no extract slice op -> return failure if the operand shape is dynamic and success if it is static. The latter change prevents padding from failure if the output operand passed by iteration argument is statically sized since in this case the extract / insert slice pairs are removed by canonicalization. Depends On D114153 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114161	2021-11-24 19:10:50 +00:00
Florian Hahn	fb46e64a01	Revert "[ThreadPool] Do not return shared futures." This reverts commit `a5fff58781`. The offending commit broke building with LLVM_ENABLE_THREADS=OFF.	2021-11-24 19:01:47 +00:00
MaheshRavishankar	0a58982b08	[mlir][Linalg] Remove alloc/dealloc pair as a callback. The alloc dealloc pair generation callback is really central to the bufferization algorithm, it modifies the state in a way that affects correctness. This is not really a configurable option. Moving it to BufferizationState removes what was probably the reason it was added as a callback. Differential Revision: https://reviews.llvm.org/D114417	2021-11-24 10:36:34 -08:00
Nicolas Vasilache	1cfa9b4d70	[mlir][Vector] NFC - Apply some clangd suggested fixes.	2021-11-24 15:55:58 +00:00
Matthias Springer	ca9d149e07	[mlir][linalg][bufferize][NFC] Move vector interface impl to new build target This makes ComprehensiveBufferize entirely independent of the vector dialect. Differential Revision: https://reviews.llvm.org/D114218	2021-11-24 19:36:12 +09:00
Matthias Springer	bb273a35a0	[mlir][linalg][bufferize][NFC] Move tensor interface impl to new build target This makes ComprehensiveBufferize entirely independent of the tensor dialect. Differential Revision: https://reviews.llvm.org/D114217	2021-11-24 18:25:17 +09:00
Butygin	7f5d9bf13a	[mlir][scf] Canonicalize scf.while with unused results Differential Revision: https://reviews.llvm.org/D114291	2021-11-24 11:11:22 +03:00
Bixia Zheng	02710413a3	Accept symmetric sparse matrix in Matrix Market Exchange Format. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D114402	2021-11-23 19:53:17 -08:00
Uday Bondhugula	8bd08a9fd7	[MLIR] Remove duplicate `Pass` suffix from ViewOpGraph class name Remove duplicate `Pass` suffix from view-op-graph pass class name. The extra suffix would lead to methods like registerViewOpGraphPassPass being generated. Differential Revision: https://reviews.llvm.org/D114459	2021-11-24 08:00:16 +05:30
wren romano	d7d7ffe254	[mlir][sparse] Adding wrappers for constantOverheadTypeEncoding Minor code cleanup Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D114392	2021-11-23 18:30:06 -08:00
Butygin	75a1bee05d	[mlir][spirv] Add math to OpenCL conversion Differential Revision: https://reviews.llvm.org/D113780	2021-11-24 02:31:21 +03:00
Rob Suderman	0f1e52afa9	[mlir][tosa] Materialize tosa.pad value and fold noop pads Padding now can explicitly specify the padding value when non-zero is wanted. This also includes bypassing pads when the pad does nothing. Differential Revision: https://reviews.llvm.org/D113611	2021-11-23 12:23:42 -08:00
Rob Suderman	54eec7cafc	[mlir][tosa] Separate tosa.transpose_conv decomposition and added stride support Transpose convolution decomposition is now performed in a separate pass. This allows padding / constant propagation to be performed at the TOSA level. It also adds support for striding when there is no dilation. Differential Revision: https://reviews.llvm.org/D114409	2021-11-23 12:16:44 -08:00
MaheshRavishankar	b57e2f071a	[mlir][Linalg] Add pad vectorization patterns into LinalgStrategyVectorize passes. Add an option to control whether these patterns are added to the pattern list or not. Differential Revision: https://reviews.llvm.org/D114290	2021-11-23 11:47:54 -08:00
wren romano	286248db2c	[mlir][sparse] Moving integration tests that merely use the Python API Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D114192	2021-11-23 10:59:38 -08:00
Nicolas Vasilache	3ff4e5f2a4	[mlir][Vector] Thread 0-d vectors through InsertElementOp. This revision makes concrete use of 0-d vectors to extend the semantics of InsertElementOp. Reviewed By: dcaballe, pifon2a Differential Revision: https://reviews.llvm.org/D114388	2021-11-23 12:55:11 +00:00
Nicolas Vasilache	e7026aba00	[mlir][Vector] Thread 0-d vectors through ExtractElementOp. This revision starts making concrete use of 0-d vectors to extend the semantics of ExtractElementOp. In the process a new VectorOfAnyRank Tablegen OpBase.td is added to allow progressive transition to supporting 0-d vectors by gradually opting in. Differential Revision: https://reviews.llvm.org/D114387	2021-11-23 12:39:44 +00:00
Matthias Springer	f24d9313cc	[mlir][linalg][bufferize][NFC] Specify bufferize traversal in `bufferize` The interface method `bufferize` controls how (and in what order) nested ops are traversed. This simplifies bufferization of scf::ForOps and scf::IfOps, which used to need special rules in scf::YieldOp. Differential Revision: https://reviews.llvm.org/D114057	2021-11-23 21:33:19 +09:00
Florian Hahn	a5fff58781	[ThreadPool] Do not return shared futures. The only user of returned futures from ThreadPool is llvm-reduce after D113857. There should be no cases where multiple threads wait on the same future, so there should be no need to return std::shared_future<>. Instead return plain std::future<>. If users need to share a future between multiple threads, they can share the futures themselves. Reviewed By: Meinersbur, mehdi_amini Differential Revision: https://reviews.llvm.org/D114363	2021-11-23 10:06:08 +00:00
Alexander Belyaev	c7cc70c8f8	Revert "Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td."" This reverts and fixes commit `de18b7dee6`.	2021-11-23 10:49:26 +01:00
Nicolas Vasilache	b2729fda60	[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm) This revision follows up on the conversation titled: ```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths``` The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation. This results in roughly 20% fewer cycles as reported by llvm-mca: After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted): ``` Iterations: 100 Instructions: 5900 Total Cycles: 2415 Total uOps: 7300 Dispatch Width: 6 uOps Per Cycle: 3.02 IPC: 2.44 Block RThroughput: 24.0 Cycles with backend pressure increase [ 89.90% ] Throughput Bottlenecks: Resource Pressure [ 89.65% ] - SKXPort1 [ 0.04% ] - SKXPort2 [ 12.42% ] - SKXPort3 [ 12.42% ] - SKXPort5 [ 89.52% ] Data Dependencies: [ 37.06% ] - Register Dependencies [ 37.06% ] - Memory Dependencies [ 0.00% ] ``` After this revision (inline_asm version, vblendps instructions are indeed emitted): ``` Iterations: 100 Instructions: 6300 Total Cycles: 2015 Total uOps: 7700 Dispatch Width: 6 uOps Per Cycle: 3.82 IPC: 3.13 Block RThroughput: 20.0 Cycles with backend pressure increase [ 83.47% ] Throughput Bottlenecks: Resource Pressure [ 83.18% ] - SKXPort0 [ 14.49% ] - SKXPort1 [ 14.54% ] - SKXPort2 [ 19.70% ] - SKXPort3 [ 19.70% ] - SKXPort5 [ 83.03% ] - SKXPort6 [ 14.49% ] Data Dependencies: [ 39.75% ] - Register Dependencies [ 39.75% ] - Memory Dependencies [ 0.00% ] ``` An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0). Differential Revision: https://reviews.llvm.org/D114393	2021-11-23 07:31:22 +00:00
Sandeep Dasgupta	e5a8c8c883	[mlir] Refactoring a few Parser APIs Refactored two new parser APIs, parseGenericOperationAfterOperands and parseCustomOperationName, out of parseGenericOperation and parseCustomOperation. Motivation: Sometimes an op can be printed in a special way if certain criteria are met. While parsing, we need to handle all the versions. `parseGenericOperationAfterOperands` is handy in situations where we have already parsed the operands and decide to fall back to default parsing. `parseCustomOperationName` is useful when we need to know details (dialect, operation name, etc.) about a parsed token meant to be an MLIR operation. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D113719	2021-11-23 06:11:01 +00:00
Matthias Springer	fb99686bfd	[mlir][linalg][bufferize] Limited support for scf.execute_region Add support for analysis only. Differential Revision: https://reviews.llvm.org/D114055	2021-11-23 12:20:39 +09:00
Matthias Springer	26c0dd83ab	[mlir][linalg][bufferize][NFC] Move helper function to op interface This is in preparation of changing the op traversal during bufferization. Differential Revision: https://reviews.llvm.org/D114040	2021-11-23 11:59:47 +09:00
Matthias Springer	8d0994ed21	[mlir][linalg][bufferize][NFC] Remove special casing of CallOps Differential Revision: https://reviews.llvm.org/D113966	2021-11-23 11:14:10 +09:00
Matthias Springer	b1083830d6	[mlir][linalg][bufferize][NFC] Clean up headers and function visibility Differential Revision: https://reviews.llvm.org/D113964	2021-11-23 10:29:26 +09:00
Benjamin Kramer	966b720983	[mlir][memref] Fix expanded shape ops memref.cast folding with changed type `memref.expand_shape` has verification logic to make sure result dim must be static if all the collapsing src dims are static. This can be relaxed once expand_shape supports more dynamism. Differential Revision: https://reviews.llvm.org/D114391	2021-11-22 22:56:15 +01:00
Christian Ulmann	f6718fc6d3	[mlir] FlatAffineConstraint parsing for unit tests This patch adds functionality to parse FlatAffineConstraints from a StringRef with the intention to be used for unit tests. This should make the construction of FlatAffineConstraints easier for testing purposes. The patch contains an example usage of the functionality in a unit test that uses FlatAffineConstraints. Reviewed By: bondhugula, grosser Differential Revision: https://reviews.llvm.org/D113275	2021-11-23 03:04:30 +05:30
Groverkss	98daa4e425	[MLIR] Fix incorrect removal of source loop in loop fusion This patch fixes a bug in the loop fusion pass where the source loop was removed even when the fused loop did not cover all iterations of the source loop. This was because the fast heuristic check for whether the source loop and fused loop have the same iterations did not take loop steps into account. Reviewed By: dcaballe, bondhugula Differential Revision: https://reviews.llvm.org/D114164	2021-11-23 02:54:09 +05:30
Alexander Belyaev	de18b7dee6	Revert "[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td." This reverts commit `3028bca6a9`. For some reason using FallbackModel works with CMake and does not work with bazel. Using `ExternalModel` works. I will check what's going on and resubmit tomorrow.	2021-11-22 21:35:20 +01:00
Alexander Belyaev	3028bca6a9	[mlir] Move AllocationOpInterface to Bufferize/IR/AllocationOpInterface.td. Remove the interface from op defs in MemRefOps.td and make it an external model. This is the first PR of many that will move bufferization-related ops, interfaces, passes to Dialect/Bufferize. RFC: https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712 It is still debated if the comprehensive bufferization has to be moved there as well, so for now I am just moving the "gradual" bufferization. Differential Revision: https://reviews.llvm.org/D114147	2021-11-22 21:00:59 +01:00
Mehdi Amini	e0b7bee7cf	Revert "[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)" This reverts commit `a9e236bed8`. This broke the Windows build: mlir\include\mlir/Dialect/X86Vector/Transforms.h(28): error C2061: syntax error: identifier 'uint'	2021-11-22 19:23:18 +00:00
Lei Zhang	93284120f2	[mlir][vector] Fix TransferOpReduceRank for 0-D tensors We cannot unconditionally generate memref.load ops for such cases; need to check the source's type. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114376	2021-11-22 12:30:46 -05:00
Alex Zinenko	9c5982ef8e	[mlir] support recursive types in type conversion infra MLIR supports recursive types but they could not be handled by the conversion infrastructure directly as it would result in infinite recursion in `convertType` for elemental types. Support this case by keeping the "call stack" of nested type conversions in the TypeConverter class and by passing it as an optional argument to the individual conversion callback. The callback can then check if a specific type is present on the stack more than once to detect and handle the recursive case. This approach is preferred to the alternative approach of having a separate callback dedicated to handling only the recursive case as the latter was observed to introduce ~3% time overhead on a 50MB IR file even if it did not contain recursive types. This approach is also preferred to keeping a local stack in type converters that need to handle recursive types as that would compose poorly in case of out-of-tree or cross-project extensions. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D113579	2021-11-22 18:16:02 +01:00
Arjun P	0512bf3540	[MLIR] PresburgerSetTest: fix comment and add a test case	2021-11-22 20:00:56 +05:30
Tobias Gysi	247a1a55eb	[mlir][linalg] Use getAsOpFoldResult in padding (NFC). After padding, we introduce an ExtractSliceOp to get the final unpadded result. This revision uses getAsOpFoldResult to compute the size of the unpadded result, which guarantees the result type has a partially static shape if some of the sizes of the unpadded result are statically known. At the moment, we rely on canonicalization to clean up the types after padding. Depends On D114085 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114153	2021-11-22 13:15:19 +00:00
Tobias Gysi	32c43241e7	[mlir][linalg] Always generate an extract/insert slice pair when tiling output tensors. Adapt tiling to always generate an extract/insert slice pair for output tensors even if the tensor is not tiled. Having an explicit extract/insert slice pair simplifies followup transformations such as padding and bufferization. In particular, it makes read and written iteration argument slices explicit. Depends On D114067 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114085	2021-11-22 13:12:43 +00:00
Tres Popp	106f307499	Rename MlirExecutionEngine lookup to lookupPacked The purpose of the change is to make clear whether the user is retrieving the original function or the wrapper function, in line with the invoke commands. This new functionality is useful for users that already have defined their own packed interface, so they do not want the extra layer of indirection, or for users wanting to look at the resulting primary function rather than the wrapper function. All locations, except the python bindings, now have a `lookupPacked` method that matches the original `lookup` functionality. `lookup` still exists, but with new semantics. - `lookup` returns the function with a given name. If `bool f(int,int)` is compiled, `lookup` will return a reference to `bool(f)(int,int)`. - `lookupPacked` returns the packed wrapper of the function with the given name. If `bool f(int,int)` is compiled, `lookupPacked` will return `void(mlir_f)(void**)`. Differential Revision: https://reviews.llvm.org/D114352	2021-11-22 14:12:09 +01:00
Tobias Gysi	f7751a3a42	[mlir][linalg] Remove tile and fuse test pass (NFC). Remove the tile and fuse test pass that has been replaced by codegen strategy. Depends On D114067 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114068	2021-11-22 12:33:31 +00:00
Nicolas Vasilache	050cc1cd6e	[mlir] Add InitializeNativeTargetAsmParser to ExecutionEngine. This is required to allow python to work with lowerings that use inline_asm. Differential Revision: https://reviews.llvm.org/D114338	2021-11-22 11:28:14 +00:00
Tobias Gysi	e3d386ea27	[mlir][linalg] Add a tile and fuse on tensors pattern. Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests. Depends On D114012 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114067	2021-11-22 11:13:21 +00:00
Nicolas Vasilache	789c88e80e	[mlir] Fix unintentional mutation by VectorType/RankedTensorType::Builder dropDim Differential Revision: https://reviews.llvm.org/D113933	2021-11-22 10:51:50 +00:00
Tobias Gysi	0ccc44cec0	[mlir][linalg] Fix tile and fuse for outermost reduction. Tile and fuse failed if the outermost tile loop is a reduction dimension. Add the necessary check to handle outermost reductions and introduce a test case to verify the change. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D114012	2021-11-22 10:44:15 +00:00
Nicolas Vasilache	a9e236bed8	[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm) This revision follows up on the conversation titled: ```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths``` The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation. This results in roughly 20% fewer cycles as reported by llvm-mca: After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted): ``` Iterations: 100 Instructions: 5900 Total Cycles: 2415 Total uOps: 7300 Dispatch Width: 6 uOps Per Cycle: 3.02 IPC: 2.44 Block RThroughput: 24.0 Cycles with backend pressure increase [ 89.90% ] Throughput Bottlenecks: Resource Pressure [ 89.65% ] - SKXPort1 [ 0.04% ] - SKXPort2 [ 12.42% ] - SKXPort3 [ 12.42% ] - SKXPort5 [ 89.52% ] Data Dependencies: [ 37.06% ] - Register Dependencies [ 37.06% ] - Memory Dependencies [ 0.00% ] ``` After this revision (inline_asm version, vblendps instructions are indeed emitted): ``` Iterations: 100 Instructions: 6300 Total Cycles: 2015 Total uOps: 7700 Dispatch Width: 6 uOps Per Cycle: 3.82 IPC: 3.13 Block RThroughput: 20.0 Cycles with backend pressure increase [ 83.47% ] Throughput Bottlenecks: Resource Pressure [ 83.18% ] - SKXPort0 [ 14.49% ] - SKXPort1 [ 14.54% ] - SKXPort2 [ 19.70% ] - SKXPort3 [ 19.70% ] - SKXPort5 [ 83.03% ] - SKXPort6 [ 14.49% ] Data Dependencies: [ 39.75% ] - Register Dependencies [ 39.75% ] - Memory Dependencies [ 0.00% ] ``` An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0). Reviewed By: ftynse, dcaballe Differential Revision: https://reviews.llvm.org/D114335	2021-11-22 10:32:34 +00:00
Arjun P	d92aabc336	[MLIR][NFC] Simplex: remove repeated words in comment	2021-11-22 15:50:03 +05:30
Jacques Pienaar	e5a4d0f149	[mlir] Fix unused function warning (NFC) Delete function no longer needed as all derived classes override printer.	2021-11-21 15:06:08 -08:00
Jacques Pienaar	6f9cceb775	[mlir] Move trait to InferTypeOpInterface Step towards removing the hard coded behavior for this trait and to instead use common interface. Differential Revision: https://reviews.llvm.org/D114208	2021-11-21 14:41:12 -08:00
Arjun P	ad48ef1e31	[MLIR][NFC] Simplex::restoreRow: improve documentation	2021-11-21 19:23:55 +05:30
Arnab Dutta	ec7b0d4d34	[MLIR] Simplify Semi-affine expressions by rule based matching and replacing "expr - q * (expr floordiv q)" with "expr mod q" expression. Add rule based matching for detecting and transforming "expr - q * (expr floordiv q)" to "expr mod q", where q is a symbolic expression, in simplifyAdd function. Reviewed By: bondhugula, dcaballe Differential Revision: https://reviews.llvm.org/D112985	2021-11-20 21:05:36 +05:30
Arnab Dutta	1f9ca5adba	[MLIR] Avoid creation of buggy affine maps while replacing dimension and symbol Previously, AffineMap::get() was called before the newly composed dimensions and symbols were appended to the dimension and symbol lists whose sizes are passed to it, resulting in wrong dimCount and symbolCount arguments. We move the call to AffineMap::get() after the dimension and symbol lists are updated. Differential Revision: https://reviews.llvm.org/D114237	2021-11-20 12:01:29 +05:30
Krzysztof Drewniak	a6f53afbcb	[MLIR][GPU] Link in device libraries during HSA compilation if needed To perform some operations, such as sin() or printf(), code compiled for AMD GPUs must be linked to a series of device libraries. This commit adds support for linking in these libraries. However, since these device libraries are delivered as LLVM bitcode, raising the possibility of version incompatibilities, this commit only links in libraries when the functions from those libraries are called by the code being compiled. This code also sets the math flags to their most conservative values, as MLIR doesn't have a `-ffast-math` equivalent. Depends on D114114 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114117	2021-11-19 22:29:37 +00:00
rdzhabarov	d729f4c38f	[mlir] Bug fix. Stream must outlive the pass manager. Bug fix. Stream must outlive the pass manager. Reviewed By: Chia-hungDuan Differential Revision: https://reviews.llvm.org/D114277	2021-11-19 21:45:43 +00:00
Krzysztof Drewniak	20f79f8caa	[MLIR][GPU] Make the path to ROCm a runtime option Our current build assumes that the path to ROCm we find at build time will be the path at which ROCm is located when the built code is executed. This commit adds a --rocm-path option to SerializeToHsaco, and removes the HIP dependency that the SerializeToHsaco previously had. Depends on D114113 (though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107) Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114114	2021-11-19 20:51:54 +00:00
Stella Laurenzo	3fcdd182e9	NFC: Callout restriction on folding 0-result ops in documentation. Differential Revision: https://reviews.llvm.org/D114271	2021-11-19 20:35:01 +00:00

9587 Commits