llvm-project

Commit Graph

Author	SHA1	Message	Date
River Riddle	eda6f907d2	[mlir][NFC] Shift a bunch of dialect includes from the .h to the .cpp Now that dialect constructors are generated in the .cpp file, we can drop all of the dependent dialect includes from the .h file. Differential Revision: https://reviews.llvm.org/D124298	2022-04-23 01:09:29 -07:00
Markus Böck	8ed2bd1e74	[mlir][LLVM] Fix `DataLayoutTypeInterface` for opaque pointers with non-default address space As a fallback mechanism, if no entry was supplied for a given address space, the size or alignment for a pointer type with the default address space is returned instead. This code currently crashes with opaque pointers, as it tries to construct a typed pointer type from the opaque pointer type, leading to a null pointer dereference when fetching the element type. This patch fixes the issue by handling the opaque pointer cases explicitly. Differential Revision: https://reviews.llvm.org/D124290	2022-04-23 00:10:31 +02:00
Markus Böck	bab3d3778d	[mlir][LLVM] Fix crash when using opaque pointers in function signatures Using opaque pointers in function signatures leads to an attempt to recursively convert all types, including sub types in LLVM types. In the case of LLVM pointers, it may not have a subtype aka element type if it is opaque which would then lead to a null pointer dereference. Differential Revision: https://reviews.llvm.org/D124291	2022-04-23 00:10:31 +02:00
Yi Zhang	1cddcfdc3c	Fix CollapsedLayoutMap for dim size 1 case This change fixes `CollapsedLayoutMap` for cases where the collapsed dims are size 1. The cases where the innermost dims are size 1 and noncontiguous can be represented by the strided form and therefore can be allowed. For such cases, the new stride should be that of the next entry in an association whose dimension is not size 1. If the next entry is dynamic, it's not possible to decide which stride to use at compilation time and the stride is set to dynamic. Differential Revision: https://reviews.llvm.org/D124137	2022-04-22 17:48:24 -04:00
Alex Zinenko	40a8bd635b	[mlir] use side effects in the Transform dialect Currently, the sequence of Transform dialect operations only supports a single use of each operand (verified by the `transform.sequence` operation). This was originally motivated by the need to guard against accessing a payload IR operation associated with a transform IR value after this operation has likely been rewritten by a transformation. However, not all Transform dialect operations rewrite payload IR, in particular the "navigation" operations such as `transform.pdl_match` do not. Introduce memory effects to the Transform dialect operations to describe their effect on the payload IR and the mapping between payload IR operations and transform IR values. Use these effects to replace the single-use rule, allowing repeated reads and disallowing use-after-free, where operations with the "free" effect are considered to "consume" the transform IR value and rewrite the corresponding payload IR operations. As an additional improvement, this enables code motion transformation on the transform IR itself. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D124181	2022-04-22 23:29:11 +02:00
Okwan Kwon	ee285faed2	[mlir] Do not bubble up extract slice when it is rank-reducing. The bubble up logic was written by assuming the slice operation is always a normal slice that outputs a tensor with the same rank. Differential Revision: https://reviews.llvm.org/D124283	2022-04-22 12:21:47 -07:00
cpillmayer	3e8560f890	[MLIR] Add option to print users of an operation as comment in the printer This allows printing the users of an operation as proposed in the git issue #53286. To be able to refer to operations with no result, these operations are assigned an ID in SSANameState. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D124048	2022-04-22 18:58:10 +00:00
Jacques Pienaar	9bae20b528	[mlir] Add shape.func Add shape func op for use (primarily) in shape function_library op. Allows setting default dialect for some simpler authoring. This is a minimal version of the ops needed. Differential Revision: https://reviews.llvm.org/D124055	2022-04-22 11:35:35 -07:00
Lei Zhang	6f28fd0bf7	[mlir][vector] Fold 1-element reduction into extract or arith ops If there is only one single element in the vector, then we can just extract the element to compute the final result. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D124129	2022-04-22 14:24:46 -04:00
Matthias Springer	d6dab38ae4	[mlir][bufferize][NFC] Add function boundary bufferization flag to BufferizationOptions This makes the API easier to use. Also allows us to check for incorrect API usage for easier debugging. Differential Revision: https://reviews.llvm.org/D124265	2022-04-23 01:11:37 +09:00
Lei Zhang	fc760c0260	[mlir][vector] Fold cancelling vector.shape_cast(vector.broadcast) vector.broadcast can inject all size one dimensions. If it's followed by a vector.shape_cast to the original type, we can cancel the op pair, like cancelling consecutive shape_cast ops. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D124094	2022-04-22 08:58:26 -04:00
Matthias Springer	e07a7fd5c0	[mlir][bufferization] Move ModuleBufferization to bufferization dialect * Move Module Bufferization to the bufferization dialect. The implementation is split into `OneShotModuleBufferize.cpp` and `FuncBufferizableOpInterfaceImpl.cpp`, so that the external model implementation can be easily moved to the func dialect in the future. * Split and clean up test cases. A few test cases are still remaining in Linalg and will be updated separately. * `linalg.inplaceable` is renamed to `bufferization.writable` to accurately reflect its current usage. * Attributes and their verifiers are moved from the Linalg dialect to the Bufferization dialect. * Expand documentation. * Add a new flag to One-Shot Bufferize to allow for function boundary bufferization. Differential Revision: https://reviews.llvm.org/D122229	2022-04-22 19:37:28 +09:00
Matthias Springer	bd1d87e3d1	[mlir][bufferization][NFC] Remove layout post processing step The layout postprocessing step was removed and is now part of the FuncOp bufferization. If the user specified a certain layout map for a tensor function arg, use that layout map directly when bufferizing the function signature. Previously, the bufferization used a generic layout map for every tensor function arg and then updated function signatures and CallOps in a separate step. Differential Revision: https://reviews.llvm.org/D122228	2022-04-22 18:49:47 +09:00
Matthias Springer	70777d967f	[mlir][bufferize][NFC] Move FuncOp bufferization to BufferizableOpInterface impl FuncOps are now less special. They must still be analyzed + bufferized in a certain order, but they are now bufferized same as other ops that have a region: Bufferize the op first (`bufferize` interface method), then bufferize the region body with other bufferization patterns. In the case of FuncOps, the function signature is bufferized together with ReturnOps. Similar to how, e.g., scf.for ops are bufferized together with scf.yield ops. This change is essentially a reimplementation of the FuncOp bufferization, but mostly NFC from a user's perspective (apart from error messages). This change is in preparation of moving the code to the bufferization dialect. Differential Revision: https://reviews.llvm.org/D123214	2022-04-22 18:47:12 +09:00
Matthias Springer	d820acdde1	[mlir][bufferize][NFC] Use custom walk instead of GreedyPatternRewriter The bufferization driver was previously using a GreedyPatternRewriter. This was problematic because bufferization must traverse ops top-to-bottom. The GreedyPatternRewriter was previously configured via `useTopDownTraversal`, but this was a hack; this API was just meant for performance improvements and should not affect the result of the rewrite. BEGIN_PUBLIC No public commit message needed. END_PUBLIC Differential Revision: https://reviews.llvm.org/D123618	2022-04-22 18:23:09 +09:00
jacquesguan	9b32886e7e	[mlir][Arithmetic] Use common constant fold function in RemSI and RemUI to cover splat. This patch replaces the current fold function with the common constant fold function in order to cover the case of a constant splat. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D124236	2022-04-22 09:20:18 +00:00
jacquesguan	abc17a6751	[mlir][Arithmetic] Use matchPattern to simplify code. This patch replaces some code with matchPattern and moves it before the constant folder function in order to avoid redundant invocations. Differential Revision: https://reviews.llvm.org/D124235	2022-04-22 08:42:51 +00:00
Adrian Kuegel	a74e5a89b9	[mlir] Move isGuaranteedCollapsible to CollapseShapeOp (NFC). It seems more natural than to have it as a static method of ExpandShapeOp. Also fix a typo ("the the" -> "the"). Differential Revision: https://reviews.llvm.org/D124234	2022-04-22 10:31:25 +02:00
Amy Zhuang	5bd4bcfc04	[mlir] Modify SuperVectorize to generate select op->combiner op Insert the select op before the combiner op when vectorizing a reduction loop that needs a mask, so the vectorized reduction loop can pass isLoopParallel check and be transformed correctly in later passes. Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D124047	2022-04-21 17:09:13 -07:00
Mahesh Ravishankar	0c090dcc8a	[mlir][Linalg] Deprecate legacy reshape + generic op folding patterns. These patterns have been superseded by the fusion by collapsing patterns. Differential Revision: https://reviews.llvm.org/D124145	2022-04-21 22:25:23 +00:00
Chris Lattner	31c8abc3f1	[AsmParser/Printer] Rework sourceloc support for function arguments. When Location tracking support for block arguments was added, we discussed various approaches to threading support for this through function-like argument parsing. At the time, we added a parallel array of locations that could hold this. It turns out that that approach was verbose and error prone, roughly no one adopted it. This patch takes a different approach, adding an optional source locator to the UnresolvedOperand class. This fits much more naturally into the standard structure we use for representing locators, and gives all the function like dialects locator support for free (e.g. see the test adding an example for the LLVM dialect). Differential Revision: https://reviews.llvm.org/D124188	2022-04-21 12:43:36 -07:00
Frederik Gossen	673e9828be	[MLIR] Fix iteration counting in greedy pattern application Previously, checking that a fix point is reached was counted as a full iteration. As this "iteration" never changes the IR, this seems counterintuitive. Differential Revision: https://reviews.llvm.org/D123641	2022-04-21 15:17:28 -04:00
Fangrui Song	ae46b3e01f	Revert D121279 "[MLIR][GPU] Add canonicalizer for gpu.memcpy" This reverts commit `12f55cac69`. Causes miscompile. Will follow up with a reproduce.	2022-04-21 08:55:13 -07:00
Alex Zinenko	30f22429d3	[mlir] Connect Transform dialect to PDL This introduces a pair of ops to the Transform dialect that connect it to PDL patterns. Transform dialect relies on PDL for matching the Payload IR ops that are about to be transformed. For this purpose, it provides a container op for patterns, a "pdl_match" op and transform interface implementations that call into the pattern matching infrastructure. To enable the caching of compiled patterns, this also provides the extension mechanism for TransformState. Extensions allow one to store additional information in the TransformState and thus communicate it between different Transform dialect operations when they are applied. They can be added and removed when applying transform ops. An extension containing a symbol table in which the pattern names are resolved and a pattern compilation cache is introduced as the first client. Depends On D123664 Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D124007	2022-04-21 16:23:10 +02:00
Markus Böck	a41aaf166f	[mlir] Make `Region`'s `cloneInto` multithread-readable Prior to this patch, `cloneInto` would do a simple walk over the blocks and contained operations and clone and map them as it encounters them. As a finishing touch it then remaps any successor and operands it has remapped during that process. This is generally fine, but sadly leads to a lot of uses of both operations and blocks from the source region, in the cloned operations in the target region. Those uses lead to writes in the use-def list of the operations, making `cloneInto` never thread safe. This patch reimplements `cloneInto` in three steps to avoid ever creating any extra uses on elements in the source region: * It first creates the mapping of all blocks and block operands * It then clones all operations to create the mapping of all operation results, but does not yet clone any regions or set the operands * After all operation results have been mapped, it now sets the operations' operands and clones their regions. That way it is now possible to call `cloneInto` from multiple threads if the Region or Operation is isolated-from-above. This allows creating copies of functions or using `mlir::inlineCall` with the same source region from multiple threads. In the general case, the method is thread-safe if through cloning, no new uses of `Value`s from outside the cloned Operation/Region are created. This can be ensured by mapping any outside operands via the `BlockAndValueMapping` to `Value`s owned by the caller thread. While I was at it, I also reworked the `clone` method of `Operation` a little bit and added a proper options class to avoid having a `cloneWithoutRegionsAndOperands` method, and be more extensible in the future. `cloneWithoutRegions` is now also a simple wrapper that calls `clone` with the proper options set. That way all the operation cloning code is now contained solely within `clone`. Differential Revision: https://reviews.llvm.org/D123917	2022-04-21 13:43:00 +02:00
Uday Bondhugula	f47a38f517	Add async dependencies support for gpu.launch op Add async dependencies support for gpu.launch op: this allows specifying a list of async tokens ("streams") as dependencies for the launch. Update the GPU kernel outlining pass lowering to propagate async dependencies from gpu.launch to gpu.launch_func op. Previously, a new stream was being created and destroyed for a kernel launch. The async deps support allows the kernel launch to be serialized on an existing stream. Differential Revision: https://reviews.llvm.org/D123499	2022-04-21 16:25:59 +05:30
Nimish Mishra	00c511b351	Added lowering support for atomic read and write constructs This patch adds lowering support for atomic read and write constructs. Also added is pointer modelling code to allow FIR pointer like types to be inferred and converted while lowering. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D122725 Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>	2022-04-21 12:19:13 +05:30
Shraiysh Vaishay	88bb2521b0	[mlir][OpenMP] Add checks and tests for hint clause and fix empty hint This patch handles empty hint value for critical and atomic constructs. This also adds checks and tests for hint clause on atomic constructs. Reviewed By: peixin, kiranchandramohan, NimishMishra Differential Revision: https://reviews.llvm.org/D123186	2022-04-21 07:31:03 +05:30
Matthias Springer	8544523dcb	[mlir][tensor] Promote extract(from_elements(...)) to folding pattern Differential Revision: https://reviews.llvm.org/D123617	2022-04-20 23:47:42 +09:00
gysit	407b351da2	[mlir][linalg] Add ods-gen helper to simplify the build methods. Add a helper used to implement the build methods generated by ods-gen. The change reduces code size and compilation time since all structured op builders use the same build method. The change reduces the LinalgOps.cpp compilation time from 10.2s to 9.8s (debug build). Depends On D123987 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D124003	2022-04-20 13:14:38 +00:00
gysit	17721b6915	[mlir][linalg] Avoid template methods for parsing and printing. The revision avoids template methods for parsing and printing that are replicated for every named operation. Instead, the new methods take a regionBuilder argument. The revision reduces the compile time of LinalgOps.cpp from 11.2 to 10.2 seconds (debug build). Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D123987	2022-04-20 13:06:31 +00:00
Uday Bondhugula	d7565de6cc	[MLIR] NFC. Drop trailing white space in GPU async ops print NFC. Drop trailing end of line white space in GPU async ops' printer whenever the list of async deps is empty. Reviewed By: mehdi_amini, rriddle Differential Revision: https://reviews.llvm.org/D123754	2022-04-20 17:56:53 +05:30
Uday Bondhugula	d423fc3724	Add RegionBranchOpInterface on affine.for op Add RegionBranchOpInterface on affine.for op so that transforms relying on RegionBranchOpInterface can support affine.for. E.g.: buffer-deallocation pass. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D123568	2022-04-20 17:46:07 +05:30
Matthias Springer	9235e597a4	[mlir][bufferize] Fix missing copies when writing to a buffer in a loop Writes into tensors that are defined outside of a repetitive region, but with the write happening inside of the repetitive region, were previously not considered conflicts. This was incorrect. E.g.: ``` %0 = ... : tensor<?xf32> scf.for ... { "reading_op"(%0) : tensor<?xf32> %1 = "writing_op"(%0) : tensor<?xf32> -> tensor<?xf32> ... } ``` In the above example, "writing_op" should be out-of-place. This commit fixes the bufferization for any op that declares its repetitive semantics via RegionBranchOpInterface.	2022-04-20 18:51:06 +09:00
jacquesguan	61baf2ffa7	[mlir][Vector] Add check of supported reduction kind for ScanOp. This patch adds check of supported reduction kind for ScanOp to avoid using and/or/xor for floating point type. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D123977	2022-04-20 02:42:19 +00:00
Mehdi Amini	8608ed1441	Apply clang-tidy fixes for llvm-twine-local in OpenMPToLLVMIRTranslation.cpp (NFC)	2022-04-20 00:39:10 +00:00
John Demme	6b0bed7ea5	[MLIR] [Python] Add a method to clear live operations map Introduce a method on PyMlirContext (and plumb it through to Python) to invalidate all of the operations in the live operations map and clear it. Since Python has no notion of private data, an end-developer could reach into some 3rd party API which uses the MLIR Python API (that is behaving correctly with regard to holding references) and grab a reference to an MLIR Python Operation, preventing it from being deconstructed out of the live operations map. This allows the API developer to clear the map when it calls C++ code which could delete operations, protecting itself from its users. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D123895	2022-04-19 15:14:09 -07:00
Krzysztof Drewniak	ddc2eb0ada	[mlir] Adds getUpperBound() to LoopLikeInterface. getUpperBound is analogous to getLowerBound(), except for the upper bound, and is used in range analysis. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D124020	2022-04-19 19:56:44 +00:00
Alex Zinenko	0eb403ad1b	[mlir][transform] Introduce transform.sequence op Sequence is an important transform combination primitive that just indicates transform ops being applied in a row. The simplest version fails immediately if any transformation in the sequence fails. Introducing this operation allows one to start placing transform IR within other IR. Depends On D123135 Reviewed By: Mogball, rriddle Differential Revision: https://reviews.llvm.org/D123664	2022-04-19 21:41:02 +02:00
Ashay Rane	25c218be36	[MLIR] Add function to create BFloat16 array attribute This patch adds a new function `mlirDenseElementsAttrBFloat16Get()`, which accepts the shaped type, the number of BFloat16 values, and a pointer to an array of BFloat16 values, each of which is a `uint16_t` value. Reviewed By: stellaraccident Differential Revision: https://reviews.llvm.org/D123981	2022-04-19 19:27:06 +00:00
Mehdi Amini	83892d76f4	Print custom assembly on pass failure by default The printer is now resilient to invalid IR and will already automatically fallback to the generic form on invalid IR. Using the generic printer on pass failure was a conservative option before the printer was made failsafe. Reviewed By: lattner, rriddle, jpienaar, bondhugula Differential Revision: https://reviews.llvm.org/D123915	2022-04-19 17:29:08 +00:00
Mehdi Amini	2d6335421f	Apply clang-tidy fixes for llvm-qualified-auto in OpenMPToLLVMIRTranslation.cpp (NFC)	2022-04-19 17:20:57 +00:00
Mehdi Amini	f9735be7e2	Apply clang-tidy fixes for performance-unnecessary-value-param in ControlFlowInterfaces.cpp (NFC)	2022-04-19 17:20:57 +00:00
Arnab Dutta	12f55cac69	[MLIR][GPU] Add canonicalizer for gpu.memcpy Fold away gpu.memcpy op when only uses of dest are the memcpy op in question, its allocation and deallocation ops. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D121279	2022-04-19 17:54:00 +05:30
Marius Brehler	2ba865903d	[mlir][emitc] Add test for invalid type Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D123503	2022-04-19 11:03:56 +02:00
Matthias Springer	a3005a406e	[mlir][interfaces] Fix infinite loop in insideMutuallyExclusiveRegions This function was missing a termination condition.	2022-04-19 16:28:52 +09:00
Mehdi Amini	4e01184ad5	Apply clang-tidy fixes for performance-unnecessary-value-param in JitRunner.cpp (NFC)	2022-04-19 07:23:12 +00:00
Mehdi Amini	722a3a58e2	Apply clang-tidy fixes for performance-for-range-copy in MemRefOps.cpp (NFC)	2022-04-19 07:23:12 +00:00
Matthias Springer	0f4ba02db3	[mlir][interfaces] Add helpers for detecting recursive regions Add helper functions to check if an op may be executed multiple times based on RegionBranchOpInterface. Differential Revision: https://reviews.llvm.org/D123789	2022-04-19 16:13:32 +09:00
Michael Kruse	2d92ee97f1	Reapply "[OpenMP] Refactor OMPScheduleType enum." This reverts commit `af0285122f`. The test "libomp::loop_dispatch.c" on builder openmp-gcc-x86_64-linux-debian fails from time-to-time. See #54969. This patch is unrelated.	2022-04-18 21:56:47 -05:00

8208 Commits