Commit Graph

8208 Commits

Author SHA1 Message Date
River Riddle eda6f907d2 [mlir][NFC] Shift a bunch of dialect includes from the .h to the .cpp
Now that dialect constructors are generated in the .cpp file, we can
drop all of the dependent dialect includes from the .h file.

Differential Revision: https://reviews.llvm.org/D124298
2022-04-23 01:09:29 -07:00
Markus Böck 8ed2bd1e74 [mlir][LLVM] Fix `DataLayoutTypeInterface` for opqaue pointers with non-default address space
As a fallback mechanism, if no entry was supplied for a given address space, the size or alignment for a pointer type with the default address space is returned instead.
This code currently crashes with opaque pointers, as it tries to construct a typed pointer type from the opaque pointer type, leading to a null pointer dereference when fetching the element type.

This patch fixes the issue by handling the opaque pointer cases explicitly.

Differential Revision: https://reviews.llvm.org/D124290
2022-04-23 00:10:31 +02:00
Markus Böck bab3d3778d [mlir][LLVM] Fix crash when using opaque pointers in function signatures
Using opaque pointers in function signatures leads to an attempt to recursively convert all types, including sub types in LLVM types. In the case of LLVM pointers, it may not have a subtype aka element type if it is opaque which would then lead to a null pointer dereference.

Differential Revision: https://reviews.llvm.org/D124291
2022-04-23 00:10:31 +02:00
Yi Zhang 1cddcfdc3c Fix CollapsedLayoutMap for dim size 1 case
This change fixes `CollapsedLayoutMap` for cases where the collapsed
dims are size 1. The cases where inner most dims are size 1 and
noncontiguous can be represented by the strided form and therefore can
be allowed. For such cases, the new stride should be of the next entry
in an association whose dimension is not size 1. If the next entry is
dynamic, it's not possible to decide which stride to use at compilation
time and the stride is set to dynamic.

Differential Revision: https://reviews.llvm.org/D124137
2022-04-22 17:48:24 -04:00
Alex Zinenko 40a8bd635b [mlir] use side effects in the Transform dialect
Currently, the sequence of Transform dialect operations only supports a single
use of each operand (verified by the `transform.sequence` operation). This was
originally motivated by the need to guard against accessing a payload IR
operation associated with a transform IR value after this operation has likely
been rewritten by a transformation. However, not all Transform dialect
operations rewrite payload IR, in particular the "navigation" operation such as
`transform.pdl_match` do not.

Introduce memory effects to the Transform dialect operations to describe their
effect on the payload IR and the mapping between payload IR opreations and
transform IR values. Use these effects to replace the single-use rule, allowing
repeated reads and disallowing use-after-free, where operations with the "free"
effect are considered to "consume" the transform IR value and rewrite the
corresponding payload IR operations). As an additional improvement, this
enables code motion transformation on the transform IR itself.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124181
2022-04-22 23:29:11 +02:00
Okwan Kwon ee285faed2 [mlir] Do not bubble up extract slice when it is rank-reducing.
The bubble up logic was written by assuming the slice operation is
always a normal slice that outputs a tensor with the same rank.

Differential Revision: https://reviews.llvm.org/D124283
2022-04-22 12:21:47 -07:00
cpillmayer 3e8560f890 [MLIR] Add option to print users of an operation as comment in the printer
This allows printing the users of an operation as proposed in the git issue #53286.
To be able to refer to operations with no result, these operations are assigned an
ID in SSANameState.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D124048
2022-04-22 18:58:10 +00:00
Jacques Pienaar 9bae20b528 [mlir] Add shape.func
Add shape func op for use (primarily) in shape function_library op. Allows
setting default dialect for some simpler authoring. This is a minimal version
of the ops needed.

Differential Revision: https://reviews.llvm.org/D124055
2022-04-22 11:35:35 -07:00
Lei Zhang 6f28fd0bf7 [mlir][vector] Fold 1-element reduction into extract or arith ops
If there is only one single element in the vector, then we can
just extract the element to compute the final result.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D124129
2022-04-22 14:24:46 -04:00
Matthias Springer d6dab38ae4 [mlir][bufferize][NFC] Add function boundary bufferization flag to BufferizationOptions
This makes the API easier to use. Also allows us to check for incorrect API usage for easier debugging.

Differential Revision: https://reviews.llvm.org/D124265
2022-04-23 01:11:37 +09:00
Lei Zhang fc760c0260 [mlir][vector] Fold cancelling vector.shape_cast(vector.broadcast)
vector.broadcast can inject all size one dimensions. If it's
followed by a vector.shape_cast to the original type, we can
cancel the op pair, like cancelling consecutive shape_cast ops.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D124094
2022-04-22 08:58:26 -04:00
Matthias Springer e07a7fd5c0 [mlir][bufferization] Move ModuleBufferization to bufferization dialect
* Move Module Bufferization to the bufferization dialect. The implementation is split into `OneShotModuleBufferize.cpp` and `FuncBufferizableOpInterfaceImpl.cpp`, so that the external model implementation can be easily moved to the func dialect in the future.
* Split and clean up test cases. A few test cases are still remaining in Linalg and will be updated separately.
* `linalg.inplaceable` is renamed to `bufferization.writable` to accurately reflect its current usage.
* Attributes and their verifiers are moved from the Linalg dialect to the Bufferization dialect.
* Expand documentation.
* Add a new flag to One-Shot Bufferize to allow for function boundary bufferization.

Differential Revision: https://reviews.llvm.org/D122229
2022-04-22 19:37:28 +09:00
Matthias Springer bd1d87e3d1 [mlir][bufferization][NFC] Remove layout post processing step
The layout postprocessing step was removed and is now part of the FuncOp bufferization. If the user specified a certain layout map for a tensor function arg, use that layout map directly when bufferizing the function signature. Previously, the bufferization used a generic layout map for every tensor function arg and then updated function signatures and CallOps in a separate step.

Differential Revision: https://reviews.llvm.org/D122228
2022-04-22 18:49:47 +09:00
Matthias Springer 70777d967f [mlir][bufferize][NFC] Move FuncOp bufferization to BufferizableOpInterface impl
FuncOps are now less special. They must still be analyzed + bufferized in a certain order, but they are now bufferized same as other ops that have a region: Bufferize the op first (`bufferize` interface method), then bufferize the region body with other bufferization patterns. In the case of FuncOps, the function signature is bufferized together with ReturnOps. Similar to how, e.g., scf.for ops are bufferized together with scf.yield ops.

This change is essentially a reimplementation of the FuncOp bufferization, but mostly NFC from a user's perspective (apart from error messages). This change is in preparation of moving the code to the bufferization dialect.

Differential Revision: https://reviews.llvm.org/D123214
2022-04-22 18:47:12 +09:00
Matthias Springer d820acdde1 [mlir][bufferize][NFC] Use custom walk instead of GreedyPatternRewriter
The bufferization driver was previously using a GreedyPatternRewriter. This was problematic because bufferization must traverse ops top-to-bottom. The GreedyPatternRewriter was previously configured via `useTopDownTraversal`, but this was a hack; this API was just meant for performance improvements and should not affect the result of the rewrite.

BEGIN_PUBLIC
No public commit message needed.
END_PUBLIC

Differential Revision: https://reviews.llvm.org/D123618
2022-04-22 18:23:09 +09:00
jacquesguan 9b32886e7e [mlir][Arithmetic] Use common constant fold function in RemSI and RemUI to cover splat.
This patch replaces current fold function with the common constant fold funtion in order to cover the situation of constant splat.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D124236
2022-04-22 09:20:18 +00:00
jacquesguan abc17a6751 [mlir][Arithmetic] Use matchPattern to simplify code.
This patch replaces some code with matchPattern and move them before the constant folder function in order to avoid redundant invoking.

Differential Revision: https://reviews.llvm.org/D124235
2022-04-22 08:42:51 +00:00
Adrian Kuegel a74e5a89b9 [mlir] Move isGuaranteedCollapsible to CollapseShapeOp (NFC).
It seems more natural than to have it as a static method of ExpandShapeOp.
Also fix a typo ("the the" -> "the").

Differential Revision: https://reviews.llvm.org/D124234
2022-04-22 10:31:25 +02:00
Amy Zhuang 5bd4bcfc04 [mlir] Modify SuperVectorize to generate select op->combiner op
Insert the select op before the combiner op when vectorizing a
reduction loop that needs a mask, so the vectorized reduction loop
can pass isLoopParallel check and be transformed correctly in later
passes.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D124047
2022-04-21 17:09:13 -07:00
Mahesh Ravishankar 0c090dcc8a [mlir][Linalg] Deprecate legacy reshape + generic op folding patterns.
These patterns have been superceded by the fusion by collapsing patterns.

Differential Revision: https://reviews.llvm.org/D124145
2022-04-21 22:25:23 +00:00
Chris Lattner 31c8abc3f1 [AsmParser/Printer] Rework sourceloc support for function arguments.
When Location tracking support for block arguments was added, we
discussed various approaches to threading support for this through
function-like argument parsing.  At the time, we added a parallel array
of locations that could hold this.  It turns out that that approach was
verbose and error prone, roughly no one adopted it.

This patch takes a different approach, adding an optional source
locator to the UnresolvedOperand class.  This fits much more naturally
into the standard structure we use for representing locators, and gives
all the function like dialects locator support for free (e.g. see the
test adding an example for the LLVM dialect).

Differential Revision: https://reviews.llvm.org/D124188
2022-04-21 12:43:36 -07:00
Frederik Gossen 673e9828be [MLIR] Fix iteration counting in greedy pattern application
Previously, checking that a fix point is reached was counted as a full
iteration. As this "iteration" never changes the IR, this seems counter-
intuitive.

Differential Revision: https://reviews.llvm.org/D123641
2022-04-21 15:17:28 -04:00
Fangrui Song ae46b3e01f Revert D121279 "[MLIR][GPU] Add canonicalizer for gpu.memcpy"
This reverts commit 12f55cac69.

Causes miscompile. Will follow up with a reproduce.
2022-04-21 08:55:13 -07:00
Alex Zinenko 30f22429d3 [mlir] Connect Transform dialect to PDL
This introduces a pair of ops to the Transform dialect that connect it to PDL
patterns. Transform dialect relies on PDL for matching the Payload IR ops that
are about to be transformed. For this purpose, it provides a container op for
patterns, a "pdl_match" op and transform interface implementations that call
into the pattern matching infrastructure.

To enable the caching of compiled patterns, this also provides the extension
mechanism for TransformState. Extensions allow one to store additional
information in the TransformState and thus communicate it between different
Transform dialect operations when they are applied. They can be added and
removed when applying transform ops. An extension containing a symbol table in
which the pattern names are resolved and a pattern compilation cache is
introduced as the first client.

Depends On D123664

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124007
2022-04-21 16:23:10 +02:00
Markus Böck a41aaf166f [mlir] Make `Regions`s `cloneInto` multithread-readable
Prior to this patch, `cloneInto` would do a simple walk over the blocks and contained operations and clone and map them as it encounters them. As finishing touch it then remaps any successor and operands it has remapped during that process.

This is generally fine, but sadly leads to a lot of uses of both operations and blocks from the source region, in the cloned operations in the target region. Those uses lead to writes in the use-def list of the operations, making `cloneInto` never thread safe.

This patch reimplements `cloneInto` in three steps to avoid ever creating any extra uses on elements in the source region:
* It first creates the mapping of all blocks and block operands
* It then clones all operations to create the mapping of all operation results, but does not yet clone any regions or set the operands
* After all operation results have been mapped, it now sets the operations operands and clones their regions.

That way it is now possible to call `cloneInto` from multiple threads if the Region or Operation is isolated-from-above. This allows creating copies of  functions or to use `mlir::inlineCall` with the same source region from multiple threads. In the general case, the method is thread-safe if through cloning, no new uses of `Value`s from outside the cloned Operation/Region are created. This can be ensured by mapping any outside operands via the `BlockAndValueMapping` to `Value`s owned by the caller thread.

While I was at it, I also reworked the `clone` method of `Operation` a little bit and added a proper options class to avoid having a `cloneWithoutRegionsAndOperands` method, and be more extensible in the future. `cloneWithoutRegions` is now also a simple wrapper that calls `clone` with the proper options set. That way all the operation cloning code is now contained solely within `clone`.

Differential Revision: https://reviews.llvm.org/D123917
2022-04-21 13:43:00 +02:00
Uday Bondhugula f47a38f517 Add async dependencies support for gpu.launch op
Add async dependencies support for gpu.launch op: this allows specifying
a list of async tokens ("streams") as dependencies for the launch.

Update the GPU kernel outlining pass lowering to propagate async
dependencies from gpu.launch to gpu.launch_func op. Previously, a new
stream was being created and destroyed for a kernel launch. The async
deps support allows the kernel launch to be serialized on an existing
stream.

Differential Revision: https://reviews.llvm.org/D123499
2022-04-21 16:25:59 +05:30
Nimish Mishra 00c511b351 Added lowering support for atomic read and write constructs
This patch adds lowering support for atomic read and write constructs.
Also added is pointer modelling code to allow FIR pointer like types to
be inferred and converted while lowering.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D122725

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
2022-04-21 12:19:13 +05:30
Shraiysh Vaishay 88bb2521b0 [mlir][OpenMP] Add checks and tests for hint clause and fix empty hint
This patch handles empty hint value for critical and atomic constructs.

This also adds checks and tests for hint clause on atomic constructs.

Reviewed By: peixin, kiranchandramohan, NimishMishra

Differential Revision: https://reviews.llvm.org/D123186
2022-04-21 07:31:03 +05:30
Matthias Springer 8544523dcb [mlir][tensor] Promote extract(from_elements(...)) to folding pattern
Differential Revision: https://reviews.llvm.org/D123617
2022-04-20 23:47:42 +09:00
gysit 407b351da2 [mlir][linalg] Add ods-gen helper to simplify the build methods.
Add a helper used to implement the build methods generated by ods-gen. The change reduces code size and compilation time since all structured op builders use the same build method. The change reduces the LinalgOps.cpp compilation time from 10.2s to 9.8s (debug build).

Depends On D123987

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D124003
2022-04-20 13:14:38 +00:00
gysit 17721b6915 [mlir][linalg] Avoid template methods for parsing and printing.
The revision avoids template methods for parsing and printing that are replicated for every named operation. Instead, the new methods take a regionBuilder argument. The revision reduces the compile time of LinalgOps.cpp from 11.2 to 10.2 seconds (debug build).

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D123987
2022-04-20 13:06:31 +00:00
Uday Bondhugula d7565de6cc [MLIR] NFC. Drop trailing white space in GPU async ops print
NFC. Drop trailing end of line white space in GPU async ops' printer
whenever the list of async deps is empty.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D123754
2022-04-20 17:56:53 +05:30
Uday Bondhugula d423fc3724 Add RegionBranchOpInterface on affine.for op
Add RegionBranchOpInterface on affine.for op so that transforms relying
on RegionBranchOpInterface can support affine.for. E.g.:
buffer-deallocation pass.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D123568
2022-04-20 17:46:07 +05:30
Matthias Springer 9235e597a4 [mlir][bufferize] Fix missing copies when writing to a buffer in a loop
Writes into tensors that are definied outside of a repetitive region, but with the write happening inside of the repetitive region were previously not considered conflicts. This was incorrect.

E.g.:
```
%0 = ... : tensor<?xf32>
scf.for ... {
  "reading_op"(%0) : tensor<?xf32>
  %1 = "writing_op"(%0) : tensor<?xf32> -> tensor<?xf32>
  ...
}
```

In the above example, "writing_op" should be out-of-place.

This commit fixes the bufferization for any op that declares its repetitive semantics via RegionBranchOpInterface.
2022-04-20 18:51:06 +09:00
jacquesguan 61baf2ffa7 [mlir][Vector] Add check of supported reduction kind for ScanOp.
This patch adds check of supported reduction kind for ScanOp to avoid using and/or/xor for floating point type.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123977
2022-04-20 02:42:19 +00:00
Mehdi Amini 8608ed1441 Apply clang-tidy fixes for llvm-twine-local in OpenMPToLLVMIRTranslation.cpp (NFC) 2022-04-20 00:39:10 +00:00
John Demme 6b0bed7ea5 [MLIR] [Python] Add a method to clear live operations map
Introduce a method on PyMlirContext (and plumb it through to Python) to
invalidate all of the operations in the live operations map and clear
it. Since Python has no notion of private data, an end-developer could
reach into some 3rd party API which uses the MLIR Python API (that is
behaving correctly with regard to holding references) and grab a
reference to an MLIR Python Operation, preventing it from being
deconstructed out of the live operations map. This allows the API
developer to clear the map when it calls C++ code which could delete
operations, protecting itself from its users.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123895
2022-04-19 15:14:09 -07:00
Krzysztof Drewniak ddc2eb0ada [mlir] Adds getUpperBound() to LoopLikeInterface.
getUpperBound is analogous to getLowerBound(), except for the upper
bound, and is used in range analysis.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124020
2022-04-19 19:56:44 +00:00
Alex Zinenko 0eb403ad1b [mlir][transform] Introduce transform.sequence op
Sequence is an important transform combination primitive that just indicates
transform ops being applied in a row. The simplest version requires fails
immediately if any transformation in the sequence fails. Introducing this
operation allows one to start placing transform IR within other IR.

Depends On D123135

Reviewed By: Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D123664
2022-04-19 21:41:02 +02:00
Ashay Rane 25c218be36 [MLIR] Add function to create BFloat16 array attribute
This patch adds a new function `mlirDenseElementsAttrBFloat16Get()`,
which accepts the shaped type, the number of BFloat16 values, and a
pointer to an array of BFloat16 values, each of which is a `uint16_t`
value.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D123981
2022-04-19 19:27:06 +00:00
Mehdi Amini 83892d76f4 Print custom assembly on pass failure by default
The printer is now resilient to invalid IR and will already automatically
fallback to the generic form on invalid IR. Using the generic printer on
pass failure was a conservative option before the printer was made
failsafe.

Reviewed By: lattner, rriddle, jpienaar, bondhugula

Differential Revision: https://reviews.llvm.org/D123915
2022-04-19 17:29:08 +00:00
Mehdi Amini 2d6335421f Apply clang-tidy fixes for llvm-qualified-auto in OpenMPToLLVMIRTranslation.cpp (NFC) 2022-04-19 17:20:57 +00:00
Mehdi Amini f9735be7e2 Apply clang-tidy fixes for performance-unnecessary-value-param in ControlFlowInterfaces.cpp (NFC) 2022-04-19 17:20:57 +00:00
Arnab Dutta 12f55cac69 [MLIR][GPU] Add canonicalizer for gpu.memcpy
Fold away gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D121279
2022-04-19 17:54:00 +05:30
Marius Brehler 2ba865903d [mlir][emitc] Add test for invalid type
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D123503
2022-04-19 11:03:56 +02:00
Matthias Springer a3005a406e [mlir][interfaces] Fix infinite loop in insideMutuallyExclusiveRegions
This function was missing a termination condition.
2022-04-19 16:28:52 +09:00
Mehdi Amini 4e01184ad5 Apply clang-tidy fixes for performance-unnecessary-value-param in JitRunner.cpp (NFC) 2022-04-19 07:23:12 +00:00
Mehdi Amini 722a3a58e2 Apply clang-tidy fixes for performance-for-range-copy in MemRefOps.cpp (NFC) 2022-04-19 07:23:12 +00:00
Matthias Springer 0f4ba02db3 [mlir][interfaces] Add helpers for detecting recursive regions
Add helper functions to check if an op may be executed multiple times based on RegionBranchOpInterface.

Differential Revision: https://reviews.llvm.org/D123789
2022-04-19 16:13:32 +09:00
Michael Kruse 2d92ee97f1 Reapply "[OpenMP] Refactor OMPScheduleType enum."
This reverts commit af0285122f.

The test "libomp::loop_dispatch.c" on builder
openmp-gcc-x86_64-linux-debian fails from time-to-time.
See #54969. This patch is unrelated.
2022-04-18 21:56:47 -05:00