Commit Graph

6421 Commits

Author SHA1 Message Date
Jacques Pienaar 3c68d58841 [mlir][doc] Move pass to passes list and remove redundant doc
The types are already included in the dialect doc (the attributes is not
properly yet, so retaining).
2022-06-13 21:01:31 -07:00
jacquesguan 5179f885d1 [mlir][Arithmetic] Fold NegF in MulF and DivF.
This patch adds the following combination:

mulf(negf(x), negf(y)) -> mulf(x, y)
divf(negf(x), negf(y)) -> divf(x, y)

Differential Revision: https://reviews.llvm.org/D126044
2022-06-14 03:15:19 +00:00
Mogball 537f220891 [mlir] Support getSuccessorInputs from parent op
Ops that implement `RegionBranchOpInterface` are allowed to indicate that they can branch back to themselves in `getSuccessorRegions`, but there is no API that allows them to specify the forwarded operands. This patch enables that by changing `getSuccessorEntryOperands` to accept `None`.

Fixes #54928

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127239
2022-06-13 22:21:34 +00:00
Mahesh Ravishankar cf6a7c1947 [mlir][TilingInterface] Add pattern to tile using TilingInterface and implement TilingInterface for Linalg ops.
This patch adds support for tiling operations that implement the
TilingInterface.
- It separates the loop constructs that are used to iterate over tile
  from the implementation of the tiling itself. For example, the use
  of destructive updates is more related to use of scf.for for
  iterating over tiles that are tensors.
- To test the transformation, TilingInterface is implemented for
  LinalgOps. The separation of the looping constructs used from the
  implementation of tile code generation greatly simplifies the
  latter.
- The implementation of TilingInterface for LinalgOp is kept as an
  external model for now till this approach can be fully flushed out
  to replace the existing tiling + fusion approaches in Linalg.

Differential Revision: https://reviews.llvm.org/D127133
2022-06-13 20:37:44 +00:00
Benjamin Kramer de0aa687a2 [mlir][linalg] Add conv_2d_nhwc_fhwc to core_named_ops.py
So it doesn't disappear when running the generator.
2022-06-13 20:43:10 +02:00
Thomas Raoux 2d32dac8bb Revert "[mlir][vector] Add patterns to ppropagate vector distribution"
This reverts commit 1c84800c42.

This was causing asan crash.
2022-06-13 17:55:31 +00:00
Lei Zhang cc020a2236 [mlir][spirv] Convert math.ctlz to spv.GLSL.FindUMsb
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127582
2022-06-13 13:02:37 -04:00
Thomas Raoux 1c84800c42 [mlir][vector] Add patterns to ppropagate vector distribution
Add patterns to propagate vector distribution and remove dead
arguments. This handles propagation for several vector operations.

Differential Revision: https://reviews.llvm.org/D127167
2022-06-13 16:38:50 +00:00
Lei Zhang a10c09d1e3 [mlir][spirv] Remove unused `traits` from `SPV_Attr`
This addresses the warning of unused template argument.
2022-06-13 12:20:57 -04:00
Frederik Gossen ff6ce9e8fc Add `createDynamicDimValues` to tensor dialect utils
The function creates dim ops for each dynamic dimension of the raked tensor
argument and returns these as values.

Differential Revision: https://reviews.llvm.org/D127533
2022-06-13 08:26:28 -04:00
Thomas Raoux ed0288f7c4 [mlir][vector] Add patterns for vector distribution
Add pattern to hoist scalar code outside of warp distribute region as
those cannot be distributed and we would want to execute them on all
the lanes.
Add patterns to distribute transfer_write ops. Those operations can be
distributed in different ways and it is control by user.

Differential Revision: https://reviews.llvm.org/D127152
2022-06-10 17:46:51 +00:00
Alex Zinenko 6403e1b12a [mlir] add a dynamic user-after-parent-freed transform dialect check
In the transform dialect, a transform IR handle may be pointing to a payload IR
operation that is an ancestor of another payload IR operation pointed to by
another handle. If such a "parent" handle is consumed by a transformation, this
indicates that the associated operation is likely rewritten, which in turn
means that the "child" handle may now be associated with a dangling pointer or
a pointer to a different operation than originally. Add a handle invalidation
mechanism to guard against such situations by reporting errors at runtime.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127480
2022-06-10 13:05:34 +02:00
Matthias Springer 79f115911e [mlir][bufferize] Avoid tensor copies when the data is not read
There are various shortcuts in `BufferizationState::getBuffer` that avoid a buffer copy when we just need an allocation (and no initialization). This change adds those shortcuts to the TensorCopyInsertion pass, so that `getBuffer` can be simplified in a subsequent change.

Differential Revision: https://reviews.llvm.org/D126821
2022-06-10 10:26:07 +02:00
Okwan Kwon 5ccb9df3ba [mlir] Support passing ostream as argument for the create function.
The constructor already supports passing an ostream as argument,
so let's make the create function support it too.

Differential Revision: https://reviews.llvm.org/D127449
2022-06-09 16:34:22 -07:00
Mogball a31ff0af9b [mlir][spirv] Replace StructAttrs with AttrDefs
Depends on D127370

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D127373
2022-06-09 23:16:44 +00:00
Mogball f1182bd6d5 [mlir][tosa] Replace StructAttrs with AttrDefs
Depends on D127352

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127370
2022-06-09 23:01:51 +00:00
Mogball d7ef488bb6 [mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352
2022-06-09 22:49:03 +00:00
Mogball 7bdd3722f2 [mlir][gpu] Change ParalellLoopMappingAttr to AttrDef
It was a StructAttr. Also adds a FieldParser for AffineMap.

Depends on D127348

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127350
2022-06-09 22:23:21 +00:00
Mogball ba79bb4973 [mlir][nvvm] Change MMAShapeAttr to AttrDef
MMAShapeAttr was a StructAttr

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127348
2022-06-09 22:14:45 +00:00
Matthias Springer 87c770bbd0 [mlir][bufferization][NFC] Put inplacability conflict resolution in op interface
The TensorCopyInsertion pass resolves out-of-place bufferization decisions by inserting explicit `bufferization.alloc_tensor` ops. This change moves that functionality into a new BufferizableOpInterface method, so that it can be overridden by op implementations. Some op bufferizations must insert additional `alloc_tensor` ops to make sure that certain aliasing invariants are not violated (e.g., scf::ForOp). This will be addressed in a subsequent change.

Differential Revision: https://reviews.llvm.org/D126817
2022-06-09 22:06:44 +02:00
Christopher Bate 9f1221521f Recommit "[mlir][vector] Allow unroll of contraction in arbitrary order"
Fixed issue with vector.contract default unroll permutation.

Adds support for vector unroll transformations to unroll in different
orders. For example, the vector.contract can be unrolled into a
smaller set of contractions. There is a choice of how to unroll the
decomposition based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback which
given by the caller of the transform. For now, only the
vector.contract, vector.transfer_read/transfer_write operations
support the callback.

Differential Revision: https://reviews.llvm.org/D127004
2022-06-09 14:01:19 -06:00
Matthias Springer 3b2004e16b [mlir][bufferization] Add TensorCopyInsertion pass
This pass runs the One-Shot Analysis to find out which tensor OpOperands must bufferize out-of-place. It then rewrites those tensor OpOperands to explicit allocations with a copy in the form of `bufferization.alloc_tensor`. The resulting IR can then be bufferized without having to care about read-after-write conflicts.

This change makes it possible to connect One-Shot Analysis to other bufferizations such as the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126573
2022-06-09 21:55:52 +02:00
Matthias Springer 56d68e8d7a [mlir][bufferization] Add optional `copy` operand to AllocTensorOp
If `copy` is specified, the newly allocated buffer is initialized with the given contents. Also add an optional `escape` attribute to indicate whether the buffer of the tensor may be returned from the parent block (aka. "escape") after bufferization.

This change is in preparation of connecting One-Shot Bufferize to the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126570
2022-06-09 21:37:15 +02:00
Matthias Springer 88539c5bdb [mlir][bufferize][NFC] Decouple dropping of equivalent return values from bufferization
This simplifies the bufferization itself and is in preparation of connecting with the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126814
2022-06-09 18:39:05 +02:00
Matthias Springer 92680126bf [mlir][bufferize] Decouple promoteBufferResultsToOutParams from One-Shot Bufferize
Users should explicitly run `-buffer-results-to-out-params` instead.

The purpose of this change is to remove `finalizeBuffers`, which made it difficult to extend the bufferization to custom buffer types.

Differential Revision: https://reviews.llvm.org/D126253
2022-06-09 18:25:26 +02:00
Matthias Springer 058af65e78 [mlir][bufferization] Decouple buffer-deallocation from One-Shot Bufferize
The buffer deallocation pass must now be run explicitly when `allow-return-alloc` is set.

This results in a few extra buffer copies in unoptimized test cases. The proper way to avoid such copies is to relax the OpOperand/OpResult aliasing contract on ops such as scf.for. Some of these copies can also be avoided by improving the buffer deallocation pass.

Differential Revision: https://reviews.llvm.org/D126252
2022-06-09 18:20:39 +02:00
Yuanqiang Liu 56e19717f5 [MLIR][Shape] Generalize `shape.concat` to extent tensors
The operation `shape.concat` was used for type shape only.
We now enable it for extent tensors.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D127321
2022-06-09 08:23:26 -07:00
Matthias Springer 461dafd2a3 [mlir][bufferization] Add OneShotBufferize transform op
This commit allows for One-Shot Bufferize to be used through the transform dialect. No op handle is currently returned for the bufferized IR.

Differential Revision: https://reviews.llvm.org/D125098
2022-06-09 15:15:09 +02:00
Alex Zinenko b6c58ec486 [mlir] add producer fusion to structured transform ops
This relies on the existing TileAndFuse pattern for tensor-based structured
ops. It complements pure tiling, from which some utilities are generalized.

Depends On D127300

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127319
2022-06-09 14:30:45 +02:00
Alex Zinenko 5f0d4f208e [mlir] Introduce Transform ops for loops
Introduce transform ops for "for" loops, in particular for peeling, software
pipelining and unrolling, along with a couple of "IR navigation" ops. These ops
are intended to be generalized to different kinds of loops when possible and
therefore use the "loop" prefix. They currently live in the SCF dialect as
there is no clear place to put transform ops that may span across several
dialects, this decision is postponed until the ops actually need to handle
non-SCF loops.

Additionally refactor some common utilities for transform ops into trait or
interface methods, and change the loop pipelining to be a returning pattern.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127300
2022-06-09 11:41:55 +02:00
Mogball 971e13d69e [mlir][ods] Mark StructAttr as deprecated 2022-06-09 03:23:31 +00:00
Arjun P 4bf9cbc408 [MLIR][Presburger] subtract: improve redundant constraint detection
When constraints in the two operands make each other redundant, prefer constraints of the second because this affects the number of sets in the output at each level; reducing these can help prevent exponential blowup.

This is accomplished by adding extra overloads to Simplex::detectRedundant that only scan a subrange of the constraints for redundancy.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D127237
2022-06-08 14:44:31 -04:00
wren romano 0371ddf9ad [mlir] Refactoring the tablegen Tensor types
Reduces repetition in tablegen files for defining various tensor types.  In particular the goal is to reduce the repetition when defining new tensor types (e.g., D126994).

Reviewed By: aartbik, rriddle

Differential Revision: https://reviews.llvm.org/D127039
2022-06-08 11:33:48 -07:00
dime10 4f55ed5a1e Add Python bindings for the OpaqueType
Implement the C-API and Python bindings for the builtin opaque type, which was previously missing.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127303
2022-06-08 19:51:00 +02:00
Mogball ee70039ae2 [mlir] Fix handling of some region branch terminator successors
When `RegionBranchOpInterface::getSuccessorRegions` is called for anything other than the parent op, it expects the operands of the terminator of the source region to be passed, not the operands of the parent op. This was not always respected.

This fixes a bug in integer range inference and ForwardDataFlowSolver and changes `scf.while` to allow narrowing of successors using constant inputs.

Fixes #55873

Reviewed By: mehdi_amini, krzysz00

Differential Revision: https://reviews.llvm.org/D127261
2022-06-08 17:17:03 +00:00
bixia1 ea8ed5cbcf [mlir][sparse] Add F16 and BF16.
This is the first PR to add `F16` and `BF16` support to the sparse codegen. There are still problems in supporting these two data types, such as `BF16` is not quite working yet.

Add tests cases.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127010
2022-06-08 09:51:05 -07:00
lorenzo chelini a0fc94ab61 [MLIR][Math] Add round operation
Introduce RoundOp in the math dialect. The operation rounds the operand to the
nearest integer value in floating-point format. RoundOp lowers to LLVM
intrinsics 'llvm.intr.round' or as a function call to libm (round or roundf).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127286
2022-06-08 13:07:39 +02:00
Matthias Springer 032be23309 [mlir][bufferize] Improve buffer writability analysis
Find writability conflicts (writes to buffers that are not allowed to be written to) by checking SSA use-def chains. This is better than the current writability analysis, which is too conservative and finds false positives.

Differential Revision: https://reviews.llvm.org/D127256
2022-06-08 10:11:52 +02:00
lorenzo chelini d48479791f [MLIR][SCF] Improve doc (NFC) 2022-06-08 08:46:36 +02:00
Aart Bik 7482cd6869 [mlir][sparse] updated our sparse dialect doc with some recent changes
The `init` and `tensor` ops are renamed (and one moved to another dialect).

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127169
2022-06-07 14:27:57 -07:00
Christopher Bate 53fe155b3f Revert "[mlir][vector] Allow unroll of contraction in arbitrary order"
Reverts commit 1469ebf838 (original commit)
Reverts commit a392a39f75 (build fix for above commit)

The commit broke tests in out-of-tree projects, indicating that some logical
error was made in the previous change but not covered by current tests.
2022-06-07 14:54:01 -06:00
Kiran Chandramohan dd32bf9a77 [Flang,MLIR,OpenMP] Fix a few tests that were not converting to LLVM
A few OpenMP tests were retaining the FIR operands even after running
the LLVM conversion pass. To fix these tests the legality checkes for
OpenMP conversion are made stricter to include operands and results.
The Flush, Single and Sections operations are added to conversions or
legality checks. The RegionLessOpConversion is appropriately renamed
to clarify that it works only for operations with Variable operands.
The operands of the flush operation are changed to match those of
Variable Operands.

Fix for an OpenMP issue mentioned in
https://github.com/llvm/llvm-project/issues/55210.

Reviewed By: shraiysh, peixin, awarzynski

Differential Revision: https://reviews.llvm.org/D127092
2022-06-07 09:55:53 +00:00
Alex Zinenko 3326eddcd1 [mlir] fix documentation format in SCF
Four leading spaces are interpreted as a code block in markdown. Unless
used consistently in ODS op description, they cannot be stripped away by
the tablegen backend, which results in malformed markdown being
generated.
2022-06-07 11:51:24 +02:00
lorenzo chelini 9b3712e0bf [MLIR][LLVMIR] Add round intrinsic
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126879
2022-06-07 10:27:55 +02:00
lewuathe 62a34f6a6f [mlir][complex] Add complex.conj op
Add complex.conj op to calculate the complex conjugate which is widely used for the mathematical operation on the complex space.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D127181
2022-06-07 09:38:35 +02:00
Georgios Pinitas 3bcaf2eb93 [mlir][tosa] Moves constant folding operations out of the Canonicalizer
Transpose operations on constant data were getting folded during the
canonicalization process. This has compile time cost proportional to
the constant size. Moving this to a separate pass to enable optionality
and flexibility of how such scenarios can be handled.

Reviewed By: rsuderman, jpienaar, stellaraccident

Differential Revision: https://reviews.llvm.org/D124685
2022-06-06 22:10:22 +00:00
Christopher Bate 1469ebf838 [mlir][vector] Allow unroll of contraction in arbitrary order
Adds supprot for vector unroll transformations to unroll in different
orders. For example, the `vector.contract` can be unrolled into a
smaller set of contractions.  There is a choice of how to unroll the
decomposition  based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback which
given by the caller of the transform. For now, only the
`vector.contract`, `vector.transfer_read/transfer_write` operations
support the callback.

Differential Revision: https://reviews.llvm.org/D127004
2022-06-06 14:31:04 -06:00
Christopher Bate cca662b849 [mlir][linalg] add conv_2d_nhwc_fhwc named op
This operation should be supported as a named op because
when the operands are viewed as having canonical layouts
with decreasing strides, then the "reduction" dimensions
of the filter (h, w, and c) are contiguous relative to each
output channel. When lowered to a matrix multiplication,
this layout is the simplest to deal with, and thus future
transforms/vectorizations of `conv2d` may find using this
named op convenient.

Differential Revision: https://reviews.llvm.org/D126995
2022-06-06 13:18:08 -06:00
jacquesguan ad44495ad3 [mlir][NFC] Replace some llvm::find with llvm::is_contained.
This patch replaces some llvm::find with llvm::is_contained, it should be more clear.

Differential Revision: https://reviews.llvm.org/D127077
2022-06-06 03:01:14 +00:00
Christian Sigg 400fef081a Recommit: "[MLIR][NVVM] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration."
This change rolls bcfc0a9051 forward (i.e., reverting 369ce54bb3) with fixed CMakeLists.txt.
2022-06-05 09:11:43 +02:00
Mehdi Amini 369ce54bb3 Revert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration."
This reverts commit bcfc0a9051.

The build is broken with shared library enabled.
2022-06-04 08:35:45 +00:00
Christian Sigg bcfc0a9051 [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.
This is correct for all values, i.e. the same as promoting the division to fp32 in the NVPTX backend. But it is faster (~10% in average, sometimes more) because:

- it performs less Newton iterations
- it avoids the slow path for e.g. denormals
- it allows reuse of the reciprocal for multiple divisions by the same divisor

Test program:
```
#include <stdio.h>
#include "cuda_fp16.h"

// This is a variant of CUDA's own __hdiv which is fast than hdiv_promote below
// and doesn't suffer from the perf cliff of div.rn.fp32 with 'special' values.
__device__ half hdiv_newton(half a, half b) {
  float fa = __half2float(a);
  float fb = __half2float(b);

  float rcp;
  asm("{rcp.approx.ftz.f32 %0, %1;\n}" : "=f"(rcp) : "f"(fb));

  float result = fa * rcp;
  auto exponent = reinterpret_cast<const unsigned&>(result) & 0x7f800000;
  if (exponent != 0 && exponent != 0x7f800000) {
    float err = __fmaf_rn(-fb, result, fa);
    result = __fmaf_rn(rcp, err, result);
  }

  return __float2half(result);
}

// Surprisingly, this is faster than CUDA's own __hdiv.
__device__ half hdiv_promote(half a, half b) {
  return __float2half(__half2float(a) / __half2float(b));
}

// This is an approximation that is accurate up to 1 ulp.
__device__ half hdiv_approx(half a, half b) {
  float fa = __half2float(a);
  float fb = __half2float(b);

  float result;
  asm("{div.approx.ftz.f32 %0, %1, %2;\n}" : "=f"(result) : "f"(fa), "f"(fb));
  return __float2half(result);
}

__global__ void CheckCorrectness() {
  int i = threadIdx.x + blockIdx.x * blockDim.x;
  half x = reinterpret_cast<const half&>(i);
  for (int j = 0; j < 65536; ++j) {
    half y = reinterpret_cast<const half&>(j);
    half d1 = hdiv_newton(x, y);
    half d2 = hdiv_promote(x, y);
    auto s1 = reinterpret_cast<const short&>(d1);
    auto s2 = reinterpret_cast<const short&>(d2);
    if (s1 != s2) {
      printf("%f (%u) / %f (%u), got %f (%hu), expected: %f (%hu)\n",
             __half2float(x), i, __half2float(y), j, __half2float(d1), s1,
             __half2float(d2), s2);
      //__trap();
    }
  }
}

__device__ half dst;

__global__ void ProfileBuiltin(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = x / x;
  }
  dst = x;
}

__global__ void ProfilePromote(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_promote(x, x);
  }
  dst = x;
}

__global__ void ProfileNewton(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_newton(x, x);
  }
  dst = x;
}

__global__ void ProfileApprox(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_approx(x, x);
  }
  dst = x;
}

int main() {
  CheckCorrectness<<<256, 256>>>();
  half one = __float2half(1.0f);
  ProfileBuiltin<<<1, 1>>>(one);  // 1.001s
  ProfilePromote<<<1, 1>>>(one);  // 0.560s
  ProfileNewton<<<1, 1>>>(one);   // 0.508s
  ProfileApprox<<<1, 1>>>(one);   // 0.304s
  auto status = cudaDeviceSynchronize();
  printf("%s\n", cudaGetErrorString(status));
}
```

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D126158
2022-06-04 08:03:29 +02:00
wren romano 3cf03f1c56 [mlir][sparse] Adding IsSparseTensorPred and updating ops to use it
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126994
2022-06-03 17:15:31 -07:00
Diego Caballero 9a79b1b04c [mlir] Add peeling xform to Codegen Strategy
This patch adds the knobs to use peeling in the codegen strategy
infrastructure.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D126842
2022-06-03 21:31:43 +00:00
Krzysztof Drewniak 95aff23e29 Re-land "[mlir] Add integer range inference analysis""
This reverts commit 4e5ce2056e.

This relands commit 1350c9887d.

Reinstates the range analysis with the build issue fixed.

Differential Revision: https://reviews.llvm.org/D126926
2022-06-03 17:13:48 +00:00
Nicolas Vasilache 72de7588cc [mlir][SCF] Add bufferization hook for scf.foreach_thread and terminator.
`scf.foreach_thread` results alias with the underlying `scf.foreach_thread.parallel_insert_slice` destination operands
and they bufferize to equivalent buffers in the absence of other conflicts.
`scf.foreach_thread.parallel_insert_slice` conflict detection is similar to `tensor.insert_slice` conflict detection.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D126769
2022-06-03 07:14:05 +00:00
Thomas Raoux 205c08b54d [mlir][scf] Add option to loop pipelining to not peel the epilogue
Add an option to predicate the epilogue within the kernel instead of
peeling the epilogue. This is a useful option to prevent generating
large amount of code for deep pipeline. This currently require a user
lamdba to implement operation predication.

Differential Revision: https://reviews.llvm.org/D126753
2022-06-03 04:20:20 +00:00
River Riddle ee1cf1f645 [mlir][NFC] Simplify the various `parseSourceFile<T>` overloads
These effectively all share the same implementation, i.e. forward
to the non-templated overload and then construct the container op.
2022-06-02 19:18:55 -07:00
River Riddle bf352e0b2e [mlir:PDLL] Add better support for providing Constraint/Pattern/Rewrite documentation
This commit enables providing long-form documentation more seamlessly to the LSP
by revamping decl documentation. For ODS imported constructs, we now also import
descriptions and attach them to decls when possible. For PDLL constructs, the LSP will
now try to provide documentation by parsing the comments directly above the decls
location within the source file. This commit also adds a new parser flag
`enableDocumentation` that gates the import and attachment of ODS documentation,
which is unnecessary in the normal build process (i.e. it should only be used/consumed
by tools).

Differential Revision: https://reviews.llvm.org/D124881
2022-06-02 16:31:07 -07:00
Arjun P 8bc2cff95a [MLIR][Presburger] Simplex: remove redundant member vars nRow, nCol
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126790
2022-06-03 00:30:48 +01:00
Mehdi Amini 4e5ce2056e Revert "[mlir] Add integer range inference analysis"
This reverts commit 1350c9887d.

Shared library build is broken with undefined references.
2022-06-02 21:24:06 +00:00
Krzysztof Drewniak 1350c9887d [mlir] Add integer range inference analysis
This commit defines a dataflow analysis for integer ranges, which
uses a newly-added InferIntRangeInterface to compute the lower and
upper bounds on the results of an operation from the bounds on the
arguments. The range inference is a flow-insensitive dataflow analysis
that can be used to simplify code, such as by statically identifying
bounds checks that cannot fail in order to eliminate them.

The InferIntRangeInterface has one method, inferResultRanges(), which
takes a vector of inferred ranges for each argument to an op
implementing the interface and a callback allowing the implementation
to define the ranges for each result. These ranges are stored as
ConstantIntRanges, which hold the lower and upper bounds for a
value. Bounds are tracked separately for the signed and unsigned
interpretations of a value, which ensures that the impact of
arithmetic overflows is correctly tracked during the analysis.

The commit also adds a -test-int-range-inference pass to test the
analysis until it is integrated into SCCP or otherwise exposed.

Finally, this commit fixes some bugs relating to the handling of
region iteration arguments and terminators in the data flow analysis
framework.

Depends on D124020

Depends on D124021

Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D124023
2022-06-02 20:24:11 +00:00
Aart Bik bf7dbc2a30 [mlir][sparse][bufferization] fix doc on new init operation
The example was still using the -now- removed sparse_tensor.init_tensor.
Also, I made the input operands of the matrix multiplication sparse too
(since it looks a bit strange to multiply two dense matrices into a sparse).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D126897
2022-06-02 12:04:36 -07:00
Alex Zinenko ce2e198bc2 [mlir] add decompose and generalize to structured transform ops
These ops complement the tiling/padding transformations by transforming
higher-level named structured operations such as depthwise convolutions into
lower-level and/or generic equivalents that are better handled by some
downstream transformations.

Differential Revision: https://reviews.llvm.org/D126698
2022-06-02 15:25:18 +02:00
Nicolas Vasilache 311967701a [mlir][SCF] Add scf.foreach_thread.parallel_insert_slice canonicalization.
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126761
2022-06-02 11:53:25 +00:00
jacquesguan 19e285477e [mlir][Arithmetic] Add constant folder for RemF.
This patch adds the constant folder for RemF.

Differential Revision: https://reviews.llvm.org/D126045
2022-06-02 06:24:37 +00:00
Matthias Springer 6232a8f3d6 [mlir][sparse][NFC] Switch InitOp to bufferization::AllocTensorOp
Now that we have an AllocTensorOp (previously InitTensorOp) in the bufferization dialect, the InitOp in the sparse dialect is no longer needed.

Differential Revision: https://reviews.llvm.org/D126180
2022-06-02 00:03:52 +02:00
wren romano b364c76683 [mlir][sparse] Using non-empty function name suffix for OverheadType::kIndex
The trick of using an empty token in the `FOREVERY_O` x-macro relies on preprocessor behavior which is only standard since C99 6.10.3/4 and C++11 N3290 16.3/4 (whereas it was undefined behavior up through C++03 16.3/10).  Since the `ExecutionEngine/SparseTensorUtils.cpp` file is required to be compile-able under C++98 compatibility mode (unlike the C++11 used elsewhere in MLIR), we shouldn't rely on that behavior.

Also, using a non-empty suffix helps improve uniformity of the API, since all other primary/overhead suffixes are also non-empty.  I'm using the suffix `0` since that's the value used by the `SparseTensorEncoding` attribute for indicating the index overhead-type.

Depends On D126720

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126724
2022-06-01 14:18:42 -07:00
Rob Suderman f3bdb56d61 [mlir][math] Add math.ctlz expansion to control flow + arith operations
Ctlz is an intrinsic in LLVM but does not have equivalent operations in SPIR-V.
Including a decomposition gives an alternative path for these platforms.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D126261
2022-06-01 11:45:04 -07:00
Stella Laurenzo 3bb7999339 [mlir] Add global_load and global_store ops to ml_program.
* Adds simple, non-atomic, non-volatile, non-synchronized direct load/store ops.

Differential Revision: https://reviews.llvm.org/D126230
2022-06-01 11:32:15 -07:00
Arjun P ec145ba2a3 [MLIR][Presburger] Matrix: inline trivial accessors
This resolves a comment from https://reviews.llvm.org/D126708
that was previously missed.
2022-06-01 16:56:46 +01:00
Arjun P d5e31cf38a [MLIR][Presburger] Move Matrix accessors inline
This gives a 1.5x speedup on the Presburger unittests.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D126708
2022-06-01 16:51:42 +01:00
PeixinQiao fe2cc16035 [NFC][MLIR] Fix -Wtype-limits warning
Fix the warning: comparison of unsigned expression in ‘>= 0’ is always
true.

Reviewed By: kiranchandramohan, shraiysh

Differential Revision: https://reviews.llvm.org/D126784
2022-06-01 23:42:07 +08:00
Nicolas Vasilache 59b273a166 [mlir][SCF] Add parallel abstraction on tensors.
This revision adds `scf.foreach_thread` and other supporting abstractions
that allow connecting parallel abstractions and tensors.

Discussion is available [here](https://discourse.llvm.org/t/rfc-parallel-abstraction-for-tensors-and-buffers/62607).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126555
2022-06-01 09:16:01 +00:00
Nicolas Vasilache beab8e871e Revert "[mlir][SCF] Add parallel abstraction on tensors."
This reverts commit 9b7193f852.

This is an older branch that was committed by mistake and does not include addressed review comments, an updated version will come next.
2022-06-01 09:04:20 +00:00
Nicolas Vasilache 9b7193f852 [mlir][SCF] Add parallel abstraction on tensors.
This revision adds `scf.foreach_thread` and other supporting abstractions
that allow connecting parallel abstractions and tensors.

Discussion is available [here](https://discourse.llvm.org/t/rfc-parallel-abstraction-for-tensors-and-buffers/62607).
2022-06-01 09:02:16 +00:00
lewuathe 6d75c89783 [mlir][complex] Add tan op for complex dialect
Add tangent operation for complex dialect. This is the follow-up change of https://reviews.llvm.org/D126521

Differential Revision: https://reviews.llvm.org/D126685
2022-06-01 09:20:42 +02:00
wren romano a4c53f8cd6 [mlir][sparse] Factoring out SparseTensorFile class for readSparseTensorShape
The primary goal of this change is to define readSparseTensorShape.  Whereas the SparseTensorFile class is merely introduced as a way to reduce code duplication along the way.

Depends On D126106

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126233
2022-05-31 13:24:28 -07:00
lorenzo chelini 850dbff708 [MLIR][Math] Improve docs (NFC)
Remove boilerplate examples and add a text at the dialect level to describe
what kind of operands the operations accept (i.e., scalar, tensor or vector).
Left a shorter sentence describing the input operands for each operation as
this redundancy is convenient when browsing the documentation using the
website.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126648
2022-05-31 18:16:59 +02:00
jacquesguan 42c17073fc [mlir] Support import llvm intrinsics.
This patch supports to convert the llvm intrinsic to the corresponding op. It still leaves some intrinsics to be handled specially.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126639
2022-05-31 11:08:23 +00:00
River Riddle 1c2edb026e [mlir:PDLL] Rework the C++ generation of native Constraint/Rewrite arguments and results
The current translation uses the old "ugly"/"raw" form which used PDLValue for the arguments
and results. This commit updates the C++ generation to use the recently added sugar that
allows for directly using the desired types for the arguments and result of PDL functions.
In addition, this commit also properly imports the C++ class for ODS operations, constraints,
and interfaces. This allows for a much more convienent C++ API than previously granted
with the raw/low-level types.

Differential Revision: https://reviews.llvm.org/D124817
2022-05-30 17:35:34 -07:00
River Riddle 91b8d96fd1 [mlir:PDLL] Add proper support for operation result type inference
This allows for the results of operations to be inferred in certain contexts,
and matches the support in PDL for result type inference. The main two
initial circumstances are when used as a replacement of another operation,
or when the operation being created implements InferTypeOpInterface.

Differential Revision: https://reviews.llvm.org/D124782
2022-05-30 17:35:33 -07:00
Mehdi Amini 940e290860 Apply clang-tidy fixes for performance-unnecessary-value-param in OneShotModuleBufferize.cpp (NFC) 2022-05-30 18:44:28 +00:00
Alex Zinenko cc6c159203 [mlir] add VectorizeOp to structured transform ops
Vectorization is a key transformation to achieve high performance on most
architectures. In the transform dialect, vectorization is implemented as a
parameterizable transform op. It currently applies to a scope of payload IR
delimited by some isolated-from-above op, mainly because several enabling
transformations (such as affine simplification) are needed to perform
vectorization and these transformation would apply to ops other than the "main"
computational payload op. A separate "navigation" transform op that obtains the
isolated-from-above ancestor of an op is introduced in the core transform
dialect. Even though it is currently only useful for vectorization,
isolated-from-above ops are a common anchor for transformations (usually
implemented as passes) that is likely to be reused in the future.

Depends On D126374

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D126542
2022-05-30 17:37:50 +02:00
Alex Zinenko 5cde5a5739 [mlir] add interchange, pad and scalarize to structured transform dialect
Add ops to the structured transform extension of the transform dialect that
perform interchange, padding and scalarization on structured ops. Along with
tiling that is already defined, this provides a minimal set of transformations
necessary to build vectorizable code for a single structured op.

Define two helper traits: one that implements TransformOpInterface by applying
a function to each payload op independently and another that provides a simple
"functional-style" producer/consumer list of memory effects for the transform
ops.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D126374
2022-05-30 11:42:40 +02:00
Christian Sigg bcf3d52486 [MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass.
Reviewed By: bondhugula, herhut

Differential Revision: https://reviews.llvm.org/D126199
2022-05-30 09:20:48 +02:00
Groverkss dac27da7b9 [MLIR][Presburger] Add applyDomain/Range to IntegerRelation
This patch adds support for applying a relation on domain/range of a relation.

Reviewed By: arjunp, ftynse

Differential Revision: https://reviews.llvm.org/D126339
2022-05-29 02:06:11 +05:30
Matthias Springer 2f0a634c5e [mlir][bufferization] Add extra filter mechanism to bufferizeOp
Differential Revision: https://reviews.llvm.org/D126569
2022-05-28 04:49:23 +02:00
Matthias Springer f470f8cbce [mlir][bufferize][NFC] Split analysis+bufferization of ModuleBufferization
Analysis and bufferization can now be run separately.

Differential Revision: https://reviews.llvm.org/D126572
2022-05-28 04:43:50 +02:00
Matthias Springer 3490aadf56 [mlir][bufferization][NFC] Remove post-analysis step infrastructure
Now that analysis and bufferization are better separated, post-analysis steps are no longer needed. Users can directly interleave analysis and bufferization as needed.

Differential Revision: https://reviews.llvm.org/D126571
2022-05-28 04:37:13 +02:00
Matthias Springer 1534177f8f [mlir][bufferization][NFC] Move OpFilter out of BufferizationOptions
Differential Revision: https://reviews.llvm.org/D126568
2022-05-28 01:47:39 +02:00
Aart Bik a5d7e2a8ac [OpenMP][mlir] fix broken build
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D126556
2022-05-27 10:06:01 -07:00
PeixinQiao 042ae89556 [OpenMP] Support operation conversion to LLVM for threadprivate directive
This supports the operation conversion for threadprivate directive. The
support for memref type conversion is not implemented.

Reviewed By: kiranchandramohan, shraiysh

Differential Revision: https://reviews.llvm.org/D124610
2022-05-28 00:06:57 +08:00
Daniil Dudkin 52d79b04b2 [mlir][llvm] Fix compiler error on GCC 9
This patch fixes the following compiler error:

    error: declaration of ‘mlir::LLVM::cconv::CConv mlir::LLVM::detail::CConvAttrStorage::CConv’ changes meaning of ‘CConv’ [-fpermissive]

CConv as a member variable name was shadowing CConv as an enumeration,
hence the compiler error.

Reviewed By: ftynse, alexbatashev

Differential Revision: https://reviews.llvm.org/D126530
2022-05-27 15:33:43 +03:00
Groverkss f168a65943 [MLIR][Presburger] Add intersectDomain/Range to IntegerRelation
This patch adds support for intersection a set with a relation.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D126328
2022-05-27 15:51:54 +05:30
Alexander Batashev 0252357b3e [mlir][LLVM] Add support for Calling Convention in LLVMFuncOp
This patch adds support for Calling Convention attribute in LLVM
dialect, including enums, custom syntax and import from LLVM IR.
Additionally fix import of dso_local attribute.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126161
2022-05-27 09:43:31 +03:00
wren romano 2046e11ac4 [mlir][sparse] Improving ExecutionEngine/SparseTensorUtils.h
This change makes the public API of SparseTensorUtils.cpp explicit, whereas before the publicity of these functions was only implicit.  Implicit publicity is sufficient for mlir-opt to generate calls to these functions, but it's not enough to enable C/C++ code to call them directly in the usual way (i.e., without going through codegen).  Thus, leaving the publicity implicit prevents development of other tools (e.g., microbenchmarks).

In addition this change also marks the functions MLIR_CRUNNERUTILS_EXPORT, which is required by the JIT under certain configurations (albeit not for anything in our test suite).

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126105
2022-05-26 17:22:08 -07:00
Alex Zinenko 73c3dff1b3 [mlir] Use-after-free checker for the Transform dialect
The Transform dialect uses the side effect modeling mechanism to record the
effects of the transform ops on the mapping between Transform IR values and
Payload IR ops. Introduce a checker pass that warns if a Transform IR value is
used after it has been freed (consumed). This pass is mostly intended as a
debugging aid in addition to the verification/assertion mechanisms in the
transform interpreter. It reports all potential use-after-free situations.
The implementation makes a series of simplifying assumptions to be simple and
conservative. A more advanced implementation would rely on the data flow-like
analysis associated with a side-effect resource rather than a value, which is
currently not supported by the analysis infrastructure.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D126381
2022-05-26 12:28:41 +02:00
Matthias Springer 52698a33d0 [mlir][bufferization] Clean up imports and code comments
Differential Revision: https://reviews.llvm.org/D126427
2022-05-26 05:48:52 +02:00
bixia1 a14057d4bd [mlir][sparse] Add more complex operations.
Support complex operations sqrt, expm1, and tanh.

Add tests.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126393
2022-05-25 16:38:09 -07:00
Matthias Springer ab249fd87d [mlir][bufferization][NFC] Remove dead code
There were two copies of AlwaysCopyAnalysisState. (Must have been a merge conflict mistake...)

Differential Revision: https://reviews.llvm.org/D126414
2022-05-25 22:26:00 +02:00
Matthias Springer 0ee1c0388c [mlir][bufferize] Remove hoisting functionality from One-Shot Bufferize
The same functionality is already provided by `-buffer-hoisting` and `-buffer-loop-hoisting`.

Differential Revision: https://reviews.llvm.org/D126251
2022-05-25 19:56:18 +02:00
Logan Chien 0c8fdd7230 [mlir] Fix Tensor_InsertSliceOp description
This commit fixes `Tensor_InsertSliceOp` `sizes` inputs/attributes
description.

Before this commit, the description says the `sizes` inputs/attributes
denote the size of the return type. But according to the
`InsertSliceOpConstantArgumentFolder` in
`lib/Dialect/Tensor/IR/TensorOps.cpp`, the `sizes` inputs/attributes
actually denote the size of the source type.

I had an off-line discussion with the authors of `TensorOps.td` and
`TensorOps.cpp`. We concluded that it was a typo in the Op description.

This commit updates the Op description to match the actual usage.

Differential Revision: https://reviews.llvm.org/D126264
2022-05-25 09:38:06 -07:00
Groverkss fb857ded70 [MLIR][Presburger] Add inverse to IntegerRelation
This patch adds support for obtaining inverse of a relation.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126327
2022-05-25 19:37:52 +05:30
Groverkss 3c057ac2c2 [MLIR][Presburger] Add getDomainSet, getRangeSet to IntegerRelation
This patch adds support for obtaining a set corresponding to the domain/range
of the relation.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126326
2022-05-25 19:35:56 +05:30
lorenzo chelini 1ad9b26622 [MLIR][Linalg] Adjust documentation (NFC)
Adjust docs for tensor.pad, tensor.collapse_shape and tensor.expand_shape.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126370
2022-05-25 13:57:11 +02:00
Alexander Belyaev b3c5c22c13 [mlir] Add `complex.atan2` operation.
Differential Revision: https://reviews.llvm.org/D126357
2022-05-25 10:11:44 +02:00
Mogball 96bbe1bd61 [mlir] Rename mlir::SmallVector -> llvm::SmallVector 2022-05-24 15:03:19 +00:00
Logan Chien 57d239e4ad [mlir] Breakdown diagnostic string literals
This commit breaks down diagnostic string literals so that the attribute
name and enumurator names can be shared with the stringify utility
function and the "expected ", " to be one of ", and ", " can be shared
between different enum-related diagnostic.

Differential Revision: https://reviews.llvm.org/D125938
2022-05-24 07:58:00 -07:00
Thomas Raoux 89aaa2d033 [mlir][vector] Add new lowering mode to vector.contractionOp
Add lowering for cases where the reduction dimension is fully unrolled.
It is common to unroll the reduction dimension, therefore we would want
to lower the contractions to an elementwise vector op in this case.

Differential Revision: https://reviews.llvm.org/D126120
2022-05-24 14:19:08 +00:00
Mehdi Amini 63d69a21b7 Apply clang-tidy fixes for performance-unnecessary-value-param in Utils.cpp (NFC) 2022-05-23 23:12:58 +00:00
Matthias Springer 82c85bf38e [mlir][bufferize][NFC] Update One-Shot Bufferize pass documentation
Differential Revision: https://reviews.llvm.org/D125637
2022-05-23 18:53:36 +02:00
Matthias Springer ec55f0bd58 [mlir][bufferization][NFC] Improve assembly format of AllocTensorOp
No longer pass static dim sizes as an attribute. This was redundant and required extra checks in the verifier. This change also makes the op symmetrical to memref::AllocOp.

Differential Revision: https://reviews.llvm.org/D126178
2022-05-23 16:58:01 +02:00
Alexander Belyaev f3eeefe449 [mlir] Add Expm1 tp ComplexOps.td.
Differential Revision: https://reviews.llvm.org/D126206
2022-05-23 16:31:16 +02:00
Alexander Belyaev a3a85fe59f [mlir] Add RSqrt tp ComplexOps.td.
Differential Revision: https://reviews.llvm.org/D126202
2022-05-23 16:12:05 +02:00
Benjamin Kramer 295d032762 [mlir] Move diagnostic handlers instead of copying
This also allows using unique_ptr instead of shared_ptr for the CAPI
user data. NFCI.
2022-05-21 13:25:24 +02:00
Matthias Springer ffdbecccaf [mlir][bufferization] Add bufferization.alloc_tensor op
This change adds a new op `alloc_tensor` to the bufferization dialect. During bufferization, this op is always lowered to a buffer allocation (unless it is "eliminated" by a pre-processing pass). It is useful to have such an op in tensor land, because it allows users to model tensor SSA use-def chains (which drive bufferization decisions) and because tensor SSA use-def chains can be analyzed by One-Shot Bufferize, while memref values cannot.

This change also replaces all uses of linalg.init_tensor in bufferization-related code with bufferization.alloc_tensor.

linalg.init_tensor and bufferization.alloc_tensor are similar, but the purpose of the former one is just to carry a shape. It does not indicate a memory allocation.

linalg.init_tensor is not suitable for modelling SSA use-def chains for bufferization purposes, because linalg.init_tensor is marked as not having side effects (in contrast to alloc_tensor). As such, it is legal to move linalg.init_tensor ops around/CSE them/etc. This is not desirable for alloc_tensor; it represents an explicit buffer allocation while still in tensor land and such allocations should not suddenly disappear or get moved around when running the canonicalizer/CSE/etc.

BEGIN_PUBLIC
No public commit message needed for presubmit.
END_PUBLIC

Differential Revision: https://reviews.llvm.org/D126003
2022-05-21 02:47:32 +02:00
Bixia Zheng d390035b46 [mlir][sparse] Support more complex operations.
Add complex operations abs, neg, sin, log1p, sub and div.

Add test cases.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126027
2022-05-20 14:39:26 -07:00
Christopher Bate 1ca772ed95 [MLIR][GPU] Add NvGpu mma.sync path to the VectorToGPU pass
This changes adds the option to lower to NvGpu dialect ops during the
VectorToGPU convsersion pass. Because this transformation reuses
existing VectorToGPU logic, a seperate VectorToNvGpu conversion pass is
not created. The option `use-nvgpu` is added to the VectorToGPU pass.
When this is true, the pass will attempt to convert slices rooted at
`vector.contract` operations into `nvgpu.mma.sync` ops, and
`vector.transfer_read` ops are converted to either `nvgpu.ldmatrix` or
one or more `vector.load` operations.  The specific data loaded will
depend on the thread id within a subgroup (warp). These index
calculations depend on data type and shape of the MMA op
according to the downstream PTX specification. The code for supporting
these details is separated into `NvGpuSupport.cpp|h`.

Differential Revision: https://reviews.llvm.org/D122940
2022-05-20 09:42:55 -06:00
Stella Laurenzo 2bb252852c [mlir] Add GlobalOp, GlobalLoadConstOp to ml_program.
The approach I took was to define a dialect 'extern' attribute that a GlobalOp can take as a value to signify external linkage. I think this approach should compose well and should also work with wherever the OpaqueElements work goes in the future (since that is just another kind of attribute). I special cased the GlobalOp parser/printer for this case because it is significantly easier on the eyes.

In the discussion, Jeff Niu had proposed an alternative syntax for GlobalOp that I ended up not taking. I did try to implement it but a) I don't think it made anything easier to read in the common case, and b) it made the parsing/printing logic a lot more complicated (I think I would need a completely custom parser/printer to do it well). Please have a look at the common cases where the global type and initial value type match: I don't think how I have it is too bad. The less common cases seem ok to me.

I chose to only implement the direct, constant load op since that is non side effecting and there was still discussion pending on that.

Differential Revision: https://reviews.llvm.org/D124318
2022-05-18 23:08:28 -07:00
Mogball 4957518ef5 [mlir][ods] Simplify useDefaultType/AttributePrinterParser
The current behaviour of `useDefaultTypePrinterParser` and `useDefaultAttributePrinterParser` is that they are set by default, but the dialect generator only generates the declarations for the parsing and printing hooks if it sees dialect types and attributes. Same goes for the definitions generated by the AttrOrTypeDef generator.

This can lead to confusing and undesirable behaviour if the dialect generator doesn't see the definitions of the attributes and types, for example, if they are sensibly separated into different files: `Dialect.td`, `Ops.td`, `Attributes.td`, and `Types.td`.

Now, these bits are unset by default. Setting them will always result in the dialect generator emitting the declarations for the parsing hooks. And if the AttrOrTypeDef generator sees it set, it will generate the default implementations.

Reviewed By: rriddle, stellaraccident

Differential Revision: https://reviews.llvm.org/D125809
2022-05-18 17:22:11 +00:00
Bixia Zheng 69edacbcf0 [mlir][sparse] Add support for complex.im and complex.re to the sparse compiler.
Add a test.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D125834
2022-05-18 15:53:07 +00:00
Benjamin Kramer e497871356 [mlir][complex] Add pow/sqrt/tanh ops and lowering to libm
Lowering through libm gives us a baseline version, even though it's not
going to be particularly fast. This is similar to what we do for some
math dialect ops.

Differential Revision: https://reviews.llvm.org/D125550
2022-05-18 14:03:14 +02:00
Frederik Gossen 6d36cfed3b [MLIR] Make `parseDimensionListRanked` configurable wrt parsing a trailing `x`
Differential Revision: https://reviews.llvm.org/D125797
2022-05-18 05:42:35 -04:00
Groverkss e00cbbec06 [MLIR][Presburger] Cleanup getMaybeValues in FACV
This patch cleans up multiple getMaybeValue functions to take an IdKind instead
of special functions.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D125617
2022-05-18 09:44:14 +05:30
Groverkss 862b5a5233 [MLIR][Presburger] Attach values only to non-local identifiers in FAVC
This patch changes `FlatAffineValueConstraints` to only allow attaching
values to non-local identifiers.

The reasoning for this change is:
1. Information attached to local identifiers can be lost since local identifiers
  can be removed for output size optimizations.
2. There are no current use cases for attaching values to Local identifiers.
3. Attaching a value to a local identifier does not make sense since a local
  identifier represents existential quantification.

This patch also adds some additional asserts to the affected functions.

Reviewed By: arjunp, bondhugula

Differential Revision: https://reviews.llvm.org/D125613
2022-05-18 09:17:23 +05:30
Robert Suderman 9294a1e9a8 [mlir][tosa] Rework tosa.apply_scale lowering for 32-bit
Added handling rounding behavior in 32-bits for when possible. This
avoids kernel compilation generating scalarized code on platforms where
64-bit vectors are not available.

As the 48-bit lowering requires 64-bit anyway, we added a full 64-bit
solution simplifying the old path.

Reviewed By: dcaballe, mravishankar

Differential Revision: https://reviews.llvm.org/D125583
2022-05-17 16:01:12 -07:00
Matthias Springer 996834e681 [mlir][SCF] Fix scf.while bufferization
Before this fix, the bufferization implementation made the incorrect assumption that the values yielded from the "before" region must match with the values yielded from the "after" region.

Differential Revision: https://reviews.llvm.org/D125835
2022-05-18 00:35:50 +02:00
jfurtek 5c3b20520b [mlir] Update LLVMIR Fastmath flags use of MLIR BitEnum functionality
This diff updates the LLVMIR dialect Fastmath flags attribute to use recently
added features of `BitEnum` attributes. Specifically, this diff uses the bit
enum "group" case to represent the `fast` value as an alias for a combination
of other values (`ninf`, `nnan`, ...), instead of using a separate integer
value. (This is in line with LLVM's fastmath flags representation.) This diff
also leverages the `printBitEnumPrimaryGroups` `tblgen` field for concise
enum printing.

The `BitEnum` features were developed for an upcoming diff that adds `fastmath`
support to the arithmetic dialect. This diff simply applies some of the relevant
new features to the LLVM dialect attribute.

Reviewed By: ftynse, Mogball

Differential Revision: https://reviews.llvm.org/D124720
2022-05-17 18:19:14 +00:00
Min-Yih Hsu 0b168a49bf [mlir][LLVMIR] Use a new way to verify GEPOp indices
Previously, GEPOp relies on `findKnownStructIndices` to check if a GEP
index should be static. The truth is, `findKnownStructIndices` can only
tell you a GEP index _might_ be indexing into a struct (which should use
a static GEP index). But GEPOp::build and GEPOp::verify are falsely
taking this information as a certain answer, which creates many false
alarms like the one depicted in
`test/Target/LLVMIR/Import/dynamic-gep-index.ll`.

The solution presented here adopts a new verification scheme: When we're
recursively checking the child element types of a struct type, instead
of checking every child types, we only check the one dictated by the
(static) GEP index value. We also combine "refinement" logics --
refine/promote struct index mlir::Value into constants -- into the very
verification process since they have lots of logics in common. The
resulting code is more concise and less brittle.

We also hide GEPOp::findKnownStructIndices since most of the
aforementioned logics are already encapsulated within GEPOp::build and
GEPOp::verify, we found little reason for findKnownStructIndices (or the
new findStructIndices) to be public.

Differential Revision: https://reviews.llvm.org/D124935
2022-05-17 10:28:44 -07:00
Alex Zinenko 1075c8ca49 [mlir] support isa/cast/dyn_cast<Operation *>(operation) again
The support for this has been added by 946311b893
but then ignored by bc22b5c9a2.

This enables one to write generic code that can be instantiated for both
specific operation classes and the common base class without
specialization. Examples include functions that take/return ops, such
as:

```mlir
template <typename FnTy>
void applyIf(FnTy &&lambda, ...) {
  for (Operation *op : ...) {
    auto specific = dyn_cast<function_traits<FnTy>::template arg_t<0>>(op);
    if (specific)
      lambda(specific);
  }
}
```

that would otherwise need to rely on template specialization to support
lambdas that take specific operations and those that take `Operation *`.

Differential Revision: https://reviews.llvm.org/D125543

Reviewed by: rriddle
2022-05-17 11:15:17 +02:00
River Riddle 5de12bb703 [mlir][Tablegen-LSP] Add support for a basic TableGen language server
This follows the same general structure of the MLIR and PDLL language
servers. This commits adds the basic functionality for setting up the server,
and initially only supports providing diagnostics. Followon commits will
build out more comprehensive behavior.

Realistically this should eventually live in llvm/, but building in MLIR is an easier
initial step given that:
* All of the necessary LSP functionality is already here
* It allows for proving out useful language features (e.g. compilation databases)
  without affecting wider scale tablegen users
* MLIR has a vscode extension that can immediately take advantage of it

Differential Revision: https://reviews.llvm.org/D125440
2022-05-16 16:03:51 -07:00
River Riddle e0c3b94c80 [mlir] Restrict dialect doc gen to a single dialect
In the overwhelmingly majority of cases only one dialect is generated at a time
anyways, and this restriction more easily catches user error when multiple
dialects might be generated. We hit this semi-recently with the PDL dialect,
and circt+other downstream users are also actively hitting this as well.

Differential Revision: https://reviews.llvm.org/D125651
2022-05-16 15:35:07 -07:00
Alex Zinenko 18fc395909 [mlir] allow for re-registering extension ops
Op registration mechanism does not allow for ops with the same name to be
re-registered. This is okay to avoid name conflicts and debug
double-registration, but may be problematic for dialect extensions that may get
registered several times (unlike dialects that are deduplicated in the
registry). When registering ops through the Transform dialect extension
mechanism, check first if the ops are already registered and only complain in
the case of repeated registration with the same name but different TypeID.

Differential Revision: https://reviews.llvm.org/D125554
2022-05-17 00:03:40 +02:00
Matthias Springer 0b293bf045 [mlir][bufferize] Better propagation of errors
Return immediately when an op bufferization patterns fails.

Differential Revision: https://reviews.llvm.org/D125087
2022-05-16 23:17:01 +02:00
Mogball c8457eb532 [mlir][transforms] Add a topological sort utility and pass
This patch adds a topological sort utility and pass. A topological sort reorders
the operations in a block without SSA dominance such that, as much as possible,
users of values come after their producers.

The utility function sorts topologically the operation range in a given block
with an optional user-provided callback that can be used to virtually break cycles.
The toposort pass itself recursively sorts graph regions under the target op.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D125063
2022-05-16 20:47:30 +00:00
Aart Bik 736c1b66ef [mlir][sparse] introduce complex type to sparse tensor support
This is the first implementation of complex (f64 and f32) support
in the sparse compiler, with complex add/mul as first operations.
Note that various features are still TBD, such as other ops, and
reading in complex values from file. Also, note that the
std::complex<float> had a bit of an ABI issue when passed as
single argument. It is still TBD if better solutions are possible.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D125596
2022-05-16 13:17:36 -07:00
Matthias Springer f287da8a15 [mlir][bufferize] Better user control of layout maps
This changes replaces the `fully-dynamic-layout-maps` options (which was badly named) with two new options:

* `unknown-type-conversion` controls the layout maps on buffer types for which no layout map can be inferred.
* `function-boundary-type-conversion` controls the layout maps on buffer types inside of function signatures.

Differential Revision: https://reviews.llvm.org/D125615
2022-05-16 18:06:13 +02:00
Matthias Springer 12e41d9264 [mlir][bufferize] Infer memref types when possible
Instead of recomputing memref types from tensor types, try to infer them when possible. This results in more precise layout maps.

Differential Revision: https://reviews.llvm.org/D125614
2022-05-16 02:02:08 +02:00
Arnab Dutta 16219f8c94 [MLIR][GPU] Add canonicalizer for gpu.memcpy
Erase gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.

Reviewed By: bondhugula, csigg

Differential Revision: https://reviews.llvm.org/D124257
2022-05-14 19:01:04 +05:30
Christian Sigg 0e3d1ca54a [MLIR][GPU] NFC: simplify kernel operand accessor implementations.
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D125112
2022-05-14 14:14:42 +02:00
Mogball bf8049dc48 [mlir][ods] (NFC) remove erroneous trait 2022-05-14 00:37:34 +00:00
Mogball 70b69c54fa [mlir] Rename Zero* traits to Zero*s
Rename
ZeroResult -> ZeroResults
ZeroSuccessor -> ZeroSuccessors
ZeroRegion -> ZeroRegions

to be in line with ZeroOperands and grammatically correct.
2022-05-14 00:20:28 +00:00
Chris Lattner 1d7b5cd5bf [ParseResult] Mark this as LLVM_NODISCARD (like LogicalResult) and fix issues.
There are a lot of cases where we accidentally ignored the result of some
parsing hook.  Mark ParseResult as LLVM_NODISCARD just like ParseResult is.
This exposed some stuff to clean up, so do.

Differential Revision: https://reviews.llvm.org/D125549
2022-05-13 16:28:53 +01:00
Tres Popp 1dce51b888 [mlir] Add TensorToLinalgPass
This pass is to handle computationally complex operations like
tensor.pad which are not simply lowered to the exact same operation in
the memref dialect.

Differential Revision: https://reviews.llvm.org/D125384
2022-05-13 12:17:22 +02:00
Matthias Springer e9fa559097 [mlir][sparse][NFC] Use RewriterBase/OpBuilder when possible
Most functions do not need a PatternRewriter or ConversionPatternRewriter.

Differential Revision: https://reviews.llvm.org/D125466
2022-05-13 11:37:26 +02:00
Matthias Springer 8f42939a07 [mlir][bufferize][NFC] Make getContiguousMemRefType a static function
No need to expose this as public API anymore.

Differential Revision: https://reviews.llvm.org/D125361
2022-05-13 11:27:43 +02:00
River Riddle c2fb9c29b4 [mlir:Pass] Add support for op-agnostic pass managers
This commit refactors the current pass manager support to allow for
operation agnostic pass managers. This allows for a series of passes
to be executed on any viable pass manager root operation, instead
of one specific operation type. Op-agnostic/generic pass managers
only allow for adding op-agnostic passes.

These types of pass managers are extremely useful when constructing
pass pipelines that can apply to many different types of operations,
e.g., the default inliner simplification pipeline. With the advent of
interface/trait passes, this support can be used to define FunctionOpInterface
pass managers, or other pass managers that effectively operate on
specific interfaces/traits/etc (see #52916 for an example).

Differential Revision: https://reviews.llvm.org/D123536
2022-05-12 13:12:59 -07:00
Chris Lattner 3cce374ee6 Various improvements suggested by river NFC.
Differential Revision: https://reviews.llvm.org/D125471
2022-05-12 16:18:23 +01:00
Chris Lattner f21896f2c6 [DenseElementAttr] Simplify the public API for creating these.
Instead of requiring the client to compute the "isSplat" bit,
compute it internally.  This makes the logic more consistent
and defines away a lot of "elements.size()==1" in the clients.

This addresses Issue #55185

Differential Revision: https://reviews.llvm.org/D125447
2022-05-12 16:18:23 +01:00
Thomas Raoux d02f10d96d [mlir][vector] Add lowering pattern for vector.warp_execute_on_lane_0 op
Add lowering of the vector.warp_execute_on_lane_0 into scf.if plus memory
transfer for the operands and yield values.

This also add an integration test running on GPU warp. The same tests can be
later re-used with different comment lines to tests distribution
transformations.

This is mostly from @springerm contribution.

Differential Revision: https://reviews.llvm.org/D125430
2022-05-12 13:27:43 +00:00
Benjamin Kramer 27dad99622 [mlir][LLVM] Make the nested type restriction on complex constants less aggressive
Complex nested in other types is perfectly fine, just nested structs
aren't supported. Instead of checking whether there's nesting just check
whether the struct we're dealing with is a complex number.

Differential Revision: https://reviews.llvm.org/D125381
2022-05-12 11:47:01 +02:00
Matthias Springer 82ea0d8b82 [mlir][bufferize] Support alloc hoisting across function boundaries
This change integrates the BufferResultsToOutParamsPass into One-Shot Module Bufferization. This improves memory management (deallocation) when buffers are returned from a function.

Note: This currently only works with statically-sized tensors. The generated code is not very efficient yet and there are opportunities for improvment (fewer copies). By default, this new functionality is deactivated.

Differential Revision: https://reviews.llvm.org/D125376
2022-05-12 09:44:07 +02:00
Matthias Springer 011f1b1c1f [mlir][bufferize] Add helpers for templatized DENY filters
We already have templatized ALLOW filters but the DENY filters were missing.

Differential Revision: https://reviews.llvm.org/D125358
2022-05-12 09:18:21 +02:00
Mahesh Ravishankar 8be7e6f56a [mlir][Linalg] Combine canonicalizers that deal with removing dead/redundant args.
`linalg.generic` ops have canonicalizers that either remove arguments
not used in the payload, or redundant arguments. Combine these and
enhance the canonicalization to also remove results that have no use.
This is effectively dead code elimination for Linalg ops.

Differential Revision: https://reviews.llvm.org/D123632
2022-05-12 05:22:30 +00:00
bzcheeseman bc22b5c9a2 [MLIR][Operation] Simplify Operation casting, NFC
We can simplify the code needed to implement dyn_cast/cast/isa support for MLIR operations with documented interfaces via the CastInfo structures. This will also provide an example of how to use CastInfo.

Depends on D123901

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D124963
2022-05-12 00:17:01 -04:00
Matthias Springer 248e113e9f [mlir][bufferize][NFC] Move helper functions to BufferizationOptions
Move helper functions for creating allocs/deallocs/memcpys to BufferizationOptions.

Differential Revision: https://reviews.llvm.org/D125375
2022-05-11 16:23:22 +02:00
Thomas Raoux 15bcc36eed [mlir][gpu] Move async copy ops to NVGPU and add caching hints
Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add more fine grain
caching hint attribute to the op.
Add hint to bypass L1 and hook it up to NVVM op.

Differential Revision: https://reviews.llvm.org/D125244
2022-05-10 22:30:24 +00:00
Nicolas Vasilache 1f23211cb1 [mlir][SCF] Retire `cloneWithNewYields` helper function.
This is now subsumed by `replaceLoopWithNewYields`.

Differential Revision: https://reviews.llvm.org/D125309
2022-05-10 18:44:11 +00:00
Mahesh Ravishankar 567fd523bf [mlir][SCF] Add utility method to add new yield values to a loop.
The current implementation of `cloneWithNewYields` has a few issues
- It clones the loop body of the original loop to create a new
  loop. This is very expensive.
- It performs `erase` operations which are incompatible when this
  method is called from within a pattern rewrite. All erases need to
  go through `PatternRewriter`.

To address these a new utility method `replaceLoopWithNewYields` is added
which
- moves the operations from the original loop into the new loop.
- replaces all uses of the original loop with the corresponding
  results of the new loop
- use a call back to allow caller to generate the new yield values.
- the original loop is modified to just yield the basic block
  arguments corresponding to the iter_args of the loop. This
  represents a no-op loop. The loop itself is dead (since all its uses
  are replaced), but is not removed. The caller is expected to erase
  the op. Consequently, this method can be called from within a
  `matchAndRewrite` method of a `PatternRewriter`.

The `cloneWithNewYields` could be replaces with
`replaceLoopWithNewYields`, but that seems to trigger a failure during
walks, potentially due to the operations being moved. That is left as
a TODO.

Differential Revision: https://reviews.llvm.org/D125147
2022-05-10 18:44:11 +00:00
Krzysztof Drewniak f1f05a91ca [MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics
By analogy with the NVGPU dialect, introduce an AMDGPU dialect for
AMD-specific intrinsic wrappers.

The dialect initially includes wrappers around the raw buffer intrinsics.

On AMD GPUs, a memref can be converted to a "buffer descriptor" that
allows more precise control of memory access, such as by allowing for
out of bounds loads/stores to be replaced by 0/ignored without adding
additional conditional logic, which is important for performance.

The repository currently contains a limited conversion from
transfer_read/transfer_write to Mubuf intrinsics, which are an older,
deprecated intrinsic for the same functionality.

The new amdgpu.raw_buffer_* ops allow these operations to be used
explicitly and for including metadata such as whether the target
chipset is an RDNA chip or not (which impacts the interpretation of
some bits in the buffer descriptor), while still maintaining an
MLIR-like interface.

(This change also exposes the floating-point atomic add intrinsic.)

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D122765
2022-05-10 14:59:58 +00:00
Stella Stamenova 784a5bccfd [mlir] Fix python bindings build on Windows in Debug
Currently, building mlir with the python bindings enabled on Windows in Debug is broken because pybind11, python and cmake don't like to play together. This change normalizes how the three interact, so that the builds can now run and succeed.

The main issue is that python and cmake both make assumptions about which libraries are needed in a Windows build based on the flavor.
- cmake assumes that a debug (or a debug-like) flavor of the build will always require pythonX_d.lib and provides no option/hint to tell it to use a different library. cmake does find both the debug and release versions, but then uses the debug library.
- python (specifically pyconfig.h and by extension python.h) hardcodes the dependency on pythonX_d.lib or pythonX.lib depending on whether `_DEBUG` is defined. This is NOT transparent - it does not show up anywhere in the build logs until the link step fails with `pythonX_d.lib is missing` (or `pythonX.lib is missing`)
- pybind11 tries to "fix" this by implementing a workaround - unless Py_DEBUG is defined, `_DEBUG` is explicitly undefined right before including python headers. This also requires some windows headers to be included differently, so while clever, this is a non-trivial workaround.

mlir itself includes the pybind11 headers (which contain the workaround) AS WELL AS python.h, essentially always requiring both pythonX.lib and pythonX_d.lib for linking. cmake explicitly only adds one or the other, so the build fails.

This change does a couple of things:
- In the cmake files, explicitly add the release version of the python library on Windows builds regardless of flavor. Since Py_DEBUG is not defined, pybind11 will always require release and it will be satisfied
- To satisfy python as well, this change removes any explicit inclusions of Python.h on Windows instead relying on the fact that pybind11 headers will bring in what is needed

There are a few additional things that we could do but I rejected as unnecessary at this time:
- define Py_DEBUG based on the CMAKE_BUILD_TYPE - this will *mostly* work, we'd have to think through multiconfig generators like VS, but it's possible. There doesn't seem to be a need to link against debug python at the moment, so I chose not to overcomplicate the build and always default to release
- similar to above, but define Py_DEBUG based on the CMAKE_BUILD_TYPE *as well as* the presence of the debug python library (`Python3_LIBRARY_DEBUG`). Similar to above, this seems unnecessary right now. I think it's slightly better than above because most people don't actually have the debug version of python installed, so this would prevent breaks in that case.
- similar to the two above, but add a cmake variable to control the logic
- implement the pybind11 workaround directly in mlir (specifically in Interop.h) so that Python.h can still be included directly. This seems prone to error and a pain to maintain in lock step with pybind11
- reorganize how the pybind11 headers are included and place at least one of them in Interop.h directly, so that the header has all of its dependencies included as was the original intention. I decided against this because it really doesn't need pybind11 logic and it's always included after pybind11 is, so we don't necessarily need the python includes

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D125284
2022-05-09 19:46:47 -07:00
Thomas Raoux 09fc685ce6 [mlir][nvvm] Add attribute to nvvm.cpAsyncOp to control l1 bypass
Add attribute to be able to generate the intrinsic version of async copy
generating a copy with l1 bypass. This correspond to
cp.async.cg.shared.global in ptx.

Differential Revision: https://reviews.llvm.org/D125241
2022-05-09 19:34:48 +00:00
Jakub Tucholski 167bbfcb9d [mlir] Refactoring dialect and test code to use parseCommaSeparatedList
Issue #55173

Reviewed By: lattner, rriddle

Differential Revision: https://reviews.llvm.org/D124791
2022-05-09 12:36:54 -04:00
Jerry Wu ad7c49bef7 [mlir][linalg] Fix padding size calculation for Conv2d ops.
This patch fixed the padding size calculation for Conv2d ops when the stride > 1. It contains the changes below:

- Use addBound to add constraint for AffineApplyOp in getUpperBoundForIndex. So the result value can be mapped and retrieved later.

- Fixed the bound from AffineMinOp by adding as a closed bound. Originally the bound was added as an open upper bound, which results in the incorrect bounds when we multiply the values. For example:

```
%0 = affine.min affine_map<()[s0] -> (4, -s0 + 11)>()[iv0]
%1 = affine.apply affine_map<()[s0] -> (s0 * 2)>()[%0]

If we add the affine.min as an open bound, addBound will internally transform it into the close bound "%0 <= 3". The following sliceBounds will derive the bound of %1 as "%1 <= 6" and return the open bound "%1 < 7", while the correct bound should be "%1 <= 8".
```

- In addition to addBound, I also changed sliceBounds to support returning closed upper bound, since for the size computation, we usually care about the closed bounds.

- Change the getUpperBoundForIndex to favor constant bounds when required. The sliceBounds will return a tighter but non-constant bounds, which can't be used for padding. The constantRequired option requires getUpperBoundForIndex to get the constant bounds when possible.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D124821
2022-05-09 08:45:37 -07:00
River Riddle a8308020ac [mlir] Remove special case parsing/printing of `func` operations
This was leftover from when the standard dialect was destroyed, and
when FuncOp moved to the func dialect. Now that these transitions
have settled a bit we can drop these.

Most updates were handled using a simple regex: replace `^( *)func` with `$1func.func`

Differential Revision: https://reviews.llvm.org/D124146
2022-05-06 13:36:15 -07:00
Mehdi Amini 072e0aabbc Enable the use of ThreadPoolTaskGroup in MLIR threading helper to enable nested parallelism
The LLVM ThreadPool recently got the addition of the concept of
ThreadPoolTaskGroup: this is a way to "partition" the threadpool
into a group of tasks and enable nested parallelism through this
grouping at every level of nesting.
We make use of this feature in MLIR threading abstraction to fix a long
lasting TODO and enable nested parallelism.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D124902
2022-05-06 19:40:22 +00:00
Matthias Springer 988748c077 [mlir][bufferize] Do not copy buffers with undefined contents
Buffers with undefined contents (e.g., the result of an init_tensor) are no longer copied.

Differential Revision: https://reviews.llvm.org/D125015
2022-05-06 17:31:01 +09:00
Aart Bik 952fa3018e [mlir][sparse] add more zero-preserving unary ops to sparse compiler
Although we now have semi-rings to deal with arbitrary ops,
it is still good to convey zero-preserving semantics of
ops to the sparse compiler.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D125043
2022-05-05 15:35:19 -07:00
Stella Stamenova d4555698f8 [mlir] Fix the names of exported functions
The names of the functions that are supposed to be exported do not match the implementations. This is due in part to cac7aabbd8.

This change makes the implementations and declarations match and adds a couple missing declarations.

The new names follow the pattern of the existing `verify` functions where the prefix is maintained as `_mlir_ciface_` but the suffix follows the new naming convention.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D124891
2022-05-05 13:46:15 -07:00
Alexander Belyaev e8f7d019fc [mlir] Add a flag to allow equivalent results.
Differential Revision: https://reviews.llvm.org/D124931
2022-05-04 17:48:18 +02:00
Matthias Springer b34ea97f55 [mlir][linalg][bufferize][NFC] Remove remaining Comprehensive Bufferize code
This commit removes the Linalg Comprehensive Bufferize pass.

Differential Revision: https://reviews.llvm.org/D124854
2022-05-04 17:19:44 +09:00
Matthias Springer 5f60c4825b [mlir][linalg][bufferize][NFC] Make init_tensor elimination a separate pre-processing pass
This commit decouples init_tensor elimination from the rest of the bufferization.

Differential Revision: https://reviews.llvm.org/D124853
2022-05-04 17:17:27 +09:00
Stella Stamenova 5f14aee3bb [mlir] Fix Visual Studio warnings
There are only a couple of warnings when compiling with VS on Windows. This fixes the last remaining warnings so that we can enable LLVM_ENABLE_WERROR on the mlir windows bot.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D124862
2022-05-03 14:12:15 -07:00
Jim Kitchen 2c33266084 [mlir][sparse] Add lowering for unary and binary ops
Adding lowering for Unary and Binary required several changes due to
their unique nature of containing custom code for different "regions"
of the sparse structure being operated on. Along with a Kind, a pointer
to the Operation is passed along to be merged once the lattice
structure is figured out.

The original operation is maintained, as it is required for subsequent
lattice decisions. However, sparse_tensor.binary has some branches
are considered as fully handled and therefore are marked with as
kBinaryBranch to distinguish them.

A unique aspect of the custom code is that sometimes the desired result
is no result at all -- i.e. a user wants overlapping sparse entries to
become empty in the output. The solution to this is to return an
uninitialized Value(), which is checked and handled elsewhere in the
code and results in nothing being written to the output tensor for that
case.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D123057
2022-05-03 15:50:26 -05:00
Goran Flegar 672b908bca [mlir] Add sin & cos ops to complex dialect
Also adds conversions for those ops to math + arith.

Differential Revision: https://reviews.llvm.org/D124773
2022-05-03 19:36:12 +02:00
Hanhan Wang 919e459f1b [Linalg] Remove Optional from getStaticLoopRanges interface method.
It is very wrong if the ranges can't be infered. It's also checked in
verifyStructuredOpInterface, so we don't need the Optional return type.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D124596
2022-05-03 05:12:54 -07:00
Alex Zinenko 6c57b0debe [mlir] improve and test TransformState::Extension
Add the mechanism for TransformState extensions to update the mapping between
Transform IR values and Payload IR operations held by the state. The mechanism
is intentionally restrictive, similarly to how results of the transform op are
handled.

Introduce test ops that exercise a simple extension that maintains information
across the application of multiple transform ops.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D124778
2022-05-03 11:33:00 +02:00
Raghu Maddhipatla c685f82126 [mlir][OpenMP] Add omp.cancel and omp.cancellationpoint.
Reviewed By: kiranchandramohan, peixin, shraiysh

Differential Revision: https://reviews.llvm.org/D123828
2022-05-02 12:23:38 -05:00
Eugene Zhulenev 38d0df5577 [mlir] CRunnerUtils: qualify UnrankedMemRefType to avoid collisions with mlir::UnrankedMemRefType
When CRunnerUtils included together with MLIR IR headers, it can lead to compilation errors.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D124744
2022-05-02 10:19:02 -07:00
Shraiysh Vaishay a60fda59dc [mlir][OpenMP] Restrict types for omp.parallel args
This patch restricts the value of `if` clause expression to an I1 value.
It also restricts the value of `num_threads` clause expression to an I32
value.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D124142
2022-05-02 14:17:34 +05:30
Alex Zinenko 946311b893 [mlir] support isa/cast/dyn_cast<Operation *>(operation)
This enables one to write generic code that can be instantiated for both
specific operation classes and the common base class without
specialization. Examples include functions that take/return ops, such
as:

```mlir
template <typename FnTy>
void applyIf(FnTy &&lambda, ...) {
  for (Operation *op : ...) {
    auto specific = dyn_cast<function_traits<FnTy>::template arg_t<0>>(op);
    if (specific)
      lambda(specific);
  }
}
```

that would otherwise need to rely on template specialization to support
lambdas that take specific operations and those that take `Operation *`.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D124675
2022-05-02 10:07:09 +02:00
River Riddle 3c75228991 [mlir:PDLInterp] Refactor the implementation of result type inferrence
The current implementation uses a discrete "pdl_interp.inferred_types"
operation, which acts as a "fake" handle to a type range. This op is
used as a signal to pdl_interp.create_operation that types should be
inferred. This is terribly awkward and clunky though:

* This op doesn't have a byte code representation, and its conversion
  to bytecode kind of assumes that it is only used in a certain way. The
  current lowering is also broken and seemingly untested.

* Given that this is a different operation, it gives off the assumption
  that it can be used multiple times, or that after the first use
  the value contains the inferred types. This isn't the case though,
  the resultant type range can never actually be used as a type range.

This commit refactors the representation by removing the discrete
InferredTypesOp, and instead adds a UnitAttr to
pdl_interp.CreateOperation that signals when the created operations
should infer their types. This leads to a much much cleaner abstraction,
a more optimal bytecode lowering, and also allows for better error
handling and diagnostics when a created operation doesn't actually
support type inferrence.

Differential Revision: https://reviews.llvm.org/D124587
2022-05-01 12:25:05 -07:00
Chris Lattner d85eb4e2d6 [AsmParser] Introduce a new "Argument" abstraction + supporting logic
MLIR has a common pattern for "arguments" that uses syntax
like `%x : i32 {attrs} loc("sourceloc")` which is implemented
in adhoc ways throughout the codebase.  The approach this uses
is verbose (because it is implemented with parallel arrays) and
inconsistent (e.g. lots of things drop source location info).

Solve this by introducing OpAsmParser::Argument and make addRegion
(which sets up BlockArguments for the region) take it.  Convert the
world to propagating this down.  This means that we correctly
capture and propagate source location information in a lot more
cases (e.g. see the affine.for testcase example), and it also
simplifies much code.

Differential Revision: https://reviews.llvm.org/D124649
2022-04-29 12:19:34 -07:00
Matthias Springer 3c2a74a3ae [mlir][linalg][transform] Add TileOp to transform dialect
This commit adds a tiling op to the transform dialect as an external op.

Differential Revision: https://reviews.llvm.org/D124661
2022-04-29 21:35:31 +09:00
River Riddle 9613a850b6 [mlir:PDL] Rework errors for pdl.operations with non-inferrable results
We currently emit an error during verification if a pdl.operation with non-inferrable
results is used within a rewrite. This allows for catching some errors during compile
time, but is slightly broken. For one, the verification at the PDL level assumes that
all dialects have been loaded, which is true at run time, but may not be true when
the PDL is generated (such as via PDLL). This commit fixes this by not emitting the
error if the operation isn't registered, i.e. it uses the `mightHave` variant of trait/interface
methods.

Secondly, we currently don't verify when a pdl.operation has no explicit results, but the
operation being created is known to expect at least one. This commit adds a heuristic
error to detect these cases when possible and fail. We can't always capture when the user
made an error, but we can capture the most common case where the user expected an
operation to infer its result types (when it actually isn't possible).

Differential Revision: https://reviews.llvm.org/D124583
2022-04-28 12:58:00 -07:00
River Riddle d4381b3f93 [mlir:PDL] Fix a syntax ambiguity in pdl.attribute
pdl.attribute currently has a syntax ambiguity that leads to the incorrect parsing
of pdl.attribute operations with locations that don't also have a constant value. For example:

```
pdl.attribute loc("foo")
```

The above IR is treated as being a pdl.attribute with a constant value containing the location,
`loc("foo")`, which is incorrect. This commit changes the syntax to use `= <constant-value>` to
clearly distinguish when the constant value is present, as opposed to just trying to parse an attribute.

Differential Revision: https://reviews.llvm.org/D124582
2022-04-28 12:57:59 -07:00
River Riddle 92a836da07 [mlir] Attach InferTypeOpInterface on SameOperandsAndResultType operations when possible
This allows for inferring the result types of operations in certain situations by using the type of
an operand. This commit allowed for automatically supporting type inference for many more
operations with no additional effort, e.g. nearly all Arithmetic operations now support
result type inferrence with no additional changes.

Differential Revision: https://reviews.llvm.org/D124581
2022-04-28 12:57:59 -07:00
River Riddle 1bd1edaf40 [mlir:ODS] Support using attributes in AllTypesMatch to automatically add InferTypeOpInterface
This allows for using attribute types in result type inference for use with
InferTypeOpInterface. This was a TODO before, but it isn't much
additional work to properly support this. After this commit,
arith::ConstantOp can now have its InferTypeOpInterface implementation automatically
generated.

Differential Revision: https://reviews.llvm.org/D124580
2022-04-28 12:57:59 -07:00
Chris Lattner 99499c3ea7 [OpAsmParser] Simplify logic for requiredOperandCount in parseOperandList.
I would ideally like to eliminate 'requiredOperandCount' as a bit of
verification that should be in the client side, but it is much more
widely used than I expected.  Just tidy some pieces up around it given
we can't drop it immediately.

NFC.

Differential Revision: https://reviews.llvm.org/D124629
2022-04-28 12:05:10 -07:00
Chris Lattner 5dedf911de [AsmParser] Rework logic around "region argument parsing"
The asm parser had a notional distinction between parsing an
operand (like "%foo" or "%4#3") and parsing a region argument
(which isn't supposed to allow a result number like #3).

Unfortunately the implementation has two problems:

1) It didn't actually check for the result number and reject
   it.  parseRegionArgument and parseOperand were identical.
2) It had a lot of machinery built up around it that paralleled
   operand parsing.  This also was functionally identical, but
   also had some subtle differences (e.g. the parseOptional
   stuff had a different result type).

I thought about just removing all of this, but decided that the
missing error checking was important, so I reimplemented it with
a `allowResultNumber` flag on parseOperand.  This keeps the
codepaths unified and adds the missing error checks.

Differential Revision: https://reviews.llvm.org/D124470
2022-04-28 11:12:44 -07:00
Marius Brehler 84fe39a45b [mlir][emitc] Add a cast op
This adds a cast operation that allows to perform an explicit type
conversion. The cast op is emitted as a C-style cast. It can be applied
to integer, float, index and EmitC types.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D123514
2022-04-28 15:50:59 +00:00
Mathieu Fehr 88bc24a7e3 [mlir] Allow setting operation legality with an OperationName
This is necessary to handle conversions of operations defined at runtime in extensible dialects.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D124353
2022-04-27 08:54:51 -07:00
Mathieu Fehr 9e0b553359 [mlir] Add extensible dialects
Depends on D104534
Add support for extensible dialects, which are dialects that can be
extended at runtime with new operations and types.

These operations and types cannot at the moment implement traits
or interfaces.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D104554
2022-04-26 19:48:22 -07:00
River Riddle 41d2c6df5c [mlir][PDLL-LSP] Add code completion for include file paths
This allows for providing completion results for include directive
file paths by searching the set of include directories for the current
file.

Differential Revision: https://reviews.llvm.org/D124112
2022-04-26 18:33:17 -07:00
River Riddle b3fc0fa84a [mlir][PDLL] Don't use the result of `Constraint::getDefName()` when uniquing
In the case of anonymous defs this may return the name of the base def class,
which can lead to two different defs with the same name (which hits an assert).
This commit adds a new `getUniqueDefName` method that returns a unique name
for the constraint.

Differential Revision: https://reviews.llvm.org/D124074
2022-04-26 18:33:16 -07:00
Yi Zhang e1318078a4 Support non identity layout map for reshape ops in MemRefToLLVM lowering
This change borrows the ideas from `computeExpanded/CollapsedLayoutMap`
and computes the dynamic strides at runtime for the memref descriptors.

Differential Revision: https://reviews.llvm.org/D124001
2022-04-26 13:03:53 -04:00
Alex Zinenko b84f95fe53 [mlir] Fix -Wunused-private-field in the Transform dialect
Classes derived from TransformState::Extension may need access to the
parent state.
2022-04-26 14:05:24 +02:00
Krzysztof Drewniak d35f7f254f [mlir] Allow data flow analysis of non-control flow branch arguments
This commit adds the visitNonControlFlowArguments method to
DataFlowAnalysis, allowing analyses to provide lattice values for the
arguments to a RegionSuccessor block that aren't directly tied to an
op's inputs. For example, integer range interface can use this method
to infer bounds for the step values in loops.

This method has a default implementation that keeps the old behavior
of assigning a pessimistic fixedpoint state to all such arguments.

Reviewed By: Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D124021
2022-04-25 20:19:34 +00:00
Jeremy Furtek a266a21000 [mlir][ods] Extend the EnumAttr tablegen class to support BitEnum attributes
This diff allows the EnumAttr class to be used for bit enum attributes (in
addition to previously supported integer enum attributes). While integer
and bit enum attributes share many common implementation aspects, parsing
bit enum values requires a separate implementation. This is accomplished
by creating empty parser and printer strings in the EnumAttrInfo record,
and having derived classes (specific to bit and integer enums) override with
an appropriate parser/printer string.

To support existing bit enums that may use a vertical bar separator, the
parser is modified to support the | token.

Tests were added for bit enums alongside integer enums.

Future diffs for fastmath attributes in the arithmetic dialect will use these
changes.

(resubmission of earlier abaondoned diff, updated to reflect subsequent changes
in the repository)

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D123880
2022-04-25 19:00:00 +00:00
jfurtek 4e5dee2f30 [mlir][ods] Add tablegen field for concise printing of BitEnum attributes
This diff introduces a tablegen field for bit enum attributes
(`printBitEnumPrimaryGroups`) to control printing when the enum uses "group"
cases. An example would be an implementation that uses a `fastmath` enum value
as an alias for individual fastmath flags. The proposed field would allow
printing of simply `fast` for the enum value, instead of the more verbose list
that would include `fast` as well as the individual flags (e.g. `reassoc,nnan,
ninf,nsz,arcp,contract,afn,fast`).

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D123871
2022-04-25 18:48:35 +00:00
Markus Böck 12a2716953 [mlir][LLVM] Support opaque pointers in `llvm.mlir.addressof`
The verifier of llvm.mlir.addressof did not properly account for opaque pointers, that is, the pointer type not having an element type equal to the type of the referenced global or function. This patch fixes that by skipping the test for the element type if the pointer is opaque.

Differential Revision: https://reviews.llvm.org/D124333
2022-04-25 12:23:16 +02:00
Nick Kreeger 4620032ee3 Revert "[mlir][sparse] Expose SpareTensor passes as enums instead of opaque numbers for vectorization and parallelization options."
This reverts commit d59cf901cb.

Build fails on NVIDIA Sparse tests:
https://lab.llvm.org/buildbot/#/builders/61/builds/25447
2022-04-23 20:14:48 -05:00
Nick Kreeger d59cf901cb [mlir][sparse] Expose SpareTensor passes as enums instead of opaque numbers for vectorization and parallelization options.
The SparseTensor passes currently use opaque numbers for the CLI, despite using an enum internally. This patch exposes the enums instead of numbered items that are matched back to the enum.

Fixes GitHub issue #53389

Reviewed by: aartbik, mehdi_amini

Differential Revision: https://reviews.llvm.org/D123876
2022-04-23 19:16:57 -05:00
Matthias Springer 48b8edac1c [mlir][bufferize][NFC] Remove old references to Comprehensive Bufferize
Differential Revision: https://reviews.llvm.org/D124324
2022-04-23 18:01:05 +09:00
River Riddle eda6f907d2 [mlir][NFC] Shift a bunch of dialect includes from the .h to the .cpp
Now that dialect constructors are generated in the .cpp file, we can
drop all of the dependent dialect includes from the .h file.

Differential Revision: https://reviews.llvm.org/D124298
2022-04-23 01:09:29 -07:00
Alex Zinenko 40a8bd635b [mlir] use side effects in the Transform dialect
Currently, the sequence of Transform dialect operations only supports a single
use of each operand (verified by the `transform.sequence` operation). This was
originally motivated by the need to guard against accessing a payload IR
operation associated with a transform IR value after this operation has likely
been rewritten by a transformation. However, not all Transform dialect
operations rewrite payload IR, in particular the "navigation" operation such as
`transform.pdl_match` do not.

Introduce memory effects to the Transform dialect operations to describe their
effect on the payload IR and the mapping between payload IR opreations and
transform IR values. Use these effects to replace the single-use rule, allowing
repeated reads and disallowing use-after-free, where operations with the "free"
effect are considered to "consume" the transform IR value and rewrite the
corresponding payload IR operations). As an additional improvement, this
enables code motion transformation on the transform IR itself.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124181
2022-04-22 23:29:11 +02:00
cpillmayer 3e8560f890 [MLIR] Add option to print users of an operation as comment in the printer
This allows printing the users of an operation as proposed in the git issue #53286.
To be able to refer to operations with no result, these operations are assigned an
ID in SSANameState.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D124048
2022-04-22 18:58:10 +00:00
Jacques Pienaar 9bae20b528 [mlir] Add shape.func
Add shape func op for use (primarily) in shape function_library op. Allows
setting default dialect for some simpler authoring. This is a minimal version
of the ops needed.

Differential Revision: https://reviews.llvm.org/D124055
2022-04-22 11:35:35 -07:00
Lei Zhang 6f28fd0bf7 [mlir][vector] Fold 1-element reduction into extract or arith ops
If there is only one single element in the vector, then we can
just extract the element to compute the final result.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D124129
2022-04-22 14:24:46 -04:00
Matthias Springer d6dab38ae4 [mlir][bufferize][NFC] Add function boundary bufferization flag to BufferizationOptions
This makes the API easier to use. Also allows us to check for incorrect API usage for easier debugging.

Differential Revision: https://reviews.llvm.org/D124265
2022-04-23 01:11:37 +09:00
Matthias Springer b0b19fae81 [mlir][bufferize][NFC] Rewrite op filter logic
The `hasFilter` field is not needed. Instead, the filter accepts ops by default if no ALLOW rule was specified.

Differential Revision: https://reviews.llvm.org/D124264
2022-04-23 00:25:24 +09:00
Matthias Springer e07a7fd5c0 [mlir][bufferization] Move ModuleBufferization to bufferization dialect
* Move Module Bufferization to the bufferization dialect. The implementation is split into `OneShotModuleBufferize.cpp` and `FuncBufferizableOpInterfaceImpl.cpp`, so that the external model implementation can be easily moved to the func dialect in the future.
* Split and clean up test cases. A few test cases are still remaining in Linalg and will be updated separately.
* `linalg.inplaceable` is renamed to `bufferization.writable` to accurately reflect its current usage.
* Attributes and their verifiers are moved from the Linalg dialect to the Bufferization dialect.
* Expand documentation.
* Add a new flag to One-Shot Bufferize to allow for function boundary bufferization.

Differential Revision: https://reviews.llvm.org/D122229
2022-04-22 19:37:28 +09:00
Matthias Springer d820acdde1 [mlir][bufferize][NFC] Use custom walk instead of GreedyPatternRewriter
The bufferization driver was previously using a GreedyPatternRewriter. This was problematic because bufferization must traverse ops top-to-bottom. The GreedyPatternRewriter was previously configured via `useTopDownTraversal`, but this was a hack; this API was just meant for performance improvements and should not affect the result of the rewrite.

BEGIN_PUBLIC
No public commit message needed.
END_PUBLIC

Differential Revision: https://reviews.llvm.org/D123618
2022-04-22 18:23:09 +09:00
Adrian Kuegel a74e5a89b9 [mlir] Move isGuaranteedCollapsible to CollapseShapeOp (NFC).
It seems more natural than to have it as a static method of ExpandShapeOp.
Also fix a typo ("the the" -> "the").

Differential Revision: https://reviews.llvm.org/D124234
2022-04-22 10:31:25 +02:00
Mahesh Ravishankar 0c090dcc8a [mlir][Linalg] Deprecate legacy reshape + generic op folding patterns.
These patterns have been superceded by the fusion by collapsing patterns.

Differential Revision: https://reviews.llvm.org/D124145
2022-04-21 22:25:23 +00:00
Chris Lattner 31c8abc3f1 [AsmParser/Printer] Rework sourceloc support for function arguments.
When Location tracking support for block arguments was added, we
discussed various approaches to threading support for this through
function-like argument parsing.  At the time, we added a parallel array
of locations that could hold this.  It turns out that that approach was
verbose and error prone, roughly no one adopted it.

This patch takes a different approach, adding an optional source
locator to the UnresolvedOperand class.  This fits much more naturally
into the standard structure we use for representing locators, and gives
all the function like dialects locator support for free (e.g. see the
test adding an example for the LLVM dialect).

Differential Revision: https://reviews.llvm.org/D124188
2022-04-21 12:43:36 -07:00
Alex Zinenko 0edb262d91 [mlir] enable doc generation for the transform dialect 2022-04-21 18:52:08 +02:00
Fangrui Song ae46b3e01f Revert D121279 "[MLIR][GPU] Add canonicalizer for gpu.memcpy"
This reverts commit 12f55cac69.

Causes miscompile. Will follow up with a reproduce.
2022-04-21 08:55:13 -07:00
Alex Zinenko 30f22429d3 [mlir] Connect Transform dialect to PDL
This introduces a pair of ops to the Transform dialect that connect it to PDL
patterns. Transform dialect relies on PDL for matching the Payload IR ops that
are about to be transformed. For this purpose, it provides a container op for
patterns, a "pdl_match" op and transform interface implementations that call
into the pattern matching infrastructure.

To enable the caching of compiled patterns, this also provides the extension
mechanism for TransformState. Extensions allow one to store additional
information in the TransformState and thus communicate it between different
Transform dialect operations when they are applied. They can be added and
removed when applying transform ops. An extension containing a symbol table in
which the pattern names are resolved and a pattern compilation cache is
introduced as the first client.

Depends On D123664

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124007
2022-04-21 16:23:10 +02:00
Markus Böck 850b2c6b3c [mlir] Fix `Region`s `takeBody` method if the region is not empty
The current implementation of takeBody first clears the Region, before then taking ownership of the blocks of the other regions. The issue here however, is that when clearing the region, it does not take into account references of operations to each other. In particular, blocks are deleted from front to back, and operations within a block are very likely to be deleted despite still having uses, causing an assertion to trigger [0].

This patch fixes that issue by simply calling dropAllReferences()before clearing the blocks.

[0] 9a8bb4bc63/mlir/lib/IR/Operation.cpp (L154)

Differential Revision: https://reviews.llvm.org/D123913
2022-04-21 15:32:59 +02:00
Markus Böck a41aaf166f [mlir] Make `Regions`s `cloneInto` multithread-readable
Prior to this patch, `cloneInto` would do a simple walk over the blocks and contained operations and clone and map them as it encounters them. As finishing touch it then remaps any successor and operands it has remapped during that process.

This is generally fine, but sadly leads to a lot of uses of both operations and blocks from the source region, in the cloned operations in the target region. Those uses lead to writes in the use-def list of the operations, making `cloneInto` never thread safe.

This patch reimplements `cloneInto` in three steps to avoid ever creating any extra uses on elements in the source region:
* It first creates the mapping of all blocks and block operands
* It then clones all operations to create the mapping of all operation results, but does not yet clone any regions or set the operands
* After all operation results have been mapped, it now sets the operations operands and clones their regions.

That way it is now possible to call `cloneInto` from multiple threads if the Region or Operation is isolated-from-above. This allows creating copies of  functions or to use `mlir::inlineCall` with the same source region from multiple threads. In the general case, the method is thread-safe if through cloning, no new uses of `Value`s from outside the cloned Operation/Region are created. This can be ensured by mapping any outside operands via the `BlockAndValueMapping` to `Value`s owned by the caller thread.

While I was at it, I also reworked the `clone` method of `Operation` a little bit and added a proper options class to avoid having a `cloneWithoutRegionsAndOperands` method, and be more extensible in the future. `cloneWithoutRegions` is now also a simple wrapper that calls `clone` with the proper options set. That way all the operation cloning code is now contained solely within `clone`.

Differential Revision: https://reviews.llvm.org/D123917
2022-04-21 13:43:00 +02:00
Uday Bondhugula f47a38f517 Add async dependencies support for gpu.launch op
Add async dependencies support for gpu.launch op: this allows specifying
a list of async tokens ("streams") as dependencies for the launch.

Update the GPU kernel outlining pass lowering to propagate async
dependencies from gpu.launch to gpu.launch_func op. Previously, a new
stream was being created and destroyed for a kernel launch. The async
deps support allows the kernel launch to be serialized on an existing
stream.

Differential Revision: https://reviews.llvm.org/D123499
2022-04-21 16:25:59 +05:30
Nimish Mishra 00c511b351 Added lowering support for atomic read and write constructs
This patch adds lowering support for atomic read and write constructs.
Also added is pointer modelling code to allow FIR pointer like types to
be inferred and converted while lowering.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D122725

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
2022-04-21 12:19:13 +05:30
River Riddle 8ae83bb8be [mlir][NFC] Update textual references of `func` to `func.func` in ODS documentation
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:26 -07:00
Shraiysh Vaishay 88bb2521b0 [mlir][OpenMP] Add checks and tests for hint clause and fix empty hint
This patch handles empty hint value for critical and atomic constructs.

This also adds checks and tests for hint clause on atomic constructs.

Reviewed By: peixin, kiranchandramohan, NimishMishra

Differential Revision: https://reviews.llvm.org/D123186
2022-04-21 07:31:03 +05:30
Uday Bondhugula d423fc3724 Add RegionBranchOpInterface on affine.for op
Add RegionBranchOpInterface on affine.for op so that transforms relying
on RegionBranchOpInterface can support affine.for. E.g.:
buffer-deallocation pass.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D123568
2022-04-20 17:46:07 +05:30
jacquesguan 590a38920f [mlir][LLVMIR] Add vector predication type cast intrinsic ops.
This patch adds vector predication type cast intrinsic ops.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123996
2022-04-20 02:11:14 +00:00
Krzysztof Drewniak ddc2eb0ada [mlir] Adds getUpperBound() to LoopLikeInterface.
getUpperBound is analogous to getLowerBound(), except for the upper
bound, and is used in range analysis.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124020
2022-04-19 19:56:44 +00:00
Alex Zinenko 0eb403ad1b [mlir][transform] Introduce transform.sequence op
Sequence is an important transform combination primitive that just indicates
transform ops being applied in a row. The simplest version requires fails
immediately if any transformation in the sequence fails. Introducing this
operation allows one to start placing transform IR within other IR.

Depends On D123135

Reviewed By: Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D123664
2022-04-19 21:41:02 +02:00
Ashay Rane 25c218be36 [MLIR] Add function to create BFloat16 array attribute
This patch adds a new function `mlirDenseElementsAttrBFloat16Get()`,
which accepts the shaped type, the number of BFloat16 values, and a
pointer to an array of BFloat16 values, each of which is a `uint16_t`
value.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D123981
2022-04-19 19:27:06 +00:00
Arnab Dutta 12f55cac69 [MLIR][GPU] Add canonicalizer for gpu.memcpy
Fold away gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D121279
2022-04-19 17:54:00 +05:30
Matthias Springer 0f4ba02db3 [mlir][interfaces] Add helpers for detecting recursive regions
Add helper functions to check if an op may be executed multiple times based on RegionBranchOpInterface.

Differential Revision: https://reviews.llvm.org/D123789
2022-04-19 16:13:32 +09:00
Groverkss 15650b320b [MLIR][Presburger] Remove inheritence in MultiAffineFunction
This patch removes inheritence of MultiAffineFunction from IntegerPolyhedron
and instead makes IntegerPolyhedron as a member.

This patch removes virtualization in MultiAffineFunction and also removes
unnecessary functions inherited from IntegerPolyhedron.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123921
2022-04-19 01:25:13 +05:30
River Riddle 58ceae9561 [mlir:NFC] Remove the forward declaration of FuncOp in the mlir namespace
FuncOp has been moved to the `func` namespace for a little over a month, the
using directive can be dropped now.
2022-04-18 12:01:55 -07:00
Groverkss 4ffd0b6fde [MLIR][Presburger] Make IntegerRelation::mergeLocalIds not delete duplicates
This patch modifies mergeLocalIds to not delete duplicate local ids in
`this` relation. This allows the ordering of the final local ids for `this`
to be determined more easily, which is generally required when other objects
refer to these local ids.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D123866
2022-04-18 10:15:35 +05:30
Chris Lattner cac19f4141 [LogicalResult.h] Move ParseResult to the bottom of file and fix comment, NFC.
This was review feedback that I missed in the phab review:
https://reviews.llvm.org/D123760
2022-04-17 15:34:26 -07:00
Chris Lattner 81b2dc548b [Support] Move ParseResult from OpDefinition.h to LogicalResult.h
This class is a helper for 'parser-like' use cases of LogicalResult
where the implicit conversion to bool is tolerable.  It is used by the
operation asmparsers, but is more generic functionality that is closely
aligned with LogicalResult.  Hoist it up to LogicalResult.h to make it
more accessible.  This is part of Issue #54884

Differential Revision: https://reviews.llvm.org/D123760
2022-04-17 15:18:33 -07:00
Mehdi Amini d98481a1e7 Revert "[MLIR] Provide a way to print ops in custom form on pass failure"
This reverts commit daabcf5f04.

This patch still had on-going discussion that should be closed before
committing.
2022-04-17 18:55:09 +00:00
Uday Bondhugula daabcf5f04 [MLIR] Provide a way to print ops in custom form on pass failure
The generic form of the op is too verbose and in some cases not
readable. On pass failure, ops have been so far printed in generic form
to provide a (stronger) guarantee that the IR print succeeds. However,
in a large number of pass failure cases, the IR is still valid and
the custom printers for the ops will succeed. In fact, readability is
highly desirable post pass failure. This revision provides an option to
print ops in their custom/pretty-printed form on IR failure -- this
option is unsafe and there is no guarantee it will succeed. It's
disabled by default and can be turned on only if needed.

Differential Revision: https://reviews.llvm.org/D123893
2022-04-17 20:10:40 +05:30
Jacques Pienaar bdabe505f4 [mlir][docs] Add missing directory separator 2022-04-16 21:59:18 -07:00
River Riddle 0f304ef017 [mlir] Add asserts when changing various MLIRContext configurations
This helps to prevent tsan failures when users inadvertantly mutate the
context in a non-safe way.

Differential Revision: https://reviews.llvm.org/D112021
2022-04-15 21:49:03 -07:00
Mogball fa26c7ff4b [mlir] Refactor LICM into a utility
LICM is refactored into a utility that is application on any region. The implementation is moved to Transform/Utils.
2022-04-16 00:37:07 +00:00
Stella Stamenova 353f0a8e43 Revert "[mlir] Refactor LICM into a utility"
This reverts commit 3131f80824.

This commit broke the Windows mlir bot:
https://lab.llvm.org/buildbot/#/builders/13/builds/19745
2022-04-15 17:09:16 -07:00
Mogball 3131f80824 [mlir] Refactor LICM into a utility
LICM is refactored into a utility that is application on any region. The implementation is moved to Transform/Utils.
2022-04-15 22:07:01 +00:00
River Riddle 31c88660ab [mlir] Remove the use of FilterTypes for template metaprogramming
This technique results in an explosion in compile time, resulting from a
huge number of std::tuple/concat instatiations. This technique is replaced
by simpler metaprogramming and results in a signficant reduction in
compile time. A local debug/asan build saw a 4x speed up in the processing
of ArithmeticOps.h.inc, and given the nature of this change every dialect
should see similar reductions in compile time.

Differential Revision: https://reviews.llvm.org/D123360
2022-04-15 12:57:07 -07:00
Arjun P 69c1a35488 [MLIR][Presburger][Simplex] moveRowUnknownToColumn: support the row sample value being zero
When the sample value is zero, everything is the same except that failure to
pivot does not imply emptiness. So, leave it to the user to mark as empty if
necessary, if they know the sample value is strictly negative. This is needed
for an upcoming symbolic lexmin heuristic.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D123604
2022-04-15 20:15:21 +01:00
rdzhabarov 3ef4099a61 [mlir] Fix BUILD issues and dependencies.
Differential Revision: https://reviews.llvm.org/D123868
2022-04-15 19:05:02 +00:00
William S. Moses ed499ddcda [MLIR] Fix operation clone
Operation clone is currently faulty.

Suppose you have a block like as follows:

```
(%x0 : i32) {
   %x1 = f(%x0)
   return %x1
}
```

The test case we have is that we want to "unroll" this, in which we want to change this to compute `f(f(x0))` instead of just `f(x0)`. We do so by making a copy of the body at the end of the block and set the uses of the argument in the copy operations with the value returned from the original block.
This is implemented as follows:
1) map to the block arguments to the returned value (`map[x0] = x1`).
2) clone the body

Now for this small example, this works as intended and we get the following.

```
(%x0 : i32) {
   %x1 = f(%x0)
   %x2 = f(%x1)
   return %x2
}
```

This is because the current logic to clone `x1 = f(x0)` first looks up the arguments in the map (which finds `x0` maps to `x1` from the initialization), and then sets the map of the result to the cloned result (`map[x1] = x2`).

However, this fails if `x0` is not an argument to the op, but instead used inside the region, like below.

```
(%x0 : i32) {
   %x1 = f() {
      yield %x0
   }
   return %x1
}
```

This is because cloning an op currently first looks up the args (none), sets the map of the result (`map[%x1] = %x2`), and then clones the regions. This results in the following, which is clearly illegal:

```
(%x0 : i32) {
   %x1 = f() {
      yield %x0
   }
   %x2 = f() {
      yield %x2
   }
   return %x2
}
```

Diving deeper, this is partially due to the ordering (how this PR fixes it), as well as how region cloning works. Namely it will first clone with the mapping, and then it will remap all operands. Since the ordering above now has a map of `x0 -> x1` and `x1 -> x2`, we end up with the incorrect behavior here.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D122531
2022-04-15 13:09:13 -04:00
jfurtek bed8212157 [mlir][ods][NFC] Move enum attribute definitions from OpBase.td to EnumAttr.td
This diff moves `EnumAttr` tablegen definitions (specifically, `IntEnumAttr` and
`BitEnumAttr`-related classes) from `OpBase.td` to `EnumAttr.td`. No
functionality is changed.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D123551
2022-04-15 16:51:14 +00:00