Commit Graph

12114 Commits

Author SHA1 Message Date
Fangrui Song 67854f9ed0 Use value_or instead of getValueOr. NFC 2022-06-29 21:55:02 -07:00
Aart Bik e057f25dee [mlir][sparse] auto-insertion of conversion to resolve cycles
When the iteration graph is cyclic (even after several attempts using fewer and fewer constraints), the current sparse compiler bails out, and no rewriting happens. However, this revision adds some new logic where the sparse compiler tries to find a single input sparse tensor that breaks the cycle, and then adds a proper sparse conversion operation. This way, more incoming kernels can be handled!

Note, the resulting code is not optimal (although it keeps more or less proper "sparse" complexity), and more improvements should be added (especially when the kernel directly yields without computation, such as the transpose example). However, handling is better than not handling ;-)

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128847
2022-06-29 18:28:18 -07:00
Min-Yih Hsu 9bad9248ed [mlir][LLVMIR] Apply SubElementTypeInterface on suitable types
This feature is tested by a unit test since not many places in the codebase
use SubElementTypeInterface.

Differential Revision: https://reviews.llvm.org/D127539
2022-06-29 13:58:02 -07:00
Min-Yih Hsu d41028610b [mlir] Prevent SubElementInterface from going into infinite recursion
Since only mutable types and attributes can go into infinite recursion
inside SubElementInterface::walkSubElement, and there are only a few of
them (mutable types and attributes), we introduce new traits for Type
and Attribute: TypeTrait::IsMutable and AttributeTrait::IsMutable,
respectively. They indicate whether a type or attribute is mutable.
Such traits are required if the ImplType defines a `mutate` function.

Then, inside SubElementInterface, we use a set to record the mutable types
and attributes that have already been visited.
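A minimal sketch (not the actual SubElementInterface code) of the visited-set guard described above; `Node`, `isMutable`, and `subElements` are hypothetical stand-ins for mutable types/attributes and their nested elements:

```cpp
#include <functional>
#include <unordered_set>
#include <vector>

struct Node {
  bool isMutable = false;           // only mutable nodes can form cycles
  std::vector<Node *> subElements;  // possibly self-referential
};

void walkSubElements(Node *root, const std::function<void(Node *)> &fn) {
  std::unordered_set<Node *> visitedMutable;
  std::function<void(Node *)> walk = [&](Node *node) {
    // Record mutable nodes so a self-referencing type/attribute is only
    // visited once, which is what prevents the infinite recursion.
    if (node->isMutable && !visitedMutable.insert(node).second)
      return;
    fn(node);
    for (Node *sub : node->subElements)
      walk(sub);
  };
  walk(root);
}
```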

Differential Revision: https://reviews.llvm.org/D127537
2022-06-29 13:58:02 -07:00
rdzhabarov ef2c837a73 [Fix] Fix compilation warning on unused var. 2022-06-29 19:54:49 +00:00
River Riddle 0d9e51ea42 [mlir] Update the pass crash reproducer documentation
The reproducer is now encoded using an external resource, instead
of a comment at the top of the file.
2022-06-29 12:23:11 -07:00
River Riddle 361acbb362 [mlir] Refactor pass crash reproducer generation to be an assembly resource
We currently generate reproducer configurations using a comment placed at
the top of the generated .mlir file. This is kind of hacky given that comments
have no semantic context in the source file and can easily be dropped. This
strategy also wouldn't work if/when we have a bitcode format. This commit
switches to using an external assembly resource, which is verifiable/can
work with a hypothetical bitcode naturally/and removes the awkward processing
from mlir-opt for splicing comments and re-applying command line options. With
the removal of command line munging, this opens up new possibilities for
executing reproducers in memory.

Differential Revision: https://reviews.llvm.org/D126447
2022-06-29 12:14:02 -07:00
River Riddle ea488bd6e1 [mlir] Allow for attaching external resources to .mlir files
This commit enables support for providing and processing external
resources within MLIR assembly formats. This is a mechanism with which
dialects, and external clients, may attach additional information when
printing IR without that information being encoded in the IR itself.
External resources are not uniqued within the MLIR context, are not
attached directly to any operation, and are solely intended to live and be
processed outside of the immediate IR. There are many potential uses of this
functionality; for example, MLIR's pass crash reproducer could utilize this to
attach the pass pipeline executing when a crash occurs. Other uses may include
embedding large amounts of binary data, such as weights in ML applications,
that shouldn't be copied directly into the MLIR context but need to be kept
adjacent to the IR.

External resources are encoded using a key-value pair nested within a
dictionary anchored by name either on a dialect, or an externally registered
entity. The key is an identifier used to disambiguate the data. The value
may be stored in various limited forms, but general encodings use a string
(human readable) or blob format (binary). Within the textual format, an
example may be of the form:

```mlir
{-#
  // The `dialect_resources` section within the file-level metadata
  // dictionary is used to contain any dialect resource entries.
  dialect_resources: {
    // Here is a dictionary anchored on "foo_dialect", which is a dialect
    // namespace.
    foo_dialect: {
      // `some_dialect_resource` is a key to be interpreted by the dialect,
      // and used to initialize/configure/etc.
      some_dialect_resource: "Some important resource value"
    }
  },
  // The `external_resources` section within the file-level metadata
  // dictionary is used to contain any non-dialect resource entries.
  external_resources: {
    // Here is a dictionary anchored on "mlir_reproducer", which is an
    // external entity representing MLIR's crash reproducer functionality.
    mlir_reproducer: {
      // `pipeline` is an entry that holds a crash reproducer pipeline
      // resource.
      pipeline: "func.func(canonicalize,cse)"
    }
  }
#-}
```

Differential Revision: https://reviews.llvm.org/D126446
2022-06-29 12:14:01 -07:00
Benoit Jacob 694ad3eaef Fix CombineContractBroadcast folding reduction iterators.
Fix CombineContractBroadcast folding reduction iterators.

Differential Revision: https://reviews.llvm.org/D128739
2022-06-29 18:01:56 +00:00
Nicolas Vasilache 0fb24a85cb [mlir][Tensor] Improve documentation of verification behavior of InsertSliceOp. 2022-06-29 07:52:54 -07:00
Nicolas Vasilache 741f8f2bed [mlir][Tensor][NFC] Better document rank-reducing behavior of ExtractSliceOp and cleanup 2022-06-29 07:37:58 -07:00
Mehdi Amini 84124ff891 Apply clang-tidy fixes for readability-simplify-boolean-expr in ViewLikeInterface.cpp (NFC) 2022-06-29 12:15:39 +00:00
Mehdi Amini be7997221d Apply clang-tidy fixes for readability-identifier-naming in Float16bits.cpp (NFC) 2022-06-29 12:13:57 +00:00
Arjun P d08522f5bc [MLIR][Preburger] fix typo covertVarKind -> convertVarKind
Also update parameter names in the implementation file to match the header.
2022-06-29 13:08:20 +01:00
Arjun P 7236e5de54 Revert clang-tidy fixes for readability-simplify-boolean-expr and add NOLINT
The original code is more readable because the goal is to check if the given
value does *not* lie in the range. It is harder to understand this by
reading the rewritten code.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D128753
2022-06-29 12:23:35 +01:00
Benjamin Kramer 206a6037a0 [Presburger] Cheat around old versions of clang not doing NRVO when there's a derived-to-base cast in the way
Should be NFC. We can just do the base conversion manually and avoid
warnings about it. Clang before Clang 13 didn't implement P1825 and
complains:

mlir/lib/Analysis/Presburger/IntegerRelation.cpp:226:10: warning: local variable 'result' will be copied
      despite being returned by name [-Wreturn-std-move]
  return result;
         ^~~~~~
mlir/lib/Analysis/Presburger/IntegerRelation.cpp:226:10: note: call 'std::move' explicitly to avoid copying
  return result;
         ^~~~~~
         std::move(result)
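A generic sketch of the pattern (not the actual IntegerRelation diff), assuming hypothetical `Base`/`Derived` types; the second variant shows one way to do the derived-to-base conversion by hand so neither old nor new compilers complain:

```cpp
struct Base {};
struct Derived : Base {};

// Pre-P1825 compilers do not treat this return as an implicit move because of
// the derived-to-base conversion, so Clang < 13 warns with -Wreturn-std-move.
Base viaImplicitConversion() {
  Derived result;
  return result;
}

// Performing the base conversion manually makes the move explicit and avoids
// the warning without calling std::move on the full derived object.
Base viaExplicitConversion() {
  Derived result;
  return static_cast<Base &&>(result);
}
```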
2022-06-29 12:40:59 +02:00
Christian Sigg f6d00bac77 Revert "Add default copy and move c'tor/assignment to PresburgerRelation."
This reverts commit 2cd468ef15.
2022-06-29 12:24:28 +02:00
Christian Sigg 2cd468ef15 Add default copy and move c'tor/assignment to PresburgerRelation.
Differential Revision: https://reviews.llvm.org/D128789
2022-06-29 12:07:55 +02:00
lewuathe 0180709590 [mlir][complex] Canonicalization for consecutive complex.neg
Consecutive complex.neg operations are redundant, so we can canonicalize them away to the original operands.
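For example (illustrative): applying `complex.neg` twice, as in `complex.neg (complex.neg %a)`, simply yields `%a` again, so the pair folds away.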

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D128781
2022-06-29 11:11:40 +02:00
Adrian Kuegel 0a6e29f455 Revert "[mlir][Presburger] Fix warning Wreturn-std-move (NFC)"
This reverts commit a4070a5e77.
It introduced another warning instead.
2022-06-29 09:22:36 +02:00
Adrian Kuegel a4070a5e77 [mlir][Presburger] Fix warning Wreturn-std-move (NFC) 2022-06-29 09:10:20 +02:00
lorenzo chelini 58a55107c2 [MLIR][Math] Improve docs for round op (NFC)
Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D128662
2022-06-29 08:35:07 +02:00
Peixin-Qiao 1795f8cd2e [NFC][OpenMP] Fix worksharing-loop
1. Remove the redundant collapse clause in MLIR OpenMP worksharing-loop
   operation.
2. Fix several typos.
3. Refactor the chunk size type conversion since CreateSExtOrTrunc has
   both type check and type conversion.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D128338
2022-06-29 12:20:03 +08:00
River Riddle 9560f02141 [mlir] Add `enableSplitting` and `insertMarkerInOutput` options to `splitAndProcessBuffer`
`enableSplitting` simply enables/disables whether we should split
or use the full buffer. `insertMarkerInOutput` toggles if split markers
should be inserted in between processed output chunks.

These options allow for merging the duplicate code paths we have
when splitting is optional.

Differential Revision: https://reviews.llvm.org/D128764
2022-06-28 15:42:35 -07:00
Jacques Pienaar 04235d07ad [mlir] Update flipped accessors (NFC)
Follow up with memref flipped and flipping any intermediate changes
made.
2022-06-28 13:11:26 -07:00
Mehdi Amini 08d651d7ba Apply clang-tidy fixes for performance-unnecessary-value-param in VectorDistribute.cpp (NFC) 2022-06-28 19:52:46 +00:00
Mehdi Amini b254d55711 Apply clang-tidy fixes for readability-simplify-boolean-expr in SPIRVOps.cpp (NFC) 2022-06-28 19:51:28 +00:00
Arjun P dda8b1ceda [MLIR][Presburger] subtract: support non-div locals
Also added test cases. Also extend support for `computeReprWithOnlyDivLocals` from `IntegerPolyhedron` to `IntegerRelation` and `PresburgerRelation`.

Depends on D128736.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D128737
2022-06-28 20:37:22 +01:00
Arjun P fd26d86f5f [MLIR][Presburger] subtract: fix support for divs defined by equalities
Also added test cases to test this. Both IntegerRelation::addLocalFloorDiv and the fixed implementation of subtraction need to compute division inequalities from dividend and divisor, so this also adds helper util functions to avoid duplicating this logic.
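For reference, the standard encoding (not quoted from the patch): for a divisor d > 0, q = floor(e / d) is captured exactly by the inequality pair d * q <= e and e <= d * q + (d - 1), which is the kind of pair such helpers produce from the dividend e and the divisor d.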

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D128736
2022-06-28 20:24:51 +01:00
Groverkss da0f151401 [MLIR][Affine][NFC] Fix affine utilities docs using "identifiers" instead of "variables" 2022-06-28 19:35:17 +01:00
Mehdi Amini 35d7ebb1b7 Apply clang-tidy fixes for readability-simplify-boolean-expr in Shape.cpp (NFC) 2022-06-28 18:08:58 +00:00
Mehdi Amini cd417c6a46 Apply clang-tidy fixes for modernize-use-emplace in SCFTransformOps.cpp (NFC) 2022-06-28 18:08:58 +00:00
Groverkss d95140a5a9 [MLIR][Presburger] Rename variable/identifier -> variable
Currently, in the Presburger library, we use the words "variables" and
"identifiers" interchangeably. This patch changes this to only use "variables" to
refer to the variables of PresburgerSpace.

The reasoning behind this change is that the current usage of the word "identifier"
is misleading: variables do not "identify" anything. The information attached to them is the
actual "identifier" for the variable. The word "identifier" will later be used
to refer to the information attached to each variable in the space.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D128585
2022-06-28 19:08:08 +01:00
Stella Stamenova ac521d9ecd [mlir] Leverage CMake interface libraries for mlir python
This is already partially the case, but we can rely more heavily on interface libraries and how they are imported/exported in order to simplify the implementation of the mlir python functions in CMake.

This change also makes a couple of other changes:
1) Add a new CMake function which handles "pure" sources. This was done inline previously
2) Moves the headers associated with CAPI libraries to the libraries themselves. These were previously managed in a separate source target. They can now be added directly to the CAPI libraries using DECLARED_HEADERS.
3) Cleanup some dependencies that showed up as an issue during the refactor

This is a big CMake change that should produce no impact on the build of mlir and on the produced *build tree*. However, this change fixes an issue with the *install tree* of mlir which was previously unusable for projects like torch-mlir because both the "pure" and "extension" targets were pointing to either the build or source trees.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D128230
2022-06-28 10:42:58 -07:00
Mehdi Amini f68454ee8f Fix printing for ArrayRef attributes/types in declarative assembly format
These were abbreviated when parsing, but not when printing.

Reviewed By: Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D128720
2022-06-28 17:39:38 +00:00
Mehdi Amini eafb18eb87 Apply clang-tidy fixes for performance-unnecessary-value-param in LinalgStrategyPasses.cpp (NFC) 2022-06-28 17:30:46 +00:00
Mehdi Amini c2828b6363 Apply clang-tidy fixes for readability-identifier-naming in ArithmeticOps.cpp (NFC) 2022-06-28 17:30:46 +00:00
Nicolas Vasilache a48bdee686 [mlir][Vector] Add a ShapeCastOp(BroadcastOp) canonicalization pattern
This pattern can kick in when the source of the broadcast has a shape
that is a prefix/suffix of the result of the shape_cast.

Differential Revision: https://reviews.llvm.org/D128734
2022-06-28 09:49:37 -07:00
Mehdi Amini 6901607822 Fix build with some GCC version: `global qualification of class name is invalid before '{' token` 2022-06-28 16:48:08 +00:00
Aart Bik eca6f9160f [mlir][sparse][bufferization] refine bufferization assumption enforcement
Enforce the assumption made on tensor buffers explicitly. When in-place,
reuse the buffer, but fill with all zeroes for the non-update case, since
the kernel assumes all elements are written to. When not in-place, zero
out the new buffer when materializing or when no-updates occur. Copy the
original tensor value when updates occur. This prepares migrating to the
new bufferization strategy, where these assumptions must be made explicit.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D128691
2022-06-28 09:43:30 -07:00
Lei Zhang e1e0ecb96e [mlir][spirv] Support more comparisons on boolean values
Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D128692
2022-06-28 11:58:42 -04:00
Arjun P e9fa18637d [MLIR][Presburger] getDivRepr: fix bug where dividend was negated
Also updated the tests, which were asserting the wrong behaviour.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D128735
2022-06-28 16:43:32 +01:00
Mehdi Amini 7faf75bb3e Introduce a new Dense Array attribute
This attribute is similar to DenseElementsAttr but does not support
splat. As such it has a much simpler API and does not need any smart
iterator: it exposes direct ArrayRef access.

A new syntax is introduced so that the generic printing/parsing looks
like:

  [:i64 1, -2, 3]

This attribute begins like an ArrayAttr but has a `:` token after the
opening square brace to introduce the element type (supported are I8,
I16, I32, I64, F32, F64) and the comma-separated list for the data.

This is particularly convenient for attributes intended to be small,
like those referring to shapes.
For example a `transpose` operation with a `dims` attribute could be
defined as such:

  let arguments = (ins AnyTensor:$input, DenseI64ArrayAttr:$dims);
  let assemblyFormat = "$input `dims` `=` $dims attr-dict : type($input)";

And printed this way (the element type is elided in this case):

  transpose %input dims = [0, 2, 1] : tensor<2x3x4xf32>

The C++ API for dims would just directly return an ArrayRef<int64_t>.

RFC: https://discourse.llvm.org/t/rfc-introduce-a-new-dense-array-attribute/63279

Recommit with a custom DenseArrayBaseAttrStorage class to ensure
over-alignment of the storage to the largest type.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D123774
2022-06-28 13:28:06 +00:00
Mehdi Amini 744d06e4f2 Revert "Introduce a new Dense Array attribute"
This reverts commit 508eb41d82.

UBSAN indicates some pointer mis-alignment I need to investigate
2022-06-28 12:47:15 +00:00
Mehdi Amini 508eb41d82 Introduce a new Dense Array attribute
This attribute is similar to DenseElementsAttr but does not support
splat. As such it has a much simpler API and does not need any smart
iterator: it exposes direct ArrayRef access.

A new syntax is introduced so that the generic printing/parsing looks
like:

  [:i64 1, -2, 3]

This attribute begins like an ArrayAttr but has a `:` token after the
opening square brace to introduce the element type (supported are I8,
I16, I32, I64, F32, F64) and the comma-separated list for the data.

This is particularly convenient for attributes intended to be small,
like those referring to shapes.
For example a `transpose` operation with a `dims` attribute could be
defined as such:

  let arguments = (ins AnyTensor:$input, DenseI64ArrayAttr:$dims);
  let assemblyFormat = "$input `dims` `=` $dims attr-dict : type($input)";

And printed this way (the element type is elided in this case):

  transpose %input dims = [0, 2, 1] : tensor<2x3x4xf32>

The C++ API for dims would just directly return an ArrayRef<int64_t>.

RFC: https://discourse.llvm.org/t/rfc-introduce-a-new-dense-array-attribute/63279

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D123774
2022-06-28 12:08:25 +00:00
Mehdi Amini 2d70faa299 Apply clang-tidy fixes for readability-simplify-boolean-expr in TosaToLinalg.cpp (NFC) 2022-06-28 11:21:43 +00:00
Mehdi Amini cf3f477d30 Apply clang-tidy fixes for readability-simplify-boolean-expr in Utils.cpp (NFC) 2022-06-28 11:21:37 +00:00
Matthias Springer 04dac2ca7c [mlir][SCF][bufferize][NFC] Implement resolveConflicts for ParallelInsertSliceOp
This was previously implemented as part of the BufferizableOpInterface of ForEachThreadOp. The implementation is moved to ParallelInsertSliceOp to be consistent with the remaining ops and to have a nice example op that can serve as a blueprint for other ops.

Differential Revision: https://reviews.llvm.org/D128666
2022-06-28 12:18:22 +02:00
lewuathe 036a699675 [mlir][complex] Canonicalization for consecutive complex.add and sub
Add basic canonicalization for consecutive complex.add and sub operations.
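For example (illustrative): adding `%b` and then subtracting `%b` (or vice versa) returns the original value, so a `complex.sub` fed by a `complex.add` of `%a` and `%b` can fold back to `%a`.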

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D128702
2022-06-28 11:41:16 +02:00
Mahesh Ravishankar fa596c6921 [mlir][Vector] Fix reordering of floating point adds during lower of `vector.contract`.
Adding the accumulator value after the `vector.contract` changes the
precision of the operation. This makes sure the accumulator is carried
through to `vector.reduce` (and down to LLVM).
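For context, a generic floating-point fact (not from the patch): FP addition is not associative; in double precision, (1e20 + -1e20) + 1.0 evaluates to 1.0 while 1e20 + (-1e20 + 1.0) evaluates to 0.0, which is why re-associating where the accumulator is added can change results.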

Differential Revision: https://reviews.llvm.org/D128674
2022-06-28 05:26:39 +00:00
Mogball 92bdc5c3e5 [mlir][ods] Add convertFromStorage field to parameters
This patch adds a `convertFromStorage` field to attribute or type parameters that can implement more complex logic for converting from the parameter's C++ storage type (e.g. `Optional<SmallVector<T>>`) to its C++ type (e.g. `Optional<ArrayRef<T>>`).

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D128293
2022-06-27 15:57:21 -07:00
Matthias Springer cb47124179 [mlir][bufferize] Improve to_tensor/to_memref folding
Differential Revision: https://reviews.llvm.org/D128615
2022-06-27 21:42:39 +02:00
Aart Bik 4db52450c1 [mlir][sparse] remove redundant whitespace
Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D128673
2022-06-27 11:57:08 -07:00
Groverkss aab7e2fa05 [MLIR][Parser] Fix AffineParser colliding bare identifiers with primitive types
The parser currently can't parse bare identifiers like 'i0' in affine
maps and sets, and similarly ids like f16/f32. But these bare ids are
part of the grammar - although they are primitive types.

```
error: expected bare identifier
set = affine_set<(i0, i1) : ()>
                   ^
```

This patch allows the parser for AffineMap/IntegerSet to parse bare
identifiers as defined by the grammar.

Reviewed By: bondhugula, rriddle

Differential Revision: https://reviews.llvm.org/D127076
2022-06-27 19:35:25 +01:00
Peiming Liu 15d1cb4520 [mlir][sparse]more integration test cases for sparse_tensor.BinaryOp
Adding more test cases for sparse_tensor.BinaryOp, including different cases when overlap/left/right region is implemented/empty/identity

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D128383
2022-06-27 11:29:27 -07:00
Peiming Liu 057e33ef36 [mlir][sparse]Add more integration tests for sparse_tensor.unary
Previously, the sparse_tensor.unary integration test did not contain cases that use `linalg.index` (previously unsupported); this commit adds test cases that use the `linalg.index` operator.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D128460
2022-06-27 10:58:47 -07:00
Min-Yih Hsu fc7f7260a6 [mlir][LLVMIR] Memorize compatible LLVM types
This patch memoizes compatible LLVM types in `LLVM::isCompatibleType` in
order to avoid redundant work.

This is especially useful when the program is large and there are
multiple occurrences of some deeply nested LLVM struct types, in which
case we can gain quite some speedup with this patch.
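A minimal memoization sketch in the spirit of the description above (not the actual MLIR code; `computeIsCompatible` and the string key are illustrative stand-ins for the real recursive check over LLVM types):

```cpp
#include <string>
#include <unordered_set>

// Stand-in for the real recursive walk over an LLVM type's nested elements.
static bool computeIsCompatible(const std::string &type) {
  return type.rfind("!llvm.", 0) == 0; // toy rule, purely illustrative
}

// Memoized wrapper: once a (possibly deeply nested) type is known to be
// compatible, later queries hit the cache instead of re-walking the type.
bool isCompatibleCached(const std::string &type) {
  static std::unordered_set<std::string> knownCompatible;
  if (knownCompatible.count(type))
    return true;
  bool ok = computeIsCompatible(type);
  if (ok)
    knownCompatible.insert(type);
  return ok;
}
```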

Differential Revision: https://reviews.llvm.org/D127918
2022-06-27 09:46:41 -07:00
Min-Yih Hsu 856056d1b0 [mlir][LLVMIR] Add support for va_start/copy/end intrinsics
This patch adds three new LLVM intrinsic operations, llvm.intr.vastart/copy/end,
and their translation from LLVM IR.

This effectively removes a restriction, imposed by 0126dcf1f0, where
non-external functions in LLVM dialect cannot be variadic. At that time
it was not clear how LLVM intrinsics are going to be modeled, which
indirectly affects va_start/copy/end, the core intrinsics used in
variadic functions. But since we have LLVM intrinsics as normal
MLIR operations, it's not a problem anymore.

Differential Revision: https://reviews.llvm.org/D127540
2022-06-27 09:46:40 -07:00
Matthias Springer f164814f2f [mlir][SCF][bufferize] Small simplification and more comments
Differential Revision: https://reviews.llvm.org/D128651
2022-06-27 17:04:29 +02:00
Lewuathe 8fa2e67979 [mlir][complex] complex.arg op to calculate the angle of complex number
Add the complex.arg op, which calculates the angle of a complex number. The op name is inspired by the function carg in libm.

See: https://sourceware.org/newlib/libm.html#carg
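As a rough illustration of what the op computes, here is the C++ analogue of libm's carg (this snippet is editorial, not from the patch):

```cpp
#include <complex>
#include <cstdio>

// carg / complex.arg: the phase angle of a complex number, in radians.
int main() {
  std::complex<double> z(1.0, 1.0);
  std::printf("arg(1 + 1i) = %f\n", std::arg(z)); // ~0.785398 (pi/4)
}
```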

Differential Revision: https://reviews.llvm.org/D128531
2022-06-27 16:45:41 +02:00
Matthias Springer c0b0b6a00a [mlir][bufferize] Infer memory space in all bufferization patterns
This change updates all remaining bufferization patterns (except for scf.while) and the remaining bufferization infrastructure to infer the memory space whenever possible instead of falling back to "0". (If a default memory space is set in the bufferization options, we still fall back to that value if the memory space could not be inferred.)

Differential Revision: https://reviews.llvm.org/D128423
2022-06-27 16:32:52 +02:00
Matthias Springer 45b995cda4 [mlir][bufferize][NFC] Change signature of allocateTensorForShapedValue
Add a failure return value and bufferization options argument. This is to keep a subsequent change smaller.

Differential Revision: https://reviews.llvm.org/D128278
2022-06-27 16:00:06 +02:00
Javier Setoain f39c2a1142 [mlir][llvm] Add vector insert/extract intrinsics
These intrinsics will be needed to convert between fixed-length vectors
and scalable vectors.

This operation will be needed for VLS (vector-length specific)
vectorization, when interfacing with vector functions or intrinsics that
take scalable vectors as operands in a context where the length of our
vectors is known or assumed at compile time, but we still want to
generate scalable vector instructions.

Differential Revision: https://reviews.llvm.org/D127100
2022-06-27 14:12:18 +01:00
Nicolas Vasilache a0f843fdaf [SCF] Add thread_dim_mapping attribute to scf.foreach_thread
An optional thread_dim_mapping index array attribute specifies for each
virtual thread dimension, how it remaps 1-1 to a set of concrete processing
element resources (e.g. a CUDA grid dimension or a level of concrete nested
async parallelism). At this time, the specification is backend-dependent and
is not verified by the op, beyond being an index array attribute.
It is the responsibility of the lowering to interpret the index array in the
context of the concrete target the op is lowered to, or to ignore it when
the specification is ill-formed or unsupported for a particular target.

Differential Revision: https://reviews.llvm.org/D128633
2022-06-27 04:58:36 -07:00
Matthias Springer 5d50f51c97 [mlir][bufferization][NFC] Add error handling to getBuffer
This is in preparation of adding memory space support.

Differential Revision: https://reviews.llvm.org/D128277
2022-06-27 13:48:01 +02:00
Matthias Springer 0d0a94a792 [mlir][bufferization][NFC] Fix typo in AllocTensorOp builders 2022-06-27 13:41:18 +02:00
Matthias Springer 3ff93f838e [mlir][SCF][bufferize][NFC] Bufferize scf.for terminator separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.

Differential Revision: https://reviews.llvm.org/D128422
2022-06-27 13:26:32 +02:00
Matthias Springer 8e691e1f24 [mlir][SCF][bufferize] Bufferize scf.if/execute_region terminators separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.

Differential Revision: https://reviews.llvm.org/D128581
2022-06-27 13:22:19 +02:00
Matthias Springer 7ebf70d85d [mlir][SCF][bufferize][NFC] Bufferize parallel_insert_slice separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.

Differential Revision: https://reviews.llvm.org/D128580
2022-06-27 13:16:02 +02:00
Matthias Springer 19efb84c7a [mlir][shape][bufferize][NFC] Bufferize block terminators separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.

Differential Revision: https://reviews.llvm.org/D128579
2022-06-27 13:08:13 +02:00
Matthias Springer ba9d886db4 [mlir][bufferization][NFC] Bufferize with PostOrder traversal
This is useful because the result type of an op can sometimes be inferred from its body (e.g., `scf.if`). This will be utilized in subsequent changes.

Also introduces a new `getBufferType` interface method on BufferizableOpInterface. This method is useful for computing a bufferized block argument type with respect to OpOperand types of the parent op.

Differential Revision: https://reviews.llvm.org/D128420
2022-06-27 12:42:41 +02:00
Matthias Springer c06f01ffee [mlir][bufferization] Add `memory_space` op attribute
This attribute is currently supported on AllocTensorOp only. Future changes will add support to other ops. Furthermore, the memory space is not propagated properly in all bufferization patterns and some of the core bufferization infrastructure. This will be addressed in a subsequent change.

Differential Revision: https://reviews.llvm.org/D128274
2022-06-27 12:33:26 +02:00
Matthias Springer b06614e2e8 [mlir][bufferization][NFC] Change signature of getMemRefType
These functions now accept unsigned integers for address spaces instead of Attributes.

Differential Revision: https://reviews.llvm.org/D128275
2022-06-27 10:41:40 +02:00
Adrian Kuegel ca2933f3f8 [mlir] Fix ClangTidyPerformance finding (NFC) 2022-06-27 09:15:39 +02:00
Jacques Pienaar 655dc02cb0 [mlir] Flip MemRef dialect to _Both (NFC) 2022-06-26 20:45:25 -07:00
Jacques Pienaar 2d70eff802 [mlir] Flip more uses to prefixed accessor form (NFC).
Try to keep the final flip small. Need to flip MemRef as there are many
templated cases with it and Tensor.
2022-06-26 19:12:38 -07:00
Uday Bondhugula dab6c11f83 [MLIR] NFC. Fix doc comment for AliasResult::isNo
Fix doc comment for AliasResult::isNo. NFC.

Differential Revision: https://reviews.llvm.org/D128594
2022-06-27 03:58:17 +05:30
Stella Laurenzo 54998986c3 [mlir] Generalize SCF passes to not have to run on FuncOp.
Seems to have been an accident of history and none of these had any reason to be restricted to FuncOp.

Differential Revision: https://reviews.llvm.org/D128614
2022-06-26 11:05:35 -07:00
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Kazu Hirata f8c1c9afd3 [mlir] Fix a warning
This patch fixes:

  llvm-project/mlir/lib/Dialect/Linalg/Transforms/SplitReduction.cpp:300:26:
  error: comparison of integers of different signs: 'int64_t' (aka
  'long') and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
2022-06-25 09:18:04 -07:00
Jacques Pienaar 701051a8c2 [mlir][shape] Switch types to ODS generated (NFC)
These were already pretty simple, so just switching to generated.
2022-06-25 09:06:52 -07:00
Arjun P 8a7ead691b [MLIR][Presburger] Support computing a representation of a set that only has locals that are divs
This paves the way for integer-exact projection, and for supporting
non-division locals in subtraction, complement, and equality checks.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D127463
2022-06-25 14:23:32 +01:00
lewuathe bd861056a5 [mlir][affine] Rigorous check for loop unrolling store operation
Static loop unrolling does not change the operation type, so the check can strictly require affine.store.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D128237
2022-06-25 09:30:43 +09:00
lewuathe d72d488039 [mlir][quant] output spec verification check for quant.region
Add a verification check for quant.region with an incompatible output spec, along with a test.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D128245
2022-06-25 08:57:44 +09:00
Thomas Raoux d343cdd509 [mlir][vector] Fix bug when swapping scf.for and vector warp op
When creating an scf.for without arguments, an scf.yield is automatically
created. Make sure we don't create a second one.

Differential Revision: https://reviews.llvm.org/D128405
2022-06-24 19:13:02 +00:00
Thomas Raoux 7eba5cdf9c [mlir][vector] Relax transfer_write vector distribution pattern
Small change to relax the pattern to support any vector containing a
single element.

Differential Revision: https://reviews.llvm.org/D128545
2022-06-24 19:03:14 +00:00
Aart Bik 9a3d60e0d3 [mlir][bufferization][sparse] put restriction on sparse tensor allocation
Putting some direct use restrictions on tensor allocations in the
sparse case enables the use of simplifying assumptions in the
bufferization analysis.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D128463
2022-06-24 10:58:43 -07:00
Nicolas Vasilache f6c79c6ae4 [mlir][Vector]Fix bug where vector::WarpExecuteOnLane0Op are created with 2 blocks in the region
Differential Revision: https://reviews.llvm.org/D128534
2022-06-24 07:33:58 -07:00
Matthias Springer 3798678bd1 [mlir][sparse][bufferize] Implement BufferizableOpInterface
Only the analysis part of the interface is implemented. The bufferization itself is performed by the SparseTensorConversion pass.

Differential Revision: https://reviews.llvm.org/D128138
2022-06-24 13:47:01 +02:00
Lei Zhang b2671d8898 [mlir][spirv] Fix bitcast input order for UnifyAliasedResourcePass
spv.bitcast from a vector to a scalar expects the lower-numbered
components of the vector to map to the lower-ordered bits of
the scalar. That actually already matches how little endian stores
data in the memory. So we just need to read and push to the back
of the vector sequentially.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D128473
2022-06-23 22:19:08 -04:00
Frederik Gossen 02d29afd16 [MLIR] Add `decomposeMixedStridesOrOffsets` and `decomposeMixedSizes`
Add the reverse functions to the ViewLikeInterface's functions
`getMixedStrides`, `getMixedSizes`, and `getMixedOffsets`. The new functions
are useful to build view-like operations from an array of mixed static/dynamic
values.

Differential Revision: https://reviews.llvm.org/D128376
2022-06-23 19:08:36 -04:00
Nicolas Vasilache 8c6da76483 [mlir][Transform] Fix applyToOne corner case when no op is matched.
Such situations manifest themselves with an empty payload which ends up producing empty results.
In such cases, we still want to match the transform op contract and return as many empty SmallVector<Operation*>
as the op requires.

Differential Revision: https://reviews.llvm.org/D128456
2022-06-23 12:18:21 -07:00
Slava Zakharin b163ac33bd [mlir][math] Lower atan to libm
Differential Revision: https://reviews.llvm.org/D128454
2022-06-23 10:49:25 -07:00
Matthias Springer 3474d10e1a [mlir][bufferization][NFC] Make `escape` a dialect attribute
All bufferizable ops that bufferize to an allocation receive a `bufferization.escape` attribute during TensorCopyInsertion.

Differential Revision: https://reviews.llvm.org/D128137
2022-06-23 19:34:47 +02:00
gpetters94 bc07634b5a Adding a named op for grouped convolutions 2022-06-23 16:32:22 +00:00
Jacques Pienaar 983cb6c92f [mlir][pdll] Add new tablegen helper NFC
The command line option injected by the tablegen rule cannot be respected by
PDLL here, so add a new helper function that is a copy of the original without
any additional flags injected. This avoids a compilation failure when
compiler warnings are disabled.

Kept it as a mechanical copy.

Fixes #55716
2022-06-23 05:31:32 -07:00
Nicolas Vasilache 4c7225d19a [mlir][Transform] Fix implementation of the generic apply that is based on applyToOne.
The result of applying an N-result producing transformation to M payload ops
is an M-wide result, each containing N result operations.
This requires a transposition of the results obtained by calling `applyToOne`.

This revision fixes the issue and adds more advanced tests that exercise the behavior.

Differential Revision: https://reviews.llvm.org/D128414
2022-06-23 05:28:09 -07:00
Adrian Kuegel 991547703a [mlir] Add an additional check to vectorizeStaticLinalgOpPrecondition.
We need to make sure that the types used in the body are valid element types
for VectorType.

Differential Revision: https://reviews.llvm.org/D128336
2022-06-23 10:24:04 +02:00
Okwan Kwon 1dd2c93a66 [mlir][linalg] move isElementwise() to Linalg/Utils (NFC)
Differential Revision: https://reviews.llvm.org/D128398
2022-06-22 18:55:45 -07:00
jackalcooper efe603e70d [mlir][vulkan-runner] fix VK_ERROR_INCOMPATIBLE_DRIVER error
On macOS, Vulkan is emulated by MoltenVK. Some extra flags are
required for "building robust and portable Vulkan based
applications that are good citizens in the Vulkan ecosystem".

More information:
https://vulkan.lunarg.com/doc/sdk/1.3.216.0/mac/getting_started.html

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D128173
2022-06-22 19:36:35 -04:00
Nicolas Vasilache d571639579 [mlir][Linalg] SplitReduction implementation without tensor::ExpandShapeOp
This revision proposes a different implementation of the SplitReduction transformation that does
not rely on tensor::ExpandShapeOp.

Previously, a dimension `[k]` would be split into `[k][kk]` via an ExpandShapeOp.
Instead, this revision proposes to rewrite `[k]` into `[factor * k + kk]`.
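For illustration (numbers not from the patch): with a reduction dimension of size 16 and factor = 4, an index i in [0, 16) is expressed as i = 4 * k + kk with k in [0, 4) and kk in [0, 4), so the extra `kk` dimension shows up purely in the indexing map rather than through a reshape.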

There are different tradeoffs involved, but the proposed implementation is more general because
the affine rewrite is well-defined. In particular, it works naturally with `?` parallel dimensions and
non-trivial indexing maps.

A further rewrite of `[factor * k + kk]` + ExpandShapeOp is possible as a followup.

Differential Revision: https://reviews.llvm.org/D128266
2022-06-22 12:06:58 -07:00
Mingming Liu 67dc8021a1 [Support] Change TrackingStatistic and NoopStatistic to use uint64_t instead of unsigned.
The binary size impact on `clang` is trivial: the numerical value doesn't
change when measured in MiB, and the `.data` section increases from 139Ki to
173Ki.

Differential Revision: https://reviews.llvm.org/D128070
2022-06-22 10:11:40 -07:00
lorenzo chelini 30bdfacf5d [MLIR] Fix top-level comment (NFC) 2022-06-22 19:04:40 +02:00
Aart Bik 9e6261edc0 [mlir][sparse] fix typo in CHECK test
Thanks Peiming for reporting!

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D128308
2022-06-22 09:43:32 -07:00
Aart Bik e7d3ba1066 [mlir][sparse] accept sparse reshape (expand/collapse)
This revision makes sure we accept sparse tensors as arguments
of the expand/collapse reshaping operations in the tensor dialect.
Note that the actual lowering to runnable IR is still TBD.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D128311
2022-06-22 09:40:38 -07:00
Arjun P 628a2c14e3 [MLIR][Presburger] introduce SlowMPInt, an auto-resizing APInt for fully correct signed integer computations
The Presburger library currently uses int64_t throughout for its integers.
This runs the risk of silently producing incorrect results when overflows occur.
Fixing this issue requires some sort of multiprecision integer
that transparently supports arbitrary arithmetic computations.

The class SlowMPInt provides this functionality, and is intended to be used
as the slow path fallback for a more optimized upcoming class, MPInt, that optimizes
for the Presburger library's workloads.
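A tiny sketch of the overflow problem motivating this fallback (not the SlowMPInt/MPInt API; `checkedAdd` is a hypothetical helper using the GCC/Clang overflow builtins):

```cpp
#include <cstdint>
#include <cstdio>

// Detect when a plain int64_t addition would wrap; that is the point where an
// arbitrary-precision fallback such as SlowMPInt has to take over.
static bool checkedAdd(int64_t a, int64_t b, int64_t &result) {
  return !__builtin_add_overflow(a, b, &result); // false => needs slow path
}

int main() {
  int64_t r;
  bool fits = checkedAdd(INT64_MAX, 1, r);
  std::printf("fits in int64_t: %s\n", fits ? "yes" : "no"); // prints "no"
}
```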

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123758
2022-06-22 18:37:18 +02:00
Nicolas Vasilache 74f0660160 [mlir][Transform] NFC - Pass TransformState as an argument to applyToOne methods
This will allow implementing state-dependent behavior in the future.

Differential Revision: https://reviews.llvm.org/D128327
2022-06-22 01:19:13 -07:00
lewuathe ce07b95610 [mlir][math] Support vector type by erf and round libm lowering
erf and round op are able to lowered to libm supporting vector type as other math operations.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127934
2022-06-22 09:42:41 +09:00
Aart Bik fde04aee33 [mlir][sparse] refine bufferization allocation lowering
Marking bufferization allocation operation as invalid
during sparse lowering is too strict, since dense and
sparse allocation can co-exist. This revision refines
the lowering with a dynamic type check.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128305
2022-06-21 15:17:25 -07:00
Mahesh Ravishankar 2f637fe730 [mlir][TilingInterface] Enable tile and fuse using TilingInterface.
This patch implements tile and fuse transformation for ops that
implement the tiling interface. To do so,
- `TilingInterface` needs a new method that generates a tiled
  implementation of the operation based on the tile of the result
  needed.
- A pattern is added that replaces a `tensor.extract_slice` whose
  source is defined by an operation that implements the
  `TilingInterface` with a tiled implementation that produces the
  extracted slice in-place (using the method added to
  `TilingInterface`).
- A pattern is added that takes a sequence of operations that
  implement the `TilingInterface` (for now `LinalgOp`s), tiles the
  consumer, and greedily fuses its producers iteratively.

Differential Revision: https://reviews.llvm.org/D127809
2022-06-21 17:22:58 +00:00
Mahesh Ravishankar c584771f54 Revert "[mlir][TilingInterface] Enable tile and fuse using TilingInterface."
This reverts commit ea75511319 due to build failures.
2022-06-21 16:56:59 +00:00
Mahesh Ravishankar ea75511319 [mlir][TilingInterface] Enable tile and fuse using TilingInterface.
This patch implements tile and fuse transformation for ops that
implement the tiling interface. To do so,
- `TilingInterface` needs a new method that generates a tiled
  implementation of the operation based on the tile of the result
  needed.
- A pattern is added that replaces a `tensor.extract_slice` whose
  source is defined by an operation that implements the
  `TilingInterface` with a tiled implementation that produces the
  extracted slice in-place (using the method added to
  `TilingInterface`).
- A pattern is added that takes a sequence of operations that
  implement the `TilingInterface` (for now `LinalgOp`s), tiles the
  consumer, and greedily fuses its producers iteratively.

Differential Revision: https://reviews.llvm.org/D127809
2022-06-21 16:47:14 +00:00
Krzysztof Drewniak 7c5c4e781b [gdb-scripts] Add to_string methods to printer implementations
Some GDB versions require all prettyprinter classes to define to_string.
This commit adds these definitions.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D127969
2022-06-21 16:09:30 +00:00
bixia1 bdeae1f57b [mlir][sparse][taco] Support f16.
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D128105
2022-06-21 09:08:26 -07:00
Benjamin Kramer b3127769b1 [mlir][sparse] Preserve NaNs when converting float to bfloat 2022-06-21 15:22:35 +02:00
Nicolas Vasilache f439b31971 [mlir][Linalg] Split reduction transform op
This revision separates the `LinalgSplitReduction` pattern, whose application is based on attributes,
from its implementation.
A transform dialect op extension is added to control the application of the transformation at a finer granularity.

Differential Revision: https://reviews.llvm.org/D128165
2022-06-21 05:01:26 -07:00
Shraiysh Vaishay 66e24da027 [mlir][OpenMP][NFC] Parameter refers to a single arg, hence changing the description for the taskgroup allocate clause. 2022-06-21 15:27:01 +05:30
Nicolas Vasilache 98dbaed1e6 [mlir][SCF] Fold tensor.cast feeding into scf.foreach_thread.parallel_insert_slice
Differential Revision: https://reviews.llvm.org/D128247
2022-06-21 01:19:18 -07:00
Matthias Springer 858be16670 [mlir][memref] Fix layout map computation in inferRankReducedResultType
Differential Revision: https://reviews.llvm.org/D128160
2022-06-21 10:08:26 +02:00
Nicolas Vasilache a489aa745b [mlir][SCF] Add scf::ForeachThread canonicalization.
This revision adds the necessary plumbing for canonicalizing scf::ForeachThread with the
`AffineOpSCFCanonicalizationPattern`.
In the process the `loopMatcher` helper is updated to take OpFoldResult instead of just values.
This allows composing various scenarios without the need for an artificial builder.

Differential Revision: https://reviews.llvm.org/D128244
2022-06-21 00:54:46 -07:00
Kazu Hirata 6d5fc1e3d5 [mlir] Don't use Optional::getValue (NFC) 2022-06-20 23:20:25 -07:00
Shraiysh Vaishay 23fec3405b [mlir][OpenMP] Add omp.taskgroup operation
This patch adds omp.taskgroup operation according to OpenMP 5.0 2.17.6.

Also added tests for the same.

Reviewed By: kiranchandramohan, peixin

Differential Revision: https://reviews.llvm.org/D127250
2022-06-21 10:17:24 +05:30
Kazu Hirata d66cbc565a Don't use Optional::hasValue (NFC) 2022-06-20 20:26:05 -07:00
Kazu Hirata 0916d96d12 Don't use Optional::hasValue (NFC) 2022-06-20 20:17:57 -07:00
Kazu Hirata 064a08cd95 Don't use Optional::hasValue (NFC) 2022-06-20 20:05:16 -07:00
Mogball d883a02a7c [mlir][ods] Remove StructAttr
Depends on D127373

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127375
2022-06-21 01:10:05 +00:00
lewuathe 0bae40eff6 [mlir][math] Lower cos,sin to libm
Lower math.cos and math.sin to libm

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D128028
2022-06-21 08:38:07 +09:00
Kazu Hirata ad7ce1e769 Don't use Optional::hasValue (NFC) 2022-06-20 11:49:10 -07:00
Kazu Hirata 5413bf1bac Don't use Optional::hasValue (NFC) 2022-06-20 11:33:56 -07:00
Kazu Hirata 037f09959a [mlir] Don't use Optional::hasValue (NFC) 2022-06-20 11:22:37 -07:00
Krzysztof Drewniak 8e61fdc727 [mlir][ROCDL] Define MLIR wrappers around new MFMA intrinsics
In order to support newer hardware, define wrappers around MFMA
intrinsics that have not previously been exposed in the ROCDL dialect.

A `amdgpu.mfma` wrapper around these instructions is in development
and will provide a more user-friendly interface to them.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D128079
2022-06-20 15:03:45 +00:00
Krzysztof Drewniak e49ae6284c [mlir][Arith] Make --unsigned-when-equivalent use dialect conversion
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D128096
2022-06-20 15:03:07 +00:00
Alex Zinenko 8b68da2c7d [mlir] move SCF headers to SCF/{IR,Transforms} respectively
This aligns the SCF dialect file layout with the majority of the dialects.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D128049
2022-06-20 10:18:01 +02:00
Adrian Kuegel 132234fac7 [mlir] Fix ClangTidy performance finding (NFC) 2022-06-20 08:47:00 +02:00
lewuathe 72ee11a8cf [mlir][complex] Convert complex.conj to libm
Add conversion for complex.conj to libm call

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D127473
2022-06-20 09:38:50 +09:00
Nico Weber 7effcbda49 Rename parallelForEachN to just parallelFor
Patch created by running:

  rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/'

No behavior change.

Differential Revision: https://reviews.llvm.org/D128140
2022-06-19 17:49:00 -04:00
Kazu Hirata 30c675878c Use value_or instead of getValueOr (NFC) 2022-06-19 10:34:41 -07:00
Jacques Pienaar 8df54a6a03 [mlir] Update accessors to prefixed form (NFC)
Follow-up from flipping dialects to _Both: flip accessor uses to the prefixed
variant ahead of flipping from _Both to _Prefixed. This just flips to
the accessors introduced in the preceding change, which are just prefixed
forms of the existing accessors.

Mechanical change using helper script
https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.
2022-06-18 17:53:22 -07:00
Jacques Pienaar eca86cb2ed [mlir] Start migrating more dialects to prefixed form
Marked all dialects that could be (reasonably) easily flipped to _Both
prefix. Updating the accessors to prefixed form will happen in follow
up, this was to flush out conflicts and to mark all dialects explicitly
as I plan to flip OpBase default to _Prefixed to avoid needing to
migrate new dialects.

Except for Standalone example which got flipped to _Prefixed.

Differential Revision: https://reviews.llvm.org/D128027
2022-06-18 10:10:31 -07:00
Matthias Springer 99260e9583 [mlir][bufferization] Set emitAccessorPrefix dialect flag
Generate get/set accessors on all bufferization ops. Also update all internal uses.

Differential Revision: https://reviews.llvm.org/D128057
2022-06-18 10:26:29 +02:00
Benjamin Kramer 745a4caaeb [mlir] Fix an msvc warning
Float16bits.cpp(148): warning C4067: unexpected tokens following preprocessor directive - expected a newline
2022-06-18 10:07:51 +02:00
bixia1 e5e7e51473 [mlir][sparse][taco] Support complex types.
Support complex types of float and double. See the added test for an example.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D128076
2022-06-17 16:06:53 -07:00
Christopher Bate 829c84ec5b [mlir][nvgpu] fix MSVC warning regarding left shift
Differential Revision: https://reviews.llvm.org/D128088
2022-06-17 16:26:40 -06:00
Benjamin Kramer d5c29b23e1 [mlir][sparse] Inline the definition of LLVM_ATTRIBUTE_WEAK
This library is not supposed to have a dependency on LLVM, and linking
LLVMSupport into it breaks its shared library setup.
2022-06-17 22:41:10 +02:00
Benjamin Kramer 3420cd7caf [mlir][sparse] Add testing for bf16 and fallback for software bf16
This adds weak versions of the truncation libcalls in case the runtime
environment doesn't have them.

Differential Revision: https://reviews.llvm.org/D128091
2022-06-17 21:54:01 +02:00
Aart Bik 86d5d34c72 [mlir][sparse] renable f16 tests
Sparse library ABI issues are fixed.

https://github.com/llvm/llvm-project/issues/55992

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D128086
2022-06-17 12:48:24 -07:00
bixia1 48f4407c1a [mlir][linalg] Extend opdsl to support operations on complex types.
Linalg opdsl now supports negf/add/sub/mul on complex types.

Add a test.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D128010
2022-06-17 09:34:26 -07:00
Christopher Bate d089d68a2c [mlir][nvgpu] fix missing build dependency for NVGPUTransforms
Fixes build failure caused by 51b925df94
2022-06-17 09:42:49 -06:00
Aart Bik aef20f59a5 [mlir][sparse] move from by-value to by-reference for data types
This fixes all sorts of ABI issues due to passing by-value
(using by-reference with memrefs exclusively).

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D128018
2022-06-17 08:39:25 -07:00
Christopher Bate 51b925df94 [mlir][nvgpu] shared memory access optimization pass
This change adds a transformation and pass to the NvGPU dialect that
attempts to optimize reads/writes from a memref representing GPU shared
memory in order to avoid bank conflicts. Given a value representing a
shared memory memref, it traverses all reads/writes within the parent op
and, subject to suitable conditions, rewrites all last dimension index
values such that element locations in the final (col) dimension are
given by
`newColIdx = col % vecSize + perm[row](col/vecSize,row)`
where `perm` is a permutation function indexed by `row` and `vecSize`
is the vector access size in elements (currently assumes 128bit
vectorized accesses, but this can be made a parameter). This specific
transformation can help optimize typical distributed & vectorized accesses
common to loading matrix multiplication operands to/from shared memory.
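A literal, illustrative-only transcription of the formula above in C++ (`perm` here is a placeholder parameter; the real pass derives the permutation from the memref and the assumed 128-bit vector access width):

```cpp
#include <cstdint>
#include <functional>

// newColIdx = col % vecSize + perm[row](col / vecSize, row)
int64_t newColIdx(int64_t row, int64_t col, int64_t vecSize,
                  const std::function<int64_t(int64_t, int64_t)> &perm) {
  return col % vecSize + perm(col / vecSize, row);
}
```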

Differential Revision: https://reviews.llvm.org/D127457
2022-06-17 09:31:05 -06:00
Phoebe Wang 655ba9c8a1 Reland "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""
This resolves problems reported in commit 1a20252978.
1. Promote to float lowering for nodes XINT_TO_FP
2. Bail out f16 from shuffle combine due to vector type is not legal in the version
2022-06-17 21:34:05 +08:00
Matthias Springer b55d55ecd9 [mlir][bufferize][NFC] Remove BufferizationState
With the recent refactorings, this class is no longer needed. We can use BufferizationOptions in all places were BufferizationState was used.

Differential Revision: https://reviews.llvm.org/D127653
2022-06-17 14:04:11 +02:00
Matthias Springer b3ebe3beed [mlir][bufferize] Bufferize after TensorCopyInsertion
This change changes the bufferization so that it utilizes the new TensorCopyInsertion pass. One-Shot Bufferize no longer calls the One-Shot Analysis. Instead, it relies on the TensorCopyInsertion pass to make the entire IR fully inplacable. The `bufferize` implementations of all ops are simplified; they no longer have to account for out-of-place bufferization decisions. These were already materialized in the IR in the form of `bufferization.alloc_tensor` ops during the TensorCopyInsertion pass.

Differential Revision: https://reviews.llvm.org/D127652
2022-06-17 13:29:52 +02:00
Alex Zinenko 610139d2d9 [mlir] replace 'emit_c_wrappers' func->llvm conversion option with a pass
The 'emit_c_wrappers' option in the FuncToLLVM conversion requests C interface
wrappers to be emitted for every builtin function in the module. While this has
been useful to bootstrap the interface, it is problematic in the longer term as
it may unintentionally affect the functions that should retain their existing
interface, e.g., libm functions obtained by lowering math operations (see
D126964 for an example). Since D77314, we have a finer-grain control over
interface generation via an attribute that avoids the problem entirely. Remove
the 'emit_c_wrappers' option. Introduce the '-llvm-request-c-wrappers' pass
that can be run in any pipeline that needs blanket emission of functions to
annotate all builtin functions with the attribute before performing the usual
lowering that accounts for the attribute.

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D127952
2022-06-17 11:10:31 +02:00
Benjamin Kramer 1a20252978 Revert "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""
This reverts commit 04a3d5f3a1.

I see two more issues:

- uitofp/sitofp from i32/i64 to half now generates
  __floatsihf/__floatdihf, which exists in neither compiler-rt nor
  libgcc

- This crashes when legalizing the bitcast:
```
; RUN: llc < %s -mcpu=skx
define void @main.45(ptr nocapture readnone %retval, ptr noalias nocapture readnone %run_options, ptr noalias nocapture readnone %params, ptr noalias nocapture readonly %buffer_table, ptr noalias nocapture readnone %status, ptr noalias nocapture readnone %prof_counters) local_unnamed_addr {
entry:
  %fusion = load ptr, ptr %buffer_table, align 8
  %0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
  %Arg_1.2 = load ptr, ptr %0, align 8
  %1 = getelementptr inbounds ptr, ptr %buffer_table, i64 2
  %Arg_0.1 = load ptr, ptr %1, align 8
  %2 = load half, ptr %Arg_0.1, align 8
  %3 = bitcast half %2 to i16
  %4 = and i16 %3, 32767
  %5 = icmp eq i16 %4, 0
  %6 = and i16 %3, -32768
  %broadcast.splatinsert = insertelement <4 x half> poison, half %2, i64 0
  %broadcast.splat = shufflevector <4 x half> %broadcast.splatinsert, <4 x half> poison, <4 x i32> zeroinitializer
  %broadcast.splatinsert9 = insertelement <4 x i16> poison, i16 %4, i64 0
  %broadcast.splat10 = shufflevector <4 x i16> %broadcast.splatinsert9, <4 x i16> poison, <4 x i32> zeroinitializer
  %broadcast.splatinsert11 = insertelement <4 x i16> poison, i16 %6, i64 0
  %broadcast.splat12 = shufflevector <4 x i16> %broadcast.splatinsert11, <4 x i16> poison, <4 x i32> zeroinitializer
  %broadcast.splatinsert13 = insertelement <4 x i16> poison, i16 %3, i64 0
  %broadcast.splat14 = shufflevector <4 x i16> %broadcast.splatinsert13, <4 x i16> poison, <4 x i32> zeroinitializer
  %wide.load = load <4 x half>, ptr %Arg_1.2, align 8
  %7 = fcmp uno <4 x half> %broadcast.splat, %wide.load
  %8 = fcmp oeq <4 x half> %broadcast.splat, %wide.load
  %9 = bitcast <4 x half> %wide.load to <4 x i16>
  %10 = and <4 x i16> %9, <i16 32767, i16 32767, i16 32767, i16 32767>
  %11 = icmp eq <4 x i16> %10, zeroinitializer
  %12 = and <4 x i16> %9, <i16 -32768, i16 -32768, i16 -32768, i16 -32768>
  %13 = or <4 x i16> %12, <i16 1, i16 1, i16 1, i16 1>
  %14 = select <4 x i1> %11, <4 x i16> %9, <4 x i16> %13
  %15 = icmp ugt <4 x i16> %broadcast.splat10, %10
  %16 = icmp ne <4 x i16> %broadcast.splat12, %12
  %17 = or <4 x i1> %15, %16
  %18 = select <4 x i1> %17, <4 x i16> <i16 -1, i16 -1, i16 -1, i16 -1>, <4 x i16> <i16 1, i16 1, i16 1, i16 1>
  %19 = add <4 x i16> %18, %broadcast.splat14
  %20 = select i1 %5, <4 x i16> %14, <4 x i16> %19
  %21 = select <4 x i1> %8, <4 x i16> %9, <4 x i16> %20
  %22 = bitcast <4 x i16> %21 to <4 x half>
  %23 = select <4 x i1> %7, <4 x half> <half 0xH7E00, half 0xH7E00, half 0xH7E00, half 0xH7E00>, <4 x half> %22
  store <4 x half> %23, ptr %fusion, align 16
  ret void
}
```

llc: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:977: void (anonymous namespace)::SelectionDAGLegalize::LegalizeOp(llvm::SDNode *): Assertion `(TLI.getTypeAction(*DAG.getContext(), Op.getValueType()) == TargetLowering::TypeLegal || Op.getOpcode() == ISD::TargetConstant || Op.getOpcode() == ISD::Register) && "Unexpected illegal type!"' failed.
2022-06-17 09:43:07 +02:00
Phoebe Wang 04a3d5f3a1 Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""
Fix the crash on lowering X86ISD::FCMP.
2022-06-17 12:12:17 +08:00
Jacques Pienaar 02b9ddb2f2 [mlir] Disable warning in test of deprecated feature (NFC)
Disable warning for deprecation in test of deprecated feature. Also
remove additional test of deprecated feature from TestOps.td.
2022-06-16 20:15:13 -07:00
Jacques Pienaar d30c0221cf [mlir] Split MLProgram global load and store to Graph variants
* Split ops into X_graph variants as discussed;
* Remove tokens from non-Graph region variants and rely on side-effect
  modelling there, while removing side-effect modelling from Graph
  variants and relying on explicit ordering there;
* Make tokens required to be produced by Graph variants, but keep the
  explicit token type specification, given the previous discussion on this
  potentially being configurable in the future;

This results in duplicating some code. I considered adding helper
functions but decided against introducing an abstraction this early, given
the size of the duplication and the risk of creating accidental coupling.

Differential Revision: https://reviews.llvm.org/D127813
2022-06-16 20:01:54 -07:00
Jacques Pienaar 287ade415e [mlir][doc] Avoid duplication with constraints and defs
Where a constraint also has a def, emit only the def to avoid duplicate
output (the def also has more complete info). Also move attributes and types
to the end rather than having some at the top and some at the end.

Differential Revision: https://reviews.llvm.org/D127823
2022-06-16 19:42:56 -07:00
Aart Bik 2a2886160d [mlir][sparse] improved testing and codegen for semi-ring operations
The semi-ring blocks were simply "inlined" by the sparse compiler but
without any filtering or patching. This revision improves the analysis
(rejecting blocks that use non-invariant computations from outside
their blocks, except for linalg.index) and also improves the codegen
by properly patching up index computations (previous version crashed).

With a regression test. Also updated the documentation now that the
example code is properly working.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128000
2022-06-16 16:13:42 -07:00
Aart Bik 36c01876d7 [mlir][sparse] fix asan issue
The LinalgElementwiseOpFusion pass has become smarter, and converts
the simple conversion linalg operation into a sparse dialect convert
operation. However, since our current bufferization does not take the
new semantics into consideration, we leak memory of the allocation.
For now, this has been fixed by making the operation less trivial.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D128002
2022-06-16 14:49:02 -07:00
bixia1 bbb73ade43 [mlir][complex] Add Python bindings for complex ops.
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127916
2022-06-16 14:19:11 -07:00
Thomas Raoux f011d32c3a [mlir][vector] Fix contraction op lowering with mixed types
The contraction op can have mixed types; add support for this case to the
pattern lowering the contraction op to an outer product.

Differential Revision: https://reviews.llvm.org/D127926
2022-06-16 16:40:56 +00:00
Thomas Raoux 046ebeb605 [mlir][linalg] Relax convolution vectorization to support mixed types
Support the case where convolution does float extension of the inputs.

Differential Revision: https://reviews.llvm.org/D127925
2022-06-16 16:29:46 +00:00
Lei Zhang 2320a4ae90 [mlir][spirv] Workaround driver bug in math.ctlz conversion again
The previous approach does not work as the Adreno driver is
clever at optimizing away the selection. So now check two
inputs together.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127930
2022-06-16 10:53:49 -04:00
Mark Browning bccf27d934 [mlir][python] Actually set UseLocalScope printing flag
The useLocalScope printing flag has been passed around between pybind methods, but doesn't actually enable the corresponding printing flag.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D127907
2022-06-15 22:01:34 -07:00
Lei Zhang f3bc0fccd6 [mlir][spirv] Define spv.ISubBorrowOp
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127909
2022-06-15 20:38:53 -04:00
Frederik Gossen 3cd5696a33 Revert "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""
This reverts commit e1c5afa47d.

This introduces crashes in the JAX backend on CPU. A reproducer in LLVM is
below. Let me know if you have trouble reproducing this.

; ModuleID = '__compute_module'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

@0 = private unnamed_addr constant [4 x i8] c"\00\00\00?"
@1 = private unnamed_addr constant [4 x i8] c"\1C}\908"
@2 = private unnamed_addr constant [4 x i8] c"?\00\\4"
@3 = private unnamed_addr constant [4 x i8] c"%ci1"
@4 = private unnamed_addr constant [4 x i8] zeroinitializer
@5 = private unnamed_addr constant [4 x i8] c"\00\00\00\C0"
@6 = private unnamed_addr constant [4 x i8] c"\00\00\00B"
@7 = private unnamed_addr constant [4 x i8] c"\94\B4\C22"
@8 = private unnamed_addr constant [4 x i8] c"^\09B6"
@9 = private unnamed_addr constant [4 x i8] c"\15\F3M?"
@10 = private unnamed_addr constant [4 x i8] c"e\CC\\;"
@11 = private unnamed_addr constant [4 x i8] c"d\BD/>"
@12 = private unnamed_addr constant [4 x i8] c"V\F4I="
@13 = private unnamed_addr constant [4 x i8] c"\10\CB,<"
@14 = private unnamed_addr constant [4 x i8] c"\AC\E3\D6:"
@15 = private unnamed_addr constant [4 x i8] c"\DC\A8E9"
@16 = private unnamed_addr constant [4 x i8] c"\C6\FA\897"
@17 = private unnamed_addr constant [4 x i8] c"%\F9\955"
@18 = private unnamed_addr constant [4 x i8] c"\B5\DB\813"
@19 = private unnamed_addr constant [4 x i8] c"\B4W_\B2"
@20 = private unnamed_addr constant [4 x i8] c"\1Cc\8F\B4"
@21 = private unnamed_addr constant [4 x i8] c"~3\94\B6"
@22 = private unnamed_addr constant [4 x i8] c"3Yq\B8"
@23 = private unnamed_addr constant [4 x i8] c"\E9\17\17\BA"
@24 = private unnamed_addr constant [4 x i8] c"\F1\B2\8D\BB"
@25 = private unnamed_addr constant [4 x i8] c"\F8t\C2\BC"
@26 = private unnamed_addr constant [4 x i8] c"\82[\C2\BD"
@27 = private unnamed_addr constant [4 x i8] c"uB-?"
@28 = private unnamed_addr constant [4 x i8] c"^\FF\9B\BE"
@29 = private unnamed_addr constant [4 x i8] c"\00\00\00A"

; Function Attrs: uwtable
define void @main.158(ptr %retval, ptr noalias %run_options, ptr noalias %params, ptr noalias %buffer_table, ptr noalias %status, ptr noalias %prof_counters) #0 {
entry:
  %fusion.invar_address.dim.1 = alloca i64, align 8
  %fusion.invar_address.dim.0 = alloca i64, align 8
  %0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
  %Arg_0.1 = load ptr, ptr %0, align 8, !invariant.load !0, !dereferenceable !1, !align !2
  %1 = getelementptr inbounds ptr, ptr %buffer_table, i64 0
  %fusion = load ptr, ptr %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2
  store i64 0, ptr %fusion.invar_address.dim.0, align 8
  br label %fusion.loop_header.dim.0

return:                                           ; preds = %fusion.loop_exit.dim.0
  ret void

fusion.loop_header.dim.0:                         ; preds = %fusion.loop_exit.dim.1, %entry
  %fusion.indvar.dim.0 = load i64, ptr %fusion.invar_address.dim.0, align 8
  %2 = icmp uge i64 %fusion.indvar.dim.0, 3
  br i1 %2, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0

fusion.loop_body.dim.0:                           ; preds = %fusion.loop_header.dim.0
  store i64 0, ptr %fusion.invar_address.dim.1, align 8
  br label %fusion.loop_header.dim.1

fusion.loop_header.dim.1:                         ; preds = %fusion.loop_body.dim.1, %fusion.loop_body.dim.0
  %fusion.indvar.dim.1 = load i64, ptr %fusion.invar_address.dim.1, align 8
  %3 = icmp uge i64 %fusion.indvar.dim.1, 1
  br i1 %3, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1

fusion.loop_body.dim.1:                           ; preds = %fusion.loop_header.dim.1
  %4 = getelementptr inbounds [3 x [1 x half]], ptr %Arg_0.1, i64 0, i64 %fusion.indvar.dim.0, i64 0
  %5 = load half, ptr %4, align 2, !invariant.load !0, !noalias !3
  %6 = fpext half %5 to float
  %7 = call float @llvm.fabs.f32(float %6)
  %constant.121 = load float, ptr @29, align 4
  %compare.2 = fcmp ole float %7, %constant.121
  %8 = zext i1 %compare.2 to i8
  %constant.120 = load float, ptr @0, align 4
  %multiply.95 = fmul float %7, %constant.120
  %constant.119 = load float, ptr @5, align 4
  %add.82 = fadd float %multiply.95, %constant.119
  %constant.118 = load float, ptr @4, align 4
  %multiply.94 = fmul float %add.82, %constant.118
  %constant.117 = load float, ptr @19, align 4
  %add.81 = fadd float %multiply.94, %constant.117
  %multiply.92 = fmul float %add.82, %add.81
  %constant.116 = load float, ptr @18, align 4
  %add.79 = fadd float %multiply.92, %constant.116
  %multiply.91 = fmul float %add.82, %add.79
  %subtract.87 = fsub float %multiply.91, %add.81
  %constant.115 = load float, ptr @20, align 4
  %add.78 = fadd float %subtract.87, %constant.115
  %multiply.89 = fmul float %add.82, %add.78
  %subtract.86 = fsub float %multiply.89, %add.79
  %constant.114 = load float, ptr @17, align 4
  %add.76 = fadd float %subtract.86, %constant.114
  %multiply.88 = fmul float %add.82, %add.76
  %subtract.84 = fsub float %multiply.88, %add.78
  %constant.113 = load float, ptr @21, align 4
  %add.75 = fadd float %subtract.84, %constant.113
  %multiply.86 = fmul float %add.82, %add.75
  %subtract.83 = fsub float %multiply.86, %add.76
  %constant.112 = load float, ptr @16, align 4
  %add.73 = fadd float %subtract.83, %constant.112
  %multiply.85 = fmul float %add.82, %add.73
  %subtract.81 = fsub float %multiply.85, %add.75
  %constant.111 = load float, ptr @22, align 4
  %add.72 = fadd float %subtract.81, %constant.111
  %multiply.83 = fmul float %add.82, %add.72
  %subtract.80 = fsub float %multiply.83, %add.73
  %constant.110 = load float, ptr @15, align 4
  %add.70 = fadd float %subtract.80, %constant.110
  %multiply.82 = fmul float %add.82, %add.70
  %subtract.78 = fsub float %multiply.82, %add.72
  %constant.109 = load float, ptr @23, align 4
  %add.69 = fadd float %subtract.78, %constant.109
  %multiply.80 = fmul float %add.82, %add.69
  %subtract.77 = fsub float %multiply.80, %add.70
  %constant.108 = load float, ptr @14, align 4
  %add.68 = fadd float %subtract.77, %constant.108
  %multiply.79 = fmul float %add.82, %add.68
  %subtract.75 = fsub float %multiply.79, %add.69
  %constant.107 = load float, ptr @24, align 4
  %add.67 = fadd float %subtract.75, %constant.107
  %multiply.77 = fmul float %add.82, %add.67
  %subtract.74 = fsub float %multiply.77, %add.68
  %constant.106 = load float, ptr @13, align 4
  %add.66 = fadd float %subtract.74, %constant.106
  %multiply.76 = fmul float %add.82, %add.66
  %subtract.72 = fsub float %multiply.76, %add.67
  %constant.105 = load float, ptr @25, align 4
  %add.65 = fadd float %subtract.72, %constant.105
  %multiply.74 = fmul float %add.82, %add.65
  %subtract.71 = fsub float %multiply.74, %add.66
  %constant.104 = load float, ptr @12, align 4
  %add.64 = fadd float %subtract.71, %constant.104
  %multiply.73 = fmul float %add.82, %add.64
  %subtract.69 = fsub float %multiply.73, %add.65
  %constant.103 = load float, ptr @26, align 4
  %add.63 = fadd float %subtract.69, %constant.103
  %multiply.71 = fmul float %add.82, %add.63
  %subtract.67 = fsub float %multiply.71, %add.64
  %constant.102 = load float, ptr @11, align 4
  %add.62 = fadd float %subtract.67, %constant.102
  %multiply.70 = fmul float %add.82, %add.62
  %subtract.66 = fsub float %multiply.70, %add.63
  %constant.101 = load float, ptr @28, align 4
  %add.61 = fadd float %subtract.66, %constant.101
  %multiply.68 = fmul float %add.82, %add.61
  %subtract.65 = fsub float %multiply.68, %add.62
  %constant.100 = load float, ptr @27, align 4
  %add.60 = fadd float %subtract.65, %constant.100
  %subtract.64 = fsub float %add.60, %add.62
  %multiply.66 = fmul float %subtract.64, %constant.120
  %constant.99 = load float, ptr @6, align 4
  %divide.4 = fdiv float %constant.99, %7
  %add.59 = fadd float %divide.4, %constant.119
  %multiply.65 = fmul float %add.59, %constant.118
  %constant.98 = load float, ptr @3, align 4
  %add.58 = fadd float %multiply.65, %constant.98
  %multiply.64 = fmul float %add.59, %add.58
  %constant.97 = load float, ptr @7, align 4
  %add.57 = fadd float %multiply.64, %constant.97
  %multiply.63 = fmul float %add.59, %add.57
  %subtract.63 = fsub float %multiply.63, %add.58
  %constant.96 = load float, ptr @2, align 4
  %add.56 = fadd float %subtract.63, %constant.96
  %multiply.62 = fmul float %add.59, %add.56
  %subtract.62 = fsub float %multiply.62, %add.57
  %constant.95 = load float, ptr @8, align 4
  %add.55 = fadd float %subtract.62, %constant.95
  %multiply.61 = fmul float %add.59, %add.55
  %subtract.61 = fsub float %multiply.61, %add.56
  %constant.94 = load float, ptr @1, align 4
  %add.54 = fadd float %subtract.61, %constant.94
  %multiply.60 = fmul float %add.59, %add.54
  %subtract.60 = fsub float %multiply.60, %add.55
  %constant.93 = load float, ptr @10, align 4
  %add.53 = fadd float %subtract.60, %constant.93
  %multiply.59 = fmul float %add.59, %add.53
  %subtract.59 = fsub float %multiply.59, %add.54
  %constant.92 = load float, ptr @9, align 4
  %add.52 = fadd float %subtract.59, %constant.92
  %subtract.58 = fsub float %add.52, %add.54
  %multiply.58 = fmul float %subtract.58, %constant.120
  %9 = call float @llvm.sqrt.f32(float %7)
  %10 = fdiv float 1.000000e+00, %9
  %multiply.57 = fmul float %multiply.58, %10
  %11 = trunc i8 %8 to i1
  %12 = select i1 %11, float %multiply.66, float %multiply.57
  %13 = fptrunc float %12 to half
  %14 = getelementptr inbounds [3 x [1 x half]], ptr %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 0
  store half %13, ptr %14, align 2, !alias.scope !3
  %invar.inc1 = add nuw nsw i64 %fusion.indvar.dim.1, 1
  store i64 %invar.inc1, ptr %fusion.invar_address.dim.1, align 8
  br label %fusion.loop_header.dim.1

fusion.loop_exit.dim.1:                           ; preds = %fusion.loop_header.dim.1
  %invar.inc = add nuw nsw i64 %fusion.indvar.dim.0, 1
  store i64 %invar.inc, ptr %fusion.invar_address.dim.0, align 8
  br label %fusion.loop_header.dim.0

fusion.loop_exit.dim.0:                           ; preds = %fusion.loop_header.dim.0
  br label %return
}

; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare float @llvm.fabs.f32(float %0) #1

; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn
declare float @llvm.sqrt.f32(float %0) #1

attributes #0 = { uwtable "denormal-fp-math"="preserve-sign" "no-frame-pointer-elim"="false" }
attributes #1 = { nocallback nofree nosync nounwind readnone speculatable willreturn }

!0 = !{}
!1 = !{i64 6}
!2 = !{i64 8}
!3 = !{!4}
!4 = !{!"buffer: {index:0, offset:0, size:6}", !5}
!5 = !{!"XLA global AA domain"}
2022-06-15 18:04:42 -04:00
Okwan Kwon f4ad203930 [mlir] create PrintOpStatsPass using printAsJSON
This was missed by the previous commit in OpStats.cpp.
2022-06-15 14:49:54 -07:00
Min-Yih Hsu cd8978e19e [mlir][LLVMIR] Ask ICmpOp to return vector<Nxi1> when needed
If any of the operands of ICmpOp is a vector, return a vector<Nxi1>
result rather than an i1 result.
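As an illustration (a minimal sketch; the value names are hypothetical):

```mlir
// Scalar operands: the result stays i1.
%r0 = llvm.icmp "slt" %a, %b : i64
// Vector operands: the result is now vector<4xi1> instead of i1.
%r1 = llvm.icmp "slt" %v0, %v1 : vector<4xi64>
```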

Differential Revision: https://reviews.llvm.org/D127536
2022-06-15 14:33:48 -07:00
Min-Yih Hsu 719e24d39f [mlir][LLVMIR] Use isScalableVectorType in ShuffleVectorOp::parse
Instead of casting the incoming operand into VectorType to check whether
it is scalable.
This is the place I missed fixing in f088b99eac.

Differential Revision: https://reviews.llvm.org/D127535
2022-06-15 14:33:48 -07:00
Min-Yih Hsu dcdd5d312f [mlir][LLVMIR] Use insertelement if needed when translating ConstantAggregate
When translating from an llvm::ConstantAggregate with vector type, we
should lower to insertelement operations (if needed) rather than using
insertvalue.

Differential Revision: https://reviews.llvm.org/D127534
2022-06-15 14:33:47 -07:00
Okwan Kwon 85f19f99e4 [mlir] add createPrintOpStatsPass() with explicit params
This allows to set printAsJSON through the create function.

Differential Revision: https://reviews.llvm.org/D127891
2022-06-15 12:09:59 -07:00
Thomas Raoux a6f2c2291e [mlir][GPUToNVVM] Fix bug in mma elementwise lowering
The maxf implementation of the wmma elementwise op was incorrect, as the
operands of the select used to check for NaN were swapped.

Differential Revision: https://reviews.llvm.org/D127879
2022-06-15 17:23:17 +00:00
Okwan Kwon 8010d7e044 [mlir] add an option to print op stats in JSON
Differential Revision: https://reviews.llvm.org/D127691
2022-06-15 10:07:36 -07:00
Rob Suderman 640973f2b9 [tosa] Lower tosa.slice to tensor.slice for dynamic case
The existing slice lowering only supported static shapes.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D127704
2022-06-15 09:54:36 -07:00
Alex Zinenko 1d45282aa3 [mlir] address post-commit review for D127724
- make transform.alternatives op apply only to isolated-from-above payload IR
  scopes;
- fix potential leak;
- fix several typos.
2022-06-15 18:43:05 +02:00
lorenzo chelini f2ada383f2 [MLIR][Bufferization] Assume alias if no information is available
- Post (minor) fix after: https://reviews.llvm.org/D127301

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127868
2022-06-15 18:41:51 +02:00
Thomas Raoux 6834803c3d [mlir][vector] NFC remove dependency of VectorTransform to GPU dialect
Make the reduction distribution pattern more generic and remove a layering
problem. The new pattern to distribute reductions is now independent of
the GPU dialect and takes a lambda to decide how the distributed reduction
should be generated.

Differential Revision: https://reviews.llvm.org/D127867
2022-06-15 16:08:29 +00:00
Phoebe Wang e1c5afa47d Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""
Fixed the missing SQRT promotion. Adding several missing operations too.
2022-06-15 23:00:18 +08:00
Matthias Springer 989d2b5186 [mlir][tablegen] Generate default attr values in Python bindings
When specifying an op attribute with a default value (via DefaultValuedAttr), the default value is a string of C++ code. In the general case, the default value of such an attribute cannot be translated to Python when generating the bindings. However, we can hard-code default Python values for frequently-used C++ default values.

This change adds a Python default value for empty ArrayAttrs.

Differential Revision: https://reviews.llvm.org/D127750
2022-06-15 16:40:27 +02:00
Alex Zinenko 9c8fe394cf [mlir] check interfaces are attached to the expected object
Add static assertions into the various `attachInterface` methods, which are
used for adding external interface implementations to attributes, operations
and types, that ensure `ExternalModel` interface classes are instantiated for
the same concrete operation for the concrete base (potentially self) attribute
or type as they are attached to. `FallbackModel`s remain usable for generic
interface models that should support more than one kind of entities.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127850
2022-06-15 15:07:57 +02:00
Alex Zinenko 2f1791c43c [mlir] generate documentation for transform dialect extensions 2022-06-15 15:06:20 +02:00
Thomas Joerg 37455b1f71 Revert "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""
This reverts commit 6e02e27536.

This introduces a crash in the backend. Reproducer in MLIR's LLVM
dialect follows. Let me know if you have trouble reproducing this.

module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @_mlir_ciface_tf_report_error(!llvm.ptr<i8>, i32, !llvm.ptr<i8>)
  llvm.mlir.global internal constant @error_message_2208944672953921889("failed to allocate memory at loc(\22-\22:3:8)\00")
  llvm.func @_mlir_ciface_tf_alloc(!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
  llvm.func @Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<i8>, %arg1: i64, %arg2: !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)> attributes {llvm.emit_c_interface, tf_entry} {
    %0 = llvm.mlir.constant(8 : i32) : i32
    %1 = llvm.mlir.constant(8 : index) : i64
    %2 = llvm.mlir.constant(2 : index) : i64
    %3 = llvm.mlir.constant(dense<0.000000e+00> : vector<4xf16>) : vector<4xf16>
    %4 = llvm.mlir.constant(dense<[0, 1, 2, 3]> : vector<4xi32>) : vector<4xi32>
    %5 = llvm.mlir.constant(dense<1.000000e+00> : vector<4xf16>) : vector<4xf16>
    %6 = llvm.mlir.constant(false) : i1
    %7 = llvm.mlir.constant(1 : i32) : i32
    %8 = llvm.mlir.constant(0 : i32) : i32
    %9 = llvm.mlir.constant(4 : index) : i64
    %10 = llvm.mlir.constant(0 : index) : i64
    %11 = llvm.mlir.constant(1 : index) : i64
    %12 = llvm.mlir.constant(-1 : index) : i64
    %13 = llvm.mlir.null : !llvm.ptr<f16>
    %14 = llvm.getelementptr %13[%9] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
    %15 = llvm.ptrtoint %14 : !llvm.ptr<f16> to i64
    %16 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
    %17 = llvm.alloca %15 x f16 {alignment = 32 : i64} : (i64) -> !llvm.ptr<f16>
    %18 = llvm.mlir.null : !llvm.ptr<i64>
    %19 = llvm.getelementptr %18[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %20 = llvm.ptrtoint %19 : !llvm.ptr<i64> to i64
    %21 = llvm.alloca %20 x i64 : (i64) -> !llvm.ptr<i64>
    llvm.br ^bb1(%10 : i64)
  ^bb1(%22: i64):  // 2 preds: ^bb0, ^bb2
    %23 = llvm.icmp "slt" %22, %arg1 : i64
    llvm.cond_br %23, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    %24 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
    %25 = llvm.getelementptr %24[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
    %26 = llvm.add %22, %11  : i64
    %27 = llvm.getelementptr %25[%26] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %28 = llvm.load %27 : !llvm.ptr<i64>
    %29 = llvm.getelementptr %21[%22] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    llvm.store %28, %29 : !llvm.ptr<i64>
    llvm.br ^bb1(%26 : i64)
  ^bb3:  // pred: ^bb1
    llvm.br ^bb4(%10, %11 : i64, i64)
  ^bb4(%30: i64, %31: i64):  // 2 preds: ^bb3, ^bb5
    %32 = llvm.icmp "slt" %30, %arg1 : i64
    llvm.cond_br %32, ^bb5, ^bb6
  ^bb5:  // pred: ^bb4
    %33 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>
    %34 = llvm.getelementptr %33[%10, 2] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64)>>, i64) -> !llvm.ptr<i64>
    %35 = llvm.add %30, %11  : i64
    %36 = llvm.getelementptr %34[%35] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %37 = llvm.load %36 : !llvm.ptr<i64>
    %38 = llvm.mul %37, %31  : i64
    llvm.br ^bb4(%35, %38 : i64, i64)
  ^bb6:  // pred: ^bb4
    %39 = llvm.bitcast %arg2 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
    %40 = llvm.getelementptr %39[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
    %41 = llvm.load %40 : !llvm.ptr<ptr<f16>>
    %42 = llvm.getelementptr %13[%11] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
    %43 = llvm.ptrtoint %42 : !llvm.ptr<f16> to i64
    %44 = llvm.alloca %7 x i32 : (i32) -> !llvm.ptr<i32>
    llvm.store %8, %44 : !llvm.ptr<i32>
    %45 = llvm.call @_mlir_ciface_tf_alloc(%arg0, %31, %43, %8, %7, %44) : (!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
    %46 = llvm.bitcast %45 : !llvm.ptr<i8> to !llvm.ptr<f16>
    %47 = llvm.icmp "eq" %31, %10 : i64
    %48 = llvm.or %6, %47  : i1
    %49 = llvm.mlir.null : !llvm.ptr<i8>
    %50 = llvm.icmp "ne" %45, %49 : !llvm.ptr<i8>
    %51 = llvm.or %50, %48  : i1
    llvm.cond_br %51, ^bb7, ^bb13
  ^bb7:  // pred: ^bb6
    %52 = llvm.urem %31, %9  : i64
    %53 = llvm.sub %31, %52  : i64
    llvm.br ^bb8(%10 : i64)
  ^bb8(%54: i64):  // 2 preds: ^bb7, ^bb9
    %55 = llvm.icmp "slt" %54, %53 : i64
    llvm.cond_br %55, ^bb9, ^bb10
  ^bb9:  // pred: ^bb8
    %56 = llvm.mul %54, %11  : i64
    %57 = llvm.add %56, %10  : i64
    %58 = llvm.add %57, %10  : i64
    %59 = llvm.getelementptr %41[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
    %60 = llvm.bitcast %59 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
    %61 = llvm.load %60 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
    %62 = "llvm.intr.sqrt"(%61) : (vector<4xf16>) -> vector<4xf16>
    %63 = llvm.fdiv %5, %62  : vector<4xf16>
    %64 = llvm.getelementptr %46[%58] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
    %65 = llvm.bitcast %64 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
    llvm.store %63, %65 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
    %66 = llvm.add %54, %9  : i64
    llvm.br ^bb8(%66 : i64)
  ^bb10:  // pred: ^bb8
    %67 = llvm.icmp "ult" %53, %31 : i64
    llvm.cond_br %67, ^bb11, ^bb12
  ^bb11:  // pred: ^bb10
    %68 = llvm.mul %53, %12  : i64
    %69 = llvm.add %31, %68  : i64
    %70 = llvm.mul %53, %11  : i64
    %71 = llvm.add %70, %10  : i64
    %72 = llvm.trunc %69 : i64 to i32
    %73 = llvm.mlir.undef : vector<4xi32>
    %74 = llvm.insertelement %72, %73[%8 : i32] : vector<4xi32>
    %75 = llvm.shufflevector %74, %73 [0 : i32, 0 : i32, 0 : i32, 0 : i32] : vector<4xi32>, vector<4xi32>
    %76 = llvm.icmp "slt" %4, %75 : vector<4xi32>
    %77 = llvm.add %71, %10  : i64
    %78 = llvm.getelementptr %41[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
    %79 = llvm.bitcast %78 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
    %80 = llvm.intr.masked.load %79, %76, %3 {alignment = 2 : i32} : (!llvm.ptr<vector<4xf16>>, vector<4xi1>, vector<4xf16>) -> vector<4xf16>
    %81 = llvm.bitcast %16 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
    llvm.store %80, %81 : !llvm.ptr<vector<4xf16>>
    %82 = llvm.load %81 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
    %83 = "llvm.intr.sqrt"(%82) : (vector<4xf16>) -> vector<4xf16>
    %84 = llvm.fdiv %5, %83  : vector<4xf16>
    %85 = llvm.bitcast %17 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
    llvm.store %84, %85 {alignment = 2 : i64} : !llvm.ptr<vector<4xf16>>
    %86 = llvm.load %85 : !llvm.ptr<vector<4xf16>>
    %87 = llvm.getelementptr %46[%77] : (!llvm.ptr<f16>, i64) -> !llvm.ptr<f16>
    %88 = llvm.bitcast %87 : !llvm.ptr<f16> to !llvm.ptr<vector<4xf16>>
    llvm.intr.masked.store %86, %88, %76 {alignment = 2 : i32} : vector<4xf16>, vector<4xi1> into !llvm.ptr<vector<4xf16>>
    llvm.br ^bb12
  ^bb12:  // 2 preds: ^bb10, ^bb11
    %89 = llvm.mul %2, %1  : i64
    %90 = llvm.mul %arg1, %2  : i64
    %91 = llvm.add %90, %11  : i64
    %92 = llvm.mul %91, %1  : i64
    %93 = llvm.add %89, %92  : i64
    %94 = llvm.alloca %93 x i8 : (i64) -> !llvm.ptr<i8>
    %95 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
    llvm.store %46, %95 : !llvm.ptr<ptr<f16>>
    %96 = llvm.getelementptr %95[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
    llvm.store %46, %96 : !llvm.ptr<ptr<f16>>
    %97 = llvm.getelementptr %95[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
    %98 = llvm.bitcast %97 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
    llvm.store %10, %98 : !llvm.ptr<i64>
    %99 = llvm.bitcast %94 : !llvm.ptr<i8> to !llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>
    %100 = llvm.getelementptr %99[%10, 3] : (!llvm.ptr<struct<(ptr<f16>, ptr<f16>, i64, i64)>>, i64) -> !llvm.ptr<i64>
    %101 = llvm.getelementptr %100[%arg1] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %102 = llvm.sub %arg1, %11  : i64
    llvm.br ^bb14(%102, %11 : i64, i64)
  ^bb13:  // pred: ^bb6
    %103 = llvm.mlir.addressof @error_message_2208944672953921889 : !llvm.ptr<array<42 x i8>>
    %104 = llvm.getelementptr %103[%10, %10] : (!llvm.ptr<array<42 x i8>>, i64, i64) -> !llvm.ptr<i8>
    llvm.call @_mlir_ciface_tf_report_error(%arg0, %0, %104) : (!llvm.ptr<i8>, i32, !llvm.ptr<i8>) -> ()
    %105 = llvm.mul %2, %1  : i64
    %106 = llvm.mul %2, %10  : i64
    %107 = llvm.add %106, %11  : i64
    %108 = llvm.mul %107, %1  : i64
    %109 = llvm.add %105, %108  : i64
    %110 = llvm.alloca %109 x i8 : (i64) -> !llvm.ptr<i8>
    %111 = llvm.bitcast %110 : !llvm.ptr<i8> to !llvm.ptr<ptr<f16>>
    llvm.store %13, %111 : !llvm.ptr<ptr<f16>>
    %112 = llvm.getelementptr %111[%11] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
    llvm.store %13, %112 : !llvm.ptr<ptr<f16>>
    %113 = llvm.getelementptr %111[%2] : (!llvm.ptr<ptr<f16>>, i64) -> !llvm.ptr<ptr<f16>>
    %114 = llvm.bitcast %113 : !llvm.ptr<ptr<f16>> to !llvm.ptr<i64>
    llvm.store %10, %114 : !llvm.ptr<i64>
    %115 = llvm.call @malloc(%109) : (i64) -> !llvm.ptr<i8>
    "llvm.intr.memcpy"(%115, %110, %109, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
    %116 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
    %117 = llvm.insertvalue %10, %116[0] : !llvm.struct<(i64, ptr<i8>)>
    %118 = llvm.insertvalue %115, %117[1] : !llvm.struct<(i64, ptr<i8>)>
    llvm.return %118 : !llvm.struct<(i64, ptr<i8>)>
  ^bb14(%119: i64, %120: i64):  // 2 preds: ^bb12, ^bb15
    %121 = llvm.icmp "sge" %119, %10 : i64
    llvm.cond_br %121, ^bb15, ^bb16
  ^bb15:  // pred: ^bb14
    %122 = llvm.getelementptr %21[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    %123 = llvm.load %122 : !llvm.ptr<i64>
    %124 = llvm.getelementptr %100[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    llvm.store %123, %124 : !llvm.ptr<i64>
    %125 = llvm.getelementptr %101[%119] : (!llvm.ptr<i64>, i64) -> !llvm.ptr<i64>
    llvm.store %120, %125 : !llvm.ptr<i64>
    %126 = llvm.mul %120, %123  : i64
    %127 = llvm.sub %119, %11  : i64
    llvm.br ^bb14(%127, %126 : i64, i64)
  ^bb16:  // pred: ^bb14
    %128 = llvm.call @malloc(%93) : (i64) -> !llvm.ptr<i8>
    "llvm.intr.memcpy"(%128, %94, %93, %6) : (!llvm.ptr<i8>, !llvm.ptr<i8>, i64, i1) -> ()
    %129 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
    %130 = llvm.insertvalue %arg1, %129[0] : !llvm.struct<(i64, ptr<i8>)>
    %131 = llvm.insertvalue %128, %130[1] : !llvm.struct<(i64, ptr<i8>)>
    llvm.return %131 : !llvm.struct<(i64, ptr<i8>)>
  }
  llvm.func @_mlir_ciface_Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<struct<(i64, ptr<i8>)>>, %arg1: !llvm.ptr<i8>, %arg2: !llvm.ptr<struct<(i64, ptr<i8>)>>) attributes {llvm.emit_c_interface, tf_entry} {
    %0 = llvm.load %arg2 : !llvm.ptr<struct<(i64, ptr<i8>)>>
    %1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
    %2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>
    %3 = llvm.call @Rsqrt_CPU_DT_HALF_DT_HALF(%arg1, %1, %2) : (!llvm.ptr<i8>, i64, !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)>
    llvm.store %3, %arg0 : !llvm.ptr<struct<(i64, ptr<i8>)>>
    llvm.return
  }
}
2022-06-15 13:24:24 +02:00
Benjamin Kramer 0886ea902b [mlir][Arith] Fix a use-after-free after rewriting ops to unsigned
Just short-circuit when a change was made; the erased value is invalid
after that. Found by ASan.

This pass looks like it could use rewrite patterns instead, which don't
have this issue, but let's fix the ASan build first.
2022-06-15 10:28:43 +02:00
Matthias Springer a36c801d12 [mlir][bufferize] Better implementation of AnalysisState::isTensorYielded
If `create-deallocs=0`, mark all bufferization.alloc_tensor ops as escaping. (Unless they already have an `escape` attribute.) In the absence of analysis information, check SSA use-def chains to see if the value may be yielded.

Differential Revision: https://reviews.llvm.org/D127302
2022-06-15 10:15:47 +02:00
Matthias Springer a3bca1181b [mlir][bufferize][NFC] Merge AlwaysCopyAnalysisState into AnalysisState
`AnalysisState` now has default implementations of all virtual functions.

Differential Revision: https://reviews.llvm.org/D127301
2022-06-15 10:08:52 +02:00
Matthias Springer cd80617a8a [mlir][bufferize][NFC] Make func BufferizableOpInterface impl compatible with One-Shot Bufferize
Bufferization of the func dialect must go through `OneShotModuleBufferize`. With this change, the analysis interface methods of the BufferizableOpInterface of func dialect ops can be used together with the normal `OneShotBufferize`. (In the absence of analysis information, they will return conservative results.)

Differential Revision: https://reviews.llvm.org/D127299
2022-06-15 10:05:15 +02:00
Matthias Springer ad2e635fae [mlir][linalg][bufferize] Remove always-aliasing-with-dest option
This flag was introduced for a use case in IREE, but it is no longer needed.

Differential Revision: https://reviews.llvm.org/D126965
2022-06-15 09:56:53 +02:00
Matthias Springer d361ecbd0d [mlir][SCF][bufferize] Implement `resolveConflicts` for SCF ops
scf::ForOp and scf::WhileOp must insert buffer copies not only for out-of-place bufferizations, but also to enforce additional invariants wrt. to buffer aliasing behavior. This is currently happening in the respective `bufferize` methods. With this change, the tensor copy insertion pass will also enforce these invariants by inserting copies. The `bufferize` methods can then be simplified and made independent of the `AnalysisState` data structure in a subsequent change.

Differential Revision: https://reviews.llvm.org/D126822
2022-06-15 09:07:31 +02:00
owenca 485c18c11b [mlir] Add missing newline at end of .clang-format file 2022-06-14 23:59:00 -07:00
Lei Zhang 06c6758a98 [mlir][spirv] Handle corner cases for math.powf conversion
Per GLSL Pow extended instruction spec: "Result is undefined if
x < 0. Result is undefined if x = 0 and y <= 0." So we need to
handle negative `x` values specifically.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127816
2022-06-14 23:02:44 -04:00
jacquesguan 701a282af4 [mlir][Vector] Fold consecutive bitcast.
This patch adds support for folding consecutive bitcasts into one bitcast.
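A minimal sketch of the intended fold (illustrative only; value names and element types are made up):

```mlir
// Two consecutive bitcasts over the same 128 bits ...
%0 = vector.bitcast %src : vector<4xf32> to vector<8xf16>
%1 = vector.bitcast %0 : vector<8xf16> to vector<16xi8>
// ... fold into a single bitcast:
%1 = vector.bitcast %src : vector<4xf32> to vector<16xi8>
```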

Differential Revision: https://reviews.llvm.org/D127723
2022-06-15 10:45:05 +08:00
lewuathe 95bdbb9747 [mlir][affine] Make loop tiling default options explicit
Make the default loop tiling options explicit via CLI options. We can also set a default value for the separate option which is declared implicitly.

Reviewed By: ayzhuang

Differential Revision: https://reviews.llvm.org/D127711
2022-06-15 11:28:21 +09:00
Phoebe Wang 6e02e27536 Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"
Disabled 2 MLIR tests because the runtime doesn't support `_Float16`; see
the issue here: https://github.com/llvm/llvm-project/issues/55992
2022-06-15 09:15:31 +08:00
Lei Zhang b4dff404f3 [mlir][spirv] Fix math.ctlz for full zero bit cases
If the integer has all zero bits, GLSL FindUMsb would return -1. So
theoretically (31 - FindUMsb) should still give us the correct result
(31 - (-1) = 32, the expected ctlz value for a zero 32-bit integer).
However, Adreno GPUs have issues with this:
https://buildkite.com/iree/iree-test-android/builds/6482#01815f05-3926-466f-822a-1e20299e5461
This looks like a driver bug. So handle the corner case explicitly
to work around it.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D127747
2022-06-14 19:39:27 -04:00
Benjamin Kramer d3b1796842 [mlir] Try to work around ambiguity in older clang versions
mlir/lib/Dialect/Arithmetic/IR/InferIntRangeInterfaceImpls.cpp:366:10: error: chosen constructor is explicit in copy-initialization
  return {leftVal, rightVal};
         ^~~~~~~~~~~~~~~~~~~
2022-06-14 23:57:57 +02:00
Mogball ead75d9434 (Reland)[mlir] Add a generic data-flow analysis framework
Removes one element of the pointer union to make it work on 32-bit
systems.

This patch introduces a generic data-flow analysis framework to MLIR. The framework implements a fixed-point iteration algorithm and a dependency graph between lattice states and analysis. Lattice states and points are fully extensible to support highly-customizable analyses.

Reviewed By: phisiart, rriddle

Differential Revision: https://reviews.llvm.org/D126751
2022-06-14 21:33:05 +00:00
Krzysztof Drewniak b0b0043209 [mlir][Arith] Pass to switch signed ops for equivalent unsigned ones
If all the arguments to and results of an operation are known to be
non-negative when interpreted as signed (which also implies that all
computations producing those values did not experience signed
overflow), we can replace that operation with an equivalent one that
operates on unsigned values.

Such a replacement, when it is possible, can provide useful hints to
backends, such as by allowing LLVM to replace remainder with bitwise
operations in more cases.
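For example (a hedged sketch; it assumes the range analysis has proven both values non-negative, and the value names are hypothetical):

```mlir
// Signed remainder whose operands are known to be non-negative ...
%r = arith.remsi %a, %b : i32
// ... can be replaced by the equivalent unsigned op:
%r = arith.remui %a, %b : i32
```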

Depends on D124022

Depends on D124023

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D124024
2022-06-14 21:18:29 +00:00
Frederik Gossen a6fa12ab3b Revert "[mlir] Add a generic data-flow analysis framework"
This reverts commit 9dea117283.
The PointerUnion assumes 3 available bits, which is not the case on 32-bit
machines.
2022-06-14 17:14:27 -04:00
Okwan Kwon 28331c6097 Revert "[mlir] add an option to print op stats in JSON"
There is a failure from the python pass manager.

This reverts commit 1a19abf38c.
2022-06-14 14:09:18 -07:00
Okwan Kwon 1a19abf38c [mlir] add an option to print op stats in JSON
Differential Revision: https://reviews.llvm.org/D127691
2022-06-14 13:06:25 -07:00
Groverkss 87b8b377cc [MLIR][Presburger] Fix spellings of attachment 2022-06-15 00:10:54 +05:30
Krzysztof Drewniak 75bfc6f295 [mlir][Arith] Implement InferIntRangeInterface for arithmetic ops
Depends on D124023

Reviewed By: Mogball, rriddle

Differential Revision: https://reviews.llvm.org/D124022
2022-06-14 18:30:34 +00:00
Groverkss 127780e5b7 [MLIR][Presburger] Add values to PresburgerSpace
This patch allows attaching user information, called "values", to each
identifier. The values are used to carry information along with variables and
are also used to determine whether two variables are identical.

This patch is part of a series of patches to allow attaching user information
with variables in Presburger library.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127347
2022-06-14 23:15:10 +05:30
Mogball 9dea117283 [mlir] Add a generic data-flow analysis framework
This patch introduces a generic data-flow analysis framework to MLIR. The framework implements a fixed-point iteration algorithm and a dependency graph between lattice states and analysis. Lattice states and points are fully extensible to support highly-customizable analyses.

Reviewed By: phisiart, rriddle

Differential Revision: https://reviews.llvm.org/D126751
2022-06-14 16:54:15 +00:00
Benjamin Kramer ba0222cdc6 [mlir][linalg] Add named ops for depthwise 3d convolution
Also complete the set by adding a variant of depthwise 1d convolution
with the multiplier != 1.

Differential Revision: https://reviews.llvm.org/D127687
2022-06-14 18:22:47 +02:00
Alex Zinenko 069ca6f7a3 [mlir] fix compiler error due to commit landing race 2022-06-14 18:04:00 +02:00
Alex Zinenko e3890b7fd6 [mlir] Introduce transform.alternatives op
Introduce a transform dialect op that allows one to attempt different
transformation sequences on the same piece of payload IR until one of them
succeeds. This op fundamentally expands the scope of possibilities in the
transform dialect that, until now, could only propagate transformation failure,
at least using in-tree operations. This requires a more detailed specification
of the execution model for the transform dialect that now indicates how failure
is handled and propagated.

Transformations described by transform operations now have tri-state results,
with some errors being fundamentally irrecoverable (e.g., generating malformed
IR) and some others being recoverable by containing ops. Existing transform ops
directly implementing the `apply` interface method are updated to produce this
directly. Transform ops with the `TransformEachTransformOpTrait` are currently
considered to produce only irrecoverable failures and will be updated
separately.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127724
2022-06-14 17:51:30 +02:00
Chuanqi Xu 735e6c40b5 [Coroutines] Convert coroutine.presplit to enum attr
This is required by @nikic in https://reviews.llvm.org/D127383 to
decrease the cost to check whether a function is a coroutine and this
fixes a FIXME too.

Reviewed By: rjmccall, ezhulenev

Differential Revision: https://reviews.llvm.org/D127471
2022-06-14 14:23:46 +08:00
Thomas Raoux 087aba4f0f [mlir][vector] Add pattern to distribute vector reduction to GPU shuffles
Add a pattern to do ad hoc lowering of vector.reduction to a sequence of
warp shuffles. This allows distributing reductions across a warp for GPU
targets. Also add an execution test for warp reduction.

co-authored with @springerm

Differential Revision: https://reviews.llvm.org/D127176
2022-06-14 05:49:16 +00:00
Thomas Raoux 76cf33dab2 [mlir][vector] Add patterns to propagate vector distribution
Add patterns to propagate vector distribution and remove dead
arguments. This handles propagation for several vector operations.

recommit after minor bug fix.

Differential Revision: https://reviews.llvm.org/D127167
2022-06-14 05:26:10 +00:00
Mogball baca1c1ac4 [mlir][ods] Make Attr/Type def accessors match the dialect
The generated attribute and type def accessors are changed to match the setting on the dialect. Most importantly, "prefixed" will now correctly convert snake case to camel case (e.g. `weight_zp` -> `getWeightZp`)

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D127688
2022-06-14 05:13:24 +00:00
Mogball f8ec4dfa94 [mlir][sparse_tensor] fix windows build 2022-06-14 04:37:27 +00:00
Jacques Pienaar 743791d6d5 [mlir] Include attributes in ML program dialect ops def
I considered adding a new dialect top-level file with all ops,
attributes & types included, but didn't see practical benefit to it.
2022-06-13 21:35:34 -07:00
Jacques Pienaar 3c68d58841 [mlir][doc] Move pass to passes list and remove redundant doc
The types are already included in the dialect doc (the attributes are not
properly included yet, so retaining that part).
2022-06-13 21:01:31 -07:00
jacquesguan 5179f885d1 [mlir][Arithmetic] Fold NegF in MulF and DivF.
This patch adds the following folds:

mulf(negf(x), negf(y)) -> mulf(x, y)
divf(negf(x), negf(y)) -> divf(x, y)
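A concrete sketch of the first fold in MLIR form (illustrative only; value names are hypothetical):

```mlir
%nx = arith.negf %x : f32
%ny = arith.negf %y : f32
%p  = arith.mulf %nx, %ny : f32
// folds to:
%p  = arith.mulf %x, %y : f32
```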

Differential Revision: https://reviews.llvm.org/D126044
2022-06-14 03:15:19 +00:00
jacquesguan 059ee5d937 [mlir][Vector] Support vectorize to vector.reduction or/and.
This patch adds support for vectorizing affine.for loops over ori/andi into vector.reduction or/and.

Differential Revision: https://reviews.llvm.org/D127090
2022-06-14 03:11:45 +00:00
Mogball b1b4808c3f [mlir] Fix CMake file 2022-06-13 22:36:14 +00:00
Mogball 537f220891 [mlir] Support getSuccessorInputs from parent op
Ops that implement `RegionBranchOpInterface` are allowed to indicate that they can branch back to themselves in `getSuccessorRegions`, but there is no API that allows them to specify the forwarded operands. This patch enables that by changing `getSuccessorEntryOperands` to accept `None`.

Fixes #54928

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127239
2022-06-13 22:21:34 +00:00
Mahesh Ravishankar cf6a7c1947 [mlir][TilingInterface] Add pattern to tile using TilingInterface and implement TilingInterface for Linalg ops.
This patch adds support for tiling operations that implement the
TilingInterface.
- It separates the loop constructs that are used to iterate over tile
  from the implementation of the tiling itself. For example, the use
  of destructive updates is more related to use of scf.for for
  iterating over tiles that are tensors.
- To test the transformation, TilingInterface is implemented for
  LinalgOps. The separation of the looping constructs used from the
  implementation of tile code generation greatly simplifies the
  latter.
- The implementation of TilingInterface for LinalgOp is kept as an
  external model for now till this approach can be fully fleshed out
  to replace the existing tiling + fusion approaches in Linalg.

Differential Revision: https://reviews.llvm.org/D127133
2022-06-13 20:37:44 +00:00
Benjamin Kramer de0aa687a2 [mlir][linalg] Add conv_2d_nhwc_fhwc to core_named_ops.py
So it doesn't disappear when running the generator.
2022-06-13 20:43:10 +02:00
Thomas Raoux 2d32dac8bb Revert "[mlir][vector] Add patterns to propagate vector distribution"
This reverts commit 1c84800c42.

This was causing asan crash.
2022-06-13 17:55:31 +00:00
Lei Zhang b5192cbe50 [mlir][spirv] Fix result type for arith.cmpi/cmpf conversion
We cannot directly use the original result type; instead we need
to deduce it from the converted operand type. This addresses
invalid ops generated from converting single element vectors.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127574
2022-06-13 13:15:23 -04:00
Lei Zhang 91de20c36d [mlir][spirv] Use UnrealizedConversionCast in ArithmeticToSPIRV
This avoids pulling in function conversion patterns, which is not
part of what we want to test in ArithmeticToSPIRV. It also allows
using ConvertArithmeticToSPIRVPass as a standalone step.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127573
2022-06-13 13:13:57 -04:00
Lei Zhang cc020a2236 [mlir][spirv] Convert math.ctlz to spv.GLSL.FindUMsb
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D127582
2022-06-13 13:02:37 -04:00
Thomas Raoux 1c84800c42 [mlir][vector] Add patterns to propagate vector distribution
Add patterns to propagate vector distribution and remove dead
arguments. This handles propagation for several vector operations.

Differential Revision: https://reviews.llvm.org/D127167
2022-06-13 16:38:50 +00:00
Mogball e16d13322b [mlir] (NFC) Clean up bazel and CMake target names
All dialect targets in bazel have been named *Dialect and all dialect
targets in CMake have been named MLIR*Dialect.
2022-06-13 16:24:15 +00:00
Lei Zhang a10c09d1e3 [mlir][spirv] Remove unused `traits` from `SPV_Attr`
This addresses the warning of unused template argument.
2022-06-13 12:20:57 -04:00
Lei Zhang a4360efb2c [mlir][spirv] Convert single element vector.splat/fma
Reviewed By: ThomasRaoux, hanchung

Differential Revision: https://reviews.llvm.org/D127572
2022-06-13 12:18:16 -04:00
Matthias Springer 6ab1ed43f5 [mlir][shape][bufferize] Fix typo in external model
Differential Revision: https://reviews.llvm.org/D127639
2022-06-13 16:38:56 +02:00
Frederik Gossen ff6ce9e8fc Add `createDynamicDimValues` to tensor dialect utils
The function creates dim ops for each dynamic dimension of the ranked tensor
argument and returns these as values.
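As a rough illustration (assumed shapes and value names, not taken from the patch), for a ranked tensor with two dynamic dimensions the utility would produce IR along these lines:

```mlir
// %t : tensor<4x?x?xf32> -- only dims 1 and 2 are dynamic, so only they get dim ops.
%c1 = arith.constant 1 : index
%d1 = tensor.dim %t, %c1 : tensor<4x?x?xf32>
%c2 = arith.constant 2 : index
%d2 = tensor.dim %t, %c2 : tensor<4x?x?xf32>
```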

Differential Revision: https://reviews.llvm.org/D127533
2022-06-13 08:26:28 -04:00
Adrian Kuegel 8ab925a2fd [mlir] Fix ClangTidyPerformance finding (NFC). 2022-06-13 10:49:25 +02:00
Ulrich Weigand 7095a1ff82 Fix endian conversion of sub-byte types
When convertEndianOfCharForBEmachine is called with elementBitWidth
smaller than CHAR_BIT, the default case is invoked, but this does
nothing at all and leaves the output array unchanged.

Fix DenseIntOrFPElementsAttr::convertEndianOfArrayRefForBEmachine
by not calling convertEndianOfCharForBEmachine in this case, and
instead simply copying the input to the output (for sub-byte types,
endian conversion is in fact a no-op).

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D125676
2022-06-12 16:08:23 +02:00
Chia-hung Duan ba3a9f51ff [mlir:MultiOpDriver] Add operands to worklist should be checked
An operand's defining op may not be valid for adding to the worklist under
strict mode.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127180
2022-06-11 15:56:23 +00:00
Arjun P 4e53df0f0b [MLIR][Presburger] PresburgerSet::containsPoint: support disjuncts with locals
Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D127466
2022-06-10 20:18:49 -04:00
Lei Zhang e90b56e411 [mlir][vulkan] Add missing '<>' in test IRs to fix test 2022-06-10 18:09:12 -04:00
Lei Zhang 11cf2d5f62 [mlir][spirv] Unify aliases of different bitwidth scalar types
This commit extends the UnifyAliasedResourcePass to handle scalar
types of different bitwidths. It requires choosing the smaller-bitwidth
resource as the canonical resource so that we can avoid subcomponent
loads/stores; instead we load/store multiple smaller-bitwidth values.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D127266
2022-06-10 18:01:31 -04:00
agostini01 70f1021431 [mlir][py-bindings] Fix include issue introduced by D127352
Building with -DMLIR_ENABLE_BINDINGS_PYTHON=ON resulted in a failed
build due to the changes implemented by
https://reviews.llvm.org/D127352

This updates the include line.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D127523
2022-06-10 20:44:22 +00:00
John Ericson 0bb317b7bf Revert "[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore"
This reverts commit d5daa5c5b0.
2022-06-10 19:26:12 +00:00
Krzysztof Drewniak a2cdb9791b [mlir][AMDGPU] Set ABI version constant when linking device libs
Currently, linking the device libraries requires setting a constant
that indicates the code object ABI version the compilation is
targeting.

This fixes the MLIR linking process by setting this constant to 400,
which is the value corresponding to the current code object ABI
default, version 4.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D126913
2022-06-10 18:40:52 +00:00
Thomas Raoux ed0288f7c4 [mlir][vector] Add patterns for vector distribution
Add a pattern to hoist scalar code outside of the warp distribute region,
as such code cannot be distributed and we want to execute it on all
the lanes.
Add patterns to distribute transfer_write ops. Those operations can be
distributed in different ways, and this is controlled by the user.

Differential Revision: https://reviews.llvm.org/D127152
2022-06-10 17:46:51 +00:00
John Ericson d5daa5c5b0 [cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.

Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.

@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.

Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.

To avoid the problems I face, and restore symmetry, I deleted the exported
variable and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.

For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for-LLVM vs for-downstream function/macro
names, but it would be good to confirm I did that correctly.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117977
2022-06-10 14:35:18 +00:00
Alex Zinenko 6403e1b12a [mlir] add a dynamic use-after-parent-freed transform dialect check
In the transform dialect, a transform IR handle may be pointing to a payload IR
operation that is an ancestor of another payload IR operation pointed to by
another handle. If such a "parent" handle is consumed by a transformation, this
indicates that the associated operation is likely rewritten, which in turn
means that the "child" handle may now be associated with a dangling pointer or
a pointer to a different operation than originally. Add a handle invalidation
mechanism to guard against such situations by reporting errors at runtime.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127480
2022-06-10 13:05:34 +02:00
Matthias Springer 79f115911e [mlir][bufferize] Avoid tensor copies when the data is not read
There are various shortcuts in `BufferizationState::getBuffer` that avoid a buffer copy when we just need an allocation (and no initialization). This change adds those shortcuts to the TensorCopyInsertion pass, so that `getBuffer` can be simplified in a subsequent change.

Differential Revision: https://reviews.llvm.org/D126821
2022-06-10 10:26:07 +02:00
lewuathe 999f767f9f [mlir] fix typo in AttributesAndTypes doc 2022-06-10 11:42:57 +09:00
Okwan Kwon 5ccb9df3ba [mlir] Support passing ostream as argument for the create function.
The constructor already supports passing an ostream as argument,
so let's make the create function support it too.

Differential Revision: https://reviews.llvm.org/D127449
2022-06-09 16:34:22 -07:00
Mogball a31ff0af9b [mlir][spirv] Replace StructAttrs with AttrDefs
Depends on D127370

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D127373
2022-06-09 23:16:44 +00:00
Mogball f1182bd6d5 [mlir][tosa] Replace StructAttrs with AttrDefs
Depends on D127352

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127370
2022-06-09 23:01:51 +00:00
Mogball d7ef488bb6 [mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352
2022-06-09 22:49:03 +00:00
Mogball 7bdd3722f2 [mlir][gpu] Change ParalellLoopMappingAttr to AttrDef
It was a StructAttr. Also adds a FieldParser for AffineMap.

Depends on D127348

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127350
2022-06-09 22:23:21 +00:00
Mogball ba79bb4973 [mlir][nvvm] Change MMAShapeAttr to AttrDef
MMAShapeAttr was a StructAttr

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127348
2022-06-09 22:14:45 +00:00
Matthias Springer 87b46776c4 [mlir][bufferize] Improve resolveConflicts for ExtractSliceOp
It is sometimes better to make a copy of the OpResult instead of making a copy of the OpOperand. E.g., when bufferizing tensor.extract_slice.
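A small illustration of why copying the result can be cheaper (hypothetical shapes, not from the patch):

```mlir
// Copying the 4-element result of the slice is cheaper than copying the
// entire 128-element source tensor referenced by the OpOperand.
%slice = tensor.extract_slice %t[0] [4] [1] : tensor<128xf32> to tensor<4xf32>
```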

This implementation will eventually make parts of extract_slice's `bufferize` implementation obsolete (and simplify it). It will only need to handle in-place OpOperands.

Differential Revision: https://reviews.llvm.org/D126819
2022-06-09 22:19:37 +02:00
Matthias Springer 87c770bbd0 [mlir][bufferization][NFC] Put inplacability conflict resolution in op interface
The TensorCopyInsertion pass resolves out-of-place bufferization decisions by inserting explicit `bufferization.alloc_tensor` ops. This change moves that functionality into a new BufferizableOpInterface method, so that it can be overridden by op implementations. Some op bufferizations must insert additional `alloc_tensor` ops to make sure that certain aliasing invariants are not violated (e.g., scf::ForOp). This will be addressed in a subsequent change.

Differential Revision: https://reviews.llvm.org/D126817
2022-06-09 22:06:44 +02:00
Christopher Bate 9f1221521f Recommit "[mlir][vector] Allow unroll of contraction in arbitrary order"
Fixed issue with vector.contract default unroll permutation.

Adds support for vector unroll transformations to unroll in different
orders. For example, the vector.contract can be unrolled into a
smaller set of contractions. There is a choice of how to unroll the
decomposition based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback which is
given by the caller of the transform. For now, only the
vector.contract, vector.transfer_read/transfer_write operations
support the callback.

Differential Revision: https://reviews.llvm.org/D127004
2022-06-09 14:01:19 -06:00
Matthias Springer 3b2004e16b [mlir][bufferization] Add TensorCopyInsertion pass
This pass runs the One-Shot Analysis to find out which tensor OpOperands must bufferize out-of-place. It then rewrites those tensor OpOperands to explicit allocations with a copy in the form of `bufferization.alloc_tensor`. The resulting IR can then be bufferized without having to care about read-after-write conflicts.

This change makes it possible to connect One-Shot Analysis to other bufferizations such as the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126573
2022-06-09 21:55:52 +02:00
Matthias Springer 56d68e8d7a [mlir][bufferization] Add optional `copy` operand to AllocTensorOp
If `copy` is specified, the newly allocated buffer is initialized with the given contents. Also add an optional `escape` attribute to indicate whether the buffer of the tensor may be returned from the parent block (aka. "escape") after bufferization.
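A hedged sketch of what such an op might look like in printed IR (the exact syntax and attribute spelling are assumptions, not taken from this commit):

```mlir
// Allocate a buffer initialized with the contents of %t; `escape = true`
// marks that the buffer may be returned from the parent block.
%0 = bufferization.alloc_tensor() copy(%t) {escape = true} : tensor<?xf32>
```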

This change is in preparation of connecting One-Shot Bufferize to the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126570
2022-06-09 21:37:15 +02:00
Aart Bik 06aa6ec87d [mlir][sparse] refactor handling of merger leafs and ops
Using "default:" in the switch statemements that handle all our
merger ops has become a bit cumbersome since it is easy to overlook
parts of the code that need to handle ops specifically. By enforcing
full switch statements without "default:", we get a compiler warning
when cases are overlooked.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D127263
2022-06-09 11:35:54 -07:00
Matthias Springer 88539c5bdb [mlir][bufferize][NFC] Decouple dropping of equivalent return values from bufferization
This simplifies the bufferization itself and is in preparation of connecting with the sparse compiler.

Differential Revision: https://reviews.llvm.org/D126814
2022-06-09 18:39:05 +02:00
Matthias Springer bf58256967 [mlir][bufferize] Fix bug in module equivalence analysis
CallOp results are not equivalent to an OpOperand if the OpOperand bufferizes out-of-place.

Differential Revision: https://reviews.llvm.org/D126813
2022-06-09 18:32:17 +02:00
Matthias Springer 92680126bf [mlir][bufferize] Decouple promoteBufferResultsToOutParams from One-Shot Bufferize
Users should explicitly run `-buffer-results-to-out-params` instead.

The purpose of this change is to remove `finalizeBuffers`, which made it difficult to extend the bufferization to custom buffer types.

Differential Revision: https://reviews.llvm.org/D126253
2022-06-09 18:25:26 +02:00
Matthias Springer 058af65e78 [mlir][bufferization] Decouple buffer-deallocation from One-Shot Bufferize
The buffer deallocation pass must now be run explicitly when `allow-return-alloc` is set.

This results in a few extra buffer copies in unoptimized test cases. The proper way to avoid such copies is to relax the OpOperand/OpResult aliasing contract on ops such as scf.for. Some of these copies can also be avoided by improving the buffer deallocation pass.

Differential Revision: https://reviews.llvm.org/D126252
2022-06-09 18:20:39 +02:00
Yuanqiang Liu 56e19717f5 [MLIR][Shape] Generalize `shape.concat` to extent tensors
The operation `shape.concat` was previously supported for the shape type only.
We now enable it for extent tensors.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D127321
2022-06-09 08:23:26 -07:00
Matthias Springer 461dafd2a3 [mlir][bufferization] Add OneShotBufferize transform op
This commit allows for One-Shot Bufferize to be used through the transform dialect. No op handle is currently returned for the bufferized IR.

Differential Revision: https://reviews.llvm.org/D125098
2022-06-09 15:15:09 +02:00
Alex Zinenko b6c58ec486 [mlir] add producer fusion to structured transform ops
This relies on the existing TileAndFuse pattern for tensor-based structured
ops. It complements pure tiling, from which some utilities are generalized.

Depends On D127300

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127319
2022-06-09 14:30:45 +02:00
Benjamin Kramer abcf1496ad Fix complex.conj integration test
- It doesn't actually print the fractional part if the result is a whole number
- One of the expectations was just wrong
2022-06-09 13:11:10 +02:00
Alex Zinenko 5f0d4f208e [mlir] Introduce Transform ops for loops
Introduce transform ops for "for" loops, in particular for peeling, software
pipelining and unrolling, along with a couple of "IR navigation" ops. These ops
are intended to be generalized to different kinds of loops when possible and
therefore use the "loop" prefix. They currently live in the SCF dialect as
there is no clear place to put transform ops that may span across several
dialects; this decision is postponed until the ops actually need to handle
non-SCF loops.

Additionally refactor some common utilities for transform ops into trait or
interface methods, and change the loop pipelining to be a returning pattern.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127300
2022-06-09 11:41:55 +02:00
lewuathe fff27d181c [mlir][complex] Correctness check for complex.conj
Add correctness check for complex.conj operation

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D127377
2022-06-09 11:11:56 +02:00
lorenzo chelini 2a3c07f897 [MLIR][Math] Re-order conversions alphabetically (NFC)
Minor follow-up after: D127286 (https://reviews.llvm.org/D127286/new/)

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127382
2022-06-09 09:12:13 +02:00
Mogball 971e13d69e [mlir][ods] Mark StructAttr as deprecated 2022-06-09 03:23:31 +00:00
bixia1 ff96d434d0 [mlir][sparse] Fix a problem introduced by the PR for reading complex numbers.
The problem is in function isValid.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D127349
2022-06-08 15:01:50 -07:00
bixia1 5b1c5fc53a [mlir][sparse] Add complex number reading from files.
Support complex numbers for Matrix Market Exchange Formats. Add a test case.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127138
2022-06-08 13:33:35 -07:00
Arjun P 4bf9cbc408 [MLIR][Presburger] subtract: improve redundant constraint detection
When constraints in the two operands make each other redundant, prefer constraints of the second because this affects the number of sets in the output at each level; reducing these can help prevent exponential blowup.

This is accomplished by adding extra overloads to Simplex::detectRedundant that only scan a subrange of the constraints for redundancy.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D127237
2022-06-08 14:44:31 -04:00
wren romano 0371ddf9ad [mlir] Refactoring the tablegen Tensor types
Reduces repetition in tablegen files for defining various tensor types.  In particular the goal is to reduce the repetition when defining new tensor types (e.g., D126994).

Reviewed By: aartbik, rriddle

Differential Revision: https://reviews.llvm.org/D127039
2022-06-08 11:33:48 -07:00
bixia1 6c6eddb617 [mlir] Lower complex.power and complex.rsqrt to standard dialect.
Add conversion tests and correctness tests.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D127255
2022-06-08 10:53:53 -07:00
dime10 4f55ed5a1e Add Python bindings for the OpaqueType
Implement the C-API and Python bindings for the builtin opaque type, which was previously missing.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127303
2022-06-08 19:51:00 +02:00
Mogball ee70039ae2 [mlir] Fix handling of some region branch terminator successors
When `RegionBranchOpInterface::getSuccessorRegions` is called for anything other than the parent op, it expects the operands of the terminator of the source region to be passed, not the operands of the parent op. This was not always respected.

This fixes a bug in integer range inference and ForwardDataFlowSolver and changes `scf.while` to allow narrowing of successors using constant inputs.

Fixes #55873

Reviewed By: mehdi_amini, krzysz00

Differential Revision: https://reviews.llvm.org/D127261
2022-06-08 17:17:03 +00:00
bixia1 ea8ed5cbcf [mlir][sparse] Add F16 and BF16.
This is the first PR to add `F16` and `BF16` support to the sparse codegen. There are still problems in supporting these two data types, such as `BF16` not quite working yet.

Add test cases.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D127010
2022-06-08 09:51:05 -07:00
Lei Zhang 2dfefe0283 [mlir][spirv] NFC: fix typo in UnifyAliasedResourcePass pass
Reviewed By: ThomasRaoux, hanchung

Differential Revision: https://reviews.llvm.org/D127265
2022-06-08 08:18:12 -07:00
lorenzo chelini a0fc94ab61 [MLIR][Math] Add round operation
Introduce RoundOp in the math dialect. The operation rounds the operand to the
nearest integer value in floating-point format. RoundOp lowers to LLVM
intrinsics 'llvm.intr.round' or as a function call to libm (round or roundf).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D127286
2022-06-08 13:07:39 +02:00
Matthias Springer 032be23309 [mlir][bufferize] Improve buffer writability analysis
Find writability conflicts (writes to buffers that are not allowed to be written to) by checking SSA use-def chains. This is better than the current writability analysis, which is too conservative and finds false positives.

Differential Revision: https://reviews.llvm.org/D127256
2022-06-08 10:11:52 +02:00
Benjamin Kramer 6eb0f8e285 [mlir][MemRef] Fix a crash when expanding a scalar shape
In this case the reassociation is empty, yielding no strides for the
result type.

Differential Revision: https://reviews.llvm.org/D127232
2022-06-08 09:37:40 +02:00
lorenzo chelini d48479791f [MLIR][SCF] Improve doc (NFC) 2022-06-08 08:46:36 +02:00
Nathan Lanza f46ce03734 [MLIR] Add an install target for mlir-libraries
This is required for the distribution system for installing the
mlir-libraries component. This is copied from clang's equivalent
feature.

Differential Revision: https://reviews.llvm.org/D126837
2022-06-07 22:57:07 -04:00
Aart Bik 7482cd6869 [mlir][sparse] updated our sparse dialect doc with some recent changes
The `init` and `tensor` ops are renamed (and one moved to another dialect).

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D127169
2022-06-07 14:27:57 -07:00
Christopher Bate 53fe155b3f Revert "[mlir][vector] Allow unroll of contraction in arbitrary order"
Reverts commit 1469ebf838 (original commit)
Reverts commit a392a39f75 (build fix for above commit)

The commit broke tests in out-of-tree projects, indicating that some logical
error was made in the previous change that is not covered by current tests.
2022-06-07 14:54:01 -06:00
Groverkss 445e2b2aa0 [MLIR][Presburger] Fix subtract processing extra inequalities
This patch fixes a bug in PresburgerRelation::subtract that made it process the
inequality at index 0 multiple times. This was caused by allocating memory
instead of reserving memory in llvm::SmallVector.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D127228
2022-06-07 22:51:03 +05:30
Kiran Chandramohan dd32bf9a77 [Flang,MLIR,OpenMP] Fix a few tests that were not converting to LLVM
A few OpenMP tests were retaining the FIR operands even after running
the LLVM conversion pass. To fix these tests, the legality checks for
OpenMP conversion are made stricter to include operands and results.
The Flush, Single and Sections operations are added to conversions or
legality checks. The RegionLessOpConversion is appropriately renamed
to clarify that it works only for operations with Variable operands.
The operands of the flush operation are changed to match those of
Variable Operands.

Fix for an OpenMP issue mentioned in
https://github.com/llvm/llvm-project/issues/55210.

Reviewed By: shraiysh, peixin, awarzynski

Differential Revision: https://reviews.llvm.org/D127092
2022-06-07 09:55:53 +00:00
Alex Zinenko 3326eddcd1 [mlir] fix documentation format in SCF
Four leading spaces are interpreted as a code block in markdown. Unless
used consistently in the ODS op description, they cannot be stripped away by
the tablegen backend, which results in malformed markdown being
generated.
2022-06-07 11:51:24 +02:00
Alexander Batashev 8324561e33 [mlir][spirv] Correctly deduce PhysicalStorageBuffer64 addressing model
According to the SPIR-V specification[1], the PhysicalStorageBuffer storage
class can only be used when the addressing model is PhysicalStorageBuffer64.

[1]: https://www.khronos.org/registry/SPIR-V/specs/unified1/SPIRV.html#_addressing_model

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D127067
2022-06-07 12:14:38 +03:00
lorenzo chelini 9b3712e0bf [MLIR][LLVMIR] Add round intrinsic
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126879
2022-06-07 10:27:55 +02:00
lewuathe 62a34f6a6f [mlir][complex] Add complex.conj op
Add the complex.conj op to calculate the complex conjugate, which is widely used for mathematical operations on the complex plane.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D127181
2022-06-07 09:38:35 +02:00
lorenzo chelini 2cbf0b3dc6 [MLIR][SCF] Fix top-level comment (NFC) 2022-06-07 08:52:11 +02:00
River Riddle a3a4f0335f [vscode-mlir] Bump to version 0.9
Since version 0.8 we've added:

* Switched PDLL and TableGen to use incremental doc updates
* Added support to PDLL for inlay hints
2022-06-06 20:20:19 -07:00
River Riddle 5919eab55c [mlir:PDLL] Add support for inlay hints
These allow for displaying additional inline information,
such as the types of variables, names of operands/results,
constraint/rewrite arguments, etc. This requires a bump in the
vscode extension to a newer version, as inlay hints are a new LSP feature.

Differential Revision: https://reviews.llvm.org/D126033
2022-06-06 20:20:19 -07:00
River Riddle 6187178e83 [mlir:LSP] Switch document sync mode to Incremental
This is much more efficient than the full mode, as it only requires sending
small chunks of files. It also works around a weird command ordering
issue (full document updates are being sent after other commands like
code completion) in newer versions of vscode.

Differential Revision: https://reviews.llvm.org/D126032
2022-06-06 20:20:19 -07:00
River Riddle 1b501cbcbb [mlir] Add documentation for TableGen LSP features and setup
This commit beefs up the documentation for MLIR language servers by
adding proper documentations/examples/etc for the provided TableGen
language server capabilities. Given that this documentation is also used
for the vscode extension, this commit also updates the user facing vscode
extension documentation.

Note that the images referenced in the new documentation are hosted on
the website, and will be committed to mlir-www shortly after this commit
lands.
2022-06-06 18:29:31 -07:00
Georgios Pinitas 3bcaf2eb93 [mlir][tosa] Moves constant folding operations out of the Canonicalizer
Transpose operations on constant data were getting folded during the
canonicalization process. This has a compile-time cost proportional to
the constant size. Moving this to a separate pass enables optionality
and flexibility in how such scenarios can be handled.

Reviewed By: rsuderman, jpienaar, stellaraccident

Differential Revision: https://reviews.llvm.org/D124685
2022-06-06 22:10:22 +00:00
Christopher Bate a392a39f75 [mlir][vector] fix typo in vector unroll transform 2022-06-06 16:09:13 -06:00
Christopher Bate 1469ebf838 [mlir][vector] Allow unroll of contraction in arbitrary order
Adds support for vector unroll transformations to unroll in different
orders. For example, the `vector.contract` can be unrolled into a
smaller set of contractions.  There is a choice of how to unroll the
decomposition  based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback
provided by the caller of the transform. For now, only the
`vector.contract`, `vector.transfer_read/transfer_write` operations
support the callback.

Differential Revision: https://reviews.llvm.org/D127004
2022-06-06 14:31:04 -06:00
River Riddle 731dfca8a0 [mlir] Add documentation for PDLL LSP features and setup
This commit beefs up the documentation for MLIR language servers by
adding proper documentations/examples/etc for the provided PDLL
language server capabilities. Given that this documentation is also used
for the vscode extension, this commit also updates the user facing vscode
extension documentation.

Note that the images referenced in the new documentation are hosted on
the website, and will be committed to mlir-www shortly after this commit
lands.

Differential Revision: https://reviews.llvm.org/D125650
2022-06-06 13:13:54 -07:00
Christopher Bate cca662b849 [mlir][linalg] add conv_2d_nhwc_fhwc named op
This operation should be supported as a named op because
when the operands are viewed as having canonical layouts
with decreasing strides, then the "reduction" dimensions
of the filter (h, w, and c) are contiguous relative to each
output channel. When lowered to a matrix multiplication,
this layout is the simplest to deal with, and thus future
transforms/vectorizations of `conv2d` may find using this
named op convenient.

Differential Revision: https://reviews.llvm.org/D126995
2022-06-06 13:18:08 -06:00
Christopher Bate 99069ab212 [mlir][linalg] fix crash when promoting rank-reducing memref.subviews
This change adds support for promoting `linalg` operation operands that
are produced by rank-reducing `memref.subview` ops.

Differential Revision: https://reviews.llvm.org/D127086
2022-06-06 12:06:36 -06:00
jacquesguan ad44495ad3 [mlir][NFC] Replace some llvm::find with llvm::is_contained.
This patch replaces some llvm::find with llvm::is_contained, which is clearer.

Differential Revision: https://reviews.llvm.org/D127077
2022-06-06 03:01:14 +00:00
Stella Laurenzo 768a251587 [mlir] Tunnel LLVM_USE_LINKER through to the standalone example build.
When building in debug mode, the link time of the standalone sample is excessive, taking upwards of a minute if using BFD. This at least allows lld to be used if the main invocation was configured that way. On my machine, this gets a standalone test that requires a relink to run in ~13s for Debug mode. This is still a lot, but better than it was. I think we may want to do something about this test: it adds a lot of latency to a normal compile/test cycle and requires a bunch of arg fiddling to exclude.

I think we may end up wanting a `check-mlir-heavy` target that can be used just prior to submit, and then make `check-mlir` just run unit/lite tests. These are just thoughts for the future (none of that is done here).

Reviewed By: bondhugula, mehdi_amini

Differential Revision: https://reviews.llvm.org/D126585
2022-06-05 12:31:41 -07:00
Fangrui Song d86a206f06 Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 00:31:44 -07:00
Christian Sigg 400fef081a Recommit: "[MLIR][NVVM] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration."
This change rolls bcfc0a9051 forward (i.e., reverting 369ce54bb3) with a fixed CMakeLists.txt.
2022-06-05 09:11:43 +02:00
Jacques Pienaar 29794ab0fa [mlir] Use context provided rather than getContext
Avoids "pass state was never initialized" assertion failure.
2022-06-04 12:18:51 -07:00
Mehdi Amini 369ce54bb3 Revert "[MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration."
This reverts commit bcfc0a9051.

The build is broken with shared library enabled.
2022-06-04 08:35:45 +00:00
Christian Sigg bcfc0a9051 [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.
This is correct for all values, i.e. the same as promoting the division to fp32 in the NVPTX backend. But it is faster (~10% on average, sometimes more) because:

- it performs less Newton iterations
- it avoids the slow path for e.g. denormals
- it allows reuse of the reciprocal for multiple divisions by the same divisor

Test program:
```
#include <stdio.h>
#include "cuda_fp16.h"

// This is a variant of CUDA's own __hdiv which is faster than hdiv_promote below
// and doesn't suffer from the perf cliff of div.rn.fp32 with 'special' values.
__device__ half hdiv_newton(half a, half b) {
  float fa = __half2float(a);
  float fb = __half2float(b);

  float rcp;
  asm("{rcp.approx.ftz.f32 %0, %1;\n}" : "=f"(rcp) : "f"(fb));

  float result = fa * rcp;
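  // Apply one Newton-Raphson refinement only when the result is a normal,
  // finite float (exponent bits neither all zero nor all ones); this skips
  // zero/denormal results and leaves inf/NaN untouched.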
  auto exponent = reinterpret_cast<const unsigned&>(result) & 0x7f800000;
  if (exponent != 0 && exponent != 0x7f800000) {
    float err = __fmaf_rn(-fb, result, fa);
    result = __fmaf_rn(rcp, err, result);
  }

  return __float2half(result);
}

// Surprisingly, this is faster than CUDA's own __hdiv.
__device__ half hdiv_promote(half a, half b) {
  return __float2half(__half2float(a) / __half2float(b));
}

// This is an approximation that is accurate up to 1 ulp.
__device__ half hdiv_approx(half a, half b) {
  float fa = __half2float(a);
  float fb = __half2float(b);

  float result;
  asm("{div.approx.ftz.f32 %0, %1, %2;\n}" : "=f"(result) : "f"(fa), "f"(fb));
  return __float2half(result);
}

__global__ void CheckCorrectness() {
  int i = threadIdx.x + blockIdx.x * blockDim.x;
  half x = reinterpret_cast<const half&>(i);
  for (int j = 0; j < 65536; ++j) {
    half y = reinterpret_cast<const half&>(j);
    half d1 = hdiv_newton(x, y);
    half d2 = hdiv_promote(x, y);
    auto s1 = reinterpret_cast<const short&>(d1);
    auto s2 = reinterpret_cast<const short&>(d2);
    if (s1 != s2) {
      printf("%f (%u) / %f (%u), got %f (%hu), expected: %f (%hu)\n",
             __half2float(x), i, __half2float(y), j, __half2float(d1), s1,
             __half2float(d2), s2);
      //__trap();
    }
  }
}

__device__ half dst;

__global__ void ProfileBuiltin(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = x / x;
  }
  dst = x;
}

__global__ void ProfilePromote(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_promote(x, x);
  }
  dst = x;
}

__global__ void ProfileNewton(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_newton(x, x);
  }
  dst = x;
}

__global__ void ProfileApprox(half x) {
  #pragma unroll 1
  for (int i = 0; i < 10000000; ++i) {
    x = hdiv_approx(x, x);
  }
  dst = x;
}

int main() {
  CheckCorrectness<<<256, 256>>>();
  half one = __float2half(1.0f);
  ProfileBuiltin<<<1, 1>>>(one);  // 1.001s
  ProfilePromote<<<1, 1>>>(one);  // 0.560s
  ProfileNewton<<<1, 1>>>(one);   // 0.508s
  ProfileApprox<<<1, 1>>>(one);   // 0.304s
  auto status = cudaDeviceSynchronize();
  printf("%s\n", cudaGetErrorString(status));
}
```

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D126158
2022-06-04 08:03:29 +02:00
wren romano 3cf03f1c56 [mlir][sparse] Adding IsSparseTensorPred and updating ops to use it
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126994
2022-06-03 17:15:31 -07:00
Christopher Bate 9f819f4c62 [mlir][linalg] fix crash in vectorization of elementwise operations
The current vectorization logic implicitly expects "elementwise"
linalg ops to have projected permutations for indexing maps, but
the precondition logic misses this check. This can result in a
crash when executing the generic vectorization transform on an op
with a non-projected permutation input indexing map. This change
fixes the logic and adds a test (which crashes without this fix).

Differential Revision: https://reviews.llvm.org/D127000
2022-06-03 16:38:13 -06:00
Diego Caballero 9a79b1b04c [mlir] Add peeling xform to Codegen Strategy
This patch adds the knobs to use peeling in the codegen strategy
infrastructure.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D126842
2022-06-03 21:31:43 +00:00
Krzysztof Drewniak 95aff23e29 Re-land "[mlir] Add integer range inference analysis""
This reverts commit 4e5ce2056e.

This relands commit 1350c9887d.

Reinstates the range analysis with the build issue fixed.

Differential Revision: https://reviews.llvm.org/D126926
2022-06-03 17:13:48 +00:00
lewuathe d4141c93a8 [mlir][complex] Check the correctness of tanh in complex dialect
Correctness check for tanh operation in complex dialect.

Ref: https://reviews.llvm.org/D126858

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D126946
2022-06-03 14:04:48 +02:00
Adrian Kuegel 39f28397e2 [mlir] Fix ClangTidy warning (NFC).
virtual is redundant since the function is already declared 'override'.
2022-06-03 12:46:14 +02:00
Shraiysh Vaishay f5d29c15bf [mlir][OpenMP] Add memory_order clause tests
This patch adds tests for memory_order clause for atomic update and
capture operations. This patch also adds a check for making sure that
the operations inside an omp.atomic.capture region do not specify the
memory_order clause.

Reviewed By: kiranchandramohan, peixin

Differential Revision: https://reviews.llvm.org/D126195
2022-06-03 13:41:22 +05:30
Nicolas Vasilache 72de7588cc [mlir][SCF] Add bufferization hook for scf.foreach_thread and terminator.
`scf.foreach_thread` results alias with the underlying `scf.foreach_thread.parallel_insert_slice` destination operands
and they bufferize to equivalent buffers in the absence of other conflicts.
`scf.foreach_thread.parallel_insert_slice` conflict detection is similar to `tensor.insert_slice` conflict detection.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D126769
2022-06-03 07:14:05 +00:00
Alexander Batashev b34fb277df [mlir][cf] Implement missing SwitchOp::build function
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D126594
2022-06-03 09:08:04 +03:00
Thomas Raoux 271a48e029 [mlir][VectorToGPU] Fix bug generating incorrect ldmatrix ops
ldmatrix transpose can only be used with types that are 16 bits wide.

Differential Revision: https://reviews.llvm.org/D126846
2022-06-03 04:30:22 +00:00
Thomas Raoux 205c08b54d [mlir][scf] Add option to loop pipelining to not peel the epilogue
Add an option to predicate the epilogue within the kernel instead of
peeling the epilogue. This is a useful option to prevent generating
large amounts of code for deep pipelines. This currently requires a user
lambda to implement operation predication.

Differential Revision: https://reviews.llvm.org/D126753
2022-06-03 04:20:20 +00:00
River Riddle ee1cf1f645 [mlir][NFC] Simplify the various `parseSourceFile<T>` overloads
These effectively all share the same implementation, i.e. forward
to the non-templated overload and then construct the container op.
2022-06-02 19:18:55 -07:00
Aart Bik f8b692dd31 [mlir][python][f16] add ctype python binding support for f16
Similar to complex128/complex64, float16 has no direct support
in the ctypes implementation. This fixes the issue by using a
custom F16 type to change the view in and out of MLIR code.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D126928
2022-06-02 17:21:24 -07:00
River Riddle bb81b3b274 [vscode-mlir] Bump to version 0.8
Since version 0.7 we've added:

* Initial language support for TableGen
* Tweaked syntax highlighting for PDLL
* Added a new command to view intermediate PDLL output
2022-06-02 16:35:09 -07:00
River Riddle bf352e0b2e [mlir:PDLL] Add better support for providing Constraint/Pattern/Rewrite documentation
This commit enables providing long-form documentation more seamlessly to the LSP
by revamping decl documentation. For ODS imported constructs, we now also import
descriptions and attach them to decls when possible. For PDLL constructs, the LSP will
now try to provide documentation by parsing the comments directly above the decl's
location within the source file. This commit also adds a new parser flag
`enableDocumentation` that gates the import and attachment of ODS documentation,
which is unnecessary in the normal build process (i.e. it should only be used/consumed
by tools).

Differential Revision: https://reviews.llvm.org/D124881
2022-06-02 16:31:07 -07:00
Arjun P 8bc2cff95a [MLIR][Presburger] Simplex: remove redundant member vars nRow, nCol
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126790
2022-06-03 00:30:48 +01:00
Chia-hung Duan 633ad1d864 [mlir:MultiOpDriver] Quick fix the assertion position
The assertion should come after the null check.
2022-06-02 23:25:35 +00:00
Mehdi Amini 4e5ce2056e Revert "[mlir] Add integer range inference analysis"
This reverts commit 1350c9887d.

Shared library build is broken with undefined references.
2022-06-02 21:24:06 +00:00
Krzysztof Drewniak 1350c9887d [mlir] Add integer range inference analysis
This commit defines a dataflow analysis for integer ranges, which
uses a newly-added InferIntRangeInterface to compute the lower and
upper bounds on the results of an operation from the bounds on the
arguments. The range inference is a flow-insensitive dataflow analysis
that can be used to simplify code, such as by statically identifying
bounds checks that cannot fail in order to eliminate them.

The InferIntRangeInterface has one method, inferResultRanges(), which
takes a vector of inferred ranges for each argument to an op
implementing the interface and a callback allowing the implementation
to define the ranges for each result. These ranges are stored as
ConstantIntRanges, which hold the lower and upper bounds for a
value. Bounds are tracked separately for the signed and unsigned
interpretations of a value, which ensures that the impact of
arithmetic overflows is correctly tracked during the analysis.
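
For intuition only, here is a simplified C++ sketch of propagating unsigned bounds through an addition (this is not MLIR's ConstantIntRanges API, which uses APInt and also tracks signed bounds):

```
#include <cstdint>
#include <limits>

struct URange {
  uint64_t umin, umax; // inclusive unsigned bounds
};

// Bounds of (a + b) for 64-bit unsigned values: when the sum of the upper
// bounds may wrap around, conservatively fall back to the full range.
URange inferAddRange(URange a, URange b) {
  if (a.umax > std::numeric_limits<uint64_t>::max() - b.umax)
    return {0, std::numeric_limits<uint64_t>::max()};
  return {a.umin + b.umin, a.umax + b.umax};
}
```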

The commit also adds a -test-int-range-inference pass to test the
analysis until it is integrated into SCCP or otherwise exposed.

Finally, this commit fixes some bugs relating to the handling of
region iteration arguments and terminators in the data flow analysis
framework.

Depends on D124020

Depends on D124021

Reviewed By: rriddle, Mogball

Differential Revision: https://reviews.llvm.org/D124023
2022-06-02 20:24:11 +00:00
Aart Bik bf7dbc2a30 [mlir][sparse][bufferization] fix doc on new init operation
The example was still using the now-removed sparse_tensor.init_tensor.
Also, I made the input operands of the matrix multiplication sparse too
(since it looks a bit strange to multiply two dense matrices into a sparse one).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D126897
2022-06-02 12:04:36 -07:00
Chia-hung Duan 2aeffc6d8d [mlir:MultiOpDriver] Don't add ops which are not in the allowed list
In strict mode, only newly inserted operations are allowed to be added to the
worklist. Before this change, it would add the users of a replaced op
without checking whether those users are allowed to be pushed onto the
worklist.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D126899
2022-06-02 18:27:37 +00:00
Ashay Rane 5fee1799f4 [mlir] translate memref.reshape with static shapes but dynamic dims
Prior to this patch, the lowering of memref.reshape operations to the
LLVM dialect failed if the shape argument had a static shape with
dynamic dimensions.  This patch adds the necessary support so that when
the shape argument has dynamic values, the lowering probes the dimension
at runtime to set the size in the `MemRefDescriptor` type.  This patch
also computes the stride for dynamic dimensions by deriving it from the
sizes of the inner dimensions.
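
The stride derivation mentioned above follows the usual row-major rule for identity-layout memrefs; here is a small standalone C++ sketch of that computation (an illustration, not the lowering code itself):

```
#include <cstdint>
#include <vector>

// For an identity (row-major) layout, stride[i] is the product of the sizes of
// all inner dimensions i+1 .. rank-1; the innermost stride is 1.
// e.g. sizes = {4, 5, 6} -> strides = {30, 6, 1}
std::vector<int64_t> computeStrides(const std::vector<int64_t> &sizes) {
  std::vector<int64_t> strides(sizes.size());
  int64_t running = 1;
  for (int64_t i = static_cast<int64_t>(sizes.size()) - 1; i >= 0; --i) {
    strides[i] = running;
    running *= sizes[i];
  }
  return strides;
}
```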

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126604
2022-06-02 10:00:58 -07:00
Alex Zinenko ce2e198bc2 [mlir] add decompose and generalize to structured transform ops
These ops complement the tiling/padding transformations by transforming
higher-level named structured operations such as depthwise convolutions into
lower-level and/or generic equivalents that are better handled by some
downstream transformations.

Differential Revision: https://reviews.llvm.org/D126698
2022-06-02 15:25:18 +02:00
Nicolas Vasilache 311967701a [mlir][SCF] Add scf.foreach_thread.parallel_insert_slice canonicalization.
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126761
2022-06-02 11:53:25 +00:00
lewuathe 9f0869a61d [mlir][complex] Lower complex.sin/cos to libm
Lower the sin/cos operations in the complex dialect to libm as a baseline. This follows up on https://reviews.llvm.org/D125550.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D126755
2022-06-02 10:39:00 +02:00
lewuathe 4b13b061ae [mlir][complex] Sanity check for tan operation in complex dialect
Add a sanity check for the newly added tan operation in the complex dialect. It follows up on https://reviews.llvm.org/D126685.

Differential Revision: https://reviews.llvm.org/D126858
2022-06-02 10:33:40 +02:00
Nikita Popov 41d5033eb1 [IR] Enable opaque pointers by default
This enables opaque pointers by default in LLVM. The effect of this
is twofold:

* If IR that contains *neither* explicit ptr nor %T* types is passed
  to tools, we will now use opaque pointer mode, unless
  -opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
  It is possible to opt-out by calling setOpaquePointers(false) on
  LLVMContext.

A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.
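
A minimal sketch of the library-user opt-out mentioned above (API as of this change; it may change or be removed in later LLVM versions):

```
#include "llvm/IR/LLVMContext.h"

int main() {
  llvm::LLVMContext ctx;
  // Keep using typed pointers in this context instead of the new default.
  ctx.setOpaquePointers(false);
  return 0;
}
```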

Differential Revision: https://reviews.llvm.org/D126689
2022-06-02 09:40:56 +02:00
jacquesguan 19e285477e [mlir][Arithmetic] Add constant folder for RemF.
This patch adds the constant folder for RemF.

Differential Revision: https://reviews.llvm.org/D126045
2022-06-02 06:24:37 +00:00
jacquesguan ce820375ef [mlir] Support converting the token type from LLVM IR.
This patch adds support for converting the token type from LLVM IR.

Differential Revision: https://reviews.llvm.org/D126756
2022-06-02 03:32:51 +00:00
Matthias Springer 6232a8f3d6 [mlir][sparse][NFC] Switch InitOp to bufferization::AllocTensorOp
Now that we have an AllocTensorOp (previously InitTensorOp) in the bufferization dialect, the InitOp in the sparse dialect is no longer needed.

Differential Revision: https://reviews.llvm.org/D126180
2022-06-02 00:03:52 +02:00
wren romano b364c76683 [mlir][sparse] Using non-empty function name suffix for OverheadType::kIndex
The trick of using an empty token in the `FOREVERY_O` x-macro relies on preprocessor behavior which is only standard since C99 6.10.3/4 and C++11 N3290 16.3/4 (whereas it was undefined behavior up through C++03 16.3/10).  Since the `ExecutionEngine/SparseTensorUtils.cpp` file is required to be compile-able under C++98 compatibility mode (unlike the C++11 used elsewhere in MLIR), we shouldn't rely on that behavior.

Also, using a non-empty suffix helps improve uniformity of the API, since all other primary/overhead suffixes are also non-empty.  I'm using the suffix `0` since that's the value used by the `SparseTensorEncoding` attribute for indicating the index overhead-type.
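
A small self-contained C++ illustration of the x-macro pattern with non-empty suffixes (made-up names, not the actual FOREVERY_O macro):

```
#include <cstdint>
#include <cstdio>

// Each entry supplies a non-empty suffix and a type; expanding the list with a
// definition macro generates one function per entry.
#define FOREVERY_SUFFIX(DO)                                                    \
  DO(64, uint64_t)                                                             \
  DO(32, uint32_t)                                                             \
  DO(0, uint64_t) /* non-empty "0" suffix for the index overhead type */

#define DEFINE_PRINT(SUFFIX, TYPE)                                             \
  void printValue##SUFFIX(TYPE v) {                                            \
    std::printf("%llu\n", (unsigned long long)v);                              \
  }

FOREVERY_SUFFIX(DEFINE_PRINT)
#undef DEFINE_PRINT

int main() {
  printValue64(42);
  printValue0(7);
}
```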

Depends On D126720

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126724
2022-06-01 14:18:42 -07:00
Rob Suderman f3bdb56d61 [mlir][math] Add math.ctlz expansion to control flow + arith operations
Ctlz is an intrinsic in LLVM but does not have equivalent operations in SPIR-V.
Including a decomposition gives an alternative path for these platforms.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D126261
2022-06-01 11:45:04 -07:00
Stella Laurenzo 3bb7999339 [mlir] Add global_load and global_store ops to ml_program.
* Adds simple, non-atomic, non-volatile, non-synchronized direct load/store ops.

Differential Revision: https://reviews.llvm.org/D126230
2022-06-01 11:32:15 -07:00
Alexander Belyaev f711785e61 [mlir] Add conversion and tests for complex.[sqrt|atan2] to Arith.
Differential Revision: https://reviews.llvm.org/D126799
2022-06-01 20:21:51 +02:00
bixia1 548f0841cd [mlir][sparse] Enable the test for operator expm1.
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126732
2022-06-01 11:18:17 -07:00
Aart Bik d668218946 [mlir][python][ctypes] fix ctype python binding complication for complex
There is no direct ctypes support for MLIR's complex type (and thus np.complex128
and np.complex64) yet, causing the mlir python binding methods for
memrefs to crash. This revision fixes this by passing complex arrays
as tuples of floats, correcting at the boundaries for the proper view.

NOTE: some of these changes (4 -> 2) were forced by the new "linting"

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D126422
2022-06-01 10:15:24 -07:00
Arjun P 8f99cdd27c [MLIR][Presburger] Simplex: remove redundant zeroing out of row
This fillRow(..., 0) is redundant because when the size of the tableau is
consistent, the resize always creates a new row, which is zero-initialized.

Also added asserts throughout to ensure the dimensions of the tableau remain
consistent.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D126709
2022-06-01 16:59:37 +01:00
Arjun P ec145ba2a3 [MLIR][Presburger] Matrix: inline trivial accessors
This resolves a comment from https://reviews.llvm.org/D126708
that was previously missed.
2022-06-01 16:56:46 +01:00
Arjun P d5e31cf38a [MLIR][Presburger] Move Matrix accessors inline
This gives a 1.5x speedup on the Presburger unittests.

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D126708
2022-06-01 16:51:42 +01:00
PeixinQiao fe2cc16035 [NFC][MLIR] Fix -Wtype-limits warning
Fix the warning: comparison of unsigned expression in ‘>= 0’ is always
true.

Reviewed By: kiranchandramohan, shraiysh

Differential Revision: https://reviews.llvm.org/D126784
2022-06-01 23:42:07 +08:00
Nicolas Vasilache 59b273a166 [mlir][SCF] Add parallel abstraction on tensors.
This revision adds `scf.foreach_thread` and other supporting abstractions
that allow connecting parallel abstractions and tensors.

Discussion is available [here](https://discourse.llvm.org/t/rfc-parallel-abstraction-for-tensors-and-buffers/62607).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126555
2022-06-01 09:16:01 +00:00
lewuathe ffb8eecdd6 [mlir][complex] Lowering complex.tanh to standard
Lowering complex.tanh to standard dialects including math, arith.

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D126521
2022-06-01 11:13:54 +02:00
Nicolas Vasilache beab8e871e Revert "[mlir][SCF] Add parallel abstraction on tensors."
This reverts commit 9b7193f852.

This is an older branch that was committed by mistake and does not include the addressed review comments; an updated version will come next.
2022-06-01 09:04:20 +00:00
Nicolas Vasilache 9b7193f852 [mlir][SCF] Add parallel abstraction on tensors.
This revision adds `scf.foreach_thread` and other supporting abstractions
that allow connecting parallel abstractions and tensors.

Discussion is available [here](https://discourse.llvm.org/t/rfc-parallel-abstraction-for-tensors-and-buffers/62607).
2022-06-01 09:02:16 +00:00
Benjamin Kramer 7d431e9ec5 [mlir][complex] Remove unused variables. NFC. 2022-06-01 09:33:02 +02:00
lewuathe 6d75c89783 [mlir][complex] Add tan op for complex dialect
Add the tangent operation to the complex dialect. This is a follow-up change to https://reviews.llvm.org/D126521

Differential Revision: https://reviews.llvm.org/D126685
2022-06-01 09:20:42 +02:00
wren romano 98e142cd4f [mlir][sparse] Using x-macros in the function-suffix functions
By defining the `{primary,overhead}TypeFunctionSuffix` functions via the same x-macros used to generate the runtime library's functions themselves, this helps avoid bugs from typos or things getting out of sync.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D126720
2022-05-31 17:36:43 -07:00
wren romano c63d4fac4f [mlir][sparse] Improving the FATAL macro
The previous macro definition using `{...}` would fail to compile when the callsite uses a semicolon followed by an else-statement (i.e., `if (...) FATAL(...); else ...;`).  Replacing the simple braces with `do{...}while(0)` (n.b., semicolon not included in the macro definition) enables callsites to use the semicolon plus else-statement syntax without problems.  The new definition now requires the semicolon at all callsites, but since it was already being called that way nothing changes.

For more explanation, see <https://gcc.gnu.org/onlinedocs/cpp/Swallowing-the-Semicolon.html>
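
A generic C++ sketch of the do { ... } while (0) idiom described above (the real FATAL body differs; this only illustrates the semicolon behavior):

```
#include <cstdio>
#include <cstdlib>

// With plain braces, `if (cond) FATAL("x"); else ...;` would not compile: the
// semicolon after the block ends the if-statement before the `else`. Wrapping
// the body in do { ... } while (0) makes the macro a single statement that
// requires the trailing semicolon.
#define FATAL(msg)                                                             \
  do {                                                                         \
    std::fprintf(stderr, "fatal: %s\n", msg);                                  \
    std::exit(1);                                                              \
  } while (0)

int main(int argc, char **argv) {
  if (argc < 2)
    FATAL("missing argument"); // semicolon + else works as expected
  else
    std::puts(argv[1]);
}
```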

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126514
2022-05-31 14:31:38 -07:00
wren romano a4c53f8cd6 [mlir][sparse] Factoring out SparseTensorFile class for readSparseTensorShape
The primary goal of this change is to define readSparseTensorShape. The SparseTensorFile class is merely introduced as a way to reduce code duplication along the way.

Depends On D126106

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126233
2022-05-31 13:24:28 -07:00
Nathaniel McVicar 8fb1bef60f [windows] Remove unused pybind exception params
Resolve MSVC warning C4104 for unreferenced variable

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D126683
2022-05-31 12:36:57 -07:00
Arjun P 18a06d4f3a [MLIR][Presburger] Simplex::computeOptimum: slightly simplify code (NFC) 2022-05-31 19:10:15 +01:00
lorenzo chelini 850dbff708 [MLIR][Math] Improve docs (NFC)
Remove boilerplate examples and add text at the dialect level to describe
what kind of operands the operations accept (i.e., scalar, tensor or vector).
A shorter sentence describing the input operands is left for each operation,
as this redundancy is convenient when browsing the documentation on the
website.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126648
2022-05-31 18:16:59 +02:00
Mehdi Amini 5d93d2a9eb Apply clang-tidy fixes for llvm-else-after-return in OpPythonBindingGen.cpp (NFC) 2022-05-31 11:54:19 +00:00
Mehdi Amini d8c46eb612 Apply clang-tidy fixes for readability-identifier-naming in SparseTensorUtils.cpp (NFC) 2022-05-31 11:54:19 +00:00
jacquesguan 42c17073fc [mlir] Support import llvm intrinsics.
This patch adds support for converting LLVM intrinsics to the corresponding ops. It still leaves some intrinsics to be handled specially.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126639
2022-05-31 11:08:23 +00:00
River Riddle 1c2edb026e [mlir:PDLL] Rework the C++ generation of native Constraint/Rewrite arguments and results
The current translation uses the old "ugly"/"raw" form which used PDLValue for the arguments
and results. This commit updates the C++ generation to use the recently added sugar that
allows for directly using the desired types for the arguments and result of PDL functions.
In addition, this commit also properly imports the C++ class for ODS operations, constraints,
and interfaces. This allows for a much more convenient C++ API than previously granted
with the raw/low-level types.

Differential Revision: https://reviews.llvm.org/D124817
2022-05-30 17:35:34 -07:00
River Riddle 0429472efe [mlir:PDLL] Fix signature help for operation operands
We were previously only completing on the first operand because
the completion check was outside of the parse loop.

Differential Revision: https://reviews.llvm.org/D124784
2022-05-30 17:35:34 -07:00
River Riddle 01652d889c [mlir:PDLL-LSP] Add a custom LSP command for viewing the output of PDLL
This commit adds a new PDLL specific LSP command, pdll.viewOutput, that
allows for viewing the intermediate outputs of a given PDLL file. The available
intermediate forms currently mirror those in mlir-pdll, namely: AST, MLIR, CPP.
This is extremely useful for a developer of PDLL, as it simplifies various testing,
and is also quite useful for users as they can easily view what is actually being
generated for their PDLL files.

This new command is added to the vscode client, and is available in the right
client context menu of PDLL files, or via the vscode command palette.

Differential Revision: https://reviews.llvm.org/D124783
2022-05-30 17:35:34 -07:00
River Riddle 91b8d96fd1 [mlir:PDLL] Add proper support for operation result type inference
This allows for the results of operations to be inferred in certain contexts,
and matches the support in PDL for result type inference. The main two
initial circumstances are when used as a replacement of another operation,
or when the operation being created implements InferTypeOpInterface.

Differential Revision: https://reviews.llvm.org/D124782
2022-05-30 17:35:33 -07:00
Mehdi Amini 940e290860 Apply clang-tidy fixes for performance-unnecessary-value-param in OneShotModuleBufferize.cpp (NFC) 2022-05-30 18:44:28 +00:00
Mehdi Amini 118d9ebd52 Apply clang-tidy fixes for llvm-else-after-return in OpenMPToLLVM.cpp (NFC) 2022-05-30 18:44:28 +00:00
Alex Zinenko 3f71765a71 [mlir] provide Python bindings for the Transform dialect
Python bindings for extensions of the Transform dialect are defined in separate
Python source files that can be imported on-demand, i.e., that are not imported
with the "main" transform dialect. This requires a minor addition to the
ODS-based bindings generator. This approach is consistent with the current
model for downstream projects that are expected to bundle MLIR Python bindings:
such projects can include their custom extensions into the bundle similarly to
how they include their dialects.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D126208
2022-05-30 17:37:52 +02:00
Alex Zinenko cc6c159203 [mlir] add VectorizeOp to structured transform ops
Vectorization is a key transformation to achieve high performance on most
architectures. In the transform dialect, vectorization is implemented as a
parameterizable transform op. It currently applies to a scope of payload IR
delimited by some isolated-from-above op, mainly because several enabling
transformations (such as affine simplification) are needed to perform
vectorization and these transformations would apply to ops other than the "main"
computational payload op. A separate "navigation" transform op that obtains the
isolated-from-above ancestor of an op is introduced in the core transform
dialect. Even though it is currently only useful for vectorization,
isolated-from-above ops are a common anchor for transformations (usually
implemented as passes) that is likely to be reused in the future.

Depends On D126374

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D126542
2022-05-30 17:37:50 +02:00
Simon Moll 18c1ee04de Re-land "[VP] vp intrinsics are not speculatable" with test fix
Update the llvmir-intrinsics.mlir test to account for the modified
attribute sets.

This reverts commit 2e2a8a2d90.
2022-05-30 14:41:15 +02:00
Mehdi Amini eacfd04744 Apply clang-tidy fixes for llvm-else-after-return in OpPythonBindingGen.cpp (NFC) 2022-05-30 12:25:58 +00:00
Mehdi Amini 0f68c959d2 Apply clang-tidy fixes for modernize-use-override in SparseTensorUtils.cpp (NFC) 2022-05-30 12:25:55 +00:00
Alex Zinenko 5cde5a5739 [mlir] add interchange, pad and scalarize to structured transform dialect
Add ops to the structured transform extension of the transform dialect that
perform interchange, padding and scalarization on structured ops. Along with
tiling that is already defined, this provides a minimal set of transformations
necessary to build vectorizable code for a single structured op.

Define two helper traits: one that implements TransformOpInterface by applying
a function to each payload op independently and another that provides a simple
"functional-style" producer/consumer list of memory effects for the transform
ops.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D126374
2022-05-30 11:42:40 +02:00
Alexander Belyaev 402b837302 Revert "[mlir] Lower complex.sqrt and complex.atan2 to Arithmetic dialect."
This reverts commit f5fa633b09.

Integration test sparse_complex_ops.mlir breaks because of it.
2022-05-30 10:48:58 +02:00
Christian Sigg 544d6507ba [MLIR][NVVM] NFC: add labels to test functions.
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126631
2022-05-30 10:39:44 +02:00
Alexander Belyaev f5fa633b09 [mlir] Lower complex.sqrt and complex.atan2 to Arithmetic dialect.
I don't see a point in the lit tests here since sqrt, mul and other ops
expand as well. I just added "smoke" tests to verify that the conversion works
and does not create any illegal ops.

I will create a patch that adds a simple integration test to
mlir/test/Integration/Dialect/ComplexOps/ that will compare the values.

Differential Revision: https://reviews.llvm.org/D126539
2022-05-30 09:44:36 +02:00
Christian Sigg bcf3d52486 [MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass.
Reviewed By: bondhugula, herhut

Differential Revision: https://reviews.llvm.org/D126199
2022-05-30 09:20:48 +02:00
Javed Absar c7283c263c [mlir][NFC] Trivial : Fix typo
Reviewed By: JohnTitor

Differential Revision: https://reviews.llvm.org/D126601
2022-05-29 11:00:08 +01:00
Groverkss dac27da7b9 [MLIR][Presburger] Add applyDomain/Range to IntegerRelation
This patch adds support for applying a relation on domain/range of a relation.

Reviewed By: arjunp, ftynse

Differential Revision: https://reviews.llvm.org/D126339
2022-05-29 02:06:11 +05:30
Matthias Springer 2f0a634c5e [mlir][bufferization] Add extra filter mechanism to bufferizeOp
Differential Revision: https://reviews.llvm.org/D126569
2022-05-28 04:49:23 +02:00
Matthias Springer f470f8cbce [mlir][bufferize][NFC] Split analysis+bufferization of ModuleBufferization
Analysis and bufferization can now be run separately.

Differential Revision: https://reviews.llvm.org/D126572
2022-05-28 04:43:50 +02:00
Matthias Springer 3490aadf56 [mlir][bufferization][NFC] Remove post-analysis step infrastructure
Now that analysis and bufferization are better separated, post-analysis steps are no longer needed. Users can directly interleave analysis and bufferization as needed.

Differential Revision: https://reviews.llvm.org/D126571
2022-05-28 04:37:13 +02:00
Matthias Springer 1534177f8f [mlir][bufferization][NFC] Move OpFilter out of BufferizationOptions
Differential Revision: https://reviews.llvm.org/D126568
2022-05-28 01:47:39 +02:00
wren romano 0fbe3f3f48 [mlir][sparse] Fixes C++98 warning
The semicolons were introduced in D126105 in order to correct clang-format, but I forgot this file must be compiled as C++98 rather than C++11.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126561
2022-05-27 13:42:17 -07:00
Aart Bik a5d7e2a8ac [OpenMP][mlir] fix broken build
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D126556
2022-05-27 10:06:01 -07:00
PeixinQiao 042ae89556 [OpenMP] Support operation conversion to LLVM for threadprivate directive
This supports the operation conversion for the threadprivate directive. The
support for memref type conversion is not implemented.

Reviewed By: kiranchandramohan, shraiysh

Differential Revision: https://reviews.llvm.org/D124610
2022-05-28 00:06:57 +08:00
Daniil Dudkin 52d79b04b2 [mlir][llvm] Fix compiler error on GCC 9
This patch fixes the following compiler error:

    error: declaration of ‘mlir::LLVM::cconv::CConv mlir::LLVM::detail::CConvAttrStorage::CConv’ changes meaning of ‘CConv’ [-fpermissive]

CConv as a member variable name was shadowing CConv as an enumeration,
hence the compiler error.

Reviewed By: ftynse, alexbatashev

Differential Revision: https://reviews.llvm.org/D126530
2022-05-27 15:33:43 +03:00
Groverkss f168a65943 [MLIR][Presburger] Add intersectDomain/Range to IntegerRelation
This patch adds support for intersecting a set with a relation.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D126328
2022-05-27 15:51:54 +05:30
River Riddle 8d021670c3 [mlir][Tablegen-LSP] Add support for a tracking definitions and references
This essentially builds an index for the parsed records and record values (fields).
This covers quite a few cases, but is limited by the currently lackluster location
tracking in tablegen. A followup will work on plumbing more locations through
tablegen, which should greatly improve what we can do here.

Differential Revision: https://reviews.llvm.org/D125443
2022-05-27 02:39:49 -07:00
River Riddle 682ca00e21 [mlir][Tablegen-LSP] Add support for include file link and hover
This allows for following links to include files. This support is effectively
identical to the logic in the PDLL language server, and code is shared as
much as possible.

Differential Revision: https://reviews.llvm.org/D125442
2022-05-27 02:39:49 -07:00
River Riddle dc9fb65c4f [mlir][Tablegen-LSP] Add support for a compilation database
This provides a format for externally specifying the include directories
for a source file. The format of the tablegen database is exactly the
same as that for PDLL, namely it includes the absolute source file name and
the set of include directories. The database format is shared to simplify
the infra, and also because the format itself is general enough to share. Even
if we desire to expand in the future to contain the actual compilation command,
nothing there is specific enough that we would need two different formats.

As with PDLL, support for generating the database is added to our mlir_tablegen
cmake command.

Differential Revision: https://reviews.llvm.org/D125441
2022-05-27 02:39:49 -07:00
River Riddle d6708b7419 [mlir-vscode] Add support for highlighting pdll and tablegen markdown code blocks
This essentially just piggybacks off of the existing mlir support.

Differential Revision: https://reviews.llvm.org/D125734
2022-05-27 00:34:42 -07:00
Alexander Batashev 0252357b3e [mlir][LLVM] Add support for Calling Convention in LLVMFuncOp
This patch adds support for the Calling Convention attribute in the LLVM
dialect, including enums, custom syntax and import from LLVM IR.
Additionally, fix the import of the dso_local attribute.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126161
2022-05-27 09:43:31 +03:00
Hanhan Wang 5aefdafccf [mlir][Linalg] Relax vectorization condition to allow transposed output.
Reviewed By: ThomasRaoux, dcaballe

Differential Revision: https://reviews.llvm.org/D126454
2022-05-26 19:20:36 -07:00
wren romano 05c17bc4bb [mlir][sparse] Moving some functions around
This is a followup to D126105 to move functions in SparseTensorUtils.cpp to match their locations in SparseTensorUtils.h

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D126106
2022-05-26 17:23:38 -07:00