Commit Graph

511 Commits

Author SHA1 Message Date
Alex Zinenko 2230bf99c7 [mlir] replace LLVMIntegerType with built-in integer type
The LLVM dialect type system has been closed until now, i.e., it did not support
types from other dialects inside containers. While this has had obvious
benefits of deriving from a common base class, it has led to some simple types
being almost identical with the built-in types, namely integer and floating
point types. This in turn has led to a lot of larger-scale complexity: simple
types must still be converted, numerous operations that correspond to LLVM IR
intrinsics are replicated to produce versions operating on either LLVM dialect
or built-in types leading to quasi-duplicate dialects, lowering to the LLVM
dialect is essentially required to be one-shot because of type conversion, etc.
In this light, it is reasonable to trade off some local complexity in the
internal implementation of LLVM dialect types for removing larger-scale system
complexity. Previous commits to the LLVM dialect type system have adapted the
API to support types from other dialects.

Replace LLVMIntegerType with the built-in IntegerType plus additional checks
that such types are signless (these are isolated in a utility function that
replaced `isa<LLVMType>` and in the parser). Temporarily keep the possibility
to parse `!llvm.i32` as a synonym for `i32`, but add a deprecation notice.
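
As a rough illustration (the op and values below are made up, not taken from the
patch), LLVM dialect operations now carry the built-in integer type directly:

```
// Before: the LLVM dialect used its own integer type.
%0 = llvm.add %a, %b : !llvm.i32
// After: the built-in signless integer type is used directly;
// `!llvm.i32` still parses as a deprecated synonym for `i32`.
%1 = llvm.add %a, %b : i32
```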

Reviewed By: mehdi_amini, silvas, antiagainst

Differential Revision: https://reviews.llvm.org/D94178
2021-01-07 19:48:31 +01:00
Sanjoy Das 6173d1277b Remove allow-unregistered-dialect from some tests that don't need it
Differential Revision: https://reviews.llvm.org/D93982
2021-01-06 09:40:50 -08:00
Eugene Zhulenev 61422c8b66 [mlir] Async: add support for lowering async value operands to LLVM
Depends On D93592

Add support for unwrapping async value operands in `async.execute`:

```
%token = async.execute(%async_value as %unwrapped : !async.value<!my.type>) {
  ...
  async.yield
}
```

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D93598
2020-12-25 02:25:20 -08:00
Eugene Zhulenev 621ad468d9 [mlir] Async: lowering async.value to LLVM
1. Add new methods to Async runtime API to support yielding async values
2. Add lowering from `async.yield` with value payload to the new runtime API calls

`async.value` lowering requires that the payload type is convertible to LLVM and supported by the `llvm.mlir.cast` (DialectCast) operation.
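
For illustration, a schematic `async.execute` producing an async value via `async.yield` (the types and body are made up):

```
// Returns a token plus an async value wrapping the f32 payload.
%token, %result = async.execute -> !async.value<f32> {
  %cst = constant 1.0 : f32
  async.yield %cst : f32
}
```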

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D93592
2020-12-25 02:23:48 -08:00
Lei Zhang a16fbff17d [mlir][spirv] Create a pass for testing SCFToSPIRV patterns
Previously all SCF to SPIR-V conversion patterns were only tested via
the -convert-gpu-to-spirv pass. That obscured the structure we
want. This commit fixes it by adding a dedicated test pass for the
SCFToSPIRV patterns.

Reviewed By: ThomasRaoux, hanchung

Differential Revision: https://reviews.llvm.org/D93488
2020-12-23 14:31:55 -05:00
Lei Zhang 42980a789d [mlir][spirv] Convert functions returning one value
Reviewed By: hanchung, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D93468
2020-12-23 13:27:31 -05:00
Christian Sigg df6cbd37f5 [mlir] Lower gpu.memcpy to GPU runtime calls.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D93204
2020-12-22 22:49:19 +01:00
Prateek Gupta 3e07b0b9d3 [MLIR] Fix lowering of affine operations with return values
This commit addresses the issue of lowering affine.for and
affine.parallel operations that have return values. Relevant test cases are
also added.
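
A schematic example of the kind of operation that now lowers correctly (an
affine.for with iter_args yielding a value; the IR below is made up for
illustration):

```
%sum = affine.for %i = 0 to 10 iter_args(%acc = %init) -> (f32) {
  %next = addf %acc, %c : f32
  affine.yield %next : f32
}
```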

Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D93090
2020-12-22 21:44:31 +05:30
Sean Silva 129d6e554e [mlir] Move `std.tensor_cast` -> `tensor.cast`.
This is almost entirely mechanical.
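
For reference, the renamed op in its new spelling (a minimal sketch):

```
// Was: %1 = tensor_cast %0 : tensor<4xf32> to tensor<?xf32>
%1 = tensor.cast %0 : tensor<4xf32> to tensor<?xf32>
```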

Differential Revision: https://reviews.llvm.org/D93357
2020-12-17 16:06:56 -08:00
Alex Zinenko 96076a2edb [mlir] Support index and memref types in llvm.mlir.cast
This operation is designed to support partial conversion, more specifically the
IR state in which some operations expect or produce built-in types and some
operations produce and expect LLVM dialect types. It is reasonable for it to
support casts between built-in types and any equivalent that could be produced
by the type conversion. (At the same time, we don't want the dialect to depend
on the type conversion, as that could lead to a dependency cycle.) Introduce
support for casting from index to any integer type and back, and from memref to
bare pointer or memref descriptor type and back.
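
A minimal sketch of the newly supported index casts, assuming a 64-bit index
width (the memref cases, which produce either a bare pointer or a full
descriptor value, are not shown):

```
%0 = llvm.mlir.cast %arg0 : index to !llvm.i64
%1 = llvm.mlir.cast %0 : !llvm.i64 to index
```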

Contrary to what the TODO in the code stated, there are no particular
precautions necessary to handle the bare pointer conversion for memrefs. This
conversion applies exclusively to statically-shaped memrefs, so we can always
recover the full descriptor contents from the type.

This patch simultaneously tightens the verification for other types to only
accept matching pairs of types, e.g., i64 and !llvm.i64, as opposed to the
previous implementation that only checked whether the types were generally
allowed but not whether they matched, e.g. i64 could be "cast" to !llvm.bfloat,
which is not the intended semantics.

Move the relevant test under test/Dialect/LLVMIR because it is not specific to
the conversion pass, but rather exercises an op in the dialect. If we decide
this op does not belong to the LLVM dialect, both the dialect and the op should
move together.

Reviewed By: silvas, ezhulenev

Differential Revision: https://reviews.llvm.org/D93405
2020-12-17 09:21:42 +01:00
Tres Popp c77ea40528 [mlir] Add std.pow lowering to LLVMIR
Differential Revision: https://reviews.llvm.org/D93311
2020-12-15 18:54:29 +01:00
Tres Popp 9adc64539f [mlir] Add std.powf to ROCDL lowering.
Differential Revision: https://reviews.llvm.org/D93313
2020-12-15 18:47:49 +01:00
Tres Popp f3e8f27ca1 [mlir] Fix GPUToNVVM test 2020-12-15 18:41:16 +01:00
Tres Popp e04785b131 [mlir] Add NVVM lowering for std.pow
Differential Revision: https://reviews.llvm.org/D93303
2020-12-15 18:28:23 +01:00
Javier Setoain aece4e2793 [mlir][ArmSVE][RFC] Add an ArmSVE dialect
This revision starts an Arm-specific ArmSVE dialect discussed in the discourse RFC thread:

https://llvm.discourse.group/t/rfc-vector-dialects-neon-and-sve/2284

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D92172
2020-12-14 21:35:01 +00:00
Frederik Gossen 75d9a46090 [MLIR] Add atan and atan2 lowerings to CUDA intrinsics
Differential Revision: https://reviews.llvm.org/D93124
2020-12-14 10:45:28 +01:00
Frederik Gossen 1c6bc2c0b5 [MLIR] Add lowerings for atan and atan2 to ROCDL intrinsics
Differential Revision: https://reviews.llvm.org/D93123
2020-12-14 10:43:19 +01:00
Sean Silva 444822d77a Revert "Revert "[mlir] Start splitting the `tensor` dialect out of `std`.""
This reverts commit 0d48d265db.

This reapplies the following commit, with a fix for CAPI/ir.c:

[mlir] Start splitting the `tensor` dialect out of `std`.

This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).
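
For reference, the op in its new home (a minimal sketch):

```
// Was: %e = extract_element %t[%i] : tensor<4xf32>
%e = tensor.extract %t[%i] : tensor<4xf32>
```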

Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.

This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.

Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2

Differential Revision: https://reviews.llvm.org/D92991
2020-12-11 14:30:50 -08:00
Sean Silva 0d48d265db Revert "[mlir] Start splitting the `tensor` dialect out of `std`."
This reverts commit cab8dda90f.

I mistakenly thought that CAPI/ir.c failure was unrelated to this
change. Need to debug it.
2020-12-11 14:15:41 -08:00
Sean Silva cab8dda90f [mlir] Start splitting the `tensor` dialect out of `std`.
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).

Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.

This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.

Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2

Differential Revision: https://reviews.llvm.org/D92991
2020-12-11 13:50:55 -08:00
Nicolas Vasilache 7310501f74 [mlir][ArmNeon][RFC] Add a Neon dialect
This revision starts an Arm-specific ArmNeon dialect discussed in the [discourse RFC thread](https://llvm.discourse.group/t/rfc-vector-dialects-neon-and-sve/2284).

Differential Revision: https://reviews.llvm.org/D92171
2020-12-11 13:49:40 +00:00
Adrian Kuegel ada4c7a351 Add rsqrt lowering from standard to ROCDL.
Add a lowering for rsqrt from standard dialect to ROCDL.

Differential Revision: https://reviews.llvm.org/D93011
2020-12-11 13:18:57 +01:00
Adrian Kuegel 09f717b929 Add sqrt lowering from standard to ROCDL
Add a lowering for sqrt from standard dialect to ROCDL.

Differential Revision: https://reviews.llvm.org/D92921
2020-12-10 09:47:37 +01:00
Frederik Gossen b4750f58d8 Add sqrt lowering from standard to NVVM
Differential Revision: https://reviews.llvm.org/D92850
2020-12-08 17:08:27 +01:00
Frederik Gossen bb7d43e7d5 Add rsqrt lowering from standard to NVVM
Differential Revision: https://reviews.llvm.org/D92838
2020-12-08 14:33:58 +01:00
Aart Bik c95acf052b [mlir][vector][avx512] move avx512 lowering pass into general vector lowering
A separate AVX512 lowering pass does not compose well with the regular
vector lowering pass. As such, it is at risk of code duplication and
lowering inconsistencies. This change removes the separate AVX512 lowering
pass and makes it an "option" in the regular vector lowering pass
(viz. vector dialect "augmented" with AVX512 dialect).

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D92614
2020-12-03 17:23:46 -08:00
Christian Sigg 5535696c38 [mlir] Add gpu.alloc, gpu.dealloc ops with LLVM lowering to runtime function calls.
The ops are very similar to the std variants, but support async GPU execution.

gpu.alloc does not currently support an alignment attribute, and the new ops do not have
canonicalizers/folders like their std siblings do.
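
A schematic use of the new ops; the async token chaining and exact assembly
below are assumed for illustration:

```
// Allocate device memory once %dep has completed, then free it.
%memref, %token = gpu.alloc async [%dep] (%size) : memref<?xf32>
%done = gpu.dealloc async [%token] %memref : memref<?xf32>
```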

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D91698
2020-11-27 09:40:59 +01:00
Alex Zinenko 119545f433 [mlir] Add conversion from SCF parallel loops to OpenMP
Introduce a conversion pass from SCF parallel loops to OpenMP dialect
constructs - parallel region and workshare loop. Loops with reductions are not
supported because the OpenMP dialect cannot model them yet.

The conversion currently targets only one level of parallelism, i.e. only
one top-level `omp.parallel` operation is produced even if there are nested
`scf.parallel` operations that could be mapped to `omp.wsloop`. Nested
parallelism support is left for future work.
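
Schematically (the OpenMP dialect assembly below is approximate and shown only
to illustrate the structure), the conversion turns

```
scf.parallel (%i) = (%lb) to (%ub) step (%step) {
  "compute"(%i) : (index) -> ()
}
```

into

```
omp.parallel {
  omp.wsloop (%i) : index = (%lb) to (%ub) step (%step) {
    "compute"(%i) : (index) -> ()
    omp.yield
  }
  omp.terminator
}
```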

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D91982
2020-11-24 21:12:56 +01:00
Alex Zinenko f7d033f4d8 [mlir] Support WsLoopOp in OpenMP to LLVM dialect conversion
It is a simple conversion that only requires changing the region argument
types; the implementation is generalized from the existing ParallelOp conversion.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D91989
2020-11-23 23:28:02 +01:00
Alex Zinenko 1ec60862d7 [mlir] Avoid cloning ops in SCF parallel conversion to CFG
The existing implementation of the conversion from the SCF parallel operation
to SCF "for" loops, in order to further convert those loops to a branch-based
CFG, has been cloning the loop and reduction body operations into the new loop
because ConversionPatternRewriter was missing support for moving blocks while
replacing their arguments. This functionality is now available, so use it to
implement the conversion and avoid cloning operations, which could double the
IR size during the conversion.

In addition, this fixes an issue with converting nested SCF "if" conditionals
present in "parallel" operations that would cause the conversion infrastructure
to stop because of the repeated application of the pattern converting "newly"
created "if"s (which were in fact just moved). Arguably, this should be fixed
at the infrastructure level and this fix is a workaround.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D91955
2020-11-23 14:01:22 +01:00
Eugene Zhulenev a86a9b5ef7 [mlir] Automatic reference counting for Async values + runtime support for ref counted objects
Depends On D89963

**Automatic reference counting algorithm outline:**

1. `ReturnLike` operations forward the reference counted values without
    modifying the reference count.
2. Use liveness analysis to find blocks in the CFG where the lifetime of
   reference counted values ends, and insert `drop_ref` operations after
   the last use of the value.
3. Insert `add_ref` before the `async.execute` operation capturing the
   value, and a paired `drop_ref` before the async body region terminator,
   to release the captured reference counted value when execution
   completes.
4. If the reference counted value is passed only to some of the block
   successors, insert `drop_ref` operations at the beginning of the blocks
   that do not have reference counted value uses.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D90716
2020-11-20 03:08:44 -08:00
Alex Zinenko 9bb5bff570 [mlir] Add an assertion on creating an Operation with null result types
Null types are commonly used as an error marker. Catch them in the constructor
of Operation if they are present in the result type list, as otherwise this
could lead to further surprising behavior when querying op result types.

Fix AsyncToLLVM and StandardToLLVM that were using null types when constructing
operations.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D91770
2020-11-19 22:28:38 +01:00
ergawy 2f3adc54b5 [MLIR][SPIRV] Rename `spv._module_end` to `spv.mlir.endmodule`
This commit does the renaming mentioned in the title in order to bring the
'spv' dialect closer to the MLIR naming conventions.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D91792
2020-11-19 13:25:13 -05:00
ergawy 9bd50abc4c [MLIR][SPIRV] Rename `spv._merge` to `spv.mlir.merge`
This commit does the renaming mentioned in the title in order to bring the
'spv' dialect closer to the MLIR naming conventions.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D91797
2020-11-19 10:04:35 -05:00
Christian Sigg 8b97e17d16 [mlir] Simplify code generated by ConvertToLLVMPattern::getStridedElementPtr().
Make the interface match that of ConvertToLLVMPattern::getDataPtr() (to be removed in a separate change).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D91599
2020-11-18 11:52:09 +01:00
Christian Sigg bedaad4495 [mlir] Simplify std.alloc lowering to LLVM.
std.alloc only supports memrefs with identity layout, which means we can simplify the lowering to LLVM and compute strides only from (static and dynamic) sizes.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D91549
2020-11-17 18:55:34 +01:00
ergawy 9793edd5bf [MLIR][SPIRV] Rename `spv._address_of` to `spv.mlir.addressof`
This commit does the renaming mentioned in the title in order to bring the
`spv` dialect closer to the MLIR naming conventions.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D91609
2020-11-17 12:12:27 -05:00
Rahul Joshi b7382ed3fe [MLIR] Extend Symbol verification to reject public symbol declarations.
- Extend the Symbol interface with `isDeclaration` to identify operations that declare
  a symbol as opposed to defining it.
- Extend verification to disallow public declarations as per the discussion in
  https://llvm.discourse.group/t/rfc-symbol-definition-declaration-x-visibility-checks/2140
- Adopt the new interface for `FuncOp` and fix tests and code to not have/create public
  function declarations.

Differential Revision: https://reviews.llvm.org/D91456
2020-11-16 16:05:32 -08:00
Christian Sigg 04481f26fa [mlir] Require std.alloc() ops to have canonical layout during LLVM lowering.
The current code allows strided layouts, but the number of elements allocated is ambiguous. It could be either the number of elements in the shape (the current implementation), or the number of elements required to not index out-of-bounds with the given maps (which would require evaluating the layout map).

If we require the canonical layouts, the two will be the same.

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D91523
2020-11-16 17:29:36 +01:00
Hanhan Wang 47fd19f22e [mlir][StandardToSPIRV] Extend support for lowering cmpi to SPIRV.
The handling of vectors of booleans was missing. This patch adds the logic and a
test for it.
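
The case in question, schematically (operands made up):

```
// cmpi on vectors of booleans now also lowers to SPIR-V.
%r = cmpi "eq", %a, %b : vector<4xi1>
```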

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D91403
2020-11-16 06:51:05 -08:00
Eugene Zhulenev c30ab6c2a3 [mlir] Transform scf.parallel to scf.for + async.execute
Depends On D89958

1. Add `async.group`/`async.await_all` to group together multiple async tokens/values.
2. Rewrite the scf.parallel operation into multiple concurrent async.execute operations over non-overlapping subranges of the original loop.

Example:

```
   scf.parallel (%i, %j) = (%lbi, %lbj) to (%ubi, %ubj) step (%si, %sj) {
     "do_some_compute"(%i, %j): () -> ()
   }
```

Converted to:

```
   %c0 = constant 0 : index
   %c1 = constant 1 : index

   // Compute block sizes for each induction variable.
   %num_blocks_i = ... : index
   %num_blocks_j = ... : index
   %block_size_i = ... : index
   %block_size_j = ... : index

   // Create an async group to track async execute ops.
   %group = async.create_group

   scf.for %bi = %c0 to %num_blocks_i step %c1 {
     %block_start_i = ... : index
     %block_end_i   = ... : index

     scf.for %bj = %c0 to %num_blocks_j step %c1 {
       %block_start_j = ... : index
       %block_end_j   = ... : index

       // Execute the body of original parallel operation for the current
       // block.
       %token = async.execute {
         scf.for %i = %block_start_i to %block_end_i step %si {
           scf.for %j = %block_start_j to %block_end_j step %sj {
             "do_some_compute"(%i, %j): () -> ()
           }
         }
       }

       // Add produced async token to the group.
       async.add_to_group %token, %group
     }
   }

   // Await completion of all async.execute operations.
   async.await_all %group
```
In this example the outer loop launches the inner block-level loops as separate
async.execute operations, which will run concurrently.

At the end it waits for the completion of all async.execute operations.

Reviewed By: ftynse, mehdi_amini

Differential Revision: https://reviews.llvm.org/D89963
2020-11-13 04:02:56 -08:00
Stephan Herhut 5da2423bc0 [mlir][gpu] Only transform mapped parallel loops to GPU.
This exposes a hook to configure legality of operations such that only
`scf.parallel` operations that have mapping attributes are marked as
illegal. Consequently, the transformation can now also be applied to
mixed forms.

Differential Revision: https://reviews.llvm.org/D91340
2020-11-13 09:15:17 +01:00
George Mitenkov de3ad5bb09 [MLIR][SPIRVToLLVM] Enhanced conversion for execution mode
This patch introduces a new conversion pattern for `spv.ExecutionMode`.
`spv.ExecutionMode` may contain important information about the entry
point, which we want to preserve. For example, `LocalSize` provides
information about the work-group size that can be reused. Hence, the
pattern for entry-point ops changes to the following:
- `spv.EntryPoint` is still simply removed
- Info from `spv.ExecutionMode` is used to create a global struct variable,
  which looks like:

  ```
  struct {
    int32_t executionMode;
    int32_t values[];          // optional values
  };
  ```

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D89989
2020-11-10 18:33:54 +03:00
Artur Bialas 3035e676a3 [mlir][spirv] Add VectorInsertDynamicOp and vector.insertelement lowering
Add VectorInsertDynamicOp to the SPIR-V dialect and a conversion pattern from
vector.insertelement to it.

Differential Revision: https://reviews.llvm.org/D90927
2020-11-10 09:49:12 +01:00
River Riddle ebcc022507 [mlir][AsmPrinter] Refactor printing to only print aliases for attributes/types that will exist in the output.
This revision refactors the way that attributes/types are considered when generating aliases. Instead of considering all of the attributes/types of every operation, we perform a "fake" print step that prints the operations using a dummy printer to collect the attributes and types that would actually be printed during the real process. This removes a lot of attributes/types from consideration that generally won't end up in the final output, e.g. affine map attributes in an `affine.apply`/`affine.for`.

This resolves a long-standing TODO w.r.t. aliases and helps to produce a much cleaner textual output format. As a data point for the latter, as part of this change several tests were identified as testing for the presence of attribute aliases that weren't actually referenced by the custom form of any operation.

To ensure that this wouldn't cause a large degradation in compile time due to the second full print, I benchmarked this change on a very large module with a lot of operations (the file is ~673M/~4.7 million lines long). Before this change, this file takes ~6.9 seconds to print in the custom form, and ~7 seconds after this change. In the custom assembly case, this added an average of a little over ~100 milliseconds to the compile time. This increase was due to the way that argument attributes on functions are structured and how they get printed; i.e. with a better representation the negative impact here can be greatly decreased. When printing in the generic form, this revision had no observable impact on the compile time. This benchmarking leads me to believe that the impact of this change on compile time w.r.t. printing is closely related to `print` methods that perform a lot of additional/complex processing outside of the OpAsmPrinter.

Differential Revision: https://reviews.llvm.org/D90512
2020-11-09 21:54:47 -08:00
Rahul Joshi 8b5a3e4632 [MLIR] Change FuncOp assembly syntax to print visibility inline instead of in attrib dict.
- Change the syntax for FuncOp to be `func <visibility>? @name` instead of printing the
  visibility in the attribute dictionary (see the example below).
- Since printFunctionLikeOp() and parseFunctionLikeOp() are also used by other
  operations, make the "inline visibility" an opt-in feature.
- Updated unit test to use and check the new syntax.
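
Example of the new syntax (function names made up):

```
// Visibility is now printed inline rather than in the attribute dictionary.
func private @internal_helper(i32) -> i32
func @public_entry(%arg0: i32) -> i32 {
  %0 = call @internal_helper(%arg0) : (i32) -> i32
  return %0 : i32
}
```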

Differential Revision: https://reviews.llvm.org/D90859
2020-11-09 11:08:08 -08:00
Rahul Joshi a97e357e8e [MLIR] Support `global_memref` and `get_global_memref` in standard -> LLVM conversion.
- Convert `global_memref` to LLVM::GlobalOp (see the sketch below).
- Convert `get_global_memref` to a memref descriptor with a pointer to the first element
  of the global stashed in it.
- Extend the unit test and an mlir-cpu-runner test to validate the generated LLVM IR.
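
A minimal sketch of the ops being lowered (names and initializer made up; exact
assembly assumed):

```
global_memref "private" @gv : memref<4xf32> = dense<0.0>

func @use() -> memref<4xf32> {
  %0 = get_global_memref @gv : memref<4xf32>
  return %0 : memref<4xf32>
}
```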

Differential Revision: https://reviews.llvm.org/D90803
2020-11-09 10:54:21 -08:00
George Mitenkov 89eed79c1f [MLIR][SPIRVToLLVM] Added module name conversion
Since a SPIR-V module has an optional name, this patch
passes it to `ModuleOp` during conversion.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D90904
2020-11-07 12:27:44 +03:00
Artur Bialas f9dca1039a [mlir][spirv] Add VectorExtractDynamicOp and vector.extractelement lowering
Add VectorExtractDynamicOp to the SPIR-V dialect and a conversion pattern from
vector.extractelement to it.

Differential Revision: https://reviews.llvm.org/D90679
2020-11-05 08:26:54 +01:00
Alex Zinenko 8475fa6ed6 [mlir] Add a simpler lowering pattern for WhileOp representing a do-while loop
When the "after" region of a WhileOp is merely forwarding its arguments back to
the "before" region, i.e. WhileOp is a canonical do-while loop, a simpler CFG
subgraph that omits the "after" region with its extra branch operation can be
produced. Loop rotation from general "while" to "if { do-while }" is left for a
future canonicalization pattern when it becomes necessary.
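
For illustration, a canonical do-while in `scf.while` form (the body is made
up), where the "after" region merely forwards its argument:

```
%res = scf.while (%arg = %init) : (i32) -> i32 {
  %next = addi %arg, %step : i32
  %cond = cmpi "slt", %next, %bound : i32
  scf.condition(%cond) %next : i32
} do {
^bb0(%next: i32):
  // The "after" region only forwards the value back to "before".
  scf.yield %next : i32
}
```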

Differential Revision: https://reviews.llvm.org/D90604
2020-11-04 09:43:13 +01:00