Commit Graph

581 Commits

Author SHA1 Message Date
Lei Zhang e27197f360 [mlir][spirv] Define spv.IsNan/spv.IsInf and add lowerings
spv.Ordered/spv.Unordered are meant for OpenCL Kernel capability.
For Vulkan Shader capability, we should use spv.IsNan to check
whether a number is NaN.

Add a new pattern for converting `std.cmpf ord|uno` to spv.IsNan
and bumped the pattern converting to spv.Ordered/spv.Unordered
to a higher benefit. The SPIR-V target environment will properly
select between these two patterns.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D95237
2021-01-22 13:09:33 -05:00
Lei Zhang 167fb9b4b4 [mlir][spirv] Fix script for availability autogen and refresh ops
Previously we only autogen the availability for ops that are
direct instantiating `SPV_Op` and expected other subclasses of
`SPV_Op` to define aggregated availability for all ops. This is
quite error prone and we can miss capabilities for certain ops.
Also it's arguable to have multiple levels of subclasses and try
to deduplicate too much: having the availability directly in the
op can be quite explicit and clear. A few extra lines of
declarative code is fine.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D95236
2021-01-22 13:07:36 -05:00
Hanhan Wang 2cb130f766 [mlir][StandardToSPIRV] Add support for lowering uitofp to SPIR-V
- Extend spirv::ConstantOp::getZero/One to handle float, vector of int, and vector of float.
- Refactor ZeroExtendI1Pattern to use getZero/One methods.
- Add one more test for lowering std.zexti which extends vector<4xi1> to vector<4xi64>.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D95120
2021-01-21 22:20:32 -08:00
MaheshRavishankar 615167c9f7 [mlir]][SPIRV] Define OrderedOp and UnorderedOp and add lowerings from Standard.
Define OrderedOp and UnorderedOp instructions in SPIR-V and convert
cmpf operations with `ord` and `uno` tag to these instructions
respectively.

Differential Revision: https://reviews.llvm.org/D95098
2021-01-21 07:56:44 -08:00
Frederik Gossen 4ef38f9c12 Add log1p lowering from standard to ROCDL intrinsics
Differential Revision: https://reviews.llvm.org/D95129
2021-01-21 14:02:48 +01:00
Frederik Gossen 294e2544c9 Add log1p lowering from standard to NVVM intrinsics
Differential Revision: https://reviews.llvm.org/D95130
2021-01-21 14:00:38 +01:00
Alexander Belyaev fc58bfd02f [mlir] Remove complex ops from Standard dialect.
`complex` dialect should be used instead.
https://llvm.discourse.group/t/rfc-split-the-complex-dialect-from-std/2496/2

Differential Revision: https://reviews.llvm.org/D95077
2021-01-21 10:34:26 +01:00
Sean Silva be7352c00d [mlir][splitting std] move 2 more ops to `tensor`
- DynamicTensorFromElementsOp
- TensorFromElements

Differential Revision: https://reviews.llvm.org/D94994
2021-01-19 13:49:25 -08:00
Lei Zhang 3a56a96664 [mlir][spirv] Define spv.GLSL.Fma and add lowerings
Also changes some rewriter.create + rewriter.replaceOp calls
into rewriter.replaceOpWithNewOp calls.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D94965
2021-01-19 09:14:21 -05:00
Alexander Belyaev 11f4c58c15 [mlir] Add `complex.abs`, `complex.div` and `complex.mul` to ComplexOps.
Differential Revision: https://reviews.llvm.org/D94911
2021-01-19 12:09:59 +01:00
Alexander Belyaev d0cb0d30a4 [mlir] Add Complex dialect.
Differential Revision: https://reviews.llvm.org/D94764
2021-01-15 19:58:10 +01:00
River Riddle 93592b726c [mlir][OpFormatGen] Format enum attribute cases as keywords when possible
In the overwhelmingly common case, enum attribute case strings represent valid identifiers in MLIR syntax. This revision updates the format generator to format as a keyword in these cases, removing the need to wrap values in a string. The parser still retains the ability to parse the string form, but the printer will use the keyword form when applicable.

Differential Revision: https://reviews.llvm.org/D94575
2021-01-14 11:35:49 -08:00
Rob Suderman 1d973b7ded [MLIR][TOSA] First lowerings from Tosa to Linalg
Initial commit to add support for lowering from TOSA to Linalg. The focus is on
the essential infrastructure for these lowerings and integration with existing
passes.

Includes lowerings for a subset of operations including:
  abs, add, sub, pow, and, or, xor, left shift, right shift, tanh

Lit tests are used to validate correctness.

Differential Revision: https://reviews.llvm.org/D94247
2021-01-14 11:24:23 -08:00
Alex Zinenko bd30a796fc [mlir] use built-in vector types instead of LLVM dialect types when possible
Continue the convergence between LLVM dialect and built-in types by using the
built-in vector type whenever possible, that is for fixed vectors of built-in
integers and built-in floats. LLVM dialect vector type is still in use for
pointers, less frequent floating point types that do not have a built-in
equivalent, and scalable vectors. However, the top-level `LLVMVectorType` class
has been removed in favor of free functions capable of inspecting both built-in
and LLVM dialect vector types: `LLVM::getVectorElementType`,
`LLVM::getNumVectorElements` and `LLVM::getFixedVectorType`. Additional work is
necessary to design an implemented the extensions to built-in types so as to
remove the `LLVMFixedVectorType` entirely.

Note that the default output format for the built-in vectors does not have
whitespace around the `x` separator, e.g., `vector<4xf32>` as opposed to the
LLVM dialect vector type format that does, e.g., `!llvm.vec<4 x fp128>`. This
required changing the FileCheck patterns in several tests.

Reviewed By: mehdi_amini, silvas

Differential Revision: https://reviews.llvm.org/D94405
2021-01-12 10:04:28 +01:00
Thomas Raoux 3d693bd0bd [mlir][vector] Add memory effects to transfer_read transfer_write ops
This allow more accurate modeling of the side effects and allow dead code
elimination to remove dead transfer ops.

Differential Revision: https://reviews.llvm.org/D94318
2021-01-11 09:25:37 -08:00
Christian Sigg d59ddba777 [mlir] Fix gpu-to-llvm lowering for gpu.alloc with dynamic sizes.
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D94402
2021-01-11 15:55:48 +01:00
Aart Bik 6728af16cf [mlir][vector] modified scatter/gather syntax, pass_thru mandatory
This change makes the scatter/gather syntax more consistent with
the syntax of all the other memory operations in the Vector dialect
(order of types, use of [] for index, etc.). This will make the MLIR
code easier to read. In addition, the pass_thru parameter of the
gather has been made mandatory (there is very little benefit in
using the implicit "undefined" values).

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D94352
2021-01-09 11:41:37 -08:00
Lei Zhang 7c3ae48fe8 [mlir][spirv] Replace SPIRVOpLowering with OpConversionPattern
The dialect conversion framework was enhanced to handle type
conversion automatically. OpConversionPattern already contains
a pointer to the TypeConverter. There is no need to duplicate it
in a separate subclass. This removes the only reason for a
SPIRVOpLowering subclass. It adapts to use core infrastructure
and simplifies the code.

Also added a utility function to OpConversionPattern for getting
TypeConverter as a certain subclass.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D94080
2021-01-09 08:04:53 -05:00
Aart Bik a57def30f5 [mlir][vector] generalized masked l/s and compressed l/s with indices
Adding the ability to index the base address brings these operations closer
to the transfer read and write semantics (with lowering advantages), ensures
more consistent use in vector MLIR code (easier to read), and reduces the
amount of code duplication to lower memrefs into base addresses considerably
(making codegen less error-prone).

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D94278
2021-01-08 13:59:34 -08:00
Alex Zinenko dd5165a920 [mlir] replace LLVM dialect float types with built-ins
Continue the convergence between LLVM dialect and built-in types by replacing
the bfloat, half, float and double LLVM dialect types with their built-in
counterparts. At the API level, this is a direct replacement. At the syntax
level, we change the keywords to `bf16`, `f16`, `f32` and `f64`, respectively,
to be compatible with the built-in type syntax. The old keywords can still be
parsed but produce a deprecation warning and will be eventually removed.

Depends On D94178

Reviewed By: mehdi_amini, silvas, antiagainst

Differential Revision: https://reviews.llvm.org/D94179
2021-01-08 17:38:12 +01:00
Alex Zinenko 2230bf99c7 [mlir] replace LLVMIntegerType with built-in integer type
The LLVM dialect type system has been closed until now, i.e. did not support
types from other dialects inside containers. While this has had obvious
benefits of deriving from a common base class, it has led to some simple types
being almost identical with the built-in types, namely integer and floating
point types. This in turn has led to a lot of larger-scale complexity: simple
types must still be converted, numerous operations that correspond to LLVM IR
intrinsics are replicated to produce versions operating on either LLVM dialect
or built-in types leading to quasi-duplicate dialects, lowering to the LLVM
dialect is essentially required to be one-shot because of type conversion, etc.
In this light, it is reasonable to trade off some local complexity in the
internal implementation of LLVM dialect types for removing larger-scale system
complexity. Previous commits to the LLVM dialect type system have adapted the
API to support types from other dialects.

Replace LLVMIntegerType with the built-in IntegerType plus additional checks
that such types are signless (these are isolated in a utility function that
replaced `isa<LLVMType>` and in the parser). Temporarily keep the possibility
to parse `!llvm.i32` as a synonym for `i32`, but add a deprecation notice.

Reviewed By: mehdi_amini, silvas, antiagainst

Differential Revision: https://reviews.llvm.org/D94178
2021-01-07 19:48:31 +01:00
Sanjoy Das 6173d1277b Remove allow-unregistered-dialect from some tests that don't need it
Differential Revision: https://reviews.llvm.org/D93982
2021-01-06 09:40:50 -08:00
Eugene Zhulenev 61422c8b66 [mlir] Async: add support for lowering async value operands to LLVM
Depends On D93592

Add support for `async.execute` async value unwrapping operands:

```
%token = async.execute(%async_value as %unwrapped : !async.value<!my.type>) {
  ...
  async.yield
}
```

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D93598
2020-12-25 02:25:20 -08:00
Eugene Zhulenev 621ad468d9 [mlir] Async: lowering async.value to LLVM
1. Add new methods to Async runtime API to support yielding async values
2. Add lowering from `async.yield` with value payload to the new runtime API calls

`async.value` lowering requires that payload type is convertible to LLVM and supported by `llvm.mlir.cast` (DialectCast) operation.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D93592
2020-12-25 02:23:48 -08:00
Lei Zhang a16fbff17d [mlir][spirv] Create a pass for testing SCFToSPIRV patterns
Previously all SCF to SPIR-V conversion patterns were tested as
the -convert-gpu-to-spirv pass. That obscured the structure we
want. This commit fixed it.

Reviewed By: ThomasRaoux, hanchung

Differential Revision: https://reviews.llvm.org/D93488
2020-12-23 14:31:55 -05:00
Lei Zhang 42980a789d [mlir][spirv] Convert functions returning one value
Reviewed By: hanchung, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D93468
2020-12-23 13:27:31 -05:00
Christian Sigg df6cbd37f5 [mlir] Lower gpu.memcpy to GPU runtime calls.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D93204
2020-12-22 22:49:19 +01:00
Prateek Gupta 3e07b0b9d3 [MLIR] Fix lowering of affine operations with return values
This commit addresses the issue of lowering affine.for and
affine.parallel having return values. Relevant test cases are also
added.

Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D93090
2020-12-22 21:44:31 +05:30
Sean Silva 129d6e554e [mlir] Move `std.tensor_cast` -> `tensor.cast`.
This is almost entirely mechanical.

Differential Revision: https://reviews.llvm.org/D93357
2020-12-17 16:06:56 -08:00
Alex Zinenko 96076a2edb [mlir] Support index and memref types in llvm.mlir.cast
This operation is designed to support partial conversion, more specifically the
IR state in which some operations expect or produce built-in types and some
operations produce and expect LLVM dialect types. It is reasonable for it to
support cast between built-in types and any equivalent that could be produced
by the type conversion. (At the same time, we don't want the dialect to depend
on the type conversion as it could lead to a dependency cycle). Introduce
support for casting from index to any integer type and back, and from memref to
bare pointer or memref descriptor type and back.

Contrary to what the TODO in the code stated, there are no particular
precautions necessary to handle the bare pointer conversion for memerfs. This
conversion applies exclusively to statically-shaped memrefs, so we can always
recover the full descriptor contents from the type.

This patch simultaneously tightens the verification for other types to only
accept matching pairs of types, e.g., i64 and !llvm.i64, as opposed to the
previous implementation that only checked if the types were generally allowed
byt not for matching, e.g. i64 could be "casted" to !llvm.bfloat, which is not
the intended semantics.

Move the relevant test under test/Dialect/LLVMIR because it is not specific to
the conversion pass, but rather exercises an op in the dialect. If we decide
this op does not belong to the LLVM dialect, both the dialect and the op should
move together.

Reviewed By: silvas, ezhulenev

Differential Revision: https://reviews.llvm.org/D93405
2020-12-17 09:21:42 +01:00
Tres Popp c77ea40528 [mlir] Add std.pow lowering to LLVMIR
Differential Revision: https://reviews.llvm.org/D93311
2020-12-15 18:54:29 +01:00
Tres Popp 9adc64539f [mlir] Add std.powf to ROCDL lowering.
Differential Revision: https://reviews.llvm.org/D93313
2020-12-15 18:47:49 +01:00
Tres Popp f3e8f27ca1 [mlir] Fix GPUToNVVM test 2020-12-15 18:41:16 +01:00
Tres Popp e04785b131 [mlir] Add NVVM lowering for std.pow
Differential Revision: https://reviews.llvm.org/D93303
2020-12-15 18:28:23 +01:00
Javier Setoain aece4e2793 [mlir][ArmSVE][RFC] Add an ArmSVE dialect
This revision starts an Arm-specific ArmSVE dialect discussed in the discourse RFC thread:

https://llvm.discourse.group/t/rfc-vector-dialects-neon-and-sve/2284

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D92172
2020-12-14 21:35:01 +00:00
Frederik Gossen 75d9a46090 [MLIR] Add atan and atan2 lowerings to CUDA intrinsics
Differential Revision: https://reviews.llvm.org/D93124
2020-12-14 10:45:28 +01:00
Frederik Gossen 1c6bc2c0b5 [MLIR] Add lowerings for atan and atan2 to ROCDL intrinsics
Differential Revision: https://reviews.llvm.org/D93123
2020-12-14 10:43:19 +01:00
Sean Silva 444822d77a Revert "Revert "[mlir] Start splitting the `tensor` dialect out of `std`.""
This reverts commit 0d48d265db.

This reapplies the following commit, with a fix for CAPI/ir.c:

[mlir] Start splitting the `tensor` dialect out of `std`.

This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).

Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.

This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.

Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2

Differential Revision: https://reviews.llvm.org/D92991
2020-12-11 14:30:50 -08:00
Sean Silva 0d48d265db Revert "[mlir] Start splitting the `tensor` dialect out of `std`."
This reverts commit cab8dda90f.

I mistakenly thought that CAPI/ir.c failure was unrelated to this
change. Need to debug it.
2020-12-11 14:15:41 -08:00
Sean Silva cab8dda90f [mlir] Start splitting the `tensor` dialect out of `std`.
This starts by moving `std.extract_element` to `tensor.extract` (this
mirrors the naming of `vector.extract`).

Curiously, `std.extract_element` supposedly works on vectors as well,
and this patch removes that functionality. I would tend to do that in
separate patch, but I couldn't find any downstream users relying on
this, and the fact that we have `vector.extract` made it seem safe
enough to lump in here.

This also sets up the `tensor` dialect as a dependency of the `std`
dialect, as some ops that currently live in `std` depend on
`tensor.extract` via their canonicalization patterns.

Part of RFC: https://llvm.discourse.group/t/rfc-split-the-tensor-dialect-from-std/2347/2

Differential Revision: https://reviews.llvm.org/D92991
2020-12-11 13:50:55 -08:00
Nicolas Vasilache 7310501f74 [mlir][ArmNeon][RFC] Add a Neon dialect
This revision starts an Arm-specific ArmNeon dialect discussed in the [discourse RFC thread](https://llvm.discourse.group/t/rfc-vector-dialects-neon-and-sve/2284).

Differential Revision: https://reviews.llvm.org/D92171
2020-12-11 13:49:40 +00:00
Adrian Kuegel ada4c7a351 Add rsqrt lowering from standard to ROCDL.
Add a lowering for rsqrt from standard dialect to ROCDL.

Differential Revision: https://reviews.llvm.org/D93011
2020-12-11 13:18:57 +01:00
Adrian Kuegel 09f717b929 Add sqrt lowering from standard to ROCDL
Add a lowering for sqrt from standard dialect to ROCDL.

Differential Revision: https://reviews.llvm.org/D92921
2020-12-10 09:47:37 +01:00
Frederik Gossen b4750f58d8 Add sqrt lowering from standard to NVVM
Differential Revision: https://reviews.llvm.org/D92850
2020-12-08 17:08:27 +01:00
Frederik Gossen bb7d43e7d5 Add rsqrt lowering from standard to NVVM
Differential Revision: https://reviews.llvm.org/D92838
2020-12-08 14:33:58 +01:00
Aart Bik c95acf052b [mlir][vector][avx512] move avx512 lowering pass into general vector lowering
A separate AVX512 lowering pass does not compose well with the regular
vector lowering pass. As such, it is at risk of code duplication and
lowering inconsistencies. This change removes the separate AVX512 lowering
pass and makes it an "option" in the regular vector lowering pass
(viz. vector dialect "augmented" with AVX512 dialect).

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D92614
2020-12-03 17:23:46 -08:00
Christian Sigg 5535696c38 [mlir] Add gpu.allocate, gpu.deallocate ops with LLVM lowering to runtime function calls.
The ops are very similar to the std variants, but support async GPU execution.

gpu.alloc does not currently support an alignment attribute, and the new ops do not have
canonicalizers/folders like their std siblings do.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D91698
2020-11-27 09:40:59 +01:00
Alex Zinenko 119545f433 [mlir] Add conversion from SCF parallel loops to OpenMP
Introduce a conversion pass from SCF parallel loops to OpenMP dialect
constructs - parallel region and workshare loop. Loops with reductions are not
supported because the OpenMP dialect cannot model them yet.

The conversion currently targets only one level of parallelism, i.e. only
one top-level `omp.parallel` operation is produced even if there are nested
`scf.parallel` operations that could be mapped to `omp.wsloop`. Nested
parallelism support is left for future work.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D91982
2020-11-24 21:12:56 +01:00
Alex Zinenko f7d033f4d8 [mlir] Support WsLoopOp in OpenMP to LLVM dialect conversion
It is a simple conversion that only requires to change the region argument
types, generalize it from ParallelOp.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D91989
2020-11-23 23:28:02 +01:00
Alex Zinenko 1ec60862d7 [mlir] Avoid cloning ops in SCF parallel conversion to CFG
The existing implementation of the conversion from SCF Parallel operation to
SCF "for" loops in order to further convert those loops to branch-based CFG has
been cloning the loop and reduction body operations into the new loop because
ConversionPatternRewriter was missing support for moving blocks while replacing
their arguments. This functionality now available, use it to implement the
conversion and avoid cloning operations, which may lead to doubling of the IR
size during the conversion.

In addition, this fixes an issue with converting nested SCF "if" conditionals
present in "parallel" operations that would cause the conversion infrastructure
to stop because of the repeated application of the pattern converting "newly"
created "if"s (which were in fact just moved). Arguably, this should be fixed
at the infrastructure level and this fix is a workaround.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D91955
2020-11-23 14:01:22 +01:00
Eugene Zhulenev a86a9b5ef7 [mlir] Automatic reference counting for Async values + runtime support for ref counted objects
Depends On D89963

**Automatic reference counting algorithm outline:**

1. `ReturnLike` operations forward the reference counted values without
    modifying the reference count.
2. Use liveness analysis to find blocks in the CFG where the lifetime of
   reference counted values ends, and insert `drop_ref` operations after
   the last use of the value.
3. Insert `add_ref` before the `async.execute` operation capturing the
   value, and pairing `drop_ref` before the async body region terminator,
   to release the captured reference counted value when execution
   completes.
4. If the reference counted value is passed only to some of the block
   successors, insert `drop_ref` operations in the beginning of the blocks
   that do not have reference coutned value uses.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D90716
2020-11-20 03:08:44 -08:00
Alex Zinenko 9bb5bff570 [mlir] Add an assertion on creating an Operation with null result types
Null types are commonly used as an error marker. Catch them in the constructor
of Operation if they are present in the result type list, as otherwise this
could lead to further surprising behavior when querying op result types.

Fix AsyncToLLVM and StandardToLLVM that were using null types when constructing
operations.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D91770
2020-11-19 22:28:38 +01:00
ergawy 2f3adc54b5 [MLIR][SPIRV] Rename `spv._module_end` to `spv.mlir.endmodule`
This commit does the renaming mentioned in the title in order to bring
'spv' dialect closer to the MLIR naming conventions.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D91792
2020-11-19 13:25:13 -05:00
ergawy 9bd50abc4c [MLIR][SPIRV] Rename `spv._merge` to `spv.mlir.merge`
This commit does the renaming mentioned in the title in order to bring
'spv' dialect closer to the MLIR naming conventions.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D91797
2020-11-19 10:04:35 -05:00
Christian Sigg 8b97e17d16 [mlir] Simplify code generated by ConvertToLLVMPattern::getStridedElementPtr().
Make the interface match the one of ConvertToLLVMPattern::getDataPtr() (to be removed in a separate change).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D91599
2020-11-18 11:52:09 +01:00
Christian Sigg bedaad4495 [mlir] Simplify std.alloc lowering to LLVM.
std.alloc only supports memrefs with identity layout, which means we can simplify the lowering to LLVM and compute strides only from (static and dynamic) sizes.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D91549
2020-11-17 18:55:34 +01:00
ergawy 9793edd5bf [MLIR][SPIRV] Rename `spv._address_of` to `spv.mlir.addressof`
This commit does the renaming mentioned in the title in order to bring
`spv` dialect closer to the MLIR naming conventions.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D91609
2020-11-17 12:12:27 -05:00
Rahul Joshi b7382ed3fe [MLIR] Extend Symbol verification to reject public symbol declarations.
- Extend the Symbol interface with `isDeclaration` to identify operations that declare
  a symbol as opposed to define it.
- Extend verification to disallow public declarations as per the discussion in
   https://llvm.discourse.group/t/rfc-symbol-definition-declaration-x-visibility-checks/2140
- Adopt the new interface for `FuncOp` and fix test and code to not have/create public
  function declarations.

Differential Revision: https://reviews.llvm.org/D91456
2020-11-16 16:05:32 -08:00
Christian Sigg 04481f26fa [mlir] Require std.alloc() ops to have canonical layout during LLVM lowering.
The current code allows strided layouts, but the number of elements allocated is ambiguous. It could be either the number of elements in the shape (the current implementation), or the amount of elements required to not index out-of-bounds with the given maps (which would require evaluating the layout map).

If we require the canonical layouts, the two will be the same.

Reviewed By: nicolasvasilache, ftynse

Differential Revision: https://reviews.llvm.org/D91523
2020-11-16 17:29:36 +01:00
Hanhan Wang 47fd19f22e [mlir][StandardToSPIRV] Extend support for lowering cmpi to SPIRV.
The logic of vector on boolean was missed. This patch adds the logic and test on
it.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D91403
2020-11-16 06:51:05 -08:00
Eugene Zhulenev c30ab6c2a3 [mlir] Transform scf.parallel to scf.for + async.execute
Depends On D89958

1. Adds `async.group`/`async.awaitall` to group together multiple async tokens/values
2. Rewrite scf.parallel operation into multiple concurrent async.execute operations over non overlapping subranges of the original loop.

Example:

```
   scf.for (%i, %j) = (%lbi, %lbj) to (%ubi, %ubj) step (%si, %sj) {
     "do_some_compute"(%i, %j): () -> ()
   }
```

Converted to:

```
   %c0 = constant 0 : index
   %c1 = constant 1 : index

   // Compute blocks sizes for each induction variable.
   %num_blocks_i = ... : index
   %num_blocks_j = ... : index
   %block_size_i = ... : index
   %block_size_j = ... : index

   // Create an async group to track async execute ops.
   %group = async.create_group

   scf.for %bi = %c0 to %num_blocks_i step %c1 {
     %block_start_i = ... : index
     %block_end_i   = ... : index

     scf.for %bj = %c0 t0 %num_blocks_j step %c1 {
       %block_start_j = ... : index
       %block_end_j   = ... : index

       // Execute the body of original parallel operation for the current
       // block.
       %token = async.execute {
         scf.for %i = %block_start_i to %block_end_i step %si {
           scf.for %j = %block_start_j to %block_end_j step %sj {
             "do_some_compute"(%i, %j): () -> ()
           }
         }
       }

       // Add produced async token to the group.
       async.add_to_group %token, %group
     }
   }

   // Await completion of all async.execute operations.
   async.await_all %group
```
In this example outer loop launches inner block level loops as separate async
execute operations which will be executed concurrently.

At the end it waits for the completiom of all async execute operations.

Reviewed By: ftynse, mehdi_amini

Differential Revision: https://reviews.llvm.org/D89963
2020-11-13 04:02:56 -08:00
Stephan Herhut 5da2423bc0 [mlir][gpu] Only transform mapped parallel loops to GPU.
This exposes a hook to configure legality of operations such that only
`scf.parallel` operations that have mapping attributes are marked as
illegal. Consequently, the transformation can now also be applied to
mixed forms.

Differential Revision: https://reviews.llvm.org/D91340
2020-11-13 09:15:17 +01:00
George Mitenkov de3ad5bb09 [MLIR][SPIRVToLLVM] Enhanced conversion for execution mode
This patch introduces a new conversion pattern for `spv.ExecutionMode`.
`spv.ExecutionMode` may contain important information about the entry
point, which we want to preserve. For example, `LocalSize` provides
information about the work-group size that can be reused. Hence, the
pattern for entry-point ops changes to the following:
- `spv.EntryPoint` is still simply removed
- Info from `spv.ExecutionMode` is used to create a global struct variable,
  which looks like:

  ```
  struct {
    int32_t executionMode;
    int32_t values[];          // optional values
  };
  ```

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D89989
2020-11-10 18:33:54 +03:00
Artur Bialas 3035e676a3 [mlir][spirv] Add VectorInsertDynamicOp and vector.insertelement lowering
VectorInsertDynamicOp in SPIRV dialect
conversion from vector.insertelement to spirv VectorInsertDynamicOp

Differential Revision: https://reviews.llvm.org/D90927
2020-11-10 09:49:12 +01:00
River Riddle ebcc022507 [mlir][AsmPrinter] Refactor printing to only print aliases for attributes/types that will exist in the output.
This revision refactors the way that attributes/types are considered when generating aliases. Instead of considering all of the attributes/types of every operation, we perform a "fake" print step that prints the operations using a dummy printer to collect the attributes and types that would actually be printed during the real process. This removes a lot of attributes/types from consideration that generally won't end up in the final output, e.g. affine map attributes in an `affine.apply`/`affine.for`.

This resolves a long standing TODO w.r.t aliases, and helps to have a much cleaner textual output format. As a datapoint to the latter, as part of this change several tests were identified as testing for the presence of attributes aliases that weren't actually referenced by the custom form of any operation.

To ensure that this wouldn't cause a large degradation in compile time due to the second full print, I benchmarked this change on a very large module with a lot of operations(The file is ~673M/~4.7 million lines long). This file before this change take ~6.9 seconds to print in the custom form, and ~7 seconds after this change. In the custom assembly case, this added an average of a little over ~100 miliseconds to the compile time. This increase was due to the way that argument attributes on functions are structured and how they get printed; i.e. with a better representation the negative impact here can be greatly decreased. When printing in the generic form, this revision had no observable impact on the compile time. This benchmarking leads me to believe that the impact of this change on compile time w.r.t printing is closely related to `print` methods that perform a lot of additional/complex processing outside of the OpAsmPrinter.

Differential Revision: https://reviews.llvm.org/D90512
2020-11-09 21:54:47 -08:00
Rahul Joshi 8b5a3e4632 [MLIR] Change FuncOp assembly syntax to print visibility inline instead of in attrib dict.
- Change syntax for FuncOp to be `func <visibility>? @name` instead of printing the
  visibility in the attribute dictionary.
- Since printFunctionLikeOp() and parseFunctionLikeOp() are also used by other
  operations, make the "inline visibility" an opt-in feature.
- Updated unit test to use and check the new syntax.

Differential Revision: https://reviews.llvm.org/D90859
2020-11-09 11:08:08 -08:00
Rahul Joshi a97e357e8e [MLIR] Support `global_memref` and `get_global_memref` in standard -> LLVM conversion.
- Convert `global_memref` to LLVM::GlobalOp.
- Convert `get_global_memref` to a memref descriptor with a pointer to the first element
  of the global stashed in it.
- Extend unit test and a mlir-cpu-runner test to validate the generated LLVM IR.

Differential Revision: https://reviews.llvm.org/D90803
2020-11-09 10:54:21 -08:00
George Mitenkov 89eed79c1f [MLIR][SPIRVToLLVM] Added module name conversion
Since SPIR-V module has an optional name, this patch
makes a change to pass it to `ModuleOp` during conversion.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D90904
2020-11-07 12:27:44 +03:00
Artur Bialas f9dca1039a [mlir][spirv] Add VectorExtractDynamicOp and vector.extractelement lowering
VectorExtractDynamicOp in SPIRV dialect
conversion from vector.extractelement to spirv VectorExtractDynamicOp

Differential Revision: https://reviews.llvm.org/D90679
2020-11-05 08:26:54 +01:00
Alex Zinenko 8475fa6ed6 [mlir] Add a simpler lowering pattern for WhileOp representing a do-while loop
When the "after" region of a WhileOp is merely forwarding its arguments back to
the "before" region, i.e. WhileOp is a canonical do-while loop, a simpler CFG
subgraph that omits the "after" region with its extra branch operation can be
produced. Loop rotation from general "while" to "if { do-while }" is left for a
future canonicalization pattern when it becomes necessary.

Differential Revision: https://reviews.llvm.org/D90604
2020-11-04 09:43:13 +01:00
Alex Zinenko 4c0e255c98 [mlir] Add lowering to CFG for WhileOp
The lowering is a straightforward inlining of the "before" and "after" regions
connected by (conditional) branches. This plugs the WhileOp into the
progressive lowering scheme. Future commits may choose to target WhileOp
instead of CFG when lowering ForOp.

Differential Revision: https://reviews.llvm.org/D90603
2020-11-04 09:43:13 +01:00
Alexander Belyaev 9925168576 [mlir] Convert `memref_reshape` to LLVM.
https://llvm.discourse.group/t/rfc-standard-memref-cast-ops/1454/15

Differential Revision: https://reviews.llvm.org/D90377
2020-11-03 11:39:08 +01:00
Tres Popp d05d42199f [mlir] Add partial lowering of shape.cstr_broadcastable.
Because cstr operations allow more instruction reordering than asserts, we only
lower cstr_broadcastable to std ops with cstr_require. This ensures that the
more drastic lowering to asserts can happen specifically with the user's desire.

Differential Revision: https://reviews.llvm.org/D89325
2020-11-03 09:57:23 +01:00
Eugene Zhulenev f507aa17b7 [mlir] Implement lowering to LLVM of async.execute ops with token dependencies
Add support for lowering `async.execute` operations with token dependencies

Example:

```
%dep = ... : !async.token
%token = async.execute[%dep] {
...
}
```

Token dependencies lowered to `async.await` operations inside the outline coroutine body.

Reviewed By: herhut, mehdi_amini, ftynse

Differential Revision: https://reviews.llvm.org/D89958
2020-10-30 05:59:03 -07:00
Tres Popp 511484f27d [mlir] Add lowering for IsBroadcastable to Std dialect.
Differential Revision: https://reviews.llvm.org/D90407
2020-10-30 10:44:27 +01:00
Christian Sigg b22f111023 [mlir][gpu] NFC: Change gpu.launch_func ops to custom format.
This should fix the reason for the failures after ec7780ebda. I will roll forward in a separate change.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D90410
2020-10-29 21:21:30 +01:00
Christian Sigg 97b351a827 [mlir][gpu] Fix leaked stream and module when lowering gpu.launch_func to runtime calls.
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D90370
2020-10-29 08:40:51 +01:00
Qingyi Liu 1ec893c574 MLIR: add SinOp Lowering to __nv_sinf and __nv_sin
Added lowering rule from `SinOp` to `__nv_sinf` and `__nv_sin`

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D90147
2020-10-28 14:15:26 +01:00
River Riddle 8a1ca2cd34 [mlir] Add a conversion pass between PDL and the PDL Interpreter Dialect
The conversion between PDL and the interpreter is split into several different parts.
** The Matcher:

The matching section of all incoming pdl.pattern operations is converted into a predicate tree and merged. Each pattern is first converted into an ordered list of predicates starting from the root operation. A predicate is composed of three distinct parts:
* Position
  - A position refers to a specific location on the input DAG, i.e. an
    existing MLIR entity being matched. These can be attributes, operands,
    operations, results, and types. Each position also defines a relation to
    its parent. For example, the operand `[0] -> 1` has a parent operation
    position `[0]` (the root).
* Question
  - A question refers to a query on a specific positional value. For
  example, an operation name question checks the name of an operation
  position.
* Answer
  - An answer is the expected result of a question. For example, when
  matching an operation with the name "foo.op". The question would be an
  operation name question, with an expected answer of "foo.op".

After the predicate lists have been created and ordered(based on occurrence of common predicates and other factors), they are formed into a tree of nodes that represent the branching flow of a pattern match. This structure allows for efficient construction and merging of the input patterns. There are currently only 4 simple nodes in the tree:
* ExitNode: Represents the termination of a match
* SuccessNode: Represents a successful match of a specific pattern
* BoolNode/SwitchNode: Branch to a specific child node based on the expected answer to a predicate question.

Once the matcher tree has been generated, this tree is walked to generate the corresponding interpreter operations.

 ** The Rewriter:
The rewriter portion of a pattern is generated in a very straightforward manor, similarly to lowerings in other dialects. Each PDL operation that may exist within a rewrite has a mapping into the interpreter dialect. The code for the rewriter is generated within a FuncOp, that is invoked by the interpreter on a successful pattern match. Referenced values defined in the matcher become inputs the generated rewriter function.

An example lowering is shown below:

```mlir
// The following high level PDL pattern:
pdl.pattern : benefit(1) {
  %resultType = pdl.type
  %inputOperand = pdl.input
  %root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType
  pdl.rewrite %root {
    pdl.replace %root with (%inputOperand)
  }
}

// is lowered to the following:
module {
  // The matcher function takes the root operation as an input.
  func @matcher(%arg0: !pdl.operation) {
    pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1
  ^bb1:
    pdl_interp.return
  ^bb2:
    pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1
  ^bb3:
    pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1
  ^bb4:
    %0 = pdl_interp.get_operand 0 of %arg0
    pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1
  ^bb5:
    %1 = pdl_interp.get_result 0 of %arg0
    pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1
  ^bb6:
    // This operation corresponds to a successful pattern match.
    pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1
  }
  module @rewriters {
    // The inputs to the rewriter from the matcher are passed as arguments.
    func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) {
      pdl_interp.replace %arg1 with(%arg0)
      pdl_interp.return
    }
  }
}
```

Differential Revision: https://reviews.llvm.org/D84580
2020-10-26 18:01:06 -07:00
Alexander Belyaev d6ab0474c6 [mlir] Convert MemRefReinterpretCastOp to LLVM.
https://llvm.discourse.group/t/rfc-standard-memref-cast-ops/1454/15

Differential Revision: https://reviews.llvm.org/D90033
2020-10-26 20:13:17 +01:00
George Mitenkov cae4067ec1 [MLIR][mlir-spirv-cpu-runner] A pass to emulate a call to kernel in LLVM
This patch introduces a pass for running
`mlir-spirv-cpu-runner` - LowerHostCodeToLLVMPass.

This pass emulates `gpu.launch_func` call in LLVM dialect and lowers
the host module code to LLVM. It removes the `gpu.module`, creates a
sequence of global variables that are later linked to the varables
in the kernel module, as well as a series of copies to/from
them to emulate the memory transfer to/from the host or to/from the
device sides. It also converts the remaining Standard dialect into
LLVM dialect, emitting C wrappers.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D86112
2020-10-26 08:11:04 -04:00
Lei Zhang 36ce915ac5 Revert "Revert "[mlir] Convert from Async dialect to LLVM coroutines""
This reverts commit 4986d5eaff with
proper patches to CMakeLists.txt:

- Add MLIRAsync as a dependency to MLIRAsyncToLLVM
- Add Coroutines as a dependency to MLIRExecutionEngine
2020-10-22 15:23:11 -04:00
Mehdi Amini 4986d5eaff Revert "[mlir] Convert from Async dialect to LLVM coroutines"
This reverts commit a8b0ae3bdd
and commit f8fcff5a9d.

The build with SHARED_LIBRARY=ON is broken.
2020-10-22 19:12:19 +00:00
Christian Sigg 9ab5362bab [mlir][gpu] NFC: switch occurrences of gpu.launch_func to custom format.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89929
2020-10-22 17:27:19 +02:00
Eugene Zhulenev f8fcff5a9d [mlir] Convert from Async dialect to LLVM coroutines
Lower from Async dialect to LLVM by converting async regions attached to `async.execute` operations into LLVM coroutines (https://llvm.org/docs/Coroutines.html):
1. Outline all async regions to functions
2. Add LLVM coro intrinsics to mark coroutine begin/end
3. Use MLIR conversion framework to convert all remaining async types and ops to LLVM + Async runtime function calls

All `async.await` operations inside async regions converted to coroutine suspension points. Await operation outside of a coroutine converted to the blocking wait operations.

Implement simple runtime to support concurrent execution of coroutines.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89292
2020-10-22 06:30:46 -07:00
Thomas Raoux ac2cf07195 [spirv] Fix legalize standard to spir-v for transfer ops
Forward missing attributes when creating the new transfer op otherwise the
builder would use default values.

Differential Revision: https://reviews.llvm.org/D89907
2020-10-21 13:56:01 -07:00
Christian Sigg 3ac561d8c3 [mlir][gpu] Add lowering to LLVM for `gpu.wait` and `gpu.wait async`.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89686
2020-10-21 18:20:42 +02:00
Tres Popp 72d5ac90b9 [mlir] Use affine dim instead of symbol in SCFToGPU lowering.
This still satisfies the constraints required by the affine dialect and
gives more flexibility in what iteration bounds can be used when
loewring to the GPU dialect.

Differential Revision: https://reviews.llvm.org/D89782
2020-10-20 11:56:34 +02:00
Sean Silva 57211fd239 [mlir] Use dynamic_tensor_from_elements in shape.broadcast conversion
Now, convert-shape-to-std doesn't internally create memrefs, which was
previously a bit of a layering violation. The conversion to memrefs
should logically happen as part of bufferization.

Differential Revision: https://reviews.llvm.org/D89669
2020-10-19 15:51:46 -07:00
ergawy bddaa7a848 [MLIR][SPIRV] Support identified and recursive structs.
This PR adds support for identified and recursive structs.
This includes: parsing, printing, serializing, and
deserializing such structs.

The following C struct:

```C
struct A {
  A* next;
};
```

which is translated to the following MLIR code as:

```mlir
!spv.struct<A, (!spv.ptr<!spv.struct<A>, Generic>)>
```

would be represented in the SPIR-V module as:

```spirv
OpName %A "A"
OpTypeForwardPointer %APtr Generic
%A = OpTypeStruct %APtr
%APtr = OpTypePointer Generic %A
```

In particular the following changes are included:
- SPIR-V structs can now be either identified or literal
  (i.e. non-identified).
- All structs now have their members surrounded by a ()-pair.
- For recursive references,
  (1) an OpTypeForwardPointer instruction is emitted before
  the OpTypeStruct instruction defining the recursive struct
  (2) an OpTypePointer instruction is emitted after the
  OpTypeStruct instruction which actually defines the recursive
  pointer to struct type.

Reviewed By: antiagainst, rriddle, ftynse

Differential Revision: https://reviews.llvm.org/D87206
2020-10-13 10:18:21 -04:00
Tres Popp 8178e41dc1 [mlir] Type erase inputs to select statements in shape.broadcast lowering.
This is required or broadcasting with operands of different ranks will lead to
failures as the select op requires both possible outputs and its output type to
be the same.

Differential Revision: https://reviews.llvm.org/D89134
2020-10-11 21:58:06 +02:00
Amara Emerson 322d0afd87 [llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.

The autoupgrader will handle legacy intrinsics.

Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html

Differential Revision: https://reviews.llvm.org/D88787
2020-10-07 10:36:44 -07:00
Thomas Raoux 6e557bc405 [mlir][spirv] Add Vector to SPIR-V conversion pass
Add conversion pass for Vector dialect to SPIR-V dialect and add some simple
conversion pattern for vector.broadcast, vector.insert, vector.extract.

Differential Revision: https://reviews.llvm.org/D88761
2020-10-06 11:53:23 -07:00
George Mitenkov b81bedf714 [MLIR][SPIRVToLLVM] Conversion for composite extract and insert
A pattern to convert `spv.CompositeInsert` and `spv.CompositeExtract`.
In LLVM, there are 2 ops that correspond to each instruction depending
on the container type. If the container type is a vector type, then
the result of conversion is `llvm.insertelement` or `llvm.extractelement`.
If the container type is an aggregate type (i.e. struct, array), the
result of conversion is `llvm.insertvalue` or `llvm.extractvalue`.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D88205
2020-10-06 11:46:25 +03:00
Christian Sigg 665371d0b2 [mlir] Split alloc-like op LLVM lowerings into base and separate derived classes.
The previous code did the lowering to alloca, malloc, and aligned_malloc
in a single class with different code paths that are somewhat difficult to
follow.

This change moves the common code to a base class and has a separte
derived class per lowering target that contains the specifics.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D88696
2020-10-05 17:36:01 +02:00
Benjamin Kramer 6e2b267d1c Promote transpose from linalg to standard dialect
While affine maps are part of the builtin memref type, there is very
limited support for manipulating them in the standard dialect. Add
transpose to the set of ops to complement the existing view/subview ops.
This is a metadata transformation that encodes the transpose into the
strides of a memref.

I'm planning to use this when lowering operations on strided memrefs,
using the transpose to remove the stride without adding a dependency on
linalg dialect.

Differential Revision: https://reviews.llvm.org/D88651
2020-10-05 10:58:20 +02:00
Diego Caballero a611f9a5c6 [mlir] Fix call op conversion in bare-ptr calling convention
We hit an llvm_unreachable related to unranked memrefs for call ops
with scalar types. Removing the llvm_unreachable since the conversion
should gracefully bail out in the presence of unranked memrefs. Adding
tests to verify that.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D88709
2020-10-02 08:48:21 -07:00
Jakub Lichman 0b17d4754a [mlir][Linalg] Tile sizes for Conv ops vectorization added as pass arguments
Current setup for conv op vectorization does not enable user to specify tile
sizes as well as dimensions for vectorization. In this commit we change that by
adding tile sizes as pass arguments. Every dimension with corresponding tile
size > 1 is automatically vectorized.

Differential Revision: https://reviews.llvm.org/D88533
2020-09-30 11:31:28 +00:00
Diego Caballero a89fc12653 [mlir] Support return and call ops in bare-ptr calling convention
This patch adds support for the 'return' and 'call' ops to the bare-ptr
calling convention. These changes also align the bare-ptr calling
convention code with the latest changes in the default calling convention
and reduce the amount of customization code needed.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D87724
2020-09-29 12:00:47 -07:00
Sean Silva a975be0e00 [mlir][shape] Make conversion passes more consistent.
- use select-ops to make the lowering simpler
- change style of FileCheck variables names to be consistent
- change some variable names in the code to be more explicit

Differential Revision: https://reviews.llvm.org/D88258
2020-09-28 14:55:42 -07:00