# Conversion to the LLVM Dialect

Conversion from several dialects that rely on
[built-in types](LangRef.md#builtin-types) to the
[LLVM Dialect](Dialects/LLVM.md) is expected to be performed through the
[Dialect Conversion](DialectConversion.md) infrastructure.

The conversion of types and that of the overall module structure is described in
this document. Individual conversion passes provide a set of conversion patterns
for ops in different dialects, such as `-convert-std-to-llvm` for ops in the
[Standard dialect](Dialects/Standard.md) and `-convert-vector-to-llvm` in the
[Vector dialect](Dialects/Vector.md). *Note that some conversions subsume the
others.*

We use the terminology defined by the
[LLVM Dialect description](Dialects/LLVM.md) throughout this document.

[TOC]
## Type Conversion

### Scalar Types

Scalar types are converted to their LLVM counterparts if they exist. The
following conversions are currently implemented:

- `i*` converts to `!llvm.i*`
- `bf16` converts to `!llvm.bfloat`
- `f16` converts to `!llvm.half`
- `f32` converts to `!llvm.float`
- `f64` converts to `!llvm.double`
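
The scalar rules listed above amount to a small lookup table. The following is a minimal sketch that models them on type strings; `convert_scalar` and `FLOAT_TYPES` are hypothetical names introduced for illustration and are not part of any MLIR API.

```python
import re

# Floating-point mappings copied from the list of implemented conversions.
FLOAT_TYPES = {
    "bf16": "!llvm.bfloat",
    "f16": "!llvm.half",
    "f32": "!llvm.float",
    "f64": "!llvm.double",
}


def convert_scalar(mlir_type: str) -> str:
    """Return the LLVM dialect spelling of a built-in scalar type.

    Hypothetical helper: it only models the rules as strings.
    """
    if mlir_type in FLOAT_TYPES:
        return FLOAT_TYPES[mlir_type]
    # `i*` stands for any integer type such as i1, i32 or i64.
    if re.fullmatch(r"i[1-9][0-9]*", mlir_type):
        return "!llvm." + mlir_type
    # Types without a listed counterpart are rejected.
    raise ValueError(f"no LLVM counterpart listed for {mlir_type!r}")


print(convert_scalar("f32"))
print(convert_scalar("i32"))
```

Types that are not in the list, such as `index`, are rejected by this sketch because they follow separate rules.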
### Index Type

The index type is converted to an LLVM dialect integer type whose bitwidth
equals the pointer bitwidth specified by the
[data layout](Dialects/LLVM.md#data-layout-and-triple) of the closest module.
For example, on x86-64 CPUs it converts to `i64`. This behavior can be
overridden by the type converter configuration, which is often exposed as a pass
option by conversion passes.
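
As a rough illustration of this rule, the sketch below picks the integer type for `index` from a pointer bitwidth and an optional override. The function and both parameter names are assumptions made for this example; the real type converter reads the data layout and its own options.

```python
from typing import Optional


def convert_index(pointer_bitwidth: int, override: Optional[int] = None) -> str:
    """Pick the integer type that `index` converts to (hypothetical helper).

    `pointer_bitwidth` stands in for the value read from the data layout;
    `override` models a type converter option that replaces it.
    """
    width = override if override is not None else pointer_bitwidth
    if width <= 0:
        raise ValueError("bitwidth must be positive")
    return f"i{width}"


print(convert_index(64))               # 64-bit pointers, as on x86-64
print(convert_index(64, override=32))  # explicit override wins
```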
### Vector Types

LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
be multi-dimensional. Vector types cannot be nested in either IR. In the
one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the same
size with element type converted using these conversion rules. In the
n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array types
of one-dimensional vectors.

For example, `vector<4 x f32>` converts to `!llvm.vec<4 x float>` and
`vector<4 x 8 x 16 x f32>` converts to
`!llvm.array<4 x array<8 x vec<16 x float>>>`.
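
The nesting rule can be sketched as a loop over the vector shape. This is an illustrative model on type strings, assuming the element type has already been converted; `convert_vector` is a hypothetical name.

```python
from typing import Sequence


def convert_vector(shape: Sequence[int], element: str) -> str:
    """Spell the LLVM dialect type for an MLIR vector (hypothetical helper).

    `element` is the already-converted element type, e.g. "float" for f32.
    """
    if not shape:
        raise ValueError("vectors need at least one dimension")
    # The innermost dimension becomes a one-dimensional LLVM vector.
    result = f"vec<{shape[-1]} x {element}>"
    # Each outer dimension wraps the result in an array, innermost first.
    for dim in reversed(shape[:-1]):
        result = f"array<{dim} x {result}>"
    return "!llvm." + result


print(convert_vector([4], "float"))
print(convert_vector([4, 8, 16], "float"))
```

The two calls reproduce the two examples given in the text.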
### Ranked Memref Types

Memref types in MLIR have both static and dynamic information associated with
them. In the general case, the dynamic information describes dynamic sizes in
the logical indexing space and any symbols bound to the memref. This dynamic
information must be present at runtime in the LLVM dialect equivalent type.

In practice, the conversion supports two conventions:

- the default convention for memrefs in the
  **[strided form](LangRef.md#strided-memref)**;
- a "bare pointer" conversion for statically-shaped memrefs with default
  layout.

The choice between conventions is specified at type converter construction time
and is often exposed as an option by conversion passes.

Memrefs with arbitrary layouts are not supported. Instead, these layouts can be
factored out of the type and used as part of index computation for operations
that read and write into a memref with the default layout.
#### Default Convention

The dynamic information comprises the buffer pointer as well as sizes and
strides of any dynamically-sized dimensions. Memref types are normalized and
converted to a _descriptor_ that is only dependent on the rank of the memref.
The descriptor contains the following fields in order:

1. The pointer to the data buffer as allocated, referred to as "allocated
   pointer". This is only useful for deallocating the memref.
2. The pointer to the properly aligned data that the memref indexes,
   referred to as "aligned pointer".
3. A converted `index`-type integer containing the distance in number
   of elements between the beginning of the (aligned) buffer and the first
   element to be accessed through the memref, referred to as "offset".
4. An array containing as many converted `index`-type integers as the rank of
   the memref: the array represents the size, in number of elements, of the
   memref along the given dimension. For constant memref dimensions, the
   corresponding size entry is a constant whose runtime value must match the
   static value.
5. A second array containing as many converted `index`-type integers as the
   rank of the memref: the second array represents the "stride" (in tensor
   abstraction sense), i.e. the number of consecutive elements of the
   underlying buffer one needs to jump over to get to the next logically
   indexed element.

This normalization serves as an ABI for the memref type to interoperate with
externally linked functions. In the particular case of rank `0` memrefs, the
size and stride arrays are omitted, resulting in a struct containing two
pointers and an offset.

Examples:

```mlir
memref<f32> -> !llvm.struct<(ptr<float>, ptr<float>, i64)>
memref<1 x f32> -> !llvm.struct<(ptr<float>, ptr<float>, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<? x f32> -> !llvm.struct<(ptr<float>, ptr<float>, i64,
                                 array<1 x i64>, array<1 x i64>)>
memref<10x42x42x43x123 x f32> -> !llvm.struct<(ptr<float>, ptr<float>, i64,
                                               array<5 x i64>, array<5 x i64>)>
memref<10x?x42x?x123 x f32> -> !llvm.struct<(ptr<float>, ptr<float>, i64,
                                             array<5 x i64>, array<5 x i64>)>

// Memref types can have vectors as element types
memref<1x? x vector<4xf32>> -> !llvm.struct<(ptr<vec<4 x float>>,
                                             ptr<vec<4 x float>>, i64,
                                             array<2 x i64>, array<2 x i64>)>
```
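
To make the field layout concrete, here is a small sketch that spells the descriptor type for a given rank and computes which buffer element a set of indices refers to. Both helpers are hypothetical and work on plain strings and integers; the `i64` default assumes a 64-bit `index` as on x86-64.

```python
from typing import Sequence


def memref_descriptor(rank: int, element: str, index: str = "i64") -> str:
    """Spell the descriptor struct for a ranked memref (hypothetical helper).

    `element` is the converted element type; `index` is the converted
    `index` type, assumed here to be i64.
    """
    if rank < 0:
        raise ValueError("rank must be non-negative")
    # Fields 1-3: allocated pointer, aligned pointer, offset.
    fields = [f"ptr<{element}>", f"ptr<{element}>", index]
    if rank > 0:
        # Fields 4-5: a size array and a stride array with `rank` entries each.
        fields += [f"array<{rank} x {index}>"] * 2
    return "!llvm.struct<(" + ", ".join(fields) + ")>"


def element_position(offset: int, strides: Sequence[int],
                     indices: Sequence[int]) -> int:
    """Position, in elements from the aligned pointer, of the indexed element."""
    if len(strides) != len(indices):
        raise ValueError("one index per dimension is required")
    return offset + sum(i * s for i, s in zip(indices, strides))


print(memref_descriptor(0, "float"))
print(memref_descriptor(5, "float"))
# A 10x42 row-major buffer has strides [42, 1].
print(element_position(0, [42, 1], [2, 3]))
```

Because the struct depends only on the rank and the element type, the static and dynamic five-dimensional memrefs in the examples share one descriptor type.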
#### Bare Pointer Convention

Ranked memrefs with static shape and default layout can be converted into an
LLVM dialect pointer to their element type. Only the default alignment is
supported in such cases, e.g. the `alloc` operation cannot have an alignment
attribute.

Examples:

```mlir
memref<f32> -> !llvm.ptr<float>
memref<10x42 x f32> -> !llvm.ptr<float>

// Memrefs with vector types are also supported.
memref<10x42 x vector<4xf32>> -> !llvm.ptr<vec<4 x float>>
```
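
The static-shape restriction can be modeled as a simple check. This is a sketch under stated assumptions: `bare_pointer` is a hypothetical helper, `None` stands for a dynamic (`?`) dimension, and layout and alignment are not modeled.

```python
from typing import Optional, Sequence


def bare_pointer(shape: Sequence[Optional[int]], element: str) -> str:
    """Convert a statically shaped memref to a pointer (hypothetical helper).

    `None` marks a dynamic dimension; `element` is the converted element type.
    """
    if any(dim is None for dim in shape):
        raise ValueError("bare pointer convention requires a static shape")
    # The shape is dropped entirely: only the element type survives.
    return f"!llvm.ptr<{element}>"


print(bare_pointer([], "float"))
print(bare_pointer([10, 42], "vec<4 x float>"))
```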
### Unranked Memref Types

Unranked memrefs are converted to an unranked descriptor that contains:

1. a converted `index`-typed integer representing the dynamic rank of the
   memref;
2. a type-erased pointer (`!llvm.ptr<i8>`) to a ranked memref descriptor with
   the contents listed above.

This descriptor is primarily intended for interfacing with rank-polymorphic
library functions. The pointer to the ranked memref descriptor points to memory
_allocated on the stack_ of the function in which it is used.

Note that stack allocations may be emitted at a location where the unranked
memref first appears, e.g., a cast operation, and remain live throughout the
lifetime of the function; this may lead to stack exhaustion if used in a loop.

Examples:

```mlir
// Unranked descriptor.
memref<*xf32> -> !llvm.struct<(i64, ptr<i8>)>
```

The bare pointer convention does not support unranked memrefs.
### Function Types

Function types get converted to LLVM dialect function types. The arguments are
converted individually according to these rules, except for `memref` types in
function arguments and higher-order functions, which are described below. The
result types need to accommodate the fact that LLVM functions always have a
return type, which may be an `!llvm.void` type. The converted function always
has a single result type. If the original function type had no results, the
converted function will have one result of the `!llvm.void` type. If the
original function type had one result, the converted function will also have one
result converted using these rules. Otherwise, the result type will be an LLVM
dialect structure type where each element of the structure corresponds to one of
the results of the original function, converted using these rules.

Examples:

```mlir
// Zero-ary function type with no results:
() -> ()
// is converted to a zero-ary function with `void` result.
!llvm.func<void ()>

// Unary function with one result:
(i32) -> (i64)
// has its argument and result type converted, before creating the LLVM dialect
// function type.
!llvm.func<i64 (i32)>

// Binary function with one result:
(i32, f32) -> (i64)
// has its arguments handled separately
!llvm.func<i64 (i32, float)>

// Binary function with two results:
(i32, f32) -> (i64, f64)
// has its result aggregated into a structure type.
!llvm.func<struct<(i64, double)> (i32, float)>
```
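
The three result cases (none, one, several) can be sketched as follows. The helper is hypothetical and takes argument and result types that are assumed to be converted already; memref and function-typed arguments are not handled here.

```python
from typing import Sequence


def convert_function_type(args: Sequence[str], results: Sequence[str]) -> str:
    """Build the LLVM dialect function type string (hypothetical helper).

    `args` and `results` hold already-converted types such as "i32" or "float".
    """
    if len(results) == 0:
        # LLVM functions always have a return type; use void when empty.
        result = "void"
    elif len(results) == 1:
        result = results[0]
    else:
        # Several results are packed into one structure.
        result = "struct<(" + ", ".join(results) + ")>"
    return f"!llvm.func<{result} ({', '.join(args)})>"


print(convert_function_type([], []))
print(convert_function_type(["i32", "float"], ["i64", "double"]))
```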
#### Functions as Function Arguments or Results

Higher-order function types, i.e. types of functions that have other functions
as arguments or results, are converted differently to accommodate the fact that
LLVM IR does not allow for function-typed values. Instead, functions are
expected to be passed into and returned from other functions _by pointer_.
Therefore, function-typed function arguments and results are converted to
pointer-to-the-function type. The pointee type is converted using these rules.

Examples:

```mlir
// Function-typed arguments or results in higher-order functions:
(() -> ()) -> (() -> ())
// are converted into pointers to functions.
!llvm.func<ptr<func<void ()>> (ptr<func<void ()>>)>

// These rules apply recursively: a function type taking a function that takes
// another function
( ( (i32) -> (i64) ) -> () ) -> ()
// is converted into a function type taking a pointer-to-function that takes
// another pointer-to-function.
!llvm.func<void (ptr<func<void (ptr<func<i64 (i32)>>)>>)>
```
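
The recursion can be sketched with a small type model. `Func` and `convert` are hypothetical names introduced for this example; scalar types are passed as strings that are assumed to be converted already.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Func:
    """A function type: tuples of argument and result types."""
    args: Tuple["Type", ...] = ()
    results: Tuple["Type", ...] = ()


Type = Union[str, Func]


def convert(ty: Type, nested: bool = False) -> str:
    """Convert a type; nested function types become pointers to functions."""
    if isinstance(ty, str):
        return ty  # scalar, assumed to be converted already
    args = ", ".join(convert(a, nested=True) for a in ty.args)
    results = [convert(r, nested=True) for r in ty.results]
    if not results:
        result = "void"
    elif len(results) == 1:
        result = results[0]
    else:
        result = "struct<(" + ", ".join(results) + ")>"
    func = f"func<{result} ({args})>"
    # Only the outermost type keeps function type; inner ones are wrapped.
    return f"ptr<{func}>" if nested else "!llvm." + func


print(convert(Func(args=(Func(),), results=(Func(),))))
inner = Func(args=("i32",), results=("i64",))
print(convert(Func(args=(Func(args=(inner,)),))))
```

The two calls correspond to the two examples in the block above.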
#### Memrefs as Function Arguments

When used as function arguments, both ranked and unranked memrefs are converted
into a list of arguments that represents each _scalar_ component of their
descriptor. This is intended for some compatibility with the C ABI, in which
structure types would need to be passed by pointer, leading to the need for
allocations and related issues, as well as for aliasing annotations, which are
currently attached to pointers in function arguments. Having scalar components
means that each size and stride is passed as an individual value.

When used as function results, memrefs are converted as usual, i.e. each memref
is converted to a descriptor struct (default convention) or to a pointer (bare
pointer convention).

Examples:

```mlir
// A memref descriptor appearing as function argument:
(memref<f32>) -> ()
// gets converted into a list of individual scalar components of a descriptor.
!llvm.func<void (ptr<float>, ptr<float>, i64)>

// The list of arguments is linearized and one can freely mix memref and other
// types in this list:
(memref<f32>, f32) -> ()
// which gets converted into a flat list.
!llvm.func<void (ptr<float>, ptr<float>, i64, float)>

// For nD ranked memref descriptors:
(memref<?x?xf32>) -> ()
// the converted signature will contain 2n+1 `index`-typed integer arguments,
// offset, n sizes and n strides, per memref argument type.
!llvm.func<void (ptr<float>, ptr<float>, i64, i64, i64, i64, i64)>

// Same rules apply to unranked descriptors:
(memref<*xf32>) -> ()
// which get converted into their components.
!llvm.func<void (i64, ptr<i8>)>

// However, returning a memref from a function is not affected:
() -> (memref<?xf32>)
// gets converted to a function returning a descriptor structure.
!llvm.func<struct<(ptr<float>, ptr<float>, i64, array<1xi64>, array<1xi64>)> ()>

// If multiple memref-typed results are returned:
() -> (memref<f32>, memref<f64>)
// their descriptor structures are additionally packed into another structure,
// potentially with other non-memref typed results.
!llvm.func<struct<(struct<(ptr<float>, ptr<float>, i64)>,
                   struct<(ptr<double>, ptr<double>, i64)>)> ()>
```
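
The flattening of memref arguments can be sketched as below. This models only the default convention for functions without results; the helper names and the tuple encoding of memrefs (`(rank, element)`, with `None` as the rank of an unranked memref) are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple, Union

# (rank, converted element type); a rank of None marks an unranked memref.
Memref = Tuple[Optional[int], str]


def flatten_memref_argument(rank: Optional[int], element: str,
                            index: str = "i64") -> List[str]:
    """List the scalar components passed for one memref argument."""
    if rank is None:
        # Unranked descriptor: dynamic rank plus a type-erased pointer.
        return [index, "ptr<i8>"]
    # Two pointers, then the offset, `rank` sizes and `rank` strides.
    return [f"ptr<{element}>"] * 2 + [index] * (2 * rank + 1)


def convert_signature(args: Sequence[Union[str, Memref]]) -> str:
    """Build the converted type of a function with no results."""
    flat: List[str] = []
    for arg in args:
        if isinstance(arg, tuple):
            flat += flatten_memref_argument(*arg)
        else:
            flat.append(arg)  # non-memref type, assumed converted already
    return f"!llvm.func<void ({', '.join(flat)})>"


print(convert_signature([(0, "float"), "float"]))
print(convert_signature([(2, "float")]))
print(convert_signature([(None, "float")]))
```

For a rank-2 memref this yields two pointers followed by five `i64` values, matching the 2n+1 count stated in the example comments.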