llvm-project/mlir/g3doc/ConversionToLLVMDialect.md

# Conversion to the LLVM IR Dialect

Conversion to the [LLVM IR Dialect](Dialects/LLVM.md) can be performed by the
specialized dialect conversion pass by running

```sh
mlir-opt -lower-to-llvm <filename.mlir>
```

It performs type and operation conversions for a subset of operations from
standard, built-in and super-vector dialects as described in this document. We
use the terminology defined by the
[LLVM IR Dialect description](Dialects/LLVM.md) throughout this document.

[TOC]

## Type Conversion

### Scalar Types

Scalar types are converted to their LLVM counterparts if they exist. The
following conversions are currently implemented.

-   `i*` converts to `!llvm.type<"i*">`
-   `f16` converts to `!llvm.type<"half">`
-   `f32` converts to `!llvm.type<"float">`
-   `f64` converts to `!llvm.type<"double">`

Note: `bf16` type is not supported by LLVM IR and cannot be converted.

### Index Type

Index type is converted to a wrapped LLVM IR integer with bitwidth equal to the
bitwidth of the pointer size as specified by the
[data layout](https://llvm.org/docs/LangRef.html#data-layout) of the LLVM module
[contained](Dialects/LLVM.md#context-and-module-association) in the LLVM Dialect
object. For example, on x86-64 CPUs it converts to `!llvm.type<"i64">`.

### Vector Types

LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
be multi-dimensional. MLIR vectors are converted to LLVM IR vectors of the same
size with element type converted using these conversion rules. Vector types
cannot be nested in either IR.

For example, `vector<4 x f32>` converts to `!llvm.type<"<4 x float>">`.

### Memref Types

Memref types in MLIR have both static and dynamic information associated with
them. The dynamic information comprises the buffer pointer as well as sizes of
any dynamically sized dimensions. Memref types are converted into either LLVM IR
pointer types if they are fully statically shaped; or to LLVM IR structure types
if they contain dynamic sizes. In the latter case, the first element of the
structure is a pointer to the converted (using these rules) memref element type,
followed by as many elements as the memref has dynamic sizes. The type of each
of these size arguments will be the LLVM type that results from converting the
MLIR `index` type. Zero-dimensional memrefs are treated as pointers to the
elemental type.

Examples:

```mlir {.mlir}
// All of the following are converted to just a pointer type because
// of fully static sizes.
memref<f32>
memref<1 x f32>
memref<10x42x42x43x123 x f32>
// resulting type
!llvm.type<"float*">

// All of the following are converted to a three-element structure
memref<?x? x f32>
memref<42x?x10x35x1x? x f32>
// resulting type assuming 64-bit pointers
!llvm.type<"{float*, i64, i64}">

// Memref types can have vectors as element types
memref<1x? x vector<4xf32>>
// which get converted as well
!llvm.type<"{<4 x float>*, i64}">
```

### Function Types

Function types get converted to LLVM function types. The arguments are converted
individually according to these rules. The result types need to accommodate the
fact that LLVM IR functions always have a return type, which may be a Void type.
The converted function always has a single result type. If the original function
type had no results, the converted function will have one result of the wrapped
`void` type. If the original function type had one result, the converted
function will have one result converted using these rules. Otherwise, the result
type will be a wrapped LLVM IR structure type where each element of the
structure corresponds to one of the results of the original function, converted
using these rules. In high-order functions, function-typed arguments and results
are converted to a wrapped LLVM IR function pointer type (since LLVM IR does not
allow passing functions to functions without indirection) with the pointee type
converted using these rules.

Examples:

```mlir {.mlir}
// zero-ary function type with no results.
() -> ()
// is converted to a zero-ary function with `void` result
!llvm.type<"void ()">

// unary function with one result
(i32) -> (i64)
// has its argument and result type converted, before creating the LLVM IR function type
!llvm.type<"i64 (i32)">

// binary function with one result
(i32, f32) -> (i64)
// has its arguments handled separately
!llvm.type<"i64 (i32, float)">

// binary function with two results
(i32, f32) -> (i64, f64)
// has its result aggregated into a structure type
!llvm.type<"{i64, double} (i32, f32)">

// function-typed arguments or results in higher-order functions
(() -> ()) -> (() -> ())
// are converted into pointers to functions
!llvm.type<"void ()* (void ()*)">
```

## Calling Convention

### Function Signature Conversion

MLIR function type is built into the representation, even the functions in
dialects including a first-class function type must have the built-in MLIR
function type. During the conversion to LLVM IR, function signatures are
converted as follows:

-   the outer type remains the built-in MLIR function;
-   function arguments are converted individually following these rules;
-   function results:
    -   zero-result functions remain zero-result;
    -   single-result functions have their result type converted according to
        these rules;
    -   multi-result functions have a single result type of the wrapped LLVM IR
        structure type with elements corresponding to the converted original
        results.

Rationale: function definitions remain analyzable within MLIR without having to
abstract away the function type. In order to remain consistent with the regular
MLIR functions, we do not introduce a `void` result type since we cannot create
a value of `void` type that MLIR passes might expect to be returned from a
function.

Examples:

```mlir {.mlir}
// zero-ary function type with no results.
func @foo() -> ()
// remains as is
func @foo() -> ()

// unary function with one result
func @bar(i32) -> (i64)
// has its argument and result type converted
func @bar(!llvm.type<"i32">) -> !llvm.type<"i64">

// binary function with one result
func @baz(i32, f32) -> (i64)
// has its arguments handled separately
func @baz(!llvm.type<"i32">, !llvm.type<"float">) -> !llvm.type<"i64">

// binary function with two results
func @qux(i32, f32) -> (i64, f64)
// has its result aggregated into a structure type
func @qux(!llvm.type<"i32">, !llvm.type<"float">) -> !llvm.type<"{i64, double}">

// function-typed arguments or results in higher-order functions
func @quux(() -> ()) -> (() -> ())
// are converted into pointers to functions
func @quux(!llvm.type<"void ()*">) -> !llvm.type<"void ()*">
// the call flow is handled by the LLVM dialect `call` operation supporting both
// direct and indirect calls
```

### Result Packing

In case of multi-result functions, the returned values are inserted into a
structure-typed value before being returned and extracted from it at the call
site. This transformation is a part of the conversion and is transparent to the
defines and uses of the values being returned.

Example:

```mlir {.mlir}
func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
  return %arg0, %arg1 : i32, i64
}
func @bar() {
  %0 = constant 42 : i32
  %1 = constant 17 : i64
  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
  "use_i32"(%2#0) : (i32) -> ()
  "use_i64"(%2#1) : (i64) -> ()
}

// is transformed into

func @foo(%arg0: !llvm.type<"i32">, %arg1: !llvm.type<"i64">) -> !llvm.type<"{i32, i64}"> {
  // insert the vales into a structure
  %0 = llvm.undef :  !llvm.type<"{i32, i64}">
  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.type<"{i32, i64}">
  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.type<"{i32, i64}">

  // return the structure value
  llvm.return %2 : !llvm.type<"{i32, i64}">
}
func @bar() {
  %0 = llvm.constant(42 : i32) : !llvm.type<"i32">
  %1 = llvm.constant(17) : !llvm.type<"i64">

  // call and extract the values from the structure
  %2 = llvm.call @bar(%0, %1) : (%arg0: !llvm.type<"i32">, %arg1: !llvm.type<"i64">) -> !llvm.type<"{i32, i64}">
  %3 = llvm.extractvalue %2[0] : !llvm.type<"{i32, i64}">
  %4 = llvm.extractvalue %2[1] : !llvm.type<"{i32, i64}">

  // use as before
  "use_i32"(%3) : (!llvm.type<"i32">) -> ()
  "use_i64"(%4) : (!llvm.type<"i64">) -> ()
}
```

## Repeated Successor Removal

Since the goal of the LLVM IR dialect is to reflect LLVM IR in MLIR, the dialect
and the conversion procedure must account for the differences between block
arguments and LLVM IR PHI nodes. In particular, LLVM IR disallows PHI nodes with
different values coming from the same source. Therefore, the LLVM IR dialect
disallows operations that have identical successors accepting arguments, which
would lead to invalid PHI nodes. The conversion process resolves the potential
PHI source ambiguity by injecting dummy blocks if the same block is used more
than once as a successor in an instruction. These dummy blocks branch
unconditionally to the original successors, pass them the original operands
(available in the dummy block because it is dominated by the original block) and
are used instead of them in the original terminator operation.

Example:

```mlir {.mlir}
  cond_br %0, ^bb1(%1 : i32), ^bb1(%2 : i32)
^bb1(%3 : i32)
  "use"(%3) : (i32) -> ()
```

leads to a new basic block being inserted,

```mlir {.mlir}
  cond_br %0, ^bb1(%1 : i32), ^dummy
^bb1(%3 : i32):
  "use"(%3) : (i32) -> ()
^dummy:
  br ^bb1(%4 : i32)
```

before the conversion to the LLVM IR dialect:

```mlir {.mlir}
  llvm.cond_br  %0, ^bb1(%1 : !llvm.type<"i32">), ^dummy
^bb1(%3 : !llvm.type<"i32">):
  "use"(%3) : (!llvm.type<"i32">) -> ()
^dummy:
  llvm.br ^bb1(%2 : !llvm.type<"i32">)
```

## Memref Model

### Memref Descriptor

Within a converted function, a `memref`-typed value is represented by a memref
_descriptor_, the type of which is the structure type obtained by converting
from the memref type. This descriptor holds a pointer to a linear buffer storing
the data, and dynamic sizes of the memref value. It is created by the allocation
operation and is updated by the conversion operations that may change static
dimensions into dynamic and vice versa.

Note: LLVM IR conversion does not support `memref`s in non-default memory spaces
or `memref`s with non-identity layouts.

### Index Linearization

Accesses to a memref element are transformed into an access to an element of the
buffer pointed to by the descriptor. The position of the element in the buffer
is calculated by linearizing memref indices in row-major order (lexically first
index is the slowest varying, similar to C). The computation of the linear
address is emitted as arithmetic operation in the LLVM IR dialect. Static sizes
are introduced as constants. Dynamic sizes are extracted from the memref
descriptor.

Accesses to zero-dimensional memref (that are interpreted as pointers to the
elemental type) are directly converted into `llvm.load` or `llvm.store` without
any pointer manipulations.

Examples:

An access to a zero-dimensional memref is converted into a plain load:

```mlir {.mlir}
// before
%0 = load %m[] : memref<f32>

// after
%0 = llvm.load %m : !llvm.type<"float*">
```

An access to a memref with indices:

```mlir {.mlir}
%0 = load %m[1,2,3,4] : memref<10x?x13x?xf32>
```

is transformed into the equivalent of the following code:

```mlir {.mlir}
// obtain the buffer pointer
%b = llvm.extractvalue %m[0] : !llvm.type<"{float*, i64, i64}">

// obtain the components for the index
%sub1 = llvm.constant(1) : !llvm.type<"i64">  // first subscript
%sz2 = llvm.extractvalue %m[1]
    : !llvm.type<"{float*, i64, i64}"> // second size (dynamic, second descriptor element)
%sub2 = llvm.constant(2) : !llvm.type<"i64">  // second subscript
%sz3 = llvm.constant(13) : !llvm.type<"i64">  // third size (static)
%sub3 = llvm.constant(3) : !llvm.type<"i64">  // third subscript
%sz4 = llvm.extractvalue %m[1]
    : !llvm.type<"{float*, i64, i64}"> // fourth size (dynamic, third descriptor element)
%sub4 = llvm.constant(4) : !llvm.type<"i64">  // fourth subscript

// compute the linearized index
// %sub4 + %sub3 * %sz4 + %sub2 * (%sz3 * %sz4) + %sub1 * (%sz2 * %sz3 * %sz4) =
// = ((%sub1 * %sz2 + %sub2) * %sz3 + %sub3) * %sz4 + %sub4
%idx0 = llvm.mul %sub1, %sz2 : !llvm.type<"i64">
%idx1 = llvm.add %idx0, %sub : !llvm.type<"i64">
%idx2 = llvm.mul %idx1, %sz3 : !llvm.type<"i64">
%idx3 = llvm.add %idx2, %sub3 : !llvm.type<"i64">
%idx4 = llvm.mul %idx3, %sz4 : !llvm.type<"i64">
%idx5 = llvm.add %idx4, %sub4 : !llvm.type<"i64">

// obtain the element address
%a = llvm.getelementptr %b[%idx5] : (!llvm.type<"float*">, !llvm.type<"i64">) -> !llvm.type<"float*">

// perform the actual load
%0 = llvm.load %a : !llvm.type<"float*">
```

In practice, the subscript and size extraction will be interleaved with the
linear index computation. For stores, the address computation code is identical
and only the actual store operation is different.

Note: the conversion does not perform any sort of common subexpression
elimination when emitting memref accesses.