# Symbols and Symbol Tables

[TOC]

With [Regions](LangRef.md#regions), the multi-level aspect of MLIR is structural
in the IR. A lot of infrastructure within the compiler is built around this
nesting structure, including the processing of operations within the
[pass manager](PassManagement.md#pass-manager). One advantage of the MLIR design
is that it is able to process operations in parallel, utilizing multiple
threads. This is possible due to a property of the IR known as
[`IsolatedFromAbove`](Traits.md#isolatedfromabove).

Without this property, any operation could affect or mutate the use-list of
operations defined above. Making this thread-safe requires expensive locking in
some of the core IR data structures, which becomes quite inefficient. To enable
multi-threaded compilation without this locking, MLIR uses local pools for
constant values as well as `Symbol` accesses for global values and variables.
This document details the design of `Symbol`s, what they are and how they fit
into the system.

The `Symbol` infrastructure provides a non-SSA mechanism for referring to an
operation symbolically by name. This makes it safe to refer to operations
defined above regions that are `IsolatedFromAbove`. It also allows for
symbolically referencing operations defined below other regions.

## Symbol

A `Symbol` is a named operation that resides immediately within a region that
defines a [`SymbolTable`](#symbol-table). The name of a symbol *must* be unique
within the parent `SymbolTable`. This name is semantically similar to an SSA
result value, and may be referred to by other operations to provide a symbolic
link, or use, to the symbol. An example of a `Symbol` operation is
[`func`](LangRef.md#functions). `func` defines a symbol name, which is
[referred to](#referencing-a-symbol) by operations like
[`std.call`](Dialects/Standard.md#call).

### Defining a Symbol

A `Symbol` operation should use the `SymbolOpInterface` interface to provide the
necessary verification and accessors. This interface also supports operations,
such as `module`, that only conditionally define a symbol. `Symbol`s must have
the following properties:

*   A `StringAttr` attribute named
    `SymbolTable::getSymbolAttrName()` (`sym_name`).
    -   This attribute defines the symbolic 'name' of the operation.
*   An optional `StringAttr` attribute named
    `SymbolTable::getVisibilityAttrName()` (`sym_visibility`).
    -   This attribute defines the [visibility](#symbol-visibility) of the
        symbol, or more specifically in which scopes it may be accessed.
*   No SSA results
    -   Intermixing the different ways to `use` an operation quickly becomes
        unwieldy and difficult to analyze.

## Symbol Table

Described above are `Symbol`s, which reside within a region of an operation
defining a `SymbolTable`. A `SymbolTable` operation provides the container for
the [`Symbol`](#symbol) operations. It verifies that all `Symbol` operations
have a unique name, and provides facilities for looking up symbols by name.
Operations defining a `SymbolTable` must use the `OpTrait::SymbolTable` trait.
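
The two responsibilities described above, uniqueness checking and lookup by
name, can be sketched with a small toy model. This is not the MLIR C++ API:
the `Op` class and `build_symbol_table` function are hypothetical names
introduced only for illustration, and attributes are modeled as a plain
dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Op:
    """Toy operation; a stand-in for illustration, not MLIR's `Operation`."""
    name: str                                        # e.g. "func" or "module"
    attrs: Dict[str, object] = field(default_factory=dict)
    body: List["Op"] = field(default_factory=list)   # ops directly in its region


def build_symbol_table(table_op: Op) -> Dict[str, Op]:
    """Map symbol names to the ops immediately nested within `table_op`.

    Raises ValueError if two symbols share a name, mirroring the uniqueness
    requirement of a symbol table.
    """
    table: Dict[str, Op] = {}
    for op in table_op.body:
        sym_name = op.attrs.get("sym_name")
        if sym_name is None:
            continue  # Not a symbol, so the table does not track it.
        if sym_name in table:
            raise ValueError(f"redefinition of symbol named '{sym_name}'")
        table[sym_name] = op
    return table
```

Only operations directly within the table's region are considered; symbols in
nested symbol tables belong to those tables instead.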

### Referencing a Symbol

`Symbol`s are referenced symbolically by name via the
[`SymbolRefAttr`](LangRef.md#symbol-reference-attribute) attribute. A symbol
reference attribute contains a named reference to an operation that is nested
within a symbol table. It may optionally contain a set of nested references that
further resolve to a symbol nested within a different symbol table. When
resolving a nested reference, each non-leaf reference must refer to a symbol
operation that is also a [symbol table](#symbol-table).

Below is an example of how an operation can reference a symbol operation:

```mlir
// This `func` operation defines a symbol named `symbol`.
func @symbol()

// Our `foo.user` operation contains a SymbolRefAttr with the name of the
// `symbol` func.
"foo.user"() {uses = [@symbol]} : () -> ()

// Symbol references resolve to the nearest parent operation that defines a
// symbol table, so we can have references with arbitrary nesting levels.
func @other_symbol() {
  affine.for %i0 = 0 to 10 {
    // Our `foo.user` operation resolves to the same `symbol` func as defined
    // above.
    "foo.user"() {uses = [@symbol]} : () -> ()
  }
  return
}

// Here we define a nested symbol table. References within this operation will
// not resolve to any symbols defined above.
module {
  // Error. We resolve references with respect to the closest parent operation
  // that defines a symbol table, so this reference can't be resolved.
  "foo.user"() {uses = [@symbol]} : () -> ()
}

// Here we define another nested symbol table, except this time it also defines
// a symbol.
module @module_symbol {
  // This `func` operation defines a symbol named `nested_symbol`.
  func @nested_symbol()
}

// Our `foo.user` operation may refer to the nested symbol, by resolving through
// the parent.
"foo.user"() {uses = [@module_symbol::@nested_symbol]} : () -> ()
```
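
The resolution rules illustrated above can also be expressed as a short
algorithm. The following is a minimal sketch over a toy IR, not the MLIR
implementation: `Op`, `lookup_in`, `nearest_symbol_table`, and `resolve` are
hypothetical names, and a reference such as `@module_symbol::@nested_symbol` is
modeled as the list `["module_symbol", "nested_symbol"]`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass(eq=False)
class Op:
    """Toy operation; a stand-in for illustration, not MLIR's `Operation`."""
    name: str
    attrs: Dict[str, object] = field(default_factory=dict)
    body: List["Op"] = field(default_factory=list)
    is_symbol_table: bool = False
    parent: Optional["Op"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.body:
            child.parent = self


def lookup_in(table_op: Op, sym_name: str) -> Optional[Op]:
    """Find a symbol immediately nested within `table_op`."""
    for op in table_op.body:
        if op.attrs.get("sym_name") == sym_name:
            return op
    return None


def nearest_symbol_table(op: Op) -> Optional[Op]:
    """Walk up the parents of `op` to the closest symbol table."""
    current = op.parent
    while current is not None and not current.is_symbol_table:
        current = current.parent
    return current


def resolve(user: Op, path: Sequence[str]) -> Optional[Op]:
    """Resolve a (possibly nested) reference made by `user`, or return None."""
    table = nearest_symbol_table(user)
    if table is None or not path:
        return None
    op = None
    for index, sym_name in enumerate(path):
        op = lookup_in(table, sym_name)
        if op is None:
            return None
        if index != len(path) - 1:
            # Every non-leaf reference must itself be a symbol table.
            if not op.is_symbol_table:
                return None
            table = op
    return op


# Rebuild the structure of the MLIR example above.
symbol = Op("func", {"sym_name": "symbol"})
loop_user = Op("foo.user")
other = Op("func", {"sym_name": "other_symbol"},
           [Op("affine.for", body=[loop_user])])
isolated_user = Op("foo.user")
anonymous = Op("module", body=[isolated_user], is_symbol_table=True)
nested = Op("func", {"sym_name": "nested_symbol"})
named = Op("module", {"sym_name": "module_symbol"}, [nested],
           is_symbol_table=True)
top_user = Op("foo.user")
top = Op("module", body=[symbol, other, anonymous, named, top_user],
         is_symbol_table=True)

assert resolve(loop_user, ["symbol"]) is symbol
assert resolve(isolated_user, ["symbol"]) is None
assert resolve(top_user, ["module_symbol", "nested_symbol"]) is nested
```

The lookup never continues past the nearest symbol table, which is why the
reference inside the unnamed `module` fails to resolve.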

Using an attribute, as opposed to an SSA value, has several benefits:

*   References may appear in more places than the operand list, including
    [nested attribute dictionaries](LangRef.md#dictionary-attribute),
    [array attributes](LangRef.md#array-attribute), etc.

*   Handling of SSA dominance remains unchanged.

    -   If we were to use SSA values, we would need to create some mechanism in
        which to opt-out of certain properties of it such as dominance.
        Attributes allow for referencing the operations regardless of the
        order in which they were defined.
    -   Attributes simplify referencing operations within nested symbol tables,
        which are traditionally not visible outside of the parent region.

The impact of this choice to use attributes as opposed to SSA values is that we
now have two mechanisms with which to reference operations. This means that some
dialects must either support both `SymbolRefs` and SSA value references, or
provide operations that materialize SSA values from a symbol reference. Each has
different trade-offs depending on the situation. A function call may directly
use a `SymbolRef` as the callee, whereas a reference to a global variable might
use a materialization operation so that the variable can be used in other
operations like `std.addi`.
[`llvm.mlir.addressof`](Dialects/LLVM.md#llvmmliraddressof) is one example of
such an operation.

See the `LangRef` definition of the
[`SymbolRefAttr`](LangRef.md#symbol-reference-attribute) for more information
about the structure of this attribute.

Operations that reference a `Symbol` and want to perform verification and
general mutation of the symbol should implement the `SymbolUserOpInterface` to
ensure that symbol accesses are legal and efficient.

### Manipulating a Symbol

As described above, `SymbolRefs` act as an auxiliary way of defining uses of
operations, alongside the traditional SSA use-list. As such, it is imperative to provide
similar functionality to manipulate and inspect the list of uses and the users.
The following are a few of the utilities provided by the `SymbolTable`:

*   `SymbolTable::getSymbolUses`

    -   Access an iterator range over all of the uses on and nested within a
        particular operation.

*   `SymbolTable::symbolKnownUseEmpty`

    -   Check if a particular symbol is known to be unused within a specific
        section of the IR.

*   `SymbolTable::replaceAllSymbolUses`

    -   Replace all of the uses of one symbol with a new one within a specific
        section of the IR.

*   `SymbolTable::lookupNearestSymbolFrom`

    -   Look up the definition of a symbol in the nearest symbol table from some
        anchor operation.
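
To give a feel for what the use-oriented utilities do, here is a simplified
sketch on a toy IR. It is not the MLIR implementation: the `SymbolRef` and `Op`
classes and the snake_case functions are hypothetical stand-ins, and the sketch
assumes that no nested symbol table redefines the name being queried, so a
reference is matched purely by its leading name.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class SymbolRef:
    """Toy symbol reference; ("a", "b") models `@a::@b`."""
    path: Tuple[str, ...]


@dataclass
class Op:
    """Toy operation; a stand-in for illustration, not MLIR's `Operation`."""
    name: str
    attrs: Dict[str, object] = field(default_factory=dict)
    body: List["Op"] = field(default_factory=list)


def refs_in(attr: object) -> Iterator[SymbolRef]:
    """Yield every reference held in `attr`, searching lists and dicts."""
    if isinstance(attr, SymbolRef):
        yield attr
    elif isinstance(attr, dict):
        for value in attr.values():
            yield from refs_in(value)
    elif isinstance(attr, (list, tuple)):
        for value in attr:
            yield from refs_in(value)


def get_symbol_uses(root: Op) -> List[Tuple[Op, SymbolRef]]:
    """Collect (user, reference) pairs on `root` and nested within it."""
    uses = [(root, ref) for ref in refs_in(root.attrs)]
    for child in root.body:
        uses.extend(get_symbol_uses(child))
    return uses


def symbol_known_use_empty(sym_name: str, root: Op) -> bool:
    """True if no reference under `root` starts with `sym_name`."""
    return all(ref.path[0] != sym_name for _, ref in get_symbol_uses(root))


def replace_in(attr: object, old: str, new: str) -> object:
    """Return a copy of `attr` with leading `old` names replaced by `new`."""
    if isinstance(attr, SymbolRef):
        if attr.path[0] == old:
            return SymbolRef((new,) + attr.path[1:])
        return attr
    if isinstance(attr, dict):
        return {key: replace_in(value, old, new) for key, value in attr.items()}
    if isinstance(attr, list):
        return [replace_in(value, old, new) for value in attr]
    return attr


def replace_all_symbol_uses(old: str, new: str, root: Op) -> None:
    """Rewrite uses of `old` under `root`; the definition is left untouched."""
    root.attrs = replace_in(root.attrs, old, new)
    for child in root.body:
        replace_all_symbol_uses(old, new, child)
```

Because references live in attributes, the walk has to look inside array and
dictionary attributes as well, which is the main difference from iterating an
SSA use-list.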

## Symbol Visibility

Along with a name, a `Symbol` also has a `visibility` attached to it. The
`visibility` of a symbol defines its structural reachability within the IR. A
symbol has one of the following visibilities:

*   Public (Default)

    -   The symbol may be referenced from outside of the visible IR. We cannot
        assume that all of the uses of this symbol are observable.

*   Private

    -   The symbol may only be referenced from within the current symbol table.

*   Nested

    -   The symbol may be referenced by operations outside of the current symbol
        table, but not outside of the visible IR, as long as each symbol table
        parent also defines a non-private symbol.

A few examples of what this looks like in the IR are shown below:

```mlir
module @public_module {
  // This function can be accessed by 'live.user', but cannot be referenced
  // externally; all uses are known to reside within parent regions.
  func @nested_function() attributes { sym_visibility = "nested" }

  // This function cannot be accessed outside of 'public_module'.
  func @private_function() attributes { sym_visibility = "private" }
}

// This function can only be accessed from within the top-level module.
func @private_function() attributes { sym_visibility = "private" }

// This function may be referenced externally.
func @public_function()

"live.user"() {uses = [
  @public_module::@nested_function,
  @private_function,
  @public_function
]} : () -> ()
```
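
The practical consequence of each visibility can be summarized in a few lines.
The sketch below is a toy classification written for this document, not MLIR
code: the `Visibility` enum and the helper functions are hypothetical names,
and the attribute dictionary stands in for an operation's attributes.

```python
from enum import Enum
from typing import Dict


class Visibility(Enum):
    PUBLIC = "public"
    PRIVATE = "private"
    NESTED = "nested"


def visibility_of(attrs: Dict[str, object]) -> Visibility:
    """Read `sym_visibility`; a missing attribute means public."""
    return Visibility(attrs.get("sym_visibility", "public"))


def referenceable_outside_symbol_table(visibility: Visibility) -> bool:
    """Private symbols may only be referenced within their own symbol table."""
    return visibility is not Visibility.PRIVATE


def all_uses_observable(visibility: Visibility) -> bool:
    """Only public symbols may have uses outside of the visible IR."""
    return visibility is not Visibility.PUBLIC
```

A transformation that wants to delete or rename a symbol can act on the uses
it sees only when `all_uses_observable` holds, which is why public symbols are
treated conservatively.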