[flang] Add the proposal document and rationale for the internal naming module that was previously added.
Summary: This document describes how uniquing of internal names is done. This name uniquing is done to support the constraints and invariants of the FIR dialect of MLIR.

Reviewers: jeanPerier, mehdi_amini, DavidTruby, jdoerfert, sscalpone, kiranchandramohan

Reviewed By: jeanPerier, sscalpone, kiranchandramohan

Subscribers: tskeith, kiranchandramohan, rriddle, llvm-commits

Tags: #llvm, #flang

Differential Revision: https://reviews.llvm.org/D79089
Parent: 5d46e4b0da
Commit: 7875362986
## Bijective Internal Name Uniquing

FIR has a flat namespace: no two objects (functions, globals, etc.) may
have the same name at the module level. This necessitates an encoding
scheme that maps each front-end symbol to a unique name in FIR.

The encoding must also be reversible, so that the associated symbol in
the symbol table can be recovered from its unique name.

Fortran is case-insensitive, which allows the compiler to convert the
user's identifiers to all lower case. Such a universal conversion implies
that all upper-case letters are available for use in uniquing.
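
As a rough illustration of the case-folding argument above, the sketch below lower-cases a few identifiers and checks that no upper-case letter survives, which is what frees those letters for use as markers. The helper names and the sample identifiers are hypothetical and are not taken from the flang sources.

```python
import re


def fold_identifier(name: str) -> str:
    """Lower-case a Fortran identifier (Fortran names are case-insensitive)."""
    return name.lower()


def uses_upper_case(name: str) -> bool:
    """Return True if any upper-case ASCII letter remains in the name."""
    return re.search(r"[A-Z]", name) is not None


# Hypothetical identifiers as they might appear in user source.
source_names = ["MyMod", "SUB", "fun_2"]
folded = [fold_identifier(n) for n in source_names]

print(folded)
# After folding, upper-case letters can only come from the uniquing scheme.
print(any(uses_upper_case(n) for n in folded))
```
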

### Prefix `_Q`

All uniqued names have the prefix sequence `_Q` to indicate the name has
been uniqued. (Q is chosen because it is a
[low-frequency letter](http://pi.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html)
in English.)
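
A minimal sketch of how a consumer might test for the prefix follows. The function name `is_uniqued` is an assumption made for illustration. The check relies on the fact that standard Fortran identifiers begin with a letter, so a lower-cased user name would not start with `_Q`.

```python
UNIQUED_PREFIX = "_Q"


def is_uniqued(name: str) -> bool:
    """Return True if the name carries the uniquing prefix."""
    return name.startswith(UNIQUED_PREFIX)


print(is_uniqued("_QPsub"))
print(is_uniqued("sub"))
```
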

### Scope Building

Symbols can be scoped by the module, submodule, or procedure that contains
that symbol. After the `_Q` sigil, names are constructed from outermost to
innermost scope as

* Module name prefixed with `M`
* Submodule name prefixed with `S`
* Procedure name prefixed with `F`

Given:
```
submodule (mod:s1mod) s2mod
  ...
  subroutine sub
    ...
  contains
    function fun
```

The uniqued name of `fun` becomes:
```
_QMmodSs1modSs2modFsubPfun
```

Here the enclosing subroutine `sub` contributes the scope component `Fsub`,
while `fun` itself carries the procedure prefix `P`.
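
The scope-building rule and its inverse can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `(kind, name)` tuple representation, and the regular expression are hypothetical, and the decoder only understands the simple marker-plus-lower-case-name layout used in the example above.

```python
import re

# Scope markers as listed in this section.
SCOPE_MARKERS = {"module": "M", "submodule": "S", "procedure": "F"}


def unique_procedure_name(scopes, name):
    """Build a uniqued procedure name.

    scopes: list of (kind, scope_name) pairs, outermost first.
    name:   the procedure being named, which takes the prefix `P`.
    """
    parts = ["_Q"]
    for kind, scope_name in scopes:
        parts.append(SCOPE_MARKERS[kind] + scope_name.lower())
    parts.append("P" + name.lower())
    return "".join(parts)


def split_uniqued_name(uniqued):
    """Recover (marker, name) pairs; upper-case letters delimit components."""
    if not uniqued.startswith("_Q"):
        raise ValueError("not a uniqued name: " + uniqued)
    return re.findall(r"([A-Z]+)([a-z0-9_]+)", uniqued[2:])


scopes = [("module", "mod"), ("submodule", "s1mod"),
          ("submodule", "s2mod"), ("procedure", "sub")]
encoded = unique_procedure_name(scopes, "fun")
print(encoded)
print(split_uniqued_name(encoded))
```

Because user identifiers contain no upper-case letters after folding, each upper-case run marks the start of a new component, which is what makes the decoding step unambiguous in this sketch.
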

### Common blocks

* A common block name will be prefixed with `B`

### Module scope global data

* A global data entity is prefixed with `E`
* A global entity that is constant (parameter) will be prefixed with `EC`
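
The text gives no worked example for module-scope data, so the following is only a sketch. It assumes that the module scope component (`M` plus the module name) precedes the entity marker, by analogy with the scope-building rule. The function name and the module and variable names are hypothetical.

```python
def unique_global_name(module: str, name: str, is_parameter: bool = False) -> str:
    """Build a uniqued name for a module-scope data entity.

    Assumption: the module scope prefix comes first, then `E` or `EC`.
    """
    marker = "EC" if is_parameter else "E"
    return "_Q" + "M" + module.lower() + marker + name.lower()


# Hypothetical module `consts` with a variable and a named constant.
print(unique_global_name("consts", "counter"))
print(unique_global_name("consts", "pi", is_parameter=True))
```

Note that a variable named `counter` yields `...Ecounter`, which stays distinct from a parameter marker `EC` because user names are always lower case.
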

### Procedures/Subprograms

* A procedure/subprogram is prefixed with `P`

Given:
```
subroutine sub
```
The uniqued name of `sub` becomes:
```
_QPsub
```

### Derived types and related

* A derived type is prefixed with `T`
* If a derived type has KIND parameters, they are listed in a consistent
  canonical order where each takes the form `Ki` and where _i_ is the
  compile-time constant value. (All type parameters are integer.) If _i_
  is a negative value, the prefix `KN` will be used and _i_ will reflect
  the magnitude of the value.

Given:
```
module mymodule
  type mytype
    integer :: member
  end type
  ...
```
The uniqued name of `mytype` becomes:
```
_QMmymoduleTmytype
```

Given:
```
type yourtype(k1,k2)
  integer, kind :: k1, k2
  real :: mem1
  complex :: mem2
end type
```

The uniqued name of `yourtype`, where `k1=4` and `k2=-6` at compile time,
becomes:
```
_QTyourtypeK4KN6
```
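
The KIND-parameter encoding can be sketched as below. The function names are hypothetical, and the sketch assumes the "canonical order" is simply the order in which the caller supplies the values (declaration order in the example). The document does not define that order further.

```python
def encode_kind(value: int) -> str:
    """Encode one KIND parameter value: `Ki`, or `KNi` for negative values."""
    if value < 0:
        return "KN" + str(-value)  # the magnitude follows the `KN` prefix
    return "K" + str(value)


def unique_type_name(type_name: str, kinds=(), scope_prefix: str = "") -> str:
    """Build a uniqued derived-type name.

    scope_prefix: already-encoded scope components such as "Mmymodule".
    kinds:        compile-time KIND parameter values, in canonical order.
    """
    encoded_kinds = "".join(encode_kind(k) for k in kinds)
    return "_Q" + scope_prefix + "T" + type_name.lower() + encoded_kinds


print(unique_type_name("mytype", scope_prefix="Mmymodule"))
print(unique_type_name("yourtype", kinds=(4, -6)))
```
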

* A derived type dispatch table is prefixed with `D`. The dispatch table
  for `type t` would be `_QDTt`
* A type descriptor instance is prefixed with `C`. Intrinsic types can
  be encoded with their names and kinds. The type descriptor for the
  type `yourtype` above would be `_QCTyourtypeK4KN6`. The type
  descriptor for `REAL(4)` would be `_QCrealK4`.
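
The three examples in this list can be reproduced with a small sketch. The helper names are hypothetical, and the sketch deliberately handles only unscoped type names of the form `_QT...`, since the text does not show where `D` or `C` is placed for a type nested in a module or procedure.

```python
def _unscoped_type_body(uniqued_type: str) -> str:
    """Strip `_Q` from an unscoped uniqued type name such as `_QTt`."""
    if not uniqued_type.startswith("_QT"):
        raise ValueError("expected an unscoped uniqued type name: " + uniqued_type)
    return uniqued_type[len("_Q"):]


def dispatch_table_name(uniqued_type: str) -> str:
    """Name of the dispatch table for a derived type (prefix `D`)."""
    return "_QD" + _unscoped_type_body(uniqued_type)


def type_descriptor_name(uniqued_type: str) -> str:
    """Name of the type descriptor for a derived type (prefix `C`)."""
    return "_QC" + _unscoped_type_body(uniqued_type)


def intrinsic_descriptor_name(type_name: str, kind: int) -> str:
    """Type descriptor name for an intrinsic type with a positive kind."""
    return "_QC" + type_name.lower() + "K" + str(kind)


print(dispatch_table_name("_QTt"))
print(type_descriptor_name("_QTyourtypeK4KN6"))
print(intrinsic_descriptor_name("REAL", 4))
```
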

### Compiler-generated names

Compiler-generated names are prefixed with `_QQ`, followed by a unique
compiler-generated identifier. These names do not have to be mapped back
to Fortran: no corresponding symbol exists in the input source, so there
is nothing to map back to.
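
One way to picture this is a generator that hands out fresh `_QQ` names. The text does not specify the form of the unique identifier, so the decimal counter used here is purely an assumption, as are the function names.

```python
import itertools


def make_name_generator():
    """Return a function that yields a fresh `_QQ` name on each call.

    Assumption: the unique identifier is a decimal counter.
    """
    counter = itertools.count()

    def fresh_name() -> str:
        return "_QQ" + str(next(counter))

    return fresh_name


def is_compiler_generated(name: str) -> bool:
    """Return True if the name has no source-level symbol behind it."""
    return name.startswith("_QQ")


fresh = make_name_generator()
first, second = fresh(), fresh()
print(first, second)
print(is_compiler_generated(first), is_compiler_generated("_QPsub"))
```

Since `Q` is not used as an entity or scope marker elsewhere in this document, a `_QQ` name cannot be confused with a uniqued source-level name.
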