forked from OSchip/llvm-project
169 lines
7.0 KiB
Markdown
169 lines
7.0 KiB
Markdown
<!--===- docs/ControlFlowGraph.md
|
|
|
|
Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
|
|
See https://llvm.org/LICENSE.txt for license information.
|
|
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
|
|
|
-->
|
|
|
|
# Control Flow Graph
|
|
|
|
```eval_rst
|
|
.. contents::
|
|
:local:
|
|
```
|
|
|
|
## Concept
|
|
After a Fortran subprogram has been parsed, its names resolved, and all its
|
|
semantic constraints successfully checked, the parse tree of its
|
|
executable part is translated into another abstract representation,
|
|
namely the _control flow graph_ described in this note.
|
|
|
|
This second representation of the subprogram's executable part is
|
|
suitable for analysis and incremental modification as the subprogram
|
|
is readied for code generation.
|
|
Many high-level Fortran features are implemented by rewriting portions
|
|
of a subprogram's control flow graph in place.
|
|
|
|
### Control Flow Graph
|
|
A _control flow graph_ is a collection of simple (_i.e.,_ "non-extended")
|
|
basic _blocks_ that comprise straight-line sequences of _actions_ with a
|
|
single entry point and a single exit point, and a collection of
|
|
directed flow _edges_ (or _arcs_) denoting all possible transitions of
|
|
control flow that may take place during execution from the end of
|
|
one basic block to the beginning of another (or itself).
|
|
|
|
A block that has multiple distinct successors in the flow of control
|
|
must end with an action that selects its successor.
|
|
|
|
The sequence of actions that constitutes a basic block may
|
|
include references to user and library procedures.
|
|
Subprogram calls with implicit control flow afterwards, namely
|
|
alternate returns and `END=`/`ERR=` labels on input/output,
|
|
will be lowered in translation to a representation that materializes
|
|
that control flow into something similar to a computed `GO TO` or
|
|
C language `switch` statement.
|
|
|
|
For convenience in optimization and to simplify the implementation of
|
|
data flow confluence functions, we may choose to maintain the
|
|
property that each flow arc is the sole outbound arc emanating from
|
|
its originating block, the sole inbound arc arriving at its destination,
|
|
or both.
|
|
Empty blocks would inserted to "split" arcs when necessary to maintain this
|
|
invariant property.
|
|
|
|
Fortran subprograms (other than internal subprograms) can have multiple
|
|
entry points by using the obsolescent `ENTRY` statement.
|
|
We will implement such subprograms by constructing a union
|
|
of their dummy argument lists and using it as part of the definition
|
|
of a new subroutine or function that can be called by each of
|
|
the entry points, which are then all converted into wrapper routines that
|
|
pass a selector value as an additional argument to drive a `switch` on entry
|
|
to the new subprogram.
|
|
|
|
This transformation ensures that every subprogram's control
|
|
flow graph has a well-defined `START` node.
|
|
|
|
Statement labels can be used in Fortran on any statement, but only
|
|
the labels that decorate legal destinations of `GO TO` statements
|
|
need to be implemented in the control flow graph.
|
|
Specifically, non-executable statements like `DATA`, `NAMELIST`, and
|
|
`FORMAT` statements will be extracted into data initialization
|
|
records before or during the construction of the control flow
|
|
graph, and will survive only as synonyms for `CONTINUE`.
|
|
|
|
Nests of multiple labeled `DO` loops that terminate on the same
|
|
label will be have that label rewritten so that `GO TO` within
|
|
the loop nest will arrive at the copy that most closely nests
|
|
the context.
|
|
The Fortran standard does not require us to do this, but XLF
|
|
(at least) works this way.
|
|
|
|
### Expressions and Statements (Operations and Actions)
|
|
Expressions are trees, not DAGs, of intrinsic operations,
|
|
resolved function references, constant literals, and
|
|
data designators.
|
|
|
|
Expression nodes are represented in the compiler in a type-safe manner.
|
|
There is a distinct class or class template for every category of
|
|
intrinsic type, templatized over its supported kind type parameter values.
|
|
|
|
Operands are storage-owning indirections to other instances
|
|
of `Expression`, instances of constant values, and to representations
|
|
of data and function references.
|
|
These indirections are not nullable apart from the situation in which
|
|
the operands of an expression are being removed for use elsewhere before
|
|
the expression is destructed.
|
|
|
|
The ranks and the extents of the shapes of the results of expressions
|
|
are explicit for constant arrays and recoverable by analysis otherwise.
|
|
|
|
Parenthesized subexpressions are scrupulously preserved in accordance with
|
|
the Fortran standard.
|
|
|
|
The expression tree is meant to be a representation that is
|
|
as equally well suited for use in the symbol table (e.g., for
|
|
a bound of an explicit shape array) as it is for an action
|
|
in a basic block of the control flow graph (e.g., the right
|
|
hand side of an assignment statement).
|
|
|
|
Each basic block comprises a linear sequence of _actions_.
|
|
These are represented as a doubly-linked list so that insertion
|
|
and deletion can be done in constant time.
|
|
|
|
Only the last action in a basic block can represent a change
|
|
to the flow of control.
|
|
|
|
### Scope Transitions
|
|
Some of the various scopes of the symbol table are visible in the control flow
|
|
graph as `SCOPE ENTRY` and `SCOPE EXIT` actions.
|
|
`SCOPE ENTRY` actions are unique for their corresponding scopes,
|
|
while `SCOPE EXIT` actions need not be so.
|
|
It must be the case that
|
|
any flow of control within the subprogram will enter only scopes that are
|
|
not yet active, and exit only the most recently entered scope that has not
|
|
yet been deactivated; i.e., when modeled by a push-down stack that is
|
|
pushed by each traversal of a `SCOPE ENTRY` action,
|
|
the entries of the stack are always distinct, only the scope at
|
|
the top of the stack is ever popped by `SCOPE EXIT`, and the stack is empty
|
|
when the subprogram terminates.
|
|
Further, any references to resolved symbols must be to symbols whose scopes
|
|
are active.
|
|
|
|
The `DEALLOCATE` actions and calls to `FINAL` procedures implied by scoped
|
|
lifetimes will be explicit in the sequence of actions in the control flow
|
|
graph.
|
|
|
|
Parallel regions might be partially represented by scopes, or by explicit
|
|
operations similar to the scope entry and exit operations.
|
|
|
|
### Data Flow Representation
|
|
The subprogram text will be in static single assignment form by the time the
|
|
subprogram arrives at the bridge to the LLVM IR builder.
|
|
Merge points are actions at the heads of basic blocks whose operands
|
|
are definition points; definition points are actions at the ends of
|
|
basic blocks whose operands are expression trees (which may refer to
|
|
merge points).
|
|
|
|
### Rewriting Transformations
|
|
|
|
#### I/O
|
|
#### Dynamic allocation
|
|
#### Array constructors
|
|
|
|
#### Derived type initialization, deallocation, and finalization
|
|
The machinery behind the complicated semantics of Fortran's derived types
|
|
and `ALLOCATABLE` objects will be implemented in large part by the run time
|
|
support library.
|
|
|
|
#### Actual argument temporaries
|
|
#### Array assignments, `WHERE`, and `FORALL`
|
|
|
|
Array operations have shape.
|
|
|
|
`WHERE` masks have shape.
|
|
Their effects on array operations are by means of explicit `MASK` operands that
|
|
are part of array assignment operations.
|
|
|
|
#### Intrinsic function and subroutine calls
|