forked from OSchip/llvm-project
[LoopTerminology] LCSSA form
Reviewed by: Michael Kruse (Meinersbur) Differential Revision: https://reviews.llvm.org/D75233
This commit is contained in:
parent
3807079d70
commit
229cda968c
|
@ -161,11 +161,164 @@ When it is successful, the loop has:
|
||||||
has a predecessor that is outside the loop. This implies
|
has a predecessor that is outside the loop. This implies
|
||||||
that all exit blocks are dominated by the loop header.
|
that all exit blocks are dominated by the loop header.
|
||||||
|
|
||||||
|
.. _loop-terminology-lcssa:
|
||||||
|
|
||||||
Loop Closed SSA (LCSSA)
|
Loop Closed SSA (LCSSA)
|
||||||
=======================
|
=======================
|
||||||
|
|
||||||
TBD
|
A program is in Loop Closed SSA Form if it is in SSA form
|
||||||
|
and all values that are defined in a loop are used only inside
|
||||||
|
this loop.
|
||||||
|
Programs written in LLVM IR are always in SSA form but not necessarily
|
||||||
|
in LCSSA. To achieve the latter, single entry PHI nodes are inserted
|
||||||
|
at the end of the loops for all values that are live
|
||||||
|
across the loop boundary [#lcssa-construction]_.
|
||||||
|
In particular, consider the following loop:
|
||||||
|
|
||||||
|
.. code-block:: C
|
||||||
|
|
||||||
|
c = ...;
|
||||||
|
for (...) {
|
||||||
|
if (c)
|
||||||
|
X1 = ...
|
||||||
|
else
|
||||||
|
X2 = ...
|
||||||
|
X3 = phi(X1, X2); // X3 defined
|
||||||
|
}
|
||||||
|
|
||||||
|
... = X3 + 4; // X3 used, i.e. live
|
||||||
|
// outside the loop
|
||||||
|
|
||||||
|
In the inner loop, the X3 is defined inside the loop, but used
|
||||||
|
outside of it. In Loop Closed SSA form, this would be represented as follows:
|
||||||
|
|
||||||
|
.. code-block:: C
|
||||||
|
|
||||||
|
c = ...;
|
||||||
|
for (...) {
|
||||||
|
if (c)
|
||||||
|
X1 = ...
|
||||||
|
else
|
||||||
|
X2 = ...
|
||||||
|
X3 = phi(X1, X2);
|
||||||
|
}
|
||||||
|
X4 = phi(X3);
|
||||||
|
|
||||||
|
... = X4 + 4;
|
||||||
|
|
||||||
|
This is still valid LLVM; the extra phi nodes are purely redundant,
|
||||||
|
but all LoopPass'es are required to preserve them.
|
||||||
|
This form is ensured by the LCSSA (:ref:`-lcssa <passes-lcssa>`)
|
||||||
|
pass and is added automatically by the LoopPassManager when
|
||||||
|
scheduling a LoopPass.
|
||||||
|
After the loop optimizations are done, these extra phi nodes
|
||||||
|
will be deleted by :ref:`-instcombine <passes-instcombine>`.
|
||||||
|
|
||||||
|
The major benefit of this transformation is that it makes many other
|
||||||
|
loop optimizations simpler.
|
||||||
|
|
||||||
|
First of all, a simple observation is that if one needs to see all
|
||||||
|
the outside users, they can just iterate over all the (loop closing)
|
||||||
|
PHI nodes in the exit blocks (the alternative would be to
|
||||||
|
scan the def-use chain [#def-use-chain]_ of all instructions in the loop).
|
||||||
|
|
||||||
|
Then, consider for example
|
||||||
|
:ref:`-loop-unswitch <passes-loop-unswitch>` ing the loop above.
|
||||||
|
Because it is in LCSSA form, we know that any value defined inside of
|
||||||
|
the loop will be used either only inside the loop or in a loop closing
|
||||||
|
PHI node. In this case, the only loop closing PHI node is X4.
|
||||||
|
This means that we can just copy the loop and change the X4
|
||||||
|
accordingly, like so:
|
||||||
|
|
||||||
|
.. code-block:: C
|
||||||
|
|
||||||
|
for (...) {
|
||||||
|
c = ...;
|
||||||
|
if (c) {
|
||||||
|
for (...) {
|
||||||
|
if (true)
|
||||||
|
X1 = ...
|
||||||
|
else
|
||||||
|
X2 = ...
|
||||||
|
X3 = phi(X1, X2);
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
for (...) {
|
||||||
|
if (false)
|
||||||
|
X1' = ...
|
||||||
|
else
|
||||||
|
X2' = ...
|
||||||
|
X3' = phi(X1', X2');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
X4 = phi(X3, X3')
|
||||||
|
|
||||||
|
Now, all uses of X4 will get the updated value (in general,
|
||||||
|
if a loop is in LCSSA form, in any loop transformation,
|
||||||
|
we only need to update the loop closing PHI nodes for the changes
|
||||||
|
to take effect). If we did not have Loop Closed SSA form, it means that X3 could
|
||||||
|
possibly be used outside the loop. So, we would have to introduce the
|
||||||
|
X4 (which is the new X3) and replace all uses of X3 with that.
|
||||||
|
However, we should note that because LLVM keeps a def-use chain
|
||||||
|
[#def-use-chain]_ for each Value, we wouldn't need
|
||||||
|
to perform data-flow analysis to find and replace all the uses
|
||||||
|
(there is even a utility function, replaceAllUsesWith(),
|
||||||
|
that performs this transformation by iterating the def-use chain).
|
||||||
|
|
||||||
|
Another important advantage is that the behavior of all uses
|
||||||
|
of an induction variable is the same. Without this, you need to
|
||||||
|
distinguish the case when the variable is used outside of
|
||||||
|
the loop it is defined in, for example:
|
||||||
|
|
||||||
|
.. code-block:: C
|
||||||
|
|
||||||
|
for (i = 0; i < 100; i++) {
|
||||||
|
for (j = 0; j < 100; j++) {
|
||||||
|
k = i + j;
|
||||||
|
use(k); // use 1
|
||||||
|
}
|
||||||
|
use(k); // use 2
|
||||||
|
}
|
||||||
|
|
||||||
|
Looking from the outer loop with the normal SSA form, the first use of k
|
||||||
|
is not well-behaved, while the second one is an induction variable with
|
||||||
|
base 100 and step 1. Although, in practice, and in the LLVM context,
|
||||||
|
such cases can be handled effectively by SCEV. Scalar Evolution
|
||||||
|
(:ref:`scalar-evolution <passes-scalar-evolution>`) or SCEV, is a
|
||||||
|
(analysis) pass that analyzes and categorizes the evolution of scalar
|
||||||
|
expressions in loops.
|
||||||
|
|
||||||
|
In general, it's easier to use SCEV in loops that are in LCSSA form.
|
||||||
|
The evolution of a scalar (loop-variant) expression that
|
||||||
|
SCEV can analyze is, by definition, relative to a loop.
|
||||||
|
An expression is represented in LLVM by an
|
||||||
|
`llvm::Instruction <https://llvm.org/doxygen/classllvm_1_1Instruction.html>`.
|
||||||
|
If the expression is inside two (or more) loops (which can only
|
||||||
|
happen if the loops are nested, like in the example above) and you want
|
||||||
|
to get an analysis of its evolution (from SCEV),
|
||||||
|
you have to also specify relative to what Loop you want it.
|
||||||
|
Specifically, you have to use
|
||||||
|
`getSCEVAtScope() <https://llvm.org/doxygen/classllvm_1_1ScalarEvolution.html#a21d6ee82eed29080d911dbb548a8bb68>`_.
|
||||||
|
|
||||||
|
However, if all loops are in LCSSA form, each expression is actually
|
||||||
|
represented by two different llvm::Instructions. One inside the loop
|
||||||
|
and one outside, which is the loop-closing PHI node and represents
|
||||||
|
the value of the expression after the last iteration (effectively,
|
||||||
|
we break each loop-variant expression into two expressions and so, every
|
||||||
|
expression is at most in one loop). You can now just use
|
||||||
|
`getSCEV() <https://llvm.org/doxygen/classllvm_1_1ScalarEvolution.html#a30bd18ac905eacf3601bc6a553a9ff49>`_.
|
||||||
|
and which of these two llvm::Instructions you pass to it disambiguates
|
||||||
|
the context / scope / relative loop.
|
||||||
|
|
||||||
|
.. rubric:: Footnotes
|
||||||
|
|
||||||
|
.. [#lcssa-construction] To insert these loop-closing PHI nodes, one has to
|
||||||
|
(re-)compute dominance frontiers (if the loop has multiple exits).
|
||||||
|
|
||||||
|
.. [#def-use-chain] A property of SSA is that there exists a def-use chain
|
||||||
|
for each definition, which is a list of all the uses of this definition.
|
||||||
|
LLVM implements this property by keeping a list of all the uses of a Value
|
||||||
|
in an internal data structure.
|
||||||
|
|
||||||
"More Canonical" Loops
|
"More Canonical" Loops
|
||||||
======================
|
======================
|
||||||
|
|
|
@ -331,6 +331,8 @@ The ``RegionInfo`` pass detects single entry single exit regions in a function,
|
||||||
where a region is defined as any subgraph that is connected to the remaining
|
where a region is defined as any subgraph that is connected to the remaining
|
||||||
graph at only two spots. Furthermore, a hierarchical region tree is built.
|
graph at only two spots. Furthermore, a hierarchical region tree is built.
|
||||||
|
|
||||||
|
.. _passes-scalar-evolution:
|
||||||
|
|
||||||
``-scalar-evolution``: Scalar Evolution Analysis
|
``-scalar-evolution``: Scalar Evolution Analysis
|
||||||
------------------------------------------------
|
------------------------------------------------
|
||||||
|
|
||||||
|
@ -711,6 +713,8 @@ An example of when this can occur is code like this:
|
||||||
In this case, the unconditional branch at the end of the first if can be
|
In this case, the unconditional branch at the end of the first if can be
|
||||||
revectored to the false side of the second if.
|
revectored to the false side of the second if.
|
||||||
|
|
||||||
|
.. _passes-lcssa:
|
||||||
|
|
||||||
``-lcssa``: Loop-Closed SSA Form Pass
|
``-lcssa``: Loop-Closed SSA Form Pass
|
||||||
-------------------------------------
|
-------------------------------------
|
||||||
|
|
||||||
|
@ -732,7 +736,8 @@ into the right code:
|
||||||
This is still valid LLVM; the extra phi nodes are purely redundant, and will be
|
This is still valid LLVM; the extra phi nodes are purely redundant, and will be
|
||||||
trivially eliminated by ``InstCombine``. The major benefit of this
|
trivially eliminated by ``InstCombine``. The major benefit of this
|
||||||
transformation is that it makes many other loop optimizations, such as
|
transformation is that it makes many other loop optimizations, such as
|
||||||
``LoopUnswitch``\ ing, simpler.
|
``LoopUnswitch``\ ing, simpler. You can read more in the
|
||||||
|
:ref:`loop terminology section for the LCSSA form <loop-terminology-lcssa>`.
|
||||||
|
|
||||||
.. _passes-licm:
|
.. _passes-licm:
|
||||||
|
|
||||||
|
@ -864,6 +869,8 @@ loops into one. When variables or loads can be shared in the new inner loop, thi
|
||||||
can lead to significant performance improvements. It uses
|
can lead to significant performance improvements. It uses
|
||||||
:ref:`Dependence Analysis <passes-da>` for proving the transformations are safe.
|
:ref:`Dependence Analysis <passes-da>` for proving the transformations are safe.
|
||||||
|
|
||||||
|
.. _passes-loop-unswitch:
|
||||||
|
|
||||||
``-loop-unswitch``: Unswitch loops
|
``-loop-unswitch``: Unswitch loops
|
||||||
----------------------------------
|
----------------------------------
|
||||||
|
|
||||||
|
|
Loading…
Reference in New Issue