This is a simple pass that flattens nested loops. The intention is to optimise
loop nests like this, which together access an array linearly:
for (int i = 0; i < N; ++i)
for (int j = 0; j < M; ++j)
f(A[i*M+j]);
into one loop:
for (int i = 0; i < (N*M); ++i)
f(A[i]);
It can also flatten loops where the induction variables are not used in the
loop. This can help with codesize and runtime, especially on simple cpus
without advanced branch prediction.
This is only worth flattening if the induction variables are only used in an
expression like i*M+j. If they had any other uses, we would have to insert a
div/mod to reconstruct the original values, so this wouldn't be profitable.
This partially fixes PR40581 as this pass triggers on one of the two cases. I
will follow up on this to learn LoopFlatten a few more (small) tricks. Please
note that LoopFlatten is not yet enabled by default.
Patch by Oliver Stannard, with minor tweaks from Dave Green and myself.
Differential Revision: https://reviews.llvm.org/D42365