The -loop-flatten legacy pass preserves loop analyses. The legacy PM
will check all passes that preserve loop analyses that they preserve
LCSSA. This implicitly involves running -loop-simplify. The test
shouldn't depend on verify flags being set in order to run
-loop-simplify, so explicitly add it. The new PM ends up not running it
otherwise.
I disabled the widening in fa5cb4b because it run in an assert, which was
related to replacing values with different types. I forgot that an extend could
also be a zero-extend, which I have added now. This means that the approach now
is to create and insert a trunc value of the outerloop for each user, and use
that to replace IV values.
Differential Revision: https://reviews.llvm.org/D91690
Widen the IV to the widest available and legal integer type, which makes this
transformations always safe so that we can skip overflow checks.
Motivation is to let this pass trigger on 64-bit targets too, and this is the
last patch in a serie to achieve this: D90402 moves pass LoopFlatten to just
before IndVarSimplify so that IVs are not already widened, D90421 factors out
widening from IndVarSimplify into Utils/SimplifyIndVar so that we can also use
it in LoopFlatten.
Differential Revision: https://reviews.llvm.org/D90640
This converts LoopFlatten from a LoopPass to a FunctionPass so that we don't
run into problems of a loop pass deleting a (inner)loop.
Differential Revision: https://reviews.llvm.org/D90940
This is a simple pass that flattens nested loops. The intention is to optimise
loop nests like this, which together access an array linearly:
for (int i = 0; i < N; ++i)
for (int j = 0; j < M; ++j)
f(A[i*M+j]);
into one loop:
for (int i = 0; i < (N*M); ++i)
f(A[i]);
It can also flatten loops where the induction variables are not used in the
loop. This can help with codesize and runtime, especially on simple cpus
without advanced branch prediction.
This is only worth flattening if the induction variables are only used in an
expression like i*M+j. If they had any other uses, we would have to insert a
div/mod to reconstruct the original values, so this wouldn't be profitable.
This partially fixes PR40581 as this pass triggers on one of the two cases. I
will follow up on this to learn LoopFlatten a few more (small) tricks. Please
note that LoopFlatten is not yet enabled by default.
Patch by Oliver Stannard, with minor tweaks from Dave Green and myself.
Differential Revision: https://reviews.llvm.org/D42365