Commit Graph

427 Commits

Author SHA1 Message Date
Michael Kruse fa6c5107c3 [Polly] Fix unused variable in non-assert builds. NFC. 2021-10-13 00:20:22 -05:00
Michael Kruse 64489255be [Polly] Add greedy fusion algorithm.
When the option -polly-loopfusion-greedy is set, the ScheduleOptimizer
tries to aggressively fuse any band it can and does not violate any
dependences.

As part if the implementation, the functionalty for copying a band
into an new schedule was extracted out of the ScheduleTreeRewriter.
2021-10-08 20:33:30 -05:00
Michael Kruse cb879d00d8 [Polly] Completely remove -polly-opt-fusion.
This was missing from 07e7cb9433.
The switch did nothing since then.
2021-10-08 02:10:34 -05:00
Roman Gareev 113fa82c3c [Polly] Check the properties of accesses to operands of a matrix-matrix
multiplication

The following code modifies elements of the array D.

    for (i = 0; i < _PB_NI; i++)
      for (j = 0; j < _PB_NJ; j++)
      {
        for (k = 0; k < _PB_NK; k++)
        {
          double Mul = A[i][k] * B[k][j];
          D[i][j][k] += Mul;
          C[i][j] += Mul;
        }
      }

Nevertheless, the code is recognised as a matrix-matrix multiplication, since
the second and third dimensions of D are accessed with non-zero strides.

This fixes the typo, which was made during the translation to C++ bindings
(https://reviews.llvm.org/D35845).

Reviewed By: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D110491
2021-09-28 22:58:57 +05:00
Michael Kruse e470f9268a [Polly] Implement user-directed loop distribution/fission.
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.

It interprets the same metadata as the LoopDistribute pass.

Re-apply after revert in c7bcd72a38 with
fix: Take isBand out of #ifndef NDEBUG since it now is used
unconditionally.
2021-09-23 21:11:01 -05:00
Petr Hosek c7bcd72a38 Revert "[Polly] Implement user-directed loop distribution/fission."
This reverts commit 52c30adc7d which
breaks the build when NDEBUG is defined.
2021-09-23 14:04:25 -07:00
Michael Kruse 07e7cb9433 [Polly] Remove -polly-opt-fusion option.
The name of the option is misleading and has been renamed by isl to
"serialize-sccs". Instead of also renaming the option, remove it.
The option is still accessible using

    -polly-isl-arg=--no-schedule-serialize-sccs
2021-09-23 15:43:08 -05:00
Michael Kruse 52c30adc7d [Polly] Implement user-directed loop distribution/fission.
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.

It interprets the same metadata as the LoopDistribute pass.
2021-09-22 17:28:25 -05:00
Michael Kruse ced20c6672 [Polly] Add -polly-reschedule and -polly-postopts options.
This command line options allow to off parts of the schedule tree optimization pipeline.
2021-09-22 00:18:19 -05:00
Michael Kruse cad9f98a2a [Polly] Don't generate inter-iteration noalias metadata.
This metadata was intended to mark all accesses within an iteration to be pairwise non-aliasing, in this case because every memory of a base pointer is touched (read or write) at most once. This is typical for 'sweeps' over all data. The stated motivation from D30606 is to ensure that unrolled iterations are considered non-aliasing.

Rhe implemention had multiple issues:

 * The structure of the noalias metadata was malformed. D110026 added check in the verifier for this metadata, and the tests were failing since then.

 * This is not true for the outer loops of the BLIS matrix multiplication, where it was being inserted. Each element of A, B, C is accessed multiple times, as often as the loop not used as an index is iterating.

 * Scopes were added to SecondLevelOtherAliasScopeList (used for the !noalias scop list) on-the-fly when another SCEV was seen. This meant that previously visited instructions would not be updated with alias scopes that are only seen later, missing out those SCEVs they should not be aliasing with.

 * Since the !noalias scope list would ideally consists of all other SCEV for this base pointer, we might run quickly into scalability issues. Especially after unrolling there would probably at least once SCEV per instruction and unroll instance.

 * The inter-iteration noalias base pointer was not removed after leaving the loop marked with it, effectively marking everything after it to noalias as well.

A solution I considered was to mark each instruction as non-aliasing with its own scope. The instruction itself would obviously alias itself, but such construction might also be considered invalid. Duplicating the instruction (e.g. due to speculation) would mark the instruction non-aliasing with its clone. I don't want to go into this territory, especially since the original motivation of determining unrolled instances as noalias based on SCEV is the what scev-aa does as well.

This effectively reverts D30606 and D35761.
2021-09-20 22:20:17 -05:00
Michael Kruse c62d9a5ca0 [Polly] Use subtyped isl::schedule_nodes for ScheduleTreeVisitor. NFC.
Change pass-by-const-ref to pass-by-value as objects are recreated
due to custom up-/down-casting anwyway.
2021-08-31 20:54:12 -05:00
Riccardo Mori d3fdbda6b0 [Polly][Isl] Move to the new-polly-generator branch version of isl-noexceptions.h. NFCI
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

With this commit we are moving from the `polly-generator` branch to the `new-polly-generator` branch that is more mantainable and is based on the official C++ interface `cpp-checked.h`.

Changes made:
 - There are now many sublcasses for `isl::ast_node` representing different isl types. Use `isl::ast_node_for`, `isl::ast_node_user`, `isl::ast_node_block` and `isl::ast_node_mark` where needed.
 - There are now many sublcasses for `isl::schedule_node` representing different isl types. Use `isl::schedule_node_mark`, `isl::schedule_node_extension`, `isl::schedule_node_band` and `isl::schedule_node_filter` where needed.
 - Replace the `isl::*::dump` with `dumpIslObj` since the isl dump method is not exposed in the C++ interface.
 - `isl::schedule_node::get_child` has been renamed to `isl::schedule_node::child`
 - `isl::pw_multi_aff::get_pw_aff` has been renamed to `isl::pw_multi_aff::at`
 - The constructor `isl::union_map(isl::union_pw_multi_aff)` has been replaced with the static method `isl::union_map::from()`
 - Replace usages of `isl::val::add_ui` with `isl::val::add`
 - `isl::union_set_list::alloc` is now a constructor
 - All the `isl_size` values are now wrapped inside the class `isl::size` use `isl::size::release` to get the internal `isl_size` value where needed.
 - `isl-noexceptions.h` has been generated by 73f5ed1f4d

No functional change intended.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D107225
2021-08-16 15:53:26 +02:00
Tarindu Jayatilaka 7a797b2902 Take OptimizationLevel class out of Pass Builder
Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D107025
2021-07-29 21:57:23 -07:00
Riccardo Mori d5ee355f89 [Polly][Isl] Use isl::union_map::unite() instead of isl::union_map::add_map(). NFC
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Use `isl::union_map::unite()` instead of `isl::union_map::add_map()`
 - `isl-noexceptions.h` has been generated by this 3f43ae29fa

Depends on D106059

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D106061
2021-07-19 12:11:00 +02:00
Riccardo Mori bad3ebbaae [Polly][Isl] Stop generating isl::union_{set,map} from isl::space. NFC
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Stop generating `isl::union_set` and isl::union_map` from `isl::space` and instead generate them from `isl::ctx`
 - Disable clang-format on `isl-noexceptions.h`
 - Removed `isl::union_{set,map}` generator from `isl::space` from `isl-noexceptions.h`
 - `isl-noexceptions.h` has been generated by this 87c3413b6f

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D106059
2021-07-19 11:24:53 +02:00
Riccardo Mori 0813bd1696 [Polly][Isl] Use isl::*::ctx instead of isl::*::get_ctx. NFC
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Use `isl::*::ctx()` instead of `isl::*::get_ctx()` (for example `isl::space::ctx()` instead of `isl::space::get_ctx()`)
 - Add `isl::` namespace in front of isl types to avoid confusion (for example `isl::space::ctx` and `isl::ctx`
 - `isl-noexceptions.h` has been generated by this b64e33c62d

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D105691
2021-07-09 21:14:14 +02:00
patacca b55aedd0b8 [Polly][Isl] Use isl::union_set::unite() instead of isl::union_set::add_set(). NFC
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Use `isl::union_set::unite()` instead of `isl::union_set::add_set()`
 - `isl-noexceptions.h` has been generated by this 390c44982b

Depends on D104994

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D105444
2021-07-07 16:26:55 +02:00
patacca f482497c38 [Polly][Isl] Use isl::set::tuple_dim, isl::map::domain_tuple_dim and isl::map::range_tuple_dim. NFC
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Use `isl::set::tuple_dim` instead of `isl::set::dim` and `isl::set::n_dim`
 - Use `isl::map::domain_tuple_dim` instead of `isl::map::dim`
 - Use `isl::map::range_tuple_dim` instead of `isl::map::dim`
 - isl-noexceptions.h has been generated by this 45576e1b42

Note that not all the usage of `isl::{set,map}::dim` where replaced

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D104994
2021-07-06 11:20:45 +02:00
Michael Kruse e2d4b02404 [Polly][ScopInliner] Indicate if the IR has changed.
Return true to indicate that the IR has changed if the nested pass
manager has changed it.

Fixes the ScopInliner tests in the LLVM_ENABLE_EXPENSIVE_CHECKS=ON
configuration.

Thanks to Alexandre Ganea for reporting.
2021-06-24 15:44:39 -05:00
patacca 7c7978a122 [Polly][Isl] Removing explicit operator bool() from isl C++ bindings. NFC.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Removing explicit operator bool() from all the classes in the isl C++ bindings.
 - Replace each call to operator bool() to method `is_null()`.
 - isl-noexceptions.h has been generated by this 27396daac5

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D103976
2021-06-11 14:44:24 +02:00
Michael Kruse a56bd7dec8 [Polly][Matmul] Re-pack A in every iteration.
Packed_A must be copied repeatedly, not just for the first iteration of
the outer tile.

This fixes llvm.org/PR50557
2021-06-09 15:19:52 -05:00
patacca 9b41d0958e [Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.
[Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.

This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Removed `std::nullptr_t` constructor from all the classes in the isl C++ bindings.
 - `isl-noexceptions.h` has been generated by this a7e00bea38

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D103751
2021-06-08 23:46:28 +02:00
patacca f60ea691a9 Revert "[Polly][Isl] Removing nullptr constructor from C++ bindings. NFC."
This reverts commit be5e2fc7bf.

This introduced a building error for polly. https://lab.llvm.org/buildbot#builders/10/builds/4951
2021-06-08 17:12:10 +02:00
patacca be5e2fc7bf [Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.
[Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.

This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.

Changes made:
 - Removed `std::nullptr_t` constructor from all the classes in the isl C++ bindings.
 - `isl-noexceptions.h` has been generated by this a7e00bea38

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D103751
2021-06-08 15:28:20 +02:00
Michael Kruse d123e983b3 [Polly] Move MatMul optimization into its own file. NFC.
Functions shared between generalized matrix-multiplication optimization
and other post-reschedule optimizations (tiling, prevect) are moved into
the schedule tree transformation utility ScheduleTreeTransform.
2021-06-04 23:22:30 -05:00
Michael Kruse 5aafcb2b44 [Polly] Add support for -polly-position=early with the NPM.
This required support for the canonicalization passes, inlcuding
porting RewriteByReferenceParams to the NPM.

For some reason, the legacy pass pipeline with -polly-position=early did
not run the CodePreparation pass. This was fixed as well.
2021-05-14 12:55:03 -05:00
Michael Kruse 286677870b [Polly][ManualOpt] Match interpretation of unroll metadata to LoopUnrolls's.
We previously had a different interpretation of unroll transformation
attributes than how LoopUnroll interpreted it. In particular,
llvm.loop.unroll.enable was needed explicitly to enable it and disabling
metadata was ignored.
Additionally, it required that either full unrolling or an unroll factor
to be specified or fail otherwise. An unroll factor is still required,
but the transformation is ignored with the hope that LoopUnroll is going
to apply the unrolling, since Polly currently does not implement an
heuristic.

Fixes llvm.org/PR50109
2021-04-24 04:30:19 -05:00
David Blaikie 30df6d5d6a Preprocessor conditionalize some assert-only functions to suppress -Wunused-function 2021-04-03 14:03:43 -07:00
Fangrui Song 927050af53 [Polly] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off builds 2021-03-24 19:56:43 -07:00
Nikita Popov 7d91d81c6b [polly] Fix build
This produced a compile error with GCC:

llvm-project/polly/lib/Transform/ScheduleOptimizer.cpp:1220:49: error: cannot convert ‘bool’ to ‘llvm::TargetTransformInfo::RegisterKind’
 1220 |     RegisterBitwidth = TTI->getRegisterBitWidth(true);
2021-03-24 17:46:46 +01:00
Michael Kruse 8796451d6e [Polly] Port DeadCodeElim to the NewPM. 2021-03-24 01:01:29 -05:00
Michael Kruse f51427afb5 [Polly][Unroll] Fix unroll_double test.
We enumerated the cross product Domain x Scatter, but sorted only be the
scatter key. In case there are are multiple statement instances per
scatter value, the order between statement instances of the same loop
iteration was undefined.

Propertly enumerate and sort only by the scatter value, and group the
domains using the scatter dimension again.

Thanks to Leonard Chan for the report.
2021-03-16 09:00:42 -05:00
Michael Kruse 3f170eb197 [Polly][Optimizer] Apply user-directed unrolling.
Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule.

While not that useful by itself (there already is an unroll pass), it introduces mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. Since ISL's rescheduling would discard the manual transformations and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default.

This does not influence the SCoP detection heuristic. As a consequence, loop that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter.

Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the one supported by clang's LoopHint. See the `unroll_double.ll` test as example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088.

Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks.

Reviewed By: sebastiankreutzer

Differential Revision: https://reviews.llvm.org/D97977
2021-03-15 13:05:39 -05:00
Michael Kruse ab0556bb20 [Polly] Regenerate isl-noexceptions.h.
Regenerate the C++ wrapper header from the current isl version's
headers.

The most notable change is that some dimension sizes are represented by
an isl_size (instead of unsigned), which is a signed int. Additionally,
some function may return -1 in case of an error which already had been
fixed in the past. The C++ may no return -1 instead of UINT_MAX which
caused the problems.

Some types in Polly had been changed from unsigned to isl_size
(that were not already auto) and some loops/comparision had to be
changed to avoid unsigned/signed comparison warnings.
2021-02-14 19:17:54 -06:00
Michael Kruse f0f5afc4dd [Polly] Remove unused declaration. NFC. 2021-02-12 02:20:31 -06:00
Michael Kruse 7387f33bfe [Polly] Hide IslScheduleOptimizer implementation from header. NFC.
These are implementation details of the IslScheduleOptimizer pass
implementation and not use anywhere else. Hence, we can move them to the
cpp file and into an anonymous namespace.

Only getPartialTilePrefixes is, aside from the pass itself, used
externally (by the ScheduleOptimizerTest) and moved into the polly
namespace.
2021-02-11 21:02:29 -06:00
Michael Kruse 23753c6088 [Polly] Hide Simplify implementation from header. NFC.
Move SimplifiyVisitor from Simplify.h to Simplify.cpp. It is not
relevant for applying the pass in either the NewPM or the legacyPM.
Rename it to SimplifyImpl to account for that.

This is possible due its state not being necessary to be preserved
between runs and thefore SimplifyImpl not needed to be held in the
pass object. Instead, SimplifyImpl is only instatiated for the
current Scop. In the NewPM as a function-local variable, and in the
legacy PM inside a llvm::Optional object because the state must be
preserved between the printScop (invoked by opt -analyze) and the most
recent runOnScop calls.
2021-02-10 22:11:52 -06:00
Michael Kruse 91ca9adc9e [Polly] Avoid "using namespace llvm" in public headers. NFC.
"using namespace" pollutes the namespace of every file that includes
such a header and universally considered a bad thing. Even the variant

    namespace polly {
      using namespace llvm;
    }

(previously used by LoopGenerators.h) imports more symbols than the file
is in control of. The header may include a fixed set of files from LLVM,
but the header itself may by be included together with other headers
from LLVM. For instance, LLVM's MemorySSA.h and Polly's ScopInfo.h both
declare a class 'MemoryAccess' which may conflict.

Instead of prefixing everything in Polly's header files, this patch adds
'using' statements to import only the symbols that are actually
referenced in Polly. This approach is also used by MLIR to import
commonly used symbols into the mlir namespace.

This patch also puts the symbols declared in IslNodeBuilder.h into the
Polly namespace to also be able to use the imported symbols.
2021-02-10 20:58:33 -06:00
Michael Kruse 13f758a805 [Polly] Improve Simplify pass PM integration.
1. LegacyPM: Rename SimplifyLegacyPass to SimplifyWrapperPass.
2. LegacyPM: Complete create/init functions in LinkAllPasses.h
3. NewPM: Only invalidate non-Scop passes if changed.
4. NewPM: Add to default pass pipeline.
5. NewPM: Print -analyze header for each print<polly-simplify>
2021-02-09 23:56:21 -06:00
Michael Kruse e200df952b [Polly] Port IslScheduleOptimizer to the NewPM. 2021-02-09 23:56:21 -06:00
Michael Kruse 7903d594ea [Polly] Port DeLICM to the NewPM. 2021-02-09 23:56:19 -06:00
Michael Kruse 4c64d8ee3a [Polly] Port ForwardOpTree to the NewPM. 2021-02-09 23:56:19 -06:00
Michael Kruse 3b9677e1ec [Polly] Track defined behavior for PHI predecessor computation.
ZoneAlgorithms's computePHI relies on being provided with consistent a
schedule to compute the statement prodecessors of a statement containing
PHINodes. Otherwise unexpected results such as PHI nodes with multiple
predecessors can occur which would result in problems in the
algorithms expecting consistent data.

In the added test case, statement instances are scrubbed from the
SCoP their execution would result in undefined behavior (Due to a nsw
overflow). As already being undefined behavior in LLVM-IR, neither
AssumedContext nor InvalidContext are updated, giving computePHI no
means to avoid these cases.

Intoduce a new SCoP property, the DefinedBehaviorContext, that among
the runtime-checked conditions, also tracks the assumptions not needing
a runtime check, in particular those affecting the assumed control flow.
This replaces the manual combination of the 3 other contexts that was
already done in computePHI and setNewAccessRelation. Currently, the only
additional assumption is that loop induction variables will nsw flag for
not wrap, but potentially more can be added. Use in
hasFeasibleRuntimeContext, isl::ast_build and gisting are other
potential uses.

To limit computational complexity, the DefinedBehaviorContext is not
availabe if it grows too large (atm hardcoded to 8 disjuncts).

Possible other fixes include bailing out in computePHI when
inconsistencies are detected, choose an arbitrary value for inconsistent
cases (since it is undefined behavior anyways), or make the code
receiving the result from ComputePHI handle inconsistent data. All of
them reduce the quality of implementation having to bail out more often
and disabling the ability to assert on actually wrong results.

This fixes llvm.org/PR48783.
2021-01-23 13:03:49 -06:00
Michael Kruse fc115f2e73 [Polly] Move SimplifyVisitor into polly namespace.
Declarations in headers should not be in the anonymous
namespace. Compilers also warn about the use of
<anon namespace>::SimplifyVisitor as a public field in
polly::SimplifyPass and polly::SimplifyPrinterPass.
2020-11-16 18:59:08 -06:00
Michael Kruse 243511a24e [Polly] Fix memory leak. 2020-11-12 20:04:17 -06:00
Michael Kruse c8a0e27cfb [Polly][OpTree] Fix mid-processing change of access kind.
Operand tree forwarding can cause the change of an access kind; in
particular change from a scalar kind to an array kind if the scalar
dependency is not necessary. Such an access cannot and doesn't need to
be forwarded anymore.

Fixes llvm.org/PR48034
2020-11-11 16:21:48 -06:00
Michael Kruse c1cf51e777 [Polly][OpTree] Better report applied changes.
Print to dbgs() any taken action.

Also, read-only scalars do not require any action unless
-polly-analyze-read-only-scalars=true is used. Better refect this by
using ForwardingAction::triviallyForwardable and thus not bumping the
statistics.
2020-11-11 16:21:48 -06:00
Fangrui Song 98031b664c [polly] Fix -Wunused-lambda-capture and -Wunused-variable 2020-11-02 20:35:26 -08:00
Fangrui Song 2213a354b9 [Polly] Delete unused lambda capture after 7175cffb21 2020-10-20 18:34:52 -07:00
Michael Kruse 7175cffb21 [Polly] Reuse multiple uses in operand tree.
Recursively traversing the operand tree leads to an exponential blowup
if instructions are used multiple times due to every path leading to an
additional copy of the instructions after forwarding. This problem was
marked as a TODO in the code and was reported as a bug in llvm.org/PR47340.

Fix by caching already visited instructions and returning the cached
version when already visited. Instead of calling forwardTree() twice,
return a ForwardingAction structure that contains a lambda which will
carry-out the forwarding when requested. The lambdas are executed in
reverse-postorder to mimic the previous recursive calls unless there
is a reuse.

Fixes llvm.org/PR47340
2020-10-20 18:05:35 -05:00