The new pass manager does not allow adding module passes at the
-polly-position=before-vectorizer extension point. Introduce a
DumpFunctionPass that dumps only current function. In contrast to the
legacy pass manager's -polly-dump-before, each function will be dumped
into its own file. -polly-dump-before-file is still not supported.
The DumpFunctionPass uses llvm::CloneModule to copy the current function
into a new module and then write it into a file.
Compilation of the file insn-attrtab.c of the SPEC CPU 2017 502.gcc_r
benchmark takes excessive time (> 30min) with Polly enabled. Most time
is spent in the isErrorBlock function querying the DominatorTree.
The isErrorBlock is invoked redundantly over the course of ScopDetection
and ScopBuilder. This patch introduces a caching mechanism for its
result.
Instead of a free function, isErrorBlock is moved to ScopDetection where
its cache map resides. This also means that many functions directly or
indirectly calling isErrorBlock are not "const" anymore. The
DetectionContextMap was marked as "mutable", but IMHO it never should
have been since it stores the detection result.
502.gcc_r only takes excessive time with the new pass manager. The
reason seeams to be that it invalidates the ScopDetection analysis more
often than the legacy pass manager, for unknown reasons.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
With this commit we are moving from the `polly-generator` branch to the `new-polly-generator` branch that is more mantainable and is based on the official C++ interface `cpp-checked.h`.
Changes made:
- There are now many sublcasses for `isl::ast_node` representing different isl types. Use `isl::ast_node_for`, `isl::ast_node_user`, `isl::ast_node_block` and `isl::ast_node_mark` where needed.
- There are now many sublcasses for `isl::schedule_node` representing different isl types. Use `isl::schedule_node_mark`, `isl::schedule_node_extension`, `isl::schedule_node_band` and `isl::schedule_node_filter` where needed.
- Replace the `isl::*::dump` with `dumpIslObj` since the isl dump method is not exposed in the C++ interface.
- `isl::schedule_node::get_child` has been renamed to `isl::schedule_node::child`
- `isl::pw_multi_aff::get_pw_aff` has been renamed to `isl::pw_multi_aff::at`
- The constructor `isl::union_map(isl::union_pw_multi_aff)` has been replaced with the static method `isl::union_map::from()`
- Replace usages of `isl::val::add_ui` with `isl::val::add`
- `isl::union_set_list::alloc` is now a constructor
- All the `isl_size` values are now wrapped inside the class `isl::size` use `isl::size::release` to get the internal `isl_size` value where needed.
- `isl-noexceptions.h` has been generated by 73f5ed1f4d
No functional change intended.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D107225
Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D107025
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Use `isl::union_map::unite()` instead of `isl::union_map::add_map()`
- `isl-noexceptions.h` has been generated by this 3f43ae29fa
Depends on D106059
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D106061
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Stop generating `isl::union_set` and isl::union_map` from `isl::space` and instead generate them from `isl::ctx`
- Disable clang-format on `isl-noexceptions.h`
- Removed `isl::union_{set,map}` generator from `isl::space` from `isl-noexceptions.h`
- `isl-noexceptions.h` has been generated by this 87c3413b6f
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D106059
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Use `isl::*::ctx()` instead of `isl::*::get_ctx()` (for example `isl::space::ctx()` instead of `isl::space::get_ctx()`)
- Add `isl::` namespace in front of isl types to avoid confusion (for example `isl::space::ctx` and `isl::ctx`
- `isl-noexceptions.h` has been generated by this b64e33c62d
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D105691
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Use `isl::union_set::unite()` instead of `isl::union_set::add_set()`
- `isl-noexceptions.h` has been generated by this 390c44982b
Depends on D104994
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D105444
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Use `isl::set::tuple_dim` instead of `isl::set::dim` and `isl::set::n_dim`
- Use `isl::map::domain_tuple_dim` instead of `isl::map::dim`
- Use `isl::map::range_tuple_dim` instead of `isl::map::dim`
- isl-noexceptions.h has been generated by this 45576e1b42
Note that not all the usage of `isl::{set,map}::dim` where replaced
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D104994
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removing method `to_str()` from all the classes in the isl C++ bindings.
- Overload method `stringFromIslObj()` so it accepts isl C++ objects.
- To keep backward compatibility `stringFromIslObj()` now accepts a value that is returned if the isl C object is `null` or doesn't have a string representation (by default it's an empty string). In some cases it's better to have the string "null" instead of an empty string.
- isl-noexceptions.h has been generated by this d33ec3a3bb
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D104211
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removing explicit operator bool() from all the classes in the isl C++ bindings.
- Replace each call to operator bool() to method `is_null()`.
- isl-noexceptions.h has been generated by this 27396daac5
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D103976
[Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removed `std::nullptr_t` constructor from all the classes in the isl C++ bindings.
- `isl-noexceptions.h` has been generated by this a7e00bea38
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D103751
[Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removed `std::nullptr_t` constructor from all the classes in the isl C++ bindings.
- `isl-noexceptions.h` has been generated by this a7e00bea38
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D103751
Avoid the warning
/polly/lib/Support/RegisterPasses.cpp:833:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
default:
^
since all cases are now handled.
Thanks to Luke Benes for reporting.
Only supported with -polly-position=early. Unfortunately, the
extension point callpack for VectorizerStart only passes a
FunctionPassManager, making it impossible to add a module pass.
This required support for the canonicalization passes, inlcuding
porting RewriteByReferenceParams to the NPM.
For some reason, the legacy pass pipeline with -polly-position=early did
not run the CodePreparation pass. This was fixed as well.
Printing pass manager invocations is fairly verbose and not super
useful.
This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.
This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D101797
We previously had a different interpretation of unroll transformation
attributes than how LoopUnroll interpreted it. In particular,
llvm.loop.unroll.enable was needed explicitly to enable it and disabling
metadata was ignored.
Additionally, it required that either full unrolling or an unroll factor
to be specified or fail otherwise. An unroll factor is still required,
but the transformation is ignored with the hope that LoopUnroll is going
to apply the unrolling, since Polly currently does not implement an
heuristic.
Fixes llvm.org/PR50109
Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule.
While not that useful by itself (there already is an unroll pass), it introduces mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. Since ISL's rescheduling would discard the manual transformations and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default.
This does not influence the SCoP detection heuristic. As a consequence, loop that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter.
Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the one supported by clang's LoopHint. See the `unroll_double.ll` test as example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088.
Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks.
Reviewed By: sebastiankreutzer
Differential Revision: https://reviews.llvm.org/D97977
This reverts commit 329aeb5db4,
and relands commit 61f006ac65.
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
Regenerate the C++ wrapper header from the current isl version's
headers.
The most notable change is that some dimension sizes are represented by
an isl_size (instead of unsigned), which is a signed int. Additionally,
some function may return -1 in case of an error which already had been
fixed in the past. The C++ may no return -1 instead of UINT_MAX which
caused the problems.
Some types in Polly had been changed from unsigned to isl_size
(that were not already auto) and some loops/comparision had to be
changed to avoid unsigned/signed comparison warnings.
The description of the -polly switch stated that it was only enabled
with -O3. This was a lie, the optimization level was ignored. Only at
-O0 Polly was not added to the pass pipeline because the pass builder,
but only because the extension points were not triggered.
In the NewPM, the VectorizerStart extensions point is actually trigger
even with -O0 which leads to the following crash:
Assertion `Level != OptimizationLevel::O0 && "Must request optimizations!"' failed.
We sanitize the optimization levels using the following rules for both
pass mangers:
1. Only enable Polly if optimizing at all (-O1, -O2 or -O3).
2. Do not enable Polly when optimizing for size.
3. Ignore the optimization level for diagnostic passes (printer, viewer
or JScop-exporter).
4. If only diagnostic passes enabled, skip the code-generation.
5. Fix the description of the -polly command line option.
Even though it has some oddities, both pipelines should be as similar as
possible. Also use report_fatal_error instead of assertions to ensure a
proper failure in release builds for unsupported options.
This finalizes the patch serious to make Polly run in the default
configuration when using the NewPM by default.
In particular, print the ast with -debug-only=polly-ast, print a
per-scop header with print<polly-ast> and force-add the analysis with
-polly-code-generation=ast.
The pass-instrumentation pass is implicitly execute by the NewPM
whenever a new analysis runs. Not registering it will cause the crash
whenever a scop pass requests an analysis.
For instance this is the case for the IstAstAnalysis requesting the
DependenceAnalsis result.
In addition to that regression tests should not test the intire pass
pipeline (unless they are testing the pipeline itself), the Polly-ACC
currently does not support the new pass manager. If enabled by default,
such tests will therefore fail.
Use the -polly-gpu-runtime and -polly-gpu-arch options also as default
values for the PPCGCodeGeneration pass. This requires to move the option
to be moved from the pipeline-building Register passes to the
PPCGCodeGeneration implementation.
Fixes the spir-typesize.ll buildbot fail.
ZoneAlgorithms's computePHI relies on being provided with consistent a
schedule to compute the statement prodecessors of a statement containing
PHINodes. Otherwise unexpected results such as PHI nodes with multiple
predecessors can occur which would result in problems in the
algorithms expecting consistent data.
In the added test case, statement instances are scrubbed from the
SCoP their execution would result in undefined behavior (Due to a nsw
overflow). As already being undefined behavior in LLVM-IR, neither
AssumedContext nor InvalidContext are updated, giving computePHI no
means to avoid these cases.
Intoduce a new SCoP property, the DefinedBehaviorContext, that among
the runtime-checked conditions, also tracks the assumptions not needing
a runtime check, in particular those affecting the assumed control flow.
This replaces the manual combination of the 3 other contexts that was
already done in computePHI and setNewAccessRelation. Currently, the only
additional assumption is that loop induction variables will nsw flag for
not wrap, but potentially more can be added. Use in
hasFeasibleRuntimeContext, isl::ast_build and gisting are other
potential uses.
To limit computational complexity, the DefinedBehaviorContext is not
availabe if it grows too large (atm hardcoded to 8 disjuncts).
Possible other fixes include bailing out in computePHI when
inconsistencies are detected, choose an arbitrary value for inconsistent
cases (since it is undefined behavior anyways), or make the code
receiving the result from ComputePHI handle inconsistent data. All of
them reduce the quality of implementation having to bail out more often
and disabling the ability to assert on actually wrong results.
This fixes llvm.org/PR48783.
to Pass.h.
In some compiler passes like SampleProfileLoaderPass, we want to know which
LTO/ThinLTO phase the pass is in. Currently the phase is represented in enum
class PassBuilder::ThinLTOPhase, so it is only available in PassBuilder and
it also cannot represent phase in full LTO. The patch extends it to include
full LTO phases and move it from PassBuilder.h to Pass.h, then it is much
easier for PassBuilder to communiate with each pass about current LTO phase.
Differential Revision: https://reviews.llvm.org/D94613
MemoryAccess::setNewAccessRelation() in assert-builds checks whether the
access relation for a READ has a memory location for every instance of
the domain. Otherwise, we would not have value to load from. That check
already considered that instances outside the Scop's context do not
matter since they are never executed (or would be undefined behavior).
In this patch also take instances of the InvalidContext into account,
as these can also be assumed to never occur. InvalidContext was
introduced to avoid the computational complexity of subtracting
restrictions from the AssumedContext. However, this additional check in
setNewAccessRelation is only done in assert-builds.
The assertion case with an InvalidContext may occur with DeLICM on a
conditionally infinite loops, as it is the case in the following code:
for (int i = 0; i < n; i+=b)
vreg = ...;
*Dest = vreg;
The loop is infinite when b=0, and [b] -> { : b = 0 } is part of the
InvalidContext. When DeLICM tries to map the memory for %vreg to *Dest,
there is no store instance that uses the value of vreg when b = 0, hence
no location to map it to. However, the case is irrelevant since Polly's
runtime condition check ensures that this is never case.
Fixes llvm.org/PR48445
And use it to model LLVM IR's `ptrtoint` cast.
This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)
As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
and their operands, instead of treating them as `SCEVUnknown`
* It should help with initial problem of PR46786 - this should eventually allow us
to not loose pointer-ness of an expression in more cases
As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.
But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89456
This removes "VerifyEachPass" parameters from a lot of functions which is nice.
Don't verify after special passes or VerifierPass.
This introduces verification on loop and cgscc passes, verifying the corresponding function/module.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D88764
This is in preparation for supporting -debugify-each, which adds a debug
info pass before and after each pass.
Switch VerifyEach to use this.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D88107