Remove uses of to-be-deprecated API. In cases where the correct
element type was not immediately obvious to me, fall back to
explicit getPointerElementType().
Remove uses of to-be-deprecated API.
Unfortunately this one mostly just makes the use of
getPointerElementType() explicit, as the correct type to use
wasn't immediately available (deriving it from QualType is left
as an excercise to the reader).
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Use `isl::*::ctx()` instead of `isl::*::get_ctx()` (for example `isl::space::ctx()` instead of `isl::space::get_ctx()`)
- Add `isl::` namespace in front of isl types to avoid confusion (for example `isl::space::ctx` and `isl::ctx`
- `isl-noexceptions.h` has been generated by this b64e33c62d
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D105691
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Use `isl::union_set::unite()` instead of `isl::union_set::add_set()`
- `isl-noexceptions.h` has been generated by this 390c44982b
Depends on D104994
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D105444
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Use `isl::set::tuple_dim` instead of `isl::set::dim` and `isl::set::n_dim`
- Use `isl::map::domain_tuple_dim` instead of `isl::map::dim`
- Use `isl::map::range_tuple_dim` instead of `isl::map::dim`
- isl-noexceptions.h has been generated by this 45576e1b42
Note that not all the usage of `isl::{set,map}::dim` where replaced
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D104994
Return true to indicate that the IR has changed if the nested pass
manager has changed it.
Fixes the ScopInliner tests in the LLVM_ENABLE_EXPENSIVE_CHECKS=ON
configuration.
Thanks to Alexandre Ganea for reporting.
Polly uses algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
- Refactoring the method `IslAstInfo::getBuild()`
- `IslAstInfo::IslAstUserPayload.Build` now uses C++ types instead of C types
- Removing destructor of `IslAstInfo::IslAstUserPayload`
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D104370
Commit 4c7f820b2b changed the llvm.powi intrinsic to support
different 'int' sizes for the exponent. That happened to break
the IntrinsicToLibdeviceFunc mapping in PPCGCodeGeneration, which
obviously should have been updated as part of commit 4c7f820b2b
(https://reviews.llvm.org/D99439).
The shortcoming was found by buildbots that use
-DPOLLY_ENABLE_GPGPU_CODEGEN=ON
This patch should fixup the problem.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removing method `to_str()` from all the classes in the isl C++ bindings.
- Overload method `stringFromIslObj()` so it accepts isl C++ objects.
- To keep backward compatibility `stringFromIslObj()` now accepts a value that is returned if the isl C object is `null` or doesn't have a string representation (by default it's an empty string). In some cases it's better to have the string "null" instead of an empty string.
- isl-noexceptions.h has been generated by this d33ec3a3bb
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D104211
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removing explicit operator bool() from all the classes in the isl C++ bindings.
- Replace each call to operator bool() to method `is_null()`.
- isl-noexceptions.h has been generated by this 27396daac5
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D103976
[Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removed `std::nullptr_t` constructor from all the classes in the isl C++ bindings.
- `isl-noexceptions.h` has been generated by this a7e00bea38
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D103751
[Polly][Isl] Removing nullptr constructor from C++ bindings. NFC.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Removed `std::nullptr_t` constructor from all the classes in the isl C++ bindings.
- `isl-noexceptions.h` has been generated by this a7e00bea38
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D103751
Functions shared between generalized matrix-multiplication optimization
and other post-reschedule optimizations (tiling, prevect) are moved into
the schedule tree transformation utility ScheduleTreeTransform.
Avoid the warning
/polly/lib/Support/RegisterPasses.cpp:833:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
default:
^
since all cases are now handled.
Thanks to Luke Benes for reporting.
Only supported with -polly-position=early. Unfortunately, the
extension point callpack for VectorizerStart only passes a
FunctionPassManager, making it impossible to add a module pass.
This required support for the canonicalization passes, inlcuding
porting RewriteByReferenceParams to the NPM.
For some reason, the legacy pass pipeline with -polly-position=early did
not run the CodePreparation pass. This was fixed as well.
Printing pass manager invocations is fairly verbose and not super
useful.
This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.
This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D101797
LazyBlockFrequenceInfoPass, LazyBranchProbabilityInfoPass and
LoopAccessLegacyAnalysis all cache pointers to their nestled required
analysis passes. One need to use addRequiredTransitive to describe
that the nestled passes can't be freed until those analysis passes
no longer are used themselves.
There is still a bit of a mess considering the getLazyBPIAnalysisUsage
and getLazyBFIAnalysisUsage functions. Those functions are used from
both Transform, CodeGen and Analysis passes. I figure it is OK to
use addRequiredTransitive also when being used from Transform and
CodeGen passes. On the other hand, I figure we must do it when
used from other Analysis passes. So using addRequiredTransitive should
be more correct here. An alternative solution would be to add a
bool option in those functions to let the user tell if it is a
analysis pass or not. Since those lazy passes will be obsolete when
new PM has conquered the world I figure we can leave it like this
right now.
Intention with the patch is to fix PR49950. It at least solves the
problem for the reproducer in PR49950. However, that reproducer
need five passes in a specific order, so there are lots of various
"solutions" that could avoid the crash without actually fixing the
root cause.
This is a reapply of commit 3655f0757f, that was reverted in
33ff3c2049 due to problems with assertions in the polly
lit tests. That problem is supposed to be solved by also adjusting
ScopPass to explicitly preserve LazyBlockFrequencyInfo and
LazyBranchProbabilityInfo (it already preserved
OptimizationRemarkEmitter which depends on those lazy passes).
Differential Revision: https://reviews.llvm.org/D100958
We previously had a different interpretation of unroll transformation
attributes than how LoopUnroll interpreted it. In particular,
llvm.loop.unroll.enable was needed explicitly to enable it and disabling
metadata was ignored.
Additionally, it required that either full unrolling or an unroll factor
to be specified or fail otherwise. An unroll factor is still required,
but the transformation is ignored with the hope that LoopUnroll is going
to apply the unrolling, since Polly currently does not implement an
heuristic.
Fixes llvm.org/PR50109
external.c defines stub functions that are never used because of how
Polly uses PPCG. Unfortunately, they are declared as functions without
return values or parameters which does not match their declarations.
Since they are never called, this was usually not a problem, but an LTO
build gets confused with differently declared functions, or in case of
pet_options_args, a global variable declaration that is defined as a
function
Resolve by including the declaring headers in external.c which forces
the declaration and definition to match at compile-time.
This fixes llvm.org/50021
The PollyPPCG library is only needed when POLLY_ENABLE_GPGPU_CODEGEN=ON.
If disabled, the library target is still created, but not linked against
anything.
This change does not add create the PollyPPCG build target if not
needed.
Motivated by llvm.org/PR50021
The isl_id_* have been in used without including the correspodning
isl/id.h header. According to rules in C, a function is defined
implicitly when first used with an assumed int return type (32 bits on
64 bit systems). But the implementation returns a pointer (64 bits on 64
bit systems). Is usually has no consequence because the return value is
stored in a registers that is 64 bits (RAX) and the optimizer does not
truncate its value before using it again as a pointer value. However,
LTO optimizers will be rightfull;y confused.
Fix by including <isl/id.h>
This fixes llvm.org/PR50021
Polly use algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
- Refactoring the following methods of class `IslAst`
- `getAst()` `getRunCondition()` `buildRunCondition()`
- Removed the destructor in favor of the default one
- Change the type of the attribute `IslAst.RunCondition` to `isl::ast_expr`
- Change the type of the attribute `IslAst.Root` to `isl::ast_node`
- Change the order of attributes in class `IslAst` to reflect the data dependencies so that the destructor won't complain
- Refactoring the following methods of class `IslAstInfo`
- `getAst()` `getRunCondition()`
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D100265
Polly use algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
- Refactoring the following methods of class IslAstInfo
- isParallel() isExecutedInParallel() isReductionParallel() getSchedule() getMinimalDependenceDistance() getBrokenReductions()
- Refactoring the following methods of class IslNodeBuilder
- getReferencesInSubtree() getScheduleForAstNode()
- Refactoring function getBrokenReductionsStr()
- Fixed the mismatching function declaration for getScheduleForAstNode()
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D99971
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.
Solution:
This patch adds two new flags
- OF_CRLF which indicates that CRLF translation is used.
- OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.
Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.
So this is the behaviour per platform with my patch:
z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode
Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return
The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
if (Flags & OF_CRLF)
CrtOpenFlags |= _O_TEXT;
```
These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99426
Polly use algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
- Refactoring isInnermost() to take C++ bindings instead of the plain isl C api.
- Addition of manage_copy() when needed to get the reference for the isl_ast_node object
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D99841
This produced a compile error with GCC:
llvm-project/polly/lib/Transform/ScheduleOptimizer.cpp:1220:49: error: cannot convert ‘bool’ to ‘llvm::TargetTransformInfo::RegisterKind’
1220 | RegisterBitwidth = TTI->getRegisterBitWidth(true);
BandAttr markers are added as parents of schedule tree bands. These also
appear as markers its equivalent AST, but a band does not necessarily
corresponds to a loop in this. Iterations may be peeled or the loop
being unrolled (e.g. if it has just one iteration). In such cases it may
happend that there is not loop between a BandAttr marker and the marker
for a loop nested in the former parent band/loop.
Handle the situation by giving priority to the inner marker over the
outer.
Fixes the polly-x86_64-linux-test-suite buildbot.
We enumerated the cross product Domain x Scatter, but sorted only be the
scatter key. In case there are are multiple statement instances per
scatter value, the order between statement instances of the same loop
iteration was undefined.
Propertly enumerate and sort only by the scatter value, and group the
domains using the scatter dimension again.
Thanks to Leonard Chan for the report.
Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule.
While not that useful by itself (there already is an unroll pass), it introduces mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. Since ISL's rescheduling would discard the manual transformations and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default.
This does not influence the SCoP detection heuristic. As a consequence, loop that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter.
Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the one supported by clang's LoopHint. See the `unroll_double.ll` test as example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088.
Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks.
Reviewed By: sebastiankreutzer
Differential Revision: https://reviews.llvm.org/D97977
Polly currently needs to be slowly refactor to use the C++ wrapper objects to handle the reference counters automatically.
I took the function of astScheduleDimIsParallel and refactored it so that it uses the C++ wrapper function as much as possible.
There are some problems with the IsParallel since it expects the C objects, so the C++ wrapper functions must be .release() and .get() first before they are able to be used with IsParallel.
When checking the ReductionDependencies Parallelism with the Build's Schedule, I opted to keep the union map as a C object rather than a C++ object. Eventually, changes will need to be made to IsParallel to refactor it to the C++ wrappers. When this is done, this function will also need to be slightly refactored to not use the C object.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98455
This reverts commit 329aeb5db4,
and relands commit 61f006ac65.
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
This removes some (but not all) uses of type-less CreateGEP()
and CreateInBoundsGEP() APIs, which are incompatible with opaque
pointers.
There are a still a number of tricky uses left, as well as many
more variation APIs for CreateGEP.
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
Emit llvm.loop.parallel_accesses metadata instead of
llvm.mem.parallel_loop_access. The latter is deprecated because it
assumes that LoopIDs are persistent, which they are not.
We also emit parallel access metadata for all surrounding parallel
loops, not just the innermost parallel.
Polly use algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
* Refabricating IsOutermostParallel() to take C++ bindings instead of reference-counting in C isl lib.
* Addition of manage_copy() to be used as reference for C objects instead of IsOutermostParallel()
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D97751
Currently, the IslAst library is a C library that would be incompatible with the rest of the LLVM because LLVM is written in C++.
I took one function, IsInnermostParallel(), and refactored it so that it would take the C++ wrapper object instead of using reference counters with the C ISL library. As well, all the references that use IsInnermostParallel() will use manage_copy() since they are still expecting the C object.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D97425
Allow users to use a non-system version of perl, python and awk, which is useful
in certain package managers.
Reviewed By: JDevlieghere, MaskRay
Differential Revision: https://reviews.llvm.org/D95119
Regenerate the C++ wrapper header from the current isl version's
headers.
The most notable change is that some dimension sizes are represented by
an isl_size (instead of unsigned), which is a signed int. Additionally,
some function may return -1 in case of an error which already had been
fixed in the past. The C++ may no return -1 instead of UINT_MAX which
caused the problems.
Some types in Polly had been changed from unsigned to isl_size
(that were not already auto) and some loops/comparision had to be
changed to avoid unsigned/signed comparison warnings.
DetectionContext objects are stored as values in a DenseMap. When the
DenseMap reaches its maximum load factor, it is resized and all its
objects moved to a new memory allocation. Unfortunately Scop object have
a reference to its DetectionContext. When the DenseMap resizes, all the
DetectionContexts reference now point to invalid memory, even if caused
by an unrelated DetectionContext.
Even worse, NewPM's ScopPassManager called isMaxRegionInScop with the
Verify=true parameter before each pass. This caused the old
DetectionContext to be removed an a new on created and re-verified.
Of course, the Scop object was already created pointing to the old
DetectionContext. Because the new DetectionContext would
usually be stored at the same position in the DenseMap, the reference
would usually reference the new DetectionContext of the same Region.
Usually.
If not, the old position still points to memory in the DenseMap
allocation (unless also a resizing occurs) such that tools like Valgrind
and AddressSanitizer would not be able to diagnose this.
Instead of storing the DetectionContext inside the DenseMap, use a
std::unique_ptr to a DetectionContext allocation, i.e. it will not move
around anymore. This also allows use to remove the very strange
DetectionContext(const DetectionContext &&)
copy/move(?) constructor. DetectionContext objects now are neither
copied nor moved.
As a result, every re-verification of a DetectionContext will use a new
allocation. Therefore, once a Scop object has been created using a
DetectionContext, it must not be re-verified (the Scop data structure
requires its underlying Region to not change before code generation
anyway). The NewPM may call isMaxRegionInScop only with
Validate=false parameter.
The description of the -polly switch stated that it was only enabled
with -O3. This was a lie, the optimization level was ignored. Only at
-O0 Polly was not added to the pass pipeline because the pass builder,
but only because the extension points were not triggered.
In the NewPM, the VectorizerStart extensions point is actually trigger
even with -O0 which leads to the following crash:
Assertion `Level != OptimizationLevel::O0 && "Must request optimizations!"' failed.
We sanitize the optimization levels using the following rules for both
pass mangers:
1. Only enable Polly if optimizing at all (-O1, -O2 or -O3).
2. Do not enable Polly when optimizing for size.
3. Ignore the optimization level for diagnostic passes (printer, viewer
or JScop-exporter).
4. If only diagnostic passes enabled, skip the code-generation.
5. Fix the description of the -polly command line option.
These are implementation details of the IslScheduleOptimizer pass
implementation and not use anywhere else. Hence, we can move them to the
cpp file and into an anonymous namespace.
Only getPartialTilePrefixes is, aside from the pass itself, used
externally (by the ScheduleOptimizerTest) and moved into the polly
namespace.
This reverts commit b7d870eae7 and the
subsequent fix "[Polly] Fix build after AssumptionCache change (D96168)"
(commit e6810cab09).
It caused indeterminism in the output, such that e.g. the
polly-x86_64-linux buildbot failed accasionally.
Move SimplifiyVisitor from Simplify.h to Simplify.cpp. It is not
relevant for applying the pass in either the NewPM or the legacyPM.
Rename it to SimplifyImpl to account for that.
This is possible due its state not being necessary to be preserved
between runs and thefore SimplifyImpl not needed to be held in the
pass object. Instead, SimplifyImpl is only instatiated for the
current Scop. In the NewPM as a function-local variable, and in the
legacy PM inside a llvm::Optional object because the state must be
preserved between the printScop (invoked by opt -analyze) and the most
recent runOnScop calls.
"using namespace" pollutes the namespace of every file that includes
such a header and universally considered a bad thing. Even the variant
namespace polly {
using namespace llvm;
}
(previously used by LoopGenerators.h) imports more symbols than the file
is in control of. The header may include a fixed set of files from LLVM,
but the header itself may by be included together with other headers
from LLVM. For instance, LLVM's MemorySSA.h and Polly's ScopInfo.h both
declare a class 'MemoryAccess' which may conflict.
Instead of prefixing everything in Polly's header files, this patch adds
'using' statements to import only the symbols that are actually
referenced in Polly. This approach is also used by MLIR to import
commonly used symbols into the mlir namespace.
This patch also puts the symbols declared in IslNodeBuilder.h into the
Polly namespace to also be able to use the imported symbols.
Even though it has some oddities, both pipelines should be as similar as
possible. Also use report_fatal_error instead of assertions to ensure a
proper failure in release builds for unsupported options.
This finalizes the patch serious to make Polly run in the default
configuration when using the NewPM by default.
In particular, print the ast with -debug-only=polly-ast, print a
per-scop header with print<polly-ast> and force-add the analysis with
-polly-code-generation=ast.
The pass-instrumentation pass is implicitly execute by the NewPM
whenever a new analysis runs. Not registering it will cause the crash
whenever a scop pass requests an analysis.
For instance this is the case for the IstAstAnalysis requesting the
DependenceAnalsis result.
In addition to that regression tests should not test the intire pass
pipeline (unless they are testing the pipeline itself), the Polly-ACC
currently does not support the new pass manager. If enabled by default,
such tests will therefore fail.
Use the -polly-gpu-runtime and -polly-gpu-arch options also as default
values for the PPCGCodeGeneration pass. This requires to move the option
to be moved from the pipeline-building Register passes to the
PPCGCodeGeneration implementation.
Fixes the spir-typesize.ll buildbot fail.
ZoneAlgorithms's computePHI relies on being provided with consistent a
schedule to compute the statement prodecessors of a statement containing
PHINodes. Otherwise unexpected results such as PHI nodes with multiple
predecessors can occur which would result in problems in the
algorithms expecting consistent data.
In the added test case, statement instances are scrubbed from the
SCoP their execution would result in undefined behavior (Due to a nsw
overflow). As already being undefined behavior in LLVM-IR, neither
AssumedContext nor InvalidContext are updated, giving computePHI no
means to avoid these cases.
Intoduce a new SCoP property, the DefinedBehaviorContext, that among
the runtime-checked conditions, also tracks the assumptions not needing
a runtime check, in particular those affecting the assumed control flow.
This replaces the manual combination of the 3 other contexts that was
already done in computePHI and setNewAccessRelation. Currently, the only
additional assumption is that loop induction variables will nsw flag for
not wrap, but potentially more can be added. Use in
hasFeasibleRuntimeContext, isl::ast_build and gisting are other
potential uses.
To limit computational complexity, the DefinedBehaviorContext is not
availabe if it grows too large (atm hardcoded to 8 disjuncts).
Possible other fixes include bailing out in computePHI when
inconsistencies are detected, choose an arbitrary value for inconsistent
cases (since it is undefined behavior anyways), or make the code
receiving the result from ComputePHI handle inconsistent data. All of
them reduce the quality of implementation having to bail out more often
and disabling the ability to assert on actually wrong results.
This fixes llvm.org/PR48783.
This fixes llvm.org/PR48554
Some test cases had to be updated because the hash function for
union_maps have been changed which affects the output order.
to Pass.h.
In some compiler passes like SampleProfileLoaderPass, we want to know which
LTO/ThinLTO phase the pass is in. Currently the phase is represented in enum
class PassBuilder::ThinLTOPhase, so it is only available in PassBuilder and
it also cannot represent phase in full LTO. The patch extends it to include
full LTO phases and move it from PassBuilder.h to Pass.h, then it is much
easier for PassBuilder to communiate with each pass about current LTO phase.
Differential Revision: https://reviews.llvm.org/D94613
MemoryAccess::setNewAccessRelation() in assert-builds checks whether the
access relation for a READ has a memory location for every instance of
the domain. Otherwise, we would not have value to load from. That check
already considered that instances outside the Scop's context do not
matter since they are never executed (or would be undefined behavior).
In this patch also take instances of the InvalidContext into account,
as these can also be assumed to never occur. InvalidContext was
introduced to avoid the computational complexity of subtracting
restrictions from the AssumedContext. However, this additional check in
setNewAccessRelation is only done in assert-builds.
The assertion case with an InvalidContext may occur with DeLICM on a
conditionally infinite loops, as it is the case in the following code:
for (int i = 0; i < n; i+=b)
vreg = ...;
*Dest = vreg;
The loop is infinite when b=0, and [b] -> { : b = 0 } is part of the
InvalidContext. When DeLICM tries to map the memory for %vreg to *Dest,
there is no store instance that uses the value of vreg when b = 0, hence
no location to map it to. However, the case is irrelevant since Polly's
runtime condition check ensures that this is never case.
Fixes llvm.org/PR48445
ScalarEvolution::getSCEV cannot be used during codegen. ScalarEvolution
assumes a stable IR and control flow which is under construction during
Polly's CodeGen. In particular, it uses DominatorTree for compute the
backedge taken count. However the DominatorTree is not updated during
codegen.
In this case, SCEV was used to determine the base pointer of an array
access. Replace it by our own function. Polly generates only GEP and
BitCasts for array acceses, i.e. it is sufficient to handle these to to
find the base pointer.
Fixes llvm.org/PR48422
There's a small number of users of this function, they are all updated.
This updates the C API adding a new method LLVMGetTypeByName2 that takes a context and a name.
Differential Revision: https://reviews.llvm.org/D78793
Currently, we have some confusion in the codebase regarding the
meaning of LocationSize::unknown(): Some parts (including most of
BasicAA) assume that LocationSize::unknown() only allows accesses
after the base pointer. Some parts (various callers of AA) assume
that LocationSize::unknown() allows accesses both before and after
the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into
LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer()
to make this completely unambiguous. I tried my best to determine
which one is appropriate for all the existing uses.
The test changes in cs-cs.ll in particular illustrate a previously
clearly incorrect AA result: We were effectively assuming that
argmemonly functions were only allowed to access their arguments
after the passed pointer, but not before it. I'm pretty sure that
this was not intentional, and it's certainly not specified by
LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649
Declarations in headers should not be in the anonymous
namespace. Compilers also warn about the use of
<anon namespace>::SimplifyVisitor as a public field in
polly::SimplifyPass and polly::SimplifyPrinterPass.
Operand tree forwarding can cause the change of an access kind; in
particular change from a scalar kind to an array kind if the scalar
dependency is not necessary. Such an access cannot and doesn't need to
be forwarded anymore.
Fixes llvm.org/PR48034
Print to dbgs() any taken action.
Also, read-only scalars do not require any action unless
-polly-analyze-read-only-scalars=true is used. Better refect this by
using ForwardingAction::triviallyForwardable and thus not bumping the
statistics.
ScopBuilder distributes independent instructions between statements.
Only modeled (e.g. not synthesizable) instructions are represented.
To compute independence, non-modeled instructions were used in some
parts of determining instruction independence, which could lead to the
re-introduction of non-model instructions.
In particular, required invariant loads could be added to instruction
list, which then led to redundant MemoryAccesses for such a load.
This fixes llvm.org/PR48059.
And use it to model LLVM IR's `ptrtoint` cast.
This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)
As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
and their operands, instead of treating them as `SCEVUnknown`
* It should help with initial problem of PR46786 - this should eventually allow us
to not loose pointer-ness of an expression in more cases
As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.
But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D89456
This is a long-delayed follow-up to
5e5b85098d.
`TempMDNode` includes a bunch of machinery for RAUW, and should only be
used when necessary. RAUW wasn't being used in any of these cases... it
was just a placeholder for a self-reference.
Where the real node was using `MDNode::getDistinct`, just replace the
temporary argument with `nullptr`.
Where the real node was using `MDNode::get`, the `replaceOperandWith`
call was "promoting" the node to a distinct one implicitly due to
self-reference detection in `MDNode::handleChangedOperand`. The
`TempMDNode` was serving a purpose by delaying uniquing, but it's way
simpler to just call `MDNode::getDistinct` in the first place.
Note that using a self-reference at all in these places is a hold-over
from before `distinct` metadata existed. It was an old trick to create
distinct nodes. It would be intrusive to change, including bitcode
upgrades, etc., and it's harmless so I'm not sure there's much value in
removing it from existing schemas. After this commit it still has a tiny
memory cost (in the extra metadata operand) but no more overhead in
construction.
Differential Revision: https://reviews.llvm.org/D90079
Recursively traversing the operand tree leads to an exponential blowup
if instructions are used multiple times due to every path leading to an
additional copy of the instructions after forwarding. This problem was
marked as a TODO in the code and was reported as a bug in llvm.org/PR47340.
Fix by caching already visited instructions and returning the cached
version when already visited. Instead of calling forwardTree() twice,
return a ForwardingAction structure that contains a lambda which will
carry-out the forwarding when requested. The lambdas are executed in
reverse-postorder to mimic the previous recursive calls unless there
is a reuse.
Fixes llvm.org/PR47340
getVectorPtrTy is private to VectorBlockGenerator, and all uses query
the address space from the passed-in pointer prior to calling it.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D89745
Polly incorrectly dropped the address space specified for a load instruction when it vectorized the code.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D88907
This removes "VerifyEachPass" parameters from a lot of functions which is nice.
Don't verify after special passes or VerifierPass.
This introduces verification on loop and cgscc passes, verifying the corresponding function/module.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D88764