LazyBlockFrequenceInfoPass, LazyBranchProbabilityInfoPass and
LoopAccessLegacyAnalysis all cache pointers to their nestled required
analysis passes. One need to use addRequiredTransitive to describe
that the nestled passes can't be freed until those analysis passes
no longer are used themselves.
There is still a bit of a mess considering the getLazyBPIAnalysisUsage
and getLazyBFIAnalysisUsage functions. Those functions are used from
both Transform, CodeGen and Analysis passes. I figure it is OK to
use addRequiredTransitive also when being used from Transform and
CodeGen passes. On the other hand, I figure we must do it when
used from other Analysis passes. So using addRequiredTransitive should
be more correct here. An alternative solution would be to add a
bool option in those functions to let the user tell if it is a
analysis pass or not. Since those lazy passes will be obsolete when
new PM has conquered the world I figure we can leave it like this
right now.
Intention with the patch is to fix PR49950. It at least solves the
problem for the reproducer in PR49950. However, that reproducer
need five passes in a specific order, so there are lots of various
"solutions" that could avoid the crash without actually fixing the
root cause.
This is a reapply of commit 3655f0757f, that was reverted in
33ff3c2049 due to problems with assertions in the polly
lit tests. That problem is supposed to be solved by also adjusting
ScopPass to explicitly preserve LazyBlockFrequencyInfo and
LazyBranchProbabilityInfo (it already preserved
OptimizationRemarkEmitter which depends on those lazy passes).
Differential Revision: https://reviews.llvm.org/D100958
Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule.
While not that useful by itself (there already is an unroll pass), it introduces mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. Since ISL's rescheduling would discard the manual transformations and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default.
This does not influence the SCoP detection heuristic. As a consequence, loop that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter.
Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the one supported by clang's LoopHint. See the `unroll_double.ll` test as example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088.
Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks.
Reviewed By: sebastiankreutzer
Differential Revision: https://reviews.llvm.org/D97977
This reverts commit 329aeb5db4,
and relands commit 61f006ac65.
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
Regenerate the C++ wrapper header from the current isl version's
headers.
The most notable change is that some dimension sizes are represented by
an isl_size (instead of unsigned), which is a signed int. Additionally,
some function may return -1 in case of an error which already had been
fixed in the past. The C++ may no return -1 instead of UINT_MAX which
caused the problems.
Some types in Polly had been changed from unsigned to isl_size
(that were not already auto) and some loops/comparision had to be
changed to avoid unsigned/signed comparison warnings.
DetectionContext objects are stored as values in a DenseMap. When the
DenseMap reaches its maximum load factor, it is resized and all its
objects moved to a new memory allocation. Unfortunately Scop object have
a reference to its DetectionContext. When the DenseMap resizes, all the
DetectionContexts reference now point to invalid memory, even if caused
by an unrelated DetectionContext.
Even worse, NewPM's ScopPassManager called isMaxRegionInScop with the
Verify=true parameter before each pass. This caused the old
DetectionContext to be removed an a new on created and re-verified.
Of course, the Scop object was already created pointing to the old
DetectionContext. Because the new DetectionContext would
usually be stored at the same position in the DenseMap, the reference
would usually reference the new DetectionContext of the same Region.
Usually.
If not, the old position still points to memory in the DenseMap
allocation (unless also a resizing occurs) such that tools like Valgrind
and AddressSanitizer would not be able to diagnose this.
Instead of storing the DetectionContext inside the DenseMap, use a
std::unique_ptr to a DetectionContext allocation, i.e. it will not move
around anymore. This also allows use to remove the very strange
DetectionContext(const DetectionContext &&)
copy/move(?) constructor. DetectionContext objects now are neither
copied nor moved.
As a result, every re-verification of a DetectionContext will use a new
allocation. Therefore, once a Scop object has been created using a
DetectionContext, it must not be re-verified (the Scop data structure
requires its underlying Region to not change before code generation
anyway). The NewPM may call isMaxRegionInScop only with
Validate=false parameter.
This reverts commit b7d870eae7 and the
subsequent fix "[Polly] Fix build after AssumptionCache change (D96168)"
(commit e6810cab09).
It caused indeterminism in the output, such that e.g. the
polly-x86_64-linux buildbot failed accasionally.
ZoneAlgorithms's computePHI relies on being provided with consistent a
schedule to compute the statement prodecessors of a statement containing
PHINodes. Otherwise unexpected results such as PHI nodes with multiple
predecessors can occur which would result in problems in the
algorithms expecting consistent data.
In the added test case, statement instances are scrubbed from the
SCoP their execution would result in undefined behavior (Due to a nsw
overflow). As already being undefined behavior in LLVM-IR, neither
AssumedContext nor InvalidContext are updated, giving computePHI no
means to avoid these cases.
Intoduce a new SCoP property, the DefinedBehaviorContext, that among
the runtime-checked conditions, also tracks the assumptions not needing
a runtime check, in particular those affecting the assumed control flow.
This replaces the manual combination of the 3 other contexts that was
already done in computePHI and setNewAccessRelation. Currently, the only
additional assumption is that loop induction variables will nsw flag for
not wrap, but potentially more can be added. Use in
hasFeasibleRuntimeContext, isl::ast_build and gisting are other
potential uses.
To limit computational complexity, the DefinedBehaviorContext is not
availabe if it grows too large (atm hardcoded to 8 disjuncts).
Possible other fixes include bailing out in computePHI when
inconsistencies are detected, choose an arbitrary value for inconsistent
cases (since it is undefined behavior anyways), or make the code
receiving the result from ComputePHI handle inconsistent data. All of
them reduce the quality of implementation having to bail out more often
and disabling the ability to assert on actually wrong results.
This fixes llvm.org/PR48783.
MemoryAccess::setNewAccessRelation() in assert-builds checks whether the
access relation for a READ has a memory location for every instance of
the domain. Otherwise, we would not have value to load from. That check
already considered that instances outside the Scop's context do not
matter since they are never executed (or would be undefined behavior).
In this patch also take instances of the InvalidContext into account,
as these can also be assumed to never occur. InvalidContext was
introduced to avoid the computational complexity of subtracting
restrictions from the AssumedContext. However, this additional check in
setNewAccessRelation is only done in assert-builds.
The assertion case with an InvalidContext may occur with DeLICM on a
conditionally infinite loops, as it is the case in the following code:
for (int i = 0; i < n; i+=b)
vreg = ...;
*Dest = vreg;
The loop is infinite when b=0, and [b] -> { : b = 0 } is part of the
InvalidContext. When DeLICM tries to map the memory for %vreg to *Dest,
there is no store instance that uses the value of vreg when b = 0, hence
no location to map it to. However, the case is irrelevant since Polly's
runtime condition check ensures that this is never case.
Fixes llvm.org/PR48445
Currently, we have some confusion in the codebase regarding the
meaning of LocationSize::unknown(): Some parts (including most of
BasicAA) assume that LocationSize::unknown() only allows accesses
after the base pointer. Some parts (various callers of AA) assume
that LocationSize::unknown() allows accesses both before and after
the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into
LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer()
to make this completely unambiguous. I tried my best to determine
which one is appropriate for all the existing uses.
The test changes in cs-cs.ll in particular illustrate a previously
clearly incorrect AA result: We were effectively assuming that
argmemonly functions were only allowed to access their arguments
after the passed pointer, but not before it. I'm pretty sure that
this was not intentional, and it's certainly not specified by
LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649
ScopBuilder distributes independent instructions between statements.
Only modeled (e.g. not synthesizable) instructions are represented.
To compute independence, non-modeled instructions were used in some
parts of determining instruction independence, which could lead to the
re-introduction of non-model instructions.
In particular, required invariant loads could be added to instruction
list, which then led to redundant MemoryAccesses for such a load.
This fixes llvm.org/PR48059.
InstStmtMap became inconsistent with ScopStmt::getInstructions() after
the statement's instructions is modified, e.g. by being considered
unused by the Simplify pass or being moved by ForwardOpTree.
Change ScopStmt::setInstructions() to also update its parent's
InstStmtMap. Also add assertions checking the consistency.
The patch introduces the system to distinctively store the information
needed for the Control Flow Graph as well as the instrumentary needed for
the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo.
The patch is a part of sequence of three patches, related to graphs Heat Coloring.
Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu
Differential Revision: https://reviews.llvm.org/D76820
The option is passed as argv to ISL's command line option parser.
Polly's own own command line options take precedence over options passed
as `-polly-isl-arg`. For instance,
`-polly-isl-arg=--schedule-outer-coincidence` will be ignored in favor
of `-polly-opt-outer-coincidence`.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D77303
Summary:
This patch moves the getIndexExpressionsFromGEP function from polly
into ScalarEvolution so that both polly and DependenceAnalysis can
use it for the purpose of subscript delinearization when the array
sizes are not parametric.
Authored By: bmahjour
Reviewer: Meinersbur, sebpop, fhahn, dmgreen, grosser, etiotto, bollu
Reviewed By: Meinersbur
Subscribers: hiraditya, arphaman, Whitney, ppc-slack, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73995
The primary motivation is to fix an assertion failure in
isl_basic_map_alloc_equality:
isl_assert(ctx, room_for_con(bmap, 1), return -1);
Although the assertion does not occur anymore, I could not identify
which of ISL's commits fixed it.
Compared to the previous ISL version, Polly requires some changes for this update
* Since ISL commit
20d3574 "perform parameter alignment by modifying both arguments to function"
isl_*_gist_* and similar functions do not always align the paramter
list anymore. This caused the parameter lists in JScop files to
become out-of-sync. Since many regression tests use JScop files with
a fixed parameter list and order, we explicitly call align_params to
ensure a predictable parameter list.
* ISL changed some return types to isl_size, a typedef of (signed) int.
This caused some issues where the return type was unsigned int before:
- No overload for std::max(unsigned,isl_size)
- It cause additional 'mixed signed/unsigned comparison' warnings.
Since they do not break compilation, and sizes larger than 2^31
were never supported, I am going to fix it separately.
* With the change to isl_size, commit
57d547 "isl_*_list_size: return isl_size"
also changed the return value in case of an error from 0 to -1. This
caused undefined looping over isl_iterator since the 'end iterator'
got index -1, never reached from the 'begin iterator' with index 0.
* Some internal changes in ISL caused the number of operations to
increase when determining access ranges to determine aliasing
overlaps. In one test, this caused exceeding the default limit of
800000. The operations-limit was disabled for this test.
Previously, the enums didn't account for all the possible cases, which
could cause misleading results (particularly for a "switch" on
FunctionModRefBehavior).
Fixes regression in polly from recent patch to add writeonly to memset.
While I'm here, also fix a few dubious uses of the FMRB_* enum values.
Differential Revision: https://reviews.llvm.org/D73154
Scope of changes:
1) Moved RecordedAssumptions vector to ScopBuilder. RecordedAssumptions are used only for Scop constructions.
2) Moved definition of RecordedAssumptionsTy to ScopHelper. It is required both by ScopBuilder and SCEVAffinator.
3) Add new function recordAssumption to ScopHelper. One of its argument is a reference to RecordedAssumption vector. This function is used by ScopBuilder and SCEVAffinator.
4) All RecordedAssumptions are created by ScopBuilder. isl::pw_aff
objects for corresponding SCEVs are created inside ScopBuilder. Scop
functions do not record any assumptions. Scop can use isl::pw_aff
objects which were created by ScopBuilder.
5) Removed functions for handling RecordedAssumptions from Scop class.
6) Removed constness from getScopArrayInfo functions.
7) Replaced SCEVVisitor struct from SCEVAffinator with taylored version, which allow to pass pointer to RecordedAssumptions as function argument.
Differential Revision: https://reviews.llvm.org/D68056
Previously, the polly unit tests were stuck in a infinite loop.
There was an edge case in StringRef::count() introduced by 9f6b13e5cc, where an empty 'Str' would cause the function to never exit.
Also fixed usage in polly.
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
ScopBuilder::buildEqivClassBlockStmts creates ScopStmts for instruction
groups in basic block and inserts these ScopStmts into Scop::StmtMap,
however, as described in llvm.org/PR38358, comment #5, StmtScops are
inserted into vector ScopStmt[BB] in wrong order. As a result,
ScopBuilder::buildSchedule creates wrong order sequence node.
Looking closer to code, it's clear there is no equivalent classes with
interleaving isOrderedInstruction(memory access) instructions after
joinOrderedInstructions. Afterwards, ScopStmts need to be created and
inserted in the original order of memory access instructions, however,
at the moment ScopStmts are inserted in the order of leader instructions
which are probably not memory access instructions.
The fix is simple with a standalone loop scanning
isOrderedInstruction(memory access) instructions in basic block and
inserting elements into LeaderToInstList one by one. The patch also
removes double reversing operations which are now unnecessary.
New test preserve-equiv-class-order-in-basic_block.ll is also added.
Differential Revision: https://reviews.llvm.org/D68941
llvm-svn: 375192
Function joinOrderedInstructions merges instructions when a leader is encountered twice.
It also notices that leaders in SeenLeaders may lose their leadership in previous merging,
and tries to handle the case using following code:
Instruction *PrevLeader = UnionFind.getLeaderValue(SeenLeaders.back());
However, this is wrong because it always gets leader for the last element of SeenLeaders,
and I believe it's wrong even we get leader for Prev here. As a result, Statements in cases
like the one in patch aren't merged as expected. After investigation, I believe it's
unnecessary to get leader instruction at all. This is based on fact: Although leaders in
SeenLeaders could lose leadership, they only lose to others in SeenLeaders, in other words,
one existing leader will be chosen as new leader of merged equivalent statements. We can
take advantage of this and simply check if current leader equals to Prev and break merging
if it does.
The patch also adds a new test.
Patch by bin.narwal <bin.narwal@gmail.com>
Differential Revision: https://reviews.llvm.org/D67007
llvm-svn: 371801
When reading code of Dependences::calculateDependences, I noticed that
WAR is computed specifically by buildWAR. Given ISL now
supports "kills" in approximate dataflow analysis, this patch takes
advantage of it.
This patch also cleans up a couple lines redundant codes.
Patch by bin.narwal <bin.narwal@gmail.com>
Differential Revision: https://reviews.llvm.org/D66741
llvm-svn: 370396
The while loop iterating parent loop in ScopBuilder::buildDomains is
unnecessary because either L or LD are later unused, this is a simple
patch removing it.
Patch by bin.narwal <bin.narwal@gmail.com>
Differential Revision: https://reviews.llvm.org/D66698
llvm-svn: 370368
When reading code in ScopBuilder::buildEqivClassBlockStmts, I think the
main statement flag computation can be simplified, here is the patch.
It's based on two simple facts that:
1. Instruction won't be removed once it's inserted into UnionFind.
2. Main statement must be set if there is non-trivial statement besides the last one.
The patch also saves std::find call.
Patch by bin.narwal <bin.narwal@gmail.com>
Differential Revision: https://reviews.llvm.org/D66477
llvm-svn: 369972
Scope of changes:
1) Moved buildDomains function to ScopBuilder class.
2) Moved buildDomainsWithBranchConstraints function to ScopBuilder class.
3) Moved propagateDomainConstraints to ScopBuilder class.
4) Moved propagateDomainConstraintsToRegionExit to ScopBuilder class.
5) Moved propagateInvalidStmtDomains to ScopBuilder class.
6) Moved getPredecessorDomainConstraints function to ScopBuilder class.
7) Moved addLoopBoundsToHeaderDomain function to ScopBuilder class.
8) Moved getPwAff function to ScopBuilder class.
9) Moved buildConditionSets functions to ScopBuilder class.
10) Added updateMaxLoopDepth, notifyErrorBlock, getOrInitEmptyDomain, isDomainDefined, setDomain functions to Scop class. They are used by ScopBuilder.
11) Moved helper functions: getRegionNodeBasicBlock, getRegionNodeSuccessor, containsErrorBlock, createNextIterationMap, collectBoundedParts, partitionSetParts, buildConditionSet to ScopBuilder.cpp file.
Differential Revision: https://reviews.llvm.org/D65729
llvm-svn: 368100
Scope of changes:
1) Moved addUserAssumptions function to ScopBuilder class.
2) Moved buildConditionSets functions to polly namespace.
3) Moved getRepresentingInvariantLoadSCEV to public section of the Scop class
Differential Revision: https://reviews.llvm.org/D65241
llvm-svn: 368089
Scope of changes:
1. Moved buildSchedule functions to ScopBuilder.
2. Moved combineInSequence function to ScopBuilder.
3. Moved mapToDimension function to ScopBuilder.
4. Moved LoopStackTy to ScopBuilder.
5. Moved getLoopSurroundingScop to ScopHelper.
6. Moved getNumBlocksInLoop to ScopHelper.
7. Moved getNumBlocksInRegionNode to ScopHelper.
8. Moved getRegionNodeLoop to ScopHelper.
Differential Revision: https://reviews.llvm.org/D64223
llvm-svn: 366377
Scope of changes:
1) Moved finalizeAccesses to ScopBuilder
2) Moved updateAccessDimensionality to ScopBuilder
3) Moved foldSizeConstantsToRight to ScopBuilder
4) Moved foldSizeConstantsToRight to ScopBuilder
5) Moved assumeNoOutOfBounds to ScopBuilder
6) Moved markFortranArrays to ScopBuilder
7) Added iterator range for AccessFunctions vector.
Differential Revision: https://reviews.llvm.org/D63794
llvm-svn: 366374
Scope of changes:
1) Moved addUserContext to ScopBuilder.
2) Moved command line option UserContextStr to ScopBuilder.
Differential Revision: https://reviews.llvm.org/D63740
llvm-svn: 366266