While clang-format takes care that the line-length is not surpassed, the
resulting comments sometimes look not optimal. We re-flow the text in the
comment to avoid these ugly single-word lines.
llvm-svn: 250626
Instead of generating implicit loads within basic blocks, put them
before the instructions of the statment itself, including non-affine
subregions. The region's entry node is dominating all blocks in the
region and therefore the loaded value will be available there.
Implicit writes in block-stmts were already stored back at the end of
the block. Now, also generate the stores of non-affine subregions when
leaving the statement, i.e. in the exiting block.
This change is required for array-mapped implicits ("De-LICM") to
ensure that there are no dependencies of demoted scalars within
statments. Statement load all required values, operator on copied in
registers, and then write back the changed value to the demoted memory.
Lifetimes analysis within statements becomes unecessary.
Differential Revision: http://reviews.llvm.org/D13487
llvm-svn: 250625
Accesses for exit node phis will be handled separately by
buildPHIAccesses if there is more than one exiting edge,
buildScalarDependences does not need to create additional SCALAR
accesses.
This is a corrected version of r250517, which was reverted in r250607.
Differential Revision: http://reviews.llvm.org/D13848
llvm-svn: 250622
Instead of checking at code generation time for each ScopStmt if a scalar has
external uses, we just iterate over the ScopArrayInfo descriptions we have and
check each of these for possible external uses.
Besides being somehow clearer, this approach has the benefit that we will always
create valid LLVM-IR even in case we disable the code generation of ScopStmt
bodies e.g. for testing purposes.
llvm-svn: 250608
When pulling a llvm::Value to be written as a PHI write, the former
code did only check whether it is within the same basic block, but it
could also be the same non-affine subregion. In that case some
unecessary pair of MemoryAccesses would have been created.
Two unit test were explicitely checking for the unecessary writes,
including the comments that the writes are unecessary.
llvm-svn: 250411
This will allow us to optimize C++ template code with Polly. This support is
mostly for debugging purpose and individual experiments. The ultimate goal is
still to run Polly later in the pass manager when inlining already happened.
llvm-svn: 250092
instead of llvm::PassManagerBuilder::EP_EarlyAsPossible. This will allow us
to run actual module passes in Polly's canonicalization sequence, but should
otherwise have only little impact.
llvm-svn: 250091
We also allow such products for cases where 'Parameter' is loaded within the
scop, but where we can dynamically verify that the value of 'Parameter' remains
unchanged during the execution of the scop.
This change relies on Polly's new RequiredILS tracking infrastructure recently
contributed by Johannes.
llvm-svn: 250019
The domain generation can handle lazy && and || by default but eager
evaluated expressions were dismissed as non-affine. With this patch we
will allow arbitrary combinations of and/or bit-operations in the
conditions of branches.
Differential Revision: http://reviews.llvm.org/D13624
llvm-svn: 249971
Helper functions in the BlockGenerators.h/cpp introduce dependences
from the frontend to the backend of Polly. As they are used in
ScopDetection, ScopInfo, etc. we move them to the ScopHelper file.
llvm-svn: 249919
If a (assumed) invariant location is loaded multiple times we
generated a parameter for each location. However, this caused compile
time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and
BT in the NAS benchmarks). Additionally, the code we generate is
suboptimal as we preload the same location multiple times and perform
the same checks on all the parameters that refere to the same value.
With this patch we consolidate the invariant loads in three steps:
1) During SCoP initialization required invariant loads are put in
equivalence classes based on their pointer operand. One
representing load is used to generate a parameter for the whole
class, thus we never generate multiple parameters for the same
location.
2) During the SCoP simplification we remove invariant memory
accesses that are in the same equivalence class. While doing so
we build the union of all execution domains as it is only
important that the location is at least accessed once.
3) During code generation we only preload one element of each
equivalence class with the unified execution domain. All others
are mapped to that preloaded value.
Differential Revision: http://reviews.llvm.org/D13338
llvm-svn: 249853
This was left out from the original patch proposed in
http://reviews.llvm.org/D13195
even though it is needed to define an order invariant loads
are hoisted.
llvm-svn: 249680
Drop an unused flag polly-allow-non-scev-backedge-taken-count and also
its occurrences from the tests.
Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D13400
llvm-svn: 249675
ScopDetection users are interested in the detection context and access
these via different get-methods. However, not all information was
exposed though the number of maps to hold it was increasing steadily.
With this change only the detection contexts the rejection log and the
ValidRegions set are mapped. The former is needed, the second could be
integrated in the first and the ValidRegions set is only needed for the
deterministic order of the regions.
llvm-svn: 249614
This replaces the support for user defined error functions by a
heuristic that tries to determine if a call to a non-pure function
should be considered "an error". If so the block is assumed not to be
executed at runtime. While treating all non-pure function calls as
errors will allow a lot more regions to be analyzed, it will also
cause us to dismiss a lot again due to an infeasible runtime context.
This patch tries to limit that effect. A non-pure function call is
considered an error if it is executed only in conditionally with
regards to a cheap but simple heuristic.
llvm-svn: 249611
This patch allows invariant loads to be used in the SCoP description,
e.g., as loop bounds, conditions or in memory access functions.
First we collect "required invariant loads" during SCoP detection that
would otherwise make an expression we care about non-affine. To this
end a new level of abstraction was introduced before
SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
if we want a load inside the region to be optimistically assumed
invariant or not. If we do, it will be marked as required and in the
SCoP generation we bail if it is actually not invariant. If we don't
it will be a non-affine expression as before. At the moment we
optimistically assume all "hoistable" (namely non-loop-carried) loads
to be invariant. This causes us to expand some SCoPs and dismiss them
later but it also allows us to detect a lot we would dismiss directly
if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
allow potential aliases between optimistically assumed invariant loads
and other pointers as our runtime alias checks are sound in case the
loads are actually invariant. Together with the invariant checks this
combination allows to handle a lot more than LICM can.
The code generation of the invariant loads had to be extended as we
can now have dependences between parameters and invariant (hoisted)
loads as well as the other way around, e.g.,
test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
First, it is important to note that we cannot have real cycles but
only dependences from a hoisted load to a parameter and from another
parameter to that hoisted load (and so on). To handle such cases we
materialize llvm::Values for parameters that are referred by a hoisted
load on demand and then materialize the remaining parameters. Second,
there are new kinds of dependences between hoisted loads caused by the
constraints on their execution. If a hoisted load is conditionally
executed it might depend on the value of another hoisted load. To deal
with such situations we sort them already in the ScopInfo such that
they can be generated in the order they are listed in the
Scop::InvariantAccesses list (see compareInvariantAccesses). The
dependences between hoisted loads caused by indirect accesses are
handled the same way as before.
llvm-svn: 249607
Value maps are created and used in many places and it is not always
possible to include CodeGen/Blockgenerators.h. To this end, ValueMapT
now lives in the ScopHelper.h which does not have any dependences itself.
This patch also replaces uses of different other value map types with
the ValueMapT.
llvm-svn: 249606
Do not use "Map[Key] == nullptr" to check if a Key is in the map, but use
"Map.find(Key) == Map.end()". Map[Key] always adds Key into the map, a
side-effect we do not want.
Found by inspection. This is hard to test outside of a targetted unit test,
which seems too much overhead for this individual issue.
llvm-svn: 249544
Clang's been taught to warn on more things here (unused values when
calling pure functions and ignoring their result, for example).
It might be better to figure out how to have cmake compile these tests
without -Werror/without any warnings enabled. But this'll do for now & I
don't know enough about cmake to fix it any other way, or to understand
the tradeoffs in that space.
llvm-svn: 249472
This single option replaces -polly-detect-unprofitable and -polly-no-early-exit
and is supposed to be the only option that disables compile-time heuristics that
aim to bail out early on scops that are believed to not benefit from Polly
optimizations.
Suggested-by: Johannes Doerfert
llvm-svn: 249426
A statement with an empty domain complicates the invariant load
hoisting and does not help any subsequent analysis or transformation.
In fact it might introduce parameter dimensions or increase the
schedule dimensionality. To this end, we remove statements with an
empty domain early in the SCoP simplification.
llvm-svn: 249276
Before isValidCFG() could hide the fact that a loop is non-affine by
over-approximation. This is problematic if a subregion of the loop contains
an exit/latch block and is over-approximated. Now we do not over-approximate
in the isValidCFG function if we check loop control. If such control is
non-affine the whole loop is over-approximated, not only a subregion.
llvm-svn: 249273
This patch cannot be tested on its own as the isValidCFG currently
hides the fact that control is actually non-affine with
over-approximation. This will be corrected in the next patch and a
test for non-affine latches will be added.
llvm-svn: 249272
When the ScopAnnotator was a class member variable some of the maps it contains
have not been properly cleared. As a result we had dangling pointers to
llvm::Value(s) which got detected by the AssertingVH we recently added.
No test case as this issue is hard to reproduce reliably as subsequent
optimizations need to delete some of the llvm::Values we still keep in our
lists.
llvm-svn: 249269
The use of const qualified Value pointers prevents the use of AssertingVH. We
could probably think of adding const support to AssertingVH, but as const
correctness seems to currently provide limited benefit in Polly, we do not do
this yet.
llvm-svn: 249266
There have been various places where llvm::DenseMap<const llvm::Value *,
llvm::Value *> types have been defined, but all types have been expected to be
identical. We make this more clear by consolidating the different types and use
BlockGenerator::ValueMapT wherever there is a need for types to match
BlockGenerator::ValueMapT.
llvm-svn: 249264
By using asserting value handles, we will get assertions when we forget to clear
any of the Value maps instead of difficult to debug undefined behavior.
llvm-svn: 249237
We have to skip accesses in non-affine subregions during hoisting as
they might not be executed under the same condition as the entry of
the non-affine subregion.
llvm-svn: 249139
This moves the construction of ScopStmt to the beginning of the
ScopInfo pass. The late creation was a result of the earlier separation
of ScopInfo and TempScopInfo. This will avoid introducing more
ScopStmt-like maps in future commits. The AccFuncMap will also be
removed in some future commit. DomainMap might also be included into
ScopStmt.
The order in which ScopStmt are created changes and initially creates
empty statements that are removed in a simplification.
Differential Revision: http://reviews.llvm.org/D13341
llvm-svn: 249132
If a value is globally mapped (IslNodeBuilder::ValueMap) and
referenced in the code that will be put into a subfunction, we hand
down the new value to the subfunction.
This patch also removes code that handed down all invariant loads to
the subfunction. Instead, only needed invariant loads are given to the
subfunction. There are two possible reasons for an invariant load to
be handed down:
1) The invariant load is used in a block that is placed in the
subfunction but which is not the parent of the load. In this
case, the scalar access that will read the loaded value, will
cause its base pointer (the preloaded value) to be handed down to
the subfunction.
2) The invariant load is defined and used in a block that is placed
in the subfunction. With this patch we will hand down the
preloaded value to the subfunction as the invariant load is
globally mapped to that value.
llvm-svn: 249126
When error blocks are not terminated by an unreachable they have successors
that might only be reachable via error blocks. Additionally, branches in
error blocks are not checked during SCoP detection, thus we might not be able
to handle them. With this patch we do not try to model error block exit
conditions. Anything that is only reachable via error blocks is ignored too,
as it will not be executed in the optimized version of the SCoP anyway.
llvm-svn: 249099
The user can provide function names with
-polly-error-functions=name1,name2,name3
that will be treated as error functions. Any call to them is assumed
not to be executed.
This feature is mainly for developers to play around with the new
"error block" feature.
llvm-svn: 249098
With this patch we erase cached results for regions that are invalid
as early as possible. If we do not (as before), it is possible that
two expanded regions will have the same address and the tracked
results for both are mixed. Currently this would "only" cause
pessimism in later passes but that will change when we allow invariant
loads in the SCoP. Additionally, it triggers non-deterministic
results as we might dismiss a later region because of results cached
for an earlier one.
There is no test case as the problem occurs only non-deterministically.
llvm-svn: 249000
Because we handle more than SCEV does it is not possible to rewrite an
expression on the top-level using the SCEVParameterRewriter only. With
this patch we will do the rewriting on demand only and also
recursively, thus not only on the top-level.
llvm-svn: 248916
Instructions which we can synthesis from a SCEV expression are not
generated directly, but only when they are used as an operand of
another instruction. This avoids generating unnecessary instructions
and works more reliably than first inserting them and then deleting
them later on.
This commit was reverted in r248860 due to a remaining miscompile, where
we forgot to synthesis the operand values that were referenced from scalar
writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this
now correctly.
llvm-svn: 248900
Before we unconditinoally forced all users outside the SCoP to use
the preloaded value. However, if the SCoP is not executed due to the
runtime checks, we need to use the original value because it might not
be invariant in the first place.
llvm-svn: 248881
This makes ScopInfo's scop member available earlier to other methods which will make some planned changes simpler.
No behavioral change intended
llvm-svn: 248879
As a first step in the direction of assumed invariant loads (loads
that are not written in some context) we now detect and hoist
definitively invariant loads. These invariant loads will be preloaded
in the code generation and used in the optimized version of the SCoP.
If the load is only conditionally executed the preloaded version will
also only be executed under the same condition, hence we will never
access memory that wouldn't have been accessed otherwise. This is also
the most distinguishing feature to licm.
As hoisting can make statements empty we will simplify the SCoP and
remove empty statements that would otherwise cause artifacts in the
code generation.
Differential Revision: http://reviews.llvm.org/D13194
llvm-svn: 248861
This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207.
The commit broke at least one test in lnt,
MultiSource/Benchmarks/Ptrdist/bc/number.c
was miss compiled and the test produced a wrong result.
One Polly test case that was added later was adjusted too.
llvm-svn: 248860
Every once in a while we see code that accesses memory with different types,
e.g. to perform operations on a piece of memory using type 'float', but to copy
data to this memory using type 'int'. Modeled in C, such codes look like:
void foo(float A[], float B[]) {
for (long i = 0; i < 100; i++)
*(int *)(&A[i]) = *(int *)(&B[i]);
for (long i = 0; i < 100; i++)
A[i] += 10;
}
We already used the correct types during normal operations, but fall back to our
detected type as soon as we import changed memory access functions. For these
memory accesses we may generate invalid IR due to a mismatch between the element
type of the array we detect and the actual type used in the memory access. To
address this issue, we always cast the newly created address of a memory access
back to the type of the memory access where the address will be used.
llvm-svn: 248781
Instructions which we can synthesis from a SCEV expression are not generated
directly, but only when they are used as an operand of another instruction. This
avoids generating unnecessary instruction and works more reliably than first
inserting them and then deleting them later on.
Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
Differential Revision: http://reviews.llvm.org/D13208
llvm-svn: 248712
This patch allows switch instructions with affine conditions in the
SCoP. Also switch instructions in non-affine subregions are allowed.
Both did not require much changes to the code, though there was some
refactoring needed to integrate them without code duplication.
In the llvm-test suite the number of profitable SCoPs increased from
135 to 139 but more importantly we can handle more benchmarks and user
inputs without preprocessing.
Differential Revision: http://reviews.llvm.org/D13200
llvm-svn: 248701
This check was needed at some point but seems not useful anymore. Only
one adjustment in the domain generation was needed to cope with the
cases this check prevented from happening before.
llvm-svn: 248695
We now only delete trivially dead instructions in the BB we copy (copyBB), but
not in any other BB. Only for copyBB we know that there will _never_ be any
future uses of instructions that have no use after copyBB has been generated.
Other instructions in the AST that have been generated by IslNodeBuilder may
look dead at the moment, but may possibly still be referenced by GlobalMaps. If
we delete them now, later uses would break surprisingly.
We do not have a test case that breaks due to us deleting too many instructions.
This issue was found by inspection.
llvm-svn: 248688
The changes affect methods that are part of the Pass interface and
include:
- Comments that describe the methods purpose.
- A consistent use of the keywords override and virtual.
Additionally, the printScop method is now optional and removed from
SCoP passes that do not implement it.
llvm-svn: 248685
After having generated a new user statement a couple of inefficient or
trivially dead instructions may remain. This commit runs instruction
simplification over the newly generated blocks to ensure unneeded
instructions are removed right away.
This commit does adds simplification for non-affine subregions which was not
yet part of 248681.
llvm-svn: 248683
After having generated a new user statement a couple of inefficient or trivially
dead instructions may remain. This commit runs instruction simplification over
the newly generated blocks to ensure unneeded instructions are removed right
away.
This commit does not yet add simplification for non-affine subregions.
llvm-svn: 248681
This commit basically reverts r246427 but still solves the issue
tackled by that commit. Instead of emitting initialization code in the
beginning of the start block we now generate parallel code in its own
block and thereby guarantee separation. This is necessary as we cannot
generate code for hoisted loads prior to the start block but it still
needs to be placed prior to everything else.
llvm-svn: 248674
When the whole SCoP is a non-affine region we need to use the
surrounding loop in the construction of the schedule as that is
the one that will be looked up after the schedule generation.
This fixes bug 24947
llvm-svn: 248667
When recovering multi-dimensional memory accesses, it may happen that different
accesses to the same base array are recovered with different dimensionality.
This patch ensures that the dimensionalities are unified by adding zero valued
dimensions to acesses with lower dimensionality. When starting to model
fixed-size arrays as multi-dimensional in 247906, this has not been taken
care of.
llvm-svn: 248662
There are three possible reasons to add a memory memory access: For explicit load and stores, for llvm::Value defs/uses, and to emulate PHI nodes (the latter two called implicit accesses). Previously MemoryAccess only stored IsPHI. Register accesses could be identified through the isScalar() method if it was no IsPHI. isScalar() determined the number of dimensions of the underlaying array, scalars represented by zero dimensions.
For the work on de-LICM, implicit accesses can have more than zero dimensions, making the distinction of isScalars() useless, hence now stored explicitly in the MemoryAccess. Instead, we replace it by isImplicit() and avoid the term scalar for zero-dimensional arrays as it might be confused with llvm::Value which are also often referred to as scalars (or alternatively, as registers).
No behavioral change intended, under the condition that it was impossible to create explicit accesses to zero-dimensional "arrays".
llvm-svn: 248616
This makes the intent of each created object clearer and allows to add more specific asserts. The bug fixed in r248535 has been discovered this way.
No functional change intended; everything should behave as before.
llvm-svn: 248603
This change addresses three issues:
- Read only scalars that enter a PHI node through an edge that comes from
outside the scop are not modeled any more, as such PHI nodes will always
be initialized to this initial value right before the SCoP is entered.
- For PHI nodes that depend on a scalar value that is defined outside the
scop, but where the scalar values is passed through an edge that itself
comes from a BB that is part of the region, we introduce in this basic
block a read of the out-of-scop value to ensure it's value is available
to write it into the PHI alloc location.
- Read only uses of scalars by PHI nodes are ignored in the general read only
handling code, as they are taken care of by the general PHI node modeling
code.
llvm-svn: 248535
After the merge of TempScopInfo into ScopInfo the analysis output
remained because of the existing unit tests. These remains are removed
and the units tests converted to match the equivalent output of
ScopInfo's analysis output. The unit tests are also moved into the
directory of ScopInfo tests.
Differential Revision: http://reviews.llvm.org/D13116
llvm-svn: 248485
Refactor all ISL-related cmake build instructions into its own
CMakeLists.txt and build as a separate library.
This is useful to apply ISL-related build flags to ISL only and not to
Polly's files. Also, it the separation of both projects becomes clearer.
Proposed name of the library is Polly_isl. It is not "isl" to avoid
mix-up with potentially installed libisl.{a|so}.
Tested configurations:
- Windows with cmake 3.2
- Ubuntu with cmake 3.0.2
- Ubuntu with cmake 3.0.2 BUILD_SHARED_LIBS on
- Ubuntu with cmake 2.8.12.2 (LLVM minimum version)
- Ubuntu out-of-LLVM-tree
Differential Revision: http://reviews.llvm.org/D12810
llvm-svn: 248484
A missing return statement that previously did not have a visibly negative
effect caused after some data-structure changes in r248024 multi-dimensional
accesses to be modeled both multi-dimensional as well as linearized. This
commit adds the missing return to avoid the incorrect double modeling as
well as the compile time increases it caused.
llvm-svn: 248171
If we encounter a <nsw> tagged AddRec for a loop we know the trip count of
that loop has to be bounded or the semantics is undefined anyway. Hence, we
only need to add unbounded assumptions if no such AddRec is known.
llvm-svn: 248128
So far we ignored the unbounded parts of the iteration domain, however
we need to assume they do not occure at all to remain sound if they do.
llvm-svn: 248126
We now add loop carried information during the second traversal of the
region instead of in a intermediate step in-between. This makes the
generation simpler, removes code and should even be faster.
llvm-svn: 248125
In order to allow multiple back edges we:
- compute the conditions under which each back edge is taken
- build the union over all these conditions, thus the condition that
any back edge is taken
- apply the same logic to the union we applied to a single back edge
llvm-svn: 248120
As we currently do not perform any optimizations that targets (or is
even aware) small trip counts we will skip them when we count the
loops in a region.
llvm-svn: 248119
All MemoryAccess objects will be owned by ScopInfo::AccFuncMap which
previously stored the IRAccess objects. Instead of creating new
MemoryAccess objects, the already created ones are reused, but their
order might be different now. Some fields of IRAccess and MemoryAccess
had the same meaning and are merged.
This is the last step of fusioning TempScopInfo.{h|cpp} and
ScopInfo.{h.cpp}. Some refactoring might still make sense.
Differential Revision: http://reviews.llvm.org/D12843
llvm-svn: 248024