Change the code around setNewAccessRelation to allow to use a an existing array
element for memory instead of an ad-hoc alloca. This facility will be used for
DeLICM/DeGVN to convert scalar dependencies into regular ones.
The changes necessary include:
- Make the code generator use the implicit locations instead of the alloca ones.
- A test case
- Make the JScop importer accept changes of scalar accesses for that test case.
- Adapt the MemoryAccess interface to the fact that the MemoryKind can change.
They are named (get|is)OriginalXXX() to get the status of the memory access
before any change by setNewAccessRelation() (some properties such as
getIncoming() do not change even if the kind is changed and are still
required). To get the modified properties, there is (get|is)LatestXXX(). The
old accessors without Original|Latest become synonyms of the
(get|is)OriginalXXX() to not make functional changes in unrelated code.
Differential Revision: https://reviews.llvm.org/D23962
llvm-svn: 280408
We already invalidated a couple of critical values earlier on, but we now
invalidate all instructions contained in a scop after the scop has been code
generated. This is necessary as later scops may otherwise obtain SCEV
expressions that reference values in the earlier scop that before dominated
the later scop, but which had been moved into the conditional branch and
consequently do not dominate the later scop any more. If these very values are
then used during code generation of the later scop, we generate used that are
dominated by the values they use.
This fixes: http://llvm.org/PR28984
llvm-svn: 279047
In case some code -- not guarded by control flow -- would be emitted directly in
the start block, it may happen that this code would use uninitalized scalar
values if the scalar initialization is only emitted at the end of the start
block. This is not a problem today in normal Polly, as all statements are
emitted in their own basic blocks, but Polly-ACC emits host-to-device copy
statements into the start block.
Additional Polly-ACC test coverage will be added in subsequent changes that
improve the handling of PHI nodes in Polly-ACC.
llvm-svn: 278124
After having generated the code for a ScopStmt, we run a simple dead-code
elimination that drops all instructions that are known to be and remain unused.
Until this change, we only considered instructions for dead-code elimination, if
they have a corresponding instruction in the original BB that belongs to
ScopStmt. However, when generating code we do not only copy code from the BB
belonging to a ScopStmt, but also generate code for operands referenced from BB.
After this change, we now also considers code for dead code elimination, which
does not have a corresponding instruction in BB.
This fixes a bug in Polly-ACC where such dead-code referenced CPU code from
within a GPU kernel, which is possible as we do not guarantee that all variables
that are used in known-dead-code are moved to the GPU.
llvm-svn: 278103
The map is iterated over when generating the values escaping the SCoP. The
indeterministic iteration order of DenseMap causes the output IR to change at
every compilation, adding noise to comparisons.
Replace DenseMap by a MapVector to ensure the same iteration order at every
compilation.
llvm-svn: 277832
Extend the jscop interface to allow the user to export arrays. It is required
that already existing arrays of the list of arrays correspond to arrays
of the SCoP. Each array that is appended to the list will be newly created.
Furthermore, we allow the user to modify access expressions to reference
any array in case it has the same element type.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D22828
llvm-svn: 277263
This ensures that no trivially dead code is generated. This is not only cleaner,
but also avoids troubles in case code is generated in a separate function and
some of this dead code contains references to values that are not available.
This issue may happen, in case the memory access functions have been updated
and old getelementptr instructions remain in the code. With normal Polly,
a test case is difficult to draft, but the upcoming GPU code generation can
possibly trigger such problems. We will later extend this dead-code elimination
to region and vector statements.
llvm-svn: 276263
Summary:
API-wise `apply` is a somewhat unidiomatic one-off function, and
removing the only(?) use in polly will let me remove it from SCEV's
exposed interface.
Reviewers: jdoerfert, Meinersbur, grosser
Subscribers: grosser, mcrosier, pollydev
Differential Revision: http://reviews.llvm.org/D20779
llvm-svn: 271177
If a non-affine region PHI is generated we should not move the insert
point prior to the synthezised value in the same block as we might
split that block at the insert point later on. Only if the incoming
value should be placed in a different block we should change the
insertion point.
llvm-svn: 265132
Value merging is only necessary for scalars when they are used outside
of the scop. While an array's base pointer can be used after the scop,
it gets an extra ScopArrayInfo of type MK_Value. We used to generate
phi's for both of them, where one was assuming the reault of the other
phi would be the original value, because it has already been replaced by
the previous phi. This resulted in IR that the current IR verifier
allows, but is probably illegal.
This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-process-unprofitable
from 16 to 10.
Also see llvm.org/PR26718.
llvm-svn: 262629
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.
llvm-svn: 262404
The generated dedicated subregion exit block was assumed to have the same
dominance relation as the original exit block. This is incorrect if the exit
block receives other edges than only from the subregion, which results in that
e.g. the subregion's entry block does not dominate the exit block.
llvm-svn: 261865
This is also be caught by the function verifier, but disconnected from
the place that produced it. Catch it already at creation to be able to
reason more directly about the cause.
llvm-svn: 261790
This allows code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.
We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.
We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.
Finally, we add this as a new item to the 2.9 release notes
Reviewers: jdoerfert, Meinersbur
Differential Revision: http://reviews.llvm.org/D16878
llvm-svn: 259784
We support now code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.
llvm-svn: 259587
MemAccInst wraps the common members of LoadInst and StoreInst. Also use
of this class in:
- ScopInfo::buildMemoryAccess
- BlockGenerator::generateLocationAccessed
- ScopInfo::addArrayAccess
- Scop::buildAliasGroups
- Replace every use of polly::getPointerOperand
Reviewers: jdoerfert, grosser
Differential Revision: http://reviews.llvm.org/D16530
llvm-svn: 258947
Ensure that there is at most one phi write access per PHINode and
ScopStmt. In particular, this would be possible for non-affine
subregions with multiple exiting blocks. We replace multiple MAY_WRITE
accesses by one MUST_WRITE access. The written value is constructed
using a PHINode of all exiting blocks. The interpretation of the PHI
WRITE's "accessed value" changed from the incoming value to the PHI like
for PHI READs since there is no unique incoming value.
Because region simplification shuffles around PHI nodes -- particularly
with exit node PHIs -- the PHINodes at analysis time does not always
exist anymore in the code generation pass. We instead remember the
incoming block/value pair in the MemoryAccess.
Differential Revision: http://reviews.llvm.org/D15681
llvm-svn: 258809
Both functions implement the same functionality, with the difference that
getNewScalarValue assumes that globals and out-of-scop scalars can be directly
reused without loading them from their corresponding stack slot. This is correct
for sequential code generation, but causes issues with outlining code e.g. for
OpenMP code generation. getNewValue handles such cases correctly.
Hence, we can replace getNewScalarValue with getNewValue. This is not only more
future proof, but also eliminates a bunch of code.
The only functionality that was available in getNewScalarValue that is lost
is the on-demand creation of scalar values. However, this is not necessary any
more as scalars are always loaded at the beginning of each basic block and will
consequently always be available when scalar stores are generated. As this was
not the case in older versions of Polly, it seems the on-demand loading is just
some older code that has not yet been removed.
Finally, generateScalarLoads also generated loads for values that are loop
invariant, available in GlobalMap and which are preferred over the ones loaded
in generateScalarLoads. Hence, we can just skip the code generation of such
scalar values, avoiding the generation of dead code.
Differential Revision: http://reviews.llvm.org/D16522
llvm-svn: 258799
getAccessFor does not guarantee a certain access to be returned in case an
instruction is related to multiple accesses. However, in the vector code
generation we want to know the stride of the array access of a store
instruction. By using getArrayAccessFor we ensure we always get the correct
memory access.
This patch fixes a potential bug, but I was unable to produce a failing test
case. Several existing test cases cover this code, but all of them already
passed out of luck (or the specific but not-guaranteed order in which we build
memory accesses).
llvm-svn: 255715
When generating scalar loads/stores separately the vector code has not been
updated. This commit adds code to generate scalar loads for vector code as well
as code to assert in case scalar stores are encountered within a vector loop.
llvm-svn: 255714
When rewriting the access functions of load/store statements, we are only
interested in the actual array memory location. The current code just took
the very first memory access, which could be a scalar or an array access. As
a result, we failed to update access functions even though this was requested
via .jscop.
llvm-svn: 255713
This change should not change the behavior of Polly today, but it allows
external constants to be remapped e.g. when targetting multiple LLVM modules.
llvm-svn: 255506
Over time different vocabulary has been introduced to describe the different
memory objects in Polly, resulting in different - often inconsistent - naming
schemes in different parts of Polly. We now standartize this to the following
scheme:
KindArray, KindValue, KindPHI, KindExitPHI
| ------- isScalar -----------|
In most cases this naming scheme has already been used previously (this
minimizes changes and ensures we remain consistent with previous publications).
The main change is that we remove KindScalar to clearify the difference between
a scalar as a memory object of kind Value, PHI or ExitPHI and a value (former
KindScalar) which is a memory object modeling a llvm::Value.
We also move all documentation to the Kind* enum in the ScopArrayInfo class,
remove the second enum in the MemoryAccess class and update documentation to be
formulated from the perspective of the memory object, rather than the memory
access. The terms "Implicit"/"Explicit", formerly used to describe memory
accesses, have been dropped. From the perspective of memory accesses they
described the different memory kinds well - especially from the perspective of
code generation - but just from the perspective of a memory object it seems more
straightforward to talk about scalars and arrays, rather than explicit and
implicit arrays. The last comment is clearly subjective, though. A less
subjective reason to go for these terms is the historic use both in mailing list
discussions and publications.
llvm-svn: 255467
Previously, accesses that originate from PHI nodes in the exit block
were registered as SCALAR. In some context they are treated as scalars,
but it makes a difference in others. We used to check whether the
AccessInstruction is a terminator to differentiate the cases.
This patch introduces an MemoryAccess origin EXIT_PHI and a
ScopArrayInfo kind KIND_EXIT_PHI to make this case more explicit. No
behavioural change intended.
Differential Revision: http://reviews.llvm.org/D14688
llvm-svn: 254149
IVs of loops for which the loop header is in the subregion, but not the entire
loop may be incremented outside of the subregion and can consequently not be
kept private to the subregion. Instead, they need to and are modeled as virtual
loops in the iteration domains. As this is the case, generating new subregion
induction variables for such loops is not needed and indeed wrong as they would
hide the virtual induction variables modeled in the scop.
This fixes a miscompile in MultiSource/Benchmarks/Ptrdist/bc and
MultiSource/Benchmarks/nbench/. Thanks Michael and Johannes for their
investiagations and helpful observations regarding this bug.
llvm-svn: 252860
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.
As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.
This reverts 252449 and reapplies r252445. Its test was failing
indeterministically due to r252375 which was reverted in r252522.
llvm-svn: 252540
The dominance of the generated non-affine subregion block was based on
the scop's merge block, therefore resulted in an invalid DominanceTree.
It resulted in some values as assumed to be unusable in the actual
generated exit block.
We detect the case that the exit block has been moved and decide
dominance using the BB at the original exit. If we create another exit
node, that exit nodes is dominated by the one generated from where the
original exit resides. This fixes llvm.org/PR25438 and part of
llvm.org/PR25439.
llvm-svn: 252526
It introduced indeterminism as it was iterating over an address-indexed
hashtable. The corresponding bug PR25438 will be fixed in a successive
commit.
llvm-svn: 252522
This reverts commit 9775824b265e574fc541e975d64d3e270243b59d due to a
failing unit test.
Please check and correct the unit test and commit again.
llvm-svn: 252449
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.
As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.
llvm-svn: 252445
After loop versioning, a dominance check of a non-affine subregion's
exit node causes the dominance check to always fail on any block in the
subregion if it shares the same exit block with the scop. The
subregion's exit block has become polly_merge_new_and_old, which also
receives the control flow of the generated code. This would cause that
any value for implicit stores is assumed to be not from the scop.
We check dominance with the generated exit node instead.
This fixes llvm.org/PR25438
llvm-svn: 252375
Remove all the implicit ilist iterator conversions from polly, in
preparation for making them illegal in ADT. There was one oddity I came
across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a
post-increment `Builder.GetInsertPoint()++`.
Since it was a no-op, I removed it, but I admit I wonder if it might be
a bug (both before and after this change)? Perhaps it should be a
pre-increment?
llvm-svn: 252357
We were adding all generated values in non-affine subregions to be used
for the subregions generated exit block. The thought was that only
values that are dominating the original exit block can be used there.
But it is possible for synthesizable values to be expanded in any
block. If the same values is also used for implicit writes, it would
try to reuse already synthesized values even if not dominating the exit
block.
The fix is to only add values to the list of values usable in the exit
block only if it is dominating the exit block. This fixes
llvm.org/PR25412.
llvm-svn: 252301
For generating scalar writes of non-affine subregions, all except phi
writes are generated in the exit block. The phi writes are generated in
the incoming block for which we errornously used the same BBMap. This
can conflict if a value for one block is synthesized, and then reused
for another block which is not dominated by the first block. This is
fixed by using block-specific BBMaps for phi writes.
llvm-svn: 252172
These maps are only needed during the construction of a single region statement.
Clearing them is important, as we otherwise get an assert in case some of the
referenced values are erased before the RegionGenerator is deleted.
llvm-svn: 251341
Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed
any scalar PHI node above the scop and used in the scop is modeled as scalar
read access.
llvm-svn: 251198
This change adds code to directly code-generate multi-exit PHI nodes, instead
of trying to reuse the EscapeMap infrastructure for this. Using escape maps
adds a level of indirection that is hard to understand and - more importantly -
breaks in certain cases.
Specifically, the original code relied on simplifyRegion() to split the original
PHI node in two PHI nodes, one merging the values coming from within the scop
and a second that merges the first PHI node with the values that come from
outside the scop. To generate code the first PHI node is then just handled like
any other in-scop value that is used somewhere outside the scop. This fails for
the case where all values from inside the scop are identical, as the first PHI
node is in such cases automatically simplified and eliminated by LLVM right at
construction. As a result, there is no instruction that can be pass to the
EscapeMap handling, which means the references in the second PHI node are not
updated and may still reference values from within the original scop that do not
dominate it.
Our new code iterates directly over all modeled ScopArrayInfo objects that
represent multi-exit PHI nodes and generates code for them without relying on
the EscapeMap infrastructure. Hence, it works also for the case where the first
PHI node is eliminated.
llvm-svn: 251191
New values were always synthesized in the block of the instruction
that needed them. This is incorrect for PHI node whose' value must be
defined in the respective incoming block. This patch temporarily moves
the builder's insert point to the incoming block while synthesizing phi
node arguments.
This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241)
llvm-svn: 250693
Expressing this in terms of BlockGenerator::getOrCreateAlloca(const
ScopArrayInfo *Array) does not work as the MemoryAccess BasePtr is in case of
invariant load hoisting different to the ScopArrayInfo BasePtr. Until this is
investigated and fixed, we move back to code that just uses the baseptr of
MemoryAccess.
llvm-svn: 250637
This allows the caller to get the alloca locations of an array without the
need to thank if Array is a PHI or a non-PHI Array. We directly make use of this
in BlockGenerator::getOrCreateAlloca(MemoryAccess &Access).
llvm-svn: 250628
Instead of generating implicit loads within basic blocks, put them
before the instructions of the statment itself, including non-affine
subregions. The region's entry node is dominating all blocks in the
region and therefore the loaded value will be available there.
Implicit writes in block-stmts were already stored back at the end of
the block. Now, also generate the stores of non-affine subregions when
leaving the statement, i.e. in the exiting block.
This change is required for array-mapped implicits ("De-LICM") to
ensure that there are no dependencies of demoted scalars within
statments. Statement load all required values, operator on copied in
registers, and then write back the changed value to the demoted memory.
Lifetimes analysis within statements becomes unecessary.
Differential Revision: http://reviews.llvm.org/D13487
llvm-svn: 250625
Instead of checking at code generation time for each ScopStmt if a scalar has
external uses, we just iterate over the ScopArrayInfo descriptions we have and
check each of these for possible external uses.
Besides being somehow clearer, this approach has the benefit that we will always
create valid LLVM-IR even in case we disable the code generation of ScopStmt
bodies e.g. for testing purposes.
llvm-svn: 250608
Helper functions in the BlockGenerators.h/cpp introduce dependences
from the frontend to the backend of Polly. As they are used in
ScopDetection, ScopInfo, etc. we move them to the ScopHelper file.
llvm-svn: 249919
This patch allows invariant loads to be used in the SCoP description,
e.g., as loop bounds, conditions or in memory access functions.
First we collect "required invariant loads" during SCoP detection that
would otherwise make an expression we care about non-affine. To this
end a new level of abstraction was introduced before
SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
if we want a load inside the region to be optimistically assumed
invariant or not. If we do, it will be marked as required and in the
SCoP generation we bail if it is actually not invariant. If we don't
it will be a non-affine expression as before. At the moment we
optimistically assume all "hoistable" (namely non-loop-carried) loads
to be invariant. This causes us to expand some SCoPs and dismiss them
later but it also allows us to detect a lot we would dismiss directly
if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
allow potential aliases between optimistically assumed invariant loads
and other pointers as our runtime alias checks are sound in case the
loads are actually invariant. Together with the invariant checks this
combination allows to handle a lot more than LICM can.
The code generation of the invariant loads had to be extended as we
can now have dependences between parameters and invariant (hoisted)
loads as well as the other way around, e.g.,
test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
First, it is important to note that we cannot have real cycles but
only dependences from a hoisted load to a parameter and from another
parameter to that hoisted load (and so on). To handle such cases we
materialize llvm::Values for parameters that are referred by a hoisted
load on demand and then materialize the remaining parameters. Second,
there are new kinds of dependences between hoisted loads caused by the
constraints on their execution. If a hoisted load is conditionally
executed it might depend on the value of another hoisted load. To deal
with such situations we sort them already in the ScopInfo such that
they can be generated in the order they are listed in the
Scop::InvariantAccesses list (see compareInvariantAccesses). The
dependences between hoisted loads caused by indirect accesses are
handled the same way as before.
llvm-svn: 249607
The use of const qualified Value pointers prevents the use of AssertingVH. We
could probably think of adding const support to AssertingVH, but as const
correctness seems to currently provide limited benefit in Polly, we do not do
this yet.
llvm-svn: 249266
There have been various places where llvm::DenseMap<const llvm::Value *,
llvm::Value *> types have been defined, but all types have been expected to be
identical. We make this more clear by consolidating the different types and use
BlockGenerator::ValueMapT wherever there is a need for types to match
BlockGenerator::ValueMapT.
llvm-svn: 249264
By using asserting value handles, we will get assertions when we forget to clear
any of the Value maps instead of difficult to debug undefined behavior.
llvm-svn: 249237
Because we handle more than SCEV does it is not possible to rewrite an
expression on the top-level using the SCEVParameterRewriter only. With
this patch we will do the rewriting on demand only and also
recursively, thus not only on the top-level.
llvm-svn: 248916
Instructions which we can synthesis from a SCEV expression are not
generated directly, but only when they are used as an operand of
another instruction. This avoids generating unnecessary instructions
and works more reliably than first inserting them and then deleting
them later on.
This commit was reverted in r248860 due to a remaining miscompile, where
we forgot to synthesis the operand values that were referenced from scalar
writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this
now correctly.
llvm-svn: 248900
As a first step in the direction of assumed invariant loads (loads
that are not written in some context) we now detect and hoist
definitively invariant loads. These invariant loads will be preloaded
in the code generation and used in the optimized version of the SCoP.
If the load is only conditionally executed the preloaded version will
also only be executed under the same condition, hence we will never
access memory that wouldn't have been accessed otherwise. This is also
the most distinguishing feature to licm.
As hoisting can make statements empty we will simplify the SCoP and
remove empty statements that would otherwise cause artifacts in the
code generation.
Differential Revision: http://reviews.llvm.org/D13194
llvm-svn: 248861
This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207.
The commit broke at least one test in lnt,
MultiSource/Benchmarks/Ptrdist/bc/number.c
was miss compiled and the test produced a wrong result.
One Polly test case that was added later was adjusted too.
llvm-svn: 248860
Every once in a while we see code that accesses memory with different types,
e.g. to perform operations on a piece of memory using type 'float', but to copy
data to this memory using type 'int'. Modeled in C, such codes look like:
void foo(float A[], float B[]) {
for (long i = 0; i < 100; i++)
*(int *)(&A[i]) = *(int *)(&B[i]);
for (long i = 0; i < 100; i++)
A[i] += 10;
}
We already used the correct types during normal operations, but fall back to our
detected type as soon as we import changed memory access functions. For these
memory accesses we may generate invalid IR due to a mismatch between the element
type of the array we detect and the actual type used in the memory access. To
address this issue, we always cast the newly created address of a memory access
back to the type of the memory access where the address will be used.
llvm-svn: 248781
Instructions which we can synthesis from a SCEV expression are not generated
directly, but only when they are used as an operand of another instruction. This
avoids generating unnecessary instruction and works more reliably than first
inserting them and then deleting them later on.
Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
Differential Revision: http://reviews.llvm.org/D13208
llvm-svn: 248712
This patch allows switch instructions with affine conditions in the
SCoP. Also switch instructions in non-affine subregions are allowed.
Both did not require much changes to the code, though there was some
refactoring needed to integrate them without code duplication.
In the llvm-test suite the number of profitable SCoPs increased from
135 to 139 but more importantly we can handle more benchmarks and user
inputs without preprocessing.
Differential Revision: http://reviews.llvm.org/D13200
llvm-svn: 248701
We now only delete trivially dead instructions in the BB we copy (copyBB), but
not in any other BB. Only for copyBB we know that there will _never_ be any
future uses of instructions that have no use after copyBB has been generated.
Other instructions in the AST that have been generated by IslNodeBuilder may
look dead at the moment, but may possibly still be referenced by GlobalMaps. If
we delete them now, later uses would break surprisingly.
We do not have a test case that breaks due to us deleting too many instructions.
This issue was found by inspection.
llvm-svn: 248688
After having generated a new user statement a couple of inefficient or
trivially dead instructions may remain. This commit runs instruction
simplification over the newly generated blocks to ensure unneeded
instructions are removed right away.
This commit does adds simplification for non-affine subregions which was not
yet part of 248681.
llvm-svn: 248683
After having generated a new user statement a couple of inefficient or trivially
dead instructions may remain. This commit runs instruction simplification over
the newly generated blocks to ensure unneeded instructions are removed right
away.
This commit does not yet add simplification for non-affine subregions.
llvm-svn: 248681
This commit basically reverts r246427 but still solves the issue
tackled by that commit. Instead of emitting initialization code in the
beginning of the start block we now generate parallel code in its own
block and thereby guarantee separation. This is necessary as we cannot
generate code for hoisted loads prior to the start block but it still
needs to be placed prior to everything else.
llvm-svn: 248674
There are three possible reasons to add a memory memory access: For explicit load and stores, for llvm::Value defs/uses, and to emulate PHI nodes (the latter two called implicit accesses). Previously MemoryAccess only stored IsPHI. Register accesses could be identified through the isScalar() method if it was no IsPHI. isScalar() determined the number of dimensions of the underlaying array, scalars represented by zero dimensions.
For the work on de-LICM, implicit accesses can have more than zero dimensions, making the distinction of isScalars() useless, hence now stored explicitly in the MemoryAccess. Instead, we replace it by isImplicit() and avoid the term scalar for zero-dimensional arrays as it might be confused with llvm::Value which are also often referred to as scalars (or alternatively, as registers).
No behavioral change intended, under the condition that it was impossible to create explicit accesses to zero-dimensional "arrays".
llvm-svn: 248616
All MemoryAccess objects will be owned by ScopInfo::AccFuncMap which
previously stored the IRAccess objects. Instead of creating new
MemoryAccess objects, the already created ones are reused, but their
order might be different now. Some fields of IRAccess and MemoryAccess
had the same meaning and are merged.
This is the last step of fusioning TempScopInfo.{h|cpp} and
ScopInfo.{h.cpp}. Some refactoring might still make sense.
Differential Revision: http://reviews.llvm.org/D12843
llvm-svn: 248024
While we do not need to model PHI nodes in the region exit (as it is not part
of the SCoP), we need to prepare for the case that the exit block is split in
code generation to create a single exiting block. If this will happen, hence
if the region did not have a single exiting block before, we will model the
operands of the PHI nodes as escaping scalars in the SCoP.
Differential Revision: http://reviews.llvm.org/D12051
llvm-svn: 247078
When this option is enabled, Polly will emit printf calls for each scalar
load/and store which dump the scalar value loaded/stored at run time.
This patch also refactors the RuntimeDebugBuilder to use variadic templates
when generating CPU printfs. As result, it now becomes easier to print
strings that consist of a set of arguments. Also, as a single printf
call is emitted, it is more likely for such strings to be emitted atomically
if executed multi-threaded.
llvm-svn: 246941
By inspection the update of the GlobalMaps in the RegionGenerator seems unneed,
and is removed as also no test cases fail when dropping this. Johannes Doerfert
confirmed that this is indeed save:
"I think that code was needed when we did not use the scalar codegen by default.
Now everything defined in a non-affine region should be communicated via memory
and reloaded in the user block. Hence, we should be good removing this code."
llvm-svn: 246926
The GlobalMap variable used in BlockGenerator should always reference the same
list througout the entire code generation, hence we can make it a member
variable to avoid passing it around through every function call.
History: Before we switched to the SCEV based code generation the GlobalMap
also contained a mapping form old to new induction variables, hence it was
different for each ScopStmt, which is why we passed it as function argument
to copyStmt. The new SCEV based code generation now uses a separate mapping
called LTS -> LoopToSCEV that maps each original loop to a new loop iteration
variable provided as a SCEVExpr. The GlobalMap is currently mostly used for
OpenMP code generation, where references to parameters in the original function
need to be rewritten to the locations of these variables after they have been
passed to the subfunction.
Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 246920
Our OpenMP code generation generated part of its launching code directly into
the start basic block and without this change the scalar initialization was
run _after_ the OpenMP threads have been launched. This resulted in
uninitialized scalar values to be used.
llvm-svn: 246427
Scalar dependences between scop statements have caused troubles during parallel
code generation as we did not pass on the new stack allocation created for such
scalars to the parallel subfunctions. This change now detects all scalar
reads/writes in parallel subfunctions, creates the allocas for these scalar
objects, passes the resulting memory locations to the subfunctions and ensures
that within the subfunction requests for these memory locations will return the
rewritten values.
Johannes suggested as a future optimization to privatizing some of the scalars
in the subfunction.
llvm-svn: 246414
We already modeled read-only dependences to scalar values defined outside the
scop as memory reads and also generated read accesses from the corresponding
alloca instructions that have been used to pass these scalar values around
during code generation. However, besides for PHI nodes that have already been
handled, we failed to store the orignal read-only scalar values into these
alloc. This commit extends the initialization of scalar values to all read-only
scalar values used within the scop.
llvm-svn: 246394