We use the branch instruction as the location at which a PHI-node write takes
place, instead of the PHI-node itself. This allows us to identify the
basic-block in a region statement which is on the incoming edge of the PHI-node
and for which the write access was originally introduced. As a result we can,
during code generation, avoid generating PHI-node write accesses for basic
blocks that do not preceed the PHI node without having to look at the IR
again.
This change fixes a bug which was introduced in r243420, when we started to
explicitly model PHI-node reads and writes, but dropped some additional checks
that where still necessary during code generation to not emit PHI-node writes
for basic-blocks that are not on incoming edges of the original PHI node.
Compared to the code before r243420 the new code does not need to inspect the IR
any more and we also do not generate multiple redundant writes.
llvm-svn: 243852
SCEVExpander, which we are using during code generation, only allows
instructions as insert locations, but breaks in case BasicBlock->end() iterators
are passed to it due to it trying to obtain the basic block in which code should
be generated by calling Instruction->getParent(), which is not defined for
->end() iterators.
This change adds an assert to Polly that ensures we only pass valid instructions
to SCEVExpander and it fixes one case, where we used IRBuilder->SetInsertBlock()
to set an ->end() insert location which was later passed to SCEVExpander.
In general, Polly is always trying to build up the CFG first, before we actually
insert instructions into the CFG sceleton. As a result, each basic block should
already have at least one branch instruction before we start adding code. Hence,
always requiring the IRBuilder insert location to be set to a real instruction
should always be possible.
Thanks Utpal Bora <cs14mtech11017@iith.ac.in> for his help with test case
reduction.
llvm-svn: 243830
As specified in PR23888, run-time alias check generation is expensive
in terms of compile-time. This reduces the compile time by computing
minimal/maximal access only once for each base pointer
Contributed-by: Pratik Bhatu <cs12b1010@iith.ac.in>
llvm-svn: 243024
Instead of flat schedules, we now use so-called schedule trees to represent the
execution order of the statements in a SCoP. Schedule trees make it a lot easier
to analyze, understand and modify properties of a schedule, as specific nodes
in the tree can be choosen and possibly replaced.
This patch does not yet fully move our DependenceInfo pass to schedule trees,
as some additional performance analysis is needed here. (In general schedule
trees should be faster in compile-time, as the more structured representation
is generally easier to analyze and work with). We also can not yet perform the
reduction analysis on schedule trees.
For more information regarding schedule trees, please see Section 6 of
https://lirias.kuleuven.be/handle/123456789/497238
llvm-svn: 242130
This removes old code that has been disabled since several weeks and was hidden
behind the flags -disable-polly-intra-scop-scalar-to-array=false and
-polly-model-phi-nodes=false. Earlier, Polly used to translate scalars and
PHI nodes to single element arrays, as this avoided the need for their special
handling in Polly. With Johannes' patches adding native support for such scalar
references to Polly, this code is not needed any more. After this commit both
-polly-prepare and -polly-independent are now mostly no-ops. Only a couple of
simple transformations still remain, but they are scheduled for removal too.
Thanks again to Johannes Doerfert for his nice work in making all this code
obsolete.
llvm-svn: 240766
Remainder operations with constant divisor can be modeled as quasi-affine
expression. This patch adds support for detecting and modeling them. We also
add a test that ensures they are correctly code generated.
This patch was extracted from a larger patch contributed by Johannes Doerfert
in http://reviews.llvm.org/D5293
llvm-svn: 240518
LLVM's instcombine already translates power-of-two sdivs that are known to be
exact to fast ashr instructions. Hence, there is no need to add this logic
ourselves.
Pointed-out-by: Johannes Doerfert
llvm-svn: 239025
We now verify that memory access functions imported via JSON are indeed defined
for the full iteration domain. Before this change we accidentally imported
memory mappings such as i -> i / 127, which only defined a mapped for values of
i that are evenly divisible by 127, but which did not define any mapping for the
remaining values, with the result that isl just generated an access expression
that had undefined behavior for all the unmapped values.
In the incorrect test cases, we now either use floor(i/127) or we use p/127 and
provide the information that p is indeed a multiple of 127.
llvm-svn: 239024
isl marks known non-negative numerators in modulo (and soon also division)
operations. We now exploit this by generating unsigned operations. This is
beneficial as unsigned operations with power-of-two denominators will be
translated by isl to fast bitshift or bitwise and operations.
llvm-svn: 238577
This ensures we pass all tests independently of how we set the options
-disable-polly-intra-scop-scalar-to-array and -polly-model-phi-nodes.
(At least if we enable both or disable both. Enabling them individually makes
little sense, as they will hopefully disappear soon anyhow).
llvm-svn: 238087
To reduce compile time and to allow more and better quality SCoPs in
the long run we introduced scalar dependences and PHI-modeling. This
patch will now allow us to generate code if one or both of those
options are set. While the principle of demoting scalars as well as
PHIs to memory in order to communicate their value stays the same,
this allows to delay the demotion till the very end (the actual code
generation). Consequently:
- We __almost__ do not modify the code if we do not generate code
for an optimized SCoP in the end. Thus, the early exit as well as
the unprofitable option will now actually preven us from
introducing regressions in case we will probably not get better
code.
- Polly can be used as a "pure" analyzer tool as long as the code
generator is set to none.
- The original SCoP is almost not touched when the optimized version
is placed next to it. Runtime regressions if the runtime checks
chooses the original are not to be expected and later
optimizations do not need to revert the demotion for that part.
- We will generate direct accesses to the demoted values, thus there
are no "trivial GEPs" that select the first element of a scalar we
demoted and treated as an array.
Differential Revision: http://reviews.llvm.org/D7513
llvm-svn: 238070
Modified two test cases to adjust to the above change in renaming.
These two files were causing the buildbot failure in Polly, #30204 for example.
Details in http://reviews.llvm.org/D9483
This checkin goes with r237150 and r237151
llvm-svn: 237203
Besides class, function and file names, we also change the command line option
from -polly-codegen-isl to just -polly-codegen. The isl postfix is a leftover
from the times when we still had the CLooG based -polly-codegen. Today it is
just redundant and we drop it.
llvm-svn: 237099
This option is enabled since a long time and there does not seem to be a
situation in which we would not want to print alias scopes. Remove this option
to reduce the set of command-line option combinations that may expose bugs.
llvm-svn: 235861
I just learned that target triples prevent test cases to be run on other
architectures. Polly test cases are until now sufficiently target independent
to not require any target triples. Hence, we drop them.
llvm-svn: 235384
This change ensures that we sign-extend integer types in case non-matching
operands are encountered when generating a multi-dimensional access offset.
This fixes http://llvm.org/PR23124
Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 234122
This allows us to delinerize code such as:
A[][n]
for (i
for (j
A[i][n-j-1] = ...
which would previously have been delinearize to an access A[i+1][-j-1].
To recover the correct access we apply the piecewise expression:
{ A[i][j] -> A[i-1][i+N]: i < 0; A[i][j] -> A[i][i]: i >= 0}
This approach generalizes to higher dimensions.
llvm-svn: 233566
When creating parameters the SCEVexpander may introduce new induction variables,
that possibly create scalar dependences in the original scop, before we code
generate the scop. The resulting scalar dependences may then inhibit correct
code generation of the scop. To prevent this, we first version the code without
a run-time check and only then introduce new parameters and the run-time
condition. The if-condition that guards the original scop from being modified by
the SCEVexpander.
This change causes some test case changes as the run-time conditions are now
introduced in the split basic block rather than in the entry basic block.
This fixes http://llvm.org/PR22069
Test case reduced by: Karthik Senthil
llvm-svn: 233477
This options was earlier used for experiments with the vectorizer, but to my
knowledge is not really used anymore. If anybody needs this, we can always
reintroduce this feature.
llvm-svn: 232934
The BB vectorizer is deprecated and there is no point in generating code for it
any more. This option was introduced when there was not yet any loop vectorizer
in sight. Now being matured, Polly should target the loop vectorizer.
llvm-svn: 232099
When code generating array index expressions the types of the different
components of the index expressions may not always match. We extend the type of
the index expression (if possible) and assert otherwise.
llvm-svn: 231592
When we generate code for a whole region we have to respect dominance
and update it too.
The first is achieved with multiple "BBMap"s. Each copied block in the
region gets its own map. It is initialized only with values mapped in
the immediate dominator block, if this block is in the region and was
therefor already copied. This way no values defined in a block that
doesn't dominate the current one will be used.
To update dominance information we check if the immediate dominator of
the original block we want to copy is in the region. If so we set the
immediate dominator of the current block to the copy of the immediate
dominator of the original block.
llvm-svn: 230774
isl recently introduced a new interface to create run-time checks from
constraint sets. Use this interface to simplify our run-time check generation.
llvm-svn: 230640
This is the code generation for region statements that are created
when non-affine control flow was present in the input. A new
generator, similar to the block or vector generator, for regions is
used to traverse and copy the region statement and to adjust the
control flow inside the new region in the end.
llvm-svn: 230340
Scops that only read seem generally uninteresting and scops that only write are
most likely initializations where there is also little to optimize. To not
waste compile time we bail early.
Differential Revision: http://reviews.llvm.org/D7735
llvm-svn: 229820
This commit imports the latest isl version into lib/External/isl. The changes
relavant for Polly are:
1) Schedule trees [1] have been introduced as a more structured way to
describe schedules. Polly does not yet use them, but we may switch to them
in the near future.
2) Another set of coalescing changes [2] simplifies some data dependences and
removes a couple of code generation artifacts.
We now understand that the following sets can be merged:
{ Stmt_S1[i0, i1] -> Stmt_S2[i0 + i1] :
i0 >= 0 and i1 <= 1023 - i0 and i1 >= 1
Stmt_S1[i0, 0] -> Stmt_S2[i0] : i0 <= 1023 and i0 >= 1}
into:
{ Stmt_S1[i0, i1] -> Stmt_S2[i0 + i1] : i1 <= 1023 - i0 and i1 >= 0 and
i1 >= 1 - i0 and i0 >= 0 }
Changes of this kind reduce unnecessary specialization during code
generation.
- for (int c3 = 0; c3 <= 1023; c3 += 1) {
- if (c3 % 2 == 0) {
- Stmt_for_body3(c1, c3);
- } else
- Stmt_for_body3(c1, c3);
- }
+ for (int c3 = 0; c3 <= 1023; c3 += 1)
+ Stmt_for_body3(c1, c3);
[1] http://impact.gforge.inria.fr/impact2014/papers/impact2014-verdoolaege.pdf
[2] http://impact.gforge.inria.fr/impact2015/papers/impact2015-verdoolaege.pdf
llvm-svn: 229423
This allows us to skip ast and code generation if we did not optimize
a SCoP and will not generate parallel or alias annotations. The
initial heuristic to exit is simple but allows improvements later on.
All failing test cases have been modified to disable early exit, thus
to keep their coverage.
Differential Revision: http://reviews.llvm.org/D7254
llvm-svn: 228851