When we generate code for a whole region we have to respect dominance
and update it too.
The first is achieved with multiple "BBMap"s. Each copied block in the
region gets its own map. It is initialized only with values mapped in
the immediate dominator block, if this block is in the region and was
therefore already copied. This way no values defined in a block that
doesn't dominate the current one will be used.
To update dominance information we check if the immediate dominator of
the original block we want to copy is in the region. If so we set the
immediate dominator of the current block to the copy of the immediate
dominator of the original block.
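A minimal sketch of the per-block map idea, assuming blocks and values are plain strings and that blocks are visited so that a dominator is copied before the blocks it dominates. RegionCopier, copyBlock and copyName are hypothetical names for illustration and are not the actual generator interface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins: blocks and values are plain strings here.
using ValueMap = std::map<std::string, std::string>; // original -> copy

struct RegionCopier {
  std::map<std::string, std::string> IDom;     // block -> immediate dominator
  std::map<std::string, bool> InRegion;        // is the block in the region?
  std::map<std::string, ValueMap> BlockMaps;   // one value map per copied block
  std::map<std::string, std::string> CopyIDom; // copied block -> its idom

  static std::string copyName(const std::string &BB) { return BB + ".copy"; }

  // Copy one block. Defs lists the values defined in BB and their copies.
  void copyBlock(const std::string &BB, const ValueMap &Defs) {
    ValueMap Map;
    auto It = IDom.find(BB);
    if (It != IDom.end() && InRegion[It->second]) {
      // Start only from the values visible in the immediate dominator.
      Map = BlockMaps[It->second];
      // The copy of the idom becomes the idom of the copy.
      CopyIDom[copyName(BB)] = copyName(It->second);
    }
    for (const auto &D : Defs)
      Map[D.first] = D.second;
    BlockMaps[BB] = Map;
  }
};
```

For a diamond-shaped region (entry, then, else, merge), the map of merge starts from the map of entry, so a value defined only in then is not visible in merge.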
llvm-svn: 230774
After a function was created we will verify it for Debug builds. If
errors are found and debug-type equals "polly-codegen-isl" the SCoP,
the isl AST, the function as well as the errors will be printed.
llvm-svn: 230767
isl recently introduced a new interface to create run-time checks from
constraint sets. Use this interface to simplify our run-time check generation.
llvm-svn: 230640
For Polly the two interesting changes are short_circuit && and || AST
expressions as well as the introduction of isl_ast_build_expr_from_set,
a well defined interface to compute ast expressions from constraint sets.
llvm-svn: 230636
With the patches r230325, r230329 and r230340 we can handle non-affine
control flow in (loop-free) subregions. As all LLVM test-suite tests pass and
we get ~20% more non-trivial SCoPs, we activate it now by default.
llvm-svn: 230624
This update contains:
- Fixes of minor issues detected by clang's scan-build
- More schedule tree infrastructure additions
This update slightly changes the output of our dependence analysis, but these
changes are purely syntactic.
llvm-svn: 230528
This is the code generation for region statements that are created
when non-affine control flow was present in the input. A new
generator, similar to the block or vector generator, for regions is
used to traverse and copy the region statement and to adjust the
control flow inside the new region in the end.
llvm-svn: 230340
This allows us to model non-affine regions in the SCoP representation.
SCoP statements can now describe either basic blocks or non-affine
regions. In the latter case all accesses in the region are accumulated
for the statement and write accesses, except in the entry, have to be
marked as may-write.
Differential Revision: http://reviews.llvm.org/D7846
llvm-svn: 230329
With this patch we allow the SCoP detection to detect regions as SCoPs
which have non-affine control flow inside. All non-affine regions are
tracked and later accessible to the ScopInfo.
As there is no real difference, non-affine branches as well as
floating point branches are covered (and both called non-affine
control flow). However, the detection is restricted to
overapproximate only loop free regions.
llvm-svn: 230325
Scops that only read seem generally uninteresting and scops that only write are
most likely initializations where there is also little to optimize. To not
waste compile time we bail early.
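The early bail-out could be sketched as a simple predicate over the access kinds of a scop. AccessKind and isWorthOptimizing are hypothetical names, and the real check may look at more than this.

```cpp
#include <cassert>
#include <vector>

enum class AccessKind { Read, Write };

// Returns true if a scop with these accesses passes the heuristic from the
// text: it needs at least one read and at least one write.
bool isWorthOptimizing(const std::vector<AccessKind> &Accesses) {
  bool HasRead = false, HasWrite = false;
  for (AccessKind K : Accesses) {
    HasRead |= (K == AccessKind::Read);
    HasWrite |= (K == AccessKind::Write);
  }
  return HasRead && HasWrite;
}
```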
Differential Revision: http://reviews.llvm.org/D7735
llvm-svn: 229820
This is just a single commit that includes a performance optimization that
should improve dependence analysis time. Our performance bots should measure
this difference.
llvm-svn: 229476
This commit imports the latest isl version into lib/External/isl. The changes
relevant for Polly are:
1) Schedule trees [1] have been introduced as a more structured way to
describe schedules. Polly does not yet use them, but we may switch to them
in the near future.
2) Another set of coalescing changes [2] simplifies some data dependences and
removes a couple of code generation artifacts.
We now understand that the following sets can be merged:
{ Stmt_S1[i0, i1] -> Stmt_S2[i0 + i1] :
i0 >= 0 and i1 <= 1023 - i0 and i1 >= 1
Stmt_S1[i0, 0] -> Stmt_S2[i0] : i0 <= 1023 and i0 >= 1}
into:
{ Stmt_S1[i0, i1] -> Stmt_S2[i0 + i1] : i1 <= 1023 - i0 and i1 >= 0 and
i1 >= 1 - i0 and i0 >= 0 }
Changes of this kind reduce unnecessary specialization during code
generation.
- for (int c3 = 0; c3 <= 1023; c3 += 1) {
- if (c3 % 2 == 0) {
- Stmt_for_body3(c1, c3);
- } else
- Stmt_for_body3(c1, c3);
- }
+ for (int c3 = 0; c3 <= 1023; c3 += 1)
+ Stmt_for_body3(c1, c3);
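The set merge shown above can be checked by brute force. This sketch only compares the two descriptions of the Stmt_S1 domain point by point on a finite box; it does not use isl, and the function names are made up for illustration.

```cpp
#include <cassert>

// The two pieces before coalescing, as membership tests on (i0, i1).
bool inPieces(long i0, long i1) {
  bool First = i0 >= 0 && i1 <= 1023 - i0 && i1 >= 1;
  bool Second = i1 == 0 && i0 <= 1023 && i0 >= 1;
  return First || Second;
}

// The single set after coalescing.
bool inMerged(long i0, long i1) {
  return i1 <= 1023 - i0 && i1 >= 0 && i1 >= 1 - i0 && i0 >= 0;
}

// Compare both descriptions on every point of the box [Lo, Hi] x [Lo, Hi].
bool agreeOnBox(long Lo, long Hi) {
  for (long i0 = Lo; i0 <= Hi; ++i0)
    for (long i1 = Lo; i1 <= Hi; ++i1)
      if (inPieces(i0, i1) != inMerged(i0, i1))
        return false;
  return true;
}
```

Calling agreeOnBox(-8, 1032) covers the whole domain plus a small margin around it.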
[1] http://impact.gforge.inria.fr/impact2014/papers/impact2014-verdoolaege.pdf
[2] http://impact.gforge.inria.fr/impact2015/papers/impact2015-verdoolaege.pdf
llvm-svn: 229423
Alias checks might become costly if there are divisions that complicate the
description of the accessed locations. By overapproximating them we get fairly
accurate results without the huge compile time cost.
llvm-svn: 229252
namespace and header rather than the top-level header and using
declarations. These helpers impede modular builds and are going away.
Migrating away from them will also be necessary to start mixing in any
usage of the new pass manager.
llvm-svn: 229091
Without this change we get linker errors such as:
undefined reference to `llvm::dbgs()'
We only conditionally link in these libraries, as in BUILD_SHARED_LIBS=OFF mode,
linking in these libraries causes such functions (and especially global options)
to be defined twice. The "solution" I chose is most likely not ideal, but seems
to work. If any cmake specialist can suggest a better approach, this would be
appreciated.
We also drop a .c file that is not needed as it caused linker errors as well.
llvm-svn: 228914
This allows us to skip ast and code generation if we did not optimize
a SCoP and will not generate parallel or alias annotations. The
initial heuristic to exit is simple but allows improvements later on.
All failing test cases have been modified to disable early exit, thus
to keep their coverage.
Differential Revision: http://reviews.llvm.org/D7254
llvm-svn: 228851
These writes are important as they will force the scheduling and code
generation of an otherwise trivial statement and also impose an order of
execution needed to guarantee the correct final value for a scalar in a loop.
Added test case modeled after ClamAV/clamscan.
llvm-svn: 228847
This change has two main purposes:
1) We do not use a static interface to hide an object we create and
destroy for every basic block we copy.
2) We allow the BlockGenerator to store information between calls to
the copyBB method. This will ease scalar/phi code generation
later on.
While a lot of method signatures were changed this should not cause
any real behaviour change.
Differential Revision: http://reviews.llvm.org/D7467
llvm-svn: 228443
This allows us to model PHI nodes in the polyhedral description
without demoting them. The modeling however will result in the
same accesses as the demotion would have introduced.
Differential Revision: http://reviews.llvm.org/D7415
llvm-svn: 228433
With this patch Polly is always GPL-free (no dependency on GMP any more). As a
result, building and distributing Polly will be easier. Furthermore, there is no
need to tightly coordinate isl and Polly releases anymore.
We import isl b3e0fa7a05d as well as imath 4d707e5ef2. These are the git
versions Polly currently was tested with when using utils/checkout_isl.sh. The
imported libraries are both MIT-style licensed.
We build isl and imath with -fvisibility=hidden to avoid clashes in case other
projects (such as gcc) use conflicting versions of isl. The use of imath can
temporarily reduce compile-time performance of Polly. We will work on
performance tuning in tree.
Patches to isl should be contributed first to the main isl repository and can
then later be reimported to Polly.
This patch is also a prerequisite for the upcoming isl C++ interface.
llvm-svn: 228193
The support is currently limited as we only allow them in the input but do
not emit them in the transformed SCoP due to the possible semantic changes.
Differential Revision: http://reviews.llvm.org/D5225
llvm-svn: 227054
This change ensures that the values that represent the array size of a
multi-dimensional access are correctly sign-extended when used to compute a
memory address used in the run-time alias check.
To make the test case more readable, we name the instructions that we generate.
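To illustrate why the extension kind matters, here is a small sketch that widens a 32-bit size value to 64 bits in both ways and uses it in an address computation. The element size of 4 bytes and all function names are assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>

// Widen a 32-bit size to 64 bits by sign extension (keeps the signed value).
int64_t signExtend(int32_t Size) { return static_cast<int64_t>(Size); }

// Widen by zero extension: the bit pattern is treated as unsigned first.
int64_t zeroExtend(int32_t Size) {
  return static_cast<int64_t>(static_cast<uint32_t>(Size));
}

// Byte offset of A[I][J] for rows of Size elements, 4 bytes per element.
int64_t byteOffset(int64_t I, int64_t J, int64_t Size) {
  return (I * Size + J) * 4;
}
```

Both extensions agree for non-negative sizes, but they produce different 64-bit values as soon as the 32-bit value has its top bit set.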
llvm-svn: 225818
The max loop depth was incorrectly computed for scops that contain a
block from a loop but do not contain the entire loop. We need to
check that the full loop is contained in the region when computing
the max loop depth.
These scops occur when a region containing an inner loop is expanded
to include some blocks from the outer loop, but it cannot be fully
expanded to contain the outer loop because the region containing the
outer loop is invalid.
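A sketch of the corrected computation, assuming loops are given as lists of block names and that depth is measured relative to the shallowest loop fully inside the region. LoopInfo, containsWholeLoop and this relative-depth definition are simplifications chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct LoopInfo {
  unsigned Depth;                  // absolute nesting depth, outermost is 1
  std::vector<std::string> Blocks; // every block that belongs to the loop
};

// A loop only counts if the region contains all of its blocks.
bool containsWholeLoop(const std::set<std::string> &Region, const LoopInfo &L) {
  for (const std::string &BB : L.Blocks)
    if (Region.count(BB) == 0)
      return false;
  return true;
}

// Depth of the deepest nest formed by loops fully inside the region,
// measured relative to the shallowest such loop.
unsigned maxLoopDepth(const std::set<std::string> &Region,
                      const std::vector<LoopInfo> &Loops) {
  unsigned Min = 0, Max = 0;
  for (const LoopInfo &L : Loops) {
    if (!containsWholeLoop(Region, L))
      continue;
    Min = (Min == 0) ? L.Depth : std::min(Min, L.Depth);
    Max = std::max(Max, L.Depth);
  }
  return Max == 0 ? 0 : Max - Min + 1;
}
```

A region that holds the inner loop plus only the latch of the outer loop gets depth 1 here, because the outer loop is not fully contained.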
Differential Revision: http://reviews.llvm.org/D6913
llvm-svn: 225812
This support is still incomplete and consequently hidden behind a switch that
needs to be enabled. One current problem is that we incorrectly interpret very large
unsigned values as negative values, even if they are used in an unsigned comparison.
llvm-svn: 225480
AF = dyn_cast<SCEVAddRecExpr>(Pair.second) may be NULL for some SCEVs that we do
not support. When reporting the error we still want to pass a pointer that is
known to always be non-NULL.
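The pattern can be sketched with standard C++ dynamic_cast in place of LLVM's dyn_cast. Expr, AddRec, reportError and check are hypothetical stand-ins for the SCEV classes and the error reporting.

```cpp
#include <cassert>
#include <string>

struct Expr {
  virtual ~Expr() = default;
  virtual std::string str() const { return "expr"; }
};
struct AddRec : Expr {
  std::string str() const override { return "addrec"; }
};

// Report an unsupported expression. The argument must never be null.
std::string reportError(const Expr *E) {
  assert(E && "reportError expects a non-null expression");
  return "unsupported: " + E->str();
}

std::string check(const Expr *Original) {
  const AddRec *AF = dynamic_cast<const AddRec *>(Original);
  if (!AF)
    return reportError(Original); // pass Original, since AF is null here
  return "ok: " + AF->str();
}
```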
I do not yet have a test case for this, unfortunately.
llvm-svn: 225461
We previously used a Twine here, but as pointed out by David Blaikie
and Mehdi Amini storing a temporary StringRef in a Twine is not a good
idea, as the StringRef will be freed before the Twine is used leaving
a Twine that points to uninitialized memory. We now make it explicit that
we use a StringRef here.
llvm-svn: 225342
Schedule dimensions that have the same constant value across all statements do
not carry any information, but they cost compile time because they increase the
dimensionality of the schedule. To avoid this cost, we remove constant dimensions
where possible.
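A toy version of this cleanup, assuming each schedule is a fixed-length vector whose entries are either a constant or a loop iterator. Dim, Schedule and dropConstantDims are names made up for the sketch; the real schedules are isl objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Dim {
  bool IsConstant; // false for dimensions that carry a loop iterator
  int Value;       // only meaningful if IsConstant is true
};
using Schedule = std::vector<Dim>;

// Drop every dimension that is the same constant in all statements.
// Assumes all schedules have the same number of dimensions.
std::vector<Schedule> dropConstantDims(const std::vector<Schedule> &Scheds) {
  if (Scheds.empty())
    return {};
  const Schedule &First = Scheds[0];
  std::vector<bool> Drop(First.size(), true);
  for (const Schedule &S : Scheds)
    for (std::size_t D = 0; D < First.size(); ++D)
      if (!S[D].IsConstant || S[D].Value != First[D].Value)
        Drop[D] = false;
  std::vector<Schedule> Result;
  for (const Schedule &S : Scheds) {
    Schedule Out;
    for (std::size_t D = 0; D < S.size(); ++D)
      if (!Drop[D])
        Out.push_back(S[D]);
    Result.push_back(Out);
  }
  return Result;
}
```

For two statements scheduled as [0, i, 0] and [0, i, 1], only the leading 0 is removed; the last dimension stays because it distinguishes the statements.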
llvm-svn: 225067
Without updating dependences we may lose implicit transitive dependences for
which all explicit dependences have gone through the statement iterations we
have just eliminated.
No test case. We should probably implement a -verify-dependences option.
This fixes llvm.org/PR21227
llvm-svn: 224459
The dead code elimination is a pass that looks very promising, but needs some
more compile-time tuning before enabling it by default seems sensible.
llvm-svn: 223965
This simplifies the construction of the input for the reduction dependence
computation and at the same time removes an assumption that expects the schedule
to be of 2D + 1 form (the odd dimensions giving textual order, the even
dimensions the loop iterations).
llvm-svn: 223621
Isl now specifically marks modulo operations that are compared against zero.
They can be implemented with the C/LLVM remainder operation.
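The reason this works can be sketched as follows: mathematical modulo and the C remainder differ for negative operands, but they agree on whether the result is zero. The helper names are chosen for the example only.

```cpp
#include <cassert>

// Mathematical (floor) modulo: the result is always in [0, D) for D > 0.
long floorMod(long A, long D) {
  long R = A % D;
  return R < 0 ? R + D : R;
}

// C remainder: truncates toward zero, so the result may be negative.
long cRemainder(long A, long D) { return A % D; }

// Check on a finite range that both agree on the comparison against zero.
bool zeroTestsAgree(long Lo, long Hi, long MaxD) {
  for (long D = 1; D <= MaxD; ++D)
    for (long A = Lo; A <= Hi; ++A)
      if ((floorMod(A, D) == 0) != (cRemainder(A, D) == 0))
        return false;
  return true;
}
```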
We also update a couple of test cases where the output of isl has slightly
changed.
llvm-svn: 223607
This commit drops the Cloog support for Polly. The scripts and
documentation are changed to only use isl as prerequisite. In the code
all Cloog specific parts have been removed and all relevant tests have
been ported to the isl backend when it was created.
llvm-svn: 223141
Polly had a copy of this pass to create the canonical induction variables
necessary for the non-scev-based code generation. As we now always use SCEV
based code generation, canonical induction variables are not needed any more.
llvm-svn: 222979
SCEV based code generation has been the default for two weeks after having
been tested for a long time. We now drop support for the non-scev-based code
generation.
llvm-svn: 222978
In TempScopInfo::buildCondition we extract the conditions to guard the
BB *in addition to* loop bounds. This means we should only consider the
conditions in the paths (in CFG) that do not contain cycles (loops).
At the same time, we set the invert flag if the FalseBB of the current
branch dominates our target BB to indicate that we reach the target BB
with an inverted condition from the current branch.
In this case, the path from the FalseBB contains a cycle if the FalseBB
is the target of a backedge. The conditions implied by such a path should
not be considered. We can identify such a case by checking if the TrueBB
also dominates our target BB, which means we can also reach our target
BB from the TrueBB, without going through the backedge.
llvm-svn: 222907
In case a GEP instruction references into a fixed size array e.g., an access
A[i][j] into an array A[100x100], LLVM-IR does not guarantee that the subscripts
always compute values that are within array bounds. We now derive the set of
parameter values for which all accesses are within bounds and add the assumption
that the scop is only ever executed with this set of parameter values.
Example:
void foo(float A[][20], long n, long m) {
  for (long i = 0; i < n; i++)
    for (long j = 0; j < m; j++)
      A[i][j] = ...
}
This loop yields out-of-bound accesses if m is larger than 20 and at the same time
at least one iteration of the outer loop is executed. Hence, we assume:
n <= 0 or m <= 20.
Doing so simplifies the dependence analysis problem, allows us to perform
more optimizations and generate better code.
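The assumption for the example above can be validated by brute force on a small range of parameter values. This sketch only models the column subscript of A[][20]; the function names are illustrative.

```cpp
#include <cassert>

// The assumption derived in the text for A[][20].
bool assumed(long N, long M) { return N <= 0 || M <= 20; }

// Brute force: does any executed iteration use a column subscript >= 20?
bool hasOutOfBoundsAccess(long N, long M) {
  for (long I = 0; I < N; ++I)
    for (long J = 0; J < M; ++J)
      if (J >= 20)
        return true;
  return false;
}

// The assumption should hold exactly when no out-of-bounds access happens.
bool assumptionIsExact(long Lo, long Hi) {
  for (long N = Lo; N <= Hi; ++N)
    for (long M = Lo; M <= Hi; ++M)
      if (assumed(N, M) == hasOutOfBoundsAccess(N, M))
        return false;
  return true;
}
```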
TODO: The location where the GEP instruction is executed is not necessarily the
location where the memory is actually accessed. As a result scanning for GEP[s]
is imprecise. Even though this is not a correctness problem, this imprecision
may result in missed optimizations or non-optimal run-time checks.
In polybench where this mismatch between parametric loop bounds and fixed size
arrays is common, we see with this patch significant reductions in compile time
(up to 50%) and execution time (up to 70%). We see two significant compile time
regressions (fdtd-2d, jacobi-2d-imper), and one execution time regression
(trmm). Both kinds of regression arise due to additional optimizations that have been
enabled by this patch. They can be addressed in subsequent commits.
http://reviews.llvm.org/D6369
llvm-svn: 222754
SCEV based code generation allows Polly to detect and generate code for loops
that do not have an explicit induction variable, but only virtual induction
variables given by SCEV.
Being able to do so has two main benefits:
- We can detect more scops by default
- We require less canonicalization before Polly, which means we get closer
to our goal of not touching the IR before analyzing its properties.
Specifically, we do not need to run -polly-indvars to introduce explicit
canonical induction variables.
This switch became possible as both the isl code generation and -polly-parallel
are LNT error free with SCEV based code generation and the isl ast generator.
llvm-svn: 222113
This prevents SCEVs from referencing values that are no longer valid and, as a
consequence, solves a bug where such values, reintroduced during ast generation,
caused the independent blocks pass to fail validation.
http://llvm.org/PR21204
llvm-svn: 222103
The isl based backend has been tested for a long time, and with the recently
committed OpenMP support the last missing piece of functionality was ported from
the CLooG backend.
The isl based backend gives us interesting new functionality:
- Run-time alias checks (enabled by default)
Optimize scops that contain possibly aliasing pointers. This feature has
largely increased the number of loop nests we consider for optimization.
Thanks Johannes!
- Delinearization (not yet enabled by default)
Model accesses to multi-dimensional arrays precisely. This will allow us to
understand kernels with multi-dimensional VLAs written in Julia, boost::ublas,
coremark or C99.
Thanks Sebastian!
- Generation of higher quality code
Sven and I spent a long time optimizing the quality of the generated code. A
major focus was expressions as they result from modulos/divisions or
piecewise affine expressions (a ? b : c).
- Full/Partial tile separation, polyhedral unrolling
The isl code generation provides functionality to generate specialized code
for core and cleanup loops and to specialize code using polyhedral context
information while unrolling statements.
(not yet exploited in Polly)
- Modifiable access functions
We can now use standard isl functionality to remap memory accesses to new
data locations. A standard use case is the use of shared memory, where
accesses to a larger region in global memory need to be mapped to a smaller
shared memory region using a modulo mapping.
(not yet exploited in Polly)
The cloog based code generation is still available for comparison, but is
scheduled for removal.
llvm-svn: 222101
Instead of parallelizing every parallel outermost loop, we now use a very
minimalistic cost model. Specifically, we assume innermost loops are not
worth parallelising and all non-innermost loops are.
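The heuristic is small enough to sketch directly. Loop and worthParallelizing are hypothetical names, and the IsParallel flag is assumed to come from the dependence analysis.

```cpp
#include <cassert>
#include <vector>

struct Loop {
  bool IsParallel;            // no loop-carried dependences
  std::vector<Loop> SubLoops; // directly nested loops
};

// Heuristic from the text: innermost loops are never worth parallelizing,
// all other parallel loops are.
bool worthParallelizing(const Loop &L) {
  bool IsInnermost = L.SubLoops.empty();
  return L.IsParallel && !IsInnermost;
}
```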
When parallelizing all loops in LNT we got several slowdowns/timeouts due to
us parallelizing innermost loops that are executed only a couple of times
(number of iterations not known statically). With this basic heuristic enabled
LNT does not show any more timeouts, while several interesting loops are still
parallelized.
There are many ways to obtain an improved heuristic. Constructing such an
improved heuristic from a position of minimal slow-down and zero code size
increase seems to be the best, as it allows us to track progress on LNT.
llvm-svn: 222096
This backend supports besides the classical code generation the upcoming SCEV
based code generation (which the existing CLooG backend does not support
robustly).
OpenMP code generation in the isl backend benefits from our run-time alias
checks such that the set of loops that can possibly be parallelized is a lot
larger.
The code was tested on LNT. We do not regress on builds without -polly-parallel.
When using -polly-parallel most tests work flawlessly, but a few issues still
remain and will be addressed in follow up commits.
SCEV/non-SCEV codegen:
- Compile time failure in ldecod and TimberWolfMC due to a problem in our
run-time alias check generation triggered by pointers that escape through
the OpenMP subfunction (OpenMP specific).
- Several execution time failures. Due to the larger set of loops that we now
parallelize (compared to the classical code generation), we currently run
into some timeouts in tests with a lot of loops that have a low trip count and
are slowed down by parallelizing them.
SCEV only:
- One existing failure in lencod due to llvm.org/PR21204 (not OpenMP specific)
OpenMP code generation is the last feature that was only available in the CLooG
backend. With the isl backend being the only one supporting features such as
run-time alias checks and delinearization, we will soon switch to use the isl
ast generator by default and subsequently remove our dependency on CLooG.
http://reviews.llvm.org/D5517
llvm-svn: 222088
Polly was accidentally modifying a debug info metadata node when
attempting to generate a new unique metadata node for the loop id.
The problem was that we had dwarf metadata that referred to a
metadata node with a null value, like this:
!6 = ... some dwarf metadata referring to !7 ...
!7 = {null}
When we attempt to generate a new metadata node, we reserve the
first space for self-referential node by setting the first argument
to null and then mutating the node later to refer to itself.
However, because the nodes are uniqued based on pointer values, when
we get the new metadata node it actually referred to an existing
node (!7 in the example). When we went to modify the metadata to
point to itself, we were accidentally mutating the dwarf metadata. We
ended up in this situation:
!6 = ... some dwarf metadata referring to !7 ...
!7 = {!7}
and this causes an assert when generating the debug info. The fix is
simple, we just need to use a unique value when getting a new
metadata node. The MDNode::getTemporary() provides exactly the API
we need (and it is used in clang to generate the unique nodes).
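The failure mode can be reproduced with a toy uniquing table. This is only a model: Context, get and getDistinct are invented for the sketch, with getDistinct playing the role that the text assigns to MDNode::getTemporary(); none of this is the LLVM metadata API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

struct Node { const Node *Operand; }; // metadata node with one operand

class Context {
  std::map<const Node *, Node *> Uniqued; // keyed by the operand pointer
  std::vector<std::unique_ptr<Node>> Storage;

public:
  // Uniqued lookup: the same operand always yields the same node.
  Node *get(const Node *Operand) {
    Node *&Slot = Uniqued[Operand];
    if (!Slot) {
      Storage.push_back(std::unique_ptr<Node>(new Node{Operand}));
      Slot = Storage.back().get();
    }
    return Slot;
  }
  // Always returns a fresh node that no other user can already reference.
  Node *getDistinct() {
    Storage.push_back(std::unique_ptr<Node>(new Node{nullptr}));
    return Storage.back().get();
  }
};

// Buggy: may return, and then mutate, a node that others already use.
Node *makeLoopIDBuggy(Context &C) {
  Node *Id = C.get(nullptr);
  Id->Operand = Id;
  return Id;
}

// Fixed: start from a node that is guaranteed to be unique.
Node *makeLoopIDFixed(Context &C) {
  Node *Id = C.getDistinct();
  Id->Operand = Id;
  return Id;
}
```

If some other metadata already holds the node for {null}, the buggy variant turns that node into a self-reference, which corresponds to the !7 = {!7} situation described above.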
Differential Revision: http://reviews.llvm.org/D6174
llvm-svn: 221550
We introduce a new flag -polly-parallel and use it to annotate the for-nodes in
the isl ast that we want to execute thread parallel (e.g., using OpenMP). We
previously already emitted openmp annotations, but we did this for various
kinds of parallel loops, including some which we can not run in parallel.
With this patch we now have three annotations:
1) #pragma known-parallel [reduction]
2) #pragma omp for
3) #pragma simd
meaning:
1) loop has no loop carried dependences
2) loop will be executed thread-parallel
3) loop can possibly be vectorized
This patch introduces 1) and reduces the use of 2) to only the cases where we
will actually generate thread parallel code.
It is in preparation of openmp code generation in our isl backend.
Legacy:
- We also have a command line option -enable-polly-openmp. This option controls
the OpenMP code generation in CLooG. It will become an alias of
-polly-parallel after the CLooG code generation has been dropped.
http://reviews.llvm.org/D6142
llvm-svn: 221479
This patch moves the SCEV based (re)generation of values before the checking for
scop-constant terms. It enables us to provide SCEV based replacements, which
are necessary to correctly generate OpenMP subfunctions when using the SCEV
based code generation.
When recomputing a new value for a value used in the code of the original scop,
we previously directly returned the same original value for all scop-constant
expressions without even trying to regenerate these values using our SCEV
expression. This is correct when the newly generated code remains fully in the
same function, however in case we want to outline parts of the newly generated
scop into subfunctions, this approach means we do not have any opportunity to
update these values in the SCEV based code generation. (In the non-SCEV based
code generation, we can provide such updates through the GlobalMap). To ensure
we have this opportunity, we first try to regenerate scalar terms with our SCEV
builder and will only return scop-constant expressions if SCEV based code
generation was not possible.
This change should not affect the results of the existing code generation
passes. It only impacts the upcoming OpenMP based code generation.
This commit also adds a test case. This test case passes before and after this
commit. It was added to ensure test coverage for the changed code.
llvm-svn: 221393
There was no good reason why this code was split across two functions.
In subsequent changes we will change the order in which values are looked up.
Doing so would make the split into two functions even more arbitrary.
We also slightly improve the documentation.
llvm-svn: 221388
When our RuntimeDebugBuilder calls fflush(NULL) to flush all output streams, it
is important that the types we use in the call match the ones used in a
declaration of fflush possibly already available in the translation unit.
As we just pass on a NULL pointer, the type of the pointer value does not really
matter. However, as LLVM complains in case of mismatched types, we make sure
to create a NULL pointer of identical type.
No test case, as RuntimeDebugBuilder is not permanently used in Polly. Calls to
it are until now only used to add informative output during debugging sessions.
llvm-svn: 221251
We will use ScalarEvolution in the ScopInfo.cpp to get the loop trip
count, not cache it in the TempScop object.
Differential Revision: http://reviews.llvm.org/D6070
llvm-svn: 221035
Now MaxLoopDepth only lives in Scops not in TempScops anymore.
This is the first part of a series of changes to make TempScops
obsolete.
Differential Revision: http://reviews.llvm.org/D6069
llvm-svn: 221026
Originally we needed this code to map the isl_id of an array to its base
pointer. However, as now the isl_id contains a reference to the array itself we
obtain the base pointer from this isl_id and we do not need to add this
information to the IDToValue map.
llvm-svn: 220876
The description of the parameter value passed to -enable-polly-aligned did
not make any sense at all, but was just a leftover coming from when this option
was copied from -enable-polly-openmp. We just drop it as the option description
gives sufficient information already.
llvm-svn: 220445
This makes sure we consistently use dbgs() when printing debug output.
Previously, the code just mixed calls to isl_*_dump() with printing to dbgs()
and was relying for both methods to interact in predictable ways (same output
stream, no unexpected reordering of outputs).
llvm-svn: 220443
By adding braces into the DEBUG statement we can make clang-format format code
such as:
DEBUG(stmt1(); stmt2())
as multi-line code:
DEBUG({
stmt1();
stmt2();
});
This makes control-flow in debug statements easier to read.
llvm-svn: 220441
This patch changes the RegionSet type used in ScopDetection from a
std::set to a llvm::SetVector. The reason for the change is to
ensure deterministic output when printing the result of the
analysis. We had a windows buildbot failure for the modified test
because the output was coming in a different order.
Only one test case needed to be modified for this change. We could
use CHECK-DAG directives instead of CHECK in the analysis test cases
because the actual order of scops does not matter, but I think that
change should be done in a separate patch that modifies all the
applicable tests. I simply modified the test to reflect the
expected deterministic output.
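The difference between the two containers can be sketched with the standard library alone. OrderedSet below is a minimal stand-in written in the spirit of llvm::SetVector, not its actual implementation: iteration follows insertion order instead of the order of pointer values.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Minimal insertion-ordered set: a set for membership, a vector for order.
template <typename T> class OrderedSet {
  std::set<T> Seen;
  std::vector<T> Order;

public:
  // Returns true if V was newly inserted, false if it was already present.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }
  typename std::vector<T>::const_iterator begin() const { return Order.begin(); }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
  std::size_t size() const { return Order.size(); }
};
```

Iterating a std::set of pointers visits them by address, which can differ between platforms and runs; iterating this container always visits them in the order they were added.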
Differential Revision: http://reviews.llvm.org/D5897
llvm-svn: 220423
This patch does not change the semantics on its own. However, the
dependence analysis as well as dce will now use the newest available
access relation for each memory access, thus if at some point the json
importer or any other pass will run before those two and set a new
access relation the behaviour will be different. In general it is
unclear if the dependence analysis and dce should be run on the old or
new access functions anyway. If we need to access the original access
function from the outside later, we can expose the getter again.
Differential Revision: http://reviews.llvm.org/D5707
llvm-svn: 219612
We restricted the new access functions to be a subset of the old one
because we want to keep the alignment, however if the alignment is
"not special", thus the default for the type, we can allow any access.
Differential Revision: http://reviews.llvm.org/D5680
llvm-svn: 219503
In case the piecewise affine function used to create an isl_ast_expr
had empty cases (e.g., with contradicting constraints on the
parameters), it was possible that the condition of the isl_ast_expr
select was not a comparison but a constant (thus of type i64).
This patch does two things:
1) Handle the case where the condition of a select is not of type i1,
treating it as C would.
2) Try to simplify the piecewise affine functions for the min/max
access when we generate runtime alias checks. That step can often
remove empty or redundant cases as well as redundant constraints.
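The first point can be sketched as a normalization step that turns a wider integer into a 1-bit condition by comparing it against zero, as C does. Cond, toI1 and selectValue are hypothetical names; the actual fix operates on LLVM IR values.

```cpp
#include <cassert>
#include <cstdint>

// A condition as it might come out of expression generation: either a
// 1-bit comparison result or a wider integer constant.
struct Cond {
  unsigned Bits;
  int64_t Value;
};

// Normalize to a 1-bit value using C semantics: nonzero means true.
Cond toI1(Cond C) {
  if (C.Bits == 1)
    return C;
  return Cond{1, C.Value != 0 ? 1 : 0};
}

// Select between two values after normalizing the condition.
int64_t selectValue(Cond C, int64_t IfTrue, int64_t IfFalse) {
  return toI1(C).Value != 0 ? IfTrue : IfFalse;
}
```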
This fixes bug: http://llvm.org/PR21167
Differential Revision: http://reviews.llvm.org/D5627
llvm-svn: 219208
-Wcomment complained about a "multi-line comment" caused by the
ascii art used in ScopHelper to describe the CFG.
Differential Revision: http://reviews.llvm.org/D5618
llvm-svn: 219207
This resolved the issues with delinearized accesses that might alias,
thus delinearization doesn't deactivate runtime alias checks anymore.
Differential Revision: http://reviews.llvm.org/D5614
llvm-svn: 219078
This class allows us to store information about the arrays in the SCoP.
For each base pointer in the SCoP one object is created storing the
type and dimension sizes of the array. The objects can be obtained via
the SCoP, a MemoryAccess or the isl_id associated with the output
dimension of a MemoryAccess (the description of what is accessed).
So far we use the information in the IslExprBuilder to create the
right base type before indexing into the base array. This fixes the
bug http://llvm.org/bugs/show_bug.cgi?id=21113 (both test cases are
included). On top of that we can now build runtime alias checks for
delinearized arrays as the dimension sizes are also part of the
ScopArrayInfo objects.
Differential Revision: http://reviews.llvm.org/D5613
llvm-svn: 219077
+ Generalized function names and comments
+ Removed OpenMP (omp) from the names and comments
+ Use common names (non OpenMP specific) for runtime library call creation
methods
+ Commented the parallel code generator and all its member functions
+ Refactored some values and methods
Differential Revision: http://reviews.llvm.org/D4990
llvm-svn: 219003
This also forbids the json importer to access other memory locations
than the original instruction as we want to reuse the alignment of the
original load/store.
Differential Revision: http://reviews.llvm.org/D5560
llvm-svn: 218883
The LoopAnnotator doesn't annotate only loops any more, thus it is
called ScopAnnotator from now on.
This also removes unnecessary polly:: namespace tags.
llvm-svn: 218878
The command line flag -polly-annotate-alias-scopes controls whether or not
Polly annotates alias scopes in the new SCoP (default ON). This can improve
later optimizations as the new SCoP is basically an alias free environment for
them.
llvm-svn: 218877
This change allows us to annotate all parallel loops with loop id metadata.
Furthermore, it will annotate memory instructions with
llvm.mem.parallel_loop_access metadata for all surrounding parallel loops.
This is especially useful if an external paralleliser is used.
This also removes the PollyLoopInfo class and comments the
LoopAnnotator.
A test case for multiple parallel loops is attached.
llvm-svn: 218793
We use a parametric abstraction of the domain to split alias groups
if accesses cannot be executed under the same parameter evaluation.
The two test cases check that we can remove alias groups if the
pointers which might alias are never accessed under the same parameter
evaluation and that the minimal/maximal accesses are not global but
with regards to the parameter evaluation.
Differential Revision: http://reviews.llvm.org/D5436
llvm-svn: 218758
If there are multiple read only base addresses in an alias group
we can split it into multiple alias groups each with only one
read only access. This way we might reduce the number of
comparisons significantly as it grows linearly in the number of
alias groups but exponentially in their size.
Differential Revision: http://reviews.llvm.org/D5435
llvm-svn: 218757
This is just an optimization to save compile time and execution time
for runtime alias checks if the user guarantees no aliasing altogether.
llvm-svn: 218613
If too many parameters are involved in accesses used to create RTCs
we might end up with enormous compile times and RTC expressions.
The reason is that the lexmin/lexmax is dependent on all these
parameters and isl might need to create a case for every "ordering"
of them (e.g., p0 <= p1 <= p2, p1 <= p0 <= p2, ...).
The exact number of parameters allowed in accesses is defined by the
command line option -polly-rtc-max-parameters=XXX and set by default
to 8.
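To get a feeling for the growth, the number of possible orderings of n parameters is n factorial. The sketch below computes this worst-case count and a guard against a limit of 8; both function names are hypothetical, and isl will not necessarily create every case.

```cpp
#include <cassert>
#include <cstdint>

// Number of strict orderings of N distinct parameters (N factorial).
uint64_t countOrderings(unsigned N) {
  uint64_t Count = 1;
  for (unsigned I = 2; I <= N; ++I)
    Count *= I;
  return Count;
}

// Hypothetical guard mirroring the option in the text (default limit: 8).
bool withinParameterLimit(unsigned NumParams, unsigned MaxParams = 8) {
  return NumParams <= MaxParams;
}
```

Three parameters allow 6 orderings, while eight already allow 40320.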
Differential Revision: http://reviews.llvm.org/D5500
llvm-svn: 218566
The run-time alias check places code that involves the base pointer at the
beginning of the SCoP. This breaks if the base pointer is defined inside the
SCoP. Hence, we can only create a run-time alias check if we are sure the base
pointer is not an instruction defined inside the scop. If it is we refuse to
handle the SCoP.
This commit should unbreak most of our current LNT failures.
Differential Revision: http://reviews.llvm.org/D5483
llvm-svn: 218412
This fixes two problems which are usually caused together:
1) The elements of an isl AST access expression could be pointers
not only integers, floats and vectors thereof.
2) The runtime alias checks need to compare pointers but if they
are of a different type we need to cast them into a "max" type
similar to the non pointer case.
llvm-svn: 218113
This commit drops a call to std::sort, which sorted the base pointers that
possibly alias according to the address at which their corresponding llvm::Value
was allocated. There does not seem to be any good reason why those pointers
should be (re)sorted, and this only makes the output non-deterministic.
llvm-svn: 218052
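Why an address-based order is fragile can be seen in a small sketch. FakeValue is a hypothetical stand-in for llvm::Value. Where the objects end up in memory depends on the allocator, so any order derived from their addresses may change between runs, while the insertion order does not.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an llvm::Value that only carries a name.
struct FakeValue {
  std::string Name;
};

// Collect names in the order the values were inserted. Unlike an order
// derived from the objects' addresses, this does not depend on the allocator.
std::vector<std::string>
namesInInsertionOrder(const std::vector<std::unique_ptr<FakeValue>> &Values) {
  std::vector<std::string> Names;
  for (const auto &V : Values) {
    assert(V && "expected a valid value");
    Names.push_back(V->Name);
  }
  return Names;
}
```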
This change will build all alias groups (minimal/maximal accesses
to possible aliasing base pointers) we have to check before
we can assume an alias free environment. It will also use these
to create Runtime Alias Checks (RTC) in the ISL code generation
backend, thus allowing us to optimize SCoPs despite possibly aliasing
pointers when this backend is used.
This feature will be enabled for the isl code generator, e.g.,
--polly-code-generator=isl, but disabled for:
- The cloog code generator (still the default).
- The case where delinearization is enabled.
- The case where non-affine accesses are allowed.
llvm-svn: 218046
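The idea behind the alias groups of r218046 can be sketched as a pairwise range check. This is a minimal sketch under assumptions: AccessRange, its half-open [Min, Max) byte range and both function names are hypothetical, and the check that is actually generated may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of the address range touched through one base
// pointer: [Min, Max) in bytes, i.e. the minimal and maximal access.
struct AccessRange {
  std::uintptr_t Min;
  std::uintptr_t Max;
};

// Two ranges are disjoint if one ends before the other begins.
bool disjoint(const AccessRange &A, const AccessRange &B) {
  assert(A.Min <= A.Max && B.Min <= B.Max && "malformed range");
  return A.Max <= B.Min || B.Max <= A.Min;
}

// An alias group passes the check if all its ranges are pairwise disjoint.
bool aliasGroupIsAliasFree(const std::vector<AccessRange> &Group) {
  for (std::size_t I = 0; I < Group.size(); ++I)
    for (std::size_t J = I + 1; J < Group.size(); ++J)
      if (!disjoint(Group[I], Group[J]))
        return false;
  return true;
}
```

Only if every alias group passes such a check can the optimized code assume an alias free environment.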
We use SplitEdge to split a conditional entry edge of the SCoP region.
However, SplitEdge can cause two different situations (depending on
whether or not the edge is critical). This patch tests
which one is present and deals with the formerly unhandled one.
It also refactors and unifies the case we have to change the basic
blocks of the SCoP to new ones (see replaceScopAndRegionEntry).
llvm-svn: 217802
During the IslAst parallelism check also compute the minimal dependency
distance and store it in the IslAst for node.
Reviewer: sebpop
Differential Revision: http://reviews.llvm.org/D4987
llvm-svn: 217729
Even though we previously correctly detected the multi-dimensional access
pattern for accesses with a certain base address, we only delinearized
non-affine accesses to this address. Affine accesses have not been touched and
remained as single dimensional accesses. The result was an inconsistent
description of accesses to the same array, with some being one dimensional and
some being multi-dimensional.
This patch ensures that all accesses are delinearized with the same
dimensionality as soon as a single one of them has been detected as non-affine.
While writing this patch, it became evident that the options
-polly-allow-nonaffine and -polly-detect-keep-going have not been properly
supported in case delinearization has been turned on. This patch adds relevant
test coverage and addresses these issues as well. We also added some more
documentation to the functions that are modified in this patch.
This fixes llvm.org/PR20123
Differential Revision: http://reviews.llvm.org/D5329
llvm-svn: 217728
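The two views of the same array that r217728 makes consistent can be illustrated numerically. This is only a sketch with concrete numbers, not the symbolic analysis itself; linearOffset and toTwoDims are hypothetical helpers and a row-major layout with inner dimension size M is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Single-dimensional view: offset of element (I, J) in a row-major array
// whose inner dimension has size M.
std::size_t linearOffset(std::size_t I, std::size_t J, std::size_t M) {
  assert(J < M && "inner index must stay within the inner dimension");
  return I * M + J;
}

// Multi-dimensional view: recover (I, J) from the offset.
std::pair<std::size_t, std::size_t> toTwoDims(std::size_t Offset,
                                              std::size_t M) {
  assert(M > 0 && "inner dimension must be non-empty");
  return {Offset / M, Offset % M};
}
```

Describing some accesses by their offset and others by (I, J) is the kind of inconsistency the patch removes.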
At the moment we assume that only elements of identical size are stored/loaded
to a certain base pointer. This patch adds logic to the scop detection to verify
this.
Differential Revision: http://reviews.llvm.org/D5329
llvm-svn: 217727
This allows us to omit the GuardBB in front of created loops
if we can show the loop trip count is at least one. It also
simplifies the dominance relation inside the newly created region.
A GuardBB (even with a constant branch condition) might trigger
false dominance errors during function verification.
Differential Revision: http://reviews.llvm.org/D5297
llvm-svn: 217525
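At the source level the effect of r217525 looks roughly like the sketch below. Both functions are hypothetical and the if statement only plays the role of the GuardBB. The unguarded form is valid only when the trip count is known to be at least one.

```cpp
#include <cassert>

// Loop with a guard: the body must not run when the range is empty.
int sumGuarded(const int *A, int LB, int UB) {
  int Sum = 0;
  if (LB < UB) { // guard, plays the role of the GuardBB
    int I = LB;
    do {
      Sum += A[I];
    } while (++I < UB);
  }
  return Sum;
}

// Same loop without the guard; only valid if the caller knows LB < UB.
int sumUnguarded(const int *A, int LB, int UB) {
  assert(LB < UB && "trip count must be at least one");
  int Sum = 0;
  int I = LB;
  do {
    Sum += A[I];
  } while (++I < UB);
  return Sum;
}
```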
Summary:
+ Refactor the runtime check (RTC) build function
+ Added helper function to create a PollyIRBuilder
+ Change the simplify region function to create not
only unique entry and exit edges but also enforce that
the entry edge is unconditional
+ Cleaned the IslCodeGeneration runOnScop function:
- less post-creation changes of the created IR
+ Adjusted and added test cases
Reviewers: grosser, sebpop, simbuerg, dpeixott
Subscribers: llvm-commits, #polly
Differential Revision: http://reviews.llvm.org/D5076
llvm-svn: 217508
The previous code added in r216842 most likely created unnecessary copies.
Reported-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
llvm-svn: 217507
It seems we added guards to check for non-existing std::map elements to make
sure they are default constructed before they are first accessed. Besides the
code being wrong because it checked Context.NonAffineAccesses[BasePointer].size()
instead of Context.count(BasePointer), such a check is also not necessary,
as std::map takes care of this already.
From the std::map documentation:
"If k does not match the key of any element in the container, the function
inserts a new element with that key and returns a reference to its mapped value.
Notice that this always increases the container size by one, even if no mapped
value is assigned to the element (the element is constructed using its default
constructor)."
llvm-svn: 217506
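The std::map behaviour quoted in r217506 can be checked with a few lines. numAccesses and the key/value types are hypothetical and only mirror the shape of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using AccessMap = std::map<std::string, std::vector<int>>;

// Returns the number of elements stored for Key. operator[] inserts a
// default-constructed (empty) vector if Key is not present yet, so no
// explicit guard is needed before the access.
std::size_t numAccesses(AccessMap &Accesses, const std::string &Key) {
  std::size_t SizeBefore = Accesses.size();
  std::size_t N = Accesses[Key].size();
  assert(Accesses.size() >= SizeBefore && "operator[] never removes elements");
  return N;
}
```

Note that calling size() on the result of operator[] also inserts the key, which is why it cannot be used to test for existence.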
There was a bug in the IslAst which prevented any further outermost
parallel loops from being detected/checked after a parallel outermost
loop of depth 1.
+ Test case attached
llvm-svn: 217452
Arcanist (arc) will now always run linters before uploading any new
commit to Phabricator. All errors/warnings (or their absence) will be
shown in the web interface together with an explanation by the committer
(arcanist will ask the committer if the build was not clean).
The linters include:
- clang-format
- spelling check
- permissions check (aka. chmod)
- filename check
- merge conflict marker check
Note that their scope is sometimes limited (see .arclint for
details).
This commit also fixes all errors and warnings these linters reported,
namely:
- spelling mistakes and typos
- executable permissions for various text files
Differential Revision: http://reviews.llvm.org/D4916
llvm-svn: 215871
This will spill out information about LLVM-internals. However, in cases
where the name of the Value matches the name of the array in the source,
we provide more useful information. In cases where we spill internals,
the information still might help the user to pin down the correct
arrays.
The problem we face here is: The error is pinned to the debug location
of one of the offending values out of the alias set instead of all of them.
The more information we give the user about the set of aliasing
pointers the better.
llvm-svn: 215830
This reverts commit 215466 (and 215528, a trivial formatting fix).
The intention of these commits is a good one, but unfortunately they broke
our LNT buildbot:
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-codegen-isl
Several of the cleanup changes that have been combined in this 'fixup' are
trivial and could probably be committed as obvious changes without risking
breaking the build. The remaining changes are small and it should be easy to
figure out what went wrong.
llvm-svn: 215817
This reverts commit 215684. The intention of the commit is great, but
unfortunately it seems to be the cause of 14 LNT test suite failures:
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly/builds/116
To make our buildbots and performance testers green until this issue is solved,
we temporarily revert this commit.
llvm-svn: 215816
The support is limited to signed modulo access and condition
expressions with a constant right hand side, e.g., A[i % 2] or
A[i % 9]. Test cases are modified according to this new feature and
new test cases are added.
Differential Revision: http://reviews.llvm.org/D4843
llvm-svn: 215684
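What "signed modulo" means for an access like A[i % 2] in r215684 can be sketched in C++, assuming the semantics match the C++ '%' operator on signed integers (truncation toward zero). Both helpers are hypothetical.

```cpp
#include <cassert>

// C++ '%' on signed integers truncates toward zero, so the result takes
// the sign of the dividend.
int signedMod(int I, int M) {
  assert(M != 0 && "modulo by zero is undefined");
  return I % M;
}

// Hypothetical helper mimicking an access A[i % 2] for non-negative i.
int loadModTwo(const int (&A)[2], int I) {
  assert(I >= 0 && "negative i would yield a negative index");
  return A[signedMod(I, 2)];
}
```

Under this assumption -3 % 2 evaluates to -1 rather than 1, so the modulo expression only maps into the range [0, 2) for non-negative i.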
Store the llvm::Value pointers of the AliasSet instead of the AliasSet
itself.
We have to be careful about changed IR when the message is generated,
because the Value pointers might not exist anymore. This would render
the Diagnostic invalid. For now we just assert there.
Simply do not retrieve a diagnostic message after the IR has changed;
it is not valid information anyway.
llvm-svn: 215625
Remove the PoCC and ScopLib support from Polly as we do not have a
user/maintainer for it.
Differential Revision: http://reviews.llvm.org/D4871
llvm-svn: 215563
Use the explicit analysis if possible, only for splitBlock we will continue
to use the Pass * argument. This change allows us to remove the getAnalysis
calls from the code generation.
llvm-svn: 215121
There is no need for either 1-dimensional or higher-dimensional arrays to
require positive offsets in the outermost array dimension.
We originally introduced this assumption with the support for delinearizing
multi-dimensional arrays.
llvm-svn: 214665
+ Remove the class IslGenerator which duplicates the functionality of
IslExprBuilder.
+ Use the IslExprBuilder to create code for memory access relations.
+ Also handle array types during access creation.
+ Enable scev codegen for one of the transformed memory access tests,
thus access creation without canonical induction variables available.
+ Update one test case to the new output.
llvm-svn: 214659
+ Split all reduction dependences and map them to the causing memory accesses.
+ Print the types & base addresses of broken reductions for each "reduction
parallel" marked loop (OpenMP style).
+ 3 test cases to show how reductions are now represented in the isl ast.
The mapping "(ast) loops -> broken reductions" is also needed to find the
memory accesses we need to privatize in a loop.
llvm-svn: 214489
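Why the memory accesses of a broken reduction need to be privatized (r214489) can be sketched as follows. Both functions are hypothetical and the chunks are executed one after another here. The point is only that each chunk touches its own accumulator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reduction: every iteration reads and writes Sum, which is the
// dependence that is "broken" when the loop is treated as parallel.
long sumSequential(const std::vector<int> &A) {
  long Sum = 0;
  for (int V : A)
    Sum += V;
  return Sum;
}

// Privatized form: each chunk accumulates into its own copy, so chunks
// could run independently; the copies are combined afterwards.
long sumPrivatized(const std::vector<int> &A, std::size_t NumChunks) {
  assert(NumChunks > 0 && "need at least one chunk");
  std::vector<long> Private(NumChunks, 0);
  for (std::size_t C = 0; C < NumChunks; ++C)
    for (std::size_t I = C; I < A.size(); I += NumChunks)
      Private[C] += A[I];
  long Sum = 0;
  for (long P : Private)
    Sum += P;
  return Sum;
}
```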
The functions isParallel, isInnermostParallel and IsOutermostParallel in
IslAstInfo will now return true even in the presence of broken reductions.
To compensate for this change the negated result of isReductionParallel can
be used.
llvm-svn: 214488
+ Perform the parallelism check on the innermost loop only once.
+ Inline the markOpenmpParallel function.
+ Rename all IslAstUserPayload * into Payload to make it consistent.
llvm-svn: 214448
When we build the IslAst we visit for nodes (in pre and post order) as well as
user/domain nodes. As these two sets are non overlapping we do not need to
check if we annotated a node earlier when we visit it.
llvm-svn: 214170