Schedule dimensions that have the same constant value accross all statements do
not carry any information, but due to the increased dimensionality of the
schedule cost compile time. To not pay this cost, we remove constant dimensions
if possible.
llvm-svn: 225067
Isl now specifically marks modulo operations that are compared against zero.
They can be implemented with the C/LLVM remainder operation.
We also update a couple of test cases where the output of isl has slightly
changed.
llvm-svn: 223607
SCEV based code generation has been the default for two weeks after having
been tested for a long time. We now drop the support the non-scev-based code
generation.
llvm-svn: 222978
In case a GEP instruction references into a fixed size array e.g., an access
A[i][j] into an array A[100x100], LLVM-IR does not guarantee that the subscripts
always compute values that are within array bounds. We now derive the set of
parameter values for which all accesses are within bounds and add the assumption
that the scop is only every executed with this set of parameter values.
Example:
void foo(float A[][20], long n, long m {
for (long i = 0; i < n; i++)
for (long j = 0; j < m; j++)
A[i][j] = ...
This loop yields out-of-bound accesses if m is at least 20 and at the same time
at least one iteration of the outer loop is executed. Hence, we assume:
n <= 0 or m <= 20.
Doing so simplifies the dependence analysis problem, allows us to perform
more optimizations and generate better code.
TODO: The location where the GEP instruction is executed is not necessarily the
location where the memory is actually accessed. As a result scanning for GEP[s]
is imprecise. Even though this is not a correctness problem, this imprecision
may result in missed optimizations or non-optimal run-time checks.
In polybench where this mismatch between parametric loop bounds and fixed size
arrays is common, we see with this patch significant reductions in compile time
(up to 50%) and execution time (up to 70%). We see two significant compile time
regressions (fdtd-2d, jacobi-2d-imper), and one execution time regression
(trmm). Both regressions arise due to additional optimizations that have been
enabled by this patch. They can be addressed in subsequent commits.
http://reviews.llvm.org/D6369
llvm-svn: 222754
+ Introduced dependency type TYPE_TC_RED to represent the transitive closure
(& the reverse) of reduction dependences. These are used when we check for
reduction parallel loops.
+ Test cases including loop reversals and modulo schedules which compute
reductions in a alternated order.
llvm-svn: 213019
Iterate over all store memory accesses and check for valid binary reduction
candidate loads by following the operands of the stored value. For each
candidate pair we check if they have the same base address and there are no
other accesses which may overlap with them. This ensures that no intermediate
value can escape into other memory locations or is overwritten at some point.
+ 17 test cases for reduction detection and reduction dependency modeling
llvm-svn: 211957
This change will ease the transision to multiple reductions per statement as
we can now distinguish the effects of multiple reductions in the same
statement.
+ Wrapped reduction dependences are used to compute privatization dependences
+ Modified test cases to account for the change
llvm-svn: 211795
This dependency analysis will keep track of memory accesses if they might be
part of a reduction. If not, the dependences are tracked on a statement level.
The main reason to do this is to reduce the compile time while beeing able to
distinguish the effects of reduction and non-reduction accesses.
+ Adjusted two test cases
llvm-svn: 211794
+ Collect reduction dependences
+ Introduced TYPE_RED in Dependences.h which can be used to obtain the
reduction dependences
+ Used TYPE_RED to prevent parallelization while we do not have a privatizing
code generation
+ Relax the dependences for non-parallel code generation
+ Add privatization dependences to ensure correctness
+ 12 Test cases to check for reduction and privatization dependences
llvm-svn: 211369
This is necessary to avoid test failures in the CLooG test suite due to the
recent isl update.
We also need to update two polly test cases which rely on a certain order in the
textual description that isl chooses for its sets and maps. Changes here are not
often, but we should probably switch to a check that verifies such maps are
semantically equivalent instead of represented identically.
llvm-svn: 203476
Count the number of computational steps that have been used to solve the
dependence problem and abort in case we reach the "compute-out". This ensures we
do not hang forever in cases the dependence problem is too difficult to solve.
There is just a single case in the LLVM test-suite that runs into the
compute-out. Even in this case, we can probably coalesce some of the parameters
(i32 b, i32 b zext i64, ...) to simplify the problem enough to not hit the
compute out. However, for now we set the compute out in place to address the
general issue. The compute out was choosen such that it stops on a recent laptop
after about 8 seconds.
llvm-svn: 200156
Instead of calculating exact value (flow) dependences, it is also possible to
calculate memory based dependences. Sometimes memory based dependences are a lot
easier to calculate. To evaluate the benefits, we add an option to calculate
memory based dependences (use -polly-value-dependences=false).
llvm-svn: 167251