In case the pieceweise affine function used to create an isl_ast_expr
had empty cases (e.g., with contradicting constraints on the
parameters), it was possible that the condition of the isl_ast_expr
select was not a comparison but a constant (thus of type i64).
This patch does two thing:
1) Handle the case the condition of a select is not a i1 type like C.
2) Try to simplify the pieceweise affine functions for the min/max
access when we generate runtime alias checks. That step can often
remove empty or redundant cases as well as redundant constrains.
This fixes bug: http://llvm.org/PR21167
Differential Revision: http://reviews.llvm.org/D5627
llvm-svn: 219208
-Wcomment complained about a "multi-line comment" caused by the
ascii art used in ScopHelper to describe the CFG.
Differential Revision: http://reviews.llvm.org/D5618
llvm-svn: 219207
This resolved the issues with delinearized accesses that might alias,
thus delinearization doesn't deactivate runtime alias checks anymore.
Differential Revision: http://reviews.llvm.org/D5614
llvm-svn: 219078
This class allows to store information about the arrays in the SCoP.
For each base pointer in the SCoP one object is created storing the
type and dimension sizes of the array. The objects can be obtained via
the SCoP, a MemoryAccess or the isl_id associated with the output
dimension of a MemoryAccess (the description of what is accessed).
So far we use the information in the IslExprBuilder to create the
right base type before indexing into the base array. This fixes the
bug http://llvm.org/bugs/show_bug.cgi?id=21113 (both test cases are
included). On top of that we can now build runtime alias checks for
delinearized arrays as the dimension sizes are also part of the
ScopArrayInfo objects.
Differential Revision: http://reviews.llvm.org/D5613
llvm-svn: 219077
Update debug info testcases for the LLVM metadata schema change in
r219010 to fold metadata constant operands into a single `MDString`.
Part of PR17891.
llvm-svn: 219019
+ Generalized function names and comments
+ Removed OpenMP (omp) from the names and comments
+ Use common names (non OpenMP specific) for runtime library call creation
methodes
+ Commented the parallel code generator and all its member functions
+ Refactored some values and methodes
Differential Revision: http://reviews.llvm.org/D4990
llvm-svn: 219003
This also forbids the json importer to access other memory locations
than the original instruction as we to reuse the alignment of the
original load/store.
Differential Revision: http://reviews.llvm.org/D5560
llvm-svn: 218883
The LoopAnnotator doesn't annotate only loops any more, thus it is
called ScopAnnotator from now on.
This also removes unnecessary polly:: namespace tags.
llvm-svn: 218878
The command line flag -polly-annotate-alias-scopes controls whether or not
Polly annotates alias scopes in the new SCoP (default ON). This can improve
later optimizations as the new SCoP is basically an alias free environment for
them.
llvm-svn: 218877
arc unit will now show the number of consecutive tests with the same
result instead of printing a "." for each one. Due to the number of
tests the "dots" didn't fit in one line any more. Furthermore the
result list is shortened, only non passing tests or tests taking
longer than a time threshold (50ms) will be reported (both to the user
and to phabricator).
llvm-svn: 218826
This change allows to annotate all parallel loops with loop id metadata.
Furthermore, it will annotate memory instructions with
llvm.mem.parallel_loop_access metadata for all surrounding parallel loops.
This is especially usefull if an external paralleliser is used.
This also removes the PollyLoopInfo class and comments the
LoopAnnotator.
A test case for multiple parallel loops is attached.
llvm-svn: 218793
We use a parametric abstraction of the domain to split alias groups
if accesses cannot be executed under the same parameter evaluation.
The two test cases check that we can remove alias groups if the
pointers which might alias are never accessed under the same parameter
evaluation and that the minimal/maximal accesses are not global but
with regards to the parameter evaluation.
Differential Revision: http://reviews.llvm.org/D5436
llvm-svn: 218758
If there are multiple read only base addresses in an alias group
we can split it into multiple alias groups each with only one
read only access. This way we might reduce the number of
comparisons significantly as it grows linear in the number of
alias groups but exponential in their size.
Differential Revision: http://reviews.llvm.org/D5435
llvm-svn: 218757
This is just a optimization to save the compile time and execution time
for runtime alias checks if the user guarantees no aliasing all together.
llvm-svn: 218613
If too many parameters are involved in accesses used to create RTCs
we might end up with enormous compile times and RTC expressions.
The reason is that the lexmin/lexmax is dependent on all these
parameters and isl might need to create a case for every "ordering"
of them (e.g., p0 <= p1 <= p2, p1 <= p0 <= p2, ...).
The exact number of parameters allowed in accesses is defined by the
command line option -polly-rtc-max-parameters=XXX and set by default
to 8.
Differential Revision: http://reviews.llvm.org/D5500
llvm-svn: 218566
The run-time alias check places code that involves the base pointer at the
beginning of the SCoP. This breaks if the base pointer is defined inside the
SCoP. Hence, we can only create a run-time alias check if we are sure the base
pointer is not an instruction defined inside the scop. If it is we refuse to
handle the SCoP.
This commit should unbreak most of our current LNT failures.
Differential Revision: http://reviews.llvm.org/D5483
llvm-svn: 218412
This fixes two problems which are usualy caused together:
1) The elements of an isl AST access expression could be pointers
not only integers, floats and vectores thereof.
2) The runtime alias checks need to compare pointers but if they
are of a different type we need to cast them into a "max" type
similar to the non pointer case.
llvm-svn: 218113
This commit drops a call to std::sort, which sorted the base pointers that
possibly alias according to the address at which their corresponding llvm::Value
was allocated. There does not seem to be any good reason, why those pointers
should be (re)sorted and this only makes the output indeterministic.
llvm-svn: 218052
This change will build all alias groups (minimal/maximal accesses
to possible aliasing base pointers) we have to check before
we can assume an alias free environment. It will also use these
to create Runtime Alias Checks (RTC) in the ISL code generation
backend, thus allow us to optimize SCoPs despite possibly aliasing
pointers when this backend is used.
This feature will be enabled for the isl code generator, e.g.,
--polly-code-generator=isl, but disabled for:
- The cloog code generator (still the default).
- The case delinearization is enabled.
- The case non-affine accesses are allowed.
llvm-svn: 218046
We use SplitEdge to split a conditional entry edge of the SCoP region.
However, SplitEdge can cause two different situations (depending on
whether or not the edge is critical). This patch tests
which one is present and deals with the former unhandled one.
It also refactors and unifies the case we have to change the basic
blocks of the SCoP to new ones (see replaceScopAndRegionEntry).
llvm-svn: 217802
During the IslAst parallelism check also compute the minimal dependency
distance and store it in the IstAst for node.
Reviewer: sebpop
Differential Revision: http://reviews.llvm.org/D4987
llvm-svn: 217729
Even though we previously correctly detected the multi-dimensional access
pattern for accesses with a certain base address, we only delinearized
non-affine accesses to this address. Affine accesses have not been touched and
remained as single dimensional accesses. The result was an inconsistent
description of accesses to the same array, with some being one dimensional and
some being multi-dimensional.
This patch ensures that all accesses are delinearized with the same
dimensionality as soon as a single one of them has been detected as non-affine.
While writing this patch, it became evident that the options
-polly-allow-nonaffine and -polly-detect-keep-going have not been properly
supported in case delinearization has been turned on. This patch adds relevant
test coverage and addresses these issues as well. We also added some more
documentation to the functions that are modified in this patch.
This fixes llvm.org/PR20123
Differential Revision: http://reviews.llvm.org/D5329
llvm-svn: 217728
At the moment we assume that only elements of identical size are stored/loaded
to a certain base pointer. This patch adds logic to the scop detection to verify
this.
Differential Revision: http://reviews.llvm.org/D5329
llvm-svn: 217727
We now verify that such functions are correctly detected even in combination
with delinearization. This change is added to ensure we have good test coverage
for the subsequent delinearization fix.
We also remove unnecessary instructions from the test case.
llvm-svn: 217664
This allows us to omit the GuardBB in front of created loops
if we can show the loop trip count is at least one. It also
simplifies the dominance relation inside the new created region.
A GuardBB (even with a constant branch condition) might trigger
false dominance errors during function verification.
Differential Revision: http://reviews.llvm.org/D5297
llvm-svn: 217525
Summary:
+ Refactor the runtime check (RTC) build function
+ Added helper function to create an PollyIRBuilder
+ Change the simplify region function to create not
only unique entry and exit edges but also enfore that
the entry edge is unconditional
+ Cleaned the IslCodeGeneration runOnScop function:
- less post-creation changes of the created IR
+ Adjusted and added test cases
Reviewers: grosser, sebpop, simbuerg, dpeixott
Subscribers: llvm-commits, #polly
Differential Revision: http://reviews.llvm.org/D5076
llvm-svn: 217508
This previous code added in r216842 most likely created unnecessary copies.
Reported-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
llvm-svn: 217507
It seems we added guards to check for non-existing std::map elements to make
sure they are default constructed before first accessed. Besides, the code
being wrong because of checking Context.NonAffineAccesses[BasePointer].size()
instead of Context.cound(BasePointer), such a check is also not necessary
as std::map takes care of this already.
From the std::map documentation:
"If k does not match the key of any element in the container, the function
inserts a new element with that key and returns a reference to its mapped value.
Notice that this always increases the container size by one, even if no mapped
value is assigned to the element (the element is constructed using its default
constructor)."
llvm-svn: 217506
The -e flag exits the script with a non-zero code if any subcommand
fails. This flag allows us to notice as early as possible if the
test was not properly regenerated using a command like:
$ create_ll.sh t.c && opt < t.ll -polly ...
The above pattern is useful when iteratively developing a test case
to guard against un-noticed syntax errors.
Differential Revision: http://reviews.llvm.org/D5276
llvm-svn: 217463
There was a bug in the IslAst which caused that no more outermost
parallel loops were detected/checked after a parallel outermost loop
of depth 1.
+ Test case attached
llvm-svn: 217452
Thanks to Clemens (hammacher@cs.uni-saarland.de) extension of the
arcanist unit test engine, we can run the llvm-lit tests automatically.
Similar to the linters, aracanist will run all tests for each uploaded
commit and asks the commiter for a statements in case unit tests fail.
To use this feature the unit test engine needs to find the polly build
directory. Currently we support the following setups:
$POLLY_BUILD_DIR (environment variable)
<root>/build
<root>.build
<root>-build
<root:s/src/build>
<cwd>
llvm-svn: 217396
This allows to link Polly's lit.site.cfg from the build into the src directory,
without having it removed by every 'git clean':
ln -s build/tools/polly/test/lit.site.cfg to src/tools/polly/test
Having this file in our src directory allows us to run llvm-lit on specific
test cases in the Polly test directory just by running 'llvm-lit test/case.ll'.
llvm-svn: 217336
In Polly we used to have a mix of test cases, some that used 'opt %s' and others
that used 'opt < %s'. We now change all to use 'opt < %s'. Piping in test files
is preferable as it does prevent temporary files to be written to disk. This
brings us in line with what is usus in LLVM.
llvm-svn: 216816
This replaces the use of %defaultOpts = '-basicaa -polly-prepare' with the
minimal set of passes necessary for a test to succeed. Of the test cases that
previously used %defaultOpts 76 test cases require none of these passes, 42
need -basicaa and only 2 need -polly-prepare. Our change makes this requirement
explicit.
In Polly many test cases have been using a macro '%defaultOpts' which run a
couple of preparing passes before the actual Polly test case. This macro was
introduced very early in the development of Polly and originally contained a
large set of canonicalization passes. However, as the need for additional
canonicalization passes makes test cases harder to understand and also more
fragile in terms of changes in such passes, we aim since a longer time to only
include the minimal set of passes necessary. This patch removes the last
leftovers from of %defaultOpts and brings our tests cases more in line to what
is usus in LLVM itself.
llvm-svn: 216815
Arcanist (arc) will now always run linters before uploading any new
commit to Phabricator. All errors/warnings (or their absence) will be
shown in the web interface together with a explanation by the commiter
(arcanist will ask the commiter if the build was not clean).
The linters include:
- clang-format
- spelling check
- permissions check (aka. chmod)
- filename check
- merge conflict marker check
Note, that their scope is sometimes limited (see .arclint for
details).
This commit also fixes all errors and warnings these linters reported,
namely:
- spelling mistakes and typos
- executable permissions for various text files
Differential Revision: http://reviews.llvm.org/D4916
llvm-svn: 215871
This will spill out information about LLVM-internals. However, in cases
where the name of the Value matches the name of the array in the source,
we provide more useful information. In cases where we spill internals,
the information still might help the user to pin down the correct
arrays.
The problem we face here is: The error is pinned to the debug location
of one of the offending values out of the alias set instead of all of them.
The more information we give the user about the set of aliasing
pointers the better.
llvm-svn: 215830
This reverts commit 215466 (and 215528, a trivial formatting fix).
The intention of these commits is a good one, but unfortunately they broke
our LNT buildbot:
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-codegen-isl
Several of the cleanup changes that have been combined in this 'fixup' are
trivial and could probably be committed as obvious changes without risking to
break the build. The remaining changes are little and it should be easy to
figure out what went wrong.
llvm-svn: 215817
This reverts commit 215684. The intention of the commit is great, but
unfortunately it seems to be the cause of 14 LNT test suite failures:
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly/builds/116
To make our buildbots and performance testers green until this issue is solved,
we temporarily revert this commit.
llvm-svn: 215816
The support is limited to signed modulo access and condition
expressions with a constant right hand side, e.g., A[i % 2] or
A[i % 9]. Test cases are modified according to this new feature and
new test cases are added.
Differential Revision: http://reviews.llvm.org/D4843
llvm-svn: 215684
Store the llvm::Value pointers of the AliasSet instead of the AliasSet
itself.
We have to be careful about changed IR when the message is generated,
because the Value pointers might not exist anymore. This would render
the Diagnostic invalid. For now we just assert there.
Simply do not retreive a diagnostic message after the IR has changed
it's not valid information anyway.
llvm-svn: 215625
Remove the PoCC and ScopLib support from Polly as we do not have a
user/maintainer for it.
Differential Revision: http://reviews.llvm.org/D4871
llvm-svn: 215563
Use the explicit analysis if possible, only for splitBlock we will continue
to use the Pass * argument. This change allows us to remove the getAnalysis
calls from the code generation.
llvm-svn: 215121
There is no needed for neither 1-dimensional nor higher dimensional arrays to
require positive offsets in the outermost array dimension.
We originally introduced this assumption with the support for delinearizing
multi-dimensional arrays.
llvm-svn: 214665
+ Remove the class IslGenerator which duplicates the functionality of
IslExprBuilder.
+ Use the IslExprBuilder to create code for memory access relations.
+ Also handle array types during access creation.
+ Enable scev codegen for one of the transformed memory access tests,
thus access creation without canonical induction variables available.
+ Update one test case to the new output.
llvm-svn: 214659
The updated tests use a different context than the old ones did.
Other than that only their path and the code generation we use
changed.
llvm-svn: 214657
+ Split all reduction dependences and map them to the causing memory accesses.
+ Print the types & base addresses of broken reductions for each "reduction
parallel" marked loop (OpenMP style).
+ 3 test cases to show how reductions are now represented in the isl ast.
The mapping "(ast) loops -> broken reductions" is also needed to find the
memory accesses we need to privatize in a loop.
llvm-svn: 214489
The functions isParallel, isInnermostParallel and IsOutermostParallel in
IslAstInfo will now return true even in the presence of broken reductions.
To compensate for this change the negated result of isReductionParallel can
be used.
llvm-svn: 214488
+ Perform the parallelism check on the innermost loop only once.
+ Inline the markOpenmpParallel function.
+ Rename all IslAstUserPayload * into Payload to make it consistent.
llvm-svn: 214448
Whe we build the IslAst we visit for nodes (in pre and post order) as well as
user/domain nodes. As these two sets are non overlapping we do not need to
check if we annotated a node earlier when we visit it.
llvm-svn: 214170
Use the fact that if we visit a for node first in pre and next in post order
we know we did not visit any children, thus we found an innermost loop.
+ Test case for an innermost loop with a conditional inside
llvm-svn: 213870
+ Renamed context into build when it's the isl_ast_build
+ Use the IslAstInfo functions to extract the schedule of a node
+ Use the IslAstInfo functions to extract the build/context of a node
+ Move the payload struct into the IslAstInfo class
+ Use a constructor and destructor (also new and delete) to
allocate/initialize the payload struct
llvm-svn: 213792
Offer the static functions to extract information out of an IslAst for node
as members of IslAstInfo not as top level entities.
+ Refactor common code
+ Add isParallel and isReductionParallel
+ Rename IslAstUser to IslAstUserPayload to make it clear this is just a (or
the) payload struct.
llvm-svn: 213272
This pulls in a couple of minor cleanups in isl. More importantly, in
preparation of the upcoming LLVM releases this change brings us back on a
released version of isl.
llvm-svn: 213062
+ Introduced dependency type TYPE_TC_RED to represent the transitive closure
(& the reverse) of reduction dependences. These are used when we check for
reduction parallel loops.
+ Test cases including loop reversals and modulo schedules which compute
reductions in a alternated order.
llvm-svn: 213019
We move back to a simple approach where the liveout is the last must-write
statement for a data-location plus all may-write statements. The previous
approach did not work out. We would have to consider per-data-access
dependences, instead of per-statement dependences to correct it. As this adds
complexity and it seems we would not gain anything over the simpler approach
that we implement in this commit, I moved us back to the old approach of
computing the liveout, but enhanced it to also add may-write accesses.
We also fix the test case and explain why we can not perform dead code
elimination in this case.
llvm-svn: 212925
As our delinearization works optimistically, we need in some cases run-time
checks that verify our optimistic assumptions. A simple example is the
following code:
void foo(long n, long m, long o, double A[n][m][o]) {
for (long i = 0; i < 100; i++)
for (long j = 0; j < 150; j++)
for (long k = 0; k < 200; k++)
A[i][j][k] = 1.0;
}
After clang linearized the access to A and we delinearized it again to
A[i][j][k] we need to ensure that we do not access the delinearized array
out of bounds (this information is not available in LLVM-IR). Hence, we
need to verify the following constraints at run-time:
CHECK: Assumed Context:
CHECK: [o, m] -> { : m >= 150 and o >= 200 }
llvm-svn: 212198
To translate the old induction variables as they exist before Polly to new
new induction variables introduced during AST code generation we need to
generate code that computes the new values from the old ones. We can do this
by just looking at the arguments isl generates in each scheduled statement.
Example:
// Old
for i
S(i)
// New
for c0
for c1
S(c0 + c1)
To get the value of i, we need to compute 'c0 + c1'. This expression is readily
available in the user statements generated by isl and just needs to be
translated to LLVM-IR.
This replaces an old confusing construct that constructed during ast generation
an isl multi affine expression that described this relation and which was then
again ast generated for each statement and argument when translating the isl ast
to LLVM-IR. This approach was difficult to understand and the additional ast
generation calls where entirely redundant as isl provides the relevant
expressions as arguments of the generated user statements.
llvm-svn: 212186
This change is particularly useful in the code generation as we need
to know which binary operator/identity element we need to combine/initialize
the privatization locations.
+ Print the reduction type for each memory access
+ Adjusted the test cases to comply with the new output format and
to test for the right reduction type
llvm-svn: 212126
Iterate over all store memory accesses and check for valid binary reduction
candidate loads by following the operands of the stored value. For each
candidate pair we check if they have the same base address and there are no
other accesses which may overlap with them. This ensures that no intermediate
value can escape into other memory locations or is overwritten at some point.
+ 17 test cases for reduction detection and reduction dependency modeling
llvm-svn: 211957
Enabling -keep-going in ScopDetection causes expansion to an invalid
Scop candidate.
Region A <- Valid candidate
|
Region B <- Invalid candidate
If -keep-going is enabled, ScopDetection would expand A to A+B because
the RejectLog is never checked for errors during expansion.
With this patch only A becomes a valid Scop.
llvm-svn: 211875
This change will ease the transision to multiple reductions per statement as
we can now distinguish the effects of multiple reductions in the same
statement.
+ Wrapped reduction dependences are used to compute privatization dependences
+ Modified test cases to account for the change
llvm-svn: 211795
This dependency analysis will keep track of memory accesses if they might be
part of a reduction. If not, the dependences are tracked on a statement level.
The main reason to do this is to reduce the compile time while beeing able to
distinguish the effects of reduction and non-reduction accesses.
+ Adjusted two test cases
llvm-svn: 211794
Use a container class to store the reject logs. Delegating most calls to
the internal std::map and add a few convenient shortcuts (e.g.,
hasErrors()).
llvm-svn: 211780
Add support for generating optimization remarks after completing the
detection of Scops.
The goal is to provide end-users with useful hints about opportunities that
help to increase the size of the detected Scops in their code.
By default the remark is unspecified and the debug location is empty. Future
patches have to expand on the messages generated.
This patch brings a simple test case for ReportFuncCall to demonstrate the
feature.
Reports all missed opportunities to increase the size/number of valid
Scops:
clang <...> -Rpass-missed="polly-detect" <...>
opt <...> -pass-remarks-missed="polly-detect" <...>
Reports beginning and end of all valid Scops:
clang <...> -Rpass="polly-detect" <...>
opt <...> -pass-remarks="polly-detect" <...>
Differential Revision: http://reviews.llvm.org/D4171
llvm-svn: 211769
Due to bad habit we sometimes used a variable %defaultOpts that listed
a set of passes commonly run to prepare for Polly. None of these test cases
actually needs special preparation and only two of them need the 'basicaa' to
be scheduled. Scheduling the required alias analysis explicitly makes the test
cases clearer.
llvm-svn: 211671
We had a set of test cases that have been incomplete and XFAILED. This patch
completes a couple of the interesting ones and removes the ones which seem
redundant or not sufficiently reduced to be useful.
llvm-svn: 211670
Insert a header into the new testcase containing a sample RUN line a FIXME and
an XFAIL. Then insert the formated C code and finally the LLVM-IR without
attributes, the module ID or the target triple.
llvm-svn: 211612
We use llvm.codegen intrinsic to generate code for embedded LLVM-IR
strings. The reason we introduce such a intrinsic is that previous
clang/opt tools was NOT linked with various LLVM targets and their
AsmParsers and AsmPrinters. Since clang/opt been linked with all the
needed libraries, we no longer need the llvm.codegen intrinsic.
llvm-svn: 211573
+ Collect reduction dependences
+ Introduced TYPE_RED in Dependences.h which can be used to obtain the
reduction dependences
+ Used TYPE_RED to prevent parallelization while we do not have a privatizing
code generation
+ Relax the dependences for non-parallel code generation
+ Add privatization dependences to ensure correctness
+ 12 Test cases to check for reduction and privatization dependences
llvm-svn: 211369
+ Flag to indicate reduction like statements
+ Command line option to (dis)allow multiplicative reduction opcodes
+ Two simple positive test cases, one fp test case w and w/o fast math
+ One "negative" test case (only reduction like but no reduction)
llvm-svn: 211114
+ Added const iterator version
+ Changed name to begin/end to allow range loops
+ Changed call sites to range loops
+ Changed typename to (const_)iterator
llvm-svn: 210927
In general this fixes ambiguity that can arise from using
a different namespace that declares the same symbols as
we do.
One example inside llvm would be:
createIndVarSimplifyPass(..);
Which can be found in:
llvm/Transforms/Scalar.h
and
polly/LinkAllPasses.h
llvm-svn: 210755
Fixes#19976.
The error log does not contain an error, in case we reject a candidate
without generating a diagnostic message by using invalid<>(...). This is
the case for the top-level region of a function.
The patch comes without a test-case because adding a useful one requires
additional code just for triggering it. Before the patch it would only trigger,
if we try to print the CFG with Scop error annotations.
llvm-svn: 210753
We do this currently only for test cases where we have integer offsets that
clearly access array dimensions out-of-bound.
-; for (long i = 0; i < n; i++)
-; for (long j = 0; j < m; j++)
-; for (long k = 0; k < o; k++)
+; for (long i = 0; i < n - 3; i++)
+; for (long j = 4; j < m; j++)
+; for (long k = 0; k < o - 7; k++)
; A[i+3][j-4][k+7] = 1.0;
This will be helpful if we later want to simplify the access functions under the
assumption that they do not access memory out of bounds.
llvm-svn: 210179
Without this patch, the testcase would fail on the delinearization of the second
array:
; void foo(long n, long m, long o, double A[n][m][o]) {
; for (long i = 0; i < n; i++)
; for (long j = 0; j < m; j++)
; for (long k = 0; k < o; k++) {
; A[i+3][j-4][k+7] = 1.0;
; A[i][0][k] = 2.0;
; }
; }
; CHECK: [n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[3 + i0, -4 + i1, 7 + i2] };
; CHECK: [n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[i0, 0, i2] };
Here is the output of FileCheck on the testcase without this patch:
; CHECK: [n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[i0, 0, i2] };
^
<stdin>:26:2: note: possible intended match here
[n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[o0] };
^
It is possible to find a good delinearization for A[i][0][k] only in the context
of the delinearization of both array accesses.
There are two ways to delinearize together all array subscripts touching the
same base address: either duplicate the code from scop detection to first gather
all array references and then run the delinearization; or as implemented in this
patch, use the same delinearization info that we computed during scop detection.
llvm-svn: 210117
+ CL-option --polly-tile-sizes=<int,...,int>
The i'th value is used as a tile size for dimension i, if
there is no i'th value, the value of --polly-default-tile-size is
used
+ CL-option --polly-default-tile-size=int
Used if no tile size is given for a dimension i
+ 3 Simple testcases
llvm-svn: 209753
Instead of relying on the delinearization to infer the size of an element,
compute the element size from the base address type. This is a much more precise
way of computing the element size than before, as we would have mixed together
the size of an element with the strides of the innermost dimension.
llvm-svn: 209695
Support a 'keep-going' mode for the ScopDetection. In this mode, we just keep
on detecting, even if we encounter an error.
This is useful for diagnosing SCoP candidates. Sometimes you want all the
errors. Invalid SCoPs will still be refused in the end, we just refuse to
abort on the first error.
llvm-svn: 209574
This stores all RejectReasons created for one region
in a RejectLog inside the DetectionContext. For now
this only keeps track of the last error.
A separate patch will enable the tracking of all errors.
This patch itself does no harm (yet).
llvm-svn: 209572
SVN r209103 removed the OwningPtr variant of the MemoryBuffer APIs. Switch to
the equivalent std::unique_ptr versions. This should clear up the build bots.
llvm-svn: 209104
Tag the GPGPU codegen test cases as unsupported if the nvptx target is not
included in the current llvm build.
Contributed-by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 208779
definition below all of the header #include lines, Polly edition.
If you want to know more details about this, you can see the recent
commits to Debug.h in LLVM. This is just the Polly segment of a cleanup
I'm doing globally for this macro.
llvm-svn: 206852
Commit r206510 falsely advertised to fix the load cases, even though it only
fixed the store case. This commit adds the same fix for the load case including
the missing test coverage.
llvm-svn: 206577
Even tough we may want to generate a vector load, the address from which to load
still is a scalar. Make sure even if previous address computations may have been
vectorized, that the addresses are also available as scalars.
This fixes http://llvm.org/PR19469
Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 206510
The following example shows a non-parallel loop
void f(int a[]) {
int i;
for (i = 0; i < 10; ++i)
A[i] = A[i+5];
}
which, in case we import a schedule that limits the iteration domain
to 0 <= i < 5, becomes parallel. Previously we crashed in such cases, now we
just recognize it as parallel.
This fixes http://llvm.org/PR19435
Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 206318
We update to a newer version of isl, which includes changes to the compute
out facility that make it a lot more predicable. With our new value, we can
reliably bail out for all reported bugs, while still being able to compute
dependences for all but two test cases in the LLVM test suite. For the remaining
two test cases, the dependence problem we construct is unnecessarily complex,
so there is hope we can improve on this. However, to avoid any future issues,
having a reliable compute out facility in place is important.
llvm-svn: 206106
We only supported a very old version of OpenScop that was entirely different
to what OpenScop is today. To not confuse people, we remove this old and
unusable support. If anyone is interested to add OpenScop support back in,
the relevant patches are available in version control.
llvm-svn: 206026
Reversed the order in which LD_LIBRARY_PATH is defined in order to make sure the
${CLOOG_INSTALL} prefix is found first.
Contributed-by: Christian Bielert <cib123@googlemail.com>
llvm-svn: 205556
This replaces the ancient INVALID/INVALID_NOVERIFY macros with a real
function.
The new invalid(..) function uses small diagnostic objects that are
generated on demand. We can store arbitrary additional information per
error type and generate useful debug/error messages on the fly.
Use it as follows:
if (/* Some error condition (ReportFoo) */)
invalid<ReportFoo>(Context, /*Assert=*/true/false,
(/* List of helpful diagnostic objects */));
Where ReportFoo is a subclass of RejectReason that is able to take the
list of helpful diagnostic objects in its constructor.
The implementation of invalid will create the report and fire
an assertion, if necessary.
llvm-svn: 205414
During code preperation trivial PHI nodes (mainly introduced by lcssa) are
deleted to decrease the number of introduced allocas (==> dependences). However
simply replacing them by their only incoming value would cause the independent
block pass to introduce new allocas. To prevent this we try to share stack slots
during code preperarion, hence to reuse a already created alloca 'to demote' the
trivial PHI node. This works if we know that the value stored in this alloca
will be the incoming value of the trivial PHI at the end of the predecessor
block of this trivial PHI.
Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 205320
We explicitly specifying all filenames instead of assuming some naming
convention used by clang and opt.
Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 204726
For complex examples it may happen that we do not compute dependences. In this
case we do not want to crash, but just not detect parallel loops.
llvm-svn: 204470
It does not seem to add a lot of value, as it leaves unclear which parts are
mature and whichs not. Adding this informatin also does not make sense, as it
changes rapidly.
llvm-svn: 204447
This patch enables vectorization of loops containing backward array
traversal (array stride is -1).
Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>
llvm-svn: 204257
This ensures that the polly passes get properly registered both, when using
polly as a loadable module and when directly linking it into clang/opt/bugpoint.
llvm-svn: 204255
llvm.org/PR19081 reports that the polly dependence analysis causes some h264
compilation to hang. We adjust the compute out, to ensure we do not block on
expensive dependence calculations.
llvm-svn: 204168
This reverts the broken modularized build.
This builds the classic loadable module. The separation
of library and plugin is yet to be done in a future patch.
llvm-svn: 203952
to avoid build errors like this:
In file included from ../include/llvm/IR/IntrinsicInst.h:30:0,
from ../tools/polly/lib/CodeGen/BlockGenerators.cpp:28:
../include/llvm/IR/Intrinsics.h:41:34: fatal error: llvm/IR/Intrinsics.gen: No such file or directory
The earlier change in commit b1d2e6d5c3ce151ef6b7213c00019715e63d7cfb has been
accidentally reverted in:
commit 6b1963814877f5b1b161d3a73a3c8246c4ad4787
Author: simbuerg <simbuerg@91177308-0d34-0410-b5e6-96231b3b80d8>
Date: Tue Mar 11 21:26:06 2014 +0000
Refactor Polly's Pass creation and initialization.
Rename some files and adjust cmake accordingly
llvm-svn: 203843
In case we are at the innermost band, we try to prepare for vectorization. This
means, we look for the innermost parallel loop and strip mine this loop to the
innermost level using a strip-mine factor corresponding to the number of vector
iterations.
For whatever reason, the code that implemented this feature was broken. We now
added a comment, a test case and obviously also the right code.
llvm-svn: 203544
1) The isl_int -> isl_val changes are the ones Tobias suggested.
One additional isl_val_free is added (and needed)
2) Three scoplib_vector_free are added, maybe we would need even
more (and matrix_free) but it's hard to place them right.
3) Cleaned the includes (and removed 'extern C')
This fixes the broken compilation for the scoplib import and export.
Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 203500
to avoid build errors like this:
In file included from ../include/llvm/IR/IntrinsicInst.h:30:0,
from ../tools/polly/lib/CodeGen/BlockGenerators.cpp:28:
../include/llvm/IR/Intrinsics.h:41:34: fatal error: llvm/IR/Intrinsics.gen: No such file or directory
llvm-svn: 203495
This is necessary to avoid test failures in the CLooG test suite due to the
recent isl update.
We also need to update two polly test cases which rely on a certain order in the
textual description that isl chooses for its sets and maps. Changes here are not
often, but we should probably switch to a check that verifies such maps are
semantically equivalent instead of represented identically.
llvm-svn: 203476
Older isl versions did not properly guard all function declarations for the
use in C++. Specifically, the isl/val_gmp.h header was not properly guarded.
This patch ensures we have the proper guards in place and do not accidentally
link to name-mangled C++ versions of those functions that are not available in
libisl.so.
llvm-svn: 203462
Value::user_iterator changes in LLVM r203364. Converts several of these
loops to nice range based loops in the process.
Built and tested cleanly for me, yay for being able to fully build and
test Polly changes!
llvm-svn: 203381
We currently need to always name instructions, as the polly test suite currently
matches for certain names. We should improve the test suite at some point.
This fixes 'make check-polly' in NDEBUG builds.
llvm-svn: 202855
For now we only mark innermost loops for the loop vectorizer. We could later
also mark not-innermost loops to enable the introduction of openmp parallelism.
llvm-svn: 202854
PollyIRBuilder is currently just a typedef to IRBuilder<>. Consequently, this
change should not affect behavior. In subsequent patches we will extend its
functionality to emit loop.parallel metadata.
llvm-svn: 202853
This fixes the buildbots who failed, because the linker eliminated most of the
Polly functionality when building without BUILD_SHARED_LIBS=ON.
Besides fixing the build, this change also brings additional functionality. With
the new separation between the general polly libraries and the functionality for
the polly module, it is now possible to link polly directly into a tool instead
of using requiring users to load a shared library.
llvm-svn: 202762
The module LLVMPolly.so links to that. There is really no reason to build a
large number of mini-libraries here, especially as we do have dependences
between the libraries that are not properly handled and that make linking fail
on darwin.
Submitted-by: David Fang <fang@csl.cornell.edu>
llvm-svn: 202743
We mostly iterate over read-only values. Following a suggestion by Duncan P.N
Exons Smith, we use the construct 'const auto &' for this.
llvm-svn: 202651
clang-formats behaviour has changed for a couple of C++11 formattings. We adapt
Polly to ensure our formatting checks are clean again.
llvm-svn: 202650
clang-format requires a space before the ":" in the foreach loop. Even though
this is surprising to me, we follow this style to make our formatting
consistent with clang-format. I found that this clang-format style is used in a
couple of C++11 examples, hence I believe the fact that clang-format adds a
colon is not a bug but just something I was not used to yet.
llvm-svn: 202648
In 'obsequi' we have a scop in which the current dead code elimination works,
but the generated code is way too complex. To avoid this trouble (and to not
disable the DCE entirely) we add an additional approximative step before
the actual dead code elimination. This should fix one of the two current
nightly-test issues.
Polly could be improved to handle 'obsequi' by teaching it to introduce only a
single parameter for (%1 and zext %1) which halves the number of parameters and
allows polly to derive a simpler representation for the set of live iterations.
However, this needs some time to investigate.
I will commit a test case as soon as we have a reduced one.
llvm-svn: 202010
Sven suggested to use this simpler code generation strategy as the convex_hull
computation is apparently rather inefficient. We do not have a test case that
shows a difference, but, in case we find a test case where this makes a
difference, we can reconsider our decission.
llvm-svn: 201997
In case we do not have valid dependences, we do not run dead code elimination or
the schedule optimizer. This fixes an infinite loop in the dead code
elimination (PR12110).
llvm-svn: 201982
Instead of giving a choice between a precise (but possibly very complex)
analysis and an approximative analysis we now use a hybrid approach which uses N
precise steps followed by one approximating step. The precision of the analysis
can be changed by increasing N. With a default of 'N' = 2, we get fully precise
results for our current test cases and should not run into performance problems
for more complex test cases. We can adjust this value when we got more
experience with this dead code elimination.
llvm-svn: 201888
In case the domain of a statement is empty, the schedule optimizer set by
accident the schedule to a NULL pointer. This is incorrect. Instead, we set
it to an empty isl_map with zero schedule dimensions. We already checked for
this in our test cases, but unfortunately the test cases did not fail as
expected. The assert we add in this commit now ensures that the test cases
fail properly in case we regress on this again.
llvm-svn: 201886
We now skip the debug intrinsics which is a lot better than crashing due to
uncopied metadata references. We should step by step investigate which debug
intrinsics we can copy without trouble.
We still keep the debug location metadata.
llvm-svn: 201860
This pass eliminates loop iterations that compute results that are not used
later on. This can help e.g. in D, where the default zero-initialization is
often unnecessary if right after new values are assigned to an array.
Contributed-by: Peter Conn <conn.peter@gmail.com>
llvm-svn: 201817
We do not have a use for this information at the moment. If we need this at some
point, the "instruction -> access" mapping needs to be enhanced as a single
instruction could then possibly perform multiple accesses.
This patch allows us to build the polyhedral information for scops with scalar
dependences.
llvm-svn: 201815
In rare cases the modification of one scop can effect the validity of other
scops, as code generation of an earlier scop may make the scalar evolution
functions derived for later scops less precise. The example that triggered this
patch was a scop that contained an 'or' expression as follows:
%add13710 = or i32 %j.19, 1
--> {(1 + (4 * %l)),+,2}<nsw><%for.body81>
Scev could only analyze the 'or' as it knew %j.19 is a multiple of 2. This
information was not available after the first scop was code generated (or
independent-blocks was run on it) and SCEV could not derive a precise SCEV
expression any more. This means we could not any more code generate this SCoP.
My current understanding is that there is always the risk that an earlier code
generation change invalidates later scops. As the example we have seen here is
difficult to avoid, we use this occasion to guard us against all such
invalidations.
This patch "solves" this issue by verifying right before we start working on
a detected scop, if this scop is in fact still valid. This adds a certain
overhead. However the verification we run is anyways very fast and secondly
it is only run on detected scops. So the overhead should not be very large. As
a later optimization we could detect scops only on demand, such that we need
to run scop-detections always only a single time.
This should fix the single last failure in the LLVM test-suite for the new
scev-based code generation.
llvm-svn: 201593
The MayAliasSet class is currently not used and just confuses people. We can
reintroduce it in case need a more precise tracking of alias sets.
llvm-svn: 201191
There does not seem to be a reason that we can not support PHI nodes outside of
the scop that reference values within the SCoP. Or at least, the attached test
case seems to do the right thing. We remove the assert for now.
llvm-svn: 200427
In rare cases, a region R which is itself not valid has an indirect child region
that is valid. When R becomes part of a valid region by expansion of another
region, then all children of R have to be erased from the set of valid regions.
This patch ensures that indirect children are erased in addition to direct
children.
Contributed-by: Armin Groesslinger <armin.groesslinger@uni-passau.de>
Tobias: I added a reduced test case and adjusted the logic of the patch to
only recurse until the first child is found.
llvm-svn: 200411
Verification of base addresses is difficult as the independent blocks pass may
introduce aliasing that was not there during scop detection. As a midterm
solution -polly-codegen-scev will remove the need for the independent blocks
pass. For now, we do not verify at compile time that the independent blocks pass
does not make the base addresses loop invariant. Disabling this just removes
one of the multiple safety layers we have. We still can check for correctness
in our regression tests.
llvm-svn: 200315
Array base addresses need to be invariant in the region considered. The base
address has to be computed outside the region, or, when it is computed inside,
the value must not change with the iterations of the loops. For example, when a
two-dimensional array is represented as a pointer to pointers the base address
A[i] in an access A[i][j] changes with i; therefore, such regions have to be
rejected.
Contributed by: Armin Größlinger <armin.groesslinger@uni-passau.de>
llvm-svn: 200314
Restricting Polly to -O3 does not make a lot of sense as it is opt-in anyway
and users who specifically request it should get it. If this causes performance
problems we should rather address them by scheduling the right cleanup passes
then just prevent the user from trying.
Also restricting Polly to -O3 made bugpoint not work with the -O3 flag and polly
enabled.
llvm-svn: 200208
This is not only not necessary, but in case -03 changes this can actually
cause arbitrarily failing test cases such as, e.g., a recent change by Chandler
that caused -O3 to unroll the loop body, which made the loop we wanted to
detect disappear and consequently this test case fail.
llvm-svn: 200204
Count the number of computational steps that have been used to solve the
dependence problem and abort in case we reach the "compute-out". This ensures we
do not hang forever in cases the dependence problem is too difficult to solve.
There is just a single case in the LLVM test-suite that runs into the
compute-out. Even in this case, we can probably coalesce some of the parameters
(i32 b, i32 b zext i64, ...) to simplify the problem enough to not hit the
compute out. However, for now we set the compute out in place to address the
general issue. The compute out was choosen such that it stops on a recent laptop
after about 8 seconds.
llvm-svn: 200156
This includes the following very useful isl commit:
commit d962967ab42323ea5ca0398956fbff6a98c782fa
Author: Sven Verdoolaege <skimo@kotnet.org>
Date: Wed Dec 18 12:05:32 2013 +0100
allow the user to impose a bound on the number of low-level operations
This should allow the user to deterministically limit the effort spent on a
computation.
llvm-svn: 200155
This removes the last isl_int dependency in the default build. There are
still some in OpenScop and Scoplib. For those isl-0.12.2 still needs to be used.
llvm-svn: 199585
This brings in isl_val support from cloog and, most importantly the following
isl commit:
commit d962967ab42323ea5ca0398956fbff6a98c782fa
Author: Sven Verdoolaege <skimo@kotnet.org>
Date: Wed Dec 18 12:05:32 2013 +0100
allow the user to impose a bound on the number of low-level operations
This should allow the user to deterministically limit the effort spent on a
computation.
llvm-svn: 199582