IntToPtr and PtrToInt instructions are basically no-ops that we can handle as
such. In order to generate them properly as parameters we had to improve the
ScopExpander, though the change is the first in the direction of a more
aggressive scalar synthetization.
llvm-svn: 271888
We now use the minimal necessary bit width for the generated code. If
operations might overflow (add/sub/mul) we will try to adjust the types in
order to ensure a non-wrapping computation. If the type adjustment is not
possible, thus the necessary type is bigger than the type value of
--polly-max-expr-bit-width, we will use assumptions to verify the computation
will not wrap. However, for run-time checks we cannot build assumptions but
instead utilize overflow tracking intrinsics.
llvm-svn: 271878
In case of modulo compared to zero, we need to do signed modulo
operation as unsigned can give different results based on whether the
dividend is negative or not.
This addresses llvm.org/PR27707
Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>
Reviewers: _jdoerfert, grosser, Meinersbur
Differential Revision: http://reviews.llvm.org/D20145
llvm-svn: 271707
Operands of binary operations that might overflow will be temporarily
promoted to i64 again, though that is not a sound solution for the problem.
llvm-svn: 271538
Summary:
After rL271151 (SCEV change) SCEV no longer unconditionally transfers
nuw/nsw from the increment operation to the post-inc value; this
transfer only happens if there is undefined behavior in the program if
the increment overflowed (as opposed to just generating poison).
The loops in `wraping_signed_expr_1.ll` are in non-canonical
form (they're not rotated), and that defeats LLVM's poison-is-UB
analysis. IMO the easiest fix here is to run `wraping_signed_expr_1.ll`
through `-loop-rotate` to canonicalize the loops, which is what this
patch does.
Reviewers: jdoerfert, Meinersbur, grosser
Subscribers: grosser, mcrosier, pollydev
Differential Revision: http://reviews.llvm.org/D20778
llvm-svn: 271536
We now have a simple function to adjust/unify the types of two (or three)
operands before an operation that requieres the same type for all operands.
Due to this change we will not promote parameters that are added to i64
anymore if that is not needed.
llvm-svn: 271513
multiplication
Fix small issues related to characters, operators and descriptions of tests.
Differential Revision: http://reviews.llvm.org/D20806
llvm-svn: 271264
Created a new pass ScopInfoRegionPass. As name suggests, it is a
region pass and it is there to preserve compatibility with our
existing Polly passes. ScopInfoRegionPass will return a SCoP object
for a valid region while the creation of the SCoP stays in the
ScopInfo class.
Contributed-by: Utpal Bora <cs14mtech11017@iith.ac.in>
Reviewed-by: Tobias Grosser <tobias@grosser.es>,
Johannes Doerfert <doerfert@cs.uni-saarland.de>
Differential Revision: http://reviews.llvm.org/D20770
llvm-svn: 271259
This header is required to make the ISO 646 alternative operator
spellings ("and", "or" instead of "&&", "||") work. Should these
operators be replaced by the standard ones as already suggested by
Johannes, also remove this #include again.
llvm-svn: 271206
Summary:
API-wise `apply` is a somewhat unidiomatic one-off function, and
removing the only(?) use in polly will let me remove it from SCEV's
exposed interface.
Reviewers: jdoerfert, Meinersbur, grosser
Subscribers: grosser, mcrosier, pollydev
Differential Revision: http://reviews.llvm.org/D20779
llvm-svn: 271177
Add determination of statements that contain, in particular,
matrix multiplications and can be optimized with [1] to try to
get close-to-peak performance. It can be enabled
via polly-pm-based-opts, which is false by default.
Refs:
[1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf
Contributed-by: Roman Gareev <gareevroman@gmail.com>
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: http://reviews.llvm.org/D20575
llvm-svn: 271128
Before this patch we bailed if a required invariant load was potentially
overwritten. However, now we will optimistically assume it is actually
invariant and, to this end, restrict the valid parameter space as well as the
execution context with regards to potential overwrites of the location.
llvm-svn: 270416
Since the base pointer of a possibly aliasing pointer might not alias
with any other pointer it (the base pointer) might not be tagged as
"required invariant". However, we need it do be in order to compare
the accessed addresses of the derived (possibly aliasing) pointer.
This patch also tries to clean up the load hoisting a little bit.
llvm-svn: 270412
So far we bailed if a required invariant load was potentially overwritten in
the SCoP. From now on we will optimistically assume it is actually invariant
and, to this end, restrict the valid parameter space.
llvm-svn: 270060
The SCoP now holds a reference to the ScopDetection::DetectionContext
which allows to simplify the type of various methods and remove code.
llvm-svn: 270053
Before this patch we only expanded valid __and__ profitable region. Therefor
we did not allow the expansion to create a profitable region from a
non-profitable one. With this patch we will remember and expand all valid
regions and check for profitability only at the end.
This patch increases the number of valid SCoPs in the LLVM-TS and SPEC
2000/2006 by 28% (from 303 to 390), including the hot loop in hmmer.
llvm-svn: 269343
This patch cleans up the rejection log handling during the
ScopDetection. It consists of two interconnected parts:
- We keep all detection contexts for a function in order to provide
more information to the user, e.g., about the rejection of
extended/intermediate regions.
- We remove the mutable "RejectLogs" member as the information is
available through the detection contexts.
llvm-svn: 269323
Truncate operations are basically modulo operations, thus we can model
them that way. However, for large types we assume the operand to fit
in the new type size instead of introducing a modulo with a very large
constant.
llvm-svn: 269300
We utilize assumptions on the input to model IR in polyhedral world.
To verify these assumptions we version the code and guard it with a
runtime-check (RTC). However, since the RTCs are themselves generated
from the polyhedral representation we generate them under the same
assumptions that they should verify. In other words, the guarantees
that we try to provide with the RTCs do not hold for the RTCs
themselves. To this end it is necessary to employ a different check
for the RTCs that will verify the assumptions did hold for them too.
Differential Revision: http://reviews.llvm.org/D20165
llvm-svn: 269299
If a profitable run is performed we will check if the SCoP seems to be
profitable after creation but before e.g., dependence are computed. This is
needed as SCoP detection only approximates the actual SCoP representation.
In the end this should allow us to be less conservative during the SCoP
detection while keeping the compile time in check.
llvm-svn: 269074
Regions with one affine loop can be profitable if the loop is
distributable. To this end we will allow them to be treated as
profitable if they contain at least two non-trivial basic blocks.
llvm-svn: 269064
The assumption attached to an llvm.assume in the SCoP needs to be
combined with the domain of the surrounding statement but can
nevertheless be used to refine the context.
This fixes the problems mentioned in PR27067.
llvm-svn: 269060
This patches makes the propagation of complexity problems during
domain generation consistent. Additionally, it makes it less likely to
encounter ill-formed domains later, e.g., during schedule generation.
llvm-svn: 269055
Before this patch we generated error-restrictions only for
error-blocks, thus blocks (or regions) containing a not represented
function call. However, the same reasoning is needed if the invalid
domain of a statement subsumes its actual domain. To this end we move
the generation of error-restrictions after the propagation of the
invalid domains. Consequently, error-statements are now defined more
general as statements that are assumed to be not executed.
Additionally, we do not record an empty domain for such statements but
a nullptr instead. This allows to distinguish between error-statements
and dead-statements.
llvm-svn: 269053
We now use context information to simplify the domains and access
functions of the SCoP instead of just aligning them with the parameter
space.
llvm-svn: 269048
Previously we checked the number of pieces to decide whether or not a
invariant load was to complex to be generated. However, there are
cases when e.g., divisions cause the complexity to spike regardless of
the number of pieces. To this end we now check the number of totally
involved dimensions which will increase with the number of pieces but
also the number of divisions.
llvm-svn: 269045
This exposes the functionality to interpret a SCEV, or better the
piece-wise function created from the SCEV, as an unsigned value
instead of a signed one.
llvm-svn: 269044
Min/max expressions are easier to read and can in some cases also result in
more concise IR that is generated as the min/max --- when lowered to a
cmp+select pattern -- commonly has a simpler condition then the ternary
condition isl would normally generate.
llvm-svn: 268855
This release includes sevaral improvments compared to the previous
version isl-0.16.1-145-g243bf7c (from the ISL 0.17 announcement):
- optionally combine SCCs incrementally in scheduler
- optionally maximize coincidence in scheduler
- optionally avoid loop coalescing in scheduler
- minor AST generator improvements
- improve support for expansions in schedule trees
llvm-svn: 268500
The check for complexity compares the number of polyhedra in a set,
which are combined by disjunctions (union, "OR"),
not conjunctions (intersection, "AND").
llvm-svn: 268223
Add a command line switch to set the
isl_options_set_schedule_outer_coincidence option. ISL then tries to
build schedules where the outer member of a band satisfies the
coincidence constraints.
In practice this allows loop skewing for more parallelism in inner
loops.
llvm-svn: 268222
After zero-extend operations and unsigned comparisons we now allow
unsigned divisions. The handling is basically the same as for signed
division, except the interpretation of the operands. As the divisor
has to be constant in both cases we can simply interpret it as an
unsigned value without additional complexity in the representation.
For the dividend we could choose from the different representation
schemes introduced for zero-extend operations but for now we will
simply use an assumption.
llvm-svn: 268032
For debugging it is often convenient to not abort at the very first memory
management error. This option allows to control this behavior at run-time.
llvm-svn: 268030
It does not suffice to take a global assumptions for unsigned comparisons but
we also need to adjust the invalid domain of the statements guarded by such
an assumption. To this end we allow to specialize the getPwAff call now in
order to indicate unsigned interpretation.
llvm-svn: 268025
When we materialize parameter SCEVs we did so without considering the
side effects they might have, e.g., both division and modulo are
undefined if the right hand side is zero. This is a problem because we
potentially extended the domain under which we evaluate parameters,
thus we might have introduced such undefined behaviour. To prevent
that from happening we will now guard divisions and modulo operations
in the parameters with a compare and select.
llvm-svn: 268023
Assumptions and restrictions can both be simplified with the domain of a
statement but not the same way. After this patch we will correctly
distinguish them.
llvm-svn: 267885
Instead of matching for %6, we use a regexp to match for the result strings.
This test case caused unrelated noise in http://reviews.llvm.org/D15722.
llvm-svn: 267875
If the base pointer of an invariant load is is loaded conditionally, that
condition needs to hold for the invariant load too. The structure of the
program will imply this for domain constraints but not for imprecisions in
the modeling. To this end we will propagate the execution context of base
pointers during code generation and thus ensure the derived pointer does
not access an invalid base pointer.
llvm-svn: 267707
With this patch we will optimistically assume that the result of an unsigned
comparison is the same as the result of the same comparison interpreted as
signed.
llvm-svn: 267559
Additive expressions can have constant factors too that we can extract
and thereby simplify the internal representation. For now we do
compute the gcd of all constant factors but only extract the same
(possibly negated) factor if there is one.
llvm-svn: 267445
Before, we checked all GEPs in a statement in order to derive
out-of-bound assumptions. However, this can not only introduce new
parameters but it is also not clear what we can learn from GEPs that
are not immediately used in a memory accesses inside the SCoP. As this
case is very rare, no actual change in the behaviour is expected.
llvm-svn: 267442
Before, assumptions derived from llvm.assume could reference new
parameters that were not known to the SCoP before. These were neither
beneficial to the representation nor to the user that reads the
emitted remark. Now we project them out and keep only user assumptions
on known parameters. Nevertheless, the new parameters are still part
of the SCoPs parameter space as the SCEVAffinator currently adds them
on demand.
llvm-svn: 267441
The new handling is consistent with the remaining code, e.g., we do
not create a new parameter id for each lookup call but copy an
existing one. Additionally, we now use the implicit order defined by
the Parameters set instead of an explicit one defined in a map.
llvm-svn: 267423
A zero-extended value can be interpreted as a piecewise defined signed
value. If the value was non-negative it stays the same, otherwise it
is the sum of the original value and 2^n where n is the bit-width of
the original (or operand) type. Examples:
zext i8 127 to i32 -> { [127] }
zext i8 -1 to i32 -> { [256 + (-1)] } = { [255] }
zext i8 %v to i32 -> [v] -> { [v] | v >= 0; [256 + v] | v < 0 }
However, LLVM/Scalar Evolution uses zero-extend (potentially lead by a
truncate) to represent some forms of modulo computation. The left-hand side
of the condition in the code below would result in the SCEV
"zext i1 <false, +, true>for.body" which is just another description
of the C expression "i & 1 != 0" or, equivalently, "i % 2 != 0".
for (i = 0; i < N; i++)
if (i & 1 != 0 /* == i % 2 */)
/* do something */
If we do not make the modulo explicit but only use the mechanism described
above we will get the very restrictive assumption "N < 3", because for all
values of N >= 3 the SCEVAddRecExpr operand of the zero-extend would wrap.
Alternatively, we can make the modulo in the operand explicit in the
resulting piecewise function and thereby avoid the assumption on N. For the
example this would result in the following piecewise affine function:
{ [i0] -> [(1)] : 2*floor((-1 + i0)/2) = -1 + i0;
[i0] -> [(0)] : 2*floor((i0)/2) = i0 }
To this end we can first determine if the (immediate) operand of the
zero-extend can wrap and, in case it might, we will use explicit modulo
semantic to compute the result instead of emitting non-wrapping assumptions.
Note that operands with large bit-widths are less likely to be negative
because it would result in a very large access offset or loop bound after the
zero-extend. To this end one can optimistically assume the operand to be
positive and avoid the piecewise definition if the bit-width is bigger than
some threshold (here MaxZextSmallBitWidth).
We choose to go with a hybrid solution of all modeling techniques described
above. For small bit-widths (up to MaxZextSmallBitWidth) we will model the
wrapping explicitly and use a piecewise defined function. However, if the
bit-width is bigger than MaxZextSmallBitWidth we will employ overflow
assumptions and assume the "former negative" piece will not exist.
llvm-svn: 267408
Memory accesses can have non-precisely modeled access functions that
would cause us to build incorrect execution context for hoisted loads.
This is the same issue that occurred during the domain construction for
statements and it is dealt with the same way.
llvm-svn: 267289
The SCEVAffinator will now produce not only the isl representaiton of
a SCEV but also the domain under which it is invalid. This is used to
record possible overflows that can happen in the statement domains in
the statements invalid domain. The result is that invalid loads have
an accurate execution contexts with regards to the validity of their
statements domain. While the SCEVAffinator currently is only taking
"no-wrapping" assumptions, we can add more withouth worrying about the
execution context of loads that are optimistically hoisted.
llvm-svn: 267288
The invalid context is not enough to describe the parameter constraints under
which a statement is not modeled precisely. The reason is that during the
domain construction the bounds on the induction variables are not known but
needed to check if e.g., an overflow can actually happen. To this end we
replace the invalid context of a statement with an invalid domain. It is
initialized during domain construction and intersected with the domain once
it was completely build. Later this invalid domain allows to eliminate
falsely assumed wrapping cases and other falsely assumed mismatches in the
modeling.
llvm-svn: 267286
As discussed in the Polly weekly phone call and reviews.llvm.org/D18878,
the assumed contexts changed (widen) due to D18878/r265942. Also check
these contexts in the tests affected by that change.
llvm-svn: 266323
We used checks to minimize the number of remarks we present to a user
but these checks can become expensive, especially since all wrapping
assumptions are emitted separately. Because there is not benefit for a
"headless" run we put these checks under a command line flag. Thus, if
the flag is not given we will emit "non-effective" remarks, e.g.,
duplicates and revert to the old behaviour if it is given. As this
also changes the internal representation of some sets we set the flag
by default for our unit tests.
llvm-svn: 266087
Utilizing the record option for assumptions we can simplify the wrapping
assumption generation a lot. Additionally, we can now report locations
together with wrapping assumptions, though they might not be accurate yet.
llvm-svn: 266069
There are three reasons why we want to record assumptions first before we
add them to the assumed/invalid context:
1) If the SCoP is not profitable or otherwise invalid without the
assumed/invalid context we do not have to compute it.
2) Information about the context are gathered rather late in the SCoP
construction (basically after we know all parameters), thus the user
might see overly complicated assumptions to be taken while they would
have been simplified later on.
3) Currently we cannot take assumptions at any point but have to wait,
e.g., for the domain generation to finish. This makes wrapping
assumptions much more complicated as they need to be and it will
have a similar effect on "signed-unsigned" assumptions later.
llvm-svn: 266068
Collect the error domain contexts (formerly in the ErrorDomainCtxMap)
for each statement in the new InvalidContext member variable. While
this commit is basically a [NFC] it is a first step to make hoisting
sound by allowing a more fine grained record of invalid contexts,
e.g., here on statement level.
llvm-svn: 266053
Allow overflow of indices into the next higher dimension if it has
constant size. E.g.
float A[32][2];
((float*)A)[5];
is effectively the same as
A[2][1];
This can happen since r265379 as a side effect if ScopDetection
recognizes an access as affine, but ScopInfo rejects the GetElementPtr.
Differential Revision: http://reviews.llvm.org/D18878
llvm-svn: 265942
MSVC warns with:
warning C4239: nonstandard extension used: 'initializing': conversion from 'llvm::DebugLoc' to 'llvm::DebugLoc &'
note: A non-const reference may only be bound to an lvalue
Change the reference to a const reference.
llvm-svn: 265937
In r247147 we disabled pointer expressions because the IslExprBuilder did not
fully support them. This patch reintroduces them by simply treating them as
integers. The only special handling for pointers that is left detects the
comparison of two address_of operands and uses an unsigned compare.
llvm-svn: 265894
This reverts commit 2879c53e80e05497f408f21ce470d122e9f90f94.
Additionally, it adds SDiv and SRem instructions to the set of values
discovered by the findValues function even if we add the operands to
be able to recompute the SCEVs. In subfunctions we do not want to
recompute SDiv and SRem instructions but pass them instead as they
might have been created through the IslExprBuilder and are more
complicated than simple SDiv/SRem instructions in the code.
llvm-svn: 265873
Static libraries where installed into "lib${LLVM_LIBDIR_SUFFIX}" while
shared ones into "lib". I found no justification for this behaviour.
This patch changes both types of libraries to be install into
"lib${LLVM_LIBDIR_SUFFIX}". LLVM and clang use the same behaviour.
This fixes llvm.org/PR27305.
llvm-svn: 265872
We verify the optimized function now for a long time and it helped to track
down bugs early. This will now also happen for all parallel subfunctions we
generate.
llvm-svn: 265823
The way to get the elements size with getPrimitiveSizeInBits() is not
the same as used in other parts of Polly which should use
DataLayout::getTypeAllocSize(). Its use only queries the size of the
pointer and getPrimitiveSizeInBits returns 0 for types that require a
DataLayout object such as pointers.
Together with r265379, this should fix PR27195.
llvm-svn: 265795
If we build the domains for error blocks and later remove them we lose
the information that they are not executed. Thus, in the SCoP it looks
like the control will always reach the statement S:
for (i = 0 ... N)
if (*valid == 0)
doSth(&ptr);
S: A[i] = *ptr;
Consequently, we would have assumed "ptr" to be always accessed and
preloaded it unconditionally. However, only if "*valid != 0" we would
execute the optimized version of the SCoP. Nevertheless, we would have
hoisted and accessed "ptr"regardless of "*valid". This changes the
semantic of the program as the value of "*valid" can cause a change of
"ptr" and control if it is executed or not.
To fix this problem we adjust the execution context of hoisted loads
wrt. error domains. To this end we introduce an ErrorDomainCtxMap that
maps each basic block to the error context under which it might be
executed. Thus, to the context under which it is executed but an error
block would have been executed to. To fill this map one traversal of
the blocks in the SCoP suffices. During this traversal we do also
"remove" error statements and those that are only reachable via error
statements. This was previously done by the removeErrorBlockDomains
function which is therefor not needed anymore.
This fixes bug PR26683 and thereby several SPEC miscompiles.
Differential Revision: http://reviews.llvm.org/D18822
llvm-svn: 265778
If ScalarEvolution cannot look through some expression but we do, it
might happen that a multiplication will arrive at the
SCEVAffinator::visitMulExpr. While we could always try to improve the
extractConstantFactor function we might still miss something, thus we
reintroduce the code to generate multiplicative piecewise-affine
functions as a fall-back.
llvm-svn: 265777
The findValues() function did not look through div & srem instructions
that were part of the argument SCEV. However, in different other
places we already look through it. This mismatch caused us to preload
values in the wrong order.
llvm-svn: 265775
If all exiting blocks of a SCoP are error blocks and therefor not
represented we will not generate accesses and consequently no SAI
objects for exit PHIs. However, they are needed in the code generation
to generate the merge PHIs between the original and optimized region.
With this patch we enusre that the SAI objects for exit PHIs exist
even if all exiting blocks turn out to be eror blocks.
This fixes the crash reported in PR27207.
llvm-svn: 265393
We currently only consider the first GEP when delinearizing access functions,
which makes us loose information about additional index expression offsets,
which results in our SCoP model to be incorrect. With this patch we now
compare the base pointers used to ensure we do not miss any additional offsets.
This fixes llvm.org/PR27195.
We may consider supporting nested GEP in our delinearization heuristics in
the future.
llvm-svn: 265379
Even before we build the domain the branch condition can become very
complex, especially if we have to build the complement of a lot of
equality constraints. With this patch we bail if the branch condition
has a lot of basic sets and parameters.
After this patch we now successfully compile
External/SPEC/CINT2000/186_crafty/186_crafty
with "-polly-process-unprofitable -polly-position=before-vectorizer".
llvm-svn: 265286
As a CFG is often structured we can simplify the steps performed during
domain generation. When we push domain information we can utilize the
information from a block A to build the domain of a block B, if A dominates B
and there is no loop backede on a path from A to B. When we pull domain
information we can use information from a block A to build the domain of a
block B if B post-dominates A. This patch implements both ideas and thereby
simplifies domains that were not simplified by isl. For the FINAL basic block
in test/ScopInfo/complex-successor-structure-3.ll we used to build a universe
set with 81 basic sets. Now it actually is represented as universe set.
While the initial idea to utilize the graph structure depended on the
dominator and post-dominator tree we can use the available region
information as a coarse grained replacement. To this end we push the
region entry domain to the region exit and pull it from the region
entry for the region exit if applicable.
With this patch we now successfully compile
External/SPEC/CINT2006/400_perlbench/400_perlbench
and
SingleSource/Benchmarks/Adobe-C++/loop_unroll.
Differential Revision: http://reviews.llvm.org/D18450
llvm-svn: 265285
If a loop has no exiting blocks the region covering we use during
schedule genertion might not cover that loop properly. For now we bail
out as we would not optimize these loops anyway.
llvm-svn: 265280
If an exit PHI is written and also read in the SCoP we should not create two
SAI objects but only one. As the read is only modeled to ensure OpenMP code
generation knows about it we can simply use the EXIT_PHI MemoryKind for both
accesses.
llvm-svn: 265261
If a loop has no exiting blocks the region covering we use during
schedule genertion might not cover that loop properly. For now we bail
out as we would not optimize these loops anyway.
llvm-svn: 265260
If a non-affine region PHI is generated we should not move the insert
point prior to the synthezised value in the same block as we might
split that block at the insert point later on. Only if the incoming
value should be placed in a different block we should change the
insertion point.
llvm-svn: 265132
... instead of hardcoding something that has been free at some point. This fixes
a crash triggered by r265084, where the diagnostic IDs have been shifted in a
way that resulted our hardcode ID to not be assigned any implementation. Our ID
was likely already wrong earlier on, but this time we really crashed nicely.
llvm-svn: 265114
These caused LNT failures due to new assertions when running with
-polly-position=before-vectorizer -polly-process-unprofitable for:
FAIL: clamscan.compile_time
FAIL: cjpeg.compile_time
FAIL: consumer-jpeg.compile_time
FAIL: shapes.compile_time
FAIL: clamscan.execution_time
FAIL: cjpeg.execution_time
FAIL: consumer-jpeg.execution_time
FAIL: shapes.execution_time
The failures have been introduced by r264782, but r264789 had to be reverted
as it depended on the earlier patch.
llvm-svn: 264885
As a CFG is often structured we can simplify the steps performed
during domain generation. When we push domain information we can
utilize the information from a block A to build the domain of a
block B, if A dominates B. When we pull domain information we can
use information from a block A to build the domain of a block B
if B post-dominates A. This patch implements both ideas and thereby
simplifies domains that were not simplified by isl. For the FINAL
basic block in
test/ScopInfo/complex-successor-structure-3.ll .
we used to build a universe set with 81 basic sets. Now it actually is
represented as universe set.
While the initial idea to utilize the graph structure depended on the
dominator and post-dominator tree we can use the available region
information as a coarse grained replacement. To this end we push the
region entry domain to the region exit and pull it from the region
entry for the region exit.
Differential Revision: http://reviews.llvm.org/D18450
llvm-svn: 264789
Instead of waiting for the domain construction to finish we will now
bail as early as possible in case a complexity problem is encountered.
This might save compile time but more importantly it makes the "abort"
explicit. While we can always check if we invalidated the assumed
context we can simply propagate the result of the construction back.
This also removes the HasComplexCFG flag that was used for the very
same reason.
Differential Revision: http://reviews.llvm.org/D18504
llvm-svn: 264775
Ensure the length of the header underline matches the length of the header.
This prevents SPHINX from erroring on this file and consequently not updating
the documentation.
Also, make this its own point not belonging to the 'increased applicability'
section.
llvm-svn: 264592
This patch applies the restrictions on the number of domain conjuncts
also to the domain parts of piecewise affine expressions we generate.
To this end the wording is change slightly. It was needed to support
complex additions featuring zext-instructions but it also fixes PR27045.
lnt profitable runs reports only little changes that might be noise:
Compile Time:
Polybench/[...]/2mm +4.34%
SingleSource/[...]/stepanov_container -2.43%
Execution Time:
External/[...]/186_crafty -2.32%
External/[...]/188_ammp -1.89%
External/[...]/473_astar -1.87%
llvm-svn: 264514
This got accidentally dropped in r264283.
Also, drop the wwwfiles from the removal list. This is not needed any more as
we now explicitly list the directories that should be formatted.
llvm-svn: 264397
This pass is not enabled in the default tool chain and currently can run into an
infinite loop, due to other parts of LLVM generating incorrect IR
(http://llvm.org/PR27065) -- which is not executed and consequently does not
seem to disturb other passes. As this pass is not really needed, we can just
drop it to get our build clean.
This fixes the timeout issues in MultiSource/Benchmarks/MiBench/consumer-jpeg
and MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg for
-polly-position=before-vectorizer -polly-process-unprofitable.. Unfortunately,
we are still left with a miscompile in cjpeg.
llvm-svn: 264396
This fixes PR27035. While we now exclude MemIntrinsics from the
polyhedral model if they would access "null" we could exploit this
even more, e.g., remove all parameter combinations that would lead to
the execution of this statement from the context.
llvm-svn: 264284
Similar to r262612 we need to check not only the pointer SCEV and the
type of an alias group but also the actual access instruction. The
reason is again the same: The pointer SCEV is not flow sensitive but the
access function is. In r262612 we avoided consolidating alias groups
even though the pointer SCEV and the type were the same but the access
function was not. Here it is simpler as we can simply check all members
of an alias group against the given access instruction.
llvm-svn: 264274
When codegenerating invariant loads in some rare cases we cannot generate code
and bail out. This change ensures that we maintain a valid dominator tree
in these situations. This fixes llvm.org/PR26736
Contributed-by: Matthias Reisinger <d412vv1n@gmail.com>
llvm-svn: 264142
This might be useful to evaluate the benefit of us handling modref funciton
calls. Also, a new bug that was triggered by modref function calls was
recently reported http://llvm.org/PR27035. To ensure the same issue does not
cause troubles for other people, we temporarily disable this until the bug
is resolved.
llvm-svn: 264140
ISL can conclude additional conditions on parameters from restrictions
on loop variables. Such conditions persist when leaving the loop and the
loop variable is projected out. This results in a narrower domain for
exiting the loop than entering it and is logically impossible for
non-infinite loops.
We fix this by not adding a lower bound i>=0 when constructing BB
domains, but defer it to when also the upper bound it computed, which
was done redundantly even before this patch.
This reduces the number of LNT fails with -polly-process-unprofitable
-polly-position=before-vectorizer from 8 to 6.
llvm-svn: 264118
We bail out if current scop has a complex control flow as this could lead to
building of large domain conditions. This is to reduce compile time. This
addresses r26382.
Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D18362
llvm-svn: 264105
Affine branches are fully modeled and regenerated from the polyhedral domain and
consequently do not require any input conditions to be propagated.
llvm-svn: 263678
This mirrors:
commit https://llvm.org/svn/llvm-project/llvm/trunk@263462
Author: Michael Kuperstein <michael.kuperstein@gmail.com>
Date: Mon Mar 14 18:34:29 2016 +0000
[AliasSetTracker] Do not strip pointer casts when processing MemSetInst
and fixes the failure the above commit triggered in Polly.
llvm-svn: 263538
This reverts commit r263322 and reapplies r263296. The original
r263258 was reapplied in LLVM after being reverted in r263321 due to
issues with Release testing in Clang.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263399
Index calculations can use the last value that come out of a loop.
Ideally, ScalarEvolution can compute that exit value directly without
depending on the loop induction variable, but not in all cases.
This changes isAffine to not consider such loop exit values as affine to
avoid that SCEVExpander adds uses of the original loop induction
variable.
This fix is analogous to r262404 that applies to general uses of loop
exit values instead of index expressions and loop bouds as in this
patch.
This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-unprofitable
from 10 to 8.
llvm-svn: 262665
The scope will be required in the following fix. This commit separates
the large changes that do not change behaviour from the small, but
functional change.
llvm-svn: 262664
Value merging is only necessary for scalars when they are used outside
of the scop. While an array's base pointer can be used after the scop,
it gets an extra ScopArrayInfo of type MK_Value. We used to generate
phi's for both of them, where one was assuming the reault of the other
phi would be the original value, because it has already been replaced by
the previous phi. This resulted in IR that the current IR verifier
allows, but is probably illegal.
This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-process-unprofitable
from 16 to 10.
Also see llvm.org/PR26718.
llvm-svn: 262629
This should fix PR19422.
Thanks to Jeremy Huddleston Sequoia for reporting this.
Thanks to Roman Gareev for his investigation and the reduced test case.
llvm-svn: 262612
The LNT test suite with -polly-process-unprofitable
-polly-position=before-vectorizer currenty fails 59 tests. With this
barrier added, only 16 keep failing. This is probably because Polly's
code generation currently does not correctly preserve all analyses it
promised to preserve. Temporarily add this barrier until further
investigation.
llvm-svn: 262488
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.
llvm-svn: 262404
In order to speed up compile time and to avoid random timeouts we now
separately track assumptions and restrictions. In this context
assumptions describe parameter valuations we need and restrictions
describe parameter valuations we do not allow. During AST generation
we create a runtime check for both, whereas the one for the
restrictions is negated before a conjunction is build.
Except the In-Bounds assumptions we currently only track restrictions.
Differential Revision: http://reviews.llvm.org/D17247
llvm-svn: 262328
removeCachedResults deletes the DetectionContext from
DetectionContextMap such that any it cannot be used anymore.
Unfortunately invalid<ReportUnprofitable> and RejectLogs.insert still do
use it. Because the memory is part of a map and not returned to to the
OS immediatly, such that the observable effect was only a memory leak
due to reference counters not decreased when the second call to
removeCachedResults does not remove the DetectionContext because because
it already has been removed.
Fix by not removing the DetectionContext prematurely. The second call to
removeCachedResults will handle it anyway.
llvm-svn: 262235
Originally committed in r261899 and reverted in r262202 due to failing
in out-of-LLVM tree builds.
Replace the use of LLVM_TOOLS_BINARY_DIR by LLVM_TOOLS_DIR which exists
in both, in-tree and out-of-tree builds.
Original commit message:
The script updates a lit test case that uses FileCheck using the actual
output of the 'RUN:'-lines program. Useful when updating test cases due
to expected output changes and diff'ing expected and actual output.
llvm-svn: 262227
We move verifyInvariantLoads out of this function to allow for an early return
without the need for code duplication. A similar transformation was suggested
by Johannes Doerfert in post commit review of r262033.
llvm-svn: 262203
This reverts commit r261899. Even though I am not yet 100% certain, this is
commit is the only one that has some relation to the recent cmake failures
in Polly.
llvm-svn: 262202
This debug output distracts from the -debug-only=polly-scops output. As it is
rather verbose and only really needed for debugging the domain construction
I drop this output. The domain construction is meanwhile stable enough to
not require regular debugging.
llvm-svn: 262117
In case the underlying basepointer of a ScopArrayInfo object is moved to another
module while the scop is still processed is it necessary to free dependent
ScopArrayInfo objects as they might otherwise be looked accidentally when a
new llvm basepointer value is reassigned the very same memory location as the
llvm value that has been moved earlier.
This function is not yet used in Polly itself, but is useful for external users.
llvm-svn: 262113
The functions buildAccessMultiDimFixed and buildAccessMultiDimParam were
refactored from buildMemoryAccess. In their own functions, the control
flow can be shortcut and simplified using returns.
Suggested-by: etherzhhb
llvm-svn: 262029
This allows to construct run-time checks for a scop without having to generate
a full AST. This is currently not taken advantage of in Polly itself, but
external users may benefit from this feature.
llvm-svn: 262009
This commit updates to the latest isl development version. There is no specific
feature we need on the Polly side, but we want to ensure test coverage for the
latest isl changes.
llvm-svn: 262001
The script updates a lit test case that uses FileCheck using the actual
output of the 'RUN:'-lines program. Useful when updating test cases due
expected output changes and diff'ing expected and actual output.
llvm-svn: 261899
Check the ModRefBehaviour of functions in order to decide whether or
not a call instruction might be acceptable.
Differential Revision: http://reviews.llvm.org/D5227
llvm-svn: 261866
The generated dedicated subregion exit block was assumed to have the same
dominance relation as the original exit block. This is incorrect if the exit
block receives other edges than only from the subregion, which results in that
e.g. the subregion's entry block does not dominate the exit block.
llvm-svn: 261865
From now on we bail only if a non-trivial alias group contains a non-affine
access, not when we discover aliasing and non-affine accesses are allowed.
llvm-svn: 261863
Replace Scop::getStmtForBasicBlock and Scop::getStmtForRegionNode, and
add overloads for llvm::Instruction and llvm::RegionNode.
getStmtFor and overloads become the common interface to get the Stmt
that contains something. Named after LoopInfo::getLoopFor and
RegionInfo::getRegionFor.
llvm-svn: 261791
This is also be caught by the function verifier, but disconnected from
the place that produced it. Catch it already at creation to be able to
reason more directly about the cause.
llvm-svn: 261790
MemoryAccess::addIncoming exists to remember which values come from that
statement in PHI writes, relevant for subregions that have multiple
exiting edges to an exit block. The exit block can be separated from the
exiting block by regions simplifications. It should not be called for
any read accesses.
llvm-svn: 261789
The test style guide defines that opt should get its input from stdin.
(instead by file argument to avoid that the file name appears in its
output)
CHECK-FORCED is not recognized by FileCheck; remove it.
llvm-svn: 261786
This allows other passes and transformations to use some of the existing AST
building infrastructure. This is not yet used in Polly itself.
llvm-svn: 261496
This patch adds support for memcpy, memset and memmove intrinsics. They are
represented as one (memset) or two (memcpy, memmove) memory accesses in the
polyhedral model. These accesses have an access range that describes the
summarized effect of the intrinsic, i.e.,
memset(&A[i], '$', N);
is represented as a write access from A[i] to A[i+N].
Differential Revision: http://reviews.llvm.org/D5226
llvm-svn: 261489
We now always print the reason why the code did not pass the LLVM verifier and
we also allow to disable verfication with -polly-codegen-verify=false. Before
this change the first assertion had generally no information why or what might
have gone wrong and it was also impossible to -view-cfg without recompile. This
change makes debugging bugs that result in incorrect IR a lot easier.
llvm-svn: 261320
To support non-aligned accesses we introduce a virtual element size
for arrays that divides each access function used for this array. The
adjustment of the access function based on the element size of the
array was therefore moved after this virtual element size was
determined, thus after all accesses have been created.
Differential Revision: http://reviews.llvm.org/D17246
llvm-svn: 261226
After we moved isl_ctx into Scop, we need to free the isl_ctx after
freeing all isl objects, which requires the ScopInfo pass to be freed
at last. But this is not guaranteed by the PassManager, and we need
extra code to free the isl_ctx at the right time.
We introduced a shared pointer to manage the isl_ctx, and distribute
it to all analyses that create isl objects. As such, whenever we free
an analyses with the shared_ptr (and also free the isl objects which
are created by the analyses), we decrease the (shared) reference
counter of the shared_ptr by 1. Whenever the reference counter reach
0 in the releaseMemory function of an analysis, that analysis will
be the last one that hold any isl objects, and we can safely free the
isl_ctx with that analysis.
Differential Revision: http://reviews.llvm.org/D17241
llvm-svn: 261100
First support for this feature was committed in r259784. Support for
loop invariant load hoisting with different types was added by
Johannes Doerfert in r260045 and r260886.
llvm-svn: 260965
A load can only be invariant if its base pointer is invariant too. To
this end, we check if the base pointer is defined inside the region or
outside. In the former case we recursively check if we can (and
therefore will) hoist the base pointer too. Only if that happends we
can hoist the load.
llvm-svn: 260886
This reverts commit 98efa006c96ac981c00d2e386ec1102bce9f549a.
The fix was broken since we do not use AA in the ScopDetection anymore to
check for invariant accesses.
llvm-svn: 260884
Eliminate the global variable "InsnToMemAcc" to make Scop/ScopInfo become
more protable, such that we can safely use them in a CallGraphSCC pass.
Differential Revision: http://reviews.llvm.org/D17238
llvm-svn: 260863
Before this patch it could happen that we did not hoist a load that
was a base pointer of another load even though AA already declared the
first one as invariant (during ScopDetection). If this case arises we
will now skipt the "can be overwriten" check because in this case the
over-approximating nature causes us to generate broken code.
llvm-svn: 260862
The former ScopArrayInfo::updateSizes was implicitly divided into an
updateElementType and an updateSizes. Now this partitioning is
explicit.
llvm-svn: 260860
So far we separated constant factors from multiplications, however,
only when they are at the outermost level of a parameter SCEV. Now,
we also separate constant factors from the parameter SCEV if the
outermost expression is a SCEVAddRecExpr. With the changes to the
SCEVAffinator we can now improve the extractConstantFactor(...)
function at will without worrying about any other code part. Thus,
if needed we can implement a more comprehensive
extractConstantFactor(...) function that will traverse the SCEV
instead of looking only at the outermost level.
Four test cases were affected. One did not change much and the other
three were simplified.
llvm-svn: 260859
This reverts commit https://llvm.org/svn/llvm-project/polly/trunk@260853
We unfortunately still have two bugs left which show only up with
-polly-process-unprofitable and which I forgot to test before committing.
llvm-svn: 260854
First support for this feature was committed in r259784. Support for
loop invariant load hoisting with different types was added by Johannes
Doerfert in r260045. This fixed the last known bug.
llvm-svn: 260853
Since the origin AccFuncMap in ScopInfo is used by the underlying Scop
only, and it must stay alive until we delete the Scop. It will be better
if we simply move the origin AccFuncMap in ScopInfo into the Scop class.
llvm-svn: 260820
Make Scop become more portable such that we can use it in a CallGraphSCC pass.
The first step is to drop the analyses that are only used during Scop construction.
This patch drop LoopInfo from Scop.
llvm-svn: 260819
Make Scop become more portable such that we can use it in a CallGraphSCC pass.
The first step is to drop the analyses that are only used during Scop construction.
This patch drop DominatorTree from Scop.
llvm-svn: 260818
Make Scop become more portable such that we can use it in a CallGraphSCC pass.
The first step is to drop the analyses that are only used during Scop construction.
This patch drop ScopDecection from Scop.
llvm-svn: 260817
We now distinguish invariant loads to the same memory location if they
have different types. This will cause us to pre-load an invariant
location once for each type that is used to access it. However, we can
thereby avoid invalid casting, especially if an array is accessed
though different typed/sized invariant loads.
This basically reverts the changes in r260023 but keeps the test
cases.
llvm-svn: 260045
We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.
llvm-svn: 260025
Invariant load hoisting of memory accesses with non-canonical element
types lacks support for equivalence classes that contain elements of
different width/size. This support should be added, but to get our buildbots
back to green, we disable load hoisting for memory accesses with non-canonical
element size for now.
llvm-svn: 260023
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. These type mismatches are after
r259784 a lot more common and also arise with types of different size, which
have not been handled before.
Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather then a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.
llvm-svn: 260009
The previously implemented approach is to follow value definitions and
create write accesses ("push defs") while searching for uses. This
requires the same relatively validity- and requirement conditions to be
replicated at multiple locations (PHI instructions, other instructions,
uses by PHIs).
We replace this by iterating over the uses in a SCoP ("pull in
requirements"), and add writes only when at least one read has been
added. It turns out to be simpler code because each use is only iterated
over once and writes are added for the first access that reads it. We
need another iteration to identify escaping values (uses not in the
SCoP), which also makes the difference between such accesses more
obvious. As a side-effect, the order of scalar MemoryAccess can change.
Differential Revision: http://reviews.llvm.org/D15706
llvm-svn: 259987
This allows code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.
We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.
We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.
Finally, we add this as a new item to the 2.9 release notes
Reviewers: jdoerfert, Meinersbur
Differential Revision: http://reviews.llvm.org/D16878
llvm-svn: 259784
Even though the commands still work, dragonegg has not been updated to work with
recent versions of LLVM. Consequently, only very old Polly versions (which we
do not support any more), can be used in this way.
This simplifies our documentation page.
llvm-svn: 259758
This includes some (optional) improvements to the isl scheduler, which we do not
use yet, as well as a fix for a bug previously also affecting Polly:
commit 662ee9b7d45ebeb7629b239d3ed43442e25bf87c
Author: Sven Verdoolaege <skimo@kotnet.org>
Date: Mon Jan 25 16:59:32 2016 +0100
isl_basic_map_realign: perform Gaussian elimination on result
Many parts of isl assume that Gaussian elimination has been
applied to the equality constraints. In particular singleton_extract_point
makes this assumption. The input to singleton_extract_point
may have undergone parameter alignment. This parameter alignment
(ultimately performed by isl_basic_map_realign) therefore
needs to make sure the result preserves this property
llvm-svn: 259757
We remove information for older versions of Polly and also shorten the overall
text. This should make it a lot easier for people to get to the important code
wight away.
llvm-svn: 259658
We support now code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.
llvm-svn: 259587
We create separate functions for fixed-size multi-dimensional, parameteric-sized
multi-dimensional, as well as single-dimensional memory accesses to reduce the
complexity of a large monolithic function.
Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 259522
There is no need to pass the size of the elements as the last size dimension
to ScopArrayInfo. This information is already available through the ElementType.
Tracking it twice is not only redundant but may result in inconsistencies.
llvm-svn: 259521