Commit Graph

500 Commits

Author SHA1 Message Date
Michael Kruse 4a080de057 Add %loadPolly to test command line.
Required for out-of-tree builds of Polly.

llvm-svn: 279657
2016-08-24 19:12:48 +00:00
Eli Friedman 28671c83d6 [SCEVValidator] Don't reorder multiplies in extractConstantFactor.
The existing code would add the operands in the wrong order, and eventually
crash because the SCEV expression doesn't exactly match the parameter SCEV
expression in SCEVAffinator::visit. (SCEV doesn't sort the operands to
getMulExpr in general.)

Differential Revision: https://reviews.llvm.org/D23592

llvm-svn: 279087
2016-08-18 16:30:42 +00:00
Tobias Grosser b143e31164 [ScopInfo] Make scalars used by PHIs in non-affine regions available
Normally this is ensured when adding PHI nodes, but as PHI node dependences
do not need to be added in case all incoming blocks are within the same
non-affine region, this was missed.

This corrects an issue visible in LNT's sqlite3, in case invariant load hoisting
was disabled.

llvm-svn: 278792
2016-08-16 11:44:48 +00:00
Tobias Grosser 7cb809983d [tests] Force invariant load hoisting for test cases that need it -- III
llvm-svn: 278673
2016-08-15 15:56:24 +00:00
Tobias Grosser ad61c170d5 [tests] Force invariant load hoisting for test cases that need it II
llvm-svn: 278669
2016-08-15 13:58:16 +00:00
Tobias Grosser 75b9c7df4d [test] Correct spelling in test case
and explicitly enable invariant load hoisting for this test case.

llvm-svn: 278668
2016-08-15 13:58:04 +00:00
Tobias Grosser 6e6264c142 [tests] Force invariant load hoisting for test cases that need it
This will make it easier to switch the default of Polly's invariant load
hoisting strategy and also makes it very clear that these test cases
indeed require invariant code hoisting to work.

llvm-svn: 278667
2016-08-15 13:27:49 +00:00
Tobias Grosser c59b3ce044 [BlockGenerator] Also eliminate dead code not originating from BB
After having generated the code for a ScopStmt, we run a simple dead-code
elimination that drops all instructions that are known to be and remain unused.
Until this change, we only considered instructions for dead-code elimination, if
they have a corresponding instruction in the original BB that belongs to
ScopStmt. However, when generating code we do not only copy code from the BB
belonging to a ScopStmt, but also generate code for operands referenced from BB.
After this change, we now also considers code for dead code elimination, which
does not have a corresponding instruction in BB.

This fixes a bug in Polly-ACC where such dead-code referenced CPU code from
within a GPU kernel, which is possible as we do not guarantee that all variables
that are used in known-dead-code are moved to the GPU.

llvm-svn: 278103
2016-08-09 08:59:05 +00:00
Tobias Grosser 9d12d8ade3 BlockGenerator: remove dead instructions in normal statements
This ensures that no trivially dead code is generated. This is not only cleaner,
but also avoids troubles in case code is generated in a separate function and
some of this dead code contains references to values that are not available.
This issue may happen, in case the memory access functions have been updated
and old getelementptr instructions remain in the code. With normal Polly,
a test case is difficult to draft, but the upcoming GPU code generation can
possibly trigger such problems. We will later extend this dead-code elimination
to region and vector statements.

llvm-svn: 276263
2016-07-21 11:48:36 +00:00
Michael Kruse 3b0a9934fa Add CHECK line to test case. NFC.
Check not only that the compiler is not crashing, but also whether the
probablematic part (The sequence of instructions simplified to '4') is reflected
in the output.

Thanks to Tobias for the hint.

llvm-svn: 275189
2016-07-12 16:37:50 +00:00
Michael Kruse e448364320 [SCEVAffinator] Fix assertion checking for constant divisor.
An assertion in visitSDivInstruction() checked whether the divisor is constant
by checking whether the argument is a ConstantInt. However, SCEVValidator allows
the divisor to be simplified to a constant by ScalarEvolution.

We synchronize the implementation of SCEVValidator and SCEVAffinator to both
accept simplified SCEV expressions.

llvm-svn: 275174
2016-07-12 15:08:47 +00:00
Tobias Grosser 42eef3acd7 Add test case forgotten in r275053
llvm-svn: 275055
2016-07-11 12:15:06 +00:00
Michael Kruse 586e579fe8 Fix assertion due to buildMemoryAccess.
For llvm the memory accesses from nonaffine loops should be visible,
however for polly those nonaffine loops should be invisible/boxed.

This fixes llvm.org/PR28245

Cointributed-by: Huihui Zhang <huihuiz@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D21591

llvm-svn: 274842
2016-07-08 12:38:28 +00:00
Tobias Grosser 2ea7c6e8d1 Ensure parameter names are isl-compatible
Without this change it is not possible for isl to parse the resulting objects
from their string representation.

llvm-svn: 274350
2016-07-01 13:40:28 +00:00
Johannes Doerfert 4ba65a5622 [GSoC 2016]New function pass ScopInfoWrapperPass
This patch adds a new function pass ScopInfoWrapperPass so that the
polyhedral description of a region, the SCoP, can be constructed and
used in a function pass.

Patch by Utpal Bora <cs14mtech11017@iith.ac.in>

Differential Revision: http://reviews.llvm.org/D20962

llvm-svn: 273856
2016-06-27 09:32:30 +00:00
Tobias Grosser 423642a597 Recommit: "Look through IntToPtr & PtrToInt instructions"
IntToPtr and PtrToInt instructions are basically no-ops that we can handle as
such. In order to generate them properly as parameters we had to improve the
ScopExpander, though the change is the first in the direction of a more
aggressive scalar synthetization.

This patch was originally contributed by Johannes Doerfert in r271888, but was
in conflict with the revert in r272483. This is a recommit with some minor
adjustment to the test cases to take care of differing instruction names.

llvm-svn: 272485
2016-06-11 19:26:08 +00:00
Tobias Grosser 3717aa5ddb This reverts recent expression type changes
The recent expression type changes still need more discussion, which will happen
on phabricator or on the mailing list. The precise list of commits reverted are:

- "Refactor division generation code"
- "[NFC] Generate runtime checks after the SCoP"
- "[FIX] Determine insertion point during SCEV expansion"
- "Look through IntToPtr & PtrToInt instructions"
- "Use minimal types for generated expressions"
- "Temporarily promote values to i64 again"
- "[NFC] Avoid unnecessary comparison for min/max expressions"
- "[Polly] Fix -Wunused-variable warnings (NFC)"
- "[NFC] Simplify min/max expression generation"
- "Simplify the type adjustment in the IslExprBuilder"

Some of them are just reverted as we would otherwise get conflicts. I will try
to re-commit them if possible.

llvm-svn: 272483
2016-06-11 19:17:15 +00:00
Johannes Doerfert 695c6b476a [FIX] Model the rounding behaviour of SRem correctly
llvm-svn: 272001
2016-06-07 12:00:37 +00:00
Johannes Doerfert c0ece9b67e [NFC] Generate runtime checks after the SCoP
We now generate runtime checks __after__ the SCoP code generation and
  not before, though they are still inserted at the same position int
  the code. This allows to modify the runtime check during SCoP code
  generation.

llvm-svn: 271894
2016-06-06 13:32:52 +00:00
Johannes Doerfert dedb7693ec Look through IntToPtr & PtrToInt instructions
IntToPtr and PtrToInt instructions are basically no-ops that we can handle as
  such. In order to generate them properly as parameters we had to improve the
  ScopExpander, though the change is the first in the direction of a more
  aggressive scalar synthetization.

llvm-svn: 271888
2016-06-06 12:12:27 +00:00
Johannes Doerfert 4b2fd892ec [FIX] Do not recognize division by 0 as affine
llvm-svn: 271885
2016-06-06 12:08:34 +00:00
Johannes Doerfert 0767a511ba Use minimal types for generated expressions
We now use the minimal necessary bit width for the generated code. If
  operations might overflow (add/sub/mul) we will try to adjust the types in
  order to ensure a non-wrapping computation. If the type adjustment is not
  possible, thus the necessary type is bigger than the type value of
  --polly-max-expr-bit-width, we will use assumptions to verify the computation
  will not wrap. However, for run-time checks we cannot build assumptions but
  instead utilize overflow tracking intrinsics.

llvm-svn: 271878
2016-06-06 09:57:41 +00:00
Sanjoy Das 2084d784df [Polly] Fix test case after rL271151
Summary:
After rL271151 (SCEV change) SCEV no longer unconditionally transfers
nuw/nsw from the increment operation to the post-inc value; this
transfer only happens if there is undefined behavior in the program if
the increment overflowed (as opposed to just generating poison).

The loops in `wraping_signed_expr_1.ll` are in non-canonical
form (they're not rotated), and that defeats LLVM's poison-is-UB
analysis.  IMO the easiest fix here is to run `wraping_signed_expr_1.ll`
through `-loop-rotate` to canonicalize the loops, which is what this
patch does.

Reviewers: jdoerfert, Meinersbur, grosser

Subscribers: grosser, mcrosier, pollydev

Differential Revision: http://reviews.llvm.org/D20778

llvm-svn: 271536
2016-06-02 16:58:41 +00:00
Johannes Doerfert 6631bfdd1c [FIX] Correctly translate i1 expressions
llvm-svn: 271534
2016-06-02 16:57:12 +00:00
Johannes Doerfert fd7ddf1479 [FIX] Test case broken by r271522.
llvm-svn: 271531
2016-06-02 16:33:01 +00:00
Tobias Grosser 6a114505c4 Temporarily xfail test case which broke after a fix in SCEV
This should keep the buildbots quite while we decide how to update the test
case.

llvm-svn: 271169
2016-05-29 06:03:44 +00:00
Johannes Doerfert 25227fe7b0 Optimistic assume required invariant loads to be invariant
Before this patch we bailed if a required invariant load was potentially
  overwritten. However, now we will optimistically assume it is actually
  invariant and, to this end, restrict the valid parameter space as well as the
  execution context with regards to potential overwrites of the location.

llvm-svn: 270416
2016-05-23 10:40:54 +00:00
Johannes Doerfert cda1bd5048 Revert "Optimistic assume required invariant loads to be invariant"
This reverts commit 787e642207ca978f2e800140529fc7049ea1f3de until the
lnt failures are fixed.

llvm-svn: 270061
2016-05-19 13:47:34 +00:00
Johannes Doerfert cb77542d1c Optimistic assume required invariant loads to be invariant
So far we bailed if a required invariant load was potentially overwritten in
  the SCoP. From now on we will optimistically assume it is actually invariant
  and, to this end, restrict the valid parameter space.

llvm-svn: 270060
2016-05-19 13:24:10 +00:00
Johannes Doerfert 60dd9e1346 Compute the MaxLoopDepth during domain construction [NFC]
llvm-svn: 270052
2016-05-19 12:33:14 +00:00
Johannes Doerfert 6f1bb7a9d9 Support truncate operations
Truncate operations are basically modulo operations, thus we can model
  them that way. However, for large types we assume the operand to fit
  in the new type size instead of introducing a modulo with a very large
  constant.

llvm-svn: 269300
2016-05-12 15:13:49 +00:00
Johannes Doerfert 27d12d3d1f Invalidate unprofitable SCoPs after creation
If a profitable run is performed we will check if the SCoP seems to be
  profitable after creation but before e.g., dependence are computed. This is
  needed as SCoP detection only approximates the actual SCoP representation.
  In the end this should allow us to be less conservative during the SCoP
  detection while keeping the compile time in check.

llvm-svn: 269074
2016-05-10 16:38:09 +00:00
Johannes Doerfert bf9473b2d8 Weaken profitability constraints during ScopDetection
Regions with one affine loop can be profitable if the loop is
  distributable. To this end we will allow them to be treated as
  profitable if they contain at least two non-trivial basic blocks.

llvm-svn: 269064
2016-05-10 14:42:30 +00:00
Johannes Doerfert ede4ecaefb [FIX] Cleanup isl objects prior to early exit
llvm-svn: 269061
2016-05-10 14:01:21 +00:00
Johannes Doerfert 2b92a0e4ee Handle llvm.assume inside the SCoP
The assumption attached to an llvm.assume in the SCoP needs to be
  combined with the domain of the surrounding statement but can
  nevertheless be used to refine the context.

  This fixes the problems mentioned in PR27067.

llvm-svn: 269060
2016-05-10 14:00:57 +00:00
Johannes Doerfert 297c720d15 Propagate complexity problems during domain generation [NFC]
This patches makes the propagation of complexity problems during
  domain generation consistent. Additionally, it makes it less likely to
  encounter ill-formed domains later, e.g., during schedule generation.

llvm-svn: 269055
2016-05-10 13:06:42 +00:00
Johannes Doerfert 14b1cf35b5 [FIX] Create error-restrictions late
Before this patch we generated error-restrictions only for
  error-blocks, thus blocks (or regions) containing a not represented
  function call. However, the same reasoning is needed if the invalid
  domain of a statement subsumes its actual domain. To this end we move
  the generation of error-restrictions after the propagation of the
  invalid domains. Consequently, error-statements are now defined more
  general as statements that are assumed to be not executed.
  Additionally, we do not record an empty domain for such statements but
  a nullptr instead. This allows to distinguish between error-statements
  and dead-statements.

llvm-svn: 269053
2016-05-10 12:42:26 +00:00
Johannes Doerfert a60ad845c0 Simplify the internal representation according to the context [NFC]
We now use context information to simplify the domains and access
  functions of the SCoP instead of just aligning them with the parameter
  space.

llvm-svn: 269048
2016-05-10 12:18:22 +00:00
Johannes Doerfert 172dd8b923 Allow unsigned divisions
After zero-extend operations and unsigned comparisons we now allow
  unsigned divisions. The handling is basically the same as for signed
  division, except the interpretation of the operands. As the divisor
  has to be constant in both cases we can simply interpret it as an
  unsigned value without additional complexity in the representation.
  For the dividend we could choose from the different representation
  schemes introduced for zero-extend operations but for now we will
  simply use an assumption.

llvm-svn: 268032
2016-04-29 11:53:35 +00:00
Johannes Doerfert 99ec00a2bb [FIX] Typo
llvm-svn: 268027
2016-04-29 10:47:07 +00:00
Johannes Doerfert 3e48ee2ab9 [FIX] Unsigned comparisons change invalid domain
It does not suffice to take a global assumptions for unsigned comparisons but
  we also need to adjust the invalid domain of the statements guarded by such
  an assumption. To this end we allow to specialize the getPwAff call now in
  order to indicate unsigned interpretation.

llvm-svn: 268025
2016-04-29 10:44:41 +00:00
Johannes Doerfert 8475d1c163 [FIX] Correct assumption simplification
Assumptions and restrictions can both be simplified with the domain of a
  statement but not the same way. After this patch we will correctly
  distinguish them.

llvm-svn: 267885
2016-04-28 14:32:58 +00:00
Johannes Doerfert 792374b941 Allow unsigned comparisons
With this patch we will optimistically assume that the result of an unsigned
  comparison is the same as the result of the same comparison interpreted as
  signed.

llvm-svn: 267559
2016-04-26 14:33:12 +00:00
Johannes Doerfert 323ab3975b [FIX] Adjust assumption space for zext instructions
llvm-svn: 267552
2016-04-26 12:44:01 +00:00
Johannes Doerfert 625bb1fc10 Do not add but record signed-unsigned assumptions
llvm-svn: 267528
2016-04-26 09:16:36 +00:00
Johannes Doerfert 9cc8340fea Extract some constant factors from "SCEVAddExprs"
Additive expressions can have constant factors too that we can extract
  and thereby simplify the internal representation. For now we do
  compute the gcd of all constant factors but only extract the same
  (possibly negated) factor if there is one.

llvm-svn: 267445
2016-04-25 19:09:10 +00:00
Johannes Doerfert d5c369f460 Do not check all GEPs for assumptions
Before, we checked all GEPs in a statement in order to derive
  out-of-bound assumptions. However, this can not only introduce new
  parameters but it is also not clear what we can learn from GEPs that
  are not immediately used in a memory accesses inside the SCoP. As this
  case is very rare, no actual change in the behaviour is expected.

llvm-svn: 267442
2016-04-25 18:55:15 +00:00
Johannes Doerfert c78ce7dc21 Only add user assumptions on known parameters [NFC]
Before, assumptions derived from llvm.assume could reference new
  parameters that were not known to the SCoP before. These were neither
  beneficial to the representation nor to the user that reads the
  emitted remark. Now we project them out and keep only user assumptions
  on known parameters. Nevertheless, the new parameters are still part
  of the SCoPs parameter space as the SCEVAffinator currently adds them
  on demand.

llvm-svn: 267441
2016-04-25 18:51:27 +00:00
Johannes Doerfert c3596284c3 Model zext-extend instructions
A zero-extended value can be interpreted as a piecewise defined signed
  value. If the value was non-negative it stays the same, otherwise it
  is the sum of the original value and 2^n where n is the bit-width of
  the original (or operand) type. Examples:
    zext i8 127 to i32 -> { [127] }
    zext i8  -1 to i32 -> { [256 + (-1)] } = { [255] }
    zext i8  %v to i32 -> [v] -> { [v] | v >= 0; [256 + v] | v < 0 }

  However, LLVM/Scalar Evolution uses zero-extend (potentially lead by a
  truncate) to represent some forms of modulo computation. The left-hand side
  of the condition in the code below would result in the SCEV
  "zext i1 <false, +, true>for.body" which is just another description
  of the C expression "i & 1 != 0" or, equivalently, "i % 2 != 0".

    for (i = 0; i < N; i++)
      if (i & 1 != 0 /* == i % 2 */)
        /* do something */

  If we do not make the modulo explicit but only use the mechanism described
  above we will get the very restrictive assumption "N < 3", because for all
  values of N >= 3 the SCEVAddRecExpr operand of the zero-extend would wrap.
  Alternatively, we can make the modulo in the operand explicit in the
  resulting piecewise function and thereby avoid the assumption on N. For the
  example this would result in the following piecewise affine function:
  { [i0] -> [(1)] : 2*floor((-1 + i0)/2) = -1 + i0;
    [i0] -> [(0)] : 2*floor((i0)/2) = i0 }
  To this end we can first determine if the (immediate) operand of the
  zero-extend can wrap and, in case it might, we will use explicit modulo
  semantic to compute the result instead of emitting non-wrapping assumptions.

  Note that operands with large bit-widths are less likely to be negative
  because it would result in a very large access offset or loop bound after the
  zero-extend. To this end one can optimistically assume the operand to be
  positive and avoid the piecewise definition if the bit-width is bigger than
  some threshold (here MaxZextSmallBitWidth).

  We choose to go with a hybrid solution of all modeling techniques described
  above. For small bit-widths (up to MaxZextSmallBitWidth) we will model the
  wrapping explicitly and use a piecewise defined function. However, if the
  bit-width is bigger than MaxZextSmallBitWidth we will employ overflow
  assumptions and assume the "former negative" piece will not exist.

llvm-svn: 267408
2016-04-25 14:01:36 +00:00
Johannes Doerfert 85676e3674 Add an invalid domain to memory accesses
Memory accesses can have non-precisely modeled access functions that
  would cause us to build incorrect execution context for hoisted loads.
  This is the same issue that occurred during the domain construction for
  statements and it is dealt with the same way.

llvm-svn: 267289
2016-04-23 14:32:34 +00:00
Johannes Doerfert ac9c32e216 Translate SCEVs to isl_pw_aff and their invalid domain
The SCEVAffinator will now produce not only the isl representaiton of
  a SCEV but also the domain under which it is invalid. This is used to
  record possible overflows that can happen in the statement domains in
  the statements invalid domain. The result is that invalid loads have
  an accurate execution contexts with regards to the validity of their
  statements domain. While the SCEVAffinator currently is only taking
  "no-wrapping" assumptions, we can add more withouth worrying about the
  execution context of loads that are optimistically hoisted.

llvm-svn: 267288
2016-04-23 14:31:17 +00:00
Johannes Doerfert 1dc12aff8a Simplify the execution context for dereferencable loads
If we know it is safe to execute a load we do not need an execution
  context, however only if we are sure it was modeled correctly.

llvm-svn: 267284
2016-04-23 12:59:18 +00:00
Johannes Doerfert d77089e62d Bail for complex execution contexts of invariant loads
llvm-svn: 267146
2016-04-22 11:41:14 +00:00
Tobias Grosser ad25f0fee6 Update debug metadata after LLVM commits r266445+r266446
llvm-svn: 266473
2016-04-15 20:51:27 +00:00
Michael Kruse 5c7d0cb834 Add contexts to test cases. NFC.
As discussed in the Polly weekly phone call and reviews.llvm.org/D18878,
the assumed contexts changed (widen) due to D18878/r265942. Also check
these contexts in the tests affected by that change.

llvm-svn: 266323
2016-04-14 15:22:13 +00:00
Johannes Doerfert fb72187fdd [FIX] Check the invalid context agains the context to rule out SCoPs
llvm-svn: 266096
2016-04-12 17:54:29 +00:00
Johannes Doerfert 615e0b85f8 Record wrapping assumptions early
Utilizing the record option for assumptions we can simplify the wrapping
  assumption generation a lot. Additionally, we can now report locations
  together with wrapping assumptions, though they might not be accurate yet.

llvm-svn: 266069
2016-04-12 13:28:39 +00:00
Johannes Doerfert 3bf6e4129f Record assumptions first and add them later
There are three reasons why we want to record assumptions first before we
add them to the assumed/invalid context:

  1) If the SCoP is not profitable or otherwise invalid without the
     assumed/invalid context we do not have to compute it.
  2) Information about the context are gathered rather late in the SCoP
     construction (basically after we know all parameters), thus the user
     might see overly complicated assumptions to be taken while they would
     have been simplified later on.
  3) Currently we cannot take assumptions at any point but have to wait,
     e.g., for the domain generation to finish. This makes wrapping
     assumptions much more complicated as they need to be and it will
     have a similar effect on "signed-unsigned" assumptions later.

llvm-svn: 266068
2016-04-12 13:27:35 +00:00
Johannes Doerfert 7c01357cef Introduce an invalid context for each statement
Collect the error domain contexts (formerly in the ErrorDomainCtxMap)
  for each statement in the new InvalidContext member variable. While
  this commit is basically a [NFC] it is a first step to make hoisting
  sound by allowing a more fine grained record of invalid contexts,
  e.g., here on statement level.

llvm-svn: 266053
2016-04-12 09:57:34 +00:00
Michael Kruse 3b425ff232 Allow overflow of indices with constant dim-sizes.
Allow overflow of indices into the next higher dimension if it has
constant size. E.g.

    float A[32][2];
    ((float*)A)[5];

is effectively the same as

    A[2][1];

This can happen since r265379 as a side effect if ScopDetection
recognizes an access as affine, but ScopInfo rejects the GetElementPtr.

Differential Revision: http://reviews.llvm.org/D18878

llvm-svn: 265942
2016-04-11 14:34:08 +00:00
Johannes Doerfert 561d36b320 Allow pointer expressions in SCEVs again.
In r247147 we disabled pointer expressions because the IslExprBuilder did not
  fully support them. This patch reintroduces them by simply treating them as
  integers. The only special handling for pointers that is left detects the
  comparison of two address_of operands and uses an unsigned compare.

llvm-svn: 265894
2016-04-10 09:50:10 +00:00
Johannes Doerfert 41725a1e7a [FIX] Do not crash on opaque (unsized) types.
llvm-svn: 265834
2016-04-08 19:20:03 +00:00
Michael Kruse b3de24f5e6 Add testcase from PR27218. NFC.
The the bug has already been fixed r265795, but this second testcase
still useful.

llvm-svn: 265809
2016-04-08 16:54:53 +00:00
Michael Kruse 436c90619c [ScopInfo] Fix check for element size mismatch.
The way to get the elements size with getPrimitiveSizeInBits() is not
the same as used in other parts of Polly which should use
DataLayout::getTypeAllocSize(). Its use only queries the size of the
pointer and getPrimitiveSizeInBits returns 0 for types that require a
DataLayout object such as pointers.

Together with r265379, this should fix PR27195.

llvm-svn: 265795
2016-04-08 16:20:08 +00:00
Johannes Doerfert 3ef78d6d38 [FIX] Adjust execution context of hoisted loads wrt. error domains
If we build the domains for error blocks and later remove them we lose
  the information that they are not executed. Thus, in the SCoP it looks
  like the control will always reach the statement S:

            for (i = 0 ... N)
                if (*valid == 0)
                  doSth(&ptr);
          S:    A[i] = *ptr;

  Consequently, we would have assumed "ptr" to be always accessed and
  preloaded it unconditionally. However, only if "*valid != 0" we would
  execute the optimized version of the SCoP. Nevertheless, we would have
  hoisted and accessed "ptr"regardless of "*valid". This changes the
  semantic of the program as the value of "*valid" can cause a change of
  "ptr" and control if it is executed or not.

  To fix this problem we adjust the execution context of hoisted loads
  wrt. error domains. To this end we introduce an ErrorDomainCtxMap that
  maps each basic block to the error context under which it might be
  executed. Thus, to the context under which it is executed but an error
  block would have been executed to. To fill this map one traversal of
  the blocks in the SCoP suffices. During this traversal we do also
  "remove" error statements and those that are only reachable via error
  statements. This was previously done by the removeErrorBlockDomains
  function which is therefor not needed anymore.

  This fixes bug PR26683 and thereby several SPEC miscompiles.

Differential Revision: http://reviews.llvm.org/D18822

llvm-svn: 265778
2016-04-08 10:30:09 +00:00
Johannes Doerfert b47cbe1c72 [FIX] Handle multiplications in the SCEVAffinator again
If ScalarEvolution cannot look through some expression but we do, it
  might happen that a multiplication will arrive at the
  SCEVAffinator::visitMulExpr. While we could always try to improve the
  extractConstantFactor function we might still miss something, thus we
  reintroduce the code to generate multiplicative piecewise-affine
  functions as a fall-back.

llvm-svn: 265777
2016-04-08 10:27:40 +00:00
Johannes Doerfert e85d50defb Add test cases for the removal of error blocks
llvm-svn: 265776
2016-04-08 10:26:39 +00:00
Tobias Grosser 6a44b7fdf0 Add test case forgotten in r265379.
Thanks Johannes for reminding me.

llvm-svn: 265423
2016-04-05 17:40:07 +00:00
Johannes Doerfert 57c5f0b1c4 [FIX] Ensure SAI objects for exit PHIs
If all exiting blocks of a SCoP are error blocks and therefor not
  represented we will not generate accesses and consequently no SAI
  objects for exit PHIs. However, they are needed in the code generation
  to generate the merge PHIs between the original and optimized region.
  With this patch we enusre that the SAI objects for exit PHIs exist
  even if all exiting blocks turn out to be eror blocks.

  This fixes the crash reported in PR27207.

llvm-svn: 265393
2016-04-05 13:44:21 +00:00
Johannes Doerfert 1519491eaf Do not allow to complex branch conditions
Even before we build the domain the branch condition can become very
  complex, especially if we have to build the complement of a lot of
  equality constraints. With this patch we bail if the branch condition
  has a lot of basic sets and parameters.

  After this patch we now successfully compile
    External/SPEC/CINT2000/186_crafty/186_crafty
  with "-polly-process-unprofitable -polly-position=before-vectorizer".

llvm-svn: 265286
2016-04-04 07:59:41 +00:00
Johannes Doerfert 642594ae87 Exploit graph properties during domain generation
As a CFG is often structured we can simplify the steps performed during
  domain generation. When we push domain information we can utilize the
  information from a block A to build the domain of a block B, if A dominates B
  and there is no loop backede on a path from A to B. When we pull domain
  information we can use information from a block A to build the domain of a
  block B if B post-dominates A. This patch implements both ideas and thereby
  simplifies domains that were not simplified by isl. For the FINAL basic block
  in test/ScopInfo/complex-successor-structure-3.ll we used to build a universe
  set with 81 basic sets. Now it actually is represented as universe set.

  While the initial idea to utilize the graph structure depended on the
  dominator and post-dominator tree we can use the available region
  information as a coarse grained replacement. To this end we push the
  region entry domain to the region exit and pull it from the region
  entry for the region exit if applicable.

  With this patch we now successfully compile
    External/SPEC/CINT2006/400_perlbench/400_perlbench
  and
    SingleSource/Benchmarks/Adobe-C++/loop_unroll.

Differential Revision: http://reviews.llvm.org/D18450

llvm-svn: 265285
2016-04-04 07:57:39 +00:00
Johannes Doerfert d5edbd61a1 [FIX] Do not create a SCoP in the presence of infinite loops
If a loop has no exiting blocks the region covering we use during
  schedule genertion might not cover that loop properly. For now we bail
  out as we would not optimize these loops anyway.

llvm-svn: 265280
2016-04-03 23:09:06 +00:00
Tobias Grosser 151ae32dba Revert "[FIX] Do not create a SCoP in the presence of infinite loops"
This reverts commit r265260, as it caused the following 'make check-polly'
failures:

    Polly :: ScopDetect/index_from_unpredictable_loop.ll
    Polly :: ScopInfo/multiple_exiting_blocks.ll
    Polly :: ScopInfo/multiple_exiting_blocks_two_loop.ll
    Polly :: ScopInfo/schedule-const-post-dominator-walk-2.ll
    Polly :: ScopInfo/schedule-const-post-dominator-walk.ll
    Polly :: ScopInfo/switch-5.ll

llvm-svn: 265272
2016-04-03 19:36:52 +00:00
Johannes Doerfert 2075b5d2a1 [FIX] Do not create two SAI objects for exit PHIs
If an exit PHI is written and also read in the SCoP we should not create two
  SAI objects but only one. As the read is only modeled to ensure OpenMP code
  generation knows about it we can simply use the EXIT_PHI MemoryKind for both
  accesses.

llvm-svn: 265261
2016-04-03 11:16:00 +00:00
Johannes Doerfert 7dcceb82e9 [FIX] Do not create a SCoP in the presence of infinite loops
If a loop has no exiting blocks the region covering we use during
  schedule genertion might not cover that loop properly. For now we bail
  out as we would not optimize these loops anyway.

llvm-svn: 265260
2016-04-03 11:12:39 +00:00
Tobias Grosser 6deba4ea03 Revert 264782 and 264789
These caused LNT failures due to new assertions when running with
-polly-position=before-vectorizer -polly-process-unprofitable for:

FAIL: clamscan.compile_time
FAIL: cjpeg.compile_time
FAIL: consumer-jpeg.compile_time
FAIL: shapes.compile_time
FAIL: clamscan.execution_time
FAIL: cjpeg.execution_time
FAIL: consumer-jpeg.execution_time
FAIL: shapes.execution_time

The failures have been introduced by r264782, but r264789 had to be reverted
as it depended on the earlier patch.

llvm-svn: 264885
2016-03-30 18:18:31 +00:00
Johannes Doerfert a144fb148b Exploit graph properties during domain generation
As a CFG is often structured we can simplify the steps performed
  during domain generation. When we push domain information we can
  utilize the information from a block A to build the domain of a
  block B, if A dominates B. When we pull domain information we can
  use information from a block A to build the domain of a block B
  if B post-dominates A. This patch implements both ideas and thereby
  simplifies domains that were not simplified by isl. For the FINAL
  basic block in
    test/ScopInfo/complex-successor-structure-3.ll .
  we used to build a universe set with 81 basic sets. Now it actually is
  represented as universe set.

  While the initial idea to utilize the graph structure depended on the
  dominator and post-dominator tree we can use the available region
  information as a coarse grained replacement. To this end we push the
  region entry domain to the region exit and pull it from the region
  entry for the region exit.

Differential Revision: http://reviews.llvm.org/D18450

llvm-svn: 264789
2016-03-29 21:31:05 +00:00
Michael Kruse 88a2256a34 Revert "[ScopInfo] Fix domains after loops."
This reverts commit r264118. The approach is still under discussion.

llvm-svn: 264705
2016-03-29 07:50:52 +00:00
Johannes Doerfert 6462d8c1d9 Generalize the domain complexity restrictions
This patch applies the restrictions on the number of domain conjuncts
  also to the domain parts of piecewise affine expressions we generate.
  To this end the wording is change slightly. It was needed to support
  complex additions featuring zext-instructions but it also fixes PR27045.

  lnt profitable runs reports only little changes that might be noise:
  Compile Time:
    Polybench/[...]/2mm                     +4.34%
    SingleSource/[...]/stepanov_container   -2.43%
  Execution Time:
    External/[...]/186_crafty               -2.32%
    External/[...]/188_ammp                 -1.89%
    External/[...]/473_astar                -1.87%

llvm-svn: 264514
2016-03-26 16:17:00 +00:00
Johannes Doerfert 733ea34f38 [FIX] Handle accesses to "null" in MemIntrinsics
This fixes PR27035. While we now exclude MemIntrinsics from the
  polyhedral model if they would access "null" we could exploit this
  even more, e.g., remove all parameter combinations that would lead to
  the execution of this statement from the context.

llvm-svn: 264284
2016-03-24 13:50:04 +00:00
Tobias Grosser 25e8ebe29d Drop explicit -polly-delinearize parameter
Delinearization is now enabled by default and does not need to explicitly need
to be enabled in our tests.

llvm-svn: 264154
2016-03-23 13:21:02 +00:00
Tobias Grosser 898a636210 Add option to disallow modref function calls in scops.
This might be useful to evaluate the benefit of us handling modref funciton
calls. Also, a new bug that was triggered by modref function calls was
recently reported http://llvm.org/PR27035. To ensure the same issue does not
cause troubles for other people, we temporarily disable this until the bug
is resolved.

llvm-svn: 264140
2016-03-23 06:40:15 +00:00
Michael Kruse 49a59ca093 [ScopInfo] Fix domains after loops.
ISL can conclude additional conditions on parameters from restrictions
on loop variables. Such conditions persist when leaving the loop and the
loop variable is projected out. This results in a narrower domain for
exiting the loop than entering it and is logically impossible for
non-infinite loops.

We fix this by not adding a lower bound i>=0 when constructing BB
domains, but defer it to when also the upper bound it computed, which
was done redundantly even before this patch.

This reduces the number of LNT fails with -polly-process-unprofitable
-polly-position=before-vectorizer from 8 to 6.

llvm-svn: 264118
2016-03-22 23:27:42 +00:00
Tobias Grosser 5a8c052baf Invalidate scop on encountering a complex control flow
We bail out if current scop has a complex control flow as this could lead to
building of large domain conditions. This is to reduce compile time.  This
addresses r26382.

Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D18362

llvm-svn: 264105
2016-03-22 22:05:32 +00:00
Tobias Grosser 0904c69110 ScopInfo: Do not generate dependences for i1 values used in affine branches
Affine branches are fully modeled and regenerated from the polyhedral domain and
consequently do not require any input conditions to be propagated.

llvm-svn: 263678
2016-03-16 23:33:54 +00:00
Tobias Grosser 2880f10aa4 tests: Fix some spelling mistakes
llvm-svn: 262649
2016-03-03 19:51:03 +00:00
Johannes Doerfert df88023d2b [FIX] Consolidation of loads with same pointer but different access relation
This should fix PR19422.

  Thanks to Jeremy Huddleston Sequoia for reporting this.
  Thanks to Roman Gareev for his investigation and the reduced test case.

llvm-svn: 262612
2016-03-03 12:26:58 +00:00
Johannes Doerfert 066dbf3f8e Track assumptions and restrictions separatly
In order to speed up compile time and to avoid random timeouts we now
  separately track assumptions and restrictions. In this context
  assumptions describe parameter valuations we need and restrictions
  describe parameter valuations we do not allow. During AST generation
  we create a runtime check for both, whereas the one for the
  restrictions is negated before a conjunction is build.

  Except the In-Bounds assumptions we currently only track restrictions.

Differential Revision: http://reviews.llvm.org/D17247

llvm-svn: 262328
2016-03-01 13:06:28 +00:00
Johannes Doerfert a792098047 Support calls with known ModRef function behaviour
Check the ModRefBehaviour of functions in order to decide whether or
  not a call instruction might be acceptable.

Differential Revision: http://reviews.llvm.org/D5227

llvm-svn: 261866
2016-02-25 14:08:48 +00:00
Johannes Doerfert 9dd42ee7c1 Try to build alias checks even when non-affine accesses are allowed
From now on we bail only if a non-trivial alias group contains a non-affine
  access, not when we discover aliasing and non-affine accesses are allowed.

llvm-svn: 261863
2016-02-25 14:06:11 +00:00
Roman Gareev 11001e1534 Annotation of SIMD loops
Use 'mark' nodes annotate a SIMD loop during ScheduleTransformation and skip
parallelism checks.

The buildbot shows the following compile/execution time changes:

  Compile time:
    Improvements    Δ     Previous  Current  σ
    …/gesummv      -6.06% 0.2640    0.2480   0.0055
    …/gemver       -4.46% 0.4480    0.4280   0.0044
    …/covariance   -4.31% 0.8360    0.8000   0.0065
    …/adi          -3.23% 0.9920    0.9600   0.0065
    …/doitgen      -2.53% 0.9480    0.9240   0.0090
    …/3mm          -2.33% 1.0320    1.0080   0.0087

  Execution time:
    Regressions     Δ     Previous  Current  σ
    …/viterbi       1.70% 5.1840    5.2720   0.0074
    …/smallpt       1.06% 12.4920   12.6240  0.0040

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: http://reviews.llvm.org/D14491

llvm-svn: 261620
2016-02-23 09:00:13 +00:00
Johannes Doerfert cea6193b79 Support memory intrinsics
This patch adds support for memcpy, memset and memmove intrinsics. They are
  represented as one (memset) or two (memcpy, memmove) memory accesses in the
  polyhedral model. These accesses have an access range that describes the
  summarized effect of the intrinsic, i.e.,
    memset(&A[i], '$', N);
  is represented as a write access from A[i] to A[i+N].

Differential Revision: http://reviews.llvm.org/D5226

llvm-svn: 261489
2016-02-21 19:13:19 +00:00
Johannes Doerfert 4d9bb8d594 Allow all combinations of types and subscripts for memory accesses
To support non-aligned accesses we introduce a virtual element size
  for arrays that divides each access function used for this array. The
  adjustment of the access function based on the element size of the
  array was therefore moved after this virtual element size was
  determined, thus after all accesses have been created.

Differential Revision: http://reviews.llvm.org/D17246

llvm-svn: 261226
2016-02-18 16:50:12 +00:00
Johannes Doerfert 13637678b1 [FIX] LICM test case
llvm-svn: 260955
2016-02-16 12:10:42 +00:00
Johannes Doerfert 965edde695 Separate more constant factors of parameters
So far we separated constant factors from multiplications, however,
  only when they are at the outermost level of a parameter SCEV. Now,
  we also separate constant factors from the parameter SCEV if the
  outermost expression is a SCEVAddRecExpr. With the changes to the
  SCEVAffinator we can now improve the extractConstantFactor(...)
  function at will without worrying about any other code part. Thus,
  if needed we can implement a more comprehensive
  extractConstantFactor(...) function that will traverse the SCEV
  instead of looking only at the outermost level.

  Four test cases were affected. One did not change much and the other
  three were simplified.

llvm-svn: 260859
2016-02-14 22:30:56 +00:00
Johannes Doerfert 96e5471139 Separate invariant equivalence classes by type
We now distinguish invariant loads to the same memory location if they
  have different types. This will cause us to pre-load an invariant
  location once for each type that is used to access it. However, we can
  thereby avoid invalid casting, especially if an array is accessed
  though different typed/sized invariant loads.

  This basically reverts the changes in r260023 but keeps the test
  cases.

llvm-svn: 260045
2016-02-07 17:30:13 +00:00
Johannes Doerfert e708790c59 [FIX] Two "off-by-one" error in constant range usage
llvm-svn: 260031
2016-02-07 13:59:03 +00:00
Tobias Grosser 8ebdc2dd53 Make memory accesses with different element types optional
We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.

llvm-svn: 260025
2016-02-07 08:48:57 +00:00
Tobias Grosser 107cd5f5f6 IslNodeBuilder: Invariant load hoisting of elements with differing sizes
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. These type mismatches are after
r259784 a lot more common and also arise with types of different size, which
have not been handled before.

Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather then a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.

llvm-svn: 260009
2016-02-06 21:23:39 +00:00
Michael Kruse 2e02d560aa Follow uses to create value MemoryAccesses
The previously implemented approach is to follow value definitions and
create write accesses ("push defs") while searching for uses. This
requires the same relatively validity- and requirement conditions to be
replicated at multiple locations (PHI instructions, other instructions,
uses by PHIs).

We replace this by iterating over the uses in a SCoP ("pull in
requirements"), and add writes only when at least one read has been
added. It turns out to be simpler code because each use is only iterated
over once and writes are added for the first access that reads it. We
need another iteration to identify escaping values (uses not in the
SCoP), which also makes the difference between such accesses more
obvious. As a side-effect, the order of scalar MemoryAccess can change.

Differential Revision: http://reviews.llvm.org/D15706

llvm-svn: 259987
2016-02-06 09:19:40 +00:00
Tobias Grosser d840fc7277 Support accesses with differently sized types to the same array
This allows code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.

We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.

We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.

Finally, we add this as a new item to the 2.9 release notes

Reviewers: jdoerfert, Meinersbur

Differential Revision: http://reviews.llvm.org/D16878

llvm-svn: 259784
2016-02-04 13:18:42 +00:00
Tobias Grosser e2c31210b2 Revert "Support loads with differently sized types from a single array"
This reverts commit (@259587). It needs some further discussions.

llvm-svn: 259629
2016-02-03 05:53:27 +00:00
Tobias Grosser 5d3fc1ea43 Support loads with differently sized types from a single array
We support now code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.

llvm-svn: 259587
2016-02-02 22:05:29 +00:00
Tobias Grosser c2fd8b411d ScopInfo: Correct schedule construction
For schedule generation we assumed that the reverse post order traversal used by
the domain generation is sufficient, however it is not. Once a loop is
discovered, we have to completely traverse it, before we can generate the
schedule for any block/region that is only reachable through a loop exiting
block.

To this end, we add a "loop stack" that will keep track of loops we
discovered during the traversal but have not yet traversed completely.
We will never visit a basic block (or region) outside the most recent
(thus smallest) loop in the loop stack but instead queue such blocks
(or regions) in a waiting list. If the waiting list is not empty and
(might) contain blocks from the most recent loop in the loop stack the
next block/region to visit is drawn from there, otherwise from the
reverse post order iterator.

We exploit the new property of loops being always completed before additional
loops are processed, by removing the LoopSchedules map and instead keep all
information in LoopStack. This clarifies that we indeed always only keep a
stack of in-process loops, but will never keep incomplete schedules for an
arbitrary set of loops. As a result, we can simplify some of the existing code.

This patch also adds some more documentation about how our schedule construction
works.

This fixes http://llvm.org/PR25879

This patch is an modified version of Johannes Doerfert's initial fix.

Differential Revision: http://reviews.llvm.org/D15679

llvm-svn: 259354
2016-02-01 11:54:13 +00:00
Michael Kruse fd46308de4 ScopInfo: Never add read accesses for synthesizable values
Before adding a MK_Value READ MemoryAccess, check whether the read is
necessary or synthesizable. Synthesizable values are later generated by
the SCEVExpander and therefore do not need to be transferred
explicitly. This can happen because the check for synthesizability has
presumbly been forgotten in the case where a phi's incoming value has
been defined in a different statement.

Differential Revision: http://reviews.llvm.org/D15687

llvm-svn: 258998
2016-01-27 22:51:56 +00:00
Michael Kruse ee6a4fc680 Unique phi write accesses
Ensure that there is at most one phi write access per PHINode and
ScopStmt. In particular, this would be possible for non-affine
subregions with multiple exiting blocks. We replace multiple MAY_WRITE
accesses by one MUST_WRITE access. The written value is constructed
using a PHINode of all exiting blocks. The interpretation of the PHI
WRITE's "accessed value" changed from the incoming value to the PHI like
for PHI READs since there is no unique incoming value.

Because region simplification shuffles around PHI nodes -- particularly
with exit node PHIs -- the PHINodes at analysis time does not always
exist anymore in the code generation pass. We instead remember the
incoming block/value pair in the MemoryAccess.

Differential Revision: http://reviews.llvm.org/D15681

llvm-svn: 258809
2016-01-26 13:33:27 +00:00
Michael Kruse 436db620e7 Unique value write accesses
Ensure there is at most one write access per definition of an
llvm::Value. Keep track of already created value write access by using
a (dense) map.

Replace addValueWriteAccess by ensureValueStore which can be uses more
liberally without worrying to add redundant accesses. It will be used,
e.g. in a logical correspondant for value reads -- ensureValueReload --
to ensure that the expected definition has been written when loading it.

Differential Revision: http://reviews.llvm.org/D15483

llvm-svn: 258807
2016-01-26 13:33:10 +00:00
Johannes Doerfert 6f50c29ab2 [FIX] Domain generation error due to loops in non-affine regions
llvm-svn: 258803
2016-01-26 11:03:25 +00:00
Johannes Doerfert 432658d7b8 [FIX] Build correct domain for non-affine region SCoPs
llvm-svn: 258802
2016-01-26 11:01:41 +00:00
Tobias Grosser b3a9538e95 Remove irreducible control flow from test case
The test case we look at does not necessarily require irreducible control flow,
but a normal loop is sufficient to create a non-affine region containing more
than one basic block that dominates the exit node. We replace this irreducible
control flow with a normal loop for the following reasons:

  1) This is easier to understand
  2) We will subsequently commit a patch that ensures Polly does not process
     irreducible control flow.

Within non-affine regions, we could possibly handle irreducible control flow.

llvm-svn: 258496
2016-01-22 09:33:33 +00:00
Michael Kruse 959a8dc39f Update to ISL 0.16.1
llvm-svn: 257898
2016-01-15 15:54:45 +00:00
Michael Kruse 5a9a65e43f Prepare unit tests for update to ISL 0.16
ISL 0.16 will change how sets are printed which breaks 117 unit tests
that text-compare printed sets. This patch re-formats most of these unit
tests using a script and small manual editing on top of that. When
actually updating ISL, most work is done by just re-running the script
to adapt to the changed output.

Some tests that compare IR and tests with single CHECK-lines that can be
easily updated manually are not included here.

The re-format script will also be committed afterwards. The per-test
formatter invocation command lines options will not be added in the near
future because it is ad hoc and would overwrite the manual edits.
Ideally it also shouldn't be required anymore because ISL's set printing
has become more stable in 0.16.

Differential Revision: http://reviews.llvm.org/D16095

llvm-svn: 257851
2016-01-15 00:48:42 +00:00
Roman Gareev 10595a1739 Call assumeNoOutOfBound only in updateDimensionality
Call assumeNoOutOfBound only in updateDimensionality to process situations
when new dimensions are added and new bounds checks are required.

Contributed-by: Tobias Grosser, Gareev Roman
llvm-svn: 257170
2016-01-08 14:01:59 +00:00
Johannes Doerfert 30e2307f61 [FIX] Schedule generation for block exiting multiple loops.
This fixes bug PR25604.

llvm-svn: 256125
2015-12-20 17:12:22 +00:00
Tobias Grosser 75dc40c3be ScopInfo: Bail out in case of complex branch structures
Scops that contain many complex branches are likely to result in complex domain
conditions that consist of a large (> 100) number of conjucts.  Transforming
such domains is expensive and unlikely to result in efficient code.  To avoid
long compile times we detect this case and skip such scops. In the future we may
improve this by either using non-affine subregions to hide such complex
condition structures or by exploiting in certain cases properties (e.g.,
dominance) that allow us to construct the domains of a scop in a way that
results in a smaller number improving conjuncts.

Example of a code that results in complex iteration spaces:

      loop.header
     /    |    \ \
   A0    A2    A4 \
     \  /  \  /    \
      A1    A3      \
     /  \  /  \     |
   B0    B2    B4   |
     \  /  \  /     |
      B1    B3      ^
     /  \  /  \     |
   C0    C2    C4   |
     \  /  \  /    /
      C1    C3    /
       \   /     /
    loop backedge

llvm-svn: 256123
2015-12-20 13:31:48 +00:00
Tobias Grosser f4f6870ff2 Revert "Always treat scalar writes as MUST_WRITEs"
This reverts commit r255471.

Johannes raised in the post-commit review of r255471 the concern that PHI
writes in non-affine regions with two exiting blocks are not really MUST_WRITE,
but we just know that at least one out of the set of all possible PHI writes
will be executed. Modeling all PHI nodes as MUST_WRITEs is probably save, but
adding the needed documentation for such a special case is probably not worth
the effort. Michael will be proposing a new patch that ensures only a single
PHI_WRITE is created for non-affine regions, which - besides other benefits -
should also allow us to use a single well-defined MUST_WRITE for such PHI
writes.

(This is not a full revert, but the condition and documentation have been
slightly extended)

llvm-svn: 255503
2015-12-14 15:05:37 +00:00
Michael Kruse e0d135c536 Add unit test for r255473
Check that memory accesses in non-affine regions that are always executed are
MUST_WRITE.

llvm-svn: 255500
2015-12-14 14:53:30 +00:00
Michael Kruse b06e3029d1 Always treat scalar writes as MUST_WRITEs
LLVM's IR guarantees that a value definition occurs before any use, and
also the value of a PHI must be one of the incoming values, "written"
in one of the incoming blocks. Hence, such writes are never conditional
in the context of a non-affine subregion.

llvm-svn: 255471
2015-12-13 22:10:32 +00:00
Tobias Grosser 2d3d4ec860 executeScopConditionally: Introduce special exiting block
When introducing separate control flow for the original and optimized code we
introduce now a special 'ExitingBlock':

      \   /
    EnteringBB
        |
    SplitBlock---------\
   _____|_____         |
  /  EntryBB  \    StartBlock
  |  (region) |        |
  \_ExitingBB_/   ExitingBlock
        |              |
    MergeBlock---------/
        |
      ExitBB
      /    \

This 'ExitingBlock' contains code such as the final_reloads for scalars, which
previously were just added to whichever statement/loop_exit/branch-merge block
had been generated last. Having an explicit basic block makes it easier to
find these constructs when looking at the CFG.

llvm-svn: 255107
2015-12-09 11:38:22 +00:00
Tobias Grosser 2f8e43d677 ScopInfo: Add support for delinearizing fortran arrays
gfortran (and fortran in general?) does not compute the address of an array
element directly from the array sizes (e.g., %s0, %s1), but takes first the
maximum of the sizes and 0 (e.g., max(0, %s0)) before multiplying the resulting
value with the per-dimension array subscript expressions. To successfully
delinearize index expressions as we see them in fortran, we first filter 'smax'
expressions out of the SCEV expression, use them to guess array size parameters
and only then continue with the existing delinearization.

llvm-svn: 253995
2015-11-24 17:06:38 +00:00
Tobias Grosser 9737c7b431 ScopInfo: Remove domains of error blocks (and blocks they dominate) early on
Trying to build up access functions for any of these blocks is likely to fail,
as error blocks may contain invalid/non-representable instructions, and blocks
dominated by error blocks may reference such instructions, which wil also cause
failures. As all of these blocks are anyhow assumed to not be executed, we can
just remove them early on.

This fixes http://llvm.org/PR25596

llvm-svn: 253818
2015-11-22 11:06:51 +00:00
Tobias Grosser 020fa09a3c Remove -polly-code-generator=isl from many test cases
This is the default since a long time. Setting it again does not add value
in any of these test cases.

llvm-svn: 253800
2015-11-21 23:05:48 +00:00
Johannes Doerfert dec27df588 [FIX] Get the correct loop that surrounds a region
llvm-svn: 253788
2015-11-21 16:56:13 +00:00
Johannes Doerfert 55b3d8b831 Consistenly use getTypeAllocSize for size estimation.
Only when we check for wrapping we want to use the store size, for all
  other cases we use the alloc size now.

Suggested by: Tobias Grosser <tobias@grosser.es>

llvm-svn: 252941
2015-11-12 20:15:08 +00:00
Johannes Doerfert 2af10e2eed Use parameter constraints provided via llvm.assume
If an llvm.assume dominates the SCoP entry block and the assumed condition
  can be expressed as an affine inequality we will now add it to the context.

Differential Revision: http://reviews.llvm.org/D14413

llvm-svn: 252851
2015-11-12 03:25:01 +00:00
Johannes Doerfert d84493e52e Emit remarks for taken assumptions
Differential Revision: http://reviews.llvm.org/D14412

llvm-svn: 252848
2015-11-12 02:33:38 +00:00
Johannes Doerfert 0cf4e0aa42 Emit remark about aliasing pointers
llvm-svn: 252847
2015-11-12 02:32:51 +00:00
Johannes Doerfert 48fe86f1ff Emit SCoP source location as remark during ScopInfo
This removes a similar feature from ScopDetection, though with
  -polly-report that feature present twice anyway.

llvm-svn: 252846
2015-11-12 02:32:32 +00:00
Tobias Grosser e19fca4525 ScopInfo: Bailing out means assigning isl_set_empty to the AssumedContext
I got this the other way around in 252750. Thank you Johannes for noticing.

llvm-svn: 252795
2015-11-11 20:21:39 +00:00
Tobias Grosser 910cf26811 ScopInfo: Do not try to model the memory accesses in an error block
Error blocks may contain arbitrary instructions, among them some which we can
not modeled correctly. As we do not generate ScopStmts for error blocks anyhow
there is no point in trying to generate access functions for them.

This fixes llvm.org/PR25494

llvm-svn: 252794
2015-11-11 20:15:49 +00:00
Tobias Grosser 4cd07b1188 ScopInfo: Bound compute time spent in boundary context construction
For complex inputs our current approach of construction the boundary context
may in rare cases become computationally so expensive that it is better to
abort. This change adds a compute out check that bounds the compuations we
spend on boundary context construction and bails out if this limit is reached.

We can probably make our boundary construction algorithm more efficient, but
this requires some more investigation and probably also some additional changes
to isl. Until these have been added, we bound the compile time to ensure our
buildbots are green.

llvm-svn: 252758
2015-11-11 17:34:02 +00:00
Tobias Grosser 20a4c0c205 ScopInfo: Limit the number of disjuncts in assumed context
In certain rare cases (mostly -polly-process-unprofitable on large sequences
of conditions - often without any loop), we see some compile-time timeouts due
to the construction of an overly complex assumption context. This change limits
the number of disjuncts to 150 (adjustable), to prevent us from creating
assumptions contexts that are too large for even the compilation to finish.

The limit has been choosen as large as possible to make sure we do not
unnecessarily drop test coverage. If such cases also appear in
-polly-process-unprofitable=false mode we may need to think about this again,
as the current limitations may still allow assumptions that are way to complex
to be checked profitably at run-time.

There is also certainly room for improvement regarding how (and how efficient)
we construct an assumed context, but this requires some more thinking.

This completes llvm.org/PR25458

llvm-svn: 252750
2015-11-11 16:22:36 +00:00
Tobias Grosser 56e3fefbdc test: Shorten test case to reduce 'make polly-check' time
Thinking more about the last commit I came to realize that for testing the
new functionality it is sufficient to verify that the iteration domains
we construct for a simple test case do not contain any of the complexity that
caused compile time issues for larger inputs.

llvm-svn: 252714
2015-11-11 09:19:15 +00:00
Tobias Grosser b76cd3cc56 ScopInfo: Pass domain constraints through error blocks
Previously, we just skipped error blocks during scop construction. With
this change we make sure we can construct domains for error blocks such that
these domains can be forwarded to subsequent basic blocks.

This change ensures that basic blocks that post-dominate and are dominated by
a basic block that branches to an error condition have the very same iteration
domain as the branching basic block. Before, this change we would construct
a domain that excludes all error conditions. Such domains could become _very_
complex and were undesirable to build.

Another solution would have been to drop these constraints using a
dominance/post-dominance check instead of modeling the error blocks. Such
a solution could also work in case of unreachable statements or infinite
loops in the scop. However, as we currently (to my believe incorrectly) model
unreachable basic blocks in the post-dominance tree, such a solution is not
yet feasible and requires first a change to LLVM's post-dominance tree
construction.

This commit addresses the most sever compile time issue reported in:
http://llvm.org/PR25458

llvm-svn: 252713
2015-11-11 08:42:20 +00:00
Johannes Doerfert dcfedf3505 [FIX] Cast pre-loaded values correctly or reload them with adjusted type.
Especially for structs, the SAI object of a base pointer does not
describe all the types that the user might expect when he loads from
that base pointer. While we will still cast integers and pointers we
will now reload the value with the correct type if floating point and
non-floating point values are involved. However, there are now TODOs
where we use bitcasts instead of a proper conversion or reloading.

This fixes bug 25479.

llvm-svn: 252706
2015-11-11 06:20:25 +00:00
Johannes Doerfert fc4bfc465a [FIX] Create empty invariant equivalence classes
We now create all invariant equivalence classes for required invariant loads
  instead of creating them on-demand. This way we can check if a parameter
  references an invariant load that is actually not executed and was therefor
  not materialized. If that happens the parameter is not materialized either.

This fixes bug 25469.

llvm-svn: 252701
2015-11-11 04:30:07 +00:00
Tobias Grosser 8b05278b4e tests: Add test that has a single pointer both as scalar read and array base
In case we also model scalar reads it can happen that a pointer appears in both
a scalar read access as well as the base pointer of an array access. As this
is a little surprising, we add a specific test case to document this behaviour.
To my understanding it should be OK to have a read from an array A[] and
read/write accesses to A[...]. isl is treating these arrays as unrelated as
their dimensionality differs. This seems to be correct as A[] remains constant
throughout the execution of the scop and is not affected by the reads/writes to
A[...]. If this causes confusion, it might make sense to make this behaviour
more obvious by using different names (e.g., A_scalar[], A[...]).

llvm-svn: 252615
2015-11-10 16:23:30 +00:00
Tobias Grosser 98e566e213 Simplify test case
Commit r252422 introduced an unnecessary complicated test case. Reduce it to
the part that actually triggered the original issue.

llvm-svn: 252611
2015-11-10 15:42:44 +00:00
Tobias Grosser 4ea2e07a60 ScopInfo: Make printing of ScopArrayInfo more similar to declarations in C
Memory references are now printed as follows:

           Old                          New
Scalars:   i64 MemRef_val[*]            i64 MemRef_val;
Arrays:    i64 MemRef_A[*][%m][%o][8]   i64 MemRef_A[*][%m][%o];

We do not print any more information about the element size in the type. Such
information has already been available in a comment after the scalar/array
declaration. It was redundant and did not match well with what people were used
from C.

llvm-svn: 252602
2015-11-10 14:02:54 +00:00
Johannes Doerfert f85ad0411f [FIX] Carefully simplify assumptions in the presence of error blocks
If a SCoP contains error blocks we cannot use the domain constraints
  to simplify the assumptions as the domain is already influenced by the
  assumptions we took. Before this patch we did that and some assumptions
  became self-fulfilling as they were implied by the domain constraints.

llvm-svn: 252424
2015-11-08 20:16:39 +00:00
Johannes Doerfert a768624f14 [FIX] Introduce different SAI objects for scalar and memory accesses
Even if a scalar and memory access have the same base pointer, we cannot use
  one SAI object as the type but also the number of dimensions are wrong. For
  the attached test case this caused a crash in the invariant load hoisting,
  though it could cause various other problems too.

This fixes bug 25428 and a execution time bug in MallocBench/cfrac.

Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 252422
2015-11-08 19:12:05 +00:00
Johannes Doerfert 44483c5599 [FIX] Remove all invariant load occurences from own execution context
llvm-svn: 252411
2015-11-07 19:45:27 +00:00
Michael Kruse f714d470d7 Fix escaping value to subregion entry node phi
An incoming value from a block the is not inside the scop is an
external use, even if the phi is inside the scop. A previous fix in
r251208 did not apply if the phi is inside a non-affine subregion. We
move the check for this phi case before the non-affine subregion check.

llvm-svn: 252157
2015-11-05 13:18:43 +00:00
Johannes Doerfert eca9e890b9 Remove read-only statements from the SCoP
We do not need to model read-only statements in the SCoP as they will
  not cause any side effects that are visible to the outside anyway.
  Removing them should safe us time and might even simplify the ASTs we
  generate.

Differential Revision: http://reviews.llvm.org/D14272

llvm-svn: 251948
2015-11-03 16:54:49 +00:00
Johannes Doerfert d6fc0701ee [FIX] Carefully rewrite parameters wrt. invariant equivalence classes
ScalarEvolution doesn't allow the operands of an AddRec to be variant in the
  loop of the AddRec. When we rewrite parameter SCEVs it might seem like the
  new SCEV violates this property and ScalarEvolution will trigger an
  assertion. To avoid this we move the start part out of an AddRec when we
  rewrite it, thus avoid the operands to be possibly variant completely.

llvm-svn: 251945
2015-11-03 16:47:58 +00:00
Johannes Doerfert 3181c2ef72 [FIX] Correctly update SAI base pointer
If a base pointer load is preloaded, we have change the base pointer of
  the derived SAI. However, as the derived SAI relationship is is
  coarse grained, we need to check if we actually preloaded the base
  pointer or a different element of the base pointer SAI array.

llvm-svn: 251881
2015-11-03 01:42:59 +00:00
Johannes Doerfert dca2837b76 [FIX] Do not crash in the presence of infinite loops.
llvm-svn: 251870
2015-11-03 00:28:07 +00:00
Tobias Grosser 8286b83f97 ScopInfo: Bail out in case of mismatching array dimension sizes
In some cases different memory accesses access the very same array using a
different multi-dimensional array layout where the same dimensions have
different sizes. Instead of asserting when encountering this issue, we
gracefully bail out for this scop.

This fixes llvm.org/PR25252

llvm-svn: 251791
2015-11-02 11:29:32 +00:00
Tobias Grosser 4dc909cbed tests: Add test cases for LLVM commit r251267
This fixes llvm.org/PR25242

llvm-svn: 251268
2015-10-25 22:56:42 +00:00
Tobias Grosser baffa091dd ScopInfo: PHI-node uses in the EntryNode with an incoming BB that is not part
of the Region are external.

During code generation we split off the parts of the PHI nodes in the entry
block, which have incoming blocks that are not part of the region. As these
split-off PHI nodes then are external uses, we consequently also need to model
these uses in ScopInfo.

llvm-svn: 251208
2015-10-24 20:55:27 +00:00
Johannes Doerfert 654c3284f4 [FIX] Do not hoist nested variant base pointers
This fixes bug 25249.

llvm-svn: 250958
2015-10-21 22:14:57 +00:00
Johannes Doerfert 9c28bfa72c [FIX] Only constant integer branch conditions are always affine
There are several different kinds of constants that could occur in a
  branch condition, however we can only handle the most interesting one
  namely constant integers. To this end we have to treat others as
  non-affine.

  This fixes bug 25244.

llvm-svn: 250669
2015-10-18 22:56:42 +00:00
Johannes Doerfert 30c2265f98 [FIX] Normalize loops outside the SCoP during schedule generation
We build the schedule based on a traversal of the region and accumulate
  information for each loop in it. The total schedule is associated with the
  loop surrounding the SCoP, though it can happen that there are blocks in the
  SCoP which are part of loops that are only partially in the SCoP. Instead of
  associating information with them (they are not part of the SCoP and
  consequently are not modeled) we have to associate the schedule information
  with the surrounding loop if any.

  This fixes bug 25240.

llvm-svn: 250668
2015-10-18 21:17:11 +00:00
Johannes Doerfert af3e301a67 [FIX] Restructure invariant load equivalence classes
Sorting is replaced by a demand driven code generation that will pre-load a
  value when it is needed or, if it was not needed before, at some point
  determined by the order of invariant accesses in the program. Only in very
  little cases this demand driven pre-loading will kick in, though it will
  prevent us from generating faulty code. An example where it is needed is
  shown in:
    test/ScopInfo/invariant_loads_complicated_dependences.ll

  Invariant loads that appear in parameters but are not on the top-level (e.g.,
  the parameter is not a SCEVUnknown) will now be treated correctly.

Differential Revision: http://reviews.llvm.org/D13831

llvm-svn: 250655
2015-10-18 12:39:19 +00:00
Johannes Doerfert d8b6ad255f [FIX] Cast preloaded values
Preloaded values have to match the type of their counterpart in the
  original code and not the type of the base array.

llvm-svn: 250654
2015-10-18 12:36:42 +00:00
Johannes Doerfert 01978cfa0c Remove independent blocks pass
Polly can now be used as a analysis only tool as long as the code
  generation is disabled. However, we do not have an alternative to the
  independent blocks pass in place yet, though in the relevant cases
  this does not seem to impact the performance much. Nevertheless, a
  virtual alternative that allows the same transformations without
  changing the input region will follow shortly.

llvm-svn: 250652
2015-10-18 12:28:00 +00:00
Michael Kruse 01cb379fed Avoid unnecessay .s2a write access when used only in PHIs
Accesses for exit node phis will be handled separately by 
buildPHIAccesses if there is more than one exiting edge, 
buildScalarDependences does not need to create additional SCALAR 
accesses.

This is a corrected version of r250517, which was reverted in r250607.

Differential Revision: http://reviews.llvm.org/D13848

llvm-svn: 250622
2015-10-17 21:07:08 +00:00
Tobias Grosser 3839b422e6 Revert "Avoid unnecessay .s2a write access when used only in PHIs"
This reverts commit r250606 due to some bugs it introduced. After these bugs
have been resolved, we will add it back to tree.

llvm-svn: 250607
2015-10-17 08:54:05 +00:00
Michael Kruse e71893d580 Add testcase for r250517
llvm-svn: 250518
2015-10-16 15:17:26 +00:00
Michael Kruse aeceab770e Avoid unnecessay .s2a write access when used only in PHIs
PHI accesses will be handled separately by buildPHIAccesses,
buildScalarDependences does not need to create additional accesses.

llvm-svn: 250517
2015-10-16 15:14:40 +00:00
Tobias Grosser b860289dbd Add ScopInfo test case for r250411
llvm-svn: 250439
2015-10-15 18:26:06 +00:00
Tobias Grosser e2c8275346 ScopInfo: Allow simple 'AddRec * Parameter' products in delinearization
We also allow such products for cases where 'Parameter' is loaded within the
scop, but where we can dynamically verify that the value of 'Parameter' remains
unchanged during the execution of the scop.

This change relies on Polly's new RequiredILS tracking infrastructure recently
contributed by Johannes.

llvm-svn: 250019
2015-10-12 08:02:30 +00:00
Johannes Doerfert 9b1f9c8b61 Allow eager evaluated binary && and || conditions
The domain generation can handle lazy && and || by default but eager
  evaluated expressions were dismissed as non-affine. With this patch we
  will allow arbitrary combinations of and/or bit-operations in the
  conditions of branches.

Differential Revision: http://reviews.llvm.org/D13624

llvm-svn: 249971
2015-10-11 13:21:03 +00:00
Johannes Doerfert 697fdf891c Consolidate invariant loads
If a (assumed) invariant location is loaded multiple times we
  generated a parameter for each location. However, this caused compile
  time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and
  BT in the NAS benchmarks). Additionally, the code we generate is
  suboptimal as we preload the same location multiple times and perform
  the same checks on all the parameters that refere to the same value.

  With this patch we consolidate the invariant loads in three steps:
    1) During SCoP initialization required invariant loads are put in
       equivalence classes based on their pointer operand. One
       representing load is used to generate a parameter for the whole
       class, thus we never generate multiple parameters for the same
       location.
    2) During the SCoP simplification we remove invariant memory
       accesses that are in the same equivalence class. While doing so
       we build the union of all execution domains as it is only
       important that the location is at least accessed once.
    3) During code generation we only preload one element of each
       equivalence class with the unified execution domain. All others
       are mapped to that preloaded value.

Differential Revision: http://reviews.llvm.org/D13338

llvm-svn: 249853
2015-10-09 17:12:26 +00:00
Johannes Doerfert c7ab83dfb7 Remove unused flag polly-allow-non-scev-backedge-taken-count
Drop an unused flag polly-allow-non-scev-backedge-taken-count and also
  its occurrences from the tests.

Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D13400

llvm-svn: 249675
2015-10-08 10:05:48 +00:00
Johannes Doerfert 08d90a3cee Treat conditionally executed non-pure calls as errors
This replaces the support for user defined error functions by a
  heuristic that tries to determine if a call to a non-pure function
  should be considered "an error". If so the block is assumed not to be
  executed at runtime. While treating all non-pure function calls as
  errors will allow a lot more regions to be analyzed, it will also
  cause us to dismiss a lot again due to an infeasible runtime context.
  This patch tries to limit that effect. A non-pure function call is
  considered an error if it is executed only in conditionally with
  regards to a cheap but simple heuristic.

llvm-svn: 249611
2015-10-07 20:32:43 +00:00
Johannes Doerfert 09e3697f44 Allow invariant loads in the SCoP description
This patch allows invariant loads to be used in the SCoP description,
  e.g., as loop bounds, conditions or in memory access functions.

  First we collect "required invariant loads" during SCoP detection that
  would otherwise make an expression we care about non-affine. To this
  end a new level of abstraction was introduced before
  SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
  ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
  if we want a load inside the region to be optimistically assumed
  invariant or not. If we do, it will be marked as required and in the
  SCoP generation we bail if it is actually not invariant. If we don't
  it will be a non-affine expression as before. At the moment we
  optimistically assume all "hoistable" (namely non-loop-carried) loads
  to be invariant. This causes us to expand some SCoPs and dismiss them
  later but it also allows us to detect a lot we would dismiss directly
  if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
  allow potential aliases between optimistically assumed invariant loads
  and other pointers as our runtime alias checks are sound in case the
  loads are actually invariant. Together with the invariant checks this
  combination allows to handle a lot more than LICM can.

  The code generation of the invariant loads had to be extended as we
  can now have dependences between parameters and invariant (hoisted)
  loads as well as the other way around, e.g.,
    test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
  First, it is important to note that we cannot have real cycles but
  only dependences from a hoisted load to a parameter and from another
  parameter to that hoisted load (and so on). To handle such cases we
  materialize llvm::Values for parameters that are referred by a hoisted
  load on demand and then materialize the remaining parameters. Second,
  there are new kinds of dependences between hoisted loads caused by the
  constraints on their execution. If a hoisted load is conditionally
  executed it might depend on the value of another hoisted load. To deal
  with such situations we sort them already in the ScopInfo such that
  they can be generated in the order they are listed in the
  Scop::InvariantAccesses list (see compareInvariantAccesses). The
  dependences between hoisted loads caused by indirect accesses are
  handled the same way as before.

llvm-svn: 249607
2015-10-07 20:17:36 +00:00
Tobias Grosser 369c1d663b test: Add example of scalar that is reused accross loop
llvm-svn: 249527
2015-10-07 09:00:29 +00:00
Tobias Grosser 575aca8d43 Introduce -polly-process-unprofitable
This single option replaces -polly-detect-unprofitable and -polly-no-early-exit
and is supposed to be the only option that disables compile-time heuristics that
aim to bail out early on scops that are believed to not benefit from Polly
optimizations.

Suggested-by:  Johannes Doerfert
llvm-svn: 249426
2015-10-06 16:10:29 +00:00
Tobias Grosser f4ee371e60 tests: Drop -polly-detect-unprofitable and -polly-no-early-exit
These flags are now always passed to all tests and need to be disabled if
not needed. Disabling these flags, rather than passing them to almost all
tests, significantly simplfies our RUN: lines.

llvm-svn: 249422
2015-10-06 15:36:44 +00:00
Tobias Grosser 935f62cf0d tests: Explicitly state if profitability tests should be used
Polly's profitability heuristic saves compile time by skipping trivial scops or
scops were we know no good optimization can be applied. For almost all our tests
this heuristic makes little sense as we aim for minimal test cases when testing
functionality. Hence, in almost all cases this heuristic is better be disabled.
In preparation of disabling Polly's compile time heuristic by default in the
test suite we first explicitly enable it in the couple of test cases that really
use it (or run with/without heuristic side-by-side).

llvm-svn: 249418
2015-10-06 15:19:35 +00:00
Tobias Grosser 1ac26d06fe test: Disable profitability heuristics to unfail LICM test case
This test case was XFAILed under the assumption Polly is unable to detect the
scop. However, disabling Polly's profitability heuristics is sufficient to
detect this scop.

llvm-svn: 249414
2015-10-06 15:10:19 +00:00
Johannes Doerfert f17a78ef63 Remove non-executed statements during SCoP simplifcation
A statement with an empty domain complicates the invariant load
  hoisting and does not help any subsequent analysis or transformation.
  In fact it might introduce parameter dimensions or increase the
  schedule dimensionality. To this end, we remove statements with an
  empty domain early in the SCoP simplification.

llvm-svn: 249276
2015-10-04 15:00:05 +00:00
Johannes Doerfert 634909c2c9 [FIX] Domain generation for non-affine loops
llvm-svn: 249275
2015-10-04 14:57:41 +00:00
Johannes Doerfert f61df69423 [FIX] Count affine loops correctly
The "unprofitable" heuristic was broken and counted boxed loops
  even though we do not represent and optimize them.

llvm-svn: 249274
2015-10-04 14:56:08 +00:00
Johannes Doerfert 3e7d171866 [FIX] Repair broken commit
The last invariant load fix was based on a later patch not
  polly/master, thus needs to be adjusted.

llvm-svn: 249145
2015-10-02 15:35:03 +00:00
Johannes Doerfert 8930f4846c [FIX] Do not hoist from inside a non-affine subregion
We have to skip accesses in non-affine subregions during hoisting as
  they might not be executed under the same condition as the entry of
  the non-affine subregion.

llvm-svn: 249139
2015-10-02 14:51:00 +00:00
Michael Kruse cac948ef46 Earlier creation of ScopStmt objects
This moves the construction of ScopStmt to the beginning of the 
ScopInfo pass. The late creation was a result of the earlier separation 
of ScopInfo and TempScopInfo. This will avoid introducing more 
ScopStmt-like maps in future commits. The AccFuncMap will also be 
removed in some future commit. DomainMap might also be included into 
ScopStmt.

The order in which ScopStmt are created changes and initially creates 
empty statements that are removed in a simplification.

Differential Revision: http://reviews.llvm.org/D13341

llvm-svn: 249132
2015-10-02 13:53:07 +00:00
Johannes Doerfert f56738041e Make the SCoP generation resistent wrt. error blocks
When error blocks are not terminated by an unreachable they have successors
  that might only be reachable via error blocks. Additionally, branches in
  error blocks are not checked during SCoP detection, thus we might not be able
  to handle them. With this patch we do not try to model error block exit
  conditions. Anything that is only reachable via error blocks is ignored too,
  as it will not be executed in the optimized version of the SCoP anyway.

llvm-svn: 249099
2015-10-01 23:48:18 +00:00
Johannes Doerfert f80f3b0449 Allow user defined error functions
The user can provide function names with
    -polly-error-functions=name1,name2,name3
  that will be treated as error functions. Any call to them is assumed
  not to be executed.

  This feature is mainly for developers to play around with the new
  "error block" feature.

llvm-svn: 249098
2015-10-01 23:45:51 +00:00
Johannes Doerfert c1db67e218 Identify and hoist definitively invariant loads
As a first step in the direction of assumed invariant loads (loads
  that are not written in some context) we now detect and hoist
  definitively invariant loads. These invariant loads will be preloaded
  in the code generation and used in the optimized version of the SCoP.
  If the load is only conditionally executed the preloaded version will
  also only be executed under the same condition, hence we will never
  access memory that wouldn't have been accessed otherwise. This is also
  the most distinguishing feature to licm.

  As hoisting can make statements empty we will simplify the SCoP and
  remove empty statements that would otherwise cause artifacts in the
  code generation.

Differential Revision: http://reviews.llvm.org/D13194

llvm-svn: 248861
2015-09-29 23:47:21 +00:00
Johannes Doerfert 9a132f36c3 Allow switch instructions in SCoPs
This patch allows switch instructions with affine conditions in the
  SCoP. Also switch instructions in non-affine subregions are allowed.
  Both did not require much changes to the code, though there was some
  refactoring needed to integrate them without code duplication.

  In the llvm-test suite the number of profitable SCoPs increased from
  135 to 139 but more importantly we can handle more benchmarks and user
  inputs without preprocessing.

Differential Revision: http://reviews.llvm.org/D13200

llvm-svn: 248701
2015-09-28 09:33:22 +00:00
Tobias Grosser 06c495c2b0 Add test case from llvm.org/PR17187
The new domain construction algorithm now correctly models this test case (and
derives an empty run-time condition). Add this test case to ensure we do not
regress.

llvm-svn: 248669
2015-09-26 14:27:54 +00:00
Johannes Doerfert 12155a9ef4 Add test case from open bug
The bug (15771) was fixed already with the new domain generation
  but the test case was not added till now.

llvm-svn: 248668
2015-09-26 14:03:29 +00:00
Johannes Doerfert c6987c18de [FIX] Use the surrounding loop for non-affine SCoP regions
When the whole SCoP is a non-affine region we need to use the
  surrounding loop in the construction of the schedule as that is
  the one that will be looked up after the schedule generation.

  This fixes bug 24947

llvm-svn: 248667
2015-09-26 13:41:43 +00:00
Tobias Grosser bbda083c75 Add test case for delinearization through bitcasts
This was forgotten in r247928

llvm-svn: 248663
2015-09-26 08:55:59 +00:00
Tobias Grosser 99c70dd8d1 Ensure memory accesses to the same array have identical dimensionality
When recovering multi-dimensional memory accesses, it may happen that different
accesses to the same base array are recovered with different dimensionality.
This patch ensures that the dimensionalities are unified by adding zero valued
dimensions to acesses with lower dimensionality. When starting to model
fixed-size arrays as multi-dimensional in 247906, this has not been taken
care of.

llvm-svn: 248662
2015-09-26 08:55:54 +00:00
Tobias Grosser 8016f3a4f5 Add missing PHI to test case
llvm-svn: 248563
2015-09-25 05:41:30 +00:00
Tobias Grosser da95a4a7c7 Handle read-only scalars used in PHI-nodes correctly
This change addresses three issues:

  - Read only scalars that enter a PHI node through an edge that comes from
    outside the scop are not modeled any more, as such PHI nodes will always
    be initialized to this initial value right before the SCoP is entered.
  - For PHI nodes that depend on a scalar value that is defined outside the
    scop, but where the scalar values is passed through an edge that itself
    comes from a BB that is part of the region, we introduce in this basic
    block a read of the out-of-scop value to ensure it's value is available
    to write it into the PHI alloc location.
  - Read only uses of scalars by PHI nodes are ignored in the general read only
    handling code, as they are taken care of by the general PHI node modeling
    code.

llvm-svn: 248535
2015-09-24 20:59:59 +00:00
Michael Kruse 2d0ece960f Remove Analysis Output of TempScopInfo
After the merge of TempScopInfo into ScopInfo the analysis output 
remained because of the existing unit tests. These remains are removed 
and the units tests converted to match the equivalent output of 
ScopInfo's analysis output. The unit tests are also moved into the
directory of ScopInfo tests.

Differential Revision: http://reviews.llvm.org/D13116

llvm-svn: 248485
2015-09-24 11:41:21 +00:00
Tobias Grosser b1c39429d9 Do not model delinearized and linearized access relation for a single access
A missing return statement that previously did not have a visibly negative
effect caused after some data-structure changes in r248024 multi-dimensional
accesses to be modeled both multi-dimensional as well as linearized. This
commit adds the missing return to avoid the incorrect double modeling as
well as the compile time increases it caused.

llvm-svn: 248171
2015-09-21 16:19:25 +00:00
Johannes Doerfert 6a72a2af13 Use <nsw> AddRecs in the affinator to avoid bounded assumptions
If we encounter a <nsw> tagged AddRec for a loop we know the trip count of
  that loop has to be bounded or the semantics is undefined anyway. Hence, we
  only need to add unbounded assumptions if no such AddRec is known.

llvm-svn: 248128
2015-09-20 16:59:23 +00:00
Johannes Doerfert 707a406078 Add bounded loop assumption
So far we ignored the unbounded parts of the iteration domain, however
  we need to assume they do not occure at all to remain sound if they do.

llvm-svn: 248126
2015-09-20 16:38:19 +00:00
Johannes Doerfert f2cc86edae Simplify domain generation
We now add loop carried information during the second traversal of the
  region instead of in a intermediate step in-between. This makes the
  generation simpler, removes code and should even be faster.

llvm-svn: 248125
2015-09-20 16:15:32 +00:00
Johannes Doerfert 06c57b594c Allow loops with multiple back edges
In order to allow multiple back edges we:
    - compute the conditions under which each back edge is taken
    - build the union over all these conditions, thus the condition that
      any back edge is taken
    - apply the same logic to the union we applied to a single back edge

llvm-svn: 248120
2015-09-20 15:00:20 +00:00
Michael Kruse e2bccbbfb2 Merge IRAccess into MemoryAccess
All MemoryAccess objects will be owned by ScopInfo::AccFuncMap which 
previously stored the IRAccess objects. Instead of creating new 
MemoryAccess objects, the already created ones are reused, but their 
order might be different now. Some fields of IRAccess and MemoryAccess 
had the same meaning and are merged.

This is the last step of fusioning TempScopInfo.{h|cpp} and 
ScopInfo.{h.cpp}. Some refactoring might still make sense.

Differential Revision: http://reviews.llvm.org/D12843

llvm-svn: 248024
2015-09-18 19:59:43 +00:00
Tobias Grosser 5fd8c0961e Model fixed-size multi-dimensional arrays if possible multi-dimensional
If the GEP instructions give us enough insights, model scalar accesses as
multi-dimensional (and generate the relevant run-time checks to ensure
correctness). This will allow us to simplify the dependence computation in
a subsequent commit.

llvm-svn: 247906
2015-09-17 17:28:15 +00:00
Johannes Doerfert 883f8c1d2f Use modulo semantic to generate non-integer-overflow assumptions
This will allow to generate non-wrap assumptions for integer expressions
  that are part of the SCoP. We compare the common isl representation of
  the expression with one computed with modulo semantic. For all parameter
  combinations they are not equal we can have integer overflows.

  The nsw flags are respected when the modulo representation is computed,
  nuw and nw flags are ignored for now.

  In order to not increase compile time to much, the non-wrap assumptions
  are collected in a separate boundary context instead of the assumed
  context. This helps compile time as the boundary context can become
  complex and it is therefor not advised to use it in other operations
  except runtime check generation. However, the assumed context is e.g.,
  used to tighten dependences. While the boundary context might help to
  tighten the assumed context it is doubtful that it will help in practice
  (it does not effect lnt much) as the boundary (or no-wrap assumptions)
  only restrict the very end of the possible value range of parameters.

  PET uses a different approach to compute the no-wrap context, though lnt runs
  have shown that this version performs slightly better for us.

llvm-svn: 247732
2015-09-15 22:52:53 +00:00
Johannes Doerfert 36255eecd8 Revert r247278 "Disable support for modulo expressions"
This reverts commit 00c5b6ca8832439193036aadaaaee92a43236219.

  We can handle modulo expressions in the domain again.

llvm-svn: 247542
2015-09-14 11:14:23 +00:00
Johannes Doerfert ca1e38fa43 Propagate exit conditions as described in the PET paper
At some point we build loop trip counts using this method. It was replaced by
  a simpler trick that works only for affine (e.g., not modulo) constraints and
  relies on the removal of unbounded parts. In order to allow modulo constrains
  again we go back to the former, more accurate method.

llvm-svn: 247540
2015-09-14 11:12:52 +00:00