Commit Graph

297 Commits

Author SHA1 Message Date
Tobias Grosser ad25f0fee6 Update debug metadata after LLVM commits r266445+r266446
llvm-svn: 266473
2016-04-15 20:51:27 +00:00
Michael Kruse 5c7d0cb834 Add contexts to test cases. NFC.
As discussed in the Polly weekly phone call and reviews.llvm.org/D18878,
the assumed contexts changed (widen) due to D18878/r265942. Also check
these contexts in the tests affected by that change.

llvm-svn: 266323
2016-04-14 15:22:13 +00:00
Johannes Doerfert fb72187fdd [FIX] Check the invalid context agains the context to rule out SCoPs
llvm-svn: 266096
2016-04-12 17:54:29 +00:00
Johannes Doerfert 615e0b85f8 Record wrapping assumptions early
Utilizing the record option for assumptions we can simplify the wrapping
  assumption generation a lot. Additionally, we can now report locations
  together with wrapping assumptions, though they might not be accurate yet.

llvm-svn: 266069
2016-04-12 13:28:39 +00:00
Johannes Doerfert 3bf6e4129f Record assumptions first and add them later
There are three reasons why we want to record assumptions first before we
add them to the assumed/invalid context:

  1) If the SCoP is not profitable or otherwise invalid without the
     assumed/invalid context we do not have to compute it.
  2) Information about the context are gathered rather late in the SCoP
     construction (basically after we know all parameters), thus the user
     might see overly complicated assumptions to be taken while they would
     have been simplified later on.
  3) Currently we cannot take assumptions at any point but have to wait,
     e.g., for the domain generation to finish. This makes wrapping
     assumptions much more complicated as they need to be and it will
     have a similar effect on "signed-unsigned" assumptions later.

llvm-svn: 266068
2016-04-12 13:27:35 +00:00
Johannes Doerfert 7c01357cef Introduce an invalid context for each statement
Collect the error domain contexts (formerly in the ErrorDomainCtxMap)
  for each statement in the new InvalidContext member variable. While
  this commit is basically a [NFC] it is a first step to make hoisting
  sound by allowing a more fine grained record of invalid contexts,
  e.g., here on statement level.

llvm-svn: 266053
2016-04-12 09:57:34 +00:00
Michael Kruse 3b425ff232 Allow overflow of indices with constant dim-sizes.
Allow overflow of indices into the next higher dimension if it has
constant size. E.g.

    float A[32][2];
    ((float*)A)[5];

is effectively the same as

    A[2][1];

This can happen since r265379 as a side effect if ScopDetection
recognizes an access as affine, but ScopInfo rejects the GetElementPtr.

Differential Revision: http://reviews.llvm.org/D18878

llvm-svn: 265942
2016-04-11 14:34:08 +00:00
Johannes Doerfert 561d36b320 Allow pointer expressions in SCEVs again.
In r247147 we disabled pointer expressions because the IslExprBuilder did not
  fully support them. This patch reintroduces them by simply treating them as
  integers. The only special handling for pointers that is left detects the
  comparison of two address_of operands and uses an unsigned compare.

llvm-svn: 265894
2016-04-10 09:50:10 +00:00
Johannes Doerfert 41725a1e7a [FIX] Do not crash on opaque (unsized) types.
llvm-svn: 265834
2016-04-08 19:20:03 +00:00
Michael Kruse b3de24f5e6 Add testcase from PR27218. NFC.
The the bug has already been fixed r265795, but this second testcase
still useful.

llvm-svn: 265809
2016-04-08 16:54:53 +00:00
Michael Kruse 436c90619c [ScopInfo] Fix check for element size mismatch.
The way to get the elements size with getPrimitiveSizeInBits() is not
the same as used in other parts of Polly which should use
DataLayout::getTypeAllocSize(). Its use only queries the size of the
pointer and getPrimitiveSizeInBits returns 0 for types that require a
DataLayout object such as pointers.

Together with r265379, this should fix PR27195.

llvm-svn: 265795
2016-04-08 16:20:08 +00:00
Johannes Doerfert 3ef78d6d38 [FIX] Adjust execution context of hoisted loads wrt. error domains
If we build the domains for error blocks and later remove them we lose
  the information that they are not executed. Thus, in the SCoP it looks
  like the control will always reach the statement S:

            for (i = 0 ... N)
                if (*valid == 0)
                  doSth(&ptr);
          S:    A[i] = *ptr;

  Consequently, we would have assumed "ptr" to be always accessed and
  preloaded it unconditionally. However, only if "*valid != 0" we would
  execute the optimized version of the SCoP. Nevertheless, we would have
  hoisted and accessed "ptr"regardless of "*valid". This changes the
  semantic of the program as the value of "*valid" can cause a change of
  "ptr" and control if it is executed or not.

  To fix this problem we adjust the execution context of hoisted loads
  wrt. error domains. To this end we introduce an ErrorDomainCtxMap that
  maps each basic block to the error context under which it might be
  executed. Thus, to the context under which it is executed but an error
  block would have been executed to. To fill this map one traversal of
  the blocks in the SCoP suffices. During this traversal we do also
  "remove" error statements and those that are only reachable via error
  statements. This was previously done by the removeErrorBlockDomains
  function which is therefor not needed anymore.

  This fixes bug PR26683 and thereby several SPEC miscompiles.

Differential Revision: http://reviews.llvm.org/D18822

llvm-svn: 265778
2016-04-08 10:30:09 +00:00
Johannes Doerfert b47cbe1c72 [FIX] Handle multiplications in the SCEVAffinator again
If ScalarEvolution cannot look through some expression but we do, it
  might happen that a multiplication will arrive at the
  SCEVAffinator::visitMulExpr. While we could always try to improve the
  extractConstantFactor function we might still miss something, thus we
  reintroduce the code to generate multiplicative piecewise-affine
  functions as a fall-back.

llvm-svn: 265777
2016-04-08 10:27:40 +00:00
Johannes Doerfert e85d50defb Add test cases for the removal of error blocks
llvm-svn: 265776
2016-04-08 10:26:39 +00:00
Tobias Grosser 6a44b7fdf0 Add test case forgotten in r265379.
Thanks Johannes for reminding me.

llvm-svn: 265423
2016-04-05 17:40:07 +00:00
Johannes Doerfert 57c5f0b1c4 [FIX] Ensure SAI objects for exit PHIs
If all exiting blocks of a SCoP are error blocks and therefor not
  represented we will not generate accesses and consequently no SAI
  objects for exit PHIs. However, they are needed in the code generation
  to generate the merge PHIs between the original and optimized region.
  With this patch we enusre that the SAI objects for exit PHIs exist
  even if all exiting blocks turn out to be eror blocks.

  This fixes the crash reported in PR27207.

llvm-svn: 265393
2016-04-05 13:44:21 +00:00
Johannes Doerfert 1519491eaf Do not allow to complex branch conditions
Even before we build the domain the branch condition can become very
  complex, especially if we have to build the complement of a lot of
  equality constraints. With this patch we bail if the branch condition
  has a lot of basic sets and parameters.

  After this patch we now successfully compile
    External/SPEC/CINT2000/186_crafty/186_crafty
  with "-polly-process-unprofitable -polly-position=before-vectorizer".

llvm-svn: 265286
2016-04-04 07:59:41 +00:00
Johannes Doerfert 642594ae87 Exploit graph properties during domain generation
As a CFG is often structured we can simplify the steps performed during
  domain generation. When we push domain information we can utilize the
  information from a block A to build the domain of a block B, if A dominates B
  and there is no loop backede on a path from A to B. When we pull domain
  information we can use information from a block A to build the domain of a
  block B if B post-dominates A. This patch implements both ideas and thereby
  simplifies domains that were not simplified by isl. For the FINAL basic block
  in test/ScopInfo/complex-successor-structure-3.ll we used to build a universe
  set with 81 basic sets. Now it actually is represented as universe set.

  While the initial idea to utilize the graph structure depended on the
  dominator and post-dominator tree we can use the available region
  information as a coarse grained replacement. To this end we push the
  region entry domain to the region exit and pull it from the region
  entry for the region exit if applicable.

  With this patch we now successfully compile
    External/SPEC/CINT2006/400_perlbench/400_perlbench
  and
    SingleSource/Benchmarks/Adobe-C++/loop_unroll.

Differential Revision: http://reviews.llvm.org/D18450

llvm-svn: 265285
2016-04-04 07:57:39 +00:00
Johannes Doerfert d5edbd61a1 [FIX] Do not create a SCoP in the presence of infinite loops
If a loop has no exiting blocks the region covering we use during
  schedule genertion might not cover that loop properly. For now we bail
  out as we would not optimize these loops anyway.

llvm-svn: 265280
2016-04-03 23:09:06 +00:00
Tobias Grosser 151ae32dba Revert "[FIX] Do not create a SCoP in the presence of infinite loops"
This reverts commit r265260, as it caused the following 'make check-polly'
failures:

    Polly :: ScopDetect/index_from_unpredictable_loop.ll
    Polly :: ScopInfo/multiple_exiting_blocks.ll
    Polly :: ScopInfo/multiple_exiting_blocks_two_loop.ll
    Polly :: ScopInfo/schedule-const-post-dominator-walk-2.ll
    Polly :: ScopInfo/schedule-const-post-dominator-walk.ll
    Polly :: ScopInfo/switch-5.ll

llvm-svn: 265272
2016-04-03 19:36:52 +00:00
Johannes Doerfert 2075b5d2a1 [FIX] Do not create two SAI objects for exit PHIs
If an exit PHI is written and also read in the SCoP we should not create two
  SAI objects but only one. As the read is only modeled to ensure OpenMP code
  generation knows about it we can simply use the EXIT_PHI MemoryKind for both
  accesses.

llvm-svn: 265261
2016-04-03 11:16:00 +00:00
Johannes Doerfert 7dcceb82e9 [FIX] Do not create a SCoP in the presence of infinite loops
If a loop has no exiting blocks the region covering we use during
  schedule genertion might not cover that loop properly. For now we bail
  out as we would not optimize these loops anyway.

llvm-svn: 265260
2016-04-03 11:12:39 +00:00
Tobias Grosser 6deba4ea03 Revert 264782 and 264789
These caused LNT failures due to new assertions when running with
-polly-position=before-vectorizer -polly-process-unprofitable for:

FAIL: clamscan.compile_time
FAIL: cjpeg.compile_time
FAIL: consumer-jpeg.compile_time
FAIL: shapes.compile_time
FAIL: clamscan.execution_time
FAIL: cjpeg.execution_time
FAIL: consumer-jpeg.execution_time
FAIL: shapes.execution_time

The failures have been introduced by r264782, but r264789 had to be reverted
as it depended on the earlier patch.

llvm-svn: 264885
2016-03-30 18:18:31 +00:00
Johannes Doerfert a144fb148b Exploit graph properties during domain generation
As a CFG is often structured we can simplify the steps performed
  during domain generation. When we push domain information we can
  utilize the information from a block A to build the domain of a
  block B, if A dominates B. When we pull domain information we can
  use information from a block A to build the domain of a block B
  if B post-dominates A. This patch implements both ideas and thereby
  simplifies domains that were not simplified by isl. For the FINAL
  basic block in
    test/ScopInfo/complex-successor-structure-3.ll .
  we used to build a universe set with 81 basic sets. Now it actually is
  represented as universe set.

  While the initial idea to utilize the graph structure depended on the
  dominator and post-dominator tree we can use the available region
  information as a coarse grained replacement. To this end we push the
  region entry domain to the region exit and pull it from the region
  entry for the region exit.

Differential Revision: http://reviews.llvm.org/D18450

llvm-svn: 264789
2016-03-29 21:31:05 +00:00
Michael Kruse 88a2256a34 Revert "[ScopInfo] Fix domains after loops."
This reverts commit r264118. The approach is still under discussion.

llvm-svn: 264705
2016-03-29 07:50:52 +00:00
Johannes Doerfert 6462d8c1d9 Generalize the domain complexity restrictions
This patch applies the restrictions on the number of domain conjuncts
  also to the domain parts of piecewise affine expressions we generate.
  To this end the wording is change slightly. It was needed to support
  complex additions featuring zext-instructions but it also fixes PR27045.

  lnt profitable runs reports only little changes that might be noise:
  Compile Time:
    Polybench/[...]/2mm                     +4.34%
    SingleSource/[...]/stepanov_container   -2.43%
  Execution Time:
    External/[...]/186_crafty               -2.32%
    External/[...]/188_ammp                 -1.89%
    External/[...]/473_astar                -1.87%

llvm-svn: 264514
2016-03-26 16:17:00 +00:00
Johannes Doerfert 733ea34f38 [FIX] Handle accesses to "null" in MemIntrinsics
This fixes PR27035. While we now exclude MemIntrinsics from the
  polyhedral model if they would access "null" we could exploit this
  even more, e.g., remove all parameter combinations that would lead to
  the execution of this statement from the context.

llvm-svn: 264284
2016-03-24 13:50:04 +00:00
Tobias Grosser 25e8ebe29d Drop explicit -polly-delinearize parameter
Delinearization is now enabled by default and does not need to explicitly need
to be enabled in our tests.

llvm-svn: 264154
2016-03-23 13:21:02 +00:00
Tobias Grosser 898a636210 Add option to disallow modref function calls in scops.
This might be useful to evaluate the benefit of us handling modref funciton
calls. Also, a new bug that was triggered by modref function calls was
recently reported http://llvm.org/PR27035. To ensure the same issue does not
cause troubles for other people, we temporarily disable this until the bug
is resolved.

llvm-svn: 264140
2016-03-23 06:40:15 +00:00
Michael Kruse 49a59ca093 [ScopInfo] Fix domains after loops.
ISL can conclude additional conditions on parameters from restrictions
on loop variables. Such conditions persist when leaving the loop and the
loop variable is projected out. This results in a narrower domain for
exiting the loop than entering it and is logically impossible for
non-infinite loops.

We fix this by not adding a lower bound i>=0 when constructing BB
domains, but defer it to when also the upper bound it computed, which
was done redundantly even before this patch.

This reduces the number of LNT fails with -polly-process-unprofitable
-polly-position=before-vectorizer from 8 to 6.

llvm-svn: 264118
2016-03-22 23:27:42 +00:00
Tobias Grosser 5a8c052baf Invalidate scop on encountering a complex control flow
We bail out if current scop has a complex control flow as this could lead to
building of large domain conditions. This is to reduce compile time.  This
addresses r26382.

Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D18362

llvm-svn: 264105
2016-03-22 22:05:32 +00:00
Tobias Grosser 0904c69110 ScopInfo: Do not generate dependences for i1 values used in affine branches
Affine branches are fully modeled and regenerated from the polyhedral domain and
consequently do not require any input conditions to be propagated.

llvm-svn: 263678
2016-03-16 23:33:54 +00:00
Tobias Grosser 2880f10aa4 tests: Fix some spelling mistakes
llvm-svn: 262649
2016-03-03 19:51:03 +00:00
Johannes Doerfert df88023d2b [FIX] Consolidation of loads with same pointer but different access relation
This should fix PR19422.

  Thanks to Jeremy Huddleston Sequoia for reporting this.
  Thanks to Roman Gareev for his investigation and the reduced test case.

llvm-svn: 262612
2016-03-03 12:26:58 +00:00
Johannes Doerfert 066dbf3f8e Track assumptions and restrictions separatly
In order to speed up compile time and to avoid random timeouts we now
  separately track assumptions and restrictions. In this context
  assumptions describe parameter valuations we need and restrictions
  describe parameter valuations we do not allow. During AST generation
  we create a runtime check for both, whereas the one for the
  restrictions is negated before a conjunction is build.

  Except the In-Bounds assumptions we currently only track restrictions.

Differential Revision: http://reviews.llvm.org/D17247

llvm-svn: 262328
2016-03-01 13:06:28 +00:00
Johannes Doerfert a792098047 Support calls with known ModRef function behaviour
Check the ModRefBehaviour of functions in order to decide whether or
  not a call instruction might be acceptable.

Differential Revision: http://reviews.llvm.org/D5227

llvm-svn: 261866
2016-02-25 14:08:48 +00:00
Johannes Doerfert 9dd42ee7c1 Try to build alias checks even when non-affine accesses are allowed
From now on we bail only if a non-trivial alias group contains a non-affine
  access, not when we discover aliasing and non-affine accesses are allowed.

llvm-svn: 261863
2016-02-25 14:06:11 +00:00
Roman Gareev 11001e1534 Annotation of SIMD loops
Use 'mark' nodes annotate a SIMD loop during ScheduleTransformation and skip
parallelism checks.

The buildbot shows the following compile/execution time changes:

  Compile time:
    Improvements    Δ     Previous  Current  σ
    …/gesummv      -6.06% 0.2640    0.2480   0.0055
    …/gemver       -4.46% 0.4480    0.4280   0.0044
    …/covariance   -4.31% 0.8360    0.8000   0.0065
    …/adi          -3.23% 0.9920    0.9600   0.0065
    …/doitgen      -2.53% 0.9480    0.9240   0.0090
    …/3mm          -2.33% 1.0320    1.0080   0.0087

  Execution time:
    Regressions     Δ     Previous  Current  σ
    …/viterbi       1.70% 5.1840    5.2720   0.0074
    …/smallpt       1.06% 12.4920   12.6240  0.0040

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: http://reviews.llvm.org/D14491

llvm-svn: 261620
2016-02-23 09:00:13 +00:00
Johannes Doerfert cea6193b79 Support memory intrinsics
This patch adds support for memcpy, memset and memmove intrinsics. They are
  represented as one (memset) or two (memcpy, memmove) memory accesses in the
  polyhedral model. These accesses have an access range that describes the
  summarized effect of the intrinsic, i.e.,
    memset(&A[i], '$', N);
  is represented as a write access from A[i] to A[i+N].

Differential Revision: http://reviews.llvm.org/D5226

llvm-svn: 261489
2016-02-21 19:13:19 +00:00
Johannes Doerfert 4d9bb8d594 Allow all combinations of types and subscripts for memory accesses
To support non-aligned accesses we introduce a virtual element size
  for arrays that divides each access function used for this array. The
  adjustment of the access function based on the element size of the
  array was therefore moved after this virtual element size was
  determined, thus after all accesses have been created.

Differential Revision: http://reviews.llvm.org/D17246

llvm-svn: 261226
2016-02-18 16:50:12 +00:00
Johannes Doerfert 13637678b1 [FIX] LICM test case
llvm-svn: 260955
2016-02-16 12:10:42 +00:00
Johannes Doerfert 965edde695 Separate more constant factors of parameters
So far we separated constant factors from multiplications, however,
  only when they are at the outermost level of a parameter SCEV. Now,
  we also separate constant factors from the parameter SCEV if the
  outermost expression is a SCEVAddRecExpr. With the changes to the
  SCEVAffinator we can now improve the extractConstantFactor(...)
  function at will without worrying about any other code part. Thus,
  if needed we can implement a more comprehensive
  extractConstantFactor(...) function that will traverse the SCEV
  instead of looking only at the outermost level.

  Four test cases were affected. One did not change much and the other
  three were simplified.

llvm-svn: 260859
2016-02-14 22:30:56 +00:00
Johannes Doerfert 96e5471139 Separate invariant equivalence classes by type
We now distinguish invariant loads to the same memory location if they
  have different types. This will cause us to pre-load an invariant
  location once for each type that is used to access it. However, we can
  thereby avoid invalid casting, especially if an array is accessed
  though different typed/sized invariant loads.

  This basically reverts the changes in r260023 but keeps the test
  cases.

llvm-svn: 260045
2016-02-07 17:30:13 +00:00
Johannes Doerfert e708790c59 [FIX] Two "off-by-one" error in constant range usage
llvm-svn: 260031
2016-02-07 13:59:03 +00:00
Tobias Grosser 8ebdc2dd53 Make memory accesses with different element types optional
We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.

llvm-svn: 260025
2016-02-07 08:48:57 +00:00
Tobias Grosser 107cd5f5f6 IslNodeBuilder: Invariant load hoisting of elements with differing sizes
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. These type mismatches are after
r259784 a lot more common and also arise with types of different size, which
have not been handled before.

Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather then a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.

llvm-svn: 260009
2016-02-06 21:23:39 +00:00
Michael Kruse 2e02d560aa Follow uses to create value MemoryAccesses
The previously implemented approach is to follow value definitions and
create write accesses ("push defs") while searching for uses. This
requires the same relatively validity- and requirement conditions to be
replicated at multiple locations (PHI instructions, other instructions,
uses by PHIs).

We replace this by iterating over the uses in a SCoP ("pull in
requirements"), and add writes only when at least one read has been
added. It turns out to be simpler code because each use is only iterated
over once and writes are added for the first access that reads it. We
need another iteration to identify escaping values (uses not in the
SCoP), which also makes the difference between such accesses more
obvious. As a side-effect, the order of scalar MemoryAccess can change.

Differential Revision: http://reviews.llvm.org/D15706

llvm-svn: 259987
2016-02-06 09:19:40 +00:00
Tobias Grosser d840fc7277 Support accesses with differently sized types to the same array
This allows code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.

We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.

We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.

Finally, we add this as a new item to the 2.9 release notes

Reviewers: jdoerfert, Meinersbur

Differential Revision: http://reviews.llvm.org/D16878

llvm-svn: 259784
2016-02-04 13:18:42 +00:00
Tobias Grosser e2c31210b2 Revert "Support loads with differently sized types from a single array"
This reverts commit (@259587). It needs some further discussions.

llvm-svn: 259629
2016-02-03 05:53:27 +00:00
Tobias Grosser 5d3fc1ea43 Support loads with differently sized types from a single array
We support now code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.

llvm-svn: 259587
2016-02-02 22:05:29 +00:00