Commit Graph

711 Commits

Author SHA1 Message Date
Johannes Doerfert 13637678b1 [FIX] LICM test case
llvm-svn: 260955
2016-02-16 12:10:42 +00:00
Johannes Doerfert 4cf1580f0c [FIX] Check the next base pointer for possible invariant loads
A load can only be invariant if its base pointer is invariant too. To
  this end, we check if the base pointer is defined inside the region or
  outside. In the former case we recursively check if we can (and
  therefore will) hoist the base pointer too. Only if that happends we
  can hoist the load.

llvm-svn: 260886
2016-02-15 12:42:05 +00:00
Johannes Doerfert f69162486b Revert "[FIX] Hoist accesses if AA stated they are invariant"
This reverts commit 98efa006c96ac981c00d2e386ec1102bce9f549a.

  The fix was broken since we do not use AA in the ScopDetection anymore to
  check for invariant accesses.

llvm-svn: 260884
2016-02-15 12:21:11 +00:00
Johannes Doerfert 2353e39e1f [FIX] Hoist accesses if AA stated they are invariant
Before this patch it could happen that we did not hoist a load that
  was a base pointer of another load even though AA already declared the
  first one as invariant (during ScopDetection). If this case arises we
  will now skipt the "can be overwriten" check because in this case the
  over-approximating nature causes us to generate broken code.

llvm-svn: 260862
2016-02-14 23:37:14 +00:00
Johannes Doerfert 965edde695 Separate more constant factors of parameters
So far we separated constant factors from multiplications, however,
  only when they are at the outermost level of a parameter SCEV. Now,
  we also separate constant factors from the parameter SCEV if the
  outermost expression is a SCEVAddRecExpr. With the changes to the
  SCEVAffinator we can now improve the extractConstantFactor(...)
  function at will without worrying about any other code part. Thus,
  if needed we can implement a more comprehensive
  extractConstantFactor(...) function that will traverse the SCEV
  instead of looking only at the outermost level.

  Four test cases were affected. One did not change much and the other
  three were simplified.

llvm-svn: 260859
2016-02-14 22:30:56 +00:00
Johannes Doerfert 96e5471139 Separate invariant equivalence classes by type
We now distinguish invariant loads to the same memory location if they
  have different types. This will cause us to pre-load an invariant
  location once for each type that is used to access it. However, we can
  thereby avoid invalid casting, especially if an array is accessed
  though different typed/sized invariant loads.

  This basically reverts the changes in r260023 but keeps the test
  cases.

llvm-svn: 260045
2016-02-07 17:30:13 +00:00
Johannes Doerfert e708790c59 [FIX] Two "off-by-one" error in constant range usage
llvm-svn: 260031
2016-02-07 13:59:03 +00:00
Tobias Grosser 8ebdc2dd53 Make memory accesses with different element types optional
We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.

llvm-svn: 260025
2016-02-07 08:48:57 +00:00
Tobias Grosser 46bafbd0fe Do not yet consider loads with non-canonical element size for load hoisting.
Invariant load hoisting of memory accesses with non-canonical element
types lacks support for equivalence classes that contain elements of
different width/size. This support should be added, but to get our buildbots
back to green, we disable load hoisting for memory accesses with non-canonical
element size for now.

llvm-svn: 260023
2016-02-07 08:11:36 +00:00
Tobias Grosser 107cd5f5f6 IslNodeBuilder: Invariant load hoisting of elements with differing sizes
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. These type mismatches are after
r259784 a lot more common and also arise with types of different size, which
have not been handled before.

Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather then a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.

llvm-svn: 260009
2016-02-06 21:23:39 +00:00
Michael Kruse 2e02d560aa Follow uses to create value MemoryAccesses
The previously implemented approach is to follow value definitions and
create write accesses ("push defs") while searching for uses. This
requires the same relatively validity- and requirement conditions to be
replicated at multiple locations (PHI instructions, other instructions,
uses by PHIs).

We replace this by iterating over the uses in a SCoP ("pull in
requirements"), and add writes only when at least one read has been
added. It turns out to be simpler code because each use is only iterated
over once and writes are added for the first access that reads it. We
need another iteration to identify escaping values (uses not in the
SCoP), which also makes the difference between such accesses more
obvious. As a side-effect, the order of scalar MemoryAccess can change.

Differential Revision: http://reviews.llvm.org/D15706

llvm-svn: 259987
2016-02-06 09:19:40 +00:00
Tobias Grosser d840fc7277 Support accesses with differently sized types to the same array
This allows code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.

We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.

We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.

Finally, we add this as a new item to the 2.9 release notes

Reviewers: jdoerfert, Meinersbur

Differential Revision: http://reviews.llvm.org/D16878

llvm-svn: 259784
2016-02-04 13:18:42 +00:00
Wei Mi eb14ac5396 Polly tests update contributed by Tobias Grosser for SCEV patch in r259736.
llvm-svn: 259737
2016-02-04 01:34:28 +00:00
Tobias Grosser ba043143a9 test: make test case more robust against removal of unrelated instructions
llvm-svn: 259693
2016-02-03 21:10:11 +00:00
Tobias Grosser e2c31210b2 Revert "Support loads with differently sized types from a single array"
This reverts commit (@259587). It needs some further discussions.

llvm-svn: 259629
2016-02-03 05:53:27 +00:00
Tobias Grosser 5d3fc1ea43 Support loads with differently sized types from a single array
We support now code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.

llvm-svn: 259587
2016-02-02 22:05:29 +00:00
Tobias Grosser c2fd8b411d ScopInfo: Correct schedule construction
For schedule generation we assumed that the reverse post order traversal used by
the domain generation is sufficient, however it is not. Once a loop is
discovered, we have to completely traverse it, before we can generate the
schedule for any block/region that is only reachable through a loop exiting
block.

To this end, we add a "loop stack" that will keep track of loops we
discovered during the traversal but have not yet traversed completely.
We will never visit a basic block (or region) outside the most recent
(thus smallest) loop in the loop stack but instead queue such blocks
(or regions) in a waiting list. If the waiting list is not empty and
(might) contain blocks from the most recent loop in the loop stack the
next block/region to visit is drawn from there, otherwise from the
reverse post order iterator.

We exploit the new property of loops being always completed before additional
loops are processed, by removing the LoopSchedules map and instead keep all
information in LoopStack. This clarifies that we indeed always only keep a
stack of in-process loops, but will never keep incomplete schedules for an
arbitrary set of loops. As a result, we can simplify some of the existing code.

This patch also adds some more documentation about how our schedule construction
works.

This fixes http://llvm.org/PR25879

This patch is an modified version of Johannes Doerfert's initial fix.

Differential Revision: http://reviews.llvm.org/D15679

llvm-svn: 259354
2016-02-01 11:54:13 +00:00
Michael Kruse 26311f00e8 Remove autotools build system
The autotools build system is based on and requires LLVM's autotools
build system to work, which has been depricated and finally removed in
r258861. Consequently we also remove the autotools build system from
Polly.

Differential Revision: http://reviews.llvm.org/D16655

llvm-svn: 259041
2016-01-28 12:00:33 +00:00
Michael Kruse fd46308de4 ScopInfo: Never add read accesses for synthesizable values
Before adding a MK_Value READ MemoryAccess, check whether the read is
necessary or synthesizable. Synthesizable values are later generated by
the SCEVExpander and therefore do not need to be transferred
explicitly. This can happen because the check for synthesizability has
presumbly been forgotten in the case where a phi's incoming value has
been defined in a different statement.

Differential Revision: http://reviews.llvm.org/D15687

llvm-svn: 258998
2016-01-27 22:51:56 +00:00
Michael Kruse ee6a4fc680 Unique phi write accesses
Ensure that there is at most one phi write access per PHINode and
ScopStmt. In particular, this would be possible for non-affine
subregions with multiple exiting blocks. We replace multiple MAY_WRITE
accesses by one MUST_WRITE access. The written value is constructed
using a PHINode of all exiting blocks. The interpretation of the PHI
WRITE's "accessed value" changed from the incoming value to the PHI like
for PHI READs since there is no unique incoming value.

Because region simplification shuffles around PHI nodes -- particularly
with exit node PHIs -- the PHINodes at analysis time does not always
exist anymore in the code generation pass. We instead remember the
incoming block/value pair in the MemoryAccess.

Differential Revision: http://reviews.llvm.org/D15681

llvm-svn: 258809
2016-01-26 13:33:27 +00:00
Michael Kruse 436db620e7 Unique value write accesses
Ensure there is at most one write access per definition of an
llvm::Value. Keep track of already created value write access by using
a (dense) map.

Replace addValueWriteAccess by ensureValueStore which can be uses more
liberally without worrying to add redundant accesses. It will be used,
e.g. in a logical correspondant for value reads -- ensureValueReload --
to ensure that the expected definition has been written when loading it.

Differential Revision: http://reviews.llvm.org/D15483

llvm-svn: 258807
2016-01-26 13:33:10 +00:00
Johannes Doerfert 6f50c29ab2 [FIX] Domain generation error due to loops in non-affine regions
llvm-svn: 258803
2016-01-26 11:03:25 +00:00
Johannes Doerfert 432658d7b8 [FIX] Build correct domain for non-affine region SCoPs
llvm-svn: 258802
2016-01-26 11:01:41 +00:00
Tobias Grosser f2cdd144e5 BlockGenerators: Replace getNewScalarValue with getNewValue
Both functions implement the same functionality, with the difference that
getNewScalarValue assumes that globals and out-of-scop scalars can be directly
reused without loading them from their corresponding stack slot. This is correct
for sequential code generation, but causes issues with outlining code e.g. for
OpenMP code generation. getNewValue handles such cases correctly.

Hence, we can replace getNewScalarValue with getNewValue. This is not only more
future proof, but also eliminates a bunch of code.

The only functionality that was available in getNewScalarValue that is lost
is the on-demand creation of scalar values. However, this is not necessary any
more as scalars are always loaded at the beginning of each basic block and will
consequently always be available when scalar stores are generated. As this was
not the case in older versions of Polly, it seems the on-demand loading is just
some older code that has not yet been removed.

Finally, generateScalarLoads also generated loads for values that are loop
invariant, available in GlobalMap and which are preferred over the ones loaded
in generateScalarLoads. Hence, we can just skip the code generation of such
scalar values, avoiding the generation of dead code.

Differential Revision: http://reviews.llvm.org/D16522

llvm-svn: 258799
2016-01-26 10:01:35 +00:00
Tobias Grosser 232905089e test: Name instructions in a test case [NFC]
llvm-svn: 258662
2016-01-24 17:51:37 +00:00
Tobias Grosser 1c3a6d7808 ScopDetection: Do not detect regions with irreducible control as scops
Polly currently does not support irreducible control and it is probably not
worth supporting. This patch adds code that checks for irreducible control
and refuses regions containing irreducible control.

Polly traditionally had rather restrictive checks on the control flow structure
which would have refused irregular control, but within the last couple of months
most of the control flow restrictions have been removed. As part of this
generalization we accidentally allowed irregular control flow.

Contributed-by: Karthik Senthil and Ajith Pandel
llvm-svn: 258497
2016-01-22 09:44:37 +00:00
Tobias Grosser b3a9538e95 Remove irreducible control flow from test case
The test case we look at does not necessarily require irreducible control flow,
but a normal loop is sufficient to create a non-affine region containing more
than one basic block that dominates the exit node. We replace this irreducible
control flow with a normal loop for the following reasons:

  1) This is easier to understand
  2) We will subsequently commit a patch that ensures Polly does not process
     irreducible control flow.

Within non-affine regions, we could possibly handle irreducible control flow.

llvm-svn: 258496
2016-01-22 09:33:33 +00:00
Johannes Doerfert 370cf00c9f Make sure we preserve alignment information after hoisting invariant load
In Polly, after hoisting loop invariant loads outside loop, the alignment
information for hoisted loads are missing, this patch restore them.

Contributed-by: Lawrence Hu <lawrence@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D16160

llvm-svn: 258105
2016-01-19 00:17:21 +00:00
Michael Kruse 959a8dc39f Update to ISL 0.16.1
llvm-svn: 257898
2016-01-15 15:54:45 +00:00
Michael Kruse 5a9a65e43f Prepare unit tests for update to ISL 0.16
ISL 0.16 will change how sets are printed which breaks 117 unit tests
that text-compare printed sets. This patch re-formats most of these unit
tests using a script and small manual editing on top of that. When
actually updating ISL, most work is done by just re-running the script
to adapt to the changed output.

Some tests that compare IR and tests with single CHECK-lines that can be
easily updated manually are not included here.

The re-format script will also be committed afterwards. The per-test
formatter invocation command lines options will not be added in the near
future because it is ad hoc and would overwrite the manual edits.
Ideally it also shouldn't be required anymore because ISL's set printing
has become more stable in 0.16.

Differential Revision: http://reviews.llvm.org/D16095

llvm-svn: 257851
2016-01-15 00:48:42 +00:00
Roman Gareev 10595a1739 Call assumeNoOutOfBound only in updateDimensionality
Call assumeNoOutOfBound only in updateDimensionality to process situations
when new dimensions are added and new bounds checks are required.

Contributed-by: Tobias Grosser, Gareev Roman
llvm-svn: 257170
2016-01-08 14:01:59 +00:00
Tobias Grosser c1a269bf0e Add option to assume single-loop scops with sufficient compute are profitable
If a loop has a sufficiently large amount of compute instruction in its loop
body, it is unlikely that our rewrite of the loop iterators introduces large
performance changes. As Polly can also apply beneficical optimizations (such
as parallelization) to such loop nests, we mark them as profitable.

This option is currently "disabled" by default, but can be used to run
experiments. If enabled by setting it e.g. to 40 instructions, we currently
see some compile-time increases on LNT without any significant run-time
changes.

llvm-svn: 256199
2015-12-21 21:00:43 +00:00
Johannes Doerfert 30e2307f61 [FIX] Schedule generation for block exiting multiple loops.
This fixes bug PR25604.

llvm-svn: 256125
2015-12-20 17:12:22 +00:00
Tobias Grosser 75dc40c3be ScopInfo: Bail out in case of complex branch structures
Scops that contain many complex branches are likely to result in complex domain
conditions that consist of a large (> 100) number of conjucts.  Transforming
such domains is expensive and unlikely to result in efficient code.  To avoid
long compile times we detect this case and skip such scops. In the future we may
improve this by either using non-affine subregions to hide such complex
condition structures or by exploiting in certain cases properties (e.g.,
dominance) that allow us to construct the domains of a scop in a way that
results in a smaller number improving conjuncts.

Example of a code that results in complex iteration spaces:

      loop.header
     /    |    \ \
   A0    A2    A4 \
     \  /  \  /    \
      A1    A3      \
     /  \  /  \     |
   B0    B2    B4   |
     \  /  \  /     |
      B1    B3      ^
     /  \  /  \     |
   C0    C2    C4   |
     \  /  \  /    /
      C1    C3    /
       \   /     /
    loop backedge

llvm-svn: 256123
2015-12-20 13:31:48 +00:00
Roman Gareev 22803d4488 Fix of a comment.
llvm-svn: 255923
2015-12-17 20:47:10 +00:00
Roman Gareev 8aa437503c Fix delinearization of fortran arrays
The patch fixes Bug 25759 produced by inappropriate handling of unsigned
maximum SCEV expressions by SCEVRemoveMax. Without a fix, we get an infinite
loop and a segmentation fault, if we try to process, for example,
'((-1 + (-1 * %b1)) umax {(-1 + (-1 * %yStart)),+,-1}<%.preheader>)'.
It also fixes a potential issue related to signed maximum SCEV expressions.

Tested-by: Roman Gareev <gareevroman@gmail.com>
Fixed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: http://reviews.llvm.org/D15563

llvm-svn: 255922
2015-12-17 20:37:17 +00:00
Tobias Grosser a69d4f0d83 VectorBlockGenerator: Generate scalar loads for vector statements
When generating scalar loads/stores separately the vector code has not been
updated. This commit adds code to generate scalar loads for vector code as well
as code to assert in case scalar stores are encountered within a vector loop.

llvm-svn: 255714
2015-12-15 23:49:58 +00:00
Tobias Grosser 0921477248 ScopInfo: Look up first (and only) array access
When rewriting the access functions of load/store statements, we are only
interested in the actual array memory location. The current code just took
the very first memory access, which could be a scalar or an array access. As
a result, we failed to update access functions even though this was requested
via .jscop.

llvm-svn: 255713
2015-12-15 23:49:53 +00:00
Tobias Grosser f4f6870ff2 Revert "Always treat scalar writes as MUST_WRITEs"
This reverts commit r255471.

Johannes raised in the post-commit review of r255471 the concern that PHI
writes in non-affine regions with two exiting blocks are not really MUST_WRITE,
but we just know that at least one out of the set of all possible PHI writes
will be executed. Modeling all PHI nodes as MUST_WRITEs is probably save, but
adding the needed documentation for such a special case is probably not worth
the effort. Michael will be proposing a new patch that ensures only a single
PHI_WRITE is created for non-affine regions, which - besides other benefits -
should also allow us to use a single well-defined MUST_WRITE for such PHI
writes.

(This is not a full revert, but the condition and documentation have been
slightly extended)

llvm-svn: 255503
2015-12-14 15:05:37 +00:00
Michael Kruse e0d135c536 Add unit test for r255473
Check that memory accesses in non-affine regions that are always executed are
MUST_WRITE.

llvm-svn: 255500
2015-12-14 14:53:30 +00:00
Michael Kruse b06e3029d1 Always treat scalar writes as MUST_WRITEs
LLVM's IR guarantees that a value definition occurs before any use, and
also the value of a PHI must be one of the incoming values, "written"
in one of the incoming blocks. Hence, such writes are never conditional
in the context of a non-affine subregion.

llvm-svn: 255471
2015-12-13 22:10:32 +00:00
Tobias Grosser 2d3d4ec860 executeScopConditionally: Introduce special exiting block
When introducing separate control flow for the original and optimized code we
introduce now a special 'ExitingBlock':

      \   /
    EnteringBB
        |
    SplitBlock---------\
   _____|_____         |
  /  EntryBB  \    StartBlock
  |  (region) |        |
  \_ExitingBB_/   ExitingBlock
        |              |
    MergeBlock---------/
        |
      ExitBB
      /    \

This 'ExitingBlock' contains code such as the final_reloads for scalars, which
previously were just added to whichever statement/loop_exit/branch-merge block
had been generated last. Having an explicit basic block makes it easier to
find these constructs when looking at the CFG.

llvm-svn: 255107
2015-12-09 11:38:22 +00:00
Tobias Grosser 87a44d29a2 test: Fix misspelled test line
llvm-svn: 255106
2015-12-09 11:38:08 +00:00
Tobias Grosser a5d9e65e17 Update isl to isl-0.15-142-gf101714
This update brings in improvements to isl's 'isolate' option that reduce the
number of code versions generated. This results in both code-size and compile
time reduction for outer loop vectorization.

Thanks to Roman Garev and Sven Verdoolaege for working on this improvement.

llvm-svn: 254706
2015-12-04 08:46:14 +00:00
Tobias Grosser 2f8e43d677 ScopInfo: Add support for delinearizing fortran arrays
gfortran (and fortran in general?) does not compute the address of an array
element directly from the array sizes (e.g., %s0, %s1), but takes first the
maximum of the sizes and 0 (e.g., max(0, %s0)) before multiplying the resulting
value with the per-dimension array subscript expressions. To successfully
delinearize index expressions as we see them in fortran, we first filter 'smax'
expressions out of the SCEV expression, use them to guess array size parameters
and only then continue with the existing delinearization.

llvm-svn: 253995
2015-11-24 17:06:38 +00:00
Tobias Grosser 9737c7b431 ScopInfo: Remove domains of error blocks (and blocks they dominate) early on
Trying to build up access functions for any of these blocks is likely to fail,
as error blocks may contain invalid/non-representable instructions, and blocks
dominated by error blocks may reference such instructions, which wil also cause
failures. As all of these blocks are anyhow assumed to not be executed, we can
just remove them early on.

This fixes http://llvm.org/PR25596

llvm-svn: 253818
2015-11-22 11:06:51 +00:00
Tobias Grosser 020fa09a3c Remove -polly-code-generator=isl from many test cases
This is the default since a long time. Setting it again does not add value
in any of these test cases.

llvm-svn: 253800
2015-11-21 23:05:48 +00:00
Johannes Doerfert a2a74f09fc Do not enforce lcssa
At some point we enforced lcssa for the loop surrounding the entry block.
  This is not only questionable as it does not check any other loop but also
  not needed any more.

llvm-svn: 253789
2015-11-21 17:00:02 +00:00
Johannes Doerfert dec27df588 [FIX] Get the correct loop that surrounds a region
llvm-svn: 253788
2015-11-21 16:56:13 +00:00
Tobias Grosser b39c96aa19 ScopInfo: Ensure unique names for parameter names coming from load instructions
In case the original parameter instruction does not have a name, but it comes
from a load instruction where the base pointer has a name we used the name of
the load instruction to give some more intuition of where the parameter came
from. To ensure this works also through GEPs which may have complex offsets,
we originally just dropped the offsets and _only_ used the base pointer name.
As this can result in multiple parameters to get the same name, we now prefix
the parameter ID to ensure parameter names are unique. This will make it easier
to understand debug output.

This change does not affect correctness, as parameter IDs (even of the same
name) can always be distinguished through the SCEV pointer stored inside them.

llvm-svn: 253330
2015-11-17 11:54:51 +00:00