Commit Graph

637 Commits

Author SHA1 Message Date
Johannes Doerfert 7a6e292d86 [FIX] Use same alloca for invariant loads and the scalar users
llvm-svn: 252451
2015-11-09 06:28:45 +00:00
Johannes Doerfert 544b23a1ef Revert "Fix non-affine generated entering node not being recognized as dominating"
This reverts commit 9775824b265e574fc541e975d64d3e270243b59d due to a
failing unit test.

Please check and correct the unit test and commit again.

llvm-svn: 252449
2015-11-09 06:04:05 +00:00
Michael Kruse fd9c89e84b Fix non-affine generated entering node not being recognized as dominating
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.

As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.

llvm-svn: 252445
2015-11-09 05:00:30 +00:00
Johannes Doerfert 188542fda9 [FIX] Initialize incoming scalar memory locations for PHIs
llvm-svn: 252437
2015-11-09 00:21:21 +00:00
Johannes Doerfert f85ad0411f [FIX] Carefully simplify assumptions in the presence of error blocks
If a SCoP contains error blocks we cannot use the domain constraints
  to simplify the assumptions as the domain is already influenced by the
  assumptions we took. Before this patch we did that and some assumptions
  became self-fulfilling as they were implied by the domain constraints.

llvm-svn: 252424
2015-11-08 20:16:39 +00:00
Johannes Doerfert a768624f14 [FIX] Introduce different SAI objects for scalar and memory accesses
Even if a scalar and memory access have the same base pointer, we cannot use
  one SAI object as the type but also the number of dimensions are wrong. For
  the attached test case this caused a crash in the invariant load hoisting,
  though it could cause various other problems too.

This fixes bug 25428 and a execution time bug in MallocBench/cfrac.

Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 252422
2015-11-08 19:12:05 +00:00
Johannes Doerfert 3797707695 [FIX] Use unreachable to indicate dead code and repair dominance
When we bail out early we make the partially build new code path
  practically dead, though it was not unreachable. To remove dominance
  problems we now make it not only dead but also prevent the control
  flow to join with the original code path, thus allow to use original
  values after the SCoP without any PHI nodes.

This fixes bug 25447.

llvm-svn: 252420
2015-11-08 17:57:41 +00:00
Johannes Doerfert c4898504ea [FIX] Bail out if there is a dependence cycle between invariant loads
While the program cannot cause a dependence cycle between invariant
  loads, additional constraints (e.g., to ensure finite loops) can
  introduce them. It is hard to detect them in the SCoP description,
  thus we will only check for them at code generation time. If such a
  recursion is detected we will bail out the code generation and place a
  "false" runtime check to guarantee the original code is used.

  This fixes bug 25443.

llvm-svn: 252412
2015-11-07 19:46:04 +00:00
Johannes Doerfert 44483c5599 [FIX] Remove all invariant load occurences from own execution context
llvm-svn: 252411
2015-11-07 19:45:27 +00:00
Michael Kruse 0651480b97 Fix non-affine region dominance of implicitely stored values
After loop versioning, a dominance check of a non-affine subregion's
exit node causes the dominance check to always fail on any block in the
subregion if it shares the same exit block with the scop. The
subregion's exit block has become polly_merge_new_and_old, which also
receives the control flow of the generated code. This would cause that
any value for implicit stores is assumed to be not from the scop.

We check dominance with the generated exit node instead.

This fixes llvm.org/PR25438

llvm-svn: 252375
2015-11-07 00:36:50 +00:00
Tobias Grosser 712229ec59 Add missing '%loadPolly' to test case
llvm-svn: 252302
2015-11-06 14:03:35 +00:00
Michael Kruse ddb6528ba6 Fix reuse of non-dominating synthesized value in subregion exit
We were adding all generated values in non-affine subregions to be used
for the subregions generated exit block. The thought was that only
values that are dominating the original exit block can be used there.
But it is possible for synthesizable values to be expanded in any
block. If the same values is also used for implicit writes, it would
try to reuse already synthesized values even if not dominating the exit
block.

The fix is to only add values to the list of values usable in the exit
block only if it is dominating the exit block. This fixes
llvm.org/PR25412.

llvm-svn: 252301
2015-11-06 13:51:24 +00:00
Tobias Grosser 6578f001bf Adjust debug metadata to LLVM changes in 252219
llvm-svn: 252273
2015-11-06 06:27:39 +00:00
Tobias Grosser f1bfd75221 ScopInfo: Allocate globally unique memory access identifiers
Before this commit memory reference identifiers have only been unique per
basic block, but not per (non-affine) ScopStmt. This commit now uses the
MemoryAccess base pointer to uniquely identify each Memory access.

llvm-svn: 252200
2015-11-05 20:15:37 +00:00
Michael Kruse 27149cf32d Use per-BB value maps for non-exit BBs
For generating scalar writes of non-affine subregions, all except phi
writes are generated in the exit block. The phi writes are generated in
the incoming block for which we errornously used the same BBMap. This
can conflict if a value for one block is synthesized, and then reused
for another block which is not dominated by the first block. This is
fixed by using block-specific BBMaps for phi writes.

llvm-svn: 252172
2015-11-05 16:17:17 +00:00
Michael Kruse f714d470d7 Fix escaping value to subregion entry node phi
An incoming value from a block the is not inside the scop is an
external use, even if the phi is inside the scop. A previous fix in
r251208 did not apply if the phi is inside a non-affine subregion. We
move the check for this phi case before the non-affine subregion check.

llvm-svn: 252157
2015-11-05 13:18:43 +00:00
Johannes Doerfert 22892687f7 [FIX] Simplify and correct preloading of base pointer origin
To simplify and correct the preloading of a base pointer origin, e.g.,
  the base pointer for the current indirect invariant load, we now just
  check if there is an invariant access class that involves the base
  pointer of the current class.

llvm-svn: 251962
2015-11-03 19:15:33 +00:00
Johannes Doerfert eca9e890b9 Remove read-only statements from the SCoP
We do not need to model read-only statements in the SCoP as they will
  not cause any side effects that are visible to the outside anyway.
  Removing them should safe us time and might even simplify the ASTs we
  generate.

Differential Revision: http://reviews.llvm.org/D14272

llvm-svn: 251948
2015-11-03 16:54:49 +00:00
Johannes Doerfert 475d8e3f42 [FIX] Ensure base pointer origin was preloaded already
If a base pointer of a preloaded value has a base pointer origin, thus it is
  an indirect invariant load, we have to make sure the base pointer origin is
  preloaded first.

llvm-svn: 251946
2015-11-03 16:49:02 +00:00
Johannes Doerfert d6fc0701ee [FIX] Carefully rewrite parameters wrt. invariant equivalence classes
ScalarEvolution doesn't allow the operands of an AddRec to be variant in the
  loop of the AddRec. When we rewrite parameter SCEVs it might seem like the
  new SCEV violates this property and ScalarEvolution will trigger an
  assertion. To avoid this we move the start part out of an AddRec when we
  rewrite it, thus avoid the operands to be possibly variant completely.

llvm-svn: 251945
2015-11-03 16:47:58 +00:00
Johannes Doerfert 3181c2ef72 [FIX] Correctly update SAI base pointer
If a base pointer load is preloaded, we have change the base pointer of
  the derived SAI. However, as the derived SAI relationship is is
  coarse grained, we need to check if we actually preloaded the base
  pointer or a different element of the base pointer SAI array.

llvm-svn: 251881
2015-11-03 01:42:59 +00:00
Johannes Doerfert dca2837b76 [FIX] Do not crash in the presence of infinite loops.
llvm-svn: 251870
2015-11-03 00:28:07 +00:00
Johannes Doerfert 907456fe04 [FIX] Use appropriately sized types for big constants
llvm-svn: 251869
2015-11-03 00:26:22 +00:00
Tobias Grosser 8286b83f97 ScopInfo: Bail out in case of mismatching array dimension sizes
In some cases different memory accesses access the very same array using a
different multi-dimensional array layout where the same dimensions have
different sizes. Instead of asserting when encountering this issue, we
gracefully bail out for this scop.

This fixes llvm.org/PR25252

llvm-svn: 251791
2015-11-02 11:29:32 +00:00
Tobias Grosser 4dc909cbed tests: Add test cases for LLVM commit r251267
This fixes llvm.org/PR25242

llvm-svn: 251268
2015-10-25 22:56:42 +00:00
Tobias Grosser bf45e74254 ScopDetect: Bail out for non-simple memory accesses
Volatile or atomic memory accesses are currently not supported. Neither did
we think about any special handling needed nor do we support the unknown
instructions the alias set tracker turns them into sometimes. Before this
patch, us not supporting unkown instructions in an alias set caused the
following assertion failures:

Assertion `AG.size() > 1 && "Alias groups should contain at least two accesses"'
failed

llvm-svn: 251234
2015-10-25 13:48:40 +00:00
Tobias Grosser 4d935cd9aa tests: Add test case forgotten in 251191
llvm-svn: 251228
2015-10-25 10:55:40 +00:00
Tobias Grosser 907090c37c ScopDetection: Update DetectionContextMap accordingly
When verifying if a scop is still valid we rerun all analysis, but did not
update DetectionContextMap. This change ensures that information, e.g. about
non-affine regions, is correctly updated

llvm-svn: 251227
2015-10-25 10:55:35 +00:00
Tobias Grosser 51174cca30 ScopDetection: Do not crash if we find zero array size candidates for delinearization
llvm-svn: 251226
2015-10-25 08:40:44 +00:00
Tobias Grosser 5528dcdf25 ScopDetection: Always refuse multi-dimensional memory accesses with 'undef' in
the size expression.

We previously only checked if the size expression is 'undef', but allowed size
expressions of the form 'undef * undef' by accident. After this change we now
require size expressions to be affine which implies no 'undef' appears anywhere
in the expression.

llvm-svn: 251225
2015-10-25 08:40:38 +00:00
Tobias Grosser baffa091dd ScopInfo: PHI-node uses in the EntryNode with an incoming BB that is not part
of the Region are external.

During code generation we split off the parts of the PHI nodes in the entry
block, which have incoming blocks that are not part of the region. As these
split-off PHI nodes then are external uses, we consequently also need to model
these uses in ScopInfo.

llvm-svn: 251208
2015-10-24 20:55:27 +00:00
Tobias Grosser ffd6b3bb29 Add a missing '-S'
llvm-svn: 251199
2015-10-24 19:02:01 +00:00
Tobias Grosser a3f6edaee1 BlockGenerator: Do not assert when finding model PHI nodes defined outside the scop
Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed
any scalar PHI node above the scop and used in the scop is modeled as scalar
read access.

llvm-svn: 251198
2015-10-24 19:01:09 +00:00
Johannes Doerfert 654c3284f4 [FIX] Do not hoist nested variant base pointers
This fixes bug 25249.

llvm-svn: 250958
2015-10-21 22:14:57 +00:00
Tobias Grosser ca7f5bb767 Full/partial tile separation for vectorization
We isolate full tiles from partial tiles to be able to, for example, vectorize
loops with parametric lower and/or upper bounds.

If we use -polly-vectorizer=stripmine, we can see execution-time improvements:
correlation from 1m7361s to 0m5720s (-67.05 %), covariance from 1m5561s to
0m5680s (-63.50 %), ary3 from 2m3201s to 1m2361s (-46.72 %), CrystalMk from
8m5565s to 7m4285s (-13.18 %).

The current full/partial tile separation increases compile-time more than
necessary. As a result, we see in compile time regressions, for example, for 3mm
from 0m6320s to 0m9881s (56.34%). Some of this compile time increase is expected
as we generate more IR and consequently more time is spent in the LLVM backends.
However, a first investiagation has shown that a larger portion of compile time
is unnecessarily spent inside Polly's parallelism detection and could be
eliminated by propagating existing knowledge about vector loop parallelism.
Before enabling -polly-vectorizer=stripmine by default, it is necessary to
address this compile-time issue.

Contributed-by: Roman Gareev <gareevroman@gmail.com>

Reviewers: jdoerfert, grosser

Subscribers: grosser, #polly

Differential Revision: http://reviews.llvm.org/D13779

llvm-svn: 250809
2015-10-20 09:12:21 +00:00
Michael Kruse 48ea8efd59 Correct typo in CHECK line
Thanks Tobias for the hint.

llvm-svn: 250695
2015-10-19 10:51:20 +00:00
Michael Kruse dc12222287 Synthesize phi arguments in incoming block
New values were always synthesized in the block of the instruction
that needed them. This is incorrect for PHI node whose' value must be
defined in the respective incoming block. This patch temporarily moves
the builder's insert point to the incoming block while synthesizing phi
node arguments. 

This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241)

llvm-svn: 250693
2015-10-19 09:19:25 +00:00
Johannes Doerfert 9c28bfa72c [FIX] Only constant integer branch conditions are always affine
There are several different kinds of constants that could occur in a
  branch condition, however we can only handle the most interesting one
  namely constant integers. To this end we have to treat others as
  non-affine.

  This fixes bug 25244.

llvm-svn: 250669
2015-10-18 22:56:42 +00:00
Johannes Doerfert 30c2265f98 [FIX] Normalize loops outside the SCoP during schedule generation
We build the schedule based on a traversal of the region and accumulate
  information for each loop in it. The total schedule is associated with the
  loop surrounding the SCoP, though it can happen that there are blocks in the
  SCoP which are part of loops that are only partially in the SCoP. Instead of
  associating information with them (they are not part of the SCoP and
  consequently are not modeled) we have to associate the schedule information
  with the surrounding loop if any.

  This fixes bug 25240.

llvm-svn: 250668
2015-10-18 21:17:11 +00:00
Johannes Doerfert b864c2c3c9 [FIX] Do not try to hoist "empty" accesses
Accesses that have a relative offset (in bytes) that is not divisible
  by the type size (in bytes) will be represented as empty in the SCoP
  description. This is on its own not good but it also crashed the
  invariant load hoisting. This patch will fix the latter problem while
  the former should be addressed too.

  This fixes bug 25236.

llvm-svn: 250664
2015-10-18 19:50:18 +00:00
Johannes Doerfert bc7cff4c18 [FIX] Do not hoist invariant pointers with non-loaded base ptr in SCoP
If the base pointer of a load is invariant and defined in the SCoP but
  not loaded we cannot hoist the load as we would not hoist the base
  pointer definition.

  This fixes bug 25237.

llvm-svn: 250663
2015-10-18 19:49:25 +00:00
Johannes Doerfert af3e301a67 [FIX] Restructure invariant load equivalence classes
Sorting is replaced by a demand driven code generation that will pre-load a
  value when it is needed or, if it was not needed before, at some point
  determined by the order of invariant accesses in the program. Only in very
  little cases this demand driven pre-loading will kick in, though it will
  prevent us from generating faulty code. An example where it is needed is
  shown in:
    test/ScopInfo/invariant_loads_complicated_dependences.ll

  Invariant loads that appear in parameters but are not on the top-level (e.g.,
  the parameter is not a SCEVUnknown) will now be treated correctly.

Differential Revision: http://reviews.llvm.org/D13831

llvm-svn: 250655
2015-10-18 12:39:19 +00:00
Johannes Doerfert d8b6ad255f [FIX] Cast preloaded values
Preloaded values have to match the type of their counterpart in the
  original code and not the type of the base array.

llvm-svn: 250654
2015-10-18 12:36:42 +00:00
Johannes Doerfert 01978cfa0c Remove independent blocks pass
Polly can now be used as a analysis only tool as long as the code
  generation is disabled. However, we do not have an alternative to the
  independent blocks pass in place yet, though in the relevant cases
  this does not seem to impact the performance much. Nevertheless, a
  virtual alternative that allows the same transformations without
  changing the input region will follow shortly.

llvm-svn: 250652
2015-10-18 12:28:00 +00:00
Tobias Grosser b8d27aab7d Revert to original BlockGenerator::getOrCreateAlloca(MemoryAccess &Access)
Expressing this in terms of BlockGenerator::getOrCreateAlloca(const
ScopArrayInfo *Array) does not work as the MemoryAccess BasePtr is in case of
invariant load hoisting different to the ScopArrayInfo BasePtr. Until this is
investigated and fixed, we move back to code that just uses the baseptr of
MemoryAccess.

llvm-svn: 250637
2015-10-18 00:51:13 +00:00
Michael Kruse 225f0d1ee2 Load/Store scalar accesses before/after the statement itself
Instead of generating implicit loads within basic blocks, put them 
before the instructions of the statment itself, including non-affine 
subregions. The region's entry node is dominating all blocks in the 
region and therefore the loaded value will be available there.

Implicit writes in block-stmts were already stored back at the end of 
the block. Now, also generate the stores of non-affine subregions when 
leaving the statement, i.e. in the exiting block.

This change is required for array-mapped implicits ("De-LICM") to 
ensure that there are no dependencies of demoted scalars within 
statments. Statement load all required values, operator on copied in 
registers, and then write back the changed value to the demoted memory. 
Lifetimes analysis within statements becomes unecessary.

Differential Revision: http://reviews.llvm.org/D13487

llvm-svn: 250625
2015-10-17 21:36:00 +00:00
Michael Kruse 01cb379fed Avoid unnecessay .s2a write access when used only in PHIs
Accesses for exit node phis will be handled separately by 
buildPHIAccesses if there is more than one exiting edge, 
buildScalarDependences does not need to create additional SCALAR 
accesses.

This is a corrected version of r250517, which was reverted in r250607.

Differential Revision: http://reviews.llvm.org/D13848

llvm-svn: 250622
2015-10-17 21:07:08 +00:00
Tobias Grosser 3839b422e6 Revert "Avoid unnecessay .s2a write access when used only in PHIs"
This reverts commit r250606 due to some bugs it introduced. After these bugs
have been resolved, we will add it back to tree.

llvm-svn: 250607
2015-10-17 08:54:05 +00:00
Michael Kruse e71893d580 Add testcase for r250517
llvm-svn: 250518
2015-10-16 15:17:26 +00:00
Michael Kruse aeceab770e Avoid unnecessay .s2a write access when used only in PHIs
PHI accesses will be handled separately by buildPHIAccesses,
buildScalarDependences does not need to create additional accesses.

llvm-svn: 250517
2015-10-16 15:14:40 +00:00