Commit Graph

1919 Commits

Author SHA1 Message Date
Tobias Grosser 3e9560200f RegionGenerator: Clear local maps after statement construction
These maps are only needed during the construction of a single region statement.
Clearing them is important, as we otherwise get an assert in case some of the
referenced values are erased before the RegionGenerator is deleted.

llvm-svn: 251341
2015-10-26 20:41:53 +00:00
Tobias Grosser 4dc909cbed tests: Add test cases for LLVM commit r251267
This fixes llvm.org/PR25242

llvm-svn: 251268
2015-10-25 22:56:42 +00:00
Tobias Grosser bf45e74254 ScopDetect: Bail out for non-simple memory accesses
Volatile or atomic memory accesses are currently not supported. Neither did
we think about any special handling needed nor do we support the unknown
instructions the alias set tracker turns them into sometimes. Before this
patch, us not supporting unkown instructions in an alias set caused the
following assertion failures:

Assertion `AG.size() > 1 && "Alias groups should contain at least two accesses"'
failed

llvm-svn: 251234
2015-10-25 13:48:40 +00:00
Tobias Grosser 27e19a022e Fix typo
llvm-svn: 251231
2015-10-25 12:05:14 +00:00
Tobias Grosser 4d935cd9aa tests: Add test case forgotten in 251191
llvm-svn: 251228
2015-10-25 10:55:40 +00:00
Tobias Grosser 907090c37c ScopDetection: Update DetectionContextMap accordingly
When verifying if a scop is still valid we rerun all analysis, but did not
update DetectionContextMap. This change ensures that information, e.g. about
non-affine regions, is correctly updated

llvm-svn: 251227
2015-10-25 10:55:35 +00:00
Tobias Grosser 51174cca30 ScopDetection: Do not crash if we find zero array size candidates for delinearization
llvm-svn: 251226
2015-10-25 08:40:44 +00:00
Tobias Grosser 5528dcdf25 ScopDetection: Always refuse multi-dimensional memory accesses with 'undef' in
the size expression.

We previously only checked if the size expression is 'undef', but allowed size
expressions of the form 'undef * undef' by accident. After this change we now
require size expressions to be affine which implies no 'undef' appears anywhere
in the expression.

llvm-svn: 251225
2015-10-25 08:40:38 +00:00
Tobias Grosser baffa091dd ScopInfo: PHI-node uses in the EntryNode with an incoming BB that is not part
of the Region are external.

During code generation we split off the parts of the PHI nodes in the entry
block, which have incoming blocks that are not part of the region. As these
split-off PHI nodes then are external uses, we consequently also need to model
these uses in ScopInfo.

llvm-svn: 251208
2015-10-24 20:55:27 +00:00
Tobias Grosser ffd6b3bb29 Add a missing '-S'
llvm-svn: 251199
2015-10-24 19:02:01 +00:00
Tobias Grosser a3f6edaee1 BlockGenerator: Do not assert when finding model PHI nodes defined outside the scop
Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed
any scalar PHI node above the scop and used in the scop is modeled as scalar
read access.

llvm-svn: 251198
2015-10-24 19:01:09 +00:00
Tobias Grosser 27d742da59 BlockGenerator: Directly handle multi-exit PHI nodes
This change adds code to directly code-generate multi-exit PHI nodes, instead
of trying to reuse the EscapeMap infrastructure for this. Using escape maps
adds a level of indirection that is hard to understand and - more importantly -
breaks in certain cases.

Specifically, the original code relied on simplifyRegion() to split the original
PHI node in two PHI nodes, one merging the values coming from within the scop
and a second that merges the first PHI node with the values that come from
outside the scop. To generate code the first PHI node is then just handled like
any other in-scop value that is used somewhere outside the scop. This fails for
the case where all values from inside the scop are identical, as the first PHI
node is in such cases automatically simplified and eliminated by LLVM right at
construction. As a result, there is no instruction that can be pass to the
EscapeMap handling, which means the references in the second PHI node are not
updated and may still reference values from within the original scop that do not
dominate it.

Our new code iterates directly over all modeled ScopArrayInfo objects that
represent multi-exit PHI nodes and generates code for them without relying on
the EscapeMap infrastructure. Hence, it works also for the case where the first
PHI node is eliminated.

llvm-svn: 251191
2015-10-24 17:41:29 +00:00
Tobias Grosser c73d8b0e18 ScopInfo: Drop unnecessary code
This case has already been taken care of in r250622 and was then accidentally
again committed in 250625.

llvm-svn: 251156
2015-10-23 22:36:22 +00:00
Johannes Doerfert 654c3284f4 [FIX] Do not hoist nested variant base pointers
This fixes bug 25249.

llvm-svn: 250958
2015-10-21 22:14:57 +00:00
Tobias Grosser ca7f5bb767 Full/partial tile separation for vectorization
We isolate full tiles from partial tiles to be able to, for example, vectorize
loops with parametric lower and/or upper bounds.

If we use -polly-vectorizer=stripmine, we can see execution-time improvements:
correlation from 1m7361s to 0m5720s (-67.05 %), covariance from 1m5561s to
0m5680s (-63.50 %), ary3 from 2m3201s to 1m2361s (-46.72 %), CrystalMk from
8m5565s to 7m4285s (-13.18 %).

The current full/partial tile separation increases compile-time more than
necessary. As a result, we see in compile time regressions, for example, for 3mm
from 0m6320s to 0m9881s (56.34%). Some of this compile time increase is expected
as we generate more IR and consequently more time is spent in the LLVM backends.
However, a first investiagation has shown that a larger portion of compile time
is unnecessarily spent inside Polly's parallelism detection and could be
eliminated by propagating existing knowledge about vector loop parallelism.
Before enabling -polly-vectorizer=stripmine by default, it is necessary to
address this compile-time issue.

Contributed-by: Roman Gareev <gareevroman@gmail.com>

Reviewers: jdoerfert, grosser

Subscribers: grosser, #polly

Differential Revision: http://reviews.llvm.org/D13779

llvm-svn: 250809
2015-10-20 09:12:21 +00:00
Michael Kruse 48ea8efd59 Correct typo in CHECK line
Thanks Tobias for the hint.

llvm-svn: 250695
2015-10-19 10:51:20 +00:00
Michael Kruse dc12222287 Synthesize phi arguments in incoming block
New values were always synthesized in the block of the instruction
that needed them. This is incorrect for PHI node whose' value must be
defined in the respective incoming block. This patch temporarily moves
the builder's insert point to the incoming block while synthesizing phi
node arguments. 

This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241)

llvm-svn: 250693
2015-10-19 09:19:25 +00:00
Johannes Doerfert 9c28bfa72c [FIX] Only constant integer branch conditions are always affine
There are several different kinds of constants that could occur in a
  branch condition, however we can only handle the most interesting one
  namely constant integers. To this end we have to treat others as
  non-affine.

  This fixes bug 25244.

llvm-svn: 250669
2015-10-18 22:56:42 +00:00
Johannes Doerfert 30c2265f98 [FIX] Normalize loops outside the SCoP during schedule generation
We build the schedule based on a traversal of the region and accumulate
  information for each loop in it. The total schedule is associated with the
  loop surrounding the SCoP, though it can happen that there are blocks in the
  SCoP which are part of loops that are only partially in the SCoP. Instead of
  associating information with them (they are not part of the SCoP and
  consequently are not modeled) we have to associate the schedule information
  with the surrounding loop if any.

  This fixes bug 25240.

llvm-svn: 250668
2015-10-18 21:17:11 +00:00
Johannes Doerfert b864c2c3c9 [FIX] Do not try to hoist "empty" accesses
Accesses that have a relative offset (in bytes) that is not divisible
  by the type size (in bytes) will be represented as empty in the SCoP
  description. This is on its own not good but it also crashed the
  invariant load hoisting. This patch will fix the latter problem while
  the former should be addressed too.

  This fixes bug 25236.

llvm-svn: 250664
2015-10-18 19:50:18 +00:00
Johannes Doerfert bc7cff4c18 [FIX] Do not hoist invariant pointers with non-loaded base ptr in SCoP
If the base pointer of a load is invariant and defined in the SCoP but
  not loaded we cannot hoist the load as we would not hoist the base
  pointer definition.

  This fixes bug 25237.

llvm-svn: 250663
2015-10-18 19:49:25 +00:00
Johannes Doerfert af3e301a67 [FIX] Restructure invariant load equivalence classes
Sorting is replaced by a demand driven code generation that will pre-load a
  value when it is needed or, if it was not needed before, at some point
  determined by the order of invariant accesses in the program. Only in very
  little cases this demand driven pre-loading will kick in, though it will
  prevent us from generating faulty code. An example where it is needed is
  shown in:
    test/ScopInfo/invariant_loads_complicated_dependences.ll

  Invariant loads that appear in parameters but are not on the top-level (e.g.,
  the parameter is not a SCEVUnknown) will now be treated correctly.

Differential Revision: http://reviews.llvm.org/D13831

llvm-svn: 250655
2015-10-18 12:39:19 +00:00
Johannes Doerfert d8b6ad255f [FIX] Cast preloaded values
Preloaded values have to match the type of their counterpart in the
  original code and not the type of the base array.

llvm-svn: 250654
2015-10-18 12:36:42 +00:00
Johannes Doerfert 01978cfa0c Remove independent blocks pass
Polly can now be used as a analysis only tool as long as the code
  generation is disabled. However, we do not have an alternative to the
  independent blocks pass in place yet, though in the relevant cases
  this does not seem to impact the performance much. Nevertheless, a
  virtual alternative that allows the same transformations without
  changing the input region will follow shortly.

llvm-svn: 250652
2015-10-18 12:28:00 +00:00
Tobias Grosser b8d27aab7d Revert to original BlockGenerator::getOrCreateAlloca(MemoryAccess &Access)
Expressing this in terms of BlockGenerator::getOrCreateAlloca(const
ScopArrayInfo *Array) does not work as the MemoryAccess BasePtr is in case of
invariant load hoisting different to the ScopArrayInfo BasePtr. Until this is
investigated and fixed, we move back to code that just uses the baseptr of
MemoryAccess.

llvm-svn: 250637
2015-10-18 00:51:13 +00:00
Tobias Grosser a4f0988df5 BlockGenerator: Add getOrCreateAlloca(const ScopArrayInfo *Array)
This allows the caller to get the alloca locations of an array without the
need to thank if Array is a PHI or a non-PHI Array. We directly make use of this
in BlockGenerator::getOrCreateAlloca(MemoryAccess &Access).

llvm-svn: 250628
2015-10-17 22:16:00 +00:00
Michael Kruse 8c1e5ec86a Return nullptr if MemoryAccess list is empty
Other places (e.g. hoistInvariantLoads) assume that an empty lookup
will return nullptr. The situation can currently not arise because
MemoryAccesses are not removed before hoistInvariantLoads.

llvm-svn: 250627
2015-10-17 22:11:07 +00:00
Tobias Grosser 05d7fa79b6 Format comment properly
While clang-format takes care that the line-length is not surpassed, the
resulting comments sometimes look not optimal. We re-flow the text in the
comment to avoid these ugly single-word lines.

llvm-svn: 250626
2015-10-17 21:46:28 +00:00
Michael Kruse 225f0d1ee2 Load/Store scalar accesses before/after the statement itself
Instead of generating implicit loads within basic blocks, put them 
before the instructions of the statment itself, including non-affine 
subregions. The region's entry node is dominating all blocks in the 
region and therefore the loaded value will be available there.

Implicit writes in block-stmts were already stored back at the end of 
the block. Now, also generate the stores of non-affine subregions when 
leaving the statement, i.e. in the exiting block.

This change is required for array-mapped implicits ("De-LICM") to 
ensure that there are no dependencies of demoted scalars within 
statments. Statement load all required values, operator on copied in 
registers, and then write back the changed value to the demoted memory. 
Lifetimes analysis within statements becomes unecessary.

Differential Revision: http://reviews.llvm.org/D13487

llvm-svn: 250625
2015-10-17 21:36:00 +00:00
Michael Kruse 01cb379fed Avoid unnecessay .s2a write access when used only in PHIs
Accesses for exit node phis will be handled separately by 
buildPHIAccesses if there is more than one exiting edge, 
buildScalarDependences does not need to create additional SCALAR 
accesses.

This is a corrected version of r250517, which was reverted in r250607.

Differential Revision: http://reviews.llvm.org/D13848

llvm-svn: 250622
2015-10-17 21:07:08 +00:00
Tobias Grosser ffa2446f28 BlockGenerator: Register outside users of scalars directly
Instead of checking at code generation time for each ScopStmt if a scalar has
external uses, we just iterate over the ScopArrayInfo descriptions we have and
check each of these for possible external uses.

Besides being somehow clearer, this approach has the benefit that we will always
create valid LLVM-IR even in case we disable the code generation of ScopStmt
bodies e.g. for testing purposes.

llvm-svn: 250608
2015-10-17 08:54:13 +00:00
Tobias Grosser 3839b422e6 Revert "Avoid unnecessay .s2a write access when used only in PHIs"
This reverts commit r250606 due to some bugs it introduced. After these bugs
have been resolved, we will add it back to tree.

llvm-svn: 250607
2015-10-17 08:54:05 +00:00
Tobias Grosser 26e59ee746 Drop unused parameter from handleOutsideUsers
llvm-svn: 250606
2015-10-17 08:25:54 +00:00
Michael Kruse e71893d580 Add testcase for r250517
llvm-svn: 250518
2015-10-16 15:17:26 +00:00
Michael Kruse aeceab770e Avoid unnecessay .s2a write access when used only in PHIs
PHI accesses will be handled separately by buildPHIAccesses,
buildScalarDependences does not need to create additional accesses.

llvm-svn: 250517
2015-10-16 15:14:40 +00:00
Tobias Grosser b860289dbd Add ScopInfo test case for r250411
llvm-svn: 250439
2015-10-15 18:26:06 +00:00
Tobias Grosser 473a5c3253 test: Correctly check for branch statements
In r250408 'CHECK-NEXT: br' lines were removed as they also matched a
'%polly.subregion.iv.inc' instruction and did consequently not check what they
were supposed to check. However, without these lines we can not test that the
.s2a instructions that are not any more generated since r250411 really are not
emitted. Hence, we add back the CHECK-NEXT lines to ensure there are really no
instructions generated between the store that we check for and the branch at the
end of the basic block. To ensure we do not match too early, we now check for
'br i1' or 'br label'.

llvm-svn: 250435
2015-10-15 18:04:20 +00:00
Michael Kruse 668af71b82 Do not add accesses for intra-ScopStmt scalar def-use chains
When pulling a llvm::Value to be written as a PHI write, the former
code did only check whether it is within the same basic block, but it
could also be the same non-affine subregion. In that case some 
unecessary pair of MemoryAccesses would have been created.

Two unit test were explicitely checking for the unecessary writes,
including the comments that the writes are unecessary. 

llvm-svn: 250411
2015-10-15 14:45:48 +00:00
Michael Kruse 90428328ee Remove "CHECK: br" from some unit tests
They happen to match
%polly.subregion.iv.inc = add i32 %polly.subregion.iv, 1
         ^^                                ^^
that is, are misleading in what they actually check.

llvm-svn: 250408
2015-10-15 14:40:40 +00:00
Tobias Grosser 5da48f392c Add -sort-includes to our automatic source code formatting
llvm-svn: 250393
2015-10-15 12:18:37 +00:00
Tobias Grosser 0e3a6b13a4 Sort includes using 'clang-format -sort-includes'
llvm-svn: 250392
2015-10-15 12:17:36 +00:00
Michael Kruse b987be12bf Add testcase for SCEV explansion in non-affine subregions
When sharing the same map from old to new value, CodeGeneration would
reuse the same new value for each basic block. However, the SCEV
expander might emit code in a basic block that does not dominate a use
of the SCEV in another basic block. This test checks whether both such
blocks have their own expanded new values. 

llvm-svn: 250389
2015-10-15 10:40:14 +00:00
Tobias Grosser 6b948d5efb [tests] More testing for PHI-nodes in non-affine regions
We harden one test case by ensuring no additional stores may possibly be
introduced between the stores we check for and the basic block terminator
statements.

We also add a test case for the situation where a value that is passed from
a non-affine region to a PHI node does not dominate the exit of the non-affine
region. This case has come up in patch reviews, so we make sure it is properly
handled today and in the future.

llvm-svn: 250217
2015-10-13 20:03:09 +00:00
Tobias Grosser f30be2f370 RegisterPasses: Optionally run inliner before Polly
This will allow us to optimize C++ template code with Polly. This support is
mostly for debugging purpose and individual experiments. The ultimate goal is
still to run Polly later in the pass manager when inlining already happened.

llvm-svn: 250092
2015-10-12 20:03:44 +00:00
Tobias Grosser d17183f20f Use EP_ModuleOptimizerEarly to run early polly passes,
instead of llvm::PassManagerBuilder::EP_EarlyAsPossible. This will allow us
to run actual module passes in Polly's canonicalization sequence, but should
otherwise have only little impact.

llvm-svn: 250091
2015-10-12 20:03:41 +00:00
Tobias Grosser e2c8275346 ScopInfo: Allow simple 'AddRec * Parameter' products in delinearization
We also allow such products for cases where 'Parameter' is loaded within the
scop, but where we can dynamically verify that the value of 'Parameter' remains
unchanged during the execution of the scop.

This change relies on Polly's new RequiredILS tracking infrastructure recently
contributed by Johannes.

llvm-svn: 250019
2015-10-12 08:02:30 +00:00
Tobias Grosser 3c037fd396 Delete leftover function declaration
The function's definition was already removed in r247289.

llvm-svn: 250015
2015-10-12 05:14:03 +00:00
Tobias Grosser 1c55e2b7f3 Also add -polly-no-early-exit back until LNT is restarted
llvm-svn: 249975
2015-10-11 13:49:27 +00:00
Tobias Grosser 3ecb71696b Add back -polly-detect-unprofitable as alias of -polly-process-unprofitable
This flag was still used in our LNT server. We leave it until it has been
removed from LNT as well.

llvm-svn: 249973
2015-10-11 13:39:17 +00:00
Johannes Doerfert 9b1f9c8b61 Allow eager evaluated binary && and || conditions
The domain generation can handle lazy && and || by default but eager
  evaluated expressions were dismissed as non-affine. With this patch we
  will allow arbitrary combinations of and/or bit-operations in the
  conditions of branches.

Differential Revision: http://reviews.llvm.org/D13624

llvm-svn: 249971
2015-10-11 13:21:03 +00:00