Commit Graph

352 Commits

Author SHA1 Message Date
Tobias Grosser a2fd441989 Rebase C++ bindings on top of latest isl bindings
The main difference in this change is that isl_stat is now always
checked by default. As we elminiated most used of isl_stat, thanks to
Philip Pfaffe's implementation of foreach, only a small set of changes
is needed.

This change does not include the following recent changes to isl's C++
bindings:

  - stricter error handling for isl_bool
  - dropping of the isl::namespace qualifiers

The former requires a larger patch in Polly and consequently should go
through a patch-review. The latter will be applied in the next commit to
keep this commit free from noise.

We also still apply a couple of other changes on top of the official isl
bindings. This delta is expected to shrink over time.

llvm-svn: 338504
2018-08-01 09:57:10 +00:00
Tobias Grosser bbb510b18d [ZoneAlgo] Replace isl foreach calls with for loops
llvm-svn: 337245
2018-07-17 06:33:41 +00:00
Tobias Grosser c253931fcf [FlattenSchedule] Replace isl foreach calls with for loops
llvm-svn: 337244
2018-07-17 06:33:37 +00:00
Tobias Grosser 3867bae74b [MaximalStaticExpansion] Replace isl foreach calls with for loops
llvm-svn: 337243
2018-07-17 06:33:34 +00:00
Tobias Grosser 91f851b11a [ForwardOpTree] Replace isl foreach calls with for loops
llvm-svn: 337242
2018-07-17 06:33:31 +00:00
Tobias Grosser a33871686f [Simplify] Replace isl foreach calls with for loops
llvm-svn: 337241
2018-07-17 06:33:26 +00:00
Tobias Grosser 9d8913020d [FlattenAlgo] Replace more isl foreach calls with for loops
This time we replace for loops where the return isl::stat::error has
been used to carry status information.

There are still two uses of foreach remaining as we do not have a
corresponding for implementation for pw_aff functions.

llvm-svn: 337239
2018-07-17 06:16:58 +00:00
Tobias Grosser 6106595ac1 [FlattenAlgo] Replace some isl foreach calls with for loops
Replace foreach calls which only return 'ok' with for loops.

llvm-svn: 337238
2018-07-17 06:11:53 +00:00
Tobias Grosser d43114f880 Use range for in normalizeValInst [NFCI]
llvm-svn: 335971
2018-06-29 13:06:44 +00:00
Michael Kruse 96da1ca584 [ZoneAlgo] Use getDefToTarget in makeValInst. NFC.
Move the optimized getDefToTarget() from ForwardOpTree to ZoneAlgo such
that it can be used by makeValInst.

This reduces the compile time of GrTestUtils of the aosp buildbot from
2m46s to 21s, which should fix the timeout issue.

Differential Revision: https://reviews.llvm.org/D48579

llvm-svn: 335606
2018-06-26 14:29:09 +00:00
Michael Kruse 2dab88e652 [OpTree] Introduce shortcut for computing the def->target mapping. NFCI.
In case the schedule has not changed and the operand tree root uses a
value defined in an ancestor loop, the def-to-target mapping is trivial.
For instance, the SCoP

    for (int i < 0; i < N; i+=1) {
    DefStmt:
      D = ...;
      for (int j < 0; j < N; j+=1) {
    TargetStmt:
        use(D);
      }
    }

has DefStmt-to-TargetStmt mapping of

    { DefStmt[i] -> TargetStmt[i,j] }

This should apply on the majority of def-to-target mappings.
This patch detects this case and directly constructs the expected
mapping. It assumes that the mapping never crosses the loop header
DefStmt is in, which ForwardOpTree does not support at the moment
anyway.

Differential Revision: https://reviews.llvm.org/D47752

llvm-svn: 334134
2018-06-06 21:37:35 +00:00
Tobias Grosser 6a6d9df78e getDependences to new C++ interface
Reviewers: Meinersbur, grosser, bollu, cs15btech11044, jdoerfert

Reviewed By: grosser

Subscribers: pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D47786

llvm-svn: 334092
2018-06-06 13:10:32 +00:00
David Blaikie 4490465db7 Update for a header file move in LLVM
llvm-svn: 333956
2018-06-04 21:23:32 +00:00
Michael Kruse d51fbfca46 [ZoneAlgo] Make ZoneAlgorithm::isNormalized out-of-quota safe.
The aosp-O3-polly-before-vectorizer-unprofitable buildbot currently
fails in ZoneAlgorithm::isNormalized, presumably because an
out-of-quota happens in that function.

Modify ZoneAlgorithm::isNormalized to return an isl::boolean such
it can report an error.

In the failing case, it was called in an assertion in ForwardOpTree.
Allow to pass the assertion in an out-of-quota event, a condition that
is later checked before forwarding an operand tree.

llvm-svn: 333709
2018-05-31 22:44:23 +00:00
Michael Kruse d3ce899ddf [ForwardOpTree] Use less computationally expensive method to compute def-to-target map. NFCI.
When forwarding a LoadInst to another statement, a map that translates
their domain is needed. Before this patch, is was computed by appending
the def-to-use map to the def-to-target of the operand tree's target.

This patch lets the new method getDefToTarget do this. This is
computationally less expensive due to:

 * Caching of the result such that it can be used for multiple operands
   tree to the same target.

 * The map is only computed when there is a LoadInst that needs it.

 * It is only computed for the statement requiring the translator map,
   instead of having an intermediate result for every edge in the
  operand tree.

The downside is that this scheme cannot handle forwarding from a
previous loop iteration (which would require the entire path from
statement to target). Since ForwardOpTree currently does not support
forwarding across loop iterations (SCEV expressions would need to be
transformed), this was not needed anyway.

Differential Revision: https://reviews.llvm.org/D47385

llvm-svn: 333426
2018-05-29 15:19:17 +00:00
Nicola Zaghen 349506a926 [polly] Update uses of DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM

Differential Revision: https://reviews.llvm.org/D44978

llvm-svn: 332352
2018-05-15 13:37:17 +00:00
Tobias Grosser d3d3d6b75d Remove the last uses of isl::give and isl::take
llvm-svn: 331126
2018-04-29 00:28:26 +00:00
Tobias Grosser da3e8c4ba7 [DeLICM] Remove uses of isl::give
llvm-svn: 331122
2018-04-28 22:11:55 +00:00
Tobias Grosser daf68ea309 [ZoneAlgo] Remove uses of isl::give - II
llvm-svn: 331121
2018-04-28 22:11:48 +00:00
Tobias Grosser 2f549fd6a9 [ZoneAlgo] Remove uses of isl::give
This moves more of Polly to islpp.

llvm-svn: 331120
2018-04-28 21:22:17 +00:00
Tobias Grosser 77e871aaf5 [MaximalStaticExpansion] Replace copied function with version from ISLTools
llvm-svn: 331118
2018-04-28 20:42:35 +00:00
David Blaikie 60dc462b04 Fixup Polly for an LLVM header file change.
llvm-svn: 330679
2018-04-24 02:23:41 +00:00
Reid Kleckner 757c8cf615 Fix polly build after r328717
llvm-svn: 328728
2018-03-28 19:56:26 +00:00
David Blaikie fd94eee3b9 Update for LLVM header movement
llvm-svn: 328169
2018-03-21 23:21:10 +00:00
Tobias Grosser 3a99893618 Adjust to clang-format changes
llvm-svn: 328005
2018-03-20 17:16:32 +00:00
Tobias Grosser 718d04c653 Use isl::manage_copy to simplify calls to isl::manage(isl_.._copy())
As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) pattern
are eliminated as well.

We checked for all potential cleanups by scanning for:

  "grep -R isl::manage\( lib/ | grep copy"

llvm-svn: 325558
2018-02-20 07:26:58 +00:00
Michael Kruse 1a745a4ef6 Run clang-format after r324003. NFC.
llvm-svn: 324112
2018-02-02 18:11:58 +00:00
Benjamin Kramer e65c7bbe8a Update polly for r323999.
llvm-svn: 324003
2018-02-01 20:49:53 +00:00
Davide Italiano b0c7dee0b6 [MaximalStaticExpansion] Simplify this code a bit. NFCI.
llvm-svn: 318988
2017-11-25 23:01:31 +00:00
Michael Kruse 58166b13e0 Run polly-update-format. NFC.
polly-check-format has been failing since at least r318517,
due to more than one cause.

llvm-svn: 318795
2017-11-21 19:25:26 +00:00
Philip Pfaffe 00fd43b327 Port ScopInfo to the isl cpp bindings
Summary:
Most changes are mechanical, but in one place I changed the program semantics
by fixing a likely bug:

In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the
error-case. Before, when the call to `addNonEmptyDomainConstraints()`
returned a null set, this (probably) accidentally worked because
isl_bool_error converts to true. I'm checking for nullptr now.

Reviewers: grosser, Meinersbur, bollu

Reviewed By: Meinersbur

Subscribers: nemanjai, kbarton, pollydev, llvm-commits

Differential Revision: https://reviews.llvm.org/D39971

llvm-svn: 318632
2017-11-19 22:13:34 +00:00
Zhaoshi Zheng ceec175dff [NFC] Make r318597 compatible with clang-format
llvm-svn: 318561
2017-11-17 22:05:19 +00:00
Philip Pfaffe 2813ce228b [nfc] Iwyu: forward-declare/include raw_ostream in zone algo
llvm-svn: 318517
2017-11-17 11:34:29 +00:00
Michael Kruse 4d3f3c7206 [ForwardOpTree] Limit isl operations of known content reload.
Put the analysis part of reloadKnownContent under an isl
max-operations quota scope, as has already been done for
forwardKnownLoad.

This should fix the aosp timeout of "GrTestUtils.cpp".

llvm-svn: 317495
2017-11-06 17:48:14 +00:00
Michael Kruse 68821a8b91 [ZoneAlgo/ForwardOpTree] Normalize PHIs to their known incoming values.
Represent PHIs by their incoming values instead of an opaque value of
themselves. This allows ForwardOpTree to "look through" the PHIs and
forward the incoming values since forwardings PHIs is currently not
supported.

This is particularly useful to cope with PHIs inserted by GVN LoadPRE.
The incoming values all resolve to a load from a single array element
which then can be forwarded.

It should in theory also reduce spurious conflicts in value mapping
(DeLICM), but I have not yet found a profitable case yet, so it is
not included here.

To avoid transitive closure and potentially necessary overapproximations
of those, PHIs that may reference themselves are excluded from
normalization and keep their opaque self-representation.

Differential Revision: https://reviews.llvm.org/D39333

llvm-svn: 317008
2017-10-31 16:11:46 +00:00
Michael Kruse ff426d974d [DeLICM] Fix wrong assumed access execution order.
ForwardOpTree may already transform a scalar access to an array
accesses. The access remains implicit (isOriginalScalarKind(), meaning
that the access is always executed at the begin/end of a statement), but
targets an array (isLatestArrayKind(), which is unrelated to whether the
execution is implicit/explicit).

Fix by properly using isOriginalXXX() to determine execution order.

This fixes the buildbots on MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG.

llvm-svn: 316995
2017-10-31 12:50:25 +00:00
Michael Kruse cc6ea8e74f [ForwardOpTree] Use space indention. NFC.
llvm-svn: 316769
2017-10-27 14:48:34 +00:00
Michael Kruse 822dfe271b [ForwardOpTree] Reload know values.
For scalar accesses, change the access target to an array element that
is known to contain the same value.

This may become an alternative to forwardKnownLoad which creates new
loads (and therefore closer to forwarding speculatives). Reloading does
not require the known value originating from a load, but can be a store
as well.

Differential Revision: https://reviews.llvm.org/D39325

llvm-svn: 316766
2017-10-27 14:26:14 +00:00
Michael Kruse 983fa9bf23 [ZoneAlgo] Translate addArrayWriteAccess to isl++. NFC.
llvm-svn: 316459
2017-10-24 16:40:34 +00:00
Michael Kruse 19cd61dc11 [DeLICM] Do not try to map to multiple array elements.
Add check and skip when the store used to determine the target accesses
multiple array elements. Only a single array location should for
mapping the scalar. Having multiple creates problems when deciding which
element to load from. While MemoryAccess::getAddressFunction() should
select just one of them, other problems arise in code that assumes
that there is just one target element per statement instance.

This fixes llvm.org/PR34989

This also reverts r313902 which fixed llvm.org/PR34485 also caused by
a non-functional target array element. This patch avoids the situation
to occur in the first place.

llvm-svn: 316432
2017-10-24 13:05:24 +00:00
Adam Nemet e0f1541f41 Rename OptimizationDiagnosticInfo.h to OptimizationRemarkEmitter.h
Polly version of r315249 on LLVM trunk.

llvm-svn: 315253
2017-10-09 23:49:08 +00:00
Michael Kruse e276e9f324 [ForwardOpTree] Fix out-of-quota in assertion.
llvm-svn: 314661
2017-10-02 11:41:06 +00:00
Michael Kruse ed787e7540 [Polly] Add dumpPw() and dumpExpanded() functions. NFC.
These functions print a multi-line and sorted representation of unions
of polyhedra. Each polyhedron (basic_{ast/map}) has its own line.
First sort key is the polyhedron's hierachical space structure.
Secondary sort key is the lower bound of the polyhedron, which should
ensure that the polyhedral are printed in approximately ascending order.

Example output of dumpPw():
[p_0, p_1, p_2] -> {
  Stmt0[0] -> [0, 0];
  Stmt0[i0] -> [i0, 0] : 0 < i0 <= 5 - p_2;
  Stmt1[0] -> [0, 2] : p_1 = 1 and p_0 = -1;
  Stmt2[0] -> [0, 1] : p_1 >= 3 + p_0;
  Stmt3[0] -> [0, 3];
}

In contrast dumpExpanded() prints each point in the sets, unless there
is an unbounded dimension that cannot be expandend.
This is useful for reduced test cases where the loop counts are set to
some constant to understand a bug.

Example output of dumpExpanded(
{ [MemRef_A[i0] -> [i1]] : (exists (e0 = floor((1 + i1)/3): i0 = 1 and
3e0 <= i1 and 3e0 >= -1 + i1 and i1 >= 15 and i1 <= 25)) or (exists (e0
= floor((i1)/3): i0 = 0 and 3e0 < i1 and 3e0 >= -2 + i1 and i1 > 0 and
i1 <= 11)) }):

{
  [MemRef_A[0] ->[1]];
  [MemRef_A[0] ->[2]];
  [MemRef_A[0] ->[4]];
  [MemRef_A[0] ->[5]];
  [MemRef_A[0] ->[7]];
  [MemRef_A[0] ->[8]];
  [MemRef_A[0] ->[10]];
  [MemRef_A[0] ->[11]];
  [MemRef_A[1] ->[15]];
  [MemRef_A[1] ->[16]];
  [MemRef_A[1] ->[18]];
  [MemRef_A[1] ->[19]];
  [MemRef_A[1] ->[21]];
  [MemRef_A[1] ->[22]];
  [MemRef_A[1] ->[24]];
  [MemRef_A[1] ->[25]]
}

Differential Revision: https://reviews.llvm.org/D38349

llvm-svn: 314525
2017-09-29 15:45:40 +00:00
Michael Kruse bfca5f4334 [DeLICM] Allow non-injective PHIRead->PHIWrite mapping.
Remove an assertion that tests the injectivity of the
PHIRead -> PHIWrite relation.  That is, allow a single PHI write to be
used by multiple PHI reads.  This may happen due to some statements
containing the PHI write not having the statement instances that would
overwrite the previous incoming value due to (assumed/invalid) contexts.
This result in that PHI write is mapped to multiple targets which is not
supported.  Codegen will select one one of the targets using
getAddressFunction().  However, the runtime check should protect us from
this case ever being executed.

We therefore allow injective PHI relations.  Additional calculations to
detect/santitize this case would probably not be worth the compuational
effort.

This fixes llvm.org/PR34485

llvm-svn: 313902
2017-09-21 19:08:23 +00:00
Michael Kruse 0e370cf1a7 Check whether IslAstInfo and DependenceInfo were computed for the same Scop.
Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo,
we might get those analysis that were computed by a different ScopInfo for a
different Scop structure. This would be unfortunate because DependenceInfo and
IslAstInfo hold references to resources allocated by
ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and
DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable
things can happen.

When the ScopInfo/Scop object is freed, there is a high probability that the
new ScopInfo/Scop object will be created at the same heap position with the
same address. Comparing whether the Scop or ScopInfo address is the expected
therefore is unreliable.

Instead, we compare the address of the isl_ctx object. Both, DependenceInfo
and IslAstInfo must hold a reference to the isl_ctx object to ensure it is
not freed before the destruction of those analyses which might happen after
the destruction of the Scop/ScopInfo they refer to.  Hence, the isl_ctx
will not be freed and its address not reused as long there is a
DependenceInfo or IslAstInfo around.

This fixes llvm.org/PR34441

llvm-svn: 313842
2017-09-21 00:01:13 +00:00
Michael Kruse 8dceb76066 [ScheduleOptimizer] Fix and test schedule tree statistics.
Fix walking over the schedule tree to collect its properties
(Number of permutable bands etc.).

Also add regression tests for these statistics.

llvm-svn: 313750
2017-09-20 11:53:05 +00:00
Michael Kruse 89972e21f8 [ForwardOpTree] Allow out-of-quota in examination part of forwardTree.
Computing the reaching definition in forwardTree() can take a long time
if the coefficients are large. When the forwarding is
carried-out (doIt==true), forwardTree() must execute entirely or not at
all to get a consistent output, which means we cannot just allow
out-of-quota errors to happen in the middle of the processing.

We introduce the class IslQuotaScope which allows to opt-in code that is
conformant and has been tested with out-of-quota events. In case of
ForwardOpTree, out-of-quota is allowed during the operand tree
examination, but not during the transformation. The same forwardTree()
recursion is used for examination and execution, meaning that the
reaching definition has already been computed in the examination tree
walk and cached for reuse in the transformation tree walk.

This should fix the time-out of grtestutils.ll of the asop buildbot. If
the compilation still takes too long, we can reduce the max-operations
allows for -polly-optree.

Differential Revision: https://reviews.llvm.org/D37984

llvm-svn: 313690
2017-09-19 22:53:20 +00:00
Michael Kruse ef8325ba50 [ForwardOpTree] Test the max operations quota.
cl::opt<unsigned long> is not specialized and hence the option
-polly-optree-max-ops impossible to use.

Replace by supported option cl::opt<unsigned>.

Also check for an error state when computing the written value, which
happens when the quota runs out.

llvm-svn: 313546
2017-09-18 17:43:50 +00:00
Michael Kruse ad32de9424 [ForwardOptTree] Remove redundant simplify(). NFC.
The result of computeKnown has already been simplified.

llvm-svn: 313526
2017-09-18 12:28:07 +00:00
Roman Gareev 925ce50f1b Unroll and separate the remaining parts of isolation
The remaining parts produced by the full partial tile isolation can contain
hot spots that are worth to be optimized. Currently, we rely on the simple
loop unrolling pass, LiCM and the SLP vectorizer to optimize such parts.
However, the approach can suffer from the lack of the information about
aliasing that Polly provides using additional alias metadata or/and the lack
of the information required by simple loop unrolling pass.

This patch is the first step to optimize the remaining parts. To do it, we
unroll and separate them. In case of, for instance, Intel Kaby Lake, it helps
to increase the performance of the generated code from 39.87 GFlop/s to
49.23 GFlop/s.

The next possible step is to avoid unrolling performed by Polly in case of
isolated and remaining parts and rely only on simple loop unrolling pass and
the Loop vectorizer.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D37692

llvm-svn: 312929
2017-09-11 17:46:47 +00:00