Commit Graph

3667 Commits

Author SHA1 Message Date
Mandeep Singh Grang 02e789c9bf [polly] Remove redundant return [NFC]
Reviewers: grosser, bollu

Reviewed By: grosser

Subscribers: nemanjai, kbarton, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D39916

llvm-svn: 317922
2017-11-10 20:33:08 +00:00
Michael Kruse 3a1e4bfb53 Update formatting to reflect change in clang-format. NFC.
clang-format has changed its algorithm
for sorting includes in r317794.

llvm-svn: 317808
2017-11-09 16:33:29 +00:00
Michael Kruse 4d3f3c7206 [ForwardOpTree] Limit isl operations of known content reload.
Put the analysis part of reloadKnownContent under an isl
max-operations quota scope, as has already been done for
forwardKnownLoad.

This should fix the aosp timeout of "GrTestUtils.cpp".

llvm-svn: 317495
2017-11-06 17:48:14 +00:00
Sanjay Patel 1b5114fa52 [Analysis] update to use new fast-math API - isFast()
llvm-svn: 317491
2017-11-06 16:52:31 +00:00
Florian Hahn 6720a089fd [Polly] Fix using order, as this caused a test failure (NFC)
Summary:
Without this patch, clang-format in check-polly fails for me, with current master:

```
FAILED: cd build/tools/polly && build/bin/clang-format -sort-includes -style=llvm llvm/tools/polly/include/polly/ScopPass.h | diff -u llvm/tools/polly/include/polly/ScopPass.h -
--- llvm/tools/polly/include/polly/ScopPass.h	2017-11-06 14:05:49.885345000 +0000
+++ -	2017-11-06 14:07:24.956241758 +0000
@@ -40,12 +40,12 @@
 } // namespace polly

 namespace llvm {
+using polly::SPMUpdater;
 using polly::Scop;
 using polly::ScopAnalysisManager;
 using polly::ScopAnalysisManagerFunctionProxy;
 using polly::ScopInfo;
 using polly::ScopStandardAnalysisResults;
-using polly::SPMUpdater;

 template <>
 class InnerAnalysisManagerProxy<ScopAnalysisManager, Function>::Result {
```

Reviewers: grosser, Meinersbur, bollu

Reviewed By: Meinersbur

Subscribers: llvm-commits, pollydev

Differential Revision: https://reviews.llvm.org/D39683

llvm-svn: 317478
2017-11-06 14:26:04 +00:00
Michael Kruse 68821a8b91 [ZoneAlgo/ForwardOpTree] Normalize PHIs to their known incoming values.
Represent PHIs by their incoming values instead of an opaque value of
themselves. This allows ForwardOpTree to "look through" the PHIs and
forward the incoming values since forwardings PHIs is currently not
supported.

This is particularly useful to cope with PHIs inserted by GVN LoadPRE.
The incoming values all resolve to a load from a single array element
which then can be forwarded.

It should in theory also reduce spurious conflicts in value mapping
(DeLICM), but I have not yet found a profitable case yet, so it is
not included here.

To avoid transitive closure and potentially necessary overapproximations
of those, PHIs that may reference themselves are excluded from
normalization and keep their opaque self-representation.

Differential Revision: https://reviews.llvm.org/D39333

llvm-svn: 317008
2017-10-31 16:11:46 +00:00
Michael Kruse ff426d974d [DeLICM] Fix wrong assumed access execution order.
ForwardOpTree may already transform a scalar access to an array
accesses. The access remains implicit (isOriginalScalarKind(), meaning
that the access is always executed at the begin/end of a statement), but
targets an array (isLatestArrayKind(), which is unrelated to whether the
execution is implicit/explicit).

Fix by properly using isOriginalXXX() to determine execution order.

This fixes the buildbots on MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG.

llvm-svn: 316995
2017-10-31 12:50:25 +00:00
Michael Kruse 06618bf71a [OpenMP] Fix reference collection of latest base ptrs.
When collecting base pointers that need to be made available in parallel
subfunctions, use the base pointer associated with the latest
ScopArrayInfo, instead of the original one.

llvm-svn: 316983
2017-10-31 10:28:22 +00:00
Philip Pfaffe 53c803871e [Acc] Do not statically dispatch into IslNodeBuilder's createFor
Summary:
When GPUNodeBuilder creates loops inside the kernel, it dispatches to
IslNodeBuilder. This however is surprisingly dangerous, since it accesses the
AST Node's user through the wrong type. This patch fixes this problem by
overriding createFor correctly.

This fixes PR35010.

Reviewers: grosser, bollu, Meinersbur

Reviewed By: Meinersbur

Subscribers: Meinersbur, nemanjai, pollydev, llvm-commits, kbarton

Differential Revision: https://reviews.llvm.org/D39364

llvm-svn: 316872
2017-10-29 21:36:34 +00:00
Philip Pfaffe 9b1d1e6ae7 Fix two testcases. NFC intended.
Add missing %loadPolly directive to support out of tree builds. One of
the changes is somewhat bigger, because the directive turns on LLVM
names, and the testcase deosn't use those.

llvm-svn: 316870
2017-10-29 21:00:48 +00:00
Michael Kruse cc6ea8e74f [ForwardOpTree] Use space indention. NFC.
llvm-svn: 316769
2017-10-27 14:48:34 +00:00
Michael Kruse 822dfe271b [ForwardOpTree] Reload know values.
For scalar accesses, change the access target to an array element that
is known to contain the same value.

This may become an alternative to forwardKnownLoad which creates new
loads (and therefore closer to forwarding speculatives). Reloading does
not require the known value originating from a load, but can be a store
as well.

Differential Revision: https://reviews.llvm.org/D39325

llvm-svn: 316766
2017-10-27 14:26:14 +00:00
Michael Kruse b6b65834a1 [Simplify] Mark (and sweep) based on latest access relation.
Previously we marked scalars based on the original access function. However,
when a scalar read access is redirected, the original definition
(or incoming values of a PHI) is not used anymore, and can be deleted
(unless referenced by use that has not been redirected).

llvm-svn: 316660
2017-10-26 12:34:36 +00:00
Michael Kruse 37d57dac63 [DeLICM] Add more tests for loop layouts. NFC.
llvm-svn: 316642
2017-10-26 08:03:28 +00:00
Michael Kruse 983fa9bf23 [ZoneAlgo] Translate addArrayWriteAccess to isl++. NFC.
llvm-svn: 316459
2017-10-24 16:40:34 +00:00
Michael Kruse 25bd602b7a [ISLTools] Translate computeReachingWrite to isl++. NFC.
llvm-svn: 316445
2017-10-24 15:19:46 +00:00
Michael Kruse 19cd61dc11 [DeLICM] Do not try to map to multiple array elements.
Add check and skip when the store used to determine the target accesses
multiple array elements. Only a single array location should for
mapping the scalar. Having multiple creates problems when deciding which
element to load from. While MemoryAccess::getAddressFunction() should
select just one of them, other problems arise in code that assumes
that there is just one target element per statement instance.

This fixes llvm.org/PR34989

This also reverts r313902 which fixed llvm.org/PR34485 also caused by
a non-functional target array element. This patch avoids the situation
to occur in the first place.

llvm-svn: 316432
2017-10-24 13:05:24 +00:00
Anna Thomas 0026d91437 [Polly] Add XFAIL to large-numbers-in-boundary-context.ll
After rL315683 (improve SCEV to calculate max BETakenCount when end
bound of loop is variant and loop is of form {Start,+1, Stride} LT End)
this test in polly started failing.
However, as discussed in https://reviews.llvm.org/rL315683,
this polly test is not a loops bound test and the MaxBECount calculated by
SCEV looks correct. The max BECount is the value calculated even when the end
bound of loop is invariant.

As discussed with Tobias offline, I'm marking this as an XFAIL, until he
gets a chance to update the testcase, so the build bot goes to green.

llvm-svn: 315912
2017-10-16 15:12:39 +00:00
Adam Nemet e0f1541f41 Rename OptimizationDiagnosticInfo.h to OptimizationRemarkEmitter.h
Polly version of r315249 on LLVM trunk.

llvm-svn: 315253
2017-10-09 23:49:08 +00:00
Michael Kruse cc345e6e94 [ScopBuilder] Introduce -polly-stmt-granularity=scalar-indep option.
The option splits BasicBlocks into minimal statements such that no
additional scalar dependencies are introduced.

The algorithm is based on a union-find structure, and unites sets if
putting them into separate statements would introduce a scalar
dependencies. As a consequence, instructions may be split into separate
statements such their relative order is different than the statements
they are in. This is accounted for instructions whose relative order
matters (e.g. memory accesses).

The algorithm is generic in that heuristic changes can be made
relatively easily. We might relax the order requirement for read-reads
or accesses to different base pointers. Forwardable instructions can be
made to not cause a join.

This implementation gives us a speed-up of 82% in SPEC 2006 456.hmmer
benchmark by allowing loop-distribution in a hot loop such that one of
the loops can be vectorized.

Differential Revision: https://reviews.llvm.org/D38403

llvm-svn: 314983
2017-10-05 13:43:00 +00:00
Michael Kruse 482d3f41e5 [ScopBuilder] Introduce -polly-stmt-granularity option. NFC.
The option is introduced with only one possible value
-polly-stmt-granularity=bb which represents the current behaviour, which
is outlined into the new function buildSequentialBlockStmts().

More options will be added in future commits.

llvm-svn: 314900
2017-10-04 12:18:57 +00:00
Tobias Grosser c52b71db15 [GPGPU] Make sure escaping invariant load hoisted scalars are preserved
We make sure that the final reload of an invariant scalar memory access uses the
same stack slot into which the invariant memory access was stored originally.
Earlier, this was broken as we introduce a new stack slot aside of the preload
stack slot, which remained uninitialized and caused our escaping loads to
contain garbage. This happened due to us clearing the pre-populated values
in EscapeMap after kernel code generation. We address this issue by preserving
the original host values and restoring them after kernel code generation.
EscapeMap is not expected to be used during kernel code generation, hence we
clear it during kernel generation to make sure that any unintended uses are
noticed.

llvm-svn: 314894
2017-10-04 10:24:23 +00:00
Jakub Kuderski 119753ad14 UnXFAIL tests that previously failed VerifyDFSNumbers
They started passing again by the DT::eraseNode fix in r314847.

llvm-svn: 314850
2017-10-03 21:23:56 +00:00
Jakub Kuderski 3c3bf74022 XFAIL two test that fail VerifyDFSNumbers DominatorTree check
This test XFAILs two test that start to fail when verifying DT's
DFS numbers, as per Tobias' suggestion.

Related VerifyDFSNumbers patch: D38331.

llvm-svn: 314800
2017-10-03 14:31:53 +00:00
Michael Kruse 4ee19603e9 [ScopBuilder] Iterate over statement instructions. NFC.
Iterate over statement instructions instead over basic block
instructions when creating MemoryAccesses. It allows making the creation
of MemoryAccesses independent of how the basic blocks are split into
multiple ScopStmts.

llvm-svn: 314665
2017-10-02 11:41:33 +00:00
Michael Kruse f5745b4e7d [ScopBuilder] Build invariant loads separately.
Create the MemoryAccesses of invariant loads separately and before
all other MemoryAccesses.

Invariant loads are classified as synthesizable and therefore are not
contained in any statement. When iterating over all instructions of all
statements, the invariant loads are consequently not processed and
iterating over them separately becomes necessary.

This patch can change the order in which MemoryAccesses are created, but
otherwise has no functional change.

Some temporary code is introduced to ensure correctness, but will be
removed in the next commit.

llvm-svn: 314664
2017-10-02 11:41:27 +00:00
Michael Kruse 89a6f3db02 [ScopBuilder] Build escaping dependencies separately.
Instructions that compute escaping values might be synthesizable and
therefore not contained in any ScopStmt. When buildAccessFunctions is
changed to only iterate over the instruction list of statement,
"free" instructions still need to be written. We do this after the
main MemoryAccesses have been created.

This can change the order in which MemoryAccesses are created, but has
otherwise no functional change.

llvm-svn: 314663
2017-10-02 11:41:19 +00:00
Michael Kruse 0bedec0e65 [ScopBuilder] Specialize exit block handling. NFC.
Decouple handling of exit block PHIs and other MemoryAccesses. Exit PHIs
only need the PHI handling part of buildAccessFunctions but requires
code for skipping them in while creating other MemoryAcesses.

This change will make it easier to modify how statement MemoryAccesses
are created without considering the exit block special case.

llvm-svn: 314662
2017-10-02 11:41:12 +00:00
Michael Kruse e276e9f324 [ForwardOpTree] Fix out-of-quota in assertion.
llvm-svn: 314661
2017-10-02 11:41:06 +00:00
Michael Kruse c013399197 [ScopDetect] Do not add loads out of the SCoP to required invariant loads.
Loads before the SCoP are always invariant within the SCoP and
therefore are no "required invariant loads". An assertion failes in
ScopBuilder when it finds such an invariant load.

Fix by not adding such loads to the required invariant load list. This
likely will cause the region to be not considered a valid SCoP.
We may want to unconditionally accept instructions defined before
the region as valid invariant conditions instead of rejecting them.

This fixes a compilation crash of SPEC CPU2006 453.povray's
render.cpp.

llvm-svn: 314636
2017-10-01 22:19:28 +00:00
Tobias Grosser d215e684b3 Add missing REQUIRES line
llvm-svn: 314625
2017-10-01 13:14:40 +00:00
Tobias Grosser 2fb847fbf6 [GPGPU] Set Polly's RTC to false in case invariant load hoisting fails
This matches the behavior we already have in lib/Codegen/CodeGeneration.cpp and
makes sure that we fall back to the original code. It seems when invariant load
hoisting was introduced to the GPGPU backend we missed to reset the RTC flag,
such that kernels where invariant load hoisting failed executed the 'optimized'
SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this
results in hard to debug issues that are a lot of fun to debug.

llvm-svn: 314624
2017-10-01 12:39:14 +00:00
Michael Kruse ed787e7540 [Polly] Add dumpPw() and dumpExpanded() functions. NFC.
These functions print a multi-line and sorted representation of unions
of polyhedra. Each polyhedron (basic_{ast/map}) has its own line.
First sort key is the polyhedron's hierachical space structure.
Secondary sort key is the lower bound of the polyhedron, which should
ensure that the polyhedral are printed in approximately ascending order.

Example output of dumpPw():
[p_0, p_1, p_2] -> {
  Stmt0[0] -> [0, 0];
  Stmt0[i0] -> [i0, 0] : 0 < i0 <= 5 - p_2;
  Stmt1[0] -> [0, 2] : p_1 = 1 and p_0 = -1;
  Stmt2[0] -> [0, 1] : p_1 >= 3 + p_0;
  Stmt3[0] -> [0, 3];
}

In contrast dumpExpanded() prints each point in the sets, unless there
is an unbounded dimension that cannot be expandend.
This is useful for reduced test cases where the loop counts are set to
some constant to understand a bug.

Example output of dumpExpanded(
{ [MemRef_A[i0] -> [i1]] : (exists (e0 = floor((1 + i1)/3): i0 = 1 and
3e0 <= i1 and 3e0 >= -1 + i1 and i1 >= 15 and i1 <= 25)) or (exists (e0
= floor((i1)/3): i0 = 0 and 3e0 < i1 and 3e0 >= -2 + i1 and i1 > 0 and
i1 <= 11)) }):

{
  [MemRef_A[0] ->[1]];
  [MemRef_A[0] ->[2]];
  [MemRef_A[0] ->[4]];
  [MemRef_A[0] ->[5]];
  [MemRef_A[0] ->[7]];
  [MemRef_A[0] ->[8]];
  [MemRef_A[0] ->[10]];
  [MemRef_A[0] ->[11]];
  [MemRef_A[1] ->[15]];
  [MemRef_A[1] ->[16]];
  [MemRef_A[1] ->[18]];
  [MemRef_A[1] ->[19]];
  [MemRef_A[1] ->[21]];
  [MemRef_A[1] ->[22]];
  [MemRef_A[1] ->[24]];
  [MemRef_A[1] ->[25]]
}

Differential Revision: https://reviews.llvm.org/D38349

llvm-svn: 314525
2017-09-29 15:45:40 +00:00
Michael Kruse 2dd5fa4dc7 [ScopBuilder] Fix typo. NFC.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>

Differential Revision: https://reviews.llvm.org/D38322

llvm-svn: 314519
2017-09-29 15:13:05 +00:00
Siddharth Bhat 6cb10168f4 [Docs] Replace 0-byte incorrect GEMM_double image with the one from www/images
llvm-svn: 314423
2017-09-28 15:31:24 +00:00
Siddharth Bhat 9859707189 [Docs] fix rendering of alpha and beta
llvm-svn: 314422
2017-09-28 15:31:20 +00:00
Siddharth Bhat 99d3567b0d [Docs] Add a performance document.
Summary:
Add a document which describes:

- GEMM performance comparison.
- An experiment that measures the compile time impact
  of enabling Polly when compiling LLVM+Clang+Polly.

Contributed-by: Theodoros Theodoridis<theodoros.theodoridis@inf.ethz.ch>
Differential Revision: https://reviews.llvm.org/D38330

llvm-svn: 314419
2017-09-28 15:10:22 +00:00
Philip Pfaffe 859ef1c09e Fix the build after r314375
r314375 privatized Loop's constructor and replaced it with an Allocator.

llvm-svn: 314412
2017-09-28 12:20:24 +00:00
Michael Kruse 89d2be0702 [Support] Force instantiation of isl dump() methods. NFC.
In order for debuggers to be able to call an inline method, it must have
been instantiated somewhere. The dump() methods are usually not used, so
add an instantiation in debug builds.

This allows to call .dump() on any isl++ object from the gcc/gdb and
Visual Studio debugger in debug builds with assertions enabled.
In optimized builds, even with assertions enabled, the dump() methods
are also inlined in GICHelper.cpp, so no externally visible symbols
will be available either.

Differential Revision: https://reviews.llvm.org/D38198

llvm-svn: 314395
2017-09-28 09:51:04 +00:00
Tobias Grosser 1f93d0f1f9 [ScopInfo] Allow PHI nodes that reference an error block
As long as these PHI nodes are only referenced by terminator instructions.

llvm-svn: 314212
2017-09-26 15:00:10 +00:00
Tobias Grosser 5e531dfef4 [ScopInfo] Allow invariant loads in branch conditions
In case the value used in a branch condition is a load instruction, assume this
load to be invariant.

llvm-svn: 314146
2017-09-25 20:27:15 +00:00
Tobias Grosser 0a62b2d887 [ScopInfo] Allow uniform branch conditions
If all but one branch come from an error condition and the incoming value from
this branch is a constant, we can model this branch.

llvm-svn: 314116
2017-09-25 16:37:15 +00:00
Roman Gareev fef2c0027e [Polly] Information about generalized matrix multiplication
Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D38218

llvm-svn: 314081
2017-09-24 19:00:25 +00:00
Tobias Grosser ee457594c2 [ScopDetect/Info] Look through PHIs that follow an error block
In case a PHI node follows an error block we can assume that the incoming value
can only come from the node that is not an error block. As a result, conditions
that seemed non-affine before are now in fact affine.

This is a recommit of r312663 after fixing
test/Isl/CodeGen/phi_after_error_block_outside_of_scop.ll

llvm-svn: 314075
2017-09-24 09:25:30 +00:00
Tobias Grosser 75d133f0ac [IslExprBuilder] Do not generate RTC with more than 64 bit
Such RTCs may introduce integer wrapping intrinsics with more than 64 bit,
which are translated to library calls on AOSP that are not part of the
runtime and will consequently cause linker errors.

Thanks to Eli Friedman for reporting this issue and reducing the test case.

llvm-svn: 314065
2017-09-23 15:32:07 +00:00
Reid Kleckner 3fc649cb76 [Support] Rename tool_output_file to ToolOutputFile, NFC
This class isn't similar to anything from the STL, so it shouldn't use
the STL naming conventions.

llvm-svn: 314050
2017-09-23 01:03:17 +00:00
Michael Kruse a9035a8fec polly-update-format after change in clang-format. NFC.
r313963 changed the sorting of using-declarations.

llvm-svn: 313976
2017-09-22 11:30:26 +00:00
Michael Kruse bfca5f4334 [DeLICM] Allow non-injective PHIRead->PHIWrite mapping.
Remove an assertion that tests the injectivity of the
PHIRead -> PHIWrite relation.  That is, allow a single PHI write to be
used by multiple PHI reads.  This may happen due to some statements
containing the PHI write not having the statement instances that would
overwrite the previous incoming value due to (assumed/invalid) contexts.
This result in that PHI write is mapped to multiple targets which is not
supported.  Codegen will select one one of the targets using
getAddressFunction().  However, the runtime check should protect us from
this case ever being executed.

We therefore allow injective PHI relations.  Additional calculations to
detect/santitize this case would probably not be worth the compuational
effort.

This fixes llvm.org/PR34485

llvm-svn: 313902
2017-09-21 19:08:23 +00:00
Michael Kruse 6d7a7896ce [ScopInfo] Use map for value def/PHI read accesses.
Before this patch, ScopInfo::getValueDef(SAI) used
getStmtFor(Instruction*) to find the MemoryAccess that writes a
MemoryKind::Value. In cases where the value is synthesizable within the
statement that defines, the instruction is not added to the statement's
instruction list, which means getStmtFor() won't return anything.

If the synthesiable instruction is not synthesiable in a different
statement (due to being defined in a loop that and ScalarEvolution
cannot derive its escape value), we still need a MemoryKind::Value
and a write to it that makes it available in the other statements.
Introduce a separate map for this purpose.

This fixes MultiSource/Benchmarks/MallocBench/cfrac where
-polly-simplify could not find the writing MemoryAccess for a use. The
write was not marked as required and consequently was removed.

Because this could in principle happen as well for PHI scalars,
add such a map for PHI reads as well.

llvm-svn: 313881
2017-09-21 14:23:11 +00:00
Michael Kruse 0e370cf1a7 Check whether IslAstInfo and DependenceInfo were computed for the same Scop.
Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo,
we might get those analysis that were computed by a different ScopInfo for a
different Scop structure. This would be unfortunate because DependenceInfo and
IslAstInfo hold references to resources allocated by
ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and
DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable
things can happen.

When the ScopInfo/Scop object is freed, there is a high probability that the
new ScopInfo/Scop object will be created at the same heap position with the
same address. Comparing whether the Scop or ScopInfo address is the expected
therefore is unreliable.

Instead, we compare the address of the isl_ctx object. Both, DependenceInfo
and IslAstInfo must hold a reference to the isl_ctx object to ensure it is
not freed before the destruction of those analyses which might happen after
the destruction of the Scop/ScopInfo they refer to.  Hence, the isl_ctx
will not be freed and its address not reused as long there is a
DependenceInfo or IslAstInfo around.

This fixes llvm.org/PR34441

llvm-svn: 313842
2017-09-21 00:01:13 +00:00