The new field in the MemoryAccess allows us to track a value related
to that access:
- For real memory accesses the value is the loaded result or the
stored value.
- For straigt line scalar accesses it is the access instruction
itself.
- For PHI operand accesses it is the operand value.
We use this value to simplify code which deduced information about the value
later in the Polly pipeline and was known to be error prone.
Reviewers: grosser, Meinsersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D12062
llvm-svn: 245213
This allows the code generation to continue working even if a needed
value (that is reloaded anyway) was not yet demoted. Instead of
failing it will now create the location for future demotion to memory
and load from that location. The stores will use the same location and
by construction execute before the load even if the textual order in
the generated AST is otherwise.
Reviewers: grosser, Meinersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D12072
llvm-svn: 245203
This test case crashes the scalar code generation as we are not
consistent with the usage of the assumed context. To be precise, we
use the assumed context for the dependence analysis but not to
restrict the domains of the statements.
A step by step explanation of the problem is given in the test case.
llvm-svn: 245176
This option allows the user to provide additional information about parameter
values as an isl_set. To specify that N has the value 1024, we can provide
the context -polly-context='[N] -> {: N = 1024}'.
llvm-svn: 245175
The July issue of TOPLAS contains a 50 page discussion of the AST generation
techniques used in Polly. This discussion gives not only an in-depth
description of how we (re)generate an imperative AST from our polyhedral based
mathematical program description, but also gives interesting insights about:
- Schedule trees: A tree-based mathematical program description that enables us
to perform loop transformations on an abstract level, while issues like the
generation of the correct loop structure and loop bounds will be taken care of
by our AST generator.
- Polyhedral unrolling: We discuss techniques that allow the unrolling of
non-trivial loops in the context of parameteric loop bounds, complex tile
shapes and conditionally executed statements. Such unrolling support enables
the generation of predicated code e.g. in the context of GPGPU computing.
- Isolation for full/partial tile separation: We discuss native support for
handling full/partial tile separation and -- in general -- native support for
isolation of boundary cases to enable smooth code generation for core
computations.
- AST generation with modulo constraints: We discuss how modulo mappings are
lowered to efficient C/LLVM code.
- User-defined constraint sets for run-time checks We discuss how arbitrary
sets of constraints can be used to automatically create run-time checks that
ensure a set of constrainst actually hold. This feature is very useful to
verify at run-time various assumptions that have been taken program
optimization.
Polyhedral AST generation is more than scanning polyhedra
Tobias Grosser, Sven Verdoolaege, Albert Cohen
ACM Transations on Programming Languages and Systems (TOPLAS), 37(4), July 2015
llvm-svn: 245157
This modifies the order in which Polly passes are executed.
Assuming a function has two scops (A and B), the order before was:
FunctionPassManager
ScopDetection
IndependentBlocks
TempScopInfo for A and B
RegionPassManager
ScopInfo for A
DependenceInfo for A
IslScheduleOptimizer for A
IslAstInfo for A
CodeGeneration for A
ScopInfo for B
DependenceInfo for B
IslScheduleOptimizer for B
IslAstInfo for B
CodeGeneration for B
After this patch:
FunctionPassManager
ScopDetection
IndependentBlocks
RegionPassManager
TempScopInfo for A
ScopInfo for A
DependenceInfo for A
IslScheduleOptimizer for A
IslAstInfo for A
CodeGeneration for A
TempScopInfo for B
ScopInfo for B
DependenceInfo for B
IslScheduleOptimizer for B
IslAstInfo for B
CodeGeneration for B
TempScopInfo for B might store information and references to the IR
that CodeGeneration for A might modify. Changing the order ensures that
the IR is not modified from the analysis of a region until code
generation.
Reviewers: grosser
Differential Revision: http://reviews.llvm.org/D12014
llvm-svn: 245091
This change extends the BlockGenerator to not only allow Instructions as
base elements of scalar dependences, but any llvm::Value. This allows
us to code-generate scalar dependences which reference function arguments, as
they arise when moddeling read-only scalar dependences.
llvm-svn: 244874
While the compile time is not affected by this patch much it will
allow us to look at all translated expressions after the SCoP is build
in a convenient way. Additionally, bigger SCoPs or SCoPs with
repeating complicated expressions might benefit from the cache later
on.
Reviewers: grosser, Meinersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D11975
llvm-svn: 244734
This change has three major advantages:
- The ScopInfo becomes smaller.
- It allows to use the SCEVAffinator from outside the ScopInfo.
- A member object allows state which in turn allows e.g., caching.
Differential Revision: http://reviews.llvm.org/D9099
llvm-svn: 244730
In order to find the llvm-obj directory it has to be (or a soft link
to it) at one of the following locations:
${POLLY_SRC_DIR}/build
${POLLY_SRC_DIR}.build
${POLLY_SRC_DIR}-build
s/${POLLY_SRC_DIR}/src/build
Alternatively, the environment variable $POLLY_BIN_DIR can point to it.
llvm-svn: 244727
Before we only modeled PHI nodes if at least one incoming basic block was itself
part of the region, now we always model them except if all of their operands are
part of a single non-affine subregion which we model as a black-box.
This change only affects PHI nodes in the entry block, that have exactly one
incoming edge. Before this change, we did not model them and as a result code
generation would not know how to code generate them. With this change, code
generation can code generate them like any other PHI node.
This issue was exposed by r244606. Before this change simplifyRegion would have
moved these PHI nodes out of the SCoP, so we would never have tried to code
generate them. We could implement this behavior again, but changing the IR
after the scop has been modeled and transformed always adds a risk of us
invalidating earlier analysis results. It seems more save and overall also more
consistent to just model and handle this one-entry-edge PHI nodes like any
other PHI node in the scop.
Solution proposed by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 244721
This one was extracted from the test-suite's pifft and caused a
miscompilation because a scalar was not written to its alloca address.
llvm-svn: 244720
In order to have a valid region analysis, we assign all newly created blocks to the parent of the scop's region. This is correct for any pre-existing regions (including the scop's region and its parent), but does not discover any region inside the generated code. For Polly this is not necessary because we do not want to re-run Polly on its own generated code anyway.
Reviewers: grosser
Part of Differential Revision: http://reviews.llvm.org/D11867
llvm-svn: 244608
The previous code had several problems:
For newly created BasicBlocks it did not (always) call RegionInfo::setRegionFor in order to update its analysis. At the moment RegionInfo does not verify its BBMap, but will in the future. This is fixed by determining the region new BBs belong to and set it accordingly. The new executeScopConditionally() requires accurate getRegionFor information.
Which block is created by SplitEdge depends on the incoming and outgoing edges of the blocks it connects, which makes handling its output more difficult than it needs to be. Especially for finding which block has been created an to assign a region to it for the setRegionFor problem above. This patch uses an implementation for splitEdge that always creates a block between the predecessor and successor. simplifyRegion has also been simplified by using SplitBlockPredecessors instead of SplitEdge. Isolating the entries and exits have been refectored into individual functions.
Previously simplifyRegion did more than just ensuring that there is only one entering and one exiting edge. It ensured that the entering block had no other outgoing edge which was necessary for executeScopConditionally(). Now the latter uses the alternative splitEdge implementation which can handle this situation so simplifyRegion really only needs to simplify the region.
Also, executeScopConditionally assumed that there can be no PHI nodes in blocks with one incoming edge. This is wrong and LCSSA deliberately produces such edges. However, previous passes ensured that there can be no such PHIs in exit nodes, but which will no longer hold in the future.
The new code that the property that it preserves the identity of region block (the property that the memory address of the BasicBlock containing the instructions remains the same; new blocks only contain PHI nodes and a terminator), especially the entry block. As a result, there is no need to update the reference to the BasicBlock of ScopStmt that contain its instructions because they have been moved to other basic blocks.
Reviewers: grosser
Part of Differential Revision: http://reviews.llvm.org/D11867
llvm-svn: 244606
RegionInfo::splitBlock did not update RegionInfo correctly. Specifically, it tried to make the new block the entry block if possible. This breaks for nested regions that have edges to the old block.
We simply do not change the entry block. Updating RegionInfo becomes trivial as both block will always be in the same region.
splitEntryBlockForAlloca makes use of the new splitBlock.
Reviewers: grosser
Part of Differential Revision: http://reviews.llvm.org/D11867
llvm-svn: 244600
Besides other changes this version of isl contains a fundamental fix to memory
corruption issues we have seen with imath-32 backed isl_ints.
This update also contains a fix that ensures that the schedule-tree based
version of isl's dependence analysis takes the domain of the schedule into
account.
llvm-svn: 244585
With a deque (or any other sequential container) it is not sound to
take the address of the elements when the container is still under
construction. With a pointer based container this is save.
llvm-svn: 244459
Summary: The extracted function buildBBScopStmt will be needed later to be invoked individually on the region's exit block.
Reviewers: grosser, jdoerfert
Subscribers: jdoerfert, llvm-commits, pollydev
Projects: #polly
Differential Revision: http://reviews.llvm.org/D11878
llvm-svn: 244443
Summary: The splitExitBlock function is never called. Going to replace its functionality in successive patches that do not modify the IR.
Reviewers: grosser
Subscribers: pollydev
Projects: #polly
Differential Revision: http://reviews.llvm.org/D11865
llvm-svn: 244404
Even though read-only accesses to scalars outside of a scop do not need to be
modeled to derive valid transformations or to generate valid sequential code,
but information about them is useful when we considering memory footprint
analysis and/or kernel offloading.
llvm-svn: 243981
This change is required to see the detected scops even in cases where there is
no other ScopInfo user after the ScopViewers. Before this change, when
running with -polly-optimizer=none -polly-code-generator=none detected scops
have not been shown.
llvm-svn: 243971
If set, this option instructs -view-scops and -polly-show to only print
functions that contain the specified string in their name. This allows to
look at the scops of a specific function in a large .ll file, without flooding
the screen with .dot graphs.
llvm-svn: 243882
We use the branch instruction as the location at which a PHI-node write takes
place, instead of the PHI-node itself. This allows us to identify the
basic-block in a region statement which is on the incoming edge of the PHI-node
and for which the write access was originally introduced. As a result we can,
during code generation, avoid generating PHI-node write accesses for basic
blocks that do not preceed the PHI node without having to look at the IR
again.
This change fixes a bug which was introduced in r243420, when we started to
explicitly model PHI-node reads and writes, but dropped some additional checks
that where still necessary during code generation to not emit PHI-node writes
for basic-blocks that are not on incoming edges of the original PHI node.
Compared to the code before r243420 the new code does not need to inspect the IR
any more and we also do not generate multiple redundant writes.
llvm-svn: 243852