This patch aims to implement the option of allocating new arrays created
by polly on heap instead of stack. To enable this option, a key named
'allocation' must be written in the imported json file with the value
'heap'.
We need such a feature because in a next iteration, we will implement a
mechanism of maximal static expansion which will need a way to allocate
arrays on heap. Indeed, the expansion is very costly in terms of memory
and doing the allocation on stack is not worth considering.
The malloc and the free are added respectively at polly.start and
polly.exiting such that there is no use-after-free (for instance in case
of Scop in a loop) and such that all memory cells allocated with a
malloc are free'd when we don't need them anymore.
We also add :
- In the class ScopArrayInfo, we add a boolean as member called IsOnHeap
which represents the fact that the array in allocated on heap or not.
- A new branch in the method allocateNewArrays in the ISLNodeBuilder for
the case of heap allocation. allocateNewArrays now takes a BBPair
containing polly.start and polly.exiting. allocateNewArrays takes this
two blocks and add the malloc and free calls respectively to
polly.start and polly.exiting.
- As IntPtrTy for the malloc call, we use the DataLayout one.
To do that, we have modified :
- createScopArrayInfo and getOrCreateScopArrayInfo such that it returns
a non-const SAI, in order to be able to call setIsOnHeap in the
JSONImporter.
- executeScopConditionnaly such that it return both start block and end
block of the scop, because we need this two blocs to be able to add
the malloc and the free calls at the right position.
Differential Revision: https://reviews.llvm.org/D33688
llvm-svn: 306540
Before we would 'guess' the correct location for the MergeBlock
that got introduced when executing a Scop conditionally. This
implicitly depends on the situation that at this point during
CodeGen there will be nothing between polly.start and polly.exiting.
With this commit we explicitly state that we want the block that
directly follows polly.exiting.
llvm-svn: 306398
This commit returns both the start and the exit block that are created
by executeScopConditionally.
In a future commit we will make use of the exit block. Before we would
have to use the implicit property that there won't be any code generated
between polly.start and polly.exiting at the time of use to find the
correct block ('polly.exiting').
All usage location are semantically unchanged.
llvm-svn: 306283
Ensure that all array base pointers are assigned before generating
aliasing metadata by allocating new arrays beforehand.
Before this patch, getBasePtr() returned nullptr for new arrays because
the arrays were created at a later point. Nullptr did not match to any
array after the created array base pointers have been assigned and when
the loads/stores are generated.
llvm-svn: 305675
Previously, we would generate one performance counter for all scops.
Now, we generate both the old information, as well as a per-scop
performance counter to generate finer grained information.
This patch needed a way to generate a unique name for a `Scop`.
The start region, end region, and function name combined provides a
unique `Scop` name. So, `Scop` has a new public API to provide its start
and end region names.
Differential Revision: https://reviews.llvm.org/D33723
llvm-svn: 304528
Summary: To move CG to the new PM I outlined the various helper that were previously members of the CG class into free static functions. The CG class itself I moved into a header, which is required because we need to include it in `RegisterPasses` eventually.
Reviewers: grosser, Meinersbur
Reviewed By: grosser
Subscribers: pollydev, llvm-commits, sanjoy
Tags: #polly
Differential Revision: https://reviews.llvm.org/D33423
llvm-svn: 303624
Summary: This patch ports IslAst to the new PM. The change is mostly straightforward. The only major modification required is making IslAst move-only, to correctly manage the isl resources it owns.
Reviewers: grosser, Meinersbur
Reviewed By: grosser
Subscribers: nemanjai, pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D33422
llvm-svn: 303622
Summary: This is a proof of concept of how to port polly-passes to the new PassManager architecture. This approach works ootb for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now.
Reviewers: Meinersbur, grosser
Reviewed By: grosser
Subscribers: pollydev, sanjoy, nemanjai, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D31459
llvm-svn: 302902
As has been reported in the previous commit, codegen verification can result in
quadratic compile time increases for large functions with many scops. This is
certainly not something we would like to have in the Polly default
configuration. Hence, we disable codegen verification by default -- also to see
if this resolves some of the compilation timeouts we currently see on the AOSP
buildbots. We still leave this feature in Polly as it has shown _very_ useful
for debugging. In fact, we may want to have a discussion if we can bring this
feature back in a way that does not impact compilation time so much.
Thanks to Eli Friedman <efriedma@codeaurora.org> for reporting this issue and
for providing the test case in the previous commit (where I forgot to
acknowledge him).
llvm-svn: 301670
Before this change, we always tried to verify the function and printed
verification errors, but just did not abort in case -polly-codegen-verify=false
was set and verification failed. As verification can become very cosly -- for
large functions with many scops we may verify the very same function very often
-- this can affect compile time very negatively. Hence, we respect the
-polly-codegen-verify flag with this check, ensuring that no verification is run
if -polly-codegen-verify=false.
This reduces code generation time from 26 seconds to 4 seconds on the test
case below with -polly-codegen-verify=false:
struct X { int x; };
void a();
#define SIG (int x, X **y, X **z)
typedef void (*fn)SIG;
#define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); }
#define FN5 FN FN FN FN FN
#define FN25 FN5 FN5 FN5 FN5
#define FN125 FN25 FN25 FN25 FN25 FN25
#define FN250 FN125 FN125
#define FN1250 FN250 FN250 FN250 FN250 FN250
void x SIG { FN1250 }
llvm-svn: 301669
The current StackColoring algorithm does not correctly handle the
situation when some, but not all paths from a BB to the entry node
cross a llvm.lifetime.start. According to an interpretation of the
language reference at
http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic
this might be correct, but it would cost too much effort to handle
in StackColoring.
To be on the safe side, remove all lifetime markers even in the original
code version (they have never been copied to the optimized version)
to ensure that no path to the entry block will cross a
llvm.lifetime.start.
The same principle applies to paths the a function return and the
llvm.lifetime.end marker, so we remove them as well.
This fixes llvm.org/PR32251.
Also see the discussion at
http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html
llvm-svn: 299585
Summary:
A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly.
This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however.
Reviewers: grosser, Meinersbur
Reviewed By: grosser
Subscribers: nemanjai, pollydev, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D31653
llvm-svn: 299423
Add support for -polly-codegen-perf-monitoring. When performance monitoring
is enabled, we emit performance monitoring code during code generation that
prints after program exit statistics about the total number of cycles executed
as well as the number of cycles spent in scops. This gives an estimate on how
useful polyhedral optimizations might be for a given program.
Example output:
Polly runtime information
-------------------------
Total: 783110081637
Scops: 663718949365
In the future, we might also add functionality to measure how much time is spent
in optimized scops and how many cycles are spent in the fallback code.
Reviewers: bollu,sebpop
Tags: #polly
Differential Revision: https://reviews.llvm.org/D31599
llvm-svn: 299359
Marking a pass as preserved is necessary if any Polly pass uses it, even
if it is not preserved within the generated code. Not marking it would
cause the the Polly pass chain to be interrupted. It is not used by any
Polly pass anymore, hence we can remove all references to it.
llvm-svn: 295983
This makes polly generate a CFG which is closer to what we want
in LLVM IR, with a loop preheader for the original loop. This is
just a cleanup, but it exposes some fragile assumptions.
I'm not completely happy with the changes related to expandCodeFor;
RTCBB->getTerminator() is basically a random insertion point which
happens to work due to the way we generate runtime checks. I'm not
sure what the right answer looks like, though.
Differential Revision: https://reviews.llvm.org/D26053
llvm-svn: 285864
LLVM's coding guideline suggests to not use @brief for one-sentence doxygen
comments to improve readability. Switch this once and for all to ensure people
do not copy @brief comments from other parts of Polly, when writing new code.
llvm-svn: 280468
There is no need to reset the position of the builder, as we can just continue
to insert code at the current position of the IRBuilder, which happens to
be precisely the location we reset the builder to.
llvm-svn: 278014
Extend the jscop interface to allow the user to export arrays. It is required
that already existing arrays of the list of arrays correspond to arrays
of the SCoP. Each array that is appended to the list will be newly created.
Furthermore, we allow the user to modify access expressions to reference
any array in case it has the same element type.
Reviewed-by: Tobias Grosser <tobias@grosser.es>
Differential Revision: https://reviews.llvm.org/D22828
llvm-svn: 277263
This allows the finalization routine of the IslNodeBuilder to be overwritten
by derived classes. Being here, we also drop the unnecessary 'Scop' postfix
and the unnecessary 'Scop' parameter.
llvm-svn: 276622
llvm commonly adds a comment to the closing brace of a namespace to indicate
which namespace is closed. clang-tidy provides with llvm-namespace-comment
a handy tool to check for this habit. We use it to ensure we consitently use
namespace comments in Polly.
There are slightly different styles in how namespaces are closed in LLVM. As
there is no large difference between the different comment styles we go for the
style clang-tidy suggests by default.
To reproduce this fix run:
for i in `ls tools/polly/lib/*/*.cpp`; \
clang-tidy -checks='-*,llvm-namespace-comment' -p build $i -fix \
-header-filter=".*"; \
done
This cleanup was suggested by Eugene Zelenko <eugene.zelenko@gmail.com> in
http://reviews.llvm.org/D21488 and was split out to increase readability.
llvm-svn: 273621
The recent expression type changes still need more discussion, which will happen
on phabricator or on the mailing list. The precise list of commits reverted are:
- "Refactor division generation code"
- "[NFC] Generate runtime checks after the SCoP"
- "[FIX] Determine insertion point during SCEV expansion"
- "Look through IntToPtr & PtrToInt instructions"
- "Use minimal types for generated expressions"
- "Temporarily promote values to i64 again"
- "[NFC] Avoid unnecessary comparison for min/max expressions"
- "[Polly] Fix -Wunused-variable warnings (NFC)"
- "[NFC] Simplify min/max expression generation"
- "Simplify the type adjustment in the IslExprBuilder"
Some of them are just reverted as we would otherwise get conflicts. I will try
to re-commit them if possible.
llvm-svn: 272483
We now generate runtime checks __after__ the SCoP code generation and
not before, though they are still inserted at the same position int
the code. This allows to modify the runtime check during SCoP code
generation.
llvm-svn: 271894
Created a new pass ScopInfoRegionPass. As name suggests, it is a
region pass and it is there to preserve compatibility with our
existing Polly passes. ScopInfoRegionPass will return a SCoP object
for a valid region while the creation of the SCoP stays in the
ScopInfo class.
Contributed-by: Utpal Bora <cs14mtech11017@iith.ac.in>
Reviewed-by: Tobias Grosser <tobias@grosser.es>,
Johannes Doerfert <doerfert@cs.uni-saarland.de>
Differential Revision: http://reviews.llvm.org/D20770
llvm-svn: 271259
We utilize assumptions on the input to model IR in polyhedral world.
To verify these assumptions we version the code and guard it with a
runtime-check (RTC). However, since the RTCs are themselves generated
from the polyhedral representation we generate them under the same
assumptions that they should verify. In other words, the guarantees
that we try to provide with the RTCs do not hold for the RTCs
themselves. To this end it is necessary to employ a different check
for the RTCs that will verify the assumptions did hold for them too.
Differential Revision: http://reviews.llvm.org/D20165
llvm-svn: 269299
We verify the optimized function now for a long time and it helped to track
down bugs early. This will now also happen for all parallel subfunctions we
generate.
llvm-svn: 265823
When codegenerating invariant loads in some rare cases we cannot generate code
and bail out. This change ensures that we maintain a valid dominator tree
in these situations. This fixes llvm.org/PR26736
Contributed-by: Matthias Reisinger <d412vv1n@gmail.com>
llvm-svn: 264142
We now always print the reason why the code did not pass the LLVM verifier and
we also allow to disable verfication with -polly-codegen-verify=false. Before
this change the first assertion had generally no information why or what might
have gone wrong and it was also impossible to -view-cfg without recompile. This
change makes debugging bugs that result in incorrect IR a lot easier.
llvm-svn: 261320
We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.
llvm-svn: 260025
Re-run canonicalization passes after Polly's code generation.
The set of passes currently added here are nearly all the passes between
--polly-position=early and --polly-position=before-vectorizer, i.e. all
passes that would usually run after Polly.
In order to run these only if Polly actually modified the code, we add a
function attribute "polly-optimzed" to a function that contains
generated code. The cleanup pass is skipped if the function does not
have this attribute.
There is no support by the (legacy) PassManager to run passes only under
some conditions. One could have wrapped all transformation passes to run
only when CodeGeneration changed the code, but the analyses would run
anyway. This patch creates an independent pass manager. The
disadvantages are that all analyses have to re-run even if preserved and
it does not honor compiler switches like the PassManagerBuilder does.
Differential Revision: http://reviews.llvm.org/D14333
llvm-svn: 254150
When we bail out early we make the partially build new code path
practically dead, though it was not unreachable. To remove dominance
problems we now make it not only dead but also prevent the control
flow to join with the original code path, thus allow to use original
values after the SCoP without any PHI nodes.
This fixes bug 25447.
llvm-svn: 252420
The bail out in r252412 left the code generation without verifying the (so
far) generated IR. This will change now and ensure we always run the
verifier.
Suggested-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 252419
While the program cannot cause a dependence cycle between invariant
loads, additional constraints (e.g., to ensure finite loops) can
introduce them. It is hard to detect them in the SCoP description,
thus we will only check for them at code generation time. If such a
recursion is detected we will bail out the code generation and place a
"false" runtime check to guarantee the original code is used.
This fixes bug 25443.
llvm-svn: 252412
Remove all the implicit ilist iterator conversions from polly, in
preparation for making them illegal in ADT. There was one oddity I came
across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a
post-increment `Builder.GetInsertPoint()++`.
Since it was a no-op, I removed it, but I admit I wonder if it might be
a bug (both before and after this change)? Perhaps it should be a
pre-increment?
llvm-svn: 252357
Polly can now be used as a analysis only tool as long as the code
generation is disabled. However, we do not have an alternative to the
independent blocks pass in place yet, though in the relevant cases
this does not seem to impact the performance much. Nevertheless, a
virtual alternative that allows the same transformations without
changing the input region will follow shortly.
llvm-svn: 250652
This patch allows invariant loads to be used in the SCoP description,
e.g., as loop bounds, conditions or in memory access functions.
First we collect "required invariant loads" during SCoP detection that
would otherwise make an expression we care about non-affine. To this
end a new level of abstraction was introduced before
SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
if we want a load inside the region to be optimistically assumed
invariant or not. If we do, it will be marked as required and in the
SCoP generation we bail if it is actually not invariant. If we don't
it will be a non-affine expression as before. At the moment we
optimistically assume all "hoistable" (namely non-loop-carried) loads
to be invariant. This causes us to expand some SCoPs and dismiss them
later but it also allows us to detect a lot we would dismiss directly
if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
allow potential aliases between optimistically assumed invariant loads
and other pointers as our runtime alias checks are sound in case the
loads are actually invariant. Together with the invariant checks this
combination allows to handle a lot more than LICM can.
The code generation of the invariant loads had to be extended as we
can now have dependences between parameters and invariant (hoisted)
loads as well as the other way around, e.g.,
test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
First, it is important to note that we cannot have real cycles but
only dependences from a hoisted load to a parameter and from another
parameter to that hoisted load (and so on). To handle such cases we
materialize llvm::Values for parameters that are referred by a hoisted
load on demand and then materialize the remaining parameters. Second,
there are new kinds of dependences between hoisted loads caused by the
constraints on their execution. If a hoisted load is conditionally
executed it might depend on the value of another hoisted load. To deal
with such situations we sort them already in the ScopInfo such that
they can be generated in the order they are listed in the
Scop::InvariantAccesses list (see compareInvariantAccesses). The
dependences between hoisted loads caused by indirect accesses are
handled the same way as before.
llvm-svn: 249607
When the ScopAnnotator was a class member variable some of the maps it contains
have not been properly cleared. As a result we had dangling pointers to
llvm::Value(s) which got detected by the AssertingVH we recently added.
No test case as this issue is hard to reproduce reliably as subsequent
optimizations need to delete some of the llvm::Values we still keep in our
lists.
llvm-svn: 249269
As a first step in the direction of assumed invariant loads (loads
that are not written in some context) we now detect and hoist
definitively invariant loads. These invariant loads will be preloaded
in the code generation and used in the optimized version of the SCoP.
If the load is only conditionally executed the preloaded version will
also only be executed under the same condition, hence we will never
access memory that wouldn't have been accessed otherwise. This is also
the most distinguishing feature to licm.
As hoisting can make statements empty we will simplify the SCoP and
remove empty statements that would otherwise cause artifacts in the
code generation.
Differential Revision: http://reviews.llvm.org/D13194
llvm-svn: 248861
The changes affect methods that are part of the Pass interface and
include:
- Comments that describe the methods purpose.
- A consistent use of the keywords override and virtual.
Additionally, the printScop method is now optional and removed from
SCoP passes that do not implement it.
llvm-svn: 248685
The TempScopInfo (-polly-analyze-ir) pass is removed and its work taken
over by ScopInfo (-polly-scops). Several tests depend on
-polly-analyze-ir and use -polly-scops instead which for the moment
prints the output of both passes. This again is not expected by some
other tests, especially those with negative searches, which have been
adapted.
Differential Version: http://reviews.llvm.org/D12694
llvm-svn: 247288