Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this
up.
Now clients that don't compute DefsUsedOutsideOfLoop can just call
versionLoop() and computing DefsUsedOutsideOfLoop will happen
implicitly. With that there is no reason to expose addPHINodes anymore.
Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and
addPHINodes in LVerLICM and things should just work.
llvm-svn: 245579
This is the full set of checks that clients can further filter. IOW,
it's client-agnostic. This makes LAA complete in the sense that it now
provides the two main results of its analysis precomputed:
1. memory dependences via getDepChecker().getInsterestingDependences()
2. run-time checks via getRuntimePointerCheck().getChecks()
However, as a consequence we now compute this information pro-actively.
Thus if the client decides to skip the loop based on the dependences
we've computed the checks unnecessarily. In order to see whether this
was a significant overhead I checked compile time on SPEC2k6 LTO bitcode
files. The change was in the noise.
The checks are generated in canCheckPtrAtRT, at the same place where we
used to call groupChecks to merge checks.
llvm-svn: 244368
Before, we were passing the pointer partitions to LAA. Now, we get all
the checks from LAA and filter out the checks within partitions in
LoopDistribution.
This effectively concludes the steps to move filtering memchecks from
LAA into its clients. There is still some cleanup left to remove the
unused interfaces in LAA that still take PtrPartition.
(Moving this functionality to LoopDistribution also requires
needsChecking on pointers to be made public.)
llvm-svn: 243613
Before the patch, the checks were generated internally in
addRuntimeCheck. Now, we use the new overloaded version of
addRuntimeCheck that takes the ready-made set of checks as a parameter.
The checks are now generated by the client (LoopDistribution) with the
new RuntimePointerChecking::generateChecks API.
Also the new printChecks API is used to print out the checks for
debugging.
This is to continue the transition over to the new model whereby clients
will get the full set of checks from LAA, filter it and then pass it to
LoopVersioning and in turn to addRuntimeCheck.
llvm-svn: 243382
Instead of the pattern
for (auto I = x.rbegin(), E = x.end(); I != E; ++I)
we can use make_range to construct the reverse range and iterate using
that instead.
llvm-svn: 243163
I am planning to add more nested classes inside RuntimePointerCheck so
all these triple-nesting would be hard to follow.
Also rename it to RuntimePointerChecking (i.e. append 'ing').
llvm-svn: 242218
Summary:
The class will obviously need improvement down the road. For one, there
is no reason that addPHINodes would have to be exposed like that. I
will make this and other improvements in follow-up patches.
The main goal is to be able to share this functionality. The
LoopLoadElimination pass I am working on needs it too. Later we can
move other clients as well (LV and Ashutosh's LICMVer).
Reviewers: hfinkel, ashutosh.nema
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10577
llvm-svn: 241932
Summary:
This makes them available to the LoopVersioning class as that is moved
to its own module in the next patch.
Reviewers: ashutosh.nema, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10576
llvm-svn: 241931
As with the previous patch, the goal is to turn the class into a general
loop-versioning class. This patch removes any references to loop
distribution.
llvm-svn: 240352
Summary:
This implements the initial version as was proposed earlier this year
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-January/080462.html).
Since then Loop Access Analysis was split out from the Loop Vectorizer
and was made into a separate analysis pass. Loop Distribution becomes
the second user of this analysis.
The pass is off by default and can be enabled
with -enable-loop-distribution. There is currently no notion of
profitability; if there is a loop with dependence cycles, the pass will
try to split them off from other memory operations into a separate loop.
I decided to remove the control-dependence calculation from this first
version. This and the issues with the PDT are actively discussed so it
probably makes sense to treat it separately. Right now I just mark all
terminator instruction required which keeps identical CFGs for each
distributed loop. This seems to be working pretty well for 456.hmmer
where even though there is an empty if-then block in the distributed
loop initially, it gets completely removed.
The pass keeps DominatorTree and LoopInfo updated. I've tested this
with -loop-distribute-verify with the testsuite where we distribute ~90
loops. SimplifyLoop is violated in some cases and I have a FIXME
covering this.
Reviewers: hfinkel, nadav, aschwaighofer
Reviewed By: aschwaighofer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8831
llvm-svn: 237358