Commit Graph

18 Commits

Author SHA1 Message Date
Florian Hahn 8eda4b0dc0 [LoopInterchange] Require asserts for test using -stats (NFC)
This fixes a buildbot failure.

llvm-svn: 329279
2018-04-05 13:07:39 +00:00
Florian Hahn 6e0043365b [LoopInterchange] Add stats counter for number of interchanged loops.
Reviewers: samparker, karthikthecool, blitz.opensource

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D45209

llvm-svn: 329269
2018-04-05 10:39:23 +00:00
Florian Hahn 831a757728 [LoopInterchange] Preserve LoopInfo after interchanging.
LoopInterchange relies on LoopInfo being up-to-date, so we should
preserve it after interchanging. This patch updates restructureLoops to
move the BBs of the interchanged loops to the right place.

Reviewers: davide, efriedma, karthikthecool, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D45278

llvm-svn: 329264
2018-04-05 09:48:45 +00:00
Florian Hahn 9467ccf447 [LoopInterchange] Add remark for calls preventing interchanging.
It also updates test/Transforms/LoopInterchange/call-instructions.ll
to use accesses where we can prove dependence after D35430.


Reviewers: sebpop, karthikthecool, blitz.opensource

Reviewed By: sebpop

Differential Revision: https://reviews.llvm.org/D45206

llvm-svn: 329111
2018-04-03 20:54:04 +00:00
Florian Hahn b79217077d [LoopInterchange] Update tests so DA can handle access after D35430.
I have taken the opportunity to simplify some tests slightly and move
parts around.

It also brings back a few IR checks for interchangable loops.

Reviewers: karthikthecool, sebpop, grosser

Reviewed By: sebpop

Differential Revision: https://reviews.llvm.org/D45207

llvm-svn: 329081
2018-04-03 16:37:58 +00:00
Sebastian Pop bf6e1c26cf DA: remove uses of GEP, only ask SCEV
It's been quite some time the Dependence Analysis (DA) is broken,
as it uses the GEP representation to "identify" multi-dimensional arrays.
It even wrongly detects multi-dimensional arrays in single nested loops:

from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6
;; for (long int i = 0; i < 50; i++) {
;; A[i][3*i - 6] = i;
;; *B++ = A[i][i];

DA used to detect two subscripts, which makes no sense in the LLVM IR
or in C/C++ semantics, as there are no guarantees as in Fortran of
subscripts not overlapping into a next array dimension:

maximum nesting levels = 1
SrcPtrSCEV = %A
DstPtrSCEV = %A
using GEPs
subscript 0
    src = {0,+,1}<nuw><nsw><%for.body>
    dst = {0,+,1}<nuw><nsw><%for.body>
    class = 1
    loops = {1}
subscript 1
    src = {-6,+,3}<nsw><%for.body>
    dst = {0,+,1}<nuw><nsw><%for.body>
    class = 1
    loops = {1}
Separable = {}
Coupled = {1}

With the current patch, DA will correctly work on only one dimension:

maximum nesting levels = 1
SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body>
DstSCEV = {%A,+,404}<%for.body>
subscript 0
    src = {(-2424 + %A)<nsw>,+,1212}<%for.body>
    dst = {%A,+,404}<%for.body>
    class = 1
    loops = {1}
Separable = {0}
Coupled = {}

This change removes all uses of GEP from DA, and we now only rely
on the SCEV representation.

The patch does not turn on -da-delinearize by default, and so the DA analysis
will be more conservative in the case of multi-dimensional memory accesses in
nested loops.

I disabled some interchange tests, as the DA is not able to disambiguate
the dependence anymore. To make DA stronger, we may need to
compute a bound on the number of iterations based on the access functions
and array dimensions.

The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to
avoid checking for snippets of LLVM IR: this form of checking is very hard to
maintain. Instead, we now check for output of the pass that are more meaningful
than dozens of lines of LLVM IR. Some tests now require -debug messages and thus
only enabled with asserts.

Patch written by Sebastian Pop and Aditya Kumar.

Differential Revision: https://reviews.llvm.org/D35430

llvm-svn: 326837
2018-03-06 21:55:59 +00:00
Florian Hahn ed45836253 [LoopInterchange] Add test case for D43236.
llvm-svn: 326078
2018-02-26 10:46:25 +00:00
Florian Hahn c6296fea3f [LoopInterchange] Incrementally update the dominator tree.
We can use incremental dominator tree updates to avoid re-calculating
the dominator tree after interchanging 2 loops.

Reviewers: dmgreen, kuhar

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D43176

llvm-svn: 325122
2018-02-14 13:13:15 +00:00
Florian Hahn 1f95ef1815 [LoopInterchange] Check number of latch successors before accessing them.
In cases where the OuterMostLoopLatchBI only has a single successor,
accessing the second successor will fail.

This fixes a failure when building the test-suite with loop-interchange
enabled.

Reviewers: mcrosier, karthikthecool, davide

Reviewed by: karthikthecool

Differential Revision: https://reviews.llvm.org/D42906

llvm-svn: 324994
2018-02-13 10:02:52 +00:00
David Green 907b60fbba [LoopInterchange] Fix phi node ordering miscompile.
The way that splitInnerLoopHeader splits blocks requires that
the induction PHI will be the first PHI in the inner loop
header. This makes sure that is actually the case when there
are both IV and reduction phis.

Differential Revision: https://reviews.llvm.org/D38682

llvm-svn: 316261
2017-10-21 13:58:37 +00:00
Florian Hahn cd78345398 [LoopInterchange] Skip zext instructions when looking for induction var.
Summary:
SimplifyIndVar may introduce zext instructions to widen arguments of the
loop exit check. They should not prevent us from splitting the loop at
the induction variable, but maybe the check should be more conservative,
e.g. making sure it only extends arguments used by a comparison?

Reviewers: karthikthecool, mcrosier, mzolotukhin

Reviewed By: mcrosier

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D34879

llvm-svn: 311783
2017-08-25 16:52:29 +00:00
Florian Hahn 4284049dcc [LoopInterchange] Do not interchange loops with function calls.
Summary:
Without any information about the called function, we cannot be sure
that it is safe to interchange loops which contain function calls. For
example there could be dependences that prevent interchanging between
accesses in the called function and the loops. Even functions without any
parameters could cause problems, as they could access memory using
global pointers.

For now, I think it is only safe to interchange loops with calls marked
as readnone.

With this patch, the LLVM test suite passes with `-O3 -mllvm
-enable-loopinterchange` and LoopInterchangeProfitability::isProfitable
returning true for all loops. check-llvm and check-clang also pass when
bootstrapped in a similar fashion, although only 3 loops got
interchanged.

Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper

Reviewed By: mcrosier

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35489

llvm-svn: 309547
2017-07-31 09:00:52 +00:00
Florian Hahn 6c7f0256c6 [LoopInterchange] Split up interchange.ll test case (NFC).
Summary:
Currently most tests for the loop interchange pass are in
test/Transforms/LoopInterchange/interchange.ll. This patch splits up the
large test file in smaller pieces, which makes debugging test failures
easier.

Reviewers: karthikthecool, blitz.opensource, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, mcrosier, mkuper, mzolotukhin, mssimpso, llvm-commits

Differential Revision: https://reviews.llvm.org/D35488

llvm-svn: 308284
2017-07-18 09:47:06 +00:00
Florian Hahn ad993521ac [LoopInterchange] Add some optimization remarks.
Reviewers: anemet, karthikthecool, blitz.opensource

Reviewed By: anemet

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35122

llvm-svn: 308094
2017-07-15 13:13:19 +00:00
Chad Rosier 00eb8db3a1 [LoopInterchange] Track all dependencies, not just anti dependencies.
Currently, we give up on loop interchange if we encounter a flow dependency
anywhere in the loop list. Worse yet, we don't even track output dependencies.

This patch updates the dependency matrix computation to track flow and output
dependencies in the same way we track anti dependencies.

This improves an internal workload by 2.2x.

Note the loop interchange pass is off by default and it can be enabled with
'-mllvm -enable-loopinterchange'

Differential Revision: https://reviews.llvm.org/D24564

llvm-svn: 282101
2016-09-21 19:16:47 +00:00
Karthik Bhat 8210fdf26e Add support to interchange loops with reductions.
This patch enables interchanging of tightly nested loops with reductions.
Differential Revision: http://reviews.llvm.org/D8314

llvm-svn: 235571
2015-04-23 04:51:44 +00:00
David Blaikie 23af64846f [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction
See r230786 and r230794 for similar changes to gep and load
respectively.

Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.

When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.

This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.

This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).

No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.

This leaves /only/ the varargs case where the explicit type is required.

Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.

About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.

import fileinput
import sys
import re

pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")

def conv(match, line):
  if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
    return line
  return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

for line in sys.stdin:
  sys.stdout.write(conv(re.search(pat, line), line))

llvm-svn: 235145
2015-04-16 23:24:18 +00:00
Karthik Bhat 88db86dd29 Add a new pass "Loop Interchange"
This pass interchanges loops to provide a more cache-friendly memory access.

For e.g. given a loop like -
  for(int i=0;i<N;i++)
    for(int j=0;j<N;j++)
      A[j][i] = A[j][i]+B[j][i];

is interchanged to -
  for(int j=0;j<N;j++)
    for(int i=0;i<N;i++)
      A[j][i] = A[j][i]+B[j][i];

This pass is currently disabled by default.

To give a brief introduction it consists of 3 stages-

LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix.
LoopInterchangeProfitability: A very basic heuristic has been added to check for profitibility. This will evolve over time.
LoopInterchangeTransform : Which does the actual transform.

LNT Performance tests shows improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver becnmarks.

TODO:
1) Add support for reductions and lcssa phi.
2) Improve profitability model.
3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange.
4) Improve compile time regression found in llvm lnt due to this pass.
5) Fix issues in Dependency Analysis module.

A special thanks to Hal for reviewing this code.
Review: http://reviews.llvm.org/D7499

llvm-svn: 231458
2015-03-06 10:11:25 +00:00