Commit Graph

26 Commits

Author SHA1 Message Date
Silviu Baranga 4825060059 [LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC
llvm-svn: 243416
2015-07-28 13:44:08 +00:00
Adam Nemet 54f0b83ee2 [LAA] Split out a helper to print a collection of memchecks
This is effectively an NFC but we can no longer print the index of the
pointer group so instead I print its address.  This still lets us
cross-check the section that list the checks against the section that
list the groups (see how I modified the test).

E.g. before we printed this:

    Run-time memory checks:
    Check 0:
      Comparing group 0:
        %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
        %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
      Against group 1:
        %arrayidxA = getelementptr i16, i16* %a, i64 %ind
        %arrayidxA1 = getelementptr i16, i16* %a, i64 %add
    ...
    Grouped accesses:
      Group 0:
        (Low: %c High: (78 + %c))
          Member: {%c,+,4}<%for.body>
          Member: {(2 + %c),+,4}<%for.body>

Now we print this (changes are underlined):

    Run-time memory checks:
    Check 0:
      Comparing group (0x7f9c6040c320):
                       ~~~~~~~~~~~~~~
        %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
        %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
      Against group (0x7f9c6040c358):
                     ~~~~~~~~~~~~~~
        %arrayidxA1 = getelementptr i16, i16* %a, i64 %add
        %arrayidxA = getelementptr i16, i16* %a, i64 %ind
    ...
    Grouped accesses:
      Group 0x7f9c6040c320:
            ~~~~~~~~~~~~~~
        (Low: %c High: (78 + %c))
          Member: {(2 + %c),+,4}<%for.body>
          Member: {%c,+,4}<%for.body>

llvm-svn: 243354
2015-07-27 23:54:41 +00:00
Silviu Baranga 0e5804a6af Fix memcheck interval ends for pointers with negative strides
Summary:
The checking pointer grouping algorithm assumes that the
starts/ends of the pointers are well formed (start <= end).

The runtime memory checking algorithm also assumes this by doing:

 start0 < end1 && start1 < end0

to detect conflicts. This check only works if start0 <= end0 and
start1 <= end1.

This change correctly orders the interval ends by either checking
the stride (if it is constant) or by using min/max SCEV expressions.

Reviewers: anemet, rengolin

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11149

llvm-svn: 242400
2015-07-16 14:02:58 +00:00
Silviu Baranga a647c30f88 Cleanup after r241809 - remove uncessary call to std::sort
Summary:
The iteration order within a member of DepCands is deterministic
and therefore we don't have to sort the accesses within a member.
We also don't have to copy the indices of the pointers into a
vector, since we can iterate over the members of the class.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11145

llvm-svn: 242033
2015-07-13 14:48:24 +00:00
Silviu Baranga 3e3095c53a Add a test of a regression discovered during testing of r241673
Summary:
We were missing a corner case where DepCands was not available,
but we were using DepCands to compute the checking pointer
groups.

This adds a test for that regression.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11068

llvm-svn: 241818
2015-07-09 16:40:25 +00:00
Silviu Baranga ce3877fc8c Don't rely on the DepCands iteration order when constructing checking pointer groups
Summary:
The checking pointer group construction algorithm relied on the iteration on DepCands.
We would need the same leaders across runs and the same iteration order over the underlying std::set for determinism.

This changes the algorithm to process the pointers in the order in which they were added to the runtime check, which is deterministic.
We need to update the tests, since the order in which pointers appear has changed.

No new tests were added, since it is impossible to test for non-determinism.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11064

llvm-svn: 241809
2015-07-09 15:18:25 +00:00
Adam Nemet 424edc6c80 [LAA] Revert a small part of r239295
This commit ([LAA] Fix estimation of number of memchecks) regressed the
logic a bit.  We shouldn't quit the analysis if we encounter a pointer
without known bounds *unless* we actually need to emit a memcheck for
it.

The original code was using NumComparisons which is now computed
differently.  Instead I compute NeedRTCheck from NumReadPtrChecks and
NumWritePtrChecks.

As side note, I find the separation of NeedRTCheck and CanDoRT
confusing, so I will try to merge them in a follow-up patch.

llvm-svn: 241756
2015-07-08 22:58:48 +00:00
Silviu Baranga 1b6b50a921 [LAA] Merge memchecks for accesses separated by a constant offset
Summary:
Often filter-like loops will do memory accesses that are
separated by constant offsets. In these cases it is
common that we will exceed the threshold for the
allowable number of checks.

However, it should be possible to merge such checks,
sice a check of any interval againt two other intervals separated
by a constant offset (a,b), (a+c, b+c) will be equivalent with
a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap.
Assuming the loop will be executed for a sufficient number of
iterations, this will be true. If not true, checking against
(a, b+c) is still safe (although not equivalent).

As long as there are no dependencies between two accesses,
we can merge their checks into a single one. We use this
technique to construct groups of accesses, and then check
the intervals associated with the groups instead of
checking the accesses directly.

Reviewers: anemet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10386

llvm-svn: 241673
2015-07-08 09:16:33 +00:00
Adam Nemet c4866d29dd [LAA] Try to prove non-wrapping of pointers if SCEV cannot
Summary:
Scalar evolution does not propagate the non-wrapping flags to values
that are derived from a non-wrapping induction variable because
the non-wrapping property could be flow-sensitive.

This change is a first attempt to establish the non-wrapping property in
some simple cases.  The main idea is to look through the operations
defining the pointer.  As long as we arrive to a non-wrapping AddRec via
a small chain of non-wrapping instruction, the pointer should not wrap
either.

I believe that this essentially is what Andy described in
http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way
forward.

Reviewers: aschwaighofer, nadav, sanjoy, atrick

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10472

llvm-svn: 240798
2015-06-26 17:25:43 +00:00
Silviu Baranga 98a137196a [LAA] Fix estimation of number of memchecks
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
 
Assuming we have w writes and r reads, currently this number is  estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.

This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.

Test Plan:
llvm test suite
spec2k
spec2k6

Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).

Reviewers: anemet

Reviewed By: anemet

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D10217

llvm-svn: 239295
2015-06-08 10:27:06 +00:00
Hao Liu 751004a67d [LoopAccessAnalysis] Teach LAA to check the memory dependence between strided accesses.
Differential Revision: http://reviews.llvm.org/D9368

llvm-svn: 239285
2015-06-08 04:48:37 +00:00
Adam Nemet df3dc5b9ca [LoopAccesses] If shouldRetryWithRuntimeCheck, reset InterestingDependences
When dependence analysis encounters a non-constant distance between
memory accesses it aborts the analysis and falls back to run-time checks
only.  In this case we weren't resetting the array of dependences.

llvm-svn: 237574
2015-05-18 15:37:03 +00:00
Adam Nemet c3384320f2 [LoopAccesses] Rearrange printed lines in -analyze
"Store to invariant address..." is moved as the last line.  This is not
the prime result of the analysis.  Plus it simplifies some of the tests.

llvm-svn: 237573
2015-05-18 15:36:57 +00:00
Adam Nemet e2b885c4bc [getUnderlyingOjbects] Analyze loop PHIs further to remove false positives
Specifically, if a pointer accesses different underlying objects in each
iteration, don't look through the phi node defining the pointer.

The motivating case is the underlyling-objects-2.ll testcase.  Consider
the loop nest:

  int **A;
  for (i)
    for (j)
       A[i][j] = A[i-1][j] * B[j]

This loop is transformed by Load-PRE to stash away A[i] for the next
iteration of the outer loop:

  Curr = A[0];          // Prev_0
  for (i: 1..N) {
    Prev = Curr;        // Prev = PHI (Prev_0, Curr)
    Curr = A[i];
    for (j: 0..N)
       Curr[j] = Prev[j] * B[j]
  }

Since A[i] and A[i-1] are likely to be independent pointers,
getUnderlyingObjects should not assume that Curr and Prev share the same
underlying object in the inner loop.

If it did we would try to dependence-analyze Curr and Prev and the
analysis of the corresponding SCEVs would fail with non-constant
distance.

To fix this, the getUnderlyingObjects API is extended with an optional
LoopInfo parameter.  This is effectively what controls whether we want
the above behavior or the original.  Currently, I only changed to use
this approach for LoopAccessAnalysis.

The other testcase is to guard the opposite case where we do want to
look through the loop PHI.  If we step through an array by incrementing
a pointer, the underlying object is the incoming value of the phi as the
loop is entered.

Fixes rdar://problem/19566729

llvm-svn: 235634
2015-04-23 20:09:20 +00:00
Adam Nemet 26da8e9800 [LoopAccesses] Properly print whether memchecks are needed
Fix oversight in -analyze output.  PtrRtCheck contains the pointers that
need to be checked against each other and not whether memchecks are
necessary.

For instance in the testcase PtrRtCheck has four elements but all
no-alias so no checking is necessary.

llvm-svn: 234833
2015-04-14 01:12:55 +00:00
Adam Nemet ce48250f11 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
(Re-apply r234361 with a fix and a testcase for PR23157)

Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

llvm-svn: 234424
2015-04-08 17:48:40 +00:00
Adam Nemet e09a928c80 Revert "[LoopAccesses] Allow analysis to complete in the presence of uniform stores"
This reverts commit r234361.

It caused PR23157.

llvm-svn: 234387
2015-04-08 04:16:55 +00:00
Adam Nemet 0515c33b70 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

llvm-svn: 234361
2015-04-07 21:46:16 +00:00
Adam Nemet 4ff3ed5960 [LoopAccesses] Remove unused global variables in tests
llvm-svn: 233887
2015-04-02 04:42:51 +00:00
Adam Nemet 58913d65ad [LoopAccesses 3/3] Print the dependences with -analyze
The dependences are now expose through the new getInterestingDependences
API so we can use that with -analyze too and fix the FIXME.

This lets us remove the test that relied on -debug to check the
dependences.

llvm-svn: 231807
2015-03-10 17:40:43 +00:00
David Blaikie a79ac14fa6 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

llvm-svn: 230794
2015-02-27 21:17:42 +00:00
David Blaikie 79e6c74981 [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

llvm-svn: 230786
2015-02-27 19:29:02 +00:00
Adam Nemet 9cc0c3999d [LV/LoopAccesses] Backward dependences are not safe just because the
accesses are via different types

Noticed this while generalizing the code for loop distribution.

I confirmed with Arnold that this was indeed a bug and managed to create
a testcase.

llvm-svn: 230647
2015-02-26 17:58:48 +00:00
Adam Nemet e91cc6ef93 [LoopAccesses] Add -analyze support
The LoopInfo in combination with depth_first is used to enumerate the
loops.

Right now -analyze is not yet complete.  It only prints the result of
the analysis, the report and the run-time checks.  Printing the unsafe
depedences will require a bit more reshuffling which I'd like to do in a
follow-on to this patchset.  Unsafe dependences are currently checked
via -debug-only=loop-accesses in the new test.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229898
2015-02-19 19:15:19 +00:00
NAKAMURA Takumi fa520c5f49 Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector.
r229622: "[LoopAccesses] Make VectorizerParams global"
  r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it"
  r229624: "[LoopAccesses] Cache the result of canVectorizeMemory"
  r229626: "[LoopAccesses] Create the analysis pass"
  r229628: "[LoopAccesses] Change debug messages from LV to LAA"
  r229630: "[LoopAccesses] Add canAnalyzeLoop"
  r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport"
  r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport"
  r229633: "[LoopAccesses] Add -analyze support"
  r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference"
  r229638: "Analysis: fix buildbots"

llvm-svn: 229650
2015-02-18 08:34:47 +00:00
Adam Nemet 75bc2d111f [LoopAccesses] Add -analyze support
The LoopInfo in combination with depth_first is used to enumerate the
loops.

Right now -analyze is not yet complete.  It only prints the result of
the analysis, the report and the run-time checks.  Printing the unsafe
depedences will require a bit more reshuffling which I'd like to do in a
follow-on to this patchset.  Unsafe dependences are currently checked
via -debug-only=loop-accesses in the new test.

This is part of the patchset that converts LoopAccessAnalysis into an
actual analysis pass.

llvm-svn: 229633
2015-02-18 03:44:30 +00:00