Commit Graph

241 Commits

Author SHA1 Message Date
Johannes Doerfert 0eefb0258f [Refactor] Use nicer print callback function in IslAst
llvm-svn: 214447
2014-07-31 21:33:49 +00:00
Rafael Espindola 08dfd8f25f Update for llvm change.
llvm-svn: 214358
2014-07-30 23:17:15 +00:00
Tobias Grosser 924e9e0226 IslAst: Enhance parallelism detection test
Add more check lines to ensure we do not accidentally generate nested openmp
parallel annotations.

llvm-svn: 214200
2014-07-29 19:22:46 +00:00
Johannes Doerfert af9b1e2d80 [Refactor] Remove containsLoop to find innermost loops
Use the fact that if we visit a for node first in pre and next in post order
  we know we did not visit any children, thus we found an innermost loop.

  + Test case for an innermost loop with a conditional inside

llvm-svn: 213870
2014-07-24 15:59:06 +00:00
Johannes Doerfert f6583176ab [Refactor] Unify IslAst print methods
+ Add const annotations to some member functions

llvm-svn: 213779
2014-07-23 18:14:43 +00:00
Johannes Doerfert 43e1eadf26 [Refactor] Use attributes to mark function as invalid for polly
+ Test case annotated with the new attribute
  + Modified test case to check if subfunctions are annotated

llvm-svn: 213093
2014-07-15 21:06:48 +00:00
Johannes Doerfert 457f73eaee Annotate reduction parallel loops in the IslAst textual output
+ Introduced dependency type TYPE_TC_RED to represent the transitive closure
    (& the reverse) of reduction dependences. These are used when we check for
    reduction parallel loops.
  + Test cases including loop reversals and modulo schedules which compute
    reductions in a alternated order.

llvm-svn: 213019
2014-07-15 00:00:35 +00:00
Tobias Grosser c2920ff747 DeadCodeElimination: Fix liveout computation
We move back to a simple approach where the liveout is the last must-write
statement for a data-location plus all may-write statements. The previous
approach did not work out. We would have to consider per-data-access
dependences, instead of per-statement dependences to correct it. As this adds
complexity and it seems we would not gain anything over the simpler approach
that we implement in this commit, I moved us back to the old approach of
computing the liveout, but enhanced it to also add may-write accesses.

We also fix the test case and explain why we can not perform dead code
elimination in this case.

llvm-svn: 212925
2014-07-14 08:32:01 +00:00
Tobias Grosser 780ce0f8e3 DeadCodeElim: Compute correct liveout for non-affine accesses
Thanks to Johannes Doerfert for narrowing down the bug.

Reported-by: Chris Jenneisch <chrisj@codeaurora.org>
llvm-svn: 212796
2014-07-11 07:12:10 +00:00
Tobias Grosser 5e6813d184 Derive run-time conditions for delinearization
As our delinearization works optimistically, we need in some cases run-time
checks that verify our optimistic assumptions. A simple example is the
following code:

void foo(long n, long m, long o, double A[n][m][o]) {

  for (long i = 0; i < 100; i++)
    for (long j = 0; j < 150; j++)
      for (long k = 0; k < 200; k++)
        A[i][j][k] = 1.0;
}

After clang linearized the access to A and we delinearized it again to
A[i][j][k] we need to ensure that we do not access the delinearized array
out of bounds (this information is not available in LLVM-IR). Hence, we
need to verify the following constraints at run-time:

CHECK:   Assumed Context:
CHECK:   [o, m] -> {  : m >= 150 and o >= 200 }
llvm-svn: 212198
2014-07-02 17:47:48 +00:00
Johannes Doerfert f618339a37 Introduce reduction types
This change is particularly useful in the code generation as we need
  to know which binary operator/identity element we need to combine/initialize
  the privatization locations.

  + Print the reduction type for each memory access
  + Adjusted the test cases to comply with the new output format and
    to test for the right reduction type

llvm-svn: 212126
2014-07-01 20:52:51 +00:00
Johannes Doerfert 9890a05287 [FIX] Don't consider reductions which are partially outside the SCoP
+ Test case

llvm-svn: 212080
2014-07-01 00:32:29 +00:00
Johannes Doerfert 1a62c7a34a [Fix] Deleted renamed test after r211957
llvm-svn: 211964
2014-06-27 21:48:42 +00:00
Johannes Doerfert e58a012094 Allow multiple reductions per statement
Iterate over all store memory accesses and check for valid binary reduction
  candidate loads by following the operands of the stored value.  For each
  candidate pair we check if they have the same base address and there are no
  other accesses which may overlap with them. This ensures that no intermediate
  value can escape into other memory locations or is overwritten at some point.

  + 17 test cases for reduction detection and reduction dependency modeling

llvm-svn: 211957
2014-06-27 20:31:28 +00:00
Andreas Simbuerger b379edbb3e Don't expand to invalid Scops with -polly-detect-keep-going
Enabling -keep-going in ScopDetection causes expansion to an invalid
Scop candidate.

Region A     <- Valid candidate
   |
Region B     <- Invalid candidate

If -keep-going is enabled, ScopDetection would expand A to A+B because
the RejectLog is never checked for errors during expansion.

With this patch only A becomes a valid Scop.

llvm-svn: 211875
2014-06-27 06:21:14 +00:00
Johannes Doerfert 76dd493eff [Fix] Broken tests after r211796.
llvm-svn: 211797
2014-06-26 19:29:11 +00:00
Johannes Doerfert f8ee915deb Use wrapped reduction dependences
This change will ease the transision to multiple reductions per statement as
  we can now distinguish the effects of multiple reductions in the same
  statement.

  + Wrapped reduction dependences are used to compute privatization dependences
  + Modified test cases to account for the change

llvm-svn: 211795
2014-06-26 18:44:14 +00:00
Johannes Doerfert ea23b1d561 Hybrid dependency analysis
This dependency analysis will keep track of memory accesses if they might be
  part of a reduction. If not, the dependences are tracked on a statement level.
  The main reason to do this is to reduce the compile time while beeing able to
  distinguish the effects of reduction and non-reduction accesses.

  + Adjusted two test cases

llvm-svn: 211794
2014-06-26 18:38:08 +00:00
Andreas Simbuerger 99d4ab2b84 Add diagnostic remark for ReportVariantBasePtr
llvm-svn: 211777
2014-06-26 13:33:35 +00:00
Andreas Simbuerger 5569bf300d Support the new DiagnosticRemarks
Add support for generating optimization remarks after completing the
detection of Scops.
The goal is to provide end-users with useful hints about opportunities that
help to increase the size of the detected Scops in their code.

By default the remark is unspecified and the debug location is empty. Future
patches have to expand on the messages generated.

This patch brings a simple test case for ReportFuncCall to demonstrate the
feature.

Reports all missed opportunities to increase the size/number of valid
Scops:
 clang <...> -Rpass-missed="polly-detect" <...>
 opt <...> -pass-remarks-missed="polly-detect" <...>

Reports beginning and end of all valid Scops:
 clang <...> -Rpass="polly-detect" <...>
 opt <...> -pass-remarks="polly-detect" <...>

Differential Revision: http://reviews.llvm.org/D4171

llvm-svn: 211769
2014-06-26 10:06:40 +00:00
Tobias Grosser 50a5e6dac0 test/ScopInfo: Remove %defaultOpts and list passes explicitly
Due to bad habit we sometimes used a variable %defaultOpts that listed
a set of passes commonly run to prepare for Polly. None of these test cases
actually needs special preparation and only two of them need the 'basicaa' to
be scheduled. Scheduling the required alias analysis explicitly makes the test
cases clearer.

llvm-svn: 211671
2014-06-25 06:38:18 +00:00
Tobias Grosser 08031390d5 Clean up XFAILed test cases
We had a set of test cases that have been incomplete and XFAILED. This patch
completes a couple of the interesting ones and removes the ones which seem
redundant or not sufficiently reduced to be useful.

llvm-svn: 211670
2014-06-25 06:31:19 +00:00
Johannes Doerfert 5e275bc83a [Refactor] Create nicer test cases from C/C++
Insert a header into the new testcase containing a sample RUN line a FIXME and
an XFAIL. Then insert the formated C code and finally the LLVM-IR without
attributes, the module ID or the target triple.

llvm-svn: 211612
2014-06-24 17:02:53 +00:00
Yabin Hu cc91169fd7 Remove use of llvm.codegen intrinsic for GPGPU codegen
We use llvm.codegen intrinsic to generate code for embedded LLVM-IR
strings. The reason we introduce such a intrinsic is that previous
clang/opt tools was NOT linked with various LLVM targets and their
AsmParsers and AsmPrinters. Since clang/opt been linked with all the
needed libraries, we no longer need the llvm.codegen intrinsic.

llvm-svn: 211573
2014-06-24 08:11:36 +00:00
Johannes Doerfert f1906138b4 Model statement wise reduction dependences
+ Collect reduction dependences
+ Introduced TYPE_RED in Dependences.h which can be used to obtain the
  reduction dependences
+ Used TYPE_RED to prevent parallelization while we do not have a privatizing
  code generation
+ Relax the dependences for non-parallel code generation
+ Add privatization dependences to ensure correctness
+ 12 Test cases to check for reduction and privatization dependences

llvm-svn: 211369
2014-06-20 16:37:11 +00:00
Johannes Doerfert da80386700 Missing reduction detection test cases
llvm-svn: 211235
2014-06-18 23:08:14 +00:00
Tobias Grosser f4fcbf4097 Test delinearization of 2D diagonal matrix
llvm-svn: 210538
2014-06-10 14:48:17 +00:00
Tobias Grosser be7eaddc69 Adjust another test case to not access out of bounds
llvm-svn: 210208
2014-06-04 19:41:47 +00:00
Tobias Grosser 5416a0395f Adjust multidim test cases to not access out-of-bound memory
We do this currently only for test cases where we have integer offsets that
clearly access array dimensions out-of-bound.

-;   for (long i = 0; i < n; i++)
-;     for (long j = 0; j < m; j++)
-;       for (long k = 0; k < o; k++)
+;   for (long i = 0; i < n - 3; i++)
+;     for (long j = 4; j < m; j++)
+;       for (long k = 0; k < o - 7; k++)
 ;         A[i+3][j-4][k+7] = 1.0;

This will be helpful if we later want to simplify the access functions under the
assumption that they do not access memory out of bounds.

llvm-svn: 210179
2014-06-04 11:47:54 +00:00
Sebastian Pop 422e33f363 record delinearization result and reuse it in polyhedral translation
Without this patch, the testcase would fail on the delinearization of the second
array:

; void foo(long n, long m, long o, double A[n][m][o]) {
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       for (long k = 0; k < o; k++) {
;         A[i+3][j-4][k+7] = 1.0;
;         A[i][0][k] = 2.0;
;       }
; }

; CHECK: [n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[3 + i0, -4 + i1, 7 + i2] };
; CHECK: [n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[i0, 0, i2] };

Here is the output of FileCheck on the testcase without this patch:

; CHECK: [n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[i0, 0, i2] };
         ^
<stdin>:26:2: note: possible intended match here
 [n, m, o] -> { Stmt_for_body6[i0, i1, i2] -> MemRef_A[o0] };
 ^

It is possible to find a good delinearization for A[i][0][k] only in the context
of the delinearization of both array accesses.

There are two ways to delinearize together all array subscripts touching the
same base address: either duplicate the code from scop detection to first gather
all array references and then run the delinearization; or as implemented in this
patch, use the same delinearization info that we computed during scop detection.

llvm-svn: 210117
2014-06-03 18:16:31 +00:00
Johannes Doerfert c3958b214c Added option for n-dimensional rectangular tiling
+ CL-option --polly-tile-sizes=<int,...,int>
  The i'th value is used as a tile size for dimension i, if
  there is no i'th value, the value of --polly-default-tile-size is
  used

+ CL-option --polly-default-tile-size=int
  Used if no tile size is given for a dimension i

+ 3 Simple testcases

llvm-svn: 209753
2014-05-28 17:21:02 +00:00
Tobias Grosser 5f860fdfe9 Do not run GPGPU test cases without nvptx target
Tag the GPGPU codegen test cases as unsupported if the nvptx target is not
included in the current llvm build.

Contributed-by:  Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 208779
2014-05-14 14:18:14 +00:00
Sebastian Pop c5c1055e3f do not build llc and lli for polly test
llvm-svn: 208619
2014-05-12 19:43:20 +00:00
Sebastian Pop e8863b8f00 correct the delinearization failing case
collect terms from affine and non affine memory accesses

llvm-svn: 208616
2014-05-12 19:02:02 +00:00
Sebastian Pop fcf68758b8 unxfail passing testcase
llvm-svn: 208233
2014-05-07 18:01:32 +00:00
Tobias Grosser f56af204b9 Add delinearization testcase for ivs that do not follow the loop order
This is a test case that is currently failing, but that should start working
with an upcoming version of our delinearization pass.

llvm-svn: 207678
2014-04-30 17:49:22 +00:00
Tobias Grosser 841009a2cc We missed two files in the last commit.
Contributed-by:  Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 206901
2014-04-22 15:57:30 +00:00
Tobias Grosser 0d11dbabc4 Fixed missing cloog test with automake/configure build setup
Contributed-by:  Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 206900
2014-04-22 15:30:43 +00:00
Tobias Grosser 954939842f Really fix the load case.
Commit r206510 falsely advertised to fix the load cases, even though it only
fixed the store case. This commit adds the same fix for the load case including
the missing test coverage.

llvm-svn: 206577
2014-04-18 09:46:35 +00:00
Tobias Grosser 50fd7010d8 Ensure a scalar pointer when issuing a vector load
Even tough we may want to generate a vector load, the address from which to load
still is a scalar. Make sure even if previous address computations may have been
vectorized, that the addresses are also available as scalars.

This fixes http://llvm.org/PR19469

Reported-by:  Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 206510
2014-04-17 23:13:49 +00:00
Tobias Grosser 75b76729ab Fix for vector codegen in OpenMP subfunctions
Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 206332
2014-04-15 22:30:06 +00:00
Tobias Grosser 364c136d08 Dependences: Do not fail in case a schedule eliminates all dependences
The following example shows a non-parallel loop

void f(int a[]) {
  int i;
  for (i = 0; i < 10; ++i)
    A[i] = A[i+5];
}

which, in case we import a schedule that limits the iteration domain
to 0 <= i < 5, becomes parallel. Previously we crashed in such cases, now we
just recognize it as parallel.

This fixes http://llvm.org/PR19435

Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 206318
2014-04-15 20:14:57 +00:00
Tobias Grosser efc3013544 Codegeneration: Free memory correctly when using -polly-vectorizer=polly
This fixes PR19421.

Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 206156
2014-04-14 08:33:24 +00:00
Sebastian Pop cd3bb59aa2 only delinearize when the access function is not affine
llvm-svn: 205971
2014-04-10 16:08:11 +00:00
Tobias Grosser 79baa21242 ScopInfo: Scalar accesses are zero dimensional
llvm-svn: 205958
2014-04-10 08:38:02 +00:00
Sebastian Pop 1801668af3 delinearize memory access functions
llvm-svn: 205799
2014-04-08 21:20:44 +00:00
Tobias Grosser 64b95123ef Delete trivial PHI nodes (aka stack slot sharing)
During code preperation trivial PHI nodes (mainly introduced by lcssa) are
deleted to decrease the number of introduced allocas (==> dependences). However
simply replacing them by their only incoming value would cause the independent
block pass to introduce new allocas. To prevent this we try to share stack slots
during code preperarion, hence to reuse a already created alloca 'to demote' the
trivial PHI node. This works if we know that the value stored in this alloca
will be the incoming value of the trivial PHI at the end of the predecessor
block of this trivial PHI.

Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 205320
2014-04-01 16:01:33 +00:00
Tobias Grosser 5fa36c0ff6 Updated test/create_ll.sh to work with old & new clang versions.
We explicitly specifying all filenames instead of assuming some naming
convention used by clang and opt.

Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 204726
2014-03-25 15:50:44 +00:00
Tobias Grosser e275e9216b Return conservative result in case the dependence check timed out
For complex examples it may happen that we do not compute dependences. In this
case we do not want to crash, but just not detect parallel loops.

llvm-svn: 204470
2014-03-21 15:12:09 +00:00
Tobias Grosser 0dd463facf Support for generating vectors for loads with -1 stride
This patch enables vectorization of loops containing backward array
traversal (array stride is -1).

Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>
llvm-svn: 204257
2014-03-19 19:27:24 +00:00