Commit Graph

147 Commits

Author SHA1 Message Date
Tobias Grosser dc6b87c56e Add newline at end of debug print
In '[DBG] Allow to emit the RTC value at runtime' the diagnostics were printed
without a newline at the end of each diagnostic. We add such a newline to
improve readability.

llvm-svn: 288323
2016-12-01 08:08:47 +00:00
Michael Kruse 11c5e07925 canSynthesize: Remove unused argument LI. NFC.
The helper function polly::canSynthesize() does not directly use the LoopInfo
analysis, hence remove it from its argument list.

llvm-svn: 288144
2016-11-29 15:11:04 +00:00
Tobias Grosser df8f35b7b8 Update for clang-format change in r288119
llvm-svn: 288134
2016-11-29 12:52:08 +00:00
Tobias Grosser b3c3d149b9 [CodeGen] Add flag to code-generate most memory access expressions
Introduce the new flag -polly-codegen-generate-expressions which forces Polly
to code generate AST expressions instead of using our SCEV based access
expression generation even for cases where the original memory access relation
was not changed and the SCEV based access expression could be code generated
without any issue.

This is an experimental option for better testing the isl ast expression
generation. The default behavior of Polly remains unchanged. We also exclude
a couple of cases for which the AST expression is not yet working.

llvm-svn: 287694
2016-11-22 20:21:16 +00:00
Johannes Doerfert 81aa6e882f [NFC] Adjust naming scheme of statistic variables
Suggested-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 287347
2016-11-18 14:37:08 +00:00
Johannes Doerfert dae2e9287d [DBG] Collect statistics about actually versioned SCoPs
llvm-svn: 287267
2016-11-17 21:55:43 +00:00
Johannes Doerfert 8c5464a715 [DBG] Allow to emit the RTC value at runtime
The new command line flag "polly-codegen-emit-rtc-print" can be used to
place a "printf" in the generated code that will print the RTC value and
the overflow state.

llvm-svn: 287265
2016-11-17 21:49:19 +00:00
Tobias Grosser 16480186f8 IslNodeBuilder: Ensure newly generated memory accesses are well-defined
Add some additional asserts that ensure newly code-generated memory accesses
are defined on all domain and schedule domain instances.

llvm-svn: 286050
2016-11-05 21:46:01 +00:00
Eli Friedman acf8006471 [Polly CodeGen] Break critical edge from RTC to original loop.
This makes polly generate a CFG which is closer to what we want
in LLVM IR, with a loop preheader for the original loop. This is
just a cleanup, but it exposes some fragile assumptions.

I'm not completely happy with the changes related to expandCodeFor;
RTCBB->getTerminator() is basically a random insertion point which
happens to work due to the way we generate runtime checks. I'm not
sure what the right answer looks like, though.

Differential Revision: https://reviews.llvm.org/D26053

llvm-svn: 285864
2016-11-02 22:32:23 +00:00
Eli Friedman 3c1a75bf9c Handle multi-dimensional invariant load.
If the address of a load depends on another load, make sure to emit
the loads in the right order.

llvm-svn: 284426
2016-10-17 21:04:26 +00:00
Michael Kruse 4b0c5aea78 [CodeGen] Add assertion for indirect array index expression generation. NFC.
Currently Polly cannot generate code for index expressions if the base pointer
is computed within the scop. The base pointer must be generated as well, but
there is no code that triggers that.

Add an assertion to detect when this would occur and miscompile. The IR verifier
should catch it as well.

llvm-svn: 282893
2016-09-30 18:29:37 +00:00
Roman Gareev b3224adfb6 Perform copying to created arrays according to the packing transformation
This is the fourth patch to apply the BLIS matmul optimization pattern on matmul
kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf).
BLIS implements gemm as three nested loops around a macro-kernel, plus two
packing routines. The macro-kernel is implemented in terms of two additional
loops around a micro-kernel. The micro-kernel is a loop around a rank-1
(i.e., outer product) update. In this change we perform copying to created
arrays, which is the last step to implement the packing transformation.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D23260

llvm-svn: 281441
2016-09-14 06:26:09 +00:00
Roman Gareev f5aff70405 Store the size of the outermost dimension in case of newly created arrays that require memory allocation.
We do not need the size of the outermost dimension in most cases, but if we
allocate memory for newly created arrays, that size is needed.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D23991

llvm-svn: 281234
2016-09-12 17:08:31 +00:00
Tobias Grosser a3afe44d6c IslNodeBuilder: Add missing __isl_take annotation
llvm-svn: 281034
2016-09-09 11:16:50 +00:00
Tobias Grosser f3600dfa2d IslNodeBuilder: Add missing __isl_take annotations
llvm-svn: 280936
2016-09-08 13:48:55 +00:00
Tobias Grosser c80d6979bd Drop '@brief' from doxygen comments
LLVM's coding guideline suggests to not use @brief for one-sentence doxygen
comments to improve readability. Switch this once and for all to ensure people
do not copy @brief comments from other parts of Polly, when writing new code.

llvm-svn: 280468
2016-09-02 06:33:33 +00:00
Tobias Grosser fa9abd1f03 Fix compilation in 'asserts' mode
llvm-svn: 278025
2016-08-08 17:35:52 +00:00
Tobias Grosser 0aa29532b7 [IslNodeBuilder] Move run-time check generation to NodeBuilder [NFC]
This improves the structure of the code and allows us to reuse the runtime
code generation in the PPCGCodeGeneration.

llvm-svn: 278017
2016-08-08 15:41:52 +00:00
Tobias Grosser 000db70754 [IslNodeBuilder] Directly use the insert location of our Builder
... instead of adding instructions at the end of the basic block the builder
is currently at. This makes it easier to reason about where IR is generated,
as with the IRBuilder there is just a single location that specificies where
IR is generated.

llvm-svn: 278013
2016-08-08 15:25:46 +00:00
Tobias Grosser 00bb5a99f5 GPGPU: Handle scalar array references
Pass the content of scalar array references to the alloca on the kernel side
and do not pass them additional as normal LLVM scalar value.

llvm-svn: 277699
2016-08-04 06:55:59 +00:00
Tobias Grosser 2219d15748 Fix a couple of spelling mistakes
llvm-svn: 277569
2016-08-03 05:28:09 +00:00
Roman Gareev d7754a1245 Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions
Extend the jscop interface to allow the user to export arrays. It is required
that already existing arrays of the list of arrays correspond to arrays
of the SCoP. Each array that is appended to the list will be newly created.
Furthermore, we allow the user to modify access expressions to reference
any array in case it has the same element type.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D22828

llvm-svn: 277263
2016-07-30 09:25:51 +00:00
Tobias Grosser 86083da0ec IslNodeBuilder: expose addReferencesFromStmt [NFC]
This will be used by Polly GPGPU to determine the values that need to be
passed to GPU kernels.

llvm-svn: 276269
2016-07-21 13:15:55 +00:00
Tobias Grosser faef9a7667 Fix gcc compile failure
Commit r275056 introduced a gcc compile failure due to us using two
types named 'Type', the first being the newly introduced member variable
'Type' the second being llvm::Type. We resolve this issue by renaming
the newly introduced member variable to AccessType.

llvm-svn: 275057
2016-07-11 12:27:04 +00:00
Tobias Grosser 4e2d9c45b9 InvariantEquivClassTy: Use struct instead of 4-tuple to increase readability
Summary:
With a struct we can use named accessors instead of generic std::get<3>()
calls. This increases readability of the source code.

Reviewers: jdoerfert

Subscribers: pollydev, llvm-commits

Differential Revision: http://reviews.llvm.org/D21955

llvm-svn: 275056
2016-07-11 12:15:10 +00:00
Tobias Grosser 3717aa5ddb This reverts recent expression type changes
The recent expression type changes still need more discussion, which will happen
on phabricator or on the mailing list. The precise list of commits reverted are:

- "Refactor division generation code"
- "[NFC] Generate runtime checks after the SCoP"
- "[FIX] Determine insertion point during SCEV expansion"
- "Look through IntToPtr & PtrToInt instructions"
- "Use minimal types for generated expressions"
- "Temporarily promote values to i64 again"
- "[NFC] Avoid unnecessary comparison for min/max expressions"
- "[Polly] Fix -Wunused-variable warnings (NFC)"
- "[NFC] Simplify min/max expression generation"
- "Simplify the type adjustment in the IslExprBuilder"

Some of them are just reverted as we would otherwise get conflicts. I will try
to re-commit them if possible.

llvm-svn: 272483
2016-06-11 19:17:15 +00:00
Johannes Doerfert 0767a511ba Use minimal types for generated expressions
We now use the minimal necessary bit width for the generated code. If
  operations might overflow (add/sub/mul) we will try to adjust the types in
  order to ensure a non-wrapping computation. If the type adjustment is not
  possible, thus the necessary type is bigger than the type value of
  --polly-max-expr-bit-width, we will use assumptions to verify the computation
  will not wrap. However, for run-time checks we cannot build assumptions but
  instead utilize overflow tracking intrinsics.

llvm-svn: 271878
2016-06-06 09:57:41 +00:00
Matthew Simpson acae9e3b30 [Polly] Fix -Wunused-variable warnings (NFC)
llvm-svn: 271518
2016-06-02 14:26:38 +00:00
Johannes Doerfert d36553753e Simplify the type adjustment in the IslExprBuilder
We now have a simple function to adjust/unify the types of two (or three)
  operands before an operation that requieres the same type for all operands.
  Due to this change we will not promote parameters that are added to i64
  anymore if that is not needed.

llvm-svn: 271513
2016-06-02 11:15:57 +00:00
Johannes Doerfert 0f0d209bec Use the SCoP directly for canSynthesize [NFC]
llvm-svn: 270429
2016-05-23 12:47:09 +00:00
Johannes Doerfert ef74443c97 Duplicate part of the Region interface in the Scop class [NFC]
This allows to use the SCoP directly for various queries,
  thus to hide the underlying region more often.

llvm-svn: 270426
2016-05-23 12:42:38 +00:00
Johannes Doerfert 952b5304bc Add and use Scop::contains(Loop/BasicBlock/Instruction) [NFC]
llvm-svn: 270424
2016-05-23 12:40:48 +00:00
Johannes Doerfert a61eda7698 [FIX] Let ScalarEvolution forget hoisted values
We have to rethink the handling of escaping values in order to make
  this kind of "fixes" go away.

llvm-svn: 270409
2016-05-23 09:02:54 +00:00
Johannes Doerfert 404a0f81ea Check overflows in RTCs and bail accordingly
We utilize assumptions on the input to model IR in polyhedral world.
  To verify these assumptions we version the code and guard it with a
  runtime-check (RTC). However, since the RTCs are themselves generated
  from the polyhedral representation we generate them under the same
  assumptions that they should verify. In other words, the guarantees
  that we try to provide with the RTCs do not hold for the RTCs
  themselves. To this end it is necessary to employ a different check
  for the RTCs that will verify the assumptions did hold for them too.

Differential Revision: http://reviews.llvm.org/D20165

llvm-svn: 269299
2016-05-12 15:12:43 +00:00
Johannes Doerfert e243753a4d Simplify access relation for invariant loads early [NFC]
llvm-svn: 269046
2016-05-10 11:59:59 +00:00
Johannes Doerfert 5f173d414e Prevent complex access ranges with low number of pieces.
Previously we checked the number of pieces to decide whether or not a
  invariant load was to complex to be generated. However, there are
  cases when e.g., divisions cause the complexity to spike regardless of
  the number of pieces. To this end we now check the number of totally
  involved dimensions which will increase with the number of pieces but
  also the number of divisions.

llvm-svn: 269045
2016-05-10 11:46:57 +00:00
Michael Kruse bc150127ae Rename Conjuncts -> Disjunctions. NFC.
The check for complexity compares the number of polyhedra in a set,
which are combined by disjunctions (union, "OR"),
not conjunctions (intersection, "AND").

llvm-svn: 268223
2016-05-02 12:25:18 +00:00
Johannes Doerfert 8ab2803b63 [FIX] Propagate execution domain of invariant loads
If the base pointer of an invariant load is is loaded conditionally, that
  condition needs to hold for the invariant load too. The structure of the
  program will imply this for domain constraints but not for imprecisions in
  the modeling. To this end we will propagate the execution context of base
  pointers during code generation and thus ensure the derived pointer does
  not access an invalid base pointer.

llvm-svn: 267707
2016-04-27 12:49:11 +00:00
Johannes Doerfert a9dc529442 Collect and verify generated parallel subfunctions
We verify the optimized function now for a long time and it helped to track
  down bugs early. This will now also happen for all parallel subfunctions we
  generate.

llvm-svn: 265823
2016-04-08 18:16:02 +00:00
Johannes Doerfert 7b81103589 [FIX] Look through div & srem instructions in SCEVs
The findValues() function did not look through div & srem instructions
  that were part of the argument SCEV. However, in different other
  places we already look through it. This mismatch caused us to preload
  values in the wrong order.

llvm-svn: 265775
2016-04-08 10:25:58 +00:00
Michael Kruse c7e0d9c216 Fix non-synthesizable loop exit values.
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.

llvm-svn: 262404
2016-03-01 21:44:06 +00:00
Johannes Doerfert abadd71da1 [FIX] Prevent compile time problems due to complex invariant loads
This cures the symptoms we see in h264 of SPEC2006 but not the cause.

llvm-svn: 262327
2016-03-01 13:05:14 +00:00
Michael Kruse 6f7721f02b Introduce Scop::getStmtFor. NFC.
Replace Scop::getStmtForBasicBlock and Scop::getStmtForRegionNode, and
add overloads for llvm::Instruction and llvm::RegionNode.

getStmtFor and overloads become the common interface to get the Stmt
that contains something. Named after LoopInfo::getLoopFor and
RegionInfo::getRegionFor.

llvm-svn: 261791
2016-02-24 22:08:19 +00:00
Roman Gareev 11001e1534 Annotation of SIMD loops
Use 'mark' nodes annotate a SIMD loop during ScheduleTransformation and skip
parallelism checks.

The buildbot shows the following compile/execution time changes:

  Compile time:
    Improvements    Δ     Previous  Current  σ
    …/gesummv      -6.06% 0.2640    0.2480   0.0055
    …/gemver       -4.46% 0.4480    0.4280   0.0044
    …/covariance   -4.31% 0.8360    0.8000   0.0065
    …/adi          -3.23% 0.9920    0.9600   0.0065
    …/doitgen      -2.53% 0.9480    0.9240   0.0090
    …/3mm          -2.33% 1.0320    1.0080   0.0087

  Execution time:
    Regressions     Δ     Previous  Current  σ
    …/viterbi       1.70% 5.1840    5.2720   0.0074
    …/smallpt       1.06% 12.4920   12.6240  0.0040

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: http://reviews.llvm.org/D14491

llvm-svn: 261620
2016-02-23 09:00:13 +00:00
Johannes Doerfert 6a7c3e4bac Set AST Build for all statements [NFC]
llvm-svn: 260956
2016-02-16 12:11:03 +00:00
Johannes Doerfert 96e5471139 Separate invariant equivalence classes by type
We now distinguish invariant loads to the same memory location if they
  have different types. This will cause us to pre-load an invariant
  location once for each type that is used to access it. However, we can
  thereby avoid invalid casting, especially if an array is accessed
  though different typed/sized invariant loads.

  This basically reverts the changes in r260023 but keeps the test
  cases.

llvm-svn: 260045
2016-02-07 17:30:13 +00:00
Johannes Doerfert adeab372ca Simplify code [NFC]
llvm-svn: 260030
2016-02-07 13:57:32 +00:00
Tobias Grosser 107cd5f5f6 IslNodeBuilder: Invariant load hoisting of elements with differing sizes
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. These type mismatches are after
r259784 a lot more common and also arise with types of different size, which
have not been handled before.

Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather then a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.

llvm-svn: 260009
2016-02-06 21:23:39 +00:00
Tobias Grosser d840fc7277 Support accesses with differently sized types to the same array
This allows code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.

We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.

We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.

Finally, we add this as a new item to the 2.9 release notes

Reviewers: jdoerfert, Meinersbur

Differential Revision: http://reviews.llvm.org/D16878

llvm-svn: 259784
2016-02-04 13:18:42 +00:00
Michael Kruse 70131d3416 Introduce MemAccInst helper class; NFC
MemAccInst wraps the common members of LoadInst and StoreInst. Also use
of this class in:
- ScopInfo::buildMemoryAccess
- BlockGenerator::generateLocationAccessed
- ScopInfo::addArrayAccess
- Scop::buildAliasGroups
- Replace every use of polly::getPointerOperand

Reviewers: jdoerfert, grosser

Differential Revision: http://reviews.llvm.org/D16530

llvm-svn: 258947
2016-01-27 17:09:17 +00:00
Johannes Doerfert 370cf00c9f Make sure we preserve alignment information after hoisting invariant load
In Polly, after hoisting loop invariant loads outside loop, the alignment
information for hoisted loads are missing, this patch restore them.

Contributed-by: Lawrence Hu <lawrence@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D16160

llvm-svn: 258105
2016-01-19 00:17:21 +00:00
Tobias Grosser a535dff471 ScopInfo: Harmonize the different array kinds
Over time different vocabulary has been introduced to describe the different
memory objects in Polly, resulting in different - often inconsistent - naming
schemes in different parts of Polly. We now standartize this to the following
scheme:

  KindArray, KindValue, KindPHI, KindExitPHI
             | ------- isScalar -----------|

In most cases this naming scheme has already been used previously (this
minimizes changes and ensures we remain consistent with previous publications).
The main change is that we remove KindScalar to clearify the difference between
a scalar as a memory object of kind Value, PHI or ExitPHI and a value (former
KindScalar) which is a memory object modeling a llvm::Value.

We also move all documentation to the Kind* enum in the ScopArrayInfo class,
remove the second enum in the MemoryAccess class and update documentation to be
formulated from the perspective of the memory object, rather than the memory
access. The terms "Implicit"/"Explicit", formerly used to describe memory
accesses, have been dropped. From the perspective of memory accesses they
described the different memory kinds well - especially from the perspective of
code generation - but just from the perspective of a memory object it seems more
straightforward to talk about scalars and arrays, rather than explicit and
implicit arrays. The last comment is clearly subjective, though. A less
subjective reason to go for these terms is the historic use both in mailing list
discussions and publications.

llvm-svn: 255467
2015-12-13 19:59:01 +00:00
Tobias Grosser 2fd89da90d Remove non-debug printing of domain set
Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D15094

llvm-svn: 254343
2015-11-30 22:59:41 +00:00
Johannes Doerfert fdbf201fc9 [FIX] Do not generate code for parameters referencing dead values
Check if a value that is referenced by a parameter is dead and do not
generate code for the parameter in such a case.

llvm-svn: 252813
2015-11-11 22:40:51 +00:00
Johannes Doerfert dcfedf3505 [FIX] Cast pre-loaded values correctly or reload them with adjusted type.
Especially for structs, the SAI object of a base pointer does not
describe all the types that the user might expect when he loads from
that base pointer. While we will still cast integers and pointers we
will now reload the value with the correct type if floating point and
non-floating point values are involved. However, there are now TODOs
where we use bitcasts instead of a proper conversion or reloading.

This fixes bug 25479.

llvm-svn: 252706
2015-11-11 06:20:25 +00:00
Johannes Doerfert fc4bfc465a [FIX] Create empty invariant equivalence classes
We now create all invariant equivalence classes for required invariant loads
  instead of creating them on-demand. This way we can check if a parameter
  references an invariant load that is actually not executed and was therefor
  not materialized. If that happens the parameter is not materialized either.

This fixes bug 25469.

llvm-svn: 252701
2015-11-11 04:30:07 +00:00
Tobias Grosser 6abc75af4c ScopInfo: Introduce ArrayKind
Since 252422 we do not only distinguish two ScopArrayInfo kinds, PHI nodes
and others, but work with three kind of ScopArrayInfo objects. SCALAR, PHI and
ARRAY objects. Instead of keeping two boolean flags isPHI and isScalar and
wonder what an ScopArrayInfo object of kind (!isScalar && isPHI) is, we
list now explicitly the three different possible types of memory objects.

This change also allows us to remove the confusing nested pairs that have
been used in ArrayInfoMapTy.

llvm-svn: 252620
2015-11-10 17:31:31 +00:00
Johannes Doerfert 7a6e292d86 [FIX] Use same alloca for invariant loads and the scalar users
llvm-svn: 252451
2015-11-09 06:28:45 +00:00
Johannes Doerfert a768624f14 [FIX] Introduce different SAI objects for scalar and memory accesses
Even if a scalar and memory access have the same base pointer, we cannot use
  one SAI object as the type but also the number of dimensions are wrong. For
  the attached test case this caused a crash in the invariant load hoisting,
  though it could cause various other problems too.

This fixes bug 25428 and a execution time bug in MallocBench/cfrac.

Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 252422
2015-11-08 19:12:05 +00:00
Johannes Doerfert c4898504ea [FIX] Bail out if there is a dependence cycle between invariant loads
While the program cannot cause a dependence cycle between invariant
  loads, additional constraints (e.g., to ensure finite loops) can
  introduce them. It is hard to detect them in the SCoP description,
  thus we will only check for them at code generation time. If such a
  recursion is detected we will bail out the code generation and place a
  "false" runtime check to guarantee the original code is used.

  This fixes bug 25443.

llvm-svn: 252412
2015-11-07 19:46:04 +00:00
Duncan P. N. Exon Smith b8f58b53dd polly/ADT: Remove implicit ilist iterator conversions, NFC
Remove all the implicit ilist iterator conversions from polly, in
preparation for making them illegal in ADT.  There was one oddity I came
across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a
post-increment `Builder.GetInsertPoint()++`.

Since it was a no-op, I removed it, but I admit I wonder if it might be
a bug (both before and after this change)?  Perhaps it should be a
pre-increment?

llvm-svn: 252357
2015-11-06 22:56:54 +00:00
Johannes Doerfert 22892687f7 [FIX] Simplify and correct preloading of base pointer origin
To simplify and correct the preloading of a base pointer origin, e.g.,
  the base pointer for the current indirect invariant load, we now just
  check if there is an invariant access class that involves the base
  pointer of the current class.

llvm-svn: 251962
2015-11-03 19:15:33 +00:00
Johannes Doerfert 475d8e3f42 [FIX] Ensure base pointer origin was preloaded already
If a base pointer of a preloaded value has a base pointer origin, thus it is
  an indirect invariant load, we have to make sure the base pointer origin is
  preloaded first.

llvm-svn: 251946
2015-11-03 16:49:02 +00:00
Johannes Doerfert 3181c2ef72 [FIX] Correctly update SAI base pointer
If a base pointer load is preloaded, we have change the base pointer of
  the derived SAI. However, as the derived SAI relationship is is
  coarse grained, we need to check if we actually preloaded the base
  pointer or a different element of the base pointer SAI array.

llvm-svn: 251881
2015-11-03 01:42:59 +00:00
Johannes Doerfert af3e301a67 [FIX] Restructure invariant load equivalence classes
Sorting is replaced by a demand driven code generation that will pre-load a
  value when it is needed or, if it was not needed before, at some point
  determined by the order of invariant accesses in the program. Only in very
  little cases this demand driven pre-loading will kick in, though it will
  prevent us from generating faulty code. An example where it is needed is
  shown in:
    test/ScopInfo/invariant_loads_complicated_dependences.ll

  Invariant loads that appear in parameters but are not on the top-level (e.g.,
  the parameter is not a SCEVUnknown) will now be treated correctly.

Differential Revision: http://reviews.llvm.org/D13831

llvm-svn: 250655
2015-10-18 12:39:19 +00:00
Johannes Doerfert d8b6ad255f [FIX] Cast preloaded values
Preloaded values have to match the type of their counterpart in the
  original code and not the type of the base array.

llvm-svn: 250654
2015-10-18 12:36:42 +00:00
Johannes Doerfert 697fdf891c Consolidate invariant loads
If a (assumed) invariant location is loaded multiple times we
  generated a parameter for each location. However, this caused compile
  time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and
  BT in the NAS benchmarks). Additionally, the code we generate is
  suboptimal as we preload the same location multiple times and perform
  the same checks on all the parameters that refere to the same value.

  With this patch we consolidate the invariant loads in three steps:
    1) During SCoP initialization required invariant loads are put in
       equivalence classes based on their pointer operand. One
       representing load is used to generate a parameter for the whole
       class, thus we never generate multiple parameters for the same
       location.
    2) During the SCoP simplification we remove invariant memory
       accesses that are in the same equivalence class. While doing so
       we build the union of all execution domains as it is only
       important that the location is at least accessed once.
    3) During code generation we only preload one element of each
       equivalence class with the unified execution domain. All others
       are mapped to that preloaded value.

Differential Revision: http://reviews.llvm.org/D13338

llvm-svn: 249853
2015-10-09 17:12:26 +00:00
Johannes Doerfert 09e3697f44 Allow invariant loads in the SCoP description
This patch allows invariant loads to be used in the SCoP description,
  e.g., as loop bounds, conditions or in memory access functions.

  First we collect "required invariant loads" during SCoP detection that
  would otherwise make an expression we care about non-affine. To this
  end a new level of abstraction was introduced before
  SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
  ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
  if we want a load inside the region to be optimistically assumed
  invariant or not. If we do, it will be marked as required and in the
  SCoP generation we bail if it is actually not invariant. If we don't
  it will be a non-affine expression as before. At the moment we
  optimistically assume all "hoistable" (namely non-loop-carried) loads
  to be invariant. This causes us to expand some SCoPs and dismiss them
  later but it also allows us to detect a lot we would dismiss directly
  if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
  allow potential aliases between optimistically assumed invariant loads
  and other pointers as our runtime alias checks are sound in case the
  loads are actually invariant. Together with the invariant checks this
  combination allows to handle a lot more than LICM can.

  The code generation of the invariant loads had to be extended as we
  can now have dependences between parameters and invariant (hoisted)
  loads as well as the other way around, e.g.,
    test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
  First, it is important to note that we cannot have real cycles but
  only dependences from a hoisted load to a parameter and from another
  parameter to that hoisted load (and so on). To handle such cases we
  materialize llvm::Values for parameters that are referred by a hoisted
  load on demand and then materialize the remaining parameters. Second,
  there are new kinds of dependences between hoisted loads caused by the
  constraints on their execution. If a hoisted load is conditionally
  executed it might depend on the value of another hoisted load. To deal
  with such situations we sort them already in the ScopInfo such that
  they can be generated in the order they are listed in the
  Scop::InvariantAccesses list (see compareInvariantAccesses). The
  dependences between hoisted loads caused by indirect accesses are
  handled the same way as before.

llvm-svn: 249607
2015-10-07 20:17:36 +00:00
Johannes Doerfert 521dd5842f Move the ValueMapT declaration out of BlockGenerator
Value maps are created and used in many places and it is not always
  possible to include CodeGen/Blockgenerators.h. To this end, ValueMapT
  now lives in the ScopHelper.h which does not have any dependences itself.

  This patch also replaces uses of different other value map types with
  the ValueMapT.

llvm-svn: 249606
2015-10-07 20:15:56 +00:00
Tobias Grosser f4bb7a6a4d Consolidate the different ValueMapTypes we are using
There have been various places where llvm::DenseMap<const llvm::Value *,
llvm::Value *> types have been defined, but all types have been expected to be
identical. We make this more clear by consolidating the different types and use
BlockGenerator::ValueMapT wherever there is a need for types to match
BlockGenerator::ValueMapT.

llvm-svn: 249264
2015-10-04 10:18:32 +00:00
Tobias Grosser 1d45c6dadd IslExprBuilder: Use AssertingVH for IdToValueTy
llvm-svn: 249239
2015-10-03 17:20:00 +00:00
Johannes Doerfert 911951f4f8 Hand down referenced & globally mapped values to the subfunction
If a value is globally mapped (IslNodeBuilder::ValueMap) and
  referenced in the code that will be put into a subfunction, we hand
  down the new value to the subfunction.

  This patch also removes code that handed down all invariant loads to
  the subfunction. Instead, only needed invariant loads are given to the
  subfunction. There are two possible reasons for an invariant load to
  be handed down:
    1) The invariant load is used in a block that is placed in the
       subfunction but which is not the parent of the load. In this
       case, the scalar access that will read the loaded value, will
       cause its base pointer (the preloaded value) to be handed down to
       the subfunction.
    2) The invariant load is defined and used in a block that is placed
       in the subfunction. With this patch we will hand down the
       preloaded value to the subfunction as the invariant load is
       globally mapped to that value.

llvm-svn: 249126
2015-10-02 13:11:27 +00:00
Johannes Doerfert 850d346302 [FIX] Parallel codegen for invariant loads
Hand down all preloaded values to the parallel subfunction.

llvm-svn: 249010
2015-10-01 13:40:36 +00:00
Johannes Doerfert ebfd72493c [NFC] Extract materialization of parameters
llvm-svn: 248882
2015-09-30 09:52:08 +00:00
Johannes Doerfert ef19ead20e [FIX] Use escape logic for invariant loads
Before we unconditinoally forced all users outside the SCoP to use
  the preloaded value. However, if the SCoP is not executed due to the
  runtime checks, we need to use the original value because it might not
  be invariant in the first place.

llvm-svn: 248881
2015-09-30 09:43:20 +00:00
Johannes Doerfert c1db67e218 Identify and hoist definitively invariant loads
As a first step in the direction of assumed invariant loads (loads
  that are not written in some context) we now detect and hoist
  definitively invariant loads. These invariant loads will be preloaded
  in the code generation and used in the optimized version of the SCoP.
  If the load is only conditionally executed the preloaded version will
  also only be executed under the same condition, hence we will never
  access memory that wouldn't have been accessed otherwise. This is also
  the most distinguishing feature to licm.

  As hoisting can make statements empty we will simplify the SCoP and
  remove empty statements that would otherwise cause artifacts in the
  code generation.

Differential Revision: http://reviews.llvm.org/D13194

llvm-svn: 248861
2015-09-29 23:47:21 +00:00
Johannes Doerfert fb19dd694c Create parallel code in a separate block
This commit basically reverts r246427 but still solves the issue
  tackled by that commit. Instead of emitting initialization code in the
  beginning of the start block we now generate parallel code in its own
  block and thereby guarantee separation. This is necessary as we cannot
  generate code for hoisted loads prior to the start block but it still
  needs to be placed prior to everything else.

llvm-svn: 248674
2015-09-26 20:57:59 +00:00
Michael Kruse 8d0b734e71 Let MemoryAccess remember its purpose
There are three possible reasons to add a memory memory access: For explicit load and stores, for llvm::Value defs/uses, and to emulate PHI nodes (the latter two called implicit accesses). Previously MemoryAccess only stored IsPHI. Register accesses could be identified through the isScalar() method if it was no IsPHI. isScalar() determined the number of dimensions of the underlaying array, scalars represented by zero dimensions.

For the work on de-LICM, implicit accesses can have more than zero dimensions, making the distinction of isScalars() useless, hence now stored explicitly in the MemoryAccess. Instead, we replace it by isImplicit() and avoid the term scalar for zero-dimensional arrays as it might be confused with llvm::Value which are also often referred to as scalars (or alternatively, as registers).

No behavioral change intended, under the condition that it was impossible to create explicit accesses to zero-dimensional "arrays".

llvm-svn: 248616
2015-09-25 21:21:00 +00:00
Michael Kruse 7bf3944d23 Merge TempScopInfo.{cpp|h} into ScopInfo.{cpp|h}
This prepares for a series of patches that merges TempScopInfo into ScopInfo to
reduce Polly's code complexity. Only ScopInfo.{cpp|h} will be left thereafter.
Moving the code of TempScopInfo in one commit makes the mains diffs simpler to
understand.

In detail, merging the following classes is planned:
TempScopInfo into ScopInfo
TempScop into Scop
IRAccess into MemoryAccess

Only moving code, no functional changes intended.

Differential Version: http://reviews.llvm.org/D12693

llvm-svn: 247274
2015-09-10 12:46:52 +00:00
Tobias Grosser f1ac57c6cd IslNodeBuilder: Add virtual function to obtain the schedule of an ast node
Not all users of our IslNodeBuilder will attach scheduling information to the
AST in the same way IslAstInfo is doing it today. By going through a virtual
function when extracting the schedule of an AST node other users can provide
their own functions for extract scheduling information in case they attach
scheduling information in a different way to the AST nodes.

No functional change for Polly itself intended.

llvm-svn: 247126
2015-09-09 09:24:38 +00:00
Tobias Grosser e3d8c05c5f Add some more documentation and structure to the collection of subtree references
Some of the structures are renamed, subfunction introduced to clarify the
individual steps and comments are added describing their functionality.

llvm-svn: 246929
2015-09-05 15:45:25 +00:00
Tobias Grosser abcec37f64 IslNodeBuilder: Only obtain the isl_ast_build, when needed
In the common case, the access functions are not modified, hence there is no
need to obtain the IslAstBuild context at all. This should not only be minimally
faster, but this also allows the IslNodeBuilder to work on asts that are not
annotated with isl_ast_builds as long as the memory accesses are not modified.

llvm-svn: 246928
2015-09-05 13:03:57 +00:00
Tobias Grosser bc13260775 BlockGenerator: Make GlobalMap a member variable
The GlobalMap variable used in BlockGenerator should always reference the same
list througout the entire code generation, hence we can make it a member
variable to avoid passing it around through every function call.

History: Before we switched to the SCEV based code generation the GlobalMap
also contained a mapping form old to new induction variables, hence it was
different for each ScopStmt, which is why we passed it as function argument
to copyStmt. The new SCEV based code generation now uses a separate mapping
called LTS -> LoopToSCEV that maps each original loop to a new loop iteration
variable provided as a SCEVExpr. The GlobalMap is currently mostly used for
OpenMP code generation, where references to parameters in the original function
need to be rewritten to the locations of these variables after they have been
passed to the subfunction.

Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>
llvm-svn: 246920
2015-09-05 09:56:54 +00:00
Tobias Grosser f93451802a OpenMP-codegen: Correctly pass function arguments to subfunctions
Before we only checked if certain instructions can be expanded by us. Now we
check any value, including function arguments.

llvm-svn: 246425
2015-08-31 09:05:43 +00:00
Tobias Grosser 64c0ff4141 Add support for scalar dependences to OpenMP code generation
Scalar dependences between scop statements have caused troubles during parallel
code generation as we did not pass on the new stack allocation created for such
scalars to the parallel subfunctions. This change now detects all scalar
reads/writes in parallel subfunctions, creates the allocas for these scalar
objects, passes the resulting memory locations to the subfunctions and ensures
that within the subfunction requests for these memory locations will return the
rewritten values.

Johannes suggested as a future optimization to privatizing some of the scalars
in the subfunction.

llvm-svn: 246414
2015-08-31 05:52:24 +00:00
Tobias Grosser 2d1ed0bfa7 BlockGenerator: Add the possiblity to pass a set of new access functions
This change allows the BlockGenerator to be reused in contexts where we want to
provide different/modified isl_ast_expressions, which are not only changed to
a different access relation than the original statement, but which may indeed
be different for each code-generated instance of the statement.

We ensure testing of this feature by moving Polly's support to import changed
access functions through a jscop file to use the BlockGenerators support for
generating arbitary access functions if provided.

This commit should not change the behavior of Polly for now. The diff is rather
large, but most changes are due to us passing the NewAccesses hash table through
functions. This style, even though rather verbose, matches what is done
throughout the BlockGenerator with other per-statement properties.

llvm-svn: 246144
2015-08-27 07:28:16 +00:00
Tobias Grosser 39f9f30e8b Only derive number of loop iterations for loops we can actually vectorize
llvm-svn: 245870
2015-08-24 20:11:34 +00:00
Tobias Grosser 1ac884d73a Use marker nodes to annotate the different levels of tiling
Currently, marker nodes are ignored during AST generation, but visible in the
-debug-only=polly-ast output.

llvm-svn: 245809
2015-08-23 09:11:00 +00:00
Roman Gareev c49724f008 Manually check a loop form
Add manual check of a loop form and return non-negative number of iterations
in case of trivially vectorizable loop.

llvm-svn: 245680
2015-08-21 09:08:14 +00:00
Tobias Grosser b0da42fb55 Generate alias metadata even in OpenMP mode
To make alias scope metadata generation work in OpenMP mode we now provide
the ScopAnnotator with information about the base pointer rewrite that happens
when passing arrays into the OpenMP subfunction.

llvm-svn: 245451
2015-08-19 16:04:35 +00:00
Johannes Doerfert e69e1141d9 Introduce the ScopExpander as a SCEVExpander replacement
The SCEVExpander cannot deal with all SCEVs Polly allows in all kinds
  of expressions. To this end we introduce a ScopExpander that handles
  the additional expressions separatly and falls back to the
  SCEVExpander for everything else.

Reviewers: grosser, Meinersbur

Subscribers: #polly

Differential Revision: http://reviews.llvm.org/D12066

llvm-svn: 245288
2015-08-18 11:56:00 +00:00
Tobias Grosser 808cd69a92 Use schedule trees to represent execution order of statements
Instead of flat schedules, we now use so-called schedule trees to represent the
execution order of the statements in a SCoP. Schedule trees make it a lot easier
to analyze, understand and modify properties of a schedule, as specific nodes
in the tree can be choosen and possibly replaced.

This patch does not yet fully move our DependenceInfo pass to schedule trees,
as some additional performance analysis is needed here. (In general schedule
trees should be faster in compile-time, as the more structured representation
is generally easier to analyze and work with). We also can not yet perform the
reduction analysis on schedule trees.

For more information regarding schedule trees, please see Section 6 of
https://lirias.kuleuven.be/handle/123456789/497238

llvm-svn: 242130
2015-07-14 09:33:13 +00:00
Tobias Grosser b2f399264d Update isl to 93b8e43d
This update brings mostly interface cleanups, but also fixes two bugs in
imath (a memory leak, some undefined behavior).

llvm-svn: 238422
2015-05-28 13:32:11 +00:00
Tobias Grosser eeb9f3ce15 Drop unnecessary 'this->' pointers
llvm-svn: 238257
2015-05-26 21:37:31 +00:00
Tobias Grosser cd524dc51d Add explicit #includes for used isl features
llvm-svn: 236931
2015-05-09 09:36:38 +00:00
Tobias Grosser ba0d09227c Sort include directives
Upcoming revisions of isl require us to include header files explicitly, which
have previously been already transitively included. Before we add them, we sort
the existing includes.

Thanks to Chandler for sort_includes.py. A simple, but very convenient script.

llvm-svn: 236930
2015-05-09 09:13:42 +00:00
Tobias Grosser 0c55cb6071 Extract IslNodeBuilder into its own file
The IslNodeBuilder is a generic class that may be useful in other contexts
as well. Hence, we extract it into its own .h/.cpp file.

llvm-svn: 235873
2015-04-27 12:32:24 +00:00