Commit Graph

494 Commits

Author SHA1 Message Date
Johannes Doerfert 561d36b320 Allow pointer expressions in SCEVs again.
In r247147 we disabled pointer expressions because the IslExprBuilder did not
  fully support them. This patch reintroduces them by simply treating them as
  integers. The only special handling for pointers that is left detects the
  comparison of two address_of operands and uses an unsigned compare.
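
The need for an unsigned compare can be illustrated with a small sketch. It assumes 64-bit addresses held as plain integers. The helper names `addrLessSigned`, `addrLessUnsigned` and `demoAddressCompare` are hypothetical and are not part of Polly or the IslExprBuilder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: compare two addresses that are held as raw 64-bit
// integers, as happens once pointer expressions are treated as integers.

// Signed interpretation: addresses with the top bit set look negative
// (two's complement conversion, guaranteed by the standard since C++20).
bool addrLessSigned(std::uint64_t A, std::uint64_t B) {
  return static_cast<std::int64_t>(A) < static_cast<std::int64_t>(B);
}

// Unsigned interpretation: matches the ordering of the address space.
bool addrLessUnsigned(std::uint64_t A, std::uint64_t B) { return A < B; }

// Small self-check showing where the two orderings disagree.
void demoAddressCompare() {
  const std::uint64_t Low = 0x1000ULL;
  const std::uint64_t High = 0xFFFF800000001000ULL;
  assert(addrLessUnsigned(Low, High)); // expected address ordering
  assert(!addrLessSigned(Low, High));  // High looks negative when signed
}
```

With a signed compare the high address would sort before the low one, which is why the comparison of two addresses is the one case that still needs special handling.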

llvm-svn: 265894
2016-04-10 09:50:10 +00:00
Johannes Doerfert 3c6a99b818 Add __isl_give annotations to return types [NFC]
llvm-svn: 265882
2016-04-09 21:55:23 +00:00
Johannes Doerfert a9dc529442 Collect and verify generated parallel subfunctions
We have verified the optimized function for a long time now, and it has helped
  to track down bugs early. This will now also happen for all parallel
  subfunctions we generate.

llvm-svn: 265823
2016-04-08 18:16:02 +00:00
Johannes Doerfert 7b81103589 [FIX] Look through div & srem instructions in SCEVs
The findValues() function did not look through div & srem instructions
  that were part of the argument SCEV. However, in several other
  places we already look through them. This mismatch caused us to preload
  values in the wrong order.

llvm-svn: 265775
2016-04-08 10:25:58 +00:00
Johannes Doerfert 6ba927148d [FIX] Adjust the insert point for non-affine region PHIs
If a non-affine region PHI is generated we should not move the insert
  point prior to the synthesized value in the same block as we might
  split that block at the insert point later on. Only if the incoming
  value should be placed in a different block we should change the
  insertion point.

llvm-svn: 265132
2016-04-01 11:25:47 +00:00
Tobias Grosser b339594f5d CodegenCleanup: Drop -load-combine pass
This pass is not enabled in the default tool chain and currently can run into an
infinite loop, due to other parts of LLVM generating incorrect IR
(http://llvm.org/PR27065) -- which is not executed and consequently does not
seem to disturb other passes.  As this pass is not really needed, we can just
drop it to get our build clean.

This fixes the timeout issues in MultiSource/Benchmarks/MiBench/consumer-jpeg
and MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg for
-polly-position=before-vectorizer -polly-process-unprofitable. Unfortunately,
we are still left with a miscompile in cjpeg.

llvm-svn: 264396
2016-03-25 12:11:06 +00:00
Johannes Doerfert 47197fe3f3 Add namespace for struct [NFC]
This will clean up the doxygen documentation.

llvm-svn: 264272
2016-03-24 13:20:52 +00:00
Tobias Grosser bfb6a9683b Codegen:Do not invalidate dominator tree when bailing out during code generation
When codegenerating invariant loads in some rare cases we cannot generate code
and bail out. This change ensures that we maintain a valid dominator tree
in these situations. This fixes llvm.org/PR26736

Contributed-by: Matthias Reisinger <d412vv1n@gmail.com>
llvm-svn: 264142
2016-03-23 06:57:51 +00:00
Michael Kruse faedfcbf6d [BlockGenerator] Fix PHI merges for MK_Arrays.
Value merging is only necessary for scalars when they are used outside
of the scop. While an array's base pointer can be used after the scop,
it gets an extra ScopArrayInfo of type MK_Value. We used to generate
phi's for both of them, where one was assuming the result of the other
phi would be the original value, because it has already been replaced by
the previous phi. This resulted in IR that the current IR verifier
allows, but is probably illegal.

This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-process-unprofitable
from 16 to 10.

Also see llvm.org/PR26718.

llvm-svn: 262629
2016-03-03 17:20:43 +00:00
Hongbin Zheng 2a798853f8 Allow the client of DependenceInfo to obtain dependences at different granularities.
llvm-svn: 262591
2016-03-03 08:15:33 +00:00
Michael Kruse c7e0d9c216 Fix non-synthesizable loop exit values.
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.

llvm-svn: 262404
2016-03-01 21:44:06 +00:00
Johannes Doerfert 066dbf3f8e Track assumptions and restrictions separately
In order to speed up compile time and to avoid random timeouts we now
  separately track assumptions and restrictions. In this context
  assumptions describe parameter valuations we need and restrictions
  describe parameter valuations we do not allow. During AST generation
  we create a runtime check for both, whereas the one for the
  restrictions is negated before a conjunction is built.

  Except for the In-Bounds assumptions, we currently track only restrictions.
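
The way the two constraint sets combine into one runtime check can be sketched as follows. This is only an illustration under simplifying assumptions: a single integer parameter stands in for the scop parameters, plain predicates stand in for the isl sets Polly uses, and `ParamPredicate` and `buildRuntimeCheck` are hypothetical names.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a constraint over the scop parameters.
using ParamPredicate = std::function<bool(long)>;

// Combine both constraint kinds into one check: the assumed valuations must
// hold, and the restricted valuations must not. The restriction is negated
// before the conjunction is formed.
ParamPredicate buildRuntimeCheck(ParamPredicate Assumed,
                                 ParamPredicate Restricted) {
  assert(Assumed && Restricted && "both predicates must be callable");
  return [=](long N) { return Assumed(N) && !Restricted(N); };
}
```

For example, with an assumption `N >= 0` and a restriction `N > 1000`, the combined check accepts only parameter values from 0 to 1000.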

Differential Revision: http://reviews.llvm.org/D17247

llvm-svn: 262328
2016-03-01 13:06:28 +00:00
Johannes Doerfert abadd71da1 [FIX] Prevent compile time problems due to complex invariant loads
This cures the symptoms we see in h264 of SPEC2006 but not the cause.

llvm-svn: 262327
2016-03-01 13:05:14 +00:00
Tobias Grosser 64ca00c344 IslAst: Expose run-time check generation as individual function
This allows constructing run-time checks for a scop without having to generate
a full AST. This is currently not taken advantage of in Polly itself, but
external users may benefit from this feature.

llvm-svn: 262009
2016-02-26 12:59:38 +00:00
Hongbin Zheng defd098612 Adapt to LLVM head, again
llvm-svn: 261905
2016-02-25 17:54:42 +00:00
Hongbin Zheng 566c614525 Revert "Adapt to LLVM head. NFC"
This reverts commit 4d3753b9646a69c00d234ccd6e91dc3d0ea5d643.

llvm-svn: 261892
2016-02-25 16:46:17 +00:00
Hongbin Zheng f4e35f9cb9 Adapt to LLVM head. NFC
llvm-svn: 261886
2016-02-25 16:36:09 +00:00
Michael Kruse 8f25b0cb4d Use inline local variable declaration. NFC.
llvm-svn: 261876
2016-02-25 15:52:43 +00:00
Johannes Doerfert a792098047 Support calls with known ModRef function behaviour
Check the ModRefBehaviour of functions in order to decide whether or
  not a call instruction might be acceptable.

Differential Revision: http://reviews.llvm.org/D5227

llvm-svn: 261866
2016-02-25 14:08:48 +00:00
Michael Kruse f33c125dd2 Fix DomTree preservation for generated subregions.
The generated dedicated subregion exit block was assumed to have the same
dominance relation as the original exit block. This is incorrect if the exit
block receives other edges than only from the subregion, which results in that
e.g. the subregion's entry block does not dominate the exit block.

llvm-svn: 261865
2016-02-25 14:08:48 +00:00
Michael Kruse 375cb5fe0a Introduce ScopStmt::getEntryBlock(). NFC.
This replaces an ugly inline ternary operator pattern.

llvm-svn: 261792
2016-02-24 22:08:24 +00:00
Michael Kruse 6f7721f02b Introduce Scop::getStmtFor. NFC.
Replace Scop::getStmtForBasicBlock and Scop::getStmtForRegionNode, and
add overloads for llvm::Instruction and llvm::RegionNode.

getStmtFor and overloads become the common interface to get the Stmt
that contains something. Named after LoopInfo::getLoopFor and
RegionInfo::getRegionFor.

llvm-svn: 261791
2016-02-24 22:08:19 +00:00
Michael Kruse eac9726e8c Add assertions checking def dominates use. NFC.
This is also caught by the function verifier, but disconnected from
the place that produced it. Catch it already at creation to be able to
reason more directly about the cause.

llvm-svn: 261790
2016-02-24 22:08:14 +00:00
Roman Gareev 11001e1534 Annotation of SIMD loops
Use 'mark' nodes to annotate a SIMD loop during ScheduleTransformation and skip
parallelism checks.

The buildbot shows the following compile/execution time changes:

  Compile time:
    Improvements    Δ     Previous  Current  σ
    …/gesummv      -6.06% 0.2640    0.2480   0.0055
    …/gemver       -4.46% 0.4480    0.4280   0.0044
    …/covariance   -4.31% 0.8360    0.8000   0.0065
    …/adi          -3.23% 0.9920    0.9600   0.0065
    …/doitgen      -2.53% 0.9480    0.9240   0.0090
    …/3mm          -2.33% 1.0320    1.0080   0.0087

  Execution time:
    Regressions     Δ     Previous  Current  σ
    …/viterbi       1.70% 5.1840    5.2720   0.0074
    …/smallpt       1.06% 12.4920   12.6240  0.0040

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: http://reviews.llvm.org/D14491

llvm-svn: 261620
2016-02-23 09:00:13 +00:00
Tobias Grosser 820cf20a98 IslAst: Expose IslAst class in header file [NFC]
This allows other passes and transformations to use some of the existing AST
building infrastructure. This is not yet used in Polly itself.

llvm-svn: 261496
2016-02-21 20:01:28 +00:00
Tobias Grosser 2b809d1390 BlockGenerator: Drop unnecessary return value
llvm-svn: 261473
2016-02-21 15:44:34 +00:00
Tobias Grosser 58e585444a Codegen: Print error in Polly code verification and allow to disable verification.
We now always print the reason why the code did not pass the LLVM verifier and
we also allow disabling verification with -polly-codegen-verify=false. Before
this change the first assertion generally gave no information about why or what
might have gone wrong, and it was also impossible to use -view-cfg without
recompiling. This change makes debugging bugs that result in incorrect IR a lot
easier.

llvm-svn: 261320
2016-02-19 11:07:12 +00:00
Hongbin Zheng 8831eb7db4 [Refactor] Move isl_ctx into Scop.
After we moved isl_ctx into Scop, we need to free the isl_ctx after
  freeing all isl objects, which requires the ScopInfo pass to be freed
  last. But this is not guaranteed by the PassManager, and we need
  extra code to free the isl_ctx at the right time.

  We introduced a shared pointer to manage the isl_ctx, and distribute
  it to all analyses that create isl objects. As such, whenever we free
  an analysis holding the shared_ptr (and also free the isl objects
  created by that analysis), we decrease the (shared) reference
  counter of the shared_ptr by 1. Whenever the reference counter reaches
  0 in the releaseMemory function of an analysis, that analysis is
  the last one that holds any isl objects, and we can safely free the
  isl_ctx with that analysis.
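
The ownership scheme described above can be sketched as follows. To keep the example free of the isl dependency, `FakeCtx` stands in for `isl_ctx`, and `makeSharedCtx`, `Analysis` and `FreedContexts` are hypothetical names. With the real library, the deleter would presumably call `isl_ctx_free`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for isl_ctx so that the sketch compiles without isl.
struct FakeCtx {};

// Counts how often the context deleter ran (for illustration only).
static int FreedContexts = 0;

// Create a context that is released when the last owner drops its handle.
std::shared_ptr<FakeCtx> makeSharedCtx() {
  return std::shared_ptr<FakeCtx>(new FakeCtx, [](FakeCtx *Ctx) {
    ++FreedContexts;
    delete Ctx;
  });
}

// Hypothetical analysis that creates objects living in the shared context.
struct Analysis {
  std::shared_ptr<FakeCtx> Ctx;

  explicit Analysis(std::shared_ptr<FakeCtx> C) : Ctx(std::move(C)) {
    assert(Ctx && "an analysis needs a live context");
  }

  // Mirrors releaseMemory(): objects created in the context would be freed
  // first, then the handle is dropped, decrementing the reference count.
  void releaseMemory() { Ctx.reset(); }
};
```

Whichever analysis releases its handle last triggers the deleter, so the order in which the analyses are freed no longer matters.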

Differential Revision: http://reviews.llvm.org/D17241

llvm-svn: 261100
2016-02-17 15:49:21 +00:00
Johannes Doerfert 2c3ffc04f3 Replace getLoopForInst by getLoopForStmt
This patch was extracted from http://reviews.llvm.org/D13611.

llvm-svn: 260958
2016-02-16 12:36:14 +00:00
Johannes Doerfert 6a7c3e4bac Set AST Build for all statements [NFC]
llvm-svn: 260956
2016-02-16 12:11:03 +00:00
Tobias Grosser 652f780894 CodeGeneration: Add back verification of generated code
This got accidentally dropped in r260025

llvm-svn: 260857
2016-02-14 20:56:49 +00:00
Johannes Doerfert 96e5471139 Separate invariant equivalence classes by type
We now distinguish invariant loads to the same memory location if they
  have different types. This will cause us to pre-load an invariant
  location once for each type that is used to access it. However, we can
  thereby avoid invalid casting, especially if an array is accessed
  through differently typed/sized invariant loads.

  This basically reverts the changes in r260023 but keeps the test
  cases.
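
A minimal sketch of the grouping idea, assuming loads are identified by a location name and a type name. The strings, `InvariantLoad`, `ClassKey` and `groupInvariantLoads` are hypothetical and do not reflect Polly's actual data structures.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of an invariant load: the location it reads and the
// type it reads that location as.
struct InvariantLoad {
  std::string Location; // e.g. "A[0]"
  std::string Type;     // e.g. "i32" or "float"
};

// Classes are keyed by (location, type) instead of by location alone.
using ClassKey = std::pair<std::string, std::string>;

std::map<ClassKey, std::vector<InvariantLoad>>
groupInvariantLoads(const std::vector<InvariantLoad> &Loads) {
  std::map<ClassKey, std::vector<InvariantLoad>> Classes;
  for (const InvariantLoad &Load : Loads) {
    assert(!Load.Type.empty() && "every load needs a type");
    Classes[ClassKey(Load.Location, Load.Type)].push_back(Load);
  }
  return Classes;
}
```

Two loads of the same location with different types end up in different classes, so each class can be preloaded once with its own type and no cast between types is needed.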

llvm-svn: 260045
2016-02-07 17:30:13 +00:00
Johannes Doerfert adeab372ca Simplify code [NFC]
llvm-svn: 260030
2016-02-07 13:57:32 +00:00
Tobias Grosser 8ebdc2dd53 Make memory accesses with different element types optional
We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.

llvm-svn: 260025
2016-02-07 08:48:57 +00:00
Tobias Grosser 107cd5f5f6 IslNodeBuilder: Invariant load hoisting of elements with differing sizes
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. Since r259784 these type mismatches
are a lot more common and also arise with types of different size, which
have not been handled before.

Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather than a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.

llvm-svn: 260009
2016-02-06 21:23:39 +00:00
Tobias Grosser d840fc7277 Support accesses with differently sized types to the same array
This allows code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger than the canonical element type are modeled as
multiple accesses with the smaller type.
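
The choice of the canonical element size can be sketched as follows. The sketch works on sizes in bits and uses the hypothetical helpers `gcdBits`, `canonicalElementBits` and `elementsPerAccess`. It relies on the observation that when the sizes are multiples of each other, their gcd equals the smallest size, so one gcd computation covers both cases.

```cpp
#include <cassert>
#include <vector>

// Euclid's algorithm; gcdBits(0, X) == X, which makes 0 a neutral start.
static unsigned gcdBits(unsigned A, unsigned B) {
  while (B != 0) {
    unsigned T = A % B;
    A = B;
    B = T;
  }
  return A;
}

// Canonical element size (in bits) for all access sizes to one array.
unsigned canonicalElementBits(const std::vector<unsigned> &AccessBits) {
  assert(!AccessBits.empty() && "need at least one access");
  unsigned Canonical = 0;
  for (unsigned Bits : AccessBits)
    Canonical = gcdBits(Canonical, Bits);
  return Canonical;
}

// Number of canonical elements that a single access of AccessBits covers.
unsigned elementsPerAccess(unsigned AccessBits, unsigned CanonicalBits) {
  assert(CanonicalBits != 0 && AccessBits % CanonicalBits == 0);
  return AccessBits / CanonicalBits;
}
```

For the `Float` array in the example, the 8-bit store and the 32-bit load give a canonical size of 8 bits, so the load is modeled as four accesses of the smaller type.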

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.
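
For the one-dimensional case, the relation between the modeled access range and the single access address can be sketched as follows. `accessedElements` and `accessAddress` are hypothetical helpers, not the actual getAccessAddressFunction, which operates on isl maps.

```cpp
#include <cassert>
#include <utility>

// Inclusive range [first, second] of canonical elements touched by statement
// instance I when one access spans Width canonical elements. This mirrors
// constraints of the form Width*i0 <= o0 <= Width - 1 + Width*i0.
std::pair<long, long> accessedElements(long I, long Width) {
  assert(Width > 0 && "an access must cover at least one element");
  return std::pair<long, long>(Width * I, Width * I + Width - 1);
}

// Single location used for code generation: the smallest accessed element,
// which is the lexicographic minimum of the one-dimensional range.
long accessAddress(long I, long Width) {
  return accessedElements(I, Width).first;
}
```

With a width of 4, instance `i0 = 5` touches elements 20 to 23 and the load address is element 20.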

We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.

We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.

Finally, we add this as a new item to the 2.9 release notes

Reviewers: jdoerfert, Meinersbur

Differential Revision: http://reviews.llvm.org/D16878

llvm-svn: 259784
2016-02-04 13:18:42 +00:00
Tobias Grosser e2c31210b2 Revert "Support loads with differently sized types from a single array"
This reverts commit (@259587). It needs some further discussions.

llvm-svn: 259629
2016-02-03 05:53:27 +00:00
Tobias Grosser 5d3fc1ea43 Support loads with differently sized types from a single array
We now support code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];
  }
}

To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.

llvm-svn: 259587
2016-02-02 22:05:29 +00:00
Johannes Doerfert 800e17a75c Add const keyword to MemoryAccess argument [NFC]
llvm-svn: 259504
2016-02-02 14:16:01 +00:00
Michael Kruse 70131d3416 Introduce MemAccInst helper class; NFC
MemAccInst wraps the common members of LoadInst and StoreInst. Also use
of this class in:
- ScopInfo::buildMemoryAccess
- BlockGenerator::generateLocationAccessed
- ScopInfo::addArrayAccess
- Scop::buildAliasGroups
- Replace every use of polly::getPointerOperand

Reviewers: jdoerfert, grosser

Differential Revision: http://reviews.llvm.org/D16530

llvm-svn: 258947
2016-01-27 17:09:17 +00:00
Michael Kruse ee6a4fc680 Unique phi write accesses
Ensure that there is at most one phi write access per PHINode and
ScopStmt. In particular, this would be possible for non-affine
subregions with multiple exiting blocks. We replace multiple MAY_WRITE
accesses by one MUST_WRITE access. The written value is constructed
using a PHINode of all exiting blocks. The interpretation of the PHI
WRITE's "accessed value" changed from the incoming value to the PHI like
for PHI READs since there is no unique incoming value.

Because region simplification shuffles around PHI nodes -- particularly
with exit node PHIs -- the PHINodes at analysis time do not always
exist anymore in the code generation pass. We instead remember the
incoming block/value pair in the MemoryAccess.

Differential Revision: http://reviews.llvm.org/D15681

llvm-svn: 258809
2016-01-26 13:33:27 +00:00
Tobias Grosser f2cdd144e5 BlockGenerators: Replace getNewScalarValue with getNewValue
Both functions implement the same functionality, with the difference that
getNewScalarValue assumes that globals and out-of-scop scalars can be directly
reused without loading them from their corresponding stack slot. This is correct
for sequential code generation, but causes issues with outlining code e.g. for
OpenMP code generation. getNewValue handles such cases correctly.

Hence, we can replace getNewScalarValue with getNewValue. This is not only more
future proof, but also eliminates a bunch of code.

The only functionality that was available in getNewScalarValue that is lost
is the on-demand creation of scalar values. However, this is not necessary any
more as scalars are always loaded at the beginning of each basic block and will
consequently always be available when scalar stores are generated. As this was
not the case in older versions of Polly, it seems the on-demand loading is just
some older code that has not yet been removed.

Finally, generateScalarLoads also generated loads for values that are loop
invariant, available in GlobalMap and which are preferred over the ones loaded
in generateScalarLoads. Hence, we can just skip the code generation of such
scalar values, avoiding the generation of dead code.

Differential Revision: http://reviews.llvm.org/D16522

llvm-svn: 258799
2016-01-26 10:01:35 +00:00
Tobias Grosser 5c7f16be6b BlockGenerators: Avoid redundant map lookup [NFC]
llvm-svn: 258660
2016-01-24 14:16:59 +00:00
Johannes Doerfert 370cf00c9f Make sure we preserve alignment information after hoisting invariant load
In Polly, after hoisting loop-invariant loads out of the loop, the alignment
information for the hoisted loads is missing. This patch restores it.

Contributed-by: Lawrence Hu <lawrence@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D16160

llvm-svn: 258105
2016-01-19 00:17:21 +00:00
Roman Gareev b0c4e49a37 Fix of r257495.
Remove redundant "FPM->add(createDemoteRegisterToMemoryPass());"

llvm-svn: 257514
2016-01-12 20:47:48 +00:00
Roman Gareev 6ebc01c973 We do not need to schedule another loop interchange pass after Polly, as Polly
should perform loop interchanges itself.

This also fixes a bug we see due to the "loop-interchange" pass producing
incorrect IR when compiling linpack-pc.c from the LLVM test-suite with
"-polly-position=before-vectorizer".

Reviewed-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 257495
2016-01-12 17:59:06 +00:00
Johannes Doerfert 5dced2693e Refactor canSynthesize in the BlockGenerators [NFC]
llvm-svn: 256269
2015-12-22 19:08:49 +00:00
Johannes Doerfert 28f8ac1db2 Treat inline assembly as a constant in the code generation.
llvm-svn: 256267
2015-12-22 19:08:24 +00:00
Johannes Doerfert 42df8d1db6 Reduce indentation in BlockGenerator::trySynthesizeNewValue [NFC]
llvm-svn: 256266
2015-12-22 19:08:01 +00:00
Tobias Grosser fcabb155c1 BlockGenerators: Remove unnecessary const_cast
llvm-svn: 256227
2015-12-22 01:41:25 +00:00